<<

Phylogenomics of the archaeal : rare horizontal transfer in a unique motility structure Elie Desmond, Céline Brochier-Armanet, Simonetta Gribaldo

To cite this version:

Elie Desmond, Céline Brochier-Armanet, Simonetta Gribaldo. Phylogenomics of the archaeal flagel- lum: rare in a unique motility structure. BMC Evolutionary Biology, BioMed Central, 2007, 7, pp.53-70. ￿10.1186/1471-2148-7-106￿. ￿hal-00698404￿

HAL Id: hal-00698404 https://hal.archives-ouvertes.fr/hal-00698404 Submitted on 7 Apr 2020

HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés.

Distributed under a Creative Commons Attribution| 4.0 International License BMC Evolutionary Biology BioMed Central

Research article Open Access Phylogenomics of the archaeal flagellum: rare horizontal gene transfer in a unique motility structure Elie Desmond1, Celine Brochier-Armanet2,3 and Simonetta Gribaldo*1

Address: 1Unite Biologie Moléculaire du Gène chez les , Institut Pasteur, 25 rue du Dr. Roux, 75724 Paris Cedex 15, France, 2Université de Provence Aix-Marseille I, Marseille, France and 3Laboratoire de chimie bactérienne, Institut de Biologie Structurale et de Microbiologie (CNRS), Marseille, France Email: Elie Desmond - [email protected]; Celine Brochier-Armanet - [email protected]; Simonetta Gribaldo* - [email protected] * Corresponding author

Published: 2 July 2007 Received: 28 February 2007 Accepted: 2 July 2007 BMC Evolutionary Biology 2007, 7:106 doi:10.1186/1471-2148-7-106 This article is available from: http://www.biomedcentral.com/1471-2148/7/106 © 2007 Desmond et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract Background: As , motile archaeal swim by means of rotating flagellum structures driven by a proton gradient force. Interestingly, experimental data have shown that the archaeal flagellum is non-homologous to the bacterial flagellum either in terms of overall structure, components and assembly. The growing number of complete archaeal now permits to investigate the evolution of this unique motility system. Results: We report here an exhaustive phylogenomic analysis of the components of the archaeal flagellum. In all complete archaeal genomes, the coding for flagellum components are co- localized in one or two well-conserved genomic clusters showing two different types of organizations. Despite their small size, these genes harbor a good phylogenetic signal that allows reconstruction of their evolutionary histories. These support a history of mainly vertical inheritance for the components of this unique motility system, and an interesting possible ancient horizontal gene transfer event (HGT) of a whole flagellum-coding gene cluster between and . Conclusion: Our study is one of the few exhaustive phylogenomics analyses of a non- informational cell machinery from the third of life. We propose an evolutionary scenario for the evolution of the components of the archaeal flagellum. Moreover, we show that the components of the archaeal flagellar system have not been frequently transferred among archaeal species, indicating that gene fixation following HGT can also be rare for genes encoding components of large macromolecular complexes with a structural role.

Background components and assembly (for a recent review see [4,5]). Motile archaeal species swim by means of rotating flagel- The bacterial flagellum is a complex rotary structure made lum structures driven by a proton gradient force [1,2], as up of as much as 20 and composed of three in bacteria [3]. Interestingly, although they are both major parts -the basal body, the hook, and the filament. responsible for swimming, archaeal and bacterial flagella Rotation is provided by an ATPase exploiting a proton gra- are not homologous, either in terms of overall structure, dient force, and can be switched by specific proteins in

Page 1 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106

response to attractants or repellents in the environment [5,6]. FlaH harbors a domain similar to that found in bac- through the chemotaxis system. The filament is a hollow terial RecA-like ATPases, and FlaH mutants are nonmotile structure about 20 nm in diameter and is composed of a and nonflagellated [17]. FlaJ contains many transmem- single type of called flagellin. Bacterial flagellins brane domains, while FlaI probably encodes an ATPase are assembled by a complex type III secretion system that may be important for flagellins export, similarly to located in the basal body and are added to the distal tip of the role of its bacterial homologue, the pilin export the flagellum after passing through the hollow cavity [4]. ATPase PilT, and/or for providing the force for rotation. No experimental data are presently available for FlaG and Much less is known about the archaeal flagellum. It has FlaF, although FlaG may be a component of the anchoring been extensively studied in terms of components, assem- system between the hook and the filament [6]. It may be bly, and mutation experiments, in Halobacteria and possible that some of the multiple flagellin proteins have (for recent reviews [4-6]). The archaeal different roles in flagellum substructures other than the flagellum is a structure thinner than its bacterial counter- filament [6]. Finally, additional components of the part, where at least a filament and a hook are evident [7- archaeal flagellum may be encoded by genes that have not 9]. The archaeal flagellum has been shown to have a yet been identified. unique symmetry in salinarium. In fact, it has 3.3 subunits/turn of a 1.9 nm pitch left-handed helix The uniqueness of the archaeal flagellum in terms of com- compared to 5.5 subunits/turn of a 2.6 nm pitch right- ponents, structure, and assembly indicates that archaeal handed helix for plain bacterial flagellum filaments and bacterial flagella have distinct origins (i.e. they are [10,11]. The archaeal filament can be made up of different analogous systems). Interestingly, homologues of most types of homologous flagellin proteins (called FlaA or bacterial chemotaxis genes are found in archaeal genomes FlaB). The filament is ~10 nm in diameter and is not hol- [5,18], suggesting that archaeal and bacterial chemotaxis low, resembling more to bacterial type IV pili in this systems are evolutionary related. However, their interac- respect [10]. A few other characteristics of archaeal flagella tion with the flagellum system in remains largely make them more alike bacterial pili than flagella: as bac- unknown (for a recent review see [19]). In this work, we terial pilins, archaeal flagellins (i) are made as preproteins sought to contribute to the research on archaeal flagella with short signal peptides that are processed by a recently and archaeal motility in general performing an accurate identified archaeal-specific signal peptidase (called FlaK) phylogenomic study (sensu Eisen [20]) of the archaeal [12-14] that shows weak sequence similarity with the bac- motility apparatus in terms of taxonomic distribution of terial pili leader peptidase PilD, (ii) are likely added at the the genes coding for its components, their genomic con- base of the filament as in bacterial pili, and (iii) undergo text, and their phylogeny. This allowed us to sketch a fairly glycosylation as post-translational modification [15] (see detailed image on the origin and evolution of this macro- [5] for a recent review). Moreover, one component of molecular structure. archaeal flagella (FlaI) is homologous to bacterial PilT, an ATPase involved in bacterial pilin export (a type II/IV Results secretion system) and pilus retraction during twitching Taxonomic distribution and genomic context motility [16]. However, none of the remaining archaeal The taxonomic distribution of the genes coding for flagellum components are homologous to those of bacte- archaeal flagellum components is congruent with that rial pili [5]. Moreover, bacterial pili are not rotating struc- presented in a recent review[5] and is generally consistent tures, and no specific anchoring structures have ever been with species descriptions [21]. Gene homologues for all observed, indicating substantial differences between these components of the archaeal flagellum are found in the two cellular structures. complete genomes of Archaea that are described as motile [5] (indicated by M and ● signs in Figure 1). Conversely, A number of putative flagellum accessory genes lie close no homologues of genes coding archaeal flagellum pro- to flagellin genes in archaeal genomes (called flaC, flaD/ teins are found in the complete genomes of Archaea that flaE, flaF, flaG, flaH, flaI, and flaJ) [5]. Their putative role are described as non motile [5] (indicated by NM and ❍ in flagellum structure and assembly was tentatively signs in Figure 1). Although the representative of the deduced based on their sequences, cellular location, and (i.e. Methanosarcina mazei, Meth- mutation experiments [4-6]. anosarcina acetivorans and ) are described as non-motile [21], at least a full complement FlaC, FlaD, FlaH and FlaI are associated with the mem- of homologues of the genes coding for archaeal flagellum brane fraction in voltae and may thus be components is present in the complete genomes of these peripheral components of the archaeal flagellum [5,6]. species [5,22] (Red rectangles in Figure 1). Conversely, no FlaH, FlaI and FlaJ may be important for the assembly of homologues of the genes coding for archaeal flagellum archaeal flagella and possibly form a secretion complex components are found in the complete of Pyrob-

Page 2 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106

aculum aerophilum [5] (Yellow rectangle in Figure 1). This ( pernix) and is surprising since the genus Pyrobaculum is described as ( solfataricus Sulfolobus acido- "motile due to flagellation" in the Bergey's manual and a caldarius) and in some Euryarchaeota: picture is included of a "platinum shadowed cell showing ( hungatei, burtonii, flagella of P. aerophilum" [21]. Similarly, no homologues Methanosarcina acetivorans, Methanosarcina mazei and of the genes coding for archaeal flagellum components are Methanosarcina barkeri) and the Archaeoglobale Archae- present in the complete genome of kandleri oglobus fulgidus (Figure 2B). In and Sul- [5] (Yellow rectangle in Figure 1), which is described as folobales the clusters also include non-flagellum genes motile in the Bergey's manual [21]. (Figure 2A). In particular, a gene coding for a homologue of a bacterial chemotaxis component (MCP domain sig- In all archaeal genomes harboring flagellum components, nal transducer) is present in Halobacterium sp., and three the corresponding genes are always organized into one or genes coding for components of the chemotaxis system two very well conserved clusters [5] (fla clusters, Figure 2). are present in H. marismortui (CheY, CheA, and CheD) The only exception is the gene coding for the preflagellin and in N. pharaonis (CheY, CheC, and CheD). In this peptidase FlaK, which is located close to the fla cluster in archaeon, the operon is disrupted, the second half lying at Methanococcus jannaschii only [5]. flaK homologues are ~50 ORFs downstream (Figure 2A). Interestingly,M. mazei nevertheless always present, at least in single copy at dif- and M. acetivorans each harbor two copies of the fla2 clus- ferent locations in the other archaeal genomes, to the ter (hereafter called fla2A and fla2B), that differ in the notable exception of , aci- number of flagellin gene copies (two in fla2A and one in dophilum and . We verified that fla2B), and in the presence of two different hypothetical these species do not harbor any homologue of PilD -the genes (possibly very distant homologues of flaD, see bacterial prepilin peptidase- that they may have recruited above) lying in between the genes coding for FlaB and by horizontal gene transfer, and how they cope with the FlaG (Figure 2B). According to these characteristics, the A. absence of this enzymatic activity (or which non-homol- fulgidus, the M. hungatei and one of the M. burtonii gene ogous performs the function) remains puzzling. clusters resemble more to the fla2A cluster, while that from M. barkeri resembles more to the fla2B cluster (Fig- A careful observation of gene order within each cluster ure 2B). Multiple copies of flagellin genes (flaB/flaA) are revealed two types of organizations, that we will hereafter found in most archaeal genomes, especially in Thermo- call fla1 and fla2 (Figure 2). fla1 clusters are characterized coccales, whereas and Sulfolobales by the presence of flaC, plus one or few copies of flaD harbor single gene copies (Figure 2), confirming earlier (also annotated as flaE), or by a fusion of flaC and flaD studies on the composition of the flagella from T. volca- (Figure 2A), whereas fla2 clusters lack these genes (Figure nium and S. shibatae [23]. In M. hungatei, the flagellin 2B). Nevertheless, psi-Blast searches revealed that the genes lie in another region of the chromosome (two are genes (hyp1 and hyp2, Figure 2B) lying between flaB and clustered together and an additional small one is isolated flaG in fla2 clusters from Methanomicrobia (Methanosa- (Figure 2B)), suggesting a disruption of the original clus- rcinales and the Methanomicrobiale Methanospirillum ter. This is also the case of the fla1 cluster of M. burtonii, hungatei) and Archaeoglobales may be a very distant which presents no nearby flagellin genes (a single isolated homologue of flaD, as already suggested [5]. A second flaB gene was possibly part of this cluster before disrup- characteristic differentiates fla1 and fla2 clusters: they tion, Figure 2A and see below). In S. solfataricus a trans- show an inverted order of flaG and flaF. While the gene posase disrupts the gene coding for FlaG (Figure 2B). order in fla1 clusters is flaB-flaC-flaD-flaF-flaG-flaH-flaI- However, both the N- and C-ter sequences of FlaG are still flaJ, fla2 clusters (lacking flaC/D) display the order flaB- very similar to FlaG homologues found in S. solfataricus flaG-flaF-flaH-flaI-flaJ (Figure 2). Interestingly, all the close relatives (e.g. S. acidocaldarius and S. tokodaii), sug- archaeal genomes contain only a single type of fla clusters gesting that the disruption of flaG is recent. Indeed, the (i.e. fla1 or fla2), except M. burtonii that is the only species sequenced genome of S. solfataricus presents indeed a high that harbors both type I and II fla clusters. fla1 clusters are number of insertion elements that may have recently present only in Euryarchaeota: (Pyrococ- invaded this strain [24]. Interestingly, cells of this strain cus abyssi, furiosus, Pyrococcus horikoshii and appear non-flagellated under the electron microscope (P. Thermococcus kodakarensis), Methanococcales (Methanocal- Redder, personal communication) even if the disruption dococcus jannashii and Methanococcus maripaludis), Ther- of the flaG gene does not affect the transcription of the moplasmatales ( and downstream operon genes [25]. This suggests that FlaG Thermoplasma volcanium), Halobacteriales ( (possibly involved in the flagellum anchoring system [6]) marismortuii, Halobacterium sp. and Natromonas pharaonis), is an essential flagellum component. and Methanomicrobia(Methanococcoides burtonii) (Figure 2A); whereas fla2 clusters are present in Crenarchaeota:

Page 3 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106

Cenarchaeum symbiosum NA ° Mesophilic Crenarchaea Crenarchaea Pyrobaculum aerophilum M mésophiles ° Desulfurococcales Aeropyrum pernix M • Crenarchaea M • Sulfolobus tokodaii M • Sulfolobales Sulfolobus acidocaldarius M • Nanoarchaeum equitans NM ° Thermococcus kodakarensis M • Thermococcus gammatolerans M • M • Thermococcales Pyrococcus horikoshii M • Pyrococcus abyssi M • I Class Methanopyrus kandleri M ° Methanothermobacter thermautotrophicus NM ° jannaschii M • Methanococcales Methanococcus maripaludis M • torridus NM ° acidarmanus NM Euryarchaea ° Thermoplasmatales Thermoplasma acidophilum M • Thermoplasma volcanium M • fulgidus M • Archaeoglobales Haloarcula marismortui M • Halobacterium sp M • Halobacteriales

Natronomonas pharaonis M • I Class Methanogen Methanomicrobia Methanospirillum hungatei M • Methanococcoides burtonii M • Methanosarcina barkeri NM • Methanosarcina mazei NM •

Methanosarcina acetivorans NM • I

SchematicsequencedFigure 1 phylogenygenomes areadapted available from in [29]databases showing the relationship between the main archaeal phyla for which completely Schematic phylogeny adapted from [29] showing the relationship between the main archaeal phyla for which completely sequenced genomes are available in databases. Motility or non-motility by the mean of a flagellum of each organism according to [21] is indicated by M and NM, respectively (NA is used when no information is available). Black circles and open circles indicate the presence or the absence of flagellum components coding gene in the genomes of the considered organisms, respectively. Red rectangles indicate the presence of flagellum component coding genes in organisms described as non-motile whereas yellow rectangles indicate the absence of flagellum component coding genes in organisms described as motile.

Finally, additional copies of flagellum components lie in FlaG, FlaH, FlaI, FlaJ a few instances outside of the clusters (examples are addi- Among all archaeal flagellum components, FlaH, FlaI and tional flaB genes in M. burtonii, the two Thermoplasmat- FlaJ are the most conserved at the sequence level and ales, H. marismortui and N. pharaonis; an additional flaG in always lie close to each other in all the analyzed genomes Halobacterium sp.; an additional flaD in H. marismortui, N. (Figure 2) strengthening their likely fundamental role in pharaonis, and M. burtonii; an additional flaF in M. mari- flagellum assembly and function (see above). FlaI has a paludis, and an additional flaK in Methanococcales) (Fig- number of bacterial homologues belonging to type IV and ure 2). type II secretion systems, including the typeIV pili compo- nent PilT [26]. Moreover, FlaI has additional archaeal Phylogenetic analysis homologues that are also probably part of yet to describe Phylogenetic analyses were performed on six amino secretion machineries [27,28]. In a phylogeny including sequence datasets corresponding to FlaA/B, FlaD/E, FlaG, all these homologues FlaI sequences form a monophyletic FlaH, FlaI, and FlaJ. Phylogenetic analysis of FlaC, FlaF group and are most closely related to their archaeal coun- and FlaK could not be performed due to a too restricted terparts (not shown). FlaH shares a RecA-like ATPase phylogenetic distribution of FlaC, and the poor sequence domain with distant archaeal and bacterial homologues conservation of FlaF and FlaK. that are not involved in motility structures. Psi-blast

Page 4 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106

A B Thermococcales Desulfurococcales & Sulfolobales

Thermococcus kodakarensis (TK0038-39-40-41-42-43-44-45-46-47-48-49) // +K TK0053 Aeropyrum pernix (APE1907-1905-1904-1903-1901-1899-1898-1896-1895) B B B B B C D F G H I J B B G H I J Pyrococcus furiosus (PF0338-337-336-335-334-333-332-331-330) // +K PF0471 Sulfolobus solfataricus (SSO2323-2322-2320-2319-2319-2318-2316-2315-2314) +K SSO0131 B B C D F G H I J B G G F H I J Pyrococcus horikoshii (PH0546-548-549-550-551-552-553-553.1n-555-556-557-559) // +K PH0446 transposase transposase

B B B B B C D F G H I J Sulfolobus tokodaii (ST2518-2519-2520-2521-2522-2523-2524) +K ST2258 Sulfolobus acidocaldarius (Saci_1178-1177-1176-1175-1174-1173-1172) + K Saci_0139 Pyrococcus abyssi (PAB1378-1379-1380-1381-1382-1383-1384-1385-1386-1387) // +K PAB1309 B G F H I J B B B C D F G H I J

Methanococcales & Thermoplasmatales Methanomicrobia & Archaeoglobales Methanocaldococcus jannashii (MJ0891-892-893-894-895-896-897-898-899-900-901-902) Methanosarcina barkeri (Mbar_A1970-1969-1968-1967-1966-1965-1964) +K Mbar_A1491 // +2K MJ0835.2 MJ1282.1 B G F H I J fla2B B B B C D E F G H I J K hyp2 Methanococcus maripaludis (MMP1666-1667-1668-1669-1670-1671-1672-1673-1674-1675-1676) Methanosarcina mazei (MM0418-417-416-415-414-413-412) // K MM1678 // +2K MMP0232-MMP0555 +F MMP0342 (MM0323-322-321-320-319-318-317-316) B B B C D E F G H I J B G F H I J fla2B hyp2 Thermoplasma acidophilum (Ta0553-554-555-556-557a-558-559-560) // +B Ta1407m B B G F H I J fla2A B C D F G H I J hyp1 Methanosarcina acetivorans (MA3077-3078-3079-3080-3081-3082-3083) // 2K MA0519-MA4505 Thermoplasma volcanium (TVN0607-608-609-610-611-612-613-614) // +B TVN1426 (MA3062-3061-3060-3059-3058-3057-3056-3055) B C D F G H I J B G F H I J fla2B hyp2 Halobacteriales & Methanomicrobia B B G F H I J fla2A hyp1

Haloarcula marismortui (rrnAC2198-2197-2195-2194-2193-2192-2191-2190-2187-2186-2184-2183) Methanococcoides burtonii (Mbur_0346-0347-0348-0349-0350-0351-0352-0353) // +2B pNG1026 -rrnB0018 +D rrnAC1482 + K rrnAC2525 B B G F H I J fla2A B C/D F G H I J hyp1 Halobacterium sp. (VNG0962G-961G-960G-959H-958G-955G-954C-953C-950a-950Gm-949G-947G) Methanospirillum hungatei (Mhun_0100-0101-0102-0103-0104-0105-0106) // +3B VNG1181G-VNG1009-VNG1008+G VNG2567C + K VNG2285C // +3B Mhun_1238 Mhun_3139 Mhun_3140 + K Mhun_2921 B B B D C/D F G H I J G G F H I J fla2A hyp1 Natromononas pharaonis (NP2086A-2088A-2090A-2092A-2094A-2096A-2098A-2100A-2102A-2104A-2106A) // (NP2152A-2154A-2156A-2158A-2160A) Archaeoglobus fulgidus (AF1055-1054-1053-1052-1051-1050-1049-1048) +K AF0936 // +K NP1276A +D NP2686A B B G F H I J fla2A B2B B hyp1 C/D H I J

Methanococcoides burtonii (Mbur_1570-1571-1572-1573-1574-1575) // +B Mbur_0104 + D Mbur_1246 + K Mbur_0062 C G H I J

FigureGenomic 2 organization of the genes coding for flagellum components in complete archaeal genomes (fla clusters) Genomic organization of the genes coding for flagellum components in complete archaeal genomes (fla clusters). Numbers within brackets correspond to the locus tags of each gene. A // sign indicates that the following components are elsewhere in the genome. The genes annotated as hyp1 and hyp2 are probable distant homologues of flaD/E. Genes colored in grey are homologues of chemotaxis components, as discussed in the text. Remaining genes where no name is indicated are annotated as hypothetical. A Genomic organization of type I clusters (fla1). These are found only in Euryarchaea and are characterized by the presence of flaC and flaD/E and by a conserved gene order flaA/B, flaC, flaD, flaF, flaG, flaH, flaI, flaJ. B Genomic organiza- tion of the type II clusters (fla2). These are found in Crenarchaea and in some Euryarchaea and are characterized by the absence of flaC and flaD/E and by a conserved gene order flaA/B, flaC, flaD, flaG, flaF, flaH, flaI, flaJ. searches revealed (i) that FlaJ harbours a few distant amino . To our point of view this sequence similarity archaeal homologues annotated as involved in type II is too weak to definitively conclude that these sequences secretion and (ii) weak similarities with the bacterial pilus are homologues although this has been claimed [27]. assembly protein TadC and TadB. After removal of ambiguously aligned positions, 104, FlaJ shares the domain GSPII F with TadC and TadB. How- 193, 392 and 353 amino acids could be kept for phyloge- ever, this may not be significant, given that the domain netic analysis of FlaG, FlaH, FlaI, and FlaJ, respectively. was defined on the basis of an alignment that included The resulting trees are strikingly congruent (Figure 3). both archaeal and bacterial sequences. The similarity Notably, all major archaeal groups except Methanomicro- between the FlaJ sequences and TadB and TadC sequences bia (Methanomicrobiales plus Methanosarcinales) are is very weak (16% and 35% of identity and similarity with well defined and strongly supported statistically (Boot- TadB sequences, respectively and 14% and 32% of iden- strap Values -BV- > 990‰ and/or Posterior Probabilities - tity and similarity with TadC sequences, respectively) and PP- = 1), suggesting that no recent horizontal transfer of is mainly the result of the sharing of small hydrophobic flaH, flaG, flaI and flaJ genes occurred across these groups.

Page 5 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106

The sequences from Methanomicrobia form a well sup- the divergence of the three Halobacteriales, but after the ported cluster (BV > 980‰ and/or Posterior Probabilities duplication event and that one of the two copies was dis- -PP- = 1) that also includes sequences from A. fulgidus. It placed after the duplication event in the ancestor of H. is not possible for the time being to decide whether this is marismortui and N. pharaonis. A similar duplication of due to a HGT or a hidden paralogy. Importantly, the cor- FlaD followed by a fusion of one of the two copies of FlaD responding trees are also strongly consistent with gene and FlaC also appears to have occurred in M. burtonii. As cluster organization. In fact, homologues from fla1 and in most Halobacteriales one of the two copies resides out- fla2 clusters (characterized by a flaF-flaG and by a flaG-flaF side the fla cluster (Figure 2A) suggesting its displacement gene order, respectively) form two distinct strongly groups after the duplication event. The poor resolution of the (BV > 975‰ and PP = 1, Figure 3). In particular, in all relationships between groups (due to the small number of four trees, the homologues from the fla2 clusters of M. positions kept for the phylogenetic analysis) does not per- hungatei, the four Methanosarcinales and A. fulgidus mit to determine if a single fusion event of FlaC and FlaD appear close to Crenarchaeota (Figure 3). This is in con- occurred in ancestor of Halobacteriales and Methanomi- trast to their expected position as sister-group of Halobac- crobia or if this event occurred two times independently teriales within Euryarchaeota (Figure 1, [29,30]). in both lineages. Interestingly, such expected position is shown by the M. burtonii sequences belonging to its fla1 cluster (Figure 2A). FlaB This suggests that the fla1 and fla2 clusters from M. burto- Given the use of only 72 unambiguously aligned posi- nii have different origins (see below). Independent spe- tions for analysis, the FlaB tree presents a number of cies-specific duplications of flaG appear to have occurred poorly resolved nodes (Figure 4B). Nevertheless, the in Halobacterium sp., N. pharaonis, and M. hungatei (tan- monophyly of a number of groups is recovered and sup- dem gene duplications in these last two species). Moreo- ported by BV > 850‰, except for Thermococcales, Meth- ver, in the FlaH, FlaI and FlaJ phylogenies, the sequences anococcales and Methanomicrobia. As for FlaG, FlaI, FlaJ, from M. burtonii, M. hungatei, M. barkeri and A. fulgidus and FlaH, the FlaB tree is again strongly consistent with fla2 clusters group with the sequences belonging to the gene cluster organization (Figure 4B). In particular, the fla2B cluster from M. mazei and M. acetivorans (Figures 3B, FlaB from the fla2 clusters of M. hungatei, the four Meth- 3C and 3D, respectively), supporting a close relationship anosarcinales and A. fulgidus appear close to Crenarchae- of these clusters, as suggested by their gene organization ota, while the isolated FlaB from M. burtonii appear close (Figure 2B). to Halobacteria, and thus likely functions with the flagel- lum components of fla1 cluster (Figure 4B). Interestingly, FlaD/E the multiple copies of flagellins group in a group-specific As discussed above, homologues of FlaD/E genes are miss- manner (Figure 4B), suggesting that flagellin gene family ing in all fla2 clusters from Crenarchaea. In fla2 clusters expansion occurred mainly by gene duplication and mul- from Methanomicrobia and A. fulgidus the two hypothet- tiple times independently in different species, and not by ical proteins hyp1 and hyp2 could be distantly related to HGT. FlaD/E (Figure 2B). However they are too distant to be included in any phylogenetic analysis. The small number Discussion and conclusion of unambiguously aligned positions (77 amino acids The archaeal flagellum is a complex cellular structure positions) that could be kept for phylogenetic analysis composed of multiple subunits and anchored to the gives a poor resolution of the relationships between major membrane. The striking conservation of the genomic con- euryarchaeal groups (Figure 4A). However, sequences text of the genes coding for these subunits indicates a from phyla form monophyletic groups generally well sup- likely highly coordinated expression and assembly mech- ported (BV = 996‰, PP = 0.93, BV = 1000‰ and BV = anisms. Coupled to the high sequence conservation of the 958‰ for Halobacteriales, Methanosarcinales, Thermo- different subunits across archaeal species inhabiting very plasmatales and Thermococcales, respectively, Figure 4A). different habitats, this underlines the importance for This indicates that, similarly to FlaG, FlaI, FlaJ, and FlaH, structural maintenance of the archaeal flagellum. How- no recent HGT involving the flaD/E gene occurred among ever, we highlighted an important dimorphism of the these groups. Interestingly, a tandem duplication event genetic context organization, with two types of gene clus- appears to have occurred in an ancestor of Methanococca- ters harboring differences in both gene content and gene les, with the two copies having been conserved within the order (Figure 2). In fact, most Euryarchaea exhibit the fla1 cluster. Halobacteriales also harbor two copies of flaD. cluster, whereas all the Crenarchaea and some Euryar- One of the two copies is fused with flaC and always resides chaea have the fla2 cluster (e.g. Methanomicrobia and A. within the fla cluster, whereas the second copy resides fulgidus). The difference between the two clusters is within the fla cluster only in Halobacterium sp. This sug- strongly supported by phylogenetic analysis, and indi- gests that the fusion of flaC and flaD genes occurred before cates that Methanomicrobia and A. fulgidus flagellum

Page 6 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106

ABAeropyrum pernix BAA80906 Aeropyrum pernix NP_148248 AAY80523 788 Sulfolobus acidocaldarius 983 Sulfolobus acidocaldarius YP_255814 .93 1000 Sulfolobus tokodaii BAB67633 1 1000 Sulfolobus tokodaii NP_378526 1 624 Sulfolobus solfataricus 1 998 Sulfolobus solfataricus NP_343682 .94 Methanospirillum hungatei ABD39880 1 Methanosarcina barkeri YP_305482 Methanospirillum hungatei ABD39879 1000 1 892 Methanosarcina mazei NP_632344 Methanosarcina mazei AAM30110 fla2B 1 734 943 fla2A Methanosarcina acetivorans NP_617974 881 Methanosarcina acetivorans AAM06432 .83 .98 985 .88 497 Methanococcoides burtonii ABE51344 1 Methanospirillum hungatei EAP16187 .89 442 470 Archaeoglobus fulgidus NP_069885 Archaeoglobus fulgidus AAB90187 885 992 .91 .96 Methanosarcina barkeri AAZ70904 1 Type II clusters (fla2) 1000 1 Methanococcoides burtonii EAN00028 fla2A Type II clusters (fla2) 1 Methanosarcina acetivorans AAM06452 fla2B 1000/1 Methanosarcina mazei AAM30014 G F 542/.55 NP_632440 Methanosarcina mazei G F 999/1 Methanosarcina acetivorans NP_617950 Methanococcoides burtonii EAN01051 F G Methanococcoides burtonii EAN01052 433 Haloarcula marismortui AAV47027 F G Type I 639 volcanii 641 .6 1000 clusters Type I .96 Halobacterium sp NP_444203 .86 pharaonis YP_326701 clusters 1 (fla1) 722 AAV47026 784 731/.85 Haloarcula marismortui 1 Natronomonas pharaonis YP_326700 (fla1) .97 887/1 Natronomonas pharaonis YP_326730 213 Halobacterium sp NP_444204 .48 478 Halobacterium sp NP_281136 1000 Methanocaldococcus jannaschii Q58309 977 .98 1 999 thermolithotrophicus AAG50077 1 458 Pyrococcus furiosus NP_578062 1 NP_988794 - Thermococcus kodakarensis YP_182459 939 Methanococcus maripaludis 999 Pyrococcus abyssi NP_127164 1 979 Methanococcus voltae O06641 1 1 985 904 Thermoplasma volcanium NP_111131 Pyrococcus horikoshii NP_142523 1000 1 1 901 1000 Thermoplasma volcanium NP_111130 1 Thermoplasma acidophilum NP_394032 1 1 Thermoplasma acidophilum NP_394031 628/.8 Pyrococcus horikoshii NP_142524 353 532 Methanocaldococcus jannaschii NP_247893 Pyrococcus abyssi NP_127163 .57 .73 994/1 Methanococcus maripaludis NP_988793 1000/1 Pyrococcus furiosus NP_578061 816/1 Methanothermococcus thermolithotrophicus AAG50076 607 0.1 577/.63 0.1 Thermococcus kodakarensis YP_182460 Methanococcus voltae O06640 .61

C Aeropyrum pernix NP_148247 D Aeropyrum pernix NP_148246 Sulfolobus acidocaldarius YP_255813 654 Sulfolobus solfataricus AAK42470 501 1000 .73 .6 1 Sulfolobus tokodaii NP_378527 1000/1 Sulfolobus tokodaii BAB67637 731 Sulfolobus solfataricus NP_343681 320/.75 Sulfolobus acidocaldarius AAY80519 .69 Methanosarcina barkeri YP_305481 Methanosarcina barkeri YP_305480 AAM30108 1000/1 Methanosarcina acetivorans NP_617975 fla2B 1000/1 Methanosarcina mazei fla2B 863/.97 Methanosarcina mazei AAM30109 884/1 Methanosarcina acetivorans NP_617976 1000 1000 Methanospirillum hungatei EAP16188 Methanospirillum hungatei EAP16189 1 1 Archaeoglobus fulgidus AAB90188 Archaeoglobus fulgidus AAB90189 641 421 .66 EAN00027 .69 Methanococcoides burtonii EAN00026 fla2A 903 Methanococcoides burtonii fla2A 1000 Type II clusters (fla2) Type II clusters (fla2) 1 AAM30012 1 996 Methanosarcina mazei AAM30013 1000/1 Methanosarcina mazei G F 1 1000/1 Methanosarcina acetivorans NP_617949 G F 1000/1 Methanosarcina acetivorans NP_617948 Halobacterium sp NP_279895 F G Methanococcoides burtonii EAN01053 F G 697/.57 Haloferax volcanii Haloferax volcanii Type I Type I 1000/1 Haloarcula marismortui AAV47025 Natronomonas pharaonis YP_326732 clusters 1000/1 clusters 803/.99 Natronomonas pharaonis YP_326731 694/.99 (fla1) 540/.85 (fla1) Haloarcula marismortui AAV47024 407/.9 Halobacterium sp NP_279896 1000 Methanococcoides burtonii EAN01054 1 1000 Methanocaldococcus jannaschii Q58310 Methanocaldococcus jannaschii NP_247896 1 Methanothermococcus thermolithotrophicus AAG50078 Methanothermococcus thermolithotrophicus AAG50079 999 1000 1 Methanococcus maripaludis NP_988795 872 1 Methanococcus maripaludis NP_988796 998/1 .99 1000 961/1 Methanococcus voltae AAB57833 1 998 Methanococcus voltae AAC19121 1 539 Thermoplasma acidophilum NP_394033 823 Thermoplasma volcanium NP_111133 .99 1 1000/1 Thermoplasma volcanium NP_111132 1000/1 Thermoplasma acidophilum NP_394034

405 Thermococcus kodakarensis YP_182461 442 Thermococcus kodakarensis YP_182462 .98 Pyrococcus furiosus NP_578060 .58 Pyrococcus furiosus NP_578059 999/1 1000/1 923/1 Pyrococcus horikoshii NP_142525 890/1 Pyrococcus horikoshii NP_142526 0.1 965/1 Pyrococcus abyssi NP_127162 0.1 978/1 Pyrococcus abyssi NP_127161

FigureUnrooted 3 maximum likelihood phylogenetic trees of FlaG (A), FlaH (B), FlaI (C) and FlaJ (D) Unrooted maximum likelihood phylogenetic trees of FlaG (A), FlaH (B), FlaI (C) and FlaJ (D). Numbers at nodes indicate boot- strap values for 1000 replicates of the original dataset and posterior probabilities, computed by PHYML and MrBayes, respec- tively. The scale bar represents the average number of substitutions per site. The phylogenies show a clear separation between the type I clusters (characterized by a FlaF FlaG order of genes) and type II clusters (characterized by a FlaG FlaF order of genes). components encoded by fla2 gene clusters are more would have kept the two types of clusters). Finally, a closely related to their crenarchaeal homologues than to duplication event of the whole fla2 cluster occurred in an the homologues encoded by the fla1 gene cluster of their ancestor of Methanosarcina (red circle, Figure 5) leading to close relatives (i.e. M. burtonii, Thermoplasmatales and the fla2B cluster (orange cluster, Figure 5). This first sce- Halobacteriales, Figures 1, 3 and 4). Two different evolu- nario involves 16 losses, and implies that the ancestor of tionary scenarios can be proposed to explain our results. Archaea and the ancestor of each euryarchaeal group had In the first scenario (Figure 5), the last ancestor of Archaea two types of flagella. Moreover, it does not explain the was flagellated and had the two types of clusters (i.e. both position of A. fulgidus sequences within the Methanomi- fla1 and fla2, blue and red clusters, respectively, Figure 5), crobia group and not as sister of this group, as generally resulting from the duplication of an ancestral cluster (pur- indicated by molecular phylogeny of multiple markers ple cluster, Figure 5), and these were secondarily and dif- (Figure 1 and [30]). A second scenario involves less losses ferently lost during archaeal lineages evolution (seven (three losses of the fla2 cluster and seven losses of the fla1 losses of the fla1 cluster and nine losses of the fla2 cluster). cluster) (Figure 6 and Figure 7 for a more detailed sce- Importantly, some of these losses would have also nario). Here, the ancestor of Archaea was also flagellated, occurred recently in the Euryarchaea (for example the loss but had only a single type of fla cluster (either fla1, fla2, or of the fla1 cluster in the Methanosarcinale lineage would else, purple cluster in Figure 6 and Figure 7). After the have occurred after the divergence of M. burtonii, that divergence of Crenarchaea and Euryarchaea (black circle,

Page 7 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106

Sulfolobus acidocaldarius YP_255818 B cluster Sulfolobus tokodaii NP_378522 B AB997/1 cluster 987/.99 Sulfolobus solfataricus NP_343686 B cluster Pyrococcus furiosus NP_578064 B B C D F G H I J Aeropyrum pernix NP_148254 B B cluster 977/1 Aeropyrum pernix NP_148253 B B cluster Archaeoglobus fulgidus O29207 547 B cluster fla2A 958 Pyrococcus horikoshii NP_142521 B B B B B C D F G H I J M. hungatei EAP15115 B 1 868/1 800/1 Methanospirillum hungatei EAP15114 B Methanosarcina acetivorans NP_617955 B cluster 874 242/.58 Pyrococcus abyssi NP_127166 B B B C D F G H I J Methanosarcina mazei AAM30019 B cluster .76 205/.68 620/.93Methanococcoides burtonii EAN00033 B cluster 576 768 Methanococcoides burtonii EAN00032 B B cluster fla2A .37 Thermococcus kodakarensis YP_182457 B B B B B C D F G H I J 100/.54 .84 Methanosarcina acetivorans NP_617954 B cluster 924/1Methanosarcina mazei AAM30018 B B cluster 334/.96 Archaeoglobus fulgidus O29208 B B cluster Methanothermococcus thermolithotrophicus AAG50074 Methanosarcina acetivorans NP_617970 B cluster Type II clusters (fla2) 512/.97 cluster 1000/1 Methanosarcina barkeri YP_305486 B fla2B G F 969/.97 Methanosarcina mazei AAM30114 B cluster Methanocaldococcus jannaschii Q58306 962 B B B C D E F G H I J F G Methanococcoides burtonii EAM99976 B 1 Natronomonas pharaonis YP_326696 B B B cluster Type I 1000/1 Natronomonas pharaonis YP_326697 B B B cluster 601 706 clusters 945 Methanococcus maripaludis NP_988791 B B B C D E F G H I J 1 Natronomonas pharaonis YP_326695 B B B cluster .81 .71 (fla1) magadii O93719 974 Natrialba magadii Q9P9I3 900 1 744/.9 Methanococcus voltae O06638 Natrialba magadii O93718 546 .99 826 990/1 Natrialba magadii Q9P9I1 .73 1 782 Haloarcula marismortuii AAV48232 B .98 Methanocaldococcus jannaschii Q58305 B B B C D D F G H I J 530/.8 Haloarcula marismortuii AAV44283 993/1 Haloarcula marismortuii AAV47035 cluster 400 876 AAA72644 1000 .77 B cluster .98 .97 870 Methanococcus maripaludis NP_988790 B B B C D D F G H I J Halobacterium salinarum AAA72641 B cluster 1 1000/1 Halobacterium sp NP_279945 334/.31 Halobacterium sp NP_279905 934 716/.53 Methanococcus voltae O06637 Halobacterium salinarum AAA72643 B B B cluster 1 Methanococcus voltae AAA73073 Pyrococcus abyssi NP_127170 cluster 452 317/- B B B .6 Methanothermococcus thermolithotrophicus AAG50073 Thermoplasma volcanium NP_111126 B cluster 940 998 Thermoplasma acidophilum NP_394027 cluster 1 B 345 1 Thermoplasma volcanium NP_111945 B Thermoplasma volcanium NP_111128 B C D F G H I J 984/1 Thermoplasma acidophilum NP_394861 B Methanocaldococcus jannaschii NP_247886 cluster 285 B B B Methanocaldococcus jannaschii NP_247887 1000 786 B B B cluster 1 Thermoplasma acidophilum NP_394029 B C D F G H I J Methanothermococcus thermolithotrophicus 504 Methanococcus voltae AAA73074 Methanococcus maripaludis AAG50058 167 968 Methanococcoides burtonii ABE52168 318Methanococcus maripaludis NP_988787 B B B cluster D 149 Methanococcus vannielii AAC27726 406 497 Methanothermococcus thermolithotrophicus .56 782 626 Methanococcus voltae AAA73075 .93 Methanococcoides burtonii EAN01049 C/D F G H I J 14 Methanococcus vannielii AAC27725 238 Methanococcus maripaludis AAG50057 Methanococcus maripaludis NP_988786 498 Halobacterium sp NP_279899 B B B D F G H I J 998 B B B cluster .7 Methanococcus voltae AAA73076 Methanococcus vannielii AAC27727 5 986 Methanothermococcus thermolithotrophicus Natronomonas pharaonis D 404 YP_326993 996 564 Methanocaldococcus jannaschii Q58303 B B B cluster 382 .7 Methanococcus maripaludis AAG50059 979 Haloarcula marismortui B C/D F G H I J 473 Methanococcus maripaludis NP_988788 B B B cluster AAV47029 Thermococcus kodakarensis BAA84106 B B B B cluster 454 610 Pyrococcus horikoshii NP_142515 B B B B B cluster 1 .99 Natronomonas pharaonis C/D H I J 558 Pyrococcus horikoshii NP_142517 B B B B B cluster YP_326729 321 Thermococcus kodakarensis BAA84105 154 B B B B cluster 253 Pyrococcus horikoshii NP_142516 B B B B B cluster 328 304 Thermococcus kodakarensis BAA84107 Haloarcula marismortui AAV46407 179 B B B B cluster .67 Pyrococcus abyssi Q9UYL3 922 B B B cluster Thermococcus kodakarensis BAA84108 0.1 885 B B B B cluster Pyrococcus furiosus NP_578067 508 Halobacterium sp B B B C/D F G H I J 1000 B B cluster .88 NP_279900 520 Pyrococcus horikoshii NP_142518 B B B B B cluster Pyrococcus furiosus NP_578066 428 B B cluster 0.1 Pyrococcus horikoshii NP_142519 .63 Haloferax volcanii 345 B B B B B cluster 297 Pyrococcus abyssi Q9UYL4 B B B cluster 270 Thermococcus kodakarensis BAA84109 B B B B cluster

AFigure. Unrooted 4 maximum likelihood phylogenetic trees of FlaD/E A. Unrooted maximum likelihood phylogenetic trees of FlaD/E. Numbers at nodes indicate bootstrap values for 1000 repli- cates of the original dataset and posterior probabilities, computed by PHYML and MrBayes, respectively. The scale bar repre- sents the average number of substitutions per site. The light blue circles indicate duplication events. B. Unrooted maximum likelihood phylogenetic trees of FlaA/B. Numbers at nodes indicate bootstrap values for 1000 replicates of the original dataset and posterior probabilities, computed by PHYML and MrBayes, respectively. The scale bar represents the average number of substitutions per site. Detailed cluster organizations are not shown. White arrows are used to schematize the clusters. The phylogenies show a clear separation between the type I clusters (characterized by a FlaF FlaG order of genes) and type II clus- ters (characterized by a FlaG FlaF order of genes).

Figure 6 and Figure 7), evolution led to the fla1 and fla2 cluster than to their fla1 orthologues from Halobacteri- clusters. A single HGT of the fla2 cluster would have then ales. occurred from some ancestors of Sulfolobales and Desul- furococcales to an ancestor of Methanomicrobia (Figure 6 Both scenarios imply that the archaeal flagellum would Figure 7, HGT 1), and was followed by a HGT from some have appeared prior to the last archaeal ancestor ancestors of Methanosarcinales to A. fulgidus (Figure 6 Fig- ure 7, HGT 2). Methanosarcina, M. hungatei and A. fulgidus Apart these two likely cases of HGT of the entire fla2 gene would thus have lost their original fla1 gene cluster before cluster, we found no clear evidence for recent transfers of or after their replacement by a fla2 gene cluster of crenar- the genes coding for flagellum components among chaeal origin. We favor a HGT from Crenarchaea to Meth- archaeal species. Indeed, the poor resolution of inter- anomicrobia rather than the opposite, since in this case phyla relationships in some trees is more likely due to lack we would expect to find the M. burtonii fla1 genes more of phylogenetic signal rather than horizontal gene trans- closely related to their paralogues belonging to the fla2 fer. One explanation for the rarity of HGT is that it is pos-

Page 8 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106

Mesophilic Crenarchaea Pyrobaculum aerophilum Desulfurococcales Aeropyrum pernix Sulfolobus solfataricus Type I cluster Sulfolobus tokodaii Sulfolobales Sulfolobus acidocaldarius T pe II cluster Nanoarchaeum equitans Nanoarchaeota Thermococcus kodakarensis Thermococcus gammatolerans Pyrococcus furiosus Thermococcales Pyrococcus horikoshii Pyrococcus abyssi ehngncasI class Methanogen Methanopyrus kandleri Methanopyrales Methanothermobacter thermautotrophicus Methanobacteriales Methanocaldococcus jannaschii Methanococcales Methanococcus maripaludis Ferroplasma acidarmanus Thermoplasmatales Thermoplasma acidophilum Thermoplasma volcanium Archaeoglobus fulgidus Archaeoglobales Haloarcula marismortui Halobacterium sp Halobacteriales Natronomonas pharaonis ehngncasII class Methanogen Methanospirillum hungatei Methanomicrobia Methanomicrobiales Methanococcoides burtonii Methanosarcina barkeri Methanosarcinales Methanosarcina mazei Methanosarcina acetivorans

AnFigure evolutionary 5 scenario for the origin and evolution of the archaeal flagellum An evolutionary scenario for the origin and evolution of the archaeal flagellum. Blue, red, orange and purple clusters represent fla1 cluster, fla2A cluster, fla2B cluster and their ancestor, respectively. The black circle represents the last common ancestor of Archaea. The green circle represents the duplication event that led to the fla1 and fla2 clusters. This duplication event occurred before the last common ancestor of Archaea, which thus harbored the two types of clusters. The red circle repre- sents the recent duplication event of the fla2 cluster in ancestor of Methanosarcina. The blue and red crosses represent the loss of the fla1 and fla2A clusters, respectively. sible the result of the high level of integration of flagellum clusters in terms of expression and assembly, and how two components within the macromolecular structure. Impor- different flagellum systems coexist in this archaeon. tantly, this contradicts the generally assumed notion that HGT of "informational" genes (i.e. those coding for pro- Finally, it has been suggested that archaea are modified teins involved in the expression and the transmission of Actinobacteria and that archaeal flagella are derived from genetic information) are less frequent than those involv- bacterial flagella following an adaptation to acidic envi- ing the remaining ones ("operational") genes. Neverthe- ronments by the recruitment of an already acid-stable less, the acquisition of a whole flagellum component glycoprotein from the pilus machinery that would have coding gene cluster from distant donors seems possible. replaced the original flagellin while leaving intact the Interestingly, even if two clusters coexist within a genome basal rotary motor [31]. We find this hypothesis unlikely (i.e. in M. burtonii) neither gene recombination is for the fact that archaeal flagella do not resemble to bacte- observed between homologous genes belonging to the rial flagella in major structure, assembly, and compo- two clusters, nor losses, suggesting that the components of nents, and not only concerning flagellin. Indeed, no even a cluster may interact preferentially due to coevolution, distant homologues to any component, including the although they form similar cellular structures. The pres- basal rotary motor, of bacterial flagella can be recovered in ence in M. burtonii of two types of fla clusters (possibly archaeal genomes, including the related type III secretion one native and one acquired by HGT from crenarchaeota) system components [28]. Moreover, flagella of acido- represents an interesting experimental model to study. It philic bacteria (such as Thiobacillus) show a typical bacte- would be in fact particular interesting to know the differ- rial structure (e.g. a diameter of approximately 20 nm), so ence between the components encoded by the two fla they adapted to acidic conditions without radically modi-

Page 9 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106

Type II cluster Cenarchaeum symbiosum Mesophilic Crenarchaea Pyrobaculum aerophilum Desulfurococcales Aeropyrum pernix Sulfolobus solfataricus Sulfolobus tokodaii Sulfolobales Sulfolobus acidocaldarius Nanoarchaeum equitans Nanoarchaeota Thermococcus kodakarensis Thermococcus gammatolerans Ancestral cluster Pyrococcus furiosus Thermococcales Pyrococcus horikoshii Pyrococcus abyssi ehngncasI class Methanogen Type I cluster Methanopyrus kandleri Methanopyrales Methanothermobacter thermautotrophicus Methanobacteriales Methanocaldococcus jannaschii Methanococcales Methanococcus maripaludis Picrophilus torridus Ferroplasma acidarmanus Thermoplasmatales Thermoplasma acidophilum Thermoplasma volcanium Archaeoglobus fulgidus Archaeoglobales Haloarcula marismortui Halobacterium sp HGT 1 Halobacteriales Natronomonas pharaonis ehngncasII class Methanogen Methanospirillum hungatei Methanomicrobia Methanomicrobiales HGT 2 Methanococcoides burtonii Methanosarcina barkeri Methanosarcinales Methanosarcina mazei Methanosarcina acetivorans

AnFigure evolutionary 6 scenario for the origin and the evolution of the archaeal flagellum An evolutionary scenario for the origin and the evolution of the archaeal flagellum. Blue, red, orange and purple clusters repre- sent fla1 cluster, fla2A cluster, fla2B cluster and their ancestor, respectively. The black circle represents the last common ancestor of Archaea. The red circle represents the recent duplication event of the fla2 cluster in ancestor of Methanosarcina. The blue and red crosses represent the loss of the fla1 and fla2A clusters, respectively. The green arrows represent horizontal gene transfers. fying their motility structures and components [28]. the expression and localization of the flagellum compo- Indeed, the uncomplete genome of the extreme acido- nents in Methanosarcinales, in the light of the fact that the philic bacterium (optimal pH<3) Acidobacterium capsula- flagellum genes of Methanosarcinales may have been tum contains a clear homologue of bacterial flagellin, recruited from the distantly related Crenarchaeota, since indicating that adaptation to acidic environments was this event may have been an important step in the evolu- possible without its replacement. tion of this archaeal lineage. Moreover, virtually nothing is known about other types of motility than swimming in The lack of congruence between the description of archaea [28], while in bacteria these are starting being archaeal species as motile or non-motile and the taxo- investigated [4]. The fact that no homologues of flagellum nomic distribution of homologues of flagellum compo- components are encoded in the genomes of at least two nent coding genes[5] is particularly striking and archaeal species that are described as motile (M. kandleri underlines the need to explore more in depth the motility and P. aerophilum) is also puzzling. Although it is possible systems in Archaea. The presence of two gene clusters in that the strains used for genome sequencing have lost the non motile Methanosarcinales is particularly puzzling. flagellum operon following lab cultivation (see for exam- Either these species can be flagellated under particular ple [32], it would surely be interesting to test motility in conditions that have not yet been tested, or the flagellum these archaeal species. Alternatively, this observation component homologues are involved in other functions could also suggest that other types of motility may occur than motility (for example, they could be required for cell- in archaea and are made possible by still unknown molec- cell adhesion to form cell aggregates, a peculiarity of this ular structures. It would also be very useful to investigate archaeal family). It will be extremely interesting to study the relationship between the flagellum and the secretion

Page 10 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106

C.symbiosum Type II cluster P. aerophilum B D/E G F H I J

A. pernix B B I J

G H I J S. acidocaldarius I J B G F H I J B G F H I J S. solfataricus B G F H I J B G F H I J S. tokodaii B G F H I J N. equitans

J T. kodakarensis B B B B B C D F G H I J B B C D F G H I J P. furiosus Ancestral cluster B B C D F G H I J B B B B B C D F G H I J P. horikoshii B B B B B C D F G H I J

B B B C D F G H I J P. abyssi B B B C D F G H I J M. kandleri B D/E F G H I J M. thermautotrophicus C B B B C D D F G H I J Type I cluster M. jannaschii B B B C D E F G H I J B B C D D F G H I J M. maripaludis B B B C D E F G H I J B B B C D D F F G H I J F torridus acidarmanus

T. acidophilum B B C D F G H I J HGT 1 B B C D F G H I J T. volcanium B B C D F G H I J A. fulgidus B B G F H I J H. volcanii B B D B B B D C D F I J H. marismortui C D F G H I J B B B D/E C D F G G H I J N. pharaonis B HGT 2 D I J B B D F G G H I J F H. sp G B B B C D I J C G H I J M. hungatei B B B G G F H I J B B B G G F H I J M. burtonii B D C D F G H I J B D CD F G H I J B B G F H I J B G F H I J M. barkeri B G F H I J B D C D F G H I J B B G F H I J B B G F H I J M. mazei B B G F H I J G F H I J M. acetivorans B G F H I J B G F H I J B G F H I J

AFigure detailed 7 evolutionary scenario for the origin and the evolution of the archaeal flagellum based on figure 6 A detailed evolutionary scenario for the origin and the evolution of the archaeal flagellum based on figure 6. The purple cluster represents the ancestor of fla1 and fla2 clusters. The black circle represents the last common ancestor of Archaea. Blue arrows represent the recent duplication events. Black arrows indicate gene movements to different locations from their origi- nal positions in the cluster. Dark green { symbols indicate gene insertions within the clusters. The blue, red and black crosses represent the loss of the fla1 cluster, fla2A cluster, or of single genes, respectively. A red arrow indicates gene inversion. F indi- cates the fusion of FlaC and FlaD. The green arrows represent horizontal gene transfers. systems in Archaea. Indeed, archaeal genomes harbor ponent were retrieved from the nr database at the only a few homologues of bacterial TypeII/IV secretion National Center for Biotechnology Information [40] systems [28], and it is not known whether they form pili, using the BlastP program with different seeds [41] and the despite rare observations [33-35] and evidence for conju- ALIBABA program (P. Lopez personal communication). gation [36-38]. For each dataset, additional searches using psi-Blast were performed to search for divergent homologues (especially To sum up, two radically different options for motility from bacteria) [41]. tBlastN searches on the unfinished were adopted at the divergence of the two prokaryotic genomes of the Halobacteriale Haloferax volcanii were per- domains, and much still remain to be uncovered on formed at TIGR [42]. Multiple alignments were done with archaeal motility systems. ClustalW [43] and MUSCLE [44]. For each dataset, the quality of the alignments obtained with CLUSTALW and Methods MUSCLE, was evaluated with T-COFFEE (CLUSTALW and Data set construction MUSCLE provided alignments of comparable quality, not The list of archaeal flagellum components was retrieved shown) [45]. All the alignments were edited and manu- from the Kyoto Encyclopedia of Genes and Genomes [39]. ally improved using the ED program from the MUST pack- Homologous sequences of each archaeal flagellum com- age [46]. Regions where the between sites was

Page 11 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106

doubtful were manually removed from the datasets for flagellar filaments and type IV pili. J Mol Biol 2002, 321(3):383-395. phylogenetic analyses. 11. Trachtenberg S, Cohen-Krausz S: The archaeabacterial flagellar filament: a bacterial propeller with a pilus-like structure. J Phylogenetic analysis Mol Microbiol Biotechnol 2006, 11(3-5):208-220. 12. Bardy SL, Jarrell KF: FlaK of the archaeon Methanococcus mari- Maximum Likelihood (ML) phylogenetic trees were com- paludis possesses preflagellin peptidase activity. FEMS Micro- puted with Phyml [47,48] and the JJT model of amino biol Lett 2002, 208(1):53-59. acid substitution (Jones, Taylor and Thornton [49]. A 13. Bardy SL, Jarrell KF: Cleavage of preflagellins by an aspartic acid signal peptidase is essential for flagellation in the archaeon gamma correction with 4 discrete classes of sites was used Methanococcus voltae. Mol Microbiol 2003, 50(4):1339-1347. to take into account the heterogeneity of evolutionary 14. Albers SV, Szabo Z, Driessen AJ: Archaeal homolog of bacterial type IV prepilin signal peptidases with broad substrate spe- rates across sites. The alpha parameter and the proportion cificity. J Bacteriol 2003, 185(13):3918-3925. of invariable sites were estimated for each dataset. The 15. Wieland F, Paul G, Sumper M: Halobacterial flagellins are sul- robustness of each branch was estimated by non-paramet- fated glycoproteins. J Biol Chem 1985, 260(28):15180-15185. 16. Mattick JS: Type IV pili and twitching motility. Annu Rev Microbiol ric bootstrap analysis (1000 replicates) using PHYML. In 2002, 56:289-314. addition, bayesian analyses were performed using 17. Thomas NA, Pawson CT, Jarrell KF: Insertional inactivation of the flaH gene in the archaeon Methanococcus voltae results MrBayes [50] with a mixed model of amino acid substitu- in non-flagellated cells. Mol Genet Genomics 2001, tion. As for ML tree reconstruction, a gamma correction 265(4):596-603. with 4 discrete classes of sites was used and the alpha 18. Rudolph J, Oesterhelt D: Deletion analysis of the che operon in the archaeon Halobacterium salinarium. J Mol Biol 1996, parameter and the proportion of invariable sites were esti- 258(4):548-554. mated. MrBayes was run with four chains for 1 million 19. Szurmant H, Ordal GW: Diversity in chemotaxis mechanisms generations and trees were sampled every 100 genera- among the bacteria and archaea. Microbiol Mol Biol Rev 2004, 68(2):301-319. tions. To construct the consensus tree, the first 1500 trees 20. Eisen JA: A phylogenomic study of the MutS family of pro- were discarded as "burnin". teins. Nucleic Acids Res 1998, 26(18):4291-4300. 21. Garrity GM: Bergey's Manual of Systematic Bacteriology. 2nd edition. Edited by: Garrity GM. New York , Springer-Verlag; 2001. Genomic context analysis 22. Galagan JE, Nusbaum C, Roy A, Endrizzi MG, Macdonald P, FitzHugh The genomic localization of each archaeal flagellum com- W, Calvo S, Engels R, Smirnov S, Atnoor D, Brown A, Allen N, Naylor J, Stange-Thomann N, DeArellano K, Johnson R, Linton L, McEwan P, ponent was manually investigated in all archaeal com- McKernan K, Talamas J, Tirrell A, Ye W, Zimmer A, Barber RD, Cann plete genomes available at the NCBI. I, Graham DE, Grahame DA, Guss AM, Hedderich R, Ingram-Smith C, Kuettner HC, Krzycki JA, Leigh JA, Li W, Liu J, Mukhopadhyay B, Reeve JN, Smith K, Springer TA, Umayam LA, White O, White RH, Authors' contributions Conway de Macario E, Ferry JG, Jarrell KF, Jing H, Macario AJ, Paulsen SG conceived the study and supervised the analyses. ED I, Pritchett M, Sowers KR, Swanson RV, Zinder SH, Lander E, Metcalf WW, Birren B: The genome of M. acetivorans reveals exten- carried out the analyses. CB and SG refined and com- sive metabolic and physiological diversity. Genome Res 2002, pleted the analyses and wrote the manuscript. All authors 12(4):532-542. read and approved the final manuscript. 23. Faguy DM, Bayley DP, Kostyukova AS, Thomas NA, Jarrell KF: Isola- tion and characterization of flagella and flagellin proteins from the Thermoacidophilic archaea Thermoplasma volca- References nium and Sulfolobus shibatae. J Bacteriol 1996, 178(3):902-905. 1. Alam M, Oesterhelt D: Morphology, function and isolation of 24. Redder P, Garrett RA: Mutations and rearrangements in the halobacterial flagella. J Mol Biol 1984, 176(4):459-475. genome of Sulfolobus solfataricus P2. J Bacteriol 2006, 2. Marwan W, Alam M, Oesterhelt D: Rotation and switching of the 188(12):4198-4206. flagellar motor assembly in Halobacterium halobium. J Bac- 25. Albers SV, Driessen AJ: Analysis of ATPases of putative secre- teriol 1991, 173(6):1971-1977. tion operons in the thermoacidophilic archaeon Sulfolobus 3. Macnab RM: The bacterial flagellum: reversible rotary propel- solfataricus. Microbiology 2005, 151(Pt 3):763-773. lor and type III export apparatus. J Bacteriol 1999, 26. Planet PJ, Kachlany SC, DeSalle R, Figurski DH: Phylogeny of genes 181(23):7149-7153. for secretion NTPases: identification of the widespread tadA 4. Bardy SL, Ng SY, Jarrell KF: Prokaryotic motility structures. subfamily and development of a diagnostic key for gene clas- Microbiology 2003, 149(Pt 2):295-304. sification. Proc Natl Acad Sci U S A 2001, 98(5):2503-2508. 5. Ng SY, Chaban B, Jarrell KF: Archaeal flagella, bacterial flagella 27. Peabody CR, Chung YJ, Yen MR, Vidal-Ingigliardi D, Pugsley AP, Saier and type IV pili: a comparison of genes and posttranslational MH Jr.: Type II protein secretion and its relationship to bac- modifications. J Mol Microbiol Biotechnol 2006, 11(3-5):167-191. terial type IV pili and archaeal flagella. Microbiology 2003, 6. Thomas NA, Bardy SL, Jarrell KF: The archaeal flagellum: a dif- 149(Pt 11):3051-3072. ferent kind of prokaryotic motility structure. FEMS Microbiol 28. Albers SV, Szabo Z, Driessen AJ: Protein secretion in the Rev 2001, 25(2):147-174. Archaea: multiple paths towards a unique cell surface. Nat 7. Faguy DM, Koval SF, Jarrell KF: Physical characterization of the Rev Microbiol 2006, 4(7):537-547. flagella and flagellins from Methanospirillum hungatei. J Bac- 29. Gribaldo S, Brochier-Armanet C: The origin and evolution of teriol 1994, 176(24):7491-7498. Archaea: a state of the art. Philos Trans R Soc Lond B Biol Sci 2006, 8. Metlina AL: Bacterial and archaeal flagella as prokaryotic 361(1470):1007-1022. motility organelles. Biochemistry (Mosc) 2004, 69(11):1203-1212. 30. Brochier C, Forterre P, Gribaldo S: An emerging phylogenetic 9. Cruden D, Sparling R, Markovetz AJ: Isolation and Ultrastructure core of Archaea: phylogenies of transcription and translation of the Flagella of Methanococcus thermolithotrophicus and machineries converge following addition of new genome Methanospirillum hungatei. Appl Environ Microbiol 1989, sequences. BMC Evol Biol 2005, 5(1):36. 55(6):1414-1419. 31. Cavalier-Smith T: The neomuran origin of archaebacteria, the 10. Cohen-Krausz S, Trachtenberg S: The structure of the archea- negibacterial root of the universal tree and bacterial mega- bacterial flagellar filament of the extreme Halo- classification. Int J Syst Evol Microbiol 2002, 52(Pt 1):7-76. bacterium salinarum R1M1 and its relation to eubacterial

Page 12 of 13 (page number not for citation purposes) BMC Evolutionary Biology 2007, 7:106 http://www.biomedcentral.com/1471-2148/7/106

32. Labes A, Schonheit P: Sugar utilization in the hyperther- mophilic, sulfate-reducing archaeon Archaeoglobus fulgidus strain 7324: starch degradation to acetate and CO2 via a modified Embden-Meyerhof pathway and acetyl-CoA syn- thetase (ADP-forming). Arch Microbiol 2001, 176(5):329-338. 33. Leadbetter JR, Breznak JA: Physiological ecology of Methano- brevibacter cuticularis sp. nov. and Methanobrevibacter cur- vatus sp. nov., isolated from the hindgut of the termite Reticulitermes flavipes. Appl Environ Microbiol 1996, 62(10):3620-3631. 34. Miroshnichenko ML, Gongadze GM, Rainey FA, Kostyukova AS, Lysenko AM, Chernyh NA, Bonch-Osmolovskaya EA: Thermococ- cus gorgonarius sp. nov. and Thermococcus pacificus sp. nov.: heterotrophic extremely thermophilic archaea from New Zealand submarine hot vents. Int J Syst Bacteriol 1998, 48 Pt 1:23-29. 35. Weiss RL: Attachment of bacteria to in extreme envi- ronments. J Gen Microbiol 1973, 77:501-507. 36. Prangishvili D, Albers SV, Holz I, Arnold HP, Stedman K, Klein T, Singh H, Hiort J, Schweier A, Kristjansson JK, Zillig W: Conjugation in archaea: frequent occurrence of conjugative plasmids in Sulfolobus. Plasmid 1998, 40(3):190-202. 37. Rosenshine I, Tchelet R, Mevarech M: The mechanism of DNA transfer in the of an archaebacterium. Science 1989, 245(4924):1387-1389. 38. Stedman KM, She Q, Phan H, Holz I, Singh H, Prangishvili D, Garrett R, Zillig W: pING family of conjugative plasmids from the extremely thermophilic archaeon Sulfolobus islandicus: insights into recombination and conjugation in Crenarchae- ota. J Bacteriol 2000, 182(24):7014-7020. 39. KEGG: [http://www.genome.jp/kegg/]. 40. NCBI: [http://www.ncbi.nlm.nih.gov/]. 41. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lip- man DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25(17):3389-3402. 42. TIGR: [http://tigrblast.tigr.org/]. 43. Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994, 22(22):4673-4680. 44. Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 2004, 32(5):1792-1797. 45. Poirot O, O'Toole E, Notredame C: Tcoffee@igs: A web server for computing, evaluating and combining multiple sequence alignments. Nucleic Acids Res 2003, 31(13):3503-3506. 46. Philippe H: MUST, a computer package of Management Utili- ties for Sequences and Trees. Nucleic Acids Res 1993, 21(22):5264-5272. 47. Guindon S, Gascuel O: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 2003, 52(5):696-704. 48. Guindon S, Lethiec F, Duroux P, Gascuel O: PHYML Online--a web server for fast maximum likelihood-based phylogenetic inference. Nucleic Acids Res 2005, 33(Web Server issue):W557-9. 49. Jones DT, Taylor WR, Thornton JM: The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci 1992, 8(3):275-282. Publish with BioMed Central and every 50. Huelsenbeck JP, Ronquist F: MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 2001, 17(8):754-755. scientist can read your work free of charge "BioMed Central will be the most significant development for disseminating the results of biomedical research in our lifetime." Sir Paul Nurse, Cancer Research UK Your research papers will be: available free of charge to the entire biomedical community peer reviewed and published immediately upon acceptance cited in PubMed and archived on PubMed Central yours — you keep the copyright

Submit your manuscript here: BioMedcentral http://www.biomedcentral.com/info/publishing_adv.asp

Page 13 of 13 (page number not for citation purposes)