Opinion

Lateral transfer challenges principles of microbial systematics

Eric Bapteste1 and Yan Boucher2

1 UPMC UMR 7138, 7 quai Saint-Bernard, Bâtiment A, 4ème étage, 75005, Paris, France
2 Department of Civil and Environmental Engineering, MIT, Building 48–305, 77 Massachusetts Avenue, Cambridge, MA 02139, USA

Corresponding author: Bapteste, E. ([email protected]).

Evolutionists strive to learn about the natural historical process that gave rise to various taxa, while also attempting to classify them efficiently and make generalizations about them. The quantitative importance of lateral gene transfer inferred from genomic data, although well acknowledged by microbiologists, is in conflict with the conceptual foundations of the traditional phylogenetic system erected to achieve these goals. To provide a true account of microbial evolution, we suggest developing an alternative conception of natural groups and introduce a new notion – the composite evolutionary unit. Furthermore, we argue that a comprehensive database containing overlapping taxonomical groups would constitute a step forward regarding the classification of microbes in the presence of lateral gene transfer.

Introduction

The molecular project conceived by Zuckerkandl and Pauling [1] in the 1960s was ambitious. Among other revolutionary accomplishments, molecular phylogenetics was expected to function as a powerful time machine, enabling the identification of genetic, ultrastructural and metabolic features of ancient life forms for which no fossils had been left [2]. Through their congruence (i.e. the agreement between phylogenies obtained using different datasets) [2], phylogenies could help to reconstruct what is often called the Tree of Life (TOL). To understand ancient microbial evolution, the biggest challenges have been seen as mostly methodological: improving phylogenetic algorithms to model accurately the complex evolution of molecules [3] and sequencing a sufficient number of phylogenetic markers [4]. Using a wealth of methods and data, TOLs flourished [5,6].

Yet, over the past 15 years, lateral inheritance (as opposed to vertical descent) was discovered to be a major evolutionary force in prokaryotes [7–11]. For archaea, bacteria and some unicellular eukaryotes, individual gene histories can legitimately differ from species history, and the two phylogenetic patterns (species trees and gene trees) do not have to show much identity with one another on a broad evolutionary scale. Microbial physiologists and geneticists were not surprised by the fact that a single genome could comprise genes arising from multiple phylogenetic sources, yet it conflicted with the conceptual foundations of the phylogenetic system.

As a result, the traditional TOL reconstruction project, as far as prokaryotic organisms are concerned, fell short. It is arguable whether debating the branching order in the TOL and looking for a unique nested hierarchy is a satisfactory way to classify such microbes in the presence of lateral gene transfer. Instead, we propose alternative concepts to the traditional phylogenetic projects to deal with microbial phylogenetics and systematics: (i) a redefinition of natural groups; (ii) the description of a new type of evolutionary unit originating from lateral gene transfer (LGT); and (iii) the realization of an interactive taxonomical database (comprising overlapping groups) to progress towards a more natural classification. The third of these solutions would constitute a transition possibly as significant as the change from a linear system of classification to a nested hierarchy that occurred thousands of years ago.

Glossary

Essentialism: the view that some permanent, unalterable properties of objects are essential to them, so that, for any specific type of entity, it is at least theoretically possible to specify a finite list of characteristics, all of which must be possessed by any entity to belong to the group defined. For instance, for a property essentialist, all essential parts of a species remain unchanging throughout time. In historical essentialism, the unchanging essential characteristic is a common history. A monophyletic group is thus natural because it is defined by the existence of a last common ancestor exclusively shared by all its members, even though these members are not similar to each other in other respects (ecologically, morphologically, functionally etc.).

Mill, John Stuart: British philosopher (1806–1873) who was an influential liberal thinker. He is notably famous for his defense of utilitarianism and his book A System of Logic: Ratiocinative and Inductive, published in 1843, describing the five basic principles of induction and the methods of scientific inquiry.

Monism: at the methodological level, the view that a single method and a unique representation can account satisfactorily for the unified set of laws that underlie nature.

Pluralism: opposes monism by endorsing the view that several methods and theories are legitimate in an evolutionary study because no single coherent explanatory system can account satisfactorily for all the diverse phenomena of life.

Polythetic: a phylogenetic group in which ‘(i) each individual has a large but unspecified number of a set of properties occurring in the aggregate as a whole; (ii) each of those properties is possessed by large numbers of those individuals; (iii) not one of those properties is possessed by every individual in the aggregate’, as explained in Ref. [46].

Synapomorphy: a derived character state shared by two or more terminal groups (taxa included in a cladistic analysis as further indivisible units) and inherited from their most recent common ancestor.
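The glossary's three-part definition of a polythetic group can be made concrete with a small check. The sketch below is only an illustration under stated assumptions: the function name `is_polythetic`, the lineage and trait names, and above all the `min_fraction` threshold are hypothetical, because the quoted definition leaves "large" deliberately unspecified.

```python
from typing import Dict, Set


def is_polythetic(members: Dict[str, Set[str]], min_fraction: float = 0.5) -> bool:
    """Return True if the group meets the three glossary criteria.

    members maps each individual to the set of properties it possesses.
    min_fraction is an ASSUMED cut-off standing in for "large number";
    the quoted definition leaves that number unspecified.
    """
    if not members:
        return False
    all_properties = set().union(*members.values())
    if not all_properties:
        return False
    n_members = len(members)
    # Count how many individuals possess each property.
    carriers = {
        prop: sum(prop in owned for owned in members.values())
        for prop in all_properties
    }
    # (i) each individual has a large share of the group's properties
    each_has_many = all(
        len(owned) / len(all_properties) >= min_fraction
        for owned in members.values()
    )
    # (ii) each property is possessed by a large share of the individuals
    each_is_common = all(n / n_members >= min_fraction for n in carriers.values())
    # (iii) no property is possessed by every individual
    none_universal = all(n < n_members for n in carriers.values())
    return each_has_many and each_is_common and none_universal


# Hypothetical lineages and traits, for illustration only.
polythetic_group = {
    "lineage_1": {"photoreceptor", "flagellum"},
    "lineage_2": {"flagellum", "nitrogen_fixation"},
    "lineage_3": {"photoreceptor", "nitrogen_fixation"},
}
monothetic_group = {
    "lineage_1": {"photoreceptor", "flagellum"},
    "lineage_2": {"photoreceptor", "nitrogen_fixation"},
    "lineage_3": {"photoreceptor", "flagellum"},
}
print(is_polythetic(polythetic_group))
print(is_polythetic(monothetic_group))
```

In the second group every lineage carries a photoreceptor, so criterion (iii) fails: a single shared property is enough to make the group definable by an "essence", which is exactly what a polythetic group lacks.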

0966-842X/$ - see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.tim.2008.02.005 Available online 15 April 2008. Trends in Microbiology Vol.16 No.5

Box 1. Different types of evolutionary trees

Two types of evolutionary trees are currently being reconstructed. First are genome trees, based on the statistical properties of the genome, on the presence or absence of genes, on the chromosomal gene order or on average sequence similarity, as calculated in BLAST analyses (and variants thereof). Second are phylogenomic trees, based on vertically inherited orthologs [17]. Genome trees provide a way to compare the evolutionary information present in different genomes. However, they do not reflect the exact course of organismal evolution and should not be interpreted as phylogenies. In no case is the relevance of the tree model tested in these approaches. Furthermore, such phenetic trees are especially complex to interpret because some of the groupings obtained can result principally from lateral relationships, whereas others result from vertical ones. In summary, genome trees show prevailing trends in the evolution of genome-scale gene sets [16].

By contrast, phylogenomic trees (species trees) are reconstructed to learn about the pattern of natural relationships between species [18,19] on the basis of strictly vertically inherited markers. To this end, molecular datasets are trimmed to exclude genes with conflicting signals. There are, however, very few data for which one can confidently assess a strictly vertical transmission, resulting in skeletal microbial phylogenomic trees, built on a very small amount of information. The latest TOL, published by Ciccarelli et al. [6], which Dagan and Martin legitimately renamed the ‘tree of one per cent’ [20], is a good example of this. In addition, this approach is probably unintentionally essentialist, because a few characters are being reified in the name of the congruence between the gene trees and the species tree. Such a definition makes genes the essence of species in a systematic scheme based on [...]. Such an approach is likely to be endlessly criticized, for instance in the debate over the choice of what is an essential character, or when essentialist definitions of species are being rejected from the evolutionary field [47,48].

Box 2. LGT and the definition of natural groups

Consider four organisms, in two independently evolving lineages: two photobacteria (P1 and P2) having photoreceptors; and two flagellobacteria (F3 and F4) harboring a flagellum (see Figure 1 in the main text). Suppose that, at t1, a descendant from P2 laterally acquired a flagellum from an F4 relative in addition to its photoreceptors. How should the chimeric P2 descendant be classified? Multiple answers seem possible: (i) because it harbors photoreceptors, the P2 descendant could be joined to the photobacteria; (ii) because it harbors a flagellum, the P2 descendant could be joined to the flagellobacteria; (iii) because it presents both photoreceptors and a flagellum, the P2 descendant is something new, neither a photobacterium nor a flagellobacterium. Evolutionists generally rely on historical evidence, and consider the photoreceptors as a synapomorphy, the flagellum as a bad character for natural classification, and the P2 descendant as ‘a photobacterium that acquired a flagellum’ (the exact description of its evolutionary history).

Suppose now that, later, at t2, the P2 descendant lost its photoreceptors. Then, the P2 descendant would only harbor a flagellum, homologous to those found in F4 and F3 descendants. Would it be considered a flagellobacterium? This would seem the most natural solution, given the presence of a flagellum, which is the essence of the flagellobacteria category, and the absence of other traits that would suggest an alternative classification. Yet, it would be in direct contradiction with the historical logic used at t1, according to which ‘being a photobacterium’ means to have a last common ancestor that had photoreceptors, regardless of the make-up of the extant descendants. Paradoxically, this solution both describes the notion of photobacteria and empties it of its substance, creating groups where no part (and thus no gene) can define the ‘essence’ of species and of higher taxa. If a descendant of F3 subsequently lost its flagellum, flagellobacteria would become ‘bacteria with or without a flagellum, knowing that not all bacteria with a flagellum are flagellobacteria’ and photobacteria ‘bacteria with or without photoreceptors, with or without a flagellum’, two descriptions indistinguishable from each other if one ignores the history of these features. In the presence of LGT (and in the absence of historical evidence), some groups could seem ‘more natural’ (i.e. all the flagellated organisms sharing homologous characters, all the organisms with homologous photoreceptors etc.) than the polythetic groups of higher level in the TOL. In the presence of LGT, Millian and historical essentialist definitions of a ‘natural group’ will fail to produce a consensual microbial systematics.

Problems in traditional tree making

Despite the molecular saturation problem [12], responsible for the weakness of phylogenetic signals on a large evolutionary scale, and other tree reconstruction artifacts [13], phylogenetics showed that some genes are congruent with each other, whereas other markers display significantly conflicting signals [14,15]. This situation affects the meaning of the two main types of tree-like phylogenies of life under reconstruction (Box 1). On the one hand, genome trees [16,17] (based on genomic properties or content) provide only central tendencies. Such trees index taxa well but they do not tell us much about their history, speciation events, etc. On the other hand, phylogenomic trees (built from strictly vertically inherited markers [18,19]) have a limited power to explain the features of extant and past microbial biodiversity because the vast majority of the molecular characters (at least) might have evolved along evolutionary patterns other than the vertical one. Simply put, genealogical relationships might differ significantly from similarity relationships. In this case, the utility of a ‘tree of one per cent’ [20] to generalize about the genomic and genetic evolution of a lineage is probably close to nil on a broad evolutionary scale.

Consequently, it is arguable whether groups derived from such vertical trees should be held as ‘natural’. Monophyletic groups are considered natural because all of their members share an exclusive last common ancestor, termed a ‘historical essentialist’ (see Glossary) definition of the natural group by Rieppel [21]. This is in sharp contrast with the definition that Simpson or Mayr used in their classification. Theirs was closer to that of Mill [22] (i.e. ‘groups respecting which a greater number of general propositions can be made, [...] than could be made respecting any other groups into which the same things could be distributed’). In the case of animals, these two classical definitions can overlap. Yet, in prokaryotes, LGT exacerbates the tension between these two definitions of natural groups [Figure 1, Box 2 and Table 1 (data from Garrity [23])]. Clearly, molecular-based systematics requires phylogenetic characters whose history is decipherable and stable enough to form groups. However, in a more adverse context, when molecular phylogeneticists try to define taxonomical categories of high rank (such as the Haloarchaea, Alphaproteobacteria and methanogens), we argue that they try to solve an issue that cannot be conclusively resolved in traditional terms.

Because neither of the two different tree-based approaches (genomic or verticalist) satisfactorily fulfills the goals of traditional phylogenetic systematics for microbial organisms (i.e. to produce informative natural groups), we suggest nontraditional alternatives to address


the problems of phylogenetic systematics, such as that raised in Box 2.

Figure 1. Questioning natural groups in the presence of LGT. Two unrelated hypothetical bacterial lineages: the photobacteria (with a photoreceptor, symbolized by a crown, and two species, P1 and P2) and the flagellobacteria (with a flagellum, and two species, F3 and F4). Their evolution (through the gain and loss of the aforementioned features) unfolds from the top to the bottom of the drawing, and their corresponding morphology is represented at three different times (t0, t1 and t2). At these different times, the classification of P2 descendants in a ‘natural group’ is particularly arguable, notably under the historical essentialist definition (Box 2).

Alternative approaches to microbial phylogenetics and systematics

Proposition of an alternative definition of natural groups

A third definition of a natural group (neither Millian nor historical essentialist), inspired by the work of Splitter [24] (and other philosophers [25,26]), could prove useful for microbial phylogenetics. For Splitter, a specialist in the species concept debate, a natural group is ‘natural when it is causally efficacious, relative to some explanatory theory’ [24] – that is, natural groups are real when they have a real causal impact and real consequences on the biological world. Under this definition, evolutionary units – because they have a causal effect and have a role in the natural world – are natural groups. The consequences of such a perspective are far reaching. First, higher taxa (e.g. the Proteobacteria) might not be considered as a natural group under this definition because there is no such thing as a real causal impact of the Proteobacteria phylum (i.e. there is not a single physiological feature shared by all Proteobacteria that is not a general feature of bacterial cells). Such higher taxa are an arbitrary way of classifying the living world rather than the natural one. Second, because multiple evolutionary units of all sizes have a role in different biological processes, natural groups in a revised systematics are expected to be diverse.

Despite their variability, the emergence of evolutionary units seems to follow a general scheme that enables their

Table 1. Some examples of physiological properties showing variation within and between taxonomic groups of microbes

Property | Taxonomic group | Positive representative | Negative representative
Anoxygenic photosynthesis | Family Bradyrhizobiaceae | Rhodopseudomonas palustris | Nitrobacter winogradskyi
Anoxygenic photosynthesis | Family Rhodocyclaceae | Rhodocyclus purpureus | Azoarcus communis
Anoxygenic photosynthesis | Family Ectothiorhodospiraceae | Ectothiorhodospira marina | Nitrococcus mobilis
Dissimilatory sulfate reduction | Family Archaeoglobaceae | Archaeoglobus fulgidus | Ferroglobus placidus
Dissimilatory sulfate reduction | Family Nitrospiraceae | Thermodesulfovibrio yellowstonii | Nitrospira marina
Dissimilatory sulfate reduction | Order Desulfobacterales | Desulfotalea psychrophila | Nitrospina gracilis
Nitrification | Family Bradyrhizobiaceae | Nitrobacter winogradskyi | Rhodopseudomonas palustris
Nitrification | Family Nitrospiraceae | Nitrospira marina | Thermodesulfovibrio yellowstonii
Nitrification | Order Desulfobacterales | Nitrospina gracilis | Desulfotalea psychrophila
Nitrification | Family Ectothiorhodospiraceae | Nitrococcus mobilis | Ectothiorhodospira marina
Nitrogen fixation | Genus Azoarcus | Azoarcus communis | Azoarcus anaerobius
Nitrogen fixation | Genus Methanococcus | Methanococcus maripaludis | Methanococcus vannielii
Nitrogen fixation | Genus Rhodocyclus | Rhodocyclus tenuis | Rhodocyclus purpureus
Sulfur oxidation | Family Hydrogenophilaceae | Thiobacillus denitrificans | Hydrogenophilus thermoluteolus
Sulfur oxidation | Family Sulfolobaceae | Sulfolobus solfataricus | Stygiolobus azoricus
Sulfur oxidation | Family Ectothiorhodospiraceae | Ectothiorhodospira marina | Nitrococcus mobilis
Hyperthermophily | Order Methanococcales | Methanocaldococcus jannaschii | Methanococcus vannielii
Hyperthermophily | Family Thermotogaceae | Thermotoga maritima | Geotoga petraea
Obligate aerobiosis | Family Desulfurococcaceae | Aeropyrum pernix | Desulfurococcus mobilis
Obligate aerobiosis | Family Hydrogenophilaceae | Hydrogenophilus thermoluteolus | Thiobacillus denitrificans


Figure 2. Schematic description of coherent and composite evolutionary units. (a) Scheme of a lower-level evolutionary unit, symbolized by a small circle with two arrows. Circles of similar color correspond to phylogenetically related evolutionary units. Circles of different colors correspond to phylogenetically unrelated evolutionary units. (b) Scheme of a coherent higher-level evolutionary unit, emerging from a selective process applied to many phylogenetically related lower-level evolutionary units. (c) Scheme of a composite higher-level evolutionary unit, emerging from a selective process applied to many phylogenetically unrelated lower-level evolutionary units. Selective processes might involve selection on a function, environmental pressures, natural selection, interbreeding, homeostatic loops etc.

characterization. In its simplest form, an evolutionary unit rests on the integrated association of lower-level elements that can be replicated and are held together by some biological mechanism (Figure 2). Depending on which biological process is responsible for the integration of the lower-level elements of the ‘whole’ evolutionary unit, these evolutionary units are more or less familiar to phylogeneticists (and to systematicists). Animal ‘species’, for example, are macroscopic evolutionary units emerging when a reproductive process (interbreeding) causes the functional integration of a set of organisms which are similar enough to interbreed, and thus results in the relative persistence in the traits of their offspring across time (Figure 2b). Based on such a process, these natural groups comprise organisms that show some similarity (i.e. that are more similar to one another than to organisms of another interbreeding group). In this case, knowing the genealogical relationships certainly helps in proposing a useful phylogenetically based classification: monophyletic groups can match natural groups (sensu Mill or Splitter), providing a good index of biodiversity and yielding explanatory power. Yet, as philosophers have long known, there is no necessity for the various integrated constituents of an evolutionary unit to have a unique coherent phylogenetic origin or to show similarity with each other [27,28]. In fact, and especially for microbes, the representatives of which far outnumber members of animal ‘species’, biological processes other than interbreeding can be responsible for the functional integration of diverse molecular constituents and the emergence of more disparate – yet real – evolutionary units (Figure 2c).

Introducing composite evolutionary units

Microbiologists are also familiar with phylogenetically diverse, yet functionally integrated groups. In nature, coevolving associations of multiple phylogenetically distinct microorganisms are frequent. Most importantly, such evolutionary units often display emerging properties that none of their constituent parts harbors alone. For example, syntrophic microbial consortia, composed of multiple organisms with various metabolisms, are able to achieve chemical reactions that would be energetically unfavorable if carried out by a single microbe. Such a relationship was uncovered between closely associated methanotrophic archaea and sulfate-reducing bacteria found in anoxic marine sediments [29]. In this case, the archaeal partner metabolizes methane and the bacteria use a resulting metabolite as an electron source. Other examples include oxidation of fermentative end products by acetogenic bacteria in the presence of methanogens [30], anaerobic oxidation of methane coupled to denitrification [31] and mineralization of chlorinated aromatic compounds under methanogenic conditions [32]. Furthermore, ecological and environmental pressures influencing LGT, and consequently the genetic units composing organisms, create evolutionary units by the association of different genes or pathways within organisms. For instance, significant LGT has been detected between Sulfolobales and members of the Thermoplasmatales, two phylogenetically distant phyla that frequently share thermoacidophilic environments [33]. The evolution of the hyperthermophilic bacteria Thermotogales is also likely to have been shaped by their uptake of DNA from the archaea that often share their environment [34].


Finally, it is essential to realize that composite evolutionary units of all sizes and levels can emerge in nature. Such units rely on parts which might have different origins, some global biological process being responsible for their association while selection is acting on the emerging higher-level phenotype. For instance, when a biological function is selected, the composition of its lower-level structural components can be flexible (e.g. bacteria can synthesize the essential isoprenoid building block isopentenyl diphosphate through two analogous pathways, one using 1-deoxy-D-xylulose-5-phosphate as a precursor and the other using mevalonate [35]). Thus, the list of genes able to fulfill a function can be extensive. The mix–match model proposed by Charlebois and Doolittle [36] (Box 3) formalized this idea well by describing modular evolutionary units of all sizes which are more or less flexible in their composition because not all their lower-level constituents have to be the same forever.

Consequently, studies on LGT strongly suggest the existence of multiple levels of selection and the presence of many biological ‘individualities’ in complex interactions in the microbial world. We thus argue for a richer view of biodiversity, comprising more evolutionary units than the mere ‘species’ and ‘genes’ generally considered in traditional phylogenetics, and thus more natural groups to classify.

Box 3. Different types of composite evolutionary units

Particularly relevant for prokaryotic genome evolution, the mix–match model stems from the idea that cells need to fulfill different functions but that the genes responsible for realizing these multiple functions might differ over time. It proposes that, for a given function, the available genes can belong to different gene families (i.e. be ‘analogous’, non-homologous markers) and that the set of genes fulfilling a given function varies during the course of evolution (owing to gene and function loss). As a result, new genomic lineages would arise through mixing and matching of genes performing different functions, not only by vertical descent, but also by processes of replacement. Thus, ‘where there are many analogous types of genes ... that can perform the same general function (e.g. energy production or cell envelope formation), the living world will collectively exhibit much variability, and there will be no ubiquitous sets of genes that appear as part of any universal core. Where choices are more limited, most genes performing the needed function (some step in translation for instance) being homologous, there will appear to be little variability’ [36].

This model can account for the evolution of two distinct types of composite evolutionary units, as described in the main text. On one hand, if the set of genes fulfilling the selected function remains stable, the collection of lower-level elements from which the function emerges is limited, and the genetic composition of the evolutionary unit is mostly definable. In this case, the unit is mostly rigid: in theory, its constitutive elements can be listed exhaustively. The translation machinery seems to be a good example of this, as already noted by Charlebois and Doolittle [36]. On the other hand, composite evolutionary units can be built from many different elements changing over time. In this case, the unit is mostly flexible; it has a tendency to vary in the details of its make-up over long historical periods. An example of this would be the methionine biosynthesis pathway, in which the enzymes catalyzing each of the various steps can differ between organisms but still catalyze the same reaction [49]. Woese [50] also seems to defend a comparable view. For him, the components of the cell ‘are modular to one extent or another’, and if, in the integrative process, some cellular functions ‘became more or less refractory to horizontal gene flow ... still others of them remained, and remain today, subject to the vagaries of horizontal gene flow’ [50].

Pluralistic microbial ontology

If natural biodiversity is truly irreducible to a hierarchic scheme and cannot be studied with accuracy under a single model, a pluralistic approach [37] is then ontologically justified to acknowledge the multiplicity of evolutionary units in nature, as long as ‘objects both large and small have an equal reality and causal efficacy’ [38].

In this context, the question of the origin of a microbe is superseded by (i) the question of the origins of its many constitutive elements (the various smaller evolutionary units of which it is made) and (ii) the question of whether this organism might itself belong to larger composite evolutionary units. This transition – searching for the multiple origins of a microbe rather than its unique origin and for the many natural groups to which a microbe belongs rather than its unique natural group – might seem counterintuitive. Indeed, evolutionists are familiar with assigning a unique phylogenetic position to microbial lineages, as if all their parts originated from a unique point in space and time and remained cohesive. Yet, the study of the origins (note the plural form) of microbes would be consistent with a deeper understanding of the evolutionary theory, in which phylogenetics means ‘phylum genesis’, the processes by which various evolutionary units emerge across time rather than the ‘branching pattern arising through evolutionary time’.

Importantly, our model presupposes populations of elements on which selection is functioning to sustain evolutionary units. Because elementary parts of microbes can originate from pools of phylogenetically diverse genes, different parts might come from different populations. Consequently, the further back we move in the history of microbial evolutionary units, the more useless and empty the notion of a single last common ancestor becomes. Evolutionary units present in a microbial population are likely to have been carried by multiple separate populations in the past, in different combinations. Only a variety of evolutionary trees – as opposed to a unique phylogeny – would enable us to approximate these different ancestral combinations of features, by trying to reconstruct the history of these smaller gene associations. Hence, it seems important to revise some of our phylogenetic and systematic practices.

Revised practices in microbial phylogenetics and systematics

Revised phylogenetic practices

A good phylogenetic analysis of multiple markers no longer consists in the mere addition of various phylogenetic signals through concatenation to obtain the best unique topology. The accumulation of data under the null hypothesis that there is a common tree, without having a chance to refute this premise, even for data of poor phylogenetic quality, suffers from a logical flaw [39]. In the presence of LGT, the resolution in a concatenated tree can no longer be taken as evidence for the existence of a tree. Instead, the validity of the null hypothesis must be tested by exploring the origin of the resolution in such a super-tree, and by testing whether its support is genuine or artifactual [40]. In addition, phylogenetic analysis of complete genomes


(from pure cultures or from metagenomic projects) could systematically include a decomposition analysis to identify incongruent phylogenetic patterns within individual genomes. This analysis would isolate the various incongruent sets of genes that every single genome comprises and could thereby inform us about potential smaller-level evolutionary units that are part of the genomic make-up of any microbe [41]. Some software, such as Concaterpillar, which uses a hierarchical likelihood ratio test framework to assess both the topological congruence between gene phylogenies and branch-length congruence [42], could help in this task. Moreover, phylogeneticists could search systematically for local congruencies between a priori unrelated gene phylogenies – that is, trees of a same environment or between distantly related taxa. Starting with thousands of topologies issued from metagenomic or genomic projects, analyses of split decomposition identifying common bipartitions or common embedded quartets [43] should enable the discovery of coevolving sets of genes of all sizes. If these sets of genes prove to have a role in the evolutionary process, they too could help in discovering composite evolutionary units.

Overlapping microbial taxonomies

The complexity of the evolutionary process acting on microbes indicates that a single taxonomy is likely to provide an overly coarse picture of microbial relationships. As shown in Table 1, the binomial nomenclature and the sole hierarchical classification are a poor proxy of the genetic make-up of a microbe. By contrast, more taxonomies based on real biological processes could bring significant information that it would be arbitrary to overlook [44]. Discarding all but one of these process-based taxonomies would be comparable to reducing a person’s identity to a single aspect of his or her life, even though he or she might have an effective role in many organizations: professional, artistic, sportive, familial and so on. To avoid overlooking any of the natural groups, it seems legitimate to propose – rather than a single taxonomy of microbial species – many taxonomies describing the multiple evolutionary units and their role. Thus, we suggest giving up the unique hierarchy as the reference classification system and instead encourage the production of a comprehensive interactive database in which an individual could possibly belong to overlapping taxonomical groups.
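The search for bipartitions shared by several gene trees, mentioned under the revised phylogenetic practices, can be illustrated with a short sketch. It is a deliberately simplified stand-in, not the split-decomposition analyses of Ref. [43] nor the Concaterpillar test of Ref. [42]: trees are encoded as nested tuples of taxon names, every tree is assumed to cover the same taxa, and the function names and the three example gene trees are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Set, Tuple, Union

# ASSUMED encoding: a leaf is a taxon name, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]
Split = FrozenSet[FrozenSet[str]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Collect the taxon names found under a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def bipartitions(tree: Tree) -> Set[Split]:
    """Return the non-trivial bipartitions (both sides hold at least two taxa)."""
    all_taxa = leaves(tree)
    splits: Set[Split] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        rest = all_taxa - clade
        if len(clade) > 1 and len(rest) > 1:
            splits.add(frozenset([clade, rest]))
        return clade

    visit(tree)
    return splits


def genes_supporting_each_split(gene_trees: Dict[str, Tree]) -> Dict[Split, Set[str]]:
    """Map every bipartition to the set of genes whose tree contains it."""
    support: Dict[Split, Set[str]] = defaultdict(set)
    for gene, tree in gene_trees.items():
        for split in bipartitions(tree):
            support[split].add(gene)
    return dict(support)


# Hypothetical gene trees over five taxa A to E.
example_trees = {
    "gene1": ((("A", "B"), "C"), ("D", "E")),
    "gene2": (("A", "B"), ("C", ("D", "E"))),
    "gene3": ((("A", "D"), "C"), ("B", "E")),
}
for split, genes in genes_supporting_each_split(example_trees).items():
    sides = " | ".join("".join(sorted(side)) for side in sorted(split, key=sorted))
    print(sides, "supported by", sorted(genes))
```

Genes that repeatedly support the same bipartitions (here gene1 and gene2) are candidates for a coevolving set, whereas gene3 groups the taxa differently; whether such a set corresponds to a real evolutionary unit would still have to be established biologically.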

Figure 3. Three alternative typical classification systems. (a) The linear system (here in alphabetical order) is often unambiguous but uninformative about the history and properties of classified organisms. (b) The tree, informative on the vertical relationships of organisms but not necessarily on their properties. Typically, incongruent features are overlooked in such a hierarchical classification. (c) The interactive database, with its keywords and overlapping groups, where a given organism can be simultaneously placed in different taxonomical groups because it is naturally involved in different processes and belongs to multiple nonexclusive evolutionary units. Importantly, this system preserves the information concerning vertical inheritance learned from (b). Simply, this information becomes a part, and not the end, of evolutionary knowledge.
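The keyword-based system with overlapping groups of panel (c) can be sketched as a small in-memory index. This is only one possible reading, under stated assumptions: the class `OverlappingTaxonomy` and its methods are hypothetical, and the group labels are taken from Table 1 together with the domain each organism belongs to.

```python
from collections import defaultdict
from typing import Dict, Set


class OverlappingTaxonomy:
    """Hypothetical index in which one organism may sit in many groups."""

    def __init__(self) -> None:
        self._members: Dict[str, Set[str]] = defaultdict(set)  # group -> organisms
        self._groups: Dict[str, Set[str]] = defaultdict(set)   # organism -> groups

    def assign(self, organism: str, *groups: str) -> None:
        """Place an organism in every listed group (vertical or process-based)."""
        for group in groups:
            self._members[group].add(organism)
            self._groups[organism].add(group)

    def groups_of(self, organism: str) -> Set[str]:
        """All groups, overlapping or not, that contain the organism."""
        return set(self._groups.get(organism, set()))

    def search(self, *keywords: str) -> Set[str]:
        """Organisms that belong to every group named in the keywords."""
        if not keywords:
            return set()
        hits = [self._members.get(keyword, set()) for keyword in keywords]
        return set.intersection(*hits)


db = OverlappingTaxonomy()
# Vertical placement and physiological property recorded side by side.
db.assign("Methanocaldococcus jannaschii", "Archaea", "Order Methanococcales", "hyperthermophily")
db.assign("Methanococcus vannielii", "Archaea", "Order Methanococcales")
db.assign("Thermotoga maritima", "Bacteria", "Family Thermotogaceae", "hyperthermophily")
db.assign("Geotoga petraea", "Bacteria", "Family Thermotogaceae")

print(sorted(db.search("hyperthermophily")))
print(sorted(db.search("hyperthermophily", "Order Methanococcales")))
print(sorted(db.groups_of("Thermotoga maritima")))
```

A query on a single property returns a polyphyletic set spanning archaea and bacteria, while adding a lineage keyword narrows it again, so the vertical information of panel (b) is kept as one kind of group among others rather than as the sole axis of classification.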

205 Opinion Trends in Microbiology Vol.16 No.5

Through the elaboration of this database, phylogeneticists would be able to appreciate that the tree of cells is not the only evolutionary pattern and that it should not mask the complexity of microbial evolution. Importantly, the database would contain other patterns that evolutionists might also be willing to classify and generalize about. For instance, one should be able to generalize about the adaptation to high temperature in thermophiles or the survival of halophiles at high salt concentrations (irrespective of whether these groups comprise polyphyletic associations of archaea and bacteria), etc. Using this system, the extent of convergences and their genetic basis would be better appreciated, especially in prokaryotes. Such an evolutionary-based microbial systematics should also improve our working knowledge, providing keys to distinguish pathogenic microbes from benign ones, to classify bacterial communities and so on. To achieve this, we cannot rely exclusively on traditional genealogical relationships. Medical cases are obvious examples of this; if a patient is sick, what ultimately matters is to identify which particular genetic associations are responsible for the antibiotic resistance by the infectious organisms, and not the nature of the sister group of these organisms in the TOL. If all information about the evolutionary units composing microbes and their communities were to be recorded in a comprehensive database – just as we pool all the sequences known at the National Center for Biotechnology Information – we would be able to access them at the click of a mouse.

Our main reason to recommend a comprehensive database, rather than multiple ones, is easing scientific communication. However, we do not have a recipe for naming its taxonomical groups. Simple names referring to polyphyletic groups of organisms carrying specific evolutionary units are already used by the microbiology community. In practice, we do use the terms ‘denitrifier’, ‘sulfate reducer’ and ‘methanogen’, and know what they mean because these functions are associated with specific evolutionary units (sets of well-characterized genes allowing a certain biochemical function to be performed). We also use simple terms such as ‘’, ‘Proteobacteria’ and ‘Crenarchaeon’ knowing that these names also refer to evolutionary units but of a different type (monophyletic core sets of genes). Providing that these names encapture real evolutionary units – that is, not just whatever arbitrary suites of traits, but those having a causal role in the evolutionary process – they can all constitute valuable keywords in our evolutionary-based taxonomical database. Any given organism can then be characterized by many names because it can belong to more than one group at once, which is, in theory, testable. Furthermore, some fields of microbiology (metagenomics) do not use organisms, but rather DNA extracted directly from the environment, to investigate biological processes. This makes the use of concepts such as evolutionary units not only useful, but essential.

Importantly, the considerable progress that has been made in computer science makes non-tree-like, yet efficient, classifications realistic and promising. Classification systems with overlapping groups, previously known to be intractable, are no longer so. Anyone who has looked for a book on the internet, entering a series of keywords in a search engine, has experienced this: there is no need to use nested notions (such as a tree) to access the information. Thus, even though the transition from a tree-like structure of classification to a more dynamic reticulated system is probably as shocking as was the transition from a linear order to a series of dichotomies thousands of years ago (and in fact this is still encountering resistance nowadays [45]), it will most likely prove to be even more useful in microbiology (Figure 3).

Conclusions

We advocate here a pluralistic microbial systematics, multiplying names and taxa when it is legitimate – that is, when identifying biological units having a causal role in the evolutionary process – to avoid presenting an overly coarse view of microbial history. It would be important to evaluate whether such an alternative model offers a better description of natural diversity than that provided through a unique nested hierarchy, splitting the living world into various inclusive categories (i.e. taxa of high rank), many of them devoid of causal efficacy. This approach, applicable to archaea, bacteria and possibly unicellular eukaryotes, undoubtedly goes beyond the traditional classification on a ‘debated tree’ of ‘debated species’. It adds to the traditional classification, because it acknowledges the importance of the studies by various microbial specialists, including those of traditional molecular phylogeneticists, without giving absolute priority or exclusivity to the latter. For us, it could constitute a step forward by promoting a more informative and integrated systematics, implicating an increasing number of scientists in this huge task. We also expect the identification of composite evolutionary units through alternative phylogenetic analyses, less constrained by the tree formalism, to bring forth new perspectives about the evolution of life and its taxa. In contrast to the traditional practice of molecular phylogenetics centered around a unique tree, we feel that it is time for evolutionists to explore the whole phylogenetic forest.

Acknowledgements

We thank Ford Doolittle, Pascal Tassy, Michel Morange, Armand de Ricqlès and Jean Gayon for critical discussions, and also Chris Lane, Sara Hopkins and Hans Wildschutte for careful reading of the manuscript.

References

1 Zuckerkandl, E. and Pauling, L. (1965) Molecules as documents of evolutionary history. J. Theor. Biol. 8, 357–366
2 Zuckerkandl, E. and Pauling, L. (1965) Evolutionary divergence and convergence in . In Evolving Genes and Proteins (Bryson, V. and Vogel, H.J., eds), pp. 97–166, Academic Press
3 Felsenstein, J. (2004) Inferring Phylogenies, Sinauer
4 Cavalier-Smith, T. (1981) Eukaryote kingdoms: seven or nine? Biosystems 14, 461–481
5 Schwartz, R.M. and Dayhoff, M.O. (1978) Origins of prokaryotes, eukaryotes, mitochondria, and chloroplasts. Science 199, 395–403
6 Ciccarelli, F.D. et al. (2006) Toward automatic reconstruction of a highly resolved tree of life. Science 311, 1283–1287
7 Doolittle, W.F. (1999) Phylogenetic classification and the universal tree. Science 284, 2124–2129
8 Koonin, E.V. et al. (2001) in prokaryotes: quantification and classification. Annu. Rev. Microbiol. 55, 709–742
9 Thompson, J.R. et al. (2005) Genotypic diversity within a natural coastal bacterioplankton population. Science 307, 1311–1313


10 Lo, I. et al. (2007) Strain-resolved community proteomics reveals recombining genomes of acidophilic bacteria. Nature 446, 537–541
11 Hanage, W.P. et al. (2006) The impact of homologous recombination on the generation of diversity in bacteria. J. Theor. Biol. 239, 210–219
12 Penny, D. et al. (2003) Testing fundamental evolutionary hypotheses. J. Theor. Biol. 223, 377–385
13 Gribaldo, S. and Philippe, H. (2002) Ancient phylogenetic relationships. Theor. Popul. Biol. 61, 391–408
14 Bapteste, E. et al. (2005) Do orthologous gene phylogenies really support tree-thinking? BMC Evol. Biol. 5, 33
15 Susko, E. et al. (2006) Visualizing and assessing phylogenetic congruence of core gene sets: a case study of the gamma-proteobacteria. Mol. Biol. Evol. 23, 1019–1030
16 Wolf, Y.I. et al. (2002) Genome trees and the tree of life. Trends Genet. 18, 472–479
17 Snel, B. et al. (2005) Genome trees and the nature of genome evolution. Annu. Rev. Microbiol. 59, 191–209
18 Daubin, V. et al. (2002) A phylogenomic approach to bacterial phylogeny: evidence of a core of genes sharing a common history. Genome Res. 12, 1080–1090
19 Brochier, C. et al. (2005) An emerging phylogenetic core of Archaea: phylogenies of transcription and translation machineries converge following addition of new genome sequences. BMC Evol. Biol. 5, 36
20 Dagan, T. and Martin, W. (2006) The tree of one percent. Genome Biol. 7, 118
21 Rieppel, O. (2005) The philosophy of total evidence and its relevance for phylogenetic inference. Pap. Avulsos Zool. 45, 1–31
22 Mill, J.S. (1843) A System of Logic – Ratiocinative and Inductive, Longman
23 Garrity, G.M. (2005) Bergey’s Manual of Systematic Bacteriology (The Proteobacteria) (Vol. 2), Springer Verlag
24 Splitter, L.J. (1988) Species and identity. Philos. Sci. 55, 323–348
25 Brooks, D.R. (2001) Evolution in the information age: rediscovering the nature of the organism. SSED 1, 1–29
26 Collier, J.D. and Muller, S.J. (1998) The dynamical basis of emergence in natural hierarchies. In Emergence, Complexity, Hierarchy and Organization (Farre, G. and Oksala, T., eds), pp. 1–30, Acta Polytechnica Scandinavica
27 Ghiselin, M.T. (1974) A radical solution to the species problem. Syst. Zool. 23, 536–544
28 Kitcher, P. (1984) Species. Philos. Sci. 51, 308–333
29 Nauhaus, K. et al. (2007) In vitro cell growth of marine archaeal-bacterial consortia during anaerobic oxidation of methane with sulfate. Environ. Microbiol. 9, 187–196
30 Stams, A.J. (1994) Metabolic interactions between anaerobic bacteria in methanogenic environments. 66, 271–294
31 Strous, M. et al. (2006) Deciphering the evolution and metabolism of an anammox bacterium from a community genome. Nature 440, 790–794
32 Becker, J.G. et al. (2005) The role of syntrophic associations in sustaining anaerobic mineralization of chlorinated organic compounds. Environ. Health Perspect. 113, 310–316
33 Ruepp, A. et al. (2000) The genome sequence of the thermoacidophilic scavenger Thermoplasma acidophilum. Nature 407, 508–513
34 Nelson, K.E. et al. (1999) Evidence for lateral gene transfer between Archaea and bacteria from genome sequence of Thermotoga maritima. Nature 399, 323–329
35 Boucher, Y. et al. (2003) Lateral gene transfer and the origins of prokaryotic groups. Annu. Rev. Genet. 37, 283–328
36 Charlebois, R.L. and Doolittle, W.F. (2004) Computing prokaryotic gene ubiquity: rescuing the core from extinction. Genome Res. 14, 2469–2477
37 Doolittle, W.F. and Bapteste, E. (2007) Pattern pluralism and the Tree of Life hypothesis. Proc. Natl. Acad. Sci. U. S. A. 104, 2043–2049
38 Steel, D. (2004) Can a reductionist be a pluralist? Biol. Philos. 19, 55–73
39 Bucknam, J. et al. (2006) Refuting phylogenetic relationships. Biol. Direct. 1, 26
40 Bapteste, E. et al. (2008) Alternative methods for concatenation of core genes indicate a lack of resolution in deep nodes of the prokaryotic phylogeny. Mol. Biol. Evol. 25, 83–91
41 Azad, R.K. and Lawrence, J.G. (2007) Detecting laterally transferred genes: use of entropic clustering methods and genome position. Nucleic Acids Res. 35, 4629–4639
42 Leigh, J. et al. (2008) Testing congruence in phylogenomic analysis. Syst. Biol. 57, 104–115
43 Zhaxybayeva, O. et al. (2006) Phylogenetic analyses of cyanobacterial genomes: quantification of horizontal gene transfer events. Genome Res. 16, 1099–1108
44 Ereshefsky, M. (1992) Eliminative pluralism. Philos. Sci. 59, 671–690
45 Gupta, R.S. (2001) The branching order and phylogenetic placement of species from completed bacterial genomes, based on conserved indels found in various proteins. Int. Microbiol. 4, 187–202
46 Panchen, A.L. (1992) Classification, Evolution, and the Nature of Biology, Cambridge University Press
47 Ereshefsky, M. (2006) Species. In The Stanford Encyclopedia of Philosophy (Zalta, E.N., ed.), Stanford University Press
48 Doolittle, F. and Papke, R.T. (2006) Genomics and the bacterial species problem. Genome Biol. 7, 116
49 Gophna, U. et al. (2005) Evolutionary plasticity of methionine biosynthesis. Gene 355, 48–57
50 Woese, C.R. (2000) Interpreting the universal phylogenetic tree. Proc. Natl. Acad. Sci. U. S. A. 97, 8392–8396

