Dispatch

Evolutionary genomics: Thermotoga heats up lateral gene transfer
John M. Logsdon, Jr. and David M. Faguy

The complete genome sequence of the bacterium Thermotoga maritima has revealed a large fraction of genes most closely related to those of archaeal species. This adds to the accumulating evidence that lateral gene transfer is a potent evolutionary force in prokaryotes, though questions of its magnitude remain.

Address: Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Nova Scotia, B3H 4H7, Canada.
E-mail: [email protected]

Current Biology 1999, 9:R747–R751
0960-9822/99/$ – see front matter
© 1999 Elsevier Science Ltd. All rights reserved.

Prokaryotes exchange genes on a regular basis, especially under highly selective conditions (such as in the presence of antibiotics). These DNA transfer events have been appreciated by biologists for some time now. Indeed, it was recognized early on that this process could have profound implications for the evolution of microbes and our ability to trace their history, yet most microbiologists maintained that an evolutionary classification of microorganisms, while difficult, was possible. In the last decade, reports of lateral (or horizontal) gene transfers have been steadily rising. From the profusion of recent articles on the topic [1–13], observations of lateral gene transfer have seemingly reached fever pitch, largely catalyzed by the sequencing and analyses of complete genomes from numerous diverse prokaryotes. Lateral gene transfer has clearly caught the attention of biologists, but despite this excitement important questions on the prevalence and impact of the process remain unanswered.

The latest addition to the lateral gene transfer fray comes from the genome-sequencing crew at The Institute for Genomic Research (TIGR) [1], who have determined the complete genome sequence of Thermotoga maritima, a hyperthermophilic bacterium that may be one of the deepest-branching lineages within the Bacteria. The most interesting feature of this genome is the surprisingly high proportion of open reading frames (putative protein-encoding genes) that most closely resemble genes, not from Bacteria, but instead from the other prokaryotic domain, the Archaea.

Nelson et al. [1] found that a full 24% of the T. maritima open reading frames (451 of 1877) are most similar to archaeal genes (Figure 1). This fraction of archaeal-like open reading frames is nearly twice that of another hyperthermophilic (and likely deep-branching) bacterium, Aquifex aeolicus [14], which previously held the record for having the largest (by far) fraction of archaeal-like genes observed in a bacterial species. The high fraction of archaeal-like genes is found in the T. maritima genome even though the comparisons included the previously determined A. aeolicus genome, though the converse is not true. Indeed, others have made strong claims for “massive gene exchange” between A. aeolicus and archaeal thermophiles [8], yet it appears that the extent of archaeal genes in T. maritima is even greater. There is little doubt that T. maritima is a member of the Bacteria, and over half of its genes (though only just) appear bacterial in origin. Although many of the archaeal-like T. maritima genes appear to be involved in metabolic functions, such as transport and energy metabolism, it is perhaps surprising that at least some are involved in such presumably more general (‘core’) functions as transcription and gene regulation (Figure 1).

Consistent with the view that these archaeal-like genes arose as a result of rampant lateral gene transfer during the evolution of the T. maritima genome, Nelson et al. [1] observed that substantial regions of the genome have a DNA base composition significantly different from the rest of the genome. This may indicate that the genes in these regions were transferred en masse. In further support of an origin of these regions by lateral gene transfer, the authors suggest, though with no statistical support, that the archaeal-like genes are clustered in these areas. Curiously, some of these regions contain a series of 30 base-pair repeats that are very similar in structure and base composition to repeats found in Archaea and some (especially thermophilic) Bacteria. But as these repeats were originally reported in (archaeal) mesophilic halophiles [15], and a similar repeat structure is found in Escherichia coli [16], their relevance for lateral gene transfer is unclear.

The genome sequence of T. maritima, like all completed genomes of hyperthermophiles (to date, mostly Archaea), contains significant numbers of genes classed as ‘unknown’ or ‘hypothetical’ because their closest sequence matches are to genes of unknown or hypothetical function, respectively [1]. It is likely that a number of these genes will turn out to be specific to hyperthermophiles, whether by common ancestry and loss in other lineages or by lateral gene transfer. This is borne out by the T. maritima data: of the 108 genes matching only genes in other hyperthermophiles, 93 are in the ‘hypothetical’ function class, roughly 23% of all ‘hypothetical’ proteins encoded in the genome. Most relevant, perhaps, to the question of lateral gene transfer is that a significant fraction of these ‘hypothetical’ genes in T. maritima are archaeal-like (Figure 1, inset).

Current Biology Vol 9 No 19

Figure 1
Distribution of genes in T. maritima by functional class (using values and class assignments from [1]). The number of genes in T. maritima which most closely resemble (by BLAST) known genes from Bacteria, Archaea, and Eukarya are shown in blue, green and red, respectively. In each case, the percentage of genes showing a best match to genes from Bacteria is specified. The inset shows the same distribution for all genes in T. maritima that have any match in the non-redundant protein database: the ‘known’ class is the sum of the functional classes shown in the main graph; the ‘unknown’ class is all those showing a match to a gene with unknown function; the ‘hypothetical’ class is all those showing a match to a gene with an inferred or hypothetical function.
Main graph: bars by functional class (fatty acid / phospholipid metabolism, miscellaneous known, transcription, biosynthesis of co-factors, central intermediary metabolism, purines & pyrimidines, DNA metabolism, cellular processes, cell envelope, regulatory functions, amino acid biosynthesis, translation, energy metabolism, transport) against number of T. maritima protein-coding genes of known function. Inset: known, unknown and hypothetical classes against number of T. maritima protein-coding genes. Legend, best match to: Bacteria, Archaea, Eukarya.

Evidence for lateral gene transfer?

Although T. maritima is not the first genome that appears to have a mosaic origin, what is striking is the large fraction of its genes (almost 25%) which appear specifically related to another domain. If most prokaryotic organisms experienced lateral gene transfer of this magnitude, the very concept of a prokaryotic lineage would be called into question. Similarly, results reported last year by Lawrence and Ochman [5] indicate that approximately 18% of the genes in the E. coli genome are derived from lateral gene transfers, although it is unclear how comparable these lateral gene transfers are to more distant ones, such as those inferred from the T. maritima genome sequence. In any case, these data, taken in sum, are prompting the deconstruction of prokaryotic molecular systematics [2,3,12]. But before throwing out the organismal trees, we should ask if there are explanations other than lateral gene transfer for at least some of the T. maritima cases.

With the incredible amount of data present in a complete genome, it is now common for bioinformaticians to describe each gene by its closest match in the database (usually using the BLAST program). While this practice is certainly useful as a first cut, it can lead to unwarranted conclusions. Caution should be exercised in interpreting the results

using a midpoint rooting, or is unrooted, it can appear that organism 5 is not in the Bacteria, but instead groups with Archaea. This effect would be exacerbated by unequal evolutionary rates [10].

As T. maritima is a plausible candidate for being a representative of one of the deepest bacterial lineages, this scenario is certainly possible for some of the genes thought to be derived by lateral gene transfer. In any case, it is easy to see how incorrect inferences of lateral gene transfer can arise. In the absence of additional supporting data, most clearly a well-supported tree in which the lateral gene transfer recipient is nested within the donor lineage (Figure 2b), inferences of lateral gene transfer from such distance comparisons (such as BLAST scores), regardless of their sheer numbers, are really hypotheses in need of further testing.

Nelson et al. [1] did perform a phylogenetic analysis on 33 homologous gene families with members from T. maritima, and report that, in this small subset, a “majority of genes” showed no lateral gene transfer between Archaea and Bacteria. These analyses revealed significant differences between different gene trees within the Bacteria, suggesting that gene duplication, loss and/or lateral gene transfer (within Bacteria) are important in the evolution of the