
MEETING REPORT

releasing Sir proteins from the Ku70p–Ku80p telomerase complex (David Shore, Univ. of Geneva, Switzerland). Cdc13p protein binds single-stranded DNA at the Ku70p–Ku80p telomerase complex (Vicki Lundblad, Baylor, USA). Nuclear organization of telomeres is important, with telomeres located at the nuclear periphery (Susan Gasser, ISREC, Switzerland). Targeting DNA to the periphery using an ER–Golgi anchoring signal can produce silencing (Rolf Sternglanz, SUNY, USA). Hence, any gene brought to the nuclear periphery will be silenced by the Sir protein complex.

In summary, the importance of chromatin structure was evident in all sessions. Yeast origins, centromeres and telomeres bind elegant multiprotein complexes that act as regulatory machines to change chromatin structure and to allow important cellular processes to occur.

Robert A. Sclafani
[email protected]

Department of Biochemistry and Molecular Genetics, University of Colorado Health Sciences Center, 4200 E. Ninth Avenue, Denver, CO 80262, USA.

Further reading
1 Dutta, A. and Bell, S.P. (1997) Annu. Rev. Cell Dev. Biol. 13, 293–332
2 Pluta, A.F. et al. (1995) Science 270, 1591–1594
3 Loo, S. and Rine, J. (1995) Annu. Rev. Cell Dev. Biol. 11, 519–548
4 Smith, J.S. and Boeke, J.D. (1997) Genes Dev. 11, 241–254
5 Weaver, D.T. (1995) Trends Genet. 11, 388–392

LETTER

Evidence for massive gene exchange between archaeal and bacterial hyperthermophiles

Sequencing of multiple complete genomes of bacteria and archaea makes it possible to perform systematic, genome-scale comparisons that aim to delineate the genomic complement of a particular phenotype. Recently, the first genome of a hyperthermophilic bacterium, Aquifex aeolicus, has been sequenced [1]. Previous studies based on rRNA and aminoacyl-tRNA analysis had suggested a very early divergence of Aquifex from the rest of the bacteria [2,3]. Aquifex is exceptional among bacteria in that it occupies the hyperthermophilic niche otherwise dominated by archaea [2]. In the published analysis of the Aquifex genome, it has been concluded that the genome sequence yielded ‘only a few specific indications of thermophily’ [1]. With three genomes of extreme thermophilic archaea (Methanococcus jannaschii, Methanobacterium thermoautotrophicum and Archaeoglobus fulgidus) currently available [4–6], we reasoned that a detailed comparison of the Aquifex and archaeal genomes could reveal genome-scale adaptations for thermophily.

The protein sequences encoded in all complete bacterial genomes were compared with the non-redundant protein sequence database using the gapped BLAST program [7], and a phylogenetic breakdown was automatically produced using the TAX_COLLECTOR program (Ref. 8, and D.R. Walker, unpublished). The results show that the fraction of Aquifex gene products that have archaeal proteins as clear best hits is by far greater than for each of the other bacteria (Table 1). Taking the fraction of ‘archaeal’ genes in Bacillus subtilis (Table 1) as a conservative estimate for the random expectation in a bacterial genome and using the normal approximation of the binomial distribution, it could be estimated that the excess of ‘archaeal’ genes in Aquifex could not be explained by a random fluctuation, with p << 10⁻¹⁰.

A reciprocal comparison showed that, for proteins encoded in each of the three archaeal genomes, Aquifex proteins are the best hits significantly more frequently than proteins from other bacteria, even those with genomes 2–3 times larger than the Aquifex genome, such as Synechocystis sp. or B. subtilis (Table 2). In a complementary analysis, bacterial proteins were compared with

TABLE 1. ‘Archaeal’ genes in bacterial genomes

Bacterial speciesᵃ        Reliable best hits to archaeal proteinsᵇ
Aquifex aeolicus          246 (16.2%)
Bacillus subtilis         207 (5.0%)
Synechocystis sp.         126 (4.0%)
Borrelia burgdorferi      45 (3.6%)
Escherichia coli          99 (2.3%)

ᵃ The data on Haemophilus influenzae, Helicobacter pylori ([…]), Mycoplasma genitalium and Mycoplasma pneumoniae (Gram-positive bacteria) are not shown because, in these species, the majority of the best hits are to homologs from larger genomes within the same phylogenetic lineages, namely E. coli and B. subtilis, respectively.
ᵇ All database hits with associated expectation (e) values <10⁻³ were analyzed; a ‘reliable best hit’ was registered when the e-value with an archaeal protein was lower than that with any bacterial or eukaryotic protein by at least a factor of 100.
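The ‘reliable best hit’ rule of Table 1 and the significance estimate quoted in the text can be illustrated with a short Python sketch. This is not the authors' code. The function names are hypothetical, the total number of Aquifex proteins is not stated in the letter and is inferred here from Table 1 as 246/0.162 (about 1519), and representing "no bacterial or eukaryotic hit" by infinity is an assumption.

```python
import math


def reliable_best_hit(e_archaeal, e_other_best, cutoff=1e-3, factor=100.0):
    """Apply the Table 1 footnote rule to one query protein.

    e_archaeal   : best e-value against any archaeal protein
    e_other_best : best e-value against any bacterial or eukaryotic protein
                   (use math.inf when there is no such hit; an assumption)
    """
    # The hit must pass the e-value cutoff and beat every non-archaeal
    # hit by at least the given factor.
    return e_archaeal < cutoff and e_archaeal * factor <= e_other_best


def excess_z_and_p(observed, n, p0):
    """One-sided test of an excess using the normal approximation
    to the binomial distribution with n trials and success rate p0."""
    mean = n * p0
    sd = math.sqrt(n * p0 * (1.0 - p0))
    z = (observed - mean) / sd
    # Upper-tail probability of the standard normal distribution.
    p = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p


# Assumption: proteome size inferred from the count and percentage in Table 1.
n_aquifex = round(246 / 0.162)
# Expected fraction taken from Bacillus subtilis (5.0%), as in the text.
z, p = excess_z_and_p(observed=246, n=n_aquifex, p0=0.050)
print(f"n = {n_aquifex}, z = {z:.1f}, one-sided p = {p:.1e}")
```

Under these assumptions the z-score is far above the value of roughly 6.4 that corresponds to a one-sided p of 10⁻¹⁰, which is in line with the statement that the excess cannot be explained by a random fluctuation.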

TIG NOVEMBER 1998 VOL. 14 NO. 11

0168-9525/98/$ – see front matter. Published by Elsevier Science. PII: S0168-9525(98)01553-4

TABLE 2. ‘Bacterial’ proteins in archaea

                                        Reliable best hits in bacteriaᵃ
Archaeal species                        Aa            Bs           Ssp         Ec          Bb
Methanococcus jannaschii                193 (10.9%)   78 (4.4%)    56 (3.2%)   44 (2.5%)   16 (0.9%)
Methanobacterium thermoautotrophicum    151 (8.0%)    103 (5.4%)   91 (4.8%)   41 (2.2%)   13 (0.7%)
Archaeoglobus fulgidus                  227 (9.4%)    140 (5.8%)   80 (3.9%)   59 (2.5%)   16 (0.7%)

ᵃ Defined as in Table 1. The bacterial species included are the same as in Table 1; abbreviations: Aa, Aquifex aeolicus; Bb, Borrelia burgdorferi; Bs, Bacillus subtilis; Ec, Escherichia coli; Ssp, Synechocystis sp.

protein families that are conserved in all three sequenced archaeal genomes (Ref. 9 and K. Makarova, L. Aravind, R.L. Tatusov and E.V. Koonin, unpublished). The fraction of bacterial proteins that could be included in the conserved archaeal families was essentially uniform at the level of about 20% of each of the bacterial proteomes, with a sharp deviation at 39% observed for Aquifex (Table 3).

Given these indications of a direct relationship between a sizeable fraction of genes in Aquifex and archaea, we investigated the protein families that they share in further detail using iterative database searches with the PSI-BLAST program [7] and phylogenetic tree construction with the neighbor-joining and parsimony methods [10]. Of the 246 Aquifex proteins that are most similar to their archaeal homologs (Table 1), 26 belong to families found in archaea and Aquifex only. In addition, 60 of the remaining families were investigated by phylogenetic methods and, for 26, statistically significant support (>65% bootstrap replications) of the Aquifex/archaea grouping was observed (data not shown).

The Aquifex genome contains 36 clusters of two or more adjacent ‘archaeal’ genes (Fig. 1); the mean length of a cluster is significantly greater (p <10⁻³) than expected on the basis of a random distribution in the genome (as calculated using a geometric distribution approximation and confirmed by computer simulation). This suggests a conserved arrangement of some genes in Aquifex and archaea and, indeed, three such clusters were identified, with the most prominent one including 13 Aquifex genes whose arrangement is partially conserved in the archaea but not in any other known bacterium (Fig. 1).

These observations suggest that there has been massive gene exchange between extreme thermophilic archaea and the lineage of bacterial hyperthermophiles represented by Aquifex. Convergence brought about by positive selection for thermotolerance could account for a subset of archaeal best hits among Aquifex proteins. Nevertheless, the highly significant differences in the level of sequence similarity between archaeal and bacterial best hits for many Aquifex proteins, conservation of unique architectures in archaea and Aquifex, and the phylogenetic analysis results, appear to indicate that at least 10% of the Aquifex genes have been horizontally transferred from the archaea.

The ‘archaeal’ genes in Aquifex are a functionally diverse set. Predictably, the genes that are found exclusively in archaea and Aquifex are functionally uncharacterized owing to the lack of experimental data on these organisms. Several of them, however, form highly conserved families that, on the basis of the observed patterns of amino acid residue conservation, could be predicted to possess as yet unknown enzymatic activities. The remaining genes have homologs in well-characterized genomes and, accordingly, functions can be predicted in most cases. These include metabolic enzymes, transporters and proteins involved in genome replication and repair. Of particular interest are two families of ATP-dependent DNA ligases, one of which has not been described previously and is only distantly related to eukaryotic ligases, an archaeal/eukaryotic type ATPase distantly related to the bacterial RecA, and a small protein homologous to the catalytic domain of DnaG-type DNA primases. In each of these cases, Aquifex also encodes a typical bacterial counterpart of the ‘archaeal’ protein, namely the NAD-dependent DNA ligase, RecA, and a classic DnaG ortholog. Similar chimerism was observed for several enzymes, for example, tryptophan synthase β subunit, peroxidase and isopropylmalate dehydratase.

In these cases, it seems particularly plausible that the ‘archaeal’ genes have been introduced into the Aquifex genome by horizontal transfer, on top of a typical bacterial gene repertoire, and have been retained owing to the specific selective advantage they provided by enabling the bacterium to thrive in high-temperature habitats. The presence of the same set of genes of apparent archaeal origin in the genomes of two or more thermophilic bacteria from distant bacterial lineages would present strong evidence for the role of these apparently horizontally transferred genes in thermophily. At this time, the sequence information on bacterial thermophiles other than Aquifex is insufficient for generalizations. Nevertheless, several genes shared by archaeal and bacterial thermophiles to the exclusion of mesophilic bacteria are detectable. For example, in addition to the previously described reverse gyrase found in all of the archaea, Aquifex and Thermotoga maritima [1,11], we detected a putative DNA methylase with a modified SAM-binding motif that is encoded not only by Aquifex and the archaea, but also by […] and might be involved in additional DNA methylation contributing to thermotolerance.

We showed that the genome of Aquifex is a chimera that has a large component shared with the archaea, in addition to the core gene set in common with the rest of the bacteria. It seems likely that bacterial hyperthermophily has evolved secondarily within moderately thermophilic bacteria by continuous acquisition of thermotolerance genes from preadapted hyperthermophiles, namely the archaea. An alternative, in our opinion less likely, is that the preponderance of ‘archaeal’ genes in Aquifex is not the cause but just a consequence of its adaptation to existence in extreme thermophilic environments, where archaea are dominant organisms. This dilemma is likely to be solved once genomes of other bacterial thermophiles are sequenced. If there is a causal relationship between the acquisition of archaeal genes and adaptation to extreme thermophily, the sets of genes of archaeal origin found in different thermophilic bacteria will overlap to a much greater extent than expected under a random acquisition model. Should that be the case, theoretical and experimental analysis of these genes will be helpful for understanding the mechanisms of thermophily. A complete, annotated list of Aquifex genes whose products show the greatest similarity to archaeal homologs is available as supplementary information on the World Wide Web [12].

TABLE 3. Inclusion of bacterial proteins in conserved archaeal familiesᵃ

Bacterial species         Proteins from the given species included in archaeal COGs
Aquifex aeolicus          597 (39%)
Synechocystis sp.         707 (22%)
Bacillus subtilis         910 (22%)
Escherichia coli          891 (21%)
Borrelia burgdorferi      215 (17%)

ᵃ A total of 789 families of probable orthologs (clusters of orthologous groups, or COGs) in the three archaeal genomes were identified as previously described. Bacterial proteins were compared with these COGs using the gapped BLASTP program, and a bacterial protein was included in the given COG if its best hits to at least two archaeal genomes were among the COG’s members [9].

FIGURE 1. Genes of apparent archaeal origin in the genome of Aquifex aeolicus. Yellow circles represent genes encoding proteins with reliable best hits to archaeal homologs. Gene clusters conserved in Aquifex and archaea are boxed. The largest cluster contains genes for a predicted RNA helicase, a nuclease and a zinc-finger-containing nucleic acid-binding protein; the remaining genes encode uncharacterized proteins, most of which are conserved in archaea and Aquifex only.

References
1 Deckert, G. et al. (1998) Nature 392, 353–358
2 Pace, N.R. (1997) Science 276, 734–740
3 Brown, J.R. and Doolittle, W.F. (1995) Proc. Natl. Acad. Sci. U. S. A. 92, 2441–2445
4 Bult, C.J. et al. (1996) Science 273, 1058–1073
5 Klenk, H.P. et al. (1997) Nature 390, 364–370
6 Smith, D.R. et al. (1997) J. Bacteriol. 179, 7135–7155
7 Altschul, S.F. et al. (1997) Nucleic Acids Res. 25, 3389–3402
8 Walker, D.R. and Koonin, E.V. (1997) ISMB 5, 333–339
9 Tatusov, R.L., Koonin, E.V. and Lipman, D.J. (1997) Science 278, 631–637
10 Felsenstein, J. (1996) Methods Enzymol. 266, 418–427
11 Guipaud, O. et al. (1997) Proc. Natl. Acad. Sci. U. S. A. 94, 10606–10611
12 http://ncbi.nlm.nih.gov/pub/koonin/aquifex/index.html

L. Aravind, Roman L. Tatusov, Yuri I. Wolf, D. Roland Walker and Eugene V. Koonin
[email protected]

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
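The letter's comparison of the mean length of ‘archaeal’ gene clusters with a random expectation (a geometric distribution approximation confirmed by computer simulation) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the gene order is treated as linear, the total gene count of about 1519 is inferred from Table 1 (246/0.162), all function names are hypothetical, and the observed mean cluster length is not given in the letter, so no p-value is computed here.

```python
import random


def cluster_lengths(flags):
    """Return the lengths of all runs of two or more consecutive True values."""
    lengths, run = [], 0
    for flag in flags:
        if flag:
            run += 1
        else:
            if run >= 2:
                lengths.append(run)
            run = 0
    if run >= 2:  # close a run that reaches the end of the gene list
        lengths.append(run)
    return lengths


def simulated_mean_cluster_length(n_genes, n_marked, n_replicates, rng):
    """Mean cluster length when n_marked genes are placed at random
    among n_genes positions, pooled over n_replicates shuffles."""
    pooled = []
    for _ in range(n_replicates):
        marked = set(rng.sample(range(n_genes), n_marked))
        flags = [i in marked for i in range(n_genes)]
        pooled.extend(cluster_lengths(flags))
    return sum(pooled) / len(pooled)


def geometric_mean_cluster_length(fraction):
    """Expected length of a run of marked genes, given that the run has at
    least two members, when each gene is marked independently with the
    given probability (geometric run-length model)."""
    return (2.0 - fraction) / (1.0 - fraction)


N_GENES = 1519     # assumption: inferred from Table 1, not stated in the letter
N_ARCHAEAL = 246   # reliable best hits to archaeal proteins (Table 1)

rng = random.Random(0)  # fixed seed so that the sketch is reproducible
simulated = simulated_mean_cluster_length(N_GENES, N_ARCHAEAL, 200, rng)
expected = geometric_mean_cluster_length(N_ARCHAEAL / N_GENES)
print(f"simulated mean = {simulated:.2f}, geometric approximation = {expected:.2f}")
```

An observed mean cluster length could be compared with the simulated distribution by counting the fraction of shuffles whose mean is at least as large; that step is omitted because the observed value is not reported in the text.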

A new section in Trends in Genetics for 1999 – Genome Analysis

The purpose of the Genome Analysis section is to provide a forum for original observations concerning the function, organization and evolution of genomes. With the increasing quantities of genome maps and sequence data in public databases, genome analysis and bioinformatics are providing spectacular insights into fundamental biological questions, and this trend is set to continue. In Genome Analysis, Trends in Genetics will publish short articles based on the analysis of publicly accessible data. Publications of outstanding quality and of interest to a broad audience of geneticists and molecular biologists will be considered, and all manuscripts will be peer reviewed by an expert panel of referees.

Manuscripts of up to 1000 words will be considered with one or two small illustrations or tables. More detailed instructions are available on request. Genome Analysis will be edited by Eugene Koonin. Potential authors are invited to contact the editor or the Trends in Genetics editorial office for further information.

Eugene Koonin
[email protected]

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, Bethesda, MD 20894, USA.
