Archaebacterial genomes: eubacterial form and eukaryotic content

Patrick J Keeling, Robert LCharlebois and W Ford Doolittle

Dalhousie University, Halifax and University of Ottawa, Ottawa, Canada

Since the recognition of the uniqueness and coherence of the archaebacteria (sometimes called Archaea), our perception of their role in early has been modified repeatedly. The deluge of sequence data and rapidly improving molecular systematic methods have combined with a better understanding of archaebacterial to describe a group that in some ways appears to be very similar to the eubacteria, though in others is more like the . The structure and contents of archaebacterial genomes are examined here, with an eye to their meaning in terms of the evolution of cell structure and function.

Current Opinion in Genetics and Development 1994, 4:816-822

Introduction this review we will focus on more recent discoveries and try to assimilate these observations with phylogeny. As molecular phylogenetics advances, it provides evolu- tionary biologists with a f&rework on which to assem- ble a coherent picture of the history of life. Two un- Archaebacterial genome design expected discoveries that emerged from sequence com- parisons have had particularly far-reaching effects on our At present, seven archaebacterial genomes have been view of early evolution; both concern the archaebacteria. physically mapped. In each case, the genome is com- The first, on the basis of ribosomal RNA (rRNA) se- posed of a single circular chromosome that falls within quence information, and now extensively supported by a the size range observed in the eubacteria [5-11,12”,13*] variety of other data, was the finding that the (Table 1). In several instances, extrachromosomal ele- are actually composed of two distinct groups, eubacteria ments have also been identified [14]; in some halophiles, and archaebacteria, as distant horn one another as either plasmid and megaplasmid (up to 700 kb) DNA makes are f%om the eukaryotes [l]. The second, equally excit- up a substantial portion of the genome [15]. Halophile ing discovery is that the eubacteria branched first from plasmids and chromosomes bear scores of transposable the universal tree, so that the archaebacteria and eukary- elements, which give rise to high frequencies of ge- otes seem to be sister groups [2]. The significance of this netic instability as revealed by spontaneous mutations and second conclusion (which, although increasingly popu- local genetic rearrangements detectable through South- lar, does need confirmation from additional data sets) is ern analysis [16]. This phenomenon can be observed on that it allows us to propose specific arguments about the the time scale of laboratory experiments, but compara- nature of both the prokaryotic ancestor of the eukary- tive mapping indicates that in evolutionary time, chro- otes and the last common ancestor of all extant life on mosomal order is remarkably conserved in halobacteria the basis of characters shared between pairs of taxa. (P Lopez-Garcia, A St Jean, R Amils, R Charlebois, un- published data; N Hackett, personal communication). As our understanding of the archaebacteria on the Forces must exist that can effectively oppose the pow- molecular level develops, it is becoming clear that they erful destabilizing pressures present in the halobacterial share a number of features in overall genome design and genome. gene organization with the eubacteria. These traits can then be said most parsimoniously to be ancestral to all Archaebacterial and eubacterial genomes are similar in life. Many other characters, particularly those involved size and structure, both being small, compact, and cir- with gene expression, are shared between the archae- cular. If the rooting of the universal tree by Iwabe [2] is bacteria and eukaryotes; these can be either primitive correct, it is reasonable to suppose that this design is an- or derived, but in both cases help us to understand the cestral to all extant life, although we know too few of the order of evolutionary events at or near the origin of the details of genome construction to be certain that the trait eukaryotes. Detailed reviews of archaebacterial molecu- could not have evolved twice independently. To describe lar biology are available in two recent books [3,4], so in the nature of the primitive genome more accurately, we

Abbreviation snoRNA-small nucleolar RNA.

816 0 Current Biology Ltd ISSN 0959-437X Archaebacterial genomes Keeling, Charlebois and Doolittle 817

order S12-S7-EF-G-EF-Tu for both M. uunnielii and Table 1. Archaebacterial genome sizes as determined by mapping. E. coli [28]). In one case, the M. vannielii Ll operon, the similarity can be extended to the regulation of ex- Species Cenome size Method References pression, as it has been shown that the Ll protein in M. vannielii represses translation of its own mRNA by bind- Mefhanococcus voltae 1900 PFC I71 ing to a site that resembles its binding site on the 23s tialobacrerium halobium 2400 PFC 1101 rRNA [25**]. The same autoregulation is seen in E. coli, Halobacterium sp. CR6 2472 Cloning [12-l differing only in the position of the binding site on the Haloferax volcanii 4140 Cloning 161 mRNA. Haloferax mediterranei 3840 PFG l8,13*1 Thermococcus celer 1890 PFG 151 Enough data may not be available to say that gene or- Sulfolobus acidocaldarius 2760 PFG I91 der is conserved over even longer distances in eubacterial and archaebacterial chromosomes, but enough is known

Summary of archaebacterial genomes that have been mapped. Each to begin to assemble some intriguing correlations. The of these gendmes is within the range of genome size documented in spectinomycin operon is immediately downstream of the eubacteria and all finished maps are circular. PFG, pulsed-field gel the SlO operon in both E. coli and M. vannielii [17], electrophoresis. and the streptomycin operon is followed by the SlO operon in Themotogu maritima and by the SlO protein (the first reading tie in the SlO operon) in three d& need more comparative data and a better understanding ferent archaebacteria [28], as well as in the cyanobac- of archaebacterial genomes Corn a functional standpoint; terium Spinrlina platensis [29]. The current best guess for instance, the mechanisms of replication initiation, is that in the ancestor of archaebacteria wd eubac- termination, and chromosomal segregation need to be teria, these three operons followed one another in the examined. order streptomycin-SlO-spectinomycin, with the genes in each being similar to those found commonly in con- Archaebacterial gene organization temporary operons.

In general, the structure of archaebacterial genes closely The transition to eukaryotes resembles that of eubacteria [15,17]. They are organized into co-transcribed units, or operons, that are expressed A particularly close afinity between archaebacteria and as long non-capped mRNAs with only short bacterial- eukaryotes has been long suspected, but for the most part like poly(A) tails [18]. Protein-coding genes of the ar- on vague grounds - resistance to certain drugs, lack chaebacteria are not disrupted by introns of any type, of peptidoglycan, or the occurrence of tRNA introns other than some spliced at the protein level [19,20], and [30]. Recent work, however, has elucidated a ham&l to date no intron resembling the nuclear spliceosomal of molecular similarities that argue unambiguously for a type has been identified. Introns have been found in shared inheritance with the eukaryotes, three of which 16S, 23s. and tRNA genes; these are of a novel type, are reviewed here. dependent upon a defined secondary structure at the intron/exon boundary and an unidentified enzymatic One of the strongest similarities is in transcription mech- activity [21]. Some of these introns have been found to anisms, now known to dif&r fundamentally &om the contain open reading frames related to homing endonu- homologous eubacterial process. Evidence for this simi- cleases first characterized in group I self-splicing introns larity has been accumulating for a number of years, be- of mitochondria and eubacteriophages [22*]. ginning with the observation that archaebacterial RNA polymerases are multisubunit complexes, unlike the sim- The degree of similarity between archaebacterial and ple three or four subunit eubacterial enzyme [31,32], eubacterial operons or operon clusters is best exemplified and that the consensus archaebacterial promoter bears by a number of archaebacterial operons that contain the a closer resemblance to the eukaryotic TATA box than same genes in the same order as their eubacterial homo- to eubacterial promoter sequences [33]. The depth of logs (reviewed in [23]; see Fig. 1). These include the this similarity was further revealed when the existence of RNA polymerase operon and ribosomal protein gene an archaebacterial homolog of the eukaryotic transcrip- clusters, which are homologous to the Escherichiu colt’ tion &ctor TFIIB was discovered by a database search Lll and LlO clusters (four of four genes in the same [34,35*], and by wo reports of a gene that encodes order in two archaebacteria and E. coli [24,25”]). the a protein closely resembling both TFIIS and a subunit spectinomycin operon (for which Methunococcusvunnielii of eukaryotic RNA polymerases I and II [36’,37*]. and Sulfolobus ucidoculduriusshow eleven and nine, respec- Moreover, work by two independent investigators has tively, of the eleven E. coli genes in the same order, now shown that archaebacterial genomes also although the archaebacteria each have three ‘extras’, TFIID, or the TATA-binding protein [38”,39”], and which are homologs of eukaryotic ribosomal protein that TFIID recognizes archaebacterial promoters and genes [24,26]), the SlO operon (seven genes in a stretch directs the binding of TFIIB [39”]. Unlike their eubac- of seven in the same order in halobacteria and E. coli terial counterparts, eukaryotic RNA polymerases alone [27]), and the streptomycin operon (four genes in the cannot recognize promoters; instead, the assembly of the 818 Genomes and evolution

‘Lll , LlO Clusters E. co/i

S. sollataricus

Hb. cutirubrum

M. vannielii

Streptomycin Operon (str) 480kb 15kb m $-gglO E. cot wE

S. platensis

T. ma&ma ti 30kb M. vannielii RN4P +I $-El0

t? woesei

S. acidocaldarius

Hb. halobium

SlO Operon (SlO) 15kb

Fig. 1. Summary of conserved gene or- Ha. marismortui der between eubacteria and archaebac- teria in four operons. Homologous genes are indicated by shading, the actual gene names have not been included for sim- plicity, but can be found in three recent Spectinomycin Operon (spc) reviews [17,23,28]. In the case of the spectinomycin operon, the archaebacte- rial genes with only eukaryotic homo- E. co/i logs are shaded black. Where the po- sition of the cluster relevant to other M. vannielii s clusters is known, it is shown as an S. acidocaldarius open box on the end, and distances are given where they are not closely linked [17,23,24,25”,26-29,571.

TFIID/TFIIB complex is responsible for efficiently di- process (S Potter, P Dennis, personal communication). recting the polymerase to the promoter site in vim. It Furthermore, in eukaryotes U3 forms a close associ- appears that this may also be the case in the archae- ation with fibrillarin, and this also appears to be the bacteria; if so, it represents a significant divergence from case in Sulfolobus, as U3 precipitates from cell extract eubacterial transcription machinery. when exposed to anti-fibrillarin antibody (S Potter, P Dennis, personal communication). Independent support A similar message can be drawn I?om current work for this comes from the discovery of fibrillarin in two on the mechanism for processing the rRNA primary methanogens [42”], which are related only distantly to transcript for which, once again, the archaebacteria ap- Sr/Jhbm. This argues that this pathway is common to pear to use a system homologous to the eukaryotic pro- all archaebacteria, rather than only the crenarchaeota, cessing pathway and distinct from that of eubacteria. The which have an rRNA operon structure unlike other 16s and 23s rRNA genes of archaebacteria are flanked archaebacteria or eubacteria, and more like eukaryotes by repeats that can form a bulge-helix-bulge motif and [43-471 (Fig. 2). which have a demonstrated role in transcript processing that is neither eubacterial nor eukaryotic in nature [40]. Arguably the best example of a feature shared between In the 16s gene of S. acidocaldarius, however, it is now the archaebacteria and eukaryotes is ubiquitin-directed known that the bulge-helix-bulge motif is not necessary proteolysis by the proteasome, as this system has no in defining the 5’ processing site, but that the process- known eubacterial homolog. The eukaryotic proteasome ing is actually carried out by an (as yet) unidentified is a large multisubunit complex that recognizes polypep- enzymatic complex that requires an RNA component tides bound covalently to ubiquitin and degrades them. [41”], now recognized to be U3, the small nucleolar Both the proteasome and ubiquitin have been character- RNA (snoRNA) used by eukaryotes in the homologous ized in the genus Tlwnwoplaana [48,49,50**], in which Archaebacterial genomes Keeling, Charlebois and Daolittle 819

Hanging ornaments on the universal tree Thermofilum pendans Sulti7lobus solfataricus Besides simply lacking a nucleus, the archaebacteria 16S-23S....SS Desulti~mccous mobilis share much of their gene and genome structure with Thermo@oteus tenax the eubacteria, arguing for the utility of the concept ‘’. They also have a number of molecular - Thermoplasma 16s . . . . 23s . . . .5s characters in common with the eukaryotes, however, Themrococcus celar 16S-Ala-23sSS....5S seemingly at the exclusion of the eubacteria, so it is Methanoccus vannielii imperative to be sure of the place of the archaebacte- Methanothermus krvidus ria in the universal tree. Conveniently, the most strongly Methanobacterium 16S-Ala-235-5s supported topology of the universal tree, that which thermoautotrophicum groups the archaebacteria and eukaryotes as sister taxa, is in agreement with most aspects of archaebacterial bi- Halobacterium halobium Halobacterium cutribrum ology and can be used as a model fi-om which to draw some interesting conclusions. Haloferax voicanii Haloarcula marismottui Halococcus morrhuae Earlier, we pointed out that, given this topology, any- thing found in both the archaebacteria and eubacteria Q 1994 Curmnt opinion in Genetic5 and Development can be said to have been inherited from the last common ancestor of all cells, even if it is now absent horn eukary- otes. So, it seems likely that the universal ancestor was a Fig. 2. Schematic diagram of archaebacterial phylogeny showing the different classes of rRNA operons. The crenarchaeota, repre- complex prokaryotic cell with a circular DNA genome sented here by T. pendans, S. solfataricus, 0. mobilis, and T tenax, organized into operons with well developed regulatory have a separately transcribed 55 gene, as for eukaryotes. The eur- systems and contents, which are still maintained in some yarchaeota, which cover the remaining archaebacteria, maintain a cases. The genome of this cell included highly effective 1615-235-5s operon. The intergenic spacers of the euryarchaeota also contain lineage-specific tRNA genes, the halophiles being de- replication and expression systems, as well as a full com- fined by an additional tRNAcYs at the extreme 3’ end of the operon. plement of metabolic enzymes, and possibly a complex Thermococcus celer and Methanococcus vannielii have an operon locomotary system resembling the rotary motor found typical of the euryarchaeota, but have additional independently ex- in eubacteria and archaebacteria (the nature of the actual pressed 5s genes (they are shown grouping together, but this trait may have evolved independently). Thermoplasma, as usual, has a flagellar filaments is questionable [53”]). This ancestor is unique organization 140,43471. often referred to as a ‘progenote’, a term coined origi- nally to define a very different organism, one that lacked all these things [54]. Given what we currently believe about the last common ancestor, it is clear that referring their activity has been demonstrated to resemble the to it as a progenote is misleading. eukaryotic complex [51**]. The Thermoplasrna protea- some is much simpler. consisting of only two subunits, rather than the 12-20 found commonly in eukaryotes. It is not possible to use the same logic to fix the time Moreover, the two subunits are similar in sequence, a of origin of features found only in archaebacteria and product of duplication and divergence, a process that eukaryotes, as it is conceivable that these traits are has apparently continued in the eukaryotes leading to themselves primitive. One can, however, actually be- the many contemporary subunits of the eukaryotic com- gin to consider the order of events that took place in plex [52”]. As such, the two Thennoplasrna subunits can the development of eukaryotes without having to as- be thought of as representing an ancestral-like complex sume that the majority of complex characters present in and give us an opportunity to study the process of gene only the archaebacterial and/or eukaryotic lineage are family evolution in the eukaryotes. The relatively sim- derived. Even the deepest branching protist lineages are ple subunit structure of the archaebacterial proteasome significantly different from eubacteria in many aspects of also makes it an ideal model for the difficult task of un- cell structure and physiology, making it hard to imag- derstanding the proteolytic activity of the complex, and ine how such a rapid and thorough transition could progress has already been made in this area [51**]. take place. By identifying the archaebacterial homo- logs of systems thought to be strictly eukaryotic, we can reduce the number of novel developments neces- Curiously, all efforts to identify proteasomes in archae- sary to account for the transition to eukaryotic cells, bacterial genera other than Thermoplasma have failed simplifying our explanation of a difficult step in cellu- resoundingly [52”]. Whether this is actually because of lar evolution. The presence of proteasomes, U3-based their absence, or only resistance to detection by the var- rRNA processing, and eukaryotic-like transcription in ious techniques employed, is not certain, but the unique the archaebacteria, for instance, argue that these features presence of proteasomes in Thennoplusma in the archae- had all evolved prior to the divergence of archaebac- bacteria would raise some interesting issues. teria. Similarly, the apparent absence of other cellular 820 Cenomes and evolution

Established along this branch Nucleus, multiple linear chromosomes, chromatin, microtubule structures, cytoskeletal components, capped and polyadenylated mRNA

Present by this point Either primitive (present in Fig. 3. Universal tree as determined by common ancestor) or derived duplicated gene phylogeny [2]. A hy- (arose on branch leading to pothetical order of events is shown for archaebacteria and eukaryotes): three major steps leading to the origin proteasome and ubiquitin, of eukaryotes. For the universal ancestor multisubunit RNA polymerase, and the ancestor of eukaryotes and ar- Present in last common ancestor promoter recognition by chaebacteria, we can say only that the Circular DNA genome with genes organized in transcription factors, snoRNA- various traits common to both resulting operons with complex regulatory mechanisms, based rRNA processing, novel lineages had developed by that point, major metabolic pathways established, fast and ribosomal proteins, N-linked but we cannot say exactly when. For accurate transcription and translation apparatus, glycoproteins, tRNA genes the traits common only to eukaryotes, it probably a rotary motor based motility system coding terniinal CCA nucleotides, is most parsimonious to argue that they like eubacterial flagella no fMet initiator tRNA developed sometime after the divergence J of archaebacteria and eukaryotes and be- fore the divergence of the first eukaryotic lineages.

elements, such as microtubules and other cytoskeletal A few interesting cases can be made for the recogni- structures, cap-dependent translation initiation or mul- tion of these distant homologs, exemplified by actin tiple linear chromosomes (to name a few), implies that and tubulin, two cytoskeletal proteins for which puta- these evolved in the eukaryotes after the divergence of tive eubacterial relatives have been identified [55,56]. the archaebacteria. Even this short list puts some large- These assertions can be tested by identifying archae- scale changes into perspective. In the case of gene expres- bacterial homologs of the relevant eubacterial proteins. sion, the eukaryotic system did not arise by concurrent The relatively short phylogenetic distance between the changes to both translation and transcription, as the lat- archaebacteria and the eukaryotes would lead to the ter was already established in the prokaryotic ancestor prediction that the archaebacterial sequence would show of eukaryotes (a summary of this and other evolutionary a more obvious similarity to the novel eukaryotic gene inferences is presented in Fig. 3). than do any of the eubacterial sequences.

Another important observation is that an archaebacterial gene is often related equally to two or more differ- Conclusions ent eukaryotic genes that perform different functions [37*,39**]. If this is a general condition, then the ap- Eukaryotic cells are defined by their possession of many parent slow rate of divergence in many archaebacterial novel and complex features. As it is doubtful that many sequences [38**] may allow the identification of new of these evolved de nova, their antecedants will proba- families or perhaps new members of existing families of bly be found somewhere in the prokaryotes, and if the genes and will certainly clarify the genealogy of recog- archaebacteria really are the closest living relatives of nized eukaryotic gene families by providing an external the eukaryotes, then one would expect them to con- reference. tain homologs of the vast majority of eukaryotic genes. This has already been seen to be the case for several in- A major task of future genomic studies will be to explain tegrated systems (some of which may also prove to be changes in function and, thus, the differences between present in the eubacteria). In other instances, in which a prokaryotic and eukaryotic cell and molecular biology complex system appears to have evolved aher the diver- in terms of changes in gene sequence and number. The gence of eukaryotes and archaebacteria, it is still possible sisterhood of the archaebacteria and eukaryotes makes to uncover the prokaryotic origin of this trait by iden- the former, not the eubacteria, the appropriate model tifying the divergent, but distantly related, homologs of for defining the ‘starter kit’ of genes available to the first individual elements of this system. eukaryotic cells. Archaebacterial genomes Keeling, Charlebois and Doolittle 821

References and recommended reading 18. Brown JW, Reeve JN: Polyadenyhted, noncapped RNA from the archaebacterium Methanococcus vannieiii. J Bacterial 1985, 162:909-917. Papers of particular interest, published within the annual period of review, have been highlighted as: 19. Perler FB, Comb DG, Jack WE, Moran LS, Qiang B, Kucera . of special interest RB, Benner J, Slatko BE, Nwankwo DO, Hempstead SK el a/.: . . of outstanding interest Intervening sequences in an Archaea DNA polymerase gene. Proc Nati Acad Sci USA 1992, 895577-5581. 1. Woese CR, Fox GE: Phylogenetic structure of the prokaryotic domain: the primary kingdoms. Proc Nat/ Acad Sci USA 1977, 20. Xu M-Q, Southworth MW, Mersha FB, Horns&a LJ. Perler FB: 745088-5090. In vivo protein splicing 04 purified p&cursor and &e identifi- cation of a branched intermediite. Cell 1993, 75:1371-1377. 2. lwabe N, Kuma K-l, Hasegawa M, Osawa 5, Miyata T: Evolu- tionary relationship of archaebacteria, eubacteria, and eukary- 21. Nieuwlandt DT, Carr MB, Daniels CJ: In vivo processing of an otes inferred from phylogenetic trees of duplicated genes. Proc intron-containing archael (RNA. MO/ Microbial 1993, 8:93-99. Naf/ Acad Sci USA 1989, 86:9355-9359. 22. Dalgaard JZ, Garrett RA, Belfort M: A site-specific endonuclease . encoded by a typical archaeal intron. Proc Nat/ Acad Sci USA Pfeifer F, Palm P, Schleifer K-H (EdsJ: Molecular bio/ogy of 3. 1993, 90:5414-5417. Archaea. Stuttgart: Gustav Fischer Verlag; 1994. This demonstration that an intron-encoded open reading frame in Desul- 4. Kate; M, Kushner DJ, Matheson AT (Eds): The biochemistry of fumcoccus is an endonuclease homologous to those found in certain Archaea. Amsterdam: Elsevier; 1993. homing introns provides clear evidence that these endonucleases are themselves mobile elements and argues that some archaebacterial introns 5. Nell K: Chromosome map of the thermophilic archaebacterium can also home. l%mnococcus celer. J Bacferiol 1989, 171:6720-6725. 23. Ramfrez C, Wpke AKE, Yang D-C, Boeckh T, Ma&son AT: 6. Charlebois RL, Schalkwyk LC, Hofman JD, Doolittle WF: De- The structure, function and evolution of archaeal rlbosomes. tailed physical map and set of overlapping clones covering the In The biochemistry of Archaea. Edited By Kates M, Kushner genome of the archaebacterium Haloferax volcanii DS2. I MO/ DJ, Matheson AT: Amsterdam. Elsevier; 1993:439-466. Biol 1991, 222:509-524. 24 Shimmin LC, Newton CH, Ramfrez C, Yee J, Downing WL, Louie A, Matheson AT, Dennis PP: Organization of penes en- 7. Sitzmann J, Klein A: Physical map of the Medtanococcus voltae coding the 111, 11,110, and 112 equiv&nt ribodproteins chromosome. MO/ Microbio/ 1991, 5:505-513. in eubacteria, archaebacteria, and eukaryotes. Can J Micmbiol 8. L6pez-Carcfa P, Abad JP, Smith C, Arnils R: Cenomic orga- 1989, 35164-l 70. nization of the halophilic archaeon /fa/oferax mediterranek 25. Hanner M, Mayer C, Kl)hrer C, Golderer G, Gr6bner P, Piendl physical map of the chromosome. Nucleic Acids Res 1992, . . W: AutoRenous translational regulation of the ribosomal Mvall 20:2459-2464. operon L the archaebacterium~Meth;lnococcus vaniefii. I Bac- teriol 1994, 176z409-418. 9. Kondo S, Yamagishi A, Oshima T: A physical map of the suffir- dependent archaebacterium Su/fo/ohs acidocaldarius 7 chm- Feedback regulation at the level of translation is characteristic of eubac- terial ribosomal protein operons. Here, it was shown that an archaebac- mosome. 1 Bacreriol 1993, 175:1532-l 536. terial operon uses the same system as its homolog in the eubacteria. 10. Bobovnikova Y, Ng W-L, Dassarma 5, Hackett NR: Restriction 26. Auer J, Spicker G, B&k A: Organization and structure of mapping the genome of Halobacterium halobium strain NRC-l. the Methnococcus transcriptional unit homolwus to the Sysr Appl Microbio/ 1994, 16597-604. Escherichia co/i ‘spectinomycin operon’. / MO/ Biol 1989, 20921-36. 11. Antdn J, L6pez-Carcfa P, Abad JP, Smith C, Amils R: Align- ment of genes and Swal restriction sites to the BamHl ge- 27. Arndt E, Kmmer W, Hatakevama T: Orttaniution and nu- nomic map of Haloferax mediterranei. FEMS Microbial Left cleotide sequence of .a gene ciuster coding-for eight ribosomal 1994, 11753-60. proteins in the archaebacterium HaMacterium marismortui. J Biol Chem 1990, 265:303&3039. 12. St Jean A, Trieselmann BA, Charlebois RL: Physical map and . . set of overlapping cosmid clones representing the genome of 28. Zillig W, Palm P, Klenk H-P, Langer D, Hudepohl U, Hain J, the archaeon Halobaclerium sp. CRB. Nucleic Acids Res 1994, Lanzendorfer M, Holz H: Transcription in Archaea. In The bio- 22: 1476-t 483. chemistry of Archaea. Edited by Kates M, Kushner DJ, Matheson The second construction of an ordered set of cosmids representing the AT. Elsevier: Amsterdam; 1993:367-391. genome of a halophile (the other is Haloferax volcani3. This will be an important tool in future investigations into comparative genomics and 29. Sanangelantoni AM, Tiboni 0: The chromosomal location of genome stability. genes for elongation factor Tu and ribosomal protein SlO in the cyanobacterium Spindina plater&s provides clues to the 13. L&ez-Carcla P, Abad JP, Smith C, Amils R: Cenome analysis ancestral organization of the str and 510 operom in prokary- . of ‘different Ha/oferax-hediterranei strains using pulsed-held otes. J Gen Microbial 1993, 1392579-2584. Rel electrophoresis. Svsl ADD/ Microbial 1993, 16:31 O-321. Develois a strain&eening piotocbi for Hf. mediterranei using a previ- 30. Zillig W: Eukaryotic traits in archaebacteria. Ann NY Acad Sci 1981, 503~78-81. ously established map to predict and compare fingerprints. Also shows a fair degree of fingerprint polymorphism between strains. 31. Piihler G, Leffers H, Gropp F, Palm P, Klenk H-P, Lottspeich F, Garrett RA, Zillig W: Archaebacterial DNA-dependent RNA 14. Zillig W, Kletzin A, Schleper C, Holr I, Janekovic D, Lanzendo- polymerases testify to the evolution of the eukaryotic nuclear rfer. Kristiansson IK: Screenine for So/foloba/es, their plasmids genome. Proc Nat/ Acad Sci USA 1989, B&4569-4573. an6 viru&s in Icelandic solf&was. Syst Appl Microbib/ 1994, l&609-624. 32. Klenk H-P, Palm P, Lottspeich F, Zillig W: Component H of the DNA-dependent RNA polymerase of Archaea is homolo- 15. Schalkwyk L: Halobacterial genes and genomes. In The bio- gous to a subunit shared by the three eukaryal nuclear RNA chemistry of Archaea. Edited by Kates M, Kushner DJ, Matheson polymerases. Proc Nat/ Acad Sci USA 1992, 89:407-410. AT. Amsterdam: Elsevier; 1993:467-496. 33. Reiter W-D, Hudepohl U, Zillig W: Mutational analysis of an 16. Sapienza C, Rose MR, Doolittle WF: High-frequency genomic archaebacterial promoter: essential role of a TATA box for rearrangements involving archaebacterial repeat sequence el- transcription efficiency and start-site selection in vitro. Proc ements. Nature 1982, 299:182-185. Nat/ Acad Sci USA 1990, B7:9509-9513.

17. Palmer JR, Reeve JN: Structure and function of methanogen 34. Ouzounis C, Sander C: TFIIB, an evolutionary link between genes. In The biochemistry of Archaea. Edited By Kates M, the transcription machineries of archaebacteria and eukary- Kushner DJ, Matheson AT. Amsterdam: Elsevier; 1993:497-534. otes. Cell 1992, 71:189-190. 822 Genomes and evolution

35. Creti R, Londei P, Cammarano P: Complete nucleotide se- 45. Reiter W-D, Palm P, Voos W, Kaniecki J, Grampp 6, Schulz W, . quence of an archaeal (Pyrococcus woe&/) gene encoding a Zillig W: Putative promoter elements for the ribosomal RNA homoloa of eukaryotic transcription factor IIB ~TFIIB). Nucleic genes of the thermoacidophilic archaebacterium Sulfolobus sp. Acids Ris 1993, 21:2942. . strain B12. Nucleic Acids Res 1987, 15:5581-5595. An unidentified open reading frame, recognized originally by Ouzounis 46. Leffers H, Kjems J, 0stergaard L, Larsen N, Garrett RA: Evolu- and Sander to be TFIIB [341, is sequenced fully here. tionary relationships amongt archaebacteria. A comparative 36. Langer D, Zillig W: Putative /f//s gene of Su/folobuJ acidoca/- study .of 235 rib&ma1 RNAs of a sulphur-dependent ex- . darius encoding an archaeal transcription elongation factor is treme thermophile, an extreme halophile and a thermophilic situated directly downstream of the gene for a small subunit methanogen. / MO/ Eio/ 1987, 195:43X.1. of DNA&pendant RNA polymerase. Nucleic Acids Res 1993, 47. Achenbach-Richter L, Woese CR: The ribosomal gene 21:22Sl. spacer region in archaebacteria. Sysf Appl Microbial 1988, Initial report of a putative TFIIS gene in archaebacteria. 10:211-214. 37. Kaine BP, Mehr II, Woese CR: The sequence, and its evo- 48. Zwickl P, Grziwa A, Pi.ihler G, Dahlmann B, Lottspeich F, . lutionary implications, of a Thermococcus celer protein as- Baumeister W: Primary structure of the Thermoplasma pro- sociated with transcription. Proc /Vat/ Acad Sci USA 1994, teasome and its implications for the structure, function, and 91:38543856. evolution of the multicatalytic proteinase. Biochemistry 1992, Report from a second archaebacteria of a gene homologous to the Sul- 31:964-972. folobus TFIIS. This analysis concludes that this gene is actually not TFIIS, but rather more similar to subunits of eukaryotic RNA polymerases I and 49. Zwickl P, Lottspeich F, Dahlmann B, Baumeister W: Cloning II. This situation will probably become increasingly common as more and sequencing of the gene encoding the large (a-) subunit archaebacterial genes are shown to have multiple eukaryotic homologs. of the proteasome from Thermoplasma acidophilum. FEBS Lerr 1991, 278:217-221. 38. Marsh TL, Reich Cl, Whitlock RB, Olsen CJ: Transcription fac- . . tor IID in the Archaea: sequences in the Thermococcus celer 50. Wolf 5, Lottspeich F, Baumeister W: Ubiquitin found in the . . archaebacterium Thermoplasma acidophilum. FEES Lert 1993, genome would encode a product closely related to the TATA- binding protein of eukaryotes. Proc Naf/ Acad Sci USA 1994, 326~42-44. 91:4180-4104. A heroic effort to demonstrate that Thermoplasma acidophilum contains This paper presents the sequence of an archaebacterial TFIID gene from ubiquitin by systematically sequencing low molecular weight proteins. a random genome sequencing project. TFIID is a tandem dimer, and se- 51. Wenzel T, Baumeister W: Thermoplasma acidophilum protea- quence analysis of the two halves of the Thermococcus gene show that . . some degrades partially unfolded and ubiquitin-associated pro- the duplication occurred before the divergence of the archaebacteria and teins. FE/IS Len 1993, 326:215-218. that the rate of change in the archaebacterial sequence is less than in the Shows that the Thermoplasma proteasome acts like its eukaryotic homo- eukaryotes, suggesting that archaebacterial proteins will be useful in un- log. Also argues for the chaotropic effect of ubiquitin on the basis of ob- derstanding the evolution of gene families. servations from this in vitro system. 39. Rowlands T, Baumann P, Jackson SP: The TATA-binding protein: 52. Ptihler G, Pitzer F, Zwickl P, Baumeister W: Proteasomes: multi- . . a general transcription factor in eukaryotes and archaebacte- . . subunit proteinases common to Thermoplasma and eukaryotes. ria. Science 1994,264:1326-l 329. Sysr App/ Micro&o/ 1994, 16:734-741. The sequence of TFIID from Pyrococcus is determined and the activity An intensive search for proteasome particles in archaebacteria other than of the initiation complex (specifically TFIIB/TFIIDJ described to be much Thermoplasma failed to detect anything by several techniques (as was like eukaryotes. This provides definitive evidence for the homology of also the case in several eubacteria), raising questions as to the position the archaebacterial and eukaryotic transcription mechanisms. of Thermoplasma in the archaebacterial tree. Evolution of gene families is also examined; analysis of proteasome sequences reveals that the large 40. Garrett RA, Aagaard C, Andersen M, Dalgaard JZ, Lykke-An- number of eukaryotic subunits all arose from two original subunits and dersen J, Phan HTN, Trevisanato S, 0stergaard L, Larsen N, that the ancestral state is maintained in Thermoplasma. Leffers H: Archaeal rRNA operons, intron splicing and homing endonucleases, RNA polymerase operons and phylogeny. Sysr 53. Faguy DM, Jarrell KF, Kuzio J, Kalmokoff ML: Molecular analy- Appl Micro&o/ 1994, 16:6&X69 1. . . sis of archaeal flagellins: similarity to the type IV pillin-trans- port superfamily widespread in bacteria. Can / Micro&o/ 1994, 41. Durovic P, Dennis PP: Separate pathways for excision and pro- 40:67-71. . . cessinR of 165 and 235 rRNA from the prirnarv rRNA ooeron Shows that flagellin genes from the halophiles and methanogens are trans&pt from the hyperthermophilic &haebacterium &I/- more closely related to eubacterial type IV pilin than to other flagellins. fo/obus acidoca/darius: similarities to eubrvotic rRNA pro- This correlates with other characteristics of archaebacterial flagellins and cessing. MO/ Micro&o/ 1994, 13:229-242. ’ raises questions about flagellar evolution and biogenesis. Analysis of 235 and 16s rRNA processing in Su/lo/obus shows that the former requires the typical bulge-helix-bulge structure, whereas the 165 54. Woese CR: Evolutionary questions: the ‘progenote’. Science RNA does not. The mature 5’ end of the 165 is defined by a putative 1990, 2471709. RNA-containing endonuclease, and it is speculated that this is related to 55. Lutkenhaus J: FtsZ ring in bacterial cytokinesis. MO/ Micro&o/ the eukaryotic L/3-dependent processing system. 1993, 9:403-409. 42. Amiri KA: Fibrillarin-like proteins occur in the domain Ar- 56. Bork P, Sander C, Valencia A: An ATPase domain common . . chaea. / Bacferiol 1994, 176:2124-2127. to prokaryotic cell cycle proteins, sugar kinases, actin, and Genes from two methanogens were sequenced and shown to be very hsp70 heat shock proteins. Proc Nat/ Acad Sci USA 1992, similar to eukaryotic fibrillarins. The presence of fibrillarins in these or- 89:7290-7294. ganisms not only supports the argument that archaebacterial rRNA pro- cessing is homologous to the process in eukaryotes, but also that this is 57. Creti R, Citarella F, Tiboni 0, Sanangelantoni A, Palm P, Cam- generally true of archaebacteria and not just of the crenarchaeota. marano P: Nucleotide sequence of a DNA region comprising the gene for elongation factor t-alpha (EF-lalpha) from the 43. Kjems I, Leffers H, Olesen T, Holz I, Garrett RA: Sequence, ultrathermophilic archaeote Pyrococcus woesei: phylogenetic ofBanization and transcription of the ribosomal RNA ooeron implications. / MO/ Eva/ 1991, 33:332-342. and the downstream tRN’A and protein genes in the aichae- bacterium ThermopMum pendans. Sysf App/ Microbial 1990, 13:117-127. PJ Keeling and WF Doolittle, Canadian Institute for Advanced Research, Department of Biochemistry, Dalhousie University, 44. Kjems I, Leffers H, Garrett RA, With G, Leinfelder W, Bock A: Halifax, U3H 4H7, Nova Scotia, Canada. Gene organization, transcription signals and processing of the sinRle ribosomal RNA operon of the archaebacterium T/term* I\L Charlebois. IIepartment of Biology, University of Ottawa, p&eus tenax. Nucleic -Acids Res 1987, 15:4821-4835. Ottawa, KIN 6N5, Ontario, Canada.