Biology Direct BioMed Central

Discovery notes Open Access A DNA topoisomerase IB in testifies for the presence of this enzyme in the last common ancestor of and Eucarya Céline Brochier-Armanet*1, Simonetta Gribaldo2 and Patrick Forterre2,3

Address: 1Université de Provence, Aix-Marseille I, CNRS UPR9043, Laboratoire de Chimie Bactérienne, IFR88, Marseille, France, 2Institut Pasteur, 25 rue du Docteur roux, 75015 Paris, France and 3Univ Paris-sud, Institut de Génétique et Microbiologie, CNRS UMR8621, 91405 Orsay Cedex, France Email: Céline Brochier-Armanet* - [email protected]; Simonetta Gribaldo - [email protected]; Patrick Forterre - [email protected] * Corresponding author

Published: 23 December 2008 Received: 25 November 2008 Accepted: 23 December 2008 Biology Direct 2008, 3:54 doi:10.1186/1745-6150-3-54 This article is available from: http://www.biology-direct.com/content/3/1/54 © 2008 Brochier-Armanet et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract : DNA topoisomerase IB (TopoIB) was thought for a long time to be a eukaryotic specific enzyme. A shorter version was then found in viruses and later on in several bacteria, but not in archaea. Here, we show that a eukaryotic-like TopoIB is present in the recently sequenced genomes of two archaea of the newly proposed phylum Thaumarchaeota. Phylogenetic analyses suggest that a TopoIB was present in the last common ancestor of Archaea and Eucarya. This finding indicates that the last common ancestor of Archaea and Eucarya may have harboured a DNA genome. Reviewer: This article was reviewed by Eugene Koonin and Anthony Poole

Findings independently in the course of evolution. For instance, DNA topoisomerases are ubiquitous enzymes that control Topo IIA and IIB share a homologous ATP binding subu- DNA topology and solve topological conflicts arising dur- nit, but their DNA cleavage-religation subunits are non ing DNA replication, transcription, and recombination homologous and are structurally unrelated [2,7]. Regard- [1-3] (For a recent review on DNA topoisomerases see also ing Topo I enzymes, Topo IA, which form a transient cov- [4]). Based on their mechanisms of action, DNA topoi- alent link in 5' of the DNA break during the reaction of somerases belong to two classes, type I (Topo I) and type topoisomerization, share a Toprim domain with Topo II, II (Topo II): Topo I change the number of DNA topologi- some nucleases and primases [8], whereas Topo IB, which cal links by introducing transient single-stranded breaks form a transient covalent link in 3' of the DNA break, are in the DNA molecule, whereas Topo II introduce transient distantly related to tyrosine recombinases [2,9]. Although double-stranded breaks. According to phylogenetic crite- Topo IC forms a 3' DNA link similarly to Topo IB, it har- ria, both Topo II and Topo I classes regroup several fami- bors a novel unique fold, and is unrelated to Topo IB and lies of unrelated (i.e. non homologous) proteins: Topo IIA tyrosine recombinases [10]. The three different Topo I and IIB on one hand, and Topo IA (that also includes the families show very distinctive distributions in the living so-called Topo III of eukaryotes and bacteria), IB and IC world: Topo IA are present in currently available complete on the other hand [5,6]. This indicates that enzymes with genomes of organisms from the three domains of life [6], either Topo I or Topo II activity originated multiple times whereas Topo IC appears so far specific to one particular

Page 1 of 8 (page number not for citation purposes) Biology Direct 2008, 3:54 http://www.biology-direct.com/content/3/1/54

, the archaeon Methanopyrus kandleri [5]. Finally, ase, in the recently sequenced genome of a second thau- Topo IB is present in eukaryotes, in poxviruses, in the marchaeon Nitrosopumilus maritimus [25], which also mimivirus, and in some bacteria [6,10,11]. lacks a Topo IA homologue. Both thaumarchaeal Topo IB display a domain organisation that is very similar to that Topo IB (sometimes named swivelase) was first described of their eukaryotic homologues, since these harbour both in mouse and plays a very important role [1,12]. Indeed, the N-terminal Topoisom_I_N and the C-terminal whereas Topo IA can only relax negative superturns, Topo Topoisom_I domain (Figure 1A and Additional files 1). IB can relax both positive and negative superturns in vitro. The main difference between the eukaryotic and the As a consequence, eukaryotic Topo IB may relax the posi- archaeal Topo IB is that the former possess a long and tive superturns that accumulate in front of replication highly variable extension upstream of the Topoisom_I_N forks or transcription bubbles during DNA replication, domain that is absent in the archaeal sequences (Figure transcription, and chromatin assembly. In addition, Topo 1A and Additional files 1). Two hypotheses can be pro- IB may also relax the compensatory positive superturns posed to explain the presence of a Topo IB coding gene in that form when the DNA becomes negatively wrapped Thaumarchaeota. One is that this gene was acquired by around the histone octamer during nucleosome forma- the last common ancestor of Thaumarchaeota via a hori- tion. Although these tasks can be fulfilled also by Topo II zontal gene transfer (HGT) (blue arrow, Figure 1B-a). In enzymes, genetic analyses have clearly indicated that that case, the donor would have been a eukaryote since Topo IB plays a major role in DNA replication, transcrip- both the thaumarchaeal and the eukaryotic Topo IB har- tion and chromatin assembly in Saccharomyces cerevisiae bour a similar domain organisation. Alternatively, a Topo [13-15]. Testifying for its crucial role in eukaryotes, Topo IB coding gene might have been present in the last com- IB is the target of one of the most important antitumoral mon ancestor of Archaea and Eucarya and was then lost in drugs, camptothecin [16]. Topo IB have been discovered all archaea, except in the lineage leading to Thaumarchae- in Poxviruses by Bauer and colleagues in 1977 [17], and ota (Figures 1Bb-d). To distinguish between these two the vaccinia virus Topo IB has been widely used as a hypotheses on the origin of thaumarchaeal Topo IB, we model system to decipher the catalytic activity of this have performed an in-depth phylogenetic analysis of enzyme [18-20] and more recently to search for new anti- Topo IB homologues. viral drugs [21]. However, viral Topo IB are quite different from their eukaryotic counterparts, since they harbour a We retrieved homologues of Topo IB from the nr database specific domain (virDNA-Topo-I_N) in their N-terminus at the NCBI (117 sequences from Eucarya, 2 from instead of the long Topoisom_I_N domain found in Archaea, 152 from Bacteria and 30 from viruses), as well eukaryotic homologues (Figure 1A and Additional files as some environmental putative thaumarchaeal 1). Recently, homologues of Topo IB have been detected sequences from the GOS project [26] at the NCBI (For in several bacterial genomes and one of these has been more details, see Additional files 2). We then selected 151 characterized from Deinococcus radiodurans [22]. These sequences representatives of Topo IB diversity for phylo- bacterial Topo IB harbour a domain organisation close to genetic analysis. The resulting maximum likelihood tree the viral enzymes (Figure 1A and Additional files 1). (Figure 2) shows that the two archaeal Topo IB group with the few environmental sequences (BV = 100%) confirm- Up to now, Topo IB have never been observed in Archaea, ing that these are likely from yet uncultivated representa- in sharp contrast to members of the Topo IA family which tives of Thaumarchaeota. Although thaumarchaeal are present in one or more copies in all archaeal genomes sequences are not yet abundant in environmental data- [6] (Additional files 2 and 3). Surprisingly, we recently bases, this suggests that the presence of Topo IB is very noticed that a Topo IB coding gene was identified in the likely a characteristic of this phylum. Moreover, thaumar- genome of the archaeon [23,24], chaeal Topo IB form a strongly supported sister-group to but that a Topo IA coding gene was absent [24]. Phyloge- their eukaryotic homologues (BV = 100%). This sister- netic analyses of the archaeal domain based on concate- grouping of eukaryotic and thaumarchaeal sequences is nation of ribosomal proteins and comparative genome also strongly supported when other reconstruction meth- analysis have recently led us to propose that C. symbiosum ods are used (not shown). The fact that thaumarchaeal and its relatives, formerly included in the phylum Crenar- sequences are sister to eukaryotes and do not arise from chaeota, should be considered as members of a separate within them, coupled to the absence of the N-terminal and possibly ancient phylum, that we proposed to name extension in the archaeal sequences, strongly suggest that Thaumarchaeota [24]. We predicted that the absence of a Thaumarchaeota did not acquire their Topo IB gene from Topo IA and the presence of a Topo IB might be a distinc- a present-day eukaryotic lineage via a recent HGT. Based tive feature of all thaumarchaeota members. As expected, on phylogenetic and genomic analysis, we have recently we have detected an archaeal Topo IB homologue proposed that Thaumarchaeota may represent the deepest (YP_001582656), misannotated as an 2-alkenal reduct- branching lineage in the archaeal phylogeny, i.e. they

Page 2 of 8 (page number not for citation purposes) Biology Direct 2008, 3:54 http://www.biology-direct.com/content/3/1/54

A 1 100 200 300 400 500 600 700 800 900 1000

Eucarya

Thaumarchaeota

Topoisom_I_N Topoisom_I

Bacteria + Mimivirus

virDNA-topo-I_N-like Topoisom_I Virus

virDNA-topo-I_N Topoisom_I

B ARCHAEA ARCHAEA Euryarchaeota a b Euryarchaeota Crenarchaeota

Thaumarchaeota Thaumarchaeota

EUCARYA EUCARYA

BACTERIA BACTERIA ARCHAEA ARCHAEA c Euryarchaeota d Crenarchaeota Thaumarchaeota Thaumarchaeota

Crenarchaeota Euryarchaeota

EUCARYA EUCARYA

BACTERIA BACTERIA

Figure 1 (see legend on next page)

Page 3 of 8 (page number not for citation purposes) Biology Direct 2008, 3:54 http://www.biology-direct.com/content/3/1/54

A.bacteriaFigure Schematic 1and (see three representation previous viruses page) (the of multiplethe domain alignment organisation of these of sequencesTopo IB sequences is provided from as Additionalthree eucarya, Files two 1) thaumarchaeota, two A. Schematic representation of the domain organisation of Topo IB sequences from three eucarya, two thau- marchaeota, two bacteria and three viruses (the multiple alignment of these sequences is provided as Addi- tional Files 1). Coloured boxes delineate the putative functional domains according to the PFAM database http:// pfam.sanger.ac.uk/: Topoisom_I_N (PF02919, Eukaryotic DNA topoisomerase I, DNA binding fragment) in red, Topoisom_I (PF01028, Eukaryotic DNA topoisomerase I, catalytic core) in blue and virDNA-Topo-I_N (PF09266, Viral DNA topoisomer- ase I, N-terminal) in light-green. The N-ter regions of viral Topo IB is similar in size and share conserved residues with bacterial and mimiviral homologues, suggesting the presence of a virDNA-Topo-I_N-like domain in these sequences (in dark-green). B. Alternative evolutionary scenarios explaining the presence of Topo IB in Thaumarchaeota. Filled blue diamonds indicate the presence of a Topo IB coding gene. Empty blue diamonds indicate a Topo IB coding gene that were present in the ancestor of the corresponding lineage and lost during its evolution. Blue crosses indicate the loss events of Topo IB coding genes. (a) A Topo IB coding gene was acquired by the ancestor of Thaumarchaeota via horizontal gene transfer (blue arrow) from a eukary- otic lineage. (b), (c) and (d) A Topo IB coding gene was present in the ancestor of Archaea and Eucarya and was subsequently lost in the ancestor of Crenarchaeota and Euryarchaeota, in agreement with a thaumarchaeal rooting of the archaeal tree (B). The Topo IB coding gene was independently lost in the ancestors of Euryarchaeota and Crenarchaeota according to an euryar- chaeal or crenarchaeal rooting of the archaeal tree (C and D). emerged before the divergence between Euryarchaeota previously suggested that the viral-like Topo IB found in and Crenarchaeota [24]. This proposal is consistent with Bacteria was originally introduced from a DNA virus [6]. a large scale analysis performed by Koonin and collabora- Our new and more detailed phylogenetic analysis, as well tors [27]. The basal branching of Thaumarchaeota is also as the similarity of the domain organisation of viral and supported by the fact that, as in eukaryotes, the largest bacterial Topo IB, confirms the close relationship between subunit of the RNA polymerase is not split in C. symbiosum these sequences and their probable common ancestry, and N. maritimus whereas it is split in A00 and A0 although the direction of transfer is yet unclear. polypeptides in all other archaea for which sequences are available [28]. In order to account for the observed distri- The likely presence of both a Topo IA and Topo IB in the bution of Topo IB in modern archaea, a deep branching of last common archaeal ancestor ([6] and this study, respec- Thaumarchaeota requires only one evolutionary event tively), suggests that this ancestor was possibly more (the loss of the Topo IB gene in an ancestor of Euryarchae- "complex" than modern archaea (if complexity is defined ota and Crenarchaeota, after their divergence from Thau- in terms of number of genes and/or redundancy of cellu- marchaeota) (blue cross, Figure 1B-b). Accordingly, the lar processes). This idea was already proposed by presence of a Topo IB in C. symbiosum and N. maritimus Lecompte et al. who highlighted a streamlining in the evo- may represent an ancestral archaeal feature. In contrast, if lution of archaeal ribosomes [29]. This is consistent with the root of the archaeal tree is located in either the euryar- the recent observation that several proteins common to chaeal or the crenarchaeal branch, two independent losses Archaea and Eukaryotes are missing in either Crenarchae- of Topo IB would be required to explain the observed data ota, Euryarchaeota or Thaumarchaeota [27] and may indi- (i.e. in the ancestor of Crenarchaeota and in the ancestor cate a possible tendency of evolution by streamlining of of Euryarchaeota, blue crosses, Figure 1B-c and 1B-d). some central molecular processes in the archaeal domain. Thus, the evolutionary history of Topo IB provides addi- Finally, one of us has recently proposed that a transition tional and independent evidence consistently with a root- from RNA genomes to DNA genomes occurred independ- ing of the archaeal tree in the thaumarchaeal branch [24]. ently in each of the three life domains by the contribution of three different DNA viruses to three complex RNA cells Topo IB have been for long thought to be absent in [30]. The idea of different DNA viruses at the origin of Archaea. Our finding now extends the presence of Topo IB Archaea and Eucarya sought to explain the existence of homologues in members of all three domains of life. This several critical differences in their DNA replication sys- may thus suggest that this enzyme was already present in tems, including the ancestral presence of a Topo IB exclu- the Last Universal Common Ancestor (LUCA). However, sively in Eucarya. Our finding that the last common Topo IB homologues are either absent or scarcely distrib- ancestor of Archaea and Eucarya probably contained a uted in complete genomes from most main bacterial Topo IB weakens this argument, and is more in favour of phyla (Additional files 3). Moreover, the bacterial part of a DNA genome for this ancestor. the Topo IB tree is not congruent with the bacterial specie tree (i.e. the monophyly of main bacterial groups is not Competing interests recovered, Figure 2), suggesting that the history of Topo IB The authors declare that they have no competing interests. in Bacteria was dominated by lateral gene transfers. It was

Page 4 of 8 (page number not for citation purposes) Biology Direct 2008, 3:54 http://www.biology-direct.com/content/3/1/54

Nitrosopumilus maritimus SCM1 YP_001582656 marine metagenome ECX78833 Cenarchaeum symbiosum A YP_875131 100 64 Thaumarchaeota ARCHAEA 73 marine metagenome EDA41376 100 marine metagenome ECU81978 Ostreococcus lucimarinus CCE9901 XP_001420818 100 Ostreococcus tauri CAL57294 Viridiplantae Physarum polycephalum AAC14193 Dictyostelium discoideum AX4 XP_639222 Amoebozoa 73 Thalassiosira pseudonana Incomplet Phytophthora infestans incomplet 100 Phytophthora ramorum Incomplet Stramenopiles Tetrahymena thermophila SB210 XP_001025304 94 98 Paramecium tetraurelia strain d4-2 XP_001451092 Cryptosporidium hominis TU502 XP_668245 84 Plasmodium falciparum 3D7 XP_001351663 Alveolata 100 Babesia bovis T2Bo XP_001610263 92 87 Theileria annulata strain Ankara XP_952523 Trypanosoma brucei TREU927 XP_844283 Trypanosoma cruzi strain CL Brener XP_817800 91 92 Leishmania major strain Friedlin XP_001686448 Euglenozoa Chlamydomonas reinhardtii XP_001703256 Physcomitrella patens subsp. patens XP_001773330 62 100 Arabidopsis thaliana NP_200341 82 Arabidopsis thaliana NP_200342 98 Vitis vinifera CAO23979 Viridiplantae 50 Vitis vinifera CAO47952 Oryza sativa Japonica Group NP_001061015 91 Coprinopsis cinerea okayama7 130 XP_001830595 Cryptococcus neoformans var. neoformans JEC21 XP_572925 EUCARYA 51 Schizosaccharomyces pombe 972h- NP_596209 Aspergillus terreus NIH2624 XP_001211728 71 100 Neurospora crassa OR74A XP_958033 Fungi Yarrowia lipolytica CLIB122 XP_503300 Saccharomyces cerevisiae NP_014637 94 Candida albicans AAB39507 60 Monosiga brevicollis MX1 XP_001748042 Choanoflagellata Apis mellifera XP_396203 63 Nasonia vitripennis XP_001605104 92Tribolium castaneum XP_971195 83 56 Drosophila melanogaster P30189 Brugia malayi XP_001899795 100 Caenorhabditis elegans CAA65537 86 100 Danio rerio NP_001037789 Gallus gallus NP_990441 99 Homo sapiens AAA61207 Metazoa 93 Mus musculus NP_033434 Danio rerio XP_689637 Danio rerio NP_001001838 Homo sapiens AAH52285 Mus musculus NP_082680 71 Gallus gallus NP_001001300 Bovine papular stomatitis virus NP_957971 98 Orf virus NP_957839 Yaba monkey tumor virus NP_938332 100 97 Tanapox virus YP_001497072 81 Monkeypox virus NP_536523 Vaccinia virus Tian Tan AAF33965 VIRUSES 100 68 Variola major virus AAA60837 Acanthamoeba polyphaga mimivirus YP_142548 100 Pseudoalteromonas atlantica T6c YP_662415 73 Alteromonas macleodii Deep ecotype ZP_01112272 Beta/Gamma-Proteobacteria Deinococcus radiodurans R1 NP_294413 Deinococcus geothermalis DSM 11300 YP_605521 99 Azoarcus sp. BH72 YP_933299 Deinococci 57 100 89 70 Azoarcus sp. EbN1 YP_160972 Stigmatella aurantiaca DW4 3-1 ZP_01466520 Beta/Gamma-Proteobacteria Anaeromyxobacter sp. Fw109-5 YP_001378897 91 Anaeromyxobacter dehalogenans 2CP-C YP_466158 69 Anaeromyxobacter sp. K ZP_02174452 100 Delta-Proteobacteria 100 Anaeromyxobacter dehalogenans 2CP-1 ZP_02325039 77 Aurantimonas sp. SI85-9A1 ZP_01227889 Bradyrhizobium sp. ORS278 YP_001202546 64 100 Azorhizobium caulinodans ORS 571 YP_001525200 Agrobacterium tumefaciens str. C58 NP_356605 61 Rhizobium leguminosarum bv. viciae 3841 YP_771338 97 Sinorhizobium meliloti 1021 NP_437814 86 86 Roseobacter sp. CCS2 ZP_01751851 Sagittula stellata E-37 ZP_01746177 Phaeobacter gallaeciensis 2.10 ZP_02149742 Rhodobacter sphaeroides 2.4.1 YP_354029 Alpha-Proteobacteria Oceanicola batsensis HTCC2597 ZP_01001622 61 Oceanibulbus indolifex HEL-45 ZP_02155347 Sphingomonas wittichii RW1 YP_001263436 Mesorhizobium sp. BNC1 YP_675314 71 Ochrobactrum anthropi ATCC 49188 YP_001372842 Novosphingobium aromaticivorans DSM 12444 YP_497864 Erythrobacter litoralis HTCC2594 YP_458571 100 81 Erythrobacter sp. SD-21 ZP_01862564 Arthrobacter aurescens TC1 YP_946110 Clavibacter michiganensis subsp. sepedonicus YP_001709063 99 marine actinobacterium PHSC20C1 ZP_01129392 99 Arthrobacter sp. FB24 YP_832570 81 Nocardioides sp. JS614 YP_921940 Salinispora arenicola CNS-205 YP_001536631 Frankia alni ACN14a YP_714104 Actinobacteria 65 Nocardia farcinica IFM 10152 YP_119639 Mycobacterium vanbaalenii PYR-1 YP_955527 99 Mycobacterium smegmatis str. MC2 155 YP_886157 86 59 Mycobacterium avium subsp. paratuberculosis K-10 NP_962178 Acidovorax avenae subsp. citrulli AAC00-1 YP_969856 90 Delftia acidovorans SPH-1 YP_001565359 Nitrosospira multiformis ATCC 25196 YP_410740 Beta/Gamma-Proteobacteria Caulobacter sp. K31 YP_001685265 Stenotrophomonas maltophilia R551-3 ZP_01644160 Alpha-Proteobacteria Xanthomonas campestris pv. campestris str. ATCC 33913 NP_635429 100 Xanthomonas axonopodis pv. citri str. 306 NP_640391 Beta/Gamma-Proteobacteria 100 Xanthomonas oryzae pv. oryzicola BLS256 ZP_02241230 91 Gemmata obscuriglobus UQM 2246 ZP_02731925 Bordetella petrii DSM 12804 YP_001631006 PVC Ralstonia eutropha JMP134 YP_298569 100 Ralstonia eutropha H16 YP_725990 70 Burkholderia xenovorans LB400 YP_554144 Burkholderia phymatum STM815 YP_001861421 97 78 Burkholderia cenocepacia PC184 ZP_00977096 Burkholderia multivorans ATCC 17616 YP_001585914 100 Burkholderia dolosa AUO158 ZP_00983228 71 Burkholderia ambifaria MEX-5 ZP_02905124 90 97 Burkholderia vietnamiensis G4 YP_001115703 Thiobacillus denitrificans ATCC 25259 YP_314503 Beta/Gamma-Proteobacteria Pseudomonas stutzeri A1501 YP_001172718 65 Pseudomonas aeruginosa C3719 ZP_00969479 0.1 Pseudomonas putida F1 YP_001267266 100 Pseudomonas putida W619 YP_001750059 75 Ralstonia pickettii 12J ZP_01661204 Pseudomonas mendocina ymp YP_001188262 Pseudomonas fluorescens Pf-5 YP_260520 Pseudomonas syringae pv. tomato str. DC3000 NP_792773 66 Pseudomonas fluorescens PfO-1 YP_348675 86Sorangium cellulosum So ce 56 YP_001611798 Delta-Proteobacteria Solibacter usitatus Ellin6076 YP_821734 Acidobacteria Methylobacillus flagellatus KT YP_544669 bacterium Ellin514 ZP_02967294 Beta/Gamma-Proteobacteria Opitutus terrae PB90-1 YP_001818028 Candidatus Protochlamydia amoebophila UWE25 YP_008181 PVC 57 Methylobacterium sp. 4-46 YP_001771806 PVC Methylobacterium radiotolerans JCM 2831 YP_001753451 Sphingomonas wittichii RW1 YP_001260160 93 Nitrobacter hamburgensis X14 YP_579129 Alpha-Proteobacteria 64 Methylocella silvestris BL2 ZP_02955491 88 Psychrobacter sp. PRwf-1 YP_001279766 100 Psychrobacter cryohalolentis K5 YP_580104 Beta/Gamma-Proteobacteria Cytophaga hutchinsonii ATCC 33406 YP_678894 76 Pedobacter sp. BAL39 ZP_01886451 Polaribacter dokdonensis MED152 ZP_01053988 54 Flavobacterium johnsoniae UW101 YP_001195701 93 Dokdonia donghaensis MED134 ZP_01050427 83 Leeuwenhoekiella blandensis MED217 ZP_01061547 CFB 59 Gramella forsetii KT0803 YP_861935 Croceibacter atlanticus HTCC2559 ZP_00950597 Flavobacteria bacterium BBFL7 ZP_01201969 BACTERIA FigureUnrooted 2 maximum likelihood phylogenetic tree of 151 Topo IB sequences Unrooted maximum likelihood phylogenetic tree of 151 Topo IB sequences. Numbers at branches represent boot- strap proportions. The scale bar represents the average number of substitutions per site.

Page 5 of 8 (page number not for citation purposes) Biology Direct 2008, 3:54 http://www.biology-direct.com/content/3/1/54

Authors' contributions 2) I also think that another adjustment, a less fundamen- CB, PF and SG conceived the study. CB designed and car- tal but, perhaps, even more badly needed one relates to ried out the analyses, CB, SG and PF wrote the manu- the very "discovery" of the archaeal Topo IB. The protein script. All authors read and approved the final sequence is very well conserved, so it is somewhat disin- manuscript. genuous to claim the finding of Topo IB as a discovery sensu strictu. The Cenarchaeum Topo IB is annotated in Reviewers' comments GenBank as such; it is another matter that the presence of Eugene Koonin this interesting gene in the Cenarchaeum genome is not Review of Brochier-Armanet, gribaldo, and Forterre highlighted in the primary paper (Hallam et al. PNAS 2006, 103: 18296) although "two topoisomerases" are 'A DNA topoisomerase IB in Thaumarchaeota testifies for mentioned. In any case, I do not think that it is proper to the presence of this enzyme in the last common ancestor claim this finding in itself as a "discovery"; it would be of Archaea and Eukaryotes" much better to cite Hallam et al., and to explain the entire situation. This is a very straightforward study of the Topo IB of Thau- marchaeota (formerly, mesophilic Crenarchaeota).dem- We cite the paper describing the genome of C. symbiosum and onstrating that the archaeal TopoIB clusters with the explain in the text, that one of the two DNA topoisomerases eukaryotic orthologs, at the base of the eukaryotic subtree. coding genes identified in the genome of C. symbiosum codes Combined with the fact that the archaeal and eukaryotic for a Topo IB, and that surprisingly no Topo IA coding gene was Topo IB proteins have similar domain organizations, present in this genome. these findings clearly demonstrate their monophyly. Conversely, the ortholog from Nitrosopumilis is mistakenly 1) I think, however, this is where the certainty stops. annotated as some completely unrelated enzyme, and I Indeed, I do not believe that the scenario with horizontal think it is desirable to correct this (trivial) error. These cor- transfer of the eukaryotic Topo IB gene into the common rections will not detract from the message of the present ancestor of Thaumarchaeota can be considered rigorously article but will make it better balanced. falsified because it is hardly possible to rule out a dramatic acceleration of evolution after the transfer, resulting in the We mention the fact that the gene coding for a putative homo- observed tree topology. PHyml is relatively robust to this logue of TopoIB in N. maritimus was misannotated in this sort of artifacts but there are inescapable limits. Ditto genome. regarding the presence of Topo IB: the results of this work add credence to such a conclusion but alternatives based Anthony Poole on horizontal gene transfer cannot be ruled out. I think This succinct report presents a nice phylogenetic result the paper would become better balanced if these uncer- that provides two important evolutionary insights. The tainties were acknowledged, and the conclusions, espe- first is that the identification of Topo IB topoisomerases cially, those at the end of the Abstract are toned down. In within members of the recently proposed archaeal phy- particular, the "support" of the Thaumaarcaheal rooting lum Thaumarchaeota (together with a supporting phylo- of the tree inferred from the phylogenetic analysis of this genetic analysis) indicates that a Topo IB enzyme was single gene is very weak, and it would be better to speak of likely present in the common ancestor of eukaryotes and the compatibility of the results with such rooting. archaea. This potentially tells us two things. First, if the presence of Topo IB within archaea is restricted to the We think that the hypothesis of a HGT from present days Thaumarchaea, it strengthens the view that this is a genu- eukaryotes to the ancestor of Thaumarchaeota is less likely than ine phylum (as recently proposed by these authors – ref. the hypothesis of the presence of a Topo IB gene in the ancestor [24]). In that paper, the authors presented evidence that of Eucarya and Archaea, followed by the loss of gene in the the mesophilic archaeon, Crenarchaeum symbiosum did not ancestor of Euryarchaeota/Crenarchaeota. However, as pointed group within the Crenarchaea, and that, in their trees, this out by referee two, we present both hypotheses and said carefully species was likewise distinct from Euryarchaeota. If the in the text that our phylogenetic analysis as the domain organ- basal position of Thaumarchaeota is correct, the implica- isation of Topo IB homologues "strongly suggest". tion is that Topo IB was lost early in archaeal evolution, prior to the divergence of Euryarchaea and Crenarchaea. Concerning the phylogenetic analyses, we used alternative While their results (in ref. [24]) indicated that C. symbio- methods to ML (as Bayesian methods), all the resulting trees sum is basal to the archaeal tree, in the current paper, they strongly support the sister-grouping of Thaumarchaeota and nevertheless approach this with caution, and provide us Eucarya. We add this point in the text. with three different scenarios (Figure 1B) that serve as a valuable framework for evaluating the implications of the

Page 6 of 8 (page number not for citation purposes) Biology Direct 2008, 3:54 http://www.biology-direct.com/content/3/1/54

conservation of eukaryotic and archaeal Topo IB (the fourth, transfer from eukaryotes – their Figure 1B-a – can Additional file 2 be ruled out on the phylogenetic results presented). Figure Archaeal-topoin-af2. Material and methods. 1B is therefore a very welcome addition to this paper Click here for file [http://www.biomedcentral.com/content/supplementary/1745- because it allows the reader to evaluate the data and phy- 6150-3-54-S2.pdf] logeny in Figure 1B with respect to several hypotheses. Too often we see only one possible hypothesis being pre- Additional file 3 sented (and one sometimes gets a sense that the analysis Archaeal-topoin-af3. Table showing the taxonomic distribution of the 95 of the data in a particular way is a foregone conclusion), Topo IB, 634 Topo IA sensu stricto, 369 Topo III and 40 Reverse gyrase so it is nice to see that the authors have thought this sequences retrieved from the 670 complete bacterial and archaeal through carefully, and are both aware of and open to the genomes available in June 2008. compexities of interpretation. Click here for file [http://www.biomedcentral.com/content/supplementary/1745- 6150-3-54-S3.pdf] The second insight is that placement of this topoisomer- ase type in the common ancestor of archaea and eukaryo- tes strengthens the evidence that this ancestor had a DNA- based genome. This point might need brief explanation. Acknowledgements While the naïve expectation is that DNA was present in CBA is the recipient of an Action Thématique et Incitative sur Programme the Last Universal Common Ancestor, the available com- (ATIP) of the French Centre National de la Recherche Scientifique. The parative genomic data on enzymes involved in deoxyribo- work on DNA topoisomerases at the university Paris-Sud is supported by nucleotide synthesis and DNA replication do not allow a grant from the Association de la Recherche contre le Cancer (ARC), PF this conclusion to be readily drawn. In light of these con- is supported by funding from the Institut Universitaire de France (IUF) flicting data, Forterre recently proposed a model (ref. References [30]) wherein each domain may have independently 1. Champoux JJ: DNA topoisomerases: structure, function, and gained the capacity for DNA synthesis. The essence of the mechanism. Annu Rev Biochem 2001, 70:369-413. model (an arms race between cells and viruses) is very ele- 2. Corbett KD, Berger JM: Structure, molecular mechanisms, and evolutionary relationships in DNA topoisomerases. Annu Rev gant, and invokes known processes (there are several cases Biophys Biomol Struct 2004, 33:95-118. where viruses are known to carry altered genomes – phage 3. Wang JC: Cellular roles of DNA topoisomerases: a molecular genomes with uracil instead of thymine, for example). It perspective. Nat Rev Mol Cell Biol 2002, 3:430-440. 4. Schoeffler AJ, Berger JM: DNA topoisomerases: harnessing and is exciting to see that the discovery of Thaumarchaeal constraining energy to govern chromosome topology. Q Rev Topo IB helps to improve our understanding of DNA ori- Biophys 2008, 41:41-101. 5. Forterre P: DNA topoisomerase V: a new fold of mysterious gins in that its inclusion supports a less complex scenario origin. Trends Biotechnol 2006, 24:245-247. (i.e. at most two independent gains). 6. Forterre P, Gribaldo S, Gadelle D, Serre MC: Origin and evolution of DNA topoisomerases. Biochimie 2007, 89:427-446. 7. Gadelle D, Filee J, Buhler C, Forterre P: Phylogenomics of type II Additional material DNA topoisomerases. Bioessays 2003, 25:232-242. 8. Aravind L, Leipe DD, Koonin EV: Toprim – a conserved catalytic domain in type IA and II topoisomerases, DnaG-type pri- Additional file 1 mases, OLD family nucleases and RecR proteins. Nucleic Acids Res 1998, 26:4205-4213. Archaeal-topoin-af1. Multiple alignment of Topo IB sequences from three 9. Cheng C, Kussie P, Pavletich N, Shuman S: Conservation of struc- eukaryotes (Scerevisiae = Saccharomyces cerevisiae, Hsapiens = Homo ture and mechanism between eukaryotic topoisomerase I sapiens, Osativa = Oryza sativa), the two thaumarchaeota (Cenar- and site-specific recombinases. Cell 1998, 92:841-850. chaeum symbiosum and Nitrosopumilus maritimus), two bacteria 10. Taneja B, Patel A, Slesarev A, Mondragon A: Structure of the N- (Oterrae = Opitutus terrae PB90-1 and Rlitoralis = Roseobacter litora- terminal fragment of topoisomerase V reveals a new family of topoisomerases. Embo J 2006, 25:398-408. lis Och 149) and three viruses (Apolyphaga = Acanthamoeba polyphaga 11. Benarroch D, Claverie JM, Raoult D, Shuman S: Characterization mimivirus, Bpapular = Bovine papular stomatitis virus and Ymonkey = of mimivirus DNA topoisomerase IB suggests horizontal Yaba monkey tumor virus). Coloured boxes delineate the putative func- gene transfer between eukaryal viruses and bacteria. J Virol tional domains according to the PFAM database: Topoisom_I_N 2006, 80:314-321. (PF02919, Eukaryotic DNA topoisomerase I, DNA binding fragment) in 12. Champoux JJ, Dulbecco R: An activity from mammalian cells red, Topoisom_I (PF01028, Eukaryotic DNA topoisomerase I, catalytic that untwists superhelical DNA – a possible swivel for DNA replication (polyoma-ethidium bromide-mouse-embryo core) in blue and virDNA-Topo-I_N (PF09266, Viral DNA topoisomer- cells-dye binding assay). Proc Natl Acad Sci USA 1972, 69:143-146. ase I, N-terminal) in green. The N-ter regions of viral Topo IB share con- 13. Garinther WI, Schultz MC: Topoisomerase function during rep- served residues with bacterial and mimiviral homologues, suggesting the lication-independent chromatin assembly in yeast. Mol Cell presence of a virDNA-Topo-I_N-like domain in these sequences. Biol 1997, 17:3520-3526. Click here for file 14. Kim RA, Wang JC: Function of DNA topoisomerases as repli- [http://www.biomedcentral.com/content/supplementary/1745- cation swivels in Saccharomyces cerevisiae. J Mol Biol 1989, 208:257-267. 6150-3-54-S1.pdf] 15. Brill SJ, DiNardo S, Voelkel-Meiman K, Sternglanz R: Need for DNA topoisomerase activity as a swivel for DNA replication for transcription of ribosomal RNA. Nature 1987, 326:414-416.

Page 7 of 8 (page number not for citation purposes) Biology Direct 2008, 3:54 http://www.biology-direct.com/content/3/1/54

16. Liu LF, Desai SD, Li TK, Mao Y, Sun M, Sim SP: Mechanism of action of camptothecin. Ann N Y Acad Sci 2000, 922:1-10. 17. Bauer WR, Ressner EC, Kates J, Patzke JV: A DNA nicking-closing enzyme encapsidated in vaccinia virus: partial purification and properties. Proc Natl Acad Sci USA 1977, 74:1841-1845. 18. Shuman S: Vaccinia virus DNA topoisomerase: a model eukaryotic type IB enzyme. Biochim Biophys Acta 1998, 1400:321-337. 19. Krogh BO, Shuman S: Catalytic mechanism of DNA topoi- somerase IB. Mol Cell 2000, 5:1035-1041. 20. Tian L, Shuman S: Vaccinia topoisomerase mutants illuminate roles for Phe59, Gly73, Gln69 and Phe215. Virology 2007, 359:466-476. 21. Osheroff N: Unraveling the structure of the variola topoi- somerase IB-DNA complex: a possible new twist on small- pox therapy. Mol Interv 2006, 6:245-248. 22. Krogh BO, Shuman S: A poxvirus-like type IB topoisomerase family in bacteria. Proc Natl Acad Sci USA 2002, 99:1853-1858. 23. Hallam SJ, Konstantinidis KT, Putnam N, Schleper C, Watanabe Y, Sugahara J, Preston C, de la Torre J, Richardson PM, DeLong EF: Genomic analysis of the uncultivated marine crenarchaeote Cenarchaeum symbiosum. Proc Natl Acad Sci USA 2006, 103:18296-18301. 24. Brochier-Armanet C, Boussau B, Gribaldo S, Forterre P: Mesophilic Crenarchaeota: proposal for a third archaeal phylum, the Thaumarchaeota. Nat Rev Microbiol 2008, 6:245-252. 25. Konneke M, Bernhard AE, de la Torre JR, Walker CB, Waterbury JB, Stahl DA: Isolation of an autotrophic ammonia-oxidizing marine archaeon. Nature 2005, 437:543-546. 26. Venter JC, Remington K, Heidelberg JF, Halpern AL, Rusch D, Eisen JA, Wu D, Paulsen I, Nelson KE, Nelson W, et al.: Environmental genome shotgun sequencing of the Sargasso Sea. Science 2004, 304:66-74. 27. Makarova KS, Sorokin AV, Novichkov PS, Wolf YI, Koonin EV: Clus- ters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea. Biol Direct 2007, 2:33. 28. Kwapisz M, Beckouet F, Thuriaux P: Early evolution of eukaryotic DNA-dependent RNA polymerases. Trends Genet 2008, 24:211-215. 29. Lecompte O, Ripp R, Thierry JC, Moras D, Poch O: Comparative analysis of ribosomal proteins in complete genomes: an example of reductive evolution at the domain scale. Nucleic Acids Res 2002, 30:5382-5390. 30. Forterre P: Three RNA cells for ribosomal lineages and three DNA viruses to replicate their genomes: a hypothesis for the origin of cellular domain. Proc Natl Acad Sci USA 2006, 103:3669-3674.

Publish with BioMed Central and every scientist can read your work free of charge "BioMed Central will be the most significant development for disseminating the results of biomedical research in our lifetime." Sir Paul Nurse, Cancer Research UK Your research papers will be: available free of charge to the entire biomedical community peer reviewed and published immediately upon acceptance cited in PubMed and archived on PubMed Central yours — you keep the copyright

Submit your manuscript here: BioMedcentral http://www.biomedcentral.com/info/publishing_adv.asp

Page 8 of 8 (page number not for citation purposes)