UNIVERSITY OF COPENHAGEN FACULTY OF SCIENCE

PhD Thesis Xu Feng Identification of the archaeal regulatory network in DNA damage response

Supervisor: Qunxin She

Date of Submission: 12 June 2018

Name of department: Department of Biology

Author(s): Xu Feng

Title and subtitle: Identification of the archaeal regulatory network in DNA damage response

Topic description: The role of a paralog of TFB/TFIIB protein and an orthologue of Orc1/Cdc6 protein during DNA damage-induced transcriptional responses in Sulfolobus islandicus Rey15A was investigated.

Supervisor: Qunxin She

Submitted on: 12 June 2018

This thesis has been submitted to the PhD School of the Faculty of Science, University of Copenhagen

The cover image is an integrated model of the regulatory network of DNA damage response in Sulfolobus.

PREFACE

The thesis entitled ‘Identification of the archaeal regulatory network in DNA damage response’ was submitted to the Faculty of Science, University of Copenhagen. The work presented in the thesis was carried out at the Danish Archaea Center (DAC), Department of Biology, University of Copenhagen, Denmark, under the supervision of Dr. Qunxin She and with financial support from the Danish Council for Independent Research (DFF-4181-00274) and the China Scholarship Council (CSC).

The thesis starts with a general overview of the mechanisms employed by the three domains of life to deal with DNA damage, including DNA damage repair and tolerance pathways. It is followed by an introduction to the current knowledge about the DNA damage response (DDR) across the tree of life. Subsequently, the results obtained during my PhD study are summarized. Finally, the thesis closes with a discussion of the results and perspectives for future research.

Two papers are included at the end of the thesis, with the first one addressing the function of TFB3 and the second focusing on the functional study of Orc1-2 during DDR regulation in Sulfolobus islandicus Rey15A. These papers are given below:

Xu Feng, Mengmeng Sun, Wenyuan Han, Yun Xiang Liang, Qunxin She; A transcriptional factor B paralog functions as an activator to DNA damage-responsive expression in archaea, Nucleic Acids Research, gky236, https://doi.org/10.1093/nar/gky236

Mengmeng Sun, Xu Feng, Zhenzhen Liu, Wenyuan Han, Yun Xiang Liang, Qunxin She; An Orc1/Cdc6 ortholog functions as a key regulator in the DNA damage response in Archaea, Nucleic Acids Research, gky487, https://doi.org/10.1093/nar/gky487


ACKNOWLEDGEMENTS

It has been a wonderful time during my stay in Denmark and I would like to say thanks to everybody that has supported me here.

Firstly, I would like to express my sincere gratitude to my supervisor Dr. Qunxin She for his support during my PhD study. His guidance has enlightened me in the past years and will be the most valuable thing for my scientific career.

I am grateful to all my current and former labmates in the Danish Archaea Centre and the Molecular Biology of Archaea Lab in Wuhan including Yunxiang Liang, Roger Garrett, Xu Peng, Yongmei Hu, Nan Peng, Yuxia Mei, Zhengjun Chen, Ling Deng, Changyi Zhang, Wenyuan Han, Wenfang Peng, Fei He, Daniel Stiefler-Jensen, Mariana Awayez, Thi Ngoc Hien Phan, Soley Gudbergsdottir, Carlos Leon, Laura Alvarez, Dongqing Jiang, Yingjun Li, Min Ren, Wenqing She, Jingzhong Lin, Yan Zhang, Mingxia Feng, Mengmeng Sun, Tong Guo, Anders Lynge Kjeldsen, Yuvaraj Bhoobalan, Anders Fuglsang, Pavlos Papathanasiou, Anne Louise Grøn Jensen, Weijia Zhang, Zhenzhen Liu and Saifu Pan. It has been a wonderful experience working together with them. My sincere thanks also go to Dr. Li Huang and all the members of his lab for their support during my stay at the Institute of Microbiology, Chinese Academy of Sciences, in Beijing.

My special thanks are dedicated to my best friends including Yingwei Feng, Liuquan Feng, Bin Li, Zhengkun Kuang, Jun Wang, Qishan Zhang and Kaisong Huang. Thanks to them for being there through all those tough times in my life.

Last, and most importantly, I would like to express my deep gratitude to all my beloved family members, especially my wife (Jianglan Liao) and my son (Muxin Feng) for their unconditional love and support.


TABLE OF CONTENTS

Preface

Acknowledgements

Table of contents

Summary

Sammendrag

Abstracts

Abbreviations

Objectives

Introduction

Universal strategies for DNA damage removal
    Direct reversal of DNA damage
    Excision of DNA damage
    Repair of double strand DNA breaks

DNA damage tolerance by Translesion DNA synthesis

DNA damage response in three domains of Life
    ATM/ATR mediated DNA damage signaling pathways in Eukarya
    Multilayer regulations of SOS response in Bacteria
    Cellular responses towards DNA damage in Archaea

Transcriptional regulation in Archaea

Summary of the results
    TFB3 functions as a transcriptional activator for DNA transfer pathway
    Orc1-2 functions as a global regulator essential for DDR in Sulfolobus

Discussions and future perspectives

References


SUMMARY

The DNA damage response (DDR) is essential for the maintenance of genome integrity in all three domains of life, and the process is controlled by evolutionarily unrelated factors in Bacteria and in Eukarya. While DDR in the former is primarily mediated by cleavage of the global repressor, LexA, the process in the latter is mainly orchestrated by two evolutionarily conserved kinases, ATM and ATR. Strikingly, none of these DDR regulators has a homologue in Archaea. As a result, it remains elusive how organisms in Archaea coordinate cellular processes in response to DNA damage signal(s). Nevertheless, investigation of genome expression upon UV light exposure in Sulfolobus species revealed a number of differentially expressed genes, including genes encoding a paralogue of the TFB protein (TFB3) and an orthologue of the Orc1/Cdc6 protein (Orc1-2). Here, we apply a combination of genetic, biochemical, transcriptome and phylogenetic analyses to investigate their possible roles in archaeal DDR using Sulfolobus islandicus REY15A as the model.

Firstly, we constructed a tfb3 deletion mutant (∆tfb3), and transcriptome analysis of the mutant revealed that TFB3 is essential for the transcriptional activation of a subset of DDR genes. Phenotypic characterization of ∆tfb3 showed that the mutant loses its ability to form cell aggregates upon DNA damage and is moderately sensitive to DNA damage. Interestingly, ChIP-qPCR analysis showed that TFB3 specifically binds to the promoter region of TFB3-dependent genes, suggesting that TFB3 directly modulates the transcriptional process upon DNA damage. Further, mutagenesis of the TFB3 protein and subsequent functional analysis indicated that the N-terminal Zn ribbon and the C-terminal coiled-coil motif are essential for its function in transcriptional activation. Furthermore, phylogenetic analysis revealed a co-evolution of TFB3 with its target system (Ced, the Crenarchaeal system for exchange of DNA), suggesting that TFB3-mediated transcriptional regulation may represent a well conserved DDR regulatory circuit for intercellular DNA transfer in Crenarchaeota.

Then, we showed that the previously constructed orc1-2 deletion mutant (∆orc1-2) is hypersensitive to NQO treatment, and transcriptome analysis of the mutant revealed that Orc1-2 is essential for the global transcriptional regulation upon DNA damage. Orc1-2-dependent processes include the TFB3-controlled DNA transfer pathway, DNA replication initiation, cell cycle arrest and potential translesion DNA synthesis. Consistently, ∆orc1-2 is defective in DNA damage-induced cell aggregation and cell cycle control. Furthermore, a DNase I footprinting assay with Orc1-2 indicated that this protein is capable of protecting a conserved promoter element present in a number of Orc1-2-dependent genes, and a reporter gene assay demonstrated that this motif is responsible for DNA damage-responsive expression, suggesting that Orc1-2 binds to the conserved DNA damage responsive element (DDRE) upon DNA damage and modulates transcription of DDR genes. Finally, a promoter switch strain was constructed, and analysis of the DDR in this strain showed that the induction of Orc1-2 is essential but not sufficient for activation of DDR, suggesting that Orc1-2 could be posttranslationally modified upon DNA damage.


SAMMENDRAG

The DNA damage response (DDR) is essential for maintaining genome integrity in all forms of life, and the process is governed by evolutionarily unrelated factors in bacteria and eukaryotes. Whereas DDR in bacteria is primarily controlled by the global repressor LexA, the process in eukaryotes is mainly governed by two conserved kinases, ATM/ATR. Neither of these DDR regulators has homologues in archaea. It therefore remains unknown how organisms of the archaeal domain coordinate similar cellular processes in response to DNA damage signals. Nevertheless, studies of genome expression in Sulfolobus species have revealed a number of differentially expressed genes, including genes encoding a paralogue of the TFB protein (TFB3) and an orthologue of the Orc1/Cdc6 protein (Orc1-2). Here, a range of genetic, biochemical, transcriptome and phylogenetic analyses were carried out to investigate their possible function in archaeal DNA damage repair, using Sulfolobus islandicus REY15A as the model organism.

First, we constructed a tfb3 deletion mutant, and transcriptome analysis of the resulting mutant revealed that TFB3 is required for transcriptional activation of some of the DDR genes. Phenotypic characterization of the tfb3 deletion mutant showed that the mutant had lost the ability to form cell aggregates upon DNA damage and is partially sensitive to DNA damage. ChIP-qPCR analysis showed that TFB3 associates specifically with the promoter region of TFB3-dependent genes, suggesting that TFB3 directly modulates the transcriptional process upon DNA damage. Furthermore, mutagenesis of the TFB3 protein and subsequent functional analysis indicate that the N-terminal zinc ribbon and the C-terminal coiled-coil motif of TFB3 are essential for transcriptional activation, and that the latter probably mediates the recruitment of TFB3 to promoter regions. In addition, phylogenetic analysis demonstrated co-evolution of TFB3 and its target system (Ced), suggesting that TFB3-mediated transcriptional regulation may represent a conserved DDR regulatory circuit for intercellular DNA transfer in Crenarchaeota.

We then showed that the previously constructed orc1-2 deletion mutant is hypersensitive to NQO treatment, and transcriptome analysis of the mutant showed that Orc1-2 is required for the global transcriptional regulation upon DNA damage. Orc1-2-dependent processes include the TFB3-controlled DNA transfer pathway, initiation of DNA replication, cell cycle arrest and potential translesion DNA synthesis. ∆orc1-2 is defective in cell aggregation and cell cycle control upon DNA damage. Moreover, DNase I footprinting assays with Orc1-2 indicated that the protein protects a conserved promoter element found in a number of Orc1-2-dependent genes, and reporter gene assays demonstrated that the motif is responsible for DNA damage-responsive expression, suggesting that Orc1-2 binds to the conserved DNA damage responsive element (DDRE) upon DNA damage and modulates transcription of the DDR genes. Finally, by constructing a promoter-switch strain and examining the DDR in it, we showed that induction of Orc1-2 is essential, but not sufficient, for activation of the DDR, suggesting that Orc1-2 could be posttranslationally modified upon DNA damage.


ABSTRACTS

To counteract the threat of genomic DNA lesions, organisms belonging to the domains Eukarya and Bacteria have evolved a sophisticated network of DNA damage response (DDR) systems. These include events that lead to activation of DNA repair, cell-cycle arrest and tolerance of DNA damage. In bacteria, these processes are coordinated by the global regulator LexA, and in eukarya the principal DDR regulators are two evolutionarily conserved kinases, ATM and ATR. In contrast, comparative genomics analysis failed to detect any of the reported regulators mediating DDR in Archaea. In this work, we aim to investigate the function of two potential DDR regulators, a paralog of the TFB family proteins (TFB3) and an orthologue of the archaeal/eukaryal Orc1/Cdc6 protein (Orc1-2), in the DDR of Sulfolobus islandicus Rey15A, a model archaeon. We found that the tfb3 deletion mutant (Δtfb3) is more sensitive to a DNA damaging agent, NQO, and transcriptome analysis of the response of WT and Δtfb3 to NQO treatment revealed that TFB3 is essential for the transcriptional activation of a subset of genes, including a number of genes implicated in intercellular DNA transfer. Consistently, we demonstrated that the deficiency of TFB3 leads to the loss of cell aggregation upon DNA damage. Furthermore, ChIP-qPCR analysis indicated that TFB3 is specifically associated with the promoter region of its target genes, and functional analysis of TFB3 by mutagenesis demonstrated that the conserved Zn-ribbon and coiled-coil motif are essential for its function. These results indicate that TFB3 functions as a transcriptional activator for DDR genes, probably by interacting with a specific transcriptional regulator and thus facilitating pre-initiation complex (PIC) formation.

More strikingly, phenotypic characterization of the previously constructed Δorc1-2 revealed hypersensitivity to NQO, and subsequent transcriptome analysis indicated that the deficiency of Orc1-2 abrogates the differential expression of all DDR genes, including those implicated in DNA replication initiation, cell division, translesion DNA synthesis and the TFB3-dependent DNA transfer pathway. Furthermore, DNase I footprinting analysis and a reporter gene assay demonstrated that Orc1-2 interacts with a conserved hexanucleotide motif present in the promoter regions of a number of DDR genes and regulates their expression. In addition, by manipulating the expression level of orc1-2, we showed that a high level of Orc1-2 is essential but not sufficient for DDR activation.


ABBREVIATIONS

DDR: DNA damage response
DDRE: DNA damage responsive element
Orc1/Cdc6: Origin recognition complex 1/cell division cycle 6
TBP: TATA box binding protein
TFB: Transcription factor B
AP: Apurinic/apyrimidinic
CPD: Cyclobutane pyrimidine dimer
6-4PP: 6-4 photoproduct
DSB: Double strand DNA break
BPS: Base pair substitution
Indel: Insertion and deletion
ROS: Reactive oxygen species
BER: Base excision repair
NER: Nucleotide excision repair
GGR: Global genomic repair
TCR: Transcription-coupled repair
HRR: Homologous recombination repair
MMR: DNA mismatch repair
NHEJ: Non-homologous end joining
TLS: Translesion DNA synthesis
PIKK: Phosphatidylinositol 3-kinase-related kinases
ATM: Ataxia-telangiectasia mutated
ATR: ATM and Rad3 related
DNA-PKcs: DNA dependent protein kinase catalytic subunit
ATRIP: ATR interacting protein
MRN/X: Mre11-Rad50-Nbs1/Xrs2 complex
RPA: Replication protein A
XP: Xeroderma pigmentosum
TRCF: Transcription-repair coupling factor
NucS: Nuclease S1
SSB: Single strand DNA binding protein / Single strand DNA break
CDK: Cyclin-dependent kinase
PTM: Posttranslational modification
MMS: Methyl methanesulfonate
4-NQO: 4-Nitroquinoline 1-oxide
UV: Ultraviolet
Ced: Crenarchaeal system for exchange of DNA
Ups: UV-inducible pili operon of Sulfolobus
FAD: Flavin adenine dinucleotide
UDG: Uracil DNA glycosylase
dRPase: Deoxyribophosphodiesterase
OGT: O6-alkylguanine DNA alkyltransferase
MGMT: Methylguanine DNA methyltransferase
XRCC4: X-ray repair cross-complementing protein 4
XLF: XRCC4-like factor
ExoI: Exonuclease 1
BLM: Bloom syndrome RecQ like
Dna2: DNA replication helicase/nuclease 2
BRCA2: Breast cancer early-onset 2
MCM: Minichromosome maintenance
Hje: Holliday junction endonuclease
Hjm: Holliday junction migration
PCNA: Proliferating cell nuclear antigen
Fen1: Flap endonuclease 1
MDC1: Mediator of DNA damage checkpoint 1
TopBP1: DNA topoisomerase II binding protein 1
RNR: Ribonucleotide reductase
SMARCAL1: SWI/SNF-related, matrix-associated, actin-dependent regulator of chromatin, subfamily A-like 1
HGT: Horizontal gene transfer


OBJECTIVES

During my PhD study, I aimed to investigate the roles of a paralog of the TFB protein and an ortholog of the Orc1/Cdc6 protein in the DNA damage-induced response of Sulfolobus islandicus Rey15A. The following objectives have been addressed:

• To determine whether TFB3 and Orc1-2 function in DNA damage-induced transcriptional responses
• To identify specific pathways that are regulated by TFB3 or Orc1-2
• To provide insights into the mechanism of Orc1-2/TFB3-mediated transcriptional regulation
• To provide insights into how Sulfolobus responds to DNA damage at a global level


INTRODUCTION

Since the recognition of Archaea as a separate domain of life based on the phylogenetic analysis of ribosomal RNA sequences (1), a wealth of knowledge in the field of archaeal biology has been obtained via phylogenetic, biochemical, structural and genetic studies on archaeal model organisms. These studies have revealed that archaeal organisms combine eukaryal and bacterial characteristics. For instance, the archaeal DNA replication, transcription and translation systems are more closely related to those in eukarya, whereas archaeal cells lack a nucleus and contain a circular chromosome, as in bacteria (2).

Many archaeal organisms thrive at extremely high temperatures that facilitate DNA lesion formation (3), as spontaneous hydrolysis of purine and pyrimidine bases, sugar-phosphate cleavage and deamination of cytosine in DNA accelerate as temperature increases (4). For this reason, a considerable number of studies have focused on how these extremophiles maintain their genomic integrity (5). Strikingly, genetic analysis revealed that the spontaneous mutation rate of the thermoacidophilic archaeon Sulfolobus acidocaldarius is comparable to that of the mesophilic bacterium Escherichia coli (6). This indicates that archaea possess a more efficient system by which DNA damage is repaired.

The availability of archaeal genome sequences has greatly facilitated research on DNA damage repair in Archaea during the past decades. A myriad of archaeal homologs have been identified based on sequence similarity to known bacterial or eukaryal DNA repair proteins, and subsequently characterized in detail. These studies in turn provide insights into the function of their counterparts in the other domains of life. For example, the structural analysis of archaeal Y family polymerases has substantially increased our understanding of the mechanisms of translesion DNA synthesis, owing to the relative ease with which they crystallize (7). Moreover, a few archaea-specific enzymes implicated in DNA repair have also been identified, which has expanded the spectrum of DNA repair machinery (3,8). The development of versatile genetic toolboxes in model archaea further offers the possibility of studying the functionality of potential DNA repair systems in Archaea in vivo (9,10). Here, our current understanding of how archaea deal with genomic insults and how they respond to DNA damage signals at a global level, including the reported DNA repair pathways, DNA damage tolerance and other related cellular processes, will be summarized, with a comparison of the strategies that have evolved across the tree of life.

UNIVERSAL STRATEGIES FOR DNA DAMAGE REMOVAL

The primary structure of DNA can be modified by both endogenous and environmental factors, giving rise to DNA lesions. For example, some DNA aberrations arise from cellular processes, such as misincorporation of dUTP during replication, abortive topoisomerase activity, spontaneous hydrolytic reactions and metabolic byproducts such as reactive oxygen species (ROS). Others result from modification by exogenous factors such as UV light, ionizing radiation and diverse chemicals.

Figure 1. Sources of DNA damage and DNA repair pathways. Endogenous and exogenous factors (top) induce different types of DNA lesions (middle) on either the DNA backbone or the bases. These DNA lesions can be removed by a specific pathway or by the combined action of several pathways (bottom). The picture was taken from Genois et al., 2014 (11).

As reviewed by Sancar et al. (12), genomic insults occur on the bases, the building blocks of the DNA molecule, on the sugar-phosphate backbone, or sometimes in the form of DNA crosslinks. Common base damages include O6-methylguanine, thymine glycols, and other reduced, oxidized, or fragmented bases in DNA that are produced by ROS or by ionizing radiation (12). Ultraviolet (UV) radiation also gives rise to these species indirectly by generating reactive oxygen species, and it additionally produces specific products such as cyclobutane pyrimidine dimers (CPD) and (6-4) photoproducts (12). In addition, various base adducts can be induced by diverse chemicals. For instance, bulky adducts can be induced by large polycyclic hydrocarbons, and simple alkyl adducts by alkylating agents. Backbone damages include abasic sites and single- and double-strand DNA breaks (SSB and DSB). Specifically, abasic sites are generated spontaneously, by the formation of unstable base adducts, or by base excision repair. SSB are produced by excision repair or directly by damaging agents. DSB are produced by ionizing radiation or induced by DNA-damaging agents. Some DNA damaging agents such as cisplatin and mitomycin D modify DNA in the form of interstrand or intrastrand DNA cross-links, which are the most complex form of DNA damage (12).

These DNA lesions can block genome replication, and if they are left unrepaired or are repaired improperly, they lead to genomic mutation or instability (13). As a result, all organisms have evolved a number of DNA repair pathways to counteract the deleterious effects of DNA damage. Briefly, different types of DNA lesions generated by diverse cellular processes or environmental factors can be directly reversed, excised, or sometimes bypassed by specific pathways. As shown in Fig. 1, DNA lesions on bases that do not distort the helix are processed by direct reversal repair or base excision repair (BER), whereas helix-distorting lesions such as cyclobutane pyrimidine dimers (CPD) and 6-4 photoproducts are repaired by nucleotide excision repair (NER), which also removes bulky adducts from DNA and is involved in intrastrand DNA cross-link repair. The broken ends of a DSB can be ligated by non-homologous end joining (NHEJ) or processed by homologous recombination repair (HRR). Post-replicative repair (mismatch repair, MMR) corrects errors arising from DNA replication, and unrepaired replication-blocking lesions can be bypassed by translesion DNA synthesis (TLS) (14).
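The assignment of lesion classes to repair pathways described above can be written down as a simple lookup table. The following Python sketch is illustrative only: the dictionary keys and the function name are hypothetical labels chosen here, and the mapping restates the pathways named in the text rather than any exhaustive classification.

```python
from typing import Dict, List

# Lesion classes (hypothetical labels) mapped to the pathways named in the text.
REPAIR_PATHWAYS: Dict[str, List[str]] = {
    "non_distorting_base_lesion": ["direct reversal", "BER"],
    "helix_distorting_lesion": ["NER"],       # e.g. CPD, 6-4 photoproducts
    "bulky_adduct": ["NER"],
    "intrastrand_crosslink": ["NER"],
    "double_strand_break": ["NHEJ", "HRR"],
    "replication_error": ["MMR"],
    "unrepaired_replication_block": ["TLS"],  # bypass (tolerance), not repair
}


def candidate_pathways(lesion_class: str) -> List[str]:
    """Return the pathways listed in the text for a given lesion class."""
    try:
        return list(REPAIR_PATHWAYS[lesion_class])
    except KeyError:
        raise ValueError(f"unknown lesion class: {lesion_class!r}") from None


if __name__ == "__main__":
    for lesion in REPAIR_PATHWAYS:
        print(f"{lesion}: {', '.join(candidate_pathways(lesion))}")
```

Note that translesion DNA synthesis is listed for completeness although it bypasses, rather than removes, the lesion.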

DIRECT REVERSAL OF DNA DAMAGE

The most straightforward DNA repair pathway reported so far is the direct reversal of DNA lesions by specialized enzymes, including photolyase, which repairs UV-induced dimers, and methylguanine DNA methyltransferase (MGMT), which corrects O-alkylated DNA damage; each binds to the DNA lesion and restores the DNA to its normal state. Whereas the former is absent in placental mammals, the latter is present in nearly all organisms (15).

DNA lesions induced by ubiquitous UV light can block the DNA replication machinery if not removed properly. Two distinct enzymes, CPD photolyase and 6-4 photolyase, have been identified for the direct removal of UV-induced DNA lesions. The former repairs CPDs and the latter specifically removes 6-4 photoproducts. So far, CPD photolyases have been found in most organisms from all three domains of life, while 6-4 photolyases are mainly reported in Eukarya. Photolyase binds to DNA containing a pyrimidine dimer in a light-independent reaction and flips the dimer out into its active site pocket. The light-harvesting cofactor in the photolyase absorbs a photon and transfers the excitation energy to the catalytic cofactor, FAD. Then, the excited state of FAD (FADH-) transfers an electron to the pyrimidine dimer, splitting the dimer into two pyrimidines. The electron returns to the flavin radical to regenerate FADH- and the enzyme then dissociates from the repaired DNA (16).

Archaea encode homologues of both CPD and 6-4 photolyases, and photoreactivation activity has been reported in several model archaea, including Methanobacterium thermoautotrophicum, Halobacterium NRC1 and S. acidocaldarius (17,18). Biochemical and structural study of the photolyase from S. tokodaii demonstrated that it is capable of removing CPD and that its overall structure superimposes very well with other known photolyases (19). Furthermore, analysis of the dynamics of CPD after UV treatment in S. solfataricus showed that light greatly accelerates the removal of CPD, suggesting a solid role for photolyase in the repair of photoproducts in archaea (20). Interestingly, a genetic study of two Sulfolobus photolyases indicated that only the CPD photolyase is important for DNA repair, as deletion of the gene coding for the 6-4 photolyase does not impair DNA repair activity (18).

Alkylating agents react with DNA bases, leading to the formation of a variety of cytotoxic and mutagenic covalent adducts ranging from small methyl groups to bulky alkyl adducts. MGMT (also called AGT or OGT) is the enzyme that removes alkyl groups from the O6 position of guanine or the O4 position of thymine, and it has been widely identified in different bacterial and eukaryal organisms. Similar to the photolyase, MGMT has also been proposed to recognize DNA damage by three-dimensional diffusion. After forming a low-stability complex with the DNA backbone at the damage site, it flips out the damaged base into its active site cavity, wherein the methyl group is transferred to the cysteine in the active site. The enzyme then dissociates from the repaired DNA, but the C-S bond of methylcysteine is stable; therefore, after one catalytic event the enzyme becomes inactivated, and it is accordingly referred to as a suicide enzyme (12).

The archaeal enzymes for direct reversal of DNA alkylation have mostly been reported in thermophilic archaea. These enzymes include a DNA alkyltransferase from Pyrococcus sp. KOD1 (pyrpMGMT) (21), an O6-alkylguanine DNA alkyltransferase from S. solfataricus (ssoOGT) (22) and the AGTendoV from Ferroplasma acidarmanus (23). Specifically, the pyrpMGMT expressed in E. coli exhibited remarkable thermostability, and this enzyme surprisingly restored the resistance of an E. coli ogt-deficient strain towards an alkylating agent (21,24), suggesting the functional conservation of these enzymes. Similarly, the ssoOGT protein is also stable at high temperature, and the encoding gene was induced upon treatment with an alkylating agent. More interestingly, the F. acidarmanus AGTendoV consists of a fusion of the C-terminal active site domain of O6-alkylguanine-DNA alkyltransferase (AGT) with an endonuclease V domain. Functional study of this protein showed that it is capable of removing O6-methylguanine lesions in DNA via alkyl transfer and of cleaving DNA substrates that contain the deaminated bases uracil, hypoxanthine, or xanthine in a manner similar to E. coli endonuclease V. This bifunctional enzyme has been found in a number of archaeal genomes, and the functional association of AGT with another DNA repair pathway may represent a general adaptation of thermophilic archaea to harsh environments (25).

EXCISION OF DNA DAMAGE

Three main pathways, BER, NER and MMR, repair DNA damage in a ‘cut and paste’ mode. They remove a single-stranded DNA segment containing base damage, double helix distortion or mispaired bases, respectively, and fill in the resulting ssDNA gap by DNA synthesis. The mechanisms of these pathways are highly conserved across Bacteria and Eukarya, though the proteins involved in the bacterial and eukaryal NER pathways show little homology.

Base excision repair


Bases with small chemical alterations, including oxidized bases, alkylated bases, deaminated bases and uracil, that do not strongly affect the DNA helix structure are subjected to BER. BER is probably the most conserved pathway from bacteria to eukarya and proceeds through a series of coordinated reactions executed by glycosylases, AP lyases, AP endonucleases and DNA synthesis related proteins. As illustrated in Fig. 2, the DNA lesion is first recognized by a specific DNA glycosylase, which catalyzes the hydrolysis of the damaged base from the sugar-phosphate backbone to generate an apurinic/apyrimidinic (AP) site; the AP site is then excised or replaced by DNA synthesis. During long patch BER, the AP site is processed by an AP endonuclease, which hydrolyzes the phosphoester bond at the AP site, giving a substrate suitable for strand replacement synthesis. The 5’ flap is then cleaved by Fen1 endonuclease, generating a ligatable nick that is sealed by DNA ligase. In short patch BER, the AP lyase makes the first incision at the AP site and the ribose-phosphate backbone is then removed by dRPase, giving a single-nucleotide gap that is filled in by the BER-specific DNA polymerase and sealed by DNA ligase (13,26).

Figure 2. Schematic representation of the canonical base excision repair (BER) pathway. Two canonical subtypes of the BER pathway, short patch BER and long patch BER, are illustrated. The DNA fragments in red represent the newly inserted or synthesized DNA base pair(s). Picture adapted from Grasso and Tell, 2014 (27).

Uracil DNA glycosylases (UDGs) and other damage-specific DNA glycosylases are conserved from bacteria to eukarya, and they contain two core domains. One is responsible for the activation of the catalytic water molecule and the other interacts with the DNA minor groove after flipping out the base, thus stabilizing the DNA–UDG complex (27). In E. coli, the AP lyase activity is exerted by Endo III (also known as Nth), and the same activity is executed by Nth and OGG1 in human; these enzymes incise the DNA between the phosphate and the deoxyribose 3’ of the AP site. The main AP endonuclease in human is APE1, and the best studied ones in E. coli are Exo III (also known as Xth) and Endo IV (also known as Nfo) (27).

A number of archaeal BER proteins have been characterized, and the fundamental pathway is well conserved in all three domains of life (27). As summarized by Grasso and Tell (27), about 20 archaeal uracil-DNA glycosylases (UDGs) have been characterized, and they all fall into four of the six known UDG families. Whereas the mesophilic UDGs belong to classes II and VI, the thermophilic ones belong to classes IV and V, suggesting that these UDGs have co-evolved with their specific environments (5). In addition, two specific endonucleases that recognize deaminated bases or abasic sites have also been identified, with Endonuclease Q (EndoQ) cutting 5′ of deaminated bases or abasic sites in the DNA strand and Endonuclease V (EndoV) incising 3′ of deaminated adenine (28). Many archaeal genomes encode EndoV, a nuclease found in all three domains of life (29), whereas EndoQ is mainly distributed in Thermococcales and is not present in most bacterial and eukaryal organisms (30). Meanwhile, several archaeal homologs of endonuclease IV (EndoIV) and exonuclease III (ExoIII) show APE activity, among which the S. islandicus ExoIII and EndoIV have been characterized both in vitro and in vivo (31). The comparative analyses demonstrated that the S. islandicus EndoIV enzyme is much more active than the ExoIII enzyme in vitro, and genetic analysis indicated that ΔendoIV is much more susceptible to the alkylating agent MMS than ΔexoIII (31).

Archaeal DNA polymerases harbor a uracil-stalling pocket, which enables them to stop at uracil in the template DNA, representing a ‘read-ahead’ proofreading function (32). Interestingly, the interaction between PCNA and UDG has been reported in both crenarchaeota and euryarchaeota (33,34). It was proposed that such an interaction enables those UDGs to be recruited to the replication fork once the DNA polymerase has stalled at uracil (25). In addition, PCNA also stimulates the activity of DNA polymerase, DNA ligase I, and flap endonuclease in the crenarchaeon S. solfataricus (35). The physical and functional interaction between PCNA and EndoQ, as well as the Mre11-Rad50 complex, was also described in Pyrococcus furiosus (36,37). Thus the sliding clamp in archaea may function as a platform for the coordination between BER, DNA replication and other DNA repair pathways.

Nucleotide excision repair

The nucleotide excision repair (NER) pathway removes a spectrum of single-strand DNA lesions that cause local helix destabilization. Two subtypes of NER have been reported, depending on whether damage detection is linked to transcription (transcription-coupled repair, TCR) or not (global genome repair, GGR). TCR removes transcription-stalling lesions, whereas GGR locates and repairs DNA lesions throughout the genome (38).

As shown in Fig. 3, during bacterial GGR, the genome is scanned by the UvrA-UvrB complex for damaged nucleotides causing large conformational changes. For TCR, the repair process is initiated by a stalled RNA polymerase on an actively transcribed gene, followed by the recruitment of UvrA-UvrB to the site of the lesion by TRCF (transcription repair coupling factor, Mfd). Both mechanisms then converge into the same pathway. After damage recognition, UvrA dissociates from the complex (by ATP hydrolysis), leaving a stable UvrB-DNA complex for damage verification, which further recruits UvrC to incise the damaged strand on both sides of the lesion. UvrC contains two nuclease domains that cleave the phosphodiester bonds 8 nucleotides 5′ and 4-5 nucleotides 3′ relative to the damaged site. UvrD helicase is required for the removal of the excised DNA strand (12-13 nucleotides in length) and the release of UvrB and UvrC from the DNA. Finally, DNA repair is completed by gap filling by DNA polymerase I and nick sealing by DNA ligase (39).

Figure 3. Schematics of nucleotide excision repair (NER) in Bacteria and Eukarya. The molecular mechanism of NER in E. coli (a) and human (b). The DNA lesion is indicated by a black rectangle; it can be detected by RNAP during TCR or by bacterial- or eukaryal-specific damage sensors during GGR. In addition, photolyase (Phr) assists the recognition of CPDs in E. coli and accelerates the rate of CPD repair. After damage recognition and verification, the ssDNA containing the lesion is incised and the resulting gap is then filled by DNA re-synthesis. Picture taken from Hu, 2017 (40).

Eukaryal NER follows a process similar to the bacterial one but employs a much more complicated machinery (at least 25 proteins) to repair similar lesions. A number of proteins implicated in the human disease xeroderma pigmentosum (XP) play roles parallel to those of the bacterial Uvr proteins. As reviewed by Giglia-Mari et al. (41), damage sensing in TCR is achieved by the stalled RNAP and by Cockayne syndrome factor B/Rad26, which couples the stalled RNAP to the NER machinery. Lesion discrimination in GGR is executed by the concerted actions of two complexes: XPC/hHR23B and UV-DDB (DDB1 and DDB2/XPE). Subsequent steps of TCR and GGR converge into the same pathway. TFIIH, a bi-directional helicase, is then recruited to open the damaged strand over a stretch of approximately 30 nucleotides. The unwound DNA is further stabilized by XPA and RPA. Two structure-specific nucleases, XPG and ERCC1-XPF, then incise the damaged strand 3′ and 5′ of the lesion, respectively, giving a gap of 25-30 nucleotides, which is then filled by DNA polymerase δ/ε. NER is completed by sealing of the final nick by DNA ligase 1 or DNA ligase 3 (42).

Phylogenetic analysis of NER proteins revealed that homologues of the bacterial UvrABC proteins are only found in certain mesophilic archaea and were probably acquired by horizontal gene transfer (HGT) from bacteria. In most archaea, especially thermophilic archaea, a number of homologues of eukaryal NER nucleases (XPF and Fen-1 (XPG)) and helicases (XPB and XPD), together with an archaea-specific nuclease, Bax1, have been identified, suggesting the existence of a eukaryal-type NER pathway in archaea (9). However, the proteins for damage recognition or verification, including the homologues of CSB, XPA and XPC, are absent in all archaea (38).

Biochemical characterization of these archaeal XP homologues showed that these proteins are competent to function as nucleases or helicases. For example, archaeal XPD protein unwinds DNA in the 5′ to 3′ direction, and crystallographic analysis of archaeal XPD revealed a striking similarity to the structure of the eukaryal ones (43). Interestingly, archaeal XPB homologs form a complex with the archaeal structure-specific nuclease Bax1, and the complex unwinds and cleaves model NER substrates in vitro (44). More recently, it was proposed that the XPB-Bax1 complex may actually function as a dsDNA translocase-nuclease machinery by binding at the site of helix-destabilizing lesions, opening the bubble via the ATP-dependent translocase activity of XPB and cutting at the lesion with Bax1 (8).

In contrast, the results of the genetic analysis of the archaeal NER system are particularly confusing. On the one hand, the photoproducts induced by UV light were efficiently removed in the dark, suggesting the existence of a dark repair pathway that deals with photoproducts (20,45). On the other hand, the deletion of the individual xpb, bax1 and xpd genes hardly affects the host’s sensitivity towards DNA-damaging agents (10,46). In contrast, the deletion of uvrA, uvrB or uvrC in Halobacterium NRC-1 leads to an increased sensitivity, even though this organism also encodes a set of eukaryal-type NER proteins (47). Moreover, it was found that the removal of photoproducts in Sulfolobus lacks strand specificity (20,45), suggesting either that a TCR system is absent or that GGR and other DNA repair pathways work much more efficiently than TCR, so that strand-biased repair escapes detection. Taken together, the debate about whether there is a functional NER in archaea will persist until more genetic studies are reported.

Mismatch repair

DNA mismatch repair (MMR) is a highly conserved process in both bacteria and eukarya and is primarily responsible for the repair of base-base mismatches and insertions/deletions generated during DNA replication and recombination. E. coli MutS and MutL and their eukaryal homologs, MutSα/β and MutLα/β/γ, are key players in MMR-associated genome maintenance (48).

Figure 4. Schematic representation of DNA mismatch repair (MMR). In E. coli, the mismatch-activated MutS-MutL-ATP complex licenses MutH to incise the nearest unmethylated GATC sequence. The joint action of nucleases and UvrD helicase is then responsible for the removal of the DNA strand containing the lesion. The resulting gap is filled by DNA synthesis. The picture was adapted from KEGG pathway: http://www.genome.jp/kegg-bin/show_pathway?ko03430

As illustrated in Fig. 4, the first step in bacterial MMR involves damage recognition and binding by MutS, which then interacts with MutL to enhance mismatch recognition and to distinguish between the template strand and the nascent strand. In E. coli, the lack of adenine methylation in the nascent strand allows MutH to specifically incise the unmethylated daughter strand at a hemimethylated dGATC site. Following generation of the strand-specific nick, UvrD is loaded at the nick and unwinds the duplex from the nick towards the mismatch, generating single-stranded DNA, which is rapidly protected by SSB proteins. The removal of mismatched bases is performed by single-strand exonucleases acting from either direction, namely Exo I or Exo X (3′-5′ exonucleases) and Exo VII or RecJ (5′-3′ exonucleases), which excise the nicked strand from the nicked site (the dGATC site) up to or slightly past the mismatch. The resulting single-stranded gap is filled by DNA re-synthesis by DNA polymerase III, and the repair is completed by DNA ligase (48). MMR in eukarya shows strong similarities to that in bacteria, although it involves several different MutS and MutL homologs, each with more specialized roles. Importantly, whereas hemimethylated dGATC sites determine the strand specificity of the repair in E. coli, the mechanism by which the eukaryal MMR machinery discriminates the template strand from the nascent strand remains unclear (49).
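The strand-discrimination step described above can be pictured as a search for the GATC site closest to the mismatch. The following Python sketch is a toy illustration only and not a model of MutH enzymology: the function name `nearest_gatc`, the example sequence and the zero-based coordinates are assumptions introduced here, and the methylation state of each site is ignored.

```python
from typing import Optional


def nearest_gatc(seq: str, mismatch_pos: int) -> Optional[int]:
    """Return the start index of the GATC site closest to mismatch_pos.

    Hypothetical helper for illustration; returns None if the
    sequence contains no GATC site at all.
    """
    seq = seq.upper()
    sites = []
    start = seq.find("GATC")
    while start != -1:
        sites.append(start)
        start = seq.find("GATC", start + 1)
    if not sites:
        return None
    # Pick the site with the smallest distance to the mismatch.
    return min(sites, key=lambda s: abs(s - mismatch_pos))


# Made-up sequence with two GATC sites, at indices 2 and 26.
example = "TT" + "GATC" + "A" * 20 + "GATC" + "TT"
print(nearest_gatc(example, 10))
print(nearest_gatc(example, 20))
```

In this simplified picture, the distance between the chosen site and the mismatch corresponds to the stretch of nicked strand that the exonucleases would have to remove.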

So far, MutS/L homologs have only been identified in several mesophilic archaeal lineages (50), and these archaeal MutS/L homologs were reported to be dispensable for the maintenance of a low mutation rate in H. salinarum NRC-1 (51). However, the lack of canonical MMR in archaea is not reflected in high mutation rates (52), suggesting the existence of an alternative pathway, independent of MutS/L proteins, for the maintenance of genomic fidelity in archaea. Recently, a NucS homolog from Pyrococcus abyssi was shown to be capable of cleaving mismatched DNA base pairs on both strands and was named EndoMS (53). Subsequent structural analysis demonstrated that EndoMS flips the mismatched bases out into its binding sites and cleaves the DNA backbone in a manner reminiscent of type II restriction enzymes (54). Interestingly, phylogenetic analysis revealed that EndoMS homologues are present in archaeal lineages that lack functional MutS homologs and also in organisms from Actinobacteria, which likewise lack MutS homologs (53), suggesting that EndoMS may mediate the alternative MMR pathway. Consistent with this, a more recent study in actinobacteria revealed that NucS homologs from Mycobacterium and Streptomyces function in preventing mutations, and that inactivation of the NucS homolog in M. smegmatis leads to a hypermutation phenotype similar to that observed in canonical MMR-null mutants (55).

As reviewed by M. White and T. Allers, the generation of a DSB by EndoMS at the mismatched base pair could be an advantage in terms of strand discrimination, as both strands will then be processed for HRR (43). Consistent with this, the homologue of EndoMS has been found in an operon with the RadA recombinase in P. abyssi (56). However, the generation of DSBs is a risky strategy, especially for non-polyploid organisms, and the in vivo mechanism of the archaeal EndoMS-mediated pathway remains to be investigated.

REPAIR OF DOUBLE STRAND DNA BREAKS

DNA-damaging agents, ionizing radiation and the cell’s attempts to replicate lesion-containing templates induce DNA double-strand breaks (DSBs). These lesions are extremely cytotoxic, as the cell cannot rely on simply copying the information from the undamaged strand. For example, it has been shown that a single DSB is sufficient to kill an E. coli cell (57) and to cause cell cycle arrest in a human cell (58). Two main pathways, HRR (homologous recombination repair) and NHEJ (non-homologous end-joining), have evolved for the repair of DSBs; they differ in whether a homologous template is required.

NHEJ, which is best studied in eukaryal models, is an error-prone process that joins the two ends of a DSB directly. As reviewed by Lieber et al. (59), the DNA ends of a DSB are recognized by Ku70/80, which forms a molecular scaffold for the recruitment of other core factors. In vertebrate cells, Ku then recruits DNA-PKcs, forming the holoenzyme together with the DNA ends, which promotes DNA-end tethering and the recruitment of additional NHEJ factors. Ligatable ends are often a prerequisite for end ligation by NHEJ, and their formation is facilitated by certain nucleases and polymerases. For example, Artemis nuclease is activated and recruited to the DNA ends upon DNA damage and facilitates end processing in preparation for ligation. Yeast cells lack both DNA-PKcs and Artemis; instead, the MRN complex plays a more substantial role in the recognition and processing of broken DNA ends. Finally, the DNA ends are sealed by a protein complex consisting of XRCC4, XLF and DNA ligase IV (59).

Compared to the NHEJ pathway, the HRR pathway is a highly faithful way to repair DSBs, but it requires a homologous chromosomal template. As a result, it acts exclusively in the S and G2 phases of the cell cycle. In contrast, post-mitotic cells and cells in G1 phase have to seal DSBs by NHEJ (41). As shown in Fig. 5, HRR starts with the 5′-3′ resection of the DNA ends at the break by the MRN/X complex in eukarya or by the RecBCD/AddAB proteins in bacteria. In eukarya, additional factors are responsible for further extensive end processing (ExoI or Dna2 in conjunction with BLM). End resection of a DSB produces a 3′ single-stranded overhang, which is then coated by the RecA/Rad51 recombinase (RecA in bacteria and Rad51 in eukarya), forming the nucleoprotein filament (60). The loading of RecA/Rad51 onto ssDNA is assisted by recombination mediators such as BRCA2 (breast cancer early-onset 2) or Rad52 in eukarya and RecBCD or RecFOR in bacteria. Homology search and strand invasion are then performed to form a D-loop (displacement loop) or, upon second strand capture, a Holliday junction structure, which is then resolved by helicases and structure-specific nucleases to finally produce crossover or non-crossover products (61).

Figure 5. Schematic representation of DSB-initiated homologous recombination repair (HRR). HRR is initiated by end resection of the DSB, giving a 3′ single-stranded end that is then coated by RecA/Rad51. The DNA-RecA/Rad51 filament can invade a homologous template to initiate repair, with the assistance of recombination mediators. Strand invasion and subsequent DNA synthesis lead to the formation of a D-loop or, upon second strand capture, a Holliday junction (HJ) structure. These structures can be resolved by helicases and structure-specific nucleases, giving a non-crossover or a crossover product. Picture adapted from Moynahan et al., 2010 (62).

Archaea encode a eukaryal-type machinery for HRR: so far, homologs of the core factors of eukaryal HRR, including Rad51 (RadA in Archaea), Rad50 and Mre11, have been identified in all archaeal genomes (60). However, homologues of the nucleases and helicases responsible for extensive end processing in eukarya are not found in Archaea. Instead, in many archaea, the genes coding for Mre11 and Rad50 are clustered with two other genes coding for a hexameric helicase of the FtsK superfamily (HerA) and a 5′-3′ exonuclease (NurA) (63). These proteins form a complex with Mre11-Rad50, and the resulting complex is capable of DSB end recognition and resection (8). In addition, a Rad54 homolog from S. solfataricus (64) and a RadA paralog (RadB) from H. volcanii (65) have been shown to interact with RadA and function as recombination mediators. Meanwhile, several helicases and nucleases implicated in branch migration or HJ cleavage were also identified, including the Hjm (Hel308) helicase and the Hjc (Holliday junction cleavage) and Hje (Holliday junction endonuclease) nucleases (66-68). Specifically, Hjm (Hel308) was suggested to function at the D-loop step and was implicated in the restart of stalled DNA replication forks (69). Hjc is specific for four-way DNA structures and has been reported to interact with a number of DNA repair proteins, including the RadA paralogue RadC2, Hjm (Hel308) and a novel ATPase potentially functioning in strand migration (SisPINA) (43). Hjc and Hje share the same fold, which is similar to that of the type II endonucleases; however, the deletion of hje, but not hjc, in Sulfolobus islandicus renders the mutant more sensitive to DNA-damaging agents (67).

More importantly, genetic analysis by gene deletion showed that mre11, rad50, herA, nurA, radA and hjm, as well as at least one of hje and hjc, are indispensable for cell survival of Sulfolobus (10,67,70), indicating that HR activity is essential for Sulfolobus. Interestingly, it was proposed that many of the bacteria that possess NHEJ factors (for example, Mycobacterium tuberculosis, Mesorhizobium loti, Sinorhizobium loti) spend a significant portion of their life cycle in a stationary haploid phase, in which a template for recombination is not available (71). In contrast, Sulfolobus species carry two copies of the chromosome for the majority of their life cycle, which probably reflects an adaptation to rely on HRR for survival.

DNA DAMAGE TOLERANCE BY TRANSLESION DNA SYNTHESIS

Persisting DNA lesions that are not removed by any of the repair pathways can interfere with DNA replication. To mitigate the deleterious effects of an arrested replication machinery (prolonged stalling of replication forks leads to fork collapse, generating cytotoxic DSBs), cells have also evolved machinery for tolerating DNA lesions, leaving the damage to be repaired at a later time point (41).

The key players of DNA damage tolerance pathways are enzymes performing translesion DNA synthesis (TLS). During TLS, the stalled replicative polymerase is replaced by a TLS polymerase, which is characterized by low processivity but is capable of bypassing DNA lesions. The principal polymerases in TLS are those belonging to the Y family. As reviewed by Sale (72), although Y family polymerases show virtually no sequence homology to those from other families, they adopt an overall structure similar to that of the replicative polymerases (replicases). Y family polymerases contain an extra little finger domain, and the remaining domains are much smaller than those in replicases. As a result, the active site is much more spacious and solvent exposed, which enables it to accommodate large, bulky DNA lesions (72). A thermodynamic model proposed that high-fidelity polymerases enhance WC (Watson-Crick) base pairing by partially excluding bulk water from the active cleft (73). In contrast, the error-prone Y family polymerases are believed to lack the ability to discriminate between WC and non-WC base pairs, because their open active site enables water to compete with nucleobases in forming hydrogen bonds (7). In addition, Y family polymerases lack a 3′-5′ proofreading exonuclease domain, and all these features together can result in an incorporation error rate of up to 1/10 (74).

As described in detail by Sale (72), many of the 15 known polymerases in eukarya have the capability to promote some degree of TLS. The principal eukaryal TLS polymerases are Polη/Rad30 and Rev1, belonging to the Y family, along with the B-family Polζ. Polη is the best studied and plays a key role in bypassing CPDs by accommodating both bases of the CPD in its active site and incorporating two A residues opposite TT. By acting as a molecular splint to straighten the kinked DNA backbone, Polη accurately copies the covalently linked pyrimidines of the CPD. However, Polη is unable to completely replicate a template containing the more highly distorting lesion, the 6-4 photoproduct. The ability of Polη to accommodate a dinucleotide lesion in its active site also contributes to its capability to replicate the intrastrand G-G crosslink caused by cisplatin (72). Rev1 was proposed to be a G-template-specific DNA polymerase, and it inserts dC opposite an abasic site and N2-adducts of guanine. However, it does not directly pair the incoming dC with the template. Instead, it swings the template dG out of the helix and temporarily coordinates it with its little finger domain. In place of the template dG, Rev1 positions its own Arg324 residue, which forms a hydrogen bond with the incoming dC. This mechanism allows it to bypass bulky dG adducts and to act as a template-independent dC transferase. Rev1 has also been implicated in replicating DNA that forms secondary structures, and it is required for the survival of budding yeast after exposure to the G-adducting agent 4-NQO (72). DNA polymerase ζ (Polζ) is a multi-subunit B family polymerase related to the replicative ones, but it lacks proofreading exonuclease activity. It was proposed to function in the extension step of TLS. In addition, genetic studies suggest a role for Polζ in bypassing the 6-4 photoproduct and a function in recombination-associated DNA synthesis in S. cerevisiae, which is an error-prone process dependent on Polζ (72).

Translesion DNA synthesis in E. coli is mainly performed by Pol IV (DinB) and Pol V, which belong to the Y family, and by Pol II from the B family. All three polymerases are induced upon DNA damage, and among them Pol V is the major TLS polymerase in E. coli. UmuC is one of the subunits of Pol V; when it interacts with the other subunit, UmuD, it forms an inhibitory complex (UmuD2C) that cannot perform translesion DNA synthesis. Upon DNA damage, RecA mediates the cleavage of UmuD to UmuD′, converting UmuD2C into UmuD′2C, the active Pol V polymerase. Pol II has been implicated in translesion synthesis across abasic sites, interstrand crosslinks, and 3,N4-ethenocytosine adducts, although it has a 3′-5′ exonuclease domain. However, the pol II mutant is not UV sensitive, and the in vivo function of Pol II is under debate. Nevertheless, early in the response to UV treatment, DNA replication was blocked in the pol II mutant but not in the wild-type strain, suggesting that Pol II may have a function in replication restart (75). The Pol IV (DinB) orthologue is the most ubiquitous Y-family polymerase and is found in all three domains of life. Pol IV efficiently and accurately bypasses adducts on the N2 position of deoxyguanosine and even bypasses the N2-N2-guanine interstrand cross-link with high fidelity, and its processivity increases dramatically upon interaction with the β-clamp (75,76). Strikingly, expression of noncleavable UmuD (together with UmuC) contributes to survival upon DNA damage by delaying the resumption of DNA replication, and overexpression of Pol IV was shown to rapidly block replication fork movement, suggesting that these two TLS polymerases may have a function in checkpoint control during the SOS response (77).

The potential mutagenicity of TLS polymerases means that their activities must be carefully controlled. As reviewed by Goodman and Woodgate (76), in E. coli, transcriptional control is the primary means of modulating TLS activity. The promoter regions of pol II and pol IV bind LexA relatively weakly and are induced early in the SOS response. In contrast, the promoter of the umuDC operon (Pol V) has one of the tightest LexA-binding sites and is induced later in the response (76). Given their early induction and relatively high basal expression level in the absence of DNA damage, it was proposed that Pol II and Pol IV participate in mostly error-free TLS at specific DNA lesions. In contrast, Pol V undergoes very late induction, suggesting that E. coli only uses it as a last resort, once all other error-free repair pathways have been exhausted. In addition, the error-prone Pol V undergoes post-translational regulation upon DNA damage. First, UmuD activation is triggered by RecA-filament-mediated auto-proteolysis. Then, the UmuD, UmuD′ and UmuC proteins are rapidly degraded by the Lon and ClpXP proteases (76). In contrast, much of the regulation of TLS in eukarya relies on post-translational modification (PTM) or on specific protein interactions that target the TLS polymerases to sites of damage. For example, ubiquitination of human Polη (or RAD30 in yeast) leads to rapid degradation by the proteasome, and the interaction between Y family polymerases and PCNA is enhanced upon DNA damage by monoubiquitination of PCNA (76).
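The relationship between LexA-binding strength and the timing of induction can be illustrated with a deliberately simplified calculation. The Python sketch below is a toy model, not a description of measured SOS kinetics: it assumes that the free LexA level decays exponentially after damage and that an operator is derepressed once the LexA level falls to a threshold standing in for its binding affinity. The function name `induction_time`, the site names and all numerical values are hypothetical and chosen only for illustration.

```python
import math


def induction_time(threshold: float, lexa0: float = 1.0,
                   decay_rate: float = 0.1) -> float:
    """Time at which lexa0 * exp(-decay_rate * t) drops to `threshold`.

    `threshold` is an arbitrary stand-in for operator affinity:
    a weakly bound operator has a high threshold, a tightly bound
    operator a low one. All units are arbitrary.
    """
    if threshold >= lexa0:
        return 0.0  # already derepressed at the starting LexA level
    return math.log(lexa0 / threshold) / decay_rate


# Hypothetical operators; the values are not experimental data.
operators = {"tight_site": 0.01, "weak_site": 0.5}
order = sorted(operators, key=lambda name: induction_time(operators[name]))
print(order)  # weakly bound operator is derepressed first
```

Under these assumptions, the weakly bound operator is derepressed early and the tightly bound one late, which mirrors the qualitative ordering described above for pol II/pol IV and umuDC.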

The DinB family polymerases are the only group that has been found in all three domains of life (78). Archaeal Y family polymerases are mainly found in Crenarchaea and are proposed to be present only in those archaeal organisms exposed to UV light (50). Y family polymerases from S. solfataricus (Dpo4) and S. acidocaldarius (Dbh) have been crystallized with different DNA substrates and are widely used as models to study the mechanism of lesion bypass. The structural and biochemical characterization of Sulfolobus Dpo4 demonstrated that it is capable of bypassing a broad spectrum of DNA lesions, including abasic sites, (deoxyguanosin-8-yl)-1-aminopyrene, benzo[a]pyrene diol epoxide, 8-oxoguanine, methylguanine and benzylguanine, and thymine dimers (79). However, disruption of the gene coding for Dbh in S. acidocaldarius affects neither the cell’s resistance towards several DNA-damaging agents nor the overall rate of spontaneous mutation of a target gene (80). Nevertheless, the disruption of this gene does change the mutation spectrum. Specifically, there are lower frequencies of small indels (insertions and deletions) but higher frequencies of BPSs (base pair substitutions) in dbh- strains. It was thus proposed that Dbh plays both mutagenic and anti-mutagenic roles in vivo (80). Apart from Dpo4, two PolB family proteins from Sulfolobus, Dpo2 and Dpo3, also show DNA lesion bypass activity in vitro, and Dpo3 strikingly bypasses CPDs more efficiently than Dpo4 (81). In addition, archaeal PriS (primase small subunit) was shown to bypass oxidative DNA lesions, such as 8-oxo-dG (one of the major products of DNA oxidation), and to faithfully bypass UV-induced CPDs. It was thus concluded that, apart from de novo primer synthesis, PriS may assist the major replicases during the elongation step of TLS (82).

DNA DAMAGE RESPONSE IN THREE DOMAINS OF LIFE

ATM/ATR MEDIATED DNA DAMAGE SIGNALING PATHWAYS IN EUKARYA

DNA repair activity in eukarya is intricately linked to other cellular processes. Upon DNA damage, a well-coordinated signaling cascade that mediates diverse cellular responses is initiated. Briefly, DNA damage sensors recognize the DNA lesions, and the damage signal is then transmitted to effectors by transducers in order to prompt DNA repair activities, activate cell cycle checkpoint control and regulate other cellular processes to ensure sufficient DNA repair. These cellular responses are principally coordinated by three evolutionarily conserved kinases belonging to the phosphoinositide 3-kinase-related protein kinase (PIKK) family: ataxia-telangiectasia mutated (ATM), ataxia-telangiectasia and Rad3-related (ATR) and DNA-dependent protein kinase catalytic subunit (DNA-PKcs; only reported in vertebrate cells). Once these kinases are activated, hundreds of substrates are phosphorylated at Ser/Thr-Gln motifs and additional sites in an ATM- or ATR-dependent manner, whereas DNA-PK appears to regulate a smaller subset of targets and mainly functions in the NHEJ pathway. The functional regulation of downstream DDR effectors by ATM/ATR and by their downstream kinases, including checkpoint kinase 1 (CHK1) and checkpoint kinase 2 (CHK2), largely shapes the cellular responses important for genomic stability (83,84).

Figure 6. ATM/ATR signaling pathways in Eukarya. ATM-mediated cellular responses to DSBs (a) and ATR-orchestrated responses to ssDNA or replication stress (b). Following DSBs, ATM is predominantly activated through interactions with NBS1 of the MRN complex. Activated ATM initiates a signaling cascade at the site of damage, including PTM of chromatin proteins and other factors that are important for DNA repair, such as CtIP-BRCA1 and 53BP1. ATM also activates CHK2 and p53, leading to cell cycle arrest. ATR is activated by a range of cellular processes that produce ssDNA. Activated ATR also activates the cell cycle checkpoint by phosphorylating CHK1. Meanwhile, ATR contributes to the stability of the replication fork and regulates DNA replication upon DNA damage by several mechanisms. Picture adapted from Brown et al., 2017 (85) and Blackford and Jackson, 2017 (86).

ATM, the master regulator orchestrates global cellular responses to DSBs

ATM is best known for its role in the response to DSBs, in which it coordinates DNA repair, cell cycle checkpoint activation, alteration of chromatin structure and other cellular processes (Fig. 6). Under non-stressed conditions, ATM exists in a multimeric form, which dissociates into active monomers upon DNA damage (87). Meanwhile, following DSB induction, a proportion of nuclear ATM is rapidly auto-phosphorylated and mobilized to the sites of damage. The MRN complex, which specifically recognizes DSBs, was proposed to recruit ATM to DNA damage sites and to activate ATM via the interaction between NBS1 and ATM (88,89). Nevertheless, MRN-independent activation of ATM also exists, and there are several interesting insights into the mechanism of ATM activation. (a) The initial trigger of ATM activation is a chromatin conformational change that is induced by the DSB (87). (b) The mere tethering of ATM or several DDR players to undamaged chromatin is sufficient to induce the ATM-dependent DDR (87). (c) Direct interaction between ATM and broken DNA, specifically a contact with the single-stranded stretches at DSBs, is required for its activation (88,90).

Once activated, ATM leads to the phosphorylation of hundreds of substrates implicated in diverse cellular processes. Furthermore, ATM phosphorylates and activates other protein kinases that phosphorylate yet more substrates, meaning that ATM-dependent signaling events are not restricted to the factors directly phosphorylated by ATM (91). One of the best-studied substrates of ATM is CHK2, which is rapidly phosphorylated by ATM upon DNA damage. Activated CHK2 phosphorylates downstream targets including the CDC25A phosphatase, which dephosphorylates and thus activates the cyclin-dependent kinases (CDKs). The phosphorylated CDC25A is then subjected to proteasomal degradation, which leads to the inactivation of CDKs, thus halting cell cycle progression. Meanwhile, ATM initiates a signaling cascade involving phosphorylation, ubiquitination and methylation of the chromatin proteins close to damaged sites. For example, the histone variant H2AX is phosphorylated by ATM in response to DSBs, and phosphorylated H2AX (γH2AX) can be recognized by MDC1, which is also phosphorylated and stabilized on chromatin by ATM upon DNA damage. Phosphorylated MDC1 is then recognized by NBS1, thus promoting the retention of the MRN complex at γH2AX sites. The MRN complex further recruits ATM, and this mechanism allows ATM to phosphorylate additional H2AX molecules, which in turn bind additional MDC1 in a repeated process that spreads the focus. Importantly, γH2AX-marked chromatin is transcriptionally inactive, and this may represent a mechanism to prevent collisions between the DNA replication and transcription machineries (83). Meanwhile, ATM also phosphorylates key factors in HRR and NHEJ, such as BRCA1 and CtIP, which promote DNA end resection, and 53BP1, which functions in DNA-end bridging (92), thereby facilitating the repair of DSBs.

Apart from the functional regulation of DDR effectors by phosphorylation, a significant proportion of the regulation exerted by ATM upon DNA damage is achieved via transcriptional responses. Briefly, ATM directly phosphorylates p53 or other proteins that directly or indirectly regulate p53 stability, thereby affecting the protein level of this global transcriptional regulator. The functional regulation of p53 leads to the activation or repression of different transcriptional programs that collectively promote cell survival through cell cycle arrest and DNA repair activation at low levels of DNA damage, or lead to senescence or apoptosis upon overwhelming DNA damage (91).

ATR, an essential kinase for the cell’s responses to replication stress

In contrast to ATM and DNA-PKcs, which respond primarily to DSBs, ATR is activated by a much wider range of genotoxic stresses that induce the accumulation of ssDNA. ATR is recruited to RPA-ssDNA by its partner protein ATRIP. Such RPA-coated ssDNA is generated by nucleolytic processing of various forms of damaged DNA or by helicase-polymerase uncoupling at stalled replication forks. The ability of ATR-ATRIP to bind RPA-ssDNA makes it a potent factor that senses diverse DNA damage and replication stress (86). However, ATR recruitment to RPA-ssDNA is not sufficient for optimal activation of ATR, and a number of additional factors are required. For example, as illustrated in Fig. 6, the junction of ssDNA and dsDNA is also important for ATR activation. ssDNA/dsDNA structures are recognized by the Rad17-RFC clamp loader, which interacts with the Rad9-Rad1-Hus1 (9-1-1) clamp and loads the clamp onto the ssDNA-dsDNA junction. The 9-1-1 complex further recruits TopBP1, one of the best-characterized ATR activators. The subsequent phosphorylation of Rad9 of the 9-1-1 complex facilitates the association of TopBP1 with ATR, leading to ATR activation. The recognition of RPA-ssDNA by ATR-ATRIP, as well as the loading of 9-1-1 at the ssDNA-dsDNA junction, are both required for TopBP1 to stimulate ATR activation at the site of DNA damage. These factors act together to ensure optimal ATR activation (86,93).

Interestingly, DSBs also activate ATR, and it was suggested that ssDNA generated by end resection may be the key trigger for ATR activation at DSBs. The resection of DSBs requires the action of several nucleases, including MRN and its associated CtIP, ExoI and Dna2. More interestingly, an in vitro study revealed that during resection the region of ssDNA gradually increases and the DSB-induced DNA damage signaling switches from an ATM-activating mode to an ATR-activating mode (93). Consistently, an in vivo study revealed that changing the amount of nucleases involved in the resection process, including CtIP and ExoI, also leads to a switch between the phosphorylation of CHK2 and CHK1, which are specific substrates of ATM and ATR, respectively. Moreover, the inactivation or depletion of ATM, Mre11, CtIP, ExoI or Dna2 leads to diminished ATR responses to DSBs (93).

Activation of ATR initiates a signaling cascade that coordinates cell cycle progression and DNA metabolic processes. Similar to ATM, ATR also contributes to the proteasomal degradation of CDC25 phosphatases by activating its downstream kinase CHK1, thereby arresting the cell cycle. Moreover, ATR activity is important for responses to DNA replication stress through the following mechanisms. First, upon DNA replication stress, ATR-dependent phosphorylation of FANCI and CHK1-dependent inhibition of CDKs inhibit the firing of dormant origins, which may avoid the exhaustion of replication and repair factor pools, especially RPA. Second, ATR prevents replication fork collapse by phosphorylating the SMARCAL1 helicase (SWI/SNF-related, matrix-associated, actin-dependent regulator of chromatin, subfamily A-like 1). Meanwhile, ATR also regulates the cellular dNTP level by transcriptional modulation and PTM of the RNR subunit RRM2 (86).

MULTILAYER REGULATIONS OF THE SOS RESPONSE IN BACTERIA

DDR in Bacteria is best represented by the LexA-mediated SOS response in E. coli. As illustrated in Fig. 7, upon DNA damage, the accumulating ssDNA is first protected by SSB protein, which is then replaced by RecA to form RecA-ssDNA filaments. These filaments interact with LexA, a repressor that binds to the SOS box in the promoters of SOS genes. The interaction leads to a conformational change in the carboxyl-terminal region of LexA, thereby facilitating the self-cleavage of the repressor. The cleavage of LexA decreases its affinity for DNA, thereby leading to the transcriptional activation of a cascade of SOS genes involved in DNA repair, cell cycle arrest and DNA damage tolerance (94,95).

As described in detail by Erill et al. (95), ssDNA generated by replication fork arrest or other processes is believed to be the main trigger of the SOS response. The recA and ssb genes are induced first to protect and stabilize the fork, followed by the induction of DNA repair genes (uvrAB, ydgQ, uvrD, recN and ruvAB) involved in the NER and HRR pathways. For DNA lesions that remain unrepaired upon severe DNA damage, translesion synthesis polymerases (Pol IV and Pol V) are induced to bypass the lesions. Meanwhile, the negative regulator of septum formation, SulA, is also induced, which inhibits cell division by interacting with FtsZ (95). The molecular explanation for the later induction of SulA and UmuCD is that increasing levels of DNA damage lead to progressively increasing RecA-promoted LexA proteolysis and thus decreasing LexA concentrations. As a result, a relatively high level of DNA damage is necessary to reach the very low LexA concentration that allows induction of genes such as sulA and umuC/D, as they have very strong LexA binding sites. This arrangement allows the SOS response to be graded, for example, facilitating DNA repair without blocking cell division or inducing TLS polymerase activity at low levels of DNA damage (96).
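The graded behaviour described above can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which each promoter is repressed by LexA through a single non-cooperative binding site, so that the repressor-free fraction of a promoter is 1 / (1 + [LexA] / Kd). The promoter labels, Kd values and concentrations are hypothetical numbers in arbitrary units chosen only for illustration; they are not measured parameters, and real LexA binding (as a dimer, with additional regulatory elements) is more complex.

```python
# Simplified, hypothetical model of graded SOS induction.
# All numbers are illustrative (arbitrary units), not experimental values.

def relative_expression(lexa_conc, kd):
    """Fraction of promoters free of LexA for one non-cooperative site."""
    return 1.0 / (1.0 + lexa_conc / kd)

# Smaller Kd = stronger LexA binding site (hypothetical values).
PROMOTERS = {
    "weak_box": 10.0,    # stands for an early gene with a weak SOS box
    "strong_box": 0.5,   # stands for a late gene with a strong SOS box
}

def induced_promoters(lexa_conc, threshold=0.5):
    """Promoters whose repressor-free fraction reaches the threshold."""
    return [
        name
        for name, kd in PROMOTERS.items()
        if relative_expression(lexa_conc, kd) >= threshold
    ]

if __name__ == "__main__":
    # LexA concentration falls as DNA damage (and LexA cleavage) increases.
    for lexa in (20.0, 5.0, 0.25):
        print(lexa, induced_promoters(lexa))
```

In this sketch the promoter with the weak site is derepressed at an intermediate LexA concentration, whereas the promoter with the strong site responds only when LexA is nearly depleted, which mirrors the late induction of sulA and umuC/D described in the text.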

Figure 7. Schematic representation of the SOS response in E. coli. The basic circuitry of the E. coli SOS response is illustrated. RecA protein-DNA filaments serve as a co-protease to activate self-cleavage of the LexA protein, the global repressor of SOS. The cleavage of LexA activates the expression of a cascade of SOS genes functioning in DNA damage repair, cell cycle regulation and TLS. Picture taken from Kreuzer, 2013 (96)


The fold induction also varies from gene to gene. For example, the sulA gene is induced ca. 100-fold, while the NER genes are induced only 4- to 5-fold. This level of regulation has been attributed to the diversity of the SOS boxes present in the promoters of SOS genes, including the different locations of the motif, sequence variation of the motif and the existence of additional regulatory elements. For example, the SOS box has been mapped to the -10 region of the sulA and umuDC promoters, between the -10 and -35 regions of recA and uvrB, and to the -35 region of uvrA. This allows for another layer of regulation throughout the SOS response (94).

Similar to p53 in eukarya, the LexA level is also under tight control. In the absence of DNA damage, LexA protein limits its own expression by binding to the promoter region of the lexA gene. LexA also regulates transcription of the recA gene, which is important for the overall function of the circuit. Upon DNA damage, the increased RecA protein level promotes recombination repair and effective cleavage of LexA, whereas increased LexA expression allows the SOS response to be rapidly shut off when the inducing ssDNA signal wanes (96).

The binding of RecA to ssDNA is also regulated to achieve fine-tuning and temporal modulation of the SOS response. For example, RecBCD recognizes double-strand ends and performs end processing to generate the ssDNA substrate for RecA. RecFOR recognizes DNA nicks and gaps formed during replication of damaged templates and then assists the loading of RecA onto ssDNA. In vitro data revealed that SSB has a strong affinity for ssDNA and thus prevents RecA binding. The replacement of SSB by RecA occurs only when RecO and RecR are added. Consistently, in vivo results showed that an E. coli strain lacking RecFOR exhibits delayed SOS induction. In addition, the SOS-inducible gene product DinI stabilizes RecA-ssDNA filaments, whereas the antagonistic functions of the RecX and RdgC (recombination-dependent growth) proteins prevent the formation of RecA-ssDNA filaments (97).

In a broader view, the number and type of genes found in the SOS regulon also vary among bacterial species. For example, the SOS regulon in E. coli consists of over 40 genes, while the LexA regulon in the gram-positive model organism Bacillus subtilis comprises 33 genes, among which only 8 are counterparts of those in the E. coli system. An extreme example is represented by Pseudomonas aeruginosa, in which the SOS regulon is reduced to 15 genes and NER genes are no longer included. Comparison of SOS regulons among various bacterial species has uncovered a LexA-regulated core regulon, which comprises only recA, uvrA, ruvAB and recN (98), suggesting that beyond the basic induction mechanism, many details of the SOS response may have evolved to fit the specific needs of different species thriving in diverse environments.

CELLULAR RESPONSES TOWARDS DNA DAMAGE IN ARCHAEA

Proposed DNA damage sensors in Archaea

DNA damage sensing and subsequent signal transduction are of crucial importance for organisms in Eukarya to initiate the DNA damage response. However, how the signal of DNA damage is sensed in archaea remains largely unclear. Nevertheless, a few candidates have been proposed to function as potential DNA damage sensors.

The biochemical characterization of the single-strand DNA binding protein (SSB) from S. solfataricus showed that it can discriminate and destabilize DNA containing lesions or mismatched base pairs (99). Furthermore, it was proposed that the local binding of SSB proteins at damage sites may lead to the recruitment of other factors for DNA repair. Consistently, the carboxyl-terminal tail of SSB has been shown to interact with a spectrum of proteins, including MCM, reverse gyrase, NurA and PirA helicase (100-103). ssDNA in bacteria and eukarya has the potential to trigger the SOS and ATR pathways, respectively, as a large amount of ssDNA should not normally exist in a cell and its presence indicates a stalled replication fork, resection of damaged DNA, or ongoing recombination (9). For these reasons, SSB may also mediate the signal transduction of the DNA damage response in archaea by functioning as an ssDNA sensor.

The ability of the Mre11-Rad50 complex to specifically recognize free DNA ends makes it a potential sensor for DSBs. Interestingly, both Mre11 and RadA in Sulfolobus are immediately recruited to DNA and remain DNA-bound during the course of DNA repair following γ-irradiation. It was thus proposed that these two factors could function as DSB sensors (104). Moreover, the Sulfolobus Mre11-Rad50 complex undergoes additional posttranslational methylation after γ-irradiation (105), and the complex in Haloferax volcanii was implicated in both the repair of DSBs and the compaction of the nucleoid after DNA damage (106,107), suggesting that this complex may mediate damage sensing and the subsequent DNA repair processes.

In addition, the uracil-stalling feature of archaeal DNA polymerase makes it a detector of uracil in DNA templates (108). Meanwhile, the replicative DNA polymerases (DNAP) stall at DNA lesions and can thus also function as general DNA damage sensors. In a recent model proposed by Grogan (9), it was suggested that once DNAP stalls at a bulky lesion, the stalled DNA replication fork could be cleaved by a 3’-flap endonuclease (XPF or ssDNA endonuclease) or a 5’-flap endonuclease (XPG/Fen1), leaving the bulky lesion at a position close to the double-strand DNA end, which can then be processed by end-processing enzymes such as Mre11-Rad50. If proven to be true, this strategy would represent an adaptation that allows organisms without a functional NER pathway to remove bulky DNA lesions.

Similar to DNAP, RNAP can also function in DNA damage sensing. RNAP from Thermococcus kodakarensis has been reported to stall at DNA lesions on the template during transcription (109), indicating a general role of RNAP as a DNA damage sensor in all three domains of life. Nevertheless, the coupling factors that link the stalled RNAP to downstream DNA repair pathways remain to be identified.

UV-induced genome-responsive expression in Archaea

Despite the lack of a defined signal transduction pathway, previous microarray-based transcriptome analyses of model archaea did reveal a global transcriptional change following UV irradiation in Halobacterium NRC-1 (110) and Sulfolobus species (111-113), suggesting the existence of a cellular response towards UV light in archaea. These studies concluded that an SOS-like response is absent in archaea, as very few genes that function in DNA repair were significantly induced during the UV response. Specifically, analysis of the UV response in Halobacterium NRC-1 revealed a moderate increase in radA gene expression, but no coordinated change in other DNA repair genes was observed. Strikingly, nearly 12% of all genes in Halobacterium NRC-1, implicated in diverse cellular processes, were down-regulated at 60 min after the UV treatment, and it was proposed that the global repression of metabolism during DNA repair might be a general stress-response mechanism shared by all three domains of life to maintain internal homeostasis (110).

Whole-genome transcriptome analysis in Sulfolobus revealed that a number of genes implicated in cell cycle progression, transcriptional regulation, translesion DNA synthesis and diverse metabolic processes are dramatically induced or repressed during the UV response. For example, cell cycle-related genes were downregulated after UV irradiation. These genes encode Orc1-1 and Orc1-3, which function in DNA replication initiation, and the eukaryal-like ESCRT-III system, which mediates cell division. In contrast, Orc1-2, a paralog of the Orc1/Cdc6 protein, was dramatically induced by UV, and it was thus suggested that Orc1-2 may function in the negative regulation of DNA replication initiation upon DNA damage. In addition, a paralog of the TFIIB protein (TFB3) was also highly induced by UV light, suggesting a role in UV-induced transcriptional regulation. Another top induced gene encodes Dpo2, which belongs to the PolB2 family and is the only UV-inducible DNA polymerase. Sequence analysis of Dpo2 homologues revealed that their catalytic residues are mutated, suggesting that these proteins may represent inactivated polymerases (114). Nevertheless, an in vitro assay indicated that the catalytic subunit of Dpo2 is capable of bypassing uracil, hypoxanthine and 8-oxoguanine in DNA templates, suggesting that Dpo2 may function in TLS past deaminated and oxidized bases (81).

DNA damage induced cell aggregation and intercellular chromosomal DNA transfer in Sulfolobus

It has long been known that S. acidocaldarius efficiently exchanges chromosomal auxotrophic markers (115) and that the process is stimulated by UV treatment (116). Further characterization of a UV-inducible operon encoding a type IV pili system in Sulfolobus (ups) revealed that the ups genes are essential for UV-induced cell aggregation and exchange of chromosomal auxotrophic markers, and important for cell survival upon UV treatment (117,118). In addition, two UV-inducible genes encoding membrane components were also shown to be essential for UV-induced chromosomal DNA transfer, and they probably function in the DNA import process (Crenarchaeal DNA import system, Ced) (119). Interestingly, cell aggregation was shown to be species-specific and can also be triggered by other DSB-inducing agents (117,118). These findings, together with the UV-induced upregulation of HRR genes, suggest that DNA transfer among cells of the same species has an important role in DNA repair by providing intact templates of homologous DNA (120).

The potential cell cycle checkpoint control upon stressed conditions in Sulfolobus

Figure 8. Schematic of the cell cycle of exponentially growing Sulfolobus spp. The cell cycle phases of Sulfolobus spp. are illustrated. The potential cell cycle checkpoint controls are indicated. The details are described in the main text and the figure was taken from Lindås and Bernander, 2013 (2).

Most cell cycle studies in archaea have been performed in the genus Sulfolobus, which comprises haploid crenarchaeotes. Specifically, Sulfolobus spp. have three replication origins that fire once per cell cycle. The binding of Orc1/Cdc6 proteins to the ORBs (origin recognition boxes) is the main regulatory step during replication initiation, and this process was proposed to be modulated by the switch between the ATP- and ADP-bound states of the Orc1/Cdc6 proteins. Replication is then terminated by fork collision, and if homologous recombination occurs during replication, the resulting dimer can be resolved by XerA, the homologue of the bacterial XerCD recombinase. Chromosome replication is followed by genome segregation and cell division, which are mediated by the bacterial-like ParA-ParB system and a system with homology to the eukaryal endosomal sorting complex required for transport III (ESCRT-III), respectively (2).


The progression of the cell cycle in Sulfolobus is regulated by a yet-to-be defined mechanism. As summarized by Lindås and Bernander (2) and shown in Fig. 8, an exponentially growing Sulfolobus cell goes through a short pre-replicative G1 phase (<5% of the cell cycle), followed by the S phase, in which genome replication occurs (30–35% of the cell cycle). The post-replicative G2 phase occupies over half of the cell cycle, and the M and D phases, during which genome segregation and cell division take place, each last ca. 5% of the cell cycle. Interestingly, when a Sulfolobus culture is in stationary phase, under amino acid starvation stress or diluted into fresh medium, a temporary cell cycle arrest in D (G2) is observed (121,122). In addition, several drugs induce cell cycle arrest in G2 phase (123-125). These include daunomycin, which suppresses DNA replication and transcription by targeting topoisomerase; N1-guanyl-1,7-diaminoheptane (GC7), which suppresses translation by inhibiting the posttranslational modification of protein synthesis initiation factor 5A; and acetic acid. More interestingly, UV light treatment of S. solfataricus and S. acidocaldarius leads to an increased population of cells with DNA contents corresponding to those of G1 and early S phase cells, which was proposed to result from the activation of a G1 checkpoint system upon DNA damage (112,113). The ability of Sulfolobus to arrest the cell cycle under stressed conditions suggests the presence of a cell cycle checkpoint system.

TRANSCRIPTIONAL REGULATION IN ARCHAEA

Figure 9. A schematic diagram of transcription regulation in Archaea. Binding of TBP to the TATA box and of TFB to the BRE site is essential for the recruitment of RNAP to form the PIC. Transcriptional activators or repressors can facilitate or block this process by binding to different regions of the promoter. Activators usually bind to DNA upstream of the BRE, whereas repressors bind to TATA/BRE-overlapping sequences. BRE: transcriptional factor B (TFB) recognition element. TATA: the binding site for TBP. Picture adapted from Peeters et al., 2015 (126).

The archaeal transcription machinery shares strong similarity with the eukaryal RNA polymerase II system in subunit composition and general mechanism (127). As illustrated in Fig. 9a, archaeal transcription is initiated by the binding of TATA-binding protein (TBP) and transcriptional factor B (TFB) to the promoter region; RNAP is then recruited to the promoter along with the general transcriptional factor E (TFE) to form the pre-initiation complex (PIC). While TBP and TFB bind to the TATA box and the BRE site of the promoter, respectively, in a sequence-specific manner, TFE interacts with the non-template strand in a sequence-independent manner (126).

Archaeal transcriptional regulation is primarily mediated by specific or general transcription factors that interact with the basal transcription machinery at promoter regions. By binding to different promoter regions, these transcriptional factors can enhance or inhibit the sequential assembly of the PIC. For example, as indicated in Fig. 9b, transcriptional activators usually bind upstream of the BRE site and facilitate the recruitment of TBP or TFB. In contrast, the binding sites of most repressors overlap with the TATA box and BRE, thereby blocking the formation of the PIC. So far, a number of transcriptional factors of both bacterial and eukaryal types have been identified in archaea, and they function in gene-specific or global transcriptional regulation (128). Interestingly, approximately 70% of known archaeal genomes encode two or more copies of TFB and/or TBP family proteins (128), and functional implications of the TFB paralogs in transcriptional regulation have been reported. For instance, as many as seven TFBs are encoded in H. salinarum. Among them, TFBb may function in the heat shock response (129) and TFBf is the exclusive TFB regulator for genes implicated in ribosome biogenesis (130).

Three homologues of the TFB protein are encoded in the genus Sulfolobus: two of them (TFB1 and TFB2) are full-length and the third (TFB3) is severely truncated. Specifically, TFB3 lacks the carboxyl-terminal core domain that interacts with TBP and binds the BRE, as well as the B-finger domain that stimulates transcription (113,131). Investigation of the relative transcript abundances of these three TFB-encoding genes in S. solfataricus and S. acidocaldarius demonstrated that only tfb3 is significantly induced upon UV treatment (113), and this regulator was subsequently shown to facilitate in vitro transcription (132), suggesting that it could function in DNA damage-induced transcriptional regulation.


SUMMARY OF THE RESULTS

In this work, I have investigated the function of TFB3 and Orc1-2 from S. islandicus Rey15A during DNA damage-induced cellular responses. To introduce DNA damage, 4-NQO (NQO) was used. It has been reported that the metabolic activation of 4-NQO leads to the formation of DNA adducts, 8-hydroxyguanine and strand breakage (13), and our previous results indicated that NQO induces replication stress similar to that observed upon UV irradiation (133).

TFB3 FUNCTIONS AS A TRANSCRIPTIONAL ACTIVATOR FOR DNA TRANSFER PATHWAY

tfb3 is among the top upregulated genes following exposure to UV light in S. solfataricus and S. acidocaldarius, suggesting a role for this potential transcriptional regulator in the DDR of Sulfolobus. In vitro transcription assays demonstrated that TFB3, together with TFB1 and TBP, is capable of facilitating in vitro transcription. However, such a stimulatory effect is observed for all tested promoters, including non-UV-responsive ones (132), and the in vivo function of TFB3 remains elusive.

In this work, we showed that the tfb3 gene is also dramatically induced in S. islandicus by a number of DNA damaging agents, including MMS, NQO, cisplatin and UV light, indicating that the induction of tfb3 is triggered specifically by DNA damage. We then constructed a tfb3 deletion mutant (∆tfb3), and phenotypic characterization of ∆tfb3 revealed that the mutant is more sensitive to 1-2 µM NQO treatment than the WT strain. The increased susceptibility of the mutant to NQO prompted us to analyze the transcriptome changes in the WT and the mutant upon NQO treatment, which led to the identification of 139 upregulated genes and 174 downregulated genes (>2-fold) in the WT strain. These genes include most of the highly UV-inducible genes, and again no significant induction of DNA repair genes was observed. In contrast, only 10 of the 61 top induced genes (>4-fold) were upregulated in ∆tfb3, indicating that TFB3 does function in transcriptional activation in vivo. Furthermore, many of the TFB3-dependent genes are implicated in the intercellular DNA transfer process, including the previously reported ups genes, ced genes and a large number of genes encoding membrane-associated components, indicating that TFB3 may specifically function in the transcriptional activation of genes in the DNA transfer pathway upon DNA damage.
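The comparison between the WT and ∆tfb3 transcriptomes can be sketched as follows. This is only an illustration of the fold-change logic under stated assumptions, not the analysis pipeline used in this work: the gene labels and fold-change values are hypothetical placeholders, and a real analysis would also involve normalization and statistical testing.

```python
# Illustrative fold-change classification; all gene labels and values
# below are hypothetical placeholders, not data from this study.

def classify(fold_change, cutoff=2.0):
    """Label a treated/untreated expression ratio using a fold cutoff."""
    if fold_change > cutoff:
        return "up"
    if fold_change < 1.0 / cutoff:
        return "down"
    return "unchanged"

def activator_dependent(wt, mutant, cutoff=2.0):
    """Genes upregulated in the WT but not in the deletion mutant."""
    return sorted(
        gene
        for gene, fc in wt.items()
        if classify(fc, cutoff) == "up"
        and classify(mutant.get(gene, 1.0), cutoff) != "up"
    )

if __name__ == "__main__":
    wt_fc = {"geneA": 8.0, "geneB": 5.0, "geneC": 0.3, "geneD": 1.2}
    mutant_fc = {"geneA": 1.1, "geneB": 4.5, "geneC": 0.4, "geneD": 1.0}
    print(activator_dependent(wt_fc, mutant_fc))
```

Here a gene counts as dependent on the deleted activator when its induction exceeds the cutoff in the WT but not in the mutant; genes missing from the mutant data set are treated as unchanged, which is an assumption of this sketch.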

To provide insights into the mechanism of TFB3-mediated transcriptional activation, we performed ChIP-qPCR analysis and found that TFB3 is associated with the promoters of its target genes, whereas essentially no enrichment was observed for the promoters of TFB3-independent genes. This indicates that TFB3 is specifically recruited onto the promoters of TFB3-dependent genes upon DNA damage and activates their expression, probably by recruiting RNAP to the promoter. However, unlike canonical TFB family proteins, TFB3 lacks the DNA-binding domain that is normally present in the C-terminal cyclin fold of TFBs. As a result, the recruitment of TFB3 to the promoter region is probably mediated by protein-protein interaction. It has been proposed that the Zn ribbon domain of TFB/TFIIB proteins functions in RNAP recruitment (134) and that the four conserved cysteines in the Zn ribbon coordinate the Zn ion (135). Intriguingly, TFB1 in Sulfolobus species lacks the first and fourth cysteines that are conserved in the TFB/TFIIB family, including Sulfolobus TFB3. It was proposed that TFB1 and TFB3 functionally interact with each other to provide the full capacity of transcription (132). In this work, we further performed sequence analysis of TFB3 and identified a coiled-coil (CC) motif in the C-terminal region. By performing site-directed mutagenesis of the TFB3 protein, we showed that both the conserved cysteines in the N-terminal Zn ribbon region and the conserved residues in the C-terminal CC motif are essential for the function of TFB3 as a transcriptional activator upon DNA damage. These results suggest that the canonical Zn ribbon of TFB3 may complement the function of TFB1 to provide full transcriptional capacity for the TFB3-dependent genes, and that the specificity of TFB3 can be achieved by interacting with a sequence-specific regulator via the CC motif.

Interestingly, phylogenetic analysis revealed a co-occurrence of TFB3, non-canonical TFB1 and the Ced system in a broad range of species in Crenarchaeota, suggesting that the TFB3-regulated DNA transfer pathway may represent a well-conserved DDR regulatory circuit in Crenarchaeota.

To conclude, this part of the work answered the longstanding question about the functional significance of the induction of TFB3 upon DNA damage. Although it lacks a DNA-binding domain, TFB3 is recruited onto the promoters of its target genes, and this process is probably mediated by protein-protein interaction via the coiled-coil motif. Most TFB3-dependent genes are implicated in the DNA transfer process, which, together with the co-occurrence of TFB3 and the Ced system, suggests that this regulator has co-evolved specifically with the DNA transfer systems.

ORC1-2 FUNCTIONS AS A GLOBAL REGULATOR ESSENTIAL FOR DDR IN SULFOLOBUS

To investigate the function of Orc1-2 during the DNA damage response, we analysed the sensitivity towards NQO of the orc1-2 deletion mutant previously constructed in S. islandicus Rey15A, and a hypersensitivity phenotype was observed. To provide insights into how Orc1-2 deficiency affects cell survival after DNA damage treatment, the orc1-2 mutant and the WT were subjected to transcriptome analysis. Strikingly, the transcriptome data revealed that Orc1-2 deficiency abolishes the differential expression of the majority of NQO-responsive genes. Specifically, the Orc1-2-dependent upregulated genes include all of the TFB3-dependent genes. In addition, genes involved in HRR, including the mre11-rad50 operon, as well as the Dpo2 operon also exhibit Orc1-2-dependent upregulation. The Orc1-2-dependent downregulated genes are implicated in DNA replication initiation (Orc1-1 and Orc1-3), genome segregation (SegA and SegB) and cell division (CdvA, ESCRT-III orthologues and Vps4), suggesting that NQO imposes cell cycle arrest in S. islandicus and that this regulation is dependent on Orc1-2.

Consistent with the deficiency in transcriptional responses upon DNA damage, Δorc1-2 was shown to be defective in cell aggregation. Meanwhile, flow cytometry analysis revealed that Δorc1-2 is also defective in cell cycle regulation upon DNA damage, as the population with one chromosome equivalent increased more dramatically in the mutant. Upon DNA damage, cell division occurs at a lower frequency in the WT than in Δorc1-2, as a result of the transcriptional repression of cell division genes in the WT. In contrast, although genes encoding DNA replication initiators are not repressed in Δorc1-2, the frequently initiated DNA replication can be readily blocked by genomic DNA lesions. As a result, these processes collectively lead to a dramatic increase in the cell population with a genome content equivalent to one chromosome in Δorc1-2.

Interestingly, a conserved motif that was previously reported to mediate UV-responsive expression in S. acidocaldarius (136) is also present in a number of Orc1-2-dependent genes. A reporter gene assay showed that this motif also mediates NQO-responsive expression in the WT strain of S. islandicus E234, but not in Δorc1-2, indicating that both the Orc1-2 protein and the motif are essential for DNA damage-induced transcriptional regulation. Furthermore, a DNaseI footprinting assay demonstrated that the recombinant Orc1-2 protein protects the motif (conducted by Mengmeng Sun), suggesting that Orc1-2 may bind to this DNA damage-responsive element (DDRE) in vivo. Interestingly, such a motif also exists in the promoter of the orc1-2 gene, suggesting an autoregulation of the Orc1-2 protein level upon DNA damage and that a higher expression of Orc1-2 may be important for the DDR process.

To investigate the influence of the Orc1-2 protein level on the Sulfolobus DDR, we constructed a strain in which the original promoter of the orc1-2 gene was replaced with the araS promoter, which confers arabinose-inducible expression in this archaeon. By culturing the strain in non-inducing and inducing media, we were able to maintain the Orc1-2 protein constantly at a basal or a high level, respectively. Phenotypic characterization of the resulting strain revealed that cells expressing a low level of Orc1-2 protein exhibit hypersensitivity to NQO treatment, as do orc1-2 deletion mutant cells. In clear contrast, a constantly high level of Orc1-2 protein allows the strain to respond to NQO treatment more promptly. In parallel, quantitative analysis of the expression levels of DDR genes demonstrated that a constantly high level of Orc1-2 protein enables immediate induction of DDR genes upon NQO treatment. Nevertheless, the DDR cascade was not activated by constantly inducing the expression of orc1-2 alone, suggesting that other factors or post-translational modification of Orc1-2 are also required to activate the DDR cascade.

Based on the results summarized above, one could envisage the following scenario for the regulatory network in the DDR of Sulfolobus. As illustrated in Fig. 10, DSBs generated by DNA replication across genomic DNA lesions or directly by NQO treatment can be recognised by the Mre11-Rad50 complex, which then recruits NurA and HerA and initiates end resection. Meanwhile, the Mre11-Rad50 complex may also transmit the DNA damage signal to a certain Orc1-2 modifying factor that activates the function of Orc1-2 upon DNA damage. Once activated, Orc1-2 enhances its own expression, thus forming a positive feedback loop. Meanwhile, Orc1-2 binds to the DDRE or other promoter regions of its target genes to modulate transcription, thereby orchestrating the global transcriptional response towards DNA damage.

Figure 10. An Orc1-2 centered DDR regulatory network in Sulfolobus. DNA damage agents yield lesions on DNA that will be converted into double-stranded breaks, which activate the DNA damage signal transduction pathway. Then Orc1-2 is probably activated by posttranslational modifications, such as phosphorylation and/or acetylation. The activated form of Orc1-2 then recognizes DDRE present in the promoters of DDR genes and activates/represses their expression, including several different cellular processes as well as its own gene. AAA+: ATPases associated with diverse cellular activities; wH: wing-helix DNA binding domain; DDRE: DNA damage-responsive element; TTS: transcription start site; Ups: UV-responsive pili of Sulfolobus; Ced: Crenarchaeal system for exchange of DNA.


DISCUSSIONS AND FUTURE PERSPECTIVES

ChIP-qPCR analysis showed that TFB3 is recruited onto the promoters of TFB3-dependent DDR genes. However, the underlying mechanism remains to be elucidated. As TFB3 lacks any recognizable DNA-binding domain, its recruitment onto the promoter must be indirect, for instance through protein-protein interaction with a transcriptional factor that binds to the DDR promoters. Here, Orc1-2 represents a candidate for such a role, as it was shown that Orc1-2 binds to the DDRE in vitro. Consistently, deletion of either orc1-2 or tfb3 abolishes the DNA damage-induced cell aggregation, and our unpublished data suggest that TFB3 cannot restore the function of Orc1-2 in activating the expression of the ups operon upon DNA damage, indicating that both TFB3 and Orc1-2 are required for this process. The ability of Orc1-2 to bind the DDRE in the promoter region may assist TFB3 recruitment via interaction with the CC motif of TFB3. Indeed, the coiled-coil motif is conserved in TFB3 paralogues and was shown to be essential for the function of TFB3, suggesting that it may mediate the recruitment of TFB3.

It was shown that the conserved DDRE motif (ANTTTC) mediates NQO-responsive expression in the presence of Orc1-2. However, among the top regulated genes, dpo2 and orc1-2 also contain the DDRE motif but are TFB3-independent, suggesting that other co-activators are recruited after Orc1-2 binding, or that Orc1-2 itself is sufficient for the activation of the dpo2 gene and of its own gene upon DNA damage.

More interestingly, the sequence and location of the DDRE vary among promoters. For example, the tfb3 and orc1-2 promoters contain the most conserved motif, located between positions -50 and -80, whereas most other DDREs are located between -30 and -50. This arrangement is reminiscent of the LexA-mediated SOS response, in which the location and sequence variation of the SOS box determine its affinity for LexA and thus modulate the transcriptional strength of SOS genes. Similarly, multilayered regulation of DDR genes may also exist in Sulfolobus.
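The positional comparison described above can be illustrated with a minimal sketch that scans a promoter sequence for the ANTTTC consensus and reports where each hit lies relative to the transcription start site. This is not the motif-search procedure used in this work; the function names, the single-strand search, the use of the first motif base as the reported position, and the exact window boundaries (in particular the assignment of position -50 to the distal window) are assumptions made only for illustration.

```python
import re

# DDRE consensus as given in the text: ANTTTC (N = any base).
# A lookahead is used so that overlapping matches are also reported.
DDRE = re.compile(r"(?=(A[ACGT]TTTC))")


def find_ddre(promoter, tss_index):
    """Return (position relative to TSS, hexamer) for each DDRE hit.

    promoter  : coding-strand sequence, 5'->3' (hypothetical input format)
    tss_index : 0-based index of the +1 base within `promoter`
    Only the given strand is searched (assumption).
    """
    hits = []
    for match in DDRE.finditer(promoter.upper()):
        start = match.start(1)
        # Promoter coordinates have no position 0: the base before +1 is -1.
        rel = start - tss_index if start < tss_index else start - tss_index + 1
        hits.append((rel, match.group(1)))
    return hits


def classify_ddre(rel):
    """Assign a hit to the windows mentioned in the text (boundaries assumed)."""
    if -80 <= rel <= -50:
        return "distal"      # e.g. the window reported for tfb3 and orc1-2
    if -50 < rel <= -30:
        return "proximal"    # the window reported for most other DDREs
    return "other"


if __name__ == "__main__":
    # Synthetic sequence, not a real Sulfolobus promoter.
    seq = "G" * 40 + "AGTTTC" + "G" * 55
    for rel, hexamer in find_ddre(seq, tss_index=100):
        print(rel, hexamer, classify_ddre(rel))
```

A real analysis would also need to scan the reverse strand and use experimentally mapped transcription start sites, neither of which is covered by this sketch.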

In contrast, motif searching in the promoters of the downregulated genes failed to identify a conserved regulatory motif (apart from the TATA box region), suggesting a distinct mechanism for the transcriptional repression of these genes. For example, the TFB3/Orc1-2-dependent upregulation of certain repressors could lead to the repression of these genes. Another possibility is that activated Orc1-2 competes with TBP for binding to the TATA box region, thereby blocking PIC formation on the promoters of the downregulated genes.

Interestingly, the induction of Orc1-2 is essential but not sufficient for the activation of DDR in Sulfolobus, suggesting the existence of a yet-to-be-defined mechanism that activates Orc1-2 upon DNA damage. It was proposed that the ADP- and ATP-bound forms may function as a switch for the functional status of Sulfolobus Orc1/Cdc6 proteins, and that newly synthesized ATP-bound proteins are inactivated by one round of ATP hydrolysis (137). However, the activation of Orc1-2 upon DNA damage is less likely to be mediated by such a mechanism, as the constant induction of the orc1-2 gene in the Orc1-2araS strain failed to initiate the DDR cascade. One possibility is that newly synthesized Orc1-2 has to interact with a DNA damage-activated partner to undergo a conformational change and thus fulfill its role in transcriptional regulation. However, a more attractive model is that an Orc1-2-modifying factor activates Orc1-2 upon DNA damage by posttranslational modification, a mode widely employed by eukaryotes to modulate the function of their DDR regulators.

∆orc1-2 is hypersensitive to NQO, while ∆tfb3 is only moderately sensitive. This difference in susceptibility suggests that TFB3-independent processes play a key role in cell survival upon DNA damage. Notably, the Dpo2-encoding gene is the most highly induced of the TFB3-independent genes. Dpo2 thus represents a candidate that contributes to the survival of ∆tfb3, but not ∆orc1-2, upon DNA damage by performing translesion synthesis (TLS). More importantly, the failure of ∆orc1-2 to block the expression of genes involved in cell cycle progression upon DNA damage probably renders the strain more sensitive to NQO, as NQO-induced lesions are converted into the more deleterious DSBs upon DNA replication.

Based on the discussion above, further research can focus on the following aspects.

As the overexpression of orc1-2 does not trigger the activation of the DDR cascade, we hypothesized that PTM events mediated by certain kinases or other protein-modifying enzymes may occur on Orc1-2 upon DNA damage and probably activate Orc1-2. It is therefore of crucial importance to determine by mass spectrometry whether Orc1-2 undergoes posttranslational modification, such as phosphorylation, acetylation or methylation, upon DNA damage. If this proves to be true, the identification of the Orc1-2 modifier will provide unprecedented insights into the DNA damage-induced signal transduction pathway in archaea.

It was suggested that unrepaired lesions in Sulfolobus can block DNA replication and lead to the formation of DSBs, which may trigger the cellular signaling cascade in response to DNA damage (138). So far, it is generally accepted that Mre11-Rad50 is the early factor that recognizes DSBs, and it may recruit either NurA-HerA for end resection or other factors for DNA damage signal transduction. In Eukarya, the MRN complex is the main factor that activates the ATM kinase through protein-protein interaction. The Mre11-Rad50 core of this complex is conserved across the archaeal and eukaryal lineages, suggesting that Mre11-Rad50 could also transmit the DNA damage signal to downstream transducers in archaea, probably a kinase. By analogy to the mechanism in Eukarya, the identification of factors that interact with Mre11-Rad50 under DNA damage conditions will provide interesting insights into the mechanism of DDR activation in archaea.

In addition, how Orc1-2 regulates the transcription of target genes that lack a DDRE motif remains unclear. One possibility is that Orc1-2 interacts with specific or general transcriptional regulators to modulate the expression of a certain subset of genes. The identification and characterization of the Orc1-2 interactome under normal growth and upon DNA damage may provide new clues to the mechanism of Orc1-2-mediated transcriptional responses.


REFERENCES

1. Woese, C.R. and Fox, G.E. (1977) Phylogenetic structure of the prokaryotic domain: the primary kingdoms. Proc Natl Acad Sci U S A, 74, 5088-5090.
2. Lindas, A.C. and Bernander, R. (2013) The cell cycle of archaea. Nat. Rev. Microbiol., 11, 627-638.
3. Zatopek, K.M., Gardner, A.F. and Kelman, Z. (2018) Archaeal DNA replication and repair: new genetic, biophysical, and molecular tools for discovering and characterizing enzymes, pathways, and mechanisms. FEMS Microbiol. Rev.
4. Lindahl, T. (1993) Instability and decay of the primary structure of DNA. Nature, 362, 709.
5. She, Q., Feng, X. and Han, W. (2017), Biocommunication of Archaea. Springer, pp. 305-318.
6. Grogan, D.W., Carver, G.T. and Drake, J.W. (2001) Genetic fidelity under harsh conditions: analysis of spontaneous mutation in the thermoacidophilic archaeon Sulfolobus acidocaldarius. Proc Natl Acad Sci U S A, 98, 7928-7933.
7. Goodman, M.F. and Woodgate, R. (2013) Translesion DNA polymerases. Csh Perspect Biol, 5, a010363.
8. White, M.F. and Allers, T. (2018) DNA Repair in the Archaea - an emerging picture. FEMS Microbiol. Rev.
9. Grogan, D.W. (2015) Understanding DNA repair in hyperthermophilic Archaea: persistent gaps and other reasons to focus on the fork. Archaea, 2015.
10. Zhang, C., Tian, B., Li, S., Ao, X., Dalgaard, K., Gokce, S., Liang, Y. and She, Q. (2013) Genetic manipulation in Sulfolobus islandicus and functional analysis of DNA repair genes. Biochem. Soc. Trans., 41, 405-410.
11. Genois, M.-M., Paquet, E.R., Laffitte, M.-C.N., Maity, R., Rodrigue, A., Ouellette, M. and Masson, J.-Y. (2014) DNA repair pathways in trypanosomatids: from DNA repair to drug resistance. Microbiol. Mol. Biol. Rev., 78, 40-73.
12. Sancar, A., Lindsey-Boltz, L.A., Unsal-Kacmaz, K. and Linn, S. (2004) Molecular mechanisms of mammalian DNA repair and the DNA damage checkpoints. Annu. Rev. Biochem., 73, 39-85.
13. Friedberg, E.C., Walker, G.C., Siede, W. and Wood, R.D. (2005) DNA repair and mutagenesis. American Society for Microbiology Press.
14. Ciccia, A. and Elledge, S.J. (2010) The DNA damage response: making it safe to play with knives. Mol. Cell, 40, 179-204.
15. Morita, R., Nakane, S., Shimada, A., Inoue, M., Iino, H., Wakamatsu, T., Fukui, K., Nakagawa, N., Masui, R. and Kuramitsu, S. (2010) Molecular mechanisms of the whole DNA repair system: a comparison of bacterial and eukaryotic systems. J Nucleic Acids, 2010, 179594.
16. Yi, C. and He, C. (2013) DNA repair by reversal of DNA damage. Csh Perspect Biol, 5, a012575.
17. Kiener, A., Husain, I., Sancar, A. and Walsh, C. (1989) Purification and properties of Methanobacterium thermoautotrophicum DNA photolyase. J. Biol. Chem., 264, 13880-13887.


18. Sakofsky, C.J., Runck, L.A. and Grogan, D.W. (2011) Sulfolobus mutants, generated via PCR products, which lack putative enzymes of UV photoproduct repair. Archaea, 2011.
19. Fujihashi, M., Numoto, N., Kobayashi, Y., Mizushima, A., Tsujimura, M., Nakamura, A., Kawarabayasi, Y. and Miki, K. (2007) Crystal structure of archaeal photolyase from Sulfolobus tokodaii with two FAD molecules: implication of a novel light-harvesting cofactor. J. Mol. Biol., 365, 903-910.
20. Dorazi, R., Götz, D., Munro, S., Bernander, R. and White, M.F. (2007) Equal rates of repair of DNA photoproducts in transcribed and non-transcribed strands in Sulfolobus solfataricus. Mol. Microbiol., 63, 521-529.
21. Leclere, M.M., Nishioka, M., Yuasa, T., Fujiwara, S., Takagi, M. and Imanaka, T. (1998) The O6-methylguanine-DNA methyltransferase from the hyperthermophilic archaeon Pyrococcus sp. KOD1: a thermostable repair enzyme. Mol. Gen. Genet., 258, 69-77.
22. Perugino, G., Vettone, A., Illiano, G., Valenti, A., Ferrara, M.C., Rossi, M. and Ciaramella, M. (2012) Activity and regulation of archaeal DNA alkyltransferase: conserved protein involved in repair of DNA alkylation damage. J. Biol. Chem., 287, 4222-4231.
23. Kanugula, S., Pauly, G.T., Moschel, R.C. and Pegg, A.E. (2005) A bifunctional DNA repair protein from Ferroplasma acidarmanus exhibits O6-alkylguanine-DNA alkyltransferase and endonuclease V activities. Proc Natl Acad Sci U S A, 102, 3617-3622.
24. Nishikori, S., Shiraki, K., Okanojo, M., Imanaka, T. and Takagi, M. (2004) Equilibrium and kinetic stability of a hyperthermophilic protein, O6-methylguanine-DNA methyltransferase under various extreme conditions. Journal of biochemistry, 136, 503-508.
25. Messling, J.-E. and Williams, A. (2016), Genome Stability. Elsevier, pp. 51-67.
26. Krokan, H.E. and Bjoras, M. (2013) Base excision repair. Csh Perspect Biol, 5, a012583.
27. Grasso, S. and Tell, G. (2014) Base excision repair in Archaea: back to the future in DNA repair. DNA Repair, 21, 148-157.
28. Ishino, S., Makita, N., Shiraishi, M., Yamagami, T. and Ishino, Y. (2015) EndoQ and EndoV work individually for damaged DNA base repair in Pyrococcus furiosus. Biochimie, 118, 264-269.
29. Kiyonari, S., Egashira, Y., Ishino, S. and Ishino, Y. (2014) Biochemical characterization of endonuclease V from the hyperthermophilic archaeon, Pyrococcus furiosus. Journal of biochemistry, 155, 325-333.
30. Shiraishi, M., Ishino, S., Yamagami, T., Egashira, Y., Kiyonari, S. and Ishino, Y. (2015) A novel endonuclease that may be responsible for damaged DNA base repair in Pyrococcus furiosus. Nucleic Acids Res., 43, 2853-2863.
31. Yan, Z., Huang, Q., Ni, J. and Shen, Y. (2016) Distinct catalytic activity and in vivo roles of the ExoIII and EndoIV AP endonucleases from Sulfolobus islandicus. Extremophiles, 20, 785-793.
32. Greagg, M.A., Fogg, M.J., Panayotou, G., Evans, S.J., Connolly, B.A. and Pearl, L.H. (1999) A read-ahead function in archaeal DNA polymerases detects promutagenic template-strand uracil. Proc Natl Acad Sci U S A, 96, 9045-9050.
33. Dionne, I. (2005) Characterization of an archaeal family 4 uracil DNA glycosylase and its interaction with PCNA and chromatin proteins. Biochem. J., 387, 859-863.


34. Kiyonari, S., Tahara, S., Uchimura, M., Shirai, T., Ishino, S. and Ishino, Y. (2009) Studies on the base excision repair (BER) complex in Pyrococcus furiosus. Biochem. Soc. Trans., 37, 79-82.
35. Pan, M., Kelman, L.M. and Kelman, Z. (2011) The archaeal PCNA proteins. Biochem. Soc. Trans., 39, 20-24.
36. Shiraishi, M., Ishino, S., Yoshida, K., Yamagami, T., Cann, I. and Ishino, Y. (2016) PCNA is involved in the EndoQ-mediated DNA repair process in Thermococcales. Scientific reports, 6, 25532.
37. Hogrel, G., Lu, Y., Laurent, S., Henry, E., Etienne, C., Phung, D.K., Dulermo, R., Bossé, A., Pluchon, P.-F. and Clouet-d’Orval, B. Physical and functional interplay between PCNA DNA clamp and Mre11–Rad50 complex from the archaeon Pyrococcus furiosus. Nucleic Acids Res.
38. Rouillon, C. and White, M.F. (2011) The evolution and mechanisms of nucleotide excision repair proteins. Res. Microbiol., 162, 19-26.
39. Kisker, C., Kuper, J. and Van Houten, B. (2013) Prokaryotic nucleotide excision repair. Csh Perspect Biol, 5, a012591.
40. Hu, J., Selby, C.P., Adar, S., Adebali, O. and Sancar, A. (2017) Molecular mechanisms and genomic maps of DNA excision repair in Escherichia coli and humans. J. Biol. Chem., 292, 15588-15597.
41. Giglia-Mari, G., Zotter, A. and Vermeulen, W. (2011) DNA damage response. Csh Perspect Biol, 3, a000745.
42. Marteijn, J.A., Lans, H., Vermeulen, W. and Hoeijmakers, J.H. (2014) Understanding nucleotide excision repair and its roles in cancer and ageing. Nat. Rev. Mol. Cell Biol., 15, 465-481.
43. White, M.F. and Allers, T. (2018) DNA Repair in the Archaea–an emerging picture. FEMS Microbiol. Rev.
44. Rouillon, C. and White, M.F. (2010) The XBP-Bax1 helicase-nuclease complex unwinds and cleaves DNA: implications for eukaryal and archaeal nucleotide excision repair. J. Biol. Chem., 285, 11013-11022.
45. Romano, V., Napoli, A., Salerno, V., Valenti, A., Rossi, M. and Ciaramella, M. (2007) Lack of strand-specific repair of UV-induced DNA lesions in three genes of the archaeon Sulfolobus solfataricus. J. Mol. Biol., 365, 921-929.
46. Fujikane, R., Ishino, S., Ishino, Y. and Forterre, P. (2010) Genetic analysis of DNA repair in the hyperthermophilic archaeon, Thermococcus kodakaraensis. Genes Genet. Syst., 85, 243-257.
47. Crowley, D.J., Boubriak, I., Berquist, B.R., Clark, M., Richard, E., Sullivan, L., DasSarma, S. and McCready, S. (2006) The uvrA, uvrB and uvrC genes are required for repair of ultraviolet light induced DNA photoproducts in Halobacterium sp. NRC-1. Saline Systems, 2, 11.
48. Li, G.-M. (2008) Mechanisms and functions of DNA mismatch repair. Cell Res., 18, 85.
49. Larrea, A.A., Lujan, S.A. and Kunkel, T.A. (2010) SnapShot: DNA mismatch repair. Cell, 141, 730-730. e731.
50. Kelman, Z. and White, M.F. (2005) Archaeal DNA replication and repair. Curr. Opin. Microbiol., 8, 669-676.
51. Busch, C.R. and DiRuggiero, J. (2010) MutS and MutL are dispensable for maintenance of the genomic mutation rate in the halophilic archaeon Halobacterium salinarum NRC-1. PLoS One, 5, e9045.


52. Grogan, D.W. (2004) Stability and repair of DNA in hyperthermophilic Archaea. Curr. Issues Mol. Biol., 6, 137-144.
53. Ishino, S., Nishi, Y., Oda, S., Uemori, T., Sagara, T., Takatsu, N., Yamagami, T., Shirai, T. and Ishino, Y. (2016) Identification of a mismatch-specific endonuclease in hyperthermophilic Archaea. Nucleic Acids Res., 44, 2977-2986.
54. Nakae, S., Hijikata, A., Tsuji, T., Yonezawa, K., Kouyama, K.-i., Mayanagi, K., Ishino, S., Ishino, Y. and Shirai, T. (2016) Structure of the EndoMS-DNA complex as mismatch restriction endonuclease. Structure, 24, 1960-1971.
55. Castaneda-Garcia, A., Prieto, A.I., Rodriguez-Beltran, J., Alonso, N., Cantillon, D., Costas, C., Perez-Lago, L., Zegeye, E.D., Herranz, M., Plocinski, P. et al. (2017) A non-canonical mismatch repair pathway in prokaryotes. Nature communications, 8, 14246.
56. Ren, B., Kühn, J., Meslet-Cladiere, L., Briffotaux, J., Norais, C., Lavigne, R., Flament, D., Ladenstein, R. and Myllykallio, H. (2009) Structure and function of a novel endonuclease acting on branched DNA substrates. The EMBO journal, 28, 2479-2489.
57. Bonura, T., Town, C.D., Smith, K.C. and Kaplan, H.S. (1975) The influence of oxygen on the yield of DNA double-strand breaks in X-irradiated Escherichia coli K-12. Radiat. Res., 63, 567-577.
58. Huang, L.C., Clarkin, K.C. and Wahl, G.M. (1996) Sensitivity and selectivity of the DNA damage sensor responsible for activating p53-dependent G1 arrest. Proc Natl Acad Sci U S A, 93, 4827-4832.
59. Lieber, M.R. and Wilson, T.E. (2010) SnapShot: Nonhomologous DNA end joining (NHEJ). Cell, 142, 496-496. e491.
60. Blackwood, J.K., Rzechorzek, N.J., Bray, S.M., Maman, J.D., Pellegrini, L. and Robinson, N.P. (2013) End-resection at DNA double-strand breaks in the three domains of life. Biochem. Soc. Trans., 41, 314-320.
61. Jasin, M. and Rothstein, R. (2013) Repair of strand breaks by homologous recombination. Csh Perspect Biol, 5, a012740.
62. Moynahan, M.E. and Jasin, M. (2010) Mitotic homologous recombination maintains genomic stability and suppresses tumorigenesis. Nat. Rev. Mol. Cell Biol., 11, 196-207.
63. Constantinesco, F., Forterre, P., Koonin, E., Aravind, L. and Elie, C. (2004) A bipolar DNA helicase gene, herA, clusters with rad50, mre11 and nurA genes in thermophilic archaea. Nucleic Acids Res., 32, 1439-1447.
64. Haseltine, C.A. and Kowalczykowski, S.C. (2009) An archaeal Rad54 protein remodels DNA and stimulates DNA strand exchange by RadA. Nucleic Acids Res., 37, 2757-2770.
65. Patoli, B.B., Winter, J.A., Patoli, A.A., Delahay, R.M. and Bunting, K.A. (2017) Co-expression and purification of the RadA recombinase with the RadB paralog from Haloferax volcanii yields heteromeric ring-like structures. Microbiology, 163, 1802-1811.
66. Zhai, B., DuPrez, K., Doukov, T.I., Li, H., Huang, M., Shang, G., Ni, J., Gu, L., Shen, Y. and Fan, L. (2017) Structure and Function of a Novel ATPase that Interacts with Holliday Junction Resolvase Hjc and Promotes Branch Migration. J. Mol. Biol., 429, 1009-1029.


67. Huang, Q., Li, Y., Zeng, C., Song, T., Yan, Z., Ni, J., She, Q. and Shen, Y. (2015) Genetic analysis of the Holliday junction resolvases Hje and Hjc in Sulfolobus islandicus. Extremophiles, 19, 505-514.
68. White, M.F. (2011) Homologous recombination in the archaea: the means justify the ends. Biochem. Soc. Trans., 39, 15-19.
69. Northall, S.J., Ivančić-Baće, I., Soultanas, P. and Bolt, E.L. (2016) Remodeling and control of homologous recombination by DNA helicases and translocases that target recombinases and synapsis. Genes, 7, 52.
70. Huang, Q., Liu, L., Liu, J., Ni, J., She, Q. and Shen, Y. (2015) Efficient 5′-3′ DNA end resection by HerA and NurA is essential for cell viability in the crenarchaeon Sulfolobus islandicus. BMC Mol. Biol., 16, 2.
71. Weller, G.R., Kysela, B., Roy, R., Tonkin, L.M., Scanlan, E., Della, M., Devine, S.K., Day, J.P., Wilkinson, A., d'Adda di Fagagna, F. et al. (2002) Identification of a DNA nonhomologous end-joining complex in bacteria. Science, 297, 1686-1689.
72. Sale, J.E. (2013) Translesion DNA synthesis and mutagenesis in eukaryotes. Csh Perspect Biol, 5, a012708.
73. Petruska, J. and Goodman, M.F. (1995) Enthalpy-entropy compensation in DNA melting thermodynamics. J. Biol. Chem., 270, 746-750.
74. McCulloch, S.D. and Kunkel, T.A. (2008) The fidelity of DNA synthesis by eukaryotic replicative and translesion synthesis polymerases. Cell Res., 18, 148.
75. Fuchs, R.P. and Fujii, S. (2013) Translesion DNA synthesis and mutagenesis in prokaryotes. Csh Perspect Biol, 5, a012682.
76. Goodman, M.F. and Woodgate, R. (2013) Translesion DNA polymerases. Cold Spring Harbor perspectives in biology, 5, a010363.
77. Kreuzer, K.N. (2013) DNA damage responses in prokaryotes: regulating gene expression, modulating growth patterns, and manipulating replication forks. Csh Perspect Biol, 5, a012674.
78. Wu, Y., Wilson, R.C. and Pata, J.D. (2011) The Y-family DNA polymerase Dpo4 uses a template slippage mechanism to create single-base deletions. J. Bacteriol., 193, 2630-2636.
79. Trakselis, M.A. and Bauer, R.J. (2014), Nucleic Acid Polymerases. Springer, pp. 139-162.
80. Sakofsky, C.J., Foster, P.L. and Grogan, D.W. (2012) Roles of the Y-family DNA polymerase Dbh in accurate replication of the Sulfolobus genome at high temperature. DNA Repair, 11, 391-400.
81. Choi, J.-Y., Eoff, R.L., Pence, M.G., Wang, J., Martin, M.V., Kim, E.-J., Folkmann, L.M. and Guengerich, F.P. (2011) Roles of the four DNA polymerases of the crenarchaeon Sulfolobus solfataricus and accessory proteins in DNA replication. J. Biol. Chem., 286, 31180-31193.
82. Jozwiakowski, S.K., Gholami, F.B. and Doherty, A.J. (2015) Archaeal replicative primases can perform translesion DNA synthesis. P Natl Acad Sci USA, 112, E633-E638.
83. Blackford, A.N. and Jackson, S.P. (2017) ATM, ATR, and DNA-PK: The Trinity at the Heart of the DNA Damage Response. Mol. Cell, 66, 801-817.
84. Sirbu, B.M. and Cortez, D. (2013) DNA Damage Response: Three Levels of DNA Repair Regulation. Csh Perspect Biol, 5.
85. Brown, J.S., O'Carrigan, B., Jackson, S.P. and Yap, T.A. (2017) Targeting DNA repair in cancer: beyond PARP inhibitors. Cancer discovery, 7, 20-37.


86. Blackford, A.N. and Jackson, S.P. (2017) ATM, ATR, and DNA-PK: The trinity at the heart of the DNA damage response. Molecular cell, 66, 801-817.
87. Bakkenist, C.J. and Kastan, M.B. (2003) DNA damage activates ATM through intermolecular autophosphorylation and dimer dissociation. Nature, 421, 499.
88. Lee, J.-H., Mand, M.R., Deshpande, R.A., Kinoshita, E., Yang, S.-H., Wyman, C. and Paull, T.T. (2013) Ataxia telangiectasia-mutated (ATM) kinase activity is regulated by ATP-driven conformational changes in the Mre11/Rad50/Nbs1 (MRN) complex. J. Biol. Chem., 288, 12840-12851.
89. Schiller, C.B., Lammens, K., Guerini, I., Coordes, B., Feldmann, H., Schlauderer, F., Mockel, C., Schele, A., Strasser, K., Jackson, S.P. et al. (2012) Structure of Mre11-Nbs1 complex yields insights into ataxia-telangiectasia-like disease mutations and DNA damage signaling. Nat. Struct. Mol. Biol., 19, 693-+.
90. Schiller, C.B., Lammens, K., Guerini, I., Coordes, B., Feldmann, H., Schlauderer, F., Möckel, C., Schele, A., Strässer, K. and Jackson, S.P. (2012) Structure of Mre11–Nbs1 complex yields insights into ataxia-telangiectasia–like disease mutations and DNA damage signaling. Nature Structural and Molecular Biology, 19, 693.
91. Shiloh, Y. and Ziv, Y. (2013) The ATM protein kinase: regulating the cellular response to genotoxic stress, and more. Nat Rev Mol Cell Bio, 14, 197-210.
92. Escribano-Díaz, C., Orthwein, A., Fradet-Turcotte, A., Xing, M., Young, J.T., Tkáč, J., Cook, M.A., Rosebrock, A.P., Munro, M. and Canny, M.D. (2013) A cell cycle-dependent regulatory circuit composed of 53BP1-RIF1 and BRCA1-CtIP controls DNA repair pathway choice. Mol. Cell, 49, 872-883.
93. Marechal, A. and Zou, L. (2013) DNA Damage Sensing by the ATM and ATR Kinases. Csh Perspect Biol, 5.
94. Simmons, L.A., Foti, J.J., Cohen, S.E. and Walker, G.C. (2008) The SOS regulatory network. EcoSal Plus, 2008.
95. Erill, I., Campoy, S. and Barbe, J. (2007) Aeons of distress: an evolutionary perspective on the bacterial SOS response. FEMS Microbiol. Rev., 31, 637-656.
96. Kreuzer, K.N. (2013) DNA Damage Responses in Prokaryotes: Regulating Gene Expression, Modulating Growth Patterns, and Manipulating Replication Forks. Cold Spring Harbor Perspectives in Biology, 5.
97. Soutoglou, E. and Misteli, T. (2008) Activation of the cellular DNA damage response in the absence of DNA lesions. Science, 320, 1507-1510.
98. You, Z., Bailis, J.M., Johnson, S.A., Dilworth, S.M. and Hunter, T. (2007) Rapid activation of ATM on DNA flanking double-strand breaks. Nat. Cell Biol., 9, 1311.
99. Cubeddu, L. and White, M.F. (2005) DNA damage detection by an archaeal single-stranded DNA-binding protein. J. Mol. Biol., 353, 507-516.
100. Napoli, A., Valenti, A., Salerno, V., Nadal, M., Garnier, F., Rossi, M. and Ciaramella, M. (2005) Functional interaction of reverse gyrase with single-strand binding protein of the archaeon Sulfolobus. Nucleic Acids Res., 33, 564-576.
101. Carpentieri, F., De Felice, M., De Falco, M., Rossi, M. and Pisani, F.M. (2002) Physical and functional interaction between the mini-chromosome maintenance-like DNA helicase and the single-stranded DNA binding protein from the crenarchaeon Sulfolobus solfataricus. J. Biol. Chem., 277, 12118-12127.
102. Wei, T., Zhang, S., Zhu, S., Sheng, D., Ni, J. and Shen, Y. (2008) Physical and functional interaction between archaeal single-stranded DNA-binding protein and the 5′–3′ nuclease NurA. Biochem. Biophys. Res. Commun., 367, 523-529.


103. Cadman, C.J. and McGlynn, P. (2004) PriA helicase and SSB interact physically and functionally. Nucleic Acids Res., 32, 6378-6387.
104. Quaiser, A., Constantinesco, F., White, M.F., Forterre, P. and Elie, C. (2008) The Mre11 protein interacts with both Rad50 and the HerA bipolar helicase and is recruited to DNA following gamma irradiation in the archaeon Sulfolobus acidocaldarius. BMC Mol. Biol., 9, 25.
105. Kish, A., Gaillard, J.C., Armengaud, J. and Elie, C. (2016) Post-translational methylations of the archaeal Mre11: Rad50 complex throughout the DNA damage response. Mol. Microbiol., 100, 362-378.
106. Delmas, S., Shunburne, L., Ngo, H.-P. and Allers, T. (2009) Mre11-Rad50 promotes rapid repair of DNA damage in the polyploid archaeon Haloferax volcanii by restraining homologous recombination. PLoS Genet., 5, e1000552.
107. Delmas, S., Duggin, I.G. and Allers, T. (2013) DNA damage induces nucleoid compaction via the Mre11-Rad50 complex in the archaeon Haloferax volcanii. Mol. Microbiol., 87, 168-179.
108. Greagg, M.A., Fogg, M.J., Panayotou, G., Evans, S.J., Connolly, B.A. and Pearl, L.H. (1999) A read-ahead function in archaeal DNA polymerases detects promutagenic template-strand uracil. Proceedings of the National Academy of Sciences, 96, 9045-9050.
109. Gehring, A.M. and Santangelo, T.J. (2017) Archaeal RNA polymerase arrests transcription at DNA lesions. Transcription, 8, 288-296.
110. Baliga, N.S., Bjork, S.J., Bonneau, R., Pan, M., Iloanusi, C., Kottemann, M.C., Hood, L. and DiRuggiero, J. (2004) Systems level insights into the stress response to UV radiation in the halophilic archaeon Halobacterium NRC-1. Genome Res., 14, 1025-1035.
111. Salerno, V., Napoli, A., White, M.F., Rossi, M. and Ciaramella, M. (2003) Transcriptional response to DNA damage in the archaeon Sulfolobus solfataricus. Nucleic Acids Res., 31, 6127-6138.
112. Fröls, S., Gordon, P.M., Panlilio, M.A., Duggin, I.G., Bell, S.D., Sensen, C.W. and Schleper, C. (2007) Response of the hyperthermophilic archaeon Sulfolobus solfataricus to UV damage. J. Bacteriol., 189, 8708-8718.
113. Götz, D., Paytubi, S., Munro, S., Lundgren, M., Bernander, R. and White, M.F. (2007) Responses of hyperthermophilic crenarchaea to UV irradiation. Genome biology, 8, R220.
114. Rogozin, I.B., Makarova, K.S., Pavlov, Y.I. and Koonin, E.V. (2008) A highly conserved family of inactivated archaeal B family DNA polymerases. Biology direct, 3, 32.
115. Grogan, D.W. (1996) Exchange of genetic markers at extremely high temperatures in the archaeon Sulfolobus acidocaldarius. J. Bacteriol., 178, 3207-3211.
116. Schmidt, K.J., Beck, K.E. and Grogan, D.W. (1999) UV stimulation of chromosomal marker exchange in Sulfolobus acidocaldarius: implications for DNA repair, conjugation and homologous recombination at extremely high temperatures. Genetics, 152, 1407-1415.
117. Fröls, S., Ajon, M., Wagner, M., Teichmann, D., Zolghadr, B., Folea, M., Boekema, E.J., Driessen, A.J., Schleper, C. and Albers, S.V. (2008) UV-inducible cellular aggregation of the hyperthermophilic archaeon Sulfolobus Solfataricus is mediated by pili formation. Mol. Microbiol., 70, 938-952.


118. Ajon, M., Fröls, S., van Wolferen, M., Stoecker, K., Teichmann, D., Driessen, A.J., Grogan, D.W., Albers, S.V. and Schleper, C. (2011) UV-inducible DNA exchange in hyperthermophilic archaea mediated by type IV pili. Mol. Microbiol., 82, 807-817.
119. van Wolferen, M., Wagner, A., van der Does, C. and Albers, S.V. (2016) The archaeal Ced system imports DNA. P Natl Acad Sci USA, 113, 2496-2501.
120. Wagner, A., Whitaker, R.J., Krause, D.J., Heilers, J.H., van Wolferen, M., van der Does, C. and Albers, S.V. (2017) Mechanisms of gene flow in archaea. Nat. Rev. Microbiol., 15, 492-501.
121. Han, W. and Institut, K.U.D.N.-o.B.F.B. (2015) Studies on DNA Damage Response in Sulfolobus Islandicus. University of Copenhagen, Faculty of Science, Department of Biology.
122. Hjort, K. and Bernander, R. (1999) Changes in cell size and DNA content in Sulfolobus cultures during dilution and temperature shift experiments. J. Bacteriol., 181, 5669-5675.
123. Jansson, B.M., Malandrin, L. and Johansson, H.E. (2000) Cell cycle arrest in archaea by the hypusination inhibitor N1-guanyl-1,7-diaminoheptane. J. Bacteriol., 182, 1158-1161.
124. Hjort, K. and Bernander, R. (2001) Cell cycle regulation in the hyperthermophilic crenarchaeon Sulfolobus acidocaldarius. Mol. Microbiol., 40, 225-234.
125. Lundgren, M., Andersson, A., Chen, L.M., Nilsson, P. and Bernander, R. (2004) Three replication origins in Sulfolobus species: Synchronous initiation of chromosome replication and asynchronous termination. P Natl Acad Sci USA, 101, 7046-7051.
126. Peeters, E., Driessen, R.P.C., Werner, F. and Dame, R.T. (2015) The interplay between nucleoid organization and transcription in archaeal genomes. Nat. Rev. Microbiol., 13, 333-341.
127. Werner, F. and Grohmann, D. (2011) Evolution of multisubunit RNA polymerases in the three domains of life. Nat. Rev. Microbiol., 9, 85.
128. Martinez-Pastor, M., Tonner, P.D., Darnell, C.L. and Schmid, A.K. (2017) Transcriptional Regulation in Archaea: From Individual Genes to Global Regulatory Networks. Annu. Rev. Genet., 51, 143-170.
129. Lu, Q., Han, J., Zhou, L., Coker, J.A., DasSarma, P., DasSarma, S. and Xiang, H. (2008) Dissection of the regulatory mechanism of a heat-shock responsive promoter in Haloarchaea: a new paradigm for general transcription factor directed archaeal gene regulation. Nucleic Acids Res., 36, 3031-3042.
130. Facciotti, M.T., Reiss, D.J., Pan, M., Kaur, A., Vuthoori, M., Bonneau, R., Shannon, P., Srivastava, A., Donohoe, S.M., Hood, L.E. et al. (2007) General transcription factor specified global gene regulation in archaea. P Natl Acad Sci USA, 104, 4630-4635.
131. Werner, F. (2007) Structure and function of archaeal RNA polymerases. Mol. Microbiol., 65, 1395-1404.
132. Paytubi, S. and White, M.F. (2009) The crenarchaeal DNA damage-inducible transcription factor B paralogue TFB3 is a general activator of transcription. Mol. Microbiol., 72, 1487-1499.
133. Han, W.Y., Xu, Y.Q., Feng, X., Liang, Y.X., Huang, L., Shen, Y.L. and She, Q.X. (2017) NQO-Induced DNA-Less Cell Formation Is Associated with Chromatin Protein Degradation and Dependent on A(0)A(1)-ATPase in Sulfolobus. Frontiers in Microbiology, 8, 1-12.
134. Chen, H.-T. and Hahn, S. (2003) Binding of TFIIB to RNA polymerase II: mapping the binding site for the TFIIB zinc ribbon domain within the preinitiation complex. Mol. Cell, 12, 437-447.
135. Zeng, Q.D., Lewis, L.M., Colangelo, C.M., Dong, J. and Scott, R.A. (1996) A transcription factor IIB homolog from the hyperthermophilic archaeon Pyrococcus furiosus binds Zn or Fe in N-terminal Cys(4) motif. J Biol Inorg Chem, 1, 162-168.
136. Le, T.N., Wagner, A. and Albers, S.-V. (2017) A conserved hexanucleotide motif is important in UV-inducible promoters in Sulfolobus acidocaldarius. Microbiology, 163, 778-788.
137. Samson, R.Y., Xu, Y., Gadelha, C., Stone, T.A., Faqiri, J.N., Li, D., Qin, N., Pu, F., Liang, Y.X., She, Q. et al. (2013) Specificity and function of archaeal DNA replication initiator proteins. Cell reports, 3, 485-496.
138. Frols, S., White, M.F. and Schleper, C. (2009) Reactions to UV damage in the model archaeon Sulfolobus solfataricus. Biochem. Soc. Trans., 37, 36-41.


Nucleic Acids Research, 2018
doi: 10.1093/nar/gky236

A transcriptional factor B paralog functions as an activator to DNA damage-responsive expression in archaea

Xu Feng1,2, Mengmeng Sun2, Wenyuan Han2, Yun Xiang Liang1 and Qunxin She1,2,*

1State Key Laboratory of Agricultural Microbiology and College of Life Science and Technology, Huazhong Agricultural University, 430070 Wuhan, China and 2Archaea Centre, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 Copenhagen N, Denmark

Received February 27, 2018; Revised March 17, 2018; Editorial Decision March 19, 2018; Accepted March 20, 2018

*To whom correspondence should be addressed. Tel: +45 532 2013; Fax: +45 3532 2128; Email: [email protected]

© The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]

Downloaded from https://academic.oup.com/nar/advance-article-abstract/doi/10.1093/nar/gky236/4956648 by Det Kongelige Bibliotek user on 07 June 2018. Nucleic Acids Research, 2018

ABSTRACT

Previously it was shown that UV irradiation induces a strong upregulation of tfb3 coding for a paralog of the archaeal transcriptional factor B (TFB) in Sulfolobus solfataricus, a crenarchaeon. To investigate the function of this gene in DNA damage response (DDR), tfb3 was inactivated by gene deletion in Sulfolobus islandicus and the resulting Δtfb3 was more sensitive to DNA damage agents than the original strain. Transcriptome analysis revealed that a large set of genes show TFB3-dependent activation, including genes of the ups operon and ced system. Furthermore, the TFB3 protein was found to be associated with DDR gene promoters and functional dissection of TFB3 showed that the conserved Zn-ribbon and coiled-coil motifs are essential for the activation. Together, the results indicated that TFB3 activates the expression of DDR genes by interaction with other transcriptional factors at the promoter regions of DDR genes to facilitate the formation of the transcription initiation complex. Strikingly, TFB3 and Ced systems are present in a wide range of crenarchaea, suggesting that the Ced system functions as a primary DNA damage repair mechanism in Crenarchaeota. Our findings further suggest that TFB3 and the concurrent TFB1 form a TFB3-dependent DNA damage-responsive circuit with their target genes, which is evolutionarily conserved in the major lineage of Archaea.

INTRODUCTION

RNA transcription is the first step of decoding genetic information from DNA, and RNA polymerase (RNAP), a multi-protein complex that is responsible for the process, is evolutionarily conserved in all three domains of life. RNAPs in archaea and eukaryotes have 12 or more subunits, which are larger than the bacterial counterparts consisting of 5 subunits (1). Studies on transcription initiation show that the two types of RNAP use different mechanisms to initiate gene transcription. In bacteria, the bacterial RNAP and sigma factor form the holoenzyme, in which the sigma subunit recognizes promoters and binds there to form a pre-initiation complex (PIC) (2). RNA transcription in archaea is more related to the process by RNAP II, the enzyme that is responsible for synthesis of mRNA in eukaryotes. The archaeal transcription starts with recognition of a promoter by the TATA-binding protein and transcriptional factor B (TFB), and then, RNAP is recruited to the promoter along with the general transcriptional factor E to form the PIC (3,4).

Current investigation on archaeal transcriptional regulation has revealed that the eukaryotic-like transcriptional machinery is primarily controlled by the promoter-centered mode of regulation; transcription factors specifically bind to DNA motifs present on gene promoter regions and regulate the gene transcription by affecting the PIC formation on the promoters (5,6). Transcriptional factors of both bacterial and eukaryotic types have been identified in archaea and function in gene-specific regulation (7–9). In addition, many archaeal genomes encode multiple TBP and/or TFB (10), and for this reason, archaea have the potential to exploit these basal transcriptional factors to exert global regulation in analogy to sigma factors in bacterial transcription.

Indeed, early works on two TFB paralogs (TFB1 and TFB2) of Thermococcus kodakarensis reveal that each of them can support transcription in vitro without any apparent selectivity on promoter, and neither of them is essential for cell growth (11). Nevertheless, tfb1 is expressed to a higher level in T. kodakarensis and supports better cell growth, relative to tfb2 (12). Furthermore, characterization of Pyrococcus furiosus TFB1 and TFB2 shows that the two factors have different capability to further RNA transcription in vitro (13). Sulfolobus solfataricus and Sulfolobus acidocaldarius encode three paralogs of TFB protein (TFB1, -2 and -3) among which TFB3 is a truncated version of archaeal TFBs (14,15). Investigation of their expression in these archaea shows that only tfb3 is strongly upregulated by UV irradiation (16,17). In vitro experiments indicate that the S. solfataricus TFB3, TBP and TFB1 form a complex in the presence of a promoter DNA fragment, and it was thus suggested that TFB3 can function as a general activator to gene transcription in this archaeon (18). Nevertheless, it remains elusive what genes are to be regulated by TFB3 and whether the truncated version of archaeal TFB could regulate UV-responsive expression in these archaea.

UV light and other DNA damage agents have been used in investigation of DNA damage responsive regulation of genome expression in Bacteria and Eukarya. These studies have revealed a series of coordinated cellular and molecular events in these organisms to prevent accumulation of DNA lesions and to facilitate maintenance of genome integrity, and the revealed network of these cellular events is collectively called the DNA damage response (DDR) (19–22). Here Sulfolobus islandicus, a genetic model for which a complete genetic toolbox has been developed (23), was used for investigation of TFB3 function in the archaeal DDR. We found that TFB3 is associated with promoters of DDR genes and it plays an essential role in regulating a number of genes involved in cell aggregation or/and intercellular DNA exchange in this archaeon.

MATERIALS AND METHODS

Cell growth and DNA damage treatment

The S. islandicus strains, i.e. E233S1, the wild-type (WT) strain and the tfb3 mutant as well as their plasmid-carrying derivatives (Supplementary Table S1) were grown at 78 °C in SCV media (Basal media supplemented with 0.2% sucrose, 0.2% Casamino acids and 1% vitamin solution), and uracil was added to 20 µg/ml if required (24).

For DNA damage treatment with chemicals, exponentially growing cultures of S. islandicus strains (OD600 = 0.2) were supplemented with one of the following drugs, including 4-nitroquinoline-1-oxide (NQO), methyl methanesulfonate (MMS), cisplatin and hydroxyurea (HU) at the concentrations indicated in each experiment. UV irradiation was conducted by placing 30 ml of culture in a petri dish of 9 cm in diameter and irradiating it at an energy setting of 200 J/m² using the UV Stratalinker 1800 (Stratagene, USA). The treated cultures were incubated for the time periods indicated in each experiment, during which cell samples were taken for OD600 determination, cell aggregation analysis, extraction of total RNAs and preparation of cell extracts as described in individual experiments.

Construction of a tfb3 in-frame deletion mutant of S. islandicus

The CRISPR-based genetic manipulation recently developed in our laboratory (25) was employed to construct a tfb3 mutant using S. islandicus REY15A (26). The genome-editing plasmid carried the mini-CRISPR array containing a spacer derived from the target site in the tfb3 gene and the donor DNA of the tfb3 gene allele (Supplementary Figure S1). The target site started with the protospacer adjacent motif (CCN) positioned 138 bp after the start codon of the tfb3 gene, and the immediately adjacent 40-nt sequence was used as the spacer. Oligos designed for construction of the spacer of an artificial mini-CRISPR plasmid (pAC) and the donor DNA of the genome editing plasmid (pGE) are listed in Supplementary Table S2. The pAC-tfb3 plasmid was constructed as described (27). Then, the donor DNA was generated by splicing and overlapping extension (SOE)-PCR (28) and inserted into pAC-tfb3, yielding pGE-tfb3 plasmid. The genome-editing plasmid was then introduced into competent cells of E233S1 by electroporation, giving transformants on selective plates, which should carry the designed tfb3 allele. The identity of deletion mutants was revealed by polymerase chain reaction (PCR) amplification of the tfb3 allele and verified by DNA sequencing. Plasmids were cured from deletion mutants by pyrEF counter selection with uracil and 5-FOA, yielding Δtfb3 for further experiments.

Construction of plasmids for expression of mutated tfb3 genes

The WT tfb3 gene and the mutant derivative coding for the N-terminal 49 amino acids (Zn ribbon) were directly amplified from the genomic DNA of S. islandicus REY15A (26) by PCR using primer sets of tfb3pro-f/tfb3WT-r and tfb3pro-f/tfb3N-Zn-r, respectively. The resulting DNA fragments contained the native promoter region and the specific coding sequence of WT TFB3 and TFB3N-Zn. The remaining three mutants carried point mutations in tfb3, which were obtained by SOE-PCR, following the reported procedure (28). Specifically, tfb3M1 carrying substitutions in the Coil 1 motif (R145A, K146A) was generated with the primer sets of tfb3WT-f/tfb3CoilM1-SOE-r and tfb3CoilM1-SOE-f/tfb3WT-r whereas tfb3M2 carrying the CoilM2 mutation (L148A, K149A, L151A) was amplified with tfb3WT-f/tfb3CoilM2-SOE-r and tfb3CoilM2-SOE-f/tfb3WT-r. The tfb3C3C25 mutant harboring substitutions C3S and C25T was generated with the primer sets of C3SC25T-f/C3SC25T-SOE-r and C3SC25T-SOE-f/tfb3WT-r. All primers for SOE-PCR are listed in Supplementary Table S2. The resulting DNA fragments were cleaved with NdeI and SalI and, together with the native tfb3 promoter DNA amplified with tfb3pro-f/tfb3pro-r (SphI/NdeI), were cloned to pSeSD1 (29), the Sulfolobus expression vector, at SphI/SalI sites in a ligation of three DNA fragments, yielding expression plasmids for each mutant tfb3 gene (pSeptfb3-tfb3N-Zn, pSeptfb3-tfb3M1, pSeptfb3-tfb3M2 and pSeptfb3-tfb3C3S-C25T (Supplementary Table S1)). All mutations on the expression plasmids were verified by DNA sequencing. Importantly, all tfb3 mutant genes still contained the native promoter of tfb3.

Each expression plasmid was introduced into the S. islandicus tfb3 mutant by electroporation, yielding strains expressing either N-Zn, or CoilM1, or CoilM2, or C3S-C25T or the WT TFB3. The resulting strains were then used for further experiments.

Western blot analysis

Cell sample of 15 ml culture was pelleted by centrifugation and re-suspended in 150 µl TBST buffer (50 mM Tris–HCl, 100 mM NaCl, 0.1% Tween-20, pH 7.6). Sonication


of the cell suspension gave total cell extracts, which were fractionated by sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE). Protein bands on the gel were transferred onto a nitrocellulose membrane using a Trans-Blot Semi-Dry Transfer Cell (Bio-Rad). For immunoblotting, the membrane was immersed in 5% skim milk blocking agent, then incubated with individual primary antibodies and finally with corresponding secondary antibodies. TFB3 antisera were raised against S. solfataricus TFB3 (18) (kindly provided by Prof. Malcolm F. White) whereas PCNA3 antisera against S. islandicus PCNA3 were reported previously. Secondary antibodies were purchased from Thermo Fisher Scientific, and hybridization signals were detected using the ECL western blot substrate (Thermo Fisher Scientific), and visualized by exposure of the membrane to an X-ray film (Agfa HealthCare, Belgium).

Transcriptome analysis

Exponentially growing cultures of E233S1 and Δtfb3 strains (OD600 = 0.2) were grown for 6 h in the presence or absence of NQO (6 hpt). Cells in each culture were harvested by centrifugation and used for RNA extraction using the TRIzol reagent (Invitrogen, Carlsbad, CA, USA) by following the instruction of the manufacturer. The quality of total RNA preparations was controlled using NanoDrop 1000 and 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). RNA preparations of high quality were used for construction of RNA-Seq libraries and next generation sequencing in BGI (Shenzhen, China) using Illumina HiSeq™ 4000.

A total of ca. 12 million high-quality sequence reads were mapped to the reference genome of S. islandicus Rey15A, giving genome coverage of 320- to 400-fold for different samples. The gene expression level was calculated by using the RPKM method (reads per kb per million reads) (30) and the differentially expressed genes (DEGs) were identified by comparing treated and corresponding untreated samples. The FDR (false discovery rate) was applied to determine the threshold of p value in multiple tests. FDR ≤ 0.001 and the absolute value of log2 ratio ≥ 1 were used to quantify the significance of gene expression difference.

To validate the RNA seq results, seven genes (upsX, upsA, cedB, SiRe_1957, dpo2, cdvB1 (SiRe_1550) and SiRe_0187), including those that were either upregulated or downregulated by NQO, were selected for quantitative reverse transcriptase PCR (qRT-PCR) analysis. Correlation values between the results of RNA seq and qPCR were 0.97 for the slope and 0.93 for R-value (Supplementary Figure S2), indicating the two datasets are strongly correlated.

Real-time quantitative PCR analysis

Before cDNA synthesis, residual DNA in each total mRNA preparation (5 µg) was removed by DNase I treatment. The first strand cDNA synthesis was conducted using Revert Aid First Strand cDNA Synthesis Kit (Thermo Scientific), and the resulting cDNA was diluted and then used as the template for quantitative PCR analysis. qPCR reaction was conducted with a CFX96 real-time PCR system (Bio-Rad, Hercules, CA, USA), using Maxima SYBR Green/ROX qPCR Master Mix (Thermo Scientific). Primers employed in qPCR are listed in Supplementary Table S2. Relative amounts of RNAs were calculated using the comparative Ct method by using 16S rRNA as the reference gene (31).

Chromatin immunoprecipitation

Chromatin immunoprecipitation (ChIP) experiment was conducted as previously described with minor modifications (32). Cell samples for ChIP analysis were first cooled to room temperature, and then, formaldehyde was added to each cooled culture to the final concentration of 1% and incubated at RT for 10 min for crosslinking. Glycine was added to 125 mM for another 5 min to quench the cross-link reaction. Cells were then collected by centrifugation, and the resulting cell pellet was washed with PBS buffer (137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 2 mM KH2PO4), and resuspended again in 3 ml TBSTT buffer (20 mM Tris–HCl, 150 mM NaCl, 0.1% Tween 20, 0.1% Triton X-100, pH 7.5). Cells in the suspension were disrupted by French press, and DNA fragments in the cell extracts were sheared by sonication to yield a size range of 200–1000 bp of genomic DNAs. Immunoprecipitations were performed by incubating 5 µl antibody or corresponding serum with 100 µg of cell extract in 3 ml TBSTT at 4 °C overnight, with gentle rotation. Then, 50 µl DynaBeads of protein G (Thermo Fisher Scientific) was added and incubated for 2 h. Immune complexes were captured by the magnet and washed first with the TBSTT buffer, then with TBSTT containing 500 mM NaCl, and finally with TBSTT containing 0.5% Tween 20 and Triton X-100, followed by three washes with TBSTT. The immune complexes were eluted by resuspending the beads in 100 µl elution buffer (20 mM Tris (pH 7.8), 10 mM ethylenediaminetetraacetic acid and 1% SDS) and heating up to 65 °C for 30 min. Eluted samples were first treated with 10 µg/ml proteinase K for 6 h at 65 °C and 10 h at 37 °C to remove proteins cross-linked to DNA, and then incubated with RNase A for 2 h at 37 °C to remove RNAs in the samples. Finally, the treated samples were extracted with phenol/chloroform/isoamyl alcohol (25:24:1) and chloroform, and DNAs in the samples were precipitated with cold ethanol in the presence of 20 µg glycogen. The DNA preparations were then used for qPCR analysis.

Cell aggregation analysis

Exponentially growing cultures of S. islandicus strains (OD600 = 0.2) were subjected to different DNA damage treatments. Samples with UV irradiation were taken 3 h post-treatment, while for the treatment with other DNA damaging agents, the samples were taken at 10 hpt. Sampling time in NQO treatment experiments was delayed, relative to that in the UV experiments, because NQO itself is not toxic; its toxicity to the cell has to be generated by cellular metabolic activities (33). As a result, it takes a longer time to reach the maximal level of cell aggregation. Fresh samples were transferred to a glass slide, covered with a coverslip and directly observed under a Nikon Eclipse Ti-E inverted microscope (Nikon, Kobe, Japan) and pictures were taken from a camera connected to the microscope. Quantification data were


obtained from at least 12 fields of view and 500 single cells for each cell sample and from three independent experiments.
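The comparative Ct calculation named under 'Real-time quantitative PCR analysis' can be illustrated with a minimal Python sketch. It assumes an amplification efficiency of about 100% (one doubling per cycle), which the text does not state; the function name and the Ct values are hypothetical, and 16S rRNA plays the role of the reference gene.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target transcript by the comparative Ct (2^-ddCt) method.

    Assumption (not from the source): each PCR cycle doubles the product.
    """
    # Normalize the target Ct to the reference gene (e.g. 16S rRNA) in each sample.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # A lower normalized Ct in the treated sample means more transcript.
    delta_delta_ct = delta_treated - delta_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values for illustration only.
fold = relative_expression(ct_target_treated=20.0, ct_ref_treated=10.0,
                           ct_target_control=25.0, ct_ref_control=10.0)
print(f"Relative expression: {fold:.1f}-fold")  # prints "Relative expression: 32.0-fold"
```

With these made-up numbers the target crosses the threshold five cycles earlier in the treated sample after normalization, which corresponds to a 32-fold increase.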

Cell viability analysis

Exponentially growing cultures of S. islandicus (OD600 = 0.2) were treated with NQO at a final concentration of 0–2 µM and incubated for 24 h. Then, 1 ml of culture was taken, and cells were pelleted by centrifugation and resuspended in 1 ml fresh SCV medium. The resulting cell suspensions were serially diluted and plated using the two-layer plating method (24). A total of 100 µl of each diluted sample was plated onto gelrite plates in triplicate. Colonies that appeared on plates after 7 days of incubation were counted, giving colony formation units (CFUs) per ml culture.
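As a rough illustration of how plate counts translate into CFUs per ml and into a survival rate relative to the untreated culture, the following Python sketch may help. The function names, colony counts and dilution factors are hypothetical; only the 100 µl plating volume and the triplicate plating come from the procedure above.

```python
def cfu_per_ml(colony_counts, dilution_factor, plated_volume_ul=100):
    """Estimate CFU/ml from replicate plate counts of one dilution.

    dilution_factor is the fold dilution of the plated sample (e.g. 10_000
    for a 10^-4 dilution); plated_volume_ul is the volume spread per plate.
    """
    mean_colonies = sum(colony_counts) / len(colony_counts)
    # Scale from the plated volume (in microlitres) to 1 ml of undiluted culture.
    return mean_colonies * dilution_factor * 1000 / plated_volume_ul


def survival_percent(cfu_treated, cfu_untreated):
    """Survival of a treated culture, with the untreated culture set to 100%."""
    return 100.0 * cfu_treated / cfu_untreated


# Hypothetical triplicate counts for one strain, untreated and NQO-treated.
untreated = cfu_per_ml([50, 48, 52], dilution_factor=10_000)
treated = cfu_per_ml([15, 14, 16], dilution_factor=10_000)
print(f"Survival: {survival_percent(treated, untreated):.0f}%")
```

Each strain is normalized to its own untreated culture, so WT and mutant survival curves can be compared even if their absolute CFU counts differ.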

RESULTS

Expression of tfb3 is highly induced in Sulfolobus islandicus upon NQO treatment

To investigate if the expression of tfb3 in S. islandicus could be activated by the treatment of NQO, a chemical that yields bulky adducts on bases of DNA (33,34), the WT S. islandicus strain was grown for 9 h in NQO-containing media (hours post-treatment, hpt) during which cell samples were taken at a 3 h interval. Each cell sample was then divided into two portions: one was used for extraction of total RNAs while the other was for preparation of cell extracts for immunoblotting analysis. Cellular levels of tfb3 mRNAs in the samples were estimated by real-time quantitative reverse transcription PCR (qRT-PCR) as described in 'Materials and Methods' section. We found that, while the expression level of tfb3 in the untreated reference remained almost constant, the expression of the gene was elevated by >40-fold at all three tested time points (Figure 1A). Western analysis of the S. islandicus TFB3 using antibodies raised against the S. solfataricus TFB3 (a gift from Prof. M. F. White) revealed that a protein band of ca. 20 kDa was specifically recognized, which was present in a large amount in NQO-treated samples but was barely detectable in the untreated samples (Figure 1B).

To estimate the TFB3 protein level after induction, cell extracts were diluted 8- and 16-fold and the diluted cell extracts were then used for SDS-PAGE and for immunoblotting analysis. Since signals of TFB3 hybridization obtained from 16-fold diluted samples were equivalent to, or stronger than, that in the undiluted cell sample of the reference culture, the NQO-treated cells were estimated to contain >16-fold of TFB3 protein, relative to the untreated cells (Supplementary Figure S3). Therefore, we concluded that tfb3 gene expression is strongly upregulated in S. islandicus cells upon NQO treatment.

Figure 1. NQO-induced upregulation of tfb3 in Sulfolobus islandicus. (A) qRT-PCR analysis of tfb3 mRNA in NQO-treated and untreated S. islandicus cells. Data were normalized to the level of 16S rRNA. Error bars represent the standard deviation values of three independent replicates. (B) Western blot analysis of TFB3 protein in the cell extracts of NQO-treated and untreated S. islandicus using PCNA as a reference. S. islandicus E233S1 (WT) strain was grown in the absence (−NQO) or presence of 2 µM NQO (+NQO) for 9 h (hours post-treatment, hpt) during which cell samples were taken at the indicated time points and used for mRNA extraction (for qPCR) and cell extracts preparation (western analysis).

TFB3 deficiency by gene deletion reduced cell viability of S. islandicus upon NQO treatment

To investigate the function of TFB3, an in-frame deletion mutant of tfb3 gene was constructed for S. islandicus, using the CRISPR-assisted gene deletion procedure (Supplementary Figure S1). The WT and mutant alleles of tfb3 are shown in Figure 2A, and the absence of the tfb3 gene in the mutant was confirmed by PCR with the primer sets and genomic DNAs of the two strains (Figure 2B). Furthermore, immunoblotting analysis of TFB3 protein in NQO-treated cell extracts using antibodies raised against S. solfataricus TFB3 revealed that TFB3 was present in the cell extracts of the WT strain in a large quantity but absent from Δtfb3 (Figure 2C), consistent with their genotypes.

The mutant was employed for investigation of its resistance to NQO as described above for WT. To do that, the WT strain and Δtfb3 were grown in the media containing 0, 0.5, 0.75, 1, 1.5 or 2 µM of NQO for 24 h. Investigation of TFB3 expression in cell samples taken from the above cultures by immunoblotting revealed that the gene was strongly activated at all tested concentrations of NQO (Supplementary Figure S4). Growth data of these cultures showed that Δtfb3 did not exhibit any obvious growth defect either in the absence or presence of NQO, relative to the corresponding WT cultures (Supplementary Figure S5). Nevertheless, determination of their cell viability by plating for CFUs revealed that the survival rate of Δtfb3 was always lower than that of the WT strain in all tested drug concentrations and the largest difference occurred for the treatment with 0.75–1.0 µM NQO (30–37%, Figure 3). Therefore, Δtfb3 exhibited a higher sensitivity to the DNA damage agent than the WT strain.


Figure 2. Construction of a tfb3 in-frame deletion mutant of S. islandicus. (A) Schematic of PCR identification of the WT tfb3 gene and Δtfb3 alleles in S. islandicus strains. (B) Verification of the tfb3 genotype by PCR amplification of the tfb3 gene alleles (PCR). Primer set of F1/R1 was designed to check the gene deletion whereas primer set of F2/R2 was designed to amplify an internal DNA fragment in the tfb3 gene. (C) Western blot analysis of TFB3 protein in NQO-treated cultures of S. islandicus. Immunoblotting was conducted with the TFB3 antibody raised against the S. solfataricus TFB3 protein (α-TFB3). WT: S. islandicus E233S1, the genetic host; Δtfb3: tfb3 deletion mutant constructed with E233S1.

Figure 3. Survival rates of the wild-type S. islandicus and Δtfb3 upon NQO treatment. Cell viability was estimated for S. islandicus E233S1 (WT) strain and Δtfb3 mutant at 24 h post-NQO addition by determination of CFUs. CFU of WT-0 µM and Δtfb3-0 µM was assigned to 100% for each strain, with which the survived fractions of cells in other cultures were calculated individually.

TFB3 as the activator of a subset of NQO-responsive genes in S. islandicus

Then, we investigated how TFB3 deficiency could influence the genome expression in this archaeon by transcriptome analysis using RNA sequencing. Both the deletion mutant and the WT strain were grown in presence or absence of NQO for 6 h. Cell mass was collected from each culture from which the total mRNAs were extracted, giving RNA preparations for conducting RNA-seq analysis. First, the transcriptome analysis of the WT strain led to the identification of 313 DEGs upon NQO treatment (>2-fold change), including 139 upregulated genes and 174 downregulated genes (Supplementary Table S3). The same analysis with total RNAs prepared from corresponding Δtfb3 culture revealed that 78 upregulated genes and 8 downregulated ones were TFB3 dependent (Supplementary Table S3). Among the 61 genes that showed NQO-upregulation of >4-fold, 51 genes showed TFB3-dependent activation (Supplementary Tables S3 and 4). The most upregulated ones were those in the ups operon and all known ced genes (cedA1, cedA and cedB). Other regulated genes included those that code for potential DNA transfer related processes, transporter-related or membrane-associated proteins, all of which can be implicated in DNA damage repair. Together, these results indicated that TFB3 regulates a subset of DDR-activated DNA repair genes. A few genes showed TFB3-dependent downregulation and they all code for a hypothetical protein, and the negative regulation could be indirect (see 'Discussion' section). Therefore, we concluded that TFB3 functions as an activator to the transcription of a large subset of NQO-responsive genes in S. islandicus.
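The DEG criteria given in 'Materials and Methods' (RPKM values, FDR ≤ 0.001 and an absolute log2 ratio ≥ 1, i.e. a >2-fold change) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and all numbers are hypothetical, the FDR is taken as a precomputed input because the text does not say how it was derived, and both RPKM values are assumed to be non-zero.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def classify_gene(rpkm_treated, rpkm_untreated, fdr,
                  fdr_cutoff=0.001, log2_cutoff=1.0):
    """Label a gene 'up', 'down' or 'unchanged' using the two thresholds.

    Assumes both RPKM values are greater than zero (no pseudocount is added).
    """
    log2_ratio = math.log2(rpkm_treated / rpkm_untreated)
    if fdr <= fdr_cutoff and abs(log2_ratio) >= log2_cutoff:
        return "up" if log2_ratio > 0 else "down"
    return "unchanged"


# Hypothetical gene: 1 kb long, 1200 reads out of 12 million mapped reads.
print(rpkm(read_count=1200, gene_length_bp=1000, total_mapped_reads=12_000_000))
print(classify_gene(rpkm_treated=400.0, rpkm_untreated=100.0, fdr=0.0001))
```

Running the same classification on the WT and on the deletion mutant, and comparing the two labels per gene, is one way to separate TFB3-dependent from TFB3-independent responses.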


TFB3 deficiency resulted in the loss of the DNA damage-induced cell aggregation

Since the activation of the expression of ups and ced genes was both NQO-responsive and TFB3-dependent, this suggested that Δtfb3 could have lost the capability of cell aggregation. To test that, untreated and NQO-treated cultures of Δtfb3 and E233S1, the WT reference, were grown in SCV or in SCV+NQO for 15 h during which cell samples were taken for microscopic examination. As shown in Figure 4, while no cell aggregates were observed for the WT strain, nor for Δtfb3 grown in SCV media, >50% of WT cells formed cell aggregates at 12 hpt when grown in SCV+NQO media and the portion of the cells in cell aggregates increased to 82% at 15 hpt. In contrast, no cell aggregates were observed in the corresponding Δtfb3 culture (Figure 4). These results are consistent with the lack of activation of the expression from the ups operon in the mutant as revealed from the RNA seq analysis.

To date, both UV irradiation and NQO treatment are capable of inducing the formation of cell aggregates in different Sulfolobus species. Since they both are DNA damage agents, this raised a question if other DNA damage agents could also trigger cell aggregation in this archaeon. To test that, three additional drugs were studied for their effects on S. islandicus, including cisplatin that makes cross-link lesions on dsDNA (35), MMS that produces base-alkylating lesions on DNA (36) and HU that induces cell cycle arrests and replication stresses (37). Sulfolobus islandicus WT strain and Δtfb3 were grown in the presence or absence of each drug for 6 h (or 3 h post-UV irradiation). Cell samples were taken and analyzed for the expression of TFB3 protein by immunoblotting analysis, and this revealed that the protein was present in a large amount in the cells treated with cisplatin, or MMS or UV irradiation, as for NQO treatment; however, HU treatment did not influence the tfb3 expression (Supplementary Figure S6A).

Investigation of cell aggregation by microscopy showed that cell aggregates were observed in S. islandicus cultures treated with all these DNA damage agents but absent from the culture of HU treatment as well as from Δtfb3 cultures treated with each of the tested drugs (Supplementary Figure S6B), consistent with the upregulation of tfb3 gene expression observed for the former category of drug, but not for the latter one. Together these results indicated that DNA damage treatment triggers the expression of TFB3, which in turn functions as a DNA damage-inducible regulator to activate the expression of a large number of genes involved in DNA damage repair in S. islandicus.

Figure 4. S. islandicus Δtfb3 mutant lost the capability of forming NQO-induced cell aggregation. (A) Microscopy of cells and cell aggregates in samples taken from the cultures of the WT strain (E233S1) and Δtfb3 mutant. Each strain was grown in the absence (−NQO) or presence (+NQO) of NQO for 24 h, and aliquots of cultures were placed on glass slides, covered with coverslips and directly observed under a phase-contrast microscope. 12 and 15 h: hours after NQO supplementation. (B) Quantification data of cell aggregation in the cell samples shown in panel A. At least 500 cells were analyzed for each cell sample.

TFB3 was specifically associated with the promoters of DDR genes

To yield an insight into the mode of the TFB3 regulation, we analyzed the association of the transcription factor with the promoters of NQO-responsive genes by ChIP-qPCR. The tested promoters include those of upsE, herA1, cedB, cdvB1, with the cmr-β promoter (promoter of the subtype III-B CRISPR-Cas cmr-β gene operon) and an intragenic region of 16S rRNA as references. Both untreated and NQO-treated samples were analyzed. DNA fragments enriched by ChIP with the TFB3 antisera were quantified by qPCR, and this revealed that the antisera specifically enriched DNA fragments containing the promoters of upsE, herA1 and cedB only in NQO-treated samples and the enrichment was 17-, 25- and 37-fold, respectively (Figure 5). Essentially no enrichment was observed from the ChIP-qPCR experiments with the same promoters in the untreated samples, and furthermore, the same analysis for the promoters of cdvB1, a gene exhibiting TFB3-independent downregulation, and two references, i.e. the cmr-β promoter and an internal fragment of 16S rRNA gene, with and without NQO treatment, did not yield any difference between the untreated and NQO-treated samples (Figure 5). These results suggested that TFB3 could form complexes with other transcription factors on the promoters of NQO-responsive genes upon DNA damage treatment and activate gene expression probably by recruiting RNAP to the promoters to form PICs.

Functional characterization of TFB3 conserved domains

Next, we examined what part of TFB3 could be involved in facilitating PIC formation. Unlike its TFB1 and TFB2 paralogs, TFB3 lacks the helix-turn-helix DNA-binding domain present in the C-termini of common TFB proteins. Instead, Sulfolobus TFB3 proteins contain a peptide that is predicted to form a coiled-coil (CC) domain (38) (Figure 6A and Supplementary Figure S7B). Nevertheless, the TFB3 protein contains the four cysteine residues that form the Zn ribbon domain in archaeal TFB proteins, whereas the Sulfolobus TFB1 protein only has two conserved cysteine residues; those at the first and the last positions were changed as for the S. solfataricus TFB3 protein (Figure 6A and Supplementary Figure S7A).

Here, we investigated whether the Zn ribbon and CC domains of S. islandicus TFB3 could be essential for activation of gene expression in this archaeon by constructing four tfb3 mutants and testing the function of each mutant protein. These included: (i) N-Zn, a truncated form of TFB3 mutant only containing the N-terminal 49 amino acids of the Zn domain, (ii) CoilM1 and CoilM2, two CC mutants that carry R145A, K146A substitutions and L148A, K149A, L151A substitutions at the conserved CC domains, respectively, and (iii) C3S-C25T, two substitutions in the conserved cysteine residues of the TFB3 Zn ribbon domain, which change the TFB3 Zn ribbon motif to that present in the non-canonical TFB1 proteins (Figure 6A and Supplementary Figure S7). The original promoter of tfb3 was used for controlling the expression such that their expression was under the same regulation as in the WT strain.

After introduction of each TFB3 expression plasmid into Δtfb3, the expression of the wild-type TFB3 and the four mutant proteins was examined in each transformant by immunoblotting analysis with the TFB3 antibodies. This revealed that all these TFB3 proteins were expressed and their expression was subjected to DNA damage activation (Figure 6B). Then, the expression levels of upsX and cedB upon DNA damage in Δtfb3 strains carrying a WT tfb3 gene or one of its mutated derivatives were estimated by qRT-PCR. As shown in Figure 6C, while the plasmid-borne WT tfb3 gene completely restored the function of the chromosomal tfb3 gene, none of the tested TFB3 mutant proteins, including N-Zn, CoilM1, CoilM2 and C3S-C25T, were capable of activating the expression of these DDR genes. These results indicated that the conserved cysteines in the Zn ribbon and the CC motif are essential for TFB3 function in DDR regulation in this archaeon.

Co-evolution of TFB3 and Ced systems in crenarchaea

To yield an insight into the conservation of TFB3 and its target systems of regulation, we searched for the presence of tfb3, the ups operon and ced genes in archaea using the BLAST search (39). We found that, while the ups system was limited to organisms of Sulfolobales as reported previously (40), the truncated version of TFB was found in organisms of 16 genera in Crenarchaeota whereas the Ced system appeared in 15 genera, and strikingly, 13 of the crenarchaeal genera have both a truncated version of TFB protein and the Ced system. All these data are illustrated in Figure 7, using iTOL (Interactive Tree Of Life) (41). Multiple sequence alignment of all identified truncated TFB proteins showed that they fell into three distinct classes in which all known TFB3 homologs formed a clade (Supplementary Figure S8). In addition, there also appeared a concurrence of TFB3 and non-canonical TFB1 proteins, the unconventional TFB proteins carrying two cysteine residues in the Zn ribbon domain in Crenarchaeota (18) (Supplementary Figure S7A). Together, these data suggested a coevolution of TFB1, TFB3 and the Ced system and the DDR regulatory circuit of the intercellular DNA transfer in Crenarchaeota, one of the major phyla in the Archaeal domain.
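For readers unfamiliar with how a ChIP-qPCR fold enrichment such as the 17-, 25- and 37-fold values above can be derived, a minimal Python sketch follows. The text defines fold enrichment as the ChIP signal from the TFB3 antibody relative to the signal from the control serum; converting Ct values to a ratio with a doubling per cycle is an assumption made here, and the function name and Ct values are hypothetical.

```python
def fold_enrichment(ct_antibody, ct_control_serum):
    """ChIP signal with the specific antibody relative to the control serum.

    Assumption (not from the source): ~100% PCR efficiency, so one Ct unit
    corresponds to a two-fold difference in recovered DNA.
    """
    # A lower Ct for the antibody sample means more promoter DNA was recovered.
    return 2.0 ** (ct_control_serum - ct_antibody)


# Hypothetical Ct values for one promoter in an NQO-treated sample.
print(fold_enrichment(ct_antibody=24.0, ct_control_serum=28.0))  # prints 16.0
```

A promoter that is not bound would give similar Ct values with antibody and control serum, and therefore a fold enrichment close to 1.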

Downloaded from https://academic.oup.com/nar/advance-article-abstract/doi/10.1093/nar/gky236/4956648 by Det Kongelige Bibliotek user on 07 June 2018 8 Nucleic Acids Research, 2018

Figure 5. Enrichment of TFB3 association with promoters of upregulated DDR genes. S. islandicus E233S1 (WT) was grown in SCV in the presence (+NQO, 2 μM) or absence (−NQO) of the drug for 3 h. Cell mass was then collected from cultures of E233S1, from which cell extracts were prepared and used for ChIP analysis using the TFB3 antiserum. Fold enrichment refers to the ChIP signal from the TFB3 antibody relative to the signal from the sheep serum. The promoter regions of three TFB3-dependent upregulated genes (upsE, cedB and herA1 (SiRe_1715)), one TFB3-independent downregulated gene, cdvB1 (SiRe_1550, encoding ESCRT-III-1), and the cmrβ operon, together with an intragenic region of the 16S rRNA gene, were selected for ChIP analysis with the TFB3 antibody. Fold enrichment was calculated from the data of three independent replicates, with the error bars indicating the standard deviation.

DISCUSSION

DDR in bacteria and eukarya consists of a series of cellular and molecular events, including inhibition of DNA replication, cell cycle arrest and activation of certain DNA damage repair pathways such as nucleotide excision repair, translesion DNA repair and homologous recombination repair (HRR) (42). DDR is initiated by the recognition of a DNA damage signal, usually ssDNA and dsDNA breaks, resulting in the differential expression of a large set of genes that are responsible for the observed DDR events. Nevertheless, bacteria and eukaryotes have evolved distinct mechanisms to control their DDR regulation; while bacteria employ global transcriptional factors to regulate the expression (20,43), eukaryotes have evolved kinase-signaling pathways to control the process (21,22,44,45). In this article, we show that, upon DNA damage, the expression of the tfb3 gene is strongly activated in S. islandicus, a crenarchaeal model organism, and that the expressed protein forms complexes with transcriptional factors on DDR gene promoters and functions as a DNA damage-responsive activator to regulate RNA transcription in this archaeon.

The most interesting group of TFB3-responsive genes identified from the DNA damage study in S. islandicus includes those in the ups operon, implicated in UV-responsive cell aggregation of Sulfolobus, and the ced genes coding for the intercellular DNA transfer system. Genes of the ups operon have been investigated in S. solfataricus and S. acidocaldarius (40,46), whereas ced genes responsible for intercellular DNA transfer have recently been identified in S. acidocaldarius and functionally characterized (47). These studies have led to the proposal that the UV-responsive pilus formation (Ups system) and the Ced DNA transfer system (Ced system) probably function in concert to facilitate DNA exchange for HRR in these archaea (47). Our investigation on DNA damage repair in S. islandicus demonstrates that both the Ups and Ced systems are subject to regulation by TFB3, a truncated version of the archaeal TFB, further supporting that the two systems cooperate in DNA exchange for HRR.

Important insights have been gained into the activation of DDR genes in this archaeon. First, the ups operon in S. islandicus is clustered with an NQO-repressed gene, SiRe_1883, that codes for a putative transcriptional factor (26). However, this gene is absent from the genomes of several Sulfolobus species including S. acidocaldarius (15), suggesting it may not have a function in the regulation of the ups operon. Furthermore, the ced genes do not appear to be located close to any genes encoding a putative transcriptional factor. Since DNA transfer requires both the Ups pili and the Ced proteins, it is more meaningful to have both systems regulated by a common factor, and the identification of TFB3 as their common regulator is consistent with this reasoning.

Indeed, several other genes exhibiting TFB3-dependent regulation have also been implicated in DNA transfer-related processes in S. islandicus. The encoded proteins include: (i) a PadR family transcriptional regulator (SiRe_1956) and its adjacent gene SiRe_1957; the latter codes for a membrane protein with a signal peptide for type IV pili, with the N-terminal part showing similarity to flagellar hook-basal body protein (48), suggestive of a function in Ups pilus assembly or protein export; (ii) an archease (SiRe_1319) that is a putative DNA chaperone (Supplementary Table S4), whose encoding gene is clustered with the cedA operon (located immediately downstream), implying that the protein may interact with the Ced system in intercellular DNA transfer; (iii) several putative transporter and membrane proteins that may function in the formation of cell aggregates or related processes. The conservation of these proteins in Sulfolobales (Supplementary Table S4) suggests that the TFB3-dependent regulation is conserved in these organisms and that they all probably play a role in facilitating cell aggregation and/or intercellular DNA transfer upon DNA damage.

Transcriptome data have revealed that TFB3 not only activates gene expression but also represses gene expression (Supplementary Table S3), which appears to be inconsistent with the activator hypothesis for TFB3. The only two genes that showed TFB3-dependent repression of >10-fold, i.e. SiRe_0266 (membrane protein) and SiRe_0267 (sodium:solute symporter), are organized into an operon and cluster together with a third gene, SiRe_0269, which codes for a small protein of 66 amino acids (26). Interestingly, SiRe_0269 expression is strongly upregulated by TFB3 (45-fold, Supplementary Table S4). Although SiRe_0269 does not show detectable similarity to any known protein in the current GenBank database in BLAST searches, our RNA-seq data suggest that the gene could code for a novel transcriptional repressor that downregulates the expression of SiRe_0266 and SiRe_0267, thus representing an example of indirect regulation by TFB3 in S. islandicus. Thus, although it acts as an activator, TFB3 can mediate either activation or repression of DDR genes. To date, TFB3 is


Figure 6. Functional characterization of TFB3 by mutagenesis. (A) Schematic of the WT TFB3 and its mutant derivatives. A CC region was predicted by the Coils server (38). CoilM1 and CoilM2 refer to the R145A/K146A and L148A/K149A/L151A mutations, respectively. The first and fourth conserved cysteines in the Zn ribbon of TFB3 were replaced with the corresponding residues in TFB1 (SiRe_1555). (B) Western blot analysis of the total cell extracts of the strains carrying different mutated TFB3 proteins after NQO treatment. Samples were taken 3 h after 2 μM NQO treatment. (C) Quantitative analysis of the expression levels of upsX and cedB in the strains encoding different TFB3 variants. Samples were taken 3 h after 2 μM NQO treatment.

Figure 7. Co-evolution of TFB3 and the Ced system in Archaea. DNA sequences of 16S rRNA genes of different crenarchaeal species were retrieved from the GenBank database and used for multiple sequence alignment with Clustal X and for construction of a phylogenetic tree. The resulting tree was visualized and annotated using iTOL (Interactive Tree Of Life) (41). The presence or absence of a truncated version of TFB (TFB3) is indicated by filled or empty rectangles, respectively; the presence or absence of the Ced system is shown as filled or empty triangles, respectively; the canonical and non-canonical TFB1 proteins are denoted by green and dark yellow circles, respectively.


the only known transcriptional factor that regulates DDR in archaea.

Further insights into the regulation by TFB3 have been gained from ChIP-qPCR analysis of the association of TFB3 with DDR gene promoters and functional analysis of TFB3 mutant proteins. In the latter, mutagenesis of two conserved domains in TFB3 proteins, i.e. the conserved cysteine residues in the Zn ribbon and the R145, K146, L148, K149 and L151 residues in the C-terminal CC motif, has revealed that both domains are essential for the function of the DDR-specific activator. These results are consistent with those obtained from the study of the Zn ribbon domain of eukaryotic TFIIB, where the Zn ribbon is essential to the recruitment of RNA Pol II into the PIC (49,50), and with the finding that a CC motif in the C terminus of a TFB homolog is essential for transcriptional activation in mitochondria of Dictyostelium discoideum (51). For the former, specific enrichment of TFB3 on the upregulated DDR gene promoter regions suggests that the interaction between TFB3 and DDR gene promoters in vivo is specific. Nevertheless, TFB3 does not contain any recognizable DNA-binding motif, suggesting that the protein does not interact with DNA in a sequence-specific manner. Therefore, the specific activation of DDR genes by TFB3 is probably mediated by protein–protein interaction between TFB3 and another transcriptional factor.

Furthermore, it was shown that the S. solfataricus TFB3 co-exists with the S. solfataricus TFB1, a non-canonical TFB1, and that the two factors also interact in vitro and the interaction enhances RNA transcription activity by an archaeal RNAP (18). The concurrence of TFB3 and the non-canonical TFB1 proteins has now been extended to all known orders of organisms in Crenarchaeota except Thermoproteales (Figure 7). These data are consistent with the distribution of the Ced system of DNA transfer (47). Together, this suggests co-evolution of the TFB1 and TFB3 factors with the Ced system in Crenarchaeota. It would be interesting to investigate the mechanisms of activation of DDR genes by the two factors in these crenarchaea.

In addition, several DDR genes exhibit TFB3-independent regulation. First, genes coding for enzymes involved in homologous recombination belong to a class of genes that is commonly regulated by DDR in bacteria and eukaryotes (21,22,45,46). The archaeal proteins include RadA, Mre11 and Rad50, which are homologous to the eukaryotic counterparts (52,53), and NurA and HerA, two enzymes unique to the archaeal system; their genes form an operon (54,55). As expected, radA and all four genes in the nurA operon are upregulated upon NQO treatment (Supplementary Figure S5). Other genes that also show TFB3-independent upregulation include dpo2, coding for an enzyme that can bypass DNA lesions on the template during DNA synthesis in S. solfataricus (56), and cdc6-2, which was shown to inhibit DNA replication in S. solfataricus (32) but is dispensable for replication initiation in S. islandicus (57). Furthermore, a large group of genes are downregulated by DNA damage treatment both in the WT strain and in the tfb3 mutant. These include cell division genes (cdvA, cdvB2, cdvB3 and vps4) and the DNA replication initiator gene cdc6-3, the downregulation of which was moderately affected in the tfb3 mutant (Supplementary Table S5). These data suggest that DDR events like cell-cycle arrest, inhibition of DNA replication and upregulation of homologous recombination are regulated by other transcription factor(s) in a TFB3-independent manner. Together, these findings suggest that there are multiple regulatory pathways in DDR regulation in this archaeon. In conclusion, our work demonstrates that archaea also possess a network of DDR regulation to mediate cell cycle arrest, inhibition of DNA replication and activation of DNA repair pathways, and that the DDR process in crenarchaea involves novel transcriptional regulators and new DNA repair pathways.

DATA AVAILABILITY

The RNA-seq data have been deposited in the public database with GEO accession: GSE111187.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

We thank Prof. Malcolm White for kindly providing us with the TFB3 antibody and our colleagues in the Archaea Centre, University of Copenhagen and those in the Wuhan lab, Huazhong Agricultural University, China for helpful discussions.

FUNDING

National Science Foundation of China [31771380]; Danish Council for Independent Research [DFF-4181-00274]. Funding for open access charge: National Science Foundation of China [31771380]; China Scholarship Council PhD Studentship (to X.F., M.S.).

Conflict of interest statement. None declared.

REFERENCES

1. Werner,F. and Grohmann,D. (2011) Evolution of multisubunit RNA polymerases in the three domains of life. Nat. Rev. Microbiol., 9, 85–98.
2. Browning,D.F. and Busby,S.J. (2016) Local and global regulation of transcription initiation in bacteria. Nat. Rev. Microbiol., 14, 638–650.
3. Gietl,A., Holzmeister,P., Blombach,F., Schulz,S., von Voithenberg,L.V., Lamb,D.C., Werner,F., Tinnefeld,P. and Grohmann,D. (2014) Eukaryotic and archaeal TBP and TFB/TF(II)B follow different promoter DNA bending pathways. Nucleic Acids Res., 42, 6219–6231.
4. Nagy,J., Grohmann,D., Cheung,A.C., Schulz,S., Smollett,K., Werner,F. and Michaelis,J. (2015) Complete architecture of the archaeal RNA polymerase open complex from single-molecule FRET and NPS. Nat. Commun., 6, 6161.
5. Gehring,A.M., Walker,J.E. and Santangelo,T.J. (2016) Transcription regulation in archaea. J. Bacteriol., 198, 1906–1917.
6. Sheppard,C. and Werner,F. (2017) Structure and mechanisms of viral transcription factors in archaea. Extremophiles, 21, 829–838.
7. Jun,S.H., Reichlen,M.J., Tajiri,M. and Murakami,K.S. (2011) Archaeal RNA polymerase and transcription regulation. Crit. Rev. Biochem. Mol. Biol., 46, 27–40.
8. Contursi,P., Fusco,S., Limauro,D. and Fiorentino,G. (2013) Host and viral transcriptional regulators in Sulfolobus: an overview. Extremophiles, 17, 881–895.
9. Gindner,A., Hausner,W. and Thomm,M. (2014) The TrmB family: a versatile group of transcriptional regulators in Archaea. Extremophiles, 18, 925–936.


10. Martinez-Pastor,M., Tonner,P.D., Darnell,C.L. and Schmid,A.K. (2017) Transcriptional regulation in archaea: from individual genes to global regulatory networks. Annu. Rev. Genet., 51, 143–170.
11. Santangelo,T.J., Cubonova,L., James,C.L. and Reeve,J.N. (2007) TFB1 or TFB2 is sufficient for Thermococcus kodakaraensis viability and for basal transcription in vitro. J. Mol. Biol., 367, 344–357.
12. Hidese,R., Nishikawa,R., Gao,L., Katano,M., Imai,T., Kato,S., Kanai,T., Atomi,H., Imanaka,T. and Fujiwara,S. (2014) Different roles of two transcription factor B proteins in the hyperthermophilic archaeon Thermococcus kodakarensis. Extremophiles, 18, 573–588.
13. Micorescu,M., Grünberg,S., Franke,A., Cramer,P., Thomm,M. and Bartlett,M. (2008) Archaeal transcription: function of an alternative transcription factor B from Pyrococcus furiosus. J. Bacteriol., 190, 157–167.
14. She,Q., Singh,R.K., Confalonieri,F., Zivanovic,Y., Allard,G., Awayez,M.J., Chan-Weiher,C.C., Clausen,I.G., Curtis,B.A., De Moors,A. et al. (2001) The complete genome of the crenarchaeon Sulfolobus solfataricus P2. Proc. Natl. Acad. Sci. U.S.A., 98, 7835–7840.
15. Chen,L., Brugger,K., Skovgaard,M., Redder,P., She,Q., Torarinsson,E., Greve,B., Awayez,M., Zibat,A., Klenk,H.P. et al. (2005) The genome of Sulfolobus acidocaldarius, a model organism of the Crenarchaeota. J. Bacteriol., 187, 4992–4999.
16. Gotz,D., Paytubi,S., Munro,S., Lundgren,M., Bernander,R. and White,M.F. (2007) Responses of hyperthermophilic crenarchaea to UV irradiation. Genome Biol., 8, R220.
17. Frols,S., Gordon,P.M., Panlilio,M.A., Duggin,I.G., Bell,S.D., Sensen,C.W. and Schleper,C. (2007) Response of the hyperthermophilic archaeon Sulfolobus solfataricus to UV damage. J. Bacteriol., 189, 8708–8718.
18. Paytubi,S. and White,M.F. (2009) The crenarchaeal DNA damage-inducible transcription factor B paralogue TFB3 is a general activator of transcription. Mol. Microbiol., 72, 1487–1499.
19. Erill,I., Campoy,S. and Barbe,J. (2007) Aeons of distress: an evolutionary perspective on the bacterial SOS response. FEMS Microbiol. Rev., 31, 637–656.
20. Kreuzer,K.N. (2013) DNA damage responses in prokaryotes: regulating gene expression, modulating growth patterns, and manipulating replication forks. Cold Spring Harb. Perspect. Biol., 5, a012674.
21. Ciccia,A. and Elledge,S.J. (2010) The DNA damage response: making it safe to play with knives. Mol. Cell, 40, 179–204.
22. Jackson,S.P. and Bartek,J. (2009) The DNA-damage response in human biology and disease. Nature, 461, 1071–1078.
23. Peng,N., Han,W., Li,Y., Liang,Y. and She,Q. (2017) Genetic technologies for extremely thermophilic microorganisms of Sulfolobus, the only genetically tractable genus of crenarchaea. Sci. China Life Sci., 60, 370–385.
24. Deng,L., Zhu,H., Chen,Z., Liang,Y.X. and She,Q. (2009) Unmarked gene deletion and host-vector system for the hyperthermophilic crenarchaeon Sulfolobus islandicus. Extremophiles, 13, 735–746.
25. Li,Y., Pan,S., Zhang,Y., Ren,M., Feng,M., Peng,N., Chen,L., Liang,Y.X. and She,Q. (2016) Harnessing Type I and Type III CRISPR-Cas systems for genome editing. Nucleic Acids Res., 44, e34.
26. Guo,L., Brugger,K., Liu,C., Shah,S.A., Zheng,H., Zhu,Y., Wang,S., Lillestol,R.K., Chen,L., Frank,J. et al. (2011) Genome analyses of Icelandic strains of Sulfolobus islandicus, model organisms for genetic and virus-host interaction studies. J. Bacteriol., 193, 1672–1680.
27. Peng,W., Feng,M., Feng,X., Liang,Y.X. and She,Q. (2015) An archaeal CRISPR type III-B system exhibiting distinctive RNA targeting features and mediating dual RNA and DNA interference. Nucleic Acids Res., 43, 406–417.
28. Horton,R.M., Hunt,H.D., Ho,S.N., Pullen,J.K. and Pease,L.R. (1989) Engineering hybrid genes without the use of restriction enzymes: gene splicing by overlap extension. Gene, 77, 61–68.
29. Peng,N., Deng,L., Mei,Y., Jiang,D., Hu,Y., Awayez,M., Liang,Y. and She,Q. (2012) A synthetic arabinose-inducible promoter confers high levels of recombinant protein expression in hyperthermophilic archaeon Sulfolobus islandicus. Appl. Environ. Microbiol., 78, 5630–5637.
30. Mortazavi,A., Williams,B.A., McCue,K., Schaeffer,L. and Wold,B. (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods, 5, 621–628.
31. Schmittgen,T.D. and Livak,K.J. (2008) Analyzing real-time PCR data by the comparative C(T) method. Nat. Protoc., 3, 1101–1108.
32. Robinson,N.P., Dionne,I., Lundgren,M., Marsh,V.L., Bernander,R. and Bell,S.D. (2004) Identification of two origins of replication in the single chromosome of the archaeon Sulfolobus solfataricus. Cell, 116, 25–38.
33. Bailleul,B., Daubersies,P., Galiegue-Zouitina,S. and Loucheux-Lefebvre,M.H. (1989) Molecular basis of 4-nitroquinoline 1-oxide carcinogenesis. Jpn. J. Cancer Res., 80, 691–697.
34. Han,W., Xu,Y., Feng,X., Liang,Y.X., Huang,L., Shen,Y. and She,Q. (2017) NQO-Induced DNA-Less cell formation is associated with chromatin protein degradation and dependent on A0A1-ATPase in Sulfolobus. Front. Microbiol., 8, 1480.
35. Wong,J.H., Brown,J.A., Suo,Z., Blum,P., Nohmi,T. and Ling,H. (2010) Structural insight into dynamic bypass of the major cisplatin-DNA adduct by Y-family polymerase Dpo4. EMBO J., 29, 2059–2069.
36. Valenti,A., Napoli,A., Ferrara,M.C., Nadal,M., Rossi,M. and Ciaramella,M. (2006) Selective degradation of reverse gyrase and DNA fragmentation induced by alkylating agent in the archaeon Sulfolobus solfataricus. Nucleic Acids Res., 34, 2098–2108.
37. Liew,L.P., Lim,Z.Y., Cohen,M., Kong,Z., Marjavaara,L., Chabes,A. and Bell,S.D. (2016) Hydroxyurea-mediated cytotoxicity without inhibition of ribonucleotide reductase. Cell Rep., 17, 1657–1670.
38. Lupas,A., Van Dyke,M. and Stock,J. (1991) Predicting coiled coils from protein sequences. Science, 252, 1162–1164.
39. Altschul,S.F., Gish,W., Miller,W., Myers,E.W. and Lipman,D.J. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403–410.
40. Ajon,M., Frols,S., van Wolferen,M., Stoecker,K., Teichmann,D., Driessen,A.J., Grogan,D.W., Albers,S.V. and Schleper,C. (2011) UV-inducible DNA exchange in hyperthermophilic archaea mediated by type IV pili. Mol. Microbiol., 82, 807–817.
41. Letunic,I. and Bork,P. (2016) Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic Acids Res., 44, W242–W245.
42. Giglia-Mari,G., Zotter,A. and Vermeulen,W. (2011) DNA damage response. Cold Spring Harb. Perspect. Biol., 3, a000745.
43. Baharoglu,Z. and Mazel,D. (2014) SOS, the formidable strategy of bacteria against aggressions. FEMS Microbiol. Rev., 38, 1126–1145.
44. Sirbu,B.M. and Cortez,D. (2013) DNA damage response: three levels of DNA repair regulation. Cold Spring Harb. Perspect. Biol., 5, a012724.
45. Blackford,A.N. and Jackson,S.P. (2017) ATM, ATR, and DNA-PK: the trinity at the heart of the DNA damage response. Mol. Cell, 66, 801–817.
46. Frols,S., Ajon,M., Wagner,M., Teichmann,D., Zolghadr,B., Folea,M., Boekema,E.J., Driessen,A.J., Schleper,C. and Albers,S.V. (2008) UV-inducible cellular aggregation of the hyperthermophilic archaeon Sulfolobus solfataricus is mediated by pili formation. Mol. Microbiol., 70, 938–952.
47. van Wolferen,M., Wagner,A., van der Does,C. and Albers,S.V. (2016) The archaeal Ced system imports DNA. Proc. Natl. Acad. Sci. U.S.A., 113, 2496–2501.
48. Liu,R. and Ochman,H. (2007) Stepwise formation of the bacterial flagellar system. Proc. Natl. Acad. Sci. U.S.A., 104, 7116–7121.
49. Chen,H.-T. and Hahn,S. (2003) Binding of TFIIB to RNA polymerase II: mapping the binding site for the TFIIB zinc ribbon domain within the preinitiation complex. Mol. Cell, 12, 437–447.
50. Bushnell,D.A., Westover,K.D., Davis,R.E. and Kornberg,R.D. (2004) Structural basis of transcription: an RNA polymerase II-TFIIB cocrystal at 4.5 Angstroms. Science, 303, 983–988.
51. Manna,S., Le,P. and Barth,C. (2013) A unique mitochondrial transcription factor B protein in Dictyostelium discoideum. PLoS One, 8, e70614.
52. Seitz,E.M., Brockman,J.P., Sandler,S.J., Clark,A.J. and Kowalczykowski,S.C. (1998) RadA protein is an archaeal RecA protein homolog that catalyzes DNA strand exchange. Genes Dev., 12, 1248–1253.
53. Aravind,L., Walker,D.R. and Koonin,E.V. (1999) Conserved domains in DNA repair proteins and evolution of repair systems. Nucleic Acids Res., 27, 1223–1242.
54. Constantinesco,F., Forterre,P. and Elie,C. (2002) NurA, a novel 5′-3′ nuclease gene linked to rad50 and mre11 homologs of thermophilic Archaea. EMBO Rep., 3, 537–542.


55. Constantinesco,F., Forterre,P., Koonin,E.V., Aravind,L. and Elie,C. (2004) A bipolar DNA helicase gene, herA, clusters with rad50, mre11 and nurA genes in thermophilic archaea. Nucleic Acids Res., 32, 1439–1447.
56. Choi,J.-Y., Eoff,R.L., Pence,M.G., Wang,J., Martin,M.V., Kim,E.-J., Folkmann,L.M. and Guengerich,F.P. (2011) Roles of the four DNA polymerases of the crenarchaeon Sulfolobus solfataricus and accessory proteins in DNA replication. J. Biol. Chem., 286, 31180–31193.
57. Samson,R.Y., Xu,Y., Gadelha,C., Stone,T.A., Faqiri,J.N., Li,D., Qin,N., Pu,F., Liang,Y.X., She,Q. et al. (2013) Specificity and function of archaeal DNA replication initiator proteins. Cell Rep., 3, 485–496.

Supplementary tables

Supplementary Table S1. Sulfolobus strains and plasmids used in this study

Strain or vector | Genotype/features | Source or reference

Sulfolobus strains

S. islandicus E233S1 (WT) | ∆pyrEF∆lacS | Deng et al., 2009

S. islandicus ∆tfb3 | ∆pyrEF∆lacS∆tfb3 | This work

Vectors

pSe-Rp | Plasmid containing a DNA fragment of two tandem copies of CRISPR repeat sequences for construction of the artificial mini-CRISPR array | Peng et al., 2015

pAC-tfb3 | pSe-Rp carrying a spacer matching the protospacer in the coding region of the tfb3 gene in the genome | This work

pGE-tfb3 | Genome-editing plasmid derived from pAC-tfb3, with the donor DNA inserted between SphI and XhoI | This work

pSeSD1 | A Sulfolobus-E. coli shuttle vector carrying an expression cassette controlled by a synthetic strong promoter, ParaS-SD | Peng et al., 2012

pSeptfb3 | pSeSD1 derivative with the araS-SD promoter region replaced by the native tfb3 promoter | This work

pSeptfb3-tfb3WT | pSeptfb3 carrying the wild-type TFB3-encoding sequence | This work

pSeptfb3-tfb3N-Zn | pSeptfb3 carrying the TFB3 N-terminal Zn ribbon-encoding sequence | This work

pSeptfb3-tfb3C3S-C25T | pSeptfb3 carrying the TFB3-encoding sequence with C3S, C25T substitutions | This work

pSeptfb3-tfb3M1 | pSeptfb3 carrying the TFB3-encoding sequence with R145A, K146A substitutions | This work

pSeptfb3-tfb3M2 | pSeptfb3 carrying the TFB3-encoding sequence with L148A, K149A, L151A substitutions | This work

Supplementary Table S2. Primers used in this paper

Primers Sequence(5’-3’)

Construction of tfb3 deletion mutant

KOtfb3-Spacer-f AAGATAATAAGCAATAAGTTCTATAAAGATGATATCTTAATA

KOtfb3-Spacer-r AGCTAATTAAGATATCATCTTTATAGAACTTATTGCTTATTAT

KOtfb3-L-f TTTGCATGCTTGATTTCGACTGATGTAG

KOtfb3-R-r TTTCTCGAGACAAGGTCAGCCAAATAC

KOtfb3-SOE-r CCGATCACTTTCTATACCACAATTACTACATACAAC

KOtfb3-SOE-f GTAATTGTGGTATAGAAAGTGATCGGATCTGATTTGAG

KOtfb3-F1 CTCTATCTGCCTTATCCT

KOtfb3-R1 ACTTGTGCCACTGCTTTA

KOtfb3-F2 GATAATAAGAACGGGGAAGT

KOtfb3-R2 CTCTGATTCATTTCGCCCAC

Construction of mutated TFB3 proteins

tfb3pro-f TCTgcatgcATGCTAAATATATCTTCTCTAAAG

tfb3pro-r GGCCACTcatatgTTCACATGCAGGACATTCCATAT

tfb3WT-f GGCCACTcatatgGAATGTCCTGCATGTGGTTC

tfb3WT-r ATTTgtcgacGATCCGATCACTTTCTTCGT

tfb3N-Zn-r CTAgtcgacTATTATGATTGTCTCTGTACTCTC

tfb3CoilM1-SOE-f GGTATAAAAAATGAAAATCTTgGAgAAAGACTTAAAAGAATT

tfb3CoilM1-SOE-r TCcAAGATTTTCATTTTTTATACCAAATTCCT

tfb3CoilM2-SOE-f GGTATAAAAAATGAAAATCTTAGAAAAAGACaTgAAAcAATT

tfb3CoilM2-SOE-r CTAAGATTTTCATTTTTTATACCAAATTCCT

tfb3C3S-C25T-f GGCCACTcatatgGAAaGTCCTGCATGTGGTTC

tfb3C3S-C25T-SOE-f TAGTAATACTGGTATAATTATTGATAATATTTACTATA

tfb3C3S-C25T-SOE-r AATTATACCAGTATTACTACATACAACTTCCCCGTTCT

Primers for qPCR

16SrRNAqPCR-f GAATGGGGGTGATACTGTCG

16SrRNAqPCR-r TTTACAGCCGGGACTACAGG

tfb3qPCR-f TGACGAGGGTATTTTGAGTGGT

tfb3qPCR-r TCCGATCACTTTCTTCGTTGAGT

upsXqPCR-f ACTGGTTGGAGGGAGATCGA

upsXqPCR-r ACAGTTCCGTCATTAGAAACCAGA

upsAqPCR-f TCAATCCTAATACCGGACAAGCA

upsAqPCR-r TTGGCCAGCAGTTATTTTACCA

dpo2qPCR-f CCATCATACACCATGGCAGTT

dpo2qPCR-r CTCCGCCAAAGCCCAAATTAG

cedB1qPCR-f AAGGATACACGGCGAGCGAT

cedB1qPCR-r TGTCCACGCCTCATAAGCCT

SiRe_1957qPCR-f TTTTCAAGTAGTGCGTCCCA

SiRe_1957qPCR-r CCATGGATAGCCAGCAAAGT

SiRe_0187qPCR-f GCAAGGTACAGCGATTAGGA

SiRe_0187qPCR-r CCGTTTTGCTGAACTCCTCT

cedBqPCR-f GTATTGGATGAAGCATGGAC

cedBqPCR-r CTTACCTCCTCCCAGAATTT

cedBproqPCR-f GGGTAGAACAAAGTATTC

cedBproqPCR-r AATCCAATGATTCTTGAG

upsEproqPCR-f AGATGCATGCTCATTTTCCCCATAGAGCCT

upsEproqPCR-r TTAGCTGACATATGAATACTGAGTATTCAGAAAA

herA1proqPCR-f ATTGTAACTCTTCCTCTCTGCT

herA1proqPCR-r GACATTCCATATGAAAACAATA

cedB1proqPCR-f AAGAACTAGAATATAACA

cedB1proqPCR-r GAAATCTTTTCCGAGCATG

CmrβproqPCR-f GTCCCGATGCAAGAAATT

CmrβproqPCR-r ACTTATCACAAAAACTTA
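The KOtfb3-SOE-f and KOtfb3-SOE-r primers listed above are presumably intended for splicing by overlap extension (28), in which the two primers must share a complementary region. As a minimal sketch of how such a primer pair could be checked, the following Python code computes the reverse complement of one primer and the length of its overlap with the other. The helper names are our own, and the check itself is an illustration rather than a step reported in this study.

```python
# Translation table for complementing DNA bases (case preserved)
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_suffix_prefix_overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    a, b = a.upper(), b.upper()
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


# Primer sequences copied from Supplementary Table S2
soe_r = "CCGATCACTTTCTATACCACAATTACTACATACAAC"    # KOtfb3-SOE-r
soe_f = "GTAATTGTGGTATAGAAAGTGATCGGATCTGATTTGAG"  # KOtfb3-SOE-f

overlap = longest_suffix_prefix_overlap(reverse_complement(soe_r), soe_f)
print("overlap length (nt):", overlap)
```

For the two sequences as listed, the reverse complement of KOtfb3-SOE-r ends with the first 26 nt of KOtfb3-SOE-f, so the two primers share a 26-nt complementary region.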

Supplementary Table S4. TFB3-dependent, highly NQO-inducible genes (fold change over 4)

Proteins | Gene identity | E233S1 -NQO | E233S1 +NQO | Δtfb3 -NQO | Δtfb3 +NQO | Functional implication/features

DNA transfer or related proteins

UpsX | SiRe_1878 | 23.43 | 761.84 | 7.82 | 17.92 | Membrane protein important for efficient DNA transfer

UpsE | SiRe_1879 | 85.07 | 3844.25 | 2.58 | 16.24 | Type IV secretion system ATPase

UpsF | SiRe_1880 | 18.25 | 482.31 | 2.30 | 8.03 | Integral membrane protein

UpsA | SiRe_1881 | 915.29 | 35106.96 | 17.15 | 156.63 | Pilin subunits containing class III signal peptides

UpsB | SiRe_1882 | 10.73 | 481.64 | 0.48 | 5.88 | Pilin subunits containing class III signal peptides

CedA | SiRe_1317 | 267.51 | 8668.38 | 56.51 | 112.88 | DNA import

CedA1 | SiRe_1316 | 692.31 | 21616.00 | 221.47 | 315.54 | DNA import

CedB | SiRe_1857 | 91.84 | 2678.39 | 27.92 | 77.92 | DNA import

Archease | SiRe_1319 | 48.16 | 394.17 | 34.44 | 88.45 | Located directly downstream of the cedA operon; archease could function as a DNA chaperone

Conjugal transfer protein | SiRe_1486 | 143.20 | 1186.70 | 127.34 | 244.31 | Potentially involved in DNA transfer

Conserved plasmid protein | SiRe_1487 | 71.46 | 553.20 | 53.71 | 130.68 | Containing class III signal peptides

Conjugal transfer protein | SiRe_0427 | 88.43 | 659.29 | 120.56 | 97.42 | Clusters together with ABC transporter

Archaeal integrase | SiRe_1787 | 3.50 | 28.50 | 11.16 | 11.02 | DNA transfer

PadR family transcriptional regulator | SiRe_1956 | 138.28 | 1113.71 | 132.91 | 116.13 | Potential regulator of SiRe_1957

FlgG-like protein | SiRe_1957 | 400.12 | 12504.51 | 26.24 | 78.96 | Containing class III signal peptides; N terminus shows similarity to flagellar basal body subunit FlgG

HerA1 | SiRe_1715 | 58.47 | 2374.44 | 10.08 | 29.88 | Conserved protein belonging to HerA family

Glycoside hydrolase | SiRe_0936 | 14.13 | 473.09 | 14.42 | 19.97 | Putative glycoside hydrolase with unknown function

NurA paralogue | SiRe_0014 | 54.53 | 385.01 | 59.17 | 38.38 | Putative NurA family protein

Putative ATPase | SiRe_0016 | 48.33 | 270.19 | 36.94 | 17.40 | C terminus shows similarity to chromosome segregation ATPases

Adenine-specific DNA methylase | SiRe_0589 | 138.86 | 3326.64 | 16.20 | 23.81 | Clusters together with ABC transporter

RecB family exonuclease | SiRe_1994 | 349.80 | 2717.45 | 283.79 | 302.25 | 5’ to 3’ DNA exonuclease

Cas1 | SiRe_0761 | 7.74 | 89.84 | 8.12 | 5.01 | Metal-dependent DNA-specific endonuclease

V-type ATP synthase | SiRe_0015 | 20.55 | 288.94 | 15.38 | 12.42 | Coupling the energy of ATP hydrolysis to proton transport

Membrane protein | SiRe_0020 | 13.88 | 853.58 | 0 | 5.25 | Protein with a transmembrane helix

Transport protein | SiRe_0137 | 29.02 | 1289.22 | 5.07 | 6.98 | Putative sialic acid transporter

Membrane protein | SiRe_2100 | 117.39 | 4004.74 | 3.67 | 17.50 | Containing 7 transmembrane helices

Acyl-CoA dehydrogenase | SiRe_2101 | 76.15 | 932.84 | 98.78 | 71.23 | Lipid transport and metabolism

ABC transporter related membrane protein | SiRe_0426 | 21.00 | 625.17 | 14.05 | 11.71 | Clusters together with conjugal protein SiRe_0427

ABC transporter permease | SiRe_0425 | 44.31 | 501.62 | 25.88 | 23.88 | Containing 6 transmembrane helices and showing remote homology to MFS general substrate transporter

MFS transporter | SiRe_2286 | 38.13 | 265.68 | 10.69 | 16.91 | Putative H+ antiporter

Zinc finger SWIM domain protein | SiRe_0670 | 182.12 | 7295.24 | 0.00 | 11.96 | Showing remote homology to ABC import system

Membrane protein | SiRe_0671 | 70.53 | 1347.83 | 47.86 | 54.57 | Putative inner membrane-associated hydrolase

Membrane protein | SiRe_1984 | 8.56 | 91.40 | 4.99 | 4.24 | Potentially involved in ion channel

Potential transcriptional regulator | SiRe_0269 | 118.49 | 5332.74 | 4.80 | 24.88 | Potentially involved in transcriptional regulation of transport proteins

ABC transporter ATP-binding protein | SiRe_0588 | 141.14 | 607.65 | 216.68 | 196.58 | Clusters together with SiRe_0589, a putative methylase

Membrane associated leucyl aminopeptidase | SiRe_2664 | 223.50 | 2433.81 | 209.21 | 176.77 | Aminopeptidase N

Membrane protein | SiRe_2279 | 1.19 | 11.06 | 6.07 | 2.50 | Showing remote homology to periplasmic thiol:disulfide interchange protein DsbA

Metabolism

NADH dehydrogenase (quinone) | SiRe_1078 | 4.22 | 46.01 | 5.41 | 7.36 | Coupling electron transfer to the electrogenic transport of protons or Na+

TQO small subunit DoxD | SiRe_2153 | 140.79 | 746.88 | 139.72 | 85.14 | The terminal quinol oxidase present in the membrane

Glutamate synthase | SiRe_1332 | 31.98 | 155.97 | 38.96 | 58.93 | Glutamate metabolism or nitrogen metabolism

Thiamine pyrophosphate protein | SiRe_0338 | 10.41 | 71.69 | 15.74 | 21.31 | Functions as a coenzyme in diverse enzymatic reactions

GTP:adenosylcobinamide-phosphate guanylyltransferase | SiRe_1942 | 2.67 | 17.77 | 1.95 | 4.33 | Porphyrin and chlorophyll metabolism

S-adenosylmethionine-dependent methyltransferases | SiRe_0331 | 2.89 | 16.91 | 14.72 | 9.69 | Transfer of methyl groups to various biomolecules

Formate hydrogenlyase | SiRe_1076 | 1.61 | 7.81 | 0.82 | 1.21 | Showing similarity to respiratory-chain NADH dehydrogenase

Hypothetical proteins

Hypothetical protein | SiRe_0187 | 55.13 | 1980.28 | 3.38 | 17.76 | Containing coiled coil

IS110 family transposase | SiRe_1859 | 1.08 | 19.73 | 0.37 | 2.72 | Movement of the transposon

Hypothetical protein | SiRe_0502 | 1.30 | 16.12 | 0 | 3.27 | Unknown

IS4 family transposase | SiRe_2672 | 1.43 | 13.70 | 0.37 | 2.71 | Movement of the transposon

Hypothetical protein | SiRe_1079 | 4.24 | 18.07 | 0.54 | 7.34 | Unknown

Hypothetical protein | SiRe_1960 | 2.10 | 10.42 | 7.28 | 6.35 | Showing remote homology to membrane-bound ClpP-class protease

Hypothetical protein | SiRe_0271 | 48.06 | 198.77 | 33.787 | 83.417 | Unknown

A list of 51 TFB3-dependent genes upregulated upon NQO treatment with a fold change over 4. The tfb3 gene is not included in this table. Class III signal peptides refer to the N-terminal class III secretory signal sequence found in type IV pilin precursors, which is predicted to be cleaved by the type IV pre-pilin peptidase PibD. The expression level of each gene in each sample is given as an RPKM value.

Supplementary Table S5. TFB3 independent NQO responsive genes

Proteins   Gene Identity   E233S1 -NQO   E233S1 +NQO   Δtfb3 -NQO   Δtfb3 +NQO

DNA repair

NurA SiRe_0061 31.32 108.27 36.88 101.79

Rad50 SiRe_0062 29.94 86.77 35.38 63.68

Mre11 SiRe_0063 122.72 423.32 13.36 30.48

HerA SiRe_0064 257.94 857.90 239.73 653.70

RadA SiRe_1747 1556.19 3758.37 1738.00 4564.96

DNA topoisomerase VI subunit B SiRe_1118 421.95 883.66 369.80 439.63

TatD family hydrolase SiRe_1950 895.52 168.24 751.68 273.83

DNA replication

Dpo2N SiRe_0614 150.79 3188.79 458.75 4856.19

Dpo2C SiRe_0615 74.57 2003.07 269.68 2843.11

Cdc6-2 SiRe_1231 120.54 549.23 182.18 623.31

Cdc6-1 SiRe_1740 274.19 86.27 246.36 221.37

Chromatin architecture

Cren7 SiRe_1111 4722.47 516.94 5484.23 1046.19

Sul7 SiRe_2648 12653.87 1979.62 16481.85 2979.64

DNA-binding 7 kDa protein SiRe_0668 8510.87 2044.77 11428.09 2938.12

Reverse Gyrase SiRe_1124 653.88 142.97 532.77 229.14

Cell division

CdvA SiRe_1173 1569.39 526.39 1908.20 1113.25

ESCRT-III (CdvB) SiRe_1174 1079.36 262.11 992.40 424.05

Vps4 (CdvC) SiRe_1175 1022.31 254.32 888.26 462.84

ESCRT-III-1 (CdvB1) SiRe_1550 9894.60 1149.66 9477.53 3663.06

ESCRT-III-2 (CdvB2) SiRe_1200 15111.61 2119.78 16423.59 7984.77

ESCRT-III-3 (CdvB3) SiRe_1388 143.91 42.02 133.92 109.90

SegA SiRe_1962 243.47 45.69 228.96 94.46

SegB SiRe_1961 153.50 24.28 177.69 88.75

Xer recombinase SiRe_1624 292.59 64.03 265.11 179.41

ParA paralogue SiRe_0265 150.45 45.21 94.47 79.37

A list of selected genes that exhibited TFB3-independent transcriptional regulation upon NQO treatment. The expression level of each gene is indicated by its RPKM value. NQO-inducible genes and NQO-repressed genes were labeled in red and green, respectively.
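To make the comparison in this table concrete, the following sketch computes the fold change (+NQO over -NQO) in both strains for three rows copied from the table above. The dictionary layout and the names `fold_change` and `summarize` are illustrative assumptions and are not part of the original analysis.

```python
# RPKM values copied from Supplementary Table S5, in the order
# (E233S1 -NQO, E233S1 +NQO, Δtfb3 -NQO, Δtfb3 +NQO).
RPKM = {
    "Dpo2N (SiRe_0614)": (150.79, 3188.79, 458.75, 4856.19),
    "Cdc6-2 (SiRe_1231)": (120.54, 549.23, 182.18, 623.31),
    "Cren7 (SiRe_1111)": (4722.47, 516.94, 5484.23, 1046.19),
}


def fold_change(untreated, treated):
    """Return treated/untreated; values above 1 mean induction by NQO."""
    return treated / untreated


def summarize(rpkm):
    """Map each gene to its fold change in E233S1 and in Δtfb3."""
    summary = {}
    for gene, (wt_minus, wt_plus, ko_minus, ko_plus) in rpkm.items():
        summary[gene] = (
            fold_change(wt_minus, wt_plus),
            fold_change(ko_minus, ko_plus),
        )
    return summary


if __name__ == "__main__":
    for gene, (wt_fc, ko_fc) in summarize(RPKM).items():
        print(f"{gene}: E233S1 {wt_fc:.2f}x, Δtfb3 {ko_fc:.2f}x")
```

For these three rows the fold change points in the same direction in both strains (induction for Dpo2N and Cdc6-2, repression for Cren7), which is the pattern the table uses to call a response TFB3 independent.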

Supplementary figures

Supplementary Figure S1. Schematic illustration of the construction of ∆tfb3 using the CRISPR-based genome-editing technique

The pGE-tfb3 plasmid was constructed by inserting the donor DNA and spacer sequence into the pAC vector. After pGE-tfb3 is introduced into the E233S1 strain by electroporation, the transcribed crRNA activates the DNA targeting activity of the Cas complex, resulting in a double-strand DNA break at the protospacer region of the tfb3 gene. Homologous recombination between the DNA regions flanking the tfb3 gene in the genome and the donor DNA in the genome-editing plasmid finally gives ∆tfb3.

Supplementary Figure S2. Correlation between RNA-Seq and qPCR results

The fold changes of 7 genes at 6 hours post NQO treatment were analyzed by quantitative PCR to validate the RNA-Seq data. The log10 (fold change) values for upsX (closed triangles), upsA (closed circles), cedB (closed squares), SiRe_1957 (open triangles), dpo2 (open circles), cdvB1 (open squares) and SiRe_0187 (closed diamonds) based on RNA-Seq data and qPCR results are plotted on the x and y axes, respectively. The data obtained from the two methods yield a linear fit with a slope of 0.976 and an R-value of 0.93.
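As a minimal sketch of the comparison described in this legend, the code below converts fold changes to the log10 scale and fits a straight line by ordinary least squares, returning the slope, intercept and Pearson R-value. The function names are hypothetical, the fold changes in the example are placeholders rather than the measured values, and the legend does not state which fitting procedure was actually used.

```python
import math


def log10_fold_changes(fold_changes):
    """Convert raw fold changes to the log10 scale used on both axes."""
    return [math.log10(fc) for fc in fold_changes]


def linear_fit(x, y):
    """Least-squares fit y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


if __name__ == "__main__":
    # Placeholder fold changes for illustration only; NOT the measured values.
    rna_seq_fc = [30.0, 12.0, 4.0, 0.2]
    qpcr_fc = [25.0, 15.0, 3.5, 0.25]
    slope, intercept, r = linear_fit(
        log10_fold_changes(rna_seq_fc), log10_fold_changes(qpcr_fc)
    )
    print(f"slope={slope:.3f}, intercept={intercept:.3f}, r={r:.3f}")
```

A slope close to 1 together with an R-value close to 1, as reported in the legend, indicates that the two methods agree both in the direction and in the magnitude of the expression changes.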

Supplementary Figure S3. Quantitative analysis of the fold induction of tfb3 after NQO treatment

Cell extracts prepared for Figure 1 were diluted 8- and 16-fold before SDS-PAGE and immunoblotting analysis of TFB3 and PCNA3.

Supplementary Figure S4. Induction of tfb3 by different concentrations of NQO

NQO was added to early exponential-phase cultures to the indicated final concentrations and incubated for 6 hours, after which cell mass was collected for preparation of total cell extracts. Equal amounts of total protein were subjected to SDS-PAGE separation and immunoblotting analysis of TFB3. PCNA3, whose protein level remains constant during NQO treatment, serves as a control.

Supplementary Figure S5. Growth curves of WT and ∆tfb3 upon prolonged incubation with NQO

Exponentially growing WT or ∆tfb3 cultures (OD600 = 0.2) were exposed to 0 (A), 0.5, 0.75 (B), 1.0, 1.5 and 2.0 µM NQO for 24 hours, during which samples were taken for OD600 measurement and growth curve analysis (C). Connecting lines of the same color indicate the same NQO concentration.

Supplementary Figure S6. TFB3-mediated cell aggregate formation induced by different DNA damaging agents

(A) Exponentially growing cultures (OD600 = 0.2) were exposed to 2 mM MMS or 200 J/m² UV irradiation for 3 hours, or to 2 μM NQO, 10 g/L Cisplatin or 5 mM HU for 6 hours. Cell extracts were then prepared and equal amounts of total protein were used for western blot analysis with TFB3 antibody. PCNA3 serves as a control with a constant protein level. (B) Aliquots of cultures were directly observed under a microscope after incubation with MMS, Cisplatin and HU at 78°C for 10 h or 3 hours post UV irradiation.

Supplementary Figure S7. Sequence analysis of TFB3 proteins

(A) Sequence comparison of the N-terminal Zn ribbon domain of TFB1 and TFB3 in Crenarchaeota. The sequences of TFB3 from Sulfolobus islandicus Rey15A, Acidilobus saccharovorans 345-15, Pyrodictium delaneyi strain su06, Ignisphaera aggregans DSM 17230, Staphylothermus marinus F1 and its cognate TFB1, together with the sequences of canonical TFB1 from Vulcanisaeta moutnovskia 768-28, Thermofilum pendens Hrk 5, Nitrososphaera viennensis EN76 and Haloferax volcanii SD2, were used for multiple sequence analysis by Clustal W (Larkin et al., 2007). The four conserved cysteines in the Zn ribbon domain are indicated by asterisks. (B) Multiple sequence analysis of TFB3 proteins in Crenarchaeota. TFB3 orthologues from Metallosphaera cuprina Ar-4, Metallosphaera sedula DSM 5348, Acidianus hospitalis W1, Sulfolobus islandicus REY15A, Sulfolobus solfataricus P2, Sulfolobus tokodaii str. 7, Sulfolobus acidocaldarius DSM 639, Caldisphaera lagunensis DSM 15908, Acidilobus saccharovorans 345-15, Hyperthermus butylicus DSM 5456, Pyrolobus fumarii 1A, Pyrodictium delaneyi, Ignisphaera aggregans DSM 17230, Thermogladius calderae 1633, Thermosphaera aggregans DSM 11486, Staphylothermus hellenicus DSM 12710, Staphylothermus marinus F1, Pyrobaculum islandicum DSM 4184 and Thermoproteus tenax Kra 1 were used for sequence analysis of TFB3 proteins. The coiled-coil region was predicted by the Coils server (Lupas et al., 1991).

Supplementary Figure S8. Phylogenetic tree of TFB3 proteins

The sequences of TFB3 from the identified crenarchaeal species, together with the sequences of TFB1 and TFB2 from Pyrococcus furiosus DSM 3638 (Euryarchaeota), were used to construct the NJ tree of these TFB proteins, with the latter two as outgroup. Species names shown in the same colour belong to the same Order. Bright blue and orange indicate species belonging to Sulfolobales and Acidilobales, whereas red and green indicate species in Thermoproteales and Desulfurococcales. The two outgroup entries are coloured purple.

Deng, L., H. Zhu, Z. Chen, Y.X. Liang & Q. She, (2009) Unmarked gene deletion and host-vector system for the hyperthermophilic crenarchaeon Sulfolobus islandicus. Extremophiles 13: 735-746.

Larkin, M.A., G. Blackshields, N.P. Brown, R. Chenna, P.A. McGettigan, H. McWilliam, F. Valentin, I.M. Wallace, A. Wilm, R. Lopez, J.D. Thompson, T.J. Gibson & D.G. Higgins, (2007) Clustal W and Clustal X version 2.0. Bioinformatics 23: 2947-2948.

Lupas, A., M. Van Dyke & J. Stock, (1991) Predicting coiled coils from protein sequences. Science 252: 1162-1164.

Peng, N., L. Deng, Y. Mei, D. Jiang, Y. Hu, M. Awayez, Y. Liang & Q. She, (2012) A synthetic arabinose-inducible promoter confers high levels of recombinant protein expression in hyperthermophilic archaeon Sulfolobus islandicus. Appl Environ Microbiol 78: 5630-5637.

Peng, W., M. Feng, X. Feng, Y.X. Liang & Q. She, (2015) An archaeal CRISPR type III-B system exhibiting distinctive RNA targeting features and mediating dual RNA and DNA interference. Nucleic Acids Res 43: 406-417.

Nucleic Acids Research, 2018
doi: 10.1093/nar/gky487

An Orc1/Cdc6 ortholog functions as a key regulator in the DNA damage response in Archaea

Mengmeng Sun1,2,†, Xu Feng1,2,†, Zhenzhen Liu1, Wenyuan Han2, Yun Xiang Liang1,* and Qunxin She1,2,*

1State Key Laboratory of Agricultural Microbiology and College of Life Science and Technology, Huazhong Agricultural University, 430070 Wuhan, China and 2Archaea Centre, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 Copenhagen N, Denmark

Received April 07, 2018; Revised May 15, 2018; Editorial Decision May 15, 2018; Accepted May 17, 2018

*To whom correspondence should be addressed. Tel: +45 3532 2013; Fax: +45 3532 2128; Email: [email protected] Correspondence may also be addressed to Yun Xiang Liang. Tel: +86 27 87281040; Email: [email protected] †The authors wish it to be known that, in their opinion, the first two authors should be regarded as Joint First Authors.

© The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected]

Downloaded from https://academic.oup.com/nar/advance-article-abstract/doi/10.1093/nar/gky487/5033999 by Det Kongelige Bibliotek user on 07 June 2018

ABSTRACT

While bacteria and eukaryotes show distinct mechanisms of DNA damage response (DDR) regulation, investigation of ultraviolet (UV)-responsive expression in a few archaea did not yield any conclusive evidence for an archaeal DDR regulatory network. Nevertheless, expression of Orc1-2, an ortholog of the archaeal origin recognition complex 1/cell division control protein 6 (Orc1/Cdc6) superfamily proteins was strongly activated in Sulfolobus solfataricus and Sulfolobus acidocaldarius upon UV irradiation. Here, a series of experiments were conducted to investigate the possible functions of Orc1-2 in DNA damage repair in Sulfolobus islandicus. Study of DDR in Δorc1-2 revealed that Orc1-2 deficiency abolishes DNA damage-induced differential expression of a large number of genes and the mutant showed hypersensitivity to DNA damage treatment. Reporter gene and DNase I footprinting assays demonstrated that Orc1-2 interacts with a conserved hexanucleotide motif present in several DDR gene promoters and regulates their expression. Manipulation of orc1-2 expression by promoter substitution in this archaeon revealed that a high level of orc1-2 expression is essential but not sufficient to trigger DDR. Together, these results have placed Orc1-2 in the heart of the archaeal DDR regulation, and the resulting Orc1-2-centered regulatory circuit represents the first DDR network identified in Archaea, the third domain of life.

INTRODUCTION

Organisms of all three domains of life have to deal with lesions on their chromosomal DNAs generated by environmental and endogenous factors. If left unrepaired, these DNA lesions will either alter the content of the genetic blueprint, giving rise to mutations, or lead to the loss of genome integrity and cell death. DNA damage repair has been studied in great detail in the organisms belonging to the domains of Bacteria and Eukarya, in which distinctive regulatory networks called DNA damage response (DDR) have been revealed (1,2). The bacterial DDR is best represented by the SOS response, which is triggered by ssDNA-RecA filaments, an intermediate of homologous recombination repair (HRR) of double stranded DNA breaks (DSBs). Investigation of the SOS regulation in Escherichia coli has revealed a LexA-dependent regulatory network that is conserved in numerous bacteria (3). Nevertheless, several LexA-independent mechanisms have also been discovered in different bacteria, revealing the diversification of the bacterial DDR regulatory mechanisms (4). DDR regulation in eukaryotes is far more complex than that found in bacteria, involving multiple genome surveillance mechanisms. Two best-known examples are the ataxia telangiectasia-mutated (ATM) and the ATM and Rad3-related (ATR) signal transduction pathways that control, for example, cell cycle checkpoint regulation (5,6). Nevertheless, all these bacterial and eukaryotic DDR mechanisms share some common features; after recognition of DNA damage signal, a series of cellular events occur in a coordinated fashion, including inhibition of DNA replication, cell cycle arrest and activation of the synthesis of various DNA repair enzymes, and such a regulation ensures efficient DNA damage repair to occur in a timely fashion in the cell.

Investigation of UV-responsive genome expression in a few model archaea including Halobacterium NRC-1 (7,8), Sulfolobus solfataricus P2 (9–11) and Sulfolobus acidocaldarius DSM639 (10) has revealed that a number of genes are differentially expressed. Nevertheless, uncertainty about whether these observations reflect the presence of an archaeal DDR regulation persists primarily because of two reasons: (a) The differentially expressed genes (DEGs) exclude those coding for the enzymes responsible for nucleotide excision repair (NER), which are strongly activated by DNA damage in bacteria and eukaryotes (12,13), and (b) archaea do not code for any homologs of bacterial or eukaryotic DDR regulators. As a result, it remains an open question whether organisms of the archaeal domain possess any DDR regulation, and if so, how the process is controlled.

In this work, we employed S. islandicus, a genetic model in the Crenarchaeota (14) to investigate the DNA damage-responsive genome expression and its regulation. Previous genetic analysis of three orc1 genes in S. islandicus showed that orc1-1 and orc1-3 code for replication initiators responsible for replication initiation from the oriC1 and oriC2 of the chromosome but deletion of orc1-2 gene does not impair the origin usage in this archaeon (15). The strong up regulation of orc1-2 in S. solfataricus and S. acidocaldarius (upon UV irradiation (9,10)) and in S. islandicus (by treatment of 4-nitroquinoline 1-oxide (NQO) and UV light (16)) prompted us to investigate possible functions of Orc1-2 in DNA damage response in this crenarchaeon. We found that Orc1-2 has gained a novel function during evolution, i.e. the factor functions as a global regulator in the DNA damage response network of S. islandicus.

MATERIALS AND METHODS

Growth conditions, transformation and NQO treatment of Sulfolobus

S. islandicus strains used in this work are listed in Supplementary Table S1. Sulfolobus cells were cultured at 78°C in SCV (basic salts plus 0.2% sucrose, 0.2% casamino acids, 1% vitamin solution) medium (17) or in ACV medium (in which sucrose is replaced with D-arabinose). If required, uracil was added to 20 µg/ml. Plasmids were introduced into S. islandicus strains by electroporation as described originally for S. solfataricus (18).

NQO treatment experiments were conducted as previously described (16). A stock of the drug (130 mM) was prepared with DMSO and kept at –20°C. At the time of the NQO experiment, the stock solution was diluted to 1.3 mM with H2O. Then, the diluted NQO solution was added to Sulfolobus cultures of an early exponential growth phase (i.e. absorbance at 600 nm (A600) = ca. 0.2) to the concentrations specified in each experiment. Cell samples were taken during incubation and used for A600 measurement, cell aggregation assay, cell viability assay, RNA extraction, western blot analysis and flow cytometry individually.

Cell viability of Sulfolobus cultures was estimated by determination of their colony formation units (CFU). Cells were collected from 1 ml of culture by centrifugation for each cell sample and re-suspended in the equal volume of pre-warmed SCV medium. Each cell resuspension was diluted in series of dilutions and plated for colony formation using the two-layer plating method previously described (19). Colonies of Sulfolobus cells appeared on plates after 7 days of incubation.

Transcriptome analysis by RNA sequencing and verification of the data by RT-qPCR

Exponentially growing cultures of S. islandicus E233S1 (the wild-type strain, WT) and Δorc1-2 were diluted to an A600 of ca. 0.2 and grown in the presence or absence of 2 µM NQO for 6 h. Cell mass was then collected by centrifugation and employed for extraction of total RNAs using the Trizol reagent (Ambion, Austin, TX, USA). The quality and quantity of the total RNA preparations were evaluated using NanoDrop 1000 spectrophotometer (Labtech, Wilmington, MA, USA) and 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). RNA sequencing (RNA-Seq) was conducted in Novogene (Beijing, China). About 3 µg of high quality RNA were used for construction of RNA-Seq libraries, which were then subjected to next generation sequencing using an Illumina HiSeq2000. This gave 4.1–4.8 million high quality sequence reads for each RNA sample, which were then mapped to the genome of S. islandicus Rey15A (20). The resulting data were then analyzed by Fragments Per Kilobase of transcript sequence per Million base pairs sequenced (FPKM) analysis (21) to reveal expression levels of all genes in the S. islandicus genome. Differential genome expression analysis (NQO-treated samples versus the corresponding untreated references) was performed using the DEGSeq R package (22). Corrected P-value of 0.005 and log2 (Fold change) of 1 were set as the threshold for significant difference in differential gene expression.

Quantitative reverse transcription PCR (RT-qPCR) was employed to validate the RNA-Seq data. DDR genes chosen for verification included dpo2, upsX, upsA, tfb3, SiRe_1957 and SiRe_1550. First-strand cDNAs were synthesized with total RNA samples using a reverse transcriptase (Thermo Scientific, Waltham, MA, USA) and random hexamer primers. The resulting cDNA samples were used for estimation of mRNA levels of the above DDR genes by qPCR, using the Maxima SYBR Green/ROX qPCR Master Mix (Thermo Scientific, Waltham, MA, USA) and gene-specific primers (Supplementary Table S2). PCR was performed in a CFX96 Touch™ real-time PCR detection system (Bio-Rad, Hercules, CA, USA) with the following steps: denaturing at 95°C for 5 min, 40 cycles of 95°C 15 s, 55°C 15 s and 72°C 20 s. Relative amounts of mRNAs were estimated by using the comparative Ct method with 16S rRNA as the reference (23). A correlation between the two sets of data was found to be 0.9705 with an R-value of 0.96 (Supplementary Figure S1).

Cell aggregation assay and flow cytometry

The extent of cell aggregation in S. islandicus cultures was estimated by direct observation of cell aggregates in fresh cultures under a Nikon Eclipse Ti-E inverted microscope (Nikon, Kobe, Japan). Data were collected from at least 12 fields of view images and 500 single cells for each cell sample, and the same analysis was conducted for three independent growth experiments.

Flow cytometry of S. islandicus cells was conducted as previously described (24). Briefly, the archaeal cells were fixed with ice-cold ethanol, stained with 40 µg/ml ethidium bromide (Sigma-Aldrich, St. Louis, MO, USA) and 100


µg/ml mithramycin A (Apollo Chemical, Tamworth, UK) and analyzed in an Apogee A40 cytometer (Apogeeflow, Hertfordshire, UK) equipped with a 405 nm laser. A dataset of at least 60 000 cells was collected for each cell sample.

Construction of S. islandicus orc1-2araS mutant using a CRISPR-based genome-editing method

The genome-editing plasmid pGE-orc1-2araS was constructed by following the strategy previously described (25). A target site was selected on the orc1-2 gene, consisting of a 5′-CCN-3′ (positioned at –28 referring to the start codon of orc1-2), a type I-A protospacer adjacent motif and the immediately downstream 40-nt DNA sequence (protospacer). Two DNA oligonucleotides were then designed based on the protospacer, giving orc1-2araS Spacer F and orc1-2araS Spacer R (Supplementary Table S2). Annealing of the two oligonucleotides yielded a DNA fragment containing the designed spacer. The resulting DNA fragment also contained the 5′-flanking 4 nt protruding ends that are compatible to the ends of the SapI-digested pSe-Rp vector. Therefore, ligation of the spacer fragment and the digested vector yielded the mini-CRISPR plasmid pAC-orc1-2.

The donor DNA fragment was generated by splicing and overlapping extension (SOE)-PCR with Fast Pfu DNA polymerase (TransGene, Beijing, China), following the published procedure (26). First, orc1-2 gene was amplified by PCR from the genome DNA using the primer pair of orc1-2fwd-NcoI/orc1-2rev-XmaI. Insertion of the PCR fragment into pZC1-S-50-orc1-2 yielded pSe-araS-orc1-2 carrying the araS-orc1-2 fusion gene. Then, a 654-bp genomic DNA fragment was obtained from the archaeal genome by PCR with the primer pair of orc1-2araS SOEfwd-SphI/orc1-2araS SOErev whereas the fusion gene was amplified by PCR from pSe-araS-orc1-2 with the primer pair of orc1-2araS SOEfwd/orc1-2araS rev-XhoI. Finally, the two PCR fragments were joined together by PCR using the primer pair of orc1-2araS fwd-SphI/orc1-2araS rev-XhoI. The resulting DNA fragment was digested with SphI and XhoI and inserted into pAC-orc1-2 at the same sites, giving pGE-orc1-2araS. The identity of the plasmid was confirmed by sequence determination of the DNA inserts by DNA sequencing.

pGE-orc1-2araS was then introduced into S. islandicus E233S1 by electroporation, giving transformants that were subjected to pyrEF counter selection to cure the pGE plasmid as previously described (19). The genotype of single colonies obtained on 5-FOA-containing plates was determined by PCR amplification of the insert with orc1-2araS check-fwd/orc1-2araS check-rev primers and subsequent sequencing of the PCR products. One of the verified mutants was designated as S. islandicus orc1-2araS and used in subsequent experiments.

Construction of reporter gene plasmids and reporter gene assay of DDR gene promoters

Reporter plasmids were constructed using the Sulfolobus-E. coli shuttle vector pSeSD (27) with the S. solfataricus β-glycosidase gene (lacS) (28) as the reporter gene. Four highly up-regulated genes (SiRe_1881: upsA, SiRe_1879: upsE, SiRe_1717: tfb3 and SiRe_1316: cedA1) were selected for the experiment. Promoter fragments of these genes were amplified by PCR with Fast Pfu DNA polymerase (TransGene, Beijing, China) from the genomic DNA of S. islandicus REY15A (20), using the following primer pairs individually, upsA-fwd-SphI/upsA-rev-NdeI, upsE-fwd-SphI/upsE-rev-NdeI, tfb3-fwd-SphI/tfb3-rev-NdeI or cedA1-fwd-SphI/cedA1-rev-NdeI (Supplementary Table S2). The resulting PCR products (ca. 220 bp) were purified with the GeneJET PCR Purification kit (Thermo Scientific, Waltham, MA, USA), and the purified DNAs were digested with SphI and NdeI and purified again. Each purified promoter fragment was then inserted into pSe-lacS (27), yielding pSe-upsA-LacS, pSe-upsE-LacS, pSe-tfb3-LacS and pSe-cedA1-LacS reporter gene plasmids containing the promoters of the wild-type DNA damage responsive element (DDRE) (Supplementary Table S1). Promoter derivatives containing the mutated DDRE (DDREmut) motifs were constructed using the SOE-PCR procedure (29). First, two overlapping primers were designed containing desired mutations in the center of each pair of oligonucleotides (Supplementary Table S2), including upsA-SOE-top/upsA-SOE-bm, upsE-SOE-top/upsE-SOE-bm, tfb3-SOE-top/tfb3-SOE-bm and cedA1-SOE-top/cedA1-SOE-bm. These SOE primers and their corresponding promoter-cloning primers were then employed for SOE-PCR with the corresponding reporter gene plasmid as template. The resulting DDREmut promoter fragments were then inserted into pSe-lacS (27), giving four reporter gene plasmids with DDREmut promoters, i.e. pSe-upsAmut-LacS, pSe-upsEmut-LacS, pSe-tfb3mut-LacS and pSe-cedA1mut-LacS (Supplementary Table S1). Determination of the sequences of the plasmid-borne wild-type and mutated promoters of the four DDR genes by DNA sequencing confirmed the identity of these reporter gene plasmids.

These reporter gene plasmids were then introduced into S. islandicus E233S1 (WT) and Δorc1-2 individually by electroporation. Three colonies of transformants were chosen from each transformation and grown for 6 h in SCV either in the presence or absence of 2 µM NQO. Cell mass was then collected from which cell extracts were prepared by sonication. β-Glycosidase activity in the cell extracts of different S. islandicus strains was determined using the ONPG (o-nitrophenyl-β-D-galactopyranoside) method as described previously (30).

Western blotting and hybridization analysis

Cells in 15 ml culture were collected by centrifugation. Cell pellets were re-suspended in 1 ml 10 mM Tris–HCl buffer (pH 8.0). The resulting cell suspensions were sonicated to disrupt Sulfolobus cells. Cell debris was removed by centrifugation at 13000 rpm at 4°C for 30 min, yielding cellular extracts for further analysis. Protein concentrations of the samples were determined using Coomassie Protein Assay Kit (Thermo Scientific, Waltham, MA, USA). Similar amounts of protein (ca. 10 µg) were taken from the prepared cell extracts and loaded on a 12% polyacrylamide gel for sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). Then, fractionated proteins were transferred from the polyacrylamide gel onto a nitrocellulose blotting membrane (GE Healthcare, Waukesha, WI, USA), using the Semi-Dry Electrophoretic Transfer Cell system (Bio-Rad, Hercules, CA, USA) and used for immunoblotting. Briefly, the membrane was incubated in 5% skim milk blocking agent for 1 h, and then incubated with individual primary rabbit antibodies (against Orc1-2 or PCNA3 proteins) and finally with the horseradish peroxidase-labeled goat anti-rabbit antibody (Beyotime, Beijing, China) as described previously (31). Protein bands were visualized using the ECL western blot substrate (Thermo Scientific, Waltham, MA, USA) and recorded by exposure to an X-ray film.

Sequence analysis of promoters of highly activated DDR genes

Promoter sequences (100 bp preceding the start codon) of 21 Orc1-2-dependent NQO-responsive genes (over 16 folds change after NQO treatment, Supplementary Table S6) were individually retrieved from the genome sequence of S. islandicus REY15A (20). These sequences were then used for de novo motif discovery by using MEME (Multiple EM for Motif Elicitation) with the default setting to identify conserved motifs as previously reported (32).

DNase I footprinting assay

Orc1-2 protein was expressed in E. coli Rosetta cells carrying pET-orc1-2. The E. coli strain was cultured in an LB medium containing 30 µg/ml kanamycin at 37°C until A600 = 0.6. Orc1-2 protein synthesis was induced by adding 0.5 mM IPTG, and the induction was for 12 h at 16°C. Cell mass was harvested by centrifugation and re-suspended in the lysis buffer (50 mM phosphate saline buffer, 500 mM NaCl, 20 mM imidazole). Cells were disrupted using a French press at 4°C. After removing cell debris by centrifugation, the supernatant was loaded onto a 1 ml Ni-NTA column (GE Healthcare, Chicago, Illinois, USA). His-tagged recombinant Orc1-2 protein was purified by following the manufacturer’s instruction.

DNase I footprinting assay was performed as described previously (33). The upsE and tfb3 original promoter PupsE-DDRE and Ptfb3-DDRE and the mutated promoter PupsE-DDREmut and Ptfb3-DDREmut were amplified individually from the corresponding reporter gene plasmids (Supplementary Table S1) using primers listed in Supplementary Table S2. The resulting PCR products were then cloned into pCR™4-TOPO® (Thermo Scientific, Waltham, MA, USA), yielding plasmids that were used as template for PCR with an M13fwd primer carrying a 5′-end FAM labeling and an M13rev primer carrying a HEX 5′-end labeling. The resulting fluorescence-labeled DNAs were used as DNA probes for DNase I footprinting analysis. The coding strand of the upsE promoter region was labeled by FAM (peaks above the promoter sequences) and the non-coding strand was labeled by HEX (peaks below the promoter sequence). The two DNA strands of the tfb3 promoter region were labeled in the opposite combination: HEX for the coding strand and FAM for the non-coding strand. To ensure efficient binding, 400 ng of labeled DNA and 13.7 µg of Orc1-2 protein were mixed and incubated at 40°C for 20 min in 50 µl binding buffer [20 mM HEPES (pH 7.6), 10 mM (NH4)2SO4, 1 mM DTT, 0.2% Tween-20, 30 mM KCl]. Then, 0.02 U of RNase-free DNase I (Thermo Scientific, Waltham, MA, USA) was added and incubated at 37°C for 2 min to degrade unprotected DNA. Finally, 1/10 (v/v) of 50 mM EDTA was added to each reaction and incubated at 65°C for 10 min to stop the reaction. DNAs in the samples were extracted using GeneJET PCR Purification kit (Thermo Scientific, Waltham, MA, USA) and sent for fragment length analysis by capillary electrophoresis in Eurofins Genomics company (Ebersberg, Germany). Electropherograms were aligned using GeneMapper v2.6.3 (Applied Biosystems).

RESULTS

S. islandicus Δorc1-2 showed hypersensitivity to NQO treatment

First, S. islandicus Δorc1-2 strain constructed previously (15) was investigated for its sensitivity to NQO, a drug that forms stable bulky quinolone adducts on bases of DNA in bacterial and eukaryotic cells, leading to the formation of DSBs (34,35), and it induces programmed cell death in S. islandicus (16). Both Δorc1-2 and its corresponding WT were grown in SCV media containing different concentrations of NQO. Growth of these cultures was monitored by measuring their A600 values. We found that 2.5 µM NQO completely inhibited the growth of the mutant while it required 4 µM NQO to stop the growth of the WT strain (Supplementary Figure S2), indicating that Δorc1-2 is hypersensitive to this drug, in reference to WT.

Table 1. Survival rates of S. islandicus wild-type strain (E233S1) and Δorc1-2 after NQO treatment

Doses of NQO (µmol/l)   Viable cells in % after NQO treatment(a): E233S1   Δorc1-2   Ratio
0     100     100     1.0
1     42.66   11.33   3.8
2     11.47   1.92    6.0
2.5   3.29    0.23    14.3
3     1.23    0.04    30.8
4     0.04    0.002   20.0

(a) Exponentially growing cultures were treated with different doses of NQO for 6 h. Cell samples were taken and plated on drug-free SCVU plates for determination of colony formation units (CFU). Survival rates are expressed as % of viable cells in drug-treated cultures relative to those in the corresponding drug-free reference cultures.

The sensitivity of WT and Δorc1-2 to NQO was also evaluated by determination of their survival rate after drug treatment. Both strains were again grown in SCV in the presence of different concentrations of the drug for 6 h (hours post treatment, hpt). The number of viable cells in all cultures was then estimated by determination of their colony formation units (CFU), with the results summarized in Table 1. Two features are evident in these data: (a) the number of viable cells in NQO-treated cultures exhibited a strong reverse correlation to the drug concentration for both Δorc1-2 mutant and the WT strain, and (b) the ratio of cell viability between WT and Δorc1-2 changed from 3.8 to 30 folds as the NQO content increased from 1 to 3 µM

Downloaded from https://academic.oup.com/nar/advance-article-abstract/doi/10.1093/nar/gky487/5033999 by Det Kongelige Bibliotek user on 07 June 2018 Nucleic Acids Research, 2018 5

in the medium. These results confirmed the hypersensitivity of Δorc1-2 to NQO treatment.

Orc1-2 deficiency by gene deletion eliminated the NQO-responsive expression

Next, we extracted total RNAs from cell samples of WT and Δorc1-2 cultures (grown in SCV containing 2 µM NQO) at 6 hpt and from the corresponding cell samples of untreated reference cultures. These RNA samples were used to determine the mRNA abundance of individual genes by RNA-Seq. FPKM plot analysis of the RNA-Seq data revealed a large number of DEGs upon NQO treatment in WT, which did not show NQO-responsive expression in Δorc1-2 (Figure 1A, Supplementary Tables S3, S4 and S5). These include a total of 646 genes, among which 290 are up regulated and 356 are down regulated (with a corrected P-value = 0.005, fold change > 2) (Supplementary Table S3, Figure 1B).

Many up-regulated DEGs could be implicated in DNA damage repair (Supplementary Table S4), including: (a) orc1-2, coding for one of the three Orc1/Cdc6 orthologous proteins in this archaeon (20), (b) genes in the operon of UV-inducible pili of Sulfolobus (the Ups system) (36) and genes of the crenarchaeal system for exchange of DNA (the Ced system), the latter of which has recently been shown to be responsible for intercellular DNA transfer in Sulfolobus (37), (c) genes coding for the basal transcriptional factors TFB1 and TFB3 (38,39) and (d) genes coding for proteins involved in homologous recombination repair (40,41).

Several highly repressed DEGs are implicated in different cellular processes including replication initiation (Orc1-1 and Orc1-3) (42–45), genome maintenance and segregation (chromatin proteins Sul7d and Cren7 (46) and chromosome segregation proteins SegA and SegB (47)), and cell division (CdvA, ESCRT-III and Vps4 (48,49) as well as several ESCRT-III paralogs (50), Supplementary Table S5). Together, these results suggested that the NQO-induced DNA damage could impose DNA replication inhibition and cell cycle arrest in S. islandicus.

RNA-Seq analysis also revealed that 37 genes showed differential expression in Δorc1-2 (9 up- and 28 down-regulated genes, Figure 1C). Among the down-regulated genes, 24 showed <3-fold repression whereas the remaining four genes had a 3- to 5-fold reduction. The relatively low levels of repression suggest that the observed changes may reflect background fluctuation of gene expression in the mutant. The up-regulated genes include six genes with 2- to 5-fold activation and three highly activated genes: SiRe_0629 and SiRe_0630 (9- to 10-fold) and SiRe_0655 (47-fold), all of which are of unknown function. Overall, these data indicated that Δorc1-2 is no longer capable of mediating NQO-responsive expression.

S. islandicus Δorc1-2 lost the capability of cell aggregation and cell cycle regulation

To further study the phenotype of S. islandicus Δorc1-2, the deletion mutant and WT were grown in the medium containing 2 µM NQO for 24 h. Cell samples were taken during incubation and examined for cell aggregates under the microscope. We found that WT formed small cell aggregates of 3–7 cells at 6 hpt, and larger cell aggregates (10–30 cells) appeared at 12 hpt (Figure 2A). In contrast, there was essentially no difference in cell aggregation between the NQO-treated Δorc1-2 cells and their corresponding untreated references, since <5% of cells were found in aggregates and that number did not change after drug treatment (Figure 2B). Therefore, these results indicated that Δorc1-2 has lost the capability of cell aggregation.

These cell samples were also analyzed by flow cytometry. As shown in Figure 2C, the population of cells with DNA content clustering at 1 chromosome (G1+, comprising G1 and 'apparent G1' cells, the latter of which contain 1 chromosome with fired origins of replication) slightly increased in WT at 6 hpt (<15%), and strikingly, G1+ cells accounted for 75% of the total cell population in the Δorc1-2 cell sample. These results suggested that cell division was inhibited in WT cells, but not in Δorc1-2 cells, consistent with the RNA-Seq data in which the expression of cdvA, cdvB and vps4, which code for the proteins (CdvA, ESCRT-III, Vps4) responsible for the ESCRT mode of cell division in Sulfolobus (48,49), was down regulated in NQO-treated WT cells but was not changed in NQO-treated Δorc1-2 cells (Supplementary Table S5). In addition, the expression of orc1-1 and orc1-3, which code for replication initiators responsible for initiation at oriC1 and oriC2 of the S. islandicus chromosome (15), was also inhibited in NQO-treated WT cells but not in NQO-treated Δorc1-2 cells (Supplementary Table S5), suggesting that replication initiation could have been inhibited in the WT cells but not in the Δorc1-2 cells. We reasoned that most G1+ cells could be 'apparent' G1 cells containing fired origins of replication on their chromosome, in which DNA replication could be blocked by NQO-induced DNA lesions on the chromosome. If so, collapse of stalled replication forks would induce DSBs, leading to cell death. Indeed, continuous incubation of these NQO-treated cultures led to the death of most Δorc1-2 cells as well as a fraction of NQO-treated WT cells (Figure 2C). Overall, our results indicated that the Orc1-2-dependent DDR regulation is of crucial importance to genome integrity maintenance in this archaeon.

Promoters of DDR genes mediated NQO-responsive expression

In bacteria and archaea, short DNA segments immediately upstream of genes often contain all promoter elements required for directing the expression of the genes. To test if promoters of DDR genes could also contain DNA motifs required for DDR regulation, promoter fragments of four DDR genes, i.e. upsA, upsE, tfb3 and cedA1, were amplified by PCR using the primers listed in Supplementary Table S2, giving a ca. 200-bp DNA fragment (upstream of the start codon) for each promoter. PCR fragments of these promoters were then used to replace the araS-SD promoter in pSeSD (27), yielding reporter gene plasmids for these DDR genes (listed in Supplementary Table S1). All these promoters were found to confer NQO-responsive expression from the reporter gene plasmids in S. islandicus (see below). Then, promoter sequences (100 bp preceding the


Figure 1. Global transcriptional change mediated by Orc1-2 upon NQO treatment. (A) Heatmap of the genome expression of S. islandicus E233S1 (WT) and Δorc1-2. Both strains were grown in the presence or absence of 2 µM NQO (indicated as +NQO and –NQO, respectively) for 6 h. Cell mass was collected, from which total RNAs were prepared and used for RNA-Seq analysis. Genes are clustered by their log10(FPKM+1) values and their expression levels are illustrated with different colors, with red representing the highest levels of expression and blue the lowest. (B) Volcano plot of differentially expressed genes in the WT strain. (C) Volcano plot of differentially expressed genes in Δorc1-2. X-axis: fold change in gene expression; Y-axis: statistical significance of the fold change. Genes exhibiting >2-fold (i.e. log2 fold change > +1 or < –1) up and down regulation are highlighted in red and green, respectively, whereas those that showed a <2-fold change in expression are shown in blue.
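The thresholds used to color the volcano plots can be sketched in a few lines of Python. This is only an illustration: the function name `classify_gene`, the pseudocount of 1 (borrowed from the log10(FPKM+1) scaling of the heatmap) and the reading of the corrected P-value of 0.005 as an upper bound are assumptions, not the pipeline used in this study.

```python
import math


def classify_gene(fpkm_treated, fpkm_untreated, corrected_p,
                  fc_cutoff=2.0, p_cutoff=0.005, pseudocount=1.0):
    """Label a gene as 'up', 'down' or 'unchanged' upon NQO treatment.

    The pseudocount and the use of p_cutoff as an upper bound are
    illustrative assumptions, not taken from the published analysis.
    """
    # log2 fold change of treated versus untreated expression
    log2_fc = math.log2((fpkm_treated + pseudocount) /
                        (fpkm_untreated + pseudocount))
    # A gene must pass both the significance and the fold-change cutoff
    if corrected_p > p_cutoff or abs(log2_fc) <= math.log2(fc_cutoff):
        return "unchanged", log2_fc
    return ("up" if log2_fc > 0 else "down"), log2_fc


if __name__ == "__main__":
    # Hypothetical FPKM values, for demonstration only
    print(classify_gene(31.0, 0.0, 0.001))
    print(classify_gene(0.0, 3.0, 0.001))
```

With these settings, a gene counts as differentially expressed only when it changes more than 2-fold and its corrected P-value does not exceed the cutoff, which mirrors the red, green and blue classes described in the legend.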

start codon) of 21 highly up-regulated DDR genes (over 18-fold change, Supplementary Table S6) were retrieved from the genome sequence of S. islandicus REY15A (20) and analyzed for conserved sequence motifs using the MEME (Multiple EM for Motif Elicitation) suite (32). A 20-bp consensus (5′-AATAGTTTCRGWDTACTCWS-3′) was identified, containing the DNA motif 5′-ANTTTC-3′ previously reported for UV-responsive gene promoters of S. acidocaldarius (51). The motif is positioned at –23 to –55 bp upstream of the ATG codon in most identified DDR genes (Supplementary Figure S3).

Next, the hexanucleotide motif (5′-ANTTTC-3′) in four promoters (upsA, upsE, tfb3 and cedA1) was mutated individually by transversion mutation, giving the respective mutated promoters. Both the native promoters and their mutated derivatives were analyzed for NQO-responsive expression in WT and Δorc1-2. In WT transformants, mutation of the 5′-ANTTTC-3′ motif completely abolished the NQO-responsive expression from each promoter (Figure 3), indicating that the motif functions as an NQO-responsive element on these promoters. By contrast, none of the promoters showed NQO-responsive expression in the Δorc1-2 mutant (Figure 3), suggesting that the 5′-ANTTTC-3′ motif and the Orc1-2 protein could interact with each other to mediate DDR regulation in this archaeon.

Orc1-2 bound to the conserved motif on DDR gene promoters

To test if Orc1-2 could bind to the 5′-ANTTTC-3′ motif present in the promoter regions of DDR genes, the S. islandicus orc1-2 gene was cloned into pET30a, an E. coli expression vector, giving pET-orc1-2. The expression plasmid was introduced into E. coli for overexpression of recombinant Orc1-2 protein. Highly purified recombinant Orc1-2 protein was obtained (Supplementary Figure S4). DNase


Figure 2. The Δorc1-2 mutant lost the capability of cell aggregation and cell cycle regulation. (A) Formation of cell aggregates before (0 h) and after treatment with 2 µM NQO (12 h). (B) Quantification of the extent of cell aggregation. (C) Cell cycle profiles of the cultures. DNA contents were divided into 256 arbitrary points on the X-axis, and cell counts (Y-axis) were obtained for each point and plotted against the DNA content. WT: S. islandicus E233S1; Δorc1-2: orc1-2 deletion mutant derived from E233S1; DNA-less cells (L); cells containing one chromosome (1), and cells containing two chromosomes (2). Error bars: standard deviations of three independent experiments.
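The binning described for panel C (DNA content divided into 256 points, with a cell count per point) can be illustrated with a short Python sketch. The function name `dna_content_histogram`, the linear scaling and the input format (one fluorescence value per cell) are assumptions made for illustration; the actual instrument software may bin the signal differently.

```python
def dna_content_histogram(signals, n_bins=256, max_signal=None):
    """Count cells per DNA-content channel.

    signals: per-cell DNA fluorescence values (hypothetical input format).
    max_signal: upper end of the X-axis; defaults to the largest signal.
    """
    counts = [0] * n_bins
    if not signals:
        return counts
    top = max_signal if max_signal is not None else max(signals)
    for value in signals:
        if top <= 0:
            index = 0
        else:
            # Scale the signal linearly onto the channels and clamp to range
            index = int(value / top * n_bins)
            index = max(0, min(index, n_bins - 1))
        counts[index] += 1
    return counts


if __name__ == "__main__":
    # Four hypothetical cells binned into four channels for readability
    print(dna_content_histogram([0.0, 1.0, 2.0, 2.0], n_bins=4, max_signal=2.0))
```

Plotting the returned counts against the channel index gives a profile of the kind shown in the figure, where peaks correspond to DNA-less cells, one-chromosome cells and two-chromosome cells.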

I footprinting assay was performed with the original and mutated promoters of the upsE and tfb3 genes, in the presence or absence of Orc1-2 protein, following the procedure described previously (33). We found that Orc1-2 protein protected the sequence of the 5′-ANTTTC-3′ motif and its flanking regions on both DNA strands (Figure 4A), and substitution of the 5′-ANTTTC-3′ motif on the mutated promoters completely abolished the protection of the DDRE region by Orc1-2 binding (Figure 4B). These results, together with the reporter gene assays shown in Figure 3, indicated that Orc1-2 protein binds specifically to 5′-ANTTTC-3′ on these DDR gene promoters and activates their expression.

S. islandicus cells stably expressing a low level of Orc1-2 protein exhibited hypersensitivity to NQO treatment

To gain insight into the DDR regulation by Orc1-2 in this archaeon, we constructed an S. islandicus strain in which the original promoter of the orc1-2 gene was replaced with araS-50, a promoter derivative of the araS gene coding for an arabinose-binding protein. This yielded the promoter-substitution mutant designated orc1-2araS (Supplementary Figure S5). It has been shown that the araS-50 promoter confers arabinose-inducible expression in this archaeon (30). Therefore, when cultured in SCV with sucrose as the carbon source, a non-inducible medium for the orc1-2araS gene, Orc1-2 should be expressed to a constantly low level in the mutant. Then, WT and orc1-2araS were grown in SCV for 48 h either in the presence or absence of 2 µM NQO. Cell samples were taken during incubation and examined for orc1-2 expression, culture growth and cellular DNA content.

Analysis of Orc1-2 protein in the WT and orc1-2araS cells by immunoblotting revealed that the cellular content of Orc1-2 remained constantly low (Figure 5A) in the presence


Figure 3. Promoters of NQO-inducible genes dictate the regulation of the NQO-responsive expression. Four highly up-regulated genes (upsA, upsE, tfb3 and cedA1) were chosen for the reporter gene assay. S. islandicus strains carrying one of the 8 reporter gene plasmids were grown in SCV in the presence or absence of 2 µM NQO (denoted as +NQO and –NQO, respectively) for 6 h, and cell mass was collected, from which cell extracts were prepared and used for determination of β-glycosidase activity. DDRE: reporter gene plasmids of original promoters containing the 5′-ANTTTC-3′ motif; DDREmut: reporter gene plasmids of mutated promoters carrying a transversion mutation in the 5′-ANTTTC-3′ motif; WT: the genetic host S. islandicus E233S1; Δorc1-2: S. islandicus orc1-2 deletion mutant. Error bars: standard deviations of three independent experiments.

of NQO, consistent with the nature of the orc1-2araS fusion gene. The mutant showed an interesting phenotype: it grew in a similar fashion to WT in the absence of NQO; however, mutant growth was strongly inhibited in the presence of the drug and the growth inhibition persisted (Figure 5B). These results suggest that orc1-2araS could be deficient in initiating DDR regulation, as shown for Δorc1-2 (compare with the data in Figure 2 and Supplementary Figure S2). Indeed, flow cytometry of the cell samples showed that a large number of orc1-2araS cells became DNA-less cells during incubation in the NQO-SCV medium, in contrast to the WT cells grown in the same medium, the majority of which had recovered from NQO treatment at 48 hpt (Figure 5C). Together, these results indicated that S. islandicus cells containing a low level of Orc1-2 protein fail to respond to NQO treatment and that they are as hypersensitive to NQO treatment as Δorc1-2 cells.

S. islandicus cells stably expressing a high level of Orc1-2 protein responded more promptly to NQO treatment

Next, we investigated how the archaeal cells containing a high level of Orc1-2 could respond to NQO treatment. To do that, both orc1-2araS and WT were grown in ACV with D-arabinose as the carbon source, in the presence or absence of 2 µM NQO. This medium is an inducible medium for the expression of the orc1-2araS gene, such that Orc1-2 should be expressed to a constantly high level in orc1-2araS cells. These cultures were grown for 48 h, during which cell samples were taken for the analyses described above. First, immunoblotting confirmed that Orc1-2 was expressed to a high level in ACV-cultured orc1-2araS cells (Figure 6A). Then, growth data revealed that NQO treatment had little influence on the growth of orc1-2araS cells in ACV, since very similar growth curves were obtained for NQO-treated versus untreated orc1-2araS cultures (Figure 6B). Nevertheless, flow cytometry detected an increase of the cell population with 1–2 chromosomes in the NQO-treated cultures of both strains, indicating that both WT and orc1-2araS are capable of recovering from NQO-induced DNA damage under this growth condition (Figure 6C). Strikingly, both sets of data (growth curves and flow cytometry profiles) suggest a quicker recovery for the promoter-substitution mutant relative to WT (Figure 6). Taken together, these results suggested that a high level of Orc1-2 could probably shorten the time required for execution of DDR regulation in S. islandicus.

A constant high level of Orc1-2 protein enabled immediate induction of DDR genes in the archaeon upon NQO treatment

To test that, the activation of gene expression in orc1-2araS and WT cells after NQO treatment was investigated for a few selected DDR genes, including tfb3, upsX and cedB. Cell samples taken at the early stage of NQO treatment (1 and 3 hpt) were used for total RNA extraction, and the extracted RNAs were analyzed for DDR gene expression by RT-qPCR.

As shown in Figure 7A, none of the three DDR genes showed any NQO-responsive expression in the cells expressing a constant low level of Orc1-2 (SCV-cultured orc1-2araS


Figure 4. Orc1-2 protein bound to a conserved motif on upsE and tfb3 promoters. DNase I footprinting was performed with a fluorescence-labeled DNA fragment of the PupsE or Ptfb3 promoter (ca. 200 bp) in the presence (red peaks) or absence (cyan peaks) of Orc1-2. DDRE: promoter fragments containing the 5′-ANTTTC-3′ motif (highlighted in red); DDREmut: mutated promoter fragments carrying the transversion mutation of the 5′-ANTTTC-3′ motif (highlighted in blue). The promoter sequences shown for PupsE and Ptfb3 are positioned from –17 to –69 and from –47 to –99 in reference to their start codons, respectively.
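For readers who wish to locate the 5′-ANTTTC-3′ motif in a promoter fragment, the following Python sketch scans both strands of a sequence that ends immediately before the start codon and reports each hit relative to that codon. The function names and the example sequences are hypothetical, and the scan is a plain pattern match; it is not the MEME-based motif discovery used in this work.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(promoter, motif="ANTTTC"):
    """Find motif occurrences on both strands of a promoter fragment.

    promoter: sequence ending just before the start codon, so that its
    last base is position -1. Returns sorted (position, strand, match)
    tuples, where position is that of the leftmost base of the hit on
    the forward strand.
    """
    seq = promoter.upper()
    # A lookahead lets overlapping matches be reported as well
    pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in motif.upper()) + "))")
    hits = []
    for m in pattern.finditer(seq):
        hits.append((m.start() - len(seq), "+", m.group(1)))
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert the reverse-strand index back to forward coordinates
        start_fwd = len(seq) - m.start() - len(motif)
        hits.append((start_fwd - len(seq), "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical 12-nt fragment, for demonstration only
    print(find_motif("GGGAGTTTCGGG"))
```

Applied to the 100-bp upstream regions of the DDR genes, a scan of this kind would be expected to report hits within the window upstream of the start codon described in the text, although the exact positions depend on the sequences supplied.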

cells), reminiscent of the lack of DDR regulation in Δorc1-2. By contrast, all three DDR genes were readily activated by NQO treatment in the cells expressing a constantly high level of the regulator (ACV-cultured orc1-2araS cells), and in fact, their expression levels in orc1-2araS cells at 1 hpt are higher than those in WT cells at 3 hpt (Figure 7B). In summary, two sequential events have been identified in the archaeal DDR regulation: (a) activation of the expression of Orc1-2, and (b) the subsequent activation of target genes by Orc1-2. Therefore, Orc1-2 plays a very important role in the DDR regulation in this archaeon.

DISCUSSION

Here, we show that Orc1-2 functions as a global regulator to mediate DNA damage response in S. islandicus, a hyperthermophilic crenarchaeon. To our knowledge, this represents the first identification of a key regulator in the DNA damage response of an archaeal organism.

S. islandicus Orc1-2 is an ortholog of the archaeal/eukaryotic Orc1/Cdc6 superfamily of replication initiators (42–45). The encoding gene is not located close to any origin of replication on the chromosome of S. islandicus (15), and this genetic organization is conserved in all known species in Sulfolobales (44,45) (Supplementary Figure S6). A previous genetic analysis of orc1-2 function in S. islandicus revealed that the gene does not have a function in replication initiation, in contrast to its orthologs Orc1-1 and Orc1-3, which function as the replication initiators at the adjacent oriC1 and oriC2, respectively (15). Here, we show that Δorc1-2 is hypersensitive to NQO, a chemical that yields bulky adducts on bases of chromosomal DNA, which are to be repaired by NER (35), and that the mutant has lost the capability of NQO-induced cell aggregation and cell cycle regulation (Figure 2). In fact, NQO-responsive expression is completely abrogated in Δorc1-2 (Figure 1, Supplementary Table S3). Together, these findings suggest that, during evolution, Orc1-2 has lost the function as a replication initiator and gained a new function as a primary regulator in DNA damage response. In addition, investigation of bacterial replication initiator DnaA proteins and eukaryotic replication initiators Orc1 and Cdc6 proteins reveals that these factors also function as transcriptional regulators, but they have maintained their function in replication initiation (52–54), which is in contrast to the scenario reported for the S. islandicus Orc1-2 here. Therefore, this Orc1-2 protein represents the first example of functional diversification of proteins of the Orc1/Cdc6 superfamily.


Figure 5. S. islandicus orc1-2araS cells failed to respond to NQO treatment in SCV media. (A) Western analysis of Orc1-2 protein. Only the samples taken at 6 hpt are shown. In the lower panel, only the NQO-treated WT sample (lane 2) was diluted 8-fold. PCNA3 (one of the subunits of the replication clamp) was used as the loading reference. (B) Growth curves based on absorbance at 600 nm. (C) Flow cytometry profile of cell samples. DNA contents were divided into 256 arbitrary points on the X-axis, and cell counts (Y-axis) were obtained for each point and plotted against the DNA content. WT: S. islandicus E233S1; orc1-2araS: promoter-substitution mutant containing the araS-50 promoter-orc1-2 fusion gene. DNA-less cells (L); cells containing one chromosome (1), and cells containing two chromosomes (2).

Interestingly, it has been reported that archaeal Orc1/Cdc6 proteins form two distinct clades corresponding to the Orc1-1 and Orc1-2 clusters, in which proteins of the Orc1-2 cluster evolve faster than those of the Orc1-1 cluster (44,45). Since S. islandicus Orc1-2 functions as a global DDR regulator, the fast evolution of the Orc1-2 cluster probably reflects the acquisition of a new function for some of the Orc1 proteins in this cluster. Furthermore, the conservation of the genetic organization of orc1-2 and its flanking genes in Sulfolobales (Supplementary Figure S6) suggests that these Orc1-2 homologs could also function as key regulators in archaeal DDR regulation. Moreover, it is also very tempting to investigate whether non-Sulfolobales members of the Orc1-2 cluster could also play a role in DNA damage-induced genome expression, as for the S. islandicus Orc1-2.

Figure 6. S. islandicus orc1-2araS cells exhibited a quicker recovery from NQO treatment in ACV media. (A) Western analysis of Orc1-2 protein. Only the samples taken at 6 hpt are shown. In the lower panel, the NQO-treated WT sample (lane 2) and the NQO-treated and untreated orc1-2araS samples (lanes 3 and 4) were diluted 8-fold. PCNA3 (one of the subunits of the replication clamp) was used as the loading reference. (B) Growth curves based on absorbance at 600 nm. (C) Flow cytometry profile of cell samples. DNA contents were divided into 256 arbitrary points on the X-axis, and cell counts (Y-axis) were obtained for each point and plotted against the DNA content. WT: S. islandicus E233S1; orc1-2araS: promoter-substitution mutant containing the araS-50 promoter-orc1-2 fusion gene. DNA-less cells (L); cells containing one chromosome (1), and cells containing two chromosomes (2).

Our research has gained important insights into the mechanisms of Orc1-2-dependent regulation in S. islandicus. First, a conserved motif (5′-ANTTTC-3′) is present in the promoters of a number of highly up-regulated DDR genes, and it is identical to the motif identified in UV-inducible genes of S. acidocaldarius (51). Here, we show that the S. islandicus Orc1-2 protein binds specifically to the motif present in a selected set of DDR gene promoters in vitro (Figure 4), and that inactivation of either component (Orc1-2 deficiency or mutagenesis of the DNA binding motif) abolishes the NQO-responsive expression from all tested promoters in the reporter gene assay (Figure 3). Thus, the deduced mechanism of transcriptional activation is that Orc1-2 binds to the DDR promoters and facilitates gene expression. Second, a high level of orc1-2 expression is essential but not sufficient to trigger DDR regulation in this archaeon, and this is strongly supported by the following findings: (a) When orc1-2 is constitutively expressed to a constantly low level in orc1-2araS, a promoter-substitution mutant, DDR is not induced by NQO treatment, as observed for Δorc1-2. (b) When orc1-2 is expressed to a constantly high level, the mutant does not show any growth delay or retardation in the absence of NQO. Third, orc1-2araS cells grown in ACV media contain a high level of Orc1-2 but they do not exhibit any DDR regulation in the absence of NQO. Nevertheless, these orc1-2araS cells respond more promptly to NQO treatment in DDR initiation than WT cells (Figure 7). In addition, TFB3 is a truncated TFB paralog that also functions as a DDR regulator for a subset of DDR genes (31). However, the TFB3-dependent activation of DDR genes is completely abolished in Δorc1-2, indicating that TFB3 must function downstream of Orc1-2 in the DDR network of this archaeon (Supplementary Table S4). Taken together, these results suggest that Orc1-2 is the primary regulator in the DDR network in this archaeon and that the factor could be activated by posttranslational modifications (PTMs), in analogy to the eukaryotic ATM-dependent activation of DDR regulators, such as p53, an extensively characterized tumor suppressor that regulates cell cycle arrest, DNA repair and apoptosis (55,56).

Genes that are regulated both by Orc1-2 and by TFB3 include those present in the ups operon coding for the proteins involved in UV-responsive pilus formation (36) and those of the Ced system that mediates intercellular DNA transfer (37). It has been known for a long time that S.


Figure 7. High Orc1-2 level in S. islandicus cells prior to NQO treatment enabled earlier responsive expression. (A) Relative expression levels of tfb3, upsX and cedB in WT and orc1-2araS cells at 1 and 3 hpt when grown in SCV, in which the expression level of orc1-2 was constantly low. (B) Relative expression levels of tfb3, upsX and cedB in WT and orc1-2araS cells at 1 and 3 hpt when grown in ACV, in which the expression level of orc1-2 was constantly high. The expression level of these genes in the cell samples of WT and orc1-2araS was estimated by RT-qPCR, with the obtained data normalized to the level of 16S rRNA. WT: S. islandicus E233S1; orc1-2araS: promoter-substitution mutant containing the araS-50 promoter-orc1-2 fusion gene. Error bars: standard deviations of three independent experiments.
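Normalization of RT-qPCR data to a reference transcript is often done with the comparative Ct (2^-ΔΔCt) calculation, sketched below in Python. The legend states only that the data were normalized to 16S rRNA; the use of the 2^-ΔΔCt formula, the assumption of perfect amplification efficiency, the function name and the Ct values in the example are all assumptions made for illustration.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the comparative Ct method.

    The reference would be 16S rRNA here. An amplification efficiency
    of 2 per cycle is assumed, which is an idealization.
    """
    # Normalize each sample to its own reference transcript
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the untreated control
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 3 cycles earlier
    print(relative_expression(20.0, 10.0, 23.0, 10.0))
```

In this scheme a target that reaches threshold three cycles earlier in the NQO-treated sample, with an unchanged reference, corresponds to an 8-fold increase in relative expression.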

acidocaldarius mediates chromosomal DNA exchange (57) and that the process is highly efficient (58). In fact, other Sulfolobus species have also been shown to mediate DNA conjugation, although the process has to be induced, such as by UV irradiation or DNA damage treatment with bleomycin (36,37,59). Since the two systems have been implicated in repairing UV-induced DNA damage in S. acidocaldarius (36,37,59,60) and NQO-mediated DNA damage in S. islandicus, activation of their gene expression represents one of the cellular events in the DDR network of S. islandicus.

The up regulation of known HRR genes including nurA, rad50, mre11 and herA (Supplementary Table S4) by DNA damage agents is consistent with the requirement for DNA repair with high fidelity and with the strong activation of DNA transfer activity for importing DNA templates for HRR, as discussed above. Currently, whether there is any functional connection between the DNA transfer and HRR activity remains to be investigated. Employing the existing genetic manipulation methods to tackle this problem is challenging, since each HRR gene is essential in this organism (61,62). Nevertheless, an efficient CRISPR-based gene knockdown approach has been reported for this crenarchaeon recently (63) and successfully used to dissect the function of the essential topR1 gene coding for a reverse gyrase (24). This approach should be useful to test the hypothesis that the HRR system works in concert with the Ced system in archaeal DNA damage repair.

Among the genes coding for the four DNA polymerases in S. islandicus REY15A (20), only SiRe_0614 and SiRe_0615, which code for DNA polymerase B2 (Dpo2), show NQO-responsive activation, and the regulation is Orc1-2-dependent but TFB3-independent, as demonstrated here and in a previous work. Strikingly, the Sulfolobus Dpo2 has been considered an inactivated DNA polymerase because there are multiple substitutions in the catalytic residues of its polymerase and exonuclease domains (64). Nevertheless, a recombinant protein of the large subunit of the S. solfataricus Dpo2 is capable of catalyzing DNA synthesis using DNA templates containing a range of DNA lesions, such as 8-oxoguanine, hypoxanthine and uracil (65), suggesting that it could be a translesion DNA polymerase. Since SiRe_0236, coding for the Y-family DNA polymerase (whose homolog in S. solfataricus functions as a DNA lesion-bypass enzyme), did not exhibit any up regulation upon NQO treatment, it would be interesting to investigate if Dpo2 could be an active DNA polymerase in DNA damage repair.

In conclusion, our research has yielded the first picture of the archaeal DDR network for the regulation of cellular processes upon DNA damage, in which the key regulator, Orc1-2, is positioned at the heart of the regulatory network (Figure 8). First, the global regulator has to be activated by DNA damage, and the activation occurs in two aspects: (a) Orc1-2 strongly up regulates its own gene expression, and (b) the factor could be activated by PTMs, such as phosphorylation, acetylation and/or methylation. Then, the activated Orc1-2 exerts either activation or repression on the expression of a large number of DDR genes. The repressed genes include those functioning in cell division, DNA replication initiation and genome segregation, whose repression mediates cell cycle arrest. The activated genes are as follows: (a) Dpo2, which may function in translesion DNA synthesis, (b) HRR genes that function in DNA repair and (c) TFB3, a secondary DDR regulator responsible for activation of the expression of the ups operon and the ced genes (31), both of which have been implicated in importing DNA fragments for HRR. Finally, (d) the cell aggregation and DNA transfer systems are subjected to dual control by TFB3 and Orc1-2 (Figure 8). These results demonstrate, for the first time, that the strategy of orchestrating a network of cellular events to deal with DNA damage is evolutionarily conserved across the three domains of life.


Figure 8. An Orc1-2-centered network of DNA damage response in S. islandicus. DNA damage agents yield lesions on DNA that will be converted into double-stranded breaks, which activate the DNA damage signal transduction pathway in this archaeon. Then, the global regulator Orc1-2 is probably activated by posttranslational modifications, such as phosphorylation and/or acetylation. The activated form of Orc1-2 then recognizes the DDRE present in the promoters of DDR genes and activates or represses their expression, affecting several different cellular processes as well as its own gene. AAA+: ATPases associated with diverse cellular activities; wH: winged-helix DNA binding domain; DDRE: DNA damage-responsive element; BRE: transcriptional factor B (TFB) recognition element; TATA: TATA box serving as the binding site for TATA-binding protein (TBP); TTS: transcription start site; Ups: UV-responsive pili of Sulfolobus; Ced: crenarchaeal system for exchange of DNA.

DATA AVAILABILITY

The accession number for the RNA-Seq data in this paper is GEO: GSE101744.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

ACKNOWLEDGEMENTS

We thank our colleagues in the Archaea Centre, University of Copenhagen and those in the Wuhan laboratory, Huazhong Agricultural University, China for helpful discussions.

FUNDING

National Natural Science Foundation of China [31771380]; Danish Council for Independent Research [DFF-4181-00274]. Funding for open access charge: National Science Foundation of China. M.S. and X.F. are recipients of PhD studentship from the China Scholarship Council.

Conflict of interest statement. None declared.

REFERENCES

1. Baharoglu,Z. and Mazel,D. (2014) SOS, the formidable strategy of bacteria against aggressions. FEMS Microbiol. Rev., 38, 1126–1145.
2. Giglia-Mari,G., Zotter,A. and Vermeulen,W. (2011) DNA damage response. Cold Spring Harb. Perspect. Biol., 3, a000745.
3. Butala,M., Zgur-Bertok,D. and Busby,S.J. (2009) The bacterial LexA transcriptional repressor. Cell. Mol. Life Sci., 66, 82–93.
4. Kreuzer,K.N. (2013) DNA damage responses in prokaryotes: regulating gene expression, modulating growth patterns, and manipulating replication forks. Cold Spring Harb. Perspect. Biol., 5, a012674.
5. Sirbu,B.M. and Cortez,D. (2013) DNA damage response: three levels of DNA repair regulation. Cold Spring Harb. Perspect. Biol., 5, a012724.
6. Blackford,A.N. and Jackson,S.P. (2017) ATM, ATR, and DNA-PK: the trinity at the heart of the DNA damage response. Mol. Cell, 66, 801–817.
7. Baliga,N.S., Bjork,S.J., Bonneau,R., Pan,M., Iloanusi,C., Kottemann,M.C., Hood,L. and DiRuggiero,J. (2004) Systems level

Downloaded from https://academic.oup.com/nar/advance-article-abstract/doi/10.1093/nar/gky487/5033999 by Det Kongelige Bibliotek user on 07 June 2018 14 Nucleic Acids Research, 2018

insights into the stress response to UV radiation in the halophilic archaeon Halobacterium NRC-1. Genome Res., 14, 1025–1035.
8. McCready,S., Muller,J.A., Boubriak,I., Berquist,B.R., Ng,W.L. and DasSarma,S. (2005) UV irradiation induces homologous recombination genes in the model archaeon, Halobacterium sp. NRC-1. Saline Syst., 1, 3.
9. Frols,S., Gordon,P.M., Panlilio,M.A., Duggin,I.G., Bell,S.D., Sensen,C.W. and Schleper,C. (2007) Response of the hyperthermophilic archaeon Sulfolobus solfataricus to UV damage. J. Bacteriol., 189, 8708–8718.
10. Gotz,D., Paytubi,S., Munro,S., Lundgren,M., Bernander,R. and White,M.F. (2007) Responses of hyperthermophilic crenarchaea to UV irradiation. Genome Biol., 8, R220.
11. Salerno,V., Napoli,A., White,M.F., Rossi,M. and Ciaramella,M. (2003) Transcriptional response to DNA damage in the archaeon Sulfolobus solfataricus. Nucleic Acids Res., 31, 6127–6138.
12. Erill,I., Campoy,S. and Barbe,J. (2007) Aeons of distress: an evolutionary perspective on the bacterial SOS response. FEMS Microbiol. Rev., 31, 637–656.
13. Ciccia,A. and Elledge,S.J. (2010) The DNA damage response: making it safe to play with knives. Mol. Cell, 40, 179–204.
14. Peng,N., Han,W., Li,Y., Liang,Y. and She,Q. (2017) Genetic technologies for extremely thermophilic microorganisms of Sulfolobus, the only genetically tractable genus of crenarchaea. Sci. China Life Sci., 60, 370–385.
15. Samson,R.Y., Xu,Y., Gadelha,C., Stone,T.A., Faqiri,J.N., Li,D., Qin,N., Pu,F., Liang,Y.X., She,Q. et al. (2013) Specificity and function of archaeal DNA replication initiator proteins. Cell Rep., 3, 485–496.
16. Han,W., Xu,Y., Feng,X., Liang,Y.X., Huang,L., Shen,Y. and She,Q. (2017) NQO-Induced DNA-Less cell formation is associated with chromatin protein degradation and dependent on A0A1-ATPase in sulfolobus. Front. Microbiol., 8, 1480.
17. Zillig,W., Kletzin,A., Schleper,C., Holz,I., Janekovic,D., Hain,J., Lanzendorfer,M. and Kristjansson,J.K. (1994) Screening for sulfolobales, their plasmids and their viruses in icelandic solfataras. Syst. Appl. Microbiol., 16, 609–628.
18. Schleper,C., Kubo,K. and Zillig,W. (1992) The particle SSV1 from the extremely thermophilic archaeon Sulfolobus is a virus: demonstration of infectivity and of transfection with viral DNA. Proc. Natl. Acad. Sci. U.S.A., 89, 7645–7649.
19. Deng,L., Zhu,H., Chen,Z., Liang,Y.X. and She,Q. (2009) Unmarked gene deletion and host-vector system for the hyperthermophilic crenarchaeon Sulfolobus islandicus. Extremophiles, 13, 735–746.
20. Guo,L., Brugger,K., Liu,C., Shah,S.A., Zheng,H., Zhu,Y., Wang,S., Lillestol,R.K., Chen,L., Frank,J. et al. (2011) Genome analyses of Icelandic strains of Sulfolobus islandicus, model organisms for genetic and virus-host interaction studies. J. Bacteriol., 193, 1672–1680.
21. Trapnell,C., Roberts,A., Goff,L., Pertea,G., Kim,D., Kelley,D.R., Pimentel,H., Salzberg,S.L., Rinn,J.L. and Pachter,L. (2012) Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc., 7, 562–578.
22. Wang,L., Feng,Z., Wang,X., Wang,X. and Zhang,X. (2010) DEGseq: an R package for identifying differentially expressed genes from RNA-seq data. Bioinformatics, 26, 136–138.
23. Schmittgen,T.D. and Livak,K.J. (2008) Analyzing real-time PCR data by the comparative C(T) method. Nat. Protoc., 3, 1101–1108.
24. Han,W., Feng,X. and She,Q. (2017) Reverse gyrase functions in genome integrity maintenance by protecting DNA breaks in vivo. Int. J. Mol. Sci., 18, E1340.
25. Li,Y., Pan,S., Zhang,Y., Ren,M., Feng,M., Peng,N., Chen,L., Liang,Y.X. and She,Q. (2016) Harnessing Type I and Type III CRISPR-Cas systems for genome editing. Nucleic Acids Res., 44, e34.
26. Warrens,A.N., Jones,M.D. and Lechler,R.I. (1997) Splicing by overlap extension by PCR using asymmetric amplification: an improved technique for the generation of hybrid proteins of immunological interest. Gene, 186, 29–35.
27. Peng,N., Deng,L., Mei,Y., Jiang,D., Hu,Y., Awayez,M., Liang,Y. and She,Q. (2012) A synthetic arabinose-inducible promoter confers high levels of recombinant protein expression in hyperthermophilic archaeon Sulfolobus islandicus. Appl. Environ. Microbiol., 78, 5630–5637.
28. She,Q., Singh,R.K., Confalonieri,F., Zivanovic,Y., Allard,G., Awayez,M.J., Chan-Weiher,C.C., Clausen,I.G., Curtis,B.A., De Moors,A. et al. (2001) The complete genome of the crenarchaeon Sulfolobus solfataricus P2. Proc. Natl. Acad. Sci. U.S.A., 98, 7835–7840.
29. Heckman,K.L. and Pease,L.R. (2007) Gene splicing and mutagenesis by PCR-driven overlap extension. Nat. Protoc., 2, 924–932.
30. Peng,N., Xia,Q., Chen,Z., Liang,Y.X. and She,Q. (2009) An upstream activation element exerting differential transcriptional activation on an archaeal promoter. Mol. Microbiol., 74, 928–939.
31. Feng,X., Sun,M., Han,W., Liang,Y.X. and She,Q. (2018) A transcriptional factor B paralog functions as an activator to DNA damage-responsive expression in archaea. Nucleic Acids Res., doi:10.1093/nar/gky236.
32. Bailey,T.L., Boden,M., Buske,F.A., Frith,M., Grant,C.E., Clementi,L., Ren,J., Li,W.W. and Noble,W.S. (2009) MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res., 37, W202–W208.
33. He,F., Vestergaard,G., Peng,W., She,Q. and Peng,X. (2017) CRISPR-Cas type I-A Cascade complex couples viral infection surveillance to host transcriptional regulation in the dependence of Csa3b. Nucleic Acids Res., 45, 1902–1913.
34. Bailleul,B., Daubersies,P., Galiegue-Zouitina,S. and Loucheux-Lefebvre,M.H. (1989) Molecular basis of 4-nitroquinoline 1-oxide carcinogenesis. Jpn. J. Cancer Res., 80, 691–697.
35. Williams,A.B., Hetrick,K.M. and Foster,P.L. (2010) Interplay of DNA repair, homologous recombination, and DNA polymerases in resistance to the DNA damaging agent 4-nitroquinoline-1-oxide in Escherichia coli. DNA Repair (Amst.), 9, 1090–1097.
36. Ajon,M., Frols,S., van Wolferen,M., Stoecker,K., Teichmann,D., Driessen,A.J., Grogan,D.W., Albers,S.V. and Schleper,C. (2011) UV-inducible DNA exchange in hyperthermophilic archaea mediated by type IV pili. Mol. Microbiol., 82, 807–817.
37. van Wolferen,M., Wagner,A., van der Does,C. and Albers,S.V. (2016) The archaeal Ced system imports DNA. Proc. Natl. Acad. Sci. U.S.A., 113, 2496–2501.
38. Qureshi,S.A. and Jackson,S.P. (1998) Sequence-specific DNA binding by the S. shibatae TFIIB homolog, TFB, and its effect on promoter strength. Mol. Cell, 1, 389–400.
39. Paytubi,S. and White,M.F. (2009) The crenarchaeal DNA damage-inducible transcription factor B paralogue TFB3 is a general activator of transcription. Mol. Microbiol., 72, 1487–1499.
40. Constantinesco,F., Forterre,P. and Elie,C. (2002) NurA, a novel 5′-3′ nuclease gene linked to rad50 and mre11 homologs of thermophilic Archaea. EMBO Rep., 3, 537–542.
41. Constantinesco,F., Forterre,P., Koonin,E.V., Aravind,L. and Elie,C. (2004) A bipolar DNA helicase gene, herA, clusters with rad50, mre11 and nurA genes in thermophilic archaea. Nucleic Acids Res., 32, 1439–1447.
42. Bell,S.D. (2012) Archaeal orc1/cdc6 proteins. Subcell. Biochem., 62, 59–69.
43. Parker,M.W., Botchan,M.R. and Berger,J.M. (2017) Mechanisms and regulation of DNA replication initiation in eukaryotes. Crit. Rev. Biochem. Mol. Biol., 52, 107–144.
44. Raymann,K., Forterre,P., Brochier-Armanet,C. and Gribaldo,S. (2014) Global phylogenomic analysis disentangles the complex evolutionary history of DNA replication in archaea. Genome Biol. Evol., 6, 192–212.
45. Makarova,K.S. and Koonin,E.V. (2013) Archaeology of eukaryotic DNA replication. Cold Spring Harb. Perspect. Biol., 5, a012963.
46. Driessen,R.P., Meng,H., Suresh,G., Shahapure,R., Lanzani,G., Priyakumar,U.D., White,M.F., Schiessel,H., van Noort,J. and Dame,R.T. (2013) Crenarchaeal chromatin proteins Cren7 and Sul7 compact DNA by inducing rigid bends. Nucleic Acids Res., 41, 196–205.
47. Kalliomaa-Sanford,A.K., Rodriguez-Castaneda,F.A., McLeod,B.N., Latorre-Rosello,V., Smith,J.H., Reimann,J., Albers,S.V. and Barilla,D. (2012) Chromosome segregation in Archaea mediated by a hybrid DNA partition machine. Proc. Natl. Acad. Sci. U.S.A., 109, 3754–3759.
48. Samson,R.Y., Obita,T., Freund,S.M., Williams,R.L. and Bell,S.D. (2008) A role for the ESCRT system in cell division in archaea. Science, 322, 1710–1713.
49. Lindas,A.C., Karlsson,E.A., Lindgren,M.T., Ettema,T.J. and Bernander,R. (2008) A unique cell division machinery in the Archaea. Proc. Natl. Acad. Sci. U.S.A., 105, 18942–18946.


50. Liu,J., Gao,R., Li,C., Ni,J., Yang,Z., Zhang,Q., Chen,H. and Shen,Y. (2017) Functional assignment of multiple ESCRT-III homologs in cell division and budding in Sulfolobus islandicus. Mol. Microbiol., 105, 540–553.
51. Le,T.N., Wagner,A. and Albers,S.V. (2017) A conserved hexanucleotide motif is important in UV-inducible promoters in Sulfolobus acidocaldarius. Microbiology, 163, 778–788.
52. Scholefield,G., Veening,J.W. and Murray,H. (2011) DnaA and ORC: more than DNA replication initiators. Trends Cell Biol., 21, 188–194.
53. Sasaki,T. and Gilbert,D.M. (2007) The many faces of the origin recognition complex. Curr. Opin. Cell Biol., 19, 337–343.
54. Hossain,M. and Stillman,B. (2016) Opposing roles for DNA replication initiator proteins ORC1 and CDC6 in control of Cyclin E gene transcription. Elife, 5, pii: e12785.
55. Beckerman,R. and Prives,C. (2010) Transcriptional regulation by p53. Cold Spring Harb. Perspect. Biol., 2, a000935.
56. Fischer,M. (2017) Census and evaluation of p53 target genes. Oncogene, 36, 3943–3956.
57. Grogan,D.W. (1996) Exchange of genetic markers at extremely high temperatures in the archaeon Sulfolobus acidocaldarius. J. Bacteriol., 178, 3207–3211.
58. Jacobs,K.L. and Grogan,D.W. (1998) Spontaneous mutation in a thermoacidophilic archaeon: evaluation of genetic and physiological factors. Arch. Microbiol., 169, 81–83.
59. Frols,S., Ajon,M., Wagner,M., Teichmann,D., Zolghadr,B., Folea,M., Boekema,E.J., Driessen,A.J., Schleper,C. and Albers,S.V. (2008) UV-inducible cellular aggregation of the hyperthermophilic archaeon Sulfolobus solfataricus is mediated by pili formation. Mol. Microbiol., 70, 938–952.
60. van Wolferen,M., Ma,X. and Albers,S.V. (2015) DNA processing proteins involved in the UV-induced stress response of Sulfolobales. J. Bacteriol., 197, 2941–2951.
61. Huang,Q., Liu,L., Liu,J., Ni,J., She,Q. and Shen,Y. (2015) Efficient 5′-3′ DNA end resection by HerA and NurA is essential for cell viability in the crenarchaeon Sulfolobus islandicus. BMC Mol. Biol., 16, 2.
62. Zhang,C., Tian,B., Li,S., Ao,X., Dalgaard,K., Gokce,S., Liang,Y. and She,Q. (2013) Genetic manipulation in Sulfolobus islandicus and functional analysis of DNA repair genes. Biochem. Soc. Trans., 41, 405–410.
63. Peng,W., Feng,M., Feng,X., Liang,Y.X. and She,Q. (2015) An archaeal CRISPR type III-B system exhibiting distinctive RNA targeting features and mediating dual RNA and DNA interference. Nucleic Acids Res., 43, 406–417.
64. Rogozin,I.B., Makarova,K.S., Pavlov,Y.I. and Koonin,E.V. (2008) A highly conserved family of inactivated archaeal B family DNA polymerases. Biol. Direct., 3, 32.
65. Choi,J.Y., Eoff,R.L., Pence,M.G., Wang,J., Martin,M.V., Kim,E.J., Folkmann,L.M. and Guengerich,F.P. (2011) Roles of the four DNA polymerases of the crenarchaeon Sulfolobus solfataricus and accessory proteins in DNA replication. J. Biol. Chem., 286, 31180–31193.

Supplementary tables

Supplementary table S1. Sulfolobus strains and vectors used in this study

| Strain or vector | Genotype/features | Source or reference |
|---|---|---|
| Sulfolobus strains | | |
| S. islandicus E233S1 | ΔpyrEF ΔlacS | Deng et al. (1) |
| S. islandicus Δorc1-2 | Derived from E233S1, carrying deletion of orc1-2 genes | Samson et al. (2) |
| S. islandicus orc1-2araS | Derived from E233S1, the original promoter of orc1-2 gene was replaced with araS-50 (a promoter derivative of the araS gene coding for arabinose-binding protein that shows arabinose-inducible expression (3)) | This work |
| Vectors | | |
| pSe-Rp | The plasmid contains a DNA fragment of two tandem copies of CRISPR repeat sequences for construction of the artificial mini-CRISPR array | Peng et al. (4) |
| pAC-orc1-2 | pSe-Rp carrying a spacer matching the protospacer in the promoter region of orc1-2 gene in the genome | This work |
| pZC1-S-50 | Sulfolobus-E. coli shuttle vector derived from pSeSD with the substitution mutation of TA to GC at -51/-50 relative to TSS of araS | Peng et al. (3) |
| pZC1-S-50-orc1-2 | The pZC1-S-50 carrying an Orc1-2 coding sequence between NdeI and SalI | This work |
| pGE-orc1-2araS | The genome-editing plasmid derived from pAC-orc1-2, with the donor (lacking the protospacer) DNA inserted between SphI and XhoI | This work |
| pSeSD | A Sulfolobus-E. coli shuttle vector carrying an expression cassette controlled under a synthetic strong promoter ParaS-SD | Peng et al. (5) |
| pSe-LacS | S. solfataricus lacS was inserted into pSeSD to yield a fusion gene coding for a LacS fusion protein fused to a His-tag at both ends | Peng et al. (5) |
| pSe-upsA-LacS | Promoter of araS-SD of pSe-LacS was replaced by the promoter of upsA (200 bp upstream of start codon) | This work |
| pSe-upsAmut-LacS | The ANTTTC motif in pSe-upsA-LacS was mutated by transversion mutation | This work |
| pSe-upsE-LacS | Promoter of araS-SD of pSe-LacS was replaced by the promoter of upsE (200 bp upstream of start codon) | This work |
| pSe-upsEmut-LacS | The ANTTTC motif in pSe-upsE-LacS was mutated by transversion mutation | This work |
| pSe-tfb3-LacS | Promoter of araS-SD of pSe-LacS was replaced by the promoter of tfb3 (200 bp upstream of start codon) | This work |
| pSe-tfb3mut-LacS | The ANTTTC motif in pSe-tfb3-LacS was mutated by transversion mutation | This work |
| pSe-cedA1-LacS | Promoter of araS-SD of pSe-LacS was replaced by the promoter of cedA1 (200 bp upstream of start codon) | This work |
| pSe-cedA1mut-LacS | The ANTTTC motif in pSe-cedA1-LacS was mutated by transversion mutation | This work |
| pET-orc1-2 | pET30a carrying the S. islandicus orc1-2 gene | This work |
| pCR™4-TOPO® | T/A cloning commercial vector | Thermo Scientific |

Supplementary table S2. Primers used in this work

| Primers | Sequence (5’-3’)* |
|---|---|
| Primers for the construction of orc1-2araS strain | |
| orc1-2fwd-NcoI | GCCACTCCATGGGTTTCAGCTAAGGATATACT |
| orc1-2rev-XmaI | ATTTCCCGGGAGCCTTCGATTTTAACCTCA |
| orc1-2araS Spacer F | AAGCTTAATTTAGATTGGGGTAAAAGTTTTGGTTTCAGCTAA |
| orc1-2araS Spacer R | AGCTTAGCTGAAACCAAAACTTTTACCCCAATCTAAATTAAG |
| orc1-2araS fwd-SphI | TTTGCATGCCGTTACCCGCTGTTATTGAG |
| orc1-2araS rev-XhoI | TTTCTCGAGTATCATAAAGTTCCATAGACCT |
| orc1-2araS SOErev | TTTTTTTTCACCGCGGGCAATGTTAAACAAGTTAGG |
| orc1-2araS SOEfwd | GCCCGCGGTGAAAAAAAATTATATTTTATCTAAGAG |
| orc1-2araS Check-fwd | ACTACTCTCTTAGATAAAATATAA |
| orc1-2araS Check-rev | AAGTATATCCTTAGCTGAAACCAA |
| Primers for RT-qPCR | |
| orc1-2qfwd | CAGAGAAGGAGGGATCACCA |
| orc1-2qrev | TCCCATTGTAACCTCATCAGC |
| 16Sqfwd | GAATGGGGGTGATACTGTCG |
| 16Sqrev | TTTACAGCCGGGACTACAGG |
| tfb3qfwd | TGACGAGGGTATTTTGAGTGGT |
| tfb3qrev | TCCGATCACTTTCTTCGTTGAGT |
| upsXqfwd | ACTGGTTGGAGGGAGATCGA |
| upsXqrev | ACAGTTCCGTCATTAGAAACCAGA |
| upsAqfwd | TCAATCCTAATACCGGACAAGCA |
| upsAqrev | TTGGCCAGCAGTTATTTTACCA |
| Dpo2qfwd | CCATCATACACCATGGCAGTT |
| Dpo2qrev | CTCCGCCAAAGCCCAAATTAG |
| cedBqfwd | GTATTGGATGAAGCATGGAC |
| cedBqrev | CTTACCTCCTCCCAGAATTT |
| SiRe_1957qfwd | TTTTCAAGTAGTGCGTCCCA |
| SiRe_1957qrev | CCATGGATAGCCAGCAAAGT |
| SiRe_1550qfwd | CCAGAAGTTTCGTTAGAACTCGC |
| SiRe_1550qrev | ACCAACAGGTAGGTCTGGGA |
| Primers for the construction of pET-orc1-2 plasmid | |
| orc1-2fwd-NdeI | GGTAAAACATATGGTTTCAGCTAAGGATATA |
| orc1-2rev-NotI | ATTGCGGCCGCAGCCTTCGATTTTAACCT |
| Primers used for the construction of reporter gene plasmids | |
| upsA-fwd-SphI | AGATGCATGCATTGTTCTATATTCGATAGC |
| upsA-rev-NdeI | TTAGCTGACATATGCTCTCACTATATCAAAATTT |
| upsA-SOE-top | TAAATTCGGGGAGATGTTCTCTGAGTAAGATAAATT |
| upsA-SOE-bm | AACATCTCCCCGAATTTACCTTTTTAGATTAAGTTA |
| upsE-fwd-SphI | AGATGCATGCTCATTTTCCCCATAGAGCCT |
| upsE-rev-NdeI | TTAGCTGACATATGAATACTGAGTATTCAGAAAA |
| upsE-SOE-top | TTAAGTCCGGGAGGTATTGTGTGAGTATATTTTTTC |
| upsE-SOE-bm | AATACCTCCCGGACTTAAAACTTCCTTTATAGCCAT |
| tfb3-fwd-SphI | AGATGCATGCATTGTAACTCTTCCTCTCTG |
| tfb3-rev-NdeI | TTAGCTGACATATGATGAAAACAATAACTTTTAT |
| tfb3-SOE-top | TAAAGTCTGGGAAGCTTTCTCAGACATTATAGTCAC |
| tfb3-SOE-bm | AAAGCTTCCCAGACTTTAGAGAAGATATATTTAGCA |
| cedA1-fwd-SphI | AGATGCATGCATAATTAAAAGGGAATAC |
| cedA1-rev-NdeI | TTAGCTGACATATGTACTTATCGTTAATATCT |
| cedA1-SOE-top | ATCATCCTGGGAACAAAACCCTCAGCCATTAAAAGA |
| cedA1-SOE-bm | TTTTGTTCCCAGGATGATAGATAGGTCAGAATTAAT |
| Primers for footprinting assay | |
| M13fwd-HEX | TGTAAAACGACGGCCAGT |
| M13rev-FAM | CAGGAAACAGCTATGACC |

*The mutated DDRE sequences are underlined.

Supplementary table S4. Transcriptional change of selected Orc1-2 dependent NQO-inducible genes in S. islandicus (a)

mRNA abundance ratios (columns 3 to 6) are calculated based on FPKM.

| Proteins | Gene identity | WT - NQO | WT + NQO | Δorc1-2 - NQO | Δorc1-2 + NQO | TFB3-dependence |
|---|---|---|---|---|---|---|
| DNA replication & transcription | | | | | | |
| Dpo2N | SiRe_0614 | 1 | 21.530 | 0.040 | 0.122 | No |
| Dpo2C | SiRe_0615 | 1 | 29.011 | 0.044 | 0.094 | No |
| Orc1-2 | SiRe_1231 | 1 | 6.652 | 0.084 | 0.128 | No |
| TFB1 | SiRe_1555 | 1 | 4.621 | 1.890 | 1.844 | Yes |
| TFB3 | SiRe_1717 | 1 | 105.946 | 1.217 | 1.291 | Yes |
| Homologous recombination | | | | | | |
| NurA | SiRe_0061 | 1 | 2.732 | 1.227 | 1.812 | No |
| Rad50 | SiRe_0062 | 1 | 4.268 | 2.339 | 2.830 | No |
| Mre11 | SiRe_0063 | 1 | 4.595 | 1.164 | 1.608 | No |
| HerA | SiRe_0064 | 1 | 5.294 | 1.208 | 1.501 | No |
| DNA transfer | | | | | | |
| CedA1 | SiRe_1316 | 1 | 47.364 | 0.079 | 0.107 | Yes |
| CedA | SiRe_1317 | 1 | 40.216 | 0.157 | 0.206 | Yes |
| CedA2 | SiRe_1318 | 1 | 45.942 | 0.414 | 0.203 | Yes |
| CedB | SiRe_1857 | 1 | 78.300 | 1.645 | 1.817 | Yes |
| UpsX | SiRe_1878 | 1 | 49.330 | 1.365 | 1.536 | Yes |
| UpsE | SiRe_1879 | 1 | 128.216 | 0.391 | 0.429 | Yes |
| UpsF | SiRe_1880 | 1 | 65.944 | 1.258 | 1.457 | Yes |
| UpsA | SiRe_1881 | 1 | 199.674 | 0.213 | 0.251 | Yes |
| UpsB | SiRe_1882 | 1 | 109.181 | 0.296 | 0.310 | Yes |
| Putative DDR related proteins | | | | | | |
| NurA paralogue | SiRe_0014 | 1 | 35.557 | 1.367 | 1.836 | Yes |
| Hypothetical protein | SiRe_0020 | 1 | 91.566 | 0.196 | 0.182 | Yes |
| Putative sialic acid transporter | SiRe_0137 | 1 | 129.008 | 0.134 | 0.215 | Yes |
| Hypothetical protein | SiRe_0187 | 1 | 124.001 | 0.366 | 0.400 | Yes |
| Hypothetical protein | SiRe_0269 | 1 | 46.608 | 0.079 | 0.052 | Yes |
| Hypothetical protein | SiRe_0426 | 1 | 61.611 | 1.351 | 1.444 | Yes |
| Putative DNA methylase | SiRe_0589 | 1 | 108.693 | 0.279 | 0.417 | Yes |
| Zinc finger SWIM domain | SiRe_0670 | 1 | 54.801 | 0.018 | 0.018 | Yes |
| Glycoside hydrolase | SiRe_0936 | 1 | 41.046 | 0.817 | 0.620 | Yes |
| Hypothetical protein | SiRe_1040 | 1 | 38.389 | 1.396 | 1.194 | NV (b) |
| HerA1 | SiRe_1715 | 1 | 141.412 | 1.219 | 1.754 | Yes |
| Flagellar hook-basal body protein | SiRe_1957 | 1 | 128.999 | 0.438 | 0.479 | Yes |
| Hypothetical protein | SiRe_2100 | 1 | 73.161 | 0.073 | 0.100 | Yes |
| Pseudo | SiRe_2101 | 1 | 31.776 | 1.003 | 0.805 | Yes |

(a) Supplementary Table S4 is a list of selected Orc1-2 dependent NQO-inducible genes; genes encoding hypothetical proteins but with a high induction (>30-fold) were also included.
(b) NV: not available.
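The ratios in this table are FPKM values expressed relative to the untreated wild type (WT - NQO = 1). A minimal Python sketch of that normalization is given here; the function name and the FPKM numbers are hypothetical and serve only to illustrate the arithmetic, they are not taken from the RNA-Seq data set.

```python
def abundance_ratios(fpkm, reference="WT -NQO"):
    """Express each sample's FPKM relative to the reference sample.

    `fpkm` maps a sample label to the FPKM of one gene. The label names
    are assumptions made for this sketch, not identifiers from the paper.
    """
    ref = fpkm[reference]
    if ref <= 0:
        raise ValueError("reference FPKM must be positive to form a ratio")
    # Ratios are rounded to three decimals, the precision shown in the table.
    return {sample: round(value / ref, 3) for sample, value in fpkm.items()}


# Hypothetical FPKM values for a single gene in the four samples.
example = {
    "WT -NQO": 10.0,
    "WT +NQO": 215.3,
    "Δorc1-2 -NQO": 0.4,
    "Δorc1-2 +NQO": 1.22,
}
ratios = abundance_ratios(example)
print(ratios)
```

With this layout, a gene that is strongly induced in the wild type but stays near or below 1 in the Δorc1-2 samples shows the pattern the table describes as Orc1-2 dependent.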

Supplementary table S5. Transcriptional change of selected Orc1-2 dependent NQO-repressible genes in S. islandicus (a)

mRNA abundance ratios (columns 4 to 7) are calculated based on FPKM.

| Process | Proteins | Gene identity | WT - NQO | WT + NQO | Δorc1-2 - NQO | Δorc1-2 + NQO |
|---|---|---|---|---|---|---|
| Chromatin | Cren7 | SiRe_1111 | 1 | 0.023 | 0.585 | 0.275 |
| | Sul7d | SiRe_2648 | 1 | 0.077 | 0.746 | 0.307 |
| DNA replication | Orc1-3 | SiRe_0002 | 1 | 0.434 | 1.498 | 1.202 |
| | Orc1-1 | SiRe_1740 | 1 | 0.153 | 1.876 | 1.700 |
| Cell division | CdvA | SiRe_1173 | 1 | 0.501 | 1.951 | 2.024 |
| | ESCRT-III | SiRe_1174 | 1 | 0.285 | 1.499 | 1.862 |
| | Vps4 | SiRe_1175 | 1 | 0.197 | 1.200 | 1.286 |
| | ESCRT-III-2 | SiRe_1200 | 1 | 0.011 | 0.603 | 0.719 |
| | ESCRT-III-3 | SiRe_1388 | 1 | 0.132 | 1.073 | 1.192 |
| | ESCRT-III-1 | SiRe_1550 | 1 | 0.008 | 0.738 | 0.859 |
| | SegA | SiRe_1962 | 1 | 0.197 | 0.844 | 0.753 |
| | SegB | SiRe_1961 | 1 | 0.363 | 0.680 | 0.777 |
| Cell mobility | Flagellar protein FlaJ | SiRe_0118 | 1 | 0.359 | 1.655 | 1.751 |
| | Flagellar protein FlaI | SiRe_0119 | 1 | 0.269 | 1.655 | 1.093 |
| | Flagellar protein FlaH | SiRe_0120 | 1 | 0.167 | 2.542 | 1.510 |
| | Flagellar protein FlaF | SiRe_0121 | 1 | 0.156 | 1.241 | 0.854 |
| | Flagellar protein FlaG | SiRe_0122 | 1 | 0.032 | 1.310 | 1.220 |
| | Hypothetical protein | SiRe_0123 | 1 | 0.105 | 2.281 | 2.043 |
| | Flagellin | SiRe_0124 | 1 | 0.009 | 0.621 | 0.564 |
| Putative DDR related proteins | Hypothetical protein | SiRe_0126 | 1 | 0.023 | 1.011 | 0.865 |
| | CopG family transcriptional regulator | SiRe_0197 | 1 | 0.169 | 1.053 | 1.402 |
| | ParA paralogue | SiRe_0265 | 1 | 0.334 | 1.440 | 1.309 |
| | Putative transcriptional regulator | SiRe_1577 | 1 | 0.022 | 0.592 | 0.587 |
| | Transcriptional regulator, ArsR family | SiRe_1806 | 1 | 0.095 | 1.340 | 1.552 |
| | DNA-binding transcriptional activator MhpR | SiRe_1883 | 1 | 0.016 | 0.684 | 0.712 |
| | Hypothetical protein | SiRe_1890 | 1 | 0.037 | 1.536 | 1.732 |
| | Transcriptional regulator, XRE family | SiRe_1891 | 1 | 0.044 | 0.781 | 0.801 |
| | Hypothetical protein | SiRe_2677 | 1 | 0.014 | 0.637 | 0.733 |

(a) Supplementary Table S5 is a list of selected Orc1-2 dependent NQO-repressed genes; genes encoding putative transcriptional regulators or hypothetical proteins with strong repression (ratio < 0.03) were also included.

Supplementary table S6. Promoter Sequences used for motif identification

| Genes | Sequence (5’-3’) |
|---|---|
| SiRe_1879 | GTTATCAATCTCTATGGCTTTAAGTGAAGCAATGGCTATAAAGGAAGTTTTAAGTAATTTCGGTATTGTGTGAGTATATTTTTTCTGAATACTCAGTATT |
| SiRe_1715 | CATATCAAATTGTAGGAAATAAAGAAAGATTAATTTTAACTAAGAACGCATTACTTTTAATCGACTCGCTGTAGTTAATAATCATAAAACTCGTTACAAG |
| SiRe_0137 | CTCGCCTTTTAGTGAAAATAGTCTGATTGTGAGGTTAAAAAGTTTTAGCGTGGTCCCTCGCTTAGTTGTTCTTTAAAGGTAAGTTGCAGGTGTAAGGTCT |
| SiRe_1957 | ATAGTAAACTATGTTATCGAAAAAGGCAAATTTTATTCAAATTAAATCCACATCGGTAGTTTCAAAATAGACTCCCTTCTCTTAAAAATTGATTTTTAAC |
| SiRe_0187 | CTTCATTCGACTTATGGGTTATATCTTTAATGAATGCAAAATAGAGAACTTTCACTTTTTTCATACACTCACCATTATTAATTTTTAATCCTTAGTTAGT |
| SiRe_0589 | AAACAAATTAGGTTATAATTGTATACAGAATCTCCAGTAATATTTAACACTTAAAATAGTATCGTCATACCGAGAGTGTCTAATACTTCTCTTTTATTCT |
| SiRe_1717 | ATGCTAAATATATCTTCTCTAAAGTAGTTTCAGCTTTCTCAGACATTATAGTCACATTATCAAAAATAAGTATATTCCCCATAAAAGTTATTGTTTTCAT |
| SiRe_0020 | GTCAGTTAAGGTTGGTTTCCTCATACATTATAATCATTATTTTTAGTTACTATTATCAGTTACGCTTCACTGACTGTTAGATATTAAACACCGGGTTATT |
| SiRe_1857 | ATAAGGTTAAAGAAGACTGAAGAAGATTTAAGGTTAGTTCACGCTAATATTGAACTATAGTTTCTCTTCAGTCACTCACAATTTTTTAATAATAGTGAAT |
| SiRe_2100 | TCCCTGCTTGATAGTGAGCTCATCTCGTTTAACATTTAACGTTTAACTAAAAAAATAATATCGGTCTTCCCTCAGATAATATAATCTCTTAAAAACTCTC |
| SiRe_0425 | GAGGGACCGTATATTAAGGTTAAAGGTAAACTAAGGGAAATAGTTGACAAGTTGGATGATAGTGGAATTGAGATTATAAGTATTAGGAGGTCTTCATTGG |
| SiRe_0670 | GGGTTAAAACGTATAAAACCTTAGGAATACTTTCCCATCTTCAATGAATAATACTTTCAGTGTACTGGATCTAATTCTTTTTTGAAGAAACGATAATGAG |
| SiRe_1316 | ACGTGGATGTTAATAATATAGTAAAATTAATTCTGACCTATCTATCATCAGTTTCACAAAACCCTCAGCCATTAAAAGAAAAAGATATTAACGATAAGTA |
| SiRe_0269 | AGGGAACCAGTGGAAATTTCATTGCCCAAGATAGTTGAGAACGTGAGACAGAAAATACTTTCGTGAATCTCAGTAGTGTTTTTTAAGTAAAATTTCTGAT |
| SiRe_0936 | GAGAAAAAGTAAGAATATCCATTGATACTAATAAGGTAGTACTCTTTGATGCTTCAAGCGAAAGAAACCTACTTGAGCAATACGAGAAAGTGAAGAAGAAG |
| SiRe_1040 | GGCTGTAGTGTATTAAGTATTATACATTTATTAAACTTATTCCAAGCATACAGTTTTAGTTTACAGAGGATCATTTTAAAAATAAAGCTCTAACTTAAAT |
| SiRe_0016 | AAAAGTCCAGCATTGGTAAGAATACCGTGGGATGCTAGGTTGGGATCATACGCTGGAAACGTGGAAAGAATTGATTTAATTTTAAGCGAAGGGTGAAAAT |
| SiRe_1231 | CTGGGATATCGCTTTTGAGAAAGAATCAAGAAAAGTTTCGGACTACTCTCTTAGATAAAATATAAGGTAATACCGCTTAATTTAGATTGGGGTAAAAGTTT |
| SiRe_0614 | TTTAAGATATTATCTTCCTAATATCTAGCTACAAAAGCTAAAACCTACACTTTCGGTGTTCTCTCTAAAGACTTAATTTACCAAGGGAGAGATATATCAC |
| SiRe_2664 | ATATATGTGAAGAGTCCTCTGATCCTCCTAAAAATACCCTTGATAACGAGTTTTCAAGTTAATAGAAAACACTTTAAACCTTATTATATCATAAAAACAC |
| SiRe_0338 | AATTAAGATAGGTTACGAGCATAGGGATACTCCCATAACGAAACCCGTTTATTGGGTTGGGGTCAGAGGAGACCCTTGGTAAAATTGAAGGTGATCCCAC |

Supplementary figures

Supplementary figure S1. The correlation between RT-qPCR and RNA-Seq data.

Exponentially growing cultures of the wild-type strain E233S1 and the Δorc1-2 mutant were treated with 2 µM NQO for 6 h, and cells were then harvested for extraction of total RNA, which was used for RNA-Seq analysis. The genes used for the RT-qPCR analysis are dpo2 (closed triangles), upsX (closed squares), upsA (closed circles), tfb3 (open circles), SiRe_1957 (open squares) and SiRe_1550 (open triangles). The X and Y values are RNA-Seq data and RT-qPCR data, respectively. qPCR data were normalized to the level of 16S rRNA. Linear regression of the data obtained from the two methods yielded a slope of 0.9705 and an R-value of 0.96.
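The slope and R-value quoted in this legend are the standard outputs of a least-squares line fitted to paired measurements. The sketch below shows one way such values can be computed in plain Python; the function name and the paired values are hypothetical placeholders, not the measurements behind the figure, and the authors' actual fitting tool is not stated.

```python
import math


def linear_fit(x, y):
    """Least-squares fit of y on x; returns (slope, intercept, pearson_r)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical paired values (e.g. log2 fold changes), for illustration only.
rna_seq = [4.4, 5.6, 7.6, 6.7, 7.0, -7.0]   # X: RNA-Seq
rt_qpcr = [4.1, 5.9, 7.2, 6.5, 7.3, -6.6]   # Y: RT-qPCR
slope, intercept, r = linear_fit(rna_seq, rt_qpcr)
print(f"slope={slope:.4f} intercept={intercept:.4f} R={r:.2f}")
```

A slope close to 1 together with an R-value close to 1 indicates that the two methods report similar changes for the same genes.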

Supplementary figure S2. S. islandicus Δorc1-2 shows hypersensitivity towards NQO

Growth curves of WT (A) and Δorc1-2 (B) were plotted after exposure to different concentrations of NQO. Sulfolobus strains were grown in SCV medium in the presence of NQO at the indicated concentrations for 24 hours. Culture growth was estimated by measuring A600 values of individual cultures during incubation and plotted against the incubation time.

Supplementary figure S3. The consensus motif in the promoter regions of NQO-induced upregulated genes

A total of 21 promoter sequences of the top upregulated genes with obvious promoters were used as the input sequences for de novo motif discovery by the MEME server with the default settings. A motif of 20 bp width (AATAGTTTCRGWDTACTCWS), with the most conserved region at ANTTTC, was identified (A); it is present in 17 of the 21 input sequences. The motif is mainly positioned between -25 and -55 bp upstream of the start codon (B). The height of the motif indicates its similarity to the consensus. + and – represent the coding and non-coding strands, respectively.
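Once a consensus such as ANTTTC is known, promoter sequences can be scanned for it on both strands. The following Python sketch does this with a regular expression built from IUPAC ambiguity codes; it is a simple exact-consensus scan, not the MEME algorithm used for motif discovery, and the function names are hypothetical. The sequence in the usage lines is the start of the SiRe_1717 (tfb3) promoter from Supplementary table S6.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq, motif="ANTTTC"):
    """Return (start, strand, match) for every motif occurrence.

    `start` is the 0-based position of the leftmost base of the hit on
    the given (plus) strand, for hits on either strand.
    """
    seq = seq.upper()
    # A lookahead keeps overlapping occurrences from being skipped.
    pattern = re.compile("(?=(" + "".join(IUPAC[c] for c in motif.upper()) + "))")
    width = len(motif)
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert the minus-strand coordinate back to plus-strand numbering.
        hits.append((len(seq) - m.start() - width, "-", m.group(1)))
    return sorted(hits)


# First 49 nt of the SiRe_1717 (tfb3) promoter listed in Supplementary table S6.
tfb3_fragment = "ATGCTAAATATATCTTCTCTAAAGTAGTTTCAGCTTTCTCAGACATTAT"
for start, strand, match in find_motif(tfb3_fragment):
    print(start, strand, match)
```

Reporting minus-strand hits in plus-strand coordinates keeps positions comparable when distances to the start codon are compared across promoters.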

Supplementary figure S4. SDS-PAGE analysis of S. islandicus Orc1-2 recombinant protein (10 µg) purified from E. coli. M: Protein marker standard; Orc1-2: recombinant Orc1-2 protein.

Supplementary figure S5. Construction of orc1-2araS strain

A. Two parts of donor DNA were amplified from genomic DNA and the pZC1-S-50-orc1-2 plasmid, respectively, and were fused by SOE-PCR using the indicated primers. B. Schematic illustration of the construction of the genome-editing plasmid for the promoter replacement of the orc1-2 gene. C. PCR verification of the strain with the check primers (check-fwd/check-rev). An 84-bp PCR product is expected for the E233S1 strain and a 127-bp product for orc1-2araS. D. Sequencing results of the promoter region of orc1-2 in WT and orc1-2araS. The positions of the original and araS promoters are indicated.

Supplementary figure S6. Conserved genomic organization of Orc1 proteins in Sulfolobales

The amino acid sequences of (A) Orc1-1, (B) Orc1-2 and (C) Orc1-3 in S. islandicus Rey15A were used as the query sequences for gene synteny analysis with the program SyntTax (6). The homologous genes corresponding to the query proteins are drawn in bold. The score below each organism name indicates the normalized BLAST score between the query sequence and the orthologue in the matched chromosome. The chromosomes are ranked by decreasing score. The gene numbers of orc1-1, orc1-2 and orc1-3 in S. islandicus Rey15A are SiRe_1740, SiRe_1231 and SiRe_0002, respectively.

References
1. Deng, L., Zhu, H., Chen, Z., Liang, Y.X. and She, Q. (2009) Unmarked gene deletion and host-vector system for the hyperthermophilic crenarchaeon Sulfolobus islandicus. Extremophiles, 13, 735-746.
2. Samson, R.Y., Xu, Y., Gadelha, C., Stone, T.A., Faqiri, J.N., Li, D., Qin, N., Pu, F., Liang, Y.X., She, Q. et al. (2013) Specificity and function of archaeal DNA replication initiator proteins. Cell Rep, 3, 485-496.
3. Peng, N., Xia, Q., Chen, Z., Liang, Y.X. and She, Q. (2009) An upstream activation element exerting differential transcriptional activation on an archaeal promoter. Mol Microbiol, 74, 928-939.
4. Peng, W., Feng, M., Feng, X., Liang, Y.X. and She, Q. (2015) An archaeal CRISPR type III-B system exhibiting distinctive RNA targeting features and mediating dual RNA and DNA interference. Nucleic Acids Res, 43, 406-417.
5. Peng, N., Deng, L., Mei, Y., Jiang, D., Hu, Y., Awayez, M., Liang, Y. and She, Q. (2012) A synthetic arabinose-inducible promoter confers high levels of recombinant protein expression in hyperthermophilic archaeon Sulfolobus islandicus. Appl Environ Microbiol, 78, 5630-5637.
6. Oberto, J. (2013) SyntTax: a web server linking synteny to prokaryotic taxonomy. BMC Bioinformatics, 14, 4.