Role of senataxin in RNA : DNA hybrids resolution at DNA double strand breaks

Sarah Cohen

To cite this version:

Sarah Cohen. Role of senataxin in RNA : DNA hybrids resolution at DNA double strand breaks. Cellular Biology. Université Paul Sabatier - Toulouse III, 2019. English. NNT: 2019TOU30125. tel-02930730

HAL Id: tel-02930730 https://tel.archives-ouvertes.fr/tel-02930730 Submitted on 4 Sep 2020

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Summaries


Actively transcribed genes can be a source of genome instability through numerous mechanisms. These genes are characterized by the formation of secondary structures such as RNA-DNA hybrids, which form when the nascent RNA exiting RNA polymerase II hybridizes to single-stranded DNA. Numerous studies have shown that RNA-DNA hybrid accumulation can lead to DNA damage. Among these lesions, DNA double-strand breaks (DSBs) are the most deleterious for cells, since they can generate mutations and chromosomal rearrangements. Two major repair mechanisms exist in the cell: Non-Homologous End-Joining (NHEJ) and Homologous Recombination (HR). My lab recently showed that DSBs occurring in transcribed genes are preferentially repaired by HR. Moreover, multiple studies have shown a cross talk between transcription and DSB repair. These results led us to propose that actively transcribed genes could be repaired by a specific mechanism involving proteins associated with transcription: “Transcription-coupled DSB repair”. During my PhD, using the DIvA (DSB Induction via AsiSI) cell line, which allows the induction of annotated DSBs throughout the genome, I worked on two projects focusing on DSB repair in transcribed genes. First, we showed that DSB repair in transcribed loci requires a known RNA:DNA helicase, senataxin (SETX). After DSB induction in an active gene, SETX is recruited, which allows RNA-DNA hybrid resolution (mapped by DRIP-seq). We also showed that SETX activity allows RAD51 loading and limits illegitimate DSB rejoining, and consequently promotes cell survival after DSB induction. This study shows that DSBs in transcribed loci require specific removal of RNA-DNA hybrids by SETX for accurate repair. Second, we showed an interplay between SETX and Bloom (BLM), a G4 DNA helicase, in the repair of DSBs induced in transcribed loci. We showed that BLM is also recruited at DSBs in transcribed loci, where it promotes resection and repair fidelity. Strikingly, we showed that BLM depletion rescued the survival defects observed in SETX-depleted cells following DSB induction. Knockdown of other G4 helicases (RTEL1, FANCJ) also promoted cell survival in SETX-depleted cells upon damage. These data suggest an interplay between G4 helicases and RNA:DNA hybrid resolution for DSB repair in active genes. Altogether, these studies promote a better understanding of the specificity of DSB repair in transcriptionally active genes, and notably the identification of proteins involved in “Transcription-coupled DSB repair”.


Transcriptionally active genes can be a source of genome instability through numerous mechanisms. These genes are characterized by the formation of secondary structures such as DNA:RNA hybrids, which form when the RNA exiting RNA polymerase II hybridizes to single-stranded DNA. Numerous studies have shown that the accumulation of these hybrids can lead to DNA damage. Among these lesions, double-strand breaks (DSBs) are the most dangerous for the cell, since they can produce mutations and chromosomal rearrangements. Two major repair mechanisms exist in the cell: Non-Homologous End-Joining (NHEJ) and Homologous Recombination (HR). My team recently showed that DSBs located in transcribed genes are preferentially repaired by HR. In addition, numerous studies have shown an interplay between transcription and DSB repair. In view of these results, we hypothesized that transcriptionally active genes could be repaired by a specific mechanism requiring the activity of transcription-associated proteins: "transcription-coupled repair". During my thesis, I investigated the role of two proteins in the repair of transcribed regions using the DIvA (DSB Induction via AsiSI) cell line, which allows the induction of annotated breaks across the genome. First, we showed that DSB repair in transcribed loci requires a known DNA:RNA helicase, senataxin (SETX). After induction of a break in a gene, SETX is recruited, which allows the resolution of DNA:RNA hybrids (mapped by DRIP-seq). We also showed that SETX allows RAD51 recruitment and limits illegitimate rejoining of DSBs, and consequently promotes cell survival after break induction. This study shows that DSBs in transcribed loci require the specific resolution of DNA:RNA hybrids by SETX to allow accurate repair, and that this resolution is absolutely essential for cell survival. Second, we showed an interplay between SETX and Bloom (BLM), a G4 DNA helicase, in DSB repair in transcribed regions. We showed that BLM is also recruited to DSBs in transcribed loci, where it is required for resection and repair fidelity. Importantly, we showed that BLM depletion rescues the cell survival defect observed in SETX-depleted cells after DSB induction. Depletion of other G4 helicases (RTEL1, FANCJ) also promotes the survival of SETX-depleted cells after damage. These results suggest an interplay between G4 helicases and DNA:RNA hybrid resolution in the repair of active genes. In conclusion, these studies provide a better understanding of the specificity of repair in transcribed regions of the genome, and notably identify proteins involved in "transcription-coupled repair".


Table of contents


Summaries

Table of contents

Introduction
  I. Transcription
    A. Steps of transcription
      Initiation
      Elongation/Splicing
      Termination
    B. Non coding RNAs processing
      Small RNA
      Long non-coding RNA
    C. Secondary structure: R-loops
      Formation
      Processing
      Detection techniques
    D. Transcription induces genome instability
      Common fragile site and replication collapse
      Topoisomerase II activity
      R-loops
      R-loops in human diseases
  II. DNA double strand breaks repair
    A. DSB detection and signaling
    B. Mechanisms
      Non-Homologous End Joining
      Homologous Recombination
      Alternative End-joining/Microhomology Mediated End-joining
    C. Factors influencing the choice between the 2 major repair pathways
      Resection regulation
      Chromatin state around DSB
      Cell cycle dependent
      Type of breaks
  III. Transcription role in DSB repair
    A. RNA: active player for DSB repair?
      Transcription initiation following DSB induction
      Non-coding RNAs target DDR proteins at the break
      RNA template for repair during Homologous Recombination
    B. RNA: roadblock for repair
      Transcription inhibition: stopping RNA production at the break
      Removing RNA: DNA hybrids from DSB

Results
  I. RNA: DNA hybrid resolution is crucial for DSB repair in active genes
  II. RNA: DNA and G4 helicases interplay for DSB repair in active genes
    A. BLM is recruited at DSB produced in G4-rich loci
    B. BLM is mainly recruited at DSBs induced in active genes
    C. BLM is recruited at HR-prone DSB and required for RAD51 loading
    D. SETX and BLM interplay for DSB repair
      BLM has no impact on cell survival but is harmful in SETX depleted cells
      BLM promotes RNA-DNA hybrids accumulation

Discussion
  I. RNA: DNA hybrid resolution is crucial for DSB repair in active genes
    A. RNA: DNA hybrids mapping
      RNA: DNA hybrids loss on the gene body
      RNA: DNA hybrids accumulation around the DSB
      RNA: DNA hybrids function at DSB
    B. RNA: DNA hybrids resolution required for repair
    C. In human diseases
  II. RNA: DNA and G4 helicases interplay for DSB repair in active genes
    A. BLM is recruited at DSB in active genes and required for HR
    B. G4 against R-loops?

Bibliography

Annexes


Introduction


I. Transcription

Deoxyribonucleic acid (DNA) carries the genetic information of all living organisms. This genetic information is held in DNA sequences called genes. In humans, protein-coding genes give rise to around 20,000 proteins; the first step of their expression is a mechanism called transcription. This mechanism is carried out by the 12-subunit RNA polymerase II and leads to the production of a messenger RNA (mRNA) from the DNA template. Transcription is a complex, dynamic and temporally regulated process in the nucleus. This regulation occurs at every step, notably through the DNA structure and sequence itself but also through different factors such as histone and non-histone proteins (Shandilya and Roberts, 2012).

A. Steps of transcription

Initiation

In mammals, transcription is commonly initiated at core promoter elements (Figure 1A), defined as the minimal contiguous DNA elements required for accurate transcription. A core promoter element can be located upstream or downstream of the transcription start site of its target gene. The first core promoter element to be discovered, in 1979, was the TATA box (Goldberg, 1979). It can function independently or synergistically with other elements and is located at about -31 to -30 relative to the transcription start site. A second class of core promoter elements is CpG islands, which are DNA stretches rich in CG dinucleotides (Deaton and Bird, 2011) (Figure 1A). They are preferentially located around transcription start sites, and CpG methylation correlates with expression levels (Cross and Bird, 1995). Promoter regions with unmethylated CpGs are usually active, whereas inactive or silenced promoters are characterized by extensive CpG methylation (5mCpG) (Boyes and Bird, 1992). TATA-box-dependent promoters represent a minority of mammalian promoters (10-16%), whereas a majority of human promoters (around 70%) are associated with CpG islands (Baumann et al., 2010). The promoter-proximal region is also characterized by histone marks specific to active transcription, such as H3K9/K14 acetylation and H3K4 methylation (Figure 1A). These epigenetic marks allow a more open and permissive chromatin (Li et al., 2007a). The core promoter elements serve as a platform for the binding of the Pre-Initiation Complex (PIC) (Baumann et al., 2010; Roeder, 1996; Sainsbury et al., 2015). The PIC is composed of RNA polymerase II associated with the general transcription factors (GTFs): TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH (Buratowski et al., 1989; Sainsbury et al., 2015) (Figure 1A). These GTFs and the Mediator complex are assembled with RNA polymerase II in a timely manner for productive initiation.

Figure 1: Transcription initiation. A. Pre-initiation complex (PIC) assembly: binding of the PIC is required for transcription initiation. The PIC is composed of the General Transcription Factors (GTFs: TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH) together with the Mediator complex and RNA polymerase II. They are recruited at the Core Promoter Element (CPE), which in a majority of human promoters is associated with CpG islands. Ser5 phosphorylation of the RNA polymerase II CTD repeats is also necessary for initiation. Epigenetic modifications involved in initiation are H3K9/K14 acetylation and H3K4 methylation. B. RNA polymerase II promoter-proximal pausing: 25 nascent RNA nucleotides are produced before Ser5P allows 5’ end capping. DSIF and NELF binding leads to RNA polymerase II pausing. C. RNA polymerase II release: the Positive-Transcription Elongation Factor b (P-TEFb) complex, made of cyclin T1 and Cyclin Dependent Kinase 9, phosphorylates Ser2, NELF and DSIF (red star). NELF is then evicted from RNA polymerase II and DSIF becomes a positive factor. RNA polymerase II is then released from pausing.

Finally, post-translational modification of the RNA polymerase II C-Terminal Domain (CTD) is also necessary for transcription initiation. In humans, the CTD is composed of 52 tandem heptapeptide repeats of consensus sequence Y-S-P-T-S-P-S. One of the most important modifications is serine phosphorylation: the serines at positions 2, 5 and 7 are known to be phosphorylated at different steps of transcription. Notably, Ser5 needs to be phosphorylated to initiate transcription (Figure 1A) (Shandilya and Roberts, 2012). After the transcription of around 25 nucleotides of nascent RNA, Ser5P recruits the capping enzyme, allowing the addition of a cap structure to the 5' end and therefore productive transcription initiation (Figure 1B). The DSIF and NELF proteins can bind and inhibit RNA polymerase II, leading to promoter-proximal pausing (Jonkers and Lis, 2015). RNA polymerase II is released from pausing by the Positive-Transcription Elongation Factor b (P-TEFb) complex, made of cyclin T1 and Cyclin Dependent Kinase 9 (Figure 1C). P-TEFb phosphorylates Ser2, NELF and DSIF. NELF is then evicted from RNA polymerase II and DSIF becomes a positive factor. Following release, RNA polymerase II proceeds across the coding gene to engage in elongation (Shandilya and Roberts, 2012).
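The heptad numbering used above (Ser2, Ser5, Ser7) can be made concrete with a short sketch. It assumes an idealized CTD made of 52 perfect copies of the consensus heptad, which is a simplification for illustration only; the function name `ctd_serine_sites` is hypothetical.

```python
# Idealized model of the human RNA polymerase II CTD: 52 perfect copies of the
# consensus heptad Y-S-P-T-S-P-S (an assumption; real repeats can deviate).
HEPTAD = "YSPTSPS"
N_REPEATS = 52


def ctd_serine_sites(heptad=HEPTAD, n_repeats=N_REPEATS):
    """Map each serine-bearing heptad position to its 1-based residue indices."""
    sites = {}
    for repeat in range(n_repeats):
        for offset, residue in enumerate(heptad):
            if residue == "S":
                position = offset + 1  # position within the heptad (1 to 7)
                index = repeat * len(heptad) + position  # index within the CTD
                sites.setdefault(position, []).append(index)
    return sites


sites = ctd_serine_sites()
print(sorted(sites))   # heptad positions that carry a serine
print(len(sites[5]))   # number of Ser5 sites in this idealized CTD
```

In this idealized model every repeat offers one Ser2, one Ser5 and one Ser7, which is why a shift in the balance of Ser5 and Ser2 phosphorylation can be read out along the whole domain.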

Elongation/Splicing

A gradual increase in Ser2 phosphorylation and loss of Ser5 phosphorylation could explain why the RNA polymerase II elongation rate increases from 0.5 kb per minute within the first few kilobases to 2-5 kb per minute after 15 kb. Several other factors influence the elongation rate, such as certain histone marks. Indeed, H2B ubiquitylation and H3K79 dimethylation are enriched in the first intron and associated with a higher elongation rate, whereas H3K36 trimethylation and H4K20 methylation are enriched in exons and associated with a reduced elongation rate (Li et al., 2007a) (Figure 2A). Histone chaperones like FACT and ASF1 can also affect the elongation rate. Finally, the factor with the strongest negative effect on elongation rate is the presence of exons. Transcription is delayed by 20-30 seconds per exon, and this delay could be caused by their high GC content. This delay through the gene might be necessary for co-transcriptional splicing (Saldi et al., 2016). Indeed, most eukaryotic genes are expressed as precursor mRNAs (pre-mRNAs) requiring a splicing step to remove introns (non-coding sequences) and ligate exons together (Herzel et al., 2017). This mechanism is catalyzed in the nucleus by a multi-megadalton ribonucleoprotein complex called the spliceosome (Figure 2A). The main spliceosome building blocks are 5 U-rich small nuclear ribonucleoproteins (snRNPs), which are named after their small nuclear RNA component: U1, U2, U4, U5 and U6. They are also associated with Sm proteins and are involved in splicing in a sequential manner (Herzel et al., 2017; Saldi et al., 2016; Wahl et al., 2009).
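To give a sense of scale for these figures, the sketch below estimates how long it would take to transcribe a hypothetical gene. It is a toy calculation under loud assumptions: the slow rate of 0.5 kb per minute is applied to the whole first 15 kb (the text only says "the first few kilobases"), the per-exon delay is simply added on top of the elongation time, and the 50 kb, 10-exon gene is invented for illustration.

```python
def transcription_time_min(gene_kb, n_exons, fast_rate_kb_min, exon_delay_s,
                           slow_rate_kb_min=0.5, slow_zone_kb=15.0):
    """Rough transcription time in minutes under a two-phase rate model."""
    slow_kb = min(gene_kb, slow_zone_kb)          # portion transcribed slowly
    fast_kb = max(gene_kb - slow_zone_kb, 0.0)    # portion transcribed fast
    elongation = slow_kb / slow_rate_kb_min + fast_kb / fast_rate_kb_min
    exon_pause = n_exons * exon_delay_s / 60.0    # seconds converted to minutes
    return elongation + exon_pause


# Hypothetical 50 kb gene with 10 exons, using the extremes quoted in the text.
fastest = transcription_time_min(50, 10, fast_rate_kb_min=5, exon_delay_s=20)
slowest = transcription_time_min(50, 10, fast_rate_kb_min=2, exon_delay_s=30)
print(f"between {fastest:.1f} and {slowest:.1f} minutes")
```

Under these assumptions the exon-associated delay contributes a few minutes out of several tens of minutes, so its effect is most visible locally, at the exons themselves, rather than on the total transcription time.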

Figure 2: Transcription elongation/splicing and termination. A. Transcription elongation and splicing: RNA polymerase II is released while the rest of the PIC remains at the promoter as a transcription reinitiation scaffold. Elongation rate is influenced by histone marks in introns and exons. In exons, the elongation rate is low and associated with H3K36me3 and H4K20me, whereas in introns the elongation rate is high and associated with H2Bub and H3K79me2. Histone chaperones FACT and ASF1 promote histone marks specific to introns. Elongation delay contributes to co-transcriptional splicing by the spliceosome. B. Transcription termination in mammalian cells: after transcription of the polyadenylation site, the activity of the Cleavage and Polyadenylation Specificity Factor (CPSF) and the Cleavage Stimulatory Factor (CstF) leads to transcription pausing and to RNA polymerase II eviction. In the torpedo model, after cleavage, the unprotected 5’ end of the RNA still tethered to RNA polymerase II can be degraded by the XRN2 5’-3’ exoribonuclease. The RNA:DNA helicase SETX can promote XRN2 access to the RNA by unwinding R-loops formed co-transcriptionally. C. Transcription termination in S. cerevisiae: the NNS pathway promotes non-coding RNA termination through binding of Nrd1 and Nab3 to the nascent RNA. This binding promotes Sen1 recruitment, allowing RNA unwinding and RNA polymerase II release. The 3’ ends of non-coding RNAs are then generated by endoribonucleolytic cleavage and/or by the nuclear exosome/TRAMP complex, and lack a polyA tail.

Termination

After elongation, transcription termination occurs when RNA polymerase II and the nascent mRNA dissociate from the DNA. There are 2 main pathways in eukaryotes. In humans, mRNA termination is coupled with a maturation event in which the 3’ end of the nascent RNA is cleaved and polyadenylated (Figure 2B). The major effectors of termination are the Cleavage and Polyadenylation Specificity Factor (CPSF) and the Cleavage Stimulatory Factor (CstF) (Kuehner et al., 2011; Porrua et al., 2016; Porrua and Libri, 2015). After transcription of the polyadenylation site, their activity leads first to transcription pausing and then to RNA polymerase II release. Release of RNA polymerase II might also require the activity of XRN2, a 5’-3’ exoribonuclease. In the torpedo model, the creation of an unprotected 5’ end by the cleavage event allows XRN2 to degrade the RNA still tethered to RNA polymerase II (Figure 2B). The collision of XRN2 with the polymerase would promote termination. The RNA:DNA helicase SETX has also been implicated in RNA polymerase II release, by unwinding R-loops and giving XRN2 access to the RNA after nascent RNA cleavage (Wagschal et al., 2012). In S. cerevisiae, the other major pathway, dedicated to the termination of non-coding RNAs, is the NNS (Nrd1-Nab3-Sen1) pathway (Figure 2C). In this case, the 3’ ends of yeast non-coding RNAs are generated by endoribonucleolytic cleavage and/or by the nuclear exosome/TRAMP complex, and lack a polyA tail. In this model, the RNA-binding proteins Nrd1 and Nab3 serve as adaptors to position the SETX homolog Sen1. Sen1 can therefore unwind the RNA and release the polymerase (Arndt and Reines, 2015).

B. Non coding RNAs processing

In eukaryotes, RNA polymerase II is responsible for the transcription of most non-coding RNAs, except for tRNAs and rRNAs. They are generated from gene bodies but also from extragenic loci such as enhancers, and from regions upstream of and antisense to mRNA genes. These non-coding RNAs are usually shorter than 1 kb, highly unstable and restricted to the nucleus.

Small RNA

Small RNAs were first discovered in 1993 by genetic screens in C. elegans. They are highly conserved and have various functions such as heterochromatin formation, RNA silencing and translational control. Small RNAs involved in RNA silencing are by definition short, around 20-30 nucleotides, and associated with Argonaute (Ago) family proteins. In animals, they are classified into 3 families: micro RNA (miRNA), small interfering RNA (siRNA) and PIWI-interacting RNA (piRNA) (Castel and Martienssen, 2013).

Figure 3: Small non-coding RNA processing. A. Micro-RNA (miRNA) processing: RNA polymerase II generates short hairpin RNAs called pri-miRNAs, which are then cleaved in the nucleus by the microprocessor (Drosha and DGCR8). The pre-miRNA forms a complex with exportin 5, which is exported to the cytoplasm. Finally, the pre-miRNA is cleaved by Dicer and forms the RNA-Induced Silencing Complex (RISC) with an Ago protein. B. Small interfering RNA (siRNA) processing: dsRNA can be generated, for example, by convergent RNA polymerase II transcription. It is exported to the cytoplasm, processed by Dicer and forms a complex with Ago proteins. C. PIWI-interacting RNA (piRNA) processing: piRNA precursors are produced as dsRNA and exported to the cytoplasm. They are processed by Zucchini and associated with PIWI proteins.

miRNAs are generated from short hairpin RNAs that are successively processed by 2 RNase III type proteins (Ha and Kim, 2014; Kim et al., 2009) (Figure 3A). In the nucleus, maturation of the pri-miRNA into a pre-miRNA is first initiated by Drosha in association with DGCR8, together called the microprocessor complex. The pre-miRNA is then exported to the cytoplasm by forming a complex with exportin 5 (EXP5). This complex translocates through the nuclear pore complex. The pre-miRNA is finally processed by Dicer and loaded onto an Ago protein to form the effector RNA-Induced Silencing Complex (RISC). siRNAs are produced as dsRNAs that are processed in the cytoplasm by Dicer, where they are also associated in a complex with Ago proteins (Ghildiyal and Zamore, 2009) (Figure 3B). piRNAs are processed differently and require neither the action of RNase III type proteins nor loading onto Ago proteins. Instead, they are processed in the cytoplasm by an endonuclease called Zucchini and then associated with PIWI proteins (Figure 3C).

Long non-coding RNA

Long non-coding RNAs (lncRNAs) are very similar to mRNAs (reviewed in Quinn and Chang, 2016): they are 5' capped, spliced and polyadenylated. The main differences between the two categories are their length, lncRNAs being usually longer than 200 nucleotides, and the absence of an Open Reading Frame (ORF) in lncRNAs. Their expression usually follows mRNA expression patterns; they are therefore expressed in a cell type-, tissue-, developmental stage- or disease state-specific manner. They are also estimated to outnumber mRNAs. The function of the majority of lncRNAs has not been identified, but a few of them have been well described. For example, the lncRNA XIST (X inactive specific transcript) is necessary for X inactivation and TERC (telomerase RNA component) is required for telomere elongation. Their degradation is mediated by the exosome or by cytosolic nonsense-mediated decay. Moreover, lncRNAs called TERRA (Telomere repeat-containing RNA) are produced at telomeres, where they control telomerase activity and heterochromatin formation (Dianatpour and Ghafouri-Fard, 2017). Finally, in mammals, two lncRNAs are processed by RNase P: MALAT1 (metastasis associated-lung adenocarcinoma transcript 1) and NEAT1 (nuclear enriched abundant transcript 1) (Quinn and Chang, 2016). It has been proposed that they both participate in the regulation of gene expression: MALAT1 by regulating splicing and NEAT1 by keeping mRNAs in the nucleus for editing (Yoshimoto et al., 2016; Yu et al., 2017). Their overexpression has been associated with different types of cancer.

C. Secondary structure: R-loops

Transcription can produce secondary structures such as R-loops. R-loops are three-stranded structures composed of a highly stable RNA-DNA hybrid and a displaced single-stranded DNA. They are formed co-transcriptionally when the RNA exiting RNA polymerase II hybridizes to the template DNA, leaving the second DNA strand free. They differ from the RNA:DNA hybrids that form transiently during transcription elongation by their span: from 100 to 2000 base pairs. They participate in various biological phenomena such as transcription regulation and can generate DNA damage (Crossley et al., 2019; Santos-Pereira and Aguilera, 2015). Numerous studies have shown that this structure is not a rare by-product of transcription. Instead, R-loops form across the whole genome throughout the cell cycle and are highly conserved in bacteria, yeast and higher eukaryotes. Recent studies showed that they are abundant structures: they are estimated to occupy up to 5% of the mammalian genome (Sanz et al., 2016) and 8% of the budding yeast genome (Wahba et al., 2016).

Formation

DNA sequence and conformation

R-loop formation was first studied in vitro by the Davis lab in 1976 (Thomas et al., 1976). This formation is favored by RNA:DNA hybridization being more stable than DNA:DNA hybridization. Sequence specificity takes part in this phenomenon. Indeed, it has been shown in vitro and in vivo that a G-rich RNA forms a thermodynamically more stable duplex with C-rich DNA than G-rich DNA does. Moreover, studies showed that G-rich sequences can promote the formation of a secondary DNA structure called G-quadruplex or G4 (Duquette et al., 2004). Four runs of consecutive guanines are sufficient for the formation of a G4. Interestingly, G4s can be formed co-transcriptionally and are associated with R-loop formation (Figure 4A) (Duquette et al., 2004). They could promote dsDNA “opening” by stabilizing the non-template ssDNA (Hamperl and Cimprich, 2014; Quinn and Chang, 2016).

DNA conformation can also play a role in R-loop formation. It has been shown that RNA polymerase II passing through the DNA duplex can create negative and positive DNA supercoiling. Positive supercoiling in front of RNA polymerase II can lead to transcription pausing or blockage, whereas negative supercoiling behind RNA polymerase II can lead to an opening of the DNA duplex and facilitate RNA annealing (Figure 4A) (Corless and Gilbert, 2016).
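The link between G-rich sequence and G4 formation is often explored computationally by scanning for a sequence motif. The sketch below uses a widely used heuristic (four runs of at least three guanines separated by loops of 1 to 7 nucleotides); this motif definition, the function name `find_g4_motifs` and the toy input are assumptions for illustration and are not taken from this work. A match only indicates a putative G4-forming sequence on the scanned strand, not a structure that actually forms in cells.

```python
import re

# Heuristic motif for putative G-quadruplexes (an assumption, not from the
# text): four runs of >=3 guanines separated by loops of 1 to 7 nucleotides.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_g4_motifs(sequence):
    """Return (start, end, motif) for non-overlapping matches on this strand."""
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


# Toy input: four copies of the human telomeric repeat TTAGGG.
example = "TTAGGG" * 4
for start, end, motif in find_g4_motifs(example):
    print(start, end, motif)
```

To look for motifs on the opposite strand, the same function can be applied to the reverse complement of the sequence.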

Pausing

Several studies showed that RNA polymerase II pausing is associated with R-loop formation. For example, genome-wide G4 mapping in human cells has shown that promoters and some termination pause sites are highly enriched in G4s (Hänsel-Hertsch et al., 2018; Marsico et al., 2019). However, it is not clear whether pausing is responsible for R-loop formation or, on the contrary, whether R-loop formation leads to RNA polymerase II pausing. Nevertheless, a number of studies reveal a function of R-loops at those pausing sites (Mayer et al., 2017).

Figure 4: R-loops formation. A. Influence of the DNA sequence and conformation. Top panel: R-loop formation can be promoted by G-rich sequences and notably by G-quadruplex (G4) formation on the ssDNA. G4s could promote dsDNA “opening” by stabilizing the non-template ssDNA. Bottom panel: RNA polymerase II elongation can produce supercoiling: negative behind RNA polymerase II, leading to DNA “opening” and therefore RNA annealing, and positive in front of RNA polymerase II, leading to its pausing. B. Influence of pausing at promoters: genome-wide mapping of R-loops showed an accumulation at promoter-proximal pausing sites. R-loops can act as an epigenetic mark allowing transcription regulation. Indeed, R-loops inhibit transcription silencing, for example by blocking the DNA methyltransferase DNMT1, which binds poorly to R-loops, and therefore blocking methylation. R-loops can also inhibit chromatin remodelers such as polycomb repressive complex 2 (PRC2) and therefore inhibit transcriptional repressive marks. C. Influence of pausing at transcription termination sites (TTS): genome-wide mapping of R-loops also showed an accumulation at TTS. R-loops promote pausing by inducing repressive chromatin marks via recruitment of the G9a histone lysine methyltransferase. H3K9me2 is formed and allows HP1γ recruitment. R-loop resolution at TTS is dependent on the RNA:DNA helicases SETX and DHX9. The RNA is subsequently degraded by XRN2.

(1) At promoters

R-loop mapping showed that they accumulate at promoters (Chen et al., 2017; Sanz et al., 2016) and that they are formed immediately after the transcription start site (Figure 4B) (Dumelie and Jaffrey, 2017). Their formation could be facilitated by chromatin opening at promoters (Boque-Sastre et al., 2015). They could act as an epigenetic mark allowing transcription regulation. Indeed, they can block transcription silencing, for example by inhibiting methylation, as DNA methyltransferase binds poorly to R-loops (Grunseich et al., 2018). Studies have shown that R-loops can also inhibit chromatin remodelers and therefore inhibit transcriptional repressive marks (Boque-Sastre et al., 2015; Chen et al., 2015). By promoting chromatin opening, R-loops could facilitate transcription (Cloutier et al., 2016).

(2) At termination sites

There is evidence that R-loops might participate in efficient transcription termination (Figure 4C). R-loop mapping showed that they form at termination sites (Sanz et al., 2016; Skourti-Stathaki et al., 2011). It has been proposed that they allow RNA polymerase stalling downstream of the polyadenylation sequence (Sanz et al., 2016). R-loops could promote RNA polymerase II pausing by inducing repressive chromatin marks via recruitment of the G9a histone lysine methyltransferase. Consequently, H3K9me2 is formed and heterochromatin protein 1γ is then recruited (Skourti-Stathaki et al., 2014). R-loops are then resolved by RNA-DNA helicases such as SETX (Skourti-Stathaki et al., 2011) and DHX9 (Cristini et al., 2018). Finally, the free RNA strand can be degraded by exonucleases such as XRN2 (Morales et al., 2016).

Processing

Numerous factors exist to prevent R-loops formation but also to promote R-loops resolution or degradation.

(1) RNA processing factors: Tho/TREX and mRNP

Preventing R-loop formation requires "coating" the nascent RNA, making it unavailable for hybridization with the template DNA (Figure 5A). Binding of RNA processing factors to the nascent RNA has been shown to prevent R-loop formation, and mutations of those factors lead to R-loop accumulation. In yeast, mutants of the RNA-binding protein Pbp1 and of proteins involved in the Tho complex (mRNP biogenesis and elongation) and the TREX complex (TRansport/EXport) promote R-loop accumulation (Huertas and Aguilera, 2003; Jimeno et al., 2006; Salvi et al., 2014). RNA surveillance proteins such as Trf4 have also been implicated in R-loop resolution (Gavaldá et al., 2013). Depletion of splicing factors such as ASF/SF2, BuGZ and Bub3 promotes R-loop accumulation as well (Li and Manley, 2005; Li et al., 2007b; Wan et al., 2015). Finally, the exosome complex, notably through the exoribonuclease activity of Rrp6, is thought to resolve R-loops (Pefanis et al., 2015).

(2) RNase H

RNase H specifically degrades the RNA strand involved in R-loops (Figure 5B) (Cerritelli and Crouch, 2009). Both prokaryotes and eukaryotes encode 2 different RNase H enzymes: RNase HI/RNase H1 and RNase HII/RNase H2. Eukaryotic RNase H2 is composed of 3 sub-units. The catalytic sub-unit is similar to the monomeric prokaryotic RNase HII, whereas the 2 other sub-units are specific to eukaryotes and their functions are unidentified. RNase HII/RNase H2 also plays a role in removing ribonucleotides incorporated during replication.

(3) Helicases

Once R-loops are formed, they can be resolved by helicases that specifically unwind RNA:DNA hybrids (Figure 5C). For example, during transcription termination, R-loops can be resolved by Rho in bacteria (Hong et al., 1995) and by SETX/Sen1 (Becherel et al., 2013; Kim et al., 1999; Mischo et al., 2011) and DHX9 (Cristini et al., 2018) in eukaryotes. Other RNA:DNA helicases have been identified, such as Aquarius (AQR) (De et al., 2015; Sollier et al., 2014).

Detection techniques

Multiple techniques have been developed to map R-loops on the genome with next-generation sequencing (Crossley et al., 2019). The first method used to map R-loops was footprinting, in which bisulfite treatment converts ssDNA cytosines into uracils, which are in turn read as thymines by sequencing (Ginno et al., 2012b; Yu et al., 2003). This technique is controlled by RNase H treatment to confirm that the ssDNA is R-loop specific. The second and most widely used technique, DRIP, relies on the pull-down of RNA-DNA hybrids with a specific antibody, S9.6 (Ginno et al., 2012b). In this technique, cells are not fixed before pull-down, and nucleic acids are digested by a cocktail of restriction enzymes. The pull-down product can also be analyzed by quantitative PCR. Variations of DRIP-seq have been developed, such as the use of sonication instead of enzymatic digestion (Wahba et al., 2011; Wahba et al., 2016). Indeed, enzymatic digestion has been shown to create a bias, notably at the 5' end of genes (Halász et al., 2017). However, sonication could also damage the ssDNA. Moreover, DRIP-seq is not strand specific; newer techniques therefore allow either strand-specific DNA library preparation or the sequencing of the RNA involved in R-loops (Sanz et al., 2016; Xu et al., 2017). A combination of the 2 techniques has been developed: bis-DRIP-seq (Dumelie and Jaffrey, 2017). This technique allows the detection of R-loops at single-nucleotide resolution. It uses S9.6 to immunoprecipitate the R-loops and bisulfite to alter nucleotides in the ssDNA only, as the other strand is shielded by RNA. Those modified nucleotides are then sequenced, which allows exact R-loop localization. Another limitation of DRIP-related techniques is the reliance on a single antibody, S9.6 (Halász et al., 2017; König et al., 2017). Moreover, it seems that this antibody also recognizes dsRNA (Hartono et al., 2018; Phillips et al., 2013). To circumvent the use of S9.6, expression of a catalytically inactive RNase H is an alternative. Indeed, the inactive enzyme still binds R-loops (Chen et al., 2017; Ginno et al., 2012b). Chromatin is then cross-linked and RNase H is immunoprecipitated. But even if this technique allows in situ binding, it could also affect R-loop dynamics, for instance by stabilizing the structure.

Figure 5: R-loops prevention and processing. A. R-loops prevention: RNA coating by RNA binding proteins can prevent its annealing with DNA. Among those proteins, proteins of the Tho/TREX complex, of the spliceosome and of the exosome have been identified. Depletion of those proteins leads to R-loop accumulation. B. Degradation by RNase H1: once R-loops are formed, they can be specifically processed by RNase H1, which degrades the RNA involved in RNA:DNA hybrids. C. Unwinding by helicases: R-loops can also be removed through RNA:DNA unwinding by specific helicases such as SETX. Other RNA:DNA helicases have been identified, such as DHX9, PIF1 and AQR.

Together, these techniques allow R-loop detection on the genome with varying degrees of precision.
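The footprinting principle described above (unpaired cytosines are converted by bisulfite and read as thymines, while shielded cytosines are not) can be illustrated with a toy model. This is only a sketch under simplifying assumptions: conversion is taken to be complete, only one strand is considered, and the sequence, the interval and the function names `bisulfite_convert` and `footprint` are hypothetical and not part of any published pipeline.

```python
def bisulfite_convert(reference: str, ssdna_start: int, ssdna_end: int) -> str:
    """Return the sequenced read, assuming every C located in the
    single-stranded interval [ssdna_start, ssdna_end) is read as T."""
    read = []
    for position, base in enumerate(reference):
        if ssdna_start <= position < ssdna_end and base == "C":
            read.append("T")  # C -> U after bisulfite, read as T by sequencing
        else:
            read.append(base)  # paired or shielded bases are left unchanged
    return "".join(read)


def footprint(reference: str, read: str) -> list[int]:
    """Positions where a reference C is read as T (candidate ssDNA)."""
    return [
        position
        for position, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base == "C" and read_base == "T"
    ]


# Hypothetical 14-nt sequence with a displaced strand spanning positions 4 to 9.
reference = "ACCGTCCATCGGCA"
read = bisulfite_convert(reference, 4, 10)
converted_positions = footprint(reference, read)
```

In this toy example the cytosines outside the single-stranded interval stay unconverted, so the converted positions delimit the displaced strand. In a real experiment, the RNase H control mentioned above would be needed to attribute such a footprint to an R-loop.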

D. Transcription induces genome instability

It is well established that transcription can be a threat to genome integrity. For example, the transcription machinery can collide with the replication machinery. Transcription can also produce structures such as R-loops that can be targeted by enzymes such as AID or Topoisomerase II, which can create breaks. Finally, certain genomic loci have long been known to be more susceptible to DNA breaks, and recent genome-wide studies showed that actively transcribed genes are more fragile. This phenomenon has been extensively reviewed (Crossley et al., 2019; D'Alessandro and d'Adda di Fagagna, 2017; Marnef et al., 2017).

Common fragile site and replication collapse

In 1984, Common Fragile Sites (CFS) were first identified in lymphocytes (Glover et al., 1984). They are formed under mild replication stress and are characterized by gaps/breaks in metaphase chromosomes. They have been thoroughly studied because they match translocation break points and are the source of gross chromosomal rearrangements in cancer cells (Marnef et al., 2017; Sarni and Kerem, 2016). CFS are conserved throughout evolution and present in every individual; however, CFS breakage depends on the cell type and on the source of replication stress (Le Tallec et al., 2011; Le Tallec et al., 2013). It has been shown that CFS instability is caused by incomplete replication during S-phase. CFS are devoid of replication origins, replicate late and 80% of them overlap long genes (>300 kb) in (Helmrich et al., 2011). These long genes might need more than one cell cycle to complete transcription, which could lead to collisions between the transcription and replication machineries and ultimately to breakage. However, as not every long gene is prone to breakage, replication forks may be able to resolve collisions in normal conditions (Helmrich et al., 2011). Under replication stress though, secondary structures such as R-loops could form at CFS loci and lead to fork stalling and CFS instability (Helmrich et al., 2011; Marnef et al., 2017; Sarni and Kerem, 2016). However, it has been shown that the fragility of FRA3B, the most active CFS in human lymphocytes, is caused by a paucity of replication initiation and not by fork slowing or stalling (Letessier et al., 2011). Moreover, CFS are associated with chromatin condensation that may impair DNA damage repair. Another class of fragile sites was identified in B lymphocytes after hydroxyurea treatment (Barlow et al., 2013). These Early Replicating Fragile Sites (ERFS) were identified by ChIP-seq of the DNA repair proteins RPA, BRCA1 and γH2AX. Like CFS, ERFS are cell type specific and detected as gaps on mitotic chromosomes. They are replicated early and localized in highly transcriptionally active regions; instability in this case is therefore caused by transcription rather than by replication timing. Finally, other recurrent breaks have been identified at active gene promoters in B lymphocytes and neurons (Klein et al., 2011; Suberbielle et al., 2013; Wei et al., 2016).
Those data suggest that endogenous breaks caused by transcription-replication collision can play a role during development.

Topoisomerase II activity

Transcription itself can lead to genome instability, notably through the catalytic activity of Topoisomerase IIβ (Figure 6) (Drolet et al., 1995; Ju et al., 2006). DNA topoisomerases are enzymes that relieve the overwinding of double-stranded DNA downstream of the replication or transcription machineries by creating transient breaks (Bollimpelli et al., 2017). There are 2 types of topoisomerases: type I, creating single strand breaks, and type II, creating double strand breaks. Those topoisomerases are divided into sub-categories. Notably, Topoisomerase IIA isoform β (TOP2B) is expressed in mammalian post-mitotic cells and has been implicated in transcription regulation (Bollimpelli et al., 2017). It has been proposed that, following transcription activation, TOP2B generates a break to release RNA polymerase II previously blocked by DNA torsion and allow it to resume elongation (Bunch et al., 2015). DNA breaks have been observed at early response genes following estrogen (Stork et al., 2016; Williamson and Lees-Miller, 2011) and androgen (Haffner et al., 2010) stimulation, but also after heat shock or serum stimulation (Bunch et al., 2015). Interestingly, this mechanism seems to take place in a subset of immediate response genes after stimulation of neuronal activity in primary cultured mouse neurons (Madabhushi et al., 2015; Stork et al., 2016; Suberbielle et al., 2013). TOP2B break induction represents a replication-independent mechanism of transcription-associated genome instability.

Figure 6: Topoisomerase-II-dependent DSB induction upon transcription stimulation (Marnef et al., 2017). Replication-independent DSB formation at active genes. A large number of genes exhibit RNA polymerase pausing that contributes to transcription regulation. Upon certain stimuli, the release of paused RNA polymerase II into the gene body may require topoisomerase IIβ (TOP2B) activity. DSBs might accidentally arise following impaired resealing of TOP2B intermediates.

R-loops

R-loops have been associated with genome instability in numerous studies (Crossley et al., 2019; Hamperl and Cimprich, 2014; Marnef et al., 2017; Skourti-Stathaki and Proudfoot, 2014; Sollier and Cimprich, 2015). Indeed, stabilization of R-loops by depletion of processing factors is a source of DNA damage. There are several ways in which R-loops can generate DNA breaks. The ssDNA displaced during R-loop formation can be a target for cleavage, and the resulting lesion can be converted into a DNA double strand break (DSB) during replication. Moreover, R-loops could be responsible for transcription-replication collisions (García-Muse and Aguilera, 2016). Finally, a new system of R-loop elimination has recently been shown to generate DNA damage (Sollier and Cimprich, 2015; Sollier et al., 2014).

(1) R-loops stabilization promotes genomic instability

A considerable number of studies showed that the absence of R-loop processing factors leads not only to R-loop accumulation but also to DNA damage. Mutations of RNase H1 and RNase H2 lead to DNA breaks in yeast (Costantino and Koshland, 2018; Wahba et al., 2011). RNase H2 mutation in mice also results in genome instability (Reijns et al., 2012). Remarkably, mutations of RNase H2 sub-units are responsible for Aicardi-Goutières syndrome, a severe neuroinflammatory disorder. Genomic instability caused by RNase H mutation could be responsible for the neurodevelopmental defects observed in this disorder. Helicase depletion is also a source of DNA damage (reviewed in Puget et al., 2019 (in revision)). In bacteria, depletion of the helicase RecG combined with RNase H depletion is lethal (Hong et al., 1995). RecG is thought to prevent R-loop accumulation and to catalyze reverse branch migration of Holliday junctions. In yeast, it has been observed that Sen1 is necessary for R-loop resolution and for preventing DNA damage (Mischo et al., 2011). A similar role has been identified for the DHX9 helicase, which is required for transcription termination and for preventing R-loop-associated damage (Cristini et al., 2018). Interestingly, it has been shown that G4 stabilization by pyridostatin treatment or by depletion of G4 helicases (BLM or Pif1) can also generate DNA damage at transcribed regions (Rodriguez et al., 2012; van Wietmarschen et al., 2018). Moreover, depletion of Sgs1, the yeast BLM homolog, increases R-loop formation and transcription/replication collisions (Chang et al., 2017). Similar results were found in human cells in the same study. A similar role was suggested for FANCJ, a protein involved in the Fanconi Anemia pathway (London et al., 2008; Wu et al., 2008). It can be argued that G4 helicase depletion would in this case lead to G4 stabilization, hence DNA opening and therefore R-loop accumulation. Finally, mRNA processing factors also guard against DNA damage. In yeast, depletion of Tho/TREX complex sub-units generates genomic instability (Huertas and Aguilera, 2003; Jimeno et al., 2006). Depletion of splicing factors such as ASF/SF2 (Gavaldá et al., 2013; Li and Manley, 2005; Li et al., 2007b; Wan et al., 2015) and of Trf4, a protein involved in RNA surveillance (Gavaldá et al., 2013), led to similar results.

Figure 7: R-loops induced genome instability (adapted from Marnef et al., 2017). A. Transcription/replication machineries collision: G4 and/or R-loops could trigger RNA polymerase pausing, and the paused polymerase can collide with the replication machinery. Such a collision can directly trigger DSB production. B. ssDNA sensitivity: displaced single-stranded DNA is more sensitive to various enzymes and nucleases. Topoisomerase 1 (Top1) is recruited to release superhelical stress in the R-loop’s DNA template and its cleavage-ligation activity can also be a source of damage. During Class Switch Recombination in activated B lymphocytes, GC-rich ssDNA can be targeted by AID. Sae2/CtIP could target the R-loop’s ssDNA in S phase. The single-strand breaks (SSB) generated can be converted into DSB during replication or can lead to fork collapse. C. Transcription-coupled Nucleotide Excision Repair: R-loops can be targeted by the endonucleases XPG and XPF, components of the TC-NER machinery, which generates a single-strand gap. Single-strand breaks lead to fork collapse and DSB generation during replication.

Altogether, those studies show that R-loop stabilization generates genomic instability. Several mechanisms can explain how R-loops threaten genome integrity.

(2) Collision with replication machinery

Because transcription and DNA replication are both frequent in cells, the 2 machineries can encounter each other and eventually collide. Indeed, the replisome moving along single-stranded DNA cannot progress through an RNA polymerase II embracing double-stranded DNA. Therefore, their encounter necessarily leads to conflicts and eventually to DNA breaks and chromosomal rearrangements (García-Muse and Aguilera, 2016; Lin and Pasero, 2012). There are 2 types of collision: head-on (HO) and co-directional (CO). In a HO collision, the replisome and RNA polymerase II progress in opposite directions toward each other, which may lead to fork collapse. On the contrary, in a CO collision, the 2 machineries move in the same direction; the collision can nevertheless occur because RNA polymerase II is slower (as low as 0.5 kb/min in human) than the replisome (2-3 kb/min in human (Méchali, 2010)). In this case, replication can resume if RNA polymerase II is removed from DNA (García-Muse and Aguilera, 2016; Lin and Pasero, 2012). Several factors have been identified as affecting collisions, such as DNA supercoiling, G-quadruplexes and R-loops. Those factors are thought to limit the progression of the replication and transcription machineries and therefore promote conflicts between them (García-Muse and Aguilera, 2016). Indeed, a large body of evidence shows that R-loops affect genome integrity by colliding with the replication fork (Figure 7A). In breast cancer cells, estrogen was shown to induce R-loop accumulation and DNA damage, notably in S-phase. Strikingly, transcription inhibition abolished DNA damage in the same conditions. Those data suggest that genome instability was caused by collision between the replication machinery and potentially R-loops (Stork et al., 2016). Moreover, factors involved in transcription termination and R-loop resolution such as Sen1 (Alzu et al., 2012) and Xrn2 (Morales et al., 2016) have been shown to protect replication forks in yeast and HeLa cells, respectively. More recently, a screen assessing replication capacity and replication stress sensitivity following protein depletion also identified pre-mRNA cleavage factors as necessary for fork protection (Teloni et al., 2019). It has been observed that collision orientation affects both R-loops and DNA damage signaling. Using both a reporter system and genomic data (DRIP-seq compared to GRO-seq from HeLa cells), Hamperl et al. (2017) observed that CO collisions reduce R-loops and promote ATM signaling, whereas HO collisions increase R-loops and promote ATR signaling.

To conclude, R-loops can produce genomic instability by colliding with the replication machinery.
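The speed argument behind head-on and co-directional collisions can be made concrete with a back-of-the-envelope calculation. The sketch below uses the rates quoted above (RNA polymerase II at 0.5 kb/min, replisome at 2 kb/min); the 30 kb initial distance and the function names are assumptions chosen for illustration only, and both machineries are assumed to move at constant speed without pausing.

```python
def co_directional_catch_up(gap_kb: float, replisome_rate: float, pol2_rate: float) -> float:
    """Minutes needed for a replisome to catch up with an RNA polymerase II
    moving in the same direction, gap_kb ahead of it (rates in kb/min)."""
    if replisome_rate <= pol2_rate:
        raise ValueError("the replisome never catches up if it is not faster")
    return gap_kb / (replisome_rate - pol2_rate)


def head_on_encounter(gap_kb: float, replisome_rate: float, pol2_rate: float) -> float:
    """Minutes before the 2 machineries meet when they move toward each other."""
    return gap_kb / (replisome_rate + pol2_rate)


# Hypothetical 30 kb initial distance, rates taken from the text.
co_minutes = co_directional_catch_up(30.0, replisome_rate=2.0, pol2_rate=0.5)
ho_minutes = head_on_encounter(30.0, replisome_rate=2.0, pol2_rate=0.5)
```

Under these assumptions the co-directional encounter takes 20 minutes and the head-on encounter 12 minutes. The point is only that a co-directional collision requires the replisome to be faster than the polymerase, whereas a head-on encounter happens regardless of the relative speeds.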

(3) ssDNA sensitivity

The ssDNA displaced during R-loop formation can be a target for various nucleases and enzymes that can modify DNA and eventually induce DNA damage (Figure 7B) (Hamperl and Cimprich, 2014; Sollier and Cimprich, 2015). A classic example of this process takes place during Ig heavy chain Class Switch Recombination in activated B lymphocytes (Santos-Pereira and Aguilera, 2015; Skourti-Stathaki and Proudfoot, 2014). This event consists of recombination between 2 switch (S) regions where R-loops are formed (Yu et al., 2003). R-loops are essential for this process. Indeed, the ssDNA displaced by R-loop formation is GC-rich and can be bound by an enzyme of the APOBEC family: activation-induced cytidine deaminase (AID). AID deaminates dC to dU, which can be processed by the base-excision repair or mismatch repair machineries. Those processes can lead to nicks that are converted into DNA double strand breaks (DSB) during replication (Skourti-Stathaki and Proudfoot, 2014). Topoisomerase 1 (Top1) cleavage-ligation activity can also be a source of damage (Hamperl and Cimprich, 2014). In Saccharomyces cerevisiae, Top1 is recruited at transcriptionally active regions (Lippert et al., 2011), which coincide with hotspots of 2 or 3 nucleotide deletions. It is presumed that Top1 is recruited to release superhelical stress in the R-loop’s DNA template (Takahashi et al., 2011). However, it is possible that Top1 stays trapped and forms a covalent complex with the 3’ end of the nicked DNA. It is proposed that the nick is then processed by Rad1 and/or Mus81 into a gap (Takahashi et al., 2011). If the single strand DNA (ssDNA) forms a G4, it could also be targeted by a specific G4-endonuclease. This type of enzyme has been characterized in human and is able to cleave the single strand 5’ region of the G4 (Sun et al., 2001), creating a single strand break (SSB).


Finally, the R-loop’s ssDNA can be targeted by the Transcription-coupled Nucleotide Excision Repair (TC-NER) pathway, which would generate DSB (Figure 7C) (Sollier et al., 2014; Stork et al., 2016). Indeed, in human cells, following depletion of RNA:DNA helicases such as AQR and SETX, Cockayne Syndrome group B (CSB) promotes the activity of the XPF and XPG endonucleases (Sollier et al., 2014). Those enzymes can either cleave the RNA-DNA hybrid, creating a SSB that can eventually be converted into a DSB during replication, or cleave the entire R-loop, directly creating a DSB. Interestingly, in human breast cancer cells, it was shown that estrogen treatment led to R-loop accumulation and DSB induction, which was reduced by depletion of XPG and CSB (Stork et al., 2016). Those results suggest that TC-NER is also responsible for genome instability in human breast cancer cells by targeting R-loops. More recently, it was proposed that Sae2/CtIP could target the R-loop’s ssDNA in S phase (Figure 7B) (Makharashvili et al., 2018). Altogether, those studies suggest that ssDNA can be a target for various factors, leading to genomic instability.

R-loops in human diseases

Mutations of genes coding for R-loop processing proteins not only generate genomic instability but are also linked to human diseases such as trinucleotide repeat-associated diseases, cancers and neurological diseases (Groh and Gromak, 2014; Richard and Manley, 2017). Tri-nucleotide repeat-associated diseases are caused by repeat expansions in specific genes (see table 1 for examples). It has been shown in vitro and in vivo that transcription of those repeats leads to R-loop formation in bacteria and in human cells (Grabczyk et al., 2007; Loomis et al., 2014). Eventually, those R-loops can generate genetic instability such as repeat expansion or deletion (Lin et al., 2010; McIvor et al., 2010). This mechanism has been particularly observed in lymphoblastoid cell lines derived from Friedreich ataxia (FRDA) and Fragile X syndrome (FXS) patients (Groh et al., 2014). It was demonstrated that stable R-loops co-localize with expansion sites and repressive marks in these diseases (Groh et al., 2014; Loomis et al., 2014). Furthermore, it was shown that R-loops were responsible for the formation of repressive chromatin marks and for transcriptional silencing in FXS (Colak et al., 2014; Groh et al., 2014). As repeats in those diseases are GC rich (Pearson et al., 2005), R-loop formation at expansion sites could be facilitated by G-quadruplex formation. There are over 40 repeat-associated genetic disorders; it would be of significant interest to see if this mechanism takes place in all those diseases.

Table 1: Tri-nucleotide repeat-associated diseases

Name of the disease                                   Gene mutated                          Gene symbol   Repeats
Huntington’s disease                                  Huntingtin                            HTT           CAG
Myotonic dystrophy type 1                             Dystrophia myotonica protein kinase   DMPK          CTG
Spinocerebellar ataxia type 1                         Ataxin 1                              ATXN1         CAG
Fragile X mental retardation or Fragile X syndrome    Fragile X mental retardation 1        FMR1          CGG
Friedreich ataxia                                     Frataxin                              FXN           GAA

Additionally, mutations of proteins involved in R-loop resolution have been identified as responsible for human cancers. In eosinophilic leukemia, inactivation of FIP1L1, a polyadenylation factor, induces R-loop mediated damage (Stirling et al., 2012). Similarly, in testicular germ cell cancer (or seminoma), downregulation of the E3 ligase Bre1, a factor involved in transcription elongation, creates DNA damage that is sensitive to RNase H1 and thus probably R-loop dependent (Chernikova et al., 2012). R-loop accumulation resulting in fork stalling and DNA breaks has also been observed in cells from Fanconi Anemia (FA) patients, a disorder characterized by a high cancer risk (Schwab et al., 2015). Mutations of FANCJ, which underlie breast cancer and Fanconi Anemia, lead to a stabilization of G-quadruplexes, which could be responsible for R-loop accumulation and eventually DNA damage in this disease (Wu et al., 2008). The same mechanism could contribute to Bloom syndrome, a disease characterized notably by cancer predisposition. Indeed, like FANCJ, BLM is a G4 helicase and its depletion leads to similar phenotypes (Chang et al., 2017; van Wietmarschen et al., 2018), and it has been suggested that they could interact with each other (Dhar and Brosh, 2018). Through the sequestration of the human TREX complex, Kaposi’s sarcoma-associated virus could also lead to R-loop induced DNA breaks (Jackson et al., 2014). This virus causes various AIDS-related cancers, meaning that human oncogenic viruses could, through R-loop accumulation, generate DNA damage that contributes to tumorigenesis.

Several neurological disorders have been associated with R-loop accumulation (Groh and Gromak, 2014; Richard and Manley, 2017). For example, cells from Aicardi-Goutières Syndrome patients present a genome-wide increase of R-loops (Lim et al., 2015). It is an inflammatory disorder causing neurological damage in infants and caused by mutations in TREX1 (coding for a 3’-5’ exonuclease) or in RNase H2 (Rice et al., 2007). Interestingly, the increase of R-loops was localized in intergenic or gene body regions and not at the 5’ or 3’ end of genes. This would indicate a different sort of R-loops than those forming naturally.
Patient cells mutated for RNase H2 also show hypomethylation at R-loop forming sites (Lim et al., 2015). This epigenetic change could be responsible for the immune system response typical of the disease. Furthermore, evidence suggests that certain cases of Amyotrophic Lateral Sclerosis (ALS), a severe motor neuron disease, could implicate R-loops (Salvi and Mekhail, 2015). One of the most common causes of ALS is an expansion of the hexanucleotide GGGGCC in the first intron of the C9ORF72 gene (DeJesus-Hernandez et al., 2011; Majounie et al., 2012). There are usually around 30 repeats of this hexanucleotide; however, mutated genes can carry up to thousands of repeats. In vitro, it has been observed that those repeats can form G-quadruplexes and promote R-loop formation (Fratta et al., 2012). This hexanucleotide expansion has not been found in any other ALS disease; however, G-quadruplex formation through other GC repeat sequences could be responsible for some of the 50% of ALS cases without known mutations. Finally, mutations of senataxin (SETX), a putative RNA:DNA helicase, are responsible for 2 neurodegenerative diseases: dominant Amyotrophic Lateral Sclerosis 4 (ALS4) (Chen et al., 2004) and recessive Ataxia with Oculomotor Apraxia type 2 (AOA2) (Moreira et al., 2004). They are characterized by motor neuron degeneration, muscle weakness and cerebellar atrophy, and both manifest during childhood. As reported previously, Sen1/SETX has been shown to protect the genome against instability by removing R-loops and allowing transcription termination (Alzu et al., 2012; De Amicis et al., 2011; Richard et al., 2013; Yüce and West, 2013). Loss of this protection could promote apoptosis, which could account for neuronal death in these diseases.

Figure 8: DSB detection and signaling. DSB repair begins with DSB detection by sensing proteins: the MRN and 9-1-1 complexes. Those sensing proteins promote recruitment of the ATM and ATR kinases, which phosphorylate H2AX. γH2AX allows the recruitment of 53BP1 and MDC1, which in turn are phosphorylated by ATM and ATR, promoting signal transduction and amplification.

II. DNA double strand breaks repair

Genome integrity is threatened by various types of damage. DNA double strand breaks (DSB) are the most deleterious lesions for the cell, as they can lead to mutations, chromosomal rearrangements and eventually cell death. Repair pathways exist in cells to fix those breaks. Those pathways require chromatin modifications and the recruitment and action of various enzymes and proteins, depending on the cell cycle phase, the cell type and the localization of the break on the genome. This topic has been extensively reviewed and will be summarized here.

A. DSB detection and signaling

The first step of the DNA damage response (DDR) is DSB detection (Figure 8). Sensor proteins such as PARP1/2, Ku70/80, the MRN complex (Mre11, RAD50, NBS1) or the 9-1-1 complex (RAD9, RAD1, HUS1) can bind damaged DNA (D'Amours et al., 1999; de Murcia and Ménissier de Murcia, 1994; Paull and Lee, 2005). Those proteins allow the recruitment of essential kinases of the PIKK (PhosphatidylInositol 3-Kinase-related Kinases) family: ATM and ATR. They both have a serine-threonine kinase activity and a number of common targets. ATM is exclusively recruited and activated at DSB by NBS1 from the MRN complex (Lee and Paull, 2005). On the contrary, ATR can also bind SSB through its partner ATRIP and through RPA, which recognizes ssDNA (Zou and Elledge, 2003). The first and major target of ATM and ATR is the histone variant H2AX (Figure 8). They phosphorylate H2AX on serine 139 (Rogakou et al., 1998), which is then called γH2AX. γH2AX is not only a strong marker of damaged DNA, spreading over several megabases around the break (Rogakou et al., 1999), but also a major recruitment platform for other DDR proteins. Among those proteins are 53BP1 (p53 binding protein 1) (Anderson et al., 2001) and MDC1 (Mediator of DNA damage Checkpoint 1) (Stucki et al., 2005), which recognize phosphorylated residues with their tandem BRCT (BRCA1 C-Terminus) domains. Hence, those proteins are recruited on γH2AX and phosphorylated by ATM, which allows signal transduction (Figure 8). MDC1 participates in the recruitment of numerous DDR proteins (Stucki et al., 2005). The accumulation of γH2AX and of all those DDR proteins at the break can be visualized as foci in the nucleus by microscopy. ATM and ATR kinase activity promotes cell cycle arrest as well (Figure 9). This step is required to prevent the transmission of damage to daughter cells.
• G1 block (Figure 9A): phosphorylation of CHK2 (Checkpoint Kinase 2) (Ahn et al., 2000) and of the ligase MDM2 (Maya et al., 2001) by ATM limits p53 degradation, promotes its accumulation and activates the transcription of its targets such as p21. Finally, p21 inhibits Cdk (Cyclin dependent kinase) 2/cyclin E and Cdk4-6/cyclin D activity and blocks the cell cycle in G1.
• G1/S transition block (Figure 9A): CHK1 and CHK2, phosphorylated by ATR and ATM respectively, can phosphorylate the cdc25 phosphatase, which leads to its ubiquitinylation and degradation. Cdk2/cyclin E is consequently inhibited and the cell cycle is blocked (Deckbar et al., 2011; Warmerdam and Kanaar, 2010).
• G2/M transition block (Figure 9B): similarly to the G1/S transition block, CHK1 and CHK2 phosphorylation by ATM and ATR leads to cdc25a phosphorylation. Cdc25a is then exported and sequestered in the cytoplasm by the protein 14-3-3. The cdk1/cyclin B complex is inhibited and the cell cycle is blocked. Another possibility is the direct phosphorylation of p53 by ATM/ATR (or via CHK1/CHK2), leading to p21 expression and cdk1/cyclin B inhibition (Deckbar et al., 2011; Warmerdam and Kanaar, 2010).

Figure 9: ATM and ATR activity on cell cycle arrest. A. G1/S phase arrest: G1 arrest can be achieved by p53 stabilization. p53 can then induce p21 production and therefore inhibit the Cdk/Cyclins involved in the G1 to S phase transition. Cdc25a degradation can also take part in cell cycle arrest by blocking the activation of Cdk2/Cyclin E involved in this transition. (phosphorylations: purple star) B. G2/M phase arrest: as for G1/S phase arrest, p21 accumulation inhibits Cdk1/Cyclin B. This arrest can also be achieved by cdc25a sequestration in the cytoplasm by the 14-3-3 complex, hence blocking the activation of Cdk1/Cyclin B. (phosphorylations: purple star)
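The three checkpoint routes listed above can be summarized as a small lookup table. This is a deliberately simplified sketch that only encodes the relationships stated in the bullets (which upstream kinase feeds which route, and which Cdk/cyclin complexes end up inhibited); the data structure and the function name `blocked_checkpoints` are assumptions made for illustration, not an established model of checkpoint signaling.

```python
# Each route: checkpoint, upstream kinases able to trigger it (any one suffices
# in this sketch), the intermediates named in the text, and the inhibited complexes.
CHECKPOINT_ROUTES = [
    {"checkpoint": "G1", "kinases": {"ATM"},
     "route": "CHK2 and MDM2 phosphorylation -> p53 accumulation -> p21",
     "inhibited": ["Cdk2/cyclin E", "Cdk4-6/cyclin D"]},
    {"checkpoint": "G1/S", "kinases": {"ATM", "ATR"},
     "route": "CHK1/CHK2 -> cdc25 phosphorylation and degradation",
     "inhibited": ["Cdk2/cyclin E"]},
    {"checkpoint": "G2/M", "kinases": {"ATM", "ATR"},
     "route": "CHK1/CHK2 -> cdc25a sequestration by 14-3-3",
     "inhibited": ["Cdk1/cyclin B"]},
    {"checkpoint": "G2/M", "kinases": {"ATM", "ATR"},
     "route": "p53 phosphorylation -> p21",
     "inhibited": ["Cdk1/cyclin B"]},
]


def blocked_checkpoints(active_kinases: set[str]) -> list[str]:
    """Checkpoints with at least one route triggered by an active kinase."""
    blocked = {
        entry["checkpoint"]
        for entry in CHECKPOINT_ROUTES
        if entry["kinases"] & active_kinases
    }
    return sorted(blocked)


atm_only = blocked_checkpoints({"ATM"})
atr_only = blocked_checkpoints({"ATR"})
```

With this encoding, ATM alone reaches all three checkpoints, whereas ATR alone reaches only the G1/S and G2/M routes, because the G1 block is described above as ATM dependent.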

B. Mechanisms

There are 2 major repair mechanisms: one requires only the ligation or “joining” of both extremities, and the other, more complex, requires an intact homologous sequence as a template to repair the DSB.

Non-Homologous End Joining

Non-homologous end joining (NHEJ) is the major DSB repair mechanism in human cells, as it occurs throughout the cell cycle. It consists only of the ligation of both extremities (Figure 10). The first step is the binding of Ku70 and Ku80 at the break (Walker et al., 2001). Those 2 proteins form a complex that interacts with the PI3K-like protein DNA-PK at the extremities (Weterings et al., 2003). DNA-PK catalyzes the phosphorylation of targets such as H2AX serine 139 but also its own auto-phosphorylation (Costantini et al., 2007). Some damaging agents can lead to base modifications that make direct ligation between extremities impossible. Therefore, NHEJ can require the maturation of extremities by various enzymes (phosphatases, nucleases or kinases) such as EXO1, MRE11, Artemis and WRN (Kusumoto et al., 2008; Weterings et al., 2009; Xie et al., 2009). Degradation by nucleases can sometimes require DNA synthesis of a few nucleotides by DNA polymerases λ and μ to restore the extremities (Capp et al., 2007; Lee et al., 2004). All those modifications can reduce repair fidelity. The final ligation step is achieved by the XRCC4/XLF/DNA ligase IV complex (Ahnesorg et al., 2006). XRCC4 itself does not possess catalytic activity; however, it is essential for DNA ligase IV recruitment at the break.

Figure 10: Non-Homologous End Joining pathway. Early Ku70/80 recruitment at the DSB promotes DNA-PK binding, which maintains the extremities together. DNA-PK phosphorylates H2AX and itself, which promotes the recruitment of various enzymes (polymerases, phosphatases, nucleases or kinases) such as EXO1, MRE11, Artemis and WRN, hence leading to extremities maturation. Extremities ligation is achieved by the XRCC4/XLF/DNA ligase IV complex.

Homologous Recombination

The second major repair mechanism in human cells is Homologous Recombination (HR). It is known to be a more precise mechanism than NHEJ (Figure 11) as it requires a homologous sequence as template for repair. The first and crucial step of HR is resection by nucleases. Resection of the 5’ strand generates a 3’ ssDNA tail. It is initiated by MRE11 (Lavin, 2004) and CtIP (Sartori et al., 2007) and can be pursued by 2 mechanisms. The first one requires DNA unwinding by RecQ helicases such as BLM or WRN, followed by DNA2 nuclease activity (Gravel et al., 2008). The second mechanism simply requires direct degradation of dsDNA into ssDNA by EXO1 (Nimonkar et al., 2011). Resection is followed by coating and stabilization of ssDNA by RPA (Wold, 1997). RPA is then displaced by BRCA2 (Pellegrini et al., 2002), recruited by BRCA1 and PALB2 (Zhang et al., 2009), and RAD52. This displacement allows RAD51 recruitment on ssDNA (Conway et al., 2004) in order to achieve homology search: the search for a sequence homologous to the damaged one. It is followed by invasion of the dsDNA by the Rad51-ssDNA filament. This invasion generates a D-loop with the 3’ end as start for DNA synthesis by a DNA polymerase, such as DNA polymerase δ, followed by ligation. The second damaged extremity is either engaged in the D-loop by independent invasion or captured by RAD52. Those steps lead to the formation of a double Holliday Junction (dHJ) that can be resolved by 2 mechanisms.

• Resolution by nucleases: MUS81-EME1, GEN1 or resolvase A, which can lead to crossing-over (CO, with extensive exchange of information) or gene conversion (GC, limited exchange) (Heyer, 2004).

• Resolution by helicases: the major mechanism, in which BLM and TOPIII-α activity leads to limited information exchange (neither CO nor GC) (Raynard et al., 2006).

Figure 11: Homologous Recombination pathway. The first step of HR is resection, then single-stranded DNA is coated by RPA. RPA is replaced by RAD51 thanks to BRCA2. It is followed by invasion of the dsDNA by the Rad51-ssDNA filament for homology search. This invasion generates a D-loop with the 3’ end as start for DNA synthesis by DNA polymerase δ, followed by ligation. Depending on the resolution of the double Holliday junction, HR can lead to limited information exchange, crossing-over or gene conversion.

Alternative End-joining/Microhomology Mediated End-joining

In cases where NHEJ factors are missing or HR is blocked after the resection step, making it impossible to use NHEJ, DSB repair can be achieved by a rescue mechanism: alternative end-joining (Alt-NHEJ) or microhomology mediated end-joining (MMEJ). This mechanism seems to occur preferentially in S-phase (Truong et al., 2013) and is based on the ligation of microhomology sequences located on each side of the break. As for the other mechanisms, the DSB is first detected and signaled, then the 5’ ends are resected as in HR, leaving 3’ single strands exposed. As for HR, it appears that CtIP and the MRN complex are required for the resection step (Rass et al., 2009). Microhomologies of 5-25 bp on those 3’ ends are then able to anneal together. Non-annealed DNA ends are eventually removed, which can be followed by a step of gap filling and ligation (by DNA ligase 4 in yeast and 3 in mammals) (Frankenberg-Schwager et al., 2009). This mechanism can be highly error-prone, leading to loss of genetic information as the DNA in between the microhomologies is lost. Therefore, this mechanism is kept under control, for example by 53BP1 and BLM, which block resection. This mechanism is indeed mainly used in cases where HR or NHEJ are unavailable. However, Alt-NHEJ was also implicated in CSR (Yan et al., 2007).

As several mechanisms can be used for DSB repair, it is important to identify factors influencing the choice between those mechanisms.

C. Factors influencing the choice between the 2 major repair pathways

HR and NHEJ are essential for cell survival. However, misuse or failure of those pathways can also generate genomic instability (Cannan and Pederson, 2016; Chapman et al., 2012; Clouaire and Legube, 2015; Symington and Gautier, 2011). On one hand, extremities maturation during NHEJ can lead to a loss of fidelity, although it is a mainly conservative pathway. NHEJ could also cause translocations and telomere fusions. On the other hand, HR is extremely, if not perfectly, conservative when using the sister chromatid as template for repair. However, breaks occurring in repeated sequences or repaired using the homologous chromosome can have harmful consequences: they can lead, respectively, to amplifications/deletions or to loss of heterozygosity. Evidence shows that HR and NHEJ can co-exist in the cell, but also that the loss of one can be compensated by the other. Considering the consequences of those repair events, the choice between both mechanisms is crucial, and so is the identification of the factors influencing this choice.

Resection regulation

Resection initiation

Resection is the hallmark step of HR. Once resection is initiated, NHEJ can no longer be used; it is therefore a turning point for repair: HR has to be completed or repair will be taken over by the error-prone Alt-NHEJ. Different factors either favor or limit extremities degradation. Resection can be achieved by 2 complementary mechanisms: 5’-3’ end resection, from the break, by Exo1 and DNA2, or 3’-5’ end resection, toward the break, by the MRN complex and CtIP/Sae2. The 3’-5’ end resection activity of the MRN complex and CtIP can come in handy for chemically modified extremities that are inaccessible to Exo1/DNA2 (Symington and Gautier, 2011). Indeed, Exo1/DNA2 are more efficient at processing free DNA ends and are also known to promote long-range resection. Finally, CtIP appears to be the main resection nuclease as it has its own endonuclease activity and also, through its association with the MRN complex, activates the MRE11 endonuclease function. This association requires CtIP ubiquitinylation by BRCA1, allowing CtIP recruitment to chromatin (Yu et al., 2006). The complex including MRN, CtIP and BRCA1 is called BRCA1C. BRCA1 depletion strongly affects resection (Schlegel et al., 2006). Once this step has started, DSB repair is committed to HR and NHEJ is no longer available; resection inhibition is therefore crucial for repair pathway choice.

Resection inhibition

There are a number of factors impeding resection, such as the Ku protein. Ku70/80 limits resection by binding to DSB extremities, protecting DNA ends from resection initiation and therefore directing repair towards NHEJ (Aparicio et al., 2014; Chapman et al., 2012). 53BP1 also participates in DSB repair by limiting resection. 53BP1 phosphorylation by ATM is not necessary for its recruitment to DSB but leads to its interaction with RIF1. 53BP1 and RIF1 act together to block resection. Several publications recently identified a factor downstream of 53BP1 and RIF1, the Shieldin complex (comprising REV7, c20orf196, FAM35A and CTC-534A2.2), and showed that Shieldin is necessary for DNA end protection and NHEJ (Ghezraoui et al., 2018; Gupta et al., 2018; Noordermeer et al., 2018). Additionally, 53BP1/RIF1 and Shieldin were recently found to counteract resection at DSB by recruiting CST (Ctc1, Stn1 and Ten1), an RPA-like complex, to DNA damage sites (Mirman et al., 2018). Once at break sites, CST would act as a Polymerase α/primase accessory factor. CST together with Polymerase α would “fill in” resected DSB and therefore prevent hyper-resection. It appears that 53BP1 also counteracts BRCA1 activity. Interestingly, the embryonic lethality caused by BRCA1 depletion can be rescued by 53BP1 depletion. Moreover, BRCA1 depletion promotes NHEJ and 53BP1 depletion promotes HR, whereas double depletion restores a wild-type phenotype. This evidence suggests that BRCA1 and 53BP1 have antagonistic roles in DSB repair pathway choice.

Factors regulating resection strongly influence DSB repair pathway choice. However, they are themselves subject to regulation, and other components also play a role in this choice.

Chromatin state around DSB

The chromatin landscape around the DSB can influence which repair pathway is used. Genome-wide mapping of histones and chromatin proteins allowed the identification of a repair “histone code” committing DSB to HR or NHEJ (Clouaire and Legube, 2015; Clouaire et al., 2018). DSB occurring in heterochromatin are repaired by a specific ATM-dependent HR pathway, using Artemis as exonuclease (Beucher et al., 2009) and, surprisingly, dependent on 53BP1 and on the RNF168 and RNF8 ubiquitin ligases (Goodarzi et al., 2008; Noon et al., 2010). The reason why NHEJ might be blocked in heterochromatin is chromatin compaction in this region (Shibata et al., 2011). Heterochromatin specificity relies on a pathway depending on HP1 and H3K9me3 (Sun et al., 2009). It has been shown that following DSB induction, HP1 is removed in cis of the break, exposing H3K9me3, which is usually bound by HP1. H3K9me3 allows the loading of TIP60, a histone acetyltransferase that promotes nucleosome removal and resection (Sun et al., 2009). Additionally, H3K36me3 is specific to active transcription and is usually found on the body of actively transcribed genes (Edmunds et al., 2008). Several groups showed that SETD2, the H3K36me3 histone methyltransferase, is essential for HR (Aymard et al., 2014; Carvalho et al., 2014; Pfister et al., 2014). Our lab also showed that the interaction of LEDGF (Lens Epithelium-Derived Growth Factor) with H3K36me3 allows RAD51 recruitment. CtIP was also found to interact with LEDGF (Aymard et al., 2014). Those results suggest that DSB induced in transcribed regions enriched in active chromatin marks are preferentially repaired by HR. Importantly, SETD2 and H3K36me3 are not respectively recruited and enriched after DSB induction but are already present before, suggesting a pre-existing code for DSB repair choice. Altogether, a “DSB repair choice histone code” was proposed. Indeed, just as H3K36me3 is responsible for HR choice at active genes, inactive chromatin marks have been identified as responsible for NHEJ choice in non-transcribed regions (Clouaire et al., 2018). Just as H3K36me3 does for RAD51 and CtIP, H4K20me1/2 and H2AK15ub help stabilize 53BP1 at the break, orienting repair towards NHEJ in G1.

The chromatin landscape around the break can explain not only the repair choice made at transcribed and untranscribed regions but also repair choices in different cell types. This chromatin state can also be influenced by the cell cycle.

Cell cycle dependent

The cell cycle is known to play a role in pathway choice (Hustedt and Durocher, 2016). Evidently, as HR requires the sister chromatid as template for repair, it is preferentially used during S/G2 phase, as the sister chromatid is absent in G1 phase. As mentioned previously, breaks induced in actively transcribed loci are preferentially repaired by HR whereas breaks in untranscribed loci are preferentially repaired by NHEJ. We showed recently by capture Hi-C (High chromosome Contact, allowing contact detection between DNA fragments) that HR-prone breaks, when induced in G1, can cluster together (Aymard et al., 2017). Those breaks present delayed repair, suggesting that DSB are “sequestered” in clusters to protect them from deleterious repair mechanisms such as Alt-NHEJ. Other factors are influenced by the cell cycle.


The proteins mentioned above are impacted differently by cell cycle phases. First, their quantity varies during the cell cycle. CtIP, for example, is degraded by the proteasome in G1 and is therefore more abundant in S/G2 phase (Buis et al., 2012). As for BRCA1, its expression increases in G2 (Chen et al., 1996). Moreover, it was found that the cell cycle controls BRCA1 association with PALB2 and restricts BRCA2 activity to S/G2 phase (Orthwein et al., 2015). Post-translational modifications are also affected by the cell cycle and are crucial for repair pathway choice. Indeed, during the G1/S transition, CtIP is phosphorylated on serine 327 and threonine 847, respectively responsible for complex formation with MRN/BRCA1 (Yu and Chen, 2004) and resection initiation (Huertas and Jackson, 2009). Those phosphorylations take part in the mutual exclusion of 53BP1/RIF1 and CtIP/BRCA1 during the cell cycle. In G1, RIF1 antagonizes BRCA1, committing repair to NHEJ. However, in G2, CtIP phosphorylation by CDK1 on threonine 847 (Escribano-Díaz et al., 2013) promotes CtIP association with BRCA1, leading to RIF1 exclusion from the chromatin by BRCA1 and therefore pushing repair towards HR. The cell cycle influences chromatin marks specific to repair pathways as well. One clear example is the oscillation of histone H4 lysine 20 (H4K20) methylation during the cell cycle (Jørgensen et al., 2013). Indeed, in G1 phase, H4K20 is methylated (H4K20me2) and promotes 53BP1 recruitment at the break and therefore NHEJ. During S phase, 53BP1 accumulation at damage sites declines as H4K20me2 is diluted by the incorporation of H4K20me0 on newly synthesized DNA (Pellegrino et al., 2017). Conversely, H4K20me0 promotes BRCA1 recruitment at DSB in S/G2 phase, hence directing repair towards HR (Nakamura et al., 2019). Those data show that the cell cycle influences DSB repair pathway choice.

Type of breaks

In principle, “clean” DSB, with 3’ hydroxyl and 5’ phosphate groups, can be easily repaired by NHEJ or HR (Aparicio et al., 2014). However, damaging agents such as irradiation or radiomimetic drugs rarely induce “clean” breaks. They usually generate modifications at the break site, including 5’ hydroxyls, 3’ phosphates, abasic sites, covalently-bound adenylate groups and protein-DNA adducts. Those modifications require DNA ends to be restored before major DSB repair steps such as Ku binding, resection and ligation can take place, and such breaks are therefore repaired more slowly. Certain proteins involved in NHEJ and HR can directly process those modifications. Indeed, Ku can remove abasic sites near DSB (Roberts et al., 2010) and the MRN/CtIP complex (Hartsuiker et al., 2009) can release covalently-attached protein adducts from DNA ends in order to initiate resection. Additionally, other enzymes involved in the “cleaning” of DSB ends have been identified. Among others, PNKP (polynucleotide kinase 3’ phosphatase) is required to add phosphate groups to 5’ OH ends and to remove 3’ phosphates (Jilani et al., 1999). Aprataxin removes adenylate groups (Ahel et al., 2006). Finally, a number of nucleases such as Artemis, APLF (Aprataxin PNKP like Factor) (Kanno et al., 2007) and WRN catalyze cleavage upstream from the break to remove DNA-end lesions.

Altogether, the localization of the DSB on the genome, the damaging agent used and the moment of the cell cycle at which the DSB is induced seem to play a major role in DSB repair pathway choice. However, the exact mechanisms used in those different conditions are still being investigated and numerous other factors can also participate in this choice. More precisely, the way transcription and secondary structures influence repair is still under investigation.


III. Transcription role in DSB repair

The role of transcription in DSB repair, and more specifically whether transcription is beneficial or harmful for repair, is controversial. On one hand, transcription and RNA itself could actively take part in DSB repair. Indeed, not only could DSB initiate transcription, but RNA could also be necessary for DSB signaling and furthermore serve as a template for repair during Homologous Recombination. However, a large amount of evidence also suggests that RNA would actually be a roadblock for repair. Indeed, transcription inhibition and RNA:DNA hybrid removal are necessary steps for DSB repair, allowing the recruitment of DDR proteins. Inhibition of these steps was shown to be strikingly deleterious for cell survival. Nonetheless, those studies reveal a specific mechanism for DSB repair in transcribed loci.

A. RNA: active player for DSB repair?

Transcription initiation following DSB induction

Considering that transcription could actively play a role in DSB repair, it has been suggested that DSB could initiate transcription (Figure 12). Deep-sequencing analysis showed that non-coding RNAs are produced after DSB induction and, more specifically, that those non-coding RNAs originate from the break (Bonath et al., 2018; Burger et al., 2019; Francia et al., 2012; Michalik et al., 2012; Michelini et al., 2017; Ohle et al., 2016; Wei et al., 2012). This phenomenon seems to be highly conserved as it was observed in plants, yeast, Drosophila and human cells. Moreover, those RNA products were observed with DSB reporters (induction by restriction enzymes such as I-PpoI and I-SceI) but also after irradiation. It has also been suggested that RNA polymerase II is recruited to the break site after DSB induction (Bonath et al., 2018; Burger et al., 2019; Michelini et al., 2017; Ohle et al., 2016), where it could be subjected to post-translational modifications such as phosphorylation of RNA polymerase II tyrosine 1 (Burger et al., 2019). The authors of those studies suggest that de novo transcription from the break site participates in repair. Interestingly, another model has been proposed in S. pombe (Ohle et al., 2016). Indeed, in this study, RNA:DNA hybrid formation on resected I-PpoI sites was observed in RNase H1 mutants, limiting resection and RPA recruitment and leading to an HR defect (Figure 12). On the contrary, RNase H1 overexpression led to long-range resection, RPA recruitment and loss of repeat regions. Those results challenge the current HR model by suggesting that hybrids limit resection and RPA recruitment, hence limiting the loss of repeated regions around DSB.

Figure 12: Transcription initiation following DSB induction. RNA polymerase II could be recruited to DNA ends after DSB induction and subjected to post-translational modifications such as Tyrosine 1 phosphorylation (P-Tyr 1). This recruitment could lead to de novo transcription from the break, possibly on the resected strand.

To summarize, those studies suggest that transcription could be initiated following DSB induction and that this step would be required for efficient repair. Additional work is necessary to confirm this theory and its impact on repair.

Non-coding RNAs target DDR proteins at the break

Small non-coding RNA

Most non-coding RNAs sequenced at the break are around 21-23 nucleotides long and are therefore mostly small non-coding RNAs. Numerous studies have suggested a role for small non-coding RNAs in DSB repair, and more specifically for small non-coding RNA processing factors (Figure 13). It has been proposed that Dicer and Drosha activities are required for the DDR and, more specifically, that their depletion by siRNAs in human cells leads to a decrease of pATM and ATM signal by immunofluorescence (Francia et al., 2016; Francia et al., 2012). RNase A treatment also led to a decrease of the DDR, implying that the products of Dicer and Drosha, small non-coding RNAs also called DNA Damage-response RNAs (DDRNAs), are necessary for signaling. As both strands appeared to be transcribed in vitro, it is possible that those small ncRNAs form a dsRNA (Francia et al., 2012). Moreover, Ago2 depletion is responsible for DSB repair defects (Gao et al., 2014; Wei et al., 2012) and for a decrease of Rad51 recruitment in human cells (Gao et al., 2014). However, in S. pombe it has been observed that Dcr1, the Dicer homolog, would take part in transcription termination by releasing RNA polymerase II at break sites (Castel et al., 2014). This study suggests that microprocessing factors could participate in DSB repair by playing a role in premature transcription termination at those sites. Finally, it has been observed that Dicer, Drosha and Ago2, by regulating non-coding RNA production, would remodel chromatin around the break (Wang and Goldstein, 2016). Their activity would result in chromatin opening, enhancing the recruitment of DSB repair proteins such as Rad51 and BRCA1.

In conclusion, small non-coding RNA processing seems to allow the targeting/recruitment of DDR proteins to the break.

Figure 13: Role of small non-coding RNA in DSB repair. Small non-coding RNA could be produced from the DSB (DNA Damage Response RNA) and processed by Dicer and Drosha. They have been implicated in various steps of DSB repair. Small ncRNA processing factors are required for ATM phosphorylation as well as chromatin remodeling. DDRNA have been involved in the recruitment of DDR proteins such as RAD51 through Ago association. Dicer also seems to be required for RNA polymerase II eviction.

Long non-coding RNA

A similar role in the DDR has been suggested for long non-coding RNAs, named damage-induced lncRNAs (dilncRNAs) (Figure 14) (Michelini et al., 2017). In this study, the authors use a NIH2/4 mouse cell line and an I-SceI endonuclease site to induce DSB. They propose that long non-coding RNAs are produced by RNA polymerase II, starting a few nucleotides from the break site. Moreover, they imply that RNA polymerase II interacts with MRN and that those long non-coding RNAs promote 53BP1 foci formation. Additionally, the D’Adda di Fagagna lab showed that dilncRNAs promote HR by recruiting BRCA1, BRCA2 and RAD51 at the break (D'Alessandro et al., 2018). In S/G2 phase, dilncRNAs would form RNA-DNA hybrids with resected DNA ends, as proposed in S. pombe. RNA-DNA hybrid levels would be regulated by RNase H2, recruited through its interaction with BRCA2. This regulation seems to contribute to HR. Further work would be necessary to decipher the exact involvement of those non-coding RNAs in DSB repair and how they could specifically target damaged sites.

RNA template for repair during Homologous Recombination

HR is the preferred mechanism for DSB repair in transcriptionally active regions. However, this mechanism is only available during S and G2 phase, when the sister chromatid can be used as template. Interestingly, it has been proposed that during G1, RNA could be used as template in HR (Meers et al., 2016) (Figure 15). This mechanism would require reverse transcription of RNA into DNA. In the early 1990s, studies provided the first evidence of genetic information transfer from RNA into DNA by reverse transcription in yeast (Derr and Strathern, 1993; Derr et al., 1991). More specifically, they showed that a cDNA produced by reverse transcription of endogenous cellular RNA could be integrated into the genome. Using a HIS3 reporter gene under a galactose-inducible promoter in which a DSB site is inserted, Keskin et al. showed that repair could be achieved by HIS3 cDNA via HR (Keskin et al., 2014). Interestingly, they also observed that RNA could directly repair DSB by serving as a template for HR (Keskin et al., 2014; Storici et al., 2007). Indeed, they first observed that single-stranded RNA oligonucleotides could repair a DSB induced in the leu2 marker, leading to Leu+ transformants. They confirmed the direct role of RNA by showing that depletion of the SPT3 gene, essential for Ty1 and Ty2 transposition, had no impact on repair frequency, indicating that RNA itself, and not cDNA, is used for repair (Storici et al., 2007). They proposed that DNA polymerases α and δ could copy short RNA tracts into DNA in vivo, as they were able to observe in vitro. This type of repair would be facilitated by the proximity of RNA to the break, as they were able to show that RNA produced in cis allows a higher repair frequency than RNA produced in trans (Keskin et al., 2014). Additionally, they proposed that repair using an RNA template is promoted by RNase H depletion, which avoids RNA degradation before repair completion (Keskin et al., 2014). Another possible use of RNA, as a bridge between the 2 extremities, was also proposed in this study (Keskin et al., 2014).

Figure 14: Role of long non-coding RNA in DSB repair. Long non-coding RNA production from DSB ends is thought to regulate resection extent and would promote MRN and 53BP1 recruitment. In S/G2 phase, it would limit resection extent and promote the recruitment of HR proteins. BRCA2 recruitment is thought to regulate RNA:DNA hybrid formation at the break through its association with RNase H2.

Figure 15: RNA template for DSB repair. At DSB in transcribed loci, CSB recruitment could promote RAD52 binding and RNA-templated repair. RNA could also serve as a bridge between extremities. RNA retrotransposition for repair could be achieved by the LINE-1 retrotransposase. This mechanism could allow homologous recombination when the sister chromatid is missing.

Finally, Keskin et al. demonstrated in vitro that RAD52 promotes RNA annealing to DSB-like DNA ends (dsDNA oligonucleotides), suggesting that RAD52 could favor RNA-templated HR or the use of RNA as a bridge between break extremities (Keskin et al., 2014). More recently, it was shown in vivo that Cockayne syndrome B protein (CSB) recruits RAD52 at break sites in an RNA-dependent manner in G0/G1 phase (Wei et al., 2015). CSB depletion leads to HR defects. Altogether, those results suggest that RNA in cis could be used as a template for HR in G0/G1 phase, when the sister chromatid is not available. Retrotransposition could be used as well for DSB repair in mammals. Using NHEJ-defective Chinese Hamster Ovary (CHO) cells, Long Interspersed Element-1 (LINE-1) was shown to integrate, and therefore retrotranspose, at DNA lesion sites (Morrish et al., 2007; Morrish et al., 2002). The use of a retrotransposase for DSB repair has also been proposed in human cell lines (Onozawa et al., 2014). However, in this study it was proposed that the RNA used as template is produced in trans of the break, contrary to the previous studies mentioned.

Overall, those studies provide an interesting mechanism for HR in G0/G1 phase, in the absence of a sister chromatid to serve as template for repair. Altogether, the production of RNA in cis or in trans of the break and its exact role as a DSB repair factor remain unclear. However, numerous studies suggest that RNA accumulation at the break could actually represent an obstacle for repair.

B. RNA: roadblock for repair

Contrary to what is stated above, a body of evidence indicates that RNA accumulation at break sites could actually impede repair, for example by limiting the access of DDR proteins to the break. To avoid RNA accumulation around the break, 2 steps appear to be necessary: first, transcription inhibition of damaged genes in order to avoid mRNA production and second, RNA:DNA hybrid removal from the break site. Numerous factors such as DDR pathway proteins seem to be implicated in RNA:DNA hybrid removal at the break site as well. Additionally, depletion of R-loop or G4 processing factors leads to repair defects after DSB induction. Hence, it is possible that regulation of RNA:DNA hybrid formation, caused by RNA polymerase stalling at the break site, is an essential component of the DNA damage response.

Transcription inhibition: stopping RNA production at the break

As RNA can represent an obstacle for DSB repair, the first step would be to avoid its production at damaged genes by repressing transcription. Indeed, there is substantial evidence that DSB induction promotes transcription inhibition (Figure 16) (D'Alessandro and d'Adda di Fagagna, 2017; Marnef et al., 2017; Mullenders, 2015).

Figure 16: Transcription repression at DSB. Gene transcription is inhibited locally following DSB induction, while transcription is maintained farther away within the γH2AX domains. Transcriptional repression is achieved by the eviction of RNA Pol II from the damaged gene and possibly via its degradation, as well as multiple chromatin modifications, including histone deacetylation by the NuRD complex and Polycomb dependent ubiquitination of H2A at lysine 119. Transcription is maintained in the rest of the γH2AX domain. Panels: 1. Damaged gene transcriptional repression; 2. Cis-transcriptional repression (4-13 kb from DSB); 3. Transcriptional maintenance within a γH2AX domain.

Following DSB induction in active genes, transcription, and more specifically RNA polymerase II elongation, is inhibited in cis of the break (Awwad et al., 2017; Iacovoni et al., 2010; Iannelli et al., 2017; Pankotai et al., 2012; Shanbhag et al., 2010; Solovjeva et al., 2007), whereas the transcription of active genes is not affected as severely in the rest of the γH2AX domain (Iacovoni et al., 2010; Iannelli et al., 2017). Transcription inhibition could also be mediated by RNA polymerase II degradation by the proteasome (Pankotai et al., 2012). Moreover, it was suggested that transcription inhibition at the break site is ATM-dependent (Iannelli et al., 2017; Shanbhag et al., 2010) and associated with chromatin modifications (Figure 16). Indeed, ATM activity leads to histone deacetylation by the NuRD complex (Gong et al., 2015; Gong et al., 2017), recruited at DNA damage by ZMYND8 (zinc finger and MYND [myeloid, Nervy, and DEAF-1] domain containing 8), which in turn leads to transcription repression. NuRD and ZMYND8 can also be regulated by the histone demethylase KDM5A (Gong et al., 2017). Indeed, KDM5A catalyzes H3K4me3 demethylation near DSB, which is required for NuRD-ZMYND8 recruitment. ATM is also responsible for transcription repression through H2AK119 ubiquitylation, which depends on the RNF8 and RNF168 ubiquitin ligases (Figure 16) (Shanbhag et al., 2010).
However, there is evidence that this ubiquitylation is catalyzed by the polycomb group (PRC1 and PRC2) proteins (Mattiroli et al., 2012). Indeed, ATM phosphorylates ENL, part of the Super Elongation Complex (SEC), which promotes its interaction with BMI1, part of the E3 ubiquitin ligase complex of PRC1 (Ui et al., 2015). In turn, BMI1 ubiquitinates H2AK119 at promoters and damaged sites (Ui et al., 2015). Additionally, the PRC2 sub-unit EZH2 promotes H3K27me3, which amplifies PRC1 recruitment and therefore reinforces transcription repression (Kakarougkas et al., 2014). Transcription inhibition by H2AK119 ubiquitylation is also mediated by the PBAF complex after ATM phosphorylation (Kakarougkas et al., 2014). This step is required for repair at early time points. Like PBAF (Brownlee et al., 2014), cohesins are necessary for sister chromatid cohesion (Figure 16). Interestingly, cohesins seem to share the same role as PBAF in transcription repression as well (Meisenberg et al., 2019). However, cohesins play this role throughout the cell cycle, contrary to their role in sister chromatid cohesion. Mutations of cohesin sub-units lead to large chromosomal rearrangements, suggesting that transcription inhibition is a crucial step for correct DNA repair. Transcription inhibition could promote repair by allowing repair proteins to access DSB sites and by limiting the production of defective mRNA.

Finally, once repair has been completed, transcription needs to be re-activated in order for cells to survive. It has been observed that re-activation depends on uH2A deubiquitylation by USP16 (Shanbhag et al., 2010).

[Figure 17 diagram (image not reproduced). Labels: G4 helicases?, G-quadruplex, DSB, RNA pol II, SETX, TC-NER, THO, TREX, HR, spliceosome machinery, exosome, RNA binding proteins.]

Figure 17: Removing R-loops from DSB. After DSB induction, RNA:DNA hybrids seem to accumulate at the break and need to be removed. Depletion of DDR proteins, and notably of HR proteins, leads not only to repair defects but also to RNA:DNA hybrid accumulation. Depletion of proteins that prevent RNA:DNA hybrid formation (RNA binding proteins) is also associated with DSB repair defects. G4 stabilization could also lead to RNA:DNA hybrid accumulation. Indeed, G4 stabilization by BLM depletion is also linked to HR impairment and chromosomal rearrangements. This evidence suggests that RNA:DNA hybrids need to be removed from the break for proper repair.

Removing RNA:DNA hybrids from DSB

Damage Response proteins

With transcription repressed, RNA polymerase II stalling could lead to RNA:DNA hybrid accumulation in the vicinity of the break. It would therefore be necessary to “clear” the break of any RNA lingering at the damaged site. It was observed that proteins involved in the DDR could be recruited upon RNA:DNA hybrid formation and participate in their removal (Figure 17). In S. cerevisiae, depletion of Rad51p, which is required to join stretches of nucleic acids together to repair DNA breaks, promotes RNA:DNA hybrids (Wahba et al., 2013). Recruitment of DDR proteins such as BRCA1, BRCA2, CSB and RAD52 by RNA:DNA hybrids was observed in human cells as well (Bhatia et al., 2014; D'Alessandro and d'Adda di Fagagna, 2017; Hatchi et al., 2015; Teng et al., 2018; Yasuhara et al., 2018). Interestingly, BRCA1 and BRCA2 were shown to interact with the RNA processing factors SETX and TREX-2, respectively, and their depletion was associated not only with hybrid accumulation but also with genome instability (Bhatia et al., 2014; Hatchi et al., 2015). Additionally, it was observed that RAD52 prevents alterations caused by aberrant NHEJ (Yasuhara et al., 2018).

Those studies suggest that DDR proteins participate in DSB repair, and potentially HR, by removing RNA: DNA hybrids from the break.

RNA binding proteins

As for DDR proteins, RNA binding factors could limit RNA:DNA hybrid accumulation at break sites. Interestingly, several genome-wide screens, using repair reporters or irradiation to induce breaks, identified that RNA processing factors play a role in the DNA damage response (Figure 17) (Adamson et al., 2012; Maréchal et al., 2014; Simon et al., 2017; Yuan et al., 2014). Together with other studies, they showed that RNA splicing factors such as THRAP3, RBMX, SNRPA1, Pso4, PRP19 and RBM14 (Abbas et al., 2014; Adamson et al., 2012; Beli et al., 2012; Maréchal et al., 2014; Simon et al., 2017; Tanikawa et al., 2016; Yuan et al., 2014), RNA helicases such as DEAD box 1 (DDX1) (Li et al., 2008; Li et al., 2017) and proteins involved in mRNA biogenesis such as SAF-A, EXOSC10 and XRN2 (Britton et al., 2014; Marin-Vicente et al., 2015; Morales et al., 2016) are recruited at damage sites. More specifically, the recruitment of those proteins at the break correlates with, or even depends on, transcription and/or R-loop formation (Britton et al., 2014; Li et al., 2008; Li et al., 2017; Morales et al., 2016; Tanikawa et al., 2016), and they take part in the processing of those R-loops (Britton et al., 2014; Morales et al., 2016).


Depletion of splicing factors such as ASF/SF2 is responsible for DNA double strand break induction in genes enriched in R-loops (Li and Manley, 2005; Li et al., 2007b; Wan et al., 2015). Interestingly, overexpression of RNase H or of RNA binding proteins suppressed DSB formation, indicating that R-loops are responsible for this genome instability (Li, Niu, & Manley, 2007). This mechanism shows that R-loops can be prevented by the binding of splicing factors. Strikingly, depletion of those proteins leads to DSB repair defects. However, while a majority of studies suggest a role in homologous recombination through RAD51 or BRCA1 recruitment (Abbas et al., 2014; Adamson et al., 2012; Marin-Vicente et al., 2015; Maréchal et al., 2014; Tanikawa et al., 2016), a few studies suggest that they participate in NHEJ, for example through DNA-PKcs phosphorylation (Morales et al., 2016; Simon et al., 2017; Yuan et al., 2014). These differences could be explained by the use of different damaging agents. Etoposide and hydroxyurea could induce a majority of breaks at transcribed loci, whereas irradiation induces breaks all across the genome, and therefore mostly in untranscribed loci. The chromatin landscape around the break could thus influence the type of repair used and the impact of RNA processing factor depletion on repair. Nevertheless, the distinct roles of those proteins in repair, and how they influence repair pathway choice, remain to be deciphered.

Among those RNA processing factors, SETX has been repeatedly associated with DNA repair. As mentioned previously, mutations of the SETX gene cause neurodegenerative diseases such as AOA2. Interestingly, it was shown that cells from AOA2 patients are sensitive to oxidative stress (Suraweera et al., 2007). Moreover, SETX is recruited at DNA damage induced by replicative stress (Yüce and West, 2013) but also at meiotic break sites in mice (Becherel et al., 2013). In both cases, SETX recruitment correlates with RNA:DNA hybrid formation.
As mentioned previously, SETX forms a complex with BRCA1 required to remove hybrids at breaks induced in transcriptional pause sites (Hatchi et al., 2015). Moreover, it was suggested that SETX localization at break sites depends on its sumoylation and leads to exosome targeting at the break (Richard et al., 2013) potentially to “clean” RNA from the break. SETX depletion leads to DSB repair delay and a defect in RAD51 recruitment (Becherel et al., 2013) at meiotic breaks.

Altogether, those studies show that RNA processing factors such as SETX are at the interface between mRNA processing and DSB repair, suggesting that RNA removal from the break could be essential for efficient DSB repair in transcribed loci.

G4 processing factors

Interestingly, G4 processing factors could limit RNA:DNA hybrid formation and therefore participate in the DDR (Figure 17). As stated previously, it has been observed that depletion of G4 helicases produces genomic instability. Additionally, G4 helicases seem to be recruited at break sites after DSB induction. It was observed that poly(ADP-ribose) polymerase 3 (PARP3) is recruited at DSB sites where it removes G4 and limits chromosomal rearrangements by promoting HR (Day et al., 2017). Moreover, BLM is recruited at transcription/replication collision sites as well (Chang et al., 2017). Interestingly, it was also proposed that BLM could be recruited via interaction with DDX1, an RNA helicase (Li et al., 2017).

Those data suggest that G4 helicases could also have a role at DSBs induced in transcribed regions, and potentially limit RNA:DNA hybrid formation at those sites.


Results


I. RNA:DNA hybrid resolution is crucial for DSB repair in active genes

DNA double strand breaks (DSB) are the most deleterious form of damage for cells since they can generate mutations and chromosomal rearrangements. Two major repair mechanisms exist in the cell: Non-Homologous End-Joining (NHEJ) and Homologous Recombination (HR). My lab showed recently that DSB occurring in transcribed genes are preferentially repaired by HR. Those genes are characterized by the formation of secondary structures such as RNA:DNA hybrids. In this study, we showed that RNA:DNA hybrids need to be resolved for efficient DSB repair in active genes. Using the DIvA (DSB Induction via AsiSI) cell line, which allows the induction of annotated DSB throughout the genome, we showed by chromatin immunoprecipitation followed by sequencing (ChIP-seq) that a known RNA:DNA helicase, senataxin (SETX), is recruited at DSB in transcribed loci, where it removes RNA:DNA hybrids (mapped by DRIP-seq). We also showed that SETX activity allows RAD51 loading and limits illegitimate rejoining of DSB ends, and consequently promotes cell survival after DSB induction.

This study shows that DSB in transcribed loci require specific removal of RNA:DNA hybrids for accurate repair, and that this removal is necessary for cell survival.
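As an illustration of how ChIP-seq or DRIP-seq signal can be summarized around annotated break sites, the minimal sketch below averages a coverage track in fixed-size bins over a window centered on each DSB. It is not the pipeline used in this study: the function name `average_profile`, the dictionary-based coverage representation and the default window and bin sizes are assumptions chosen for illustration only.

```python
from typing import Dict, List


def average_profile(coverage: Dict[int, float], sites: List[int],
                    half_window: int = 5000, bin_size: int = 500) -> List[float]:
    """Average binned signal over windows centered on each site.

    `coverage` maps a genomic position to a normalized read count;
    positions absent from the dictionary count as 0. This is a
    hypothetical helper, not a function from any published tool.
    """
    if not sites:
        raise ValueError("at least one site is required")
    if (2 * half_window) % bin_size != 0:
        raise ValueError("window length must be a multiple of bin_size")

    n_bins = (2 * half_window) // bin_size
    totals = [0.0] * n_bins
    for site in sites:
        start = site - half_window
        for b in range(n_bins):
            lo = start + b * bin_size
            # Mean signal of this bin for this site.
            bin_sum = sum(coverage.get(p, 0.0) for p in range(lo, lo + bin_size))
            totals[b] += bin_sum / bin_size
    # Average each bin across all sites.
    return [t / len(sites) for t in totals]


# Toy example: signal of 1.0 on the 20 positions surrounding position 1000.
toy_coverage = {p: 1.0 for p in range(990, 1010)}
print(average_profile(toy_coverage, [1000], half_window=20, bin_size=10))
```

Averaging over sites, rather than summing, keeps the profile comparable between site sets of different sizes, for example cut and uncut AsiSI sites.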


ARTICLE

DOI: 10.1038/s41467-018-02894-w OPEN

Senataxin resolves RNA:DNA hybrids forming at DNA double-strand breaks to prevent translocations

Sarah Cohen1, Nadine Puget1, Yea-Lih Lin2, Thomas Clouaire1, Marion Aguirrebengoa1, Vincent Rocher1, Philippe Pasero2, Yvan Canitrot1 & Gaëlle Legube1

Ataxia with oculomotor apraxia 2 (AOA-2) and amyotrophic lateral sclerosis (ALS4) are neurological disorders caused by mutations in the gene encoding for senataxin (SETX), a putative RNA:DNA helicase involved in transcription and in the maintenance of genome integrity. Here, using ChIP followed by high throughput sequencing (ChIP-seq), we report that senataxin is recruited at DNA double-strand breaks (DSBs) when they occur in transcriptionally active loci. Genome-wide mapping unveiled that RNA:DNA hybrids accumulate on DSB-flanking chromatin but display a narrow, DSB-induced, depletion near DNA ends coinciding with senataxin binding. Although neither required for resection nor for timely repair of DSBs, senataxin was found to promote Rad51 recruitment, to minimize illegitimate rejoining of distant DNA ends and to sustain cell viability following DSB production in active genes. Our data suggest that senataxin functions at DSBs in order to limit translocations and ensure cell viability, providing new insights on AOA2/ALS4 neuropathies.

1 LBCMCP, Centre de Biologie Integrative (CBI), CNRS, Université de Toulouse, UT3, 118 Route de Narbonne, 31062 Toulouse, France. 2 Institut de Génétique Humaine, CNRS, Université de Montpellier, 34396 Montpellier, France. Correspondence and requests for materials should be addressed to G.L. (email: [email protected])

NATURE COMMUNICATIONS | (2018) 9:533 | DOI: 10.1038/s41467-018-02894-w | www.nature.com/naturecommunications

Mutations in the SETX gene are responsible for the rare neurological disorders ALS4, a dominantly inherited form of amyotrophic lateral sclerosis and ataxia with oculomotor apraxia type 2 (AOA2) which are associated with an early onset neurons degeneration (for review see ref. 1). As a consequence, patients display strong ataxia as well as oculomotor troubles (AOA2) or muscle weakness (ALS4), generally occurring before their 30′s. SETX encodes a helicase, strongly conserved throughout evolution that has been implicated in a large variety of biological processes, from transcription termination, to meiosis completion and maintenance of genomic integrity1. At a molecular level, studies of the yeast senataxin homolog Sen1p established that this helicase displays an unwinding activity toward RNA:DNA hybrids2–4. Multiple genomic studies in both yeast and mammals recently unveiled that RNA:DNA hybrids mostly form as RNA polymerases progress throughout the genes, by the re-hybridization of the nascent RNA to the template DNA strand, leading to triple-stranded structure called R-loops (reviewed in ref. 5). While R-loops display a strong ability to naturally form at GC-skewed promoters due to an enhanced thermodynamic stability of C-rich DNA: G-rich RNA duplexes6,7, multiple complexes regulate their occurrence throughout the genome, including RNA splicing/processing factors and specific helicases, such as senataxin8,9. R-loops formation and processing play a crucial role in terminating transcription at least in yeast (reviewed in ref. 8). In agreement, mutations in sen1 trigger defective transcription termination and increased transcriptional readthrough, especially on short transcribed units such as rDNA, tRNA, and small non-coding or coding genes (for review see ref. 10). Such a function of R-loops processing and senataxin in transcriptional termination was also proposed to be conserved in higher eukaryotes, in a manner that would also involve dsRNA processing factors such as Drosha and DGCR8 as well as components of the RNA exosome (for review ref. 11). However, beyond their role in regulating transcription, R-loops also represent a severe threat to genome integrity, proposed to arise both due to the susceptibility of the displaced single-stranded DNA to damaging agents as well as their potential to impede replication fork progression (reviewed in refs. 12–14). In agreement, increased damage occurrence and genome instability was observed in cells deficient for R-loops processing factors such as AQR15. Additionally in yeast, sen1 mutations are associated with an increased transcription-associated genome instability16,17. On the other hand, few studies also suggested that senataxin may play a more direct role at damage sites. Senataxin/Sen1p interacts with repair proteins in yeast and mammals and localizes to the site of damage during replication18,19. Moreover, SETX mutant mice display defective meiosis and Spo11-mediated DSB persistence20. Finally, depletion of senataxin/Sen1p triggers sensitivity to some DNA damaging agents such as H2O2 and UV21–23, a feature also observed in AOA2 patient cell lines22,24,25. However, senataxin depleted cells are not radiation sensitive22, suggesting that it may not function at sites of DNA double-strand break (DSB), a form of DNA damage largely induced upon irradiation.

Yet, recent studies have suggested that R-loops or/and RNA:DNA hybrids likely form at DSBs. An assay using a duplex specific nuclease detected RNA:DNA hybrids upstream the I-SceI site upon DSB formation26. Additionally a mutant form of RNAse H1, devoid of RNA exonuclease activity accumulates at the site of laser induced damages27. Finally, in Schizosaccharomyces pombe, RNA:DNA hybrids were shown to form during resection, regulating RPA filament formation28. The exact mechanism that leads to such R-loops or/and RNA:DNA hybrids formation remains unclear and may either relate to the transcriptional extinction observed at damaged sites or to de novo RNA PolII loading at DNA ends and subsequent RNA production at the break point, two features that have been previously proposed to occur in many organisms (for review see ref. 12).

In order to gain insights into R-loops biology at DSBs, here we set to assess a potential function of senataxin at sites of DNA DSBs. Using ChIP-seq and DRIP-seq, we uncovered that senataxin is recruited specifically at DSBs induced in transcriptionally active genes, which exhibit RNA:DNA hybrids accumulation following DSB induction. Senataxin distribution around DSBs coincided with a local decrease in R-loops and senataxin depletion triggered increased DSB-induced RNA:DNA hybrids formation, suggesting that senataxin processes DNA damage induced RNA:DNA hybrids. We found that senataxin is not required to sustain resection, nor rapid repair at these DSBs. Yet, it promotes Rad51 foci formation, counteracts translocations and sustains viability following DSB production in active genes, hence identifying a crucial and unanticipated role for senataxin in DSB repair.

Results

Senataxin is recruited at DSBs produced in active genes. To assess senataxin recruitment at sites of DSBs, we used the DSB inducible via AsiSI (DIvA) cell line, which allows to induce clean DSBs throughout the genome29,30. In this cell line, 4 hydroxytamoxifen (4OHT) treatment induces the relocalisation of a stably expressed restriction enzyme (AsiSI) that in turn triggers the production of multiple DSBs at annotated positions across the genome, in a homogeneous manner in the cell population hence allowing the use of chromatin immunoprecipitation (ChIP)30. To obtain a quantitative and genome-wide assessment of senataxin binding and distribution at DSBs, we performed ChIP followed by high throughput sequencing (ChIP-seq) against senataxin before and after DSB induction in DIvA cells (respectively −4OHT and +4OHT). We observed an accumulation of senataxin in a 1–2 kb window surrounding AsiSI sites following 4OHT treatment (see examples in Fig. 1a). In vivo, AsiSI does not produce DSB at each of its 1211 annotated recognition sites, likely due to both DNA methylation and chromatin compaction30. Our recent studies allowed us to characterize AsiSI sites cleavage efficiency, using BLESS (direct in situ breaks labeling, enrichment on streptavidin and next generation sequencing) throughout the genome and to identify a set of 80 DSBs robustly induced following 4OHT treatment31 (Clouaire, T. et al., manuscript submitted). Senataxin binding was significantly enriched following 4OHT on the AsiSI sites population that exhibits cleavage compared to the uncut AsiSI recognition sites (Fig. 1b). Heatmaps revealed that senataxin recruitment did not necessarily correlate with the cleavage efficiency, indicating that genomic and/or epigenomic features influence senataxin binding at DSBs (Fig. 1c).

Importantly, while AsiSI-induced DSBs mostly lie within promoters or gene bodies of active genes, some do reside in intergenic regions or in genes exhibiting no or very low level of RNA PolII (refs. 29,31 and see examples later). We previously reported that preexisting transcriptional activity strongly influences both DSB repair and signaling events29–31. Hence, we further tested whether DSB-induced senataxin recruitment may vary depending on the transcriptional activity of the broken locus. For this, we performed ChIP-seq mapping of the elongating form of RNA polymerase II (RNA PolII-S2P), of the total RNA PolII, as well as RNA-seq, prior DSB induction. DSBs were further sorted according to their transcriptional status prior to damage induction. Senataxin recruitment following DSB induction correlated with total RNA PolII enrichment levels preceding damage (Fig. 1d) as well as with elongating RNA PolII and RNA levels (Supplementary Fig. 1A, B). Moreover, inspection of


[Figure 1, panels a–e (image not reproduced): genome browser tracks for BLESS, SETX ChIP-seq (−4OHT and +4OHT), RNA-seq and RNA PolII-S2P, shown as normalized read counts around individual DSBs (including the RBMXL1, ASXL1, SRSF6 and SLC32A1 loci); box plots of normalized SETX ChIP-seq read count on ±1 kb at uncut and cut sites (P values displayed: 1.7e–35 and 7.4e–19); box plots of SETX ChIP-seq read count on ±500 bp for DSBs grouped by total RNA PolII level (high, medium high, medium low, low); heatmaps of SETX ChIP-seq count on ±5 kb around the 80 DSBs sorted by cleavage efficiency.]

Fig. 1 Senataxin is recruited at DSB induced in active loci. a Genome browser screenshots representing senataxin ChIP-Seq reads count before AsiSI activation (−4OHT) and after damage induction (+4OHT) at two individual AsiSI sites. The BLESS signal (indicative of cleavage efficiency (Clouaire, T. et al., manuscript submitted)) is also shown. Close up profiles are also shown below each screenshot. b Box plots representing senataxin ChIP-seq count before (−4OHT) and after (+4OHT) DSB induction at sites displaying AsiSI-induced cleavage (“cut”, 80 AsiSI sites) or not (“uncut”, 1139 AsiSI sites). Center line: median; box limits: 1st and 3rd quartiles; whiskers: maximum and minimum without outliers. P values are indicated (Wilcoxon Mann–Whitney test). c Heatmaps representing senataxin ChIP-seq count over a 10 kb window centered on the DSB before (−4OHT) and after (+4OHT) DSB induction. DSBs are sorted according to decreasing cleavage efficiency (based on BLESS data set (Clouaire, T. et al., manuscript submitted)). d Box plots representing senataxin ChIP-seq count before (−4OHT) and after (+4OHT) DSB induction at AsiSI “cut” sites sorted according to total RNA Polymerase II occupancy on a 10 kb window surrounding AsiSI sites (20 DSBs in each category). Center line: median; box limits: 1st and 3rd quartiles; whiskers: maximum and minimum without outliers. Points: outliers. e Genome browser screenshots representing senataxin ChIP-Seq reads count before AsiSI activation (−4OHT) and after damage induction (+4OHT) at four individual AsiSI sites, exhibiting either high (left) or low (right) transcriptional activity (indicated by RNA PolII S2P ChIP-seq mapping and RNA-seq −4OHT). The BLESS signal (Clouaire, T. et al., manuscript submitted) indicates that all sites display equivalent cleavage following 4OHT treatment
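The caption above reports P values from a Wilcoxon Mann–Whitney test comparing read counts between groups of sites. As a minimal sketch of the statistic underlying that test, the block below computes the Mann–Whitney U value from two lists of per-site read counts, using mid-ranks for ties. The function name `mann_whitney_u` and the toy values are assumptions for illustration; they are not data or code from the study, and no P value is derived here.

```python
from typing import Sequence


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """Return the Mann-Whitney U statistic for sample x.

    U equals the number of (xi, yj) pairs with xi > yj, with tied
    pairs counting 0.5. Hypothetical helper for illustration.
    """
    if not x or not y:
        raise ValueError("both samples must be non-empty")

    # Pool both samples, remembering which sample each value came from.
    pooled = sorted([(v, "x") for v in x] + [(v, "y") for v in y],
                    key=lambda t: t[0])

    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at index i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the mean of ranks i+1 .. j (1-based).
        midrank = (i + 1 + j) / 2.0
        n_x_in_run = sum(1 for k in range(i, j) if pooled[k][1] == "x")
        rank_sum_x += midrank * n_x_in_run
        i = j

    n_x = len(x)
    return rank_sum_x - n_x * (n_x + 1) / 2.0


# Toy values only: every "cut" count exceeds every "uncut" count.
cut_counts = [1.9, 2.1, 2.4]
uncut_counts = [1.0, 1.2, 1.1]
print(mann_whitney_u(cut_counts, uncut_counts))
```

In practice a statistics library would be used to obtain the P value; the sketch only makes explicit that the test compares ranks rather than raw read counts, which is why it tolerates the skewed distributions typical of sequencing data.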

individual sites indicated that senataxin did not accumulate at broken intergenic or silent regions, although they were robustly cleaved (see BLESS signal) and showing high level of XRCC4 recruitment (assessed by ChIP-seq29) (Fig. 1e, Supplementary Fig. 1C). We previously demonstrated that transcriptionally active genes exhibit preferential binding of Rad51 and homologous recombination (HR) repair29. In agreement, senataxin displayed a significantly enhanced recruitment at DSB enriched in Rad51 compared to DSBs exhibiting low levels of Rad51 (Supplementary Fig. 1D). Altogether these data indicate that senataxin is recruited to damage sites with a strong preference for DSBs induced in transcriptionally active loci, preferentially repaired by HR.

Senataxin removes DSB-induced RNA:DNA hybrids in active loci. A well described function of senataxin is its ability to unwind RNA:DNA hybrids, hence regulating R-loops stability on the genome. Given senataxin recruitment at DSBs, we further investigated R-loops distribution at AsiSI-induced DSBs. For this, we performed DRIP-seq using the S9.6 antibody that displays a strong specificity for RNA:DNA hybrids32,33 before and after DSB induction. As expected6,7, DRIP-seq signal was enriched on active genes, peaking at TSS and TTS, validating our sequencing results (Supplementary Fig 2A, B). Notably, we could observe robust changes in R-loops distribution following damage, with an increase of RNA:DNA hybrids around the break site (Fig. 2a, Supplementary Fig. 2C). When taken collectively, following DSB induction, RNA:DNA hybrids were mildly but significantly enriched (7% increase) on a 10 kb window surrounding cut sites compared to uncut sites (Fig. 2b). Interestingly, while senataxin accumulation was strongly correlating with the transcriptional activity of the broken locus (Fig. 1 and Supplementary Fig. 1), this was less the case for RNA:DNA hybrids accumulation. Indeed, we could detect their formation upon damage at few (example Supplementary Fig. 2D, E), but not all (example Supplementary Fig. 2F), untranscribed loci (no detectable signal for RNA-seq or elongating RNA PolII). While total RNA PolII (hence likely not in an elongating form) was readily detectable prior DSB induction at some of these untranscribed loci (Supplementary Fig. 2D), others did not display any RNA PolII before damage induction (Supplementary Fig. 2E), suggesting that at least in a few instances RNA:DNA hybrids may also be able to form at untranscribed loci, devoid of RNA PolII (Discussion), albeit at lower levels.

Interestingly, at active genes, beyond the accumulation surrounding the DSBs, we could also detect a decrease in R-loops across damaged genes bodies and termination sites (Fig. 2a for examples, Supplementary Fig. 2G for averaged profiles). Moreover, careful examination of our high-resolution data revealed that despite an increase of RNA:DNA hybrids formation on ~10 kb window around DSBs, at active genes we could also observe a sharp 1–2 kb decrease of RNA:DNA hybrids at the exact sites of senataxin accumulation (Fig. 2c, d). Depletion of senataxin using a siRNA (Supplementary Fig. 3A) triggered an increase in DSB-dependent RNA:DNA hybrids accumulation proximally to a DSB (Fig. 2e). Thus, our high resolution data reveal that the pattern of R-loops shows complex alterations upon DSB induction. Altogether our data suggests that RNA:DNA hybrids form at DSB flanking chromatin and that, at active genes senataxin recruitment contribute to their removal in the immediate vicinity of DSBs.

Senataxin promotes cell survival upon active genes breakage. To further assess SETX function in DSB repair, we used our improved version of the DIvA cell line, whereby AsiSI-ER is also fused to an auxin inducible degron (AID). In this cell line, auxin addition triggers the degradation of the enzyme and hence repair of AsiSI-induced DSBs29. We first assessed the survival of AID DIvA cells following break induction and repair, in both control and senataxin-depleted cells (Supplementary Fig. 3A). Clonogenic assays revealed that depletion of senataxin using siRNA does not trigger cell death in absence of exogenous damage (Fig. 3a, −4OHT). DSB induction for 4 h, followed by auxin addition (+4OHT + auxin) only led to a mild survival defect in control cells (Fig. 3a) indicating that these cells recover well from the induction of DSBs by AsiSI, as previously reported34. In contrast, senataxin depleted cells exhibited a strong sensitivity to AsiSI-induced DSBs (Fig. 3a, Supplementary Fig. 3B).

Since AsiSI-induced DSBs exhibit clean DNA ends and undergo repeated cycles of cleavage, we further tested the effect of senataxin depletion at DSBs produced by other means. Etoposide is an inhibitor of Topoisomerase II (TOP II) and multiples studies have revealed that TOP II exerts a critical function at active genes in order to unwind supercoils and release topological constraints that form at transcriptionally active regions35. Hence, etoposide-induced DSBs exhibit a biased distribution throughout the genome, being preferentially located in active promoters (~30%) and genes bodies (~40%)36. Interestingly, senataxin depletion also triggered enhanced sensitivity to etoposide (Supplementary Fig. 3C). On the other hand, ionizing radiation (IR) induces DSBs randomly across the genome, hence being mainly located in intergenic loci that represent over 95% of higher eukaryotes genomes. Notably, and in agreement with a previous report22, senataxin depletion did not trigger enhanced sensitivity to irradiation (Supplementary Fig. 3D). Our data therefore suggest that senataxin exerts an important function in cell survival specifically following break induction in active loci.

Senataxin regulates γH2AX accumulation. To further decipher the function of senataxin in DSB repair and to understand the lethality observed following DSB induction in senataxin deficient cells, we analyzed the effect of senataxin depletion on γH2AX foci formation following DSB induction. We found that senataxin depletion did not abolish foci formation and rather triggered enhanced γH2AX signal in 4OHT-treated cells (Fig. 3b). Interestingly, senataxin depletion also increased γH2AX signaling following etoposide treatment (Supplementary Fig. 3E) but not following IR (Supplementary Fig. 3F), suggesting a function of senataxin in regulating γH2AX foci formation at DSBs induced in active genes. To further strengthen these data, we tested whether a short global transcription extinction preceding break induction was able to rescue the increased γH2AX signaling observed in senataxin-depleted cells. A pretreatment of DIvA cells with cordycepin, a well characterized transcription inhibitor, partially reduced γH2AX in damaged DIvA cells following senataxin depletion (Fig. 3c). Altogether, our data suggest that senataxin is involved in regulating γH2AX establishment at DSBs induced in transcriptionally active loci.

We next investigated the consequences of senataxin depletion on DSB repair kinetics. Immunofluorescence performed at different time points after auxin addition revealed that γH2AX foci disappeared with the same kinetics in both control and SETX siRNA-treated cells (Fig. 4a, b). To refine this analysis, experiments were also performed using a high content microscope which allows to sort G1 versus G2 cells based on their DNA content31. Similarly we could not detect any delay of γH2AX foci clearance following auxin addition in SETX-depleted G1 and G2 cells (Supplementary Fig. 4A). Additionally, AID DIvA cells allow to assay repair kinetics at specific DSBs using a protocol based on


[Figure 2, panels a–e (image not reproduced): genome browser tracks for DRIP (S9.6) and SETX ChIP-seq (−4OHT and +4OHT), RNA PolII-S2P (−4OHT) and RNA-seq, shown as normalized read counts at the ASXL1 (chr20) and RBMXL1 (chr1) loci; box plots of normalized DRIP-seq read count on ±5 kb at uncut and cut sites (P values displayed: 4.5e–22 and 0.0047); averaged DRIP-seq and SETX ChIP-seq counts on ±5 kb around the DSB (−4OHT and +4OHT); DRIP-qPCR (% input) in CTRL siRNA and SETX#2 siRNA cells, n = 3.]

Fig. 2 Senataxin removes DSB-induced RNA:DNA hybrids in active loci. a Genome browser screenshots representing DRIP-seq reads count before (−4OHT) and after damage induction (+4OHT) at two individual AsiSI sites. Senataxin profiles in both conditions are also shown, together with BLESS enrichment following DSB induction as well as RNA PolII-S2P and RNA-seq prior to DSB induction. b Box plots representing DRIP-seq reads count before (−4OHT) and after (+4OHT) DSB induction at sites displaying AsiSI-induced cleavage (“cut”) or not (“uncut”). Center line: median; box limits: 1st and 3rd quartiles; whiskers: maximum and minimum without outliers. P values are indicated (Wilcoxon–Mann–Whitney test). c Close-up genome browser screenshot of senataxin and RNA:DNA hybrids at an individual DSB upon DSB induction. d Average DRIP-seq (top) and senataxin ChIP-seq (bottom) profiles on a ±5 kb window centered on the 80 AsiSI induced DSBs. e DRIP-qPCR in control (CTRL) and senataxin (SETX#2) depleted cells before (−4OHT) and after (+4OHT) DSB induction in DIvA cells as indicated. The position of the primers used to quantify RNA:DNA hybrids by qPCR are indicated on the genome browser screenshot above. Mean and s.e.m. of three biological replicates is shown

the ligation of a biotinylated oligonucleotide followed by streptavidin purification and quantitative PCR measurement of purified DNA30,37. Once more, senataxin depletion did not trigger repair delay at two DSBs found to be enriched in senataxin following 4OHT treatment (Fig. 4c). In addition, this held also true in G1- and G2-arrested cells following treatment with lovastatin and RO-3306 respectively (Supplementary Fig. 4B). Altogether, these data indicate that while SETX deficiency triggers enhanced γH2AX signaling, it is not associated with delayed DSB repair, neither in G1 nor in G2.

Senataxin regulates repair pathway choice. Given that senataxin was found to be recruited at DSBs induced in transcriptionally

NATURE COMMUNICATIONS | (2018) 9:533 | DOI: 10.1038/s41467-018-02894-w | www.nature.com/naturecommunications 5 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-02894-w

[Fig. 3 graphic. a Clonogenic survival (%) of CTRL and SETX#2 cells at −4OHT, +4OHT and +4OHT+IAA (n=3). b γH2AX staining and γH2AX intensity (AU) in CTRL and SETX#2 cells at −4OHT and +4OHT. c γH2AX staining and γH2AX intensity (AU) in CTRL and SETX#2 cells at −4OHT, +4OHT and +4OHT+cordycepin.]

Fig. 3 Senataxin regulates survival and γH2AX upon DSB induction. a Clonogenic assays in AID DIvA cells transfected with control and SETX siRNA, before and after 4OHT treatment (4 h), followed by auxin (IAA) treatment (4 h) as indicated. Left panel shows a representative experiment. Right panel shows the average and s.e.m. of three biological replicates. P values are indicated (paired t-test). b γH2AX staining performed in untreated or 4OHT-treated DIvA cells (4 h), after transfection with control or SETX siRNA as indicated. Scale bar: 10 µm. Right panel shows the quantification of the γH2AX nuclear signal within foci (>100 nuclei) from a representative experiment. Center line: median; box limits: 1st and 3rd quartiles; whiskers: maximum and minimum without outliers. P values are indicated (unpaired t-test). c γH2AX staining performed in control or SETX-siRNA transfected DIvA cells as indicated, treated with 4OHT or pretreated with cordycepin (1 h) prior to 4OHT addition (4 h). Scale bar: 10 µm. Quantification is shown on the right panel (>100 nuclei, from a representative experiment). Center line: median; box limits: 1st and 3rd quartiles; whiskers: maximum and minimum without outliers. P values are indicated (unpaired t-test).

active loci, which are prone to undergo HR repair29, we further investigated the function of senataxin on HR. Senataxin depletion impaired Rad51 foci formation, while increasing 53BP1 accumulation following DSB induction by AsiSI (Fig. 5a, b). Notably, the enhanced accumulation of 53BP1 observed in senataxin-depleted cells was partially reversed by pretreating the cells with transcription inhibitors (cordycepin and 5,6-dichloro-1-β-D-ribofuranosyl-benzimidazole (DRB)) (Supplementary Fig. 5A, B).

Next, to further investigate the function of senataxin in repair pathway choice, we used the reporter constructs previously developed to quantitatively measure HR38, single strand annealing (SSA)39, and NHEJ40, using flow cytometry following I-SceI transfection. Senataxin depletion triggered a mild decrease of HR and SSA associated with a similarly mild increase of NHEJ (Fig. 5c). Importantly, western blot against I-SceI (myc tagged) indicated that this was not due to changes in I-SceI expression.

Because generation of ssDNA is the initial step for HR repair, we further examined the effect of senataxin depletion on DSB end


[Fig. 4 graphic. a γH2AX staining in CTRL and SETX#2 siRNA cells at −4OHT, +4OHT and +4OHT+IAA (30, 60 and 120 min). b γH2AX intensity (AU) in the same conditions. c Unrepaired DSB (%) at DSB-ASXL1 and DSB-RBMXL1 in CTRL and SETX#2 cells at −4OHT, +4OHT and +4OHT+IAA (30, 60 and 120 min).]

Fig. 4 Senataxin depletion does not delay repair kinetics. a γH2AX staining performed in untreated or 4OHT-treated AID DIvA cells (4 h), followed by auxin (IAA) addition, after transfection with control or SETX siRNA as indicated. Scale bar: 10 µm. b Quantification of the γH2AX nuclear signal (>100 nuclei) from a representative experiment, performed in the above condition. Center line: median; box limits: 1st and 3rd quartiles; whiskers: maximum and minimum without outliers. P values are indicated (unpaired t-test). c Cleavage assay performed in AID-DIvA cells left untreated or treated with 4OHT (4 h) followed by auxin (IAA) addition (30 min, 60 min, and 120 min), after transfection of control or SETX-directed siRNA. Precipitated DNA was analyzed close to two DSBs found to recruit SETX after 4OHT. The percentage of sites that remain broken for each DSB after the indicated time of auxin treatment is presented. Average and s.e.m. (n = 3, biological replicates) are shown.

resection. For this, we used a previously developed assay that allows quantitative measurement of single stranded DNA (ssDNA) generated at site-specific DSBs41 (Fig. 5d). Senataxin depletion did not reduce ssDNA levels at two DSBs induced by AsiSI (Fig. 5e), indicating that it is not necessary to promote resection. Collectively these data indicate that senataxin promotes HR repair downstream of resection, by promoting Rad51 recruitment and counteracting 53BP1 accumulation.

SETX depletion enhances translocations. Given the strong requirement of SETX for survival upon DSB induction in active genes (Fig. 3), despite no clear delay in repair kinetics (Fig. 4), we next set out to assess whether SETX could influence the quality of the repair reaction, more specifically the frequency of illegitimate rejoining between distant DSBs, involved in the generation of translocations. Using high resolution Capture Hi-C, we recently demonstrated that DSBs can cluster when induced on active genes and determined the identity of AsiSI-induced DSBs brought into spatial proximity within nuclear foci31. Hence, based on this knowledge we developed an assay to accurately measure the illegitimate rejoining of closely clustered DSBs. We could detect rejoining between DSBs induced on the same chromosome (between MIS12 and TRIM37 as well as between LINC00271 and LYRM2) (Fig. 6a), but also between DSBs induced on different chromosomes (MIS12::LYRM2 and TRIM37::RBMXL1) (Supplementary Fig. 6A). Importantly, SETX depletion led to a highly reproducible increase of all four translocation events compared to control cells (Fig. 6b, Supplementary Fig. 6B, C). Notably, this increase of translocations observed in senataxin-depleted cells was partially rescued by a pretreatment with DRB (Fig. 6c) or upon overexpression of RNAseH1, which degrades RNA:DNA hybrids (Supplementary Fig. 6C). These data indicate that senataxin plays a key role in counteracting illegitimate rejoining of DSBs induced in loci undergoing active transcription.

Discussion

In this study we set out to better understand the formation of RNA:DNA hybrids at DSBs as well as the potential function of RNA:DNA hybrid removal factors in DSB repair, focusing on senataxin, a well characterized R-loop helicase. We discovered that senataxin is specifically recruited at DSBs induced in active loci, where it removes RNA:DNA hybrids forming in cis to broken loci. Senataxin is further required to regulate γH2AX signaling, to promote Rad51 loading and to minimize abnormal rejoining of distant DNA ends (Fig. 6d).

Our genome-wide mapping indicates that RNA:DNA hybrids accumulate in cis to DSBs, as previously proposed26–28. However, it seems that proximal DSB-induced RNA:DNA hybrids may form differently depending on the transcriptional status of the damaged locus. At active genes, this RNA:DNA hybrid accumulation around DSBs is otherwise associated with a decrease in R-loops across the entire damaged gene body. Several studies

have reported that transcription is downregulated at the damaged gene, as well as on chromatin proximally flanking DSBs, while being globally maintained in the γH2AX domain at distance from the break (reviewed in refs. 12,42). The exact mechanism leading to transcriptional repression in cis to DSBs is not yet clear but involves the recruitment of chromatin modifying complexes43–47, as well as ATM-induced modifications of transcription elongation factors45,47 leading to a reduction in the elongating form of RNA PolII across the gene body48. Multiple studies have now established that R-loops accumulate at sites of paused or slowly elongating RNA PolII7,49. Hence, at active genes, DSB-induced RNA PolII stalling may contribute to the strong RNA:DNA hybrid and/or R-loop formation in cis to the break.

Notably, although it was not a general feature (Supplementary Fig. 2F), we could also identify a few seemingly transcriptionally silent sites that displayed low, but detectable levels of RNA:DNA hybrids following breakage (Supplementary Fig. 2D, E), even when total RNA PolII was not present prior to damage (Supplementary Fig. 2E). Hence de novo RNA PolII recruitment at DNA ends (a feature recently observed28,50) may, at least in some

[Fig. 5 graphic. a Rad51 staining (×40, ×100) and RAD51 intensity (AU) in CTRL and SETX#2 cells at −4OHT and +4OHT. b 53BP1 staining (×40, ×100) and 53BP1 intensity (AU) in the same conditions. c GFP-positive cells (AU) for the HR, SSA and NHEJ reporters in CTRL and SETX siRNA cells with or without I-SceI, with myc and tubulin western blots. d Resection assay scheme (AsiSI and BanI sites) at DSB-KDELR3 (200 bp, 1626 bp) and DSB-ASXL1 (740 bp, 2000 bp). e ssDNA (normalized) at DSB-KDELR3 and DSB-ASXL1 in CTRL and SETX siRNA cells at −4OHT and +4OHT.]

instance, produce RNA:DNA hybrids in cis to DSBs. More dedicated systems that allow DSB induction at a larger number of transcriptionally silent loci (using multiple guide RNAs for Cas9-induced breaks for example) should help to understand the occurrence of such RNA:DNA hybrids as well as the mechanisms driving their formation. Additionally, it is important to note here that our study does not allow us to determine whether the S9.6 signal detected at DSBs represents triple stranded structures (R-loops) or double-stranded RNA:DNA hybrids. Indeed, in S. pombe, such hybrids were proposed to form as resection progresses, by the hybridization of an RNA to the resected single strand DNA28. Strand-specific mapping of the RNA engaged in the hybrids detected at DSBs, by DRIPc-seq, should help to determine whether these resection-dependent RNA:DNA hybrids are conserved in mammalian cells and whether the increased S9.6 signal at DSBs observed in this study represents R-loops or RNA:DNA hybrids.

Regardless of whether they arise from an unscheduled activity of RNA Polymerase II blocked by the lesion, or by de novo transcription from DNA ends, DSB-induced RNA:DNA hybrids may exert some important function in the DDR. First, these RNA:DNA hybrids could contribute to setting up an adequate chromatin landscape. Indeed, R-loops at genes have been proposed to modify chromatin structure. For instance, R-loops trigger H3-S10 phosphorylation linked to chromatin condensation51,52. On the other hand, R-loops also coincide with increased chromatin accessibility7 and can mediate the recruitment of the TIP60/p400 complex that promotes histone acetylation and nucleosome remodeling53. Notably, the Tip60 complex has been repeatedly found at break sites, where it mediates H4 and H2A acetylation and subsequent recruitment of repair proteins54–58. Hence one can hypothesize that R-loops and/or RNA:DNA hybrids will contribute to setting up the proper chromatin landscape required at DSBs to ensure accurate and timely repair. Second, RNA:DNA hybrids may contribute to promoting premature transcription termination of broken genes. Indeed, a large number of studies established a function for R-loops, as well as for senataxin, in terminating transcription, including for promoter-associated bidirectional non-coding RNA (for instance59,60, reviewed in refs. 1,11). In this regard it is interesting that dsRNA processing factors such as Drosha and Dicer contribute to this process61–63. Hence, at damaged genes, R-loops and senataxin may promote premature termination in order to allow either RNA PolII clearance from the damaged region to favor accessibility of repair proteins, or efficient recycling of RNA PolII to resume transcription after repair. It is tempting to speculate that the reported function of Dicer and Drosha in DDR64 may, at least in part, relate to the transcriptional termination at genes experiencing a DSB.

Interestingly we found that, although senataxin depletion did not delay DSB repair, it impaired Rad51 foci formation, and conversely increased accrual of 53BP1. Decreased HR and increased NHEJ were also observed following senataxin depletion using HR and NHEJ reporter constructs although to a milder extent, which may be due to low R-loop formation on these substrates. Notably, senataxin depletion did not impede resection, suggesting that senataxin and/or RNA:DNA hybrid removal are critical at a step subsequent to ssDNA generation. However, resection was only monitored up to 1.6 kb from the DSB, so we cannot exclude a function of senataxin in regulating longer-range resection events. Notably, we also found that senataxin regulates γH2AX establishment and counteracts the illegitimate rejoining of distant DNA ends, suggesting that R-loop removal is required to minimize translocations and maintain genome integrity following production of DSBs in active genes. The mechanism by which senataxin impacts γH2AX is currently unknown but may involve the regulation of ATM recruitment or activity. Alternatively, senataxin and/or R-loops may regulate the DSB-flanking chromatin structure, modifying its ability to undergo H2AX phosphorylation. Equally, how senataxin counteracts translocations needs to be investigated. We recently proposed that γH2AX spreading on entire topologically associated domains (TAD) likely modifies the properties of the chromatin fiber, which could translate into changes in chromatin mobility within the nucleus42,65. Given that DSBs induced in active genes were found to display enhanced clustering ability31, we can speculate that the increased aberrant joining of distant DSBs observed in senataxin-depleted cells arises from increased DNA end mobility triggered by the enhanced γH2AX establishment.

Importantly, SETX is a gene mutated in two severe neurological diseases, AOA2 and ALS4, associated with progressive neurodegeneration. In this regard it is interesting that DSBs have been shown to be produced as a consequence of neuronal activity66 and further genomic studies suggested they likely arise in active genes67,68. Given that senataxin exerts its anti-translocation and survival-promoting functions only for DSBs induced in active genes, we propose that this “Transcription Coupled DSB repair” function of senataxin may contribute to neuron loss in AOA2/ALS4 patients.

Methods

Cell culture. U2OS cells were retrieved from ATCC and modified with a plasmid encoding the restriction enzyme (pBABE-AsiSIER and pAID-AsiSIER)29,30. U2OS, DIvA (AsiSI-ER-U2OS), and AID-DIvA (AID-AsiSI-ER-U2OS) cells were cultured in Dulbecco’s modified Eagle’s medium (DMEM) supplemented with antibiotics, 10% FCS (Invitrogen) and either 1 µg/mL puromycin (DIvA cells) or 800 µg/mL G418 (AID-DIvA cells) at 37 °C under a humidified atmosphere with 5% CO2. The cell lines were regularly checked for mycoplasma contamination. For AsiSI-dependent DSB induction, cells were treated with 300 nM 4OHT (Sigma, H7904) for 4 h. When indicated, 4OHT-treated cells were washed three times in

Fig. 5 Senataxin depletion decreases HR but does not impede resection. a Rad51 staining performed in untreated or 4OHT-treated DIvA cells (4 h), after transfection with control or SETX siRNA as indicated. Scale bar: 10 µm. Right panel shows the quantification of the Rad51 nuclear signal within foci (>100 nuclei) from a representative experiment. Center line: median; box limits: 1st and 3rd quartiles; whiskers: maximum and minimum without outliers. P values are indicated (unpaired t-test). b 53BP1 staining performed in untreated or 4OHT-treated DIvA cells (4 h), after transfection with control or SETX siRNA as indicated. Scale bar: 10 µm. Right panel shows the quantification of the 53BP1 nuclear signal within foci (>100 nuclei) from a representative experiment. Center line: median; box limits: 1st and 3rd quartiles; whiskers: maximum and minimum without outliers. P values are indicated (unpaired t-test). c HR (top panel), SSA (middle panel), and NHEJ (bottom panel) usage was measured using cell lines harboring specific reporter constructs38–40 in control or senataxin-deficient (siRNA-transfected) dedicated cells. Myc-I-SceI expression was controlled by western blot in each condition. Mean and s.e.m. of at least three biological replicates are shown (as indicated). P values are indicated (paired t-test). d The site-specific resection assay has been described earlier. Briefly, DNA purified from damaged or undamaged cells is digested by dedicated restriction enzymes (as indicated) and digestion-resistant DNA (single stranded DNA) is measured by qPCR, using primer pairs spanning the restriction sites. Here, we optimized this assay at two AsiSI-induced DSBs that were shown to undergo HR29. e Resection assay at the two DSBs in control or SETX-siRNA transfected cells. Values were normalized against the % of ssDNA detected in control cells before 4OHT treatment. Average and s.e.m. of four biological replicates are shown. P values are indicated (paired t-test).
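The box-plot convention repeated in these captions (center line at the median, box limits at the 1st and 3rd quartiles, whiskers at the extreme values once outliers are excluded) can be illustrated with a short sketch. This is a Python illustration under stated assumptions, not the authors' R code: the function name `boxplot_summary` is hypothetical, outliers are taken as values beyond 1.5 × the interquartile range from the quartiles as described in the Methods, and the quartile interpolation used here may differ slightly from the hinges computed by base R.

```python
from statistics import median, quantiles


def boxplot_summary(values):
    """Summarise values as in the captions: median, quartiles, and whiskers
    drawn at the most extreme points that are not outliers.

    Assumption: an outlier lies below Q1 - 1.5*IQR or above Q3 + 1.5*IQR.
    """
    data = sorted(values)
    # Quartiles by linear interpolation over the closed data range.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inliers = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median(data),
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inliers),
        "whisker_high": max(inliers),
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


if __name__ == "__main__":
    # Made-up intensities (AU) for illustration only.
    print(boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```

With the made-up values above, the single extreme point is reported as an outlier and the upper whisker stops at the largest remaining value.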


[Fig. 6 graphic. a PCR detection of MIS12::TRIM37 and LINC00271::LYRM2 rejoining at −4OHT and +4OHT+IAA. b Translocation frequency (normalized to CTRL) for MIS12::TRIM37 and LINC00271::LYRM2 in siRNA CTRL and siRNA SETX cells (n=5). c MIS12::TRIM37 translocation frequency (normalized to CTRL) at −4OHT and +4OHT+IAA, without or with DRB (n=3). d Model: active gene with elongating RNA pol II and R-loops across the gene body; after DSB induction, stalled polymerase, decreased R-loops across the gene body and accumulation of R-loops and/or RNA:DNA hybrids near the DSB; SETX recruitment, RNA:DNA hybrid removal, regulation of γH2AX, Rad51 recruitment and inhibition of translocation.]

Fig. 6 Senataxin counteracts the formation of translocations. a Rejoining of distant DSBs was detected by PCR, following DSB induction and repair (+4OHT + IAA 2 h) at breaks recently shown to undergo clustering31. DNA sequencing confirmed the nature of the amplified products. b MIS12::TRIM37 and LINC00271::LYRM2 rejoining frequencies were analyzed before or after 4OHT + IAA treatment, by quantitative PCR in AID DIvA cells transfected with control or SETX-directed siRNA. Mean and s.e.m. of five biological replicates are shown. P values are indicated (one sample t-test). c MIS12::TRIM37 rejoining frequency was analyzed in control or SETX-depleted AID DIvA cells pretreated or not with DRB prior to 4OHT addition as indicated. Mean and s.e.m. of three biological replicates are shown. P value is indicated (paired t-test). d Model: R-loops form as RNA Polymerase II progresses across the gene. The induction of a DSB elicits ATM activity, which triggers RNA polymerase II stalling in the vicinity of the DSB, hence decreasing R-loops across the gene body. On the other hand, R-loops and/or RNA:DNA hybrids accumulate in cis to the DSB, due to stalled RNA PolII generating short, abortive RNAs which thread back into the DNA duplex, and/or potentially de novo PolII transcription from the DNA end. Senataxin is further recruited to remove RNA:DNA hybrids in the vicinity of the break induced in active loci. Senataxin and/or R-loop removal regulate γH2AX establishment, promote Rad51 loading and minimize the occurrence of translocation by a mechanism that still needs to be investigated.

pre-warmed PBS and further incubated with 500 µg/mL auxin (IAA) (Sigma; I5148) for the indicated time. For transcriptional inhibition, DRB (Sigma, 100 μM) or cordycepin (Sigma, 50 µM) was added to the medium 1 h prior to 4OHT (4 h) and auxin (2 h) treatments. Cells were arrested in G1 using a 48 h treatment with 40 μM lovastatin (Mevinolin from LKT Laboratories) and in G2 with a 24 h treatment with 40 μM Ro-3306 (CDK1 inhibitor, Calbiochem). For clonogenic assays in U2OS cells, DSBs were induced either by increasing doses of etoposide (Sigma) for 16 h as indicated or by irradiation with a Cs137 source (Biobeam 8000).

siRNA and plasmid transfection. siRNA transfections were performed with a Cell Line Nucleofector kit V (Program X-001, Amaxa) according to the manufacturer’s instructions and cells were assayed 48 h after siRNA nucleofection. The following

siRNA against SETX were used: SETX#2 GAGAGAAUUAUUGCGUACU and SETX Smart pool (Dharmacon) containing the following siRNA: #a GCACGUCAGUCAUGCGUAA, #b UAGCACAGGUUGUUAAUCA, #c AAAGAGUACUUCACGAAUU, #d GGACAAAGAGUUCGAUAGA. siRNA efficiency was assessed by mRNA extraction with a Qiagen RNeasy kit (Qiagen) and reverse transcription with the AMV reverse transcriptase (Promega). cDNAs were quantified by RT-qPCR (primer sequences: SETX_FW CTTCATCCTCGGACATTTGAG and SETX_REV TTAATAATGGCACCACGCTTC) and normalized to RPLP0 cDNA levels (primer sequences: FW GGCGACCTGGAAGTCCAACT and REV CCATCAGCACCACAGCCTTC). For RNAse H1 overexpression, pICE-NLSmCherry and pICE-RNaseHI-NLS-mCherry were transfected 24 h after siRNA transfection using Lipofectamine 2000 (Life Technologies) following manufacturer’s instructions.

Western blot. To assess SETX depletion, western blot analysis was performed with NuPAGE Bis–Tris 4–12% gels and reagents (Invitrogen) according to the manufacturer’s instructions. Briefly, cells were lysed in NuPage sample buffer with reducing agent (Invitrogen) and resolved proteins were transferred onto PVDF membranes (Invitrogen). PVDF membranes were then saturated 1 h in 5% nonfat dry milk with TBS and 0.5% Tween 20 and incubated overnight with the following primary antibodies: anti-SETX (Novus Biologicals, NB100-57542, 1:500) and anti-alpha-tubulin (Sigma-Aldrich, DM1A, 1:100,000). Horseradish peroxidase-coupled secondary antibodies were from Sigma (anti-mouse, A2554, 1:10,000; anti-rabbit, A0545, 1:10,000), and the chemiluminescence Lumilight reagent was from Roche Diagnostic. To analyze I-SceI expression, total cell lysates were prepared 24 h after I-SceI plasmid transfection by direct resuspension of cells in Laemmli buffer and sonication. Cell extracts were separated on 10% SDS PAGE and proteins were transferred on nitrocellulose membrane. Primary antibodies were anti-myc (9E10, Roche) and anti-alpha-tubulin (Sigma-Aldrich). Signals were analyzed by autoradiography (for SETX expression) or using a ChemiDoc touch device (BioRad) for I-SceI expression.

ChIP followed by high throughput sequencing and RNA-seq. Cells were crosslinked with formaldehyde (1%) added to the culture medium for 15 min at room temperature. Glycine (0.125 M) was added for 5 min to stop the reaction. Cells were washed twice with cold PBS and harvested by scraping. Pelleted cells were incubated in lysis buffer (Pipes 50 mM pH 8, KCl 85 mM, NP‐40 0.5%) and homogenized with a Dounce homogenizer. Nuclei were harvested by centrifugation and incubated in nuclear lysis buffer (50 mM Tris pH 8.1, 10 mM EDTA, 1% SDS). Samples were sonicated ten times for 10 s at a power setting of 5 and 50% duty cycle (Branson Sonifier 250), to obtain DNA fragments of about 500–1000 bp. After sonication, samples were diluted ten times in dilution buffer (0.01% SDS, 1.1% Triton X‐100, 1.2 mM EDTA, 16.7 mM Tris pH 8.1, 167 mM NaCl) and precleared for 2 h with 100 μl of protein‐A and protein‐G beads (Sigma), previously blocked with 500 μg of BSA for 2 h at 4 °C. Precleared samples were incubated overnight at 4 °C on a wheel with specific antibodies. For SETX ChIP, 200 µg of chromatin was immunoprecipitated by using 2 µg of anti-SETX (Novus Biologicals, NB100-57542). For RNA Pol II ChIP, 25 µg of chromatin was immunoprecipitated with 2 µg of anti-RNA polymerase II CTD repeat YSPTSPS (phospho S2) (Chromotek 3E10), or with 2 µg of the anti-total RNA PolII (Bethyl Laboratories A304-405A). XRCC4 ChIP-seq data were published earlier29. Immune complexes were precipitated with 100 μl of blocked protein A/protein G beads for 2 h at 4 °C on a rotating wheel. Beads were washed with dialysis buffer (2 mM EDTA, 50 mM Tris pH 8.1, 0.2% Sarkosyl) once and with wash buffer (100 mM Tris pH 8.8, 500 mM LiCl, 1% NP‐40, 1% NaDoc) four times. Immunoprecipitated complexes were resuspended in 200 µl of TE buffer (Tris 10 mM pH 8, EDTA 0.5 mM pH 8) with 30 µg of RNAse A for 30 min at 37 °C. Crosslink was reversed in the presence of 0.5% SDS at 70 °C overnight with shaking. After a 2 h proteinase K treatment, immunoprecipitated and input DNA were purified with phenol/chloroform and precipitated. Samples were resuspended in 100 µl water. For ChIP-Seq, multiple ChIP experiments were pooled. Immunoprecipitated DNA was subjected to library preparation and single-end sequencing on a NextSeq 500 at EMBL GeneCore (Heidelberg, Germany).

RNA extraction and RNA-seq library preparation. For RNA-seq, DIvA cells (transfected with the control siRNA) were lysed using TRI reagent (SIGMA) and spiked-in with ERCC RNA Spike-In Mix (Thermo Fisher Scientific). Total RNA was recovered by chloroform extraction followed by isopropanol precipitation. Samples were treated with RQ1 RNase-free DNase (Promega) for 1 h at 37 °C and purified by phenol/chloroform extraction followed by ethanol precipitation. Ribosomal RNA depletion and RNA-Seq library preparation were performed at EMBL Genomics core facilities (Heidelberg, Germany) using TruSeq Stranded Total RNA (Illumina).

DNA:RNA immunoprecipitation. DRIP assay was carried out according to the protocol described in ref. 7. DIvA cells were treated with 300 nM 4OHT for 4 h, trypsinized, pelleted at low speed and washed with DPBS (Life technology). Total nucleic acids were extracted with 0.5% SDS/Proteinase K (Thermo Fisher Scientific, Waltham, MA) treatment at 37 °C overnight and recovered by phenol-chloroform extraction and ethanol precipitation. DNA was digested by a restriction enzyme cocktail (20 units each of EcoRI, HindIII, BsrGI, XbaI) (New England Biolabs) in 1× NEBuffer 2, with or without RNase H treatment, overnight at 37 °C. Fragmented DNA was cleaned by phenol-chloroform extraction and ethanol precipitation followed by two washes with 70% ethanol. Air-dried pellets were resuspended in 10 mM Tris-HCl pH 7.5, 1 mM EDTA (TE). In total, 4 µg of digest was diluted in 450 µL of TE, and 10 µL was reserved as input for qPCR. 50 µL of 10× IP buffer was added (final buffer concentration of 10 mM sodium phosphate, 140 mM sodium chloride, 0.05% Triton X-100) and 10 µL of S9.6 antibody (1 mg/ml, kind gift from F. Chedin, UC Davis). Samples were incubated with the antibody at 4 °C for 2 h on a wheel. 50 µL of Protein A/G Agarose (Pierce), previously washed twice with 700 µL of 1× IP buffer for 5 min at room temperature, were added and samples were incubated for 2 h at 4 °C on a wheel. Each DRIP was then washed three times with 700 µL 1× IP buffer for 10 min at room temperature. After the final wash, beads were resuspended in 250 µL of 1× IP buffer and incubated with 60 units of Proteinase K for 45 min at 55 °C. Digested DRIP samples were then cleaned with phenol-chloroform extraction and ethanol precipitation. Air-dried DRIP pellets were resuspended in 45 µL of 10 mM Tris-HCl pH 8. DRIP experiment was assayed by qPCR using primers located at SNRNP (negative control), RPLA13 (positive control) and RBMXL1 (AsiSI site):
SNRNP_FW: GCCAAATGAGTGAGGATGGT
SNRNP_REV: TCCTCTCTGCCTGACTCCAT
RPLA13_FW: AATGTGGCATTTCCTTCTCG
RPLA13_REV: CCAATTCGGCCAAGACTCTA
RBMXL1_FW: GATTGGCTATGGGTGTGGAC
RBMXL1_REV: CATCCTTGCAAACCAGTCCT
For DRIP-Seq, samples from three DRIP experiments were pooled and sonicated to an average size of 300 bp using a Bioruptor (Diagenode) for 20 cycles of 30 s on, 30 s off, high setting. Immunoprecipitated DNA was subjected to library preparation and single-end sequencing on a NextSeq 500 at EMBL GeneCore (Heidelberg, Germany).

ChIP-seq, RNA-seq, and DRIP-seq data set analyses. SETX ChIP-Seq, RNA-seq, and DRIP-Seq samples were sequenced using Illumina NextSeq 500 (single-end, 80-bp reads for SETX ChIP-Seq; paired-end, 75-bp reads for RNA-Seq and single-end, 85-bp reads for DRIP-Seq) at EMBL Genomics core facilities (Heidelberg, Germany). The quality of each raw sequencing file (fastq) was verified with FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). ChIP-Seq and DRIP-Seq files were aligned to the reference human genome (hg19) and processed using a classical ChIP-seq pipeline: bwa (http://bio-bwa.sourceforge.net/) for mapping and samtools (http://www.htslib.org/) for duplicate removal (rmdup), sorting (sort), and indexing (index). RNA-seq was mapped to a custom human genome (hg19 merged with ERCC92 sequences) to avoid mapping ERCC sequences on the human genome and processed in the same way as ChIP-Seq, except that alignment was performed in paired-end mode with STAR and potential duplicates were not removed. Coverage for each aligned ChIP-seq data set (.bam) was computed with the rtracklayer R package and normalized using total read count for each sample. Coverage data was exported as bigwig (file format) for further processing.

Averaged ChIP-seq, RNA-seq, and DRIP-Seq profiles were generated using the R package ggplot2. For profiles relative to DSB, the x-axis represents genomic position relative to the AsiSI site and the y-axis represents the mean coverage at each bp (Fig. 2d). For metagene profiles, the mean coverage was computed in two parts: first, mean coverage was computed in 200 bp intervals 3 kb upstream TSS and downstream TSS. Second, gene bodies were divided into 100 equally sized bins, so average profiles could be computed as a percent of entire gene length. Profiles were computed for all genes (Supplementary Fig. 2B) or for genes either directly damaged or located near a DSB (<1 kb) (Supplementary Fig. 2G).

To classify DSBs based on transcriptional activity, RNA PolII-S2P ChIP-seq, total RNA PolII ChIP-seq or RNA-seq signals obtained in DIvA cells prior to DSB induction were computed on a window of ±5 kb around DSBs. DSBs were ordered based on their RNA PolII enrichment and divided into four categories of 20 DSBs each (low, medium low, medium high, and high).

Box-plots were generated with R-base. The center line represents the median, box ends represent respectively the first and third quartiles, and whiskers represent the minimum and maximum values without outliers. Outliers were defined as values below first quartile − (1.5 × interquartile range) or above third quartile + (1.5 × interquartile range). Values represent the total normalized read count in a specific genomic window surrounding AsiSI-induced DSBs or uncut AsiSI genomic sites (“uncut”). Statistical hypothesis testing was performed using the nonparametric paired Mann–Whitney–Wilcoxon test (wilcox.test() function in R) to test distribution differences between two populations.

For heatmap representations, average normalized sequencing signal was determined in 500 bp bins centered on each cleaved AsiSI site using custom R/Bioconductor scripts. The resulting matrix was represented as a heatmap using Java Treeview (http://www.jtreeview.sourceforge.net). DSBs were ordered based on the BLESS signal (Fig. 1c) or the total RNA PolII enrichment on ±5 kb (Supplementary Fig. 1B).

Clonogenic assays. After siRNA transfection, AID-DIvA and U2OS cells were seeded at a clonal density in 10 cm diameter dishes. After 48 h, U2OS cells were

NATURE COMMUNICATIONS | (2018) 9:533 | DOI: 10.1038/s41467-018-02894-w | www.nature.com/naturecommunications

either exposed to increasing doses of IR or treated with increasing doses of etoposide, as indicated. AID-DIvA cells were treated with 300 nM 4OHT for 4 h and, when indicated, washed three times in pre-warmed PBS and further incubated with 500 µg/mL auxin for another 4 h. After three washes in pre-warmed PBS, complete medium was added to each dish of AID-DIvA cells. After 10 days, U2OS cells and AID-DIvA cells were stained with crystal violet (Sigma) and counted. Only colonies containing more than 50 cells were scored.

Immunofluorescence. Cells were plated on glass coverslips, fixed with 4% paraformaldehyde for 15 min, permeabilized with Triton 0.5%, and blocked with PBS-BSA 3% for 30 min at room temperature. Cells were then incubated with antibodies against γH2AX (JBW301, 05-636, Millipore, 1:1000) and 53BP1 (Novus Biological NB-300-104, 1:500) overnight at 4 °C. Cells were washed three times in PBS-BSA 3% and incubated with secondary antibody for 1 h. After three washes (one in PBS-BSA 3% and two in PBS), nuclei were stained with Hoechst 33342 (Sigma). Staining against Rad51 (Santa Cruz sc8349, 1:200) was performed using the following protocol. Cells were plated on glass coverslips, then submitted to a pre-extraction with ice-cold buffer (20 mM HEPES pH 7.5; 20 mM NaCl; 5 mM MgCl2; 1 mM DTT; 0.5% NP40) for 20 min on ice. Cells were fixed with 4% paraformaldehyde for 15 min and blocked with PBS-BSA 3% for 30 min at room temperature. Immunofluorescence was then carried out as described above. Image acquisition was performed using MetaMorph on a wide-field microscope equipped with a cooled charge-coupled device camera (CoolSNAP HQ2), using a ×40 or ×100 objective.

High-throughput microscopy. AID-DIvA and U2OS cells were plated in 96-well plates after transfection. After fixation, permeabilization and saturation steps, γH2AX was stained overnight with the γH2AX antibody (JBW301, 05-636, Millipore) and the secondary antibody anti-mouse Alexa 647 (A21235, Molecular Probes). Nuclei were labeled with Hoechst 33342 (Sigma) at a final concentration of 1 μg/ml for 5 min. γH2AX foci were further analyzed with an Operetta automated high-content screening microscope (PerkinElmer). For quantitative image analysis, multiple fields per well were acquired with a ×40 objective lens to visualize ~2000 cells per well in triplicate.

γH2AX, 53BP1, and Rad51 foci intensity quantification. All quantification was performed using Columbus, the software integrated with the Operetta automated high-content screening microscope (PerkinElmer). DAPI or Hoechst nuclei were selected according to the B method, and appropriate parameters, such as the size and intensity of fluorescent objects, were applied to eliminate false positives. Then γH2AX, Rad51 and 53BP1 foci were detected with the D method with parameters adjusted to ensure the best foci detection: detection sensitivity, 0.5–1; splitting coefficient, 0.5–1; background correction, >0.5–0.9. G1 and G2 nuclei were selected on the basis of the Hoechst intensity, after visualization of the Hoechst distribution in all cells. Box plots represent the total nuclear signal intensity detected in foci.

Repair kinetics at AsiSI sites. Repair kinetics at specific AsiSI-induced DSBs were measured as described in refs. 29,37. Genomic DNA was extracted using the DNeasy kit (Qiagen) and in vitro ligation with a biotinylated double-stranded oligonucleotide, ligatable with AsiSI sites, was carried out overnight at 4 °C. T4 ligase was inactivated at 65 °C for 10 min, then ligated DNA was fragmented by EcoRI digestion at 37 °C for 2 h. Digestion was then inactivated at 70 °C for 20 min. Samples were precleared with protein A beads for 2 h at 4 °C on a wheel. Precleared samples were then incubated with streptavidin beads (Sigma) at 4 °C overnight. Beads were previously saturated with 500 μg of BSA for 2 h at 4 °C. DNA pulled down with streptavidin beads was washed once with dialysis buffer (2 mM EDTA, 50 mM Tris pH 8.1, 0.2% Sarkosyl), five times with wash buffer (100 mM Tris pH 8.8, 500 mM LiCl, 1% NP-40, 1% NaDoc), and three times in TE buffer (10 mM Tris pH 8, 0.5 mM EDTA pH 8). Beads were resuspended in 100 μL of water and digested with HindIII at 37 °C for 4 h. After phenol/chloroform purification and precipitation, DNA was resuspended in 100 μL water. qPCR was performed using the following primers:
ASXL1-FW: CCTAGCTGAGGTCGGTGCTA
ASXL1-REV: GAAGAGTGAGGAGGGGGAGT
RBMXL1-FW: GATTGGCTATGGGTGTGGAC
RBMXL1-REV: CATCCTTGCAAACCAGTCCT

Resection assay. Resection was measured as described in ref. 41 with the following modifications. DNA was extracted from fresh cells using the DNeasy kit (Qiagen). In total, 400 ng were digested overnight at 37 °C using the BanI restriction enzyme (16 U per sample), which cuts at ~200 bp and 1626 bp from DSB-KDELR3 and at 740 bp and 2000 bp from DSB-ASXL1. Digested and undigested samples were also treated with RNase H (Promega). BanI was heat inactivated for 20 min at 65 °C. Digested and undigested DNA were analyzed by qPCR using the following primers:
DSB-KDELR3_200_FW: ACCATGAACGTGTTCCGAAT
DSB-KDELR3_200_REV: GAGCTCCGCAAAGTTTCAAG
DSB-KDELR3_1626_FW: CCCTGGTGAGGGGAGAATC
DSB-KDELR3_1626_REV: GCTGTCCGGGCTGTATTCTA
DSB-ASXL1_740_FW: GTCCCCTCCCCCACTATTT
DSB-ASXL1_740_REV: ACGCACCTGGTTTAGATTGG
DSB-ASXL1_2000_FW: GTTCCTGTTATGCGGGTGTT
DSB-ASXL1_2000_REV: TGGACCCCAAATTCCTAAAG
ssDNA% was calculated with the following equation: $\mathrm{ssDNA\%} = 1/\left(2^{(Ct_{\mathrm{digested}} - Ct_{\mathrm{undigested}} - 1)} + 0.5\right) \times 100$.

HR, NHEJ, and SSA repair assays. The GCS5 and RG37 cell lines were derived from SV40 T-transformed human fibroblasts (GM639) (refs. 38,40). The U2OS SSA cell line was derived from the osteosarcoma cell line U2OS (ref. 39). For siRNA transfection, 1 × 10^5 cells were transfected using Interferin (Ozyme, France) with 10 nM of siRNA according to the manufacturer’s instructions. Plasmid coding for I-SceI expression was transfected using JetPei (Ozyme, France) according to the manufacturer’s instructions. Twenty-four hours after siRNA transfection, cells were washed and transfected with the I-SceI coding plasmid (1 µg). After 72 h, cells were collected after trypsin treatment and analyzed by flow cytometry (BD FACSCalibur) to detect and count GFP-positive cells in each condition. The percentage of GFP-positive cells was calculated on 25,000 sorted events.

Translocation assay. AID-DIvA cells were treated as indicated and then DNA was extracted from fresh cells using the DNeasy kit (Qiagen). Illegitimate rejoining frequencies between MIS12 and TRIM37 (chr17_5390209 and chr17_57184285), LINC00217 and LYRM2 (chr6_135819337 and chr6_9034817), MIS12 and LYRM2 (chr17_5390209 and chr6_9034817), or TRIM37 and RBMXL1 (chr17_57184285 and chr1_89433139) were assessed by qPCR using the following primers:
MIS12_Fw: GACTGGCATAAGCGTCTTCG
TRIM37_Rev: TCTGAAGTCTGCGCTTTCCA
LINC00217_Fw: GGAAGCCGCCCAGAATAAGA
LYRM2_Rev: TCTGAAGTCTGCGCTTTCCA
TRIM37_Fw: AATTCGCAAACACCAACCGT
RBMXL1_Rev: GCCAATGGAGTTCCCTGAGTC
Results were normalized using two control regions, both far from any AsiSI site and γH2AX domain, using the following primers:
Ctrl_chr1_82844750_Fw: AGCACATGGGATTTTGCAGG
Ctrl_chr1_82844992_Rev: TTCCCTCCTTTGTGTCACCA
Ctrl_chr17_9784962_Fw: ACAGTGGGAGACAGAAGAGC
Ctrl_chr17_9785135_Rev: CTCCATCATCGCACCCTTTG
Normalized translocation frequencies were calculated using the DeltaDeltaCt method from the Bio-Rad CFX Manager 3.1 software (ref. 69).

Data availability. High-throughput sequencing data have been deposited in ArrayExpress under accession number E-MTAB-6318. Other data and source codes are available upon request.

Received: 27 July 2017 Accepted: 5 January 2018

References
1. Groh, M., Albulescu, L. O., Cristini, A. & Gromak, N. Senataxin: genome guardian at the interface of transcription and neurodegeneration. J. Mol. Biol. 429, 3181–3195 (2016).
2. Kim, H. D., Choe, J. & Seo, Y. S. The sen1(+) gene of Schizosaccharomyces pombe, a homologue of budding yeast SEN1, encodes an RNA and DNA helicase. Biochemistry 38, 14697–14710 (1999).
3. Martin-Tumasz, S. & Brow, D. A. Saccharomyces cerevisiae Sen1 helicase domain exhibits 5′-to-3′ helicase activity with a preference for translocation on DNA rather than RNA. J. Biol. Chem. 290, 22880–22889 (2015).
4. Leonaite, B. et al. Sen1 has unique structural features grafted on the architecture of the Upf1-like helicase family. EMBO J. 36, 1590–1604 (2017).
5. Chedin, F. Nascent connections: R-loops and chromatin patterning. Trends Genet. 32, 828–838 (2016).
6. Ginno, P. A., Lim, Y. W., Lott, P. L., Korf, I. & Chedin, F. GC skew at the 5′ and 3′ ends of human genes links R-loop formation to epigenetic regulation and transcription termination. Genome Res. 23, 1590–1600 (2013).
7. Sanz, L. A. et al. Prevalent, dynamic, and conserved R-loop structures associate with specific epigenomic signatures in mammals. Mol. Cell 63, 167–178 (2016).
8. Skourti-Stathaki, K. & Proudfoot, N. J. A double-edged sword: R loops as threats to genome integrity and powerful regulators of gene expression. Genes Dev. 28, 1384–1396 (2014).
9. Garcia-Muse, T. & Aguilera, A. Transcription-replication conflicts: how they occur and how they are resolved. Nat. Rev. Mol. Cell Biol. 17, 553–563 (2016).
10. Porrua, O. & Libri, D. Transcription termination and the control of the transcriptome: why, where and how to stop. Nat. Rev. Mol. Cell Biol. 16, 190–202 (2015).
11. Proudfoot, N. J. Transcriptional termination in mammals: stopping the RNA polymerase II juggernaut. Science 352, aad9926 (2016).


12. Marnef, A., Cohen, S. & Legube, G. Transcription-coupled DNA double-strand break repair: active genes need special care. J. Mol. Biol. 429, 1277–1288 (2017).
13. Gaillard, H. & Aguilera, A. Transcription as a threat to genome integrity. Annu. Rev. Biochem. 85, 291–317 (2016).
14. Sollier, J. & Cimprich, K. A. Breaking bad: R-loops and genome integrity. Trends Cell Biol. 25, 514–522 (2015).
15. Sollier, J. et al. Transcription-coupled nucleotide excision repair factors promote R-loop-induced genome instability. Mol. Cell 56, 777–785 (2014).
16. Alzu, A. et al. Senataxin associates with replication forks to protect fork integrity across RNA-polymerase-II-transcribed genes. Cell 151, 835–846 (2012).
17. Mischo, H. E. et al. Yeast Sen1 helicase protects the genome from transcription-associated instability. Mol. Cell 41, 21–32 (2011).
18. Richard, P., Feng, S. & Manley, J. L. A SUMO-dependent interaction between Senataxin and the exosome, disrupted in the neurodegenerative disease AOA2, targets the exosome to sites of transcription-induced DNA damage. Genes Dev. 27, 2227–2232 (2013).
19. Yuce, O. & West, S. C. Senataxin, defective in the neurodegenerative disorder ataxia with oculomotor apraxia 2, lies at the interface of transcription and the DNA damage response. Mol. Cell. Biol. 33, 406–417 (2013).
20. Becherel, O. J. et al. Senataxin plays an essential role with DNA damage response proteins in meiotic recombination and gene silencing. PLoS Genet. 9, e1003435 (2013).
21. Airoldi, G. et al. Characterization of two novel SETX mutations in AOA2 patients reveals aspects of the pathophysiological role of senataxin. Neurogenetics 11, 91–100 (2010).
22. Suraweera, A. et al. Senataxin, defective in ataxia oculomotor apraxia type 2, is involved in the defense against oxidative DNA damage. J. Cell Biol. 177, 969–979 (2007).
23. Golla, U. et al. Sen1p contributes to genomic integrity by regulating expression of ribonucleotide reductase 1 (RNR1) in Saccharomyces cerevisiae. PLoS ONE 8, e64798 (2013).
24. Roda, R. H., Rinaldi, C., Singh, R., Schindler, A. B. & Blackstone, C. Ataxia with oculomotor apraxia type 2 fibroblasts exhibit increased susceptibility to oxidative DNA damage. J. Clin. Neurosci. 21, 1627–1631 (2014).
25. De Amicis, A. et al. Role of senataxin in DNA damage and telomeric stability. DNA Repair (Amst.) 10, 199–209 (2011).
26. Li, L. et al. DEAD Box 1 facilitates removal of RNA and homologous recombination at DNA double-strand breaks. Mol. Cell. Biol. 36, 2794–2810 (2016).
27. Britton, S. et al. DNA damage triggers SAF-A and RNA biogenesis factors exclusion from chromatin coupled to R-loops removal. Nucleic Acids Res. 42, 9047–9062 (2014).
28. Ohle, C. et al. Transient RNA-DNA hybrids are required for efficient double-strand break repair. Cell 167, 1001–1013.e7 (2016).
29. Aymard, F. et al. Transcriptionally active chromatin recruits homologous recombination at DNA double-strand breaks. Nat. Struct. Mol. Biol. 21, 366–374 (2014).
30. Iacovoni, J. S. et al. High-resolution profiling of gammaH2AX around DNA double strand breaks in the mammalian genome. EMBO J. 29, 1446–1457 (2010).
31. Aymard, F. et al. Genome-wide mapping of long-range contacts unveils clustering of DNA double-strand breaks at damaged active genes. Nat. Struct. Mol. Biol. 24, 353–361 (2017).
32. Boguslawski, S. J. et al. Characterization of monoclonal antibody to DNA:RNA and its application to immunodetection of hybrids. J. Immunol. Methods 89, 123–130 (1986).
33. Nadel, J. et al. RNA:DNA hybrids in the human genome have distinctive nucleotide characteristics, chromatin composition, and transcriptional relationships. Epigenet. Chromatin 8, 46 (2015).
34. Caron, P. et al. Non-redundant functions of ATM and DNA-PKcs in response to DNA double-strand breaks. Cell Rep. 13, 1598–1609 (2015).
35. Calderwood, S. K. A critical role for topoisomerase IIb and DNA double strand breaks in transcription. Transcription 7, 75–83 (2016).
36. Canela, A. et al. Genome organization drives chromosome fragility. Cell 170, 507–521 (2017).
37. Chailleux, C. et al. Quantifying DNA double-strand breaks induced by site-specific endonucleases in living cells by ligation-mediated purification. Nat. Protoc. 9, 517–528 (2014).
38. Dumay, A. et al. Bax and Bid, two proapoptotic Bcl-2 family members, inhibit homologous recombination, independently of apoptosis regulation. Oncogene 25, 3196–3205 (2006).
39. Stark, J. M., Pierce, A. J., Oh, J., Pastink, A. & Jasin, M. Genetic steps of mammalian homologous repair with distinct mutagenic consequences. Mol. Cell. Biol. 24, 9305–9316 (2004).
40. Xie, A., Kwok, A. & Scully, R. Role of mammalian Mre11 in classical and alternative nonhomologous end joining. Nat. Struct. Mol. Biol. 16, 814–818 (2009).
41. Zhou, Y., Caron, P., Legube, G. & Paull, T. T. Quantitation of DNA double-strand break resection intermediates in human cells. Nucleic Acids Res. 42, e19 (2014).
42. Clouaire, T., Marnef, A. & Legube, G. Taming tricky DSBs: ATM on duty. DNA Rep. https://doi.org/10.1016/j.dnarep.2017.06.010 (2017).
43. Gong, F. et al. Screen identifies bromodomain protein ZMYND8 in chromatin recognition of transcription-associated DNA damage that promotes homologous recombination. Genes Dev. 29, 197–211 (2015).
44. Gong, F., Clouaire, T., Aguirrebengoa, M., Legube, G. & Miller, K. M. Histone demethylase KDM5A regulates the ZMYND8-NuRD chromatin remodeler to promote DNA repair. J. Cell Biol. 216, 1959–1974 (2017).
45. Kakarougkas, A. et al. Requirement for PBAF in transcriptional repression and repair at DNA breaks in actively transcribed regions of chromatin. Mol. Cell 55, 723–732 (2014).
46. Shanbhag, N. M., Rafalska-Metcalf, I. U., Balane-Bolivar, C., Janicki, S. M. & Greenberg, R. A. ATM-dependent chromatin changes silence transcription in cis to DNA double-strand breaks. Cell 141, 970–981 (2010).
47. Ui, A., Nagaura, Y. & Yasui, A. Transcriptional elongation factor ENL phosphorylated by ATM recruits polycomb and switches off transcription for DSB repair. Mol. Cell 58, 468–482 (2015).
48. Iannelli, F. et al. A damaged genome’s transcriptional landscape through multilayered expression profiling around in situ-mapped DNA double-strand breaks. Nat. Commun. 8, 15656 (2017).
49. Zhang, X. et al. Attenuation of RNA polymerase II pausing mitigates BRCA1-associated R-loop accumulation and tumorigenesis. Nat. Commun. 8, 15908 (2017).
50. Michelini, F. et al. Damage-induced lncRNAs control the DNA damage response through interaction with DDRNAs at individual double-strand breaks. Nat. Cell Biol. 19, 1400–1411 (2017).
51. Castellano-Pozo, M. et al. R loops are linked to histone H3 S10 phosphorylation and chromatin condensation. Mol. Cell 52, 583–590 (2013).
52. Garcia-Pichardo, D. et al. Histone mutants separate R loop formation from genome instability induction. Mol. Cell 66, 597–609.e5 (2017).
53. Chen, P. B., Chen, H. V., Acharya, D., Rando, O. J. & Fazzio, T. G. R loops regulate promoter-proximal chromatin architecture and cellular differentiation. Nat. Struct. Mol. Biol. 22, 999–1007 (2015).
54. Jacquet, K. et al. The TIP60 complex regulates bivalent chromatin recognition by 53BP1 through direct H4K20me binding and H2AK15 acetylation. Mol. Cell 62, 409–421 (2016).
55. Tang, J. et al. Acetylation limits 53BP1 association with damaged chromatin to promote homologous recombination. Nat. Struct. Mol. Biol. 20, 317–325 (2013).
56. Courilleau, C. et al. The chromatin remodeler p400 ATPase facilitates Rad51-mediated repair of DNA double-strand breaks. J. Cell Biol. 199, 1067–1081 (2012).
57. Sun, Y. et al. Histone H3 methylation links DNA damage detection to activation of the tumour suppressor Tip60. Nat. Cell Biol. 11, 1376–1382 (2009).
58. Xu, Y. et al. The p400 ATPase regulates nucleosome stability and chromatin ubiquitination during DNA repair. J. Cell Biol. 191, 31–43 (2010).
59. Hatchi, E. et al. BRCA1 recruitment to transcriptional pause sites is required for R-loop-driven DNA damage repair. Mol. Cell 57, 636–647 (2015).
60. Zhao, D. Y. et al. SMN and symmetric arginine dimethylation of RNA polymerase II C-terminal domain control termination. Nature 529, 48–53 (2016).
61. Skourti-Stathaki, K., Kamieniarz-Gdula, K. & Proudfoot, N. J. R-loops induce repressive chromatin marks over mammalian gene terminators. Nature 516, 436–439 (2014).
62. Dhir, A., Dhir, S., Proudfoot, N. J. & Jopling, C. L. Microprocessor mediates transcriptional termination of long noncoding RNA transcripts hosting microRNAs. Nat. Struct. Mol. Biol. 22, 319–327 (2015).
63. Wagschal, A. et al. Microprocessor, Setx, Xrn2, and Rrp6 co-operate to induce premature termination of transcription by RNAPII. Cell 150, 1147–1157 (2012).
64. Francia, S. et al. Site-specific DICER and DROSHA RNA products control the DNA-damage response. Nature 488, 231–235 (2012).
65. Marnef, A. & Legube, G. Organizing DNA repair in the nucleus: DSBs hit the road. Curr. Opin. Cell Biol. 46, 1–8 (2017).
66. Suberbielle, E. et al. Physiologic brain activity causes DNA double-strand breaks in neurons, with exacerbation by amyloid-beta. Nat. Neurosci. 16, 613–621 (2013).
67. Madabhushi, R. et al. Activity-induced DNA breaks govern the expression of neuronal early-response genes. Cell 161, 1592–1605 (2015).


68. Schwer, B. et al. Transcription-associated processes cause DNA double-strand breaks and translocations in neural stem/progenitor cells. Proc. Natl Acad. Sci. USA 113, 2258–2263 (2016).
69. Vandesompele, J. et al. Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biol. 3, RESEARCH0034 (2002).

Acknowledgements
We thank the genomic core facility of EMBL for high-throughput sequencing. We are grateful to the Non-Invasive Exploration service (US006) for access to the irradiator (Biobeam 8000). We thank Dr. Lionel Sanz and Dr. Frederic Chedin (UC Davis) for providing advice for DRIP experiments. We thank Dr. Bernard Lopez (IGR Paris) for the NHEJ and HR reporter constructs, as well as Dr. Jeremy Stark (City of Hope) for the SSA reporter construct. pICE-NLSmCherry and pICE-RNaseHI-NLS-mCherry were a kind gift from Dr. S. Britton (IPBS Toulouse). We thank Dr. M. Bushell (University of Leicester) for helpful discussions and data exchange. We also thank Dr. d’Adda di Fagagna (IFOM) for sharing unpublished work. M.A. was supported by the Fondation pour la Recherche Médicale (FRM). Funding in the GL laboratory was provided by grants from the European Research Council (ERC-2014-CoG 647344), Agence Nationale pour la Recherche (ANR-14-CE10-0002-01 and ANR-13-BSV8-0013), the Institut National contre le Cancer (INCA), and the Ligue Nationale contre le Cancer (LNCC, “équipe labelisée”).

Author contributions
S.C., N.P., T.C., and Y.C. performed experiments. V.R., M.A., and T.C. performed bioinformatic analyses of ChIP and DRIP data sets. Y.-L.L. and P.P. provided technical and conceptual assistance for DRIP experiments. G.L. conceived, supervised, analyzed experiments and wrote the manuscript. All authors commented and edited the manuscript.

Additional information
Supplementary Information accompanies this paper at https://doi.org/10.1038/s41467-018-02894-w.

Competing interests: The authors declare no competing financial interests.

Reprints and permission information is available online at http://npg.nature.com/reprintsandpermissions/

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2018
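As a numerical check of the ssDNA% equation given in the Resection assay section of the Methods, the calculation can be sketched in Python as follows. This is an illustrative sketch and not the analysis code used in the study; the function name `ssdna_percent` and the example Ct values are hypothetical.

```python
def ssdna_percent(ct_digested: float, ct_undigested: float) -> float:
    """Percentage of single-stranded DNA at a restriction site.

    Implements ssDNA% = 1 / (2^(dCt - 1) + 0.5) * 100,
    with dCt = Ct(digested) - Ct(undigested).
    """
    delta_ct = ct_digested - ct_undigested
    return 100.0 / (2.0 ** (delta_ct - 1.0) + 0.5)


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only.
    for ct_dig, ct_undig in [(25.0, 25.0), (26.0, 25.0), (30.0, 25.0)]:
        print(ct_dig, ct_undig, round(ssdna_percent(ct_dig, ct_undig), 2))
```

With identical Ct values in the digested and undigested samples the formula returns 100%, and the value falls as digestion delays amplification: a one-cycle delay gives about 66.7%.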

Supplementary Figure 1, related to Figure 1: Senataxin binds to DSBs induced in transcriptionally active loci
A. Box plots representing senataxin ChIP-seq count before (-4OHT) and after (+4OHT) DSB induction at AsiSI “cut” sites sorted according to RNA Polymerase II-S2P occupancy (left panel) or RNA level (right panel) on a 10 kb window surrounding AsiSI sites (20 DSBs in each category). Center line: median; Box limits: 1st and 3rd quartiles; Whiskers: Maximum and minimum without outliers. Points: outliers.
B. Heatmaps representing SETX ChIP-seq count over a 10 kb window centered on the DSB before (-4OHT) and after (+4OHT) DSB induction, as well as RNA Pol II ChIP-seq count (total and S2P) prior to DSB induction. DSBs are sorted according to decreasing RNA Polymerase II occupancy.
C. Genome browser screenshots representing BLESS signal (Clouaire et al, in revision), H3 (negative control, Clouaire et al, in revision) and XRCC4 (positive control, Aymard et al, 2014) ChIP-seq read counts after damage induction (+4OHT) at the four individual AsiSI sites presented in Fig. 1E. Note that despite no senataxin recruitment, the two untranscribed loci display strong recruitment of XRCC4.
D. Box plot showing the senataxin read count on -/+ 500 bp windows at DSBs that exhibit a high level of Rad51 (Rad51-bound, 20 DSBs) or a low level of Rad51 (Rad51-unbound, 20 DSBs). Data are expressed in log2 (+4OHT/-4OHT). Center line: median; Box limits: 1st and 3rd quartiles; Whiskers: Maximum and minimum without outliers. Points: outliers. P-values are indicated (Wilcoxon Mann-Whitney test).
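The box-plot conventions used in these panels (median, first and third quartiles, whiskers at the most extreme values that are not outliers, and outliers beyond 1.5 × interquartile range) can be illustrated with a short Python sketch. The study reports using base R, so this block is only an illustration: the function name `box_stats` and the example values are hypothetical, and quartiles are computed here by linear interpolation, which can differ slightly from the hinges used by R's boxplot().

```python
from statistics import median, quantiles


def box_stats(values):
    """Summary statistics for a box plot with 1.5 x IQR outlier fences."""
    data = sorted(values)
    # Quartiles by linear interpolation between data points.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "median": median(data),
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],    # minimum value without outliers
        "whisker_high": inside[-1],  # maximum value without outliers
        "outliers": outliers,        # drawn as individual points
    }


if __name__ == "__main__":
    # Hypothetical read counts, for illustration only.
    print(box_stats([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```

In this example the value 100 lies above the upper fence (7 + 1.5 × 4 = 13), so it is reported as an outlier and the upper whisker stops at 8.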

Supplementary Figure 2, related to Figure 2: RNA:DNA hybrid distribution analysed by DRIP-seq in DIvA cells before and after DSB induction
A. DRIP-qPCR (DRIP efficiency, % input) performed in DIvA cells prior to DSB induction in the presence or absence of RNAseH treatment as indicated, on two genomic loci known to be either devoid of (SNRNP) or enriched in (RPL13A) R-loops. Mean and s.e.m of technical replicates of a representative experiment are shown.
B. Average DRIP-seq profiles across all genes of the genome (hg19) divided into three categories based on their RNA PolII enrichment across the gene body (high, medium, low, as indicated).
C. Genome browser screenshot representing DRIP-seq and senataxin ChIP-seq read counts before (-4OHT) and after damage induction (+4OHT) at a DSB induced in the first intron of a transcribed gene.
D. Genome browser screenshot representing DRIP-seq and senataxin ChIP-seq read counts before (-4OHT) and after damage induction (+4OHT) at an untranscribed AsiSI site (see RNA PolII-S2P and RNA-seq signals). The BLESS signal (indicative of cleavage efficiency) is also shown. A close-up with total RNA PolII enrichment is shown on the bottom panel. Note that at this locus, although not transcribed to a detectable level, total RNA PolII is present prior to break induction. A low amount of RNA:DNA hybrids forms following DSB induction.
E. Same as in C, except that total RNA PolII is not detected prior to break induction.
F. Same as in C, except that at this untranscribed locus, no RNA:DNA hybrids form following breakage.
G. Average DRIP-seq profiles across the genes either directly damaged (AsiSI within their gene body) or lying in the immediate vicinity of an AsiSI site (<1 kb), before and after 4OHT treatment as indicated.
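Panel A reports DRIP-qPCR signal as a percentage of input. The caption does not spell out the calculation; a common convention for immunoprecipitation qPCR, assumed here, first corrects the input Ct for the fraction of material set aside as input and then compares it with the Ct of the immunoprecipitated sample. The function name `percent_input`, the input fraction and the Ct values below are hypothetical.

```python
import math


def percent_input(ct_ip: float, ct_input: float, input_fraction: float) -> float:
    """Immunoprecipitation efficiency as a percentage of input.

    input_fraction is the share of starting material kept as input
    (for example 0.25 for 25%). Assumes 100% PCR efficiency, i.e. a
    two-fold change in template per cycle.
    """
    # Shift the input Ct to what 100% of the material would have given.
    ct_input_adjusted = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (ct_input_adjusted - ct_ip)


if __name__ == "__main__":
    # Hypothetical values: an IP that amplifies at the same cycle as a
    # 25% input recovers 25% of the starting material.
    print(percent_input(ct_ip=24.0, ct_input=24.0, input_fraction=0.25))
```

Under these assumptions, each additional cycle needed by the immunoprecipitated sample halves the reported efficiency.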

Supplementary Figure 3, related to Figure 3: Senataxin depletion triggers sensitivity to DSBs induced by AsiSI and to etoposide but not to irradiation
A. cDNA levels, analysed by RT-qPCR in cells transfected with a control siRNA, a single siRNA directed against SETX (SETX#2, top panel) or a pool of siRNAs directed against SETX (Smartpool, middle panel). Mean and s.e.m of 7 and 5 biological replicates, respectively, are shown. Bottom panel shows a western blot (SETX and α tubulin) in DIvA cells transfected with the control or SETX#2 siRNA. P-values are indicated (paired t-test).
B. Clonogenic assays in AID-DIvA cells transfected with control and SETX Smartpool siRNA, before (-4OHT) and after 4OHT treatment followed by auxin treatment (+4OHT+IAA) as indicated. Mean and s.e.m of 6 biological replicates are shown. P-values are indicated (paired t-test).
C. Clonogenic assays in U2OS cells transfected with control and SETX#2 siRNA, following etoposide treatment as indicated. Mean and s.e.m of 3 biological replicates are shown.
D. Clonogenic assays in U2OS cells transfected with control and SETX#2 siRNA, following exposure to γ-irradiation as indicated. Mean and s.e.m of 3 biological replicates are shown.
E. Quantification of γH2AX signal detected in U2OS cells following treatment with increasing doses of etoposide, after transfection with control or SETX#2 siRNA as indicated. A representative experiment is shown (>100 nuclei). Center line: median; Box limits: 1st and 3rd quartiles; Whiskers: Maximum and minimum without outliers.
F. Same as in E, except that quantification was performed following exposure to 2 or 4 Gy.

Supplementary Figure 4, related to Figure 4: Senataxin depletion does not delay repair kinetics in G1 and G2 cells
A. Quantification of γH2AX signal detected in AID-DIvA cells following treatment with 4OHT (4 h) and 4OHT followed by 2 h of auxin (4OHT+IAA). Cells in G1 or G2 were sorted based on Hoechst staining, using an Operetta device coupled to the Columbus software. A representative experiment is shown (>1000 nuclei). Center line: median; Box limits: 1st and 3rd quartiles; Whiskers: Maximum and minimum without outliers.
B. Cleavage assay performed in AID-DIvA cells arrested in G1 (+lovastatin, left panel) or in G2 (+RO3306, right panel), left untreated or treated with 4OHT (4 h) followed by auxin (IAA) addition (30 min and 60 min), after transfection of control or SETX#2 siRNAs. Precipitated DNA was analyzed close to DSB-RBMXL1, found to recruit senataxin after 4OHT. The percentage of sites that remain broken after the indicated time of auxin treatment is presented. Mean and s.e.m of technical replicates from a representative experiment are shown. FACS profiles are also shown (top panels) for both conditions.

[Supplementary Figure 5. Panels A and B: 53BP1 staining and 53BP1 intensity (AU) in CTRL and SETX#2 siRNA-transfected cells after +4OHT, without or with DRB (panel A) or CDP (panel B). P values displayed on the plots: p<0.0001.]

Supplementary Figure 5, related to Figure 5: Senataxin depletion increases 53BP1 foci formation in a manner partially reversed by prior exposure to transcription inhibitors.

A. 53BP1 staining performed in 4OHT-treated DIvA cells (4h), after transfection with control or SETX siRNA, in the presence of DRB, a transcription inhibitor, as indicated. Right panel shows the quantification of the 53BP1 nuclear signal within foci (>100 nuclei) from a representative experiment. Center line: median; box limits: 1st and 3rd quartiles; whiskers: maximum and minimum without outliers. P-values are indicated (unpaired t-test). B. Same as in A, except that a cordycepin pre-treatment was performed. Scale bar: 10 µm.

[Supplementary Figure 6. Panel A: PCR detection of MIS12::LYRM2 and TRIM37::RBMXL1 rejoining products (-4OHT, +4OHT+IAA, H2O control, 100bp ladder). Panels B to D: translocation frequency normalized to CTRL (n=4), for siRNA CTRL versus siRNA SETX#2 (panels B and D) or SETX SmartPool (panel C); panel D compares empty vector and RNAseH at MIS12::TRIM37, -4OHT and +4OHT. P values are displayed on the plots.]

Supplementary Figure 6, related to Figure 6: Senataxin depletion increases translocation frequency in a manner partially reversed following overexpression of RNAseH1. A. Rejoining of distant DSBs located on different chromosomes was detected by PCR, following DSB induction and repair (+4OHT+IAA 2h). DNA sequencing confirmed the nature of the amplified products. B. MIS12::LYRM2 and TRIM37::RBMXL1 rejoining frequencies were analyzed after 4OHT+IAA treatment, by quantitative PCR in AID-DIvA cells transfected with control or SETX-directed siRNA (SETX#2). Data are normalized to the translocation level observed in control cells. Mean and s.e.m. of 4 biological replicates are shown. P values are indicated (one sample t-test). C. All four translocation events were measured as above following depletion of senataxin using the SmartPool siRNA. Mean and s.e.m. of 4 biological replicates are shown. P values are indicated (one sample t-test). D. MIS12::TRIM37 rejoining frequency was analyzed in control or SETX-depleted AID-DIvA cells transfected with an empty vector or with a plasmid expressing RNAseH1, as indicated. Mean and s.e.m. of 4 biological replicates are shown. P value is indicated (paired t-test).

II. RNA: DNA and G4 helicases interplay for DSB repair in active genes

Bloom Syndrome (BS) is an autosomal recessive disorder that is associated with predisposition to cancer, as well as frequent infections due to immune deficiency, symptoms that have all been associated with suboptimal DNA double strand break (DSB) repair. At the molecular level, one of the main characteristics of BS cells is a dramatic increase in Sister Chromatid Exchange (SCE) frequency, also indicative of misregulated DSB repair. BS is linked to mutation of the BLM gene, which encodes a helicase that is very efficient on G-quadruplexes (G4s) (Chatterjee et al., 2014; Huber et al., 2002; Mohaghegh et al., 2001). G4 structures are non-canonical DNA structures that are prevalent in transcribed regions prone to form RNA/DNA hybrids known as R-loops (Ginno et al., 2012a; Ginno et al., 2012b). We recently reported that RNA: DNA hybrid resolution by SETX at DSBs is required to promote HR and to counteract translocations upon damage in active genes (Cohen et al., 2018). BLM has also been implicated in the earlier steps of HR, such as end resection and RAD51 filament assembly (Cejka et al., 2010; Nimonkar et al., 2011; Sturzenegger et al., 2014; Wu et al., 2001). However, BLM also promotes foci assembly of 53BP1 (an anti-resection factor), interacts with 53BP1, and acts together with 53BP1/RIF1 to protect against CtIP-dependent long-range deletions (Grabarz et al., 2013; Tripathi et al., 2018), indicative of some anti-resection properties of the BLM helicase. BLM can also disrupt RAD51 filaments assembled on ssDNA and inhibit D-loop formation (Bugreev et al., 2007). Altogether, these studies point to a dual role of BLM in HR repair and therefore in genomic instability. In this study in preparation, we show an unanticipated interplay between BLM (and potentially other G4 DNA helicases) and SETX in the repair of damage occurring in active genes. Using ChIP-seq mapping in DIvA (DSB inducible via AsiSI) cells, we found that BLM exhibited a recruitment bias for DSBs induced in active genes. As expected, BLM depletion impaired HR and translocations, but surprisingly, it rescued the survival defects observed in SETX-depleted cells upon damage. Similarly, knock down of other G4 helicases (RTEL1, FANCJ) also rescued cell survival in SETX-depleted cells upon damage. Additionally, double depletion rescued the RNA-DNA hybrid accumulation observed in SETX-depleted cells. Those results suggest that the interplay between BLM and SETX could be a key feature of “Transcription-coupled DSB repair”.

A. BLM is recruited at DSB produced in G4-rich loci

In order to get insights into BLM function during DSB repair we analyzed the distribution of BLM, genome-wide and at a high resolution, using our previously described DIvA system.

[Figure 18. Panels A and D: genome browser tracks (normalized read count) of γH2AX, BLM, G4 ChIP (HaCaT, Hänsel-Hertsch et al.) and BLESS around DSBs on chr1, chr13 and chr17. Panel B: normalized BLM ChIP-seq read count on +/- 1kb at uncut and cut sites. Panel C: average read counts of γH2AX and BLM from -50kb to +50kb around the DSB. Panel E: G4 ChIP read count (Hänsel-Hertsch et al.) at BLM-positive and BLM-negative DSBs. A P value of 4.5e-40 is displayed alongside panels B and E. BLM ChIP-seq performed by Thomas Clouaire.]

Figure 18: BLM is recruited on a confined area corresponding to G4 at DSB. A. Genome browser screenshots representing γH2AX and BLM ChIP-seq read counts after damage induction at one AsiSI site. A magnification is shown below. B. Box plots representing BLM ChIP-seq read count ratio (+4OHT/-4OHT) over 1kb at sites displaying AsiSI-induced cleavage (“cut”, 80 AsiSI sites) or not (“uncut”, 1139 AsiSI sites). Center line: median; box limits: 1st and 3rd quartiles; whiskers: maximum and minimum without outliers. P values are indicated (Wilcoxon Mann–Whitney test). C. Averaged BLM (blue) and γH2AX (red) signals over a 100kb region flanking annotated AsiSI sites (80 best cleaved AsiSI) are shown. D. Genome browser screenshots representing BLM ChIP-seq read counts after damage induction at two individual AsiSI sites. G4 ChIP-seq read counts in HaCaT cells are also shown (Hänsel-Hertsch et al., 2018), together with the BLESS signal (indicative of cleavage efficiency (Clouaire et al., 2018)). E. Box plot showing the distribution of G4 ChIP-seq read counts between the “BLM low” and “BLM high” subsets of DSBs. AsiSI-induced DSBs were sorted according to the BLM/XRCC4 ratio in order to define the “BLM low” and “BLM high” categories.

We performed ChIP against BLM followed by high throughput sequencing (ChIP-seq), after damage induction with 4OHT. We found that BLM is highly recruited in the vicinity of the DSB (see Figure 18A). As described in a recent publication from the lab (Clouaire et al., 2018), we identified a set of 80 DSBs robustly induced after 4OHT treatment using BLESS (direct in situ breaks labeling, enrichment on streptavidin and next generation sequencing). BLM binding was significantly enriched following 4OHT on the AsiSI cut sites compared to the uncut AsiSI sites (Figure 18B). Averaging the BLM profile at all AsiSI-induced DSBs revealed that BLM spreads over roughly 5-10kb around DSBs (Figure 18C). In contrast, γH2AX spreads over megabase chromatin domains but is depleted proximal to the DSB (Figure 18C). Interestingly, BLM clearly accumulated at the γH2AX-depleted region surrounding DSBs (Figure 18C). As previously stated, BLM is known to be a very efficient G4 helicase (Chatterjee et al., 2014; Huber et al., 2002; Mohaghegh et al., 2001). Hence, we examined whether BLM recruitment at DSBs depends on G4 enrichment at AsiSI sites. We compared our BLM ChIP-seq data with the G4 ChIP-seq data from Balasubramanian’s lab (Hänsel-Hertsch et al., 2018), obtained in the HaCaT cell line (spontaneously immortalized, non-oncogenic human epidermal keratinocytes). We observed that BLM recruitment at DSBs correlates with G4 enrichment (Figure 18D and 18E). Altogether, these data indicate that the BLM helicase is distributed on restricted domains which correlate with G4-rich loci.
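The comparison of cut and uncut sites described above can be illustrated with a small sketch. It computes a +4OHT/-4OHT read-count ratio over a window and the Mann-Whitney U statistic between two groups of sites. The function names, the pseudocount and the example numbers are assumptions made for illustration; they are not the pipeline or the data used in this work.

```python
def window_ratio(treated_counts, untreated_counts, pseudocount=1.0):
    """Return the +4OHT / -4OHT read-count ratio over one window.

    Both arguments are per-bin read counts covering the same window
    (for example +/- 1 kb around an AsiSI site). The pseudocount is an
    assumption added here to avoid division by zero.
    """
    return (sum(treated_counts) + pseudocount) / (sum(untreated_counts) + pseudocount)


def mann_whitney_u(x, y):
    """Return the Mann-Whitney U statistic of sample x against sample y.

    Tied values receive the average of the ranks they span (midranks).
    """
    pooled = sorted(list(x) + list(y))
    midrank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # positions i..j-1 hold the same value; their 1-based ranks are i+1..j
        midrank[pooled[i]] = (i + 1 + j) / 2
        i = j
    rank_sum_x = sum(midrank[v] for v in x)
    return rank_sum_x - len(x) * (len(x) + 1) / 2


# Hypothetical ratios, for illustration only (not data from the thesis).
cut_ratios = [2.0, 3.0, 4.0]
uncut_ratios = [0.5, 1.0, 1.5]
u_stat = mann_whitney_u(cut_ratios, uncut_ratios)
```

When every value in the first group exceeds every value in the second, U equals the product of the two group sizes, which is the most extreme separation the statistic can report. Converting U into a P value requires its null distribution, which a statistics library would normally provide.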

B. BLM is mainly recruited at DSBs induced in active genes.

Since G4s are secondary structures formed at transcribed regions, we investigated whether BLM recruitment at DSBs also correlates with transcriptional activity. For this analysis we used ChIP-seq mapping of RNA polymerase II phosphorylated on serine 2 (Pol II-S2P) and total RNA polymerase II (total Pol II) that we previously generated in our DIvA model (Cohen et al., 2018). We used those data to infer transcriptional activity for genes immediately proximal to either BLM-low or BLM-high DSBs. Average profiles for Pol II-S2P showed that genes located close to BLM-high DSBs exhibited the typical pattern of actively transcribed genes, with a shift of Pol II-S2P across the gene body (Figure 19A). Conversely, genes lying near BLM-low DSBs exhibited enrichment of Pol II-S2P at the promoter with no increase detected on gene bodies, indicative of low or absent transcription (Figure 19A). We also sorted BLM-high DSBs according to their Pol II-S2P and total Pol II enrichment prior to damage induction. BLM recruitment correlates with total RNA Pol II and Pol II-S2P enrichment prior to DSB induction (Figure 19B). To further investigate whether transcription is required for BLM recruitment after DSB induction, we performed ChIP-qPCR in DIvA cells before and after 4OHT, with or without pretreatment with the transcription inhibitor 5,6-dichloro-1-β-ribofuranosyl-benzimidazole (DRB). Importantly, DRB pretreatment reduced BLM recruitment

[Figure 19. Panel A: RNA Pol II S2P read count from TSS -3kb to TTS +3kb for genes near High BLM and Low BLM DSBs. Panel B: normalized BLM read count at DSBs grouped by total RNA Pol II and by RNA Pol II-S2P level (high, medium high, medium low, low). Panel C: BLM and γH2AX ChIP efficiency (% input) at actin (no DSB) and DSB-II, for -4OHT, +4OHT and +4OHT+DRB (n=3). Experiment performed by Thomas Clouaire.]

Figure 19: BLM is recruited at damaged active transcription units. A. Average profiles for RNA polymerase II phosphorylated on Serine 2 on genes proximal to BLM-high (red) and BLM-low (blue) DSBs. B. Box plots representing BLM ChIP-seq read count after DSB induction (+4OHT) at AsiSI “cut” sites sorted according to total RNA polymerase II and RNA polymerase II phosphorylated on Serine 2 occupancy on a 10kb window surrounding AsiSI sites (20 DSBs in each category). C. ChIP against BLM and γH2AX before damage induction (-4OHT), after damage induction (+4OHT) and after damage induction and transcription inhibition (+4OHT+DRB). qPCR was performed at one DSB and one control gene (actin); data are represented as percent of input precipitated.

after DSB induction compared to untreated cells (Figure 19C). Transcription inhibition had no impact on the γH2AX signal, suggesting that DSB signaling is unaffected by the treatment (Figure 19C). Altogether, those results suggest that BLM is recruited preferentially at DSBs induced in active genes.
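The grouping of DSBs into equal-sized occupancy categories (20 DSBs in each of four categories for 80 cut sites) can be sketched as a simple rank-based binning. The function name, the site identifiers and the signal values below are hypothetical and only illustrate the idea; the actual sorting procedure used for Figure 19B may differ.

```python
def bin_by_signal(signal_by_site, labels=("low", "medium low", "medium high", "high")):
    """Assign each site to an equal-sized category by ascending signal.

    signal_by_site maps a site identifier to its occupancy signal
    (for example Pol II read count in a window around the site).
    """
    ranked = sorted(signal_by_site, key=signal_by_site.get)
    size, remainder = divmod(len(ranked), len(labels))
    if size == 0 or remainder:
        raise ValueError("number of sites must be a positive multiple of the number of labels")
    return {site: labels[rank // size] for rank, site in enumerate(ranked)}


# Hypothetical occupancy values for eight sites (two per category).
pol2_signal = {f"DSB{i}": float(i) for i in range(1, 9)}
categories = bin_by_signal(pol2_signal)
```

With 80 sites and four labels, the same call would place 20 sites in each category, which matches the group size given in the legend.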

C. BLM is recruited at HR-prone DSB and required for RAD51 loading

We have shown previously that DSBs induced in transcriptionally active loci are predominantly repaired by HR (Aymard et al., 2014). In that study, we performed a genome-wide mapping of both NHEJ (XRCC4) and HR (RAD51) components that led to the identification of two classes of AsiSI-induced DSBs. The HR-prone subset recruits high levels of RAD51, undergoes resection and relies on the HR machinery for repair. On the other hand, the non-HR subset of DSBs does not recruit RAD51 (or does so to a much lower extent), undergoes no or minimal resection, and relies on XRCC4 for efficient repair. Hence, we compared our BLM high resolution map with these subsets. As shown in Figure 20A, BLM is strongly recruited at RAD51-bound, HR-prone DSBs, where it spreads over 10kb. By contrast, both the level and the spreading of BLM are restricted at RAD51-unbound, non-HR-prone DSBs (Figure 20A). When taken collectively, HR-prone DSBs clearly exhibit more BLM, with an average spreading of 10kb, than non-HR-prone DSBs (Figure 20B). RAD51 is known to coat resected DNA. Importantly, we observed by ChIP-seq that after 24 hours of 4OHT treatment RAD51 spreading around the break increases from 10kb to around 20kb (Figure 20C). We suggest that this increased spreading correlates with increased resection around the break. Comparing those results with ChIP-seq data for BLM after 24 hours of DSB induction, we observed a similar spreading of BLM, over 20kb around the break (Figure 20C). Overall, those results suggest that BLM spreads around the break in a manner similar to RAD51 and that its spreading might depend on resection length.

We next investigated the consequence of BLM depletion on repair pathway choice using chromatin immunoprecipitation of repair factors at DSBs: RAD51 for HR and XRCC4 for NHEJ. Transfection of DIvA cells with BLM siRNA led to a clear drop in BLM mRNA level, BLM protein level and BLM recruitment on damaged chromatin (Figure 20D). The decrease of the RAD51/XRCC4 ratio after BLM depletion by siRNA indicates that RAD51 recruitment is impaired whereas XRCC4 is unaffected (Figure 20E). Additionally, we performed immunofluorescence against 53BP1 and RAD51 and quantified foci intensity. BLM depletion led to increased 53BP1 signal and decreased RAD51 signal, suggesting a switch of repair pathway from HR to NHEJ. Importantly, those results were similar to those obtained upon SETX depletion (Figure 21A), suggesting that BLM, like SETX, promotes HR.
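As an illustration of how a RAD51/XRCC4 ratio can be derived from ChIP-qPCR, the sketch below uses the usual percent-of-input calculation. It assumes a perfect amplification efficiency of 2 per cycle and an input sample representing a known fraction of the chromatin; the function names, the input fraction and the way Ct values are combined are assumptions, not a description of the exact analysis performed here.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction=0.01):
    """Return ChIP efficiency as percent of input.

    ct_input is first adjusted as if 100% of the chromatin had been
    used as input, then compared with the Ct of the immunoprecipitate.
    Assumes a doubling of product at each PCR cycle.
    """
    adjusted_input = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input - ct_ip)


def rad51_xrcc4_ratio(rad51_pct_input, xrcc4_pct_input):
    """Return the RAD51/XRCC4 ratio at one DSB from two percent-input values."""
    return rad51_pct_input / xrcc4_pct_input
```

A drop in this ratio with an unchanged XRCC4 signal points to reduced RAD51 recruitment, which is how the ratio is read in Figure 20E.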

[Figure 20. Panel A: genome browser tracks (normalized read count) of BLESS, XRCC4, Rad51 and BLM at a BLM-positive DSB (chr22, DSB-1) and a BLM-negative DSB (chr13). Panel B: BLM average read count from -5kb to +5kb around BLM-positive and BLM-negative DSBs. Panel C: Rad51 and BLM tracks (normalized read count) at 4h and 24h at three AsiSI sites. Panel D: cDNA level (normalized to P0), western blot (BLM, α-Tub) and BLM ChIP efficiency (% input; no Ab versus BLM Ab, n=4) for control and BLM siRNA. Panel E: ChIP efficiency (RAD51/XRCC4) at DSB-1, DSB-2 and DSB-3 (BLM-positive and BLM-negative) for control and BLM siRNA. Attribution notes on the figure: experiments performed by Thomas Clouaire; experiments performed by Thomas Clouaire and Aline Marnef.]

Figure 20: BLM enrichment and spreading associates with RAD51 binding. A. Genome browser screenshots representing XRCC4, RAD51 and BLM ChIP-seq read counts after damage induction at two individual AsiSI sites. The BLESS signal (indicative of cleavage efficiency (Clouaire et al., 2018)) is also shown. B. Averaged BLM (blue) and RAD51 (red) signals over a 5kb region flanking annotated “BLM high” (top panel) and “BLM low” AsiSI sites are shown. C. Genome browser screenshots representing RAD51 and BLM ChIP-seq read counts after damage induction for 4 hours (4h) and for 24 hours (24h) at three individual AsiSI sites. D. RT-qPCR showing the efficiency of BLM depletion at the RNA level. Western blot showing BLM depletion efficiency at the protein level. ChIP against BLM performed in 4OHT-treated DIvA cells transfected with control or BLM siRNA as indicated. qPCR was performed at one DSB and data are shown as percent of input immunoprecipitated. E. ChIP against RAD51 and XRCC4 was performed in 4OHT-treated DIvA cells transfected with control or BLM siRNA as indicated; data are shown as the RAD51/XRCC4 ratio. qPCR was performed at 2 “BLM high” DSBs and 1 “BLM low” DSB.

Moreover, compelling evidence suggests a strong involvement of the BLM helicase in promoting resection at DSBs induced in human cells. However, some recent data also suggested anti-resection properties of BLM (Grabarz et al., 2013). We thus studied resection in BLM-depleted DIvA cells using a previously developed assay for sequence-specific DSBs. This assay relies on the inability of restriction enzymes to cleave single-stranded DNA (Zhou et al., 2014). As expected, resection (single-stranded, digestion-resistant DNA) was observed upon 4OHT addition at an HR-prone, BLM-bound DSB (Figure 21B). Interestingly, depletion of BLM by siRNA led to decreased resection at this DSB, indicative of a pro-resection activity of BLM at AsiSI-induced DSBs (Figure 21B). As shown previously, SETX depletion did not reduce ssDNA at the DSB, indicating that it is not necessary to promote resection. However, double depletion decreased ssDNA, suggesting that BLM has a stronger impact on HR than SETX (Figure 21B). Altogether, those results suggest that BLM recruitment is necessary for RAD51 recruitment, for resection and therefore for the HR pathway.
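The readout of the restriction-enzyme resection assay can be illustrated with a short calculation. The sketch assumes that a resected molecule keeps one strand at the amplicon, an intact molecule keeps two, digestion removes all double-stranded template, and PCR product doubles at each cycle. Under these assumptions, if a fraction f of break ends is resected, the digested/mock template ratio is f / (2 - f) = 2^(-ΔCt), which rearranges to f = 1 / (2^(ΔCt - 1) + 0.5). Whether exactly this expression was used for Figure 21B is an assumption; the function name is hypothetical.

```python
def ssdna_percent(ct_digested, ct_mock):
    """Return the percentage of ssDNA at an amplicon next to a DSB.

    ct_digested: Ct after restriction digestion (only ssDNA amplifies).
    ct_mock: Ct of the mock-digested sample (all template amplifies).
    Assumes one strand per resected molecule, two per intact molecule,
    complete digestion and a PCR efficiency of 2.
    """
    delta_ct = ct_digested - ct_mock
    return 100.0 / (2.0 ** (delta_ct - 1.0) + 0.5)
```

A ΔCt of 0 corresponds to fully resected DNA (100%), and the value falls towards 0 as digestion removes more of the template, so a lower percentage after BLM depletion reflects less resection at that distance from the break.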

D. SETX and BLM interplay for DSB repair

Overall, we showed that BLM is recruited preferentially at DSBs induced in transcriptionally active loci, which are preferentially repaired by HR, and that it is necessary for RAD51 recruitment. Since those results are similar to what we observed for SETX (Cohen et al., 2018), we investigated whether BLM could play a similar role in DSB repair.

BLM has no impact on cell survival but is harmful in SETX depleted cells

We investigated BLM involvement in cell survival following DSB induction. We performed a clonogenic assay using AID-DIvA cells, an improved version of our DIvA model (Aymard et al., 2014; Caron et al., 2015). This cell line stably expresses a construct carrying AsiSI-ER fused to an auxin-inducible degron (AID), which triggers the rapid degradation of the restriction enzyme upon auxin addition, thus enabling repair of AsiSI-induced DSBs. As shown in Figure 22A, 4OHT treatment of control-transfected cells reduced clonogenic survival to about 50%, whereas auxin addition rescued cell survival to about 70%. In SETX-depleted cells, DSB induction decreased cell survival to around 15%, whereas DSB repair restored cell survival to around 40-50%, as we showed previously (Cohen et al., 2018). Notably, BLM depletion did not change survival either after 4OHT or after auxin treatment (Figure 22A). This indicates that BLM is not required for cell survival after induction of clean DSBs by AsiSI, suggesting that repair can occur in BLM-depleted cells, contrary to SETX-depleted cells. As BLM and SETX are both involved in HR, we were expecting that double depletion of those proteins would lead to

[Figure 21. Panel A: 53BP1 and RAD51 staining and foci intensity (AU) for siRNA CTRL, SETX, BLM and SETX+BLM. Panel B: scheme of DSB-KDELR3 with the AsiSI site and BanI sites at 200bp and 950bp; ssDNA (%) at DSBII_200pb and DSBII_950pb, -4OHT and +4OHT, for siRNA CTRL, SETX, BLM and SETX+BLM (n=3).]

Figure 21: BLM contributes to repair pathway choice and to resection. A. 53BP1 and RAD51 staining performed in 4OHT-treated DIvA cells (4 h), after transfection with control, SETX, BLM and SETX/BLM siRNA as indicated. Right panel shows the quantification of nuclear signal within foci for each staining (>100 nuclei) from a representative experiment (n=4 for 53BP1 staining and n=5 for RAD51 staining). Center line: median; box limits: 1st and 3rd quartiles; whiskers: maximum and minimum without outliers. B. The site-specific resection assay has been described earlier. Briefly, DNA purified from damaged or undamaged cells is digested by dedicated restriction enzymes (as indicated) and digestion-resistant DNA (single-stranded DNA) is measured by qPCR, using primer pairs flanking the restriction sites. Here, we optimized this assay at one AsiSI-induced DSB that was shown to undergo HR. Resection assay at the DSB in control, SETX, BLM and SETX/BLM siRNA-transfected cells before and after 4OHT treatment.

increased cell death. Surprisingly, combined depletion of SETX and BLM significantly rescued cell survival after DSB induction (Figure 22A). BLM being a DNA helicase, we investigated whether depletion of other DNA helicases could also rescue the effect of SETX depletion on cell survival. We repeated the clonogenic assay, this time after depletion with siRNA against FANCJ and RTEL1 (Figure 22B). Indeed, combined depletion of each of these proteins with SETX also rescued the impact of SETX depletion on cell survival. To confirm that this result depends on DNA helicase activity, we performed the same experiment with DHX9, an RNA helicase (Figure 22B). This time, combined SETX and DHX9 depletion did not rescue cell survival (Figure 22B). Those results showed that BLM activity, and more generally DNA helicase activity, is deleterious for cell survival in SETX-depleted cells.
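The clonogenic readouts discussed above are reported as percent survival with mean, s.e.m. and paired t-tests across biological replicates. A minimal sketch of these summary calculations follows; the function names and the normalization to the untreated (-4OHT) condition of the same siRNA are assumptions made for illustration, and the P value itself would additionally require the t distribution from a statistics library.

```python
import math
from statistics import mean, stdev


def percent_survival(colonies_treated, colonies_untreated):
    """Return colony number after treatment as a percentage of the untreated control."""
    return 100.0 * colonies_treated / colonies_untreated


def mean_sem(values):
    """Return the mean and the standard error of the mean of replicate values."""
    return mean(values), stdev(values) / math.sqrt(len(values))


def paired_t_statistic(condition_a, condition_b):
    """Return the paired t statistic for replicate-matched measurements."""
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
```

Pairing by replicate removes the plate-to-plate variation in plating efficiency, which is why a paired test suits experiments where every siRNA condition is run side by side in each replicate.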

BLM promotes RNA-DNA hybrids accumulation

As BLM depletion rescued cell survival in SETX-depleted cells, we investigated whether this rescue could be associated with RNA-DNA hybrid resolution at DSBs. Indeed, we showed using DRIP-qPCR in our previous publication that SETX depletion leads to increased RNA-DNA hybrids at break sites. Hence, we decided to perform DRIP-qPCR in BLM-depleted and double-depleted cells (Figure 23). Interestingly, in BLM-depleted cells, DSB induction did not promote RNA: DNA hybrid accumulation at the DSB site compared to untreated cells. Double depletion led to decreased RNA-DNA hybrids as well (Figure 23). Those results suggest that BLM promotes RNA-DNA hybrid accumulation at the DSB site. Additionally, we investigated whether the rescue of cell survival could be due to resection inhibition rather than to RNA: DNA hybrid resolution. Indeed, we showed previously that BLM depletion led to a decrease of resection in SETX-depleted cells. To test whether cell survival could be rescued by resection inhibition, we performed clonogenic assays with CtIP depletion and combined SETX/CtIP depletion (Figure 24). As for BLM depletion, CtIP depletion by itself has no impact on cell survival. This result suggests that HR is not necessary for cell survival. However, combined depletion with SETX did not restore cell survival (Figure 24). Therefore, we show that resection, and therefore repair pathway choice, is not crucial for cell survival. We propose that RNA: DNA hybrid resolution might actually be an essential step for cell survival after damage induction.

Altogether, we showed in this study that BLM is recruited at DSBs induced in transcriptionally active genes, and more specifically at G4-rich regions in those genes. Additionally, BLM is responsible for RAD51 recruitment at DSB sites and for resection, therefore promoting HR. Although both SETX and BLM are required to promote HR, we observed a

[Figure 22. % survival at -4OHT, +4OHT and +4OHT+IAA. Panel A: representative plates and quantification for siRNA CTRL, SETX, BLM and BLM+SETX (n=4). Panel B: siRNA CTRL, SETX, FANCJ and FANCJ+SETX; siRNA CTRL, SETX, RTEL1 and RTEL1+SETX; siRNA CTRL, SETX, DHX9 and DHX9+SETX (n=3 or n=4 as marked on the plots). P values are displayed on the plots.]

Figure 22: BLM and DNA helicases depletion promotes cell survival in SETX-depleted cells. A. Clonogenic assays in AID-DIvA cells transfected with control, SETX, BLM and SETX/BLM siRNA, before and after 4OHT treatment (4 h), followed by auxin (IAA) treatment (4 h) as indicated. Left panel shows a representative experiment. Right panel shows the average and s.e.m. of four biological replicates. P values are indicated (paired t-test). B. Average of clonogenic assays performed as in A. Top left panel: control, SETX, FANCJ and SETX/FANCJ siRNA; top right panel: control, SETX, RTEL1 and SETX/RTEL1 siRNA; bottom panel: control, SETX, DHX9 and SETX/DHX9 siRNA. P values are indicated (paired t-test).

surprising interplay between them for cell survival. Indeed, even though BLM has no impact on cell survival by itself, its presence is deleterious for SETX-depleted cells. We observed a similar impact of other DNA helicases such as FANCJ and RTEL1. Although the underlying mechanism remains unclear, we showed that BLM could promote RNA-DNA hybrid stabilization at DSBs in SETX-depleted cells.

[Figure 23. Top: genome browser view of normalized DRIP-seq read count (+4OHT and -4OHT) around the DSB (89 458 000 to 89 459 000), with the position of the primers. Bottom: DRIP-qPCR (% input), -4OHT and +4OHT, for siRNA CTL, SETX, BLM and SETX+BLM (n=3).]

Figure 23: BLM contributes to RNA: DNA hybrid accumulation. DRIP-qPCR in control, SETX, BLM, and SETX/BLM-depleted cells before (−4OHT) and after (+4OHT) DSB induction in DIvA cells as indicated. The position of the primers used to quantify RNA: DNA hybrids by qPCR is indicated on the genome browser screenshot above.

[Figure 24. % survival at -4OHT, +4OHT and +4OHT+IAA for siRNA CTRL, SETX, CtIP and CtIP+SETX. P values displayed on the plot: 0.008, 0.014 and 0.016.]

Figure 24: Cell survival is independent of resection. Clonogenic assays in AID-DIvA cells transfected with control, SETX, CtIP and SETX/CtIP siRNA, before and after 4OHT treatment (4 h), followed by auxin (IAA) treatment (4 h) as indicated. Average and s.e.m. of three biological replicates. P values are indicated (paired t-test).

Discussion


I. RNA: DNA hybrid resolution is crucial for DSB repair in active genes

During my PhD, I showed that RNA: DNA hybrids represent an obstacle for repair at active genes and that it is crucial to remove them from the break. More precisely, I identified SETX as a helicase required to remove RNA: DNA hybrids at DSBs induced in active genes. RNA: DNA hybrid resolution by SETX is necessary for RAD51 loading, to limit translocations and, ultimately, for cell survival.

A. RNA: DNA hybrids mapping

Previous R-loop mapping showed that RNA: DNA hybrids accumulate at RNA polymerase II pause sites, and more specifically at promoters (Chen et al., 2017; Sanz et al., 2016), immediately after the transcription start site (Dumelie and Jaffrey, 2017) and at termination sites (Sanz et al., 2016; Skourti-Stathaki et al., 2011). In agreement with those studies, our DRIP-seq analyses detected RNA: DNA hybrids at transcription start and termination sites on active genes. RNA polymerase II elongation and pausing could be responsible for RNA: DNA hybrids on the gene body. Importantly, in control conditions, DSB induction in active genes led to (1) a strong decrease of RNA: DNA hybrids on the gene body, (2) an accumulation around the DSB and (3) a signal gap at the break itself.

RNA: DNA hybrids loss on the gene body

We observed that DSB induction led to a loss of RNA: DNA hybrids in the gene body. It is possible that the decrease of DRIP-seq signal on the gene body is caused by transcription repression at the gene after DSB induction (Figure 25A). Indeed, it has been observed that DSB induction in a transcriptionally active gene leads to local and transient transcription repression near the break. This event could facilitate DNA repair and limit defective transcription. Transcription silencing can be caused by elongation inhibition in cis of the break (Awwad et al., 2017; Iacovoni et al., 2010; Iannelli et al., 2017; Pankotai et al., 2012; Shanbhag et al., 2010; Solovjeva et al., 2007), by RNA polymerase II degradation by the proteasome (Pankotai et al., 2012), but also by chromatin modifications on the damaged genes. Indeed, ATM-dependent H2AK119ub has been associated with repressive chromatin through RNF8 and RNF168 ubiquitylation (Shanbhag et al., 2010). This ubiquitylation was also found to be catalyzed by Polycomb Repressive Complexes 1 and 2 (Ui et al., 2015). PRC1

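[Figure 25. Panel A (schematic): at a DSB in an active gene, ATM, Polycomb, BAF180/PBAF complex, NuRD, HDAC, ENL, SEC and ZMYND8 act on RNA pol II; a paused polymerase and an R-loop are shown, with open questions (RNA templated? DDR proteins? Resection inhibition?). RNA removal from the DSB is linked to cell survival, RNA stabilization at the DSB to cell death. Panel B (schematic): hybrid removal at the DSB by TC-NER (XPF, XPG), RNA: DNA helicases (SETX), Rnase H, the exosome and the microprocessor (Drosha, DGCR8), with a possible link to transcription termination.]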

Figure 25: RNA: DNA hybrids at DSB, model proposed. A. RNA: DNA hybrid accumulation at DSBs as a byproduct of transcription repression: RNA: DNA hybrids accumulate at DSBs induced in active genes. This accumulation could be caused by transcription repression mediated by chromatin remodeling and RNA polymerase pausing. Those hybrids could play different roles in DSB repair: they could recruit DNA damage response proteins or be used as a template for repair. RNA could also limit RPA or RAD51 binding and limit resection. Whatever the case, those hybrids need to be removed from the DSB for repair and cell survival. B. RNA: DNA hybrid removal from DSBs: hybrids can be removed by Transcription-Coupled Nucleotide Excision Repair (mediated by XPG and XPF). RNA involved in hybrids can also be targeted directly by RNase H1 and H2. RNA: DNA hybrids can also be removed by exosome proteins (RRP6), exoribonucleases such as XRN2, and RNA: DNA helicases such as SETX. Collectively, those proteins have been implicated in transcription termination. Additionally, SETX has been associated with the microprocessor (DGCR8 and Drosha) for premature transcription termination. This mechanism could also be involved in RNA removal from DSBs.

recruitment is also amplified by PRC2 (Kakarougkas et al., 2014). Finally, ATM-dependent H2AK119ub is also mediated by the PBAF complex (Kakarougkas et al., 2014). ATM-dependent histone deacetylation by the NuRD complex is responsible for transcription repression as well (Gong et al., 2015; Gong et al., 2017). This complex is recruited at DNA damage by ZMYND8. NuRD and ZMYND8 can also be regulated by the histone demethylase KDM5A (Gong et al., 2017). Finally, cohesins seem to be implicated in transcription repression as well, throughout the cell cycle (Meisenberg et al., 2019).

Overall, our data and those of others suggest that DSB induction in transcriptionally active genes can lead to transcription repression, illustrated by the loss of RNA: DNA hybrids in the gene body.

RNA: DNA hybrids accumulation around the DSB

In addition to the loss of hybrids in the gene body, our DRIP-seq analyses allowed us to observe RNA: DNA hybrid accumulation at DSBs induced in active genes. Other studies showed similar results. Indeed, RNA: DNA hybrids were previously observed accumulating at damage caused by micro-irradiation (Britton et al., 2014) or at DSBs induced at I-SceI sites (Li et al., 2008). More recently, several studies using the DIvA cell line also showed RNA: DNA hybrids accumulating at DSBs using either DRIP followed by qPCR (Burger et al., 2019; D'Alessandro et al., 2018) or DRIP followed by high throughput sequencing (Lu et al., 2018). Several studies proposed that DSB induction can initiate transcription at the break site, independently of previous transcription. Therefore, RNA: DNA hybrid accumulation at the break could be the result of de novo RNA polymerase II recruitment and transcription at DSBs (Bonath et al., 2018; Burger et al., 2019; Francia et al., 2012; Michalik et al., 2012; Michelini et al., 2017; Ohle et al., 2016; Wei et al., 2012). This de novo transcription would produce lncRNA at the resected strand (Michelini et al., 2017; Ohle et al., 2016). Additionally, RNA polymerase II recruited at the break site could be subjected to post-translational modification (tyr-1) (Burger et al., 2019). However, RNA polymerase II was not seen recruited de novo at DSBs in intergenic loci by ChIP-seq (Iannelli et al., 2017). Additionally, strand-specific NET-seq data, allowing the genome-wide analysis of RNAs embedded in RNA polymerase II, also showed RNA accumulation at DSBs in transcriptionally active loci (Burger et al., 2019). Those results corroborate our own data, where hybrids seem to accumulate mostly at DSBs induced in loci that are transcriptionally active and/or enriched in RNA polymerase II. Indeed, we observed very low hybrid formation in a few intergenic loci with low RNA polymerase II levels prior to DSB induction (Cohen et al., 2018). Altogether, those results suggest that RNA: DNA hybrid accumulation is specific to DSBs induced in transcriptionally active loci.

De novo transcription was also proposed to take place at the resected strand in active loci (D'Alessandro et al., 2018; Michelini et al., 2017). However, after further analyses of NET-seq data, no RNA polymerase II-embedded transcripts were seen starting at DSB ends (Figure 26). Importantly, Bushell’s lab showed no small ncRNAs at AsiSI sites after DSB induction (Lu et al., 2018). Moreover, it seems difficult in our model to decipher, at active loci, whether RNA: DNA hybrids accumulate at DSBs because of de novo transcription or because of RNA polymerase II pausing. We favor the hypothesis that RNA: DNA hybrid accumulation is actually a byproduct of RNA polymerase II pausing at DSBs (Figure 25A). As we mentioned previously, DSB induction can lead to RNA polymerase stalling, and several studies showed that DSB-induced pausing can promote R-loop formation (Chen et al., 2017; Shivji et al., 2018; Zhang et al., 2017). RNA associated with paused RNA polymerase II could therefore hybridize to the double-stranded or resected strand and be observed as RNA: DNA hybrids or non-coding RNA at the break. Additional experiments such as DRIPc-seq could help determine the exact length, strand specificity and sequence of the RNA accumulating around the DSB, and eventually lead to a better understanding of the nature of RNA: DNA hybrids induced at DSBs.

RNA: DNA hybrids function at DSB

Even though the mechanism producing RNA: DNA hybrids at DSB remains unclear, they could play a number of roles at different steps of DSB repair (Figure 25A). RNA: DNA hybrids could directly promote the recruitment of DDR proteins (Figure 25A) such as the mediator proteins MDC1 and 53BP1 (Burger et al., 2019; Lu et al., 2018) as well as proteins from the HR pathway: CSB, RAD51, RAD52 and BRCA1 (D'Alessandro et al., 2018; Lu et al., 2018; Yasuhara et al., 2018). DDR protein recruitment could also be promoted by RNA: DNA hybrids indirectly through the generation of double-stranded small RNAs, DDRNA (Michelini et al., 2018); however, their accumulation has been debated (Bonath et al., 2018). Additionally, RNA: DNA hybrids could regulate resection. Drosha depletion was associated with a decrease of resection and of RNA: DNA hybrids at DSB sites (Lu et al., 2018). However, RNase H overexpression in yeast led to extended resection, suggesting that RNA: DNA hybrids limit resection at DSB (Ohle et al., 2016). This mechanism would be necessary for repetitive DNA regions around DSB, where HR can be deleterious and lead to recombination between homologous chromosomes. Moreover, the presence of RNA: DNA hybrids at the break could influence protein binding to ssDNA. Indeed, RNA: DNA hybrids destabilization was found to impede RPA recruitment (D'Alessandro et al., 2018; Yasuhara et al., 2018), suggesting that RNA: DNA hybrids are necessary for RPA binding. However, we observed that SETX depletion not only increased RNA: DNA hybrids accumulation but also reduced RAD51 binding (Cohen et al., 2018). Accordingly, it was observed in S. cerevisiae and human that RNA: DNA hybrids stabilization impedes RPA recruitment at DSB (D'Alessandro and d'Adda di Fagagna, 2017; Li et al., 2008; Ohle et al., 2016). Finally, RNA involved in RNA: DNA hybrids could also be used as a template for repair or as a bridge between DSB ends via CSB and RAD52 recruitment (Keskin et al., 2014; Meers et al., 2016; Wei et al., 2015). Altogether, those studies indicate that RNA: DNA hybrids accumulate at DSB sites mostly in active genes. However, more experiments are necessary to determine if they are produced by de novo transcription or by pausing of RNA polymerase II. Whatever their function may be, it seems necessary to remove RNA: DNA hybrids from DSB, as they can represent a roadblock for repair (Figure 25A).

Figure 26: NET-seq analyses from Burger et al., 2019. Genome browser screenshots representing RNA Polymerase II S2 ChIP-seq read counts before DSB induction (RNA PolII -DSB) and strand-specific NET-seq read counts before and after DSB induction (NET-seq -DSB and NET-seq +DSB; (+) and (-) strands; normalized read counts) over 200 kb around two individual AsiSI sites (DSB1 and DSB2). Magnifications (4 kb and 400 bp) are shown below each screenshot. The 400 bp zoom around the DSB shows no transcript embedded in RNA polymerase II starting at DSB ends (figure taken from Puget et al., 2019, review in preparation).

B. RNA: DNA hybrids resolution required for repair

RNA: DNA hybrids need to be removed from DSB to ensure correct repair. Indeed, we showed that SETX depletion led to a strong DSB-induced lethality (Figure 25A). Accordingly, RNase H1, RNase H2 and Sen1 mutations in yeast caused RAD52 foci persistence, BIR activation and lethality (Amon and Koshland, 2016). Overall, those results suggest that RNA: DNA hybrids removal is crucial for cell survival after damage (Figure 25A). Strikingly, translocations increased and RAD51 loading decreased when SETX was depleted, suggesting an HR defect. A similar mechanism was observed in a recent study. Indeed, at 5% of the total DSB occurring following IR in human cells in G2 phase, RAD52 and XPG process RNA: DNA hybrids, hence promoting HR repair (Yasuhara et al., 2018). XPG participates in Transcription-coupled DSB repair as well, hence promoting HR (Figure 25B) (Sollier et al., 2014). Together with the studies mentioned previously, these data indicate that RNA: DNA hybrids regulate resection and ssDNA protein binding, therefore impacting HR efficiency. HR defects observed upon RNA: DNA hybrids accumulation would be responsible for gross chromosomal rearrangements. RNA polymerase II stalling at the break could lead to RNA: DNA hybrids formation and would hence require transcription termination (Figure 25B). As SETX is recruited at DSB in active genes and resolves RNA: DNA hybrids, it is possible that SETX promotes premature transcription termination. Indeed, SETX was found to control premature transcription termination by co-operating with the miRNA microprocessor made of Drosha and DGCR8, and in association with XRN2 and RRP6 (Wagschal et al., 2012). Importantly, other studies showed that XRN2 and RRP6 are required for DSB repair after irradiation and that their depletion leads to DDR defects and an increase in RNA: DNA hybrids (Marin-Vicente et al., 2015; Morales et al., 2016). It would be interesting to see in the DIvA system if those proteins are also required for DSB repair in transcriptionally active loci. Altogether, those data suggest that premature transcription termination could be a crucial step in DSB repair in active genes. This mechanism could prevent transcription by RNA polymerase paused at the break and allow correct transcription restart. Overall, the role of SETX in premature transcription termination represents an interesting hypothesis on how RNA: DNA hybrids are removed from DSB in active genes (Figure 25B). To conclude, our study and others showed that RNA: DNA hybrids stabilization at the break can lead to an HR defect. This effect on HR can have a grave impact on the genome and, furthermore, on cell survival.

C. In human diseases

Brain degeneration has been previously associated with genomic instability (Barzilai et al., 2017; Jiang et al., 2017). Indeed, genome instability syndromes are typically characterized by tissue degeneration and notably neurodegeneration. One typical example is the severe Ataxia Telangiectasia (AT) disease caused by ATM null mutations (Rothblum-Oviatt et al., 2016). Importantly, AT is characterized by neuromotor dysfunction, cerebellar atrophy and telangiectasia, phenotypes observed in other AT-like diseases such as Ataxia with Oculomotor Apraxia (AOA). Among AOA diseases, AOA2 is caused by mutations in the SETX gene. Importantly, one of the most striking results we obtained is the massive impact of SETX depletion on cell survival following DSB induction. Indeed, we observed that the role of SETX in repair, and more specifically in HR, is absolutely necessary for cell survival. It is possible that the defect in DSB repair caused by SETX deficiency is the reason for the cerebellum atrophy and neuron loss observed in AOA2. There are 3 other types of AOA diseases, associated with mutations in different genes. The first gene identified as responsible for an AOA (therefore type 1) was Aprataxin (APTX) (Date et al., 2001). APTX encodes a member of the histidine triad superfamily (Schellenberg et al., 2015), which has been involved in single-strand break (SSB) repair and base excision repair (BER). AOA type 3 is due to a mutation in the Phosphoinositide-3-Kinase Regulatory Subunit 5 gene (PIK3R5), encoding a regulatory subunit of the class I Phosphatidylinositol 3-kinases (PI3Ks), previously implicated in oncogenesis (Fruman and Rommel, 2014). Finally, mutations in the kinase domain of the Polynucleotide Kinase 3'-Phosphatase gene (PNKP) are responsible for AOA type 4 (Dumitrache and McKinnon, 2017). PNKP is also involved in the SSB and Non-Homologous End Joining repair pathways and interacts with repair proteins such as XRCC1 and XRCC4 (Dumitrache and McKinnon, 2017). Given the phenotypic similarities between the different AOA types and other AT-like diseases, it would be interesting to see if those proteins are involved in DSB repair.


Additionally, it has been demonstrated that DSB are recurrent during brain development (Schwer et al., 2016; Wei et al., 2016; Wei et al., 2018) and are a physiological feature of brain activity (Madabhushi et al., 2015; Suberbielle et al., 2013). Therefore, those endogenous breaks could be at the origin of the phenotypes observed in neuropathies if DSB repair pathways are altered. Importantly, Alt’s lab identified that clusters of DSB in neural stem/progenitor cells (NSPC) occur in long neural genes. As those genes are transcribed and late-replicated, it is possible that DSB are caused by transcription/replication collisions (Wei et al., 2016; Wei et al., 2018). Using high-throughput, genome-wide translocation analysis, they were able to show that DSB occurring at TSS in NSPC lead to translocations (Schwer et al., 2016). Moreover, both NHEJ and Alt-NHEJ mediate translocations from transcription-associated DSB. Overall, those results are in agreement with our findings and suggest that RNA: DNA hybrids resolution could be an important mechanism in neural progenitors and neural cells, not only to limit DSB induction but also to limit translocations. Defects in this mechanism could be responsible for neurodevelopmental defects and eventually neurodegeneration. Altogether, it would be interesting to see if the mechanism we observed exists in wild-type NSPC and neural cells and to compare it with patient cells.

II. RNA: DNA and G4 helicases interplay for DSB repair in active genes

In a second part of my PhD, I studied the implication of BLM, a G4 helicase, in DSB repair. We were able to show that BLM is recruited following DSB induction in transcriptionally active genes and is required for HR repair in those genes. As those characteristics are similar to what we observed with SETX, we decided to test whether BLM had the same impact on repair pathway choice, translocations, RNA-DNA hybrids resolution and cell survival as well. As G4 have been proposed to stabilize RNA-DNA hybrids, we were expecting that BLM depletion would have the same effect as SETX depletion on DSB repair and, furthermore, that double depletion would aggravate the phenotypes generated by SETX depletion alone. However, double depletion actually rescued major traits obtained with SETX depletion, such as the cell survival defect and RNA-DNA hybrids accumulation. Even though the exact mechanism remains unclear, those results suggest an unexpected interplay between RNA-DNA hybrid helicases and G4 helicases in DSB repair.


A. BLM is recruited at DSB in active genes and required for HR

Our ChIP-seq analysis indicates that BLM binding is enhanced at DSBs characterized by high G4 enrichment (Figure 18D and 18E). It has to be noted that the cell line used for G4 ChIP-seq (HaCat, Hänsel-Hertsch et al., 2018) is different from the DIvA cell line (U2OS). However, we do observe a strong correlation between BLM binding and G4 signal, which suggests that G4 positions are conserved in U2OS. G4 are non-canonical DNA structures that are prevalent in transcribed regions and especially gene promoters (Chambers et al., 2015), and BLM helicase is very efficient on G4 structures (Chatterjee et al., 2014; Huber et al., 2002; Mohaghegh et al., 2001). Those data suggest that BLM G4 helicase activity could be implicated in DSB repair. First, it would be relevant to perform G4 ChIP-seq in the DIvA cell line and to investigate whether DSB induction affects G4 formation. Second, the same experiment performed upon BLM depletion could bring more information. It is possible that BLM resolves G4 at break sites, potentially to facilitate resection or protein binding, and that BLM depletion stabilizes G4 at the break. As specified previously, G4 are structures specific to transcribed loci and, interestingly, we found that BLM preferentially associates with DSBs localized in actively transcribed genes (Figure 19A and 19B). This suggests that G4 resolution could be involved in DSB repair in active genes. Additionally, our data show that BLM is recruited at DSB repaired by Homologous Recombination since i) BLM spreads over ~10kb at HR-prone DSBs (Figure 20A) and ii) RAD51 binding is higher at DSBs enriched in BLM, compared to DSBs poorly enriched in BLM (Figure 20B). In addition, we found that at those HR-prone DSBs, BLM adopts a bimodal distribution (Figure 20B) and spreads around 5kb both upstream and downstream of the DSB, with a drop over the 2-3kb surrounding the break point.
Since ChIP-seq data only provide a snapshot over the entire cell population, it is not possible to accurately determine whether BLM spreading is strictly symmetrical at each HR-prone DSB or if such binding profiles represent an equal mixture of BLM translocating in only one direction. However, by inducing DSB for 24 hours we were able to see RAD51 spreading around the break (Figure 20E), suggesting increased resection, and importantly, BLM spreading was similar. Altogether, this bimodal distribution observed at DSB undergoing homologous recombination and thus resection, together with our finding that BLM contributes to resection at these sites (Figure 20E), suggests that BLM translocates along the DNA as resection occurs, which subsequently favors RAD51 loading. Finally, we observed that BLM is dispensable for cell survival. Previous studies led to conflicting results concerning the sensitivity of BLM-deficient cells to DNA damaging agents. Indeed, both Bloom Syndrome (BS) and BLM-depleted cells were found to be insensitive to HU-induced replication stress, unless prolonged HU treatment is performed (Davies et al., 2004). Similarly, BLM-defective chicken DT40 cells exhibit normal sensitivity to HU (Imamura et al., 2001). On the other hand, BS and/or BLM siRNA-depleted cells display hypersensitivity to formaldehyde, which induces DNA-protein cross-links (Kumari et al., 2015), to agents inducing DNA interstrand cross-links (Pichierri et al., 2004), to camptothecin (Rao and Devi, 2005) and to other genotoxic agents (for example, see Beamish et al., 2002). Here we unambiguously demonstrate that transient BLM depletion by siRNA does not reduce the survival of cells that experience about a hundred clean DNA double-strand breaks. However, although BLM is not necessary for DSB repair, we also found that it promotes end resection and RAD51 binding (Figure 21A and 21B). Those results are in agreement with previous studies on BLM. During HR, double Holliday junctions can occur and rely on BLM/Sgs1 activity for dissolution (Cejka et al., 2010; Cejka and Kowalczykowski, 2010; Wu and Hickson, 2003). BLM has also been implicated in the earlier steps of HR, such as end resection, RAD51 filament assembly, and D-loop formation. BLM/Sgs1 stimulates Exo1- and DNA2-dependent resection in vitro (Cejka et al., 2010; Nimonkar et al., 2011; Sturzenegger et al., 2014) and promotes phosphorylated RPA foci formation (indicative of resection) in vivo (Gravel et al., 2008). Given our results that BLM mainly functions within a specific DSB repair pathway that operates to repair damage within active genes, it is tempting to speculate that the cancer predisposition observed in BS patients arises at least in part from inaccurate repair events at damaged active transcription units.
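The population-averaging caveat raised above for the bimodal BLM profile can be illustrated with a minimal toy sketch. It is not based on our ChIP-seq data: the bin size, the 1.5 kb dip, the 5 kb extent and the function names are all hypothetical values chosen for illustration. It only shows that an average over many breaks is identical whether each break carries signal on both sides or on a single, alternating side.

```python
# Toy illustration with hypothetical numbers (not thesis data): a
# population-averaged profile cannot distinguish symmetric spreading from
# an equal mixture of one-sided spreading events.

def one_side_profile(positions, side, dip_kb=1.5, extent_kb=5.0):
    """Signal of 1.0 on one side of the break, between dip_kb and extent_kb."""
    return [1.0 if dip_kb <= side * x <= extent_kb else 0.0 for x in positions]


def average(profiles):
    """Bin-by-bin mean over a list of per-break profiles."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


# Bins from -7 kb to +7 kb around the break, 0.5 kb apart (assumed layout).
positions = [i * 0.5 for i in range(-14, 15)]

upstream = one_side_profile(positions, side=-1)
downstream = one_side_profile(positions, side=+1)

# Model A: every break carries half-intensity signal on both sides.
symmetric = [0.5 * (u + d) for u, d in zip(upstream, downstream)]
model_a = average([symmetric] * 100)

# Model B: half of the breaks carry signal upstream only, half downstream only.
model_b = average([upstream] * 50 + [downstream] * 50)

# Both averages show the same bimodal shape with a dip at the break.
print(model_a == model_b)
```

Under these assumptions, both models yield the same averaged profile, which is why an approach resolving individual breaks or individual cells would be needed to discriminate between them.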

B. G4 against R-loops?

Results obtained with BLM in DSB repair suggest a role similar to that of SETX. Indeed, not only did we find BLM at DSB induced in active genes, but we also observed that both proteins promote HR (Figure 20 and 21A). Surprisingly, however, those proteins do not have the same impact on cell survival. Not only does BLM depletion not affect survival by itself, but it actually rescued the survival defect caused by SETX depletion (Figure 22A). We found similar results when SETX depletion was combined with depletion of other DNA helicases, suggesting that BLM DNA helicase activity is responsible for this phenotype. Furthermore, as BLM is a G4 helicase, its recruitment at DSB could destabilize RNA: DNA hybrids. Indeed, it was proposed that G4 stabilize R-loops (Hänsel-Hertsch et al., 2018; Marsico et al., 2019). However, BLM depletion actually led to a decrease of RNA: DNA hybrids at the break, suggesting that BLM takes part in RNA: DNA hybrids accumulation at the break (Figure 23). Those results suggest that RNA: DNA hybrids could be destabilized by G4 formation, which seems to be in contradiction with previous studies (Figure 27). Indeed, it can be envisaged that G-quadruplex formation could rather destabilize RNA hybridization on the ssDNA after DSB induction (Figure 27A and 27B). It would be interesting to compare DRIPc-seq data, which reveal the strand specificity of RNA: DNA hybrids, with G4 ChIP-seq data to determine if G4 are formed on the DNA strand involved in hybrids. If this is the case, it could explain how G4 accumulation can limit RNA: DNA hybrids accumulation. Overall, those data suggest that cell survival is dependent on RNA: DNA hybrids resolution and not on repair pathway choice. Indeed, we observed that the HR defect caused by BLM depletion did not result in a cell survival defect. Moreover, even though BLM depletion led to a decrease of resection in SETX-depleted cells, resection does not seem to be required for viability. Indeed, CtIP depletion had no impact on cell survival by itself and did not rescue SETX depletion (Figure 24). To confirm that cell survival is dependent on RNA: DNA hybrids accumulation and not on resection, it would be interesting to perform DRIP in SETX- and CtIP-depleted cells and to see whether RNA: DNA hybrids accumulate at the break. If this is the case, it would suggest that RNA: DNA hybrids do not necessarily form on the resected strand at the break. Additional experiments are required to determine the exact interplay between BLM and SETX.

Figure 27: G-quadruplex and RNA: DNA hybrids interplay at DSB. A. Working model: 1- After resection (DNA2, Exo1), ssDNA can form a G4, targeted by the BLM helicase (G4 resolution). 2- Once the G4 is resolved, RNA can hybridize with ssDNA and is in turn targeted by SETX (RNA: DNA hybrids removal). Resolution of those secondary structures allows RAD51 binding and therefore HR. B. BLM and SETX depletion: BLM depletion could promote G4 stabilization, making it impossible for RNA to hybridize with ssDNA (cell survival). SETX depletion promotes RNA: DNA hybrids accumulation (cell death). Both mechanisms lead to an HR defect; however, RNA: DNA hybrids accumulation is damaging for cell survival.

The completion of this study would allow us to specify how transcribed loci are repaired. More specifically, it would shed light on how secondary structures such as G4 and R-loops influence repair and, furthermore, on how RNA and DNA helicases influence DSB repair and cell survival. That information could help determine how mutations in genes coding for those proteins can lead to human diseases such as neuropathies and cancers.


Bibliography

Abbas, M., Shanmugam, I., Bsaili, M., Hromas, R., and Shaheen, M. (2014). The role of the human psoralen 4 (hPso4) protein complex in replication stress and homologous recombination. J Biol Chem 289, 14009-14019.
Adamson, B., Smogorzewska, A., Sigoillot, F.D., King, R.W., and Elledge, S.J. (2012). A genome-wide homologous recombination screen identifies the RNA-binding protein RBMX as a component of the DNA-damage response. Nat Cell Biol 14, 318-328.
Ahel, I., Rass, U., El-Khamisy, S.F., Katyal, S., Clements, P.M., McKinnon, P.J., Caldecott, K.W., and West, S.C. (2006). The neurodegenerative disease protein aprataxin resolves abortive DNA ligation intermediates. Nature 443, 713-716.
Ahn, J.Y., Schwarz, J.K., Piwnica-Worms, H., and Canman, C.E. (2000). Threonine 68 phosphorylation by ataxia telangiectasia mutated is required for efficient activation of Chk2 in response to ionizing radiation. Cancer Res 60, 5934-5936.
Ahnesorg, P., Smith, P., and Jackson, S.P. (2006). XLF interacts with the XRCC4-DNA ligase IV complex to promote DNA nonhomologous end-joining. Cell 124, 301-313.
Alzu, A., Bermejo, R., Begnis, M., Lucca, C., Piccini, D., Carotenuto, W., Saponaro, M., Brambati, A., Cocito, A., Foiani, M., et al. (2012). Senataxin associates with replication forks to protect fork integrity across RNA-polymerase-II-transcribed genes. Cell 151, 835-846.
Amon, J.D., and Koshland, D. (2016). RNase H enables efficient repair of R-loop induced DNA damage. Elife 5.
Anderson, L., Henderson, C., and Adachi, Y. (2001). Phosphorylation and rapid relocalization of 53BP1 to nuclear foci upon DNA damage. Mol Cell Biol 21, 1719-1729.
Aparicio, T., Baer, R., and Gautier, J. (2014). DNA double-strand break repair pathway choice and cancer. DNA Repair (Amst) 19, 169-175.
Arndt, K.M., and Reines, D. (2015). Termination of Transcription of Short Noncoding RNAs by RNA Polymerase II. Annu Rev Biochem 84, 381-404.
Awwad, S.W., Abu-Zhayia, E.R., Guttmann-Raviv, N., and Ayoub, N. (2017). NELF-E is recruited to DNA double-strand break sites to promote transcriptional repression and repair. EMBO Rep 18, 745-764.
Aymard, F., Aguirrebengoa, M., Guillou, E., Javierre, B.M., Bugler, B., Arnould, C., Rocher, V., Iacovoni, J.S., Biernacka, A., Skrzypczak, M., et al. (2017). Genome-wide mapping of long-range contacts unveils clustering of DNA double-strand breaks at damaged active genes. Nat Struct Mol Biol 24, 353-361.
Aymard, F., Bugler, B., Schmidt, C.K., Guillou, E., Caron, P., Briois, S., Iacovoni, J.S., Daburon, V., Miller, K.M., Jackson, S.P., et al. (2014). Transcriptionally active chromatin recruits homologous recombination at DNA double-strand breaks. Nat Struct Mol Biol 21, 366-374.
Barlow, J.H., Faryabi, R.B., Callén, E., Wong, N., Malhowski, A., Chen, H.T., Gutierrez-Cruz, G., Sun, H.W., McKinnon, P., Wright, G., et al. (2013). Identification of early replicating fragile sites that contribute to genome instability. Cell 152, 620-632.
Barzilai, A., Schumacher, B., and Shiloh, Y. (2017). Genome instability: Linking ageing and brain degeneration. Mech Ageing Dev 161, 4-18.
Baumann, M., Pontiller, J., and Ernst, W. (2010). Structure and basal transcription complex of RNA polymerase II core promoters in the mammalian genome: an overview. Mol Biotechnol 45, 241-247.
Beamish, H., Kedar, P., Kaneko, H., Chen, P., Fukao, T., Peng, C., Beresten, S., Gueven, N., Purdie, D., Lees-Miller, S., et al. (2002). Functional link between BLM defective in Bloom's syndrome and the ataxia-telangiectasia-mutated protein, ATM. J Biol Chem 277, 30515-30523.
Becherel, O.J., Yeo, A.J., Stellati, A., Heng, E.Y., Luff, J., Suraweera, A.M., Woods, R., Fleming, J., Carrie, D., McKinney, K., et al. (2013). Senataxin plays an essential role with DNA damage response proteins in meiotic recombination and gene silencing. PLoS Genet 9, e1003435.

Beli, P., Lukashchuk, N., Wagner, S.A., Weinert, B.T., Olsen, J.V., Baskcomb, L., Mann, M., Jackson, S.P., and Choudhary, C. (2012). Proteomic investigations reveal a role for RNA processing factor THRAP3 in the DNA damage response. Mol Cell 46, 212-225.
Beucher, A., Birraux, J., Tchouandong, L., Barton, O., Shibata, A., Conrad, S., Goodarzi, A.A., Krempler, A., Jeggo, P.A., and Löbrich, M. (2009). ATM and Artemis promote homologous recombination of radiation-induced DNA double-strand breaks in G2. EMBO J 28, 3413-3427.
Bhatia, V., Barroso, S.I., García-Rubio, M.L., Tumini, E., Herrera-Moyano, E., and Aguilera, A. (2014). BRCA2 prevents R-loop accumulation and associates with TREX-2 mRNA export factor PCID2. Nature 511, 362-365.
Bollimpelli, V.S., Dholaniya, P.S., and Kondapi, A.K. (2017). Topoisomerase IIβ and its role in different biological contexts. Arch Biochem Biophys 633, 78-84.
Bonath, F., Domingo-Prim, J., Tarbier, M., Friedländer, M.R., and Visa, N. (2018). Next-generation sequencing reveals two populations of damage-induced small RNAs at endogenous DNA double-strand breaks. Nucleic Acids Res 46, 11869-11882.
Boque-Sastre, R., Soler, M., Oliveira-Mateos, C., Portela, A., Moutinho, C., Sayols, S., Villanueva, A., Esteller, M., and Guil, S. (2015). Head-to-head antisense transcription and R-loop formation promotes transcriptional activation. Proc Natl Acad Sci U S A 112, 5785-5790.
Boyes, J., and Bird, A. (1992). Repression of genes by DNA methylation depends on CpG density and promoter strength: evidence for involvement of a methyl-CpG binding protein. EMBO J 11, 327-333.
Britton, S., Dernoncourt, E., Delteil, C., Froment, C., Schiltz, O., Salles, B., Frit, P., and Calsou, P. (2014). DNA damage triggers SAF-A and RNA biogenesis factors exclusion from chromatin coupled to R-loops removal. Nucleic Acids Res 42, 9047-9062.
Brownlee, P.M., Chambers, A.L., Cloney, R., Bianchi, A., and Downs, J.A. (2014). BAF180 promotes cohesion and prevents genome instability and aneuploidy. Cell Rep 6, 973-981.
Bugreev, D.V., Yu, X., Egelman, E.H., and Mazin, A.V. (2007). Novel pro- and anti-recombination activities of the Bloom's syndrome helicase. Genes Dev 21, 3085-3094.
Buis, J., Stoneham, T., Spehalski, E., and Ferguson, D.O. (2012). Mre11 regulates CtIP-dependent double-strand break repair by interaction with CDK2. Nat Struct Mol Biol 19, 246-252.
Bunch, H., Lawney, B.P., Lin, Y.F., Asaithamby, A., Murshid, A., Wang, Y.E., Chen, B.P., and Calderwood, S.K. (2015). Transcriptional elongation requires DNA break-induced signalling. Nat Commun 6, 10191.
Buratowski, S., Hahn, S., Guarente, L., and Sharp, P.A. (1989). Five intermediate complexes in transcription initiation by RNA polymerase II. Cell 56, 549-561.
Burger, K., Schlackow, M., and Gullerova, M. (2019). Tyrosine kinase c-Abl couples RNA polymerase II transcription to DNA double-strand breaks. Nucleic Acids Res.
Cannan, W.J., and Pederson, D.S. (2016). Mechanisms and Consequences of Double-Strand DNA Break Formation in Chromatin. J Cell Physiol 231, 3-14.
Capp, J.P., Boudsocq, F., Besnard, A.G., Lopez, B.S., Cazaux, C., Hoffmann, J.S., and Canitrot, Y. (2007). Involvement of DNA polymerase mu in the repair of a specific subset of DNA double-strand breaks in mammalian cells. Nucleic Acids Res 35, 3551-3560.
Caron, P., Choudjaye, J., Clouaire, T., Bugler, B., Daburon, V., Aguirrebengoa, M., Mangeat, T., Iacovoni, J.S., Álvarez-Quilón, A., Cortés-Ledesma, F., et al. (2015). Non-redundant Functions of ATM and DNA-PKcs in Response to DNA Double-Strand Breaks. Cell Rep 13, 1598-1609.
Carvalho, S., Vítor, A.C., Sridhara, S.C., Martins, F.B., Raposo, A.C., Desterro, J.M., Ferreira, J., and de Almeida, S.F. (2014). SETD2 is required for DNA double-strand break repair and activation of the p53-mediated checkpoint. Elife 3, e02482.
Castel, S.E., and Martienssen, R.A. (2013). RNA interference in the nucleus: roles for small RNAs in transcription, epigenetics and beyond. Nat Rev Genet 14, 100-112.
Castel, S.E., Ren, J., Bhattacharjee, S., Chang, A.Y., Sánchez, M., Valbuena, A., Antequera, F., and Martienssen, R.A. (2014). Dicer promotes transcription termination at sites of replication stress to maintain genome stability. Cell 159, 572-583.

Cejka, P., Cannavo, E., Polaczek, P., Masuda-Sasa, T., Pokharel, S., Campbell, J.L., and Kowalczykowski, S.C. (2010). DNA end resection by Dna2-Sgs1-RPA and its stimulation by Top3-Rmi1 and Mre11-Rad50-Xrs2. Nature 467, 112-116.
Cejka, P., and Kowalczykowski, S.C. (2010). The full-length Saccharomyces cerevisiae Sgs1 protein is a vigorous DNA helicase that preferentially unwinds holliday junctions. J Biol Chem 285, 8290-8301.
Cerritelli, S.M., and Crouch, R.J. (2009). Ribonuclease H: the enzymes in eukaryotes. FEBS J 276, 1494-1505.
Chambers, V.S., Marsico, G., Boutell, J.M., Di Antonio, M., Smith, G.P., and Balasubramanian, S. (2015). High-throughput sequencing of DNA G-quadruplex structures in the human genome. Nat Biotechnol 33, 877-881.
Chang, E.Y., Novoa, C.A., Aristizabal, M.J., Coulombe, Y., Segovia, R., Chaturvedi, R., Shen, Y., Keong, C., Tam, A.S., Jones, S.J.M., et al. (2017). RECQ-like helicases Sgs1 and BLM regulate R-loop-associated genome instability. J Cell Biol 216, 3991-4005.
Chapman, J.R., Taylor, M.R., and Boulton, S.J. (2012). Playing the end game: DNA double-strand break repair pathway choice. Mol Cell 47, 497-510.
Chatterjee, S., Zagelbaum, J., Savitsky, P., Sturzenegger, A., Huttner, D., Janscak, P., Hickson, I.D., Gileadi, O., and Rothenberg, E. (2014). Mechanistic insight into the interaction of BLM helicase with intra-strand G-quadruplex structures. Nat Commun 5, 5556.
Chen, L., Chen, J.Y., Zhang, X., Gu, Y., Xiao, R., Shao, C., Tang, P., Qian, H., Luo, D., Li, H., et al. (2017). R-ChIP Using Inactive RNase H Reveals Dynamic Coupling of R-loops with Transcriptional Pausing at Gene Promoters. Mol Cell 68, 745-757.e745.
Chen, P.B., Chen, H.V., Acharya, D., Rando, O.J., and Fazzio, T.G. (2015). R loops regulate promoter-proximal chromatin architecture and cellular differentiation. Nat Struct Mol Biol 22, 999-1007.
Chen, Y., Farmer, A.A., Chen, C.F., Jones, D.C., Chen, P.L., and Lee, W.H. (1996). BRCA1 is a 220-kDa nuclear phosphoprotein that is expressed and phosphorylated in a cell cycle-dependent manner. Cancer Res 56, 3168-3172.
Chen, Y.Z., Bennett, C.L., Huynh, H.M., Blair, I.P., Puls, I., Irobi, J., Dierick, I., Abel, A., Kennerson, M.L., Rabin, B.A., et al. (2004). DNA/RNA helicase gene mutations in a form of juvenile amyotrophic lateral sclerosis (ALS4). Am J Hum Genet 74, 1128-1135.
Chernikova, S.B., Razorenova, O.V., Higgins, J.P., Sishc, B.J., Nicolau, M., Dorth, J.A., Chernikova, D.A., Kwok, S., Brooks, J.D., Bailey, S.M., et al. (2012). Deficiency in mammalian histone H2B ubiquitin ligase Bre1 (Rnf20/Rnf40) leads to replication stress and chromosomal instability. Cancer Res 72, 2111-2119.
Clouaire, T., and Legube, G. (2015). DNA double strand break repair pathway choice: a chromatin based decision? Nucleus 6, 107-113.
Clouaire, T., Rocher, V., Lashgari, A., Arnould, C., Aguirrebengoa, M., Biernacka, A., Skrzypczak, M., Aymard, F., Fongang, B., Dojer, N., et al. (2018). Comprehensive Mapping of Histone Modifications at DNA Double-Strand Breaks Deciphers Repair Pathway Chromatin Signatures. Mol Cell 72, 250-262.e256.
Cloutier, S.C., Wang, S., Ma, W.K., Al Husini, N., Dhoondia, Z., Ansari, A., Pascuzzi, P.E., and Tran, E.J. (2016). Regulated Formation of lncRNA-DNA Hybrids Enables Faster Transcriptional Induction and Environmental Adaptation. Mol Cell 62, 148.
Cohen, S., Puget, N., Lin, Y.L., Clouaire, T., Aguirrebengoa, M., Rocher, V., Pasero, P., Canitrot, Y., and Legube, G. (2018). Senataxin resolves RNA:DNA hybrids forming at DNA double-strand breaks to prevent translocations. Nat Commun 9, 533.
Colak, D., Zaninovic, N., Cohen, M.S., Rosenwaks, Z., Yang, W.Y., Gerhardt, J., Disney, M.D., and Jaffrey, S.R. (2014). Promoter-bound trinucleotide repeat mRNA drives epigenetic silencing in fragile X syndrome. Science 343, 1002-1005.
Conway, A.B., Lynch, T.W., Zhang, Y., Fortin, G.S., Fung, C.W., Symington, L.S., and Rice, P.A. (2004). Crystal structure of a Rad51 filament. Nat Struct Mol Biol 11, 791-796.
Corless, S., and Gilbert, N. (2016). Effects of DNA supercoiling on chromatin architecture. Biophys Rev 8, 245-258.

Costantini, S., Woodbine, L., Andreoli, L., Jeggo, P.A., and Vindigni, A. (2007). Interaction of the Ku heterodimer with the DNA ligase IV/Xrcc4 complex and its regulation by DNA-PK. DNA Repair (Amst) 6, 712-722. Costantino, L., and Koshland, D. (2018). Genome-wide Map of R-Loop-Induced Damage Reveals How a Subset of R-Loops Contributes to Genomic Instability. Mol Cell 71, 487- 497.e483. Cristini, A., Groh, M., Kristiansen, M.S., and Gromak, N. (2018). RNA/DNA Hybrid Interactome Identifies DXH9 as a Molecular Player in Transcriptional Termination and R-Loop-Associated DNA Damage. Cell Rep 23, 1891-1905. Cross, S.H., and Bird, A.P. (1995). CpG islands and genes. Curr Opin Genet Dev 5, 309-314. Crossley, M.P., Bocek, M., and Cimprich, K.A. (2019). R-Loops as Cellular Regulators and Genomic Threats. Mol Cell 73, 398-411. D'Alessandro, G., and d'Adda di Fagagna, F. (2017). Transcription and DNA Damage: Holding Hands or Crossing Swords? J Mol Biol 429, 3215-3229. D'Alessandro, G., Whelan, D.R., Howard, S.M., Vitelli, V., Renaudin, X., Adamowicz, M., Iannelli, F., Jones-Weinert, C.W., Lee, M., Matti, V., et al. (2018). BRCA2 controls DNA:RNA hybrid level at DSBs by mediating RNase H2 recruitment. Nat Commun 9, 5376. D'Amours, D., Desnoyers, S., D'Silva, I., and Poirier, G.G. (1999). Poly(ADP-ribosyl)ation reactions in the regulation of nuclear functions. Biochem J 342 ( Pt 2), 249-268. Date, H., Onodera, O., Tanaka, H., Iwabuchi, K., Uekawa, K., Igarashi, S., Koike, R., Hiroi, T., Yuasa, T., Awaya, Y., et al. (2001). Early-onset ataxia with ocular motor apraxia and hypoalbuminemia is caused by mutations in a new HIT superfamily gene. Nat Genet 29, 184- 188. Davies, S.L., North, P.S., Dart, A., Lakin, N.D., and Hickson, I.D. (2004). Phosphorylation of the Bloom's syndrome helicase and its role in recovery from S-phase arrest. Mol Cell Biol 24, 1279-1291. 
Day, T.A., Layer, J.V., Cleary, J.P., Guha, S., Stevenson, K.E., Tivey, T., Kim, S., Schinzel, A.C., Izzo, F., Doench, J., et al. (2017). PARP3 is a promoter of chromosomal rearrangements and limits G4 DNA. Nat Commun 8, 15110. De Amicis, A., Piane, M., Ferrari, F., Fanciulli, M., Delia, D., and Chessa, L. (2011). Role of senataxin in DNA damage and telomeric stability. DNA Repair (Amst) 10, 199-209. De, I., Bessonov, S., Hofele, R., dos Santos, K., Will, C.L., Urlaub, H., Lührmann, R., and Pena, V. (2015). The RNA helicase Aquarius exhibits structural adaptations mediating its recruitment to spliceosomes. Nat Struct Mol Biol 22, 138-144. de Murcia, G., and Ménissier de Murcia, J. (1994). Poly(ADP-ribose) polymerase: a molecular nick-sensor. Trends Biochem Sci 19, 172-176. Deaton, A.M., and Bird, A. (2011). CpG islands and the regulation of transcription. Genes Dev 25, 1010-1022. Deckbar, D., Jeggo, P.A., and Löbrich, M. (2011). Understanding the limitations of radiation- induced cell cycle checkpoints. Crit Rev Biochem Mol Biol 46, 271-283. DeJesus-Hernandez, M., Mackenzie, I.R., Boeve, B.F., Boxer, A.L., Baker, M., Rutherford, N.J., Nicholson, A.M., Finch, N.A., Flynn, H., Adamson, J., et al. (2011). Expanded GGGGCC hexanucleotide repeat in noncoding region of C9ORF72 causes chromosome 9p-linked FTD and ALS. Neuron 72, 245-256. Derr, L.K., and Strathern, J.N. (1993). A role for reverse transcripts in gene conversion. Nature 361, 170-173. Derr, L.K., Strathern, J.N., and Garfinkel, D.J. (1991). RNA-mediated recombination in S. cerevisiae. Cell 67, 355-364. Dhar, S., and Brosh, R.M. (2018). BLM's balancing act and the involvement of FANCJ in DNA repair. Cell Cycle 17, 2207-2220. Dianatpour, A., and Ghafouri-Fard, S. (2017). The Role of Long Non Coding RNAs in the Repair of DNA Double Strand Breaks. Int J Mol Cell Med 6, 1-12. Drolet, M., Phoenix, P., Menzel, R., Massé, E., Liu, L.F., and Crouch, R.J. (1995). 
Overexpression of RNase H partially complements the growth defect of an Escherichia coli delta topA mutant: R-loop formation is a major problem in the absence of DNA topoisomerase I. Proc Natl Acad Sci U S A 92, 3526-3530. Dumelie, J.G., and Jaffrey, S.R. (2017). Defining the location of promoter-associated R-loops at near-nucleotide resolution using bisDRIP-seq. Elife 6. Dumitrache, L.C., and McKinnon, P.J. (2017). Polynucleotide kinase-phosphatase (PNKP) mutations and neurologic disease. Mech Ageing Dev 161, 121-129. Duquette, M.L., Handa, P., Vincent, J.A., Taylor, A.F., and Maizels, N. (2004). Intracellular transcription of G-rich DNAs induces formation of G-loops, novel structures containing G4 DNA. Genes Dev 18, 1618-1629. Edmunds, J.W., Mahadevan, L.C., and Clayton, A.L. (2008). Dynamic histone H3 methylation during gene induction: HYPB/Setd2 mediates all H3K36 trimethylation. EMBO J 27, 406-420. Escribano-Díaz, C., Orthwein, A., Fradet-Turcotte, A., Xing, M., Young, J.T., Tkáč, J., Cook, M.A., Rosebrock, A.P., Munro, M., Canny, M.D., et al. (2013). A cell cycle-dependent regulatory circuit composed of 53BP1-RIF1 and BRCA1-CtIP controls DNA repair pathway choice. Mol Cell 49, 872-883. Francia, S., Cabrini, M., Matti, V., Oldani, A., and d'Adda di Fagagna, F. (2016). DICER, DROSHA and DNA damage response RNAs are necessary for the secondary recruitment of DNA damage response factors. J Cell Sci 129, 1468-1476. Francia, S., Michelini, F., Saxena, A., Tang, D., de Hoon, M., Anelli, V., Mione, M., Carninci, P., and d'Adda di Fagagna, F. (2012). Site-specific DICER and DROSHA RNA products control the DNA-damage response. Nature 488, 231-235. Frankenberg-Schwager, M., Gebauer, A., Koppe, C., Wolf, H., Pralle, E., and Frankenberg, D. (2009). Single-strand annealing, conservative homologous recombination, nonhomologous DNA end joining, and the cell cycle-dependent repair of DNA double-strand breaks induced by sparsely or densely ionizing radiation. Radiat Res 171, 265-273. 
Fratta, P., Mizielinska, S., Nicoll, A.J., Zloh, M., Fisher, E.M., Parkinson, G., and Isaacs, A.M. (2012). C9orf72 hexanucleotide repeat associated with amyotrophic lateral sclerosis and frontotemporal dementia forms RNA G-quadruplexes. Sci Rep 2, 1016. Fruman, D.A., and Rommel, C. (2014). PI3K and cancer: lessons, challenges and opportunities. Nat Rev Drug Discov 13, 140-156. Gao, M., Wei, W., Li, M.M., Wu, Y.S., Ba, Z., Jin, K.X., Liao, Y.Q., Adhikari, S., Chong, Z., Zhang, T., et al. (2014). Ago2 facilitates Rad51 recruitment and DNA double-strand break repair by homologous recombination. Cell Res 24, 532-541. García-Muse, T., and Aguilera, A. (2016). Transcription-replication conflicts: how they occur and how they are resolved. Nat Rev Mol Cell Biol 17, 553-563. Gavaldá, S., Gallardo, M., Luna, R., and Aguilera, A. (2013). R-loop mediated transcription- associated recombination in trf4Δ mutants reveals new links between RNA surveillance and genome integrity. PLoS One 8, e65541. Ghezraoui, H., Oliveira, C., Becker, J.R., Bilham, K., Moralli, D., Anzilotti, C., Fischer, R., Deobagkar-Lele, M., Sanchiz-Calvo, M., Fueyo-Marcos, E., et al. (2018). 53BP1 cooperation with the REV7-shieldin complex underpins DNA structure-specific NHEJ. Nature 560, 122-127. Ghildiyal, M., and Zamore, P.D. (2009). Small silencing RNAs: an expanding universe. Nat Rev Genet 10, 94-108. Ginno, P.A., Lott, P.L., Christensen, H.C., Korf, I., and Chedin, F. (2012a). R-loop formation is a distinctive characteristic of unmethylated human CpG island promoters. Mol Cell 45, 814- 825. Ginno, P.A., Lott, P.L., Christensen, H.C., Korf, I., and Chédin, F. (2012b). R-loop formation is a distinctive characteristic of unmethylated human CpG island promoters. Mol Cell 45, 814- 825. Glover, T.W., Berger, C., Coyle, J., and Echo, B. (1984). DNA polymerase alpha inhibition by aphidicolin induces gaps and breaks at common fragile sites in human chromosomes. Hum Genet 67, 136-142. Goldberg, M. (1979). Ph.D. 
Thesis. (Stanford University; Stanford, CA, U.S.A). Gong, F., Chiu, L.Y., Cox, B., Aymard, F., Clouaire, T., Leung, J.W., Cammarata, M., Perez, M., Agarwal, P., Brodbelt, J.S., et al. (2015). Screen identifies bromodomain protein ZMYND8 in chromatin recognition of transcription-associated DNA damage that promotes homologous recombination. Genes Dev 29, 197-211. Gong, F., Clouaire, T., Aguirrebengoa, M., Legube, G., and Miller, K.M. (2017). Histone demethylase KDM5A regulates the ZMYND8-NuRD chromatin remodeler to promote DNA repair. J Cell Biol 216, 1959-1974. Goodarzi, A.A., Noon, A.T., Deckbar, D., Ziv, Y., Shiloh, Y., Löbrich, M., and Jeggo, P.A. (2008). ATM signaling facilitates repair of DNA double-strand breaks associated with heterochromatin. Mol Cell 31, 167-177. Grabarz, A., Guirouilh-Barbat, J., Barascu, A., Pennarun, G., Genet, D., Rass, E., Germann, S.M., Bertrand, P., Hickson, I.D., and Lopez, B.S. (2013). A role for BLM in double-strand break repair pathway choice: prevention of CtIP/Mre11-mediated alternative nonhomologous end-joining. Cell Rep 5, 21-28. Grabczyk, E., Mancuso, M., and Sammarco, M.C. (2007). A persistent RNA.DNA hybrid formed by transcription of the Friedreich ataxia triplet repeat in live bacteria, and by T7 RNAP in vitro. Nucleic Acids Res 35, 5351-5359. Gravel, S., Chapman, J.R., Magill, C., and Jackson, S.P. (2008). DNA helicases Sgs1 and BLM promote DNA double-strand break resection. Genes Dev 22, 2767-2772. Groh, M., and Gromak, N. (2014). Out of balance: R-loops in human disease. PLoS Genet 10, e1004630. Groh, M., Lufino, M.M., Wade-Martins, R., and Gromak, N. (2014). R-loops associated with triplet repeat expansions promote gene silencing in Friedreich ataxia and fragile X syndrome. PLoS Genet 10, e1004318. Grunseich, C., Wang, I.X., Watts, J.A., Burdick, J.T., Guber, R.D., Zhu, Z., Bruzel, A., Lanman, T., Chen, K., Schindler, A.B., et al. (2018). Senataxin Mutation Reveals How R-Loops Promote Transcription by Blocking DNA Methylation at Gene Promoters. Mol Cell 69, 426-437.e427. Gupta, R., Somyajit, K., Narita, T., Maskey, E., Stanlie, A., Kremer, M., Typas, D., Lammers, M., Mailand, N., Nussenzweig, A., et al. (2018). 
DNA Repair Network Analysis Reveals Shieldin as a Key Regulator of NHEJ and PARP Inhibitor Sensitivity. Cell 173, 972-988.e923. Ha, M., and Kim, V.N. (2014). Regulation of microRNA biogenesis. Nat Rev Mol Cell Biol 15, 509-524. Haffner, M.C., Aryee, M.J., Toubaji, A., Esopi, D.M., Albadine, R., Gurel, B., Isaacs, W.B., Bova, G.S., Liu, W., Xu, J., et al. (2010). Androgen-induced TOP2B-mediated double-strand breaks and prostate cancer gene rearrangements. Nat Genet 42, 668-675. Halász, L., Karányi, Z., Boros-Oláh, B., Kuik-Rózsa, T., Sipos, É., Nagy, É., Mosolygó-L, Á., Mázló, A., Rajnavölgyi, É., Halmos, G., et al. (2017). RNA-DNA hybrid (R-loop) immunoprecipitation mapping: an analytical workflow to evaluate inherent biases. Genome Res 27, 1063-1073. Hamperl, S., Bocek, M.J., Saldivar, J.C., Swigut, T., and Cimprich, K.A. (2017). Transcription- Replication Conflict Orientation Modulates R-Loop Levels and Activates Distinct DNA Damage Responses. Cell 170, 774-786.e719. Hamperl, S., and Cimprich, K.A. (2014). The contribution of co-transcriptional RNA:DNA hybrid structures to DNA damage and genome instability. DNA Repair (Amst) 19, 84-94. Hartono, S.R., Malapert, A., Legros, P., Bernard, P., Chédin, F., and Vanoosthuyse, V. (2018). The Affinity of the S9.6 Antibody for Double-Stranded RNAs Impacts the Accurate Mapping of R-Loops in Fission Yeast. J Mol Biol 430, 272-284. Hartsuiker, E., Neale, M.J., and Carr, A.M. (2009). Distinct requirements for the Rad32(Mre11) nuclease and Ctp1(CtIP) in the removal of covalently bound topoisomerase I and II from DNA. Mol Cell 33, 117-123. Hatchi, E., Skourti-Stathaki, K., Ventz, S., Pinello, L., Yen, A., Kamieniarz-Gdula, K., Dimitrov, S., Pathania, S., McKinney, K.M., Eaton, M.L., et al. (2015). BRCA1 recruitment to transcriptional pause sites is required for R-loop-driven DNA damage repair. Mol Cell 57, 636- 647. Helmrich, A., Ballarino, M., and Tora, L. (2011). 
Collisions between replication and transcription complexes cause common fragile site instability at the longest human genes. Mol Cell 44, 966-977.

Herzel, L., Ottoz, D.S.M., Alpert, T., and Neugebauer, K.M. (2017). Splicing and transcription touch base: co-transcriptional spliceosome assembly and function. Nat Rev Mol Cell Biol 18, 637-650. Heyer, W.D. (2004). Recombination: Holliday junction resolution and crossover formation. Curr Biol 14, R56-58. Hong, X., Cadwell, G.W., and Kogoma, T. (1995). Escherichia coli RecG and RecA proteins in R-loop formation. EMBO J 14, 2385-2392. Huber, M.D., Lee, D.C., and Maizels, N. (2002). G4 DNA unwinding by BLM and Sgs1p: substrate specificity and substrate-specific inhibition. Nucleic Acids Res 30, 3954-3961. Huertas, P., and Aguilera, A. (2003). Cotranscriptionally formed DNA:RNA hybrids mediate transcription elongation impairment and transcription-associated recombination. Mol Cell 12, 711-721. Huertas, P., and Jackson, S.P. (2009). Human CtIP mediates cell cycle control of DNA end resection and double strand break repair. J Biol Chem 284, 9558-9565. Hustedt, N., and Durocher, D. (2016). The control of DNA repair by the cell cycle. Nat Cell Biol 19, 1-9. Hänsel-Hertsch, R., Spiegel, J., Marsico, G., Tannahill, D., and Balasubramanian, S. (2018). Genome-wide mapping of endogenous G-quadruplex DNA structures by chromatin immunoprecipitation and high-throughput sequencing. Nat Protoc 13, 551-564. Iacovoni, J.S., Caron, P., Lassadi, I., Nicolas, E., Massip, L., Trouche, D., and Legube, G. (2010). High-resolution profiling of gammaH2AX around DNA double strand breaks in the mammalian genome. EMBO J 29, 1446-1457. Iannelli, F., Galbiati, A., Capozzo, I., Nguyen, Q., Magnuson, B., Michelini, F., D'Alessandro, G., Cabrini, M., Roncador, M., Francia, S., et al. (2017). A damaged genome's transcriptional landscape through multilayered expression profiling around in situ-mapped DNA double-strand breaks. Nat Commun 8, 15656. Imamura, O., Fujita, K., Shimamoto, A., Tanabe, H., Takeda, S., Furuichi, Y., and Matsumoto, T. (2001). 
Bloom helicase is involved in DNA surveillance in early S phase in vertebrate cells. Oncogene 20, 1143-1151. Jackson, B.R., Noerenberg, M., and Whitehouse, A. (2014). A novel mechanism inducing genome instability in Kaposi's sarcoma-associated herpesvirus infected cells. PLoS Pathog 10, e1004098. Jiang, B., Glover, J.N., and Weinfeld, M. (2017). Neurological disorders associated with DNA strand-break processing enzymes. Mech Ageing Dev 161, 130-140. Jilani, A., Ramotar, D., Slack, C., Ong, C., Yang, X.M., Scherer, S.W., and Lasko, D.D. (1999). Molecular cloning of the human gene, PNKP, encoding a polynucleotide kinase 3'- phosphatase and evidence for its role in repair of DNA strand breaks caused by oxidative damage. J Biol Chem 274, 24176-24186. Jimeno, S., Luna, R., García-Rubio, M., and Aguilera, A. (2006). Tho1, a novel hnRNP, and Sub2 provide alternative pathways for mRNP biogenesis in yeast THO mutants. Mol Cell Biol 26, 4387-4398. Jonkers, I., and Lis, J.T. (2015). Getting up to speed with transcription elongation by RNA polymerase II. Nat Rev Mol Cell Biol 16, 167-177. Ju, B.G., Lunyak, V.V., Perissi, V., Garcia-Bassets, I., Rose, D.W., Glass, C.K., and Rosenfeld, M.G. (2006). A topoisomerase IIbeta-mediated dsDNA break required for regulated transcription. Science 312, 1798-1802. Jørgensen, S., Schotta, G., and Sørensen, C.S. (2013). Histone H4 lysine 20 methylation: key player in epigenetic regulation of genomic integrity. Nucleic Acids Res 41, 2797-2806. Kakarougkas, A., Ismail, A., Chambers, A.L., Riballo, E., Herbert, A.D., Künzel, J., Löbrich, M., Jeggo, P.A., and Downs, J.A. (2014). Requirement for PBAF in transcriptional repression and repair at DNA breaks in actively transcribed regions of chromatin. Mol Cell 55, 723-732. Kanno, S., Kuzuoka, H., Sasao, S., Hong, Z., Lan, L., Nakajima, S., and Yasui, A. (2007). A novel human AP endonuclease with conserved zinc-finger-like motifs involved in DNA strand break responses. EMBO J 26, 2094-2103.

Keskin, H., Shen, Y., Huang, F., Patel, M., Yang, T., Ashley, K., Mazin, A.V., and Storici, F. (2014). Transcript-RNA-templated DNA recombination and repair. Nature 515, 436-439. Kim, H.D., Choe, J., and Seo, Y.S. (1999). The sen1(+) gene of Schizosaccharomyces pombe, a homologue of budding yeast SEN1, encodes an RNA and DNA helicase. Biochemistry 38, 14697-14710. Kim, V.N., Han, J., and Siomi, M.C. (2009). Biogenesis of small RNAs in animals. Nat Rev Mol Cell Biol 10, 126-139. Klein, I.A., Resch, W., Jankovic, M., Oliveira, T., Yamane, A., Nakahashi, H., Di Virgilio, M., Bothmer, A., Nussenzweig, A., Robbiani, D.F., et al. (2011). Translocation-capture sequencing reveals the extent and nature of chromosomal rearrangements in B lymphocytes. Cell 147, 95- 106. Kuehner, J.N., Pearson, E.L., and Moore, C. (2011). Unravelling the means to an end: RNA polymerase II transcription termination. Nat Rev Mol Cell Biol 12, 283-294. Kumari, A., Owen, N., Juarez, E., and McCullough, A.K. (2015). BLM protein mitigates formaldehyde-induced genomic instability. DNA Repair (Amst) 28, 73-82. Kusumoto, R., Dawut, L., Marchetti, C., Wan Lee, J., Vindigni, A., Ramsden, D., and Bohr, V.A. (2008). Werner protein cooperates with the XRCC4-DNA ligase IV complex in end- processing. Biochemistry 47, 7548-7556. König, F., Schubert, T., and Längst, G. (2017). The monoclonal S9.6 antibody exhibits highly variable binding affinities towards different R-loop sequences. PLoS One 12, e0178875. Lavin, M.F. (2004). The Mre11 complex and ATM: a two-way functional interaction in recognising and signaling DNA double strand breaks. DNA Repair (Amst) 3, 1515-1520. Le Tallec, B., Dutrillaux, B., Lachages, A.M., Millot, G.A., Brison, O., and Debatisse, M. (2011). Molecular profiling of common fragile sites in human fibroblasts. Nat Struct Mol Biol 18, 1421- 1423. Le Tallec, B., Millot, G.A., Blin, M.E., Brison, O., Dutrillaux, B., and Debatisse, M. (2013). 
Common fragile site profiling in epithelial and erythroid cells reveals that most recurrent cancer deletions lie in fragile sites hosting large genes. Cell Rep 4, 420-428. Lee, J.H., and Paull, T.T. (2005). ATM activation by DNA double-strand breaks through the Mre11-Rad50-Nbs1 complex. Science 308, 551-554. Lee, J.W., Blanco, L., Zhou, T., Garcia-Diaz, M., Bebenek, K., Kunkel, T.A., Wang, Z., and Povirk, L.F. (2004). Implication of DNA polymerase lambda in alignment-based gap filling for nonhomologous DNA end joining in human nuclear extracts. J Biol Chem 279, 805-811. Letessier, A., Millot, G.A., Koundrioukoff, S., Lachagès, A.M., Vogt, N., Hansen, R.S., Malfoy, B., Brison, O., and Debatisse, M. (2011). Cell-type-specific replication initiation programs set fragility of the FRA3B fragile site. Nature 470, 120-123. Li, B., Carey, M., and Workman, J.L. (2007a). The role of chromatin during transcription. Cell 128, 707-719. Li, L., Monckton, E.A., and Godbout, R. (2008). A role for DEAD box 1 at DNA double-strand breaks. Mol Cell Biol 28, 6413-6425. Li, L., Poon, H.Y., Hildebrandt, M.R., Monckton, E.A., Germain, D.R., Fahlman, R.P., and Godbout, R. (2017). Role for RIF1-interacting partner DDX1 in BLM recruitment to DNA double-strand breaks. DNA Repair (Amst) 55, 47-63. Li, X., and Manley, J.L. (2005). Inactivation of the SR protein splicing factor ASF/SF2 results in genomic instability. Cell 122, 365-378. Li, X., Niu, T., and Manley, J.L. (2007b). The RNA binding protein RNPS1 alleviates ASF/SF2 depletion-induced genomic instability. RNA 13, 2108-2115. Lim, Y.W., Sanz, L.A., Xu, X., Hartono, S.R., and Chédin, F. (2015). Genome-wide DNA hypomethylation and RNA:DNA hybrid accumulation in Aicardi-Goutières syndrome. Elife 4. Lin, Y., Dent, S.Y., Wilson, J.H., Wells, R.D., and Napierala, M. (2010). R loops stimulate genetic instability of CTG.CAG repeats. Proc Natl Acad Sci U S A 107, 692-697. Lin, Y.L., and Pasero, P. (2012). 
Interference between DNA replication and transcription as a cause of genomic instability. Curr Genomics 13, 65-73.

Lippert, M.J., Kim, N., Cho, J.E., Larson, R.P., Schoenly, N.E., O'Shea, S.H., and Jinks-Robertson, S. (2011). Role for topoisomerase 1 in transcription-associated mutagenesis in yeast. Proc Natl Acad Sci U S A 108, 698-703. London, T.B., Barber, L.J., Mosedale, G., Kelly, G.P., Balasubramanian, S., Hickson, I.D., Boulton, S.J., and Hiom, K. (2008). FANCJ is a structure-specific DNA helicase associated with the maintenance of genomic G/C tracts. J Biol Chem 283, 36132-36139. Loomis, E.W., Sanz, L.A., Chédin, F., and Hagerman, P.J. (2014). Transcription-associated R-loop formation across the human FMR1 CGG-repeat region. PLoS Genet 10, e1004294. Lu, W.T., Hawley, B.R., Skalka, G.L., Baldock, R.A., Smith, E.M., Bader, A.S., Malewicz, M., Watts, F.Z., Wilczynska, A., and Bushell, M. (2018). Drosha drives the formation of DNA:RNA hybrids around DNA break sites to facilitate DNA repair. Nat Commun 9, 532. Madabhushi, R., Gao, F., Pfenning, A.R., Pan, L., Yamakawa, S., Seo, J., Rueda, R., Phan, T.X., Yamakawa, H., Pao, P.C., et al. (2015). Activity-Induced DNA Breaks Govern the Expression of Neuronal Early-Response Genes. Cell 161, 1592-1605. Majounie, E., Abramzon, Y., Renton, A.E., Keller, M.F., Traynor, B.J., and Singleton, A.B. (2012). Large C9orf72 repeat expansions are not a common cause of Parkinson's disease. Neurobiol Aging 33, 2527.e2521-2522. Makharashvili, N., Arora, S., Yin, Y., Fu, Q., Wen, X., Lee, J.H., Kao, C.H., Leung, J.W., Miller, K.M., and Paull, T.T. (2018). Sae2/CtIP prevents R-loop accumulation in eukaryotic cells. Elife 7. Marin-Vicente, C., Domingo-Prim, J., Eberle, A.B., and Visa, N. (2015). RRP6/EXOSC10 is required for the repair of DNA double-strand breaks by homologous recombination. J Cell Sci 128, 1097-1107. Marnef, A., Cohen, S., and Legube, G. (2017). Transcription-Coupled DNA Double-Strand Break Repair: Active Genes Need Special Care. J Mol Biol 429, 1277-1288. 
Marsico, G., Chambers, V.S., Sahakyan, A.B., McCauley, P., Boutell, J.M., Di Antonio, M., and Balasubramanian, S. (2019). Whole genome experimental maps of DNA G-quadruplexes in multiple species. Nucleic Acids Res. Maréchal, A., Li, J.M., Ji, X.Y., Wu, C.S., Yazinski, S.A., Nguyen, H.D., Liu, S., Jiménez, A.E., Jin, J., and Zou, L. (2014). PRP19 transforms into a sensor of RPA-ssDNA after DNA damage and drives ATR activation via a ubiquitin-mediated circuitry. Mol Cell 53, 235-246. Mattiroli, F., Vissers, J.H., van Dijk, W.J., Ikpa, P., Citterio, E., Vermeulen, W., Marteijn, J.A., and Sixma, T.K. (2012). RNF168 ubiquitinates K13-15 on H2A/H2AX to drive DNA damage signaling. Cell 150, 1182-1195. Maya, R., Balass, M., Kim, S.T., Shkedy, D., Leal, J.F., Shifman, O., Moas, M., Buschmann, T., Ronai, Z., Shiloh, Y., et al. (2001). ATM-dependent phosphorylation of Mdm2 on serine 395: role in p53 activation by DNA damage. Genes Dev 15, 1067-1077. Mayer, A., Landry, H.M., and Churchman, L.S. (2017). Pause & go: from the discovery of RNA polymerase pausing to its functional implications. Curr Opin Cell Biol 46, 72-80. McIvor, E.I., Polak, U., and Napierala, M. (2010). New insights into repeat instability: role of RNA•DNA hybrids. RNA Biol 7, 551-558. Meers, C., Keskin, H., and Storici, F. (2016). DNA repair by RNA: Templated, or not templated, that is the question. DNA Repair (Amst) 44, 17-21. Meisenberg, C., Pinder, S.I., Hopkins, S.R., Wooller, S.K., Benstead-Hume, G., Pearl, F.M.G., Jeggo, P.A., and Downs, J.A. (2019). Repression of Transcription at DNA Breaks Requires Cohesin throughout Interphase and Prevents Genome Instability. Mol Cell 73, 212-223.e217. Michalik, K.M., Böttcher, R., and Förstemann, K. (2012). A small RNA response at DNA ends in Drosophila. Nucleic Acids Res 40, 9596-9603. Michelini, F., Jalihal, A.P., Francia, S., Meers, C., Neeb, Z.T., Rossiello, F., Gioia, U., Aguado, J., Jones-Weinert, C., Luke, B., et al. (2018). 
From "Cellular" RNA to "Smart" RNA: Multiple Roles of RNA in Genome Stability and Beyond. Chem Rev 118, 4365-4403. Michelini, F., Pitchiaya, S., Vitelli, V., Sharma, S., Gioia, U., Pessina, F., Cabrini, M., Wang, Y., Capozzo, I., Iannelli, F., et al. (2017). Damage-induced lncRNAs control the DNA damage response through interaction with DDRNAs at individual double-strand breaks. Nat Cell Biol 19, 1400-1411.

Mirman, Z., Lottersberger, F., Takai, H., Kibe, T., Gong, Y., Takai, K., Bianchi, A., Zimmermann, M., Durocher, D., and de Lange, T. (2018). 53BP1-RIF1-shieldin counteracts DSB resection through CST- and Polα-dependent fill-in. Nature 560, 112-116. Mischo, H.E., Gómez-González, B., Grzechnik, P., Rondón, A.G., Wei, W., Steinmetz, L., Aguilera, A., and Proudfoot, N.J. (2011). Yeast Sen1 helicase protects the genome from transcription-associated instability. Mol Cell 41, 21-32. Mohaghegh, P., Karow, J.K., Brosh, R.M., Jr., Bohr, V.A., and Hickson, I.D. (2001). The Bloom's and Werner's syndrome proteins are DNA structure-specific helicases. Nucleic Acids Res 29, 2843-2849. Morales, J.C., Richard, P., Patidar, P.L., Motea, E.A., Dang, T.T., Manley, J.L., and Boothman, D.A. (2016). XRN2 Links Transcription Termination to DNA Damage and Replication Stress. PLoS Genet 12, e1006107. Moreira, M.C., Klur, S., Watanabe, M., Németh, A.H., Le Ber, I., Moniz, J.C., Tranchant, C., Aubourg, P., Tazir, M., Schöls, L., et al. (2004). Senataxin, the ortholog of a yeast RNA helicase, is mutant in ataxia-ocular apraxia 2. Nat Genet 36, 225-227. Morrish, T.A., Garcia-Perez, J.L., Stamato, T.D., Taccioli, G.E., Sekiguchi, J., and Moran, J.V. (2007). Endonuclease-independent LINE-1 retrotransposition at mammalian telomeres. Nature 446, 208-212. Morrish, T.A., Gilbert, N., Myers, J.S., Vincent, B.J., Stamato, T.D., Taccioli, G.E., Batzer, M.A., and Moran, J.V. (2002). DNA repair mediated by endonuclease-independent LINE-1 retrotransposition. Nat Genet 31, 159-165. Mullenders, L. (2015). DNA damage mediated transcription arrest: Step back to go forward. DNA Repair (Amst) 36, 28-35. Méchali, M. (2010). Eukaryotic DNA replication origins: many choices for appropriate answers. Nat Rev Mol Cell Biol 11, 728-738. Nakamura, K., Saredi, G., Becker, J.R., Foster, B.M., Nguyen, N.V., Beyer, T.E., Cesa, L.C., Faull, P.A., Lukauskas, S., Frimurer, T., et al. (2019). 
H4K20me0 recognition by BRCA1- BARD1 directs homologous recombination to sister chromatids. Nat Cell Biol 21, 311-318. Nimonkar, A.V., Genschel, J., Kinoshita, E., Polaczek, P., Campbell, J.L., Wyman, C., Modrich, P., and Kowalczykowski, S.C. (2011). BLM-DNA2-RPA-MRN and EXO1-BLM-RPA-MRN constitute two DNA end resection machineries for human DNA break repair. Genes Dev 25, 350-362. Noon, A.T., Shibata, A., Rief, N., Löbrich, M., Stewart, G.S., Jeggo, P.A., and Goodarzi, A.A. (2010). 53BP1-dependent robust localized KAP-1 phosphorylation is essential for heterochromatic DNA double-strand break repair. Nat Cell Biol 12, 177-184. Noordermeer, S.M., Adam, S., Setiaputra, D., Barazas, M., Pettitt, S.J., Ling, A.K., Olivieri, M., Álvarez-Quilón, A., Moatti, N., Zimmermann, M., et al. (2018). The shieldin complex mediates 53BP1-dependent DNA repair. Nature 560, 117-121. Ohle, C., Tesorero, R., Schermann, G., Dobrev, N., Sinning, I., and Fischer, T. (2016). Transient RNA-DNA Hybrids Are Required for Efficient Double-Strand Break Repair. Cell 167, 1001-1013.e1007. Onozawa, M., Zhang, Z., Kim, Y.J., Goldberg, L., Varga, T., Bergsagel, P.L., Kuehl, W.M., and Aplan, P.D. (2014). Repair of DNA double-strand breaks by templated nucleotide sequence insertions derived from distant regions of the genome. Proc Natl Acad Sci U S A 111, 7729- 7734. Orthwein, A., Noordermeer, S.M., Wilson, M.D., Landry, S., Enchev, R.I., Sherker, A., Munro, M., Pinder, J., Salsman, J., Dellaire, G., et al. (2015). A mechanism for the suppression of homologous recombination in G1 cells. Nature 528, 422-426. Pankotai, T., Bonhomme, C., Chen, D., and Soutoglou, E. (2012). DNAPKcs-dependent arrest of RNA polymerase II transcription in the presence of DNA breaks. Nat Struct Mol Biol 19, 276- 282. Paull, T.T., and Lee, J.H. (2005). The Mre11/Rad50/Nbs1 complex and its role as a DNA double-strand break sensor for ATM. Cell Cycle 4, 737-740. Pearson, C.E., Nichol Edamura, K., and Cleary, J.D. (2005). 
Repeat instability: mechanisms of dynamic mutations. Nat Rev Genet 6, 729-742.

Pefanis, E., Wang, J., Rothschild, G., Lim, J., Kazadi, D., Sun, J., Federation, A., Chao, J., Elliott, O., Liu, Z.P., et al. (2015). RNA exosome-regulated long non-coding RNA transcription controls super-enhancer activity. Cell 161, 774-789. Pellegrini, L., Yu, D.S., Lo, T., Anand, S., Lee, M., Blundell, T.L., and Venkitaraman, A.R. (2002). Insights into DNA recombination from the structure of a RAD51-BRCA2 complex. Nature 420, 287-293. Pellegrino, S., Michelena, J., Teloni, F., Imhof, R., and Altmeyer, M. (2017). Replication-Coupled Dilution of H4K20me2 Guides 53BP1 to Pre-replicative Chromatin. Cell Rep 19, 1819-1831. Pfister, S.X., Ahrabi, S., Zalmas, L.P., Sarkar, S., Aymard, F., Bachrati, C.Z., Helleday, T., Legube, G., La Thangue, N.B., Porter, A.C., et al. (2014). SETD2-dependent histone H3K36 trimethylation is required for homologous recombination repair and genome stability. Cell Rep 7, 2006-2018. Phillips, D.D., Garboczi, D.N., Singh, K., Hu, Z., Leppla, S.H., and Leysath, C.E. (2013). The sub-nanomolar binding of DNA-RNA hybrids by the single-chain Fv fragment of antibody S9.6. J Mol Recognit 26, 376-381. Pichierri, P., Franchitto, A., and Rosselli, F. (2004). BLM and the FANC proteins collaborate in a common pathway in response to stalled replication forks. EMBO J 23, 3154-3163. Porrua, O., Boudvillain, M., and Libri, D. (2016). Transcription Termination: Variations on Common Themes. Trends Genet 32, 508-522. Porrua, O., and Libri, D. (2015). Transcription termination and the control of the transcriptome: why, where and how to stop. Nat Rev Mol Cell Biol 16, 190-202. Quinn, J.J., and Chang, H.Y. (2016). Unique features of long non-coding RNA biogenesis and function. Nat Rev Genet 17, 47-62. Rao, B.S., and Devi, P.U. (2005). Response of S 180 murine tumor to bleomycin in combination with radiation and hyperthermia using micronucleus assay: a multimodality approach for therapeutic augmentation. Indian J Exp Biol 43, 596-600. 
Rass, E., Grabarz, A., Plo, I., Gautier, J., Bertrand, P., and Lopez, B.S. (2009). Role of Mre11 in chromosomal nonhomologous end joining in mammalian cells. Nat Struct Mol Biol 16, 819- 824. Raynard, S., Bussen, W., and Sung, P. (2006). A double Holliday junction dissolvasome comprising BLM, topoisomerase IIIalpha, and BLAP75. J Biol Chem 281, 13861-13864. Reijns, M.A., Rabe, B., Rigby, R.E., Mill, P., Astell, K.R., Lettice, L.A., Boyle, S., Leitch, A., Keighren, M., Kilanowski, F., et al. (2012). Enzymatic removal of ribonucleotides from DNA is essential for mammalian genome integrity and development. Cell 149, 1008-1022. Rice, G., Patrick, T., Parmar, R., Taylor, C.F., Aeby, A., Aicardi, J., Artuch, R., Montalto, S.A., Bacino, C.A., Barroso, B., et al. (2007). Clinical and molecular phenotype of Aicardi-Goutieres syndrome. Am J Hum Genet 81, 713-725. Richard, P., Feng, S., and Manley, J.L. (2013). A SUMO-dependent interaction between Senataxin and the exosome, disrupted in the neurodegenerative disease AOA2, targets the exosome to sites of transcription-induced DNA damage. Genes Dev 27, 2227-2232. Richard, P., and Manley, J.L. (2017). R Loops and Links to Human Disease. J Mol Biol 429, 3168-3180. Roberts, S.A., Strande, N., Burkhalter, M.D., Strom, C., Havener, J.M., Hasty, P., and Ramsden, D.A. (2010). Ku is a 5'-dRP/AP lyase that excises nucleotide damage near broken ends. Nature 464, 1214-1217. Rodriguez, R., Miller, K.M., Forment, J.V., Bradshaw, C.R., Nikan, M., Britton, S., Oelschlaegel, T., Xhemalce, B., Balasubramanian, S., and Jackson, S.P. (2012). Small- molecule-induced DNA damage identifies alternative DNA structures in human genes. Nat Chem Biol 8, 301-310. Roeder, R.G. (1996). The role of general initiation factors in transcription by RNA polymerase II. Trends Biochem Sci 21, 327-335. Rogakou, E.P., Boon, C., Redon, C., and Bonner, W.M. (1999). Megabase chromatin domains involved in DNA double-strand breaks in vivo. J Cell Biol 146, 905-916.

69

Rogakou, E.P., Pilch, D.R., Orr, A.H., Ivanova, V.S., and Bonner, W.M. (1998). DNA double- stranded breaks induce histone H2AX phosphorylation on serine 139. J Biol Chem 273, 5858- 5868. Rothblum-Oviatt, C., Wright, J., Lefton-Greif, M.A., McGrath-Morrow, S.A., Crawford, T.O., and Lederman, H.M. (2016). Ataxia telangiectasia: a review. Orphanet J Rare Dis 11, 159. Sainsbury, S., Bernecky, C., and Cramer, P. (2015). Structural basis of transcription initiation by RNA polymerase II. Nat Rev Mol Cell Biol 16, 129-143. Saldi, T., Cortazar, M.A., Sheridan, R.M., and Bentley, D.L. (2016). Coupling of RNA Polymerase II Transcription Elongation with Pre-mRNA Splicing. J Mol Biol 428, 2623-2635. Salvi, J.S., Chan, J.N., Szafranski, K., Liu, T.T., Wu, J.D., Olsen, J.B., Khanam, N., Poon, B.P., Emili, A., and Mekhail, K. (2014). Roles for Pbp1 and caloric restriction in genome and lifespan maintenance via suppression of RNA-DNA hybrids. Dev Cell 30, 177-191. Salvi, J.S., and Mekhail, K. (2015). R-loops highlight the nucleus in ALS. Nucleus 6, 23-29. Santos-Pereira, J.M., and Aguilera, A. (2015). R loops: new modulators of genome dynamics and function. Nat Rev Genet 16, 583-597. Sanz, L.A., Hartono, S.R., Lim, Y.W., Steyaert, S., Rajpurkar, A., Ginno, P.A., Xu, X., and Chédin, F. (2016). Prevalent, Dynamic, and Conserved R-Loop Structures Associate with Specific Epigenomic Signatures in Mammals. Mol Cell 63, 167-178. Sarni, D., and Kerem, B. (2016). The complex nature of fragile site plasticity and its importance in cancer. Curr Opin Cell Biol 40, 131-136. Sartori, A.A., Lukas, C., Coates, J., Mistrik, M., Fu, S., Bartek, J., Baer, R., Lukas, J., and Jackson, S.P. (2007). Human CtIP promotes DNA end resection. Nature 450, 509-514. Schellenberg, M.J., Tumbale, P.P., and Williams, R.S. (2015). Molecular underpinnings of Aprataxin RNA/DNA deadenylase function and dysfunction in neurological disease. Prog Biophys Mol Biol 117, 157-165. Schlegel, B.P., Jodelka, F.M., and Nunez, R. 
(2006). BRCA1 promotes induction of ssDNA by ionizing radiation. Cancer Res 66, 5181-5189. Schwab, R.A., Nieminuszczy, J., Shah, F., Langton, J., Lopez Martinez, D., Liang, C.C., Cohn, M.A., Gibbons, R.J., Deans, A.J., and Niedzwiedz, W. (2015). The Fanconi Anemia Pathway Maintains Genome Stability by Coordinating Replication and Transcription. Mol Cell 60, 351- 361. Schwer, B., Wei, P.C., Chang, A.N., Kao, J., Du, Z., Meyers, R.M., and Alt, F.W. (2016). Transcription-associated processes cause DNA double-strand breaks and translocations in neural stem/progenitor cells. Proc Natl Acad Sci U S A 113, 2258-2263. Shanbhag, N.M., Rafalska-Metcalf, I.U., Balane-Bolivar, C., Janicki, S.M., and Greenberg, R.A. (2010). ATM-dependent chromatin changes silence transcription in cis to DNA double- strand breaks. Cell 141, 970-981. Shandilya, J., and Roberts, S.G. (2012). The transcription cycle in eukaryotes: from productive initiation to RNA polymerase II recycling. Biochim Biophys Acta 1819, 391-400. Shibata, A., Conrad, S., Birraux, J., Geuting, V., Barton, O., Ismail, A., Kakarougkas, A., Meek, K., Taucher-Scholz, G., Löbrich, M., et al. (2011). Factors determining DNA double-strand break repair pathway choice in G2 phase. EMBO J 30, 1079-1092. Shivji, M.K.K., Renaudin, X., Williams, Ç., and Venkitaraman, A.R. (2018). BRCA2 Regulates Transcription Elongation by RNA Polymerase II to Prevent R-Loop Accumulation. Cell Rep 22, 1031-1039. Simon, N.E., Yuan, M., and Kai, M. (2017). RNA-binding protein RBM14 regulates dissociation and association of non-homologous end joining proteins. Cell Cycle 16, 1175-1180. Skourti-Stathaki, K., Kamieniarz-Gdula, K., and Proudfoot, N.J. (2014). R-loops induce repressive chromatin marks over mammalian gene terminators. Nature 516, 436-439. Skourti-Stathaki, K., and Proudfoot, N.J. (2014). A double-edged sword: R loops as threats to genome integrity and powerful regulators of gene expression. Genes Dev 28, 1384-1396. 
Skourti-Stathaki, K., Proudfoot, N.J., and Gromak, N. (2011). Human senataxin resolves RNA/DNA hybrids formed at transcriptional pause sites to promote Xrn2-dependent termination. Mol Cell 42, 794-805.

70 Sollier, J., and Cimprich, K.A. (2015). Breaking bad: R-loops and genome integrity. Trends Cell Biol 25, 514-522. Sollier, J., Stork, C.T., García-Rubio, M.L., Paulsen, R.D., Aguilera, A., and Cimprich, K.A. (2014). Transcription-coupled nucleotide excision repair factors promote R-loop-induced genome instability. Mol Cell 56, 777-785. Solovjeva, L.V., Svetlova, M.P., Chagin, V.O., and Tomilin, N.V. (2007). Inhibition of transcription at radiation-induced nuclear foci of phosphorylated histone H2AX in mammalian cells. Chromosome Res 15, 787-797. Stirling, P.C., Chan, Y.A., Minaker, S.W., Aristizabal, M.J., Barrett, I., Sipahimalani, P., Kobor, M.S., and Hieter, P. (2012). R-loop-mediated genome instability in mRNA cleavage and polyadenylation mutants. Genes Dev 26, 163-175. Storici, F., Bebenek, K., Kunkel, T.A., Gordenin, D.A., and Resnick, M.A. (2007). RNA- templated DNA repair. Nature 447, 338-341. Stork, C.T., Bocek, M., Crossley, M.P., Sollier, J., Sanz, L.A., Chédin, F., Swigut, T., and Cimprich, K.A. (2016). Co-transcriptional R-loops are the main cause of estrogen-induced DNA damage. Elife 5. Stucki, M., Clapperton, J.A., Mohammad, D., Yaffe, M.B., Smerdon, S.J., and Jackson, S.P. (2005). MDC1 directly binds phosphorylated histone H2AX to regulate cellular responses to DNA double-strand breaks. Cell 123, 1213-1226. Sturzenegger, A., Burdova, K., Kanagaraj, R., Levikova, M., Pinto, C., Cejka, P., and Janscak, P. (2014). DNA2 cooperates with the WRN and BLM RecQ helicases to mediate long-range DNA end resection in human cells. J Biol Chem 289, 27314-27326. Suberbielle, E., Sanchez, P.E., Kravitz, A.V., Wang, X., Ho, K., Eilertson, K., Devidze, N., Kreitzer, A.C., and Mucke, L. (2013). Physiologic brain activity causes DNA double-strand breaks in neurons, with exacerbation by amyloid-β. Nat Neurosci 16, 613-621. Sun, H., Yabuki, A., and Maizels, N. (2001). A human nuclease specific for G4 DNA. Proc Natl Acad Sci U S A 98, 12444-12449. 
Sun, Y., Jiang, X., Xu, Y., Ayrapetov, M.K., Moreau, L.A., Whetstine, J.R., and Price, B.D. (2009). Histone H3 methylation links DNA damage detection to activation of the tumour suppressor Tip60. Nat Cell Biol 11, 1376-1382. Suraweera, A., Becherel, O.J., Chen, P., Rundle, N., Woods, R., Nakamura, J., Gatei, M., Criscuolo, C., Filla, A., Chessa, L., et al. (2007). Senataxin, defective in ataxia oculomotor apraxia type 2, is involved in the defense against oxidative DNA damage. J Cell Biol 177, 969- 979. Symington, L.S., and Gautier, J. (2011). Double-strand break end resection and repair pathway choice. Annu Rev Genet 45, 247-271. Takahashi, T., Burguiere-Slezak, G., Van der Kemp, P.A., and Boiteux, S. (2011). Topoisomerase 1 provokes the formation of short deletions in repeated sequences upon high transcription in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 108, 692-697. Tanikawa, M., Sanjiv, K., Helleday, T., Herr, P., and Mortusewicz, O. (2016). The spliceosome U2 snRNP factors promote genome stability through distinct mechanisms; transcription of repair factors and R-loop processing. Oncogenesis 5, e280. Teloni, F., Michelena, J., Lezaja, A., Kilic, S., Ambrosi, C., Menon, S., Dobrovolna, J., Imhof, R., Janscak, P., Baubec, T., et al. (2019). Efficient Pre-mRNA Cleavage Prevents Replication- Stress-Associated Genome Instability. Mol Cell 73, 670-683.e612. Teng, Y., Yadav, T., Duan, M., Tan, J., Xiang, Y., Gao, B., Xu, J., Liang, Z., Liu, Y., Nakajima, S., et al. (2018). ROS-induced R loops trigger a transcription-coupled but BRCA1/2- independent homologous recombination pathway through CSB. Nat Commun 9, 4115. Thomas, M., White, R.L., and Davis, R.W. (1976). Hybridization of RNA to double-stranded DNA: formation of R-loops. Proc Natl Acad Sci U S A 73, 2294-2298. Tripathi, V., Agarwal, H., Priya, S., Batra, H., Modi, P., Pandey, M., Saha, D., Raghavan, S.C., and Sengupta, S. (2018). 
MRN complex-dependent recruitment of ubiquitylated BLM helicase to DSBs negatively regulates DNA repair pathways. Nat Commun 9, 1016. Truong, L.N., Li, Y., Shi, L.Z., Hwang, P.Y., He, J., Wang, H., Razavian, N., Berns, M.W., and Wu, X. (2013). Microhomology-mediated End Joining and Homologous Recombination share

71 the initial end resection step to repair DNA double-strand breaks in mammalian cells. Proc Natl Acad Sci U S A 110, 7720-7725. Ui, A., Nagaura, Y., and Yasui, A. (2015). Transcriptional elongation factor ENL phosphorylated by ATM recruits polycomb and switches off transcription for DSB repair. Mol Cell 58, 468-482. van Wietmarschen, N., Merzouk, S., Halsema, N., Spierings, D.C.J., Guryev, V., and Lansdorp, P.M. (2018). BLM helicase suppresses recombination at G-quadruplex motifs in transcribed genes. Nat Commun 9, 271. Wagschal, A., Rousset, E., Basavarajaiah, P., Contreras, X., Harwig, A., Laurent-Chabalier, S., Nakamura, M., Chen, X., Zhang, K., Meziane, O., et al. (2012). Microprocessor, Setx, Xrn2, and Rrp6 co-operate to induce premature termination of transcription by RNAPII. Cell 150, 1147-1157. Wahba, L., Amon, J.D., Koshland, D., and Vuica-Ross, M. (2011). RNase H and multiple RNA biogenesis factors cooperate to prevent RNA:DNA hybrids from generating genome instability. Mol Cell 44, 978-988. Wahba, L., Costantino, L., Tan, F.J., Zimmer, A., and Koshland, D. (2016). S1-DRIP-seq identifies high expression and polyA tracts as major contributors to R-loop formation. Genes Dev 30, 1327-1338. Wahba, L., Gore, S.K., and Koshland, D. (2013). The homologous recombination machinery modulates the formation of RNA-DNA hybrids and associated chromosome instability. Elife 2, e00505. Wahl, M.C., Will, C.L., and Lührmann, R. (2009). The spliceosome: design principles of a dynamic RNP machine. Cell 136, 701-718. Walker, J.R., Corpina, R.A., and Goldberg, J. (2001). Structure of the Ku heterodimer bound to DNA and its implications for double-strand break repair. Nature 412, 607-614. Wan, Y., Zheng, X., Chen, H., Guo, Y., Jiang, H., He, X., Zhu, X., and Zheng, Y. (2015). Splicing function of mitotic regulators links R-loop-mediated DNA damage to tumor cell killing. J Cell Biol 209, 235-246. Wang, Q., and Goldstein, M. (2016). 
Small RNAs Recruit Chromatin-Modifying Enzymes MMSET and Tip60 to Reconfigure Damaged DNA upon Double-Strand Break and Facilitate Repair. Cancer Res 76, 1904-1915. Warmerdam, D.O., and Kanaar, R. (2010). Dealing with DNA damage: relationships between checkpoint and repair pathways. Mutat Res 704, 2-11. Wei, L., Nakajima, S., Böhm, S., Bernstein, K.A., Shen, Z., Tsang, M., Levine, A.S., and Lan, L. (2015). DNA damage during the G0/G1 phase triggers RNA-templated, Cockayne syndrome B-dependent homologous recombination. Proc Natl Acad Sci U S A 112, E3495-3504. Wei, P.C., Chang, A.N., Kao, J., Du, Z., Meyers, R.M., Alt, F.W., and Schwer, B. (2016). Long Neural Genes Harbor Recurrent DNA Break Clusters in Neural Stem/Progenitor Cells. Cell 164, 644-655. Wei, P.C., Lee, C.S., Du, Z., Schwer, B., Zhang, Y., Kao, J., Zurita, J., and Alt, F.W. (2018). Three classes of recurrent DNA break clusters in brain progenitors identified by 3D proximity- based break joining assay. Proc Natl Acad Sci U S A 115, 1919-1924. Wei, W., Ba, Z., Gao, M., Wu, Y., Ma, Y., Amiard, S., White, C.I., Rendtlew Danielsen, J.M., Yang, Y.G., and Qi, Y. (2012). A role for small RNAs in DNA double-strand break repair. Cell 149, 101-112. Weterings, E., Verkaik, N.S., Brüggenwirth, H.T., Hoeijmakers, J.H., and van Gent, D.C. (2003). The role of DNA dependent protein kinase in synapsis of DNA ends. Nucleic Acids Res 31, 7238-7246. Weterings, E., Verkaik, N.S., Keijzers, G., Florea, B.I., Wang, S.Y., Ortega, L.G., Uematsu, N., Chen, D.J., and van Gent, D.C. (2009). The Ku80 carboxy terminus stimulates joining and artemis-mediated processing of DNA ends. Mol Cell Biol 29, 1134-1142. Williamson, L.M., and Lees-Miller, S.P. (2011). Estrogen receptor α-mediated transcription induces cell cycle-dependent DNA double-strand breaks. Carcinogenesis 32, 279-285. Wold, M.S. (1997). Replication protein A: a heterotrimeric, single-stranded DNA-binding protein required for eukaryotic DNA metabolism. 
Annu Rev Biochem 66, 61-92.

72 Wu, L., Davies, S.L., Levitt, N.C., and Hickson, I.D. (2001). Potential role for the BLM helicase in recombinational repair via a conserved interaction with RAD51. J Biol Chem 276, 19375- 19381. Wu, L., and Hickson, I.D. (2003). The Bloom's syndrome helicase suppresses crossing over during homologous recombination. Nature 426, 870-874. Wu, Y., Shin-ya, K., and Brosh, R.M. (2008). FANCJ helicase defective in Fanconia anemia and breast cancer unwinds G-quadruplex DNA to defend genomic stability. Mol Cell Biol 28, 4116-4128. Xie, A., Kwok, A., and Scully, R. (2009). Role of mammalian Mre11 in classical and alternative nonhomologous end joining. Nat Struct Mol Biol 16, 814-818. Xu, W., Xu, H., Li, K., Fan, Y., Liu, Y., Yang, X., and Sun, Q. (2017). The R-loop is a common chromatin feature of the Arabidopsis genome. Nat Plants 3, 704-714. Yan, C.T., Boboila, C., Souza, E.K., Franco, S., Hickernell, T.R., Murphy, M., Gumaste, S., Geyer, M., Zarrin, A.A., Manis, J.P., et al. (2007). IgH class switching and translocations use a robust non-classical end-joining pathway. Nature 449, 478-482. Yasuhara, T., Kato, R., Hagiwara, Y., Shiotani, B., Yamauchi, M., Nakada, S., Shibata, A., and Miyagawa, K. (2018). Human Rad52 Promotes XPG-Mediated R-loop Processing to Initiate Transcription-Associated Homologous Recombination Repair. Cell 175, 558-570.e511. Yoshimoto, R., Mayeda, A., Yoshida, M., and Nakagawa, S. (2016). MALAT1 long non-coding RNA in cancer. Biochim Biophys Acta 1859, 192-199. Yu, K., Chedin, F., Hsieh, C.L., Wilson, T.E., and Lieber, M.R. (2003). R-loops at immunoglobulin class switch regions in the chromosomes of stimulated B cells. Nat Immunol 4, 442-451. Yu, X., and Chen, J. (2004). DNA damage-induced cell cycle checkpoint control requires CtIP, a phosphorylation-dependent binding partner of BRCA1 C-terminal domains. Mol Cell Biol 24, 9478-9486. Yu, X., Fu, S., Lai, M., Baer, R., and Chen, J. (2006). 
BRCA1 ubiquitinates its phosphorylation- dependent binding partner CtIP. Genes Dev 20, 1721-1726. Yu, X., Li, Z., Zheng, H., Chan, M.T., and Wu, W.K. (2017). NEAT1: A novel cancer-related long non-coding RNA. Cell Prolif 50. Yuan, M., Eberhart, C.G., and Kai, M. (2014). RNA binding protein RBM14 promotes radio- resistance in glioblastoma by regulating DNA repair and cell differentiation. Oncotarget 5, 2820-2826. Yüce, Ö., and West, S.C. (2013). Senataxin, defective in the neurodegenerative disorder ataxia with oculomotor apraxia 2, lies at the interface of transcription and the DNA damage response. Mol Cell Biol 33, 406-417. Zhang, F., Ma, J., Wu, J., Ye, L., Cai, H., Xia, B., and Yu, X. (2009). PALB2 links BRCA1 and BRCA2 in the DNA-damage response. Curr Biol 19, 524-529. Zhang, X., Chiang, H.C., Wang, Y., Zhang, C., Smith, S., Zhao, X., Nair, S.J., Michalek, J., Jatoi, I., Lautner, M., et al. (2017). Attenuation of RNA polymerase II pausing mitigates BRCA1- associated R-loop accumulation and tumorigenesis. Nat Commun 8, 15908. Zhou, Y., Caron, P., Legube, G., and Paull, T.T. (2014). Quantitation of DNA double-strand break resection intermediates in human cells. Nucleic Acids Res 42, e19. Zou, L., and Elledge, S.J. (2003). Sensing DNA damage through ATRIP recognition of RPA- ssDNA complexes. Science 300, 1542-1548.

73

Annexes

Review

Transcription-Coupled DNA Double-Strand Break Repair: Active Genes Need Special Care

Aline Marnef †, Sarah Cohen † and Gaëlle Legube

LBCMCP, Centre de Biologie Intégrative (CBI), CNRS, Université de Toulouse, UT3, 118 Route de Narbonne, 31062 Toulouse, France

Correspondence to Gaëlle Legube: [email protected].
http://dx.doi.org/10.1016/j.jmb.2017.03.024
Edited by Maxwell Anthony

Abstract

For decades, it has been speculated that specific loci on eukaryotic chromosomes are inherently susceptible to breakage. The advent of high-throughput genomic technologies has now paved the way to their identification. A wealth of data suggests that transcriptionally active loci are particularly fragile and that a specific DNA damage response is activated and dedicated to their repair. Here, we review current understanding of the crosstalk between transcription and double-strand break repair, from the reasons underlying the intrinsic fragility of genes to the mechanisms that restore the integrity of damaged transcription units. © 2017 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Introduction studies identified active genes as particularly fragile and thus the theater of dedicated repair events. In this Chromosomes are exceptionally long molecules review, we discuss our current understanding of the that must be faithfully replicated and segregated crosstalk between transcription and double-strand during each cell cycle to provide genetic information break (DSB) repair, with a particular focus on recent to daughter cells. A large body of evidence supports evidence that (i) points to transcriptional activity as a the idea that the DNA double helix is irregular: it can major threat to the genome and (ii) sheds light on the form non-canonical structures such as R-loops (three- DNA damage response at active gene. stranded structures composed of RNA:DNA hybrids and single-stranded DNA), hairpins, G-quadruplex (G4), and underwound or over-twisted DNA helices Transcription as a Threat to Genome that are further translated into negative and positive Stability supercoiling [1]. Conversely, negative supercoiling destabilizes the DNA helix, favoring the formation of these atypical DNA structures [2]. In eukaryotes, DNA Genome-wide mapping of fragile sites and DSB- associates with over half a thousand of proteins to prone loci form chromatin, which adopts multiple conformational states from the linear “beads on a string” nucleosomal In the 70s, cytological studies of metaphase fiber to more complex structures such as chromatin chromosome spreads indicated that dividing cells loops and topologically associated domains [2].The experience recurrent DNA breakage events at certain transcription, replication, and repair machineries must positions [3]. These “common fragile sites” (CFS), cope with this great variety of secondary and tertiary which arise upon mild replication stress, were first structures if they are to accurately execute transcrip- defined as chromosomal bands with significantly tional programs and maintain genome integrity. 
elevated break/gap frequencies on mitotic chromo- It has been known for decades that some genomic somes. They were later mapped with higher resolution loci are particularly prone to breakage and instability, using fluorescence in situ hybridization. CFS have but it is only very recently that, owing to high-throughput been subjected to intense investigation, as they genomic techniques, their positions have been deter- coincide with translocation break points and are mined at near-nucleotide resolution. Importantly, these hotspots for gross chromosomal rearrangements in 0022-2836/© 2017 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). J Mol Biol (2017) 429, 1277–1288 1278 DSB repair at active genes cancer cells (for a review, see Ref. [4]). It has long [24], END-seq [25], and DSBCapture [26] have been been supposed that their fragility results from incom- further developed as more direct approaches to plete replication before segregation. Repli-Seq and mapping DSBs at the genomic scale in vivo. Impor- ChIP-seq analyses demonstrated that CFS are tantly, DSBCapture and BLESS recently allowed, for devoid of replication origins, are replicated late, the first time, the visualization of DSB induction at overlap with very long genes (N300 kb) [5–8],and active promoters [26,27]. Similarly, BLESS allowed accumulate DSB repair proteins [9]. Moreover, it also the clear identification of long genes as preferential became clear that CFS expression (i.e., frequency of sites of DSB following replication stress [23]. breakage) is tissue-specific [10,11]. This indicates that Taken together, these studies support the existence CFS are epigenetically defined rather than just of two classes of fragile sites that recurrently sequences inherently difficult to replicate. 
Accordingly, experience DSB: (1) long, active genes, which CFS instability coincides with the expression of the replicate late (CFS), and are particularly susceptible underlying gene [7]. Since these very long genes need to replicative stress; and (2) promoters of transcrip- more than one cell cycle to be entirely transcribed, it tionally active genes that replicate early. has been proposed that the collision between the replication and transcription machineries could lead to Molecular mechanisms underlying the fragility the slowing or stalling of replication forks [7]. Never- of active genes: R-loop and G4-driven DSBs theless, since not all expressed long genes are prone to breakage [10,12], instability might rather result from The molecular mechanisms underlying ERFS and secondary DNA structures and/or specific chromatin CFS fragility, and by extension, the susceptibility of features assembled on a subset of these very long active genes to breakage, are still not fully understood. active genes. At CFS, late firing and scarcity of replication origins ChIP-seq analyses of DNA-bound RPA, γH2AX, have been proposed to generate under replicated DNA and BRCA1 following hydroxyurea treatment led to the that would trigger a DSB at the next mitosis, in a identification of another class of fragile sites, named manner that depends on the Mus81 structure-specific early replicating fragile sites (ERFS) [13].ERFSare endonuclease [28–31]. However, this mechanism replicated early, and like CFS, their instability (detected unlikely applies to ERFS that replicate early and so as gaps on mitotic chromosomes) is cell-type-specific. have more time to complete replication. Notably, these regions are highly transcribed, and it is An alternative is suggested by recent work, which this transcriptional activity rather than the timing of indicated that two potentially interlinked structures, replication that promotes their instability [13]. 
Impor- R-loops and G4s, contribute to genomic instability tantly, binding of BRCA1 and RPA at ERFS is also (for reviews, see Refs. [32–34]). R-loops are three- detected without hydroxyurea treatment, suggesting stranded RNA:DNA hybrids that can form as RNA that breakage at these sites occurs to some extent polymerases progress through duplex DNA, by during normal replication [13]. Although not directly hybridization of the nascent RNA to the template linked to the occurrence of DNA damage, it has to be DNA strand [35,36] (for a review, see Ref. [32]). noted that other genome-wide profiles also revealed DNA:RNA hybrid immunoprecipitation (DRIP) using the accumulation of repair proteins at transcriptionally S9.6 antibody followed by next generation sequenc- active loci (P-DNAPK, BRCA1, PALB2) [14–16]. ing (DRIP-seq or DRIPc-seq) has recently allowed Novel sequencing-based techniques that allow the their genome wide profiling at a near-nucleotide direct mapping of DSBs on the genome at near- resolution and in a strand-specific manner [37–39]. nucleotide resolution have recently provided insights These studies revealed that R-loops form in the recurrent DSB landscape in different cell types. co-transcriptionally and accumulate at promoters The ability of endogenous DSBs to translocate to a that carry transcriptionally active marks. Most “bait” DSB (introduced by either Cas9 or I-SceI R-loops form at sites exhibiting GC-skew (increased endonucleases) is the basic principle of high-through- G over C on the non-template strand) due to increased put genome-wide translocation sequencing (HTGTS) thermodynamic stability of G-rich RNA hybridized with and translocation capture sequencing (TC‐Seq) tech- C-rich DNA strands [37–39]. In parallel, the first niques. These techniques mapped recurrent DSBs in genome-wide mapping of G4s, four-stranded B-cells and neuronal stem progenitor cells [17–22]. 
non-canonical DNA structures, which arise at G-rich Combined with global run-on experiments to assess loci, revealed that they form at transcriptionally active the ongoing transcription at the whole-genome level, promoters [40]. This raises the exciting possibility that these studies showed that (i) upon mild replicative G4 may arise on displaced G-rich DNA strand at stress (and to a lesser extent in unchallenged cycling R-loops. This structure termed G-loop has first been cells), clusters of DSBs occur in long active genes that proposed to happen on the immunoglobulin locus and replicate late [22], and (ii) DSB frequency is generally to contribute to class switch recombination [41] and higher in nucleosome-depleted regions at transcrip- later found to occur in vitro and on plasmid DNA in tional start sites of active genes and is proportional to Escherichia coli [42]. One attractive hypothesis is that transcription rate [17,18,21].BLESS[23], Break-seq upon transcription activation at GC-skewed DSB repair at active genes 1279 promoters, DNA unwinding facilitates the formation of produced by active endonucleolytic cleavage. Indeed, these coupled R-loop/G4 structures. On one hand, Stork et al. and Sollier et al. demonstrated that G-loops would favor transcription by stabilizing the XPG and XPF, known to generate ssDNA gaps initiation bubble. On the other hand, they could also during transcription-coupled nucleotide excision repair trigger fragility if not removed appropriately. Genetic (TC-NER), are required for DSB induction at R-loop- and biochemical analyses, cytogenetics of chromo- forming loci [49,50] (Fig. 1, left). 
This XPF/XPG- some spreads, and staining or mapping of DSB dependent R-loop processing may happen indepen- markers in organisms ranging from bacteria to higher dently of replication but may also be coordinated with eukaryotes have provided a wealth of evidence that DNA replication as a mean to remove secondary R-loop accumulation provokes DSB (for example, see structures counteracting DNA polymerase progression. Refs. [7,43–51]). The persistence of R-loops was also Second, R-loop/G4 structures may interfere directly recently shown to impede replication at CFS [52], with the progression of the replication fork and/or cause suggesting that R-loops may be an important deter- RNA polymerase pausing, which itself can impede fork minant of CFS instability. Similarly, G4 structures progression (Fig. 1, right). Fork deceleration or stalling impede replication fork progression and threaten upon encountering these roadblocks would give rise to genome integrity if not properly unfolded during under-replicated DNA, further triggering DNA breakage replication (for reviews, see Refs. [33,34]). R-loop at the following mitosis in a nuclease-dependent and/or G4-driven DSB could arise in a manner [29–31]. Moreover, prolonged fork stalling replication-dependent manner through several can also itself trigger nuclease-induced breaks and non-exclusive mechanisms (Fig. 1). First, R-loop fork collapse (for a review, see Ref. [55]). Notably, and G4 formation result in an unannealed DNA NER factors might also contribute to generate these strand, which is more sensitive to damaging agents. breaks, as the replication machinery encounters R-loop This could lead to a higher frequency of single-strand structures [49,50]. Finally, collision between replication breakage (SSB), leading to fork collapse and DSB and transcription machineries in either head-on formation during subsequent replication (reviewed in (converging) or co-directional orientations can directly Refs. [32,53,54]). 
In addition, such nicks may also be generate DSBs in E. coli [56,57] (for reviews, see

Fig. 1. Possible mechanisms for R-loop/G4-driven DSB production. R-loops and G4 accumulate preferentially at transcriptionally active genes, mainly in nucleosome-depleted promoter regions. Both structures provoke replication- dependent DSB formation. Left panel: Displaced single-stranded DNA is more sensitive to DNA-damaging agents. In addition, R-loops can be targeted by the endonucleases XPG and XPF, components of the TC-NER machinery that generates a single-strand gap. In both cases, SSBs lead to fork collapse and DSB generation during replication. Right panel: G4 and/or R-loops can directly interfere with the progression of the replication fork, provoking DNA polymerase retardation or stalling. Under replicated DNA further experiences breakage during mitosis in a manner that depends on structure-specific endonucleases. These structures could also trigger RNA polymerase pausing that can collide with the replication machinery. Such collision can directly trigger DSB production as shown in Escherichia coli. 1280 DSB repair at active genes

Refs. [54,58,59]), providing an additional means of suggests that damage induction following stimulation generating DSBs during the replication of R-loop/G4- can also occur in a replication-independent manner. forming loci. Notably, these breakage events depend on topoisom- In addition to this replication-dependent R-loop/ erase IIβ (TOP2B) activity [16,61,62,64,65].Given G4-driven DSB induction, these structures can also that DNA torsional stress impedes RNA polymerase trigger DSB in non-dividing cells by less well- elongation, a current model proposes that TOP2B- characterized mechanisms in E. coli [47] and mediated DNA breakage would release topological mammalian post-mitotic cells [44,60]. constraints, thereby allowing RNA Polymerase II (RNA Pol II) to escape from its pause site and engage Molecular mechanisms that underlie the fragility in an elongation mode (Fig. 2) (for a review, see Ref. of active genes: Topoisomerase-II-driven DSB [54]). It is likely that DSBs arise at these genes following impaired resealing of TopoII intermediates. Beyond the contribution of R-loops and G4 to DSB Application of technologies such as HTGTS, BLESS, production at active genes, there is evidence that End-Seq, or DSBCapture to mapping DSBs in transcription activation itself can trigger DSB forma- different cell types in response to various stimuli tion (Fig. 2). Both estrogen and androgen stimuli should soon allow the determination of the frequency trigger the appearance of DSBs at early responsive of active promoter-associated DSBs and their exact genes [49,61–64]. Similarly, stimulation of neuronal positioning relative to R-loop/G4 structures and to activity leads to DSB induction not only in a subset of TOP2B and RNA Pol II binding sites. immediate responsive genes in primary cultured neurons but also in normal mouse brains following fear conditioning or novel environment exploration Towards “Transcription-Coupled DSB [65,66]. 
Finally, heat shock or serum stimulation Repair”: Specific DSB Repair Pathways also rapidly triggers DSB in early responsive genes Take Place at Active Genes [16]. While estrogen-induced DSBs appear as cells progress into S phase and may be replication- While further investigation is required to determine dependent [49,64], serum-induced DSBs appear whether DSB production is an accidental by-product within the first 15 min after reentry into the cell of transcription activation or a necessity for RNA Pol II cycle [16]. Together with the fact that stimulus- release following stimulation, the aforementioned induced DSBs can occur in adult mouse brains (in studies clearly show that the DSB landscape is post-replicative cells) [66], this observation strongly strongly biased toward active genes. This raises the

question of how cells cope with breaks that would be particularly detrimental since they occur in what can be regarded as the most important parts of the genome.

Fig. 2. Topoisomerase-II-dependent DSB induction upon transcription stimulation. A large number of genes exhibit RNA polymerase pausing that contributes to transcription regulation. Upon certain stimuli, the release of paused RNA polymerase II into the gene body may require topoisomerase IIβ (TOP2B) activity. DSBs might accidentally arise following impaired resealing of TOP2B intermediates.

Investigating how the transcriptional status of a damaged locus influences the repair reaction and how transcription is regulated at genes that experience a DSB has proved to be extremely challenging since it requires the induction of DSBs at specific, known loci on the genome. During the past decade, most studies have been performed using irradiation (γ-rays, X-rays, heavy ions) or drugs (topoisomerase poisons, intercalating agents…), which induce (i) uncharacterized damage beyond the DSB, including SSBs and various DNA adducts; (ii) damage at random, unknown positions on the genome; and (iii) damage at various stages of the cell cycle. This wide variety of DSB-inducing methods has blurred the overall picture and precluded any analysis of repair pathway preference throughout the genome and of transcription regulation at damaged genes. Consequently, much effort has been made to develop systems where sequence-specific and annotated DSBs can be induced on the genome. In yeast, this was already possible in the 80s, owing to the use of HO endonuclease combined with the possibility of engineering the genome [67,68]. In higher eukaryotes, the first accurate and controlled method of inducing a single DSB at a specific locus was devised by the laboratory of Maria Jasin in 1994. To accurately quantify homologous recombination (HR) events, they introduced into the mouse genome a transgene that carries the recognition site for the endonuclease I-SceI within a GFP reporter system [69]. This methodology has been superseded by the use of restriction enzymes, Zinc-finger nucleases, and, more recently, CRISPR/Cas9 technology [70–73], which allow the induction of sequence-specific DSBs at endogenous locations and anywhere in the genome.

Transcription extinction and recovery at damaged active genes

One of the consequences of DSB production in active genes is the rapid extinction of transcription at sites of damage (recently reviewed in Ref. [74]). Transcription inhibition occurs at the damaged gene [70,71,75–80] and can also spread a few kilobases away from the DSB [71,80], although not over the entire megabase-wide γH2AX domain [70,79,81,82]. Transcriptional repression is an active process that relies on ATM signaling, ubiquitination of H2A on lysine 119 (K119), which is a well-known repressive histone mark established by Polycomb group proteins, and histone deacetylation by the NuRD complex [71,77,83,84] (Fig. 3, right panel). Importantly, this transcriptional shut-down is tightly linked to the completion of repair of damaged genes [77,83–85], suggesting that it may help “clean” the damaged locus to enable repair activity. This proposal agrees with the finding that transcriptional extinction is correlated with the disappearance of RNA Pol II from the broken gene, which may rely on proteasome activity [75]. It is also consistent with the observation that in yeast, transcription inhibition spreads over flanking genes as resection proceeds [78–80], even if this might not be the case for mammalian cells [71,77].

Once repair has been achieved, transcription must be suitably revived to maintain cell fate. A recent study indicates that transcription of the repaired gene recovers normally in non-dividing cells, indicating that cell cycle progression is not required for the restoration of the epigenetic information and for the resumption of transcription [82]. Transcription recovery at DSB-flanking genes was shown to take place within 2 h after the termination of DSB induction and to require the deubiquitinylation of H2A-K119 by USP16 [71]. At present, little is known about these essential steps, probably because investigating transcription recovery at damaged genes has been challenging. The recent development of tools that not only permit DSB induction at controlled loci but also repair completion, owing to the availability of degradable or reversible enzymes [71,86,87], should now enable more rapid advances.

Repair pathways influence faithful sequence recovery at damaged genes

Beyond the transcriptional regulation of damaged active genes, the cell's most important challenge is to accurately recover genetic information at these loci. Multiple, partly redundant, and extremely well-conserved repair pathways exist in eukaryotes (reviewed in Ref. [88]). They are usually classified into two major groups: non-homologous end joining (NHEJ) and Homologous Recombination (HR). NHEJ promotes the direct ligation of the two DNA ends with minimal processing, while HR accomplishes repair using an intact copy of the broken locus (mostly the sister chromatid) as a template. To this end, HR relies on a process called resection, during which endo- and exonucleases generate single-strand DNA required for D-loop formation with the template DNA. D-loop establishment is followed by DNA synthesis, tightly coupled with dissolution/resolution mechanisms, which determine the output of HR and give rise to crossover or non-crossover products. Since this initial classification, several alternative mechanisms have been discovered, which soften the distinction between these two major pathways. Single-strand annealing initially resembles HR, resecting to form single strands, but then uses an illegitimate homologous copy in cis to anneal and directly reseal the two resected ends. Similarly, alternative NHEJ relies on short-range resection to expose microhomologies, which are further used to synapse and ligate the break (therefore also known as microhomology-mediated end joining). In yeast, break-induced replication initiates a non-canonical replication fork, which can proceed to the end of the

chromosome and generate complex chromosome rearrangements. More recently, it has been proposed that RNA-templated repair also occurs in yeast and possibly in higher eukaryotes (reviewed in Ref. [89]). All these repair mechanisms coexist and can give rise to a multitude of genomic scars, such as point mutations, translocations, and even chromothripsis, a massive chromosomal reshuffling observed in cancer. The choice between these pathways at DSBs induced in active genes is therefore critical, as it will clearly determine the quality of the repair event and thus the subsequent functionality of the damaged gene.

Fig. 3. Transcription-coupled DSB repair. Left panel: Transcriptionally active genes are subjected to specific repair. In G2, homologous recombination is specifically targeted at transcribing genes, in a manner that largely depends on chromatin features associated with active transcription. Trimethylation of histone H3 on lysine 36 (H3K36me3), which is correlated with elongating RNA Pol II, recruits CtIP via LEDGF, allowing the initiation of resection and HR repair. In addition, acetyl H4K16, a histone mark enriched in active regions, interferes with the binding of 53BP1 to H4 methylated on lysine 20, hence promoting BRCA1 binding and resection. Finally MBTD1, part of the TIP60 complex, competes with 53BP1 for binding to H4K20me. DSB-recruited TIP60 acetylates H2A on lysine 15 (H2AK15ac) in G2, counteracting its ubiquitination and thereby further destabilizing 53BP1. Together, these chromatin marks favor resection, RAD51 loading, and HR repair. In G1, specific repair events that occur on active genes are still under characterization. Tip60 does not acetylate H2AK15, allowing its ubiquitination that, together with H4K20me, stabilizes 53BP1 and inhibits resection. Although XRCC4 is recruited, these DSBs exhibit a strong repair delay and undergo clustering. RNA might also participate in the repair reaction as a patch to synapse the two DNA ends or as a template for Rad52-dependent repair. Right panel: Gene transcription is inhibited locally following DSB induction, while transcription is maintained farther away within the γH2AX domains. Transcriptional repression is achieved by the eviction of RNA Pol II from the damaged gene and possibly via its degradation, as well as multiple chromatin modifications, including histone deacetylation by the NuRD complex and Polycomb-dependent ubiquitination of H2A at lysine 119.

A role for HR in repairing active genes

In the late 80s, a study from Thomas and Rothstein established that spontaneous recombination between duplicated sequences inserted in the yeast GAL10 gene was strongly enhanced when GAL10 was transcribed. The authors had already discussed the “avant-garde” possibility that chromatin status may contribute to this elevated recombination rate [90]. Similarly, mammalian cell studies led to the conclusion that transcription enhances recombination rate by a process termed “transcription-associated recombination” (TAR) [91,92]. In yeast and bacteria, an increase of mutation rate was also observed in active genes (also known as TAM or transcription-associated mutagenesis; for a review, see Ref. [93]); hence, TAR was attributed to higher damage frequency in genes (for a review, see Ref. [94]). However, recent work, described below, also raises the possibility that TAR may be related to repair mechanisms specifically set up at active genes.

Analysis of the fate of an HO break introduced at a dedicated position on the yeast genome revealed that an active gene exhibits faster repair than an inactive gene, supporting the existence of a

“transcription-coupled DSB repair” pathway [95]. Analyses of repair in higher eukaryotes at an I-SceI site close to inducible promoters showed that transcription activity generally did not strongly affect HR and NHEJ pathway usage or outcome [92,96,97]. However, the inherently transgenic nature of this single I-SceI-targeted locus called for further work to compare repair events that take place at different endogenous loci on the genome.

ChIP-seq analyses of the distribution of HR and NHEJ proteins at DSBs introduced by the AsiSI restriction enzyme throughout the genome [70,98] revealed that among AsiSI-induced DSBs, mostly those lying within or close to transcriptionally active loci (e.g., promoters) could recruit the HR factor RAD51 [86]. This result clearly demonstrated that damaged genes that are active exhibit a preference for HR repair compared to other sequences located in euchromatin (AsiSI activity being inhibited by DNA methylation, these studies did not allow the investigation of repair at heterochromatin loci). Preference of transcribed loci for HR was further confirmed by inducing damage with Killer Red (a DNA-damaging agent activated by visible light) at a transcriptionally active or repressed cassette inserted in the genome [99].

Chromatin-driven HR targeting to damaged active genes

Transcription itself does not seem to be responsible for HR recruitment at damaged active genes. It is rather the trimethylation of histone H3 on lysine 36 (H3K36me3), well known to be correlated with elongating RNA Pol II, that acts as a critical determinant for resection and RAD51 loading [86,100]. Accordingly, depletion of SetD2, the main histone methyltransferase responsible for H3K36me3 in human cells, strongly impedes HR repair [86,100,101]. H3K36me3 has been reported to be directly recognized by the PWWP domain of LEDGF, a protein that interacts with the CtIP resection factor [102]. Hence, these studies point toward a model in which H3K36me3 recruits CtIP at least in part via LEDGF, allowing the initiation of resection and the use of the HR pathway at active genes (Fig. 3, left panel). Following another line of investigation, Tang et al. discovered that histone acetylation also regulates the repair outcome. Acetylation of histone H4 lysine 16, a histone mark correlated with active genomic regions, counteracts the recruitment of 53BP1, an anti-resection factor, to H4 methylated on lysine 20 and favors the binding of BRCA1, a pro-resection factor [87]. Accordingly, Tip60 and hMof, two histone acetyltransferases with known activity on K16, have been shown to favor HR [87,103]. More recently, Tip60 was shown to acetylate H2A lysine 15 (K15) during G2, which precludes H2A ubiquitination on the same residue and so destabilizes 53BP1 binding at the break site [104,105] (Fig. 3, left panel). Taken together, these studies reveal a complex regulatory network of multiple chromatin marks that fine-tunes repair outcomes by influencing the recruitment and stabilization of repair factors and identify chromatin status as central in HR recruitment at active genes (for a review, see Ref. [106]; Fig. 3, left panel).

Other pathways are involved in active gene repair in G1

An additional layer of complexity in pathway usage is cell cycle dependence. Indeed, HR is strongly suppressed during the G1 phase [107], and HR preference at active genes only occurs during G2 [86]. This raises the question of how DSBs in active genes are repaired during G1 or in non-dividing cells (Fig. 3, left panel). NHEJ proteins are clearly recruited at DSBs induced in active genes throughout the cell cycle [84,86,99]. Accordingly, NHEJ repair proteins are associated with RNA Pol II (for example, see Refs. [108,109]). However, XRCC4 depletion does not affect repair kinetics at transcriptionally active regions [86], and Ku70 dissociates rapidly after its recruitment at a damaged transcribed locus, while HR proteins persist much longer [99]. This raises the possibility that although recruited, canonical NHEJ might not be entirely efficient at these loci. In agreement, a recent genome-wide analysis by BLESS of repair kinetics revealed that transcriptionally active genes are not fully proficient for repair in G1, in contrast to non-transcribed sequences on the genome [110]. Moreover, these delayed DSBs exhibit an increased ability to cluster (i.e., coalesce) as shown by high-resolution mapping of long-range contact using Capture Hi-C [110]. These unrepaired, damaged active genes assemble within foci that are reminiscent of the 53BP1/OPT bodies. Indeed, following breakage in mitosis, CFS, which occur in long and active genes, assemble within 53BP1 bodies. These bodies persist throughout the G1 phase and might be cleared during the following replication phase [111]. These data suggest that in G1, a suboptimal efficacy of NHEJ at active genes, combined with the unavailability of HR, could result in persistent DSBs that cluster and await the arrival of S phase to undergo repair by a still uncharacterized mechanism.

Thus, while HR seems to be preferentially used to repair active loci in G2, the contribution of other less well-characterized repair pathways at DSBs arising in active genes, especially during G1, needs further investigation. Among these elusive pathway(s), RNA-templated repair, which bypasses the need for a homologous DNA template and thus may occur in G1, is further discussed below.

A direct function for RNA as template at active genes?

In addition to exhibiting specific chromatin patterns, active genes are distinguished from the rest of the genome by their ability to produce RNA. It has recently emerged that RNA could serve as a template for HR repair (recently reviewed in Ref. [89]). In budding yeast, RNA oligonucleotides can be used as direct templates for a synthesis-dependent repair, probably involving the polδ replicative polymerase [112]. Moreover, using a very elegant reporter system, Keskin et al. demonstrated that a transcript produced in cis to the gene experiencing a DSB can aid repair not only indirectly via the formation of cDNA but also directly by providing a template for a Rad52-dependent repair mechanism [113]. However, the latter process was observed only in mutant yeast strains devoid of RNAseH1 and RNAseH2 and thus deficient in DNA:RNA hybrid processing, raising doubt as to whether this RNA-templated repair pathway occurs in normal cells. How RNA-templated DNA repair functions in yeast remains speculative. It could involve the use of the RNA molecule as a patch, which maintains the DNA ends in close proximity to facilitate their ligation, or the RNA molecule could be used to extend the 3′ end of the DSB by an as yet unknown polymerase. These findings challenge the traditional view that HR requires a DNA template for repair and highlight the notion that RNA, which is available at transcriptionally active genes, could be used to template DNA repair. RNA-dependent repair might also take place in higher eukaryotes: quiescent human cells use an RNA- and RAD52-dependent mechanism for DSB repair [99], and NHEJ has been proposed to operate at active genes through an RNA-templated mechanism [114]. However, the massive amount of damage induced in the latter study (1 DSB/10 kb, based on the failure to PCR amplify 10-kb amplicons following drug treatment) means that confirmation of these findings following milder or more relevant genotoxic insults will be required. While direct evidence demonstrating the use of RNA as a template for repair in higher eukaryotes is still lacking, these studies open the possibility that RNA-templated repair is a conserved alternative pathway for repairing regions of the genome in the course of transcription (Fig. 3).

Interestingly, small non-coding double-stranded RNAs have also been discovered at or in the vicinity of DSBs in plant, Drosophila, and human cells (recently reviewed in Refs. [115–117]). Whether these DSB-induced small RNAs (diRNAs) arise only at DSBs induced in active genes is still under debate. Indeed, in Drosophila, diRNAs are produced to much higher levels from a plasmid linearized by the induction of DSBs in a highly active gene than from DSBs in a weakly transcribed one [118]. Yet, in mammalian cells, diRNAs were also detected at an I-SceI-induced break in an inactive locus, indicating that the induction of diRNAs at DSB might not be specific to active genes [119]. Future work is clearly needed to determine whether diRNAs arise by de novo transcription in both sense and antisense orientations or whether they require a preexisting RNA precursor. While the function of these diRNAs is still being characterized, it is interesting that under conditions of no DNA damage, double-stranded RNAs have been proposed to function in terminating transcription and assembling repressive chromatin [120], raising the possibility that diRNAs could be similarly involved in transcriptional extinction at damaged genes. Finally, a very recent study by Ohle et al. [121] demonstrated that RNA:DNA hybrids accumulate at resected ends in fission yeast. Importantly, these hybrids were found to control single-strand annealing usage and to exert strong regulation of rDNA copy number following damage in rDNA, suggesting that this mechanism might be particularly relevant in transcribed regions.

Concluding Remarks

The existence of a transcription-coupled repair process for DNA damage such as chemical adducts or UV-induced pyrimidine dimers, called transcription-coupled NER (TC-NER), has emerged since the 90s (for a review, see Ref. [122]). Notably, mutations in TC-NER proteins are associated with diseases that provoke premature aging (for a review, see Ref. [123]). The fact that specific processes are involved to maintain gene integrity may now be extended to DSB repair since the evidence discussed clearly establishes that active genes are the theater of complex and tightly regulated DSB repair events that together could define a “transcription-coupled DSB repair” pathway.

These transcription-coupled DSB repair mechanisms may prove to be important, given that normal transcription activity is now emerging as a potent DSB-inducing agent. We shall soon see whether transcription-coupled DSB repair deficiency underlies particular diseases, as does the impairment of TC-NER. Mutations of some DSB repair factors are associated with progeria or segmental premature aging phenotypes (such as Werner disease) and neurodegenerative diseases (Ataxia Telangiectasia, Nijmegen breakage syndrome…). Whether these disorders impede the correct repair of DSB induced in active regions of the genome is a possibility that deserves attention in the near future.

Acknowledgments

We thank Dr. David Lane, Dr. Thomas Clouaire, Dr. Nadine Puget, and Dr. Sylvain Egloff for critical reading of the manuscript. Funding in GL laboratory was provided by grants from the European Research Council (ERC-2014-CoG 647344), Agence Nationale pour la Recherche (ANR-14-CE10-0002-01 and ANR-13-BSV8-0013), and the Ligue Nationale contre le Cancer (LNCC).

Received 20 December 2016;
Received in revised form 22 March 2017;
Accepted 23 March 2017
Available online 28 March 2017

Keywords:
DNA double-strand break repair; transcription; chromatin

A.M. and S.C. contributed equally to this work.

Abbreviations used:
G4, G-quadruplex; DSB, double-strand break; CFS, common fragile site; ERFS, early replicating fragile site; DRIP, DNA:RNA hybrid immunoprecipitation; SSB, single-strand break; NER, nucleotide excision repair; TC-NER, transcription-coupled nucleotide excision repair; HR, homologous recombination; NHEJ, non-homologous end joining; TAR, transcription-associated recombination; H3K36me3, trimethylation of histone H3 on lysine 36; diRNA, DSB-induced small RNA; TOP2B, topoisomerase IIβ; RNA Pol II, RNA polymerase II.

References

[1] G. Wang, K.M. Vasquez, Impact of alternative DNA structures on DNA damage, DNA repair, and genetic instability, DNA Repair (Amst) 19 (2014) 143–151.
[2] S. Corless, N. Gilbert, Effects of DNA supercoiling on chromatin architecture, Biophys. Rev. 8 (2016) 245–258.
[3] G.R. Sutherland, Heritable fragile sites on human chromosomes II. Distribution, phenotypic effects, and cytogenetics, Am. J. Hum. Genet. 31 (1979) 136–148.
[4] D. Sarni, B. Kerem, The complex nature of fragile site plasticity and its importance in cancer, Curr. Opin. Cell Biol. 40 (2016) 131–136.
[5] A. Helmrich, K. Stout-Weider, K. Hermann, E. Schrock, T. Heiden, Common fragile sites are conserved features of human and mouse chromosomes and relate to large active genes, Genome Res. 16 (2006) 1222–1230.
[6] B. Le Tallec, B. Dutrillaux, A.M. Lachages, G.A. Millot, O. Brison, M. Debatisse, Molecular profiling of common fragile sites in human fibroblasts, Nat. Struct. Mol. Biol. 18 (2011) 1421–1423.
[7] A. Helmrich, M. Ballarino, L. Tora, Collisions between replication and transcription complexes cause common fragile site instability at the longest human genes, Mol. Cell 44 (2011) 966–977.
[8] D.I. Smith, S. McAvoy, Y. Zhu, D.S. Perez, Large common fragile site genes and cancer, Semin. Cancer Biol. 17 (2007) 31–41.
[9] J.A. Harrigan, R. Belotserkovskaya, J. Coates, D.S. Dimitrova, S.E. Polo, C.R. Bradshaw, P. Fraser, S.P. Jackson, Replication stress induces 53BP1-containing OPT domains in G1 cells, J. Cell Biol. 193 (2011) 97–108.
[10] B. Le Tallec, G.A. Millot, M.E. Blin, O. Brison, B. Dutrillaux, M. Debatisse, Common fragile site profiling in epithelial and erythroid cells reveals that most recurrent cancer deletions lie in fragile sites hosting large genes, Cell Rep. 4 (2013) 420–428.
[11] A. Letessier, G.A. Millot, S. Koundrioukoff, A.M. Lachages, N. Vogt, R.S. Hansen, B. Malfoy, O. Brison, M. Debatisse, Cell-type-specific replication initiation programs set fragility of the FRA3B fragile site, Nature 470 (2011) 120–123.
[12] K. Miron, T. Golan-Lev, R. Dvir, E. Ben-David, B. Kerem, Oncogenes create a unique landscape of fragile sites, Nat. Commun. 6 (2015) 7094.
[13] J.H. Barlow, R.B. Faryabi, E. Callen, N. Wong, A. Malhowski, H.T. Chen, G. Gutierrez-Cruz, H.W. Sun, P. McKinnon, G. Wright, R. Casellas, D.F. Robbiani, L. Staudt, O. Fernandez-Capetillo, A. Nussenzweig, Identification of early replicating fragile sites that contribute to genome instability, Cell 152 (2013) 620–632.
[14] H. Chandler, H. Patel, R. Palermo, S. Brookes, N. Matthews, G. Peters, Role of polycomb group proteins in the DNA damage response—a reassessment, PLoS One 9 (2014) e102968.
[15] A. Gardini, D. Baillat, M. Cesaroni, R. Shiekhattar, Genome-wide analysis reveals a role for BRCA1 and PALB2 in transcriptional co-activation, EMBO J. 33 (2014) 890–905.
[16] H. Bunch, B.P. Lawney, Y.F. Lin, A. Asaithamby, A. Murshid, Y.E. Wang, B.P. Chen, S.K. Calderwood, Transcriptional elongation requires DNA break-induced signalling, Nat. Commun. 6 (2015) 10191.
[17] I.A. Klein, W. Resch, M. Jankovic, T. Oliveira, A. Yamane, H. Nakahashi, M. Di Virgilio, A. Bothmer, A. Nussenzweig, D.F. Robbiani, R. Casellas, M.C. Nussenzweig, Translocation-capture sequencing reveals the extent and nature of chromosomal rearrangements in B lymphocytes, Cell 147 (2011) 95–106.
[18] R. Chiarle, Y. Zhang, R.L. Frock, S.M. Lewis, B. Molinie, Y.J. Ho, D.R. Myers, V.W. Choi, M. Compagno, D.J. Malkin, D. Neuberg, S. Monti, C.C. Giallourakis, M. Gostissa, F.W. Alt, Genome-wide translocation sequencing reveals mechanisms of chromosome breaks and rearrangements in B cells, Cell 147 (2011) 107–119.
[19] R.L. Frock, J. Hu, R.M. Meyers, Y.J. Ho, E. Kii, F.W. Alt, Genome-wide detection of DNA double-stranded breaks induced by engineered nucleases, Nat. Biotechnol. 33 (2015) 179–186.
[20] J. Hu, R.M. Meyers, J. Dong, R.A. Panchakshari, F.W. Alt, R.L. Frock, Detecting DNA double-stranded breaks in mammalian genomes by linear amplification-mediated high-throughput genome-wide translocation sequencing, Nat. Protoc. 11 (2016) 853–871.
[21] B. Schwer, P.C. Wei, A.N. Chang, J. Kao, Z. Du, R.M. Meyers, F.W. Alt, Transcription-associated processes cause DNA double-strand breaks and translocations in neural stem/progenitor cells, Proc. Natl. Acad. Sci. U. S. A. 113 (2016) 2258–2263.
[22] P.C. Wei, A.N. Chang, J. Kao, Z. Du, R.M. Meyers, F.W. Alt, B. Schwer, Long neural genes harbor recurrent DNA break clusters in neural stem/progenitor cells, Cell 164 (2016) 644–655.
[23] N. Crosetto, A. Mitra, M.J. Silva, M. Bienko, N. Dojer, Q. Wang, E. Karaca, R. Chiarle, M. Skrzypczak, K. Ginalski, P. Pasero, M. Rowicka, I. Dikic, Nucleotide-resolution DNA double-strand break mapping by next-generation sequencing, Nat. Methods 10 (2013) 361–365.
[24] E.A. Hoffman, A. McCulley, B. Haarer, R. Arnak, W. Feng, Break-seq reveals hydroxyurea-induced chromosome fragility as a result of unscheduled conflict between DNA replication and transcription, Genome Res. 25 (2015) 402–412.

[25] A. Canela, S. Sridharan, N. Sciascia, A. Tubbs, P. Meltzer, B.P. Sleckman, A. Nussenzweig, DNA breaks and end resection measured genome-wide by end sequencing, Mol. Cell 63 (2016) 898–911.
[26] S.V. Lensing, G. Marsico, R. Hansel-Hertsch, E.Y. Lam, D. Tannahill, S. Balasubramanian, DSBCapture: in situ capture and sequencing of DNA breaks, Nat. Methods 13 (2016) 855–857.
[27] F. Yang, C.J. Kemp, S. Henikoff, Anthracyclines induce double-strand DNA breaks at active gene promoters, Mutat. Res. 773 (2015) 9–15.
[28] B. Le Tallec, S. Koundrioukoff, T. Wilhelm, A. Letessier, O. Brison, M. Debatisse, Updating the mechanisms of common fragile site instability: how to reconcile the different views? Cell. Mol. Life Sci. 71 (2014) 4489–4494.
[29] H. Duda, M. Arter, J. Gloggnitzer, F. Teloni, P. Wild, M.G. Blanco, M. Altmeyer, J. Matos, A mechanism for controlled breakage of under-replicated chromosomes during mitosis, Dev. Cell 40 (2017) 421–422.
[30] V. Naim, T. Wilhelm, M. Debatisse, F. Rosselli, ERCC1 and MUS81-EME1 promote sister chromatid separation by processing late replication intermediates at common fragile sites during mitosis, Nat. Cell Biol. 15 (2013) 1008–1015.
[31] S. Ying, S. Minocherhomji, K.L. Chan, T. Palmai-Pallag, W.K. Chu, T. Wass, H.W. Mankouri, Y. Liu, I.D. Hickson, MUS81 promotes common fragile site expression, Nat. Cell Biol. 15 (2013) 1001–1007.
[32] K. Skourti-Stathaki, N.J. Proudfoot, A double-edged sword: R loops as threats to genome integrity and powerful regulators of gene expression, Genes Dev. 28 (2014) 1384–1396.
[33] A.L. Valton, M.N. Prioleau, G-Quadruplexes in DNA replication: a problem or a necessity? Trends Genet. 32 (2016) 697–706.
[34] M.L. Bochman, K. Paeschke, V.A. Zakian, DNA secondary structures: stability and function of G-quadruplex structures, Nat. Rev. Genet. 13 (2012) 770–780.
[35] P. Huertas, A. Aguilera, Cotranscriptionally formed DNA:RNA hybrids mediate transcription elongation impairment and transcription-associated recombination, Mol. Cell 12 (2003) 711–721.
[36] K. Yu, F. Chedin, C.L. Hsieh, T.E. Wilson, M.R. Lieber, R-loops at immunoglobulin class switch regions in the chromosomes of stimulated B cells, Nat. Immunol. 4 (2003) 442–451.
[37] P.A. Ginno, Y.W. Lim, P.L. Lott, I. Korf, F. Chedin, GC skew at the 5′ and 3′ ends of human genes links R-loop formation to epigenetic regulation and transcription termination, Genome Res. 23 (2013) 1590–1600.
[38] P.A. Ginno, P.L. Lott, H.C. Christensen, I. Korf, F. Chedin, R-loop formation is a distinctive characteristic of unmethylated human CpG island promoters, Mol. Cell 45 (2012) 814–825.
[39] L.A. Sanz, S.R. Hartono, Y.W. Lim, S. Steyaert, A. Rajpurkar, P.A. Ginno, X. Xu, F. Chedin, Prevalent, dynamic, and conserved R-loop structures associate with specific epigenomic signatures in mammals, Mol. Cell 63 (2016) 167–178.
[40] R. Hansel-Hertsch, D. Beraldi, S.V. Lensing, G. Marsico, K. Zyner, A. Parry, M. Di Antonio, J. Pike, H. Kimura, M. Narita, D. Tannahill, S. Balasubramanian, G-quadruplex structures mark human regulatory chromatin, Nat. Genet. 48 (2016) 1267–1272.
[41] L.A. Dempsey, H. Sun, L.A. Hanakahi, N. Maizels, G4 DNA binding by LR1 and its subunits, nucleolin and hnRNP D, a role for G-G pairing in immunoglobulin switch recombination, J. Biol. Chem. 274 (1999) 1066–1071.
[42] M.L. Duquette, P. Handa, J.A. Vincent, A.F. Taylor, N. Maizels, Intracellular transcription of G-rich DNAs induces formation of G-loops, novel structures containing G4 DNA, Genes Dev. 18 (2004) 1618–1629.
[43] X. Li, J.L. Manley, Inactivation of the SR protein splicing factor ASF/SF2 results in genomic instability, Cell 122 (2005) 365–378.
[44] O. Sordet, C.E. Redon, J. Guirouilh-Barbat, S. Smith, S. Solier, C. Douarre, C. Conti, A.J. Nakamura, B.B. Das, E. Nicolas, K.W. Kohn, W.M. Bonner, Y. Pommier, Ataxia telangiectasia mutated activation by transcription- and topoisomerase I-induced DNA double-strand breaks, EMBO Rep. 10 (2009) 887–893.
[45] P.C. Stirling, Y.A. Chan, S.W. Minaker, M.J. Aristizabal, I. Barrett, P. Sipahimalani, M.S. Kobor, P. Hieter, R-loop-mediated genome instability in mRNA cleavage and polyadenylation mutants, Genes Dev. 26 (2012) 163–175.
[46] S. Tuduri, L. Crabbe, C. Conti, H. Tourriere, H. Holtgreve-Grez, A. Jauch, V. Pantesco, J. De Vos, A. Thomas, C. Theillet, Y. Pommier, J. Tazi, A. Coquelle, P. Pasero, Topoisomerase I suppresses genomic instability by preventing interference between replication and transcription, Nat. Cell Biol. 11 (2009) 1315–1324.
[47] H. Wimberly, C. Shee, P.C. Thornton, P. Sivaramakrishnan, S.M. Rosenberg, P.J. Hastings, R-loops and nicks initiate DNA breakage and genome instability in non-growing Escherichia coli, Nat. Commun. 4 (2013) 2115.
[48] M.L. Garcia-Rubio, C. Perez-Calero, S.I. Barroso, E. Tumini, E. Herrera-Moyano, I.V. Rosado, A. Aguilera, The Fanconi anemia pathway protects genome integrity from R-loops, PLoS Genet. 11 (2015) e1005674.
[49] C.T. Stork, M. Bocek, M.P. Crossley, J. Sollier, L.A. Sanz, F. Chedin, T. Swigut, K.A. Cimprich, Co-transcriptional R-loops are the main cause of estrogen-induced DNA damage, elife 5 (2016) e17548.
[50] J. Sollier, C.T. Stork, M.L. Garcia-Rubio, R.D. Paulsen, A. Aguilera, K.A. Cimprich, Transcription-coupled nucleotide excision repair factors promote R-loop-induced genome instability, Mol. Cell 56 (2014) 777–785.
[51] R.D. Paulsen, D.V. Soni, R. Wollman, A.T. Hahn, M.C. Yee, A. Guan, J.A. Hesley, S.C. Miller, E.F. Cromwell, D.E. Solow-Cordero, T. Meyer, K.A. Cimprich, A genome-wide siRNA screen reveals diverse cellular processes and pathways that mediate genome stability, Mol. Cell 35 (2009) 228–239.
[52] A. Madireddy, S.T. Kosiyatrakul, R.A. Boisvert, E. Herrera-Moyano, M.L. Garcia-Rubio, J. Gerhardt, E.A. Vuono, N. Owen, Z. Yan, S. Olson, A. Aguilera, N.G. Howlett, C.L. Schildkraut, FANCD2 facilitates replication through common fragile sites, Mol. Cell 64 (2016) 388–404.
[53] A. Aguilera, T. Garcia-Muse, R loops: from transcription byproducts to threats to genome stability, Mol. Cell 46 (2012) 115–124.
[54] T. Garcia-Muse, A. Aguilera, Transcription-replication conflicts: how they occur and how they are resolved, Nat. Rev. Mol. Cell Biol. 17 (2016) 553–563.
[55] D. Cortez, Preventing replication fork collapse to maintain genome integrity, DNA Repair (Amst) 32 (2015) 149–157.
[56] A.K. Tehranchi, M.D. Blankschien, Y. Zhang, J.A. Halliday, A. Srivatsan, J. Peng, C. Herman, J.D. Wang, The transcription factor DksA prevents conflicts between DNA replication and transcription machinery, Cell 141 (2010) 595–605.
[57] D. Dutta, K. Shatalin, V. Epshtein, M.E. Gottesman, E. Nudler, Linking RNA polymerase backtracking to genome instability in E. coli, Cell 146 (2011) 533–543.
[58] H. Merrikh, Y. Zhang, A.D. Grossman, J.D. Wang, Replication-transcription conflicts in bacteria, Nat. Rev. Microbiol. 10 (2012) 449–458.

[59] A. Helmrich, M. Ballarino, E. Nudler, L. Tora, Transcription- [75] T. Pankotai, C. Bonhomme, D. Chen, E. Soutoglou, DNAPKcs- replication encounters, consequences and genomic instability, dependent arrest of RNA polymerase II transcription in the Nat. Struct. Mol. Biol. 20 (2013) 412–418. presence of DNA breaks, Nat. Struct. Mol. Biol. 19 (2012) [60] A. Cristini, J.H. Park, G. Capranico, G. Legube, G. Favre, O. 276–282. Sordet, DNA-PK triggers histone ubiquitination and signaling [76] L.V. Solovjeva, M.P. Svetlova, V.O. Chagin, N.V. Tomilin, in response to DNA double-strand breaks produced during Inhibition of transcription at radiation-induced nuclear foci the repair of transcription-blocking topoisomerase I lesions, of phosphorylated histone H2AX in mammalian cells, Nucleic Acids Res. 44 (2016) 1161–1178. Chromosom. Res. 15 (2007) 787–797. [61] M.C. Haffner, M.J. Aryee, A. Toubaji, D.M. Esopi, R. [77] F. Gong, L.Y. Chiu, B. Cox, F. Aymard, T. Clouaire, J.W. Albadine, B. Gurel, W.B. Isaacs, G.S. Bova, W. Liu, J. Xu, Leung, M. Cammarata, M. Perez, P. Agarwal, J.S. Brodbelt, A.K. Meeker, G. Netto, A.M. De Marzo, W.G. Nelson, S. G. Legube, K.M. Miller, Screen identifies bromodomain Yegnasubramanian, Androgen-induced TOP2B-mediated protein ZMYND8 in chromatin recognition of transcription- double-strand breaks and prostate cancer gene rearrange- associated DNA damage that promotes homologous recom- ments, Nat. Genet. 42 (2010) 668–675. bination, Genes Dev. 29 (2015) 197–211. [62] B.G. Ju, V.V. Lunyak, V. Perissi, I. Garcia-Bassets, D.W. [78] S.E. Lee, A. Pellicioli, J. Demeter, M.P. Vaze, A.P. Gasch, A. Rose, C.K. Glass, M.G. Rosenfeld, A topoisomerase IIbeta- Malkova, P.O. Brown, D. Botstein, T. Stearns, M. Foiani, J.E. mediated dsDNA break required for regulated transcription, Haber, Arrest, adaptation, and recovery following a chromo- Science 312 (2006) 1798–1802. some double-strand break in Saccharomyces cerevisiae, [63] C. Lin, L. Yang, B. Tanasa, K. 
Hutt, B.G. Ju, K. Ohgi, J. Cold Spring Harb. Symp. Quant. Biol. 65 (2000) 303–314. Zhang, D.W. Rose, X.D. Fu, C.K. Glass, M.G. Rosenfeld, [79] J.A. Kim, M. Kruhlak, F. Dotiwala, A. Nussenzweig, J.E. Haber, Nuclear receptor-induced chromosomal proximity and DNA Heterochromatin is refractory to gamma-H2AX modification in breaks underlie specific translocations in cancer, Cell 139 yeast and mammals, J. Cell Biol. 178 (2007) 209–218. (2009) 1069–1083. [80] N. Manfrini, M. Clerici, M. Wery, C.V. Colombo, M. Descrimes, [64] L.M. Williamson, S.P. Lees-Miller, Estrogen receptor A. Morillon, F. d'Adda di Fagagna, M.P. Longhese, Resection alpha-mediated transcription induces cell cycle-dependent is responsible for loss of transcription around a double-strand DNA double-strand breaks, Carcinogenesis 32 (2011) break in Saccharomyces cerevisiae, elife 4 (2015) e08942. 279–285. [81] P. Caron, F. Aymard, J.S. Iacovoni, S. Briois, Y. Canitrot, B. [65] R. Madabhushi, F. Gao, A.R. Pfenning, L. Pan, S. Yamakawa, Bugler, L. Massip, A. Losada, G. Legube, Cohesin protects J. Seo, R. Rueda, T.X. Phan, H. Yamakawa, P.C. Pao, R.T. genes against gammaH2AX induced by DNA double-strand Stott, E. Gjoneska, A. Nott, S. Cho, M. Kellis, L.H. Tsai, breaks, PLoS Genet. 8 (2012) e1002460. Activity-induced DNA breaks govern the expression of [82] J. Kim, D. Sturgill, A.D. Tran, D.A. Sinclair, P. Oberdoerffer, neuronal early-response genes, Cell 161 (2015) 1592–1605. Controlled DNA double-strand break induction in mice [66] E. Suberbielle, P.E. Sanchez, A.V. Kravitz, X. Wang, K. Ho, K. reveals post-damage transcriptome stability, Nucleic Acids Eilertson, N. Devidze, A.C. Kreitzer, L. Mucke, Physiologic Res. 44 (2016) e64. brain activity causes DNA double-strand breaks in neurons, [83] A. Kakarougkas, A. Ismail, A.L. Chambers, E. Riballo, A.D. with exacerbation by amyloid-beta, Nat. Neurosci. 16 (2013) Herbert, J. Kunzel, M. Lobrich, P.A. Jeggo, J.A. Downs, 613–621. 
Requirement for PBAF in transcriptional repression and repair [67] R.E. Jensen, I. Herskowitz, Directionality and regulation of at DNA breaks in actively transcribed regions of chromatin, cassette substitution in yeast, Cold Spring Harb. Symp. Mol. Cell 55 (2014) 723–732. Quant. Biol. 49 (1984) 97–104. [84] A. Ui, Y. Nagaura, A. Yasui, Transcriptional elongation factor [68] C.I. White, J.E. Haber, Intermediates of recombination ENL phosphorylated by ATM recruits polycomb and switches during mating type switching in Saccharomyces cerevisiae, off transcription for DSB repair, Mol. Cell 58 (2015) 468–482. EMBO J. 9 (1990) 663–673. [85] K.M. Miller, J.V. Tjeertes, J. Coates, G. Legube, S.E. Polo, S. [69] P. Rouet, F. Smih, M. Jasin, Introduction of double-strand Britton, S.P. Jackson, Human HDAC1 and HDAC2 function in breaks into the genome of mouse cells by expression of a rare- the DNA-damage response to promote DNA nonhomologous cutting endonuclease, Mol. Cell. Biol. 14 (1994) 8096–8106. end-joining, Nat. Struct. Mol. Biol. 17 (2010) 1144–1151. [70] J.S. Iacovoni, P. Caron, I. Lassadi, E. Nicolas, L. Massip, D. [86] F. Aymard, B. Bugler, C.K. Schmidt, E. Guillou, P. Caron, S. Trouche, G. Legube, High-resolution profiling of gamma- Briois, J.S. Iacovoni, V. Daburon, K.M. Miller, S.P. Jackson, H2AX around DNA double strand breaks in the mammalian G. Legube, Transcriptionally active chromatin recruits genome, EMBO J. 29 (2010) 1446–1457. homologous recombination at DNA double-strand breaks, [71] N.M. Shanbhag, I.U. Rafalska-Metcalf, C. Balane-Bolivar, Nat. Struct. Mol. Biol. 21 (2014) 366–374. S.M. Janicki, R.A. Greenberg, ATM-dependent chromatin [87] J. Tang, N.W. Cho, G. Cui, E.M. Manion, N.M. Shanbhag, changes silence transcription in cis to DNA double-strand M.V. Botuyan, G. Mer, R.A. Greenberg, Acetylation limits breaks, Cell 141 (2010) 970–981. 53BP1 association with damaged chromatin to promote [72] E. Berkovich, R.J. Monnat Jr., M.B. 
Kastan, Roles of ATM and homologous recombination, Nat. Struct. Mol. Biol. 20 (2013) NBS1 in chromatin structure modulation and DNA double- 317–325. strand break repair, Nat. Cell Biol. 9 (2007) 683–690. [88] E. Mladenov, S. Magin, A. Soni, G. Iliakis, DNA double- [73] M. van Sluis, B. McStay, A localized nucleolar DNA damage strand-break repair in higher eukaryotes and its role in response facilitates recruitment of the homology-directed genomic instability and cancer: cell cycle and proliferation- repair machinery independent of cell cycle stage, Genes dependent regulation, Semin. Cancer Biol. 37-38 (2016) Dev. 29 (2015) 1151–1163. 51–64. [74] G. D'Alessandro, d'Adda di Fagagna, F., Transcription and [89] C. Meers, H. Keskin, F. Storici, DNA repair by RNA: DNA damage: holding hands or crossing swords? J. Mol. Biol. templated, or not templated, that is the question, DNA Repair (2016) pii: S0022-2836(16)30471-5. (Amst) 44 (2016) 17–21. 1288 DSB repair at active genes

[90] B.J. Thomas, R. Rothstein, Elevated recombination rates in [106] T. Clouaire, G. Legube, DNA double strand break repair transcriptionally active DNA, Cell 56 (1989) 619–630. pathway choice: a chromatin based decision? Nucleus 6 [91] J.A. Nickoloff, R.J. Reynolds, Transcription stimulates (2015) 107–113. homologous recombination in mammalian cells, Mol. Cell. [107] A. Orthwein, S.M. Noordermeer, M.D. Wilson, S. Landry, Biol. 10 (1990) 4837–4845. R.I. Enchev, A. Sherker, M. Munro, J. Pinder, J. Salsman, [92] P. Gottipati, T.N. Cassel, L. Savolainen, T. Helleday, G. Dellaire, B. Xia, M. Peter, D. Durocher, A mechanism for Transcription-associated recombination is dependent on the suppression of homologous recombination in G1 cells, replication in mammalian cells, Mol. Cell. Biol. 28 (2008) Nature 528 (2015) 422–426. 154–164. [108] C.C. Ebmeier, D.J. Taatjes, Activator-mediator binding [93] S. Jinks-Robertson, A.S. Bhagwat, Transcription-associated regulates mediator-cofactor interactions, Proc. Natl. Acad. mutagenesis, Annu. Rev. Genet. 48 (2014) 341–359. Sci. U. S. A. 107 (2010) 11,283–11,288. [94] A. Aguilera, The connection between transcription and [109] E. Maldonado, R. Shiekhattar, M. Sheldon, H. Cho, R. genomic instability, EMBO J. 21 (2002) 195–201. Drapkin, P. Rickert, E. Lees, C.W. Anderson, S. Linn, D. [95] P. Chaurasia, R. Sen, T.K. Pandita, S.R. Bhaumik, Reinberg, A human RNA polymerase II complex associated Preferential repair of DNA double-strand break at the active with SRB and DNA-repair proteins, Nature 381 (1996) 86–89. gene in vivo, J. Biol. Chem. 287 (2012) 36,414–36,422. [110] F. Aymard, M. Aguirrebengoa, E. Guillou, B.M. Javierre, B. [96] C. Allen, C.A. Miller, J.A. Nickoloff, The mutagenic potential Bugler, C. Arnould, V. Rocher, J.S. Iacovoni, A. Biernacka, of a single DNA double-strand break in a mammalian M. Skrzypczak, K. Ginalski, M. Rowicka, P. Fraser, G. 
chromosome is not influenced by transcription, DNA Repair Legube, Genome-wide mapping of long-range contacts (Amst) 2 (2003) 1147–1156. unveils clustering of DNA double-strand breaks at damaged [97] A. Gunn, N. Bennardo, A. Cheng, J.M. Stark, Correct end active genes, Nat. Struct. Mol. Biol. (2017) http://dx.doi.org/ use during end joining of multiple chromosomal double 10.1038/nsmb.3387 [Epub ahead of print]. strand breaks is influenced by repair protein RAD50, DNA- [111] C. Lukas, V. Savic, S. Bekker-Jensen, C. Doil, B. Neumann, dependent protein kinase DNA-PKcs, and transcription R.S. Pedersen, M. Grofte, K.L. Chan, I.D. Hickson, J. context, J. Biol. Chem. 286 (2011) 42,470–42,482. Bartek, J. Lukas, 53BP1 nuclear bodies form around DNA [98] L. Massip, P. Caron, J.S. Iacovoni, D. Trouche, G. Legube, lesions generated by mitotic transmission of chromosomes Deciphering the chromatin landscape induced around DNA under replication stress, Nat. Cell Biol. 13 (2011) 243–253. double strand breaks, Cell Cycle 9 (2010) 2963–2972. [112] F. Storici, K. Bebenek, T.A. Kunkel, D.A. Gordenin, M.A. [99] L. Wei, S. Nakajima, S. Bohm, K.A. Bernstein, Z. Shen, M. Resnick, RNA-templated DNA repair, Nature 447 (2007) Tsang, A.S. Levine, L. Lan, DNA damage during the G0/G1 338–341. phase triggers RNA-templated, Cockayne syndrome B- [113] H. Keskin, Y. Shen, F. Huang, M. Patel, T. Yang, K. Ashley, dependent homologous recombination, Proc. Natl. Acad. A.V. Mazin, F. Storici, Transcript-RNA-templated DNA Sci. U. S. A. 112 (2015) E3495–E3504. recombination and repair, Nature 515 (2014) 436–439. [100] S.X. Pfister, S. Ahrabi, L.P. Zalmas, S. Sarkar, F. Aymard, [114] A. Chakraborty, N. Tapryal, T. Venkova, N. Horikoshi, R.K. C.Z. Bachrati, T. Helleday, G. Legube, N.B. La Thangue, A.C. Pandita, A.H. Sarker, P.S. Sarkar, T.K. Pandita, T.K. Hazra, Porter, T.C. 
Humphrey, SETD2-dependent histone H3K36 Classical non-homologous end-joining pathway utilizes trimethylation is required for homologous recombination nascent RNA for error-free double-strand break repair of repair and genome stability, Cell Rep. 7 (2014) 2006–2018. transcribed genes, Nat. Commun. 7 (2016) 13,049. [101] S. Carvalho, A.C. Vitor, S.C. Sridhara, F.B. Martins, A.C. [115] F. d'Adda di Fagagna, A direct role for small non-coding RNAs in Raposo, J.M. Desterro, J. Ferreira, S.F. de Almeida, SETD2 is DNA damage response, Trends Cell Biol. 24 (2014) 171–178. required for DNA double-strand break repair and activation of [116] J.S. Khanduja, I.A. Calvo, R.I. Joh, I.T. Hill, M. Motamedi, the p53-mediated checkpoint, elife 3 (2014) e02482. Nuclear noncoding RNAs and genome stability, Mol. Cell 63 [102] M. Daugaard, A. Baude, K. Fugger, L.K. Povlsen, H. Beck, (2016) 7–20. C.S. Sorensen, N.H. Petersen, P.H. Sorensen, C. Lukas, J. [117] Y.G. Yang, Y. Qi, RNA-directed repair of DNA double- Bartek, J. Lukas, M. Rohde, M. Jaattela, LEDGF (p75) strand breaks, DNA Repair (Amst) 32 (2015) 82–85. promotes DNA-end resection and homologous recombina- [118] K.M. Michalik, R. Bottcher, K. Forstemann, A small RNA tion, Nat. Struct. Mol. Biol. 19 (2012) 803–810. response at DNA ends in Drosophila, Nucleic Acids Res. 40 [103] G.G. Sharma, S. So, A. Gupta, R. Kumar, C. Cayrou, N. (2012) 9596–9603. Avvakumov, U. Bhadra, R.K. Pandita, M.H. Porteus, D.J. [119] S. Francia, F. Michelini, A. Saxena, D. Tang, M. de Hoon, V. Chen, J. Cote, T.K. Pandita, MOF and histone H4 acetylation Anelli, M. Mione, P. Carninci, d'Adda di Fagagna, F., Site- at lysine 16 are critical for DNA damage response and specific DICER and DROSHA RNA products control the double-strand break repair, Mol. Cell. Biol. 30 (2010) DNA-damage response, Nature 488 (2012) 231–235. 3582–3595. [120] K. Skourti-Stathaki, K. Kamieniarz-Gdula, N.J. Proudfoot, R- [104] K. Jacquet, A. Fradet-Turcotte, N. Avvakumov, J.P. 
Lambert, loops induce repressive chromatin marks over mammalian C. Roques, R.K. Pandita, E. Paquet, P. Herst, A.C. Gingras, gene terminators, Nature 516 (2014) 436–439. T.K. Pandita, G. Legube, Y. Doyon, D. Durocher, J. Cote, The [121] C. Ohle, R. Tesorero, G. Schermann, N. Dobrev, I. Sinning, TIP60 complex regulates bivalent chromatin recognition by T. Fischer, Transient RNA-DNA hybrids are required for 53BP1 through direct H4K20me binding and H2AK15 efficient double-strand break repair, Cell 167 (2016) e7. acetylation, Mol. Cell 62 (2016) 409–421. [122] P.C. Hanawalt, G. Spivak, Transcription-coupled DNA [105] A. Fradet-Turcotte, M.D. Canny, C. Escribano-Diaz, A. repair: two decades of progress and surprises, Nat. Rev. Orthwein, C.C. Leung, H. Huang, M.C. Landry, J. Kitevski- Mol. Cell Biol. 9 (2008) 958–970. LeBlanc, S.M. Noordermeer, F. Sicheri, D. Durocher, 53BP1 [123] J.A. Marteijn, H. Lans, W. Vermeulen, J.H. Hoeijmakers, is a reader of the DNA-damage-induced H2A Lys 15 ubiquitin Understanding nucleotide excision repair and its roles in cancer mark, Nature 499 (2013) 50–54. and ageing, Nat. Rev. Mol. Cell Biol. 15 (2014) 465–481.

Actively transcribed genes can be a source of genome instability through numerous mechanisms. These genes are characterized by the formation of secondary structures such as RNA:DNA hybrids, which form when nascent RNA exiting RNA polymerase II hybridizes with single-stranded DNA. Numerous studies have shown that the accumulation of RNA:DNA hybrids can lead to DNA damage. Among these lesions, DNA double-strand breaks (DSBs) are the most deleterious for cells, since they can generate mutations and chromosomal rearrangements. Two major DSB repair mechanisms exist in the cell: Non-Homologous End-Joining (NHEJ) and Homologous Recombination (HR). My lab recently showed that DSBs occurring in transcribed genes are preferentially repaired by HR. Moreover, multiple studies have shown crosstalk between transcription and DSB repair. These results led us to propose that actively transcribed genes could be repaired by a specific mechanism involving proteins associated with transcription: “Transcription-coupled DSB repair”.

During my PhD, using the DIvA (DSB Induction via AsiSI) cell line, which allows the induction of annotated DSBs throughout the genome, I worked on two projects focusing on DSB repair in transcribed genes.

First, we showed that DSB repair in transcribed loci requires a known RNA:DNA helicase, senataxin (SETX). After DSB induction in an active gene, SETX is recruited, which allows the resolution of RNA:DNA hybrids (mapped by DRIP-seq). We also showed that SETX activity allows RAD51 loading and limits illegitimate rejoining of DSBs, and consequently promotes cell survival after DSB induction. This study shows that DSBs in transcribed loci require the specific removal of RNA:DNA hybrids by SETX for accurate repair, and that this removal is essential for cell survival.

Second, we showed an interplay between SETX and Bloom (BLM), a G4 DNA helicase, in the repair of DSBs induced in transcribed loci. BLM is also recruited at DSBs in transcribed loci, where it promotes resection and repair fidelity. Strikingly, BLM depletion rescued the survival defect observed in SETX-depleted cells following DSB induction. Knockdown of other G4 helicases (RTEL1, FANCJ) also promoted cell survival in SETX-depleted cells upon damage. These data suggest an interplay between G4 helicases and RNA:DNA hybrid resolution for DSB repair in active genes.

Altogether, these studies provide a better understanding of the specificity of DSB repair in transcriptionally active genes, notably through the identification of proteins involved in “Transcription-coupled DSB repair”.