FACULTY OF SCIENCE

Analysis of genome instability using genomic and bioinformatic approaches

Ph.D. Thesis

Kateřina Havlová

Supervisor: prof. RNDr. Jiří Fajkus, CSc.

Laboratory of Functional Genomics and Proteomics

Brno 2020

Bibliographic Entry

Author: Mgr. Kateřina Havlová
Faculty of Science, Masaryk University
National Centre for Biomolecular Research
Functional Genomics and Proteomics
Central European Institute of Technology, MU
Mendel Centre for Plant Genomics and Proteomics

Title of Thesis: Analysis of genome instability using genomic and bioinformatic approaches

Degree programme: Genomics and proteomics

Supervisor: prof. RNDr. Jiří Fajkus, CSc.

Academic Year: 2019/2020

Number of Pages: 80+33

Keywords: rDNA; ribosomal DNA; ribosomal DNA intergenic spacer; chromatin; Arabidopsis thaliana; Chromatin Assembly factor 1; Physcomitrella patens; G-quadruplex

Bibliographic Entry (translated from the Czech)

Author: Mgr. Kateřina Havlová
Faculty of Science, Masaryk University
National Centre for Biomolecular Research
Functional Genomics and Proteomics
Central European Institute of Technology, MU
Mendel Centre for Plant Genomics and Proteomics

Title of Thesis: Analysis of genome instability using genomic and bioinformatic approaches

Degree programme: Genomics and proteomics

Supervisor: prof. RNDr. Jiří Fajkus, CSc.

Academic Year: 2019/2020

Number of Pages: 80+33

Keywords: rDNA; ribosomal DNA; ribosomal DNA intergenic spacer; chromatin; Arabidopsis thaliana; Chromatin Assembly factor 1; Physcomitrella patens; G-quadruplex

Abstract

Ribosomal RNA genes (rDNA) are the most abundant and utilized genes in eukaryotes. They compose a vast portion of the genome and they are involved in the maintenance of the genome-wide chromatin structure. This thesis focuses on rDNA in two different model species – Arabidopsis thaliana and Physcomitrella patens.

Firstly, we present here the detailed characterisation of the variability in the sequence of Arabidopsis intergenic spacer (IGS), the regulatory region separating each two copies of rRNA genes. We present a new variant in the 3’ETS region of pre-rRNA, and the preferential association of 3’ETS variants with specific IGS arrangements. Next, we mapped the IGS variant rearrangements in Arabidopsis plants which underwent rDNA loss and subsequent rDNA recovery (plants with dysfunctional histone chaperone CAF1 and plants with restored CAF1 function, respectively). Overall, CAF1-deficient plants show less variability than wild-type plants. We have observed the selective loss of some IGS variants and sporadic generation of new IGS variants. In plants with restored CAF1 function, the spectrum of IGS variants resembles that of their parental mutants, suggesting that the rDNA recovery occurs through a relatively precise DNA synthesis-dependent homologous recombination mechanism.

Secondly, we present a computational analysis of the rDNA sequence in Physcomitrella patens which supports the hypothesis that G-quadruplex structures substantially contribute to the rDNA instability observed in Physcomitrella with dysfunctional quadruplex-unwinding helicase RTEL1. This is consistent with the ability of RTEL1 to resolve G-quadruplex structures during replication.

Abstract (translated from the Czech)

Ribosomal RNA genes (rDNA) are the most abundant and most heavily utilized genes in eukaryotes. They occupy a large part of the genome and take part in maintaining the chromatin structure of the whole genome. This thesis focuses on the rDNA of two different model species: Arabidopsis thaliana and Physcomitrella patens.

First, we present a detailed description of the sequence variability of the intergenic spacer (IGS) in Arabidopsis, a regulatory region separating each two copies of rRNA genes. We present a new variant in the 3’ETS region of pre-rRNA and the preferential association of 3’ETS variants with their specific IGS arrangement. Next, we mapped the rearrangements of IGS variants in Arabidopsis plants that underwent rDNA loss and subsequent rDNA recovery (plants with a dysfunctional histone chaperone CAF1 and plants with restored CAF1 function). Overall, CAF1-deficient plants show less variability than wild-type plants. We observed the selective loss of some IGS variants and the sporadic formation of new IGS variants. In plants with restored CAF1 function, the spectrum of IGS variants is similar to that of their parental mutants, which suggests that rDNA recovery in these plants proceeds through a relatively precise DNA synthesis-dependent homologous recombination mechanism.

Subsequently, we present a computational analysis of the rDNA sequence in Physcomitrella patens, which supports the hypothesis that G-quadruplex structures contribute substantially to the rDNA instability observed when the RTEL1 helicase is dysfunctional. This is consistent with the ability of RTEL1 to unwind G-quadruplexes during replication.

Acknowledgements

Here I would like to thank my supervisor prof. RNDr. Jiří Fajkus, CSc. and my consultant Mgr. Martina Dvořáčková, PhD. for their expert leadership and their invaluable and kind guidance throughout my studies.

I am also grateful to all colleagues from the Laboratory of Functional Genomics and Proteomics for their help and friendship.

I wish to acknowledge the support of my loving family, especially my husband and my parents, who always believed in me and did their best so I could finish this thesis.

This work was supported by ERDF (project SYMBIT, reg. no. CZ.02.1.01/0.0/0.0/15_003/0000477).

Statement

I hereby declare that I worked on this thesis independently and that I used only the literature listed in the bibliography.

Brno, 2020 ……………………………… Kateřina Havlová

Original publications and the author’s contribution

The thesis is based on two publications to which the author contributed.

Publication 1 (Attachment 1)

Havlová K., Dvořáčková M., Peiro R., Abia D., Mozgová I., Vansáčová L., Gutierrez C. and Fajkus J. (2016). Variation of 45S rDNA intergenic spacers in Arabidopsis thaliana. Plant Mol Biol. doi:10.1007/s11103-016-0524-1

The author contributed to this publication by performing the experiments: optimization of cloning, sequencing of the clones, and optimization of the samples for Pacific Biosciences sequencing. The author also did the sequence data analysis and participated in interpretation of all the results presented in the publication. Finally, the author participated in writing the manuscript.

Publication 2 (Attachment 2)

Goffová I., Vágnerová R., Peška V., Franěk M., Havlová K., Holá M., Zachová D., Fojtová M., Cuming A., Kamisugi Y., Angelis K. J. and Fajkus J. (2019). Roles of RAD51 and RTEL1 in telomere and rDNA stability in Physcomitrella patens. Plant J. doi:10.1111/tpj.14304

The author contributed to this publication by performing the sequence analysis of the rDNA intergenic spacer in Physcomitrella patens. The analysis included the prediction of potential G-quadruplex sites and the prediction of rRNA transcription start site.

Table of contents

1. Introduction
2. Chromatin
2.1. Nucleosomes
2.2. Chromatin interactions
2.3. Chromatin states
2.4. Chromatin maintenance during the genome replication
2.5. Chromatin Assembly Factor 1
2.6. Chromatin Assembly Factor 1 in Arabidopsis thaliana
2.7. Nucleolus
3. Ribosomal DNA
3.1. Ribosomal DNA loci
3.2. 5S rDNA
3.3. 45S rDNA
3.4. 45S rDNA intergenic spacer
3.5. 45S rDNA intergenic spacer in Arabidopsis thaliana
3.6. Regulation of rRNA gene variants in Arabidopsis thaliana
4. Ribosomal DNA instability in Physcomitrella patens
4.1. Physcomitrella patens
4.2. Ribosomal DNA instability in pprad51 and pprtel1 mutants
5. G-quadruplexes
5.1. Prediction of potential G-quadruplexes
5.2. G-quadruplexes as a source of genome instability
6. Aims of the study
7. Methods
7.1. Analysis of 45S rDNA intergenic spacer Arabidopsis thaliana
7.1.1. Plant material
7.1.2. DNA isolation
7.1.3. Cloning, plasmid DNA isolation and sequencing of clones
7.1.4. Single molecule real-time sequencing
7.1.5. Sequence analysis
7.1.6. Restriction fragment analysis, probe labelling, and hybridization
7.1.7. Restriction fragment analysis in silico
7.2. Analysis of the rRNA gene unit in Physcomitrella patens
7.2.1. Prediction of G-quadruplexes
7.2.2. Detection of the putative transcription start site of the rRNA gene unit
8. Results
8.1. Variability of the intergenic spacer in Arabidopsis thaliana
8.1.1. Variability of the intergenic spacer in WT plants
8.1.2. Variability of the intergenic spacer in fas mutants
8.1.3. Restriction fragment length polymorphism
8.1.4. Variability of the intergenic spacer in revertant lines
8.2. Prediction of G-quadruplexes in the rRNA gene unit of Physcomitrella patens
9. Discussion
10. Conclusion
11. References
12. Attachments

1. Introduction

Ribosomal DNA (rDNA) loci are large, typically several Mbp-long, repetitive sequences which give rise to the nucleolus in the interphase of the cell cycle. rDNA comprises hundreds of ribosomal RNA (rRNA) genes which are organised in a tandem head-to-tail fashion, with each two gene copies separated by an intergenic spacer (IGS). Numerous recent studies show that rDNA is vitally important for the fate of the cell. Not only is it crucial for ribosome biogenesis (Fromont-Racine et al. 2003) but it is also involved in the stress response (Boulon et al. 2010), in the cell cycle regulation (Tsai and Pederson 2014), and in the maintenance of genome-wide chromatin structure (Paredes and Maggert 2009; Quinodoz et al. 2018). Yet, interestingly, rDNA is one of the most unstable and fragile regions of eukaryotic genomes, with a high level of homologous recombination, and often serves as a marker of genome instability (Stults et al. 2009).

The unequal recombination between rRNA gene copies may cause intragenomic fluctuations in the rRNA gene copy number and even in the number of rDNA loci and thus puts rDNA among the fastest evolving segments of the genome (Eickbush and Eickbush 2007). Interestingly, the large family of rRNA genes evolves in a highly concerted manner. The coding sequences are conserved while the noncoding IGSs show considerable interspecific and even intra-individual variability. The variability in spacer sequences allows for distinguishing among highly unified rRNA genes and it is involved in the selective regulation of rRNA genes (Chandrasekhara et al. 2016; Abou-Ellail et al. 2011; Tucker et al. 2010).

The first part of this thesis focuses on the IGS variability in the model plant organism Arabidopsis thaliana. The IGS in Arabidopsis has not been comprehensively described despite its potential importance in the regulation of rDNA transcription and replication. The study of the IGS is complicated by the repetitiveness and the length of the sequence. The use of a modern single-molecule sequencing method allows us to overcome these issues. This thesis presents the results of an analysis of thousands of complete spacer sequences and describes the detailed sequence variation in the IGS of Arabidopsis wild-type plants. In addition, we put the results in the context of the genomic DNA analysis by restriction fragment analysis.

We further investigate the IGS in Arabidopsis mutants dysfunctional in Chromatin Assembly Factor 1 (CAF1). CAF1 is an evolutionarily conserved histone chaperone depositing (H3-H4)2 tetramers to DNA during the replication-dependent chromatin assembly. CAF1 mutants show a severe decrease in the rDNA copy number and thus represent a great model for the study of rDNA regulation (Mozgová et al. 2010). Next, we focus on the IGS variability in plants with restored CAF1 function, where the rDNA copy number is recovered to various extents (Pavlištová et al. 2016), in order to study the sequence rearrangements in the IGS region associated with the recovery.

The second part of the thesis concerns the model plant Physcomitrella patens. In Physcomitrella, rDNA instability was observed in pprad51 and pprtel1 mutants (Goffová et al. 2019). The proteins RAD51 and RTEL1 are essential factors involved in the double strand break DNA repair by homologous recombination and in the maintenance of telomeres, complex DNA-protein structures which protect the ends of chromosomes. RAD51 performs the homology search and strand invasion steps of homologous recombination while RTEL1 helps to protect the cell from inappropriate recombination by disassembling D-loops, the recombination intermediates (Barber et al. 2008; Bleuyard et al. 2006). RTEL1 can also dissolve G-quadruplex DNA structures (G4s) which otherwise block the extension of telomeres by telomerase as well as their replication by the conventional replication machinery (Vannier et al. 2012). Here, we focus on G4s and their contribution to the observed rDNA instability with the use of computational prediction tools which are based on the DNA propensity to form G4s.
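As a rough illustration of what sequence-based G4 prediction can look like, the sketch below scans both strands for a widely used canonical motif: four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The function name, the loop limits and the example sequence are assumptions chosen for illustration; this is not necessarily the tool or the parameters used in this thesis.

```python
import re

# Assumed canonical motif (illustrative only): four G-runs of >= 3 nt
# separated by loops of 1-7 nt.
G_RICH = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")
# On the complementary strand the same motif appears as C-runs.
C_RICH = re.compile(r"(?:C{3,}[ACGT]{1,7}){3}C{3,}")


def find_putative_g4(sequence):
    """Return sorted (start, end, strand) tuples for non-overlapping motif matches.

    Coordinates are 0-based with an exclusive end and refer to the given strand;
    "-" marks a motif located on the complementary strand.
    """
    seq = sequence.upper()
    hits = [(m.start(), m.end(), "+") for m in G_RICH.finditer(seq)]
    hits += [(m.start(), m.end(), "-") for m in C_RICH.finditer(seq)]
    return sorted(hits)


if __name__ == "__main__":
    example = "TTGGGTTAGGGTTAGGGTTAGGGAA"  # made-up test sequence
    print(find_putative_g4(example))
```

A pattern match of this kind only flags candidate sites; it says nothing about whether a G4 actually forms in vivo, which is why score-based predictors and experimental validation are typically used alongside it.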

2. Chromatin

The fate of every living cell is directed by the genetic information encoded in its DNA which is stored in the nucleus. DNA together with the interacting proteins forms the chromatin. The structure of chromatin determines the expression of the genetic information and ensures the stability of the whole genome. The basic element of chromatin is a nucleosome (Figure 1). DNA covered with nucleosomes is further organized in complex higher order structures that allow interactions of otherwise distant DNA regions (Figure 1A) (Gibcus and Dekker 2013). Chromatin is a dynamic complex and its spatial organization is variable between cells. No two nuclei have the exact same inter- and intrachromosomal interactions. Generally, genome regions sharing specific similar properties, e.g. epigenetic marks, transcription activity or gene density, are more likely to group and interact with each other.

2.1. Nucleosomes

The nucleosome is the most basic and the best studied structure of chromatin. A single nucleosome consists of 147-bp-long DNA wrapped around the surface of a histone octamer (Figure 1B) (Luger et al. 1997). The histone octamer is a globular structure, which is composed of a centrally located (H3-H4)2 tetramer which is flanked on either side by an H2A-H2B dimer. Due to their outer location, the H2A-H2B dimers are more susceptible to the loss and replacement than H3-H4 (Ramachandran and Henikoff 2015). The histones interact with each other with their helical parts while their N-terminal ends are exposed out of the nucleosome and often bear posttranslational modifications with various regulatory meaning (Luger et al. 1997). Apart from the canonical types of histones, most eukaryotes have genes encoding histone variants (Malik and Henikoff 2003). Histone variants and histone modifications modulate the stability of nucleosomes and the interactions with transcription and chromatin remodelling factors (Luger et al. 1997).

Nucleosomes, separated by nucleosome-free sections of various length, cover the whole DNA. As they create obstacles for the molecular machineries of transcription, replication and DNA repair, they are constantly unwound, moved or reshuffled by chromatin remodelling complexes (Li et al. 2005).

2.2. Chromatin interactions

In interphase nuclei, most DNA interactions occur within individual chromosomes and result in formation of chromosome territories (Figure 2) (Lieberman-Aiden et al. 2009; Zhang et al. 2012; Doğan and Liu 2018). The territories contain the compartments named A and B. The A compartments are generally gene-rich and transcriptionally active while the B compartments are relatively gene-poor and transcriptionally silent. The A/B compartments contain the topologically associated domains (TADs). TADs are created by local interactions within a DNA region of a limited size, typically tens of kilobase-pairs large (Doğan and Liu 2018). They were found in several plant species, but not in Arabidopsis (Dong et al. 2017; Feng et al. 2014).

The intrachromosomal interactions of chromatin are organized around anchoring points, the nuclear envelope or the nuclear bodies (Gibcus and Dekker 2013; Quinodoz et al. 2018). The nuclear envelope is composed of two membranes perforated with nuclear pores. Nuclear pores are specific protein complexes which bridge the nuclear membranes. The inner nuclear membrane is covered with nuclear lamina, a meshwork of filamentous proteins. Nuclear lamina is associated with transcriptionally inactive gene-poor chromatin regions – lamina-associated domains (LADs), whereas nuclear pores associate with active chromatin regions (Guelen et al. 2008; Capelson et al. 2010; Kind et al. 2015).

The nuclear bodies, such as the nuclear speckles and the nucleolus, are dynamic but relatively stable nuclear structures without delineating membranes (Quinodoz et al. 2018). The nuclear speckles are enriched with protein-coding genes transcribed by RNA polymerase II (pol II) (Spector and Lamond 2011; Reddy et al. 2012), whereas the nucleolus contains primarily ribosomal RNA genes transcribed by pol I (Figure 2) (Pederson 2011). The chromatin regions that interact with the nucleoli are named the nucleolus-associated domains (NADs). The nucleolar periphery plays an important role in organizing inactive chromatin regions (Picart-Picolo et al. 2019). NADs also include some active pol III-dependent tRNA-coding genes and other mostly inactive genomic regions, such as transposable elements, sub-telomeric regions and protein-coding regions (Németh et al. 2010; Pontvianne et al. 2016).

Figure 1. (A) DNA is packed in the nucleus in an organized manner (Baker 2011). DNA and the associated proteins together form the chromatin. DNA covered with nucleosomes is organized in complex higher order structures including chromatin fiber and chromatin loops. (B) A detailed model of a nucleosome core (Harp et al. 2000). The ventral and side view, the axis divides the histone octamer in two parts, each containing one of the histones H2A (violet), H2B (green), H3 (yellow), and H4 (blue).

Figure 2. The genomic DNA is packed in the nucleus in a specific way. Individual chromosomes tend to be close to each other (coloured lines) and regions on different chromosomes sharing similar properties can organize around a nuclear body, e.g. the nucleolus or nuclear speckles. Ribosomal RNA genes transcribed by RNA polymerase I (pol I) and centromeric regions associate with the nucleolus while the chromosome regions containing a high-density of RNA polymerase II (pol II) associate with nuclear speckles. These form spatial hubs of inter-chromosomal contacts (Quinodoz et al. 2018).

2.3. Chromatin states

Historically, chromatin has been classified into two states based on the condensation level: transcriptionally active, loosely packed euchromatin and silenced, more tightly packed heterochromatin (Grewal and Jia 2007). However, several distinct chromatin states can be distinguished based on the epigenetic marks, such as DNA methylation, post-translational histone modifications and histone variants (Filion et al. 2010; Roudier et al. 2011; Sequeira-Mendes et al. 2014). Distinct chromatin states contain specific combinations of epigenetic marks that determine overall chromatin behaviour in DNA maintenance processes, including transcription, replication, and DNA repair. In Arabidopsis, four main chromatin states were distinguished which preferentially contain active genes, repressed genes, silent repeat elements and intergenic regions (Roudier et al. 2011). Later, another more detailed study extended the number of monitored epigenetic marks and increased the number of distinct chromatin states to nine (Sequeira-Mendes et al. 2014).

2.4. Chromatin maintenance during the genome replication

During the replication of the genome, thousands of replication origins are synchronized to ensure that the genome is replicated only once and completely (Alabert and Groth 2012). From each replication origin, bi-directional replication forks emerge which pass through the parental DNA and leave two daughter strands behind. Nucleosomes are disrupted ahead of the replication fork and they are reassembled on the daughter strands by histone chaperones, a group of diverse negatively charged proteins which bind positively charged histones (Ransom et al. 2010). They allow the ordered formation of nucleosomes and prevent incorrect interactions between histones and DNA.

The central role in replication-coupled nucleosome assembly is played by the chaperones CAF1 (Chromatin Assembly Factor 1), ASF1 (Anti-silencing Factor 1), FACT (Facilitates Chromatin Transcription) and NAP1 (Nucleosome Assembly Protein 1) (Figure 3) (Yu et al. 2015; Rowlands et al. 2017). CAF1 and ASF1 maintain the histones H3 and H4 while FACT and NAP1 maintain the histones H2A and H2B. Nucleosomes reassembled on daughter DNA strands contain recycled parental or new histones, synthesized in the cytoplasm (Ransom et al. 2010; Ramachandran and Henikoff 2015). The parental histones may bear posttranslational modifications which are thus transmitted to newly synthesized DNA as epigenetic marks (Rowlands et al. 2017). Recycled and new H3/H4 do not mix in a single reassembled nucleosome (Xu et al. 2010). However, both recycled and new H2A-H2B dimers can freely associate with recycled or new (H3-H4)2 tetramers. In contrast, old and new histones H3/H4 mix in the nucleosomes assembled during replication-independent processes (Kumar and Leffak 1986).

Obstacles for the replication machinery often cause the stalling of a replication fork (Rowlands et al. 2017). When the stalling is prolonged, it increases the risk of replication fork collapse and DNA damage leading to mutations and genome instability (Alabert and Groth 2012). To prevent DNA damage, various factors stabilize the paused forks and aid the resumption of replication. Mutations in these factors can lead to checkpoint activation and genome instability (Ang et al. 2016; Polo and Almouzni 2015). Interestingly, the pausing of a replication fork also provides an opportunity for a change of the chromatin state (Alabert and Groth 2012).

The sites susceptible to replication fork stalling can vary both between cells and between different regions of the genome. The stalling often occurs in genome regions with repetitive DNA sequences, which include subtelomeres, telomeres, tRNA genes, rRNA genes, and centromeres. These regions are rich in complex secondary DNA structures, such as G-quadruplexes (G4s), and some of them (typically rDNA loci and telomeres) show an increased occurrence of DNA-RNA hybrid structures termed R-loops (Lindström et al. 2018; Toubiana and Selig 2018). Secondary structures need to be unwound or cleaved by specialized enzymes to allow replication fork progression (Paeschke et al. 2013; León-Ortiz et al. 2014).

Figure 3. The role of chaperones in DNA replication-coupled nucleosome assembly (Rowlands et al. 2017). The nucleosomes are disassembled by the chaperones ASF1 and FACT. CAF1 is guided by the interaction with PCNA. H3-H4 dimers are transferred from the chaperone ASF1 to CAF1, forming an (H3-H4)2 tetramer. The nucleosomes are completed by H2A-H2B dimers brought by the chaperone FACT or NAP1. The chaperone Rtt106 is involved in the delivery of new H3/H4 histones, but also interacts with ASF1, CAF1 and FACT and may coordinate the assembly of H3-H4 and H2A-H2B. New nucleosomes consist of either parental (dark blue/green) or newly synthesized (light blue/green) histones. The parental histones may bear epigenetic marks which are transmitted to newly synthesized DNA (yellow stars). New H3-H4 dimers, transported from the cytoplasm, are acetylated by histone acetyltransferases (red stars). These predeposition marks are subsequently removed. It is assumed that CAF1 is responsible for the maintenance of both old and new H3-H4 histones, but it is also possible that it works with the new H3/H4 histones only.

2.5. Chromatin Assembly Factor 1

The Chromatin Assembly Factor 1 (CAF1) is a histone chaperone which plays a central role in the nucleosome assembly during DNA replication and repair (Smith and Stillman 1991; Kamakaka et al. 1996; Gaillard et al. 1996; Rowlands et al. 2017). CAF1 is a protein complex of three subunits, highly conserved in eukaryotes. The three subunits are referred to as Fasciata 1 (FAS1), Fasciata 2 (FAS2), and Multicopy Suppressor of IRA 1 (MSI1) in plants (Kaya et al. 2001); p150, p60, and p48 in mammals (Verreault et al. 1996); and Chromatin Assembly Complex 1-3 (CAC1-3) in yeast (Kaufman et al. 1997). The two largest subunits FAS1/p150/CAC1 and FAS2/p60/CAC2 function exclusively in CAF1 while the smallest subunit MSI1/p48/CAC3 functions in several distinct chromatin remodelling complexes (Hennig et al. 2005).

CAF1 interacts with various proteins including histone chaperones ASF1 (Mello et al. 2002) and Rtt106 (Fazly et al. 2012), DNA helicases BLM and WRN (Jiao et al. 2004; Jiao et al. 2007), and PCNA, a ring-shaped protein complex of the replication fork (Shibahara and Stillman 1999). CAF1 binds an H3-H4 dimer, delivered by ASF1 (Mello et al. 2002), and two CAF1-H3-H4 complexes mediate the assembly of the histone tetramer (H3-H4)2 and its binding to the newly synthesized DNA (Mattiroli et al. 2017), to which CAF1 is guided by PCNA (Shibahara and Stillman 1999; Ben-Shahar et al. 2009). CAF1 deposits the replication-specific histone variant H3.1 but not the replication-independent histone variant H3.3 (Tagami et al. 2004). CAF1 deposits the H3/H4 histones newly synthesized and transported from the cytoplasm. However, its role in the deposition of the recycled histones is uncertain (Figure 3) (Rowlands et al. 2017).

CAF1 is essential for the maintenance of heterochromatin and for multicellular organism development (Yu et al. 2015; Cheng et al. 2019). It interacts with histone deacetylases (Martínez-Balbás et al. 1998), methyltransferases and other epigenetic regulators (Loyola et al. 2009; Yu et al. 2015), such as transcription repressor MBD1 (Reese et al. 2003) and heterochromatin protein HP1 (Quivy et al. 2004). In yeast, loss of function mutation of CAF1 leads to reduced position-dependent gene silencing at telomeres (Kaufman et al. 1997; Monson et al. 1997), centromeres (Dohke et al. 2008) and the silent mating type loci (Enomoto and Berman 1998; Dohke et al. 2008).

CAF1 is also involved in chromatin restoration following DNA repair, in particular, nucleotide excision repair (NER) (Gaillard et al. 1996; Polo et al. 2006) and double strand break (DSB) repair by homologous recombination (HR) and by nonhomologous end joining (NHEJ) (Lewis et al. 2005; Nabatiyan et al. 2006; Song et al. 2007). Depletion of CAF1 is lethal in animals (Houlard et al. 2006; Nabatiyan and Krude 2004). In yeast, CAF1 is not essential for cell viability; however, its loss causes delays in the cell cycle and increased sensitivity to UV radiation (Kaufman et al. 1997).

2.6. Chromatin Assembly Factor 1 in Arabidopsis thaliana

In Arabidopsis, the mutation in MSI1 is embryo-lethal (Hennig et al. 2005) and the mutations in FAS1/2 lead to dysfunction of CAF1 with serious phenotypic consequences. The fas mutants are smaller than wild-type plants and show abnormal leaf and flower morphology and disorganization of apical meristems (Figure 4) (Leyser and Furner 1992; Kaya et al. 2001). The phenotypic defects accumulate, and the viability and fertility of the mutants seriously decrease in late generations (Mozgová et al. 2010). Therefore, the mutants cannot be propagated beyond the ninth generation (Pontvianne et al. 2013).

Interestingly, when CAF1 is disrupted, the nucleosomes are preserved by an unknown CAF1-independent mechanism, likely involving the chaperone HIRA (Muñoz-Viana et al. 2017). However, the nucleosomes show a lower occupancy in the non-transcribed regions which is consistent with the importance of CAF1 in heterochromatin maintenance.

The telomeres and 45S rDNA loci are particularly sensitive to disruption of CAF1. The telomeres shorten substantially and the 45S rDNA repeat copy number decreases while other repetitive regions, 5S rDNA and centromeres, remain unaffected (Mozgová et al. 2010). The loss of telomeres and rDNA does not require passage through meiosis; however, it becomes more pronounced as the plants are propagated (Muchová et al. 2015).

While the telomeres of wild-type plants are ~3-5 kb long, the telomeres in mutants progressively shorten over three to five generations to ~1.5-1.7 kb (Mozgová et al. 2010). The telomere shortening was observed on all chromosomes to a similar extent. In addition, in late generations of mutants, the telomere shortening is associated with the increased formation of anaphase bridges.

In the diploid wild-type genome, 45S rDNA consists of approximately 1140 45S rRNA genes, the majority of which are silenced (Pontes et al. 2003; Pruitt and Meyerowitz 1986). In the second generation, the mutants still contain ~60% of rDNA and in the fifth generation, they contain only ~10% of rDNA (Mozgová et al. 2010). Thus, in late generations of the mutants there are only about 114 genes left. All the remaining genes are transcriptionally active and the overall 45S rRNA production is similar to that in the wild type (Mozgová et al. 2010; Pontvianne et al. 2013).
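The copy numbers implied by these percentages can be checked with a few lines of arithmetic. The sketch below assumes the approximate values quoted above (about 1140 genes in the wild type, with ~60% and ~10% remaining in the second and fifth generation); the variable and function names are illustrative only.

```python
# Approximate diploid wild-type 45S rRNA gene count quoted in the text.
WT_GENE_COPIES = 1140

# Approximate share of rDNA remaining in fas mutants, per generation (percent).
REMAINING_PERCENT = {"G2": 60, "G5": 10}


def remaining_copies(wt_copies, percent_remaining):
    """Estimate the number of rRNA gene copies left, rounded to whole genes."""
    return round(wt_copies * percent_remaining / 100)


if __name__ == "__main__":
    for generation, percent in REMAINING_PERCENT.items():
        print(generation, remaining_copies(WT_GENE_COPIES, percent))
```

Because the input percentages are themselves rough estimates, the resulting counts should be read as orders of magnitude rather than exact gene numbers.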

The CAF1-deficient plants show several characteristics which imply that their 45S rDNA instability is caused by homology-dependent DNA damage repair (HDR), during which a DSB site in the middle of the repetitive sequence is repaired by cutting the injured repeat out of the genome, thus shortening the total length of rDNA (Muchová et al. 2015). In CAF1-deficient plants, the cell cycle is defective, with an overall increased frequency of homologous recombination (Ramirez-Parra and Gutierrez 2007b; Kirik et al. 2006; Ramirez-Parra and Gutierrez 2007a); DSBs increase in 45S rDNA, especially in the transcribed regions, and they occur independently of replication (Muchová et al. 2015); and, most importantly, the disruption of HDR by a knock-out mutation of a recombinase, RAD51B, decreases the rate of 45S rDNA loss.

The mechanism responsible for the telomere shortening in fas mutants has not been clarified yet. Studies on the double mutants show that it differs from the mechanism responsible for the loss of 45S rDNA and that it is independent of the telomerase activity (Muchová et al. 2015; Jaške et al. 2013). Double mutants rad51b/fas, with a disrupted HR pathway, still display the telomere shortening (Muchová et al. 2015) and double mutants tert/fas, with a disrupted telomerase, display even greater telomere shortening than single fas and tert mutants (Jaške et al. 2013).

The plants with the restored CAF1 function, segregated from the fas1xfas2 genetic background, show a wild-type morphology and progressive recovery of telomeres and rDNA (Pavlištová et al. 2016). While telomeres recover within three to four plant generations, the extent and rate of the rRNA genes recovery is more variable. Several revertant lines were obtained which substantially differ in the rRNA genes copy number and their distribution among rDNA loci. In addition, the ratio of rRNA gene variants and their expression is also specific for distinct revertant lines.

Figure 4. Arabidopsis thaliana carrying a T-DNA insertion in fas2 gene (Mozgová et al. 2010). The severity of phenotype defects increases with the plant generations. The 3rd and the 5th generation of mutants are depicted (G3 and G5).

2.7. Nucleolus

The most distinctive nuclear sub-compartment, the nucleolus, is present in most eukaryotic cells. A single nucleus can contain several nucleoli, as in humans, or only one nucleolus, as in Arabidopsis (Picart-Picolo et al. 2019). The nucleolus is formed by specific chromosomal regions, called nucleolus organizing regions (NORs) (Pederson 2011). NORs contain genes for ribosomal RNA (rRNA) which are usually present in hundreds of copies, each two copies being separated by an intergenic spacer (IGS).

The nucleolus contains three major components: the fibrillar center (FC), the surrounding dense fibrillar component (DFC), and the granular component (GC) (Figure 5) (Pederson 2011). In a simplistic model, these components ensure the spatial separation of inactive rRNA genes (FC), pre-rRNA synthesis by pol I and early pre-rRNA processing (DFC), and late rRNA processing and pre-ribosome assembly (GC) (Raska et al. 2006). However, ribosome biogenesis is a more complex process and many of its steps are simultaneous and synchronized. The pre-rRNA transcript is processed simultaneously with its transcription and pre-ribosome assembly (Henras et al. 2015). The rRNA maturation is accompanied by transient interactions with many snoRNPs, complexes of proteins and small nucleolar RNAs (snoRNAs). For instance, the U3 snoRNP forms a complex with the protein nucleolin; this complex binds both pre-rRNA and rDNA and cleaves the pre-rRNA at a specific site in the 5’ETS (Sáez-Vasquez et al. 2004; Abou-Ellail et al. 2011).

The overall ribosome biogenesis activity is modulated throughout the cell cycle and the portion of actively transcribed rRNA genes changes based on the cell requirements (Dimario 2004; Grummt and Pikaard 2003). Therefore, the size of the nucleolus changes as well. In prophase, the nucleolus disassembles altogether, and it reassembles in late anaphase / early telophase (Dimario 2004).

Apart from ribosome biogenesis, the nucleolus participates in a wide range of other processes. A substantial part of the Arabidopsis nucleolar proteome, ~40 %, consists of proteins with functions unrelated to ribosome biogenesis, for instance the proteins of the 26S proteasome complex (Montacié et al. 2017). The nucleolus further functions as a reservoir of iron in plant cells (Roschzttardtz et al. 2011). The nucleolus also hosts a partial assembly of the signal recognition particle (SRP), a protein-RNA complex that recognizes and directs specific proteins to the endoplasmic reticulum (Politz et al. 2002). Yet another function of the nucleolus is nucleolar detention, a form of posttranslational regulation of protein mobility. The affected proteins contain a nucleolar detention sequence which is targeted by long noncoding RNAs transcribed from the IGS (Audas et al. 2012). These proteins are temporarily retained within the nucleolus and cannot act at their sites of activity. Similarly, the nucleolus is involved in the cell cycle-dependent regulation of telomerase trafficking (Tomlinson et al. 2006). Human telomerase is assembled in the dense fibrillar component during S phase and is detained in the nucleolus through its interaction with the protein nucleolin until it is transported to the Cajal bodies and eventually recruited to the telomeres (Lee et al. 2014; Khurts et al. 2004).


Figure 5. Schematic representation of the nucleolar structure. The nucleolus contains three components: fibrillar centers (yellow), dense fibrillar components (green), and granular components (pink) (Raska et al. 2006).


3. Ribosomal DNA

Genes encoding ribosomal RNA (rDNA) are the most abundant and utilized genes in eukaryotes. The primary role of rDNA is to be transcribed into rRNA, which is then incorporated into ribosomes, the organelles which supply the cell with proteins. There are two types of rRNA genes: 5S rRNA genes and 45S rRNA genes, the latter encoding 25S, 5.8S and 18S rRNAs (Layat et al. 2012). The molecules of 5.8S, 25S and 5S rRNA are incorporated in the large ribosome subunit while the 18S rRNA is incorporated in the small ribosome subunit (Figure 6A) (Ben-Shem et al. 2011).

Figure 6. (A) Schematic picture of the plant 80S ribosome (Layat et al. 2012). It has two subunits: the small 40S and the large 60S subunit. The large subunit consists of 25S, 5.8S and 5S rRNA and 50 ribosomal proteins. The small subunit contains 18S rRNA and 33 proteins. (B) Localization of 5S and 45S rDNA in Arabidopsis thaliana ecotype Columbia (Layat et al. 2012). The active 5S rDNA loci (in red, surrounded by a blue rectangle) are localized on the left arms of chromosomes 4 and 5. The inactive 5S rDNA loci (in red) are on chromosomes 3 and 5. The 45S rDNA loci are localized on chromosomes 2 and 4 (in green).

3.1. Ribosomal DNA loci

The rRNA genes are usually present in hundreds of copies which are arranged head-to-tail in tandem repeat arrays, termed rDNA loci, that constitute essential genome components. Each two neighbouring copies are separated by an intergenic spacer (IGS). The activity of the rRNA genes changes based on the cell requirements throughout the cell cycle and ontogenesis (Dimario 2004; Grummt and Pikaard 2003). A significant fraction of them, often >50 %, is permanently silenced (Grummt and Pikaard 2003). The reasons for the genome to maintain these extra copies are not well understood. They likely play a role in maintaining genome integrity (Ide et al. 2010).


Typically, rDNA is physically separated in two types of gene arrays, 5S and 45S rDNA. Another arrangement, the linked 45S-5S repeat array, is often found in plants, e.g. in the Asteraceae family (Garcia et al. 2010), and in primitive eukaryotes such as yeast (Nomura et al. 2013), but rarely in animals (Figure 7) (Sochorová et al. 2018). Recently, our laboratory demonstrated the linked 45S-5S arrangement in the moss Physcomitrella patens (Goffová et al. 2019).

The numbers of rDNA loci and their positions on chromosomes, analysed mostly by fluorescence in situ hybridisation (FISH), are summarized in an online resource, The Plant/Animal rDNA Database (Vitales et al. 2017; Sochorová et al. 2018). In both animals and plants, the average number of rDNA loci tends to be low, two and four sites per diploid chromosome set, respectively (Sochorová et al. 2018). However, in both groups there are also species with extraordinarily high numbers of rDNA loci.

Interestingly, rDNA loci can differ in number and size within a species, as observed e.g. in common bean (Pedrosa-Harand et al. 2006). Moreover, rDNA loci often display an uneven number of genes between homologous chromosomes of a single individual. The variability of the rDNA size originates in the intrinsic sequence repetitiveness of the region, which promotes the random occurrence of unequal crossovers. Therefore, to some extent the number of genes increases or decreases by chance.

The sequence of rDNA also has some unique characteristics. The coding regions are evolutionarily conserved and very similar even among distantly related organisms such as animals and plants (Ben-Shem et al. 2011). On the other hand, the noncoding IGSs differ substantially even between closely related species (Brown et al. 1972). However, within a species, the differences in IGSs are relatively small. This can be explained by the model of concerted evolution, which states that all members of the rRNA gene family evolve in a concerted manner rather than independently (Eickbush and Eickbush 2007). According to this model, the rRNA genes are homogenized by unequal crossovers and gene conversions. Any new mutation is slowly spread by these mechanisms; however, the variability is found primarily in the noncoding regions, where it is less strictly limited by purifying selection (Cvijović et al. 2018).

The homogenizing mechanisms are affected by the location of rRNA genes. The individual variants of IGS are often clustered together (Eickbush and Eickbush 2007). In hominoids, the IGSs located close to telomeres are very similar, but the IGSs distal to telomeres are more differentiated (Gonzalez and Sylvester 2001).

Figure 7. Two types of rRNA gene arrays (Nei and Rooney 2005). In most eukaryotes, the 45S rRNA genes are separate from 5S rRNA gene arrays. In yeast, the 5S rRNA gene is incorporated in the IGS of 45S rRNA gene unit to create the linked 45S-5S repeat arrays. 45S rRNA gene unit encodes 18S, 5.8S and 25/28S rRNA. The genes are separated by external and internal transcribed spacers (ETS and ITS).

3.2. 5S rDNA

5S rRNA is transcribed by pol III in the nucleoplasm and is subsequently transported to the nucleolus (Pederson 2000). 5S rRNA genes are present in a variable number of copies, usually fewer than 1000. Typically, they occur in tandem arrays that have interstitial or pericentromeric localization (Vitales et al. 2017; Sochorová et al. 2018).

Arabidopsis thaliana contains about 1000 copies of 5S rDNA per haploid genome, organised in 3-4 blocks (Figure 6B) (Campell et al. 1992; Cloix et al. 2002). Two blocks are situated in the pericentromeric regions of chromosomes 4L and 5L (L - left arm). The other two blocks, which are transcriptionally inactive, are in the pericentromeric regions of chromosomes 5R and 3L (R - right arm) (Cloix et al. 2002). The 3L 5S rDNA is dispensable as it is missing in some Arabidopsis ecotypes (Tutois et al. 2002). In the Columbia ecotype, two types of 5S rDNA repeat units exist: the major 502 bp long unit and the minor 251 bp long unit (Cloix et al. 2002). The coding region of the repeat covers only 120 nt and is highly conserved.

In some organisms, 5S rRNA genes can be linked to other multigene families, such as 45S rRNA genes, small nuclear RNA genes (Vierna et al. 2011; Cross and Rebordinos 2005) or histone protein coding genes (Cabral-de-Mello et al. 2010). A dispersed organisation of 5S rRNA genes has also been observed (Morzycka-Wroblewska et al. 1985).

3.3. 45S rDNA

A single 45S rRNA repeat unit contains three rRNA genes separated by two internal and two external transcribed spacers (ITS1, ITS2, 5’ETS, 3’ETS) in the following order: 5’ETS, 18S rRNA, ITS1, 5.8S rRNA, ITS2, 25S rRNA, and 3’ETS (Figure 8) (Long and Dawid 1980). The three genes of the repeat unit are transcribed together in a single transcript which is subsequently processed into three mature rRNA molecules. Individual repeat units are divided by intergenic spacers (IGSs), also termed nontranscribed spacers (NTSs). The length of the 25S rRNA varies slightly between species; therefore, one can also encounter the terms 25S (plants including Arabidopsis thaliana), 26S (wheat) or 28S (mammals). For the same reason, the 45S rRNA is often referred to as 35S in plants (Seitz and Seitz 2014).

Figure 8. General schematic view of human 45S ribosomal genes and their transcripts (Raska et al. 2004). A single gene unit consists of three genes, 18S, 5.8S and 28S rRNA, which are transcribed together in a single transcript. The three genes are separated by internal transcribed sequences (ITS1 and ITS2). At the borders of the unit, there are external transcribed sequences (5’ETS and 3’ETS). The units are arranged in tandem arrays and they are separated by intergenic spacers (IGSs).

45S rDNA is the nucleolar organising region (NOR) and during active rRNA synthesis it is transcribed by pol I and forms the nucleoli visible in the nucleus (Viktorovskaya and Schneider 2015; Scheer and Weisenberger 1994; Goodfellow and Zomerdijk 2012). NORs are frequently located adjacent to telomeres on eukaryotic chromosomes, as in the Arabidopsis thaliana genome (Sochorová et al. 2018; Roa and Guerra 2012). The Arabidopsis haploid genome contains approximately 570-750 copies of the 45S rRNA gene in two gene arrays situated in the subterminal regions of chromosomes 2 and 4, NOR2 and NOR4 (Figure 6B) (Pruitt and Meyerowitz 1986; Copenhaver and Pikaard 1996a). The 45S rRNA gene unit is approximately 10 kb long, of which the IGS covers 4.5 kb (Pruitt and Meyerowitz 1986; Gruendler et al. 1991). NOR2 and NOR4 are similar in size, each spanning 3.5-4.0 Mb (Copenhaver and Pikaard 1996b). These arrays make up 5 % of the Arabidopsis genome.
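The figures quoted above can be cross-checked with simple arithmetic. The following minimal sketch multiplies the copy number by the unit length and compares the result with the combined size of the two NORs. The haploid genome size used for the percentage (135 Mb) is an assumption introduced only for this illustration; it is not stated in the text.

```python
# Figures quoted in the text
UNIT_KB = 10.0          # approximate length of one 45S rRNA gene unit
COPIES = (570, 750)     # 45S rRNA gene copies per haploid genome
NOR_MB = (3.5, 4.0)     # size range of each of the two NORs

# ASSUMPTION (not from the text): haploid genome size, used only for the percentage
GENOME_MB = 135.0


def array_size_mb(copies, unit_kb=UNIT_KB):
    """Total length in Mb of a tandem array of `copies` units, each `unit_kb` kb long."""
    return copies * unit_kb / 1000.0


def percent_of_genome(size_mb, genome_mb=GENOME_MB):
    """Share of the (assumed) haploid genome occupied by `size_mb` of sequence."""
    return 100.0 * size_mb / genome_mb


size_from_copies = [array_size_mb(c) for c in COPIES]   # estimate from copy number
size_from_nors = [2 * s for s in NOR_MB]                # estimate from two NORs

for label, sizes in (("copies x unit length", size_from_copies),
                     ("2 x NOR size", size_from_nors)):
    low, high = sizes
    print(f"{label}: {low:.1f}-{high:.1f} Mb, "
          f"{percent_of_genome(low):.1f}-{percent_of_genome(high):.1f} % of genome")
```

The copy-number estimate gives 5.7-7.5 Mb and the NOR-size estimate 7.0-8.0 Mb, so both are of the same order and, under the assumed genome size, correspond to roughly 4-6 % of the genome.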

3.4. 45S rDNA intergenic spacer

The intergenic spacer (IGS) is a noncoding regulatory sequence which lies between each two 45S repeat units. The length of the IGS varies greatly among species, from ~1.5 kb (Oxytricha fallax) to ~40 kb (Mus musculus) (Spear 1980; Grozdanov, Georgiev, and Karagyozov 2003). It is a rapidly evolving region and its variation can be exploited in studies of genetic relationships between closely related species (Bhatia et al. 1996; Gorokhova et al. 2002).

In addition, the IGS sequences show relatively high intraindividual variability, which stands in contrast to the high homogeneity of the coding regions of 45S rDNA. IGS length and sequence heterogeneity has been observed in many species, e.g. in Arabidopsis (Gruendler et al. 1991), in molluscs (Guo et al. 2019), in mouse (Tseng et al. 2008), and in human (Caburet et al. 2005). The IGS variants may be useful genetic markers which can distinguish the individual 45S rDNA units and loci (Chandrasekhara et al. 2016; Abou-Ellail et al. 2011).

The primary function of the IGS is the regulation of rDNA transcription by pol I. The IGS contains a gene promoter (GP), transcription initiation and termination sites (TIS, TTS), and several blocks of tandem repeats, which function as transcription enhancers in some species (Doelling, Gaudino, and Pikaard 1993). The IGS may also contain elements regulating replication (Nomura, Nogi, and Oakes 2013) and pre-rRNA processing (Abou-Ellail et al. 2011). Since the IGS is a fast-evolving sequence with little similarity between species, pol I and the associated transcription factors are highly species-specific (Goodfellow and Zomerdijk 2012).


The heterogeneity of the IGS typically results from variation in the number of repeats and regulatory elements which may affect rRNA transcription efficiency (Gorokhova et al. 2002). The IGS often contains duplications of the GP, the spacer promoters (SPs). The SPs are usually present in 1-2 copies. However, in Drosophila, the IGS is composed of many tandemly repeated SPs that enhance transcription of pre-rRNA from the GP by recruiting factors or the transcription machinery (Grimaldi et al. 1990; Putnam and Pikaard 1992). The IGS also regulates rDNA activity in another way: through the noncoding RNA (ncRNA) (Jacob et al. 2012).

In mammals, RNA-seq analysis indicates that almost the entire IGS is transcribed, although the levels of the spacer RNAs are 1,000-fold lower than the levels of rRNAs (Agrawal and Ganley 2018; 2016; Santoro et al. 2010). Short noncoding transcripts originating from SPs bind to the Nucleolar Remodelling Complex (NoRC) and guide the selective rDNA silencing (Mayer et al. 2006; Mayer et al. 2008). Similarly, in plants, IGS transcripts are processed into 24 nt-long small interfering RNAs (siRNAs) which direct the methylation of rDNA, leading to its silencing (Tucker et al. 2010). This mechanism regulates the number of active rRNA genes and is also responsible for nucleolar dominance, an epigenetic phenomenon in which the rDNA inherited from one progenitor is selectively silenced in interspecific hybrids (Preuss et al. 2008).

In yeast, the IGS contains a pol II-dependent promoter (E-pro). Transcripts resulting from E-pro maintain the copy number of rRNA genes through a binary switch between unequal and equal sister-chromatid recombination (Kobayashi and Ganley 2005). The sister chromatids are aligned together by the protein complex cohesin, leading to an equal crossover. If rDNA repeats are deleted or inserted, the E-pro transcripts might trigger an unequal sister-chromatid crossover through cohesin dissociation, and the copy number is restored (Kobayashi and Ganley 2005).

3.5. 45S rDNA intergenic spacer in Arabidopsis thaliana

For a long time, only a single reference sequence of the IGS was known in Arabidopsis (Figure 9) (Gruendler et al. 1991). In this reference, the TIS lies 1835 bp upstream of the 18S rRNA gene. The TIS consists of the sequence motif TATATAGGG (the first transcribed nucleotide, +1, is underlined) which is highly conserved in plants and its mutation prevents pol I-dependent transcription (Doelling et al. 1993). The core GP was mapped to -55 to +6 nt surrounding the TIS (Gruendler et al. 1991; Doelling and Pikaard 1995).

In the middle part of the Arabidopsis IGS, there is high length heterogeneity due to a variable number of 20-21 bp long tandem repeats which form 1-3 blocks rich in SalI restriction sites, the SalI boxes (Gruendler et al. 1991). Each two SalI boxes are separated by an SP which is followed by a 24-30 nt-long poly-A stretch. The SalI box number and length as well as the SP number are variable (Abou-Ellail et al. 2011). The function of SalI boxes and SPs is not clear. They probably do not function as transcription enhancers (Doelling and Pikaard 1995; Wanzenböck et al. 1997). However, they may produce siRNAs mediating rDNA silencing (Abou-Ellail et al. 2011).

The 5’ETS is the most stable part of the IGS. It contains two stable blocks of repeats without length variability, denoted as D and C (Figure 9) (Gruendler et al. 1991). In contrast, the 3’ETS contains a variable number (3-5) of 47 bp-long R repeats with U3 snoRNP binding sites (Figure 9). The U3 snoRNP participates in pre-rRNA processing (Abou-Ellail et al. 2011). Four distinct combinations of R repeats found in Arabidopsis are referred to as rRNA gene variants, var1-4 (Figure 10) (Pontvianne et al. 2010). Based on their length heterogeneity, the variants can be distinguished by PCR with specific primers (Figure 11).

3.6. Regulation of rRNA gene variants in Arabidopsis thaliana

Var1 represents approx. 50 % of rRNA genes, var2 30 %, and var3 the remaining 20 % (Abou-Ellail et al. 2011). Var4 is a rare low-copy variant. The location of variants is NOR-specific (Figure 12). NOR2 contains var1 and var3, and NOR4 contains var2, var3 and var4. Under normal plant growth conditions, NOR4 is transcriptionally active. On the other hand, NOR2 is active only in early development and is then silenced (Chandrasekhara et al. 2016; Pontvianne et al. 2010). When rRNA genes are translocated from the inactive NOR2 to the active NOR4, the translocated genes become activated, which indicates that rRNA gene regulation at the NOR level might be more important than the hypothetical regulation based on sequence differences among gene variants (Mohannath et al. 2016).

The expressed rRNA genes are located inside the nucleolus in a euchromatic state while the silenced rRNA genes are excluded from the nucleolus in a heterochromatic state (Pontvianne et al. 2013). Many factors were identified to maintain the chromatin state of rRNA genes, e.g. histone deacetylase HDA6, methyltransferase MET1, nucleolar protein nucleolin, and histone chaperone CAF1. Dysfunction of these factors leads to the deregulation of rRNA gene variants (Pontvianne et al. 2010; Pontvianne et al. 2013).

Nucleolin is a major nucleolar protein involved in many processes, e.g. rRNA transcription and pre-rRNA processing. Its mutation results in rDNA decondensation, hypomethylation in the 5’ETS, increased SP expression with accumulation of siRNA, and activation of var1 rRNA genes (Pontvianne et al. 2010). Similarly, mutations of the histone deacetylase HDA6 gene and the DNA methyltransferase MET1 gene activate var1 rRNA genes (Pontvianne et al. 2013).

In CAF1-deficient plants, the rDNA copy number is progressively reduced to 10 % of the original level within 5-9 generations (Mozgová et al. 2010). The active copies are lost preferentially and the silenced copies are subsequently activated to guarantee stable rRNA production (Pontvianne et al. 2013). The process continues up to the point when all the remaining copies are activated (Pontvianne et al. 2013). The rDNA deregulation is apparent from the activation of var1 genes already in the 2nd generation of the mutants (Figure 11).


Figure 9. IGS in Arabidopsis. Dotplot of an IGS clone reveals the repetitiveness of the region (Gruendler et al. 1991). The IGS contains blocks of tandem repeats: R repeats in 3’ETS (Abou-Ellail et al. 2011), SalI box 1-3 (S-box) in NTS, and D/C repeats in 5’ETS (Gruendler et al. 1991). Spacer promoters (SP1-2) separate the SalI boxes and a gene promoter (GP) contains the TIS.

Figure 10. Schematic alignment of 3’ETS rRNA gene variants, var1-4 (Pontvianne et al. 2010). Variants are heterogeneous in length and can be distinguished by PCR with specific primers (arrows). The low sequence identity is depicted as dark and light grey.


Figure 11. The rRNA variants can be distinguished by a PCR with variant-specific primers (Pontvianne et al. 2013). (A) Semiquantitative PCR detection of rRNA gene variants in genomic DNA of WT, fas1 and fas2 mutants in the 2nd, 5th and 9th generation (G2, G5, G9). (B) Expression of rRNA gene variants detected by RT-PCR. In WT, var1 is silenced. In fas mutants, var1 is activated already in the 2nd generation. Actin2 (ACT2) is a positive control and normalizer.

Figure 12. The distribution of 45S rDNA variants. NOR2 contains var1 and a subtype of var3 (var3a), whereas NOR4 contains var2, var4, and two other subtypes of var3 (var3b/c) (Sáez-Vásquez and Delseny 2019; Chandrasekhara et al. 2016). In the early stages of plant development both NORs are active. NOR2 is subsequently silenced.


4. Ribosomal DNA instability in Physcomitrella patens

4.1. Physcomitrella patens

Physcomitrella patens is a representative of mosses (bryophytes), lower plants without well-developed vasculature (Figure 13). Physcomitrella has a unique feature among higher eukaryotes: highly efficient homologous recombination (HR). This has been successfully exploited in genome engineering as it allows easy gene targeting and generation of mutant alleles (Kamisugi et al. 2006). In Physcomitrella, HR dominates as the major pathway of DSB repair, while in other higher eukaryotes non-homologous end joining (NHEJ) generally prevails (Puchta 2005).

Figure 13. Development of Physcomitrella patens is characterized by a dominant haploid gametophyte (green) and a minor diploid sporophyte (red) (Roberts et al. 2012).

4.2. Ribosomal DNA instability in pprad51 and pprtel1 mutants

HR can be very precise because it requires a sequence homologous to the damaged region as a template. The template can be a sister chromatid, a homologous chromosome or another highly similar sequence (Puchta 2005). During HR, 3’ overhangs are created at the damaged site and they are covered with the RAD51 recombinase (Holloman 2011). RAD51 is responsible for homologous pairing and for the invasion of the single-strand overhang into the double strand of the homologous sequence, resulting in the formation of a displacement loop (D-loop). Apart from its role in DSB repair, RAD51 is also essential for alternative telomere lengthening in the absence of telomerase (Olivier et al. 2018).

The antagonist of RAD51, Regulator of Telomere Length 1 (RTEL1), is a helicase which promotes disassembly of D-loops and thus protects a genome from inappropriate recombination (Barber et al. 2008). RTEL1 also plays an essential role in facilitating genome replication through its interaction with PCNA (Vannier et al. 2013). Importantly, it participates in telomere maintenance as it disassembles telomeric loops (T-loops) and quadruplex structures (G4) which otherwise block the replication fork progression during S phase or block the extension of telomeres by telomerase (Vannier et al. 2012; Sarek et al. 2015; Vannier et al. 2013).

In Physcomitrella, the telomeres and rDNA are affected by the loss of RTEL1 function (Goffová et al. 2019). The pprtel1 mutants show telomere instability: the median telomere length is reduced from 1100 bp to 700 bp. The standard Physcomitrella haploid genome contains approximately 900 rDNA units organized in linked arrangements of 45S-5S (Rosato et al. 2016; Goffová et al. 2019). In pprtel1 mutants, the rDNA loss reaches 30 %. This decrease is steady and does not progress in subsequent passages. The transcription of rDNA is also reduced (Goffová et al. 2019).

An even greater reduction of rDNA (70 %) was observed in pprad51-1-2. This indicates that other RAD51-independent repair pathways, such as NHEJ, prevail in the absence of functional RAD51 (Goffová et al. 2019).
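To make the reported losses more tangible, the percentages above can be converted into approximate unit counts. This is a minimal sketch of the arithmetic only; it uses the approximate wild-type value of 900 units from the text, and the function names are introduced here for illustration.

```python
WT_RDNA_UNITS = 900  # approximate rDNA units per haploid genome (from the text)


def remaining_units(initial_units, loss_percent):
    """Units left after losing `loss_percent` % of `initial_units`."""
    return initial_units * (100 - loss_percent) / 100


def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100 * (before - after) / before


# rDNA loss reported for the two mutants
for mutant, loss in (("pprtel1", 30), ("pprad51-1-2", 70)):
    print(f"{mutant}: ~{remaining_units(WT_RDNA_UNITS, loss):.0f} rDNA units remain")

# Median telomere length in pprtel1: 1100 bp -> 700 bp
print(f"telomere shortening in pprtel1: {percent_reduction(1100, 700):.1f} %")
```

Under these figures, roughly 630 units remain in pprtel1 and roughly 270 in pprad51-1-2.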


5. G-quadruplexes

The most common form of DNA within living cells is B-DNA, a right-handed double helix with an A-T and C-G base-pairing (Watson and Crick 1953). However, DNA can adopt many alternative conformations which have a significant effect on replication, transcription and genome stability (Saini et al. 2013). One of them is a G-quadruplex (G4). It is a four-stranded nucleic acid secondary structure that can form within closely positioned guanines by Hoogsteen base-pairing. G4 consists of G-tetrads (Figure 14A), square and planar arrangements of four guanines that are stacked on top of each other. G-tetrads exist in several topologies with different orientation of DNA strands (Figure 14B).

Figure 14. G4 structure. (A) G-tetrad is a planar arrangement of four guanines. A monovalent cation occupies the central position. Two or more G-tetrads are stacked on top of each other (Huppert and Balasubramanian 2005). (B) G4s can adopt a range of different topologies in which the strands run parallel or antiparallel (Harkness and Mittermaier 2017).

The G-tetrads are stabilized by a monovalent cation, typically K+ or Na+, which coordinates the electronegative carbonyl groups of the guanines. Because of the size difference, Na+ ions are positioned in the plane of the G-tetrads, whereas K+ ions are positioned in between each pair of G-tetrads (Phan et al. 2006). Due to the stacking, the G4s are helical in nature although for convenience they are often depicted without helicity (Huppert 2008).

The formation of G4 is highly dependent on the underlying DNA sequence. A typical G4-forming sequence consists of four tracts of three or more guanines (G3+) which are separated by loops of variable sequence and length (N1-7) (Bochman et al. 2012). Apart from the canonical motif G3+N1-7G3+N1-7G3+N1-7G3+, structural studies also identified stable G4s with loops longer than 7 nt or with mismatches in G-tracts leading to bulges (Guédin et al. 2010; Mukundan and Phan 2013).

The stability of G4 is influenced by many factors including the length of G-tracts (Rachwal et al. 2007b), the length and the sequence of the loops (Rachwal et al. 2007a; Risitano 2004), the nature of the central cation: generally K+ stabilizes G4 better than Na+ (Bhattacharyya et al. 2016), the orientation of the strands, and the reactions leading to an alternative stable structure competing with the G4 association (Hardin et al. 2000). The mutually exclusive G4s in the overlapping DNA region can also compete against each other (Agrawal et al. 2014).

5.1. Prediction of potential G-quadruplexes

Methods to detect G4 in vivo are rather limited. Therefore, they are often accompanied by methods which evaluate the potential of DNA to form a G4 structure. This approach can provide a global picture of G4 occurrence on a whole-genome scale; however, the results remain to some extent theoretical. The methods include laboratory techniques, e.g. high-throughput G4-seq (Chambers et al. 2015), and computational prediction tools.

Based on the structural studies of G4 stability, several G4-motif-searching algorithms have been developed and applied to the genomes of various organisms. The algorithms typically search for the consensus motif G3+N1-7G3+N1-7G3+N1-7G3+ within a window of 40-100 nucleotides (Huppert and Balasubramanian 2005). With this simple approach, imperfect G4s with bulges and long loops remain unrecognized (Saini et al. 2013; Chambers et al. 2015). Importantly, it does not provide a way to quantitatively evaluate the stability of the putative quadruplex.
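The consensus-motif search described above maps directly onto a regular expression. The following is a minimal sketch in Python, not the implementation of any published tool: the function names are introduced here for illustration, loops are allowed to contain any of the four bases, and only non-overlapping hits are reported. The C-rich strand is handled by searching the reverse complement and converting the coordinates back.

```python
import re

# Canonical motif: a G-tract of >= 3 G, followed three times by
# (a loop of 1-7 nt + another G-tract of >= 3 G).
G4_MOTIF = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_canonical_g4(seq):
    """Return sorted (start, end, strand, motif) tuples for non-overlapping hits.

    Coordinates are 0-based, end-exclusive, and always refer to the input
    sequence; `motif` is the G-rich sequence as read on the reported strand.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), m.end(), "+", m.group()) for m in G4_MOTIF.finditer(seq)]
    for m in G4_MOTIF.finditer(reverse_complement(seq)):
        hits.append((n - m.end(), n - m.start(), "-", m.group()))
    return sorted(hits)


if __name__ == "__main__":
    telomeric = "GGGTTAGGGTTAGGGTTAGGG"            # four perfect G-tracts
    print(find_canonical_g4("AT" + telomeric + "AT"))
    # One interrupted G-tract (GAG) is enough to escape the strict motif:
    print(find_canonical_g4("GGGTTAGGGTTAGAGTTAGGG"))
```

The second call illustrates the limitation noted above: a single mismatch in one G-tract yields no hit, and the search returns no measure of stability.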

The G4Hunter algorithm addresses these issues (Bedrat et al. 2016). It does not search for the consensus motif but evaluates the G-richness (the fraction of G in the region) and GC-skewness (over-/under-abundance of G or C in the region) of a given sequence and gives a quadruplex propensity score as an output. The scoring is based on simple rules (Figure 15). Each position is scored between -4 and 4. A and T get a neutral score of 0. G and C are scored based on the surrounding sequence, which allows the G-richness of both DNA strands to be evaluated at once (1 for a single G surrounded by T/C/A, 2 for GG, 3 for GGG and 4 for G4+; -1 for a single C, -2 for CC, -3 for CCC and -4 for C4+). The score of C is negative to account for GC-skewness. The score is summed and normalized for the sequence length. The regions with a normalized score higher than an optional threshold (|1.4| by default) are identified as putative quadruplex-forming sequences. Recently, the algorithm was implemented in the Shiny/R framework (Lacroix 2019) and a user-friendly web application was developed (Brázda et al. 2019).

Figure 15. G4Hunter scoring rules (Bedrat et al. 2016). The A and T are scored with a neutral score 0. The G and C are scored based on the surrounding sequence. The positive score 1, 2, 3, 4 for a run of G, G2, G3, G4+ and negative score -1, -2, -3, -4 for a run of C, C2, C3, C4+. The score is summed and normalized for the sequence length.
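The scoring rules described above are simple enough to be sketched in a few lines. The Python below is an illustration of those rules only, not the published G4Hunter code: the function names are introduced here, and the default window length of 25 nt is an assumption made for this example (the text specifies only the threshold of 1.4).

```python
from itertools import groupby


def g4hunter_scores(seq):
    """Per-base scores: runs of G score +1..+4, runs of C score -1..-4, A/T score 0.

    Every base in a run receives the score of the whole run, capped at 4.
    """
    scores = []
    for base, run in groupby(seq.upper()):
        length = len(list(run))
        if base == "G":
            value = min(length, 4)
        elif base == "C":
            value = -min(length, 4)
        else:
            value = 0
        scores.extend([value] * length)
    return scores


def g4hunter_mean(seq):
    """Score summed over the sequence and normalized by its length."""
    scores = g4hunter_scores(seq)
    return sum(scores) / len(scores) if scores else 0.0


def windows_above_threshold(seq, window=25, threshold=1.4):
    """Start positions (0-based) and mean scores of windows with |mean| > threshold.

    `window` is an assumed value for this sketch; `threshold` follows the text.
    """
    scores = g4hunter_scores(seq)
    hits = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if abs(mean) > threshold:
            hits.append((start, mean))
    return hits


if __name__ == "__main__":
    print(g4hunter_scores("GGGTTAGGG"))      # G runs of three score 3 per base
    print(g4hunter_mean("CCCTAACCC"))        # the C-rich strand scores negatively
```

Because C runs receive negative scores, a strongly negative mean points to a putative G4 on the complementary strand, which is why the threshold is applied to the absolute value.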

Another algorithm, pqsfinder, also provides a prediction of imperfect G4s with a quantitative evaluation of their stability (Hon et al. 2017). Compared to G4Hunter, it uses a more complex searching and scoring model. The algorithm searches for four consecutive and possibly imperfect G runs interrupted by loops of semi-arbitrary lengths (Figure 16). Using a backtracking approach, it can also detect overlapping quadruplexes, unlike G4Hunter which reports only the locally best quadruplex. The putative G4s are scored by a scoring function which rewards each stacked G-tetrad, penalizes mismatches and bulges, and quantifies the destabilizing effect of the loops. The parameters of the scoring function were optimized by training the model on a large set of putative quadruplex-forming sequences identified by G4-seq in the human genome. Pqsfinder is implemented as a Bioconductor package (Huber et al. 2015; Hon et al. 2017).


Figure 16. Pqsfinder G4 searching (Hon et al. 2017). G4s must meet defined constraints; every G4 consists of four G runs (R1-R4) and three loops (L1-L3) which fit between the minimal and the maximal length. The overall G4 length is also limited by the maximal length. A G run can contain either a single bulge or a single mismatch, however, one of the G runs must always be perfect consisting only of G.
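The structural constraints listed in the caption can be expressed as a small validity check. The sketch below is a simplified Python illustration, not the pqsfinder implementation (which is an R/Bioconductor package and also includes the search and scoring steps). The length limits are illustrative parameters chosen for this example and should not be read as the package defaults, and a defect is modelled here simply as one contiguous non-G interruption inside a G-flanked run, which does not distinguish a bulge from a mismatch.

```python
import re


def is_perfect_run(run):
    """A perfect G run consists of G only."""
    return len(run) > 0 and set(run) == {"G"}


def has_single_defect(run):
    """Simplified defect model: G-flanked run with one contiguous non-G stretch."""
    return re.fullmatch(r"G+[^G]+G+", run) is not None


def satisfies_constraints(runs, loops,
                          run_len=(2, 11),    # illustrative limits, assumed here
                          loop_len=(0, 30),   # illustrative limits, assumed here
                          max_len=50):        # illustrative limit, assumed here
    """Check a candidate (four runs R1-R4, three loops L1-L3) against the constraints."""
    if len(runs) != 4 or len(loops) != 3:
        return False
    if not all(run_len[0] <= len(r) <= run_len[1] for r in runs):
        return False
    if not all(loop_len[0] <= len(l) <= loop_len[1] for l in loops):
        return False
    if sum(map(len, runs)) + sum(map(len, loops)) > max_len:
        return False
    # Each run is either perfect or carries exactly one defect ...
    if not all(is_perfect_run(r) or has_single_defect(r) for r in runs):
        return False
    # ... and at least one run must be perfect.
    return any(is_perfect_run(r) for r in runs)


if __name__ == "__main__":
    loops = ["TTA", "TTA", "TTA"]
    print(satisfies_constraints(["GGG", "GGG", "GAG", "GGG"], loops))  # one defective run
    print(satisfies_constraints(["GAG", "GAG", "GAG", "GAG"], loops))  # no perfect run
```

A candidate with one interrupted run passes, whereas a candidate in which every run carries a defect is rejected because no perfect run remains.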

Both tools, G4Hunter and pqsfinder, were successfully tested on the set of experimentally proven quadruplex-forming sequences (Hon et al. 2017; Bedrat et al. 2016).

5.2. G-quadruplexes as a source of genome instability

Duplex DNA with C-G pairing is very stable. Therefore, G4 formation depends on the denaturation of the duplex, as occurs during replication, transcription or recombination (Maizels 2006; Sarkies et al. 2010). On the other hand, once formed, G4s are more stable than the duplex, with a melting temperature 20-30 °C higher, and can pose a substantial obstacle to cellular processes (Lipps and Rhodes 2009). Thus, the folding and unfolding of G4s typically requires the aid of specific proteins with helicase activity, such as the helicase RTEL1, which has been demonstrated to unwind telomeric G4s (Vannier et al. 2012; Mendoza et al. 2016). Furthermore, the formation of G4 is compatible with the formation of an R-loop, which can then induce DNA damage (Magis et al. 2019). R-loops are another type of non-canonical structure, comprising nascent RNA hybridized with the DNA template and leaving the non-template DNA single-stranded.

G4s were also found in gene promoters, where they potentially regulate transcription (Rhodes and Lipps 2015), most notably in human proto-oncogenes (Brooks and Hurley 2009). For instance, in the promoter of the human proto-oncogene c-myc, G4 functions as an off-switch while duplex and single-stranded DNA function as an on-switch. Therefore, one of the strategies to suppress c-myc overexpression in cancer cells is the stabilization of G4s. Further, in mammalian cells, G4s were identified as a major source of telomere instability as they can cause replication fork stalling which then results in sporadic DNA breaks and loss of telomeric DNA (Sfeir et al. 2009).


6. Aims of the study

The first part of the thesis concerns the model plant Arabidopsis thaliana. In Arabidopsis, only a single reference sequence of the IGS has long been available. The aim of the study is to describe IGS variability in detail, to map the key regulatory elements in the region, and to obtain a representative set of IGS sequences. To explore the IGS variants, we employ two approaches: high-throughput PacBio Single Molecule Real-Time sequencing of the PCR product and the more traditional approach of cloning the PCR product followed by Sanger sequencing. In addition, we combine the in silico restriction analysis of the obtained IGS sequences with the restriction fragment length polymorphism (RFLP) analysis of the genomic DNA in order to support the existence of individual IGS variants in the genome.

Next, we aim to analyse the IGS sequence variability in plants with dysfunctional histone chaperone CAF1 and in plants with restored CAF1 function: fas mutants and revertants, respectively. In fas mutants, 90 % of rDNA is progressively lost in 5-9 generations, and in revertants rDNA is recovered, although to varying extents (Mozgová et al. 2010; Pavlištová et al. 2016). By comparing the IGS variants in fas mutants and revertants to the IGS variants in the WT, we aim to clarify whether the rDNA instability is associated with sequence rearrangements in the IGS region.

The second part of the thesis concerns the model plant Physcomitrella patens. In Physcomitrella, rDNA instability was observed in pprtel1 mutants. We aim to clarify whether G4 structures may contribute to the observed rDNA instability. We analyse the sequence of the rDNA unit for the potential to form G4 structures with two independent prediction tools.


7. Methods

7.1. Analysis of the 45S rDNA intergenic spacer in Arabidopsis thaliana

7.1.1. Plant material

All the Arabidopsis thaliana plants were on the Columbia 0 background (Col 0).

The T-DNA insertion mutants fas1-4 and fas2-4 are designated as NASC: N828822, SAIL_662_D10 and NASC: N533228, SALK_033228 (Exner et al. 2006). The fas1-4 mutants were segregated from heterozygous progeny and used in the first generation or grown for another two, five or seven generations (fas1-4 G1-/-, G2-/-, G5-/-, G7-/-). The fas2-4 mutants were segregated in the first generation (fas2-4 G1-/-).

The wild-type (WT) plants either had no mutant history or were segregated from the progeny of fas1 heterozygotes and grown for another two or five generations (G2+/+, G5+/+). The WT revertant lines 1, 3, 4 and 6 were segregated from the progeny of FAS1fas1/FAS2fas2 double heterozygotes (Pavlištová et al. 2016). These lines have a low (line 6), medium (line 3), or high (lines 1 and 4) rDNA content.

7.1.2. DNA isolation

Genomic DNA was isolated from leaves of 5-week-old plants using the DNA isolation protocol (Dellaporta et al. 1983) or using a NucleoSpin Plant II Midiprep Kit (Macherey-Nagel). A single sample contained a mix of leaves coming from 1 to 3 different individuals of the same genotype and generation (“siblings”). The DNA quality was checked by electrophoresis. The genotype was tested by PCR using a published protocol (Mozgová et al. 2010).

7.1.3. Cloning, plasmid DNA isolation and sequencing of clones

The IGS was amplified by PCR using a pair of primers designed in conserved regions of 25S and 18S rRNA genes (Table 1: 25SFw and 18SR, or 25SFw_seq and 18SR_seq). The PCR contained:
o 1.25 U of ExTaq polymerase (TaKaRa),
o 0.8 mmol·l⁻¹ of dNTP mix,
o 5 µl of 10× ExTaq Buffer,
o 50 ng of genomic DNA,
o 0.6 µmol·l⁻¹ of primers,
o water up to 50 µl.

The PCR conditions included incubation at 94 ºC for 30 s, followed by 30 cycles of 98 ºC for 10 s, 55 ºC for 30 s, 72 ºC for 4 min, with final incubation at 72 ºC for 20 min.

The product was cloned into E. coli. The cloning reaction was prepared using 4 µl of the fresh PCR product (approx. 100 ng), 1 µl of plasmid pCR™ (Invitrogen, 10 ng·µl⁻¹) and 1 µl of salt solution (from the TOPO® TA Cloning® Kit, Invitrogen). The reaction was incubated for 30 min at room temperature, then mixed with 20 µl of electrocompetent cells (E. coli ElectroMAX™ Stbl4™, Invitrogen) or chemocompetent cells (E. coli One Shot® TOP10, Invitrogen).

In the case of the electrocompetent cells, the salt in the cloning reaction was diluted 5×, the bacteria were transformed by electroporation at 2.5 kV, mixed with 1 ml of SOC medium, incubated at 30 ºC, 200 rpm for 90 min, then spread on Petri dishes with LB medium containing ampicillin (50 µg·ml⁻¹) and incubated at 30 ºC for 16 h.

In the case of the chemocompetent cells, the bacteria were transformed by heat shock at 42 ºC in a water bath for 45 s, mixed with 1 ml of SOC medium, incubated at 37 ºC, 200 rpm for 90 min, then spread on Petri dishes with LB medium containing ampicillin (50 µg·ml⁻¹), IPTG (10⁻⁴ mol·ml⁻¹) and X-gal (0.04 mg·ml⁻¹). Plates were incubated at 37 ºC for 16 h.

Primer       Sequence (5’-3’)
M13F         GTAAAACGACGGCCAG
M13R         CAGGAAACAGCTATGAC
25SFw        CCCAGTGCTCTGAATGTCAA
25SFw_seq    TACGCGACGGGGTATTGTAAG
18SR         GAATCGAACCCTAATTCTCCG
18SR_seq     CATTCGCAGTTTCACAGTCT
669Fw        GGATTCCTCGACCAGGACTTG
4213R        GAAGAAAGAAGACGGACGAATC
3443R        GCGCAAGACGACTATACCG
2703R        CCTGATACAACATCGGATTTTC

Table 1. Primers used for the IGS amplification and sequencing (in 5’-3’ direction).


Figure 17. Approximate location of sequencing primers.

The blue-white test was used to select clones with plasmid insertions. The length of the insertions was tested by PCR with M13F/M13R primers (Table 1). The plasmid DNA was isolated using GenElute™ HP Plasmid Miniprep Kit (Sigma Aldrich) or QIA® Spin Miniprep Kit (Qiagen). Sequencing of the clones was performed by Macrogen in South Korea using the Sanger method. The primers used in the sequencing reactions cover the whole IGS (Table 1, Figure 17). The complete sequences of IGS clones were assembled from the overlapping regions of sequencing reads approx. 1.1 kb long.

7.1.4. Single molecule real-time sequencing

The IGS was amplified by PCR and sequenced by Single molecule real-time (SMRT) sequencing (Pacific Biosciences). SMRT is a high-throughput method which produces thousands of several kb-long reads in a single sequencing run. A single SMRT read can cover the complete IGS. Two series of SMRT sequencing were conducted. In the first series, the samples were prepared by PCR with the use of Q5® High-Fidelity DNA polymerase (NEB) to amplify the shorter, approx. 3.5 kb long, fraction of the IGS. In the second series, Phusion® Hot Start II High Fidelity polymerase (Thermo Scientific) was used to amplify both fractions of the IGS, 3.5 and 4.5 kb long.

In the case of PCR with Phusion polymerase, 25-30 identical 10 µl PCRs were prepared and run in parallel to obtain 5 µg of the product required for SMRT. A single PCR contained:
o 0.2 U of polymerase,
o 0.3 mmol·l⁻¹ of dNTP mix,
o 2 µl of 5× GC Reaction Buffer (Thermo Scientific),
o 20 ng of genomic DNA,
o 0.25 µmol·l⁻¹ of primers Fx and Rx (Table 2),
o 2.5 % DMSO,
o 1.25 mmol·l⁻¹ MgCl₂,
o water up to 10 µl.

The PCR products were mixed and purified using QIA® PCR purification Kit (Qiagen). The process was repeated with ten different genomic DNA samples and different pairs of primers Fx/Rx, where x is 0-9 (Table 2). The primers were designed in the conserved region of the 18S and 25S rRNA genes and were distinguished by a unique barcode. The annealing temperatures were optimized for each pair of primers and were in the ranges 49-51 ºC (first 2 cycles) and 69-71 ºC (another 25 cycles). PCR conditions included incubation at 98 ºC for 30 s, followed by 2 cycles of 98 ºC for 10 s, 49-51 ºC for 30 s, 72 ºC for 2 min 20 s, followed by 25 cycles of 98 ºC for 10 s, 69-71 ºC for 30 s, 72 ºC for 2 min 20 s, with final incubation at 72 ºC for 5 min.

In the case of Q5 polymerase, 7-15 identical 25 µl PCRs were prepared and run in parallel. A single PCR contained:
o 0.5 U of polymerase,
o 0.2 mmol·l⁻¹ of dNTP mix,
o 2 µl of 5× Reaction Buffer (NEB),
o 2 µl of 5× Enhancer Buffer (NEB),
o 25 ng of genomic DNA,
o 0.25 µmol·l⁻¹ of primers Fx and Rx (Table 2),
o water up to 25 µl.

PCR conditions included incubation at 98 ºC for 30 s, followed by 2 cycles of 98 ºC for 10 s, 49 ºC for 30 s, 72 ºC for 3 min 20 s, followed by 25 cycles of 98 ºC for 10 s, 69 ºC for 30 s, 72 ºC for 3 min 20 s, with final incubation at 72 ºC for 5 min.

The quality and concentration of the PCR products were measured using electrophoresis and Nanodrop (Thermo Scientific). The PCR products, labelled by barcodes, were mixed equimolarly and sequenced by GATC Biotech in Germany.


Primer   Sequence (5’-3’)
F0       GGTAGGCGCTCTGTGTGCAGCTACGCGACGGGGTATTGTAAG
F1       GGTAGTCATGAGTCGACACTATACGCGACGGGGTATTGTAAG
F2       GGTAGTATCTATCGTATACGCTACGCGACGGGGTATTGTAAG
F3       GGTAGATCACACTGCATCTGATACGCGACGGGGTATTGTAAG
F4       GGTAGACGTACGCTCGTCATATACGCGACGGGGTATTGTAAG
F5       GGTAGTGTGAGTCAGTACGCGTACGCGACGGGGTATTGTAAG
F6       GGTAGAGAGACACGATACTCATACGCGACGGGGTATTGTAAG
F7       GGTAGCTGCTAGAGTCTACAGTACGCGACGGGGTATTGTAAG
F8       GGTAGAGCACTCGCGTCAGTGTACGCGACGGGGTATTGTAAG
F9       GGTAGTCATGCACGTCTCGCTTACGCGACGGGGTATTGTAAG
R0       CCATCAGAGTACTACATATGACATTCGCAGTTTCACAGTCT
R1       CCATCCGTGTGCATAGATCGCCATTCGCAGTTTCACAGTCT
R2       CCATCATGTATCTCGACTGCACATTCGCAGTTTCACAGTCT
R3       CCATCGACTCGACGCAGAGTCCATTCGCAGTTTCACAGTCT
R4       CCATCCGATGACGTCGCTGTACATTCGCAGTTTCACAGTCT
R5       CCATCCACACGTAGTCTGCGCCATTCGCAGTTTCACAGTCT
R6       CCATCGCTGTATCGCAGAGACCATTCGCAGTTTCACAGTCT
R7       CCATCCGAGCTATCTCATACTCATTCGCAGTTTCACAGTCT
R8       CCATCCATGAGTACTCGTCGCCATTCGCAGTTTCACAGTCT
R9       CCATCCAGCGACTGTGATACTCATTCGCAGTTTCACAGTCT

Table 2. Primers used for SMRT sequencing. Forward F0-F9 primers consist of an adapter sequence, a barcode sequence and the sequence of 25SFw_seq primer. Reverse R0-R9 primers consist of an adapter, a barcode and 18SR_seq primer.

7.1.5. Sequence analysis

The raw data from Pacific Biosciences sequencing were processed with the RS_ReadsOfInsert protocol in the SMRT analysis pipeline to produce consensus reads with quality corrections. We divided the consensus reads into groups by the barcodes that labelled the original genomic DNA samples. To obtain the complete IGS sequences, we retained only the high-quality reads, longer than 3 kb, that contain the beginning (CCCTCCCCTAA) and the end (ATCGATGAATG) of the IGS and at least one transcription initiation site (TATATAGGG). To search for the sequence motifs, we used custom scripts written in Python.
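The read filter described above can be sketched as follows. This is a minimal illustration, not the original script: the function names are hypothetical, reads are assumed to be already oriented from the 25S to the 18S rRNA gene, and the quality filtering done upstream by the SMRT pipeline is not reproduced.

```python
IGS_START = "CCCTCCCCTAA"   # beginning of the IGS (from the text)
IGS_END = "ATCGATGAATG"     # end of the IGS (from the text)
TIS = "TATATAGGG"           # transcription initiation site (from the text)
MIN_LENGTH = 3000           # "longer than 3 kb"


def is_complete_igs(read, min_length=MIN_LENGTH):
    """True if the read is long enough and contains start, end and at least one TIS."""
    read = read.upper()
    if len(read) <= min_length:
        return False
    start = read.find(IGS_START)
    end = read.rfind(IGS_END)
    if start == -1 or end == -1 or start >= end:
        return False
    # Assumption: the TIS must lie between the IGS start and end motifs.
    tis = read.find(TIS, start)
    return tis != -1 and tis < end


def filter_reads(reads):
    """Keep only complete IGS reads; `reads` maps read names to sequences."""
    return {name: seq for name, seq in reads.items() if is_complete_igs(seq)}
```

Reads sequenced in the opposite orientation would need to be reverse-complemented before this check.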


We searched the filtered reads for SalI boxes, which we defined as a close occurrence of two or more SalI restriction sites (GTCGAC). The borders of a SalI box were determined by the position of its first and its last SalI restriction site. Next, we searched the reads for core promoters (GP and SP), which we defined as regions from -55 to +6 nucleotides around a TIS sequence (TATATAGGG, the +1 is underlined). The IGS sequences containing a similar distribution of SalI boxes and GPs/SPs were grouped and aligned by ClustalW to create consensus sequences, one for each of the IGS types detected.
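The two definitions above can be expressed as a short sketch. It is not the original script, and two values are assumptions: `max_gap` (the text says only that the sites occur "close" together) and `plus_one_offset` (the underlining that marks +1 within TATATAGGG is not preserved in this text, so the last A is assumed here).

```python
import re

SALI_SITE = "GTCGAC"
TIS = "TATATAGGG"


def find_sites(seq, motif):
    """0-based start positions of all (possibly overlapping) motif occurrences."""
    return [m.start() for m in re.finditer(f"(?={motif})", seq)]


def find_sali_boxes(seq, max_gap=100):
    """Clusters of >= 2 SalI sites; returns (start of first site, end of last site).

    max_gap is a hypothetical threshold for what counts as "close".
    """
    boxes, cluster = [], []
    for pos in find_sites(seq, SALI_SITE):
        if cluster and pos - cluster[-1] > max_gap:
            if len(cluster) >= 2:
                boxes.append((cluster[0], cluster[-1] + len(SALI_SITE)))
            cluster = []
        cluster.append(pos)
    if len(cluster) >= 2:
        boxes.append((cluster[0], cluster[-1] + len(SALI_SITE)))
    return boxes


def core_promoters(seq, plus_one_offset=5, upstream=55, downstream=6):
    """Regions from -55 to +6 around each TIS (61 nt; there is no position 0).

    plus_one_offset is the assumed index of +1 within the TIS motif.
    """
    promoters = []
    for tis in find_sites(seq, TIS):
        plus_one = tis + plus_one_offset
        start, end = plus_one - upstream, plus_one + downstream
        if start >= 0 and end <= len(seq):
            promoters.append((start, end, seq[start:end]))
    return promoters
```

Reads could then be grouped by the resulting tuple of SalI box lengths and promoter positions before alignment.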

7.1.6. Restriction fragment analysis, probe labelling, and hybridization

The hybridization probes IGS1 and IGS2 were prepared by PCR using plasmid DNA containing a single IGS clone as a template. The PCR with IGS1F/IGS2R or IGS2F/IGS2R primers (Table 3) contained 1.25 U of Taq polymerase (NEB) and was performed according to the manufacturer’s instructions. The PCR product was extracted from a 1% agarose gel by a QIAX® II Gel Extraction Kit (Qiagen) and labeled with radioactive α-[32P]dCTP according to the Rediprime II DNA Labeling System protocol (GE Healthcare Life Sciences).

To digest genomic DNA, 900 ng of fas2 G2−/− or G3−/− and 500 ng of fas2 G2+/+ was mixed with 30 U of EcoRI and 15 U of HindIII (NEB); the larger amount of mutant DNA was used to compensate for the loss of rRNA genes in fas mutants. The mix was incubated at 37 °C for 16 h, then 15 U of EcoRI and 7.5 U of HindIII was added, the mixture was incubated for another 2 h, lyophilized, and subjected to electrophoresis on a 1.3 % agarose gel overnight at 40 V. The digested DNA from the gel was transferred by Southern blot to a Hybond™–XL membrane (GE Healthcare) and hybridized with the IGS1 or IGS2 probe in 0.25 M Na-phosphate pH 7.5, 7 % SDS, 0.016 M EDTA at 65 °C overnight. The membrane was washed three times in 2 × SSC, 0.5 % SDS at 65 °C. The hybridization process was repeated with chloroplast probes to check that the DNA was digested completely (Fajkus and Reich 1991).

The credit for conducting the restriction fragment analysis goes to I. Mozgová, L. Vansáčová and M. Dvořáčková (Havlová et al. 2016). It was not done by the author of the thesis.

7.1.7. Restriction fragment analysis in silico

All the IGS sequences, obtained by Sanger sequencing of the clones and by Pacific Biosciences sequencing, were searched for the restriction sites of EcoRI and HindIII to obtain in silico restriction fragments using custom scripts written in Python. The results were mapped to the restriction fragment analysis conducted in the laboratory.
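An in silico double digest of this kind can be sketched as below. This is an illustration under stated assumptions, not the original script: the function names are hypothetical, the sequence is treated as linear, and only the top strand cut positions are used (EcoRI cuts G^AATTC, HindIII cuts A^AGCTT).

```python
# Recognition site and offset of the top-strand cut within the site.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
}


def cut_positions(seq, enzymes):
    """Sorted 0-based cut positions for the given enzymes in a linear sequence."""
    cuts = set()
    for name in enzymes:
        site, offset = ENZYMES[name]
        start = seq.find(site)
        while start != -1:
            cuts.add(start + offset)
            start = seq.find(site, start + 1)
    return sorted(cuts)


def digest(seq, enzymes=("EcoRI", "HindIII")):
    """Return the restriction fragments of a linear sequence, in order."""
    seq = seq.upper()
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def fragment_lengths(seq, enzymes=("EcoRI", "HindIII")):
    """Fragment lengths in bp, for comparison with bands on a gel."""
    return [len(fragment) for fragment in digest(seq, enzymes)]
```

To compare with a Southern blot, only the fragments overlapping the probe region would be kept; that step is omitted here.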

7.2. Analysis of the rRNA gene unit in Physcomitrella patens

7.2.1. Prediction of G-quadruplexes

The sequence of the rRNA gene unit was downloaded from the NCBI Gene Database (organism Physcomitrella patens, locus LOC112273741, chromosome 20 from 135981 to 124466). To detect regions that are likely to fold into a G4, we used two independent tools: pqsfinder (Hon et al. 2017) with default settings, and a custom implementation of the G4Hunter algorithm (Bedrat et al. 2016) with the threshold score set to 1. The computation was conducted in the RStudio environment with the use of the Bioconductor package Gviz (Hahne and Ivanek 2016) for plotting the results.
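The G4Hunter scoring scheme can be sketched briefly. The analysis in this thesis was run in R; the Python version below is only an illustration of the published algorithm, with hypothetical function names. The window size of 25 nt is an assumption, as the text specifies only the threshold of 1.

```python
from itertools import groupby


def base_scores(seq):
    """Per-base G4Hunter scores: G runs score +1..+4, C runs -1..-4, A/T score 0.

    Each base in a run of n identical G (or C) gets min(n, 4) (or its negative).
    """
    scores = []
    for base, run in groupby(seq.upper()):
        n = len(list(run))
        if base == "G":
            value = min(n, 4)
        elif base == "C":
            value = -min(n, 4)
        else:
            value = 0
        scores.extend([value] * n)
    return scores


def g4hunter_windows(seq, window=25, threshold=1.0):
    """Return (start, end, mean score) for windows with |mean score| >= threshold."""
    scores = base_scores(seq)
    hits = []
    if len(scores) < window:
        return hits
    total = sum(scores[:window])
    for start in range(len(scores) - window + 1):
        if start > 0:
            # Slide the window by one base.
            total += scores[start + window - 1] - scores[start - 1]
        mean = total / window
        if abs(mean) >= threshold:
            hits.append((start, start + window, mean))
    return hits
```

Positive means indicate G-rich windows on the given strand and negative means indicate G-rich windows on the complementary strand. Overlapping hits would normally be merged into regions, which is omitted here.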

7.2.2. Detection of the putative transcription start site of the rRNA gene unit

Using bwa-mem v0.7.17-r1188 (Li and Durbin 2009), we mapped publicly available RNA-seq data (accession numbers: SRX1176830, SRX3381969, SRX3381970 and SRX2484017) to the region between the 5S and 45S rRNA genes. We identified a point where the coverage significantly increases compared to the rest of the intergenic spacer and denoted it as the putative transcription start site (TSS). The putative TSS is located at position 127235 bp on chromosome 20 and starts with the sequence “TATGTGGGGG”.
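The text does not specify how the coverage increase was located. One possible way, shown here purely as a hypothetical sketch, is to scan a per-base coverage profile (oriented in the direction of transcription) for the position with the largest ratio between the mean coverage downstream and upstream. The window size and pseudocount are arbitrary illustrative values.

```python
def coverage_jump(coverage, window=50, pseudocount=1.0):
    """Return (position, ratio) of the largest downstream/upstream coverage ratio.

    coverage: per-base read depth, ordered in the direction of transcription.
    The position is the first base of the downstream window.
    """
    best_pos, best_ratio = None, 0.0
    for pos in range(window, len(coverage) - window + 1):
        upstream = sum(coverage[pos - window:pos]) / window
        downstream = sum(coverage[pos:pos + window]) / window
        # The pseudocount avoids division by zero in uncovered regions.
        ratio = (downstream + pseudocount) / (upstream + pseudocount)
        if ratio > best_ratio:
            best_pos, best_ratio = pos, ratio
    return best_pos, best_ratio
```

A candidate position found this way would still need to be inspected manually against the read alignments.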


8. Results

8.1. Variability of the intergenic spacer in Arabidopsis thaliana

The length heterogeneity of the IGS was apparent already from the PCR amplification of the region (Figure 18). The PCR product contains two main fractions, ~3.5 and 4.5 kb-long.

To explore the IGS variants in detail, we used two approaches: high-throughput PacBio Single Molecule Real-Time sequencing of the PCR product and the more traditional approach of cloning the PCR product followed by Sanger sequencing. We obtained 1711 reads and 86 clones. The sequences were submitted to GenBank under the accession numbers KU994650-KU994739 (clones), KU992939-KU994649 (processed reads), and SRP071272 (raw reads).

We named the variants systematically to preserve the text clarity. The complete name of an IGS variant consists of 3’ETS variant (Pontvianne et al. 2010) and SalI box type (SalI box length in bp) separated by a dot. For example, the reference sequence (Chandrasekhara et al. 2016), which contains 3’ETS of var1, two 294 bp-long SalI boxes, and one 500 bp-long SalI box, is represented as var1.294.294.500 (Figure 20A).
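The naming scheme can be written down as a pair of small helper functions. The names are hypothetical and the sketch only illustrates the convention described above.

```python
def igs_variant_name(ets_variant, sali_box_lengths):
    """Build a name such as 'var1.294.294.500' from a 3'ETS variant and SalI box lengths (bp)."""
    return ".".join([ets_variant] + [str(length) for length in sali_box_lengths])


def parse_igs_variant_name(name):
    """Split a variant name back into the 3'ETS variant and the list of SalI box lengths."""
    ets_variant, *boxes = name.split(".")
    return ets_variant, [int(length) for length in boxes]
```

For the reference sequence, `igs_variant_name("var1", [294, 294, 500])` gives `var1.294.294.500`.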

Table 3 and Figure 19 summarize the IGS sequences obtained by PacBio sequencing, and Table 4 summarizes the IGS sequences obtained by cloning.

Figure 18. IGS length heterogeneity. Product of PCR with primers located in the conserved region of 25S rRNA gene and 18S rRNA gene. Genomic DNA comes from a segregated WT of the 2nd generation (G2+/+), fas1 mutant of the 2nd and 5th generation (G2-/-, G5 -/-).


Table 3. Number of IGS sequences obtained by PacBio sequencing. The sequences are classified based on the IGS variant and the source of genomic DNA (WT, revertant lines with low, medium or high rDNA content, and fas mutants in the 1st, 5th, and 7th generation). The names of individual IGS variants consist of the 3’ETS variant and the SalI box variant separated by a dot.

Figure 19. Graphical summary of distribution of PacBio reads among the IGS variants.

Table 4. Number of IGS sequences obtained by the cloning of the PCR product and the Sanger sequencing.

8.1.1. Variability of the intergenic spacer in WT plants

The dataset of WT IGS consists of 407 PacBio reads and 17 clones (Table 3 and 4). We found the sequence and length heterogeneity in all three parts of the IGS: the 3’ETS, the NTS, and the 5’ETS.


8.1.1.1. 3’ETS variants

The beginning of the IGS is delimited by the end of the 25S rRNA gene at the proximal site and by the first SalI restriction site at the distal site. It contains the 3’ETS, based on which the rRNA genes can be classified into four variants, var1-4 (Pontvianne et al. 2010). In our dataset, we detected all the variants except for var4, which can be explained by the low copy number of var4 in the genome. Surprisingly, we detected one previously undescribed variant, which we named var5. Var5 is relatively abundant in the WT genome (~13 % of PacBio reads, Figure 19). It is closely related to var3 but differs mainly in a 100 bp-long deletion at the distal site and can therefore be regarded as a var3 subtype (Figure 20B, 20C). The primers designed for PCR amplification of 3’ETS var1-4 (Pontvianne et al. 2010) are unable to detect var5 because of its deleted region. We tried to design primers specific for var5; however, the repetitive character of the IGS made this impossible. The first SalI restriction site is situated at position 778 bp (var1), 726 bp (var2), 655 bp (var3), 736 bp (var4) and 549 bp (var5), counted from the end of the 25S rRNA gene.

8.1.1.2. NTS variants

In the middle region of the IGS (NTS), we detected a significant length heterogeneity caused by differences in the number of 20 bp-long repeats composing the SalI box. The SalI box is either short (<1 kb) or long (>1 kb). Altogether, we found ten different variants of SalI boxes and we named them based on their length in bp (294, 314, 366, 458, 582, 705, 1005, 1045, 1254, 1505). The alignment of the SalI box variants is presented in Figure 20D.

The PCR amplification of the IGS shows two major fractions of the length ~3.5 and ~4.5 kb (Figure 18). This is consistent with the sequencing data. The shorter fraction of IGSs contains one rather short SalI box of this type: 294 (3 %), 366 (14 %), 458 (49 %), 582 (23 %) or 705 (1 %). The longer fraction of IGS contains two different SalI boxes separated by a spacer promoter: 294.1005 (1 %), 294.1045 (7 %), 294.1254 (0.2 %) or 314.1005 (1 %). Rarely, the IGS contains one very long SalI box: 1045 (0.5 %) or 1505 (0.2 %). Only the 294 and 314 SalI box types tend to be combined into longer IGS variants, while other short SalI box types are isolated.


8.1.1.3. Combination of 3’ETS and SalI box variants

Combinations of 3’ETS variants var1-5 with adjoining SalI boxes create 19 unique IGS variants (Figure 21). The SalI box variants tend to be connected to a specific 3’ETS variant (Table 5). For instance, the 366 SalI box occurred in 55 cases together with var5 while only once with var3. Similarly, the 458 SalI box is mostly connected to var3 and the 582 SalI box is typically found together with var1. The long SalI box type, 294.1045, tends to be connected to var2.

To conclude, each 3’ETS variant seems to be preferentially adjoined by a specific SalI box variant (var1.582, var2.294.1045, var3.458, var5.366). Although our methods provide only semi-quantitative data, these four types of IGS are clearly more frequent than the remaining variants. The least variability is associated with var5, which is always connected to the 366 SalI box (Table 5). On the other hand, var1 is associated with the highest number of different adjoining regions, forming eight different subtypes of IGS (Table 5). It was previously shown that var1 is the most abundant and the least expressed 3’ETS variant that often undergoes epigenetic reprogramming (Pontvianne et al. 2010; Abou-Ellail et al. 2011; Earley et al. 2010).

8.1.1.4. 5’ETS variants

The most distal region of the IGS, the 1841 bp-long 5’ETS, is situated upstream of the 18S rRNA gene (Gruendler et al. 1991). It contains two tandem 310 bp-long repeats, termed C1 and C2, which differ from each other in 4 SNPs at position 28, 130, 182, and 274 bp (Gruendler et al. 1991). Although the length heterogeneity in the 5’ETS is rare, we detected a 310 bp-long deletion corresponding to a C repeat in 4 % of sequences. We denoted the major variant, containing both C repeats, as varA and the variant with a deletion, containing only one C repeat, as varB. Based on the SNPs, we conclude that the C repeat in varB results from various combinations of original C1 and C2 repeat rather than selective deletion of either C1 or C2 (Figure 22). The varB does not seem to be linked to a specific 3’ETS or SalI box variant. It was found in the following IGS variants: var3.458, var3.582, var1.582, var1.366, and var5.366.


Figure 20. Alignments of IGS variants. (A) Schematic view of the reference var1.294.294.500 (Chandrasekhara et al. 2016). (B) Alignment of the 3’ETS variants found in WT. Var4 was not detected in our dataset and therefore it is not included in this alignment. 3’ETS variants are depicted from the end of the 25S rRNA gene to the first SalI restriction site. Lighter shades represent lower sequence identity. (C) Alignment of 3’ETS variants. Positions containing sequence differences are shown in full, while regions containing identical bases are displayed as a green box with a number representing the length of the region with identical bases. (D) Alignment of SalI box variants. Lighter shades represent lower sequence identity. Numbers represent the length of individual SalI boxes in bp. SalI boxes 633 and 797 (in red) are found only in the fas1 mutant.


Figure 21. Distribution of promoters and SalI boxes in IGS variants (WT). Lines represent individual IGS variants. SalI boxes are in green, promoters are in red. The numbers above each line represent the length (in bp) of the corresponding region. The name of an IGS variant consists of 3’ETS variant and SalI box variant separated by a dot.

Table 5. The SalI box variants tend to be associated with specific 3’ETS variants. The 366 SalI box is typical for var5, the 458 SalI box is typical for var3, and the 582 SalI box is typical for var1.


Figure 22. We identified two 5’ETS variants varA/B, here depicted from the TIS “TATATAGGG” to the beginning of 18S rRNA. The canonical varA contains two 310 bp-long repeats, termed C1 and C2, which differ from each other in 4 SNPs at position 28, 130, 182, and 274 bp (Gruendler et al. 1991). VarB contains only one C repeat. Based on the SNPs, we conclude that the C repeat in varB results from various combinations of original C1 and C2 repeat rather than selective deletion of either C1 or C2.

8.1.2. Variability of the intergenic spacer in fas mutants

The dataset of fas mutant IGSs consists of 610 PacBio reads and 69 clones (Table 3 and 4). The data come from three generations of mutants (fas1 G2, G5 and G7 and fas2 G2). In fas mutants, rRNA genes are progressively lost (Mozgová et al. 2010) and this is reflected in our data by the decreased IGS variability. While in the WT we found 19 IGS variants, in the fas mutants we detected only 6-10 variants for a given generation (Table 6). Some of the IGS variants are apparently exclusive to mutants and were not detected in the WT (Figure 23); these include var3.294, var1.1045, var3.1045, var1.294.633, var3.294.1045 and var4.797 (Table 6). The distinct occurrence of these variants may result from single-strand annealing recombination events, which are the presumed mechanism responsible for rDNA loss in fas mutants (Muchová et al. 2015). Another plausible explanation is that the mutant-exclusive variants went undetected in the WT due to their low copy numbers and the limited capacity of the sequencing, which was saturated with the major IGS variants. In fas mutants, by contrast, the overall copy number of rRNA genes is reduced, which makes the rare IGS variants more likely to be detected.

Table 6. There are 6-10 different IGS variants in a single generation of fas mutants. Variants exclusive to mutants (that were not found in the WT) are labelled by asterisks.

Figure 23. IGS variants identified in the WT, fas mutants and revertants.


8.1.3. Restriction fragment length polymorphism

Using in silico prediction of restriction sites, we created restriction fragments of IGS variants and compared them to the signals visible in the RFLP analysis (Figure 24). The genomic DNA was digested with EcoRI and HindIII, separated by agarose gel electrophoresis, blotted and hybridized with IGS1 probe specific to the 3’ETS (Figure 24A, 24C). The most abundant IGS variants could all be mapped to a strong signal. Some of the less abundant variants mapped to a smeared signal and their existence could not be sufficiently supported by the RFLP. Three relatively strong signals could not be assigned to any of the variants described in our sequencing analysis, which suggests that our dataset is not complete and other variants exist in the genome, including the reference var1.294.294.500 (Chandrasekhara et al. 2016).

Next, the genomic DNA was digested with EcoRI and Southern-hybridized with the 5’ETS-specific probe, IGS2 (Figure 24B, 24C). The 5’ETS variants varA/varB could be mapped to the signals. The varB signal is considerably weaker than the varA signal, which corresponds to the sequencing data where varB is a rare variant (4 %). In a significant subset of 18S rRNA genes, the RFLP analysis also revealed an incomplete DNA digestion caused by CpG methylation. The inhibited EcoRI restriction site is situated in the 18S rRNA gene. The incomplete DNA digestion produces varA/varB fragments that are 1506 bp longer and therefore migrate more slowly (Figure 24B).


Figure 24. In silico analysis of IGS restriction fragments. (A) Genomic DNA from fas1 (G2) was digested with EcoRI and HindIII and hybridized with the IGS1 probe which covers the 3’ETS. The IGS variants were matched to the image based on their in silico restriction fragments. The observed RFLP is caused by heterogeneous SalI box regions in combination with the presence or absence of internal EcoRI and HindIII sites. Three relatively strong signals (labelled by question marks) could not be assigned to any of the variants, suggesting that there are still-unidentified IGS variants in the genome. The unknown signals could also be explained by some combinations of overlapping cytosine methylations which prevented EcoRI from complete DNA digestion at a subset of EcoRI sites, resulting in longer fragments. M represents a molecular weight marker (M1-1 kb DNA ladder, M2-2-log ladder). (B) Genomic DNA from fas1 (G3) was digested with EcoRI and hybridized with the IGS2 probe, which visualizes the polymorphism in the 5’ETS. (C) Positions of the IGS1 and IGS2 probes.

8.1.4. Variability of the intergenic spacer in revertant lines

The revertant lines (revertants) are FAS1FAS1/FAS2FAS2 plants segregated from the progeny of a cross between fas1 and fas2 plants. In revertants, the restoration of CAF1 leads to rRNA genes recovery, however, the recovery is uneven and asymmetric in the gene copy numbers and distribution of rDNA loci (Pavlištová et al. 2016). In the parental mutants, only a subset of IGS variants remained present, thus without major rearrangements during the process of rDNA recovery, the spectrum of IGS variants in revertants cannot expand to the WT level.

We selected four lines as representatives of revertants with a low (line6), medium (line3), or high (line1 and line4) amount of rDNA and sequenced the IGS in these representatives to study the rearrangements during rDNA recovery. The expression of rRNA gene variants is altered in revertants (Pavlištová et al. 2016). Unlike the WT, these lines express var1 rRNA genes. Line1 and line4 have no detectable var2 in their genome (based on PCR). Further, line4 has completely rearranged NORs. In the WT, rDNA is evenly distributed between the active NOR4 and the silenced NOR2 (Chandrasekhara et al. 2016). In line4, the majority of rDNA is present on chromosome 2, which forms the active NOR2 associated with the nucleolus (Pavlištová et al. 2016).

The sequencing data show that the loss of the IGS variability in the parental mutants propagates to the revertants. Similar to fas mutants, revertants show 10-12 variants for a single line compared to the 19 variants in the WT. Var1.582 is preferentially recovered in line1 and line4 and var3.458 in line3 and line6. Both variants belong to the most abundant types in the WT and are still present in fas1 G7. Most of the revertant IGS variants were found in both WT and fas mutants (Figure 23). We observed only a sporadic formation of new variants.


8.2. Prediction of G-quadruplexes in the rRNA gene unit of Physcomitrella patens

To identify the factors contributing to the rDNA instability observed in pprtel1 mutants, we analysed the rDNA unit for the potential to form G4 structures using G4Hunter and pqsfinder. We observed that the spacer between the 5S and 18S rRNA genes shows conspicuous clustering of potential quadruplex-forming sequences whose stability scores, as computed by the prediction tools, are comparable to those of the telomeric DNA repeats (Figure 25). For comparison, the score of plant/human telomeric DNA is 60/64 when measured by pqsfinder and 1.4/1.5 when measured by G4Hunter.

By mapping publicly available RNA-seq data to the 5S-18S spacer, we were able to roughly identify the position of transcription start site (TSS) of 45S pre-rRNA. Particularly strong G4 sites upstream of 18S rRNA gene are situated downstream of the TSS and thus they are a part of the pre-rRNA transcript.

Figure 25. The distribution of G4s over the rRNA gene unit (the reference sequence downloaded from NCBI Gene Database: Physcomitrella patens LOC112273741 chr20 from 135981 to 124466 bp). The G4s were predicted from the sequence using pqsfinder (Hon et al. 2017) and G4Hunter (Bedrat et al. 2016). The putative TSS was set by mapping RNA-seq data to the reference sequence.


9. Discussion

rRNA genes in various organisms have been the subject of numerous studies concerning their dynamic evolution and regulation of expression. The model of concerted evolution has been demonstrated to be responsible for high sequence similarity of rRNA genes within a species (Eickbush and Eickbush 2007). The rapid sequence homogenization can be attributed to homologous recombination events. Unequal crossover is considered the major driving force in the evolution of rRNA genes, with the sister chromatid exchange occurring more often than the exchange between homologs (Eickbush and Eickbush 2007).

This is apparent in Arabidopsis thaliana, where the rRNA genes within a single NOR are more similar to one another than they are to the rRNA genes on another chromosome (Copenhaver and Pikaard 1996b). Consistent with the homogenizing tendency of rRNA genes, in revertant lines, where the amount of rDNA was reduced due to fas1 or fas2 mutation, we observed that a single type of IGS variant is preferentially recovered after reintroduction of functional FAS alleles and that the rDNA rearrangements giving rise to new IGS variants during rDNA recovery are relatively rare.

Overall, sequence variation in rDNA is largely limited to the IGS, which contains genetic elements controlling the transcription: gene and spacer promoters, enhancers and terminators. The rRNA expression is regulated by epigenetic mechanisms at the level of individual rRNA genes and at the same time at the level of the whole rDNA loci (Pontvianne et al. 2013; S. Preuss and Pikaard 2007). It is important to know the actual sequence structure of the IGS and its natural variation so that the contributions of genetic and epigenetic mechanisms can be assessed realistically. Although Arabidopsis thaliana has been frequently used as a model organism in rDNA studies, only a reference IGS sequence (referred to as var1.294.294.500 in this thesis and used in Chandrasekhara et al. 2016), artificially assembled from sequence fragments, was available before our study (Havlová et al. 2016) was published.

Interestingly, our IGS data differ substantially from the reference var1.294.294.500. The reference contains three SalI boxes in a row separated by two spacer promoters. Yet, we found only one IGS clone (var2.294.89.273) that follows the same arrangement, and even this one differs from the reference in the length of the SalI box.

Importantly, our results associate previously known variants in the 3’ETS (Pontvianne et al. 2010) with their IGS sequence context, and reveal a novel 3’ETS variant, var5. The distribution of 3’ETS variants on individual NORs was described in detail by Chandrasekhara et al. (2016), who showed that var2 rRNA genes are located on the transcriptionally active NOR4, while var1 rRNA genes are located on the inactive NOR2. Var3 rRNA genes are distributed on both NORs, but the var3 copies on NOR4 carry a HindIII restriction site. The specific location of var5 was not studied; however, most var5 IGS sequences contain a HindIII site (data not shown), and thus var5 rRNA genes are probably located on the active NOR4.

Mutants in the FAS1 and FAS2 genes represent a valuable model for the study of rDNA dynamics since they lose around 20 % of their rDNA per generation. Previous studies showed that transcriptionally active copies at NOR4 are depleted first, then the inactive var1 copies are activated and subsequently also depleted (Muchová et al. 2015; Pontvianne et al. 2013). Consistent with these studies, var2 was rarely detected in fas1 mutants by either cloning or SMRT sequencing (Table 3 and Table 4), while in the WT this IGS variant represents around 10 % of SMRT reads. Further, the abundance of the frequently occurring inactive var1.582 (around 20 % in the WT) shows a dramatic decrease between generations 5 and 7 of fas1 mutants (Figure 19).
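To give a sense of scale for the loss rate quoted above, the following minimal Python sketch computes the fraction of rDNA that would remain after a given number of generations. It assumes, purely for illustration, a constant proportional loss of 20 % per generation; the function name and the constant-rate model are assumptions of this sketch and not a model fitted to the fas data.

```python
def remaining_rdna_fraction(generations: int, loss_per_generation: float = 0.20) -> float:
    """Fraction of the original rDNA left after `generations`.

    Assumes the same proportional loss in every generation, which is a
    simplifying assumption made only for this illustration.
    """
    if generations < 0:
        raise ValueError("generations must be non-negative")
    if not 0.0 <= loss_per_generation <= 1.0:
        raise ValueError("loss_per_generation must lie between 0 and 1")
    return (1.0 - loss_per_generation) ** generations


if __name__ == "__main__":
    # Print the expected remaining fraction for a few mutant generations.
    for g in (1, 3, 5, 7):
        print(f"G{g}: {remaining_rdna_fraction(g):.1%} of the original rDNA")
```

Under this simplified assumption, roughly a third of the original rDNA would remain in generation 5 and about a fifth in generation 7; the real course of rDNA loss may deviate from such a constant rate.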

Aside from the rDNA loss, it is possible that some reorganization occurs inside the IGS in fas1 plants due to the increased levels of non-allelic homologous recombination (Kirik et al. 2006). Two additional SalI box types, 633 and 797, are found in mutants. Their alignment to the WT SalI box types shows that 633 might have resulted from a deletion in 1045, while 797 might have resulted from the fusion of 314 and 705. It seems that fas mutants and revertant lines with a fas mutation history display an acceleration of concerted evolution, as demonstrated by the loss, gain and spreading of specific IGS variants within a limited number of generations. This corresponds to the generally increased levels of homologous recombination events (Kirik et al. 2006; Endo et al. 2006) and the direct involvement of these events in the loss of rDNA in fas mutants (Muchová et al. 2015).

rDNA loci are particularly sensitive genomic regions as they are highly transcribed during almost the entire cell cycle, including S phase, when the integrity of rDNA is endangered during replication (Dvořáčková et al. 2015; Dvořáčková et al. 2018). The fragility of rDNA results from the intrinsic character of the sequence, which is repetitive and prone to forming complex secondary structures such as G-quadruplexes (G4s) and R-loops. In these features, rDNA shows a striking similarity to telomeres. In mammalian cells, G4s were identified as a major source of telomere instability as they can cause replication fork stalling, which results in sporadic DNA breaks and loss of telomeric DNA (Sfeir et al. 2009). It can be hypothesized that this effect also applies to rDNA loci. This is supported by the fact that telomere and rDNA instability was observed in plants with dysfunctional G4-unwinding helicase RTEL1 (Goffová et al. 2019). This instability might be explained by the clustering of G4 sites that we observed in a region upstream of the 18S rRNA gene, which is most likely part of the pre-rRNA transcript. The G4 sites may be a major source of problems for replication fork passage in the absence of the RTEL1 helicase.
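The idea of locating putative G4 sites and checking whether they cluster can be illustrated with a minimal Python sketch. It uses a simple regular-expression motif of the form G3+N1–7G3+N1–7G3+N1–7G3+ (and its C-rich counterpart for the opposite strand), a commonly used approximation of G4-forming potential. It is not meant to reproduce the prediction tools cited in this thesis; the function names, the toy sequence and the `max_gap` threshold are hypothetical choices made for this example.

```python
import re
from typing import List, Tuple

Hit = Tuple[int, int, str]  # (start, end, strand), 0-based, end-exclusive


def _motif(base: str) -> "re.Pattern[str]":
    """Four runs of >=3 identical bases separated by loops of 1-7 nt."""
    run = base + "{3,}"
    loop = "[ACGT]{1,7}"
    return re.compile(loop.join([run, run, run, run]))


G_MOTIF = _motif("G")  # candidate G4 on the given strand
C_MOTIF = _motif("C")  # candidate G4 on the complementary strand


def find_g4_candidates(seq: str) -> List[Hit]:
    """Return non-overlapping motif hits on both strands, sorted by start."""
    seq = seq.upper()
    hits: List[Hit] = []
    for strand, pattern in (("+", G_MOTIF), ("-", C_MOTIF)):
        hits.extend((m.start(), m.end(), strand) for m in pattern.finditer(seq))
    return sorted(hits)


def cluster_hits(hits: List[Hit], max_gap: int = 50) -> List[List[Hit]]:
    """Group hits whose start lies within `max_gap` nt of the previous hit's end."""
    clusters: List[List[Hit]] = []
    for hit in sorted(hits):
        if clusters and hit[0] - clusters[-1][-1][1] <= max_gap:
            clusters[-1].append(hit)
        else:
            clusters.append([hit])
    return clusters


if __name__ == "__main__":
    # Toy sequence (not real rDNA): two nearby candidates and one distant one.
    toy = ("TTGGGAGGGAGGGAGGGTT" + "A" * 10 + "CCCTCCCTCCCTCCC"
           + "A" * 200 + "GGGTGGGTGGGTGGG")
    for cluster in cluster_hits(find_g4_candidates(toy)):
        print(len(cluster), "candidate(s):", cluster)
```

A regular expression of this kind only flags sequence potential; whether a candidate actually folds into a G4 in vivo cannot be inferred from the motif alone.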

10. Conclusion

The first part of this thesis has focused on the 45S rDNA IGS variability in the model plant organism Arabidopsis thaliana. The IGS in Arabidopsis has not been comprehensively described despite its potential importance in the regulation of rDNA transcription and replication. The IGS variants may be useful genetic markers which can distinguish the individual rDNA units and loci. In conclusion, we have described here the variant arrangements of the IGS in Arabidopsis WT plants, mutants with dysfunctional histone chaperone CAF1 (fas1 and fas2 mutants), and plants with restored CAF1 function (revertants).

The detailed characterisation allowed the description of a new variant in the 3’ETS region of pre-rRNA, termed var5, and of the preferential association of 3’ETS variants with specific IGS arrangements. Overall, fas mutants show less variability than WT plants. We have observed the selective loss of some variants and the sporadic generation of new IGS variants. These results correspond to the presumed mechanism of the loss of rDNA copies in fas mutants via the single-strand annealing type of homology-dependent repair, during which a DSB site in the middle of the repetitive sequence is repaired by cutting the injured repeat out of the genome, thus shortening the total length of rDNA (Muchová et al. 2015). Apparently, this mechanism also simplifies the original WT spectrum of rDNA variants.

In revertants, we have observed an IGS spectrum similar to that of their parental mutants, suggesting that rDNA recovery occurs through a relatively precise DNA synthesis-dependent homologous recombination mechanism. We conclude that the rDNA rearrangements giving rise to new IGS variants during rDNA recovery are relatively rare and that the decreased IGS variability results from the preferential recovery of variants which remained in the genome at the moment of FAS restoration.

In the second part of the thesis, we have focused on rDNA in the model plant Physcomitrella patens. Using publicly available sequences, we have identified the putative TSS and the positions of G4s in the rDNA unit. These results support the hypothesis that G4 structures substantially contribute to the rDNA instability observed in Physcomitrella with dysfunctional G4-unwinding helicase RTEL1.

In conclusion, maintaining rDNA stability is of the highest importance for overall genome integrity and cellular viability. This thesis extends the current knowledge about the occurrence of distinct rRNA gene variants and clarifies the role of some important factors involved in rDNA maintenance.

11. References

Abou-Ellail, Mohamed, Richard Cooke, and Julio Sáez-Vásquez. 2011. “Variations in a Team: Major and Minor Variants of Arabidopsis Thaliana RDNA Genes.” Nucleus 2 (4): 294–99. https://doi.org/10.4161/nucl.2.4.16561.
Agrawal, Prashansa, Clement Lin, Raveendra I. Mathad, Megan Carver, and Danzhou Yang. 2014. “The Major G-Quadruplex Formed in the Human BCL-2 Proximal Promoter Adopts a Parallel Structure with a 13-Nt Loop in K+ Solution.” Journal of the American Chemical Society 136 (5): 1750–53. https://doi.org/10.1021/ja4118945.
Agrawal, Saumya, and Austen R. D. Ganley. 2016. “Complete Sequence Construction of the Highly Repetitive Ribosomal RNA Gene Repeats in Eukaryotes Using Whole Genome Sequence Data.” Methods in Molecular Biology (Clifton, N.J.) 1455: 161–81. https://doi.org/10.1007/978-1-4939-3792-9_13.
Agrawal, Saumya, and Austen R. D. Ganley. 2018. “The Conservation Landscape of the Human Ribosomal RNA Gene Repeats.” PLOS ONE 13 (12): e0207531. https://doi.org/10.1371/journal.pone.0207531.
Alabert, Constance, and Anja Groth. 2012. “Chromatin Replication and Epigenome Maintenance.” Nature Reviews Molecular Cell Biology 13 (3): 153–67. https://doi.org/10.1038/nrm3288.
Ang, J. Sidney, Supipi Duffy, Romulo Segovia, Peter C. Stirling, and Philip Hieter. 2016. “Dosage Mutator Genes in Saccharomyces Cerevisiae: A Novel Mutator Mode-of-Action of the Mph1 DNA Helicase.” Genetics 204 (3): 975–86. https://doi.org/10.1534/genetics.116.192211.
Audas, Timothy E., Mathieu D. Jacob, and Stephen Lee. 2012. “Immobilization of Proteins in the Nucleolus by Ribosomal Intergenic Spacer Noncoding RNA.” Molecular Cell 45 (2): 147–57. https://doi.org/10.1016/j.molcel.2011.12.012.
Baker, Monya. 2011. “Making Sense of Chromatin States.” Nature Methods 8 (9): 717–22. https://doi.org/10.1038/nmeth.1673.
Barber, Louise J., Jillian L. Youds, Jordan D. Ward, Michael J. McIlwraith, Nigel J. O’Neil, Mark I.R. Petalcorin, Julie S. Martin, et al. 2008. “SPAR1/RTEL1 Maintains Genomic Stability by Suppressing Homologous Recombination.” Cell 135 (2): 261–71. https://doi.org/10.1016/j.cell.2008.08.016.
Bedrat, Amina, Laurent Lacroix, and Jean-Louis Mergny. 2016. “Re-Evaluation of G-Quadruplex Propensity with G4Hunter.” Nucleic Acids Research 44 (4): 1746–59. https://doi.org/10.1093/nar/gkw006.
Ben-Shem, Adam, Nicolas Garreau de Loubresse, Sergey Melnikov, Lasse Jenner, Gulnara Yusupova, and Marat Yusupov. 2011. “The Structure of the Eukaryotic Ribosome at 3.0 Å Resolution.” Science (New York, N.Y.) 334 (6062): 1524–29. https://doi.org/10.1126/science.1212642.
Bhatia, S., M. Singh Negi, and M. Lakshmikumaran. 1996. “Structural Analysis of the RDNA Intergenic Spacer of Brassica Nigra: Evolutionary Divergence of the Spacers of the Three Diploid Brassica Species.” Journal of Molecular Evolution 43 (5): 460–68. https://doi.org/10.1007/bf02337518.
Bhattacharyya, Debmalya, Gayan Mirihana Arachchilage, and Soumitra Basu. 2016. “Metal Cations in G-Quadruplex Folding and Stability.” Frontiers in Chemistry 4 (September). https://doi.org/10.3389/fchem.2016.00038.
Bleuyard, Jean-Yves, Maria E. Gallego, and Charles I. White. 2006. “Recent Advances in Understanding of the DNA Double-Strand Break Repair Machinery of Plants.” DNA Repair 5 (1): 1–12. https://doi.org/10.1016/j.dnarep.2005.08.017.
Bochman, Matthew L., Katrin Paeschke, and Virginia A. Zakian. 2012. “DNA Secondary Structures: Stability and Function of G-Quadruplex Structures.” Nature Reviews. Genetics 13 (11): 770–80. https://doi.org/10.1038/nrg3296.
Boulon, Séverine, Belinda J. Westman, Saskia Hutten, François-Michel Boisvert, and Angus I. Lamond. 2010. “The Nucleolus under Stress.” Molecular Cell 40 (2): 216–27. https://doi.org/10.1016/j.molcel.2010.09.024.

Brázda, Václav, Jan Kolomazník, Jirí Lýsek, Martin Bartas, Miroslav Fojta, Jirí Štastný, and Jean-Louis Mergny. 2019. “G4Hunter Web Application: A Web Server for G-Quadruplex Prediction.” Bioinformatics (Oxford, England), February. https://doi.org/10.1093/bioinformatics/btz087.
Brooks, Tracy A., and Laurence H. Hurley. 2009. “The Role of Supercoiling in Transcriptional Control of MYC and Its Importance in Molecular Therapeutics.” Nature Reviews Cancer 9 (12): 849–61. https://doi.org/10.1038/nrc2733.
Brown, Donald D., Pieter C. Wensink, and Eddie Jordan. 1972. “A Comparison of the Ribosomal DNA’s of Xenopus Laevis and Xenopus Mulleri: The Evolution of Tandem Genes.” Journal of Molecular Biology 63 (1): 57–73. https://doi.org/10.1016/0022-2836(72)90521-9.
Cabral-de-Mello, D. C., R. C. Moura, and C. Martins. 2010. “Chromosomal Mapping of Repetitive DNAs in the Beetle Dichotomius Geminatus Provides the First Evidence for an Association of 5S RRNA and Histone H3 Genes in Insects, and Repetitive DNA Similarity between the B Chromosome and A Complement.” Heredity 104 (4): 393–400. https://doi.org/10.1038/hdy.2009.126.
Caburet, Sandrine, Chiara Conti, Catherine Schurra, Ronald Lebofsky, Stuart J. Edelstein, and Aaron Bensimon. 2005. “Human Ribosomal RNA Gene Arrays Display a Broad Range of Palindromic Structures.” Genome Research 15 (8): 1079–85. https://doi.org/10.1101/gr.3970105.
Campell, B. R., Y. Song, T. E. Posch, C. A. Cullis, and C. D. Town. 1992. “Sequence and Organization of 5S Ribosomal RNA-Encoding Genes of Arabidopsis Thaliana.” Gene 112 (2): 225–28. https://doi.org/10.1016/0378-1119(92)90380-8.
Capelson, Maya, Yun Liang, Roberta Schulte, William Mair, Ulrich Wagner, and Martin W. Hetzer. 2010. “Chromatin-Bound Nuclear Pore Components Regulate Gene Expression in Higher Eukaryotes.” Cell 140 (3): 372–83. https://doi.org/10.1016/j.cell.2009.12.054.
Chambers, Vicki S., Giovanni Marsico, Jonathan M. Boutell, Marco Di Antonio, Geoffrey P. Smith, and Shankar Balasubramanian. 2015. “High-Throughput Sequencing of DNA G-Quadruplex Structures in the Human Genome.” Nature Biotechnology 33 (8): 877–81. https://doi.org/10.1038/nbt.3295.
Chandrasekhara, Chinmayi, Gireesha Mohannath, Todd Blevins, Frederic Pontvianne, and Craig S. Pikaard. 2016. “Chromosome-Specific NOR Inactivation Explains Selective RRNA Gene Silencing and Dosage Control in Arabidopsis.” Genes & Development 30 (2): 177–90. https://doi.org/10.1101/gad.273755.115.
Cheng, Liang, Xu Zhang, Yan Wang, Haiyun Gan, Xiaowei Xu, Xiangdong Lv, Xu Hua, Jianwen Que, Tamas Ordog, and Zhiguo Zhang. 2019. “Chromatin Assembly Factor 1 (CAF-1) Facilitates the Establishment of Facultative Heterochromatin during Pluripotency Exit.” Nucleic Acids Research 47 (21): 11114–31. https://doi.org/10.1093/nar/gkz858.
Cloix, C., S. Tutois, Y. Yukawa, O. Mathieu, C. Cuvillier, M. C. Espagnol, G. Picard, and S. Tourmente. 2002. “Analysis of the 5S RNA Pool in Arabidopsis Thaliana: RNAs Are Heterogeneous and Only Two of the Genomic 5S Loci Produce Mature 5S RNA.” Genome Research 12 (1): 132–44. https://doi.org/10.1101/gr.181301.
Copenhaver, Gregory P., and Craig S. Pikaard. 1996a. “RFLP and Physical Mapping with an RDNA-Specific Endonuclease Reveals That Nucleolus Organizer Regions of Arabidopsis Thaliana Adjoin the Telomeres on Chromosomes 2 and 4.” The Plant Journal 9 (2): 259–72. https://doi.org/10.1046/j.1365-313X.1996.09020259.x.
Copenhaver, Gregory P., and Craig S. Pikaard. 1996b. “Two-Dimensional RFLP Analyses Reveal Megabase-Sized Clusters of RRNA Gene Variants in Arabidopsis Thaliana, Suggesting Local Spreading of Variants as the Mode for Gene Homogenization during Concerted Evolution.” The Plant Journal 9 (2): 273–82. https://doi.org/10.1046/j.1365-313X.1996.09020273.x.
Cross, I., and L. Rebordinos. 2005. “5S RDNA and U2 SnRNA Are Linked in the Genome of Crassostrea Angulata and Crassostrea Gigas Oysters: Does the (CT)n.(GA)n Microsatellite Stabilize This Novel Linkage of Large Tandem Arrays?” Genome 48 (6): 1116–19. https://doi.org/10.1139/g05-075.

Cvijović, Ivana, Benjamin H. Good, and Michael M. Desai. 2018. “The Effect of Strong Purifying Selection on Genetic Diversity.” Genetics 209 (4): 1235–78. https://doi.org/10.1534/genetics.118.301058.
Dellaporta, Stephen L., Jonathan Wood, and James B. Hicks. 1983. “A Plant DNA Minipreparation: Version II.” Plant Molecular Biology Reporter 1 (4): 19–21. https://doi.org/10.1007/BF02712670.
Dimario, Patrick J. 2004. “Cell and Molecular Biology of Nucleolar Assembly and Disassembly.” International Review of Cytology 239: 99–178. https://doi.org/10.1016/S0074-7696(04)39003-0.
Doelling, J H, R J Gaudino, and C S Pikaard. 1993. “Functional Analysis of Arabidopsis Thaliana RRNA Gene and Spacer Promoters in Vivo and by Transient Expression.” Proceedings of the National Academy of Sciences of the United States of America 90 (16): 7528–32.
Doelling, J. H., and C. S. Pikaard. 1995. “The Minimal Ribosomal RNA Gene Promoter of Arabidopsis Thaliana Includes a Critical Element at the Transcription Initiation Site.” The Plant Journal: For Cell and Molecular Biology 8 (5): 683–92. https://doi.org/10.1046/j.1365-313x.1995.08050683.x.
Doğan, Ezgi Süheyla, and Chang Liu. 2018. “Three-Dimensional Chromatin Packing and Positioning of Plant Genomes.” Nature Plants 4 (8): 521–29. https://doi.org/10.1038/s41477-018-0199-5.
Dohke, Kohei, Shota Miyazaki, Katsunori Tanaka, Takeshi Urano, Shiv I. S. Grewal, and Yota Murakami. 2008. “Fission Yeast Chromatin Assembly Factor 1 Assists in the Replication-Coupled Maintenance of Heterochromatin.” Genes to Cells: Devoted to Molecular & Cellular Mechanisms 13 (10): 1027–43. https://doi.org/10.1111/j.1365-2443.2008.01225.x.
Dong, Pengfei, Xiaoyu Tu, Po-Yu Chu, Peitao Lü, Ning Zhu, Donald Grierson, Baijuan Du, Pinghua Li, and Silin Zhong. 2017. “3D Chromatin Architecture of Large Plant Genomes Determined by Local A/B Compartments.” Molecular Plant 10 (12): 1497–1509. https://doi.org/10.1016/j.molp.2017.11.005.
Dvořáčková, Martina, Miloslava Fojtová, and Jiří Fajkus. 2015. “Chromatin Dynamics of Plant Telomeres and Ribosomal Genes.” The Plant Journal 83 (1): 18–37. https://doi.org/10.1111/tpj.12822.
Dvořáčková, Martina, Berta Raposo, Petr Matula, Joerg Fuchs, Veit Schubert, Vratislav Peška, Bénédicte Desvoyes, Crisanto Gutierrez, and Jiří Fajkus. 2018. “Replication of Ribosomal DNA in Arabidopsis Occurs Both inside and Outside the Nucleolus during S Phase Progression.” Journal of Cell Science 131 (2). https://doi.org/10.1242/jcs.202416.
Earley, Keith W., Frédéric Pontvianne, Andrzej T. Wierzbicki, Todd Blevins, Sarah Tucker, Pedro Costa-Nunes, Olga Pontes, and Craig S. Pikaard. 2010. “Mechanisms of HDA6-Mediated RRNA Gene Silencing: Suppression of Intergenic Pol II Transcription and Differential Effects on Maintenance versus SiRNA-Directed Cytosine Methylation.” Genes & Development 24 (11): 1119–32. https://doi.org/10.1101/gad.1914110.
Eickbush, Thomas H., and Danna G. Eickbush. 2007. “Finely Orchestrated Movements: Evolution of the Ribosomal RNA Genes.” Genetics 175 (2): 477–85. https://doi.org/10.1534/genetics.107.071399.
Endo, Masaki, Yuichi Ishikawa, Keishi Osakabe, Shigeki Nakayama, Hidetaka Kaya, Takashi Araki, Kei-ichi Shibahara, et al. 2006. “Increased Frequency of Homologous Recombination and T-DNA Integration in Arabidopsis CAF-1 Mutants.” The EMBO Journal 25 (23): 5579–90. https://doi.org/10.1038/sj.emboj.7601434.
Enomoto, Shinichiro, and Judith Berman. 1998. “Chromatin Assembly Factor I Contributes to the Maintenance, but Not the Re-Establishment, of Silencing at the Yeast Silent Mating Loci.” Genes & Development 12 (2): 219–32. https://doi.org/10.1101/gad.12.2.219.
Exner, Vivien, Patti Taranto, Nicole Schönrock, Wilhelm Gruissem, and Lars Hennig. 2006. “Chromatin Assembly Factor CAF-1 Is Required for Cellular Differentiation during Plant Development.” Development (Cambridge, England) 133 (21): 4163–72. https://doi.org/10.1242/dev.02599.
Fajkus, J., and J. Reich. 1991. “Evaluation of Restriction Endonuclease Cleavage of Plant Nuclear DNA Using Contaminating Chloroplast DNA.” Folia Biologica 37 (3–4): 224–26.

Fazly, Ahmed, Qing Li, Qi Hu, Georges Mer, Bruce Horazdovsky, and Zhiguo Zhang. 2012. “Histone Chaperone Rtt106 Promotes Nucleosome Formation Using (H3-H4)2 Tetramers.” Journal of Biological Chemistry 287 (14): 10753–60. https://doi.org/10.1074/jbc.M112.347450.
Feng, Suhua, Shawn J. Cokus, Veit Schubert, Jixian Zhai, Matteo Pellegrini, and Steven E. Jacobsen. 2014. “Genome-Wide Hi-C Analyses in Wild-Type and Mutants Reveal High-Resolution Chromatin Interactions in Arabidopsis.” Molecular Cell 55 (5): 694–707. https://doi.org/10.1016/j.molcel.2014.07.008.
Filion, Guillaume J., Joke G. van Bemmel, Ulrich Braunschweig, Wendy Talhout, Jop Kind, Lucas D. Ward, Wim Brugman, et al. 2010. “Systematic Protein Location Mapping Reveals Five Principal Chromatin Types in Drosophila Cells.” Cell 143 (2): 212–24. https://doi.org/10.1016/j.cell.2010.09.009.
Fromont-Racine, Micheline, Bruno Senger, Cosmin Saveanu, and Franco Fasiolo. 2003. “Ribosome Assembly in Eukaryotes.” Gene 313 (August): 17–42. https://doi.org/10.1016/s0378-1119(03)00629-2.
Gaillard, Pierre-Henri L., E. M. Martini, P. D. Kaufman, B. Stillman, E. Moustacchi, and G. Almouzni. 1996. “Chromatin Assembly Coupled to DNA Repair: A New Role for Chromatin Assembly Factor I.” Cell 86 (6): 887–96. https://doi.org/10.1016/s0092-8674(00)80164-6.
Garcia, Sònia, José L. Panero, Jiri Siroky, and Ales Kovarik. 2010. “Repeated Reunions and Splits Feature the Highly Dynamic Evolution of 5S and 35S Ribosomal RNA Genes (RDNA) in the Asteraceae Family.” BMC Plant Biology 10 (1): 176. https://doi.org/10.1186/1471-2229-10-176.
Gibcus, Johan H., and Job Dekker. 2013. “The Hierarchy of the 3D Genome.” Molecular Cell 49 (5): 773–82. https://doi.org/10.1016/j.molcel.2013.02.011.
Goffová, Ivana, Radka Vágnerová, Vratislav Peška, Michal Franek, Kateřina Havlová, Marcela Holá, Dagmar Zachová, et al. 2019. “Roles of RAD51 and RTEL1 in Telomere and RDNA Stability in Physcomitrella Patens.” The Plant Journal: For Cell and Molecular Biology 98 (6): 1090–1105. https://doi.org/10.1111/tpj.14304.
Gonzalez, I. L., and J. E. Sylvester. 2001. “Human RDNA: Evolutionary Patterns within the Genes and Tandem Arrays Derived from Multiple Chromosomes.” Genomics 73 (3): 255–63. https://doi.org/10.1006/geno.2001.6540.
Goodfellow, Sarah J., and Joost C. B. M. Zomerdijk. 2012. “Basic Mechanisms in RNA Polymerase I Transcription of the Ribosomal RNA Genes.” Sub-Cellular Biochemistry 61. https://doi.org/10.1007/978-94-007-4525-4_10.
Gorokhova, Elena, Thomas E. Dowling, Lawrence J. Weider, Teresa J. Crease, and James J. Elser. 2002. “Functional and Ecological Significance of RDNA Intergenic Spacer Variation in a Clonal Organism under Divergent Selection for Production Rate.” Proceedings of the Royal Society of London. Series B: Biological Sciences 269 (1507): 2373–79. https://doi.org/10.1098/rspb.2002.2145.
Grewal, Shiv I. S., and Songtao Jia. 2007. “Heterochromatin Revisited.” Nature Reviews Genetics 8 (1): 35–46. https://doi.org/10.1038/nrg2008.
Grimaldi, G., P. Fiorentini, and P. P. Di Nocera. 1990. “Spacer Promoters Are Orientation-Dependent Activators of Pre-RRNA Transcription in Drosophila Melanogaster.” Molecular and Cellular Biology 10 (9): 4667–77. https://doi.org/10.1128/mcb.10.9.4667.
Grozdanov, Petar, Oleg Georgiev, and Luchezar Karagyozov. 2003. “Complete Sequence of the 45-Kb Mouse Ribosomal DNA Repeat: Analysis of the Intergenic Spacer.” Genomics 82 (6): 637–43. https://doi.org/10.1016/s0888-7543(03)00199-x.
Gruendler, P., I. Unfried, K. Pascher, and D. Schweizer. 1991. “RDNA Intergenic Region from Arabidopsis Thaliana. Structural Analysis, Intraspecific Variation and Functional Implications.” Journal of Molecular Biology 221 (4): 1209–22. https://doi.org/10.1016/0022-2836(91)90929-z.
Grummt, Ingrid, and Craig S. Pikaard. 2003. “Epigenetic Silencing of RNA Polymerase I Transcription.” Nature Reviews. Molecular Cell Biology 4 (8): 641–49. https://doi.org/10.1038/nrm1171.

Guédin, Aurore, Julien Gros, Patrizia Alberti, and Jean-Louis Mergny. 2010. “How Long Is Too Long? Effects of Loop Size on G-Quadruplex Stability.” Nucleic Acids Research 38 (21): 7858–68. https://doi.org/10.1093/nar/gkq639.
Guelen, Lars, Ludo Pagie, Emilie Brasset, Wouter Meuleman, Marius B. Faza, Wendy Talhout, Bert H. Eussen, et al. 2008. “Domain Organization of Human Chromosomes Revealed by Mapping of Nuclear Lamina Interactions.” Nature 453 (7197): 948–51. https://doi.org/10.1038/nature06947.
Guo, Zhansheng, Leng Han, Zhenlin Liang, and Xuguang Hou. 2019. “Comparative Analysis of the Ribosomal DNA Repeat Unit (RDNA) of Perna Viridis (Linnaeus, 1758) and Perna Canaliculus (Gmelin, 1791).” PeerJ 7 (September). https://doi.org/10.7717/peerj.7644.
Hahne, Florian, and Robert Ivanek. 2016. “Visualizing Genomic Data Using Gviz and Bioconductor.” Methods in Molecular Biology (Clifton, N.J.) 1418: 335–51. https://doi.org/10.1007/978-1-4939-3578-9_16.
Hardin, Charles C., Adam G. Perry, and Katie White. 2000. “Thermodynamic and Kinetic Characterization of the Dissociation and Assembly of Quadruplex Nucleic Acids.” Biopolymers. January 1, 2000. https://doi.org/10.1002/1097-0282(2000/2001)56:3<147::AID-BIP10011>3.0.CO;2-N.
Harkness, Robert W., and Anthony K. Mittermaier. 2017. “G-Quadruplex Dynamics.” Biochimica et Biophysica Acta (BBA) - Proteins and Proteomics, Biophysics in Canada, 1865 (11, Part B): 1544–54. https://doi.org/10.1016/j.bbapap.2017.06.012.
Harp, J. M., B. L. Hanson, D. E. Timm, and G. J. Bunick. 2000. “Asymmetries in the Nucleosome Core Particle at 2.5 A Resolution.” Acta Crystallographica. Section D, Biological Crystallography 56 (Pt 12): 1513–34. https://doi.org/10.1107/s0907444900011847.
Havlová, Kateřina, Martina Dvořáčková, Ramon Peiro, David Abia, Iva Mozgová, Lenka Vansáčová, Crisanto Gutierrez, and Jiří Fajkus. 2016. “Variation of 45S RDNA Intergenic Spacers in Arabidopsis Thaliana.” Plant Molecular Biology 92 (4): 457–71. https://doi.org/10.1007/s11103-016-0524-1.
Hennig, Lars, Romaric Bouveret, and Wilhelm Gruissem. 2005. “MSI1-like Proteins: An Escort Service for Chromatin Assembly and Remodeling Complexes.” Trends in Cell Biology 15 (6): 295–302. https://doi.org/10.1016/j.tcb.2005.04.004.
Henras, Anthony K., Célia Plisson‐Chastang, Marie-Françoise O’Donohue, Anirban Chakraborty, and Pierre-Emmanuel Gleizes. 2015. “An Overview of Pre-Ribosomal RNA Processing in Eukaryotes.” Wiley Interdisciplinary Reviews: RNA 6 (2): 225–42. https://doi.org/10.1002/wrna.1269.
Holloman, William K. 2011. “Unraveling the Mechanism of BRCA2 in Homologous Recombination.” Nature Structural & Molecular Biology 18 (7): 748–54. https://doi.org/10.1038/nsmb.2096.
Hon, Jirí, Tomáš Martínek, Jaroslav Zendulka, and Matej Lexa. 2017. “Pqsfinder: An Exhaustive and Imperfection-Tolerant Search Tool for Potential Quadruplex-Forming Sequences in R.” Bioinformatics (Oxford, England) 33 (21): 3373–79. https://doi.org/10.1093/bioinformatics/btx413.
Houlard, Martin, Soizik Berlivet, Aline V. Probst, Jean-Pierre Quivy, Patrick Héry, Geneviève Almouzni, and Matthieu Gérard. 2006. “CAF-1 Is Essential for Heterochromatin Organization in Pluripotent Embryonic Cells.” PLoS Genetics 2 (11): e181. https://doi.org/10.1371/journal.pgen.0020181.
Huber, Wolfgang, Vincent J. Carey, Robert Gentleman, Simon Anders, Marc Carlson, Benilton S. Carvalho, Hector Corrada Bravo, et al. 2015. “Orchestrating High-Throughput Genomic Analysis with Bioconductor.” Nature Methods 12 (2): 115–21. https://doi.org/10.1038/nmeth.3252.
Huppert, Julian L. 2008. “Four-Stranded Nucleic Acids: Structure, Function and Targeting of G-Quadruplexes.” Chemical Society Reviews 37 (7): 1375–84. https://doi.org/10.1039/b702491f.
Huppert, Julian L., and Shankar Balasubramanian. 2005. “Prevalence of Quadruplexes in the Human Genome.” Nucleic Acids Research 33 (9): 2908–16. https://doi.org/10.1093/nar/gki609.
Ide, Satoru, Takaaki Miyazaki, Hisaji Maki, and Takehiko Kobayashi. 2010. “Abundance of Ribosomal RNA Gene Copies Maintains Genome Integrity.” Science 327 (5966): 693–96. https://doi.org/10.1126/science.1179044.

Jacob, Mathieu D., Timothy E. Audas, Sahra-Taylor Mullineux, and Stephen Lee. 2012. “Where No RNA Polymerase Has Gone Before.” Nucleus 3 (4): 315–19. https://doi.org/10.4161/nucl.20585.
Jaške, Karin, Petr Mokroš, Iva Mozgová, Miloslava Fojtová, and Jiří Fajkus. 2013. “A Telomerase-Independent Component of Telomere Loss in Chromatin Assembly Factor 1 Mutants of Arabidopsis Thaliana.” Chromosoma 122 (4): 285–93. https://doi.org/10.1007/s00412-013-0400-6.
Jiao, R., J. A. Harrigan, I. Shevelev, T. Dietschy, N. Selak, F. E. Indig, J. Piotrowski, P. Janscak, V. A. Bohr, and I. Stagljar. 2007. “The Werner Syndrome Protein Is Required for Recruitment of Chromatin Assembly Factor 1 Following DNA Damage.” Oncogene 26 (26): 3811–22. https://doi.org/10.1038/sj.onc.1210150.
Jiao, Renjie, Csanád Z. Bachrati, Graziella Pedrazzi, Patrick Kuster, Maja Petkovic, Ji-Liang Li, Dieter Egli, Ian D. Hickson, and Igor Stagljar. 2004. “Physical and Functional Interaction between the Bloom’s Syndrome Gene Product and the Largest Subunit of Chromatin Assembly Factor 1.” Molecular and Cellular Biology 24 (11): 4710–19. https://doi.org/10.1128/MCB.24.11.4710-4719.2004.
Kamakaka, R. T., M. Bulger, P. D. Kaufman, B. Stillman, and J. T. Kadonaga. 1996. “Postreplicative Chromatin Assembly by Drosophila and Human Chromatin Assembly Factor 1.” Molecular and Cellular Biology 16 (3): 810–17. https://doi.org/10.1128/mcb.16.3.810.
Kamisugi, Yasuko, Katja Schlink, Stefan A. Rensing, Gabriele Schween, Mark von Stackelberg, Andrew C. Cuming, Ralf Reski, and David J. Cove. 2006. “The Mechanism of Gene Targeting in Physcomitrella Patens: Homologous Recombination, Concatenation and Multiple Integration.” Nucleic Acids Research 34 (21): 6205–14. https://doi.org/10.1093/nar/gkl832.
Kaufman, P. D., R. Kobayashi, and B. Stillman. 1997. “Ultraviolet Radiation Sensitivity and Reduction of Telomeric Silencing in Saccharomyces Cerevisiae Cells Lacking Chromatin Assembly Factor-I.” Genes & Development 11 (3): 345–57. https://doi.org/10.1101/gad.11.3.345.
Kaya, H., K. I. Shibahara, K. I. Taoka, M. Iwabuchi, B. Stillman, and T. Araki. 2001. “FASCIATA Genes for Chromatin Assembly Factor-1 in Arabidopsis Maintain the Cellular Organization of Apical Meristems.” Cell 104 (1): 131–42. https://doi.org/10.1016/s0092-8674(01)00197-0.
Khurts, Shilagardi, Kenkichi Masutomi, Luvsanjav Delgermaa, Kuniaki Arai, Naoki Oishi, Hideki Mizuno, Naoyuki Hayashi, William C. Hahn, and Seishi Murakami. 2004. “Nucleolin Interacts with Telomerase.” Journal of Biological Chemistry 279 (49): 51508–15. https://doi.org/10.1074/jbc.M407643200.
Kind, Jop, Ludo Pagie, Sandra S. de Vries, Leila Nahidiazar, Siddharth S. Dey, Magda Bienko, Ye Zhan, et al. 2015. “Genome-Wide Maps of Nuclear Lamina Interactions in Single Human Cells.” Cell 163 (1): 134–47. https://doi.org/10.1016/j.cell.2015.08.040.
Kirik, Angela, Ales Pecinka, Edelgard Wendeler, and Bernd Reiss. 2006. “The Chromatin Assembly Factor Subunit FASCIATA1 Is Involved in Homologous Recombination in Plants.” The Plant Cell 18 (10): 2431–42. https://doi.org/10.1105/tpc.106.045088.
Kobayashi, Takehiko, and Austen R. D. Ganley. 2005. “Recombination Regulation by Transcription-Induced Cohesin Dissociation in RDNA Repeats.” Science (New York, N.Y.) 309 (5740): 1581–84. https://doi.org/10.1126/science.1116102.
Kumar, S., and M. Leffak. 1986. “Assembly of Active Chromatin.” Biochemistry 25 (8): 2055–60. https://doi.org/10.1021/bi00356a033.
Lacroix, Laurent. 2019. “G4HunterApps.” Bioinformatics 35 (13): 2311–12. https://doi.org/10.1093/bioinformatics/bty951.
Layat, Elodie, Julio Sáez-Vásquez, and Sylvette Tourmente. 2012. “Regulation of Pol I-Transcribed 45S RDNA and Pol III-Transcribed 5S RDNA in Arabidopsis.” Plant and Cell Physiology 53 (2): 267–76. https://doi.org/10.1093/pcp/pcr177.
Lee, Ji Hoon, Yang Sin Lee, Sun Ah Jeong, Prabhat Khadka, Jürgen Roth, and In Kwon Chung. 2014. “Catalytically Active Telomerase Holoenzyme Is Assembled in the Dense Fibrillar Component of the Nucleolus during S Phase.” Histochemistry and Cell Biology 141 (2): 137–52. https://doi.org/10.1007/s00418-013-1166-x.

León-Ortiz, Ana María, Jennifer Svendsen, and Simon J. Boulton. 2014. “Metabolism of DNA Secondary Structures at the Eukaryotic Replication Fork.” DNA Repair, Cutting-edge Perspectives in Genomic Maintenance, 19 (July): 152–62. https://doi.org/10.1016/j.dnarep.2014.03.016.
Lewis, L. Kevin, G. Karthikeyan, Jared Cassiano, and Michael A. Resnick. 2005. “Reduction of Nucleosome Assembly during New DNA Synthesis Impairs Both Major Pathways of Double-Strand Break Repair.” Nucleic Acids Research 33 (15): 4928–39. https://doi.org/10.1093/nar/gki806.
Leyser, H. M. Ottoline, and I. J. Furner. 1992. “Characterisation of Three Shoot Apical Meristem Mutants of Arabidopsis thaliana.” Development 116 (2): 397–403.
Li, Gu, Marcia Levitus, Carlos Bustamante, and Jonathan Widom. 2005. “Rapid Spontaneous Accessibility of Nucleosomal DNA.” Nature Structural & Molecular Biology 12 (1): 46–53. https://doi.org/10.1038/nsmb869.
Li, Heng, and Richard Durbin. 2009. “Fast and Accurate Short Read Alignment with Burrows-Wheeler Transform.” Bioinformatics (Oxford, England) 25 (14): 1754–60. https://doi.org/10.1093/bioinformatics/btp324.
Lieberman-Aiden, Erez, Nynke L. van Berkum, Louise Williams, Maxim Imakaev, Tobias Ragoczy, Agnes Telling, Ido Amit, et al. 2009. “Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome.” Science (New York, N.Y.) 326 (5950): 289–93. https://doi.org/10.1126/science.1181369.
Lindström, Mikael S., Deana Jurada, Sladana Bursac, Ines Orsolic, Jiri Bartek, and Sinisa Volarevic. 2018. “Nucleolus as an Emerging Hub in Maintenance of Genome Stability and Cancer Pathogenesis.” Oncogene 37 (18): 2351–66. https://doi.org/10.1038/s41388-017-0121-z.
Lipps, Hans J., and Daniela Rhodes. 2009. “G-Quadruplex Structures: In Vivo Evidence and Function.” Trends in Cell Biology 19 (8): 414–22. https://doi.org/10.1016/j.tcb.2009.05.002.
Long, Eric O., and Igor B. Dawid. 1980. “Repeated Genes in Eukaryotes.” Annual Review of Biochemistry 49 (1): 727–64. https://doi.org/10.1146/annurev.bi.49.070180.003455.
Loyola, Alejandra, Hideaki Tagami, Tiziana Bonaldi, Danièle Roche, Jean Pierre Quivy, Axel Imhof, Yoshihiro Nakatani, Sharon Y R Dent, and Geneviève Almouzni. 2009. “The HP1α–CAF1–SetDB1-Containing Complex Provides H3K9me1 for Suv39-Mediated K9me3 in Pericentric Heterochromatin.” EMBO Reports 10 (7): 769–75. https://doi.org/10.1038/embor.2009.90.
Luger, K., A. W. Mäder, R. K. Richmond, D. F. Sargent, and T. J. Richmond. 1997. “Crystal Structure of the Nucleosome Core Particle at 2.8 Å Resolution.” Nature 389 (6648): 251–60. https://doi.org/10.1038/38444.
Magis, Alessio De, Stefano G. Manzo, Marco Russo, Jessica Marinello, Rita Morigi, Olivier Sordet, and Giovanni Capranico. 2019. “DNA Damage and Genome Instability by G-Quadruplex Ligands Are Mediated by R Loops in Human Cancer Cells.” Proceedings of the National Academy of Sciences 116 (3): 816–25. https://doi.org/10.1073/pnas.1810409116.
Maizels, Nancy. 2006. “Dynamic Roles for G4 DNA in the Biology of Eukaryotic Cells.” Nature Structural & Molecular Biology 13 (12): 1055–59. https://doi.org/10.1038/nsmb1171.
Malik, Harmit S., and Steven Henikoff. 2003. “Phylogenomics of the Nucleosome.” Nature Structural & Molecular Biology 10 (11): 882–91. https://doi.org/10.1038/nsb996.
Martínez-Balbás, M. A., T. Tsukiyama, D. Gdula, and C. Wu. 1998. “Drosophila NURF-55, a WD Repeat Protein Involved in Histone Metabolism.” Proceedings of the National Academy of Sciences of the United States of America 95 (1): 132–37. https://doi.org/10.1073/pnas.95.1.132.
Mattiroli, Francesca, Yajie Gu, Tejas Yadav, Jeremy L Balsbaugh, Michael R Harris, Eileen S Findlay, Yang Liu, et al. 2017. “DNA-Mediated Association of Two Histone-Bound Complexes of Yeast Chromatin Assembly Factor-1 (CAF-1) Drives Tetrasome Assembly in the Wake of DNA Replication.” Edited by Paul D Kaufman. ELife 6 (March): e22799. https://doi.org/10.7554/eLife.22799.
Mayer, Christine, Melanie Neubert, and Ingrid Grummt. 2008. “The Structure of NoRC-Associated RNA Is Crucial for Targeting the Chromatin Remodelling Complex NoRC to the Nucleolus.” EMBO Reports 9 (8): 774–80. https://doi.org/10.1038/embor.2008.109.

Mayer, Christine, Kerstin-Maike Schmitz, Junwei Li, Ingrid Grummt, and Raffaella Santoro. 2006. “Intergenic Transcripts Regulate the Epigenetic State of rRNA Genes.” Molecular Cell 22 (3): 351–61. https://doi.org/10.1016/j.molcel.2006.03.028.
Mello, Jill A., Herman H. W. Silljé, Daniele M. J. Roche, Doris B. Kirschner, Erich A. Nigg, and Geneviève Almouzni. 2002. “Human Asf1 and CAF-1 Interact and Synergize in a Repair-Coupled Nucleosome Assembly Pathway.” EMBO Reports 3 (4): 329–34. https://doi.org/10.1093/embo-reports/kvf068.
Mendoza, Oscar, Anne Bourdoncle, Jean-Baptiste Boulé, Robert M. Brosh, and Jean-Louis Mergny. 2016. “G-Quadruplexes and Helicases.” Nucleic Acids Research 44 (5): 1989–2006. https://doi.org/10.1093/nar/gkw079.
Mohannath, Gireesha, Frederic Pontvianne, and Craig S. Pikaard. 2016. “Selective Nucleolus Organizer Inactivation in Arabidopsis Is a Chromosome Position-Effect Phenomenon.” Proceedings of the National Academy of Sciences 113 (47): 13426–31. https://doi.org/10.1073/pnas.1608140113.
Monson, Ellen K., Derik de Bruin, and Virginia A. Zakian. 1997. “The Yeast Cac1 Protein Is Required for the Stable Inheritance of Transcriptionally Repressed Chromatin at Telomeres.” Proceedings of the National Academy of Sciences 94 (24): 13081–86. https://doi.org/10.1073/pnas.94.24.13081.
Montacié, Charlotte, Nathalie Durut, Alison Opsomer, Denise Palm, Pascale Comella, Claire Picart, Marie-Christine Carpentier, et al. 2017. “Nucleolar Proteome Analysis and Proteasomal Activity Assays Reveal a Link between Nucleolus and 26S Proteasome in A. thaliana.” Frontiers in Plant Science 8. https://doi.org/10.3389/fpls.2017.01815.
Morzycka-Wroblewska, E., E. U. Selker, J. N. Stevens, and R. L. Metzenberg. 1985. “Concerted Evolution of Dispersed Neurospora crassa 5S RNA Genes: Pattern of Sequence Conservation between Allelic and Nonallelic Genes.” Molecular and Cellular Biology 5 (1): 46–51. https://doi.org/10.1128/mcb.5.1.46.
Mozgová, Iva, Petr Mokroš, and Jiří Fajkus. 2010. “Dysfunction of Chromatin Assembly Factor 1 Induces Shortening of Telomeres and Loss of 45S rDNA in Arabidopsis thaliana.” The Plant Cell 22 (8): 2768–80. https://doi.org/10.1105/tpc.110.076182.
Muchová, Veronika, Simon Amiard, Iva Mozgová, Martina Dvořáčková, Maria E. Gallego, Charles White, and Jiří Fajkus. 2015. “Homology-Dependent Repair Is Involved in 45S rDNA Loss in Plant CAF-1 Mutants.” The Plant Journal: For Cell and Molecular Biology 81 (2): 198–209. https://doi.org/10.1111/tpj.12718.
Mukundan, Vineeth Thachappilly, and Anh Tuân Phan. 2013. “Bulges in G-Quadruplexes: Broadening the Definition of G-Quadruplex-Forming Sequences.” Journal of the American Chemical Society 135 (13): 5017–28. https://doi.org/10.1021/ja310251r.
Muñoz-Viana, Rafael, Thomas Wildhaber, Minerva S. Trejo-Arellano, Iva Mozgová, and Lars Hennig. 2017. “Arabidopsis Chromatin Assembly Factor 1 Is Required for Occupancy and Position of a Subset of Nucleosomes.” The Plant Journal: For Cell and Molecular Biology 92 (3): 363–74. https://doi.org/10.1111/tpj.13658.
Nabatiyan, Arman, and Torsten Krude. 2004. “Silencing of Chromatin Assembly Factor 1 in Human Cells Leads to Cell Death and Loss of Chromatin Assembly during DNA Synthesis.” Molecular and Cellular Biology 24 (7): 2853–62. https://doi.org/10.1128/mcb.24.7.2853-2862.2004.
Nabatiyan, Arman, Dávid Szüts, and Torsten Krude. 2006. “Induction of CAF-1 Expression in Response to DNA Strand Breaks in Quiescent Human Cells.” Molecular and Cellular Biology 26 (5): 1839–49. https://doi.org/10.1128/MCB.26.5.1839-1849.2006.
Nei, Masatoshi, and Alejandro P. Rooney. 2005. “Concerted and Birth-and-Death Evolution of Multigene Families.” Annual Review of Genetics 39: 121–52. https://doi.org/10.1146/annurev.genet.39.073003.112240.
Németh, Attila, Ana Conesa, Javier Santoyo-Lopez, Ignacio Medina, David Montaner, Bálint Péterfia, Irina Solovei, Thomas Cremer, Joaquin Dopazo, and Gernot Längst. 2010. “Initial Genomics of the Human Nucleolus.” PLOS Genetics 6 (3): e1000889. https://doi.org/10.1371/journal.pgen.1000889.

Nomura, Masayasu, Yasuhisa Nogi, and Melanie Oakes. 2013. Transcription of rDNA in the Yeast Saccharomyces cerevisiae. Landes Bioscience. https://www.ncbi.nlm.nih.gov/books/NBK6403/.
Olivier, Margaux, Cyril Charbonnel, Simon Amiard, Charles I White, and Maria E Gallego. 2018. “RAD51 and RTEL1 Compensate Telomere Loss in the Absence of Telomerase.” Nucleic Acids Research 46 (5): 2432–45. https://doi.org/10.1093/nar/gkx1322.
Paeschke, Katrin, Matthew L. Bochman, P. Daniela Garcia, Petr Cejka, Katherine L. Friedman, Stephen C. Kowalczykowski, and Virginia A. Zakian. 2013. “Pif1 Family Helicases Suppress Genome Instability at G-Quadruplex Motifs.” Nature 497 (7450): 458–62. https://doi.org/10.1038/nature12149.
Paredes, Silvana, and Keith A. Maggert. 2009. “Ribosomal DNA Contributes to Global Chromatin Regulation.” Proceedings of the National Academy of Sciences 106 (42): 17829–34. https://doi.org/10.1073/pnas.0906811106.
Pavlištová, Veronika, Martina Dvořáčková, Michal Jež, Iva Mozgová, Petr Mokroš, and Jiří Fajkus. 2016. “Phenotypic Reversion in fas Mutants of Arabidopsis thaliana by Reintroduction of FAS Genes: Variable Recovery of Telomeres with Major Spatial Rearrangements and Transcriptional Reprogramming of 45S rDNA Genes.” The Plant Journal: For Cell and Molecular Biology 88 (3): 411–24. https://doi.org/10.1111/tpj.13257.
Pederson, T. 2011. “The Nucleolus.” Cold Spring Harbor Perspectives in Biology 3 (3): 1–15. https://doi.org/10.1101/cshperspect.a000638.
Pederson, Thoru. 2000. “The Nucleolus and the Four Ribonucleoproteins of Translation.” The Journal of Cell Biology 148 (6): 1091–96. https://doi.org/10.1083/jcb.148.6.1091.
Pedrosa-Harand, Andrea, Cícero C. Souza de Almeida, Magdalena Mosiolek, Matthew W. Blair, Dieter Schweizer, and Marcelo Guerra. 2006. “Extensive Ribosomal DNA Amplification during Andean Common Bean (Phaseolus vulgaris L.) Evolution.” Theoretical and Applied Genetics 112 (5): 924–33. https://doi.org/10.1007/s00122-005-0196-8.
Phan, Anh Tuân, Vitaly Kuryavyi, and Dinshaw J Patel. 2006. “DNA Architecture: From G to Z.” Current Opinion in Structural Biology 16 (3): 288–98. https://doi.org/10.1016/j.sbi.2006.05.011.
Picart-Picolo, Ariadna, Nathalie Picault, and Frédéric Pontvianne. 2019. “Ribosomal RNA Genes Shape Chromatin Domains Associating with the Nucleolus.” Nucleus 10 (1): 67–72. https://doi.org/10.1080/19491034.2019.1591106.
Politz, Joan C., Laura B. Lewandowski, and Thoru Pederson. 2002. “Signal Recognition Particle RNA Localization within the Nucleolus Differs from the Classical Sites of Ribosome Synthesis.” The Journal of Cell Biology 159 (3): 411–18. https://doi.org/10.1083/jcb.200208037.
Polo, Sophie E., and Geneviève Almouzni. 2015. “Chromatin Dynamics after DNA Damage: The Legacy of the Access-Repair-Restore Model.” DNA Repair 36 (December): 114–21. https://doi.org/10.1016/j.dnarep.2015.09.014.
Polo, Sophie E., Danièle Roche, and Geneviève Almouzni. 2006. “New Histone Incorporation Marks Sites of UV Repair in Human Cells.” Cell 127 (3): 481–93. https://doi.org/10.1016/j.cell.2006.08.049.
Pontes, Olga, Richard J. Lawrence, Nuno Neves, Manuela Silva, Jae-Hyeok Lee, Z. Jeffrey Chen, Wanda Viegas, and Craig S. Pikaard. 2003. “Natural Variation in Nucleolar Dominance Reveals the Relationship between Nucleolus Organizer Chromatin Topology and rRNA Gene Transcription in Arabidopsis.” Proceedings of the National Academy of Sciences 100 (20): 11418–23. https://doi.org/10.1073/pnas.1932522100.
Pontvianne, Frédéric, Mohamed Abou-Ellail, Julien Douet, Pascale Comella, Isabel Matia, Chinmayi Chandrasekhara, Anne DeBures, et al. 2010. “Nucleolin Is Required for DNA Methylation State and the Expression of rRNA Gene Variants in Arabidopsis thaliana.” PLOS Genetics 6 (11): e1001225. https://doi.org/10.1371/journal.pgen.1001225.
Pontvianne, Frederic, Todd Blevins, Chinmayi Chandrasekhara, Iva Mozgová, Christiane Hassel, Olga M. F. Pontes, Sarah Tucker, et al. 2013. “Subnuclear Partitioning of rRNA Genes between the Nucleolus and Nucleoplasm Reflects Alternative Epiallelic States.” Genes & Development 27 (14): 1545–50. https://doi.org/10.1101/gad.221648.113.

Pontvianne, Frédéric, Marie-Christine Carpentier, Nathalie Durut, Veronika Pavlištová, Karin Jaške, Šárka Schořová, Hugues Parrinello, et al. 2016. “Identification of Nucleolus-Associated Chromatin Domains Reveals a Role for the Nucleolus in 3D Organization of the A. thaliana Genome.” Cell Reports 16 (6): 1574–87. https://doi.org/10.1016/j.celrep.2016.07.016.
Preuss, Sasha B., Pedro Costa-Nunes, Sarah Tucker, Olga Pontes, Richard J. Lawrence, Rebecca Mosher, Kristin D. Kasschau, et al. 2008. “Multi-Megabase Silencing in Nucleolar Dominance Results from siRNA-Directed de Novo DNA Methylation Recognized by Specific Methylcytosine Binding Proteins.” Molecular Cell 32 (5): 673–84. https://doi.org/10.1016/j.molcel.2008.11.009.
Preuss, Sasha, and Craig S. Pikaard. 2007. “rRNA Gene Silencing and Nucleolar Dominance: Insights into a Chromosome-Scale Epigenetic on/off Switch.” Biochimica Et Biophysica Acta 1769 (5–6): 383–92. https://doi.org/10.1016/j.bbaexp.2007.02.005.
Pruitt, Robert E., and Elliot M. Meyerowitz. 1986. “Characterization of the Genome of Arabidopsis thaliana.” Journal of Molecular Biology 187 (2): 169–83. https://doi.org/10.1016/0022-2836(86)90226-3.
Puchta, Holger. 2005. “The Repair of Double-Strand Breaks in Plants: Mechanisms and Consequences for Genome Evolution.” Journal of Experimental Botany 56 (409): 1–14. https://doi.org/10.1093/jxb/eri025.
Putnam, C D, and C S Pikaard. 1992. “Cooperative Binding of the Xenopus RNA Polymerase I Transcription Factor XUBF to Repetitive Ribosomal Gene Enhancers.” Molecular and Cellular Biology 12 (11): 4970–80.
Quinodoz, Sofia A., Noah Ollikainen, Barbara Tabak, Ali Palla, Jan Marten Schmidt, Elizabeth Detmar, Mason M. Lai, et al. 2018. “Higher-Order Inter-Chromosomal Hubs Shape 3D Genome Organization in the Nucleus.” Cell 174 (3): 744-757.e24. https://doi.org/10.1016/j.cell.2018.05.024.
Quivy, Jean-Pierre, Danièle Roche, Doris Kirschner, Hideaki Tagami, Yoshihiro Nakatani, and Geneviève Almouzni. 2004. “A CAF-1 Dependent Pool of HP1 during Heterochromatin Duplication.” The EMBO Journal 23 (17): 3516–26. https://doi.org/10.1038/sj.emboj.7600362.
Rachwal, Phillip A., Tom Brown, and Keith R. Fox. 2007a. “Sequence Effects of Single Base Loops in Intramolecular Quadruplex DNA.” FEBS Letters 581 (8): 1657–60. https://doi.org/10.1016/j.febslet.2007.03.040.
Rachwal, Phillip A., Tom Brown, and Keith R. Fox. 2007b. “Effect of G-Tract Length on the Topology and Stability of Intramolecular DNA Quadruplexes.” Biochemistry 46 (11): 3036–44. https://doi.org/10.1021/bi062118j.
Ramachandran, Srinivas, and Steven Henikoff. 2015. “Replicating Nucleosomes.” Science Advances 1 (7). https://doi.org/10.1126/sciadv.1500587.
Ramirez-Parra, Elena, and Crisanto Gutierrez. 2007a. “E2F Regulates FASCIATA1, a Chromatin Assembly Gene Whose Loss Switches on the Endocycle and Activates Gene Expression by Changing the Epigenetic Status.” Plant Physiology 144 (1): 105–20. https://doi.org/10.1104/pp.106.094979.
Ramirez-Parra, Elena, and Crisanto Gutierrez. 2007b. “The Many Faces of Chromatin Assembly Factor 1.” Trends in Plant Science 12 (12): 570–76. https://doi.org/10.1016/j.tplants.2007.10.002.
Ransom, Monica, Briana K. Dennehey, and Jessica K. Tyler. 2010. “Chaperoning Histones during DNA Replication and Repair.” Cell 140 (2): 183–95. https://doi.org/10.1016/j.cell.2010.01.004.
Raska, Ivan, Karel Koberna, Jan Malínský, Helena Fidlerová, and Martin Masata. 2004. “The Nucleolus and Transcription of Ribosomal Genes.” Biology of the Cell 96 (8): 579–94. https://doi.org/10.1016/j.biolcel.2004.04.015.
Raska, Ivan, Peter J. Shaw, and Dusan Cmarko. 2006. “Structure and Function of the Nucleolus in the Spotlight.” Current Opinion in Cell Biology 18 (3): 325–34. https://doi.org/10.1016/j.ceb.2006.04.008.

Reddy, Anireddy S. N., Irene S. Day, Janett Göhring, and Andrea Barta. 2012. “Localization and Dynamics of Nuclear Speckles in Plants.” Plant Physiology 158 (1): 67–77. https://doi.org/10.1104/pp.111.186700.
Reese, Brian E., Kurtis E. Bachman, Stephen B. Baylin, and Michael R. Rountree. 2003. “The Methyl-CpG Binding Protein MBD1 Interacts with the P150 Subunit of Chromatin Assembly Factor 1.” Molecular and Cellular Biology 23 (9): 3226–36. https://doi.org/10.1128/MCB.23.9.3226-3236.2003.
Rhodes, Daniela, and Hans J. Lipps. 2015. “G-Quadruplexes and Their Regulatory Roles in Biology.” Nucleic Acids Research 43 (18): 8627–37. https://doi.org/10.1093/nar/gkv862.
Risitano, A. 2004. “Influence of Loop Size on the Stability of Intramolecular DNA Quadruplexes.” Nucleic Acids Research 32 (8): 2598–2606. https://doi.org/10.1093/nar/gkh598.
Roa, Fernando, and Marcelo Guerra. 2012. “Distribution of 45S rDNA Sites in Chromosomes of Plants: Structural and Evolutionary Implications.” BMC Evolutionary Biology 12 (1): 225. https://doi.org/10.1186/1471-2148-12-225.
Roberts, Alison W., Eric M. Roberts, and Candace H. Haigler. 2012. “Moss Cell Walls: Structure and Biosynthesis.” Frontiers in Plant Science 3. https://doi.org/10.3389/fpls.2012.00166.
Rolef Ben-Shahar, Tom, Araceli G. Castillo, Michael J. Osborne, Katherine L. B. Borden, Jack Kornblatt, and Alain Verreault. 2009. “Two Fundamentally Distinct PCNA Interaction Peptides Contribute to Chromatin Assembly Factor 1 Function.” Molecular and Cellular Biology 29 (24): 6353–65. https://doi.org/10.1128/MCB.01051-09.
Rosato, Marcela, Aleš Kovařík, Ricardo Garilleti, and Josep A. Rosselló. 2016. “Conserved Organisation of 45S rDNA Sites and rDNA Gene Copy Number among Major Clades of Early Land Plants.” Edited by Xiu-Qing Li. PLOS ONE 11 (9): e0162544. https://doi.org/10.1371/journal.pone.0162544.
Roschzttardtz, Hannetz, Louis Grillet, Marie-Pierre Isaure, Geneviève Conéjéro, Richard Ortega, Catherine Curie, and Stéphane Mari. 2011. “Plant Cell Nucleolus as a Hot Spot for Iron.” Journal of Biological Chemistry 286 (32): 27863–66. https://doi.org/10.1074/jbc.C111.269720.
Roudier, François, Ikhlak Ahmed, Caroline Bérard, Alexis Sarazin, Tristan Mary-Huard, Sandra Cortijo, Daniel Bouyer, et al. 2011. “Integrative Epigenomic Mapping Defines Four Main Chromatin States in Arabidopsis.” The EMBO Journal 30 (10): 1928–38. https://doi.org/10.1038/emboj.2011.103.
Rowlands, Hollie, Piriththiv Dhavarasa, Ashley Cheng, and Krassimir Yankulov. 2017. “Forks on the Run: Can the Stalling of DNA Replication Promote Epigenetic Changes?” Frontiers in Genetics 8. https://doi.org/10.3389/fgene.2017.00086.
Sáez-Vasquez, J., D. Caparros-Ruiz, F. Barneche, and M. Echeverría. 2004. “Characterization of a Crucifer Plant Pre-rRNA Processing Complex.” Biochemical Society Transactions 32 (Pt 4): 578–80. https://doi.org/10.1042/BST0320578.
Sáez-Vásquez, Julio, and Michel Delseny. 2019. “Ribosome Biogenesis in Plants: From Functional 45S Ribosomal DNA Organization to Ribosome Assembly Factors.” The Plant Cell 31 (9): 1945–67. https://doi.org/10.1105/tpc.18.00874.
Saini, Natalie, Yu Zhang, Karen Usdin, and Kirill S. Lobachev. 2013. “When Secondary Comes First – the Importance of Non-Canonical DNA Structures.” Biochimie 95 (2): 117–23. https://doi.org/10.1016/j.biochi.2012.10.005.
Santoro, Raffaella, Kerstin-Maike Schmitz, Juan Sandoval, and Ingrid Grummt. 2010. “Intergenic Transcripts Originating from a Subclass of Ribosomal DNA Repeats Silence Ribosomal RNA Genes in Trans.” EMBO Reports 11 (1): 52–58. https://doi.org/10.1038/embor.2009.254.
Sarek, Grzegorz, Jean-Baptiste Vannier, Stephanie Panier, John H. J. Petrini, and Simon J. Boulton. 2015. “TRF2 Recruits RTEL1 to Telomeres in S Phase to Promote T-Loop Unwinding.” Molecular Cell 57 (4): 622–35. https://doi.org/10.1016/j.molcel.2014.12.024.
Sarkies, Peter, Charlie Reams, Laura J. Simpson, and Julian E. Sale. 2010. “Epigenetic Instability Due to Defective Replication of Structured DNA.” Molecular Cell 40 (5): 703–13. https://doi.org/10.1016/j.molcel.2010.11.009.
Scheer, U., and D. Weisenberger. 1994. “The Nucleolus.” Current Opinion in Cell Biology 6 (3): 354–59. https://doi.org/10.1016/0955-0674(94)90026-4.

Seitz, Ursula, and Ulrich Seitz. 2014. “The Molecular Weight of rRNA Precursor Molecules and Their Processing in Higher Plant Cells.” Zeitschrift Für Naturforschung C 34 (3–4): 253–258. https://doi.org/10.1515/znc-1979-3-416.
Sequeira-Mendes, Joana, Irene Aragüez, Ramón Peiró, Raul Mendez-Giraldez, Xiaoyu Zhang, Steven E. Jacobsen, Ugo Bastolla, and Crisanto Gutierrez. 2014. “The Functional Topography of the Arabidopsis Genome Is Organized in a Reduced Number of Linear Motifs of Chromatin States.” The Plant Cell 26 (6): 2351–66. https://doi.org/10.1105/tpc.114.124578.
Sfeir, Agnel, Settapong T. Kosiyatrakul, Dirk Hockemeyer, Sheila L. MacRae, Jan Karlseder, Carl L. Schildkraut, and Titia de Lange. 2009. “Mammalian Telomeres Resemble Fragile Sites and Require TRF1 for Efficient Replication.” Cell 138 (1): 90–103. https://doi.org/10.1016/j.cell.2009.06.021.
Shibahara, K., and B. Stillman. 1999. “Replication-Dependent Marking of DNA by PCNA Facilitates CAF-1-Coupled Inheritance of Chromatin.” Cell 96 (4): 575–85. https://doi.org/10.1016/s0092-8674(00)80661-3.
Smith, S, and B Stillman. 1991. “Stepwise Assembly of Chromatin during DNA Replication in Vitro.” The EMBO Journal 10 (4): 971–80.
Sochorová, Jana, Sònia Garcia, Francisco Gálvez, Radka Symonová, and Aleš Kovařík. 2018. “Evolutionary Trends in Animal Ribosomal DNA Loci: Introduction to a New Online Database.” Chromosoma 127 (1): 141–50. https://doi.org/10.1007/s00412-017-0651-8.
Song, Yanjun, Feng He, Gengqiang Xie, Xiaoyan Guo, Yanjuan Xu, Yixu Chen, Xuehong Liang, et al. 2007. “CAF-1 Is Essential for Drosophila Development and Involved in the Maintenance of Epigenetic Memory.” Developmental Biology 311 (1): 213–22. https://doi.org/10.1016/j.ydbio.2007.08.039.
Spear, B. B. 1980. “Isolation and Mapping of the rRNA Genes in the Macronucleus of Oxytricha fallax.” Chromosoma 77 (2): 193–202. https://doi.org/10.1007/bf00329544.
Spector, David L., and Angus I. Lamond. 2011. “Nuclear Speckles.” Cold Spring Harbor Perspectives in Biology 3 (2). https://doi.org/10.1101/cshperspect.a000646.
Stults, Dawn M., Michael W. Killen, Erica P. Williamson, Jon S. Hourigan, H. David Vargas, Susanne M. Arnold, Jeffrey A. Moscow, and Andrew J. Pierce. 2009. “Human rRNA Gene Clusters Are Recombinational Hotspots in Cancer.” Cancer Research 69 (23): 9096–9104. https://doi.org/10.1158/0008-5472.CAN-09-2680.
Tagami, Hideaki, Dominique Ray-Gallet, Geneviève Almouzni, and Yoshihiro Nakatani. 2004. “Histone H3.1 and H3.3 Complexes Mediate Nucleosome Assembly Pathways Dependent or Independent of DNA Synthesis.” Cell 116 (1): 51–61. https://doi.org/10.1016/s0092-8674(03)01064-x.
Tomlinson, Rebecca L., Tania D. Ziegler, Teerawit Supakorndej, Rebecca M. Terns, and Michael P. Terns. 2006. “Cell Cycle-Regulated Trafficking of Human Telomerase to Telomeres.” Molecular Biology of the Cell 17 (2): 955–65. https://doi.org/10.1091/mbc.E05-09-0903.
Toubiana, Shir, and Sara Selig. 2018. “DNA:RNA Hybrids at Telomeres – When It Is Better to Be out of the (R) Loop.” The FEBS Journal 285 (14): 2552–66. https://doi.org/10.1111/febs.14464.
Tsai, Robert Y. L., and Thoru Pederson. 2014. “Connecting the Nucleolus to the Cell Cycle and Human Disease.” The FASEB Journal 28 (8): 3290–96. https://doi.org/10.1096/fj.14-254680.
Tseng, Hung, Weichin Chou, Junwen Wang, Xiaohong Zhang, Shengliang Zhang, and Richard M. Schultz. 2008. “Mouse Ribosomal RNA Genes Contain Multiple Differentially Regulated Variants.” PloS One 3 (3): e1843. https://doi.org/10.1371/journal.pone.0001843.
Tucker, Sarah, Alexa Vitins, and Craig S. Pikaard. 2010. “Nucleolar Dominance and Ribosomal RNA Gene Silencing.” Current Opinion in Cell Biology 22 (3): 351–56. https://doi.org/10.1016/j.ceb.2010.03.009.
Tutois, S., C. Cloix, O. Mathieu, C. Cuvillier, and S. Tourmente. 2002. “Analysis of 5S rDNA Loci among Arabidopsis Ecotypes and Subspecies.” September 2002. https://doi.org/10.1166/gl.2002.016.
Vannier, Jean-Baptiste, Visnja Pavicic-Kaltenbrunner, Mark I.R. Petalcorin, Hao Ding, and Simon J. Boulton. 2012. “RTEL1 Dismantles T Loops and Counteracts Telomeric G4-DNA to Maintain Telomere Integrity.” Cell 149 (4): 795–806. https://doi.org/10.1016/j.cell.2012.03.030.

Vannier, Jean-Baptiste, Sumit Sandhu, Mark IR Petalcorin, Xiaoli Wu, Zinnatun Nabi, Hao Ding, and Simon J. Boulton. 2013. “RTEL1 Is a Replisome-Associated Helicase That Promotes Telomere and Genome-Wide Replication.” Science 342 (6155): 239–42. https://doi.org/10.1126/science.1241779.
Verreault, A., P. D. Kaufman, R. Kobayashi, and B. Stillman. 1996. “Nucleosome Assembly by a Complex of CAF-1 and Acetylated Histones H3/H4.” Cell 87 (1): 95–104. https://doi.org/10.1016/s0092-8674(00)81326-4.
Vierna, J, K T Jensen, A Martínez-Lage, and A M González-Tizón. 2011. “The Linked Units of 5S rDNA and U1 snDNA of Razor Shells (Mollusca: Bivalvia: Pharidae).” Heredity 107 (2): 127–42. https://doi.org/10.1038/hdy.2010.174.
Viktorovskaya, Olga V., and David A. Schneider. 2015. “Functional Divergence of Eukaryotic RNA Polymerases: Unique Properties of RNA Polymerase I Suit Its Cellular Role.” Gene, SI: RNA Polymerases I,III,IV, &V, 556 (1): 19–26. https://doi.org/10.1016/j.gene.2014.10.035.
Vitales, Daniel, Ugo D’Ambrosio, Francisco Gálvez, Aleš Kovařík, and Sònia Garcia. 2017. “Third Release of the Plant rDNA Database with Updated Content and Information on Telomere Composition and Sequenced Plant Genomes.” Plant Systematics and Evolution 303 (8): 1115–21. https://doi.org/10.1007/s00606-017-1440-9.
Wanzenböck, Eva-Maria, Christian Schöfer, Dieter Schweizer, and Andreas Bachmair. 1997. “Ribosomal Transcription Units Integrated via T-DNA Transformation Associate with the Nucleolus and Do Not Require Upstream Repeat Sequences for Activity in Arabidopsis thaliana.” The Plant Journal 11 (5): 1007–16. https://doi.org/10.1046/j.1365-313X.1997.11051007.x.
Watson, J. D., and F. H. Crick. 1953. “Molecular Structure of Nucleic Acids; a Structure for Deoxyribose Nucleic Acid.” Nature 171 (4356): 737–38. https://doi.org/10.1038/171737a0.
Xu, Mo, Chengzu Long, Xiuzhen Chen, Chang Huang, She Chen, and Bing Zhu. 2010. “Partitioning of Histone H3-H4 Tetramers During DNA Replication–Dependent Chromatin Assembly.” Science 328 (5974): 94–98. https://doi.org/10.1126/science.1178994.
Yu, Zhongsheng, Jiyong Liu, Wu-Min Deng, and Renjie Jiao. 2015. “Histone Chaperone CAF-1: Essential Roles in Multi-Cellular Organism Development.” Cellular and Molecular Life Sciences 72 (2): 327–37. https://doi.org/10.1007/s00018-014-1748-3.
Zhang, Yu, Rachel Patton McCord, Yu-Jui Ho, Bryan R. Lajoie, Dominic G. Hildebrand, Aline C. Simon, Michael S. Becker, Frederick W. Alt, and Job Dekker. 2012. “Spatial Organization of the Mouse Genome and Its Role in Recurrent Chromosomal Translocations.” Cell 148 (5): 908–21. https://doi.org/10.1016/j.cell.2012.02.002.

12. Attachments

Attachment 1:

Variation of 45S rDNA intergenic spacers in Arabidopsis thaliana. Havlová K., Dvořáčková M., Peiro R., Abia D., Mozgová I., Vansáčová L., Gutierrez C. and Fajkus J. (2016). Plant Mol Biol. doi:10.1007/s11103-016-0524-1

Attachment 2:

Roles of RAD51 and RTEL1 in telomere and rDNA stability in Physcomitrella patens. Goffová I., Vágnerová R., Peška V., Franěk M., Havlová K., Holá M., Zachová D., Fojtová M., Cuming A., Kamisugi Y., Angelis K. J. and Fajkus J. (2019). Plant J. doi: 10.1111/tpj.14304

Attachment 3: Curriculum Vitae

Plant Mol Biol DOI 10.1007/s11103-016-0524-1

Variation of 45S rDNA intergenic spacers in Arabidopsis thaliana

Kateřina Havlová1 · Martina Dvořáčková1,3 · Ramon Peiro4 · David Abia4 · Iva Mozgová2 · Lenka Vansáčová2 · Crisanto Gutierrez4 · Jiří Fajkus1,2

Received: 29 June 2016 / Accepted: 3 August 2016 © Springer Science+Business Media Dordrecht 2016

Abstract Approximately seven hundred 45S rRNA genes (rDNA) in the Arabidopsis thaliana genome are organised in two 4 Mbp-long arrays of tandem repeats arranged in head-to-tail fashion separated by an intergenic spacer (IGS). These arrays make up 5 % of the A. thaliana genome. IGS are rapidly evolving sequences and frequent rearrangements inside the rDNA loci have generated considerable interspecific and even intra-individual variability which allows to distinguish among otherwise highly conserved rRNA genes. The IGS has not been comprehensively described despite its potential importance in regulation of rDNA transcription and replication. Here we describe the detailed sequence variation in the complete IGS of A. thaliana WT plants and provide the reference/consensus IGS sequence, as well as genomic DNA analysis. We further investigate mutants dysfunctional in chromatin assembly factor-1 (CAF-1) (fas1 and fas2 mutants), which are known to have a reduced number of rDNA copies, and plant lines with restored CAF-1 function (segregated from a fas1xfas2 genetic background) showing major rDNA rearrangements. The systematic rDNA loss in CAF-1 mutants leads to the decreased variability of the IGS and to the occurrence of distinct IGS variants. We present for the first time a comprehensive and representative set of complete IGS sequences, obtained by conventional cloning and by Pacific Biosciences sequencing. Our data expands the knowledge of the A. thaliana IGS sequence arrangement and variability, which has not been available in full and in detail until now. This is also the first study combining IGS sequencing data with RFLP analysis of genomic DNA.

Keywords Arabidopsis thaliana · Chromatin assembly factor · Nucleolus organizer region · 45S ribosomal DNA · Intergenic spacer · rDNA rearrangements

Electronic supplementary material The online version of this article (doi:10.1007/s11103-016-0524-1) contains supplementary material, which is available to authorized users.

Martina Dvořáčková [email protected]
Jiří Fajkus [email protected]

1 Mendel Centre for Plant Genomics and Proteomics, CEITEC, Masaryk University, Kamenice 5, 62500 Brno, Czech Republic
2 Faculty of Science, Laboratory of Functional Genomics and Proteomics, National Centre for Biomolecular Research, Masaryk University, Kotlářská 2, 61137 Brno, Czech Republic
3 Institute of Biophysics ASCR, v.v.i., Královopolská 135, 61265 Brno, Czech Republic
4 Centro de Biologia Molecular Severo Ochoa, CSIC-UAM, Nicolas Cabrera 1, Madrid 28049, Spain

Introduction

Genome rearrangements play a key role in evolution of species leading to biodiversity, adaptation to particular environmental changes, or tolerance to abiotic and natural stresses (e.g. Geiser et al. 2016; Kelly et al. 2015; Long et al. 2013; Mandakova and Lysak 2008). It is interesting that genome reorganisation often involves 45S rRNA genes (rDNA) (Elliott et al. 2013; Garcia and Kovarik 2013; Long et al. 2013; Weider et al. 2005). Ribosomal genes fall into the group of well-conserved housekeeping genes characterised by repetitiveness, high abundance, and the presence of two distinct fractions, an actively transcribed and an inactive fraction, condensed and often spatially separated from the active gene copies (Grummt and Pikaard 2003; Pontes et al. 2003; Pontvianne et al. 2013). Examples of ribosomal locus reorganisation can be found in many different organisms. There are plant species containing 45S (or 35S) rDNA associated with 5S rDNA in a single operon, unlike the majority of plants including Arabidopsis thaliana where these two units are separated (Garcia and Kovarik 2013; Garcia et al. 2010). Further, in A. thaliana from Northern Sweden the amount of rDNA is highly variable, thus having significant impact on the total genome size (Long et al. 2013). The ability of rDNA to expand or contract is well described in budding yeast (e.g. Kobayashi et al. 2004). In marine bacteria, short insertions in 16S rDNA result in structural changes during the adaptation to a high pressure environment (Lauro et al. 2007); in fish, rapid speciation is associated with retrotransposon-driven amplification of rDNA (Symonova et al. 2013); and the phenomenon termed a “jumping” NOR, originally described by Schubert and Wobus (1985), results in a variable number of rDNA loci in Allium. One could conclude that such changes are associated with high sequence variability inside the rDNA genes, but in fact the regions affected the most are the short internal spacers (internal transcribed spacers ITS1 and ITS2) and the spacer between individual gene clusters (intergenic spacer, IGS). The rDNA ITS have become widely used for species identification and even the highly variable IGS can be utilized to distinguish among closely related species in phylogenetic studies (e.g. Cavallero et al. 2015; Han et al. 2016; Konstantinova and Yli-Mattila 2004; Lin et al. 2014; Marcilla et al. 2001).

We focus here on the IGS of rDNA in the plant A. thaliana. This model plant contains 570–750 copies of the rRNA gene organised at two chromosomal sites (Pruitt and Meyerowitz 1986) on chromosomes two and four (Copenhaver and Pikaard 1996a). Transcriptionally active rDNA copies are known as Nucleolus Organizer Regions (NORs). Long rDNA arrays consist of ca. 10 kb-long repetitive units of which 5.7 kb represent the coding region (18S–5.8S–25S) separated by a ~4.5 kb long IGS. So far, the A. thaliana IGS has been described mainly leaving the intraspecies sequence variability aside (Gruendler et al. 1989), with only fragments of IGS sequences available in public databases, which allow only a consensus IGS to be assembled. The IGS typically contains tandem repetitions enriched with SalI restriction sites (SalI boxes), spacer and rDNA gene promoters, transcription terminators, and signals for pre-rRNA processing (Abou-Ellail et al. 2011; Gruendler et al. 1989). SalI boxes, in particular, are often discussed in connection with regulation of rDNA transcription, which is independent of the well-described epigenetic regulation (Durut et al. 2014; Earley et al. 2010; Pontvianne et al. 2010). SalI boxes function as enhancers or terminators of rDNA transcription in Xenopus or mammals (Grummt et al. 1986; Pikaard et al. 1990).

The A. thaliana IGS can be divided into three parts, the non-transcribed spacer (NTS) and two external transcribed spacers (3′ETS and 5′ETS) which are transcribed by polymerase I as a part of the nascent pre-rRNA. The sequence variability within the 3′ETS has been extensively studied (Abou-Ellail et al. 2011; Pontvianne et al. 2010). Four 3′ETS variants were described from which we can distinguish four rRNA gene subtypes. Three of the variants, var1-3, are abundant, while var4 is relatively rare. rRNA gene subtypes differ in transcriptional activity according to the plant’s developmental stage; var2, var3 and var4 are expressed in roots, flowers, leaves and seedlings while var1 is expressed only in 2-day-old seedlings. Recently, the chromosomal position of these variants has been determined (Chandrasekhara et al. 2016). Var1 and a subset of var3 map to NOR2 while var2, var4 and the majority of var3 map to NOR4. Since all the inactive rRNA gene subtypes map to the same rDNA locus, Chandrasekhara et al. suggest that the gene subtypes are selectively silenced based on their chromosomal position rather than in a sequence-dependent manner. Our latest studies further show that this distribution of active and inactive rDNA variants to chromosomes 4 and 2, respectively, can be altered when the ribosomal genes are eliminated and subsequently recovered, as observed in fasciata (fas1 or fas2) mutants and plant lines derived from a fas background with restored FAS function, respectively (Pavlištová et al. 2016). FASCIATA 1 and 2 are two of the subunits of the highly conserved trimeric histone H3/H4 chaperone Chromatin Assembly Factor-1, CAF-1 (Smith and Stillman 1989) whose deficiency results in severe phenotypic defects including stem fasciation, abnormal leaf and flower morphology, and disorganization of the apical meristem (Kaya et al. 2001; Leyser and Furner 1992; Reinholz 1966). The fas mutation leads to cell cycle-related defects such as increased homologous recombination (Endo et al. 2006; Kirik et al. 2006; Takeda et al. 2004), slower progression of the S-phase, constitutive activation of the G2 checkpoint and triggering the endocycle (Exner et al. 2006; Ramirez-Parra and Gutierrez 2007; Schonrock et al. 2006). In addition, fas mutants show progressive telomere shortening and loss of rDNA repeats while other repetitive genomic regions remain unaffected (Mozgova et al. 2010). Our previous results show that both originally active and originally inactive rDNA variants are lost in fas mutants (Pontvianne et al. 2013) and that this loss is mitotic and connected with homology-dependent repair (Muchova et al. 2015). In experiments leading to FAS gene reintroduction we observed that in contrast to WT plants both rDNA loci, two as well as four, can be activated (Pavlištová et al. 2016).

As we previously reviewed (Dvorackova et al. 2015), rDNA represents a problematic template to replicate and a hot spot for recombination. The fas mutants, as well as plant lines with restored FAS function (here termed revertants), are unique examples of plants where rDNA rearrangements have occurred.

Cloning, plasmid DNA isolation and sequencing
Further, the IGS sequence has not been stud- We amplified the IGS using a pair of primers designed in a ied extensively inA. thaliana plants and so far only fragmen - conserved region of the 25S and 18S rRNA genes (25SFw tary information has been available. This lack of knowledge and 18SR or 25SFw_seq and 18SR_seq, online resource 1). of the complete IGS sequence and its naturally existing The PCR contained 1.25 U of ExTaq polymerase (TaKaRa), − variants limits interpretation of studies addressing genetic 0.8 mmol l 1 of dNTP mix, 5 µl of 10 × ExTaq Buffer, 50 ng − and epigenetic factors involved in regulation of rDNA loci. of genomic DNA, 0.6 µmol l 1 of primers and water up to Here we address the question whether the combination of 50 µl. The PCR conditions included incubation at 94 °C for 3′ETS variants and the juxtaposed IGS regions (mainly SalI 30 s, followed by 30 cycles of 98 °C for 10 s, 55 °C for 30 s, boxes) is completely random or whether there are some 72 °C for 4 min, with final incubation at 72 °C for 20 min. rules for IGS arrangements. Furthermore, we ask how fas The cloning reaction was prepared using 4 µl of fresh PCR disruption reflects on IGS sequence and variability, and if product (approx. 100 ng), 1 µl of plasmid pCR™ Invitrogen − plant lines with recovered rDNA arrays contain the same (10 ng µl 1) and 1 μl of salt solution (from the TOPO® TA IGS set-up as the original plants. We present a comprehen- Cloning® Kit, Invitrogen). The reaction was incubated for sive set of IGS sequences obtained by conventional cloning 30 min at room temperature then mixed with 20 µl of electro- and Pacific Biosciences (PacBio) sequencing supported by competent cells (E. coli ElectroMAX™ Stbl4™, Invitrogen) genomic DNA analysis by RFLP. Finally, we characterise or chemocompetent cells (E.coli One Shot® TOP10, Invitro- a new type of variability within the NTS region where the gen). 
In the case of electrocompetent cells, the salt in the clon- repetitive elements and spacer promoters are located, and ing reaction was diluted 5×, the bacteria were transformed by report a reference IGS sequence summarising all the infor- electroporation at 2.5 kV, mixed with 1 ml of SOC medium, mation obtained. incubated at 30 °C, 200 rpm for 90 min, then spread on Petri dishes with LB medium and incubated at 30 °C for 16 h. In the case of chemocompetent cells, the bacteria were trans- Material and methods formed by heatshock at 42 °C in a water bath for 45 s, mixed with 1 ml of SOC medium, incubated at 37 °C, 200 rpm for Plant material and DNA isolation 90 min, then spread on Petri dishes with LB medium contain- − − − ing ampicillin (50 µg ml1), IPTG (10 4 mol ml 1) and X-gal − All A. thaliana plants were on a Columbia 0 background (0.04 mg ml 1). Plates were incubated at 37 °C for 16 h. (Col 0). The T-DNA insertion mutantsfas1–4 and fas2–4 The blue–white test was used to select clones with a plas- (named fas1 and fas2 in the following text) are desig- mid insertion and the length of the insertion was tested by nated as NASC: N828822, SAIL_662_D10 and NASC: PCR with M13F/M13R primers. The plasmid DNA was iso - N533228, SALK_033228 (Exner et al.2006 ). Wild-type lated using GenElute™ HP Plasmid Miniprep Kits (Sigma (WT) A. thaliana plants had no mutant history or were seg - Aldrich) or QIA® Spin Miniprep Kits (Qiagen). Sequencing regated from the progeny of fas1 heterozygotes and grown of the clones was performed by Macrogen (South Korea) for another two or five generations (G2 +/+, G5+/+). The using the Sanger method with primers covering the whole fas1 mutants were segregated from heterozygous progeny IGS; for details see the list of primers (online resource 1). and used in the first generation or grown for another two, We obtained approx. 
1.1 kb long reads which were assem - five or seven generations fas1–4( G1−/−, G2−/−, G5−/−, bled by their overlapping regions. G7−/−). The fas2–4 mutants were segregated in the first generation (fas2–4 G1−/−). WT A. thaliana revertant Single molecule real-time sequencing lines 1, 3, 4, and 6 were segregated from the progeny of FAS1fas1/FAS2fas2 double heterozygotes (Pavlištová et Two series of single molecule real-time (SMRT) sequencing al. 2016). These lines have a low (line 6), medium (line were conducted. In the first series, the samples were prepared 3), or high (lines 1 and 4) rDNA content. Genomic DNA by PCR with the use of Q5 ® High-Fidelity DNA polymerase was isolated from leaves of 5 weeks-old plants using the (NEB) to amplify the shorter, approx. 3.5 kb long, fragment protocol according to (Dellaporta et al.1983 ) or using a of the IGS. In the second series, Phusion ® Hot Start II High- NucleoSpin Plant II Midiprep Kit (Macherey–Nagel). Fidelity polymerase (Thermo Scientific) was used to pro - Leaves used for a single sample came from 1 to 3 dif- duce both fragments of the IGS, 3.5 and 4.5 kb long. ferent individuals of the same genotype and generation In the case of PCR with Phusion polymerase, 25–30 (“siblings”). The DNA quality was checked by electropho- identical 10 μl PCRs were prepared and run in parallel to resis and the genotype was tested by PCR as described by obtain 5 µg of the product required for SMRT. A single PCR − (Mozgova et al. 2010). contained 0.2 U of polymerase, 0.3 mmol1 lof dNTP mix, 1 3 4 Plant Mol Biol

2 µl of 5× GC Reaction Buffer (Thermo Scientific), 20 ng of genomic DNA, 0.25 µmol l⁻¹ of primers Fx and Rx (see the list of primers for details), 2.5 % DMSO, 1.25 mmol l⁻¹ of MgCl2 and MilliQ water up to 10 µl. The products were mixed and purified using a QIA® PCR purification Kit (Qiagen). The process was repeated with ten different genomic DNAs and different pairs of primers FxRx where x is 0–9. The primers were designed in the conserved region of the 18S and 25S rRNA genes and were distinguished by a unique barcode (online resource 1). The annealing temperature had to be optimized for each pair of primers and was in the ranges 69–71 °C or 49–51 °C, respectively. PCR conditions included incubation at 98 °C for 30 s, followed by 2 cycles of 98 °C for 10 s, 49–51 °C for 30 s, 72 °C for 2 min 30 s, followed by 25 cycles of 98 °C for 10 s, 69–71 °C for 30 s, 72 °C for 2 min 30 s, with final incubation at 72 °C for 5 min. The quality and concentration of PCR products were measured using electrophoresis and a Nanodrop (Thermo Scientific). The PCR products were then mixed equimolarly and sequenced by GATC Biotech (Germany).

In the case of Q5 polymerase, 7–15 identical 25 µl PCRs were prepared and run in parallel. A single PCR contained 0.5 U of polymerase, 0.2 mmol l⁻¹ of dNTP mix, 2 µl of Reaction Buffer 5× (NEB), 2 µl of 5× Enhancer Buffer (NEB), 25 ng of genomic DNA, 0.25 µmol l⁻¹ of primers Fx and Rx, and MilliQ water up to 25 µl. The PCR conditions were incubation at 98 °C for 30 s, followed by 2 cycles of 98 °C for 10 s, 49 °C for 30 s, 72 °C for 3 min 20 s, followed by 25 cycles of 98 °C for 10 s, 69 °C for 30 s, 72 °C for 3 min 20 s, with final incubation at 72 °C for 5 min. The rest of the process was identical to that using Phusion polymerase.

Sequence analysis

The Pacific Biosciences sequencing method produces long reads which cover the sequenced region multiple times. The raw data were processed with the RS_ReadsOfInsert protocol in a SMRT analysis pipeline that produced consensus reads. We filtered the acquired data based on these criteria: high quality reads longer than 3 kbp containing the beginning (CCCTCCCCTAA) and end (ATCGATGAATG) of the IGS and at least one transcription initiation site (TIS) (TATATAGGG). The reads were identified by barcodes.

To search the sequences for SalI boxes and gene and spacer promoters (GP and SP), we needed a clear definition of these elements. We defined the SalI box as a close occurrence of two or more SalI restriction sites (GTCGAC). The borders of a SalI box are determined by the position of the first and last SalI restriction sites. The promoters were defined as the region from −55 to +6 nucleotides around a TIS sequence (TATATAGGG), where the underlined A is the first transcribed nucleotide +1. To search for these elements as well as for restriction sites, we used Python scripting.

To compare IGS sequences with information available in public databases we extracted two versions of the IGS reference. The first with 294, 294 and 500 bp long SalI boxes, here termed var1.294.294.500, was published in (Chandrasekhara et al. 2016) and the second was created as follows. The longest clone was BLAST searched against the official TAIR10 A. thaliana genome, and the TAIR10 genome fragments were merged into a unique IGS reference with 314 and 1077 bp long SalI boxes (online resource 2).

To summarize the IGS sequences obtained, we constructed a single IGS consensus as follows. We created 19 consensus sequences, one from each of the IGS types detected in the WT. These consensus sequences were aligned and a final WT consensus was extracted from the alignment. The consensus contains 316 and 1278 bp long SalI boxes (online resource 3). All multiple sequence alignments were created by ClustalW.

Restriction fragment analysis, probe labelling, and hybridization

The hybridization probes IGS1 and IGS2 were prepared by PCR using plasmid DNA containing a single IGS clone as a template. The PCR with IGS1F/IGS1R or IGS2F/IGS2R primers (online resource 1) contained 1.25 U of Taq polymerase (NEB) and was performed according to the manufacturer's instructions. The PCR product was extracted from a 1 % agarose gel by a QIAX® II Gel Extraction Kit (Qiagen) and labeled with radioactive α-[³²P]dCTP according to the Rediprime II DNA Labeling System protocol (GE Healthcare Life Sciences).

To digest genomic DNA, 900 ng of fas2 G2−/− or G3−/− and 500 ng of fas2 G2+/+ was mixed with 30 U of EcoRI and 15 U of HindIII (NEB); the larger amount of mutant DNA was used to compensate for the loss of rRNA genes in fas mutants. The mix was incubated at 37 °C for 16 h, then 15 U of EcoRI and 7.5 U of HindIII were added, the mixture was incubated for another 2 h, lyophilized, and subjected to electrophoresis on a 1.3 % agarose gel overnight at 40 V. The digested DNA from the gel was transferred by Southern blot to a Hybond™–XL membrane (GE Healthcare) and hybridized with the IGS1 or IGS2 probe in 0.25 M Na-phosphate pH 7.5, 7 % SDS, 0.016 M EDTA at 65 °C overnight. The membrane was washed three times in 2× SSC, 0.5 % SDS at 65 °C. The hybridization process was repeated with chloroplast probes to check that the DNA was digested completely (Fajkus and Reich 1991).

Pulsed field gel electrophoresis

Cells were embedded in agarose blocks as described in (Fojtova et al. 2002). The DNA concentration and integrity were checked by a preliminary pulsed field gel electrophoresis

(PFGE) using a CHEF MAPPER (Biorad). Pieces of agarose blocks were washed three times in 0.1× TE (0.1 mM EDTA, 1 mM Tris–HCl, pH 8.0) and digested overnight with 80 U of HindIII (NEB). DNA was separated by PFGE (pulse times: 0.22–17.33 s linear) at 14 °C in a 1 % (w/v) agarose gel in 0.5× TBE (4.5 mM Tris–HCl, 4.5 mM boric acid, 1.25 mM EDTA). The gel was Southern-blotted onto Hybond™–XL membrane (GE Healthcare) and hybridized with an IGS1 or IGS2 probe as described above.

Results

Sequencing data and nomenclature of IGS variants

To explore the IGS variants in detail, we used PacBio Single Molecule Real-Time sequencing, as well as cloning followed by Sanger sequencing. The sequences were submitted to GenBank under the accession numbers KU994650–KU994739 for the clones, KU992939–KU994649 for the PacBio reads, and SRP071272 for the raw reads. The number of reads and detected IGS variants obtained in long reads from PacBio SMRT sequencing are given in Table 1 and summarized in Fig. 1. Corresponding results obtained by Sanger sequencing of cloned PCR products are shown in Table 2.

To simplify addressing variants in the text, we named them systematically. A complete IGS variant name consists of a 3′ETS type and a SalI box type separated by a dot. There are five types of 3′ETS variants, var1–var5, including the var5 newly described here (see the next Section). If a particular IGS contains two or three SalI boxes in a row, they are also represented by their type (length in bp) separated by a dot. For example, we use as a reference the IGS

Table 1 Number of IGS sequences obtained by PacBio sequencing

Columns: WT; revertant lines 6 (low), 3 (medium), 1 (high), 4 (high); fas1 G1, G5, G7; fas2 G1

IGS variant: number of sequences
var1.294: 7, 7, 5, 6, 4, 4
var2.294: 4
var3.294: 1
var1.294.633: 1, 1, 1, 1, 1
var2.294.1005: 3
var3.294.1005: 1, 1, 1, 1
var1.294.1045: 6, 3, 2, 1, 2, 4
var2.294.1045: 24, 1, 1
var3.294.1234: 1
var1.294.1254: 1, 1, 2
var3.314.1005: 6, 1, 2, 6
var1.314.1045: 1
var1.366: 4, 2, 1
var5.366: 54, 63, 14, 7, 1, 20, 26, 52, 10
var1.458: 5, 2, 1, 3, 5, 2, 1
var2.458: 3
var3.458: 190, 165, 27, 162, 72, 137, 90, 72, 51
var1.582: 81, 15, 67, 46, 47, 43, 4, 15
var2.582: 7, 3, 1
var3.582: 7, 1, 4, 1
var1.705: 3, 1, 1, 1
var2.705: 1
var1.1045: 1
var3.1045: 2
var1.1505: 1
Sum: 407, 253, 52, 252, 137, 218, 172, 136, 84

The sequences are classified according to the IGS variant and the source of genomic DNA (WT, revertant lines with low, medium or high rDNA content and fas mutants in generations 1, 5 and 7). The sequences are deposited in GenBank under the accession numbers KU992939–KU994649. The names of individual IGS variants consist of the 3′ETS variant and the SalI box variant (based on the length in bp) separated by a dot


sequence assembled from sequences in public databases and published in (Chandrasekhara et al. 2016), which is 4725 bp long and contains two 294 bp-long SalI boxes, one 500 bp long SalI box, two spacer promoters, and one gene promoter (Fig. 2a). As this consists of a var1 3′ETS followed by three SalI boxes which are 294, 294 and 500 bp long, it is designated as var1.294.294.500 using our nomenclature.

Two different variants in the 5′ETS region characterised de novo in this paper were named varA and varB.

Table 2 Numbers of individual IGS types obtained by cloning

Columns: FAS G2; fas1 G2; fas1 G5; fas1 G7

IGS variant: number of clones
var1.294: 1
var3.294.1045: 1
var1.294.1045: 1, 1
var1.294.633: 1
var3.314.1005: 1, 5
var5.366: 1, 3, 4
var3.1045: 1
var3.458: 6, 11, 5, 4
var1.458: 1
var1.582: 2, 4, 5, 1
var3.582: 1
var1.705: 2
var2.1045: 1
var4.797: 1

Other clones which may contain a truncated SalI box region
var3.1056: 2
var3.314.892: 1
var3.160.1005: 1
var3.1342: 1
var3.314.737: 1
var3.953: 1
var3.829: 1
var2.1118: 1
var2.294.942: 1
var3.1015: 1, 1
var3.552: 1
var2.294.89.273: 1
var1.643: 1
var1.1020: 1
var1.222.986: 1
var1.664: 1
var3.68.633: 1
var3.779: 1
var3.531: 1
var3.573: 1

PCR products were cloned in E. coli. In total, 86 clones were sequenced by Sanger sequencing. The clones are classified according to the genomic DNA used in PCR. The majority of the clones belong to 14 significant IGS variants. Nevertheless, 22 clones appear to be truncated in the SalI box region. The sequences can be found in GenBank under the accession numbers KU994650–KU994739. The name of individual IGS variants consists of the 3′ETS variant and the SalI box variant (based on the length in bp) separated by a dot

Fig. 1 Graphical summary of PacBio read distribution in all plants: WT, revertant lines with low (6), medium (3) or high (1, 4) rDNA content, and fas1, fas2 mutants. The sequences can be found in GenBank under the accession numbers KU992939–KU994649. The name of individual IGS variants consists of the 3′ETS variant and the SalI box variant (based on the length in bp) separated by a dot

IGS variants in wild-type plants

Our results show that the IGS sequences within individual sister WT plants are far from being homogeneous. PCR amplification of the IGS, using primers for the conserved region of 18S and 25S genes, revealed two major products 3.5 and 4.5 kb long (online resource 4). Analysing our dataset of WT IGS sequences which consists of 407 IGS PacBio sequences and 16 clones (Tables 1, 2) we found variability in all three parts of the IGS, the 3′ETS, the NTS, and the 5′ETS.

The beginning of the IGS, which is delimited by the end of the 25S rRNA gene at the proximal site and by the first SalI restriction site at the distal site, contains the 3′ETS sequence according to which the rRNA genes can be classified into four types, var1–4 (Abou-Ellail et al. 2011). Given the small copy number of var4 in the genome, we did not detect it in WT reads; it was, however, found in fas1 clones. In addition, our data show one yet undescribed although abundant rRNA gene type which we named var5, which is closely related to var3 but differs mainly in a 100 bp deletion at the distal site of the 3′ETS and can therefore be regarded as a var3 subtype. The first SalI restriction site is situated at position 778 bp (var1), 726 bp (var2), 655 bp (var3), 736 bp (var4) or 549 bp

(var5) from the end of the 25S rRNA gene. The alignment of WT 3′ETS variants is presented in Fig. 2b, c.

In the NTS, we detected significant length variability in the SalI boxes which is caused by differences in the number of 20 bp repeats, creating short (<1 kb) and long (>1 kb) SalI clusters. Altogether, we found ten different types of SalI boxes in the WT, and two types specific to fas1. We named them according to their length in bp (294, 314, 366, 458, 582, 705, 1005, 1045, 1254, or 1505). The alignment of all 12 SalI box types is presented in Fig. 2d. While some IGS have only one SalI box, others have a combination of two different types. Such variability is clearly seen in the PCR amplification of the IGS that shows two major products of length ~3.5 and ~4.5 kb (online resource 4). The short PCR product contains only one SalI box of these types: 294 (2.9 %), 366 (14.1 %), 458 (48.7 %), 582 (23.2 %) or 705 (0.7 %). The long PCR product usually contains a combination of two SalI boxes separated by a spacer promoter. There are four different combinations of SalI box types (named according to the length of the individual SalI box in bp separated by a dot) occurring with different frequencies: 294.1005 (1 %), 294.1045 (7.4 %), 294.1254 (0.2 %) and 314.1005 (1 %). Rarely, we could see an IGS with only one long SalI box of the type 1045 (0.5 %) or 1505 (0.2 %). Thus within the shorter SalI boxes, only types 294 and 314 tend to be combined into longer IGS variants, while other SalI box types are isolated. In addition, the variant 314 is always combined into a longer IGS.

Combinations of 3′ETS variants var1–5 with adjoining SalI box types create 19 unique IGS variants (Fig. 3). Some of the SalI box types tend to be connected to a specific 3′ETS variant, var1–5 (Table 3). The 366 SalI box occurred in 55 cases together with var5, while only in four cases with var1. The 458 SalI box is mostly connected with var3 (46.8 %) and rarely with var2 (0.7 %) or var1 (1.2 %). The 582 SalI box is typically found with var1 (19.8 %) and rarely with var2 (1.7 %) or var3 (1.7 %). It is interesting that although 3′ETS var1 is the longest variant, its most abundant adjoining IGS is rather short. The highest number of long IGS types, 294.1045, is associated with a 3′ETS of var2. To conclude, in the WT each 3′ETS seems to contain one most common type of adjoining IGS (var1.582, var2.294.1045, var3.458, or var5.366) (see Table 3 for details). Although our methods provide only semi-quantitative data, these four types of IGS are clearly more frequent than the remaining variants (Fig. 1).

It was previously shown by analysing 3′ETS rDNA expression that var1 is the most abundant and the least expressed 3′ETS variant and that it often undergoes epigenetic reprogramming (Abou-Ellail et al. 2011; Earley et al. 2006, 2010). Here we show that this 3′ETS variant is also associated with the highest number of different adjoining regions, forming eight different subtypes of IGS. The least variability is associated with 3′ETS var5, where only one WT IGS subtype was found.

The last region of the IGS, the 1841 bp long 5′ETS situated upstream of the 18S rRNA gene, was described in detail (Gruendler et al. 1991). It contains two tandem 310 bp long repeats termed C1 and C2, which differ from each other in 4 SNPs at position 28, 130, 182 and 274 bp (Gruendler et al. 1991). Although the length heterogeneity in the 5′ETS is rare, we detected in 4 % of sequences a 310 bp long deletion corresponding to a C repeat (Fig. 4b). We designated the major variant, containing both C repeats, as varA and the deletion variant, containing only one C repeat, as varB (Fig. 2a, online resource 5). Based on the SNPs, we conclude that the C repeat in varB results from various combinations of original C1 and C2 repeat rather than selective deletion of either C1 or C2 (online resource 5). The varB sequences come from different IGS variants: var3.458, var3.582, var1.582, var1.366 and var5.366. Therefore, the varB does not seem to be linked to a specific 3′ETS or SalI box variant.

Consensus and reference WT IGS

To compare IGS sequences with information available in public databases we extracted an additional version of the IGS reference by BLAST searching of the longest clone against the official TAIR10 A. thaliana genome. The TAIR10 genome fragments have been merged into a unique IGS reference with 314 and 1077 bp long SalI boxes (online resource 2). To summarize the IGS sequences, we have constructed a single IGS consensus, which contains 316 and 1278 bp long SalI boxes (online resource 3).

IGS variants in fas1 and fas2 mutants

As described above, the four most common IGS types in the WT are var1.582, var2.294.1045, var3.458, and var5.366. In fas1 and fas2 mutants rDNA genes are systematically lost (Mozgova et al. 2010) and this is reflected in our data by the decreased variability in different types of IGS sequences obtained (Tables 1, 2). We found 19 IGS variants in the WT, while in both fas mutants there are only 6–10 variants for a given generation (Table 4). Some of these variants are apparently exclusive for mutants (resulting from non-allelic homologous recombination (Kirik et al. 2006)) and they do not occur in the WT: var3.294, var1.1045, var3.1045, var1.294.633, var3.294.1045 and var4.797 (Table 4, online resource 6). On the other hand, there are some WT variants which do not occur in fas1 mutants: var1.1505, var2.294.1005, var2.294, var2.294.1045, var1.294.1254, var2.458, var2.1045 (online resource 6). Distinct occurrence of variants may result from the presumed mechanism of rDNA loss via single-strand annealing recombination events (Muchova et al. 2015).

Fig. 2 Alignments of IGS variants. a Schematic view of the reference var1.294.294.500 (Chandrasekhara et al. 2016). b Alignment of the 3′ETS region found in the WT, var1/2/3/5. Var4 was not detected in our dataset and therefore it is not included in this alignment. Four different 3′ETS variants are shown from the end of the 25S rRNA gene to the first SalI restriction site. Lighter shades represent lower sequence identity. c Alignment of the 3′ETS region. Positions containing different sequences are shown in full, while regions containing identical bases are displayed as a box (green) with a number representing the number of bp in each box. d Alignment of SalI box types. Lighter shades represent lower sequence identity. Numbers represent the length of individual SalI box in bp. SalI boxes 633 and 797 (red) are found only in fas1 mutants
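The SalI box lengths used in Fig. 2d and in the variant names follow from the definitions given in the Methods: a SalI box is a close occurrence of two or more GTCGAC sites, bordered by the first and last site, and reads are kept only if they are longer than 3 kbp and contain the IGS start, the IGS end and at least one TIS. The sketch below illustrates one way such a search could be scripted. It is not the authors' code. The function names are hypothetical, `max_gap` (how far apart two sites may start and still count as "close") is an assumed threshold that the text does not specify, and the box length is assumed to include both bordering sites. The EcoRI and HindIII entries are the standard recognition sequences of the enzymes used for RFLP.

```python
import re

# SalI site as quoted in the Methods; EcoRI and HindIII are the standard
# recognition sequences of the enzymes used for the RFLP analysis.
SITES = {"SalI": "GTCGAC", "EcoRI": "GAATTC", "HindIII": "AAGCTT"}

IGS_START = "CCCTCCCCTAA"   # beginning of the IGS (Methods)
IGS_END = "ATCGATGAATG"     # end of the IGS (Methods)
TIS = "TATATAGGG"           # transcription initiation site motif (Methods)


def find_sites(seq, site):
    """Return 0-based start positions of every occurrence of `site`."""
    return [m.start() for m in re.finditer(re.escape(site), seq)]


def passes_filter(read, min_len=3000):
    """Length and motif criteria from the Methods; read quality is not modelled."""
    return (len(read) > min_len and IGS_START in read
            and IGS_END in read and TIS in read)


def find_sali_boxes(seq, max_gap=150):
    """Return (start, end) of each SalI box, 0-based, end exclusive.

    Neighbouring sites whose starts lie at most `max_gap` bp apart are
    grouped into one box; `max_gap` is an assumption, not a published value.
    """
    site = SITES["SalI"]
    boxes, run = [], []
    for pos in find_sites(seq, site):
        if run and pos - run[-1] > max_gap:
            if len(run) >= 2:            # a box needs at least two sites
                boxes.append((run[0], run[-1] + len(site)))
            run = []
        run.append(pos)
    if len(run) >= 2:
        boxes.append((run[0], run[-1] + len(site)))
    return boxes


def variant_name(ets_type, boxes):
    """Build a name such as var1.294.1045 from the 3'ETS type and box lengths."""
    return ".".join([ets_type] + [str(end - start) for start, end in boxes])


# Synthetic toy sequence with two clusters of SalI sites.
s = SITES["SalI"]
toy = "A" * 10 + s + "T" * 14 + s + "A" * 300 + s + "T" * 34 + s + "A" * 5
print(find_sali_boxes(toy))
print(variant_name("var1", find_sali_boxes(toy)))
```

The same `find_sites` helper can be called with `SITES["EcoRI"]` or `SITES["HindIII"]` to list restriction sites in a sequence, which is the starting point for predicting fragment lengths in silico.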

Restriction fragment length polymorphism

We further focused on analysis of the IGS at the genomic DNA level, using selected restriction enzymes for RFLP analyses. Using in silico prediction of restriction sites, we created restriction maps of individual IGS variants and compared them to the signals visible in RFLP analysis (Fig. 4).

Firstly, genomic DNA was digested with EcoRI and HindIII and hybridized with an IGS1 probe specific to the 3′ETS (Fig. 4a). The most abundant IGS variants could all be mapped to a strong signal. Some of the less abundant

variants mapped to a smeared signal and their existence could not be sufficiently supported by RFLP. There were also three relatively strong signals which could not be assigned to any of our variants, suggesting the existence of other variants including the reference var1.294.294.500 (Chandrasekhara et al. 2016).

Fig. 3 Distribution of promoters and SalI boxes in WT IGS variants. Lines represent individual IGS variants showing schematic length differences between IGS types. SalI boxes are in green, promoters in red. The numbers above each line represent the length (in bp) of the corresponding region. The name of individual IGS variants consists of the 3′ETS variant and the SalI box variant (based on the length in bp) separated by a dot

Secondly, genomic DNA was digested with EcoRI and hybridized with an IGS2 probe specific to the 5′ETS (Fig. 4b). The 5′ETS variants varA/B could be mapped to the signals, the varB signal being considerably weaker than the varA signal, which corresponds to our sequencing data. The RFLP analysis also revealed an inhibition of cleavage at the first EcoRI restriction site, situated in the 18S rRNA gene, due to an overlapping CpG methylation in a significant subset of 18S rRNA genes, which led to incomplete DNA digestion and a subsequent 1506 bp long electrophoretic mobility shift of varA/varB signals.

IGS variants in revertant lines

FAS1FAS1/FAS2FAS2 plants segregated from a cross between fas1 and fas2 plants (the progeny of FAS1fas1/FAS2fas2 double heterozygotes) show very uneven and asymmetric recovery of rDNA genes at the levels of their copy numbers, distribution of rDNA loci, or expression of 3′ETS variants (Pavlištová et al. 2016). Therefore, we selected four revertant lines as representatives of revertants with a low (line 6), medium (line 3) or high (lines 1 and 4) amount of rDNA (Table 1). Unlike the WT, these lines also express 3′ETS var1; lines 1 and 4 in addition to var3, and lines 4 and 6 with relatively balanced expression of var1-3. Lines 1 and 4 have no detectable var2 in their genome (based on PCR studies). Further, line 4 has acquired a complete change of its NOR set-up; the vast majority of its rDNA is present on chromosome 2, thus forming an active NOR2 associated with the nucleolus (Pavlištová et al. 2016).

Our sequencing data show that the variability of the IGS in revertant lines is decreased, showing 10–12 variants for a single line compared to the 19 variants in the WT. This is a similar situation to the fas1 and fas2 mutants, indicating that rearrangements of the IGS are probably relatively rare during recovery. In mutants, only a subset of IGS types remains present, and thus without major reorganisation the spectrum of IGS types cannot expand to the original WT level. This supports the view that restoration of CAF-1 function occurs preferentially via precise (allelic) homologous recombination

that prevents further loss of rDNA. Var1.582 is preferentially recovered in lines 1 and 4, and var3.458 in lines 3 and 6 (Fig. 1; Table 1). Both of these variants belong to the most abundant type in the WT and are still present in fas1 G7 plants.

Regarding the origin of IGS variants, most of the revertant variants were found in the WT as well as in fas1 mutants (online resource 6). However, there are some exceptions; variant var1.294.633 seems to come from a mutant progenitor since it was found only in revertant lines and fas1 mutants, with no occurrence in the WT. On the other hand, variants var2.294.1045 and var1.294.1254 were found only in revertant lines and the WT, and thus were probably formed de novo in revertants. Therefore, a small percentage of new variants might have originated from rDNA rearrangements. In revertant lines with medium or high rDNA content, we further detected three rare IGS variants which were not found in either the WT or fas mutants (var2.705, var1.314.1045 and var3.294.1234).

To conclude, revertant lines share specific variants with both WT as well as fas1 mutants, with only sporadic formation of new variants. It seems that the loss of IGS variability in parental mutants propagates to revertants. Apparently, decreased IGS variability leads to preferential recovery of variants which remain in the genome at the moment of FAS restoration, after the extensive rDNA loss in previous mutant generations.

Table 3 Numbers of PacBio reads and clones in WT classified according to SalI box type and 3′ETS variant
SalI box type | PacBio reads and clones in WT, by 3′ETS variant (columns var1, var2, var3, var5)
294 | 8 (1.9 %); 4 (1 %)
366 | 4 (1 %); 55 (13.1 %)
458 | 5 (1.2 %); 3 (0.7 %); 196 (46.8 %)
582 | 83 (19.8 %); 7 (1.7 %); 7 (1.7 %)
705 | 3 (0.7 %)
294.1005 | 3 (0.7 %); 1 (0.2 %)
294.1045 | 7 (1.7 %); 24 (5.7 %)
294.1254 | 1 (0.2 %)
314.1005 | 6 (1.4 %)
1045 | 1 (0.2 %)
1505 | 1 (0.2 %)
The SalI box types of the IGS tend to be joined with specific 3′ETS variants, var1–5. The 366 SalI box is typical for var5, the 458 is typical for var3, and the 582 is typical for var1

Organisation of rDNA clusters

We next examined whether the newly formed rDNA clusters in revertant lines appeared more similar to the fas1 mutant or to the WT. We used a full HindIII digest in combination with PFGE to analyse the organisation of rDNA clusters in the WT, fas1, and revertant lines. This method was described previously and generates an approximately 10 kb ladder representing mono- and oligomers of rDNA units (Copenhaver et al. 1995). The most recent studies show that rDNA genes with HindIII sites are present at chromosome 4 in the IGS type containing mostly var3 3′ETS and rarely also in var1 or var4 IGS types, while rDNA clusters on chromosome 2 are HindIII site-free (Chandrasekhara et al. 2016). Looking for HindIII restriction in our set of clones and PacBio reads, we can confirm that var2 and most of the var1 sequences do not contain HindIII restriction sites while most of the var3 and var5 do (for details see online resource 7).

We detected by PFGE that in late generations (G4 or G6) of fas1 mutants the higher molecular weight rDNA clusters (presumably lacking the HindIII site) are completely lost (follow the asterisks in Fig. 5). Previous data showed that active rDNA copies on chromosome four were lost first, followed by inactive copies (Copenhaver et al. 1995; Chandrasekhara et al. 2016; Muchova et al. 2015; Pontvianne et al. 2013). Here we observed the decrease in high molecular weight DNA fragments in fas1 (Fig. 5, asterisks), while HindIII site-containing rDNA copies (likely on chromosome four) are still maintained in the fas1 genome even in later generations of mutants (Fig. 5). This result rather suggests that loss of rDNA copies occurs at both chromosomal loci simultaneously. It is interesting that although copies lacking a HindIII site are mostly lost in fas1, in revertant lines (which come from G4 of mutants) rDNA clusters become even more heterogeneous than in the WT in terms of occurrence of the HindIII sites (follow the arrows in Fig. 5). The variation in outcomes of the rDNA accumulation suggests a stochastic character for initiation of the recovery process, which is followed by subsequent amplification. The greatest similarity to the WT was observed in line 4, which contains high molecular weight DNA fragments (thus HindIII site-lacking rDNA copies) originating from the WT. However, in this line rDNA accumulates on chromosome 2, while being repressed on chromosome 4. The variability in organisation of the recovered rDNA possibly also reflects the variable 3′ETS expression pattern observed in reverted lines, which does not follow the setting in the WT where NOR4 contains all active variants and NOR2 is inactive (Pavlištová et al. 2016; Chandrasekhara et al. 2016).

Discussion

Ribosomal RNA gene arrays in various organisms have been the subject of numerous studies concerning their dynamic evolution and regulation of expression. A phenomenon known as concerted evolution has been demonstrated to

Fig. 4 In silico analysis of IGS restriction fragments. a Genomic DNA from fas1 plants in G2 was digested with EcoRI and HindIII and hybridized with the IGS1 probe which covers the 3′ETS (see panel c). The IGS variants were matched to the image according to in silico restriction digestion. The observed RFLP is mostly due to a different number and length of SalI boxes, in combination with the presence or absence of internal EcoRI and HindIII sites. Three relatively strong signals (question marks) could not be assigned to any of the identified IGS variants, suggesting that there may be still-unknown IGS variants or that some combinations of overlapping cytosine methylation prevent EcoRI from complete DNA digestion at a subset of EcoRI sites, resulting in longer fragments. M represents a molecular weight marker (M1: 1 kb DNA ladder; M2: 2-log ladder). b Genomic DNA from fas1 plants in G3 was digested with EcoRI and hybridized with the IGS2 probe, which shows the polymorphism in the 5′ETS. c The positions of IGS1 and IGS2 probes
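The in silico restriction digestion mentioned in the Fig. 4 legend can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the helper names (`ENZYMES`, `cut_positions`, `digest`) and the toy sequence are assumptions made for illustration only. The sketch assumes a linear molecule and complete digestion, so it ignores the methylation-dependent incomplete cutting that the legend proposes for the unassigned signals. Only the recognition sites and cut offsets of EcoRI (G^AATTC) and HindIII (A^AGCTT) are standard facts.

```python
import re

# Recognition site and cut offset within the site (top strand):
# EcoRI cuts G^AATTC, HindIII cuts A^AGCTT.
ENZYMES = {"EcoRI": ("GAATTC", 1), "HindIII": ("AAGCTT", 1)}


def cut_positions(seq, enzymes):
    """Return sorted cut coordinates of the chosen enzymes on a linear sequence."""
    seq = seq.upper()
    cuts = set()
    for name in enzymes:
        site, offset = ENZYMES[name]
        # A lookahead finds every occurrence, including overlapping ones.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + offset)
    return sorted(cuts)


def digest(seq, enzymes):
    """Return fragment lengths (bp) expected from a complete digest."""
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical 50 bp toy sequence with one EcoRI and one HindIII site.
toy = "A" * 10 + "GAATTC" + "C" * 20 + "AAGCTT" + "G" * 8
print(digest(toy, ["EcoRI", "HindIII"]))  # fragment lengths for the double digest
print(digest(toy, ["EcoRI"]))             # fragment lengths for EcoRI alone
```

Applied to real IGS variant sequences, the predicted fragment lengths could be compared with the observed band sizes, under the assumption that every site is cut.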


be responsible for high sequence similarity of rRNA genes within a species. These studies suggested homologous recombination events as the mechanism of rapid sequence homogenization. The major driving force in the evolution of the rRNA genes is considered to be unequal crossover, with sister chromatid exchange occurring more often than exchange between homologs (Eickbush and Eickbush 2007). This can be demonstrated also in A. thaliana, where the rRNA genes within a single NOR are more similar to one another than they are to the rRNA genes on another chromosome (Copenhaver and Pikaard 1996b). Our data based on analysis of rDNA clusters by HindIII support the theory of rDNA loci homogenisation and preferential recombination between homologues. In revertant WT lines, where the amount of rDNA was reduced due to fas1 or fas2 mutations, usually one type of rDNA variant is preferentially recovered after reintroduction of functional FAS alleles.

Table 4 IGS variability in fas mutants
Genotype | #variants | Variants
fas2 G1 | 8 | var1.294, var3.294*, var5.366, var1.458, var3.458, var1.582, var2.582, var1.1045*
fas1 G1 | 8 | var1.294, var1.366, var5.366, var1.458, var3.458, var1.582, var1.705, var1.294.633*
fas1 G2 | 9 | var5.366, var1.458, var3.458, var1.582, var3.582, var1.705, var1.294.633*, var1.294.1045, var3.294.1045*
fas1 G5 | 10 | var1.294, var5.366, var3.458, var1.582, var3.582, var1.294.633*, var1.294.1045, var3.294.1045, var3.294.1005, var3.314.1005
fas1 G7 | 6 | var5.366, var3.458, var1.582, var3.1045*, var3.314.1005, var4.797*
There are 6–10 different variants in a single generation of fas mutants. Variants with the symbol asterisk are exclusive to mutants and were not found in the WT. The most abundant IGS types are shown in bold font. The name of individual IGS variant consists of the 3′ETS variant and the SalI box variant (based on the length in bp) separated by a dot

Fig. 5 PFGE analysis of rDNA. High molecular weight DNA digested with HindIII was separated by PFGE and hybridized with an IGS2 probe. WT, fas1 (generations 2, 4 and 6) and revertant lines 1, 3, 4, 6 [WT lines segregated from a cross between fas1 x fas2 (Pavlištová et al. 2016)] are shown. ND represents a non-digested DNA control; M is a molecular weight standard. Asterisks mark the fraction of high molecular weight rDNA present in WT plants, lost in fas1 mutants and recovered in revertant lines with higher copy number. Arrowheads mark newly-formed rDNA fragments present in revertant lines

Overall, sequence variation in rDNA is largely limited to the IGS that contains genetic elements controlling RNA polymerase transcription, the gene promoter, enhancers, spacer promoters and terminators. Since rRNA gene activity, both at the level of individual genes and of the whole rDNA loci, is at the same time subject to epigenetic regulations (Pontvianne et al. 2013; Preuss and Pikaard 2007), it is important to know the actual sequence structure of the IGS and its natural variation so that contributions of genetic and epigenetic phenomena can be assessed realistically.

Although A. thaliana has been used frequently as a model organism in rDNA studies, it is surprising that no contiguous sequences of rDNA units have been available so far, and this knowledge gap was bridged using artificially assembled rDNA fragments. Interestingly, our data on IGS differ substantially from the reference var1.294.294.500 which was assembled from fragments available in public databases and used in a recent study (Chandrasekhara et al. 2016). The reference contains three SalI boxes in a row separated by two spacer promoters. In contrast, we found only one IGS clone (var2.294.89.273) which follows the same arrangement and even this one differs from the reference in the length of the SalI box.

Importantly, our results associate previously known variants in the 3′ETS with their actual IGS sequence context, and reveal a novel 3′ETS variant, var5. The distribution of 3′ETS variants on individual NORs was described in detail (Chandrasekhara et al. 2016) showing that var2 IGS types are located on the transcriptionally active NOR4, while var1 is on the inactive NOR2. Var3 is distributed on both NORs, but NOR4 contains a var3 with a HindIII site. Mutants in the FAS1 (or FAS2) gene lose around 20 % of their rDNA repeats per generation, thus representing a unique model to study IGS dynamics. Previous studies showed that transcriptionally active copies at NOR4 are depleted first, then inactive var1-containing copies are activated and subsequently also depleted (Muchova et al. 2015; Pontvianne et al. 2013). Consistent with these studies, when we analysed three different generations of fas1 mutants (G1, G5, G7) the actively transcribed var2 (var2.294.942, truncated version of var2.294.1045) was detected only once (G7, conventional cloning, Table 2), while in the WT this IGS variant represents around 10 % of SMRT reads (Table 1; Fig. 1). Further, the abundance of the frequently occurring inactive var1.582 (around 20 % in the WT) is very low in G7 of fas1 mutants (Table 1; Fig. 1). Based on SMRT results, the dramatic decrease of this variant occurs between generations 5 and 7 of fas1 (Table 1; Fig. 1) leaving the remaining copies of var3.458 and var5.366. Aside from the rDNA loss, it is possible that some re-organisation occurs inside the IGS in fas1 plants due to the increased levels of non-allelic homologous recombination (Kirik et al. 2006). Two additional SalI box types, 633 and 797, are found in mutants. Their alignment to the WT SalI box types shows that 633 is related to 1045 and might have resulted from a deletion in 1045, while the SalI box 797 is similar to 314 at the proximal site and 705 at the distal site, so we conclude that 797 might have resulted from their fusion. In the light of the concept of concerted evolution of rDNA, fas mutants and plants with a fas mutation history and recovered CAF-1 function display acceleration of concerted evolution as demonstrated by a loss, gain and spreading of specific IGS variants within a limited number of generations. This corresponds to generally increased levels of homology-dependent recombination events (Endo et al. 2006; Kirik et al. 2006; Takeda et al. 2004) and direct involvement of these events in loss of rDNA in fas1 and fas2 mutants (Muchova et al. 2015).

In conclusion, we describe here the variant arrangements of the IGS in the 45S rDNA in A. thaliana WT plants, fas1 and fas2 mutants, and plants with restored CAF-1 function. This detailed characterisation allowed the description of a new variant in the 3′ETS region of pre-rRNA, termed var5, and of the preferential association of 3′ETS variants with specific IGS arrangements. Dysfunctional FAS1 (or FAS2) leads to selective loss of some variants and the generation of new IGS variants. Overall, both fas mutants show less variability than WT plants. These results correspond to the presumed mechanism of the loss of rDNA copies via the single strand annealing type of homology-dependent repair (Muchova et al. 2015) which, besides causing an absolute reduction of the number of rDNA copies, at the same time simplifies the original WT spectrum of rDNA variants. Plants with restored FAS1 and FAS2 function show an IGS spectrum similar to that of their parental mutants, suggesting that rDNA recovery occurs through a relatively precise DNA synthesis-dependent homologous recombination mechanism.

Acknowledgments We thank Veronika Pavlištová for providing us with the revertant A. thaliana lines. Access to computing and storage facilities owned by parties and projects contributing to the National Grid Infrastructure MetaCentrum, provided under the programme “Projects of Large Research, Development, and Innovations Infrastructures” (CESNET LM2015042) is greatly appreciated. This work was supported by GACR 13-11563P, 16-04166Y and by the Ministry of Education, Youth and Sports of the Czech Republic-projects KONTAKT II no. LH15189 and CEITEC 2020 (LQ1601). Work in the laboratory of C.G. is supported by Grants BFU2012-34821 and BFU2015-68396-R.

Author contributions J.F. and M.D. designed the study, K.H., M.D., I.M. and L.V. performed experiments, all authors performed analysis and interpretation of results, K.H., M.D. and J.F. wrote the manuscript.

References

Abou-Ellail M, Cooke R, Saez-Vasquez J (2011) Variations on a team: major and minor variants of Arabidopsis thaliana rDNA genes. Nucleus 2:294–299. doi:10.4161/nucl.2.4.16561
Cavallero S, De Liberato C, Friedrich KG, Di Cave D, Masella V, D’Amelio S, Berrilli F (2015) Genetic heterogeneity and phylogeny of Trichuris spp. from captive non-human primates based on ribosomal DNA sequence data. Infect Genet Evol 34:450–456. doi:10.1016/j.meegid.2015.06.009
Chandrasekhara C, Mohannath G, Blevins T, Pontvianne F, Pikaard CS (2016) Chromosome-specific NOR inactivation explains selective rRNA gene silencing and dosage control in Arabidopsis. Gene Dev 30:177–190. doi:10.1101/gad.273755.115
Copenhaver GP, Pikaard CS (1996a) RFLP and physical mapping with an rDNA-specific endonuclease reveals that nucleolus organizer regions of Arabidopsis thaliana adjoin the telomeres on chromosomes 2 and 4. Plant J 9:259–272. doi:10.1046/j.1365-313X.1996.09020259.x
Copenhaver GP, Pikaard CS (1996b) Two-dimensional RFLP analyses reveal megabase-sized clusters of rRNA gene variants in Arabidopsis thaliana, suggesting local spreading of variants as the mode for gene homogenization during concerted evolution. Plant J 9:273–282. doi:10.1046/j.1365-313X.1996.09020273.x
Copenhaver GP, Doelling JH, Gens S, Pikaard CS (1995) Use of RFLPs larger than 100 kbp to map the position and internal organization of the nucleolus organizer region on chromosome 2 in Arabidopsis thaliana. Plant J 7:273–286
Dellaporta SL, Wood J, Hicks JB (1983) A plant DNA minipreparation: Version II. Plant Mol Biol Report 1:19–21
Durut N et al (2014) A duplicated NUCLEOLIN gene with antagonistic activity is required for chromatin organization of silent 45S rDNA in Arabidopsis. Plant Cell 26:1330–1344. doi:10.1105/tpc.114.123893
Dvorackova M, Fojtova M, Fajkus J (2015) Chromatin dynamics of plant telomeres and ribosomal genes. Plant J 83:18–37. doi:10.1111/tpj.12822
Earley K et al (2006) Erasure of histone acetylation by Arabidopsis HDA6 mediates large-scale gene silencing in nucleolar dominance. Genes Dev 20:1283–1293. doi:10.1101/gad.1417706
Earley KW et al (2010) Mechanisms of HDA6-mediated rRNA gene silencing: suppression of intergenic Pol II transcription and differential effects on maintenance versus siRNA-directed cytosine methylation. Genes Dev 24:1119–1132. doi:10.1101/gad.1914110
Eickbush TH, Eickbush DG (2007) Finely orchestrated movements: evolution of the ribosomal RNA genes. Genetics 175:477–485. doi:10.1534/genetics.107.071399
Elliott TA, Stage DE, Crease TJ, Eickbush TH (2013) In and out of the rRNA genes: characterization of Pokey elements in the sequenced Daphnia genome. Mobile DNA. doi:10.1186/1759-8753-4-20
Endo M et al (2006) Analysis of Arabidopsis CAF-1 mutants showing enhanced homologous recombination. Plant Cell Physiol 47:S60–S60

Exner V, Taranto P, Schonrock N, Gruissem W, Hennig L (2006) Chromatin assembly factor CAF-1 is required for cellular differentiation during plant development. Development 133:4163–4172. doi:10.1242/dev.02599
Fajkus J, Reich J (1991) Evaluation of restriction endonuclease cleavage of plant nuclear-DNA using contaminating chloroplast DNA. Folia Biol Prague 37:224–226
Fojtova M, Fulneckova J, Fajkus J, Kovarik A (2002) Recovery of tobacco cells from cadmium stress is accompanied by DNA repair and increased telomerase activity. J Exp Bot 53:2151–2158
Garcia S, Kovarik A (2013) Dancing together and separate again: gymnosperms exhibit frequent changes of fundamental 5S and 35S rRNA gene (rDNA) organisation. Heredity 111:23–33. doi:10.1038/hdy.2013.11
Garcia S, Panero JL, Siroky J, Kovarik A (2010) Repeated reunions and splits feature the highly dynamic evolution of 5S and 35S ribosomal RNA genes (rDNA) in the Asteraceae family. BMC Plant Biol 10:176. doi:10.1186/1471-2229-10-176
Geiser C, Mandakova T, Arrigo N, Lysak MA, Parisod C (2016) Repeated whole-genome duplication, karyotype reshuffling, and biased retention of stress-responding genes in buckler mustard. Plant Cell 28:17–27. doi:10.1105/tpc.15.00791
Gruendler P, Unfried I, Pointner R, Schweizer D (1989) Nucleotide-Sequence of the 25S–18S ribosomal gene spacer from Arabidopsis thaliana. Nucleic Acids Res 17:6395–6396. doi:10.1093/nar/17.15.6395
Gruendler P, Unfried I, Pascher K, Schweizer D (1991) rDNA intergenic region from Arabidopsis thaliana. Structural analysis, intraspecific variation and functional implications. J Mol Biol 221:1209–1222
Grummt I, Pikaard CS (2003) Epigenetic silencing of RNA polymerase I transcription. Nat Rev Mol Cell Bio 4:641–649. doi:10.1038/nrm1171
Grummt I, Kuhn A, Bartsch I, Rosenbauer H (1986) A transcription terminator located upstream of the mouse rDNA initiation site affects rRNA synthesis. Cell 47:901–911
Han EH, Cho K, Goo Y, Kim M, Shin YW, Kim YH, Lee SW (2016) Development of molecular markers, based on chloroplast and ribosomal DNA regions, to discriminate three popular medicinal plant species, Cynanchum wilfordii, Cynanchum auriculatum, and Polygonum multiflorum. Mol Biol Reports. doi:10.1007/s11033-016-3959-1
Kaya H, Shibahara KI, Taoka KI, Iwabuchi M, Stillman B, Araki T (2001) FASCIATA genes for chromatin assembly factor-1 in arabidopsis maintain the cellular organization of apical meristems. Cell 104:131–142
Kelly LJ et al (2015) Analysis of the giant genomes of Fritillaria (Liliaceae) indicates that a lack of DNA removal characterizes extreme expansions in genome size. New Phytol 208:596–607. doi:10.1111/nph.13471
Kirik A, Pecinka A, Wendeler E, Reiss B (2006) The chromatin assembly factor subunit FASCIATA1 is involved in homologous recombination in plants. Plant Cell 18:2431–2442. doi:10.1105/tpc.106.045088
Kobayashi T, Horiuchi T, Tongaonkar P, Vu L, Nomura M (2004) SIR2 regulates recombination between different rDNA repeats, but not recombination within individual rRNA genes in yeast. Cell 117:441–453
Konstantinova P, Yli-Mattila T (2004) IGS-RFLP analysis and development of molecular markers for identification of Fusarium poae, Fusarium langsethiae, Fusarium sporotrichioides and Fusarium kyushuense. Int J Food Microbiol 95:321–331. doi:10.1016/j.ijfoodmicro.2003.12.010
Lauro FM, Chastain RA, Blankenship LE, Yayanos AA, Bartlett DH (2007) The unique 16S rRNA genes of piezophiles reflect both phylogeny and adaptation. Appl Environ Microbiol 73:838–845. doi:10.1128/AEM.01726-06
Leyser HMO, Furner IJ (1992) Characterization of 3 shoot apical meristem mutants of Arabidopsis thaliana. Development 116:397–403
Lin RQ, Shu L, Zhao GH, Cheng T, Zou SS, Zhang Y, Weng YB (2014) Characterization of the intergenic spacer rDNAs of two pig nodule worms, Oesophagostomum dentatum and O. quadrispinulatum. Sci World J 2014:147963. doi:10.1155/2014/147963
Long Q et al (2013) Massive genomic variation and strong selection in Arabidopsis thaliana lines from Sweden. Nat Genet 45:884–890. doi:10.1038/ng.2678
Mandakova T, Lysak MA (2008) Chromosomal phylogeny and karyotype evolution in x = 7 crucifer species (Brassicaceae). Plant Cell 20:2559–2570. doi:10.1105/tpc.108.062166
Marcilla A et al (2001) The ITS-2 of the nuclear rDNA as a molecular marker for populations, species, and phylogenetic relationships in Triatominae (Hemiptera: Reduviidae), vectors of Chagas disease. Mol Phylogenet Evol 18:136–142. doi:10.1006/mpev.2000.0864
Mozgova I, Mokros P, Fajkus J (2010) Dysfunction of chromatin assembly factor 1 induces shortening of telomeres and loss of 45S rDNA in Arabidopsis thaliana. Plant Cell 22:2768–2780. doi:10.1105/tpc.110.076182
Muchova V, Amiard S, Mozgova I, Dvorackova M, Gallego ME, White C, Fajkus J (2015) Homology-dependent repair is involved in 45S rDNA loss in plant CAF-1 mutants. Plant J 81:198–209. doi:10.1111/tpj.12718
Pavlištová V, Dvořáčková M, Jež M, Mozgová I, Mokroš P, Fajkus J (2016) Phenotypic reversion in fas mutants of Arabidopsis thaliana by reintroduction of FAS genes: variable recovery of telomeres with major spatial rearrangements and transcriptional reprogramming of 45S rDNA genes. Plant J. doi:10.1111/tpj.13257
Pikaard CS et al (1990) Enhancers for RNA polymerase I in mouse ribosomal DNA. Mol Cell Biol 10:4816–4825
Pontes O et al (2003) Natural variation in nucleolar dominance reveals the relationship between nucleolus organizer chromatin topology and rRNA gene transcription in Arabidopsis. P Natl Acad Sci USA 100:11418–11423. doi:10.1073/pnas.1932522100
Pontvianne F et al (2010) Nucleolin is required for DNA methylation state and the expression of rRNA gene variants in Arabidopsis thaliana. PLoS Genet 6:e1001225. doi:10.1371/journal.pgen.1001225
Pontvianne F et al (2013) Subnuclear partitioning of rRNA genes between the nucleolus and nucleoplasm reflects alternative epiallelic states. Gene Dev 27:1545–1550. doi:10.1101/gad.221648.113
Preuss S, Pikaard CS (2007) rRNA gene silencing and nucleolar dominance: insights into a chromosome-scale epigenetic on/off switch. Bba Gene Struct Expr 1769:383–392. doi:10.1016/j.bbaexp.2007.02.005
Pruitt RE, Meyerowitz EM (1986) Characterization of the genome of Arabidopsis thaliana. J Mol Biol 187:169–183. doi:10.1016/0022-2836(86)90226-3
Ramirez-Parra E, Gutierrez C (2007) E2F regulates FASCIATA1, a chromatin assembly gene whose loss switches on the endocycle and activates gene expression by changing the epigenetic status. Plant Physiol 144:105–120. doi:10.1104/pp.106.094979
Reinholz E (1966) Radiation induced mutants showing changed inflorescence characteristics. Arabid Inf Serv 3:19–20
Schonrock N, Exner V, Probst A, Gruissem W, Hennig L (2006) Functional genomic analysis of CAF-1 mutants in Arabidopsis thaliana. J Biol Chem 281:9560–9568. doi:10.1074/jbc.M513426200
Schubert I, Wobus U (1985) In situ hybridisation confirms jumping nucleolus organizing regions in Allium. Chromosoma 92:143–148. doi:10.1007/BF00328466


Smith S, Stillman B (1989) Purification and characterization of CAF-I, a human cell factor required for chromatin assembly during DNA replication in vitro. Cell 58:15–25
Symonova R et al (2013) Genome differentiation in a species pair of coregonine fishes: an extremely rapid speciation driven by stress-activated retrotransposons mediating extensive ribosomal DNA multiplications. BMC Evol Biol 13:42. doi:10.1186/1471-2148-13-42
Takeda S et al (2004) BRU1, a novel link between responses to DNA damage and epigenetic gene silencing in Arabidopsis. Gene Dev 18:782–793. doi:10.1101/gad.295404
Weider LJ, Elser JJ, Crease TJ, Mateos M, Cotner JB, Markow TA (2005) The functional significance of ribosomal (r)DNA variation: Impacts on the evolutionary ecology of organisms. Annu Rev Ecol Evol S 36:219–242. doi:10.1146/annurev.ecolsys.36.102003.152620

The Plant Journal (2019) 98, 1090–1105 doi: 10.1111/tpj.14304

Roles of RAD51 and RTEL1 in telomere and rDNA stability in Physcomitrella patens

Ivana Goffova1,2,‡, Radka Vagnerova3,‡, Vratislav Peska4, Michal Franek1,2, Katerina Havlova1,2, Marcela Hola3, Dagmar Zachova1,2, Miloslava Fojtova1,2, Andrew Cuming5, Yasuko Kamisugi5, Karel J. Angelis3 and Jiri Fajkus1,2,4,*

1Laboratory of Functional Genomics and Proteomics, National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kotlarska 2, CZ-61137, Brno, Czech Republic,
2Mendel Centre for Plant Genomics and Proteomics, Central European Institute of Technology, Masaryk University, Kamenice 5, CZ-62500, Brno, Czech Republic,
3The Czech Academy of Sciences, Institute of Experimental Botany, Na Karlovce 1, CZ-16000, Prague, Czech Republic,
4The Czech Academy of Sciences, Institute of Biophysics, Kralovopolska 135, 612 65, Brno, Czech Republic, and
5Centre for Plant Sciences, Faculty of Biological Sciences, University of Leeds, Leeds LS2 9JT, UK

Received 23 January 2019; revised 21 February 2019; accepted 27 February 2019; published online 5 March 2019. *For correspondence (e-mail [email protected]). ‡These authors contributed equally.

SUMMARY

Telomeres and ribosomal RNA genes (rDNA) are essential for cell survival and particularly sensitive to factors affecting genome stability. Here, we examine the role of RAD51 and its antagonist, RTEL1, in the moss Physcomitrella patens. In corresponding mutants, we analyse their sensitivity to DNA damage, the maintenance of telomeres and rDNA, and repair of double-stranded breaks (DSBs) induced by genotoxins with various modes of action. While the loss of RTEL1 results in rapid telomere shortening, concurrent loss of both RAD51 genes has no effect on telomere lengths. We further demonstrate here the linked arrangement of 5S and 45S rRNA genes in P. patens. The spacer between 5S and 18S rRNA genes, especially the region downstream from the transcription start site, shows conspicuous clustering of sites with a high propensity to form quadruplex (G4) structures. Copy numbers of 5S and 18S rDNA are reduced moderately in the pprtel1 mutant, and significantly in the double pprad51-1-2 mutant, with no progression during subsequent cultivation. While reductions in 45S rDNA copy numbers observed in pprtel1 and pprad51-1-2 plants apply also to 5S rDNA, changes in transcript levels are different for 45S and 5S rRNA, indicating their independent transcription by RNA polymerase I and III, respectively. The loss of SOL (Sog One-Like), a transcription factor regulating numerous genes involved in DSB repair, increases the rate of DSB repair in dividing as well as differentiated tissue, and through deactivation of G2/M cell-cycle checkpoint allows the cell-cycle progression manifested as a phenotype resistant to bleomycin.

Keywords: Physcomitrella patens, ribosomal RNA genes, telomere, genome stability, RTEL1, RAD51, Sog One-Like.
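The summary refers to sites with a high propensity to form quadruplex (G4) structures. As a rough illustration of how such candidate sites can be located, the sketch below scans both strands with the widely used heuristic of four runs of at least three guanines separated by loops of 1–7 nucleotides. This heuristic, the function name `find_g4_candidates` and the toy telomere-like sequences are assumptions made here for illustration. The sketch is not necessarily the prediction method used in the study, and a motif match only flags a candidate, not a demonstrated G4 structure.

```python
import re

# Heuristic G4 motif: four G-runs (>= 3 nt) separated by loops of 1-7 nt.
# The C-rich pattern detects the same motif on the complementary strand.
G_RICH = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")
C_RICH = re.compile(r"(?:C{3,}[ACGT]{1,7}){3}C{3,}")


def find_g4_candidates(seq):
    """Return sorted (start, end, strand) tuples for non-overlapping motif hits."""
    seq = seq.upper()
    hits = [(m.start(), m.end(), "+") for m in G_RICH.finditer(seq)]
    hits += [(m.start(), m.end(), "-") for m in C_RICH.finditer(seq)]
    return sorted(hits)


# Toy inputs: four telomere-like repeats on either strand, and an AT-only control.
print(find_g4_candidates("TTAGGG" * 4))
print(find_g4_candidates("CCCTAA" * 4))
print(find_g4_candidates("ATATATATATAT"))
```

Counting hits in sliding windows along a spacer sequence would give a simple density profile in which clustering of candidate sites shows up as local peaks.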

INTRODUCTION

Maintenance of genome stability is particularly important in plants that as sessile organisms are exposed to the continuous impact of diverse environmental stresses, including DNA damaging factors. Among key players in protection against genome and chromosome instability are factors involved in double-stranded DNA break (DSB) repair and telomere maintenance. In addition, both these processes are interconnected. Although one of the basic roles of telomeres is their end-protective function, that is, to distinguish natural chromosome ends from unrepaired dsDNA chromosome breaks, paradoxically sometimes the same factors are essential both as functional components of telomeres and of DSB repair. This situation is exemplified by the KU70/80 heterodimer, which acts in one of the main DSB repair pathways, non-homologous end-joining (NHEJ) (Boulton and Jackson, 1996), as well as in protection of blunt-ended plant telomeres (Valuchova et al., 2017), and which is an interaction factor of telomerase (Chai et al., 2002; Ting et al., 2005). Similarly, the RAD51 protein, the essential factor of another major DSB repair pathway – homologous recombination (HR) – acts also in an alternative telomere lengthening (ALT) in the absence

of telomerase (Le et al., 1999; Olivier et al., 2018). Telomerase, a ribonucleoprotein complex, elongates telomeres, thereby compensating for their erosion due to incomplete replication by the lagging strand synthesis in each replication.

In land plants, telomerase appears to be a universal mechanism to maintain telomeres. Our recent studies of plant species with telomeres of previously unknown sequence in which telomerase-independent mechanisms had been presumed (Sykorova et al., 2003a,b, 2006) demonstrated that even these unusual telomere repeats were synthesized by telomerase (Peska et al., 2015; Fajkus et al., 2016). Conversely, in the absence of telomerase function in plant cells, for example, due to its targeted knockout, ALT is activated and can partially or fully compensate for the lost telomerase activity (Ruckova et al., 2008). The ALT mechanism and its variants, originally described in yeasts and later also in human cells (Lundblad and Blackburn, 1993; McEachern and Blackburn, 1996; Bryan et al., 1997; Nakamura et al., 1998; Le et al., 1999) are mostly dependent on HR and its factors. Among these, RAD51, the eukaryotic orthologue of the bacterial RecA recombinase, plays a central role in HR in yeast and animals. Loss of RAD51 function causes lethality in vertebrates, but not in other animals or in the flowering plant Arabidopsis thaliana, in which the AtRAD51 gene is dispensable for vegetative development, but required for meiosis (Li et al., 2004). As is the case for humans, A. thaliana has five RAD51 paralogues in addition to AtRAD51: AtRAD51B, AtRAD51C, AtRAD51D, XRCC2 and XRCC3 (Bleuyard et al., 2006; Pradillo et al., 2012; Da Ines et al., 2013; Serra et al., 2013), which are involved in recombination and DNA repair.

RAD51 exercises its role in facilitating recombination by performing the homology search and strand invasion steps of HR. Therefore, regulation of RAD51 filament formation is essential for promoting error-free DNA repair. In yeast, SRS2 is one of the most important antagonists of RAD51, thereby helping to protect the cell from inappropriate HR. SRS2 removes RAD51 from ssDNA ends, thereby preventing the homology search in HR (for review see Karpenshif and Bernstein, 2012).

The functional homologue of SRS2 in higher eukaryotes – RTEL1 – has been initially described in Caenorhabditis elegans as a helicase suppressing inappropriate recombination events by promoting disassembly of D-loop recombination intermediates, and the loss of its function results in increased genome instability (Barber et al., 2008). In addition to its regulatory role in HR, RTEL1 was shown to act also in telomere maintenance in mammalian telomerase-positive cells (Uringa et al., 2012). This function has been explained by the RTEL1 function in disassembling telomeric loops (T-loops), thereby blocking inappropriate excision of large telomere regions as T-circles, the process known as telomere rapid deletion. To promote T-loop unwinding, RTEL1 is recruited to telomeres in S-phase by the telomeric TRF2 protein (Sarek et al., 2015).

Furthermore, RTEL1 can dissolve quadruplex (G4) DNA structures that otherwise block the extension of telomeres by telomerase (Vannier et al., 2012). The role of RTEL1 in telomere dynamics was confirmed by finding its mutation as causative in Hoyeraal–Hreidarsson syndrome, a severe form of the bone-marrow failure and cancer predisposition disorder, dyskeratosis congenita, which is characterized by short telomeres and genome instability (Le Guen et al., 2013; Faure et al., 2014; Vannier et al., 2014). Importantly, RTEL1 also promotes genomewide replication through its interaction with PCNA, increasing replication fork stability, extension rates and origin usage (Vannier et al., 2013). A recent report revealed that reversed replication forks occurring in telomeres of RTEL1-deficient cells due to compromised telomere replication are aberrantly bound by telomerase. This binding prevents the restart of reversed replication forks at telomeres and leads to critically short telomeres (Margalef et al., 2018). Therefore, paradoxically, telomerase can contribute to telomere shortening by stabilizing stalled replication forks at chromosome ends.

As in mammals, A. thaliana RTEL1 also contributes to telomere homeostasis. The concurrent loss of RTEL1 and the telomerase reverse transcriptase (TERT) leads to faster telomere shortening than in the single-mutant line tert, resulting in developmental arrest after four generations (Recker et al., 2014). This observation indicated the role of RTEL1 in ALT, which otherwise partially compensates for the TERT loss (Ruckova et al., 2008). The requirement for RTEL1 in multiple pathways to preserve genome stability in A. thaliana can be explained by its putative role in the destabilization of DNA loop structures, such as D-loops and T-loops (see above), and it has been demonstrated that RAD51-dependent HR participates in ALT in A. thaliana (Olivier et al., 2018). The authors also showed that this RAD51 role is dependent on RTEL1 helicase, possibly functioning in a dissolution of a displacement loop (D-loop) during telomere replication.

In A. thaliana, the RTEL1 homologue suppresses HR and is involved in the processing of DNA replication intermediates and interstrand and intrastrand DNA crosslinks. Upon replication stress imposed by deficiency of RTEL1 in A. thaliana, a replication checkpoint is activated that is controlled by a master regulator of the DNA damage response in plants, SOG1 (Suppressor Of Gamma1), with a characteristic NAC domain (Hu et al., 2015). The response to DNA damage starts with a transient induction of general stress genes that is coincident with the sustained induction of DNA repair genes and is followed, after a short delay, by the repression of the cell-cycle regulating genes (recently reviewed in Nikitaki et al., 2018). SOG1 mutants suppress radiation-induced arrest and proceed through the cell cycle

© 2019 The Authors The Plant Journal © 2019 John Wiley & Sons Ltd, The Plant Journal, (2019), 98, 1090–1105 1092 Ivana Goffova et al.

unimpeded (Preuss and Britt, 2003; Adachi et al., 2011). SOG1 targets numerous genes required for repair by HR, including RAD51 (Ogita et al., 2018). Interestingly, RTEL1 itself is not a target gene of SOG1 control. SOG1 plays an important role not only in the response to and removal of DNA damage, but also as a replication checkpoint (Bourbousse et al., 2018). In the P. patens genome, a direct equivalent of A. thaliana SOG1 does not exist; nevertheless, a SOG1-like protein (SOL) has been identified (Pp3c22_130V3.1). In addition to the NAC domain, it shares two SQ motifs with A. thaliana SOG1, although their positions differ. In addition, the SOL protein lacks an N-terminal extension. Therefore, it is not clear whether the moss SOL is an orthologue of A. thaliana SOG1 (Yoshiyama, 2015).

In P. patens, RTEL1 has been found among genes upregulated after γ-irradiation. The Pprtel1 knockout resulted in a severe growth deficiency that was independent of the presence of bleomycin (Kamisugi et al., 2016). The authors hypothesized that this growth phenotype might be the result of telomere deficiency, but they did not analyse telomeres.

The control of genome stability in P. patens shows distinct features when compared with A. thaliana, for example, a highly efficient somatic HR. Two highly homologous RAD51 genes, initially identified as RAD51A and RAD51B, have been characterized in P. patens (Ayora et al., 2002; Markmann-Mulisch et al., 2002) and subsequently named RAD51-1 and RAD51-2 by Schaefer et al. (2010) to avoid confusion with other RAD51 paralogues. Interestingly, the P. patens RAD51-1 and -2 gene structure is unique, being intronless in both paralogues (Markmann-Mulisch et al., 2002). Loss of RAD51 function caused a significant vegetative phenotype and marked hypersensitivity to the double-strand break-inducing agent bleomycin in P. patens but not in A. thaliana. This finding indicates that HR is used for somatic DNA damage repair in P. patens (Markmann-Mulisch et al., 2007). PpRAD51-1 and PpRAD51-2 have partially redundant functions in the maintenance of genome integrity and resistance to ionizing radiation. Loss of function of the two RAD51 proteins completely abolished gene targeting and strongly increased illegitimate integration rates (Schaefer et al., 2010).

In contrast with the effects of the loss of function of HR factors, the loss of key factors of NHEJ (MRE11, RAD50, NBS1, KU70 and LIG4) has little or no impact on growth phenotype, overall DSB repair and telomere maintenance in P. patens, while a clear telomere phenotype can be seen in the corresponding A. thaliana mutants (Hola et al., 2013; Fojtova et al., 2015). Therefore, it is not possible to simply generalize the results obtained in only one of these model plants as applying to DNA repair and telomere biology in all plants.

Telomere maintenance is not only a necessary requirement for chromosome stability during DNA replication and cell proliferation, but its disturbance, for example due to dysfunction of telomerase (Riha et al., 2001; Siroky et al., 2003; Ruckova et al., 2008) or Chromatin Assembly Factor-1 (Mozgova et al., 2010; Jaske et al., 2013), can also be used as an indication of compromised chromosome and genome stability. In addition to telomeres, other genomic loci are not only essential for cell survival, but are also very sensitive markers of genome instability, such as the loci coding for 5S and 45S ribosomal RNA genes (rDNA). In A. thaliana, 45S rDNA loci showed changes in the dosage and expression pattern of rRNA gene variants, as well as in their distribution to nucleolar organizer regions 2 and 4, in response to dysfunction of a number of chromatin-associated proteins, for example histone methyltransferases, factors involved in DNA repair or histone chaperones (Mozgova et al., 2010; Pontvianne et al., 2010, 2012, 2013, 2016; Durut et al., 2014; Muchova et al., 2015; Havlova et al., 2016; Pavlistova et al., 2016). Therefore, to elucidate the role of RAD51 and RTEL1 in P. patens genome stability, we primarily examined the sensitivity of the corresponding mutants to DNA damage, the maintenance of their telomeres and rDNA, and repair of DSBs induced by genotoxins with various modes of action. We further demonstrated that P. patens possesses a so-called linked arrangement of 5S and 45S rRNA genes with a region of a high propensity to form quadruplex (G4) structures in their spacer region. Interestingly, changes in 45S rDNA copy numbers observed in strains with a loss of function of either RTEL1 or RAD51 genes also applied to 5S rDNA, while changes in transcript levels are different for 5S and 45S rRNA.

RESULTS

Sensitivity to genotoxic treatment

To characterize repair defects of pprad51, pprtel1 and ppsol we compared their capabilities to remove lesions such as DSBs induced by radiomimetic bleomycin, small alkylation adducts induced by methyl methanesulfonate (MMS) and bulky DNA helix distortions caused by pyrimidine photodimers and 6′-4′-photoproducts induced by UVC irradiation. These lesions represent blocks for DNA replication and are repaired or bypassed by various error-free as well as error-prone pathways (Hola et al., 2015; Manova and Gruszka, 2015; Nikitaki et al., 2018). The sensitivity of protonemata was tested by 1 h of acute treatment with genotoxin in a liquid medium, followed by subculture of explants on Petri plates with drug-free medium to determine the ability of the tissue to recover. Explant growth after 3 weeks was monitored as plant fresh weight rather than plant surface area (Kamisugi et al., 2016), as the ratio of diameter to explant mass differed substantially among the studied lines (Figure 1).

© 2019 The Authors The Plant Journal © 2019 John Wiley & Sons Ltd, The Plant Journal, (2019), 98, 1090–1105 RAD51 & RTEL1 in telomere & rDNA stability in moss 1093

Figure 1. Sensitivity of the P. patens WT and repair mutants sol, rtel1 and rad51-1-2 to bleomycin treatment. Untreated (a) 1-day explants of P. patens and explants treated for 1 h with 20 µg ml⁻¹ (b) or 50 µg ml⁻¹ (c) bleomycin were inoculated as ‘spot inocula’ onto BCDAT agar plates and photographed after 3 weeks of recovery. Note the plant morphology: the mass of ppsol colonies is nearly twice that of WT (Figure 2) regardless of their smaller surface area.

Sensitivity to bleomycin

Both pprtel1 and the rad51 double mutant pprad51-1-2 were extremely sensitive to bleomycin treatment (Markmann-Mulisch et al., 2007; Schaefer et al., 2010). For both lines, treatment with 20 µg ml⁻¹ bleomycin is a lethal dose from which they do not recover. This is a striking observation for pprtel1, as the A. thaliana orthologue atrtel1 is not sensitive to bleomycin and zeocin treatment (Hu et al., 2015). Contrary to pprtel1 and pprad51-1-2, ppsol manifested resistance to bleomycin treatment. This as yet unidentified response mechanism to bleomycin treatment resulted in nearly twice the stimulation of recovery growth when compared with untreated tissue (Figure 2a).

Sensitivity to MMS

In response to the treatment with MMS, sensitivities of pprtel1 and pprad51-1-2 differed substantially. Whereas the sensitivity of pprtel1 was very similar to wild-type (WT), pprad51-1-2 was nearly twice as sensitive as WT and pprtel1 after exposure to 10 mM MMS. After exposure to 30 mM MMS, pprad51-1-2 explants did not recover at all, while WT and pprtel1 still showed 20% growth. As in the case of bleomycin, ppsol generally manifests a partially resistant phenotype, particularly at the higher concentration of 30 mM MMS (Figure 2b). MMS-induced alkylation damage disrupts DNA replication and, if not repaired by the base excision repair (BER) pathway, leads to the requirement for

Figure 2. Growth responses of P. patens WT, sol, rtel1 and rad51-1-2 to acute treatment with bleomycin (a), MMS (b) and to UVC irradiation (c). In acute treatment, the tissue was incubated for 1 h in BCDAT liquid medium supplemented with bleomycin or MMS at the indicated concentrations. Explants were then inoculated on drug-free medium and incubated under standard growth conditions for 3 weeks. For UV treatment the explants were first inoculated on BCDAT plates and then irradiated with the indicated dose of UV. To avoid photolyase-mediated dimer reversions the plates were kept in darkness after irradiation or handled under red light for 24 h prior to being grown under standard conditions for 3 weeks. For each treatment the mean weight of treated plants (±SD, n = 8) was normalized to the weight of untreated plants, which was set to 100%. A representative image of plant growth is shown in Figure 1.
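The normalization described in the caption (treated fresh weight expressed relative to untreated tissue set to 100%) can be sketched as follows. This is only an illustration: the function name, the weights, and the choice to propagate the SD by scaling each treated replicate to the untreated mean are assumptions, not the authors' documented procedure.

```python
from statistics import mean, stdev


def percent_of_untreated(treated_weights, untreated_weights):
    """Return (mean %, SD %) of treated weights relative to the untreated mean.

    Assumption: each treated replicate is divided by the mean untreated
    weight, which defines 100%.
    """
    reference = mean(untreated_weights)
    percents = [100.0 * w / reference for w in treated_weights]
    return mean(percents), stdev(percents)


# Hypothetical fresh weights in mg (n = 8 per group), invented for illustration.
untreated = [10.0, 11.0, 9.0, 10.0, 10.5, 9.5, 10.0, 10.0]
treated = [5.0, 6.0, 4.0, 5.0, 5.5, 4.5, 5.0, 5.0]

mean_pct, sd_pct = percent_of_untreated(treated, untreated)
print(f"recovery growth: {mean_pct:.1f}% of untreated (SD {sd_pct:.1f}%)")
```

A value above 100% would correspond to the growth stimulation reported for ppsol after bleomycin treatment.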


RAD51. Nevertheless, the observed sensitivity of pprad51-1-2 might also reflect a situation in which the number of DNA alkylation lesions induced by MMS is high enough to saturate the entire DNA (or only predisposed sites) and locally promote the formation of clustered damage. Clustered alkylation lesions and various intermediates of their repair could lead to the formation of DSBs, which may be behind pprad51-1-2 sensitivity towards MMS treatment, a phenomenon that was earlier observed in mammalian and yeast cells in which mutants in the HR pathway are sensitive to alkylation damage (Lundin et al., 2005).

With regard to the response to MMS, RTEL1 as a HR antagonist evidently does not play an important role in the repair of MMS-induced lesions in moss because, even when its activity is eliminated, the growth of pprtel1 after treatment is still almost the same as that of the WT.

Interestingly, ppsol, particularly at higher concentrations of MMS, manifests resistance to treatment, as for radiomimetic bleomycin.

Sensitivity to ultraviolet light

Both pprtel1 and pprad51-1-2 have a similar sensitivity to UVC irradiation (Figure 2c). The slight resistance of pprad51-1-2 over pprtel1 at 1 kJ m⁻² suggests that bulky UV lesions are not targets for HR removal. By contrast, the absence of RTEL1 helicase might partially affect the resolution and removal of clipped off fragments encompassing the photo-adducts during nucleotide excision repair (NER).

DSB repair in dividing and differentiated tissue of P. patens

Early growth of P. patens protonemata as unbranched filaments can be examined through intensive shearing to generate short, 4–7-cell-long fragments, which results in an initial culture comprising approximately 30–50% of meristematic apical cells. After 1 week of growth, protonemata extend to more than 20 cells in length, already showing signs of branching, and the fraction of apical cells drops to less than 10% of the culture. We used this model system of ‘early’ and ‘late’ protonemal culture to compare the role of RAD51, RTEL1 and SOL in the repair of DSBs in dividing and partially differentiated cells. Repair of DSBs proceeds in two phases, the first, rapid, generally assigned to the NHEJ pathway, followed by a slower phase assigned to more complex processes like HR, removal of crosslinks and replication obstacles (Goodarzi et al., 2010). Despite defects in known genes participating in NHEJ or HR repair of DSBs, some mutants, for example atlig4, atku80 and atck2, can remove DSBs faster than the parent WT organism (Kozak et al., 2009; Moreno-Romero et al., 2012).

In this study, we describe faster repair of DSBs in pprad51-1-2, pprtel1, and ppsol mutant lines than in the WT. The base DNA damage detected in all genotypes varies only insignificantly between 7.5 and 11%T, both in dividing as well as differentiated tissue. In dividing cells, DSBs are rapidly removed, and after 20 min, fewer than 50% of DSBs remain in WT, whereas 25% remain in pprad51-1-2 and pprtel1, and only 15% remain in ppsol (Figure 3). The DSB repair time-course is identical in pprad51-1-2 and pprtel1 and is intermediate between the most rapidly repairing ppsol and the slowest repairing WT genotypes. Kinetic parameters of DSB removal were assessed by analysis of time-course data with the GraphPad program. Half-times (τ1/2) of the rapid phase of DSB repair vary insignificantly between 2 and 2.3 min for WT and mutant lines. The differences are observed during the slow phase, when all mutant lines repair faster, with τ1/2 around 20 min (pprtel1 19.6; ppsol 20 and pprad51 21 min), than WT (25.5 min).

In differentiated tissue pprad51-1-2, pprtel1 and ppsol also repair DSBs faster than WT, but the situation is not as clear as in dividing cells. Firstly, after 3 h of recovery still more than 30% of DSBs remain unrepaired in the most rapidly repairing lines pprad51-1-2 and ppsol. WT and pprtel1 followed a much slower course of DSB repair, with

Figure 3. DSB repair kinetics determined by comet assay in tissue with a predominance of dividing (open symbols) or differentiated (closed) cells of WT (black), ppsol (green), pprad51-1-2 (red) and pprtel1 (blue). Protonemal tissue fragments that had regenerated after 1 day of subculture (open symbols) or 7 days of subculture (closed symbols) were treated with 30 µg ml⁻¹ of bleomycin for 1 h and repair kinetics were measured as % of DSBs remaining after 0, 3, 5, 10, 20, 60 and 180 min of repair recovery. Maximum damage is normalized as 100% at t = 0 for all lines. Error bars indicate SDs based on at least two independent experiments in all cases.
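The two-phase repair kinetics discussed in this section (a rapid and a slow phase, each with its own half-time, with damage normalized to 100% at t = 0) can be illustrated with a simple sum of two exponentials. The sketch below is an assumption about the model form, not a reproduction of the GraphPad analysis: the function name and the share of DSBs assigned to the rapid phase (`fast_fraction`) are hypothetical, while the half-times are the ones quoted in the text for dividing tissue.

```python
import math


def dsb_remaining(t_min, fast_fraction, t_half_fast, t_half_slow):
    """Percent of DSBs remaining at time t_min for a two-phase exponential decay.

    fast_fraction is the (hypothetical) share of breaks repaired by the rapid
    phase; the remainder decays with the slow half-time.
    """
    k_fast = math.log(2) / t_half_fast
    k_slow = math.log(2) / t_half_slow
    slow_fraction = 1.0 - fast_fraction
    return 100.0 * (fast_fraction * math.exp(-k_fast * t_min)
                    + slow_fraction * math.exp(-k_slow * t_min))


# Sampling times from the Figure 3 caption (minutes of repair recovery).
TIME_POINTS = [0, 3, 5, 10, 20, 60, 180]

# Half-times from the text; fast_fraction = 0.5 is an illustrative assumption.
wt_curve = [dsb_remaining(t, 0.5, 2.0, 25.5) for t in TIME_POINTS]
rtel1_curve = [dsb_remaining(t, 0.5, 2.0, 19.6) for t in TIME_POINTS]

for t, wt, mut in zip(TIME_POINTS, wt_curve, rtel1_curve):
    print(f"t = {t:>3} min  WT {wt:5.1f}%  pprtel1 {mut:5.1f}%")
```

With an equal rapid-phase share, a shorter slow-phase half-time alone is enough to leave fewer breaks at every later time point, which is the qualitative pattern reported for the mutant lines in dividing tissue.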

the first rapid phase of DSB repair affected, which was much slower than that of pprad51-1-2 and ppsol. Surprisingly, analysis of repair kinetics with GraphPad did not return any parameters, which indicates ambiguous repair kinetics in all lines. We interpret this observation to mean that in differentiated tissue, DSB repair is conveyed rather by a rapid NHEJ mechanism when HR is abolished in rad51-1-2 or suppressed in sol plants.

Loss of PpRTEL1, but not RAD51, results in dramatic telomere shortening

Measurement of telomere lengths using terminal restriction fragments (TRF) analysis revealed that telomere lengths drop from their WT length ranging between 800 and 1400 bp (median length of 1100 bp) to a narrow range around a median length of 700 bp (Figure 4a). This shortening had occurred since the first analysed passage and did not show further progression in subsequent protonemal passaging, therefore probably representing a functionally critical minimum telomere length. This result was supported by an independent Primer Extension Telomere Repeat Amplification (PETRA) assay, which was more appropriate for the analysis of short telomeres than TRF and, moreover, reflected telomere length at a specific chromosome arm. The analysis of PETRA products from the 2L arm showed a product (which also includes a subtelomeric region) of ca. 3200 bp in WT and 2300 bp in pprtel1 isolates, therefore displaying a net shortening of the telomere of ca. 900 bp (Figure 4b). In contrast with pprtel1, pprad51-1-2 did not reveal any significant changes in telomere lengths, in either TRF or PETRA analyses.

Telomerase expression and activity do not change in either pprad51-1-2 or pprtel1 lines

Transcript levels of the TERT gene encoding the telomerase catalytic subunit were analysed (Figure 5a), as was the level of telomerase activity measured by the in vitro telomere repeat amplification protocol (TRAP) assay in either the quantitative (Figure 5b) or conventional version (Figure 5c) in WT, pprad51-1-2 and pprtel1 plants. The results of both assays demonstrated that neither telomerase expression nor telomerase activity changes significantly in either of the mutants. These results are consistent with the recently described role of both genes in ALT in A. thaliana (Olivier et al., 2018) and indicate that the critical telomere shortening observed in pprtel1 moss lines results from

Figure 4. Telomere lengths in P. patens RAD51 and RTEL1 mutant lines. (a) TRF analysis in 7-day-old (WT, pprad51-1-2) and 10-day-old (pprtel1) protonema cultures in three biological replicates. Shortened telomeres were detected in the pprtel1 mutant (marked with arrow). Triangles indicate the signal of interstitial telomeric sequences (ITS). Hybridization signals evaluated via the MultiGauge Analysis software (FujiFilm) are presented by a box-and-whisker plot (b) with the top part (white) and bottom part (black) representing the upper and lower quartiles, separated by the median. The ends of whiskers represent the minimum and maximum telomere lengths reflected in hybridization signal. (c) PETRA assay results of telomere lengths on 2L chromosome arm of the wild-type, pprad51-1-2 and pprtel1 in three biological replicates showing telomere shortening in pprtel1. Molecular size markers (M) are shown on the left of each panel.


Figure 5. PpTERT transcript levels and telomerase activities in pprad51-1-2 and pprtel1. (a) Relative levels of PpTERT exon 10 transcripts in studied plant lines examined by RT-qPCR in three biological replicates. (b) Telomerase activity measured by Quantitative TRAP assay. PpTERT transcript levels and telomerase activities in mutants were measured in three biological replicates and related to the values of wild-type. Error bars indicate SD. (c) Conventional TRAP assay of the wild-type, pprad51-1-2 and pprtel1 in four biological replicates. NC represents a negative control.

telomere rapid deletion events, rather than from an impaired telomerase function in these mutants.

45S rDNA and 5S rDNA show a linked arrangement in P. patens

In order to properly design analyses of rDNA genomic stability and expression, we first needed to clarify the structure of rDNA loci in P. patens. While the separate arrangement of 45S rDNA and 5S rDNA is more typical among land plants than the linked one, in bryophytes the linked arrangement has been demonstrated in liverwort, Marchantia polymorpha, and a moss Funaria hygrometrica (Sone et al., 1999). A later systematic study showed that a physical linkage of all rRNA genes is present in early land plants such as liverworts, mosses, hornworts, lycophytes and most of the fern lineages. The separate arrangement can be detected in seed plants and heterosporous (water) ferns. A change from L- to the S-type organization was observed in monilophytes (water ferns) and chlorophyte algae (Chlamydomonas) (Wicke et al., 2011). We demonstrate here experimentally – using PCR (Figure 6) and fluorescence in situ hybridisation on nuclei (FISH) and extended DNA fibres (EDF-FISH) – that 45S rDNA and 5S rDNA exist in the linked arrangement in P. patens (Figures 7 and S1). This linked arrangement is consistent with the previous failure to detect a 5S cluster by BLAST search against the P. patens WGS (Wicke et al., 2011). Using the latest version of P. patens genome and associated sequencing data (Lang et al., 2018; Perroud et al., 2018), we performed clustering of repeats in P. patens WGS data using the RepeatExplorer tool. These bioinformatic results clearly supported the linked arrangement of the 45S and 5S rDNA loci (Data S1, Annotation of supercluster 5). We also used the results to estimate the copy number of the rDNA loci. Interestingly, when calculating the coverage of the rRNA genes by the WGS data, the coverage of 5S

Figure 6. PCR of intergenic spacers between 45S and 5S rDNA. PCR product of intergenic spacer between 18S and 5S rDNA (~5500 bp) of the wild-type, pprad51-1-2 and pprtel1 (the first three lanes from the left), and PCR product of intergenic spacer between 5S and 28S rDNA (~950 bp) (the next three lanes) in all studied plant lines, respectively. Expected product lengths are based on gene sequence prediction LOC112273807 (NCBI). Molecular size markers are shown on the left.


Figure 7. Fluorescence in situ hybridisation of 5S (in green) and 5.8S (in red) loci in nuclei and extended DNA fibers (EDF-FISH). (a) Isolated nuclei of P. patens display nucleolar localization of both 5S and 5.8S loci showing substantial colocalization in distinct foci. Non-colocalizing areas might reflect sub-domain organization within the nucleolus, as different RNA polymerases transcribe the two repeats. Notice the faint 4′,6-diamidino-2-phenylindole (DAPI) staining underlying the hybridization signals. See Figure S1 for more examples. (b) Extended DNA fibers from unfixed nuclei show the physical association and intermingled arrangement of 5S and 5.8S rDNA on individual DNA fibers corresponding to their linked arrangement. Individual signals are highlighted with green (5S) or red (5.8S) arrowheads.

rRNA genes was twice that of 28S, 5.8S and 18S rRNA (Data S1). The copy numbers assessed from the frequency of NGS reads (913 for 28S, 986 for 5.8S and 913 for 28S rRNA) correspond well to the previous estimates of 45S rDNA copies from NGS data (903 ± 45 copies per 1C) and qPCR (653 copies per 1C) (Rosato et al., 2016).

The double amount of 5S rRNA could point either to the existence of orphan copies of 5S rRNA genes, or to an artifact of the NGS procedure. Nevertheless, in agreement with the previous results (Wicke et al., 2011), our clustering analysis did not detect any separate 5S rDNA clusters.
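The idea of assessing copy number from the frequency of NGS reads can be sketched as a ratio of read depths: depth over the gene divided by the average depth over the haploid genome. The text does not state how the calculation was performed, so the function below and every number passed to it are hypothetical and serve only to show the arithmetic.

```python
def copies_per_1c(reads_on_gene, read_length, gene_length,
                  total_reads, genome_size):
    """Estimate copies per haploid genome (1C) from read counts.

    Assumption: reads are sampled uniformly, so depth over a repeated gene
    scales with its copy number relative to the genome-wide average depth.
    """
    gene_depth = reads_on_gene * read_length / gene_length
    genome_depth = total_reads * read_length / genome_size
    return gene_depth / genome_depth


# Invented example values; none of these come from the study.
estimate = copies_per_1c(
    reads_on_gene=1000,      # reads assigned to one rRNA gene
    read_length=100,         # bp
    gene_length=2000,        # bp
    total_reads=1_000_000,   # reads in the WGS sample
    genome_size=500_000_000  # bp per 1C
)
print(f"estimated copies per 1C: {estimate:.0f}")
```

Under this scheme, a gene whose depth is twice that of its neighbours in the same unit, as reported here for 5S relative to 28S, 5.8S and 18S, would yield twice the copy estimate.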


45S and 5S rDNA copy numbers are reduced in both pprad51-1-2 and pprtel1 lines

To monitor genome stability outside telomeres at the rDNA loci, which are particularly sensitive to genome instability, we measured changes in copy numbers of 18S rRNA (representing the 45S rDNA transcription unit) and 5S rRNA genes using qPCR on genomic DNA of 7-day-old protonemata of pprad51-1-2 and pprtel1 mutants. The results revealed copy number reduction of 18S rDNA to 30% in pprad51-1-2, and to 75% in the pprtel1 mutant (Figure 8a, left panel). Similar losses were observed in 5S rDNA (Figure 8b, left panel), indicating that the loss of both parts of the linked rDNA locus occurs in whole units. These losses do not show any progression in subsequent passages and seem to reflect a steady-state situation. Interestingly, while the greater loss of genomic rDNA copies in pprad51-1-2 is fully compensated at the 18S rRNA transcript levels, this does not occur in the less affected pprtel1 line (Figure 8a, right panel), where relative transcript levels roughly correspond to the decreased number of 18S rDNA copies. Notably, relative levels of 5S rDNA transcripts are reduced in both mutants without any obvious relation to 5S rDNA copy numbers (Figure 8b). This corresponds to a relative independence of 5S and 45S rDNA transcription.

The region with a high potential to form G4 structures upstream of 18S rRNA gene surrounds a putative transcription start site

Formation of G4 structures has previously been reported as a causative factor of telomere instability during replication in mammalian cells (Sfeir et al., 2009) and RTEL1 has been demonstrated as a factor counteracting these structures (Vannier et al., 2012). In addition, formation of G4 in transcribed regions results in formation of R-loops, three-stranded nucleic acid structures that comprise nascent RNA hybridized with the DNA template, leaving the non-template DNA single-stranded (Skourti-Stathaki and Proudfoot, 2014), which then induces DNA damage (De Magis et al., 2019). Therefore, to identify putative factors contributing to the observed instability of rDNA in pprad51-1-2 and pprtel1 mutants, we analysed the sequence unit of a linked rDNA for a local potential to form G4 structures using G4Hunter (Bedrat et al., 2016) and pqsfinder (Hon et al., 2017). Both prediction tools showed similar results, pointing to numerous sites with a potential to form quadruplex structures along the rDNA unit, both in genic and spacer regions (see Figure 9 and Table S1). A noticeable clustering of putative G4 sites can be seen in the spacer region between 5S and 18S rRNA genes. Among all of these, a sequence with a particularly strong potential to form a G4 structure has been identified ca. 500 bp upstream of the 18S rRNA gene (position 10 937–10 971), evident from a pqsfinder score of 132. This is more than double the score obtained for plant or human telomere DNA sequences (scores of 60 and 64, respectively) using the same tool, and significantly exceeds the scores of the other G-rich regions inside the rDNA unit. The high G4 potential of this region is supported by our observation that PCR amplification was problematic across the linker between the 5S and 18S rRNA genes, requiring addition of dimethyl sulfoxide (DMSO) to the reaction mixture (see Experimental procedures). Furthermore, our repeat clustering analysis (Data S1) indicates the high G4 potential of this region by a dramatically (by two orders of magnitude) lower number of NGS reads when compared with the neighbouring regions.

To investigate a potential biological role of the G4 region, we mapped putative transcription start sites in the linked rDNA unit. Using bwa-mem v0.7.17-r1188 (Li and Durbin, 2009), we mapped publicly available RNA-Seq data (accession numbers SRX1176830, SRX3381969, SRX3381970 and SRX2484017) to the intergenic spacer (IGS) located between 5S and 18S rRNA gene. We denoted a spot where the coverage significantly increases compared with the rest of the IGS as the putative

Figure 8. Loss of 45S rDNA (a) and 5S rDNA (b) genomic copies and their transcript levels in pprad51-1-2 and pprtel1 mutants. rDNA copy numbers and transcripts were measured by qPCR and normalised to values obtained in the wild-type. Asterisks indicate statistically significant difference based on Student’s t-test: *P < 0.05; **P < 0.01; ***P < 0.001. Error bars indicate SD.
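To make the notion of "G4 potential" used in this section more concrete, the following sketch implements a G4Hunter-style score as that method is commonly described: each base in a run of G receives a positive score equal to the run length (capped at 4), each base in a run of C receives the corresponding negative score, other bases score 0, and the score of a sequence or window is the mean. The function names, the 25-nt window and the 1.5 threshold are assumptions chosen for illustration. This is not the pqsfinder algorithm, so its output is not comparable with the pqsfinder scores (132, 60, 64) quoted in the text.

```python
def g4hunter_scores(seq):
    """Per-base scores: G runs positive, C runs negative, magnitude capped at 4."""
    seq = seq.upper()
    scores = []
    i = 0
    while i < len(seq):
        j = i
        while j < len(seq) and seq[j] == seq[i]:
            j += 1                      # extend over the run of identical bases
        run = j - i
        if seq[i] == "G":
            value = min(run, 4)
        elif seq[i] == "C":
            value = -min(run, 4)
        else:
            value = 0
        scores.extend([value] * run)
        i = j
    return scores


def g4hunter_mean(seq):
    """Mean per-base score of the whole sequence (0.0 for an empty sequence)."""
    scores = g4hunter_scores(seq)
    return sum(scores) / len(scores) if scores else 0.0


def windows_above(seq, window=25, threshold=1.5):
    """Return (start, mean score) for sliding windows with |mean| >= threshold."""
    scores = g4hunter_scores(seq)
    hits = []
    for start in range(len(scores) - window + 1):
        window_mean = sum(scores[start:start + window]) / window
        if abs(window_mean) >= threshold:
            hits.append((start, window_mean))
    return hits


# The 10-nt motif reported at the putative TSS in this study.
print(g4hunter_mean("TATGTGGGGG"))
```

Positive means indicate G-rich stretches on the scored strand and negative means indicate C-rich stretches, that is, G4 potential on the complementary strand.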


Figure 9. The distribution of G4 structures over the rRNA gene unit. The G4 structures were predicted from the sequence using pqsfinder and G4Hunter. The putative transcription start site (TSS) was determined by mapping RNA-Seq data to the reference sequence. The arrow indicates the direction of transcription. For more details, see Table S1.

transcription start site (TSS). We were able to roughly determine the TSS to be located on chr 20, position 127 235 bp, starting with the sequence ‘TATGTGGGGG’. The TSS is therefore located in the region of high propensity to form G4 (position 8748–8756 according to Figure 9 and Table S2), upstream from the particularly strong G4 site (position 10 937–10 971 in Figure 9 and Table S2), which is therefore part of the primary transcript.

DISCUSSION

DSB repair dramatically differs in dividing and differentiated tissue of P. patens

SOG1 is a plant functional homologue of p53 and despite the fact that there is no sequence conservation between p53 and SOG1, they share a subset of conserved target genes, suggesting that they have been co-opted to mediate both shared and unique aspects of the DNA damage response in plants versus mammals. SOG1 governs most of the genes induced by γ-irradiation by their phosphorylation with ATR and ATM protein kinases. Comparison between SOG1 and p53 target genes showed that both transcription factors control genes responsible for cell-cycle regulation, such as cyclin-dependent kinase (CDK) inhibitors, stem cell death and DNA repair, whereas SOG1 preferentially targets genes involved in HR (Bourbousse et al., 2018; Ogita et al., 2018).

Resistance to zeocin, a member of the bleomycin family of antibiotic compounds, has also been reported in A. thaliana atsog1-1 (Adachi et al., 2011).

Bleomycin induces clustered damage of oxidation lesions at the site where the drug attaches to DNA. From the results after MMS treatment, we can draw the assumption that MMS also induces clustered damage of alkylation lesions, which can to some extent parallel radiomimetic damage incurred after bleomycin treatment, although based on DNA alkylation.

Repair of DSBs is driven by a cellular imperative to repair DNA damage quickly, and enable replication of the DNA template. We suggested earlier that this rapid repair could be performed by activation of the rapid salvage pathway, which is present as a backup for emergency scenarios (Kozak et al., 2009; Moreno-Romero et al., 2012). In dividing cells, elimination of RAD51 and RTEL1 speeds up the removal of DSBs, suggesting that both are involved in regular rather than salvage rapid


DSB repair pathway. Nevertheless, the most striking observation is the rapid removal of DSBs by mutating SOL. An elaborate network of ATM, ATR and SOG1 governing plant response to γ-irradiation is evidently a slow process, having priority over the rapid pathway that is present as a backup in normal cells, but is active in emergency scenarios such as the absence of key components of regular repair pathways.

The rate of DSB repair in differentiated tissue is much slower than in dividing cells, suggesting that repair associated with the preparation of a functional template for DNA replication is not so urgent to allow for cell survival. When replication is not a primary need for the cell to survive, more elaborate mechanisms of repair start to be involved that lead to ambiguous kinetics. The removal of DSBs in all three studied lines pprad51-1-2, pprtel1 and ppsol is quicker than in WT, suggesting an important role of alternative pathways suppressed in WT under normal conditions. DSB repair in pprad51-1-2 and also in ppsol suggests that P. patens, which is vitally dependent on the HR pathway for maintaining genome stability, was moved to a much faster pathway when HR was disrupted by direct interference with homology search. We can also see a partial, although not decisive, effect when the RTEL1 helicase, presumably involved in the resolution of HR D-loops, is mutated. Most likely, the fast track alternative to HR is any of several suggested NHEJ mechanisms.

Ribosomal RNA genes are intrinsically unstable and highly sensitive to dysfunction of RTEL1 and RAD51 in P. patens

The results demonstrated the rDNA loci as sites of frequent repair events, presumably because of conflicts between ongoing transcription and replication, and sites of preferential formation of RNA–DNA hybrids, R-loops, which induce recombination events. Moreover, similar to what was found for telomeres, rDNA units also harbour regions with a high potential to form quadruplex DNA.

While in A. thaliana, RAD51 is dispensable for vegetative development (Li et al., 2004), in P. patens, the loss of RAD51 function causes a significant vegetative phenotype and marked hypersensitivity to the double-strand break-inducing agent bleomycin (Markmann-Mulisch et al., 2007). This indicates that HR is the preferred pathway for somatic DNA damage repair in P. patens but not in A. thaliana. The marked reduction of rDNA copies in pprad51-1-2 indicates that, in the absence of RAD51 function, deletion-prone RAD51-independent repair pathways prevail, as, for example, single-stranded annealing or NHEJ. This finding is supported by relatively rapid DSB repair in differentiated cells in comparison with WT and pprtel1 lines (Figure 3).

However, this loss does not seem crucial for rRNA function as the level of transcripts remains the same as in WT and does not progress with subsequent passages. Surprisingly, the loss of RTEL1, which acts as an antagonist of RAD51, also results in a moderate decrease in rDNA copies and, moreover, in a corresponding reduction in rRNA transcripts. Reduction in 45S rDNA copy numbers (to 40% of the copies in WT plants) has been observed previously also in A. thaliana rtel1 plants (Rohrig et al., 2016). This situation can be interpreted as a result of upregulated inappropriate HR events and persistent HR intermediates, as well as G4 structures, which accumulate in the absence of RTEL1 helicase and may constitute obstacles to both replication and transcription. Therefore, the cause of rDNA loss could be in principle the same in the case of telomeres and rDNA: unresolved G4 structures and recombination intermediates or structures that strongly resemble them (R-loops, D-loops and T-loops). Interestingly, the losses of rDNA in both rtel1 and pprad51-1-2 plants, as well as the loss of telomeres in rtel1 plants, are not progressive and, apparently, a new equilibrium in the maintenance of these loci is reached soon after their initial loss. Presumably, decreased copy numbers of these tandem repeats also decrease their overall propensity to form arrays of non-canonical structures, which reduces the need for their removal either with a specific helicase, or via deletion-prone repair events in the absence of RAD51.

Importantly, we have demonstrated here experimentally (using PCR, FISH and EDF-FISH), as well as by in silico analysis of NGS data, the linked arrangement of 5S and 45S rDNA loci. Therefore, it makes sense that the changes in 5S and 45S rDNA copy numbers observed in pprad51-1-2 and pprtel1 lines are similar, resulting from the loss of the entire rDNA units. We further identified numerous regions of high G4 potential (comparable with the G4 potential of telomeric DNA repeats) across the linked rDNA units. Conspicuous clustering of G4 sites was observed in the spacer region between 5S and 18S rRNA genes, including one particularly strong G4 site in the transcribed region upstream of the 18S rRNA gene. This region may therefore be a major source of problems for replication fork passage, which are further aggravated in the absence of RTEL1 helicase. Nevertheless, RAD51-dependent HR is mostly successful in solving these events of replication fork stalling or collapse, and the mitotic loss of rDNA is therefore relatively moderate. However, in the absence of RAD51, when single-stranded annealing becomes a primary pathway to repair these events (Muchova et al., 2015), this results in a considerably higher frequency of deletions of rDNA units, even in the presence of RTEL1, as observed in our results.

Dysfunction of RTEL1 but not RAD51 results in telomere shortening in P. patens

The results of telomere analyses in pprad51-1-2 are consistent with the previous results obtained in corresponding

© 2019 The Authors The Plant Journal © 2019 John Wiley & Sons Ltd, The Plant Journal, (2019), 98, 1090–1105

mutants of A. thaliana and with the notion that, in the presence of telomerase, RAD51 is not involved in telomere maintenance (Olivier et al., 2018). In fact, this also corresponds to the general view of telomeres as recombination-silent chromosome loci.

The observed dramatic loss of telomeres in pprtel1 lines is consistent with telomere rapid deletion, presumably via T-loop excision, as described in mammalian cells, including, for example, patients with Hoyeraal–Hreidarsson syndrome (Le Guen et al., 2013; Faure et al., 2014; Vannier et al., 2014). It is possible that after the T-loop excision, the short P. patens telomere chromatin fibres are no longer able to form T-loops (Murti and Prescott, 1999; Fajkus and Trifonov, 2001) and, therefore, they are maintained at a steady-state, critically short level by telomerase. In addition, accumulation of telomeric G4 structures in the absence of RTEL1 helicase activity may contribute to telomere loss by blocking the replication fork passing through telomere DNA. However, this observation is in striking contrast with a previous result obtained in A. thaliana RTEL1 mutants, in which no telomere shortening has been observed (Recker et al., 2014). Nevertheless, the authors observed that the concurrent loss of RTEL1 and the telomerase TERT subunit in A. thaliana leads to rapid, severe telomere shortening, which occurs much more rapidly than in the single-mutant line tert, resulting in developmental arrest after four generations. We speculate that the higher frequency of HR events, including the aberrant HR events at telomeres in the absence of RTEL1, results in a clear effect on P. patens telomere homeostasis, while in A. thaliana it can only be revealed upon activation of HR-mediated ALT, as occurs in tert mutants (Ruckova et al., 2008). Concurrent loss of RTEL1 and TERT in A. thaliana would therefore represent simultaneous inhibition of both telomere maintenance pathways, telomerase and ALT. In accordance with this notion, a recent study demonstrated a role of RTEL1 and RAD51 in ALT in A. thaliana (Olivier et al., 2018). The null telomeric phenotype of A. thaliana RTEL1 mutants indicates a kind of protection against aberrant telomeric HR events in telomerase-positive cells, which does not occur in moss and mammals (this work; Margalef et al., 2018). Presumably, a diverse composition of shelterin and shelterin-like telomere protein complexes, which prevent the recognition of telomeres as chromosome breaks (Sfeir and de Lange, 2012), or epigenetic modifications of telomere chromatin, which regulate, among other processes, HR at telomeres (Gonzalo et al., 2006; Jung et al., 2013), could be responsible for this difference. Despite the genuine nature of this protective mechanism in A. thaliana, we conclude that telomerase is not able to compensate fully for the telomere rapid deletion caused by RTEL1 dysfunction in P. patens.

EXPERIMENTAL PROCEDURES

Plant material and cultivation

Wild-type (WT) Physcomitrella patens (Hedw.) B.S.G., accession 'Gransden2004' (Knight et al., 2002), was used. The pprad51-1-2 double mutant (Schaefer et al., 2010) was generated by D. G. Schaefer, Neuchatel University, Switzerland, and F. Nogue, INRA, Paris, France, and kindly provided by F. Nogue. The pprtel1 mutant in P. patens accession Villersexel K3 (Kamisugi et al., 2016) and ppsol-KO2 were prepared by Y. Kamisugi and A.C. Cuming of CPS, University of Leeds, UK, as described in Methods S1. These strains were cultured as 'spot inocula' on BCDAT agar medium supplemented with 1 mM CaCl2 and 5 mM ammonium tartrate (BCDAT medium), or as lawns of protonema filaments by subculture of homogenized tissue on BCDAT agar medium overlaid with cellophane, in growth chambers with an 18/6 h day/night cycle at 22/18°C (Knight et al., 2002).

One-day-old protonema tissue for repair experiments was prepared from 1-week-old tissue scraped from plates, suspended in 8 ml of BCDAT medium, sheared by a T25 homogenizer (IKA, Staufen, Germany) at 10 000 rpm for two 1-min cycles and left to recover for 24 h in a cultivation chamber with gentle shaking at 100 rpm. This treatment yielded a suspension of protonemata filaments of 3–5 cells, which readily settle for recovery.

Treatment and sensitivity assay

Sensitivity to DNA damage was measured after treatment with freshly prepared solutions of either bleomycin sulphate supplied as Bleomedac inj. (Medac, Hamburg, Germany) or MMS (cat. M4016, Sigma-Aldrich, St. Louis, MO, USA) in BCDAT medium. Protonemal tissue (7 days after subculture) for acute treatment was harvested from cellophane overlays, briefly blotted with sterile filter paper to remove residual water, then dispersed in 5 ml liquid BCDAT medium containing the indicated concentrations of bleomycin or MMS and treated for 1 h. For recovery from the treatment, a Petri dish with drug-free BCDAT agar and without cellophane overlay was inoculated with eight explants of one line per quadrant and left to grow for 3 weeks (Figure 1).

UVC treatment was carried out in a Hoefer UVC 500 crosslinker at 254 nm by irradiating P. patens lines, spotted as eight explants per quadrant on a Petri dish with BCDAT agar without cellophane overlay, with the indicated dose of UVC. Irradiation and the following steps were performed for 24 h in the dark or under red illumination to block photolyases, which are activated by blue light (435 nm).

Sensitivity to genotoxin treatment was assessed after 3 weeks by weighing explant colonies. In every experiment, the treatment was carried out in duplicate, and experiments were repeated two or three times. The fresh weight of the treated explant colonies was normalized to the fresh weight of untreated explants of the same line.

Single-cell gel electrophoresis (Comet) assay

In repair kinetics experiments, 1- and 7-day-old protonemata were used. After bleomycin treatment, tissue was thoroughly rinsed in water, blotted on filter paper and either flash-frozen in liquid N2 (repair t = 0) or left to recover on plates in liquid BCDAT medium for the indicated repair times, before being harvested and frozen in liquid N2. DNA double-stranded breaks were detected by a comet assay using the fully neutral N/N protocol as previously described (Hola et al., 2013). In brief, approximately 100 mg of frozen tissue was cut with a razor blade in 450 μl PBS + 10 mM

ethylene diamine tetraacetic acid (EDTA) on ice, and released nuclei were filtered through green 30 μm CellTrics filters (Partec, Görlitz, Germany) into Eppendorf tubes on ice. About 80 μl of nuclear suspension was dispersed in 320 μl of melted 0.7% low melting temperature agarose (Gibco BRL, cat. no. 15510-027) at 40°C, and four 80 μl aliquots were immediately pipetted onto each of two coated microscope slides (in duplicate per slide), covered with a 22 × 22 mm cover slip and then chilled on ice for 1 min to solidify the agarose. After removal of cover slips, slides were immersed in lysis solution (2.5 M NaCl, 10 mM Tris-HCl, 0.1 M EDTA and 1% N-lauroyl sarcosinate, pH 7.6) for 1 h at room temperature to dissolve cellular membranes and remove attached proteins. The whole procedure from chopping tissue to placement into the lysis solution takes approximately 3 min.

After lysis, slides were directly equilibrated in Tris–acetic acid (TA) electrophoresis buffer (0.1 M Tris, 0.3 M sodium acetate, pH 9) to remove salts and detergents. Comet slides were then subjected to electrophoresis at 1 V cm−1 for 3 min. After electrophoresis, slides were placed for 5 min in 70% EtOH, 5 min in 96% EtOH, and air-dried. Comets stained with SYBR Gold (Molecular Probes/Invitrogen, cat. no. S11494) were viewed in epifluorescence with a Nikon Eclipse 800 microscope and evaluated by the LUCIA Comet cytogenetic software (LIM Inc., Prague, Czech Republic).

The fraction of DNA in comet tails (% tail-DNA, % T DNA) was used as a measure of DNA damage. Data in this study were obtained in at least three independent experiments, with the exception of the pprtel1 1-day-old culture, which was repeated only twice because of its slower growth. In each experiment, the % T DNA was measured at seven time points (0, 3, 5, 10, 20, 60 and 180 min after treatment) and in control tissue without treatment. Measurements included four independent gel replicas of 25 evaluated comets, totaling at least 300 comets analyzed per experimental point. The percentage of remaining damage after a given repair time (tx) is defined as:

\[ \%\ \text{damage remaining}(t_x) = \frac{\text{mean \% tail DNA}(t_x) - \text{mean \% tail DNA}(\text{control})}{\text{mean \% tail DNA}(t_0) - \text{mean \% tail DNA}(\text{control})} \times 100 \]

and plotted (Figure 3). Time-course repair data were analyzed for one- and two-phase decay kinetics by the Prism v.5 program (GraphPad Software Inc., San Diego, CA, USA).

DNA isolation

DNA was isolated from 7-day-old (WT and pprad51) or 10-day-old (pprtel1) protonemal culture according to Dellaporta et al. (1983). The quality of DNA was checked, and its concentration determined, by electrophoresis in a 1% (w/v) agarose gel stained with ethidium bromide using Gene Ruler 1 kb DNA Ladder (Thermo Scientific, Waltham, MA, USA) as standard and Multi Gauge software (Fujifilm, Tokyo, Japan).

Analysis of telomere lengths using TRF and PETRA assays

Determination of telomere lengths by TRF analysis is based on the digestion of genomic DNA by a frequently cutting restriction endonuclease (without a recognition site in the telomeric sequence) as described in Fojtova et al. (2015). Samples of 1.5 μg genomic DNA were digested by MseI (New England Biolabs, Ipswich, MA, USA) overnight and separated in 1% (w/v) agarose gels, followed by Southern blot hybridization with a [32P]-radioactively labelled telomere probe prepared according to Ijdo et al. (1991). Hybridization signals were visualised using a phosphoimager (FLA 7000; FujiFilm) and evaluated by the MultiGauge Analysis software (FujiFilm). The mean telomere length was calculated as Σ(ODi × Li)/Σ(ODi), where ODi is the signal intensity above background within the interval i and Li is the molecular size (kb) at the mid-point of the interval i. As an alternative approach to measuring telomeres, PETRA was performed for 20 cycles as described in Watson and Shippen (2007) with the subtelomeric primer Pp2L.

Analysis of telomerase activity using telomere repeat amplification protocol

Protein extracts from 7-day-old protonema tissue of P. patens were prepared according to Fojtova et al. (2015) and diluted to 250 ng μl−1. These extracts were subjected to the TRAP assay, based on the elongation of substrate primer TS21 by telomerase and subsequent PCR amplification of the elongated primer, as described by Fajkus et al. (1998). Quantitative TRAP assay was performed using Elizyme Green Mix AddROX (Elisabeth Pharmacon, Brno, Czech Republic) with TS21 substrate primer, TelPr reverse primer (for sequences see Table S2) and 1 μl of protein extract. Samples of three biological replicates (independently grown passages) were analysed in technical triplicates in a 20 μl reaction mix using a Rotorgene6000 cycler and software (Qiagen, Hilden, Germany). Telomerase activity in pprad51 and pprtel1 was calculated relative to WT using the ΔCt method (Pfaffl, 2001).

RNA isolation and reverse transcription

Total RNA was extracted from protonema culture using the RNeasy plant mini kit (Qiagen), and quality was checked on a 1% agarose gel. cDNA synthesis was performed using 3 μg of RNA with random nonamers (Sigma-Aldrich) and an M-MuLV reverse transcriptase kit (New England Biolabs).

Analysis of mutual arrangement of 5S and 45S rRNA genes

In silico analysis of the mutual arrangement of 5S and 45S rDNA was performed using clustering of P. patens DNA repeats from the available genome sequencing data (https://www.ncbi.nlm.nih.gov/sra/?term=SRR4408325) by RepeatExplorer (RE) (Novak et al., 2013). For details on pre-processing, see Data S1. Experimental analysis of the linked arrangement was performed using PCR with primers 18S(Hpa)R, Pp_5S_F, Pp_5S_R and Pp_28S_F (see Table S2). Intergenic spacer 5S-18S rDNA was amplified by 0.3 U Phusion High-Fidelity DNA Polymerase (Thermo Scientific) in a reaction mix with 1× GC buffer, 3% DMSO, 500 μM dNTPs, 0.5 μM primers and 20 ng of gDNA. The reaction conditions were: initial denaturation at 98°C/3 min; 30 cycles of 98°C/30 sec, 64.3°C/30 sec, 72°C/3.5 min; and a final polymerization step at 72°C/10 min. For amplification of intergenic spacer 28S-5S rDNA, the reaction mix consisted of 0.25 μM primers, 0.2 U Phusion High-Fidelity DNA Polymerase, 1× HF buffer, 250 μM dNTPs and 20 ng of gDNA, and the reaction was performed as follows: 30 sec of initial denaturation at 98°C; 30 cycles of 10 sec at 98°C, 30 sec at 63°C, 1 min at 72°C; and final extension at 72°C for 10 min. Sizes of the products were measured using agarose gel electrophoresis.

Fluorescence in situ hybridisation on nuclei and extended DNA fibres (FISH, EDF-FISH)

Nuclei were isolated from 7-day-old P. patens growing on Petri dishes by chopping the moss with a razor blade in ice cold
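The three quantities defined in the procedures above (percentage of damage remaining in the comet assay, signal-weighted mean telomere length from TRF blots, and ΔCt-based activity relative to WT) can be sketched in Python as follows. This is only an illustrative sketch: the function and argument names are hypothetical, the input numbers are invented, and the amplification efficiency of 2 is an assumption rather than a value reported here. The published analyses were carried out with LUCIA Comet, MultiGauge, Prism and Rotorgene software, not with this code.

```python
from typing import Sequence


def percent_damage_remaining(tail_tx: float, tail_t0: float, tail_control: float) -> float:
    """Share of induced damage (in %) still present after repair time tx.

    All three arguments are mean % tail DNA values: at repair time tx,
    immediately after treatment (t0), and in untreated control tissue.
    """
    induced = tail_t0 - tail_control
    if induced <= 0:
        raise ValueError("mean % tail DNA at t0 must exceed the control value")
    return (tail_tx - tail_control) / induced * 100.0


def mean_telomere_length(od: Sequence[float], size_kb: Sequence[float]) -> float:
    """Signal-weighted mean TRF length: sum(OD_i * L_i) / sum(OD_i)."""
    if len(od) != len(size_kb):
        raise ValueError("od and size_kb must have the same length")
    total_signal = sum(od)
    if total_signal <= 0:
        raise ValueError("total background-corrected signal must be positive")
    return sum(o * l for o, l in zip(od, size_kb)) / total_signal


def relative_quantity(ct_wt: float, ct_mutant: float, efficiency: float = 2.0) -> float:
    """Mutant signal relative to WT computed from a Ct difference.

    efficiency=2.0 (perfect doubling per cycle) is an assumption,
    not a value taken from the text.
    """
    return efficiency ** (ct_wt - ct_mutant)


if __name__ == "__main__":
    # Invented numbers, for illustration only.
    print(percent_damage_remaining(tail_tx=30.0, tail_t0=50.0, tail_control=10.0))
    print(mean_telomere_length(od=[1.0, 3.0], size_kb=[2.0, 1.0]))
    print(relative_quantity(ct_wt=20.0, ct_mutant=22.0))
```

Subtracting the untreated control in both numerator and denominator means that only bleomycin-induced damage is tracked, so a value of 100 corresponds to no repair and 0 to a return to the control level.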

nucleus isolation buffer (NIB: 0.5 M sucrose; 10 mM EDTA; 2.5 mM DTT; 100 mM KCl; 1 mM spermine; 4 mM spermidine in 10 mM Tris-Cl, pH 9.5). After chopping, the solution was filtered through 50 μm and 30 μm pore-size disposable filters (CellTrics, Sysmex, Germany). Afterwards, the filtrate was supplemented with 1/10 volume of 10% Triton-X in NIB and centrifuged at 2000 g for 10 min at 4°C. The supernatant was removed after centrifugation and the resulting pellet was resuspended in room temperature (RT) hypotonic solution (75 mM KCl). After a 5-min incubation at RT, 2 μl of the suspension was spotted on one end of a clean Superfrost microscopic slide. The drop on the microscopic slide was left at RT until the edge of the drop started drying out (6–8 min). Afterwards, 7 μl of sodium dodecylsulfate–Tris–EDTA (STE) buffer (0.5% SDS; 5 mM EDTA; 100 mM Tris-Cl, pH 7) was pipetted onto the droplet, gently stirred (without the pipette tip touching the glass surface) and incubated for 5 min. Next, the glass slide was tilted at a 15–30% angle to allow the droplet to slowly run down the slide and dry out completely. Fixation of the sample followed in a prechilled 3:1 methanol/acetic acid mixture for 10 min, after which the slides were dried at RT. The slides were then baked at 60°C for 30 min. For further details on FISH and EDF-FISH (probe preparation and labelling, hybridization and image acquisition), see Methods S2.

Analysis of rDNA copy numbers and transcript levels

qPCR for rDNA copy number analysis was performed in technical triplicates for three biological replicates of samples to analyse 18S rDNA (primer combination 18S(Xba)F and 18S(Hpa)R) and 5S rDNA (primer combination Pp_5S_F and Pp_5S_R), normalized to ubiquitin as the reference gene (primer combination ubqFw and ubqRev), using FastStart SYBR Green Master Mix (Roche, Basel, Switzerland). Reactions with a primer concentration of 0.25 μM and 1 μl of gDNA (20 ng μl−1) were performed under the following conditions: initial denaturation 95°C/15 min; 35 cycles of 95°C/30 sec, 56°C/30 sec, 72°C/30 sec; and final incubation at 75°C/5 min, followed by melting temperature analysis. The same conditions were used for transcript level analysis of 18S rRNA, 5S rRNA and the TERT gene (with primers derived from exon 10). Sequences of primers are listed in Table S2. Copy numbers or levels of transcripts in mutant lines were calculated relative to WT by the ΔΔCt method (Pfaffl, 2001).

All data were statistically analyzed using a two-sample F-test and a two-tailed unpaired Student's t-test.

Prediction of a propensity to form G4 structure

From the locus LOC112273741 (NCBI Gene Database, organism Physcomitrella patens) we selected a region that covered a single rRNA gene unit (18S, 5.8S, 28S, 5S and a part of 18S of the adjacent rRNA gene unit). The region was located on chr20 from 135 981 to 124 466 bp. To detect parts of this region that are likely to fold into a G4, we used two independent tools: pqsfinder (Hon et al., 2017), with default settings, and a custom implementation of the G4Hunter algorithm (Bedrat et al., 2016), with the threshold score set to 1. The computation was conducted in the R studio environment, using the Bioconductor package Gviz to plot the results (Hahne and Ivanek, 2016).

ACKNOWLEDGEMENTS

Access to computing and storage facilities owned by parties and projects contributing to the National Grid Infrastructure MetaCentrum, provided under the programme 'Projects of Large Research, Development, and Innovations Infrastructures' (CESNET LM2015042), was greatly appreciated. Core Facility Plant Sciences of CEITEC MU is acknowledged for the cultivation of experimental plants used in this paper. We acknowledge the core facility CELLIM of CEITEC, supported by the Czech-BioImaging large RI project (LM2015062 funded by MEYS CR), for their support with obtaining scientific data presented in this paper.

CONFLICTS OF INTEREST

The authors declare no conflicts of interest.

FUNDING INFORMATION

This work was supported by the Czech Science Foundation (project 16-01137S), by the project SYMBIT, reg. no. CZ.02.1.01/0.0/0.0/15_003/0000477 financed by the ERDF, and by the Ministry of Education, Youth and Sports of the Czech Republic under the project CEITEC 2020 (LQ1601) and LTC17047. ACC and YK acknowledge support from the UK Biotechnology and Biological Sciences Research Council (BBSRC) through research grant BB/1006710/1.

SUPPORTING INFORMATION

Additional Supporting Information may be found in the online version of this article.

Figure S1. Examples of FISH patterns of probes for 5S rDNA (in green) and 5.8 rDNA (in red) in nuclei of P. patens.
Table S1. Results from pqsfinder in Physcomitrella patens region chr20 135 981–124 466.
Table S2. List of primers.
Data S1. RepeatExplorer analysis of P. patens WGS data.
Methods S1. Targeted mutagenesis of PpSOL.
Methods S2. FISH and EDF-FISH.

REFERENCES

Adachi, S., Minamisawa, K., Okushima, Y. et al. (2011) Programmed induction of endoreduplication by DNA double-strand breaks in Arabidopsis. Proc. Natl Acad. Sci. USA, 108, 10004–10009.
Ayora, S., Piruat, J.I., Luna, R., Reiss, B., Russo, V.E.A., Aguilera, A. and Alonso, J.C. (2002) Characterization of two highly similar Rad51 homologs of Physcomitrella patens. J. Mol. Biol. 316, 35–49.
Barber, L.J., Youds, J.L., Ward, J.D. et al. (2008) RTEL1 maintains genomic stability by suppressing homologous recombination. Cell, 135, 261–271.
Bedrat, A., Lacroix, L. and Mergny, J.L. (2016) Re-evaluation of G-quadruplex propensity with G4Hunter. Nucleic Acids Res. 44(4), 1746–1759.
Bleuyard, J.Y., Gallego, M.E. and White, C.I. (2006) Recent advances in understanding of the DNA double-strand break repair machinery of plants. DNA Repair, 5, 1–12.
Boulton, S.J. and Jackson, S.P. (1996) Saccharomyces cerevisiae Ku70 potentiates illegitimate DNA double-strand break repair and serves as a barrier to error-prone DNA repair pathways. EMBO J. 15, 5093–5103.
Bourbousse, C., Vegesna, N. and Law, J.A. (2018) SOG1 activator and MYB3R repressors regulate a complex DNA damage network in Arabidopsis. Proc. Natl Acad. Sci. USA, 115, E12453–E12462.
Bryan, T.M., Englezou, A., DallaPozza, L., Dunham, M.A. and Reddel, R.R. (1997) Evidence for an alternative mechanism for maintaining telomere length in human tumors and tumor-derived cell lines. Nat. Med. 3, 1271–1274.
Chai, W.H., Ford, L.P., Lenertz, L., Wright, W.E. and Shay, J.W. (2002) Human Ku70/80 associates physically with telomerase through interaction with hTERT. J. Biol. Chem. 277, 47242–47247.


Da Ines, O., Degroote, F., Amiard, S., Goubely, C., Gallego, M.E. and White, C.I. (2013) Effects of XRCC2 and RAD51B mutations on somatic and meiotic recombination in Arabidopsis thaliana. Plant J. 74, 959–970.
De Magis, A., Manzo, S.G., Russo, M., Marinello, J., Morigi, R., Sordet, O. and Capranico, G. (2019) DNA damage and genome instability by G-quadruplex ligands are mediated by R loops in human cancer cells. Proc. Natl Acad. Sci. USA, 116, 816–825.
Dellaporta, S.L., Wood, J. and Hicks, J.B. (1983) A Plant DNA minipreparation. Version II. Plant Mol. Biol. Rep. 1, 19–21.
Durut, N., Abou-Ellail, M., Pontvianne, F. et al. (2014) A duplicated NUCLEOLIN gene with antagonistic activity is required for chromatin organization of silent 45S rDNA in Arabidopsis. Plant Cell, 26, 1330–1344.
Fajkus, J. and Trifonov, E.N. (2001) Columnar packing of telomeric nucleosomes. Biochem. Biophys. Res. Comm. 280, 961–963.
Fajkus, J., Fulneckova, J., Hulanova, M., Berkova, K., Riha, K. and Matyasek, R. (1998) Plant cells express telomerase activity upon transfer to callus culture, without extensively changing telomere lengths. Mol. Gen. Genet. 260, 470–474.
Fajkus, P., Peska, V., Sitova, Z., Fulneckova, J., Dvorackova, M., Gogela, R., Sykorova, E., Hapala, J. and Fajkus, J. (2016) Allium telomeres unmasked: the unusual telomeric sequence (CTCGGTTATGGG)(n) is synthesized by telomerase. Plant J. 85, 337–347.
Faure, G., Revy, P., Schertzer, M., Londono-Vallejo, A. and Callebaut, I. (2014) The C-terminal extension of human RTEL1, mutated in Hoyeraal-Hreidarsson syndrome, contains Harmonin-N-like domains. Proteins-Struct. Funct. Bioinformat. 82, 897–903.
Fojtova, M., Sykorova, E., Najdekrova, L., Polanska, P., Zachova, D., Vagnerova, R., Angelis, K.J. and Fajkus, J. (2015) Telomere dynamics in the lower plant Physcomitrella patens. Plant Mol. Biol. 87, 591–601.
Gonzalo, S., Jaco, I., Fraga, M.F., Chen, T.P., Li, E., Esteller, M. and Blasco, M.A. (2006) DNA methyltransferases control telomere length and telomere recombination in mammalian cells. Nat. Cell Biol. 8, U416–U466.
Goodarzi, A.A., Jeggo, P. and Lobrich, M. (2010) The influence of heterochromatin on DNA double strand break repair: getting the strong, silent type to relax. DNA Repair, 9, 1273–1282.
Hahne, F. and Ivanek, R. (2016) Visualizing genomic data using Gviz and bioconductor. In Statistical Genomics. Methods in Molecular Biology (Mathe, E. and Davis, S., eds). New York, NY: Humana Press, 1418, pp. 335–351.
Havlova, K., Dvorackova, M., Peiro, R., Abia, D., Mozgova, I., Vansacova, L., Gutierrez, C. and Fajkus, J. (2016) Variation of 45S rDNA intergenic spacers in Arabidopsis thaliana. Plant Mol. Biol. 92, 457–471.
Hola, M., Kozak, J., Vagnerova, R. and Angelis, K.J. (2013) Genotoxin induced mutagenesis in the model plant Physcomitrella patens. Biomed. Res. Int. 2013, 535049.
Hola, M., Vagnerova, R. and Angelis, K.J. (2015) Mutagenesis during plant responses to UVB radiation. Plant Physiol. Biochem. 93, 29–33.
Hon, J., Martínek, T., Zendulka, J. and Lexa, M. (2017) pqsfinder: an exhaustive and imperfection-tolerant search tool for potential quadruplex-forming sequences in R. Bioinformatics, 33(21), 3373–3379.
Hu, Z.B., Cools, T., Kalhorzadeh, P., Heyman, J. and De Veylder, L. (2015) Deficiency of the Arabidopsis helicase RTEL1 triggers a SOG1-dependent replication checkpoint in response to DNA cross-links. Plant Cell, 27, 149–161.
Ijdo, J.W., Wells, R.A., Baldini, A. and Reeders, S.T. (1991) Improved telomere detection using a telomere repeat probe (Ttaggg)N generated by PCR. Nucleic Acids Res. 19, 4780–4780.
Jaske, K., Mokros, P., Mozgova, I., Fojtova, M. and Fajkus, J. (2013) A telomerase-independent component of telomere loss in chromatin assembly factor 1 mutants of Arabidopsis thaliana. Chromosoma, 122, 285–293.
Jung, A.R., Yoo, J.E., Shim, Y.H., Choi, Y.N., Jeung, H.C., Chung, H.C., Rha, S.Y. and Oh, B.K. (2013) Increased alternative lengthening of telomere phenotypes of Telomerase-negative immortal cells upon Trichostatin-A treatment. Anticancer Res. 33, 821–829.
Kamisugi, Y., Whitaker, J.W. and Cuming, A.C. (2016) The transcriptional response to DNA-double-strand breaks in Physcomitrella patens. PLoS ONE, 11, e0161204.
Karpenshif, Y. and Bernstein, K.A. (2012) From yeast to mammals: recent advances in genetic control of homologous recombination. DNA Repair, 11, 781–788.
Knight, C.D., Cove, D.J., Cumming, A.C. and Quatrano, R.S. (2002) Molecular Plant Biology. Oxford: Oxford University Press.
Kozak, J., West, C.E., White, C., da Costa-Nunes, J.A. and Angelis, K.J. (2009) Rapid repair of DNA double strand breaks in Arabidopsis thaliana is dependent on proteins involved in chromosome structure maintenance. DNA Repair, 8, 413–419.
Lang, D., Ullrich, K.K., Murat, F. et al. (2018) The Physcomitrella patens chromosome-scale assembly reveals moss genome structure and evolution. Plant J. 93, 515–533.
Le Guen, T., Jullien, L., Touzot, F. et al. (2013) Human RTEL1 deficiency causes Hoyeraal–Hreidarsson syndrome with short telomeres and genome instability. Hum. Mol. Genet. 22, 3239–3249.
Le, S., Moore, J.K., Haber, J.E. and Greider, C.W. (1999) RAD50 and RAD51 define two pathways that collaborate to maintain telomeres in the absence of telomerase. Genetics, 152, 143–152.
Li, H. and Durbin, R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754–1760.
Li, W.X., Chen, C.B., Markmann-Mulisch, U., Timofejeva, L., Schmelzer, E., Ma, H. and Reiss, B. (2004) The Arabidopsis AtRAD51 gene is dispensable for vegetative development but required for meiosis. Proc. Natl Acad. Sci. USA, 101, 10596–10601.
Lundblad, V. and Blackburn, E.H. (1993) An alternative pathway for yeast telomere maintenance rescues Est1- senescence. Cell, 73, 347–360.
Lundin, C., North, M., Erixon, K., Walters, K., Jenssen, D., Goldman, A.S.H. and Helleday, T. (2005) Methyl methanesulfonate (MMS) produces heat-labile DNA damage but no detectable in vivo DNA double-strand breaks. Nucleic Acids Res. 33, 3799–3811.
Manova, V. and Gruszka, D. (2015) DNA damage and repair in plants – from models to crops. Front. Plant Sci. 6, 885.
Margalef, P., Kotsantis, P., Borel, V., Bellelli, R., Panier, S. and Boulton, S.J. (2018) Stabilization of reversed replication forks by telomerase drives telomere catastrophe. Cell, 172, 439–453.e14.
Markmann-Mulisch, U., Hadi, M.Z., Koepchen, K., Alonso, J.C., Russo, V.E.A., Schell, J. and Reiss, B. (2002) The organization of Physcomitrella patens RAD51 genes is unique among eukaryotic organisms. Proc. Natl Acad. Sci. USA, 99, 2959–2964.
Markmann-Mulisch, U., Wendeler, E., Zobell, O., Schween, G., Steinbiss, H.H. and Reiss, B. (2007) Differential requirements for RAD51 in Physcomitrella patens and Arabidopsis thaliana development and DNA damage repair. Plant Cell, 19, 3080–3089.
McEachern, M.J. and Blackburn, E.H. (1996) Cap-prevented recombination between terminal telomeric repeat arrays (telomere CPR) maintains telomeres in Kluyveromyces lactis lacking telomerase. Genes Dev. 10, 1822–1834.
Moreno-Romero, J., Armengot, L., Marques-Bueno, M.M., Britt, A. and Martinez, M.C. (2012) CK2-defective Arabidopsis plants exhibit enhanced double-strand break repair rates and reduced survival after exposure to ionizing radiation. Plant J. 71, 627–638.
Mozgova, I., Mokros, P. and Fajkus, J. (2010) Dysfunction of chromatin assembly factor 1 induces shortening of telomeres and loss of 45S rDNA in Arabidopsis thaliana. Plant Cell, 22, 2768–2780.
Muchova, V., Amiard, S., Mozgova, I., Dvorackova, M., Gallego, M.E., White, C. and Fajkus, J. (2015) Homology-dependent repair is involved in 45S rDNA loss in plant CAF-1 mutants. Plant J. 81, 198–209.
Murti, K.G. and Prescott, D.M. (1999) Telomeres of polytene chromosomes in a ciliated protozoan terminate in duplex DNA loops. Proc. Natl Acad. Sci. USA, 96, 14436–14439.
Nakamura, T.M., Cooper, J.P. and Cech, T.R. (1998) Two modes of survival of fission yeast without telomerase. Science, 282, 493–496.
Nikitaki, Z., Hola, M., Dona, M., Pavlopoulou, A., Michalopoulos, I., Angelis, K.J., Georgakilas, A.G., Macovei, A. and Balestrazzi, A. (2018) Integrating plant and animal biology for the search of novel DNA damage biomarkers. Mutat. Res. 775, 21–38.
Novak, P., Neumann, P., Pech, J., Steinhaisl, J. and Macas, J. (2013) RepeatExplorer: a Galaxy-based web server for genome-wide characterization of eukaryotic repetitive elements from next-generation sequence reads. Bioinformatics, 29, 792–793.
Ogita, N., Okushima, Y., Tokizawa, M. et al. (2018) Identifying the target genes of SUPPRESSOR OF GAMMA RESPONSE 1, a master transcription factor controlling DNA damage response in Arabidopsis. Plant J. 94, 439–453.



Kateřina Havlová

Email: katkahemalova@gmail.com

Education

Since 09/2014 - present | Doctoral degree programme Genomics and Proteomics | PřF MU
Research topic: Analysis of genome instability using genomic and bioinformatic approaches
Supervisor: prof. RNDr. Jiří Fajkus, CSc.
Research Group of Chromatin Molecular Complexes, Mendel Centre for Plant Genomics and Proteomics, CEITEC MU
https://www.ceitec.eu/chromatin-molecular-complexes-jiri-fajkus/rg51
Since 09/2018 the study has been disrupted or conducted in a remote mode due to family reasons

2015 | Bachelor’s degree programme Bioinformatics | FI MU
Research topic: Modelling of cell signalling pathways by using Boolean networks

2014 | Master’s degree programme Genomics and Proteomics | PřF MU
Research topic: Characterisation of sequence variants of 45S rDNA spacers in Arabidopsis thaliana

2012 | Bachelor’s degree programme Molecular Biology and Genetics | PřF MU
Research topic: Nucleosome positioning and methods of its determination

Publications

Roles of RAD51 and RTEL1 in telomere and rDNA stability in Physcomitrella patens.
Goffová I., Vágnerová R., Peška V., Franěk M., Havlová K., Holá M., Zachová D., Fojtová M., Cuming A., Kamisugi Y., Angelis K. J. and Fajkus J. (2019). Plant J. doi: 10.1111/tpj.14304

Variation of 45S rDNA intergenic spacers in Arabidopsis thaliana.
Havlová K., Dvořáčková M., Peiro R., Abia D., Mozgová I., Vansáčová L., Gutierrez C. and Fajkus J. (2016). Plant Mol Biol. doi: 10.1007/s11103-016-0524-1

Presentations

10 Nov - 11 Nov 2015 | Meetings of biochemists and molecular biologists | Brno | Oral presentation

8 Nov - 11 Nov 2014 | EMBO Conference Series: From Functional Genomics to Systems Biology | Heidelberg | Poster presentation

Internship and stays

1/07/2017-31/12/2017 | Internship at the Bioinformatics Core Facility CEITEC MU

1/11/2016-10/12/2016 | Stay at The Plant Genome and Development Laboratory, Université de Perpignan Via Domitia | France

2013-2015 | Undergraduate student member of the Systems Biology Laboratory FI MU