Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Break-seq reveals hydroxyurea-induced fragility as a result of unscheduled

conflict between DNA replication and transcription

Elizabeth Hoffman1,2, Andrew McCulley1, Brian Haarer1, Remigiusz Arnak1, and Wenyi Feng1,*

1Department of Biochemistry and Molecular Biology

SUNY Upstate Medical University

750 East Adams Street

Syracuse, NY 13210

2Current address: Department of Biochemistry and Molecular Genetics, University of Virginia

Health System, P.O. Box 800733, Charlottesville, VA 22908

*Corresponding author: [email protected]

Running title

Mechanism of hydroxyurea-induced chromosome breaks

Key words

Replication-transcription conflict, chromosome fragile site, hydroxyurea, DNA replication stress,

Break-seq, DNA double strand breaks, single-stranded DNA, replication fork stability, origin of replication Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

ABSTRACT

We previously demonstrated that in Saccharomyces cerevisiae replication checkpoint inactivation via a mec1 mutation leads to chromosome breakage at replication forks initiated from virtually all origins after transient exposure to hydroxyurea (HU), an inhibitor of ribonucleotide reductase. Here we sought to determine whether all replication forks containing single-stranded DNA gaps have equal probability of producing double strand breaks (DSBs) when cells attempt to recover from HU exposure. We devised a new methodology, Break-seq, that combines our previously described DSB labeling with next generation sequencing to map chromosome breaks with improved sensitivity and resolution. We show that DSBs preferentially occur at transcriptionally induced by HU. Notably, different subsets of the HU-induced genes produced DSBs in MEC1 and mec1 cells as replication forks traversed a greater distance in

MEC1 cells than in mec1 cells during recovery from HU. Specifically, while MEC1 cells exhibited chromosome breakage at stress-response transcription factors, mec1 cells predominantly suffered chromosome breakage at transporter genes, many of which are the substrates of the said transcription factors. We propose that HU-induced chromosome fragility arises at higher frequency near HU-induced genes as a result of destabilized replication forks encountering transcription factor binding and/or the act of transcription. We further propose that replication inhibitors can induce unscheduled encounters between replication and transcription and give rise to distinct patterns of chromosome fragile sites.

2 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

INTRODUCTION

Chromosome fragile sites (CFSs) were defined cytologically as site-specific gaps, constrictions, or breakage on mammalian metaphase (Sutherland 1979). Recent years have seen intense scrutiny of the underlying mechanisms of chromosome fragility as increasing evidence suggests that CFSs are hot spots for genome rearrangements frequently observed in cancer cells (Arlt et al. 2006; Durkin and Glover 2007; Casper et al. 2012; Debatisse et al. 2012). Replication timing analyses suggested that DNA replication fork instability is a potential cause for chromosome fragility (Le Beau et al. 1998; Wang et al. 1998; Hellman et al.

2000; Palakodeti et al. 2004). Recent studies also suggested that, at least in the case of FRA3B and FRA16C (two of the most frequent common fragile sites in the ), paucity of replication initiation events is correlated with chromosome fragility (Letessier et al. 2011; Ozeri-

Galai et al. 2011). Thus, the mechanism of chromosome fragility at the CFSs still remains unclear—in particular, theories that are capable of explaining why different cell types or replication inhibitors produce distinct spectra of CFSs are still lacking. For instance, it was reported that fibroblasts and lymphocytes from the same individual showed different frequencies of CFSs (Murano et al. 1989). It is thought that differential expression plays a role in shaping the chromosome fragility profile under various conditions, suggesting that conflict between replication and may be an underlying cause of chromosome fragility.

Conflict between replication and transcription is a well-documented phenomenon in both prokaryotes and eukaryotes (Bermejo et al. 2012; Merrikh et al. 2012). Such conflicts, particularly head-on collisions, are generally avoided in most model organisms as recently reviewed (Mirkin and Mirkin 2007). For instance, highly transcribed genes are encoded on the leading strand in most bacterial genomes (Rocha 2002). It was hypothesized that such an

3 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

organization would ensure the directions of replication and transcription to be codirectional and to avert head-on collisions (Brewer 1988). In those cases where coincidental transcription and replication are inevitable, cells seem to have evolved mechanisms to resolve these conflicts without any apparent ill consequence. For instance, the yeast ribosomal DNA also contains a replication fork barrier to specifically halt replication fork progression, thereby averting head-on collision between the transcription and replication machineries (Brewer and

Fangman 1988). Intriguingly, the Saccharomyces cerevisiae genome is rather conducive to such potential conflict as the origins of replication (origins hereafter) are preferentially located in intergenic regions between converging transcription units (MacAlpine and Bell 2005;

Nieduszynski et al. 2007; Yin et al. 2009). It has also been shown that yeast tRNAs can stall replication forks in a polar fashion (Deshpande and Newlon 1996). Similarly, RNA polymerase

II (Pol II) transcribed genes can produce strong pause sites for replication forks (Azvolinsky et al. 2009). How the organism resolves these potential conflicts and maintains fitness is still unclear. Interestingly, in vitro experiments using programmed head-on collision between

Escherichia coli DNA and RNA polymerases indicated that the replication fork is capable of resuming synthesis without collapsing upon collision (Pomerantz and O'Donnell 2010).

However, such analysis has not been performed with eukaryotic enzymes where DNA polymerase has relatively lower speed than the bacterial equivalent and therefore might not fare as well in a collision with the RNA polymerase. Consistent with this notion, mutations in a multitude of pathways that increase the frequency of replication-transcription conflicts can lead to genome instability (Tuduri et al. 2009; Luna et al. 2012; Duch et al. 2013). Here we propose that replication inhibitors can also induce unscheduled conflicts between replication and transcription due to their dual effects on these two processes, leading to DSBs.

4 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

We recently examined the dynamics of chromosome fragility in a yeast replication checkpoint mutant, mec1-1 sml1-1 (mec1-1 is a lethal mutation that requires the presence of the sml1-1 allele for survival, hereafter referred to as mec1 for simplicity), after nucleotide starvation by a potent inhibitor of ribonucleotide reductase, hydroxyurea (HU) (Feng et al. 2011). Using microarray-based simultaneous mapping of single stranded DNA (ssDNA) and double strand breaks (DSBs) we observed that chromosomal fragility was preceded by ssDNA formation at the replication forks. However, DSBs were observed at apparently similar frequencies at the replication forks throughout the genome. Moreover, while ssDNA was present at replication forks during exposure to HU, high levels of DSBs were not evident until HU was removed from the cells. These observations prompted the current study in which we asked if replication forks at certain genomic regions are more prone to breakage than at others, particularly during recovery from HU.

Here we describe an improved methodology named “Break-seq” that combines the previously described DSB mapping by end-repair (Feng et al. 2011) with next generation sequencing technology, a readout that provides unparalleled sensitivity and resolution compared to microarrays (hereafter referred to as “Break-chip”). Break-seq facilitated the detection of

DSBs not only during cell recovery from HU where chromosome breakage was abundant, but also when cells were still exposed to HU where chromosome breakage was below detection by previous methods. We identified breakage hot spots at a wide range of transporter genes including metal ion transmembrane transporters in mec1 cells and at stress-response transcriptional factors in MEC1 cells. Based on these results, we propose that DSBs result from a clash between unstable replication forks and increased transcription at HU-induced genes due to the dual effects of HU on replication and gene transcription.

5 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

RESULTS

Break-seq, a universally adaptable DSB mapping method. We previously used Break-chip to map chromosome breakage in the yeast checkpoint mutant mec1-1 and demonstrated that HU treatment causes single-stranded DNA (ssDNA) formation at replication forks, which is subsequently converted to DSBs by an unknown mechanism (Feng et al. 2011). However, it is unclear if all ssDNA regions have equal likelihood of becoming DSBs. Moreover, if the DSBs resulted from intrinsic instability of the ssDNA, it is unclear why chromosome breakage is only evident during recovery from HU rather than when cells are still in HU (Feng et al. 2009). To achieve DSB detection with high sensitivity we developed an improved methodology, Break-seq

(Fig. 1A). Briefly, we labeled DNA ends with biotin using the T4 DNA polymerase contained in the End-Repair enzyme mix (End-ItTM, Epicentre), followed by shearing of the chromosomal

DNA, capturing labeled double-stranded DNA and constructing libraries using multiplexed

Illumina technology. We used the Burrows-Wheeler Aligner (Li and Durbin 2009) or Bowtie 2

(Langmead and Salzberg 2012) to align these sequence reads to the S. cerevisiae reference genome (SacCer3) (Table S1). We then used MACS (Model-based Analysis of ChIP-seq) to identify genomic regions showing significant level (p < 1E-5) of end-labeled signals in either a single sample (One-Sample test) or enriched in a sample compared to a control (Two-Sample test) (Zhang et al. 2008).

We first performed a proof-of-concept experiment by digesting a G1-arrested (by α- factor) sample from mec1 cells with BamHI in vitro (referred to as the G1BamHI sample), followed by Break-seq. We identified 1296 DSBs in the G1BamHI sample with apparently good correlation with the known 1679 BamHI sites in the S288c reference genome based on visual inspection (Fig. 1B). We further demonstrated that 231 of 241 DSBs on four representative

6 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

chromosomes (II, III, VI, and X) were found within a 1-kb distance from a known BamHI site with a median distance of 217 bp (Table S2). This result demonstrated that the Break-seq method identifies bona fide DSBs. It also implies that the average End-Repair track length by

T4 DNA polymerase is 200-300 bp.

Break-seq permits detection of low levels of DSBs at origins of replication during exposure to HU. We performed two independent experiments (Exp A and B) for mec1 and one experiment (Exp C) for MEC1 cells and collected the following samples: 1) cells synchronously released into S phase in the presence of 200 mM HU for 1 h (HU), and 2) cells recovered for 1 h after removal of HU (Recovery). For clarity, hereafter the samples associated with the MEC1 and mec1 cells will bear the superscripts “MEC1”(as in HUMEC1) and “mec1” (as in

Recoverymec1), respectively. Both the HUmec1 samples and the Recovery mec1 samples exhibited significant concordance between the two experiments (Fig. 1C & Table S3). To compare the

DSB profiles between samples we visualized the sequence reads using SeqMonk (Babraham

Bioinformatics, http://www.bioinformatics.babraham.ac.uk/projects/seqmonk/) (Fig. S1 & Fig.

2A). We also tested if copy number variation in the samples in S phase would produce artifactual DSB signals at the replicated region by performing Break-seq from the G1 samples or

“mock” Break-seq, without biotinylated nucleotides, from the HU samples. Both control samples yielded apparently random signals across the genome (Fig. S2). However, we note that some of the signals in the mock Break-seq samples were enriched near origins (Fig. S2, block arrows).

Using MACS we detected 138 and 151 DSB peaks in the HUmec1 samples of Exp A and

B, respectively, with ~60% concordance (Fig. 2B & Table S4). We also detected 103 DSB peaks in the HUMEC1 sample (Table S4). We compared the locations of these DSBs to those of

7 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

626 confirmed or likely OriDB (origin database)-curated origins (http://www.oridb.org, S. cerevisiae v2.0.1). The majority of the DSBs in the HUmec1 samples, 79% (109 of 138) and 85%

(129 of 151) from Exp A and B, overlapped with an origin with the median distance between their mid points being 600 bp and 377 bp, respectively. Similarly, 72% (74 of 103) of DSBs from the HUMEC1 sample overlapped with an origin with a median distance of 543 bp. Co- localization of the DSBs and the origins was evaluated by calculating the frequency at which a randomized set of DSBs in a simulation (10,000 iterations) were found to overlap with origins at the same or greater frequency than the observed data. All three HU samples showed significant co-localization of the DSBs and origins (P < 0.0001). Thus, we concluded that low levels of chromosome breakage occurred near the origins even in the presence of HU in both mec1 and

MEC1 cells. We note, however, DSB signals near selected origins were at least partially attributable to ploidy increase.

DSBs during recovery from HU are correlated with replication fork progression. Using

MACS we also detected 190 and 217 DSBs in the Recoverymec1 samples from Exp A and B, respectively, with 65% concordance (Fig. 2B), and 79 DSBs in the RecoveryMEC1 sample (Table

S4). DSBs during recovery from HU were more distal from the origins in the MEC1 cells than those in the mec1 cells (Fig. 2A), resulting in a lower concordance (R = 0.793) between DSBs in the HU and Recovery samples (Table S3). This observation is consistent with the notion that in the MEC1 cells replication forks were capable of resuming after the removal of HU, whereas those in mec1 cells were not (Feng et al. 2009). The comparison also revealed that at many genomic loci the DSB level was only apparent in the Recovery sample compared to the HU sample (Fig. 2A).

8 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Enrichment of chromosome breaks during Recovery from HU occurs preferentially at transporter genes in mec1 cells. We then asked if there were “enriched DSBs” in the Recovery samples relative to the HU samples by performing a Two-Sample test in MACS. We identified

186 and 209 enriched DSBs from Exp A and B, respectively, with approximately 50% overlap

(Fig. 2C & Table S5). The relatively low level of concordance is attributable to variations in replication fork dynamics as we previously observed unrestrained replication fork movement in mec1 cells in HU (Feng et al. 2009). Nevertheless, we focused on those common regions of

DSBs and asked: What makes these enriched DSBs the preferred sites for chromosome breakage?

We first asked what genomic features were associated with these enriched DSBs in the

Recoverymec1 samples in the two biological replicate experiments. We identified 745 and 760 features in Exp A and B, respectively, either enclosed by or overlapping with the enriched DSBs

(Fig. 2C). There were 203 common features, including 184 -coding genes, 1 pseudogene

(YIL167W, remnant of an ancestral ORF encoding an L-serine dehydratase found in other yeast species), 17 tRNAs, and 1 small nucleolar RNA (SNR37), associated with enriched DSBs (Fig.

2C & Table S6). The RNA genes equally partitioned into convergent (9 tRNAs and SNR37) and codirectional (8 tRNAs) with respect to a replication fork from the nearest origin. In contrast, among the 185 common protein-coding genes, 69 are codirectional and 116 are convergent with the nearest replication fork (Table S7). This bias for convergent replication and transcription contrasts the genome average—nearly equal numbers of ORFs in the yeast genome (3185 vs.

3406) are convergent or codirectional with the nearest replication fork, respectively (χ2= 15.37, the two-tailed P < 0.0001). Therefore, we wondered if the formation of the enriched DSBs

9 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

involved clashes between replication and transcription at these genes. If this were true, one would predict some commonality among these break-associated genes.

We then asked what (GO) enrichment might exist among the 185 common genes, or “enriched DSB-associated genes.” These genes were enriched for

“transporter activity” (p = 0.0037) and consisted of a wide range of transporters of carbohydrates, lipids, metal ions, and drugs (Table S8). In particular, “metal ion transmembrane transporter activity” (p = 0.002 and p = 0.004 for Exp A and B, respectively) was represented by

11 genes in each experiment with 4 in common (ARN2, YKE4, ALR2, and UGA4, highlighted in

Table S8). These 4 gene products are responsible for transporting siderophore-iron, zinc, magnesium, and gamma-aminobutyric acid, respectively. The “enriched DSB-associated genes” also showed predominance for “salvage pathways of pyrimidine ribonucleotides” (FUR1 and

CDD1, p = 0.034). We note the observed GO enrichment was relatively weak, i.e., not deemed statistically significant (p > 0.01) when corrected for multiple testing. Nevertheless, we wondered if the enrichment of DSBs associated with these genes was a result of transcription up- regulation in response to HU. Closer inspection of the enriched DSB loci provided evidence supporting our hypothesis. For instance, transcription from both ARN2 and UGA4 are convergent with replication from the nearest confirmed origin (Fig. 3A). Furthermore, we identified in both experiments an enriched DSB at chrXII:448345-451179, between the convergent ARS1216 and the first copy of 35S rDNA transcript within the ribosomal DNA

(rDNA) repeats (data not shown). Note the replication fork from ARS1216 is not insulated by a replication fork barrier as are forks generated from the origins within the rDNA repeats.

Enriched DSBs in MEC1 cells is correlated with transcription factors for stress response.

We also identified 137 “enriched DSBs” from the RecoveryMEC1 sample relative to the HUMEC1

10 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

sample in MEC1 cells (Table S5) and 244 associated genes (Table S6). These genes were enriched for RNA polymerase II transcription factors (GO:1077, p = 0.005, Table S8), represented by PDR3, RIM101, RPN4, UME6, and CAT8, which are all direct or indirect regulators of stress response genes (Wendler et al. 1997; Nishizawa et al. 2010; Wang et al.

2010; Soontorngun et al. 2012; McDaniel and Strahl 2013). For instance, an enriched DSB- associated region approximately 15 kb downstream from ARS1013 (Fig. 3B) encompass RNR2

(strongly induced by replication stress in a Mec1/Rad53 checkpoint-dependent manner (Huang et al. 1998)), APS3 (encoding a small subunit of a clathrin-associated protein complex whose protein abundance increases in response to DNA replication stress (Tkach et al. 2012)), RRN7

(encoding a core factor of the rDNA transcriptional complex), and two uncharacterized ORFs

(YJL27C, whose deletion renders cell HU-sensitive, and YJL28W). Similarly, enriched DSBs were detected at the RIM101 locus (Fig. 3B). In addition, distinct DSBs can also be observed— though not deemed significant by MACS—at OCA5 and WSC4, both implicated in stress response (Verna et al. 1997). Interestingly, many of the targets of these stress-response genes were associated with enriched DSBs in the mec1 mutant, e.g., ARN2 and YKE4 are substrates of

RIM101, RPN4, and UME6 (Salin et al. 2008; Reimand et al. 2010). Therefore, it stands to reason that a common stress response was elicited in the mec1 and MEC1 cells and that HU- induced gene expression causes unscheduled conflicts with replication at different loci contingent on replication fork progression.

We also asked if the differential enrichment of DSB-associated genes in MEC1 and mec1 cells is due to non-random distribution of these genes with regard to their distance to origins.

The average distance between the 113 most DSB-proximal genes (among the 244 genes encompassed by the enriched DSBs in the MEC1 cells) and the nearest origin is 8649 bp (9476

11 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

bp if Rad53-checked origins were excluded), at least 3.3 times the average distance between the

DSB-associated genes in mec1 cells and their nearest origins (2619 bp). We asked if the transcription factors might be more distant from the origins than the transporter genes such that the replication forks in mec1 cells would have a lower probability of encountering them during recovery from HU. The average distance between all 50 “RNA polymerase II transcription factors” [GO:1077] and their nearest origin is 7764 bp, and the average distance between 206 substrates of these transcription factors and their nearest origin is 7805 bp. We also tested multiple randomly chosen gene groups for origin-to-gene distances and did not find apparent bias for any group (data not shown). Thus, it does not appear that there is a biased distribution of yeast genes with regard to their relative location to the origins. That different DSB-associated genes are enriched in mec1 vs. MEC1 cells is more likely due to differential replication dynamics, including origin activation timing/efficiencies and fork migration rates.

Meta-analyses of DSBs with regards to genes involved in codirectional or convergent replication and transcription. To gain a comprehensive understanding of the relationship between the enriched DSBs relative to the DSB-associated genes we performed a meta-analysis to ask how the DSBs are distributed relative to the transcription unit. We calculated the relative distance between the center of each DSB and its associated gene—the actual distance (to the start, mid and end of the gene) was normalized by gene size. In MEC1 cells DSBs tended to occur near the 5’-end of genes regardless of replication fork direction (Fig. 3C). However, there was a relatively increased proportion of DSBs near the 3’-end of genes when replication and transcription was convergent (Fig. 3C). In mec1 cells DSBs were similarly enriched near the 5’- end of genes in codirectional replication and transcription; however, there was a more significant

12 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

shift of DSBs towards the middle and 3’-end of genes when replication and transcription was convergent.

RNA-seq reveals cell cycle phase-specific gene expression in mec1 and MEC1 cells. To further substantiate our core hypothesis we examined global gene expression in cells under identical experimental conditions as for Break-seq, with the exception that two Recovery samples (R30 and R60, for 30 and 60 min of recovery, respectively) for each strain were collected (Fig. 4A). Flow cytometry analysis showed delayed replication progression during

Recovery in the mec1 compared to the MEC1 (WT) cells (Fig. 4B). Paired-end sequence reads from the mRNA libraries were quantified by RPKM (reads per kilobase per million reads) for every mRNA in the yeast genome (Table S9). Those genes showing significant change (p <

0.05) between samples were identified (Table S10). Hierarchical clustering analysis validated

HU-induction of genes from previously defined functional groups, including the Mec1 checkpoint-dependent DNA damage signature genes (Gasch et al. 2001; Dubacq et al. 2006)

(Fig. 4C, left panel; Table S10). We then identified those genes with relatively higher expression in the Recovery than in the HU sample for both MEC1 and mec1 cells (Fig. 4C, middle and right panels; Table S10). GO enrichment analysis demonstrated increased expression of oxidative stress response genes as exemplified by genes with oxidoreductase activities and those involved in cell wall structures in both MEC1 and mec1 cells (Table S11). Moreover, mec1 cells produced increased expression from iron binding genes [GO:5506] specifically during Recovery, consistent with the observation that the enriched DSB-associated genes were predominantly metal ion transporter genes. Likewise, genes in the structural constituent of cell wall [GO:5199], comprised of TIP1, SED1, CIS3, HSP150, and PIR1, are all stress-response genes showing increased expression during Recovery in MEC1 cells (Table S11). Temporal expression profiles

13 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

of selected genes from these groups demonstrated Recovery-specific expression in the respective strains (Fig. 4D&E). Therefore, these results recapitulated the enriched DSB patterns and provided support for our core hypothesis.

Chromosome breakage at the metal ion transporter genes is further enhanced by additional iron chelation in mec1 cells recovering from HU. To further substantiate the hypothesis that HU-induced expression of metal ion transporter genes resulted in DSBs we used bathophenanthroline sulfonate (BPS), an iron chelator, to further drive gene expression at the transporter genes in mec1 cells. After the release from the G1/S transition and incubation with

HU for 1 h, cells were allowed to recover with or without BPS for 1 hr (R-1h+BPS and R-1h-

BPS, respectively). First, we did not observe significant difference at bulk levels of chromosome breakage in these samples by Pulse Field Gel Electrophoresis (Fig. 5A). Because DSBs were abundant, Break-chip offered sufficient sensitivity for this analysis. As shown in Fig. 5B, higher levels of DSBs were observed at selected genomic regions in the sample with BPS than the one without BPS. We calculated the average signals in a 1-kb sliding window across the genome and identified those windows with a significant increase in the R-1h+BPS sample compared to the R-

1h-BPS sample (> 2 standard deviations, p < 0.05). We found 362 associated genes (Table S12), including ARN2 (encoding a siderophore transporter) and GIN4 (encoding a protein kinase involved in bud growth and septin ring assembly) (Fig. 5B), both shown to be transcriptionally induced by iron deprivation (Yun et al. 2000; Protchenko et al. 2001; Shakoury-Elizeh et al.

2004; Puig et al. 2008). GO enrichment analysis revealed the following top categories, in descending order of significance: copper ion membrane transport [GO: 35434] (p = 0.0001), copper ion import [GO:15677] (p = 0.0004) and mRNA export from nucleus in response to heat stress [GO:31990] (p = 0.0011). These genes are also enriched for “iron ion homeostasis

14 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

[GO:55072]” (p = 0.0090). These results are consistent with our prediction that BPS treatment would further increase the transport of heavy metal ions across cellular membranes, thus supporting our hypothesis that DSBs can be enhanced by gene expression to increase the probability of conflicts with replication.

Programmed replication-transcription conflict through HU induction causes plasmid instability. To provide more rigorous test of our hypothesis we also asked if HU produces chromosome instability in programmed convergent replication and transcription. We selected four genes, YHK8, AHP1, RNR3, and TSA1, previously shown to be induced by HU (Dubacq et al. 2006), as verified by RNA-seq (Table S10). We cloned these genes and their endogenous promoters into a plasmid, YCpGal-ARS1-MCS, such that transcription would either occur towards (F) or away from (R) ARS1. We measured plasmid retention in MEC1 cells exposed to

HU for 1 h after synchronous entry into S phase, as well as in G1 control samples. The results indicated that placing these genes in convergent orientation with the incoming replication fork led to a greater reduction in plasmid stability compared to codirectional placement relative to the replication fork (p = 0.02 in a paired Student’s t-Test) (Fig. 6). These experiments provided further evidence that HU-induced convergent replication and transcription enhance chromosome fragility.

15 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

DISCUSSION

Our previous studies have established that, after transient exposure to HU, checkpoint- deficient mec1 cells suffer from extensive ssDNA formation and ultimately DSBs at destabilized replication forks (Feng et al. 2011). In this study, Break-seq analyses not only confirmed that chromosome breakage occurs at replication forks, but also provided evidence for enriched DSBs when cells attempted to recover from HU. The observations that enriched DSBs were associated with stress-response genes and with different subgroups represented in the MEC1 and mec1 cells led to a model for the mechanism of enhanced chromosome breakage during recovery from HU

(Fig. 7). We propose that the enriched DSBs occurred at genomic loci where replication forks encountered transcription and/or transcription factor binding at HU-induced genes. Because the replication forks in MEC1 cells were capable of traversing a greater distance after the removal of

HU than those in mec1 cells different subclasses of the stress response genes were revealed in these cells. Consistent with the predictions from this model we showed that further induction of the metal ion transporter genes through iron deprivation by BPS increased DSBs near these genes. We also provided evidence that programmed conflicts between replication and transcription induced by HU led to the reduction of plasmid stability specifically when transcription occurred in a convergent manner with respect to replication. We acknowledge that there is considerable variability in each Break-seq experiment with respect to the class of genes showing DSB association due to differential replication dynamics. Nevertheless, gene expression profiling provided support for the observed DSB patterns, as exemplified by the iron- binding genes enriched in mec1 cells during recovery from HU.

Our hypothesis predicts that enriched DSBs (in Recovery) would occur at genes normally expressed during S phase, such as the histone genes. To our surprise, histone genes were largely

16 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

excluded from enriched DSBs despite the observation that the majority of canonical histone genes indeed showed S phase-specific expression, peaking at 30 min during recovery from HU

(Fig. S3A&B). Therefore, we entertained the possibility that replication might have preceded transcription at these loci, thus averting conflicts between replication and transcription. Indeed, based on a previous study (Alvino et al. 2007) at the cell cycle stage equivalent to the recovery phase in our experiments, the earliest-replicating histone genes (HTA1/HTB1 and HTA2/HTB2) have achieved ~80% replication. On the other hand, the latest-replicating non-canonical histone gene loci (HTZ1 and HHO1) were not transcriptionally induced (Fig. S3C). Thus, both groups of genes were spared from replication-transcription conflict-induced DSBs. The HHF1/HHT1 and

HHF2/HHT2 loci with intermediate replication timing and transcription induction comprised the only group that would show higher probability of coinciding replication and transcription. We did observe enriched DSBs in the vicinity of, albeit not within, these genes (data not shown).

Therefore, it seems that the histone genes were largely exempted from DSBs due to temporally separated replication and transcription.

This work provided insights into the mechanisms of genome instability induced through replication-transcription conflicts. Meta-analysis of break positions within DSB-associated genes suggest that replication forks tend to stall at the 5’-end of genes regardless of fork direction in MEC1 cells, albeit with a relative increase at the 3’-ends when replication and transcription was convergent. This biased DSB pattern was more pronounced in mec1 cells— while codirectional replication and transcription still produced DSBs primarily at the 5’-ends of genes, convergent juxtaposition led to a more notable shift towards the middle and 3’-ends.

These results suggest that replication forks were generally more unstable when encountering the transcription initiation complex than the termination complex. In contrast, in mec1 cells

17 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

convergent replication forks were more unstable when encountering the transcription elongation and/or termination complex. Alternatively, these biased DSB distributions might reflect the patterns of residual stalled transcription initiation or elongation complex from G1 persisting into subsequent stages of the cell cycle. Our observations in MEC1 yeast cells therefore contrasted the “punctuation mark theory” based on bacterial plasmids where the transcription initiation complex preferentially stalls the convergent replication fork, whereas the termination complex preferentially stalls the codirectional replication fork (Mirkin et al. 2006). This apparent discrepancy might reflect intrinsic differences between a functional transcription complex in our system versus an engineered and stationary complex in bacteria. Alternatively, differences in replication fork speed and/or replicative helicase polarities between these two organisms might also contribute to these results. Future work is warranted to investigate the cause for the positional bias of replication fork stalling within a transcription unit in yeast.

Our work also bears on our understanding of HU’s impact on cellular processes. Though

HU is primarily known as a ribonucleotide reductase inhibitor it has a multitude of other cellular effects. For instance, HU treatment results in lowered intracellular iron and in turn, increased iron uptake by transferrin (Chitambar and Wereley 1995). HU has also been shown to bind metal ions such as iron and copper directly (Konstantinou et al. 2011). Moreover, HU induces oxidative stress response (Dubacq et al. 2006). Cellular defense against oxidative stress requires the activities of superoxide dismutases (SODs) which in turn require increased mobilization of heavy metals across different cellular membranes for their activities towards reactive oxygen species: copper and zinc for Sod1 (Bermingham-McDonogh et al. 1988; Chang et al. 1991) and manganese for Sod2 (Ravindranath and Fridovich 1975). Finally, we have also uncovered diverse chemical-genetic interactions between HU and a wide range of genetic mutants

18 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

(McCulley et al. 2014). Therefore, we propose that HU triggers a global stress response involving with a wide range of cellular functions including heavy metal mobilization.

In summary, we have upgraded Break-chip to Break-seq with superior sensitivity of DSB detection. This improvement facilitated the discovery of enriched DSBs in cells recovering from exposure to HU compared to cells whose cell cycle was still arrested by the drug, and led to the hypothesis that unscheduled conflict between replication and transcription underlies replication stress-induced chromosome breakage. Our model potentially provides a solution to a long- standing problem in mammalian chromosome biology: the reason why replication inhibitors induce distinct patterns of CFSs is because they each produces a different gene expression pattern and causes replication-transcription conflicts at different genomic regions. This work also explains why chromosome breakage was more pronounced in the Recovery than in the HU sample: in HU, replication forks stall and have a lower probability of breaking and colliding with transcription, whereas during recovery they can resume or initiate from the previously unfired origins, thus increasing the probability of DSBs. We believe that the Break-seq methodology as well as our proposed model of chromosome breakage have direct impact on the discovery of mammalian chromosome fragile sites.

19 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

METHODS

Yeast growth conditions and strains. All yeast strains were cultured in YPD medium or SC medium lacking tryptophan and incubated at 30°C. The strains used in this study were derivatives of strain A364a: WFA72, MATa leu2-3,112 trp1-1 his3-11 and WFA73, MATa mec1-1::HIS3 sml1-1 leu2-3,112 trp1-1 his3-11.

Cell cycle synchronization, sample collection and preparation of agarose-embedded chromosomal DNA. Cells were grown in YPD liquid medium at 30° C until they reached an optical density at 600 nm (OD600) of ~0.25. The cultures were then synchronized by 3 μM α- factor, followed by release into S phase with pronase addition (0.3 μg/ml). HU was added at 200 mM immediately before adding pronase. After 1 h, cultures were filtered to remove HU and cells were resuspended in fresh YPD medium to recover for up to 3 h with samples taken every hour. BPS was added at 0.8 μM to the YPD medium just before cell resuspension.

Approximately 106 yeast cells from each sample were washed in 50 mM EDTA and embedded in 0.5% Incert low melting agarose (Lonza) in 50 mM EDTA, then cast in a 100-μl mold

(BioRad) to solidify. The plugs were then processed for cell lysis and protein removal as previously described (Feng et al. 2011).

Flow cytometric analysis. One-ml samples from the synchronized cultures described above were collected every 15 min and fixed with ethanol. SYTOX Green-stained samples were measured for DNA content with a Beckton Dickinson LSRFortessa. Data analysis was performed with FlowJo.

20 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Pulse Field Gel Electrophoresis. PFGE analysis was performed as described previously (van

Brabant et al. 2001). Electrophoresis was conducted at 14˚C for 24 h with a switch time ramped from 10 to 90 sec at 6 volts/cm. Standard procedures were employed to ethidium bromide stain and photograph the gel.

Cloning of AHP1, TSA1, YHK8, and RNR3. PCR products of AHP1, TSA1, YHK8, and

RNR3, along with flanking sequences, were cloned into YCpGal-ARS1-MCS, derived from

YCpGal-ARS1 (M. K. Raghuraman, unpublished), replacing a GAL1 promoter adjacent to the origin, ARS1, such that transcription occurs either towards (Forward, “F”) or away from

(Reverse, “R”) ARS1. The PCR products were inserted as either NdeI-NotI (AHP1) or AseI-NotI

(all others) fragments into the NdeI and NotI sites. PCR primer sequences are listed in Table

S13. All clones were confirmed by Sanger sequencing.

Plasmid retention assay. WFA72 cells transformed with plasmids described above were grown in synthetic medium lacking tryptophan (SC-Trp) at 30° C until OD600 reached ~0.25. Cell cycle synchronization and HU (200 mM) treatment were performed as described above. After 1h treatment, serial diluted cells were plated at ~500 cells per plate on solid YPD or SC-Trp medium after sonication, and plates were incubated at 30° C for 2-3 days before colonies were counted. The percentage of plasmid retention was calculated as the percentage of tryptophan prototrophic colonies of total number of colonies on YPD medium. Each measurement was averaged from at least three biological replicates.

Break-seq and data analyses. Agarose-embedded chromosomal DNA sample were subjected to

Break-seq library construction. Detailed procedures as well as sequence data analyses, including the random simulation tests, are described in the supplemental text.

21 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Break-chip. DSB labeling followed by sample hybridization on Agilent G4493A Yeast Whole

Genome ChIP-on-Chip 4x44K microarrays were performed as previously described with one modification (Feng et al. 2011). Extraction of DNA from agarose plugs was performed by β- agarase digestion rather than electroelution described in the Break-seq procedures.

RNA-seq. Total RNA was isolated from ~5x108 cells by standard procedure and 2 μg of RNA from each sample were used to prepare mRNA libraries using the Illumina TruSeq Stranded mRNA Sample Preparation Kit according to the manufacturer’s recommendations. The libraries were validated on the Agilent Technologies 2100 Bioanalyzer using the Agilent DNA 1000 chip, pooled and sequenced on the MiSeq using a 2 x 75 cycle run. Raw sequences were mapped onto the SacCer3 genome by BWA and subsequent analyses were performed with SeqMonk and

Cluster 3.0 (de Hoon et al. 2004). Java TreeView (Saldanha 2004) was used to visualize clustered data.

Gene ontology (GO) enrichment analysis. All data analysis of GO enrichment for either molecular function or biological process was performed with the Saccharomyces cerevisiae genome database “Gene Ontology Enrichment” widget, or by using FunSpec (Robinson et al.

2002) with a maximum P-value of 0.01. For RNA-seq data, GO enrichment P values were corrected for multiple testing by the Benjamini Hochberg method (Hochberg and Benjamini

1990).

22 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

DATA ACCESS

All primary sequencing data from this study have been submitted to the NCBI Gene Expression

Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/) under the SuperSeries accession GSE63517, with the Break-seq, RNA-seq and Break-chip data assigned separate accession numbers,

GSE58808, GSE63516 and GSE64446, respectively.

23 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

REFERENCES

Alvino GM, Collingwood D, Murphy JM, Delrow J, Brewer BJ, Raghuraman MK. 2007.

Replication in hydroxyurea: it's a matter of time. Mol Cell Biol 27(18): 6396-6406.

Arlt MF, Durkin SG, Ragland RL, Glover TW. 2006. Common fragile sites as targets for

chromosome rearrangements. DNA Repair (Amst) 5(9-10): 1126-1135.

Azvolinsky A, Giresi PG, Lieb JD, Zakian VA. 2009. Highly transcribed RNA polymerase II

genes are impediments to replication fork progression in Saccharomyces cerevisiae. Mol

Cell 34(6): 722-734.

Bermejo R, Lai MS, Foiani M. 2012. Preventing replication stress to maintain genome stability:

resolving conflicts between replication and transcription. Mol Cell 45(6): 710-718.

Bermingham-McDonogh O, Gralla EB, Valentine JS. 1988. The copper, zinc-superoxide

dismutase gene of Saccharomyces cerevisiae: cloning, sequencing, and biological

activity. Proc Natl Acad Sci U S A 85(13): 4789-4793.

Brewer BJ. 1988. When polymerases collide: replication and the transcriptional organization of

the E. coli chromosome. Cell 53(5): 679-686.

Brewer BJ, Fangman WL. 1988. A replication fork barrier at the 3' end of yeast ribosomal RNA

genes. Cell 55(4): 637-643.

Casper AM, Rosen DM, Rajula KD. 2012. Sites of genetic instability in mitosis and cancer.

Annals of the New York Academy of Sciences 1267: 24-30.

Chang EC, Crawford BF, Hong Z, Bilinski T, Kosman DJ. 1991. Genetic and biochemical

characterization of Cu,Zn superoxide dismutase mutants in Saccharomyces cerevisiae. J

Biol Chem 266(7): 4417-4424.

24 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Chitambar CR, Wereley JP. 1995. Effect of hydroxyurea on cellular iron metabolism in human

leukemic CCRF-CEM cells: changes in iron uptake and the regulation of transferrin

receptor and ferritin gene expression following inhibition of DNA synthesis. Cancer Res

55(19): 4361-4366. de Hoon MJ, Imoto S, Nolan J, Miyano S. 2004. Open source clustering software. Bioinformatics

20(9): 1453-1454.

Debatisse M, Le Tallec B, Letessier A, Dutrillaux B, Brison O. 2012. Common fragile sites:

mechanisms of instability revisited. Trends in genetics : TIG 28(1): 22-32.

Deshpande AM, Newlon CS. 1996. DNA replication fork pause sites dependent on transcription.

Science 272(5264): 1030-1033.

Dubacq C, Chevalier A, Courbeyrette R, Petat C, Gidrol X, Mann C. 2006. Role of the iron

mobilization and oxidative stress regulons in the genomic response of yeast to

hydroxyurea. Mol Genet Genomics 275(2): 114--124.

Duch A, Felipe-Abrio I, Barroso S, Yaakov G, Garcia-Rubio M, Aguilera A, de Nadal E, Posas

F. 2013. Coordinated control of replication and transcription by a SAPK protects

genomic integrity. Nature 493(7430): 116-119.

Durkin SG, Glover TW. 2007. Chromosome fragile sites. Annu Rev Genet 41: 169-192.

Feng W, Bachant J, Collingwood D, Raghuraman MK, Brewer BJ. 2009. Centromere replication

timing determines different forms of genomic instability in Saccharomyces cerevisiae

checkpoint mutants during replication stress. Genetics 183(4): 1249-1260.

Feng W, Collingwood D, Boeck ME, Fox LA, Alvino GM, Fangman WL, Raghuraman MK,

Brewer BJ. 2006. Genomic mapping of single-stranded DNA in hydroxyurea-challenged

yeasts identifies origins of replication. Nat Cell Biol 8(2): 148-155.

25 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Feng W, Di Rienzi SC, Raghuraman MK, Brewer BJ. 2011. Replication stress-induced

chromosome breakage is correlated with replication fork progression and is preceded by

single-stranded DNA formation. G3 (Bethesda) 1(5): 327--335.

Gasch AP, Huang M, Metzner S, Botstein D, Elledge SJ, Brown PO. 2001. Genomic expression

responses to DNA-damaging agents and the regulatory role of the yeast ATR homolog

Mec1p. Mol Biol Cell 12(10): 2987-3003.

Hellman A, Rahat A, Scherer SW, Darvasi A, Tsui LC, Kerem B. 2000. Replication delay along

FRA7H, a common fragile site on human chromosome 7, leads to chromosomal

instability. Mol Cell Biol 20(12): 4420-4427.

Hochberg Y, Benjamini Y. 1990. More powerful procedures for multiple significance testing.

Statistics in medicine 9(7): 811-818.

Huang M, Zhou Z, Elledge SJ. 1998. The DNA replication and damage checkpoint pathways

induce transcription by inhibition of the Crt1 repressor. Cell 94(5): 595-605.

Konstantinou E, Pashalidis I, Kolnagou A, Kontoghiorghes GJ. 2011. Interactions of

hydroxycarbamide (hydroxyurea) with iron and copper: implications on toxicity and

therapeutic strategies. Hemoglobin 35(3): 237-246.

Langmead B, Salzberg SL. 2012. Fast gapped-read alignment with Bowtie 2. Nature methods

9(4): 357-359.

Le Beau MM, Rassool FV, Neilly ME, Espinosa R, 3rd, Glover TW, Smith DI, McKeithan TW.

1998. Replication of a common fragile site, FRA3B, occurs late in S phase and is delayed

further upon induction: implications for the mechanism of fragile site induction. Hum

Mol Genet 7(4): 755-761.

26 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Letessier A, Millot GA, Koundrioukoff S, Lachages AM, Vogt N, Hansen RS, Malfoy B, Brison

O, Debatisse M. 2011. Cell-type-specific replication initiation programs set fragility of

the FRA3B fragile site. Nature 470(7332): 120-123.

Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform.

Bioinformatics 25(14): 1754-1760.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R.

2009. The sequence alignment/map (SAM) format and SAMtools. Bioinformatics 25:

2078-2079.

Luna R, Rondon AG, Aguilera A. 2012. New clues to understand the role of THO and other

functionally related factors in mRNP biogenesis. Biochim Biophys Acta 1819(6): 514-

520.

MacAlpine DM, Bell SP. 2005. A genomic view of eukaryotic DNA replication. Chromosome

Res 13(3): 309-326.

McCulley A, Haarer B, Viggiano S, Karchin J, Feng W. 2014. Chemical suppression of defects

in mitotic spindle assembly, redox control, and sterol biosynthesis by hydroxyurea. G3

(Bethesda) 4(1): 39-48.

McDaniel SL, Strahl BD. 2013. Stress-free with Rpd3: a unique chromatin complex mediates the

response to oxidative stress. Mol Cell Biol 33(19): 3726-3727.

Merrikh H, Zhang Y, Grossman AD, Wang JD. 2012. Replication-transcription conflicts in

bacteria. Nature reviews Microbiology 10(7): 449-458.

Mirkin EV, Castro Roa D, Nudler E, Mirkin SM. 2006. Transcription regulatory elements are

punctuation marks for DNA replication. Proc Natl Acad Sci U S A 103(19): 7276-7281.

27 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Mirkin EV, Mirkin SM. 2007. Replication fork stalling at natural impediments. Microbiology

and molecular biology reviews : MMBR 71(1): 13-35.

Murano I, Kuwano A, Kajii T. 1989. Fibroblast-specific common fragile sites induced by

aphidicolin. Hum Genet 83(1): 45-48.

Nieduszynski CA, Hiraga S, Ak P, Benham CJ, Donaldson AD. 2007. OriDB: a DNA replication

origin database. Nucleic Acids Res 35(Database issue): D40-46.

Nishizawa M, Tanigawa M, Hayashi M, Maeda T, Yazaki Y, Saeki Y, Toh-e A. 2010. Pho85

kinase, a cyclin-dependent kinase, regulates nuclear accumulation of the Rim101

transcription factor in the stress response of Saccharomyces cerevisiae. Eukaryotic cell

9(6): 943-951.

Ozeri-Galai E, Lebofsky R, Rahat A, Bester AC, Bensimon A, Kerem B. 2011. Failure of origin

activation in response to fork stalling leads to chromosomal instability at fragile sites.

Mol Cell 43(1): 122-131.

Palakodeti A, Han Y, Jiang Y, Le Beau MM. 2004. The role of late/slow replication of the

FRA16D in common fragile site induction. Genes, chromosomes & cancer 39(1): 71-76.

Pomerantz RT, O'Donnell M. 2010. Direct restart of a replication fork stalled by a head-on RNA

polymerase. Science 327(5965): 590-592.

Protchenko O, Ferea T, Rashford J, Tiedeman J, Brown PO, Botstein D, Philpott CC. 2001.

Three cell wall mannoproteins facilitate the uptake of iron in Saccharomyces cerevisiae. J

Biol Chem 276(52): 49244-49250.

Puig S, Vergara SV, Thiele DJ. 2008. Cooperation of two mRNA-binding proteins drives

metabolic adaptation to iron deficiency. Cell metabolism 7(6): 555-564.

28 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Quinlan AR, Hall IM. 2010. BEDTools: a flexible suite of utilities for comparing genomic

features. Bioinformatics 26(6): 841-842.

Ravindranath SD, Fridovich I. 1975. Isolation and characterization of a manganese-containing

superoxide dismutase from yeast. J Biol Chem 250(15): 6107-6112.

Reimand J, Vaquerizas JM, Todd AE, Vilo J, Luscombe NM. 2010. Comprehensive reanalysis

of transcription factor knockout expression data in Saccharomyces cerevisiae reveals

many new targets. Nucleic Acids Res 38(14): 4768-4777.

Robinson MD, Grigull J, Mohammad N, Hughes TR. 2002. FunSpec: a web-based cluster

interpreter for yeast. BMC bioinformatics 3: 35.

Rocha E. 2002. Is there a role for replication fork asymmetry in the distribution of genes in

bacterial genomes? Trends in microbiology 10(9): 393-395.

Saldanha AJ. 2004. Java Treeview--extensible visualization of microarray data. Bioinformatics

20(17): 3246-3248.

Salin H, Fardeau V, Piccini E, Lelandais G, Tanty V, Lemoine S, Jacq C, Devaux F. 2008.

Structure and properties of transcriptional networks driving selenite stress response in

yeasts. BMC genomics 9: 333.

Shakoury-Elizeh M, Tiedeman J, Rashford J, Ferea T, Demeter J, Garcia E, Rolfes R, Brown

PO, Botstein D, Philpott CC. 2004. Transcriptional remodeling in response to iron

deprivation in Saccharomyces cerevisiae. Mol Biol Cell 15(3): 1233-1243.

Soontorngun N, Baramee S, Tangsombatvichit C, Thepnok P, Cheevadhanarak S, Robert F,

Turcotte B. 2012. Genome-wide location analysis reveals an important overlap between

the targets of the yeast transcriptional regulators Rds2 and Adr1. Biochem Biophys Res

Commun 423(4): 632-637.

29 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Sutherland GR. 1979. Heritable fragile sites on human chromosomes. III. Detection of

fra(X)(q27) in males with X-linked mental retardation and in their female relatives. Hum

Genet 53(1): 23-27.

Tkach JM, Yimit A, Lee AY, Riffle M, Costanzo M, Jaschob D, Hendry JA, Ou J, Moffat J,

Boone C et al. 2012. Dissecting DNA damage response pathways by analysing protein

localization and abundance changes during DNA replication stress. Nat Cell Biol 14(9):

966-976.

Tuduri S, Crabbe L, Conti C, Tourriere H, Holtgreve-Grez H, Jauch A, Pantesco V, De Vos J,

Thomas A, Theillet C et al. 2009. Topoisomerase I suppresses genomic instability by

preventing interference between replication and transcription. Nat Cell Biol 11(11): 1315-

1324. van Brabant AJ, Buchanan CD, Charboneau E, Fangman WL, Brewer BJ. 2001. An origin-

deficient yeast artificial chromosome triggers a cell cycle checkpoint. Mol Cell 7(4): 705-

713.

Verna J, Lodder A, Lee K, Vagts A, Ballester R. 1997. A family of genes required for

maintenance of cell wall integrity and for the stress response in Saccharomyces

cerevisiae. Proc Natl Acad Sci U S A 94(25): 13804-13809.

Wang L, Darling J, Zhang JS, Qian CP, Hartmann L, Conover C, Jenkins R, Smith DI. 1998.

Frequent homozygous deletions in the FRA3B region in tumor cell lines still leave the

FHIT exons intact. Oncogene 16(5): 635-642.

Wang X, Xu H, Ha SW, Ju D, Xie Y. 2010. Proteasomal degradation of Rpn4 in Saccharomyces

cerevisiae is critical for cell viability under stressed conditions. Genetics 184(2): 335-

342.

30 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Wendler F, Bergler H, Prutej K, Jungwirth H, Zisser G, Kuchler K, Hogenauer G. 1997.

Diazaborine resistance in the yeast Saccharomyces cerevisiae reveals a link between

YAP1 and the pleiotropic drug resistance genes PDR1 and PDR3. J Biol Chem 272(43):

27091-27098.

Yin S, Deng W, Hu L, Kong X. 2009. The impact of nucleosome positioning on the organization

of replication origins in eukaryotes. Biochem Biophys Res Commun 385(3): 363-368.

Yun CW, Ferea T, Rashford J, Ardon O, Brown PO, Botstein D, Kaplan J, Philpott CC. 2000.

Desferrioxamine-mediated iron uptake in Saccharomyces cerevisiae. Evidence for two

pathways of iron uptake. J Biol Chem 275(14): 10709-10715.

Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nusbaum C, Myers RM,

Brown M, Li W et al. 2008. Model-based analysis of ChIP-Seq (MACS). Genome Biol

9(9): R137.

31 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

ACKNOWLEDGEMENTS

We wish to thank Anna Brosius for sharing shell scripts for bwa sequence mapping and analysis, Sujith Valiyaparambil and Jonathan Bard at the University at Buffalo Genomics and

Bioinformatics Core for HiSeq sequencing, Siyi Zhang and Jie Peng for assisting data analysis, and Nancy Walker for technical support. We are also indebted to Karen Gentile and Frank

Middleton at Upstate Medical University Microarray and Sequencing Core Facility for generous help with MiSeq sequencing. We also thank M. K. Raghuraman and Sara Di Rienzi for critical reading of the manuscript. This work was supported by a Pathway to Independence Award

(4R00GM08137804) from the National Institute of Health and a Basil O’Connor Starter Scholar

Award (56464) from March of Dimes to W. Feng.

32 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

FIGURE LEGENDS

Figure 1. (A) Schematic representation of Break-seq procedures. Chromosomal DNA is shown embedded in agarose plugs (depicted by cubes) for in-gel labeling for DSBs. (B) View of a portion of chromosome X (105000-500000 bp) showing the concordance between DSBs from the BamHImec1, G1 (Control) sample and the known BamHI sites (the annotation track on top).

The data track is shown at the bottom. Color scales from blue to red indicate increasing number of sequence reads in a given window. (C) Scatter plot of Pearson’s correlation between two biological replicate experiments. Log transformed values of read counts per base normalized to total number of sequenced bases are plotted, for the HU samples (top) and the Recovery samples

(bottom). In each plot, the values from the Exp A samples are plotted on the X-axis and those from the Exp B samples are plotted on the Y-axis.

33 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure 2. Chromosome views of the Break-seq data using SeqMonk. Sequence reads quantification was performed as described in Fig. S1. (A) View of chromosome X (745,751 bp) of the HU and Recovery samples from all experiments. The annotation tracks, from top to bottom, are: (1) ORFs (red, Watson-strand-encoded; blue, Crick-strand-encoded), (2) OriDB- curated confirmed and likely ARSs, (3) Rad53-checked origins (Feng et al. 2006), (4) Rad53- unchecked origins (Feng et al. 2006), (5) enriched DSBs in Exp A, (6) enriched DSBs in Exp B, and (7) enriched DSBs in Exp C. The data tracks are as labeled. (B) Venn diagrams showing concordance between the DSBs in HU samples alone and between the DSBs in Recovery samples alone in the two experiments. (C) Venn diagrams showing the concordance between the enriched DSBs in two experiments (left), and the identification of 203 common features (middle) and 185 common protein-coding genes (right).

34 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure 3. Examples of genomic loci containing enriched DSBs in mec1 (A) and MEC1 (B) cells. The annotation tracks are gene (1), labeled red and blue for Watson- and Crick-strand- encoded, respectively, and origins of replication (2). ORF names and ARS numbers are as labeled and centered on the marker in each track. The data tracks are as labeled. Only those genes that overlap with the enriched DSBs are named, with the exception of OCA5 and WSC4

(see main text for details). A dubious ORF and two tRNAs at the RNR2 loci are not labeled for space restriction. Distribution of enriched DSBs with respect to genes in MEC1 (C) and mec1

(D) cells. All DSB-associated genes, in either codirectional (blue) or convergent (red) orientation with respect to replication from the nearest origin (circles similarly color-coded in the two orientations) were aligned at the start codon and normalized by size. Genes (grey) in the intervening region between the origin and DSB-associated gene are illustrated to indicate the longer gene-to-origin distance in MEC1 cells. The number of DSBs occurring at the relative positions of each gene were reported for the codirectional (blue) or convergent (red) groups in histograms.

35 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure 4. RNA-seq analysis. (A) Illustration of sample collection: G1, α-factor arrested; HU, exposed to HU for 1 h; R30 and R60, recovery from HU for 30 and 60 min, respectively. (B)

Cell cycle progression by FACS. Asyn, log phase culture prior to cell cycle synchronization.

(C) Selected clusters of gene expression patterns in MEC1 (WT) and mec1 cells. Color bar values indicate Log10 transformed values of normalized RPKM read count. The three panels represent growth phase- and strain-specific gene expression patterns. Selected genes in each cluster are boxed for emphasis and color-coded for the functional groups. Left: HU-specific and

Mec1-dependent (e.g., DNA damage response genes in blue boxes); Middle: Recovery-specific in WT cells (e.g., stress-response genes in orange boxes); Right: Recovery-specific in mec1 cells

(e.g., metal ion transporter genes in purple boxes). Gene expression profiles for selected

Recovery-specific genes in WT (D) and mec1 (E) cells from the cluster analysis are shown. The times at which the samples were taken are labeled on the time scale on the X axes.

36 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure 5. Replication and transcription in a convergent orientation cause higher levels of plasmid instability after HU induction than when in a codirectional orientation. Box plot of the percentages of change in plasmid stability in MEC1 cells carrying YHK8, AHP1, TSA1, and

RNR3, in either convergent (-F) or codirectional (-R) juxtaposition relative to the ARS1 origin, after exposure to HU compared to the control (before exposure to HU) in a synchronous culture.

Note that negative values denote increased plasmid loss. Each box encloses 50% of the data with the median value of the variable displayed as a line. The top and bottom of the box mark the upper and lower quartile (limits of ±25% of the variable population), respectively. The lines extending from the top and bottom of each box mark the minimum and maximum values within the data set.

37 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure 6. Treatment with BPS induces additional breakage at discrete loci. (A) Chromosome breakage is readily observed in mec1 cells recovering from HU in the absence (yellow, -BPS) and presence of 0.8 μM BPS (red, +BPS) by PFGE. BPS addition alone does not cause chromosome breakage: “BPS 1h” and “R 2h” represent cells that have been treated with BPS for

1 h and those that have recovered in fresh medium for 1 h after treatment with BPS, respectively.

“Marker”, Yeast Chromosome PFG marker; “λ DNA”, phage lambda DNA, ~50 kb. (B) Break- chip demonstrate examples of genomic regions showing enhanced breakage in the +BPS sample

(red) compared to the –BPS sample (blue), as indicated by blue arrows.

38 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure 7. Model for chromosome fragility induced by HU as a result of unscheduled conflict between destabilized replication forks and increased gene transcription at HU-inducible genes in mec1 cells and MEC1 cells. In the presence of HU only those Rad53-unchecked origins (Feng et al. 2006) are activated in MEC1 cells and the replication forks travel a greater distance than those in the mec1 cells, thus reaching different locations in the genome. Therefore, during recovery from HU the stalled replication forks in MEC1 and mec1 cells encounter the transcription induction of different subsets of HU-responsive genes. Collision between replication and transcription brings forth further destabilization of the ssDNA at the replication forks, causing

DSBs. Note although the replication forks in WT cells would have traversed through the HU- induced genes that are proximal to the origins observed in the mec1 cells, ssDNA is only formed at the replication fork at a more distal region, thus sparing these genes from DSBs.

39 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

SUPPLEMENTAL MATERIALS

Supplemental text

Detailed experimental procedures for Break-seq. Each processed agarose plug was incubated twice in 5 ml of TE 0.1 buffer (10 mM Tris-HCl, 0.1 mM EDTA, pH 8.0) for 15 min at room temperature, then incubated twice in 5 ml of 1X End-Repair reaction buffer (33 mM Tris-acetate, pH 7.8, 66 mM potassium acetate, 10 mM magnesium acetate, and 0.5 mM dithiothreitol) for 30 min at room temperature. Each plug was then mixed with 50 μl of labeling mix (1X End-Repair buffer, 100 μM each of dTTP, dCTP, and dGTP, 84 μM dATP, 16 μM Biotin-14-dATP

(Invitrogen), 1 mM ATP, and 3 μl End-It enzyme mix (Epicentre)) and incubated at 25°C for 1.5 h. The plug was washed twice with 5 ml TE 0.1 buffer at room temperature for 15 min. The agarose was digested by β-agarase (New England Biolabs) according to manufacturer’s recommendation. The released genomic DNA was brought to a final volume of 250-300 μl in

TE 0.1 buffer and sonicated in a Bioruptor (Diagenode) to achieve an average size of smaller than 500 bp. The fragmented DNA was purified by a Qiagen PCR Cleanup Kit, followed by

End-Repair (with unmodified dNTPs) and A-tailing using standard Illumina DNA library construction conditions. The A-tailed DNA was then brought to a volume of 100 μl with EB buffer (Qiagen) for binding with M270 Dynabeads (Life technologies). For each DNA sample,

50 μl of bead suspension was first immobilized on a magnet and washed three times each with

100 μl of 1X Binding and Washing buffer (B&W buffer, 5 mM Tris-HCl, pH 7.5, 0.5 mM

EDTA, pH 8.0, and 1 M NaCl), followed by resuspension in 100 μl of 2X B&W buffer. The

100-μl DNA sample was then mixed with the beads and incubated at room temperature for 30 min with agitation. The DNA-bound beads were washed twice each with 100 μl of 1X B&W

40 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

buffer, then washed twice each with 100 μl of EB buffer. The washed beads were then resuspended in ligation reaction mix containing 3 μM each of Adapter 1 and Adapter 2 and incubated at room temperature overnight with agitation. The beads were washed twice each with 200 μl of 1X B&W buffer and twice each with 200 μl of EB and the associated DNA was then amplified by PCR (18-20 cycles) using multiplexed PE PCR primers (Table S14) using standard procedures recommended by Illumina. The PCR products were purified by AmPure beads (Agencourt) to remove free adapters according to manufacturer’s recommendations and the resulting DNA was subjected to 50-bp or 100-bp paired-end sequencing on the Illumina

HiSeq platform.

Sequencing read mapping and identification of DSB peaks by MACS. Raw sequence reads were mapped onto the UCSC Saccharomyces cerevisiae reference genome, SacCer3

(http://hgdownload.soe.ucsc.edu/downloads.html - yeast), by Burrows-Wheeler Aligner (BWA version 0.6.2, http://bio-bwa.sourceforge.net/) or Bowtie 2 (http://bowtie- bio.sourceforge.net/bowtie2/index.shtml) using default parameters. Those uniquely mapped sequence reads were converted to BAM files using SAMtools (Li et al. 2009)

(http://samtools.sourceforge.net/) and subjected to subsequent analysis with Model-based

Analysis for ChIP-seq (MACS version 1.4.1, http://liulab.dfci.harvard.edu/MACS/).

Identification of DSB peaks using either one-sample or two-sample analysis was performed with default parameters with a P value less than 1E-5. Data visualization was performed with

SeqMonk.

Random simulation test for correlation between DSBs and origins. The locations of origins were obtained from OriDB. Only 606 confirmed or likely origins, but not dubious origins, were

41 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

used for analysis. The association between a DSB and the genomic feature, each defined by a start and an end chromosomal coordinate, was determined by whether the two regions overlap with each other (by at least 1 bp) using the intersectBed function in BEDTools (Quinlan and Hall

2010). The shuffleBed function was used to randomly select the same number of DNA segments per chromosome with the same size (of the DSB regions) distribution as observed in the real data. The usages of these BEDTools functions are:

$ ./intersectBed –a DSBFILE.bed –b GENOMICFEATUREFILE.bed >

DSBGENOMICFEATUREintersection.bed

$ ./shuffleBed –a DSBFILE.bed –g yeast.SacCer3.genome –chrom > SHUFFLEDDSBFILE.bed

The shuffled DSBs were then compared to the genomic feature list for the number of overlaps.

This random simulation was performed 10000 times in each test. The number of simulations in which the randomized DSBs overlapped with the genomic feature at an equal or higher frequency than the real data was divided by the number of simulations to calculate the P value.

A P value of < 0.05 was considered significant to reject the null hypothesis.

42 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Supplemental figure legends

Figure S1. Genome and chromosome views of Break-seq data from mec1 samples in Exp A using SeqMonk. Sequence read counts were scored in a running window of 7 bp and a step window of 7 bp, adjusted for “per million reads” with duplicated reads removed. (A) Genomic views of HU (top) and Recovery (bottom) samples. Note Chr IV is truncated for display. Color scale from blue to red indicate increasing number of sequence reads in a given window. (B) A portion of Chr IV, blocked in (A), is expanded for a chromosome view for the samples indicated.

The ORF or CDS track (red, Watson-strand-encoded; blue, Crick-strand-encoded) is shown on top.

43 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure S2. Control Break-seq experiments demonstrating specificity of DSB labeling. (A)

Chromosome view of a middle section of Chr VII of the Break-seq or “mock Break-seq”

(without the biotinylated dUTP in DSB labeling) data from samples as indicated. The annotation tracks are as shown, with the Watson and Crick strand-coded genes shown in red and blue, respectively, in the gene track. (B) Venn diagrams comparing MACS peak calling using different control samples versus one sample (HU only).

44 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure S3. Comparisons of transcription (A&B) and replication timing (C) at the histone gene loci. The gene expression profiles and the relative position on a time scale for sample collection are indicated, identical to Fig. 4. The histone genes are divided into three groups as indicated based on replication timing in a normal S phase (without HU), extracted from published data

(Alvino et al. 2007). Percentage of replication was derived from and can be roughly equated to percentage of “heavy-light” (HL) DNA in a density transfer experiment. The equivalent times for the HU and R30 samples in the RNA-seq experiment on the normal S phase scale in the replication timing study are as indicated.

45 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Supplemental tables

*Table S2, S4-S10, and S12 are provided as separate Excel files.

Table S1. Break-seq sequence reads analysis. All experiments were performed by 50-bp

(Experiments: Control, B and C) or 100-bp (Experiment A) paired-end sequencing on a HiSeq

2500 platform in a multiplexed format. The sequence reads from the control sample were mapped by Bowtie 2 and those from all other samples were mapped by BWA.

Exp Strain Sample Total reads Coverage Unique reads

Control mec1-1 G1/BamHI 13,000,829 54 7,412,344

(57.01%)

A mec1-1 HU 1h 42,683,360 356 39,196,024

(91.83%)

A mec1-1 Recovery 1h 36,370,088 303 32,474,677

(89.29%)

B mec1-1 HU 1h 42,241,856 176 36,241,977

(85.80%)

B mec1-1 Recovery 1h 34,741,036 145 31,545,932

(90.80%)

C MEC1 HU 1h 25,931,204 108 22,404,639

(86.40%)

C MEC1 Recovery 1h 21,006,796 88 18,187,648

(86.58%)

46 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Table S2. The End-Repaired Sites in a G1-arrested control sample from mec1 that was digested by BamHI in vitro. The coordinates of the End-Repaired Sites were obtained from MACS one- sample analysis. Only the known BamHI sites on chromosomes II, III, VI, and X are shown.

Those sites that are unique to the End-Repaired sites or BamHI sites are listed as "no match" in the corresponding column.

BreakSeq_TableS2.xls

47 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Table S3. Concordance between biological replicate experiments. Log2 transformed number of sequence reads acquired per base normalized by total number of bases sequenced in each sample was calculated for each sample and compared with each other. Pearson’s correlation R value calculated by SeqMonk is reported for each comparison. Sample names are labeled first by Experiment (Exp) A, B or C, followed by strain name and sample name (“HU” and “R” for cells collected at 1 h after HU treatment and 1 h after the removal of HU, respectively).

Exp A Exp A Exp B Exp B Exp C Exp C Correlation mec1_HU mec1_R mec1_HU mec1_R MEC1_HU MEC1_R Exp A mec1_HU 1 0.857 0.937 0.891 0.803 0.813 Exp A mec1_R 1 0.817 0.895 0.698 0.614 Exp B mec1_HU 1 0.888 0.861 0.842 Exp B mec1_R 1 0.772 0.745 Exp C MEC1_HU 1 0.793 Exp C MEC1_R 1

48 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Table S4. Results of one-sample analysis by MACS to identify significant End-Repaired Sites in the HU and Recovery samples in mec1 (Experiments A and B) and MEC1 (Experiment C) cells. "Summit" indicates the peak location of the detected enriched breakage site. "tags" indicates number of sequence reads in peak region. "significance" is -10*log10(p value) for the peak region (e.g. p value = 1e-10, then the “significance” value is 100). "fold enrichment" is fold enrichment for this region against random Poisson distribution with local lambda.

BreakSeq_TableS4.xls

49 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Table S5. Results of two-sample analysis by MACS to identify significant enriched End-

Repaired Sites in the Recovery samples compared to the corresponding HU samples in mec1

(Experiments A and B) and MEC1 (Experiment C) cells. "FDR (%)", False discovery rate (%).

"Summit" indicates the peak location of the detected enriched breakage site. "tags" indicates number of sequence reads in peak region. "significance" is -10*log10(pvalue) for the peak region (e.g. pvalue =1e-10, then this value should be 100). "fold_enrichment" is fold enrichment for this region against random Poisson distribution with local lambda. FDR (%) is calculated by reversing the experimental and control samples for determination of enriched break sites.

BreakSeq_TableS5.xls

50 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Table S6. Genomic features associated with the enriched chromosome break regions in mec1

(Exp A and B) and MEC1 (Exp C) cells. Note that some genomic features are shared by enriched chromosome break regions. The different categories of features other than ORFs are color-coded as indicated.

BreakSeq_TableS6.xls

51 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Table S7. Analysis of the closest genes associated with the common enriched break regions for mec1 cells in Exp A and B. The relationship between each gene and its closest ARS or proARS

(as per OriDB) is listed. Those "Likely ARSs" that have not yet been assigned a proARS or

ARS number are listed by their position, as in "ChrX:Start-End". Those genes that are more than

5 kb away from the nearest ARS are shown in the "Distance to nearest ARS" column in red and bolded. Same analysis of the closest genes associated with each enriched break region for MEC1 cells is also shown.

BreakSeq_TableS7.xls

52 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Table S8. GO term enrichment analysis of "common break-associated genes" in Exp A and B

(for mec1-1 cells) and in Exp C (for MEC1 cells).

BreakSeq_TableS8.xls

53 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Table S9. RNA-seq data after RPKM (reads per kilobase per million reads, Log2 transformed) quantification, reported as Mean and Standard deviation of RPKM for each mRNA feature. The strain and sample are labeled as "strain-sample", as in "Mean mec1-HU".

BreakSeq_TableS9.xls

54 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Table S10. Genes that show significant (p<0.05) increase (red) or decrease (green) in the pair- wise comparative analyses between samples from the RNA-seq experiment.

BreakSeq_TableS10.xls

55 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Table S11. Top GO enrichment (with Benjamini Hochberg correction for multiple testing) groups of genes with indicated changes in expression as a result of HU exposure in MEC1 and mec1 cells. Those groups with a P value greater than 0.05 are shaded in grey.

MEC1 mec1

Molecular Function P value Molecular Function P value

Increased expression in HU relative to G1 sample oxidoreductase activity oxidoreductase activity 4.11E-10 0.0024 [GO:16491] [GO:16491] glyceraldehyde-3-phosphate catalytic activity dehydrogenase (NAD+) 4.80E-06 0.0131 [GO:3824] (phosphorylating) activity [GO:4365] ribonucleoside-diphosphate antioxidant activity reductase activity 0.0003 0.0467 [GO:16209] [GO:4748] Increased expression in R60 relative to HU sample structural constituent of ribosome iron ion binding 7.08E-46 0.0221 [GO:3735] [GO:5506] oxidoreductase activity, acting on structural molecule activity paired donors, with incorporation 7.33E-41 0.0268 [GO:5198] or reduction of molecular oxygen [GO:16705] protein kinase regulator activity structural constituent of cell wall 7.39E-05 0.0746 [GO:19887] [GO:5199] kinase regulator activity lyase activity 8.45E-05 0.0780 [GO:19207] [GO:16829] cyclin-dependent protein serine/threonine kinase regulator heme binding 0.0001 0.0804 activity [GO:20037] [GO:16538] structural constituent of cell wall tetrapyrrole binding 0.0007 0.0804 [GO:5199] [GO:46906]

56 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Table S12. Results of microarray analysis to identify those regions that show significant (> 2 standard deviations) decreases (black entries) or increases (red and bold entries) in the "BPS" sample compared to the "no BPS" sample.

BreakSeq_TableS12.xls

57 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Table S13. Primer sequences for PCR-based cloning of HU-inducible genes. “F” and “R” in primer name denote that the gene is in convergent or divergent orientation relative to the replication fork from ARS1, respectively.

Primer name Sequence (gene-specific sequences in capital letters)

YHK8-F-Ase-5' 5’-ctcgattaatTGGTCCTACATTGGAAACC-3’

YHK8-F-Not-3' 5’-gagtgcggccgcCACCAATGACGTAAGCGGG-3’

YHK8-R-Not-5' 5’-gagtgcggccgcTGGTCCTACATTGGAAACC-3’

YHK8-R-Ase-3' 5’-ctcgattaatCACCAATGACGTAAGCGGG-3’

AHP1-F-Nde-5' 5’-ctcacatatgCCTAAATCATCGGCGTCCC-3’

AHP1-F-Not-3' 5’-gagtgcggccgcTACTCCTTGTACAGAATCG-3’

AHP1-R-Not-5' 5’-gagtgcggccgcCCTAAATCATCGGCGTCCC-3’

AHP1-R-Nde-3' 5’-ctcgattaatTACTCCTTGTACAGAATCG-3’

RNR3-F-Ase-5' 5’-ctcgattaatGAAGGCTTGTTTCAGTTTG-3’

RNR3-F-Not-3' 5’-gagtgcggccgcGGTCTTAATACATACTAACG-3’

RNR3-R-Not-5' 5’-gagtgcggccgcGAAGGCTTGTTTCAGTTTG-3’

RNR3-R-Ase-3' 5’-ctcgattaatGGTCTTAATACATACTAACG-3’

TSA1-F-Ase-5' 5’-ctcgattaatGATAAAGAACTTCGTGCTC-3’

TSA1-F-Not-3' 5’-gagtgcggccgcTAGATAGATAGGTATAAACG-3’

TSA1-R-Not-5' 5’-gagtgcggccgcGATAAAGAACTTCGTGCTC-3’

TSA1-R-Ase-3' 5’-ctcgattaatTAGATAGATAGGTATAAACG-3’

58 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Table S14. Illumina adaptor sequences and multiplexing PCR primer sequences. “P”, phosphate; “Am”, amine; “MC6”, modification on C6; “MO”, modification on O. The shaded regions are where the multiplex indices reside.

Adapter 1 sequence (top) 5’-AmMC6/ACACTCTTTCCCTACACGACGCTCTTCCGATCT-3’

Adapter 1 sequence (bottom) 5’-P-GATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA/AmMO-3’

Adapter 2 sequence (top) 5’-AmMC6/GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3’

Adapter 2 sequence (bottom) 5’-P-GATCGGAAGAGCACACGTCTGAACTCCAGTCAC/AmMO-3’

PCR primer Index 1 CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTC

PCR primer Index 2 CAAGCAGAAGACGGCATACGAGATACATCGGTGACTGGAGTTC

PCR primer Index 3 CAAGCAGAAGACGGCATACGAGATGCCTAAGTGACTGGAGTTC

PCR primer Index 4 CAAGCAGAAGACGGCATACGAGATTGGTCAGTGACTGGAGTTC

PCR primer Index 5 CAAGCAGAAGACGGCATACGAGATCACTGTGTGACTGGAGTTC

PCR primer Index 6 CAAGCAGAAGACGGCATACGAGATATTGGCGTGACTGGAGTTC

PCR primer Index 7 CAAGCAGAAGACGGCATACGAGATGATCTGGTGACTGGAGTTC

PCR primer Index 8 CAAGCAGAAGACGGCATACGAGATTCAAGTGTGACTGGAGTTC

PCR primer Index 9 CAAGCAGAAGACGGCATACGAGATCTGATCGTGACTGGAGTTC

PCR primer Index 10 CAAGCAGAAGACGGCATACGAGATAAGCTAGTGACTGGAGTTC

PCR primer Index 11 CAAGCAGAAGACGGCATACGAGATGTAGCCGTGACTGGAGTTC

PCR primer Index 12 CAAGCAGAAGACGGCATACGAGATTACAAGGTGACTGGAGTTC

59 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure 1

A Cell lysis & End-repair with Elution and protein removal biotinylated dATP fragmentation

Capture on streptavidin beads Illumina TruSeq Illumina PCR adaptor ligation multiplexed paired-end sequencing B

C mec1_HU 1h Exp B vs. Exp A 16 mec1_R 1h Exp B vs. Exp A

14 14

12 12

10 10

8 8

6 6

R = 0.895 4 R = 0.937 4 Log (read counts perbase)(read counts Log Log (read counts perbase)(read counts Log 2 2

0 0 0 2 4 6 8 10 12 14 0 2 4 6 8 10 12 14 16 Log (read counts per base) Log (read counts per base) Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure 2

A 1 2 3 4 5 6 7 HUmec1 (ExpA)

Recoverymec1(ExpA)

HUmec1 (ExpB)

Recoverymec1 (ExpB)

HUMEC1 (ExpC)

RecoveryMEC1 (ExpC)

B DSBs in HU or Recovery C Enriched DSBs All features associated with Protein-coding genes associated samples (Recovery vs. HU) Enriched DSBs with Enriched DSBs

Exp A Exp B Exp A Exp B Exp A Exp B Exp A Exp B Exp A Exp B HU HU Recovery Recovery

53 85 66 54 136 81 94 92 117 542 203 557 337 185 372 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure 3

A Enriched in mec1 Recovery samples B Enriched in MEC1 Recovery sample C ExpC_Collinear COS8 YHL046W-A UGA4 YJL28W RNR2 RRN7 RIM101 ExpC_Convergent ARN2 YJL27C APS3 15 1 MEC1 2 ARS802 ARS405 ARS1013 (not shown) ARS804 10 ~500 bp left distal to ARN2 ~200 bp right distal to UGA4 ~17 kb left distal to RNR2 ~8 kb left distal to RIM101

(Exp A) HUmec1

5 NumberDSBs of

(Exp A) Recoverymec1 0 Start Middle End Relative positions of transcription units

(Exp B) HUmec1 D ExpA&B_Collinear ExpA&B_Convergent 35 (Exp B) Recoverymec1 mec1-1 30

25

(Exp C) HUMEC1 20

15

10 NumberDSBs of MEC1 (Exp C) Recovery 5

0 Start Middle End Relative positions of transcription units Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure 4

A B C HU HU WT mec1

R60

G1 G1 HU R30 R60 R30 HU R60 G1 HU R60 R30 R30 G1 S mec1_ mec1_ WT_G1 WT_R30 WT_R60 WT_HU mec1_ mec1_ mec1_ WT_G1 WT_R30 WT_HU mec1_ WT_R60 mec1_ mec1_ mec1_ WT_R60 mec1_ mec1_ HU WT_G1 WT_R30 WT_HU mec1_ G1 HU R30 R60 G1

Asyn AIM26 PRY1 HUL5 YKL036C PRY3 YLR271W 1C 2C 1C 2C UGP1 BUD23 CAB1 D CYC7 PSR2 ATG12 400 400120 120 YCR047W-A PHO81 TUL1 FMT1 YOR389W WT mec1 WT mec1 IPP1 CYC7 COS12 110 110 350 350 IRC19 YEF1 YJR129C YDR179W-A UTR2 GUP2 CYC7 PSR2 100 100 TDH1 YGR079W BIO4 300 300 RNR3 PKH1 PCA1 YOL046C YDR491C SFT1 90 90 MCH5 IZH1 ZRT1 250 250 PDA1 MZM1 YHL034W-A 80 80 OKP1 UBP12 SBP1 RNR4 ELO1 VEL1 RelativemRNA level 200 RelativemRNA level 200 YJL028W CIS3 70 70 YLL019W-A YJL027C ILV5 YRA2 RNR2 YER088C-A YEL057C 150 15060 60 YJL026C-A YER088W-B CSE4 0 40 80 120 0 40 80 120 RRN7 VPS28 FAD1 Time (min) Time (min) IRC8 NSE4 THI12 G1 HU R30 R60 G1 HU R30 R60 LAP3 YIR020W-A FRE2 YBL077W ATP5 YLL020C E ZRT1 FRE2 ILS1 GEA1 CAP1 150 15040 40 YPR1 PIL1 KRE28 WT mec1 WT mec1 ISW2 35 35 PDC6 RFM1 YOR302W TAT1 NCR1 30 30 CPA1 PDE2 YDR220C ZRT1 FRE2 RAX1 100 100 PSF1 AYR1 25 25 SND1 BEM4 MTG1 RMD6 PSR2 YSN1 20 20 YDL242W MHR1 KAP114 VHT1 SMC4 CDC3 15 15 50 50 YCR023C ARP6 PPM2 PSK2 10 10 YMD8 THP1 RelativemRNA level RelativemRNA level COG1 5 5

0 0 0 0 0 40 80 120 0 40 80 120 Time (min) Time (min) Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure 5

A -BPS +BPS B IV 8 GIN4 8 IV

4 0

0 0 100 200 300 400 500 600 700 800 900 1000 1100 1200 1300 1400 1500 VIII 8 ARN2 8

4 0

0 0 100 200 300 400 500 Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure 6

Change in plasmid retention after HU exposure in at least 3 biological replicates

60

40

20

0

-20

-40 YHK8-F YHK8-R AHP1-F AHP1-R TSA1-F TSA1-R RNR3-F RNR3-R Changein plasmid retention HUafter exposure (%)

Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure 7

Transporter Stress response Other Rad53-unchecked Rad53-checked Replication- DSB genes transcription genes origin of replication origin of replication transcription factor genes (early) (late) clash

MEC1 mec1-1

G1

HU

S

HU ON ON ON ON ON ON

G2

M Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure S1

A HUmec1 (Exp A) I II III IV V VI VII VIII IX X XI XII XIII XIV XV XVI 0 Mb 0.2 Mb 0.4 Mb 0.6 Mb 0.8 Mb 1 Mb Recoverymec1 (Exp A) I II III IV V VI VII VIII IX X XI XII XIII XIV XV XVI 0 Mb 0.2 Mb 0.4 Mb 0.6 Mb 0.8 Mb 1 Mb B CDS

HUmec1 (Exp A)

Recoverymec1 (Exp A) Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure S2 A gene Ori-DB ARSs Rad53-checked origins Rad53-unchecked origins

WT-G1_Break-seq (888,428 reads)

WT-HU_mock Break-seq (2,319,280 reads)

WT-HU_Break-seq (2,470,661 reads)

mec1-G1_Break-seq (2,114,418 reads)

mec1-HU_mock Break-seq (2,353,555 reads)

mec1-HU_Break-seq (2,590,624 reads)

B WT mec1 HU only 0 HU to HU_mock HU only 2 HU to HU_mock (265) 5 0 (472) (43) 2 0 (60) MACS peak calling methods: • HU only, one sample analysis with HU 230 sample 8 31 30 • HU to HU_mock, two sample analysis 242 with HU_mock as control 27 • HU to G1, two sample analysis with G1 as control

*All methods applied p < 0.05 as cut-off HU to G1 HU to G1 (90) (547) Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Figure S3 HTA1 HHF1 HHO1 HTB1 HHT1 HTZ1 HTA2 HHF2 HTB2 HHT2 Distance to A 500 origin (bp) WT_mRNA level HTA1Distance1819 to HHF1 HHO1 HTB1 originEarliest-replicating, (bp)604 HHT1 no HTZ1 400 HTA2HTA1 conflict18192166 with HHF2HHF1transcription HHO1 HTB2Distance1069 to HHT2 HTB1origin (bp)604 HHT1 HTZ1 HTA1100 HHF1HTA2 2166514 HHF2HHO1 HTA1 1819 HHF1 HHO1 HTB1 HHT1HTB2 10691521 HHT2HTZ1 300 ReplicationHTB1 timing604 HHT1 HTZ1 HTA1HTA2 HHF1HHF2 8514514 HHO1 100 HTA2 Mid-replicating,2166 HHF2 higher HTB1HTB2 HHT1HHT2 15217477 HTZ1 HTB2 probability1069 ofHHT2 conflict with HTA1100 HHF1HTA2 ReplicationHHO1HHF2 timing85144985 HTA1 HHF1 transcription514 HHO1 200 HTB1 HHT1HTB210080 HTZ1HHT2 74775435 ReplicationHTB1 timing HHT1 1521 HTZ1 HTA1HTA2 HHF1HHF2 HHO1 4985 100 HTA2 ReplicationHHF2 timing8514 RelativemRNA level HTB1HTB2 HHT1HHT280 HTZ1 5435 HTB2 HHT2 7477 100 HTA2 ReplicationHHF2 timing 100 HTA1 HHF1 HHO1 Relatively4985 late-replicating but HTB210080 HHT2 ReplicationHTB1 timing HHT18060 HTZ1 low5435 HU-induction, no conflict 100 Replication timing with transcription HTA280 HHF2 0 ReplicationHTB2 timing HHT260 -2010080 0 HTA120 40 60HHF180 100 120HHO1 140 HTA1 HHF1 HHO1 HTB1 8060HHT1 HTZ1% HL HTB1 HHT1 HTZ1 G1 ReplicationHTA2 TimeHU (min) HHF2 timingR30 R60 6040 HTA2 HHF2 80 HTB2 HHT2 HTB2 HHT2 B 500 60 % HL C 100 40

mec1_mRNA level% HL Replication timing 8060 6040 % HL 400 80 % HL 4020 60 40 % HL 20 6040 % HL 300 20 60 % HL 40 200 Equivalent 40 time for R30 20 5 % HL 10 15 20 25 30 35 40 45 200% HL 0 40 4020 5 10 15 20Time25 (min)30 35 40 45

RelativemRNA level 200 0 100 20 5 10 15 20 20 25 30 Time35 (min)40 45 0 5 10 15 20 Equivalent25 30 35 40 45 Time (min) time for HU 200 5 10 15 20 25 30 Time35 (min)40 45 0 0 5 10 15 20 25 30 Time35 0 (min)40 45 -20 0 0 20 40 60 5 80 10100 12015 140 20 255 10 30 15 3520 2540 30 4535 40 45 Time (min) 5 10 Time15 (min)20 25 30 Time35 (min)40 45 Time (min) 0 Time (min) 5 10 15 20 25 30 35 40 45 Time (min) Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press

Break-seq reveals hydroxyurea-induced chromosome fragility as a result of unscheduled conflict between DNA replication and transcription

Elizabeth Hoffman, Andrew McCulley, Brian Haarer, et al.

Genome Res. published online January 21, 2015 Access the most recent version at doi:10.1101/gr.180497.114

Supplemental http://genome.cshlp.org/content/suppl/2015/01/16/gr.180497.114.DC1 Material

P

Accepted Peer-reviewed and accepted for publication but not copyedited or typeset; accepted Manuscript manuscript is likely to differ from the final, published version.

Open Access Freely available online through the Genome Research Open Access option.

Creative This manuscript is Open Access.This article, published in Genome Research, is Commons available under a Creative Commons License (Attribution-NonCommercial 4.0 License International license), as described at http://creativecommons.org/licenses/by-nc/4.0/. Email Alerting Receive free email alerts when new articles cite this article - sign up in the box at the Service top right corner of the article or click here.

Advance online articles have been peer reviewed and accepted for publication but have not yet appeared in the paper journal (edited, typeset versions may be posted when available prior to final publication). Advance online articles are citable and establish publication priority; they are indexed by PubMed from initial publication. Citations to Advance online articles must include the digital object identifier (DOIs) and date of initial publication.

To subscribe to Genome Research go to: https://genome.cshlp.org/subscriptions

Published by Cold Spring Harbor Laboratory Press