Identification of Proteins Involved in the Maintenance of Genome Stability

by

Edith Hang Yu Cheng

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy Department of Biochemistry University of Toronto

© Copyright by Edith Cheng 2015

Identification of Proteins Involved in the Maintenance of Genome Stability

Edith Cheng

Doctor of Philosophy

Department of Biochemistry University of Toronto

2015

Abstract

Aberrant changes to the genome structure underlie numerous human diseases such as cancers.

The functional characterization of genes and proteins that maintain stability will be important in understanding disease etiology and developing therapeutics. I took a multi-faceted approach to identify and characterize proteins involved in the maintenance of genome stability.

As biological pathways involved in genome maintenance are highly conserved in evolution, results from model organisms can greatly facilitate functional discovery in humans. In S. cerevisiae, I identified 47 essential gene depletions with elevated levels of spontaneous DNA damage foci and 92 depletions that caused elevated levels of chromosome rearrangements. Of these, a core subset of 15 DNA replication genes demonstrated both phenotypes when depleted.

Analysis of rearrangement breakpoints revealed enrichment at yeast fragile sites, Ty retrotransposons, early origins of replication and replication termination sites. Together, this highlighted the integral role of DNA replication in genome maintenance.

In light of my findings in S. cerevisiae, I identified a list of 153 human proteins that interact with the nascent DNA at replication forks, using a DNA pull down strategy (iPOND) in human cell lines. As a complementary approach for identifying human proteins involved in genome maintenance, I used the BioID technique to discern in vivo proteins proximal to the human BLM-TOP3A-RMI1-RMI2 genome stability complex, which has an emerging role in DNA replication progression. I uncovered 45 proteins proximal to the wildtype complex and catalogued the gains and losses of proximal proteins following the expression of three RMI1 mutants and in conditions of DNA replication stress. Altogether, I used a multi-faceted approach to identify a replication-enriched dataset of genes and proteins that collectively safeguard the genome, a dataset that will be invaluable for functional characterization studies in the future.


Acknowledgements

I would like to thank my supervisor, Grant Brown, for his support and guidance, and for the opportunity to work in his lab. I would also like to thank my committee members, Brigitte Lavoie and Igor Stagljar, for their advice and encouragement throughout the years.

I would like to thank the past members of the Brown lab, especially Jay Yang and Jessica Vaisica, for their contribution to my projects. A special thanks goes to the present members of the Brown Lab for their helpful discussions and for their friendship.

Finally, I would like to thank my friends, my grandmother, my sister and my parents for their constant encouragement.


List of Abbreviations

AML Acute Myeloid Leukemia
BTRR BLM-TOP3A-RMI1-RMI2
BIR Break-Induced Replication
CMG Cdc45-MCM-GINS
CIN Chromosome Instability
CML Chronic Myelogenous Leukemia
CFS Common Fragile Site
CGH Comparative Genome Hybridization
CHEF Contour-clamped Homogeneous Electric Field
CDK Cyclin-Dependent Kinase
D-loop Displacement-loop
dHJ double Holliday Junction
DSBR Double-Strand Break Repair
DSB Double-Stranded Break
dsDNA double-stranded DNA
FACT complex “Facilitates Chromatin Transcription” chaperone complex
GO Gene Ontology
GINS Go-Ichi-Ni-San
GCR Gross Chromosome Rearrangements
HR Homologous Recombination
HU Hydroxyurea
iPOND Identification of Proteins on Nascent DNA
LFQ Label Free Quantification
LTR Long Terminal Repeats
LOH Loss of Heterozygosity
MCM Mini Chromosome Maintenance
MDS Myelodysplastic Syndrome
NCC Nascent Chromatin Capture
NHEJ Non-Homologous End Joining
ORC Origin Recognition Complex
BioID Proximity-dependent Biotin Identification
RPA Replication Protein A
RSZ Replication Slow Zones
RPC Replisome Progression Complex
RBP RNA binding proteins
SGD Saccharomyces Genome Database
SAINT Significance Analysis of INTeractome
SNP Single Nucleotide Polymorphism
SSA Single Strand Annealing
ssDNA single-stranded DNA
SCE Sister Chromatid Exchange
SDSA Synthesis-Dependent Strand-Annealing
Tet allele Tetracycline-regulated allele
UFB Ultra-fine DNA Bridge


Table of Contents

ACKNOWLEDGMENTS
LIST OF ABBREVIATIONS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF APPENDICES

CHAPTER 1 INTRODUCTION
1.1 GENOME INSTABILITY
1.2 DNA REPLICATION
1.2.1 Replication initiation
1.2.2 Replication elongation and the replisome
1.3 IMPEDIMENTS TO THE DNA REPLICATION FORK
1.3.1 Genotoxic Stress
1.3.2 Protein-DNA obstacles
1.3.3 DNA topology and unusual DNA structures
1.3.4 Chromosome fragile sites
1.4 CELLULAR RESPONSE TO STALLED DNA REPLICATION FORKS
1.5 CELLULAR RESPONSE TO DSB FORMATION
1.6 MECHANISMS OF DOUBLE-STRANDED BREAK REPAIR
1.6.1 Non-homologous end joining (NHEJ)
1.6.2 Homologous Recombination (HR)
1.7 BLM-TOP3A-RMI1-RMI2 GENOME MAINTENANCE COMPLEX
1.7.1 The (BLM)
1.7.2 BLM-TOP3A-RMI1-RMI2 (BTRR) core complex
1.7.3 BTRR involvement in higher order multi-complex structures
1.7.4 Role of BTRR in homologous recombination and double Holliday junction dissolution
1.7.5 BTRR in DNA replication progression
1.7.6 BTRR at perturbed DNA replication forks
1.7.7 BTRR in maintenance of common fragile site stability
1.8 RATIONALE

CHAPTER 2 GENOME REARRANGEMENTS CAUSED BY DEPLETION OF ESSENTIAL DNA REPLICATION PROTEINS IN SACCHAROMYCES CEREVISIAE
2.1 SUMMARY
2.2 INTRODUCTION
2.3 RESULTS
2.3.1 Depletion of essential gene products causes spontaneous DNA damage
2.3.2 Depletion of essential gene products causes chromosome loss and rearrangement
2.3.3 Chromosome III rearrangements in essential genome stability mutants
2.3.4 Essential gene product depletion causes genome rearrangements with boundaries at Ty retrotransposons
2.3.5 Boundaries of rearrangements correlate with Ty retrotransposons, LTRs, tRNA genes, early replication origins and replication termination sites
2.3.6 Depletion of NSE1 human homolog reduces replication progression
2.4 DISCUSSION
2.4.1 Comparison of conditional allele screens for genome instability mutants
2.4.2 Essential genes involved in DNA replication are critical for genome stability
2.4.3 Ty retrotransposons and tRNA genes promote chromosome rearrangements
2.4.4 Parallels with human common fragile sites
2.5 METHODS
2.5.1 Yeast strains and media
2.5.2 Fluorescence microscopy
2.5.3 Illegitimate mating assays
2.5.4 Array comparative genome hybridization
2.5.5 CHEF gel electrophoresis and Southern blot analysis
2.5.6 Restriction Digestion and Sequencing Analysis of FS1 and FS2
2.5.7 Enrichment Analyses
2.5.8 High-throughput microscopy of EdU incorporation

CHAPTER 3 IDENTIFICATION OF PROTEINS ENRICHED AT NEWLY REPLICATED DNA
3.1 SUMMARY
3.2 INTRODUCTION
3.3 RESULTS
3.3.1 Isolating known replisome proteins using iPOND
3.3.2 Determining proteins enriched on nascently replicated DNA with iPOND-MS
3.3.3 siRNA screen to identify proteins that play a role in DNA replication progression
3.4 DISCUSSION
3.4.1 Identification of known replisome associated factors by iPOND-MS
3.4.2 Comparison with previous iPOND-MS studies
3.4.3 Newly identified DNA replication factors
3.5 METHODS
3.5.1 Cell Culture
3.5.2 iPOND
3.5.3 Affinity purification and immunoblot analysis
3.5.4 Affinity purification for mass spectrometry
3.5.5 Mass spectrometry
3.5.6 Data analysis
3.5.7 High-throughput fluorescence microscopy

CHAPTER 4 BIOTIN LIGASE TAGGING REVEALS NOVEL PROTEINS PROXIMAL TO THE BLM-TOP3A-RMI1-RMI2 DNA REPLICATION AND REPAIR COMPLEX
4.1 SUMMARY
4.2 INTRODUCTION
4.3 RESULTS
4.3.1 Identification of proximal proteins to wildtype BLM, RMI1 and RMI2
4.3.2 Identification of changes in proximal proteins in RMI1 mutants
4.3.3 Changes in BLM, RMI1 and RMI1 S455N interactions during replication stress
4.3.4 Knockdown of novel BTRR proximal proteins results in decreased replication progression
4.4 DISCUSSION
4.5 METHODS
4.5.1 Plasmid Construction
4.5.2 Cell Culture
4.5.3 Construction of stable cell lines
4.5.4 Biotin Identification (BioID)
4.5.5 BioID affinity purification for mass spectrometry
4.5.6 Mass spectrometry
4.5.7 Data analysis
4.5.8 Fluorescence microscopy for protein localization
4.5.9 iPOND-western
4.5.10 High-throughput fluorescence microscopy

CHAPTER 5 GENERAL DISCUSSION AND FUTURE DIRECTIONS
5.1 NOVEL COMMON FRAGILE SITE MAINTENANCE GENES DETECTED FROM SPONTANEOUS DNA DAMAGE SCREENS IN S. CEREVISIAE
5.1.1 Function of NSE1 in preventing spontaneous DNA damage and chromosome breaks in human cell lines
5.1.2 Do chromosome breaks caused by NSE1 depletion localize to common fragile sites?
5.1.3 Mechanism by which hNSE1 contributes to the maintenance of common fragile site stability
5.2 NOVEL DNA REPLICATION PROTEIN CANDIDATES FROM IPOND-MS
5.2.1 Co-localization of iPOND candidates with replisome proteins
5.2.2 Contributions of iPOND candidates to the dynamics of DNA replication progression
5.2.3 Future applications for iPOND-MS
5.3 BIOID CANDIDATES THAT ARE PROXIMAL TO THE BTRR COMPLEX
5.3.1 Co-localization of BioID candidates with BTRR complex
5.3.2 Function of BioID candidates in DNA replication
5.3.3 Function of BioID candidates in suppression of sister chromatid exchange
5.3.4 Function of BioID candidates in double Holliday junction dissolution
5.3.5 Function of BioID candidates at ultra-fine anaphase bridges and common fragile sites
5.3.6 Future applications of BioID methodology
5.4 CONCLUSION

REFERENCES

List of Figures

Figure 1.1 Pre-replicative complex assembly during replication initiation

Figure 1.2 Activation of the pre-replicative complex in replication initiation

Figure 1.3 Eukaryotic DNA replication fork

Figure 1.4 Impediments to DNA replication progression

Figure 1.5 ATR dependent DNA damage sensing pathway

Figure 1.6 Cellular response to DNA replication fork stalling

Figure 1.7 Mechanisms of DSB repair and genomic consequences

Figure 1.8 Conserved interactions between DNA repair proteins with BLM and Sgs1

Figure 1.9 BTRR function in double Holliday junction dissolution and decatenation

Figure 2.1 Depletion of yeast essential genes results in elevated levels of spontaneous Ddc2 foci formation

Figure 2.2 Depletion of yeast essential genes results in elevated levels of illegitimate mating

Figure 2.3 Classification of rearrangement events that lead to illegitimate mating

Figure 2.4 Comparative genome hybridization microarray analysis of class 3 illegitimate diploids

Figure 2.5 Mechanisms of repair in replication deficient mutants

Figure 2.6 Southern Blot confirmation of chromosome rearrangements predicted in microarray analysis

Figure 2.7 Restriction map and Southern blot analysis of the FS2 region of chromosome III

Figure 2.8 siRNA depletion of hNSE1 in mammalian U2OS cells results in reduced DNA incorporation

Figure 2.9 Comparison of Ddc2 foci and a-like faker genome instability screens

Figure 2.10 Mechanisms by which genome instability occurs in replication deficient mutants

Figure 3.1 Schematic of iPOND methodology

Figure 3.2 Enrichment of replisome protein, PCNA, on nascent DNA by iPOND

Figure 3.3 Proteins enriched on nascently replicated DNA by iPOND-MS

Figure 3.4 siRNA depletion of iPOND candidates causes reduced EdU incorporation

Figure 3.5 Venn diagram comparisons with previous replication fork protein datasets

Figure 4.1 Schematic of BioID methodology

Figure 4.2 Flag-BirA* fusion proteins biotinylate endogenous proteins in HEK293 cells

Figure 4.3 Comparison of BLM, RMI1 and RMI2 protein interactions determined by BioID-MS

Figure 4.4 Comparison of RMI1 and BLM proximal protein interactions determined by BioID

Figure 4.5 Construction of Flag-BirA*-RMI1 mutants

Figure 4.6 Comparison of BioID candidates for RMI1 mutants

Figure 4.7 Comparison of RMI1, RMI1 S455N and BLM BioID candidates in the presence of replication stress

Figure 4.8 Reduced EdU Incorporation following siRNA depletion of RMI1 and BLM BioID candidates

Figure 4.9 Comparison of BLM BioID candidates with BLM protein interactors defined by AP-MS

Figure 5.1 Summary of novel genes and proteins involved in the maintenance of genome stability

List of Tables

Table 1.1 A selection of genes with a role in the maintenance of genome integrity

Table 2.1 Frequencies of illegitimate mating in tetracycline-regulatable promoter conditional alleles grown in the presence and absence of doxycycline

Table 2.2 Classification of Class 3 illegitimate diploid chromosome rearrangements

Table 2.3 Enrichment Analysis of the Correlation between Boundaries of Chromosome Rearrangements (n=14) and Selected Genomic Features

Table 2.4 Yeast strain genotypes

Table 2.5 Primers used to generate probes in Southern blot analysis

Table 2.6 Primers used for PCR amplification and sequencing of FS1 and FS2

List of Appendices

Table 1. List of Tet alleles used in Ddc2 foci microscopy assay

Table 2. Functional descriptions of Tet alleles that displayed elevated Ddc2 foci

Table 3. List of 153 proteins that interact with nascent DNA determined by iPOND-MS

Table 4. List of targeted siRNAs used to test the effect of gene depletion on EdU incorporation intensities

Table 5. List of BTRR candidate interactors determined by BioID-MS

1 INTRODUCTION

1.1 Genome Instability

Cell proliferation involves tightly coordinated biological processes, which act to promote accurate genome duplication and to preserve genome integrity. Coordination of DNA replication, DNA damage sensing, DNA repair and are required for successful cell cycle progression and are critical for the maintenance of genome integrity. Evolutionary conservation of these pathways in lower eukaryotes such as Saccharomyces cerevisiae reflects their importance in cell survival. A comprehensive characterization of the proteins involved in these pathways and their regulation will lead to an understanding of how cells protect themselves against genome instability.

Genome instability refers to a range of genetic alterations. The smallest alterations include nucleotide base substitutions, micro-insertions and micro-deletions. Gross chromosome rearrangements (GCR) include translocations, duplications, inversions and deletions. Chromosome instability (CIN) refers to changes in the chromosome number that reflect either whole chromosome gains or losses, largely due to faulty chromosome segregation. Genome instability can lead to deleterious functional consequences such as improper activation of oncogenes, repression of tumor suppressors or creation of fusion proteins with novel functions (Aguilera and Gomez-Gonzalez, 2008; Mani and Chinnaiyan, 2010). A well-characterized example is the reciprocal translocation between chromosomes 9 and 22, which results in the constitutively activated Bcr-Abl fusion that drives rapid cell proliferation in chronic myelogenous leukemia (CML) (Rowley, 1973). The presence of a GCR at this breakpoint in 95% of CML patients suggests that the genomic sequence itself contributes to making this site highly prone to rearrangements (Rowley, 1973). Many examples of genome instability diseases have been identified, which frequently involve gene mutations in the DNA replication, DNA damage sensing and DNA repair pathways (Table 1.1). Characterization of the proteins in these pathways and the complexity of their regulation will reveal the molecular mechanisms by which genome instability arises and aid in the understanding of disease etiology.


Table 1.1 A selection of genes with a role in the maintenance of genome integrity¹

S. cerevisiae | H. sapiens | Function | Human disease | GCR | HR

Replication
MCM4 | MCM4 | Replicative helicase | Cancer predisposition | Unknown | High
RFA1, 2 and 3 | RPA70, 32 and 14 | Replication factor A | Cancer predisposition | High | High
CAC1, 2 and 3 | CHAF1A and B | Chromatin assembly | Unknown | High | High
ASF1 | ASF1A and B | Chromatin assembly | Unknown | High | High
TOP1 and TOP2 | TOP1 and TOP2 | Topoisomerases | Unknown | Unknown | High
CDC9 | LIG1 | Ligase 1 | Unknown | Unknown | High
POL1-PRI1 and 2 | POLA1-PRIM1 and 2 | Polymerase α and primase | Unknown | High | High
POL3 | POLD1 | Replicative polymerase | Unknown | Unknown | High
POL30 | PCNA | Processivity factor | Unknown | Normal | High
RFC1-5 | RFC1-5 | Clamp loader | Unknown | High | High

DNA damage sensing
MEC1 | ATR | Transducer kinase | Seckel syndrome | High | High
TEL1 | ATM | Transducer kinase | Ataxia telangiectasia; Cancer predisposition | High | High
CHK1 | CHEK1 | Effector kinase | Rare tumors; Cancer predisposition | High | Unknown
RAD53 | CHEK2 | Effector kinase | Li-Fraumeni syndrome variant; Cancer predisposition | High | Unknown
DDC2 | ATRIP | Signaling | Unknown | High | High

DNA repair
SGS1 | BLM and WRN | RecQ helicase | Bloom syndrome; Werner syndrome; Cancer predisposition | High | High
TOP3 | TOP3A and B | Topoisomerase | Unknown | High | High
MRE11 | MRE11 | HR and NHEJ | Ataxia telangiectasia-like disease | High | High
XRS2 | NBS1 | HR and NHEJ | Nijmegen breakage syndrome; Cancer predisposition | High | Unknown
 | BRCA1 | Damage checkpoint mediator | Familial breast and ovarian cancer | High | High
 | FANCA-G, FANCD2, FANCL, BRCA2 | Cross-linkage repair | Fanconi anemia; Cancer predisposition | High | High
H2A | H2AX | Chromatin decondensation | Unknown | High | Unknown
SMC5-SMC6 | SMC5-SMC6 | Cohesion and repair | Unknown | High | Unknown

¹This table lists a selection of genes in which mutations have been shown to increase homologous recombination (HR) and gross chromosomal rearrangements (GCR). Adapted from (Aguilera and Gomez-Gonzalez, 2008).


1.2 DNA replication

1.2.1 Replication initiation

DNA replication is the essential process by which cells duplicate their genome for cell division. The timing and recruitment of specific enzymatic proteins to the DNA must be well coordinated in order to ensure accurate DNA synthesis. The replication process is highly conserved between species and the ordered assembly of the multiple core protein complexes is well characterized (reviewed in Bell and Dutta, 2002; Hubscher, 2009; Masai et al., 2010). Notably, many replication protein functions were first discovered in S. cerevisiae and yeast protein homologs will be highlighted in superscript.

Replication initiation is divided into two main stages and begins with the assembly of the pre-replicative complex (pre-RC) (Figure 1.1). The origin recognition complex (ORC) binds to potential origin sequences and sequentially recruits Cdc6 and Cdt1 in order to facilitate the “loading” of the mini chromosome maintenance (MCM) helicase proteins (Chen et al., 2007; Shen and Prasanth, 2012; Tanaka and Diffley, 2002b; You and Masai, 2008). To ensure that initiation only occurs once per cell cycle, post-translational modification of Cdt1 and Cdc6 by cyclin dependent kinase (CDK) results in protein degradation and nuclear export, respectively (Takeda et al., 2005). Present only in higher eukaryotes, Geminin inhibits Cdt1 as an additional method to prevent re-initiation of replication origins during the cell cycle (Lee et al., 2004).

Figure 1.1 Pre-replicative complex assembly during replication initiation. See text for details. Adapted from (Takeda and Dutta, 2005).

Next, the Go-Ichi-Ni-San (GINS^Sld5-Psf1-Psf2-Psf3) complex is recruited to the pre-RC to allow stable association of Cdc45 with MCM2-7 to form the active Cdc45-MCM-GINS (CMG) helicase complex (Figure 1.2) (Kanemaki and Labib, 2006; Labib and Gambus, 2007; Moyer et al., 2006; Shen and Prasanth, 2012). In budding yeast, Mcm10 recruits Cdc45 to origins of replication and interacts with the Dbf4-dependent kinase (Dbf4-Cdc7) to mediate the phosphorylation of MCM2-7, forming an active replication fork (Figure 1.2) (Kanke et al., 2012; Lee et al., 2003b; Merchant et al., 1997; van Deursen et al., 2012). Dbf4-Cdc7 also phosphorylates Sld3 to mediate the Sld3-Cdc45 interaction at the pre-RC (Heller et al., 2011; Yabuuchi et al., 2006). Dpb11 acts to scaffold both Sld3-Cdc45 and a pre-formed Sld2-GINS-DNA polymerase ε complex to MCM2-7 (Tak et al., 2006; Tanaka et al., 2007). While the human homologs for Sld2 and Dpb11 have been described (RECQ4 and TOPBP1, respectively), a functional ortholog of Sld3 has not been identified (Shen and Prasanth, 2012). Additional replisome progression complex (RPC) components assemble at the site of initiation prior to DNA synthesis, including the fork stabilization factors Claspin^Mrc1, Timeless^Tof1 and Tipin^Csm3 (Gambus et al., 2006; Shen and Prasanth, 2012). Furthermore, the interaction of the MCM complex with And1^Ctf4 and the “facilitates chromatin transcription” (FACT) histone chaperone complex is important in promoting DNA unwinding of nucleosome-bound templates at origins (Gambus et al., 2006). Ultimately, the ordered assembly of all of these protein factors is required to initiate DNA synthesis. Unwinding of the DNA by the resulting RPC leads to the initiation of DNA replication by the large protein complex at the active replication fork termed the “replisome” (Masai et al., 2005; Pacek et al., 2006; Pacek and Walter, 2004; Tan et al., 2006).

Figure 1.2 Activation of the pre-replicative complex in replication initiation. See text for details. Adapted from (Takeda and Dutta, 2005).

1.2.2 Replication elongation and the replisome

The eukaryotic replisome is the core multi-component protein complex that mediates proper DNA replication progression at the DNA replication fork (Figure 1.3). Unwinding of double-stranded DNA (dsDNA) is mediated by MCM2-7 and subsequently replication protein A (RPA) is recruited to coat the uncovered single-stranded DNA (ssDNA) (Fanning et al., 2006; Wold, 1997). This allows the DNA polymerase α-primase complex (Polα) to access the replication fork and synthesize the short RNA-DNA primers required for replication to begin. DNA elongation is mediated by the replicative polymerases (Polδ and Polε) (Hubscher et al., 2002; McInerney et al., 2007; Nick McElhinny et al., 2008; Pursell et al., 2007; Walter and Newport, 2000). Stable association of processive polymerases with the template is mediated by the replication processivity clamp, proliferating cell nuclear antigen (PCNA) (Jonsson and Hubscher, 1997; Moldovan et al., 2007), which is loaded onto DNA by the replication factor C complex (RFC) (Cullmann et al., 1995; Ellison and Stillman, 2001; Fien and Stillman, 1992; Maga et al., 2000). During DNA synthesis on the discontinuous lagging strand, primer RNA displacement results in a flap intermediate that is processed by flap endonuclease 1 (FEN1^Rad27), DNA2 endonuclease, RNaseH, Polδ and DNA ligase (LIG1) (Kao et al., 2004; Levin et al., 1997; Liu et al., 2004; Stewart et al., 2006). Tight regulation of this process ensures that only the DNA ahead of the replication fork is unwound, while the remainder of the genome remains double-stranded and packaged. In the following sections we will discuss other proteins that travel with the replisome to maintain stable replication fork structure.

Figure 1.3 Eukaryotic DNA replication fork. Schematic of replisome proteins at a DNA replication fork. MCM helicase unwinds dsDNA allowing recruitment of RPA, which binds exposed ssDNA. This open conformation allows access to Polα primase for primer elongation and replicative polymerases Polδ and Polε for DNA synthesis. PCNA is loaded by the RFC clamp loader complex to enhance processivity of DNA replication. Discontinuous fragments on the lagging strand are processed by Rad27 (FEN1 in humans), Dna2 helicase, RNaseH and Lig1 DNA ligase. Topoisomerases (Top2), checkpoint proteins Mrc1, Tof1 and Csm3 (CLASPIN, TIMELESS and TIPIN in humans) and Rrm3 helicase (unknown in humans) travel with the replisome to promote replication fork stability. Adapted from (Branzei and Foiani, 2010).


1.3 Impediments to the DNA replication fork

1.3.1 Genotoxic Stress

Genotoxic stress refers to stress that affects the integrity of the genome. Exogenous sources of genotoxicity include exposure to UV and ionizing radiation, and chemical agents. To kill rapidly proliferating cells, most cancer therapeutics are genotoxins that target the DNA replication pathway. These include alkylating agents (modification of nucleotide bases), intercalating compounds (insertion between nucleotides) and inhibitors (inhibition of essential DNA replication proteins) (Figure 1.4A). On the other hand, there are many endogenous obstructions to DNA replication progression that normal cells must overcome during the cell cycle to prevent mutations and cancer development. Prolonged replication fork stalling can result in replisome disassembly, exposing ssDNA gaps at stalled forks, which are unstable, susceptible to fork reversal and prone to collapse (Figure 1.4B). These events may lead to the formation of double-stranded breaks (DSB), which are a major source of genome instability if they are not accurately repaired (Mankouri et al., 2013). Incomplete genome duplication may also lead to problems in mitosis and a failure to complete cell division. Identification of gene products that prevent spontaneous DNA damage and genome instability will be the main focus of this thesis.

Figure 1.4 Impediments to DNA replication progression Schematic examples of DNA replication fork hindrances. These obstacles can interfere with the completion of DNA replication in S-phase, leading to subsequent problems in mitosis and genome instability. Impediments include A) DNA adducts; B) DNA repair intermediates; C) proteins bound to DNA, such as histones; D) collisions between DNA replication and transcription machinery; E) DNA topological stress such as at two converging replication forks; F) intra-molecular ssDNA secondary structures such as those at R-loops and common fragile sites. Adapted from (Mankouri et al., 2013)


1.3.2 Protein-DNA obstacles

One source of replication fork blockage is the presence of protein-DNA complexes. Histone-DNA complexes in the form of nucleosomes are present along the length of the genome and need to be disassembled ahead of the DNA replication fork during elongation (Figure 1.4C). The human INO80 ATPase of the chromatin re-modeling complex is one example of a protein that is localized to DNA replication forks and prevents highly ordered chromatin structures from pausing the replication fork. Depletion of INO80 results in abnormal chromosome segregation and multinucleation, suggesting its involvement with the maintenance of genome integrity (Hur et al., 2010).

The replisome and transcription protein machinery share the same template DNA and can collide and become displaced from DNA (Figure 1.4D) (Bermejo et al., 2012; Branzei and Foiani, 2010). One example of a protein that prevents this type of disruption to replication is Fob1 from S. cerevisiae. Fob1 acts as a barrier that stops replication fork entry into the highly transcribed rRNA genes in order to prevent head-on collisions. Fob1 deficient strains had elevated extra-chromosomal circles and decreased rRNA gene copies in the genome, representing hyper-recombination and genome instability (Takeuchi et al., 2003).

1.3.3 DNA topology and unusual DNA structures

The DNA topology ahead of the replication fork is another type of physical barrier to DNA synthesis. Helicase unwinding at the replication fork creates positive supercoils ahead of the fork, which must be removed by topoisomerases to allow the progression of the replisome (Figure 1.4E)(Bermejo et al., 2012; Branzei and Foiani, 2010). In S. cerevisiae, the topoisomerases, Top1 and Top2, travel with the replication fork to remove the torsional stress and to promote stability (Bermejo et al., 2007). Similar to Fob1 deficient strains in the previous section, TOP1 and TOP2 double mutants excise rDNA repeats as extrachromosomal circles due to improper recombination, again suggesting elevated genome instability (Kim and Wang, 1989).

A recent theory has implicated transcriptional R loops in the induction of genome instability through interference with DNA replication (Aguilera, 2002; Aguilera and Garcia-Muse, 2012; Castellano-Pozo et al., 2012; Gan et al., 2011; Skourti-Stathaki and Proudfoot, 2014). During transcription, mRNA can hybridize with the template dsDNA, leaving ssDNA lacking complementary sequence. Together, these three-stranded structures are termed R loops, which are barriers to the replisome (Figure 1.4D). These unusual DNA structures are highly susceptible to DNA damage and DSB formation, which could lead to chromosome rearrangements through inaccurate repair (Aguilera, 2002; Aguilera and Garcia-Muse, 2012). Two types of protein surveillance factors mitigate this problem: factors that prevent formation of R loops and factors that actively remove R loops. For example, the THO complex interacts with the transcription machinery and directly links mRNA packaging with the nuclear export machinery, preventing formation of DNA/RNA hybrids (Aguilera, 2002). The THO complex is important for genome stability as mutants exhibit a hyper-recombination phenotype (Huertas and Aguilera, 2003). By contrast, RNaseH specifically cleaves the RNA of the RNA/DNA hybrid within R loops, freeing the complementary DNA sequences and thereby removing the unusual structures that obstruct DNA replication (Chon et al., 2013). R loop formation resulting from replication-transcription collisions is prevalent in long human genes, particularly at human common fragile sites, which will be discussed next (Helmrich et al., 2011).

1.3.4 Chromosome fragile sites

Natural sequences of the genome can also be barriers to DNA replication. Chromosome fragile sites are genomic regions that are sensitive to formation of ssDNA gaps and DNA breaks (Durkin and Glover, 2007). Rare fragile sites are present in a small proportion of the population and increased breakage at these sites is due to expansion of nucleotide repeats (Durkin and Glover, 2007). By contrast, common fragile sites (CFSs) are present in the normal chromosome structure of all individuals and represent sequences that are inherently difficult to replicate (Durkin and Glover, 2007). To date, over 200 CFSs have been detected by visualization of breaks and gaps in metaphase spreads following treatment of lymphocytes with aphidicolin, an inhibitor of replicative DNA polymerases (Glover et al., 2005; Mrasek et al., 2010). Furthermore, systematic SNP mapping studies have shown that out of the ten most common focal deletions in cancer tissues and cancer cell lines, five map to CFSs. FRA3B is the most highly damaged CFS. Not surprisingly, it is located within the FHIT tumor suppressor gene, which is involved in cellular response to DNA damage.

An active area of study is the mechanism by which breaks at CFSs arise. One theory is that the inhibition of replicative polymerases causes the helicase to travel uncoupled from the replisome, leaving long stretches of ssDNA that can fold upon themselves to make secondary structures (Figure 1.4F) (Durkin and Glover, 2007; Franchitto and Pichierri, 2011; Glover et al., 2005). In vitro studies of the highly flexible A/T rich sequences at the FRA16B CFS revealed their ability to form stable secondary hairpin structures (Zhang and Freudenreich, 2007). However, genome-wide positional analyses of CFSs do not suggest that they are significantly enriched in sequences that are more prone to form secondary structures than non-fragile site regions (Debatisse et al., 2012). Instead, CFSs coincide with regions with late replication completion and a paucity of replication initiation events (Debatisse et al., 2012; Letessier et al., 2011). Thus, an alternate theory is that genome instability at CFSs reflects abnormal distribution of initiation and problems in replication completion (Debatisse et al., 2012).

There is evidence to support that CFSs are evolutionarily conserved, as the replication dynamics at replication slow zones (RSZs) of S. cerevisiae are similar to CFSs in humans (Cha and Kleckner, 2002). While RSZs remain intact in wildtype cells, deficient expression of a protein that senses damaged or un-replicated DNA, Mec1, resulted in increased chromosomal breakage at these sites (Cha and Kleckner, 2002). Similarly, deficient ATR (the human Mec1 homolog) expression in mammalian cells caused increased breaks at CFSs even in the absence of DNA replication inhibitors (Wan et al., 2010). While the distinct determinants of RSZs and CFSs and the mechanistic features that contribute to their genome instability are still unknown, a shared property is that they have attributes that disturb natural DNA replication progression, emphasizing the importance of DNA replication itself to the maintenance of genome stability. Taken together, impediments to DNA replication, particularly at CFSs, cause genome instability that is exhibited in human cancers.

1.4 Cellular response to stalled DNA replication forks

When DNA polymerase encounters various barriers to replication progression, it has the potential to uncouple from the MCM helicase and cause the accumulation of unwound ssDNA on the stalled strand (Byun et al., 2005; Ghosal and Chen, 2013). Coating of ssDNA with RPA is the first signal for downstream recruitment of proteins that stabilize the replication fork and prevent dissociation of replisome components (Figure 1.5) (Ghosal and Chen, 2013; Jones and Petermann, 2012). This step is crucial as it has been demonstrated that exposed ssDNA from partial reduction of the nuclear RPA pools causes accelerated replication fork breakage (Toledo et al., 2013). Next, Ataxia-Telangiectasia mutated and RAD3-related (ATR^Mec1) kinase and ATR-interacting protein (ATRIP^Ddc2) are recruited to RPA coated ssDNA through a direct interaction between RPA and ATRIP (Cimprich and Cortez, 2008; Ghosal and Chen, 2013; Zou and Elledge, 2003). This complex recruitment causes the partial activation of ATR via auto-phosphorylation (Liu et al., 2011). RPA coated ssDNA also recruits the RAD17-replication factor C (RFC) clamp loader complex, which facilitates the loading of the RAD9-RAD1-HUS1^Rad17-Mec3-Ddc1 (9-1-1) complex at the ssDNA and dsDNA junction (Ellison and Stillman, 2003; Zou et al., 2002; Zou et al., 2003). This complex recruits topoisomerase IIβ-binding protein 1 (TOPBP1) to the damaged fork, which interacts with ATR in order to stimulate its full activation (Burrows and Elledge, 2008; Delacroix et al., 2007; Lee and Dunphy, 2010; Mordes et al., 2008). Activated ATR then phosphorylates replisome components such as TIMELESS^Tof1 and TIPIN^Csm3 (Chou and Elledge, 2006; Gotter et al., 2007; Unsal-Kacmaz et al., 2007; Yoshizawa-Sugata and Masai, 2007) to prevent the dissociation of replication machinery at stalled replication forks. Recently, SMARCAL1 and AH2 have also been shown to be recruited to RPA coated regions to stabilize the replication fork by directly re-annealing the stably unwound DNA (Bansbach et al., 2009; Ciccia et al., 2009; Postow et al., 2009; Weston et al., 2012; Yuan et al., 2012; Yusufzai et al., 2009). Lastly, ATR dependent phosphorylation of CLASPIN^Mrc1 mediates the recruitment of CHK1^Rad53 to the RPA coated ssDNA (Chini and Chen, 2003; Jeong et al., 2003; Lee et al., 2003a; Liu et al., 2006). CHK1^Rad53 dissociates from chromatin after activation by ATR mediated phosphorylation and acts as a major effector kinase with numerous target substrates (Cimprich and Cortez, 2008; Ghosal and Chen, 2013; Niida et al., 2007; Smits et al., 2006).

Figure 1.5 ATR dependent DNA damage sensing pathway. Schematic of protein recruitment to a stalled replication fork. ATR is recruited to RPA bound ssDNA and phosphorylates RAD17, which is associated with the 9-1-1 complex. Subsequently, TOPBP1 is recruited to fully activate ATR. ATR mediated phosphorylation of CHK1 leads to cellular responses that stabilize the replication fork. One example is mediating S-phase cell cycle arrest through phosphorylation of CDC25A. Adapted from (O'Sullivan and Karlseder, 2010).

The resulting protein coated substrates and subsequent checkpoint signaling prevent cell cycle progression, direct the removal of the barrier, slow origin firing and retain replisome conformation at the fork for resumption of DNA synthesis (Figure 1.6) (Cimprich and Cortez, 2008; Ghosal and Chen, 2013). However, under prolonged replication inhibition, the replication machinery is vulnerable to dissociation and can undergo replication fork collapse (Cimprich and Cortez, 2008; Saintigny et al., 2001). If the replisome is not reloaded, ssDNA and other unstable structures at the fork are highly susceptible to nucleases such as MUS81-EME1, leading to DSB formation (Figure 1.6) (Hanada et al., 2007; Minocherhomji and Hickson, 2014; Paulsen and Cimprich, 2007; Pepe and West, 2014). As such, DNA damage signaling proteins play an essential role in the maintenance of genome stability, and mutations in these proteins have been linked to various genome instability diseases (Table 1.1).

Figure 1.6 Cellular response to DNA replication fork stalling. DNA replication fork stabilization (right) requires the continued association of polymerases to the stalled fork, inhibition of MCM helicase and inhibition of DNA repair machinery. Preservation of the appropriate replisome conformation allows the opportunity for replication resumption following barrier removal or lesion bypass. However, if the stalled fork is not stabilized (left), DNA polymerases can dissociate and cause ssDNA accumulation. Fork recovery involves reloading of DNA polymerases to the replisome and removal of the barrier. If this fails to occur, ssDNA and other structures that form at the fork may be cleaved by nucleases, resulting in double-stranded breaks. These can be used to restart replication through various DNA repair mechanisms, but inappropriate repair can lead to gross genome instability. Adapted from (Paulsen and Cimprich, 2007)

1.5 Cellular response to DSB formation

Formation of DSBs threatens genome stability. Similar to the cellular response to stalled replication forks, cells coordinate the timely recruitment of multi-protein structures to DSBs to aid their repair (Cimprich and Cortez, 2008; Lisby and Rothstein, 2009). One of the first protein complexes to be recruited to a DSB is the MRN complex (MRE11–RAD50–NBS1^Mre11-Rad50-Xrs2) that bridges the two ends of the break (Shiloh and Ziv, 2013; Stracker and Petrini, 2011). The MRN complex acts in conjunction with 53BP1 and BRCA1 to activate the ataxia-telangiectasia mutated (ATM^Tel1) kinase at sites of DSBs (Falck et al., 2005; Lee et al., 2010; Lee and Paull, 2005). Activated ATM phosphorylates histone H2AX adjacent to the DSB (γH2AX^pH2A) (Rogakou et al., 1998), which subsequently binds to the sensor protein, MDC1 (Stucki et al., 2005). MDC1 is in turn phosphorylated by ATM and leads to the rapid expansion of chromatin bound γH2AX on both sides of the DSB, promoting additional recruitment of ATM, MRN and MDC1 to propagate the signal (Jungmichel et al., 2012; Liu et al., 2012; Lou et al., 2006; Luo et al., 2011; Savic et al., 2009). MDC1 also stimulates recruitment of the E3 ubiquitin ligases, RNF8-RNF168, which in turn specify the ubiquitylation of H2AX (Huen et al., 2007; Kolas et al., 2007; Mailand et al., 2007). Ubiquitylated H2AX is next bound by RAP80, which signals a second wave of protein recruitment to sites of DSBs, including the BRCA1-Abraxas-BRCC36-BARD1 complex (Wang et al., 2007). Alternatively, recruitment of the BRCA1-BACH1-CtIP^Sae2 complex can occur. The nuclease activities of CtIP^Sae2 and MRE11^Mre11 promote resection of the DSB ends (Lengsfeld et al., 2007; Lisby and Rothstein, 2009), which are further resected by BLM^Sgs1 and DNA2^Dna2 nucleases or EXO1^Exo1 nuclease to create ssDNA (Mimitou and Symington, 2008; Zhu et al., 2008). Next, RPA coats the exposed ssDNA to trigger a downstream DSB repair response. The choice between preservation of blunt DSB ends or initiation of end resection is important for determining the method of DSB repair, which will be discussed next. However, it is important to note that proteins in the DSB response cascade described in this section play a crucial role in genome stability and mutations are often linked with genome instability diseases (Table 1.1).

1.6 Mechanisms of double-stranded break repair

Non-homologous end joining (NHEJ) and homologous recombination (HR) are the two main methods for DSB repair (Figure 1.7) (Pardo et al., 2009). The choice between the NHEJ and HR repair pathways depends on the stage of the cell cycle at which the DSB occurs (Pardo et al., 2009). While NHEJ can act at any stage of the cell cycle, HR is preferred in late S or G2 when sister chromatids are available as DNA templates (Pardo et al., 2009). The regulation and recruitment of protein machinery to DSBs is highly coordinated to mediate accurate repair. DSBs are a main source of genome instability if they are inaccurately repaired.

Figure 1.7 Mechanisms of DSB repair and genomic consequences. A) Non-homologous end joining (NHEJ) is a homology independent repair pathway. The processed DNA ends are ligated, but this process can lead to loss or gain of nucleotides (purple dashed lines). B) Several mechanisms of homology mediated DNA repair exist and each involves MRN mediated end resection to generate ssDNA tails. If a DSB is flanked by two homologous DNA sequences, it can be repaired by single-strand annealing (SSA). C) If only one end of the DSB is able to find homology, break-induced replication (BIR) is the preferred method of repair. A processed single stranded end of broken DNA invades the homologous chromosome to form a displacement loop (D-loop) and normal replication machinery performs unidirectional DNA synthesis using the template sequence. D) Two DSB repair methods involve both ends of the DSB. Synthesis-dependent strand-annealing (SDSA) leads to a non-crossover outcome, whereas E) double-strand break repair (DSBR) can result in either a non-crossover or crossover outcome. Non-allelic homologous recombination can result in gross chromosomal rearrangements. Adapted from (Aguilera and Gomez-Gonzalez, 2008; Mani and Chinnaiyan, 2010)

1.6.1 Non-homologous end joining (NHEJ)

In NHEJ, two ends of a DSB are re-ligated together, without the requirement of homologous sequences (Figure 1.7A) (Pardo et al., 2009). MRN, KU70/KU80 and DNA dependent protein kinase catalytic subunit (DNA-PKcs) are recruited to the two ends of the DSB for stabilization and prevention of end resection (Panier and Boulton, 2014). Subsequently, XLF-XRCC4-Ligase IV (LIG4) are recruited to perform the ligation (Wilson et al., 1997). This method of repair may lead to small deletions, inversions and chromosomal translocations (Daley et al., 2005; Pardo et al., 2009).

1.6.2 Homologous Recombination (HR)

In contrast to NHEJ, DSB repair by HR requires homology between the breakage site and the template DNA for repair (Pardo et al., 2009). In the first step of HR, the 5’ ends of the DSB are resected by MRN^MRX (Moreau et al., 2001), CtIP^Sae2 (Sartori et al., 2007) and EXO1^Exo1 (Tsubouchi and Ogawa, 2000). This generates long RPA coated 3’ single-stranded DNA that interacts with BRCA2 (Rad52 in yeast) (Alani et al., 1992; Davies and Pellegrini, 2007). Next, RPA is displaced by the assembly of long chains of RAD51 to direct the search for homologous sequence (Song and Sung, 2000). In S. cerevisiae, Rad54, Rad55 and Rad57 are recruited to stabilize the filament in preparation for homology search (Sugawara et al., 2003; Sung, 1997). The resulting Rad51 filament is primed to perform strand invasion, which refers to the displacement of one strand of a DNA duplex to form a heteroduplex displacement-loop (D-loop). From this stage, several outcomes can occur dependent on whether one strand (BIR and SDSA), both strands (DSBR) or neither strand (SSA) are able to invade homologous sequence (Figure 1.7). In most cases the resolution of recombination intermediates by HR results in an error free product. However, homologous sequence can sometimes be found at ectopic locations of the genome, resulting in various genome rearrangements.

1.6.2.1 Single Strand Annealing (SSA)

If the 3’ ssDNA filament is unable to find homologous sequence, 5’ strand resection can be extended until it uncovers direct complementary sequences that can anneal together, termed single strand annealing (SSA) (Figure 1.7B) (Pardo et al., 2009). The remaining single stranded sequences can be cleaved by the XPF-ERCC1 flap endonuclease (Adair et al., 2000; Motycka et al., 2004) and the gaps filled by the replication machinery (Pardo et al., 2009). In S. cerevisiae, the Rad1-Rad10 flap endonuclease also requires the action of Msh2, Msh3 and Slx4 to complete SSA (Fishman-Lobell and Haber, 1992; Flott et al., 2007; Saparbaev et al., 1996). SSA results in the deletion of one DNA repeat and sequences in between the two repeats, but it is useful when DSBs occur within repeated elements (Pardo et al., 2009).

1.6.2.2 Break induced replication (BIR)

When only one end of the DSB can anneal to the homologous genomic sequence or if one end of the broken DNA molecule is lost, the preferred method of HR is BIR (Figure 1.7C). At the site of strand invasion, the replication machinery performs unidirectional DNA synthesis using the template sequence (Pardo et al., 2009). In S. cerevisiae, BIR requires the essential DNA replication initiation components Pol32, Cdc45, GINS, Mcm2-7 and Cdt1 (Lydeard et al., 2010). BIR can result in loss of heterozygosity (LOH) if the strand invasion occurs on a homolog. BIR can also result in non-reciprocal translocations, insertions and deletions if an ectopic chromosome sequence is used (Bosco and Haber, 1998).

1.6.2.3 Synthesis dependent strand annealing (SDSA) and DSB repair (DSBR)

In SDSA, the 3’ invading end is extended by DNA polymerase and eventually displaced to allow re-annealing with the complementary strand at the second end of the DSB (Pardo et al., 2009). BLM^Sgs1 and Srs2 (in budding yeast) displace Rad51 from the single-stranded DNA as a mechanism to disrupt the D-loop in SDSA (Bugreev et al., 2007; Veaute et al., 2003). This is an error free mechanism that forms non-crossover DNA products (Figure 1.7D) (Pardo et al., 2009).

In DSBR, the second ssDNA tail also invades the dsDNA duplex (second-strand invasion) or anneals to the displaced DNA strand that is produced by DNA synthesis from the first ssDNA (second end capture). This results in cross-stranded structures called double Holliday Junctions (dHJs) (Pardo et al., 2009). The resolution of dHJs can be mediated by GEN1, MUS81-EME1 or SLX1-SLX4 endonuclease cleavage (Fekairi et al., 2009; Ip et al., 2008; Munoz et al., 2009; Svendsen et al., 2009; Taylor and McGowan, 2008) that could result in either non-crossover or crossover products, depending on the cleavage site (Figure 1.7E) (Pardo et al., 2009). By contrast, BLM-TOP3A-RMI1-RMI2 (BTRR) mediates dHJ dissolution, which is the preferred method of DNA repair as it results solely in non-crossover products (Wu and Hickson, 2003). The study of the BTRR complex and its interaction partners will be a focus of this thesis, as BTRR plays an integral role in the maintenance of genome stability.

1.7 BLM-TOP3A-RMI1-RMI2 genome maintenance complex

1.7.1 The Bloom syndrome protein (BLM)

As described in previous sections, cells have a highly regulated system of pathways to repair spontaneous DNA damage that occurs during the cell cycle. Enzymatic proteins are highly connected in protein interaction networks and precisely coordinated by sequential binding and release of partner proteins. Notably, BLM is an important protein that is involved in many protein pathways that preserve genome stability (Larsen and Hickson, 2013). BLM belongs to the RecQ DNA helicase family of proteins and plays a highly conserved role in DNA repair throughout evolution. Deficient expression of BLM gives rise to Bloom syndrome, which is characterized by dwarfism, sensitivity to sunlight and immunodeficiency (Hickson, 2003). Bloom syndrome is also associated with cancer predisposition (German et al., 2007), suggesting the importance of functional BLM proteins in the maintenance of genome stability. The diagnostic molecular phenotype of Bloom syndrome is a ten-fold elevated frequency of sister chromatid exchanges (SCEs), representing crossover events during recombination (Chaganti et al., 1974), together with increased chromosome breaks and chromosome rearrangements (German et al., 1965).

1.7.2 BLM-TOP3A-RMI1-RMI2 (BTRR) core complex

BLM and its functional homolog in budding yeast, Sgs1, have a conserved domain structure that facilitates binding of various proteins (Figure 1.8) (Bernstein et al., 2010; Larsen and Hickson, 2013). BLM has several functionally conserved interaction partners that together form a core complex. BLM physically interacts with the type IA topoisomerase, TOP3A^Top3 (Gangloff et al., 1994; Wu et al., 2000), and binds directly to an OB-fold (oligonucleotide-binding fold) domain-containing protein, RMI1^Rmi1 (Chang et al., 2005; Mullen et al., 2005; Raynard et al., 2008; Yin et al., 2005). In humans, RMI1 binds another OB-fold containing protein, RMI2, which is not conserved in S. cerevisiae (Singh et al., 2008; Xu et al., 2008). In vivo, these four proteins form the BTRR complex, which exists in distinct sub-nuclear compartments and can be co-immunoprecipitated (Xu et al., 2008; Yang et al., 2012). Reduced protein expression or mis-expression of these proteins results in increased SCE and chromosome breaks (Chaganti et al., 1974; Singh et al., 2008; Xu et al., 2008; Yin et al., 2005), suggesting that a functional BTRR complex is crucial for preventing Bloom syndrome.

Figure 1.8 Conserved interactions between DNA repair proteins with BLM and Sgs1. The domain structure of BLM and its functional homolog in S. cerevisiae, Sgs1, is conserved (legend of domains shown). Many physical interactions with DNA repair proteins (colored ovals) are also evolutionarily conserved. Of these, the BTRR core complex is comprised of BLM^Sgs1, TOP3A^Top3 and RMI1^Rmi1. In H. sapiens, RMI2 is an additional member of the core complex. Although many more BLM and Sgs1 interaction partners exist, only those DNA repair proteins with mapped interaction regions are displayed. Adapted from (Bernstein et al., 2010)

1.7.3 BTRR involvement in higher order multi-complex structures

The important contribution of BTRR in preventing genome instability is illustrated by its many interactions with non-core proteins involved in DNA replication (CAF-1), DNA damage sensing (ATR, ) and DNA repair (RAD51, MLH1) (Figure 1.8) (Bernstein et al., 2010). Coordinated interaction with these protein components may dictate the specificity of BTRR core complex function in various genome stability pathways. Several multi-complex structures involving BLM have been identified by affinity purification and mass spectrometry (Meetei et al., 2003; Wang et al., 2000). The BRAFT complex consists of BLM, TOP3A, RPA and five Fanconi anemia complex proteins, which are proteins involved in DNA repair (Meetei et al., 2003). Another higher order protein complex termed BASC (BRCA1-associated genome surveillance complex) consists of BLM, MSH2, MSH6, MLH1, ATM, MRE11-RAD50-NBS1 and RFC, which are proteins required for DNA damage sensing and repair (Wang et al., 2000).

While many BLM protein interactions have been identified, the interaction profiles of the remainder of the core complex have not been explored. One focus of this thesis will be to use novel techniques to catalog these protein interactions and study their coordination under different replication stress conditions in order to reveal a global picture of the BTRR complex interaction network. The current understanding of BTRR complex functions in various genome stability pathways is highlighted in the following sections.

1.7.4 Role of BTRR in homologous recombination and double Holliday junction dissolution

The best characterized role of BTRR is its ability to suppress DNA crossover events, the loss of which can be visualized as dramatically elevated levels of SCE in Bloom syndrome patients. It does so through a process termed dHJ dissolution, whereby DNA duplexes are unlinked and resolved without any exchange of sequences (Figure 1.9A) (Cejka et al., 2010; Larsen and Hickson, 2013). Distinct from the classical dHJ resolution that leads to a 1:1 ratio of crossovers to non-crossovers, dHJ dissolution involves two enzymatic steps and only leads to non-crossover events (Figure 1.7E). First, the BLM helicase activity catalyzes the branch migration of HJs to form a hemicatenane of two duplex DNAs interlinked by catenated single strands (Wu and Hickson, 2003). Then, decatenation is facilitated by TOP3A, which disconnects the duplex DNAs to complete the dissolution pathway (Figure 1.9B) (Wu and Hickson, 2003). This pathway is solely dependent on the catalytic activities of BLM and TOP3A, and BLM cannot be substituted by any other human RecQ helicase (Wu et al., 2005; Wu and Hickson, 2003). RMI1 stimulates the dissolution activity and mediates the stable binding of TOP3A to dHJ structures (Wu et al., 2006). RMI1 also stimulates TOP3A in the decatenation of single stranded circular DNA, which resembles intermediate structures formed during branch migration (Yang et al., 2010). Finally, RMI2 may play a minor role in the stimulation of dHJ dissolution activity (Singh et al., 2008). However, this finding is controversial, as it could not be reproduced (Xu et al., 2008). Nevertheless, BTRR plays an essential role in the resolution of HR intermediates to prevent crossover events that may lead to genome rearrangements and loss of heterozygosity.

Figure 1.9 BTRR function in double Holliday junction dissolution and decatenation. A) Schematic diagram of a double Holliday junction (dHJ). The BTRR complex (BLM-TOP3A-RMI1) binds and promotes translocation and branch migration of dHJs to facilitate dissolution. B) During DNA replication, hemicatenane structures are thought to form behind the replication fork. BLM and TOP3A are required for the decatenation of these structures. Other BTRR core complex members, RMI1 and RMI2, stabilize the complex and stimulate maximal decatenation activity. Adapted from (Bernstein et al., 2010)

1.7.5 BTRR in DNA replication progression

A novel role of BTRR in normal DNA replication progression was revealed by the use of a molecular combing technique that enables analysis of DNA replication at the resolution of individual DNA fibers (Herrick and Bensimon, 1999). Results from this technique showed that BLM deficient cells exhibited slowed replication fork progression and an elevated frequency of fork pausing (Rao et al., 2007). While RMI1 depletion did not affect fork pausing, it caused a greater decrease in replication fork progression than BLM deficiency alone (Yang et al., 2012). This suggests that RMI1 and BLM might contribute independently to the replication fork progression rate despite their interaction in the core complex.

One mechanism for controlling fork velocity is through the regulation of dNTP levels. New evidence implicates the activity of the BLM helicase in maintaining balanced pyrimidine pools for normal fork progression (Chabosseau et al., 2011). As visualized by molecular combing, the normalization of pyrimidine pools with exogenous sources was sufficient to fully restore replication fork speed, but not the fork restart defect in BLM deficient cells (Chabosseau et al., 2011). Furthermore, pyrimidine pool normalization also reduced the frequency of sister chromatid exchange, suggesting that pyrimidine pool imbalance contributes to the Bloom syndrome phenotype (Chabosseau et al., 2011). As BTRR involvement in DNA replication progression is an emerging field, the direct role of BTRR at DNA replication forks and other mechanisms of fork velocity regulation have not yet been probed.

1.7.6 BTRR at perturbed DNA replication forks

Not only does BTRR play a role in normal DNA replication progression, but recent evidence implicates BTRR in the maintenance of stalled DNA replication forks. The aggregation of multi-protein complexes at sites of replication stress can be visualized as nuclear foci by fluorescence microscopy (Lisby and Rothstein, 2004). In the absence of replication stress, BLM, TOP3A and RMI1 co-localize to nuclear foci in vivo. The addition of replication fork stalling agents increases the percentage of cells with BTRR nuclear foci and increases the abundance of foci per cell (Yang et al., 2012). Focus formation is dependent on a functional BTRR complex, as it is abrogated when RMI1 is depleted or mutated (Yang et al., 2012). BLM also co-localizes with the ATR checkpoint kinase, which phosphorylates BLM at threonine residues T99 and T122 following hydroxyurea (HU) treatment (Rao et al., 2005). Thus, BTRR complexes co-localize and are post-translationally modified in response to replication stress.

Microscopy studies are unable to define the mechanistic role of BTRR at perturbed forks, but molecular combing studies show that BLM and RMI1 depleted cells are deficient in recovery from replication forks blocked by aphidicolin or HU, suggesting a role in replication fork restart (Davies et al., 2007; Yang et al., 2012). Cells expressing an un-phosphorylatable T99A or T122A mutant BLM fail to recover from HU treatment and the T99A and T122A mutants do not rescue defective replication fork restart in BLM deficient cells (Rao et al., 2005). Furthermore, recent evidence suggests that BTRR interacting partners also promote replication fork restart following perturbation. One example is FANCD2, a member of the Fanconi anemia complex that co-immunoprecipitates with BLM in human cells (Hirano et al., 2005; Pichierri et al., 2004). By molecular combing, FANCD2 and BLM double depleted cells exhibit an exacerbated defect in replication fork restart following recovery from aphidicolin treatment in comparison to depletion of BLM alone, suggesting that they act in a common pathway (Chaudhury et al., 2013).

However, FANCD2 depletion alone did not affect replication progression or pyrimidine pool imbalance (Chabosseau et al., 2011). This suggests that while BLM cooperates with FANCD2 to protect stalled forks, it works independently of FANCD2 in its role in maintaining normal replication fork speed. Another BLM interacting protein, RIF1, is recruited to stalled forks with the same kinetics as BLM and is required for efficient fork restart after HU treatment (Roux et al., 2012). However, RIF1 does not play a role in SCE suppression, suggesting that it specifically coordinates stalled forks with BTRR but is not involved in the dHJ dissolution function (Xu et al., 2010). This idea of BTRR sub-complex formation during replication stress has only recently been explored and will be a focus of my thesis.

1.7.7 BTRR in maintenance of common fragile site stability

BTRR has been found to play a novel role in the segregation of sister chromatids (Chan et al., 2007). BLM is required to recruit TOP3A and RMI1 to a novel class of ultra-fine DNA bridges (UFBs), which connect sister chromatids during anaphase of mitosis (Chan et al., 2007). There are two classes of UFBs, those that arise from centromeres and those that arise from CFSs, both of which are regions that cause difficulty for replication fork progression (Chan et al., 2009). Treatment of cells with replication stress inducing agents such as aphidicolin, HU or mitomycin C results in an increased number of UFBs per cell (Chan et al., 2009), suggesting that UFBs are formed due to replication defects. It is proposed that UFBs result from late replication intermediates that have not been replicated to completion prior to mitosis (Chan and Hickson, 2011; Chan et al., 2009). As BLM-/- cells have significantly increased UFBs (Chan et al., 2007), it is postulated that the decatenation function of the BTRR complex can prevent sister chromatid nondisjunction and chromosome breakage at UFBs (Chan et al., 2009; Larsen and Hickson, 2013).

Upcoming work in this field may implicate BTRR multi-protein complexes in resolution of UFBs. FANCA, FANCG and FANCI deficient cells all have an increase in replication stress induced UFBs (Vinciguerra et al., 2010). Other proteins such as FANCD2 and FANCI localize to the CFSs at the ends of UFBs (Chan et al., 2009). As the Fanconi anemia pathway is required to suppress chromosome breakage at CFSs during replication stress (Howlett et al., 2005) and many of its members interact with the BLM core complex, it is plausible that one role of the multi-protein complex may be to resolve UFBs.

1.8 Rationale

Changes in genome structure underlie many human diseases, including tumorigenesis and cancer. Aberrant structural changes to the genome can arise from mutations or misregulation of genes involved in many biological pathways including DNA replication, DNA damage recognition, DNA repair and cell cycle progression. Identification and functional characterization of genes whose leads to chromosome instability will be important in understanding predisposing events and pathways that contribute to cancer progression. I took three systematic approaches to identify and characterize genes and proteins involved in the maintenance of genome stability, with a focus on those involved in DNA replication.

As biological pathways involved in genome maintenance are highly conserved in evolution, results from model organisms can greatly facilitate functional discovery in humans. I used S. cerevisiae to identify essential genes that play an important role in preventing genome instability (Chapter 2). I identified many essential genes that are critical in preventing both spontaneous DNA damage and genome rearrangements. As most of these genes were involved in DNA replication, I hypothesized that this biological process may be especially integral to preventing genome instability. Therefore, I also identified a list of human proteins that both play a role in replication progression and that interact with the replication fork, using a novel replication fork pull down strategy in human cell lines (Chapter 3).

Chapter 4 describes my work on the genome stability complex, BTRR, which has many roles in genome maintenance and is important to cell survival. BTRR has an emerging role in DNA replication progression, as depletion of RMI1 or BLM of the core complex significantly decreases the replication fork rate. I reasoned that identifying core complex protein interactions could lead not only to the functional discovery of new genome maintenance factors, but also to the discovery of those that participate in DNA replication progression. I used an in vivo technique (BioID) to identify these protein interactors and tested their involvement in DNA replication progression. Altogether, I have used three different technologies and two different model organisms to demonstrate the importance of DNA replication to the maintenance of genome stability.

Chapter 2

Genome rearrangements caused by depletion of essential DNA replication proteins in Saccharomyces cerevisiae.

PUBLICATION:

Cheng, E., J. A. Vaisica, J. Ou, A. Baryshnikova, Y. Lu, F. P. Roth and G. W. Brown (2012). Genome rearrangements caused by depletion of essential DNA replication proteins in Saccharomyces cerevisiae. Genetics 192(1): 147-160.

DATA ATTRIBUTION:

Jessica Vaisica performed the microscopy screen to detect spontaneous Ddc2-YFP DNA damage foci in Tet shut-off alleles.

ACKNOWLEDGEMENTS: I thank Philip Hieter for providing yeast strains for the illegitimate mating assays, Lisa Yu for Tet allele strain construction, Tao Qi for technical assistance with CHEF gel analysis and Jordan Young for technical assistance with high-throughput microscopy.

2 Genome rearrangements caused by depletion of essential DNA replication proteins in Saccharomyces cerevisiae.

2.1 Summary

Genetic screens of the collection of ~4500 deletion mutants in Saccharomyces cerevisiae have identified the cohort of non-essential genes that promote maintenance of genome integrity. Here I probe the role of essential genes needed for genome stability. To this end, I screened 217 tetracycline-regulated promoter alleles of essential genes and identified 47 genes whose depletion results in spontaneous DNA damage. I further showed that 92 of these 217 essential genes have a role in suppressing chromosome rearrangements. I identified a core set of 15 genes involved in DNA replication that are critical in preventing both spontaneous DNA damage and genome rearrangements. Mapping, classification and analysis of rearrangement breakpoints indicated that yeast fragile sites, Ty retrotransposons, tRNA genes, early origins of replication and replication termination sites are common features at breakpoints when essential replication genes that suppress chromosome rearrangements are down-regulated. I propose mechanisms by which depletion of essential replication proteins can lead to double-stranded DNA breaks near these features, which are subsequently repaired by homologous recombination at repeated elements.

2.2 Introduction

Accurate transmission of the genome is essential for normal cell growth and survival. As such, cells have developed elaborate mechanisms to prevent errors in replication and to respond to spontaneous DNA damage that can lead to genomic instability (Branzei and Foiani, 2007, 2009, 2010; Cimprich and Cortez, 2008; Harper and Elledge, 2007; Kolodner et al., 2002). The failure to repair the genome in an error-free manner can result in chromosome abnormalities that underlie many human diseases, including cancers (Aguilera and Gomez-Gonzalez, 2008; Kolodner et al., 2002; McKinnon and Caldecott, 2007). Therefore, defining the genes that contribute to genome maintenance will be useful in understanding disease development and in designing new strategies for therapeutics. However, to date, the catalogue of genes that function to suppress genome instability remains incomplete.

Yeast is an ideal model for genomic studies due to the conservation of gene functions and biological pathways between yeast and humans. Phenotypic screens conducted with the Saccharomyces cerevisiae non-essential gene deletion collection (Giaever et al., 2002) have aided in the annotation and functional characterization of non-essential genes involved in the suppression of spontaneous DNA damage (Alvaro et al., 2007; Huang and Kolodner, 2005; Huang et al., 2003; Shor et al., 2005) and in the suppression of spontaneous chromosome rearrangements (Andersen et al., 2008; Smith et al., 2004; Yuen et al., 2007). However, since the deletion of essential genes causes lethality, similar genome-wide screening approaches to identify the complete set of genes that suppress spontaneous DNA damage and chromosome rearrangements require collections of conditional alleles of essential genes.

Systematic collections of conditional alleles have been generated in several ways, including the replacement of native promoters with a tetracycline-regulated promoter (Mnaimneh et al., 2004; Yu et al., 2006), destabilization of target gene mRNAs through the insertion of a selectable marker in the 3’UTR of essential genes (Schuldiner et al., 2005), systematic addition of a heat inducible degron to the amino terminus of the protein product (Labib et al., 2000), systematic generation of novel temperature-sensitive alleles (Ben-Aroya et al., 2008), and systematic integration of existing temperature-sensitive alleles (Li et al., 2011). Despite the availability of several essential gene collections, no one collection is complete, suggesting that complementary approaches using a number of screening strategies and multiple types of conditional alleles will be necessary to identify all of the essential genes that function to suppress genomic instability.

2.3 Results

2.3.1 Depletion of essential gene products causes spontaneous DNA damage

I used a collection of tetracycline-regulated promoter alleles (Tet alleles) (Mnaimneh et al., 2004; Yu et al., 2006) of essential genes to systematically identify genes that suppress spontaneous DNA damage. Since elevated levels of spontaneous DNA damage should elicit a checkpoint response and cause cell cycle delay, I screened the 217 strains that accumulated in S-phase or G2-phase of the cell cycle following gene-product depletion by promoter shut off (Appendix Table 1) (Yu et al., 2006). Spontaneous DNA damage was measured by the re-localization of the DNA damage checkpoint protein Ddc2 from a diffuse nuclear pattern to discrete sub-nuclear foci (Figure 2.1A). Following growth of these strains in doxycycline to repress essential gene expression, the fraction of cells with Ddc2 foci was quantified. I determined that the individual depletion of 47 essential gene products caused an increase of Ddc2 foci relative to wild type levels (Appendix Table 2), using a cutoff of three standard deviations above the wild type mean (Figure 2.1B). The Gene Ontology (GO) processes of the essential genes that were identified are varied, but on average the highest levels of Ddc2 foci were observed following the depletion of gene products involved in DNA replication, response to DNA damage stimuli and cell cycle progression, indicating the importance of these essential processes in the maintenance of genome integrity (Figure 2.1B). In addition to the identification of essential genes with defined roles in genome maintenance, 20 essential genes with previously unrecognized contributions to the suppression of spontaneous DNA damage were also identified (Figure 2.1B, grey bars).
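The three-standard-deviation cutoff can be illustrated with a minimal Python sketch. This is not the analysis code used in this work: the function names (`foci_cutoff`, `call_hits`), the strain labels and all numerical values are hypothetical, and the use of the sample standard deviation of replicate wild type measurements is an assumption.

```python
from statistics import mean, stdev

def foci_cutoff(wild_type_fractions, n_sd=3.0):
    """Return mean + n_sd * sample SD of the wild type measurements."""
    return mean(wild_type_fractions) + n_sd * stdev(wild_type_fractions)

def call_hits(strain_fractions, cutoff):
    """Return the names of strains whose foci percentage exceeds the cutoff."""
    return sorted(name for name, frac in strain_fractions.items() if frac > cutoff)

# Hypothetical percentages of cells with Ddc2 foci (illustrative values only).
wild_type = [4.0, 5.0, 6.0]
strains = {"geneA": 25.0, "geneB": 7.5, "geneC": 12.0}

cutoff = foci_cutoff(wild_type)   # mean 5.0 + 3 * SD 1.0
hits = call_hits(strains, cutoff)
print(cutoff, hits)
```

With these made-up numbers, only strains whose foci percentage lies above the wild type mean plus three standard deviations are reported as hits.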

Figure 2.1 Depletion of yeast essential genes results in elevated levels of spontaneous Ddc2 foci formation A) 217 Tet alleles that express Ddc2-YFP and display a G2/M or cell cycle arrest phenotype were grown in the presence of doxycycline (10 μg/mL) for 4 hours in order to inhibit the transcription of each essential gene. Representative DIC and YFP images are shown for the wildtype, DPB11 and NSE1 strains. Ddc2-YFP foci are indicated with white arrows. B) The percent of cells with Ddc2-YFP foci is plotted for 47 Tet alleles (Appendix Table 2) that showed an increase in Ddc2 foci of at least three standard deviations above the average observed in wild type. Bars are shaded according to the GO process annotation of each gene of interest.

2.3.2 Depletion of essential gene products causes chromosome loss and rearrangement

Increased levels of Ddc2 foci could reflect increased spontaneous DNA damage, defective repair of spontaneous DNA damage or a combination of both. An increase in spontaneous DNA damage may not impact genome integrity if the damage is repaired accurately. To directly identify essential genes that suppress chromosome rearrangements and chromosome loss, I used an illegitimate mating assay (Lemoine et al., 2008; Lemoine et al., 2005; Strathern et al., 1981) that measures loss of genetic information from chromosome III to screen the same 217 Tet allele strains. Mutation or deletion of the MATα locus on chromosome III in haploid cells results in a reversion to the default MATa mating type, producing cells termed a-like fakers and allowing these MATα cells to mate illegitimately with strains of the MATα mating type (Strathern et al., 1981). I determined the levels of a-like faker formation using a patch mating assay (Figure 2.2A). I found that the depletion of 92 essential genes caused elevated illegitimate mating frequencies both relative to the minus doxycycline control and relative to the wild type control, indicating loss of genetic information at the MAT locus in these strains. Thirty strains did not form colonies in the presence of doxycycline and nine strains could not be constructed with the MATα mating type and therefore could not be evaluated. Strains were subcategorized into groups with high (>10 colonies; 57 strains), moderate (1 to 10 colonies; 35 strains), or wild type (0 colonies; 86 strains) levels of illegitimate mating and the distributions of Ddc2 foci formation for each category were compared (Figure 2.2B). Both the high and moderate categories had greater Ddc2 foci formation when compared to the wild type category (P-value of 0.022 for high vs. wild type and P-value of 0.028 for moderate vs. wild type; one-sided Mann-Whitney test), indicating a relationship between the extent of Ddc2 focus formation and the illegitimate mating phenotype.
Additionally, strains with spontaneous Ddc2 foci formation above our cutoff were more likely to have increased illegitimate mating (P-value of 0.00073; hypergeometric test), although the overlap between the two screens, at 29 genes, was not absolute.
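
The overlap between two screens can be assessed with a hypergeometric test, which a short Python sketch can make concrete. It is an illustration only and not the code used for the analysis: the helper name `hypergeom_sf` is hypothetical, and the choice of 217 screened strains as the background population is an assumption, since the population used for the reported P-value is not stated here. The result of this sketch should therefore not be expected to reproduce the reported value.

```python
from math import comb

def hypergeom_sf(k, population, successes, draws):
    """Return P(X >= k) for a hypergeometric random variable X.

    X counts the 'successes' seen when `draws` items are taken without
    replacement from `population` items, `successes` of which are marked.
    """
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, draws)

# Assumed parameterization (the background population is an assumption):
# 217 strains screened, 47 with elevated Ddc2 foci, 92 with elevated
# illegitimate mating, 29 shared between the two screens.
p_overlap = hypergeom_sf(29, 217, 47, 92)
print(f"P(overlap >= 29) = {p_overlap:.2e}")
```

The upper tail is used because the question is whether the overlap is larger than expected by chance; a smaller background population or a restricted gene list would change the result.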

Figure 2.2 Depletion of yeast essential genes results in elevated levels of illegitimate mating A) MATα Tet alleles were grown on YPD or YPD containing doxycycline (10 μg/mL) for 24 hours and a standard mating test was performed using MATα and MATa tester strains. Representative images of strains with elevated levels of illegitimate diploid formation following growth in doxycycline are shown. B) The resulting number of illegitimate diploid colonies that grew without doxycycline treatment was subtracted from the number that grew with doxycycline treatment and was used to subcategorize the strains into four groups. For each group, the distribution of percent of budded cells with Ddc2-YFP foci was plotted. Bold lines represent the median values, boxes represent the upper and lower quartiles, whiskers represent 1.5 times the inter-quartile range and outliers are indicated by circles. C) Comparison of Tet alleles with elevated levels of Ddc2-foci and >10 illegitimate mating diploid colonies.

I focused on the strains with the most robust chromosome instability phenotype, the 15 strains with both elevated Ddc2 foci and high levels of illegitimate mating (Figure 2.2C). These strains were subjected to a quantitative illegitimate mating assay (Table 2.1). In the presence of doxycycline, all of these strains exhibited significantly elevated levels of illegitimate mating relative to the wild type strain. Increases in illegitimate mating ranged from less than two-fold (CSE1) to 62-fold (MCM7) relative to wild type. Previous studies of GAL promoter regulated conditional alleles of DNA polymerases α and δ found increases of approximately 200-fold and 50-fold, respectively (Lemoine et al., 2008; Lemoine et al., 2005). Although DNA polymerases α and δ were not assayed in our screens, I identified a role for DNA polymerase ε (POL2, DPB11) in suppressing illegitimate mating. Additionally, I found that disrupting a wide range of replication functions (CDC45, MCM4, MCM5, MCM7, DPB11, POL2, POL30, RFC2, RFC5) caused increased illegitimate mating. Depletion of DNA2, which functions in Okazaki fragment processing (Budd et al., 2000; Lee et al., 2000) and in DNA repair (Zhu et al., 2008), resulted in increased illegitimate mating, as did repression of the DNA repair genes NSE1 (Pebernard et al., 2008; Santa Maria et al., 2007) and UBC9 (Branzei et al., 2006). Genes with functions outside of DNA replication and repair were also identified. CSE1 is responsible for nuclear shuttling of the nuclear transporter importin α (Hood and Silver, 1998; Kunzler and Hurt, 1998; Solsbacher et al., 1998), and roles for CSE1 in DNA replication (Yu et al., 2006) and in chromosome segregation (Xiao et al., 1993), likely reflecting effects on importin α cargos, have been described. NUF2 is a kinetochore component and functions in chromosome segregation (Osborne et al., 1994).
Depletion of DNA replication (CDC45, MCM4, MCM7, POL2) and segregation (NUF2) gene products had the most striking effect (greater than 20-fold difference).

Table 2.1 Frequencies of illegitimate mating in tetracycline-regulatable promoter conditional alleles grown in the presence and absence of doxycycline¹

            Frequencies of illegitimate mating (×10⁻⁴)
Tet allele  - doxycycline           + doxycycline
wildtype    0.83 (0.20)² [1]³       0.8 (0.26) [1]
CDC45       11 (6.0) [14]           17 (1.6) [22]
CSE1        0.5 (0.04) [0.6]        1.4 (0.40) [1.8]
DNA2        4.3 (2.7) [5.2]         4.7 (2.5) [5.9]
DPB11       0.94 (0.1) [1.1]        9.2 (3.4) [12]
MCM4        11 (9.8) [13]           36 (27) [45]
MCM5        3.4 (0.8) [4.1]         12 (3.0) [15]
MCM7        4.1 (2.9) [4.9]         34 (24) [62]
NSE1        17 (11) [21]            39 (7.3) [48]
NUF2        2.5 (1.1) [3.0]         18 (8.9) [22]
POL2        2.2 (0.9) [2.6]         26 (21) [33]
POL30       5.4 (4.9) [6.5]         8.3 (2.7) [10]
RFC2        2.2 (0.03) [2.6]        5.9 (0.14) [7.4]
RFC5        1.5 (0.76) [1.8]        3.0 (0.64) [3.8]
SPT16       2.4 (1.7) [2.9]         11 (11) [14]
UBC9        2.4 (0.4) [2.9]         3.9 (2.2) [4.8]

¹ Tet allele strains and wildtype strain were grown in parallel for 24 hours on YPD liquid media containing or lacking 10 μg/mL doxycycline. Strains were mixed with five-fold excess of a MATα tester strain and plated on YPD solid media. After 5 hours, cells were re-suspended in water and plated on illegitimate diploid selection media. This assay was repeated two times.
² Numbers in brackets represent standard deviations.
³ Numbers in square brackets represent the frequency normalized to wildtype.

2.3.3 Chromosome III rearrangements in essential genome stability mutants

Three common classes of information loss on chromosome III in the illegitimate mating assay can be distinguished by performing the assay with a strain with nutritional markers flanking the MAT locus on chromosome III (Lemoine et al., 2005) (Figure 2.3A). I used this modified assay to classify chromosome instability events in the 15 strains with both increased spontaneous DNA damage and high levels of illegitimate mating. Class 1 mating events result from a gene conversion or mutation at the MATα locus. Class 2 events result from the loss of chromosome III. Class 3 events result from chromosome rearrangements that lead to the loss of the MATα locus and distal regions of the right arm of chromosome III (Figure 2.3A). For each strain I classified the chromosome rearrangements in approximately 200 illegitimate diploids and measured the frequencies and ratios of the three classes (Figure 2.3B and C).

Figure 2.3 Classification of rearrangement events that lead to illegitimate mating A) A schematic diagram of the three expected classes of genetic events resulting in illegitimate mating. Using diagnostic selection media, mutations in the MAT locus, whole chromosome III loss and loss of the right arm of chromosome III can be distinguished as class 1, 2 and 3 genetic events, respectively (Lemoine et al., 2008; Lemoine et al., 2005). B) I classified approximately 200 illegitimate diploids for each strain. The frequencies of the three classes of rearrangements are plotted for the 15 strains with the most elevated levels of illegitimate mating. C) The ratios of the three classes of rearrangements are plotted for the indicated strains.

Increases in class 2 (whole chromosome loss) and 3 (chromosome arm loss) rearrangements were evident for all 15 genes tested (Figure 2.3B). Depletion of CSE1, DPB11, MCM4, MCM5, POL30, SPT16, and UBC9 resulted in ratios of the three classes of chromosome instability that were not significantly different from those observed in the wild type strain (P-value >0.01 by the Fisher exact test) (Figure 2.3C). This could indicate that depletion of these gene products exacerbates a condition already present in wild type cells. By contrast, repression of CDC45, DNA2, MCM7, RFC2, RFC5 and POL2 resulted in a significant difference in ratios of the three classes relative to wild type (P-value <0.01 by the Fisher exact test) with a preferential increase in class 3 (chromosome arm loss) events. Depletion of NUF2, a kinetochore associated protein, resulted in a dramatic increase in class 2 (whole chromosome loss) events, consistent with the function of this gene in chromosome segregation (Osborne et al., 1994; Wigge and Kilmartin, 2001). Finally, I observed that the depletion of NSE1, a subunit of the structural maintenance of chromosome (Smc5/6) complex (Fujioka et al., 2002), resulted in the loss of class 1 (gene conversion or point mutation) events, and in similar levels of class 2 (whole chromosome loss) and class 3 (chromosome arm loss) events. Our data suggest that NSE1 contributes to both the DNA repair (class 3) and chromosome segregation (class 2) functions of the Smc5/6 complex (Behlke-Steinert et al., 2009; Irmisch et al., 2009; Outwin et al., 2009; Santa Maria et al., 2007). I conclude that depletion of different essential gene functions can cause different patterns of genomic instability.
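
A comparison of class ratios between a depleted strain and wild type can be framed as an exact test on a 2 × 3 contingency table (two strains by three classes). The Python sketch below shows one way to do this, assuming the Freeman-Halton extension of the Fisher exact test; the exact procedure used for the P-values above is not specified here, so this is an illustration rather than the method of this work. The function names and the class counts are hypothetical.

```python
from math import comb

def table_probability(row1, col_totals, n_row1, n_total):
    """Multivariate hypergeometric probability of row1 given fixed margins."""
    ways = 1
    for observed, col_total in zip(row1, col_totals):
        ways *= comb(col_total, observed)
    return ways / comb(n_total, n_row1)

def fisher_exact_2x3(row1, row2):
    """Two-sided exact P-value for a 2 x 3 table (Freeman-Halton extension).

    Sums the probabilities of all tables with the same margins that are
    no more probable than the observed table.
    """
    col_totals = [a + b for a, b in zip(row1, row2)]
    n_row1 = sum(row1)
    n_total = n_row1 + sum(row2)
    p_observed = table_probability(row1, col_totals, n_row1, n_total)
    p_value = 0.0
    for a in range(min(col_totals[0], n_row1) + 1):
        for b in range(min(col_totals[1], n_row1 - a) + 1):
            c = n_row1 - a - b
            if c > col_totals[2]:
                continue
            p = table_probability((a, b, c), col_totals, n_row1, n_total)
            if p <= p_observed * (1 + 1e-9):  # tolerance for float ties
                p_value += p
    return p_value

# Hypothetical counts of class 1, class 2 and class 3 events among
# approximately 200 illegitimate diploids per strain (illustrative only).
wild_type_counts = (60, 70, 70)
depleted_counts = (20, 40, 140)
p_example = fisher_exact_2x3(wild_type_counts, depleted_counts)
print(f"P = {p_example:.3g}")
```

Enumerating every table is practical here because only a few thousand tables share the margins of a 2 × 3 table with roughly 200 events per row.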

2.3.4 Essential gene product depletion causes genome rearrangements with boundaries at Ty retrotransposons

In order to obtain a comprehensive assessment of the chromosome rearrangement breakpoint locations in the illegitimate diploids that were isolated following essential gene depletion in our classification experiment, I used comparative genome hybridization on tiling microarrays to map rearrangement breakpoints (Dion and Brown, 2009). Genomic DNA was isolated from six illegitimate diploid colonies that exhibited a chromosome III arm loss (class 3) phenotype following the depletion of each of seven essential genes (CDC45, DPB11, MCM7, NSE1, RFC2, SPT16, UBC9), chosen to represent the functional diversity present in the quantitative illegitimate mating assay. Genomic DNA was hybridized to a S. cerevisiae whole genome tiling microarray and copy number variation was determined by comparison to genomic DNA isolated from a wildtype a/α diploid. Representative copy number profiles are depicted in Figure 2.4 and the boundaries of each rearrangement are shown in Table 2.2. Figure 2.5 A and B summarize the breakpoints observed on chromosome III and the resulting subclasses of rearrangements that were observed.
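
To make the copy number comparison concrete, the following Python sketch computes the log2 ratio expected when a diploid loses or gains one copy of a region, and applies a naive threshold-based scan for positions where the called state changes. It is a simplified illustration and not the published analysis method (Dion and Brown, 2009): the function names, the thresholds of -0.5 and 0.3, and the probe values are all hypothetical.

```python
from math import log2

def expected_log2_ratio(copies_in_sample, copies_in_reference=2):
    """Expected log2 signal ratio for a region relative to a diploid reference."""
    return log2(copies_in_sample / copies_in_reference)

def call_copy_state(log2_ratio, loss_threshold=-0.5, gain_threshold=0.3):
    """Classify a probe as 'loss', 'gain' or 'normal' (thresholds are assumptions)."""
    if log2_ratio <= loss_threshold:
        return "loss"
    if log2_ratio >= gain_threshold:
        return "gain"
    return "normal"

def find_breakpoints(positions, log2_ratios):
    """Return (position, previous_state, new_state) wherever the call changes."""
    states = [call_copy_state(r) for r in log2_ratios]
    return [
        (positions[i], states[i - 1], states[i])
        for i in range(1, len(states))
        if states[i] != states[i - 1]
    ]

# One copy lost in a diploid gives log2(1/2); one copy gained gives log2(3/2).
loss_ratio = expected_log2_ratio(1)
gain_ratio = expected_log2_ratio(3)

# Hypothetical probe coordinates (kb) and log2 ratios along one chromosome arm.
probe_positions = [100, 150, 200, 250]
probe_ratios = [0.02, -0.05, -0.98, -1.03]
breakpoints = find_breakpoints(probe_positions, probe_ratios)
print(loss_ratio, round(gain_ratio, 3), breakpoints)
```

Real tiling array data are noisy, so a practical analysis would smooth or segment the signal before calling breakpoints rather than comparing adjacent probes directly.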

Figure 2.4 Comparative genome hybridization microarray analysis of class 3 illegitimate diploids Genomic DNA was isolated from class 3 (chromosome III arm loss) illegitimate diploids and hybridized to a Saccharomyces cerevisiae whole genome tiling microarray to identify copy number variations. In each histogram the y-axis represents log2 ratios of probe signal intensities, comparing the indicated strain to a legitimate MATa/α diploid, and the x-axis represents chromosome coordinates. Black arrows indicate breakpoint locations on each chromosome, black circles represent the locations of centromeres and the chromosome number is indicated to the right of each histogram. A representative histogram for each of the major types of rearrangements observed is shown. A) A Class 3-1 diploid, in which no copy number variation of chromosome III was evident. B) Class 3-2 diploids have a loss of sequence (red) from the right arm of chromosome III and duplication of sequences (blue) from chromosome XV. C) Class 3-3 diploids have an amplification of the left arm sequence and a deletion of the right arm sequence of chromosome III. D) Class 3-4 diploids have a loss of sequence from the right arm of chromosome III without copy number variation on non-homologous chromosomes. E) Class 3-5 diploids exhibit a loss of sequence from the right arm of chromosome III and loss of sequence from the right arm of chromosome V.

Table 2.2 Classification of Class 3 illegitimate diploid chromosome rearrangements

Class¹   Strain        Essential gene  Type of chromosome     Chromosomes    Size of altered  Boundaries of chromosome
                       depleted        rearrangement          involved       chromosome (kb)  rearrangements
3A(i)²   MCM7-2-57-1   MCM7            Translocation          III/VII        690              FS2; YGRCTy1-2, YGRCTy2-1
3A(i)    RFC2-1-1-1    RFC2            Translocation          III/VII        710              FS1; YGRWTy1-1
3A(i)    RFC2-1-1-2    RFC2            Translocation          III/VII        710              FS1; YGRWTy1-1
3A(i)²   SPT16-1-3-2   SPT16           Translocation          III/XV         640              FS1; YORWTy1-2
3A(ii)²  CDC45-4-4-1   CDC45           Ectopic BIR            III/III        120              YCLWTy2-1; YCRCδ6
3A(ii)   CDC45-4-52-1  CDC45           Ectopic BIR            III/III        140              YCLWTy2-1; YCRCδ7
3A(ii)²  MCM7-6-1-1    MCM7            Ectopic BIR            III/III        140              YCLWTy2-1; YCRCδ7
3A(ii)   RFC2-2-56-1   RFC2            Ectopic BIR            III/III        140              YCLWTy2-1; YCRCδ7
3A(ii)   RFC2-2-6-1    RFC2            Ectopic BIR            III/III        120              YCLWTy2-1; YCRCδ6
3A(ii)   RFC2-2-6-2    RFC2            Ectopic BIR            III/III        120              YCLWTy2-1; YCRCδ6
3A(ii)   SPT16-1-55-1  SPT16           Ectopic BIR            III/III        120              YCLWTy2-1; YCRCδ6
3A(ii)²  SPT16-2-1-1   SPT16           Ectopic BIR and        III/III and V  170 (chrIII)     YCLWTy2-1; FS2 (chrIII)
                                       interstitial deletion                 520 (chrV)       YERCTy1-1; YERCTy1-2 (chrV)
3B²      DPB11-1-1-2   DPB11           Arm deletion           III            150              FS1
3B       MCM7-4-1-2    MCM7            Arm deletion           III            170              FS2
3B       SPT16-2-51-1  SPT16           Arm deletion           III            120              YCRCδ6
3B       SPT16-2-51-2  SPT16           Arm deletion           III            120              YCRCδ6
3D²      RFC2-1-52-1   RFC2            Hawthorne deletion     III            220              MATα, HMR
3F²      DPB11-2-56-1  DPB11           Chromosome fusion      III/XVI        970              YCLWTy2-1; YPLWTy1-1
3F²      NSE1-4-5-1    NSE1            Chromosome fusion      III/V          530              YCLWTy2-1; YERCTy1-1
3F       NSE1-6-51-1   NSE1            Chromosome fusion      III/V          580              YCLWTy2-1; YERCTy1-2
3F²      UBC9-1-51-1   UBC9            Chromosome fusion      III/XVI        930              YCLWTy2-1; YPRWTy1-3, YPRCTy1-4
3F       UBC9-1-51-2   UBC9            Chromosome fusion      III/XVI        930              YCLWTy2-1; YPRWTy1-3, YPRCTy1-4
3F       UBC9-1-51-3   UBC9            Chromosome fusion      III/XVI        930              YCLWTy2-1; YPRWTy1-3, YPRCTy1-4

¹ All strains were examined by comparative genome hybridization on a microarray.
² Genome rearrangements predicted from microarray analysis were confirmed by Southern blot analysis.

Figure 2.5 Mechanisms of repair in replication deficient mutants A) A schematic of the rearrangement boundaries that occur on chromosome III of replication mutants is shown. B) Following a double stranded break and resection to Ty retrotransposons (arrows) or delta long terminal repeats (triangles), chromosome fragments can be repaired in illegitimate diploids through several mechanisms. Class 3-1: Chromosome III is repaired by BIR using the homologous chromosome of the tester strain. Class 3-2: Ectopic break induced replication (BIR) mediated by strand invasion at a Ty retrotransposon on a non-homologous chromosome XV of the tester strain (grey) yields non-reciprocal translocations. Class 3-3: Ectopic BIR involving strand invasion at a different locus of chromosome III results in shortened fragments of chromosome III with two left arms. Class 3-4: Chromosome III fragments are directly repaired by telomere acquisition. Class 3-5: Chromosome fusions can be created through single-strand annealing (SSA) of chromosome III and chromosome V fragments with boundaries at Ty retrotransposons or through a BIR and half-crossover event.

Four of the 42 class 3 illegitimate diploids that I tested exhibited poor hybridization profiles and were not analyzed further. The remaining samples were divided into five different subclasses based on the rearrangement profiles of chromosome III. The most frequent subclass, class 3-1, comprised microarray profiles that lacked deletions or duplications (Figure 2.4A). This type of profile was observed in 15 out of 38 (39%) illegitimate diploids and was present following the depletion of all essential genes tested, with the exception of RFC2 and SPT16. As previously suggested, this genome profile likely represents successful repair of chromosome III by break-induced replication (BIR) using the 1225α strain chromosome III as a template (Lemoine et al., 2008; Lemoine et al., 2005), resulting in full restoration of chromosome III sequences (Figure 2.5B). Southern blot analysis following separation of chromosomes on a Contour-clamped Homogeneous Electric Field (CHEF) gel, using a probe specific for the HIS4 locus on the left arm of chromosome III, revealed a single DNA band corresponding to the size of chromosome III, confirming that class 3-1 diploids have two intact copies of chromosome III (Figure 2.6A, lane 4).

Figure 2.6 Southern blot confirmation of chromosome rearrangements predicted in microarray analysis A) Intact chromosomal DNA was isolated from Class 3 (chromosome arm loss) illegitimate diploids. Genomic DNA was separated on a Contour-clamped Homogeneous Electric Field (CHEF) gel and chromosome III was detected through hybridization with a radio-labelled probe specific to the left arm of chromosome III. Smaller chromosome III fragments were also detected in samples with chromosome rearrangements. Representative Southern blots for a wildtype diploid, Class 3-3, 3-4 and 3-1 diploids of the indicated strains are shown, respectively. B) Representative Southern blots of a wildtype and a Class 3-2 illegitimate diploid. Chromosomes on the left and right panels were detected with probes for chromosomes III and XV, respectively. In addition to chromosomes III and XV, a non-reciprocal translocation (nrt) was visualized. C) Representative Southern blot of a wildtype and a Class 3-5 illegitimate diploid. Chromosomes on the left and right panels were detected with probes for chromosomes III and V, respectively. In addition to chromosomes III and V, a chromosome fusion (fus) event was detected.

Class 3-2 diploids had both a deletion of sequences from the right arm of chromosome III and a duplication of sequences from a non-homologous chromosome, suggesting the presence of a nonreciprocal chromosome translocation (Figure 2.4B and 2.5B). Illegitimate diploids isolated following the depletion of MCM7, RFC2, and SPT16 displayed this type of rearrangement. As previously described following depletion of DNA polymerase α or δ (Lemoine et al., 2008; Lemoine et al., 2005), this class had breakpoints either at FS1 (Fragile Site 1) or FS2 of chromosome III and at Ty1 retrotransposons of a non-homologous chromosome (chromosome VII or XV, in our case). Previous sequencing analyses and restriction mapping of FS1 and FS2 regions in several strain backgrounds, including the S288C strain background that I used in our study, indicated the presence of a direct pair of tandem Ty1 retrotransposons and an inverted pair of Ty1 retrotransposons, respectively, that are un-annotated in the Saccharomyces Genome Database (SGD) (Hoang et al., 2010; Lemoine et al., 2008; Lemoine et al., 2005; Umezu et al., 2002). I confirmed that this same arrangement of Ty1 retrotransposons at FS1 and FS2 is present in the wildtype Tet allele strain using PCR analyses (Figure 2.7A). I further verified the arrangement of Ty1 retrotransposons at FS2 using Southern blot analyses (Figure 2.7B). In keeping with the mechanism proposed by Lemoine et al. (Lemoine et al., 2008; Lemoine et al., 2005), I predict that class 3-2 represents ectopic BIR with strand invasion at a Ty1 retrotransposon element of a non-homologous chromosome (Figure 2.5B). Southern blot analysis of a representative SPT16 illegitimate diploid revealed one band consistent with the size of chromosome III as well as a second band consistent with the predicted size of a non-reciprocal translocation between chromosome III and chromosome XV (Figure 2.6B, lane 2). Hybridization of the same blot with a chromosome XV probe confirmed the presence of a chromosome containing sequences from both chromosome III and chromosome XV (Figure 2.6B, lane 4).

Figure 2.7 Restriction map and Southern blot analysis of the FS2 region of chromosome III A) Restriction digest map of S. cerevisiae strain BY4741. Regions of the genome that were confirmed by PCR and sequence analysis are shaded in grey. The map is drawn to scale and restriction enzyme cut sites are indicated for EcoRI (E) and XbaI (X). The positions of the Southern blot hybridization probes, FS2-2 and FEN2, and the expected fragment sizes are depicted above and below the map, respectively. B) Restriction digest and Southern blot analysis using genomic DNA isolated from BY4741 yielded fragment sizes that correspond to the restriction digest map and confirm the presence of an inverted pair of Ty1 retrotransposons at the FS2 position of chromosome III.

Class 3-3 illegitimate diploids had both an amplification of sequences proximal to the Ty2 retrotransposon, YCLWTy2-1, and a deletion of right arm sequences distal to FS2 or delta elements YCRCδ6 or YCRCδ7 of chromosome III (Figure 2.4C and Figure 2.5B). Although class 3-2 was the most common class of rearrangement following depletion of DNA polymerase α or δ (Lemoine et al., 2008; Lemoine et al., 2005), class 3-3 was more common in our study. Sequencing of chromosome III in the S288C strain background used in our study indicates a Ty1 retrotransposon insertion directly upstream of YCLWTy2-1 in the forward orientation that is un-annotated in SGD, resulting in a configuration similar to FS1 (Hoang et al., 2010). Class 3-3 was the second most abundant profile and was observed in eight samples, following the depletion of CDC45, MCM7, RFC2 or SPT16. One model for this rearrangement (Figure 2.5B) is that it results from a BIR event with inaccurate strand invasion at the Ty retrotransposons on the left arm of chromosome III, which have homology to the Ty retrotransposons at the breakpoints on the right arm of chromosome III. The high frequency of this type of rearrangement could reflect the abundance and orientation of LTRs and Ty retrotransposons on chromosome III or the spatial proximity of the relevant Ty retrotransposons and LTRs within the nucleus (Duan et al., 2010). Southern blot analysis using a probe within the amplified region of chromosome III resulted in the detection of two distinct chromosome sizes (Figure 2.6A, lane 2). One corresponds to the expected size of an intact chromosome III contributed by the tester strain, and the other (of lower molecular weight) corresponds to the size of the inaccurately repaired chromosome predicted by the CGH data. Additionally, the intensity of the rearranged chromosome band was approximately two-fold higher than that of the intact chromosome III band. Band sizes and intensities are consistent with the position of the probe in the amplified region of the left arm and therefore support the indicated structure of this class of rearrangement (Figure 2.5B).

Class 3-4 illegitimate diploids have only a deletion of the right arm of chromosome III (Figure 2.4D and 2.5B). I observed four of these events after the depletion of genes involved in DNA replication (MCM7, DPB11, SPT16). Two of these events had breakpoints at FS1 and FS2 (MCM7, DPB11), the same breakpoints observed following depletion of DNA polymerases α and δ (Lemoine et al., 2008; Lemoine et al., 2005). The other two, from SPT16 illegitimate diploids, had breakpoints at YCRCδ6. As suggested previously, this class might represent chromosome fragments that persist through direct telomere capping (Figure 2.5B), or by acquisition of telomeric sequences by BIR utilizing a telomere-proximal Ty element (Lemoine et al., 2008; Lemoine et al., 2005). These rearrangement profiles were confirmed by Southern blot analysis, where two bands of equal intensity were visualized, one corresponding to intact chromosome III and the other to the predicted chromosome III fragment (Figure 2.6A, lane 3).

The class 3-5 rearrangement pattern includes a deletion of all but the left arm of chromosome III in addition to an arm deletion on a non-homologous chromosome (chromosome XVI or V) (Figure 2.4E and Figure 2.5B). By contrast to the four subclasses of rearrangements described above, class 3-5 profiles were not observed following depletion of DNA polymerases α and δ (Lemoine et al., 2008; Lemoine et al., 2005). However, this profile has been documented upon the deletion of a non-essential gene required for strand invasion in homologous recombination, RAD52 (Casper et al., 2009). In our studies, class 3-5 profiles were observed following the depletion of NSE1 and UBC9, which also function in DNA repair (Branzei et al., 2006; Irmisch et al., 2009; Santa Maria et al., 2007). I found one additional class 3-5 rearrangement following the depletion of DPB11 (Table 2.2). In each case, the breakpoint on chromosome III corresponds with the only Ty retrotransposon on the left arm and the rearrangement results in the loss of one copy of the chromosome III centromere. I predict that this acentric chromosome III fragment would be fused to the non-homologous chromosome to allow this chromosome fragment to persist (Figure 2.5B). The chromosome fusion could result from an interruption of a BIR event and subsequent resolution of the strand invasion intermediate with a non-homologous chromosome, resulting in a half crossover chromosome. Alternatively, since the breakpoints on the non-homologous chromosomes coincide with Ty retrotransposon sites, it is possible that a homology mediated repair mechanism, such as single-strand annealing (SSA) (Mieczkowski et al., 2006), is involved in the formation of this chromosome fusion (Figure 2.5B). This mutant chromosome was confirmed by Southern blot analysis of a representative illegitimate diploid isolated following the depletion of NSE1 (Figure 2.6C, lanes 2 and 4). A probe specific for chromosome III detected two chromosome sizes, one corresponding to intact chromosome III and another to the predicted size of the chromosome III-chromosome V fusion.
I also detected two chromosome bands following re-hybridization with a probe specific for chromosome V, one corresponding to the size of intact chromosome V and the other corresponding to the expected size of the chromosome fusion.

Finally, depletion of RFC2 resulted in one example of a “Hawthorne deletion” (Class 3-6), an interstitial deletion between repeated regions of the MATα and HMRa loci (Hawthorne, 1963). I did not observe any examples of class 3-7, the hallmark of which is amplification of sequences between FS1 and FS2 (Casper et al., 2009). The total ratio of the seven rearrangement classes was 15:4:8:4:6:1:0 (3-1 : 3-2 : 3-3 : 3-4 : 3-5 : 3-6 : 3-7).

2.3.5 Boundaries of rearrangements correlate with Ty retrotransposons, LTRs, tRNA genes, early replication origins and replication termination sites

To determine if the boundaries of rearrangements are correlated with particular genomic features, I performed an enrichment test. I segmented the genome into 5kb bins and scored each bin for the presence or absence of genomic features and breakpoints. For each feature, I determined the fold enrichment of bins that contain both the feature and a breakpoint in comparison to what would be expected if breakpoints were randomly placed into bins (Table 2.3). Consistent with other studies of rearrangement breakpoints in yeast (Argueso et al., 2008; Dunham et al., 2002; Lemoine et al., 2008; Lemoine et al., 2005; Li et al., 2009), our boundaries of rearrangement were significantly enriched at loci with Ty retrotransposons, Long Terminal Repeats (LTRs) and tRNA genes (Table 2.3). As these sites have repetitive sequences, they may represent endpoints of resection that facilitate recombination repair between non-homologous loci. I also found that boundaries of rearrangement were significantly enriched near early replication origins and replication termination sites (Table 2.3). One possibility is that mis-regulation of replication firing and replication fork convergence causes double-stranded DNA breaks that promote rearrangement events.

Table 2.3 Enrichment Analysis of the Correlation between Boundaries of Chromosome Rearrangements (n=14) and Selected Genomic Features

Feature | Fold Enrichment | P-value | False Discovery Rate (FDR) corrected P-value
LTRs | 8.823 | 1.87 x 10^-12 | 3.55 x 10^-11
tRNA genes | 7.495 | 3.76 x 10^-9 | 3.57 x 10^-8
Ty retrotransposons | 7.988 | 0.005 | 0.0344
Termination sites | 7.313 | 0.007 | 0.0333
Early replication origins | 9.355 | 0.0184 | 0.0699
High confidence ARSs | 1.739 | 0.186 | 0.505

2.3.6 Depletion of NSE1 human homolog reduces replication progression

Depletion of NSE1 resulted in a robust chromosome rearrangement phenotype and the most elevated levels of Ddc2 foci from our yeast studies. NSE1 is an essential component of the SMC5-SMC6 complex, which is highly conserved in humans. While this complex plays an important role in DNA repair, it has a less defined role than its counterparts, cohesin (SMC1-SMC3) and condensin (SMC2-SMC4) (Jeppsson et al., 2014). Since I observed an enrichment of yeast DNA replication genes involved in the suppression of rearrangements, I reasoned that NSE1 and its associated complex might play a significant role in DNA replication progression. I used siRNAs to deplete the expression of the NSE1 human homolog in asynchronous U2OS cells and used high-throughput microscopy to monitor the incorporation of a nucleoside analog (EdU) as a measure of DNA replication (Figure 2.8A and B). I observed a significant reduction in the distribution of mean EdU intensities per nucleus following the depletion of NSE1 or of the PCNA positive control relative to a non-targeting (NT) siRNA negative control (P-value <0.01; Mann-Whitney test). I performed this experiment with three biological replicates and observed a significant decrease in the average mean EdU intensity per nucleus following the depletion of PCNA as expected (P-value = 0.02; one-tailed t-test) (Figure 2.8C). While the average mean EdU intensity per nucleus was also reduced following depletion of NSE1, the reduction fell short of our stringent significance cut-off (P-value = 0.08; one-tailed t-test). Nevertheless, this functional contribution of NSE1 to human DNA replication is consistent with a recent report describing a slow DNA replication progression phenotype in SMC5 or SMC6 depleted human cell lines (Gallego-Paez et al., 2014).
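The per-nucleus comparison described above relies on the Mann-Whitney test. The following is a minimal Python sketch of how such a rank-based comparison could be computed; it is not the software used for the analysis in this chapter. The function name `mann_whitney_u`, the use of the large-sample normal approximation without tie or continuity correction, and the intensity values are all assumptions made for illustration.

```python
from math import sqrt, erfc


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Assumption: no tie or continuity correction is applied, which is only
    reasonable for large samples such as per-nucleus measurements.
    Returns (U statistic for sample x, two-sided P-value).
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    p = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return u1, p


# Placeholder intensities (arbitrary units), NOT measured data.
control = [820.0, 910.0, 1005.0, 870.0, 950.0, 990.0]
knockdown = [610.0, 700.0, 655.0, 720.0, 805.0, 690.0]
u_stat, p_value = mann_whitney_u(knockdown, control)
print(f"U = {u_stat:.1f}, two-sided P = {p_value:.4f}")
```

With thousands of nuclei per condition the normal approximation is usually adequate; for small samples an exact test from a statistics package would be the safer choice.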

Figure 2.8 siRNA depletion of hNSE1 in mammalian U2OS cells results in reduced DNA incorporation. A) Representative images following 48 hr transfection with a non-targeting (NT) siRNA control, an siRNA targeting the human homolog of NSE1 or an siRNA pool of four siRNAs targeting PCNA in U2OS cells. EdU (10 μM) was added to media 30 mins prior to paraformaldehyde fixation, click chemistry and staining. EdU and DAPI channels are shown. B) DAPI signal was used for segmentation of the nuclei and each dot on the graph represents the mean intensity per nucleus of the images in (A). Red lines reflect the average EdU mean intensity per nucleus of each sample and a significant reduction was observed following knockdown with siNSE1 relative to the control (*** = p<0.0001 by Mann-Whitney test). C) Bar graph of the average EdU mean intensity per nucleus in three biological replicates of the experiment described in (A). Error bars represent standard error.

2.4 Discussion

2.4.1 Comparison of conditional allele screens for genome instability mutants

The collection of tetracycline-regulated promoter conditional alleles (Tet alleles) encompasses 773 essential genes (63%), out of a total of 1135 that are annotated in SGD (Mnaimneh et al., 2004; Yu et al., 2006). I quantified the accumulation of Ddc2 damage foci in the 217 strains of the Tet allele collection that showed accumulation in S or G2 phase following promoter shut-off (Yu et al., 2006). I identified 47 genes that function to protect the genome from spontaneous DNA damage, 20 of which did not have previously annotated roles in the maintenance of genome stability. A similar screen for Ddc2 foci accumulation was recently reported, using a set of 592 temperature-sensitive conditional alleles representing 399 essential genes (Li et al., 2011). Of the 114 genes that were in common to both sets of conditional alleles (Figure 2.9A), mutants of ten essential genes displayed elevated levels of Ddc2 foci in both screening efforts, a small but
significant overlap (P-value of 0.0043; hypergeometric test) (Figure 2.9B). Fifteen genes were identified in our screen that were negative in Li et al., and 15 genes were identified in Li et al. that were negative in our screen (Figure 2.9B). Altogether, I identified 37 genes with elevated Ddc2 foci that were not identified by Li et al.

Figure 2.9 Comparison of Ddc2 foci and a-like faker genome instability screens. A) Comparison of genes screened in our study with those screened by Li et al. using ts-alleles. B) Comparison of Ddc2 foci positives in our study and in Li et al. with the set of genes screened in both studies. C) Comparison of genes screened in our study with those screened by Stirling et al. using a combination of ts-alleles and DAmP alleles. D) Comparison of a-like faker positives in our study and in Stirling et al. with the set of genes screened in both studies.

I also screened for the a-like faker chromosome instability phenotype in 208 Tet alleles that I assayed for Ddc2 foci formation. Of the 68 genes that were in common with a recent a-like faker screening effort using ts alleles (Stirling et al., 2011) (Figure 2.9C), the overlap of 11 essential gene mutants with elevated levels of a-like fakers in both screening efforts was not statistically significant (P-value of 0.064 by the hypergeometric test) (Figure 2.9D). I identified 59 essential genes that contribute to the suppression of genome instability that were not identified by Stirling et al. Focusing on the genes assayed in both screens, there were seven false negatives in our screen of Tet alleles and 22 false negatives in the Stirling et al. screen of 364 ts-alleles. These, and similar false negatives in the Ddc2 foci screens, likely represent cases where the false negative allele was not sufficiently compromised to display a significant phenotype. Finally, a recent screen was performed to determine the extent of Rad52 foci in 305 essential chromosome instability genes. Comparison with our Ddc2 foci screen revealed that only the depletion of CDC9, CDC45, MCM5, NSE1 and PSF2 resulted in elevated levels of both Rad52 and Ddc2 foci (Stirling et al., 2012). Given that each conditional allele collection is currently incomplete, that a positive score with one kind of allele is at best weakly predictive of a positive score with a distinct allele, that the overlap between screens of different allele collections is small, and that the functions of essential genes are likely perturbed to varying degrees within any one collection, screening complementary collections of different kinds of alleles will ultimately be necessary to define the complete cohort of essential genes that maintain genome stability.

A common theme in the overlap among these screens of essential gene collections is enrichment for genes that function in DNA replication, indicating that among essential processes, replication defects are strong contributors to genome instability. Both Ddc2 foci screens were enriched for genes with DNA replication as their GO process annotations, relative to the S. cerevisiae genome (14.5 and 25.3 fold enrichment for this study and Li et al., 2011, respectively; Bonferroni corrected P-values of 7.41 x 10^-12 and 2.51 x 10^-27). Similarly, both a-like faker screens displayed enrichment for DNA replication (10.8 and 17.5 fold enrichment for this study and Stirling et al., 2011, respectively; Bonferroni corrected P-values of 1.87 x 10^-13 and 3.20 x 10^-19). There is precedent that data from these yeast functional screens might be useful for predicting novel DNA replication factors in higher organisms. In this study, NSE1 was identified in both the Ddc2 foci and a-like faker screens. Subsequently, I uncovered a role for the human NSE1 homolog in mammalian replication progression. It will be of great interest to test the involvement of other human homologs from our yeast functional screens in replication progression and the maintenance of genome stability.

I compared the functional differences between essential gene alleles that had elevated Ddc2 foci and those that had increased frequency of illegitimate mating, as this is the first time the same set of essential genes has been analyzed with both assays. Eighteen alleles displayed only elevated levels of Ddc2 foci and 63 alleles had only the a-like faker chromosome instability phenotype, while 29 alleles had elevated levels of both. This core of 29 genes was enriched for DNA replication function relative to the genome (20.4 fold enrichment; Bonferroni corrected P-value of 2.04 x 10^-12), again indicating the primary role of replication errors in genome rearrangements. By contrast, the alleles that displayed only the a-like faker chromosome instability were enriched for genes involved in transcription initiation (15.9 fold enrichment; Bonferroni corrected P-value of 6.22 x 10^-6). By analyzing the overlap between the screens, I found that strains with increased illegitimate mating tended to have a larger fraction of cells with spontaneous Ddc2 foci, and that strains with spontaneous Ddc2 foci formation above our cutoff were more likely to have increased illegitimate mating, which suggests some predictive value of one phenotype for the other. However, the overlap between the Ddc2 focus screen and the a-like faker screen was far from absolute. Therefore, consistent with a recent study (Stirling et al., 2011), I suggest that the complete set of genes with roles in genome maintenance will be obtained not only by screening different allele collections, but also by the application of multiple screening methods.

2.4.2 Essential genes involved in DNA replication are critical for genome stability

I observed a prominent role for DNA replication genes in the suppression of chromosome rearrangements. Defects in a range of distinct replication functions, including initiation (CDC45, DBF4, DPB11, MCM4, MCM5, MCM7, PSF2), elongation (CDC45, DNA2, MCM4, MCM5, MCM7, POL2, POL30, PSF2, RFC2, RFC5) and termination (UBC9) caused spontaneous DNA damage and chromosome rearrangements, suggesting that each stage of replication is crucial to the maintenance of genome stability. The rearrangements that I observed likely involve DNA DSBs and there are several mechanisms by which defects in replication could contribute to DSB formation.

Reduced levels of proteins involved in pre-replicative and pre-initiation complex formation at replication origins likely result in fewer replication forks emanating from fewer origins, increasing the likelihood that regions of the genome might remain un-replicated and become susceptible to breakage. Consistent with this view, reduced levels of activated origins of replication and elevated frequencies of gross chromosomal rearrangements have been observed in strains with mutations in CDC6, CLN2, ORC2 or SIC1, genes involved in origin licensing (Bielinsky, 2003; Bruschi et al., 1995; Lengronne and Schwob, 2002; Shimada et al., 2002; Tanaka and Diffley, 2002a). Depletion of DNA replication elongation factors might disrupt the kinetics of replication in S-phase, resulting in replication fork stalling and chromosome rearrangements as has been noted in RFA1 mutants (Chen et al., 1998). Defects in elongation could also disrupt the coordinated movements of the replication and transcription machinery along the DNA, leading to increased levels of collision and DNA breakage (Deshpande and Newlon, 1996; Ivessa et al., 2003) (Figure 2.10A). Another consequence of defective replication elongation could be the delay of Okazaki fragment synthesis resulting in the accumulation of ssDNA on the lagging strand, secondary structure formation, and blocks to replisome progression (Sogo et al., 2002) (Figure 2.10B). This type of mechanism has been proposed to allow the formation of hairpin structures at inverted Ty retrotransposon repeats, causing chromosome rearrangements when DNA polymerases α and δ are depleted (Casper et al., 2009; Lemoine et al., 2008; Lemoine et al., 2005). Depletion of polymerase ε (POL2) in our study could cause fragile site instability by a similar mechanism. Finally, when two replication forks converge in the last stages of replication, termination structures need to be accurately resolved prior to mitosis in order to prevent DSBs (Figure 2.10C). Both Ubc9 and the Nse1-containing Smc5/6 complex have been proposed to impinge on the resolution of termination structures (Branzei et al., 2006) and I observe that rearrangement breakpoints are enriched at termination sites (Fachinetti et al., 2010) (Table 2.3), suggesting that defective termination could contribute to accumulation of chromosome rearrangements. Together, our results emphasize the importance of defects in replication initiation, elongation and termination in causing DNA damage and chromosome rearrangements.

Figure 2.10 Mechanisms by which genome instability occurs in replication deficient mutants. A) tRNA genes and replication forks are clustered in proximity to Ty retrotransposons. The transcription machinery creates obstacles for replication fork progression and could lead to DSB formation and resection to the repeated elements. B) Secondary structure formation involving repeated Ty retrotransposons (arrows) on the lagging strand causes replication fork stalling, subsequent replisome dissociation and the formation of double stranded breaks (DSBs). C) Failure to resolve termination structures at converging replication forks that flank Ty retrotransposons is a potential source of DSBs.

2.4.3 Ty retrotransposons and tRNA genes promote chromosome rearrangements

Each of the 38 rearrangement breakpoints that I mapped in this study, regardless of the specific function of the gene that was depleted, was proximal to a Ty retrotransposon element, highlighting the critical role of these repetitive elements in chromosome rearrangements in yeast. Chromosome rearrangements involving Ty retrotransposons are observed at a basal level in wildtype S. cerevisiae strains and a number of experimental connections between rearrangements and Ty elements have been made [reviewed in (Garfinkel, 2005; Lesage and Todeschini, 2005; Mieczkowski et al., 2006)].

Ty retrotransposons could function in at least two ways to promote chromosome rearrangements. They could represent sites of chromosome breakage (Lemoine et al., 2005) or they could provide homologous sequences for recombination-mediated repair of breaks that occur at distal sequences (Hoang et al., 2010). The yeast fragile site FS2 is thought to be a site of chromosome breakage. In the proposed model, single stranded DNA (ssDNA) at inverted pairs of Ty retrotransposons, such as FS2, allows formation of secondary structures that inhibit the progression of the replisome, causing replication fork stalling and DNA breakage (Casper et al., 2009; Lemoine et al., 2008; Lemoine et al., 2005) (Figure 2.10B). However, of the 38 chromosome rearrangement mutants that I mapped, I observed only three (7.9%) with boundaries within the FS2 region, suggesting that other modes of chromosome breakage predominate in our study.

Examination of the boundaries of chromosome rearrangements in replication mutants revealed significant enrichment of nearby early replication origins and tRNA genes (Table 2.3). Transcription complexes on tRNA genes can impede an on-coming replisome, thereby promoting replication fork pausing and DSB formation (Figure 2.10A) (Deshpande and Newlon, 1996; Ivessa et al., 2003), suggesting that additional breakage could result from transcription-replication collisions. This combination of features surrounding breakpoints of rearrangement is consistent with those observed at natural evolutionary breakpoints when S. cerevisiae is compared to related yeasts, as well as breakpoints observed during artificial evolution of S. cerevisiae (Di Rienzi et al., 2009; Dunham et al., 2002; Kellis et al., 2003), suggesting that in addition to replication fidelity, these features are important determinants of instability.

2.4.4 Parallels with human common fragile sites

There are a number of parallels between common fragile sites in yeast and in humans. Inhibition or depletion of DNA polymerases (Glover et al., 1984; Lemoine et al., 2008; Lemoine et al., 2005) or DNA damage checkpoint proteins (Arlt et al., 2004; Casper et al., 2002; Cha and Kleckner, 2002; Durkin et al., 2006; Focarelli et al., 2009; Raveendranathan et al., 2006; Schwartz et al., 2005; Vernon et al., 2008) can induce chromosome breaks at common fragile sites in both yeast and humans. Although human common fragile sites lack distinctive sequence similarities, they have attributes that impair replication progression (Glover et al., 2005; Glover et al., 1984; Zlotorynski et al., 2003), a shared property of yeast fragile sites (Admire et al., 2006; Cha and Kleckner, 2002; Deshpande and Newlon, 1996; Ivessa et al., 2003; Lemoine et al., 2005; Raveendranathan et al., 2006; Roeder and Fink, 1980). Additionally, recent studies of the human common fragile site FRA3B have suggested that instability at this site is not due to replication fork slowing or stalling, but rather is due to a paucity of replication initiation events (Letessier et al., 2011). In our studies, early firing origins of replication are enriched in regions with rearrangement breakpoints. Depletion of replication initiation factors could disrupt origin firing at these sites and thereby contribute to instability in a manner analogous to FRA3B. It will be of great interest to test whether the general role of replication proteins in suppressing chromosome rearrangements, which I observed in yeast, extends to the maintenance of human common fragile sites.

2.5 Methods

2.5.1 Yeast strains and media

Tet allele strains were constructed as described previously (Mnaimneh et al., 2004). The genotype of the wild-type Tet allele strain, R1158, is MATa URA3::CMV-tTA his3Δ1 leu2Δ0 met15Δ0. Using standard genetic methods, 217 MATα Tet allele strains were engineered to contain YFP-Ddc2 marked with a nourseothricin (Nat) resistance gene. Genotypes for strains used in this study are listed in Table 2.4. The essential genes that were studied are listed in Appendix Table S1. Standard yeast media and growth conditions were used unless otherwise specified (Sherman, 1991).

Table 2.4 Yeast strain genotypes

Strain | Genotype | Reference
R1158 | MATa URA3::CMV-tTA his3Δ1 leu2Δ0 met15Δ0 | Mnaimneh et al., 2004; Yuen et al., 2007
YFG¹-TetO7 strains | MATα YFGpr::kanR-TetO7-TATACYC1 Ddc2:YFP[natMX] URA3::CMV-tTA his3Δ1 leu2Δ0 met15Δ0 | Constructed for this paper
MCY13 | MATα lys1 | Chang et al., 2005
MCY14 | MATa lys1 | Chang et al., 2005
1225α | MATα his4-15 leu2 thr4 ura3-52 trp1 lys | Lemoine et al., 2005; Lemoine et al., 2008
YFG-TetO7 strains/1225α illegitimate diploid strains | MAT-/α YFGpr::kanR-TetO7-TATACYC1/YFGpr DDC2:YFP[natMX]/DDC2 URA3::CMV-tTA/ura3-52 his3Δ1/HIS3 leu2Δ0/LEU2 met15Δ0/MET15 HIS4/his4-15 LEU2/leu2 THR4/thr4 TRP1/trp1 LYS/lys | Constructed for this paper

¹ YFG denotes "Your Favourite Gene"

2.5.2 Fluorescence microscopy

Tet allele strains were grown in YPD liquid media at 30°C. Samples were divided into two cultures and grown in parallel in the presence and absence of 10 μg/mL doxycycline for four additional hours at 23°C. Intracellular localization of Ddc2-YFP was determined by fluorescence microscopy as previously described for Rad52-YFP (Chang et al., 2005; Lisby et al., 2004; Lisby and Rothstein, 2004). Ddc2 foci were quantified in at least 100 cells for each strain. Ddc2 foci in wild type cells were analyzed four times and used to calculate a standard deviation. Tet allele strains that had Ddc2 foci levels that were at least three standard deviations greater than wild type were scored as positive.
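The scoring rule above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function name `score_positive` is hypothetical, the threshold is taken as the wild-type mean plus three sample standard deviations (the text does not specify sample versus population standard deviation), and the foci fractions are invented values, not measurements from this study.

```python
from statistics import mean, stdev


def score_positive(wild_type_fractions, strain_fraction, n_sd=3.0):
    """Return True if a strain's fraction of cells with foci is at least
    n_sd standard deviations above the wild-type mean.

    Assumption: the sample standard deviation of the replicate wild-type
    measurements is used.
    """
    wt_mean = mean(wild_type_fractions)
    wt_sd = stdev(wild_type_fractions)
    return strain_fraction >= wt_mean + n_sd * wt_sd


# Invented example: fraction of cells with Ddc2 foci in four wild-type replicates.
wild_type = [0.04, 0.05, 0.06, 0.05]
for name, fraction in [("hypothetical strain A", 0.20), ("hypothetical strain B", 0.06)]:
    print(name, "positive" if score_positive(wild_type, fraction) else "negative")
```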

2.5.3 Illegitimate mating assays

Tet allele strains and the R1158 wild type strain were grown in parallel for 24 hours on YPD solid media either containing or lacking 10 μg/mL of doxycycline. A standard mating assay was performed with tester strains MCY13 (MATα – legitimate mating) and MCY14 (MATa – illegitimate mating) under the same media conditions in which the strains were grown. Diploids were isolated by replica plating on minimal media.

In the quantitative form of this mating assay, Tet allele strains and the R1158 wild type strain were grown in parallel for 24 hours in YPD liquid media containing or lacking 10 μg/mL doxycycline. Strains were mixed with five-fold excess of MCY13, MCY14, or 1225α (MATα his4 thr4) tester strains and plated on YPD solid media. After 5 hours, cells were collected, re-suspended in water and plated on diploid selection media. Independent illegitimate diploids were isolated after the mating of the Tet allele strains with the 1225α tester strain. For each mating experiment, approximately 100 diploids were isolated and tested for their ability to grow in the presence or absence of histidine or threonine. This assay was repeated two times. Viability of each strain following growth in doxycycline was confirmed by plating on YPD. Only MCM7 (10%), NUF2 (30%), and UBC9 (50%) had less than 100% viability following growth in doxycycline.

2.5.4 Array comparative genome hybridization

Genomic DNA was extracted (Qiagen) from independent illegitimate diploids and wildtype diploids isolated from the mating assay. Comparative Genome Hybridization (CGH) on a microarray was performed as previously described (Dion and Brown, 2009) using S. cerevisiae whole genome tiling microarrays (Affymetrix). Signal intensities of the experimental and wild type control samples were normalized and compared using the Tiling Analysis Software (Affymetrix). Genomic patterns were mapped and analyzed using the Integrated Genome Browser Software (Affymetrix).

2.5.5 CHEF gel electrophoresis and Southern blot analysis

Contour-clamped homogeneous electric field (CHEF) gels were used to examine intact chromosomes of illegitimate diploids isolated from the mating assay. CHEF gel analysis was performed as described previously (Desany et al., 1998). A 1.2% agarose gel was run at 8 V/cm using pulse times of 120 seconds for 30 hours at 14°C in 0.5X TBE buffer. PCR purified fragments were radio-labeled by random priming (Stratagene) and used as hybridization probes for Southern blot analysis. PCR primers designed for probe construction are listed in Table 2.5.

Table 2.5 Primers used to generate probes in Southern blot analysis

Primer Name | Primer Sequence | Chromosome | Probe length (bp) | Experiment
HIS4-F | CATAGCAGCAACGGCTTG | III | 1533 | confirm rearrangement
HIS4-R | CTCCAGTTCTCCAAAGAGGAAGA | | |
YCK3-F | CCCAACGATCTTCACAACACA | V | 1570 | confirm rearrangement
YCK3-R | CAACAGCAACAAAAACAGCA | | |
UFD1-F | TGGCTTTAGTTCTTTTGGCG | VII | 1070 | confirm rearrangement
UFD1-R | TTTCGATTACTTCCGGGCTT | | |
TUF1-F | GCTCGACAAAGGTAACAGACA | XV | 1359 | confirm rearrangement
TUF1-R | ACTCCAGTTGCATCAATAAGT | | |
HUT1-F | CATCCAGTTTGGTGATCTGTG | XVI | 1003 | confirm rearrangement
HUT1-R | GCAGATTTTGCCTTCGGTAT | | |
FS2-1-F | GAAAGTTATGTGTGGAGTTCG | III | 463 | confirm FS2 position
FS2-1-R | CTGACATACCTCATTTTGAGA | | |
FEN2-F | ATTCCGCTTGATATTGCCGT | III | 1184 | confirm FS2 position
FEN2-R | GTCGGTTTCACCAATGCATA | | |

2.5.6 Restriction Digestion and Sequencing Analysis of FS1 and FS2

Genomic DNA was isolated (Qiagen) from wildtype strains R1158 and BY4741 and digested with EcoRI and XbaI (New England Biolabs) using the suggested conditions. Digested fragments were separated on a 1% agarose gel and hybridized with FEN2 and FS2-2 probes for Southern blot analysis (Table 2.5). The 5’ and 3’ ends of fragile site 1 (FS1) and FS2 were PCR amplified and sequenced. PCR primers used for both amplification and sequencing are listed in Table 2.6.

Table 2.6 Primers used for PCR amplification and sequencing of FS1 and FS2

Primer Pairs | Primer Sequence
FS1-Ty1A5'-For | TCGTACGTCTCTCGGAATTG
FS2-Ty1B5'-Rev | TCTTCTGTTTTGGAAGCTGAAA
FS2-Ty1B3'-For | TGACAAAACCTCTTCCGATAAAA
FS1-Ty1B3'-Rev | CGAGAGATTAATTTTTGTTTTCAGAT
FS2-Ty1A5'-For | GAGAAACCACCAGTAGCGTTC
FS2-Ty1A5'-Rev | CGAGACCAAGAAGAACATTGC
FS2-Ty1A3'-For | CTGGAACAGCTGATGAAGCA
FS2-Ty1A3'-Rev | CGGCAACCAAAACGTAATCT
FS2-Ty1B5'-For | GTTGCCGCAAACCAAAAA
FS2-Ty1B5'-Rev | TCTTCTGTTTTGGAAGCTGAAA
FS2-Ty1B3'-For | TGACAAAACCTCTTCCGATAAAA
FS2-Ty1B3'-Rev | CTGACGCAGTTCTTCATTGG

2.5.7 Enrichment Analyses

S. cerevisiae chromosomes were divided into 5kb bins. For each bin, the presence or absence of breakpoints and genomic features was tabulated. Various genomic features (Di Rienzi et al., 2009) and replication termination sites (Fachinetti et al., 2010) from previous datasets were used for analysis. For each feature, the total number of bins with both the feature and a breakpoint was determined. To test for enrichment of breakpoints and each feature, a hypergeometric distribution was assumed. P-values <0.05 were considered as evidence of a correlation and P-values <0.05 after a False Discovery Rate (FDR) correction were considered strongly significant.
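The binning and enrichment procedure described above can be sketched in Python as follows. This is an illustrative reconstruction, not the code used for Table 2.3. All function names are hypothetical, the upper-tail form of the hypergeometric test is an assumption, the Benjamini-Hochberg procedure is assumed as the FDR correction (the specific FDR method is not stated in the text), and the coordinates and bin count in the example are invented.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n bins are drawn without replacement from N bins,
    K of which carry the feature (upper tail; an assumed test direction)."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)


def to_bins(positions, bin_size=5000):
    """Map (chromosome, coordinate) pairs to a set of (chromosome, bin index)."""
    return {(chrom, pos // bin_size) for chrom, pos in positions}


def enrichment(total_bins, feature_bins, breakpoint_bins):
    """Fold enrichment of bins holding both a feature and a breakpoint over
    the random expectation, plus the hypergeometric P-value."""
    both = len(feature_bins & breakpoint_bins)
    expected = len(feature_bins) * len(breakpoint_bins) / total_bins
    fold = both / expected if expected else float("nan")
    p = hypergeom_sf(both, total_bins, len(feature_bins), len(breakpoint_bins))
    return fold, p


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented inputs for illustration only (not data from this study).
TOTAL_BINS = 2400  # assumed number of 5 kb bins in the genome
features = to_bins([("III", 84000), ("III", 148500), ("V", 443000), ("XVI", 56000)])
breakpoints = to_bins([("III", 149900), ("V", 441200), ("VII", 700100)])
fold, p = enrichment(TOTAL_BINS, features, breakpoints)
print(f"fold enrichment = {fold:.1f}, P = {p:.3g}")
print(benjamini_hochberg([p, 0.2, 0.04]))
```

Scoring presence or absence per bin, rather than counting events, means that several breakpoints or features falling in the same 5 kb window contribute only once, which matches the tabulation described in the text.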

2.5.8 High-throughput microscopy of EdU incorporation

U2OS cells were grown in McCoy’s 5A media supplemented with 10% fetal bovine serum, 100 U/ml penicillin and 100 μg/ml streptomycin. Cells were transfected on clear-bottom 96-well plates (Fisher Scientific) with 40 nM of siRNA for 48 hours, using Lipofectamine RNAiMAX (Life Technologies). A negative non-targeting control siRNA (Dharmacon, GE Healthcare) and a custom designed siRNA sequence targeting human NSE1 (CCGGCTTTGCGTCTTCCAACAA) (Taylor et al., 2008) were used. 10 μM EdU was added to the culture media for the last 30 mins of transfection. Cells were washed with PBS, fixed with 4% paraformaldehyde for 10 min and permeabilized with 0.3% Triton for 10 min. EdU was detected by performing the Click-iT reaction (Invitrogen) with Alexa-Fluor 488 azide (Life Technologies). Cells were blocked with 1X PBS buffer containing 10% goat serum, 0.5% NP-40 and 0.5% saponin for 30 mins at room temperature. Cells were incubated with γH2AX antibody (EMD Millipore JBW301; 1:10000) for 1 hr at room temperature. After washing with PBS, cells were incubated in goat anti-mouse IgG-Alexa546 (Life Technologies; 1:1000) and DAPI (0.5 μg/mL) for 1 hour at room temperature. Images were acquired from each well using an In Cell Analyzer 6000 (GE Healthcare Life Sciences) at 10x magnification at non-saturating exposure times. Columbus image analysis software (PerkinElmer) was used to segment nuclei with the DAPI channel and the mean EdU signal intensity per nucleus was calculated. Data were plotted using Prism software (GraphPad).
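The final quantification step, the mean EdU signal per segmented nucleus, can be illustrated with a small Python sketch. This does not reproduce the Columbus software; it assumes that segmentation of the DAPI channel has already produced a label image in which 0 marks background and each positive integer marks one nucleus. The function name and the tiny example arrays are hypothetical.

```python
from collections import defaultdict


def mean_intensity_per_nucleus(labels, edu):
    """Average EdU pixel intensity for each labeled nucleus.

    labels: 2D list of ints (0 = background, k > 0 = nucleus k), assumed to
            come from a prior segmentation of the DAPI channel.
    edu:    2D list of EdU pixel intensities with the same shape.
    Returns a dict mapping nucleus label to mean intensity.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for label_row, edu_row in zip(labels, edu):
        for label, value in zip(label_row, edu_row):
            if label:  # skip background pixels
                sums[label] += value
                counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


# Toy 2 x 3 image with two nuclei; values are placeholders, not real data.
toy_labels = [[0, 1, 1],
              [2, 2, 0]]
toy_edu = [[9, 10, 20],
           [4, 6, 9]]
print(mean_intensity_per_nucleus(toy_labels, toy_edu))
```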

Chapter 3 Identification of proteins enriched at newly replicated DNA

DATA ATTRIBUTION:

Wade Dunham (Gingras Lab) performed the mass spectrometry. Jordan Young (Durocher Lab) acquired the microscopy images on the high-throughput microscope.

ACKNOWLEDGEMENTS:

I thank Wade Dunham for assistance with SAINT analysis and Xiaohan Yang for technical assistance with siRNA transfections.

3 Identification of proteins enriched at newly replicated DNA

3.1 Summary

Accurate DNA replication is important for the maintenance of genome stability and cell cycle progression. In the past, defining proteins at DNA replication forks in an unbiased manner has been difficult in mammalian cells. I used iPOND to catalog a novel set of proteins that interact with nascent DNA at the replication fork. I identified a candidate list of 153 proteins that is enriched in biological processes, such as DNA replication, cell cycle and DNA repair. While I detected components of the canonical replisome machinery, I also identified 113 novel proteins that have not been detected using iPOND in the past. Furthermore, I determined that depletion of a subset of candidates resulted in a significantly reduced replication progression, supporting the utility of iPOND for defining new proteins involved in DNA replication.

3.2 Introduction

Faithful replication of the genome is essential for the stable transmission of genetic information. Eukaryotic DNA duplication requires the coordinated unwinding of chromatin ahead of replication forks and the restoration of chromatin afterwards. As numerous replication forks progress simultaneously during S-phase, this process is highly coordinated and requires the rapid recruitment of replication machinery (Branzei and Foiani, 2009). Cells may encounter DNA lesions, insufficient nucleotides, or other forms of replication fork stress that may result in replication fork stalling during the normal cell cycle (Zeman and Cimprich, 2014). Prolonged pausing results in the recruitment of protein complexes and the activation of the DNA damage response. A comprehensive understanding of the intricate dynamics of protein organization at active replication forks during S-phase is important in the study of DNA replication. Failure to coordinate this dynamic sequence of protein interactions in each cell cycle can lead to mutations, epigenetic changes or chromosomal aberrations that contribute to aging and diseases such as cancer (Branzei and Foiani, 2009).


Duplication of the genome is mediated by a core protein complex termed the replisome (Bell and Dutta, 2002; Johnson and O'Donnell, 2005). Key components consist of polymerases, helicase, primase, Okazaki fragment processing machinery and PCNA. Chromatin remodelers, cell cycle progression factors and DNA repair proteins are also part of the replication machinery and act in proximity to the replisome. In vivo evidence of their cooperation can be visualized by the co-localization of proteins at “replication factories” by fluorescence microscopy. One important example is the co-localization of DNA polymerase with discrete nuclear PCNA foci during S-phase (Fuss and Linn, 2002). However, this technique is limited by its low throughput, its need for highly specific antibodies and its dependence on protein expression levels. As a result, even essential replisome proteins such as the MCM2-7 helicase cannot be visualized at sites of DNA synthesis (Dimitrova et al., 1999; Laskey and Madine, 2003).

Affinity purification of known replisome components coupled with mass spectrometry is an alternate method for defining novel replication factors. This technique has previously been used to assess interactions of PCNA (Ohta et al., 2002) and of many other replisome proteins. However, affinity purification-mass spectrometry approaches share the limitations of co-localization experiments, in that they require high levels of protein expression and high antibody specificity. Furthermore, chromatin-bound fractions are relatively insoluble and are difficult to immunoprecipitate without harsh conditions that may disrupt protein interactions.

Restrictions of the described techniques have prevented the field from characterizing replication proteins and the dynamics of their recruitment in a systematic manner. A new strategy has been developed to affinity purify newly replicated DNA for analysis of the associated proteins (termed isolation of proteins on nascent DNA, or iPOND) (Kliszczak et al., 2011; Lopez-Contreras et al., 2013; Sirbu et al., 2012; Sirbu et al., 2011). Cells are grown in media containing a thymidine analog, 5-ethynyl-2’-deoxyuridine (EdU), which is incorporated into newly replicated DNA (Figure 3.1). Distinct from other thymidine analogs, EdU contains an alkyne functional group that can be exploited for covalent linkage to a biotin-azide using Click chemistry (Moses and Moorhouse, 2007). A standard formaldehyde cross-linking procedure preserves protein-DNA interactions in the cells. Following cell lysis, streptavidin-coated beads are used to capture nascent DNA and associated protein complexes. Reversing the cross-linking and eluting from the nascent DNA yields an enriched sample of proteins whose composition can be analyzed by western blot. To distinguish proteins that are specifically bound at DNA replication forks from chromatin factors that are generally bound to DNA, EdU-labelled DNA can be chased away from the replication fork by addition of thymidine to the media. Furthermore, this procedure can be coupled to mass spectrometry to identify the full set of proteins that interact with nascent DNA (Kliszczak et al., 2011; Sirbu et al., 2013).

I have used iPOND-MS to identify a novel set of replication factors. Distinct from previous studies, I used the SAINT algorithm (Choi et al., 2011) to define a candidate list of proteins that interact with newly replicated DNA at replication forks. DNA replication, cell cycle and DNA repair were among the GO Biological Processes that were significantly enriched in this dataset. Additionally, I defined 113 novel proteins that have not been associated with replication forks in prior studies. Using a secondary phenotypic screen, I found that depletion of a subset of candidate proteins resulted in reduced replication progression, supporting the use of this method for defining novel proteins involved in DNA replication. I have generated a large resource that will facilitate future functional studies in the fields of DNA replication and genome maintenance.

Figure 3.1 Schematic of iPOND methodology. EdU is added to cultured cells and incorporated into nascent DNA in vivo. Cells are treated with formaldehyde to cross-link protein-DNA complexes. Click chemistry is used to covalently conjugate biotin to EdU-labelled DNA. Cells are lysed in denaturing conditions and sonicated. Biotin-labelled protein-DNA complexes are purified using streptavidin beads. Eluted proteins are analyzed by immunoblotting or mass spectrometry. Adapted from Sirbu et al. (2012).


3.3 Results

3.3.1 Isolating known replisome proteins using iPOND

I used the iPOND methodology to define and characterize novel proteins associated with replication forks (Sirbu et al., 2012; Sirbu et al., 2011). HEK293 cells were cultured in EdU-containing media for 10 mins, after which the media was replaced with thymidine-containing media for 10 mins or 30 mins. Each sample was analyzed by iPOND to distinguish fork-associated proteins from those that bind generally to chromatin. EdU pulsed samples represent nascent DNA directly at the replication fork, while the thymidine chased samples represent maturing chromatin at increasing distances from the replication fork (Figure 3.2A). Core histone H3 protein levels were equivalent in all samples tested, regardless of the distance of the EdU label from the replication fork (Figure 3.2B). This supports its expected role as a general chromatin-bound protein. By contrast, a known replisome component, PCNA, was detected at high levels in the EdU pulsed sample, but its levels were markedly reduced in thymidine chased samples (Figure 3.2B). This is consistent with the specific role of PCNA at the replication fork. Together, my results demonstrate that the iPOND technique has the sensitivity to distinguish replisome components from general chromatin-associated factors.


Figure 3.2 Enrichment of replisome protein, PCNA, on nascent DNA by iPOND. A) Schematic of the iPOND procedure used to identify replisome proteins. Following an EdU pulse, thymidine is added to the culture media to chase the EdU-labelled DNA away from the replication fork. The thymidine chase control samples can be used to distinguish general chromatin binding proteins from replication fork associated proteins. Schematic adapted from Sirbu et al. (2012). B) HEK293 cells were incubated with EdU for 10 mins. Following a wash step, cells were incubated in thymidine for 0, 10 or 30 mins, as indicated. Eluted proteins from the iPOND methodology were analyzed by western blot. Equal levels of protein are detected in each input sample. No proteins were detected in the iPOND negative control lacking Click chemistry. PCNA was enriched on nascent DNA at the replication fork (0 min), but diminished in thymidine chase samples (10 min; 30 min). The general chromatin binding protein, H3, was detected in all samples.


3.3.2 Determining proteins enriched on nascently replicated DNA with iPOND-MS

I used iPOND coupled with mass spectrometry to identify replication fork associated proteins. I collected cells that were pulsed for 10 mins with EdU for analysis of replication fork associated proteins. Cells with a 30 min thymidine chase were chosen for comparison, as these samples had negligible levels of PCNA by western blot, indicating sufficient distance from a replication fork (Figure 3.2B). As the typical mammalian replication fork speed is 1-2 kb per min (Mechali, 2010), it is likely that I am capturing proteins that are 0-20 kb and 40-80 kb from the forks in the EdU pulse and thymidine chase samples, respectively. Additionally, cells pulsed with EdU, but not treated with Click chemistry to covalently attach biotin to the incorporated EdU, were used as a negative control in mass spectrometry analysis. I used the “Significance Analysis of INTeractome” (SAINT) label free quantification (LFQ) analysis tool (Choi et al., 2011) to assign confidence scores to each protein. I chose candidates that had >2 spectral counts in two biological replicates and a stringent significance cutoff SAINT score of >0.9. SAINT normalizes spectral counts to the length of the proteins and to the total number of spectra in the purification, and it uses the negative control to estimate the spectral count distribution for false interactions. Prey proteins with a SAINT score >0.9 in the EdU pulsed sample and a score of <0.9 in the thymidine chased sample were considered candidate nascent DNA interactors. For prey proteins with a SAINT score >0.9 in both samples, the ratio of the summed spectral counts from two biological replicates (EdU pulsed/thymidine chased) was used as a measure of protein enrichment at the active fork, and a z-score cut-off of >0.5 was considered significant. Using these criteria, I identified 153 proteins enriched at replication forks (Appendix Table 3). Of these, 31 play a role in DNA replication (Figure 3.3A). These included known replisome proteins such as MCM helicase subunits, DNA polymerases, primase, PCNA, RFC clamp loader complex, replication protein A (RPA) subunits and Okazaki fragment processing proteins. These results suggest that our approach has the ability to capture key players at the replication fork.
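The selection rules above can be written out as a short decision procedure. The following is a minimal sketch only: the `Prey` record, its field names and the example values are hypothetical, the enrichment z-score is taken as a precomputed input, and the handling of a thymidine chase score of exactly 0.9 (not specified in the text) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

SAINT_CUTOFF = 0.9   # SAINT score cut-off stated in the text
MIN_SPECTRA = 2      # candidates required >2 spectral counts per replicate
Z_CUTOFF = 0.5       # enrichment z-score cut-off stated in the text


@dataclass
class Prey:
    """Hypothetical record for one prey protein (field names are assumptions)."""
    name: str
    edu_counts: Tuple[int, int]      # spectral counts, two EdU pulse replicates
    saint_edu: float                 # SAINT score, EdU pulsed sample
    saint_thy: float                 # SAINT score, thymidine chased sample
    ratio_z: Optional[float] = None  # z-score of EdU/thymidine count ratio


def is_candidate(prey: Prey) -> bool:
    """Apply the stated filters in order; returns True for an iPOND candidate."""
    # Require more than MIN_SPECTRA spectral counts in both replicates.
    if any(count <= MIN_SPECTRA for count in prey.edu_counts):
        return False
    # Require a confident SAINT score in the EdU pulsed sample.
    if prey.saint_edu <= SAINT_CUTOFF:
        return False
    # Confident at the fork but not on chased chromatin: candidate.
    # Assumption: a chase score of exactly 0.9 falls through to the ratio test.
    if prey.saint_thy < SAINT_CUTOFF:
        return True
    # Confident in both samples: fall back on the enrichment z-score.
    return prey.ratio_z is not None and prey.ratio_z > Z_CUTOFF


# Hypothetical example values, not taken from the dataset.
fork_only = Prey("PREY_A", (12, 9), saint_edu=1.0, saint_thy=0.3)
both_enriched = Prey("PREY_B", (40, 35), saint_edu=1.0, saint_thy=1.0, ratio_z=1.2)
both_flat = Prey("PREY_C", (40, 35), saint_edu=1.0, saint_thy=1.0, ratio_z=0.1)
low_counts = Prey("PREY_D", (2, 1), saint_edu=0.95, saint_thy=0.0)
```

Keeping the three thresholds as named constants makes it straightforward to explore how relaxing the stringency would change the candidate list.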


Figure 3.3 Proteins enriched on nascently replicated DNA by iPOND-MS. A) Bar plot of average spectral counts for core replisome components determined by iPOND-MS. HEK293 cells were collected after a 10 min EdU pulse (EdU) and following 30 min of thymidine chase (THY). Proteins were eluted at the end of the iPOND procedure and analyzed by mass spectrometry. The average spectral counts from two biological replicates are represented for each core replisome component. Bars with black outline denote a SAINT score of >0.9. B) Bar plot of average spectral counts for other functional categories of proteins found on nascent DNA. Prey protein functional annotations were taken from Alabert et al. (2014). C) Pie chart of the functional distribution of proteins enriched on newly replicated DNA. Numbers represent the count of protein candidates in each category.


I also identified many accessory proteins that are known to interact with active DNA replication forks (Figure 3.3B and C). As expected, DNA replication was represented in the significantly enriched GO Biological Process groups (P = 1.94 x 10^-43). Moreover, our list contains 37 PCNA interacting proteins, which is greater than predicted by chance (P = 5.84 x 10^-35). Many other GO Biological Process groups were also significantly enriched, including DNA repair (P = 7.17 x 10^-36), chromatin organization (P = 3.93 x 10^-33), cell cycle (P = 4.21 x 10^-25) and transcription (P = 5.33 x 10^-5). This list included well-documented histone chaperones and cofactors such as CHAF1A, CHAF1B, DNMT1 and UHRF1 that facilitate nucleosome assembly on newly replicated DNA (Alabert and Groth, 2012). I also identified SUPT16H and SSRP1, which are core components of the FACT histone chaperone complex that reorganizes nucleosomes to coordinate the movement of transcription and DNA replication machinery. This corroborates reports of the association of the FACT complex with MCM proteins of the replisome (Gambus et al., 2006; Tan et al., 2006). For comparison, I also identified a list of general chromatin bound factors (thymidine chased/EdU pulsed; z-score cut-off >0.5). By contrast, this list of proteins was significantly enriched in chromatin assembly or disassembly (P = 9.99 x 10^-13) and nucleosome organization (P = 5.15 x 10^-12). Together, this suggests that iPOND-MS can successfully distinguish replication fork associated factors from general chromatin bound proteins. Interestingly, I discovered 6 uncharacterized proteins that were enriched at replication forks and 43 proteins that have no expected functions in DNA replication or repair. These will be the main focus of future studies (Figure 3.3C).
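To illustrate what "greater than predicted by chance" means for an overlap such as the 37 PCNA interacting proteins among 153 candidates, the sketch below computes a one-sided hypergeometric tail probability. This is an assumption about a commonly used test, not a statement of how the P value above was obtained, and the background size and the total number of annotated PCNA interactors used in the example call are hypothetical placeholders.

```python
from math import comb


def hypergeom_upper_tail(k: int, n_background: int, n_annotated: int, n_list: int) -> float:
    """P(X >= k) when n_list proteins are drawn without replacement from a
    background of n_background proteins, n_annotated of which carry the annotation."""
    total = comb(n_background, n_list)
    upper = min(n_annotated, n_list)
    favourable = sum(
        comb(n_annotated, i) * comb(n_background - n_annotated, n_list - i)
        for i in range(k, upper + 1)
    )
    # Exact integer arithmetic; only the final division produces a float.
    return favourable / total


# 37 of 153 candidates are from the text. The background of 20000 proteins and
# the 400 annotated PCNA interactors are HYPOTHETICAL placeholders.
p_example = hypergeom_upper_tail(k=37, n_background=20000, n_annotated=400, n_list=153)
```

The resulting probability depends strongly on the chosen background and annotation set, so it should not be expected to reproduce the reported value.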


3.3.3 siRNA screen to identify proteins that play a role in DNA replication progression

To identify new proteins that may play a role in DNA replication progression, I performed an additional screen targeting iPOND candidate proteins with an undefined role in replication. I performed an RNAi knockdown screen in U2OS cells with pools of four siRNAs and looked for changes in EdU incorporation by high-throughput fluorescence microscopy (Appendix Table 4). Following 48 hours of siRNA transfection, cells were incubated for 30 mins with EdU and Click chemistry was performed using an azide-conjugated fluorophore. The mean EdU intensity per nucleus was quantified as a proxy for replication progression (Figure 3.4). I chose to study the 26 proteins with the highest EdU/Thy enrichment, with a focus on proteins without clear annotations in DNA replication and repair (2 transcription, 5 RNA processing, 8 Other). I included ADNP, UHRF1, UBA1 and FANCD2, which had z-scores just below our stringent cut-off. RPA3 was also included, representing a replication protein that fell below the cut-off criteria for determining iPOND candidates. A non-targeting (NT) siRNA was used as a negative control and an siRNA pool targeting PCNA was used as a positive control.
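The per-nucleus readout used in this screen can be illustrated with a small sketch. It assumes that nuclear segmentation of the DAPI channel has already produced an integer label mask (0 for background, a distinct positive integer per nucleus). The function name and the toy arrays are hypothetical, and the sketch does not reproduce the image analysis software actually used.

```python
from collections import defaultdict


def mean_intensity_per_nucleus(label_mask, edu_image):
    """Return {nucleus_label: mean EdU pixel intensity} for each labelled nucleus.

    label_mask and edu_image are equally sized 2D lists; label 0 is background.
    """
    sums = defaultdict(float)
    areas = defaultdict(int)
    for mask_row, edu_row in zip(label_mask, edu_image):
        for label, value in zip(mask_row, edu_row):
            if label != 0:            # skip background pixels
                sums[label] += value
                areas[label] += 1
    return {label: sums[label] / areas[label] for label in sums}


# Toy 2 x 3 image with two hypothetical nuclei (labels 1 and 2).
toy_mask = [[1, 1, 0],
            [0, 2, 2]]
toy_edu = [[10, 20, 5],
           [7, 30, 50]]
per_nucleus = mean_intensity_per_nucleus(toy_mask, toy_edu)
```

Each value in `per_nucleus` corresponds to one dot in a per-nucleus intensity plot, and the average of these values gives one replicate-level measurement per siRNA.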

Figure 3.4 siRNA depletion of iPOND candidates causes reduced EdU incorporation. A) Representative images following 48 hr transfection with a non-targeting (NT) siRNA control or an siRNA targeting uncharacterized replication proteins determined by iPOND, in U2OS cells. EdU (10 μM) was added to media 30 mins prior to paraformaldehyde fixation, Click chemistry and staining. EdU and DAPI channels are shown. B) DAPI signal was used for segmentation of the nuclei and EdU pixel intensity was measured within each nucleus. Each dot on the graph represents the mean intensity per nucleus of the images in (A). Red lines reflect the average EdU mean intensity per nucleus of each sample. C) Bar graph of average EdU mean intensity per nucleus in three replicates of the experiment in (A). Black bars represent positive and negative controls. Red bars represent average mean EdU incorporation that is significantly different from NT control (P ≤ 0.05; one-tailed t-test). Error bars represent standard error. Preys marked with X represent iPOND-MS candidates that fell below our stringent (EdU pulse/thymidine chase) enrichment z-score >0.5 cut-off.

Depletion of PCNA resulted in a reduction in mean EdU intensity per nucleus, with a significantly different distribution from the NT control (P ≤ 0.0001; one-tailed Mann-Whitney) (Figure 3.4A and B). Not surprisingly, the average mean EdU intensity per nucleus from three replicates was significantly reduced relative to the NT control (P ≤ 0.05; one-tailed t-test). Although to lesser degrees, depletion of other proteins with a role at the replication fork (RPA1, RPA2, RPA3) also caused a significant difference in average mean EdU intensity per nucleus relative to the NT control (P ≤ 0.05; one-tailed t-test). Importantly, this suggests that significant EdU intensity differences can be detected following depletion of replisome components. Depletion of ADNP and UHRF1, genes below my cut-off criteria for iPOND candidates, resulted in no significant EdU intensity differences relative to the NT control. However, depletion of the iPOND candidate USP7, a ubiquitin specific protease, resulted in a moderate reduction in average mean EdU intensity that did not meet the significance cut-off (Figure 3.4C). This may reflect the peripheral role of USP7 in DNA replication. Although USP7 interacts with MCM binding protein, knockout cells only show slow progression through late S-phase without a global effect on DNA replication fork rate (Jagannathan et al., 2014). Conversely, the replication stress response proteins UBA1 and FANCD2, which also did not meet our iPOND candidate cut-off criteria, were amongst the depletions with a significant decrease in EdU incorporation. Further experimentation is necessary to assess the contribution of siRNA knockdown efficiencies to decreases in EdU incorporation. This will be important in assessing whether appropriate cut-off criteria were used in our selection of iPOND candidates.
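The one-tailed Mann-Whitney comparison of per-nucleus EdU intensities against the NT control can be sketched as follows. This is an illustrative implementation under stated assumptions, not the analysis software used for the screen: it relies on the normal approximation with a continuity correction, omits the tie correction to the variance, and the function names and toy values are hypothetical.

```python
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1     # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_less(sample, control):
    """Return (U, p) for the one-tailed alternative that `sample` values tend
    to be lower than `control` values (normal approximation)."""
    n1, n2 = len(sample), len(control)
    ranks = average_ranks(list(sample) + list(control))
    u_sample = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u_sample + 0.5 - mu) / sigma     # continuity correction, lower tail
    return u_sample, NormalDist().cdf(z)


# Hypothetical per-nucleus mean EdU intensities (arbitrary units).
knockdown = [1.0, 2.0, 3.0, 4.0, 5.0]
non_targeting = [6.0, 7.0, 8.0, 9.0, 10.0]
u_stat, p_value = mann_whitney_less(knockdown, non_targeting)
```

With thousands of nuclei per well the normal approximation is generally reasonable, although with such large samples very small P values can arise from modest shifts, which is one reason the replicate-level comparison is also reported.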

Nevertheless, 15/30 depletions tested had significantly decreased mean EdU intensities per nucleus (P ≤ 0.05; one-tailed t-test) (Figure 3.4C). Among these were seven novel components not previously thought to have roles in DNA replication, such as the molecular chaperone CCT2 (Figure 3.4A). Notably, depletion of other proteins that may play a role in RNA processing, such as RPS7, SF3A1 and NONO, severely reduced EdU incorporation, mimicking the phenotype following PCNA depletion (Figure 3.4C). While I have shown evidence suggesting a role of novel factors in DNA replication, EdU incorporation changes could also reflect changes in cell cycle distribution. A genome-scale cell cycle profiling study has been performed with pooled esiRNA depletions in HeLa cells and propidium iodide staining (Kittler et al., 2007). 32/153 of my iPOND candidates have been annotated in this study. 15/32 genes have catalogued S-phase defects, including the replisome components PCNA, polymerase and primase. Other iPOND candidates with S-phase defects, such as RPS7, can be prioritized for future studies. By contrast, CCT2 and SF3A1 have annotated cell division defects, which could be an alternate explanation for the reduced EdU incorporation phenotype (Appendix Table 3). Similarly, a genome-wide assessment of γH2AX intensity levels in HeLa cells depleted with pooled siRNAs was used as a proxy for DNA damage (Appendix Table 3). 12/153 iPOND candidates were annotated with high γH2AX intensity levels, including the replisome components RPA and TIMELESS. γH2AX intensity levels could be another parameter to consider for prioritization of downstream studies. While global studies in the literature may help narrow down potential candidates, thorough protein expression analysis, cell cycle analysis and DNA combing analysis of replication fork speed with individual siRNAs will be important in future functional studies. Nevertheless, the EdU incorporation validation screen provides a first indication that the iPOND methodology is a powerful approach for identifying components of the DNA replication machinery.


3.4 Discussion

3.4.1 Identification of known replisome associated factors by iPOND-MS

Here I described a strategy for identifying proteins associated with nascent DNA. Importantly, I was able to capture 26 core replisome proteins including PCNA, MCM helicase subunits, DNA polymerases, primase, RFC clamp loader complex subunits, RPA and Okazaki fragment processing proteins. The absence of other known replisome proteins from this list may reflect the technical limitations of iPOND, lack of detection by mass spectrometry or the stringency filters that I applied. The iPOND methodology detects proteins at an average DNA replication fork on the genome in an asynchronous population of cells. Proteins that interact with only a subset of replication forks or only during specific times of the cell cycle may be under-represented. Additionally, technical parameters that affect replisome detection include the length of the EdU pulse and the size of DNA fragments generated after cell lysis. For example, ssDNA created by uncoupling of helicase and polymerase may not be detected due to the absence of EdU in these regions. Less aggressive fragmentation may be needed in order to capture sufficient dsDNA segments containing EdU and adjacent ssDNA, in turn allowing capture of replisome proteins that act ahead of the replication fork (Sirbu et al., 2013). Furthermore, incomplete trypsinization of proteins cross-linked to the affinity purified EdU fragments could affect detection by mass spectrometry. Finally, the stringency thresholds in the mass spectrometry analysis can affect detection. For example, FEN1, a protein involved in the flap removal step of Okazaki fragment processing, was detected by iPOND-MS, but fell below the stringency cut-offs of my study (Appendix Table 3). CHTF18, RNASEH2C, GINS3 and CLASPIN were other bona fide replisome proteins detected by iPOND-MS. However, each of these proteins had only two or fewer spectral counts, which excluded them from our list of iPOND candidates. This suggests that our stringent cut-off yields some false negatives.

3.4.2 Comparison with previous iPOND-MS studies

While I was collecting data for this iPOND-MS study, two other datasets of replication fork associated proteins were published using similar methodology (Lopez-Contreras et al., 2013; Sirbu et al., 2013). Twenty-two proteins were shared between all datasets, representing a robust group of replisome proteins, including PCNA, POLD1, POLE, RFC1-5, ATAD5, LIG1, CHAF1A, CHAF1B, DNMT1, MSH2, MSH3 and MSH6 (Figure 3.5A). Five and 14 proteins overlapped with the Sirbu et al. and Lopez-Contreras et al. studies, respectively. These also represented bona fide core replication proteins including MCM helicase components, RPA and PRIM2. Therefore, our dataset corroborates other reports of proteins at replication forks (Lopez-Contreras et al., 2013; Sirbu et al., 2013). Notably, I defined 113 proteins at replication forks that were not identified by previous studies using this methodology. Differences can be attributed to technical elements of the iPOND procedure such as the amount of starting material, total EdU labeling time, sizes of EdU-labelled fragments that were pulled down and differences in the length of thymidine chase time. Additionally, differences could reflect the various methods of LFQ MS analysis and stringency filters used in each study. Nevertheless, I am confident in the parameters of our analysis because our list of true positives contains many core replisome proteins that were undetected in other studies, such as PRIM1, MCM3, MCM6, MCM7, RNASEH2A and PMS1 (Appendix Table 3). Furthermore, my iPOND candidates were validated using a secondary phenotypic screen that demonstrated reduced EdU incorporation, implicating these proteins in DNA replication progression.
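The dataset comparison described above amounts to set operations on gene identifiers. The sketch below uses deliberately tiny toy sets to show the operations only: the shared and study-specific gene names are examples named in the text, while the `PLACEHOLDER_*` entries and the exact membership of each toy set are hypothetical and do not reproduce the published lists.

```python
# Toy candidate sets; membership is illustrative, not the real datasets.
this_study = {"PCNA", "POLD1", "LIG1", "PRIM1", "MCM3"}
sirbu_2013 = {"PCNA", "POLD1", "LIG1", "PLACEHOLDER_S"}
lopez_contreras_2013 = {"PCNA", "POLD1", "LIG1", "PLACEHOLDER_L"}

# Proteins found in all three iPOND-MS datasets (centre of the Venn diagram).
shared_by_all = this_study & sirbu_2013 & lopez_contreras_2013

# Proteins found only in this study (not reported by either earlier dataset).
unique_to_this_study = this_study - (sirbu_2013 | lopez_contreras_2013)

# Proteins shared with exactly one of the earlier datasets.
only_with_sirbu = (this_study & sirbu_2013) - lopez_contreras_2013
only_with_lopez = (this_study & lopez_contreras_2013) - sirbu_2013
```

In practice the gene identifiers from each study would first need to be mapped to a common nomenclature, since differing symbols or isoform annotations would otherwise inflate the study-specific counts.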

Recently, a dataset of factors enriched on nascent chromatin was published using a different technique called Nascent Chromatin Capture (NCC) (Alabert et al., 2014). In this method, biotin-dUTP was incorporated into newly replicated DNA at replication forks and used for affinity purification following protein crosslinking. I compared my iPOND-MS candidates with the 266 proteins that had a nascent chromatin enrichment score of ≥0.5 and were identified in all three replicates of the NCC-MS study (Alabert et al., 2014). I identified 44 core replisome proteins that overlapped with this dataset, including PCNA, MCM2-7, PRIM1, PRIM2, POLE, RFC1-5 and LIG1 (Figure 3.5B). However, 109 of my iPOND-MS candidates were not in the NCC-MS dataset by these criteria. This suggests that while previous datasets have been published, our approach reveals a large resource of proteins that were undetected by previous attempts, and these proteins may play an important role in S-phase progression and DNA replication fork dynamics. Together with complementary approaches in the literature, my iPOND-MS dataset contributes to a comprehensive detection of replication components during an unperturbed cell cycle.


Figure 3.5 Venn diagram comparisons with previous replication fork protein datasets. A) Venn diagram overlap of replication fork enriched proteins from iPOND-MS in the literature. The Lopez-Contreras et al. and Sirbu et al. studies were conducted in HEK293T cells with a 10 min EdU pulse and a 60 min thymidine chase. Our study (Cheng) was conducted in HEK293 cells with a 10 min EdU pulse and a 30 min thymidine chase. Numbers in the Venn diagram represent the count of proteins that were enriched in the EdU pulse fraction relative to thymidine chase samples. B) Venn diagram overlap of replication fork enriched proteins determined by NCC-MS. The Alabert et al. study was conducted with HeLa suspension cultures with a 20 min biotin-dUTP incorporation time and a 2 hr chase.

3.4.3 Newly identified DNA replication factors

I have identified a large candidate list of novel replication fork associated proteins with potential for downstream mechanistic studies in DNA replication and repair. Although I have made good headway by testing the effect of their depletion on EdU incorporation using siRNA pools, it will be imperative to repeat this screen on promising hits with the pools deconvoluted into individual siRNAs and to test for protein depletion using western blot analysis. Flow cytometry analysis can be used to exclude cell cycle distribution effects caused by siRNA depletion. Further investigation into the role of these proteins in DNA replication dynamics can be addressed with molecular combing analyses of individual DNA fibres.

Here, I will comment on one interesting group of novel proteins that merit future functional studies. Domain analysis of gene depletions that led to the highest reduction in EdU incorporation revealed enrichment of RRM domains (P = 3.07 x 10^-7; SMART domain annotation) that are found in RNA binding proteins (RBPs). During transcription, conditions may favor the formation of RNA:DNA hybrids, leaving the complementary ssDNA strand displaced and free to form secondary structures; the hybrid and the displaced strand are collectively termed an R-loop (Dutertre et al., 2014). R-loops pose a significant topological constraint to oncoming replication machinery, causing replication fork stalling and double-strand breaks. Certain RBPs have been demonstrated to travel with TOP1 topoisomerase, which cleaves DNA ahead of the replication forks to reduce topological constraints (Dutertre et al., 2014). Many TOP1 protein interactors have been shown to modulate DNA cleavage and relaxation activity, and several were identified in our iPOND-MS study. Notably, the depletion of NONO, a TOP1 interactor (Czubaty et al., 2005), resulted in the highest reduction of EdU incorporation in our study. Furthermore, an iPOND-like methodology (Dm-ChP) has also demonstrated the interaction of NONO with S-phase DNA (Kliszczak et al., 2011). Together, I envision that accumulation of topological constraints in NONO depleted cells may stall DNA replication.

Another class of TOP1 interactors is splicing factors such as SRSF1, which act to promote co-transcriptional processing in order to prevent the formation of R-loops (Dutertre et al., 2014; Tuduri et al., 2009). Depletion of iPOND candidates that may play a role in mRNA splicing, SF3A1 and PTBP1, resulted in significantly reduced levels of EdU incorporation. As with SRSF1, it is possible that depletion of these factors slows intron removal, favoring the formation of R-loops that impede replication progression. Alternatively, deregulation of splicing factors could indirectly lead to reduced replication by promoting aberrant splicing of critical DNA replication factors. Downstream analyses will be required to distinguish between these possibilities.

RNASEH2B and HNRNPR are known TOP1 interactors with roles in RNA metabolism, and their depletion resulted in a modest reduction in EdU incorporation. RNASEH2 cleaves RNA within RNA-DNA hybrids, particularly during Okazaki fragment processing on the lagging strand of DNA replication, but could also serve to remove R-loop structures that impede replication progression (Cerritelli and Crouch, 2009; Chon et al., 2013; Skourti-Stathaki and Proudfoot, 2014). While the role of HNRNPR is unknown, it may act similarly to the yeast nuclear hnRNP, Npl3, which prevents accumulation of R-loop impediments and mitigates transcription associated genome instability (Santos-Pereira et al., 2013). It is likely that the human hnRNP counterparts act through a similar mechanism, but this remains to be tested. As such, our iPOND-MS study has yielded a promising list of candidate proteins with exciting avenues for future studies, which will be discussed in more depth in Chapter 5.


3.5 Methods

3.5.1 Cell Culture

All culture media were supplemented with 10% fetal bovine serum (FBS), 100 U/mL penicillin, and 100 μg/mL streptomycin. U2OS cells were cultured in McCoy’s 5A medium. HEK293 FRT/TO Flp-In stable cells were cultured in DMEM supplemented with 1x GlutaMAX (35050; Invitrogen), 200 μg/mL hygromycin B, and 5 μg/mL blasticidin.

3.5.2 iPOND

Proteins associated with nascent DNA at replication forks were identified using the iPOND methodology previously described (Sirbu et al., 2012; Sirbu et al., 2011). HEK293-Flag-RMI1 stable cells (1.5 x 10^8 cells/sample) were incubated with 10 μM of the nucleoside analog 5-ethynyl-2’-deoxyuridine (EdU) for 10 mins. If a thymidine chase was applied, cells were washed once and then incubated for 30 mins in temperature- and pH-equilibrated media containing 10 μM thymidine. Cells were cross-linked in 1% formaldehyde/PBS for 20 mins and quenched using 0.125 M glycine. Cells were washed three times in PBS and resuspended in 0.25% Triton X-100/PBS for 30 mins to permeabilize. Cell pellets were washed once with 0.5% BSA/PBS and once with PBS before cells were incubated in Click-iT reaction mix (Invitrogen) containing 1 μM biotin-azide (Life Technologies). DMSO was added in place of biotin-azide in negative controls. Cells were washed once with 0.5% BSA/PBS. Cell pellets were resuspended at a concentration of 1.5 x 10^7 cells per 100 μL of lysis buffer containing 1% SDS, 50 mM Tris (pH 8.0), 1 μg/mL leupeptin and 1 μg/mL aprotinin. Samples were sonicated on ice using alternating 20 sec constant pulses and 40 sec pauses, for a total time of 5 min per sample (Branson Model 450, tapered microtip; constant duty cycle % and output setting 2). Lysates were centrifuged at 14000 rpm for 20 mins at 4°C and the supernatant was used for purification with streptavidin beads.

3.5.3 Affinity purification and immunoblot analysis

Streptavidin-agarose beads (Novagen) were washed two times in lysis buffer and once with PBS. Washed beads were incubated with the cleared lysates at 4°C in the dark for 14-20 hours. The beads were washed once with lysis buffer, once with 1 M NaCl and twice with lysis buffer again. The captured proteins were eluted in SDS sample buffer for 30 minutes at 95°C. Proteins were resolved by SDS-PAGE and analyzed by immunoblotting using standard procedures. Antibodies against PCNA (Santa-Cruz) and histone H3 were used.

3.5.4 Affinity purification for mass spectrometry

Pre-cleared lysates after sonication were added to 2 mL RIPA buffer (50 mM Tris-HCl pH 7.5, 150 mM NaCl, 1% Triton X-100, 1 mM EDTA, 1 mM EGTA, 0.1% SDS) supplemented with 1:500 protease inhibitors (Sigma P8340) and 0.5% sodium deoxycholate. 30 μL of streptavidin-sepharose beads (GE Healthcare) were added to equivalent volumes of pre-cleared lysates per sample for affinity purification at 4°C on a nutator for 3 hours. Streptavidin-sepharose beads were pelleted at 400g for 1 min. The supernatant was removed and the beads were washed two times with RIPA buffer, two times with TAP lysis buffer (50 mM HEPES-KOH pH 8.0, 100 mM KCl, 10% glycerol, 2 mM EDTA, and 0.1% NP-40) and three times with ABC buffer (50 mM ammonium bicarbonate pH 8). Streptavidin-sepharose beads were resuspended in 30 μL of ABC buffer containing 1 μg of trypsin and incubated for 17-20 hrs on a rotating disc at 37°C. An additional 0.5 μg of trypsin diluted in 10 μL of ABC buffer was added to each sample, which was rotated for an additional 2 hrs. Beads were pelleted at 400g for 1 min and the supernatant was transferred to a new tube. Beads were washed twice with 30 μL of filter-sterilized HPLC-grade dH2O and these rinses were combined with the original supernatant. Pooled supernatant was centrifuged at 16100g for 10 minutes and 80 μL was transferred to a new tube. Samples were acidified to 2% formic acid and dried in a speed vacuum.

3.5.5 Mass spectrometry

Affinity-purified digested material from each sample was re-suspended in 12μL of 5% formic acid and centrifuged at 16,100 x g for 1 minute before 6μL was taken for MS analysis and 5μL was injected by autosampler onto a spray tip formed from a fused silica capillary column (0.75μm ID, 350μm OD) using a laser puller. The column had previously been loaded with 10-12 cm of C18 reversed-phase material (ZorbaxSB, 3.5μm) by pressure bomb loading in MeOH and pre-equilibrated with buffer A (0.1% formic acid). The column was placed in-line with an LTQ-Orbitrap Velos or Elite (Thermo Electron, Bremen, Germany) equipped with a nanoelectrospray ion source (Proxeon Biosystems, Odense, Denmark) connected in-line to a NanoLC-Ultra 2D plus HPLC system (Eksigent, Dublin, USA). The LTQ-Orbitrap Velos/Elite instrument under Xcalibur 2.0 was operated in the data dependent mode to automatically switch between MS and up to 10 subsequent MS/MS acquisitions. Buffer A was 0.1% formic acid. Buffer B was acetonitrile in 0.1% formic acid. The following HPLC gradient was used: 20 min 2%B at 400μL/min, 75.5 min linear increase to 35%B at 200μL/min, 5 min linear increase to 80%B, 6.5 min 80%B at 200μL/min, 18 min 2%B at 200μL/min.
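The gradient segments listed above can be tabulated to obtain the cumulative end time of each segment and the total run length. This is an illustrative sketch: the step labels are paraphrased from the text, the flow rates are left out for brevity, and the variable and function names are hypothetical.

```python
# Each entry: (duration in minutes, description paraphrased from the text).
GRADIENT_STEPS = [
    (20.0, "2% B"),
    (75.5, "linear increase to 35% B"),
    (5.0, "linear increase to 80% B"),
    (6.5, "80% B"),
    (18.0, "2% B"),
]


def cumulative_end_times(steps):
    """Return the elapsed time (minutes) at the end of each gradient segment."""
    times = []
    elapsed = 0.0
    for minutes, _description in steps:
        elapsed += minutes
        times.append(elapsed)
    return times


end_times = cumulative_end_times(GRADIENT_STEPS)
for (minutes, description), end in zip(GRADIENT_STEPS, end_times):
    print(f"{description:<28} {minutes:>5.1f} min  (ends at {end:.1f} min)")
print(f"Total run time: {end_times[-1]:.1f} min")
```

Summing the segments gives a run of 125 minutes per injection.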

3.5.6 Data analysis

Raw spectral count data were analyzed by the Significance Analysis of INTeractome (SAINT) label free quantification (LFQ) computational analysis tool (Choi et al., 2011). Prey proteins with a SAINT score >0.9 in the EdU pulsed sample were considered positive iPOND candidates. For prey proteins with a SAINT score >0.9 in both the EdU pulsed sample and the thymidine chased sample, the ratio of total spectral counts from two biological replicates (EdU pulsed/thymidine chased) was used as a measure of protein enrichment at the active fork. For these proteins, a z-score >0.5 was considered a positive iPOND candidate (Appendix Table 3). ToppGene was used for GO Biological Process enrichment and SMART domain enrichment (Chen et al., 2009).
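The two-tier candidate selection described above can be sketched as follows. This is a hedged illustration rather than the analysis code used in the study. It assumes that preys passing SAINT only in the pulsed sample are accepted directly, and that the z-score is computed on the raw pulse/chase ratio across all preys passing SAINT in both samples using the population standard deviation; the text does not specify these details. The prey names, data layout and function name are hypothetical.

```python
from statistics import mean, pstdev

SAINT_CUTOFF = 0.9   # from the text: SAINT score > 0.9
Z_CUTOFF = 0.5       # from the text: z-score > 0.5


def ipond_candidates(preys):
    """Return sorted names of positive candidates under the assumptions above.

    preys maps a prey name to a dict with keys saint_pulse, saint_chase,
    counts_pulse and counts_chase (total spectral counts over replicates).
    """
    pulse_pass = {n: p for n, p in preys.items() if p["saint_pulse"] > SAINT_CUTOFF}
    pulse_only = [n for n, p in pulse_pass.items() if p["saint_chase"] <= SAINT_CUTOFF]
    both = [n for n, p in pulse_pass.items() if p["saint_chase"] > SAINT_CUTOFF]

    # Chase counts are assumed non-zero for preys that pass SAINT in the chase sample.
    ratios = {n: preys[n]["counts_pulse"] / preys[n]["counts_chase"] for n in both}

    enriched = []
    values = list(ratios.values())
    if len(values) > 1:
        mu = mean(values)
        sd = pstdev(values)
        if sd > 0:
            enriched = [n for n, r in ratios.items() if (r - mu) / sd > Z_CUTOFF]
    return sorted(pulse_only + enriched)


# Toy data with placeholder prey names (not taken from Appendix Table 3).
toy = {
    "PREY_A": {"saint_pulse": 0.95, "saint_chase": 0.20, "counts_pulse": 10, "counts_chase": 1},
    "PREY_B": {"saint_pulse": 1.00, "saint_chase": 1.00, "counts_pulse": 40, "counts_chase": 10},
    "PREY_C": {"saint_pulse": 1.00, "saint_chase": 1.00, "counts_pulse": 20, "counts_chase": 20},
    "PREY_D": {"saint_pulse": 0.99, "saint_chase": 0.95, "counts_pulse": 10, "counts_chase": 10},
    "PREY_E": {"saint_pulse": 0.50, "saint_chase": 0.99, "counts_pulse": 5, "counts_chase": 30},
}
candidates = ipond_candidates(toy)
print(candidates)
```

In the toy data, PREY_A is retained because it passes SAINT only in the pulsed sample, and PREY_B because its pulse/chase ratio lies well above the mean of the preys detected in both samples.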

3.5.7 High-throughput fluorescence microscopy

U2OS cells were transfected on clear-bottom 96-well plates (Fisher Scientific) with 40nM of siRNA (Appendix Table 4) for 48 hours, using Lipofectamine RNAiMAX (Life Technologies). A negative non-targeting control siRNA (Dharmacon, GE Healthcare) and a siGENOME SMARTpool mixture of four siRNAs targeting each gene (Dharmacon, GE Healthcare) were used in the screen. 10μM EdU was added to the culture media for the last 30 mins of transfection. Cells were washed with PBS, fixed with 4% paraformaldehyde for 10 min and permeabilized with 0.3% Triton for 10 min. EdU was detected by performing the Click-iT reaction (Invitrogen) with Alexa-Fluor 488 azide (Life Technologies). Cells were blocked with 1X PBS buffer containing 10% goat serum, 0.5% NP-40 and 0.5% saponin for 30 mins at room temperature. After a wash with PBS, cells were incubated in DAPI (0.5μg/mL) for 1 hour at room temperature. Images were acquired from each well using an In Cell Analyzer 6000 (GE Healthcare Life Sciences) at 10x magnification at non-saturating exposure times. Columbus image analysis software (PerkinElmer) was used to segment nuclei with the DAPI channel and the mean EdU signal intensity per nucleus was calculated.
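The final quantification step, the mean EdU intensity per segmented nucleus, can be illustrated with a minimal sketch. This is not the Columbus implementation. It assumes that segmentation has already produced a label mask in which 0 marks background and each nucleus carries a positive integer label; the function name and the toy arrays are hypothetical.

```python
from collections import defaultdict


def mean_intensity_per_nucleus(label_mask, edu_image):
    """Return {nucleus label: mean EdU intensity} for a labelled image.

    label_mask and edu_image are equally sized 2D lists; label 0 is background.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for mask_row, image_row in zip(label_mask, edu_image):
        for label, value in zip(mask_row, image_row):
            if label == 0:
                continue  # skip background pixels
            sums[label] += value
            counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


# Toy 2 x 3 image with two nuclei (labels 1 and 2).
mask = [[1, 1, 0],
        [0, 2, 2]]
edu = [[10, 20, 5],
       [5, 100, 200]]
per_nucleus = mean_intensity_per_nucleus(mask, edu)
print(per_nucleus)
```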

Chapter 4

Biotin ligase tagging reveals novel proteins proximal to the BLM-TOP3A-RMI1-RMI2 DNA replication and repair complex.

DATA ATTRIBUTION:

Wade Dunham (Gingras Lab) performed the immunoprecipitation steps of BioID and the mass spectrometry. Jordan Young (Durocher Lab) acquired microscopy images of siRNA transfected cells on the high-throughput microscope.

ACKNOWLEDGEMENTS:

I thank Wade Dunham for assistance with SAINT analysis and Xiaohan Yang for technical assistance with the siRNA transfections.

4 Biotin ligase tagging reveals novel proteins proximal to the BLM-TOP3A-RMI1-RMI2 DNA replication and repair complex

4.1 Summary

Bloom syndrome is an autosomal recessive disorder that is caused by mutations in the BLM gene. BLM associates with TOP3A, RMI1 and RMI2 to form the BTRR complex that is essential for the maintenance of genome stability. These members of the core complex also interact with other proteins to form distinct sub-complexes under different cellular environments or phases of the cell cycle.

Here, I took an unbiased approach to identify changes in BTRR sub-complex composition under conditions of DNA replication stress in order to further understand its contribution to the maintenance of genome stability. I coupled mass spectrometry with a novel technique for identifying protein interactions, BioID. For each member of the complex, I captured interactions with the other three members of the core BTRR complex as well as other members of BTRR sub-complexes, suggesting that the sensitivity of this method was sufficient to confirm known interactions. I also discovered novel candidate BTRR protein interaction partners and identified the subset of proteins that are proximal to the BTRR complex under replication stress. Furthermore, I saw a change in protein interactions following the expression of mutated forms of core complex members, in comparison to wildtype. This study uncovered novel proteins potentially involved in the maintenance of genome stability, which will be a useful resource for future studies aimed at understanding crucial roles of the BTRR complex in replication stress resistance and disease etiology.

4.2 Introduction

The BTRR protein complex is essential for maintenance of genome stability and comprises BLM, TOP3A, RMI1 and RMI2. Defective BLM protein is implicated in Bloom syndrome, a disorder associated with growth retardation, light sensitivity, immuno-deficiency and cancer predisposition (German et al., 2007). The diagnostic feature in BLM deficient cells is a ten-fold elevated frequency of sister chromatid exchange (SCE), which reflects the role of this complex in the suppression of crossover events during homologous recombination (Chaganti et al., 1974). TOP3A, RMI1 and RMI2 share this hallmark phenotype, which is consistent with the cooperative role of the complex in DNA repair (Singh et al., 2008; Xu et al., 2008; Yin et al., 2005). Many BLM mutations have been characterized in the past. Recently, a single nucleotide polymorphism (SNP) of RMI1 was found to be associated with an increased risk of acute myeloid leukemia (AML) and myelodysplastic syndrome (MDS), implicating other BTRR proteins in cancer risk (Broberg et al., 2007). However, the molecular consequences of this genetic variant are yet to be defined.

The BTRR complex has a well-characterized role in the resolution of double Holliday junctions in homologous recombination repair (Larsen and Hickson, 2013). However, recent studies implicate an additional role of the BTRR complex in normal DNA replication progression and in mediating the restart of collapsed DNA replication forks. Molecular combing experiments demonstrated that BLM or RMI1 depleted cells have an impaired replication fork rate, hypersensitivity to drugs that induce replication stress and reduced efficiency of stalled replication fork recovery (Rao et al., 2007; Yang et al., 2012). Moreover, interaction of accessory proteins with the BTRR complex aids its replication-specific functions. For example, RIF1 physically interacts with BLM and is recruited to stalled replication forks with similar kinetics as the BTRR complex in order to promote stalled replication fork recovery (Xu et al., 2010). However, RIF1 is dispensable for the suppression of crossover events, as its depletion results in normal levels of SCE (Xu et al., 2010). This supports the notion that accessory proteins may exclusively mediate distinct functions of the BTRR complex in the protection of genome integrity. Identification and characterization of accessory proteins to the core complex in various cellular conditions will contribute to understanding the numerous roles of BTRR in the maintenance of genome stability.

Accessory proteins that interact with the BTRR complex have previously been probed using two main methods. Functional protein complexes can be predicted by the co-localization of their components in vivo. BLM, TOP3A, RMI1 and RMI2 co-localize to distinct nuclear foci, which increase in abundance and intensity when cells are treated with drugs that induce replication stress (Yang et al., 2012). Accessory proteins of the BTRR complex that are implicated in DNA replication and repair, RAD51 and RIF1, co-localize with the core complex, suggesting cooperative function in vivo (Ouyang et al., 2009; Xu et al., 2010). While this technique is commonly used to validate potential interactors, the limited scale of this method and the requirement for targeted antibodies make it unsuitable for large-scale discovery of novel protein interactions. Affinity purification coupled with mass spectrometry (AP-MS) is an alternative technique for defining BTRR sub-complexes. AP-MS was used to identify the BRAFT (BLM, RPA, Fanconi anemia proteins, TOP3A) complex and the BASC (BRCA1 associated genome surveillance complex: BRCA1, BLM, MSH2, MSH6, MLH1, ATM, RAD50-MRE11-NBS1) (Meetei et al., 2003; Wang et al., 2000). Notably, the majority of BLM interacting proteins also play a crucial role in DNA replication and repair. To date, AP-MS has not been used to probe the other members of the BTRR core complex. Severe limitations of AP-MS include poor solubilization of chromatin bound proteins and the requirement for strong detergents that are not amenable to preserving protein interactions.

The protein interaction profile of the BTRR complex in different environmental conditions has not been extensively studied and will be useful in understanding its complex roles in DNA replication and repair. The proximity-dependent biotin identification (BioID) technique (Roux et al., 2012) addresses the shortfalls of current methods and is capable of identifying protein complexes in vivo. This promising new technique has been used to study human lamin A protein interactions (Roux et al., 2012), hippo pathway signaling (Couzens et al., 2013), adherens junction structures (Van Itallie et al., 2014), tight junction structures (Steed et al., 2014), nuclear pore complex architecture (Kim et al., 2014) and c-MYC interacting partners (Dingar et al., 2014). BioID requires the expression of a fusion of the protein of interest with a mutant form of biotin ligase (BirA*) from E. coli (Figure 4.1). Addition of excess biotin to the culture medium allows biotinylation of proteins that are in close proximity to the fusion protein within the cellular environment. Cells are lysed under harsh conditions that disrupt protein interactions, yet permit purification of biotinylated proteins by virtue of the high-affinity interaction between biotin and streptavidin. The biotinylated proteins are then eluted and identified by mass spectrometry or by western blot analysis.

Figure 4.1 Schematic of BioID methodology. Induced expression of a BirA* fusion protein in live HEK293 stable cell lines and addition of excess biotin to the media allows for selective biotinylation of neighbouring proteins. Following stringent cell lysis and protein denaturation, biotinylated proteins are affinity purified using streptavidin coated beads. The eluted candidate proteins can be used for western blot or mass spectrometry analysis. Adapted from Roux et al., 2012.

I used BioID to identify proximal proteins to the wildtype BTRR complex. To capture a larger picture of protein interaction dynamics of the BTRR complex, I assayed gains and losses of protein interactions when genetic variants of RMI1 were expressed, as well as changes in BTRR-proximal proteins when replication stress was applied. Integration of candidate binding partners with existing knowledge of BTRR complex architecture will aid in understanding how BTRR functions in multiple molecular pathways.

4.3 Results

4.3.1 Identification of proximal proteins to wildtype BLM, RMI1 and RMI2

To determine the proximal and interacting proteins of the wildtype BTRR complex in its cellular environment, I used the BioID technique. I generated HEK293 cell lines that stably express a Flag-BirA* fusion protein of each individual member of the BTRR complex under the control of a tetracycline-regulatable promoter. I performed confirmation experiments to verify that the stable cell lines expressing fusion Flag-BirA* proteins retain normal biological activity. First, cell extracts probed with Flag antibody showed that protein expression was inducible by the addition of doxycycline to the culture media (Figure 4.2A). Furthermore, the same extracts probed with streptavidin-HRP showed that endogenous proteins were only biotinylated in the presence of soluble biotin, and only in cells with a Flag-BirA* fusion (Figure 4.2B). Differences observed in protein biotinylation patterns between wildtype BLM, RMI1 and RMI2 could represent differences in contacts with accessory proteins. For example, Flag-BirA*-RMI2 has fewer biotinylated proteins than Flag-BirA*-RMI1, suggesting that proteins within the core BTRR complex can have different interaction profiles.

Figure 4.2 Flag-BirA* fusion proteins biotinylate endogenous proteins in HEK293 cells. A) N-terminally tagged Flag-BirA* fusion proteins of RMI1, BLM and RMI2 were expressed stably under the control of a tetracycline promoter in HEK293 cells. Cells were incubated with doxycycline (5μg/mL) and soluble biotin (50μM) for 24 hrs. Western blot analysis of cell lysates using RMI1 antibody and Flag antibody confirmed fusion protein expression. Tubulin expression was probed as a protein loading control. * denotes a non-specific band identical in size to Flag-RMI1. B) Western blot analysis with streptavidin-HRP confirmed biotinylation of endogenous proteins. RMI1, BLM and RMI2 exhibit distinct patterns of protein biotinylation. C) Flag-BirA*-RMI1 stable cell line was incubated with doxycycline (5μg/mL), soluble biotin (50μM) and aphidicolin (1μM) for 24 hrs before fixation and staining with Flag antibody or a streptavidin-conjugated Alexa-Fluor 488. DNA was stained with DAPI. Flag-RMI1 co-localizes with biotinylated proteins in punctate nuclear foci.

Lastly, I confirmed that Flag-BirA* BTRR complex members co-localize with biotin labeled proteins by fluorescence microscopy (Figure 4.2C). Consistent with the localization pattern of wildtype RMI1, Flag-BirA*-RMI1 localized to punctate nuclear foci and the number of foci increased in the presence of aphidicolin, an inhibitor of DNA polymerases. Together, these results demonstrate that the Flag-BirA* fusions selectively biotinylate proteins in a proximity-dependent manner and that they localize to nuclear foci, which resemble the localization pattern of the wildtype BTRR complex.

Next, I determined the proximal and interacting proteins of BLM, RMI1 and RMI2 using BioID-MS. TOP3A was not assayed as a bait protein due to difficulties in cloning and cell line construction. HEK293 cell lines with Flag-BirA* fusion proteins were lysed following growth in the presence of doxycycline and biotin for 24 hours. Biotinylated proteins were captured using streptavidin beads and the identities of bound proteins were analyzed by mass spectrometry. HEK293 cells expressing Flag-BirA* alone were harvested in parallel as a control for non-specifically biotinylated proteins or endogenously biotinylated proteins that interact with the streptavidin beads. I used Significance Analysis of INTeractomes (SAINT) to determine the probability that each proximal prey protein is a true interaction (Choi et al., 2011). SAINT analysis excludes candidates that were abundant in the Flag-BirA* alone control and uses spectral counts normalized to both protein length and total number of spectra in each pull-down. Proteins with >2 spectra in at least two biological replicates and a SAINT score of ≥ 0.75 were considered as true proximal proteins. By these criteria, I identified a list of 28, 26 and 5 robust proximal proteins for RMI1, BLM and RMI2, respectively. A representative selection of prey hits is subcategorized by known functional complexes, with average spectral count data depicted in Figure 4.3A; a complete list is found in Appendix Table 5. The proximal proteins that were shared by multiple BTRR bait proteins are shown as a Venn diagram in Figure 4.3B.
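The acceptance criteria stated above (more than two spectra in at least two biological replicates, and a SAINT score of at least 0.75) can be written as a simple predicate. This is a minimal sketch with hypothetical function and parameter names; it takes the SAINT score as a precomputed input and does not reproduce SAINT itself.

```python
def is_proximal(spectral_counts, saint_score,
                min_spectra=2, min_replicates=2, saint_cutoff=0.75):
    """Apply the BioID hit criteria described in the text.

    spectral_counts: one spectral count per biological replicate.
    A replicate passes when its count is strictly greater than min_spectra.
    """
    passing_replicates = sum(1 for count in spectral_counts if count > min_spectra)
    return passing_replicates >= min_replicates and saint_score >= saint_cutoff


# Illustrative values only (not measured data).
print(is_proximal([5, 3], 0.80))     # passes both criteria
print(is_proximal([5, 2], 0.90))     # fails: only one replicate has >2 spectra
print(is_proximal([5, 3], 0.74))     # fails: SAINT score just below the cut-off
print(is_proximal([3, 0, 4], 0.75))  # passes: two of three replicates have >2 spectra
```

Because the SAINT comparison is inclusive and the spectral-count comparison is strict, a prey with a score of exactly 0.75 passes while one with exactly two spectra in a replicate does not count toward the replicate requirement.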

Figure 4.3 Comparison of BLM, RMI1 and RMI2 protein interactions determined by BioID-MS. A) At least two biological replicates of each cell line expressing Flag-BirA* fusion proteins of BLM, RMI1 or RMI2 were harvested for BioID analysis. A heatmap of average spectral counts is depicted for a representative selection of protein interactions in each functional category. A SAINT score of ≥0.75 and a spectral count >2 in each biological replicate were used to distinguish true prey protein interactions (black boxes). The total number of prey proteins is listed at the bottom of each bait column. Novel proximal proteins were identified in addition to all members of the BTRR complex and known accessory proteins involved in DNA replication and repair. B) A Venn diagram depicts the overlap of BLM, RMI1 and RMI2 proximal prey proteins discovered by BioID.

As expected, BLM, RMI1 and RMI2 interacted with each other and the remaining core member TOP3A. For each bait protein tested, the core complex components had the highest prey spectral counts (Figure 4.3A). Differences in the total number of associated proteins and spectral counts between each individual member of the BTRR core complex could reflect the 3D architecture of the complex and protein surface contact availability. Alternatively, they could reflect true differences in accessory protein interactions between the core complex components and the abundance of unique sub-complexes that contain any combination of BLM, RMI1 or RMI2. As asynchronous cultures were collected for BioID analysis, our candidate list likely represents the strongest sub-complexes formed and the most abundant BTRR sub-complexes during the cell cycle. Besides the core complex, the RMI2 bait protein only had two proximal prey proteins, each with moderate spectral counts. As such, I will focus on the BLM and RMI1 proximal proteins in the remainder of this study.

While the core complex represented the proximal prey interactions with the highest levels of spectral counts, the next highest level included known BLM sub-complex members (BRAFT and BASC complex), PML body components and other known BLM-interacting DNA repair proteins (Figure 4.3A). Even though RMI1 interactions have not been probed previously by AP-MS, it is not surprising that I found six shared interactions with BLM because they act together within the BTRR complex (Figure 4.3B). Notably, ZNF451 is a novel protein that I identified, which has never been demonstrated to interact with BLM or RMI1. It is a promising candidate interactor, as it has been localized to PML bodies (Karvonen et al., 2008) where BLM and TOP3A reside during an unperturbed cell cycle (Yankiwski et al., 2000). Besides components of the PML body, the remaining shared prey interactions (FANCM and RPA1) belong to DNA replication and repair complexes that play a cooperative role with the BTRR complex (Figure 4.4). Together, this demonstrates that BioID has sufficient sensitivity to map known BTRR complex interactions, and also has the potential to reveal novel interactions in vivo.

Figure 4.4 Comparison of RMI1 and BLM proximal protein interactions determined by BioID. Interaction diagram of proximal proteins of RMI1 and BLM. Previously published protein binding partners are presented as triangular nodes and novel proximal protein interactions are depicted as circular nodes. Proteins are grouped by known protein complexes and colored by manually curated biological processes. Edges are weighted by average number of spectral counts in two biological replicates. All proteins represented had a SAINT score ≥ 0.75.

I identified 18 prey proteins that were proximal only to BLM, which included the known interactions with TOPBP1, BRIP1 and RAD50, proteins involved in DNA damage signaling and repair (Figure 4.4). Importantly, I also discovered 12 novel proteins that were solely proximal to BLM, suggesting that BioID can be used as a complementary technique to identify proteins that were missed by previous methods of study. For example, the YEATS2 prey protein has a comparable average spectral count to the core complex protein, TOP3A (Figure 4.3A). It is a nuclear scaffolding subunit of a histone acetyltransferase complex that to our knowledge has only been described in one study (Wang et al., 2008). Next, I discovered 20 prey proteins that are only proximal to RMI1, which may represent novel RMI1 sub-complexes. For example, PDXDC1 is a pyridoxal dependent decarboxylase domain containing protein of unknown function that has higher average spectral count levels than RMI2. I also identified the TDRD3-TOP3B complex as potential proximal proteins to RMI1. While this complex has a similar structure to RMI1-TOP3A, it has not been found to interact with the BTRR complex in previous analyses (Yang et al., 2014). Encouragingly, of the BioID candidates with known localizations in the Human Protein Atlas project, 19/21 and 17/20 for BLM and RMI1 baits respectively have nuclear localization in HEK293 cells, which is consistent with the role of the BTRR complex in genome maintenance and DNA replication. These preliminary data demonstrate that BioID can be used to identify novel BTRR sub-complex components that are promising candidates for downstream functional studies.

4.3.2 Identification of changes in proximal proteins in RMI1 mutants

I performed BioID analysis with three different RMI1 mutant baits in order to assess changes in proximal proteins that may contribute to the understanding of associated disease consequences. To do this, I created stable cell lines expressing Flag-BirA* fusion proteins of several RMI1 mutants (Figure 4.5A). The RMI1 K166A and LLTD mutants exhibit decreased physical interaction with BLM and TOP3A in co-IP experiments (Raynard et al., 2008; Yang et al., 2012). These mutants also fail to form foci and do not co-localize with BLM and TOP3A by fluorescence microscopy (Yang et al., 2012). More specifically, the RMI1 K166A mutant has defects in dHJ dissolution (Raynard et al., 2008) and has a reduced rate of normal replication progression (Yang et al., 2012). Aberrant molecular phenotypes have not been demonstrated for the RMI1 S455N polymorphism, and S455 is not located within a known domain structure. Strikingly, however, RMI1 S455N is associated with a significantly increased risk of acute myeloid leukemia and myelodysplastic syndromes (Broberg et al., 2007).

Figure 4.5 Construction of Flag-BirA*-RMI1 mutants. A) Schematic of RMI1 domain structure. RMI1 consists of two putative OB-fold domains and a conserved DUF1767 domain. Red stars denote locations of mutations used in the study. Two mutants with amino acid substitutions (LLTD and K166A) have reduced interaction with the BTRR complex, as shown by co-IP and co-localization experiments. The RMI1 S455N polymorphism is not within a domain structure but is associated with a significantly increased risk of acute myeloid leukemia and myelodysplastic syndromes. B) N-terminally tagged Flag-BirA* fusion proteins of RMI1 and three RMI1 mutants were expressed stably under the control of a tetracycline promoter in HEK293 cells. Cells were incubated with doxycycline (5μg/mL) and soluble biotin (50μM) for 24 hrs. Western blot analysis of cell lysates using RMI1 antibody and Flag antibody confirmed fusion protein expression. Tubulin expression was probed as a protein loading control. * denotes a non-specific band identical in size to Flag-RMI1. C) Western blot analysis with streptavidin-HRP confirmed biotinylation of endogenous proteins. Differences in protein biotinylation are observed in RMI1 mutants compared to RMI1 wildtype.

Identifying the changes in proximal proteins for these mutants using BioID may lead to a better understanding of their molecular phenotypes. As before, I confirmed that mutant RMI1 fusion protein expression is dependent on doxycycline and that expression levels are comparable to Flag-BirA*-RMI1 (Figure 4.5B). Again, I show that protein biotinylation is dependent on the presence of exogenous biotin. Drastic differences observed in protein biotinylation patterns between wildtype RMI1 and the RMI1 mutants could represent differences in contacts with accessory proteins (Figure 4.5C). For example, Flag-BirA*-RMI1 K166A biotinylated fewer proteins than Flag-BirA*-RMI1 wildtype, indicating that expression of RMI1 variants can significantly alter protein interaction profiles.

BioID-MS analysis was performed with HEK293 cell lines expressing Flag-BirA*-RMI1 mutants and I determined true interactors using the same criteria as previously described. The average spectral count data for a representative selection of prey hits are depicted in Figure 4.6 and a complete list of BioID candidates is given in Appendix Table 5. RMI1 S455N and wildtype RMI1 baits have comparable levels of average spectral counts for BLM, TOP3A and RMI2 preys, suggesting that the BTRR complex is intact in this mutant background (Figure 4.6A). By contrast, RMI1 K166A and LLTD baits had drastically decreased spectral counts for BLM and TOP3A prey proteins, corroborating previous observations of defective interaction and abrogated formation of the core BTRR complex by these mutant RMI1 proteins (Yang et al., 2012). From a recent RMI1-TOP3A crystal structure, the K166 residue is in the OB-fold domain that stabilizes the orientation of the hydrophobic zipper that interfaces with TOP3A (Bocquet et al., 2014). Mutation at this residue could cause a crucial conformational change affecting the interface, which may explain the low levels of TOP3A spectral counts. By contrast, structural studies indicate that residues 57-60 (LLTD) in the DUF1767 domain do not contribute to TOP3A interaction because they project away from the interface (Bocquet et al., 2014). Consistent with this, alanine replacement of the LLTD residues reduced TOP3A prey spectral counts, but TOP3A retained a significant SAINT score of ≥ 0.75. Domain binding studies have demonstrated that the N-terminal region of RMI1 (residues 1-211) is necessary for RMI1-BLM interaction and required for stimulating dHJ dissolution (Raynard et al., 2008). Both K166A and LLTD mutations lie in this region, which may account for the abolished BLM spectral counts. Notably, the average prey spectral counts for RMI2 were comparable for RMI1 wildtype and RMI1 mutant baits, suggesting that the RMI1-RMI2 interaction remains intact.

Structural studies of RMI1-RMI2 indicate that S455 is not within a structured domain of RMI1, nor within the interaction interface with RMI2 (Wang et al., 2010). This may explain the observation of comparable spectral counts for all BTRR prey components, using RMI1 S455N and wildtype RMI1 as baits. Extrapolating from these results, defects in dHJ dissolution, decatenation and replication progression associated with K166A and LLTD (Raynard et al., 2008; Yang et al., 2010; Yang et al., 2012) may be a consequence only of disruptions to the BLM and TOP3A interactions, and not the RMI2 interaction. Therefore, our BioID analysis confirmed known interaction profiles from structural studies, and also revealed important novel features of the BTRR core complex organization and function.

Figure 4.6 Comparison of BioID candidates for RMI1 mutants. A) At least two biological replicates of each cell line expressing Flag-BirA* fusion proteins of wildtype RMI1 or three RMI1 mutants were harvested for BioID analysis. A heatmap of average spectral counts is depicted for representative protein interactions in each functional category. A SAINT score of ≥0.75 was used to distinguish true prey protein interactions (black boxes). The total number of prey proteins is listed at the bottom of each bait column. B) Bar graphs of the top ten GO biological process enrichment categories for BioID candidates for each RMI1 mutant. Parenthesized grey numbers represent the total number of candidates that were detected in the category. C) A Venn diagram depicts the overlap of RMI1 and RMI1 S455N proximal prey proteins discovered by BioID.

Wildtype RMI1 bait had 28 BioID candidate proteins, while RMI1 K166A and LLTD mutants only had 15 and 23, respectively. Average spectral counts for BTRR core components and representative BioID candidates are plotted in Figure 4.6A. Not only were the core complex interactions lost in these mutants, but many other known DNA repair protein interactors and potential novel interactors were lost as well. For example, RIF1 and RPA1, which are known interactors that play a role in DNA replication and replication fork dynamics, had drastically diminished spectral counts (Figure 4.6A). The loss of these non-core complex interactions could be relevant to the reduced replication fork dynamics previously observed when these two mutants are expressed.

While many BioID candidates were lost in RMI1 K166A and LLTD baits, several new prey proximal proteins were gained. As expected, DNA replication and DNA repair were the most enriched GO Biological Process categories observed for wildtype RMI1 BioID candidates (Figure 4.6B). By contrast, GO Biological Process enrichment of the RMI1 mutants revealed that protein refolding and chaperone factors were dominant. The chaperone and heat shock proteins CACYBP and STIP1 are two examples (Figure 4.6A). This suggests that lack of proper protein folding could be a contributing factor to the phenotypes observed in RMI1 mutants. Furthermore, this suggests that single base mutations or small disruptions in the crucial domain structures in RMI1 can abrogate important interactions that contribute to BTRR function and localization, possibly through protein mis-folding.

Of the mutants tested, RMI1 S455N and wildtype RMI1 baits shared the most prey proximal hits (25 proteins), each with similar levels of average spectral counts (Figure 4.6C). Of these, AHCTF1 and PDXDC1 had intermediate levels of spectral counts that were more abundant than RMI2 and represent promising candidates for future functional studies. AHCTF1 has an AT-hook DNA binding domain and is both a component of the nuclear pore complex and the kinetochore (Rasala et al., 2006). PDXDC1 is a protein with unknown function containing a pyridoxal dependent decarboxylase domain. While the S455N mutation does not lie within a structured domain of RMI1, it is the only RMI1 mutation that is correlated with a disease phenotype in humans (Broberg et al., 2007). While three BioID proximal proteins were not observed in the mutant, eight proteins were gained (Figure 4.6C). Future research will be geared towards validating these interactions using a complementary method and determining their cooperative role with RMI1 as well as their contribution to the BTRR complex in the maintenance of genome stability. Altogether, our data demonstrate that BioID is a useful tool for an unbiased dissection of in vivo changes to protein interaction networks in polymorphic proteins.
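The gains and losses reported for a mutant bait relative to wildtype amount to set differences between the two candidate lists. The sketch below illustrates this with placeholder prey identifiers chosen only to reproduce the counts given in the text (28 wildtype candidates, 25 shared, 3 lost, 8 gained); the identifiers and the function name are hypothetical, and the actual proteins are listed in Appendix Table 5.

```python
def compare_baits(reference, variant):
    """Split prey proteins into shared, lost and gained relative to a reference bait."""
    reference, variant = set(reference), set(variant)
    return {
        "shared": reference & variant,
        "lost": reference - variant,    # in the reference only
        "gained": variant - reference,  # in the variant only
    }


# Placeholder identifiers: 28 wildtype candidates, of which 25 are retained
# by the variant bait, plus 8 candidates seen only with the variant.
wildtype = {f"WT_PREY_{i:02d}" for i in range(1, 29)}
retained = {f"WT_PREY_{i:02d}" for i in range(1, 26)}
s455n = retained | {f"NEW_PREY_{i}" for i in range(1, 9)}

result = compare_baits(wildtype, s455n)
for category, members in result.items():
    print(category, len(members))
```

With these counts, the variant bait has 25 + 8 = 33 candidates in total.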

4.3.3 Changes in BLM, RMI1 and RMI1 S455N interactions during replication stress

Recently, BLM and its interacting proteins have been implicated in the replication stress response, which requires recruitment and signaling of proteins to the stalled DNA replication fork (Larsen and Hickson, 2013). I used BioID to decipher BTRR core complex protein interaction changes that occur in the presence of aphidicolin in order to dissect the contribution of BTRR to the replication stress response. I grew HEK293 cells expressing BLM, RMI1 or RMI1 S455N baits in the presence of aphidicolin prior to the addition of soluble biotin. Average spectral counts for representative BioID candidates are shown in Figure 4.7A and a complete list is found in Appendix Table 5. As expected, BLM, TOP3A, RMI1 and RMI2 of the core complex represented the strongest interactions in the presence of aphidicolin, with the highest levels of average spectral counts (Figure 4.7A). For wildtype RMI1 and RMI1 S455N baits, I observed elevated spectral counts for each prey core component following treatment with aphidicolin relative to DMSO, which is in agreement with the increased abundance and intensity of nuclear foci (Figure 4.2C). RIF1, a known interactor of RMI1, shared the same elevated spectral count pattern in the presence of aphidicolin, with comparable values to TOP3A. Importantly, each core component was scored as a prey protein interaction for the BLM bait, although the average numbers of spectral counts were lower than the numbers observed for the RMI1 and RMI1 S455N baits. One consideration is that this result could reflect differences in the levels of protein expression in the HEK293 cell lines constructed (Figure 4.2A). Nevertheless, these data indicate that the BioID technique has the sensitivity to detect known core complex interactions and accessory protein interactions required during the replication stress response.


Figure 4.7 Comparison of RMI1, RMI1 S455N and BLM BioID candidates in the presence of replication stress. A) HEK293 cell lines expressing Flag-BirA* fusion proteins of RMI1, RMI1 S455N and BLM were grown in the presence or absence of aphidicolin. At least two biological replicates of each cell line were harvested for BioID analysis. A heatmap of average spectral counts is depicted for representative protein interactions. A SAINT score of ≥ 0.75 was used to distinguish true prey protein interactions (black boxes). B) Venn diagrams comparing BioID candidates in the presence and absence of aphidicolin for RMI1, RMI1 S455N and BLM bait proteins. There was a predominant loss of BioID prey proximal proteins in the presence of aphidicolin. C) Venn diagram comparison of BioID candidates for RMI1 and RMI1 S455N baits in the presence of aphidicolin.


Treatment with aphidicolin led only to losses of prey proximal proteins for the RMI1, RMI1 S455N and BLM baits, with the exception of one prey protein gain for the RMI1 bait, CUEDC2 (Figure 4.7B). However, further inspection revealed that its SAINT score in the absence of aphidicolin was 0.74, just below our stringent cut-off of ≥ 0.75. Future studies will aim to validate CUEDC2 and all other candidate interactions using complementary methods such as co-localization and co-IP. Overall, these data suggest that no significant gains in protein interactions occur with the aphidicolin treatment conditions that I used in this study.
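The gain and loss bookkeeping behind such a comparison can be sketched in a few lines of Python. This is a minimal illustration and not the analysis code used in this study: the function names `compare_conditions` and `near_miss` are hypothetical, and every SAINT score in the example is a placeholder except the CUEDC2 score of 0.74 without aphidicolin and the ≥ 0.75 cut-off, which come from the text.

```python
SAINT_CUTOFF = 0.75  # cut-off stated in the text


def compare_conditions(untreated, treated, cutoff=SAINT_CUTOFF):
    """Classify preys as retained, lost or gained on treatment.

    `untreated` and `treated` map prey name -> SAINT score for one bait.
    A prey that is absent from a condition is treated as not detected.
    """
    sig_untreated = {p for p, s in untreated.items() if s >= cutoff}
    sig_treated = {p for p, s in treated.items() if s >= cutoff}
    return {
        "retained": sig_untreated & sig_treated,
        "lost": sig_untreated - sig_treated,
        "gained": sig_treated - sig_untreated,
    }


def near_miss(scores, cutoff=SAINT_CUTOFF, margin=0.05):
    """Preys that fall just below the cut-off and deserve a second look."""
    return {p for p, s in scores.items() if cutoff - margin <= s < cutoff}


# Placeholder scores for illustration only
# (the untreated CUEDC2 score of 0.74 is the value reported in the text).
dmso = {"TOP3A": 1.0, "RGPD3": 0.9, "CUEDC2": 0.74}
aph = {"TOP3A": 1.0, "RGPD3": 0.3, "CUEDC2": 0.8}
result = compare_conditions(dmso, aph)
```

Flagging near misses alongside the gained set makes it easier to see when an apparent gain, as with CUEDC2, reflects a score that sits just under the threshold in the other condition.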

Predominant loss of interactions during aphidicolin treatment could indicate that I have highlighted the most important subset of accessory proteins that are required during replication stress from the total possible combinations of BTRR sub-complexes that exist in the endogenous cellular environment. For example, RGPD3 and RGPD5 may play a role that is independent of the replication stress response because they are significant prey proximal proteins only in the absence of aphidicolin (Figure 4.7A). While the function of these proteins has not been described in the literature, it is known that they have a GRIP Golgi targeting domain and share sequence identity with RANBP2, a nuclear pore complex component. Interestingly, the RMI1 and RMI1 S455N prey proteins AHCTF1 and PDXDC1 retain their high level of spectral counts in the presence of aphidicolin, suggesting that these proteins could play a role during replication stress. Validation of these four proximal proteins using a complementary assay, such as co-localization in the presence and absence of aphidicolin, will be necessary before further functional characterization. In the presence of aphidicolin, RMI1 S455N had three additional proximal prey proteins compared to RMI1 (Figure 4.7C). However, spectral counts for these BioID candidates were relatively low. Once again, use of a complementary method will be needed to confirm whether these are true proximal proteins. Nevertheless, BioID is a promising tool to narrow down both known and novel accessory proteins that are required in the in vivo response to replication stress.


4.3.4 Knockdown of novel BTRR proximal proteins results in decreased replication progression

Besides the role of BTRR in DNA repair, BLM and RMI1 play a role in the regulation of normal replication progression during the cell cycle. In particular, DNA combing evidence has shown that depletion of these components leads to a decreased rate of replication progression (Chabosseau et al., 2011; Rao et al., 2007; Yang et al., 2012). To identify new proteins that act with BTRR in the regulation of DNA replication progression, I performed an additional screen focusing on proteins with undefined roles in replication from our BioID-MS dataset. I performed an RNAi knockdown screen in U2OS cells using targeted pools of four siRNAs (Appendix Table 4) and assessed changes in EdU incorporation by high-throughput fluorescence microscopy. The mean EdU intensity per nucleus was quantified as a proxy for replication progression and I obtained data for three biological replicates of the screen. In addition to the SAINT score criteria described previously, I chose proteins without clear functions in DNA replication or repair, proteins with nuclear localization and proteins with literature suggesting a function in nucleic acid metabolism. I focused on 16 proteins proximal to BLM, 21 proteins proximal to wildtype RMI1 or RMI1 S455N and 4 that were BioID candidates for all proteins. I included siRNA pools targeting PCNA and a non-targeting siRNA as positive and negative controls, respectively.

As expected, a significant reduction of EdU incorporation was observed relative to a non-targeting (NT) siRNA control following depletion of the replisome component PCNA (Figure 4.8A). Moreover, the distribution of mean EdU intensity per nucleus was significantly different from the NT control in each biological replicate (p < 0.0001; one-tailed Mann-Whitney test) (Figure 4.8B). Finally, the average mean EdU intensity per nucleus from three biological replicates was significantly reduced relative to the NT control (p < 0.05; t-test) (Figure 4.8C). Next, I tested the effects of RMI1 or BLM depletion and found a moderate decrease in EdU incorporation (Figure 4.8A), with the distributions of mean EdU intensities per nucleus being significantly different from the NT control (p < 0.0001; one-tailed Mann-Whitney test) (Figure 4.8B). The average mean EdU intensity per nucleus in three biological replicates was reduced, but the reduction was not significant, which could indicate that the requirement for the BTRR complex during replication is less crucial than the requirement for PCNA, corroborating the iPOND and replisome co-localization results described previously (Figure 4.8C).
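For readers unfamiliar with the test, the one-tailed Mann-Whitney comparison of per-nucleus intensities can be sketched as follows. This is an illustrative pure-Python version under stated assumptions, not the software used for the analysis in this chapter: it uses the normal approximation without tie or continuity correction, which is reasonable only for large samples, and the function names and intensity values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of sorted positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_less(sample, control):
    """One-tailed test that `sample` tends to be lower than `control`.

    Returns (U, p): U is the statistic for `sample`, p is taken from the
    normal approximation (no tie or continuity correction).
    """
    n1, n2 = len(sample), len(control)
    ranks = average_ranks(list(sample) + list(control))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = 0.5 * (1 + math.erf(z / math.sqrt(2)))  # P(Z <= z)
    return u, p


# Hypothetical mean EdU intensities per nucleus (arbitrary units).
knockdown = [1, 2, 3, 4, 5]
nt_control = [6, 7, 8, 9, 10]
u_stat, p_value = mann_whitney_less(knockdown, nt_control)
```

Because the test compares ranks rather than means, a large screen with thousands of nuclei per well can yield very small p-values for modest shifts, which is one reason the replicate-level t-test is reported alongside it.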


Figure 4.8 Reduced EdU incorporation following siRNA depletion of RMI1 and BLM BioID candidates. A) Representative images of U2OS cells following 48 hr transfection with a non-targeting (NT) siRNA control or siRNA pools targeting RMI1, BLM or BioID candidates. EdU (10 μM) was added to media 30 min prior to paraformaldehyde fixation, click chemistry using azide-conjugated fluorophore and DAPI staining of nuclear DNA. B) DAPI signal was used for segmentation of the nuclei and each point on the graph represents mean intensity per nucleus in one biological replicate. Distributions were significantly different from the NT control by one-tailed Mann-Whitney test (* p ≤ 0.05; ** p ≤ 0.0001). C) Bar graph of average EdU mean intensity per nucleus in three biological replicates for RMI1 BioID candidates. Dotted red line represents the average mean EdU intensity following depletion of RMI1. Error bars represent standard error. A legend of RMI1 mutant BioID candidates is shown. D) Bar graph of average EdU mean intensity per nucleus in three replicates for BLM BioID candidates. Dotted red line represents average mean EdU intensity following depletion of BLM. Error bars represent standard error. Four proteins were shared BioID candidates with RMI1 and RMI1 S455N.
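The per-nucleus quantification described in panel B can be illustrated with a small sketch. It assumes that DAPI segmentation has already produced a label mask in which each nucleus carries a unique integer; the function name and the toy pixel values are hypothetical and do not reproduce the imaging software used in this study.

```python
from collections import defaultdict


def mean_intensity_per_nucleus(labels, edu):
    """Mean EdU intensity for every labelled nucleus.

    `labels` is a 2D grid of integers from DAPI segmentation
    (0 = background, 1..N = nucleus IDs); `edu` is the EdU channel
    with the same shape.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for label_row, edu_row in zip(labels, edu):
        for label, value in zip(label_row, edu_row):
            if label != 0:  # skip background pixels
                totals[label] += value
                counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


# Toy 3x4 field with two nuclei (illustrative values, not measured data).
labels = [
    [1, 1, 0, 2],
    [1, 1, 0, 2],
    [0, 0, 0, 0],
]
edu = [
    [10, 20, 5, 100],
    [30, 40, 5, 200],
    [5, 5, 5, 5],
]
per_nucleus = mean_intensity_per_nucleus(labels, edu)
```

Each value in `per_nucleus` corresponds to one point in a plot such as panel B.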


Interestingly, RLTPR and YEATS2 are both novel and unstudied proteins that had reduced average mean EdU intensity per nucleus similar to RMI1 and BLM, respectively, suggesting that they may play a similar role in the maintenance of replication progression (Figure 4.8C). RLTPR is required for the development of regulatory T cells and contains concatenated leucine-rich repeats (LRRs), a tropomodulin capping protein motif and a region rich in proline residues (Liang et al., 2013; Matsuzaka et al., 2004). YEATS2 is a scaffolding protein of a histone acetyltransferase complex and interacts with the TATA-binding protein (TBP) to regulate transcription (Wang et al., 2008). Additionally, there is a link between YEATS domain containing proteins and mixed lineage leukemias (Schulze et al., 2009). While these are just two examples of promising BioID candidates with relatively undefined roles in DNA replication, I found a total of 42/42 depletions that had a significant change in the distributions of mean EdU intensity per nucleus in at least one biological replicate. Thus, the EdU incorporation assay was useful in identifying BioID candidates with possible roles in DNA replication progression for functional characterization. However, one caveat is that a reduced EdU intensity could also reflect cell cycle defects or variable knockdown efficiencies. Therefore, future studies will focus on determining the extent of siRNA knockdown, determining cell cycle effects and resolving the dynamics of replication fork progression.

4.4 Discussion

The BTRR complex is important in the DNA repair pathway and has emerging roles in DNA replication progression. However, localization of the core complex components to chromatin and insoluble nuclear foci has hindered the definition of a comprehensive protein interaction network by traditional methods such as AP-MS. I leveraged the advantage of in vivo biotin labeling in BioID to study BTRR interaction partners. Two previous studies have attempted to catalog BLM interacting partners with AP-MS, but systematic studies have not been attempted with any other components of the BTRR complex. Only three proteins overlapped between these datasets and my BioID candidates: TOP3A, RMI1 and a known accessory protein, RPA1. Encouragingly, I also successfully identified the novel TOPBP1 interaction with BLM (Wang et al., 2013) and I identified 21 proximal proteins that were not found in previous AP-MS attempts using BLM antibodies. Of these, many are known interactors such as Fanconi anemia complex members (FANCM, BRIP1, FANCD2) and DNA repair complex components (SLX4, RAD50, NBN, BRCA1), which are all chromatin associated. As such, I have demonstrated the advantage of using BioID to identify proteins in cellular compartments that are difficult to analyze by AP-MS. These two complementary techniques can be used in combination to build protein interaction networks for each component of the BTRR complex.

Figure 4.9 Comparison of BLM BioID candidates with BLM protein interactors defined by AP-MS A Venn diagram comparison of BLM BioID candidates from this study and two AP-MS studies of BLM protein interactions in human cell lines. Proteins identified in each study are listed.

I note that BLM interaction partners from the literature such as MLH1, MLH2, MSH6 and ERCC6L did not make our stringent BioID candidate cutoff (Figure 4.3A). One limitation of BioID that may explain this result is that in vivo biotin conjugation depends on the number and availability of primary amine acceptor sites in neighbouring proteins. Many sites may be hidden within large multi-component complexes with complicated 3D architecture, such as BTRR and BTRR sub-complexes. As such, results from BioID may not be comprehensive and should be viewed as a screening technique to detect potential interactors.

Nevertheless, I produced the first catalogue of wildtype RMI1 proximal proteins and demonstrated the utility of BioID to probe disease-associated mutations of RMI1. These insights could provide future avenues of research into the mechanism of disease development and could reveal disease-specific protein interactions that can be exploited in therapeutics. I also demonstrated that BioID is appropriate for distinguishing RMI1 and BLM interactors in the presence or absence of aphidicolin, which will give insights into the replication stress response. Using a secondary screening method, I identified candidates that lead to decreased EdU incorporation following knockdown by siRNA, possibly reflecting the role of the BTRR complex in DNA replication. Complementary methods will be used to validate interaction profiles, and candidates with defects in cell cycle progression will be investigated through alternative approaches. Future evaluations of BioID candidates are expected to provide novel insight into BTRR complex function and assembly. These ventures will be discussed in the following chapter.

4.5 Methods

4.5.1 Plasmid Construction

To construct plasmids for generating stable cell lines, the BLM and RMI2 genes were amplified from pCR4-TOPO-BLM and pOTB7-RMI2, respectively. The PCR products were digested with AscI and XhoI and cloned into a pcDNA5/FRT/TO-Flag-BirA* vector to generate N-terminally tagged gene constructs that can be integrated into the mammalian genome using the Flp-In T-REx system (Invitrogen). pcDNA5/FRT/TO-Flag expression vectors containing RMI1, RMI1 K166A, RMI1 LLTD and RMI1 S455N genes (Yang et al., 2012) were restriction digested with AscI and XhoI and the resulting fragments were cloned into the pcDNA5/FRT/TO-Flag-BirA* vector. These constructs were also integrated into the mammalian genome using the Flp-In T-REx system (Invitrogen).

4.5.2 Cell Culture

All culture media were supplemented with 10% fetal bovine serum (FBS), 100 U/mL penicillin, and 100 μg/mL streptomycin. U2OS cells were cultured in McCoy’s 5A medium. HEK293 FRT/TO Flp-In stable cells were cultured in DMEM supplemented with 1x GlutaMAX (35050; Invitrogen), 200 μg/mL hygromycin B, and 5 μg/mL blasticidin. U2OS FRT/TO Flp-In stable cells were cultured in McCoy’s 5A supplemented with 1x GlutaMAX (35050; Invitrogen), 200 μg/mL hygromycin B, and 5 μg/mL blasticidin.


4.5.3 Construction of stable cell lines

HEK293 FRT/TO Flp-In stable cell lines expressing inducible FLAG-BirA* tagged proteins for BioID were generated using the Flp-In T-REx system (Invitrogen). To check the expression of fusion constructs, cells were treated with 5 μg/mL doxycycline and 50 μM biotin for 24 hours. Media was removed and cells were washed twice with cold PBS. Cells were collected in lysis buffer and western blot analysis was performed using FLAG antibody (M2; Sigma-Aldrich), RMI1 antibody (Yang et al., 2012) or streptavidin-HRP (Cell Signaling).

4.5.4 Biotin Identification (BioID)

Proximal and interacting proteins of BLM, RMI1 and RMI2 were identified using BioID as previously described (Roux et al., 2012). For each biological replicate, two 150 mm dishes of HEK293 cells stably expressing a tetracycline-regulated, N-terminally tagged Flag-BirA* protein construct were grown to 60-80% confluency. Cells were treated with 5 μg/mL doxycycline and 50 μM biotin for 24 hours. For drug treatment, cells were incubated in 5 μg/mL doxycycline and 1 μM aphidicolin for 2 hrs and 50 μM biotin was subsequently added to the media for 22 hours. Alternatively, cells were incubated in 5 μg/mL doxycycline and 1 μM aphidicolin for 24 hrs and 50 μM biotin was subsequently added to the media for 24 hours. Media was removed and cells were washed twice with cold PBS. Cells were scraped, pooled together and washed two times with cold PBS. Pellets were flash frozen at -80°C.

4.5.5 BioID affinity purification for mass spectrometry

Pre-cleared lysates after sonication were added to 2 mL RIPA buffer (50 mM Tris-HCl pH 7.5, 150 mM NaCl, 1% Triton X-100, 1 mM EDTA, 1 mM EGTA, 0.1% SDS) supplemented with 1:500 protease inhibitors (Sigma P8340) and 0.5% sodium deoxycholate. 30 μL of streptavidin-sepharose beads (GE Healthcare) were added to equivalent volumes of pre-cleared lysates per sample for affinity purification at 4°C on a nutator for 3 hours. Streptavidin-sepharose beads were pelleted at 400g for 1 min. The supernatant was removed and the beads were washed two times with RIPA buffer, two times with TAP lysis buffer (50 mM HEPES-KOH pH 8.0, 100 mM KCl, 10% glycerol, 2 mM EDTA, and 0.1% NP-40) and three times with ABC buffer (50 mM ammonium bicarbonate pH 8). Streptavidin-sepharose beads were resuspended in 30 μL of ABC buffer containing 1 μg of trypsin and incubated for 17-20 hrs on a rotating disc at 37°C. An additional 0.5 μg of trypsin diluted in 10 μL of ABC buffer was added to each sample and rotated for an additional 2 hrs. Beads were pelleted at 400g for 1 min and the supernatant was transferred to a new tube. Beads were washed twice with 30 μL of filter-sterilized HPLC dH2O and these rinses were combined with the original supernatant. Pooled supernatant was centrifuged at 16100g for 10 minutes and 80 μL was transferred to a new tube. Samples were acidified to 2% formic acid and dried in a speed vacuum.

4.5.6 Mass spectrometry

Mass spectrometry was performed as described in Chapter 3.

4.5.7 Data analysis

Raw spectral count data were analyzed with the Significance Analysis of INTeractome (SAINT) label-free quantification (LFQ) computational analysis tool (Choi et al., 2011). Preys with ≥ 2 spectral counts in at least two biological replicates and a SAINT score of ≥ 0.75 were considered significant positives. ToppGene was used for GO Biological Process enrichment (Chen et al., 2009).
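The candidate criteria above amount to a simple two-part filter, sketched here in Python for clarity. This is an illustration rather than the pipeline used in this study: it assumes SAINT scores have already been computed by the SAINT tool, and the function names, prey names and values are hypothetical.

```python
def is_significant(spectral_counts, saint_score,
                   min_count=2, min_replicates=2, saint_cutoff=0.75):
    """Apply the candidate criteria to one bait-prey pair.

    `spectral_counts` holds one spectral count per biological replicate.
    """
    replicates_passing = sum(1 for c in spectral_counts if c >= min_count)
    return replicates_passing >= min_replicates and saint_score >= saint_cutoff


def filter_preys(table):
    """`table` maps prey -> (per-replicate spectral counts, SAINT score)."""
    return sorted(prey for prey, (counts, score) in table.items()
                  if is_significant(counts, score))


# Hypothetical values for illustration only.
example = {
    "PREY_A": ([12, 15], 0.98),   # passes both criteria
    "PREY_B": ([3, 0], 0.90),     # only one replicate with >= 2 counts
    "PREY_C": ([4, 5, 6], 0.74),  # just below the SAINT cut-off
}
candidates = filter_preys(example)
```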

4.5.8 Fluorescence microscopy for protein localization

HEK293 or U2OS cell lines were grown in 8-well CultureSlides (354108; BD Falcon). Cells were fixed with 2% paraformaldehyde in PBS for 30 minutes, then permeabilized with 0.3% Triton X-100 in PBS for another 30 minutes. Cells were blocked in Blocking Buffer (10% donkey serum, 0.5% NP-40, and 0.5% Saponin in PBS) for 30 minutes at room temperature. For immunostaining, cells were incubated with primary antibody overnight at 4°C, and with secondary antibody for 2 hours at room temperature. Rabbit anti-RMI1 (Yang et al., 2012), streptavidin-Alexa488 (Molecular Probes), mouse-FLAG (M2; Sigma-Aldrich) and goat anti-rabbit Alexa 546 (1:500; Molecular Probes) antibodies were used for detection. All antibodies were diluted in Blocking Buffer. In between each step, cells were washed three times with PBS, 5 minutes each. Cells were stained with 0.4 μg/mL DAPI for 20 minutes to visualize DNA, and mounted with ProLong Gold anti-fade reagent (P36934; Invitrogen). Confocal images were taken using Volocity imaging software (PerkinElmer) on a Leica DMI6000 microscope (Quorum Technologies). The maximum Z-projections of each image containing 11 z-slices with a 0.5 μm step size were analyzed.
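A maximum Z-projection keeps, for every pixel, the brightest value found across the z-slices. The sketch below shows the operation on plain nested lists; it is an illustration only, since the projections in this study were generated in the imaging software, and both function names are hypothetical.

```python
def max_z_projection(stack):
    """Collapse a z-stack (list of 2D slices) into one 2D image.

    Each output pixel is the maximum of that pixel across all z-slices.
    """
    if not stack:
        raise ValueError("empty z-stack")
    height, width = len(stack[0]), len(stack[0][0])
    return [
        [max(z_slice[y][x] for z_slice in stack) for x in range(width)]
        for y in range(height)
    ]


def stack_depth_um(n_slices, step_um):
    """Axial distance between the first and last slice."""
    return (n_slices - 1) * step_um


# Two toy 2x2 slices (illustrative values).
projection = max_z_projection([[[1, 5], [3, 0]], [[4, 2], [1, 7]]])
depth = stack_depth_um(11, 0.5)  # 11 slices at a 0.5 um step
```

With 11 slices and a 0.5 μm step, the stack spans 5 μm between the first and last focal planes.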


4.5.9 iPOND-western

iPOND-western was performed as described previously (Sirbu et al., 2012) and in Chapter 3.

4.5.10 High-throughput fluorescence microscopy

EdU incorporation assay was performed as described in Chapters 2 and 3.


Chapter 5

General Discussion and Future Directions


5 General Discussion and Future directions

In the three previous chapters, I have described several distinct screening procedures to identify novel genes and proteins that may play a role in genome maintenance. Here, I will describe future avenues of research, with a focus on screen validation experiments and downstream functional evaluation of candidate proteins.

5.1 Novel common fragile site maintenance genes detected from spontaneous DNA damage screens in S. cerevisiae

I identified 47 essential genes whose depletion results in spontaneous DNA damage and 92 genes that suppress spontaneous chromosome rearrangements (Figure 5.1A). Notably, Non-Smc element 1 (NSE1) was amongst the core set of 15 genes whose depletion resulted in high levels of both types of events. Moreover, chromosome breakpoints in these mutants are located at specific Ty retrotransposon sites, which are analogous to human common fragile sites (CFSs) (Roeder and Fink, 1980). Nse1 is a subunit of the multifunctional structural maintenance of chromosome 5/6 (SMC5/6) complex, with roles in homologous recombination, DNA replication (Ampatzidou et al., 2006; Chavez et al., 2010; Irmisch et al., 2009) and cohesin recruitment to sites of DNA damage (De Piccoli et al., 2006; Potts et al., 2006; Strom and Sjogren, 2007). However, the functions of NSE1 and other essential protein subunits of SMC5/6 remain poorly characterized relative to its structural counterparts, cohesin and condensin (Jeppsson et al., 2014). With their diverse functions, Nse1 and SMC5/6 may be critical for the maintenance of chromosome stability, particularly at human CFSs.


Figure 5.1 Summary of novel genes and proteins involved in the maintenance of genome stability. A) Summary of the number of yeast essential gene depletions that were tested in Chapter 2. B) Summary of the number of iPOND candidates identified in Chapter 3. C) Summary of the number of wildtype BioID candidates detected for the BTRR complex described in Chapter 4.

Proteins involved in the response to DNA damage and replication stress in S. cerevisiae were shown to be involved in the regulation of fragile site stability. For example, depletion of MEC1, a gene encoding a protein kinase involved in the DNA checkpoint response, confers a similar phenotype to cells depleted of NSE1. Depletion of MEC1 results in replication fork stalling and chromosome breakage at replication slow zones, which are functionally analogous to human CFSs (Cha and Kleckner, 2002). MEC1 is the ortholog of human ATR, which has a known role in the maintenance of common fragile site stability in mammalian cells (Casper et al., 2002; Wan et al., 2010). This suggests that the processes that maintain fragile site stability may be conserved between budding yeast and humans. Future endeavors can be focused on the conserved role of the human ortholog of NSE1 in CFS stability and the mechanism by which it acts.

5.1.1 Function of NSE1 in preventing spontaneous DNA damage and chromosome breaks in human cell lines

To address whether the increased spontaneous DNA damage and chromosome rearrangement phenotypes observed in yeast also occur in humans, two analogous assays could be performed using human cell lines depleted of NSE1. Following the formation of DSBs in humans, H2AX present in flanking areas is phosphorylated at serine residue 139, and is termed γH2AX (Lukas and Bartek, 2009). This modification promotes the localization of cellular DNA repair proteins to damaged areas of the genome. Repair centres can be concentrated in discrete areas of the nucleus that can be visualized by fluorescence microscopy using an antibody specific for γH2AX. Spontaneous accumulation of γH2AX foci indicates the persistence of endogenous DNA damage. The number of γH2AX foci can be quantified in cells depleted of NSE1 and compared to the levels in cells that express wildtype levels of NSE1. Elevated levels of foci would suggest a role for NSE1 in the suppression of spontaneous DNA damage.

Failure of cells to repair spontaneous lesions marked by γH2AX may lead to deleterious translocations or rearrangements. The frequency of chromosome breaks can be visualized on metaphase spreads of human cell lines. A standard protocol can be used where chromosomes from colcemid treated cells are isolated and fixed on to cover slips for nuclear DNA staining with DAPI (Arlt et al., 2004). Using microscopy, the frequency of chromosome breaks and abnormal structures can be quantified in cells depleted of NSE1 and compared to the levels in cells that express wildtype amounts of NSE1. Elevated levels of metaphase chromosome breaks following the depletion of NSE1 will suggest a conserved role of this protein in the maintenance of genome stability in mammalian cells.

5.1.2 Do chromosome breaks caused by NSE1 depletion localize to common fragile sites?

Chromosomal breakpoints may represent preferential sites for recombination and repair. Following the depletion of NSE1 in S. cerevisiae, chromosome breakpoints were observed at Ty retrotransposon elements in the genome that mimic CFSs in humans (Roeder and Fink, 1980). To determine if the depletion of human NSE1 results in breakpoints at CFSs, fluorescence in situ hybridization (FISH) with CFS targeted probes can be performed on the metaphase chromosome spreads described previously. FRA3B and FRA16D are good CFS candidates because they are the most sensitive breakpoint sites in human cells (Durkin and Glover, 2007; Glover et al., 1984; Ohta et al., 1996). Co-localization of FISH probes with chromosome breaks will confirm that they occur at specific CFSs. The frequencies of fragile site breakpoints relative to other sites of breakage can be compared in both wildtype and hNSE1 depleted cells. This quantification can be repeated in the presence and absence of aphidicolin in order to investigate the role of NSE1 in the maintenance of fragile site stability in response to replication stress.


5.1.3 Mechanism by which hNSE1 contributes to the maintenance of common fragile site stability

There are two current models for human CFS expression (Debatisse et al., 2012). One theory is that CFSs are late replicating regions of the genome (Glover et al., 2005; Le Beau et al., 1998; Palakodeti et al., 2004) and that helicase unwinding activity at these regions creates ssDNA with secondary structures that cause difficulty in replication (Glover et al., 2005; Lemoine et al., 2005). However, genome wide analyses have not provided evidence of secondary structure enrichment at CFSs relative to non-fragile sites with similar sequence composition (Helmrich et al., 2006; Tsantoulis et al., 2008). By contrast, a second model suggests that a deficit of replication initiation events at CFSs causes defects in replication completion, which ultimately form chromosome breaks (Letessier et al., 2011). Thus, CFSs are more sensitive to treatments that reduce fork speed, such as the depletion of genes that affect fork progression or treatment with aphidicolin. Preliminary evidence from our study showed that the depletion of NSE1 resulted in a significant decrease in EdU incorporation, suggesting a reduction in replication. Molecular combing experiments can be used to determine the rate of DNA replication and to test the role of NSE1 in replication progression both in the presence and absence of aphidicolin. Further details with respect to this technique will be outlined in the next section, which describes strategies for elucidating iPOND-MS candidates that play a role in replication dynamics. Altogether, the approach summarized here is intended to uncover a mechanistic role for NSE1 and its associated SMC5/6 complex in the maintenance of genome stability.

5.2 Novel DNA replication protein candidates from iPOND-MS

I observed that essential proteins involved in DNA replication play an important role in the maintenance of the S. cerevisiae genome. As this biological process is conserved and integral to cell cycle progression, it is important to define the corresponding proteins involved in DNA replication progression in humans. To this end, I used iPOND-MS to identify 153 proteins that interact with active DNA replication forks (Figure 5.1B). I showed decreased EdU incorporation upon depletion of 12 protein candidates, providing preliminary evidence of their role in DNA replication or cell cycle progression. Future efforts will focus on defining the mechanistic role of novel proteins in DNA replication progression.


5.2.1 Co-localization of iPOND candidates with replisome proteins

I am confident that I have assembled a robust list of iPOND-MS protein candidates, as it included 29 replisome components and was enriched for known proteins involved in DNA replication and repair. However, it will be imperative to confirm the replication fork interaction of candidate proteins using iPOND-western. Furthermore, use of complementary assays will strengthen the inference that protein candidates function at replication forks. I have epitope tagged several candidates in expression vectors, which can be transfected into stable cell lines expressing GFP-PCNA for analysis. Co-localization with a replisome component or, alternatively, co-localization with EdU-marked newly replicated DNA can place novel candidate proteins at sites of DNA replication in vivo.

5.2.2 Contributions of iPOND candidates to the dynamics of DNA replication progression

For protein candidates with a confirmed interaction with the replisome and newly replicated DNA, it will be most interesting to decipher their role in DNA replication progression. To this end, I demonstrated a reduced EdU incorporation phenotype following depletions of iPOND candidates using siRNA pools. It will be imperative to deconvolute the pools and test the effects of individual siRNAs to eliminate off-target effects. Cell cycle defects can be excluded by use of two-parameter FACS analysis that will compare EdU incorporation to total DNA content marked by propidium iodide (Bugler et al., 2010; Daboussi et al., 2008; Hanada et al., 2007). An increase in the fraction of EdU-containing cells in response to depletion will indicate that inactivation causes S-phase delay. Propidium iodide staining can also be used to assess if reduced EdU incorporation phenotypes were a consequence of cell cycle delays. As the depletion of replisome components important for replication progression, such as PCNA and RPA, leads to S-phase defects (Kittler et al., 2007), we anticipate prioritizing iPOND candidates with S-phase delays for downstream analysis.

To assess the direct effect of depletion on the rate of replication fork progression, I can use DNA combing, a technique that allows the study of replication at the resolution of individual DNA molecules (Michalet et al., 1997). Newly synthesized DNA in cells is labelled by the incorporation of bromodeoxyuridine (BrdU) or other halogenated thymidine analogues such as chloro- or iododeoxyuridine (CldU, IdU). Individual DNA fibers are then straightened and aligned on specialized coverslips. DNA fibers can be stained with an anti-DNA antibody, while nascent DNA tracts can be distinguished using different fluorochrome-labelled antibodies specific to the thymidine analogues. BrdU tract lengths correlate with the rate of DNA synthesis and thus with the rate of replication fork progression. Shorter tract lengths in gene depleted cells compared to cells expressing wildtype levels of the gene of interest can suggest a role in the maintenance of replication fork progression in a normal cell cycle. This analysis pipeline has been performed with high success in previous iPOND studies (Alabert et al., 2014; Lopez-Contreras et al., 2013; Sirbu et al., 2013). Identification of novel protein targets will expand the current knowledge and understanding of DNA replication.
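The conversion from measured tract length to fork speed can be sketched as follows. This is a hypothetical illustration of the arithmetic, not a protocol from this thesis: the stretching factor of 2 kb per μm is an assumed value that should be calibrated for the combing setup in use, and the tract lengths and the 20 min pulse in the example are placeholders.

```python
from statistics import median

KB_PER_UM = 2.0  # assumed stretching factor for combed DNA; calibrate per setup


def fork_speed_kb_per_min(tract_length_um, pulse_min, kb_per_um=KB_PER_UM):
    """Convert one labelled tract length into a fork speed (kb/min)."""
    if pulse_min <= 0:
        raise ValueError("pulse duration must be positive")
    return tract_length_um * kb_per_um / pulse_min


def median_fork_speed(tract_lengths_um, pulse_min, kb_per_um=KB_PER_UM):
    """Median fork speed over all tracts measured for one condition."""
    return median(fork_speed_kb_per_min(t, pulse_min, kb_per_um)
                  for t in tract_lengths_um)


# Placeholder tract lengths in micrometres, 20 min analogue pulse.
control_speed = median_fork_speed([10, 12, 14], pulse_min=20)
depleted_speed = median_fork_speed([5, 6, 7], pulse_min=20)
```

The median is used here because tract length distributions are often skewed; comparing the full distributions between conditions would be a natural complement.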

5.2.3 Future applications for iPOND-MS

Together with previous studies, I have established the utility of iPOND-MS in identifying new candidate DNA replication proteins. There is an opportunity to leverage this technique by using cell cycle synchronization to precisely identify proteins recruited at each step of DNA replication. Using a double thymidine block of HeLa cells and release into a synchronous S-phase, Kliszczak et al. revealed the enrichment of certain proteins on DNA replicating in late S-phase as opposed to early S-phase. Another application of the iPOND technique is to assess the difference in protein recruitment in the presence of replication stress or DNA damaging reagents. Sirbu et al. used HU in HEK293T suspension cells to test the effects of stalled replication forks, and HU with an ATR inhibitor to test the effects of collapsed DNA replication forks. They revealed newly identified proteins that are needed to facilitate DNA replication in stressed conditions (Sirbu et al., 2013). Relevant to our studies, it would be advantageous to test the effects of aphidicolin, which has a different mechanism for inducing stress and is a known inducer of common fragile site breakage. Furthermore, testing the effects of chemotherapeutic agents that target DNA replication will be useful in understanding mechanisms of action and deciphering novel drug targets. One opportunity that has not been explored in the literature by iPOND-MS is the knockdown of proteins involved in DNA replication or the expression of mutant proteins, which would allow assessment of differences in protein recruitment to the DNA replication fork. Relevant to our research, it would be interesting to test BLM -/- cells or cells expressing clinical mutations of RMI1. These efforts will expand the application of the iPOND-MS screening tool to enhance our understanding of protein recruitment dynamics under different cellular conditions and mutational backgrounds.


5.3 BioID candidates that are proximal to the BTRR complex

I used the novel BioID technology to identify 49 candidate interactors for BLM, RMI1 and RMI2 (Figure 5.1C). Of these, I distinguished a primary subset of 31 BLM and RMI1 proximal proteins that were also responsive to aphidicolin-induced replication stress. Additionally, I identified changes in proximal proteins upon expression of several RMI1 mutants, which revealed a subset of 11 candidate changes between the clinical RMI1 S455N mutant and wildtype RMI1. In the future, BioID analysis of Flag-BirA*-TOP3A or of the Flag-BirA*-TOP3A Y337F mutant, which lacks enzymatic function (Goulaouic et al., 1999), would make the BTRR complex study more comprehensive. For completeness, it would also be beneficial to test the proximal proteins of a mutant of the RMI1 loop (amino acids 94-134), which is critical for interaction with TOP3A (Bocquet et al., 2014). Nevertheless, I have defined many novel proteins that may cooperate with the BTRR complex in genome maintenance. Depletion of 25 candidates reduced EdU incorporation intensities, suggesting possible involvement in DNA replication (Figure 5.1C). I foresee a role for BTRR accessory proteins not only in replication, but also in other modes of genome maintenance in which the BTRR complex functions. As such, downstream studies will be aimed both at validating novel BTRR protein interactions and at studying the mechanism by which they function.

5.3.1 Co-localization of BioID candidates with BTRR complex

Our list of BioID candidates included many known interactions of the BTRR complex and is enriched for DNA replication and repair processes. Importantly, I detected 21 novel BLM proximal proteins that were not identified in previous AP-MS attempts. As such, our list has the potential of containing interactors that have never been detected in the literature. It will be imperative to confirm the BioID candidate interactions with the BTRR complex using complementary methods. For example, epitope-tagged candidates in expression vectors can be transfected into stable cell lines expressing GFP-tagged BTRR components for analysis by fluorescence microscopy. In vivo co-localization can be used to validate their proximity to the BTRR complex. Furthermore, accompanying co-immunoprecipitation experiments can be used to determine if the BioID candidates are associated with the BTRR complex members. Together, these experiments will localize BioID candidates and BTRR complex members to the same spatial compartments, which may indicate functional cooperation. As discussed in the following sections, a major goal will be to decipher the roles of BioID candidates in each of the main genome maintenance pathways in which the BTRR complex plays an essential role.

5.3.2 Function of BioID candidates in DNA replication

Studies have described the essential role for BLM and RMI1 in promoting normal replication progression in unperturbed cells (Chabosseau et al., 2011; Rao et al., 2007; Yang et al., 2012). I anticipate that several BioID candidates may have similar roles in DNA replication progression and thus assessed their possible contributions using an EdU incorporation microscopy assay. The same pipeline of experiments outlined for iPOND-MS candidate validation can be used to assess the roles of these candidates in replication progression. This will elucidate the BTRR accessory proteins that may act cooperatively to regulate the rate of DNA replication progression.

Particularly for proximal proteins of BTRR that were detected in the presence of aphidicolin, a modification to the molecular combing experiment can be used to test their contribution to replication fork stalling and recovery. Depleted cells can be labeled sequentially with CldU and IdU. CldU will mark the presence of actively replicating forks. Following a short labeling time, aphidicolin can be added to the CldU-containing media, which will presumably stall replication forks and inhibit further CldU incorporation. After removal of aphidicolin, recovery of stalled forks in the presence of IdU can mark forks that are able to resume DNA replication. The efficiency of fork restart can be determined by dividing the number of forks that resume replication (CldU tracts followed by IdU) by the total number of CldU-labeled forks. A reduction in the efficiency of recovery upon depletion of BTRR complex members or BioID candidates would support their role in DNA replication fork restart. Molecular combing experiments can additionally be performed with double knockdowns of BTRR accessory proteins and core complex members to decipher if they work in the same pathway to promote replication progression. A double knockdown that is no more severe than either single knockdown would suggest a shared pathway, whereas synergistic effects may suggest that BTRR and BioID candidates act in distinct functional pathways that regulate DNA replication. This approach has been used successfully to decipher the role of BLM and its accessory protein, RIF1, in replication fork restart (Davies et al., 2007; Xu et al., 2010). Thus, I am hopeful that this approach will identify novel BTRR complex interactors that act in DNA replication progression and the replication stress response.
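The fork restart calculation described above can be sketched as follows. This is a minimal illustration, assuming that each scored fibre is classified either as restarted (a CldU tract followed by IdU) or as stalled (CldU only); the class name, function name and counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ForkCounts:
    """Fibre counts for one condition (hypothetical scoring scheme)."""
    restarted: int  # CldU tracts followed by an IdU tract
    stalled: int    # CldU-only tracts with no IdU incorporation


def restart_efficiency(counts):
    """Fraction of CldU-labelled forks that resumed replication."""
    total = counts.restarted + counts.stalled
    if total == 0:
        raise ValueError("no forks were scored")
    return counts.restarted / total


# Hypothetical counts for a control and a depleted sample.
control = ForkCounts(restarted=170, stalled=30)
depleted = ForkCounts(restarted=110, stalled=90)

print(f"control:  {restart_efficiency(control):.2f}")
print(f"depleted: {restart_efficiency(depleted):.2f}")
```

A lower fraction in the depleted sample would be consistent with a defect in fork restart, provided that enough fibres are scored in each condition for the difference to be tested statistically.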


5.3.3 Function of BioID candidates in suppression of sister chromatid exchange

The diagnostic phenotype for Bloom syndrome patients is a ten-fold elevated level of SCE. Each BTRR complex member is essential for the suppression of SCE, as elevated levels of SCE were observed upon the individual depletion of each BTRR component (Chaganti et al., 1974; Wang et al., 2000; Wu and Hickson, 2003; Xu et al., 2008; Yin et al., 2005). BTRR complex interacting proteins such as FANCM also display this phenotype, suggesting that they play a cooperative role in SCE suppression (Hoadley et al., 2012). To visualize SCEs, cells depleted of BioID candidate genes using targeted siRNAs could be grown in the presence of BrdU to achieve differential labeling of sister chromatids. Sister chromatids can be differentiated using the standard fluorescence plus Giemsa technique described previously (Perry and Wolff, 1974). The number of SCEs per chromosome, compared with wildtype cells and with cells depleted of BTRR complex members, can be used to assess the involvement of BioID candidates in this process. Elevated levels of SCE observed upon depletion may indicate a contribution of novel interactors to the Bloom syndrome phenotype, which would merit further mechanistic studies.
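The SCE comparison can be expressed as a short calculation. The sketch below assumes that each metaphase spread is scored as a pair of (SCE count, chromosome count); the function names and the example counts are hypothetical and serve only to illustrate the arithmetic.

```python
def sce_per_chromosome(spreads):
    """Pooled SCE frequency: total SCEs divided by total chromosomes scored.

    `spreads` is a list of (sce_count, chromosome_count) tuples,
    one tuple per metaphase spread.
    """
    total_sce = sum(sce for sce, _ in spreads)
    total_chromosomes = sum(chrom for _, chrom in spreads)
    if total_chromosomes == 0:
        raise ValueError("no chromosomes were scored")
    return total_sce / total_chromosomes


def sce_fold_change(depleted, control):
    """Fold change in SCE per chromosome relative to the control sample."""
    return sce_per_chromosome(depleted) / sce_per_chromosome(control)


# Hypothetical scores for two spreads per condition.
control_spreads = [(8, 46), (10, 46)]
depleted_spreads = [(80, 46), (100, 46)]

print(f"fold change: {sce_fold_change(depleted_spreads, control_spreads):.1f}")
```

Pooling across spreads weights each chromosome equally; averaging per-spread frequencies instead would be an equally reasonable design choice when spreads differ in chromosome number.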

5.3.4 Function of BioID candidates in double Holliday junction dissolution

The canonical function of the BTRR complex is to catalyze double Holliday junction (dHJ) dissolution, generating non-crossover intermediates during homologous recombination repair (Wu and Hickson, 2003). It is postulated that defects in this process are the cause of elevated levels of SCE (Wu and Hickson, 2003). In vitro studies have shown that BLM helicase activity is required for branch migration of Holliday junctions and conversion to hemi-catenane structures, which can in turn be unlinked by the single strand passage activity of TOP3A (Wu and Hickson, 2003). Additionally, the activity of this reaction is stimulated by RMI1 and RMI2 (Singh et al., 2008; Wu et al., 2006). It would be interesting to determine whether RMI1 and BLM BioID candidates play a stimulatory role in dHJ dissolution using the in vitro assays described in these studies. A standard technique can be used to assemble linked DNA structures that mimic dHJs (Wu and Hickson, 2003). Increasing concentrations of purified BioID candidates can be incubated with the DNA structures in the presence of BLM, TOP3A, RMI1, or any combination thereof. After termination of the reaction and degradation of proteins in solution, the resulting DNA products can be assayed on a denaturing gel. Decatenated structures can clearly be distinguished from the original dHJ structures by differences in migration pattern. While BLM and TOP3A alone can drive the formation of decatenated structures, addition of RMI1 stimulated the reaction and caused elevated levels of decatenated structures (Wu et al., 2006). Using the same set-up, equal or increased levels of decatenated structures upon addition of BioID candidates in comparison to RMI1 would strongly suggest a comparable stimulatory role and involvement in BTRR complex activity. This assay can be used to uncover a mechanistic role of novel accessory proteins to the BTRR complex and their contribution to the suppression of SCE.

5.3.5 Function of BioID candidates at ultra-fine anaphase bridges and common fragile sites

Ultra-fine anaphase bridges (UFBs) are structures that connect sister chromatids in mitosis and cannot be visualized using conventional DAPI or Hoechst staining (Baumann et al., 2007). Instead, detection relies on immunostaining of proteins that localize to UFBs. PLK1-interacting checkpoint helicase (PICH/ERCC6L) localizes to UFBs (Baumann et al., 2007), where it specifically recruits BLM, which in turn recruits TOP3A and RMI1 of the core complex (Chan et al., 2007). The number of UFBs is significantly increased in BLM-/- cells, indicating that BLM is required for their suppression (Chan et al., 2007). Thus the BTRR complex plays an important role in genome maintenance beyond DNA replication and repair in S-phase (Chan and Hickson, 2011; Chan et al., 2007). I detected PICH as a proximal protein to the RMI1 bait, suggesting that I assayed a subset of cells with UFBs. As this is an emerging area of study, it is likely that other proteins with undefined function in UFB suppression are present in our list of BLM and RMI1 interactors defined from BioID.

One class of UFBs co-localizes with CFSs (Chan et al., 2009). CFSs are regions of the genome that are prone to exhibit incomplete replication, which is exacerbated by replication stalling agents or depletion of genes required for DNA replication progression (Debatisse et al., 2012; Letessier et al., 2011). Commencement of mitosis before these regions are fully replicated can result in the formation of UFBs (Letessier et al., 2011; Ozeri-Galai et al., 2011). Other BTRR complex interacting proteins, such as FANCD2 and FANCI of the Fanconi Anemia (FA) replication stress response complex, also localize to UFBs that contain CFSs (Chan et al., 2009). Interestingly, a known BLM interactor and FA complex protein, FANCM, was detected as a proximal protein of both BLM and RMI1 baits in our BioID study. It is proposed that BTRR recruits FANCM through a “hand-off mechanism”, since FANCM localizes to UFBs following dissociation of BLM (Chan and Hickson, 2011; Vinciguerra et al., 2010). As I detected an example of a BTRR complex accessory protein that is localized to UFBs containing CFSs, I envision that other proteins in our BioID candidate list could be involved in the same function. Below I outline a strategy for deciphering these BioID candidates.

BioID candidates detected in the presence of aphidicolin stress should be prioritized, as aphidicolin is commonly used to induce CFS expression. GFP-tagged BioID candidates can be expressed in U2OS cells, which can subsequently be immunostained for PICH or BTRR complex components and stained with DAPI to detect nuclear DNA. Co-localization of GFP-tagged BioID candidates with PICH or BTRR complex members at UFBs would be quantified. Furthermore, FISH probes targeting FRA3B or FRA16D can be added to distinguish the proportion of UFBs containing CFSs. This study would elucidate novel proteins that function together with the BTRR complex, particularly in the maintenance of CFS stability in anaphase. Furthermore, UFB quantification in cells depleted of BioID candidates can also be performed. A significant increase in the number of UFBs, a phenotype observed in BLM-/- cells, may implicate the depleted protein in suppression of UFB formation. Finally, if UFBs are unresolved in anaphase, chromosomal breakage can occur when cells divide in telophase, leading to the formation of micronuclei (Minocherhomji and Hickson, 2014; Sarbajna and West, 2014). For example, when BLM or FANCM complex proteins are depleted in cells, the number of micronuclei increases significantly (Blackford et al., 2012; Chan et al., 2009). An increase in micronuclei formation in DAPI-stained cells depleted of certain BioID candidates would implicate these candidates in a cooperative role. FISH staining can also be applied to these cells to define the proportion of micronuclei that resulted from unsuccessful UFB resolution at CFSs. This is an exciting area of study, as the role of the BTRR complex in UFB and CFS stability is relatively novel to the field. I anticipate that our BioID dataset will aid the discovery of novel accessory proteins that cooperate with the BTRR complex.
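One way to judge whether an increase in UFB-positive cells is significant is a one-sided Fisher's exact test on the counts of anaphase cells with and without UFBs. The sketch below is an illustrative choice of test that I introduce here, not a method prescribed by the studies cited above; the function name and all counts are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: depleted cells, control cells.
    Columns: UFB-positive, UFB-negative anaphase cells.
    Returns the probability of observing at least `a` UFB-positive
    depleted cells if depletion had no effect.
    """
    row1 = a + b            # depleted cells scored
    col1 = a + c            # UFB-positive cells in both samples
    n = a + b + c + d       # all cells scored
    denom = comb(n, col1)
    p_value = 0.0
    for x in range(a, min(row1, col1) + 1):
        # Hypergeometric probability of x UFB-positive cells in the depleted row.
        p_value += comb(row1, x) * comb(n - row1, col1 - x) / denom
    return p_value


# Hypothetical counts: 45 of 100 depleted cells and 20 of 100 control
# cells show at least one UFB.
p = fisher_exact_greater(45, 55, 20, 80)
print(f"one-sided p-value: {p:.2e}")
```

The same function could be applied to micronuclei counts, or to the number of UFBs that do or do not carry a FRA3B or FRA16D FISH signal.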

5.3.6 Future applications of BioID methodology

There is tremendous opportunity to apply BioID technology to other proteins besides the BTRR complex. As alluded to in previous sections, it has potential in uncovering proteins that are membrane bound or are within chromatin compartments that are difficult to assay using other methods such as AP-MS. As I have shown that DNA replication proteins are essential to genome maintenance, it may be useful to use BioID-MS to assess proximal proteins of PCNA or replicative polymerases as a complement to our iPOND-MS study. Furthermore, there is the opportunity to test the changes in protein interaction networks caused by cell cycle synchronization or by chemotherapeutic agents that target DNA replication. Whereas I have described a few options for future application of this technology, there are many further uses for BioID as a screening tool in enhancing our understanding of protein interaction networks under different cellular conditions and mutational backgrounds.

5.4 Conclusion

Multiple proteins are involved in the cellular pathways that act to preserve the stability of the human genome. I have highlighted the essential role of DNA replication proteins and the role of the BTRR complex in genome maintenance. Together with the large core protein dataset revealed by the screening studies in my PhD thesis, the outlined experiments offer a multi-faceted approach to identifying the novel proteins that safeguard our genome from mutations that are associated with disease development.


Appendices

Table 1. List of Tet alleles used in Ddc2 foci microscopy assay

ORF | Tet allele strain name | ORF | Tet allele strain name | ORF | Tet allele strain name | ORF | Tet allele strain name
YBR236C | ABD1 | YIL026C | IRR1 | YJL072C | PSF2 | YDR356W | SPC110
YKL112W | ABF1 | YDR367W | KEI1 | YER171W | RAD3 | YDR201W | SPC19
YNL048W | ALG11 | YOR181W | LAS17 | YOR048C | RAT1 | YMR117C | SPC24
YBR070C | ALG14 | YMR296C | LCB1 | YOL010W | RCL1 | YLR066W | SPC3
YBR243C | ALG7 | YFL018C | LPD1 | YBR002C | RER2 | YHR172W | SPC97
YBR211C | AME1 | YER112W | LSM4 | YPL010W | RET3 | YNL126W | SPC98
YNL172W | APC1 | YNL006W | LST8 | YJR068W | RFC2 | YGL207W | SPT16
YDL008W | APC11 | YIL150C | MCM10 | YOL094C | RFC4 | YBR253W | SRB6
YBR234C | ARC40 | YPR019W | MCM4 | YBR087W | RFC5 | YKL154W | SRP102
YDL029W | ARP2 | YLR274W | MCM5 | YBL020W | RFT1 | YDL092W | SRP14
YJL081C | ARP4 | YBR202W | MCM7 | YOL005C | RPB11 | YPL243W | SRP68
YPR034W | ARP7 | YLR106C | MDN1 | YOR151C | RPB2 | YPL210C | SRP72
YKL052C | ASK1 | YHR058C | MED6 | YHR143W-a | RPC10 | YNL222W | SSU72
YIL004C | BET1 | YOL135C | MED7 | YBR079C | RPG1 | YJL156C | SSY5
YDR299W | BFR2 | YBR193C | MED8 | YFR004W | RPN11 | YIL126W | STH1
YDL220C | CDC13 | YIL046W | MET30 | YDL147W | RPN5 | YIR011C | STS1
YFR028C | CDC14 | YIL106W | MOB1 | YOR261C | RPN8 | YLR305C | STT4
YKL022C | CDC16 | YPL082C | MOT1 | YDL140C | RPO21 | YLR045C | STU2
YGL116W | CDC20 | YOR370C | MRS6 | YPR187W | RPO26 | YKL018W | SWD2
YOR074C | CDC21 | YAL034W-a | MTW1 | YJR123W | RPS5 | YGR274C | TAF1
YLR314C | CDC3 | YHR023W | MYO1 | YGL048C | RPT6 | YDR167W | TAF10
YOR257W | CDC31 | YPL190C | NAB3 | YDL111C | RRP42 | YDR145W | TAF12
YLR103C | CDC45 | YOR372C | NDD1 | YCR035C | RRP43 | YML098W | TAF13
YDL126C | CDC48 | YHR072W-a | NOP10 | YCR052W | RSC6 | YCR042C | TAF2
YDL132W | CDC53 | YNL251C | NRD1 | YML127W | RSC9 | YBR198C | TAF5
YJL194W | CDC6 | YLR007W | NSE1 | YER125W | RSP5 | YGL112C | TAF6
YJR057W | CDC8 | YPL233W | NSL1 | YPL218W | SAR1 | YMR227C | TAF7
YDL164C | CDC9 | YOL069W | NUF2 | YKL193C | SDS22 | YMR236W | TAF9
YLR459W | CDC91 | YGL092W | NUP145 | YLR166C | SEC10 | YJR046W | TAH11
YBR135W | CKS1 | YIL115C | NUP159 | YIR022W | SEC11 | YPL128C | TBF1
YDL145C | COP1 | YJR042W | NUP85 | YLR208W | SEC13 | YKR062W | TFA2
YGL238W | CSE1 | YDL193W | NUS1 | YMR079W | SEC14 | YDR460W | TFB3
YLR323C | CWC24 | YPR162C | ORC4 | YBL050W | SEC17 | YGR186W | TFG1
YDR016C | DAD1 | YHR118C | ORC6 | YBR080C | SEC18 | YGR005C | TFG2
YDR052C | DBF4 | YJL002C | OST1 | YDR498C | SEC20 | YIL144W | TID3
YHR019C | DED81 | YOR103C | OST2 | YNL287W | SEC21 | YMR146C | TIF34
YOR236W | DFR1 | YMR076C | PDS5 | YIL109C | SEC24 | YDR429C | TIF35
YHR164C | DNA2 | YOR281C | PLP2 | YGL137W | SEC27 | YGL145W | TIP20
YDR141C | DOP1 | YER003C | PMI40 | YDL195W | SEC31 | YOR194C | TOA1
YJL090C | DPB11 | YML069W | POB3 | YLR440C | SEC39 | YHR099W | TRA1
YPR175W | DPB2 | YNL102W | POL1 | YFL005W | SEC4 | YDR407C | TRS120
YPR183W | DPM1 | YBL035C | POL12 | YFL045C | SEC53 | YBR254C | TRS20
YNL258C | DSL1 | YNL262W | POL2 | YMR013C | SEC59 | YDR177W | UBC1
YIR010W | DSN1 | YBR088C | POL30 | YOR254C | SEC63 | YDL064W | UBC9
YBR252W | DUT1 | YER012W | PRE1 | YLR026C | SED5 | YGR048W | UFD1
YFL024C | EPL1 | YKL045W | PRI2 | YLR430W | SEN1 | YGL225W | VRG4
YBL040C | ERD2 | YKR086W | PRP16 | YGR195W | SKI6 | YEL002C | WBP1
YPL028W | ERG10 | YLL036C | PRP19 | YKL108W | SLD2 | YDR341C | YDR341C
YGL001C | ERG26 | YGR091W | PRP31 | YDR489W | SLD5 | YGR172C | YIP1
YBR102C | EXO84 | YPR178W | PRP4 | YIL147C | SLN1 | YOR262W | YOR262W
YDL166C | FAP7 | YBR237W | PRP5 | YDR189W | SLY1 | YGR198W | YPP1
YJR093C | FIP1 | YBR055C | PRP6 | YER029C | SMB1 | YFL038C | YPT1
YGR267C | FOL2 | YDL055C | PSA1 | YFL008W | SMC1 | |
YLR060W | FRS1 | YMR308C | PSE1 | YLR086W | SMC4 | |
YHR188C | GPI16 | YDR013W | PSF1 | YOR159C | SME1 | |


Table 2. Functional descriptions of Tet alleles that displayed elevated Ddc2 foci

ORF | GENE | Description
YBR234C | ARC40 | Subunit of the ARP2/3 complex, which is required for the motility and integrity of cortical actin patches
YJL081C | ARP4 | Nuclear actin-related protein involved in chromatin remodeling, component of chromatin-remodeling enzyme complexes
YIL004C | BET1 | Type II membrane protein required for vesicular transport between the endoplasmic reticulum and Golgi complex; v-SNARE with similarity to synaptobrevins
YOR074C | CDC21 | Thymidylate synthase, required for de novo biosynthesis of pyrimidine deoxyribonucleotides; expression is induced at G1/S
YLR103C | CDC45 | DNA replication initiation factor; recruited to MCM pre-RC complexes at replication origins; promotes release of MCM from Mcm10p, recruits elongation machinery; mutants in human homolog may cause velocardiofacial and DiGeorge syndromes
YDL164C | CDC9 | DNA ligase found in the nucleus and mitochondria, an essential enzyme that joins Okazaki fragments during DNA replication; also acts in nucleotide excision repair, , and recombination
YBR135W | CKS1 | Cyclin-dependent protein kinase regulatory subunit and adaptor; modulates proteolysis of M-phase targets through interactions with the proteasome; role in transcriptional regulation, recruiting proteasomal subunits to target gene promoters
YDL145C | COP1 | Alpha subunit of COPI vesicle coatomer complex, which surrounds transport vesicles in the early secretory pathway
YGL238W | CSE1 | protein that mediates the nuclear export of importin alpha (Srp1p), homolog of metazoan CAS protein, required for accurate chromosome segregation
YDR052C | DBF4 | Regulatory subunit of Cdc7p-Dbf4p kinase complex, required for Cdc7p kinase activity and initiation of DNA replication; phosphorylates the Mcm2-7 family of proteins; cell cycle regulated
YHR164C | DNA2 | Tripartite DNA replication factor with single-stranded DNA-dependent ATPase, ATP-dependent nuclease, and helicase activities; required for Okazaki fragment processing; involved in DNA repair; cell-cycle dependent localization
YDR141C | DOP1 | Golgi-localized, leucine-zipper domain containing protein; involved in endosome to Golgi transport, organization of the ER, establishing cell polarity, and morphogenesis; detected in highly purified mitochondria in high-throughput studies
YJL090C | DPB11 | Replication initiation protein that loads DNA pol epsilon onto pre-replication complexes at origins; checkpoint sensor recruited to stalled replication forks by the checkpoint clamp complex where it activates Mec1p; ortholog of human TopBP1
YGL001C | ERG26 | C-3 sterol dehydrogenase, catalyzes the second of three steps required to remove two C-4 methyl groups from an intermediate in ergosterol biosynthesis
YIL026C | IRR1 | Subunit of the cohesin complex, which is required for sister chromatid cohesion during mitosis and meiosis and interacts with centromeres and chromosome arms, essential for viability
YDR367W | KEI1 | Component of inositol phosphorylceramide (IPC) synthase; forms a complex with Aur1p and regulates its activity; required for IPC synthase complex localization to the Golgi; post-translationally processed by Kex2p; KEI1 is an essential gene
YPR019W | MCM4 | Essential helicase component of heterohexameric MCM2-7 complexes which bind pre-replication complexes on DNA and melt the DNA prior to replication; accumulates in the nucleus in G1; homolog of S. pombe Cdc21p
YLR274W | MCM5 | Component of the hexameric MCM complex, which is important for priming origins of DNA replication in G1 and becomes an active ATP-dependent helicase that promotes DNA melting and elongation when activated by Cdc7p-Dbf4p in S-phase
YBR202W | MCM7 | Component of the hexameric MCM complex, which is important for priming origins of DNA replication in G1 and becomes an active ATP-dependent helicase that promotes DNA melting and elongation when activated by Cdc7p-Dbf4p in S-phase
YLR106C | MDN1 | Huge dynein-related AAA-type ATPase (midasin), forms extended pre-60S particle with the Rix1 complex (Rix1p-Ipi1p-Ipi3p); acts in removal of ribosomal biogenesis factors at successive steps of pre-60S assembly and export from nucleus
YPL082C | MOT1 | Essential abundant protein involved in regulation of transcription, removes Spt15p (TBP) from DNA via its C-terminal ATPase activity, forms a complex with TBP that binds TATA DNA with high affinity but with altered specificity
YAL034W-A | MTW1 | Essential component of the MIND kinetochore complex (Mtw1p Including Nnf1p-Nsl1p-Dsn1p) which joins kinetochore subunits contacting DNA to those contacting microtubules; critical to kinetochore assembly
YOR372C | NDD1 | Transcriptional activator essential for nuclear division; localized to the nucleus; essential component of the mechanism that activates the expression of a set of late-S-phase-specific genes
YLR007W | NSE1 | Essential subunit of the Mms21-Smc5-Smc6 complex; required for DNA repair and growth; has a nonstructural role in the maintenance of chromosomes
YOL069W | NUF2 | Component of the evolutionarily conserved kinetochore-associated Ndc80 complex (Ndc80p-Nuf2p-Spc24p-Spc25p); involved in chromosome segregation, spindle checkpoint activity and kinetochore clustering
YNL262W | POL2 | Catalytic subunit of DNA polymerase (II) epsilon, a chromosomal DNA replication polymerase that exhibits processivity and proofreading activity; also involved in DNA synthesis during DNA repair; interacts extensively with Mrc1p
YBR088C | POL30 | Proliferating cell nuclear antigen (PCNA), functions as the sliding clamp for DNA polymerase delta; may function as a docking site for other proteins required for mitotic and meiotic chromosomal DNA replication and for DNA repair
YJL072C | PSF2 | Subunit of the GINS complex (Sld5p, Psf1p, Psf2p, Psf3p), which is localized to DNA replication origins and implicated in assembly of the DNA replication machinery
YER171W | RAD3 | 5' to 3' DNA helicase, involved in nucleotide excision repair and transcription; subunit of RNA polII initiation factor TFIIH and of Nucleotide Excision Repair Factor 3 (NEF3); homolog of human XPD protein; mutant has aneuploidy tolerance
YJR068W | RFC2 | Subunit of heteropentameric Replication factor C (RF-C), which is a DNA binding protein and ATPase that acts as a clamp loader of the proliferating cell nuclear antigen (PCNA) processivity factor for DNA polymerases delta and epsilon
YBR087W | RFC5 | Subunit of heteropentameric Replication factor C (RF-C), which is a DNA binding protein and ATPase that acts as a clamp loader of the proliferating cell nuclear antigen (PCNA) processivity factor for DNA polymerases delta and epsilon
YCR052W | RSC6 | Component of the RSC chromatin remodeling complex; essential for mitotic growth; homolog of SWI/SNF subunit Swp73p
YER125W | RSP5 | E3 ubiquitin ligase of the NEDD4 family; involved in regulating many cellular processes, including MVB sorting, heat shock response, transcription, and endocytosis; human homolog is involved in Liddle syndrome; mutant tolerates aneuploidy
YDL195W | SEC31 | Component of the Sec13p-Sec31p complex of the COPII vesicle coat, required for vesicle formation in ER to Golgi transport; mutant has increased aneuploidy tolerance
YDR489W | SLD5 | Subunit of the GINS complex (Sld5p, Psf1p, Psf2p, Psf3p), which is localized to DNA replication origins and implicated in assembly of the DNA replication machinery
YGL207W | SPT16 | Subunit of the heterodimeric FACT complex (Spt16p-Pob3p), which associates with chromatin via interaction with Nhp6Ap and Nhp6Bp, and reorganizes nucleosomes to facilitate access to DNA by RNA and DNA polymerases
YBR253W | SRB6 | Subunit of the RNA polymerase II mediator complex; associates with core polymerase subunits to form the RNA polymerase II holoenzyme; essential for transcriptional regulation
YPL243W | SRP68 | Core component of the signal recognition particle (SRP) ribonucleoprotein (RNP) complex that functions in targeting nascent secretory proteins to the endoplasmic reticulum (ER) membrane
YPL210C | SRP72 | Core component of the signal recognition particle (SRP) ribonucleoprotein (RNP) complex that functions in targeting nascent secretory proteins to the endoplasmic reticulum (ER) membrane
YDR145W | TAF12 | Subunit (61/68 kDa) of TFIID and SAGA complexes, involved in RNA polymerase II transcription initiation and in chromatin modification, similar to histone H2A
YML098W | TAF13 | TFIID subunit (19 kDa), involved in RNA polymerase II transcription initiation, similar to histone H4 with atypical histone fold motif of Spt3-like transcription factors
YCR042C | TAF2 | TFIID subunit (150 kDa), involved in RNA polymerase II transcription initiation
YMR236W | TAF9 | Subunit (17 kDa) of TFIID and SAGA complexes, involved in RNA polymerase II transcription initiation and in chromatin modification, similar to histone H3

Table 3. List of 153 proteins that interact with nascent DNA determined by iPOND-MS1

CLK THY Ratio of Sum of Sum of Spectral Spectral Saint Spectral Saint Counts Cell gH2AX3An Functional Prey Gene Protein Description AA Counts Score Counts Score CLK/THY Z-score Cycle2 notation Annotation4 Histone H2A type 1- 4 Other HIST1H2AA A 131 85 1 0 n/a * n/a Chromatin Probable global transcription 4 Other SMARCA1 activator SNF2L1 1054 50 1 0 n/a * n/a Chromatin chromatin assembly S 4 Other CHAF1A factor 1 subunit A 956 35 1 0 n/a * n/a arrest Chromatin LIG3 DNA ligase 3 1009 22 1 0 n/a * n/a 2 DNA repair DNA mismatch G0/1 MLH1 repair protein MLH1 756 21 1 0 n/a * n/a arrest 2 DNA repair DNA polymerase delta catalytic POLD1 subunit 1107 21 1 0 n/a * n/a 1 DNA replication DNA repair protein RAD50 RAD50 1312 19 1 0 n/a * n/a 2 DNA repair histone-binding protein RBBP4 S 4 Other RBBP4 isoform a 425 17 1 0 n/a * n/a arrest Chromatin LIG1 DNA ligase 1 919 17 1 0 n/a * n/a 1 DNA replication PMS1 protein PMS1 homolog 1 932 17 1 0 n/a * n/a 2 DNA repair DNA mismatch MSH3 repair protein Msh3 1137 16 1 0 n/a * n/a 2 DNA repair ribonuclease H2 RNASEH2B subunit B isoform 1 312 14 1 0 n/a * n/a 1 DNA replication DNA primase large PRIM2 subunit 509 13 1 0 n/a * n/a 1 DNA replication WD repeat and HMG-box DNA- binding protein 1 WDHD1 isoform 1 1129 13 1 0 n/a * n/a 1 DNA replication DNA repair protein XRCC1 XRCC1 633 10 1 0 n/a * n/a 2 DNA repair histone-lysine N- methyltransferase 4 Other EHMT1 EHMT1 isoform 1 1298 10 1 0 n/a * n/a Chromatin

119

CLK THY Ratio of Sum of Sum of Spectral Protein Spectral Saint Spectral Saint Counts Cell Functional Prey Gene Description AA Counts Score Counts Score CLK/THY Z-score Cycle2 gH2AX3 Annotation4 7 No expected ubiquitin carboxyl- chromatin USP7 terminal hydrolase 7 1102 10 1 0 n/a * n/a function Histone-lysine N- methyltransferase 4 Other EHMT2 EHMT2 1210 10 1 0 n/a * n/a Chromatin DNA polymerase epsilon catalytic S POLE subunit A 2286 10 1 0 n/a * n/a arrest 1 DNA replication double-strand break repair protein MRE11A MRE11A isoform 1 708 9 1 0 n/a * n/a 2 DNA repair ATPase family AAA domain- S ATAD5 containing protein 5 1844 9 1 0 n/a * n/a arrest 2 DNA repair DNA polymerase POLD3 delta subunit 3 466 8 1 0 n/a * n/a 1 DNA replication PCNA-associated KIAA0101 factor isoform 1 111 8 1 0 n/a * n/a 2 DNA repair DNA polymerase alpha catalytic S POLA1 subunit 1462 7 1 0 n/a * n/a arrest 1 DNA replication ribonuclease H2 RNASEH2A subunit A 299 7 1 0 n/a * n/a 1 DNA replication protein ZNF644 644 isoform 1 1327 6 1 0 n/a * n/a 3 Transcription Serine/threonine- protein kinase TLK2 tousled-like 2 772 5 0.99 0 n/a * n/a 2 DNA repair 7 No expected peroxiredoxin-4 chromatin PRDX4 precursor 271 5 0.99 0 n/a * n/a function Putative pre- mRNA-splicing factor ATP- dependent RNA DHX16 helicase DHX16 1041 5 0.99 0 n/a * n/a 5 RNA processing

120

CLK THY Ratio of Sum of Sum of Spectral Protein Spectral Saint Spectral Saint Counts Cell Functional Prey Gene Description AA Counts Score Counts Score CLK/THY Z-score Cycle2 gH2AX3 Annotation4 cyclin-dependent CDK9 kinase 9 372 5 0.99 0 n/a * n/a 3 Transcription protein timeless S TIMELESS homolog 1208 4 0.99 0 n/a * n/a arrest high 3 Transcription C-terminal-binding CTBP2 protein 2 isoform 1 445 4 0.95 0 n/a * n/a 3 Transcription ubiquitin carboxyl- 7 No expected terminal hydrolase 5 chromatin USP5 isoform 2 835 4 0.95 0 n/a * n/a function 7 No expected tubulin-specific chromatin TBCA chaperone A 108 4 0.95 0 n/a * n/a function 7 No expected Casein kinase II chromatin CSNK2A1 subunit alpha 391 3 0.97 0 n/a * n/a function 7 No expected methionine--tRNA G0/1 chromatin MARS ligase, cytoplasmic 900 3 0.95 0 n/a * n/a arrest high function MRG-binding 4 Other C20orf20 protein 204 3 0.95 0 n/a * n/a Chromatin nuclear protein 7 No expected localization protein chromatin NPLOC4 4 homolog 608 3 0.95 0 n/a * n/a function DNA replication complex GINS GINS1 protein PSF1 196 3 0.95 0 n/a * n/a 1 DNA replication denticleless protein G2 DTL homolog 730 3 0.95 0 n/a * n/a arrest 2 DNA repair negative elongation RDBP factor E 380 3 0.95 0 n/a * n/a 3 Transcription Origin recognition ORC4 complex subunit 4 436 3 0.95 0 n/a * n/a 1 DNA replication 7 No expected dihydrofolate chromatin DHFR reductase 187 3 0.95 0 n/a * n/a function replication factor C RFC1 subunit 1 isoform 1 1147 32 1 1 0 32.00 n/a 1 DNA replication

121

CLK THY Ratio of Sum of Sum of Spectral Protein Spectral Saint Spectral Saint Counts Cell Functional Prey Gene Description AA Counts Score Counts Score CLK/THY Z-score Cycle2 gH2AX3 Annotation4 Replication factor C RFC5 subunit 5 340 18 1 1 0 18.00 n/a 1 DNA replication DNA replication MCM5 MCM5 734 12 1 1 0 12.00 n/a 1 DNA replication replication factor C RFC4 subunit 4 363 23 1 2 0 11.50 n/a 1 DNA replication replication protein A 70 kDa DNA- S RPA1 binding subunit 616 44 1 4 0.81 11.00 n/a arrest high 1 DNA replication 4 Other WIZ protein WIZ 794 8 0.99 1 0 8.00 n/a Chromatin DNA replication licensing factor MCM4 MCM4 863 15 1 2 0 7.50 n/a 1 DNA replication SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily A containing DEAD/H 4 Other SMARCAD1 box 1 1026 6 0.99 1 0 6.00 n/a Chromatin Telomere- associated protein 4 Other RIF1 RIF1 2472 6 0.95 1 0 6.00 n/a Chromatin probable ATP- dependent RNA DDX46 helicase DDX46 1031 6 0.95 1 0 6.00 n/a 5 RNA processing replication factor C RFC3 subunit 3 isoform 1 356 17 1 3 0.63 5.67 n/a 1 DNA replication transitional 7 No expected endoplasmic chromatin VCP reticulum ATPase 806 11 1 2 0.42 5.50 n/a function 7 No expected chromatin PRDX6 peroxiredoxin-6 224 5 0.99 1 0 5.00 n/a function

122

MCM6 | DNA replication licensing factor MCM6 | 821 | 14 | 1 | 3 | 0.5 | 4.67 | n/a | | | 1 DNA replication
PTBP1 | Polypyrimidine tract-binding protein 1 | 531 | 14 | 1 | 3 | 0.5 | 4.67 | n/a | | | 5 RNA processing
PRMT1 | protein arginine N-methyltransferase 1 isoform 3 | 353 | 14 | 1 | 3 | 0.47 | 4.67 | n/a | | high | 4 Other Chromatin
RPS7 | 40S ribosomal protein S7 | 194 | 9 | 1 | 2 | 0.49 | 4.50 | n/a | S arrest | | 7 No expected chromatin function
SKIV2L2 | superkiller viralicidic activity 2-like 2 | 1042 | 4 | 0.99 | 1 | 0 | 4.00 | n/a | | | 5 RNA processing
EIF3D | eukaryotic translation initiation factor 3 subunit D | 548 | 4 | 0.99 | 1 | 0 | 4.00 | n/a | G2 arrest | | 7 No expected chromatin function
POLD2 | DNA polymerase delta subunit 2 isoform 1 | 469 | 4 | 0.99 | 1 | 0 | 4.00 | n/a | | | 1 DNA replication
TNPO1 | transportin-1 isoform 2 | 890 | 4 | 0.99 | 1 | 0 | 4.00 | n/a | Cell division defect | | 7 No expected chromatin function
RNH1 | ribonuclease inhibitor | 461 | 4 | 0.99 | 1 | 0 | 4.00 | n/a | | | 7 No expected chromatin function
KDM1A | Lysine-specific histone demethylase 1A | 852 | 4 | 0.99 | 1 | 0 | 4.00 | n/a | | | 4 Other Chromatin
RBM8A | RNA-binding protein 8A | 174 | 4 | 0.95 | 1 | 0 | 4.00 | n/a | Cell division defect | high | 5 RNA processing
DNAJC8 | dnaJ homolog subfamily C member 8 | 253 | 4 | 0.95 | 1 | 0 | 4.00 | n/a | | | 5 RNA processing


PRIM1 | DNA primase small subunit | 420 | 4 | 0.95 | 1 | 0 | 4.00 | n/a | | high | 1 DNA replication
PFKP | 6-phosphofructokinase type C isoform 1 | 784 | 4 | 0.95 | 1 | 0 | 4.00 | n/a | | | 7 No expected chromatin function
SF3A1 | splicing factor 3A subunit 1 isoform 1 | 793 | 8 | 0.97 | 2 | 0 | 4.00 | n/a | Cell division defect | high | 5 RNA processing
IMPDH2 | inosine-5'-monophosphate dehydrogenase 2 | 514 | 7 | 0.99 | 2 | 0.49 | 3.50 | n/a | | | 7 No expected chromatin function
RBBP7 | histone-binding protein RBBP7 isoform 2 | 425 | 19 | 1 | 6 | 0.73 | 3.17 | n/a | | | 4 Other Chromatin
ATRX | transcriptional regulator ATRX isoform 2 | 2454 | 3 | 0.95 | 1 | 0 | 3.00 | n/a | | | 3 Transcription
CLIC1 | chloride intracellular channel protein 1 | 241 | 3 | 0.95 | 1 | 0 | 3.00 | n/a | G0/1 arrest | | 7 No expected chromatin function
NOSIP | nitric oxide synthase-interacting protein | 301 | 3 | 0.95 | 1 | 0 | 3.00 | n/a | | | 7 No expected chromatin function
MTPN | myotrophin | 118 | 3 | 0.95 | 1 | 0 | 3.00 | n/a | G0/1 arrest | | 7 No expected chromatin function
POLR2C | DNA-directed RNA polymerase II subunit RPB3 | 275 | 3 | 0.95 | 1 | 0 | 3.00 | n/a | S arrest | | 3 Transcription
CSTF3 | cleavage stimulation factor subunit 3 isoform 1 | 717 | 3 | 0.95 | 1 | 0 | 3.00 | n/a | | | 5 RNA processing
RPL9 | 60S ribosomal protein L9 | 192 | 3 | 0.95 | 1 | 0 | 3.00 | n/a | | | 7 No expected chromatin function
GRWD1 | glutamate-rich WD repeat-containing protein 1 | 446 | 3 | 0.95 | 1 | 0 | 3.00 | n/a | | | 6 Uncharacterized


SMARCC2 | SWI/SNF complex subunit SMARCC2 | 1214 | 3 | 0.95 | 1 | 0 | 3.00 | n/a | | | 4 Other Chromatin
POGZ | Pogo transposable element with ZNF domain | 1410 | 3 | 0.95 | 1 | 0 | 3.00 | n/a | | | 4 Other Chromatin
PIAS1 | E3 SUMO-protein ligase PIAS1 | 651 | 3 | 0.95 | 1 | 0 | 3.00 | n/a | | | 3 Transcription
RPA2 | replication protein A 32 kDa subunit | 270 | 6 | 0.93 | 2 | 0.39 | 3.00 | n/a | S arrest | high | 2 DNA repair
MCM2 | DNA replication licensing factor MCM2 | 904 | 15 | 1 | 5 | 0.86 | 3.00 | n/a | | | 1 DNA replication
HDAC2 | histone deacetylase 2 | 488 | 12 | 1 | 4 | 0.5 | 3.00 | n/a | | | 4 Other Chromatin
CDK1 | cyclin-dependent kinase 1 isoform 1 | 297 | 11 | 0.91 | 4 | 0.17 | 2.75 | n/a | G2 arrest | | 6 Uncharacterized
PGD | 6-phosphogluconate dehydrogenase, decarboxylating | 483 | 13 | 0.97 | 5 | 0.31 | 2.60 | n/a | | | 7 No expected chromatin function
SRSF6 | serine/arginine-rich splicing factor 6 | 344 | 5 | 0.99 | 2 | 0.49 | 2.50 | n/a | | | 5 RNA processing
NASP | nuclear autoantigenic sperm protein isoform 2 | 788 | 5 | 0.99 | 2 | 0 | 2.50 | n/a | | | 4 Other Chromatin
RALY | RNA-binding protein RALY | 306 | 5 | 0.99 | 2 | 0 | 2.50 | n/a | | | 5 RNA processing
PDS5B | sister chromatid cohesion protein PDS5 homolog B | 1447 | 5 | 0.99 | 2 | 0.49 | 2.50 | n/a | | | 4 Other Chromatin
EIF3C | eukaryotic translation initiation factor 3 subunit C | 913 | 5 | 0.99 | 2 | 0 | 2.50 | n/a | S arrest | | 6 Uncharacterized
U2AF2 | Splicing factor U2AF 65 kDa subunit | 475 | 5 | 0.99 | 2 | 0.5 | 2.50 | n/a | | high | 5 RNA processing
RBM4 | RNA-binding protein 4 isoform 1 | 364 | 5 | 0.99 | 2 | 0 | 2.50 | n/a | | | 5 RNA processing


SMC2 | Structural maintenance of chromosomes protein 2 | 1197 | 5 | 0.95 | 2 | 0 | 2.50 | n/a | | | 4 Other Chromatin
RDX | radixin isoform 2 | 583 | 10 | 1 | 4 | 0.5 | 2.50 | n/a | | | 7 No expected chromatin function
TRMT1 | tRNA (guanine(26)-N(2))-dimethyltransferase | 659 | 7 | 0.99 | 3 | 0.5 | 2.33 | n/a | | | 7 No expected chromatin function
RPL3 | 60S ribosomal protein L3 isoform a | 403 | 9 | 0.91 | 4 | 0.38 | 2.25 | n/a | S arrest | | 7 No expected chromatin function
CFL1 | cofilin-1 | 166 | 4 | 0.99 | 2 | 0 | 2.00 | n/a | | | 7 No expected chromatin function
GMPS | GMP synthase [glutamine-hydrolyzing] | 693 | 4 | 0.99 | 2 | 0.49 | 2.00 | n/a | | | 7 No expected chromatin function
TXNL1 | thioredoxin-like protein 1 | 289 | 4 | 0.99 | 2 | 0.49 | 2.00 | n/a | | | 6 Uncharacterized
RPL5 | 60S ribosomal protein L5 | 297 | 4 | 0.96 | 2 | 0.49 | 2.00 | n/a | S arrest | | 7 No expected chromatin function
RAD18 | E3 ubiquitin-protein ligase RAD18 | 495 | 6 | 0.95 | 3 | 0.5 | 2.00 | n/a | | | 2 DNA repair
HSPA4 | heat shock 70 kDa protein 4 | 840 | 7 | 0.97 | 4 | 0.74 | 1.75 | n/a | | | 7 No expected chromatin function
SF3B1 | splicing factor 3B subunit 1 isoform 1 | 1304 | 5 | 0.96 | 3 | 0.5 | 1.67 | n/a | Cell division defect | high | 5 RNA processing
NONO | non-POU domain-containing octamer-binding protein isoform 1 | 471 | 33 | 1 | 20 | 0.55 | 1.65 | n/a | | | 5 RNA processing


HNRNPR | heterogeneous nuclear ribonucleoprotein R isoform 2 | 633 | 13 | 0.98 | 8 | 0.65 | 1.63 | n/a | | | 5 RNA processing
ACLY | ATP-citrate synthase | 1101 | 13 | 0.92 | 8 | 0.63 | 1.63 | n/a | | | 7 No expected chromatin function
SNRNP200 | U5 small nuclear ribonucleoprotein 200 kDa helicase | 2136 | 8 | 0.99 | 5 | 0.88 | 1.60 | n/a | | | 5 RNA processing
SF3B3 | splicing factor 3B subunit 3 | 1217 | 14 | 0.95 | 9 | 0.46 | 1.56 | n/a | G2 arrest | high | 5 RNA processing
RUVBL2 | ruvB-like 2 | 463 | 17 | 0.99 | 11 | 0.88 | 1.55 | n/a | | | 4 Other Chromatin
ENO1 | alpha-enolase isoform 1 | 434 | 52 | 1 | 34 | 0.5 | 1.53 | n/a | | | 7 No expected chromatin function
HN1L | hematological and neurological expressed 1-like protein | 190 | 3 | 0.95 | 2 | 0 | 1.50 | n/a | | | 6 Uncharacterized
EXOSC4 | exosome complex component RRP41 | 245 | 3 | 0.95 | 2 | 0.49 | 1.50 | n/a | S arrest | | 7 No expected chromatin function
SRP14 | signal recognition particle 14 kDa protein | 136 | 3 | 0.95 | 2 | 0.49 | 1.50 | n/a | | | 7 No expected chromatin function
PSMD3 | 26S proteasome non-ATPase regulatory subunit 3 | 534 | 3 | 0.95 | 2 | 0.49 | 1.50 | n/a | G0/1 arrest | | 7 No expected chromatin function
RPS13 | 40S ribosomal protein S13 | 151 | 3 | 0.95 | 2 | 0.49 | 1.50 | n/a | | | 7 No expected chromatin function


CPNE1 | Copine-1 | 537 | 3 | 0.95 | 2 | 0.49 | 1.50 | n/a | | | 7 No expected chromatin function
PUF60 | Poly(U)-binding-splicing factor PUF60 | 559 | 3 | 0.95 | 2 | 0.49 | 1.50 | n/a | | | 5 RNA processing
CCT2 | T-complex protein 1 subunit beta isoform 1 | 535 | 16 | 0.98 | 11 | 0.61 | 1.45 | n/a | Cell division defect | | 7 No expected chromatin function


SYNCRIP | Heterogeneous nuclear ribonucleoprotein Q | 623 | 13 | 0.9 | 10 | 0.81 | 1.30 | n/a | | | 5 RNA processing
CCT6A | T-complex protein 1 subunit zeta isoform a | 531 | 10 | 0.94 | 8 | 0.89 | 1.25 | n/a | | | 7 No expected chromatin function
PFN1 | profilin-1 | 140 | 17 | 0.9 | 14 | 0.89 | 1.21 | n/a | | | 7 No expected chromatin function
TARDBP | TAR DNA-binding protein 43 | 414 | 9 | 0.93 | 8 | 0.84 | 1.13 | n/a | | | 3 Transcription
NCL | nucleolin | 710 | 45 | 0.92 | 41 | 0.76 | 1.10 | n/a | G0/1 arrest | | 5 RNA processing
DARS | aspartate--tRNA ligase, cytoplasmic | 501 | 7 | 0.9 | 7 | 0.89 | 1.00 | n/a | | | 7 No expected chromatin function
CSDA | DNA-binding protein A isoform a | 372 | 9 | 0.9 | 9 | 0.89 | 1.00 | n/a | | | #N/A
BAZ1B | tyrosine-protein kinase BAZ1B | 1483 | 97 | 1 | 13 | 1 | 7.46 | 5.14 | | | 4 Other Chromatin
CHAF1B | chromatin assembly factor 1 subunit B | 559 | 22 | 1 | 3 | 0.95 | 7.33 | 5.03 | | | 4 Other Chromatin
PCNA | proliferating cell nuclear antigen | 261 | 77 | 1 | 12 | 1 | 6.42 | 4.23 | S arrest | | 1 DNA replication
MSH2 | DNA mismatch repair protein Msh2 isoform 1 | 934 | 69 | 1 | 11 | 1 | 6.27 | 4.10 | | | 2 DNA repair
MSH6 | DNA mismatch repair protein Msh6 | 1360 | 94 | 1 | 20 | 1 | 4.70 | 2.73 | | | 2 DNA repair
PDS5A | sister chromatid cohesion protein PDS5 homolog A isoform 1 | 1337 | 10 | 1 | 3 | 0.95 | 3.33 | 1.54 | | | 4 Other Chromatin


XPO5 | exportin-5 | 1204 | 10 | 1 | 3 | 0.95 | 3.33 | 1.54 | G0/1 arrest | | 7 No expected chromatin function
SUPT16H | FACT complex subunit SPT16 | 1047 | 66 | 1 | 20 | 1 | 3.30 | 1.51 | | | 3 Transcription
SSRP1 | FACT complex subunit SSRP1 | 709 | 36 | 1 | 11 | 1 | 3.27 | 1.49 | | | 3 Transcription
SUPT5H | Transcription elongation factor SPT5 | 1087 | 9 | 1 | 3 | 0.95 | 3.00 | 1.25 | G0/1 arrest | | 3 Transcription
KPNA2 | importin subunit alpha-2 | 529 | 9 | 1 | 3 | 0.95 | 3.00 | 1.25 | | | 7 No expected chromatin function
DNMT1 | DNA (cytosine-5)-methyltransferase 1 isoform a | 1632 | 84 | 1 | 28 | 1 | 3.00 | 1.25 | | | 4 Other Chromatin
TOP2A | DNA topoisomerase 2-alpha | 1531 | 101 | 1 | 34 | 1 | 2.97 | 1.22 | | | 1 DNA replication
MCM7 | DNA replication licensing factor MCM7 isoform 1 | 719 | 23 | 1 | 8 | 0.97 | 2.88 | 1.14 | | | 1 DNA replication
NUMA1 | nuclear mitotic apparatus protein 1 | 2115 | 14 | 1 | 5 | 0.99 | 2.80 | 1.08 | | | 4 Other Chromatin
TOP2B | DNA topoisomerase 2-beta | 1621 | 29 | 1 | 11 | 1 | 2.64 | 0.93 | | | 1 DNA replication
SMARCA5 | SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily A member 5 | 1052 | 124 | 1 | 49 | 1 | 2.53 | 0.84 | | | 4 Other Chromatin
MCM3 | DNA replication licensing factor MCM3 | 808 | 29 | 1 | 12 | 1 | 2.42 | 0.74 | | | 1 DNA replication


ATAD2 | ATPase family AAA domain-containing protein 2 | 1390 | 26 | 1 | 11 | 1 | 2.36 | 0.69 | | | 6 Uncharacterized
CSE1L | exportin-2 isoform 1 | 971 | 34 | 1 | 15 | 1 | 2.27 | 0.61 | | | 7 No expected chromatin function
IPO5 | importin-5 | 1115 | 9 | 1 | 4 | 0.95 | 2.25 | 0.60 | | | 7 No expected chromatin function
MTA1 | metastasis-associated protein MTA1 | 715 | 13 | 1 | 6 | 0.99 | 2.17 | 0.52 | | | 4 Other Chromatin
FANCI | Fanconi anemia group I protein | 1328 | 26 | 1 | 12 | 1 | 2.17 | 0.52 | | high | 2 DNA repair
UHRF1 | ubiquitin-like with PHD and ring finger domains 1 | 793 | 34 | 1 | 16 | 1 | 2.13 | 0.49 | | | 4 Other Chromatin
KIF22 | kinesin-like protein KIF22 isoform 1 | 665 | 6 | 1 | 3 | 0.95 | 2.00 | 0.38 | | | 4 Other Chromatin
MTA2 | metastasis-associated protein MTA2 | 668 | 8 | 1 | 4 | 0.99 | 2.00 | 0.38 | | | 4 Other Chromatin
HDAC1 | histone deacetylase 1 | 482 | 10 | 1 | 5 | 0.99 | 2.00 | 0.38 | | | 4 Other Chromatin
ADNP | activity-dependent neuroprotector protein | 1102 | 12 | 1 | 6 | 0.99 | 2.00 | 0.38 | | | 3 Transcription
UBA1 | ubiquitin-like modifier activating enzyme 1 | 1058 | 16 | 1 | 8 | 0.99 | 2.00 | 0.38 | | | 6 Uncharacterized
FANCD2 | Fanconi anemia, complementation group D2 | 1471 | 18 | 1 | 9 | 1 | 2.00 | 0.38 | | | 2 DNA repair


DHX15 | putative pre-mRNA-splicing factor ATP-dependent RNA helicase DHX15 | 795 | 19 | 1 | 10 | 0.99 | 1.90 | 0.29 | | | 5 RNA processing
XRCC6 | X-ray repair cross-complementing protein 6 | 609 | 36 | 1 | 19 | 1 | 1.89 | 0.29 | | | 2 DNA repair
CHD4 | chromodomain-helicase-DNA-binding protein 4 | 1912 | 36 | 1 | 19 | 1 | 1.89 | 0.29 | | | 4 Other Chromatin
GTF2I | general transcription factor II-I isoform 1 | 998 | 58 | 1 | 31 | 1 | 1.87 | 0.26 | | | 3 Transcription
RECQL | RecQ helicase-like | 649 | 9 | 1 | 5 | 0.99 | 1.80 | 0.20 | | | 2 DNA repair
SUB1 | activated RNA polymerase II transcriptional coactivator p15 | 127 | 9 | 1 | 5 | 0.99 | 1.80 | 0.20 | | | 3 Transcription
FEN1 | flap endonuclease 1 | 380 | 9 | 1 | 5 | 0.99 | 1.80 | 0.20 | | | 1 DNA replication
PPM1G | protein phosphatase 1G | 546 | 9 | 1 | 5 | 0.99 | 1.80 | 0.20 | | | 7 No expected chromatin function
DDX1 | ATP-dependent RNA helicase DDX1 | 740 | 9 | 0.99 | 5 | 0.96 | 1.80 | 0.20 | G0/1 arrest | | 6 Uncharacterized
XRCC5 | X-ray repair cross-complementing protein 5 | 732 | 25 | 1 | 14 | 1 | 1.79 | 0.19 | | | 2 DNA repair
TAGLN2 | transgelin-2 | 199 | 16 | 1 | 9 | 0.99 | 1.78 | 0.18 | | | 6 Uncharacterized
TKT | transketolase isoform 1 | 623 | 23 | 1 | 13 | 1 | 1.77 | 0.18 | | | 7 No expected chromatin function


EIF4A3 | eukaryotic initiation factor 4A-III | 411 | 19 | 0.98 | 11 | 0.92 | 1.73 | 0.14 | Cell division defect | high | 5 RNA processing
VRK1 | serine/threonine-protein kinase VRK1 | 396 | 5 | 0.99 | 3 | 0.95 | 1.67 | 0.09 | | | 4 Other Chromatin
KIF4A | chromosome-associated kinesin KIF4A | 1232 | 5 | 0.99 | 3 | 0.95 | 1.67 | 0.09 | G2 arrest | | 4 Other Chromatin
DDB1 | DNA damage-binding protein 1 | 1140 | 10 | 1 | 6 | 0.96 | 1.67 | 0.09 | S arrest | high | 2 DNA repair
CAND1 | cullin-associated NEDD8-dissociated protein 1 | 1230 | 20 | 1 | 12 | 0.99 | 1.67 | 0.09 | | | 3 Transcription
PARP1 | poly [ADP-ribose] polymerase 1 | 1014 | 112 | 1 | 68 | 1 | 1.65 | 0.07 | G0/1 arrest | | 2 DNA repair
SMCHD1 | structural maintenance of chromosomes flexible hinge domain-containing protein 1 | 2005 | 18 | 1 | 11 | 1 | 1.64 | 0.06 | | | 6 Uncharacterized
EIF3A | eukaryotic translation initiation factor 3 subunit A | 1382 | 8 | 1 | 5 | 0.96 | 1.60 | 0.03 | S arrest | high | 7 No expected chromatin function
LTA4H | leukotriene A-4 hydrolase isoform 1 | 611 | 8 | 0.99 | 5 | 0.99 | 1.60 | 0.03 | G0/1 arrest | | 7 No expected chromatin function
HELLS | lymphoid-specific helicase | 838 | 11 | 1 | 7 | 0.99 | 1.57 | 0.00 | | | 4 Other Chromatin
SET | SET nuclear oncogene | 290 | 11 | 1 | 7 | 1 | 1.57 | 0.00 | | | 3 Transcription


PRKDC | DNA-dependent protein kinase catalytic subunit isoform 1 | 4128 | 58 | 1 | 37 | 1 | 1.57 | 0.00 | | | 2 DNA repair
FBL | rRNA 2'-O-methyltransferase fibrillarin | 321 | 6 | 0.89 | 6 | 0.89 | 1.00 | | | | 5 RNA processing
ACP1 | low molecular weight phosphotyrosine protein phosphatase isoform c | 158 | 8 | 0.89 | 4 | 0.39 | 2.00 | | | | 7 No expected chromatin function
RPSA | ribosomal protein SA | 295 | 8 | 0.89 | 7 | 0.86 | 1.14 | | | | 7 No expected chromatin function
SND1 | staphylococcal nuclease domain-containing protein 1 | 910 | 6 | 0.89 | 3 | 0.64 | 2.00 | | | | 3 Transcription
ILF3 | interleukin enhancer binding factor 3, 90kDa | 894 | 14 | 0.89 | 16 | 0.96 | 0.88 | | G0/1 arrest | | 3 Transcription
PEBP1 | phosphatidylethanolamine-binding protein 1 preproprotein | 187 | 8 | 0.89 | 3 | 0.35 | 2.67 | | | | 7 No expected chromatin function
HMGN2 | non-histone chromosomal protein HMG-17 | 90 | 19 | 0.89 | 27 | 0.89 | 0.70 | | | | 4 Other Chromatin
PTGES3 | prostaglandin E synthase 3 | 160 | 6 | 0.88 | 4 | 0.77 | 1.50 | | | | 4 Other Chromatin
UBE2M | NEDD8-conjugating enzyme Ubc12 | 183 | 6 | 0.88 | 1 | 0 | 6.00 | | | | 7 No expected chromatin function
HNRNPM | heterogeneous nuclear ribonucleoprotein M | 730 | 50 | 0.88 | 38 | 0.47 | 1.32 | | G0/1 arrest | | 5 RNA processing


YWHAB | tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, beta | 246 | 11 | 0.85 | 9 | 0.75 | 1.22 | | | | 7 No expected chromatin function
RPL38 | 60S ribosomal protein L38 | 70 | 5 | 0.85 | 2 | 0.39 | 2.50 | | S arrest | | 7 No expected chromatin function
DDX5 | probable ATP-dependent RNA helicase DDX5 | 614 | 19 | 0.85 | 17 | 0.74 | 1.12 | | | | 5 RNA processing
TPI1 | triosephosphate isomerase isoform 1 | 249 | 9 | 0.85 | 7 | 0.79 | 1.29 | | | | 7 No expected chromatin function
IMMT | inner membrane protein, mitochondrial | 758 | 9 | 0.84 | 7 | 0.48 | 1.29 | | | | 7 No expected chromatin function
HNRNPL | heterogeneous nuclear ribonucleoprotein L isoform a | 589 | 8 | 0.84 | 4 | 0.37 | 2.00 | | | | 5 RNA processing
PGK1 | phosphoglycerate kinase 1 | 417 | 22 | 0.84 | 11 | 0.09 | 2.00 | | | | 7 No expected chromatin function
LUC7L3 | luc7-like protein 3 | 432 | 8 | 0.83 | 4 | 0.36 | 2.00 | | | | 6 Uncharacterized
PCBP2 | poly(rC) binding protein 2 | 365 | 12 | 0.82 | 13 | 0.97 | 0.92 | | | | 6 Uncharacterized
RPS24 | 40S ribosomal protein S24 isoform c | 133 | 7 | 0.82 | 4 | 0.5 | 1.75 | | S arrest | | 7 No expected chromatin function
HAT1 | histone acetyltransferase type B catalytic subunit | 419 | 7 | 0.8 | 5 | 0.44 | 1.40 | | | | 4 Other Chromatin
EIF5A | eukaryotic translation initiation factor 5A-1 isoform B | 154 | 13 | 0.8 | 6 | 0.17 | 2.17 | | | | 7 No expected chromatin function


1 Prey proteins with SAINT > 0.9 in the nascent DNA sample (CLK) and SAINT < 0.9 in the thymidine-chased sample (THY) in iPOND-MS were considered candidate preys. For prey proteins with SAINT > 0.9 in both CLK and THY, the enrichment ratio of CLK/THY spectral counts was calculated and used to determine the z-score. Spectral counts represent the number of peptides detected for all isoforms of the prey gene. Preys with a z-score > 0.5 were considered candidates. The 153 iPOND candidates are indicated in black text. Grey text indicates preys that fall below our cut-off criteria. Grey boxes indicate genes that were tested in the EdU incorporation assay. * An enrichment score was not calculated for prey proteins with zero spectral counts in the THY sample.

2 Cell cycle annotation categories from genome-scale profiling following esiRNA depletion in HeLa cells (Kittler et al., 2007)

3 gH2AX annotation from genome-scale profiling following pooled siRNA depletion in HeLa cells (Paulsen et al., 2009)

4 Functional annotation of proteins from Supplementary Table 1 of genome-wide assessment of proteins on nascent chromatin (Alabert et al., 2014)
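The candidate criteria in footnote 1 can be sketched in code. The following Python example is a minimal illustration, not the analysis script used for this table. The function name `classify_preys`, the record layout, and the use of the population standard deviation are assumptions, as is treating an `n/a` THY SAINT score as below the cut-off. The sample records are a handful of rows taken from the table, so z-scores computed from this subset will differ from the tabulated values, which were derived from the full dataset.

```python
from statistics import mean, pstdev

SAINT_CUTOFF = 0.9  # SAINT threshold stated in footnote 1
Z_CUTOFF = 0.5      # z-score threshold stated in footnote 1


def classify_preys(preys):
    """Apply the two candidate rules from footnote 1 (hypothetical helper).

    Each prey is a dict with keys: gene, clk_counts, clk_saint,
    thy_counts, thy_saint (thy_saint is None where the table shows n/a).
    """
    results = {}
    both = []  # genes passing the SAINT cut-off in both CLK and THY
    for p in preys:
        # Assumption: an n/a THY SAINT score counts as below the cut-off.
        thy_saint = p["thy_saint"] if p["thy_saint"] is not None else 0.0
        # No ratio is calculated when THY spectral counts are zero ("*" rows).
        ratio = p["clk_counts"] / p["thy_counts"] if p["thy_counts"] else None
        entry = {"ratio": ratio, "z": None, "candidate": False}
        if p["clk_saint"] > SAINT_CUTOFF:
            if thy_saint > SAINT_CUTOFF and ratio is not None:
                both.append(p["gene"])      # rule 2: decided by z-score below
            else:
                entry["candidate"] = True   # rule 1: enriched on nascent DNA
        results[p["gene"]] = entry

    # Rule 2: z-score of the CLK/THY ratio among preys passing in both samples.
    ratios = [results[g]["ratio"] for g in both]
    if len(ratios) > 1:
        mu, sigma = mean(ratios), pstdev(ratios)
        if sigma > 0:
            for g in both:
                z = (results[g]["ratio"] - mu) / sigma
                results[g]["z"] = z
                results[g]["candidate"] = z > Z_CUTOFF
    return results


# A few rows copied from the table, for illustration only.
sample = [
    {"gene": "CDK9", "clk_counts": 5, "clk_saint": 0.99, "thy_counts": 0, "thy_saint": None},
    {"gene": "RFC1", "clk_counts": 32, "clk_saint": 1.0, "thy_counts": 1, "thy_saint": 0.0},
    {"gene": "BAZ1B", "clk_counts": 97, "clk_saint": 1.0, "thy_counts": 13, "thy_saint": 1.0},
    {"gene": "PCNA", "clk_counts": 77, "clk_saint": 1.0, "thy_counts": 12, "thy_saint": 1.0},
    {"gene": "PRKDC", "clk_counts": 58, "clk_saint": 1.0, "thy_counts": 37, "thy_saint": 1.0},
    {"gene": "FBL", "clk_counts": 6, "clk_saint": 0.89, "thy_counts": 6, "thy_saint": 0.89},
]

results = classify_preys(sample)
for gene, r in results.items():
    print(gene, r)
```

In this sketch, preys such as CDK9 and RFC1 are accepted by the first rule alone, whereas BAZ1B, PCNA and PRKDC are ranked against each other by z-score. FBL fails the CLK SAINT cut-off and is therefore reported without candidate status, matching the grey-text rows.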


Table 4. List of targeted siRNAs used to test the effect of gene depletion on EdU incorporation intensities

Gene Symbol | iPOND | BioID | Gene ID | Gene Accession | Sequence
BLM | Y | Y | 641 | NM_000057 | GAGCACAUCUGUAAAUUAA, GAGAAACUCACUUCAAUAA, CAGGAUGGCUGUCAGGUUA, CUAAAUCUGUGGAGGGUUA
TOP3A | Y | Y | 7156 | NM_004618 | GAAACUAUCUGGAUGUGUA, CCACAAAGAUGGUAUCGUA, CCAGAAAUCUUCCACAGAA, GAACAAGUCUGACCAAGCU
RMI1 | Y | Y | 80010 | NM_024945 | GCAGAAAUGUCAAAGAGAU, CUAAAGAAGCGGUUAAAUA, GGUCAACUAUGUACAGAAA, UCAGAUAGCCAUUCCUUAA
RMI2 | Y | Y | 116028 | NM_152308 | GAAAGUAUGUGAUGGUGAU, GAAAGUAUGUGGGAACUGG, GAAGAUUUACACAGGAAUA, UCAGAGAUGUUGAGACGUU
FANCD2 | Y | Y | 2177 | NM_001018115 | GGUCAGAGCUGUAUUAUUC, GAUAAGUUGUCGUCUAUUA, GAUCAACUCUCCUAAAGAU, GAACAAAGGAAGCCGGAAU
RAD50 | Y | Y | 10111 | NM_005732 | GAAACAAACUGCAGAAUGU, GAACAAGGAUCUGGAUAUU, GCUCAGAGAUUGUGAAAUG, UAACCUCACUGUUGGGAUA
RIF1 | Y | Y | 55183 | NM_018151 | UCACGUAGCCCUAAAUUUA, AGACGGUGCUCUAUUGUUA, GUGAGGAGAUCUAAAGGUU, CAGAAGAGUCCAUUGCAUA
RPA2 | Y | Y | 6118 | NM_002946 | GAUCAAUGCACACAUGGUA, CAAAAUAGAUGACAUGACA, GAGUGAAGCAGGGAACUUU, GUGGAACAGUGGAUUCGAA
RPA3 | Y | Y | 6119 | NM_002947 | GGAAGUGGUUGGAAGAGUA, GAAGAUAGCCAUCCUUUUG, CAUGCUAGCUCAAUUCAUC, GAUCUUGGACUUUACAAUG
ADNP | Y | | 23394 | NM_015339 | CAACAAUCUUGGCAGUUUA, CGAAGACCAUGAACGUAUA, GAAGAAGUGUGUCCGUGAU, GGAAAUGUCCGGUCUUUAC
ATAD2 | Y | | 29028 | NM_014109 | GAUCUAAUCUGUAGUAAUG, GCACAGAAUUCUUCAGAUU, GCAUAGAGCCUGUGCUUUA, GAAAGGUGCUGAUUGUCUA
CCT2 | Y | | 10576 | NM_006431 | GAAAUUGCCUCUACCUUUG, AAAGUUAGCUGUAGAAGCA, GGAGGAAGUUUGGCAGAUU, GAAGUUAAAUUCCGUCAAG
DDX46 | Y | | 9879 | NM_014829 | AAACAAGGGUUAUGCUUAU, GUAAAUGUGUUUCGAUUGG, GUAGUGGGUUCUCUGGUAA, GCUCAAUUAUGUGCCGUUA
ENO1 | Y | | 2023 | NM_001428 | GAACGUCACAGAACAAGAG, GAUAAGACUCGCUAUAUGG, AAUGAUAAGACUCGCUAUA, AUAAGGUGGUCAUCGGCAU
HNRNPM | Y | | 4670 | NM_031203 | CCAACAAUCUGGAGCGGAU, GAUGAGAGGGCCUUACCAA, AAGCAUAGUCUGAGCGGAA, AAACAUAGGUCCCGCAGGA


HSPA4 | Y | | 3308 | NM_002154 | UGACAUAUAUGGAGGAAGA, CAAGAAGCUUCCUGAAAUG, UAAGAAUCGUUCAAUUGGA, GGACCUGCCAAUCGAGAAU
IMPDH2 | Y | | 3615 | NM_000884 | GGACAGACCUGAAGAAGAA, GCACGGCGCUUUGGUGUUC, GGAAAGUUGCCCAUUGUAA, CUAAAGAAAUAUCGCGGUA
MTA1 | Y | | 9112 | NM_004689 | AGACAUCACCGACUUGUUA, GGAAGACCACCGACAGAUA, GGGAGGAUUUCUUCUUCUA, GGACCAAACCGCAGUAACA
NONO | Y | | 4841 | NM_007363 | CAGAGAAGCUGGUUAUAAA, GAAGCCAGCUGCUCGGAAA, AAUGAAGGCUUGACUAUUG, CGACAUCACUGAGGAAGAA
PCNA | Y | | 5111 | NM_002592 | GAUCGAGGAUGAAGAAGGA, GCCGAGAUCUCAGCCAUAU, GAGGCCUGCUGGGAUAUUA, GUGGAGAACUUGGAAAUGG
PMS1 | Y | | 5378 | NM_000534 | GAACAGGAGUCACAAAUGU, CAAAGGAAGUUUAUGAAUG, GAACGGCUGCUGAUAAUUU, GCGAAUGGUUUCAAGAUAA
PRMT1 | Y | | 3276 | NM_198318 | UCAAAGAUGUGGCCAUUAA, GCAACUCCAUGUUUCAUAA, GCUACUGCCUCUUCUACGA, GGUCAUCGGGAUCGAGUGU
PTBP1 | Y | | 5725 | NM_175847 | GGCACAAGCUGCACGGGAA, UGACCAAGGACUACGGCAA, UGCAGAUGGCGGACGGCAA, ACGCGGAGAGAACCGAUUA
RNASEH2B | Y | | 79621 | NM_024570 | CUACAUCCCUAAAGAAUUA, GCCAUUUACUUGUUCAAUA, GAAGAGGAUUAUAUUCGUU, UCAUAAAGGCUGAUAAGGA
RPA1 | Y | | 6117 | NM_002945 | GAAGUCAGCUGAAGCAGUU, GCAAUCCAGUGCCCUAUAA, GAAGUGCGACACCGAAUUU, CCCUAGAACUGGUUGACGA
RPS7 | Y | | 6201 | NM_001011 | GAGAUGAACUCGGACCUCA, GGGCAAGGAUGUUAAUUUU, CUAAGGAAAUUGAAGUUGG, GCCGUACUCUGACAGCUGU
SF3A1 | Y | | 10291 | NM_001005409 | GUAAGAAGAUCGGUGAGGA, CAUCUUCGGUGUAGAGGAA, CAGAUCGACUGGCAUGAUU, GAACACAUGCGCAUUGGAC
SMARCA1 | Y | | 6594 | NM_139035 | GAAGAAACCAGUACGUGUA, CAACGAGAAUGGUAUACAA, CGAAGGAUCAGUAUCAAGA, ACUAACCGCUUGCUCCUAA
TRMT1 | Y | | 55621 | NM_017722 | GAAGGAAUGUCCGGUGAAA, CCAAAGGAAUCCAGAUCAA, CGGAGAGGUUUGACGUCAU, GCGUCUCGGCAAAGCGUCA
UBA1 | Y | | 7317 | NM_153280 | GCUCAGACCUGCAAGAGAA, UAAGACAGAGCACAAAUUA, GCUAUGGUUUCUAUGGUUA, CAAGAACUUUGCCAUGAUU
UHRF1 | Y | | 29128 | NM_001048201 | GCCAUACCCUCUUCGACUA, GGAACAGUCUUGUGAUCAG, UGGAGGAGGACGUCAUUUA, GAACGGCGUGGUCCAGAUG


USP7 | Y | | 7874 | NM_003470 | CUAAGGACCCUGCAAAUUA, GUGGUUACGUUAUCAAAUA, UGACGUGUCUCUUGAUAAA, GAAGGUACUUUAAGAGAUC
VCP | Y | | 7415 | NM_007126 | GUAAUCUCUUCGAGGUAUA, AAACAGAUCCUAGCCCUUA, GAGAGCAACCUUCGUAAAG, GCACAGGUGGCAGUGUAUA
WDHD1 | Y | | 11169 | NM_001008396 | GAUCAGACAUGUGCUAUUA, GGUAAUACGUGGACUCCUA, GCUGUGAAUUUAGCCAUUA, GGUCGAAUAUCGUAUACUG
ZNF644 | Y | | 84146 | NM_032186 | CGACAUAUUGUAAACGCUA, CAUGCUAAGAACCGACAUA, CAUAGAACCAGGAAACCUU, ACUCAAAUUAGGGCAUUAA
AHCTF1 | | Y | 25909 | NM_015446 | GCAAUAGGUUGAAUAUAGA, GACAAGCAGCUGCGUAUUA, CUGAUAGACUUAAGCAAUA, CCACUAAAUUGGUCAAAUC
ALMS1 | | Y | 7840 | NM_015120 | CAACUGGCAUUGCUAGGUA, GAACAUUUCAGAUUUCGAA, GAGAGUAACUUAACCGAAG, CAACAGGUCUUGCCAGAUA
ANKRD28 | | Y | 23243 | NM_015199 | GUAAUCGACUGUGAGGAUA, UCAGAAUGCUUACGGCUAU, GUUCGAGCACUAAUAUUUA, CUAGAGGUGCCAAUAUUAA
AP2M1 | | Y | 1173 | NM_001025205 | UAUAUGAGCUGCUGGAUGA, GAAGAGCAGUCACAGAUCA, CGUGAUGGCUGCCUACUUU, GGAGGCUUAUUCAUCUAUA
APITD1 | | Y | 378708 | NM_199294 | GUUUGCAAGACAUGCGAAA, CAAUUAACACUGAAGAUGU, CUCUUAGCCAGGAGGAGUA, GAGCUGACUUUCCGACAGU
BRCA1 | | Y | 672 | NM_007298 | CAGCUACCCUUCCAUCAUA, GGGAUACCAUGCAACAUAA, GAAGGAGCUUUCAUCAUUC, CUAGAAAUCUGUUGCUAUG
BRIP1 | | Y | 83990 | NM_032043 | GAACAGAAGUACACAAUUU, GAUGAUCGCUUUAGGAAUA, GGAAAUAGAUUGGCUAACA, GAACUUGGUGUUACAUUUA
C19orf40 | | Y | 91442 | NM_152266 | CCAAAGAGCCCAGUAAGAA, UAAAGGAAUUGUAGUCGUU, CAGGAAAUGGCUACAGAAA, CGGGUUAGAAAUUCCAAUA
CENPC | | Y | 1060 | NM_001812 | GCGAAUAGAUUAUCAAGGA, GAACAGAAUCCAUCACAAA, CGAAGUUGAUAGAGGAUGA, UCAGGAGGAUUCGUGAUUA
COPA | | Y | 1314 | NM_004371 | ACUCAGAUCUGGUGUAAUA, GCAAUAUGCUACACUAUGU, GAUCAGACCAUCCGAGUGU, GAGUUGAUCCUCAGCAAUU
CSTB | | Y | 1476 | NM_000100 | GAAGUUCCCUGUGUUUAAG, GGGACAAACUACUUCAUCA, CCAACAAAGCCAAGCAUGA, GACCUUAUCUAACUACCAG


CUEDC2 | | Y | 79004 | NM_024040 | CCGAAAUGCUCAAAGAAGA, GAACUUCGAUAUGGAGGCU, CUGAGAUGAUGGAGGCCUA, GAUUCAAAGAUGUGCGGAA
DBT | | Y | 1629 | NM_001918 | GGAGGAACAUUUACUCUUU, GAAGAUAUCCUCAACUAUU, GUUCAGAUCUGCUCUAUAU, AGACAUGACUGUUCCUAUA
DSG1 | | Y | 1828 | NM_001942 | CGAGAUGGGUCGAAUGUUA, CAAAUUUGCUCGAGAUUAG, AUACAGAGCCGAACACUAA, CAUGAUAGGUAGUCUGAGU
DSP | | Y | 1832 | NM_001008844 | CGACAUGAAUCAGUAAGUA, AAACAGAACGCUCCCGAUA, GACCGUCACUGAGCUAGUA, GAAGAGAGGUGCAGGCGUA
ECH1 | | Y | 1891 | NM_001398 | UCAACAAGAUUUCGAGAGA, AAUGUUCACUGCAGGUAUU, GUACCUCCGUGACAUCAUC, UAGAGUGCUUCAACAAGAU
ERCC4 | | Y | 2072 | NM_005236 | UGACAAGGGUACUACAUGA, GUAGGAUACUUGUGGUUGA, ACAAGACAAUCCGCCAUUA, AAGACGAGCUCACGAGUAU
ERCC6L | | Y | 54821 | NM_017669 | GGAUAGAGUUUACCGAAUU, GCUAAUCACUUGUGGGACU, ACUUUAAGACAUUGCGAAU, AGUAGGUGGUGUCGGUUUA
FANCM | | Y | 57697 | NM_020937 | GUACUGCACUUGAGAAUUU, CAAACCAUGUUCACAAUUA, CAACAGUGGUGAAUAGUAA, GAACAAGAUUCCUCAUUAC
HIST2H2AB | | Y | 317772 | NM_175065 | UCACAAGCCUGGCAAGAAC, GCUGUCCUGUUGCCCAAGA, CCGCGGAAAUUCUGGAGCU, CCGUGAGGAAUGACGAAGA
HRNR | | Y | 388697 | NM_001009931 | GCAACAUGGUUCUACAUCA, GAACGACACGGAUCUAGCU, CGACAUGGGUCCGGUUUGG, UAGUAGCACUUCACCCUAU
IKBKAP | | Y | 8518 | NM_003640 | CAAGAAACGUUUAUUGGUA, CAACAGACCCUGAUUAUGA, CCAGAAAUUUGGACUCUUA, GACCAUAGACCACAAGUUA
LRRC49 | | Y | 54839 | NM_017691 | CACAAAUCGUGCUACAUUA, GACCGUAUGUCCUAUCAUC, GCAGCAAUUUAACGCACUA, CUACAGAAGUUAAUAUCGU
MOB1B | | Y | 92597 | NM_173468 | GCAGAUGGAACGAACAUAA, GAGGAAGCACAUCUAAAUA, GCUCUGCACCAAAGUAUAU, AUGAAUGGGUUGCAGUUAA
NBN | | Y | 4683 | NM_002485 | GGAGGAAGAUGUCAAUGUU, GAAGAAACGUGAACUCAAG, GAAAUGGAUUCAGUCAAUA, ACAUGGGAUUUGAGUGAAA
NOP14 | | Y | 8602 | NM_003703 | GCAAGUGACGCUAUCAAAU, UGGAAUACGUUGGCGAUUU, GGAAAGGGUUGAUAAGCGG, GAGAGCGACAGCCCAGAUA
PDXDC1 | | Y | 23042 | NM_015027 | GAUAAGAGCAGUUUGAAAU, GAGGAGAACUCGAGGCUUC, GAAUCUGACCUAACCUUUA, GAAUAGGGGUUGUCAGGUA


PLK1 | | Y | 5347 | NM_005030 | CAACCAAAGUCGAAUAUGA, CAAGAAGAAUGAAUACAGU, GAAGAUGUCCAUGGAAAUA, CAACACGCCUCAUCCUCUA
PML | | Y | 5371 | NM_033247 | GGGGAAAGAUGCAGCUGUA, GCAAAGAGUCGGCCGACUU, GCGCUGGUGCAGAGGAUGA, CCGAUGGCUUCGACGAGUU
RAD54L2 | | Y | 23132 | NM_015106 | GAGAGGGAGCGGCUUAUUA, GAUGUACAGGCAAGAAUUA, UAUGUCGGGUAUACCGUUA, UCCAAGAUCCAGCGAGAUU
RGPD5 | | Y | 84220 | NM_032260 | GCAAAGCUGUAUUAUGAAG, GUUCCAAGACCAAAGAUUA, UUACAGGCGUUCAGUGGAA, AGGCCGAUGUGGAGCGGUA
RLTPR | | Y | 146206 | NM_001013838 | GGACUGAUCUAGAGACCAG, GCAAAGAUGGCGAGAUCAA, GUGAAUGAAUUGUGUCAGU, GAUCUGGCCUUGGAACCGA
SIRT1 | | Y | 23411 | NM_012238 | GUACAAACUUCUAGGAAUG, GUAGGCGGCUUGAUGGUAA, GCGAUUGGGUACCGAGAUA, GGAUAGGUCCAUAUACUUU
SLX4 | | Y | 84464 | NM_032444 | AAACGUGAAUGAAGCAGAA, GAACGAAGUCGCACAGAAG, GCAGAGAAUUGCGAAAGCA, GCGAACAGGUGCCUAUUGC
STRA13 | | Y | 201254 | NM_144998 | CACCUGCACUUCAAGGAUG, GACCAAAGAAGCAGCAGUC, CGCAGCUGCUCCUGGACUU, ACGUGGACCAGCUGGAGAA
TDRD3 | | Y | 81550 | NM_030794 | AAAUAGAGGUUCUGAAAGA, GAAGAUCUGGGCCAAUUAA, GAAGAAGGCACCUACGAUC, GAAUCCAGUUCGAAGUAAU
TOP3B | | Y | 8940 | NM_003935 | CCAGUGCGCUUCAAGAUGA, CCACUACCCUGAGAACUUU, GUGCACGGCUACUAUAAGA, CGAGUACACUGGGACCUUU
TOPBP1 | | Y | 11073 | NM_007027 | GGAAUCACCUCCUCAUGUA, ACACUAAUCGGGAGUAUAA, GAGCCGAACAUCCAGUUUA, AACCAUAUAUCCAUGCUAA
TP53 | | Y | 7157 | NM_000546 | GAGGUUGGCUCUGACUGUA, GCACAGAGGAAGAGAAUCU, GAAGAAACCACUGGAUGGA, GCUUCGAGAUGUUCCGAGA
YEATS2 | | Y | 55689 | NM_018023 | CAAGAAGGAUGAUGGAUAA, GAAUUCAGCUCGAGAUGCU, CAGAACAAGCGGAUAGAUA, GAGCUACGAACAAUGCUAA
ZNF106 | | Y | 64397 | NM_022473 | GGUCAUACCUCCAAAGUUA, GAUCUAACCCGGCACAUUA, CGGAGCCUCUCUUCAAAUA, GAACGAAGGAACAGUAGAU


ZNF451 | | Y | 26036 | NM_015555 | AAUCAAACCUGUACAAGUU, GAGCCUAGCACCUCUUAUA, GCCACAAGUUUCAUAGAUA, GUACCGACAUUGCCAAGAU
ZZZ3 | | Y | 26009 | NM_015534 | GAAGAUAGCAGAUGAAUUG, CGACAGCACCCUCUUAAUA, GAUAAGAGGACGCUUGUGU, GAUCGUAUGGGACCAAUAU
NT-Control #1 | | | | | UAGCGACUAAACACAUCAA
NT-Control #2 | | | | | UAAGGCUAUGAAGAGAUAC
NT-Control #3 | | | | | AUGUAUUGGCCUGUAUUAG
NT-Control #4 | | | | | AUGAACGUGAAUUGCUCAA


Table 5. List of BTRR candidate interactors determined by BioID-MS1

Bait | Prey Gene | Prey Gene ID | Protein Description | Amino Acids | Avg Spectral Counts | SAINT Score | Cell Cycle^2 Annotation | gH2AX^3 Annotation
BLM | CENPC1 | 1060 | centromere protein C | 943 | 14 | 1 | |
BLM | BRIP1 | 83990 | BRCA1 interacting protein C-terminal helicase 1 | 1249 | 16 | 1 | |
BLM | YEATS2 | 55689 | YEATS domain containing 2 | 1422 | 29 | 1 | |
BLM | PML | 5371 | Promyelocytic Leukemia Protein | 882 | 17 | 1 | |
BLM | TOPBP1 | 11073 | topoisomerase (DNA) II binding protein 1 | 1522 | 20 | 1 | S arrest |
BLM | RMI1 | 80010 | RecQ mediated genome instability 1 | 625 | 17 | 1 | |
BLM | NOP14 | 8602 | NOP14 nucleolar protein | 857 | 15 | 1 | S arrest |
BLM | TP53 | 7157 | tumor protein p53 | 393 | 33 | 1 | |
BLM | TOP3A | 7156 | Topoisomerase (DNA) III Alpha | 1001 | 28 | 1 | |
BLM | ZZZ3 | 26009 | zinc finger, ZZ-type containing 3 | 903 | 7 | 1 | |
BLM | RMI2 | 116028 | RecQ mediated genome instability 2 | 147 | 6 | 1 | |
BLM | RAD50 | 10111 | RAD50 homolog (S. cerevisiae) | 1312 | 23 | 1 | |
BLM | AP2M1 | 1173 | adaptor-related protein complex 2, mu 1 subunit | 435 | 6 | 0.99 | |
BLM | FANCM | 57697 | Fanconi anemia, complementation group M | 2048 | 5 | 0.99 | |
BLM | BRCA1 | 672 | breast cancer 1, early onset | 1863 | 8 | 0.98 | G0/1 arrest |
BLM | PLK1 | 5347 | polo-like kinase 1 | 603 | 8 | 0.98 | Cell division defect | yes
BLM | NBN | 4683 | nibrin | 754 | 20 | 0.97 | |
BLM | RAD54L2 | 23132 | RAD54-like 2 (S. cerevisiae) | 1467 | 4 | 0.96 | |
BLM | SLX4 | 84464 | SLX4 structure-specific endonuclease subunit | 1834 | 8 | 0.9 | G0/1 arrest |
BLM | FANCD2 | 2177 | Fanconi anemia, complementation group D2 | 1471 | 5 | 0.84 | |
BLM | ZFP106 | 64397 | Zinc finger protein 106 | 1883 | 5 | 0.83 | |
BLM | ZNF451 | 26036 | Zinc finger protein 451 | 1061 | 7 | 0.83 | |
BLM | ERCC4 | 2072 | excision repair cross-complementation group 4 | 916 | 5 | 0.82 | | yes
BLM | RRS1 | 23212 | RRS1 ribosome biogenesis regulator homolog (S. cerevisiae) | 365 | 3 | 0.81 | |
BLM | RPA1 | 6117 | replication protein A1, 70kDa | 616 | 16 | 0.79 | S arrest | yes
BLM | PHF8 | 23133 | PHD finger protein 8 | 1060 | 3 | 0.78 | |
BLM | SIRT1 | 23411 | sirtuin 1 | 747 | 4 | 0.69 | G0/1 arrest |
BLM | RPA3 | 6119 | replication protein A 14 kDa subunit | 121 | 3 | 0.68 | | yes
BLM | PDCD11 | 22984 | programmed cell death 11 | 1871 | 7 | 0.67 | S arrest |
BLM | SUPT5H | 6829 | suppressor of Ty 5 homolog (S. cerevisiae) | 1087 | 6 | 0.66 | G0/1 arrest |
BLM | NOP2 | 4839 | NOP2 nucleolar protein | 812 | 10 | 0.64 | G0/1 arrest |
BLM | NOC4L | 79050 | nucleolar complex associated 4 homolog (S. cerevisiae) | 516 | 7 | 0.6 | |
BLM | ZNF326 | 284695 | zinc finger protein 326 | 582 | 10 | 0.59 | |
BLM | MLH1 | 4292 | mutL homolog 1 | 756 | 10 | 0.54 | G0/1 arrest |


RMI1 | PDXDC1 | 23042 | pyridoxal-dependent decarboxylase domain containing 1 | 788 | 44 | 1 | |
RMI1 | RIF1 | 55183 | Rap1-interacting factor 1 homolog | 2472 | 60 | 1 | |
RMI1 | RPA1 | 6117 | replication protein A1, 70kDa | 616 | 19 | 1 | S arrest | yes
RMI1 | FANCM | 57697 | Fanconi anemia, complementation group M | 2048 | 17 | 1 | |
RMI1 | TOP3A | 7156 | Topoisomerase (DNA) III Alpha | 1001 | 69 | 1 | |
RMI1 | AHCTF1 | 25909 | AT hook containing transcription factor 1 | 2266 | 44 | 1 | |
RMI1 | RGPD3 | 653489 | RANBP2-like and GRIP domain containing 3 | 1758 | 29 | 1 | |
RMI1 | RMI2 | 116028 | RecQ mediated genome instability 2 | 147 | 32 | 1 | |
RMI1 | ANKRD28 | 23243 | ankyrin repeat domain 28 | 1053 | 9 | 1 | |
RMI1 | TP53 | 7157 | tumor protein p53 | 393 | 20 | 1 | |
RMI1 | MOB1B | 92597 | MOB kinase activator 1B | 216 | 6 | 1 | |
RMI1 | TOP3B | 8940 | topoisomerase (DNA) III beta | 862 | 6 | 1 | |
RMI1 | ZNF451 | 26036 | Zinc finger protein 451 | 1061 | 15 | 1 | |
RMI1 | BLM | 641 | Bloom syndrome, RecQ helicase-like | 1417 | 103 | 1 | |
RMI1 | DBT | 1629 | dihydrolipoamide branched chain transacylase E2 | 482 | 15 | 0.98 | |
RMI1 | ECH1 | 1891 | enoyl CoA hydratase 1, peroxisomal | 328 | 5 | 0.97 | G0/1 arrest |
RMI1 | IKBKAP | 8518 | inhibitor of kappa light polypeptide gene enhancer in B-cells, kinase complex-associated protein | 1332 | 7 | 0.97 | |
RMI1 | STRA13 | 201254 | stimulated by retinoic acid gene 13 protein homolog / MHF2 | 81 | 3 | 0.96 | |
RMI1 | TDRD3 | 81550 | tudor domain-containing protein 3 | 651 | 3 | 0.96 | |
RMI1 | RPA2 | 6118 | replication protein A2, 32kDa | 270 | 4 | 0.96 | S arrest | yes
RMI1 | C19orf40 | 91442 | Fanconi anemia-associated protein of 24 kDa / FAAP24 | 215 | 2 | 0.93 | S arrest |
RMI1 | APITD1 | 378708 | apoptosis-inducing TAF9-like domain-containing protein 1 / MHF1 | 138 | 2 | 0.93 | |
RMI1 | PML | 5371 | Promyelocytic Leukemia Protein | 882 | 6 | 0.9 | |
RMI1 | RPL34 | 6164 | ribosomal protein L34 | 117 | 4 | 0.83 | S arrest |
RMI1 | RLTPR | 146206 | RGD, leucine-rich repeat, tropomodulin and proline-rich containing protein | 1435 | 3 | 0.83 | |
RMI1 | CCP110 | 9738 | centriolar coiled coil protein 110kDa | 1012 | 2 | 0.83 | |
RMI1 | RPA3 | 6119 | replication protein A 14 kDa subunit | 121 | 3 | 0.8 | | yes
RMI1 | RGPD5 | 84220 | RANBP2-like and GRIP domain-containing protein 5/6 | 1765 | 23 | 0.79 | |
RMI1 | CUEDC2 | 79004 | CUE domain containing 2 | 287 | 6 | 0.74 | |
RMI1 | CSTB | 1476 | cystatin B (stefin B) | 98 | 3 | 0.73 | |
RMI1 | PRPF4 | 9128 | pre-mRNA processing factor 4 | 522 | 11 | 0.72 | |
RMI1 | ALMS1 | 7840 | Alstrom syndrome 1 | 4167 | 4 | 0.69 | |
RMI1 | LRRC49 | 54839 | leucine rich repeat containing 49 | 686 | 5 | 0.67 | |
RMI1 | ERCC6L | 54821 | excision repair cross-complementation group 6-like | 1250 | 5 | 0.61 | |

144

Prey Avg Prey Gene Amino Spectral Saint Cell Cycle2 Bait Gene ID Protein Description Acids Counts Score Annotation gH2AX3Annotation RMI1 GOLGA3 2802 golgin A3 1498 2 0.51 RMI1 PRSS1 5644 protease, serine, 1 (trypsin 1) 247 2 0.51

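| Bait | Prey Gene | Prey Gene ID | Protein Description | Amino Acids | Avg Spectral Counts | SAINT Score | Cell Cycle Annotation² | gH2AX Annotation³ |
|---|---|---|---|---|---|---|---|---|
| RMI2 | RMI1 | 80010 | RecQ mediated genome instability 1 | 625 | 8 | 1 | | |
| RMI2 | TOP3A | 7156 | Topoisomerase (DNA) III Alpha | 1001 | 7 | 1 | | |
| RMI2 | ERC1 | 23085 | ELKS/RAB6-interacting/CAST family member 1 | 1116 | 11 | 0.87 | G0/1 arrest | |
| RMI2 | BLM | 641 | Bloom syndrome, RecQ helicase-like | 1417 | 8 | 0.87 | | |
| RMI2 | DOPEY2 | 9980 | dopey family member 2 | 2298 | 2 | 0.83 | | |
| RMI2 | ZSCAN18 | 65982 | zinc finger and SCAN domain containing 18 | 510 | 1 | 0.68 | | |
| RMI2 | TNKS1BP1 | 85456 | tankyrase 1 binding protein 1, 182kDa | 1729 | 1 | 0.64 | | |
| RMI2 | TP53 | 7157 | tumor protein p53 | 393 | 6 | 0.51 | | |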

| Bait | Prey Gene | Prey Gene ID | Protein Description | Amino Acids | Avg Spectral Counts | SAINT Score | Cell Cycle Annotation² | gH2AX Annotation³ |
|---|---|---|---|---|---|---|---|---|
| RMI1_K166A | ECH1 | 1891 | enoyl CoA hydratase 1, peroxisomal | 328 | 10 | 1 | G0/1 arrest | |
| RMI1_K166A | FANCM | 57697 | Fanconi anemia, complementation group M | 2048 | 7 | 1 | | |
| RMI1_K166A | KIAA1524 | 57650 | Cancerous inhibitor of PP2A | 905 | 25 | 1 | | |
| RMI1_K166A | STIP1 | 10963 | stress-induced-phosphoprotein 1 | 543 | 35 | 1 | | |
| RMI1_K166A | RMI2 | 116028 | RecQ mediated genome instability 2 | 147 | 48 | 1 | | |
| RMI1_K166A | TP53 | 7157 | tumor protein p53 | 393 | 26 | 1 | | |
| RMI1_K166A | DBT | 1629 | dihydrolipoamide branched chain transacylase E2 | 482 | 25 | 1 | | |
| RMI1_K166A | NUDCD3 | 23386 | NudC domain containing 3 | 361 | 9 | 0.98 | G2 arrest | |
| RMI1_K166A | BAG2 | 9532 | Bcl-2-associated athanogene 2 | 211 | 15 | 0.98 | | yes |
| RMI1_K166A | DNAJC7 | 7266 | DnaJ (Hsp40) homolog, subfamily C, member 7 | 494 | 9 | 0.93 | | |
| RMI1_K166A | CSTB | 1476 | cystatin B (stefin B) | 98 | 5 | 0.92 | | |
| RMI1_K166A | ST13 | 6767 | suppression of tumorigenicity 13 (colon carcinoma) (Hsp70 interacting protein) | 369 | 6 | 0.91 | | |
| RMI1_K166A | CAD | 790 | carbamoyl-phosphate synthetase 2, aspartate transcarbamylase, and dihydroorotase | 2225 | 17 | 0.9 | | |
| RMI1_K166A | TXNDC12 | 51060 | thioredoxin domain containing 12 (endoplasmic reticulum) | 172 | 3 | 0.86 | | |
| RMI1_K166A | NUDCD2 | 134492 | nudC domain-containing protein 2 | 157 | 5 | 0.86 | | |
| RMI1_K166A | TOP3A | 7156 | Topoisomerase (DNA) III Alpha | 1001 | 2 | 0.73 | | |
| RMI1_K166A | AIP | 9049 | aryl hydrocarbon receptor interacting protein | 330 | 2 | 0.73 | | |
| RMI1_K166A | CDC37 | 11140 | cell division cycle 37 | 378 | 14 | 0.72 | | |
| RMI1_K166A | PARK7 | 11315 | parkinson protein 7 | 189 | 9 | 0.72 | | |
| RMI1_K166A | CACYBP | 27101 | calcyclin binding protein | 228 | 18 | 0.63 | | |
| RMI1_K166A | LRRC49 | 54839 | leucine rich repeat containing 49 | 686 | 7 | 0.61 | | |
| RMI1_K166A | PRSS1 | 5644 | protease, serine, 1 (trypsin 1) | 247 | 3 | 0.58 | | |
| RMI1_K166A | RAD50 | 10111 | RAD50 homolog (S. cerevisiae) | 1312 | 7 | 0.57 | | |
| RMI1_K166A | MTR | 4548 | 5-methyltetrahydrofolate-homocysteine methyltransferase | 1265 | 5 | 0.53 | | |

| Bait | Prey Gene | Prey Gene ID | Protein Description | Amino Acids | Avg Spectral Counts | SAINT Score | Cell Cycle Annotation² | gH2AX Annotation³ |
|---|---|---|---|---|---|---|---|---|
| RMI1_LLTD | RMI2 | 116028 | RecQ mediated genome instability 2 | 147 | 40 | 1 | | |
| RMI1_LLTD | STIP1 | 10963 | stress-induced-phosphoprotein 1 | 543 | 27 | 1 | | |
| RMI1_LLTD | KIAA1524 | 57650 | Cancerous inhibitor of PP2A | 905 | 31 | 1 | | |
| RMI1_LLTD | HSP90AB1 | 3326 | heat shock protein 90kDa alpha (cytosolic), class B member 1 | 724 | 83 | 1 | | |
| RMI1_LLTD | NUDC | 10726 | nudC nuclear distribution protein | 331 | 85 | 1 | | |
| RMI1_LLTD | FANCM | 57697 | Fanconi anemia, complementation group M | 2048 | 8 | 1 | | |
| RMI1_LLTD | TTI1 | 9675 | TELO2 interacting protein 1 | 1089 | 6 | 1 | | |
| RMI1_LLTD | BAG2 | 9532 | Bcl-2-associated athanogene 2 | 211 | 17 | 1 | | yes |
| RMI1_LLTD | TOP3A | 7156 | Topoisomerase (DNA) III Alpha | 1001 | 11 | 1 | | |
| RMI1_LLTD | CDC37 | 11140 | cell division cycle 37 | 378 | 14 | 0.99 | | |
| RMI1_LLTD | TP53 | 7157 | tumor protein p53 | 393 | 14 | 0.99 | | |
| RMI1_LLTD | HSPBP1 | 23640 | HSPA (heat shock 70kDa) binding protein, cytoplasmic cochaperone 1 | 362 | 5 | 0.99 | | |
| RMI1_LLTD | NUDCD3 | 23386 | NudC domain containing 3 | 361 | 9 | 0.98 | G2 arrest | |
| RMI1_LLTD | ECH1 | 1891 | enoyl CoA hydratase 1, peroxisomal | 328 | 8 | 0.98 | G0/1 arrest | |
| RMI1_LLTD | NUDCD2 | 134492 | nudC domain-containing protein 2 | 157 | 5 | 0.97 | | |
| RMI1_LLTD | AIP | 9049 | aryl hydrocarbon receptor interacting protein | 330 | 4 | 0.97 | | |
| RMI1_LLTD | DNAJC7 | 7266 | DnaJ (Hsp40) homolog, subfamily C, member 7 | 494 | 11 | 0.96 | | |
| RMI1_LLTD | ST13 | 6767 | suppression of tumorigenicity 13 (colon carcinoma) (Hsp70 interacting protein) | 369 | 7 | 0.96 | | |
| RMI1_LLTD | CACYBP | 27101 | calcyclin binding protein | 228 | 29 | 0.95 | | |
| RMI1_LLTD | DBT | 1629 | dihydrolipoamide branched chain transacylase E2 | 482 | 17 | 0.93 | | |
| RMI1_LLTD | CAD | 790 | carbamoyl-phosphate synthetase 2, aspartate transcarbamylase, and dihydroorotase | 2225 | 18 | 0.83 | | |
| RMI1_LLTD | HSPA1L | 3305 | heat shock 70kDa protein 1-like | 641 | 124 | 0.82 | | |
| RMI1_LLTD | HSP90AA1 | 3320 | heat shock protein 90kDa alpha (cytosolic), class A member 1 | 732 | 48 | 0.8 | | |
| RMI1_LLTD | PARK7 | 11315 | parkinson protein 7 | 189 | 8 | 0.66 | | |
| RMI1_LLTD | URI1 | 8725 | URI1, prefoldin-like chaperone | 535 | 3 | 0.64 | | |
| RMI1_LLTD | FKBP4 | 2288 | FK506 binding protein 4, 59kDa | 459 | 6 | 0.55 | | |
| RMI1_LLTD | RPS27A | 6233 | ribosomal protein S27a | 156 | 10 | 0.52 | S arrest | |

| Bait | Prey Gene | Prey Gene ID | Protein Description | Amino Acids | Avg Spectral Counts | SAINT Score | Cell Cycle Annotation² | gH2AX Annotation³ |
|---|---|---|---|---|---|---|---|---|
| RMI1_S455N | TP53 | 7157 | tumor protein p53 | 393 | 14 | 1 | | |
| RMI1_S455N | AHCTF1 | 25909 | AT hook containing transcription factor 1 | 2266 | 45 | 1 | | |
| RMI1_S455N | SKIV2L2 | 23517 | superkiller viralicidic activity 2-like 2 (S. cerevisiae) | 1042 | 24 | 1 | | |
| RMI1_S455N | FANCM | 57697 | Fanconi anemia, complementation group M | 2048 | 18 | 1 | | |
| RMI1_S455N | PDXDC1 | 23042 | pyridoxal-dependent decarboxylase domain containing 1 | 788 | 41 | 1 | | |
| RMI1_S455N | ZNF451 | 26036 | Zinc finger protein 451 | 1061 | 16 | 1 | | |
| RMI1_S455N | TOP3A | 7156 | Topoisomerase (DNA) III Alpha | 1001 | 69 | 1 | | |
| RMI1_S455N | RPA1 | 6117 | replication protein A1, 70kDa | 616 | 18 | 1 | S arrest | yes |
| RMI1_S455N | TOP3B | 8940 | topoisomerase (DNA) III beta | 862 | 6 | 1 | | |
| RMI1_S455N | BLM | 641 | Bloom syndrome, RecQ helicase-like | 1417 | 89 | 1 | | |
| RMI1_S455N | DBT | 1629 | dihydrolipoamide branched chain transacylase E2 | 482 | 17 | 1 | | |
| RMI1_S455N | RMI2 | 116028 | RecQ mediated genome instability 2 | 147 | 30 | 1 | | |
| RMI1_S455N | RIF1 | 55183 | Rap1-interacting factor 1 homolog | 2472 | 76 | 1 | | |
| RMI1_S455N | TUBA1A | 7846 | tubulin, alpha 1a | 451 | 20 | 1 | Cell division defect | |
| RMI1_S455N | RPA3 | 6119 | replication protein A 14 kDa subunit | 121 | 4 | 0.99 | | yes |
| RMI1_S455N | ECH1 | 1891 | enoyl CoA hydratase 1, peroxisomal | 328 | 4 | 0.99 | G0/1 arrest | |
| RMI1_S455N | PML | 5371 | Promyelocytic Leukemia Protein | 882 | 7 | 0.98 | | |
| RMI1_S455N | CEP97 | 79598 | centrosomal protein 97kDa | 865 | 3 | 0.97 | | |
| RMI1_S455N | MOB1B | 92597 | MOB kinase activator 1B | 216 | 3 | 0.97 | | |
| RMI1_S455N | RPA2 | 6118 | replication protein A2, 32kDa | 270 | 4 | 0.97 | S arrest | yes |
| RMI1_S455N | CCP110 | 9738 | centriolar coiled coil protein 110kDa | 1012 | 3 | 0.97 | | |
| RMI1_S455N | C19orf40 | 91442 | Fanconi anemia-associated protein of 24 kDa (FAAP24) | 215 | 3 | 0.97 | S arrest | |
| RMI1_S455N | CSTB | 1476 | cystatin B (stefin B) | 98 | 3 | 0.96 | | |
| RMI1_S455N | STRA13 | 201254 | stimulated by retinoic acid gene 13 protein homolog (MHF2) | 81 | 2 | 0.94 | | |
| RMI1_S455N | IKBKAP | 8518 | inhibitor of kappa light polypeptide gene enhancer in B-cells, kinase complex-associated protein | 1332 | 5 | 0.93 | | |
| RMI1_S455N | LRRC49 | 54839 | leucine rich repeat containing 49 | 686 | 9 | 0.93 | | |
| RMI1_S455N | CUEDC2 | 79004 | CUE domain containing 2 | 287 | 5 | 0.87 | | |
| RMI1_S455N | APITD1 | 378708 | apoptosis-inducing TAF9-like domain-containing protein 1 (MHF1) | 138 | 2 | 0.86 | | |
| RMI1_S455N | RLTPR | 146206 | RGD, leucine-rich repeat, tropomodulin and proline-rich containing protein | 1435 | 2 | 0.84 | | |
| RMI1_S455N | TPGS1 | 91978 | tubulin polyglutamylase complex subunit 1 | 290 | 8 | 0.84 | | |
| RMI1_S455N | ERCC6L | 54821 | excision repair cross-complementation group 6-like | 1250 | 6 | 0.83 | | |
| RMI1_S455N | RGPD5 | 84220 | RANBP2-like and GRIP domain-containing protein 5/6 | 1765 | 26 | 0.83 | | |
| RMI1_S455N | ANKRD28 | 23243 | ankyrin repeat domain 28 | 1053 | 6 | 0.82 | | |
| RMI1_S455N | NUP88 | 4927 | nucleoporin 88kDa | 741 | 5 | 0.73 | | |
| RMI1_S455N | SMN1 | 6606 | survival of motor neuron 1, telomeric | 294 | 1 | 0.73 | | |
| RMI1_S455N | APOB | 338 | apolipoprotein B | 4563 | 1 | 0.73 | | |
| RMI1_S455N | RAD50 | 10111 | RAD50 homolog (S. cerevisiae) | 1312 | 8 | 0.7 | | |
| RMI1_S455N | ALB | 280717 | albumin | 609 | 1 | 0.69 | | |
| RMI1_S455N | GOLGA3 | 2802 | golgin A3 | 1498 | 2 | 0.61 | | |
| RMI1_S455N | ALMS1 | 7840 | Alstrom syndrome 1 | 4167 | 2 | 0.6 | | |
| RMI1_S455N | TMEM194A | 23306 | transmembrane protein 194A | 444 | 2 | 0.58 | | |
| RMI1_S455N | ZNF326 | 284695 | zinc finger protein 326 | 582 | 8 | 0.57 | | |
| RMI1_S455N | PRSS1 | 5644 | protease, serine, 1 (trypsin 1) | 247 | 2 | 0.55 | | |
| RMI1_S455N | TRA2B | 6434 | transformer 2 beta homolog (Drosophila) | 288 | 1 | 0.53 | | |

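| Bait | Prey Gene | Prey Gene ID | Protein Description | Amino Acids | Avg Spectral Counts | SAINT Score | Cell Cycle Annotation² | gH2AX Annotation³ |
|---|---|---|---|---|---|---|---|---|
| BLM_Aph | TP53 | 7157 | tumor protein p53 | 393 | 31 | 1 | | |
| BLM_Aph | RAD50 | 10111 | RAD50 homolog (S. cerevisiae) | 1312 | 18 | 1 | | |
| BLM_Aph | TOP3A | 7156 | Topoisomerase (DNA) III Alpha | 1001 | 25 | 1 | | |
| BLM_Aph | PML | 5371 | Promyelocytic Leukemia Protein | 882 | 21 | 1 | | |
| BLM_Aph | NOP14 | 8602 | NOP14 nucleolar protein | 857 | 7 | 1 | S arrest | |
| BLM_Aph | TOPBP1 | 11073 | topoisomerase (DNA) II binding protein 1 | 1522 | 18 | 1 | S arrest | |
| BLM_Aph | YEATS2 | 55689 | YEATS domain containing 2 | 1422 | 22 | 0.99 | | |
| BLM_Aph | RMI2 | 116028 | RecQ mediated genome instability 2 | 147 | 7 | 0.97 | | |
| BLM_Aph | ZZZ3 | 26009 | zinc finger, ZZ-type containing 3 | 903 | 4 | 0.97 | | |
| BLM_Aph | RMI1 | 80010 | RecQ mediated genome instability 1 | 625 | 15 | 0.85 | | |
| BLM_Aph | SLX4IP | 128710 | SLX4 interacting protein | 408 | 2 | 0.69 | | |
| BLM_Aph | NKRF | 55922 | NFKB repressing factor | 690 | 7 | 0.67 | | |
| BLM_Aph | NBN | 4683 | nibrin | 754 | 12 | 0.61 | | |
| BLM_Aph | BRCA1 | 672 | breast cancer 1, early onset | 1863 | 8 | 0.6 | | |
| BLM_Aph | SLX4 | 84464 | SLX4 structure-specific endonuclease subunit | 1834 | 6 | 0.59 | | |
| BLM_Aph | ZNF451 | 26036 | Zinc finger protein 451 | 1061 | 4 | 0.56 | | |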

| Bait | Prey Gene | Prey Gene ID | Protein Description | Amino Acids | Avg Spectral Counts | SAINT Score | Cell Cycle Annotation² | gH2AX Annotation³ |
|---|---|---|---|---|---|---|---|---|
| RMI1_Aph | RPA3 | 6119 | replication protein A 14 kDa subunit | 121 | 8 | 1 | | yes |
| RMI1_Aph | RIF1 | 55183 | Rap1-interacting factor 1 homolog | 2472 | 69 | 1 | | |
| RMI1_Aph | AHCTF1 | 25909 | AT hook containing transcription factor 1 | 2266 | 50 | 1 | | |
| RMI1_Aph | TOP3A | 7156 | Topoisomerase (DNA) III Alpha | 1001 | 87 | 1 | | |
| RMI1_Aph | RPA1 | 6117 | replication protein A1, 70kDa | 616 | 26 | 1 | S arrest | yes |
| RMI1_Aph | DBT | 1629 | dihydrolipoamide branched chain transacylase E2 | 482 | 29 | 1 | | |
| RMI1_Aph | RMI2 | 116028 | RecQ mediated genome instability 2 | 147 | 44 | 1 | | |
| RMI1_Aph | FANCM | 57697 | Fanconi anemia, complementation group M | 2048 | 19 | 1 | | |
| RMI1_Aph | PDXDC1 | 23042 | pyridoxal-dependent decarboxylase domain containing 1 | 788 | 47 | 1 | | |
| RMI1_Aph | ZNF451 | 26036 | Zinc finger protein 451 | 1061 | 18 | 1 | | |
| RMI1_Aph | BLM | 641 | Bloom syndrome, RecQ helicase-like | 1417 | 147 | 1 | | |
| RMI1_Aph | TP53 | 7157 | tumor protein p53 | 393 | 25 | 1 | | |
| RMI1_Aph | IKBKAP | 8518 | inhibitor of kappa light polypeptide gene enhancer in B-cells, kinase complex-associated protein | 1332 | 8 | 0.99 | | |
| RMI1_Aph | STRA13 | 201254 | stimulated by retinoic acid gene 13 protein homolog (MHF2) | 81 | 3 | 0.93 | | |
| RMI1_Aph | ECH1 | 1891 | enoyl CoA hydratase 1, peroxisomal | 328 | 5 | 0.9 | G0/1 arrest | |
| RMI1_Aph | CUEDC2 | 79004 | CUE domain containing 2 | 287 | 8 | 0.85 | | |
| RMI1_Aph | C19orf40 | 91442 | Fanconi anemia-associated protein of 24 kDa (FAAP24) | 215 | 3 | 0.84 | S arrest | |
| RMI1_Aph | PML | 5371 | Promyelocytic Leukemia Protein | 882 | 8 | 0.82 | | |
| RMI1_Aph | RPA2 | 6118 | replication protein A2, 32kDa | 270 | 5 | 0.76 | S arrest | yes |
| RMI1_Aph | CSTB | 1476 | cystatin B (stefin B) | 98 | 4 | 0.74 | | |
| RMI1_Aph | RLTPR | 146206 | RGD, leucine-rich repeat, tropomodulin and proline-rich containing protein | 1435 | 2 | 0.69 | | |
| RMI1_Aph | ETAA1 | 54465 | Ewing tumor-associated antigen 1 | 926 | 2 | 0.69 | | |
| RMI1_Aph | ANKRD28 | 23243 | ankyrin repeat domain 28 | 1053 | 8 | 0.56 | | |
| RMI1_Aph | ERCC6L | 54821 | excision repair cross-complementation group 6-like | 1250 | 10 | 0.52 | | |

| Bait | Prey Gene | Prey Gene ID | Protein Description | Amino Acids | Avg Spectral Counts | SAINT Score | Cell Cycle Annotation² | gH2AX Annotation³ |
|---|---|---|---|---|---|---|---|---|
| RMI1_S455N_Aph | FANCM | 57697 | Fanconi anemia, complementation group M | 2048 | 26 | 1 | | |
| RMI1_S455N_Aph | PDXDC1 | 23042 | pyridoxal-dependent decarboxylase domain containing 1 | 788 | 39 | 1 | | |
| RMI1_S455N_Aph | AHCTF1 | 25909 | AT hook containing transcription factor 1 | 2266 | 58 | 1 | | |
| RMI1_S455N_Aph | RMI2 | 116028 | RecQ mediated genome instability 2 | 147 | 37 | 1 | | |
| RMI1_S455N_Aph | RIF1 | 55183 | Rap1-interacting factor 1 homolog | 2472 | 85 | 1 | | |
| RMI1_S455N_Aph | BLM | 641 | Bloom syndrome, RecQ helicase-like | 1417 | 141 | 1 | | |
| RMI1_S455N_Aph | TOP3A | 7156 | Topoisomerase (DNA) III Alpha | 1001 | 93 | 1 | | |
| RMI1_S455N_Aph | ZNF451 | 26036 | Zinc finger protein 451 | 1061 | 17 | 1 | | |
| RMI1_S455N_Aph | ECH1 | 1891 | enoyl CoA hydratase 1, peroxisomal | 328 | 9 | 1 | G0/1 arrest | |
| RMI1_S455N_Aph | DBT | 1629 | dihydrolipoamide branched chain transacylase E2 | 482 | 25 | 1 | | |
| RMI1_S455N_Aph | PML | 5371 | Promyelocytic Leukemia Protein | 882 | 10 | 0.99 | | |
| RMI1_S455N_Aph | RPA1 | 6117 | replication protein A1, 70kDa | 616 | 19 | 0.99 | S arrest | yes |
| RMI1_S455N_Aph | TP53 | 7157 | tumor protein p53 | 393 | 17 | 0.99 | | |
| RMI1_S455N_Aph | IKBKAP | 8518 | inhibitor of kappa light polypeptide gene enhancer in B-cells, kinase complex-associated protein | 1332 | 6 | 0.85 | | |
| RMI1_S455N_Aph | CUEDC2 | 79004 | CUE domain containing 2 | 287 | 7 | 0.84 | | |
| RMI1_S455N_Aph | RLTPR | 146206 | RGD, leucine-rich repeat, tropomodulin and proline-rich containing protein | 1435 | 3 | 0.84 | | |
| RMI1_S455N_Aph | TOP3B | 8940 | topoisomerase (DNA) III beta | 862 | 4 | 0.82 | | |
| RMI1_S455N_Aph | APITD1 | 378708 | apoptosis-inducing TAF9-like domain-containing protein 1 (MHF1) | 138 | 3 | 0.81 | | |
| RMI1_S455N_Aph | ERCC6L | 54821 | excision repair cross-complementation group 6-like | 1250 | 12 | 0.81 | | |
| RMI1_S455N_Aph | C19orf40 | 91442 | Fanconi anemia-associated protein of 24 kDa (FAAP24) | 215 | 3 | 0.81 | S arrest | |
| RMI1_S455N_Aph | STRA13 | 201254 | stimulated by retinoic acid gene 13 protein homolog (MHF2) | 81 | 3 | 0.81 | | |
| RMI1_S455N_Aph | RPA3 | 6119 | replication protein A 14 kDa subunit | 121 | 7 | 0.76 | | yes |
| RMI1_S455N_Aph | RPA2 | 6118 | replication protein A2, 32kDa | 270 | 5 | 0.76 | S arrest | yes |
| RMI1_S455N_Aph | CSTB | 1476 | cystatin B (stefin B) | 98 | 4 | 0.67 | | |
| RMI1_S455N_Aph | ZCCHC8 | 55596 | zinc finger, CCHC domain containing 8 | 707 | 29 | 0.51 | | |

1 Prey proteins with a SAINT score ≥0.75 in BioID-MS were considered candidate prey (black text). Grey text represents prey candidates that fall below the cut-off criteria. Grey shaded boxes represent genes that were tested in the EdU incorporation assay.

2 Cell cycle annotation categories from genome-scale profiling following esiRNA depletion in HeLa cells (Kittler et al., 2007)

3 gH2AX annotation from genome-scale profiling following pooled siRNA depletion in HeLa cells (Paulsen et al., 2009)
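
The cut-off described in footnote 1 amounts to a simple threshold on the SAINT score. The following is a minimal sketch in Python, assuming each table row has been parsed into a record; the `Prey` type and the `split_by_cutoff` helper are hypothetical names introduced for illustration and are not part of the analysis pipeline used in this thesis. The example rows are the RMI2 bait entries from the table above.

```python
from typing import List, NamedTuple, Tuple

SAINT_CUTOFF = 0.75  # cut-off stated in footnote 1


class Prey(NamedTuple):
    """One table row, reduced to the fields needed here (hypothetical type)."""
    bait: str
    gene: str
    avg_spectral_counts: float
    saint: float


# RMI2 bait rows copied from the table (bait, prey gene, avg counts, SAINT).
RMI2_ROWS = [
    Prey("RMI2", "RMI1", 8, 1.0),
    Prey("RMI2", "TOP3A", 7, 1.0),
    Prey("RMI2", "ERC1", 11, 0.87),
    Prey("RMI2", "BLM", 8, 0.87),
    Prey("RMI2", "DOPEY2", 2, 0.83),
    Prey("RMI2", "ZSCAN18", 1, 0.68),
    Prey("RMI2", "TNKS1BP1", 1, 0.64),
    Prey("RMI2", "TP53", 6, 0.51),
]


def split_by_cutoff(
    rows: List[Prey], cutoff: float = SAINT_CUTOFF
) -> Tuple[List[Prey], List[Prey]]:
    """Return (candidate prey, prey below the cut-off), preserving row order."""
    candidates = [r for r in rows if r.saint >= cutoff]
    below = [r for r in rows if r.saint < cutoff]
    return candidates, below


candidates, below = split_by_cutoff(RMI2_ROWS)
print("candidate prey:", [r.gene for r in candidates])
print("below cut-off:", [r.gene for r in below])
```

Applied to the RMI2 rows, the filter keeps the prey that would appear in black text and sets aside those that would appear in grey.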


6 References

Adair, G.M., Rolig, R.L., Moore-Faver, D., Zabelshansky, M., Wilson, J.H., and Nairn, R.S. (2000). Role of ERCC1 in removal of long non-homologous tails during targeted homologous recombination. EMBO J 19, 5552-5561.

Admire, A., Shanks, L., Danzl, N., Wang, M., Weier, U., Stevens, W., Hunt, E., and Weinert, T. (2006). Cycles of chromosome instability are associated with a fragile site and are increased by defects in DNA replication and checkpoint controls in yeast. Genes Dev 20, 159-173.

Aguilera, A. (2002). The connection between transcription and genomic instability. EMBO J 21, 195-201.

Aguilera, A., and Garcia-Muse, T. (2012). R loops: from transcription byproducts to threats to genome stability. Mol Cell 46, 115-124.

Aguilera, A., and Gomez-Gonzalez, B. (2008). Genome instability: a mechanistic view of its causes and consequences. Nature reviews Genetics 9, 204-217.

Alabert, C., Bukowski-Wills, J.C., Lee, S.B., Kustatscher, G., Nakamura, K., de Lima Alves, F., Menard, P., Mejlvang, J., Rappsilber, J., and Groth, A. (2014). Nascent chromatin capture proteomics determines chromatin dynamics during DNA replication and identifies unknown fork components. Nature cell biology 16, 281-293.

Alabert, C., and Groth, A. (2012). Chromatin replication and epigenome maintenance. Nature reviews Molecular cell biology 13, 153-167.

Alani, E., Thresher, R., Griffith, J.D., and Kolodner, R.D. (1992). Characterization of DNA-binding and strand-exchange stimulation properties of y-RPA, a yeast single-strand-DNA-binding protein. Journal of molecular biology 227, 54-71.

Alvaro, D., Lisby, M., and Rothstein, R. (2007). Genome-wide analysis of Rad52 foci reveals diverse mechanisms impacting recombination. PLoS Genet 3, e228.

Ampatzidou, E., Irmisch, A., O'Connell, M.J., and Murray, J.M. (2006). Smc5/6 is required for repair at collapsed replication forks. Mol Cell Biol 26, 9387-9401.

Andersen, M.P., Nelson, Z.W., Hetrick, E.D., and Gottschling, D.E. (2008). A genetic screen for increased loss of heterozygosity in Saccharomyces cerevisiae. Genetics 179, 1179-1195.

Argueso, J.L., Westmoreland, J., Mieczkowski, P.A., Gawel, M., Petes, T.D., and Resnick, M.A. (2008). Double-strand breaks associated with repetitive DNA can reshape the genome. Proc Natl Acad Sci U S A 105, 11845-11850.

Arlt, M.F., Xu, B., Durkin, S.G., Casper, A.M., Kastan, M.B., and Glover, T.W. (2004). BRCA1 is required for common-fragile-site stability via its G2/M checkpoint function. Mol Cell Biol 24, 6701-6709.

Bansbach, C.E., Betous, R., Lovejoy, C.A., Glick, G.G., and Cortez, D. (2009). The annealing helicase SMARCAL1 maintains genome integrity at stalled replication forks. Genes & development 23, 2405-2414.

Baumann, C., Korner, R., Hofmann, K., and Nigg, E.A. (2007). PICH, a centromere-associated SNF2 family ATPase, is regulated by Plk1 and required for the spindle checkpoint. Cell 128, 101-114.

Behlke-Steinert, S., Touat-Todeschini, L., Skoufias, D.A., and Margolis, R.L. (2009). SMC5 and MMS21 are required for chromosome cohesion and mitotic progression. Cell Cycle 8, 2211-2218.

Bell, S.P., and Dutta, A. (2002). DNA replication in eukaryotic cells. Annual review of biochemistry 71, 333-374.

Ben-Aroya, S., Coombes, C., Kwok, T., O'Donnell, K.A., Boeke, J.D., and Hieter, P. (2008). Toward a comprehensive temperature-sensitive mutant repository of the essential genes of Saccharomyces cerevisiae. Mol Cell 30, 248-258.

Bermejo, R., Doksani, Y., Capra, T., Katou, Y.M., Tanaka, H., Shirahige, K., and Foiani, M. (2007). Top1- and Top2-mediated topological transitions at replication forks ensure fork progression and stability and prevent DNA damage checkpoint activation. Genes & development 21, 1921-1936.

Bermejo, R., Lai, M.S., and Foiani, M. (2012). Preventing replication stress to maintain genome stability: resolving conflicts between replication and transcription. Mol Cell 45, 710-718.

Bernstein, K.A., Gangloff, S., and Rothstein, R. (2010). The RecQ DNA helicases in DNA repair. Annual review of genetics 44, 393-417.

Bielinsky, A.K. (2003). Replication origins: why do we need so many? Cell Cycle 2, 307-309.

Blackford, A.N., Schwab, R.A., Nieminuszczy, J., Deans, A.J., West, S.C., and Niedzwiedz, W. (2012). The DNA translocase activity of FANCM protects stalled replication forks. Human molecular genetics 21, 2005-2016.

Bocquet, N., Bizard, A.H., Abdulrahman, W., Larsen, N.B., Faty, M., Cavadini, S., Bunker, R.D., Kowalczykowski, S.C., Cejka, P., Hickson, I.D., et al. (2014). Structural and mechanistic insight into Holliday-junction dissolution by topoisomerase IIIalpha and RMI1. Nature structural & molecular biology 21, 261-268.

Bosco, G., and Haber, J.E. (1998). Chromosome break-induced DNA replication leads to nonreciprocal translocations and telomere capture. Genetics 150, 1037-1047.

Branzei, D., and Foiani, M. (2007). Interplay of replication checkpoints and repair proteins at stalled replication forks. DNA Repair (Amst) 6, 994-1003.

Branzei, D., and Foiani, M. (2009). The checkpoint response to replication stress. DNA repair 8, 1038-1046.

Branzei, D., and Foiani, M. (2010). Maintaining genome stability at the replication fork. Nature reviews Molecular cell biology 11, 208-219.

Branzei, D., Sollier, J., Liberi, G., Zhao, X., Maeda, D., Seki, M., Enomoto, T., Ohta, K., and Foiani, M. (2006). Ubc9- and mms21-mediated sumoylation counteracts recombinogenic events at damaged replication forks. Cell 127, 509-522.

Broberg, K., Hoglund, M., Gustafsson, C., Bjork, J., Ingvar, C., Albin, M., and Olsson, H. (2007). Genetic variant of the human homologous recombination-associated gene RMI1 (S455N) impacts the risk of AML/MDS and malignant melanoma. Cancer letters 258, 38-44.

Bruschi, C.V., McMillan, J.N., Coglievina, M., and Esposito, M.S. (1995). The genomic instability of yeast cdc6-1/cdc6-1 mutants involves chromosome structure and recombination. Mol Gen Genet 249, 8-18.

Budd, M.E., Choe, W., and Campbell, J.L. (2000). The nuclease activity of the yeast DNA2 protein, which is related to the RecB-like nucleases, is essential in vivo. J Biol Chem 275, 16518-16529.

Bugler, B., Schmitt, E., Aressy, B., and Ducommun, B. (2010). Unscheduled expression of CDC25B in S-phase leads to replicative stress and DNA damage. Molecular cancer 9, 29.

Bugreev, D.V., Yu, X., Egelman, E.H., and Mazin, A.V. (2007). Novel pro- and anti-recombination activities of the Bloom's syndrome helicase. Genes & development 21, 3085-3094.

Burrows, A.E., and Elledge, S.J. (2008). How ATR turns on: TopBP1 goes on ATRIP with ATR. Genes & development 22, 1416-1421.

Byun, T.S., Pacek, M., Yee, M.C., Walter, J.C., and Cimprich, K.A. (2005). Functional uncoupling of MCM helicase and DNA polymerase activities activates the ATR-dependent checkpoint. Genes & development 19, 1040-1052.

Casper, A.M., Greenwell, P.W., Tang, W., and Petes, T.D. (2009). Chromosome aberrations resulting from double-strand DNA breaks at a naturally occurring yeast fragile site composed of inverted ty elements are independent of Mre11p and Sae2p. Genetics 183, 423-439, 421SI-426SI.

Casper, A.M., Nghiem, P., Arlt, M.F., and Glover, T.W. (2002). ATR regulates fragile site stability. Cell 111, 779-789.

Castellano-Pozo, M., Garcia-Muse, T., and Aguilera, A. (2012). R-loops cause replication impairment and genome instability during meiosis. EMBO reports 13, 923-929.

Cejka, P., Plank, J.L., Bachrati, C.Z., Hickson, I.D., and Kowalczykowski, S.C. (2010). Rmi1 stimulates decatenation of double Holliday junctions during dissolution by Sgs1-Top3. Nature structural & molecular biology 17, 1377-1382.

Cerritelli, S.M., and Crouch, R.J. (2009). Ribonuclease H: the enzymes in eukaryotes. The FEBS journal 276, 1494-1505.

Cha, R.S., and Kleckner, N. (2002). ATR homolog Mec1 promotes fork progression, thus averting breaks in replication slow zones. Science 297, 602-606.

Chabosseau, P., Buhagiar-Labarchede, G., Onclercq-Delic, R., Lambert, S., Debatisse, M., Brison, O., and Amor-Gueret, M. (2011). Pyrimidine pool imbalance induced by BLM helicase deficiency contributes to genetic instability in Bloom syndrome. Nature communications 2, 368.

Chaganti, R.S., Schonberg, S., and German, J. (1974). A manyfold increase in sister chromatid exchanges in Bloom's syndrome lymphocytes. Proceedings of the National Academy of Sciences of the United States of America 71, 4508-4512.

Chan, K.L., and Hickson, I.D. (2011). New insights into the formation and resolution of ultra-fine anaphase bridges. Seminars in cell & developmental biology 22, 906-912.

Chan, K.L., North, P.S., and Hickson, I.D. (2007). BLM is required for faithful chromosome segregation and its localization defines a class of ultrafine anaphase bridges. EMBO J 26, 3397-3409.

Chan, K.L., Palmai-Pallag, T., Ying, S., and Hickson, I.D. (2009). Replication stress induces sister-chromatid bridging at fragile site loci in mitosis. Nature cell biology 11, 753-760.

Chang, M., Bellaoui, M., Zhang, C., Desai, R., Morozov, P., Delgado-Cruzata, L., Rothstein, R., Freyer, G.A., Boone, C., and Brown, G.W. (2005). RMI1/NCE4, a suppressor of genome instability, encodes a member of the RecQ helicase/Topo III complex. EMBO J 24, 2024-2033.

Chaudhury, I., Sareen, A., Raghunandan, M., and Sobeck, A. (2013). FANCD2 regulates BLM complex functions independently of FANCI to promote replication fork recovery. Nucleic acids research 41, 6444-6459.

Chavez, A., George, V., Agrawal, V., and Johnson, F.B. (2010). Sumoylation and the structural maintenance of chromosomes (Smc) 5/6 complex slow senescence through recombination intermediate resolution. The Journal of biological chemistry 285, 11922-11930.

Chen, C., Umezu, K., and Kolodner, R.D. (1998). Chromosomal rearrangements occur in S. cerevisiae rfa1 mutator mutants due to mutagenic lesions processed by double-strand-break repair. Mol Cell 2, 9-22.

Chen, J., Bardes, E.E., Aronow, B.J., and Jegga, A.G. (2009). ToppGene Suite for gene list enrichment analysis and candidate gene prioritization. Nucleic acids research 37, W305-311.

Chen, S., de Vries, M.A., and Bell, S.P. (2007). Orc6 is required for dynamic recruitment of Cdt1 during repeated Mcm2-7 loading. Genes & development 21, 2897-2907.

Chini, C.C., and Chen, J. (2003). Human claspin is required for replication checkpoint control. The Journal of biological chemistry 278, 30057-30062.

Choi, H., Larsen, B., Lin, Z.Y., Breitkreutz, A., Mellacheruvu, D., Fermin, D., Qin, Z.S., Tyers, M., Gingras, A.C., and Nesvizhskii, A.I. (2011). SAINT: probabilistic scoring of affinity purification-mass spectrometry data. Nature methods 8, 70-73.

Chon, H., Sparks, J.L., Rychlik, M., Nowotny, M., Burgers, P.M., Crouch, R.J., and Cerritelli, S.M. (2013). RNase H2 roles in genome integrity revealed by unlinking its activities. Nucleic acids research 41, 3130-3143.

Chou, D.M., and Elledge, S.J. (2006). Tipin and Timeless form a mutually protective complex required for genotoxic stress resistance and checkpoint function. Proceedings of the National Academy of Sciences of the United States of America 103, 18143-18147.

Ciccia, A., Bredemeyer, A.L., Sowa, M.E., Terret, M.E., Jallepalli, P.V., Harper, J.W., and Elledge, S.J. (2009). The SIOD disorder protein SMARCAL1 is an RPA-interacting protein involved in replication fork restart. Genes & development 23, 2415-2425.

Cimprich, K.A., and Cortez, D. (2008). ATR: an essential regulator of genome integrity. Nature reviews Molecular cell biology 9, 616-627.

Couzens, A.L., Knight, J.D., Kean, M.J., Teo, G., Weiss, A., Dunham, W.H., Lin, Z.Y., Bagshaw, R.D., Sicheri, F., Pawson, T., et al. (2013). Protein interaction network of the mammalian Hippo pathway reveals mechanisms of kinase-phosphatase interactions. Science signaling 6, rs15.

Cullmann, G., Fien, K., Kobayashi, R., and Stillman, B. (1995). Characterization of the five replication factor C genes of Saccharomyces cerevisiae. Molecular and cellular biology 15, 4661-4671.

Czubaty, A., Girstun, A., Kowalska-Loth, B., Trzcinska, A.M., Purta, E., Winczura, A., Grajkowski, W., and Staron, K. (2005). Proteomic analysis of complexes formed by human topoisomerase I. Biochimica et biophysica acta 1749, 133-141.

Daboussi, F., Courbet, S., Benhamou, S., Kannouche, P., Zdzienicka, M.Z., Debatisse, M., and Lopez, B.S. (2008). A homologous recombination defect affects replication-fork progression in mammalian cells. J Cell Sci 121, 162-166.

Daley, J.M., Palmbos, P.L., Wu, D., and Wilson, T.E. (2005). Nonhomologous end joining in yeast. Annual review of genetics 39, 431-451.

Davies, O.R., and Pellegrini, L. (2007). Interaction with the BRCA2 C terminus protects RAD51-DNA filaments from disassembly by BRC repeats. Nature structural & molecular biology 14, 475-483.

Davies, S.L., North, P.S., and Hickson, I.D. (2007). Role for BLM in replication-fork restart and suppression of origin firing after replicative stress. Nature structural & molecular biology 14, 677-679.

De Piccoli, G., Cortes-Ledesma, F., Ira, G., Torres-Rosell, J., Uhle, S., Farmer, S., Hwang, J.Y., Machin, F., Ceschia, A., McAleenan, A., et al. (2006). Smc5-Smc6 mediate DNA double-strand-break repair by promoting sister-chromatid recombination. Nat Cell Biol 8, 1032-1034.

Debatisse, M., Le Tallec, B., Letessier, A., Dutrillaux, B., and Brison, O. (2012). Common fragile sites: mechanisms of instability revisited. Trends in genetics : TIG 28, 22-32.

Delacroix, S., Wagner, J.M., Kobayashi, M., Yamamoto, K., and Karnitz, L.M. (2007). The Rad9-Hus1-Rad1 (9-1-1) clamp activates checkpoint signaling via TopBP1. Genes & development 21, 1472-1477.

Desany, B.A., Alcasabas, A.A., Bachant, J.B., and Elledge, S.J. (1998). Recovery from DNA replicational stress is the essential function of the S-phase checkpoint pathway. Genes & development 12, 2956-2970.

Deshpande, A.M., and Newlon, C.S. (1996). DNA replication fork pause sites dependent on transcription. Science 272, 1030-1033.

Di Rienzi, S.C., Collingwood, D., Raghuraman, M.K., and Brewer, B.J. (2009). Fragile genomic sites are associated with origins of replication. Genome Biol Evol 1, 350-363.

Dimitrova, D.S., Todorov, I.T., Melendy, T., and Gilbert, D.M. (1999). Mcm2, but not RPA, is a component of the mammalian early G1-phase prereplication complex. The Journal of cell biology 146, 709-722.

Dingar, D., Kalkat, M., Chan, P.K., Srikumar, T., Bailey, S.D., Tu, W.B., Coyaud, E., Ponzielli, R., Kolyar, M., Jurisica, I., et al. (2014). BioID identifies novel c-MYC interacting partners in cultured cells and xenograft tumors. Journal of proteomics.

Dion, B., and Brown, G.W. (2009). Comparative genome hybridization on tiling microarrays to detect aneuploidies in yeast. Methods Mol Biol 548, 1-18.

Duan, Z., Andronescu, M., Schutz, K., McIlwain, S., Kim, Y.J., Lee, C., Shendure, J., Fields, S., Blau, C.A., and Noble, W.S. (2010). A three-dimensional model of the yeast genome. Nature 465, 363-367.

Dunham, M.J., Badrane, H., Ferea, T., Adams, J., Brown, P.O., Rosenzweig, F., and Botstein, D. (2002). Characteristic genome rearrangements in experimental evolution of Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 99, 16144-16149.

Durkin, S.G., Arlt, M.F., Howlett, N.G., and Glover, T.W. (2006). Depletion of CHK1, but not CHK2, induces chromosomal instability and breaks at common fragile sites. Oncogene 25, 4381-4388.

Durkin, S.G., and Glover, T.W. (2007). Chromosome fragile sites. Annual review of genetics 41, 169-192.

Dutertre, M., Lambert, S., Carreira, A., Amor-Gueret, M., and Vagner, S. (2014). DNA damage: RNA-binding proteins protect from near and far. Trends in biochemical sciences 39, 141-149.

Ellison, V., and Stillman, B. (2001). Opening of the clamp: an intimate view of an ATP-driven biological machine. Cell 106, 655-660.

Ellison, V., and Stillman, B. (2003). Biochemical characterization of DNA damage checkpoint complexes: clamp loader and clamp complexes with specificity for 5' recessed DNA. PLoS biology 1, E33.

Fachinetti, D., Bermejo, R., Cocito, A., Minardi, S., Katou, Y., Kanoh, Y., Shirahige, K., Azvolinsky, A., Zakian, V.A., and Foiani, M. (2010). Replication termination at eukaryotic chromosomes is mediated by Top2 and occurs at genomic loci containing pausing elements. Mol Cell 39, 595-605.

Falck, J., Coates, J., and Jackson, S.P. (2005). Conserved modes of recruitment of ATM, ATR and DNA-PKcs to sites of DNA damage. Nature 434, 605-611.

Fanning, E., Klimovich, V., and Nager, A.R. (2006). A dynamic model for replication protein A (RPA) function in DNA processing pathways. Nucleic acids research 34, 4126-4137.

Fekairi, S., Scaglione, S., Chahwan, C., Taylor, E.R., Tissier, A., Coulon, S., Dong, M.Q., Ruse, C., Yates, J.R., 3rd, Russell, P., et al. (2009). Human SLX4 is a Holliday junction resolvase subunit that binds multiple DNA repair/recombination endonucleases. Cell 138, 78-89.

Fien, K., and Stillman, B. (1992). Identification of replication factor C from Saccharomyces cerevisiae: a component of the leading-strand DNA replication complex. Molecular and cellular biology 12, 155-163.

Fishman-Lobell, J., and Haber, J.E. (1992). Removal of nonhomologous DNA ends in double-strand break recombination: the role of the yeast ultraviolet repair gene RAD1. Science 258, 480-484.

Flott, S., Alabert, C., Toh, G.W., Toth, R., Sugawara, N., Campbell, D.G., Haber, J.E., Pasero, P., and Rouse, J. (2007). Phosphorylation of Slx4 by Mec1 and Tel1 regulates the single-strand annealing mode of DNA repair in budding yeast. Molecular and cellular biology 27, 6433-6445.

Focarelli, M.L., Soza, S., Mannini, L., Paulis, M., Montecucco, A., and Musio, A. (2009). Claspin inhibition leads to fragile site expression. Genes Chromosomes Cancer 48, 1083-1090.

Franchitto, A., and Pichierri, P. (2011). Understanding the molecular basis of common fragile sites instability: role of the proteins involved in the recovery of stalled replication forks. Cell cycle 10, 4039-4046.

Fujioka, Y., Kimata, Y., Nomaguchi, K., Watanabe, K., and Kohno, K. (2002). Identification of a novel non-structural maintenance of chromosomes (SMC) component of the SMC5-SMC6 complex involved in DNA repair. J Biol Chem 277, 21585-21591.

Fuss, J., and Linn, S. (2002). Human DNA polymerase epsilon colocalizes with proliferating cell nuclear antigen and DNA replication late, but not early, in S phase. The Journal of biological chemistry 277, 8658-8666.

Gallego-Paez, L.M., Tanaka, H., Bando, M., Takahashi, M., Nozaki, N., Nakato, R., Shirahige, K., and Hirota, T. (2014). Smc5/6-mediated regulation of replication progression contributes to chromosome assembly during mitosis in human cells. Molecular biology of the cell 25, 302-317.

Gambus, A., Jones, R.C., Sanchez-Diaz, A., Kanemaki, M., van Deursen, F., Edmondson, R.D., and Labib, K. (2006). GINS maintains association of Cdc45 with MCM in replisome progression complexes at eukaryotic DNA replication forks. Nature cell biology 8, 358-366.

Gan, W., Guan, Z., Liu, J., Gui, T., Shen, K., Manley, J.L., and Li, X. (2011). R-loop-mediated genomic instability is caused by impairment of replication fork progression. Genes & development 25, 2041-2056.

Gangloff, S., McDonald, J.P., Bendixen, C., Arthur, L., and Rothstein, R. (1994). The yeast type I topoisomerase Top3 interacts with Sgs1, a DNA helicase homolog: a potential eukaryotic reverse gyrase. Molecular and cellular biology 14, 8391-8398.

Garfinkel, D.J. (2005). Genome evolution mediated by Ty elements in Saccharomyces. Cytogenet Genome Res 110, 63-69.

German, J., Archibald, R., and Bloom, D. (1965). Chromosomal Breakage in a Rare and Probably Genetically Determined Syndrome of Man. Science 148, 506-507.

German, J., Sanz, M.M., Ciocci, S., Ye, T.Z., and Ellis, N.A. (2007). Syndrome-causing mutations of the BLM gene in persons in the Bloom's Syndrome Registry. Human mutation 28, 743-753.

Ghosal, G., and Chen, J. (2013). DNA damage tolerance: a double-edged sword guarding the genome. Translational cancer research 2, 107-129.

Giaever, G., Chu, A.M., Ni, L., Connelly, C., Riles, L., Veronneau, S., Dow, S., Lucau-Danila, A., Anderson, K., Andre, B., et al. (2002). Functional profiling of the Saccharomyces cerevisiae genome. Nature 418, 387-391.

Glover, T.W., Arlt, M.F., Casper, A.M., and Durkin, S.G. (2005). Mechanisms of common fragile site instability. Human molecular genetics 14 Spec No. 2, R197-205.

Glover, T.W., Berger, C., Coyle, J., and Echo, B. (1984). DNA polymerase alpha inhibition by aphidicolin induces gaps and breaks at common fragile sites in human chromosomes. Human genetics 67, 136-142.

Gotter, A.L., Suppa, C., and Emanuel, B.S. (2007). Mammalian TIMELESS and Tipin are evolutionarily conserved replication fork-associated factors. Journal of molecular biology 366, 36-52.

Goulaouic, H., Roulon, T., Flamand, O., Grondard, L., Lavelle, F., and Riou, J.F. (1999). Purification and characterization of human DNA topoisomerase IIIalpha. Nucleic acids research 27, 2443-2450.

Hanada, K., Budzowska, M., Davies, S.L., van Drunen, E., Onizawa, H., Beverloo, H.B., Maas, A., Essers, J., Hickson, I.D., and Kanaar, R. (2007). The structure-specific endonuclease Mus81 contributes to replication restart by generating double-strand DNA breaks. Nature structural & molecular biology 14, 1096-1104.

Harper, J.W., and Elledge, S.J. (2007). The DNA damage response: ten years after. Mol Cell 28, 739-745.

Hawthorne, D.C. (1963). A Deletion in Yeast and Its Bearing on the Structure of the Mating Type Locus. Genetics 48, 1727-1729.

Heller, R.C., Kang, S., Lam, W.M., Chen, S., Chan, C.S., and Bell, S.P. (2011). Eukaryotic origin-dependent DNA replication in vitro reveals sequential action of DDK and S-CDK kinases. Cell 146, 80-91.

Helmrich, A., Ballarino, M., and Tora, L. (2011). Collisions between replication and transcription complexes cause common fragile site instability at the longest human genes. Mol Cell 44, 966-977.

Helmrich, A., Stout-Weider, K., Hermann, K., Schrock, E., and Heiden, T. (2006). Common fragile sites are conserved features of human and mouse chromosomes and relate to large active genes. Genome research 16, 1222-1230.

Herrick, J., and Bensimon, A. (1999). Single molecule analysis of DNA replication. Biochimie 81, 859-871.

Hickson, I.D. (2003). RecQ helicases: caretakers of the genome. Nature reviews Cancer 3, 169-178.

Hirano, S., Yamamoto, K., Ishiai, M., Yamazoe, M., Seki, M., Matsushita, N., Ohzeki, M., Yamashita, Y.M., Arakawa, H., Buerstedde, J.M., et al. (2005). Functional relationships of FANCC to homologous recombination, translesion synthesis, and BLM. EMBO J 24, 418-427.

Hoadley, K.A., Xue, Y., Ling, C., Takata, M., Wang, W., and Keck, J.L. (2012). Defining the molecular interface that connects the Fanconi anemia protein FANCM to the Bloom syndrome dissolvasome. Proceedings of the National Academy of Sciences of the United States of America 109, 4437-4442.

Hoang, M.L., Tan, F.J., Lai, D.C., Celniker, S.E., Hoskins, R.A., Dunham, M.J., Zheng, Y., and Koshland, D. (2010). Competitive repair by naturally dispersed repetitive DNA during non-allelic homologous recombination. PLoS Genet 6, e1001228.

Hood, J.K., and Silver, P.A. (1998). Cse1p is required for export of Srp1p/importin-alpha from the nucleus in Saccharomyces cerevisiae. J Biol Chem 273, 35142-35146.

Howlett, N.G., Taniguchi, T., Durkin, S.G., D'Andrea, A.D., and Glover, T.W. (2005). The Fanconi anemia pathway is required for the DNA replication stress response and for the regulation of common fragile site stability. Human molecular genetics 14, 693-701.

Huang, M.E., and Kolodner, R.D. (2005). A biological network in Saccharomyces cerevisiae prevents the deleterious effects of endogenous oxidative DNA damage. Mol Cell 17, 709-720.

Huang, M.E., Rio, A.G., Nicolas, A., and Kolodner, R.D. (2003). A genomewide screen in Saccharomyces cerevisiae for genes that suppress the accumulation of mutations. Proc Natl Acad Sci U S A 100, 11529-11534.

Hubscher, U. (2009). DNA replication fork proteins. Methods Mol Biol 521, 19-33.

Hubscher, U., Maga, G., and Spadari, S. (2002). Eukaryotic DNA polymerases. Annual review of biochemistry 71, 133-163.

Huen, M.S., Grant, R., Manke, I., Minn, K., Yu, X., Yaffe, M.B., and Chen, J. (2007). RNF8 transduces the DNA-damage signal via histone ubiquitylation and checkpoint protein assembly. Cell 131, 901-914.

Huertas, P., and Aguilera, A. (2003). Cotranscriptionally formed DNA:RNA hybrids mediate transcription elongation impairment and transcription-associated recombination. Mol Cell 12, 711-721.

Hur, S.K., Park, E.J., Han, J.E., Kim, Y.A., Kim, J.D., Kang, D., and Kwon, J. (2010). Roles of human INO80 chromatin remodeling enzyme in DNA replication and chromosome segregation suppress genome instability. Cellular and molecular life sciences : CMLS 67, 2283-2296.

Ip, S.C., Rass, U., Blanco, M.G., Flynn, H.R., Skehel, J.M., and West, S.C. (2008). Identification of Holliday junction resolvases from humans and yeast. Nature 456, 357-361.

Irmisch, A., Ampatzidou, E., Mizuno, K., O'Connell, M.J., and Murray, J.M. (2009). Smc5/6 maintains stalled replication forks in a recombination-competent conformation. EMBO J 28, 144-155.

Ivessa, A.S., Lenzmeier, B.A., Bessler, J.B., Goudsouzian, L.K., Schnakenberg, S.L., and Zakian, V.A. (2003). The Saccharomyces cerevisiae helicase Rrm3p facilitates replication past nonhistone protein-DNA complexes. Mol Cell 12, 1525-1536.

Jagannathan, M., Nguyen, T., Gallo, D., Luthra, N., Brown, G.W., Saridakis, V., and Frappier, L. (2014). A role for USP7 in DNA replication. Molecular and cellular biology 34, 132-145.

Jeong, S.Y., Kumagai, A., Lee, J., and Dunphy, W.G. (2003). Phosphorylated claspin interacts with a phosphate-binding site in the kinase domain of Chk1 during ATR-mediated activation. The Journal of biological chemistry 278, 46782-46788.

Jeppsson, K., Kanno, T., Shirahige, K., and Sjogren, C. (2014). The maintenance of chromosome structure: positioning and functioning of SMC complexes. Nature reviews Molecular cell biology 15, 601-614.

Jones, R.M., and Petermann, E. (2012). Replication fork dynamics and the DNA damage response. The Biochemical journal 443, 13-26.

Jonsson, Z.O., and Hubscher, U. (1997). Proliferating cell nuclear antigen: more than a clamp for DNA polymerases. BioEssays : news and reviews in molecular, cellular and developmental biology 19, 967-975.

Jungmichel, S., Clapperton, J.A., Lloyd, J., Hari, F.J., Spycher, C., Pavic, L., Li, J., Haire, L.F., Bonalli, M., Larsen, D.H., et al. (2012). The molecular basis of ATM-dependent dimerization of the Mdc1 DNA damage checkpoint mediator. Nucleic acids research 40, 3913-3928.

Kanemaki, M., and Labib, K. (2006). Distinct roles for Sld3 and GINS during establishment and progression of eukaryotic DNA replication forks. EMBO J 25, 1753-1763.

Kanke, M., Kodama, Y., Takahashi, T.S., Nakagawa, T., and Masukata, H. (2012). Mcm10 plays an essential role in origin DNA unwinding after loading of the CMG components. EMBO J 31, 2182-2194.

Kao, H.I., Campbell, J.L., and Bambara, R.A. (2004). Dna2p helicase/nuclease is a tracking protein, like FEN1, for flap cleavage during Okazaki fragment maturation. The Journal of biological chemistry 279, 50840-50849.

Karvonen, U., Jaaskelainen, T., Rytinki, M., Kaikkonen, S., and Palvimo, J.J. (2008). ZNF451 is a novel PML body- and SUMO-associated transcriptional coregulator. Journal of molecular biology 382, 585-600.

Kellis, M., Patterson, N., Endrizzi, M., Birren, B., and Lander, E.S. (2003). Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423, 241-254.

Kim, D.I., Birendra, K.C., Zhu, W., Motamedchaboki, K., Doye, V., and Roux, K.J. (2014). Probing nuclear pore complex architecture with proximity-dependent biotinylation. Proceedings of the National Academy of Sciences of the United States of America 111, E2453-2461.

Kim, R.A., and Wang, J.C. (1989). Function of DNA topoisomerases as replication swivels in Saccharomyces cerevisiae. Journal of molecular biology 208, 257-267.

Kittler, R., Pelletier, L., Heninger, A.K., Slabicki, M., Theis, M., Miroslaw, L., Poser, I., Lawo, S., Grabner, H., Kozak, K., et al. (2007). Genome-scale RNAi profiling of cell division in human tissue culture cells. Nature cell biology 9, 1401-1412.

Kliszczak, A.E., Rainey, M.D., Harhen, B., Boisvert, F.M., and Santocanale, C. (2011). DNA mediated chromatin pull-down for the study of chromatin replication. Scientific reports 1, 95.

Kolas, N.K., Chapman, J.R., Nakada, S., Ylanko, J., Chahwan, R., Sweeney, F.D., Panier, S., Mendez, M., Wildenhain, J., Thomson, T.M., et al. (2007). Orchestration of the DNA-damage response by the RNF8 ubiquitin ligase. Science 318, 1637-1640.

Kolodner, R.D., Putnam, C.D., and Myung, K. (2002). Maintenance of genome stability in Saccharomyces cerevisiae. Science 297, 552-557.

Kunzler, M., and Hurt, E.C. (1998). Cse1p functions as the nuclear export receptor for importin alpha in yeast. FEBS Lett 433, 185-190.

Labib, K., and Gambus, A. (2007). A key role for the GINS complex at DNA replication forks. Trends in cell biology 17, 271-278.

Labib, K., Tercero, J.A., and Diffley, J.F. (2000). Uninterrupted MCM2-7 function required for DNA replication fork progression. Science 288, 1643-1647.

Larsen, N.B., and Hickson, I.D. (2013). RecQ Helicases: Conserved Guardians of Genomic Integrity. Advances in experimental medicine and biology 767, 161-184.

Laskey, R.A., and Madine, M.A. (2003). A rotary pumping model for helicase function of MCM proteins at a distance from replication forks. EMBO reports 4, 26-30.

Le Beau, M.M., Rassool, F.V., Neilly, M.E., Espinosa, R., 3rd, Glover, T.W., Smith, D.I., and McKeithan, T.W. (1998). Replication of a common fragile site, FRA3B, occurs late in S phase and is delayed further upon induction: implications for the mechanism of fragile site induction. Hum Mol Genet 7, 755-761.

Lee, C., Hong, B., Choi, J.M., Kim, Y., Watanabe, S., Ishimi, Y., Enomoto, T., Tada, S., Kim, Y., and Cho, Y. (2004). Structural basis for inhibition of the replication licensing factor Cdt1 by geminin. Nature 430, 913-917.

Lee, J., and Dunphy, W.G. (2010). Rad17 plays a central role in establishment of the interaction between TopBP1 and the Rad9-Hus1-Rad1 complex at stalled replication forks. Molecular biology of the cell 21, 926-935.

Lee, J., Kumagai, A., and Dunphy, W.G. (2003a). Claspin, a Chk1-regulatory protein, monitors DNA replication on chromatin independently of RPA, ATR, and Rad17. Mol Cell 11, 329-340.

Lee, J.H., Goodarzi, A.A., Jeggo, P.A., and Paull, T.T. (2010). 53BP1 promotes ATM activity through direct interactions with the MRN complex. EMBO J 29, 574-585.

Lee, J.H., and Paull, T.T. (2005). ATM activation by DNA double-strand breaks through the Mre11-Rad50-Nbs1 complex. Science 308, 551-554.

Lee, J.K., Seo, Y.S., and Hurwitz, J. (2003b). The Cdc23 (Mcm10) protein is required for the phosphorylation of minichromosome maintenance complex by the Dfp1-Hsk1 kinase. Proc Natl Acad Sci U S A 100, 2334-2339.

Lee, K.H., Kim, D.W., Bae, S.H., Kim, J.A., Ryu, G.H., Kwon, Y.N., Kim, K.A., Koo, H.S., and Seo, Y.S. (2000). The endonuclease activity of the yeast Dna2 enzyme is essential in vivo. Nucleic Acids Res 28, 2873-2881.

Lemoine, F.J., Degtyareva, N.P., Kokoska, R.J., and Petes, T.D. (2008). Reduced levels of DNA polymerase delta induce chromosome fragile site instability in yeast. Mol Cell Biol 28, 5359-5368.

Lemoine, F.J., Degtyareva, N.P., Lobachev, K., and Petes, T.D. (2005). Chromosomal translocations in yeast induced by low levels of DNA polymerase: a model for chromosome fragile sites. Cell 120, 587-598.

Lengronne, A., and Schwob, E. (2002). The yeast CDK inhibitor Sic1 prevents genomic instability by promoting replication origin licensing in late G(1). Mol Cell 9, 1067-1078.

Lengsfeld, B.M., Rattray, A.J., Bhaskara, V., Ghirlando, R., and Paull, T.T. (2007). Sae2 is an endonuclease that processes hairpin DNA cooperatively with the Mre11/Rad50/Xrs2 complex. Mol Cell 28, 638-651.

Lesage, P., and Todeschini, A.L. (2005). Happy together: the life and times of Ty retrotransposons and their hosts. Cytogenet Genome Res 110, 70-90.

Letessier, A., Millot, G.A., Koundrioukoff, S., Lachages, A.M., Vogt, N., Hansen, R.S., Malfoy, B., Brison, O., and Debatisse, M. (2011). Cell-type-specific replication initiation programs set fragility of the FRA3B fragile site. Nature 470, 120-123.

Levin, D.S., Bai, W., Yao, N., O'Donnell, M., and Tomkinson, A.E. (1997). An interaction between DNA ligase I and proliferating cell nuclear antigen: implications for Okazaki fragment synthesis and joining. Proceedings of the National Academy of Sciences of the United States of America 94, 12863-12868.

Li, X.C., Schimenti, J.C., and Tye, B.K. (2009). Aneuploidy and improved growth are coincident but not causal in a yeast cancer model. PLoS Biol 7, e1000161.

Li, Z., Vizeacoumar, F.J., Bahr, S., Li, J., Warringer, J., Vizeacoumar, F.S., Min, R., Vandersluis, B., Bellay, J., Devit, M., et al. (2011). Systematic exploration of essential yeast gene function with temperature-sensitive mutants. Nat Biotechnol 29, 361-367.

Liang, Y., Cucchetti, M., Roncagalli, R., Yokosuka, T., Malzac, A., Bertosio, E., Imbert, J., Nijman, I.J., Suchanek, M., Saito, T., et al. (2013). The lymphoid lineage-specific actin-uncapping protein Rltpr is essential for costimulation via CD28 and the development of regulatory T cells. Nature immunology 14, 858-866.

Lisby, M., Barlow, J.H., Burgess, R.C., and Rothstein, R. (2004). Choreography of the DNA damage response: spatiotemporal relationships among checkpoint and repair proteins. Cell 118, 699-713.

Lisby, M., and Rothstein, R. (2004). DNA damage checkpoint and repair centers. Curr Opin Cell Biol 16, 328-334.

Lisby, M., and Rothstein, R. (2009). Choreography of recombination proteins during the DNA damage response. DNA repair 8, 1068-1076.

Liu, J., Luo, S., Zhao, H., Liao, J., Li, J., Yang, C., Xu, B., Stern, D.F., Xu, X., and Ye, K. (2012). Structural mechanism of the phosphorylation-dependent dimerization of the MDC1 forkhead-associated domain. Nucleic acids research 40, 3898-3912.

Liu, S., Bekker-Jensen, S., Mailand, N., Lukas, C., Bartek, J., and Lukas, J. (2006). Claspin operates downstream of TopBP1 to direct ATR signaling towards Chk1 activation. Molecular and cellular biology 26, 6056-6064.

Liu, S., Shiotani, B., Lahiri, M., Marechal, A., Tse, A., Leung, C.C., Glover, J.N., Yang, X.H., and Zou, L. (2011). ATR autophosphorylation as a molecular switch for checkpoint activation. Mol Cell 43, 192-202.

Liu, Y., Kao, H.I., and Bambara, R.A. (2004). Flap endonuclease 1: a central component of DNA metabolism. Annual review of biochemistry 73, 589-615.

Lopez-Contreras, A.J., Ruppen, I., Nieto-Soler, M., Murga, M., Rodriguez-Acebes, S., Remeseiro, S., Rodrigo-Perez, S., Rojas, A.M., Mendez, J., Munoz, J., et al. (2013). A proteomic characterization of factors enriched at nascent DNA molecules. Cell reports 3, 1105-1116.

Lou, Z., Minter-Dykhouse, K., Franco, S., Gostissa, M., Rivera, M.A., Celeste, A., Manis, J.P., van Deursen, J., Nussenzweig, A., Paull, T.T., et al. (2006). MDC1 maintains genomic stability by participating in the amplification of ATM-dependent DNA damage signals. Mol Cell 21, 187-200.

Lukas, J., and Bartek, J. (2009). DNA repair: New tales of an old tail. Nature 458, 581-583.

Luo, K., Yuan, J., and Lou, Z. (2011). Oligomerization of MDC1 protein is important for proper DNA damage response. The Journal of biological chemistry 286, 28192-28199.

Lydeard, J.R., Lipkin-Moore, Z., Sheu, Y.J., Stillman, B., Burgers, P.M., and Haber, J.E. (2010). Break-induced replication requires all essential DNA replication factors except those specific for pre-RC assembly. Genes & development 24, 1133-1144.

Maga, G., Stucki, M., Spadari, S., and Hubscher, U. (2000). DNA polymerase switching: I. Replication factor C displaces DNA polymerase alpha prior to PCNA loading. Journal of molecular biology 295, 791-801.

Mailand, N., Bekker-Jensen, S., Faustrup, H., Melander, F., Bartek, J., Lukas, C., and Lukas, J. (2007). RNF8 ubiquitylates histones at DNA double-strand breaks and promotes assembly of repair proteins. Cell 131, 887-900.

Mani, R.S., and Chinnaiyan, A.M. (2010). Triggers for genomic rearrangements: insights into genomic, cellular and environmental influences. Nature reviews Genetics 11, 819-829.

Mankouri, H.W., Huttner, D., and Hickson, I.D. (2013). How unfinished business from S-phase affects mitosis and beyond. EMBO J 32, 2661-2671.

Masai, H., Matsumoto, S., You, Z., Yoshizawa-Sugata, N., and Oda, M. (2010). Eukaryotic chromosome DNA replication: where, when, and how? Annual review of biochemistry 79, 89-130.

Masai, H., You, Z., and Arai, K. (2005). Control of DNA replication: regulation and activation of eukaryotic replicative helicase, MCM. IUBMB life 57, 323-335.

Matsuzaka, Y., Okamoto, K., Mabuchi, T., Iizuka, M., Ozawa, A., Oka, A., Tamiya, G., Kulski, J.K., and Inoko, H. (2004). Identification, expression analysis and polymorphism of a novel RLTPR gene encoding a RGD motif, tropomodulin domain and proline/leucine-rich regions. Gene 343, 291-304.

McInerney, P., Johnson, A., Katz, F., and O'Donnell, M. (2007). Characterization of a triple DNA polymerase replisome. Mol Cell 27, 527-538.

McKinnon, P.J., and Caldecott, K.W. (2007). DNA strand break repair and human genetic disease. Annu Rev Genomics Hum Genet 8, 37-55.

Mechali, M. (2010). Eukaryotic DNA replication origins: many choices for appropriate answers. Nature reviews Molecular cell biology 11, 728-738.

Meetei, A.R., Sechi, S., Wallisch, M., Yang, D., Young, M.K., Joenje, H., Hoatlin, M.E., and Wang, W. (2003). A multiprotein nuclear complex connects Fanconi anemia and Bloom syndrome. Molecular and cellular biology 23, 3417-3426.

Merchant, A.M., Kawasaki, Y., Chen, Y., Lei, M., and Tye, B.K. (1997). A lesion in the DNA replication initiation factor Mcm10 induces pausing of elongation forks through chromosomal replication origins in Saccharomyces cerevisiae. Mol Cell Biol 17, 3261-3271.

Michalet, X., Ekong, R., Fougerousse, F., Rousseaux, S., Schurra, C., Hornigold, N., van Slegtenhorst, M., Wolfe, J., Povey, S., Beckmann, J.S., et al. (1997). Dynamic molecular combing: stretching the whole human genome for high-resolution studies. Science 277, 1518-1523.

Mieczkowski, P.A., Lemoine, F.J., and Petes, T.D. (2006). Recombination between retrotransposons as a source of chromosome rearrangements in the yeast Saccharomyces cerevisiae. DNA Repair (Amst) 5, 1010-1020.

Mimitou, E.P., and Symington, L.S. (2008). Sae2, Exo1 and Sgs1 collaborate in DNA double-strand break processing. Nature 455, 770-774.

Minocherhomji, S., and Hickson, I.D. (2014). Structure-specific endonucleases: guardians of fragile site stability. Trends in cell biology 24, 321-327.

Mnaimneh, S., Davierwala, A.P., Haynes, J., Moffat, J., Peng, W.T., Zhang, W., Yang, X., Pootoolal, J., Chua, G., Lopez, A., et al. (2004). Exploration of essential gene functions via titratable promoter alleles. Cell 118, 31-44.

Moldovan, G.L., Pfander, B., and Jentsch, S. (2007). PCNA, the maestro of the replication fork. Cell 129, 665-679.

Mordes, D.A., Glick, G.G., Zhao, R., and Cortez, D. (2008). TopBP1 activates ATR through ATRIP and a PIKK regulatory domain. Genes & development 22, 1478-1489.

Moreau, S., Morgan, E.A., and Symington, L.S. (2001). Overlapping functions of the Saccharomyces cerevisiae Mre11, Exo1 and Rad27 nucleases in DNA metabolism. Genetics 159, 1423-1433.

Moses, J.E., and Moorhouse, A.D. (2007). The growing applications of click chemistry. Chemical Society reviews 36, 1249-1262.

Motycka, T.A., Bessho, T., Post, S.M., Sung, P., and Tomkinson, A.E. (2004). Physical and functional interaction between the XPF/ERCC1 endonuclease and hRad52. The Journal of biological chemistry 279, 13634-13639.

Moyer, S.E., Lewis, P.W., and Botchan, M.R. (2006). Isolation of the Cdc45/Mcm2-7/GINS (CMG) complex, a candidate for the eukaryotic DNA replication fork helicase. Proceedings of the National Academy of Sciences of the United States of America 103, 10236-10241.

Mrasek, K., Schoder, C., Teichmann, A.C., Behr, K., Franze, B., Wilhelm, K., Blaurock, N., Claussen, U., Liehr, T., and Weise, A. (2010). Global screening and extended nomenclature for 230 aphidicolin-inducible fragile sites, including 61 yet unreported ones. International journal of oncology 36, 929-940.

Mullen, J.R., Nallaseth, F.S., Lan, Y.Q., Slagle, C.E., and Brill, S.J. (2005). Yeast Rmi1/Nce4 controls genome stability as a subunit of the Sgs1-Top3 complex. Molecular and cellular biology 25, 4476-4487.

Munoz, I.M., Hain, K., Declais, A.C., Gardiner, M., Toh, G.W., Sanchez-Pulido, L., Heuckmann, J.M., Toth, R., Macartney, T., Eppink, B., et al. (2009). Coordination of structure-specific nucleases by human SLX4/BTBD12 is required for DNA repair. Mol Cell 35, 116-127.

Nick McElhinny, S.A., Gordenin, D.A., Stith, C.M., Burgers, P.M., and Kunkel, T.A. (2008). Division of labor at the eukaryotic replication fork. Mol Cell 30, 137-144.

Niida, H., Katsuno, Y., Banerjee, B., Hande, M.P., and Nakanishi, M. (2007). Specific role of Chk1 phosphorylations in cell survival and checkpoint activation. Molecular and cellular biology 27, 2572-2581.

O'Sullivan, R.J., and Karlseder, J. (2010). Telomeres: protecting chromosomes against genome instability. Nature reviews Molecular cell biology 11, 171-181.

Ohta, M., Inoue, H., Cotticelli, M.G., Kastury, K., Baffa, R., Palazzo, J., Siprashvili, Z., Mori, M., McCue, P., Druck, T., et al. (1996). The FHIT gene, spanning the chromosome 3p14.2 fragile site and renal carcinoma-associated t(3;8) breakpoint, is abnormal in digestive tract cancers. Cell 84, 587-597.

Ohta, S., Shiomi, Y., Sugimoto, K., Obuse, C., and Tsurimoto, T. (2002). A proteomics approach to identify proliferating cell nuclear antigen (PCNA)-binding proteins in human cell lysates. Identification of the human CHL12/RFCs2-5 complex as a novel PCNA-binding protein. The Journal of biological chemistry 277, 40362-40367.

Osborne, M.A., Schlenstedt, G., Jinks, T., and Silver, P.A. (1994). Nuf2, a spindle body-associated protein required for nuclear division in yeast. J Cell Biol 125, 853-866.

Outwin, E.A., Irmisch, A., Murray, J.M., and O'Connell, M.J. (2009). Smc5-Smc6-dependent removal of cohesin from mitotic chromosomes. Mol Cell Biol 29, 4363-4375.

Ouyang, K.J., Woo, L.L., Zhu, J., Huo, D., Matunis, M.J., and Ellis, N.A. (2009). SUMO modification regulates BLM and RAD51 interaction at damaged replication forks. PLoS biology 7, e1000252.

Ozeri-Galai, E., Lebofsky, R., Rahat, A., Bester, A.C., Bensimon, A., and Kerem, B. (2011). Failure of origin activation in response to fork stalling leads to chromosomal instability at fragile sites. Mol Cell 43, 122-131.

Pacek, M., Tutter, A.V., Kubota, Y., Takisawa, H., and Walter, J.C. (2006). Localization of MCM2-7, Cdc45, and GINS to the site of DNA unwinding during eukaryotic DNA replication. Mol Cell 21, 581-587.

Pacek, M., and Walter, J.C. (2004). A requirement for MCM7 and Cdc45 in chromosome unwinding during eukaryotic DNA replication. EMBO J 23, 3667-3676.

Palakodeti, A., Han, Y., Jiang, Y., and Le Beau, M.M. (2004). The role of late/slow replication of the FRA16D in common fragile site induction. Genes Chromosomes Cancer 39, 71-76.

Panier, S., and Boulton, S.J. (2014). Double-strand break repair: 53BP1 comes into focus. Nature reviews Molecular cell biology 15, 7-18.

Pardo, B., Gomez-Gonzalez, B., and Aguilera, A. (2009). DNA repair in mammalian cells: DNA double-strand break repair: how to fix a broken relationship. Cellular and molecular life sciences : CMLS 66, 1039-1056.

Paulsen, R.D., and Cimprich, K.A. (2007). The ATR pathway: fine-tuning the fork. DNA repair 6, 953-966.

Paulsen, R.D., Soni, D.V., Wollman, R., Hahn, A.T., Yee, M.C., Guan, A., Hesley, J.A., Miller, S.C., Cromwell, E.F., Solow-Cordero, D.E., et al. (2009). A genome-wide siRNA screen reveals diverse cellular processes and pathways that mediate genome stability. Mol Cell 35, 228-239.

Pebernard, S., Perry, J.J., Tainer, J.A., and Boddy, M.N. (2008). Nse1 RING-like domain supports functions of the Smc5-Smc6 holocomplex in genome stability. Mol Biol Cell 19, 4099-4109.

Pepe, A., and West, S.C. (2014). MUS81-EME2 promotes replication fork restart. Cell reports 7, 1048-1055.

Perry, P., and Wolff, S. (1974). New Giemsa method for the differential staining of sister chromatids. Nature 251, 156-158.

Pichierri, P., Franchitto, A., and Rosselli, F. (2004). BLM and the FANC proteins collaborate in a common pathway in response to stalled replication forks. EMBO J 23, 3154-3163.

Postow, L., Woo, E.M., Chait, B.T., and Funabiki, H. (2009). Identification of SMARCAL1 as a component of the DNA damage response. The Journal of biological chemistry 284, 35951-35961.

Potts, P.R., Porteus, M.H., and Yu, H. (2006). Human SMC5/6 complex promotes sister chromatid homologous recombination by recruiting the SMC1/3 cohesin complex to double-strand breaks. EMBO J 25, 3377-3388.

Pursell, Z.F., Isoz, I., Lundstrom, E.B., Johansson, E., and Kunkel, T.A. (2007). Yeast DNA polymerase epsilon participates in leading-strand DNA replication. Science 317, 127-130.

Rao, V.A., Conti, C., Guirouilh-Barbat, J., Nakamura, A., Miao, Z.H., Davies, S.L., Sacca, B., Hickson, I.D., Bensimon, A., and Pommier, Y. (2007). Endogenous gamma-H2AX-ATM-Chk2 checkpoint activation in Bloom's syndrome helicase deficient cells is related to DNA replication arrested forks. Molecular cancer research : MCR 5, 713-724.

Rao, V.A., Fan, A.M., Meng, L., Doe, C.F., North, P.S., Hickson, I.D., and Pommier, Y. (2005). Phosphorylation of BLM, dissociation from topoisomerase IIIalpha, and colocalization with gamma-H2AX after topoisomerase I-induced replication damage. Molecular and cellular biology 25, 8925-8937.

Rasala, B.A., Orjalo, A.V., Shen, Z., Briggs, S., and Forbes, D.J. (2006). ELYS is a dual nucleoporin/kinetochore protein required for nuclear pore assembly and proper cell division. Proceedings of the National Academy of Sciences of the United States of America 103, 17801-17806.

Raveendranathan, M., Chattopadhyay, S., Bolon, Y.T., Haworth, J., Clarke, D.J., and Bielinsky, A.K. (2006). Genome-wide replication profiles of S-phase checkpoint mutants reveal fragile sites in yeast. EMBO J 25, 3627-3639.

Raynard, S., Zhao, W., Bussen, W., Lu, L., Ding, Y.Y., Busygina, V., Meetei, A.R., and Sung, P. (2008). Functional role of BLAP75 in BLM-topoisomerase IIIalpha-dependent Holliday junction processing. The Journal of biological chemistry 283, 15701-15708.

Roeder, G.S., and Fink, G.R. (1980). DNA rearrangements associated with a transposable element in yeast. Cell 21, 239-249.

Rogakou, E.P., Pilch, D.R., Orr, A.H., Ivanova, V.S., and Bonner, W.M. (1998). DNA double-stranded breaks induce histone H2AX phosphorylation on serine 139. The Journal of biological chemistry 273, 5858-5868.

Roux, K.J., Kim, D.I., Raida, M., and Burke, B. (2012). A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology 196, 801-810.

Rowley, J.D. (1973). Letter: A new consistent chromosomal abnormality in chronic myelogenous leukaemia identified by quinacrine fluorescence and Giemsa staining. Nature 243, 290-293.

Saintigny, Y., Delacote, F., Vares, G., Petitot, F., Lambert, S., Averbeck, D., and Lopez, B.S. (2001). Characterization of homologous recombination induced by replication inhibition in mammalian cells. EMBO J 20, 3861-3870.

Santa Maria, S.R., Gangavarapu, V., Johnson, R.E., Prakash, L., and Prakash, S. (2007). Requirement of Nse1, a subunit of the Smc5-Smc6 complex, for Rad52-dependent postreplication repair of UV-damaged DNA in Saccharomyces cerevisiae. Mol Cell Biol 27, 8409-8418.

Santos-Pereira, J.M., Herrero, A.B., Garcia-Rubio, M.L., Marin, A., Moreno, S., and Aguilera, A. (2013). The Npl3 hnRNP prevents R-loop-mediated transcription-replication conflicts and genome instability. Genes & development 27, 2445-2458.

Saparbaev, M., Prakash, L., and Prakash, S. (1996). Requirement of mismatch repair genes MSH2 and MSH3 in the RAD1-RAD10 pathway of mitotic recombination in Saccharomyces cerevisiae. Genetics 142, 727-736.

Sarbajna, S., and West, S.C. (2014). Holliday junction processing enzymes as guardians of genome stability. Trends in biochemical sciences 39, 409-419.

Sartori, A.A., Lukas, C., Coates, J., Mistrik, M., Fu, S., Bartek, J., Baer, R., Lukas, J., and Jackson, S.P. (2007). Human CtIP promotes DNA end resection. Nature 450, 509-514.

Savic, V., Yin, B., Maas, N.L., Bredemeyer, A.L., Carpenter, A.C., Helmink, B.A., Yang-Iott, K.S., Sleckman, B.P., and Bassing, C.H. (2009). Formation of dynamic gamma-H2AX domains along broken DNA strands is distinctly regulated by ATM and MDC1 and dependent upon H2AX densities in chromatin. Mol Cell 34, 298-310.

Schuldiner, M., Collins, S.R., Thompson, N.J., Denic, V., Bhamidipati, A., Punna, T., Ihmels, J., Andrews, B., Boone, C., Greenblatt, J.F., et al. (2005). Exploration of the function and organization of the yeast early secretory pathway through an epistatic miniarray profile. Cell 123, 507-519.

Schulze, J.M., Wang, A.Y., and Kobor, M.S. (2009). YEATS domain proteins: a diverse family with many links to chromatin modification and transcription. Biochemistry and cell biology = Biochimie et biologie cellulaire 87, 65-75.

Schwartz, M., Zlotorynski, E., Goldberg, M., Ozeri, E., Rahat, A., le Sage, C., Chen, B.P., Chen, D.J., Agami, R., and Kerem, B. (2005). Homologous recombination and nonhomologous end-joining repair pathways regulate fragile site stability. Genes Dev 19, 2715-2726.

Shen, Z., and Prasanth, S.G. (2012). Emerging players in the initiation of eukaryotic DNA replication. Cell division 7, 22.

Sherman, F. (1991). Getting started with yeast. Methods Enzymol 194, 3-21.

Shiloh, Y., and Ziv, Y. (2013). The ATM protein kinase: regulating the cellular response to genotoxic stress, and more. Nature reviews Molecular cell biology 14, 197-210.

Shimada, K., Pasero, P., and Gasser, S.M. (2002). ORC and the intra-S-phase checkpoint: a threshold regulates Rad53p activation in S phase. Genes Dev 16, 3236-3252.

Shor, E., Weinstein, J., and Rothstein, R. (2005). A genetic screen for top3 suppressors in Saccharomyces cerevisiae identifies SHU1, SHU2, PSY3 and CSM2: four genes involved in error-free DNA repair. Genetics 169, 1275-1289.

Singh, T.R., Ali, A.M., Busygina, V., Raynard, S., Fan, Q., Du, C.H., Andreassen, P.R., Sung, P., and Meetei, A.R. (2008). BLAP18/RMI2, a novel OB-fold-containing protein, is an essential component of the Bloom helicase-double Holliday junction dissolvasome. Genes & development 22, 2856-2868.

Sirbu, B.M., Couch, F.B., and Cortez, D. (2012). Monitoring the spatiotemporal dynamics of proteins at replication forks and in assembled chromatin using isolation of proteins on nascent DNA. Nature protocols 7, 594-605.

Sirbu, B.M., Couch, F.B., Feigerle, J.T., Bhaskara, S., Hiebert, S.W., and Cortez, D. (2011). Analysis of protein dynamics at active, stalled, and collapsed replication forks. Genes & development 25, 1320-1327.

Sirbu, B.M., McDonald, W.H., Dungrawala, H., Badu-Nkansah, A., Kavanaugh, G.M., Chen, Y., Tabb, D.L., and Cortez, D. (2013). Identification of proteins at active, stalled, and collapsed replication forks using isolation of proteins on nascent DNA (iPOND) coupled with mass spectrometry. The Journal of biological chemistry 288, 31458-31467.

Skourti-Stathaki, K., and Proudfoot, N.J. (2014). A double-edged sword: R loops as threats to genome integrity and powerful regulators of gene expression. Genes & development 28, 1384-1396.

Smith, S., Hwang, J.Y., Banerjee, S., Majeed, A., Gupta, A., and Myung, K. (2004). Mutator genes for suppression of gross chromosomal rearrangements identified by a genome-wide screening in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 101, 9039-9044.

Smits, V.A., Reaper, P.M., and Jackson, S.P. (2006). Rapid PIKK-dependent release of Chk1 from chromatin promotes the DNA-damage checkpoint response. Current biology : CB 16, 150-159.

Sogo, J.M., Lopes, M., and Foiani, M. (2002). Fork reversal and ssDNA accumulation at stalled replication forks owing to checkpoint defects. Science 297, 599-602.

Solsbacher, J., Maurer, P., Bischoff, F.R., and Schlenstedt, G. (1998). Cse1p is involved in export of yeast importin alpha from the nucleus. Mol Cell Biol 18, 6805-6815.

Song, B., and Sung, P. (2000). Functional interactions among yeast Rad51 recombinase, Rad52 mediator, and replication protein A in DNA strand exchange. The Journal of biological chemistry 275, 15895-15904.

Steed, E., Elbediwy, A., Vacca, B., Dupasquier, S., Hemkemeyer, S.A., Suddason, T., Costa, A.C., Beaudry, J.B., Zihni, C., Gallagher, E., et al. (2014). MarvelD3 couples tight junctions to the MEKK1-JNK pathway to regulate cell behavior and survival. The Journal of cell biology 204, 821-838.

Stewart, J.A., Campbell, J.L., and Bambara, R.A. (2006). Flap endonuclease disengages Dna2 helicase/nuclease from Okazaki fragment flaps. The Journal of biological chemistry 281, 38565-38572.

Stirling, P.C., Bloom, M.S., Solanki-Patil, T., Smith, S., Sipahimalani, P., Li, Z., Kofoed, M., Ben-Aroya, S., Myung, K., and Hieter, P. (2011). The complete spectrum of yeast chromosome instability genes identifies candidate CIN cancer genes and functional roles for ASTRA complex components. PLoS Genet 7, e1002057.

Stirling, P.C., Chan, Y.A., Minaker, S.W., Aristizabal, M.J., Barrett, I., Sipahimalani, P., Kobor, M.S., and Hieter, P. (2012). R-loop-mediated genome instability in mRNA cleavage and polyadenylation mutants. Genes Dev 26, 163-175.

Stracker, T.H., and Petrini, J.H. (2011). The MRE11 complex: starting from the ends. Nature reviews Molecular cell biology 12, 90-103.

Strathern, J., Hicks, J., and Herskowitz, I. (1981). Control of cell type in yeast by the mating type locus. The alpha 1-alpha 2 hypothesis. J Mol Biol 147, 357-372.

Strom, L., and Sjogren, C. (2007). Chromosome segregation and double-strand break repair - a complex connection. Curr Opin Cell Biol 19, 344-349.

Stucki, M., Clapperton, J.A., Mohammad, D., Yaffe, M.B., Smerdon, S.J., and Jackson, S.P. (2005). MDC1 directly binds phosphorylated histone H2AX to regulate cellular responses to DNA double-strand breaks. Cell 123, 1213-1226.

Sugawara, N., Wang, X., and Haber, J.E. (2003). In vivo roles of Rad52, Rad54, and Rad55 proteins in Rad51-mediated recombination. Mol Cell 12, 209-219.

Sung, P. (1997). Yeast Rad55 and Rad57 proteins form a heterodimer that functions with replication protein A to promote DNA strand exchange by Rad51 recombinase. Genes & development 11, 1111-1121.

Svendsen, J.M., Smogorzewska, A., Sowa, M.E., O'Connell, B.C., Gygi, S.P., Elledge, S.J., and Harper, J.W. (2009). Mammalian BTBD12/SLX4 assembles a Holliday junction resolvase and is required for DNA repair. Cell 138, 63-77.

Tak, Y.S., Tanaka, Y., Endo, S., Kamimura, Y., and Araki, H. (2006). A CDK-catalysed regulatory phosphorylation for formation of the DNA replication complex Sld2-Dpb11. EMBO J 25, 1987-1996.

Takeda, D.Y., and Dutta, A. (2005). DNA replication and progression through S phase. Oncogene 24, 2827-2843.

Takeda, D.Y., Parvin, J.D., and Dutta, A. (2005). Degradation of Cdt1 during S phase is Skp2-independent and is required for efficient progression of mammalian cells through S phase. The Journal of biological chemistry 280, 23416-23423.

Takeuchi, Y., Horiuchi, T., and Kobayashi, T. (2003). Transcription-dependent recombination and the role of fork collision in yeast rDNA. Genes & development 17, 1497-1506.

Tan, B.C., Chien, C.T., Hirose, S., and Lee, S.C. (2006). Functional cooperation between FACT and MCM helicase facilitates initiation of chromatin DNA replication. EMBO J 25, 3975-3985.

Tanaka, S., and Diffley, J.F. (2002a). Deregulated G1-cyclin expression induces genomic instability by preventing efficient pre-RC formation. Genes Dev 16, 2639-2649.

Tanaka, S., and Diffley, J.F. (2002b). Interdependent nuclear accumulation of budding yeast Cdt1 and Mcm2-7 during G1 phase. Nature cell biology 4, 198-207.

Tanaka, S., Umemori, T., Hirai, K., Muramatsu, S., Kamimura, Y., and Araki, H. (2007). CDK-dependent phosphorylation of Sld2 and Sld3 initiates DNA replication in budding yeast. Nature 445, 328-332.

Taylor, E.M., Copsey, A.C., Hudson, J.J., Vidot, S., and Lehmann, A.R. (2008). Identification of the proteins, including MAGEG1, that make up the human SMC5-6 protein complex. Molecular and cellular biology 28, 1197-1206.

Taylor, E.R., and McGowan, C.H. (2008). Cleavage mechanism of human Mus81-Eme1 acting on Holliday-junction structures. Proceedings of the National Academy of Sciences of the United States of America 105, 3757-3762.

Toledo, L.I., Altmeyer, M., Rask, M.B., Lukas, C., Larsen, D.H., Povlsen, L.K., Bekker-Jensen, S., Mailand, N., Bartek, J., and Lukas, J. (2013). ATR prohibits replication catastrophe by preventing global exhaustion of RPA. Cell 155, 1088-1103.

Tsantoulis, P.K., Kotsinas, A., Sfikakis, P.P., Evangelou, K., Sideridou, M., Levy, B., Mo, L., Kittas, C., Wu, X.R., Papavassiliou, A.G., et al. (2008). Oncogene-induced replication stress preferentially targets common fragile sites in preneoplastic lesions. A genome-wide study. Oncogene 27, 3256-3264.

Tsubouchi, H., and Ogawa, H. (2000). Exo1 roles for repair of DNA double-strand breaks and meiotic crossing over in Saccharomyces cerevisiae. Molecular biology of the cell 11, 2221-2233.

Tuduri, S., Crabbe, L., Conti, C., Tourriere, H., Holtgreve-Grez, H., Jauch, A., Pantesco, V., De Vos, J., Thomas, A., Theillet, C., et al. (2009). Topoisomerase I suppresses genomic instability by preventing interference between replication and transcription. Nature cell biology 11, 1315-1324.

Umezu, K., Hiraoka, M., Mori, M., and Maki, H. (2002). Structural analysis of aberrant chromosomes that occur spontaneously in diploid Saccharomyces cerevisiae: retrotransposon Ty1 plays a crucial role in chromosomal rearrangements. Genetics 160, 97-110.

Unsal-Kacmaz, K., Chastain, P.D., Qu, P.P., Minoo, P., Cordeiro-Stone, M., Sancar, A., and Kaufmann, W.K. (2007). The human Tim/Tipin complex coordinates an Intra-S checkpoint response to UV that slows replication fork displacement. Molecular and cellular biology 27, 3131-3142.

van Deursen, F., Sengupta, S., De Piccoli, G., Sanchez-Diaz, A., and Labib, K. (2012). Mcm10 associates with the loaded DNA helicase at replication origins and defines a novel step in its activation. EMBO J 31, 2195-2206.

Van Itallie, C.M., Tietgens, A.J., Aponte, A., Fredriksson, K., Fanning, A.S., Gucek, M., and Anderson, J.M. (2014). Biotin ligase tagging identifies proteins proximal to E-cadherin, including lipoma preferred partner, a regulator of epithelial cell-cell and cell-substrate adhesion. Journal of cell science 127, 885-895.

Veaute, X., Jeusset, J., Soustelle, C., Kowalczykowski, S.C., Le Cam, E., and Fabre, F. (2003). The Srs2 helicase prevents recombination by disrupting Rad51 nucleoprotein filaments. Nature 423, 309-312.

Vernon, M., Lobachev, K., and Petes, T.D. (2008). High rates of "unselected" aneuploidy and chromosome rearrangements in tel1 mec1 haploid yeast strains. Genetics 179, 237-247.

Vinciguerra, P., Godinho, S.A., Parmar, K., Pellman, D., and D'Andrea, A.D. (2010). Cytokinesis failure occurs in Fanconi anemia pathway-deficient murine and human bone marrow hematopoietic cells. The Journal of clinical investigation 120, 3834-3842.

Walter, J., and Newport, J. (2000). Initiation of eukaryotic DNA replication: origin unwinding and sequential chromatin association of Cdc45, RPA, and DNA polymerase alpha. Mol Cell 5, 617-627.

Wan, C., Kulkarni, A., and Wang, Y.H. (2010). ATR preferentially interacts with common fragile site FRA3B and the binding requires its kinase activity in response to aphidicolin treatment. Mutation research 686, 39-46.

Wang, B., Matsuoka, S., Ballif, B.A., Zhang, D., Smogorzewska, A., Gygi, S.P., and Elledge, S.J. (2007). Abraxas and RAP80 form a BRCA1 protein complex required for the DNA damage response. Science 316, 1194-1198.

Wang, F., Yang, Y., Singh, T.R., Busygina, V., Guo, R., Wan, K., Wang, W., Sung, P., Meetei, A.R., and Lei, M. (2010). Crystal structures of RMI1 and RMI2, two OB-fold regulatory subunits of the BLM complex. Structure 18, 1159-1170.

Wang, J., Chen, J., and Gong, Z. (2013). TopBP1 controls BLM protein level to maintain genome stability. Mol Cell 52, 667-678.

Wang, Y., Cortez, D., Yazdi, P., Neff, N., Elledge, S.J., and Qin, J. (2000). BASC, a super complex of BRCA1-associated proteins involved in the recognition and repair of aberrant DNA structures. Genes & development 14, 927-939.

Wang, Y.L., Faiola, F., Xu, M., Pan, S., and Martinez, E. (2008). Human ATAC Is a GCN5/PCAF-containing acetylase complex with a novel NC2-like histone fold module that interacts with the TATA-binding protein. The Journal of biological chemistry 283, 33808-33815.

Weston, R., Peeters, H., and Ahel, D. (2012). ZRANB3 is a structure-specific ATP-dependent endonuclease involved in replication stress response. Genes & development 26, 1558-1572.

Wigge, P.A., and Kilmartin, J.V. (2001). The Ndc80p complex from Saccharomyces cerevisiae contains conserved centromere components and has a function in chromosome segregation. J Cell Biol 152, 349-360.

Wilson, T.E., Grawunder, U., and Lieber, M.R. (1997). Yeast DNA ligase IV mediates non-homologous DNA end joining. Nature 388, 495-498.

Wold, M.S. (1997). Replication protein A: a heterotrimeric, single-stranded DNA-binding protein required for eukaryotic DNA metabolism. Annual review of biochemistry 66, 61-92.

Wu, L., Bachrati, C.Z., Ou, J., Xu, C., Yin, J., Chang, M., Wang, W., Li, L., Brown, G.W., and Hickson, I.D. (2006). BLAP75/RMI1 promotes the BLM-dependent dissolution of homologous recombination intermediates. Proceedings of the National Academy of Sciences of the United States of America 103, 4068-4073.

Wu, L., Chan, K.L., Ralf, C., Bernstein, D.A., Garcia, P.L., Bohr, V.A., Vindigni, A., Janscak, P., Keck, J.L., and Hickson, I.D. (2005). The HRDC domain of BLM is required for the dissolution of double Holliday junctions. EMBO J 24, 2679-2687.

Wu, L., Davies, S.L., North, P.S., Goulaouic, H., Riou, J.F., Turley, H., Gatter, K.C., and Hickson, I.D. (2000). The Bloom's syndrome gene product interacts with topoisomerase III. The Journal of biological chemistry 275, 9636-9644.

Wu, L., and Hickson, I.D. (2003). The Bloom's syndrome helicase suppresses crossing over during homologous recombination. Nature 426, 870-874.

Xiao, Z., McGrew, J.T., Schroeder, A.J., and Fitzgerald-Hayes, M. (1993). CSE1 and CSE2, two new genes required for accurate mitotic chromosome segregation in Saccharomyces cerevisiae. Mol Cell Biol 13, 4691-4702.

Xu, D., Guo, R., Sobeck, A., Bachrati, C.Z., Yang, J., Enomoto, T., Brown, G.W., Hoatlin, M.E., Hickson, I.D., and Wang, W. (2008). RMI, a new OB-fold complex essential for Bloom syndrome protein to maintain genome stability. Genes & development 22, 2843-2855.

Xu, D., Muniandy, P., Leo, E., Yin, J., Thangavel, S., Shen, X., Ii, M., Agama, K., Guo, R., Fox, D., 3rd, et al. (2010). Rif1 provides a new DNA-binding interface for the Bloom syndrome complex to maintain normal replication. EMBO J 29, 3140-3155.

Yabuuchi, H., Yamada, Y., Uchida, T., Sunathvanichkul, T., Nakagawa, T., and Masukata, H. (2006). Ordered assembly of Sld3, GINS and Cdc45 is distinctly regulated by DDK and CDK for activation of replication origins. EMBO J 25, 4663-4674.

Yang, J., Bachrati, C.Z., Ou, J., Hickson, I.D., and Brown, G.W. (2010). Human topoisomerase IIIalpha is a single-stranded DNA decatenase that is stimulated by BLM and RMI1. The Journal of biological chemistry 285, 21426-21436.

Yang, J., O'Donnell, L., Durocher, D., and Brown, G.W. (2012). RMI1 promotes DNA replication fork progression and recovery from replication fork stress. Molecular and cellular biology 32, 3054-3064.

Yang, Y., McBride, K.M., Hensley, S., Lu, Y., Chedin, F., and Bedford, M.T. (2014). Arginine methylation facilitates the recruitment of TOP3B to chromatin to prevent R loop accumulation. Mol Cell 53, 484-497.

Yankiwski, V., Marciniak, R.A., Guarente, L., and Neff, N.F. (2000). Nuclear structure in normal and Bloom syndrome cells. Proceedings of the National Academy of Sciences of the United States of America 97, 5214-5219.

Yin, J., Sobeck, A., Xu, C., Meetei, A.R., Hoatlin, M., Li, L., and Wang, W. (2005). BLAP75, an essential component of Bloom's syndrome protein complexes that maintain genome integrity. EMBO J 24, 1465-1476.

Yoshizawa-Sugata, N., and Masai, H. (2007). Human Tim/Timeless-interacting protein, Tipin, is required for efficient progression of S phase and DNA replication checkpoint. The Journal of biological chemistry 282, 2729-2740.

You, Z., and Masai, H. (2008). Cdt1 forms a complex with the minichromosome maintenance protein (MCM) and activates its helicase activity. The Journal of biological chemistry 283, 24469-24477.

Yu, L., Pena Castillo, L., Mnaimneh, S., Hughes, T.R., and Brown, G.W. (2006). A survey of essential gene function in the yeast cell division cycle. Molecular biology of the cell 17, 4736-4747.

Yuan, J., Ghosal, G., and Chen, J. (2012). The HARP-like domain-containing protein AH2/ZRANB3 binds to PCNA and participates in cellular response to replication stress. Mol Cell 47, 410-421.

Yuen, K.W., Warren, C.D., Chen, O., Kwok, T., Hieter, P., and Spencer, F.A. (2007). Systematic genome instability screens in yeast and their potential relevance to cancer. Proc Natl Acad Sci U S A 104, 3925-3930.

Yusufzai, T., Kong, X., Yokomori, K., and Kadonaga, J.T. (2009). The annealing helicase HARP is recruited to DNA repair sites via an interaction with RPA. Genes & development 23, 2400-2404.

Zeman, M.K., and Cimprich, K.A. (2014). Causes and consequences of replication stress. Nature cell biology 16, 2-9.

Zhang, H., and Freudenreich, C.H. (2007). An AT-rich sequence in human common fragile site FRA16D causes fork stalling and chromosome breakage in S. cerevisiae. Mol Cell 27, 367-379.

Zhu, Z., Chung, W.H., Shim, E.Y., Lee, S.E., and Ira, G. (2008). Sgs1 helicase and two nucleases Dna2 and Exo1 resect DNA double-strand break ends. Cell 134, 981-994.

Zlotorynski, E., Rahat, A., Skaug, J., Ben-Porat, N., Ozeri, E., Hershberg, R., Levi, A., Scherer, S.W., Margalit, H., and Kerem, B. (2003). Molecular basis for expression of common and rare fragile sites. Mol Cell Biol 23, 7143-7151.

Zou, L., Cortez, D., and Elledge, S.J. (2002). Regulation of ATR substrate selection by Rad17-dependent loading of Rad9 complexes onto chromatin. Genes & development 16, 198-208.

Zou, L., and Elledge, S.J. (2003). Sensing DNA damage through ATRIP recognition of RPA-ssDNA complexes. Science 300, 1542-1548.

Zou, L., Liu, D., and Elledge, S.J. (2003). Replication protein A-mediated recruitment and activation of Rad17 complexes. Proceedings of the National Academy of Sciences of the United States of America 100, 13827-13832.