Essential Saccharomyces cerevisiae genome instability suppressing genes identify potential human tumor suppressors

Anjana Srivatsan^a, Binzhong Li^a, Dafne N. Sanchez^a, Steven B. Somach^a, Vandeclecio L. da Silva^b,c, Sandro J. de Souza^b,c, Christopher D. Putnam^a,d, and Richard D. Kolodner^a,e,f,g,1

^a Ludwig Institute for Cancer Research, University of California San Diego School of Medicine, La Jolla, CA 92093-0669; ^b Bioinformatics Multidisciplinary Environment, Instituto Metrópole Digital–Universidade Federal do Rio Grande do Norte, Natal, Brazil 59082-180; ^c Instituto do Cérebro–Universidade Federal do Rio Grande do Norte, Natal, Brazil 59082-180; ^d Department of Medicine, University of California San Diego School of Medicine, La Jolla, CA 92093-0669; ^e Department of Cellular and Molecular Medicine, University of California San Diego School of Medicine, La Jolla, CA 92093-0669; ^f Moores Cancer Center, University of California San Diego School of Medicine, La Jolla, CA 92093-0669; and ^g Institute of Genomic Medicine, University of California San Diego School of Medicine, La Jolla, CA 92093-0669

Contributed by Richard D. Kolodner, July 12, 2019 (sent for review April 23, 2019; reviewed by Marco Foiani and Wolf-Dietrich Heyer)

Gross Chromosomal Rearrangements (GCRs) play an important role in human diseases, including cancer. Although most of the nonessential Genome Instability Suppressing (GIS) genes in Saccharomyces cerevisiae are known, the essential genes in which mutations can cause increased GCR rates are not well understood. Here, 2 S. cerevisiae GCR assays were used to screen a targeted collection of temperature-sensitive mutants to identify mutations that caused increased GCR rates. This identified 94 essential GIS (eGIS) genes in which mutations cause increased GCR rates and 38 candidate eGIS genes that encode eGIS1 protein-interacting or family member proteins. Analysis of TCGA data using the human genes predicted to encode the proteins and protein complexes implicated by the S. cerevisiae eGIS genes revealed a significant enrichment of mutations affecting predicted human eGIS genes in 10 of the 16 cancers analyzed.

genome instability | dynamics and replication | cancer

Significance

By performing a targeted genetic screen of temperature-sensitive mutations, this study identified 94 essential Saccharomyces cerevisiae genome instability suppressing (eGIS) genes and 38 candidate eGIS genes. Analysis of The Cancer Genome Atlas data demonstrated that mutations in the human homologues of the S. cerevisiae eGIS genes were significantly enriched in 10 different human cancers. These results provide insights into the origin of genome instability in human cancers and provide tools for identifying and evaluating mutations that contribute to the development of cancer.

Author contributions: A.S., B.L., S.J.d.S., C.D.P., and R.D.K. designed research; A.S., B.L., D.N.S., S.B.S., V.L.d.S., S.J.d.S., C.D.P., and R.D.K. performed research; A.S., B.L., V.L.d.S., S.J.d.S., C.D.P., and R.D.K. analyzed data; A.S., S.J.d.S., C.D.P., and R.D.K. wrote the paper; and R.D.K. supervised the entire project.

Reviewers: M.F., Italian Foundation for Cancer Research and University of Milan; and W.-D.H., University of California, Davis.

The authors declare no conflict of interest.

Published under the PNAS license.

1 To whom correspondence may be addressed. Email: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1906921116/-/DCSupplemental.

Published online August 13, 2019.

The genetic instability that occurs in many cancers is thought to play a critical role in the development and progression of tumors and falls into 3 general categories (1, 2): accumulation of mutations resulting from environmental mutagens, defects in DNA mismatch repair genes, defects that reduce the fidelity of DNA polymerases, and increased levels of cytosine deaminases (3–6); accumulation of genome rearrangements such as translocations and copy number changes (2, 7); and accumulation of changes in chromosome number (8). Our understanding of the genes that suppress genome rearrangements in cancer comes from the study of inherited defects causing cancer susceptibility syndromes such as Fanconi anemia and the BRCA1- and BRCA2-defective breast and ovarian cancer syndromes (9, 10). In addition, cancer genome sequencing projects have identified mutations in candidate Genome Instability Suppressing (GIS) genes, most of which were identified in studies of model organisms (11). However, our understanding of the causes of genome rearrangements in mammalian cells is incomplete, in part because it is difficult to perform genetic screens to identify and study GIS genes in mammalian cells.

Genetic studies in Saccharomyces cerevisiae have provided considerable insight into mechanisms that promote and prevent spontaneous genome rearrangements (12). Such studies were made possible by the development of quantitative genetic assays that allow measurement of the rate of accumulation of Gross Chromosomal Rearrangements (GCRs) (13–18) and allow detection of a diversity of types of GCRs (13, 14, 19–24). Overall, the types of genome rearrangements selected in GCR assays resemble those seen in human diseases, including cancer (discussed in ref. 12). In addition, GCR assays have been used to identify genes that prevent GCRs and that alter the types of GCRs formed (13–17, 22, 25–34). These studies have shown that a combination of oxidative defense, DNA replication machinery, DNA repair, cell cycle checkpoint, telomere maintenance, RNA processing, and chromatin modification/remodeling and assembly function in concert to prevent GCRs (12).

Most genes and pathways that prevent and form GCRs in model organisms have been identified through analysis of nonessential genes (12–15, 17, 22, 25, 26, 35, 36). These studies have identified 182 genes that suppress increased GCR rates and 438 cooperating GIS genes, in which mutations do not cause increased GCR rates on their own but only cause increased GCR rates when combined with mutations in other genes. In contrast, studies of essential genes have thus far identified only 29 essential genes in which defects cause increased GCR rates (13, 14, 17, 25, 26, 29, 35, 37–45). Here, we used 2 different GCR assays to screen a collection of temperature-sensitive (ts) mutants for mutations that cause increased GCR rates and identified 94 essential S. cerevisiae GIS (called eGIS) genes, of which 71 were not previously reported, as well as an additional 38 candidate eGIS genes, and analysis of The Cancer Genome Atlas (TCGA) data (46) demonstrated a significant enrichment of mutations affecting 1 or more predicted human eGIS genes in 10 of 16 cancers analyzed.

Results

A Genetic Screen to Identify Essential GIS Genes. To identify essential genes that suppress the formation of GCRs, we crossed

query strains containing either the duplication-mediated GCR (dGCR) assay or the short repeat-sequence-mediated (sGCR) assay (SI Appendix, Fig. S1A) with 412 ts mutants [tsV6 (47); provided by Charlie Boone] and a leu2Δ::kanMX4 control strain (11). The 412 ts mutations analyzed affected 248 genes involved in DNA replication, DNA damage response and repair, telomere maintenance, chromatin modification and remodeling, and chromosome cohesion, condensation, and segregation, as well as related pathways implicated in maintaining genome stability, including sumoylation, cell cycle, and cytokinesis, transcription, and nucleo-cytoplasmic transport (Dataset S1). We recovered GCR-assay containing progeny for 399 of the 412 ts mutant alleles (243 of the 248 genes) in crosses with at least 1 and usually both query strains.

The progeny were scored using a papillation assay (11) at 30 °C and 25 °C (Fig. 1A). The number of papillae growing on medium that selected for GCRs was converted to a patch score ranging from 0 to 5 (11). At 25 °C and 30 °C, the leu2Δ control strain had average patch scores of 1.69 and 1.19, respectively, in the dGCR assay, and 0.38 and 0.30, respectively, in the sGCR assay (Fig. 1 B and C and SI Appendix, Fig. S1 B and C). To minimize false-positive and false-negative identification of GIS genes (11), we used a cutoff score difference of 0.4 above the leu2Δ control score at each temperature and identified 134 alleles of 103 genes with increased patch scores in at least 1 assay at 1 or more temperatures. Some mutations appeared to cause assay-specific increased GCR patch scores; however, analysis that is beyond the scope of the current study will be required to verify this. The ts alleles affecting genes in the categories of DNA replication, chromosome cohesion, condensation, segregation, and other had the largest effect on patch scores (Fig. 1D and Dataset S1).

www.pnas.org/cgi/doi/10.1073/pnas.1906921116 PNAS | August 27, 2019 | vol. 116 | no. 35 | 17377–17382

[Figure 1: panels A–D. (B and C) x-axis, strain patch score; y-axis, number of mutations. (D) y-axis, patch score difference (all assays); strains grouped as DNA replication, DNA damage response, chromosome packaging and segregation, cell cycle, nucleo-cytoplasmic transport, chromatin, and other.]

Fig. 1. Identification of essential genome instability suppressing (eGIS) genes in S. cerevisiae. (A) Example patches of haploid strains containing the dGCR assay at permissive (25 °C) and semipermissive (30 °C) temperatures after replica plating onto GCR-selecting media. (B and C) Histograms of average strain patch scores for the dGCR assay at 25 °C (B) and 30 °C (C). The triangle indicates the position of the average patch score of the control (leu2Δ) strain. (D) Beehive plot of the difference in average patch score for each mutant strain relative to the control strain for the dGCR and sGCR assays measured at 25 °C and 30 °C (Dataset S1); an increase of 0.4 (horizontal line) was previously established as the cutoff for significance (11). Mutant strains were classified by the function of the affected gene.

Growth defects can potentially result in decreased GCR scores in papillation assays. We therefore validated mutations affecting each pathway implicated by the papillation assays by measuring quantitative GCR rates (Dataset S1). This excluded 18 alleles that caused elevated patch scores but not increased GCR rates. This also identified 12 alleles that caused increased GCR rates but not elevated patch scores. Finally, we sequenced all alleles of interest and eliminated 10 strains lacking the expected mutations (SI Appendix, Table S1). In total, we identified 121 ts alleles representing 94 eGIS genes (Dataset S1).

Definition of the eGIS1 and eGIS2 Lists. The 94 essential genes identified defined the first version of the eGIS gene list (eGIS1; Dataset S1), which included genes related to DNA replication and the sister chromatid cohesion, chromosome condensation, and chromosome segregation pathways. Seventy-one of the eGIS1 genes had no previously known role in suppressing GCRs. Examples include IPI1, which has a role in ribosome biogenesis and is potentially a component of the prereplication complex (48, 49), and SLD3, which is required for the activation of the MCM replicative DNA helicase (50). Other newly identified eGIS1 genes encoded cell cycle-related proteins such as Apc4 (component of the anaphase-promoting complex); Cdc15 (required for mitotic exit); Cdc4 (F-box protein required for cell cycle transitions); Cdc34 (ubiquitin-conjugating enzyme); Cdc37 (chaperone needed for passage through START during the cell cycle); 20S and 26S proteasome-related proteins Pre2, Pre6, Pre10, Rpn5, and Rpt6; and kinetochore and spindle-related proteins Mif2, Nnf1, Nuf2, and Spc42.

Seven previously implicated genes (DNA2, POL30, SPN1, SUA7, TAF4, TOA1, and TFG1) were not present in the eGIS1 list (26, 39, 41, 45): 5 genes (POL30, SUA7, TAF4, TOA1, and TFG1) were not represented in the collection screened here, and 2 genes (DNA2, SPN1) were identified as GIS genes using alleles that were not tested here (dna2-2, TET-SPN1). Because of the lack of alleles in our mutation collection affecting all subunits or pathway components implicated by the eGIS1 genes and the potential for allele-specific effects on GCR rates, we created the expanded eGIS2 list (Dataset S1), which additionally included 38 genes encoding other components of the complexes and pathways defined by the eGIS1 genes. To be stringent, we only added genes encoding components of complexes in which a high proportion of the genes encoding the complex were identified as eGIS1 genes; for example, we identified 4 MCM genes as eGIS1 genes and added the 2 remaining MCM genes.

Identification of Mutations That Activate the DNA Damage Response. The formation of GCRs involves aberrant processing of damaged DNA (12). To distinguish eGIS gene mutations that directly or indirectly disrupt normal DNA damage processing from those that increase the levels of damaged chromosomes, we introduced the HUG1-GFP reporter, whose expression is induced by activation of the DNA damage and replication checkpoints (51), into the starting 413 mutant strains. We measured the fold-increase in Hug1-GFP levels at 25 °C and 30 °C, using FACS (Fig. 2A and ref. 52). Multiple mutations caused dramatically increased Hug1-GFP levels (Fig. 2B), including those affecting DNA replication complexes (DNA polymerase alpha/primase, origin recognition, replication factor A, the GINS complex, and Okazaki fragment maturation), cell-cycle complexes (the RENT complex, the anaphase promoting complex/cyclosome, and cyclin-dependent protein kinase), and chromosome packaging complexes (, the cohesin loader, and

the Smc5-Smc6 complex). In addition, defects in sumoylation (“other” category) were also identified.

We divided the mutant strains into 5 categories (Fig. 2C and Dataset S1): those that did not cause increased Hug1-GFP levels (class A, 192 mutations), those that caused increased Hug1-GFP levels only at 25 °C (class B, 28 mutations), at both 25 °C and 30 °C (class C, 76 mutations), only at 30 °C (class D, 80 mutations), or those for which data were only available at 1 temperature due to growth defects (class E, 10 mutations). Class C mutations included those whose Hug1-GFP levels were relatively temperature-insensitive (class C1, 25 mutations) and those whose Hug1-GFP levels were higher at 30 °C relative to 25 °C (class C2, 51 mutations). Class E mutations included those with data available only at 25 °C (class E1, 7 mutations) or at 30 °C (class E2, 3 mutations). Different ts mutations affecting the same protein or complex belong to different classes; for example, the 6 POL1 alleles belonged to classes A, C1, C2, and D. Increased Hug1-GFP levels were observed in at least 1 temperature for 50.3% of the alleles (142 of 244 genes), and 33.9% of the alleles (106 of 244 genes) had increased Hug1-GFP levels at 30 °C compared with 25 °C. More than 50% of the tested mutations affecting DNA replication, chromosome packaging and segregation, cell cycle, and nucleo-cytoplasmic transport categories caused increased Hug1-GFP levels (Fig. 2C); the relatively low proportion of ts mutations affecting the DNA damage category that caused increased Hug1-GFP levels is likely because many of the affected genes play modest roles in promoting DNA repair.

[Figure 2: panels A–C. (A) x-axis, GFP-A; y-axis, count; histograms for control, pol1-12, and pri1-m4. (B) y-axes, relative Hug1-GFP level (30 °C and 25 °C) and fold change (30 °C/25 °C); x-axis, mutation index. (C) x-axis, percent of mutations; mutation classes A, B, C1, C2, D, E1, and E2 by functional category.]

Fig. 2. Monitoring induction of the DNA damage response by FACS using the Hug1-GFP reporter. (A) Example histograms of Hug1-GFP levels (GFP-Area signals) as a function of the number of FACS events for the control (leu2Δ), pol1-12, and pri1-m4 strains at 30 °C. Fold changes are the mutant mean GFP-Area signal divided by the control mean GFP-Area signal. (B, Top) Summary of the changes of Hug1-GFP levels for the mutant strains at 25 °C (red) and 30 °C (black) rank ordered by the change at 30 °C. (B, Middle) Ratio of the fold changes at 30 °C relative to 25 °C shows that ts allele-containing strains with the highest Hug1-GFP levels at 30 °C tend to be induced by increased temperature. (B, Bottom) Position of ts alleles affecting different processes in the rank ordered list indicated by vertical lines; DNA replication, chromosome packaging, and segregation, cell cycle, and other categories dominate the alleles with the highest Hug1-GFP levels at 30 °C. (C) Distribution of the ts alleles for different processes based on the Hug1-GFP levels (class A = no increase, class B = increase at 25 °C, class C1 and C2 = increase at 25 °C and 30 °C, class D = increase at 30 °C, class E = missing data; see Identification of Mutations That Activate the DNA Damage Response).

We tested the effect of temperature on the accumulation of GCRs for a subset of the 27 C2 and D mutations with at least a 3-fold increase in Hug1-GFP levels at 30 °C compared with that at 25 °C (Dataset S1 and SI Appendix, Table S2). The pol1-1 and cdc9-1 mutants had a dramatic increase in GCR rate when shifted from 25 °C to 30 °C. In contrast, the GCR rate was not temperature dependent, and smc1-259 and nuf2-61 mutants had reduced GCR rates at the higher temperature. Thus, temperature-dependent changes in the DNA damage response as measured by Hug1-GFP levels did not always correlate with temperature-dependent increases in GCR rates. This may be because the formation of GCRs requires both the generation and misrepair of DNA damage, and in some mutants the increased DNA damage may not be repairable, resulting in no change or even reduced changes in GCR rates or cell death. Regardless, there was a strong correlation in general between defects causing increased levels of Hug1-GFP and those causing increased accumulation of GCRs (Fig. 3).

Mutation of eGIS Genes in Human Cancers. To examine whether eGIS genes were inactivated in human cancers, we first generated the human eGIS1 (heGIS1) gene list (115 genes; Datasets S1 and S2), which corresponded to human homologs of the S. cerevisiae eGIS1 genes, and the heGIS2 gene list (162 genes; Datasets S1 and S3), which contained the human homologs of the S. cerevisiae eGIS2 genes and 2 additional human genes (CDCA5 and POT1) that lacked S. cerevisiae homologs but encoded proteins that function in the pathways identified by the S. cerevisiae analysis. We then identified the mutations in these genes in the TCGA data for 16 different cancers (Datasets S2–S5) and computationally analyzed the heGIS1 and heGIS2 gene lists for significant enrichment of mutations (Fig. 4A; see Methods). The mutations included in the analysis were loss-of-function (LOF) mutations (nonsense mutations, frameshift insertions/deletions, and splice-site mutations) or LOF plus missense mutations (Datasets S2–S5 and SI Appendix, Table S3); data from analysis of specific classes of mutations are present in Datasets S2 and S3 and SI Appendix, Tables S4–S12. We also calculated S-scores for each heGIS1 and heGIS2 gene and performed enrichment analysis (11).

Significant enrichment of mutations in a broad array of human eGIS genes was observed in 9 of the 16 TCGA cancers analyzed: bladder urothelial carcinoma, colorectal adenocarcinoma, glioblastoma multiforme (GBM), kidney renal clear cell carcinoma, acute myeloid leukemia (LAML), low-grade glioma, sarcoma, stomach adenocarcinoma, and uterine corpus endometrial carcinoma (UCEC; Fig. 4 and SI Appendix, Tables S3–S12). Enrichment in GBM was specific to heGIS1, and enrichment in kidney renal clear cell carcinoma and sarcoma was specific to heGIS2. Among these 9 cancer types, bladder urothelial carcinoma samples had the greatest incidence of mutations in heGIS genes (SI Appendix, Fig. S2), and LAML, a cancer that has limited genome instability, had the lowest incidence of mutations in the expanded heGIS2 gene set and among the lowest incidence in the heGIS1 gene set. This is consistent with our previous study, in which LAML was not significantly enriched in mutations in nonessential GIS genes (11). Breast invasive carcinoma (BRCA) only

showed significant enrichment for mutations in cohesin genes. In contrast to these 10 cancers, no enrichment for mutations was observed with either gene list in head-neck squamous cell carcinoma, lung adenocarcinoma, lung squamous cell carcinoma, ovarian serous cystadenocarcinoma, prostate adenocarcinoma, and skin cutaneous melanoma. We also found that there was significant enrichment for genes with S-score >2 and >3 among the heGIS1 genes in 11 cancers and among heGIS2 genes in 12 cancers (SI Appendix, Table S13); as expression changes are an important component in the S-score calculation (11), this suggests there may be overexpression of individual GIS genes in different cancers, although we did not analyze this.

[Figure 3: Venn diagram of complexes and individual proteins, color-coded by biological process; 41 affect only Hug1-GFP signaling, 45 affect both Hug1-GFP signaling and GCR rates, and 9 affect only GCR rates.]

Fig. 3. Comparison of mutations causing increased Hug1-GFP signaling and increased GCR rates. Alleles causing increased Hug1-GFP were mapped to protein complexes or individual proteins on the basis of the affected genes. The resulting 95 complexes/proteins were divided into those only affecting Hug1-GFP signaling, those only affecting GCR rates, or those affecting both. The complexes/proteins were then color-coded by biological process.

Overall, genes encoding replication-related complexes, the cohesin and condensin complexes, and chromatin- and transcription-related complexes were among the most frequently mutated genes in the 10 cancers that showed enrichment for mutations in the complete heGIS gene lists or a subset of heGIS genes (Fig. 4B). For example, 11.7% of bladder urothelial carcinoma samples contained a mutation or mutations in STAG2, and NIPBL mutation or mutations were observed in 19.8% of UCEC samples, 8.5% of stomach adenocarcinoma samples, 4.5% of low-grade glioma samples, and 1.6% of BRCA samples (Datasets S2–S5). In BRCA, GBM, and LAML, the overall enrichment was entirely attributable to mutations in cohesin genes. Other frequently mutated genes included MTOR (mammalian target of rapamycin), SETX (senataxin), FBXW7 (F-box/WD-repeat containing protein 7), and POLE (catalytic subunit of DNA polymerase epsilon). Both FBXW7 and POLE contain recurrent missense mutations in addition to other loss-of-function mutations; the recurrent POLE mutations cause defects in the proofreading exonuclease activity (53). Given the importance of the SMC5/6 complex in DNA repair, it was surprising that mutations in genes encoding the SMC5/6 complex were only significantly enriched in 1 cancer type (UCEC) (SI Appendix, Table S3).

[Figure 4: panels A and B. (A) y-axis, number of mutations; x-axis, TCGA cancer type (BLCA, BRCA, COADREAD, GBM, HNSC, KIRC, LAML, LGG, LUAD, LUSC, OV, PRAD, SARC, SKCM, STAD, UCEC); observed values for heGIS1 (blue) and heGIS2 (red) against the simulation average and interquartile range. (B) y-axis, number of mutations per gene (missense and LOF mutations) in BLCA + BRCA + COADREAD + GBM + KIRC + LAML + LGG + SARC + STAD + UCEC (n = 4376).]

Fig. 4. Analysis of TCGA data for mutations in human homologs of S. cerevisiae eGIS genes. (A) Summary of the simulations to determine whether human homologs of the eGIS1 and eGIS2 genes are significantly mutated in cancers sequenced by the TCGA. Solid circles are the observed number of loss-of-function and missense mutations for the heGIS1 and heGIS2 gene lists. The box and whiskers correspond to the average and interquartile range from the in silico simulations. Statistically significant P values are indicated by the number of asterisks (4 = P < 0.0001, 3 = P < 0.001, 2 = P < 0.01, 1 = P < 0.05). All significant P values were above a false-discovery rate of 0.05 as determined by the Benjamini-Hochberg procedure. (B) Count of the number of mutations in the top 50 mutated heGIS2 genes from the 9 cancers with significant levels of mutations and BRCA, which is enriched for cohesin gene mutations.

Discussion

Here, we performed a targeted screen for mutations in essential genes that result in increased GCR rates. We focused on mutations affecting genes involved in DNA and chromosome metabolism, cell cycle regulation, and other processes previously implicated in maintaining genome stability (12). These mutations were screened for those causing increased GCR rates at 2

different temperatures using 2 different assays that detect different but overlapping types of GCRs. Use of the 2 assays and 2 different temperatures increased the number of tests performed, and hence the sensitivity of the screen, and allowed identification of mutations that preferentially affect specific types of GCRs (15). This screen identified 121 mutations in 94 essential GIS genes, 71 of which have not been previously reported. The 94 eGIS genes encode 47 multiprotein complexes (78 genes) and 16 proteins that function as individual proteins (16 genes); this list of genes is referred to as eGIS1. In many cases, 1 or more genes encoding a protein complex were identified as GIS genes, but not all of the genes encoding the complex were identified as GIS genes. The most likely explanation for this is that our mutation collection did not contain a sufficient number of mutations in each gene to allow identification of all possible mutations that could cause a defect resulting in increased GCR rates. We therefore constructed the eGIS2 gene list comprising 132 genes, which also contained other genes encoding protein complexes implicated in suppressing GCRs that were not present in the eGIS1 list (see Definition of the eGIS1 and eGIS2 Gene Lists). In combination with the previously identified nonessential GIS genes (11), the genes in eGIS1 and eGIS2 identify 266 and 304 GIS1 and GIS2 genes, respectively.

The eGIS genes identified have implicated a number of metabolic processes in the suppression of GCRs. The most prominent group of genes identified was those encoding DNA replication factors; these included most replication proteins including ORC, the MCM helicase, GINS proteins, all the DNA polymerases, DNA primase, RFA, RFC, and various proteins involved in control of DNA replication. This is consistent with previous studies implicating replication errors in the production of GCRs and studies showing that down-regulation of DNA polymerases can result in increased formation of GCRs (13, 54). A second prominent group of GIS genes were those encoding cohesin, condensin, and the Smc5-Smc6 cohesion complex; the latter has been extensively studied in regard to its role in DNA repair (29) and plays a role in suppressing GCRs (29, 40). Among other functions, cohesin acts during DNA replication to maintain sister chromatid cohesion, which plays a role in sister chromatid recombination (55), a process suggested to suppress GCRs (12, 26). Condensin plays a role in maintaining the 3D structure of chromosomes (55), but how this functions to suppress GCRs is not clear; regardless, condensin defects caused similar increases in GCR rates as those caused by defects in cohesin and the Smc5-Smc6 cohesin complex (this study and refs. 29 and 40). We also found a modest number of mutations affecting the proteasome (56), sumoylation pathways (35, 57), chromatin remodeling complexes (52, 58), and nuclear pore (59), all of which act in DNA repair and the DNA damage response to some extent. Among the more surprising GIS genes identified were those encoding the TORC2 complex that may play a role in DNA damage responses (60); Senataxin, which acts in R-loop regulation (61); and Pkc1, which plays a minor role in DNA damage checkpoint responses (62).

To better classify the eGIS genes, we included a Hug1-GFP downstream transcriptional reporter of the DNA damage response in our screen (51, 52). This allowed the identification of 3 classes of genes. The first class was those in which defects caused increased Hug1 expression but no increase in GCR rates; defects in these genes are likely to directly affect the transcription of DNA damage response genes and are unlikely to reflect processes that suppress GCRs or possibly cause increased levels of DNA damage that is repaired in ways that do not result in the formation of GCRs. The second, and most prevalent, class was those in which defects caused both increased Hug1 expression and increased

genes either cause defects in the DNA damage response (e.g., PKC1) (62) or defects in DNA repair that result in increased GCRs without increases in the steady state levels of DNA damage.

Previous studies have suggested that defects in the human homologs of nonessential GIS genes are prevalent in cancers that have increased genome instability (11, 63). Here, by analyzing TCGA data for 16 cancers, we identified 9 cancers with significant enrichment for LOF mutations and/or LOF plus missense mutations in the eGIS genes and 1 cancer (BRCA) in which there was only enrichment for mutations in the cohesin complex-encoding genes. Of the 9 cancers with enrichment for mutations in the eGIS genes, in 2 cases (GBM and LAML), this enrichment was only a result of the inclusion of the cohesin genes, whereas in the other 7 cancers, the enrichment was not attributable to any specific subset of eGIS genes. Furthermore, among the 9 cancers with general enrichment for mutations in the eGIS genes were 4 cancers with enrichment for mutations in chromatin remodeling/modification genes, 6 cancers with enrichment for mutations in cohesin genes, 2 cancers with enrichment for mutations in condensin genes, 1 cancer with enrichment for mutations in Smc5-Smc6 cohesin genes, and 4 cancers with enrichment for mutations in replication genes. This distribution of classes of mutated genes was also reflected among the top most frequently mutated genes in the 10 cancers with some evidence for enrichment for mutations in different eGIS genes (Fig. 4B). Defects in cohesin-encoding genes and condensin-encoding genes in cancer have been reported previously, although our analysis identified mutations in additional cancers, including GBM, low-grade glioma, and UCEC (cohesin) and stomach adenocarcinoma and UCEC (condensin) beyond those in which defects have been previously reported (64). We also observed mutations in a broad diversity of genes encoding proteins that function in DNA replication beyond the more generally observed mutator mutations affecting DNA polymerases delta and epsilon, supporting the idea that DNA replication errors could contribute to increased genome instability in cancer (53). Overall, our results support the view that eGIS genes are significantly mutated in a number of cancers. The mutations observed could be reduced-function mutations that in some cases might be dominant, loss-of-function mutations that could cause haploinsufficiency, or gain-of-function dominant mutations. Additional studies will be required to determine the frequency and nature of defects in essential and nonessential GIS genes in different cancers and whether these actually cause genome instability in these cancers.

Methods

S. cerevisiae Strains. The dGCR and sGCR query strains used for systematic mating, RDKY7635 (dGCR; MATα hom3-10 ura3Δ0 leu2Δ0 trp1Δ63 his3Δ200 lyp1::TRP1 cyh2-Q38K iYFR016C::PMFA1-LEU2 can1::PLEU2-NAT yel072w::CAN1/URA3) and RDKY7964 (sGCR; MATα hom3-10 ura3Δ0 leu2Δ0 trp1Δ63 his3Δ200 lyp1::TRP1 cyh2-Q38K iYFR016C::PMFA1-LEU2 can1::PLEU2-NAT yel068c::CAN1/URA3), were described previously (11). The query strain RDKY8174 (MATα hom3-10 ura3Δ0 leu2Δ0 trp1Δ63 his3Δ200 lyp1::TRP1 cyh2-Q38K iYFR016C::PMFA1-LEU2 can1::PLEU2-NAT yel072w::CAN1/URA3 HUG1-EGFP.hphNT1) was used to introduce the HUG1-GFP reporter.

Systematic Strain Construction and Screening. The dGCR and sGCR query strains were crossed to selected strains from the tsV6 temperature-sensitive mutant collection (BY4741 MATa strains; obtained from Charles Boone, Donnelly Centre, University of Toronto, Toronto, ON, Canada), using a RoTor instrument (Singer). Systematic mating, sporulation, haploid selection, GCR patch tests, and GCR rates were performed as described (11), except the crosses were performed at 25 °C and GCR patch tests and rates were performed at 25 °C and/or 30 °C, as indicated.
To verify expected mutations in GCR rates and encompassed virtually all of the replication and strains of interest, ∼1-kb overlapping fragments spanning the genes of in- cohesin/condensin genes. It seems likely that defects in replication terest were amplified from 2 independent colonies from the cross progeny genes result in increased levels of DNA damage because of rep- and subjected to Sanger sequencing at a commercial facility. lication errors (13, 54), whereas defects in cohesin and possibly condensin genes result in reduced rates of DNA repair (29, 55), Measurement of Hug1-GFP Levels. Hug1-GFP abundance in log phase cells was both of which might result in higher steady state levels of DNA measured at 25 °C and 30 °C using FACS, as described (52), using the fol- damage; in both cases, misrepair results in GCRs. The third class lowing modifications: Cells were grown and processed in 96-well plates and was those in which defects cause increased GCR rates but no analyzed using a BD LSR Fortessa analytical cytometer with an HTS loader. increase in Hug1 expression. It is likely that defects in these latter Excitation was at 488 nm, and the fluorescence signal was collected through

Srivatsan et al. PNAS | August 27, 2019 | vol. 116 | no. 35 | 17381 Downloaded by guest on September 24, 2021 a 505-nm long-pass filter and a HQ510/20 band-pass filter (Chroma Tech- described (SI Appendix, Methods) (11), using 9 prediction tests (Datasets S4–S6); nology Corp). For each sample, 30,000 events were recorded. The mean to be predicted deleterious, a missense mutation had to be scored as delete- value of GFP abundance was calculated using FlowJo software and nor- rious in at least 5 tests (Ndamage ≥ 5), as this cutoff captured known recurrent malized to the mean GFP value in wild-type cells. missense mutations in POLE, FBXW7, KAT8, MTOR,andSETX.

Analysis of Cancer Genomics Data. TCGA data were obtained from the cBIO ACKNOWLEDGMENTS. We thank Dr. Charlie Boone for the ts mutants. This portal (http://www.cbioportal.org). Simulations to determine statistical sig- work was supported by NIH grant R01GM26017 (to R.D.K.), Coordenação de nificance were performed as described (SI Appendix, Methods) (11). Pre- Aperfeiçoamento de Pessoal de Nível Superior (Brazil) grant (23038.004629/ diction of the functional impact of missense mutations was determined as 2014-19 to S.J.d.S.), and the Ludwig Institute (R.D.K., C.D.P., and S.J.d.S.).
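As a rough illustration of the missense-mutation criterion given under Analysis of Cancer Genomics Data (9 prediction tests, with a mutation predicted deleterious when Ndamage ≥ 5), the rule can be sketched in Python as below. The function names and the test labels (`test_0` and so on) are hypothetical placeholders; the actual prediction tests are listed in the SI Appendix and are not reproduced here, and this is not the pipeline used in the study.

```python
from typing import Mapping

N_TESTS = 9       # number of prediction tests stated in the Methods
MIN_DAMAGING = 5  # cutoff stated in the Methods (Ndamage >= 5)


def n_damage(calls: Mapping[str, bool]) -> int:
    """Count how many prediction tests scored the mutation as deleterious."""
    return sum(1 for is_deleterious in calls.values() if is_deleterious)


def predicted_deleterious(calls: Mapping[str, bool]) -> bool:
    """Apply the Ndamage >= 5 rule to one missense mutation.

    `calls` maps a (hypothetical) test label to True when that test
    scored the mutation as deleterious.
    """
    if len(calls) != N_TESTS:
        raise ValueError(f"expected {N_TESTS} test results, got {len(calls)}")
    return n_damage(calls) >= MIN_DAMAGING


if __name__ == "__main__":
    # Illustrative input only: 5 of 9 placeholder tests call the mutation deleterious.
    example = {f"test_{i}": i < 5 for i in range(N_TESTS)}
    print(n_damage(example), predicted_deleterious(example))
```

Requiring a majority of tests rather than any single test trades some sensitivity for agreement among predictors, which matches the stated rationale that the cutoff still captured known recurrent missense mutations.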

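The Hug1-GFP normalization described under Measurement of Hug1-GFP Levels, and the 3 classes of genes distinguished in the Discussion, can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, and whether a mutant counts as having "increased" Hug1 expression or GCR rate is passed in as a boolean because the decision thresholds are not given in this section.

```python
from statistics import mean
from typing import Optional, Sequence


def normalized_gfp(sample_events: Sequence[float],
                   wild_type_events: Sequence[float]) -> float:
    """Mean GFP signal of a sample divided by the mean wild-type signal."""
    return mean(sample_events) / mean(wild_type_events)


def gene_class(hug1_increased: bool, gcr_increased: bool) -> Optional[int]:
    """Assign the class described in the Discussion.

    1: increased Hug1 expression, no increase in GCR rate
    2: increased Hug1 expression and increased GCR rate
    3: increased GCR rate, no increase in Hug1 expression
    None: neither is increased (not one of the 3 classes)
    """
    if hug1_increased and not gcr_increased:
        return 1
    if hug1_increased and gcr_increased:
        return 2
    if gcr_increased:
        return 3
    return None


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    print(normalized_gfp([2.0, 4.0], [1.0, 2.0]))
    print(gene_class(hug1_increased=True, gcr_increased=True))
```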
1. L. A. Loeb, A mutator phenotype in cancer. Cancer Res. 61, 3230–3239 (2001).
2. B. Vogelstein et al., Cancer genome landscapes. Science 339, 1546–1558 (2013).
3. S. Nik-Zainal et al.; Breast Cancer Working Group of the International Cancer Genome Consortium, Mutational processes molding the genomes of 21 breast cancers. Cell 149, 979–993 (2012).
4. C. Palles et al.; CORGI Consortium; WGS500 Consortium, Germline mutations affecting the proofreading domains of POLE and POLD1 predispose to colorectal adenomas and carcinomas. Nat. Genet. 45, 136–144 (2013).
5. A. de la Chapelle, Genetic predisposition to colorectal cancer. Nat. Rev. Cancer 4, 769–780 (2004).
6. L. A. Loeb, C. C. Harris, Advances in chemical : A historical review and prospective. Cancer Res. 68, 6863–6872 (2008).
7. K. Inaki, E. T. Liu, Structural mutations in cancer: Mechanistic and functional insights. Trends Genet. 28, 550–559 (2012).
8. L. Sansregret, C. Swanton, The role of in cancer evolution. Cold Spring Harb. Perspect. Med. 7, a028373 (2017).
9. A. D. D’Andrea, Susceptibility pathways in Fanconi’s anemia and breast cancer. N. Engl. J. Med. 362, 1909–1919 (2010).
10. H. Kobayashi, S. Ohno, Y. Sasaki, M. Matsuura, Hereditary breast and ovarian cancer susceptibility genes (review). Oncol. Rep. 30, 1019–1029 (2013).
11. C. D. Putnam et al., A genetic network that suppresses genome rearrangements in Saccharomyces cerevisiae and contains defects in cancers. Nat. Commun. 7, 11256 (2016).
12. C. D. Putnam, R. D. Kolodner, Pathways and mechanisms that prevent genome instability in Saccharomyces cerevisiae. Genetics 206, 1187–1225 (2017).
13. C. Chen, R. D. Kolodner, Gross chromosomal rearrangements in Saccharomyces cerevisiae replication and recombination defective mutants. Nat. Genet. 23, 81–85 (1999).
14. J. E. Chan, R. D. Kolodner, A genetic and structural study of genome rearrangements mediated by high copy repeat Ty1 elements. PLoS Genet. 7, e1002089 (2011).
15. C. D. Putnam, T. K. Hayes, R. D. Kolodner, Specific pathways prevent duplication-mediated genome rearrangements. Nature 460, 984–989 (2009).
16. P. Kanellis et al., A screen for suppressors of gross chromosomal rearrangements identifies a conserved role for PLP in preventing DNA lesions. PLoS Genet. 3, e134 (2007).
17. K. Myung, A. Datta, R. D. Kolodner, Suppression of spontaneous chromosomal rearrangements by S phase checkpoint functions in Saccharomyces cerevisiae. Cell 104, 397–408 (2001).
18. J. A. Hackett, D. M. Feldser, C. W. Greider, Telomere dysfunction increases mutation rate and genomic instability. Cell 106, 275–286 (2001).
19. V. Pennaneach, R. D. Kolodner, Stabilization of dicentric translocations through secondary rearrangements mediated by multiple mechanisms in S. cerevisiae. PLoS One 4, e6389 (2009).
20. C. D. Putnam, V. Pennaneach, R. D. Kolodner, Chromosome healing through terminal deletions generated by de novo telomere additions in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. U.S.A. 101, 13262–13267 (2004).
21. C. D. Putnam, V. Pennaneach, R. D. Kolodner, Saccharomyces cerevisiae as a model system to define the chromosomal instability phenotype. Mol. Cell. Biol. 25, 7226–7238 (2005).
22. C. D. Putnam, K. Pallis, T. K. Hayes, R. D. Kolodner, DNA repair pathway selection caused by defects in TEL1, SAE2, and de novo telomere addition generates specific chromosomal rearrangement signatures. PLoS Genet. 10, e1004277 (2014).
23. J. E. Chan, R. D. Kolodner, Rapid analysis of Saccharomyces cerevisiae genome rearrangements by multiplex ligation-dependent probe amplification. PLoS Genet. 8, e1002539 (2012).
24. K. H. Schmidt, J. Wu, R. D. Kolodner, Control of translocations between highly diverged genes by Sgs1, the Saccharomyces cerevisiae homolog of the Bloom’s syndrome protein. Mol. Cell. Biol. 26, 5406–5420 (2006).
25. K. Myung, C. Chen, R. D. Kolodner, Multiple pathways cooperate in the suppression of genome instability in Saccharomyces cerevisiae. Nature 411, 1073–1076 (2001).
26. C. D. Putnam, T. K. Hayes, R. D. Kolodner, Post-replication repair suppresses duplication-mediated genome instability. PLoS Genet. 6, e1000933 (2010).
27. S. Smith et al., Mutator genes for suppression of gross chromosomal rearrangements identified by a genome-wide screening in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. U.S.A. 101, 9039–9044 (2004).
28. P. C. Stirling et al., The complete spectrum of yeast chromosome instability genes identifies candidate CIN cancer genes and functional roles for ASTRA complex components. PLoS Genet. 7, e1002057 (2011).
29. G. De Piccoli et al., Smc5-Smc6 mediate DNA double-strand-break repair by promoting sister-chromatid recombination. Nat. Cell Biol. 8, 1032–1034 (2006).
30. S. Banerjee et al., Mph1p promotes gross chromosomal rearrangement through partial inhibition of . J. Cell Biol. 181, 1083–1093 (2008).
31. A. Motegi, K. Kuntz, A. Majeed, S. Smith, K. Myung, Regulation of gross chromosomal rearrangements by ubiquitin and SUMO ligases in Saccharomyces cerevisiae. Mol. Cell. Biol. 26, 1424–1433 (2006).
32. K. H. Schmidt, R. D. Kolodner, Suppression of spontaneous genome rearrangements in yeast DNA helicase mutants. Proc. Natl. Acad. Sci. U.S.A. 103, 18196–18201 (2006).
33. M. E. Huang, A. G. Rio, A. Nicolas, R. D. Kolodner, A genomewide screen in Saccharomyces cerevisiae for genes that suppress the accumulation of mutations. Proc. Natl. Acad. Sci. U.S.A. 100, 11529–11534 (2003).
34. M. E. Huang, R. D. Kolodner, A biological network in Saccharomyces cerevisiae prevents the deleterious effects of endogenous oxidative DNA damage. Mol. Cell 17, 709–720 (2005).
35. C. P. Albuquerque et al., Distinct SUMO ligases cooperate with Esc2 and Slx5 to suppress duplication-mediated genome rearrangements. PLoS Genet. 9, e1003670 (2013).
36. C. D. Putnam et al., Bioinformatic identification of genes suppressing genome instability. Proc. Natl. Acad. Sci. U.S.A. 109, E3251–E3259 (2012).
37. S. Banerjee, K. Myung, Increased genome instability and telomere length in the elg1-deficient Saccharomyces cerevisiae mutant are regulated by S-phase checkpoints. Eukaryot. Cell 3, 1557–1566 (2004).
38. K. Myung, S. Smith, R. D. Kolodner, Mitotic checkpoint function in the formation of gross chromosomal rearrangements in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. U.S.A. 101, 15980–15985 (2004).
39. M. E. Budd, C. C. Reis, S. Smith, K. Myung, J. L. Campbell, Evidence suggesting that Pif1 helicase functions in DNA replication with the Dna2 helicase/ and DNA polymerase delta. Mol. Cell. Biol. 26, 2490–2500 (2006).
40. J. Y. Hwang et al., Smc5-Smc6 complex suppresses gross chromosomal rearrangements mediated by break-induced replications. DNA Repair (Amst.) 7, 1426–1436 (2008).
41. S. Allen-Soltero, S. L. Martinez, C. D. Putnam, R. D. Kolodner, A Saccharomyces cerevisiae RNase H2 interaction network functions to suppress genome instability. Mol. Cell. Biol. 34, 1521–1534 (2014).
42. A. Colosio, C. Frattini, G. Pellicanò, S. Villa-Hernández, R. Bermejo, Nucleolytic processing of aberrant replication intermediates by an Exo1-Dna2-Sae2 axis counteracts fork collapse-driven . Nucleic Acids Res. 44, 10676–10690 (2016).
43. S. K. Deng, Y. Yin, T. D. Petes, L. S. Symington, Mre11-Sae2 and RPA collaborate to prevent palindromic gene amplification. Mol. Cell 60, 500–508 (2015).
44. K. A. Shah et al., Role of DNA polymerases in repeat-mediated genome instability. Cell Rep. 2, 1088–1095 (2012).
45. Y. Zhang et al., Genome-wide screen identifies pathways that govern GAA/TTC repeat fragility and expansions in dividing and nondividing yeast cells. Mol. Cell 48, 254–265 (2012).
46. J. N. Weinstein et al.; Cancer Genome Atlas Research Network, The cancer genome Atlas pan-cancer analysis project. Nat. Genet. 45, 1113–1120 (2013).
47. Z. Li et al., Systematic exploration of essential yeast gene function with temperature-sensitive mutants. Nat. Biotechnol. 29, 361–367 (2011).
48. L. Huo et al., The Rix1 (Ipi1p-2p-3p) complex is a critical determinant of DNA replication licensing independent of their roles in ribosome biogenesis. Cell Cycle 11, 1325–1339 (2012).
49. T. A. Nissan et al., A pre-ribosome with a tadpole-like structure functions in ATP-dependent maturation of 60S subunits. Mol. Cell 15, 295–301 (2004).
50. S. Tanaka, H. Araki, Helicase activation and establishment of replication forks at chromosomal origins of replication. Cold Spring Harb. Perspect. Biol. 5, a010371 (2013).
51. M. A. Basrai, V. E. Velculescu, K. W. Kinzler, P. Hieter, NORF5/HUG1 is a component of the MEC1-mediated checkpoint response to DNA damage and replication arrest in Saccharomyces cerevisiae. Mol. Cell. Biol. 19, 7041–7049 (1999).
52. A. Srivatsan et al., The Swr1 chromatin-remodeling complex prevents genome instability induced by replication fork progression defects. Nat. Commun. 9, 3680 (2018).
53. E. Rayner et al., A panoply of errors: Polymerase proofreading domain mutations in cancer. Nat. Rev. Cancer 16, 71–81 (2016).
54. F. J. Lemoine, N. P. Degtyareva, K. Lobachev, T. D. Petes, Chromosomal translocations in yeast induced by low levels of DNA polymerase: A model for chromosome fragile sites. Cell 120, 587–598 (2005).
55. K. Jeppsson, T. Kanno, K. Shirahige, C. Sjögren, The maintenance of chromosome structure: Positioning and functioning of SMC complexes. Nat. Rev. Mol. Cell Biol. 15, 601–614 (2014).
56. S. Ben-Aroya et al., Proteasome nuclear activity affects chromosome stability by controlling the turnover of Mms22, a protein important for DNA repair. PLoS Genet. 6, e1000852 (2010).
57. M. Nie, M. N. Boddy, Cooperativity of the SUMO and ubiquitin pathways in genome stability. Biomolecules 6, 14 (2016).
58. C. B. Gerhold, M. H. Hauer, S. M. Gasser, INO80-C and SWR-C: Guardians of the genome. J. Mol. Biol. 427, 637–651 (2015).
59. S. Loeillet et al., Genetic network interactions among replication, repair and nuclear pore deficiencies in yeast. DNA Repair (Amst.) 4, 459–468 (2005).
60. K. Shimada et al., TORC2 signaling pathway guarantees genome stability in the face of DNA strand breaks. Mol. Cell 51, 829–839 (2013).
61. H. E. Mischo et al., Yeast Sen1 helicase protects the genome from transcription-associated instability. Mol. Cell 41, 21–32 (2011).
62. M. Soriano-Carot, I. Quilis, M. C. Bañó, J. C. Igual, Protein kinase C controls activation of the DNA integrity checkpoint. Nucleic Acids Res. 42, 7084–7095 (2014).
63. T. A. Knijnenburg et al., Genomic and molecular landscape of DNA damage repair deficiency across the cancer genome Atlas. Cell Rep. 23, 239–254.e6 (2018).
64. M. D. Leiserson et al., Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes. Nat. Genet. 47, 106–114 (2015).

17382 | www.pnas.org/cgi/doi/10.1073/pnas.1906921116 Srivatsan et al.