PNAS PLUS

Bioinformatic identification of genes suppressing genome instability

Christopher D. Putnam(a,b), Stephanie R. Allen-Soltero(a,c), Sandra L. Martinez(a), Jason E. Chan(a,b), Tikvah K. Hayes(a,1), and Richard D. Kolodner(a,b,c,d,e,2)

(a) Ludwig Institute for Cancer Research, Departments of (b) Medicine and (c) Cellular and Molecular Medicine, (d) Moores-University of California at San Diego Cancer Center, and (e) Institute of Genomic Medicine, University of California School of Medicine at San Diego, La Jolla, CA 92093

GENETICS

Contributed by Richard D. Kolodner, September 28, 2012 (sent for review August 25, 2011)

Unbiased forward genetic screens for mutations causing increased gross chromosomal rearrangement (GCR) rates in Saccharomyces cerevisiae are hampered by the difficulty in reliably using qualitative GCR assays to detect mutants with small but significantly increased GCR rates. We therefore developed a bioinformatic procedure using genome-wide functional genomics screens to identify and prioritize candidate GCR-suppressing genes on the basis of shared drug sensitivity suppression and genetic interactions similar to those of known GCR suppressors. The number of known suppressors was increased from 75 to 110 by testing 87 predicted genes, which identified unanticipated pathways in this process. This analysis explicitly dealt with the lack of concordance among high-throughput datasets to increase the reliability of phenotypic predictions. Additionally, shared phenotypes in one assay were imperfect predictors for shared phenotypes in other assays, indicating that although genome-wide datasets can be useful in aggregate, caution and validation methods are required when deciphering biological functions via surrogate measures, including growth-based genetic interactions.

DNA damage | DNA repair | systems biology

Author contributions: C.D.P. and R.D.K. designed research; C.D.P., S.R.A.-S., S.L.M., J.E.C., and T.K.H. performed research; C.D.P. contributed new reagents/analytic tools; C.D.P. and R.D.K. analyzed data; and C.D.P. and R.D.K. wrote the paper.
The authors declare no conflict of interest.
(1) Present address: Curriculum in Genetics and Molecular Biology and Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599.
(2) To whom correspondence should be addressed. E-mail: [email protected].
See Author Summary on page 19055 (volume 109, number 47).
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1216733109/-/DCSupplemental.
www.pnas.org/cgi/doi/10.1073/pnas.1216733109 PNAS | Published online November 5, 2012 | E3251–E3259

Genetic instability is a characteristic of most cancers (1) that may play a critical role in driving the accumulation of genetic changes that underlie tumorigenesis (2). A number of observations are consistent with this view, including the following: a number of cancer predisposition syndromes have been identified that are associated with inherited defects in genes involved in suppressing genome instability, and inactivation of some of these genes has been observed in sporadic cancers (3, 4); p53, which promotes cell cycle arrest or apoptosis in response to DNA damage, is inactivated in roughly 50% of human cancers, and p53 defects allow cells to tolerate the accumulation of genome rearrangements (5); and genomic instability has been observed to precede the transition to the carcinogenic state or to be associated with the development of cancers in mouse model systems (6).

The investigation of model systems in the study of genome instability has the potential to identify and understand novel genes and pathways relevant to human cancer. A genetic assay developed in the yeast Saccharomyces cerevisiae has been used to identify genes and pathways that suppress gross chromosomal rearrangements (GCRs) mediated by single-copy DNA sequences (7). In this assay, selection against two genetic markers, CAN1 and URA3, placed on a nonessential end of the left arm of chromosome V selects for the loss of these two genes that results as a consequence of the formation of GCRs that delete the left arm of chromosome V. The types of GCRs that have been observed with this assay include terminal deletions healed by de novo telomere addition, translocations, isoduplications and other types of dicentric translocations, interstitial deletions, circular chromosomes, and complex GCRs resulting from multiple cycles of rearrangement, usually as a result of the formation of unstable dicentric translocations (8–11). Using this assay, oxidative defense pathways, the replication machinery, DNA repair pathways, cell cycle checkpoint pathways, telomere maintenance pathways, and chromatin modification and assembly pathways have been shown to function in concert to prevent genome rearrangements (reviewed in ref. 12). Modifications of the original GCR assay demonstrated that suppression of GCRs mediated by segmental duplications and Ty elements involves additional genes and pathways that do not suppress single-copy sequence-mediated GCRs (13–15). Interestingly, homologs of some GCR-suppressing genes and pathways suppress the development of cancer in mammals (16). Most of the genes that suppress GCRs have been identified through a candidate approach. Some studies have screened collections of arrayed S. cerevisiae mutants for mutations that cause increased GCR rates and have identified additional genes of interest (17–20), although the mutations identified in each screen had only a small overlap with each other. Consequently, it is probable that not all the genes and pathways that suppress GCRs have been identified.

The promise of genome-wide protein–protein interaction, genetic interaction, and drug sensitivity datasets developed using S. cerevisiae is that these data can be used for predicting gene and gene product functions (e.g., ref. 21). Despite the fact that these datasets contain useful information, high-throughput methods are prone to both false-positive and false-negative errors. Consequently, different datasets generated using similar approaches to screen the same mutant collection show a substantial lack of concordance (22). Here, we show that combining these types of data identified additional genes involved in suppressing genome instability based on the hypothesis that these additional genes will share aspects of their phenotypes with known genes. Using these data, we have generated a set of 1,041 gene deletion mutations that have genetic interactions and drug sensitivity profiles matching those mutations known to affect the rate of accumulating GCRs; 787 of them are characterized by dense genetic interactions, and the remaining 254 have limited genetic interactions. To validate this approach, we investigated a subset of the predicted genes and found that deletions of 35 of the 87 genes selected from clusters containing known GCR-suppressing genes for analysis increased the rate of accumulating GCRs, which represents a 200-fold higher efficiency for identifying new GCR-suppressing genes compared with that seen in genome-wide screens. This experimental validation identified genes that had not been previously implicated in suppressing GCRs and demonstrated that components of the nuclear pore, the proteasome, and the morphogenesis and septin checkpoint, as well as proper control of the anaphase-promoting complex/cyclosome (APC/C), play roles in suppressing GCRs. Thus, the resulting gene lists are enriched for genes that function to suppress genome instability. Importantly, our results indicate that identification of genes based on analysis of DNA damaging agent sensitivity and growth-based genetic interaction patterns was an imperfect predictor for identifying genes that suppress GCRs, which has important implications for attempts to reconstruct pathways by computationally combining data from systematic genetic and physical interaction studies.

Results

Bioinformatic Identification of Candidate Genome Stability Genes. Genes identified as suppressing genome rearrangements. To identify candidate genes that suppress GCRs (Fig. 1), we first analyzed over 700 published GCR rates of strains with single or multiple mutations. This identified 75 mutations that increased GCR rates by fivefold or more as single mutants and/or caused synergistic increases in rate in combination with other mutations (SI Appendix, Table S1 and Dataset S1) and 40 mutations that did not increase GCR rates (SI Appendix, Table S2). The analysis considered the effect of all pair-wise interactions; for example, the GCR rate of the mre11 lig4 tlc1 triple-mutant strain was compared with the GCR rates of the pairs of strains mre11 and lig4 tlc1, lig4 and mre11 tlc1, and tlc1 and mre11 lig4. Interestingly, many of the mutations that increased the GCR rates also increased GCR rates synergistically in combination with other mutations, whereas many of the mutations tested that did not increase GCR rates suppressed the increased GCR rates caused by other mutations.

Genes identified as suppressing sensitivity to DNA damaging agents. Mutations in many of the 75 known GCR-suppressing genes caused increased sensitivity to DNA damaging agents (SI Appendix, Table S1). Therefore, we analyzed the results of 155 screens of the S. cerevisiae gene deletion collection against DNA damaging agents (SI Appendix, Table S3). Combined, 4,414 mutations (affecting over 90% of the nonessential genes in the S. cerevisiae genome) were reported to cause some level of increased sensitivity in at least one screen; this number was reduced to 4,143 mutations by treating deletions of dubious ORFs that overlapped verified genes as alleles of the verified genes (Fig. 2A and Dataset S2). The large number of reported mutations causing sensitivity to DNA damaging agents reflected the low reproducibility of different screens of the same damaging agents (Fig. 2E); hierarchical agglomerative clustering analysis grouped screens by laboratory rather than by DNA damaging agent, indicative of “batch effects” in these high-throughput datasets (22). Regardless, the most commonly identified mutations affected genes known to be involved in DNA repair (Fig. 2B). For example, mms4Δ, rad5Δ, mus81Δ, rad59Δ, and rad10Δ were identified in 119, 118, 106, 96, and 91 screens, respectively (Dataset S2). Over 60% of all mutations were identified in 5 or fewer screens, and 16% were observed in only 1 screen. Using random computer simulations (Materials and Methods), we calculated pnhit P values, which represent the statistical significance of identifying a gene n times, and found that mutations identified eight times (n = 8) were significant (pnhit < 0.01).

We also analyzed mutations identified a statistically significant number of times (pnhit < 0.01) that caused sensitivity to specific DNA damaging treatments using the program GOstat (23) to identify statistically significant Gene Ontology terms (24). This analysis primarily identified terms related to DNA repair, DNA damage signaling, chromatin, and chromosome organization and biogenesis (Dataset S3). Some unexpected pathways were also identified: ubiquitin-dependent protein catabolism of the multivesicular body pathway [2-dimethylaminoethyl chloride (DMAEC), hydroxyurea (HU), mitomycin c, and tirapazamine]; osmotic stress (cisplatin and mitomycin c); vesicle-mediated transport (bleomycin, HU, and oxaliplatin); peroxisome function [methylmethane sulfonate (MMS)]; and secretory pathways, membrane invagination, and glycoprotein biosynthesis (bleomycin). In contrast, genes associated with UV light and ionizing radiation (IR) resistance were predominantly associated with DNA repair, damage signaling, and chromatin remodeling. One implication of these results is that some pathways involved in resistance to chronic drug treatments but not UV or IR treatment might function by means of drug detoxification, drug export, or amelioration of damage to cellular components other than DNA.

Because we were interested in common DNA damage responses, we developed a statistical test to identify mutations with biased distributions to screens of specific treatments and specific laboratories (Materials and Methods). We applied this test to all 4,143 mutations, which reduced the number to 1,446 mutations. Most mutations eliminated were observed in four or fewer screens (Fig. 2C). In addition, the test eliminated frequently observed mutations that were specific to a particular laboratory, such as yll032cΔ, rpl15bΔ, gal1Δ, and tma46Δ, which were observed 52, 50, 49, and 49 times, respectively, in a single laboratory or were specific to a particular damaging agent, including hxk2Δ, ybr242wΔ, ald6Δ, atg12Δ, and ylr064wΔ, which were observed 10, 9, 9, 8, and 8 times, respectively, almost exclusively in cisplatin sensitivity screens. Although the eliminated mutations had no obvious role in the DNA damage response, we tested 45 of these mutations, including the laboratory-specific examples cited above, for their effect on chronic exposure to HU, MMS, 4-nitroquinoline 1-oxide (4NQO), and/or

[Fig. 1 image. (A) Flow diagram: known GCR genes and DNA damaging agent sensitivity genes are each extended with genetically similar genes, merged into a GCR-suppressing gene list and a DNA damaging agent sensitivity gene list, and combined with 10 related genes into the final list of 1,041 candidate genes. (B) Bar chart; y axis: Number of Genes; bars: suppressors and nonsuppressors for the GCR list, the GCR and sensitivity lists, the sensitivity list, and the related list; legend: newly tested, previously known.]

Fig. 1. Schematic of the bioinformatic scheme to enrich for genome stability genes. (A) Number of genes identified at each step is indicated. Venn diagrams contain gene counts and indicate merging steps. (B) Breakdown of genes that do or do not suppress GCRs according to whether they were present in the list of genes suppressing GCRs, the list of genes suppressing sensitivity to DNA damaging agents, or both, or were from the list of 10 related genes. Dark bars indicate genes whose roles in GCRs were tested here, and white bars indicate genes whose GCR status was previously known.
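The merging steps in Fig. 1A amount to set unions, so the list sizes reported in the text can be cross-checked by inclusion-exclusion. The sketch below uses only counts stated in the text (75 starting and 183 new GCR genes, 928 starting and 34 new DNA damaging agent genes, 189 shared genes, and 10 related genes added by hand); `union_size` is a hypothetical helper introduced for illustration, not part of the authors' pipeline.

```python
def union_size(size_a: int, size_b: int, shared: int) -> int:
    """Size of the union of two sets: |A| + |B| - |A and B|."""
    return size_a + size_b - shared

# Counts quoted in the text.
merged_gcr = 75 + 183    # starting GCR genes + new congruent genes
merged_drug = 928 + 34   # starting sensitivity genes + new congruent genes
shared = 189             # genes present in both merged lists
related = 10             # genes below cutoffs, added by hand

merged = union_size(merged_gcr, merged_drug, shared)
final = merged + related
only_gcr = merged_gcr - shared    # unique to the merged GCR list
only_drug = merged_drug - shared  # unique to the merged sensitivity list

print(merged_gcr, merged_drug, merged, final, only_gcr, only_drug)
```

With these inputs the two merged lists contain 258 and 962 genes, their union contains 1,031 genes, and adding the 10 related genes gives the final 1,041, in agreement with the text.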

[Fig. 2 image. Panels A–D are histograms; x axis: Number of Screens (0–120); y axis: Mutations. (A) All Mutations (n=4,143). (B) All Mutations (above axis) and DNA Repair Mutations (below axis). (C) Common Mutations (n=1,743). (D) DNA Damaging Agent Mutations (n=928). Panel E is the table below.]

(E)
                               Number of mutations
Treatment          Screens    In any    p < 0.01    In all
Cisplatin               33      1743         708         0
Mechlorethamine         22      1398         370         2
MMS                     12      2053         274         0
HU                      11      1350         265         0
Mitomycin c             11      2206         671         9
Camptothecin             8       485         100         0
Psoralen                 7       718          61         0
Bleomycin                6       736         446        23
Carboplatin              6       570          62         0
Angelicin                5       570         113         1
Oxaliplatin              5       656         235         6
4NQO                     4       954         184        30
Melphalan                4       434          39         0
Streptozotocin           4       425         115        11
UVC                      3       421          41         8
DMAEC                    2       307         132         0
Doxorubicin              2       229          37        37
IR                       2       165         165        19
other                    8       590        n.a.         0
All                    155      4143         928         0

Fig. 2. Analysis of DNA damaging agent treatments. (A) Histogram of the number of DNA damaging agent sensitivity mutations as a function of the number of screens in which each mutation was identified. (B) View of the histogram in A plotting all mutations above the axis and only those mutations known to affect the DNA damage response below the axis. (C) View of the histogram in A after filtering out agent- and laboratory-specific mutations. (D) View of the histogram in C after filtering out nonsignificant genes. (E) Summary table of treatments that have been screened multiple times and the number of mutations found in any screen, in a statistically significant number of screens (pnhit < 0.01), and in all screens. UVC, ultraviolet light in band C.
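The pnhit significance used in the Fig. 2 legend comes from random simulations described in Materials and Methods, which are not reproduced here. As a minimal sketch of how such a value could be estimated, the code below assumes a simple null model in which each screen reports its hits uniformly at random from the gene collection, so a given gene is hit in a screen with probability (hits in that screen)/(genes in collection). The function name, parameters, and null model are assumptions for illustration and may differ from the authors' procedure.

```python
import random

def simulate_p_nhit(hits_per_screen, n_genes, n, trials=20000, seed=0):
    """Monte Carlo estimate of P(a given gene is hit in >= n screens).

    hits_per_screen: number of hits reported by each screen (assumed input).
    n_genes: size of the mutant collection being screened.
    """
    rng = random.Random(seed)
    # Under the assumed null, screens hit the gene independently.
    probs = [k / n_genes for k in hits_per_screen]
    exceed = 0
    for _ in range(trials):
        count = sum(rng.random() < p for p in probs)
        if count >= n:
            exceed += 1
    return exceed / trials

# Hypothetical toy input: 20 screens of 50 hits each over 1,000 genes.
toy_screens = [50] * 20
for n in range(1, 6):
    print(n, simulate_p_nhit(toy_screens, 1000, n))
```

The smallest n whose estimate falls below 0.01 would play the role of the significance threshold; in the paper that threshold was n = 8 for the real screen data.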

camptothecin. Forty-four of the 45 mutations caused no drug sensitivity (P < 0.0001, hypergeometric test), whereas yll032cΔ caused weak MMS sensitivity. Retaining only those mutations identified in a significant number of screens (pnhit < 0.01) from the 1,446 mutations resulted in 928 mutations, which included 44 of 75 mutations increasing GCR rates (Fig. 2D and Dataset S1).

Genes identified by genetic congruence. To find genes that had been missed but with related functions, we scored all the genes in the genome on the basis of their genetic similarity or “congruence” (25) with previously identified genes using reported growth-based genetic interactions. The growth-based genetic data also have imperfect concordance; the mean overlap for a reported subset of genetic interactions in S. cerevisiae by different groups has been estimated at less than 50% (26), potentially due to errors in scoring growth phenotypes, escape of diploids during haploid selection (27, 28), additional mutations present in strains in the deletion collection (29, 30), and/or the presence of an incorrect mutation due to cross-contamination, which we have tested for and corrected in our copy of the genome deletion collection. Therefore, our strategy to improve the robustness of this step was to score the genetic congruence of each candidate mutation using the combined genetic signature of the interactions of the 75 mutations causing increased GCR rates and the 928 mutations causing DNA damaging agent sensitivity with each gene in the rest of the genome (∼6,000 genes; SI Appendix, SI Materials and Methods). Congruence scores could range from 0 (no congruence) to 1 (complete congruence), and random simulations were performed to identify statistically significant congruence score cutoffs.

Using the 75 GCR genes, the maximum congruence score was 0.115 for SRS2 (Dataset S4), and the cutoff of 0.040 (P < 0.01, random simulation) selected 227 genes, which included 44 starting genes and 183 new genes. Forty-two of the 44 recovered starting genes were reidentified by congruence selection even when the gene was removed from the initial list. Of the 227 GCR congruent genes, 71 suppressed GCR formation (61% of those tested) and 46 did not when including the experimental results described below, suggesting an enrichment for GCR-suppressing genes (P = 1 × 10^-102 if the 110 GCR-suppressing genes identified previously and below are the only ones that exist and P = 2 × 10^-20 if all 1,041 candidates identified here suppress GCRs, hypergeometric test). Merging the 75 starting genes and 227 congruent genes produced a merged GCR list of 258 genes (Fig. 1).

Using the 928 DNA damaging agent genes, the maximum congruence score was 0.063 for SWR1 (Dataset S4) and the cutoff of 0.046 (P < 0.01, random simulation) selected 148 genes, which included 114 starting genes and 34 new genes. One hundred five of the 114 starting genes were identified even when removed from the initial list. Thirty-two of the 34 new genes were nonessential, 31 were previously identified in at least one screen, and deletion of 20 of these 32 nonessential new genes caused at least some sensitivity when tested against chronic exposure to HU, MMS, 4NQO, and/or camptothecin (P < 1 × 10^-7, hypergeometric test; Dataset S5). Because most newly identified genes suppressed drug sensitivity, we generated a merged list of 962 genes from the starting genes and the genetically congruent genes (Fig. 1).

Merging the 258 GCR genes and 962 DNA damaging agent genes implicated in this study generated a merged list of 1,031 genes (Fig. 1 and Dataset S1). One hundred eighty-nine genes were shared between the merged GCR gene and merged DNA damaging agent lists: 69 were unique to the merged GCR gene list, and 773 were unique to the merged DNA damaging agent list. Additionally, we noted 10 genes (RTT105, IRC15, IRC3, DOT1, DPB3, MLH1, NAS6, PAP2, UMP1, and VAC7) that fell below statistical cutoffs in our analysis but were related to and clustered with bona fide GCR-suppressing genes (see below), and we added them to the final list, resulting in a total of 1,041 genes.

Robustness of the method. A computational test of the robustness of our method was performed by determining if the method could

identify genes found in three different systematic screens using modified GCR assays to identify genes that suppress GCRs (17–19) when those genes were removed from the original list of GCR-suppressing genes that anchored the analysis. This analysis recovered 7 of 8 (P < 0.0001, hypergeometric test), 8 of 11 (P < 0.0004), and 13 of 16 (P < 7 × 10^-7) of the genes reported in these screens, respectively (SI Appendix, Table S4), although it should be noted that some of these genes that were not identified by our analysis only played small roles in suppressing GCRs and that many of the genes experimentally verified here were not identified by these screens (see below). Despite these differences, the robustness with which these genes from these screens were identified suggests that the final list is enriched in genes involved in preventing genome instability.

Computational Analysis and Prioritization of Candidate Genome Stability Genes. The list of 1,041 genes implicated by this analysis was large enough to be problematic for gene-by-gene validation. In addition, the identification of potential drug detoxification mechanisms suggested that not all these genes directly suppress genome instability. Thus, to prioritize the final list of 1,041 genes for subsequent experimental analysis, the genes were subjected to agglomerative hierarchical clustering analysis (Fig. 3 and Dataset S6) using congruence scores calculated from reported growth-based genetic interactions (SI Appendix, SI Materials and Methods). This analysis divided the list into 74 clusters (comprising 787 genes) with an additional “unclustered” group (comprising 254 genes) that contained those genes that did not cluster with other genes due to lack of shared genetic interactions (Fig. 4).

Many clusters were enriched in genes involved in specific cellular functions. For example, cluster 1 was enriched in polarity determination and vesicle-mediated transport; cluster 2 was enriched in mitotic nuclear and chromosome migration; cluster 3 was enriched in chromatin modification and transcription; and cluster 4 was enriched in the DNA damage response, particularly those genes involved in double-strand break (DSB) repair (Dataset S6). Within each cluster, genes encoding protein complexes or belonging to known pathways tended to group together and to have few interactions with each other, consistent with these genes belonging to single epistasis groups. Furthermore, genetic interactions between genes within an individual cluster (Fig. 4A) were consistent with the presence of multiple epistasis groups. Together, these observations indicated that the clustering captured important aspects of at least some of the functions of these genes.

Some biological functions were divided between multiple clusters. DNA damage response genes were divided between cluster 4 (Fig. 3) and cluster 32 (SI Appendix, Fig. S1), as well as the smaller clusters 53, 55, 59, and 60 (Dataset S6). Clusters 4 and 32 have high GCR congruence scores (Fig. 4B) and moderate DNA damaging agent congruence scores (Fig. 4C), and they contain many of the genes implicated in suppressing sensitivity to many different DNA damaging agents (Fig. 4D) and in playing important roles in suppressing GCRs (Fig. 4E). Examination of the interactions of these two clusters suggests that the major reason the clustering algorithm split these genes into two clusters was that cluster 32 had fewer interactions with cluster 3 (chromatin modification) than cluster 4 did. In contrast, genes biased toward interactions with cluster 32 but not with cluster 4 did not define clear pathways or groups.

Remarkably, genes involved in a number of well-characterized pathways, such as base-excision repair, nucleotide-excision repair, and mismatch repair, tended not to be present in either cluster 4 or 32 (Fig. 3 and SI Appendix, Fig. S1), and genes from these pathways were frequently divided between multiple clusters. The lack of clustering of these genes is consistent with their general paucity of genetic interactions relative to DSB repair genes in the absence of DNA damaging agents (Fig. 4 A and B and Dataset S5). However, the importance of these genes in the presence of DNA damaging agents is emphasized by the number of screens in which these genes were identified (Datasets S2 and S6) and is consistent with the known roles of these gene products. We note that the inability of unperturbed growth to capture the roles of these types of genes can be anticipated from decades of classic genetic studies as well as a recent report of changes in genetic interactions measured in a high-throughput manner due to the presence of MMS (31).

Experimental Validation of the Enrichment of Genome Stability Genes. We selected a subset of 87 genes from the final list of 1,041 genes to analyze their potential roles in suppressing GCRs. Given that some clusters might be more important for drug detoxification or export than genome stability per se, these genes were primarily selected from clusters that contained known GCR-suppressing genes. None of these genes had been tested for a role in suppressing GCRs at the time this analysis was initiated, although, subsequently, the results of studies of some of the selected genes have been reported by others. We also surveyed genes from a number of other clusters. Overall, we tested the effects of 87 different single-gene deletion mutations in our standard GCR assay that measures GCRs mediated by single-copy sequences (Table 1 and SI Appendix, Table S5) and found that 35 (40%) caused at least a modest but significant threefold or higher increase in the spontaneous GCR rate, which suggests a substantial enrichment for genes that suppress genome instability in the higher scoring clusters generated by the bioinformatic analysis. The presence of a newly tested gene in clusters 4, 32, 53, 55, 59, and 60 did not have a statistically significant bias for suppressing GCRs (P = 0.2, Fisher exact probability), and the presence of new genes in these clusters did not correlate with a higher GCR rate (P = 0.3, Mann–Whitney U test).

The newly identified genes that function in the suppression of genome instability could be divided into three classes. The first class of genes encoded subunits of complexes or components of pathways already known to be involved in maintaining genome stability. These included RMI1, which encodes a subunit of the Sgs1/Rmi1/Top3 complex (34); SAE2, which encodes a factor that acts in conjunction with the Mre11/Rad50/Xrs2 complex (35); RTT109, which is involved in the ASF1-dependent acetylation of K56 of histone H3 (36); and HST3, ARP8, RSC2, and HTA1, which function in chromatin assembly and remodeling pathways, processes known to prevent genome instability (37). The second class of genes has been implicated by other analyses as suppressing genome instability but had never been analyzed in the GCR assay at the time this analysis was initiated. This group of genes included CDC73, CLB2, CLB5, CSM3, DOT1, ESC2, MMS1, MRC1, MPH1, NUP84, NUP133, NUP60, REV7, RTT107 (ESC4), SLX5, SLX8, and TOF1. The third class of genes lacks known functions or has not previously been known to play a role in maintaining genome stability. These genes include CDH1, CTF4, DST1, HSL1, IRC3, IRC15, LRS4, PIN4, RTT105, RML2, and RPN10. Unlike the case of RPN10, which encodes a proteasome subunit (38), deletion of the nonessential proteasome-related genes NAS6, RPN4, and UMP1 did not increase the GCR rate (SI Appendix, Table S5).

In contrast, 52 identified genes did not increase rates in our standard GCR assay when mutated (SI Appendix, Table S5). A number of these genes were from clusters 4 and 32, which contained many genes that caused substantially increased GCR rates when mutated (Fig. 4E). Defects in some of these genes were previously reported to show significant numbers of genetic interactions with defects in DNA repair, including the genes encoding the Ard1-Nat1 N-terminal acetyltransferase complex (39) and the Get1–Get2 complex involved in transporting tail-anchored proteins to the endoplasmic reticulum (40). Moreover, a number of other genes that were previously implicated as functioning in DNA repair and DNA damage responses, including CCR4, WSS1, PPH3, DOA1, and CSM1, did not appear to act in suppressing GCRs. Although it is possible that these genes play no role in maintaining genome stability, it is also possible that they suppress GCRs not detected by our standard GCR assay. Multiple genes,

[Fig. 3 image. Annotation matrix for cluster 4. Column groups: Inclusion (GCR Rate, GCR Similar, Drug, Drug Similar); GCR; IRC; TL (A-, A+, G-, G+); Ty (Ty1-, Ty1+, Ty3-, Ty3+); CST (ALF, BiM, CTF); LOH (MET15, SAM2, MAT); DNA damaging agent sensitivity (4NQO, Carboplatin, Cisplatin, Oxaliplatin, Mechlorethamine, Streptozotocin, MMS, Angelicin, DMAEC, UV, Mitomycin c, HU, Camptothecin, Bleomycin, IR, Psoralen, Other). Rows: CSM3, MRC1, TOF1, CTF8, CTF18, DCC1, CTF4, POL32, RAD27, ASF1, MMS22, DIA2, RAD50, XRS2, MRE11, RAD52, RAD51, RAD54, RAD55, RAD57, SGS1, SRS2, RTT107, MMS1, RTT101, ELG1, MUS81, RAD53, RAD18, RAD5, TSA1, DUN1, CCS1, SOD1, SWI6, SLX5, SLX8, RTT109, NAT1, ARD1, POP2, CCR4, CDC20, YNG2, EPL1, ARP4, ESA1, EAF1, SIC1.]

Fig. 3. Annotated genes from cluster 4. The GCR rate column identifies mutations tested in the GCR assay: Circles were previously tested, squares were tested in this study, crosses were essential genes, solid symbols increased GCR rates as single mutants, half filled-in symbols only synergistically increased GCR rates in combination with other mutants, and open symbols did not increase GCR rates. “Inclusion” indicates if a gene was identified in the GCR rate (GCR Rate), genetic congruence to GCR genes (GCR Similar), DNA damaging agent (Drug), or genetic congruence to DNA damaging agent genes (Drug Similar) stage of the bioinformatics analysis. “IRC” indicates those genes causing increased recombination centers (48). “TL” indicates mutations identified in two telomere-length screens by Askree et al. (60) and Gatbonton et al. (61), with decreased (A-, G-) or increased (A+, G+) telomere lengths. “Ty” indicates mutations causing decreased (Ty1-, Ty3-) or increased (Ty1+, Ty3+) transposition (49, 62, 63). “CST” indicates mutations identified as affecting chromosome stability by several assays (64, 65). “LOH” indicates mutations increasing loss of heterozygosity by several assays (66). Sensitivity to each DNA damaging agent is indicated by vertical bars, with different treatments having alternate colors.
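The “GCR Similar” and “Drug Similar” inclusion columns rest on genetic congruence scores between 0 and 1, selected by a cutoff. The actual scoring formula is given in SI Materials and Methods and is not reproduced here, so the sketch below substitutes a Jaccard index between a gene's set of interaction partners and the combined partner set of the anchor genes. The Jaccard choice, the leave-one-out handling, the function names, and the toy gene labels are all assumptions made for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap of two sets, from 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def congruence_scores(interactions: dict, anchors: set) -> dict:
    """Score every gene against the combined signature of the anchors.

    interactions maps gene -> set of genetic interaction partners.
    An anchor gene is scored with itself left out of the signature,
    mirroring the leave-one-out checks described in the text.
    """
    scores = {}
    for gene, partners in interactions.items():
        signature = set()
        for anchor in anchors:
            if anchor != gene:
                signature |= interactions.get(anchor, set())
        scores[gene] = jaccard(partners, signature)
    return scores

def select(scores: dict, cutoff: float) -> list:
    """Genes whose score meets the cutoff, sorted by name."""
    return sorted(g for g, s in scores.items() if s >= cutoff)

# Hypothetical toy data; labels are placeholders, not real genes.
toy = {
    "anchor1": {"x", "y", "z"},
    "anchor2": {"y", "w"},
    "cand1": {"x", "y"},
    "cand2": {"q"},
}
toy_scores = congruence_scores(toy, {"anchor1", "anchor2"})
print(toy_scores)
print(select(toy_scores, 0.3))
```

In the paper the cutoff itself was set by random simulation (P < 0.01) rather than chosen by hand as in this toy call.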

[Fig. 4 image. Shared x axis: Gene Number (0–1,041), ordered by cluster, with clusters 4, 32, 53, 55, 59, and 60 and the unclustered group marked. (A) Interaction map; both axes labeled by Cluster Number. (B) GCR Congruence (×10^2). (C) Drug Congruence (×10^2). (D) Number of Drug Screens. (E) GCR Rate (10^-10 to 10^-7).]

Fig. 4. Overview of the clustering of the bioinformatically identified genes. (A) Binary interaction map showing the presence (black) or absence (white) of genetic interactions (Materials and Methods) between all 1,041 genes in the 74 clusters and the nonclustered group (horizontal) and 787 genes in the 74 clusters (vertical). (B) Genetic congruence score for each of the 1,041 genes with the GCR genes. Boundaries for clusters 4 and 32 are shown as vertical lines. (C) Genetic congruence score with the genes suppressing sensitivity to DNA damaging agents. (D) Number of DNA damaging agent screens in which different deletions of the 1,041 genes were identified. (E) GCR rates of single-gene deletion mutants. Genes with rates listed as “Low” in SI Appendix, Table S5 were arbitrarily assigned the WT GCR rate (3.5 × 10^-10) for display purposes.
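The clusters summarized in Fig. 4 came from agglomerative hierarchical clustering on congruence scores, with genes lacking shared interactions left in an unclustered group. The linkage rule and stopping criterion are specified in SI Materials and Methods rather than here, so the following is only a minimal sketch that assumes average linkage, a Jaccard similarity on interaction partner sets, and a hand-picked similarity threshold. All names and the toy profiles are hypothetical.

```python
from itertools import combinations

def jaccard(a: set, b: set) -> float:
    """Overlap of two sets, from 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def average_linkage(items, sim, threshold):
    """Greedy agglomerative clustering with average linkage.

    Repeatedly merges the two most similar clusters until the best
    average pairwise similarity drops below the threshold. Returns
    (clusters with 2+ members, singleton items left unclustered).
    """
    clusters = [[x] for x in items]

    def link(a, b):
        return sum(sim(x, y) for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > 1:
        pairs = combinations(range(len(clusters)), 2)
        i, j = max(pairs, key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        if link(clusters[i], clusters[j]) < threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    grouped = sorted(sorted(c) for c in clusters if len(c) > 1)
    unclustered = sorted(c[0] for c in clusters if len(c) == 1)
    return grouped, unclustered

# Hypothetical interaction profiles (partner sets) for five genes.
profiles = {
    "a": {1, 2, 3},
    "b": {1, 2, 3, 4},
    "c": {7, 8},
    "d": {7, 8, 9},
    "e": {20},
}
grouped, unclustered = average_linkage(
    list(profiles),
    lambda x, y: jaccard(profiles[x], profiles[y]),
    threshold=0.5,
)
print(grouped, unclustered)
```

In this toy case genes with overlapping partner sets merge pairwise, while the gene with no shared partners stays unclustered, which is the same qualitative split as the 787 clustered and 254 unclustered genes in the paper. A production analysis would more likely rely on an established clustering library than on this quadratic loop.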

including RAD6, do not suppress GCRs in the standard assay but do in other GCR assays (13, 14). Other genes, such as MRC1 and TEL1, play redundant roles in suppressing GCRs, and their role can only be observed when combined with other mutations (41) (Table 1), whereas other genes do not increase GCRs because they are required for producing GCRs.

Reiterating the Analysis with Newly Identified GCR Suppressors. The 35 newly validated GCR-suppressing genes from this analysis (see below) were combined with the initial list of 75 GCR-suppressing genes to generate a starting list of 110 GCR-suppressing genes. This newly identified set of starting genes was then reanalyzed by our bioinformatics pipeline. Two hundred twenty-three genes, rather than 227 genes from the original analysis, were identified as having statistically significant genetic congruence scores (score >0.046; P < 0.01), which included 67 of the 110 starting genes. Compared with the original analysis, 18 genes were omitted (CDC6, MSH2, PBY1, PEP3, PPM1, RDH54, RFA1, RNH201, RPN6, SAE2, SGO1, SLK19, SPT4, SUM1, THP2, UBP14, ULP1, and VPS36), and 14 genes were included (ASC1, CSE2, EAF5, HOS2, MNN10, NUP188, PHO23, RTT103, SEC22, SIF1, SIN3, SNF4, SRC1, and YKE2). Mutations in PBY1, SGO1, and SPT4, which were eliminated from the list, do not increase the GCR rate (SI Appendix, Table S5). Mutations in MSH2, RDH54, RFA1, and SAE2, which were also eliminated, cause only modest increases in the GCR rate as single mutations (msh2Δ, rdh54Δ, and sae2Δ; Table 1 and SI Appendix, Table S1), are complicated by their causing increased rates of point mutations in addition to GCRs (msh2Δ) (32), or are complicated by the existence of different hypomorphic alleles that cause different phenotypes (rfa1) (33). However, all these mutations were retained in this second analysis through their presence in the initial GCR list and/or by effects on sensitivity to DNA

E3256 | www.pnas.org/cgi/doi/10.1073/pnas.1216733109 Putnam et al.

Table 1. GCR rates of genome instability mutants implicated by bioinformatic analysis

Genotype*                 Systematic name    Strain     Cluster  No. of DNA damaging screens  Rate†
WT                        —                  RDKY3615   n.a.     n.a.                         3.5 × 10−10 (1)
esc2::HIS3                ydr363w            RDKY7030   32       11                           9.0 × 10−8 (257)
rmi1::HIS3                ypl024w            RDKY6242   32       6                            6.0 × 10−8 (189)
mrc1::TRP1, tof1::HIS3    ycl061c, ynl273w   RDKY7032   4, 4     37, 33                       2.6 × 10−8 (75)
slx8::HIS3                yer116c            RDKY7527   4        11                           2.6 × 10−8 (75)
slx5::HIS3                ydl013w            RDKY7524   4        9                            2.3 × 10−8 (66)
cdh1::HIS3                ygl003c            RDKY6485   21       4                            2.1 × 10−8 (58)
nup84::HIS3               ydl116w            RDKY6195   32       9                            1.6 × 10−8 (44)
rtt107::HIS3              yhr154w            RDKY7031   4        23                           9.4 × 10−9 (27)
rpn10::HIS3               yhr200w            RDKY6216   3        16                           9.0 × 10−9 (26)
rsc2::G418                ylr357w            RDKY6006   11       11                           4.8 × 10−9 (13)
mms1::HIS3                ypr164w            RDKY6206   4        32                           3.9 × 10−9 (11)
nup133::HIS3              ykr082w            RDKY6476   7        9                            3.7 × 10−9 (10)
rtt105::HIS3              yer104w            RDKY6673   32       0                            3.3 × 10−9 (9.4)
dst1::G418                ygl043w            RDKY7023   3        8                            3.0 × 10−9 (8.6)
arp8::HIS3                yor141c            RDKY5949   32       21                           2.9 × 10−9 (8.4)
irc15::HIS3               ypl017c            RDKY7024   29       0                            2.8 × 10−9 (8.0)
nup60::HIS3               yar002w            RDKY6489   7        11                           2.6 × 10−9 (7.4)
irc3::HIS3                ydr332w            RDKY7467   29       0                            2.4 × 10−9 (6.9)
csm3::G418                ymr048w            RDKY5708   4        46                           2.2 × 10−9 (6.3)
clb5::G418                ypr120c            RDKY7458   32       17                           2.2 × 10−9 (6.3)
rml2::HIS3                yel050c            RDKY7069   12       8                            2.2 × 10−9 (6.3)
hsl1::HIS3                ykl101w            RDKY6487   29       20                           1.9 × 10−9 (5.4)
tof1::HIS3                ynl273w            RDKY5135   4        33                           1.6 × 10−9 (4.6)
mph1::G418                yir002c            RDKY7026   34       60                           1.6 × 10−9 (4.6)
pin4::G418                ybl051c            RDKY7476   15       16                           1.6 × 10−9 (4.6)
rev7::G418                yil139c            RDKY7483   60       28                           1.5 × 10−9 (4.3)
sae2::HIS3                ygl175c            RDKY6234   55       62                           1.4 × 10−9 (4.0)
hst3::HIS3                yor025w            RDKY6060   14       26                           1.4 × 10−9 (4.0)
ctf4::G418                ypr135w            RDKY6018   4        11                           1.4 × 10−9 (4.0)
dot1::HPH                 ydr440w            RDKY7021   16       2                            1.4 × 10−9 (4.0)
hta1::HIS3                ydr225w            RDKY6490   15       13                           1.4 × 10−9 (4.0)
rtt109::HIS3              yll002w            RDKY6226   4        16                           1.4 × 10−9 (4.0)
cdc73::HIS3               ylr418c            RDKY6410   3        9                            1.3 × 10−9 (3.7)
lrs4::G418                ydr439w            RDKY7470   2        29                           1.2 × 10−9 (3.4)
clb2::G418                ypr119w            RDKY7456   29       45                           1.2 × 10−9 (3.4)

n.a., not applicable.
*Deletions constructed in RDKY3615 [MATα leu2Δ1 his3Δ200 trp1Δ63 ura3-52 ade2Δ1 ade8 lys2ΔBgl hom3-10 hxt13::URA3].
†Number in parentheses corresponds to fold increase in rate over the wild-type rate.
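The fold-increase values in parentheses in Table 1 are the mutant rate divided by the wild-type rate. The short Python sketch below illustrates that arithmetic for a few rows copied from the table. The names `WT_RATE`, `rates`, and `fold_increase` are illustrative and are not part of the IMOD package. Values recomputed from the rounded rates printed in the table can differ slightly from the published fold increases, which may have been derived from unrounded rates.

```python
# Wild-type GCR rate taken from the WT row of Table 1.
WT_RATE = 3.5e-10

# A few (genotype, rate) pairs copied from Table 1 for illustration.
rates = {
    "esc2::HIS3": 9.0e-8,
    "rtt107::HIS3": 9.4e-9,
    "clb2::G418": 1.2e-9,
}

def fold_increase(rate, wt_rate=WT_RATE):
    """Return the mutant rate expressed as a multiple of the wild-type rate."""
    return rate / wt_rate

for genotype, rate in rates.items():
    # Print each genotype with its fold increase over wild type.
    print(f"{genotype}: {fold_increase(rate):.1f}-fold over WT")
```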

damaging agents. Thus, these results suggest that adding more data will further refine the results.

Discussion

Systematic genetics using the S. cerevisiae deletion and hypomorphic allele collections has been well established. However, readily screening these mutants for complex phenotypes or phenotypes requiring involved quantitative assays, such as GCR assays, can be difficult and subject to significant error. Thus, we designed a bioinformatic protocol for identifying unanticipated genes involved in suppressing GCRs, which involved handling numerous genome-wide datasets affected by both false-positive and false-negative errors. This analysis identified genes that were successfully enriched for genes involved in genome stability, as evidenced by independent identification of most previously known genes (17–19) and by the experimental validation of 40% of 87 identified genes that were tested for a role in suppressing GCRs (Table 1 and SI Appendix, Table S5), including genes in unexpected pathways. Our analysis of these 87 genes resulted in the identification of more GCR-suppressing genes than resulted from three genome-wide screens involving the analysis of more than 14,000 mutants, which is consistent with a 200-fold enrichment in GCR-suppressing genes relative to the whole-genome screens. In addition, the observed gene validation frequency is likely to be higher than reported here because many mutations only cause increased GCR rates in conjunction with other mutations, such as tel1, or in segmental duplication GCR assays, such as rad6 (13, 41). A critical next step in our analysis is to test these mutations in multiple GCR assays that probe different chromosomal features and combine these mutations with other mutations. Importantly, this approach is generally applicable; these methods allowed the analysis to be performed multiple times as more data have become available, and the nature of the starting set of well-characterized genes and genome-wide screens need not be tied to the problem of genome stability and can readily accommodate RNAi data generated in mammalian systems.

Experimental verification of the implicated genes revealed a number of interesting pathways. Genome instability was increased by deletion of genes involved in synchronizing multiple phases of the cell cycle, including CLB2, CLB5, HSL1, PIN4, and particularly CDH1, which encodes a subunit of the APC/C complex that degrades its target proteins during mitosis and G1; this role for CDH1 is consistent with observations in vertebrates (42). Genes encoding two different subcomplexes of the nuclear pore, the Nup84 complex (NUP84, NUP133, NUP120, NUP145, NUP85, SEH1, and SEC13) (43) and NUP60, suppressed genome instability; studies

performed while this work was in progress suggest a role in suppressing accumulation of DSBs via sumoylation of DNA repair enzymes (44) and direct recruitment of DSBs to the nuclear pore (45). We also identified genes that may suppress GCRs by indirectly aiding DNA replication, including CTF4, which may link DNA synthesis to sister chromatid cohesion, and DST1, which potentially reduces collisions between RNA and DNA polymerases. In addition, we found a role for RPN10, which encodes a non-ATPase base subunit of the 19S regulatory particle of the 26S proteasome, in suppressing genome instability, suggesting that the proteasome may play roles in genome instability outside of nucleotide excision repair (38), consistent with recent reports linking the proteasome to DSB repair in S. cerevisiae (46) and vertebrates (47). Interestingly, deletion of other genes related to the proteasome, including DOA1, NAS6, UMP1, UBP6, and especially RPN4, which encodes a transcription factor that stimulates proteasome gene expression and has a similar genetic interaction profile to RPN10, did not cause increased GCR rates. Thus, the defect in rpn10Δ strains might involve a specific feature or function of the proteasome (or the regulatory particle) that is not affected by eliminating other nonessential proteasome components. Taken together, this bioinformatics procedure has successfully identified tested components of genome stability pathways, untested components of tested genome stability pathways, untested genome stability pathways, and genes in other pathways that are beginning to be implicated in suppressing genome instability. These successes encourage further characterization of genes whose roles in suppressing genome instability might currently be less clear, including IRC3, IRC15, RML2, and RTT105 (48, 49).

This bioinformatic scheme rested on three assumptions: (i) Systematically generated genome-wide data are of sufficient quality to be useful, (ii) novel genes that suppress GCRs share some phenotypes with known genes that suppress GCRs, and (iii) genetic interactions reported on the basis of change in nonperturbed growth provide a reasonable surrogate for other biological processes. The above assumptions are sufficiently true that combining these independent sources of information yielded unexpected genes of interest that were validated at high frequency. The most problematic assumption, however, was that genetic interactions based on growth phenotypes were a reasonable measure of similarity for roles in suppressing genome instability. One of the stronger counterexamples that can be cited is the observation that deletion of 12 of 31 tested genes in a high-scoring DNA damage cluster (cluster 4) did not cause increased GCR rates as single mutations. Sufficient genetic data exist for genes in cluster 4 to suggest that nonperturbed growth-based genetic interactions are only a crude surrogate for measuring similarity in suppressing GCRs, which is consistent with the substantial changes in synthetic lethal interactions between deletion mutations caused by DNA damaging agents (31). Additionally, because only pair-wise interactions are typically identified, other kinds of important genetic results cannot be identified, such as suppression of the lethality of srs2Δ sgs1Δ double mutants by mutations causing homologous recombination defects (50), and more complex genetic redundancies, which are particularly important in higher eukaryotes, cannot be handled. Together these factors argue that although these data can be extraordinarily useful in aggregate as we have demonstrated here, caution is called for in any attempt to use these kinds of data exclusively to derive biological pathways de novo. This is particularly true when growth is used as a surrogate marker for measuring a specific phenotype, because growth defects may not be directly related to the phenotype of interest. We are presently implementing an approach in which systematically generated double-mutant strains designed to query the enriched gene lists described here will be analyzed using multiple GCR assays to better define the pathways that suppress GCRs implied by the bioinformatic analysis presented here. We anticipate that human orthologs of verified GCR genes identified here will also play roles in suppressing genome instability and may be important for suppressing cancer initiation and progression.

Materials and Methods

Bioinformatic Analysis. The bioinformatic analysis described here has been implemented in the integration of multiple orthogonal datasets (IMOD) program package. IMOD and associated documentation and data files are available at http://sourceforge.net/projects/imod-gene. IMOD consists of command-line programs and shell scripts. IMOD readily compiles and runs in UNIX (uniplexed information and computing service) system-like operating systems. A detailed description of the methods is provided in SI Appendix, SI Materials and Methods.

Analysis of DNA damaging agent sensitivities. Mutations deemed as causing sensitivity to different DNA damaging agent treatments were included based on the recommendations of the authors of the individual studies (SI Appendix, Table S3). Deletions of genes deemed “dubious ORFs” by the Saccharomyces Genome Database that overlapped validated genes were treated as mutant alleles of the validated genes; for example, ybr099cΔ was treated as an mms4 mutation. The full list of overlaps used is available as part of the data distributed with the IMOD software. The pnhit P values for observing a mutation in n of the N DNA damaging screens were calculated using probabilities from 1,000,000 random simulations (SI Appendix, SI Materials and Methods). Whether the distribution of any particular mutation was significantly biased toward a group of screens, such as those belonging to a specific laboratory or a specific DNA damaging agent, was determined by a ratio test of likelihoods (SI Appendix, SI Materials and Methods).

Calculation of genetic distance and genetic congruence. Growth-based genetic interactions were measured using a modified BioGRID database derived from version 2.0.60 (51), including the interaction categories “synthetic lethality,” “synthetic growth defect,” and “haploinsufficiency,” as well as “phenotypic enhancement” data specifically derived from E-MAP studies (52–55). We also added 8,102 and 191,890 E-MAP interactions from additional studies published during the course of this analysis (56, 57). The interaction data were used to calculate genetic distances via the composite angle distance, which is similar to the Jaccard distance (58) but has a number of advantages for analysis of multiple genes (SI Appendix, SI Materials and Methods). We scored genetic congruence of each gene in the genome against the list of genes of interest using the composite angle distance and performed over 100,000 random simulations to calculate P values (SI Appendix, SI Materials and Methods).

Clustering. Genes were clustered on the basis of their genetic congruence using agglomerative hierarchical clustering (59) (SI Appendix, SI Materials and Methods).

Yeast Genetics. S. cerevisiae strains were constructed in the RDKY3615 background (MATa leu2Δ1 his3Δ200 trp1Δ63 lys2ΔBgl hom3-10 ade2Δ1 ade8 ura3-52 hxt13::URA3) using standard PCR-based mutagenesis methods. The media and protocol for strain propagation and measuring GCR rates were essentially as described previously (7).

ACKNOWLEDGMENTS. We thank Hans Hombauer, Jorritt Ensernik, Vincent Pennaneach, Ellen Kats, and Kyungjae Myung for the generous gift of S. cerevisiae strains. This work was supported by National Institutes of Health Grants GM26017 and GM085764.
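The genetic congruence scoring and simulation-based P values described in Materials and Methods can be illustrated with a minimal Python sketch. It is not the IMOD implementation. The composite angle distance is defined only in SI Appendix, so the sketch substitutes the plain Jaccard distance, which the text describes as similar. The congruence score (mean Jaccard similarity to the reference genes), the (hits + 1)/(n + 1) P value convention, and all gene and partner labels are assumptions made for illustration.

```python
import random

def jaccard_distance(a, b):
    """Jaccard distance between two sets of genetic interaction partners."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def congruence(gene, reference, profiles):
    """Hypothetical score: mean Jaccard similarity of `gene` to the reference genes."""
    others = [r for r in reference if r != gene]
    if not others:
        return 0.0
    total = sum(1.0 - jaccard_distance(profiles[gene], profiles[r]) for r in others)
    return total / len(others)

def empirical_p(gene, reference, profiles, n_sim=10_000, seed=0):
    """Estimate how often a random gene list of the same size scores at least as high."""
    rng = random.Random(seed)
    observed = congruence(gene, reference, profiles)
    pool = [g for g in profiles if g != gene]
    k = len([r for r in reference if r != gene])
    hits = 0
    for _ in range(n_sim):
        sample = rng.sample(pool, k)  # random reference list drawn without replacement
        if congruence(gene, sample, profiles) >= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)  # add-one convention avoids reporting P = 0

# Toy interaction profiles; every label here is made up.
profiles = {
    "query": {"x1", "x2", "x3", "x4"},
    "ref1": {"x1", "x2", "x3"},
    "ref2": {"x2", "x3", "x4", "x5"},
    "bg1": {"y1", "y2"},
    "bg2": {"y2", "y3"},
    "bg3": {"y1", "y4"},
    "bg4": {"x1", "y5"},
    "bg5": {"y6"},
    "bg6": {"y3", "y7"},
}
reference = ["ref1", "ref2"]

score = congruence("query", reference, profiles)
p_value = empirical_p("query", reference, profiles)
print(f"congruence score = {score:.3f}, empirical P = {p_value:.4f}")
```

In this toy set, the query shares most partners with both reference genes, so few random gene pairs score as well and the estimated P value is small. A real analysis would need the distance definition and simulation scheme given in SI Appendix.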

1. Hanahan D, Weinberg RA (2000) The hallmarks of cancer. Cell 100(1):57–70.
2. Loeb LA (2001) A mutator phenotype in cancer. Cancer Res 61(8):3230–3239.
3. Hoeijmakers JH (2001) Genome maintenance mechanisms for preventing cancer. Nature 411(6835):366–374.
4. Vessey CJ, Norbury CJ, Hickson ID (1999) Genetic disorders associated with cancer predisposition and genomic instability. Prog Nucleic Acid Res Mol Biol 63:189–221.
5. Soussi T, Ishioka C, Claustres M, Béroud C (2006) Locus-specific mutation databases: Pitfalls and good practice based on the p53 experience. Nat Rev Cancer 6(1):83–90.
6. van de Wetering CI, Horne MC, Knudson CM (2007) Chromosomal instability and supernumerary centrosomes represent precursor defects in a mouse model of T-cell lymphoma. Cancer Res 67(17):8081–8088.
7. Chen C, Kolodner RD (1999) Gross chromosomal rearrangements in Saccharomyces cerevisiae replication and recombination defective mutants. Nat Genet 23(1):81–85.
8. Pennaneach V, Kolodner RD (2004) Recombination and the Tel1 and Mec1 checkpoints differentially effect genome rearrangements driven by telomere dysfunction in yeast. Nat Genet 36(6):612–617.
9. Putnam CD, Pennaneach V, Kolodner RD (2004) Chromosome healing through terminal deletions generated by de novo telomere additions in Saccharomyces cerevisiae. Proc Natl Acad Sci USA 101(36):13262–13267.
10. Putnam CD, Pennaneach V, Kolodner RD (2005) Saccharomyces cerevisiae as a model system to define the chromosomal instability phenotype. Mol Cell Biol 25(16):7226–7238.
11. Pennaneach V, Kolodner RD (2009) Stabilization of dicentric translocations through secondary rearrangements mediated by multiple mechanisms in S. cerevisiae. PLoS ONE 4(7):e6389.
12. Kolodner RD, Putnam CD, Myung K (2002) Maintenance of genome stability in Saccharomyces cerevisiae. Science 297(5581):552–557.
13. Putnam CD, Hayes TK, Kolodner RD (2009) Specific pathways prevent duplication-mediated genome rearrangements. Nature 460(7258):984–989.
14. Putnam CD, Hayes TK, Kolodner RD (2010) Post-replication repair suppresses duplication-mediated genome instability. PLoS Genet 6(5):e1000933.
15. Chan JE, Kolodner RD (2011) A genetic and structural study of genome rearrangements mediated by high copy repeat Ty1 elements. PLoS Genet 7(5):e1002089.
16. Wang Y, et al. (2005) Mutation in Rpa1 results in defective DNA double-strand break repair, chromosomal instability and cancer in mice. Nat Genet 37(7):750–755.
17. Huang ME, Rio AG, Nicolas A, Kolodner RD (2003) A genomewide screen in Saccharomyces cerevisiae for genes that suppress the accumulation of mutations. Proc Natl Acad Sci USA 100(20):11529–11534.
18. Smith S, et al. (2004) Mutator genes for suppression of gross chromosomal rearrangements identified by a genome-wide screening in Saccharomyces cerevisiae. Proc Natl Acad Sci USA 101(24):9039–9044.
19. Kanellis P, et al. (2007) A screen for suppressors of gross chromosomal rearrangements identifies a conserved role for PLP in preventing DNA lesions. PLoS Genet 3(8):e134.
20. Stirling PC, et al. (2011) The complete spectrum of yeast chromosome instability genes identifies candidate CIN cancer genes and functional roles for ASTRA complex components. PLoS Genet 7(4):e1002057.
21. Jordan PW, Klein F, Leach DR (2007) Novel roles for selected genes in meiotic DNA processing. PLoS Genet 3(12):e222.
22. Leek JT, et al. (2010) Tackling the widespread and critical impact of batch effects in high-throughput data. Nat Rev Genet 11(10):733–739.
23. Beissbarth T, Speed TP (2004) GOstat: Find statistically overrepresented Gene Ontologies within a group of genes. Bioinformatics 20(9):1464–1465.
24. Ashburner M, et al.; The Gene Ontology Consortium (2000) Gene ontology: Tool for the unification of biology. Nat Genet 25(1):25–29.
25. Ye P, et al. (2005) Gene function prediction from congruent synthetic lethal interactions in yeast. Mol Syst Biol 1:2005.0026.
26. Tischler J, Lehner B, Fraser AG (2008) Evolutionary plasticity of genetic interaction networks. Nat Genet 40(4):390–391.
27. Daniel JA, Yoo J, Bettinger BT, Amberg DC, Burke DJ (2006) Eliminating gene conversion improves high-throughput genetics in Saccharomyces cerevisiae. Genetics 172(1):709–711.
28. Singh I, Pass R, Togay SO, Rodgers JW, Hartman JL, 4th (2009) Stringent mating-type-regulated auxotrophy increases the accuracy of systematic genetic interaction screens with Saccharomyces cerevisiae mutant arrays. Genetics 181(1):289–300.
29. Lehner KR, Stone MM, Farber RA, Petes TD (2007) Ninety-six haploid yeast strains with individual disruptions of open reading frames between YOR097C and YOR192C, constructed for the Saccharomyces genome deletion project, have an additional mutation in the mismatch repair gene MSH3. Genetics 177(3):1951–1953.
30. Game JC, et al. (2003) Use of a genome-wide approach to identify new genes that control resistance of Saccharomyces cerevisiae to ionizing radiation. Radiat Res 160(1):14–24.
31. Bandyopadhyay S, et al. (2010) Rewiring of genetic networks in response to DNA damage. Science 330(6009):1385–1389.
32. Myung K, Datta A, Chen C, Kolodner RD (2001) SGS1, the Saccharomyces cerevisiae homologue of BLM and WRN, suppresses genome instability and homeologous recombination. Nat Genet 27(1):113–116.
33. Chen C, Umezu K, Kolodner RD (1998) Chromosomal rearrangements occur in S. cerevisiae rfa1 mutator mutants due to mutagenic lesions processed by double-strand-break repair. Mol Cell 2(1):9–22.
34. Chang M, et al. (2005) RMI1/NCE4, a suppressor of genome instability, encodes a member of the RecQ helicase/Topo III complex. EMBO J 24(11):2024–2033.
35. Lengsfeld BM, Rattray AJ, Bhaskara V, Ghirlando R, Paull TT (2007) Sae2 is an endonuclease that processes hairpin DNA cooperatively with the Mre11/Rad50/Xrs2 complex. Mol Cell 28(4):638–651.
36. Marmorstein R, Trievel RC (2009) Histone modifying enzymes: Structures, mechanisms, and specificities. Biochim Biophys Acta 1789(1):58–68.
37. Myung K, Pennaneach V, Kats ES, Kolodner RD (2003) Saccharomyces cerevisiae chromatin-assembly factors that act during DNA replication function in the maintenance of genome stability. Proc Natl Acad Sci USA 100(11):6640–6645.
38. Reed SH, Gillette TG (2007) Nucleotide excision repair and the ubiquitin proteasome pathway—Do all roads lead to Rome? DNA Repair (Amst) 6(2):149–156.
39. Park EC, Szostak JW (1992) ARD1 and NAT1 proteins form a complex that has N-terminal acetyltransferase activity. EMBO J 11(6):2087–2093.
40. Schuldiner M, et al. (2008) The GET complex mediates insertion of tail-anchored proteins into the ER membrane. Cell 134(4):634–645.
41. Myung K, Datta A, Kolodner RD (2001) Suppression of spontaneous chromosomal rearrangements by S phase checkpoint functions in Saccharomyces cerevisiae. Cell 104(3):397–408.
42. García-Higuera I, et al. (2008) Genomic stability and tumour suppression by the APC/C cofactor Cdh1. Nat Cell Biol 10(7):802–811.
43. Lutzmann M, Kunze R, Buerer A, Aebi U, Hurt E (2002) Modular self-assembly of a Y-shaped multiprotein complex from seven nucleoporins. EMBO J 21(3):387–397.
44. Palancade B, et al. (2007) Nucleoporins prevent DNA damage accumulation by modulating Ulp1-dependent sumoylation processes. Mol Biol Cell 18(8):2912–2923.
45. Nagai S, et al. (2008) Functional targeting of DNA damage to a nuclear pore-associated SUMO-dependent ubiquitin ligase. Science 322(5901):597–602.
46. Ben-Aroya S, et al. (2010) Proteasome nuclear activity affects chromosome stability by controlling the turnover of Mms22, a protein important for DNA repair. PLoS Genet 6(2):e1000852.
47. Motegi A, Murakawa Y, Takeda S (2009) The vital link between the ubiquitin-proteasome pathway and DNA repair: impact on cancer therapy. Cancer Lett 283(1):1–9.
48. Alvaro D, Lisby M, Rothstein R (2007) Genome-wide analysis of Rad52 foci reveals diverse mechanisms impacting recombination. PLoS Genet 3(12):e228.
49. Scholes DT, Banerjee M, Bowen B, Curcio MJ (2001) Multiple regulators of Ty1 transposition in Saccharomyces cerevisiae have conserved roles in genome maintenance. Genetics 159(4):1449–1465.
50. Gangloff S, Soustelle C, Fabre F (2000) Homologous recombination is responsible for cell death in the absence of the Sgs1 and Srs2 helicases. Nat Genet 25(2):192–194.
51. Stark C, et al. (2006) BioGRID: A general repository for interaction datasets. Nucleic Acids Res 34(Database issue):D535–D539.
52. Collins SR, et al. (2007) Functional dissection of protein complexes involved in yeast chromosome biology using a genetic interaction map. Nature 446(7137):806–810.
53. Jessulat M, et al. (2008) Interacting proteins Rtt109 and Vps75 affect the efficiency of non-homologous end-joining in Saccharomyces cerevisiae. Arch Biochem Biophys 469(2):157–164.
54. Schuldiner M, et al. (2005) Exploration of the function and organization of the yeast early secretory pathway through an epistatic miniarray profile. Cell 123(3):507–519.
55. Wilmes GM, et al. (2008) A genetic interaction map of RNA-processing factors reveals links between Sem1/Dss1-containing complexes and mRNA export and splicing. Mol Cell 32(5):735–746.
56. Fiedler D, et al. (2009) Functional organization of the S. cerevisiae phosphorylation network. Cell 136(5):952–963.
57. Costanzo M, et al. (2010) The genetic landscape of a cell. Science 327(5964):425–431.
58. Jaccard P (1901) Distribution de la flore alpine dans le bassin des Dranses et dans quelques régions voisines. Bull Soc Vaud Sci Nat 37:241–272. French.
59. Xu R, Wunsch D, 2nd (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16(3):645–678.
60. Askree SH, et al. (2004) A genome-wide screen for Saccharomyces cerevisiae deletion mutants that affect telomere length. Proc Natl Acad Sci USA 101(23):8658–8663.
61. Gatbonton T, et al. (2006) Telomere length as a quantitative trait: Genome-wide survey and genetic mapping of telomere length-control genes in yeast. PLoS Genet 2(3):e35.
62. Griffith JL, et al. (2003) Functional genomics reveals relationships between the retrovirus-like Ty1 element and its host Saccharomyces cerevisiae. Genetics 164(3):867–879.
63. Irwin B, et al. (2005) Retroviruses and yeast retrotransposons use overlapping sets of host genes. Genome Res 15(5):641–654.
64. Ouspenski II, Elledge SJ, Brinkley BR (1999) New yeast genes important for chromosome integrity and segregation identified by dosage effects on genome stability. Nucleic Acids Res 27(15):3001–3008.
65. Yuen KW, et al. (2007) Systematic genome instability screens in yeast and their potential relevance to cancer. Proc Natl Acad Sci USA 104(10):3925–3930.
66. Andersen MP, Nelson ZW, Hetrick ED, Gottschling DE (2008) A genetic screen for increased loss of heterozygosity in Saccharomyces cerevisiae. Genetics 179(3):1179–1195.
