G3: Genes|Genomes|Genetics Early Online, published on July 3, 2019 as doi:10.1534/g3.119.400429
1
1
2 Single-gene deletions contributing to loss of heterozygosity in Saccharomyces cerevisiae:
3 genome-wide screens and reproducibility
4
5 Kellyn M. Hoffert* and Erin D. Strome*
6 *Department of Biological Sciences, Northern Kentucky University, Highland Heights, Kentucky,
7 41099
8
9
10
11
12
13
14
15
16
17
18
19
© The Author(s) 2013. Published by the Genetics Society of America. 2
20
21 S. cerevisiae heterozygous LOH screen
22 Keywords: S. cerevisiae, loss of heterozygosity, MAT, MET15, screen comparison
23 Erin D. Strome, FH359G Nunn Drive, Highland Heights, KY 41099, 859-572-6635,
25
26
27
28
29
30
31
32
33
34
35
36
37 3
38 ABSTRACT
39 Loss of heterozygosity (LOH) is a phenomenon commonly observed in cancers; the loss of
40 chromosomal regions can be both causal and indicative of underlying genome instability. Yeast
41 has long been used as a model organism to study genetic mechanisms difficult to study in
42 mammalian cells. Studying gene deletions leading to increased LOH in yeast aids our
43 understanding of the processes involved, and guides exploration into the etiology of LOH in
44 cancers. Yet, before in-depth mechanistic studies can occur, candidate genes of interest must
45 be identified. Utilizing the heterozygous Saccharomyces cerevisiae deletion collection (≈ 6500
46 strains), 217 genes whose disruption leads to increased LOH events at the endogenously
47 heterozygous mating type locus were identified. Our investigation to refine this list of genes to
48 candidates with the most definite impact on LOH includes: secondary testing for LOH impact at
49 an additional locus, gene ontology analysis to determine common gene characteristics, and
50 positional gene enrichment studies to identify chromosomal regions important in LOH events.
51 Further, we conducted extensive comparisons of our data to screens with similar, but distinct
52 methodologies, to further distinguish genes that are more likely to be true contributors to
53 instability due to their reproducibility, and not just identified due to the stochastic nature of LOH.
54 Finally, we selected nine candidate genes and quantitatively measured their impact on LOH as
55 a benchmark for the impact of genes identified in our study. Our data add to the existing body of
56 work and strengthen the evidence of single-gene knockdowns contributing to genome instability.
57
58
59 4
60 INTRODUCTION
61 Genome instability underlies a multitude of changes observed during tumorigenesis. The
62 accumulation of mutations, both as drivers and as a result of tumor formation, give important
63 insight into cancer progression (Loeb et al. 2003). For most genes conferring a protective effect
64 against tumorigenesis, loss of function alterations need to occur in both alleles before the
65 development of cancer phenotypes (Kacser and Burns 1981; Knudson 1993). However, some
66 genes have been discovered to impart haploinsufficient effects whereby a mutation event in only
67 one allele leads to an abnormal cellular phenotype (Veitia 2002; Trotman et al. 2003; Alimonti et
68 al. 2010; Berger et al. 2011). In most cancers, the accumulation of mutations leading to
69 tumorigenesis does not happen only at the single nucleotide level, but with large chromosomal
70 gains or losses (Lengauer et al. 1998; Loeb et al. 2003; Giam and Rancati 2015; Gao et al.
71 2016). Loss of heterozygosity (LOH) events are one such type of genome instability implicated
72 in tumorigenesis and can arise from a myriad of underlying mechanisms, Fig 1, (Knudson 1971;
73 Vladusic et al. 2010; Choi et al. 2017) for additional review: (Levine 1993; Thiagalingam et al.
74 2002; Payne and Kemp 2005).
75 Identifying driver mutations responsible for LOH events has proven a difficult task in
76 mammalian cells; by the time tumors are large enough to be detected and examined, tens if not
77 hundreds to thousands of mutations have occurred. Utilizing Saccharomyces cerevisiae for
78 studying genomic instability mechanisms, including LOH, has long been an invaluable resource
79 in furthering our understanding of molecular underpinnings (for review see (Mager and
80 Winderickx 2005; Botstein and Fink 2011)). Much of the knowledge gained through yeast can
81 then be applied to higher eukaryotic organisms.
82 With the creation of the S. cerevisiae gene knockout collections, large scale surveys for
83 genes implicated in a particular phenotype can be systematically conducted (for review (Giaever 5
84 and Nislow 2014)). The 6,477 strains included in the heterozygous deletion collection (hereafter
85 referred to as SCDChet) target both essential and non-essential genes. In this screening, we
86 use SCDChet to systematically screen for genes with a haploinsufficient effect resulting in
87 increased LOH events. We first utilize the mating type (MAT) locus, on chromosome III, which is
88 endogenously heterozygous, MATa/MATα, in diploid yeast cells. The non-mating diploid
89 phenotype results from a co-repressive mechanism driven by the MATa and MATα alleles
90 (Duntze et al. 1970; Herskowitz 1995; Schmidlin Tom et al. 2007; Haber 2012). When an LOH
91 event occurs at this locus, the co-repressible mechanism is no longer active to prevent mating,
92 and the diploid cell mates as if it were haploid. As a secondary assay to confirm increases in
93 LOH events, mutants identified as top-hits for LOH at the MAT locus were tested for increases
94 in LOH at the separate MET15 locus on chromosome XII. The SCDChet is heterozygous at
95 MET15 (met15Δ/MET15) and a color-identifying sectoring assay allows LOH at this locus to be
96 readily measured (Cost and Boeke 1996).
97 The MAT and MET15 loci have been exploited as markers in other LOH screens interested
98 in identifying heterozygous and/or homozygous knockouts that increase genome instability
99 (Yuen et al. 2007; Andersen et al. 2008; Choy et al. 2013). Our goals were therefore two-fold:
100 first to identify genes with haploinsufficiency impacts on genome stability, and secondly, in
101 performing a variety of follow-up analyses, to identify those top-hits with the most potential to be
102 cancer susceptibility genes. These follow-up analyses include gene ontology characterizations,
103 positional gene enrichment identification, and multiple comparisons to other screens interested
104 in identifying genes with impacts on genome stability, to determine the enrichment and
105 reproducibility of gene identification. Thereby not only giving insight into candidate cancer
106 susceptibility genes that have a haploinsufficient impact, but also deepening our understanding
107 of the differences in results due to variations in screening set-ups and the stochastic nature of
108 LOH events. By extensively documenting the methodology used, as well as comparing datasets 6
109 across multiple screens, we aim to sort through much of the noise of screen results and avoid
110 the far too common crisis of reproducibility (Yamada and Hall 2015; Drucker 2016).
111
112 MATERIALS AND METHODS
113 Creation of mating tester haploids
114 One MATa haploid (MATa: his3-∆200, trp1-∆1, CF:[ura3: TRP1 SUP11 CEN4 D8B],
115 leu2-3, lys2-801, ura3-52, ade2-101) and one MATα haploid (MATα: his3-∆200, trp1-∆1,
116 CF:[ura3: TRP1 SUP11 CEN4 D8B], leu2-3, ura3-52, ade2-101) were transformed using lithium
117 acetate transformation with a HIS3MX cassette, carrying the Schizosaccharomyces pombe
118 HIS5 gene, to replace the wildtype CAN1 locus. This resulted in the production of haploid
119 strains EDS585a (MATa: his3-∆200, trp1-∆1, CF:[ura3: TRP1 SUP11 CEN4 D8B], leu2-3, lys2-
120 801, ura3-52, ade2-101, can1::HIS5) and EDS588α (MATα: his3-∆200, trp1-∆1, CF:[ura3: TRP1
121 SUP11 CEN4 D8B], leu2-3, LYS2, ura3-52, ade2-101, can1::HIS5) hereafter referred to as
122 585a and 588α.
123 Heterozygous deletion collection (SCDChet) screen for LOH at the MAT locus
124 The heterozygous deletion collection (6,477 strains) was purchased from Open
125 Biosystems (YSC1055). Three initial copies of the collection were made (200 μL YPD + G418
126 (0.2 mg/mL)); 10 μL of cells were transferred to a 96-well plate, grown overnight at 30°, 60 µL of
127 67% glycerol was added and plates were kept were frozen at -80°. Haploid strains 585a and
128 588α were struck from freeze-down stocks and grown for 3 days. One colony from each strain
129 was inoculated individually into 100 mL SC-His and grown overnight on a shaker. On the same
130 day, deletion collection plates were thawed and new copies were made of each plate grown in
131 200 μL YPD + G418 (0.2 mg/mL) with 20 μL starting inoculum of deletion collection cells. These 7
132 copies were incubated overnight at 30° to allow LOH events to occur. The deletion collection
133 cells were mixed with either 585a or 588α haploid cells (200 μL YPD, 10 μL SCDChet cells, 10
134 μL haploid cells) and grown for 24 hours at 30° to allow mating events to occur. The cells were
135 then spun down at 2,500 rpm for 4 minutes, the supernatant was removed, and the cells were
136 washed with 200 μL ddH2O. The cells were again spun down at 2,500 rpm, the supernatant was
137 removed, and the cells were resuspended in 200 μL ddH2O. Resuspended cells were pinned to
138 SC-His + G418 (0.4 mg/mL) in duplicate and incubated for 48 hours at 30° before colony counts
139 were taken. A scoring system was utilized to record colony count ranges. Pinned spots with no
140 colonies were recorded as “0,” 1-9 colonies were scored as “+,” 10-19 as “++,” 20-29 as “+++,”
141 and any colony counts above 30 were scored as “++++” (Fig 2C). Two complete trials for each
142 haploid mater were conducted, utilizing different copies of the SCDChet collection for each trial.
143 The methodology used in this screening was verified through a series of control
144 experiments. 585a and 588α were put through the screening protocol as described above, but
145 without SCDChet cells introduced to the samples. Haploid tester strains were grown in duplicate
146 and 20 µL samples were placed into 96-well plates. None of the 192 wells per haploid pinned
147 onto SC-His + G418 (0.4 mg/mL) contained colonies, indicating that false positives of the
148 haploid testers growing alone on double selection plates are unlikely. To screen for false
149 positives originating from SCDChet cells, five SCDChet plates were selected at random, and put
150 through the screening protocol above without the addition of a haploid tester. Of the 413 unique
151 strains pinned, 23 samples contained colonies. Fourteen of these strains were identified as
152 producing lawn growth, not single colonies, interpreted as their starting with a genotype
153 conferring an ability to grow on SC-His + G418 (0.4 mg/mL) without needing to mate with 585a
154 or 588α. (When assayed again in the full screen, these 14 wells continued to show lawn growth
155 when incubated with either the 585a or 588α strains.) Therefore, these wells were excluded as
156 top-hits, as well as any other wells that showed lawn growth, after incubation with both 585a 8
157 and 588α. The wells identified with this characteristic can be found in Table S1. Of the
158 remaining nine wells that contained colonies, not one contained a background level of growth to
159 score above a “++” on the scale established. This indicates a low level of false positives –
160 approximately 2% - if classifying any amount of growth as a positive hit. This then informed our
161 decision on a threshold for ‘top-hits’ from our screen (as discussed later), ensuring these types
162 of events are not enough to warrant categorization as a ‘top-hit’ to mark a gene as one of
163 interest. 585a and 588α were also mated with control haploids containing KANMX cassettes to
164 determine false negative rates. Two different control MATa and two different control MATα
165 strains, with KANMX cassettes inserted at different locations in the genome, were utilized. All
166 conditions were kept the same as the screening described above, but instead of inoculating
167 SCDChet cells, the KANMX-containing haploid controls were inoculated into 100mL YPD +
168 G418 (0.2mg/mL). Opposite mating types were paired. All pairings, 32 samples in total, resulted
169 in uncountable lawn growth or colony counts greater than thirty (++++ score). This verifies strain
170 mating occurs in the incubation time allotted, scoring at the top level of growth, and giving no
171 indication of false negatives.
172 Secondary screen for LOH at MET15
173 Methodology was adapted from previous screenings (Ono et al. 1991; Cost and Boeke
174 1996; Andersen et al. 2008); briefly, the 217 top-hits identified in the MAT locus LOH assay
175 were additionally evaluated to determine if LOH was also increased at a secondary locus. When
176 cells lacking functional MET15 are placed on plates containing lead (Pb2+), the excess sulfur
177 precipitates as dark-colored lead (II) sulfide (PbS) (Cost and Boeke 1996). Cells must contain
178 at least one functional copy of MET15 to remain white in appearance; cells that have undergone
179 LOH at MET15 and no longer maintain a functional copy will appear as a black/brown sector.
180 Two microliters of each SCDChet top-hit were inoculated into 200µL YPD and incubated for
181 three days at 30° to allow LOH events to occur. Cells were then pinned to plates in triplicate 9
182 containing 0.7mg/mL lead nitrate (for full plate recipe see Guide to Yeast Genetics and
183 Molecular Cell Biology, Elsevier, 2002), and allowed to grow at 30° for four days. Plates were
184 then placed at 4° for 24 hours to aid in color development before sectors were counted. As with
185 other published methodologies (Ono et al. 1991; Cost and Boeke 1996; Andersen et al. 2008),
186 pinned samples grew as patches, not individual colonies, and sectors are counted as the total
187 number of dark growth regions within the larger patch. The screen was run twice (two biological
188 replicates); strains were scored as positive for LOH at MET15 if an average of more than 13
189 sectors appeared in each replicate across the 6 replicates when analyzed at 24x magnification.
190 Eight mutants (KIN3, MET17, STH1, MVD1, HRB1, BSC5, VAM6, and YKR073C) were unable
191 to be analyzed for sectoring due to presenting as entirely brown colonies from the initial pinning
192 to the lead-containing plate.
193 The methodology used in this screening was verified through a series of control
194 experiments. To determine the baseline rate for LOH at MET15 for the SCDChet collection, we
195 utilized two biological replicates of a plate chosen from the SCDChet collection that contains no
196 strains identified as increasing LOH at MAT in our primary screening assay. In an attempt to
197 ensure that LOH events that happened earlier, and thus resulted in a larger sector portion, were
198 not unduly overrepresented in our results due to increased visibility, we utilized a dissecting
199 microscope for analysis of sectors. Sectors were counted at 24x magnification and averaged
200 across all 94 strains on the plate for 3 technical replicates. Analysis at 24x magnification
201 revealed a high level of sectoring in these strains with the average at 11 sectors. Previous
202 screens utilizing the MET15 locus and analyzing sector appearance did so without magnification
203 and set their threshold for classification as a hit at 2 sectors (Andersen et al. 2008). We
204 therefore set our top-hits threshold level at +2 over the average seen in this control set of
205 analyzed strains. The top-hit threshold was therefore set at having 13 or more sectors as
206 indicating an increase in LOH over the baseline. 10
207 It has been previously documented (Bae et al. 2017) that the SCDChet strain that
208 contains the knockout for YLR303W/MET17 (alias of MET15) is genotyped as
209 met15∆::kanMX/met15∆. Due to its lack of functioning MET15, all cells of this strain plated on
210 Pb2+-containing plates should appear entirely black/brown. This phenotype was confirmed in all
211 replicates (two biological, three technical) where this strain was subjected to the screening
212 protocol. This strain serves as a positive control for this secondary screening.
213 Fluctuation Analysis Benchmarking of Top Candidate Gene Deletions
214 The parental strain used to create SCDChet was ordered from Open Biosystems
215 (BY4743). This strain underwent lithium acetate transformation with a can1::HIS3 cassette in
216 order to render the CAN1 locus heterozygous. Mutant strains containing the heterozygous
217 knockouts of SSA1, CCT7, MRC1, RSM7, HSL7, SSE2, PUG1, YOR364W, and MEC1 were
218 grown from SCDChet and subjected to the same can1::HIS3 switch out via lithium acetate
219 transformation. Strains were then struck for individual colonies and grown at 30° until they
220 reached ~3mm, allowing LOH events to occur. Twenty-four individual colonies per strain were
221 then resuspended in water, assessed for optical density, and the 15 colonies with the most
222 similar size, as read by absorbance at 562nm greater than 0.5, were used for analysis. The rate
223 of LOH events resulting in a change to the CAN locus were determined by plating dilutions on
224 non-selective (YPD for population size) and SC-Arg- plus canavanine (60 μg/mL) plates (for
225 LOH events). Plates were grown for 3-5 days at 30°, followed by colony counting. Fluctuation
226 analysis for LOH rate with 95% confidence intervals (CI) were calculated utilizing the R
227 advanced calculation package Salvador (rSalvador) (Zheng 2002, 2008, 2016). The LOH rates
228 and confidence intervals were measured for two biological replicates for each strain.
229
230 Data Availability Statement 11
231 Strains are available upon request. Table S1. Table of complete MAT locus LOH screen data.
232 Table S2. Top hits identified in MAT locus LOH screen. Table S3. Sector counts of the 217
233 SCDChet gene deletions analyzed in secondary MET15 screen. Table S4. GO Slim Mapper
234 results for the 217 identified genes in the MAT locus screen. Table S5. GO Slim Mapper results
235 for the 100 identified genes in the MET15 secondary screen. Table S6. GO Slim Mapper and
236 GO Term Finder results for the 91 genes identified in two or more related independent studies.
237 Table S7. Positional Gene Enrichment analysis of the 217 MAT Top Hits, 100 MAT + MET15
238 Top hits, and 91 genes identified in 2+ independent studies. Table S8. Human homologs and
239 their associated cancer types for genes identified by the MAT locus screen. Table S9. Human
240 homologs and their associated cancer types for genes identified by multiple independent
241 studies. Table S10. Positional Gene Enrichment analysis of genes identified by similar
242 independent studies. Figure S1. Visual representation of Positional Gene Enrichment data for
243 enriched region on chromosome II. All supplemental files are available in the GSA Figshare
244 portal.
245
246 RESULTS
247 MAT Locus Screen For Genes With Haploinsufficiency Effects On LOH
248 To identify genes that, when heterozygously mutated, resulted in increased incidence of
249 LOH at the MAT locus, the SCDChet collection was screened. SCDChet cells were grown in 96-
250 well plates and paired separately with MATa and MATα haploid tester strains. When pinned
251 onto double selection media, only cells that had mated and formed triploids grew (Fig 2A and
252 2B). The entire collection was screened twice (biological replicates) with technical replicates
253 performed each time. Furthermore, this was done with both MATa and MATα haploid maters
254 resulting in 8 data points for each SCDChet strain. Each trial stamped on SC-His + G418 plates 12
255 was counted independently for growth and assigned a score of 0, +, ++, +++, or ++++ (Fig 2C).
256 The scores of all four replicates were summed for an SCDChet strain with given haploid pairing;
257 any combination of “+” scores adding to 12 or higher were further analyzed to be considered as
258 a ‘top-hit’. For comparison, across the whole SCDChet collection, the average amount of growth
259 seen on the double-selection plates scored as a single “+” for any one spot, and a summed plus
260 score of three across the four replicates. Further, any pairing that resulted in a score of twelve,
261 but contained a replicate that scored “++” or below was removed from top-hit consideration due
262 to the possibility of it being a false positive. Strains that were annotated in the SCDChet
263 database provided by Open Biosystems as having a phenotype related to mating were excluded
264 from consideration as a top-hit, as their ability to mate did not represent an LOH event, but an
265 inherent characteristic of that particular mutant. This screening mechanism resulted in 217
266 heterozygous gene mutations, approximately 3.4% of the genome, being scored as top-hits
267 (Table S2). For listing of all scores of all strains from this study, see Table S1.
268 Secondary Screen For Haploinsufficiency Effects On LOH At An Alternate Locus (MET15)
269 To further understand the extent of the impact of these 217 top-hit heterozygous gene
270 mutations on LOH events, these strains were screened for their LOH impacts at an alternate
271 secondary locus; the MET15 gene on chromosome XII. Each identified top-hit SCDChet strain
272 was grown up in a 96-well plate for three days, pinned to lead-containing plates, then grown up
273 into a patch over four days until color developed. When pinned onto plates containing Pb2+, cells
274 within a patch that have undergone an LOH event will appear as black/brown sectors. All 217
275 top-hit SCDChet strains were tested twice (biological replicates) and stamped onto lead-
276 containing plates in triplicate each time (technical replicates), resulting in 6 sector counts for
277 each mutant strain. One hundred of our initial 217 top-hit heterozygous gene deletions induced
278 increased LOH events at the secondary MET15 locus. This indicates that ~48% of our intiallially
279 identified heterozygous gene mutaitons have large reproducible effects on LOH at multiple loci, 13
280 representing 1.5% of the genome. Eight of the initial 217 strains were unable to have sectors
281 counted for LOH at MET15 due to presenting a brown phenotype throughout the entire patch.
282 See Table S3 for MET15 sectoring data for each tested strain.
283 Multiple Screen Comparison To Identify Reproduced Results From Independent Studies
284 Assessing LOH And/Or Haploinsufficiency Induced Genome Instability
285 To further expand our analysis of the reproducibility of gene identification, we compared
286 our results to previously reported screens that asked similar questions about genes contributing
287 to genome instability. Screens were selected for comparison based on a) utilizing heterozygous
288 knockouts to screen for genome instability, b) screening homozygous knockouts specifically for
289 LOH, or c) a combination of the two (for a visual representation of screen selection, see Fig 3A).
290 Choy et al. The greatest number of overlapping hits, 26, were seen when our results from the
291 MAT locus screen were compared to a screen that most mirrored our own, having utilized the
292 SCDChet to screen for LOH at the MAT locus, Fig 3B, (Choy et al. 2013). This overlap shows
293 statistical significance at the level of p = 3.995 x 10-5 when a hypergeometric probability was
294 calculated using a normal approximation, Table 1. (Ten of the overlapping genes, APD1, SSE2,
295 HSM3, FRM2, MRH1, LSM4, TMN3, PUG1, RPL36B, and CCT7, also reproduced in our
296 secondary screen for increasing LOH at MET15). This screen arrayed four samples from each
297 well of the SCDChet onto solid YPD media with either a MATa or MATα haploid tester to allow
298 for mating. The mating results for each pinned position were then summed to result in a score of
299 0-4 for each. The entire screen was performed in triplicate, strains that had a score ≥ 2 SD
300 above the mean were included as top-hits.
301 Strome et al. Previously an independent screen was conducted using random insertional
302 mutagenesis generated heterozygous knockouts, to identify gene mutations involved in genome
303 instability, specifically chromosome transmission fidelity (CTF) (Strome et al. 2008). 14
304 Heterozygous knockouts containing a chromosome fragment (CF) allowed for visual
305 identification of fragment loss due to insertional mutagenesis induced genome instability (Hieter
306 et al. 1985). The SCDChet was not used in this screening. Of the 164 hits identified, 4 of them
307 (2.4% of their 164) overlap with genes we have identified for increasing MAT LOH, representing
308 1.8% of our dataset, Fig 3B. Two of the identified gene deletions – RLI1 and BPH1 – also
309 increased LOH at MET15.
310 Andersen et al. Screens that assayed for LOH events even if not utilizing heterozygous
311 deletions can still provide valuable insight. This study utilized the homozygous deletion
312 collection (SCDChom) and looked at LOH events at three different loci. Their primary screen
313 utilized the intrinsic heterozygosity at the MET15 locus in SCDChom to measure LOH. To
314 further examine genes of interest identified in their initial screen, they constructed a strain with
315 Many Heterozygous Markers (MHM) on chromosomes III, IV, and XII to understand the extent
316 of the events at various loci. From the initial MET15 screening in SCDChom, they identified 132
317 gene deletions that resulted in increased LOH. They were able to successfully recreate 114 of
318 these knockouts in their MHM background, which they screened again for MET15 loss at
319 elevated rates. Of the 114 MHM knockouts, 61 of them again demonstrated elevated LOH at
320 MET15. These 61 knockout strains were then examined further for the extent of their LOH
321 activity at two additional loci, SAM2 and MAT. Additional assays identified 26 of these genes
322 with effects increasing LOH events across three independent loci. In comparing all genes
323 identified as increasing LOH from their screening data, we observe three overlaps with our gene
324 list, SLX5, IRA2 and ICE2, Fig 3B. A representation factor (Rf) calculated for this overlap
325 indicates a value greater than 1, (Rf=1.4), indicating more overlap than expected, although not
326 at a p-value <0.05, Table 1. Our secondary MET15 screen also identified SLX5 and ICE2
327 knockdowns as contributing to increased MET15 LOH. These three genes represent a 4.9% 15
328 overlap in their dataset and 1.7% of our data (when corrected to remove essential genes not
329 identifiable in their screening method).
330 Yuen et al. Another screen looking at LOH events due to homozygous mutations once again
331 utilized SCDChom but examined the presence of the bi-mater (BiM) phenotype (among other
332 genome instability assays) (Yuen et al. 2007). The BiM assay measures LOH events at the MAT
333 locus that allow for mating with haploid testers of both mating types. When comparing top-hits
334 between our screens, 5 genes overlap as having elevated levels of LOH, Fig 3B. A
335 representation factor (Rf) calculated for this overlap indicates a value greater than 1, (Rf=1.2),
336 indicating more overlap than expected, although not at a p-value <0.05, Table 1. This
337 represents 2.8% of top-hit data from our screen (corrected to remove essential genes not
338 identifiable with the SCDChom), and 4.1% of overlap from their list of identified genes that lead
339 to a BiM phenotype. Two genes identified in both screens – IML3 and SGO1 – reproduced in
340 our secondary MET15 screening.
341 Schmidlin et al. A third screen utilizing SCDChom again analyzed mating capabilities with
342 haploid testers to examine LOH (Schmidlin Tom et al. 2007). Mating pairs of SCDChom cells
343 and a haploid tester were pinned to plates of minimal medium that allowed for triploid selection;
344 growth of four or more colonies in a sample was considered a positive hit. One hundred
345 homozygous gene deletions were identified in this initial screen, and seven of those overlap with
346 top-hits found in our screening, Fig 3B. A representation factor (Rf) calculated for this overlap
347 indicates a value greater than 1, (Rf=2.0), indicating more overlap than expected, although at a
348 p-value = 0.06, Table 1. Only including our non-essential hits, this represents 3.9% of our
349 dataset, and 7% of the strains initially identified in their screening. Eighty-nine homozygous
350 deletions were then remade by mating the corresponding deletion collection haploid strains to
351 make new homozygous diploids in the same background. In assays to confirm the mating
352 phenotype, six reconfirmed. One of the six genes they identified overlapped with one of our hits 16
353 – IST3. IST3, as well as RSM7 (one of the genes identified in their initial screen), were identified
354 in our secondary screen as increasing the rate of LOH at MET15.
355 While the stochastic nature of LOH events, as well as the differences in the instability
356 phenotypes being assayed, contribute to the limited overlap between screens, genes that
357 consistently appear as top-hits in screens interested in similar instability mechanisms provide
358 interesting avenues for further investigation. Fig 3B shows the results of the comparison of each
359 of these individual screens to each other and the 91 genes found minimally in two independent
360 publications, seven of which were identified in three studies. Furthermore in the current climate
361 of questions surrounding reproducibility we are pleased to see a significant level of overlap
362 between our screen results and the Choy et al. screening for MAT locus LOH with SCDChet
363 strains; approximately 11.9% of our top-hit genes were identified in their results and this
364 represents 7.8% of their dataset.
365 Gene Ontology Analysis for Identified Enrichment Categories
366 To achieve a primary understanding of the functions carried out by members of our first
367 list of top-hits, we analyzed this 217-member MAT locus LOH gene list using two Gene
368 Ontology (GO) tools: Saccharomyces Genome Database (SGD) GO Slim Mapper and GO Term
369 Finder. GO Slim Mapper provides an overview of broader parent terms that a gene can be
370 mapped to, selected by SGD curators, and does not automatically generate enrichment p-
371 values based on its grouping of genes into categories. Fisher’s exact test, with the Benjamini-
372 Hochberg correction for false discovery rates (FDR) (Q = 0.05) were selected and applied
373 based on their documented use for ontology analysis (Benjamini and Hochberg 1995; Sabatti et
374 al. 2003; Rivals et al. 2007). Significant results from SGD GO Slim Mapper (p-value < 0.05, and
375 p < Benjamini-Hochburg critical value) are summarized in Table 2A. Four ontology varieties of
376 SGD GO Slim Mapper were utilized: Cellular Component, Molecular Function, Biological 17
377 Process, and Macromolecular Complex. Alternatively, SGD GO Term Finder selects the most
378 granular term for each gene within a query, providing as detailed of an analysis as currently
379 available based on the literature (Christie et al. 2009). GO Term Finder uses a binomial
380 distribution to calculate p-values corrected for multiple comparison analysis, and there are three
381 varieties of annotations: Cellular Component, Molecular Function, and Biological Process. Many
382 of the categories found in GO Slim Mapper Macromolecular Complex are absorbed into the
383 more specific GO Term Finder Cellular Component, but GO Term Finder and GO Slim Mapper
384 analysis were kept separate due to the nature of the algorithms and the statistical analysis of
385 each dataset.
386 The SGD Slim Mapper Gene Ontology tool reported two significantly enriched groups
387 after FDR correction. For the Molecular Function ontology, Molecular Function Unknown (padj =
388 4.997x10-4) was identified as enriched with 88 genes from the 217 top-hits list lacking specific
389 information on their molecular function. Within the Biological Process ontology, Chromosome
-4 390 Segregation (padj = 4.605x10 ) was found to be overrepresented with 17 genes classified in this
391 group. To determine if a further understanding of the relationships between the top-hits
392 identified in this screen could be found using more explicit ontology terms, GO Term Finder
393 results were analyzed. The only ontology category that was found to be enriched through GO
394 Term Finder was Cellular Component – Chaperonin-containing T-complex (CCT-complex) (padj
395 = 0.0489). Four genes that are part of this complex were identified in our screening, CCT2,
396 CCT7, and CCT8, three core subunits of the complex, as well as SSA1, an ATPase that
397 associates with the core subunits. A complete list of GO Slim Mapper annotations, adjusted p-
398 values, and genes annotated to each term can be found in Table S4.
399 With the goal of identifiying additional ontologies of interest, SGD GO Slim Mapper and
400 GO Term Finder were also applied to the narrowed list of 100 genes identified as increasing
401 LOH events at two loci. Molecular Function Unknown (p = 2.328x10-4) again reproduced as 18
402 being significantly enriched in this dataset with 47 genes from 100 top-hit list represented, Table
403 2B. For the list of all ontologies for the 100 gene list and adjusted p-values of representation see
404 Table S5.
405 To determine if multiple screen comparisons for repetition of gene identification was
406 likely to lead to ontologies worthy of further pursuit, the 91 genes that were identified in at least
407 two of the independent studies previously discussed, were also analyzed with SGD GO Slim
408 Mapper and GO Term Finder to determine category enrichment. For GO Term Finder Biological
409 Process, 82 categories were annotated as significant (p < 0.05), the four categories with largest
-13 410 enrichment are DNA Metabolic Process (padj = 4.63x10 ), Cellular Response to DNA Damage
-15 -13 411 Stimulus (padj = 9.12x10 ), Cellular Response to Stress (padj = 3.75x10 ), and DNA Repair
-12 412 (padj = 1.19x10 ). GO Slim Mapper identified 14 Biological Process categories (p<0.05), the
-11 413 four with most significant enrichment are Organelle Fission (padj = 4.79x10 ), Cellular
-16 -10 414 Response to DNA Damage Stimulus (padj = 3.44x10 ), Mitotic Cell Cycle (padj = 1.04x10 ),
-14 415 and DNA Repair (padj = 1.30x10 ). Thirty-five categories were enriched for GO Term Finder
416 Cellular Component (p < 0.05); the four categories with largest enrichment are Chromosome
-12 -12 - 417 (padj = 3.25x10 ), Chromosomal Part (padj = 4.23x10 ), Nuclear Chromosome (padj = 6.29x10
8 -8 418 ), and Nucleus (padj = 2.41x10 ). GO Slim Mapper identified two Cellular Component categories
-9 -9 419 (p<0.05), Chromosome (padj = 3.42x10 ), and Nucleus (padj = 4.16x10 ). DNA-Dependent
420 ATPase Activity (padj = 0.000228), DNA Binding (padj = 0.00224), G-quadruplex DNA Binding
421 (padj = 0.00416) and Exonuclease Activity (padj = 0.0290) were the significant categories
422 identified using GO Term Finder Molecular Function. While GO Slim Mapper also picked up
05 423 DNA Binding (padj = 7.41x10- ) and ATPase Activity (padj = 0.000808) within the Molecular
424 Function ontology. GO Slim Mapper additionaly found 5 enriched groups in the Macromolecular
06 425 Complex category (p<0.05), Chromosome, Centromeric Region (padj = 1.16x10- ), Kinetochore
426 (padj = 0.000135), SUMO-Targeted Ubiquitin Ligase Complex (padj = 0.00019), Condensed 19
427 Nuclear Chromosome Outer Kinetochore (padj = 0.00019) and Condensed Nuclear Chromosome
428 Kinetochore (padj = 0.000205). For a complete list of GO results from the 91 gene top-hit list of
429 overlaps from the multiple screen comparison, see Table S6.
430 Analysis Of Positional Gene Enrichment for Chromosomal Regions of Interest
431 To investigate if our screens identified any enriched chromosomal regions, which might
432 be indicative of neighborhoods in the genome that contribute to LOH events, we mapped the
433 location of each gene identified as a top-hit using Positional Gene Enrichment analysis (PGE)
434 (De Preter et al. 2008). Genes, which when mutated, identified as top-hits causing increased
435 LOH were dispersed throughout all 16 chromosomes. However, utilizing PGE to assess our
436 different top-hit gene lists allowed us to identify locations with significant enrichment. Analysis of
437 our MAT 217 gene top-hits list identified 41 total locations with significant enrichment, further
438 refinement revealed seven clusters of genes, located on six different chromosomes, with three
439 or more ORFs constituting the enrichment and with multiple comparison adjusted p-values <
440 0.01 (Fig 4). When we analyzed the narrowed list of 100 mutants that increase LOH at both
441 assayed loci (hereafter referenced as the MAT + MET15 dataset), 27 total locations were
442 identified significant enrichment, further refinement revealed nine clusters of genes, located on
443 six different chromosomes, with three or more ORFs constituting the enrichment and with
444 multiple comparison adjusted p-values < 0.01. Analysis of our multiple screen comparison 91
445 gene top-hits list identified 22 total locations with significant enrichment, further refinement
446 revealed two clusters of genes, located on two different chromosomes, with three or more ORFs
447 constituting the enrichment and with multiple comparison adjusted p-values < 0.01. For a list of
448 all enriched chromosomal locations, see Table S7.
449 Chromosome II A large region of significance was identified comprised of 20 ORFs found on
-4 450 the right arm of chromosome II (Base Pair (BP) region 501798-545972) (padj = 2.75 x 10 ), 20
451 when the 217 MAT locus LOH top-hits list was evaluated, as shown in Fig 4A. Seven of the
452 ORFs found in this region were identified by our MAT locus LOH screen – HSL7, CKS1,
453 YBR137W, ATG42, BMT2, YBR144C, and APD1. When the MAT + MET15 data set was
454 subjected to PGE analysis, a more defined region of chromosome II (a subsection of the region
455 mentioned above) was found to be more significantly enriched (BP region 504848-545972) (padj
456 = 9.69 x 10-6) Fig 4A. Six of the seven previously mentioned gene deletions in this region, again
457 appear in a now refined territory comprised of 18 ORFs. We next refined our PGE investigation
458 to all top-hits found in our multiple screen analysis, aiming to better identify chromosomal
459 regions important in LOH events. The region of chromosome II identified above is expanded
460 (BP region 442918-575991) (padj = 0.016) and contains 5 ORF hits (YBR099C, IML3, HSL7,
461 APD1, and SSE2) in a 79 ORF region.
462 Chromosome V When the 217 top-hits from the initial screen of increasing LOH at the MAT
463 locus were analyzed, a 24-gene region containing eight top-hit ORFs was found to be enriched
-4 464 on chromosome V (BP region 375211-424307) (padj = 2.31 x10 ). These top-hit genes are
465 FLO8, LSM4, TMN3, RPL23B, DSE1, SAK1, COM2, and RPS26B Fig 4B. The genes SAK1,
466 COM2, and RPS26B are immediately adjacent to one another and are identified as a further
467 enriched cluster with a p-value of 9.57 x 10-4. After narrowing our MAT LOH top-hits list to 100
468 genes that increase LOH at both MAT and MET15, one larger region (BP 387228-560360) with
-6 469 11 hits across a 100-ORF span (padj = 9.69x10 ) and two further enriched subsections of this
470 region (BP region 387228-438340) and (BP region 387228-397649) were identified. The
471 downstream-shifted 30-gene region of chromosome V identified as enriched (BP region
-6 472 387,228-438,340) (padj = 9.69 x 10 ) is comprised of seven top-hits (LSM4, TMN3, RPL23B,
473 DSE1, SAK1, COM2, and YER135C) in the region of 30 genes, while the smaller region is
-4 474 comprised of three ORFs in a six ORF span (padj = 2.57x10 ) Fig 4B. Running the PGE analysis
475 with the overlapping 91 genes from the six compared screens presented a further narrowed 21
-5 476 enriched region (BP region 387228-396168) (padj = 2.03x10 ), with 3 repeated ORFs (LSM4,
477 TMN3, and SLX8) in a 5 ORF region (Fig. 4B). Notably, LSM4 and TMN3 were identified by two
478 independent studies (this study and Choy et al.), and SLX8 was identified by three independent
479 studies (Schmidlin et al., Andersen et al., and Yuen et al.).
480
481 Chromosome VII An enriched region of significance was identified comprised of 23 ORFs on
-3 482 chromosome VII (BP region 67598-98589) (padj = 3.57 x 10 ), when the 217 MAT locus LOH
483 top-hits list was evaluated, as shown in Fig 4C. Five of the ORFs (SHE10, MTC3, YGL218W,
484 KIP3, and SIP2) found in this region were identified by our MAT locus LOH screen. When the
485 narrowed list of 100 mutants that increase LOH at both MAT + MET15 were subjected to PGE
486 analysis this region is no longer identified as being enriched, and further does not recur in PGE
487 evaluation of the 91 overlap list from the multiple screen comparison.
488 Chromosome IX An enriched region was identified (BP region 255113 – 270572) when the 217
489 MAT locus LOH top-hits list was evaluated. This region contains five ORFs (GPP1, MMF1,
-4 490 PCL7, NEO1, and MET30) identified in a 10 ORF region (padj = 4.40 x 10 ) (Fig 4D). Two
491 enriched regions were identified when PGE was run with the 100 top-hits that appeared in both
492 our MAT + MET15 screens. The larger region runs from 193592-264891bp and had five hits
493 (ICE2, AIM19, YIL060W, MMF1, and NEO1) in a 49 ORF region. A smaller subsection was
494 identified as the second hit, this region is 18,502bp long and contains 14 ORFs of which 3 were
-3 495 found in our top-hits (BP region 246389-264891) (padj = 2.72 x 10 ) (Fig 4D). A 13 ORF region
496 of chromosome IX (BP region 83302-100501) contains two genes identified by at least two of
497 the six compared studies (padj = 0.0299). One of the core CCT-complex genes, CCT2, is found
498 in this region, and was identified by our study as well as Choy et al. Additionally, CSM2 was
499 identified by this study, and the primary screening conducted by Schmidlin et al. (Fig. 4D). 22
500 Chromosome X No enriched regions containing more than 2 ORFs are found when running
501 the 217 MAT locus genes alone. The MAT + MET15 100 gene analysis identifies three ORFs
- 502 (YJL009W, SYS1, and SPC1) in a 23 ORF region (BP region 419849-458354) (padj = 9.01 x 10
503 3) (Fig. 4E). A second enriched locus of interest is identified on chromosome X when the
504 overlapping gene list from the multiple screen comparison is analyzed. This region is from
505 491074-517506bp and contains 3 top hits (CPR7, PET191, and POL32) in a 12 ORF region (padj
506 = 4.18x10-4).
507 Chromosome XI Three ORFs (YKR073C, ECM4, and YKR078W) found in the MAT screen
508 are identified in a 8 ORF region on chromosome XI (BP region 577829-586351) (padj = 7.51 x
509 10-3) (Fig 4F). This region is not found in our MAT + MET15 list. However, an overlapping
510 region is found when the multiple screen comparison list is studied (BP region 584594-595940)
511 (padj = 0.013) and contains 2 ORFs, YKR078W and NUP133, in a 5 ORF region.
512 Chromosome XV A large region is returned from PGE analysis of the MAT top-hits list (BP
-3 513 region 69376-139045) (padj = 2.88x10 ) with seven (MED7, YOL134C, HRP1, PAP2, WSC3,
514 NDJ1, and COQ3) of the 46 ORFs in the region being found in that screen. A subsection of this
-3 515 region (BP region 69376-115808) (padj = 3.32 x 10 ) with four ORFs (MED7, HRP1, PAP2, and
516 WSC3) out of a 33 ORF section is identified from the 100 gene MAT + MET15 list (Fig 4G).
517 Analysis of the 91 overlap list from the multiple screen comparison did not return any enriched
518 regions on this chromosome.
519 Chromosome XVI No enriched regions are identified from the 217 MAT locus gene list for
520 chromosome XVI. However, when the 100 top-hits gene list from the MAT + MET15 screens is
521 analyzed for chromosomal enrichment four ORFs in an 83 ORF region are found (BP region
522 600646-728613) (padj = 0.041). Two (RPL36B and DIP5) out of a 20 ORF region are found in a
523 separate region of chromosome XVI (BP region 41043-76239) (padj = 0.046) and two other 23
524 ORFs (DDC1 and YIG1) out of a 9 ORF region (BP region 169769-181114) (padj = 0.019) are
525 also identified when the list of 91 genes that were found in at least two separate screens are
526 analyzed (Fig 4H). Within these regions several genes from heterozygous LOH/genome stability
527 screens are found, DIP5 was identified by both Strome et al. (2008) and Choy et.al., and
528 RPL36B was identified by Choy et al. and this study.
529 Fluctuation Analysis of LOH candidate genes selected from data narrowing criteria
530 In an effort to determine the level of LOH induced by heterozygous gene mutations of interest
531 we created strains capable of a quantitative fluctuation analysis assessment. Nine genes,
532 SSA1, CCT7, MRC1, RSM7, HS7, SSE2, PUG1, YOR364W, and MEC1, were chosen for this
533 assessment based on appearing within our screen and at least one secondary analyses
534 discussed above (additional description of gene functions’ can be found in the discussion
535 section below); the LOH rates of strains heterozygously mutated for these genes were then
536 benchmarked using fluctuation analysis at the CAN1 locus. PUG1, YOR364W, and RSM7 were
537 chosen because heterozygous mutations in these genes results in the highest LOH event
538 scores in both our MAT and MET15 assays. Futhermore YOR364W is categorized as a dubious
539 open reading frame (ORF) indicating lack of clarity on if a protein/functional product is produced,
540 and if so, an unknown function for that product. As such, YOR364W lacks characterization and
541 falls in the Unknown categories for all three main GO terms: Molecular Function, Biological
542 Process, and Cellular Component. MRC1 was selected because it was indentified in three of the
543 studies we analyzed, while SSE2 was chosen because it was identified in two of the separate
544 published studies (this study and Choy et al. that showed significant overlap) we analyzed for
545 increasing LOH. To learn more about one of the clustered regions identified through PGE we
546 selected three genes from the chromosomal II array, MEC1, SSE2, and HSL7. CCT7 and SSA1
547 were included in these analyses because they are part of the overrepresented gene ontology
548 category of the CCT-complex, identified by both our study and Choy et al. Finally CCT7, SSE2, 24
549 MRC1, RSM7, HSL7 and SSA1, all have identified human homologs (see discussion) which
550 could make their further investigation more relevant to LOH events in cancers. The SCDChet
551 strains containing heterozygous mutations in SSA1, CCT7, MRC1, RSM7, HSL7, SSE2, PUG1,
552 YOR364W, and MEC1, as well as the parental strain used to construct SCDChet – BY4743 –
553 were transformed with a can1::HIS3 cassette to render their CAN1 locus heterozygous. Four
554 independent fluctuation analysis experiments for the parental strain BY4743 were conducted to
555 estimate the baseline LOH rate and 95% confidence intervals (see Fig 5) using the rSalvador
556 package (Zheng 2002). The nine heterozygous gene deletion strains tested all showed
557 significant increased LOH rates with 7- to 31-fold increases over the parental LOH rate (see Fig
558 5). These increased rates demonstrate that all of our secondary analyses were successful in
559 narrowing candidates of interest whose single-gene deletion leads to a significant increase in
560 LOH.
561
562 DISCUSSION
563 Through the work presented here we sought to identify heterozygous gene mutations
564 with haploinsufficient impacts on LOH, as the homologs of these genes are particularly
565 interesting targets for study since loss/mutation to only one allele could induce LOH-based
566 cancer phenotypes. Multiple refinement strategies were employed to attempt to identify those
567 genes with the greatest potential as candidate cancer susceptibility genes for futher study.
568 Human Homolog Identification and Cancer Associations
569 Of the 217 genes identified as top-hits in the primary MAT locus LOH screening, we
570 were able to identify 127 with a known human homolog (58.5%) utilizing YeastMine
571 (https://yeastmine.yeastgenome.org) and NCBI Homologene 25
572 (https://www.ncbi.nlm.nih.gov/homologene). Literature searches on these 127 genes identified
573 86 with a known association with cancer incidence (40% of the top-hits list, 68% of the “genes
574 with human homologs” list) (see Table S8 for all top-hits with known human homologs). When
575 the smaller list of mutants that increased LOH at both MAT + MET15 was examined for human
576 homologs in the same manner as above, 58 of the 100 genes (58%) were identified as having a
577 human homolog, with 36 of the 58 human homologs with a known association to cancer (62%).
578 By again performing data comparisons to multiple independent studies, we are able to
579 collect more information about genes with human homologs and those with known associations
580 to cancers. Sixty-seven out of the 91 genes identified in at least two screens, have a known
581 human homolog (74%). Including the genes identified in our screen (discussed above), a search
582 of the current literature revealed that 54 of those 67 genes have an association with cancer
583 (69%) (Table S9) (Sun et al. 2002; Leone et al. 2003; Mason et al. 2014; Abdel-Fatah et al.
584 2014; Hennecke et al. 2015; The et al. 2015; Wang et al. 2015; Taguchi et al. 2015; Dai et al.
585 2016). This serves as a positive control that performing these screens, and the multiple screen
586 analyses, identifies genes, both with known human homologs that can be investigated for
587 impacts on cancer development, as well as those that have already been linked to roles in
588 cancer progression. The list of genes, with known human homologs not already associated with
589 cancer phenotypes, are prime candidates as novel cancer susceptibility genes for future studies
590 (as an example see discussion of LSM4 below).
591 Multiple Screen Comparisons for Reproducibility of Gene Identification
592 In addition to conducting new screens to search for additional candidates, pooling the
593 data from multiple, related screens, serves as a further layer to test reproducibility and enables
594 identification of candidate genes that are likely contributors to genome instability. 26
595 SSE2 encodes a heat shock protein 70 (HSP70) family member which functions as a
596 nucleotide exchange factor for cytosolic Hsp70s during protein refolding (Easton et al. 2000;
597 Shaner et al. 2005; Dragovic et al. 2006). This gene was identified in our primary MAT screen,
598 reproduced in our secondary LOH screen at MET15, was also identified by Choy et al. as
599 contributing to LOH at MAT in a haploinsuffient manner, and is found in an enriched region of
600 chromosome II, and has a human homolog (HSPH1) with a known cancer association. The
601 discovered relationship of heterozygous loss of this gene to induction of LOH events in yeast
602 could provide an avenue for further mechanistics studies in this model system, a pathway for
603 study design in mammalian cells, and a justification to screen for HSPH1 alterations in
604 additional cancer types.
605 Similarly, LSM4 was identified in our MAT and MET15 datasets (with one of the highest
606 levels of LOH events observed in each), was also identified by Choy et al., is found in an
607 enriched region of chromosome V, and has a human homolog (LSM4), however there is no
608 current direct cancer association. This gene encodes part of a protein complex involved in
609 mRNA splicing, processing body assembly, and decay (He and Parker 2000; Beggs 2005). First
610 characterized in systemic lupus erythemathosus patients, this protein is an intriguing prospect
611 for further study, as SLE is a heterogeneous disorder, linked to increased incidence of many
612 cancer types, however the data are inconsistent (Gayed et al. 2009; Song et al. 2018). Direct
613 investigation into LSM4 haploinsufficiency in a yeast model might allow investigation into the
614 role of LOH events in disease progression.
615 Additionally, the seven genes found in three overlapping, related screens – DCC1,
616 SLX8, EMI2, MRC1, MCM21, RRM3, and ELG1 – qualify as strong candidates for further
617 investigation due to consistently contributing to instability via LOH mechanisms under a variety
618 of experimental conditions. Six of these genes are categorized in the DNA Metabolic Processes
619 gene ontology and are further linked to chromosome organization (DCC1, SLX8, MRC1, 27
620 MCM21, RRM3, and ELG1). With known human homologs and cancer associations for all but
621 MCM21, study of the mechanisms of association with increased LOH events could yield more
622 information for inclusion in epidemiological studies as well as pathway information for targeted
623 treatments.
624 Additional genes that induced the highest levels of LOH in our screens and showed
625 statistically significant increases in LOH via fluctuation analysis are RSM7, PUG1, and
626 YOR364W. RSM7 encodes a small subunit mitochondrial ribosomal protein (Saveanu et al.
627 2001) and mutations in this gene were also picked up by Schmidlin et al as increasing LOH
628 events. Little else has been published about this gene and it is intriguing to consider how
629 defects in a mitochondrial ribosomal protein induce nuclear genome instability. Further, while a
630 human homolog has been identified, to date there has been no association with cancer making
631 it a novel candidate cancer predisposition gene identified here. PUG1 encodes a
632 protoporphyrin uptake protein in the plasma membrane also involved in heme transport
633 (Protchenko et al. 2008; Manente and Ghislain 2009). Choy et al identified haploinsufficiency
634 effects of loss of this gene on chromosome maintenance and Zhu et al have shown increased
635 colony sectoring in a CTF assay using the SCDChet (Zhu et al. 2015). Again, little additional
636 information has been published on this gene. YOR364W, however, has had no direct work
637 published about it, and is therefore another intriguing candidate. This ORF is still considered
638 “dubious” and “unlikely to encode a functional protein”
639 (https://www.yeastgenome.org/locus/S000005891). We however saw high levels of LOH
640 induction in both our MAT locus and MET15 locus screens as well as 14-fold increase in LOH at
641 the CAN1 locus via fluctuation analysis.
642 Gene Ontology Enrichment 28
643 Further indication that our screen results are pertinent to the identification of human
644 cancer-relevant gene mutations comes from our gene ontology analyses. On top of identifying
645 many ontologies with clear cancer relevance such as Chromosome Segregation, Mitotic Cell
646 Cycle, and DNA Repair, our GO analysis identified other ontologies which may hold relevance,
647 but require more analysis of the members for candidate cancer susceptibility genes. For
648 example, one of the top GO Term Finder ontology hits identified the CCT-complex; conserved
649 from yeast to humans, some components of this complex have known cancer impacts. In yeast,
650 systematic identical null mutations in each subunit of the eight core CCT complex proteins
651 revealed varying phenotypic effects, indicating the possibility of secondary roles that individual
652 subunits play in addition to cytoskeletal subunit folding as part of the CCT-complex (Amit et al.
653 2010). Secondary cellular roles of these subunits are additionally supported in research linking
654 many, but not all, of the eight subunits to different cancer types. For instance, the homolog of
655 one of our identified top-hits, CCT2, along with TCP1, has recently been linked as a driver
656 mutation in breast tumor formation (Guest et al. 2015). As well, change in expression of CCT8
657 has been linked to poor prognosis in glioma patients (Qiu et al. 2015). Studying the proteins of
658 this complex could lead to more mechanistic insight on their secondary roles as well as
659 determination of if other members of the complex play a role in cancer development.
660 Furthermore, because members of the CCT-complex are essential genes, utilizing the SCDChet
661 allowed for identification of this complex where it could not have been assessed using
662 homozygous deletions; therefore, allowing for a more comprehensive look at genes contributing
663 to genome instability through LOH events.
664 Chromosome Location Enrichment Clustering Of Genes Involved In LOH
665 In the search for additional information that might aid in generating a better
666 understanding of LOH we investigated the chromosome location of hits within our screen results
667 alone, and further compared across multiple screens. These searches hold the potential to 29
668 identify chromosomal regions and genes important in LOH events as well as to rule out genes
669 that might have been identified due to artifacts in strain construction. On the one hand there are
670 acknowledged errors, advantages, and disadvantages, for using a particular deletion collection
671 (aneuploid strains, secondary mutations, incorrect genotyping) as well as screen-specific issues
672 of strain reconstruction due to the nature of certain mutations (Hughes et al. 2000; Giaever et al.
673 2002; Deutschbauer et al. 2005; Yuen et al. 2007; Schmidlin Tom et al. 2007; Andersen et al.
674 2008; Ben-Shitrit et al. 2012; Giaever and Nislow 2014). These known inconsistencies however
675 do not negate the importance of the S. cerevisiae deletion collections as a tool for
676 understanding genome instability, and the ability to apply knowledge to the study of higher
677 eukaryotes. Conversely, enriched regions may contain regulatory sequences or identify the
678 presence of a particularly important gene in the vicinity, and may point us to candidates with the
679 most potential for further investigation. Because there is not conclusive evidence in all cases to
680 confirm if a gene knockout is contributing to LOH due to regional effects, or via an independent
681 mechanism, all possibilities need to be considered. We take the identified region of
682 chromosome II as an example to discuss these models.
683 Chromosome II The large region of significance identified on the right arm of chromosome II,
684 included seven MAT locus LOH screen identified genes within a 20 ORF region – HSL7, CKS1,
685 YBR137W, ATG42, BMT2, YBR144C, and APD1 and expanded to 19 identified hits in a region
686 of 49 ORFs when run with the 91 gene multiple screen comparison top-hits list. Several
687 possibilities exist for why this region was identified as an enriched locus for increasing LOH
688 events.
689 The first possibility is that all/most of the genes in this neighborhood have independent
690 impacts on LOH occurrences. While a few occasions of clustering of yeast genes with similar
691 functions have been identified (Zhang and Smith 1998), it is possible that due to the large
692 variety of gene functions that could be perturbed and lead to LOH that this region was not 30
693 previously identified as one of these such clusters. Evolutionarily this clustering may have come
694 into existance due to co-regulation of these genes for their involvement in genome stability
695 maintenance.
696 A second possibility is that neighboring gene effects are driving LOH events through a
697 single driver gene in the locus. For example, near the center of this region lies the gene MEC1.
698 Orthologous to human ATR, MEC1 is a critical mitotic checkpoint gene that plays an important
699 role in responding to DNA damage as the cell navigates through the cell cycle (Harrison and
700 Haber 2006; Bandhu et al. 2014). Further, mutations in MEC1 have previously been shown to
701 increase mitotic recombination events (Fasullo and Sun 2008) and decrease chromosome
702 maintenance events (Choy et al. 2013), both of which could lead to increases in LOH; and has
703 been identified in at least two previous screens looking for genes with impacts on genome
704 stability (Stirling et al. 2011; Choy et al. 2013). MEC1 was identified as a top-hit by Choy et al
705 indicating heterozygous loss of this locus can increase LOH events. As an essential gene this
706 ORF would not be identifiable through screens utilizing the homozygous deletion collection.
707 Other groups have reported neighborhood effects of KANMX insertions, however these are
708 generally limited to adjacent genes within 600bp upstream or downstream of the driver locus.
709 Ben-Shitrit et al. published a mechanism to predict such neighboring gene effects, however
710 implementation requires a known protein-protein interaction network with anchoring proteins for
711 the phenotype being measured (Ben-Shitrit et al. 2012). Since LOH events are caused by
712 widely variable mechanisms via genes with widely variable functions we were unable to utilize
713 their algorithm in this situation. Further study of this region to determine if all the identified ORFs
714 are presenting as top-hits due to their proximity to a particular gene, like MEC1, or if they are
715 contributing to genome instability through neighborhood-independent mechanisms could be
716 accomplished through additional studies. This could take the form of individual knockout of
717 every gene separately in a new strain, LOH rate quantification (such as via fluctuation analysis), 31
718 and complementation assays, or through separate testing for expression levels of all of the
719 genes in the region in each of the individual SCDChet strains representing the genes across this
720 chromosomal region. Examining the extent of individual neighborhood effects in these clusters
721 is a future direction that falls outside the scope of this study.
722 A third model is that the chromosomal region itself, potentially through TF binding,
723 histone modification, replication firing, or three-dimensional architecture, may play roles in
724 multiple loci being identified. Several studies have shown that a gene's expression level tends to
725 be similar to that of its neighbors on a chromosome (Zhang and Smith 1998; Kruglyak and Tang
726 2000; Cohen et al. 2000). If these are dosage sensitive genes, this might contribute to them
727 having haploinsufficiency effects, which might account for the chromosomal region’s
728 identification. To assess this possibility for the chromosome II region we searched for
729 topologically associating domains (TAD) within a high-throughput chromosome conformation
730 capture (Hi-C) dataset (Eser et al. 2017). A 130kb TAD was found to stretch across the region
731 we identified (spanning from ~450-580kb on Chr II), however the authors of that study report
732 that the TAD-like domains they found were more strongly correlated with replication timing than
733 with transcription.
734 A fourth model is that chromosomal regions are being identified as involved in increased
735 LOH as a result of an artifact of how the knockouts were generated. To attempt to assess the
736 possibility of region identification due to strain artifacts we have done two analyses. We started
737 by running PGE with all 898 genes from the multiple screen comparisons (De Preter et al.
738 2008). The logic here is that if large regions are identified without individual genes being
739 replicated this could tell us about the possibility for artifacts. Further, in running PGE in this
740 manner we made note of which screen identified each gene, Table S10. We then further
741 categorized each gene identified to the screen set-up and screens utilizing the same starting
742 strains (i.e. SCDChet vs SCDChom vs non-SCDC strains) were combined to determine if strain 32
743 creation artifacts could be at play. We found that for the region in chromosome II, nearly all of
744 the hits (16/19) were found from screens that utilized SCDChet, Figure S1. This could support
745 the theory that something about these particular strains and the way they were made is leading
746 to identification of this locus, but could also indicate a region high in genes with
747 haploinsufficiency effects or essential genes not able to be identified by SCDChom screens
748 (further substantiated by the fact that the other screen that identified two genes in this region did
749 not use either deletion collection but completed insertional mutagenesis to assay for
750 heterozygous effects (Strome et al. 2008)). We therefore moved to a second analysis to assess
751 how strains were created as part of making the S. cerevisiae deletion collections. Since past
752 artifacts have been identified when a set of strains were all created in the same lab (Lehner et
753 al. 2007) we wanted to gauge if this chromosomal region of knockouts met this criteria.
754 Information on strain creation from the Standford yeast deletion project website (http://www-
755 sequence.stanford.edu/group/yeast_deletion_project/overlapping.html) identifies “lab 11” as
756 having created this entire block of gene mutations, ending just where our identified region ends
757 (at ORF YBR172C), however starting farther up (having created strains starting at YBR080C,
758 our identified region starts at YBR127C). This unfortunately both supports and conflicts with the
759 theory of artifact-induced LOH events in all strains made by this group. Supporting evidence is
760 that this group did make all the gene knockout strains we identified. Conflictingly, however we
761 would have expected to have identified all genes/most genes in the chromosomal region that
762 encompasses all of the strains they created, not approximately half.
763 These models could apply to all of the PGE identified enriched regions and further
764 investigation into all of these loci are warranted, but are outside of the scope of this mutant
765 screen report.
766 Essential genes 33
767 As less frequently evaluated loci, not assayed in haploid studies or studies of the
768 homozygous deletion collection that are more frequently conducted, the cohort of essential
769 genes from our screen are interesting targets for further study. 37 essential genes were
770 identified in our primary screen RPN6, LSM4, HSK3, PFS2, RLI1, SSY1, RSC3, BUR6, SLD3,
771 ERG7, TTI1, HRP1, PSA1, FAP7, KRS1, FRQ1, MOB2, YJL009W, CCT7, SPC3, MED7,
772 GPN2, RPO26, TSC13, MPS1, MET30, STH1, CCT2, CCT8, TUB1, MVD1, YOL134C, RKI1,
773 RET1, MCM2, SMC1 and NEO1. Of these essential genes one, CCT7, was chosen for strain
774 recreation to allow quantitative assesement of LOH rate due to heterozygous mutation. The
775 significant 27-fold increase in LOH due to loss of this gene demonstrates its haploinsufficient
776 impact on genome stability. Amongst this group, 32, have previously identified human homologs
777 of which 23 have a known cancer association. Once again, analyses of these
778 genes/proteins/pathways in a yeast system may help uncover additional mechanistic
779 information important in understanding their roles in cancer development. Further, the nine
780 genes (LSM4, BUR6, ERG7, KRS1, SPC3, GPN2, TSC13, MUD1, and NEO1) with known
781 human homologs not already associated with cancer phenotypes are prime candidates as novel
782 cancer susceptibility genes as they, like their yeast counterparts, may be able to induce
783 haploinsufficient effects, not abiding the two-hit hypothesis, and therefore being more impactful
784 gene mutations.
785 Conclusions
786 Analyzing new screen data independently and then in conjunction with gene lists from
787 relevant published works, as well as performing analyses beyond individual genes by looking at
788 such items as ontology representation and positional gene enrichment, can increase the power
789 of a study to identify genes of most importance. The studies a group might choose for
790 comparison could be selected by literature search identification of relevant screens. We propose
791 that this methodology of candidate narrowing allows the community to wade through the noise 34
792 of data and focus on chromosomal deletions that play important roles in the phenotype or
793 pathway of interest.
794 Acknowledgements
795 The content is solely the responsibility of the authors and does not necessarily represent the
796 official views of the National Institutes of Health. Research was supported by an Institutional
797 Development Award (IDeA) from the National Institute of General Medical Sciences of the
798 National Institutes of Health: P20GM1234 and an NIGMS R15 Area Award: 1R15GM109269-
799 01A1.
800
801 References
802 Abdel-Fatah, T. M. A., R. Russell, N. Albarakati, D. J. Maloney, D. Dorjsuren et al., 2014 Genomic and
803 protein expression analysis reveals flap endonuclease 1 (FEN1) as a key biomarker in breast and
804 ovarian cancer. Mol. Oncol. 8: 1326–1338.
805 Alimonti, A., A. Carracedo, J. G. Clohessy, L. C. Trotman, C. Nardella et al., 2010 Subtle variations in Pten
806 dose determine cancer susceptibility. Nat. Genet. 42: 454–458.
807 Amit, M., S. J. Weisberg, M. Nadler-Holly, E. A. McCormack, E. Feldmesser et al., 2010 Equivalent
808 Mutations in the Eight Subunits of the Chaperonin CCT Produce Dramatically Different Cellular
809 and Gene Expression Phenotypes. J. Mol. Biol. 401: 532–543.
810 Andersen, M. P., Z. W. Nelson, E. D. Hetrick, and D. E. Gottschling, 2008 A Genetic Screen for Increased
811 Loss of Heterozygosity in Saccharomyces cerevisiae. Genetics 179: 1179–1195.
812 Antonacopoulou, A. G., P. D. Grivas, L. Skarlas, M. Kalofonos, C. D. Scopa et al., 2008 POLR2F, ATP6V0A1
813 and PRNP expression in colorectal cancer: new molecules with prognostic significance?
814 Anticancer Res. 28: 1221–1227. 35
815 Apasu, J. E., D. Schuette, R. LaRanger, J. A. Steinle, L. D. Nguyen et al., 2019 Neuronal calcium sensor 1
816 (NCS1) promotes motility and metastatic spread of breast cancer cells in vitro and in vivo. FASEB
817 J. Off. Publ. Fed. Am. Soc. Exp. Biol. 33: 4802–4813.
818 Babeu, J.-P., S. D. Wilson, É. Lambert, D. Lévesque, F.-M. Boisvert et al., 2019 Quantitative Proteomics
819 Identifies DNA Repair as a Novel Biological Function for Hepatocyte Nuclear Factor 4α in
820 Colorectal Cancer Cells. Cancers 11:.
821 Bae, N. S., A. P. Seberg, L. P. Carroll, and M. J. Swanson, 2017 Identification of Genes in Saccharomyces
822 cerevisiae that Are Haploinsufficient for Overcoming Amino Acid Starvation. G3
823 GenesGenomesGenetics 7: 1061–1084.
824 Bai, D., J. Zhang, T. Li, R. Hang, Y. Liu et al., 2016 The ATPase hCINAP regulates 18S rRNA processing and
825 is essential for embryogenesis and tumour growth. Nat. Commun. 7: 12310.
826 Baldeyron, C., A. Brisson, B. Tesson, F. Némati, S. Koundrioukoff et al., 2015 TIPIN depletion leads to
827 apoptosis in breast cancer cells. Mol. Oncol. 9: 1580–1598.
828 Bandhu, A., J. Kang, K. Fukunaga, G. Goto, and K. Sugimoto, 2014 Ddc2 Mediates Mec1 Activation
829 through a Ddc1- or Dpb11-Independent Mechanism. PLoS Genet. 10:.
830 Bausch, B., F. Schiavi, Y. Ni, J. Welander, A. Patocs et al., 2017 Clinical Characterization of the
831 Pheochromocytoma and Paraganglioma Susceptibility Genes SDHA, TMEM127, MAX, and
832 SDHAF2 for Gene-Informed Prevention. JAMA Oncol. 3: 1204–1212.
833 Beggs, J. D., 2005 Lsm proteins and RNA processing. Biochem. Soc. Trans. 33: 433–438.
834 Benjamini, Y., and Y. Hochberg, 1995 Controlling the false discovery rate: a practical and powerful
835 approach to multiple testing. J. R. Stat. Soc. Ser. B Methodol. 57: 289–300.
836 Ben-Shitrit, T., N. Yosef, K. Shemesh, R. Sharan, E. Ruppin et al., 2012 Systematic identification of gene
837 annotation errors in the widely used yeast mutation collections. Nat. Methods 9: 373–378. 36
838 Berger, A. H., A. G. Knudson, and P. P. Pandolfi, 2011 A continuum model for tumour suppression.
839 Nature 476: 163–169.
840 Bianco, J. N., V. Bergoglio, Y.-L. Lin, M.-J. Pillaire, A.-L. Schmitz et al., 2019 Overexpression of Claspin and
841 Timeless protects cancer cells from replication stress in a checkpoint-independent manner. Nat.
842 Commun. 10:.
843 Boele, J., H. Persson, J. W. Shin, Y. Ishizu, I. S. Newie et al., 2014 PAPD5-mediated 3’ adenylation and
844 subsequent degradation of miR-21 is disrupted in proliferative disease. Proc. Natl. Acad. Sci. U.
845 S. A. 111: 11467–11472.
846 Botstein, D., and G. R. Fink, 2011 Yeast: an experimental organism for 21st Century biology. Genetics
847 189: 695–704.
848 Brant, K. A., and G. D. Leikauf, 2014 Dysregulation of FURIN by prostaglandin‐endoperoxide synthase 2
849 in lung epithelial NCI‐H292 cells. Mol. Carcinog. 53: 192–200.
850 Broustas, C. G., K. M. Hopkins, S. K. Panigrahi, L. Wang, R. K. Virk et al., 2019 RAD9A promotes
851 metastatic phenotypes through transcriptional regulation of anterior gradient 2 (AGR2).
852 Carcinogenesis 40: 164–172.
853 Chang, K.-W., H.-L. Chen, Y.-H. Chien, T.-C. Chen, and C.-T. Yeh, 2011 SLC25A13 gene mutations in
854 Taiwanese patients with non-viral hepatocellular carcinoma. Mol. Genet. Metab. 103: 293–296.
855 Chang, J.-S., C.-Y. Su, W.-H. Yu, W.-J. Lee, Y.-P. Liu et al., 2015 GIT1 promotes lung cancer cell metastasis
856 through modulating Rac1/Cdc42 activity and is associated with poor prognosis. Oncotarget 6:
857 36278–36291.
858 Chen, Y.-C., B. Humphries, R. Brien, A. E. Gibbons, Y.-T. Chen et al., 2018 Functional Isolation of Tumor-
859 Initiating Cells using Microfluidic-Based Migration Identifies Phosphatidylserine Decarboxylase
860 as a Key Regulator. Sci. Rep. 8: 244. 37
861 Chen, T.-M., M.-C. Lai, Y.-H. Li, Y.-L. Chan, C.-H. Wu et al., 2019 hnRNPM induces translation switch
862 under hypoxia to promote colon cancer development. EBioMedicine 41: 299–309.
863 Chen, J., Y. Lu, J. Xu, Y. Huang, H. Cheng et al., 2004 Identification and characterization of NBEAL1, a
864 novel human neurobeachin-like 1 protein gene from fetal brain, which is up regulated in glioma.
865 Mol. Brain Res. 125: 147–155.
866 Chen, Z., L. Xu, T. Su, Z. Ke, Z. Peng et al., 2017 Autocrine STIP1 signaling promotes tumor growth and is
867 associated with disease outcome in hepatocellular carcinoma. Biochem. Biophys. Res. Commun.
868 493: 365–372.
869 Chen, C., Y. Yin, C. Li, J. Chen, J. Xie et al., 2016 ACER3 supports development of acute myeloid leukemia.
870 Biochem. Biophys. Res. Commun. 478: 33–38.
871 Cho, H.-J., S. Lee, Y. G. Ji, and D. H. Lee, 2018 Association of specific gene mutations derived from
872 machine learning with survival in lung adenocarcinoma. PloS One 13: e0207204.
873 Choi, W. H., S. Lee, and S. Cho, 2017 Microsatellite Alterations and Protein Expression of 5 Major Tumor
874 Suppressor Genes in Gastric Adenocarcinomas. Transl. Oncol. 11: 43–55.
875 Choi, C., N. Thi Thao Tran, T. Van Ngu, S. W. Park, M. S. Song et al., 2018 Promotion of tumor
876 progression and cancer stemness by MUC15 in thyroid cancer via the GPCR/ERK and integrin-
877 FAK signaling pathways. Oncogenesis 7: 85.
878 Chou, Y.-T., J.-K. Jiang, M.-H. Yang, J.-W. Lu, H.-K. Lin et al., 2018 Identification of a noncanonical
879 function for ribose-5-phosphate isomerase A promotes colorectal cancer formation by
880 stabilizing and activating β-catenin via a novel C-terminal domain. PLoS Biol. 16: e2003714.
881 Choy, J. S., E. O’Toole, B. M. Schuster, M. J. Crisp, T. S. Karpova et al., 2013 Genome-wide
882 haploinsufficiency screen reveals a novel role for γ-TuSC in spindle organization and genome
883 stability. Mol. Biol. Cell 24: 2753–2763. 38
884 Christie, K. R., E. L. Hong, and J. M. Cherry, 2009 Functional annotations for the Saccharomyces
885 cerevisiae genome: the knowns and the known unknowns. Trends Microbiol. 17: 286–294.
886 Ciou, S.-C., Y.-T. Chou, Y.-L. Liu, Y.-C. Nieh, J.-W. Lu et al., 2015 Ribose-5-phosphate isomerase A
887 regulates hepatocarcinogenesis via PP2A and ERK signaling. Int. J. Cancer 137: 104–115.
888 Cohen, B. A., R. D. Mitra, J. D. Hughes, and G. M. Church, 2000 A computational analysis of whole-
889 genome expression data reveals chromosomal domains of gene expression. Nat. Genet. 26:
890 183–186.
891 Cost, G. J., and J. D. Boeke, 1996 A useful colony colour phenotype associated with the yeast
892 selectable/counter-selectable marker MET15. Yeast 12: 939–941.
893 Dai, B., P. Zhang, Y. Zhang, C. Pan, G. Meng et al., 2016 RNaseH2A is involved in human gliomagenesis
894 through the regulation of cell proliferation and apoptosis. Oncol. Rep. 36: 173–180.
895 Datta, B., 2009 Roles of P67/MetAP2 as a tumor suppressor. Biochim. Biophys. Acta 1796: 281–292.
896 De Preter, K., R. Barriot, F. Speleman, J. Vandesompele, and Y. Moreau, 2008 Positional gene enrichment
897 analysis of gene sets for high-resolution identification of overrepresented chromosomal regions.
898 Nucleic Acids Res. 36: e43.
899 Deutschbauer, A. M., D. F. Jaramillo, M. Proctor, J. Kumm, M. E. Hillenmeyer et al., 2005 Mechanisms of
900 Haploinsufficiency Revealed by Genome-Wide Profiling in Yeast. Genetics 169: 1915–1925.
901 Di Giammartino, D. C., and J. L. Manley, 2014 New links between mRNA polyadenylation and diverse
902 nuclear pathways. Mol. Cells 37: 644–649.
903 Diefenbacher, M., and A. Orian, 2017 Stabilization of nuclear oncoproteins by RNF4 and the ubiquitin
904 system in cancer. Mol. Cell. Oncol. 4: e1260671.
905 Ding, N., R. Li, W. Shi, and C. He, 2018 CENPI is overexpressed in colorectal cancer and regulates cell
906 migration and invasion. Gene 674: 80–86. 39
907 Donato, M. D., M. Fanelli, M. Mariani, G. Raspaglio, D. Pandya et al., 2015 Nek6 and Hif-1α cooperate
908 with the cytoskeletal gateway of drug resistance to drive outcome in serous ovarian cancer. Am.
909 J. Cancer Res. 5: 1862–1877.
910 Dong, J., J. Geng, and W. Tan, 2018 MiR-363-3p suppresses tumor growth and metastasis of colorectal
911 cancer via targeting SphK2. Biomed. Pharmacother. Biomedecine Pharmacother. 105: 922–931.
912 Dragovic, Z., S. A. Broadley, Y. Shomura, A. Bracher, and F. U. Hartl, 2006 Molecular chaperones of the
913 Hsp110 family act as nucleotide exchange factors of Hsp70s. EMBO J. 25: 2519–2528.
914 Drucker, D. J., 2016 Never Waste a Good Crisis: Confronting Reproducibility in Translational Research.
915 Cell Metab. 24: 348–360.
916 Duntze, W., V. MacKay, and T. R. Manney, 1970 Saccharomyces cerevisiae: A Diffusible Sex Factor.
917 Science 168: 1472–1473.
918 Easton, D. P., Y. Kaneko, and J. R. Subjeck, 2000 The Hsp110 and Grp170 stress proteins: newly
919 recognized relatives of the Hsp70s. Cell Stress Chaperones 5: 276–290.
920 Engin, H. B., D. Carlin, D. Pratt, and H. Carter, 2017 Modeling of RAS complexes supports roles in cancer
921 for less studied partners. BMC Biophys. 10:.
922 Eser, U., D. Chandler-Brown, F. Ay, A. F. Straight, Z. Duan et al., 2017 Form and function of topologically
923 associating genomic domains in budding yeast. Proc. Natl. Acad. Sci. 114: E3061–E3070.
924 Evensen, N. A., P. P. Madhusoodhan, J. Meyer, J. Saliba, A. Chowdhury et al., 2018 MSH6
925 haploinsufficiency at relapse contributes to the development of thiopurine resistance in
926 pediatric B-lymphoblastic leukemia. Haematologica.
927 Fang, Y., D. Fu, and X.-Z. Shen, 2010 The potential role of ubiquitin c-terminal hydrolases in oncogenesis.
928 Biochim. Biophys. Acta 1806: 1–6. 40
929 Fang, Z., C. Gong, S. Yu, W. Zhou, W. Hassan et al., 2018 NFYB-induced high expression of E2F1
930 contributes to oxaliplatin resistance in colorectal cancer via the enhancement of CHK1 signaling.
931 Cancer Lett. 415: 58–72.
932 Fang, H., J. Kang, R. Du, X. Zhao, X. Zhang et al., 2015 Growth inhibitory effect of adenovirus-mediated
933 tissue-targeted expression of ribosomal protein L23 on human colorectal carcinoma cells. Oncol.
934 Rep. 34: 763–770.
935 Faria, J. A. Q. A., N. C. R. Corrêa, C. de Andrade, A. C. de Angelis Campos, R. Dos Santos Samuel de
936 Almeida et al., 2013 SET domain-containing Protein 4 (SETD4) is a Newly Identified Cytosolic and
937 Nuclear Lysine Methyltransferase involved in Breast Cancer Cell Proliferation. J. Cancer Sci. Ther.
938 5: 58–65.
939 Fasullo, M., and M. Sun, 2008 The Saccharomyces cerevisiae checkpoint genes RAD9, CHK1 and PDS1 are
940 required for elevated homologous recombination in a mec1 (ATR) hypomorphic mutant. Cell
941 Cycle Georget. Tex 7: 2418–2426.
942 Fernández-Sáiz, V., B.-S. Targosz, S. Lemeer, R. Eichner, C. Langer et al., 2013 SCFFbxo9 and CK2 direct
943 the cellular response to growth factor withdrawal via Tel2/Tti1 degradation and promote
944 survival in multiple myeloma. Nat. Cell Biol. 15: 72–81.
945 Gagou, M. E., A. Ganesh, G. Phear, D. Robinson, E. Petermann et al., 2014 Human PIF1 helicase supports
946 DNA replication and cell growth under oncogenic-stress. Oncotarget 5: 11381–11398.
947 Gao, C., Y. Su, J. Koeman, E. Haak, K. Dykema et al., 2016 Chromosome instability drives phenotypic
948 switching to metastasis. Proc. Natl. Acad. Sci. U. S. A. 113: 14793–14798.
949 Gayed, M., S. Bernatsky, R. Ramsey-Goldman, A. Clarke, and C. Gordon, 2009 Lupus and cancer. Lupus
950 18: 479–485.
951 Giaever, G., A. M. Chu, L. Ni, C. Connelly, L. Riles et al., 2002 Functional profiling of the Saccharomyces
952 cerevisiae genome. Nature 418: 387–391. 41
953 Giaever, G., and C. Nislow, 2014 The yeast deletion collection: a decade of functional genomics. Genetics
954 197: 451–465.
955 Giam, M., and G. Rancati, 2015 Aneuploidy and chromosomal instability in cancer: a jackpot to chaos.
956 Cell Div. 10:.
957 Guest, S. T., Z. R. Kratche, A. Bollig-Fischer, R. Haddad, and S. P. Ethier, 2015 Two members of the TRiC
958 chaperonin complex, CCT2 and TCP1 are essential for survival of breast cancer cells and are
959 linked to driving oncogenes. Exp. Cell Res. 332: 223–235.
960 Gutierrez-Erlandsson, S., P. Herrero-Vidal, M. Fernandez-Alfara, S. Hernandez-Garcia, S. Gonzalo-Flores
961 et al., 2013 R-RAS2 overexpression in tumors of the human central nervous system. Mol. Cancer
962 12: 127.
963 Haber, J. E., 2012 Mating-type genes and MAT switching in Saccharomyces cerevisiae. Genetics 191: 33–
964 64.
965 Harrison, J. C., and J. E. Haber, 2006 Surviving the breakup: the DNA damage checkpoint. Annu. Rev.
966 Genet. 40: 209–235.
967 Hattori, A., M. Tsunoda, T. Konuma, M. Kobayashi, T. Nagy et al., 2017 Cancer progression by
968 reprogrammed BCAA metabolism in myeloid leukaemia. Nature 545: 500–504.
969 He, W., L. Chen, K. Yuan, Q. Zhou, L. Peng et al., 2018 Gene set enrichment analysis and meta-analysis to
970 identify six key genes regulating and controlling the prognosis of esophageal squamous cell
971 carcinoma. J. Thorac. Dis. 10: 5714-5726–5726.
972 He, X., Z. Liu, Y. Peng, and C. Yu, 2016 MicroRNA-181c inhibits glioblastoma cell invasion, migration and
973 mesenchymal transition by targeting TGF-β pathway. Biochem. Biophys. Res. Commun. 469:
974 1041–1048.
975 He, W., and R. Parker, 2000 Functions of Lsm proteins in mRNA degradation and splicing. Curr. Opin. Cell
976 Biol. 12: 346–350. 42
977 Hennecke, S., J. Beck, K. Bornemann-Kolatzki, S. Neumann, H. Murua Escobar et al., 2015 Prevalence of
978 the Prefoldin Subunit 5 Gene Deletion in Canine Mammary Tumors. PloS One 10: e0131280.
979 Herskowitz, I., 1995 MAP kinase pathways in yeast: For mating and more. Cell 80: 187–197.
980 Hieter, P., C. Mann, M. Snyder, and R. W. Davis, 1985 Mitotic stability of yeast chromosomes: a colony
981 color assay that measures nondisjunction and chromosome loss. Cell 40: 381–392.
982 Hlaváč, V., V. Brynychová, R. Václavíková, M. Ehrlichová, D. Vrána et al., 2014 The role of cytochromes
983 p450 and aldo-keto reductases in prognosis of breast carcinoma patients. Medicine (Baltimore)
984 93: e255.
985 Huang, B., H. Zhou, X. Lang, and Z. Liu, 2014 siRNA‑induced ABCE1 silencing inhibits proliferation and
986 invasion of breast cancer cells. Mol. Med. Rep. 10: 1685–1690.
987 Hughes, T. R., C. J. Roberts, H. Dai, A. R. Jones, M. R. Meyer et al., 2000 Widespread aneuploidy revealed
988 by DNA microarray expression profiling. Nat. Genet. 25: 333–337.
989 Husni, R. E., A. Shiba-Ishii, T. Nakagawa, T. Dai, Y. Kim et al., 2019 DNA hypomethylation-related
990 overexpression of SFN, GORASP2 and ZYG11A is a novel prognostic biomarker for early stage
991 lung adenocarcinoma. Oncotarget 10: 1625–1636.
992 Jin, L., J. Chun, C. Pan, A. Kumar, G. Zhang et al., 2018 The PLAG1-GDH1 Axis Promotes Anoikis
993 Resistance and Tumor Metastasis through CamKK2-AMPK Signaling in LKB1-Deficient Lung
994 Cancer. Mol. Cell 69: 87-99.e7.
995 Jo, Y. S., H. R. Oh, M. S. Kim, N. J. Yoo, and S. H. Lee, 2016 Frameshift mutations of OGDH, PPAT and
996 PCCA genes in gastric and colorectal cancers. Neoplasma 63: 681–686.
997 Joseph, C., O. Macnamara, M. Craze, R. Russell, E. Provenzano et al., 2018 Mediator complex (MED) 7: a
998 biomarker associated with good prognosis in invasive breast cancer, especially ER+ luminal
999 subtypes. Br. J. Cancer 118: 1142–1151.
1000 Kacser, H., and J. A. Burns, 1981 The Molecular Basis of Dominance. Genetics 97: 639–666. 43
1001 Kim, J.-S., J. W. Chang, J. K. Park, and S.-G. Hwang, 2012 Increased aldehyde reductase expression
1002 mediates acquired radioresistance of laryngeal cancer cells via modulating p53. Cancer Biol.
1003 Ther. 13: 638–646.
1004 Kim, M. S., E. G. Jeong, Y. J. Chung, N. J. Yoo, and S. H. Lee, 2008 Absence of somatic mutation of a
1005 tumor suppressor gene eukaryotic translation elongation factor 1, epsilon-1 (EEF1E1), in
1006 common human cancers. APMIS 116: 832–833.
1007 Kim, C. G., H. Lee, N. Gupta, S. Ramachandran, I. Kaushik et al., 2018 Role of Forkhead Box Class O
1008 proteins in cancer progression and metastasis. Semin. Cancer Biol. 50: 142–151.
1009 Knudson, A. G., 1993 Antioncogenes and human cancer. Proc. Natl. Acad. Sci. U. S. A. 90: 10914–10921.
1010 Knudson, A. G., 1971 Mutation and Cancer: Statistical Study of Retinoblastoma. Proc. Natl. Acad. Sci. U.
1011 S. A. 68: 820–823.
1012 Kobayashi, M., K. Harada, M. Negishi, and H. Katoh, 2014 Dock4 forms a complex with SH3YL1 and
1013 regulates cancer cell migration. Cell. Signal. 26: 1082–1088.
1014 Kruglyak, S., and H. Tang, 2000 Regulation of adjacent yeast genes. Trends Genet. TIG 16: 109–111.
1015 Kurbasic, E., M. Sjöström, M. Krogh, E. Folkesson, D. Grabau et al., 2015 Changes in glycoprotein
1016 expression between primary breast tumour and synchronous lymph node metastases or
1017 asynchronous distant metastases. Clin. Proteomics 12: 13.
1018 Lee, S. T., M. Feng, Y. Wei, Z. Li, Y. Qiao et al., 2013 Protein tyrosine phosphatase UBASH3B is
1019 overexpressed in triple-negative breast cancer and promotes invasion and metastasis. Proc.
1020 Natl. Acad. Sci. U. S. A. 110: 11121–11126.
1021 Lee, M., and A. R. Venkitaraman, 2014 A cancer-associated mutation inactivates a region of the high-
1022 mobility group protein HMG20b essential for cytokinesis. Cell Cycle 13: 2554–2563.
1023 Lehner, K., M. M. Stone, R. A. Farber, and T. D. Petes, 2007 Ninety-six haploid yeast strains with
1024 individual disruptions of open reading frames between YOR097C and YOR192C, constructed for 44
1025 the Saccharomyces genome deletion project, have an additional mutation in the mismatch
1026 repair gene MSH3. Genetics 177: 1951–1953.
1027 Leng, J.-J., H.-M. Tan, K. Chen, W.-G. Shen, and J.-W. Tan, 2012 Growth-inhibitory effects of MOB2 on
1028 human hepatic carcinoma cell line SMMC-7721. World J. Gastroenterol. 18: 7285–7289.
1029 Lengauer, C., K. W. Kinzler, and B. Vogelstein, 1998 Genetic instabilities in human cancers. Nature 396:
1030 643–649.
1031 Leone, P. E., M. Mendiola, J. Alonso, C. Paz-y-Miño, and A. Pestaña, 2003 Implications of a RAD54L
1032 polymorphism (2290C/T) in human meningiomas as a risk factor and/or a genetic marker. BMC
1033 Cancer 3: 6.
1034 Levine, A. J., 1993 The Tumor Suppressor Genes. Annu. Rev. Biochem. 62: 623–651.
1035 Li, L., Y. Cao, H. Zhou, Y. Li, B. He et al., 2018 Knockdown of CCNO decreases the tumorigenicity of gastric
1036 cancer by inducing apoptosis. OncoTargets Ther. 11: 7471–7481.
1037 Li, M., C. Jin, M. Xu, L. Zhou, D. Li et al., 2017a Bifunctional enzyme ATIC promotes propagation of
1038 hepatocellular carcinoma by regulating AMPK-mTOR-S6 K1 signaling. Cell Commun. Signal. CCS
1039 15: 52.
1040 Li, Z., L. Kong, L. Yu, J. Huang, K. Wang et al., 2014 Association between MSH6 G39E polymorphism and
1041 cancer susceptibility: a meta-analysis of 7,046 cases and 34,554 controls. Tumour Biol. J. Int.
1042 Soc. Oncodevelopmental Biol. Med. 35: 6029–6037.
1043 Li, Y., Q. Liang, Y. -q Wen, L. -l Chen, L. -t Wang et al., 2010 Comparative proteomics analysis of human
1044 osteosarcomas and benign tumor of bone. Cancer Genet. Cytogenet. 198: 97–106.
1045 Li, C., V. W. Liu, P. M. Chiu, D. W. Chan, and H. Y. Ngan, 2012 Over-expressions of AMPK subunits in
1046 ovarian carcinomas with significant clinical implications. BMC Cancer 12: 357.
1047 Li, Y., N. Sun, Z. Lu, S. Sun, J. Huang et al., 2017b Prognostic alternative mRNA splicing signature in non-
1048 small cell lung cancer. Cancer Lett. 393: 40–51. 45
1049 Li, A., Q. Yan, X. Zhao, J. Zhong, H. Yang et al., 2016 Decreased expression of PBLD correlates with poor
1050 prognosis and functions as a tumor suppressor in human hepatocellular carcinoma. Oncotarget
1051 7: 524–537.
1052 Lin, L.-H., Y.-W. Xu, L.-S. Huang, C.-Q. Hong, T.-T. Zhai et al., 2017 Serum proteomic-based analysis
1053 identifying autoantibodies against PRDX2 and PRDX3 as potential diagnostic biomarkers in
1054 nasopharyngeal carcinoma. Clin. Proteomics 14: 6.
1055 Liu, J.-W., J. K. Nagpal, W. Sun, J. Lee, M. S. Kim et al., 2008 ssDNA-binding protein 2 is frequently
1056 hypermethylated and suppresses cell growth in human prostate cancer. Clin. Cancer Res. Off. J.
1057 Am. Assoc. Cancer Res. 14: 3754–3760.
1058 Loeb, L. A., K. R. Loeb, and J. P. Anderson, 2003 Multiple mutations and cancer. Proc. Natl. Acad. Sci.
1059 100: 776–781.
1060 Mager, W. H., and J. Winderickx, 2005 Yeast as a model for medical and medicinal research. Trends
1061 Pharmacol. Sci. 26: 265–273.
1062 Maleva Kostovska, I., J. Wang, N. Bogdanova, P. Schürmann, S. Bhuju et al., 2016 Rare ATAD5 missense
1063 variants in breast and ovarian cancer patients. Cancer Lett. 376: 173–177.
1064 Manente, M., and M. Ghislain, 2009 The lipid-translocating exporter family and membrane phospholipid
1065 homeostasis in yeast. FEMS Yeast Res. 9: 673–687.
1066 Maragozidis, P., E. Papanastasi, D. Scutelnic, A. Totomi, I. Kokkori et al., 2015 Poly(A)-specific
1067 ribonuclease and Nocturnin in squamous cell lung cancer: prognostic value and impact on gene
1068 expression. Mol. Cancer 14:.
1069 Mason, J. M., H. L. Logan, B. Budke, M. Wu, M. Pawlowski et al., 2014 The RAD51-stimulatory compound
1070 RS-1 can exploit the RAD51 overexpression that exists in cancer cells and tumors. Cancer Res.
1071 74: 3546–3555. 46
1072 Ni, S., W. Weng, M. Xu, Q. Wang, C. Tan et al., 2018 miR-106b-5p inhibits the invasion and metastasis of
1073 colorectal cancer by targeting CTSA. OncoTargets Ther. 11: 3835–3845.
1074 Nicolas, E., E. A. Golemis, and S. Arora, 2016 POLD1: Central mediator of DNA replication and repair, and
1075 implication in cancer and other pathologies. Gene 590: 128–141.
1076 Ono, B., N. Ishii, S. Fujino, and I. Aoyama, 1991 Role of hydrosulfide ions (HS-) in methylmercury
1077 resistance in Saccharomyces cerevisiae. Appl. Environ. Microbiol. 57: 3183–3186.
1078 Oo, H. Z., K. Sentani, S. Mukai, T. Hattori, S. Shinmei et al., 2016 Fukutin, identified by the Escherichia
1079 coli ampicillin secretion trap (CAST) method, participates in tumor progression in gastric cancer.
1080 Gastric Cancer Off. J. Int. Gastric Cancer Assoc. Jpn. Gastric Cancer Assoc. 19: 443–452.
1081 Oo, H. Z., K. Sentani, N. Sakamoto, K. Anami, Y. Naito et al., 2014 Identification of novel transmembrane
1082 proteins in scirrhous-type gastric cancer by the Escherichia coli ampicillin secretion trap (CAST)
1083 method: TM9SF3 participates in tumor invasion and serves as a prognostic factor. Pathobiol. J.
1084 Immunopathol. Mol. Cell. Biol. 81: 138–148.
1085 Pan, J., D. Cao, and J. Gong, 2018 The endoplasmic reticulum co-chaperone ERdj3/DNAJB11 promotes
1086 hepatocellular carcinoma progression through suppressing AATZ degradation. Future Oncol.
1087 Lond. Engl. 14: 3001–3013.
1088 Patra, K. C., Q. Wang, P. T. Bhaskar, L. Miller, Z. Wang et al., 2013 Hexokinase 2 is required for tumor
1089 initiation and maintenance and its systemic deletion is therapeutic in mouse models of cancer.
1090 Cancer Cell 24: 213–228.
1091 Payne, S. R., and C. J. Kemp, 2005 Tumor suppressor genetics. Carcinogenesis 26: 2031–2045.
1092 Protchenko, O., M. Shakoury-Elizeh, P. Keane, J. Storey, R. Androphy et al., 2008 Role of PUG1 in
1093 Inducible Porphyrin and Heme Transport in Saccharomyces cerevisiae. Eukaryot. Cell 7: 859–
1094 871. 47
1095 Provost, E., J. M. Bailey, S. Aldrugh, S. Liu, C. Iacobuzio-Donahue et al., 2014 The Tumor Suppressor rpl36
1096 Restrains KRASG12V-Induced Pancreatic Cancer. Zebrafish 11: 551–559.
1097 Qi, T., W. Zhang, Y. Luan, F. Kong, D. Xu et al., 2014 Proteomic profiling identified multiple short-lived
1098 members of the central proteome as the direct targets of the addicted oncogenes in cancer
1099 cells. Mol. Cell. Proteomics MCP 13: 49–62.
1100 Qin, Y., Q. Hu, J. Xu, S. Ji, W. Dai et al., 2019 PRMT5 enhances tumorigenicity and glycolysis in pancreatic
1101 cancer via the FBW7/cMyc axis. Cell Commun. Signal. 17: 30.
1102 Qiu, X., X. He, Q. Huang, X. Liu, G. Sun et al., 2015 Overexpression of CCT8 and its significance for tumor
1103 cell proliferation, migration and invasion in glioma. Pathol. - Res. Pract. 211: 717–725.
1104 Rahman, M. R., T. Islam, E. Gov, B. Turanli, G. Gulfidan et al., 2019 Identification of Prognostic Biomarker
1105 Signatures and Candidate Drugs in Colorectal Cancer: Insights from Systems Biology Analysis.
1106 Med. Kaunas Lith. 55:.
1107 Reid, R. J. D., X. Du, I. Sunjevaric, V. Rayannavar, J. Dittmar et al., 2016 A Synthetic Dosage Lethal Genetic
1108 Interaction Between CKS1B and PLK1 Is Conserved in Yeast and Human Cancer Cells. Genetics
1109 204: 807–819.
1110 Rivals, I., L. Personnaz, L. Taing, and M.-C. Potier, 2007 Enrichment or depletion of a GO category within
1111 a class of genes: which test? Bioinformatics 23: 401–407.
1112 Romero-Garcia, S., H. Prado-Garcia, and J. S. Lopez-Gonzalez, 2014 Transcriptional analysis of hnRNPA0,
1113 A1, A2, B1, and A3 in lung cancer cell lines in response to acidosis, hypoxia, and serum
1114 deprivation conditions. Exp. Lung Res. 40: 12–21.
1115 Sabatti, C., S. Service, and N. Freimer, 2003 False discovery rate in linkage and association genome
1116 screens for complex disorders. Genetics 164: 829–833.
1117 Saritha, V. N., V. S. Veena, K. M. Jagathnath Krishna, T. Somanathan, and K. Sujathan, 2018 Significance
1118 of DNA Replication Licensing Proteins (MCM2, MCM5 and CDC6), p16 and p63 as Markers of 48
1119 Premalignant Lesions of the Uterine Cervix: Its Usefulness to Predict Malignant Potential. Asian
1120 Pac. J. Cancer Prev. APJCP 19: 141–148.
1121 Saveanu, C., M. Fromont-Racine, A. Harington, F. Ricard, A. Namane et al., 2001 Identification of 12 new
1122 yeast mitochondrial ribosomal proteins including 6 that have no prokaryotic homologues. J. Biol.
1123 Chem. 276: 15861–15867.
1124 Schmidlin Tom, Kaeberlein Matt, Kudlow Brian A., MacKay Vivian, Lockshon Daniel et al., 2007 Single‐
1125 gene deletions that restore mating competence to diploid yeast. FEMS Yeast Res. 8: 276–286.
1126 Shafique, S., and S. Rashid, 2018 Structural basis for renal cancer by the dynamics of pVHL-dependent
1127 JADE1 stabilization and β-catenin regulation. Prog. Biophys. Mol. Biol.
1128 Shan, N., W. Zhou, S. Zhang, and Y. Zhang, 2016 Identification of HSPA8 as a candidate biomarker for
1129 endometrial carcinoma by using iTRAQ-based proteomic analysis. OncoTargets Ther.
1130 Shaner, L., H. Wegele, J. Buchner, and K. A. Morano, 2005 The yeast Hsp110 Sse1 functionally interacts
1131 with the Hsp70 chaperones Ssa and Ssb. J. Biol. Chem. 280: 41262–41269.
1132 Shen, X., S. Kan, J. Hu, M. Li, G. Lu et al., 2016 EMC6/TMEM93 suppresses glioblastoma proliferation by
1133 modulating autophagy. Cell Death Dis. 7: e2043.
1134 Shi, S. Q., J.-J. Ke, Q. S. Xu, W. Q. Wu, and Y. Y. Wan, 2018 Integrated network analysis to identify the key
1135 genes, transcription factors, and microRNAs involved in hepatocellular carcinoma. Neoplasma
1136 65: 66–74.
1137 Shimozono, N., M. Jinnin, M. Masuzawa, M. Masuzawa, Z. Wang et al., 2015 NUP160-SLC43A3 is a novel
1138 recurrent fusion oncogene in angiosarcoma. Cancer Res. 75: 4458–4465.
1139 Siołek, M., C. Cybulski, D. Gąsior-Perczak, A. Kowalik, B. Kozak-Klonowska et al., 2015 CHEK2 mutations
1140 and the risk of papillary thyroid cancer. Int. J. Cancer 137: 548–552. 49
1141 Song, H., E. Dicks, S. J. Ramus, J. P. Tyrer, M. P. Intermaggio et al., 2015 Contribution of Germline
1142 Mutations in the RAD51B, RAD51C, and RAD51D Genes to Ovarian Cancer in the Population. J.
1143 Clin. Oncol. 33: 2901–2907.
1144 Song, L., Y. Wang, J. Zhang, N. Song, X. Xu et al., 2018 The risks of cancer development in systemic lupus
1145 erythematosus (SLE) patients: a systematic review and meta-analysis. Arthritis Res. Ther. 20:.
1146 Stirling, P. C., M. S. Bloom, T. Solanki-Patil, S. Smith, P. Sipahimalani et al., 2011 The Complete Spectrum
1147 of Yeast Chromosome Instability Genes Identifies Candidate CIN Cancer Genes and Functional
1148 Roles for ASTRA Complex Components. PLoS Genet. 7:.
1149 Strome, E. D., X. Wu, M. Kimmel, and S. E. Plon, 2008 Heterozygous Screen in Saccharomyces cerevisiae
1150 Identifies Dosage-Sensitive Genes That Affect Chromosome Stability. Genetics 178: 1193–1207.
1151 Sun, M., L. Ma, L. Xu, J. Li, W. Zhang et al., 2002 A human novel gene DERPC on 16q22.1 inhibits prostate
1152 tumor cell growth and its expression is decreased in prostate and renal tumors. Mol. Med. 8:
1153 655–663.
1154 Sung, Y., S. Park, S. J. Park, J. Jeong, M. Choi et al., 2018 Jazf1 promotes prostate cancer progression by
1155 activating JNK/Slug. Oncotarget 9: 755–765.
1156 Taguchi, A., J.-H. Rho, Q. Yan, Y. Zhang, Y. Zhao et al., 2015 MAPRE1 as a plasma biomarker for early-
1157 stage colorectal cancer and adenomas. Cancer Prev. Res. Phila. Pa 8: 1112–1119.
1158 Tan, S.-X., R.-C. Hu, J.-J. Liu, Y.-L. Tan, and W.-E. Liu, 2014 Methylation of PRDM2, PRDM5 and PRDM16
1159 genes in lung cancer cells. Int. J. Clin. Exp. Pathol. 7: 2305–2311.
1160 The, I., S. Ruijtenberg, B. P. Bouchet, A. Cristobal, M. B. W. Prinsen et al., 2015 Rb and FZR1/Cdh1
1161 determine CDK4/6-cyclin D requirement in C. elegans and human cancer cells. Nat. Commun. 6:
1162 5906. 50
1163 Thiagalingam, S., R. L. Foy, K. Cheng, H. J. Lee, A. Thiagalingam et al., 2002 Loss of heterozygosity as a
1164 predictor to map tumor suppressor genes in cancer: molecular basis of its occurrence. Curr.
1165 Opin. Oncol. 14: 65–72.
1166 Tina, E., S. Prosén, S. Lennholm, G. Gasparyan, M. Lindberg et al., 2019 Expression profile of the amino
1167 acid transporters SLC7A5, SLC7A7, SLC7A8 and the enzyme TDO2 in basal cell carcinoma. Br. J.
1168 Dermatol. 180: 130–140.
1169 Tóth, C., F. Sükösd, E. Valicsek, E. Herpel, P. Schirmacher et al., 2017 Expression of ERCC1, RRM1, TUBB3
1170 in correlation with apoptosis repressor ARC, DNA mismatch repair proteins and p53 in liver
1171 metastasis of colorectal cancer. Int. J. Mol. Med. 40: 1457–1465.
1172 Trotman, L. C., M. Niki, Z. A. Dotan, J. A. Koutcher, A. D. Cristofano et al., 2003 Pten Dose Dictates Cancer
1173 Progression in the Prostate. PLOS Biol. 1: e59.
1174 Veitia, R. A., 2002 Exploring the etiology of haploinsufficiency. BioEssays 24: 175–184.
1175 Vladusic, T., R. Hrascan, N. Pecina-Slaus, I. Vrhovac, M. Gamulin et al., 2010 Loss of heterozygosity of
1176 CDKN2A (p16INK4a) and RB1 tumor suppressor genes in testicular germ cell tumors. Radiol.
1177 Oncol. 44: 168–173.
1178 Wang, C., J.-F. Chang, H. Yan, D.-L. Wang, Y. Liu et al., 2015 A conserved RAD6-MDM2 ubiquitin ligase
1179 machinery targets histone chaperone ASF1A in tumorigenesis. Oncotarget 6: 29599–29613.
1180 Wang, J., W. Chen, W. Wei, and J. Lou, 2017a Oncogene TUBA1C promotes migration and proliferation
1181 in hepatocellular carcinoma and predicts a poor prognosis. Oncotarget 8: 96215–96224.
1182 Wang, T., M. A. Goodman, R. L. McGough, K. R. Weiss, and U. N. M. Rao, 2014a Immunohistochemical
1183 Analysis of Expressions of RB1, CDK4, HSP90, cPLA2G4A, and CHMP2B Is Helpful in Distinction
1184 between Myxofibrosarcoma and Myxoid Liposarcoma. Int. J. Surg. Pathol. 22: 589–599. 51
1185 Wang, Y., Y.-X. Qi, Z. Qi, and S.-Y. Tsang, 2019 TRPC3 Regulates the Proliferation and Apoptosis
1186 Resistance of Triple Negative Breast Cancer Cells through the TRPC3/RASA4/MAPK Pathway.
1187 Cancers 11:.
1188 Wang, J., J. Qian, M. D. Hoeksema, Y. Zou, A. V. Espinosa et al., 2013 Integrative genomics analysis
1189 identifies candidate drivers at 3q26-29 amplicon in squamous cell carcinoma of the lung. Clin.
1190 Cancer Res. Off. J. Am. Assoc. Cancer Res. 19: 5580–5590.
1191 Wang, Y., F. Ren, Y. Wang, Y. Feng, D. Wang et al., 2014b CHIP/Stub1 functions as a tumor suppressor
1192 and represses NF-κB-mediated signaling in colorectal cancer. Carcinogenesis 35: 983–991.
1193 Wang, Y., N. Wu, B. Pang, D. Tong, D. Sun et al., 2017b TRIB1 promotes colorectal cancer cell migration
1194 and invasion through activation MMP-2 via FAK/Src and ERK pathways. Oncotarget 8: 47931–
1195 47942.
1196 Wang, L., M. Zhang, and D.-X. Liu, 2014c Knock-down of ABCE1 gene induces G1/S arrest in human oral
1197 cancer cells. Int. J. Clin. Exp. Pathol. 7: 5495–5504.
1198 Wu, Y., A. Wang, B. Zhu, J. Huang, E. Lu et al., 2018 KIF18B promotes tumor progression through
1199 activating the Wnt/β-catenin pathway in cervical cancer. OncoTargets Ther. 11: 1707–1720.
1200 Xie, Y., A. Wang, J. Lin, L. Wu, H. Zhang et al., 2017 Mps1/TTK: a novel target and biomarker for cancer. J.
1201 Drug Target. 25: 112–118.
1202 Yamada, K. M., and A. Hall, 2015 Reproducibility and cell biology. J Cell Biol 209: 191–193.
1203 Yamaguchi, K., R. Yamaguchi, N. Takahashi, T. Ikenoue, T. Fujii et al., 2014 Overexpression of Cohesion
1204 Establishment Factor DSCC1 through E2F in Colorectal Cancer (S. P. Chellappan, Ed.). PLoS ONE
1205 9: e85750.
1206 Yang, Y., Y. Gao, L. Mutter-Rottmayer, A. Zlatanou, M. Durando et al., 2017 DNA repair factor RAD18 and
1207 DNA polymerase Polκ confer tolerance of oncogenic DNA replication stress. J. Cell Biol. 216:
1208 3097–3115. 52
1209 Yang, F., J. Xu, H. Li, M. Tan, X. Xiong et al., 2019 FBXW2 suppresses migration and invasion of lung
1210 cancer cells via promoting β-catenin ubiquitylation and degradation. Nat. Commun. 10: 1382.
1211 Yang, Z., L. Zhuang, P. Szatmary, L. Wen, H. Sun et al., 2015 Upregulation of heat shock proteins
1212 (HSPA12A, HSP90B1, HSPA4, HSPA5 and HSPA6) in tumour tissues is associated with poor
1213 outcomes from HBV-related early-stage hepatocellular carcinoma. Int. J. Med. Sci. 12: 256–263.
1214 Yuen, K. W. Y., C. D. Warren, O. Chen, T. Kwok, P. Hieter et al., 2007 Systematic genome instability
1215 screens in yeast and their potential relevance to cancer. Proc. Natl. Acad. Sci. U. S. A. 104: 3925–
1216 3930.
1217 Yunlei, Z., C. Zhe, L. Yan, W. Pengcheng, Z. Yanbo et al., 2013 INMAP, a novel truncated version of
1218 POLR3B, represses AP-1 and p53 transcriptional activity. Mol. Cell. Biochem. 374: 81–89.
1219 Zhang, X., and T. F. Smith, 1998 Yeast “Operons.” Microb. Comp. Genomics 3: 133–140.
1220 Zhang, Z., F. Wang, C. Du, H. Guo, L. Ma et al., 2017 BRM/SMARCA2 promotes the proliferation and
1221 chemoresistance of pancreatic cancer cells by targeting JAK2/STAT3 signaling. Cancer Lett. 402:
1222 213–224.
1223 Zhang, Y.-F., B.-C. Zhang, A.-R. Zhang, T.-T. Wu, J. Liu et al., 2013 Co-transduction of ribosomal protein
1224 L23 enhances the therapeutic efficacy of adenoviral-mediated p53 gene transfer in human
1225 gastric cancer. Oncol. Rep. 30: 1989–1995.
1226 Zheng, Q., 2008 A note on plating efficiency in fluctuation experiments. Math. Biosci. 216: 150–153.
1227 Zheng, Q., 2016 Comparing mutation rates under the Luria-Delbrück protocol. Genetica 144: 351–359.
1228 Zheng, Q., 2015 Methods for comparing mutation rates using fluctuation assay data. Mutat. Res. 777:
1229 20–22.
1230 Zheng, Q., 2002 Statistical and algorithmic methods for fluctuation analysis with SALVADOR as an
1231 implementation. Math. Biosci. 176: 237–252. 53
1232 Zhong, A., H. Zhang, S. Xie, M. Deng, H. Zheng et al., 2018 Inhibition of MUS81 improves the chemical
1233 sensitivity of olaparib by regulating MCM2 in epithelial ovarian cancer. Oncol. Rep.
1234 Zhou, W., X. Feng, C. Ren, X. Jiang, W. Liu et al., 2013 Over-expression of BCAT1, a c-Myc target gene,
1235 induces cell proliferation, migration and invasion in nasopharyngeal carcinoma. Mol. Cancer 12:
1236 53.
1237 Zhou, S., Y. Shen, M. Zheng, L. Wang, R. Che et al., 2017a DNA methylation of METTL7A gene body
1238 regulates its transcriptional level in thyroid cancer. Oncotarget 8: 34652–34660.
1239 Zhou, P., N. Xiao, J. Wang, Z. Wang, S. Zheng et al., 2017b SMC1A recruits tumor-associated-fibroblasts
1240 (TAFs) and promotes colorectal cancer metastasis. Cancer Lett. 385: 39–45.
1241 Zhu, J., D. Heinecke, W. A. Mulla, W. D. Bradford, B. Rubinstein et al., 2015 Single-Cell Based
1242 Quantitative Assay of Chromosome Transmission Fidelity. G3 Bethesda Md 5: 1043–1056.
1243 Zhu, Y., H. Lin, Z. Li, M. Wang, and J. Luo, 2001 Modulation of expression of ribosomal protein L7a
1244 (rpL7a) by ethanol in human breast cancer cells. Breast Cancer Res. Treat. 69: 29–38.
1245
1246
1247 Fig 1. Representative LOH events impacting the MAT locus that can result in diploids
1248 exhibiting haploid mating behavior Due to the co-repressible nature of the MAT locus, both
1249 MATa and MATα alleles must be present and active in order to suppress the production of
1250 mating pheromones and receptors, leading to the non-mating diploid phenotype.
1251 Fig 2. Theory and set-up of LOH screen at the MAT locus A) Example of a possible triploid
1252 cell formation mechanism that would allow for growth under double selection conditions. The
1253 LOH event could occur through a variety of mechanisms, all leading to the ability of the diploid
1254 cell to mate. B) Model of the MAT locus LOH screening methodology. C) The scoring system of 54
1255 colony counts utilized in the screening. Strains demonstrating a minimum score of “+++” in all
1256 four replicates for mating with a particular haploid mating type were included as a top-hit.
1257 Fig 3. Multiple independent screen selection and gene lists comparison A) Data sets were
1258 compared across screens that either assayed heterozygous deletions, assayed for LOH events,
1259 or both. Screens looking at haploinsufficiency and its connection to genome instability, or
1260 utilizing homozygous knockouts to understand LOH mechanisms were chosen as they provide
1261 the most relevant data sets for comparison. B) Gene lists from the six screens selected for
1262 comparison were mapped for their overlapping top-hits. Gene deletions identified in two
1263 independent studies are shown in burgundy, whereas genes appearing as hits in three
1264 independent studies are shown in pink. If a particular gene deletion reproduced in multiple
1265 screens within the same publication, the solid color was changed to a striped pattern. Only
1266 screens performed in a diploid system were considered when determining the number of
1267 screens a gene reproduced in.
1268 Fig 4. Positional Gene Enrichment (PGE) analysis for enriched chromosome regions
1269 Genes highlighted in green were identified by the MAT screen; genes highlighted in orange
1270 were identified by the MAT and MET15 screens; genes highlighted in blue were identified by
1271 one independent study (studies selected for comparison are shown in Fig. 3); genes highlighted
1272 in burgundy were identified in two independent studies; genes highlighted in pink were identified
1273 by three independent studies. The key denoting color labels remains the same throughout all
1274 parts of the figure. A) Enriched regions of chromosome II from PGE analysis on MAT locus
-4 -6 1275 screen top-hits (padj = 2.74 x 10 ), MAT + MET15 top-hits (padj = 9.69 x 10 ), and Multiple
1276 Screen Comparison top-hits (six screens compared in Fig. 3), (BP region 442918-575991) (padj
1277 = p=0.016). B) Enriched regions of chromosome V from PGE analysis on MAT locus screen top-
-4 -6 1278 hits (padj = 2.31 x 10 ), MAT + MET15 top-hits (p = 9.69 x 10 ), and Multiple Screen
-5 1279 Comparison top-hits (padj = 2.03x10 ). C) Enriched regions of chromosome VII from PGE 55
1280 analysis on MAT locus screen top-hits (padj = 0.00358). No enriched regions of 3 or more
1281 identified genes were enriched on this chromosome when the MAT + MET15 or Multiple Screen
1282 Comparison datasets were analyzed. D) Enriched regions of chromosome IX from PGE analysis
-4 1283 on MAT locus screen top-hits (padj = 4.40 x 10 ), MAT + MET15 top-hits (padj = 0.00194), and
1284 Multiple Screen Comparison top-hits (padj = 0.0299). E) Enriched regions of chromosome X from
1285 PGE analysis on MAT locus screen top-hits (padj = 0.00901) and Multiple Screen Comparison
-4 1286 top-hits (padj = 4.18x10 ). No regions containing three or more identified genes were enriched
1287 when our MAT locus dataset was run. F) Enriched region of chromosome XI from PGE analysis
1288 on MAT locus screen top-hits (padj = 0.00751) and the Multiple Screen Comparison top-hits (padj
1289 = 0.013). G) Enriched region of chromosome XV from PGE analysis on MAT locus screen top-
1290 hits (padj = 0.00288), and MAT + MET15 top-hits (padj = 0.00332). No enriched regions containing
1291 more than three identified genes are found on chromosome XV for the Multiple Screen
1292 Comparison analysis. H) Enriched region of chromosome XVI from PGE analysis on Multiple
1293 Screen Comparison dataset (padj = 0.046).
1294 Figure 5. LOH rates at the CAN1 locus due to nine separate heterozygous gene mutations
1295 The data shown represents a combination of a minimum of two independent experiments. The
1296 black circle depicts the mean LOH rate with the tails showing the experimental 95% CIs. Non-
1297 overlapping 95% CIs, to the wildtype BY4743 strain, are considered significantly different as the
1298 95% CI overlap method mimics a two-tailed, two-population t-test at the conventional p < 0.05
1299 level with an improvement in type I error rate and statistical power when compared to a t-test,
1300 which has been found unsuitable for FA data analysis (Zheng 2015).
1301 Table 1. Multiple Screen Comparison Gene Identification Overlap A hypergeometric
1302 probability was calculated using a normal approximation using the webtool
1303 http://nemates.org/MA/progs/overlap_stats.html. The same tool was used to calculate a
1304 representation factor. A representation factor is calculated as the number of genes in common 56
1305 between two studies divided by the number of expected genes. The number of expected genes
1306 is estimated as the number of genes in the first study times the number of genes in the second
1307 study which is then divided by the total number of genes that were screened. Representation
1308 factors greater than 1 indicate more overlap than expected, representation factors less than 1
1309 indicate less overlap than expected.
1310 Table 2. Go Slim Mapper and Go Term Finder Results for A) 217 MAT Top Hits B) 100
1311 MAT + MET15 Top Hits Significantly enriched gene ontology categories identified with SGD
1312 Slim Mapper and SGD Term Finder tools are shown. SGD Term Finder reported multiple
1313 comparison corrected p-values < 0.05 are shown. P-values from SGD Slim Mapper were
1314 corrected with the Benjamini-Hochberg critical value. For SGD Slim Mapper, all significantly
1315 enriched categories with a uncorrected p-value < their Benjamini-Hochberg critical value (Q =
1316 0.05) are shown. Gene Ontology Identification numbers (GOID) and names of the genes that
1317 represent the enrichment are included.
1318 Table S1. Complete MAT locus LOH screen data Score data for every gene in the SCDChet
1319 for each of the individual experimental replicates.
1320 Table S2. Top hits identified in MAT locus LOH screen Two hundred and seventeen genes,
1321 when deleted, contribute to LOH at the MAT locus. Genes reported are listed based on the sum
1322 of the “+” scores of all four replicates of mating with a MATa or MATα haploid tester, with each
1323 mating type scored separately. Of the 217 genes identified, 17.5% of them are essential; genes
1324 that have systematic ORF identifiers but no standard name (generally attributed to a lack of
1325 understanding of their function) also make up 17.5% of the dataset. For a complete list of scores
1326 recorded for every gene in the collection, as well as genes excluded from the analysis, see
1327 Table S1. 57
1328 Table S3. Sector counts of the 217 SCDChet gene deletions analyzed in secondary
1329 MET15 screen Score data, for all 217 top-hits from the MAT locus screen, in the MET15
1330 sectoring screen for each of the individual experimental replicates.
1331 Table S4. GO Slim Mapper results for the 217 identified genes in the MAT locus screen
1332 SGD GO Slim Mapper results for Cellular Component, Molecular Function, Biological Process,
1333 and Macromolecular Complex reported with Benjamini-Hochberg critical values for multiple
1334 comparison correction.
1335 Table S5. GO Slim Mapper results for the 100 identified genes in the MET15 secondary
1336 screen SGD GO Slim Mapper results for Cellular Component, Molecular Function, Biological
1337 Process, and Macromolecular Complex reported with Benjamini-Hochberg critical values for
1338 multiple comparison correction.
1339 Table S6. GO Slim Mapper and GO Term Finder results for the 91 genes identified in two
1340 or more related independent studies SGD GO Slim Mapper and GO Term Finder results for
1341 Cellular Component, Molecular Function, Biological Process, and Macromolecular Complex
1342 reported with Benjamini-Hochberg critical values and corrected p-values for multiple comparison
1343 correction.
1344 Table S7. Positional Gene Enrichment analysis of the 217 MAT Top Hits, 100 MAT +
1345 MET15 Top hits, and 91 genes identified in 2+ independent studies All chromosomal
1346 regions reported from PGE analysis are included with basepair positioning and adjusted p-
1347 values. Further the number of elements identified and the total number of elements in the region
1348 are reported.
1349 Table S8. Human homologs and their associated cancer types for genes identified by the
1350 MAT locus screen Literature searches for cancer association were conducted for every MAT 58
1351 screen identified yeast gene with a known human homolog. The yeast systematic and standard
1352 names are included as well as the name of the human homolog and the citation for literature
1353 associating that human homolog with cancer. Whether the yeast genes were additionally
1354 identified in the MET15 screen is also noted. (Zhu et al. 2001; Chen et al. 2004; Kim et al.
1355 2008; Antonacopoulou et al. 2008; Liu et al. 2008; Datta 2009; Li et al. 2010; Fang et al.
1356 2010; Chang et al. 2011; Kim et al. 2012; Li et al. 2012; Leng et al. 2012; Fernández-Sáiz et
1357 al. 2013; Faria et al. 2013; Yunlei et al. 2013; Zhou et al. 2013; Lee et al. 2013; Patra et al.
1358 2013; Zhang et al. 2013; Wang et al. 2013; Gutierrez-Erlandsson et al. 2013; Oo et al. 2014;
1359 Tan et al. 2014; Wang et al. 2014c; Qi et al. 2014; Romero-Garcia et al. 2014; Brant and
1360 Leikauf 2014; Kobayashi et al. 2014; Wang et al. 2014b; Li et al. 2014; Boele et al. 2014;
1361 Lee and Venkitaraman 2014; Di Giammartino and Manley 2014; Provost et al. 2014;
1362 Hlaváč et al. 2014; Donato et al. 2015; Kurbasic et al. 2015; Yang et al. 2015; Guest et al.
1363 2015; Ciou et al. 2015; Fang et al. 2015; Wang et al. 2015; Qiu et al. 2015; Shimozono et al.
1364 2015; Chang et al. 2015; Chen et al. 2016; Jo et al. 2016; Li et al. 2016; Shen et al. 2016;
1365 He et al. 2016; Shan et al. 2016; Bai et al. 2016; Reid et al. 2016; Chen et al. 2017; Hattori
1366 et al. 2017; Lin et al. 2017; Xie et al. 2017; Zhang et al. 2017; Zhou et al. 2017b; Li et al.
1367 2017b; Zhou et al. 2017a; Wang et al. 2017b; Engin et al. 2017; Bausch et al. 2017; Wang
1368 et al. 2017a; Li et al. 2017a; Cho et al. 2018; Jin et al. 2018; Ni et al. 2018; Shi et al. 2018;
1369 Wu et al. 2018; Chou et al. 2018; Sung et al. 2018 p. 1; Chen et al. 2018; Zhong et al. 2018;
1370 Saritha et al. 2018; Evensen et al. 2018; Fang et al. 2018; Joseph et al. 2018; Ding et al.
1371 2018; Choi et al. 2018; Pan et al. 2018; Yang et al. 2019; Tina et al. 2019; Husni et al. 2019;
1372 Chen et al. 2019; Apasu et al. 2019; Wang et al. 2019).
1373 Table S9. Human homologs and their associated cancer types for genes identified by
1374 multiple independent studies Literature searches for cancer association were conducted for
1375 every multiple screen comparison identified yeast gene with a known human homolog. The 59
1376 yeast systematic and standard names are included as well as the name of the human homolog
1377 and the citation for literature associating that human homolog with cancer. (Abdel-Fatah et al.
1378 2014 p.; Hennecke et al. 2015; Leone et al. 2003; Mason et al. 2014; Sun et al. 2002;
1379 Taguchi et al. 2015; The et al. 2015 p. 1; Wang et al. 2015; Guest et al. 2015; Wang et al.
1380 2014c; Shan et al. 2016; Qi et al. 2014 p. 200; Saritha et al. 2018; Zhong et al. 2018; Kim et
1381 al. 2018; Datta 2009; Zhou et al. 2017b; Yang et al. 2015; Shi et al. 2018; Fang et al. 2018;
1382 Chen et al. 2004; Donato et al. 2015; Tina et al. 2019; Zhu et al. 2001; Patra et al. 2013;
1383 Song et al. 2015; Rahman et al. 2019; Diefenbacher and Orian 2017; Shafique and Rashid
1384 2018; Gagou et al. 2014; Nicolas et al. 2016; Wang et al. 2014a; Oo et al. 2016; Baldeyron
1385 et al. 2015; Babeu et al. 2019; Maleva Kostovska et al. 2016; Dong et al. 2018; Broustas et
1386 al. 2019; Qin et al. 2019; Huang et al. 2014; Dai et al. 2016; Chen et al. 2018; Ding et al.
1387 2018; Chen et al. 2019; Wang et al. 2019; Provost et al. 2014; Tina et al. 2019; Lee and
1388 Venkitaraman 2014; Maragozidis et al. 2015; Yamaguchi et al. 2014; Yang et al. 2017; He
1389 et al. 2018; Tóth et al. 2017 p. 3; Li et al. 2018; Bianco et al. 2019; Siołek et al. 2015)
1390 Table S10. Positional Gene Enrichment analysis of genes identified by similar
1391 independent studies All chromosomal regions reported from PGE analysis run on the full 898
1392 top-hits list compiled from all genes identified across all six screens compared in this study.
1393 Enriched regions are included with basepair positioning and adjusted p-values. Further the
1394 number of elements identified and the total number of elements in the region are reported. The
1395 systematic name of every gene in that region are included and color-coded based on which of
1396 the six screen identified each gene. Orange: (Strome et al. 2008); Yellow: (Choy et al. 2013);
1397 Pink: This Study; Green: (Schmidlin Tom et al. 2007); Blue:(Yuen et al. 2007); Purple:
1398 (Andersen et al. 2008). Systematics names in gray indicate the gene was identified in two
1399 independent studies, in black from three independent studies. 60
1400 Figure S1. Positional Gene Enrichment example figure Chr II Visual representation of the
1401 data of chromosome II PGE analysis including gene name, study which identified the gene, and
1402 type of screen/analysis performed (SCDChet, SCDChom, non-SCDC). The majority of genes
1403 representing the enriched region of chromosome II come from heterozygous screens, and most
1404 of these are found utilizing SCDChet. Table 1. Multiple Screen Comparison Gene Identification Overlap
Number of Total Number Mutants with Number Overlapping Representation Screen of Genes p-value Phenotype with This Study “hits” Factor Screened (“hits”) This Study 6477 217 (180*) - - - Choy et al. 2013 6477 332 26 0.00003995 2.3 Strome et al. 2008 6477 164 4 0.351 0.7 Andersen et al. 2008 5134 61 3 0.362 1.4 Yuen et al. 2006 5134 122 5 0.427 1.2 Schmidlin et al. 2007 5134 100 7 0.060 2.0 *217 total top-hit genes identified, 180 non-essential top-hit genes were used for comparison with homozygous knockout screens.
A hypergeometric probability was calculated using a normal approximation using the webtool http://nemates.org/MA/progs/overlap_stats.html. The same tool was used to calculate a representation factor. A representation factor is calculated as the number of genes in common between two studies divided by the number of expected genes. The number of expected genes is estimated as the number of genes in the first study times the number of genes in the second study which is then divided by the total number of genes that were screened. Representation factors greater than 1 indicate more overlap than expected, representation factors less than 1 indicate less overlap than expected.
Table 2. Go Slim Mapper and Go Term Finder Results
A. 217 MAT Top Hits List
Benjamini- Hochburg GO GO GO Term p-value critical Genes in term Categories Method value (Q = 0.05) SGD Slim - - - - Mapper Cellular Chaperonin- Component SGD Term containing T- 0.04889 - SSA1, CCT2, CCT8, CCT7 Finder complex (GOID: 5832) KIN3, IML3, MRC1, MPS1, SPC19, Chromosome SGD Slim 0.0004605 0.00049505 SMC1, KIP3, SPO22, STH1, CSM2, segregation Biological Mapper HSK3, CTF3, TUB1, NDJ1, SGO1, (GOID: 7059) Process KIN4, GPN2 SGD Term - - - - Finder YAL067W-A, YBR096W, VID24, IML3, YBR137W, YBR144C, APD1, UBS1, HSM3, LDB16, MRC1, BPH1, RMD1, QRI7, RGT2, YDL211C, YDR029W, MRH1, SSY1, UBX5, ECM11, YDR509W, EMI2, GRH1, ZRG8, YER087C-A, TMN3, DSE1, YER135C, BCK2, YER181C, PUG1, SNO3, YGL218W, MTC3, SHE10, NNF2, YGR201C, YHI9, MTC6, AIM18, Molecular YIL025C, YIL032C, MMF1, YIL060W, SGD Slim Function 0.0004997 0.001136 SPO22, AIM19, ICE2, CSM2, SYS1, Molecular Mapper Unknown YJL009W, PRY3, PRM10, YJL120W, Function (GOID: 3674) SPC1, HIT1, AIM24, ILM1, YKL018C- A, TTI1, FAT3, YKR073C, EMC6, PER33, YLR342W-A, YLR374C, CTF3, YML037C, TUB1, YML094C-A, YMR105W-A, YMR119W-A, YMR122C, YMR153C-A, YNL146C-A, YNL146W, VID27, RTC4, BSC5, YOL134C, MED7, YOR072W, SGO1, AIM41, YOR364W, YPL080C, YIG1, OPY2 SGD Term - - - - Finder
B. 100 MAT + MET15 Top Hits List
Benjamini- Hochburg GO GO GO Term p-value critical Genes in term Categories Method value (Q = 0.05) SGD Slim - - - - Cellular Mapper Component SGD Term - - - - Finder SGD Slim - - - - Biological Mapper Process SGD Term - - - - Finder YAL067W-A, YBR096W, IML3, YBR137W, YBR144C, APD1, HSM3, BPH1, RGT2, YDL211C, MRH1, ECM11, YDR509W, GRH1, ZRG8, YER087C-A, TMN3, DSE1, Molecular YER135C, BCK2, YER181C, PUG1, SGD Slim Function 0.000233 0.001136 SNO3, NNF2, MMF1, YIL060W, Molecular Mapper Unknown AIM19, ICE2, SYS1, YJL009W, Function (GOID: 3674) PRY3, YJL120W, SPC1, HIT1, AIM24, YKL018C-A, TUB1, YML094C-A, YMR105W-A, YMR122C, VID27, RTC4, MED7, YOR072W, SGO1, YOR364W, OPY2
SGD Term - - - - Finder
Significantly enriched gene ontology categories identified with SGD Slim Mapper and SGD Term Finder tools are shown. SGD Term Finder reported multiple comparison corrected p- values < 0.05 are shown. P-values from SGD Slim Mapper were corrected with the Benjamini- Hochberg critical value. For SGD Slim Mapper, all significantly enriched categories with a uncorrected p-value < their Benjamini-Hochberg critical value (Q = 0.05) are shown. Gene Ontology Identification numbers (GOID) and names of the genes that represent the enrichment are included.