
Using CRISPR/Cas9 to Identify Gene Interactions with Hexosamine Biosynthesis and N-Glycan Remodeling Pathway Enzymes

by

Alexandra Chirila

A thesis submitted in conformity with the requirements for the degree of Master of Science, Laboratory Medicine and Pathobiology, University of Toronto

© Copyright by Alexandra Chirila 2019

Using CRISPR/Cas9 to Identify Gene Interactions with Hexosamine Biosynthesis and N-Glycan Remodeling Pathway Enzymes

Alexandra Chirila

Master of Science

Laboratory Medicine and Pathobiology University of Toronto

2019

Abstract

Genetic studies by classical mutagenesis and screening methods have revealed many molecular interactions and regulatory relationships in animal models. The hexosamine biosynthesis pathway and N-glycosylation are upregulated in most cancers and have been shown to play a role in many cancer cell phenotypes. In this thesis, a CRISPR/Cas9 genome-wide targeted mutagenesis approach was employed to identify gene interactions with chosen genes-of-interest from these two pathways: NAGK, GFPT1, MGAT1 and MGAT5. The gene interactions identified suggest relationships between our genes-of-interest and cell-cell adhesion, cytoskeleton, and folate and nucleotide metabolism. Further characterization of metabolite levels in the gene-of-interest knockout cells was done to help understand potential gene interactions from the screen. For example, metabolic imbalance in mutant cells likely indicates cell stress and reactive oxygen species, consistent with PRDX1, an antioxidant, being a suggested genetic interactor in multiple screens. These findings provide new insight into vulnerabilities and genomic redundancies in cancer cells.


Acknowledgments

I would first like to thank my supervisor, Dr. James Dennis, for the continuous mentorship and support throughout my master’s thesis project. He was always willing to provide guidance and share his immense knowledge, and was very patient and enthusiastic.

I would also like to thank Dr. Payman Tehrani, Dr. Michael Aregger, and Dr. Keith Lawson for much guidance and assistance throughout my thesis project.

I thank all members of the Dennis/Swallow lab for the discussions, support and assistance.

I would also like to thank my advisory committee: Dr. Linda Penn and Dr. Jason Moffat, for providing me with direction and expanding my knowledge on my project, and Dr. Irene Andrulis for sitting on my examination committee.


Contributions

The author performed all experiments described in this thesis with the following contributions:

Jason Moffat’s Lab: Conducted multiple HAP1 WT genome-wide CRISPR/Cas9 KO screens for comparison to the KO screen data (prior to my project start date).

Michael Aregger in Jason Moffat’s Lab: Conducted the HAP1 NAGK KO genome-wide CRISPR/Cas9 KO screen (prior to my project start date).

Aldis Krizus in Jim Dennis’s Lab: Generated the MDA-MB-231 NAGK, GNPNAT1, MGAT1 and MGAT5 KO cell lines, and also conducted the sample preparation for the MDA-MB-231 metabolomics experiment outlined in Figure 4.1.

Katie Chan in Jason Moffat’s Lab: Made the virus containing the Toronto KnockOut CRISPR Library - Version 3 (TKOv3) for all of the CRISPR/Cas9 KO screens.

Amy Tong in Jason Moffat’s Lab: Coordinated sequencing library preparations and managed sequence data for all of the CRISPR/Cas9 KO screens.

Max Billmann in Jason Moffat’s Lab: Conducted sequence analysis for all of the CRISPR/Cas9 KO screens up to generation of pi-scores and False Discovery Rates (FDRs).

Michael Parsons in the Lunenfeld-Tanenbaum Research Institute: Assisted with flow cytometry analysis.

Judy Pawling in Jim Dennis’s Lab: Conducted the metabolomics sample preparation beyond the flash freezing in liquid nitrogen, ran the samples through liquid chromatography-tandem mass spectrometry, and analyzed the raw data generating expression values normalized to cell number for each cell line. Judy also conducted the MDA-MB-231 in vivo tumor xenograft experiment in NOD-SCID mice outlined in Figure 1.1 with Karina Pacholczyk, and also generated Figures 3.12 and 3.13.

Karina Pacholczyk in Jim Dennis’s Lab: Conducted the MDA-MB-231 in vivo tumor xenograft experiment in NOD-SCID mice outlined in Figure 1.1 with Judy Pawling.


Table of Contents

Abstract ...... ii

Acknowledgments ...... iii

Contributions ...... iv

List of Tables ...... vii

List of Figures ...... viii

List of Appendices ...... x

List of Abbreviations ...... xi

Chapter 1 ...... 1

Introduction ...... 1

1.1 Protein N-Glycosylation as a Known-Vulnerability ...... 2

1.2 Hexosamine Biosynthesis Pathway ...... 7

1.2.1 Glycosylation ...... 9

1.3 Cancer Cell Metabolism ...... 13

1.3.1 Main Cancer Energy Sources ...... 15

1.3.2 Increased flux through HBP ...... 17

1.4 Gene Editing ...... 18

1.4.1 CRISPR/Cas9 ...... 20

1.5 Genetic Interactions ...... 21

1.5.1 Global gene interaction network ...... 23

1.5.2 Yeast versus ...... 24

1.6 Rationale ...... 25

Chapter 2 ...... 28

Materials and Methods ...... 28

2.1 Materials ...... 28

2.2 Methods ...... 33


Chapter 3 ...... 46

Results ...... 46

3.1 HAP1 CRISPR/Cas9 Screens ...... 46

3.2 Validation Competition Assay ...... 62

3.3 Metabolomics ...... 68

Chapter 4 ...... 76

Discussion ...... 76

4.1 HAP1 CRISPR/Cas9 Screens ...... 76

4.2 Validation Competition Assay ...... 78

4.3 Metabolomics ...... 80

4.4 Limitations ...... 86

4.5 Future Directions ...... 88

4.6 Conclusions ...... 89

References ...... 91

Appendices ...... 104


List of Tables

Table 2.1: PCR1 reaction mixture per tube. 35

Table 2.2: PCR2 reaction mixture. 37

Table 3.1: GO terms over-represented in all three GFPT1, MGAT1 and MGAT5 KO screens (g:SCS multiple testing correction method applying significance threshold of 0.05). 50

Table 3.2: Enriched GO terms in ordered GI lists and the common gene interactors associated with those GO terms. 52

Table 3.3: Seventeen GIs that have been chosen for validation and their functions. 63

Table 4.1: Number of publications linking each gene to cancer, conducted by searching [“gene name” AND cancer] in PubMed, and the incidence of somatic mutations found in these genes in cancer samples from the Catalogue of Somatic Mutations in Cancer (COSMIC). 79


List of Figures

Figure 1.1: Xenograft tumor growth of MDA-MB-231 (A) WT, (B) GNPNAT1 KO, (C) MGAT1 KO, and (D) MGAT5 KO cell lines. 6

Figure 1.2: The Hexosamine Biosynthesis Pathway (HBP). 8

Figure 1.3: Golgi N-Glycan Branching Pathway. 11

Figure 1.4: A synthetic lethal interaction. 22

Figure 2.1: Functional confirmation of MDA-MB-231 NAGK and GNPNAT1 KO cell lines. 29

Figure 2.2: Functional confirmation of MDA-MB-231 MGAT1 and MGAT5 KO cell lines. 30

Figure 2.3: Polymerase chain reaction 1 (PCR1) gel image for all three MDA-MB-231 CRISPR/Cas9 knockout (KO) screens. 36

Figure 2.4: Log2 fold change plot of essential and nonessential genes at the final time point of the HAP1 MGAT1 KO CRISPR/Cas9 screen relative to day 0 after puromycin selection. 39

Figure 2.5: Western blot membranes showing protein expression of Cas9 and γ-tubulin in our Cas9 stable cell lines. 41

Figure 3.1: Score plots for all four HAP1 gene-of-interest CRISPR/Cas9 KO screens. 47

Figure 3.2: Manhattan Plots of over-expressed GO terms from multiquery analysis of GI lists (FDR<0.05) from all four gene-of-interest KO screens. 49

Figure 3.3: Negative genetic interaction map (FDR<0.05, pi-score<-1). 54

Figure 3.4: Positive genetic interaction map (FDR<0.05, pi-score>1). 55

Figure 3.5: Score plot of the NAGK KO CRISPR/Cas9 screen. 58


Figure 3.6: Score plot of the GFPT1 KO CRISPR/Cas9 screen. 59

Figure 3.7: Score plot of the MGAT1 KO CRISPR/Cas9 screen. 60

Figure 3.8: Score plot of the MGAT5 KO CRISPR/Cas9 screen. 61

Figure 3.9: Flow cytometry analysis of the cell populations at the initial time point of the HAP1 WT competition assay with the AAVS1 negative control gRNA. 66

Figure 3.10: Heat map showing relative levels of 88 metabolites across the HAP1 WT and gene-of-interest KO cell lines (n=6). 69

Figure 3.11: Average (A) GlcNAc and (B) UDP-GlcNAc levels, normalized to cell counts, in the HAP1 WT and gene-of-interest KO cell lines. 71

Figure 3.12: Relative levels of (A) amino acids, (B) nucleotides, and (C) metabolites involved in Glycolysis, the Pentose Phosphate Pathway (PPP), and the Citric acid cycle in the HAP1 MGAT1 KO cell line, normalized to the average level of each in the HAP1 WT cell line. 73

Figure 3.13: Relative nucleotide levels in the HAP1 GFPT1 KO cell line, normalized to the average level of each in the HAP1 WT cell line. 75

Figure 4.1: Relative levels of metabolites that reveal a significant difference across one or more of the MDA-MB-231 WT and each gene-of-interest KO cell line (n=9). 83


List of Appendices

Appendix 1: pSTV6-PGK-P2R N-mCherry backbone vector map. 104

Appendix 2: Primers used for PCR1 and PCR2 in the genome-wide CRISPR/Cas9 KO screens. 105

Appendix 3: gRNAs chosen for validation and the forward and reverse oligos required for proper ligation of each gRNA into the pLCKO backbone vector. 109

Appendix 4: Sample pi-score calculation, shown using data from SLC16A1 gRNAs inducing a second KO in the HAP1 NAGK KO cell line. 110

Appendix 5: List of 88 metabolites that showed a significant difference in their abundance between one or more of the following cell lines: HAP1 WT, HAP1 NAGK KO, HAP1 GFPT1 KO, HAP1 MGAT1 KO and HAP1 MGAT5 KO. 111

Appendix 6: Unsupervised heat map of overall metabolite levels in the HAP1 WT and gene-of-interest KO cell lines. 112


List of Abbreviations

AAVS1 Adeno-Associated Virus Integration Site 1
ADP Adenosine diphosphate
Akt Protein Kinase B
AMP Adenosine monophosphate
ATIC 5-Aminoimidazole-4-Carboxamide Ribonucleotide Formyltransferase/IMP Cyclohydrolase
ATP Adenosine triphosphate
BAGEL Bayesian Analysis of Gene EssentiaLity
bp Base pairs
BP Biological Process
CAC Citric Acid Cycle
Cas9 CRISPR associated protein 9
CC Cellular Component
CCNF Cyclin F
CDP Cytidine diphosphate
CHD2 Chromodomain Helicase DNA Binding Protein 2
CMP Cytidine monophosphate
CORUM CORUM protein complexes
COSMIC Catalogue of Somatic Mutations in Cancer
CRISPR Clustered Regularly Interspaced Short Palindromic Repeats
CTNNA1 Catenin Alpha 1
CTP Cytidine triphosphate
DHAP Dihydroxyacetone phosphate
DHFR Dihydrofolate Reductase
DMEM Dulbecco’s Modification Eagle’s Medium 1x
DNA Deoxyribonucleic acid
D-PBS Dulbecco’s Phosphate Buffered Saline 1x
EGFP Enhanced Green Fluorescent Protein
EPC2 Enhancer Of Polycomb Homolog 2
ERK Extracellular-Signal-Regulated Kinase
FBS Fetal Bovine Serum
FDR False Discovery Rate
FTCD Formimidoyltransferase Cyclodeaminase
GALE UDP-galactose-4-epimerase
GalNAc N-acetylgalactosamine
GART Glycinamide Ribonucleotide Transformylase
GDP Guanosine diphosphate
GFPT1/2 Glutamine--Fructose-6-Phosphate Transaminase 1/2
GI Genetic Interaction
GlcNAc N-acetylglucosamine
GMP Guanosine monophosphate
GNPNAT1 Glucosamine-Phosphate N-Acetyltransferase 1
GO Gene Ontology
gRNA Guide RNA
GTP Guanosine triphosphate
HAS Hyaluronan Synthase
HBP Hexosamine Biosynthesis Pathway
HER2 Human Epidermal Growth Factor Receptor 2
HexNAc N-acetylhexosamine
HP Human Phenotype Ontology
HPA Human Protein Atlas
IgG Immunoglobulin G
indel insertion/deletion
INPPL1 Inositol Polyphosphate Phosphatase Like 1
INT Intensity
IPP Isopentenyl-diphosphate
kDa Kilodaltons
KEGG Kyoto Encyclopedia of Genes and Genomes
KG Ketoglutarate
KO Knockout
LC-MS/MS Liquid Chromatography-tandem Mass Spectrometry
LLOD Lower Limit of Detection
MANII Mannosidase II
MF Molecular Function
MGAT1 Alpha-1,3-Mannosylglycoprotein 2-Beta-N-Acetylglucosaminyltransferase
MGAT2 Alpha-1,6-Mannosylglycoprotein 2-Beta-N-Acetylglucosaminyltransferase
MGAT4 Alpha-1,3-Mannosylglycoprotein 4-Beta-N-Acetylglucosaminyltransferase
MGAT5 Alpha-1,6-Mannosylglycoprotein 6-Beta-N-Acetylglucosaminyltransferase
MIRNA miRTarBase
MLLT4 Mixed-Lineage Leukemia; Translocated To, 4
MOI Multiplicity of Infection
MTHFD1 Methylenetetrahydrofolate Dehydrogenase, Cyclohydrolase And Formyltetrahydrofolate Synthetase 1
NAD+ Nicotinamide adenine dinucleotide (-H is reduced form)
NADP Nicotinamide adenine dinucleotide phosphate (-H is reduced form)
NAGK N-acetylglucosamine Kinase
NF2 Neurofibromin 2
OAZ1 Ornithine Decarboxylase Antizyme 1
PCR Polymerase Chain Reaction
PDGFR Platelet-Derived Growth Factor Receptor
PEP Phosphoenolpyruvate
PLK1 Polo Like Kinase 1
PNK Polynucleotide Kinase
PPAT Phosphoribosyl Pyrophosphate Amidotransferase
PPP Pentose Phosphate Pathway
PRDX1 Peroxiredoxin 1
PSMD1 Proteasome 26S Subunit, Non-ATPase 1
PTEN Phosphatase and Tensin Homolog
PVDF Polyvinylidene difluoride
REAC Reactome
RNAi RNA Interference
rpm Rotations per minute
rSAP Shrimp Alkaline Phosphatase
SLC16A1 Solute Carrier Family 16 Member 1
SLC35A3 Solute Carrier Family 35 Member A3
TAX1BP3 Tax1 Binding Protein 3
TF Transfac (transcription factor)
TGF-β Transforming Growth Factor-Beta
TKOv1/3 Toronto KnockOut (TKO) CRISPR Library - Version 1/3
UDP Uridine diphosphate
UMP Uridine monophosphate
UTP Uridine triphosphate
WP WikiPathways
WT Wild type

Chapter 1

Introduction

Cancer cells harness an exceptional capacity to adapt, either by mutation or activation of alternate pathways, making cancer a rapidly moving target for therapy. With the current diagnostic tools and treatments available, approximately half of patients diagnosed with cancer will die from the disease.1 While there have been many improvements in the treatment of early-stage cancers, treatment success remains minimal in disseminated cancers. Moreover, combinations of currently used drugs are largely ineffective long-term, suggesting a broader approach is needed to find promising new interacting targets and therapeutic agents.2 Thus, there is an urgent need to better understand the complexity of the cancer cell, particularly where there is pathway crosstalk and where redundancy within the genome lies.

Cancer research has shifted towards the discovery of interacting gene targets, where there is potential for new combination therapies that can block tumor cell adaptation, sensitize cells to an agent, or circumvent chemotherapeutic toxicities. An example of this is shown in Kanarek et al. 2018, whereby Formimidoyltransferase Cyclodeaminase (FTCD), an enzyme required for histidine catabolism, was found to be linked to cancer cell sensitivity to methotrexate, a widely used chemotherapeutic agent. Identifying this relationship revealed that histidine supplementation during methotrexate treatment sensitized leukemia xenografts to methotrexate, decreasing the required dosage and therefore the toxicities commonly associated with this agent.3

Gene interaction (GI) studies in Saccharomyces cerevisiae suggest there are many more GIs to be discovered in mammalian cells than are currently known, with a subset that may be selective for cancer cells.4 CRISPR/Cas9 gene editing on a genome-wide scale is now being used to reveal synergistic GIs in mammalian cells.5–8 Synergistic negative GIs denote a situation where loss-of-function mutations in two genes together suppress growth, but either alone has little effect.9 By starting with a known-vulnerability in cancer, such as re-wired metabolism,10–12 immune recognition,13 or sensitivity to certain drugs,3,14 GIs may provide an advantage in the search for and validation of new therapies. Herein we focus on Asn (N)-glycosylation of proteins as a critical vulnerability in cancer progression, as described below.
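The definition of a synergistic negative GI given above can be illustrated with a minimal sketch. It assumes a multiplicative null model, in which the expected fitness of a double mutant is the product of the two single-mutant fitness values. That model, the function name `classify_interaction`, and the tolerance `tol` are illustrative assumptions only; they are not the pi-score calculation used for the screens in this thesis.

```python
def classify_interaction(f_a, f_b, f_ab, tol=0.1):
    """Classify a gene pair from relative fitness values (wild type = 1.0).

    f_a, f_b : fitness of each single loss-of-function mutant
    f_ab     : observed fitness of the double mutant
    tol      : hypothetical tolerance below which a deviation is ignored

    Assumption: under a multiplicative null model, two non-interacting
    genes give an expected double-mutant fitness of f_a * f_b.
    """
    expected = f_a * f_b
    deviation = f_ab - expected
    if deviation < -tol:
        return "negative"  # double mutant grows worse than expected
    if deviation > tol:
        return "positive"  # double mutant grows better than expected
    return "none"


# Each single knockout has little effect, but the pair suppresses growth.
print(classify_interaction(0.95, 0.90, 0.20))
```

In this hypothetical example each single mutant retains most of its fitness while the double mutant does not, which is the pattern described for a synergistic negative GI.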

1.1 Protein N-Glycosylation as a Known-Vulnerability

Cancer cells exhibit increased flux through the hexosamine biosynthesis pathway (HBP), which utilizes glycolytic intermediates to generate uridine diphosphate N-acetylglucosamine (UDP-GlcNAc).15 This increase in UDP-GlcNAc, the substrate for N-glycan branching, results in a subsequent increase in N-glycosylation,16 which has been shown to be directly linked to metabolism. Overexpression of N-glycan branching enzymes, including MGAT1 and MGAT5, along with HBP stimulation to generate more UDP-GlcNAc, revealed an increase in metabolite levels, as well as oxidative and lactate metabolism in an additive manner in primary cell culture experiments.17 This was also tested in glucose and glutamine limiting conditions, both being nutrients cancer cells depend on. Further, overexpression of MGAT5 alone revealed increased uptake of amino acids, glycolytic and TCA cycle intermediates, and most importantly glutamine.17 These findings link HBP and N-glycosylation to metabolism, indicating that these pathways are capable of regulating conditions critical to cancer cell survival.

Increased GlcNAc in cancer appears to play a regulatory role in cancer cell growth and survival by increasing glycan density on the cell surface, and therefore galectin binding.18 Galectins are secreted proteins found in the extracellular space, which bind to one another to form complexes, forming the galectin lattice.19,20 N-glycans on proteins are the primary ligand of galectins, therefore the abundance, distribution and structure of glycoproteins on the cell surface play a role in galectin binding. These ligands also appear to regulate T-cell proliferation and apoptosis, by restricting the recruitment of T-cell receptors at the site of antigen presentation. This suggests a link between N-glycosylation and auto-immune disorders, and N-glycosylation may also play a role in immune surveillance during cancer development.21 Galectin interactions are critically important to the function of surface glycoproteins as they inhibit mobility of glycoproteins, forming a physical barrier against glycoprotein dimerization, and they also promote cell surface residency by opposing loss via endocytosis.16

Increased galectin binding promotes cell surface residency of many receptors and transporters that are highly implicated in cancer, for example epidermal growth factor receptors (EGFR), thereby promoting EGFR signaling. N-glycosylation also appears to have a direct effect on EGFR signaling, by stabilizing the growth factor binding site, favouring stronger ligand interactions, and also helping maintain the EGFR dimeric interface.22 An activating cancer mutation in EGFR that deletes part of the extracellular domain also deletes two glycosylation sites. When glycosylated, these sites prevent receptor dimerization due to galectin lattice binding, thereby inhibiting lateral movement within the membrane. Therefore loss of these glycosylation sites increases dimerization, resulting in increased EGFR signaling, thereby promoting tumor cell growth.16 This reveals the complex nature of N-glycosylation, particularly in the context of cancer, as it is not simply an increase or decrease in cell surface glycans that promotes carcinogenesis, but their number and spatial location. Depending on the expression and pattern of N-glycosylation on the surface of the cell, it can result in opposing effects, as revealed in the EGFR example. N-glycosylation of proteins is also a very dynamic process, allowing for rapid adaptive responses to stressors that are present during tumorigenesis, thereby promoting cancer cell survival.15

Previous studies have revealed that knockdown or knockout models of enzymes of the N-glycan branching pathway inhibit both tumor growth and initiation in in vivo cancer mouse models. Zavareh et al. 2012 developed a knockdown model of MGAT1 using shRNA and looked at its effect in two different cancer cell lines, HeLa cervical cancer cells and PC-3 prostate cancer cells. In vitro studies with HeLa cells revealed that the MGAT1 knockdown did not affect cell growth, but inhibited invasion and migration. Conversely, in prostate cancer cells, the MGAT1 knockdown inhibited both primary tumor growth and incidence of lung metastases in orthotopic xenograft SCID mouse models.23 Further, Granovsky et al. 2000 generated Mgat5-/- mice, and induced mammary tumors using a polyomavirus middle T oncoprotein (PyMT). A significant decrease in tumor growth as well as lung metastases was also observed in these PyMT Mgat5-/- mice relative to the PyMT WT mice. Delayed and slower growth of the mammary tumors was observed in ~95% of the tumors, whereas after a latent period ~5% of tumors adapted or escaped the slow growing phenotype, allowing rapid tumor growth. This indicates that MGAT1 and MGAT5 play a role in both tumor initiation and growth.24

A similar negative growth effect was observed using other Mgat5-/- mouse models, whereby tumorigenesis was driven by increased human epidermal growth factor receptor 2 (HER2) expression or loss of phosphatase and tensin homolog (PTEN) expression.25,26 Cell lines generated from PyMT Mgat5-/- mammary tumors revealed that loss of Mgat5 desensitizes the cells to signaling of certain cytokines. This is due to loss in surface expression of EGFR, platelet-derived growth factor receptor (PDGFR), and transforming growth factor beta (TGF-β) receptor 1, in these cells.27 Interestingly, when the Mgat5 WT phenotype was restored, rescue of EGF and TGF-β signaling was also observed. Rescue was also possible with increased HBP flux via GlcNAc supplementation,28 and when endocytosis was inhibited, thereby retaining cell surface expression of receptors present.27 This emphasizes the importance of N-glycosylation on the expression of many cell surface receptors, particularly cytokine transporters, which can have an important effect in tumors. These PyMT Mgat5-/- mammary tumor cell lines also revealed upregulation of glycolysis and oxidative phosphorylation, which increases reactive oxygen species levels in the cell and plays a role in protein kinase B/extracellular-signal-regulated kinase (Akt/ERK) signaling by inhibiting PTEN and other phosphatases.29

Recent work in our lab has shown a similar effect on mammary tumor growth using tumor xenografts in NOD-SCID mice. MGAT1 KO and MGAT5 KO MDA-MB-231 breast cancer cells were injected subcutaneously into the flank of NOD-SCID mice, and tumor growth was monitored. Figure 1.1 shows decreased tumor growth of xenograft MGAT1 and MGAT5 KO cells relative to MDA-MB-231 WT cells. The MGAT1 KO appears to be the most growth limiting, as the tumors showed very minimal growth throughout the 120-day experiment, whereas the MGAT5 KO showed a latency period greater than 80 days. This is far longer than the approximate 20-day latency period in the MDA-MB-231 WT cells. The experiment was also conducted with MDA-MB-231 cells lacking GNPNAT1, an enzyme in the de novo hexosamine biosynthesis pathway, which generates substrates required for N-glycosylation. A latency period of ~60 days was observed in the GNPNAT1 KO tumors. After sacrificing the mice, all tumors that appeared to adapt to the KOs were analyzed by polymerase chain reaction (PCR), to confirm that the gene mutations still remained. All tumors maintained their KO mutations, indicating that the tumors that adapted did so through an alternate mechanism. This reveals that tumors have an ability to adapt to KOs of enzymes in HBP and N-glycan remodeling, indicating a need to better understand how these pathways interact with other pathways in the cell, to ultimately identify a second target.


Figure 1.1: Xenograft tumor growth of MDA-MB-231 (A) WT, (B) GNPNAT1 KO, (C) MGAT1 KO, and (D) MGAT5 KO cell lines. The MDA-MB-231 cells were injected subcutaneously into NOD-SCID mice and tumor volumes were measured using calipers. The WT cells appear to show a latency period of ~20 days before tumor growth accelerated, whereas the GNPNAT1 KO and MGAT5 KO show a ~60 day and >80 day latency period respectively. The MGAT1 KO cells did not lead to rapid tumor growth within the 120-day experimental period. GNPNAT1, Glucosamine-Phosphate N-Acetyltransferase 1; KO, knockout; MGAT1, Alpha-1,3-Mannosylglycoprotein 2-Beta-N-Acetylglucosaminyltransferase; MGAT5, Alpha-1,6-Mannosylglycoprotein 6-Beta-N-Acetylglucosaminyltransferase; WT, wild type.


1.2 Hexosamine Biosynthesis Pathway

The hexosamine biosynthesis pathway (HBP) is the de novo pathway that generates UDP-GlcNAc from glucose (Figure 1.2).15 The first two steps of this pathway are shared with glycolysis, converting glucose to fructose-6-phosphate (F6P). At this point, Glutamine--Fructose-6-Phosphate Transaminase 1/2 (GFPT1/2) catalyzes the first-committed and rate limiting step of HBP, transferring an amino group from glutamine to F6P to generate Glucosamine-6-phosphate (GlcN-6P). The GFPT1 and GFPT2 isoforms reveal different tissue distribution patterns, with high levels of GFPT2 found primarily in neural tissue, and GFPT1 in many other tissues, with highest expression in the pancreas, placenta, and testis.30 Glucosamine-Phosphate N-Acetyltransferase 1 (GNPNAT1) then adds an acetyl group from acetyl-CoA to GlcN-6P to generate GlcNAc-6-phosphate (GlcNAc-6P), which is eventually converted to UDP-GlcNAc (vertical pathway in Figure 1.2, enzymes in blue).31,32

Although UDP-GlcNAc is primarily generated through HBP, there is also a salvage pathway that utilizes cytosolic GlcNAc, which gets phosphorylated by N-acetylglucosamine Kinase (NAGK) to yield GlcNAc-6P (Figure 1.2). Cytosolic GlcNAc for the salvage pathway typically comes from lysosomes, following degradation of oligosaccharides, as well as from transport of extracellular GlcNAc into the cell. Studies in various cell types have shown that following GlcNAc supplementation a significant increase in intracellular UDP-GlcNAc is observed.32 This suggests that the salvage pathway acts to supplement the de novo pathway, as opposed to acting strictly as a salvage pathway that is activated only when needed by the cell.

UDP-GlcNAc is used in multiple processes, such as glycosylation, which includes the previously mentioned N-glycosylation, as well as generating hyaluronan.


Figure 1.2: The Hexosamine Biosynthesis Pathway (HBP). -1P, 1-phosphate; -6P, 6-phosphate; F6P, Fructose 6-phosphate; GALE, UDP-galactose-4-epimerase; GalNAc, N-acetylgalactosamine; GFPT1,2, Glutamine--Fructose-6-Phosphate Transaminase 1,2; GlcN, glucosamine; GlcNAc, N-acetylglucosamine; Gln, glutamine; GNPDA1,2, Glucosamine-6-Phosphate Deaminase 1,2; GNPNAT1, Glucosamine-Phosphate N-Acetyltransferase 1; NAGK, N-acetylglucosamine Kinase; PGM3, Phosphoglucomutase 3; UAP1, UDP-N-Acetylglucosamine Pyrophosphorylase 1; UDP, Uridine diphosphate; UTP, Uridine triphosphate.


1.2.1 Glycosylation

Glycosylation is a common protein modification that is crucial for the proper folding, stability, translocation and function of many proteins.33 Glycosylation is the addition of glycans onto specific amino acid residues of proteins, occurring in the lumen of the Golgi and endoplasmic reticulum (ER).34 Since the nucleotide sugar substrates (e.g., UDP-GlcNAc) required for glycosylation are generated in the cytoplasm, transport into the Golgi and ER is required. The transporters for this purpose belong to the SLC35 family of transporters, and this compartmentalization of steps allows glycosylation to be a highly regulated and efficient process.34 The most common types of glycosylation are N- and O-linked glycosylation, whereby glycans are attached to an asparagine residue (N-glycosylation) or a serine or threonine residue (O-glycosylation) of a protein.35 Glycans are known to be the most complex and abundant groups of molecules in living organisms, thereby introducing significant diversity to proteins.36 While it was previously believed that glycosylation was a protein modification only occurring in eukaryotes, it has more recently been observed in all three domains of life.35

Glycosylation is important, as many membrane receptors and cytokine transporters are co- or post-translationally modified with N-linked or O-linked glycans.16 Neither glycan structure nor site specificity is encoded in the genome.37 Therefore, for each protein that gets glycosylated, its pattern can vary based on response to environmental cues, cell type, and availability of substrates or enzymes, providing a powerful mechanism for introducing diversity and complexity.37,38


1.2.1.1 N-Glycosylation

N-glycosylation is the most common protein modification in eukaryotes, modifying two-thirds of all proteins. N-glycosylation primarily occurs as a co-translational modification, making it critically important in proper protein folding and stability. In instances where co-translational modification is missed, it can occur post-translationally.39

The N-linked glycans used in N-glycosylation are modified in the Golgi of the cell, via the N-glycan branching pathway.17 The N-glycan branching pathway is a linear pathway that utilizes UDP-GlcNAc as a substrate for adding GlcNAc residues onto glycoproteins (Figure 1.3). The various MGAT enzymes function to add GlcNAc onto mannose residues of glycoproteins, generating mono-, bi-, tri- then tetra-antennary N-glycans. N-glycans can obtain any of these structures, as this pathway does not always go to completion. These GlcNAc residues also get further modified, first typically with a galactose and/or fucose residue, and the galactose residues can then be elongated even further, for example with sialic acid, N-acetylgalactosamine, or sulfate.37 This allows production of a wide variety of glycans for protein modification.17


Figure 1.3: Golgi N-Glycan Branching Pathway. GlcNAc, N-acetylglucosamine; MANII, Mannosidase II; MGAT1, Alpha-1,3-Mannosylglycoprotein 2-Beta-N-Acetylglucosaminyltransferase; MGAT2, Alpha-1,6-Mannosyl-Glycoprotein 2-Beta-N-Acetylglucosaminyltransferase; MGAT4, Alpha-1,3-Mannosyl-Glycoprotein 4-Beta-N-Acetylglucosaminyltransferase; MGAT5, Alpha-1,6-Mannosylglycoprotein 6-Beta-N-Acetylglucosaminyltransferase; SLC35A3, Solute Carrier Family 35 Member A3.


1.2.1.2 Other pathways requiring UDP-N-Acetylhexosamine (HexNAc): O-Glycosylation and Hyaluronan Synthesis

UDP-GlcNAc is interconvertible with UDP-N-Acetylgalactosamine (UDP-GalNAc) through the action of UDP-galactose-4-epimerase (GALE), shown in Figure 1.2, page 8.40 GlcNAc and GalNAc can be categorized as N-Acetylhexosamines (HexNAc), and are maintained at an approximate 3:1 ratio in the cell, respectively. Only UDP-GlcNAc is known to function in N-glycosylation, but both UDP-HexNAcs are important in O-glycosylation and the synthesis of hyaluronan.40,41

O-glycosylation is a post-translational modification that can occur in the ER, Golgi, nucleus or cytoplasm, whereby various sugars can be added to the hydroxyl group of a serine or threonine residue of a protein.42,43 There are two types of O-glycosylation that use UDP-HexNAc residues: mucin-type O-glycosylation, which uses UDP-GalNAc, and O-GlcNAcylation.41 Membrane and secreted proteins are modified in the Golgi and ER, which is where mucin-type O-glycosylation occurs.41,42 O-GlcNAcylation is a type of O-glycosylation that occurs in the nucleus and cytoplasm, using UDP-GlcNAc as the substrate for O-linked N-acetylglucosamine transferase (OGT) to post-translationally modify intracellular proteins with GlcNAc.31,44 O-GlcNAcylation appears to increase in oxidative stress, and is required for early-stage development in mammals.45,46

Hyaluronan (HA) is a glycosaminoglycan that is synthesized at the plasma membrane via transmembrane HA synthase enzymes (HAS1-3).47 Its synthesis requires cytosolic UDP-GlcNAc and UDP-glucuronic acid, which form a disaccharide that is then used to build a repetitive chain extending into the extracellular space.48 UDP-HexNAc levels in the cell negatively regulate expression of HAS2, thereby regulating HA synthesis.40 HA is a very important component of the extracellular microenvironment, mainly in generating and providing structure to these spaces.47,49 HA also has many other physiological functions, including roles in cell migration and proliferation, wound healing, and tissue hydration.50 Production of HA is tightly controlled in the cell, but can be hijacked in some cancers, promoting and maintaining the malignant phenotype.51

1.3 Cancer Cell Metabolism

Cancer is characterized by aberrant growth of cells, resulting in increased energy demands to sustain this uncontrolled proliferation. In response to these increased needs, cancer cell metabolism shifts to being dominated by aerobic glycolysis, whereby the cells preferentially metabolize glucose via glycolysis over oxidative phosphorylation, even in the presence of oxygen.52 This was first noted in the 1920s by Otto Warburg, and termed the Warburg effect, but has only recently been studied more extensively.53

Warburg observed that cancer cells took up great amounts of glucose from the environment compared to surrounding tissue, producing a significant amount of lactate as a result.54 In normal oxygen-rich conditions, cells do not metabolize glucose into lactate, and only resort to doing so in hypoxic conditions. The massive increase of glucose uptake by these cells is facilitated by the overexpression of several glucose membrane transporters (GLUTs), primarily GLUT1.52,55 Increased glucose transport into the cytoplasm increases its availability as a substrate for glycolysis, promoting glycolytic flux.

This feature of cancer cells has been harnessed for diagnostic purposes, using positron emission tomography (PET) and an analog of glucose (18F-fluorodeoxyglucose, FDG). FDG is transported into the cell by the membrane glucose transporters and undergoes the first enzymatic reaction of glycolysis, phosphorylation by hexokinase. At this point it becomes FDG-6-phosphate, which cannot be further metabolized by glycolysis and becomes trapped inside the cell. This localized accumulation of FDG-6-phosphate can be detected via PET, which has made it a very useful and powerful diagnostic tool.4

Eukaryotic cells have evolved to favour aerobic respiration as the main form of cellular metabolism. Compared to anaerobic energy metabolism, oxidative phosphorylation is a far more efficient process, yielding 18 times more ATP per glucose molecule.56 Under normal conditions, oxidative phosphorylation produces more than 80% of the ATP required by a cell, and only the remainder is produced using anaerobic catabolic pathways, such as glycolysis.57 Interestingly, aerobic glycolysis also appears to be favoured in other rapidly dividing tissues, such as activated T lymphocytes and some embryonic tissues, regardless of oxygen availability.52 Although not obvious at first, aerobic glycolysis has advantages, primarily that increased glycolysis results in an increased abundance of glycolytic intermediates, which can feed into many other pathways. This is particularly important because these intermediates feed into biosynthetic pathways, many of which generate macromolecules, the critical building blocks for cell division.52 Additionally, the large amounts of lactate produced are taken up by neighbouring cells as a primary energy source, feeding into the citric acid cycle.15,52

Although energy metabolism is far less efficient in the absence of oxygen, hypoxic conditions carry energetic benefits of their own. While the yield of ATP from glucose is far greater in the presence of oxygen, generating the macromolecules and other biomass required by a cell costs thirteen times more energy in the presence of oxygen than in low-oxygen conditions.58 This indicates that anaerobic cellular states may be more beneficial than previously believed, something tumors may exploit, as some microenvironments, particularly in the center of tumor masses, experience hypoxia.

1.3.1 Main Cancer Energy Sources

The three main energy sources for cells are carbohydrates, proteins and fats. One primary molecule from each of these sources is preferentially used by cancer cells: glucose, glutamine and acetyl-CoA.

Increased dependence on glucose is one of the major hallmarks of cancer, primarily due to the Warburg effect previously mentioned. This is favourable to cancer cell survival, as it results in resistance to hypoxic conditions and local tissue immunosuppression, and favours proliferation.59

Many cancer cell lines are sensitive to glutamine starvation, revealing a dependency on extracellular glutamine in vitro, even though it is a non-essential amino acid.60,61 As a result of rapid cell growth, the glutamine demands of tumor cells outpace the supply, making glutamine effectively essential.62 This increased need is met by increased glutamine uptake from the extracellular environment or via macropinocytosis.63 Glutamine is the most abundant amino acid in blood plasma, at a concentration of 0.6-0.9 mmol/L, providing an accessible source to fuel the increased requirement by tumors.62,64

Glutamine is utilized in various ways in cancer cells. It can be catabolized via glutaminolysis, whereby glutamine is broken down into pyruvate, which can undergo fermentation or be converted into acetyl-CoA.65 Glutamine can also be converted into alpha-ketoglutarate via deamination, which subsequently enters the citric acid cycle, ultimately generating acetyl-CoA and oxaloacetate in the cytoplasm.66 Additionally, glutamine can be converted into glutathione, an antioxidant, facilitating control of the redox state of the cell.55 This is particularly important in cancer, as most cancer cells exhibit higher basal levels of reactive oxygen species (ROS) relative to healthy tissue.65,67 Studies have also shown that production of ROS can result in signaling that regulates processes such as proliferation and apoptosis.68

Relative to healthy tissue, cancer cells have a much higher demand for acetyl-CoA, which is generated by the citric acid cycle or via glutamine breakdown as outlined previously. This demand results from the increased activation of fatty acid synthesis often seen in tumors, which requires acetyl-CoA as a substrate. A critical enzyme in this synthetic pathway is fatty acid synthase (FASN), which is overexpressed in many cancers, favouring the anabolism of fatty acids. This supports the increased proliferative demand, as fatty acids are integral to the synthesis of cell membranes. FASN also plays a protective role against apoptosis, which in the context of cancer favours the initiation and maintenance of tumors.66

Acetyl-CoA can also feed into the mevalonate pathway, generating sterols and isoprenoids. The mevalonate pathway generates isopentenyl-diphosphate (IPP), and is the only intracellular IPP source in human cells.69 Between 18 and 20 IPP monomers join to make dolichol phosphate, which is elongated with GlcNAc and mannose residues to generate a dolichol-linked N-glycan precursor. This precursor is transferred co-translationally onto an asparagine residue of a protein in the ER, thereby initiating N-glycosylation.37 Therefore, acetyl-CoA plays a very important role in the initiation of N-glycosylation through dolichol, and in generating GlcNAc residues, which are important in N-glycan branching and further N-glycan modification.

1.3.2 Increased flux through HBP

Increased glucose uptake by cancer cells results in a subsequent increase in HBP flux.33,36 As a result, aberrant N- and O-linked glycosylation is observed, and has been shown to accelerate the progression of cancer. The resulting glycoconjugates appear to have regulatory roles in many tumorigenic processes, including invasion, metastasis, angiogenesis and proliferation.15,36

The three main cancer energy sources, glucose, glutamine and acetyl-CoA, are particularly important, as they are all substrates of HBP (Figure 1.2, page 8). This makes HBP a very sustainable process in cancer cells, as the metabolic adaptations that occur in tumor cells increase the availability of all three. Thus UDP-GlcNAc has been termed a nutrient sensor, as its levels depend on the availability of these three energy sources.32

Previous studies have shown the importance of HBP in the context of cancer. Ying et al. 2012 showed that shRNA knockdown of GFPT1, which catalyzes the first committed step of HBP, inhibited tumor initiation and growth both in vitro and in vivo. This was done using a Kras inducible system in Pancreatic Ductal Adenocarcinoma (PDAC) cells, and the effect was observed in nude mice following subcutaneous injection of the cells.70 Wellen et al. 2010 showed that mammalian cells depend on HBP for cell growth and survival. Their experiments revealed that glucose starvation decreased glutamine uptake, and in these conditions cell atrophy was observed. This phenotype was rescued with GlcNAc supplementation, which restores flux through HBP. This revealed that glucose metabolism through HBP is critical for glutamine uptake and for cell growth and survival, and therefore critical for tumor cell survival, as cancer cells have been shown to be dependent on glutamine.71

Additionally, previous work in our lab has shown that GNPNAT1 KO decreases tumor cell proliferation in vitro in MDA-MB-231 breast cancer cells. We have also seen significant decreases in tumor growth with these MDA-MB-231 GNPNAT1 KO cells relative to MDA-MB-231 WT cells in vivo, using tumor xenograft models in NOD-SCID mice, as outlined previously in Figure 1.1, page 6.

1.4 Gene Editing

With the completion of the Human Genome Project in 2003, the sequence of the entire human genome became available.72 This was a very exciting time, as it was believed that many of the unknowns regarding genes and proteins would be answered by knowing the DNA sequence. Since then, there has been a massive influx of sequencing information, including data from different cell types, disease states, and organisms. Despite the extensive genomic information known, predicting phenotypes from DNA sequences alone remains a challenge.

The most common way to understand the function of genes and proteins is to modify a gene and observe how that modification affects a cell or organism.73,74 Typically, this is done by altering the expression of a gene’s product, and studying the response to that change. For example, mutating a gene disrupts a cellular process, and studying that disruption reveals the function of the gene.74 To do this, mutations need to be introduced in a targeted manner. There are many targeted gene technologies, including RNA interference (RNAi), transcription activator-like effector nucleases (TALENs), zinc-finger nucleases (ZFNs), and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR associated protein 9 (Cas9).75,76 While these methods have all been successful in either knocking down or knocking out a gene’s expression, CRISPR/Cas9 has proven to be the superior method.77–79

Genome-wide loss-of-function screens are emerging as a way to study many disease processes. The current field of research is largely dominated by guided research, where only promising findings are further pursued. This approach limits the discovery of what is unknown, whereas genome-wide screening provides an opportunity to discover novel interactions and functions of genes. Previously, such screens were primarily conducted using RNAi,79 which uses exogenous double stranded RNA (dsRNA) that is complementary to the mRNA sequence it is targeting. This dsRNA is used as a template for the cleavage, and therefore degradation, of the transcript. As this method targets genes at the mRNA level, it can only result in knockdown of a gene as opposed to a complete knockout (KO), as it is impossible to degrade all mRNAs completely prior to their translation. Additionally, RNAi cannot silence mRNAs located within the nucleus, limiting its applicability.79 mRNA is a moving target with only a short time frame prior to its translation. Therefore, targeting a gene at the DNA level is the gold standard, allowing the possibility of a complete gene knockout.

Previously, TALENs and ZFNs had been used to conduct precise genome editing.77 These methods use customized DNA binding motifs, which provide specificity for targeted genome locations, and nuclease activity to make double strand breaks. Non-homologous end joining repairs these breaks, but it is a very error-prone process, often leading to insertion or deletion (indel) mutations. The indel ideally results in a frame-shift mutation, knocking out the function of that gene.80 Unfortunately, the highly repetitive nature of TALENs limits the cloning of genes, and the extensive DNA-protein contacts make the design of ZFNs very complex.81 CRISPR/Cas9 genome editing has overcome these technical barriers that limit the range of future applications for ZFNs and TALENs.77

1.4.1 CRISPR/Cas9

CRISPR/Cas9 gene editing technologies have evolved from the prokaryotic adaptive immune system, specifically the type II CRISPR system. Prokaryotes integrate fragments of foreign invading DNA, termed protospacers, into their genome, forming a CRISPR region. This region is transcribed, generating a pre-CRISPR RNA, which is further processed with the assistance of a trans-activating crRNA (tracrRNA) to generate a mature crRNA. These two RNA molecules interact to form a duplex; in engineered systems, the two are fused into a single guide RNA (sgRNA). The guide RNA directs Cas9 to a specific DNA site that is complementary to the protospacer sequence, where Cas9 then catalyzes double stranded cleavage of the DNA.82
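The sequence-matching step of this targeting mechanism can be illustrated with a minimal sketch. It only searches both DNA strands for exact matches to a guide sequence, and deliberately ignores every other requirement for Cas9 binding and cleavage. The function names and the toy 10-nucleotide guide are hypothetical, chosen for illustration, and are not taken from this work.

```python
# Toy illustration: locate exact matches to a guide sequence on either strand.
# The guide and "genome" below are invented; real guides are longer and real
# targeting has additional sequence requirements not modelled here.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_target_sites(genome: str, guide: str) -> list[tuple[int, str]]:
    """Return (start index, strand) for every exact match to the guide.

    A "+" hit matches the guide as written; a "-" hit matches its reverse
    complement, i.e. the guide sequence lies on the opposite strand.
    """
    sites = []
    for strand, query in (("+", guide), ("-", reverse_complement(guide))):
        start = genome.find(query)
        while start != -1:
            sites.append((start, strand))
            start = genome.find(query, start + 1)
    return sites


guide = "GACCTAGGCA"  # hypothetical guide sequence
genome = "AA" + guide + "TTTT" + reverse_complement(guide) + "AA"
sites = find_target_sites(genome, guide)
```

In this toy sequence the search finds one site on each strand, which is why guide design has to consider both orientations of the target locus.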

In response to a double stranded break, the cell undergoes either homology-directed repair (HDR), which is favoured but can only occur when a template is present, or non-homologous end joining (NHEJ) when no template is available.77,82 NHEJ is an error-prone DNA repair method that often leads to insertion or deletion mutations that disrupt the locus. Alternatively, a donor template that is homologous to the target site can be supplied, generating a precise mutation.82 Both methods lead to a mutation at the target locus, ideally knocking out the function of the gene.
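Whether an indel introduced by error-prone repair disrupts the reading frame depends only on its net length: a change that is not a multiple of three bases shifts every downstream codon. The short sketch below makes that rule explicit. The function name and the example indel sizes are hypothetical and are not data from this work.

```python
def is_frameshift(indel_length: int) -> bool:
    """Return True if an indel of this net length shifts the reading frame.

    indel_length is the net change in bases: positive for insertions,
    negative for deletions. Multiples of three preserve the frame.
    """
    if indel_length == 0:
        raise ValueError("an indel must change the sequence length")
    return indel_length % 3 != 0


# Hypothetical indel sizes, for illustration only.
example_indels = [+1, -2, -3, +6, -7]
frameshifts = [n for n in example_indels if is_frameshift(n)]
```

In-frame indels (here -3 and +6) add or remove whole codons and may leave a partly functional protein, which is one reason a single guide does not guarantee a knockout.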

CRISPR/Cas9 has made a profound impact on the field of genetics, as it is a gene-editing technology that has many benefits over those previously used. This system is very easy to engineer, easily scalable, affordable, and shows high sensitivity, specificity and efficacy.77,83

CRISPR/Cas9 is a particularly valuable technology, as it can be used to identify genetic interactions via genome-wide screening.4,79

1.5 Genetic Interactions

A genetic interaction (GI) is present when multiple gene mutations result in an unexpected phenotype relative to each individual mutant alone. There are many types of GIs, such as positive and negative GIs, which result in a fitness advantage or deficit, respectively, relative to each individual mutant.4 Synergistic interactions are particularly valuable, as the double KO magnifies the effect of each mutant alone, indicating that the two genes have an overlapping function. An example of a negative synergistic gene interaction is a synthetic lethal interaction, whereby each individual mutant maintains a viable phenotype, but the two mutations together result in a lethal phenotype (Figure 1.4). Although this is a rare phenomenon, it reveals two genes that work together to have an effect on the same essential function.84
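One way to make "unexpected phenotype" concrete is to compare the observed double-mutant fitness with an expectation built from the two single mutants. The sketch below assumes a multiplicative expectation, which is a common convention in the GI literature but is an assumption here rather than a definition given in this text. The function names, the tolerance, and the fitness values are all hypothetical.

```python
def gi_score(fitness_a: float, fitness_b: float, fitness_ab: float) -> float:
    """Observed double-mutant fitness minus the multiplicative expectation.

    Fitness values are relative to wild type (wild type = 1.0).
    """
    expected = fitness_a * fitness_b
    return fitness_ab - expected


def classify(score: float, tolerance: float = 0.05) -> str:
    """Label a score as a negative, positive, or absent interaction.

    The tolerance is an arbitrary illustrative cutoff, not a statistical test.
    """
    if score < -tolerance:
        return "negative"
    if score > tolerance:
        return "positive"
    return "none"


# Synthetic lethality: both single mutants are healthy, the double mutant dies.
synthetic_lethal = gi_score(1.0, 1.0, 0.0)
# Double mutant is fitter than expected from the single mutants.
suppression = gi_score(0.5, 0.5, 0.5)
# Double mutant matches the expectation, so no interaction is called.
independent = gi_score(0.8, 0.5, 0.4)
```

Under this convention a synthetic lethal pair receives the most negative score possible for two healthy single mutants, which matches the description of it as a negative synergistic interaction.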

While mutagenesis has been used to try to determine gene functions, mutating most genes in eukaryotes has very little effect.84 This is the result of considerable genome buffering, which makes most eukaryotic genes non-essential for survival. It highlights the importance of understanding how genes interact with one another, and where redundancy within the genome lies, as producing phenotypes is a very complex process.4 Gene-gene interaction studies have revealed that GIs tend to occur between functionally related genes, providing a method for phenotypically characterizing genes.4,84 Additionally, by combining gene-gene data with protein-protein, protein-gene or metabolic networks, the physical mechanisms of these gene products can be elucidated. Therefore, integrating knowledge on gene and protein interactions can help determine pathway function and organization.85

Figure 1.4: A synthetic lethal interaction. This is an example of a synergistic negative genetic interaction.

GIs have been studied on a genome-wide scale in Saccharomyces cerevisiae, which has revealed valuable insights on the information gene interactions can provide.

1.5.1 Global gene interaction network

Costanzo et al. 2016 generated all pairwise double KOs for most protein coding genes in the S. cerevisiae genome (~6000). This systematic approach revealed almost one million GIs, spanning ~90% of all genes.4 As over 23 million double mutants were generated using this method, this revealed a high prevalence of GI relationships. Many prior studies have looked at genetic interactions, and it was generally known that these interactions can provide functional information, but exactly how, and with what confidence, was not well understood. Identifying GIs on a global scale provided more information on the complex nature of these interactions, and what information they can provide.
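The prevalence implied by the two counts quoted above can be checked with a short calculation. Because the text gives the counts only as "almost one million" and "over 23 million", the rounded inputs below are approximations, and the result should be read as a rough estimate rather than a reported statistic. The function name is hypothetical.

```python
def interaction_density(n_interactions: float, n_double_mutants: float) -> float:
    """Fraction of tested double mutants that showed a genetic interaction."""
    if n_double_mutants <= 0:
        raise ValueError("number of double mutants must be positive")
    return n_interactions / n_double_mutants


# Rounded figures taken from the text: ~1 million GIs, ~23 million double mutants.
density = interaction_density(1_000_000, 23_000_000)
density_percent = round(density * 100, 1)
```

With these rounded inputs, roughly one tested gene pair in twenty-three shows an interaction, or a little over 4%.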

Costanzo et al. 2016 showed that negative interactions typically reveal functional relationships, identifying genes with pleiotropic functions, while positive interactions tend to identify regulatory relationships between genes, providing a better understanding of the mechanism of genetic resiliency or suppression. They also showed that essential genes were far more connected in the genetic interaction network, and therefore had more GIs, than non-essential genes, and that genes with similar gene interaction profiles tend to cluster with genes of similar function or that work within the same pathway.4,84 Knowing these trends in GIs is very valuable when analyzing GI data. While generating gene interaction profiles would require large-scale GI screening, the findings can be applied to even small-scale screens, providing a framework for interpreting GI data.
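The idea that genes with similar interaction profiles share function can be sketched by correlating profiles across a common set of partner genes. The example below uses Pearson correlation as the similarity measure, which is an assumption made for illustration, and the gene names and scores are invented rather than taken from any screen.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equal-length score profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical GI scores of three query genes against the same four partners.
profiles = {
    "geneA": [-0.8, 0.1, -0.5, 0.0],
    "geneB": [-0.7, 0.0, -0.6, 0.1],
    "geneC": [0.2, -0.6, 0.1, -0.4],
}

similarity_ab = pearson(profiles["geneA"], profiles["geneB"])
similarity_ac = pearson(profiles["geneA"], profiles["geneC"])
```

Here geneA and geneB have highly correlated profiles and would be grouped together, while geneC is anti-correlated with geneA and would fall into a different cluster.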

Genetic interactions appear to show much promise in bridging the gap between genomic information and obtaining functional information. This is important for functionally characterizing the entire human genome, which is very valuable for better understanding diseases. Understanding how genes function and interact in a healthy cell is important for appreciating the pathology that occurs in disease states, but genetic interaction screening also provides a means of characterizing genes and relationships directly in disease contexts, by conducting screens in disease cell models.

The field of genetic interactions is still new, with minimal large-scale data on the human genome. It will take a very collaborative approach between labs around the world to develop a global genetic interaction map for the human genome, especially in different disease states, but if achieved, will provide a high volume of functional information previously not known.

1.5.2 Yeast versus Human Genome

The findings of the genome-scale KO screening in yeast support this unexplored method for studying the human genome. This opens up new possibilities, but many caveats are introduced when translating such a method from the yeast to the human genome.

Firstly, the human genome is much larger than that of yeast, containing approximately 20,000 protein coding genes, whereas yeast have only approximately 6,000.86,87 Additionally, 96% of yeast genes do not contain introns, which were evolutionarily lost in simple eukaryotes in favour of more rapid replication.86 The presence of introns in genes introduces benefits, including the ability for alternative splicing, enhancing gene expression, and regulating mRNA transport and decay processes.88 However, introns also introduce new considerations for KO screening in the human genome, such as the presence and pattern of alternative splicing when designing gRNAs to target specific genes, as approximately 35% of human genes undergo alternative splicing.89

Additionally, yeast are single-celled organisms, providing a single genome to study. Therefore, when scaling the genome-wide method to the human genome, not only does the difference in gene number need to be taken into account, but the many different human cell types as well.

Each cell type expresses a certain subset of genes, complicating GI screening, as a gene can have an important functional role in one cell type, yet be silenced in another. Further, the multicellularity of human tissue involves many functions, such as cell-cell adhesion and signaling, as well as extracellular functions, which cannot be appreciated when studying the yeast genome.

While generating a global gene interaction network is the ideal end goal for the human genome, the human genome cannot be treated as a single entity. Each human cell type needs to be considered, given their functional differences, which will correspond to differences in their GI profiles. Further, understanding the role of GIs in disease states requires the use of appropriate cell models, introducing yet another consideration. Therefore, rather than characterizing the “human genome” as a whole, a more precise approach is required, one that focuses on a particular cell type and/or disease and on the relationships and crosstalk that are relevant in that particular context.

1.6 Rationale

The hexosamine biosynthesis and N-glycan pathways are very important metabolic and co-/post-translational modification pathways in the context of cancer. Therefore, I have chosen several enzymes of interest from these pathways to generate gene-of-interest KO cell lines for the discovery of genetic interactions in cancer cells. From HBP, I chose GFPT1, an enzyme that catalyzes the first committed step of the de novo pathway, and NAGK, which catalyzes the generation of GlcNAc-6P via the salvage pathway (Figure 1.2, page 8).32,90 As HBP has a de novo and a salvage pathway, I wanted to target both arms, allowing a more comprehensive view of HBP, which may shed light on the cell’s dependence on each redundant pathway, as well as relationships with other biological processes. NAGK was chosen as it is the only enzyme involved in the GlcNAc salvage pathway, and GFPT1 was chosen based on previous studies showing an anti-tumor response when knocking down GFPT1 function.70 The other two enzymes, MGAT1 and MGAT5, are members of the N-glycan branching pathway, and are the first and last enzymes in the linear pathway, respectively (Figure 1.3, page 11). N-glycosylation is known to be highly implicated in cancer, and these specific genes have shown cancer-promoting activities, with growth suppressed when they are knocked down or knocked out in various cancer models.23,24 Currently, these four genes have no known function beyond their roles in these pathways. Better understanding these genes in the entire context of the cell, and potentially identifying new functions of these genes, could be very valuable.

Initially our list of genes-of-interest included GNPNAT1, an enzyme in de novo HBP with a unique function (Figure 1.2, page 8). Unfortunately, HAP1 cells did not yield a GNPNAT1 mutation after several attempts by Horizon Discovery, the source for the HAP1 KO cell lines used in the CRISPR/Cas9 KO screens. However, as an alternative, mutants of GFPT1, catalyzing the first committed step of HBP, were generated in the HAP1 cells.90 The preparation of MDA-MB-231 KO cell lines and phenotyping in our lab proceeded with the GNPNAT1 KO, as well as mutants of NAGK, MGAT1 and MGAT5.

The purpose of the research described is to better understand the role of the hexosamine biosynthesis and N-glycan remodeling pathways and how they interact in the greater context of the cancer cell. We hypothesized that GIs revealed by CRISPR/Cas9 KO screens will reveal pathways which are dependent on N-glycosylation and critical for cancer cell growth, and that GIs, both unique and overlapping between gene-of-interest cell lines, will be observed. We have chosen HAP1 chronic myelogenous leukemia cells and MDA-MB-231 breast cancer cells as our cancer cell models, as we would like to find GIs and relationships that hold across different cancer types. This may provide preliminary information on new regulatory and functional roles of these genes.

I performed genome-wide CRISPR/Cas9 knockout screens to identify gene interactions with my four genes-of-interest, NAGK, GFPT1, MGAT1 and MGAT5, in a chronic myelogenous leukemia cell line, with the aim of validating them in vitro. The genetic interactions identified were analyzed bioinformatically and suggested relationships between these pathways and cell-cell adhesion, cytoskeletal components, and folate and nucleotide metabolism. Metabolomics was also conducted on my cell lines of interest, to determine whether unique metabolic signatures are associated with our gene-of-interest KOs. These data revealed that the MGAT1 KO appeared to be the most detrimental to the cells, consistent with the previous in vivo study in our lab, outlined in Figure 1.1 (page 6). This suggests that N-glycosylation is an important process for cancer cell growth and is likely mediated in part through interaction with other metabolic processes, a relationship previously unknown.

Chapter 2

Materials and Methods

2.1 Materials

Cell Lines: HAP1 WT, HAP1 NAGK KO, HAP1 GFPT1 KO, HAP1 MGAT1 KO, and HAP1 MGAT5 KO cell lines were a gift from Jason Moffat, University of Toronto, Toronto, Canada. The cell lines were originally purchased from Horizon Discovery (Waterbeach, Cambridge, UK). Hek293TN cells were obtained as a gift from Anne-Claude Gingras, University of Toronto, Toronto, Canada. The MDA-MB-231 WT cell line was obtained from ATCC, Manassas, VA, USA. MDA-MB-231 NAGK KO, MDA-MB-231 GNPNAT1 KO, MDA-MB-231 MGAT1 KO, and MDA-MB-231 MGAT5 KO cell lines were generated previously in our lab by Aldis Krizus using all-in-one CRISPR/Cas9 expression vectors.91 The MDA-MB-231 NAGK and MDA-MB-231 GNPNAT1 KOs were functionally confirmed by determining metabolite abundance using liquid chromatography-tandem mass spectrometry (LC-MS/MS) (Figure 2.1). The 100% increase in the levels of the substrates, GlcNAc and GlcN-6P, in the NAGK KO and GNPNAT1 KO respectively, and the significant decrease in production of UDP-GlcNAc in both, is the expected response to knocking out these genes. The MDA-MB-231 MGAT1 and MDA-MB-231 MGAT5 KOs were also functionally confirmed, by measuring the abundance of N-glycan branching using glycan mass spectrometry (Figure 2.2). The MGAT1 KO resulted in an almost complete loss of bi-, tri- and tetra-antennary N-glycans, and the MGAT5 KO shows a loss of the tetra-antennary N-glycans. This is expected, since MGAT1 is required for the first step of the linear N-glycan branching pathway that generates all mono-, bi-, tri-, and tetra-antennary N-glycans, and MGAT5 is required only for the final step, generating tetra-antennary N-glycans.
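The logic behind these expected glycan profiles can be sketched by treating the branching pathway as an ordered list of steps, where knocking out one enzyme removes its product and everything downstream. This is a deliberately simplified model: the assignment of MGAT2 and MGAT4 to the bi- and tri-antennary steps is an assumption based on the enzymes named in Figure 2.2, MANII is omitted, and the function name is hypothetical.

```python
# Simplified linear branching pathway as ordered (enzyme, product) steps.
# Intermediate assignments are assumptions for illustration only.
BRANCHING_PATHWAY = [
    ("MGAT1", "mono-antennary"),
    ("MGAT2", "bi-antennary"),
    ("MGAT4", "tri-antennary"),
    ("MGAT5", "tetra-antennary"),
]


def glycans_after_knockout(knocked_out: str) -> list[str]:
    """Return the branched products still expected after one enzyme is lost.

    In a strictly linear pathway, losing an enzyme blocks its own product
    and every product downstream of it. An enzyme that is not part of the
    pathway leaves all products intact.
    """
    products = []
    for enzyme, product in BRANCHING_PATHWAY:
        if enzyme == knocked_out:
            break
        products.append(product)
    return products


mgat1_ko = glycans_after_knockout("MGAT1")
mgat5_ko = glycans_after_knockout("MGAT5")
```

The model predicts no branched products for the MGAT1 KO and the loss of only the tetra-antennary form for the MGAT5 KO, which is the pattern described for Figure 2.2.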

Figure 2.1: Functional confirmation of MDA-MB-231 NAGK and GNPNAT1 KO cell lines. Measure of relative (A) N-acetylglucosamine (GlcNAc), (B) glucosamine-6-phosphate (GlcN-6P), and (C) uridine diphosphate N-acetylglucosamine (UDP-GlcNAc) levels in MDA-MB-231 cell lines. Accumulation of the substrate metabolite for each knocked out (KO) enzyme, together with decreased levels of UDP-GlcNAc in both KOs, confirms the success of the gene KO in the cell line (t-test, *p<0.000005 and **p<0.000001). -1P, 1-phosphate; -6P, 6-phosphate; F6P, Fructose 6-phosphate; GlcN, glucosamine; GlcNAc, N-acetylglucosamine; GFPT1,2, Glutamine-Fructose-6-Phosphate Transaminase 1,2; Gln, glutamine; GNPDA1,2, Glucosamine-6-Phosphate Deaminase 1,2; GNPNAT1, Glucosamine-Phosphate N-Acetyltransferase 1; NAGK, N-acetylglucosamine Kinase; PGM3, Phosphoglucomutase 3; UAP1, UDP-N-Acetylglucosamine Pyrophosphorylase 1; UDP, Uridine diphosphate; UTP, Uridine triphosphate.

Figure 2.2: Functional confirmation of MDA-MB-231 MGAT1 and MGAT5 KO cell lines. Abundance of the different amounts of N-acetylglucosamine (GlcNAc) branching observed in MDA-MB-231 cells containing various gene knockouts (KO). Depletion of more complex GlcNAc branching from products of Alpha-1,3-Mannosylglycoprotein 2-Beta-N-Acetylglucosaminyltransferase (MGAT1) or Alpha-1,6-Mannosylglycoprotein 6-Beta-N-Acetylglucosaminyltransferase (MGAT5) confirms the success of each respective KO cell line. GNPNAT1, Glucosamine-Phosphate N-Acetyltransferase 1; MANII, Mannosidase II; MGAT2, Alpha-1,6-Mannosyl-Glycoprotein 2-Beta-N-Acetylglucosaminyltransferase; MGAT4, Alpha-1,3-Mannosyl-Glycoprotein 4-Beta-N-Acetylglucosaminyltransferase; NAGK, N-acetylglucosamine Kinase; SLC35A3, Solute Carrier Family 35 Member A3.

Plasmid Vectors: Virus containing the Toronto human knockout pooled library (TKOv3) was provided as a gift from Jason Moffat, University of Toronto, Toronto, Canada. The pLCKO (AddGene #73311) and Lenti-Cas9-2A-Blast (AddGene #73310) vectors were also provided as a gift from Jason Moffat, University of Toronto, Toronto, Canada. The pMD2.G was obtained from AddGene (Plasmid #12259), Watertown, MA, USA. The pCMV-VSV-G (AddGene #8454), psPAX2 (AddGene #12260), and β-actin cloned pSTV6-PGK-P2R N-mCherry vector (Appendix 1) were provided as a gift from Anne-Claude Gingras, University of Toronto, Toronto, Canada.

Tissue Culture: Cells were cultured in DMEM (Multicell, Wisent Inc., St-Bruno, QC), containing 10% FBS (Multicell, Wisent Inc., St-Bruno, QC) and 1% Penicillin Streptomycin (Multicell, Wisent Inc., St-Bruno, QC). Virus production media was made using DMEM with 5% FBS and then heat inactivated (55°C for 20 minutes). D-PBS and 1X Trypsin were obtained from Multicell, Wisent Inc., St-Bruno, QC. Puromycin dihydrochloride and Doxycycline hyclate were obtained from Sigma-Aldrich, St. Louis, Missouri, USA; Blasticidin S hydrochloride from Multicell, Wisent Inc., St-Bruno, QC; Polybrene Infection/Transfection Reagent from EMD Millipore, Burlington, MA, USA; and jetPRIME Transfection Reagent from Polyplus-transfection, Illkirch, France.

gDNA Extraction and Polymerase Chain Reactions (PCRs): gDNA extraction was conducted using the Wizard Genomic DNA Purification Kit (Promega, Madison, Wisconsin, USA), with the addition of RNase A (Invitrogen, Thermo Fisher Scientific, Carlsbad, California, USA). The Qubit dsDNA Broad Range Assay Kit was obtained from Invitrogen, Thermo Fisher Scientific, Carlsbad, California, USA. PCR was conducted using NEBNext Ultra II Q5 Master Mix (New England Biolabs, Ipswich, Massachusetts, USA), and the primers used were obtained as a gift from Jason Moffat, University of Toronto, Toronto, Canada (sequences in Appendix 2).

Antibodies: Primary antibodies used for western blotting were anti-Cas9 mouse monoclonal IgG (Santa Cruz Biotechnology, Dallas, Texas, USA) and anti-γ-tubulin mouse monoclonal IgG (Sigma-Aldrich, St. Louis, Missouri, USA). Secondary antibody used was Anti-mouse IgG, horseradish peroxidase linked whole antibody (GE Healthcare UK Limited, Little Chalfont, UK).

Western Blot: Skim milk powder was obtained from Sigma-Aldrich, St. Louis, Missouri, USA, and Tween 20 was obtained from Bio-Rad Laboratories, Hercules, California, USA. SuperSignal West Femto Maximum Sensitivity Substrate was obtained from Thermo Fisher Scientific, Waltham, MA, USA.

Ligation and Transformation: BfuAI, NsiI, rSAP, T4 PNK and T4 DNA Ligase were obtained, along with the necessary buffers, from New England Biolabs, Ipswich, Massachusetts, USA. Oligo primers were ordered from Eurofins Scientific, Luxembourg (sequences in Appendix 3). OneShot Stbl3 Chemically Competent E. coli were obtained from Thermo Fisher Scientific, Waltham, MA, USA. LB plates with 100 µg/ml of Ampicillin and LB broth were obtained from Multicell, Wisent Inc., St-Bruno, QC. Ampicillin was obtained from Sigma-Aldrich, St. Louis, Missouri, USA. PureLink Quick Plasmid Miniprep Kit was obtained from Invitrogen, Thermo Fisher Scientific, Carlsbad, California, USA.

Gel Electrophoresis and Gel Extraction: Gels were made using UltraPure Agarose (Invitrogen,

Thermo Fisher Scientific, Carlsbad, California, USA) and SYBR Safe DNA Gel Stain

(Invitrogen, Thermo Fisher Scientific, Carlsbad, California, USA). The O’GeneRuler 1 kb DNA Ladder, ready-to-use, and 6x Orange Loading Dye were obtained from Thermo Fisher Scientific,


Waltham, MA, USA. The gel extraction was conducted using the QIAquick Gel Extraction Kit

(QIAGEN, Hilden, Germany).

Flow Cytometry: SYTOX Blue Dead Cell Stain was obtained from Molecular Probes, Eugene,

Oregon, USA. Flow cytometry was conducted on the Gallios Flow Cytometer, and analysis was performed using

Kaluza Analysis Software (Beckman Coulter, Brea, California, USA).

Liquid chromatography-tandem mass spectrometry: Extraction solvent was made of 40% acetonitrile, 40% methanol (both from Fisher Scientific, Fair Lawn, NJ, USA), and 20% water.

Metabolite standards were obtained from Sigma Chemicals, St. Louis, MO, USA. High performance liquid chromatography (Dionex Corporation, CA) with an Inertsil ODS-3 reversed phase C18 column (GL Biosciences, The Netherlands) and an electrospray ionization-triple-quadrupole mass spectrometer (AB Sciex 5500 Qtrap, Toronto, ON, Canada) were used.

2.2 Methods

Virus Production

The TKOv3 plasmid library, which uses the lentiCRISPRv2 plasmid as a backbone, containing

Cas9 and 70,948 gRNAs (4 gRNAs for all protein coding genes in the human genome, targeting

18,053 genes, as well as 142 control gRNAs targeting LacZ, EGFP and luciferase), was amplified using electroporation, and transfected into 293T cells alongside psPAX2 and pMD2.G

(packaging and envelope plasmids respectively), to produce virus containing the plasmid library.

Detailed methods are outlined in Aregger et al. 2019.92 Virus production was conducted by Jason

Moffat’s lab.


Cell Line Infection and Passaging of Replicates

The HAP1 WT, HAP1 NAGK KO, HAP1 GFPT1 KO, HAP1 MGAT1 KO, HAP1 MGAT5 KO,

MDA-MB-231 WT, MDA-MB-231 MGAT1 KO, and MDA-MB-231 MGAT5 KO cell lines were infected with virus containing the TKOv3 plasmid library at a multiplicity of infection

(MOI) of approximately 0.3, to minimize the incidence of two virus particles infecting the same cell and thereby avoid triple KOs. Infection with the library was performed at 200-fold coverage (starting with approximately 9E7 cells), and polybrene was added at a final concentration of 8 µg/ml.

After 24 hours of infection, the cells underwent puromycin selection (1 µg/ml for HAP1 cells and

2 µg/ml for MDA-MB-231 cells) for 48 hours. Once selection was complete, the cells were trypsinized, pooled, and plated into 3 technical replicates at 200-fold coverage per replicate (at least 1.5E7 cells per replicate per cell line). The cells were passaged every 3 days until 18 days after completion of puromycin selection for the HAP1 cell lines. The MDA-MB-231 cell lines were passaged every ~5 days, up to ~12 population doublings. At each passage cell pellets of

3E7 (400-fold library coverage) were isolated. Additional cell pellets of 2.5E5 and 5E6 cells were isolated at T0 (day 0 after puromycin selection was complete) and at the final time point, for mycoplasma testing (to ensure no contamination) and to confirm ploidy (only for the HAP1 cell lines, to see if the cells remained haploid), respectively.

gDNA Extraction, PCRs, and Gel Extraction

The cell pellets of 3E7 cells isolated at T0 and all three replicates at the final time point

(replicates A, B and C) underwent gDNA extraction using the Wizard Genomic DNA

Purification Kit as described in the kit manual, adding RNase A (final concentration of 100

µg/ml) treatment. The resulting gDNA concentration was determined using the Qubit dsDNA

Broad Range Assay Kit and the Qubit 2.0 Fluorometer as described in the kit manual. Fifteen replicate reactions of PCR1 were then conducted for each gDNA sample using the following conditions:

Table 2.1: PCR1 reaction mixture per tube.

| Reagent | Amount added/tube |
|---|---|
| 2x NEBNext Ultra II Q5 Master Mix | 25 µl |
| 10 µM v2.1-F1 primer | 2.5 µl |
| 10 µM v2.1-R1 primer | 2.5 µl |
| Genomic DNA | 3.5 µg |
| Water | x µl |
| Total | 50 µl |
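As a worked example of the "x µl" of water in Table 2.1, the gDNA volume follows from the measured gDNA concentration, and water makes up the remainder of the 50 µl reaction. The sketch below is illustrative only: the function name and the gDNA concentration of 250 ng/µl are hypothetical and are not values from the screens.

```python
# Sketch: volumes for one 50 µl PCR1 reaction, using the amounts in Table 2.1.
TOTAL_UL = 50.0        # final reaction volume
MASTER_MIX_UL = 25.0   # 2x master mix
PRIMER_UL = 2.5        # volume of each 10 µM primer (forward and reverse)
GDNA_UG = 3.5          # gDNA mass per reaction


def pcr1_volumes(gdna_ng_per_ul):
    """Return (gDNA volume, water volume) in µl for one PCR1 reaction."""
    gdna_ul = GDNA_UG * 1000.0 / gdna_ng_per_ul
    water_ul = TOTAL_UL - MASTER_MIX_UL - 2 * PRIMER_UL - gdna_ul
    if water_ul < 0:
        raise ValueError("gDNA is too dilute to fit 3.5 µg into a 50 µl reaction")
    return gdna_ul, water_ul


# Hypothetical gDNA concentration (ng/µl), for illustration only.
gdna_ul, water_ul = pcr1_volumes(250.0)
print(f"gDNA: {gdna_ul:.1f} µl, water: {water_ul:.1f} µl")
```

With these amounts, at most 20 µl is available for gDNA, so a sample would need to be at least 175 ng/µl to fit 3.5 µg into a single reaction.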

The thermocycler conditions were as follows:

1. 98°C for 30 seconds.

2. 98°C for 10 seconds.

3. 66°C for 30 seconds.

4. 72°C for 15 seconds.

5. Repeat steps 2-4 twenty-four times (25 cycles total).

6. 72°C for 2 minutes.

7. Hold at 10°C.

All 15 PCR1 products from the same gDNA sample were pooled, and 2 µl of each PCR1 was run on a 2% agarose gel to confirm that the amplified 600 base pair (bp) fragment was present, along with genomic DNA (a smear of sheared genomic DNA extending down to the 600 bp amplified fragment).

Figure 2.3 shows a sample PCR1 gel from the MDA-MB-231 screen samples.


Figure 2.3: Polymerase chain reaction 1 (PCR1) gel image for all three MDA-MB-231 CRISPR/Cas9 knockout (KO) screens. T0 is day 0 after puromycin selection, and there are three replicates, A, B and C, of the final time points (27 or 33 days after puromycin selection). This gel is run as a quality control to ensure the proper PCR1 product was obtained. A smear is expected from high molecular weight down to a clear 600 base pair (bp) fragment, indicating sheared genomic DNA and the amplified PCR1 product respectively, as seen in the gel image. bp, base pair; MGAT1, Alpha-1,3-Mannosylglycoprotein 2-Beta-N-Acetylglucosaminyltransferase; MGAT5, Alpha-1,6-Mannosylglycoprotein 6-Beta-N-Acetylglucosaminyltransferase; WT, wild type.


One reaction of PCR2 was then conducted from the pooled PCR1 product of each sample using the following conditions:

Table 2.2: PCR2 reaction mixture.

| Reagent | Volume added/tube |
|---|---|
| 2x NEBNext Ultra II Q5 Master Mix | 25 µl |
| 10 µM forward primer | 2.5 µl |
| 10 µM reverse primer | 2.5 µl |
| PCR1 product | 5 µl |
| Water | 15 µl |
| Total | 50 µl |

The thermocycler conditions were as follows:

1. 98°C for 30 seconds.

2. 98°C for 10 seconds.

3. 55°C for 30 seconds.

4. 65°C for 15 seconds.

5. Repeat steps 2-4 nine times (10 cycles total).

6. 65°C for 5 minutes.

7. Hold at 10°C.

6x Orange Loading Dye was added to the entire PCR2 product and loaded onto a 2% gel. The gel was run at ~90V for ~90 minutes. The 200 bp band amplified by PCR2 was excised from the gel for each sample, and the kit manual was followed for the QIAquick Gel Extraction Kit for each gel slice. The extracted DNA was then sent for Next Generation Sequencing.


Sequencing and Data Analysis

The extracted DNA from many samples was pooled and sequenced using standard primers for dual indexing on the Illumina HiSeq 2500. The PCR2 primers used (sequences in Appendix 2) are unique to each sample. This allows all samples to be pooled and sequenced together, as each sequence has a barcode indicating which sample it came from. The T0 samples were sequenced at 500-fold coverage, and end time points were sequenced at 200-fold coverage. Sequencing data was mapped to the gRNA library sequences, and read counts were normalized to 10 million reads per sample for comparison between time points. The fold-change of each gRNA was determined for each replicate by comparing the read count data from the final time points to the T0 sample. Any gRNAs with fewer than 30 reads at T0 were excluded, because at such low counts the loss of a gRNA due to genetic drift cannot be ruled out. Behaviour of essential and nonessential genes was assessed as a quality control, as shown for the HAP1 MGAT1 KO screen in Figure 2.4. Fold-change was analyzed using the Bayesian Analysis of Gene EssentiaLity (BAGEL) algorithm and a Bayes

Factor was determined for each gene, providing a confidence measure that the gene KO results in a change in fitness.92
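The normalization, filtering and fold-change steps described above can be sketched as follows. This is a minimal illustration rather than the pipeline used for the screens: the pseudocount, the choice to apply the 30-read filter to raw rather than normalized counts, the function names and the read counts are all assumptions made for the example.

```python
import math

TARGET_DEPTH = 10_000_000  # reads per sample after normalization (from the text)
MIN_T0_READS = 30          # gRNAs below this count at T0 are excluded (from the text)
PSEUDOCOUNT = 0.5          # assumption: avoids log2(0) for gRNAs that drop out fully


def normalize(counts):
    """Scale raw read counts so that the sample sums to TARGET_DEPTH."""
    total = sum(counts.values())
    return {grna: n * TARGET_DEPTH / total for grna, n in counts.items()}


def log2_fold_changes(t0_counts, end_counts):
    """Return the log2 fold change (end vs. T0) for each gRNA passing the T0 filter."""
    t0_norm = normalize(t0_counts)
    end_norm = normalize(end_counts)
    lfc = {}
    for grna, raw_t0 in t0_counts.items():
        if raw_t0 < MIN_T0_READS:  # assumption: filter applied to raw counts
            continue
        end_value = end_norm.get(grna, 0.0)
        lfc[grna] = math.log2((end_value + PSEUDOCOUNT) / (t0_norm[grna] + PSEUDOCOUNT))
    return lfc


# Hypothetical read counts for three gRNAs (not data from the screens).
t0 = {"gRNA_1": 100, "gRNA_2": 100, "gRNA_3": 10}
end = {"gRNA_1": 50, "gRNA_2": 150, "gRNA_3": 10}
lfc = log2_fold_changes(t0, end)
for grna, value in lfc.items():
    print(f"{grna}: {value:+.2f}")
```

In this toy example gRNA_3 is dropped by the T0 filter, gRNA_1 halves in relative abundance (log2 fold change near -1) and gRNA_2 is enriched.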

Based on the data, each gene was given a genetic interaction (GI) score, termed a pi-score, and a false discovery rate (FDR). The pi-score is calculated by subtracting the log2 fold change of the gene’s KO in the WT cell line from the log2 fold change of the gene’s KO in the gene-of-interest KO cell line, with both values normalized to the growth of each cell line, so that a negative pi-score indicates a stronger fitness deficit in the double KO than in the single KO. A sample calculation for the pi-score, using the sequencing results for the SLC16A1 gRNAs in the HAP1 NAGK KO

CRISPR/Cas9 screens, is shown in Appendix 4.
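The sign convention of the pi-score can be illustrated with a simplified sketch. It assumes the convention used in the Results, where a double KO that is less fit than expected gives a negative pi-score, and it averages gRNA-level log2 fold changes per gene. It omits the growth normalization and FDR estimation of the actual analysis, and the values are hypothetical; the sample calculation in Appendix 4 remains the reference.

```python
from statistics import mean


def gene_lfc(guide_lfcs):
    """Average the log2 fold changes of all gRNAs targeting one gene."""
    return mean(guide_lfcs)


def pi_score(lfc_in_goi_ko, lfc_in_wt):
    """Simplified pi-score: double-KO effect minus single-KO (WT background) effect."""
    return gene_lfc(lfc_in_goi_ko) - gene_lfc(lfc_in_wt)


# Hypothetical gRNA log2 fold changes for one gene (four gRNAs per gene).
wt_background = [-0.2, 0.1, -0.1, 0.0]       # near-neutral in the WT cell line
goi_ko_background = [-2.4, -2.6, -2.7, -2.5]  # strong dropout in the gene-of-interest KO

pi = pi_score(goi_ko_background, wt_background)
print(f"pi-score: {pi:.2f}")
```

A gene whose KO behaves the same in both backgrounds gives a pi-score near 0, whichever direction its individual fold changes take.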


Figure 2.4: Log2 fold change plot of essential and nonessential genes at the final time point of the HAP1 MGAT1 KO CRISPR/Cas9 screen relative to day 0 after puromycin selection. We would expect to see early drop out of essential genes, while nonessential genes should reveal a distribution of both dropout and enrichment (negative and positive log2 fold change respectively) across all genes, as shown for this screen.


Functional Profiling Analysis

GIs identified from all four screens (FDR<0.05) underwent functional profiling analysis run as a multiquery using g:Profiler (version e94_eg41_p11_9f195a1) with g:SCS multiple testing correction method applying a significance threshold of 0.05.93 Functional profiling analysis was also conducted for each screen individually, looking at positive and negative GIs separately, run as an ordered query (from highest to lowest magnitude pi-score). This was also run with g:SCS multiple testing correction method applying a significance threshold of 0.05.

Generating Cas9 Stable Cell Lines

Cas9 stable cell lines were generated for the HAP1 WT, NAGK KO, GFPT1 KO, MGAT1 KO, and MGAT5 KO cell lines and the MDA-MB-231 WT, MDA-MB-231 NAGK KO, MDA-MB-231 GNPNAT1 KO, MDA-MB-231 MGAT1 KO, and MDA-MB-231 MGAT5 KO cell lines.

This was done by lentiviral infection, using the Lenti-Cas9-2A-Blast transfer vector with the psPAX2 packaging vector and pCMV-VSV-G envelope vector. HEK293TN cells were transfected with all three vectors using jetPRIME Transfection Reagent as indicated in the product protocol.

The media was changed to virus production media 6-8 hours after transfection, and the media

(virus) was collected 48 hours after transfection. Each cell line was infected with the virus along with 8 µg/ml of polybrene for 24 hours. After infection, the cells were selected for 10 days with

Blasticidin (10 µg/ml for HAP1 cells, 20 µg/ml for MDA-MB-231 cells), and then continually passaged in media containing Blasticidin for 2 weeks to make the cell lines Cas9 stable.

Western Blot

Cas9 expression in the pooled cell population of the generated Cas9 stable cell lines was confirmed via Western Blot, shown in Figure 2.5.


Figure 2.5: Western blot membranes showing protein expression of Cas9 and γ-tubulin in our Cas9 stable cell lines. Panel (A) is the first attempt and (B) the second attempt of generating Cas9 stable cell lines in the HAP1 WT and gene-of-interest KO cell lines. The WT cell line is the same between these two gels, and generation of Cas9 stable cell lines with the HAP1 NAGK, MGAT1 and MGAT5 KO cell lines was repeated, shown in panel (B). Panel (C) is the MDA-MB-231 WT and gene-of-interest KO Cas9 stable cell lines. Cas9, CRISPR associated protein 9; GFPT1, Glutamine--Fructose-6-Phosphate Transaminase 1; GNPNAT1, Glucosamine-Phosphate N-Acetyltransferase 1; kDa, kilodalton; KO, knockout; MGAT1, Alpha-1,3-Mannosylglycoprotein 2-Beta-N-Acetylglucosaminyltransferase; MGAT5, Alpha-1,6-Mannosylglycoprotein 6-Beta-N-Acetylglucosaminyltransferase; NAGK, N-acetylglucosamine Kinase; WT, wild type.


Each cell line was lysed, and lysates were run on an 8% gel at 0.3 amperes and a fixed voltage of

150 volts for 75 minutes, to separate proteins. Proteins were then transferred to a PVDF membrane at 20 volts and a fixed amperage of 0.12 amperes for 90 minutes. Membranes were then blocked with D-PBS containing 5% milk and 0.1% Tween. Membranes were probed with primary antibody (1:1000 for Cas9, and 1:10,000 for γ-tubulin) overnight at 4°C, and then with secondary antibody (1:10,000) for one hour. The antibodies were detected using the SuperSignal

West Femto Maximum Sensitivity Substrate.

Ligating the pLCKO vector with gRNAs

gRNAs were selected for each of the genes chosen for validation (gene selection explained in results), as well as a positive and negative control gRNA. The gRNAs were selected from the

TKOv3 pooled gRNA library used for the genome-wide KO screens. Each guide in the library is assigned a guide score based on data from the TKOv1 library, where each nucleotide is given a score at each position in a gRNA (20 bp long) based on its effectiveness. This is determined by testing the effectiveness of various gRNAs on core essential genes, where knocking them out should result in a large fitness deficit negatively affecting that population’s cell growth.

Therefore, gRNAs with a higher guide score are predicted to be the most effective.94 For each gene chosen for validation, the gRNA was chosen by taking the guide with the highest guide score for that gene from the TKOv3 library, which contains four gRNAs/gene. Adeno-Associated Virus Integration Site 1 (AAVS1) was chosen as a negative control gRNA as it targets a non-coding region of the human genome and therefore should have no effect on the cell, and Polo Like Kinase 1 (PLK1) was chosen as a positive control gRNA, as it is an essential gene for survival of human cells.83 The gRNAs chosen are listed in Appendix 3, along with the forward and reverse oligos required for proper ligation into the pLCKO vector (requires a 5’-ACCG NNNNNNNNNNNNNNNNNNNN-3’ forward oligo and a 5’-AAAC NNNNNNNNNNNNNNNNNNNN-3’ reverse oligo).
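The oligo format above can be sketched in code. The example assumes, as is usual for annealed-oligo cloning but not stated explicitly here, that the N positions of the reverse oligo are the reverse complement of the 20-nucleotide guide. The function names and the guide sequence are hypothetical; the oligos actually used are those listed in Appendix 3.

```python
# Sketch: build the forward and reverse oligos for cloning one gRNA into pLCKO.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def plcko_oligos(guide):
    """Return (forward, reverse) oligos, written 5' to 3', for a 20-nt guide."""
    guide = guide.upper()
    if len(guide) != 20 or set(guide) - set("ACGT"):
        raise ValueError("guide must be 20 nucleotides of A, C, G or T")
    forward = "ACCG" + guide                      # 5'-ACCG + guide-3'
    reverse = "AAAC" + reverse_complement(guide)  # assumption: N = reverse complement
    return forward, reverse


# Hypothetical guide sequence, for illustration only.
fwd, rev = plcko_oligos("GATTACAGATTACAGATTAC")
print("forward:", fwd)
print("reverse:", rev)
```

When the two oligos anneal, the 20-nucleotide guide region forms the duplex and the ACCG and AAAC ends remain single-stranded as 4-nucleotide overhangs for ligation into the digested vector.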

Forward and reverse oligos for each chosen gRNA were mixed with T4 PNK and T4 ligation buffer and annealed in a thermocycler at 37°C for 30 minutes, followed by 95°C for 5 minutes, and then ramped down to 25°C at 5°C per minute. The pLCKO vector was digested with BfuAI first, then NsiI and rSAP were added and further incubated. The reaction was then run on a 1% gel and the ~7500 bp fragment was excised from the gel. The kit manual for the QIAquick Gel

Extraction Kit was followed for each gel slice. Digested vector was combined with the annealed oligos and T4 ligase. The ligated vector was transformed into Stbl3 competent cells, SOC-treated for 30 minutes, and plated on LB plates with 100 µg/ml of Ampicillin. After 24 hours, individual colonies were picked and grown in 5 ml of LB broth with 100-125 µg/ml of Ampicillin for 14-16 hours. The bacterial growth was pelleted via centrifugation and underwent a miniprep following the kit manual of the PureLink Quick Plasmid Miniprep Kit.

Competition Assay

The Cas9 stable cell lines were used as the starting cell lines for this assay. For each competition assay, the cell line was split into two populations. One of the populations was infected with virus containing the pLCKO transfer vector, introducing a gRNA of interest into the cells. The other population was infected with virus containing the pSTV6-PGK-P2R N-mCherry transfer vector

(which contains β-actin tagged with a doxycycline-inducible mCherry fluorescence gene). Both viruses were produced using the same method as that with the Lenti-Cas9-2A-Blast transfer vector (see Generating Cas9 Stable Cell Lines). After 24 hours of infection, both populations were selected with puromycin at a concentration of 2 µg/ml for 48 hours. The cells were then plated as a 50:50 population in triplicate, as well as a population containing only the β-actin fluorescent cells. The cells were passaged every 3-4 days over a 2-week period, and at each time point each sample was isolated for flow cytometry analysis in addition to being re-plated. The media on all cells was changed 24-28 hours prior to each passage and each cell collection for flow cytometry, to include 1 µg/ml of doxycycline hyclate. The mCherry β-actin fluorescence tag is doxycycline inducible and requires at least 24 hours of doxycycline exposure for sufficient mCherry fluorescence.

Liquid chromatography-tandem mass spectrometry (LC-MS/MS)

The HAP1 WT, HAP1 NAGK KO, HAP1 GFPT1 KO, HAP1 MGAT1 KO, and HAP1 MGAT5

KO cell lines were plated (n=7) in a 24-well plate at 1E5 cells/well. The following day 6 wells for each cell line were washed with PBS and then flash frozen in liquid nitrogen. The remaining well for each cell line was trypsinized to conduct a cell count. 1 ml of extraction solvent was added to each flash frozen well, then shaken on an orbital shaker at 4°C for 30 minutes. The solvent solution was recovered to tubes and shaken at 1000 rotations per minute (rpm) for 1 hour.

Samples were spun down at 14,000 rpm for 10 minutes and the supernatant containing the metabolites was dried at 32°C.

The metabolites were resuspended in water containing internal standards (D7-glucose and 13C9 15N-tyrosine) and separated by gradient reversed phase high performance liquid chromatography

(HPLC). The metabolites were then analyzed using an electrospray ionization-triple-quadrupole mass spectrometer. Each metabolite was quantified based on mass, and the relative quantity of each metabolite ion was determined by the area ratio, which takes the peak area of a metabolite in a sample and divides it by the peak area of the internal standard in that same sample. Detailed methods are outlined in Rahman et al. 2014.95 The metabolite list includes over 180 different metabolites, each represented by up to 3 separate ions. The final data table includes the metabolite ion with the cleanest or highest peak. The sample area ratios are normalized to cell number. This dataset includes six technical replicates, except for the MGAT5 KO cells, which have only five replicates. One replicate was eliminated from analysis due to a technical issue in the HPLC run that affected the retention time, and therefore the MS/MS recognition of numerous metabolites in the sample. This most commonly occurs if an air bubble enters the liquid in the

HPLC lines, and generally causes a delay in the metabolites entering the mass spectrometer. The data table was analyzed using MetaboAnalyst 4.0 Statistical Analysis, using standard deviation and no further normalization.96
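The area ratio and cell-number normalization described above amount to the following calculation. The sketch is illustrative only: the function names, peak areas and cell count are hypothetical and are not values from this dataset.

```python
def area_ratio(metabolite_area, internal_standard_area):
    """Peak area of a metabolite ion divided by the internal standard peak area."""
    return metabolite_area / internal_standard_area


def normalized_area_ratio(metabolite_area, internal_standard_area, cell_count):
    """Area ratio normalized to the number of cells counted for that cell line."""
    return area_ratio(metabolite_area, internal_standard_area) / cell_count


# Hypothetical peak areas (arbitrary units) and cell count for one sample.
ratio = area_ratio(2.0e6, 4.0e6)
per_cell = normalized_area_ratio(2.0e6, 4.0e6, 2.0e5)
print(f"area ratio: {ratio:.2f}; per cell: {per_cell:.2e}")
```

Because the internal standard is measured in the same sample, the ratio corrects for run-to-run differences in signal before the cell-number normalization is applied.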

Chapter 3

Results

3.1 HAP1 CRISPR/Cas9 Screens

Genome-wide CRISPR/Cas9 knockout (KO) screens were conducted, in collaboration with

Jason Moffat’s lab, on the HAP1 gene-of-interest KO cell lines: NAGK KO, GFPT1 KO,

MGAT1 KO and MGAT5 KO. These screens revealed positive and negative genetic interactions

(GIs) with each gene-of-interest. A positive GI is identified when a double mutant results in increased fitness relative to the single mutants of each gene. Inversely, a negative GI is identified when a double mutant results in lesser fitness than would be expected from the single mutants.

These interactions indicate a potential relationship between the genes in the GI pair. In Costanzo et al. 2016, most genes in the Saccharomyces cerevisiae genome were tested pairwise using targeted mutagenesis, which showed that positive and negative interactions typically reflect regulatory and functional relationships, respectively.4 Therefore, identifying GIs on a genome-wide scale can provide a comprehensive view of the complex nature of a particular gene’s role within a cell.

Across all four CRISPR/Cas9 KO screens, 1516 GIs were identified with a false discovery rate

(FDR) <0.2. Each GI was determined based on the effect of a new gene KO on the growth of the

HAP1 gene-of-interest KO cell line relative to the effect of that new KO in the WT cell line (the double and single KO, respectively). These effects were computed as log2 fold changes and plotted against one another. This is summarized for all four CRISPR/Cas9 screens in the score plots in Figure 3.1.


Figure 3.1: Score plots for all four HAP1 gene-of-interest CRISPR/Cas9 KO screens. Yellow dots represent genes that are positive interactors with the gene-of-interest and blue dots represent negative interactors (FDR<0.2). Each gene is plotted based on the effect of its KO in the gene-of-interest KO cell line relative to the WT cell line. Gene KOs that have a similar effect in both cell lines lie along the diagonal and are excluded due to the lack of a genetic interaction. FDR, False Discovery Rate; GFPT1, Glutamine--Fructose-6-Phosphate Transaminase 1; GI, Genetic Interaction; KO, Knockout; MGAT1, Alpha-1,3-Mannosylglycoprotein 2-Beta-N-Acetylglucosaminyltransferase; MGAT5, Alpha-1,6-Mannosylglycoprotein 6-Beta-N-Acetylglucosaminyltransferase; NAGK, N-acetylglucosamine Kinase; WT, Wild type.


As shown in the score plots, a large number of GIs were identified when using FDR<0.2 as a statistical cutoff, particularly in the HAP1 MGAT1 KO and HAP1 GFPT1 KO screens. The global GI network for S. cerevisiae revealed that the more essential a gene is, the more GIs it has.4 This suggests that MGAT1 and GFPT1 are more essential to cell survival compared to

NAGK and MGAT5.

This FDR cutoff was chosen for initial analysis due to the novelty of the genome-wide

CRISPR/Cas9 KO screen datasets using this method. Without a larger body of screen data, a proper significance cutoff cannot be determined, so a lenient cutoff of <0.2 was chosen by Jason Moffat’s lab to minimize the risk of eliminating true GIs. To focus on the most significant GIs, an FDR cutoff of <0.05 was chosen for further analysis.7,94 Gene interactors were chosen for validation using three analysis methods: functional profiling analysis, analysis of common GIs between screens, and score plot analysis, as follows.

All GIs (FDR<0.05) underwent functional profiling analysis using g:Profiler.93 The four GI lists were first analyzed together as a multiquery, revealing Gene Ontology (GO) terms enriched across all gene lists. Manhattan plots were generated for each gene-of-interest, each dot representing a GO term (Figure 3.2). The higher the dot is on the -log10(padj) y-axis, the more significant that GO term is (faded dots near the bottom are non-significant). To identify functional processes that are commonly over-represented, GO terms significant across all GI lists were selected. The GI list for the HAP1 NAGK KO screen did not show over-representation of any GO terms, therefore common GO terms were identified across the three remaining screens.

Among the GFPT1, MGAT1 and MGAT5 KO screens, there were 13 GO terms that were over-represented in the GI gene lists, circled in Figure 3.2. The thirteen common GO terms are outlined in Table 3.1, the IDs corresponding to the numbers on the Manhattan plots (Figure 3.2).


Figure 3.2: Manhattan plots of over-represented GO terms from multiquery analysis of GI lists (FDR<0.05) from all four gene-of-interest KO screens. The terms enclosed in a circle are the 13 common GO terms across the GFPT1, MGAT1 and MGAT5 KO screens, outlined in Table 3.1 (ID corresponds to numbers on plot above). BP, Biological Process; CC, Cellular Component; CORUM, CORUM Protein Complexes; GFPT1, Glutamine--Fructose-6-Phosphate Transaminase 1; GO, Gene Ontology; HP, Human Phenotype Ontology; HPA, Human Protein Atlas; KEGG, Kyoto Encyclopedia of Genes and Genomes; MF, Molecular Function; MGAT1, Alpha-1,3-Mannosylglycoprotein 2-Beta-N-Acetylglucosaminyltransferase; MGAT5, Alpha-1,6-Mannosylglycoprotein 6-Beta-N-Acetylglucosaminyltransferase; MIRNA, miRTarBase; NAGK, N-acetylglucosamine Kinase; REAC, Reactome; TF, Transfac (transcription factor); WP, WikiPathways.

Table 3.1: Gene Ontology terms over-represented in all three GFPT1, MGAT1 and MGAT5 KO screens (g:SCS multiple testing correction method applying significance threshold of 0.05).

| ID | Source | Term ID | Term Name | MGAT1 GIs (adjusted p-value) | MGAT5 GIs (adjusted p-value) | GFPT1 GIs (adjusted p-value) |
|---|---|---|---|---|---|---|
| 1 | GO:BP | GO:0006725 | cellular aromatic compound metabolic process | 1.23E-09 | 6.16E-04 | 1.88E-02 |
| 2 | GO:BP | GO:0006139 | nucleobase-containing compound metabolic process | 1.06E-09 | 5.62E-04 | 4.37E-02 |
| 3 | GO:BP | GO:0044237 | cellular metabolic process | 2.34E-10 | 1.46E-06 | 1.85E-03 |
| 4 | GO:BP | GO:0046483 | heterocycle metabolic process | 2.89E-10 | 4.91E-04 | 1.72E-02 |
| 5 | GO:BP | GO:0071704 | organic substance metabolic process | 8.43E-09 | 2.36E-03 | 2.95E-02 |
| 6 | GO:BP | GO:1901360 | organic cyclic compound metabolic process | 5.19E-09 | 7.78E-04 | 3.80E-02 |
| 7 | GO:BP | GO:0044238 | primary metabolic process | 3.86E-08 | 9.24E-04 | 4.03E-02 |
| 8 | GO:BP | GO:0006807 | nitrogen compound metabolic process | 3.60E-08 | 3.05E-04 | 1.08E-02 |
| 9 | GO:BP | GO:0008152 | metabolic process | 4.56E-08 | 3.30E-04 | 1.20E-02 |
| 10 | GO:CC | GO:0044424 | intracellular part | 5.00E-12 | 1.07E-10 | 2.31E-02 |
| 11 | GO:CC | GO:0005622 | intracellular | 9.89E-12 | 2.04E-10 | 2.68E-02 |
| 12 | TF | TF:M04691_1 | Factor: Kaiso; motif: TCTCGCGAG; match class: 1 | 3.04E-07 | 8.73E-05 | 4.74E-02 |
| 13 | TF | TF:M02052_1 | Factor: EHF; motif: CSCGGAARTN; match class: 1 | 9.80E-03 | 7.29E-05 | 5.87E-03 |

BP, Biological Process; CC, Cellular Component; GFPT1, Glutamine--Fructose-6-Phosphate Transaminase 1; GI, Genetic Interaction; GO, Gene Ontology; MGAT1, Alpha-1,3-Mannosylglycoprotein 2-Beta-N-Acetylglucosaminyltransferase; MGAT5, Alpha-1,6-Mannosylglycoprotein 6-Beta-N-Acetylglucosaminyltransferase; TF, Transcription Factor.


Of these thirteen common GO terms, nine are metabolic processes. As the four genes-of-interest are involved in anabolic pathways, the gene interactors are expected to reveal relationships within the larger network of metabolism. Although the metabolic terms are broadly defined, over-representation of these terms across the GI lists increases confidence in the CRISPR/Cas9

KO screen data.

Separate positive and negative GI lists, in decreasing order from strongest interaction

(FDR<0.05), were generated for each gene-of-interest. Functional profiling analysis was run as an ordered query using each list. The 40 most over-represented GO terms in each list were analyzed (or all GO terms if fewer than 40 in total), to look for enrichment of subsets of related GO terms. No clear relationships were observed in the 40 most significant GO terms for the NAGK positive GIs, MGAT1 positive GIs, MGAT5 negative GIs, and the GFPT1 negative GIs. For the remaining GI lists, the enriched subsets of GO terms are summarized in Table 3.2.

Each enriched GO term had a gene list associated with it. This gene list contains all the genes, from the provided GI list, that are known to be related to the GO term. These lists were used to isolate gene interactors that were common across all GO terms within each enriched subset. As these genes are common across all the related GO terms, they are likely the most strongly associated with each process (versus genes that are only related to some of the GO terms in the subset). While a GI can reveal a functional or regulatory relationship between two genes, a gene-of-interest that interacts with many genes with a similar functional role provides confidence that there is a relationship between the corresponding pathways. Therefore, these common genes, outlined in the last column of Table 3.2, are promising candidates for validation of the suggested genetic interactions.


Table 3.2: Enriched GO terms in ordered GI lists and the common gene interactors associated with those GO terms.

| GI List | Enriched Subset of Related GO Terms | Number of GO Terms in Subset (from top 40) | Common Genes Among All GO Terms Within Subset |
|---|---|---|---|
| NAGK KO Negative GIs | Cytoskeleton components and processes | 5 | INPPL1, NF2 |
| MGAT1 KO Negative GIs | Cadherins/cell-cell adhesion | 7 | CDH2, CTNNA1 |
| MGAT5 KO Positive GIs | Folate metabolism | 11 | none |
| GFPT1 KO Positive GIs | Folate metabolism | 3 | GART, MTHFD1, DHFR, ATIC |
| GFPT1 KO Positive GIs | Nucleotide metabolism | 13 | GART, ATIC, PPAT |

ATIC, 5-Aminoimidazole-4-Carboxamide Ribonucleotide Formyltransferase/IMP Cyclohydrolase; CDH2, Cadherin 2; CTNNA1, Catenin Alpha 1; DHFR, Dihydrofolate Reductase; GART, Glycinamide Ribonucleotide Transformylase; GFPT1, Glutamine--Fructose-6-Phosphate Transaminase 1; GI, Genetic Interaction; GO, Gene Ontology; INPPL1, Inositol Polyphosphate Phosphatase Like 1; KO, Knockout; MGAT1, Alpha-1,3-Mannosylglycoprotein 2-Beta-N-Acetylglucosaminyltransferase; MGAT5, Alpha-1,6-Mannosylglycoprotein 6-Beta-N-Acetylglucosaminyltransferase; MTHFD1, Methylenetetrahydrofolate Dehydrogenase, Cyclohydrolase And Formyltetrahydrofolate Synthetase 1; NAGK, N-acetylglucosamine Kinase; NF2, Neurofibromin 2; PPAT, Phosphoribosyl Pyrophosphate Amidotransferase.

After functionally characterizing the GI lists (FDR<0.05), a closer analysis of the strongest GIs was conducted. The GIs were narrowed down based on pi-score, which is a genetic interaction score. The pi-score is calculated as outlined in the Methods (Section 2.2, Sequencing and Data Analysis), and an example pi-score calculation for the SLC16A1 KO in the HAP1 NAGK KO cell line is outlined in Appendix 4.

The pi-score is calculated based on the log2 fold change of each cell population after introducing a new KO, comparing the double KO log2 fold change to the single KO log2 fold change. A log2 fold change value <-1 or >1 indicates a greater than 2-fold average decrease or increase, respectively, in the abundance of a gene’s gRNAs over the course of the CRISPR/Cas9 KO screen after infection and puromycin selection. This provides a measure of the fitness effect of a specific KO in a cell line, with the majority having a neutral or near neutral effect. A decrease in abundance of a gRNA indicates a fitness deficit, no change indicates no effect, and an increase indicates an advantage. A pi-score takes into account the effect of a KO in both the gene-of-interest KO cell line and the WT cell line. This provides a fitness score of that KO in the context of the GI, as a similar response in the WT cell line would result in a pi-score near 0. Therefore, to select for the strongest GIs, a pi-score cutoff of <-1 or >1 was selected. Genetic interaction maps were generated for both the positive and negative GIs revealed in the screens, outlined in Figures 3.3 and 3.4.
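The two cutoffs used to build the genetic interaction maps (FDR<0.05 and a pi-score below -1 or above 1) can be expressed as a simple filter. The sketch below is illustrative: the function name and category labels are assumptions, and only the GFPT2 values (pi-score -2.49, FDR 0.01, with GFPT1) are taken from the screen results; the other inputs are hypothetical.

```python
FDR_CUTOFF = 0.05  # significance cutoff used for further analysis
PI_CUTOFF = 1.0    # magnitude cutoff used to select the strongest GIs


def classify_gi(pi, fdr):
    """Classify one gene pair by the FDR and pi-score cutoffs."""
    if fdr >= FDR_CUTOFF:
        return "not significant"
    if pi < -PI_CUTOFF:
        return "strong negative"
    if pi > PI_CUTOFF:
        return "strong positive"
    return "below pi-score cutoff"


examples = {
    "GFPT2 (with GFPT1)": (-2.49, 0.01),  # values reported for this screen
    "hypothetical gene X": (1.30, 0.03),
    "hypothetical gene Y": (-0.40, 0.02),
    "hypothetical gene Z": (-3.00, 0.15),
}
for name, (pi, fdr) in examples.items():
    print(f"{name}: {classify_gi(pi, fdr)}")
```

Only pairs classified as strong negative or strong positive would appear in the negative and positive maps, respectively.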

The genetic interaction maps display the gene interactors, connected to the gene-of-interest(s) they interact with, in reference to their pi-score value and FDR. The strongest GIs are indicated by the darker blue fill (i.e., higher magnitude of pi-score), and the most significant are indicated by the thicker border (i.e., smaller FDR). For example, the negative genetic interaction map

(Figure 3.3) shows GFPT2 to be the strongest, and a very significant, negative interactor with

GFPT1. As GFPT1 and GFPT2 are paralogs with a redundant enzymatic role in HBP,16 this is an expected result. Because of this redundancy, knocking out each gene individually should not exert a strong effect on cell growth, which is what is observed when GFPT1 and GFPT2 are mutated alone.

Conversely, when mutated together, the cells exhibit a strong fitness deficit, quantified by a negative pi-score (-2.49, FDR=0.01). This provides further confidence in the success of this

CRISPR/Cas9 KO screen, and the necessity of HBP in HAP1 cancer cell growth.

Figure 3.3: Negative genetic interaction map (FDR<0.05, pi-score<-1). Yellow nodes are genes-of-interest, and blue are negative GIs, connected to the gene-of-interest(s) they interact with. The shade of blue indicates the strength of the GI based on pi-score, with the darker blue indicating a lower pi-score, and the border thickness indicates how significant the GI is, with a thicker border indicating a smaller FDR. The genes with a “*” beside them are those chosen for validation, based on being recurring GIs with more than one gene-of-interest. EPC2, Enhancer Of Polycomb Homolog 2; FDR, False Discovery Rate; NF2, Neurofibromin 2; PRDX1, Peroxiredoxin 1.


Figure 3.4: Positive genetic interaction map (FDR<0.05, pi-score>1). Yellow nodes are genes-of-interest, and blue are positive GIs, connected to the gene-of-interest(s) they interact with. The shade of blue indicates the strength of the GI based on pi-score, with the darker blue indicating a higher pi-score, and the border thickness indicates how significant the GI is, with a thicker border indicating a smaller FDR. The genes with a “*” beside them are those chosen for validation, based on being recurring GIs with more than one gene-of-interest. ATIC, 5-Aminoimidazole-4-Carboxamide Ribonucleotide Formyltransferase/IMP Cyclohydrolase; FDR, False Discovery Rate; SLC16A1, Solute Carrier Family 16 Member 1.

The genetic interaction maps reveal GIs that are common to multiple gene-of-interest screens. Among the four screens, nineteen GIs were common. The genome-wide double mutant screens conducted in S. cerevisiae by Costanzo et al. 2016 revealed that genes with more similar genetic interaction profiles tend to be functionally similar.4 Using the reverse rationale, since our genes-of-interest are functionally related, reoccurring GIs in our four screens are more likely to be true GIs. Therefore, three reoccurring GIs were selected for validation using this criterion: Solute Carrier Family 16 Member 1 (SLC16A1), Peroxiredoxin 1 (PRDX1), and Enhancer Of Polycomb Homolog 2 (EPC2). Three additional reoccurring GIs have already been selected for validation based on the functional profiling analysis: Inositol Polyphosphate Phosphatase Like 1 (INPPL1), Neurofibromin 2 (NF2), and 5-Aminoimidazole-4-Carboxamide Ribonucleotide Formyltransferase/IMP Cyclohydrolase (ATIC). All of these genes are marked with an asterisk in the genetic interaction maps (Figures 3.3-3.4).

Genome-wide CRISPR/Cas9 KO screening in the HAP1 cell line is a project spearheaded by Jason Moffat’s lab, in which over 150 screens have been conducted thus far. This has allowed identification of frequent fliers, which are genes that come up as GIs in a high proportion of the CRISPR/Cas9 screens. These trends are likely due to a general stress response following the KO of those particular genes, as opposed to true gene-gene interactions. GIs that appeared in more than 50% of screens were eliminated from the validation list. Additionally, priority was placed on negative interactions for two reasons. First, negative interactions typically reveal functional relationships, whereas positive interactions tend to reveal regulatory relationships; negative interactions also occurred more often between closely related genes than positive interactions did, increasing the likelihood that they would provide functional information. Second, collaborators in the Moffat lab have had difficulties successfully validating positive interactions.
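The frequent-flier filter described above can be sketched as follows. This is a minimal illustration only: the function name, gene labels, and hit counts are hypothetical and are not data from the Moffat lab screens.

```python
def drop_frequent_fliers(candidates, hit_counts, n_screens, max_fraction=0.5):
    """Keep candidate GIs that appear in at most `max_fraction` of all screens.

    candidates   : list of gene names under consideration for validation
    hit_counts   : dict mapping gene name -> number of screens in which it was a GI
    n_screens    : total number of screens conducted
    max_fraction : genes seen in more than this fraction of screens are dropped
    """
    kept = []
    for gene in candidates:
        fraction = hit_counts.get(gene, 0) / n_screens
        if fraction <= max_fraction:
            kept.append(gene)
    return kept


# Hypothetical counts for illustration only.
example_counts = {"GENE_A": 120, "GENE_B": 12, "GENE_C": 76}
print(drop_frequent_fliers(["GENE_A", "GENE_B", "GENE_C"], example_counts, n_screens=150))
```

With these made-up counts, GENE_A (80% of screens) and GENE_C (just over 50%) would be removed, and only GENE_B would remain on the validation list.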


Lastly, the score plots shown in Figure 3.1 were further analyzed, because where GIs lie on these plots is very informative. Showing the log2 fold change of a KO in the gene-of-interest KO cell line and the WT cell line individually is valuable, as fitness advantages or deficits that are selective to one cell line can be observed. Additionally, the further a GI lies from the diagonal the stronger it is. Negative GIs of interest are those that have a strong negative effect in the gene-of-interest KO cell line and a relatively neutral response in the WT cell line, indicating a synthetic lethal phenotype. Positive GIs of interest are those with a strong negative effect in the WT cell line and a near neutral response in the gene-of-interest KO cell line, indicating rescue of the phenotype observed in the single KO cells by the addition of the second KO. Based on these criteria, additional GIs have been selected for validation based on their location on the score plots. Detailed score plots for each gene-of-interest are outlined below (Figures 3.5-3.8). The GIs selected based on score plot location are circled in red, and all other GIs chosen for validation, as outlined previously, are circled in black.
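The score-plot criteria above can be expressed as a small sketch. The thresholds (`strong`, `neutral`) and function names are assumptions chosen for illustration; the thesis selected genes by inspecting the plots and does not state numeric cut-offs, and this is not the pi-score calculation.

```python
import math


def classify_gi(lfc_wt, lfc_ko, strong=-1.0, neutral=0.5):
    """Label a gene by its position on a WT-vs-KO log2 fold change score plot.

    lfc_wt : log2 fold change of the query gene KO in the WT cell line
    lfc_ko : log2 fold change of the query gene KO in the gene-of-interest KO line
    strong, neutral : hypothetical cut-offs, not values from the thesis
    """
    if lfc_ko <= strong and abs(lfc_wt) <= neutral:
        # Strong deficit only in the gene-of-interest KO background
        return "negative GI of interest"
    if lfc_wt <= strong and abs(lfc_ko) <= neutral:
        # Deficit in WT that disappears in the gene-of-interest KO background
        return "positive GI of interest"
    return "not prioritized"


def distance_from_diagonal(lfc_wt, lfc_ko):
    """Perpendicular distance of a point from the y = x diagonal."""
    return abs(lfc_ko - lfc_wt) / math.sqrt(2)


print(classify_gi(lfc_wt=0.1, lfc_ko=-2.0))
print(classify_gi(lfc_wt=-2.0, lfc_ko=0.0))
print(distance_from_diagonal(0.0, -2.0))
```

A gene with equal effects in both backgrounds lies on the diagonal (distance 0) and would not be prioritized under either criterion.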


Figure 3.5: Score plot of the NAGK KO CRISPR/Cas9 screen. Yellow dots represent genes that are positive interactors with the gene-of-interest and blue dots represent negative interactors (FDR<0.2). The circled genes are those that have been chosen for validation. The genes circled in red have been selected based on their location on the score plot. ATIC, 5-Aminoimidazole-4-Carboxamide Ribonucleotide Formyltransferase/IMP Cyclohydrolase; FDR, False Discovery Rate; GI, Genetic Interaction; INPPL1, Inositol Polyphosphate Phosphatase Like 1; KO, Knockout; NAGK, N-acetylglucosamine Kinase; NF2, Neurofibromin 2; OAZ1, Ornithine Decarboxylase Antizyme 1; SLC16A1, Solute Carrier Family 16 Member 1; WT, Wild type.


Figure 3.6: Score plot of the GFPT1 KO CRISPR/Cas9 screen. Yellow dots represent genes that are positive interactors with the gene-of-interest and blue dots represent negative interactors (FDR<0.2). The circled genes are those that have been chosen for validation. The genes circled in red have been selected based on their location on the score plot. ATIC, 5-Aminoimidazole-4-Carboxamide Ribonucleotide Formyltransferase/IMP Cyclohydrolase; DHFR, Dihydrofolate Reductase; EPC2, Enhancer Of Polycomb Homolog 2; FDR, False Discovery Rate; GART, Glycinamide Ribonucleotide Transformylase; GFPT1/2, Glutamine--Fructose-6-Phosphate Transaminase 1/2; GI, Genetic Interaction; KO, Knockout; MTHFD1, Methylenetetrahydrofolate Dehydrogenase, Cyclohydrolase And Formyltetrahydrofolate Synthetase 1; PPAT, Phosphoribosyl Pyrophosphate Amidotransferase; PRDX1, Peroxiredoxin 1; SLC16A1, Solute Carrier Family 16 Member 1; WT, Wild type.


Figure 3.7: Score plot of the MGAT1 KO CRISPR/Cas9 screen. Yellow dots represent genes that are positive interactors with the gene-of-interest and blue dots represent negative interactors (FDR<0.2). The circled genes are those that have been chosen for validation. The genes circled in red have been selected based on their location on the score plot. ATIC, 5-Aminoimidazole-4-Carboxamide Ribonucleotide Formyltransferase/IMP Cyclohydrolase; CCNF, Cyclin F; CHD2, Chromodomain Helicase DNA Binding Protein 2; CTNNA1, Catenin Alpha 1; EPC2, Enhancer Of Polycomb Homolog 2; FDR, False Discovery Rate; GI, Genetic Interaction; KO, Knockout; MGAT1, Alpha-1,3-Mannosylglycoprotein 2-Beta-N-Acetylglucosaminyltransferase; MLLT4, Mixed-Lineage Leukemia; Translocated To, 4; PRDX1, Peroxiredoxin 1; SLC16A1, Solute Carrier Family 16 Member 1; TAX1BP3, Tax1 Binding Protein 3; WT, Wild type.


Figure 3.8: Score plot of the MGAT5 KO CRISPR/Cas9 screen. Yellow dots represent genes that are positive interactors with the gene-of-interest and blue dots represent negative interactors (FDR<0.2). The circled genes are those that have been chosen for validation. ATIC, 5-Aminoimidazole-4-Carboxamide Ribonucleotide Formyltransferase/IMP Cyclohydrolase; FDR, False Discovery Rate; GI, Genetic Interaction; KO, Knockout; MGAT5, Alpha-1,6-Mannosylglycoprotein 6-Beta-N-Acetylglucosaminyltransferase; WT, Wild type.


3.2 Validation Competition Assay

To validate the selected GIs, our cell lines of interest (HAP1 WT, NAGK KO, GFPT1 KO, MGAT1 KO, and MGAT5 KO) were modified to stably express Cas9. This was done using lentiviral infection, followed by selection of the Cas9-expressing cells using blasticidin. Expressing a new protein within a cell, as well as the random integration that occurs when introducing DNA via lentiviral infection, can result in off-target growth effects. Therefore, generating Cas9 stable cell lines minimizes the potential confounding effect of Cas9 expression on the growth phenotype when comparing the effect of a new KO relative to the control. Cas9 expression was confirmed by Western blot, as shown previously in Figure 2.5 (page 41).

In total, seventeen GIs have been chosen for validation as previously described and are summarized in Table 3.3. The GIs were selected based on one or more of the three selection criteria outlined, summarized as: GO terms (GIs common across all GO terms in an enriched subset, explained on page 51), Common (a common GI between multiple gene-of-interest KO screens, explained on page 56), and Score plot (GIs chosen for validation based on their location on the score plot, explained on page 57). Table 3.3 also includes a brief functional description of each gene.

Table 3.3: Seventeen GIs that have been chosen for validation and their functions.

| Gene | Full Name | Interacts With | Interaction Type | Validation Criteria | Function |
|---|---|---|---|---|---|
| ATIC | 5-Aminoimidazole-4-Carboxamide Ribonucleotide Formyltransferase/IMP Cyclohydrolase | NAGK, GFPT1, MGAT1, MGAT5 | Positive | Common & GO terms | Catalyzes the last two steps of the de novo purine biosynthetic pathway. |
| CCNF | Cyclin F | MGAT1 | Positive | Score plot | Regulates cell cycle transitions; plays a role in protein ubiquitination. |
| CHD2 | Chromodomain Helicase DNA Binding Protein 2 | MGAT1 | Negative | Score plot & GO terms | A cadherin and glycoprotein; preferentially mediates homotypic cell-cell adhesion by dimerization with a CDH2 chain from another cell. |
| CTNNA1 | Catenin Alpha 1 | MGAT1 | Negative | GO terms | Role in cell-cell adhesion (connects both N- and E-cadherins to intracellular actin filaments); undergoes conformational changes in response to cytoskeletal tension. |
| DHFR | Dihydrofolate Reductase | GFPT1 | Positive | GO terms | Converts dihydrofolate into tetrahydrofolate (necessary for de novo purine synthesis, thymidylic acid, and some amino acids, ex. glycine); particularly important in folate metabolism. |
| EPC2 | Enhancer Of Polycomb Homolog 2 | MGAT1, GFPT1 | Negative | Common | Not a well characterized gene; may play a role in transcription or DNA repair. |
| GART | Glycinamide Ribonucleotide Transformylase | GFPT1 | Positive | GO terms | Has three functions, all required for de novo purine synthesis: phosphoribosylglycinamide formyltransferase, phosphoribosylglycinamide synthetase, and phosphoribosylaminoimidazole synthetase activity. |
| GFPT2 | Glutamine-Fructose-6-Phosphate Transaminase 2 | GFPT1 | Negative | Score plot | Controls flux of glucose into HBP; likely to be involved in regulating the availability of precursors for N- and O-linked glycosylation of proteins. |
| INPPL1 | Inositol Polyphosphate Phosphatase Like 1 | NAGK | Negative | GO terms | Involved in regulation of insulin function; plays a role in epidermal growth factor receptor turnover and actin remodelling; is a known breast cancer biomarker. |
| MLLT4 | Mixed-Lineage Leukemia; Translocated To, 4 | MGAT1 | Negative | Score plot | Involved in signaling and organization of cell junctions during embryogenesis; identified as a fusion partner of a gene involved in acute myeloid leukemias. |
| MTHFD1 | Methylenetetrahydrofolate Dehydrogenase, Cyclohydrolase And Formyltetrahydrofolate Synthetase 1 | GFPT1 | Positive | GO terms | Has 3 enzymatic activities: methylenetetrahydrofolate dehydrogenase, methenyltetrahydrofolate cyclohydrolase and formyltetrahydrofolate synthetase, which catalyze sequential reactions in the interconversion of 1-C derivatives of tetrahydrofolate (substrates for methionine, thymidylate and de novo purine synthesis). |
| NF2 | Neurofibromin 2 | NAGK | Negative | GO terms | Not a well characterized gene; has been shown to interact with proteins in the cell membrane, cytoskeletal proteins, and proteins regulating ion transport. |
| OAZ1 | Ornithine Decarboxylase Antizyme 1 | NAGK | Negative | Score plot | Plays a role in cell growth and proliferation by regulating intracellular polyamine levels. |
| PPAT | Phosphoribosyl Pyrophosphate Amidotransferase | GFPT1 | Positive | Score plot & GO terms | Member of the purine/pyrimidine phosphoribosyltransferase family; regulatory enzyme that catalyzes the first step of the de novo purine nucleotide biosynthetic pathway. |
| PRDX1 | Peroxiredoxin 1 | MGAT1, GFPT1 | Negative | Common | Member of peroxiredoxin family (antioxidant enzyme); reduces hydrogen peroxide and alkyl hydroperoxides. |
| SLC16A1 | Solute Carrier Family 16 Member 1 | MGAT1, GFPT1 | Positive | Common | Transports many monocarboxylates (ex. lactate and pyruvate) across the plasma membrane; plays a role in glucose homeostasis. |
| SLC16A1 | Solute Carrier Family 16 Member 1 | NAGK | Negative | Common & Score plot | As above. |
| TAX1BP3 | Tax1 Binding Protein 3 | MGAT1 | Negative | Score plot | Promotes protein-protein interactions that affect cell signaling, adhesion, protein scaffolding and receptor and ion transport functions; plays a role in many signaling pathways, ex. the Wnt/β-catenin pathway. |

A preliminary control experiment was conducted to test the validation method. This was done using the Cas9 stable cell lines and the competition assay outlined in the methods (pages 40-43). The HAP1 WT and HAP1 GFPT1 KO cell lines were tested by lentiviral infection with either a control gRNA (negative control targeting AAVS1 or positive control targeting PLK1) or the β-actin-tagged mCherry fluorescent protein. After 24 hours of infection and 48 hours of puromycin selection, each new KO cell population was combined 50:50 with the mCherry fluorescent cell population of the same HAP1 cell line (double vs. single KO, or single KO vs. WT). A portion of the 50:50 pooled cell suspension was isolated for flow cytometry analysis to confirm the exact proportion of each cell population at the initial time point of the competition assay. This is necessary as a baseline measure against which any changes in the populations over time can be compared. The remainder of the suspension was plated in three replicates.

Figure 3.9 shows the initial time point of the AAVS1 negative control test in the HAP1 WT cell line, including analysis of each individual cell population alone. This is necessary, as the AAVS1 KO population (non-fluorescent) is required to determine proper fluorescence gating, and the β-actin mCherry fluorescent cell population is required to determine the percent induction of the fluorescence (there will not be 100% induction of the mCherry expression). The combined cell population in panel C of Figure 3.9 shows that 41.99% of the cells were fluorescent. After taking into consideration that the mCherry-expressing cells show only 87.21% induction, the combined population actually contains 48.1% HAP1 WT fluorescent cells and 51.9% AAVS1 KO non-fluorescent cells, close to the target 50:50 ratio of each cell population at the beginning of the assay.
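The induction correction used above is a simple ratio: the observed fluorescent fraction divided by the fraction of mCherry cells that actually fluoresce. A short sketch using the two percentages reported in the text (the function name is illustrative, not from the thesis):

```python
def corrected_fluorescent_pct(observed_pct, induction_pct):
    """Estimate the true percentage of mCherry-labelled cells in a mixed pool.

    observed_pct  : % of cells in the combined pool gated as fluorescent
    induction_pct : % of cells in the mCherry-only population gated as fluorescent
    """
    return observed_pct / induction_pct * 100.0


wt_pct = corrected_fluorescent_pct(41.99, 87.21)
ko_pct = 100.0 - wt_pct
print(round(wt_pct, 1), round(ko_pct, 1))  # 48.1 51.9
```

The same correction would apply at every later time point, assuming the induction percentage stays constant over the course of the assay.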


Figure 3.9: Flow cytometry analysis of the cell populations at the initial time point of the HAP1 WT competition assay with the AAVS1 negative control gRNA. (A) AAVS1 KO cells, (B) β-Actin mCherry fluorescent cells, and (C) the combined cell population. Taking into account the 87.21% induction of the β-Actin mCherry fluorescence in that cell population, the combined population presents a 48.1% fluorescent (HAP1 WT) cell population at the initial time point, approximately 50% as desired. INT, intensity.


The preliminary competition assay was unsuccessful. Expression of the positive control gRNA targeting PLK1, an essential gene, in the presence of Cas9 should result in a significant fitness deficit in those cells. Using this assay, a fitness deficit would be quantified by seeing the mCherry fluorescent population outcompete the non-fluorescent PLK1 KO population over time.

In both the HAP1 WT and HAP1 GFPT1 KO cell lines, introduction of the PLK1 gRNA had a neutral effect on the cells, showing no difference in the proportion of each cell population over the course of two weeks. This indicates a problem with the assay, such as poor targeting of the gRNA to PLK1. However, gRNAs targeting two different essential genes, Proteasome 26S Subunit, Non-ATPase 1 (PSMD1) and PLK1, were tested as positive controls in the HAP1 cell lines, and neither resulted in the fitness deficit that should occur when knocking out an essential gene. Additionally, the gRNA sequences were chosen from the TKOv3 gRNA library, an optimized and tested library.94 Therefore, it is unlikely that the gRNAs are the problem.

Consistent with findings in the Moffat Lab, it is possible that HAP1 cells may not express functional Cas9. We had little success getting good Cas9 expression in the HAP1 cell lines, even after multiple attempts with some (Figure 2.5, page 41). The Cas9 stable cell lines were also used as a pooled population as opposed to expanding an individual clone. When using a pooled population, clones that have low Cas9 expression may have a fitness advantage as a result of the insertion site and haploid state of the cells. Therefore, in the heterogeneous population, the lower expressing cells typically outcompete the highly expressing cells, which may result in the remaining cell population having very low Cas9 expression and, in turn, poor editing efficiency.

It is likely that Cas9 editing efficiency is limited in these Cas9 stable cell lines, so that a new targeted KO is not generated when a gRNA is introduced. Consequently, this may limit the use of this validation assay in the HAP1 cell line.


The PSMD1 positive control gRNA was tested on the MDA-MB-231 Cas9 stable cell line. The KO appeared successful, as cell growth was so strongly affected that not enough cells remained to start the experiment; therefore, there are no quantified results. This observation, along with strong Cas9 expression in the generated Cas9 stable MDA-MB-231 cell lines (Figure 2.5, page 41), shows promise for using this assay to validate GIs in MDA-MB-231 cell lines.

3.3 Metabolomics

To better understand each gene-of-interest in the broader context of metabolism, metabolomics was conducted on the HAP1 WT and gene-of-interest KO cell lines. Cell lysates (n=6) were prepared for each cell line and data for 456 ions, representing 200 metabolites, were obtained using targeted liquid chromatography-tandem mass spectrometry (LC-MS/MS). Seventeen metabolites were duplicates, serving as a quality control between the negative and positive metabolite lists; therefore, 183 unique metabolites were included in the data acquisition.

MetaboAnalyst was used to conduct a one-way ANOVA (p<0.05) on the area ratio values determined for each metabolite after normalization to the cell number (explained in methods, pages 44-45). Metabolites with no statistical difference between cell lines were eliminated, leaving 88 metabolites, outlined in Appendix 5. Only the data from these metabolites were used for further analysis.
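For reference, the F statistic underlying a one-way ANOVA can be computed by hand as the ratio of between-group to within-group mean squares. The sketch below is a standard-library illustration of that formula only; it is not the MetaboAnalyst implementation, it does not compute p-values or FDR, and the sample values are invented.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of replicates."""
    k = len(groups)                                  # number of cell lines
    n_total = sum(len(g) for g in groups)            # total number of replicates
    grand_mean = fmean([x for g in groups for x in g])
    group_means = [fmean(g) for g in groups]

    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of replicates around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical normalized area ratios for one metabolite in three cell lines.
example = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(one_way_anova_f(example))
```

A large F value means the differences between cell-line means are large relative to the replicate-to-replicate variation; converting F to a p-value additionally requires the F distribution with (k - 1, N - k) degrees of freedom.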

First, a heat map showing relative levels of each metabolite was generated (Figure 3.10), providing an overview of the results. The unsupervised heat map showing natural clustering of all samples is shown in Appendix 6, which reveals clustering of the cell line replicates.


Figure 3.10: Heat map showing relative levels of 88 metabolites across the HAP1 WT and gene-of-interest KO cell lines (n=6). Scale values are relative, with dark red squares representing the largest quantity within the dataset, and dark blue the lowest quantity. GFPT1, Glutamine--Fructose-6-Phosphate Transaminase 1; KO, Knockout; MGAT1, Alpha-1,3-Mannosylglycoprotein 2-Beta-N-Acetylglucosaminyltransferase; MGAT5, Alpha-1,6-Mannosylglycoprotein 6-Beta-N-Acetylglucosaminyltransferase; NAGK, N-acetylglucosamine Kinase; WT, Wild type.


The HAP1 NAGK KO cell line appears to be the most similar to the HAP1 WT, with one significant metabolite difference, N-acetylglucosamine (GlcNAc). Similarly, the MGAT5 KO shows only modest differences relative to the WT for the majority of metabolites. In contrast, much larger differences are seen with the HAP1 GFPT1 KO and MGAT1 KO cell lines, with these two KOs showing almost opposite patterns in metabolite levels. The GFPT1 and MGAT1 KO metabolite levels are greatly decreased or increased compared to the WT, indicating that normal metabolism has been altered in these cell lines. This data suggests an increased dependence on these genes for cells to function properly, consistent with the elevated number of GIs in the CRISPR/Cas9 screen data for the HAP1 GFPT1 and MGAT1 KOs compared to the NAGK and MGAT5 KOs.

Among the most significant ANOVAs were N-acetylglucosamine (GlcNAc) and Uridine diphosphate (UDP)-GlcNAc (Figure 3.11).

GlcNAc levels across the cell lines revealed an F-value of 382.68 and an FDR of 1.19x10^-18 by ANOVA, and the differences between the cell lines are shown in Figure 3.11 panel A. Fisher's Least Significant Difference (LSD) post-hoc tests were conducted (p-value<0.05), revealing a significant difference between the NAGK KO cell line and all others, with GlcNAc levels at least 7.5 times higher in the HAP1 NAGK KO cells. NAGK catalyzes the enzymatic reaction that converts GlcNAc to GlcNAc-6-phosphate (-6P) in the HBP salvage pathway (Figure 1.2, page 8). Therefore, loss of NAGK function blocks the salvage pathway, and back-up of GlcNAc is expected. Interestingly, there is no significant difference between UDP-GlcNAc levels in the WT versus the NAGK KO cells, shown in panel B of Figure 3.11. This suggests that de novo HBP is sufficient for generating the required amounts of UDP-GlcNAc for downstream cellular processes.


Figure 3.11: Average (A) GlcNAc and (B) UDP-GlcNAc levels, normalized to cell counts, in the HAP1 WT and gene-of-interest KO cell lines. One or more “*” above a bar indicates a statistical difference from all other cell lines (Fisher's LSD, p<0.05). GFPT1, Glutamine--Fructose-6-Phosphate Transaminase 1; KO, Knockout; MGAT1, Alpha-1,3-Mannosylglycoprotein 2-Beta-N-Acetylglucosaminyltransferase; MGAT5, Alpha-1,6-Mannosylglycoprotein 6-Beta-N-Acetylglucosaminyltransferase; NAGK, N-acetylglucosamine Kinase; UDP-GlcNAc, Uridine diphosphate N-Acetylglucosamine; WT, Wild type.


UDP-GlcNAc differences revealed an F-value of 146.55 and an FDR of 1.84x10^-14 in the ANOVA. UDP-GlcNAc levels were significantly increased in the HAP1 GFPT1 KO cells (Fisher’s LSD post-hoc test, p-value<0.05), at least 2-fold higher than all other cell lines (Figure 3.11, panel B). This is unexpected, as the available pool of UDP-GlcNAc would be expected to diminish when blocking de novo HBP. This suggests compensation by GFPT2 activity or the salvage pathway, which may be associated with a different set of vulnerabilities compared to the complete loss of GFPT activity. The loss of GFPT1 activity may result in overcompensation of the pathway or loss of feedback inhibition, resulting in the large increase in UDP-GlcNAc levels that is observed in the HAP1 GFPT1 KO cells.

Of the four gene-of-interest KO cell lines, the metabolic profile of the MGAT1 KO appears to be the most different from the HAP1 WT cells, as seen in Figure 3.10. The relative levels of many metabolites, each normalized to its level in the WT cell line, are outlined in Figure 3.12. The cells respond to the MGAT1 KO with global upregulation of most metabolites, which implies an increased metabolic rate in response to the MGAT1 KO. The 2- to 4-fold increase in high energy nucleotide substrates, ATP, CTP, GTP, and UTP, relative to the WT, implies increased flux through catabolic pathways in the cell, for example glycolysis and the citric acid cycle, supported in panel C of Figure 3.12. Interestingly, the levels of lower energy nucleotides, the nucleoside mono- and diphosphates, are either similar or also increased in the MGAT1 KO relative to the WT. Since depletion of the low energy nucleotides is not seen, this indicates increased demand on anabolic and catabolic pathways alike. This is also supported by the upregulation of almost all amino acids, with the exception of valine: amino acids are not stored in the cell the way glucose or fatty acids can be, and are therefore being generated for immediate use.97


Figure 3.12: Relative levels of (A) amino acids, (B) nucleotides, and (C) metabolites involved in Glycolysis, the Pentose Phosphate Pathway (PPP), and the Citric acid cycle in the HAP1 MGAT1 KO cell line, normalized to the average level of each in the HAP1 WT cell line. A-, Adenosine; C-, Cytidine; -DP, diphosphate; G-, Guanosine; -H, reduced form of NAD+ and NADP; KO, Knockout; -MP, monophosphate; NAD, Nicotinamide adenine dinucleotide; NADP, Nicotinamide adenine dinucleotide phosphate; -TP, triphosphate; U-, Uridine; WT, Wild type.


An overall increase in metabolism such as this, like that seen in cancerous tissue relative to healthy tissue, would suggest increased growth and proliferation of the cells. This was not seen with the HAP1 MGAT1 KO cells relative to the WT cells, as these cells actually appeared to grow slightly slower. This suggests that these cells are experiencing heightened cell stress.

Further, there appear to be trends in nucleotide levels in the HAP1 GFPT1 KO cell line relative to the average level of each in the HAP1 WT cell line, as shown in Figure 3.13. The cells show an over 4-fold decrease in all high energy nucleotides (nucleoside triphosphates), with the peak intensity value for GTP being too low to be quantified, as it fell below the lower limit of detection (LLOD) for the mass spectrometer. As expected, the nucleoside diphosphate levels are increased in the GFPT1 KO, as each is reciprocally regulated with its nucleoside triphosphate. A similar yet more modest trend is revealed with a slight decrease in the levels of NADH and NADPH, the high energy counterparts of NAD+ and NADP, which are slightly increased. This indicates that energy metabolism is not able to keep up with the energy needs of the cell, and that high energy substrates are being used up too rapidly to maintain normal homeostasis.

The metabolomics results indicate specific vulnerabilities that arise in each gene-of-interest KO cell line, which may help explain some of the relationships identified with these genes from the genome-wide CRISPR/Cas9 KO screen data.


Figure 3.13: Relative nucleotide levels in the HAP1 GFPT1 KO cell line, normalized to the average level of each in the HAP1 WT cell line. A-, Adenosine; C-, Cytidine; -DP, diphosphate; G-, Guanosine; GFPT1, Glutamine--Fructose-6-Phosphate Transaminase 1; -H, reduced form of NAD+ and NADP; KO, Knockout; -MP, monophosphate; NAD, Nicotinamide adenine dinucleotide; NADP, Nicotinamide adenine dinucleotide phosphate; -TP, triphosphate; U-, Uridine; WT, Wild type.

Chapter 4

Discussion

4.1 HAP1 CRISPR/Cas9 Screens

The genome-wide CRISPR/Cas9 KO screens conducted on the HAP1 cell lines revealed many genetic interactions between NAGK, GFPT1, MGAT1 and MGAT5 and other genes across the entire human genome. These genetic interactions are relevant in the context of the HAP1 chronic myelogenous leukemia cell line. This near-haploid cell line was chosen for the initial CRISPR/Cas9 screening based on the assumption that generating mutations would be more efficient in cells with a single copy of each gene.98 However, the CRISPR/Cas9 method using optimized gRNAs has been shown to be very efficient in many diploid cell types.99 Cell type and mutation profiles may influence the GIs observed in the screens, for example gene relationships with the Bcr-Abl fusion protein, which HAP1 cells harbour.100 Ideally, we hope to find GIs that reveal important functional and regulatory relationships between genes in the greater context of cancer, and that have broad applicability across various cancer types.

To better characterize the genetic interactions from our screens, functional profiling analysis was conducted, identifying relationships between our genes-of-interest and metabolism broadly. The movement towards multi-target therapy for cancer treatments is an attempt to overcome the immense adaptive capacity of tumor cells.101–104 The CRISPR/Cas9 screens documented herein are a first step towards a better understanding of how exactly HBP and the N-glycan branching pathways fit into central metabolism and, with validation and follow-up experiments, have the potential to reveal new relationships.


Functional profiling on individual GI lists for each gene-of-interest revealed over-representation of genes involved in cell-cell adhesion and the cytoskeleton in the MGAT1 and NAGK lists respectively (Table 3.2, page 52), as previously reported in literature.105–108 In contrast, over-representation of genes involved in folate and nucleotide metabolism was found in the GFPT1 and MGAT5 GI lists, which was unanticipated.

N-glycosylation is known to be highly involved in cell adhesion, with the number and structure of N-glycans on the cell surface being very important.108 Aberrant N-glycan branching has been identified in cancer, and has been attributed to altered cell-cell adhesion, increasing motility and invasive phenotypes.105,106 Aberrant N-glycosylation of cadherins specifically has revealed a similar response, likely playing a prominent role in the mechanism.105,109,110 As cadherins are transmembrane proteins that link the cell surface to the cytoskeleton,111 this supports our screen data, linking HBP and N-glycosylation to both cytoskeletal components and cadherins, particularly in the context of cancer. This link appears to be very important in solid tumors, for example gastric and oral cancers, and we would expect the relationship to be upheld when expanding the CRISPR/Cas9 screens to other cancer cell types, such as the MDA-MB-231 breast cancer cell line.109,110

Interestingly, a link between the cytoskeleton and NAGK has been previously identified. Islam et al. 2015 revealed that NAGK plays a role in axonal growth, which appears to be partially mediated by NAGK's interaction with tubulin and dynein.107 A similar positive growth effect by NAGK, mediated by the cytoskeleton, may occur in other cells and tissues, such as HAP1 cells, which would be supported by the HAP1 CRISPR/Cas9 screen data.

While N-glycosylation is not known to be implicated in folate and nucleotide metabolism, many membrane receptors and solute transporters are N-glycosylated, such as folate and nucleoside transporters. Glycosylation plays a role in receptor and transporter stability and activity at the cell surface, and alterations in glycosylation can impact their function.16,112,113 While the relationship implied by the CRISPR/Cas9 screens may be mediated through receptors and transporters important in these pathways, this is unlikely, as N-glycosylation of cell surface proteins extends far beyond players associated with these two pathways. This indicates a more direct relationship. Interestingly, the tetrahydrofolate cycle generates one-carbon units from serine and glycine, which are required for nucleotide biosynthesis.114 Seeing GIs in both folate and nucleotide metabolism suggests a novel relationship between these pathways and N-glycosylation and/or HBP based on the CRISPR/Cas9 screen data.

4.2 Validation Competition Assay

The seventeen most promising GIs were chosen for validation, as outlined in Table 3.3 (pages 63-64). All of these genes are known to be implicated in cancer, either by correlational studies looking at patient samples, by direct in vitro and in vivo studies revealing a functional or regulatory role, or by playing a role in sensitizing tumors to certain chemotherapies.115–145 The number of publications that link each gene to cancer (when searching “gene name” AND cancer) in PubMed is outlined in Table 4.1. The high number of publications linking DHFR and NF2 with cancer is likely due to the relationship between these genes and cancer being identified as early as the 1980s, versus a more recent discovery for many of the other genes. ATIC, DHFR, INPPL1, MLLT4 and NF2 are confirmed cancer-related genes by The Human Protein Atlas, each being a cancer biomarker, a mutated gene in cancer, and/or a cancer driver (Human Protein Atlas available from www.proteinatlas.org). The Catalogue of Somatic Mutations in Cancer (COSMIC) was also used to look at the somatic mutation incidence of each gene in tumors, based on data collected from peer reviewed papers, databases, and experimental work (cancer.sanger.ac.uk).146 None are heavily mutated in cancers, unlike p53 for example, which appears to have a somatic mutation in over 50% of cancers.147 This suggests that these genes have a more peripheral role in cancer initiation and progression, but understanding their role may be very valuable in understanding the complexity of cancer.

Table 4.1: Number of publications linking each gene to cancer, conducted by searching [“gene name” AND cancer] in PubMed, and the incidence of somatic mutations found in these genes in cancer samples from the Catalogue of Somatic Mutations in Cancer (COSMIC).

| Gene Name | Number of Publications | Number of Somatic Mutations/Total Samples |
|---|---|---|
| ATIC | 82 | 195/35562 |
| CCNF | 21 | 230/36241 |
| CHD2 | 17 | 499/37116 |
| CTNNA1 | 67 | 301/39771 |
| DHFR | 841 | 30/35719 |
| EPC2 | 21 | 208/35672 |
| GART | 67 | 244/35628 |
| GFPT2 | 10 | 241/35628 |
| INPPL1 | 15 | 407/35000 |
| MLLT4 | 24 | 517/35000 |
| MTHFD1 | 69 | 221/35608 |
| NF2 | 1760 | 1494/56844 |
| OAZ1 | 23 | 38/35608 |
| PPAT | 28 | 119/35608 |
| PRDX1 | 126 | 63/35608 |
| SLC16A1 | 33 | 147/35608 |
| TAX1BP3 | 4 | 32/35608 |

Ten of these seventeen genes are involved in the pathways that were over-represented in the GI lists from the CRISPR/Cas9 KO screens (Table 3.3, pages 63-64): cell-cell adhesion/cadherins, cytoskeleton, folate metabolism and nucleotide metabolism. In addition, GFPT2 is within HBP itself. Based on these known relationships, these GIs are likely to be successfully validated. Additionally, some of these genes, for example EPC2, are minimally characterized. Therefore, validation of these GIs will provide information on the function of these genes and on their gene interaction network.

4.3 Metabolomics

We characterized the metabolic phenotypes of the four gene-of-interest HAP1 mutants used in our genome-wide CRISPR/Cas9 KO screens as a step towards understanding the KO vulnerabilities and developing a working hypothesis for candidate GIs. The top changes in metabolites by ANOVA were in GlcNAc and UDP-GlcNAc in HBP.

The near 4-fold increase in GlcNAc relative to the WT cells was observed exclusively in the HAP1 NAGK KO. This considerable increase suggests that GlcNAc only functions in the context of UDP-GlcNAc, as it does not appear to be used up by another process.148–150 Current literature highlights the role of GlcNAc in the generation of UDP-GlcNAc, and where there is uncertainty in the mechanism, for example in a rat model of osteoarthritis where GlcNAc has been shown to have chondroprotective effects, it is still hypothesized to function via generation of hyaluronan or glycosylation of proteins. This suggests the mechanism does require the salvage pathway and the generation of UDP-GlcNAc.151–153 Therefore, it is likely that the known HBP salvage pathway is the only mechanism of GlcNAc metabolism in humans, other than possible catabolism into building blocks when GlcNAc is not used by the cell, supporting our hypothesis.

Cells typically favour the de novo pathway for synthesis of UDP-GlcNAc, as shown by much stronger perturbations in response to knocking out enzymes in this pathway (i.e. GFPT1) relative to enzymes in the salvage pathway (i.e. NAGK). This is supported by the data from both the CRISPR/Cas9 screens and metabolomics, via the number of GIs and the overall metabolic effect respectively, as previously outlined. Interestingly, UDP-GlcNAc levels are increased over 2-fold in the GFPT1 KO cells. This indicates an overcompensation: a gain-of-function HBP phenotype in these GFPT1 KO cells. GFPT2 shares the same function, but may be regulated in a different manner.154 GlcN-6P has been shown to have an inhibitory effect on GFPT1, resulting in negative feedback inhibition within the pathway, a relationship that has not been identified with GFPT2.155 Therefore, if this negative feedback loop is not present, or not as prominent, with GFPT2, absence of feedback inhibition may result in aberrant flux, and therefore increased UDP-GlcNAc levels, as was observed. These two enzymes also typically show different tissue distributions, with higher GFPT1 expression observed in human peripheral blood leukocytes as outlined by Oki et al. 1999, which would be the cell type of origin for the HAP1 leukemia cells.30 Therefore, the cells must adapt to the loss of GFPT1 activity by upregulating GFPT2, as well as likely increasing activity through the salvage pathway, ultimately generating far more UDP-GlcNAc than in the WT cells. This results in increased levels of substrate for N-glycosylation, potentially leading to aberrant N-glycosylation, which may have an effect on carcinogenic phenotypes, such as cell-cell adhesion or retention of receptors that are important in cancer, such as growth factor receptors. Increased flux through de novo HBP may also drain important energy substrates in cancer cells, which may contribute to the slightly slower growth of HAP1 cells with the GFPT1 KO compared to the WT cells. The effect of the GFPT1 KO on N-glycosylation is not yet known and would be valuable to explore further.

The folate pathway, required for nucleotide biosynthesis, is over-represented in the lists of candidate GIs with MGAT5 and GFPT1 from the CRISPR/Cas9 KO screens. This may play a role in the altered nucleotide energy states observed in the GFPT1 KO (Figure 3.12, page 73). Interestingly, methionine levels were increased consistently across all four gene-of-interest KO cell lines. Methionine synthesis requires folate metabolism, as glycine catabolism into the one-carbon pools supports the synthesis of methionine as well as nucleotides,114 suggesting potential upregulation of this pathway in our mutant cells. This pathway is very likely upregulated in the HAP1 MGAT1 KO cells, sustaining the increased levels of almost all nucleotides and amino acids that are observed. This may indicate a direct link between N-glycosylation and the folate pathway.

The global increase in metabolites in the HAP1 MGAT1 KO, which can be seen in the heat map in Figure 3.10 (page 69), indicates an overall increase in metabolism. Metabolomics was similarly analyzed in the MDA-MB-231 gene-of-interest KO cell lines, where the MDA-MB-231 MGAT1 KO cells displayed a similar excess of metabolites compared to the MDA-MB-231 WT cell line, as shown in Figure 4.1.


Figure 4.1: Relative levels of metabolites that show a significant difference in one or more of the MDA-MB-231 WT and gene-of-interest KO cell lines (n=9). Scale values are relative, with dark red squares representing the largest quantity within the dataset and dark blue squares the lowest. GNPNAT1, Glucosamine-Phosphate N-Acetyltransferase 1; KO, Knockout; MGAT1, Alpha-1,3-Mannosylglycoprotein 2-Beta-N-Acetylglucosaminyltransferase; MGAT5, Alpha-1,6-Mannosylglycoprotein 6-Beta-N-Acetylglucosaminyltransferase; NAGK, N-acetylglucosamine Kinase; WT, Wild type.


Little is known about the effect of MGAT1 in the greater context of metabolism, but the production of reactive oxygen species (ROS) is an inevitable byproduct of energy metabolism, and is likely increased in the HAP1 MGAT1 KO cells. ROS can have many effects on the cell, such as DNA damage, irreversible post-translational modifications, and an ER stress response.156 These can all promote carcinogenesis, but in excess can be toxic and result in apoptosis or necrosis signaling.156,157 This may account for the decreased tumor growth observed when the MDA-MB-231 MGAT1 KO cells were tested in vivo in tumor xenograft mouse models, previously outlined in Figure 1.1 (page 6), and may be why this was the only cell line that did not appear to adapt to the gene KO. In contrast, loss of Mgat5 in a mammary tumor mouse model resulted in an increase in ROS in the cells, which inhibited PTEN, a tumor suppressor, thereby promoting a carcinogenic phenotype.29 It appears that changes in ROS occur in response to loss of MGAT1 and MGAT5, but the exact mechanism and response are unclear. Therefore, ROS levels should be measured in both the MDA-MB-231 and HAP1 MGAT1 and MGAT5 KO cell lines. PRDX1 is an antioxidant enzyme that appears to be a negative gene interactor with MGAT1, as well as with GFPT1, which also appears to have a unique metabolic signature relative to the WT cells. This is consistent with the proposed ROS-dependent vulnerability, making PRDX1 a very promising candidate for validation.

Many more GIs were revealed in the HAP1 MGAT1 KO and GFPT1 KO cell lines relative to the HAP1 NAGK and MGAT5 KO cell lines. This is similar to the metabolomics data, where more differences were observed in the MGAT1 and GFPT1 KO cell lines when compared to the HAP1 WT. Together, this suggests with confidence that the cells depend more on the function of these two genes than on NAGK and MGAT5, making them more essential genes. MGAT1 has a unique activity and is essential for embryonic development in mice,158 whereas Mgat5-/- mice could be successfully generated,24 further supporting this conclusion. The NAGK and MGAT5 KOs appear to have a mild effect on both the HAP1 and MDA-MB-231 cells; therefore, biochemical redundancy may partially circumvent these mutations in cultured cells.

These similarities between the HAP1 and MDA-MB-231 metabolite data imply that the effects are associated with the KO mutations, rather than being exclusive to one cell line or cancer type. This suggests that the core features of the metabolic phenotype for each gene-of-interest KO may be conserved across different cancer types.

An alternate route and compensation are possible for the GFPT1 mutant cells (i.e. through expression of GFPT2), supported by the metabolomics data whereby high levels of UDP-GlcNAc were still generated in the HAP1 GFPT1 KO cells. GNPNAT1, which catalyzes the step after GFPT1/2, is not believed to have a homologous enzyme, which would make its KO more detrimental than the GFPT1 KO. The metabolomics data on the MDA-MB-231 cells support this, as large relative differences are seen across almost all metabolites in the GNPNAT1 KO cells (Figure 4.1), more so than in the MGAT1 KO in both the HAP1 and MDA-MB-231 cell lines. Tissue culture work with the MDA-MB-231 GNPNAT1 KO cells has also revealed a much slower growth phenotype relative to the MDA-MB-231 WT cells. Surprisingly, the in vivo study outlined in Figure 1.1 (page 6) revealed that the MDA-MB-231 GNPNAT1 KO cells grew tumors the fastest compared to the MGAT1 and MGAT5 KO cells. This is not expected based on previous trends seen with these gene KOs in vitro, indicating that there may be another enzyme that shares its function. This enzyme may not be expressed in cancer or breast tissue specifically, so this milder response to the GNPNAT1 KO may only be observed in vivo, where the tumor has access to all nutrients and enzymes circulating in the blood.


4.4 Limitations

CRISPR/Cas9 is a very efficient method for targeted mutagenesis, with high sensitivity and specificity,77,83 but has its limitations. There is the possibility of off-target effects when working with CRISPR. All gRNAs have a calculated off-target score, which depends on how many sites elsewhere in the genome match the 20 bp gRNA sequence, usually allowing for up to two mismatches. gRNAs also have an on-target score, indicating how well they target the intended genomic locus, which was computed for the TKOv3 gRNA library based on TKOv1 gRNA library data.94 These two scores need to be considered when choosing gRNAs. When designing the TKOv3 gRNA library, Hart et al. 2017 focused on optimizing the on-target score, as a small number of off-target cut sites in intergenic regions have been shown to have a negligible fitness effect.7
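The idea behind counting off-target hits can be illustrated with a short sketch. This is not the scoring method used for the TKOv3 library; it is a simplified illustration that assumes a single strand, ignores the PAM requirement, and uses a made-up toy sequence in which N is treated as a base that never matches. All names and sequences below are hypothetical.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def count_off_target_sites(guide, genome, on_target_start, max_mismatches=2):
    """Count windows, other than the intended site, that match the guide
    with at most max_mismatches mismatches."""
    k = len(guide)
    hits = 0
    for start in range(len(genome) - k + 1):
        if start == on_target_start:
            continue  # skip the intended target itself
        if hamming(guide, genome[start:start + k]) <= max_mismatches:
            hits += 1
    return hits


def mutate(seq, changes):
    """Return seq with the bases at the given 0-based positions replaced."""
    bases = list(seq)
    for pos, base in changes.items():
        bases[pos] = base
    return "".join(bases)


# Hypothetical 20 bp guide and toy "genome" (illustration only).
guide = "GATTACAGGCTTAACCGTGA"
spacer = "N" * 25  # N never matches A, C, G or T
site_two_mismatches = mutate(guide, {3: "A", 15: "G"})
site_three_mismatches = mutate(guide, {0: "C", 9: "A", 19: "T"})
genome = (spacer + guide + spacer + site_two_mismatches
          + spacer + site_three_mismatches + spacer)
on_target_start = len(spacer)

print(count_off_target_sites(guide, genome, on_target_start))
print(count_off_target_sites(guide, genome, on_target_start, max_mismatches=3))
```

In this toy sequence only the site with two mismatches counts as an off-target hit at the default threshold, and the site with three mismatches is added when the threshold is relaxed to three.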

The CRISPR/Cas9 system requires introducing new genes and DNA sequences into cells. This is typically done via transfection or lentiviral infection methods. Lentiviral infection involves random integration of DNA into the genome of the infected cells, providing another opportunity for off-target effects. Although transfection does not involve integration, it is a transient process, as the DNA is not maintained in the entire cell population upon expansion. This has its own limitations, as it provides a brief time frame for a successful KO to occur, and if the mutation is somehow repaired after the Cas9 gene and gRNA are already lost, that repair will be maintained.

The generation and maintenance of a targeted gene KO is a priority in the CRISPR/Cas9 screens and the validation assay, making lentiviral infection the preferred method. Additionally, over 98% of the human genome is non-coding, so it is highly likely that the DNA will integrate into a non-coding region of the genome, making it unlikely to have a phenotypic effect.


There is another layer of control over off-target effects in the CRISPR/Cas9 KO screens, as the gRNA library used contains 4 gRNAs per gene. gRNA behaviour is considered when analyzing the CRISPR/Cas9 screen data. For example, when one gRNA behaves differently from the other three gRNAs targeting the same gene, the effect of that gRNA is excluded from the results. Its unique phenotype is likely the result of an off-target effect or a poor on-target effect, meaning that the result is likely not a response to the double KO being tested.
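One simple way to picture the exclusion of a discordant gRNA is sketched below. The exact rule applied in the screen analysis is not described here, so the median-based criterion, the threshold, and the log2 fold-change values are all assumptions chosen for illustration.

```python
from statistics import median


def drop_discordant_guides(log2_fold_changes, max_deviation=1.0):
    """Keep the gRNAs whose log2 fold change lies within max_deviation
    of the median across all gRNAs targeting the same gene.

    The threshold is an illustrative assumption, not a published cutoff.
    """
    centre = median(log2_fold_changes.values())
    return {
        guide: value
        for guide, value in log2_fold_changes.items()
        if abs(value - centre) <= max_deviation
    }


# Hypothetical log2 fold changes for the 4 gRNAs targeting one gene.
example_gene = {"gRNA_1": -2.1, "gRNA_2": -1.9, "gRNA_3": -2.3, "gRNA_4": 0.2}

kept = drop_discordant_guides(example_gene)
print(sorted(kept))
```

In this made-up example three gRNAs drop out together while the fourth does not, so the fourth is the one set aside before the gene-level effect is summarized.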

The KOs generated using the genome-wide gRNA library in the CRISPR/Cas9 KO screens are not confirmed by sequencing, as it is not feasible to sequence every gene in every cell when screening at 200-fold coverage using a library containing 70,948 gRNAs (sequencing over 1.5E7 cells). Therefore, the gRNAs are sequenced, and it is assumed that the correct KO is being generated in the cell, as Cas9 is also present. The screens undergo a quality control check looking at dropout of essential and nonessential genes as outlined in Figure 2.4 (page 39), but each individual KO cannot be guaranteed, and there are possibly some escapers of the phenotype. Additionally, cell lines are infected at an MOI of approximately 0.3 to minimize the occurrence of two viral particles infecting the same cell and therefore producing a triple KO. A triple KO likewise cannot be avoided entirely and may slightly skew the data, but by using 200-fold coverage of the gRNA library in the screens and 4 gRNAs per gene, this effect should be negligible. Additionally, only the first and final time points of the experiment are sequenced, which is sufficient to identify integrated gRNAs that either drop out or amplify in the competitive cell expansion. Analyzing the middle time points would allow a measure of rates, such that very early gRNA loss would indicate a stronger negative GI than if the loss occurred at the completion of the screen, which cannot be differentiated without sequencing middle time points.
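The coverage and MOI figures above can be checked with some quick arithmetic. The sketch below assumes that the number of viral integrations per cell follows a Poisson distribution with a mean equal to the MOI, which is a common modelling assumption rather than something measured in this work; the variable names are illustrative.

```python
import math

LIBRARY_SIZE = 70_948  # gRNAs in the library, as stated in the text
COVERAGE = 200         # fold coverage of the library
MOI = 0.3              # approximate multiplicity of infection

# Infected cells needed to represent every gRNA at the stated coverage.
cells_required = LIBRARY_SIZE * COVERAGE


def poisson_pmf(k, lam):
    """Probability of exactly k events for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


p_uninfected = poisson_pmf(0, MOI)
p_single = poisson_pmf(1, MOI)
p_multiple = 1 - p_uninfected - p_single

# Among infected cells, the share carrying two or more integrations.
fraction_multiple_among_infected = p_multiple / (1 - p_uninfected)

print(f"Cells required: {cells_required:,}")
print(f"Uninfected: {p_uninfected:.3f}")
print(f"Single integration: {p_single:.3f}")
print(f"Two or more integrations: {p_multiple:.3f}")
print(f"Multiple among infected: {fraction_multiple_among_infected:.3f}")
```

Under this assumption the library alone requires roughly 1.4E7 infected cells at 200-fold coverage, and about 14% of infected cells would be expected to carry more than one gRNA, which is why such cells cannot be excluded entirely.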


4.5 Future Directions

To identify GIs that are conserved across different cancer types, testing GIs in isolation in a panel of cancer cell lines is the most direct method. However, screens with KO cell lines of other genes in the same pathway, or moving the CRISPR/Cas9 KO screens to another host cancer cell line, would add confidence and broader applicability. To this end, our lab has generated NAGK, GNPNAT1, MGAT1 and MGAT5 KO cell lines in the MDA-MB-231 mammary tumor cell line. I have completed genome-wide CRISPR/Cas9 KO screens on the MDA-MB-231 WT, MGAT1 KO and MGAT5 KO cell lines, and the samples have been submitted for DNA sequencing. We predict that these CRISPR/Cas9 screens will reveal a list of candidate GIs, with some genes or pathways that overlap with the HAP1 MGAT1 KO and MGAT5 KO screens. We would like to identify GIs or pathway relationships that are conserved across diverse cancer types. After validation of strong GIs in both the HAP1 and MDA-MB-231 cell lines, further testing will be done in other cancer cell lines.

The competition assay for validation did not work in the HAP1 cells, likely due to poor Cas9 expression as described in the Results (page 67). A growth deficit was observed in the MDA-MB-231 WT cell line when a gRNA targeting PSMD1, an essential gene, was introduced using this method (outlined on pages 43-44). The PSMD1 positive control test will be attempted again in the MDA-MB-231 cell lines, where Cas9 expression appears to be more robust relative to the HAP1 cell lines. We would like to obtain quantified results with the MDA-MB-231 cell lines, as this was not possible in the previous attempt. We also plan to modify the method to introduce Cas9 and gRNA(s) targeting one or two genes on the same vector, delivered by lentivirus.


This proposed system uses a single transfer vector, introduced into the cells via lentiviral infection, which contains Cas9, a gRNA, a fluorescence tag (an mCherry or mClover tag), and a puromycin resistance gene. This allows all of the components to be integrated into the genome at once, so that a cell surviving puromycin selection should also contain Cas9, the gRNA(s), and the fluorescence tag within its genome. This method allows for competition between an mCherry-expressing and an mClover-expressing population. The single vector approach would allow testing of single and dual KOs, introducing the mutations in series or simultaneously into any tumor cell line or into non-transformed cells. It is important to test GIs in different orders, as cells typically adapt to KOs, whereby a second KO introduced into a cell that has already adapted to the first may lead to a different response than if the two KOs were introduced simultaneously.

Finally, if a chemical inhibitor is known for one or both of the genes in a GI pair, the compounds will be tested to determine whether they exert an effect similar to that of the gene’s mutation. The expectations with inhibitors may not always be realized, because these compounds can have off-target effects, and a cell with gene mutations can undergo complex adaptive changes associated with drug resistance.

4.6 Conclusions

This project using CRISPR/Cas9 genome-wide KO screens has identified novel gene interactions with enzymes in the hexosamine biosynthesis and N-glycan branching pathways in the context of cancer. Through validation and further characterization of the gene interactions, we will gain a better understanding of cancer cell vulnerabilities.

Upon expanded testing of these GIs in other cancer cell lines, the information gained will contribute to the gene interaction network across various cancer types. This is particularly valuable, as cancer is a very diverse and robust disease, making it a moving target for therapy. This project will help us better understand the complexities of the pathway crosstalk that occurs within the cancer cell, allowing a better understanding of where redundancy may lie within the genome and between biochemical pathways. Therefore, the novel relationships identified may present opportunities for therapeutic translation, by targeting these pathways alongside the newly revealed vulnerabilities.


References

1. Smith L, Bryan S, De P, et al. Canadian Cancer Statistics: A 2018 special report on cancer incidence by stage. 2018. https://www.cancer.ca/~/media/cancer.ca/CW/cancer information/cancer 101/Canadian cancer statistics/Canadian-Cancer-Statistics-2018- EN.pdf?la=en. Accessed May 19, 2019.

2. Gatenby R, Brown J. The Evolution and Ecology of Resistance in Cancer Therapy. Cold Spring Harb Perspect Med. July 2017. doi:10.1101/cshperspect.a033415

3. Kanarek N, Keys HR, Cantor JR, et al. Histidine catabolism is a major determinant of methotrexate sensitivity. Nature. 2018;559(7715):632-636. doi:10.1038/s41586-018-0316-7

4. Costanzo M, VanderSluis B, Koch EN, et al. A global genetic interaction network maps a wiring diagram of cellular function. Science. 2016;353(6306):aaf1420. doi:10.1126/science.aaf1420

5. Shalem O, Sanjana NE, Hartenian E, et al. Genome-scale CRISPR-Cas9 knockout screening in human cells. Science. 2014;343(6166):84-87. doi:10.1126/science.1247005

6. Wang T, Wei JJ, Sabatini DM, Lander ES. Genetic screens in human cells using the CRISPR-Cas9 system. Science. 2014;343(6166):80-84. doi:10.1126/science.1246981

7. Hart T, Chandrashekhar M, Aregger M, et al. High-Resolution CRISPR Screens Reveal Fitness Genes and Genotype-Specific Cancer Liabilities. Cell. 2015;163(6):1515-1526. doi:10.1016/j.cell.2015.11.015

8. Shen JP, Zhao D, Sasik R, et al. Combinatorial CRISPR-Cas9 screens for de novo mapping of genetic interactions. Nat Methods. 2017;14(6):573-576. doi:10.1038/nmeth.4225

9. O’Neil NJ, Bailey ML, Hieter P. Synthetic lethality and cancer. Nat Rev Genet. 2017;18(10):613-623. doi:10.1038/nrg.2017.47

10. Wise DR, DeBerardinis RJ, Mancuso A, et al. Myc regulates a transcriptional program that stimulates mitochondrial glutaminolysis and leads to glutamine addiction. Proc Natl Acad Sci. 2008;105(48):18782-18787. doi:10.1073/pnas.0810199105

11. Guo JY, Chen H-Y, Mathew R, et al. Activated Ras requires autophagy to maintain oxidative metabolism and tumorigenesis. Genes Dev. 2011;25(5):460-470. doi:10.1101/gad.2016311

12. Yun J, Rago C, Cheong I, et al. Glucose deprivation contributes to the development of KRAS pathway mutations in tumor cells. Science. 2009;325(5947):1555-1559. doi:10.1126/science.1174229

13. Patel SJ, Sanjana NE, Kishton RJ, et al. Identification of essential genes for cancer immunotherapy. Nature. 2017;548(7669):537-542. doi:10.1038/nature23477


14. Schulze A, Harris AL. How cancer metabolism is tuned for proliferation and vulnerable to disruption. Nature. 2012;491(7424):364-373. doi:10.1038/nature11706

15. Taparra K, Tran PT, Zachara NE. Hijacking the Hexosamine Biosynthetic Pathway to Promote EMT-Mediated Neoplastic Phenotypes. Front Oncol. 2016;6:85. doi:10.3389/fonc.2016.00085

16. Dennis JW, Nabi IR, Demetriou M. Metabolism, Cell Surface Organization, and Disease. Cell. 2009;139(7):1229-1241. doi:10.1016/j.cell.2009.12.008

17. Abdel Rahman AM, Ryczko M, Nakano M, et al. Golgi N-glycan branching N-acetylglucosaminyltransferases I, V and VI promote nutrient uptake and metabolism. Glycobiology. 2015;25(2):225-240. doi:10.1093/glycob/cwu105

18. Pham L V., Bryant JL, Mendez R, et al. Targeting the hexosamine biosynthetic pathway and O-linked N-acetylglucosamine cycling for therapeutic and imaging capabilities in diffuse large B-cell lymphoma. Oncotarget. 2016;7(49):80599-80611. doi:10.18632/oncotarget.12413

19. Nabi IR, Shankar J, Dennis JW. The galectin lattice at a glance. J Cell Sci. 2015;128(13):2213-2219. doi:10.1242/jcs.151159

20. Dennis JW, Nabi IR, Demetriou M. Metabolism, Cell Surface Organization, and Disease. Cell. 2009;139(7):1229-1241. doi:10.1016/j.cell.2009.12.008

21. Demetriou M, Granovsky M, Quaggin S, Dennis JW. Negative regulation of T-cell activation and autoimmunity by Mgat5 N-glycosylation. Nature. 2001;409(6821):733-739. doi:10.1038/35055582

22. Azimzadeh Irani M, Kannan S, Verma C. Role of N-glycosylation in EGFR ectodomain ligand binding. Proteins Struct Funct Bioinforma. 2017;85(8):1529-1549. doi:10.1002/prot.25314

23. Zavareh RB, Sukhai MA, Hurren R, et al. Suppression of Cancer Progression by MGAT1 shRNA Knockdown. PLoS One. 2012;7(9). doi:10.1371/journal.pone.0043721

24. Granovsky M, Fata J, Pawling J, Muller WJ, Khokha R, Dennis JW. Suppression of tumor growth and metastasis in Mgat5-deficient mice. Nat Med. 2000;6(3):306-312. doi:10.1038/73163

25. Guo H-B, Johnson H, Randolph M, Nagy T, Blalock R, Pierce M. Specific posttranslational modification regulates early events in mammary carcinoma formation. Proc Natl Acad Sci U S A. 2010;107(49):21116-21121. doi:10.1073/pnas.1013405107

26. Cheung P, Dennis JW. Mgat5 and Pten interact to regulate cell growth and polarity. Glycobiology. 2007;17(7):767-773. doi:10.1093/glycob/cwm037

27. Partridge EA, Le Roy C, Di Guglielmo GM, et al. Regulation of cytokine receptors by Golgi N-glycan processing and endocytosis. Science. 2004;306(5693):120-124. doi:10.1126/science.1102109

28. Lau KS, Partridge EA, Grigorian A, et al. Complex N-glycan number and degree of branching cooperate to regulate cell proliferation and differentiation. Cell. 2007;129(1):123-134. doi:10.1016/j.cell.2007.01.049

29. Mendelsohn R, Cheung P, Berger L, et al. Control of tumor metabolism and growth by N-glycan processing. Cancer Res. 2007;67:9771-9780.

30. Oki T, Yamazaki K, Kuromitsu J, Okada M, Tanaka I. cDNA cloning and mapping of a novel subtype of glutamine:fructose-6-phosphate amidotransferase (GFAT2) in human and mouse. Genomics. 1999;57(2):227-234. doi:10.1006/geno.1999.5785

31. Zraika S, Dunlop M, Proietto J, Andrikopoulos S. The hexosamine biosynthesis pathway regulates insulin secretion via protein glycosylation in mouse islets. Arch Biochem Biophys. 2002;405(2):275-279. doi:10.1016/S0003-9861(02)00397-1

32. Chiaradonna F, Ricciardiello F, Palorini R. The Nutrient-Sensing Hexosamine Biosynthetic Pathway as the Hub of Cancer Metabolic Rewiring. Cells. 2018;7(6):53. doi:10.3390/cells7060053

33. Sage AT, Walter LA, Shi Y, et al. Hexosamine biosynthesis pathway flux promotes endoplasmic reticulum stress, lipid accumulation, and inflammatory gene expression in hepatic cells. Am J Physiol Metab. 2010;298(3):E499-E511. doi:10.1152/ajpendo.00507.2009

34. Parker JL, Newstead S. Structural basis of nucleotide sugar transport across the Golgi membrane. Nature. 2017;551(7681):521-524. doi:10.1038/nature24464

35. Wayman JA, Glasscock C, Mansell TJ, DeLisa MP, Varner JD. Improving designer glycan production in Escherichia coli through model-guided metabolic engineering. Metab Eng Commun. 2019;9:e00088. doi:10.1016/J.MEC.2019.E00088

36. Ricciardiello F, Votta G, Palorini R, et al. Inhibition of the Hexosamine Biosynthetic Pathway by targeting PGM3 causes breast cancer growth arrest and apoptosis. Cell Death Dis. 2018;9(3):377. doi:10.1038/s41419-018-0405-4

37. Varki A, Kornfeld S. Historical Background and Overview. Cold Spring Harbor Laboratory Press; 2015. doi:10.1101/GLYCOBIOLOGY.3E.001

38. Sun S, Zhang H. Large-Scale Measurement of Absolute Protein Glycosylation Stoichiometry. Anal Chem. 2015;87(13):6479. doi:10.1021/ACS.ANALCHEM.5B01679

39. Shrimal S, Cherepanova NA, Gilmore R. Cotranslational and posttranslocational N-glycosylation of proteins in the endoplasmic reticulum. Semin Cell Dev Biol. 2015;41:71-78. doi:10.1016/j.semcdb.2014.11.005

40. Jokela TA, Jauhiainen M, Auriola S, et al. Mannose inhibits hyaluronan synthesis by down-regulation of the cellular pool of UDP-N-acetylhexosamines. J Biol Chem. 2008;283(12):7666-7673. doi:10.1074/jbc.M706001200

41. Van Den Steen P, Rudd PM, Dwek RA, Opdenakker G. Concepts and principles of O-linked glycosylation. Crit Rev Biochem Mol Biol. 1998;33(3):151-208. doi:10.1080/10409239891204198

42. You X, Qin H, Ye M. Recent advances in methods for the analysis of protein O-glycosylation at proteome level. J Sep Sci. 2018;41(1):248-261. doi:10.1002/jssc.201700834

43. Joshi HJ, Narimatsu Y, Schjoldager KT, et al. SnapShot: O-Glycosylation Pathways across Kingdoms. Cell. 2018;172(3):632-632.e2. doi:10.1016/j.cell.2018.01.016

44. Brooks SA. Appropriate glycosylation of recombinant proteins for human use: Implications of choice of expression system. Appl Biochem Biotechnol - Part B Mol Biotechnol. 2004;28(3):241-256. doi:10.1385/MB:28:3:241

45. Kim YH, Nakayama T, Nayak J. Glycolysis and the Hexosamine Biosynthetic Pathway as Novel Targets for Upper and Lower Airway Inflammation. Allergy Asthma Immunol Res. 2018;10(1):6-11. doi:10.4168/aair.2018.10.1.6

46. Chatham JC, Marchase RB. Protein O-GlcNAcylation: A critical regulator of the cellular response to stress. Curr Signal Transduct Ther. 2010;5(1):49-59. http://www.ncbi.nlm.nih.gov/pubmed/22308107. Accessed May 17, 2019.

47. Cowman MK. Hyaluronan and Hyaluronan Fragments. In: Advances in Carbohydrate Chemistry and Biochemistry. Vol 74. 2017:1-59. doi:10.1016/bs.accb.2017.10.001

48. Vigetti D, Deleonibus S, Moretto P, et al. Role of UDP-N-acetylglucosamine (GlcNAc) and O-GlcNAcylation of hyaluronan synthase 2 in the control of chondroitin sulfate and hyaluronan synthesis. J Biol Chem. 2012;287(42):35544-35555. doi:10.1074/jbc.M112.402347

49. Rodriguez-Martinez H, Tienthai P, Atikuzzaman M, Vicente-Carrillo A, Rubér M, Alvarez-Rodriguez M. The ubiquitous hyaluronan: Functionally implicated in the oviduct? Theriogenology. 2016;86(1):182-186. doi:10.1016/j.theriogenology.2015.11.025

50. Viola M, Vigetti D, Karousou E, et al. Biology and biotechnology of hyaluronan. Glycoconj J. 2015;32(3-4):93-103. doi:10.1007/s10719-015-9586-6

51. Liu M, Tolg C, Turley E. Dissecting the Dual Nature of Hyaluronan in the Tumor Microenvironment. Front Immunol. 2019;10:947. doi:10.3389/fimmu.2019.00947

52. Hanahan D, Weinberg RA. Hallmarks of Cancer: The Next Generation. Cell. 2011;144(5):646-674. doi:10.1016/J.CELL.2011.02.013

53. Zheng J. Energy metabolism of cancer: Glycolysis versus oxidative phosphorylation (review). Oncol Lett. 2012;4(6):1151-1157. doi:10.3892/ol.2012.928

54. Liberti MV, Locasale JW. The Warburg Effect: How Does it Benefit Cancer Cells? Trends Biochem Sci. 2016;41(3):211-218. doi:10.1016/j.tibs.2015.12.001

55. Kalyanaraman B. Teaching the basics of cancer metabolism: Developing antitumor strategies by exploiting the differences between normal and cancer cell metabolism. Redox Biol. 2017;12:833-842. doi:10.1016/j.redox.2017.04.018

56. Kim NH, Cha YH, Lee J, et al. Snail reprograms glucose metabolism by repressing phosphofructokinase PFKP allowing cancer cell survival under metabolic stress. Nat Commun. 2017;8:14374. doi:10.1038/ncomms14374

57. Papa S, Martino PL, Capitanio G, et al. The Oxidative Phosphorylation System in Mammalian Mitochondria. In: Springer, Dordrecht; 2012:3-37. doi:10.1007/978-94-007-2869-1_1

58. Zimorski V, Mentel M, Tielens AGM, Martin WF. Energy metabolism in anaerobic eukaryotes and Earth’s late oxygenation. Free Radic Biol Med. March 2019. doi:10.1016/j.freeradbiomed.2019.03.030

59. Salamon S, Podbregar E, Kubatka P, et al. Glucose Metabolism in Cancer and Ischemia: Possible Therapeutic Consequences of the Warburg Effect. Nutr Cancer. 2017;69(2):177-183. doi:10.1080/01635581.2017.1263751

60. Wise DR, Thompson CB. Glutamine addiction: a new therapeutic target in cancer. Trends Biochem Sci. 2010;35(8):427-433. doi:10.1016/j.tibs.2010.05.003

61. Muir A, Danai L V, Gui DY, et al. Environmental cystine drives glutamine anaplerosis and sensitizes cancer cells to glutaminase inhibition. Elife. 2017;6:1-27. doi:10.7554/elife.27713

62. Hensley CT, Wasti AT, DeBerardinis RJ. Glutamine and cancer: cell biology, physiology, and clinical opportunities. J Clin Invest. 2013;123(9):3678-3684. doi:10.1172/JCI69600

63. Altman BJ, Stine ZE, Dang C V. From Krebs to clinic: glutamine metabolism to cancer therapy. Nat Rev Cancer. 2016;16(10):619-634. doi:10.1038/nrc.2016.71

64. Tian Y, Du W, Cao S, et al. Systematic analyses of glutamine and glutamate metabolisms across different cancer types. Chin J Cancer. 2017;36(1):1-14. doi:10.1186/s40880-017-0255-y

65. Cairns RA, Harris IS, Mak TW. Regulation of cancer cell metabolism. Nat Rev Cancer. 2011;11(2):85-95. doi:10.1038/nrc2981

66. Yoshii Y, Furukawa T, Saga T, Fujibayashi Y. Acetate/acetyl-CoA metabolism associated with cancer fatty acid synthesis: Overview and application. doi:10.1016/j.canlet.2014.02.019

67. Tong L, Chuang C-C, Wu S, Zuo L. Reactive oxygen species in redox cancer therapy. Cancer Lett. 2015;367(1):18-25. doi:10.1016/J.CANLET.2015.07.008

68. Galaris D, Skiada V, Barbouti A. Redox signaling and cancer: The role of “labile” iron. Cancer Lett. 2008;266(1):21-29. doi:10.1016/j.canlet.2008.02.038

69. Mullen PJ, Yu R, Longo J, Archer MC, Penn LZ. The interplay between cell signaling and the mevalonate pathway in cancer. Nat Rev Cancer. 2016;16(11). doi:10.1038/nrc.2016.76

70. Ying H, Kimmelman AC, Lyssiotis CA, et al. Oncogenic Kras Maintains Pancreatic Tumors through Regulation of Anabolic Glucose Metabolism. Cell. 2012;149(3):656-670. doi:10.1016/j.cell.2012.01.058

71. Wellen KE, Lu C, Mancuso A, et al. The hexosamine biosynthetic pathway couples growth factor-induced glutamine uptake to glucose metabolism. Genes Dev. 2010;24(24):2784-2799. doi:10.1101/gad.1985910

72. Collins FS, Morgan M, Patrinos A. The Human Genome Project: lessons from large-scale biology. Science. 2003;300(5617):286-290. doi:10.1126/science.1084564

73. Gilbert SF. Determining the Function of Genes during Development. 2000. https://www.ncbi.nlm.nih.gov/books/NBK10094/. Accessed May 18, 2019.

74. Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P. Studying Gene Expression and Function. 2002. https://www.ncbi.nlm.nih.gov/books/NBK26818/. Accessed May 18, 2019.

75. Zhang F, Wen Y, Guo X. CRISPR/Cas9 for genome editing: progress, implications and challenges. Hum Mol Genet. 2014;23(R1):R40-R46. doi:10.1093/hmg/ddu125

76. Rao DD, Vorhies JS, Senzer N, Nemunaitis J. siRNA vs. shRNA: Similarities and differences. Adv Drug Deliv Rev. 2009;61(9):746-759. doi:10.1016/j.addr.2009.04.004

77. Huang J, Wang Y, Zhao J. CRISPR editing in biological and biomedical investigation. J Cell Physiol. 2018;233(5):3875-3891. doi:10.1002/jcp.26141

78. Nerys-Junior A, Braga-Dias LP, Pezzuto P, Cotta-de-Almeida V, Tanuri A. Comparison of the editing patterns and editing efficiencies of TALEN and CRISPR-Cas9 when targeting the human CCR5 gene. Genet Mol Biol. 2018;41(1):167-179. doi:10.1590/1678-4685-GMB-2017-0065

79. Qi X, Zhang J, Zhao Y, et al. The applications of CRISPR screen in functional genomics. Brief Funct Genomics. 2017;16(1):34-37. doi:10.1093/bfgp/elw020

80. Gaj T, Gersbach CA, Barbas CF, III. ZFN, TALEN, and CRISPR/Cas-based methods for genome engineering. Trends Biotechnol. 2013;31(7):397-405. doi:10.1016/j.tibtech.2013.04.004

81. Komor AC, Badran AH, Liu DR. CRISPR-Based Technologies for the Manipulation of Eukaryotic Genomes. Cell. 2017;168(1-2):20-36. doi:10.1016/j.cell.2016.10.044

82. Ma Y, Zhang L, Huang X. Genome modification by CRISPR/Cas9. FEBS J. 2014;281(23):5186-5193. doi:10.1111/febs.13110

83. Hart T, Brown KR, Sircoulomb F, Rottapel R, Moffat J. Measuring error rates in genomic perturbation screens: gold standards for human functional genomics. Mol Syst Biol. 2014;10(7):733. doi:10.15252/MSB.20145216

84. Boone C, Bussey H, Andrews BJ. Exploring genetic interactions and networks with yeast. Nat Rev Genet. 2007;8(6):437-449. doi:10.1038/nrg2085

85. Kelley R, Ideker T. Systematic interpretation of genetic interactions using protein networks. Nat Biotechnol. 2005;23(5):561-566. doi:10.1038/nbt1096

86. Cooper G. The Cell: A Molecular Approach. 2nd edition. Sunderland (MA): Sinauer Associates; 2000. https://www.ncbi.nlm.nih.gov/books/NBK9846/. Accessed May 18, 2019.

87. Willyard C. New human gene tally reignites debate. Nature. 2018;558(7710). June 2018.

88. Jo B-S, Choi SS. Introns: The Functional Benefits of Introns in Genomes. Genomics Inform. 2015;13(4):112-118. doi:10.5808/GI.2015.13.4.112

89. Mironov AA, Fickett JW, Gelfand MS. Frequent alternative splicing of human genes. Genome Res. 1999;9(12):1288-1293. http://www.ncbi.nlm.nih.gov/pubmed/10613851. Accessed June 2, 2019.

90. Taniguchi N, Honke K, Fukuda M, Narimatsu H, Yamaguchi Y, Angata T. Handbook of Glycosyltransferases and Related Genes, second edition. Handb Glycosyltransferases Relat Genes, Second Ed. 2014;1-2:1-1707. doi:10.1007/978-4-431-54240-7

91. Sakuma T, Nishikawa A, Kume S, Chayama K, Yamamoto T. Multiplex genome engineering in human cells using all-in-one CRISPR/Cas9 vector system. Sci Rep. 2015;4(1):5400. doi:10.1038/srep05400

92. Aregger M, Chandrashekhar M, Tong AHY, Chan K, Moffat J. Pooled Lentiviral CRISPR-Cas9 Screens for Functional Genomics in Mammalian Cells. In: Humana Press, New York, NY; 2019:169-188. doi:10.1007/978-1-4939-8805-1_15

93. Raudvere U, Kolberg L, Kuzmin I, et al. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res. May 2019. doi:10.1093/nar/gkz369

94. Hart T, Tong A, Chan K, et al. Evaluation and Design of Genome-wide CRISPR/Cas9 Knockout Screens. bioRxiv. March 2017:117341. doi:10.1101/117341

95. Abdel Rahman AM, Pawling J, Ryczko M, Caudy AA, Dennis JW. Targeted metabolomics in cultured cells and tissues by mass spectrometry: Method development and validation. Anal Chim Acta. 2014;845:53-61. doi:10.1016/j.aca.2014.06.012

96. Chong J, Soufan O, Li C, et al. MetaboAnalyst 4.0: towards more transparent and integrative metabolomics analysis. Nucleic Acids Res. 2018;46(W1):W486-W494. doi:10.1093/nar/gky310

97. Schutz GY. Protein Turnover, Ureagenesis and Gluconeogenesis. Int J Vitam Nutr Res. 2011;81(3):101-107. doi:10.1024/0300

98. Chidawanyika T, Sergison E, Cole M, Mark K, Supattapone S. SEC24A identified as an essential mediator of thapsigargin-induced cell death in a genome-wide CRISPR/Cas9 screen. Cell Death Discov. 2018;4(1):115. doi:10.1038/s41420-018-0135-5

99. Koike-Yusa H, Li Y, Tan E-P, Velasco-Herrera MDC, Yusa K. Genome-wide recessive genetic screening in mammalian cells with a lentiviral CRISPR-guide RNA library. Nat Biotechnol. 2014;32(3):267-273. doi:10.1038/nbt.2800

100. Shin JJ, Aftab Q, Austin P, et al. Systematic identification of genes involved in metabolic acid stress resistance in yeast and their potential as cancer targets. Dis Model Mech. 2016;9(9):1039. doi:10.1242/DMM.023374

101. Oza AM, Cibula D, Benzaquen AO, et al. Olaparib combined with chemotherapy for recurrent platinum-sensitive ovarian cancer: A randomised phase 2 trial. Lancet Oncol. 2015;16(1):87-97. doi:10.1016/S1470-2045(14)71135-0

102. Liu JF, Barry WT, Birrer M, et al. Combination cediranib and olaparib versus olaparib alone for women with recurrent platinum-sensitive ovarian cancer: A randomised phase 2 study. Lancet Oncol. 2014;15(11):1207-1214. doi:10.1016/S1470-2045(14)70391-2

103. Cunningham D, Humblet Y, Siena S, et al. Cetuximab Monotherapy and Cetuximab plus Irinotecan in Irinotecan-Refractory Metastatic Colorectal Cancer. N Engl J Med. 2004;351(4):337-345. doi:10.1056/NEJMoa033025

104. Larkin J, Chiarion-Sileni V, Gonzalez R, et al. Combined Nivolumab and Ipilimumab or Monotherapy in Untreated Melanoma. N Engl J Med. 2015;373(1):23-34. doi:10.1056/NEJMoa1504030

105. Guo H-B, Johnson H, Randolph M, Pierce M. Regulation of homotypic cell-cell adhesion by branched N-glycosylation of N-cadherin extracellular EC2 and EC3 domains. J Biol Chem. 2009;284(50):34986-34997. doi:10.1074/jbc.M109.060806

106. Yu X, Zhao Y, Wang L, et al. Sialylated β1,6 branched N-glycans modulate the adhesion, invasion and metastasis of hepatocarcinoma cells. Biomed Pharmacother. 2016;84:1654-1661. doi:10.1016/j.biopha.2016.10.085

107. Islam MA, Sharif SR, Lee H, Moon IS. N-Acetyl-D-Glucosamine Kinase Promotes the Axonal Growth of Developing Neurons. Mol Cells. 2015;38(10):876-885. doi:10.14348/molcells.2015.0120

108. Zhang Y, Zhao J, Zhang X, Guo H, Liu F, Chen H. Relations of the type and branch of surface N-glycans to cell adhesion, migration and integrin expressions. Mol Cell Biochem. 2004;260(1-2):137-146. http://www.ncbi.nlm.nih.gov/pubmed/15228095. Accessed June 1, 2019.

109. Carvalho S, Catarino TA, Dias AM, et al. Preventing E-cadherin aberrant N-glycosylation at Asn-554 improves its critical function in gastric cancer. Oncogene. 2016;35(13):1619-1631. doi:10.1038/onc.2015.225

110. Nita-Lazar M, Noonan V, Rebustini I, Walker J, Menko AS, Kukuruzinska MA. Overexpression of DPAGT1 leads to aberrant N-glycosylation of E-cadherin and cellular discohesion in oral cancer. Cancer Res. 2009;69(14):5673-5680. doi:10.1158/0008-5472.CAN-08-4512

111. Pinho SS, Seruca R, Gärtner F, et al. Modulation of E-cadherin function and dysfunction by N-glycosylation. Cell Mol Life Sci. 2011;68(6):1011-1020. doi:10.1007/s00018-010-0595-0

112. Ward JL, Leung GPH, Toan S-V, Tse C-M. Functional analysis of site-directed glycosylation mutants of the human equilibrative nucleoside transporter-2. Arch Biochem Biophys. 2003;411(1):19-26. http://www.ncbi.nlm.nih.gov/pubmed/12590919. Accessed May 30, 2019.

113. Shen F, Wang H, Zheng X, Ratnam M. Expression levels of functional folate receptors alpha and beta are related to the number of N-glycosylated sites. Biochem J. 1997;327(Pt 3):759-764. http://www.ncbi.nlm.nih.gov/pubmed/9581553. http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=PMC1218854.

114. Ducker GS, Rabinowitz JD. One-Carbon Metabolism in Health and Disease. Cell Metab. 2017;25(1):27-42. doi:10.1016/J.CMET.2016.08.009

115. Liu X, Paila UD, Teraoka SN, et al. Identification of ATIC as a Novel Target for Chemoradiosensitization. Int J Radiat Oncol. 2018;100(1):162-173. doi:10.1016/j.ijrobp.2017.08.033

116. Li M, Jin C, Xu M, Zhou L, Li D, Yin Y. Bifunctional enzyme ATIC promotes propagation of hepatocellular carcinoma by regulating AMPK-mTOR-S6K1 signaling. Cell Commun Signal. 2017;15(1):52. doi:10.1186/s12964-017-0208-8

117. Galper J, Rayner SL, Hogan AL, et al. Cyclin F: A component of an E3 ubiquitin ligase complex with roles in neurodegeneration and cancer. Int J Biochem Cell Biol. 2017;89:216-220. doi:10.1016/j.biocel.2017.06.011

118. Deshmukh RS, Sharma S, Das S. Cyclin F-Dependent Degradation of RBPJ Inhibits IDH1 R132H-Mediated Tumorigenesis. Cancer Res. 2018;78(22):6386-6398. doi:10.1158/0008-5472.CAN-18-1772

119. Nagarajan P, Onami TM, Rajagopalan S, Kania S, Donnell R, Venkatachalam S. Role of chromodomain helicase DNA-binding protein 2 in DNA damage response signaling and tumorigenesis. Oncogene. 2009;28(8):1053-1062. doi:10.1038/onc.2008.440

120. Rodriguez D, Bretones G, Quesada V, et al. Mutations in CHD2 cause defective association with active chromatin in chronic lymphocytic leukemia. Blood. 2015;126(2):195-202. doi:10.1182/blood-2014-10-604959

121. Majewski IJ, Kluijt I, Cats A, et al. An α-E-catenin (CTNNA1) mutation in hereditary diffuse gastric cancer. J Pathol. 2013;229(4):621-629. doi:10.1002/path.4152

122. Ceppi F, Gagné V, Douyon L, et al. DNA variants in DHFR gene and response to treatment in children with childhood B ALL: revisited in AIEOP-BFM protocol. Pharmacogenomics. 2018;19(2):105-112. doi:10.2217/pgs-2017-0153

123. Huang X, Spencer GJ, Lynch JT, Ciceri F, Somerville TDD, Somervaille TCP. Enhancers of Polycomb EPC1 and EPC2 sustain the oncogenic potential of MLL leukemia stem cells. Leukemia. 2014;28(5):1081-1091. doi:10.1038/leu.2013.316

124. Brunetti M, Gorunova L, Davidson B, Heim S, Panagopoulos I, Micci F. Identification of an EPC2-PHF1 fusion transcript in low-grade endometrial stromal sarcoma. Oncotarget. 2018;9(27):19203-19208. doi:10.18632/oncotarget.24969

125. Cong X, Lu C, Huang X, et al. Increased expression of glycinamide ribonucleotide transformylase is associated with a poor prognosis in hepatocellular carcinoma, and it promotes liver cancer cell proliferation. Hum Pathol. 2014;45(7):1370-1378. doi:10.1016/j.humpath.2013.11.021

126. Tsukihara H, Tsunekuni K, Takechi T. Folic Acid-Metabolizing Enzymes Regulate the Antitumor Effect of 5-Fluoro-2′-Deoxyuridine in Colorectal Cancer Cell Lines. Xu B, ed. PLoS One. 2016;11(9):e0163961. doi:10.1371/journal.pone.0163961

127. Szymura SJ, Zaemes JP, Allison DF, et al. NF-κB upregulates glutamine-fructose-6-phosphate transaminase 2 to promote migration in non-small cell lung cancer. Cell Commun Signal. 2019;17(1):24. doi:10.1186/s12964-019-0335-5

128. Hoekstra E, Das AM, Willemsen M, et al. Lipid phosphatase SHIP2 functions as oncogene in colorectal cancer by regulating PKB activation. Oncotarget. 2016;7(45):73525-73540. doi:10.18632/oncotarget.12321

129. Zhou Y-L, Zheng C, Chen Y-T, Chen X-M. Underexpression of INPPL1 is associated with aggressive clinicopathologic characteristics in papillary thyroid carcinoma. Onco Targets Ther. 2018;11:7725-7731. doi:10.2147/OTT.S185803

130. Pichler M, Stiegelbauer V, Vychytilova-Faltejskova P, et al. Genome-Wide miRNA Analysis Identifies miR-188-3p as a Novel Prognostic Marker and Molecular Factor Involved in Colorectal Carcinogenesis. Clin Cancer Res. 2017;23(5):1323-1333. doi:10.1158/1078-0432.CCR-16-0497

131. Lai Y, Xu P, Liu J, et al. Decreased expression of the long non-coding RNA MLLT4 antisense RNA 1 is a potential biomarker and an indicator of a poor prognosis for gastric cancer. Oncol Lett. 2017;14(3):2629-2634. doi:10.3892/ol.2017.6478

132. Yu H, Wang H, Xu H-R, et al. Overexpression of MTHFD1 in hepatocellular carcinoma predicts poorer survival and recurrence. Future Oncol. 2019;15(15):1771-1780. doi:10.2217/fon-2018-0606

133. Moruzzi S, Guarini P, Udali S, et al. One-carbon genetic variants and the role of MTHFD1 1958G>A in liver and colon cancer risk according to global DNA methylation. Chiariotti L, ed. PLoS One. 2017;12(10):e0185792. doi:10.1371/journal.pone.0185792

134. Cooper J, Xu Q, Zhou L, et al. Combined Inhibition of NEDD8-Activating Enzyme and mTOR Suppresses NF2 Loss–Driven Tumorigenesis. Mol Cancer Ther. 2017;16(8):1693-1704. doi:10.1158/1535-7163.MCT-16-0821

135. Petrilli AM, Fernández-Valle C. Role of Merlin/NF2 inactivation in tumor biology. Oncogene. 2016;35(5):537-548. doi:10.1038/onc.2015.125

136. Wang X, Jiang L. Effects of ornithine decarboxylase antizyme 1 on the proliferation and differentiation of human oral cancer cells. Int J Mol Med. 2014;34(6):1606-1612. doi:10.3892/ijmm.2014.1961

137. Wu B, Wang X, Ma W, Zheng W, Jiang L. Assay of OAZ1 mRNA Levels in Chronic Myeloid Leukemia Combined with Application of Leukemia PCR Array Identified Relevant Gene Changes Affected by Antizyme. Acta Haematol. 2014;131(3):141-147. doi:10.1159/000353406

138. Goswami MT, Chen G, Chakravarthi BVSK, et al. Role and regulation of coordinately expressed de novo purine biosynthetic enzymes PPAT and PAICS in lung cancer. Oncotarget. 2015;6(27):23445-23461. doi:10.18632/oncotarget.4352

139. Ding C, Fan X, Wu G. Peroxiredoxin 1 - an antioxidant enzyme in cancer. J Cell Mol Med. 2017;21(1):193-202. doi:10.1111/jcmm.12955

140. Nicolussi A, D’Inzeo S, Mincione G, et al. PRDX1 and PRDX6 are repressed in papillary thyroid carcinomas via BRAF V600E-dependent and -independent mechanisms. Int J Oncol. 2014;44(2):548-556. doi:10.3892/ijo.2013.2208

141. Chu G, Li J, Zhao Y, et al. Identification and verification of PRDX1 as an inflammation marker for colorectal cancer progression. Am J Transl Res. 2016;8(2):842-859. http://www.ncbi.nlm.nih.gov/pubmed/27158373. Accessed May 28, 2019.

142. Fang J, Quinones QJ, Holman TL, et al. The H+-Linked Monocarboxylate Transporter (MCT1/SLC16A1): A Potential Therapeutic Target for High-Risk Neuroblastoma. Mol Pharmacol. 2006;70(6):2108-2115. doi:10.1124/mol.106.026245

143. Li KKW, Pang JCS, Ching AKK, et al. miR-124 is frequently down-regulated in medulloblastoma and is a negative regulator of SLC16A1. Hum Pathol. 2009;40(9):1234-1243. doi:10.1016/j.humpath.2009.02.003

144. Wang H, Yan H, Fu A, Han M, Hallahan D, Han Z. TIP-1 Translocation onto the Cell Plasma Membrane Is a Molecular Biomarker of Tumor Response to Ionizing Radiation. Aziz SA, ed. PLoS One. 2010;5(8):e12051. doi:10.1371/journal.pone.0012051

145. Han M, Wang H, Zhang H-T, Han Z. The PDZ protein TIP-1 facilitates cell migration and pulmonary metastasis of human invasive breast cancer cells in athymic mice. Biochem Biophys Res Commun. 2012;422(1):139-145. doi:10.1016/j.bbrc.2012.04.123

146. Forbes SA, Beare D, Boutselakis H, et al. COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res. 2017;45(D1):D777-D783. doi:10.1093/nar/gkw1121

147. Perri F, Pisconti S, Della Vittoria Scarpati G. P53 mutations and cancer: a tight linkage. Ann Transl Med. 2016;4(24):522. doi:10.21037/atm.2016.12.40

148. Ise H, Yamasaki S, Sueyoshi K, Miura Y. Elucidation of GlcNAc-binding properties of type III intermediate filament proteins, using GlcNAc-bearing polymers. Genes to Cells. 2017;22(10):900-917. doi:10.1111/gtc.12535

149. Ryczko MC, Pawling J, Chen R, et al. Metabolic Reprogramming by Hexosamine Biosynthetic and Golgi N-Glycan Branching Pathways. Sci Rep. 2016;6:23043. doi:10.1038/srep23043

150. Hesketh GG, Dennis JW. N-acetylglucosamine: more than a silent partner in insulin resistance. Glycobiology. 2017;27(7):595-598. doi:10.1093/glycob/cwx035

151. Kubomura D, Ueno T, Yamada M, Nagaoka I. Evaluation of the chondroprotective action of N-acetylglucosamine in a rat experimental osteoarthritis model. Exp Ther Med. 2017;14(4):3137-3144. doi:10.3892/etm.2017.4849

152. Sun D, Hu F, Gao H, et al. Distribution of abnormal IgG glycosylation patterns from rheumatoid arthritis and osteoarthritis patients by MALDI-TOF-MSn. Analyst. 2019;144(6):2042-2051. doi:10.1039/C8AN02014K

153. Wang H-C, Lin Y-T, Lin T-H, et al. Intra-articular injection of N-acetylglucosamine and hyaluronic acid combined with PLGA scaffolds for osteochondral repair in rabbits. Burns JS, ed. PLoS One. 2018;13(12):e0209747. doi:10.1371/journal.pone.0209747

154. Zhang H, Jia Y, Cooper JJ, Hale T, Zhang Z, Elbein SC. Common Variants in Glutamine:Fructose-6-Phosphate Amidotransferase 2 (GFPT2) Gene Are Associated with Type 2 Diabetes, Diabetic Nephropathy, and Increased GFPT2 mRNA Levels. J Clin Endocrinol Metab. 2004;89(2):748-755. doi:10.1210/jc.2003-031286

155. Broschat KO, Gorka C, Page JD, et al. Kinetic characterization of human glutamine-fructose-6-phosphate amidotransferase I: potent feedback inhibition by glucosamine 6-phosphate. J Biol Chem. 2002;277(17):14764-14770. doi:10.1074/jbc.M201056200

156. Fulda S. Regulation of necroptosis signaling and cell death by reactive oxygen species. Biol Chem. 2016;397(7):657-660. doi:10.1515/hsz-2016-0102

157. Belhadj Slimen I, Najar T, Ghram A, Dabbebi H, Ben Mrad M, Abdrabbah M. Reactive oxygen species, heat stress and oxidative-induced mitochondrial damage. A review. Int J Hyperth. 2014;30(7):513-523. doi:10.3109/02656736.2014.971446

158. Campbell RM, Metzler M, Granovsky M, Dennis JW, Marth JD. Complex asparagine-linked oligosaccharides in Mgat1-null embryos. Glycobiology. 1995;5(5):535-543.

Appendices

Appendix 1: pSTV6-PGK-P2R N-mCherry backbone vector map. The β-actin gene was cloned into this vector by Gateway cloning, generating N-terminally mCherry-tagged β-actin.

Appendix 2: Primers used for PCR1 and PCR2 in the genome-wide CRISPR/Cas9 KO screens.

PCR1 Primers (regular desalted oligos):
Forward (v2.1-F1): GAGGGCCTATTTCCCATGATTC
Reverse (v2.1-R1): GTTGCGAAAAAGAACGTTCACGG

Table 1: PCR2 Primers.

| Sample | Timepoint | Primer type | Primer Name | Sequence |
| --- | --- | --- | --- | --- |
| HAP1 NAGK KO | T0 | Forward | S503-U6-F* | AATGATACGGCGACCACCGAGATCTACACAGAGGATAACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| HAP1 NAGK KO | T18 A | Forward | F553-tracr-F* | AATGATACGGCGACCACCGAGATCTACACGCTCGGTAACACTCTTTCCCTACACGACGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| HAP1 NAGK KO | T18 B | Forward | F596-U6-F* | AATGATACGGCGACCACCGAGATCTACACGCCAAGACACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| HAP1 NAGK KO | T18 C | Forward | S508-tracr-F* | AATGATACGGCGACCACCGAGATCTACACAGGCTTAGACACTCTTTCCCTACACGACGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| HAP1 NAGK KO | T0 | Reverse | N711-U6-R | CAAGCAGAAGACGGCATACGAGATTGCCTCTTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| HAP1 NAGK KO | T18 A | Reverse | N712-tracr-R | CAAGCAGAAGACGGCATACGAGATTCCTCTACGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| HAP1 NAGK KO | T18 B | Reverse | N703-U6-R | CAAGCAGAAGACGGCATACGAGATTTCTGCCTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| HAP1 NAGK KO | T18 C | Reverse | N714-tracr-R | CAAGCAGAAGACGGCATACGAGATTCATGAGCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| HAP1 MGAT1 KO | T0 | Forward | S515-tracr-F* | AATGATACGGCGACCACCGAGATCTACACAGCTAGAAACACTCTTTCCCTACACGACGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| HAP1 MGAT1 KO | T17 A | Forward | S510-tracr-F* | AATGATACGGCGACCACCGAGATCTACACATTAGACGACACTCTTTCCCTACACGACGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| HAP1 MGAT1 KO | T17 B | Forward | S503-U6-F* | AATGATACGGCGACCACCGAGATCTACACAGAGGATAACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| HAP1 MGAT1 KO | T17 C | Forward | F507-U6-F* | AATGATACGGCGACCACCGAGATCTACACCAGATCTGACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| HAP1 MGAT1 KO | T0 | Reverse | N702-tracr-R | CAAGCAGAAGACGGCATACGAGATCTAGTACGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| HAP1 MGAT1 KO | T17 A | Reverse | N711-tracr-R | CAAGCAGAAGACGGCATACGAGATTGCCTCTTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| HAP1 MGAT1 KO | T17 B | Reverse | N706-U6-R | CAAGCAGAAGACGGCATACGAGATCATGCCTAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| HAP1 MGAT1 KO | T17 C | Reverse | N714-U6-R | CAAGCAGAAGACGGCATACGAGATTCATGAGCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| HAP1 MGAT5 KO | T0 | Forward | S516-U6-F* | AATGATACGGCGACCACCGAGATCTACACACTCTAGGACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| HAP1 MGAT5 KO | T18 A | Forward | F507-tracr-F* | AATGATACGGCGACCACCGAGATCTACACCAGATCTGACACTCTTTCCCTACACGACGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| HAP1 MGAT5 KO | T18 B | Forward | S513-tracr-F* | AATGATACGGCGACCACCGAGATCTACACCTAGTCGAACACTCTTTCCCTACACGACGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| HAP1 MGAT5 KO | T18 C | Forward | S502-U6-F* | AATGATACGGCGACCACCGAGATCTACACATAGAGAGACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| HAP1 MGAT5 KO | T0 | Reverse | N702-U6-R | CAAGCAGAAGACGGCATACGAGATCTAGTACGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| HAP1 MGAT5 KO | T18 A | Reverse | N701-tracr-R | CAAGCAGAAGACGGCATACGAGATTCGCCTTAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| HAP1 MGAT5 KO | T18 B | Reverse | N706-tracr-R | CAAGCAGAAGACGGCATACGAGATCATGCCTAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| HAP1 MGAT5 KO | T18 C | Reverse | N711-U6-R | CAAGCAGAAGACGGCATACGAGATTGCCTCTTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| HAP1 GFPT1 KO | T0 | Forward | S503-tracr-F* | AATGATACGGCGACCACCGAGATCTACACAGAGGATAACACTCTTTCCCTACACGACGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| HAP1 GFPT1 KO | T18 A | Forward | F553-U6-F* | AATGATACGGCGACCACCGAGATCTACACGCTCGGTAACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| HAP1 GFPT1 KO | T18 B | Forward | S505-tracr-F* | AATGATACGGCGACCACCGAGATCTACACCTCCTTACACACTCTTTCCCTACACGACGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| HAP1 GFPT1 KO | T18 C | Forward | S515-U6-F* | AATGATACGGCGACCACCGAGATCTACACAGCTAGAAACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| HAP1 GFPT1 KO | T0 | Reverse | F759-tracr-R | CAAGCAGAAGACGGCATACGAGATTAGAACACGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| HAP1 GFPT1 KO | T18 A | Reverse | N705-U6-R | CAAGCAGAAGACGGCATACGAGATAGGAGTCCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| HAP1 GFPT1 KO | T18 B | Reverse | N710-tracr-R | CAAGCAGAAGACGGCATACGAGATCAGCCTCGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| HAP1 GFPT1 KO | T18 C | Reverse | N703-U6-R | CAAGCAGAAGACGGCATACGAGATTTCTGCCTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| MDA-MB-231 WT | T0 | Forward | S508-U6-F* | AATGATACGGCGACCACCGAGATCTACACAGGCTTAGACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| MDA-MB-231 WT | T27 A | Forward | S511-U6-F* | AATGATACGGCGACCACCGAGATCTACACCGGAGAGAACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| MDA-MB-231 WT | T27 B | Forward | S506-tracr-F* | AATGATACGGCGACCACCGAGATCTACACTATGCAGTACACTCTTTCCCTACACGACGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| MDA-MB-231 WT | T27 C | Forward | F536-tracr-F* | AATGATACGGCGACCACCGAGATCTACACCCAGTTCAACACTCTTTCCCTACACGACGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| MDA-MB-231 WT | T0 | Reverse | N707-U6-R | CAAGCAGAAGACGGCATACGAGATGTAGAGAGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| MDA-MB-231 WT | T27 A | Reverse | N705-U6-R | CAAGCAGAAGACGGCATACGAGATAGGAGTCCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| MDA-MB-231 WT | T27 B | Reverse | N702-tracr-R | CAAGCAGAAGACGGCATACGAGATCTAGTACGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| MDA-MB-231 WT | T27 C | Reverse | N701-tracr-R | CAAGCAGAAGACGGCATACGAGATTCGCCTTAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| MDA-MB-231 MGAT1 | T0 | Forward | S510-U6-F* | AATGATACGGCGACCACCGAGATCTACACATTAGACGACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| MDA-MB-231 MGAT1 | T27 A | Forward | S503-tracr-F* | AATGATACGGCGACCACCGAGATCTACACAGAGGATAACACTCTTTCCCTACACGACGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| MDA-MB-231 MGAT1 | T27 B | Forward | S508-tracr-F* | AATGATACGGCGACCACCGAGATCTACACAGGCTTAGACACTCTTTCCCTACACGACGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| MDA-MB-231 MGAT1 | T27 C | Forward | S502-U6-F* | AATGATACGGCGACCACCGAGATCTACACATAGAGAGACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| MDA-MB-231 MGAT1 | T0 | Reverse | N704-U6-R | CAAGCAGAAGACGGCATACGAGATGCTCAGGAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| MDA-MB-231 MGAT1 | T27 A | Reverse | N706-tracr-R | CAAGCAGAAGACGGCATACGAGATCATGCCTAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| MDA-MB-231 MGAT1 | T27 B | Reverse | F759-tracr-R | CAAGCAGAAGACGGCATACGAGATTAGAACACGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| MDA-MB-231 MGAT1 | T27 C | Reverse | F701-U6-R | CAAGCAGAAGACGGCATACGAGATATCACGTTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| MDA-MB-231 MGAT5 | T0 | Forward | S515-tracr-F* | AATGATACGGCGACCACCGAGATCTACACAGCTAGAAACACTCTTTCCCTACACGACGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| MDA-MB-231 MGAT5 | T33 A | Forward | F507-tracr-F* | AATGATACGGCGACCACCGAGATCTACACCAGATCTGACACTCTTTCCCTACACGACGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| MDA-MB-231 MGAT5 | T33 B | Forward | F596-U6-F* | AATGATACGGCGACCACCGAGATCTACACGCCAAGACACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| MDA-MB-231 MGAT5 | T33 C | Forward | S503-U6-F* | AATGATACGGCGACCACCGAGATCTACACAGAGGATAACACTCTTTCCCTACACGACGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| MDA-MB-231 MGAT5 | T0 | Reverse | N704-tracr-R | CAAGCAGAAGACGGCATACGAGATGCTCAGGAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| MDA-MB-231 MGAT5 | T33 A | Reverse | N705-tracr-R | CAAGCAGAAGACGGCATACGAGATAGGAGTCCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTTTGTGGAAAGGACGAAACACCG |
| MDA-MB-231 MGAT5 | T33 B | Reverse | N710-U6-R | CAAGCAGAAGACGGCATACGAGATCAGCCTCGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |
| MDA-MB-231 MGAT5 | T33 C | Reverse | N715-U6-R | CAAGCAGAAGACGGCATACGAGATCCTGAGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTACTTGCTATTTCTAGCTCTAAAAC |

Appendix 3: gRNAs chosen for validation and the forward and reverse oligos required for proper ligation of each gRNA into the pLCKO backbone vector.

| Gene | gRNA | Forward Oligo | Reverse Oligo |
| --- | --- | --- | --- |
| ATIC | TGAACCAGAGGACTATGTGG | accgTGAACCAGAGGACTATGTGG | aaacCCACATAGTCCTCTGGTTCA |
| CCNF | GAAGAGCCAGATGAAAGGTG | accgGAAGAGCCAGATGAAAGGTG | aaacCACCTTTCATCTGGCTCTTC |
| CHD2 | AAGCAGCCGAAGACTCAGCG | accgAAGCAGCCGAAGACTCAGCG | aaacCGCTGAGTCTTCGGCTGCTT |
| CTNNA1 | GATGCCTCACAGCACCAGGG | accgGATGCCTCACAGCACCAGGG | aaacCCCTGGTGCTGTGAGGCATC |
| DHFR | CCCGGCAGATACCTGAGCGG | accgCCCGGCAGATACCTGAGCGG | aaacCCGCTCAGGTATCTGCCGGG |
| EPC2 | GATCAGGCATGTCCTTGCCG | accgGATCAGGCATGTCCTTGCCG | aaacCGGCAAGGACATGCCTGATC |
| GART | ATGGAATCCCAACCGCACAA | accgATGGAATCCCAACCGCACAA | aaacTTGTGCGGTTGGGATTCCAT |
| GFPT2 | TAAGATAGGGATCTGTTCTG | accgTAAGATAGGGATCTGTTCTG | aaacCAGAACAGATCCCTATCTTA |
| INPPL1 | AGACAGCGAGAGCGTGGCGG | accgAGACAGCGAGAGCGTGGCGG | aaacCCGCCACGCTCTCGCTGTCT |
| MLLT4 | TGGAACAAAGATGATCGGGA | accgTGGAACAAAGATGATCGGGA | aaacTCCCGATCATCTTTGTTCCA |
| MTHFD1 | AAATAAAGGTGACATCCTGG | accgAAATAAAGGTGACATCCTGG | aaacCCAGGATGTCACCTTTATTT |
| NF2 | CATGCGGGAAGCGATGGCCC | accgCATGCGGGAAGCGATGGCCC | aaacGGGCCATCGCTTCCCGCATG |
| OAZ1 | ATTGAGGATCCGCTGCAGGG | accgATTGAGGATCCGCTGCAGGG | aaacCCCTGCAGCGGATCCTCAAT |
| PPAT | TATAAGCAGGGAGTATGCTG | accgTATAAGCAGGGAGTATGCTG | aaacCAGCATACTCCCTGCTTATA |
| PRDX1 | ACTGAAAGCAATGATCTCCG | accgACTGAAAGCAATGATCTCCG | aaacCGGAGATCATTGCTTTCAGT |
| SLC16A1 | AAATGCATAAGAGAAGCCGA | accgAAATGCATAAGAGAAGCCGA | aaacTCGGCTTCTCTTATGCATTT |
| TAX1BP3 | AAGCGCACTCACCACCACGG | accgAAGCGCACTCACCACCACGG | aaacCCGTGGTGGTGAGTGCGCTT |
| AAVS1 | GTCACCAATCCTGTCCCTAG | accgGTCACCAATCCTGTCCCTAG | aaacCTAGGGACAGGATTGGTGAC |
| PLK1 | ACCGGCGAAAGAGATCCCGG | accgACCGGCGAAAGAGATCCCGG | aaacCCGGGATCTCTTTCGCCGGT |
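The oligo pairs listed in Appendix 3 follow a regular pattern: each forward oligo is the gRNA sequence prefixed with the overhang accg, and each reverse oligo is the reverse complement of the gRNA prefixed with aaac. The Python sketch below illustrates that pattern under the assumption that it holds for any gRNA cloned into the same backbone. The function names are hypothetical and are not taken from the thesis; the overhangs are read directly from the table.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase or lowercase DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def ligation_oligos(grna: str, fwd_overhang: str = "accg", rev_overhang: str = "aaac"):
    """Build the forward and reverse oligos for one gRNA.

    The default overhangs are those seen in the Appendix 3 table; treating
    them as fixed for every gRNA is an assumption of this sketch.
    """
    grna = grna.upper()
    forward = fwd_overhang + grna
    reverse = rev_overhang + reverse_complement(grna)
    return forward, reverse


# Example using the ATIC gRNA from the table.
forward, reverse = ligation_oligos("TGAACCAGAGGACTATGTGG")
print(forward)
print(reverse)
```

Keeping the overhangs in lowercase, as in the table, makes it easy to see which bases belong to the cloning overhang and which to the gRNA.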

Appendix 4: Sample pi-score calculation, shown using data from SLC16A1 gRNAs inducing a second KO in the HAP1 NAGK KO cell line.

Table 1: Raw data read counts from sequencing.

| gRNA | T0 | T18 replicate A | T18 replicate B | T18 replicate C |
| --- | --- | --- | --- | --- |
| AAATGCATAAGAGAAGCCGA | 348 | 62 | 31 | 94 |
| AATCGGGCCCAAGCCAACCA | 657 | 315 | 250 | 158 |
| CATGACAGCCAACATTATGG | 692 | 104 | 121 | 87 |
| TACGGAGCTGAGCCACCCGA | 714 | 161 | 86 | 138 |

Table 2: Normalized read counts (to 10 million reads per sample across all gRNAs).

| gRNA | T0 | T18 replicate A | T18 replicate B | T18 replicate C |
| --- | --- | --- | --- | --- |
| AAATGCATAAGAGAAGCCGA | 96.507 | 40.367 | 16.953 | 60.517 |
| AATCGGGCCCAAGCCAACCA | 182.199 | 205.090 | 136.720 | 101.721 |
| CATGACAGCCAACATTATGG | 191.906 | 67.712 | 66.172 | 56.011 |
| TACGGAGCTGAGCCACCCGA | 198.007 | 104.824 | 47.032 | 88.845 |

Table 3: Log2 fold change calculated for each replicate at T18 [=log2(T18/T0)].

| gRNA | T18 replicate A | T18 replicate B | T18 replicate C |
| --- | --- | --- | --- |
| AAATGCATAAGAGAAGCCGA | -1.247 | -2.475 | -0.669 |
| AATCGGGCCCAAGCCAACCA | 0.170 | -0.413 | -0.838 |
| CATGACAGCCAACATTATGG | -1.496 | -1.529 | -1.768 |
| TACGGAGCTGAGCCACCCGA | -0.914 | -2.062 | -1.152 |

Table 4: Average log2 fold change across the three T18 replicates.

| gRNA | T18 |
| --- | --- |
| AAATGCATAAGAGAAGCCGA | -1.464 |
| AATCGGGCCCAAGCCAACCA | -0.360 |
| CATGACAGCCAACATTATGG | -1.598 |
| TACGGAGCTGAGCCACCCGA | -1.376 |

Table 5: Average log2 fold change across all gRNAs targeting the same gene.

| Gene | T18 |
| --- | --- |
| SLC16A1 | -1.199 |

Table 6: Pi-score calculation [(NAGK KO log2 fold change - WT log2 fold change) and adjusted to the growth rate of the two cell lines].

| Gene | NAGK KO log2 fold change | WT log2 fold change | pi-score |
| --- | --- | --- | --- |
| SLC16A1 | -1.183 | 0.161 | -1.278 |
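The steps from Table 2 to Table 5 can be retraced with a short Python sketch. It starts from the normalized read counts, computes the log2 fold change per replicate, averages over replicates for each gRNA, and then averages over the four gRNAs. One detail is an assumption rather than something stated in the appendix: the tabulated fold changes appear to be reproduced only when a pseudocount of 0.5 is added to both counts before taking the ratio, so the sketch uses that value. The growth-rate adjustment behind the values in Table 6 is not described in enough detail here to reproduce, so it is left out. Variable and function names are illustrative.

```python
import math

# Assumption: a pseudocount of 0.5 is inferred from the tabulated values,
# not stated in the appendix caption.
PSEUDOCOUNT = 0.5

# Normalized read counts from Table 2: gRNA -> (T0, [T18 A, T18 B, T18 C]).
normalized = {
    "AAATGCATAAGAGAAGCCGA": (96.507, [40.367, 16.953, 60.517]),
    "AATCGGGCCCAAGCCAACCA": (182.199, [205.090, 136.720, 101.721]),
    "CATGACAGCCAACATTATGG": (191.906, [67.712, 66.172, 56.011]),
    "TACGGAGCTGAGCCACCCGA": (198.007, [104.824, 47.032, 88.845]),
}


def log2_fold_change(t_end: float, t0: float, pseudocount: float = PSEUDOCOUNT) -> float:
    """Log2 ratio of end-point to T0 counts, with a pseudocount on both."""
    return math.log2((t_end + pseudocount) / (t0 + pseudocount))


# Table 3 and Table 4: per-replicate fold changes, then the mean per gRNA.
guide_means = {}
for guide, (t0, replicates) in normalized.items():
    fold_changes = [log2_fold_change(count, t0) for count in replicates]
    guide_means[guide] = sum(fold_changes) / len(fold_changes)

# Table 5: mean across all gRNAs targeting the same gene (SLC16A1 here).
gene_mean = sum(guide_means.values()) / len(guide_means)

for guide, mean in guide_means.items():
    print(guide, round(mean, 3))
print("SLC16A1", round(gene_mean, 3))
```

The printed values should agree with Tables 4 and 5 to within rounding, which is what supports the pseudocount assumption.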

Appendix 5: List of 88 metabolites whose abundance differed significantly in at least one comparison among the following cell lines: HAP1 WT, HAP1 NAGK KO, HAP1 GFPT1 KO, HAP1 MGAT1 KO and HAP1 MGAT5 KO.

1. Alanine
2. Serine
3. Proline
4. Valine
5. Threonine
6. Cysteine
7. Isoleucine
8. Leucine
9. Asparagine
10. Aspartate
11. Glutamine
12. Lysine
13. Glutamate
14. Methionine
15. Histidine
16. Phenylalanine
17. Arginine
18. Tyrosine
19. Tryptophan
20. Citrulline
21. Argininosuccinate
22. Homoserine
23. Hydroxyproline
24. Ketoleucine
25. Cystathionine
26. Glucose-6P
27. Fructose-6P
28. Fructose 1,6-bisphosphate
29. Dihydroxyacetone phosphate (DHAP)
30. 2-Phosphoglycerate
31. Phosphoenolpyruvate (PEP)
32. Pyruvate
33. Lactate
34. 6-Phosphogluconate
35. Xylulose-5P/Ribose-5P
36. UDP-Glucose
37. GDP-Fucose
38. Aconitate
39. (Iso)citrate
40. alpha-ketoglutarate
41. Succinate
42. Fumarate
43. Oxaloacetate
44. Malate
45. Adenosine 5'-monophosphate (AMP)
46. Adenosine diphosphate (ADP)
47. Adenosine triphosphate (ATP)
48. Guanosine monophosphate (GMP)
49. Guanosine 5'-diphosphate (GDP)
50. Guanosine triphosphate (GTP)
51. Cytidine monophosphate (CMP)
52. Cytidine 5'-diphosphate (CDP)
53. Cytidine 5'-triphosphate (CTP)
54. Uridine 5'-monophosphate (UMP)
55. Uridine diphosphate (UDP)
56. Uridine 5'-triphosphate (UTP)
57. Deoxycytidine 5'-monophosphate
58. Nicotinamide adenine dinucleotide
59. NADH
60. NADP
61. NADPH
62. 2'-Deoxyadenosine
63. Spermidine
64. Spermine
65. Glucosamine-6P
66. UDP-GlcNAc
67. N-acetylglucosamine (GlcNAc)
68. Sialic acid (Neu5Ac)
69. SAM
70. Glutathione reduced (GSH)
71. Glutathione oxidized (GSSG)
72. Cystine
73. Glyoxylate
74. Ketobutyrate
75. Acetoacetate
76. Aminoadipate
77. 2,3-Pyridinedicarboxylate
78. N-Acetylglutamate
79. gamma-Aminobutyrate
80. o-Phosphorylethanolamine
81. Glycerol-3P
82. Ascorbic acid
83. myo-inositol
84. Creatine phosphate
85. UDP-glucuronate (UDP-G)
86. Taurine
87. Creatine
88. Ornithine

Appendix 6: Unsupervised heat map of overall metabolite levels in the HAP1 WT and gene-of-interest KO cell lines.
