Pan-Cancer Study of INPP4B Reveals its Unexpected Oncogene-Like Role and Prognostic Significance

By

Irakli Dzneladze

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Department of Medical Biophysics University of Toronto

© Copyright by Irakli Dzneladze, 2017

Irakli Dzneladze Doctor of Philosophy Department of Medical Biophysics, University of Toronto, 2017

ABSTRACT

The work presented in this thesis demonstrates and explores the unexpected oncogenic role of inositol polyphosphate-4-phosphatase, type-II (INPP4B), a gene previously known as a tumor suppressor. Specifically, INPP4Bhigh AML was found to be associated with poor response to therapy and with shorter overall and event-free survival. Multivariate analysis revealed that adding INPP4B expression status to current prognostic models improves the prediction of patient outcome. Furthermore, INPP4Bhigh identifies a poor-risk subgroup among cytogenetically normal patients. In vitro overexpression of INPP4B in AML cell lines revealed that INPP4B contributes to more aggressive AML disease by enhancing colony forming potential, increasing proliferation, and increasing drug resistance. In addition to being the first to identify INPP4B’s prognostic significance in AML, this thesis is among the first studies to uncover the unexpected oncogenic function of this previously known tumor suppressor.

Next, to identify transcriptional regulators of INPP4B in AML and to characterize the pan-cancer prognostic significance of INPP4B expression status, I developed subgroup identifier (SubID), a non-median dichotomization tool for heterogeneous populations. Using SubID, I identified significant co-expression between INPP4B and the transcription factor EVI1. In vitro validation revealed that EVI1 knockdown in EVI1high AML cell lines is associated with a significant decrease in INPP4B levels. Furthermore, chromatin immunoprecipitation demonstrated EVI1 binding to the INPP4B transcription start site region. Next, in my pan-cancer analysis with SubID, I found that INPP4B expression may be associated with patient survival in 13 different cancer types. Following stringent multiple testing and permutation corrections, I observed that INPP4Blow expression status was associated with shorter survival in kidney clear cell, liver hepatocellular, and bladder urothelial carcinomas. Conversely, INPP4Blow status was associated with longer survival in pancreatic adenocarcinoma. The unexpected oncogene-like prognostic significance of INPP4B in pancreatic adenocarcinoma was cross-validated in two additional independent pancreatic cancer datasets.

Overall, the work presented in this thesis provides evidence that INPP4B expression status is associated with patient outcome in both a tumor-suppressive and oncogenic manner depending on the context. In vitro work demonstrates that overexpression of INPP4B in AML contributes to traits associated with a more aggressive disease in a causative rather than associative manner.

ACKNOWLEDGMENTS

This thesis would not have been possible without the support, hard work, and guidance of some of the most incredible people I have had the honor of meeting throughout my graduate journey. The last to be written, this was the hardest section to put together. How could I ever express my gratitude to all the people who helped make one of my life’s greatest accomplishments a reality?

I would like to express my eternal gratitude to my supervisor, mentor, and teacher, Dr. Mark Minden. It is said that great leaders lead by example, demonstrating their own commitment to the values they inspire in others. True to this, you have always been my inspiration for hard work, dedication, passion, scientific curiosity, and perseverance. You have always been a reminder to me that true dedication is continuing to work when everyone else has gone home for the day or has thrown in the towel. I cannot thank you enough for your support and guidance when I needed it the most. It is under your guidance that I was able to develop both scientifically and personally. Thank you for always keeping your office door open to me and my ideas.

I also want to thank Dr. Leonardo Salmena, whose guidance and support helped me publish my first scientific paper. Lenny, together with Dr. Minden, you helped me find the project that led to this thesis, and have supported me throughout the whole process. You helped me to navigate through my graduate journey and develop essential skills. Thank you for encouraging me to pursue my passion in bioinformatics and for all the scientific discussions.

Thank you also to my supervisory committee members. Dr. Linda Penn, your expertise in experimental biology has always been invaluable in helping me optimize my protocols and design the correct experiments. Your methodical approach was also an inspiration for me to maintain my focus and complete the experiments as well and as efficiently as possible. Dr. John McPherson, thank you for always helping me see the big picture. Each time we met, you encouraged me to examine the overall goal of my work and the best way to achieve it. Thank you also for your technical feedback and for helping me uncover the different facets of my work and research questions. Finally, Dr. Daniel de Carvalho, it was under your and Dr. McPherson’s guidance that I took my very first steps in bioinformatics. Your suggestions and feedback helped me develop fundamental skills in biostatistics, bioinformatics, and clinical data analysis. Thank you for always pointing me in the right direction.

I would also like to thank Dr. Jüri Reimand, whose bioinformatic expertise and guidance helped me develop my basic idea of SubID into the tool described in this thesis. Your feedback, advice, and suggestions helped me recognize how SubID can be developed to better fulfill its function.

I am also grateful to all the members of the Minden, Salmena, and Reimand laboratories for all your help, and for making my graduate journey an enjoyable experience. I would like to give a special thank you to Ayesha Rashid, Dr. John Woolley, and Dr. Mike Jain. Your encouragement, motivation, countless hours of discussions, feedback, and suggestions helped me finally reach the finish line. You have helped me cope with the stresses and worries that research entails.

Lastly, I would like to thank my parents and dedicate this thesis to them. Mom and Dad, you laid down the foundation upon which I can build my life’s accomplishments. Thank you for your love and care, and for giving me the opportunities I now have.

TABLE OF CONTENTS

ABSTRACT

ACKNOWLEDGMENTS

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

ABBREVIATIONS

CHAPTER 1. INTRODUCTION

1.1. INOSITOL POLYPHOSPHATE 4-PHOSPHATASE, TYPE II (INPP4B)

1.1.1. Discovery of the 4-Phosphatase and INPP4B

1.1.2. INPP4B Locus and Regulation of Expression

1.1.3. INPP4B Protein and Active Site

1.2. INPP4B BIOLOGICAL ROLE

1.2.1. INPP4B Tumor Suppressor Function

1.2.2. INPP4B Role in DNA Damage Repair, IR and Drug Resistance

1.3. PI3K/AKT PATHWAY

1.3.1. Inositol Signalling and the PI3K/AKT Pathway

1.3.2. INPP4B Reduces AKT Activation

1.4. ACUTE MYELOID LEUKEMIA (AML)

1.4.1. Acute Myeloid Leukemia

1.4.2. Common Genetic Abnormalities in AML

1.5. THESIS RATIONALE, HYPOTHESES AND OBJECTIVES

CHAPTER 2. INPP4B OVEREXPRESSION IS ASSOCIATED WITH POOR CLINICAL OUTCOME AND THERAPY RESISTANCE IN ACUTE MYELOID LEUKEMIA

2.1. ABSTRACT

2.2. INTRODUCTION

2.3. MATERIALS AND METHODS

2.3.1. Patient Samples

2.3.2. AML Dataset Analysis

2.3.3. Clinical Definitions

2.3.4. Cell Culture

2.3.5. DNA Plasmids

2.3.6. Lentivirus Production

2.3.7. Immunoblotting and Immunofluorescence

2.3.8. Quantitative RT-PCR

2.3.9. Methylcellulose Colony Forming Cell Assay

2.3.10. Drug/Ionizing Radiation Treatment

2.3.11. Phosphatase Assay

2.4. RESULTS

2.4.1. High Levels of INPP4B Expression are Associated with Poor Outcome in AML

2.4.2. INPP4Bhigh is an Independent Prognostic Marker in AML

2.4.3. INPP4Bhigh is Associated with Poor Outcome in AML with Normal Cytogenetics

2.4.4. Ectopic Overexpression of INPP4B Leads to Increased Colony Forming Potential and Increased Proliferation in AML Cell Lines

2.4.5. INPP4B Overexpression in AML Cells Leads to Reduced Sensitivity to Daunorubicin and Ionizing Radiation

2.4.6. Phosphatase Dependence of INPP4B-Mediated Phenotypes in AML

2.5. DISCUSSION

CHAPTER 3. PAN-CANCER ANALYSIS OF INPP4B REVEALS ITS OPPOSING PROGNOSTIC ROLE

3.1. ABSTRACT

3.2. INTRODUCTION

3.3. MATERIALS AND METHODS

3.3.1. Expression and Clinical Data

3.3.2. Subgroup Identifier (SubID)

3.3.3. Expression-Based Subgrouping (SubID with CoxPH Test)

3.3.4. Expression-Based Subgrouping (SubID with Fisher’s Exact Test)

3.3.5. Survival, Univariate and Multivariate Analysis

3.3.6. Cell Culture

3.3.7. Lentiviral Infection

3.3.8. Immunoblotting and Chromatin Immunoprecipitation

3.4. RESULTS

3.4.1. SubID Development and Testing in AML

3.4.2. SubID Identifies INPP4Bhigh AML Signature and Potential Transcriptional Regulators

3.4.3. Cut-off Optimization Offers Improved Identification of Prognostic Significance

3.4.4. The Relationship Between INPP4B and Patient Survival is Context Dependent

3.5. DISCUSSION

CHAPTER 4. DISCUSSION

4.1. THE UNEXPECTED ONCOGENE-LIKE ROLE OF INPP4B IN AML

4.1.1. Summary of Results

4.1.2. Relevant New Literature

4.1.3. Significance of Findings

4.1.4. Outstanding Questions and Future Directions

4.2. SUBID: A CUT-OFF OPTIMIZATION AND SUBGROUP IDENTIFICATION TOOL

4.2.1. Summary of Results

4.2.2. Significance of Findings

4.2.3. Outstanding Questions and Future Directions

4.3. POTENTIAL TRANSCRIPTIONAL REGULATORS OF INPP4B

4.3.1. Summary of Results

4.3.2. Significance of Findings

4.3.3. Outstanding Questions and Future Directions

4.4. PAN-CANCER PROGNOSTIC SIGNIFICANCE OF INPP4B

4.4.1. Summary of Results

4.4.2. Significance of Findings

4.4.3. Outstanding Questions and Future Directions

4.5. CONCLUDING REMARKS

REFERENCES

LIST OF TABLES

CHAPTER 1

Table 1.1. NCCN risk classification system

CHAPTER 2

Table 2.1. Clinical characteristics, treatment effects and survival outcomes

Table 2.2. Molecular associations with INPP4B in AML

Table 2.3. Multivariate analysis and model comparison of survival in total, CN and intermediate cytogenetic risk group AML patients

Table 2.4. Multivariate analysis of OS in AML

Table 2.5. Multivariate analysis of EFS in AML

CHAPTER 3

Table 3.1. Cancers with prognostically significant INPP4B expression status

Table 3.2. Datasets with no significant cut-off within the 10 to 90% range

Table 3.3. Clinical characteristics for KIRC

Table 3.4. Clinical characteristics for LIHC

Table 3.5. Clinical characteristics for PAAD

LIST OF FIGURES

CHAPTER 1

Figure 1.1. INPP4A and INPP4B expression across tissues

Figure 1.2. INPP4B protein locus and transcriptional regulators

Figure 1.3. INPP4B protein domains and active site

Figure 1.4. INPP4B is a pan-cancer tumor suppressor

Figure 1.5. PI3K/AKT pathway

CHAPTER 2

Figure 2.1. INPP4B expression in peripheral blood versus bone marrow

Figure 2.2. High levels of INPP4B are observed in a subset of AML patient samples

Figure 2.3. INPP4Bhigh AML patients have lower complete remission rates and shorter survival

Figure 2.4. INPP4Bhigh AML patients have lower complete remission rates

Figure 2.5. INPP4Bhigh AML patients have shorter survival independent of transplantation status

Figure 2.6. INPP4Bhigh constitutes a significant hazard in total and CN-AML

Figure 2.7. ROC comparison of INPP4B and several other expression based clinical outcome markers

Figure 2.8. INPP4Bhigh constitutes a significant hazard in intermediate cytogenetic risk group AML

Figure 2.9. Ectopic overexpression of INPP4B in AML cells leads to increased colony forming potential and proliferation

Figure 2.10. Ectopic overexpression of INPP4B in AML cells leads to increased proliferation

Figure 2.11. INPP4B overexpression is associated with resistance to chemotherapy and ionizing radiation

Figure 2.12. Phosphatase dependence of INPP4B-mediated phenotypes in AML

Figure 2.13. INPP4B overexpression does not affect AKT activation levels

CHAPTER 3

Figure 3.1. SubID pipeline

Figure 3.2. SubID testing in AML – HGF example

Figure 3.3. HGF expression distribution in AML

Figure 3.4. HGF prognostic significance based on median dichotomization

Figure 3.5. SubID application to INPP4B across AML datasets

Figure 3.6. INPP4B expression distribution in AML

Figure 3.7. INPP4B prognostic significance in AML based on median dichotomization

Figure 3.8. INPP4Bhigh signature

Figure 3.9. Transcriptional regulators of INPP4B (Verhaak dataset)

Figure 3.10. Potential transcriptional regulators of INPP4B (TCGA LAML dataset)

Figure 3.11. EVI1 regulates INPP4B expression in AML

Figure 3.12. Pan-cancer INPP4B prognostic significance

Figure 3.13. Pan-cancer INPP4B expression distribution

Figure 3.14. Pan-cancer INPP4B prognostic significance based on median dichotomization

Figure 3.15. INPP4Bhigh is associated with poor outcome in pancreatic cancer

Figure 3.16. INPP4B expression distribution in pancreatic adenocarcinoma

Figure 3.17. INPP4B prognostic significance in pancreatic adenocarcinoma based on median dichotomization

CHAPTER 4

Figure 4.1. SGK3 signaling

Figure 4.2. INPP4B is a pan-cancer oncogene

Figure 4.3. INPP4B promotes SGK3 activation

ABBREVIATIONS

AA: Amino acid
AKT (also known as PKB): Protein kinase B
ALL: Acute lymphoblastic leukemia
APL: Acute promyelocytic leukemia
AML: Acute myeloid leukemia
AP2: Adapter protein 2
AR: Androgen receptor
ARE: Androgen response element
ATM: ATM serine/threonine kinase
ATR: ATR serine/threonine kinase
ATRA: All-trans retinoic acid
BAD: BCL2 associated agonist of cell death
bp: Basepairs
BM: Bone marrow
BRCA1: Breast Cancer 1
CA: Cytogenetically abnormal
cDNA: Complementary DNA
CFC: Colony forming cell
CN: Cytogenetically normal
CoxPH: Cox proportional hazards
CR: Complete remission
CST: Cell Signaling Technology
DAG: Diacylglycerol
DiFMUP: 6,8-difluoro-4-methylumbelliferyl phosphate
DNMT3A: DNA cytosine-5-methyltransferase 3A
DSB: Double strand break
DNA: Deoxyribonucleic acid
ECM: Extracellular matrix
EEA1: Early endosome antigen 1
EFS: Event-free survival
EGF: Epidermal growth factor
ER: Estrogen receptor
ETS1: ETS proto-oncogene 1
FAB: French-American-British
FACS: Fluorescence-activated cell sorting
FBS: Fetal bovine serum
FDR: False discovery rate
FLT3: Fms related tyrosine kinase 3
GEO: Gene Expression Omnibus
GSK3: Glycogen synthase kinase 3
HER2 (also known as ERBB2): Erb-B2 receptor tyrosine kinase 2
HMEC: Human mammary epithelial cells
HPLC: High pressure liquid chromatography
HR: Hazard ratio
IF: Immunofluorescence
IGF1: Insulin like growth factor 1
IHC: Immunohistochemistry
INPP4B: Inositol polyphosphate-4-phosphatase, type-II
IP3: Inositol 1,4,5-trisphosphate
IR: Ionizing radiation
kb: Kilo basepairs
LOH: Loss of heterozygosity
LRT: Likelihood ratio test
MDS: Myelodysplasia
MEF: Mouse embryonic fibroblast
miR: Micro RNA
mRNA: Messenger RNA
NCCN: National Comprehensive Cancer Network
NHR2: Nervy Homology 2
NPM1: Nucleophosmin
NR: No response
OCI: Ontario Cancer Institute
OS: Overall survival
PB: Peripheral blood
PBS: Phosphate-buffered saline
PCR: Polymerase chain reaction
PDK1: Phosphoinositide-dependent kinase 1
PgR: Progesterone receptor
PH: Pleckstrin homology
PI: Phosphoinositides
PI3K: Phosphatidylinositol-4,5-bisphosphate 3-kinase
PI3KCA: PI3K catalytic
P70S6K: p70 ribosomal protein S6 kinase
PLA2: Phospholipase A2
PLC: Phospholipase C
PM: Princess Margaret
PTP: Protein tyrosine phosphatase
PTEN: Phosphatase and tensin homolog
PX: Phox homology
pNPP: Para-nitrophenyl phosphate
qPCR: Quantitative PCR
RA: Retinoic acid
RAD51: RAD51 recombinase
RBC: Red blood cell
RFS: Relapse free survival
RNA: Ribonucleic acid
ROC: Receiver operating characteristic
SDM: Site directed mutagenesis
SEM: Standard error of mean
SGK3: Serum and glucocorticoid-regulated kinase 3
SH2: Src homology 2
shRNA: Short hairpin RNA
SHIP1/2: SH2 domain-containing inositol 5-phosphatase 1/2
siRNA: Small interfering RNA
SubID: Subgroup identifier
TCGA: The Cancer Genome Atlas
TORC2 (also known as CRTC2): CREB regulated transcription coactivator 2
UHN: University Health Network
UTR: Untranslated region
WBC: White blood cell
WHO: World Health Organization

CHAPTER 1.

INTRODUCTION

1.1. INOSITOL POLYPHOSPHATE 4-PHOSPHATASE, TYPE II (INPP4B)

Phosphoinositides (PIs) are lipid-based second messengers that regulate key cellular functions such as survival, proliferation, and membrane trafficking (Payrastre et al., 2001). Structurally, PIs are composed of a glycerol backbone attached to two fatty acid chains and an inositol head group that can be reversibly phosphorylated at the D3, D4, and/or D5 positions (Vivanco and Sawyers, 2002). While the fatty acid chains embed the PIs in the membrane, the variably phosphorylated inositol head group acts as a docking site for downstream targets (discussed further in Section 1.3). Because of this vital role, the phosphorylation status of PIs is tightly regulated by kinases and phosphatases in response to growth regulatory signals (Balla, 2013). The discovery and early characterization of the PI 4-phosphatase that is the focus of this dissertation are described below.

1.1.1. Discovery of the 4-Phosphatase and INPP4B

In 1987, Bansal et al. were studying the metabolism of PI(1,3,4)P3 in crude calf brain supernatant using [3H]PI(1,3-[32P]4)P3 (Bansal et al., 1987). Following incubation of the radiolabeled phosphoinositide in this crude extract, the products were fractionated using high pressure liquid chromatography (HPLC). Analysis of the 32P/3H ratios in the HPLC fractions revealed the presence of a PI(1,3)P2 product corresponding to PI(1,3,4)P3 dephosphorylated at the D4 position; the finding of PI(1,3)P2 pointed to the existence of a previously uncharacterized 4-phosphatase. Three years after the initial discovery, Bansal et al. reported the successful purification, from the soluble fraction of calf brain, of a 4-phosphatase responsible for the generation of PI(1,3)P2 (Bansal et al., 1990). Using column chromatography and salting out, the authors purified the protein for in vitro analysis and demonstrated a clear correlation between 4-phosphatase enzyme activity and the presence of a 105 kDa product by Western blot. In accordance with its function of dephosphorylating polyphosphorylated inositols, the newly discovered 4-phosphatase was named inositol polyphosphate 4-phosphatase (INPP4).

Building upon this pioneering work, Norris and Majerus published an enzyme kinetics analysis of the 4-phosphatase purified from rat brain tissue (Norris and Majerus, 1994). The analysis revealed that the 4-phosphatase preferentially bound to PI(3,4)P2, and dephosphorylated it to PI(3)P >900-fold more efficiently than it dephosphorylated PI(1,3,4)P3. The 4-phosphatase protein purified from rat brain was sequenced to create degenerate primers for the isolation of the 4-phosphatase’s cDNA (Norris et al., 1995). The isolated cDNA was shown to contain an open reading frame translating to a 939 amino acid product with a predicted molecular weight of 105,588 Da. Furthermore, Norris et al. were able to isolate a related human 4-phosphatase cDNA from fetal brain. Compared to the rat cDNA, the human cDNA had 90% sequence similarity throughout the coding region and 60% sequence similarity in the 5’ untranslated region (UTR). Interestingly, the human and rat cDNAs translated into proteins with 97% similarity, with most amino acid substitutions being conservative. Functionally, immunoprecipitation of INPP4 revealed a physical interaction between the 4-phosphatase and PI(3,4)P2. Subsequent enzyme assays confirmed that this 4-phosphatase catalyzed the hydrolysis of PI(3,4)P2 to PI(3)P.

While analyzing the 4-phosphatase cDNA sequences produced with the degenerate primers in the 1995 paper, Norris et al. observed the existence of a second cDNA product (Anderson et al., 1997). Further examination of this cDNA sequence confirmed the existence of a second 4-phosphatase isoenzyme, which was named 4-phosphatase type II (in later papers, the names of 4-phosphatase type I and type II were changed to INPP4A and INPP4B, respectively). Similar to INPP4A, the molecular weight of INPP4B was shown to be 105,527 Da. Both isoenzymes hydrolyzed PI(3,4)P2 to PI(3)P, but differed in expression levels across tissues (Figure 1.1). Peptide sequence analysis revealed a 37% similarity between INPP4A and INPP4B at the amino acid level.

Figure 1.1. INPP4A and INPP4B expression across tissues. Heatmaps summarizing the protein (immunohistochemistry) and gene expression (RNAseq) levels of INPP4A and INPP4B across normal human tissues (figure based on data downloaded from The Human Protein Atlas).

1.1.2. INPP4B Locus and Regulation of Expression

Human INPP4B is located on 4q31.21 and spans a region of 820 kb (Figure 1.2). The gene coding sequence is 4 kb long and is comprised of 26 exons (Ferron and Vacher, 2006). Bisulfite sequencing analysis was used to demonstrate the presence of a differentially methylated 717 bp CpG island located in the region of the promoter and first exon (Yuen et al., 2014). Though it is not known what transcription factor is prevented from binding to the 74 CpGs within the island, it has been shown that demethylation of this island results in increased transcription of INPP4B in Epstein-Barr virus positive nasopharyngeal carcinoma cells (Yuen et al., 2014). Though INPP4B is expressed in all tissues, the highest transcript levels are present in breast, uterus, testis, and bladder cells (Figure 1.1).

Figure 1.2. INPP4B protein locus and transcriptional regulators. INPP4B is located on 4q31.21 and is comprised of 23 coding exons (orange) and three non-coding exons (blue). Upstream of the 5’ UTR is a 717 bp-long CpG island (yellow). Shown here are known transcriptional regulators of INPP4B.
In prostate cancer cells, it has been shown that INPP4B mRNA expression is upregulated in response to treatment with synthetic androgen (Hodgson et al., 2011). Specifically, androgen receptors (AR) bound to synthetic androgen translocate into the nucleus, where they bind AR-binding sites (androgen response elements; AREs) at -67 kb and +148.5 kb relative to the INPP4B transcription start site. Androgen-AR complex binding to the AREs was verified using chromatin immunoprecipitation. Similarly, INPP4B expression was shown to be upregulated in bladder cancer cells following estrogen receptor (ER) binding to the -2353 bp site of the INPP4B promoter (Hsu et al., 2014). High ER and progesterone receptor (PgR) expression has also been associated with increased INPP4B expression in 14 independent breast cancer datasets (Fedele et al., 2010). Though direct binding was not shown, loss of the GATA binding protein 1 (GATA1) transcription factor in megakaryocytes was found to result in loss of INPP4B expression (Vyas et al., 2000). INPP4B protein expression was also shown to be downregulated in lung cancer by miR-937 binding to the INPP4B 3’ UTR (Zhang et al., 2016), and in melanoma cells following overexpression of miR-494 and miR-599 (Chi et al., 2015). Lastly, analysis of colon cancer revealed that INPP4B expression is regulated by the ETS proto-oncogene 1 (ETS1) transcription factor binding to the -279 bp and -26 bp binding sites of the INPP4B promoter (Guo et al., 2015).

1.1.3. INPP4B Protein and Active Site

INPP4B is a 105 kDa (924 AA) protein that is primarily localized at the cell membrane and partially within the cytoplasm (Hodgson et al., 2014). Structurally, INPP4B is comprised of a C2 domain (AA 25-149), a Nervy Homology 2 (NHR2) domain (AA 510-544), and a 4-phosphatase active site (AA 842-849) (Figure 1.3) (Agoulnik et al., 2011). Binding of INPP4B to its substrate is mediated by the Ca2+-dependent phospholipid-binding C2 domain. Binding to the Ca2+ ion provides INPP4B’s C2 domain with the positive charge necessary to attract its negatively charged target phospholipid (Lemmon, 2008). The role of the hydrophobic NHR2 domain in INPP4B is currently much less clear. It is known that NHR2 domains are usually involved in mediating protein-protein interactions and oligomerization (Agoulnik et al., 2011; Liu et al., 2006). Consequently, it has been hypothesized that NHR2 allows INPP4B to interact with other proteins. Finally, INPP4B’s 4-phosphatase domain is comprised of a CKSAKDRT sequence (Norris et al., 1997). Similar to other Mg2+-independent phosphatases, the cysteine of the conserved Cys-(Xaa)5-Arg active site mediates the phosphatase reaction by acting as the nucleophile. Modification of the active site cysteine with N-ethylmaleimide has been shown to completely abolish the 4-phosphatase activity. Similarly, mutating the active site cysteine residue to alanine (C842A) or serine (C842S) has been shown to create a phosphatase-dead INPP4B enzyme (Gewinner et al., 2009; Lopez et al., 2013). These phosphatase-dead C842A and C842S mutants have been used to study the importance of phosphatase activity in INPP4B-mediated phenotypes, including in the studies outlined in this dissertation. Additional catalytic site studies revealed that mutation of the K846 and D847 residues also abolishes phosphatase activity (Lopez et al., 2013). K843 has been suggested to play a role in substrate orientation. In addition to its well characterized lipid phosphatase activity, INPP4B has also been shown to possess protein tyrosine phosphatase (PTP) activity (Lopez et al., 2013). In vitro studies demonstrated that INPP4B is able to dephosphorylate para-nitrophenyl phosphate (pNPP) and 6,8-difluoro-4-methylumbelliferyl phosphate (DiFMUP), both synthetic phosphotyrosine analogs. This PTP activity was abolished in the catalytically inactive C842A mutant, but enhanced in the K843M mutant; the K846M mutation did not affect PTP activity.

Figure 1.3. INPP4B protein domains and active site. INPP4B is comprised of three domains: C2 (yellow), NHR2 (green) and a Ca2+-dependent 4-phosphatase domain (blue). The 4-phosphatase catalytic site is comprised of a CKSAKDRT sequence. The effects of active site mutations are listed in red.

1.2. INPP4B BIOLOGICAL ROLE

One of the earliest indications of the importance of INPP4B in cellular function came from a differential expression screen conducted by Vyas et al. (Vyas et al., 2000). The screen was aimed at identifying the cause of hyperproliferation in megakaryocytes lacking the GATA1 transcription factor (GATA1-). Subtractive cDNA hybridization revealed a deficit of INPP4B cDNA in the GATA1- megakaryocytes compared to normal megakaryocytes. Overexpression of INPP4B in GATA1- megakaryocytes reduced cell proliferation and colony formation potential, demonstrating that INPP4B can avert the hyperproliferation phenotype observed in its absence. Similarly, overexpression of INPP4B in NIH 3T3 fibroblasts was shown to reduce cellular proliferation 2.5-fold. In another high-throughput study, shRNA-mediated knockdown of INPP4B was shown to confer anchorage-independent growth ability on previously anchorage-dependent human mammary epithelial cells (HMECs) (Westbrook et al., 2005). Together, these studies provided the first evidence that INPP4B may play a role in regulating proliferation and colony formation potential. Since these pioneering studies, new evidence has been put forward to support INPP4B’s tumor suppressor role in breast cancer and to uncover its tumor suppressive role in other cancers. Furthermore, recent studies have demonstrated that INPP4B also plays an important role in drug resistance, autophagy, and DNA damage repair, as discussed below.

1.2.1. INPP4B Tumor Suppressor Function

Breast Cancer

shRNA-mediated knockdown of INPP4B in HMECs has been shown to result in anchorage-independent growth, increased proliferation, and increased migration as shown by a scratch wound assay (Westbrook et al., 2005; Gewinner et al., 2009). Similarly, INPP4B knockdown in MCF-10A cells was shown to result in increased invasiveness (intravasation through Matrigel assay) and disrupted acini architecture in 3D cultures (Gewinner et al., 2009). shRNA-mediated knockdown of INPP4B in MCF-7 cells was shown to promote cell proliferation in response to serum starvation, colony formation potential, and anchorage-independent growth (Fedele et al., 2010). Lastly, overexpression of INPP4B in MDA-MB-231 cells caused a decrease in proliferation, loss of anchorage-independent growth potential, decreased cell growth, and cell cycle arrest at the G1 stage (Fedele et al., 2010; Sun et al., 2014).

Consistent with the in vitro findings, mouse model studies demonstrated that mice injected in the mammary fat pad with MCF-7 cells carrying shRNA-mediated knockdown of INPP4B developed significantly larger tumors than mice injected with control MCF-7 cells (Fedele et al., 2010). Similarly, mice injected subcutaneously with INPP4B-overexpressing breast cancer SUM149 cells developed significantly smaller tumors than mice injected with control SUM149 cells (Gewinner et al., 2009).

Mechanistically, both studies demonstrated that INPP4B overexpression results in decreased activation of protein kinase B (AKT or PKB) (Fedele et al., 2010; Gewinner et al., 2009). Consistent with its 4-phosphatase activity and strong affinity for PI(3,4)P2, INPP4B overexpression was shown to be associated with decreased levels of PI(3,4)P2, but not PI(3,4,5)P3. Thus, the authors determined that INPP4B acts as a tumor suppressor by dephosphorylating the AKT-activating PI(3,4)P2 to PI(3)P, thereby decreasing its levels (discussed further in Section 1.3.2).

Examination of INPP4B expression levels reveals INPP4B expression is limited to ER positive (ER+) but not ER negative (ER-) normal breast cells and breast cancer cell lines (Fedele et al., 2010). Conversely, INPP4B expression was found to be lost in 88% of basal-like

(ER/PgR/HER2 triple negative) tumours, 38% of HER2+ tumours, 0.22% of luminal B tumours, and 8% of luminal A tumours (Fedele et al., 2010). The association between loss of INPP4B expression and the basal-like subtype is so prominent, that INPP4B was found to be a 61% sensitive and 99% specific biomarker for defining basal-like breast cancer (Won et al., 2013).

Consistent with the aggressive nature of basal-like tumours, loss of INPP4B expression was found to be significantly associated with high tumour grade and size, and shorter overall and relapse-free survival (Fedele et al., 2010; Gewinner et al., 2009; Tokunaga et al., 2016).

Overall, breast cancer studies have demonstrated that INPP4B loss, a frequent event in the basal-like subtype, is associated with poor patient outcome and advanced disease.

Experimentally, INPP4B has been shown to be a potent tumor suppressor whose loss is associated with increased proliferation, anchorage-independent growth, invasiveness, colony formation potential, and tumor growth in mouse models.


Ovarian Cancer

In addition to characterizing the tumor suppressive role of INPP4B in basal-like breast cancer, Gewinner et al. demonstrated that INPP4B loss of heterozygosity (LOH) also occurs in ovarian cancer, and is associated with shorter overall survival (OS) and higher rates of metastasis to the local lymph nodes (Gewinner et al., 2009). This clinical association was validated in a separate study which showed that INPP4B LOH, most common in the serous and endometrioid ovarian cancer subtypes, is associated with shorter OS (Salmena et al., 2015). Downregulation of INPP4B in ovarian follicular granulosa cells increased AKT activation, resulting in increased proliferation and teratoma formation (Balakrishnan and Chaillet, 2013).

Bladder Cancer

Expression analysis of three independent datasets revealed that ER mRNA expression is significantly lower in bladder cancer samples relative to normal bladder tissue (Hsu et al., 2014). Cell line and mouse models were used to demonstrate that ER acts as a tumor suppressor, and that its loss is associated with increased cancer incidence and results in increased transformation of urothelial cells and increased proliferation. To identify how ER acts as a tumor suppressor, the authors examined which common oncogenes and tumor suppressors are regulated by it. Of those examined, only INPP4B was shown to be upregulated following overexpression of ER in UMUC3 and T24 bladder cancer cells. Overexpression of ER in UMUC3, T24 and SVHUC cells resulted in a significant upregulation of INPP4B mRNA and protein levels (Hsu et al., 2014). Conversely, shRNA-mediated knockdown of ER resulted in downregulation of INPP4B protein levels. As discussed in Section 1.1.2, ER was shown to bind the INPP4B promoter (chromatin immunoprecipitation) and regulate its expression (luciferase assay).

To study the functional roles of INPP4B and ER in bladder cancer, the authors conducted a proliferation assay with UMUC3 and T24 cells. The proliferation assay demonstrated that overexpression of ER resulted in decreased proliferation in both UMUC3 and T24 cells. However, this decrease in proliferation was not observed when ER was overexpressed while INPP4B was knocked down with shRNA, thus (in combination with Western blot data) providing evidence that in these experiments, the ER-mediated decrease in proliferation is dependent upon induction of INPP4B. Similarly, the authors demonstrated that the ER-mediated decrease in colony formation potential of SVHUC cells was dependent upon induction of INPP4B. To study the mechanism responsible for the INPP4B-mediated decrease in proliferation and colony formation, the authors examined and demonstrated that ER-mediated induction of INPP4B results in decreased levels of pAKT (INPP4B-pAKT mechanism discussed further in Section 1.3.2). Consistent with its tumor suppressive role in bladder cancer, the authors demonstrated that INPP4B expression is lost in bladder cancer patient samples.

Lung Cancer

Gene expression analysis of cancer and cancer-adjacent normal tissue pairs revealed that miR-937 is overexpressed in lung cancer cells (Zhang et al., 2016). MiR-937 overexpression in A549 lung cancer cells was shown to increase cell proliferation, colony formation potential, and anchorage-independent growth. Subsequent mechanistic analysis revealed that miR-937 binds directly to the 3’UTR of INPP4B, causing downregulation of INPP4B protein levels. siRNA-mediated knockdown of INPP4B protein confirmed that the observed miR-937 phenotype was caused by its downregulation of INPP4B. These results support a tumor suppressive role for INPP4B in lung cancer, and are consistent with the observed loss of INPP4B gene expression in 47% of patients with squamous cell carcinoma of the lung (Stjernström et al., 2014).

Prostate Cancer

Immunohistochemical staining of tissue microarrays revealed significantly lower expression of INPP4B protein in prostate tumor cells versus normal tissue (Hodgson et al., 2011). Furthermore, low INPP4B expression within prostate tumours was associated with decreased recurrence-free survival (Hodgson et al., 2011; Rynkiewicz et al., 2015). Multivariate analysis revealed that INPP4B expression was an independent predictor of disease relapse in a model with a high Gleason score and disease stage (Rynkiewicz et al., 2015). In vitro, doxycycline-induced INPP4B expression in PC3 cells resulted in decreased Matrigel invasion potential, decreased proliferation, and cell cycle arrest in G1 (Ding et al., 2014; Hodgson et al., 2014). Gene expression analysis revealed that INPP4B overexpression was associated with differential expression of genes enriched for cell adhesion and ECM deposition pathways.


Thyroid Cancer

INPP4B overexpressed in 293T cells was precipitated and used in an in vitro enzyme assay to assess its 4-phosphatase activity (Kofuji et al., 2015). The assay revealed that INPP4B (but not the catalytically inactive C842S mutant) was able to dephosphorylate both PI(3,4)P2 and (unexpectedly) PI(3,4,5)P3. While PTEN reached a catalytic plateau at 1 mmol/L PI(3,4,5)P3 substrate concentration, incubation of substrate with both PTEN and INPP4B did not demonstrate the same plateau. Thus, these findings suggest that INPP4B plays a role in regulating PI(3,4,5)P3 levels, particularly in low PTEN conditions. To study this, Kofuji et al. generated Inpp4bΔ/Δ mice (deleted phosphatase domain-coding exon 21) which were then crossed with Pten+/- mice. While Pten-/- mice are embryonic lethal, Inpp4bΔ/Δ mice were healthy and lived a normal lifespan. Inpp4bΔ/Δ; Pten+/- mice, on the other hand, had a significantly shorter lifespan and developed abnormally large thyroid glands leading to airway compression and respiratory distress (Chew et al., 2015; Kofuji et al., 2015). Further examination of Inpp4bΔ/Δ; Pten+/- mouse thyroids revealed vascular invasion and diffuse pulmonary metastases. Mechanistically, Inpp4bΔ/Δ; Pten+/- mice demonstrated a significant increase in PI(3,4,5)P3 levels and AKT activation.

INPP4B knockdown in TPC1 and 8505C thyroid cancer cell lines was used to further characterize the role of INPP4B in thyroid cancer (Chew et al., 2015). Soft-agar colony formation assays revealed that INPP4B loss was associated with increased anchorage-independent growth in low, but not normal, serum conditions. Analysis of patient samples revealed that INPP4B expression is downregulated in human thyroid cancer (Kofuji et al., 2015).


Melanoma

To investigate the role of INPP4B in melanocytic lesions, levels of INPP4B were first characterized in nevi, primary melanomas, and melanoma metastases (Perez-Lorenzo et al., 2014). INPP4B was shown to be lost in both primary melanomas and metastases, thus being implicated in tumor progression. Stable overexpression of INPP4B in melanoma cell lines was associated with decreased proliferation rates and decreased migration as shown by a wound-closure assay. Though INPP4B overexpression did not result in decreased pAKT (S473) levels, INPP4B knockdown did increase pAKT in both melanoma cell lines and immortalized human melanocytes. INPP4B knockdown was also associated with increased invasion (as shown with a modified Boyden chamber assay) and migration (wound-closure assay). Furthermore, INPP4B loss was also shown to be associated with increased proliferation rates. In vivo, the authors showed that forced overexpression of INPP4B resulted in decreased tumorigenic potential and prolonged mouse survival.

Summary

While INPP4B’s role in regulating cellular proliferation and colony formation potential was first demonstrated in non-cancerous megakaryocyte cells (Vyas et al., 2000), its tumor suppressor role in the context of cancer was first shown in 2005 by Westbrook et al. while studying breast cancer HMEC cells (Westbrook et al., 2005). Since this pioneering work, numerous subsequent studies have demonstrated the true extent and scope of INPP4B’s tumor suppressor function across cancers. Specifically, to date, INPP4B has been shown to act as a tumor suppressor in seven distinct cancer types (Figure 1.4). The common theme across all these studies is that INPP4B fulfills an important role in regulating cellular proliferation, colony formation potential, anchorage-independent growth, and invasiveness. Clinically, INPP4B loss is associated with poor patient outcome and advanced disease. This tumor suppressor function is the result of INPP4B’s ability to dephosphorylate AKT-activating PI(3,4)P2 into PI(3)P (discussed further in Section 1.3.2).

Figure 1.4. INPP4B is a pan-cancer tumor suppressor. Summary of INPP4B’s pan-cancer tumor suppressor function detailing the phenotypes observed following INPP4B knockdown/loss or overexpression. The timeline indicates the year in which INPP4B was identified to act as a tumor suppressor in the respective cancer.


1.2.2. INPP4B Plays a Role in DNA Damage Repair, IR and Drug Resistance

Microarray analysis of MCF-10A cells with knocked down INPP4B expression revealed a differential gene expression signature similar to that observed in BRCA1-negative tumors (Ip et al., 2015). Due to the signature similarity and BRCA1’s role in DNA repair, the authors examined whether INPP4B also plays a role in DNA repair. Comet assay analysis of ovarian cancer OVCA429 cells following X-ray irradiation demonstrated significantly decreased DNA repair after shRNA knockdown of INPP4B. These results were confirmed in OVCA433 and OVCA429 cells by demonstrating formation of γH2AX foci (indicative of DNA damage), RAD51 foci (indicative of DSB repair), and 53BP1 foci (required for p53 accumulation) following treatment with ionizing radiation (IR) or etoposide. To identify the mechanism by which INPP4B contributes to DNA damage repair, the authors used floxed INPP4B mouse embryonic fibroblasts (MEFs). Western blots of the MEF lysates following Ad5Cre treatment revealed loss of BRCA1, ATR and ATM. A possible mechanism for this loss comes from the treatment of OVCA433 INPP4BshRNA-KD cells with the proteasome inhibitor MG132, which indicated that INPP4B prevents ATR and BRCA1 protein degradation. In keeping with this, co-immunoprecipitation studies demonstrated that INPP4B interacts directly with ATR and BRCA1, but not ATM, in 293T cells. Thus, INPP4B has been shown to be involved in DNA damage repair by preventing loss of ATR and BRCA1 by way of proteasome degradation.

However, it is important to note that much is still unknown about INPP4B’s involvement in DNA damage repair. One example is the contradictory relationship between INPP4B expression and inhibitors of the DNA damage repair protein PARP. In one study, it was shown that INPP4B knockdown in ovarian cancer OVCA429 cells sensitizes the cells to olaparib, a PARP inhibitor, in vitro and in vivo (cells injected into nude mice) (Ip et al., 2015). However, two other studies demonstrate that it was INPP4B overexpression that sensitized cells to the PARP inhibitor AG014699 in MDA-MB-231 basal-like breast cancer and PC3 prostate cancer cell lines (Ding et al., 2014; Sun et al., 2014). This differential effect suggests that inherently different pathways are active in the sensitive and resistant cells following modulation of INPP4B.

A differential gene expression screen comparing radioresistant and parental HEp2 laryngeal cancer cells identified higher expression of INPP4B in the radioresistant cells (Kim et al., 2012). Treatment of both parental and radioresistant HEp2 cells with IR resulted in a significant upregulation of INPP4B, thus identifying it as a radiation-responsive gene in these cells. siRNA-mediated knockdown of INPP4B in radioresistant HEp2 cells resensitized the cells to radiation treatment. Similar upregulation of INPP4B was observed in A549 (lung cancer), H460 (lung cancer), HCT116 (colon cancer) and MCF7 (breast cancer) cells following IR treatment. Furthermore, INPP4B upregulation was also observed following HEp2 treatment with the anticancer drugs bleomycin, cisplatin, etoposide and doxorubicin. Overexpression of INPP4B in the parental A549, H460, HCT116 and MCF7 cells increased baseline resistance to bleomycin, etoposide and doxorubicin treatment.


1.3. PI3K/AKT PATHWAY

As discussed in Section 1.1, PIs are inner-plasma membrane phospholipids which can undergo reversible phosphorylation at the D3, D4 and D5 positions of the inositol ring to generate seven different PI species (Vivanco and Sawyers, 2002). These phosphorylated species differ in their ability to activate downstream effectors, and thus have different physiological roles including regulating proliferation, cell survival, and cell growth. Due to the physiological importance of PI signaling, the phosphorylation status of PIs is carefully regulated by kinases and phosphatases (Balla, 2013). One of the most well-known PI signaling pathways is the PI3K/AKT pathway, of which INPP4B is a part (Figure 1.5). Specifically, extensive evidence demonstrates that INPP4B overexpression causes downregulation of AKT activation across cancers. To understand INPP4B’s role in the PI3K/AKT pathway, the following section presents relevant background on phosphoinositide signaling and the PI3K/AKT pathway, beginning with PI(4,5)P2.


Figure 1.5. PI3K/AKT pathway. AKT activation by PDK1 and TORC2 results in increased proliferation, survival, inhibition of apoptosis, increased protein synthesis, and cell growth.


1.3.1. Inositol Signalling and the PI3K/AKT Pathway

Stage 1: PI(4,5)P2 Starting Point

While mostly known for being a PI3K substrate and a starting point for the PI3K/AKT signaling pathway, PI(4,5)P2 is involved in a large number of events occurring at the plasma membrane including extracellular signal transduction, exocytosis, endocytosis, phagocytosis, ion channel transport, and cell adhesion (Di Paolo and De Camilli, 2006). This diverse physiological function is the result of three potential modifications of the PI headgroup (cleavage, dephosphorylation and phosphorylation), triggered by various extracellular and intracellular signals. Cleavage of PI(4,5)P2 is catalyzed by plasma-membrane-bound phospholipase C (PLC) and A2 (PLA2) in response to G-protein-linked receptor activation by extracellular signals. Cleavage of PI(4,5)P2 releases membrane-bound diacylglycerol (DAG) and inositol 1,4,5-trisphosphate (IP3). IP3 is a small, water-soluble metabolite which rapidly diffuses through the cytosol so as to propagate extracellular signals into the cell. When IP3 reaches the endoplasmic reticulum (ER), it triggers the opening of IP3-gated channels which release Ca2+ to the cytoplasm. Increased Ca2+ levels in turn trigger responsive intracellular targets including Ca2+-dependent ion channels and Ca2+-dependent protein kinases. Dephosphorylation of PI(4,5)P2 at the D5 position controls PI(4,5)P2 levels without generating the DAG metabolite, and thus without Ca2+ release from the ER. Lastly, phosphorylation of PI(4,5)P2 by PI3K at the D3 position initiates the PI3K/AKT pathway (Di Paolo and De Camilli, 2006).


Stage 2: PI3K Responds to Growth Stimuli by Generating PI(3,4,5)P3 and PI(3,4)P2 via SHIP1/2

Cell stimulation by growth factors and hormones causes responding cell surface tyrosine kinases to activate the PI3K inositol phospholipid kinase (Cantley, 2002). Although there are several forms of PI3K, in higher eukaryotes the phosphorylation of PI(4,5)P2 at D3 is primarily mediated by Class I PI3Ks. Class II and Class III PI3Ks are less well characterized and appear to respond to signals unrelated to growth and survival (Ma et al., 2008). Class I PI3Ks are heterodimers of the p85 regulatory and p110 catalytic (PIK3CA) subunits (Bunney and Katan, 2010). As the name implies, the regulatory subunit is responsible for controlling the functionality of the catalytic subunit by dampening it in quiescent cells, and promoting it in the presence of growth factor stimulation. Furthermore, the regulatory subunit is responsible for interaction with upstream activators at phosphotyrosine residues by way of its two SH2 domains (Ma et al., 2008). Once activated, PI3K begins phosphorylating PI(4,5)P2 to form PI(3,4,5)P3; levels of these signalling intermediates are negligible in serum-starved and unstimulated cells (Cantley, 2002; Roth, 2004). In addition to recruiting signaling proteins (stage 3 discussed below), the rapidly formed PI(3,4,5)P3 acts as a substrate for the 5-phosphatases SH2 domain-containing inositol 5-phosphatase 1 (SHIP1) and SHIP2 to form PI(3,4)P2 (Bunney and Katan, 2010; Franke, 1997). Thus, PI3K activation results in a rapid increase of intracellular levels of both PI(3,4,5)P3 and PI(3,4)P2 (Franke, 1997).


Stage 3: PI(3,4,5)P3 and PI(3,4)P2 Activate AKT

Localized formation of PI(3,4,5)P3 and PI(3,4)P2 leads to the accumulation of proteins capable of binding to these PIs through the pleckstrin homology (PH) domain present in many related proteins including AKT and phosphoinositide-dependent kinase 1 (PDK1) (Bertucci and Mitchell, 2013; Cantley, 2002). Upon binding to PI(3,4,5)P3, AKT undergoes a conformational change which allows it to be phosphorylated by PDK1 at the T308 site in AKT’s activation loop. Though T308 phosphorylation activates AKT’s kinase ability, full AKT activation requires additional phosphorylation at the S473 site in the hydrophobic motif by mechanistic target of rapamycin complex 2 (mTORC2 or TORC2) (Gasser et al., 2014; Ma et al., 2008). While PI(3,4,5)P3 levels are associated with T308 phosphorylation, S473 phosphorylation is associated with PI(3,4)P2 levels. Incidentally, AKT binds to PI(3,4)P2 with higher affinity than PI(3,4,5)P3. Once fully activated, AKT can activate p70 ribosomal protein S6 kinase (p70S6K), inhibit glycogen synthase kinase-3 (GSK3), and play a role in the anti-apoptotic response (Bertucci and Mitchell, 2013; Cantley, 2002). AKT-mediated phosphorylation of BCL2 associated agonist of cell death (BAD) prevents BAD binding to BCL-2 and BCL-XL, thus allowing these two anti-apoptotic proteins to promote cell survival.

Stage 4: PTEN and INPP4B Shut off PI3K Signalling

Once levels of cell-stimulating growth factors and hormones diminish, termination of PI3K/AKT signaling is mediated by removal of both PI(3,4,5)P3 and PI(3,4)P2 (Bunney and Katan, 2010). While PTEN acts as a 3-phosphatase to remove PI(3,4,5)P3, INPP4B acts as a 4-phosphatase to convert PI(3,4)P2 to PI(3)P. Due to their similar role in decreasing levels of AKT-activating PIs, knockdown of either of these phosphatases results in increased AKT activity (Hawkins and Stephens, 2016).

PTEN is a tumor suppressor which limits the pro-growth activity resulting from PI3K-mediated D3 phosphorylation by dephosphorylating PI(3,4,5)P3 at the D3 position to reform PI(4,5)P2 (Bertucci and Mitchell, 2013; Hawkins and Stephens, 2016). By regulating levels of AKT-activating PI(3,4,5)P3, both PI3K and PTEN have been shown to have important roles in human disease, and are found to be amongst the most frequently mutated proteins in cancer; the most common PIK3CA mutations are H1047R and E545K (Bunney and Katan, 2010; Gasser et al., 2014). PTEN protein loss and LOH have been reported in 28% and 25% of breast cancers, respectively (Bertucci and Mitchell, 2013). In ER+ breast cancer, the mutation rates are even higher at 40% (Gasser et al., 2014). Conversely, gain-of-function mutations of PI3K have been reported in 27% of breast cancers. While Pten-null mice suffer from developmental abnormalities leading to embryonic lethality, Pten+/- mice develop thyroid, colon, and gonado-stromal tumors (Bertucci and Mitchell, 2013).

AKT signaling activation by PI(3,4)P2 can be terminated by 4-phosphatase mediated dephosphorylation of PI(3,4)P2 to generate PI(3)P (Bunney and Katan, 2010). Unlike PI(3,4,5)P3 and PI(3,4)P2, PI(3)P does not cause activation of AKT signaling, and thus the 4-phosphatase acts as a shutoff switch similar to PTEN explaining the tumor-suppressor role of INPP4B.
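The conversions described in Stages 1 to 4 can be summarized as a small lookup structure. The following is only an illustrative sketch: the names `CONVERSIONS`, `AKT_ACTIVATING` and `shutoff_enzymes` are hypothetical, and the table encodes only the reactions stated in this section (it omits, for example, PLC-mediated cleavage and D5 dephosphorylation of PI(4,5)P2).

```python
# (substrate, enzyme, product) triples taken from the text of Stages 1-4.
CONVERSIONS = [
    ("PI(4,5)P2", "PI3K", "PI(3,4,5)P3"),     # D3 phosphorylation
    ("PI(3,4,5)P3", "PTEN", "PI(4,5)P2"),     # D3 dephosphorylation
    ("PI(3,4,5)P3", "SHIP1/2", "PI(3,4)P2"),  # D5 dephosphorylation
    ("PI(3,4)P2", "INPP4B", "PI(3)P"),        # D4 dephosphorylation
]

# PI species described in the text as AKT-activating.
AKT_ACTIVATING = {"PI(3,4,5)P3", "PI(3,4)P2"}


def shutoff_enzymes(conversions=CONVERSIONS, activating=AKT_ACTIVATING):
    """Return enzymes that turn an AKT-activating PI into a non-activating one."""
    return sorted(
        enzyme
        for substrate, enzyme, product in conversions
        if substrate in activating and product not in activating
    )


print(shutoff_enzymes())
```

In this encoding SHIP1/2 is not a shut-off enzyme, because its product PI(3,4)P2 still activates AKT, whereas PTEN and INPP4B both convert an activating species into a non-activating one.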


Stage 5: PI(3)P End Point

Within a cell, PI(3)P is localized at early endosomes, and acts as a ligand for several endosomal proteins through the FYVE domain, including the early endosome antigen-1 (EEA1) (Bunney and Katan, 2010; Hawkins and Stephens, 2016; Toker, 2002). PI(3)P has also been shown to interact with the adapter protein-2 (AP-2) which is involved in clathrin coat assembly at the cell membrane (Toker, 2002). Though the role of PI(3)P is not yet clearly understood, the association between PI(3)P and various endosome-related proteins suggests it plays an important role in vesicle trafficking.

1.3.2. INPP4B Reduces AKT Activation

Due to its role in decreasing PI(3,4)P2 levels, INPP4B expression levels have been shown to be inversely correlated with AKT phosphorylation across cancers (reviewed in Woolley, Dzneladze, and Salmena, 2015). In breast cancer, INPP4B knockdown has been shown to result in prolonged AKT phosphorylation following insulin-like growth factor 1 (IGF1) stimulation (Fedele et al., 2010; Gewinner et al., 2009). Conversely, INPP4B overexpression resulted in decreased levels of AKT phosphorylation at both S473 and T308 following epidermal growth factor (EGF) stimulation (Fedele et al., 2010). Similarly, INPP4B silencing was shown to be associated with decreased pAKT (T308) levels in prostate cancer (Hodgson et al., 2011). In melanoma cell lines, INPP4B silencing resulted in increased phosphorylation of AKT at S473, and a slight increase in pAKT (T308) levels (Perez-Lorenzo et al., 2014).


1.4. ACUTE MYELOID LEUKEMIA (AML)

As discussed above, INPP4B acts as an important tumor suppressor in some cancers by regulating signalling through the PI3K/AKT pathway. However, at the time when my PhD research project began, little was known about the role of INPP4B in the context of AML. To address this gap in our knowledge, the second chapter of this dissertation describes work done to examine the prognostic, phenotypic and mechanistic role of INPP4B in AML patients and cell lines. The aim of Section 1.4 is to provide background knowledge on AML necessary to better understand the work done in Chapter 2.

1.4.1. Acute Myeloid Leukemia

AML is a highly heterogeneous hematopoietic malignancy characterized by an accumulation of immature progenitor cells (blasts) in the bone marrow (Löwenberg et al., 1999; Shipley and Butera, 2009). Successive mutations acquired over time lead to a block in differentiation and increased proliferation of myeloid lineage blast cells, which overcrowd the bone marrow and thus interfere with normal hematopoiesis (Bullinger et al., 2007; Löwenberg et al., 1999). The resulting cytopenias (reduction of blood cells) are the root cause for the majority of presenting symptoms, which include fatigue due to low red blood cells (RBCs), hemorrhage due to low platelets, and infections due to low granulocyte-type white blood cells (WBCs). Over time, leukemic cells may infiltrate the lymph nodes (lymphadenopathy), liver (hepatomegaly), skin (leukemia cutis), bone, gingiva and central nervous system. If the WBC count increases to above 100,000 cells/mm3 (hyperleukocytosis), the patient may exhibit respiratory and cognitive problems due to sludging of cells in the lung and brain, respectively. The definitive diagnosis of AML relies on identification of >20% myeloid lineage blast cells (myeloblasts) in bone marrow (BM) blood smears stained with Wright-Giemsa. Morphologically, the blast cells are characterized by a large nucleus, distinct nucleoli and very little cytoplasm. Occasionally there may be azurophilic granules and Auer rods within the cytoplasm of the blasts. The morphological features of the myeloblasts and their reactivity with histochemical stains are used to classify the disease into nine distinct subtypes as per the French-American-British (FAB) classification system guidelines. The FAB categories (M0-M7) indicate the degree of differentiation and lineage commitment. In addition to the FAB system, AML is also classified using the World Health Organization (WHO) classification system, which relies on the presence of recurrent chromosomal abnormalities and mutations.

Without treatment, AML is uniformly fatal within days or months. With supportive care consisting of transfusions, antibiotics and low-dose chemotherapy that controls the white cell count without eliminating the leukemic cells, patients can survive weeks to a year or so. Treatment of AML given with curative intent comprises an induction therapy stage followed by post-remission consolidation therapy or BM transplantation (Shipley and Butera, 2009). The main goal of the induction therapy is to achieve clinical remission of the disease, which is defined as a normocellular BM with <5% blasts, and recovery of the platelet count to >100 x 10^9/L and neutrophils to >1 x 10^9/L in the peripheral blood with no circulating blast cells or evidence of leukemia in sites such as the skin (Löwenberg et al., 1999). With the exception of acute promyelocytic leukemia (APL; AML harboring a t(15;17) abnormality), AML induction therapy entails intravenous infusion of daunorubicin and cytarabine. In what is known as the 3+7 therapeutic regimen, daunorubicin is administered intravenously daily for three days at 45-90 mg/m2. Concomitantly, cytarabine is given by continuous infusion or twice-daily infusion at 100-200 mg/m2/d for seven days (Löwenberg et al., 1999; Shipley and Butera, 2009). On average, 3+7 therapy can produce complete remission in 70-80% of patients <60 years old (Löwenberg et al., 1999). Following 3+7 induction therapy, clinicians decide whether patients would next receive consolidation therapy or bone marrow transplantation based on patient age, comorbidities, likelihood of recurrence based on cytogenetics and molecular abnormalities, and the availability of an appropriately matched bone marrow donor (Shipley and Butera, 2009). If consolidation therapy is selected, patients <60 years old with good organ function receive several rounds of high-dose cytarabine. A common regimen entails administration of 3 g/m2 of cytarabine twice daily on days 1, 3 and 5. In older patients, however, such high doses are not tolerated very well due to excessive toxicity including neurocognitive changes. Consequently, patients >60 years old receive two cycles of continuous intravenous cytarabine at 100 mg/m2 over five days, followed by two days of daunorubicin at 45 mg/m2. The purpose of consolidation therapy is to reduce the likelihood of disease relapse. The importance of consolidation is highlighted by the finding that if no consolidation therapy is given, almost all patients will relapse within 6-12 months.
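The numeric part of the clinical remission definition given above (Löwenberg et al., 1999) can be expressed as a simple check. This is a minimal sketch for illustration only: the class and field names are hypothetical, the thresholds are those quoted in the text, and the requirement for a normocellular BM is not modelled because it is a qualitative assessment.

```python
from dataclasses import dataclass


@dataclass
class PostInductionAssessment:
    """Hypothetical container for post-induction measurements."""
    bm_blast_percent: float        # blasts in bone marrow, in percent
    platelets_e9_per_l: float      # platelet count, x 10^9/L
    neutrophils_e9_per_l: float    # neutrophil count, x 10^9/L
    circulating_blasts: bool       # blasts seen in peripheral blood
    extramedullary_leukemia: bool  # e.g. leukemia in the skin


def meets_remission_criteria(a: PostInductionAssessment) -> bool:
    """Apply the thresholds quoted in the text; all must hold."""
    return (
        a.bm_blast_percent < 5
        and a.platelets_e9_per_l > 100
        and a.neutrophils_e9_per_l > 1
        and not a.circulating_blasts
        and not a.extramedullary_leukemia
    )


# Made-up example values, not patient data.
example = PostInductionAssessment(3.0, 150.0, 1.8, False, False)
print(meets_remission_criteria(example))
```

Every criterion must be satisfied at once, so an otherwise recovered marrow with, for example, a platelet count of 80 x 10^9/L would not count as remission under this definition.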

Generally, response to treatment is adversely affected by advanced age (>60 years old), AML resulting from prior chemotherapy for another malignancy or from myelodysplasia (MDS), high WBC count (>20,000 cells/mm3), unfavourable cytogenetic abnormalities, multidrug resistance, other comorbidities, and CD34 positivity (Löwenberg et al., 1999; Shipley and Butera, 2009). Cytogenetics provide prognostic and predictive information that stratifies patients who have received 3+7 chemotherapy into so-called better, intermediate, and poor risk cytogenetic groups (Shipley and Butera, 2009). Better risk cytogenetics include the recurrent chromosomal abnormalities t(15;17), t(8;21), or inv(16), karyotypes that account for about 20% of patients under 60 years old (Löwenberg et al., 1999). Patients with these karyotypes have >85% likelihood of complete remission, and <40% risk of relapse. Unfavourable cytogenetics, on the other hand, include complex cytogenetics (>2 chromosomal abnormalities), -5, -7, -5q or abnormalities of chromosome 3q. Unfavourable cytogenetics account for 15% of all patients, who have a 50% or lower chance of achieving remission and a 5-year survival rate of <20%.

1.4.2. Common Genetic Abnormalities in AML

As mentioned in the previous section, AML results from an accumulation of driver somatic mutations and genomic alterations which lead to a block in differentiation and increased proliferation. The diversity of leukemogenic drivers is manifested by marked disease heterogeneity, including differing clinical characteristics and response to drug treatment. Furthermore, whole genome sequencing studies reveal that AML is a dynamic disease where multiple clones coexist and compete with one another at any one time (The Cancer Genome Atlas Research Network, 2013; Welch et al., 2012). As mutations accumulate, some of the clones acquire a proliferative/survival advantage and become the dominant observed clone. The underlying driver mutations responsible for the evolution of AML have been extensively characterized by multiple large-scale genomic studies, all reporting similar mutation frequencies (Baldus and Bullinger, 2008; Papaemmanuil et al., 2016; The Cancer Genome Atlas Research Network, 2013; Welch et al., 2012). A recent study involving 1540 patients identified 5234 possible driver mutations affecting 76 genes or regions (Papaemmanuil et al., 2016). At least one driver mutation was identified in 96% of all AML patients, with two or more driver mutations seen in 86% of patients. Clinically, the authors demonstrate a significant difference in disease presentation and patient survival across the genomic subgroups. While large-scale genomic studies continue to uncover novel driver mutations, the 2016 WHO and National Comprehensive Cancer Network (NCCN) classification systems continue to rely on only a handful of recurrent mutations and cytogenetic abnormalities (Arber et al., 2016; O’Donnell et al., 2006).

Patients diagnosed with AML are classified into better (favorable), intermediate, and worse (poor) risk groups based on detected cytogenetic abnormalities and somatic mutations (Table 1.1) (O’Donnell et al., 2006). The risk groups differ in terms of likely response to therapy, chance of relapse, and predicted long-term patient survival. These factors in turn influence the aggressiveness of the treatment regimen given to the patient to produce the highest likelihood of remission. Several large-scale survival datasets (totaling 3434 patients) revealed that 5-year survival rates of better (favorable), intermediate and worse (poor) risk group patients were 55-65%, 24-41%, and 5-14%, respectively (Byrd et al., 2002; Grimwade et al., 1998; O’Donnell et al., 2006; Slovak et al., 2015). Below is a summary of some of the recurrent driver somatic mutations and cytogenetic abnormalities frequently observed in AML.


Table 1.1. NCCN risk classification system (O’Donnell et al., 2006)

Epigenetic Modifier Gene Mutations

Mutations in genes related to epigenetic modifications such as DNMT3A, DNMT1, IDH1/2, ASXL1, and TET2 are often the earliest acquired mutations and are present in the founding clone (Papaemmanuil et al., 2016). While conferring an increased risk for development of hematologic disorders, epigenetic modifier mutations are almost never found alone, indicating that they are insufficient for the development of leukemia on their own. While mutations affecting epigenetic modifiers are associated with adverse OS, their prognostic significance appears to lie in exacerbating the effect of other driver mutations (Abdel-Wahab and Levine, 2013). For example, loss-of-function mutations of the de novo methyltransferase DNA cytosine-5-methyltransferase 3A (DNMT3A) have been associated with adverse risk only in patients with FLT3-ITD mutations (Abdel-Wahab and Levine, 2013; Papaemmanuil et al., 2016). Despite the prognostic significance of DNMT3A mutations, their biochemical effect has not yet been elucidated, as DNMT3A mutations are associated with hypermethylation of some DNA regions in spite of reductions in global methylation levels. Overall, DNA methylation-related gene mutations occur in 44% of AML cases (The Cancer Genome Atlas Research Network, 2013).

Signaling Gene Mutations

Occurring in 59% of AML cases, mutations involving signaling genes such as FLT3, KIT and RAS make up the most frequently mutated gene group in AML (The Cancer Genome Atlas Research Network, 2013). In fact, Fms related tyrosine kinase 3 (FLT3) mutations are the most frequently observed driver mutations in AML, present in 30% of all AML patients (Döhner, 2007; Kaczkowski et al., 2016; Papaemmanuil et al., 2016; Welch et al., 2012). FLT3 is a receptor tyrosine kinase which drives proliferation and differentiation in early hematopoietic progenitor cells (Döhner, 2007). Two types of mutations that lead to constitutive activation of the FLT3 receptor are recurrently observed in AML. The first is an in-frame internal tandem duplication (FLT3-ITD) in the juxtamembrane domain (exons 14 and 15), a domain which is essential for autoinhibition of the wild-type receptor. Mutation of the juxtamembrane region allows for ligand-independent dimerization, autophosphorylation and thus constitutive activation of FLT3; in addition, the FLT3-ITD mutant protein is mislocalized in the endoplasmic reticulum where it interacts with STAT5 (Woolley et al., 2012). Constitutively active FLT3-ITD initiates signaling through the RAS/MAPK, STAT5 and PI3K/AKT pathways. Multiple studies have demonstrated that FLT3-ITD mutations are significantly associated with shorter OS (Döhner, 2007; Kaczkowski et al., 2016; Papaemmanuil et al., 2016; Welch et al., 2012). The second type of FLT3 mutation occurs in codon 835 or 836 of the C-terminal activation loop in the tyrosine kinase domain (FLT3-TKD), and similarly leads to constitutive activation. FLT3-TKD mutations are found in 11-14% of cytogenetically normal AML patients, but have a lesser impact on survival compared to the FLT3-ITD mutation.

Nucleophosmin (NPM1) Mutations

NPM1 is a highly conserved phosphoprotein which contains histone-binding, DNA/RNA-binding, and oligomerization domains (Sportoletti, 2011). The various binding and interaction domains, together with the ability of NPM1 to shuttle between the nucleus and cytoplasm, allow it to play a wide variety of roles affecting genomic stability, the activity and stability of some tumor suppressors (including P53 and ARF), transcription, ribosome biogenesis, and the response to stress (Döhner, 2007). While wildtype NPM1 can shuttle between the nucleus and cytoplasm, mutations in exon 12 of NPM1 interfere with this shuttling, causing accumulation in the cytoplasm; mutated NPM1 which accumulates in the cytoplasm is referred to as NPM1c. Overall, NPM1 mutations are found in 25-35% of all AML patients, with the majority (45-62%) occurring in patients whose blast cells are grossly cytogenetically normal (Döhner, 2007; Papaemmanuil et al., 2016; Welch et al., 2012). NPM1c mutations are associated with favorable patient outcome, as such patients have a high chance of achieving remission.

Transcription Factor Fusions

Occurring in 18% of AML cases, transcription factor fusions such as PML-RARA, CBFB-MYH11, and RUNX1-RUNX1T1 are associated with favorable patient outcome as they respond well to targeted treatment (Ghavamzadeh et al., 2006; O’Donnell et al., 2006; The Cancer Genome Atlas Research Network, 2013; Wang and Chen, 2008). Perhaps the most inspiring example of the importance of genomic profiling in AML, APL has demonstrated how targeted therapy against a recurring cytogenetic abnormality can vastly improve survival in an otherwise rapidly fatal disease (Coombs et al., 2015). APL is a subgroup of AML characterized by the presence of a t(15;17) cytogenetic abnormality which fuses together the promyelocytic leukemia (PML) and retinoic acid receptor alpha (RARA) genes (Coombs et al., 2015; Fenaux et al., 1992; Ghavamzadeh et al., 2006; Mandelli et al., 1997; Sanz et al., 1999). Unlike the RA-responsive wildtype RARA transcription factor, the PML-RARA fusion protein does not release its transcriptional repression of differentiation genes in response to physiologic levels of RA. Targeted treatment of APL involves therapy with all-trans retinoic acid (ATRA) and arsenic trioxide. ATRA binds to RARA, while arsenic binds to PML, leading to the degradation of PML-RARA and release of the differentiation block (Lallemand-Breitenbach et al., 2008). ATRA and arsenic treatment of APL has been shown to achieve complete remission (CR) rates of 95% and cure rates of 90% (Coombs et al., 2015; Ghavamzadeh et al., 2006).

Current State: Genomic Profiling of AML

AML has long been known to be a heterogeneous disease, with each patient carrying unique combinations of driver mutations affecting distinct cellular programs. However, recent genetic profiling studies revealed that even within the same patient, there is an ongoing competition for dominance as different blast clones accumulate driver mutations that give them proliferative and survival advantages. An increasing understanding of the various driver mutations, their importance, and their contribution to leukemogenesis allows clinicians and researchers to develop novel strategies for the treatment of AML, hopefully one day replicating the success seen with APL and the subgroup-specific treatment with ATRA and arsenic trioxide.

In addition to improved understanding of the individual driver mutations and their contribution to leukemogenesis, large-scale genomic profiling studies allow researchers to better understand the interplay and relationship between driver mutations (Papaemmanuil et al., 2016).

1.5. THESIS RATIONALE, HYPOTHESES, AND OBJECTIVES

As first discussed in the abstract, cancer is an extremely heterogeneous disease driven by an accumulation of cancer-driving cytogenetic abnormalities and somatic mutations. Disease heterogeneity poses a major hurdle in cancer treatment as the different cancer subtypes differ in their response to treatments and thus require distinct treatment strategies to improve patient outcome. Overall, this thesis addresses cancer heterogeneity by (1) identifying a gene expression-based biomarker which defines prognostically distinct subgroups across several heterogeneous cancer populations, and (2) developing a tool (SubID) for the identification of distinct subgroups within a heterogeneous population.

The initial goal of this thesis was to determine if INPP4B acts as a tumor suppressor in the context of AML, consistent with its tumor suppressive roles in other contexts (Woolley et al., 2015). However, as the project progressed, INPP4B was found to be overexpressed in a small subset of AML patients, where it appears to act in an oncogene-like manner. As a result of this unexpected observation, Chapter 2 objectives shifted to identifying the impact of its deregulated expression in AML cell lines. Chapter 3 objectives evolved out of the finding in AML that non-median split evaluation can yield unexpected results. This led to (1) developing a non-median dichotomization approach to accommodate subgroup-specificity as observed with INPP4B, (2) identifying potential transcriptional regulators of INPP4B in AML, and (3) conducting a pan-cancer characterization of the prognostic significance of INPP4B to study its context-specificity.

Chapter 2 Hypotheses

1. Consistent with previous findings in breast, ovarian and prostate cancers, INPP4B acts as a tumor suppressor in the context of AML. Specifically, loss of INPP4B expression is associated with shorter OS and poor response to therapy.

2. Overexpression of INPP4B in AML cell lines would result in decreased proliferation, colony formation potential, basal viability and drug resistance (consistent with overexpression of a tumor suppressor).

Chapter 2 Objectives

1. Examine the association between INPP4B expression and patient survival and response to treatment using publicly available whole-genome AML expression and clinical data.

2. Overexpress INPP4B in AML cell lines to study the biological role of INPP4B in regulating cancer-related phenotypes such as proliferation, drug response, colony formation and viability.

3. Identify the role of INPP4B’s 4-phosphatase activity in mediating the observed phenotype following INPP4B’s overexpression in AML.


Chapter 3 Hypotheses

1. Non-median dichotomization is a more sensitive approach to identifying prognostically significant genes in a heterogeneous population.

2. Complete assessment and visualization of the relationship between a continuous variable and an output parameter can identify subgroups within a heterogeneous population.

3. INPP4B overexpression in AML is driven by upregulation of a transcription factor.

4. The prognostic significance of INPP4B and its relationship with patient survival are context dependent.

Chapter 3 Objectives

1. Develop a non-median dichotomization approach which would be more appropriate for a heterogeneous population with distinct subgroups.

2. Identify the potential transcriptional regulators of INPP4B which may drive its overexpression in AML (25% of the patients).

3. Characterize the pan-cancer prognostic significance of INPP4B to identify contexts where INPP4B acts in a tumor-suppressive manner, in an oncogene-like manner, or in neither.


CHAPTER 2.

INPP4B OVEREXPRESSION IS ASSOCIATED WITH POOR CLINICAL OUTCOME AND THERAPY RESISTANCE IN ACUTE MYELOID LEUKEMIA

This chapter has been published in the journal Leukemia and is reproduced with permission.

Dzneladze I, He R*, Woolley JF*, Son MH*, Sharobim M, Greenberg SA, Gabra M, Langlois C, Rashid A, Hakem A, Ibrahimova N, Arruda A, Minden M, Salmena L (2015) INPP4B overexpression is associated with poor clinical outcome and therapy resistance in acute myeloid leukemia. Leukemia 29(7): 1485-1495

The majority of the experimental work and design, data interpretation, clinical data analysis, and manuscript writing was done by ID. All other work was coordinated and/or conducted with coauthors. ID, RH, JFW, MHS, SAG, MDM and LS designed experiments; ID, RH, JFW, MHS, SAG, MS, CL, AR, AH and LS performed experiments; ID, MHS, and LS analyzed clinical data; ID, RH, JFW, MHS, SAG, MS, MDM and LS critically discussed results and manuscript; ID and LS wrote the paper. LS and MDM supervised the project.


2.1. ABSTRACT

In this study, I investigated the role of INPP4B in AML. I observed that AML patients with high levels of INPP4B (INPP4Bhigh) had poor response to induction therapy, shorter event-free survival and shorter overall survival. Multivariate analyses demonstrated that INPP4Bhigh was an independent predictor of poor prognosis, significantly improving current predictive models, where it outperformed conventional biomarkers including FLT3-ITD and NPM1. Furthermore, INPP4Bhigh effectively segregated relative risk in AML patients with normal cytogenetics. The role of INPP4B in the biology of leukemic cells was assessed in vitro. Overexpression of INPP4B in AML cell lines enhanced colony formation potential, recapitulated the chemotherapy resistance observed in AML patients, and promoted proliferation in a phosphatase-dependent and AKT-independent manner. These findings reveal that INPP4Bhigh has an unexpected role consistent with oncogenesis in AML, in contrast to its previously reported tumour suppressive role in epithelial cancers. Overall, I propose that INPP4B is a novel prognostic biomarker in AML that has potential to be translated into clinical practice both as a disease marker and therapeutic target.

2.2. INTRODUCTION

AML is a heterogeneous clonal stem cell disorder of the BM characterized by impaired differentiation of hematopoietic precursors and accumulation of immature myeloid progenitors or “blast” cells (Löwenberg et al., 1999). AML is associated with bone marrow failure and abnormal hematopoietic differentiation, leading to infection, organ failure, bleeding and lethality unless treated (Löwenberg et al., 1999). While chemotherapy regimens can induce CR of AML, up to one third of AML patients do not achieve CR upon initial therapy, most often due to endogenous cellular resistance (Fernandez et al., 2009). Although our understanding of the mechanisms responsible for the development and progression of AML has increased substantially, there remains a need to identify better predictors of therapy response and disease outcome (Walter et al., 2014).

PI signaling has pleiotropic cellular roles including regulation of cellular proliferation, survival and apoptotic signaling, and is governed by a family of membrane-bound inositol-lipid second messengers that serve as docking sites for numerous signaling proteins. Signaling in these pathways is exquisitely controlled by reversible phosphorylation at the D3, D4, and D5 positions of the inositol ring of the PI molecule (Balla, 2013). Importantly, PI signaling can be a major contributor to malignant transformation, specifically through the phosphoinositide-3-kinase (PI3K) pathway, which is activated in more than 50% of AML (Park et al., 2010). Genes such as PIK3CA, AKT and PTEN have been studied extensively (Martelli et al., 2006); however, the role of Inositol Polyphosphate-4-Phosphatase, Type-II (INPP4B), a new player in PI3K-pathway associated cancers, has not been studied in AML.

INPP4B was initially identified as an enzyme that preferentially hydrolyzes the 4-phosphate of phosphatidylinositol-3,4-bisphosphate (PI(3,4)P2), to generate phosphatidylinositol-3-phosphate (PI(3)P) (Norris et al., 1997). More recently, INPP4B has been reported as a tumour suppressor in breast, ovarian, prostate, lung and melanocytic cancers (Fedele et al., 2010; Gewinner et al., 2009; Hodgson et al., 2011; Perez-Lorenzo et al., 2014; Stjernström et al., 2014; The Cancer Genome Atlas Research Network, 2012; Westbrook et al., 2005). The tumour suppressive mechanism of INPP4B has been attributed to its ability to curtail AKT activation, and thereby place a “brake” on the PI3K/AKT pathway (Barnache et al., 2006). Activation of the PI3K/AKT pathway leads to increased intracellular concentrations of PI(3,4,5)P3 and PI(3,4)P2 (Gewinner et al., 2009), both of which are necessary for full AKT activation (Franke, 1997; Frech et al., 1997; Klippel et al., 1997). Thus, like PI(3,4,5)P3 regulation by PTEN, control of PI(3,4)P2 by INPP4B is also important in mitigating malignant transformation (Fedele et al., 2010; Gewinner et al., 2009). The molecular function of INPP4B as a tumour suppressor is consistent with its observed losses in epithelial cancers (Fedele et al., 2010; Gewinner et al., 2009; Hodgson et al., 2011; Salmena et al., 2015; The Cancer Genome Atlas Research Network, 2012).

To date, only a limited number of studies have investigated a role for INPP4B in blood cancers. It has been reported that INPP4B is lost in 12% of Down-Syndrome associated acute lymphoblastic leukemia (ALL) (Lundin et al., 2012), and its expression is silenced in malignant erythroleukemia proerythroblasts (Barnache et al., 2006). Conversely, INPP4B was demonstrated to be overexpressed in BCR/ABL1-positive childhood ALL (Ross et al., 2003). Herein I have examined the association of INPP4B expression with various clinical metrics in AML and present evidence that high levels of INPP4B in AML are predictive of unfavourable prognosis and poor response to induction chemotherapy. I also demonstrate that INPP4B overexpression in AML cells confers a growth advantage phenotype and resistance to chemotherapy in a manner that is dependent on the phosphatase function.


2.3. MATERIALS AND METHODS

2.3.1. Patient Samples

AML patients included in the Ontario Cancer Institute/Princess Margaret Cancer Center (OCI/PM) dataset were individuals treated by the Division of Medical Oncology and Hematology, Princess Margaret Cancer Center (PM) between June 1998 and August 2014. All patients enrolled in this study received induction therapy according to PM guidelines. Most patients received daunorubicin 60 mg/m² for three days and cytarabine 100 mg/m² (age>60 years) or 200 mg/m² (age<60 years) by continuous infusion daily for seven days. Peripheral blood (PB) and BM samples were obtained at the time of diagnosis following informed consent as approved by the research ethics board of the University Health Network (UHN). The diagnosis of AML was made by expert clinical hematopathologists according to the morphological and immunological criteria of the National Cancer Institute expert panel. Cytogenetic analyses were performed in the clinical genetics laboratory of the UHN. FLT3-ITD/TKD status was evaluated in normal cytogenetic risk patients.

2.3.2. AML Dataset Analysis

Normalized microarray data were downloaded from the GEO (www.ncbi.nlm.nih.gov/geo/) and TCGA (www.cancergenome.nih.gov/) databases. Cross-dataset frequency distribution of INPP4B expression was visualized by transforming and mode-centering INPP4B expression data in R (www.r-project.org). A low vs. high INPP4B cut-off was established using Cut-off-Finder software with a fit of mixture model approach as described by Budczies et al. (Budczies et al., 2012). Clinical data analysis was performed on non-transformed data.
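The mixture-model cut-off idea can be illustrated with a minimal sketch. This is not the Cut-off-Finder implementation: it assumes a two-component Gaussian mixture fitted by expectation-maximization and places the cut-off where the weighted densities of the two components are equal. The function names and the simulated expression values are hypothetical and are not study data.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(values, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture by EM (illustrative only)."""
    data = sorted(values)
    n = len(data)
    # Initialise the two components from the lower and upper halves.
    halves = [data[: n // 2], data[n // 2:]]
    means = [sum(h) / len(h) for h in halves]
    grand_mean = sum(data) / n
    sd0 = math.sqrt(sum((x - grand_mean) ** 2 for x in data) / n)
    sds = [sd0, sd0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability that each value belongs to "high".
        resp = []
        for x in data:
            p_low = weights[0] * normal_pdf(x, means[0], sds[0])
            p_high = weights[1] * normal_pdf(x, means[1], sds[1])
            total = p_low + p_high
            resp.append(p_high / total if total > 0 else 0.5)
        # M-step: update weight, mean and SD of each component.
        for k in (0, 1):
            r = resp if k == 1 else [1.0 - ri for ri in resp]
            r_sum = sum(r)
            means[k] = sum(ri * x for ri, x in zip(r, data)) / r_sum
            var = sum(ri * (x - means[k]) ** 2 for ri, x in zip(r, data)) / r_sum
            sds[k] = max(math.sqrt(var), 1e-6)
            weights[k] = r_sum / n
    return weights, means, sds


def mixture_cutoff(weights, means, sds, tol=1e-9):
    """Bisect between the component means for the equal-density point."""
    def diff(x):
        return (weights[1] * normal_pdf(x, means[1], sds[1])
                - weights[0] * normal_pdf(x, means[0], sds[0]))
    lo, hi = means[0], means[1]
    if not (diff(lo) < 0 < diff(hi)):
        raise ValueError("components are not separable between their means")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Simulated (not real) expression values: 75% "low" and 25% "high".
random.seed(0)
simulated = ([random.gauss(5.0, 1.0) for _ in range(300)]
             + [random.gauss(10.0, 1.0) for _ in range(100)])
w, m, s = fit_two_gaussians(simulated)
cutoff = mixture_cutoff(w, m, s)
high_fraction = sum(x > cutoff for x in simulated) / len(simulated)
```

Samples above `cutoff` would be labelled as the high-expression group. A mixture-based cut-off of this kind follows the shape of the distribution, so the high group need not contain half of the patients as it would with a median split.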

2.3.3. Clinical Definitions

CR and no response (NR) were defined as the presence or absence, respectively, of normal erythropoiesis, granulopoiesis and megakaryopoiesis, a PB absolute neutrophil count ≥1×10⁹/L and platelet count ≥100×10⁹/L, and <5% leukemic blast cells in BM. Event-free survival (EFS) was defined as time from disease diagnosis until relapse or death. OS and EFS were not calculated for patients with treatment-related mortality (<28-day OS). Patients lost to follow-up were censored at the date of last follow-up. Cytogenetic risk stratification of patients into favorable, intermediate and adverse groups was done in accordance with NCCN guidelines (www.nccn.org).

Survival Analysis. Analysis of clinical endpoints and response to therapy in the TCGA and OCI/PM datasets was done exclusively for patients who underwent induction therapy. A detailed description of univariate, multivariate and ROC analyses can be found in the Supplementary Information.

Unless otherwise stated, mean±S.E.M. values are given and P values were calculated using the two-tailed unpaired Student’s t-test. *P<0.05, **P <0.01, ***P<0.001.
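For readers less familiar with censored survival estimates, the following is a minimal sketch of a Kaplan-Meier estimator consistent with the definitions above (an event is relapse or death; patients lost to follow-up are censored at last follow-up). The function names and the toy follow-up times are hypothetical; the analyses in this chapter were not necessarily run with this code.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event was observed, 0 if the patient was censored
    """
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set.
        at_risk -= removed
    return curve


def survival_at(curve, t):
    """Estimated survival probability at time t (step function)."""
    value = 1.0
    for time, surv in curve:
        if time <= t:
            value = surv
        else:
            break
    return value


# Toy data in months (hypothetical): 1 = event, 0 = censored.
toy_curve = kaplan_meier([6, 12, 12, 20, 30], [1, 1, 0, 1, 0])
three_year = survival_at(toy_curve, 36)
```

A three-year survival rate such as those reported in the Results corresponds to reading the step function at 36 months, as `survival_at(toy_curve, 36)` does here.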


2.3.4. Cell Culture

OCI/AML-2, OCI/AML-3 and NB-4 cell lines were cultured in αMEM, U937 was cultured in RPMI 1640, all with 10% fetal bovine serum (FBS), 100 units/mL penicillin and 100 μg/mL streptomycin at 37°C and 5% CO2. For growth curves, cells were seeded at 1×10⁴ or 1×10³ cells/mL in 10 cm dishes and counted daily. Low serum medium contained 1% FBS. For cell cycle analysis, cells were synchronized by overnight culture in 1% FBS prior to growth for 24 hours in 10% FBS. Cells were washed and fixed in 70% ethanol before staining with propidium iodide (40 μg/mL) supplemented with RNAse A (100 μg/mL) for 30 minutes at 4°C. Cells were analyzed by flow cytometry and cell cycle distribution was determined using ModFit LT software.

2.3.5. DNA Plasmids

Human wild-type and C824S-mutant INPP4B cDNAs were cloned into pSMAL-GFP or pSMAL-Puro (a kind gift from J. Dick; van Galen et al., 2014). Site-directed mutagenesis (SDM) to generate C824S mutant INPP4B was carried out using a Q5 Site-Directed Mutagenesis Kit (New England Biolabs). SDM primers: 5’-TTTCACCTGTAGTAAAAGTGC-3’ (sense) and 5’-CGAATACCATTCAGTTTGC-3’ (antisense).


2.3.6. Lentivirus Production

pSMAL-GFP/Puro, pSMAL-GFP/Puro-FLAG-INPP4Bwt and pSMAL-GFP/Puro-FLAG-INPP4BC824S (mutant) lentiviruses were produced by calcium phosphate transfection of 293-T cells with psPAX2 and VSVG as described by the manufacturer (Life Technologies). Viral particles in supernatants collected at 48h and 72h post-transfection were enriched with Lenti-X Concentrator (Clontech). OCI/AML-2, OCI/AML-3, U937 and NB4 cells were infected with lentivirus for 24 hours with 8 μg/mL protamine sulfate and enriched by fluorescence activated cell sorting (FACS) or puromycin selection.

2.3.7. Immunoblotting and Immunofluorescence

Western blotting was performed with anti-phospho-Ser473AKT (#4060)/Thr308AKT (#9275), panAKT (#9272), PTEN (#9552), INPP4B (#8450), FLAG (#8146) and GAPDH (#2118) antibodies from Cell Signaling Technology (CST). For immunofluorescence, AML cells were seeded onto poly-lysine coverslips in 24-well plates. 24 hours later, cells were washed in PBS and fixed in 2% paraformaldehyde for 10 min, then blocked with 3% goat serum in PBS. INPP4B antibody was diluted to 1/500 and incubated with cells for 24 hours at 4°C. Labeling was revealed using anti-rabbit antibody conjugated to PE (CST) and counter-stained with DAPI for 5 min, then mounted with Mowiol 4-88 (Calbiochem). Images were captured with a Leica DMIRB fluorescence microscope and a Leica DC 300RF camera.


2.3.8. Quantitative RT-PCR

RNA was extracted using RNeasy Mini kit (Qiagen) and first-strand cDNA synthesis was performed using the MMLV systems (Life). qPCR was performed in triplicate using SYBR-Green PCR Master Mix kit (Applied Biosystems) on a 7900 HT Real-Time PCR system with SDS v2.3 software (Applied Biosystems) using standard settings: 95°C (10 min) and 40 cycles of 95°C (20 sec) and 60°C (1 min). mRNA levels were normalized to the β2M housekeeping gene. The following primers were used for qPCR: INPP4B, 5’-GGAAAGTGTGAGCGGAAAAG-3’ (sense) and 5’-CGAATTCGCATCCACTTATTG-3’ (antisense); β2M (control), 5’-TGCTGTCTCCATGTTTGATGTATCT-3’ (sense) and 5’-TCTCTGCTCCCCACCTCTAAGT-3’ (antisense). INPP4B expression values derived from PB or BM were highly correlated (Pearson r=0.97, P<0.0001, Figure 2.1), therefore we used either for our assessment of INPP4B levels.
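Normalization to a housekeeping gene can be sketched as follows. The text states only that mRNA levels were normalized to β2M; the 2^(-ΔCt) convention, the assumption of roughly 100% amplification efficiency, the function name and the Ct values below are illustrative assumptions, not the documented procedure.

```python
def relative_expression(ct_target, ct_reference):
    """Relative expression by the 2^(-dCt) convention (assumed, not from the text).

    ct_target    : mean threshold cycle of the gene of interest (e.g. INPP4B)
    ct_reference : mean threshold cycle of the housekeeping gene (e.g. B2M)
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


def mean(values):
    """Arithmetic mean, used here to average technical triplicates."""
    return sum(values) / len(values)


# Hypothetical triplicate Ct values for one sample.
inpp4b_ct = mean([25.1, 24.9, 25.0])
b2m_ct = mean([20.0, 20.1, 19.9])
rel = relative_expression(inpp4b_ct, b2m_ct)
```

Under this convention, a target that crosses the threshold five cycles later than the reference is reported at 2^(-5) of the reference level.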

Figure 2.1. INPP4B expression in peripheral blood versus bone marrow. INPP4B expression in samples derived from PB and BM are highly correlated.


2.3.9. Methylcellulose Colony Forming Cell Assay

0.9 mL of 1×10³ cells/mL were combined with 1.2 mL of 2.1% (w/v) methylcellulose and 0.9 mL fetal bovine serum; 3 mL was plated in triplicate on 35 mm plates with gridlines. Plates were imaged and counted after 9 days at 37°C in 5% CO2 with the EVOS® XL Core Imaging System.

2.3.10. Drug/Ionizing Radiation Treatment

Daunorubicin (DIN 01926683, Erfa Canada Inc) was used to treat control and INPP4B OCI/AML-2 and OCI/AML-3 cell lines. Cells were seeded at 1×10⁵ cells/mL in 2 mL of αMEM complete medium in a 12-well plate with daily addition of drug. Cells were irradiated with an X-RAD 320 irradiation system (AGFA). Cell viability was assessed daily with trypan blue or AlamarBlue® (Life Technologies).

2.3.11. Phosphatase Assays

INPP4B immunoprecipitated from 500 μg of OCI/AML-2 cell lysates (control, INPP4Bwt and INPP4BC824S) was combined with 100 μM diC8-PtdIns(3,4)P2 (Echelon) in 25 mM Tris-HCl (pH 7.5), 140 mM NaCl, 1 mM DTT at 37°C for 1h. Purified INPP4A protein (Echelon) was used as a positive control; the negative control contained no enzyme. Free inorganic phosphate release was measured with Pi-Glo: A Universal Bioluminescent Phosphatase Assay (a kind gift from Promega Corp.).


2.4. RESULTS

2.4.1. High Levels of INPP4B Expression are Associated with a Poor Outcome in AML.

To investigate INPP4B expression in AML, I examined INPP4B transcript levels across several publicly available AML datasets and 279 AML patients from the OCI/PM tissue bank, a total of 2193 unique AML patients (Figure 2.2A). Analysis of the different AML datasets revealed that INPP4B expression profiles were consistent across all datasets examined (Figure 2.2B). Frequency distribution analysis utilizing a fit of mixture model (Budczies et al., 2012) identified a population of high INPP4B expressers comprising 25% of patients across all the datasets (Figure 2.2C). Western blot and immunofluorescence (IF) confirmed that INPP4B protein expression levels were consistent with transcript levels of INPP4B in cells from the OCI/PM patient dataset (Figure 2.2D,E).


Figure 2.2. High levels of INPP4B are observed in a subset of AML patient samples. (A) AML patient gene expression datasets used in this study. (B) Heat map and (C) frequency distribution plot of INPP4B expression across AML patients demonstrate similar expression patterns in all examined datasets, with 25% of the patients harboring high levels of INPP4B. (D) Immunofluorescence detection of INPP4B and (E) immunoblots from OCI/PM AML patient samples chosen based on low or high transcript expression.


I then evaluated the clinical prognostic significance of INPP4B in AML patients using datasets for which clinical data was available, including the Metzeler et al. (Metzeler et al., 2008), Bullinger et al. (Bullinger et al., 2007), TCGA (The Cancer Genome Atlas Research Network, 2013), Verhaak et al. (Verhaak et al., 2009) and OCI/PM datasets (Figure 2.2A). First, examination of response to first induction therapy revealed that INPP4Bhigh patients had lower rates of CR compared to their INPP4Blow counterparts (57% vs. 74%, respectively; P=0.01; Figure 2.3A-C). Accordingly, the mean expression of INPP4B was significantly lower in CR vs. NR samples (61.8±2.0 vs. 84.9±10.7, P=0.003; Figure 2.3D). Similar CR vs. NR response rates were observed in the Verhaak (Verhaak et al., 2009) clinical dataset (Figure 2.4).
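The comparison of CR rates between expression groups rests on Fisher’s exact test applied to a 2×2 table (expression group by response). A minimal sketch of the two-sided test is given below. The function name is hypothetical and the example table is a toy illustration, not the patient counts behind Figure 2.3.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Probability of observing x in the top-left cell given the margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-7))


# Toy table: rows = two groups, columns = responders / non-responders.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

An exact test of this kind makes no large-sample approximation, which is useful when one of the groups, such as the INPP4Bhigh subset, is comparatively small.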

Next, survival analysis showed that INPP4Bhigh patients had significantly shorter OS and EFS compared to INPP4Blow patients (Figure 2.3E-M). Three-year OS rates of the INPP4Blow vs. INPP4Bhigh AML patient groups were as follows: Metzeler I (41.4%±5.1% vs. 20.9%±6.9%; P=0.005), Metzeler II (49.6%±7.0% vs. 29.4%±11.1%; P=0.05), Bullinger (46.2%±6.4% vs. none; P=0.002), OCI/PM (41.6%±3.5% vs. 21.4%±4.9%; P<0.001) and TCGA (44.4%±4.6% vs. 27.4%±8.0%; P=0.02) (Figure 2.3E-J; Table 2.1). Three-year EFS rates were: OCI/PM (33.2%±3.3% vs. 18.6%±4.6%; P<0.001) and TCGA (32.7%±4.4% vs. 17.2%±6.6%; P=0.03) (Figure 2.3K-M; Table 2.1).


Figure 2.3. INPP4Bhigh AML patients have lower complete remission rates and shorter survival. (A) Heatmap of 279 primary AML samples from the OCI/PM in which INPP4B expression was quantified by qPCR and aligned with corresponding response to treatment. (B) Chemotherapy response rates and (C) corresponding Fisher’s exact test. (D) Average INPP4B expression in responder vs. non-responder AML samples. Kaplan-Meier plots for INPP4Bhigh vs. INPP4Blow AML patient OS in (E) Metzeler I, (F) Metzeler II, (G) Bullinger, (H) OCI/PM, (I) TCGA and (J) Verhaak datasets. Kaplan-Meier plots for INPP4Bhigh vs. INPP4Blow AML patient EFS in (K) OCI/PM, (L) TCGA and (M) Verhaak datasets.


Figure 2.4. INPP4Bhigh AML patients have lower complete remission rates. (A) Heatmap of 393 primary AML samples from the Verhaak dataset in which INPP4B expression was quantified by qPCR and aligned with corresponding response to treatment. (B) Summary of chemotherapy response rates and (C) corresponding Fisher’s exact test. (D) Average INPP4B expression in responders vs. non-responders.


2.4.2. INPP4Bhigh is an Independent Prognostic Marker in AML

Univariate analysis revealed that INPP4Bhigh was associated with an approximately two-fold increased risk of death in TCGA (HR=2.2; P=0.02) and OCI/PM (HR=1.9; P<0.001) datasets (Table 2.1; Table 2.2). To determine whether INPP4Bhigh is an independent biomarker of poor clinical outcome, I performed multivariate analysis (Tables 2.3-2.5), where I found that INPP4Bhigh disease was associated with an increased risk of death (HR=1.8; P=0.001) and an increased likelihood of a leukemia-associated event (HR=1.8; P<0.001). To ensure that stem cell transplantation was not a confounding factor in INPP4B-related prognosis, transplant-censored Kaplan-Meier survival analysis and multivariate analysis were conducted (Figure 2.5A-C). Overall, the likelihood ratio test (LRT) assessment of the multivariate model demonstrated that INPP4Bhigh significantly improved the predictive models for OS (LRT χ²=9.4, P=0.002) and EFS (LRT χ²=11.6, P<0.001), and consistently achieved better predictability measures as compared to FLT3-ITD, FLT3-TKD and NPM1 mutations (Table 2.2). To further compare the relative hazard associated with INPP4Bhigh compared to FLT3-ITD mutations, I examined OS and EFS between total, FLT3-ITD+, and INPP4Bhigh patients (Figure 2.6A,B). I observed that INPP4Bhigh was associated with a greater risk of death compared to FLT3-ITD+ relative to total AML (HR=1.9 vs. 1.5).
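The relationship between an LRT statistic and its P value can be checked with a short sketch. It assumes that the nested models differ by a single covariate (INPP4B status), so that the statistic is referred to a chi-square distribution with one degree of freedom; the degrees of freedom are an assumption, as they are not stated in the text, and the function names and log-likelihood values are hypothetical.

```python
import math


def lrt_statistic(loglik_reduced, loglik_full):
    """Likelihood ratio statistic for two nested models."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 df, P(X > x) equals erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


# LRT statistics as reported in the text for OS and EFS.
p_os = chi2_sf_1df(9.4)
p_efs = chi2_sf_1df(11.6)

# Hypothetical partial log-likelihoods, only to show how a statistic arises.
stat = lrt_statistic(-512.3, -507.6)
```

Under the one-degree-of-freedom assumption, the reported statistics of 9.4 and 11.6 give tail probabilities in line with the quoted P=0.002 and P<0.001.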


Table 2.1. Clinical Characteristics, Treatment Effects and Survival Outcomes


Table 2.2. Molecular Associations with INPP4B in AML


Table 2.3. Multivariate Analysis and Model Comparison of Survival in Total, CN and Intermediate Cytogenetic Risk Group AML Patients


Table 2.4. Detailed Multivariate Analysis of OS in AML (All Patients)

Table 2.5. Detailed Multivariate Analysis of EFS in AML (All Patients)


Figure 2.5. INPP4Bhigh AML patients have shorter survival independent of transplantation status. Kaplan-Meier OS (A) and EFS (B) plots for INPP4Bhigh vs. INPP4Blow AML patients in OCI/PM datasets without transplantation. (C) Multivariate analysis and model comparison of survival in AML patients with transplantation status as a covariable.


Figure 2.6. INPP4Bhigh constitutes a significant hazard in total and CN-AML. Comparison of (A) OS and (B) EFS of patients with INPP4Bhigh or FLT3-ITD to total AML population in the TCGA-OCI/PM merged dataset. (C) Forest plot of multivariate log hazard rates for OS in AML patients in the TCGA-OCI/PM merged dataset. (D) ROC analysis of EFS for 5 putative AML biomarkers. (E) OS and (F) EFS of INPP4Bhigh vs. INPP4Blow CN-AML patients. Comparison of (G) OS and (H) EFS of CN-patients with INPP4Bhigh or FLT3-ITD to total AML population in the TCGA-OCI/PM merged dataset.


Finally, to estimate the relative importance of INPP4Bhigh among other common covariates, I used the TCGA dataset to measure the respective Log Hazard Rate (β) as determined by multivariate Cox proportional hazards analysis (Figure 2.6C). This analysis demonstrated that in the TCGA AML dataset, INPP4Bhigh ranks highly among the known negative prognostic markers and its contribution to the prediction model was statistically significant. I also used ROC analysis to estimate the sensitivity and specificity of INPP4B expression level for the prediction of 3- and 5-year OS and EFS as compared to 4 other genes that have been reported to have altered gene expression that may be of prognostic relevance (MN1, BAALC, MECOM, SALL4) (Figure 2.6D; Figure 2.7). ROC analysis did not yield AUC above 0.7 for any of the five putative biomarkers. This finding was not unexpected given recent findings in a large-scale analysis by Walter et al. (Walter et al., 2014) where no single risk factor achieved AUC higher than 0.7, although INPP4B had a greater AUC than each of the other 4 genes in the TCGA dataset.
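To make the AUC comparison concrete, the sketch below computes the area under the ROC curve as the probability that a randomly chosen patient with the event has a higher marker value than a randomly chosen patient without it. It treats the outcome at a fixed time point as binary and ignores censoring, which is a simplification; the function name and toy values are hypothetical, and the ROC method actually used is described in the Supplementary Information.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5).

    scores : marker values (e.g. expression level) per patient
    labels : 1 if the event occurred by the chosen time point, else 0
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy marker values and outcomes (hypothetical).
auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

An AUC of 0.5 corresponds to a marker that ranks patients no better than chance, which gives context for the 0.7 threshold discussed above.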


Figure 2.7. ROC comparison of INPP4B and several other expression-based clinical outcome markers. TCGA clinical data was used to compare ROC curves of MN1, MECOM, BAALC and SALL4 to that of INPP4B for 1-, 3- and 5-year (A-C) OS and (D-F) EFS. (G) Respective AUC values were next compared for all genes.


2.4.3. INPP4Bhigh is Associated with Poor Outcome in AML with Normal Cytogenetics

Cytogenetic aberrations are widely recognized as key prognostic determinants in AML, however there remains a great challenge in predicting clinical outcome in cytogenetically normal

AML patients (Baldus & Bullinger, 2008; Foran, 2010; Grimwade et al., 1998). I explored whether

INPP4B expression could further contribute to risk stratification of cytogenetically normal AML

(CN-AML) patients. Kaplan-Meier analysis revealed that INPP4Bhigh CN-AML patients have significantly shorter OS (P<0.001) and EFS (P<0.001) (Figure 2.6E,F). In a multivariate analysis, INPP4Bhigh conferred an approximately two-fold increased hazard relative to INPP4Blow for OS (HR=1.8; P=0.007) and EFS (HR=1.8; P=0.008), significantly improved the predictive models for OS (LRT χ2=6.5, P=0.01) and EFS (LRT χ2=6.4, P=0.01), and consistently achieved better predictability measures than FLT3-ITD, FLT3-TKD and NPM1 mutations (Table 2.2). Finally, I compared OS and EFS in all, FLT3-ITD+ and INPP4Bhigh CN-AML patients. I observed that

INPP4Bhigh CN-AML and FLT3-ITD+ CN-AML had a significantly decreased survival (OS

HR=2.0; EFS HR=2.4) (Figure 2.6G,H). Furthermore, INPP4B was shown to also improve the risk stratification of cytogenetically intermediate risk group AML patients (Figure 2.8A-D; Table

2.2).
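The likelihood ratio test (LRT) P values quoted above can be checked with a short calculation. The sketch below assumes that adding INPP4B status to the model contributes a single parameter, so that the LRT statistic is compared against a chi-square distribution with 1 degree of freedom; the degrees of freedom are an assumption made here, not a detail stated in the text.

```python
import math


def chi2_upper_tail_1df(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2.0))


# LRT statistics reported above for the CN-AML models.
p_os = chi2_upper_tail_1df(6.5)
p_efs = chi2_upper_tail_1df(6.4)

print(round(p_os, 3), round(p_efs, 3))
```

Under this assumption both statistics give P values of approximately 0.01, in agreement with the reported values.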


Figure 2.8. INPP4Bhigh constitutes a significant hazard in intermediate cytogenetic risk group AML. (A) OS and (B) EFS of INPP4Bhigh vs INPP4Blow intermediate cytogenetic risk group AML patients. Comparison of (C) OS and (D) EFS of intermediate risk patients with

INPP4Bhigh or FLT3-ITD to total AML population in the TCGA-OCI/PM merged dataset.


2.4.4. Ectopic Overexpression of INPP4B Leads to Increased Colony Forming Potential and

Increased Proliferation in AML Cell Lines

My clinical findings provided a convincing rationale for exploring a direct role for INPP4B overexpression in leukemogenesis and therapy response. To test if INPP4B overexpression might provide leukemic cells with a growth advantage, I generated OCI/AML-2, OCI/AML-3, U-937, and NB-4 cell lines that express FLAG-tagged INPP4B using the pSMAL-GFP/Puro lentiviral vectors17 (Figure 2.9A-B; Figure 2.10A,D-E). Once successful ectopic INPP4B overexpression was confirmed, I measured clonogenic potential using methylcellulose colony forming cell (CFC) assays (Figure 2.9C-D; Figure 2.10C). CFC assays demonstrated that INPP4B overexpression conferred a significant increase in colony formation potential. Next, basal growth of INPP4B overexpressing cells was monitored in normal and low serum conditions. INPP4B overexpressing

OCI/AML-2 (Figure 2.9E), OCI/AML-3 (Figure 2.9F), U-937 and NB-4 cells (Figure 2.10F,G) had a proliferative advantage in normal serum levels compared to empty vector controls. To explain this increased proliferation, I examined the basal viability of INPP4B overexpressing

OCI/AML-2 and OCI/AML-3 cell lines. Annexin-V staining revealed small but consistently lower levels of apoptosis in INPP4B overexpressing OCI/AML-2 (P<0.001), OCI/AML-3 (P=0.01) and NB-4 (P<0.001) cell lines (Figure 2.9E,F inserts; Figure 2.10G). Growth of

OCI/AML-2 and OCI/AML-3 in low-serum conditions demonstrated that growth-factor independent survival was greater in INPP4B overexpressing lines (Figure 2.9G,H; Figure

2.10H,I). Together, these findings demonstrate that INPP4B overexpression provides AML cells with a growth advantage as observed in CFC assays and suspension culture; a phenotype that may be fostered, in part, by decreased basal apoptotic activity. These findings are consistent with our clinical observations and further implicate INPP4B in AML pathogenesis.


Figure 2.9. Ectopic overexpression of INPP4B in AML cells leads to increased colony forming potential and proliferation. (A) OCI/AML-2 and (B) OCI/AML-3 were transduced with empty pSMAL or pSMAL-FLAG-INPP4B lentivirus. Cells were sorted for GFP positivity followed by qPCR and Western blots with indicated antibodies. One thousand sorted (C)

OCI/AML-2 and (D) OCI/AML-3 were plated in methylcellulose for colony forming assay; representative fields of resulting colonies are presented. Sorted OCI/AML-2 and OCI/AML-3 were grown in (E, F) normal and (G, H) low serum conditions and trypan blue cell counts were performed daily. (E, F) Inserts demonstrate apoptotic index as measured by Annexin V flow cytometry. All P-values were derived using the Student’s t-test. *P<0.05, **P<0.01, ***P<0.001.


Figure 2.10. Ectopic overexpression of INPP4B in AML cells leads to increased proliferation. (A) Basal INPP4B expression in AML cell lines. (B) Cell cycle phase distribution in OCI/AML-2 and OCI/AML-3 cells. (C) Colony formation potential of U-937 cells. (D) Western blots with anti-FLAG and anti-INPP4B to demonstrate INPP4B overexpression; anti-PTEN and anti-GAPDH were used as loading controls. (E) NB-4 cells were infected with empty pSMAL or pSMAL-FLAG-INPP4B lentivirus. Cells were sorted for GFP positivity followed by qPCR and Western blots with anti-FLAG and anti-INPP4B to demonstrate INPP4B overexpression; anti-PTEN and anti-GAPDH were used as loading controls. Sorted U-937 and NB-4 cells were grown in (F, G) normal and (H, I) low serum conditions and cell counts were performed daily. *P<0.05, **P<0.01, ***P<0.001.


2.4.5. INPP4B overexpression in AML cells leads to reduced sensitivity to Daunorubicin and

Ionizing Radiation

To examine a direct role for INPP4B in mediating chemotherapy resistance in AML,

INPP4B-overexpressing OCI/AML-2 and -3 cell lines were treated in vitro with daunorubicin and cell viability was measured using trypan blue or Annexin-V flow cytometry. Dose-response curves and time course analysis demonstrate that INPP4B-overexpressing cell lines are less sensitive to daunorubicin (EC50 20.7±8.2 vs 63.5±9.6 nM OCI/AML-2; EC50 46.2±7.4 vs 76.4±5.9 nM

OCI/AML-3) (Figure 2.11A-D). These findings provide evidence that INPP4B overexpression may play a direct role in chemotherapy response. Given that daunorubicin is a DNA-damaging agent that is sensitive to drug efflux and hence varying intracellular levels, I also investigated whether ectopic INPP4B overexpression affected resistance to ionizing radiation (IR) which is not complicated by the intracellular level of drug. As with daunorubicin, IR was better tolerated in

INPP4B-overexpressing OCI/AML-2 and -3 cell lines (Figure 2.11E,F). These results suggest that

INPP4B is important in mediating cellular response to DNA-damaging agents.


Figure 2.11. INPP4B overexpression is associated with resistance to chemotherapy and ionizing radiation. Trypan blue cell counts were used to establish a dose response and EC50 for

(A) OCI/AML-2 and (B) OCI/AML-3 at 24 hours. (C) OCI/AML-2 and (D) OCI/AML-3 cell lines were examined for up to 96 hours of treatment with 0, 10 and 50 nM of Daunorubicin. (E)

OCI/AML-2 and (F) OCI/AML-3 were treated with 10 Gy ionizing radiation at 2.5 Gy/min and assessed at indicated times.


2.4.6. Phosphatase dependence of INPP4B-mediated phenotypes in AML

INPP4B catalyzes the dephosphorylation of PI(3,4)P2 to generate PI(3)P through the function of its conserved C-terminal CX5R dual specificity-phosphatase consensus domain:

CKSAKDRT (AA 842-849). To test the phosphatase dependency of INPP4B in conferring the

AML-associated phenotypes described in this study, I generated an INPP4B construct with an active site cysteine to serine (C842S) mutation, which has been reported to render INPP4B catalytically inactive (Fedele et al., 2010; Gewinner et al., 2009). A phosphatase activity assay performed with immunoprecipitated INPP4B from INPP4B, mutant and control infected OCI/AML-2 cells (Figure

2.12A) demonstrated increased phosphatase activity on a PI(3,4)P2 substrate by INPP4B cells compared to empty vector control cells (Figure 2.12B). This assay also verified the lack of phosphatase function of the mutant INPP4B in mutant infected cells (Figure 2.12B).


Figure 2.12. Phosphatase dependence of INPP4B-mediated phenotypes in AML. (A) Western blot of OCI/AML-2 cells overexpressing INPP4B or the catalytically inactive INPP4B mutant. (B) PI(3,4)P2 directed phosphatase activity of immunopurified INPP4B protein was measured using a Pi glow luminescence assay. (C) OCI/AML-2 proliferation, (D) growth in low serum, and (E, F) colony formation potential were assessed. (G) OCI/AML-2 cells were also treated with daunorubicin to establish a dose response curve and EC50s and (H) with 10 nM and 50 nM of daunorubicin to assess viability at 0, 24 and 48 hours.


Next, in growth assays with INPP4B, mutant and empty vector control infected OCI/AML-2 cells, INPP4Bwt overexpression led to increased proliferation; however, mutant overexpressing cells demonstrated an intermediate increase in proliferation compared to the control (Figure 2.12C).

Similarly, growth in low serum conditions demonstrated that the mutant partially recapitulates the increased viability observed in INPP4B cells (Figure 2.12D). Conversely, colony formation assays revealed that mutant overexpressing cells were no different from control cells in colony formation. Importantly, dose response and time course analysis demonstrated that INPP4B overexpressing cells were less sensitive to daunorubicin-associated toxicity compared to both mutant and control (Figure 2.12G,H). Together, these results provide evidence that the in vitro effects of INPP4B are phosphatase-dependent, with the catalytically inactive mutant showing total or partial loss of all in vitro phenotypes.

Although it has been previously shown that INPP4B overexpression leads to decreased AKT signaling in epithelial cancers, there was no clear association between INPP4B expression levels and phospho-AKT levels in a small panel of AML patient samples (Figure 2.2E). Similarly, phospho-AKT levels remained unchanged in OCI/AML-2 and OCI/AML-3 cell lines overexpressing INPP4B as measured by Western blot and flow cytometry (Figure 2.12A; Figure

2.13A-B). Unlike its role in other cancers, our findings suggest that INPP4B overexpression does not impact PI3K signaling in AML.


Figure 2.13. INPP4B overexpression does not affect AKT activation levels. A) Western blots of control and INPP4B-overexpressing OCI/AML-2, OCI/AML-3 and U-937 cells with INPP4B,

PTEN, pAKT(Ser473), AKT and GAPDH antibodies. B) Flow cytometry was used to measure pAKT(Ser473) in control and INPP4B-overexpressing OCI/AML-2 and OCI/AML-3 cell lines.


2.5. DISCUSSION

The results presented here provide several lines of evidence indicating that high levels of

INPP4B expression in AML lead to more aggressive cellular phenotypes and, consequently, poor disease outcome. First, I found in six independent gene expression datasets that high INPP4B levels were associated with shorter survival times (Figure 2.3). This consistency across datasets that are heterogeneous with respect to clinical treatment centres and specific treatment regimens suggests that INPP4B expression could be a robust biomarker for AML regardless of the planned therapy. Detailed clinical analysis suggests that the associated decreased survival may be attributed to reduced chance of obtaining remission due to decreased response to drug (Table 2.1).

Importantly, multivariate analyses revealed that INPP4Bhigh is an independent predictor of poor prognosis and significantly improves AML prognostication models (Table 2.2). Notably,

INPP4Bhigh was a better predictor of poor outcome when compared to other known clinical and molecular variables or characteristics including biomarkers such as FLT3-ITD, FLT3-TKD and

NPM1 in the OCI/PM and TCGA AML datasets (Figure 2.6B,C). ROC analysis demonstrated that

INPP4B status has larger AUC compared to other common biomarkers with gene expression alteration in AML.

In AML there is a large degree of uncertainty with regards to outcomes in patients designated as CN-AML (Figure 2.6D-G) or intermediate cytogenetic risk groups (Figure 2.8) using standard induction regimens. In addition to the risk stratification provided by mutations in genes such as

MLL, FLT3, CEBPA, or NPM1, recent analyses have provided evidence that altered gene expression may be of prognostic relevance in AML. Thus, I tested the ability of INPP4B expression to further subgroup CN-AML patients. Indeed, INPP4Bhigh was predictive of poor prognosis in patients with CN-AML (Figure 2.6C; Table 2.2). Moreover, like FLT3-ITD status (a biomarker often used to determine relative risk in intermediate risk patients), INPP4Bhigh was also effective at segregating risk groups. This finding suggests that INPP4Bhigh could be of use in AML subgrouping within the CN-AML patient populations. Together, these findings suggest that

INPP4B is a novel prognostic biomarker that is superior to many molecular markers currently used in clinic. Overall, the use of INPP4Bhigh alongside other markers in clinical practice may yield improved risk stratification. Our findings substantiate the development of a reliable and reproducible gene expression assay for clinical diagnostic purposes.

Next I explored whether the above clinical findings were merely associated with or potentially due to biological effects of INPP4B overexpression. For this, I overexpressed INPP4B in AML cell lines and measured various growth and survival characteristics. In contrast to previous publications which demonstrated that INPP4B overexpression leads to decreased growth potential (Fedele et al.,

2010; Gewinner et al., 2009), overexpression in AML cells was associated with increased colony forming potential, increased proliferation in full and low serum conditions and decreased cell death

(Figure 2.9) in a phosphatase-dependent manner (Figure 2.12). Moreover, our data suggests that although the catalytic activity of INPP4B is important in mediating growth and drug resistance phenotype in AML, INPP4B overexpression does not impact AKT activation in AML, suggesting alternative mechanisms are likely at play. In recent publications, INPP4B has been demonstrated to activate serum- and glucocorticoid-regulated kinase-3 (SGK3) phosphorylation through PI(3)P signaling (Bago et al., 2014; Gasser et al., 2014). In future work it would be important to explore the role of SGK3 activation in leukemogenesis. Importantly, the cellular phenotypes observed upon


INPP4B overexpression are consistent with the clinical features I have identified in INPP4Bhigh

AML and suggest a direct and causative role for INPP4B overexpression in promoting AML pathogenesis. Moreover, these analyses are consistent with the notion that INPP4B overexpression has oncogenic potential in the myeloid lineage.

Given the association with reduced remission rate and OS, I examined the effect of INPP4B on response to chemotherapy. Consistent with the reduced chance of CR, INPP4B overexpression in AML cell lines conferred resistance to daunorubicin, a chemotherapy agent used in induction therapy (Figure 2.11A-D), in a completely phosphatase-dependent manner (Figure 2.12F-G). To our knowledge, this represents the first study to demonstrate an association between high INPP4B expression and therapy responsiveness in AML, suggesting that INPP4B overexpression may promote cellular mechanisms leading to drug resistance. This INPP4B-associated resistance to cell death is likely due to tolerance to DNA damage, as INPP4B-overexpressing cells showed reduced killing upon exposure to ionizing radiation (Figure 2.11E,F).

The apparent contrasting roles of INPP4B in AML versus breast and other epithelial cancers

(Fedele et al., 2010; Gewinner et al., 2009; Hodgson et al., 2011; Perez-Lorenzo et al., 2014;

Stjernström et al., 2014; The Cancer Genome Atlas Research Network, 2012; Westbrook et al.,

2005) suggest that phosphoinositide signaling may be distinct in different cell types. Importantly, in its tumour suppressive role, INPP4B is an unlikely drug target because it would need to be activated or induced to achieve clinical benefit. However, in an oncogenic role, as proposed here for AML, INPP4B becomes a possible target for directed therapy. Further work is required to investigate the specific mechanistic role of INPP4B in promoting AML and its potential and/or effectiveness as a therapeutic target in AML.


CHAPTER 3.

SUBID, A NOVEL SUBGROUP IDENTIFICATION TOOL, DEMONSTRATES THE CONTEXT-DEPENDENT PROGNOSTIC SIGNIFICANCE OF INPP4B ACROSS CANCERS

This chapter is in preparation for submission

Irakli Dzneladze, Carla Rossell*, John F Woolley*, Ayesha Rashid, Mike Jain, Jüri Reimand, Mark D Minden¥, Leonardo Salmena¥

The majority of the SubID tool development, data analysis, data interpretation, and manuscript writing was done by ID. All other work was coordinated and/or conducted with coauthors. ID developed SubID under guidance of JR; ID and CR performed the data analysis; ID performed the clinical data analysis; ID, JFW, AR, MJ, JR, LS and MDM critically discussed results and manuscript; ID wrote the paper. JR, LS and MDM supervised the project.


3.1. ABSTRACT

The phosphatase INPP4B has been shown to act as a tumor suppressor across several cancer types including breast, ovarian and prostate. However, my work in Chapter 2 has uncovered the unexpected observation that INPP4B can also act in an oncogene-like manner in the context of AML. Specifically, higher levels of INPP4B expression observed in 25% of AML patients define a prognostically distinct subgroup associated with poor patient outcome and lower CR rates. To identify the transcriptional machinery responsible for INPP4B overexpression in AML, and characterize the pan-cancer prognostic significance of INPP4B, I developed a subgroup identification tool (Subgroup Identifier; SubID) used for non-median patient dichotomization. SubID was designed to study the relationship between a continuous variable and an optimization parameter in a heterogeneous population. Offering a more informative alternative to the commonly used median dichotomization, SubID recognizes that significantly distinct subgroups can make up less than 50% of a population. Furthermore, the visual output of SubID offers information regarding the nature and number of distinct subgroups in a population. SubID identified potential transcription factors whose expression is upregulated in the INPP4Bhigh AML subgroup and may be responsible for the increased expression of INPP4B. In vitro experimental validation of a top potential INPP4B regulator, EVI1, revealed that knockdown of EVI1 results in a respective change in INPP4B expression. Furthermore, chromatin immunoprecipitation demonstrated EVI1 binding at the INPP4B promoter region.

Next, SubID was applied to 25 TCGA datasets which revealed that INPP4B expression may be associated with patient survival in 13 different cancer types. Following stringent multiple testing and permutation corrections, I observed that low expression of INPP4B (INPP4Blow) was associated with shorter survival in kidney clear cell (HR=1.94; P=1.71E-05), liver hepatocellular (HR=2.13; P=4.44E-04), and bladder urothelial (HR=2.27; P=5.62E-05) carcinomas.

Conversely, INPP4Blow status was associated with longer survival in pancreatic adenocarcinoma

(HR=0.38; P=8.72E-05). Similar to Chapter 2, the oncogene-like association observed in pancreatic cancer was validated in two independent datasets (HR=0.39, P=4.6E-03 and

HR=0.59, P=2.0E-03). Furthermore, cross-validation demonstrated a similar pan-dataset optimal cut-off identified by SubID (TCGA: 40%, Chen: 35%, ICGC: 53%), thus providing validation of the SubID-derived non-median dichotomization point. Overall, this study describes the development and application of a novel subgroup identification tool to demonstrate the context-dependent nature of INPP4B in cancer. Furthermore, this study is the first to describe a previously unknown cancer context (pancreatic cancer) where INPP4B acts in an oncogene-like manner.

3.2. INTRODUCTION

Given the high degree of inter- and intra-tumor heterogeneity present in cancers, there is a great need for the better stratification of patients into specific subgroups. The identification of discrete subgroups under the umbrella of a unifying name may allow for the development of treatments that take advantage of the biology of particular subgroups, thus enabling targeted therapy to improve patient outcome. Defining the genetic, molecular, and cellular landscapes of tumors is key in delineating the degree and nature of heterogeneity for a given cancer across the spectrum of patients and within each patient with that disease. Ongoing efforts have been geared towards the identification of subgroups within cancers based on defined genetic and molecular parameters and morphology. For example, in AML, Valk et al. identified 16 distinct AML subgroups as defined by key molecular signatures based on high throughput microarray gene expression analysis of 285 AML patients (Valk et al., 2004). Similar analysis in pancreatic cancer patients revealed four defined subtypes (Bailey et al., 2016). Breast cancers are classified based on molecular features that include the expression or overexpression of growth factor receptors such as the Erb-B2 receptor tyrosine kinase 2 (ERBB2/HER2), ER, and PgR (Schnitt, 2010).

Identification of these features has guided the development of receptor tyrosine kinase inhibitors for the treatment of the HER2-positive breast cancer subtype, demonstrating the translational value of informed disease stratification. However, despite advances in characterizing the genomic landscape of cancers, we are still unable to accurately predict patient survival following specific therapies, or to identify the cancer-driving abnormalities in all patients. In this paper, I tackle both these aspects by developing a tool for identifying and studying subgroups within heterogeneous populations, as well as identifying INPP4B as a gene expression-based marker for predicting patient survival across cancers.

INPP4B is a phospholipid phosphatase involved in the regulation of membrane-bound phosphoinositides (PIs) that serve as second messengers in key cellular signal transduction pathways (Agoulnik et al.,

2011; Di Paolo and De Camilli, 2006). PIs are reversibly modified through the addition and removal of phosphate groups at the D3, D4, or D5 positions of the PI inositol ring, which serve as docking sites for signaling proteins. A prime function of INPP4B is the removal of the phosphate group at the D4 position of PI(3,4)P2 to generate PI(3)P (Norris et al., 1995; Norris and Majerus, 1994). INPP4B was first isolated and identified as a 105 kDa D4 lipid phosphatase by Norris et al. who demonstrated that INPP4B dephosphorylated PI(3,4)P2 to PI(3)P >900-fold more efficiently compared to its impact on PI(1,3,4)P3 (Norris et al., 1995). PI(3,4)P2, along with

PI(3,4,5)P3, are necessary for full activation of the pro-survival kinase AKT (Franke, 1997; Ma et al., 2008). Aberrations in components of the PI3K/AKT pathway are implicated in many cancers, most notably mutations of PIK3CA and PTEN (Yuan and Cantley, 2008). Likewise, alterations of INPP4B including LOH and reduced expression have strong associations with outcome in several types of cancer including breast, ovarian, prostate, thyroid, leukemia and skin cancer (reviewed in Woolley et al., 2015).

In keeping with its role in controlling PI(3,4)P2 levels and limiting pro-survival signaling through AKT, INPP4B was proposed to be a putative tumor suppressor (Fedele et al., 2010;

Gewinner et al., 2009; Westbrook et al., 2005). Studies in breast, prostate, melanoma and ovarian cancer datasets and cancer cells provided support for the tumor suppressive nature of

INPP4B (Gewinner et al., 2009; Hodgson et al., 2014; Perez-Lorenzo et al., 2014; Salmena et al., 2015). In human breast cancer models, INPP4B suppresses epithelial cell transformation, anchorage-independent growth, and AKT-dependent growth and proliferation (Gewinner et al.,

2009; Westbrook et al., 2005). INPP4B overexpression in prostate cancer cells decreases levels of phosphorylated AKT, and decreases cellular invasion (Hodgson et al., 2014). Clinically, loss of INPP4B expression in breast cancer is associated with decreased OS. In particular, LOH at the

INPP4B locus frequently occurs in basal-like (triple negative) breast cancers, characterized by poor clinical outcome (Fedele et al., 2010; Gewinner et al., 2009; Tokunaga et al., 2016; Won et al., 2013). Furthermore, Won et al. demonstrated INPP4B to be a biomarker of basal-like breast cancer subtype with 99% specificity and 61% sensitivity (Won et al., 2013). Similarly, in ovarian cancer, loss of INPP4B expression is associated with shorter OS and higher rates of lymph node metastasis (Gewinner et al., 2009; Salmena et al., 2015).


In contrast to its reported role as a tumor suppressor, accumulating evidence suggests that high levels of INPP4B are involved in the development or progression of some cancers. For instance, INPP4B can promote the growth and proliferation of ER+ breast cancer cells through

SGK3, a family member of serine/threonine kinases closely related to AKT (Gasser et al., 2014).

In addition, two independent studies in AML reported that INPP4B overexpression is associated with poor clinical outcome and chemotherapy resistance (Dzneladze et al., 2015; Rijal et al.,

2016). In colon cancer cells, INPP4B overexpression was shown to be associated with increased proliferation, anchorage independent growth and xenograft growth (Guo et al., 2015).

These paradoxical functions of INPP4B in different cancer settings suggest that the effect of INPP4B in cancer is cell type- or context-dependent. Indeed, it has been shown that INPP4B may exert differential effects in subtypes of the same cancer type. For example, one study reported tumor suppressive effects for INPP4B in melanoma, with low INPP4B protein expression associated with tumor progression, while another study reported a SGK3-dependent oncogenic role for INPP4B in a subset of the same disease (Chi et al., 2015; Perez-Lorenzo et al., 2014). Moreover, in a PTEN-deficient background, INPP4B may compensate for PTEN loss by dephosphorylating PTEN substrates such as PI(3,4,5)P3 in thyroid malignancy (Kofuji et al.,

2015). These observations point to the importance of ascertaining the genetic and molecular background of the host disease to predict contextual consequence of INPP4B activity.

Though we now have a better understanding of what role INPP4B plays in different cancers, much remains unclear regarding the relationship between INPP4B gene expression and patient outcome across cancers, as well as how INPP4B expression is regulated. Furthermore, in my previous study of INPP4B in AML, I found that INPP4B is highly expressed and associated with adverse outcome in only 25% of AML patients (Dzneladze et al., 2015). This finding demonstrates the value of a non-median dichotomization approach which recognizes the existence of small, significantly distinct subgroups within a population. To address the issues described above, I developed a subgroup identification tool (Subgroup Identifier; SubID) and used it to conduct a pan-cancer analysis of INPP4B. Specifically, I examined the prognostic value of

INPP4B expression across cancers, and identified a list of potential regulators of INPP4B expression. Furthermore, within prognostically relevant subpopulations defined by INPP4B expression, I applied pathway enrichment analysis to gain insight into the respective roles played by INPP4B.

3.3. MATERIALS AND METHODS

3.3.1. Gene Expression and Clinical Data

Genome-wide expression data and respective clinical information was downloaded from

The Cancer Genome Atlas (TCGA; https://tcga-data.nci.nih.gov/tcga/) for datasets with n>100 patients. RNAseq data were used where available (23/25 datasets); microarray data were used otherwise (GBM and OV datasets). Normalized microarray data for the Verhaak (GSE6891) and Chen (GSE57495) datasets were downloaded from the GEO database (www.ncbi.nlm.nih.gov/geo/) (Verhaak et al., 2009). ICGC pancreatic cancer data were downloaded from the ICGC database (http://icgc.org/).


3.3.2. Subgroup Identifier (SubID)

SubID was developed and used in R (https://www.r-project.org/). The input for SubID is a spreadsheet which includes a continuous variable (e.g. gene expression) and the data necessary for calculating the optimization parameter used for subgroup identification (Figure 3.1). In this study, optimization parameters included Fisher’s exact test P value (grouping data required) and

CoxPH P value (OS time and status required), however other tests can be used as needed. Once input data is loaded, SubID sorts the population in order from low to high value of the continuous variable. Next, a loop is initiated which calculates the specified test (optimization parameter) comparing the low and high continuous variable value groups dichotomized at every percentage point from 1 to 99%. The results from this loop are saved into an output file, and plotted as a percentage cut-off (1 to 99%) vs optimization parameter test result plot (SubID plot).

This plot visualizes the relationship between the continuous variable and the optimization parameter across the population. A subset of the patients (70% of population) is then sampled

100 times without replacement to assess and visualize the effect of outlier patients (resampled

SubID plot). Mean absolute deviation of greater than two is calculated and visualized on the resampled SubID plot to examine association with expression distribution. Furthermore, resampling provides an indication as to how many recurring dichotomization points are present in the population, which corresponds to the number of distinct subgroups. The data were also permuted 1000 times to calculate the false discovery rate (FDR) of the optimization parameter test result. FDR was calculated both at the optimal cut-off point (FDRA.C.) and over the desired cut-off range (FDRrange). FDRA.C. reflects the likelihood that the P value at the cut-off is better than one observed by chance, while FDRrange reflects the likelihood that any significant cut-off within the desired range is not due to chance. Significant dichotomization points were applied to the population and visualized using appropriate plots. Though primarily designed to study a single continuous variable, SubID was also modified for a high-throughput application to identify continuous variables matching specified criteria. For this application, the SubID core process was repeated for every continuous variable in the dataset. This process generated a continuous variable vs percentage cut-off optimization parameter result table. The output data can then be filtered based on specified criteria as required.
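The core loop described above (sort by the continuous variable, dichotomize at every percentage point, and record the optimization parameter) can be sketched as follows. This is an illustrative Python sketch and not the SubID R code: the function names, the rounding rule used to turn a percentage into a group size, and the toy data are all assumptions, and Fisher's exact test is implemented directly from the hypergeometric distribution.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lowest, highest + 1)
            if prob(x) <= observed * (1 + 1e-9))
    return min(1.0, p)


def subid_scan(values, groups):
    """Return {percentage cut-off: P value} for cut-offs from 1 to 99%.

    values: continuous variable (e.g. expression); groups: 0/1 labels.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [groups[i] for i in order]
    n = len(labels)
    results = {}
    for pct in range(1, 100):
        k = round(n * pct / 100)  # size of the "low" group (assumed rounding rule)
        if k == 0 or k == n:
            continue
        low, high = labels[:k], labels[k:]
        results[pct] = fisher_exact_p(sum(low), k - sum(low),
                                      sum(high), n - k - sum(high))
    return results


# Hypothetical data: 100 samples in which the top 25% by value carry label 1.
values = list(range(100))
groups = [1 if v >= 75 else 0 for v in values]

scan = subid_scan(values, groups)
best_cut = min(scan, key=scan.get)
print(best_cut, scan[best_cut])
```

Plotting the cut-off against the negative log of the P value would give the kind of curve the SubID plot is described as showing; in this toy case the curve has a single sharp optimum at the 75% cut-off rather than at the median.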


Figure 3.1. SubID pipeline. Input containing continuous grouping variable levels and data necessary for calculating an optimization parameter is loaded into SubID (R script), and the population is sorted based on levels of the continuous variable. Next, the population is dichotomized at every percentage from 1 to 99%, and the optimization parameter test is calculated at each cut-off. Permutation and resampling analysis is used to assess FDR. SubID pipeline standard output includes (1) SubID plot, (2) Boxplot depicting levels of continuous variable with outlier status (3) Resampled SubID plot and (4) output plot appropriate for the optimization parameter.
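The permutation step in the pipeline can be illustrated with a short Python sketch. The exact way SubID computes FDRA.C. and FDRrange is not spelled out here, so the code shows one plausible reading: labels are shuffled, the scan is repeated, and the fraction of permutations that match or beat the observed P value is recorded at the optimal cut-off and across the 10 to 90% range. All function names, the seed, the number of permutations, and the toy data are hypothetical.

```python
import random
from math import comb


def fisher_p(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    probs = [comb(row1, x) * comb(row2, col1 - x) / total
             for x in range(max(0, col1 - row2), min(row1, col1) + 1)]
    observed = comb(row1, a) * comb(row2, col1 - a) / total
    return min(1.0, sum(p for p in probs if p <= observed * (1 + 1e-9)))


def scan(values, labels, lo=10, hi=90):
    """P value at every percentage cut-off between lo and hi."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    lab = [labels[i] for i in order]
    n = len(lab)
    out = {}
    for pct in range(lo, hi + 1):
        k = round(n * pct / 100)
        if 0 < k < n:
            out[pct] = fisher_p(sum(lab[:k]), k - sum(lab[:k]),
                                sum(lab[k:]), n - k - sum(lab[k:]))
    return out


def permutation_rates(values, labels, n_perm=200, seed=0):
    """Empirical chance rates at the optimal cut-off and over the whole range."""
    rng = random.Random(seed)
    observed = scan(values, labels)
    best = min(observed, key=observed.get)
    p_obs = observed[best]
    hits_cut = hits_range = 0
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between value and label
        perm = scan(values, shuffled)
        hits_cut += perm[best] <= p_obs
        hits_range += min(perm.values()) <= p_obs
    return best, hits_cut / n_perm, hits_range / n_perm


# Hypothetical data: 60 samples, label 1 for the top 25% by value.
values = list(range(60))
labels = [1 if v >= 45 else 0 for v in values]
best_cut, rate_at_cut, rate_in_range = permutation_rates(values, labels)
print(best_cut, rate_at_cut, rate_in_range)
```

The range-wide rate can never be smaller than the rate at the single optimal cut-off, because scanning many cut-offs gives chance more opportunities to produce a small P value. That is the reason a range-level correction matters when the cut-off itself is chosen by optimization.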


3.3.3. Expression-Based Subgrouping (SubID with CoxPH Test)

To examine the prognostic value of INPP4B expression across cancers, INPP4B expression and survival data for each cancer were loaded into SubID. Following sorting based on expression, a CoxPH P value was calculated for every dichotomization point from 1 to 99%, comparing survival in the INPP4Blow vs INPP4Bhigh subgroups. However, to avoid extremely small subgroups, which would interfere with meaningful subsequent analysis, only optimal cut-offs from 10 to 90% were selected. Following resampling and permutation, the significant cut-offs were applied to the population and the survival data were visualized using a Kaplan-Meier survival plot. As indicated by the resampling plot, the population was dichotomized into either two (INPP4Blow and INPP4Bhigh) or three (INPP4Blow, INPP4Binter and INPP4Bhigh) subgroups as appropriate (dependent on the number of recurring cut-off clusters). A hazard ratio (HR) relative to INPP4Blow expression was calculated.
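The survival-based scan described above can be sketched in a few lines of Python. Note the substitution: to stay dependency-free, this sketch uses a two-group log-rank test in place of the CoxPH P value used by SubID; the two are closely related for a single binary covariate but are not identical. The function names and the toy cohort are hypothetical.

```python
import math


def logrank_p(times, events, groups):
    """Two-group log-rank P value (chi-square approximation, 1 df)."""
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for x in times if x >= t)                      # at risk
        n1 = sum(1 for x, g in zip(times, groups) if x >= t and g == 1)
        d = sum(1 for x, e in zip(times, events) if x == t and e)
        d1 = sum(1 for x, e, g in zip(times, events, groups)
                 if x == t and e and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / variance
    return math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df


def best_cutoff(expression, times, events, lo=10, hi=90):
    """Return (percentage cut-off, P) minimizing P within lo..hi percent."""
    n = len(expression)
    order = sorted(range(n), key=lambda i: expression[i])
    best = None
    for pct in range(lo, hi + 1):
        n_low = n * pct // 100
        if n_low == 0 or n_low == n:
            continue
        groups = [0] * n
        for i in order[n_low:]:
            groups[i] = 1                  # 1 = high-expression subgroup
        p = logrank_p(times, events, groups)
        if best is None or p < best[1]:
            best = (pct, p)
    return best


# Toy cohort (hypothetical): the top 25% of expressers die early.
expression = list(range(1, 21))
times = [10 + e if e <= 15 else e - 15 for e in expression]
events = [1] * 20
best_pct, best_p = best_cutoff(expression, times, events)
```

In this toy cohort the scan recovers the 75% cut-off that isolates the short-surviving, high-expression quarter of patients. Restricting the search to 10-90% mirrors the constraint stated above.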

3.3.4. Co-Expression Analysis (SubID with Fisher’s Exact Test)

To identify genes co-expressed with INPP4B, whole genome expression data and INPP4B expression status (INPP4Blow indicated with 0, and INPP4Bhigh indicated with 1) were loaded into SubID. A high-throughput version of SubID was then applied: the single-gene SubID pipeline was applied to every gene, calculating the Fisher’s exact test P value for every percentage cut-off (genelow vs genehigh, tested for association with INPP4Blow vs INPP4Bhigh). An output table of genes vs Fisher’s exact test P values across cut-offs was generated. To establish an INPP4Bhigh signature in AML (optimal INPP4B expression cut-off at 75%), genes with optimal cut-offs at 75±15% were sorted based on their optimal P value. DNA binding transcription factors were selected from that list as potential regulators of INPP4B expression. The INPP4Bhigh signature was visualized using the Gene Set Enrichment Analysis (GSEA) tool from the Broad Institute (http://software.broadinstitute.org/gsea/index.jsp).

3.3.5. Survival, Univariate and Multivariate Analysis

Kaplan-Meier survival analysis was used to compare survival between patients dichotomized based on INPP4B expression status at the optimal cut-off(s) as described above. A CoxPH model was used to calculate the associated HR and respective 95% confidence interval (CI). Univariate analysis was used to identify any association between INPP4B expression status and other clinical features. Fisher’s exact test was used for categorical data, and Wilcoxon rank-sum tests for continuous data. A CoxPH model was used to calculate the OS rates. For multivariate analysis, a preliminary model was constructed using covariates with a univariate P<0.05. Backward selection (based on the Akaike information criterion; AIC) was used to remove covariates and establish the optimal multivariate model for OS.
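The AIC-driven covariate removal can be illustrated with a generic greedy backward-elimination sketch in Python. The model fitting itself is deliberately abstracted away: `aic` is any callable returning the AIC of a model fitted with the given covariates, and the `toy_aic` function, its log-likelihood gains, and the covariate names are invented purely for illustration, not taken from the thesis data.

```python
def backward_select(covariates, aic):
    """Greedy backward elimination on AIC.

    Repeatedly drops the single covariate whose removal lowers AIC the
    most, stopping when no removal improves on the current model.
    """
    current = list(covariates)
    best_aic = aic(current)
    while current:
        candidates = [(aic([c for c in current if c != drop]), drop)
                      for drop in current]
        cand_aic, drop = min(candidates)
        if cand_aic >= best_aic:
            break
        current.remove(drop)
        best_aic = cand_aic
    return current, best_aic


# Hypothetical log-likelihood gain contributed by each covariate.
GAINS = {"age": 6.0, "cytogenetic_risk": 9.0, "wbc": 0.4, "sex": 0.1}


def toy_aic(covs):
    """AIC = 2k - 2 logL, with a made-up additive log-likelihood."""
    log_lik = -100.0 + sum(GAINS[c] for c in covs)
    return 2 * len(covs) - 2 * log_lik


selected, final_aic = backward_select(list(GAINS), toy_aic)
```

In the toy example, covariates whose log-likelihood gain is below 1 cost more in the AIC penalty than they contribute, so `wbc` and `sex` are dropped while `age` and `cytogenetic_risk` are retained.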

3.3.6. Cell Culture

EVI1high OCI/AML-4, OCI/AML-6, and UCSD/AML-1 cell lines were cultured in alpha minimal essential medium (AMEM) supplemented with 10% fetal bovine serum, 100 units/ml penicillin, 100 µg/ml streptomycin, and 10% 5637 conditioned media at 37 °C and 5% CO2.

3.3.7. Lentiviral Infection

Control pLKO.1 vector, and five pLKO.1 constructs with anti-EVI1 shRNA were a gift from Dr. Jason Moffat (Toronto, ON, Canada). Control and shRNA construct lentiviral particle production and infection were conducted as previously described (Dzneladze et al., 2015). Briefly, pLKO.1 and packaging constructs were co-transfected into 293T cells using calcium phosphate. Viral particle-rich supernatant was collected at 48 and 72h following transfection and concentrated with Lenti-X concentrator (Clontech, Mountain View, CA, USA). Next, OCI/AML-4 and UCSD/AML-1 cells were infected with lentivirus for 24h with 8 µg/ml protamine sulfate and enriched by puromycin selection.

3.3.8. Immunoblotting and Chromatin Immunoprecipitation

Western blotting was performed with anti-EVI1 (#2593), INPP4B (#8450) and GAPDH (#2118) antibodies from Cell Signaling Technology (CST, Danvers, MA, USA). Chromatin immunoprecipitation was done with anti-EVI1 (#2593), IgG (#7076) and H3 (#9715) antibodies from CST with the EZ-ChIP kit (#17-371) from Millipore (Etobicoke, ON, Canada) as directed. EVI1 binding enrichment at the +1379bp and -1000bp (negative control) sites was examined using quantitative PCR as previously described (Dzneladze et al., 2015). The following primers were used: EVI1 +1379bp 5’-TACCTTTGAACGGCTCCATC-3’ (sense) and 5’-CCCACTTCCTAGCCCCTAAC-3’ (antisense); EVI1 -1000bp 5’-TACTGGAAAACCCGGTAGG-3’ (sense) and 5’-CTGACAGGAAGGAGATATGCAA-3’ (antisense).

3.4. RESULTS

3.4.1. SubID Development and Testing in AML

In order to demonstrate its functionality, SubID was used to characterize the prognostic significance and subgroup specificity of HGF, a gene whose overexpression has been shown to be associated with the favorable outcome t(15;17) cytogenetic abnormality (Valk et al., 2004). SubID analysis of the Verhaak dataset identified a prognostically significant (P=0.00256) HGFhigh subgroup, which could have been missed by median dichotomization due to the borderline CoxPH P value of 0.0471 (Figure 3.2A). Expression distribution analysis of HGF revealed a right-skewed distribution similar to that of INPP4B, with the majority of patients having relatively low expression, and a subset (25% as identified by outlier analysis) having expression greater than 2 standard deviations from the median (Figure 3.2B; Figure 3.3A,B). Resampling analysis with outlier status revealed that a single optimal cut-off point (a significant local minimum, indicative of two prognostically distinct subgroups) occurred at 70% (Figure 3.2C). Thus, as with INPP4B, HGF expression defines a prognostically distinct high-expression subgroup; in the case of HGF, high expression is associated with significantly longer patient survival (Figure 3.2D). Furthermore, dichotomization of patients based on the SubID optimal cut-off rather than the median more clearly identifies the prognostically distinct HGFhigh subgroup, and its associated prognostic significance (Figure 3.2D; Figure 3.4A).

Figure 3.2. SubID testing in AML – HGF example. Standard CoxPH SubID output consisting of (A) SubID plot, (B) HGF expression boxplot with expression distribution outlier status, and (C) resampled SubID plot with median (blue) and outlier cut-off values (based on expression distribution; red) demonstrates that the SubID HGF expression optimal cut-off of 70% overlaps with the outlier cut-off in the Verhaak dataset. (D) Kaplan-Meier survival for Verhaak dataset patients reveals a significant difference in survival at the optimal cut-off. Standard CoxPH SubID output (E,F,G) for HGF in cytogenetically abnormal Verhaak dataset patients reveals two optimal cut-offs (28% and 59%) corresponding to three prognostically distinct subgroups. (H) Kaplan-Meier survival for the cytogenetically abnormal (CA) Verhaak dataset demonstrates survival differences across the three prognostic subgroups based on HGF expression. Standard Fisher’s exact SubID output consisting of a (I) SubID plot and (J) resampling plot reveals a strong association between HGF expression and presence of the t(15;17) translocation in cytogenetically abnormal Verhaak dataset patients. (K) HGF expression (blue: low expression, red: high expression) and t(15;17) status (translocation present: black) visualizes the strong association between HGF expression and presence of the t(15;17) translocation, as well as the association with favorable and adverse cytogenetic risk.

Figure 3.3. HGF expression distribution in AML. HGF expression across patients of the (A,B) Verhaak and (C,D) cytogenetically abnormal (CA) Verhaak datasets as visualized by a (A,C) bar plot and a (B,D) density plot.

Figure 3.4. HGF prognostic significance based on median dichotomization. (A) Verhaak and (B) cytogenetically abnormal (CA) Verhaak dataset patient survival based on median dichotomization of HGF expression.

Next, I conducted SubID analysis exclusively on the cytogenetically abnormal (CA) patient subpopulation of the Verhaak dataset. SubID analysis revealed an optimal cut-off point at 28% (P=7.82E-05; median cut-off P=4.59E-04) with a possible second cut-off point at 59% (Figure 3.2E). Outlier analysis revealed overexpression of HGF in 16% of the patients (Figure 3.2F; Figure 3.3B,C), which did not overlap with the SubID optimal cut-off (Figure 3.2G). Data resampling with loess fitting identified two prognostically significant local minima, indicative of three prognostically distinct subgroups. Kaplan-Meier survival analysis with the 28% and 59% cut-offs verified the presence of three prognostically distinct subgroups defined by HGF expression, which, once again, would not have been detected by simple median dichotomization (Figure 3.2H; Figure 3.4B). Next, Fisher’s exact SubID analysis was used to examine the association between HGF expression and presence of a t(15;17) cytogenetic abnormality. Both SubID and resampling SubID plots verified optimal association at 92% at P=6.01E-19 (Figure 3.2I,J). Visualization of the association using a heatmap demonstrates the strong association between high HGF expression and the t(15;17) cytogenetic abnormality (Figure 3.2K). Furthermore, as visualized by the heatmap, enrichment analysis using Fisher’s exact SubID revealed that the HGFlow subgroup is enriched for adverse risk cytogenetic abnormalities associated with shorter survival, while the HGFhigh subgroup is enriched for favorable risk cytogenetic abnormalities associated with longer patient survival (Figure 3.2K).
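The step in which two local minima in the smoothed cut-off curve indicate three subgroups can be sketched as below. Several things here are assumptions made for illustration: a centered moving average stands in for loess fitting, the curve is synthetic (two dips placed at 28% and 59% to mimic the HGF example, not the actual Verhaak data), and the function names and the 0.5 threshold are hypothetical.

```python
import math


def moving_average(y, window=5):
    """Centered moving average; a simple stand-in for loess smoothing."""
    half = window // 2
    out = []
    for i in range(len(y)):
        segment = y[max(0, i - half): i + half + 1]
        out.append(sum(segment) / len(segment))
    return out


def significant_local_minima(cutoffs, curve, threshold):
    """Cut-offs where the curve is a strict local minimum below threshold."""
    return [cutoffs[i] for i in range(1, len(curve) - 1)
            if curve[i] < curve[i - 1]
            and curve[i] < curve[i + 1]
            and curve[i] < threshold]


# Synthetic optimization-parameter curve over the 10-90% cut-off range.
cutoffs = list(range(10, 91))
raw = [1
       - 0.9 * math.exp(-((c - 28) / 6) ** 2)
       - 0.8 * math.exp(-((c - 59) / 6) ** 2)
       for c in cutoffs]
smoothed = moving_average(raw)
minima = significant_local_minima(cutoffs, smoothed, threshold=0.5)

# k local minima split the population into k + 1 subgroups.
n_subgroups = len(minima) + 1
```

With one significant minimum the population is dichotomized into low and high subgroups; with two, as here, it is split into low, intermediate, and high subgroups.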

Chapter 2 work identified that INPP4B overexpression (observed in 25% of AML patients) is associated with poor patient outcome. To further demonstrate the functionality of SubID, the prognostic significance of INPP4B in the Verhaak, TCGA and OCI/PM AML datasets was re-examined using SubID. The optimal cut-off for INPP4B was found to be 87% in the Verhaak dataset (P=1.02E-06, median cut-off P=9.77E-04), 76% in the TCGA dataset (P=0.0224, median cut-off P=0.359), and 75% in the OCI/PM dataset (P=6.48E-06, median cut-off P=0.0373) (Figure 3.5A,E,I). As previously reported, gene expression analysis revealed that INPP4B is overexpressed in 21-30% of AML patients (Figure 3.5B,F,J; Figure 3.6). Resampling analysis demonstrates that the optimal INPP4B cut-off point coincides with the INPP4Bhigh expression status (Figure 3.5C,G,K).

Overall, SubID analysis validates my Chapter 2 findings that patients dichotomized based on INPP4B expression status into INPP4Blow and INPP4Bhigh subgroups exhibit significantly different survival (Figure 3.5D,H,L). Specifically, INPP4Bhigh AML is associated with significantly shorter patient survival compared to INPP4Blow patients. Furthermore, survival analysis reveals superior risk stratification of the AML patients based on SubID subgrouping into INPP4Bhigh and INPP4Blow groups when compared to median dichotomization (Figure 3.5D,H,L; Figure 3.7A,B,C). Overall, these findings demonstrate SubID’s ability to identify and isolate prognostically distinct patients from the general population, based on expression of a prognostically informative gene, more accurately than median dichotomization.

Figure 3.5. SubID application to INPP4B across AML datasets. Standard CoxPH SubID output consisting of a (A,E,I) SubID plot, (B,F,J) INPP4B expression boxplot with expression distribution outlier status, and (C,G,K) resampled SubID plot with median (blue) and outlier cut-off values (based on expression distribution; red) for the Verhaak, TCGA and OCI/PM AML datasets, respectively. (D,H,L) Kaplan-Meier plot of patients dichotomized at the SubID optimal cut-off demonstrates a significant survival difference between INPP4Blow and INPP4Bhigh subgroups.

Figure 3.6. INPP4B expression distribution in AML. INPP4B expression across patients of the (A,B) Verhaak, (C,D) TCGA, and (E,F) OCI/PM AML datasets as visualized by a (A,C,E) bar plot and a (B,D,F) density plot.

Figure 3.7. INPP4B prognostic significance in AML based on median dichotomization. (A) Verhaak, (B) TCGA, and (C) OCI/PM AML dataset patient survival based on median dichotomization of INPP4B expression.

3.4.2. SubID Identifies INPP4Bhigh AML Signature and Potential Transcriptional Regulators

Following development and testing, SubID was used to build upon my previous studies of INPP4B in AML (Dzneladze et al., 2015). Specifically, having previously established that INPP4B is overexpressed in 25% of AML patients, I now wanted to identify transcriptional regulators of INPP4B expression in AML which may cause its overexpression. High-throughput Fisher’s exact SubID analysis of the Verhaak AML dataset was used to identify an INPP4Bhigh gene expression signature consisting of the top 500 genes (FDR=0.01; top 100 gene expression signature FDR<1E-05) (Figure 3.8A,C,E). The non-median dichotomization approach used in SubID allowed for selection of INPP4B co-expressed genes whose optimal association with INPP4Bhigh matches INPP4B’s 75% cut-off, as described in the materials and methods section. The INPP4Bhigh signature was next applied to the TCGA LAML dataset to demonstrate its cross-validity (FDR=0.023; top 100 gene expression signature FDR=4.17E-04) (Figure 3.8B,D,F).

Next, to identify potential regulators of INPP4B expression, DNA binding transcription factors from the INPP4Bhigh signature were selected. The transcription factors with the most significant co-expression with INPP4B were TCF4 (73% cut-off, P=4.5E-13), NFATC2 (70% cut-off, P=1.1E-11), KLF12 (70% cut-off, P=1.4E-09), GATA3 (85% cut-off, P=4.9E-09), STAT4 (89% cut-off, P=6.6E-08) and EVI1 (89% cut-off, P=3.1E-08) (Figure 3.9, Figure 3.10). Interestingly, both INPP4B and EVI1 were shown to be highly expressed as members of a leukemic stem cell signature identified by Eppert et al. (2011).

Figure 3.8. INPP4Bhigh signature in AML. (A,B) Heatmap and (C-F) GSEA enrichment map of the SubID-derived INPP4Bhigh signature applied to the (A,C,E) Verhaak and (B,D,F) TCGA AML datasets.

Figure 3.9. Transcriptional regulators of INPP4B (Verhaak dataset). (A-F) Resampled Fisher’s exact SubID plot for the top six DNA-binding transcription factors co-expressed with INPP4B in the Verhaak dataset. (G) Color and status (based on Verhaak dataset-derived maximal co-expression analysis) heatmaps of the top potential transcriptional regulators of INPP4B expression in AML (Verhaak dataset).

Figure 3.10. Potential transcriptional regulators of INPP4B (TCGA LAML dataset). Color and status (based on Verhaak dataset-derived maximal co-expression analysis) heatmaps of the top potential transcriptional regulators of INPP4B expression in AML (TCGA AML dataset).

Due to its role in leukemogenesis and strong association with INPP4B expression in both the Verhaak and TCGA LAML datasets (Figure 3.11A,B), I focused on EVI1 as a possible regulator of INPP4B expression for in vitro validation. First, EVI1 expression was knocked down using five different anti-EVI1 shRNAs in EVI1high OCI/AML-4 and UCSD/AML-1 cells (Figure 3.11C). Knockdown of EVI1 resulted in downregulation of INPP4B in both cell lines. Next, binding of EVI1 to the INPP4B promoter was confirmed using chromatin immunoprecipitation (ChIP) with anti-EVI1 antibodies at the predicted +1379bp binding site. EVI1 ChIP in both EVI1high OCI/AML-4 and OCI/AML-6 cells confirmed enrichment of EVI1 at the predicted binding site (Figure 3.11D,E).

Figure 3.11. EVI1 regulates INPP4B expression in AML. INPP4B expression is higher in EVI1high AML patients in both the (A) TCGA-LAML and (B) Verhaak datasets. (C) shRNA mediated knockdown of EVI1 in OCI/AML-4 and UCSD/AML-1 cells results in decreased levels of INPP4B. (D) A predicted EVI1 binding site occurs 1379bp downstream of the INPP4B transcription start site. (E) Anti-EVI1 ChIP in EVI1high OCI/AML-4 and OCI/AML-6 cells demonstrated enrichment at the +1379bp EVI1 binding site of the INPP4B promoter region.

3.4.3. Cut-off Optimization Offers Improved Identification of Prognostic Significance

Though there is accumulating evidence demonstrating the prognostic value of INPP4B protein overexpression or loss in specific cancers, the prognostic value of INPP4B gene expression across many cancers remained unknown. As gene expression-based biomarkers offer a rapid and cost-effective method for risk stratification, I assessed whether INPP4B gene expression is associated with patient outcome across cancers, which may allow for its use as a survival biomarker. However, as demonstrated by my previous study in AML, INPP4B overexpression can occur in small patient subgroups, which affects the strategy for assessing prognostic significance (Dzneladze et al., 2015). Thus, to accurately characterize the pan-cancer prognostic significance of INPP4B, I developed SubID and applied it to INPP4B gene expression and clinical data from 25 cancer datasets of TCGA (Table 3.1, Table 3.2). The gene expression datasets ranged in size from 119 to 1083 patients, and encompassed a comprehensive assortment of cancer types. As described in the materials and methods section, SubID was used to calculate the CoxPH P value for every percentile cut-off point; however, only cut-offs from 10% to 90% were considered. The minimum CoxPH P value yielded the maximal difference in survival between the INPP4Blow and INPP4Bhigh groups, and thus constituted the optimal cut-off point for population dichotomization in the examined datasets. Both FDR at the optimal cut-off point (FDRA.C.), and FDR of the 10-90% range of permuted data (FDR10-90%) were calculated.

Table 3.1. Cancers with prognostically significant INPP4B expression status

Table 3.2. Datasets with no significant cut-off within the 10 to 90% range

Overall, SubID identified INPP4B expression to be prognostically significant in 13 datasets (P<0.05, FDRA.C.<0.05), with significant cut-offs ranging from 10% to 86%. In comparison, examination of INPP4B prognostic significance using a median cut-off in these datasets revealed prognostic significance in only four datasets. In all 16 datasets, the SubID optimized cut-off was associated with a lower P value and greater HR than identified by the median cut-off (the directionality of HR was the same in both median and optimal cut-offs for all cancers). Thus, consistent with its intended function, these results demonstrate that SubID is able to optimally segregate at-risk patients based on gene expression.

In order to minimize the likelihood of false positives, each dataset was permuted, and the likelihood of receiving an optimal cut-off P value (anywhere from 10 to 90%) as good as the actual observed P value was used to calculate FDR10-90%. In total, only four datasets had a significant (<0.05) FDR10-90%. Of these, three were also significant at the median cut-off. It is important to note that AML, a context where INPP4B has been previously shown to be prognostically significant, did not pass the FDR10-90% significance threshold.
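The permutation logic behind a range-wide measure such as FDR10-90% can be sketched as follows. This is an illustration under explicit assumptions, not the SubID computation: a binary outcome with Fisher's exact test replaces survival data with CoxPH, the `(hits + 1) / (n_perm + 1)` estimator is a common convention chosen here rather than taken from the thesis, and all names and the toy data are hypothetical.

```python
import random
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    denom = comb(r1 + r2, c1)
    probs = [comb(r1, x) * comb(r2, c1 - x) / denom
             for x in range(max(0, c1 - r2), min(r1, c1) + 1)]
    p_obs = comb(r1, a) * comb(r2, c1 - a) / denom
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))


def min_p_over_range(values, outcome, lo=10, hi=90):
    """Smallest P value over all cut-offs from lo to hi percent."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    n_pos = sum(outcome)
    best = 1.0
    for pct in range(lo, hi + 1):
        n_low = n * pct // 100
        if n_low == 0 or n_low == n:
            continue
        a = sum(outcome[i] for i in order[n_low:])   # high group, outcome 1
        b = (n - n_low) - a
        c = n_pos - a
        d = n_low - c
        best = min(best, fisher_exact_p(a, b, c, d))
    return best


def permutation_fraction(values, outcome, n_perm=200, seed=0):
    """Fraction of permuted datasets whose best P is as good as observed."""
    rng = random.Random(seed)
    observed = min_p_over_range(values, outcome)
    shuffled = list(outcome)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)                        # break the association
        if min_p_over_range(values, shuffled) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): the top 25% of values all carry outcome 1.
values = list(range(1, 21))
outcome = [0] * 15 + [1] * 5
observed, fraction = permutation_fraction(values, outcome)
```

Because the best P value is re-optimized over the whole 10-90% range in every permutation, the resulting fraction accounts for the multiple cut-offs tested, which is why it is stricter than a P value taken at a single pre-selected cut-off.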

3.4.4. The Relationship Between INPP4B and Patient Survival is Context Dependent

As previously stated, my SubID-mediated pan-cancer examination of INPP4B prognostic value revealed that INPP4B expression status was associated with patient survival in 13 cancers (P<0.05, FDRA.C.<0.05) (Table 3.1). Only four of these cancers have been previously described in the literature: bladder cancer (Hsu et al., 2014), lung cancer (Zhang et al., 2016), melanoma (Chi et al., 2015; Perez-Lorenzo et al., 2014), and AML (Dzneladze et al., 2015; Rijal et al., 2016). Consistent with publications, INPP4B association with survival was in line with a tumor suppressive function in bladder cancer, and a tumor promoting role in melanoma and AML.

However, my findings in lung cancer contradict those described in the literature. Specifically, while Zhang et al. demonstrated that INPP4B suppresses proliferation, colony formation potential and anchorage-independent growth in lung cancer (Zhang et al., 2016), my clinical findings indicate that high levels of INPP4B expression are associated with poor patient outcome. It is important to note, however, that of these previously described cancers, only bladder cancer is among the four cancers which met the subsequent FDR10-90%<0.05 cut-off, and which are hence high confidence hits (Figure 3.12). Thus, my findings confirm the known tumor-suppressive role of INPP4B in bladder cancer, and uncover three new contexts where INPP4B expression status is prognostically significant. Specifically, INPP4Blow was shown to be associated with shorter survival (consistent with tumor suppressive function) in the TCGA kidney clear cell (Figure 3.12A-D; Figure 3.13A,B; Table 3.3), bladder urothelial (Figure 3.12E-H; Figure 3.13C,D), and liver hepatocellular (Figure 3.12I-L; Figure 3.13E,F; Table 3.4) carcinoma datasets. In liver hepatocellular carcinoma, INPP4B expression defined three distinct prognostic subgroups (INPP4Blow, INPP4Binter, and INPP4Bhigh).

Conversely, INPP4Blow was associated with longer patient survival in pancreatic adenocarcinoma (Figure 3.15A-D; Figure 3.16A,B; Table 3.5). Similar to AML, INPP4Bhigh’s association with poor patient outcome suggests that INPP4B acts as an oncogene, rather than a tumor suppressor, in the context of pancreatic adenocarcinoma. Taking advantage of independent pancreatic cancer gene expression datasets with survival data, I performed cross-validation of the findings observed in TCGA. Similar to TCGA, INPP4Bhigh was shown to be associated with shorter overall survival both in the Chen (Figure 3.15E-H; Figure 3.16C,D) and ICGC (Figure 3.15I-L; Figure 3.16E,F) datasets. Furthermore, the SubID-identified optimal cut-off is applicable to all three datasets, with optimal cut-offs ranging from 35% to 53%.

Overall, SubID analysis identified three previously unreported contexts where INPP4B expression status is prognostically significant. Additionally, INPP4B prognostic significance was validated in bladder urothelial carcinoma. While INPP4Blow was found to be associated with shorter patient survival in kidney clear cell, liver hepatocellular, and bladder urothelial carcinomas, INPP4Blow was associated with longer survival in pancreatic adenocarcinoma. Furthermore, the prognostic significance of INPP4B in pancreatic adenocarcinoma was validated in three independent datasets. Survival analysis in all six datasets once again demonstrated SubID’s superior ability, in comparison to median dichotomization, to identify and isolate prognostically distinct patients from the general population based on expression of a prognostically informative gene (Figure 3.12D,H,L, Figure 3.14, Figure 3.15D,H,L, Figure 3.17).

Figure 3.12. Pan-cancer INPP4B prognostic significance. Standard CoxPH SubID output consisting of (A,E,I) SubID plot, (B,F,J) INPP4B expression boxplot with expression distribution outlier status, and (C,G,K) resampled SubID plot with median (blue) and outlier cut-off values (based on expression distribution; red) for (A-D) kidney clear cell carcinoma, (E-H) bladder urothelial carcinoma, and (I-L) liver hepatocellular carcinoma reveals a prognostically significant association between INPP4B expression status and patient survival. Specifically, INPP4Blow is associated with shorter overall survival in (D) kidney, (H) bladder and (L) liver cancers. Furthermore, INPP4B expression status defines three prognostically distinct subgroups within liver cancer.

Figure 3.13. Pan-cancer INPP4B expression distribution. INPP4B expression across patients of the TCGA (A,B) kidney clear cell carcinoma, (C,D) bladder urothelial carcinoma, and (E,F) liver hepatocellular carcinoma datasets visualized by a (A,C,E) bar plot and a (B,D,F) density plot.

Figure 3.14. Pan-cancer INPP4B prognostic significance based on median dichotomization. TCGA (A) kidney clear cell carcinoma, (B) bladder urothelial carcinoma, and (C) liver hepatocellular carcinoma dataset patient survival based on median dichotomization of INPP4B expression.

Table 3.3. Clinical characteristics for KIRC

Table 3.4. Clinical characteristics for LIHC

Figure 3.15. INPP4Bhigh is associated with poor outcome in pancreatic cancer. Standard CoxPH SubID output consisting of (A,E,I) SubID plot, (B,F,J) INPP4B expression boxplot with expression distribution outlier status, and (C,G,K) resampled SubID plot with median (blue) and outlier cut-off values (based on expression distribution; red) for the (A-D) TCGA, (E-H) Chen, and (I-L) ICGC pancreatic cancer datasets reveals a prognostically significant association between INPP4B expression status and patient survival. Specifically, INPP4Bhigh is associated with shorter overall survival across all three datasets, with a recurring optimal cut-off in the 35-53% range.

Figure 3.16. INPP4B expression distribution in pancreatic adenocarcinoma. INPP4B expression across pancreatic adenocarcinoma patients of the (A,B) TCGA, (C,D) Chen, and (E,F) ICGC datasets as visualized by a (A,C,E) bar plot and a (B,D,F) density plot.

Figure 3.17. INPP4B prognostic significance in pancreatic adenocarcinoma based on median dichotomization. Pancreatic adenocarcinoma (A) TCGA, (B) Chen, and (C) ICGC dataset patient survival based on median dichotomization of INPP4B expression.

Table 3.5. Clinical characteristics for PAAD

3.5. DISCUSSION

Previous studies have demonstrated the prognostic significance of INPP4B protein loss or overexpression across several different cancer types (Woolley et al., 2015). However, the relationship between INPP4B gene expression levels and patient outcome remained poorly understood. Characterization of INPP4B’s prognostic value across cancers would provide a starting point for subsequent biological studies examining the mechanism responsible for INPP4B’s observed context-dependent nature. Thus, in this study, I systematically examined the prognostic significance of INPP4B expression status across TCGA patients from 25 different cancer types. Furthermore, to accommodate the previously described subgroup specificity of INPP4B (overexpression was shown to be confined to a small subgroup in AML (Dzneladze et al., 2015)), and to improve my pan-cancer analysis, I developed and applied a novel non-median dichotomization tool (Subgroup Identifier; SubID) which can be used to identify distinct subgroups. Specifically, SubID was developed to examine the relationship between any continuous variable (in this case gene expression) and a user-specified optimization parameter (in this case patient survival) in a way that allows for identification of subgroups within a heterogeneous patient population. SubID application to my pan-cancer study identified 13 cancers where INPP4B expression status is prognostically significant, with four of these being high confidence hits. Additionally, SubID was able to identify the presence of three prognostically distinct subgroups in liver hepatocellular carcinoma (INPP4Blow, INPP4Binter, and INPP4Bhigh), thus demonstrating its subgroup identification ability. Finally, SubID was used to further our understanding of the potential cause of deregulated INPP4B expression in AML.

Overall, SubID offers an alternative strategy which is more appropriate for accommodating heterogeneous populations than median dichotomization. As part of its optimal cut-off mapping pipeline, SubID provides the user with information that can identify the presence of all distinct subgroups based on the specified optimization parameter. In cancer, the application of my pipeline can significantly improve patient risk stratification for improved prediction of survival, which may in turn guide selection of the best course of treatment for patients. However, the underlying concept of SubID allows for input of any continuous variable and user-specified optimization parameter. Thus, while this study describes the use of gene expression as the continuous variable, and Fisher’s exact or CoxPH P values as the optimization parameter, other input data can be accommodated as well. SubID was developed in recognition that populations are not homogeneous, and that the relationship between a continuous variable and output parameter may be complex, rather than direct.

Though the majority of this manuscript was focused on INPP4B, the concept behind SubID was first demonstrated with HGF. SubID analysis validated the reported finding that AML patients with high HGF expression were significantly more likely to harbor a good prognosis t(15;17) translocation (Valk et al., 2004). However, SubID was able to go beyond this and demonstrate that HGF expression can be used to define three distinct AML subgroups (HGFlow, HGFinter and HGFhigh) within the cytogenetically abnormal AML population. The HGFhigh patients had higher frequencies of favorable cytogenetic abnormalities, such as t(15;17), and the longest survival. The SubID-mediated analysis of HGF demonstrated how visualization of percentile cut-off versus optimization parameter can be used to identify distinct subgroups within a heterogeneous population.

Following testing, SubID was used to characterize the INPP4Bhigh signature in AML. Once the signature was validated on an independent dataset, DNA-binding transcription factors were extracted from the list of INPP4B-associated genes as potential regulators of INPP4B expression. Though all six top transcription factors (TCF4, NFATC2, KLF12, GATA3, STAT4, and EVI1) were highly associated with INPP4B expression, particular interest was given to EVI1. In vitro examination provided experimental support that EVI1 is a regulator of INPP4B expression, and further validation of SubID’s ability to accurately identify subgroups and associations between a continuous variable and optimization parameter.

Next, SubID was used to characterize the pan-cancer prognostic significance of INPP4B. It is important to note that these prognostic analyses apply to the specific dataset analyzed and identify an association which requires validation in independent datasets, as well as experimental examination to explain the mechanism responsible for the observed relationship. In the cut-off optimization pipeline used by SubID, the analysis of the association between INPP4B expression and clinical outcome was divided into two stages: initial CoxPH P value mapping, followed by two separate FDR analyses (FDRA.C. and FDRrange). The FDR analyses rely on data permutation and resampling to increase validity of the identified association with clinical outcome. Of the 25 datasets examined, 13 had significant CoxPH P values, but of those, only four passed the FDRrange significance threshold. Though the goal of FDR correction is to minimize the likelihood of false positives, it is important to note that a non-significant FDRrange does not mean that INPP4B expression status is not significant in that dataset. A specific example is AML, which, while it did not achieve FDR10-90% significance, has been validated in multiple datasets in previous studies (Dzneladze et al., 2015; Rijal et al., 2016). Among the high confidence hits, however, the SubID-identified bladder urothelial carcinoma result was in agreement with a previous study reporting INPP4B as a tumor suppressor in bladder cancer (Hsu et al., 2014). Although this paper is focused on only four datasets, all 13 cancers are potential contexts where INPP4B expression status is clinically significant.

To examine the relationship between INPP4B expression status and clinical outcome, both optimal and median cut-offs were examined. While the median cut-off generally has superior power (it maximizes the size of the compared subgroups), it does so at the expense of specificity. Some patient subgroups may account for only a small proportion of the population; thus, to identify those subgroups, the population must be dichotomized so as to separate the subgroup from the rest of the population. The underlying basis of SubID relies on the concept that specific genes may be tied to a specific subgroup of patients. Thus, by identifying these genes and the expression level which characterizes the subgroup of interest, I can achieve patient subgrouping which is superior to median subgrouping. Due to my non-median dichotomization approach, I was able to identify clinically distinct subgroups accounting for <50% of the patients based on their INPP4B expression status. This approach has also allowed me to identify additional datasets where INPP4B expression status is clinically significant, which were not picked up using median dichotomization. Overall, my analysis of the TCGA database revealed that deregulated INPP4B expression status was associated with clinical outcome in 13 of 25 examined datasets, compared to five datasets identified by median dichotomization. Of the high confidence datasets (FDR10-90%<0.05), INPP4Blow was associated with shorter survival in kidney, bladder and liver hepatocellular carcinoma. Interestingly, INPP4B expression defined three prognostically significant subgroups in liver hepatocellular carcinoma, as demonstrated by two optimal local cut-offs. While INPP4B’s prognostic significance and tumor suppressor role have been previously characterized in bladder cancer, the prognostic significance in kidney and liver cancers is first reported here. Unexpectedly, INPP4B was also shown to be prognostically significant, but acting in an oncogene-like manner, in the context of pancreatic adenocarcinoma. Similar to AML, INPP4Bhigh patients were shown to have shorter overall survival compared to the INPP4Blow subgroup. This prognostic significance was cross-validated in two other independent datasets. Thus, the pan-cancer analysis once again reveals the opposing relationships between INPP4B expression and patient outcome. Consistent with the literature, this systematic analysis reveals that the relationship between INPP4B and survival is context dependent.

Overall, application of the SubID subgroup identification pipeline to study the role of INPP4B across cancers revealed the context dependent nature of INPP4B. Depending on the context, INPP4Blow can be associated with either shorter or longer patient survival. Additionally, INPP4B expression can be used to stratify patients into two or three distinct subgroups.

Furthermore, the analysis pipeline described here provides a valuable strategy for studying the relationship between gene expression and clinical outcome. Such improved stratification can be used to better predict patient outcome, and thus adjust the therapeutic regimen to improve patient survival.


CHAPTER 4.

DISCUSSION


4.1. THE UNEXPECTED ONCOGENE-LIKE ROLE OF INPP4B IN AML

The initial goal of this thesis was to investigate the role of a previously characterized tumor suppressor, INPP4B, in the context of AML. Literature at the time demonstrated that loss of INPP4B resulted in drug resistance, poor patient outcome, and a more aggressive disease phenotype across several cancer types. However, as work on this project progressed, it became apparent that INPP4B is not only prognostically significant in AML, but also acts in a manner completely opposite to what was expected and known at the time. Specifically, INPP4B was shown to act in an oncogene-like, rather than a tumor suppressive, manner in AML.

The findings presented in Chapter 2 demonstrate that INPP4B can be used as an independent prognostic marker in AML, contributes to a more aggressive disease phenotype, and thus can be a possible target for directed therapy. The following section provides a brief summary of the main findings presented in Chapter 2 and their significance. Furthermore, this section discusses some of the unanswered questions still remaining and what future work can be conducted to address them.

4.1.1. Summary of Results

Prognostic Significance

Multi-dataset analysis of INPP4B expression levels revealed a right-tailed expression distribution with significantly higher expression levels in 25% of the patients (INPP4Bhigh). INPP4B gene expression levels were shown to correlate with INPP4B protein levels as demonstrated by selected patient sample Western blots and IF. Clinically, INPP4Bhigh patients had significantly shorter OS and EFS, as observed in six and three independent datasets, respectively. Multivariate OS and EFS analysis revealed that INPP4Bhigh status was an independent prognostic marker and improved the baseline prognostic model consisting of WBC levels, cytogenetic risk and FLT3-ITD status. When compared to other known prognostic markers, INPP4Bhigh was shown to have superior predictability measures. Specifically, INPP4B’s prognostic significance outperformed the mutation based prognostic markers FLT3-ITD, FLT3-TKD, and NPM1, and the expression based prognostic markers MN1, MECOM, BAALC and SALL4. In addition to demonstrating prognostic significance across all AML patients, INPP4Bhigh status was shown to be prognostically significant in cytogenetically normal and intermediate cytogenetic risk patients.

Drug Resistance

Clinical data analysis of the OCI/PM and Verhaak datasets revealed that INPP4Bhigh patients were significantly less likely to achieve CR following induction therapy. Consistent with this, INPP4B expression levels were significantly higher in NR patients compared to those achieving CR. To determine the role of INPP4B in drug resistance in vitro, FLAG-tagged INPP4B was stably overexpressed in a panel of AML cell lines. In vitro treatment of control and INPP4Bhigh AML cells with daunorubicin revealed that INPP4B overexpression is associated with significantly higher resistance to daunorubicin, as determined by higher cellular viability following drug treatment. Given that daunorubicin is a non-specific DNA-damaging agent, the role of INPP4B in resistance to IR was also examined. As with daunorubicin, INPP4Bhigh AML cells had significantly higher viability following IR treatment compared to control cells. To determine whether the daunorubicin resistance phenotype is dependent on INPP4B’s 4-phosphatase activity, a catalytically inactive C842S form of INPP4B was overexpressed in OCI/AML-2. The daunorubicin drug treatment assay revealed a significantly higher viability in OCI/AML-2 cells overexpressing the wildtype, but not the catalytically inactive, form of INPP4B following treatment. Thus, the observed drug resistance phenotype is mediated by the 4-phosphatase activity of INPP4B.

In vitro Proliferation and Colony Formation

CFC assay revealed that INPP4B overexpression was associated with a significant increase in colony formation potential in both normal and low serum conditions. Furthermore, a proliferation assay revealed that INPP4B overexpression was also associated with a significant increase in proliferation. A partial explanation for this proliferative advantage was a significant increase in cell viability in INPP4Bhigh cells as measured by Annexin-V staining. To assess the role of INPP4B’s 4-phosphatase activity in mediating the observed phenotypes, the CFC and proliferation assays were repeated on the control, wildtype INPP4Bhigh, and catalytically inactive INPP4Bhigh cells. While proliferation and low serum growth phenotypes were partially dependent on INPP4B’s 4-phosphatase activity, the increase in colony formation potential was completely dependent on INPP4B’s 4-phosphatase activity.


Summary

Overall, Chapter 2 provides some of the earliest evidence in support of a previously uncharacterized oncogenic function for the known tumor suppressor INPP4B. Specifically, INPP4Bhigh AML was shown to be associated with poor patient outcome across multiple independent datasets. Functional in vitro studies provided evidence that INPP4B overexpression results in a phenotype characteristic of a more aggressive disease: higher proliferation, drug resistance, colony formation potential and viability. These findings provide evidence that INPP4B expression status defines a prognostically distinct subgroup of AML, and can potentially be used as a prognostic biomarker and a drug target.

4.1.2. Relevant New Literature

At the time when work on the Chapter 2 project began, INPP4B was known to function only as a tumor suppressor. However, our findings in AML provided evidence contrary to that assumption, uncovering a prognostic significance and phenotype consistent with an oncogene rather than a tumor suppressor. Today, there are four additional papers which describe the oncogene-like function of INPP4B across different cancer types. The following section provides a brief overview of these studies, which also describe a mechanism that may account for INPP4B’s oncogenic function in some contexts.


Acute Myeloid Leukemia

Shortly following our publication, Rijal et al. published an independent study detailing the prognostic role of INPP4B in AML (Rijal et al., 2015). Specifically, INPP4B was shown to be overexpressed in leukemic blasts collected from the BM at diagnosis. Median dichotomization of INPP4B revealed INPP4Bhigh was independently associated with lower CR rates, and shorter OS and relapse free survival (RFS). The significant survival difference was seen both in all patients and in intermediate cytogenetic risk patients alone. AML cell lines overexpressing INPP4B were shown to be resistant to in vitro treatment with arabinoside, daunorubicin and etoposide. Though Rijal et al. observed similar prognostic significance and in vitro drug resistance, their findings led to the conclusion that the in vitro drug resistance was independent of INPP4B’s 4-phosphatase activity, in contradiction to our findings. Though the exact role of INPP4B’s 4-phosphatase activity in mediating the observed drug resistance remains unclear, both my findings and those of Rijal et al. demonstrate the prognostic significance of INPP4B in AML.

Breast Cancer

* The following study by Gasser et al. was the first study to propose an SGK3-mediated mechanism to explain how INPP4B can act as an oncogene

Serum and glucocorticoid-regulated kinase 3 (SGK3) is a protein kinase sharing 55% sequence similarity with the AKT catalytic domain (Gasser et al., 2014). The sequence similarity allows both AKT and SGK3 to phosphorylate the same RXRXXS/T motif, thus allowing them to target many of the same downstream substrates (Figure 4.1). In addition to the catalytic domain, SGK3 also contains an activation loop (activated by PDK1 at T320), hydrophobic motif (activated by TORC2), and an N-terminal phox homology (PX) domain. The PX domain allows SGK3 to bind to PI(3)P (generated by INPP4B), which localizes it to the endosomal membrane and is essential for SGK3 activation. Additionally, both INPP4B and SGK3 have been shown to be transcriptionally upregulated by ER.
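To make the shared RXRXXS/T consensus concrete, the following minimal Python sketch scans a peptide sequence for the motif and reports the position of the candidate phospho-acceptor residue. The function name `find_motif_sites` and the example peptide are hypothetical and serve only as an illustration; a consensus match by itself does not establish that a site is phosphorylated by AKT or SGK3.

```python
import re

# R-X-R-X-X-S/T, where X is any residue. The lookahead allows overlapping
# motif occurrences to be reported as separate hits.
MOTIF = re.compile(r"(?=(R.R..[ST]))")

def find_motif_sites(sequence):
    """Return (motif_start, acceptor_index, matched_text) tuples, 0-based."""
    hits = []
    for match in MOTIF.finditer(sequence.upper()):
        start = match.start()
        hits.append((start, start + 5, match.group(1)))  # S/T is the 6th residue
    return hits

# Hypothetical peptide, not taken from any real protein.
for start, site, text in find_motif_sites("AARPRTTSAA"):
    print(f"motif {text} at {start}, candidate acceptor residue at {site}")
```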

Figure 4.1. SGK3 Signaling. SGK3 is recruited to the cell membrane by PI(3)P and activated by PDK1 phosphorylation. Once activated, SGK3 phosphorylates many of the same targets as AKT.


SGK3 is amplified in 30% of tumors, 54% of breast cancers, and its basal phosphorylation has been shown to be essential for the survival and growth of some breast cancer cell lines with mutated PIK3CA or inactivated PTEN (Vasudevan et al., 2009). However, basal SGK3 phosphorylation was not present in all PIK3CA mutated or PTEN inactivated cell lines.

Interestingly, SGK3 phosphorylation was found to correlate with INPP4B overexpression.

Due to its role in generating SGK3-activating PI(3)P, the authors examined the role of INPP4B in regulating SGK3 activity in ER+ breast cancer cell lines. Overexpression of wild type INPP4B, but not the catalytically inactive mutant form, was found to significantly enhance in vitro kinase activity of SGK3 while decreasing AKT activation. Conversely, silencing of INPP4B using shRNA attenuated SGK3 phosphorylation in response to IGF-1 stimulation of the cells (activation of PI3K receptor). Activation of SGK3 was shown to promote matrigel 3D proliferation and anchorage independent growth in ZR-75-1 and MCF7 cells. INPP4B silencing in ZR-75-1 xenograft tumors resulted in decreased tumor growth and reduction of SGK3 phosphorylation. Transwell migration assays revealed that INPP4B silencing results in significantly decreased migration in ZR-75-1, MCF-7 and T47D cells, whereas overexpression increased migration. Overall, this study revealed that INPP4B is required for invasive migration, 3D proliferation, and formation of tumors in vivo. Furthermore, the authors identified an SGK3-mediated mechanism which allows INPP4B to act as an oncogene.


Melanoma

A 2014 study by Perez-Lorenzo et al. concluded that INPP4B acts as a tumor suppressor in the context of melanoma (Perez-Lorenzo et al., 2014). As summarized in Section 1.2.1, the study demonstrated that INPP4B loss is associated with increased invasion potential, proliferation, AKT activation and tumorigenic potential in mouse xenograft models. However, a later study by Chi et al. in 2015 provided contrary observations which suggest that INPP4B acts as an oncogene, rather than a tumor suppressor, in melanoma (Chi et al., 2015). Chi et al. observed that INPP4B expression is higher in metastatic melanomas compared to benign nevi. INPP4B knockdown resulted in decreased proliferation, which was rescued by reintroduction of INPP4B. Conversely, INPP4B overexpression resulted in anchorage independent growth and increased proliferation. Though Chi et al. did not provide an explanation for why their observations contradict the study by Perez-Lorenzo et al., it is possible that, as in breast cancer, INPP4B function depends on the melanoma subtype.

Next, to determine whether the function of INPP4B in melanoma is mediated by the same SGK3-mediated mechanism observed in ER+ breast cancer, Chi et al. examined SGK3 phosphorylation across melanoma patients. SGK3 phosphorylation levels were observed to be significantly higher in melanoma patient samples with elevated INPP4B expression. Similarly, in vitro studies revealed that INPP4B knockdown in melanoma cell lines resulted in decreased SGK3 activation/phosphorylation. Furthermore, while shRNA mediated knockdown of INPP4B in melanoma cells resulted in decreased proliferation, introduction of a constitutively active SGK3 (myr-SGK3) rescued the high proliferation rates seen before the INPP4B knockdown.


This evidence supports the hypothesis that INPP4B overexpression causes increased proliferation rates through SGK3 activation. Conversely, INPP4B overexpression in melanoma cell lines resulted in increased SGK3 activation. The resulting increase in proliferation was abrogated by shRNA mediated silencing of SGK3.

Colon Cancer

Immunohistochemistry (IHC), Western blot and qPCR analysis revealed that INPP4B gene and protein expression levels are elevated in colon cancer patient samples and cell lines (Guo et al., 2015). shRNA mediated silencing of INPP4B in colon cancer cell lines resulted in decreased proliferation in vitro, and decreased tumor growth in vivo. Conversely, overexpression of INPP4B resulted in anchorage independent growth. Similar to melanoma, Guo et al. observed that shRNA-mediated silencing of INPP4B in colon cancer cell lines resulted in decreased phosphorylation of SGK3. Functionally, the decrease in proliferation caused by INPP4B silencing was reversed by introduction of a constitutively active SGK3 (myr-SGK3). Conversely, INPP4B overexpression resulted in increased phosphorylation of SGK3 and anchorage independent growth, which was reversed by introduction of shRNA targeting SGK3. Together, these results provide additional support that INPP4B oncogenic activity is the result of signalling through SGK3.

Summary

Overall, recent literature (including my own paper) has uncovered several new cancer contexts where INPP4B acts in an oncogene-like manner (Figure 4.2). Interestingly, the literature seems to suggest that the function of INPP4B differs not only between cancers, but also between subtypes of the same cancer. These findings suggest that differences in how the cells are wired are responsible for the switch in INPP4B’s function. PI signaling studies reveal that INPP4B’s oncogenic function may derive from its ability to generate SGK3-activating PI(3)P. Because SGK3 shares many targets with AKT, signaling through SGK3 can induce cellular growth, proliferation, and survival.

Figure 4.2. INPP4B is a pan-cancer oncogene. Summary of INPP4B’s pan-cancer oncogene function detailing the phenotypes observed following INPP4B knockdown/loss or overexpression. The timeline indicates the year in which INPP4B was identified to act as an oncogene in the respective cancer.


4.1.3. Significance of Findings

Classification of AML based on leukemic drivers allows for more informed and subtype-appropriate treatment which better addresses the AML root cause and driving force. While large-scale genomic profiling studies characterize the somatic mutation and cytogenetic landscape of AML, much remains unknown regarding expression-based drivers. The in vitro work presented in Chapter 2 demonstrates that INPP4B overexpression in AML cell lines results in increased drug and IR resistance, higher colony formation potential, and greater proliferation rates. Clinical data analysis demonstrates that INPP4Bhigh AML is associated with shorter survival and lower likelihood of achieving CR. These findings demonstrate that INPP4B is a novel expression-based prognostic marker, and contributes to a more aggressive disease phenotype. Though these findings are significant on their own, the most unexpected aspect of my findings was that INPP4B was shown to act in an oncogene-like, rather than a tumor suppressive, manner in AML. These findings were in direct contradiction to what was known about INPP4B at the time. As discussed in Section 4.1.2, the newly discovered oncogene-like function of INPP4B was also observed in breast cancer, colon cancer and melanoma, in addition to being independently validated in AML by another research group. Together these findings demonstrate the context-dependent function of INPP4B in cancer.


4.1.4. Outstanding Questions and Future Directions

Following the work I completed in Chapter 2, I had two main outstanding questions regarding my findings. First, I wanted to know if my findings would hold up to independent validation, and second, what mechanism allows INPP4B to act in an oncogene-like manner in AML. As discussed in Section 4.1.2, recent INPP4B literature provided some answers to these questions. The study by Rijal et al. validated our findings regarding the prognostic significance of INPP4B in AML (Rijal et al., 2015). Specifically, Rijal et al. demonstrated that INPP4B overexpression is associated with shorter OS and RFS, and decreased response to drug therapy. Though neither my study nor that of Rijal et al. identified the mechanism of action, the work done in breast cancer, melanoma, and colon cancer makes SGK3 signaling a plausible explanation for INPP4B’s oncogenic function in the context of AML. Thus, the next step would be to examine SGK3 signaling in AML, to see if there is higher SGK3 phosphorylation and activity in INPP4Bhigh versus INPP4Blow AML cells. Signaling through the SGK3 pathway would explain why INPP4Bhigh cells have higher proliferation and colony formation potential without increased pAKT signaling (Figure 4.3).


Figure 4.3. INPP4B promotes SGK3 activation. INPP4B-mediated dephosphorylation of PI(3,4)P2 promotes recruitment and activation of SGK3, and decreases AKT activation.

4.2. SUBID: A CUT-OFF OPTIMIZATION AND SUBGROUP IDENTIFICATION TOOL

INPP4B expression analysis using several independent datasets revealed that INPP4B is overexpressed in 25% of AML patients. This INPP4Bhigh subgroup was prognostically distinct from the rest of the population, and was characterized by shorter OS and EFS, and lower CR rates. Furthermore, INPP4B’s independent prognostic significance was shown to outperform the known mutation-based prognostic biomarkers FLT3-ITD, FLT3-TKD, and NPM1 in terms of survival prediction. These findings demonstrated that gene expression-based biomarkers can be used to improve risk stratification. These findings also highlighted the heterogeneity of cancer, and the importance of identifying distinct subgroups within the population. Together, the work in Chapter 2 inspired the development of a subgroup identification tool (SubID) to better study and visualize the relationship between a continuous variable and an optimization parameter in heterogeneous populations.

4.2.1. Summary of Results

Chapter 3 describes the development and application of SubID, a non-median dichotomization tool for the identification of subgroups within a heterogeneous population. The core concept of SubID relies on identifying subgroups based on distinct relationships between a continuous variable of interest and an output parameter. The continuous variable cut-off versus output parameter plot provided by SubID visualizes this relationship, thereby giving valuable information regarding the number and size of significant subgroups within the heterogeneous population. This yields two advantages compared to simple median dichotomization. First, the optimal cut-off point more specifically segregates the distinct subgroups within a population, which in turn allows more sensitive examination of the differences between the subgroups. Second, the population can be segregated into multiple distinct subgroups based on several cut-offs (not just two subgroups, as with median dichotomization).
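The cut-off scan described above can be illustrated with a minimal Python sketch. This is not the SubID implementation: the function names (`logrank_p`, `scan_cutoffs`), the 10-90% percentile grid, and the toy cohort are assumptions made for illustration, and a two-group log-rank test stands in for whichever output parameter is being optimized.

```python
import math

def logrank_p(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns the chi-square (1 df) p-value."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    obs_minus_exp, variance = 0.0, 0.0
    for t in sorted({t for t, e, _ in data if e}):  # distinct event times
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk, group B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d = sum(1 for ti, e, _ in data if ti == t and e)
        n = n_a + n_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / variance
    return math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df

def scan_cutoffs(expression, times, events, percentiles=range(10, 91, 5)):
    """Return (percentile, p-value) for each candidate expression cut-off."""
    ranked = sorted(expression)
    results = []
    for pct in percentiles:
        threshold = ranked[min(len(ranked) - 1, int(len(ranked) * pct / 100))]
        high = [i for i, x in enumerate(expression) if x >= threshold]
        low = [i for i, x in enumerate(expression) if x < threshold]
        if not high or not low:
            continue  # a cut-off must leave patients on both sides
        p = logrank_p([times[i] for i in low], [events[i] for i in low],
                      [times[i] for i in high], [events[i] for i in high])
        results.append((pct, p))
    return results

# Toy cohort (hypothetical): the top 25% of expressers die early.
expression = list(range(1, 21))
times = [x - 15 if x >= 16 else 39 + x for x in expression]
events = [1] * 20
best_pct, best_p = min(scan_cutoffs(expression, times, events), key=lambda r: r[1])
print(best_pct, best_p)
```

Plotting the returned p-values against the percentile gives a cut-off versus output parameter curve of the kind described above; more than one well-separated local minimum would hint at more than two subgroups. Because the best of many cut-offs is selected, the minimum p-value is optimistic and needs a multiple-testing correction before it is interpreted.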

Following development, SubID function was demonstrated by examining HGF, a known biomarker of APL. Clinical data analysis revealed that dichotomization of the AML population based on the SubID-identified optimal cut-off for HGF yielded a significantly greater survival difference between the HGFhigh and HGFlow subgroups compared to median dichotomization. Furthermore, SubID revealed that HGF expression can be used to identify three prognostically distinct subgroups. Kaplan-Meier survival analysis demonstrated a significant difference in OS between the HGFlow, HGFinter, and HGFhigh subgroups. SubID analysis with Fisher’s exact test revealed that the HGFlow subgroup was enriched for adverse risk cytogenetic abnormalities, while the HGFhigh subgroup was enriched for favorable risk cytogenetic abnormalities. SubID examination of the relationship between HGF and the t(15;17) cytogenetic abnormality revealed a highly significant association. The maximal association was observed at the 92% HGF cut-off.
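The same scan can be run against a binary feature, such as the presence of a cytogenetic abnormality, by swapping the survival test for Fisher’s exact test. The sketch below is an illustration under stated assumptions rather than the SubID code: `fisher_exact_two_sided` and `scan_association` are hypothetical names, the two-sided p-value sums all tables no more probable than the observed one, and the toy data are invented.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x positives in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

def scan_association(values, flags, percentiles=range(10, 91, 5)):
    """Return (percentile, p-value) for high/low status versus a binary flag."""
    ranked = sorted(values)
    results = []
    for pct in percentiles:
        threshold = ranked[min(len(ranked) - 1, int(len(ranked) * pct / 100))]
        a = sum(1 for v, f in zip(values, flags) if v >= threshold and f)      # high, positive
        b = sum(1 for v, f in zip(values, flags) if v >= threshold and not f)  # high, negative
        c = sum(1 for v, f in zip(values, flags) if v < threshold and f)       # low, positive
        d = sum(1 for v, f in zip(values, flags) if v < threshold and not f)   # low, negative
        if a + b == 0 or c + d == 0:
            continue
        results.append((pct, fisher_exact_two_sided(a, b, c, d)))
    return results

# Toy data (hypothetical): only the two highest expressers carry the abnormality.
values = list(range(1, 21))
flags = [v >= 19 for v in values]
print(min(scan_association(values, flags, range(10, 91, 10)), key=lambda r: r[1]))
```

In this toy example the strongest association falls at the cut-off that isolates the two carriers, mirroring how a rare subgroup can be missed by a median split.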

Overall, these findings demonstrated that SubID provides the user with a more comprehensive understanding of the relationship between the continuous variable and optimization parameter compared to simple median dichotomization. Furthermore, SubID was shown to be capable of identifying multiple prognostically distinct subgroups within a heterogeneous population.

4.2.2. Significance of Findings

Subgrouping a population with a heterogeneous disease into more homogeneous subgroups allows the healthcare team to better predict the patient’s outcome and select the best treatment regimen. Such disease subgrouping relies on progressive dichotomization of patients into subgroups based on biologically and clinically significant parameters. For example, as discussed in the introduction, all AML patients with a t(15;17) cytogenetic abnormality are grouped together into the same subgroup, APL. APL patients have similar predicted survival, and are treated with an APL-specific treatment regimen consisting of all-trans retinoic acid (ATRA) and arsenic trioxide (Ghavamzadeh et al., 2006; Lo-Coco et al., 2013; Sanz et al., 1999; Tallman, 1997; Wang & Chen, 2008). Mutation status and cytogenetic abnormalities have long been used to dichotomize patients. These parameters, however, are binary: a mutation can either be present or absent, with nothing in between. Continuous stratification variables (such as gene expression and DNA methylation), on the other hand, are much more complicated, as there is no innate cut-off point, thus requiring the identification of a threshold or cut-off in order to allow for dichotomization or binning. Saying a gene is overexpressed implies that there is a certain expression level that has to be surpassed in order to go from “normal expression” to “overexpression”. Developing a strategy for establishing clinically significant cut-offs would allow clinicians to use continuous variables for subgrouping heterogeneous populations, thus allowing improved multivariate survival models. Furthermore, a better characterization of the relationship between a continuous variable and output parameter, as well as more homogeneous subgroups, would allow researchers to better study the biological role of the gene.

Overall, the strength of SubID is that it allows for identification of prognostically or biologically distinct subgroups, no matter what proportion of the total population they account for. This in turn offers superior sensitivity in identifying biomarkers compared to median dichotomization. Unlike median dichotomization, cut-off optimization acknowledges that prognostically distinct subgroups can make up less than 50% of the entire population.


4.2.3. Outstanding Questions and Future Directions

In this thesis, SubID was applied exclusively to gene expression data in order to assess its prognostic significance and association with other genes. However, SubID input can include any continuous variable, and SubID can be set to optimize cut-offs on a variety of optimization parameters. By allowing categorization of continuous variables based on prognostic significance, SubID can be used to develop multivariate survival models which take into account continuous patient and clinical characteristics (e.g. age, blood counts, performance scores), and tumor biology (e.g. flow cytometry data, DNA methylation) data. Future steps in SubID-mediated prognostic modeling would include performance comparison between a SubID-optimized prognostic model, a median dichotomization-based model, and other subgrouping strategies. This head-to-head comparison would allow us to assess and quantify the model improvement when the SubID cut-off optimization pipeline is used. Part of this analysis would include model validation on an independent cohort of patients. Data validation on an independent dataset would help assess the success of my FDR correction in minimizing false positives and false negatives, and determine how consistent the optimal cut-offs are.

4.3. POTENTIAL TRANSCRIPTIONAL REGULATORS OF INPP4B

4.3.1. Summary of Results

Frequency distribution analysis in AML revealed that INPP4B is overexpressed in 25% of AML patients. However, the reason for this observed overexpression remained unclear. To identify potential regulators of INPP4B expression in AML, SubID was used to characterize the INPP4Bhigh AML signature using the established 75% cut-off. The INPP4Bhigh AML signature developed using the Verhaak dataset led to the identification of six transcription factors that were highly co-expressed with INPP4B. Specifically, the identified co-expressed transcription factors were TCF4 (73% cut-off, P=4.5E-13), NFATC2 (70% cut-off, P=1.1E-11), KLF12 (70% cut-off, P=1.4E-09), GATA3 (85% cut-off, P=4.9E-09), STAT4 (89% cut-off, P=6.6E-08) and EVI1 (89% cut-off, P=3.1E-08). The Verhaak dataset derived INPP4Bhigh signature was next validated on an independent dataset (TCGA). Though all six potential transcription factors were significantly associated with INPP4B in the TCGA dataset, the NFATC2 (28%; P=3.1E-05) optimal co-expression cut-off no longer matched INPP4B’s 75% cut-off. Thus, the top INPP4B co-expressed transcription factors validated in the TCGA dataset were TCF4 (55%, P=3.5E-07), KLF12 (64%, P=2.4E-04), GATA3 (89%, P=2.6E-05), STAT4 (78%, P=7.2E-05), and EVI1 (79%, P=3.1E-05). Due to its role in hematopoiesis and leukemogenesis, EVI1 was selected for in vitro validation. shRNA mediated knockdown of EVI1 in AML cell lines was shown to be associated with a corresponding decrease in INPP4B expression. Chromatin immunoprecipitation demonstrated EVI1 binding at the INPP4B promoter region.

4.3.2. Significance of Findings

The identification and validation of EVI1 as a potential regulator of INPP4B expression is significant for two distinct reasons. First, the co-expression between EVI1 and INPP4B, both identified to be enriched in leukemic stem cells, suggests a common leukemic stem cell network which causes upregulation of both of these genes (Eppert et al., 2011). Similar to INPP4B, high EVI1 expression has been shown to be associated with poor patient outcome in AML (Lugthart et al., 2008). Thus, targeted therapy that downregulates both INPP4B and EVI1 may improve patient survival. Second, the experimental validation demonstrating that EVI1 regulates INPP4B expression (directly or indirectly) provides validation of the SubID co-expression pipeline.

4.3.3. Outstanding Questions and Future Directions

Though the co-expression between EVI1 and INPP4B demonstrated by bioinformatic and in vitro work provides evidence of a transcriptional link, the exact relationship remains unknown. Specifically, while EVI1 was shown to bind to the INPP4B promoter, it is unclear if this binding is functional and causes upregulation of INPP4B levels. There remains a possibility that the EVI1 binding to the INPP4B promoter region is non-functional, and that EVI1 instead regulates INPP4B expression by binding to a different site, or via a transcriptional intermediate (EVI1 → transcription factor → INPP4B). To demonstrate the functionality of EVI1 binding, a luciferase assay of the EVI1 binding site would be the necessary next step.

In addition to studying the EVI1-INPP4B relationship further, another future step would be to further validate and assess SubID’s ability to identify co-expressions and predict transcriptional regulation networks. Validation of SubID targets on multiple datasets would allow me to examine the variability and false discovery rates of SubID hits. However, as with EVI1, the ultimate validation remains experimental validation.


4.4. PAN-CANCER PROGNOSTIC SIGNIFICANCE OF INPP4B

The experimental and bioinformatic research work presented in this dissertation was inspired by the curiosity to understand the role of INPP4B across cancers. When the research work for this dissertation began, INPP4B was known exclusively as a tumor suppressor in the context of cancer. As reviewed in the introduction section, loss of INPP4B was known to be associated with poor clinical outcome, increased migration, drug resistance, and increased proliferation in several different cancer types. This led us to hypothesize that INPP4B plays a similar role in AML. However, as our research began, it quickly became apparent that INPP4B overexpression was associated with poor clinical outcome and drug resistance in AML. Next, experimental validation provided additional evidence that INPP4B may indeed fulfill a role more consistent with an oncogene rather than a tumor suppressor in the context of AML. Around the time that our findings were submitted for publication, several additional studies were published showing that INPP4B does indeed play an oncogene-like role in some cancer subtypes.

However, these findings were primarily based on INPP4B protein expression, and did not examine the role of INPP4B gene expression. To address this issue, I conducted a pan-cancer characterization of the INPP4B prognostic significance.

4.4.1. Summary of Results

Pan-cancer characterization of INPP4B prognostic significance with SubID revealed that INPP4Blow is associated with adverse patient outcome in four cancer types, and favourable outcome in nine cancer types based on CoxPH P<0.05. Of these thirteen cancer types, prognostic significance was observed in four cancer types when median dichotomization was used instead of SubID. Next, permutation-based FDR analysis was applied to the SubID-identified thirteen cancer types where INPP4B was prognostically significant. The high-confidence cancer types remaining after the FDR step were kidney clear cell, bladder urothelial, pancreatic, and liver hepatocellular carcinomas. Consistent with literature, our findings demonstrated that low INPP4B expression was associated with adverse patient outcome in bladder cancer (Hsu et al., 2014). The prognostic significance of INPP4B expression status was validated in two other pancreatic cancer datasets, thus providing cross-validation of the SubID-mediated identification of prognostic biomarkers.
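Why a permutation step is needed after cut-off optimization can be shown with a generic sketch. The code below is not the FDRrange procedure used in this thesis: it is a standard max-statistic permutation test, with the hypothetical helpers `max_gap_over_cutoffs` and `permutation_p`, and it uses the absolute difference in mean outcome between the high and low groups as a stand-in for a survival statistic. The idea is that the best statistic over all cut-offs is compared with the best statistic obtained after shuffling the outcomes, so that the search over cut-offs is itself accounted for.

```python
import random

def max_gap_over_cutoffs(values, outcomes, percentiles):
    """Largest |mean(high) - mean(low)| over all candidate cut-offs."""
    ranked = sorted(values)
    best = 0.0
    for pct in percentiles:
        threshold = ranked[min(len(ranked) - 1, int(len(ranked) * pct / 100))]
        high = [o for v, o in zip(values, outcomes) if v >= threshold]
        low = [o for v, o in zip(values, outcomes) if v < threshold]
        if high and low:
            best = max(best, abs(sum(high) / len(high) - sum(low) / len(low)))
    return best

def permutation_p(values, outcomes, percentiles=range(10, 91, 5),
                  n_perm=1000, seed=0):
    """Permutation p-value for the cut-off-optimized statistic."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = max_gap_over_cutoffs(values, outcomes, percentiles)
    shuffled = list(outcomes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between expression and outcome
        if max_gap_over_cutoffs(values, shuffled, percentiles) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one estimate, never exactly zero

# Toy cohort (hypothetical): the top 25% of expressers have short survival.
values = list(range(1, 21))
outcomes = [x - 15 if x >= 16 else 39 + x for x in values]
print(permutation_p(values, outcomes))
```

Running the same test on each dataset and then applying a multiple-testing correction across datasets would be one way to arrive at a list of high-confidence hits; how the FDRrange threshold in this thesis combines these steps is described in Chapter 3 rather than here.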

Overall, the SubID-mediated analysis of INPP4B described in this thesis is the first examination of INPP4B’s pan-cancer prognostic significance. In addition to identifying several new cancer contexts where INPP4B plays either a tumor suppressor or an oncogenic role, the pan-cancer examination established a pipeline for the characterization of additional gene expression-based prognostic biomarkers. Equally interesting, my analysis uncovered a previously undescribed context where INPP4B acts as an oncogene, thus providing an additional setting in which INPP4B’s oncogenic mechanism can be studied.

4.4.2. Significance of Findings

The pan-cancer characterization of INPP4B’s prognostic significance complemented current literature by demonstrating the context-dependent nature of INPP4B. Specifically, I identified previously uncharacterized high-confidence contexts where loss of INPP4B expression is associated with adverse (tumor suppressor-like role; kidney and liver cancers) and favorable (oncogene-like role; pancreatic cancer) patient outcome. Furthermore, because my prognostic analysis was based on gene expression data, I was able to characterize the INPP4Bhigh signature, which was shown to be enriched for metastasis pathway-related genes. Subsequent clinical data analysis revealed that INPP4Blow kidney cancer patients did indeed have a higher occurrence of metastasis at diagnosis. In addition to characterizing the prognostic significance of INPP4B across cancers, I was able to demonstrate the functionality of SubID in detecting prognostic significance, identifying multiple subgroups (as was the case in liver cancer), and establishing a gene expression signature. Most importantly, I was able to demonstrate that SubID is a more sensitive method for identifying prognostic significance than median dichotomization.
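The contrast between median dichotomization and a non-median cutoff search can be illustrated with a minimal sketch. It is not the SubID implementation, whose internals are not described in this section. It assumes a plain rank-based cutoff scan scored with a two-group log-rank chi-square statistic, whereas the thesis reports CoxPH P values. The function names, the 10% minimum group size, and the 20-patient toy cohort are all hypothetical.

```python
def logrank_chi2(times, events, in_group):
    """Two-group log-rank chi-square statistic (1 degree of freedom).

    times: follow-up times; events: 1 = event observed, 0 = censored;
    in_group: True for samples in the first group, False for the second.
    """
    observed_minus_expected = 0.0
    variance = 0.0
    # Walk through each distinct time at which at least one event occurred.
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n1 = sum(1 for i in at_risk if in_group[i])
        d = sum(1 for i in at_risk if times[i] == t and events[i])
        d1 = sum(1 for i in at_risk
                 if times[i] == t and events[i] and in_group[i])
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0


def scan_cutoffs(expression, times, events, min_frac=0.1):
    """Score every rank-based split; key k means the k lowest-expressing
    samples form the 'low' group. Ties in expression are not handled."""
    order = sorted(range(len(expression)), key=lambda i: expression[i])
    n = len(order)
    lo = max(1, int(n * min_frac))  # assumed minimum group size
    scores = {}
    for k in range(lo, n - lo + 1):
        low = set(order[:k])
        in_low = [i in low for i in range(n)]
        scores[k] = logrank_chi2(times, events, in_low)
    return scores


# Hypothetical cohort: only the four highest-expressing patients do poorly.
expression = list(range(20))          # stand-in expression values
times = [60, 45, 72, 51, 66, 48, 80, 55, 63, 47,
         75, 52, 69, 58, 77, 49, 6, 9, 4, 11]
events = [1] * 20                     # every event observed (no censoring)

stats = scan_cutoffs(expression, times, events)
median_k = len(expression) // 2
best_k = max(stats, key=stats.get)
print(f"median split (k={median_k}): chi2 = {stats[median_k]:.2f}")
print(f"best split   (k={best_k}): chi2 = {stats[best_k]:.2f}")
```

On this toy cohort the split that isolates the four short-survival patients (k=16) scores far higher than the median split (k=10), which mixes them with long-survival patients. Because the best of many cutoffs is selected, its nominal P value is optimistic, which is why a multiple-testing safeguard such as a permutation-based FDR step is needed.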

4.4.3. Outstanding Questions and Future Directions

The SubID-mediated pan-cancer analysis identified several high-confidence contexts where INPP4B is associated with patient outcome. Due to the absence of validation datasets, minimization of false discoveries relied on permutation- and resampling-based FDR calculations. However, the permutation-based FDR is highly stringent and subject to false negatives; the AML dataset is a specific example. As demonstrated by my work in AML, though INPP4B’s prognostic significance in the TCGA AML dataset did not pass the FDR<0.05 threshold, it was successfully validated in independent AML datasets and experimentally (Chapter 2). Thus, by examining additional independent datasets, I may be able to validate some of my low-confidence (CoxPH P<0.05 but FDR≥0.05) datasets.
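The stringency of a permutation-based check can be made concrete with a minimal sketch of a max-statistic permutation test, one common way to account for having scanned many cutoffs. It is not necessarily the permutation- and resampling-based FDR procedure used in this thesis, which is not specified in this section. As a simplifying assumption, the score is the largest absolute difference in mean outcome across rank-based splits and ignores censoring; the score function is passed in so that a censoring-aware statistic could be substituted. All names and the toy data are hypothetical.

```python
import random


def best_split_score(marker, outcome, min_group=2):
    """Largest absolute difference in mean outcome over all rank-based splits.

    Stand-in score for illustration only; a survival analysis would use a
    censoring-aware statistic (e.g. log-rank or Cox) instead.
    """
    order = sorted(range(len(marker)), key=lambda i: marker[i])
    ranked = [outcome[i] for i in order]
    total = sum(ranked)
    n = len(ranked)
    best = 0.0
    running = sum(ranked[:min_group - 1])
    for k in range(min_group, n - min_group + 1):
        running += ranked[k - 1]          # sum of the k lowest-ranked samples
        low_mean = running / k
        high_mean = (total - running) / (n - k)
        best = max(best, abs(low_mean - high_mean))
    return best


def permutation_p_value(marker, outcome, score_fn, n_perm=199, seed=0):
    """Permutation P value for the best score found by a cutoff scan.

    Outcomes are shuffled relative to the marker, the whole scan is repeated,
    and the observed best score is compared with the permuted best scores.
    """
    rng = random.Random(seed)
    observed = score_fn(marker, outcome)
    shuffled = list(outcome)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if score_fn(marker, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical cohort: the four highest-marker patients have short outcomes.
marker = list(range(20))
outcome = [60, 45, 72, 51, 66, 48, 80, 55, 63, 47,
           75, 52, 69, 58, 77, 49, 6, 9, 4, 11]

observed_score = best_split_score(marker, outcome)
p_value = permutation_p_value(marker, outcome, best_split_score)
print(f"best-split score = {observed_score:.4f}, permutation P = {p_value:.3f}")
```

The smallest attainable P value is 1/(n_perm + 1), so the number of permutations bounds the resolution of the test. Because every permuted dataset is also allowed to pick its own best cutoff, the null distribution is demanding, which is one way a modest but real effect can fail a strict threshold.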

Lastly, it is of great interest to determine the mechanism that allows INPP4B to act in a tumor-suppressive manner in one context and in an oncogene-like manner in another. By looking for similarities in INPP4B gene expression signatures across all the oncogene-like contexts, or across all the tumor suppressor-like contexts, I could identify potential mechanisms responsible for the switch. Furthermore, consistent with Section 4.1, future experimental work would include pan-cancer characterization of SGK3 and AKT phosphorylation levels to determine whether these pathways are responsible for INPP4B’s dual role.

4.5. CONCLUDING REMARKS

The unifying idea woven through this thesis is that cancer is a heterogeneous disease. Though researchers and clinicians have come a long way in characterizing cancer drivers and prognostically significant features, we are still unable to accurately predict patient outcome. The results presented in this thesis provide evidence suggesting that INPP4B may be used as a prognostic biomarker in the context of AML. Furthermore, due to its role in drug response, proliferation and colony formation, INPP4B is a potential target for inhibitor or knockdown therapy. In addition to AML, our results show that INPP4B has a prognostically significant role across cancer types, suggesting a much wider impact if an INPP4B-targeted therapy is developed. In addition to the INPP4B-related work presented here, this thesis also describes the development and application of SubID, a subgroup identification tool. SubID offers an intuitive visualization of the relationship between a continuous variable and an output parameter across a heterogeneous population. This visualization allows the user to easily see the number and types of subgroups present within a population. Once the population is subgrouped, the user can study each subgroup in more detail. As described in the discussion section, SubID can be modified and optimized for a variety of applications.

REFERENCES

Abdel-Wahab, O., Levine, R. L. (2013) Mutations in epigenetic modifiers in the pathogenesis and therapy of acute myeloid leukemia. Epigenetics in Hematology. 121(18): 3563–3572.

Agoulnik, I. U., Hodgson, M. C., Bowden, W. A., Ittmann, M. M. (2011) INPP4B: the New Kid on the PI3K Block. Oncotarget. 2(4): 321–328.

Arber, D. A., Orazi, A., Hasserjian, R., Borowitz, M. J., Le Beau, M. M., Bloomfield, C. D., et al. (2016) The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood. 127(20): 2391–2406.

Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., et al. (2000) Gene Ontology: tool for the unification of biology. Nature Genetics. 25(1): 25–29.

Bago, R., Malik, N., Munson, M. J., Prescott, A., Davies, P., Sommer, E. M., et al. (2014) Characterisation of VPS34-IN1, a selective inhibitor of Vps34 reveals that the phosphatidylinositol 3-phosphate binding SGK3 protein kinase is a downstream target of Class III PI-3 kinase. The Biochemical Journal. 463(3): 413-427.

Bailey, P., Chang, D. K., Nones, K., Johns, A. L., Patch, A.-M., Gingras, M.-C., et al. (2016) Genomic analyses identify molecular subtypes of pancreatic cancer. Nature. 531: 47–52.

Balakrishnan, A., and Chaillet, J.R. (2013) Role of the inositol polyphosphate-4-phosphatase type II Inpp4b in the generation of ovarian teratomas. Dev. Biol. 373: 118–129.

Baldus, C. D., Bullinger, L. (2008) Gene Expression With Prognostic Implications in Cytogenetically Normal Acute Myeloid Leukemia. Seminars in Oncology. 35(4): 356–364.

Balla, T. (2013) Phosphoinositides: tiny lipids with giant impact on cell regulation. Physiological Reviews. 93(3): 1019–1137.

Bansal, V. S., Caldwell, K. K., Majerus, P. W. (1990) The isolation and characterization of inositol polyphosphate 4-phosphatase. Journal of Biological Chemistry. 265(3): 1806–1811.

Bansal, V. S., Inhorn, R. C., Majerus, P. W. (1987) The metabolism of inositol 1,3,4-trisphosphate to inositol 1,3-bisphosphate. Journal of Biological Chemistry. 262(20): 9444–9447.

Barnache, S., Le Scolan, E., Kosmider, O., Denis, N., Moreau-Gachelin, F. (2006) Phosphatidylinositol 4-phosphatase type II is an erythropoietin-responsive gene. Oncogene. 25(9): 1420–1423.

Bertucci, M. C., Mitchell, C. A. (2013) Phosphoinositide 3-kinase and INPP4B in human breast cancer. Annals of the New York Academy of Sciences. 1280(1): 1–5.

Budczies, J., Klauschen, F., Sinn, B. V., Győrffy, B., Schmitt, W.D., Darb-Esfahani, S., Denkert, C. (2012) Cutoff Finder: a comprehensive and straightforward Web application enabling rapid biomarker cutoff optimization. PLoS One. 7: e51862.

Bullinger, L., Rücker, F.G., Kurz, S., Du, J., Scholl, C., Sander, S., Corbacioglu, A., Lottaz, C., Krauter, J., Fröhling, S., et al. (2007) Gene-expression profiling identifies distinct subclasses of core binding factor acute myeloid leukemia. Blood. 110(4): 1291–1300.

Bunney, T.D., Katan, M. (2010) Phosphoinositide signalling in cancer: beyond PI3K and PTEN. Nat. Rev. Cancer. 10: 342–352.

Byrd, J.C., Mrózek, K., Dodge, R.K., Carroll, A.J., Edwards, C.G., Arthur, D.C., Pettenati, M.J., Patil, S.R., Rao, K.W., Watson, M.S., et al. (2002) Pretreatment cytogenetic abnormalities are predictive of induction success, cumulative incidence of relapse, and overall survival in adult patients with de novo acute myeloid leukemia: results from Cancer and Leukemia Group B (CALGB 8461). Blood. 100: 4325–4336.

Cantley, L.C. (2002) The phosphoinositide 3-kinase pathway. Science. 296: 1655–1657.

Chew, C.L., Lunardi, A., Gulluni, F., Ruan, D.T., Chen, M., Salmena, L., Nishino, M., Papa, A., Ng, C., Fung, J., et al. (2015) In vivo role of INPP4B in tumor and metastasis suppression through regulation of PI3K/AKT signaling at endosomes. Cancer Discov. 5(7): 740–752.

Chi, M.N., Guo, S.T., Wilmott, J.S., Guo, X.Y., Yan, X.G., Wang, C.Y., Ying Liu, X., Jin, L., Tseng, H.-Y., Liu, T., et al. (2015) INPP4B is upregulated and functions as an oncogenic driver through SGK3 in a subset of melanomas. Oncotarget. 6: 39891–39907.

Coombs, C.C., Tavakkoli, M., Tallman, M.S. (2015) Acute promyelocytic leukemia: where did we start, where are we now, and the future. Blood Cancer J. 5: e304.

Croft, D., O’Kelly, G., Wu, G., Haw, R., Gillespie, M., Matthews, L., et al. (2011) Reactome: A database of reactions, pathways and biological processes. Nucleic Acids Research. 39: D691–D697.

Di Paolo, G., and De Camilli, P. (2006) Phosphoinositides in cell regulation and membrane dynamics. Nature. 443: 651–657.

Ding, H., Sun, Y., Hou, Y., and Li, L. (2014) Effects of INPP4B gene transfection combined with PARP inhibitor on castration therapy-resistant prostate cancer cell line, PC3. Urol. Oncol. Semin. Orig. Investig. 32: 720–726.

Döhner, H. (2007) Implication of the molecular characterization of acute myeloid leukemia. Hematology Am. Soc. Hematol. Educ. Program. 412–419.

Dzneladze, I., He, R., Woolley, J. F., Son, M. H., Sharobim, M. H., Greenberg, S. A., et al. (2015) INPP4B overexpression is associated with poor clinical outcome and therapy resistance in acute myeloid leukemia. Leukemia. 29(7): 1485–1495.

Eppert, K., Takenaka, K., Lechman, E.R., Waldron, L., Nilsson, B., van Galen, P., Metzeler, K.H., Poeppl, A., Ling, V., Beyene, J., et al. (2011) Stem cell gene expression programs influence clinical outcome in human leukemia. Nat. Med. 17: 1086–1093.

Fedele, C.G., Ooms, L.M., Ho, M., Vieusseux, J., O’Toole, S.A., Millar, E.K., Lopez-Knowles, E., Sriratana, A., Gurung, R., Baglietto, L., et al. (2010) Inositol polyphosphate 4-phosphatase II regulates PI3K/Akt signaling and is lost in human basal-like breast cancers. PNAS. 107: 22231–22236.

Fenaux, P., Castaigne, S., Dombret, H., Archimbaud, E., Duarte, M., Morel, P., Lamy, T., Tilly, H., Guerci, A., Maloisel, F., et al. (1992) All-trans retinoic acid followed by intensive chemotherapy gives a high complete remission rate and may prolong remissions in newly diagnosed acute promyelocytic leukemia: A pilot study on 26 cases. Blood. 80: 2176–2181.

Fernandez, H.F., Sun, Z., Yao, X., Litzow, M.R., Luger, S.M., Paietta, E.M., Racevskis, J., Dewald, G.W., Ketterling, R.P., Bennett, J.M., et al. (2009) Anthracycline dose intensification in acute myeloid leukemia. N. Engl. J. Med. 361: 1249–1259.

Ferron, M., Vacher, J. (2006) Characterization of the murine Inpp4b gene and identification of a novel isoform. Gene. 376: 152–161.

Foran, J.M. (2010) New prognostic markers in acute myeloid leukemia: perspective from the clinic. Hematology Am. Soc. Hematol. Educ. Program 2010: 47–55.

Franke, T. F. (1997) Direct Regulation of the AKT Proto-Oncogene Product by Phosphatidylinositol-3,4-bisphosphate. Science. 275(5300): 665–668.

Frech, M., Andjelkovic, M., Ingley, E., Reddy, K. K., Falck, J. R., Hemmings, B. A. (1997) High affinity binding of inositol phosphates and phosphoinositides to the pleckstrin homology domain of RAC/protein kinase B and their influence on kinase activity. The Journal of Biological Chemistry. 272(13): 8474–81.

Futreal, P. A., Coin, L., Marshall, M., Down, T., Hubbard, T., Wooster, R., et al. (2004) A census of human cancer genes. Nat. Rev. Cancer. 4(3): 177–183.

Gasser, J.A., Inuzuka, H., Lau, A.W., Wei, W., Beroukhim, R., Toker, A. (2014) SGK3 Mediates INPP4B-Dependent PI3K Signaling in Breast Cancer. Mol. Cell. 56: 595–607.

Gewinner, C., Wang, Z. C., Richardson, A., Teruya-Feldstein, J., Etemadmoghadam, D., Bowtell, D., et al. (2009) Evidence that Inositol Polyphosphate 4-Phosphatase Type II Is a Tumor Suppressor that Inhibits PI3K Signaling. Cancer Cell. 16(2): 115–125.

Ghavamzadeh, A., Alimoghaddam, K., Ghaffari, S.H., Rostami, S., Jahani, M., Hosseini, R., Mossavi, A., Baybordi, E., Khodabadeh, A., Iravani, M., et al. (2006) Treatment of acute promyelocytic leukemia with arsenic trioxide without ATRA and/or chemotherapy. Ann. Oncol. 17: 131–134.

Grimwade, D., Walker, H., Oliver, F., Wheatley, K., Harrison, C., Harrison, G., et al. (1998) The importance of diagnostic cytogenetics on outcome in AML: analysis of 1,612 patients entered into the MRC AML 10 trial. The Medical Research Council Adult and Children’s Leukaemia Working Parties. Blood. 92(7): 2322–33.

Guo, S. T., Chi, M. N., Yang, R. H., Guo, X. Y., Zan, L. K., Wang, C. Y., et al. (2015) INPP4B is an oncogenic regulator in human colon cancer. Oncogene. 35(23): 3049–3061.

Hawkins, P.T., and Stephens, L.R. (2016) Emerging evidence of signalling roles for PI(3,4)P2 in Class I and II PI3K-regulated pathways. Biochem. Soc. Trans. 44: 307–314.

Hodgson, M. C., Deryugina, E. I., Suarez, E., Lopez, S. M., Lin, D., Xue, H., et al. (2014) INPP4B suppresses prostate cancer cell invasion. Cell Communication and Signaling. 12: 61.

Hodgson, M.C., Shao, L., Frolov, A., Li, R., Peterson, L.E., Ayala, G., Ittmann, M.M., Weigel, N.L., Agoulnik, I.U. (2011) Decreased expression and androgen regulation of the tumor suppressor gene INPP4B in prostate cancer. Cancer Res. 71: 572–582.

Holm, S. (1979) A Simple Sequentially Rejective Multiple Test Procedure. Scandinavian Journal of Statistics. 6(2): 65–70.

Hsu, I., Yeh, C.-R., Slavin, S., Miyamoto, H., Netto, G., Tsai, Y.-C., Muyan, M., Wu, X.-R., Yeh, S. (2014) Estrogen receptor alpha prevents bladder cancer via INPP4B inhibited akt pathway in vitro and in vivo. Oncotarget. 5: 7917–7935.

Ip, L. R. H., Poulogiannis, G., Viciano, F. C., Sasaki, J., Kofuji, S., Spanswick, V. J., et al. (2015) Loss of INPP4B causes a DNA repair defect through loss of BRCA1, ATM and ATR and can be targeted with PARP inhibitor treatment. Oncotarget. 6(12): 10548–62.

Kaczkowski, B., Tanaka, Y., Kawaji, H., Sandelin, A., Andersson, R., Itoh, M., et al. (2016) Transcriptome analysis of recurrently deregulated genes across multiple cancers identifies new pan-cancer biomarkers. Cancer Research. 76(2): 216–226.

Kim, J.S., Yun, H.S., Um, H.D., Park, J.K., Lee, K.H., Kang, C.M., Lee, S.J., Hwang, S.G. (2012) Identification of inositol polyphosphate 4-phosphatase type II as a novel tumor resistance biomarker in human laryngeal cancer HEp-2 cells. Cancer Biol. Ther. 13: 1307–1318.

Klippel, A., Kavanaugh, W. M., Pot, D., Williams, L. T. (1997) A specific product of phosphatidylinositol 3-kinase directly activates the protein kinase Akt through its pleckstrin homology domain. Molecular and Cellular Biology. 17(1): 338–44.

Kofuji, S., Kimura, H., Nakanishi, H., Nanjo, H., Takasuga, S., Liu, H., et al. (2015) INPP4B Is a PtdIns(3,4,5)P3 Phosphatase That Can Act as a Tumor Suppressor. Cancer Discovery. 5(7): 730–9.

Lemmon, M. A. (2008) Membrane recognition by phospholipid-binding domains. Nature Reviews. Molecular Cell Biology. 9(2): 99–111.

Liu, Y., Cheney, M. D., Gaudet, J. J., Chruszcz, M., Lukasik, S. M., Sugiyama, D., et al. (2006) The tetramer structure of the Nervy homology two domain, NHR2, is critical for AML1/ETO’s activity. Cancer Cell. 9(4): 249–260.

Lo-Coco, F., Avvisati, G., Vignetti, M., Thiede, C., Orlando, S.M., Iacobelli, S., Ferrara, F., Fazi, P., Cicconi, L., Di Bona, E., et al. (2013) Retinoic acid and arsenic trioxide for acute promyelocytic leukemia. N. Engl. J. Med. 369: 111–121.

Lopez, S. M., Hodgson, M. C., Packianathan, C., Bingol-Ozakpinar, O., Uras, F., Rosen, B. P., Agoulnik, I. U. (2013) Determinants of the tumor suppressor INPP4B protein and lipid phosphatase activities. Biochemical and Biophysical Research Communications. 440(2): 277–282.

Löwenberg, B., Downing, J. R., Burnett, A. (1999) Acute myeloid leukemia. The New England Journal of Medicine. 341(14): 1051–1062.

Lugthart, S., Van Drunen, E., Van Norden, Y., Van Hoven, A., Erpelinck, C.A.J., Valk, P.J.M., Beverloo, H.B., Löwenberg, B., Delwel, R. (2008) High EVI1 levels predict adverse outcome in acute myeloid leukemia: Prevalence of EVI1 overexpression and chromosome 3q26 abnormalities underestimated. Blood. 111: 4329–4337.

Lundin, C., Hjorth, L., Behrendtz, M., Nordgren, A., Palmqvist, L., Andersen, M. K., et al. (2012) High frequency of BTG1 deletions in acute lymphoblastic leukemia in children with Down syndrome. Genes, Chromosomes & Cancer. 51: 196–206.

Ma, K., Cheung, S. M., Marshall, A. J., Duronio, V. (2008) PI(3,4,5)P3 and PI(3,4)P2 levels correlate with PKB/akt phosphorylation at Thr308 and Ser473, respectively; PI(3,4)P2 levels determine PKB activity. Cellular Signalling. 20(4): 684–694.

Mandelli, F., Diverio, D., Avvisati, G., Luciano, A., Barbui, T., Bernasconi, C., et al. (1997) Molecular remission in PML/RARα-positive acute promyelocytic leukemia by combined all-trans retinoic acid and Idarubicin (AIDA) therapy. Blood. 90(3): 1014–1021.

Martelli, A. M., Nyåkern, M., Tabellini, G., Bortul, R., Tazzari, P. L., Evangelisti, C., Cocco, L. (2006) Phosphoinositide 3-kinase/Akt signaling pathway and its therapeutical implications for human acute myeloid leukemia. Leukemia. 20(6): 911–28.

Merico, D., Isserlin, R., Stueker, O., Emili, A., Bader, G. D. (2010) Enrichment map: A network-based method for gene-set enrichment visualization and interpretation. PLoS ONE. 5(11).

Metzeler, K. H., Hummel, M., Bloomfield, C. D., Spiekermann, K., Braess, J., Sauerland, M.-C., et al. (2008) An 86-probe-set gene-expression signature predicts survival in cytogenetically normal acute myeloid leukemia. Blood. 112(10): 4193–201.

Norris, F. A., Auethavekiat, V., Majerus, P.W. (1995) The isolation and characterization of cDNA encoding human and rat brain inositol polyphosphate 4-phosphatase. J. Biol. Chem. 270: 16128–16133.

Norris, F. A., Majerus, P. W. (1994) Hydrolysis of phosphatidylinositol 3,4-bisphosphate by inositol polyphosphate 4-phosphatase isolated by affinity elution chromatography. Journal of Biological Chemistry. 269(12): 8716–8720.

Norris, F. A., Atkins, R. C., Majerus, P. W. (1997) The cDNA Cloning and Characterization of Inositol Polyphosphate 4-Phosphatase Type II. Journal of Biological Chemistry. 272(38): 23859–23864.

O’Donnell, M.R., Appelbaum, F.R., Baer, M.R., Byrd, J.C., Coutre, S.E., Damon, L.E., Erba, H.P., Estey, E., Foran, J., Lancet, J., et al. (2006) Acute myeloid leukemia clinical practice guidelines in oncology. J. Natl. Compr. Canc. Netw. 4: 16–36.

Papaemmanuil, E., Gerstung, M., Bullinger, L., Gaidzik, V.I., Paschka, P., Roberts, N.D., Potter, N.E., Heuser, M., Thol, F., Bolli, N., et al. (2016) Genomic Classification and Prognosis in Acute Myeloid Leukemia. N. Engl. J. Med. 374: 2209–2221.

Park, S., Chapuis, N., Tamburini, J., Bardet, V., Cornillet-Lefebvre, P., Willems, L., et al. (2010) Role of the PI3K/AKT and mTOR signaling pathways in acute myeloid leukemia. Haematologica. 95(5): 819–828.

Payrastre, B., Missy, K., Giuriato, S., Bodin, S., Plantavid, M., Gratacap, M. P. (2001) Phosphoinositides: Key players in cell signalling, in time and space. Cellular Signalling. 13(6): 377–387.

Perez-Lorenzo, R., Gill, K. Z., Shen, C.-H., Zhao, F. X., Zheng, B., Schulze, H.-J., et al. (2014) A tumor suppressor function for the lipid phosphatase INPP4B in melanocytic neoplasms. The Journal of Investigative Dermatology. 134(5): 1359–68.

Reimand, J., Arak, T., Vilo, J. (2011) G:Profiler - A web server for functional interpretation of gene lists (2011 update). Nucleic Acids Research. 39: 307–315.

Rijal, S., Fleming, S., Cummings, N., Rynkiewicz, N. K., Ooms, L. M., Nguyen, N. N., et al. (2016) Inositol polyphosphate 4-phosphatase II (INPP4B) is associated with chemoresistance and poor outcome in AML. Blood. 125(18): 2815–2825.

Ross, M. E., Zhou, X., Song, G., Shurtleff, S. A., Girtman, K., Williams, W. K., et al. (2003) Classification of pediatric acute lymphoblastic leukemia by gene expression profiling. Blood. 102(8): 2951–9.

Roth, M. G. (2004) Phosphoinositides in constitutive membrane traffic. Physiological Reviews. 84(3): 699–730.

Rynkiewicz, N. K., Fedele, C. G., Chiam, K., Gupta, R., Kench, J. G., Ooms, L. M., et al. (2015) INPP4B is highly expressed in prostate intermediate cells and its loss of expression in prostate carcinoma predicts for recurrence and poor long term survival. Prostate. 75(1): 92–102.

Salmena, L., Shaw, P., Fan, I., McLaughlin, J.R., Rosen, B., Risch, H., Mitchell, C., Sun, P., Narod, S.A., Kotsopoulos, J. (2015) Prognostic value of INPP4B protein immunohistochemistry in ovarian cancer. Eur. J. Gynaecol. Oncol. 36: 260–267.

Sanz, M. A., Martín, G., Rayón, C., Esteve, J., González, M., Díaz-Mediavilla, J., et al. (1999) A modified AIDA protocol with anthracycline-based consolidation results in high antileukemic efficacy and reduced toxicity in newly diagnosed PML/RARalpha-positive acute promyelocytic leukemia. PETHEMA group. Blood. 94(9): 3015–3021.

Schnitt, S. J. (2010) Classification and prognosis of invasive breast cancer: from morphology to molecular taxonomy. Modern Pathology. 23: Suppl 2, S60–S64.

Shipley, J. L., Butera, J. N. (2009) Acute myelogenous leukemia. Experimental Hematology. 37(6): 649–658.

Slovak, M. L., Kopecky, K. J., Cassileth, P. A., Harrington, D. H., Theil, K. S., Mohamed, A., et al. (2015) Karyotypic analysis predicts outcome of preremission and postremission therapy in adult acute myeloid leukemia: a Southwest Oncology Group/Eastern Cooperative Oncology Group study. Blood. 96(13): 4075–4084.

Sportoletti, P. (2011) How does the NPM1 mutant induce leukemia? Pediatric Reports. 3(2S): 11–13.

Stjernström, A., Karlsson, C., Fernandez, O. J., Söderkvist, P., Karlsson, M. G., Thunell, L. K. (2014) Alterations of INPP4B, PIK3CA and pAkt of the PI3K pathway are associated with squamous cell carcinoma of the lung. Cancer Medicine. 3(2): 337–348.

Sun, Y., Ding, H., Liu, X., Li, X., Li, L. (2014) INPP4B overexpression enhances the antitumor efficacy of PARP inhibitor AG014699 in MDA-MB-231 triple-negative breast cancer cells. Tumor Biology. 35(5): 4469–4477.

Tallman, M.S. (1997) All-trans-Retinoic Acid in Acute Promyelocytic Leukemia. N. Engl. J. Med. 337: 1021–1028.

The Cancer Genome Atlas Research Network. (2012) Comprehensive molecular portraits of human breast tumours. Nature. 490(7418): 61–70.

The Cancer Genome Atlas Research Network. (2013) Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. The New England Journal of Medicine. 368(22): 2059–74.

Toker, A. (2002) Phosphoinositides and signal transduction. Cellular and Molecular Life Sciences. 59(5): 761–779.

Tokunaga, E., Yamashita, N., Kitao, H., Tanaka, K., Taketani, K., Inoue, Y., et al. (2016) Biological and clinical significance of loss of heterozygosity at the INPP4B gene locus in Japanese breast cancer. Breast. 25: 62–68.

Valk, P. J., Verhaak, R. G., Beijen, M. A., Erpelinck, C. A., Barjesteh van Waalwijk van Doorn-Khosrovani, S., Boer, J. M., et al. (2004) Prognostically useful gene-expression profiles in acute myeloid leukemia. N Engl J Med. 350(16): 1617–1628.

van Galen, P., Kreso, A., Mbong, N., Kent, D. G., Fitzmaurice, T., Chambers, J. E., et al. (2014) The unfolded protein response governs integrity of the haematopoietic stem-cell pool during stress. Nature. 510(7504): 268–72.

Vasudevan, K.M., Barbie, D.A., Davies, M.A., Rabinovsky, R., McNear, C.J., Kim, J.J., Hennessy, B.T., Tseng, H., Pochanard, P., Kim, S.Y., et al. (2009) AKT-Independent Signaling Downstream of Oncogenic PIK3CA Mutations in Human Cancer. Cancer Cell. 16: 21–32.

Verhaak, R. G. W., Wouters, B. J., Erpelinck, C. A. J., Abbas, S., Beverloo, H. B., Lugthart, S., et al. (2009) Prediction of molecular subtypes in acute myeloid leukemia based on gene expression profiling. Haematologica. 94(1): 131–4.

Vivanco, I., Sawyers, C. (2002) The phosphatidylinositol 3-Kinase-AKT pathway in human cancer. Nature Reviews Cancer. 2(7): 489–501.

Vyas, P., Norris, F. A., Joseph, R., Majerus, P. W., Orkin, S. H. (2000) Inositol polyphosphate 4-phosphatase type I regulates cell growth downstream of transcription factor GATA-1. PNAS. 97(25): 13696–701.

Walter, R. B., Othus, M., Burnett, A. K., Löwenberg, B., Kantarjian, H. M., Ossenkoppele, G. J., et al. (2014) Resistance Prediction in AML: Analysis of 4,601 Patients from MRC/NCRI, HOVON/SAKK, SWOG, and MD Anderson Cancer Center. Leukemia. 29(2): 312–320.

Wang, Z.Y., Chen, Z. (2008) Acute promyelocytic leukemia: From highly fatal to highly curable. Blood. 111: 2505–2515.

Welch, J. S., Ley, T. J., Link, D. C., Miller, C. A., Larson, D. E., Koboldt, D. C., et al. (2012) The origin and evolution of mutations in acute myeloid leukemia. Cell. 150(2): 264–278.

Westbrook, T. F., Martin, E. S., Schlabach, M. R., Leng, Y., Liang, A. C., Feng, B., et al. (2005) A genetic screen for candidate tumor suppressors identifies REST. Cell. 121(6): 837–848.

Won, J. R., Gao, D., Chow, C., Cheng, J., Lau, S. Y. H., Ellis, M. J., et al. (2013) A survey of immunohistochemical biomarkers for basal-like breast cancer against a gene expression profile gold standard. Modern Pathology. 26(11): 1438–50.

Woolley, J. F., Dzneladze, I., Salmena, L. (2015) Phosphoinositide signaling in cancer: INPP4B AKT(s) out. Trends in Molecular Medicine. 21(9): 530–532.

Yuan, T. L., Cantley, L. C. (2008) PI3K pathway alterations in cancer: variations on a theme. Oncogene. 27(41): 5497–510.

Yuen, J. W. F., Chung, G. T. Y., Lun, S. W. M., Cheung, C. C. M., To, K. F., Lo, K. W. (2014) Epigenetic inactivation of inositol polyphosphate 4-phosphatase B (INPP4B), a regulator of PI3K/AKT signaling pathway in EBV-associated nasopharyngeal carcinoma. PLoS ONE. 9(8): 1–8.

Zhang, L., Zeng, D., Chen, Y., Li, N., Lv, Y., Li, Y., et al. (2016) MiR-937 contributes to the lung cancer cell proliferation by targeting INPP4B. Life Sciences. 155: 110–115.
