Analysis of Somatic Copy Number Gains in Pancreatic Ductal Adenocarcinoma Implicates ECT2 as a Candidate Therapeutic Target

by

Nardin Samuel

A thesis submitted in conformity with the requirements for the degree of Master of Science

Department of Molecular Genetics
University of Toronto

© Copyright by Nardin Samuel 2012

Analysis of Somatic Copy Number Gains in Pancreatic Ductal Adenocarcinoma Implicates ECT2 as a Candidate Therapeutic Target

Nardin Samuel

Master of Science

Department of Molecular Genetics
University of Toronto
2012

Abstract

This study presents an integrated analysis of pancreatic ductal adenocarcinoma (PDAC) for the identification of putative cancer drivers in somatic copy number gains (SCNGs). SCNG data on 60 PDAC genomes were extracted to identify 756 genes, mapping to 20 genomic loci, that are recurrently gained. Through copy number and expression analysis on a panel of 29 human pancreatic cancer cell lines, this gene catalogue was refined to 34 high-confidence PDAC candidate genes. The performance of these genes was assessed in pooled shRNA screens, and only ECT2 showed significant essentiality to cell viability in specific PDAC cell lines with genomic gains at the 3q26.3 locus that harbors this gene. Targeted shRNA-mediated interference of ECT2, as well as pharmacological inhibition, supports the pooled shRNA screen findings. These results favor ECT2 as a candidate target gene for further evaluation in the subset of PDACs presenting with 3q26 somatic copy number gains.


Acknowledgements

First I would like to acknowledge my supervisor and mentor, Dr. Thomas Hudson, for giving me the opportunity to work in his lab and for his immense support, guidance and encouragement. I also thank Dr. Jason Moffat for all of his support and for welcoming me into his lab to learn new techniques and think critically about my work, as well as Azin Sayad and Dr. Kevin Brown from the Moffat Lab, for their willingness to help with this project. I also thank Dr. Fei-Fei Liu and Dr. Brenda Gallie for kindly mentoring me and serving on my supervisory committee. I would also like to thank the entire Hudson lab, especially Mathieu Lemire, for all of his help and support with statistical analyses. At OICR, I also thank Drs. David Uehling, Gennadiy Poda, Rima Al-Awar, Quang Trinh and Lakshmi Muthuswamy for helpful discussions and feedback. I am also very grateful to Dr. Troy Ketela, Kajaal Nagar, Sonali Weerawardane and Jasmyne Carnevale for support with shRNA studies and for accommodating me in the lab. Lastly, I am so grateful for the endless support of my wonderful family and friends. During this work, a championed scientist, Dr. Ralph Steinman, was awarded one of the most prestigious awards in research, a Nobel Prize in Medicine, but passed away from pancreatic cancer before he could be presented with the award. For pancreatic cancer in particular, therapeutic options are limited and it is these types of stories that remind me of the potential impact research can achieve and inspire me to be involved in cancer research.


Table of Contents

Acknowledgements

Table of Contents

List of Tables

List of Figures

List of Appendices

List of Abbreviations

Chapter 1

1 Introduction

1.1 Pancreatic Ductal Adenocarcinoma

1.1.1 Incidence and Mortality

1.1.2 Molecular Biology of Pancreatic Ductal Adenocarcinoma

1.2 Current Therapeutic Options for Pancreatic Ductal Adenocarcinoma

1.2.1 Rationale for Identifying Novel Molecular Targets

1.3 Somatic Mutations in Pancreatic Ductal Adenocarcinoma

1.3.1 Driver vs. Passenger Mutations

1.3.2 Known Driver Mutations in Pancreatic Ductal Adenocarcinoma

1.4 Somatic Copy Number Gains in Human Cancer

1.4.1 Methods for Genome-Wide Detection of Somatic Copy Number Gains

1.4.2 Studies of Structural Mutations in Pancreatic Ductal Adenocarcinoma

1.5 Features of Ideal Therapeutic Targets

1.6 Epithelial cell-transforming oncogene 2 (ECT2)

1.6.1 ECT2 Structure and Function

1.6.2 ECT2 Copy Number Gains and Over-Expression in Human Cancer

Chapter 2

2 Identification of ECT2 as a Candidate Therapeutic Target Gene in Pancreatic Ductal Adenocarcinoma

2.1 Introduction

2.2 Hypothesis

2.3 Project Aims

2.3.1 Identification of Coding Regions of Recurrent Copy Number Gain in Human Pancreatic Ductal Adenocarcinoma

2.3.2 Analysis of Candidate Gene List in an Independent Cohort of Human Pancreatic Ductal Adenocarcinoma Cell Lines

2.3.3 Assembling a Catalogue of Candidate Genes for Further Study

2.3.4 Modulation of Candidate Target Gene by shRNA-Mediated Interference and Pharmacological Approaches

2.4 Materials and Methods

2.4.1 Publicly Available Pancreatic Ductal Adenocarcinoma Genome Datasets

2.4.2 Integrated Analysis of Pancreatic Cancer Genome Datasets

2.4.3 Copy Number Analysis of Candidate Genes in Human Pancreatic Ductal Adenocarcinoma Cell Lines

2.4.4 Analysis of Candidate Genes in Human Pancreatic Ductal Adenocarcinoma Cell Lines

2.4.5 Integrated Analysis of Copy Number and Gene Expression of Candidate Genes to Refine List of Putative Target Genes

2.4.6 Assembly of Pancreatic Ductal Adenocarcinoma Candidate Target Gene Database

2.4.7 Compilation of ‘Druggable Genome’ Database

2.4.8 Integration of RNA-interference Pooled Screen Studies to Identify Candidate Target Gene for Laboratory-Based Study

2.4.9 Tissue Culture and Cell Lines

2.4.10 ECT2 and Control Lentivirus Production

2.4.11 Lentivirus Titration

2.4.12 Cell Viability Assay in shRNA Experiment

2.4.13 Pharmacologic Modulation Assay

2.5 Results

2.5.1 Genomic Regions of Recurrent Somatic Copy Number Gains in Pancreatic Ductal Adenocarcinoma

2.5.2 Integrated Copy Number and Expression Analysis of Candidate Genes

2.5.3 Database of Top-Ranked Candidate Target Genes

2.5.4 Identification of ECT2 for Laboratory Study Through Integration of shRNA Pooled Screen Results

2.5.5 Targeted shRNA Studies of ECT2 in Pancreatic Ductal Adenocarcinoma Cell Lines

2.5.6 Functional Effects of Pharmacologic Inhibition of the ECT2 Pathway on Cell Viability

Chapter 3

3 Discussion

3.1 Pooling Data from Genome-Wide Analyses

3.2 Analysis of Top-Ranked Candidate Genes and Identification of ECT2 as a Putative Target

3.3 Dependence on ECT2 for Cell Viability in Cell Lines Bearing a Genomic Gain at the 3q26 Locus

3.4 Differential Sensitivity to Inhibitors of ECT2-Mediated Cellular Pathway in Cell Lines Bearing Genomic Copy Number Gains at the 3q26 Locus

3.5 Future Directions

3.5.1 Rationale

3.5.2 Specific Aims

References

Appendices


List of Tables

Table 1 Genomic loci encompassed in SCNGs identified in this study

Table 2 Regions of genomic gain identified in this analysis of pancreatic tumors as well as a survey of 26 histological subtypes in human cancer by Beroukhim et al, 2011

Table 3 Database of top-ranked candidate PDAC genes

Table 4 Results of copy number measures for ECT2 in cell lines utilized for targeted shRNA analyses obtained through different computational methods

Table 5 Copy number analysis of pancreatic cancer cell lines in Barretina J, et al. 2012

Table 6 Comparison of targeted shRNA analysis with results from pooled shRNA screen


List of Figures

Figure 1 ECT2 structure

Figure 2 ECT2 is mislocalized to the cytoplasm of primary non-small cell lung cancer tumors

Figure 3 Number of genes encompassed in genomic gains in multiple datasets

Figure 4 Number of genomic loci gained when assessing the datasets exclusive of the OICR dataset (AJH) and inclusive of the OICR dataset (AJHO)

Figure 5 Bioinformatic approach to identifying genes for further analysis

Figure 6 Circos plot depicting common regions of genomic gains

Figure 7 Comparison of 20 loci identified in this study with other pancreatic copy number studies in the literature

Figure 8 Peak regions of genomic amplification identified in a survey of 3,131 tumor specimens belonging broadly to 26 histological subtypes

Figure 9 Mean probe intensity for assigning continuous copy number measure

Figure 10 Association between Sum of Ranks and Spearman Rank Correlation Coefficient for PDAC genes

Figure 11 Representative copy number and gene expression correlation plots

Figure 12 Distribution of correlation coefficients in the top 5% most highly correlated genes in comparison to multiple simulations of random sets of genes

Figure 13 shRNA pooled screen results for top-ranked candidate genes

Figure 14a Comparison of essentiality scores of ECT2 in PDAC cell lines with copy number gains and cell lines in which ECT2 is diploid

Figure 14b Comparison of essentiality scores of PDAC essential genes with copy number gain

Figure 15 shRNA pooled screen results for ECT2

Figure 16 Comparison of Mean Probe Intensity (MPI) copy number estimation approach with Circular Binary Segmentation (CBS) for copy number estimation of ECT2

Figure 17a-e Copy number plots for ECT2

Figure 18a-j Targeted shRNA-mediated interference of ECT2 in PDAC cell lines

Figure 19 Targeted shRNA experiment results

Figure 20 Comparison of targeted shRNA-mediated ECT2 interference with shRNA pooled screen results

Figure 21 Pharmacological modulation of ECT2-mediated oncogenesis

Figure 22 Treatment of PDAC cell lines with PLK1 inhibitors


List of Appendices

Table A1 Focal somatic copy number gains in pancreatic ductal adenocarcinoma in the literature

Table A2 Public pancreatic cancer genome datasets utilized in copy number gain analysis

Table A3 Cell lines utilized in integrated analysis in this study

Table A4 The RNAi Consortium (TRC) shRNA constructs

Table A5 Puromycin concentrations used in shRNA experiments

Table A6 Details of PLK1 compounds utilized in pharmacologic assay


List of Abbreviations

aCGH Array comparative genomic hybridization

AFG3L2 AFG3 ATPase family gene 3-like 2

ATCC American Type Culture Collection

BAC Bacterial artificial chromosome

bps Base pairs

BRAF v-raf murine sarcoma viral oncogene homolog B1

BRCT BRCA1 C-terminal domain

CBS Circular binary segmentation

DH Dbl homology domain

DMEM Dulbecco’s Modified Eagle’s Medium

ECT2 Epithelial Cell-Transforming Oncogene 2

FAK Focal adhesion kinase

FISH Fluorescence in situ hybridization

FBS Fetal bovine serum

GARP Gene Activity Rank Profile

GBM Glioblastoma multiforme

GDP Guanosine diphosphate

GEF Guanine nucleotide exchange factor

GFP Green fluorescent protein

GTP Guanosine triphosphate

GTPase Guanosine triphosphate hydrolase

HEK293T Human embryonic kidney 293 SV40 large T-antigen

HER2 Human Epidermal Growth Factor Receptor 2

IMDM Iscove’s Modified Dulbecco’s Medium

JHSF Japan Health Sciences Foundation

kb Kilobase pairs

KRAS V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog

LACZ Gene Z of lac operon

LSCC Lung squamous cell carcinoma

LUC Luciferase

Mb Megabase

MITF Microphthalmia-associated transcription factor

MELK Maternal embryonic leucine zipper kinase

MPI Mean Probe Intensity

mRNA Messenger ribonucleic acid

MYC v-myc myelocytomatosis viral oncogene homolog

NKX2-1 NK2 homeobox 1

NLS Nuclear localization sequence

nM Nanomolar

nm Nanometer

NSCLC Non-small cell lung cancer

p53 Tumor protein 53

PALB2 Partner and localizer of BRCA2

PBS Phosphate-buffered saline

PCR Polymerase chain reaction

PDAC Pancreatic Ductal Adenocarcinoma

PH Pleckstrin-homology domain

PLK1 Polo-like kinase 1

PSMD1 Proteasome 26S subunit, non-ATPase, 1

RALY RNA binding protein, autoantigenic (hnRNP-associated with lethal yellow homolog)

RhoGEF Rho Guanine nucleotide exchange factor

RNAi RNA-interference

RPMI Roswell Park Memorial Institute Medium

RPS15 Ribosomal protein S15

RT Reverse transcriptase

SCNA Somatic Copy Number Alteration

SCNG Somatic Copy Number Gain

shARP shRNA Activity Rank Profile

shRNA Short hairpin RNA

SNRPD1 Small nuclear ribonucleoprotein D1

UCSC University of California Santa Cruz

VCP Valosin-containing protein

VST Variance-Stabilized Transformation

WNK1 WNK lysine deficient protein kinase 1

WST1 Water soluble disulfonated tetrazolium (4-[3-(2-methoxy-4-nitrophenyl)-2-(4-nitrophenyl)-2H-5-tetrazolio]-1,3-benzene disulfonate)

XRCC1 X-ray repair complementing defective repair in Chinese hamster cells 1 domain

z-GARP z-Normalized Gene Activity Rank Profile


Chapter 1

1 Introduction

1.1 Pancreatic Ductal Adenocarcinoma

1.1.1 Incidence and Mortality

Pancreatic ductal adenocarcinoma (PDAC) is the fourth leading cause of cancer-related mortality in North America (Jemal et al., 2010). Patients with PDAC present with the most dismal prognosis of all solid tumors and this fact has remained unchanged over the past 50 years despite advances in the molecular understanding of pancreatic cancers (Jemal et al., 2010; Yeo et al., 2002). Among those patients who are diagnosed with pancreatic cancer, only 20% present with a tumor in situ and are thus candidates for surgical resection with curative intent (Li et al., 2004). The survival rate is only 2% for the majority of patients who present with metastatic disease, indicating that the incidence of this malignancy approximates its mortality (Jemal et al., 2010).

1.1.2 Molecular Biology of Pancreatic Ductal Adenocarcinoma

Histologically differentiated pancreatic ductal adenocarcinomas (PDACs) comprise >90% of exocrine pancreatic malignancies (Kloppel, 1998). Other pancreatic neoplasms such as undifferentiated tumors, acinar cell carcinomas and cystadenomas, along with endocrine pancreatic tumors are rare; hence, these tumor types are not within the scope of this thesis. Hereafter, use of the term ‘pancreatic cancer’ will refer exclusively to PDACs. The origin of pancreatic tumors has been the subject of much debate, as identification of the pancreatic cells that are transformed into malignant lesions has been challenging. In an attempt to characterize the origins of pancreatic tumors, a progression model of neoplasia has been posited whereby precursor lesions give rise to malignant tumors. Precursor lesions include pancreatic intraepithelial neoplasms (PanINs), intraductal papillary mucinous neoplasms (IPMNs) and mucinous cystic neoplasms. Among these lesions, PanINs are the best characterized both genetically and histologically. The first and most widely accepted progression model proposes that PanINs arise from normal pancreatic ductal epithelium and progressively give rise to carcinoma in situ, then invasive pancreatic cancer (Hruban et al., 2000).

Only a small subset of precursor lesions tends to be at risk of progressing to an invasive phenotype. The underlying genetic characteristics of tumors might ultimately aid in the identification of precursor lesions, potentially through the discovery of genetic mutations that are common to both PanINs and the primary tumor. At the transcriptional level, high-grade PanINs have demonstrated differential mRNA expression similar to that of PDACs, in comparison with normal pancreatic ductal epithelium and acinar cells (Buchholz et al., 2005). Changes in gene expression suggest that stage 2 PanINs might represent the earliest preneoplastic lesions and stage 3 PanIN lesions have a transcriptional profile nearly identical to that of PDACs (Buchholz et al., 2005).

1.2 Current Therapeutic Options for Pancreatic Ductal Adenocarcinoma

The current standard therapy for both locally advanced pancreatic cancer and metastatic pancreatic cancer involves a chemotherapeutic regimen with gemcitabine, a cytotoxic nucleoside analogue. However, the clinical benefit derived from systemic gemcitabine therapy is a meager average survival of less than 6 months, which is the median survival without therapy (Burris et al., 1997). Combinations of gemcitabine and other chemotherapeutic agents have consistently failed to yield statistically or clinically meaningful improvements in survival (Moore et al., 2007). Only in the past year has a new therapeutic regimen, FOLFIRINOX (a combination of fluorouracil, folinic acid, irinotecan and oxaliplatin), contested the routine use of gemcitabine in PDAC. In the ACCORD 11 trial led by Conroy and colleagues, the median overall survival increased from 6.8 months with standard gemcitabine therapy to 11.1 months with FOLFIRINOX in patients with advanced metastatic disease (Conroy et al., 2011). This finding represents the highest response rate ever reported in a phase III clinical trial in patients with pancreatic cancer. However, FOLFIRINOX is highly toxic and safety information regarding this regimen is limited. Despite this potential shift in the management of pancreatic cancer, the median survival that can be achieved with therapy remains short, and the standard remains to administer the same therapy to all patients, irrespective of the underlying tumor genetics.

Treating patients with pancreatic cancer is a challenge for numerous reasons. The poor prognosis of this disease and its resistance to chemotherapies can be partially attributed to diminished drug delivery secondary to the dense stromal barriers that surround the tumor, which decrease the amount of vasculature available for drug delivery to the tumor site (Olive et al., 2009). Indeed, one of the most prominent features of pancreatic cancer is the extensive stromal reaction, which comprises up to 90% of the tumor volume (Chu et al., 2007; Neesse et al., 2011). One review highlights other potential factors that might underlie chemoresistance (Wang et al., 2011), such as the transformation of epithelial cells to a mesenchymal phenotype (Wang et al., 2009; Shah et al., 2007). In addition, studies suggest that pancreatic cancer stem cells might be involved in resistance to chemotherapies; however, the mechanism by which they confer resistance remains unclear (Hermann et al., 2007; Hong et al., 2009). Our understanding of chemoresistance is complicated by the fact that marked molecular genetic heterogeneity exists among primary tumor cells and among those cells that are capable of producing metastases (Campbell et al., 2010). Indeed, the lack of success of current clinical interventions for PDACs can be partly attributed to the heterogeneity of molecular abnormalities among patients’ tumors, which leads to differing patient responses to standard cytotoxic therapies. In addition, the various histological subtypes of PDACs are associated with distinctive genetic features and also exhibit different prognoses (Hong et al., 2011a).


1.2.1 Rationale for Identifying Novel Molecular Targets

As previously mentioned, genomic studies of pancreatic cancer are uncovering substantial heterogeneity in the genetic alterations of this disease (Li et al., 2004). Parallel studies also demonstrate that this heterogeneity exists not only among different patients, but also among different tumors and metastatic lesions from the same patient (Campbell et al., 2010). This observation underscores the need for a mutation-targeted approach to cancer therapeutics, as it is clear that patients have differing dominant genetic alterations that distinguish their respective malignancies. Since mutated cancer genes have an essential role in malignant transformation, they are excellent targets to be exploited for drug therapy (Stratton, 2011). One study demonstrated the success of applying principles of personalized medicine to PDAC in particular: global genomic analysis was used to profile a patient’s tumor, and a chemotherapeutic agent was subsequently administered on the basis of the tumor’s mutational characteristics. The patient was treated with mitomycin C (a DNA-damaging agent) on the basis of the substantial activity of DNA-damaging agents observed in a xenograft generated from the patient’s primary tumor. Exomic sequencing revealed biallelic inactivation of the PALB2 gene (Villarroel et al., 2011). This mutation provided the basis for the patient’s response to mitomycin C: PALB2 is involved in DNA repair, and its loss confers a growth advantage on the tumor yet renders it susceptible to agents that damage DNA, which antagonize a mechanism that enables the tumor to thrive. Taken together, these observations demonstrate that the rational approach of identifying mutations and gearing therapy towards genetic features holds promise for novel cancer therapeutics.


1.3 Somatic Mutations in Pancreatic Ductal Adenocarcinoma

1.3.1 Driver vs. Passenger Mutations

Although genetic mutations that are inherited in the germline can lead to familial cancer syndromes, accumulation of somatic mutations, in the presence or absence of inherited baseline susceptibility, is thought to drive and propagate the neoplastic process (Harris and McCormick, 2010). Only 5% of patients diagnosed with pancreatic cancer have a familial form of the disease, which underscores the importance of understanding the role of somatic changes in driving cellular transformation and developing therapies accordingly (Lynch et al., 1996). Although a wealth of information on somatic mutations in a range of cancer genomes has been generated as a result of the latest advances in sequencing technologies, uncertainties remain in distinguishing the molecular events that drive cancer progression from the so-called ‘passenger’ mutations that are present in the tumor but do not drive tumorigenesis. By contrast, ‘driver’ mutations promote tumor growth, are positively selected, and are central to cancer development. Passenger mutations may be randomly distributed in the genome, whereas driver mutations occur in a subset of genes in a non-random pattern. However, this notion might not be valid in all instances. For example, passenger mutations might arise as a result of an increased mutation rate at a given genomic locus, and such localized, nonrandom mutations could be erroneously identified as potential driver mutations. Nonetheless, large-scale sequencing of cancer genomes is yielding highly comprehensive genomic datasets that have led to the identification of driver mutations. These datasets have been used to successfully distinguish deletions that encompass tumor suppressor genes, which are driver mutations, from passenger deletions at fragile sites in the genome (Bignell et al., 2010). Fragile sites are particularly prone to breakage and mutation, and nonrandom mutations occurring at such sites are most often passenger mutations. This example demonstrates the importance of downstream characterization studies of implicated genes and genomic regions to delineate those somatic mutations that are drivers, so that they can be targeted therapeutically and/or used as clinical biomarkers.


1.3.2 Known Driver Mutations in Pancreatic Ductal Adenocarcinoma

A subset of genes that can drive cancer formation has now been well characterized in a model of PDAC development. In this progression model, activating mutations in the KRAS oncogene are nearly universal, occurring in almost 100% of PDAC tumors (Klimstra and Longnecker, 1994; Rozenblum et al., 1997). Highly oncogenic single base-pair substitutions in codon 12 of KRAS can drive pancreatic cancer and are indeed hallmark features of PDAC. This mutation is involved in cell proliferation, inhibition of apoptosis, interference with cellular cohesion and enhanced metastatic properties that might partly result from activation of FAK and decreased expression of E-cadherin (Rachagani et al., 2011). However, mutations in KRAS and enhanced activation of the Ras–Raf–MAPK signaling pathway (through which KRAS acts) are alone insufficient for malignant transformation. Mutations in KRAS occur even in subsets of PanINs and IPMNs that do not progress to invasive malignancies (Hruban et al., 2000). Activating mutations in other oncogenes such as BRAF, AKT2 and MYB have also been reported in pancreatic cancer (Calhoun et al., 2003; Cheng et al., 1996; Ruggeri et al., 1998; Wallrapp et al., 1997). Somatic mutations in tumor suppressor genes such as CDKN2A (also known as p16), TP53 (also known as p53) and SMAD4 (also known as DPC4) have been identified as driver mutations occurring at high frequency in PDACs (Caldas et al., 1994; Ruggeri et al., 1992; Hahn et al., 1996). The tumor protein p53 has several essential roles in the progression of the cell cycle, apoptosis and DNA repair (Vogelstein and Kinzler, 2004). Functional inactivation of p53 enables cell proliferation to occur in spite of DNA damage and subsequently facilitates tumor progression (Vogelstein and Kinzler, 2004). Cyclin-dependent kinase inhibitor 2A (CDKN2A) interacts with cyclin-dependent protein kinases such as CDK4 and CDK6 to inhibit progression of the cell cycle at the G1/S checkpoint (Maitra and Hruban, 2008). SMAD4 has an essential role in the transforming growth factor β (TGF-β) cellular pathway. Inactivation or deletion of SMAD4 results in loss of SMAD4-dependent TGF-β signaling, which leads to aberrant cellular proliferation (Hong et al., 2011b). It is important to note that while these driver mutations are characteristic of PDAC and occur at high frequency in PDACs, they are all observed across many other cancer types, indicating that some driver mutations may be universal in cancer but other less frequent, tumor-specific alterations may drive cancer progression.


1.4 Somatic Copy Number Gains in Human Cancer

Among the different types of somatic genetic alterations that contribute to cancer development, somatic copy number alterations (SCNAs) are highly common in cancer (Baudis, 2007; NCI/NCBI, 2001). Study of genes that are potentially affected by SCNAs, such as copy number gains and losses, can inform pancreatic cancer pathogenesis and may hold promise for prospective drug development efforts aimed at improved clinical management of PDAC. In particular, gene targets of somatic copy number gains (SCNGs) are of interest since their expression is likely to be upregulated due to increased copy number, or gene dosage, and they may thus be amenable to selective targeting through pharmacological or genetic modulation. Copy number amplification refers to high-level SCNGs, with an increase of five or more copies of a DNA segment less than 20Mb in length (Brodeur, 1998). Many examples in the literature demonstrate the success of studying SCNGs in identifying cancer-driving genes and associated therapeutic approaches. In one study that integrated genome-wide maps of copy number alterations in melanoma with gene expression signatures, the MITF gene was identified as a critical melanoma oncogene (Garraway et al., 2005). Similarly, a large-scale study of copy number alterations in primary lung adenocarcinomas identified recurrent SCNGs in known lung adenocarcinoma loci, but also identified a highly recurrent 14q13.3 amplification encompassing NKX2-1 (Zender et al., 2006). Further downstream functional analyses substantiated the finding that NKX2-1 is a proto-oncogene in lung adenocarcinoma. These studies, among others, demonstrate the utility of studying SCNGs in human cancer for the identification of genes that are critical to cancer development and progression.
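The working definition of amplification given above (an increase of five or more copies over a segment shorter than 20Mb) can be expressed as a simple rule. The following Python sketch is illustrative only and is not part of the analysis pipeline used in this thesis: the `Segment` type, its field names, the `classify_gain` function and the assumed diploid baseline of two copies are all hypothetical conveniences introduced for the example.

```python
from dataclasses import dataclass

# Thresholds taken from the definition quoted in the text (Brodeur, 1998).
MIN_COPY_INCREASE = 5                 # "five or more copies"
MAX_AMPLICON_LENGTH_BP = 20_000_000   # "less than 20Mb in length"


@dataclass(frozen=True)
class Segment:
    """Hypothetical record for one genomic segment with an estimated copy number."""
    chrom: str
    start: int      # start coordinate in base pairs (inclusive)
    end: int        # end coordinate in base pairs (exclusive)
    copies: float   # estimated absolute copy number of the segment

    @property
    def length(self) -> int:
        return self.end - self.start


def classify_gain(segment: Segment, baseline: float = 2.0) -> str:
    """Label a segment as 'no gain', 'gain' or 'amplification'.

    Assumes a diploid baseline (two copies) unless another value is supplied.
    """
    increase = segment.copies - baseline
    if increase <= 0:
        return "no gain"
    if increase >= MIN_COPY_INCREASE and segment.length < MAX_AMPLICON_LENGTH_BP:
        return "amplification"
    return "gain"
```

For example, under these assumptions a 1Mb segment present at eight copies would be labeled an amplification, whereas the same segment at three copies, or an eight-copy segment spanning 25Mb, would be labeled a gain.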

1.4.1 Methods for Genome-Wide Detection of Somatic Copy Number Gains

Molecular detection of SCNGs has advanced significantly over the past decade to enable identification of regions of genomic gain with higher resolution. Comparative genomic hybridization (CGH) detects and maps DNA sequence SCNAs throughout the genome. Conventional CGH techniques for identifying SCNAs, such as chromosomal CGH utilizing metaphase chromosomes, permitted detection of SCNAs with limited resolution (10-20Mb). Array-based CGH (aCGH) enabled further localization of SCNAs at the cytoband level. In aCGH, or DNA microarrays, relative copy number is measured in specific genomic regions represented by arrays of mapped clones or oligonucleotides. Fluorescently labeled test and reference DNAs are hybridized to the array and the resulting ratio of the fluorescence intensities at each locus approximates the ratio of the copy numbers of the corresponding DNA sequences in the test and reference genomes (Albertson, 2006). In contrast to chromosomal CGH, DNA microarray resolution is determined by the spacing of the array elements, and current SNP-based arrays permit enhanced resolution through use of highly dense array elements (Pinkel and Albertson, 2005). Fluorescence in situ hybridization (FISH) is a cytogenetic method to detect SCNGs, particularly high-copy amplifications (≥10 copies). Nucleic acid probes labeled with a fluorochrome-conjugated nucleotide can be detected by fluorescently-labeled molecules. The labeled probe is then hybridized to tissue or metaphase chromosomes and the nucleic acid sequence is visualized by fluorescence microscopy (Albertson, 2006). FISH has traditionally been the method of choice for clinical diagnostics related to SCNA detection. Bacterial artificial chromosome (BAC) end sequencing is another, less widely used method to measure genomic aberrations. A BAC library is constructed from a test genome, BAC end sequences are obtained, and the end-sequence pairs are mapped onto a reference genome. Copy number is inferred from the density of BAC end sequences and from BAC end pairs that map abnormally far apart on the reference genome sequence (Volik et al., 2003). More recently, with advances in sequencing technologies and their mainstream use in genomics research, high-throughput sequencing has been used as a tool to detect not only base-pair level mutations, but also complex structural rearrangements and SCNAs. In comparison to DNA microarrays, sequence read alignment has been shown to have comparable power to detect SCNAs and has over twofold improved precision for localizing SCNA breakpoints to within 1kb (Chiang et al., 2009).
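The aCGH principle described above, in which the test-to-reference fluorescence intensity ratio at a locus approximates the ratio of copy numbers, can be illustrated with a short calculation. The Python sketch below is a simplified illustration under stated assumptions: the function names are hypothetical, the reference genome is assumed to carry two copies at the locus, and real analyses additionally require normalization, correction for dye bias and tumor purity, and segmentation, none of which is modeled here.

```python
import math


def log2_ratio(test_intensity: float, reference_intensity: float) -> float:
    """Return the log2 ratio of test to reference fluorescence intensity at one locus."""
    if test_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(test_intensity / reference_intensity)


def estimated_copy_number(
    test_intensity: float,
    reference_intensity: float,
    reference_copies: float = 2.0,
) -> float:
    """Estimate the test copy number from the intensity ratio.

    Assumes the intensity ratio equals the copy number ratio and that the
    reference genome carries `reference_copies` copies at this locus.
    """
    if test_intensity <= 0 or reference_intensity <= 0:
        raise ValueError("intensities must be positive")
    return reference_copies * test_intensity / reference_intensity
```

Under these idealized assumptions, a locus whose test intensity is 1.5 times the reference intensity corresponds to a log2 ratio of about 0.58 and an estimated three copies, that is, a single-copy gain relative to a diploid reference.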

1.4.2 Studies of Structural Mutations in Pancreatic Ductal Adenocarcinoma

Chromosomal instability, manifesting as SCNAs and structural genomic alterations, is highly characteristic of PDAC (Campbell et al., 2010). Various techniques for whole-genome analysis have led to the identification of many regions of genomic gain and loss in pancreatic cancer and these SCNAs likely harbor genes that are involved in PDAC progression as a result of their high recurrence and localization in known loci of cancer genes (for example, the 8q24.3 locus harboring the MYC oncogene). Genes affected by SCNAs have indeed been shown to have a role in PDAC progression at the cellular level in further functional examinations.

Various techniques for whole-genome analysis have led to the identification of many regions of SCNG in pancreatic cancer (Appendix Table A1). One study of genome-wide copy number analysis in PDAC combined array-based comparative genomic hybridization (CGH) results from 24 pancreatic cancer cell lines to home in on a recurrently amplified region, 7q21.3-22.1 (Suzuki et al., 2008). This strategy enabled the identification of a candidate oncogene, SMURF1, through various biological validation studies, including knockdown of this gene and immunohistochemical assays (Suzuki et al., 2008). Another study, which used representational oligonucleotide microarray analysis and subsequent biological validation methods such as quantitative PCR and fluorescence in situ hybridization, reported copy number gain and overexpression of the transcription factor GATA6 in pancreatic carcinoma (Fu et al., 2008). Various other studies have identified gains in genetic copy number at the 18q11.2 locus, which contains GATA6 (Heidenblad et al., 2004; Holzmann et al., 2004; Kitoh et al., 2005; Loukopoulos et al., 2007). A follow-up study characterized the putative mechanism through which GATA6 contributes to PDAC tumorigenesis during progression of PanINs through the Wnt signaling pathway (Zhong et al., 2011). These examples demonstrate how focused analysis of genes harbored in SCNAs can lead to identification of genes that are involved in PDAC development and might therefore serve as candidate biomarkers or therapeutic targets. One study utilized next-generation sequencing to survey structural genetic changes in 13 pancreatic cancer samples and identified a novel structural alteration – fold-back inversions (Campbell et al., 2010). Fold-back inversions are copy number alterations whereby a genomic region is duplicated but the two copies re-join in an abnormal head-to-head position in the opposite orientation to the amplification breakpoint.
Due to the nature of the orientation, this type of structural genomic aberration could have only been detected through anomalous mapping of sequencing reads. This study demonstrated that fold-back inversions occur early during the development of pancreatic cancer and frequently trigger amplification of genes that can drive cancer progression.

1.5 Features of Ideal Therapeutic Targets

The need for identification of novel therapeutic targets for PDAC is clear. Moreover, putative cancer-driving genes may be harbored in SCNGs, indicating that therapies targeting genes in SCNGs may attenuate cancer growth. Features of an ideal drug target include: 1) The target plays an essential role in cancer genesis or maintenance of the cancer phenotype; 2) The target is overexpressed in cancer cells, and this over-expression is associated with a biomarker; 3) Inhibition of the target’s expression induces growth suppression and/or apoptosis in cancer cells; 4) The target is ‘druggable’, meaning that it is amenable to inhibition by a small molecule or specific antibody; and 5) The target is not expressed, or expressed at very low levels, in normal cells and its inhibition has minimal effect on normal cell growth and function (Sun, 2006). Toward this end, the model of targeting identified genetic changes has proven effective in various examples. One archetypal example is Herceptin™ (Trastuzumab), a therapeutic antibody that targets the protein encoded by amplified HER2, which has greatly impacted the prognoses of patients with breast cancer bearing this genomic feature (Esteva et al., 2010). In another example, a single base-pair substitution (that results in the amino-acid substitution Val600Glu) in the BRAF oncogene, which encodes a serine-threonine kinase, has successfully been the target of drug development. Vemurafenib, a BRAF-targeted agent, was recently approved for use in patients with metastatic melanoma by the Food and Drug Administration (FDA). Indeed, the rational approach of identifying mutations and gearing therapy towards genetic features holds promise for enhanced cancer therapeutics.

1.6 Epithelial cell-transforming oncogene 2 (ECT2)

1.6.1 ECT2 Structure and Function

The candidate therapeutic target identified in this study, Epithelial cell-transforming sequence 2 oncogene (ECT2), is a member of the guanine nucleotide exchange factor (GEF) family, whose members catalyze the exchange of GDP for GTP, thereby activating Rho guanosine triphosphatases (GTPases) in signal transduction (Fields and Justilien, 2010). GTPases function as molecular switches regulating numerous signaling pathways that are involved in actin cytoskeleton remodeling, cell motility, cell adhesion, cell cycle progression and gene expression (Fields and Justilien, 2010). Because GTPases cycle between an active GTP-bound state and an inactive GDP-bound state, GEFs regulate GTPase activity. Mammalian ECT2 was first isolated as a proto-oncogene from a murine keratinocyte cDNA expression library (Miki et al., 1993). This gene is highly evolutionarily conserved. The Drosophila ortholog of ECT2, Pbl, was discovered prior to the identification of the mammalian gene, and was found to function as a Rho-GEF required for cytokinesis (Schumacher et al., 2004). ECT2 orthologs have also been identified in C. elegans (Let-21) and in Xenopus (XECT2), and share similarities with human ECT2 throughout their coding sequence (Dechant and Glotzer, 2003; Tatsumoto et al., 2003). Human ECT2 consists of an 883 amino acid chain with several protein domains (Figure 1 [Fields and Justilien, 2010]). The N-terminal regulatory domain contains sequences that are highly homologous to cell cycle control and repair proteins (Saito et al., 2003). The adjacent XRCC1 domain is homologous to the human XRCC1 protein involved in DNA repair and sister chromatid exchange (Thompson et al., 1990). The Cyclin B6 domain shows homology to Clb6, a yeast protein involved in the G1-to-S phase cell-cycle progression. Adjacent are two repeating BRCT (Breast Cancer Gene 1 Carboxyl-terminal) motifs. These motifs are highly conserved in DNA repair proteins and proteins involved in cell-cycle progression (Bork et al., 1997; Callebaut and Mornon, 1997). The center of the protein contains a small central (S) domain harboring two nuclear localization sequences (NLSs) which may be involved in ECT2 nuclear localization. The C-terminus of ECT2 contains its catalytic component consisting of a Dbl-homology (DH) and a pleckstrin-homology (PH) domain which are responsible for the function of ECT2 as a RhoGEF. The adjacent C-terminal (C) region of ECT2 does not exhibit significant homology to any known protein domains.

Figure 1. ECT2 protein structure (Fields and Justilien, 2010; Permission to re-use obtained; License No: 2891430429871). N, Amino-terminal region; XRCC1, X-ray repair complementing defective repair in Chinese hamster cells 1 domain; Cyclin B6, cyclin B6-like domain; BRCT, BRCA1 C-terminal domain; S, small central region; NLS, nuclear localization sequence; DH, Dbl-homology domain; PH, pleckstrin-homology domain; C, Carboxyl-terminal region.

With regard to intracellular localization and expression, ECT2 expression is controlled throughout the cell cycle. ECT2 remains confined to the nucleus during interphase, and translocates into the cytoplasm following the disappearance of the nuclear envelope at the onset of mitosis. During metaphase, ECT2 is localized to the mitotic spindles, the cleavage furrow during telophase, and the mid-body as cytokinesis ceases (Tatsumoto et al., 1999). Analysis of mRNA expression patterns reveals ECT2 is expressed in adult tissues such as kidney, liver, spleen, testis, lung, bladder, ovary and the brain, as well as fetal tissues such as the liver, thymus, epithelial lining of the nasal cavity and gut, tooth primordia, costal cartilage, heart, lung and pancreas (Miki et al., 1993; Saito et al., 2003). Several studies have demonstrated the essential role of ECT2 and its orthologs in cytokinesis. In Drosophila, the ortholog of ECT2, Pbl, activates Rho1 to promote cytokinesis (Prokopenko et al., 1999). Similarly, Let-21, the C. elegans ortholog of ECT2, is required for formation of the cleavage furrow (Dechant and Glotzer, 2003). In addition, perturbation of mammalian ECT2 results in failure of cytokinesis, as observed by the accumulation of multinucleated cells (Kim et al., 2005; Liu et al., 2004; Tatsumoto et al., 1999). There is also strong evidence in the literature suggesting ECT2 is involved in cell polarity and asymmetrical cell division. Apical-basal polarity in epithelial cells is regulated by the Par complex which consists of Par-6/Par-3 (partition-defective)/atypical protein kinase C and small GTPases. One study reported that ECT2 is detectable at cell junctions where it directly interacts with Par6 and PRKCζ and regulates the activity of the latter (Liu et al., 2004). ECT2 has been implicated in regulating the RhoGTPase Cdc42 and attachment of spindle microtubules to kinetochores during metaphase (Oceguera-Yanez et al., 2005). Taken together, it is clear that ECT2 functions as a GEF for RhoGTPases.

1.6.2 ECT2 Copy Number Gains and Over-Expression in Human Cancer

ECT2 has been identified as an oncogene in human cancer and its role in cancer has been linked to genomic amplification and upregulated expression. The first study identifying ECT2 as a proto-oncogene demonstrated it was capable of transforming fibroblasts (Miki et al., 1993). Since then, ECT2 has been reported to be over-expressed in several human tumors including brain (Salhia et al., 2008; Sano et al., 2006), lung (Hirata et al., 2009; Justilien and Fields, 2009), bladder (Saito et al., 2004), esophageal (Hirata et al., 2009), pancreatic (Zhang et al., 2008), and ovarian cancer (Saito et al., 2004). ECT2 is over-expressed at both the mRNA and protein levels in non-small cell lung cancer (NSCLC) cell lines and primary tumors. Interestingly, immunohistochemical analysis showed that ECT2 is localized in the nucleus in normal lung tissue, but also appears to translocate to some extent to the cytoplasm in primary NSCLC. An independent analysis validated these findings by showing cytoplasmic ECT2 staining in approximately 84% of primary NSCLC (Figure 2; [Justilien and Fields, 2009]). These findings were also reproduced in glioblastoma multiforme (GBM), whereby ECT2 was found over-expressed and mislocalized to the cytoplasm in comparison with normal brain tissue and low-grade astrocytomas (Salhia et al., 2008).

Figure 2. ECT2 is mislocalized to the cytoplasm of primary non-small cell lung cancer tumors (Fields and Justilien, 2010. Permission to re-use obtained; License No: 2891430429871). Immunohistochemical staining of ECT2 in normal human lung epithelium (left) and primary lung adenocarcinoma (right) reveals ECT2 is primarily localized to the nucleus in normal lung tissue but localizes to both the nucleus and cytoplasm of primary tumor cells.

The 3q26 locus which harbors ECT2 has been reported to be the target of frequent chromosomal alterations in human cancer (Lin et al., 2006; Meyer et al., 2007). In lung squamous cell carcinoma (LSCC), ECT2 mRNA expression correlates with ECT2 copy number gains, indicating that ECT2 amplification may be driving its over-expression in LSCCs (Justilien and Fields, 2009). In addition, an estimated 40% of esophageal squamous cell carcinoma (ESCC) tumors bear 3q26 amplifications (Yang et al., 2008; Yen et al., 2005). ECT2 was reported to be over-expressed in ovarian tumors that harbor ECT2 copy number gains in comparison to normal ovary tissue (Haverty et al., 2009). 3q26 amplifications encompassing ECT2 have also been observed in head and neck squamous cell carcinoma as well as cervical squamous cell carcinoma (Heselmeyer et al., 1997). Taken together, evidence in the literature strongly suggests that ECT2 over-expression may be driven by tumor-specific amplification of the 3q26 amplicon which harbors ECT2. However, ECT2 amplification is likely not the only mechanism underlying tumor-specific ECT2 over-expression.

While the exact mechanisms of ECT2-mediated oncogenesis remain unclear, studies in NSCLC and GBM suggest that ECT2 is important for proliferation, migration and invasion. The oncogenic role of ECT2 is distinct from its normal physiological role in cytokinesis, and in NSCLC, the role of ECT2 in cellular transformation appears to be related to its cytoplasmic mislocalization.

Chapter 2

2 Identification of ECT2 as a Candidate Therapeutic Target Gene in Pancreatic Ductal Adenocarcinoma

N.B. Contributions: Microarray experiments for gene expression and copy number on 29 PDAC cell lines were performed at the University Health Network (UHN) Microarray Center through the laboratory of Dr. Jason Moffat. Azin Sayad assisted with processing copy number data through the GPHMM algorithm as well as representation of shRNA pooled screen data in Figure 15.

2.1 Introduction

Current therapeutic options for pancreatic ductal adenocarcinoma are limited and do not confer any improvement in overall disease progression or survival for the majority of patients. The goal of this project was to integrate genomic data from primary PDACs to identify genes that are targets of recurrent genomic mutation, and subsequently study the oncogenic potential of their mutation in PDAC tumorigenesis. Among the myriad of genetic mutations that can occur in human cancer, structural genomic mutations, resulting from chromosomal instability, are characteristic of PDAC (Campbell et al., 2010). Such aberrations can manifest as various somatic chromosomal changes, including translocations, inversions and somatic copy number alterations (SCNAs). In particular, gene targets of somatic copy number gains (SCNGs) are of interest since their expression is likely to be upregulated due to increased copy number, or gene dosage, and they may thus be amenable to selective targeting through pharmacological modulation. In order to identify genes that may be effective therapeutic targets, it is necessary to differentiate ‘passenger’ genes, which are encompassed in SCNGs but are not involved in the neoplastic process, from genes in SCNGs that are critical to tumor initiation and/or progression, or so-called ‘driver genes’. Driver genes are positively selected for and are essential to cancer development. As such, identification of driver genes in PDAC, and selective targeting of such genes in the subset of tumors in which they are drivers, represents a potentially effective approach to development of targeted therapies for this disease. Driver genes harbored in SCNGs are expected to exhibit upregulated expression in the tumor cells harboring the genetic gain. Integration of genomic and transcriptional profiles of the same specimens is therefore valuable for homing in on potential target genes in defined regions of SCNG. In addition, blockade of driver genes that are essential to the neoplastic process should conceivably result in attenuation of the cellular processes that promote cancer formation and growth. Consequently, the integration of function as measured by RNA-interference analyses with genomic and transcriptomic profiles provides a suitable avenue to identify putative driver genes that can be further studied in laboratory-based analyses to assess therapeutic potential.

2.2 Hypothesis

The primary hypothesis is that genes which are recurrently amplified by genomic copy number gains, are upregulated, and are found to be essential to cancer cell viability in laboratory-based analysis may be suitable therapeutic targets for further study. The secondary hypothesis is that the genomic copy number gain may serve as a useful biomarker to identify patients who would likely benefit from therapy targeting tumor-specific genetic abnormalities.

2.3 Project Aims

2.3.1 Identification of Coding Regions of Recurrent Copy Number Gain in Human Pancreatic Ductal Adenocarcinoma

Analysis of copy number data from primary pancreatic tumors and cell lines obtained from publicly available PDAC datasets was conducted in order to identify common regions of recurrent copy number gain and genes mapping to these regions. This provided a repository of genes that can be studied further as their genetic amplification has been observed in human pancreatic cancer.

2.3.2 Analysis of Candidate Gene List in an Independent Cohort of Human Pancreatic Ductal Adenocarcinoma Cell Lines

Integration of gene expression data with the genetic information obtained from copy number data analysis is a valuable approach to identifying candidate genes for further study. Some genes may be identified as genetically amplified in pancreatic tumors but may not be expressed in the tissue. Conversely, promising genes to further examine are those that display increased expression level in the context of copy number gain at their respective locus. Since expression data on the same tissue samples used for the initial copy number analysis from publicly available datasets was unavailable, an independent panel of PDAC cell lines was used to quantify copy number and gene expression for all genes identified in the initial analysis.

This same panel of cell lines that are genetically and transcriptomically profiled can then serve as tools for laboratory-based investigation of candidate genes. Furthermore, this same panel of PDAC cell lines was utilized in a pooled shRNA functional genetic screen, and results of this screen were subsequently used to corroborate findings for candidate genes for further study.

2.3.3 Assembling a Catalogue of Candidate Genes for Further Study

The gene set obtained from analysis of copy number data from primary PDACs and cell lines may contain potential driver genes, numerous passenger genes co-occurring on an amplicon with a driver, as well as other potential ‘false-positive’ genes. Using stringent filtering parameters to increase the statistical likelihood of capturing true driver genes and differentiating between passengers, the list of candidate target genes was further refined to a handful of genes that can serve as a catalogue of candidate genes for laboratory-based study.

2.3.4 Modulation of a Candidate Target Gene by RNA-interference and Pharmacological Approaches

Laboratory-based targeted analyses are necessary to validate that copy number gain of a gene, and its associated over-expression, in fact leads to increased cell viability in cell lines harboring the gain in comparison to cell lines in which the gene is not gained. This was accomplished by targeted shRNA-mediated interference of the target gene as well as pharmacological inhibition of the cellular pathway in which the gene is involved.

2.4 Materials and Methods

2.4.1 Publicly Available Pancreatic Ductal Adenocarcinoma Genome Datasets

Somatic copy number gain data from 60 PDAC genomes from four independent pancreatic cancer genome datasets were used to identify genes that are commonly gained in primary PDACs. Details of each dataset as well as copy number analysis methods employed in each individual dataset are summarized (Appendix Table A2).

2.4.2 Integrated Analysis of Pancreatic Cancer Genome Datasets

Regions of genomic gain were extracted from the data on 60 PDAC genomes, as identified in each of four publicly available PDAC copy number datasets in Table 1 (QCMG, OICR, JHU, Harada). The QCMG, OICR and JHU PDAC SCNG data were downloaded from the International Cancer Genome Consortium Data Portal (ICGC, 2010). The PDAC SCNG data from the Harada dataset were extracted from the Supplemental Information of Harada et al. (2009). Python (v2.7) was used to parse each individual file into a common data structure, whereby all of the regions of genomic gain are projected onto a reference genome build (GRCh37/hg19). The output consisted of the UCSC gene name and ID of all genes in which a coding region was encompassed in gains in at least three of the four datasets. In addition, because true SCNA breakpoints may not be accurately delineated by the array-based methods employed in the original studies, all genes that were within 1Mb of the minimal common region of overlap of genomic gains across the datasets were also included in the gene set (expanded gene catalogue).
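
The recurrence rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the parsing code used in the study: the region and gene coordinates are invented placeholders, the function names are hypothetical, and the 1Mb expansion is approximated by padding each gained region by 1Mb before testing containment.

```python
# Hypothetical gained regions per dataset (chromosome, start, end), assumed to
# be on one reference build. Coordinates are illustrative placeholders only.
GAINS = {
    "QCMG":   [("chr3", 170_000_000, 175_000_000)],
    "OICR":   [("chr3", 171_000_000, 176_000_000)],
    "JHU":    [("chr3", 172_000_000, 174_000_000)],
    "Harada": [("chr8", 100_000_000, 110_000_000)],
}

# Hypothetical genes (chromosome, start, end); names are placeholders.
GENES = {
    "GENE_A": ("chr3", 172_400_000, 172_500_000),
    "GENE_B": ("chr3", 174_600_000, 174_700_000),
    "GENE_C": ("chr8", 105_000_000, 105_100_000),
}


def datasets_gaining(gene, gains, flank=0):
    """Return the names of datasets with a gained region that fully
    encompasses the gene, after padding each region by `flank` bases."""
    chrom, start, end = gene
    hits = set()
    for name, regions in gains.items():
        for r_chrom, r_start, r_end in regions:
            if r_chrom == chrom and r_start - flank <= start and end <= r_end + flank:
                hits.add(name)
    return hits


def recurrent_genes(genes, gains, min_datasets=3, flank=0):
    """Genes encompassed by gains in at least `min_datasets` datasets."""
    return sorted(
        name for name, coords in genes.items()
        if len(datasets_gaining(coords, gains, flank)) >= min_datasets
    )


# Core catalogue: gained in at least three of the four datasets.
core = recurrent_genes(GENES, GAINS)
# Expanded catalogue: the same rule with each region padded by 1Mb (assumption).
expanded = recurrent_genes(GENES, GAINS, flank=1_000_000)
print(core, expanded)
```

In this toy input, GENE_B lies just outside the region gained in the JHU data and is therefore captured only once the 1Mb padding is applied, which mirrors the motivation for the expanded catalogue.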

2.4.3 Copy Number Analysis of Candidate Genes in Human Pancreatic Ductal Adenocarcinoma Cell Lines

A panel of 29 PDAC cell lines was utilized to evaluate copy number of the genes in the expanded catalogue (Appendix Table A3). Copy number analysis was performed using the Illumina OmniExpress SNP array (Illumina, San Diego, CA). The raw signal obtained from the array is a ratio of the intensity generated from the PDAC cell line sample relative to a reference sample, Log-R Ratio (LRR):

LRR = log (Robserved/Rexpected)

Rexpected is calculated using a cluster file characterizing a reference set of samples, which is identical for all arrays. The value of Rexpected is different for each SNP on the array. These analyses were carried out at the same time to minimize batch effects such that LRRs can be compared across the arrays. For all genes in the analysis, a gene-directed approach to LRR estimation was formulated, whereby the mean probe intensity (MPI) across all array probes mapping to each of the genes is used to assign a continuous measure of copy number for that gene in each cell line. To compute the MPI value for each gene, the R statistical programming tool (http://cran.r-project.org/bin/windows/base/) was used to extract intensity measures of all array probes mapping to a gene in the expanded catalogue, from transcription start to transcription end, and then calculate the MPI of all probes mapping to each gene. A minimum of 10 SNP probes was required for this analysis, and for smaller genes, the 10 probes in closest proximity to the gene were used. To assess the validity of this approach, the MPI analysis was compared to the continuous copy number estimates computed through the Circular Binary Segmentation algorithm (CBS) (Olshen et al., 2004), and the results were consistent with CBS calculations. This resulted in a continuous MPI measure for each gene in the expanded catalogue in each of the 29 cell lines, which represents a continuous quantitative measure of copy number for each gene.
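
As an illustration of the gene-directed MPI calculation, the following Python sketch averages probe LRR values between transcription start and end and falls back to the 10 nearest probes for small genes. The original analysis was carried out in R; the function name, the input layout and the probe values here are assumptions made for the example.

```python
from statistics import mean


def gene_mpi(probes, tx_start, tx_end, min_probes=10):
    """Mean probe intensity (MPI) for one gene in one cell line.

    probes: list of (position, lrr) tuples on the gene's chromosome
            (a hypothetical layout chosen for this sketch).
    """
    inside = [lrr for pos, lrr in probes if tx_start <= pos <= tx_end]
    if len(inside) >= min_probes:
        return mean(inside)

    # Too few probes within the gene: use the min_probes probes closest to it.
    def distance(pos):
        if pos < tx_start:
            return tx_start - pos
        if pos > tx_end:
            return pos - tx_end
        return 0

    nearest = sorted(probes, key=lambda p: distance(p[0]))[:min_probes]
    return mean([lrr for _, lrr in nearest])


# Toy data: 20 evenly spaced probes, with a gained segment (LRR 0.5)
# between positions 500 and 1500 and a neutral signal (LRR 0.0) elsewhere.
toy_probes = [(pos, 0.5 if 500 <= pos <= 1500 else 0.0) for pos in range(0, 2000, 100)]

large_gene_mpi = gene_mpi(toy_probes, 500, 1500)   # 11 probes inside the gene
small_gene_mpi = gene_mpi(toy_probes, 950, 1050)   # 1 probe inside, nearest 10 used
print(large_gene_mpi, small_gene_mpi)
```

Averaging over at least 10 probes trades positional precision for a less noisy estimate, which is the apparent rationale for the minimum probe requirement.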

2.4.4 Gene Expression Analysis of Candidate Genes in Human Pancreatic Ductal Adenocarcinoma Cell Lines

Gene expression analysis was performed using the Illumina HT-12v4 BeadChip expression array (Illumina, San Diego, CA). The relative expression value for each gene in the expanded catalogue was computed using the Bioconductor LUMI package for R (http://www.bioconductor.org/packages/2.0/bioc/html/lumi.html). The basic pipeline involved background signal subtraction, quality control exploratory analysis, variance-stabilization using the Variance Stabilized Transformation algorithm (Lin et al., 2008), and cross-array normalization using Cyclic LOESS normalization. This algorithm performed pair-wise normalization for all possible pairs of samples in the array and yielded a continuous measure of gene expression for all genes.

2.4.5 Integrated Analysis of Copy Number and Gene Expression of Candidate Genes to Refine List of Putative Target Genes

Using the MPI measures for copy number and VST measures for expression for each of the genes in the expanded catalogue in each of the 29 PDAC cell lines, a Spearman-Rho correlation coefficient, ρ, was computed and subsequently used to identify genes in which MPI and VST were most highly correlated across all cell lines. Genes for which the VST value was ≤6.8 (median VST measure for all genes across all cell lines) were excluded. Genes for which the VST value in the cell line expressing the gene at the highest level was at least 2.5 fold greater than the cell line expressing the gene at the lowest level were selected for analysis. This was done in order to enable stratification of cell lines into categories of ‘high relative expression’ and ‘low relative expression’ for each gene, as it correlates with copy number. The remaining genes were then selected for inclusion in the candidate gene database if ρ ≥ 0.65 which corresponds to a correlation coefficient p-value < 0.05 when compared to the correlation coefficients in randomly simulated gene sets. Simulated gene sets were compiled to simulate the selection criteria used in the candidate gene catalogue. Since the genes are clustered in discrete loci, the simulated gene sets mimicked the same scenario and were generated as follows: an anchor gene was first randomly selected, along with all genes within 500kb of the anchor gene. This process continued until the number of genes in the simulated gene set was equal to the number of genes in the candidate gene set. This simulation was performed in 1500 replicates.
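
The correlation step and the construction of locus-clustered simulated gene sets can be sketched as follows. This is a minimal Python illustration, not the code used in the study: the helper names, the gene coordinate layout and the toy values are hypothetical, and only the ρ ≥ 0.65 cutoff and the 500kb anchor window are taken from the text.

```python
import random


def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(mpi, vst):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(mpi), ranks(vst))


def simulate_gene_set(gene_pos, target_size, window=500_000, rng=random):
    """Draw random anchor genes, adding every gene within `window` of each
    anchor, until the set reaches `target_size` genes.

    gene_pos: dict of gene -> (chromosome, position); hypothetical layout.
    """
    if target_size > len(gene_pos):
        raise ValueError("target_size exceeds the number of available genes")
    names = sorted(gene_pos)
    chosen, seen = [], set()
    while len(chosen) < target_size:
        a_chrom, a_pos = gene_pos[rng.choice(names)]
        for g in names:
            chrom, pos = gene_pos[g]
            close = chrom == a_chrom and abs(pos - a_pos) <= window
            if close and g not in seen and len(chosen) < target_size:
                seen.add(g)
                chosen.append(g)
    return chosen


# Toy copy number (MPI) and expression (VST) values across five cell lines.
rho = spearman_rho([0.1, 0.2, 0.3, 0.4, 0.5], [7.0, 7.4, 7.9, 7.4, 7.9])
keep_gene = rho >= 0.65

# Toy genome: 20 genes spaced 300kb apart on one chromosome.
toy_genes = {f"G{i:02d}": ("chr1", i * 300_000) for i in range(20)}
sim_set = simulate_gene_set(toy_genes, 8, rng=random.Random(0))
print(round(rho, 3), keep_gene, sim_set)
```

Repeating `simulate_gene_set` many times (1500 replicates in the text) and recording the correlation coefficients of the simulated genes gives the null distribution against which a cutoff such as ρ ≥ 0.65 can be calibrated.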

2.4.6 Assembly of Pancreatic Ductal Adenocarcinoma Candidate Target Gene Database

A working repository of the top candidate genes based on the aforementioned analyses and filtering parameters was created (Table 3). Initial annotations including the gene cytoband and correlation coefficient computed between copy number and expression in the panel of 29 PDAC cell lines were included. In addition, a ‘DeltaExp’ column provides the log2 expression difference between the cell line expressing the gene at the highest relative level and the cell line expressing the gene at the lowest relative level. The ‘zGARP Score’ column lists the mean z-normalized gene activity rank profile (GARP) of that gene in a pooled shRNA screen in 27 of the same PDAC cell lines used for copy number and gene expression analysis. Annotations of ‘Druggability’ were added from the Druggable Genome Database compiled as described in section 2.4.7 below. The ‘DrugBank’ column provides information on small molecules which target that specific gene and are characterized in DrugBank, a comprehensive repository of drug information (Knox et al., 2011). ‘NormalPancExp’ and ‘PancTumorExp’ columns indicate protein levels of each gene as documented by immunohistochemical analysis in the Human Protein Atlas (Uhlen et al., 2010), and ‘Differential Exp’ indicates if differential expression between normal pancreatic and pancreatic tumor tissue was observed at the mRNA level (GeneCards, 2011). Finally, the column ‘Assays’ indicates any potential assays, based on the predicted function of the gene, which can be performed to reliably test the effects of perturbation of the gene.

2.4.7 Compilation of ‘Druggable Genome’ Database

In order to add annotations of druggability to the candidate gene database it was necessary to assemble a comprehensive repository of the ‘druggable genome’. This is the subset of the human genome that expresses proteins that can bind drug-like molecules. Three druggable genome datasets were merged for the compilation of the druggable genome database utilized in this study (Russ and Lampel, 2005; Sophic, 2012; Yildirim et al., 2007). A gene is termed ‘druggable’ if it meets at least one of the following criteria: (1) the gene product is known to bind drug molecules; (2) the gene product can theoretically bind drug molecules because it belongs to a family of gene products known to bind drug molecules (e.g. tyrosine protein kinases); (3) the gene product can theoretically bind drug molecules because it contains protein domains that can theoretically bind small molecules. Using these criteria and extensive manual curation of the three existing druggable genome datasets, a database of druggable genes was generated.
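
Merging the three source lists amounts to taking their union while recording which source supports each gene. The Python sketch below shows one way this could be organised; the gene symbols are placeholders, the function names are hypothetical, and the manual curation step described above is not represented.

```python
def merge_druggable(sources):
    """Union of several druggable-gene lists.

    sources: dict of source label -> iterable of gene symbols.
    Returns a dict of gene symbol -> set of source labels supporting it.
    """
    merged = {}
    for label, genes in sources.items():
        for gene in genes:
            # Normalise case so the same symbol from two lists is merged.
            merged.setdefault(gene.strip().upper(), set()).add(label)
    return merged


def is_druggable(gene, merged):
    """A gene is annotated druggable if any source list contains it."""
    return gene.strip().upper() in merged


# Placeholder contents; the real lists come from the three cited datasets.
toy_sources = {
    "Russ and Lampel, 2005": ["GENE_A", "GENE_B"],
    "Sophic, 2012": ["gene_b", "GENE_C"],
    "Yildirim et al., 2007": ["GENE_C"],
}
druggable = merge_druggable(toy_sources)
print(sorted(druggable), is_druggable("GENE_D", druggable))
```

Keeping the supporting sources alongside each gene makes it straightforward to prioritise entries for manual review, for example genes listed by only one dataset.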

2.4.8 Integration of RNA-interference Pooled Screen Studies to Identify Candidate Target Gene for Laboratory-Based Study

Annotations of performance in an shRNA pooled screen on a panel of 28 PDAC lines by Marcotte et al. (all of which were also analyzed in this study) were added to the candidate gene database. shRNA-mediated RNA-interference enables genome-wide loss-of-function screens, and as such, a lentiviral-based shRNA library was used to facilitate genome-wide screening of cultured cancer cells in a pooled format (Marcotte et al., 2012). Briefly, cells were infected with the shRNA library targeting ~16 000 genes in a panel of 72 breast, pancreatic and ovarian cancer cell lines. Integrating these results with the genomic and transcriptomic data in this study facilitated systematic identification of genes which are essential to cell viability in the context of copy number gains and over-expression. A mean z-normalized Gene Activity Rank Profile (zGARP) score from the pooled screen for each gene was used to annotate the essentiality of each gene in the respective cell line. To assign a discrete predicted copy number value for the top candidate genes to formally stratify genes by copy number and relate this to shRNA pooled screen performance, the Global Parameter Hidden Markov Model (GPHMM) method was employed, as described by Li et al. (2011). Genes with copy number ≥4 were grouped as ‘copy number gain’ in the respective cell line, while genes with 2 copies were labeled diploid. A Student’s t-test was used to compare GPHMM copy number with RNAi pooled screen scores when the number of representative cell lines was greater than or equal to 10. The Wilcoxon rank-sum test was used when the number of representative cell lines was less than 10 in both groups (diploid and copy number gain).
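
The stratification and test-selection logic can be summarised in a short Python sketch. The cell line labels and scores are invented placeholders and the function names are hypothetical. Two points are assumptions rather than statements from the text: cell lines with exactly 3 copies are left out of both groups, and the t-test is chosen only when both groups contain at least 10 cell lines, with the rank-sum test used otherwise.

```python
def stratify(copy_number, zgarp):
    """Split zGARP scores for one gene into 'gain' (>= 4 copies) and
    'diploid' (2 copies) groups. Other copy numbers are skipped (assumption)."""
    gain, diploid = [], []
    for line, cn in copy_number.items():
        if line not in zgarp:
            continue  # cell line not present in the pooled screen
        if cn >= 4:
            gain.append(zgarp[line])
        elif cn == 2:
            diploid.append(zgarp[line])
    return gain, diploid


def choose_test(gain, diploid, cutoff=10):
    """Pick the comparison test from group sizes (see assumption above)."""
    if len(gain) >= cutoff and len(diploid) >= cutoff:
        return "student_t"
    return "wilcoxon_rank_sum"


def pairs_lower(gain, diploid):
    """Count (gain, diploid) pairs in which the gain line scores lower,
    with ties counted as 0.5. This count is one form of the Mann-Whitney U
    statistic that underlies the rank-sum test."""
    return sum(
        1.0 if g < d else 0.5 if g == d else 0.0
        for g in gain for d in diploid
    )


# Placeholder GPHMM copy numbers and zGARP scores for one gene.
toy_cn = {"line_1": 4, "line_2": 5, "line_3": 2, "line_4": 2, "line_5": 3, "line_6": 2}
toy_zgarp = {"line_1": -2.0, "line_2": -1.5, "line_3": 0.1,
             "line_4": -0.2, "line_5": -1.0, "line_6": 0.3}

gain_scores, diploid_scores = stratify(toy_cn, toy_zgarp)
test_name = choose_test(gain_scores, diploid_scores)
u_stat = pairs_lower(gain_scores, diploid_scores)
print(test_name, u_stat)
```

A more negative zGARP score in the gain group than in the diploid group is the pattern expected for a gene on which cell viability depends specifically in the context of copy number gain.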

2.4.9 Tissue Culture and Cell Lines

Ten human pancreatic ductal adenocarcinoma cell lines were utilized for laboratory-based studies. AsPc1, Capan-1, Capan-2, HPAF-II, MIA PaCa-2, Panc03.27, Panc04.03 and Panc08.13 were purchased from the American Type Culture Collection (ATCC; Manassas, VA). The human pancreatic ductal adenocarcinoma cell line KP-4 was obtained from the Japan Health Sciences Foundation (JHSF). The human pancreatic ductal adenocarcinoma cell line PATU8988S was generously provided by Francisco Real (Madrid, Spain). HPAF-II, PATU8988S and MIA PaCa-2 were cultured in Dulbecco’s Modified Eagle’s Medium (DMEM; Invitrogen, California), supplemented with 10% fetal bovine serum (FBS; Hyclone, Utah) and 0.1mg/mL penicillin/streptomycin (Invitrogen, California). Capan-1 was cultured in Iscove’s Modified Dulbecco’s Medium (IMDM; Invitrogen, California), supplemented with 20% FBS and 0.1mg/mL penicillin/streptomycin. Capan-2 was cultured in McCoy’s 5A Modified Medium (Invitrogen, California), supplemented with 10% FBS and 0.1mg/mL penicillin/streptomycin. Panc03.27 and Panc08.13 were cultured in Roswell Park Memorial Institute (RPMI) 1640 Medium with 2mM L-glutamine, 4.5g/L glucose, 10mM HEPES and 1.0mM sodium pyruvate (Invitrogen, California), supplemented with 10 Units/mL Human Insulin (Wisent, Quebec), 15% FBS and 0.1mg/mL penicillin/streptomycin. Panc04.03 was cultured in RPMI 1640 with 2mM L-glutamine, 4.5g/L glucose, 10mM HEPES and 1.0mM sodium pyruvate, supplemented with 20 Units/mL Human Insulin, 15% FBS and 0.1mg/mL penicillin/streptomycin. AsPc1 was cultured in RPMI 1640 with 2mM L-glutamine, 4.5g/L glucose, 10mM HEPES and 1.0mM sodium pyruvate, supplemented with 10% FBS and 0.1mg/mL penicillin/streptomycin. KP-4 was cultured in RPMI 1640, supplemented with 10% FBS and 0.1mg/mL penicillin/streptomycin. All cell lines were cultured in a 5% CO2 humidified incubator at 37°C.

2.4.10 ECT2 and Control Lentivirus Production

Human embryonic kidney 293 SV40 large T-antigen (HEK293T) packaging cells were cultured in DMEM, supplemented with 10% FBS and 0.1X penicillin/streptomycin for cell seeding. For viral harvesting, high bovine serum albumin (BSA) 293T growth media was used (DMEM, supplemented with 10% FBS, 1.1g/100mL BSA and 1X penicillin/streptomycin). Viral production was carried out as outlined in the RNAi Consortium Lentiviral Production protocol. Briefly, HEK293T cells were seeded at a density of 2.2 × 10^5 cells/mL in 6-well plates and incubated for 24 hours at 5% CO2 and 37°C. HEK293T packaging cells were then transfected with a mixture of three plasmids: packaging plasmid (pCMV-dR8.74psPAX2; 500ng/well), envelope plasmid (VSV-G/pMD2.G; 50ng), and the hairpin-pLKO.1 vector containing the TRC library shRNA (500ng/well), as well as OPTI-MEM serum-free media (Invitrogen, California), for a total volume of 30µL. All shRNA constructs used are listed in Appendix Table A4. The three-plasmid mix was then added to a solution of TransIT-LT1 transfection reagent (Mirus Bio, Madison WI) and incubated for 30 minutes at room temperature. The transfection mix was then added to the packaging cells and left to incubate at 5% CO2 and 37°C for 18 hours. Media was then changed to high-BSA growth media for viral harvests and cells were incubated for 24 hours at 5% CO2 and 37°C. At approximately 40 hours post-transfection, media containing lentivirus was harvested and again replaced with high-BSA media for subsequent viral harvests. Harvesting was repeated after 24 hours. Media containing virus was centrifuged at 1250 rpm for 5 minutes to pellet any packaging cells collected during harvesting, and the supernatant containing the virus was collected and stored at -80°C.

2.4.11 Lentivirus Titration

Two titering experiments were performed. In the first experiment, HEK293 cells were seeded into 96-well plates at a density of 2 × 10^4 cells/mL. After a 24-hour incubation at 5% CO2 and 37°C, 8µg of polybrene (Millipore, Billerica MA) was added to the cells, and cells were transduced with either 5µL or 15µL of virus, in triplicate, in DMEM and 10% FBS. After 24 hours, media was changed to DMEM and 10% FBS containing 1µg/mL of puromycin and cells were incubated for 24 hours at 5% CO2 and 37°C. Media was then changed to regular media and cells were incubated at 5% CO2 and 37°C. After 72 hours, WST1 reagent (Roche, California) was added and cells were incubated for 45 minutes, after which 450nm absorbance was measured using a UV/Vis Spectrophotometer Plate Reader (Biotek, Winooski VT). Absorbance was averaged across triplicate wells and normalized to identical plates to which no puromycin was added. The second titering experiment followed the same protocol, but instead of HEK293 cells, human pancreatic ductal adenocarcinoma cell lines KP4, Capan-1, Capan-2, AsPc1, Panc04.03, Panc03.27 and PATU8988S were utilized and cultured in their respective media as outlined in 2.4.9. A virus mixture of equal amounts of 5 ECT2 shRNAs, as well as a GFP shRNA, was added in 2-fold serial dilutions (2µL-128µL).

2.4.12 Cell Viability Assay in shRNA Experiment

Human pancreatic ductal adenocarcinoma cell lines AsPc1, Capan-1, Capan-2, KP4, HPAF-II, Panc03.27, Panc04.03, Panc08.13, MIA PaCa-2 and PATU8988S were seeded in duplicate 96-well plates at a cell density of 2000 cells/well and left to incubate at 5% CO2 and 37°C for 24 hours. After incubation, 2µL of 4µg/µL polybrene was added to all cells. In one plate of each cell line, regular media was replaced with puromycin-containing media, and plates were incubated for 24 hours at 5% CO2 and 37°C. Concentrations of puromycin used for each cell line are listed in Appendix Table A5. After incubation, puromycin-containing media was replaced with regular media for each respective cell line and plates were left to incubate for 72 hours at 5% CO2 and 37°C. Cells were then rinsed twice with phosphate-buffered saline (PBS) and fixed with 4% paraformaldehyde for 10 minutes. Following rinsing with PBS, cells were stained with Hoechst (Invitrogen, California) and rinsed twice with PBS. Nuclei counts were obtained by imaging stained nuclei using the IN Cell Analyzer 2000 and analyzed on the IN Cell Developer Analyzer Workstation 3.7 (GE Healthcare, Chalfont St Giles, United Kingdom). Nuclei counts were averaged across triplicate wells and normalized to nuclei counts in the wells transduced with the shRNA-GFP construct.


2.4.13 Pharmacologic Modulation Assay

Human pancreatic ductal adenocarcinoma cell lines AsPc1, Capan-1, Capan-2, KP4, HPAF-II, Panc03.27, Panc04.03, Panc08.13, MIA PaCa-2 and PATU8988S were seeded in duplicate 96-well plates at a cell density of 2000 cells/well and left to incubate at 5% CO2 and 37°C for 24 hours. After incubation, cells were treated with either BI-6727 compound (Boehringer-Ingelheim, Ingelheim am Rhein, Germany) or GSK461364 (GlaxoSmithKline, Brentford UK) in 3-fold serial dilutions (30µM to 0.01µM) or 0.3% DMSO control and incubated at 5% CO2 and 37°C for 72 hours. Details of each compound are provided in Appendix Table A6. Following incubation, cells were treated with WST1 reagent for 45 minutes and 450nm absorbance was determined. Media absorbance was subtracted from all readings, and absorbance measures from triplicate wells were averaged and normalized to the absorbance of the DMSO control wells.
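The dilution scheme and the normalization arithmetic described above can be sketched as follows. This is illustrative only: the function names and absorbance readings are hypothetical, and the choice of eight dilution points is an assumption (eight 3-fold steps from 30µM end near 0.014µM, close to the stated lower bound of 0.01µM).

```python
from statistics import mean


def dilution_series(top_conc, fold, n_points):
    """Concentrations of a serial dilution, starting at top_conc and dividing by fold each step."""
    return [top_conc / fold ** i for i in range(n_points)]


def normalized_viability(treated_wells, dmso_wells, media_blank):
    """Blank-subtract replicate readings, average them, and scale to the DMSO control."""
    treated = mean(reading - media_blank for reading in treated_wells)
    control = mean(reading - media_blank for reading in dmso_wells)
    return treated / control


# Assumed eight-point, 3-fold series starting at 30 µM.
concentrations_um = dilution_series(30.0, 3, 8)

# Hypothetical 450nm absorbance readings from triplicate wells.
viability = normalized_viability([0.62, 0.58, 0.60], [1.12, 1.08, 1.10], media_blank=0.10)
```

A value of 1.0 corresponds to no change relative to the DMSO control, and values below 1.0 indicate reduced WST1 signal.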


2.5 Results

2.5.1 Genes and Genomic Regions of Recurrent Somatic Copy Number Gains in Pancreatic Ductal Adenocarcinoma

Each of the four publicly available pancreatic cancer genome datasets used in this study analyzed a different number of samples and employed a different platform and associated algorithm for calling SCNAs (Appendix Table A2). A comprehensive review of copy number alteration detection platforms underscores the inherent variability in detecting SCNAs using various techniques (Pinto et al., 2011). Bearing this limitation in mind, I analyzed only SCNAs that were consistently observed in multiple datasets, as this would decrease the likelihood that the observed SCNAs were technical artifacts. As such, the number of genes encompassed in regions of genomic gain across multiple datasets was determined (Figure 3). The genes encompassed in gains in two or more datasets comprised a large proportion of the genome (4617 genes). Only one gene, IFLTD1, was encompassed in gains in all four datasets. This gene resides at the same chromosomal locus as KRAS, in the 12p12.1 region, and its gain across all four datasets could reflect the biological importance of KRAS as well as other nearby genes, including IFLTD1. In order to compile a manageable list of candidate target genes, I chose to focus on genes that were encompassed in gains in three of the four datasets. This integrated analysis of copy number gains revealed 171 genes encompassed in gains in at least one sample in three of the four pancreatic cancer SCNG datasets.

[Figure 3 image: bar chart with x-axis ‘Number of Datasets’ and y-axis ‘Number of Genes in SCNGs’; bar labels 4617, 171 and 1.]

Figure 3. Number of genes encompassed in genomic gains among multiple datasets. Number of genes encompassed in genomic gains in 2/4 datasets (leftmost bar), 3/4 datasets (middle bar), and all datasets (rightmost bar) are depicted.

To further validate the approach of selecting genes found in SCNGs in 3/4 datasets, the impact of excluding one dataset was analyzed. On close inspection, two of the five samples in the OICR dataset appeared to over-represent SCNGs, as >20% of the genome was called as gained in each of these two samples. Before deciding whether to exclude this dataset, I assessed the extent to which it is essential to the final analysis. Namely, the specific regions and the total number of regions gained were assessed both with and without the OICR dataset. The analysis revealed that the loci identified as gained in 2 out of 3 datasets (i.e. excluding the OICR dataset) included all of the loci identified as gained in three out of four datasets (i.e. including the OICR dataset). This indicates that excluding the OICR dataset does not affect the final determination of loci to be studied further (Figure 4).

Figure 4. Number of genomic loci gained when assessing the datasets exclusive of the OICR dataset (AJH) and inclusive of the OICR dataset (AJHO). The leftmost bar indicates the number of genomic regions gained in all three of the AJH datasets, while the middle bar represents a less stringent criterion, namely the number of genomic regions gained in at least 2 of the AJH datasets. By definition, all loci found gained in 3 out of the AJHO datasets (right bar) would have been identified in at least two of the AJH datasets (middle bar).
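The "by definition" relationship between the two criteria can be checked with a small set-based sketch. The dataset keys and locus assignments below are toy values chosen for illustration, not the thesis data, and the function name is hypothetical.

```python
from collections import Counter


def loci_in_at_least(datasets, k):
    """Loci reported as gained in at least k of the supplied datasets."""
    counts = Counter(locus for loci in datasets.values() for locus in set(loci))
    return {locus for locus, n in counts.items() if n >= k}


# Toy example: four datasets, the last one ("O") calling extra gains.
all_four = {
    "A": {"3q26", "12p12", "20q11"},
    "J": {"3q26", "12p12", "8q24"},
    "H": {"3q26", "20q11", "8q24"},
    "O": {"3q26", "12p12", "20q11", "8q24", "1p36"},
}
without_o = {name: loci for name, loci in all_four.items() if name != "O"}

three_of_four = loci_in_at_least(all_four, 3)
two_of_three = loci_in_at_least(without_o, 2)

# Removing one dataset lowers each locus count by at most one, so any locus
# seen in >= 3 of 4 datasets is still seen in >= 2 of the remaining 3.
is_subset = three_of_four <= two_of_three
```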


The set of 171 genes encompassed in SCNGs in 3 out of the four datasets comprised the ‘core catalogue’ of genes that were putative targets of SCNGs. However, because true boundaries of SCNGs cannot be definitively delineated using the methods employed in the four PDAC SCNG datasets, we chose to expand the core PDAC catalogue of 171 genes to include all genes in the vicinity of each respective locus, and included all genes within 1Mb of the minimal common region of overlap of the SCNG in each independent dataset. This resulted in an ‘expanded catalogue’ of 756 PDAC genes which were putative SCNG targets for further analysis (Figure 5).

Figure 5. Bioinformatic approach to identifying genes for further analysis. Chromosomal positions of regions of genomic gain were extracted from all datasets and converted to the same human genome assembly (GRCh37/hg19). Genes residing in these regions were then identified, and a non-redundant combined list was compiled of genes found in at least one region of genomic gain in at least one sample in one of the four datasets. Of these genes, those appearing in regions of genomic gain in at least three of the four datasets were identified (171 genes). [QCMG=Queensland Center for Medical Genomics; ICGC=International Cancer Genome Consortium; OICR=Ontario Institute for Cancer Research; JHU PCGP=Johns Hopkins University Pancreatic Cancer Genome Project.]
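The expansion from the core catalogue to all genes within 1Mb of a minimal common region can be sketched as an interval query. The gene names and coordinates below are hypothetical placeholders, and treating any gene that overlaps the padded interval as "within 1Mb" is an assumption about how the boundary was applied.

```python
def genes_within(genes, chrom, start, end, flank=1_000_000):
    """Names of genes overlapping [start - flank, end + flank] on the given chromosome.

    genes maps a gene name to a (chromosome, gene_start, gene_end) tuple.
    """
    lo = max(0, start - flank)
    hi = end + flank
    return sorted(
        name
        for name, (g_chrom, g_start, g_end) in genes.items()
        if g_chrom == chrom and g_start <= hi and g_end >= lo
    )


# Hypothetical gene coordinates (not real annotations).
toy_genes = {
    "GENE_A": ("chr3", 4_200_000, 4_250_000),   # inside the minimal common region
    "GENE_B": ("chr3", 5_900_000, 5_950_000),   # within 1 Mb of the region end
    "GENE_C": ("chr3", 7_500_000, 7_600_000),   # more than 1 Mb away
    "GENE_D": ("chr12", 4_200_000, 4_250_000),  # different chromosome
}

expanded = genes_within(toy_genes, "chr3", 4_000_000, 5_000_000)
```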


The catalogue of PDAC genes encompassed in SCNGs in three of the four PDAC datasets maps to 20 discrete genomic loci that harbor a total of 756 genes (Table 1). The distribution of the number of tumor samples harboring a genomic gain at each of the 20 genomic loci is depicted in Figure 6. The genomic regions that appear to be most frequently gained are 20q11-20q12.31 (60.0%), 16p11-16p13.3 (58.3%), 12p11.21-12p13.33 (55.0%) and 14q11 (48.3%).

Table 1. Genomic loci encompassed in SCNGs identified in this study.

Locus              Size (Mb)   Frequency in 60 PDAC Genomes
2q14-2q14.3        8.01        29 (48.3%)
3q25-3q26.3**      7.10        24 (40.0%)
7p15.2             1.62        28 (46.7%)
7p22-7p22.3        2.02        25 (41.6%)
8q24.2**           3.68        23 (38.3%)
9p13.3             3.16        9 (15.0%)
10q22-10q22.4**    1.07        25 (41.6%)
12p11-12p13.3**    12.16       33 (55.0%)
14q11-14q13**      1.51        30 (50.0%)
15q12-15q15        2.18        28 (46.7%)
15q24.2            1.18        23 (38.3%)
15q26.3            2.83        25 (41.6%)
16p13.3            2.22        35 (58.3%)
16q22-16q22.3      0.92        23 (38.3%)
18p11-18p11.21     0.82        8 (13.3%)
18q11-18q11.2      1.95        22 (36.6%)
19p13.3            1.41        6 (10.0%)
20p13              1.06        7 (11.7%)
20q11-20q13.3**    7.43        36 (60.0%)
Xq12-13**          0.791       6 (10.0%)

Figure 6. Circos plot depicting common regions of genomic gains. The loci identified in somatic copy number gains in at least three out of the four PDAC datasets are depicted in the figure, mapped to the chromosomal region. The height of each bar represents the frequency of the genomic gain in the respective dataset. (OICR: Ontario Institute for Cancer Research; JHU: Johns Hopkins University; QCMG: Queensland Center for Medical Genomics).

The literature-curated data indicated that 17 of the 20 SCNG regions identified in this study have been observed in previous PDAC copy number studies (Figure 7). Notably, for these loci, the presumptive cancer-related driver genes within these SCNGs are yet to be identified.

[Figure 7 image: grid of the 20 loci identified in this study (columns) against published PDAC copy number studies (rows, with ‘Technology’ and ‘Reference’ columns). Locus labels recoverable from the image: 2q14.3, 3q25, 7p15.2, 7p22.3, 8q24.2, 9p13.3, 10q22.2, 12p11.2, 14q11.2, 15q13, 15q24, 15q26.3, 16p13.3, 16q22.1, 18p11.21, 18q11, 19p13.3, 20p13, 20q11, Xq12.

Chromosomal CGH: Solinas-Toldo et al, 1996; Mahlamaki et al, 1997; Fukushige et al, 1997; Curtis et al, 1998; Ghadimi et al, 1999; Schleger et al, 2000; Shirasi et al, 2001; Harada et al, 2002; Mahlamaki et al, 2002; Lin et al, 2003; Kitoh et al, 2005.
Array CGH (BAC/PAC, cDNA): Heidenblad et al, 2004; Aguirre et al, 2004; Holzmann et al, 2004; Mahlamaki et al, 2004; Bashyam et al, 2005; Gysin et al, 2005; Nowak et al, 2005; Loukopoulos et al, 2007.
Array CGH (SNP-arrays): Harada et al, 2008.]

Figure 7. Comparison of 20 loci identified in this study with other pancreatic copy number studies in the literature. Literature references to PDAC copy number studies are listed in the ‘Reference’ column, and are grouped by the technology employed to call somatic copy number alterations (leftmost column). Of the 20 loci identified in our study, 17 have been identified in gains in at least one other PDAC study in the literature (depicted by green boxes).

Moreover, 7/20 of these genomic loci have been identified as frequent targets of genomic gain or amplification in a survey of 3 131 tumor specimens belonging to 26 distinct tumor histological subtypes (Figure 8; Table 2; [Beroukhim et al., 2010]).

Figure 8. Peak regions of genomic amplification identified in a survey of 3 131 tumor specimens belonging broadly to 26 histological subtypes (Beroukhim et al, 2010; Permission to reuse obtained; License No: 2891421438977). Chromosomal position is depicted on the vertical axis. The horizontal length of each peak indicates the statistical confidence that this peak is a true amplification peak.

Table 2. Regions of genomic gain identified in this analysis of pancreatic tumors as well as a survey of 26 histological subtypes of human cancer by Beroukhim et al, 2010.

Locus
3q26
8q24
10q22
12p11.21-p13.33
14q11.2
20q11-q13
Xq12

2.5.2 Integrated Copy Number and Expression Analysis of Candidate Genes

To further refine the PDAC gene catalogue, copy number and gene expression level measures of genes in the expanded catalogue of 756 genes were assessed in an independent panel of 29 human PDAC cell lines (Appendix Table A3). Array-based copy number and gene expression data on these 756 genes was utilized in the analysis and genes were ranked by the correlation measure between copy number and expression. In order to correlate copy number and expression for the gene set, it was necessary to derive continuous measures of each of these attributes. To assign a continuous measure of copy number for each gene, a gene-directed approach (as opposed to a genome-wide approach) was used, whereby the array intensity measures for all SNP probes on the array mapping to a gene were averaged to provide an approximate measure of array intensity, mean probe intensity (MPI) as it relates to putative copy number (Figure 9).

Figure 9. Mean probe intensity for assigning continuous copy number measure. SNP probe intensities for all probes mapping to each gene are averaged to assign a mean probe intensity continuous measure of copy number.
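The gene-directed averaging described for Figure 9 can be sketched as follows. The probe positions, intensities and gene coordinates are hypothetical, and the function name and the rule that a probe "maps to" a gene when its position lies between the gene start and end are assumptions made for illustration.

```python
from statistics import mean


def mean_probe_intensity(probes, gene_start, gene_end):
    """Average the intensity of all probes positioned within a gene.

    probes is a list of (position, intensity) pairs on the gene's chromosome.
    Returns None when no probe maps to the gene.
    """
    values = [intensity for pos, intensity in probes if gene_start <= pos <= gene_end]
    return mean(values) if values else None


# Hypothetical SNP probes on one chromosome for one cell line.
probes = [(1_000, 0.10), (2_500, 0.20), (3_000, 0.40), (3_800, 0.60), (9_000, -0.30)]

mpi = mean_probe_intensity(probes, gene_start=2_000, gene_end=4_000)
```

Repeating this per gene and per cell line yields one continuous value per gene and cell line that can be correlated with expression.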

For each gene, the correlation between copy number and gene expression for that gene was assessed by computing a Spearman-rho correlation coefficient, ρ, of these measures across the 29 PDAC cell lines. In addition, a Sum of Ranks was computed, which is a measure of the relation of the top five cell lines with the highest MPI copy number measure for that gene and the bottom five cell lines with the lowest MPI copy number measure. The lower the sum of ranks, the better the correlation between copy number and gene expression in the 5 cell lines with the highest copy number and the 5 cell lines with the lowest copy number. The sum of ranks was highly associated with the Spearman-rho correlation coefficient of copy number and expression for each gene (Figure 10). Representative gene plots for copy number and gene expression correlations are shown in Figure 11.
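As a minimal sketch of the per-gene correlation step, Spearman's ρ can be computed by ranking both measures (averaging tied ranks) and taking the Pearson correlation of the ranks. The input values are hypothetical, and this is a generic implementation rather than the software used in the thesis.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed values."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical copy number (MPI) and expression values for one gene across cell lines.
copy_number = [0.54, 0.40, 0.21, 0.06, -0.10]
expression = [9.8, 9.1, 8.7, 7.9, 7.2]
rho = spearman_rho(copy_number, expression)
```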


Figure 10. Association between Sum of Ranks and Spearman-rho Correlation Coefficient for PDAC genes. Sum of ranks measures are highly associated with the Spearman-rho Correlation Coefficients of copy number and expression for each gene in PDAC gene catalogue.


Figure 11. Representative copy number and gene expression correlation plots. Representative copy number and gene expression plots for four genes from the 756 gene set are shown. KRAS is a known oncogene in PDAC, MYC is a known oncogene in other human cancers and the other genes depicted, ECT2 and VCP, have not been characterized in PDAC. Red data points indicate the 5 cell lines with highest relative copy number and 5 cell lines with lowest relative copy number.

Among the genes in the set of 756 genes, those in the 95th percentile of correlation coefficients have higher correlations between copy number and gene expression in comparison to randomly simulated gene sets (p-value=0.007; Figure 12). Genes were selected for further investigation if they met three criteria: a correlation coefficient p-value < 0.05 when compared with the correlation coefficients in randomly simulated gene sets; a 2.5-fold or greater difference in expression between the cell lines expressing the gene at the highest and lowest levels; and a minimum expression measure of 6.8 in the cell line expressing the gene at the lowest level. The minimum expression level was based on the median expression measure for all genes across all cell lines and represents the lower limit of transcript detection.
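The three selection criteria can be expressed as a single filter, sketched below. The function name and example values are hypothetical, and the sketch assumes expression measures are on a log2 scale, so that a 2.5-fold difference corresponds to a difference of log2(2.5) ≈ 1.32 between the highest and lowest expressing cell lines.

```python
from math import log2


def passes_filters(p_value, expression, alpha=0.05, min_fold=2.5, min_expr=6.8):
    """Apply the three candidate-gene criteria to one gene.

    expression holds the (assumed log2) expression measure of the gene in each cell line.
    """
    lowest, highest = min(expression), max(expression)
    significant = p_value < alpha
    enough_range = (highest - lowest) >= log2(min_fold)
    detectable = lowest >= min_expr
    return significant and enough_range and detectable


# Hypothetical genes: (p-value, expression across cell lines).
candidates = {
    "GENE_KEEP": (0.01, [7.0, 8.5, 9.0]),
    "GENE_LOW_EXPR": (0.01, [6.0, 9.0]),
    "GENE_NOT_SIG": (0.20, [7.0, 9.0]),
    "GENE_FLAT": (0.01, [7.0, 8.0]),
}
selected = sorted(name for name, (p, expr) in candidates.items() if passes_filters(p, expr))
```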

Figure 12. Distribution of correlation coefficients in the top 5% most highly correlated genes in comparison to multiple simulations of random sets of genes. The top 5% genes in the gene catalogue having the highest correlation (ρ) values were compared with the top 5% genes with the highest ρ values in sets of randomly selected genes. The simulated gene set was compiled to simulate the selection criteria used in our candidate gene catalogue. Since our genes are clustered in discrete loci, our simulated gene set mimicked the same scenario: an anchor gene was first randomly selected, along with all genes within 1Mb of the anchor gene. This simulation was performed in 1500 replicates. The median ρ for the top 5% genes in the simulation set was 0.68 while the median ρ in the top 5% genes in our catalogue was 0.73 (p=0.007).
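The locus-matched simulation described in the caption can be sketched roughly as follows. Several details are assumptions rather than statements from the thesis: the number of anchor genes drawn per replicate, the use of a one-sided empirical p-value with a +1 correction, and all function and variable names. The gene list is expected to hold a chromosome, a position and a precomputed ρ for every gene on the array.

```python
import random
from statistics import median


def top_fraction_median(rhos, fraction=0.05):
    """Median rho among the top `fraction` of values (at least one value)."""
    k = max(1, int(len(rhos) * fraction))
    return median(sorted(rhos, reverse=True)[:k])


def simulate_null(genes, n_anchors, n_replicates, rng, flank=1_000_000):
    """Build a null distribution of top-5% median rho from clustered random gene sets.

    genes is a list of (chromosome, position, rho) tuples.
    """
    null = []
    for _ in range(n_replicates):
        picked = {}
        for _ in range(n_anchors):
            chrom, pos, _rho = rng.choice(genes)  # random anchor gene
            for idx, (c, p, rho) in enumerate(genes):
                if c == chrom and abs(p - pos) <= flank:  # anchor plus neighbours within 1 Mb
                    picked[idx] = rho
        null.append(top_fraction_median(list(picked.values())))
    return null


def empirical_p(observed, null):
    """One-sided empirical p-value: share of simulated values at least as large as observed."""
    hits = sum(1 for value in null if value >= observed)
    return (hits + 1) / (len(null) + 1)
```

In use, `observed` would be the top-5% median ρ of the candidate catalogue and `n_replicates` would be 1500, with `rng = random.Random(seed)` fixed for reproducibility.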

2.5.3 Database of Top-Ranked Candidate Target Genes

The filtering parameters in 2.5.2 resulted in a refined list of 34 candidate genes (CAN-GENES) for further study, from the original list of 756 genes. These top-ranked genes primarily mapped to 3q26.1-q26.3, 7p22, 9p13.1-p13.3, 12p11.21-pter, 14q11.2, 15q14-15, 15q26, 16p13, 18p11, 18q11.2, 19p13.3, 20p13 and 20q11.2-q13. Of these genes, 26% (9/34) mapped to the 12p11-12 region. Further annotations were added to these CAN-GENES to formulate a working database of candidate targets. Annotations include druggability, defined as the availability of small molecule modulators or the presence of protein motifs which are potential drug-binding domains, and available data on protein expression (Table 3). Genes in red font appear in the assembled druggable genome database described in 2.4.7. Annotations including the gene cytoband and the correlation coefficient computed between copy number and expression in the panel of 29 PDAC cell lines were included. In addition, a ‘DeltaExp’ column provides the log2 expression difference between the cell line expressing the gene at the highest relative level and the cell line expressing the gene at the lowest relative level. The ‘zGARP Score’ column lists the mean z-normalized gene activity rank profile (GARP) of that gene in a pooled shRNA screen in 27 of the same PDAC cell lines used for copy number and gene expression analysis. Annotations of ‘Druggability’ were added from the Druggable Genome Database compiled as described in Methods section 2.4.7. The ‘DrugBank’ column provides information on small molecules that target that specific gene and are characterized in DrugBank, a comprehensive repository of drug information (Knox C et al., 2011). ‘NormalPancExp’ and ‘PancTumorExp’ columns indicate protein levels of each gene as documented by immunohistochemical analysis in the Human Protein Atlas (Uhlen et al., 2010), and ‘Differential Exp’ indicates if differential expression between normal pancreatic and pancreatic tumor tissue was observed at the mRNA level (GeneCards, 2012).


Table 3. Database of top-ranked candidate PDAC genes.

Gene      Cytoband         Correlation  DeltaExp  zGARP Score  Druggable  DrugBank Drugs  Normal Prot                                   Panc Tumor Prot                   Differential Exp (GeneCard)
RECQL     12p12            0.835        1.832     -0.609       N          n/a             Weak (low protein expression)                 Negative-Moderate                 Yes
TMEM85    15q14            0.832        2.333     0.654        N          n/a             n/a                                           n/a                               n/a
VCP       9p13.3           0.825        1.638     -5.016       Y          see list        Moderate-strong                               Moderate-Strong                   Moderate
GOLT1B    12p12.1          0.811        2.641     -0.779       N          n/a             n/a                                           n/a                               Moderate
CLTA      9p13             0.805        1.491     -0.27        N          n/a             Moderate                                      Weak-Strong                       Yes
MELK      9p13.2           0.803        1.431     -1.53        Y          n/a             Strong                                        Moderate-Strong                   Yes
ESCO1     18q11.2          0.802        1.475     0.673        N          n/a             Strong                                        Moderate-Strong                   n/a
SELS      15q26.3          0.779        1.616     n/a          N          n/a             n/a                                           n/a                               n/a
TMEM55B   14q11.2          0.768        2.474     n/a          N          n/a             n/a                                           n/a                               n/a
MED21     12p11.23         0.766        1.348     -0.323       N          n/a             Weak                                          Weak-Moderate (Strong in CAPAN2)  Yes
CMAS      12p12.1          0.758        2.864     0.435        Y          see list        Strong                                        Weak-Strong                       Moderate
CCDC91    12p11.22         0.741        2.209     -0.362       N          n/a             Strong                                        Weak-Strong                       Moderate
KIAA0528  12p12.1          0.739        2.121     0.278        N          n/a             n/a                                           n/a                               No
WDR18     19p13.3          0.736        2.325     0.484        N          n/a             n/a                                           n/a                               No
CHMP4B    20q11.22         0.732        2.659     n/a          N          n/a             n/a                                           n/a                               n/a
UBAP2     9p13.3           0.732        1.425     -0.203       N          n/a             Negative                                      Weak-Strong                       No
RHBDF1    16p13.3          0.729        2.326     n/a          Y          n/a             n/a                                           n/a                               Moderate
AMN1      12p11.21         0.722        1.906     0.098        Y          n/a             n/a                                           n/a                               n/a
CSNK2A1   20p13            0.718        1.369     0.963        Y          see list        Strong                                        Weak-Strong                       Yes
EIF2AK1   7p22             0.712        1.470     0.858        Y          n/a             Medium (protein expression of normal tissue)  Weak-Strong                       Yes
PSMG2     18p11.21         0.709        2.145     0.126        N          n/a             n/a                                           n/a                               Yes
WNK1      12p13.3          0.702        1.434     -1.201       Y          n/a             n/a                                           n/a                               No
NOP10     15q14-q15        0.698        2.069     1.132        N          n/a             n/a                                           n/a                               n/a
STOML2    9p13.1           0.697        1.578     0.832        Y          n/a             Moderate                                      Negative-moderate                 Yes
ECT2      3q26.1-q26.2     0.695        3.940     -3.195       Y          n/a             n/a                                           n/a                               Yes
LYRM5     12p12.1          0.695        2.211     0.313        N          n/a             n/a                                           n/a                               n/a
RALY      20q11.21-q11.23  0.688        1.751     -1.403       N          n/a             n/a                                           n/a                               No
FKBP1A    20p13            0.683        2.472     -0.017       Y          see list        Weak                                          Negative                          Yes
FGFR1OP2  12p11.23         0.681        1.735     0.092        Y          n/a             High                                          Moderate-Strong                   n/a
GSS       20q11.2          0.681        1.435     -0.673       Y          see list        n/a                                           n/a                               No
CCDC77    12p13.33         0.680        2.266     0.074        N          n/a             Moderate                                      Weak-Moderate                     n/a
ZSWIM1    20q13.12         0.677        1.544     0.034        N          n/a             n/a                                           n/a                               n/a
RPS15     19p13.3          0.663        1.833     -2.381       Y          n/a             n/a                                           n/a                               No
AFG3L2    18p11            0.658        3.056     -1.458       Y          n/a             Medium                                        Moderate-Strong                   No

DrugBank Drugs entries for genes marked ‘see list’:

VCP: Phosphoaminophosphonic Acid-Adenylate Ester (DB04395); Adenosine-5'-Diphosphate (DB03431)

CMAS: Cytidine-5'-Monophosphate-5-N-Acetylneuraminic Acid (DB02485)

CSNK2A1: Benzamidine (DB03127); S-METHYL-4,5,6,7-TETRABROMO-BENZIMIDAZOLE (DB04720); Phosphoaminophosphonic Acid-Adenylate Ester (DB04395); Tetrabromo-2-Benzotriazole (DB04462); 2-(CYCLOHEXYLMETHYLAMINO)-4-(PHENYLAMINO)PYRAZOLO[1,5-A][1,3,5]TRIAZINE-8-CARBONITRILE (DB08354); 2,3,7,8-tetrahydroxychromeno[5,4,3-cde]chromene-5,10-dione (DB08468); 2-(4-ETHYLPIPERAZIN-1-YL)-4-(PHENYLAMINO)PYRAZOLO[1,5-A][1,3,5]TRIAZINE-8-CARBONITRILE (DB08360); N-(3-(8-CYANO-4-(PHENYLAMINO)PYRAZOLO[1,5-A][1,3,5]TRIAZIN-2-YLAMINO)PHENYL)ACETAMIDE (DB08362); 5,6-dichloro-1-beta-D-ribofuranosyl-1H-benzimidazole (DB08473); 3,8-DIBROMO-7-HYDROXY-4-METHYL-2H-CHROMEN-2-ONE (DB07802); 3-METHYL-1,6,8-TRIHYDROXYANTHRAQUINONE (DB07715); 1,2,5,8-tetrahydroxyanthracene-9,10-dione (DB08660); (5-Oxo-5,6-Dihydro-Indolo[1,2-a]Quinazolin-7-Yl)-Acetic Acid (DB01765); DIMETHYL-(4,5,6,7-TETRABROMO-1H-BENZOIMIDAZOL-2-YL)-AMINE (DB04719); 1,8-Di-Hydroxy-4-Nitro-Anthraquinone (DB03035); 1,8-Di-Hydroxy-4-Nitro-Xanthen-9-One (DB02170); 5,8-Di-Amino-1,4-Dihydr [entry truncated in source]

FKBP1A: Phosphoaminophosphonic Acid-Adenylate Ester (DB04395); (21S)-1AZA-4,4-DIMETHYL-6,19-DIOXA-2,3,7,20-TETRAOXOBICYCLO[19.4.0] PENTACOSANE FKB-001 (DB02888); L-709,587 (DB03621); {3-[3-(3,4-Dimethoxy-Phenyl)-1-(1-{1-[2-(3,4,5-Trimethoxy-Phenyl)-Butyryl]-Piperidin-2yl}-Vinyloxy)-Propyl]-Phenoxy}-Acetic Acid (DB01723); Rapamycin Immunosuppressant Drug (DB02439); Tacrolimus (DB00864); 4-Hydroxy-2-Butanone (DB04094); (3r)-4-(P-Toluenesulfonyl)-1,4-Thiazane-3-Carboxylicacid-L-Leucine (DB04012); Gpi-1046 (DB01951); (21S)-1AZA-4,4-DIMETHYL-6,19-DIOXA-2,3,7,20-TETRAOXOBICYCLO[19.4.0] PENTACOSANE (DB08520); Heptyl-Beta-D-Glucopyranoside (DB03338); N1,N2-ETHYLENE-2-METHYLAMINO-4,5,6,7-TETRABROMO-BENZIMIDAZOLE (DB04721); (3r)-4-(P-Toluenesulfonyl)-1,4-Thiazane-3-Carboxylicacid-L-Phenylalanine Ethyl Ester (DB01712); 6-[4-(2-piperidin-1-ylethoxy)phenyl]-3-pyridin-4-ylpyrazolo[1,5-a]pyrimidine (DB08597); Pimecrolimus (DB00337); Sirolimus (DB00877); MYRISTIC ACID (DB08231)

GSS: Gamma-Glutamylcysteine (DB03408); Glycine (DB00145); L-Cysteine (DB00151); Adenosine-5'-Diphosphate (DB03431); Glutathione (DB00143); Phosphoaminophosphonic Acid-Adenylate Ester (DB04395)

2.5.4 Identification of ECT2 for Laboratory Study through Integration of shRNA Pooled Screen Results

Genomic copy number and global expression data do not indicate whether a gene is essential for cancer cellular pathways and networks. I therefore sought to compare the genomic information with existing functional genetic screening data from Marcotte et al., in which shRNAs targeting ~16 000 genes were tested in a pooled screen on 27 of the same cell lines analyzed for SCNGs and expression in this study (Marcotte et al., 2012). Functional genetic screening data for the set of CAN-GENES in PDAC was thus integrated with copy number and gene expression data to further refine the list of putative driver genes. Among the 34 top-ranked genes in the CAN-GENES list, 7 genes (VCP, ECT2, RPS15, MELK, AFG3L2, RALY and WNK1) were found to be essential for PDAC cell viability, as measured by a median z-normalized GARP (zGARP) score lower than -1 across all pancreatic cancer cell lines (Figure 13).


Figure 13. shRNA pooled screen results for top-ranked candidate genes. Heatmap depicts z-normalized GARP (essentiality) scores of 34 CAN-GENES obtained from shRNA pooled screen analysis in 27 human pancreatic ductal adenocarcinoma cell lines. Left vertical axis indicates the PDAC cell line tested.
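The essentiality cut-off can be illustrated with the summary zGARP scores listed for the CAN-GENES in Table 3. The sketch below simply applies the stated threshold of -1 to those values; genes with no reported score are stored as `None` and skipped. The function name is illustrative and not from the thesis.

```python
# Summary zGARP scores transcribed from Table 3 (None where the table lists n/a).
ZGARP = {
    "RECQL": -0.609, "TMEM85": 0.654, "VCP": -5.016, "GOLT1B": -0.779,
    "CLTA": -0.27, "MELK": -1.53, "ESCO1": 0.673, "SELS": None,
    "TMEM55B": None, "MED21": -0.323, "CMAS": 0.435, "CCDC91": -0.362,
    "KIAA0528": 0.278, "WDR18": 0.484, "CHMP4B": None, "UBAP2": -0.203,
    "RHBDF1": None, "AMN1": 0.098, "CSNK2A1": 0.963, "EIF2AK1": 0.858,
    "PSMG2": 0.126, "WNK1": -1.201, "NOP10": 1.132, "STOML2": 0.832,
    "ECT2": -3.195, "LYRM5": 0.313, "RALY": -1.403, "FKBP1A": -0.017,
    "FGFR1OP2": 0.092, "GSS": -0.673, "CCDC77": 0.074, "ZSWIM1": 0.034,
    "RPS15": -2.381, "AFG3L2": -1.458,
}


def essential_genes(scores, threshold=-1.0):
    """Genes whose summary zGARP score falls below the essentiality threshold."""
    return sorted(gene for gene, score in scores.items()
                  if score is not None and score < threshold)


essential = essential_genes(ZGARP)
```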

Next, I sought to determine whether these 7 genes were essential to PDAC viability across all PDAC cell lines or displayed relatively higher essentiality in cell lines that harbor copy number gains at their respective loci. In other words, in the subset of PDAC cell lines in which the gene is essential, is it also amplified, and vice versa? In order to formally assess this, it was necessary to assign a discrete copy number to each of the candidate genes across the panel of PDAC cell lines. Copy number data on the panel of 29 PDAC cell lines was processed through the Global Parameter Hidden Markov Model (GPHMM) method, a standard algorithm for array-based copy number analyses (Li A et al., 2011). Genes with copy number ≥4 were grouped as ‘copy number gain’ in the respective cell line, while genes with 2 copies were labeled diploid.

Data for the 7 essential genes are depicted in Figure 14. Notably, only ECT2 showed a positive association between copy number, expression and essentiality (zGARP) across the 27 PDAC cell lines, indicating that higher essentiality is observed in cell lines bearing ECT2 SCNGs (p=0.015; Figure 14a). This indicates that while the enriched gene set overall consists of genes that are essential to PDAC cell viability (‘Pancreatic essentiality’), only ECT2 appears to display ‘Pancreatic gain-specific essentiality’, suggesting that increased essentiality may be a direct result of copy number gain and upregulated expression of this gene.

Figure 14a. Comparison of essentiality scores of ECT2 in PDAC cell lines with copy number gains and cell lines in which ECT2 is diploid. The p-value denoted at the top of the plot indicates the significance of the difference in RNAi pooled screen z-normalized GARP scores (zGARP) between cell lines where ECT2 is in the diploid state and cell lines harboring genetic gains at the 3q26 locus.

Figure 14b. Comparison of essentiality scores of PDAC essential genes with copy number gains. Boxplots show genes that are overall essential across the 27 surveyed PDAC cell lines. p-values denoted at the top of the plots for each gene indicate the degree of significance between differences in RNAi pooled screen z-normalized GARP scores (zGARP) in cell lines where the gene is in the diploid state in comparison to cell lines harboring genetic gains at the respective locus.


These results, coupled with the fact that the 3q26 locus harboring ECT2 was found gained or amplified in 24/60 (40%) of PDACs utilized in the initial analysis of copy number gains in human PDACs using public datasets, indicated that gains at this locus may be recurrent genomic events important in PDAC tumor progression. Moreover, 16/27 (59.3%) of the human PDAC cell lines analyzed for essentiality in the pooled shRNA screen harbored SCNGs at the 3q26 locus, and thus a large number of cell lines modeling gains at this locus were available for functional validation experiments.

2.5.5 Targeted shRNA studies of ECT2 in Pancreatic Ductal Adenocarcinoma Cell Lines

As outlined in 2.5.4, an shRNA-based analysis of the same PDAC cell lines analyzed in this study was conducted as part of a functional genomic study utilizing a pooled shRNA screen (Marcotte et al., 2012). Results of the screen demonstrate a trend linking ECT2 copy number gain, expression and essentiality to cell viability, as measured by the z-normalized GARP (zGARP) score of shRNA pooled screen performance (Figure 15). These results, together with the formal analysis of shRNA pooled screen performance and copy number in 2.5.4, prompted targeted validation of the role of ECT2 in PDAC cellular biology.


Figure 15. Histogram representation of copy number, expression and shRNA pooled screen performance for ECT2. Individual bars of the histogram represent data for all PDAC cell lines analyzed in a pooled shRNA screen (Marcotte et al., 2012). Below each cell line name is a symbol to indicate presence (+) or absence (-) of ECT2 copy number gain in each respective cell line. The height of each bar represents the expression level of ECT2 in the cell line. The color of each bar, as outlined in the legend, depicts the shRNA pooled screen zGARP value for ECT2 in the cell line.

Since ECT2 is implicated in cell proliferation, a critical experiment is to assess the extent to which ECT2 genetic gain is associated with increased dependency on this protein for cell viability. To assess this, five shRNAs that target ECT2, as well as appropriate controls from The RNAi Consortium, were used in a targeted assay of the effect of ECT2 interference on cell viability (Moffat et al., 2006). A panel of 10 of the same cell lines utilized in the genetic and transcriptomic analysis in this study was selected for analysis. Of this panel of 10 cell lines, 5 harbor focal copy number gains encompassing 3q26 (Capan-1, Capan-2, HPAF-II, Panc03.27, PATU8988S), 2 bear arm-level gains of 3q (MIA PaCa-2, KP4), 1 has a whole-chromosome gain (Panc08.13), 1 is copy neutral (diploid) for chromosome 3 (AsPc1) and 1 has a one-copy loss in the 3q26 genomic region (Panc04.03). These estimates of copy number were determined by processing the array-based copy number data through the GPHMM algorithm (Li et al., 2011). Moreover, these results were entirely concordant with the copy number estimates obtained through the mean probe intensity (MPI) measures I derived from the copy number data (Table 4). The MPI measures are, in turn, nearly identical to the continuous measures obtained when processing the data through the standard copy number analysis method of Circular Binary Segmentation (CBS), suggesting that the copy number estimates are robust to the choice of method (Table 4, Figure 16). In addition, a recent study of 947 human cancer cell lines, termed the Cancer Cell Line Encyclopedia, also profiled genomic copy number of the 10 cell lines utilized in this study, and the results are highly concordant with this study (Table 5; [Barretina et al., 2012]). Copy number plots for all cell lines used in this analysis are depicted in Figure 17a-e and these plots are compared with copy number plots for the same cell lines derived as part of the Wellcome Trust Sanger Institute’s Cancer Cell Line Project (http://www.sanger.ac.uk/genetics/CGP/CellLines/).

Table 4. Results of copy number measures for ECT2 in cell lines utilized for targeted shRNA analyses obtained through different computational methods.

Cell Line     ECT2 MPI*    ECT2 CBS**   ECT2 GPHMM Copy Number†
Capan-2        0.53727      0.4164      5
Capan-1        0.40464      0.3438      4
HPAF-II        0.37382      0.3026      5
PaTu8988S      0.35131      0.2987      5
Panc03.27      0.33849      0.2302      4
MIA-PaCa-2     0.2137       0.1582      4
Panc08.13      0.17869      0.1596      3
KP-4           0.06285      0.0743      3
AsPc1         -0.10007     -0.0685      2
Panc04.03     -0.12725     -0.1213      1

*Results of analysis in this study. **Olshen et al., 2004. †Li et al., 2011.
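The agreement between the MPI and CBS measures can be quantified with a correlation coefficient. The following is a minimal sketch, assuming that a Pearson correlation is an acceptable summary of concordance; it is not the procedure used to generate Figure 16, and the function name `pearson_r` is introduced here for illustration only. The values are transcribed from Table 4.

```python
import math

# ECT2 copy number measures transcribed from Table 4 (same cell line order).
cell_lines = ["Capan-2", "Capan-1", "HPAF-II", "PaTu8988S", "Panc03.27",
              "MIA-PaCa-2", "Panc08.13", "KP-4", "AsPc1", "Panc04.03"]
mpi = [0.53727, 0.40464, 0.37382, 0.35131, 0.33849,
       0.2137, 0.17869, 0.06285, -0.10007, -0.12725]
cbs = [0.4164, 0.3438, 0.3026, 0.2987, 0.2302,
       0.1582, 0.1596, 0.0743, -0.0685, -0.1213]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical summary statistic: correlation of MPI with CBS across cell lines.
r_mpi_cbs = pearson_r(mpi, cbs)
print(f"Pearson r (MPI vs CBS) across {len(cell_lines)} cell lines: {r_mpi_cbs:.3f}")
```

A coefficient close to 1 would be consistent with the statement that the two continuous measures are nearly identical, although a correlation alone does not establish agreement in absolute scale.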

Figure 16. Comparison of Mean Probe Intensity (MPI) copy number estimation approach with Circular Binary Segmentation (CBS) for copy number estimation of ECT2. MPI measures utilized in this study are compared with measures for copy number obtained by processing the same data through the CBS algorithm described by Olshen et al., 2004.

Table 5. Copy number analysis of pancreatic cancer cell lines in Barretina et al., 2012. Cell lines are ordered by ‘seg.mean’ measure of copy number (Barretina et al., 2012).

CCLE_name             chrom   loc.start    loc.end      num.mark   seg.mean
CAPAN2_PANCREAS       3       173044302    176065417    2001        1.0032
CAPAN1_PANCREAS       3       164028410    190843448    16634       0.8092
HPAFII_PANCREAS       3       168371964    176150140    4901        0.7195
PANC0327_PANCREAS     3       169898702    175222564    3439        0.6604
PATU8988S_PANCREAS    3       169655701    176185849    4214        0.613
PANC0813_PANCREAS     3       169779880    179474606    6273        0.3523
MIAPACA2_PANCREAS     3       168320734    190145372    13735       0.3026
KP4_PANCREAS          3       172208475    174469772    1516        0.1973
ASPC1_PANCREAS        3       75760125     194361700    68972      -0.2326
PANC0403_PANCREAS     3       163625382    174712073    6661       -0.2937

Figure 17a. Copy number plots for ECT2 in cell lines with focal 3q26 gains. Each plot depicts copy number data for the respective cell lines (Capan-1, Capan-2, HPAF-II, Panc03.27, PATU8988S). The horizontal axis is the chromosomal position on chromosome 3 and the vertical axis represents the LogR Ratio value for all probes in that region. Dashed red lines on each plot represent the chromosomal position of ECT2 and dashed blue lines represent a LogR ratio of 0, or an approximate copy neutral state. The top panel is the cell line data analyzed in this study and the bottom panel (where available) is the copy number plot for chromosome 3 analyzed in the Wellcome Trust Sanger Institute’s Cancer Cell Line Project.

Figure 17b. Copy number plots for ECT2 in cell lines with arm-level 3q gains. Each plot depicts copy number data for the respective cell lines (KP-4, MIAPaCa-2). The horizontal axis is the chromosomal position on chromosome 3 and the vertical axis represents the LogR Ratio value for all probes in that region. Dashed red lines on each plot represent the chromosomal position of ECT2 and dashed blue lines represent a LogR ratio of 0, or an approximate copy neutral state. The top panel is the cell line data analyzed in this study and the bottom panel is the copy number plot for chromosome 3 analyzed in the Wellcome Trust Sanger Institute’s Cancer Cell Line Project.

Figure 17c. Copy number plots for ECT2 in Panc08.13 harboring whole chromosome 3 gain. The horizontal axis is the chromosomal position on chromosome 3 and the vertical axis represents the LogR Ratio value for all probes in that region. The dashed red line represents the chromosomal position of ECT2 and the dashed blue line represents a LogR ratio of 0, or an approximate copy neutral state. The top panel is the cell line data analyzed in this study and the bottom panel is the copy number plot for chromosome 3 analyzed in the Wellcome Trust Sanger Institute’s Cancer Cell Line Project.

Figure 17d. Copy number plots for ECT2 in AsPc1 with a diploid/copy neutral chromosome 3. The horizontal axis is the chromosomal position on chromosome 3 and the vertical axis represents the LogR Ratio value for all probes in that region. The dashed red line represents the chromosomal position of ECT2 and the dashed blue line represents a LogR ratio of 0, or an approximate copy neutral state. The top panel is the cell line data analyzed in this study and the bottom panel is the copy number plot for chromosome 3 analyzed in the Wellcome Trust Sanger Institute’s Cancer Cell Line Project.

Figure 17e. Copy number plots for ECT2 in Panc04.03 with a one copy loss of ECT2. The horizontal axis is the chromosomal position on chromosome 3 and the vertical axis represents the LogR Ratio value for all probes in that region. Dashed red line represents the chromosomal position of ECT2 and dashed blue line represents a LogR ratio of 0, or an approximate copy neutral state.

Using five shRNAs to target ECT2, along with three negative control shRNAs (LACZ, GFP and LUC) and two positive control shRNAs (PSMD1, SNRPD1), the effects of shRNA-mediated ECT2 interference on cell viability (nuclei counts) were assayed. Cells were treated with lentivirus containing the shRNA constructs in duplicate assays: one assay plate was treated with puromycin to select for cells that effectively took up lentivirus, since the lentiviral plasmid DNA contains a puromycin resistance gene and puromycin is otherwise toxic to mammalian cells in culture; the other assay plate was not treated with puromycin. Effective puromycin selection can be inferred from the viability of the cells not treated with lentivirus, the ‘No Virus’ control. Cells not treated with lentivirus containing a puromycin resistance gene should be eradicated by puromycin treatment, whereas cells treated with neither lentivirus nor puromycin should demonstrate substantially higher cell viability. This represents an internal experimental control for puromycin selection of cells that effectively took up the lentivirus containing the respective shRNA construct, such that cell viability measures are not confounded by surviving cells that did not take up the shRNA construct. Results from each cell line are depicted in Figure 18a-j. Nuclei counts are normalized to the GFP negative control shRNA, as this shRNA construct is known to have the least effect on cell growth relative to any other control shRNA. In cell lines bearing ECT2 copy number gains (Figure 18a-e; Capan-2, Capan-1, HPAF-II, Panc03.27 and PATU8988S), there appears to be only a weak association between susceptibility to shRNA-mediated interference and presence of ECT2 copy number gain.
In the Capan-2 cell line, which has a focal copy number gain encompassing ECT2, all ECT2 shRNAs appear to diminish cell viability to some extent; however, only the ECT2-1, ECT2-2 and ECT2-5 shRNAs yield statistically significant decreases in cell viability relative to GFP (p=2.55e-5, 0.01, 0.0015, respectively). In contrast, in cell lines Capan-1 and HPAF-II, which also bear focal genomic gains at the ECT2 locus, only the ECT2-1 shRNA significantly decreases cell viability (Capan-1: p=0.0027; HPAF-II: p=0.0011). In the Panc03.27 cell line, which bears a gain from 3q26-qter encompassing ECT2, all ECT2 shRNAs decrease cell viability and this is statistically significant for all shRNAs (p<0.05). Finally, for PATU8988S, only the ECT2-1 and ECT2-2 shRNAs significantly decrease cell viability (p<0.001), but not the other ECT2 shRNAs.
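The normalization of nuclei counts to the GFP control described above can be sketched as follows. This is a minimal illustration, assuming that each shRNA has a list of replicate nuclei counts and that the mean of the GFP replicates is used as the reference; the function name `normalize_to_control` and the example counts are hypothetical and are not data from this study.

```python
from statistics import mean


def normalize_to_control(counts, control="GFP"):
    """Express the mean nuclei count of each shRNA relative to a control shRNA.

    counts maps an shRNA name to a list of replicate nuclei counts.
    A value of 1.0 means the same count as the control; lower values
    mean fewer nuclei than the control.
    """
    control_mean = mean(counts[control])
    if control_mean <= 0:
        raise ValueError("control mean nuclei count must be positive")
    return {name: mean(reps) / control_mean for name, reps in counts.items()}


# Hypothetical replicate counts, for illustration only (not data from this study).
example_counts = {
    "GFP": [1000, 1100],
    "ECT2-1": [210, 210],
    "PSMD1": [50, 55],
}
normalized = normalize_to_control(example_counts)
for name, value in normalized.items():
    print(f"{name}: {value:.2f}")
```

Under this convention, a positive control such as PSMD1 would be expected to yield a normalized count well below 1.0 when selection and knock-down are effective.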

[Figure 18a-e image panels (18a) to (18e). Stained-nuclei images in each panel are labeled ECT2-1, ECT2-2, ECT2-3, ECT2-4, ECT2-5 and GFP.]

Figure 18a-e. Targeted shRNA-mediated interference of ECT2 in PDAC cell lines with focal 3q26 copy number gains. Panels a-e depict results from targeted shRNA analysis in 5 PDAC cell lines bearing focal genomic gains at the 3q26 locus harboring ECT2. Black bars represent normalized nuclei counts in cells treated with the respective shRNA construct and puromycin; bars with diagonal fill represent normalized nuclei counts in cells treated with the respective shRNA construct but no puromycin. ECT2-1 to ECT2-5 are five individual constructs targeting ECT2. LAC-Z, GFP and LUC constructs are negative control shRNAs while PSMD1 and SNRPD1 constructs are positive control shRNAs. Complementary images of stained nuclei are shown in each panel for puromycin-treated cells with ECT2-1 to ECT2-5 shRNAs as well as GFP shRNA (*p<0.05 relative to shGFP; **p<0.001 relative to shGFP).

The cell lines bearing arm-level 3q copy number gains appear to be generally susceptible to ECT2 inhibition by shRNA (Figure 18f-g; MIA PaCa2, KP4). In the MIA PaCa2 cell line, only two shRNAs, ECT2-1 (p<0.001) and ECT2-5 (p=0.0017), reduced cell viability. Similarly, in the KP4 cell line, the ECT2-1 and ECT2-5 shRNAs diminish cell viability (p<0.001) and, in addition, ECT2-2 and ECT2-3 also decrease cell viability (p<0.001). However, it is important to note that in the KP4 cell line, growth was not inhibited by the positive control shRNA constructs, calling the results from this cell line (performed in two technical replicates) into question.

[Figure 18f-g image panels (18f) and (18g). Stained-nuclei images in each panel are labeled ECT2-1, ECT2-2, ECT2-3, ECT2-4, ECT2-5 and GFP.]

Figure 18f-g. Targeted shRNA-mediated interference of ECT2 in PDAC cell lines with arm-level 3q gains. Panels f and g depict results from targeted shRNA analysis in 2 PDAC cell lines with arm-level 3q copy number gains. Black bars represent normalized nuclei counts in cells treated with the respective shRNA construct and puromycin; bars with diagonal fill represent normalized nuclei counts in cells treated with the respective shRNA construct but no puromycin. ECT2-1 to ECT2-5 are five individual constructs targeting ECT2. LAC-Z, GFP and LUC constructs are negative control shRNAs while PSMD1 and SNRPD1 constructs are positive control shRNAs. Complementary images of stained nuclei are shown in each panel for puromycin-treated cells with ECT2-1 to ECT2-5 shRNAs as well as GFP shRNA (*p<0.05 relative to shGFP; **p<0.001 relative to shGFP).

Effects of shRNA-mediated ECT2 interference are generally less pronounced in the cell lines that do not harbor focal ECT2 copy number gains (Figure 18h-j). The Panc08.13 cell line bears a whole chromosome 3 gain and a near triploid genome, and AsPc1 has a diploid chromosome 3. None of the ECT2 shRNAs significantly attenuated cellular growth and viability in either of these cell lines; in both, the ECT2-1 shRNA had the greatest effect on cell viability, but this was not statistically significant. Finally, in the Panc04.03 cell line, which bears a one copy loss of ECT2, only the ECT2-1 shRNA significantly reduced cell viability (p<0.001).

[Figure 18h-j image panels (18h) to (18j). Stained-nuclei images in each panel are labeled ECT2-1, ECT2-2, ECT2-3, ECT2-4, ECT2-5 and GFP.]

Figure 18h-j. Targeted shRNA-mediated interference of ECT2 in PDAC cell lines not bearing ECT2 gains. Panels h-j depict results from targeted shRNA analysis in 3 PDAC cell lines. Black bars represent normalized nuclei counts in cells treated with the respective shRNA construct and puromycin; bars with diagonal fill represent normalized nuclei counts in cells treated with the respective shRNA construct but no puromycin. ECT2-1 to ECT2-5 are five individual constructs targeting ECT2. LAC-Z, GFP and LUC constructs are negative control shRNAs while PSMD1 and SNRPD1 constructs are positive control shRNAs. Complementary images of stained nuclei are shown in each panel for puromycin-treated cells with ECT2-1 to ECT2-5 shRNAs as well as GFP shRNA (*p<0.05 relative to shGFP; **p<0.001 relative to shGFP).

The next analytical step was to formally compare the results from these targeted experiments with the results for ECT2 in the pooled shRNA screen on the same cell lines (Marcotte et al., 2012). ‘Essentiality’ of a gene to cell viability in a pooled shRNA screen is determined by a Gene Activity Rank Profile (GARP) score for each individual gene. The GARP score is derived from the shRNA Activity Rank Profile (shARP) scores for the two shRNAs that most effectively diminish transcript levels over a defined time-course. These top two shRNAs are therefore the ‘fastest drop-outs’ and their shARP scores are averaged to provide the GARP score for the gene. The two fastest drop-out shRNAs targeting a given gene are not necessarily the same two hairpins in all cell lines. In order to directly compare the results from this study to the results from the shRNA pooled screen, the top two shRNA constructs (fastest drop-outs) identified in the screen are utilized for the comparison, as these are the shRNAs that were used in the pooled screen to compute the GARP scores, and measures of essentiality therefore depend on the effects of these two shRNAs on cell viability. As mentioned, the identity of both shRNAs is not the same in all cell lines; however, the fastest drop-out in all cell lines in the shRNA pooled screen, ECT2-1, also demonstrated the strongest effect on cell viability in this study (Figure 19). Importantly, effects on nuclei counts observed with shRNA-1 (ECT2-1) are not always consistent with the effects observed with the second-fastest drop-out, shRNA-2 (Figure 19). In addition, ECT2-1 has strong effects on cell viability in most cell lines, so the average effect of shRNA-1 and shRNA-2 (the GARP score) may be heavily weighted towards effects of shRNA-1.
Data on the ECT2-1 shRNA construct from The RNAi Consortium indicate that this construct knocks down ECT2 least efficiently when compared to the other four ECT2 shRNAs, as measured by percent mRNA remaining after knock-down (Appendix Table A4), and has numerous off-target effects. This suggests that the strong effects of shRNA-1 on cell viability may not result from ECT2 inhibition alone.
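The averaging step described above, and the way a single strong hairpin can dominate it, can be sketched in a few lines. This is an illustrative sketch only and not the implementation of Marcotte et al. (2012): it assumes that a more negative shARP score corresponds to a faster drop-out, and that z-normalization is taken across a reference set of genes. The function names `garp_score` and `z_normalize` and all scores shown are hypothetical.

```python
from statistics import mean, pstdev


def garp_score(sharp_scores):
    """Average the two lowest shARP scores for one gene in one cell line.

    sharp_scores maps a hairpin name to its shARP score. Assumption: a more
    negative shARP score means a faster drop-out.
    Returns the GARP score and the names of the two hairpins used.
    """
    if len(sharp_scores) < 2:
        raise ValueError("at least two hairpins are required")
    top_two = sorted(sharp_scores, key=sharp_scores.get)[:2]
    return mean(sharp_scores[h] for h in top_two), top_two


def z_normalize(garp_by_gene):
    """z-normalize GARP scores across genes (assumed reference set)."""
    values = list(garp_by_gene.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("GARP scores have zero variance")
    return {gene: (score - mu) / sigma for gene, score in garp_by_gene.items()}


# Hypothetical shARP scores, for illustration only (not data from the screen).
example = {"ECT2-1": -4.0, "ECT2-2": -0.5, "ECT2-3": 0.25,
           "ECT2-4": 0.0, "ECT2-5": -1.0}
garp, used = garp_score(example)
print(used, garp)
```

In this hypothetical example the second-fastest hairpin has only a modest score, yet the average remains strongly negative because of the first hairpin, which is the weighting effect discussed in the text.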

Figure 19. Targeted shRNA experiment results. Height of the blue bars represents effects on cell viability with shRNA-1 (ECT2-1), the fastest dropout in all cell lines. Height of the pink bars represents effects on cell viability with the second-fastest dropout shRNA, which is not the same among all cell lines. Annotations of ECT2 copy gain are below the cell line names.

Taken together, these results show that the ECT2-1 shRNA has the most pronounced effect on cell viability in all cell lines tested, and a statistically significant effect in 8 of the 10 cell lines, in comparison to the four other ECT2 shRNAs tested in this study, as well as the next fastest drop-out shRNA in the pooled screen. The implication is that the GARP scores for ECT2 in the pooled screen may be driven by the strong effect of the ECT2-1 shRNA, since these scores are the averages of ECT2-1 and the next fastest drop-out shRNA. Overall, the findings from this study are partially in agreement with results from the pooled screen by Marcotte et al. (Figure 20). That is, in cell lines in which ECT2 was deemed essential in the pooled screen (z-normalized GARP score <-3), this study showed consistent results, indicated by decreased cell viability with shRNA-mediated ECT2 interference.

Figure 20. Comparison of targeted shRNA-mediated ECT2 interference with shRNA pooled screen results. The bottom half of the graph depicts z-normalized GARP scores from the shRNA pooled screen and data is depicted on cell lines having highest ECT2 essentiality to lowest ECT2 essentiality (left to right), as indicated by negative zGARP scores. The top half of the graph depicts the results from this study. The pink bars represent ECT2-1 shRNA-1 in all cell lines, while the green bar represents the next fastest shRNA drop-out construct and is not the same construct across all lines (shRNA-2). Actual values are outlined in the table below the graph.

One of the cell lines tested in the targeted analysis was not included in the pooled shRNA screen (Capan-1), and of the 9 remaining cell lines tested, results from 6 cell lines in this study were generally concordant with the pooled screen results. This indicates that in these 6 cell lines, results from the top two shRNAs can reliably predict essentiality observed in the pooled screen (Table 6; data for which the results of this study were discordant with the pooled screen analysis are highlighted in red). In the three cell lines for which the results from this study are discordant with the results from the pooled screen (HPAF-II, Panc08.13, Panc04.03), it is evident that the pooled screen classification of ECT2 as essential in these lines is heavily weighted by the effects of shRNA-1 (ECT2-1). Interestingly, for the cell line characterized by one copy loss of ECT2, Panc04.03, the pooled screen classified ECT2 as essential; however, while shRNA-1 (ECT2-1) resulted in a statistically significant decrease in cell viability, shRNA-2 resulted in a statistically significant increase in cell viability. Thus, conceivably, the average of these two extreme measures resulted in a GARP score translating into the classification of ECT2 as essential in this cell line, when in fact results from one of the shRNAs suggest that ECT2 knock-down (if completely efficient by this shRNA) may confer a growth advantage in this cell line, rather than reduce cell viability.

Table 6. Comparison of targeted shRNA analysis nuclei counts with results from pooled shRNA screen (shRNA-1 and shRNA-2 counts are results from this study).

Taken together, it is unclear whether ECT2 copy number gains can reliably predict effects of shRNA-mediated ECT2 interference on cell viability. There is a potential trend towards increased essentiality in relation to copy number gains. However, more cell lines must be tested with shRNAs of proven efficiency to more reliably assess the extent of the association between ECT2 copy number gains and increased dependence on ECT2 for cell viability.

2.5.6 Functional Effects of Pharmacological Inhibition of the ECT2 Pathway on Cell Viability

One of the primary goals of this study was to identify a gene that is gained in PDAC, is coordinately highly expressed and is amenable to drug targeting. While no small molecules directly targeting ECT2 currently exist, there are well-characterized small molecule inhibitors of the protein kinase Polo-like kinase 1 (PLK1), which phosphorylates ECT2 and has been reported to be necessary for its downstream cellular activity (Niiya et al., 2006; Petronczki et al., 2007; Wolfe et al., 2009). As such, it was rational to deploy inhibitors of PLK1 to perturb the ECT2-mediated pathway (Figure 21).

Figure 21. Pharmacological modulation of ECT2-mediated oncogenesis. Through chemical inhibition of a necessary upstream regulator, PLK1, the ECT2-mediated pathway can be perturbed.

A panel of 6 cell lines was treated with two PLK1 inhibitors individually (BI-6727 and GSK461364) in 3-fold serial dilutions. Cell viability was measured after drug treatment for 72 hours and normalized to cell viability in DMSO-treated control cells (Figure 22). The cell line Panc04.03, which has a one-copy loss of ECT2, appears to be resistant to treatment with both compounds up to the maximal concentration tested (3 µM). However, no clear distinction in drug sensitivity is apparent between the cell lines with an ECT2 gain (PATU8988S, KP4, HPAF-II, Capan-1) and the cell line AsPc1, in which ECT2 is in the diploid state.
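The dosing scheme and normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions: the top concentration of 3 µM and the 3-fold dilution factor are taken from the text, whereas the number of dilution points (`n_points=8`) is an assumption because it is not stated, and the function names are hypothetical.

```python
def serial_dilution(top_conc_um, fold=3.0, n_points=8):
    """Concentrations (µM) of a serial dilution, highest first."""
    if top_conc_um <= 0 or fold <= 1 or n_points < 1:
        raise ValueError("need top_conc_um > 0, fold > 1 and n_points >= 1")
    return [top_conc_um / fold ** i for i in range(n_points)]


def percent_viability(treated_signal, dmso_signal):
    """Viability of treated cells as a percentage of the DMSO control."""
    if dmso_signal <= 0:
        raise ValueError("DMSO control signal must be positive")
    return 100.0 * treated_signal / dmso_signal


# n_points=8 is an assumption; the number of dilution steps is not stated.
doses = serial_dilution(3.0, fold=3.0, n_points=8)
for dose in doses:
    print(f"{dose:.4f} µM")
```

Plotting percent viability against these concentrations on a logarithmic axis would give dose-response curves of the kind summarized in Figure 22.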

Figure 22. Treatment of PDAC cell lines with PLK1 inhibitors. Cell viability is normalized to that measured in cells treated with only DMSO solvent. The top panel (a) depicts treatment of PDAC cell lines with BI-6727. Panel (b) depicts treatment of PDAC cell lines with GSK461364.

The one notable inference that can be made from these pharmacological studies is that a cell line with a one copy loss of ECT2, Panc04.03, appears to be highly resistant to PLK1 inhibition over the range of concentrations at which other cell lines are susceptible to these compounds. Whether or not this effect is associated with one copy loss of ECT2 requires further investigation. For example, there may be other critical mutations present in this cell line which may contribute to its resistance to PLK1 inhibitor compounds. These questions must be formally addressed before concluding that ECT2 genomic copy number losses confer resistance to PLK1 inhibitors.

There are no obvious differences in susceptibility to PLK1 inhibitors in the ECT2 diploid cell line, in comparison to the cell lines bearing ECT2 gains. In order to reveal such differential susceptibility, if it exists, it is necessary to identify more cell lines in which ECT2 is diploid and assess the effects of PLK1 inhibition on cell viability in these cell lines. In the present study, it is difficult to draw conclusions from only one cell line that is diploid at the ECT2 locus. Interestingly, the overall pattern of cell line drug sensitivity observed for BI-6727 is nearly identical to that of GSK461364 treatment, indicating that susceptibility to PLK1 inhibition across the cell lines is similar for both compounds (Figure 22). For example, Panc04.03 is the most resistant cell line to PLK1 inhibition with both compounds, and KP4 is the most susceptible cell line to PLK1 inhibition with both compounds. These results do, however, indicate that the viability of some PDAC cell lines is substantially diminished by PLK1 inhibitors and that these compounds may be effective in PDAC treatment, although the candidate tumor aberrations that would likely confer most benefit from PLK1 therapeutics remain to be elucidated.

Chapter 3

3 Discussion

3.1 Pooling Data from Genome-Wide Analyses

This study was aimed at utilizing a rational approach to therapeutic target identification through integrated analysis of somatic copy number gains (SCNGs), gene expression, and RNAi analysis. Such an approach towards identifying SCNGs and over-expressed genes in cancer has been successful, as exemplified by the earliest targeted therapies. Identification of gained and over-expressed genes across multiple tumor samples in PDAC can therefore point to potential therapeutic targets for further study. While the initial target identification phase of this study was genome-wide in its design, it is limited by factors inherent to each of the original datasets employed to identify the regions of copy number gain in primary tumors. As outlined in Appendix Table 1, each study utilized a different platform and associated algorithm for calling somatic copy number alterations (SCNAs). A review of copy number alteration detection platforms underscores the inherent variability in detecting SCNAs using different techniques, or the same platform with a different computational algorithm (Pinto et al., 2011). For this reason, pooling results from independent studies for the purpose of creating a unified dataset is not an optimal approach to further assessing SCNAs. In addition, there are a variety of limitations associated with each of the four datasets. The QCMG (n=3) and OICR (n=5) datasets are small. For the QCMG dataset, the small sample size may restrict SCNA detection to alterations that are either relatively rare or highly recurrent. Furthermore, the sample ICGC-ABMP-20090811-04-CD from the QCMG dataset has an abnormally high proportion of gains in its genome (12.85%, while the mean across previously published datasets is ~1.5%).
This study employed sequencing methods that can detect SCNAs with much higher resolution in comparison to array-based platforms; however, the algorithms applied to sequencing data are not as well-established as those for array-based methods. Furthermore, close inspection shows that two of the five samples in the OICR dataset also appear to be associated with peculiarly large gains, as >20% of the genome is gained in each of these two samples. In general, the large size of gains in this dataset may be due to extreme noise spanning a large genomic region, or to the 2-sided Kolmogorov-Smirnov test used to delineate boundaries being too aggressive in merging gained regions. Another potential error may be that the baseline was called too low, so that what is being called a gain or amplification in fact corresponds to normal or baseline intensity. Short of excluding this dataset from the analysis, the extent to which these data influence the final analysis was addressed by selecting genomic gains identified in three out of the four datasets, as previously described. Overall, with these limitations in mind, it appears that the method of selecting regions of genomic gain in at least three of the four studies increases the likelihood that the gains selected for further analyses are true recurrent genomic events, and these regions have been identified as targets of SCNGs in PDAC as well as other cancers.
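The selection rule described above, retaining only gains reported in at least three of the four datasets, can be sketched as a simple count over datasets. This is a deliberately simplified illustration: it treats each gained region as a named locus, whereas the actual analysis would need to compare overlapping genomic intervals. The function name `recurrent_loci`, the dataset names "dataset_C" and "dataset_D", and the loci "locus_X" and "locus_Y" are placeholders, and the calls shown are not data from this study.

```python
from collections import Counter


def recurrent_loci(gains_by_dataset, min_datasets=3):
    """Return loci reported as gained in at least min_datasets datasets."""
    counts = Counter()
    for loci in gains_by_dataset.values():
        counts.update(set(loci))  # count each locus once per dataset
    return sorted(locus for locus, n in counts.items() if n >= min_datasets)


# Hypothetical per-dataset calls, for illustration only. QCMG and OICR are
# named in the text; "dataset_C" and "dataset_D" are placeholder names.
example_gains = {
    "QCMG": {"3q26", "locus_X"},
    "OICR": {"3q26", "locus_X", "locus_Y"},
    "dataset_C": {"3q26", "locus_Y"},
    "dataset_D": {"3q26", "locus_X"},
}
selected = recurrent_loci(example_gains)
print(selected)
```

Requiring support from three datasets means that a gain called only in the two smaller datasets cannot enter the final catalogue on their evidence alone, which is the rationale given in the text.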

3.2 Analysis of Top-Ranked Candidate Genes and Identification of ECT2 as a Putative Target

Since there was no accompanying gene expression data from the same tumor samples utilized in the initial phase of this analysis, expression measures were assessed and corroborated with copy number measures in an independent panel of well-characterized human PDAC cell lines. This was a useful approach because the same cell lines could subsequently be used as tools for laboratory-based study of the candidate genes. In addition, 27 of the same cell lines were characterized in a functional genomics pooled shRNA study. The approach of correlating copy number with gene expression of genes identified as gained or amplified in human tumors was suitable for identifying target genes that can reasonably be studied in functional validation assays and may be potential drivers. Among the 34 top-ranked genes, ECT2 showed the highest expression in PDAC lines and an excellent correlation between copy number and expression. Moreover, unlike many of the other top-ranked genes, there were numerous cell lines (n=16) which bear genomic gains at the 3q26 locus harboring ECT2, allowing for use of multiple biological replicates to model the role of this gene and, most importantly, the extent to which genetic gains of this gene are important to the tumors that harbor them. In the pooled shRNA screen, 7 of the 34 top-ranked genes, VCP, ECT2, RPS15, MELK, RALY, AFG3L2, and WNK1, were determined to be essential to PDAC cell viability. Interestingly, the only gene with a statistically significant positive correlation between copy number, expression, and essentiality in an RNAi screen was ECT2. This gene has been identified as an oncogene, and its genomic amplification and elevated expression have been observed in many other human cancers. There are no known small-molecule chemical inhibitors of the ECT2 protein; however, its role in oncogenic processes and the association between ECT2 copy number gain at the 3q26 locus,
upregulation and essentiality, indicated that this gene may indeed be a suitable target for further study in PDAC. It is also worth mentioning that copy number gains at the 9p13 locus which harbors valosin-containing protein (VCP), do not appear to impart an increase in essentiality and therefore susceptibility to RNAi, but rather, it appears that copy number gains at this locus are associated with decreased essentiality scores. Interestingly, VCP is the most highly essential gene among the top-ranked candidates (mean zGARP = -5.02). It may be the case that at a basal level, VCP is a highly essential gene, and thus effects of increased cellular gene dosage of this gene do not translate to an enhanced proliferative capacity. Another consideration is the essentiality of VCP to normal cellular function, in that VCP function is required for viability of normal cells. Therefore, perhaps because this gene is uniformly essential and targeted knock-down leads to cell death, it would be difficult to observe any positive correlation between cell viability and genomic and transcriptomic measures. Notwithstanding this, there may potentially be clinical utility to targeting VCP in cancer and, as in the case of proteasome inhibitors, a therapeutic window for VCP inhibitors which may hold promise for selective therapeutic targeting of VCP in cancer. The high degree of essentiality of VCP in pancreatic cancer cells suggests that targeting VCP may have therapeutic value and the protein product of VCP has been shown to bind a series of novel small chemical compounds, which are potential anticancer therapeutics (Bursavich et al., 2010). Although only ECT2 from the 34 top-ranked genes was found to have a positive correlation between copy number, expression, and essentiality, there may be other cellular factors that contribute to the extent of essentiality. 
For example, copy number, expression and essentiality may only be proportional up to a certain threshold, beyond which the association between them is no longer direct. A gene may be essential as a result of a low copy gain, and further increases in copy number do not translate into increased essentiality. In this instance, the copy gain may have significance to the role of this gene in tumorigenesis, but this would be missed by our correlation-centered approach. Similarly, RPS15, MELK, AFG3L2, RALY and WNK1 are also highly essential in PDAC and further study of these genes may provide insight into their role in the PDAC neoplastic process. Biological investigation of these genes may be warranted in order to obtain a clearer understanding of the mechanisms that underlie their tumorigenicity, and these genes may be the next best candidates for functional validation.

3.3 Dependence on ECT2 for Cell Viability in Cell Lines Bearing a Genomic Gain at the 3q26 Locus

Data on the essentiality of ECT2 to PDAC cell viability from the shRNA pooled screen by Marcotte et al. showed a promising three-way trend between copy number gain, expression and essentiality. These findings prompted targeted analysis of the effects of shRNA-mediated ECT2 inhibition on cell viability and, importantly, the assessment of whether or not these effects were associated with ECT2 copy number gains, as suggested by the shRNA pooled screen. One of the most marked results from the targeted analyses in this study was the relatively weak concordance of effects on cell viability observed between the top two shRNAs (fastest drop-outs), shRNA-1 (ECT2-1) and shRNA-2, in the cell lines tested. Moreover, there was minimal concordance between effects of ECT2-1 and the other four shRNAs targeting ECT2. The ECT2-1 shRNA is consistently the fastest drop-out in the pooled screen and is the shRNA with the strongest effect on cell viability in all cell lines tested in this study. Essentiality of a gene to cell viability in the pooled shRNA screen is determined by Gene Activity Rank Profile (GARP) scores for each gene. Since GARP scores for genes in the pooled screen are given by the average of the shRNA Activity Rank Profile (shARP) scores of the two fastest drop-out shRNAs, and ECT2-1 is the fastest shRNA drop-out with the strongest effects on cell viability, it is conceivable that the GARP scores in the pooled screen may be heavily weighted by the effect of ECT2-1. This indicates that the ECT2-1 shRNA may be driving essentiality/GARP scores observed for ECT2 in the pooled shRNA screen. These findings prompted investigation of the known properties of the ECT2-1 shRNA construct as indexed in The RNAi Consortium (TRC) library database in order to identify whether the effects of this shRNA are due to efficient ECT2 targeting or other confounding factors.
Data on the ECT2-1 shRNA construct from The RNAi Consortium indicate that this construct knocks down ECT2 least efficiently of the five ECT2 shRNAs, as measured by percent mRNA remaining after knock-down (Appendix Table A4). Most intriguingly, the TRC library reports off-target effects for ECT2-1, but not for the other four shRNAs targeting ECT2. In addition to ECT2, there are 7 other genes that are putative targets of the ECT2-1 shRNA, each with a >76% match of hairpin target sequence to the transcript RNA sequence. Among these potential targets is PSMD1. An shRNA construct targeting PSMD1 was utilized in this study as a positive control because knock-down of this protein is known to have pan-lethal effects on cell lines. These findings indicate that the strong effects on cell viability observed with the ECT2-1 shRNA may not be accounted for by ECT2 knock-down alone, but potentially result from a combination of less-specific knock-down of ECT2 and knock-down of other genes. The clear implication of this analysis is that ECT2 may have been inaccurately characterized as essential to cell viability in certain cell lines in the pooled shRNA screen as a result of the observed effects of ECT2-1. The optimal approach to addressing the problems identified in these shRNA studies is to utilize siRNA or shRNA constructs targeting ECT2 that are both highly specific and rapidly diminish ECT2 transcript levels. Once these constructs are obtained and validated, it will be necessary to confirm specific knock-down of ECT2 at both the transcript and protein levels, as well as to assess whether the effects on cell viability can be rescued with an shRNA-resistant ECT2 clone. Finally, it is worth mentioning that PDAC cell lines differ in their ability to take up shRNA-containing lentivirus. As such, careful titering must be conducted in each cell line tested to ascertain the appropriate volume of lentivirus to be used to infect cells, such that findings are not confounded by lack of lentivirus uptake. Only through these strategies would it be possible to reliably assess the effects of ECT2 knock-down on cell viability and subsequently correlate these observations with the genomic feature of ECT2 copy number gain. Moreover, once an effective tool is developed to facilitate ECT2 knock-down, it would be extremely valuable to assess the effects of targeted ECT2 inhibition not only in pancreatic cancer cell lines bearing genomic gains at the ECT2 locus, but also in normal human epithelial cells, to decipher differential dependency on ECT2 in cancer cells.
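The percent-match criterion for putative off-targets can be sketched as an ungapped sliding-window comparison between a hairpin target sequence and a transcript. How the TRC library actually computes its match score is not described here, so the function below is an assumption made for illustration, and both sequences are hypothetical.

```python
def best_match_fraction(target, transcript):
    """Highest fraction of identical bases between `target` and any
    same-length, ungapped window of `transcript` (illustrative only)."""
    target = target.upper()
    transcript = transcript.upper()
    n = len(target)
    if n == 0 or len(transcript) < n:
        raise ValueError("target must be non-empty and no longer than transcript")
    best = 0
    for start in range(len(transcript) - n + 1):
        window = transcript[start:start + n]
        matches = sum(a == b for a, b in zip(target, window))
        best = max(best, matches)
    return best / n


# Hypothetical sequences: the transcript contains the target with one mismatch.
hairpin_target = "ACGTACGTAC"
other_transcript = "TTTACGTACGAACGG"

fraction = best_match_fraction(hairpin_target, other_transcript)

# Flag the transcript as a putative off-target above the 76% level quoted in the text.
is_putative_off_target = fraction > 0.76
print(fraction, is_putative_off_target)
```

A sequence match of this kind only nominates candidate off-targets; whether a given transcript is actually silenced would still need to be measured at the transcript and protein levels.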

3.4 Differential Sensitivity to Inhibitors of ECT2-Mediated Cellular Pathway in Cell Lines Bearing Genomic Copy Number Gains at the 3q26 Locus

While there are currently no available small molecules that target ECT2 directly, well-characterized inhibitors of PLK1, which phosphorylates ECT2 and is necessary for its downstream signaling, are available. The preliminary results obtained by testing the effects of two PLK1 inhibitors on 6 PDAC cell lines indicate that there may be a trend towards increasing sensitivity to PLK1 inhibition in cell lines with genetic gains of ECT2, but this remains unclear. The cell line Panc04.03 appears uniformly resistant to both PLK1 inhibitors across the range of concentrations at which these compounds affect all other cell lines. While Panc04.03 was characterized in this study by a one-copy loss encompassing ECT2, there may certainly be other genomic or transcriptomic features of this cell line contributing to its resistance to PLK1 inhibition. These must be explored before resistance to PLK1 inhibitors in this cell line can be reasonably attributed to ECT2 loss. On the other hand, the cell line KP4 was the most sensitive to both PLK1 inhibitors across the entire range of concentrations tested. In addition, examination of the effects of shRNA-mediated ECT2 interference demonstrated that this cell line appears to be highly susceptible overall to all shRNAs tested. These findings may be the result of a generalized sensitivity of this cell line to perturbation. Finally, there was no clearly observed differential sensitivity to PLK1 inhibition in the 4 cell lines bearing genomic ECT2 copy number gains in comparison to the one cell line diploid at the ECT2 locus. Notwithstanding the lack of differential activity, all of these cell lines were found to be sensitive to PLK1 inhibition. This indicates that PLK1 inhibitors may have a toxic effect on PDAC cell lines, and the extent of toxicity may or may not be linked to genomic features hypothesized to play a role, such as ECT2 copy number gains. However, more diploid cell lines must be tested and a clear difference in PLK1 inhibitor susceptibility must be demonstrated before a correlation between ECT2 copy number gains and PLK1 inhibitor susceptibility can be established. It is also important to recognize that PLK1 inhibitors are not completely selective for PLK1 over other PLKs (PLK2, PLK3), and can thus potentially affect other cellular pathways (Appendix Table A6). However, the compound GSK461364 has 400-fold selectivity for PLK1 over the other PLKs. In addition, PLK1 itself is known to be involved in other cellular roles, and therefore inhibition of its activity affects not only ECT2 activity but potentially many other proteins involved in the PLK1 cellular signaling network (Strebhardt, 2011). Another factor to take into consideration is the concentration range of the tested compounds.
Additional concentrations of each compound may need to be tested, as there may be an optimal therapeutic range or therapeutic window of efficacy of PLK1 inhibitors in cell lines bearing 3q26 gains. Furthermore, the genetic status of the target of these compounds itself, PLK1, cannot be ignored. Discovery of PLK1 mutations in human cancer prompted the development of PLK1 inhibitors. As such, mutations of PLK1 itself would undoubtedly impact the observed results of PLK1 inhibitors on cell viability. Therefore, PLK1 mutations, as well as gene expression, should be assessed in the tested cell lines. Finally, as with shRNA experiments, pharmacological assays may also be performed on normal epithelial cell lines to gauge the selectivity of PLK1 inhibition for cancer cells over benign tissue.

3.5 Future Directions

3.5.1 Rationale

The ultimate goal of genomic analyses in cancer is to gain insight into the biological mechanisms that underlie cancer development and to develop improved therapeutic options, resulting in more effective treatment and improved patient outcomes. Part of the challenge in cancer therapeutics is that cancer is a heterogeneous disease, as exemplified by the highly variable landscape of somatic mutation types and mutated genes across different tumor types (Gerlinger et al., 2012). Even more strikingly, in PDAC, as in other cancers, primary tumor lesions show different mutation patterns than metastatic lesions in the same patient, and the primary tumors themselves comprise multiple sub-populations of tumor cells (Campbell et al., 2010; Samuel and Hudson, 2011). Tumor genetic and molecular heterogeneity necessitates therapy that is tailored to target the specific genetic aberrations characteristic of the individual tumor. This paradigm of precision medicine, or personalized medicine, is indeed the direction in which genomic cancer research is headed. This study utilized a rational approach to therapeutic target identification for pancreatic ductal adenocarcinoma. Presumably, genes that are targets of genetic gains and found to play a role in the neoplastic process are viable putative therapeutic targets. Moreover, the genetic gain itself represents a clinical biomarker that can be used to predict which patients may benefit from therapy targeting tumors that bear the genetic gain. Through analysis of somatic copy number gains in human PDAC tumor samples in public datasets, it was possible to identify regions of genomic gain that are recurrently observed in the PDAC tumors profiled, as well as the genes mapping to these regions. Using a panel of human PDAC cell lines, it was then possible to identify which genes are also highly expressed in the context of their genomic gain.
This analysis, combined with assessment of how the top-ranked genes performed in a pooled shRNA screen on the same cell lines, suggested that ECT2 might be a promising target for further study. While ECT2 has been characterized and implicated in various human malignancies, it has yet to be identified as a putative oncogene and therapeutic target in PDAC, and thus presents a novel target for further investigation in this disease. It will be necessary to assess the extent to which genetic gains of ECT2 confer a growth advantage in PDACs, as well as the cellular mechanism by which this occurs.

3.5.2 Specific Aims

Targeted shRNA-mediated interference experiments performed in this study are suggestive of a potential trend between ECT2 copy number and essentiality to cell viability. However, to better characterize this association, it is necessary to test more PDAC cell lines, as well as perhaps other cell types such as fibroblasts, to assess the effects of shRNA-mediated interference. Moreover, it is necessary to utilize shRNAs that effectively knock down ECT2 expression, and this must be validated at the mRNA and protein levels. The phenotypic endpoint of shRNA-mediated ECT2 interference in this study was cell viability, as determined by nuclei counts of cells following shRNA treatment. However, given the role of ECT2 in normal cellular processes such as cytokinesis and cell division, ECT2 knock-down may presumably affect migration and invasion of tumor cells as well as lead to defects in cell division. Experiments assaying these phenotypes following ECT2 knock-down are necessary. The cellular mechanism through which ECT2-dependent tumor progression occurs also remains to be elucidated. ECT2 is a GEF for Rho GTPases, and it would be interesting to determine which, if any, Rho GTPases are most dependent on aberrant ECT2 activity for tumor maintenance and progression, as this is yet to be determined. In addition, analysis of the role of ECT2 in PDAC in vivo would be extremely valuable to understand its role in the context of a biological system and potential interactions with the tumor microenvironment. Pharmacologic studies in mice would also be necessary to ascertain the extent to which drug activity in vitro predicts effects on tumor growth in a biological system. Also of importance is a survey of an independent cohort of primary patient PDACs to validate the frequency of ECT2/3q26 copy number gains and potentially correlate this genomic feature with survival and patient outcomes.
Finally, while this study was aimed at identifying therapeutic targets for PDAC, additional causes of the poor clinical success of traditional anticancer agents in PDAC cannot be ignored. Of particular importance are the low tumor vascularity and poor perfusion of PDAC tumors, which limit drug delivery to the tumor site (Wang et al., 2011). These factors must be considered in early studies aimed at molecular target development, since the clinical utility of a target may not be realized if other biological factors are not taken into account.

References

Albertson, D.G. (2006). Gene amplification in cancer. Trends Genet 22, 447-455.

Barretina, J., Caponigro, G., Stransky, N., Venkatesan, K., Margolin, A.A., Kim, S., Wilson, C.J., Lehar, J., Kryukov, G.V., Sonkin, D., et al. (2012). The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483, 603-607.

Baudis, M. (2007). Genomic imbalances in 5918 malignant epithelial tumors: an explorative meta-analysis of chromosomal CGH data. BMC Cancer 7, 226.

Beroukhim, R., Mermel, C.H., Porter, D., Wei, G., Raychaudhuri, S., Donovan, J., Barretina, J., Boehm, J.S., Dobson, J., Urashima, M., et al. (2010). The landscape of somatic copy-number alteration across human cancers. Nature 463, 899-905.

Bignell, G.R., Greenman, C.D., Davies, H., Butler, A.P., Edkins, S., Andrews, J.M., Buck, G., Chen, L., Beare, D., Latimer, C., et al. (2010). Signatures of mutation and selection in the cancer genome. Nature 463, 893-898.

Bork, P., Hofmann, K., Bucher, P., Neuwald, A.F., Altschul, S.F., and Koonin, E.V. (1997). A superfamily of conserved domains in DNA damage-responsive cell cycle checkpoint proteins. FASEB J 11, 68-76.

Brodeur, G.M., and Hogarty, M.D. (1998). Gene amplification in human cancers: biological and clinical significance. In The Genetic Basis of Human Cancer, B. Vogelstein and K.W. Kinzler, eds. (New York: McGraw-Hill), pp. 161-172.

Buchholz, M., Braun, M., Heidenblut, A., Kestler, H.A., Kloppel, G., Schmiegel, W., Hahn, S.A., Luttges, J., and Gress, T.M. (2005). Transcriptome analysis of microdissected pancreatic intraepithelial neoplastic lesions. Oncogene 24, 6626-6636.

Burris, H.A., 3rd, Moore, M.J., Andersen, J., Green, M.R., Rothenberg, M.L., Modiano, M.R., Cripps, M.C., Portenoy, R.K., Storniolo, A.M., Tarassoff, P., et al. (1997). Improvements in survival and clinical benefit with gemcitabine as first-line therapy for patients with advanced pancreas cancer: a randomized trial. J Clin Oncol 15, 2403-2413.

Bursavich, M.G., Parker, D.P., Willardsen, J.A., Gao, Z.H., Davis, T., Ostanin, K., Robinson, R., Peterson, A., Cimbora, D.M., Zhu, J.F., et al. (2010). 2-Anilino-4-aryl-1,3-thiazole inhibitors of valosin-containing protein (VCP or p97). Bioorg Med Chem Lett 20, 1677-1679.

Caldas, C., Hahn, S.A., da Costa, L.T., Redston, M.S., Schutte, M., Seymour, A.B., Weinstein, C.L., Hruban, R.H., Yeo, C.J., and Kern, S.E. (1994). Frequent somatic mutations and homozygous deletions of the p16 (MTS1) gene in pancreatic adenocarcinoma. Nat Genet 8, 27-32.

Calhoun, E.S., Jones, J.B., Ashfaq, R., Adsay, V., Baker, S.J., Valentine, V., Hempen, P.M., Hilgers, W., Yeo, C.J., Hruban, R.H., et al. (2003). BRAF and FBXW7 (CDC4, FBW7, AGO, SEL10) mutations in distinct subsets of pancreatic cancer: potential therapeutic targets. Am J Pathol 163, 1255-1260.

Callebaut, I., and Mornon, J.P. (1997). From BRCA1 to RAP1: a widespread BRCT module closely associated with DNA repair. FEBS Lett 400, 25-30.

Campbell, P.J., Yachida, S., Mudie, L.J., Stephens, P.J., Pleasance, E.D., Stebbings, L.A., Morsberger, L.A., Latimer, C., McLaren, S., Lin, M.L., et al. (2010). The patterns and dynamics of genomic instability in metastatic pancreatic cancer. Nature 467, 1109-1113.

Cheng, J.Q., Ruggeri, B., Klein, W.M., Sonoda, G., Altomare, D.A., Watson, D.K., and Testa, J.R. (1996). Amplification of AKT2 in human pancreatic cells and inhibition of AKT2 expression and tumorigenicity by antisense RNA. Proc Natl Acad Sci U S A 93, 3636-3641.

Chiang, D.Y., Getz, G., Jaffe, D.B., O'Kelly, M.J., Zhao, X., Carter, S.L., Russ, C., Nusbaum, C., Meyerson, M., and Lander, E.S. (2009). High-resolution mapping of copy-number alterations with massively parallel sequencing. Nat Methods 6, 99-103.

Chu, G.C., Kimmelman, A.C., Hezel, A.F., and DePinho, R.A. (2007). Stromal biology of pancreatic cancer. J Cell Biochem 101, 887-907.

Conroy, T., Desseigne, F., Ychou, M., Bouche, O., Guimbaud, R., Becouarn, Y., Adenis, A., Raoul, J.L., Gourgou-Bourgade, S., de la Fouchardiere, C., et al. (2011). FOLFIRINOX versus gemcitabine for metastatic pancreatic cancer. N Engl J Med 364, 1817-1825.

Dechant, R., and Glotzer, M. (2003). Centrosome separation and central spindle assembly act in redundant pathways that regulate microtubule density and trigger cleavage furrow formation. Dev Cell 4, 333-344.

Esteva, F.J., Yu, D., Hung, M.C., and Hortobagyi, G.N. (2010). Molecular predictors of response to trastuzumab and lapatinib in breast cancer. Nat Rev Clin Oncol 7, 98-107.

Fields, A.P., and Justilien, V. (2010). The guanine nucleotide exchange factor (GEF) Ect2 is an oncogene in human cancer. Adv Enzyme Regul 50, 190-200.

Fu, B., Luo, M., Lakkur, S., Lucito, R., and Iacobuzio-Donahue, C.A. (2008). Frequent genomic copy number gain and overexpression of GATA-6 in pancreatic carcinoma. Cancer Biol Ther 7, 1593-1601.

Garraway, L.A., Widlund, H.R., Rubin, M.A., Getz, G., Berger, A.J., Ramaswamy, S., Beroukhim, R., Milner, D.A., Granter, S.R., Du, J., et al. (2005). Integrative genomic analyses identify MITF as a lineage survival oncogene amplified in malignant melanoma. Nature 436, 117-122.

GeneCards (2011). GeneCards v3.

Gerlinger, M., Rowan, A.J., Horswell, S., Larkin, J., Endesfelder, D., Gronroos, E., Martinez, P., Matthews, N., Stewart, A., Tarpey, P., et al. (2012). Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N Engl J Med 366, 883-892.

Hahn, S.A., Schutte, M., Hoque, A.T., Moskaluk, C.A., da Costa, L.T., Rozenblum, E., Weinstein, C.L., Fischer, A., Yeo, C.J., Hruban, R.H., et al. (1996). DPC4, a candidate tumor suppressor gene at human chromosome 18q21.1. Science 271, 350-353.

Harada, T., Chelala, C., Crnogorac-Jurcevic, T., and Lemoine, N.R. (2009). Genome-wide analysis of pancreatic cancer using microarray-based techniques. Pancreatology 9, 13-24.

Harris, T.J., and McCormick, F. (2010). The molecular pathology of cancer. Nat Rev Clin Oncol 7, 251-265.

Haverty, P.M., Hon, L.S., Kaminker, J.S., Chant, J., and Zhang, Z. (2009). High-resolution analysis of copy number alterations and associated expression changes in ovarian tumors. BMC Med Genomics 2, 21.

Heidenblad, M., Schoenmakers, E.F., Jonson, T., Gorunova, L., Veltman, J.A., van Kessel, A.G., and Hoglund, M. (2004). Genome-wide array-based comparative genomic hybridization reveals multiple amplification targets and novel homozygous deletions in pancreatic carcinoma cell lines. Cancer Res 64, 3052-3059.

Hermann, P.C., Huber, S.L., Herrler, T., Aicher, A., Ellwart, J.W., Guba, M., Bruns, C.J., and Heeschen, C. (2007). Distinct populations of cancer stem cells determine tumor growth and metastatic activity in human pancreatic cancer. Cell Stem Cell 1, 313-323.

Heselmeyer, K., Macville, M., Schrock, E., Blegen, H., Hellstrom, A.C., Shah, K., Auer, G., and Ried, T. (1997). Advanced-stage cervical carcinomas are defined by a recurrent pattern of chromosomal aberrations revealing high genetic instability and a consistent gain of chromosome arm 3q. Genes Chromosomes Cancer 19, 233-240.

Hirata, D., Yamabuki, T., Miki, D., Ito, T., Tsuchiya, E., Fujita, M., Hosokawa, M., Chayama, K., Nakamura, Y., and Daigo, Y. (2009). Involvement of epithelial cell transforming sequence-2 oncoantigen in lung and esophageal cancer progression. Clin Cancer Res 15, 256-266.

Holzmann, K., Kohlhammer, H., Schwaenen, C., Wessendorf, S., Kestler, H.A., Schwoerer, A., Rau, B., Radlwimmer, B., Dohner, H., Lichter, P., et al. (2004). Genomic DNA-chip hybridization reveals a higher incidence of genomic amplifications in pancreatic cancer than conventional comparative genomic hybridization and leads to the identification of novel candidate genes. Cancer Res 64, 4428-4433.

Hong, S.M., Li, A., Olino, K., Wolfgang, C.L., Herman, J.M., Schulick, R.D., Iacobuzio-Donahue, C., Hruban, R.H., and Goggins, M. (2011a). Loss of E-cadherin expression and outcome among patients with resectable pancreatic adenocarcinomas. Mod Pathol 24, 1237-1247.

Hong, S.M., Park, J.Y., Hruban, R.H., and Goggins, M. (2011b). Molecular signatures of pancreatic cancer. Arch Pathol Lab Med 135, 716-727.

Hong, S.P., Wen, J., Bang, S., Park, S., and Song, S.Y. (2009). CD44-positive cells are responsible for gemcitabine resistance in pancreatic cancer cells. Int J Cancer 125, 2323-2331.

Hruban, R.H., Goggins, M., Parsons, J., and Kern, S.E. (2000). Progression model for pancreatic cancer. Clin Cancer Res 6, 2969-2972.

ICGC (2010). International Cancer Genome Consortium. In Version 5.

Jemal, A., Siegel, R., Xu, J., and Ward, E. (2010). Cancer statistics, 2010. CA Cancer J Clin 60, 277-300.

Justilien, V., and Fields, A.P. (2009). Ect2 links the PKCiota-Par6alpha complex to Rac1 activation and cellular transformation. Oncogene 28, 3597-3607.

Kim, J.E., Billadeau, D.D., and Chen, J. (2005). The tandem BRCT domains of Ect2 are required for both negative and positive regulation of Ect2 in cytokinesis. J Biol Chem 280, 5733-5739.

Kitoh, H., Ryozawa, S., Harada, T., Kondoh, S., Furuya, T., Kawauchi, S., Oga, A., Okita, K., and Sasaki, K. (2005). Comparative genomic hybridization analysis for pancreatic cancer specimens obtained by endoscopic ultrasonography-guided fine-needle aspiration. J Gastroenterol 40, 511-517.

Klimstra, D.S., and Longnecker, D.S. (1994). K-ras mutations in pancreatic ductal proliferative lesions. Am J Pathol 145, 1547-1550.

Kloppel, G. (1998). Clinicopathologic view of intraductal papillary-mucinous tumor of the pancreas. Hepatogastroenterology 45, 1981-1985.

Knox, C., Law, V., Jewison, T., Liu, P., Ly, S., Frolkis, A., Pon, A., Banco, K., Mak, C., Neveu, V., et al. (2011). DrugBank 3.0: a comprehensive resource for 'omics' research on drugs. Nucleic Acids Res 39, D1035-1041.

Li, A., Liu, Z., Lezon-Geyda, K., Sarkar, S., Lannin, D., Schulz, V., Krop, I., Winer, E., Harris, L., and Tuck, D. (2011). GPHMM: an integrated hidden Markov model for identification of copy number alteration and loss of heterozygosity in complex tumor samples using whole genome SNP arrays. Nucleic Acids Res 39, 4928-4941.

Li, D., Xie, K., Wolff, R., and Abbruzzese, J.L. (2004). Pancreatic cancer. Lancet 363, 1049-1057.

Lin, L., Wang, Z., Prescott, M.S., van Dekken, H., Thomas, D.G., Giordano, T.J., Chang, A.C., Orringer, M.B., Gruber, S.B., Moran, J.V., et al. (2006). Multiple forms of genetic instability within a 2-Mb chromosomal segment of 3q26.3-q27 are associated with development of esophageal adenocarcinoma. Genes Chromosomes Cancer 45, 319-331.

Lin, S.M., Du, P., Huber, W., and Kibbe, W.A. (2008). Model-based variance-stabilizing transformation for Illumina microarray data. Nucleic Acids Res 36, e11.

Liu, X.F., Ishida, H., Raziuddin, R., and Miki, T. (2004). Nucleotide exchange factor ECT2 interacts with the polarity protein complex Par6/Par3/protein kinase Czeta (PKCzeta) and regulates PKCzeta activity. Mol Cell Biol 24, 6665-6675.

Loukopoulos, P., Shibata, T., Katoh, H., Kokubu, A., Sakamoto, M., Yamazaki, K., Kosuge, T., Kanai, Y., Hosoda, F., Imoto, I., et al. (2007). Genome-wide array-based comparative genomic hybridization analysis of pancreatic adenocarcinoma: identification of genetic indicators that predict patient outcome. Cancer Sci 98, 392-400.

Lynch, H.T., Smyrk, T., Kern, S.E., Hruban, R.H., Lightdale, C.J., Lemon, S.J., Lynch, J.F., Fusaro, L.R., Fusaro, R.M., and Ghadirian, P. (1996). Familial pancreatic cancer: a review. Semin Oncol 23, 251-275.

Maitra, A., and Hruban, R.H. (2008). Pancreatic cancer. Annu Rev Pathol 3, 157-188.

Marcotte, R., Brown, K.R., Suarez, F., Sayad, A., Karamboulas, K., Kryzankowski, P.M., Sircoulomb, F., Medrano, M., Fedyshyn, Y., Koh, J.L.Y., et al. (2012). Essential Gene Profiles in Breast, Pancreatic, and Ovarian Cancer Cells. Cancer Discovery 2, 172.

Meyer, S., Fergusson, W.D., Whetton, A.D., Moreira-Leite, F., Pepper, S.D., Miller, C., Saunders, E.K., White, D.J., Will, A.M., Eden, T., et al. (2007). Amplification and translocation of 3q26 with overexpression of EVI1 in Fanconi anemia-derived childhood acute myeloid leukemia with biallelic FANCD1/BRCA2 disruption. Genes Chromosomes Cancer 46, 359-372.

Miki, T., Smith, C.L., Long, J.E., Eva, A., and Fleming, T.P. (1993). Oncogene ect2 is related to regulators of small GTP-binding proteins. Nature 362, 462-465.

Moffat, J., Grueneberg, D.A., Yang, X., Kim, S.Y., Kloepfer, A.M., Hinkle, G., Piqani, B., Eisenhaure, T.M., Luo, B., Grenier, J.K., et al. (2006). A lentiviral RNAi library for human and mouse genes applied to an arrayed viral high-content screen. Cell 124, 1283-1298.

Moore, M.J., Goldstein, D., Hamm, J., Figer, A., Hecht, J.R., Gallinger, S., Au, H.J., Murawa, P., Walde, D., Wolff, R.A., et al. (2007). Erlotinib plus gemcitabine compared with gemcitabine alone in patients with advanced pancreatic cancer: a phase III trial of the National Cancer Institute of Canada Clinical Trials Group. J Clin Oncol 25, 1960-1966.

NCI/NCBI (2001). NCI and NCBI's SKY/M-FISH Database.

Neesse, A., Michl, P., Frese, K.K., Feig, C., Cook, N., Jacobetz, M.A., Lolkema, M.P., Buchholz, M., Olive, K.P., Gress, T.M., et al. (2011). Stromal biology and therapy in pancreatic cancer. Gut 60, 861-868.

Niiya, F., Tatsumoto, T., Lee, K.S., and Miki, T. (2006). Phosphorylation of the cytokinesis regulator ECT2 at G2/M phase stimulates association of the mitotic kinase Plk1 and accumulation of GTP-bound RhoA. Oncogene 25, 827-837.

Oceguera-Yanez, F., Kimura, K., Yasuda, S., Higashida, C., Kitamura, T., Hiraoka, Y., Haraguchi, T., and Narumiya, S. (2005). Ect2 and MgcRacGAP regulate the activation and function of Cdc42 in mitosis. J Cell Biol 168, 221-232.

Olive, K.P., Jacobetz, M.A., Davidson, C.J., Gopinathan, A., McIntyre, D., Honess, D., Madhu, B., Goldgraben, M.A., Caldwell, M.E., Allard, D., et al. (2009). Inhibition of Hedgehog signaling enhances delivery of chemotherapy in a mouse model of pancreatic cancer. Science 324, 1457-1461.

Olshen, A.B., Venkatraman, E.S., Lucito, R., and Wigler, M. (2004). Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 5, 557-572.

Petronczki, M., Glotzer, M., Kraut, N., and Peters, J.M. (2007). Polo-like kinase 1 triggers the initiation of cytokinesis in human cells by promoting recruitment of the RhoGEF Ect2 to the central spindle. Dev Cell 12, 713-725.

Pinkel, D., and Albertson, D.G. (2005). Array comparative genomic hybridization and its applications in cancer. Nat Genet 37 Suppl, S11-17.

Pinto, D., Darvishi, K., Shi, X., Rajan, D., Rigler, D., Fitzgerald, T., Lionel, A.C., Thiruvahindrapuram, B., Macdonald, J.R., Mills, R., et al. (2011). Comprehensive assessment of array-based platforms and calling algorithms for detection of copy number variants. Nat Biotechnol 29, 512-520.

Prokopenko, S.N., Brumby, A., O'Keefe, L., Prior, L., He, Y., Saint, R., and Bellen, H.J. (1999). A putative exchange factor for Rho1 GTPase is required for initiation of cytokinesis in Drosophila. Genes Dev 13, 2301-2314.

Rachagani, S., Senapati, S., Chakraborty, S., Ponnusamy, M.P., Kumar, S., Smith, L.M., Jain, M., and Batra, S.K. (2011). Activated KrasG12D is associated with invasion and metastasis of pancreatic cancer cells through inhibition of E-cadherin. Br J Cancer 104, 1038-1048.

Rozenblum, E., Schutte, M., Goggins, M., Hahn, S.A., Panzer, S., Zahurak, M., Goodman, S.N., Sohn, T.A., Hruban, R.H., Yeo, C.J., et al. (1997). Tumor-suppressive pathways in pancreatic carcinoma. Cancer Res 57, 1731-1734.

Ruggeri, B., Zhang, S.Y., Caamano, J., DiRado, M., Flynn, S.D., and Klein-Szanto, A.J. (1992). Human pancreatic carcinomas and cell lines reveal frequent and multiple alterations in the p53 and Rb-1 tumor-suppressor genes. Oncogene 7, 1503-1511.

Ruggeri, B.A., Huang, L., Wood, M., Cheng, J.Q., and Testa, J.R. (1998). Amplification and overexpression of the AKT2 oncogene in a subset of human pancreatic ductal adenocarcinomas. Mol Carcinog 21, 81-86.

Russ, A.P., and Lampel, S. (2005). The druggable genome: an update. Drug Discov Today 10, 1607-1610.

Saito, S., Liu, X.F., Kamijo, K., Raziuddin, R., Tatsumoto, T., Okamoto, I., Chen, X., Lee, C.C., Lorenzi, M.V., Ohara, N., et al. (2004). Deregulation and mislocalization of the cytokinesis regulator ECT2 activate the Rho signaling pathways leading to malignant transformation. J Biol Chem 279, 7169-7179.

Saito, S., Tatsumoto, T., Lorenzi, M.V., Chedid, M., Kapoor, V., Sakata, H., Rubin, J., and Miki, T. (2003). Rho exchange factor ECT2 is induced by growth factors and regulates cytokinesis through the N-terminal cell cycle regulator-related domains. J Cell Biochem 90, 819-836.

Salhia, B., Tran, N.L., Chan, A., Wolf, A., Nakada, M., Rutka, F., Ennis, M., McDonough, W.S., Berens, M.E., Symons, M., et al. (2008). The guanine nucleotide exchange factors trio, Ect2, and Vav3 mediate the invasive behavior of glioblastoma. Am J Pathol 173, 1828-1838.

Samuel, N., and Hudson, T.J. (2011). The molecular and cellular heterogeneity of pancreatic ductal adenocarcinoma. Nat Rev Gastroenterol Hepatol 9, 77-87.

Sano, M., Genkai, N., Yajima, N., Tsuchiya, N., Homma, J., Tanaka, R., Miki, T., and Yamanaka, R. (2006). Expression level of ECT2 proto-oncogene correlates with prognosis in glioma patients. Oncol Rep 16, 1093-1098.

Schumacher, S., Gryzik, T., Tannebaum, S., and Muller, H.A. (2004). The RhoGEF Pebble is required for cell shape changes during cell migration triggered by the Drosophila FGF receptor Heartless. Development 131, 2631-2640.

Shah, A.N., Summy, J.M., Zhang, J., Park, S.I., Parikh, N.U., and Gallick, G.E. (2007). Development and characterization of gemcitabine-resistant pancreatic tumor cells. Ann Surg Oncol 14, 3629-3637.

Sophic (2012). Sophic Druggable Genome, S.S.A. Inc., ed.

Stratton, M.R. (2011). Exploring the genomes of cancer cells: progress and promise. Science 331, 1553-1558.

Strebhardt, K. (2011). Multifaceted polo-like kinases: drug targets and antitargets for cancer therapy. Nat Rev Drug Discov 9, 643-660.

Sun, Y. (2006). E3 ubiquitin ligases as cancer targets and biomarkers. Neoplasia 8, 645-654.

Suzuki, A., Shibata, T., Shimada, Y., Murakami, Y., Horii, A., Shiratori, K., Hirohashi, S., Inazawa, J., and Imoto, I. (2008). Identification of SMURF1 as a possible target for 7q21.3-22.1 amplification detected in a pancreatic cancer cell line by in-house array-based comparative genomic hybridization. Cancer Sci 99, 986-994.

Tatsumoto, T., Sakata, H., Dasso, M., and Miki, T. (2003). Potential roles of the nucleotide exchange factor ECT2 and Cdc42 GTPase in spindle assembly in Xenopus egg cell-free extracts. J Cell Biochem 90, 892-900.

Tatsumoto, T., Xie, X., Blumenthal, R., Okamoto, I., and Miki, T. (1999). Human ECT2 is an exchange factor for Rho GTPases, phosphorylated in G2/M phases, and involved in cytokinesis. J Cell Biol 147, 921-928.

Thompson, L.H., Brookman, K.W., Jones, N.J., Allen, S.A., and Carrano, A.V. (1990). Molecular cloning of the human XRCC1 gene, which corrects defective DNA strand break repair and sister chromatid exchange. Mol Cell Biol 10, 6160-6171.

Uhlen, M., Oksvold, P., Fagerberg, L., Lundberg, E., Jonasson, K., Forsberg, M., Zwahlen, M., Kampf, C., Wester, K., Hober, S., et al. (2010). Towards a knowledge-based Human Protein Atlas. Nat Biotechnol 28, 1248-1250.

Villarroel, M.C., Rajeshkumar, N.V., Garrido-Laguna, I., De Jesus-Acosta, A., Jones, S., Maitra, A., Hruban, R.H., Eshleman, J.R., Klein, A., Laheru, D., et al. (2011). Personalizing cancer treatment in the age of global genomic analyses: PALB2 gene mutations and the response to DNA damaging agents in pancreatic cancer. Mol Cancer Ther 10, 3-8.

Vogelstein, B., and Kinzler, K.W. (2004). Cancer genes and the pathways they control. Nat Med 10, 789-799.

Volik, S., Zhao, S., Chin, K., Brebner, J.H., Herndon, D.R., Tao, Q., Kowbel, D., Huang, G., Lapuk, A., Kuo, W.L., et al. (2003). End-sequence profiling: sequence-based analysis of aberrant genomes. Proc Natl Acad Sci U S A 100, 7696-7701.

Wallrapp, C., Muller-Pillasch, F., Solinas-Toldo, S., Lichter, P., Friess, H., Buchler, M., Fink, T., Adler, G., and Gress, T.M. (1997). Characterization of a high copy number amplification at 6q24 in pancreatic cancer identifies c-myb as a candidate oncogene. Cancer Res 57, 3135-3139.

Wang, Z., Li, Y., Ahmad, A., Banerjee, S., Azmi, A.S., Kong, D., and Sarkar, F.H. (2011). Pancreatic cancer: understanding and overcoming chemoresistance. Nat Rev Gastroenterol Hepatol 8, 27-33.

Wang, Z., Li, Y., Kong, D., Banerjee, S., Ahmad, A., Azmi, A.S., Ali, S., Abbruzzese, J.L., Gallick, G.E., and Sarkar, F.H. (2009). Acquisition of epithelial-mesenchymal transition phenotype of gemcitabine-resistant pancreatic cancer cells is linked with activation of the notch signaling pathway. Cancer Res 69, 2400-2407.

Wolfe, B.A., Takaki, T., Petronczki, M., and Glotzer, M. (2009). Polo-like kinase 1 directs assembly of the HsCyk-4 RhoGAP/Ect2 RhoGEF complex to initiate cleavage furrow formation. PLoS Biol 7, e1000110.

Yang, Y.L., Chu, J.Y., Luo, M.L., Wu, Y.P., Zhang, Y., Feng, Y.B., Shi, Z.Z., Xu, X., Han, Y.L., Cai, Y., et al. (2008). Amplification of PRKCI, located in 3q26, is associated with lymph node metastasis in esophageal squamous cell carcinoma. Genes Chromosomes Cancer 47, 127-136.

Yen, C.C., Chen, Y.J., Pan, C.C., Lu, K.H., Chen, P.C., Hsia, J.Y., Chen, J.T., Wu, Y.C., Hsu, W.H., Wang, L.S., et al. (2005). Copy number changes of target genes in chromosome 3q25.3-qter of esophageal squamous cell carcinoma: TP63 is amplified in early carcinogenesis but down-regulated as disease progressed. World J Gastroenterol 11, 1267-1272.

Yeo, T.P., Hruban, R.H., Leach, S.D., Wilentz, R.E., Sohn, T.A., Kern, S.E., Iacobuzio-Donahue, C.A., Maitra, A., Goggins, M., Canto, M.I., et al. (2002). Pancreatic cancer. Curr Probl Cancer 26, 176-275.

Yildirim, M.A., Goh, K.I., Cusick, M.E., Barabasi, A.L., and Vidal, M. (2007). Drug-target network. Nat Biotechnol 25, 1119-1126.

Zender, L., Spector, M.S., Xue, W., Flemming, P., Cordon-Cardo, C., Silke, J., Fan, S.T., Luk, J.M., Wigler, M., Hannon, G.J., et al. (2006). Identification and validation of oncogenes in liver cancer using an integrative oncogenomic approach. Cell 125, 1253-1267.

Zhang, M.L., Lu, S., Zhou, L., and Zheng, S.S. (2008). Correlation between ECT2 gene expression and methylation change of ECT2 promoter region in pancreatic cancer. Hepatobiliary Pancreat Dis Int 7, 533-538.

Zhong, Y., Wang, Z., Fu, B., Pan, F., Yachida, S., Dhara, M., Albesiano, E., Li, L., Naito, Y., Vilardell, F., et al. (2011). GATA6 activates Wnt signaling in pancreatic cancer by negatively regulating the Wnt antagonist Dickkopf-1. PLoS ONE 6, e22129.


Appendices

Table A1. Focal somatic copy number gains in pancreatic ductal adenocarcinoma in the literature.

Fields for each study: Citation; Tissue/Sample Type; Number of Samples Studied; Summary of Chromosomal Regions Amplified; Genes Affected; Additional Notes.

Citation: Mahlamaki EH, et al. Frequent amplification of 8q24, 11q, 17q and 20q-specific genes in pancreatic cancer. Genes, Chromosomes & Cancer, 2002. 35:353-8.
Tissue/Sample Type: Pancreatic cancer cell lines
Number of Samples Studied: 31
Chromosomal Regions Amplified: See Table 1. Among the most frequently gained: 8q, 11q, 17q, 20q
Genes Affected: MYC, CCND1, ERBB2, TBX2, BIRC5, BCL2L1, NCOA6, NCOA3, MYBL2, PTPN1, ZNF217, ck20.10e9, STK15, CTSZ
Additional Notes: CGH and FISH to identify recurrent genetic changes in pancreatic cancer cell lines. Evaluated copy number changes of selected genes from the four regions by interphase FISH in 30 pancreatic cell lines.

Citation: Heidenblad M, et al. Genome-wide array-based comparative genomic hybridization reveals multiple amplification targets and novel homozygous deletions in pancreatic carcinoma cell lines. Cancer Research, 2004. 64:3052-9.
Tissue/Sample Type: Pancreatic carcinoma cell lines
Number of Samples Studied: 31
Chromosomal Regions Amplified: See Table 1. 60 amplicons at 32 different locations: 8q (8 cases), 12p (7 cases), 7q (5 cases), 18q (5 cases), 19q (5 cases), 6p (4 cases), 8p (4 cases). Regions most frequently involved in amplifications: 6p21-22, 7q21-31, 8p11-12, 8q23-24, 12p11-12, 18q11-12, 19q13.2
Genes Affected: DAD-R, SOX5, EK11, CCND3, HGFR (MET), AKT2
Additional Notes: FISH-verified BAC clones and cDNA clones. Amplicons ranged from 0.4-38.1 Mb, average 8.4 Mb, median 4.5 Mb. 18q amplifications were close to/at a deletion breakpoint.

Citation: Mahlamaki EH, et al. High resolution genomic and expression profiling reveals 105 putative amplification target genes in pancreatic cancer. Neoplasia, 2004. 6(5):432-439.
Tissue/Sample Type: Pancreatic cancer cell lines
Number of Samples Studied: 13
Chromosomal Regions Amplified: See Table 1 – 24 independent amplicons. Amplicons ranged in size from 13kb to 11Mb
Genes Affected: 105 genes (Table 2); PAK4
Additional Notes: CGH on cDNA microarray to identify gene expression change events that were associated with gene copy number alterations. (Varying clone densities; average resolution of 300kb throughout the genome)

Citation: Gysin S, et al. Analysis of genomic DNA alterations and mRNA expression patterns in a panel of human pancreatic cancer cell lines. Genes, Chromosomes & Cancer, 2005. 44(1):37-51.
Tissue/Sample Type: Pancreatic cell lines (derived from metastatic/primary tumor)
Number of Samples Studied: 25
Chromosomal Regions Amplified: See Table 3
Genes Affected: BASP1, EBF, TNF, MRSA, MYC, CCND1, BIRC3, TRIM29, KRAS, LOC81558, AKT2, VRK2, NCOA3

Citation: Nowak NJ, et al. Genome-wide aberrations in pancreatic adenocarcinoma. Cancer Genetics and Cytogenetics, 2005. 161:36-50.
Tissue/Sample Type: 17 first passage xenografts and 16 pancreatic cell lines
Number of Samples Studied: 33
Chromosomal Regions Amplified: Recurrent gains: 7p21.1-p11.2, 7q21.32, 7q33, 8q1.1-q24, 11p13, 14q22.2, 20-12.2, 20q11.23-q13.3
Genes Affected: See Table 2 and Table 3
Additional Notes: Distinguish differences between cell line and xenograft aberration profiles.

Citation: Heidenblad M, et al. Microarray analyses reveal strong influence of DNA copy number alterations on the transcriptional patterns in pancreatic cancer: implications for the interpretation of genomic amplifications. Oncogene, 2005. 24:1794-1801.
Tissue/Sample Type: Pancreatic carcinoma cell lines
Number of Samples Studied: 29
Chromosomal Regions Amplified: 67 recurrently over-expressed genes located in 7 precisely mapped commonly amplified regions. Two most frequently amplified regions in pancreatic cancer: 8q23-24, 12p11-12
Genes Affected: FLJ12760, TLK2
Additional Notes: Expression profiling analysis using cDNA microarrays and corroborating with genomic profiling data (Heidenblad, 2004). More than one putative target may be of importance in pancreatic cancer amplicons.

Citation: Harada T, et al. Identification of genetic alterations in pancreatic cancer by the combined use of tissue microdissection and array-based comparative genomic hybridization. British Journal of Cancer, 2007. 96:373-382.
Tissue/Sample Type: Microdissected PDAC tissue samples, consisting of purified populations of cancer cells
Number of Samples Studied: 23
Chromosomal Regions Amplified: 7p and 18q
Genes Affected: IQCE, TRIAD3, PMS2, EIF2AK1, PSCD3, EIF2AK1, RAC1, PSCD3, CIGALT1, GLCCI1, ICA1, ETC1, DGKB, SNX13, 7A5, DNAH11, STK31, tcag7.981, CREB5

Citation: Harada T, et al. Genome-wide DNA copy number analysis in pancreatic cancer using high-density single nucleotide polymorphism arrays. Oncogene, 2008. 27:1951-1960.
Tissue/Sample Type: Micro-dissected PDAC specimens
Number of Samples Studied: 27
Chromosomal Regions Amplified: Frequent gains (>78% of cases): 1q, 2, 3, 5, 7p, 8q, 11, 14q, 17q. See Table 1 for list of “frequent gene CNs”
Genes Affected: SKAP2/SCAP2 gene (7p15.2) – most frequently amplified (63%)
Additional Notes: High-density microarrays representing 116 000 SNP loci

Citation: Kikuchi S, et al. Expression and gene amplification of actinin-4 in invasive ductal carcinoma of the pancreas. Clinical Cancer Research, 2008. 14(17):5348-56.
Tissue/Sample Type: Tissue samples from invasive pancreatic ductal adenocarcinoma
Number of Samples Studied: 173
Chromosomal Regions Amplified: 19q13.1-2. Amplification of the ACTN4 gene was detected in 11/29 cases showing increased expression
Genes Affected: ACTN4
Additional Notes: CN was calculated by FISH, expression was knocked down by shRNA, and tumorigenicity was evaluated by orthotopic implantation into SCID mice.

Citation: Suzuki A, et al. Identification of SMURF1 as a possible target for 7q21.3-22.1 amplification detected in a pancreatic cancer cell line by in-house array-based comparative genomic hybridization. Cancer Science, 2008. 99(5):986-994.
Tissue/Sample Type: Pancreatic cancer cell lines
Number of Samples Studied: 24
Chromosomal Regions Amplified: 7q21.3-22.1
Genes Affected: SMURF1. May work as a growth-promoting gene and a good therapeutic target

Citation: Fu B, et al. Frequent genomic copy number gain and over-expression of GATA-6 in pancreatic carcinoma. Cancer Biology and Therapy, 2008. 7(10):1593-601.
Tissue/Sample Type: Pancreatic cancer xenografts
Number of Samples Studied: 42
Chromosomal Regions Amplified: 18q11.2
Genes Affected: GATA-6, cTAGE1
Additional Notes: Representational Oligonucleotide Microarray Analysis (ROMA), validated using FISH, qPCR, Western and immunohistochemical staining.

Citation: Lin LJ, et al. Integrated analysis of copy number alterations and loss of heterozygosity in human pancreatic cancer using a high-resolution, single nucleotide polymorphism array. Oncology, 2008. 75(1-2):102-12.
Tissue/Sample Type: Pancreatic cancer cell lines/micro-dissected tissue specimens
Number of Samples Studied: 25/14
Chromosomal Regions Amplified: 23 amplified regions in at least 2 cell lines, including 8 unreported loci. See Table 2 for amplifications in cell lines and Table 3 for amplifications in patient samples. Size of minimal common amplification was at 13q22.2. Most frequently amplified region: 12p12.1-12p11.23; also listed: 3q25.1, 5p15.2, 8q24.21, 11q14.1-2, 11q22.1-3, 14q11.2, 19q13.2
Genes Affected: ARID4B. Genes at newly identified loci: ARID4B, COL4A3, COLA4, WWTR1, TRIP, DNAH5, TNKS2, MAML2, TBC1D4, RAB8A
Additional Notes: Screened genome-wide copy number alterations and LOH simultaneously in pancreatic cancer cell lines using an SNP array and validated the amplifications and LOH in primary pancreatic cancer tissue.

Citation: Chen S, et al. Copy number alterations in pancreatic cancer identify recurrent PAK4 amplification. Cancer Biology Therapy, 2008. 7(11):1793-802.
Tissue/Sample Type: Pancreatic adenocarcinomas
Number of Samples Studied: 72
Chromosomal Regions Amplified: 19q13
Genes Affected: PAK4
Additional Notes: Mechanism relies on KRAS2 activation/genomic amplification to activate PAK4. Complete data set publicly available.

Citation: Harada T, et al. Genome-wide analysis of pancreatic cancer using microarray-based techniques. Pancreatology, 2009. 9(1-2):13-24.
Tissue/Sample Type: Cell lines/micro-dissected tissue specimens
Number of Samples Studied: 6/23
Chromosomal Regions Amplified: See Table 2
Genes Affected: See Table 2
Additional Notes: 1-Mb-spaced CGH arrays, then assessed transcript levels in regions of genetic alterations using Pancreatic Expression Database.

Citation: Laurila E, et al. Characterization of the 7q21-q22 amplicon identified ARPC1A, a subunit of the Arp2/3 complex, as a regulator of cell migration and invasion in pancreatic cancer. Genes Chromosomes Cancer, 2009. 48(4):330-9.
Tissue/Sample Type: Pancreatic cancer cell lines/primary pancreatic tumors
Number of Samples Studied: 16/29
Chromosomal Regions Amplified: 7q21-q22
Genes Affected: ARPC1A – had most statistically significant correlation between amplification and elevated expression.
Additional Notes: FISH

Citation: Kuuselo R, et al. 19q13 amplification is associated with high grade and stage in pancreatic cancer. Genes Chromosomes Cancer, 2010. 49(6):569-75.
Tissue/Sample Type: Primary pancreatic tumors; metastases; local recurrences; cancer cell lines from various tissues
Number of Samples Studied: 357; 151; 24; 120 (respectively)
Chromosomal Regions Amplified: 19q13
Additional Notes: 19q13 amplification associated with poor tumor phenotype and shorter survival.

Citation: Campbell PJ, et al. The patterns and dynamics of genomic instability in metastatic pancreatic cancer. Nature, 2010. 467:1109-1113.
Tissue/Sample Type: Early passage cell lines from resected primary patient tumors; multiple metastases collected at autopsy
Number of Samples Studied: 3; 10 (respectively)
Chromosomal Regions Amplified: Supplementary Table 2


Table A2. Public pancreatic cancer genome datasets utilized in copy number gain analysis.

Dataset: QCMG ICGC
Reference: December 6, 2010 Data Coordination Center Release (ICGC 3)
Number of Samples: 3
Sample Type(s): 2 primary tumors; 1 cell line
Platform/Technology: SOLiD Sequencing, whole-genome
Genome Build: Hg19/NCBI37, February 2009
Number of genes (UCSC) encompassed in gains in at least one sample: 10 229
Detailed Sample Information: ICGC-ABMP-20090811-04-CD is Panc05.04 (CRL2557), which is the same cell line as Pa18C in the JHU dataset. APGI-1959: ICGC-ABMP-20091020-01-TD (primary PDAC, AU sample). APGI-1992: ICGC-ABMP-20091203-06-TD (primary PDAC, AU sample).
Copy Number Analysis Method: CNV analysis of paired end and/or LMP from SOLiD. CNV-Seq was originally used (http://www.biomedcentral.com/1471-2105/10/80). In addition, a tool developed in-house, QCOPY, has also been used and compared to CNV-Seq. This tool is similar to CNV-Seq, but normalizes better for mappability and GC content.

Dataset: OICR ICGC
Reference: July 7, 2011 Data Coordination Center Release (ICGC 6)
Number of Samples: 5
Sample Type(s): 5 primary tumors
Platform/Technology: Nimblegen Human CGH 2.1M v2.0D Array
Genome Build: Hg19/NCBI37, February 2009
Number of genes (UCSC) encompassed in gains in at least one sample: 11 958
Detailed Sample Information: 5 primary pancreatic tumors obtained from Mayo Clinic, USA.
Copy Number Analysis Method: The CNV analysis was conducted on the Nimblegen platform. The basic pipeline involves: first normalization of LogR of Cy-3 to Cy-5, minimization of variance of quantity in a moving window across the 2.1e-6 linear probe-space, finding local minima and recording their boundaries. These boundaries are then tested by conducting a 2-sided Kolmogorov-Smirnov (KS) test on the null hypothesis that the two regions separated by the putative boundary are the same, so that if the KS test yields a low p-value (the null hypothesis is rejected), the boundary can be accepted as genuine. This is performed recursively with respect to some empirical threshold found through previous experiments. The boundaries are then collated to obtain the initial ‘short’ table of contiguous segments. In order to impute CNV state and remove the baseline diploid signal, the underlying statistical parameters of the modes of the above log-signal are estimated as a sum of normal distributions, by performing a type of Bayesian analysis which maximizes the expectation value of the likelihood that the data are represented by a particular sum of normal modes. Assuming three modes (amplification, deletion and baseline), the mean and standard deviation for each are calculated. Using these parameters, the original segments are then assessed to determine which fall into the amplification/deletion categories and so on, providing the CNVs. Note, given the binary nature of the Nimblegen platform, a particular run against a normal sample will automatically yield somatic CNVs.

Dataset: JHU Pancreatic Cancer Genome Project
Reference: Jones et al. Core signaling pathways in human pancreatic cancers revealed by global genomic analyses. Science, 2008; 321(5897):1801-6.
Number of Samples: 24
Sample Type(s): 14 cell lines; 10 xenografts (derived from primary tumors and multiple metastases collected at autopsy)
Platform/Technology: Illumina Infinium II Whole Genome Genotyping Assay (Beadchip Platform, 1M SNP loci)
Genome Build: SNP positions based on hg18/NCBI36, March 2006
Number of genes (UCSC) encompassed in gains in at least one sample: 228*
Detailed Sample Information: 24 samples included 14 cell lines and 10 xenografts derived from 17 patients with surgically resected carcinomas and 7 patients who underwent rapid autopsy; 22 of the carcinomas were primary PDACs and 2 were infiltrating adenocarcinomas centered on the intrapancreatic bile duct; 9 of the cancers had already metastasized and were late-stage (Stages IIb or IV); 3 of the cell lines analyzed are available through ATCC (Pa14C is Panc08.13, Pa16C is Panc 10.05, and Pa18C is Panc05.04).
Copy Number Analysis Method: Fluorescence intensity image files were processed using Illumina BeadStation software to provide a normalized intensity value (R) for each SNP position. For each SNP, the normalized experimental intensity value (R) was compared to the intensity values for that SNP from a training set of normal samples and represented as a ratio (logR ratio) of log2(Rexperimental/Rtraining set). Amplifications were defined by regions containing ≥3 SNPs with an average LogR ratio ≥0.9, with at least one SNP having a LogR ratio ≥1.4. All putative amplifications with identical boundaries in multiple samples were excluded. As focal amplifications are more likely to be useful in identifying specific target genes, a second set of criteria was used to exclude complex amplifications, large chromosomal regions or entire chromosomes that showed copy number gains (therefore, extensive criteria-based filtering was done).

Dataset: Harada et al, 2009
Reference: Harada et al. Genome-wide analysis of pancreatic cancer using microarray-based techniques. Pancreatology, 2009; 9:13-24.
Number of Samples: 29
Sample Type(s): 23 microdissected tumor specimens; 6 cell lines
Platform/Technology: BAC/PAC Whole Genome CGH array (Sanger Institute)
Genome Build: Hg18/NCBI36, March 2006
Number of genes (UCSC) encompassed in gains in at least one sample: 701†
Detailed Sample Information: 6 cell lines were acquired from Cancer Research UK Cell Services; 23 fresh-frozen PDAC tissue specimens were manually microdissected, collecting tumor cells at >90% purity.
Copy Number Analysis Method: ‘aCGH-Smooth’ was used to detect DNA copy number alterations in each tumor sample. This software performs the data smoothing and breakpoint recognition using a local search algorithm. Based on preliminary data from seven normal vs. normal DNA hybridisations, the threshold for genetic ‘gains’ was determined as a smoothed log2 ratio ≥0.214, corresponding to ±2 standard deviations. High-level amplifications were defined as a log2 ratio ≥0.75, corresponding to a theoretical ±3.5-fold of the threshold for low-level alterations. All identified regions of alterations were verified by assessing the raw normalized data. The raw normalized (‘non-smoothed’) CGH data were analysed using the MSA software to identify minimal common regions (MCRs) or non-random genetic alterations with a statistical significance. ‘Non-random’ alterations were defined as genetic changes which were commonly identified in at least 14/29 PDACs (≥48% of samples).

QCMG: Queensland Center for Medical Genomics; ICGC: International Cancer Genome Consortium; OICR: Ontario Institute for Cancer Research; JHU: Johns Hopkins University; NG: NimbleGen; BAC/PAC: bacterial artificial chromosome/P1-derived artificial chromosome.
*JHU study reported only high-level amplifications (excluded low-level gains).
† Harada et al, 2009 contained genes encompassed in gains in at least 14 samples.
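The numeric criteria quoted in Table A2 for the JHU and Harada et al. datasets can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the code used by either group: the function and constant names are hypothetical, and only the thresholds (≥3 SNPs, mean logR ratio ≥0.9, at least one SNP ≥1.4; smoothed log2 ratio ≥0.214 for gains and ≥0.75 for high-level amplifications) are taken from the table. Boundary detection, smoothing and the subsequent filtering steps are not modelled.

```python
import math

# Thresholds quoted in Table A2. All names below are illustrative assumptions.
JHU_MIN_SNPS = 3            # a region must contain at least 3 SNPs
JHU_MIN_MEAN_LOGR = 0.9     # average logR ratio across the region
JHU_MIN_PEAK_LOGR = 1.4     # at least one SNP must reach this logR ratio
HARADA_GAIN = 0.214         # smoothed log2 ratio threshold for a gain
HARADA_HIGH_LEVEL = 0.75    # smoothed log2 ratio threshold for high-level amplification


def log_r_ratio(r_experimental, r_training):
    """log2 of the tumour intensity over the normal training-set intensity."""
    return math.log2(r_experimental / r_training)


def is_jhu_amplification(region_log_r):
    """Apply the three JHU criteria to the logR ratios of the SNPs in one region."""
    if len(region_log_r) < JHU_MIN_SNPS:
        return False
    mean_log_r = sum(region_log_r) / len(region_log_r)
    return mean_log_r >= JHU_MIN_MEAN_LOGR and max(region_log_r) >= JHU_MIN_PEAK_LOGR


def classify_harada(smoothed_log2_ratio):
    """Label one smoothed log2 ratio using the Harada et al. gain thresholds."""
    if smoothed_log2_ratio >= HARADA_HIGH_LEVEL:
        return "high-level amplification"
    if smoothed_log2_ratio >= HARADA_GAIN:
        return "gain"
    return "no gain"
```

For example, a hypothetical three-SNP region with logR ratios 0.8, 1.5 and 0.9 would satisfy the JHU definition, whereas a region whose SNPs all sit at 1.0 would not, because no single SNP reaches 1.4.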


Table A3. Cell lines utilized in integrated analysis in this study.

Tumor type | ATCC | Cell line | Sex | Patient age when tumor excised | Derived from
Pancreatic | CRL-1682 | AsPC-1 | Female | 62 | Met (ascites)
Pancreatic | CRL-1687 | BxPC-3 | Female | 61 | Primary, CFTR (-)
Pancreatic | HTB-79 | Capan-1 | Female | 61 | Primary, CFTR (+)
Pancreatic | HTB-80 | Capan-2 | Female | 56 | Primary
Pancreatic | CRL-1918 | CFPAC-1 | Male | 26 | Met (liver); patient had cystic fibrosis
Pancreatic | CRL-2119 | HPAC | Male | 64 | Primary
Pancreatic | CRL-1997 | HPAF-II | Male | 44 | Met (ascites)
Pancreatic | HTB-134 | Hs 766T | Male | 46 | Met (lymph node)
Pancreatic |  | IMIM-PC-1 |  |  | Primary
Pancreatic |  | IMIM-PC-2 |  |  | Primary
Pancreatic |  | KP2 |  |  | 
Pancreatic |  | KP-3 (JCRB0178.0) |  |  | Met (liver)
Pancreatic |  | KP-4 (JCRB0182) | Male | 50 | Met (ascites)
Pancreatic | CRL-1420 | MIA PaCa-2 | Male | 65 | Primary
Pancreatic | CRL-2553 | Panc 02.03 | Female | 70 | Primary
Pancreatic | CRL-2549 | Panc 03.27 | Female | 65 | Primary
Pancreatic | CRL-2555 | Panc 04.03 | Male | 70 | Primary
Pancreatic | CRL-2557 | Panc 05.04 | Female | 77 | Primary
Pancreatic | CRL-2551 | Panc 08.13 | Male | 85 | Primary
Pancreatic | CRL-2547 | Panc 10.05 | Male | Unknown | Primary (same patient as PL45)
Pancreatic | CRL-1469 | PANC-1 |  |  | Primary
Pancreatic |  | PATU8988S |  |  | Met (liver)
Pancreatic |  | PATU8988T |  |  | Met (liver)
Pancreatic | CRL-2558 | PL45 | Male | Unknown | Primary (same patient as Panc 10.05)
Pancreatic |  | RWP1 |  |  | Met (liver)
Pancreatic |  | SK-PC-1 |  |  | Primary tumor
Pancreatic |  | SK-PC-3 |  |  | 
Pancreatic | CRL-1837 | SU.86.86 | Female | 57 | Met (liver)
Pancreatic | CRL-2172 | SW 1990 |  |  | Met (spleen)


Table A4. The RNAi Consortium (TRC) shRNA Constructs

Input | Clone ID | Clone Name | Target Taxon | Target Gene | Target Gene Symbol | %KD: mRNA expression remaining
ECT2-1 | TRCN0000047683 | NM_018098.4-2538s1c1 | Human | 1894 | ECT2 | 31
ECT2-2 | TRCN0000047687 | NM_018098.4-683s1c1 | Human | 1894 | ECT2 | 2
ECT2-3 | TRCN0000047684 | NM_018098.4-441s1c1 | Human | 1894 | ECT2 | 3
ECT2-4 | TRCN0000047685 | NM_018098.4-1958s1c1 | Human | 1894 | ECT2 | 9
ECT2-5 | TRCN0000047686 | NM_018098.4-1154s1c1 | Human | 1894 | ECT2 | 17

Table A5. Puromycin concentrations used in shRNA experiments

Cell Line | Puromycin Concentration (µg/mL)
AsPc1 | 2
Capan-1 | 2
Capan-2 | 3
KP4 | 2
HPAF-II | 2
Panc03.27 | 2
Panc04.03 | 2.5
Panc08.13 | 2
MIA PaCa-2 | 2
PATU8988S | 3.5

Table A6. Details of PLK1 compounds utilized in pharmacologic assay.

 | BI-6727 (Volasertib) | GSK461364
Company | Boehringer Ingelheim | GlaxoSmithKline
Mechanism | Selective and ATP-competitive inhibitor of PLK proteins | Selective and ATP-competitive inhibitor of PLK1
Highest Dev Status | Phase 2 Clinical | Current clinical trial
PLK1 IC50 | 0.87 nM | 2.2 nM
Selectivity | PLK2 (5 nM), PLK3 (56 nM) | 400-fold greater potency for PLK1 than PLK2/3
Chemical Structure | (figure) | (figure)