Gene Expression in Pancreatic Ductal Adenocarcinoma Xenografts From BRCA Mutation Carriers Compared to Non-carriers

Nikita H. Desai
Department of Biochemistry
Goodman Cancer Research Center (GCRC)
McGill University, Montreal, QC
December 2015

A thesis submitted to McGill University in partial fulfillment of the requirements of the degree of Masters of Science (MSc.) in the Department of Biochemistry under the Faculty of Medicine

© Nikita Desai, 2015

Thesis Abstract

The following document contains a manuscript-based Masters of Science (MSc.) thesis submission for Nikita H. Desai. This thesis centers on a manuscript currently being prepared for publication: Gene expression in pancreatic ductal adenocarcinoma xenografts from BRCA mutation carriers compared to non-carriers. Pancreatic cancer has one of the worst prognoses of all cancer types due to late diagnosis, rapid metastasis, and a lack of effective targeted therapies. An estimated 5-10% of pancreatic cancer cases are familial and involve germline mutations in cancer-associated genes. Breast cancer, early onset (BRCA) genes in particular are associated with increased risk of pancreatic cancer and are the most commonly implicated genes in hereditary pancreatic cancer. The following paper studies gene expression in 12 pancreatic ductal adenocarcinoma (PDAC) xenografts, four of which have known BRCA mutations, using RNA sequencing. Twenty-six genes were significantly differentially expressed between the two types of tumors. Two genes in particular, along with their associated pathways, were highly differentially expressed between mutation carriers and non-carriers. Cathepsin E (CTSE) and associated mucin-producing pathways were more highly expressed in mutation non-carriers. Protease, serine 1 (PRSS1) and other genes associated with hereditary and chronic pancreatitis were also more highly expressed in mutation non-carriers, but all had much lower expression than in normal duct.

Included in addendum to this thesis are details of three papers to which the author of this thesis has contributed. These papers are: A Drosophila-centric view of protein tyrosine phosphatases (Hatzihristidis et al.), TC-PTP regulates the IL-7 transcriptional response during early T cell development (Pike et al.), and Modulation of PTP1b and TC-PTP expression enhances dendritic cell activation and maturation (Penafuerte et al.).

Résumé

The following document contains a Master of Science (MSc.) thesis for the Department of Biochemistry at McGill University by Nikita H. Desai. This thesis centers on a manuscript currently being prepared for publication: Gene expression analysis in pancreatic ductal adenocarcinoma xenografts from BRCA mutation carriers compared to non-carriers. Pancreatic cancer has one of the worst prognoses among cancers because of late diagnosis, rapid metastasis, and a lack of effective targeted therapies. About 5-10% of pancreatic cancers are familial and involve mutations in cancer-associated genes. BRCA genes in particular are associated with an increased risk of pancreatic cancer and are the genes most often implicated in hereditary pancreatic cancer. The thesis studies gene expression in 12 pancreatic ductal adenocarcinoma (PDAC) xenografts, four of which have known BRCA mutations, using RNA sequencing. Twenty-six genes were differentially expressed between the two types of tumors. Two genes and their associated pathways were highly differentially expressed. Cathepsin E (CTSE) and mucin production were more highly expressed in mutation non-carriers. Protease, serine 1 (PRSS1) and other genes associated with hereditary and chronic pancreatitis were more highly expressed in mutation non-carriers.

Included in addendum to this thesis are details of three papers to which the author of this thesis contributed. These papers are: A Drosophila-centric view of protein tyrosine phosphatases (Hatzihristidis et al.), TC-PTP regulates the IL-7 transcriptional response during early T cell development (Pike et al.), and Modulation of PTP1b and TC-PTP expression enhances dendritic cell activation and maturation (Penafuerte et al.).

Preface

This thesis is based on a manuscript to which I contributed as first author. I wrote the entirety of the document, including the bioinformatics analysis, discussions and conclusions herein. The contributions of each of the authors have been outlined below. I am grateful for the collaboration and contribution of many great scientists on the thesis work in this document.

Contribution of Authors

Dr. George Zogopoulos, a clinical oncologist at the McGill University Health Center, resected the pancreatic ductal adenocarcinoma samples analyzed in the paper from 12 of his patients. The resected tumors were xenografted into mice by Anita Hall, a research assistant in the Zogopoulos lab, who also extracted the RNA and prepared it for RNA sequencing. RNA sequencing of the samples was performed at the Beijing Genomics Institute (BGI) and was paid for by the Tremblay lab. I conducted the bioinformatics analysis in this thesis, with the help and advice of Dr. Diego Miranda-Saavedra, a bioinformatician at Geneix Inc., U.K. Dr. Michel Tremblay supervised the project and revised the manuscript. The computational equipment used for the bioinformatics analysis outlined in this paper was provided by the Tremblay lab. Dr. Tremblay also advised on the analysis included in the manuscript. I wrote the manuscript and conducted the analysis of the RNA data outlined in the paper. My contributions to the three papers in addendum are outlined in detail in the paper descriptions included in addendum.

Table of Contents

Thesis Abstract
Résumé
Preface
  Contribution of Authors
Acknowledgements
Literature Review: Gene Expression for BRCA Mutation Carriers in Pancreatic Ductal Adenocarcinoma
  Why Study Pancreatic Cancer: Improving Patient Prognosis
  Understanding PDAC Biology: Development and Progression
  Improving Early Detection and Understanding Risk Factors
  Risk Factors: Hereditary PC, Pancreatitis, and Inherited Mutations
  PDAC Susceptibility for BRCA1/2 Mutation Carriers
  Prevalence of BRCA1/2 Mutations
  BRCA1/2 Mutations in Pancreatic Cancer
  Studying PDAC Gene Expression for BRCA1/2 mutation carriers
  RNA sequencing for gene expression
Introduction: Gene expression profiles for PDAC xenografts in BRCA mutation carriers compared to non-carriers
  Abstract
  1.0 Background
  2. Methodology
    2.1 Tumor Resection and Xenografts
    2.2 Bioinformatics Analysis
    2.3 Unsupervised Analysis: Hierarchical Clustering
    2.4 Supervised Clustering: BRCA mutation carriers vs. non-carriers
  3. Results
    3.1 Unsupervised Analysis: Hierarchical Clustering
    3.2 Supervised Analysis – Differential Gene Expression
  4. Discussion
    4.1 CTSE, MUC6, TFF1 and mucin production in sporadic PDAC cases
    4.2 PRSS1 and co-expressed genes have decreased expression in chronic pancreatitis and PDAC
  5. Conclusion
  Bibliography
  Supplementary Tables
Addendum A: Paper: TC-PTP regulates the IL-7 transcriptional response during early T cell development
  1. Background
  2. Methodology
  3. Results
  Bibliography
Addendum B: Paper: Modulation of PTP1b and TC-PTP expression enhances dendritic cell activation and maturation
  1. Background
  2. Methodology
  3. Results
  4. Conclusion
  Bibliography
Addendum C: Paper: A Drosophila-centric view of protein tyrosine phosphatases

List of Tables and Figures

Table 1: Significantly differentially expressed genes
Table 2: Coexpressed genes: mucin production
Table 3: Mucin gene expression
Table 4: Histone cluster 1 genes
Table 5: PRSS1 and coexpressed genes
Table 6: PLAU and coexpressed genes

Figure 1: Heatmap and dendrogram log2 fold ±4 and ±6
Figure 2: Heatmap and dendrogram 26 significant genes
Figure 3: CTSE coexpressed genes expression distribution
Figure 4: HIST1 expression distribution BRCA carriers vs noncarriers
Figure 5: PRSS1 and coexpressed genes expression distribution

Acknowledgements

There are many people to thank for their help, advice, support, and contribution towards my Masters degree. I would first like to thank my thesis supervisor, Dr. Michel L. Tremblay, for the incredible support and advice given throughout my Masters experience. In addition to giving me constant advice, information, and encouragement, he gave me the opportunity to work on several projects I loved, and showed a belief in me that pushed me to pursue knowledge to the best of my ability. I consider myself extremely fortunate to have worked under his supervision. I would also like to thank Dr. George Zogopoulos and Anita Hall for contributing data that was essential to my project, and for answering my many questions whenever I had them. I would also like to thank the entire Tremblay lab for their incredible help and support, and for making me laugh so often. They made the entire experience of being in the lab enjoyable and unforgettable. I would especially like to thank Serge, Teri, Valerie, Elie, Claudia, and Kelly for answering many of my questions about biochemistry and immunology. I would also like to thank our lab manager, Jean Francois Theberge, for his absolutely essential help with buying and setting up equipment and helping with so many logistical tasks in the lab. It is also important for me to acknowledge the incredible help that was given to me by my academic advisor, Christine Laberge, who helped me countless times with deadlines, registrations, and applications. I must also acknowledge the incredible help and contribution made by Dr. Diego Miranda-Saavedra, who in addition to giving me advice throughout my thesis project, also partly sponsored and hosted my trip to Newcastle University. I would like to thank Ben Allan and Graham Smith, as well as Andrew Skelton and the Bioinformatics Support Unit, for their incredible knowledge, training, computational access, and support while I was in Newcastle. The knowledge they gave me made me a better bioinformatician and scientist.

I would also like to thank my research advisory committee, including Drs. Jacek Majewski, Uri-David Akavia, and Josée Dostie, whose advice was important to my thesis projects. I would like to thank my parents, Bharti and Hiren, as well as my incredible brother Saumil, for not only believing in me, but also always pushing me to be my best self. I would not be the person I am without them. Finally, I would like to thank my friend, Sarah, for her humor, love, and encouragement throughout my degree.

I am truly fortunate to be surrounded by so many incredible people.

LITERATURE REVIEW: GENE EXPRESSION FOR BRCA MUTATION CARRIERS IN PANCREATIC DUCTAL ADENOCARCINOMA

Why Study Pancreatic Cancer: Improving Patient Prognosis

Although it has a relatively low rate of occurrence and is ranked 13th in incidence rate among human malignancies [1, 2], pancreatic cancer (PC) has a remarkably poor prognosis [3-5]. With a median survival of less than 6 months and a five-year survival rate of 3-5% [6-8], it is the fourth most common cause of cancer-related deaths and is one of the most aggressive and lethal malignancies [3, 5, 7, 9]. It is also one of the most difficult cancers to diagnose and treat [10] and has seen little improvement in patient prognosis in the past few decades [4, 11].

The pancreas is a key regulator of protein and carbohydrate digestion and maintains glucose homeostasis [6]. 80% of the pancreas is the exocrine pancreas, composed of acinar and duct cells that produce and deliver digestive zymogens [6]. Pancreatic ductal adenocarcinomas (PDAC) account for more than 85% of pancreatic cancer cases [2, 5, 6] and are called ductal adenocarcinomas due to their histological similarity to duct cells in the pancreas [6].

PDAC typically begins developing in the head of the pancreas, after which it infiltrates surrounding tissues and commonly metastasizes to the liver, abdomen and lungs [5, 6]. About 60-70% of pancreatic cancers are in the head of the pancreas, while 20-25% are in the body or tail of the pancreas [5, 8]. The location of the tumor usually determines the signs and symptoms presented in each case [5, 11]. Signs most commonly presented in PDAC cases include abdominal pain, weight loss, anorexia, diabetes and jaundice [5, 11]. Some defining features of PDAC include high rates (>90%) of KRAS mutations, propensity for local and distant invasion, extensive desmoplasia, reprogramming of cellular metabolism, and immune evasion [4-6, 11].

More than 90% of patients who present with PDAC end up dying from the disease [1, 5-7]. There are a number of reasons for the poor prognosis of pancreatic cancer, including late diagnosis, propensity for rapid dissemination and aggressive biology, as well as resistance to conventional and targeted therapies [1, 6, 7].

First, PDAC is usually diagnosed at very advanced stages and most cases are incurable at diagnosis [4, 11, 12]. There are several challenges in detecting pancreatic cancer at earlier stages. The main reasons for late diagnosis are the lack of diagnostic systems for earlier stages of the disease, the lack of effective non-invasive techniques for detection, and the fact that most patients tend to be asymptomatic until late in disease progression [2, 5, 11].

Due to the absence of effective screening techniques, most cases are diagnosed very late, when curative surgery is no longer feasible [1, 5, 11]. Another challenge is the fact that pancreatic cancer has a relatively low incidence rate. Pancreatic cancer is rarely diagnosed in people less than 40 years of age, and has a median age of diagnosis of 71 years [5, 11]. The worldwide incidence rate of pancreatic cancer is between 1 and 10 cases per 100,000 people [5]. Unfortunately, its low prevalence makes it difficult to develop and justify effective screening methods in asymptomatic populations [12]. For these reasons, the development of non-invasive, convenient, and effective early diagnostic methods could be essential to improving patient prognosis and outcomes [4, 11, 13].

Second, PDAC is characterized by a very aggressive biology [4, 6, 11]. PDAC often results in local invasion as well as distant metastasis well before detection [2, 4, 14]. About 63% of pancreatic cancer patients die from extensive metastasis, while 27% have limited metastasis but bulky primary tumors [5]. In addition, most tumors are diagnosed in elderly and/or chronically ill patients, which makes surgery impractical in most cases [5, 7, 11]. These patients are also especially vulnerable to complications such as cancer anorexia-cachexia syndrome (CACS) [11, 15]. An unusually high number of PDAC patients, about 80%, suffer from CACS, a condition that results in abnormally low weight, weakness, and loss of muscle mass [15]. Although the exact connection between CACS and pancreatic cancer is not fully understood, CACS has a large impact on patients' overall survival, quality of life and physical activity [15], and it is one of the conditions that contributes to the poor prognosis of PDAC. Understanding the biology of PDAC and the connections between PDAC and such conditions, therefore, also presents an opportunity to improve survival outcomes.

Finally, most PDAC cases tend to be resistant to conventional therapies [4, 5, 11]. PDAC responds poorly to most chemotherapeutic agents because it is characterized by an abnormal tumor microenvironment with chaotic vascular morphology: low vascular density with leaky vessels that result in hypoxia and poor delivery of chemotherapy agents [2, 5-7]. Currently, surgery is the only treatment with curative intent: pancreaticoduodenectomies (the Whipple procedure) are used to remove tumors from the head and neck of the pancreas [4, 5]. Only about 10%-20% of patients are usually considered candidates for surgery at diagnosis [1, 4, 12]. In cases where surgery is feasible, 5-year survival rates improve to 20%, while in cases with unresectable disease survival is usually between 5 and 12 months [1, 6, 10, 12]. In addition, postoperative recurrence is common [4, 12]. Surgery is usually used in conjunction with adjuvant therapy, which aims to enhance immune response [4, 5]. In advanced disease, there are no curative treatments available, although combination chemotherapy offers some palliation [4, 5].

Adjuvant therapies with 5-FU, gemcitabine, cisplatin, interferon alfa2b, or erlotinib have been shown to reduce tumor burden or prolong prognosis, but this has not translated into increased survival benefit [4, 5]. Therefore, even though there have been some improvements in the clinical management of PDAC, the prognosis remains poor [2, 4]. Developing better therapeutic options, such as immunotherapy, and targeting treatments to exploit PDAC biology could greatly improve patient prognosis.

The combination of late diagnosis, aggressiveness of malignancy, and resistance to most conventional therapeutic agents usually means that most patients present with incurable disease at time of diagnosis [6]. The three main ways this problem can be addressed would be improving our understanding of development and biology of pancreatic cancer, developing better techniques of early detection including characterizing risk factors and genetics, and characterizing different kinds of PDAC to develop more personalized and effective treatment strategies.

Understanding PDAC Biology: Development and Progression

PDAC is a genetic disease characterized by a small number of known genetic alterations responsible for tumourigenesis and progression [2, 5]. Characterizing and understanding these genetic alterations and their effects could be essential to understanding the PDAC phenotype in addition to finding potential biomarkers and developing treatment techniques.

These genetic mutations commonly occur in microscopic premalignant pancreatic lesions, such as pancreatic intraepithelial neoplasia (PanIN), mucinous cystic neoplasm (MCN), and intraductal papillary mucinous neoplasms (IPMN), which often precede malignant disease [2, 5, 16]. The most common of these is PanIN, which precedes over 85% of PDAC cases [16].

Ductal reprogramming of endocrine cells occurs before development of PanIN from normal tissue [2, 16, 17]. Studies have found a step-wise progression between low and high grades of PanIN, which is associated with the accumulation of genetic alterations [2, 5, 14, 16]. A significant number of these mutations are similar to those found in advanced disease, which suggests that they are progenitor mutations: mutations that are present in PanIN precursor lesions and must be present in the majority of the tumor cells [2, 5, 17]. Most of these genetic alterations seem to occur early in disease progression; they are hypothesized to occur before metastatic dissemination and may be critical drivers of tumor progression [2, 17].

Surveys of pancreatic autopsy specimens suggest that PanINs are common in older adults, occurring in almost 30% of specimens [6]. PanINs show morphological alterations from normal ducts that represent increasingly dysplastic growth [6]. There have been several molecular profiling studies that have documented increasing number of signature gene mutations in higher grade PanINs, and have reinforced the PanIN-to-PDAC progression model [6, 16-18].

Understanding the genetic alterations seen in both PanIN and PDAC could be useful in determining which neoplasms are more likely to result in malignancies. The earliest and most common known genetic alteration is K-RAS mutation, which is present in about 30% of early neoplasms, 90% of PanINs of all grades, and almost 95% of PDAC cases [5, 6]. Some mouse models have demonstrated that K-RAS mutations are an initiating step in PDAC pathogenesis, leading to cell proliferation, differentiation and survival [2, 5].

A 2010 study by Yachida et al. [14] examined different models of tumor progression, and estimated the average time from tumor initiation to the birth of the malignant parental cancer clone as about 12 years. The acquisition of genetic alterations may be irregular rather than constant and measured, with episodic mutational activity caused by events undermining genomic integrity such as loss of DNA damage repair and checkpoints [6]. According to the model, a parental clone alone would not be enough to induce metastatic spread, and distant metastasis would occur late in the genetic evolution of pancreatic cancer, a process that would take an additional 7 years [2, 14]. Once dissemination starts, however, the time from metastatic dissemination to patient death is estimated to be about 2.7 years, indicating that early detection methods would have a window of opportunity for medical intervention [2, 14].
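The timeline described above can be added up in a few lines. The following is a minimal sketch that uses only the approximate intervals quoted in this section (about 12, 7, and 2.7 years); the constant and function names are illustrative assumptions and are not taken from the cited study.

```python
# Approximate intervals, in years, as quoted in the text above [2, 14].
# The names below are hypothetical and chosen only for this illustration.
INITIATION_TO_PARENTAL_CLONE = 12.0   # tumor initiation -> parental cancer clone
PARENTAL_CLONE_TO_METASTASIS = 7.0    # parental clone -> metastatic capability
DISSEMINATION_TO_DEATH = 2.7          # metastatic dissemination -> patient death


def detection_window_years() -> float:
    """Years between tumor initiation and the onset of metastatic spread."""
    return INITIATION_TO_PARENTAL_CLONE + PARENTAL_CLONE_TO_METASTASIS


def total_timeline_years() -> float:
    """Years between tumor initiation and patient death under this model."""
    return detection_window_years() + DISSEMINATION_TO_DEATH


if __name__ == "__main__":
    print(f"Window before dissemination: ~{detection_window_years():.0f} years")
    print(f"Initiation to death: ~{total_timeline_years():.1f} years")
```

Under these rounded figures, the interval before metastatic dissemination is much longer than the interval after it, which is the basis for the argument that early detection has a window in which to act.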

Although research in recent years has greatly improved our understanding of disease progression and treatment, there are still several questions that remain unanswered in terms of pathogenesis, biology and development of pancreatic cancer [9].

Improving Early Detection and Understanding Risk Factors

As it stands, there are no validated, adequately effective screening methods for detecting asymptomatic pre-malignant or early malignant tumors [3, 5, 13]. Since curative surgery is only possible in earlier stages of PDAC, the development of improved screening and early diagnosis tools, such as tumor markers, represents an important opportunity to improve the prognosis for PDAC patients.

The development of effective serological markers could greatly improve the ability to detect early disease, especially in sub-populations with known risk factors [5, 12].

Studies have shown that even strongly upregulated, secreted proteins may not be detectable in circulation due to the poor perfusion and vascularization around PDAC tumors. Studying autoantibodies against aberrant or overexpressed antigens could serve as an alternative marker of PDAC development [12].

High-throughput technologies including gene expression analysis and protein profiling present opportunities to study and develop tumor markers for PDAC [12]. High-throughput technologies are increasingly being used with the goal of understanding PDAC gene expression while discovering potential biomarkers. Currently, early diagnosis continues to pose a problem for the general population, although increased knowledge of some risk factors has improved screening strategies [1]. Once chronic pancreatitis is diagnosed in patients, for example, tumor markers can be used to screen patients for the development of pancreatic cancer [1].

Currently there are a few types of tumor markers for PDAC: tumor-associated antigens, enzymes, and ectopic hormones or other peptides, of which tumor-associated antigens are the most commonly used [1]. Of these, CA 19-9, CA 50, CA 125 and CA 242 are the most commonly used [1, 12]. Although it has poor specificity and sensitivity as a marker for early pancreatic cancer, CA 19-9 has proved to be one of the most effective markers in differentiating between chronic pancreatitis and pancreatic cancer so far, with a sensitivity and specificity of 70-90% and 68-91% respectively [1, 12]. It is currently the most commonly used marker, and it can be used in conjunction with imaging to monitor treatment, as serial levels often correlate with response to systemic therapy [1].
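To make the terms concrete, the sketch below shows how sensitivity and specificity are computed from test outcomes, and why a marker in the range quoted above still performs poorly as a general-population screen. The patient counts are invented for illustration, and using 10 cases per 100,000 as a stand-in for prevalence is an assumption made only for this example (the text reports that figure as an incidence rate).

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased patients that the marker correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of disease-free patients that the marker correctly clears."""
    return true_neg / (true_neg + false_pos)


def positive_predictive_value(sens: float, spec: float, prevalence: float) -> float:
    """Probability that a positive result reflects true disease (Bayes' rule)."""
    true_pos_rate = sens * prevalence
    false_pos_rate = (1.0 - spec) * (1.0 - prevalence)
    return true_pos_rate / (true_pos_rate + false_pos_rate)


if __name__ == "__main__":
    # Hypothetical counts: 100 cancer patients, 100 chronic pancreatitis patients.
    print("sensitivity:", sensitivity(true_pos=80, false_neg=20))
    print("specificity:", specificity(true_neg=85, false_pos=15))

    # Upper end of the quoted ranges, applied to an assumed prevalence
    # of 10 per 100,000 in an unselected population.
    ppv = positive_predictive_value(sens=0.90, spec=0.91, prevalence=10 / 100_000)
    print(f"PPV in an unselected population: {ppv:.4%}")
```

With a disease this rare, false positives from the much larger healthy group dominate the positive results, which is why screening is usually discussed for higher-risk sub-populations rather than for the general population.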

Another marker, CEACAM1, has been shown to be elevated in the sera of 91% of PC patients, 66% of patients with chronic pancreatitis, and 24% of normal patients [12]. In distinguishing between pancreatitis and pancreatic cancer, it has a sensitivity and specificity superior to CA 19-9, but its sensitivity and specificity are still not sufficient for use in general screening for pancreatic cancer [5, 12].

A recent study by Tomasz et al. [13] studied protein expression in urine samples from 192 PDAC patients, 92 patients with chronic pancreatitis, and 87 healthy individuals, and developed a panel of three proteins that could be used as urine biomarkers for PDAC screening. The three protein biomarkers, LYVE-1, REG1A, and TFF1, were all found to be elevated in PDAC cases, and differentiated between PDAC and healthy individuals with good sensitivity and specificity [13]. The sensitivity in differentiating between PDAC and chronic pancreatitis was slightly lower, but the results were still better than those for CA 19-9 or CEACAM1 [13]. Interestingly, apart from the three proteins, the study documented lists of highly expressed proteins that were significantly different between men and women, which presents the case for potentially developing separate panels for men and women [13]. Although these results were promising, the panel has not yet been validated independently on separate cohorts of patients.

Identifying, validating, and implementing in screening a marker or set of markers that effectively distinguishes patients with PC from patients with benign disorders or without disease remains a challenge.

Risk Factors: Hereditary PC, Pancreatitis, and Inherited Mutations

So far, there are only a few known risk factors for PDAC, including a few genetic conditions.

Known risk factors for PDAC include advanced age, smoking, chronic pancreatitis, diabetes and obesity [5-8].

One of the known risk factors for developing PDAC is the presence of chronic pancreatitis, which is associated with a 7.2-fold increase in disease incidence [2, 19-21]. In some sub-populations, such as patients with hereditary or chronic pancreatitis, the risk of developing PDAC is higher; screening such patients for tumor markers would therefore be reasonable [12, 21, 22].

In particular, mutations in cationic trypsinogen (PRSS1) have been strongly associated with increased risk of hereditary and chronic pancreatitis as well as increased risk of PDAC development [22, 23]. Indeed, germline mutations of PRSS1 have been associated with a 53-fold increase in PDAC incidence [6]. In addition, smoking and diabetes can induce chronic pancreatic inflammation, which leads to fibrosis and scarring of pancreatic acini [2].

Pancreatic cancer is generally associated with a number of hereditary cancer syndromes, including hereditary breast-ovarian cancer syndrome, which is usually caused by BRCA2 mutations [24]. Relatives of PDAC patients are also often considered to be at increased risk [2, 5]. Certain families and populations are especially at risk of familial pancreatic cancer. Familial pancreatic cancer (FPC) refers to families where cases are documented in two or more first-degree relatives [25]. Compared to the general population, first-degree relatives of pancreatic cancer patients have a relative risk increased by a factor of 2, 6, and 30 for one, two, and three family members affected, respectively [5].

A number of genes, including BRCA1, BRCA2, PALB2, CDKN2A, STK11, ATM, and MLH1, are associated with hereditary cancer syndromes, contribute to the development of PDAC, and explain many FPC cases [26]. Of these genes, BRCA1 and BRCA2 are the most commonly implicated in FPC. About 5-10% of PDAC cases are associated with inherited predisposition [5, 6]. Familial clustering of PC in the U.S. in 2004 accounted for between 1,600 and 3,200 cases [24].

Although the connections between these different genetic conditions and PDAC are not fully understood, common clinical observations between the conditions are exocrine insufficiency and pancreatitis that lead to PDAC development [6]. One of the benefits of characterizing and understanding familial and epidemiologic risk factors is the potential to gain a better understanding of key processes that govern PDAC genesis and evolution [6].

PDAC Susceptibility for BRCA1/2 Mutation Carriers

Breast cancer susceptibility gene 1 (BRCA1) and breast cancer susceptibility gene 2 (BRCA2) play important roles in the repair of double-stranded DNA breaks and are therefore important to the maintenance of the genetic integrity of cells [26].

Germline BRCA2 mutations are usually associated with hereditary breast and ovarian cancer syndrome, but they also present an increased risk for the development of pancreatic cancer. One study, for example, estimates that about 17% of familial pancreatic cancers harbor BRCA2 mutations [6, 27].

Some studies have indicated that BRCA mutations may play a role in the malignant progression of precursor lesions rather than in tumor initiation [6, 26, 28]. This theory is based on the facts that PDAC has low penetrance in BRCA2 mutation carriers, that its late age of onset is not very different from that in sporadic cases, and that BRCA2 mutations are not usually detected in most early sporadic pre-malignant lesions but rather in later or advanced neoplastic lesions [6, 26, 28].

INK4A and BRCA2 mutations, for example, are not usually detected in the earliest sporadic pre-malignant lesions but are commonly found in intermediate or advanced PanIN lesions [6]. Loss of heterozygosity (LOH) seems to occur late in patients with inherited BRCA2 mutated alleles and seems to appear in severely dysplastic PanINs or PDAC [6]. All of this suggests that BRCA2 mutations may act as an aggravating factor rather than an initiating factor, but would still be causal for pancreatic cancer ontogeny [26, 28].

Although the increased susceptibility to PDAC of BRCA2 mutation carriers has been well documented [26, 29], the link between BRCA1 and increased susceptibility has been less well characterized [6, 26, 30]. The relative risk of PDAC development associated with germline mutations of both BRCA1 and BRCA2 has been estimated by various studies to be between 2.3 and 7 [25, 26, 30].

Inherited BRCA2 mutations are usually associated with hereditary breast and ovarian cancer syndrome, but are also associated with increased risk of PC development [6]. There is a great amount of evidence for increased susceptibility, including numerous anecdotal family reports [24], molecular genetic studies [26], and sequencing studies [25]. Numerous studies of cancer risks in BRCA2 germline mutation carriers have reported relative risks between 3.51 and 6.61 for pancreatic cancer [25, 26]. BRCA2 mutations contribute to an estimated 6-12% of familial pancreatic cancer cases [26].

The recorded penetrance of PDAC in BRCA1 is lower than it is for BRCA2 [26]. It is hypothesized that loss of heterozygosity (LOH) is a late-occurring event in BRCA1 mutation carriers [6, 26]. One study of nine families looked at cancer risks (other than breast and ovary) of BRCA1 mutation carriers and found them to have a statistically significant increased risk for pancreatic, uterine, prostate and cervical cancers [26]. The relative risk for PC was about 2.26 in the study, which is lower than the relative risk of 3.55 in BRCA2 mutation carriers as estimated in previous studies [29]. It was found that BRCA1 mutations confer increased risk of PC in both men and women [26].

One study [20] of 483 BRCA1 mutation carriers from 147 families found that the age-adjusted lifetime risk of PC was increased about three-fold (3.6%) compared to the general population (1.3%). The study used quantitative RT-PCR and immunohistochemistry antibody staining to analyze gene expression in 13 normal pancreas samples, 30 samples with chronic pancreatitis, and 53 samples with pancreatic adenocarcinomas [20]. The study also found that down-regulation of BRCA1 may be an important step in the development of sporadic PC [20, 26]. In addition, patients with reduced or absent BRCA1 expression in their tumors had a worse 1-year survival rate than patients with normal BRCA1 expression in tumors, which suggests that BRCA1 inactivation plays an important role in PC malignancy [20, 24, 26]. Interestingly, down-regulation of BRCA1 also seems to be a contributing factor to chronic pancreatitis [20].

A different study [26] demonstrated that BRCA1 mutation carriers may have an inactivating mechanism in the pancreatic tumor DNA, similar to the role of BRCA1 mutations in breast and ovarian tumourigenesis. BRCA1 mutation carriers develop other malignancies, including prostate and colorectal cancers, more often than BRCA2 mutation carriers [26, 31]. One study of 11,847 patients from 699 families found that BRCA1 mutation carriers were more likely than BRCA2 mutation carriers to have developed other malignancies in addition to PDAC [31].

Overall, several studies indicate that BRCA1 germline mutations likely predispose carriers to PC as well as certain other cancers, and that individuals with these mutations should be considered for pancreatic cancer screening programs [6, 20, 24, 26].

Another interesting aspect of BRCA1/2 mutations is the age of onset, which has been examined in a number of studies [6, 24-26]. Most studies do not show a significant difference in age of onset between mutation carriers and non-carriers, although BRCA2 mutation carriers on average present at slightly lower ages [24]. Average PC onset for BRCA1 mutation carriers has been estimated to be between 65 and 65.4 years [24, 26]. Average PC age of onset for BRCA2 mutation carriers, however, has been estimated to be about 52.5 years, 12.5 years earlier than for BRCA1 [24]. Finally, age of onset for sporadic patients is estimated to be around 63.6 years [26].

Prevalence of BRCA1/2 Mutations

There are varying estimates of the prevalence of BRCA1/2 mutations in familial PC (FPC). One study estimated that ~17% of FPC cases harbor BRCA2 mutations [6], while data on the prevalence of BRCA1 mutations are lacking [25]. In patients with FPC, the prevalence of BRCA2 mutations has been found to be 3-17% [25]. In unselected patients, the reported prevalence is 5-7% [25].

Ashkenazi Jewish populations tend to have an increased prevalence of BRCA1/2 founder mutations. In particular, this population demonstrates well-documented founder mutations that place them at increased risk of inherited BRCA1 mutation status and PDAC development [31].

The 6174delT founder mutation is particularly common in familial pancreatic cancers in the Ashkenazi Jewish population, as are the 5382insC and 185delAG mutations [6, 26]. The exact prevalence of BRCA1/2 mutations in Ashkenazi Jewish patients is not known, but one study of patients from this population undergoing surgery for PDAC found BRCA1 and BRCA2 founder mutations in 5.5% and 21% of the patients, respectively [25].

One single-site study [28] of Ashkenazi Jewish patients who underwent resection divided them into three cohorts: 37 patients underwent surgery for PDAC, seven for intra-ductal papillary mucinous neoplasm (IPMN), and 19 for other related diseases. A high prevalence of BRCA1/2 mutations was found in the overall surgical cohort (19% of 63 patients), as well as in the PDAC cohort (21.6% of 37 patients) and the IPMN cohort (28.6% of 7 patients), compared to the control mutational frequency [28].

One of the most recent studies of BRCA mutations in pancreatic adenocarcinoma [25] used unselected, consecutive incident patients with PDAC over a 2-year period. Sanger sequencing of germline variants of BRCA1 and BRCA2 was performed for all patients. The study had a high proportion of patients with Ashkenazi Jewish ancestry, so the prevalence of BRCA mutations was also calculated without those self-reporting as Ashkenazi Jewish [25]. Relevant BRCA mutations were found in 4.6% of 306 patients [25]. BRCA1 mutations accounted for 1% of the 306 patients, while BRCA2 mutations accounted for 3.6% [25]. This percentage was lower than in previously reported studies, but may be more representative because the patients were unselected.

Within the Ashkenazi Jewish cohort, BRCA mutations were found in 12.1% of the 33 patients [25]. This was the first large prospective clinic-based study that recruited consecutive unselected patients with PDAC and then performed BRCA1/2 analysis with Sanger sequencing and MLPA [25].
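As a rough consistency check on the prevalence figures quoted above, the reported percentages can be converted back to approximate patient counts. The sketch below is illustrative only: the counts are back-calculated by rounding and are an assumption of this example, not values taken from the cited study.

```python
def approx_count(percent, cohort_size):
    """Back-calculate an approximate patient count from a reported percentage."""
    return round(percent / 100 * cohort_size)


# Percentages and cohort sizes as quoted in the text for the study in [25].
all_brca = approx_count(4.6, 306)    # any relevant BRCA mutation
brca1 = approx_count(1.0, 306)       # BRCA1 mutations
brca2 = approx_count(3.6, 306)       # BRCA2 mutations
ashkenazi = approx_count(12.1, 33)   # Ashkenazi Jewish sub-cohort

print(all_brca, brca1, brca2, ashkenazi)
```

The BRCA1 and BRCA2 estimates sum to the overall estimate, which suggests that the quoted percentages are internally consistent.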

Studying and identifying groups that are at high risk of BRCA1/2 mutations and/or PDAC is an important way to refine targeted screening strategies and improve early detection of PDAC.

BRCA1/2 Mutations in Pancreatic Cancer

Understanding the link between BRCA1/2 and pancreatic cancer has very important diagnostic and therapeutic implications [25, 26]. Identifying high-risk individuals is of great importance, as knowledge of BRCA1/2 mutation status allows surveillance for breast and ovarian cancers as well as pancreatic, prostate, and uterine cancers. Given the difficulty of early detection of pancreatic cancer, as well as the lack of common non-invasive techniques for detecting early disease, monitoring high-risk individuals throughout their lives can be a useful way to detect early disease or pre-malignant neoplasms [20, 24, 26, 28]. In addition, tracking individuals from families with inherited pancreatic cancer mutations increases the chances of detecting the disease before it becomes advanced [24].

In addition, BRCA germline mutations have therapeutic implications. BRCA genes play roles in DNA repair via homologous recombination, and BRCA-deficient tumors tend to be sensitive to agents that take advantage of their impaired DNA repair mechanisms [25, 26]. In vitro and in vivo studies of BRCA1-deficient tumors have found that they tend to be particularly sensitive to certain chemotherapeutic agents, such as cross-linking agents like mitomycin C, type II topoisomerase inhibitors, and poly (ADP-ribose) polymerase (PARP) inhibitors [26]. BRCA-related tumors in general tend to have increased sensitivity to PARP inhibitors and platinum-based chemotherapies [6, 25]. PARP inhibitors block base-excision repair, which allows both double-stranded and single-stranded DNA breaks to go unrepaired and eventually leads to cell death [25]. This means that individuals with BRCA1/2 mutations who develop malignancies could be treated with PARP inhibitors or platinum agents to potentially improve response and enhance survival [25].

Given the important diagnostic and therapeutic implications of BRCA mutations, the case should be made for broader BRCA genetic testing for all PDAC patients, as well as for greater research into the link between BRCA1/2 and PDAC [25]. Greater screening allows for identification of BRCA mutations and allows families to have genetic counseling and predictive testing [25]. As the costs of genetic sequencing continue to decline, the case can be made for testing all PDAC patients for BRCA1/2 mutations at the time of diagnosis even if the yield is low [25]. At the very least, screening should be done for all members of families with familial pancreatic cancer. Holter et al. [25] also recommend that all Ashkenazi Jewish individuals be screened for BRCA1/2 mutations.

Studying PDAC Gene Expression for BRCA1/2 mutation carriers

Although a number of studies have looked at differential gene expression in pancreatic cancer [3, 32-35], few have compared gene expression between patients with and without BRCA mutations. Most of the bioinformatics studies done on pancreatic cancer have looked at differentially expressed RNA or miRNA [3, 32, 33, 35], or genomic alterations [34], between tumor and healthy samples.

A number of papers [36-39] have studied gene expression profiles in PDAC patients. Franka et al. and Ayars et al. [38, 40] note that there are a number of important advantages to studying gene expression profiles of tumors. These include the potential to study and understand PC biology, determine and track response to chemotherapy, radiation, surgery or other therapies, tailor therapeutic strategies based on expression profiles, study tumors after they have developed resistance to certain therapies, and further understand the tumors to improve drug delivery efficiency [38, 40]. All of these benefits could improve patient outcomes by enhancing treatment efficacy, lowering cytotoxicity, monitoring patient responses, and improving overall patient quality of life [40].

One very recent study of miRNA expression in pancreatic cancer examined the expression profiles of 1733 miRNAs in 104 pancreatic tumors [36]. The study identified 3 main PDAC tumor subtypes based on miRNA expression profiles, and additionally demonstrated that subtyping PDAC based on these expression profiles had prognostic value [36]. This study therefore demonstrated the potential usefulness of studying expression profiles of PDAC tumors to obtain diagnostic or prognostic information and possibly to categorize tumors into therapeutically relevant groups [36].

Moffitt et al. [37] used virtual micro-dissection to extract stroma and tumor cells separately. They then conducted RNA expression profiling of both the tumor and stroma cells, and found that each provided independent prognostic information [37]. The study identified two main tumor subtypes: a ‘basal-like’ subtype, which had the worst outcomes and molecular similarities to basal-type tumors in bladder and breast cancers, and a ‘normal’ or ‘activated’ subtype that had a slightly better prognostic outcome [37]. The study also demonstrated another important fact about PDAC: the tumors are complex mixtures of cells with a minority of malignant epithelial tumor cells [37]. Interestingly, the identified stroma-specific and tumor-specific subtypes could be used independently for prognostication [37]. Moffitt et al. also analyzed 37 patient-derived xenografts (PDXs), and found that extracted tumors tended to have human pancreatic tumor cells surrounded by mouse stromal cells, which demonstrates the advantage of PDXs for studying tumor cells specifically if mouse RNA reads are filtered out [37]. Ayars et al. [38] note that primary PDAC cells make up about 5-10% of tumor mass. Xenografts could therefore represent a method to amplify the proportion of tumor cells. One caveat is that the cells most likely to grow in xenografts are likely to be the more aggressive malignant cells, and xenografts may therefore not completely represent tumor heterogeneity [37, 38].

Buchholz et al. [39] attempted to identify novel genes with important functional roles in PC by using a workflow that first analyzed serial gene expression profiles in a number of in vivo and in vitro models to identify genes with conspicuous expression patterns. The candidate genes were then reverse transfected for functional analysis, and individual candidates were selected for study of their cellular roles [39]. The study found 14 genes, 8 of which were from “druggable” gene families, that were classified as high priority candidates for individual characterization [39]. The study demonstrated the usefulness of gene expression assays in determining the importance of particular genes in different kinds of PC models, as well as the potential of gene expression analysis for identifying candidate genes for targeted therapy [39].

Overall, there are several important advantages to studying gene expression in PDAC. This is especially useful in cases with known genetic alterations, as the analysis could inform diagnostic and therapeutic strategies.

Only a few studies [20, 41, 42] have looked at gene expression in PDAC patients with BRCA mutations, and all focused on particular pathways. Lyakhovich et al. [41], for example, focused on the Fanconi anemia/BRCA pathway. Most studies of gene expression in BRCA mutation carriers focus on other cancers, such as breast or ovarian cancer [42]. Beger et al. [20] studied the expression levels of BRCA1/2 in chronic pancreatitis and sporadic PDAC, and found that BRCA1 expression was suppressed at the RNA and protein levels in chronic alcoholic pancreatitis and in pancreatic cancer. The study did not find significant differences in BRCA2 expression in these conditions [20]. There have been no studies looking at mRNA expression profiles for PDAC in BRCA mutation carriers compared to non-carriers. There is therefore a need to study the gene expression profiles of PDAC cases with BRCA mutations compared to sporadic cases.

RNA sequencing for gene expression

In terms of gene expression, there are a number of tools that can be used for gene quantification, but the most commonly used have been microarrays and RNA-sequencing (RNA-seq) [43]. RNA sequencing, a method of RNA profiling that is based on next generation sequencing (NGS), has been used in an increasing number of studies and is replacing microarrays as the preferred method for gene expression profiling [43-45].

RNA-seq involves sequencing millions of short ‘reads’ from random positions of the input RNAs [43]. The process involves first extracting RNA from a sample, fragmenting the RNA to produce shorter fragments of 50-200 bp on average, amplifying the fragments, and then sequencing them using a particular sequencing technology [32, 44, 46].

RNA-seq allows for the study of all RNA sequences, transcript isoforms, and expression levels in a sample at an unprecedented resolution [43-45]. RNA-seq has a number of well-documented advantages that make it ideal for transcriptomic studies. First, RNA-seq has a large dynamic range of detection, which allows for quantification of genes that are both highly and lowly expressed [32, 43]. Second, RNA-seq has a very high level of resolution, which allows for the study of particular transcript isoforms, the detection of single-base mutations or alterations, and accurate quantification [32, 43, 46]. Third, RNA-seq has been demonstrated to have a high level of reproducibility [43-45], and therefore analysis does not often have to be repeated on the same sample. Finally, RNA-seq allows for de novo discovery of transcripts, presenting the ability to quantify novel transcripts in samples [32, 46]. These advantages have allowed for an unprecedented rate of progress in transcriptomics research.

Along with these advantages, however, RNA-seq comes with a number of challenges. The first is the difficulty of cDNA library preparation, which can introduce biases and makes it harder to compare RNA-seq results across different technologies or experiments [32, 43, 46]. Second, RNA-seq produces a huge amount of data, thereby posing challenges in storage, manipulation and analysis [32, 44, 46]. Different tools have been developed to deal with RNA-seq data manipulation and analysis [44]. Third, certain errors inherent to sequencing technology, such as sequencing errors, have to be accounted for in the final analysis of RNA-seq results [43]. Finally, although the costs of high-throughput sequencing continue to go down, RNA-seq is still significantly more expensive per sample than microarray analysis [43, 44, 47]. It is therefore important to carefully design experiments to maximize analytical value.

Once RNA-seq is completed on a sample, reads can be computationally mapped to a reference genome or used for de novo assembly [32, 43, 44, 46]. Once the reads are aligned to a particular genome, relative expression can be quantified on a raw count basis or using metrics such as fragments per kilobase of transcript per million mapped reads (FPKM) or reads per kilobase of transcript per million mapped reads (RPKM), which normalize for gene length and the total number of reads in a particular sample [32, 44, 46, 47].
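The length and depth normalization behind FPKM can be illustrated with a short calculation. The sketch below uses the standard definition, FPKM = fragments × 10^9 / (transcript length in bp × total mapped fragments); the function name and the example numbers are hypothetical and are not taken from this study.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # Dividing by (length / 1e3) and (total / 1e6) is the same as
    # multiplying the raw fragment count by 1e9 / (length * total).
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments on a 2,000 bp transcript
# in a library of 50 million mapped fragments.
example = fpkm(500, 2000, 50_000_000)
print(example)
```

Because the value is scaled by both transcript length and library size, a long gene and a short gene with the same FPKM are estimated to be expressed at similar levels even though their raw counts differ.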

The analysis of RNA-seq data sets involves a number of important steps: quality control and trimming, read mapping, count computation, normalization and statistical analysis of differentially expressed genes [43].

The Tuxedo Suite of sequencing analysis tools, which includes the TopHat-Cufflinks pipeline, contains some of the most commonly used tools for RNA-seq alignment and quantification [32, 44, 46, 48]. This is because the suite consists of free, open-source tools for gene discovery and comprehensive gene expression analysis of RNA-seq data [44]. In addition, the tools have been documented as being robust, efficient, and statistically principled [44]. The frequent use of these tools by other studies [32, 43, 44, 46, 48] means that results obtained with them can be compared to other studies that have used a similar pipeline, with minimal adjustments required. The TopHat tool aligns RNA-seq reads to a reference genome and performs well in aligning RNA reads across splice junctions [44, 46, 48].

The Cufflinks tool uses Bayesian algorithms to quantify the expression of gene isoforms in a particular sample, while Cuffdiff performs differential expression analysis by comparing gene expression levels across different samples, determining the statistical significance of expression differences, and correcting for multiple testing [44]. These tools have been validated in a number of studies, and are some of the simplest yet most robust tools for RNA-seq analysis [32, 44, 46].
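To illustrate what correcting for multiple testing involves, the sketch below implements the Benjamini-Hochberg false discovery rate procedure, a widely used correction of this kind. It is a minimal illustration with made-up p-values, and it is not meant to reproduce the exact internals of any particular tool.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values can be forced to be monotonically non-decreasing in rank.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four genes.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(q_values)
```

Genes whose adjusted value falls below the chosen false discovery rate threshold are then reported as significantly differentially expressed.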

INTRODUCTION: GENE EXPRESSION PROFILES FOR PDAC XENOGRAFTS IN BRCA MUTATION CARRIERS COMPARED TO NON-CARRIERS

The first paper in this submission, Gene expression profiles for PDAC xenografts in BRCA mutation carriers compared to non-carriers, discusses the study of gene expression differences in pancreatic ductal adenocarcinomas from patients with and without BRCA mutations. The author of this thesis was the first author of this paper, and conducted the bulk of the bioinformatics analysis and writing.


Gene expression profiles for PDAC xenografts from BRCA mutation carriers compared to non-carriers

Nikita Desai1,2, Anita Hall1,3,#, Diego Saavedra4, Michel L. Tremblay1,2,#, and George Zogopoulos1,3,#

1 Rosalind and Morris Goodman Cancer Research Center, McGill University, 1160 Pine Avenue, Montreal, Quebec, H3A 1A3, Canada

2 Department of Biochemistry, McGill University, Montreal, Quebec, Canada

3 The Research Institute of McGill University Health Center, McGill University, 1001 Decarie Boulevard, Montreal, Quebec, H4A 3J1, Canada

4 Geneix Inc., London, UK

#Co-corresponding authors.


ABSTRACT

Background

It is estimated that about 5-10% of pancreatic cancer cases are familial, with BRCA1/2 mutations as the most frequently implicated mutations in hereditary pancreatic cancer cases. In this study, the gene expression profiles for xenografts from patients with known harmful BRCA mutations were compared to xenografts from patients without BRCA mutations. It was hypothesized that gene expression profiles in the two sets of patients would be different, and understanding these differences could provide further insight into clinical management of each type of PDAC case.

Methodology

In this study, RNA was extracted from xenografts originally resected from 12 pancreatic ductal adenocarcinoma (PDAC) patients, four of whom had identified BRCA1/2 mutations. The extracted RNA was then sequenced and analyzed to compare gene expression profiles between BRCA mutation carriers and non-carriers.

Results

Unsupervised hierarchical clustering of over- and under-expressed genes in PDAC xenografts tended to cluster BRCA mutation carriers separately from BRCA mutation non-carriers. In addition, 26 genes were significantly differentially expressed between the two sets of xenografts. Subsequent analyses showed that CTSE and PRSS1, in particular, were significantly differentially expressed and seem to play a role in PDAC development and progression.

Conclusions

Based on differences in gene expression, we conclude that somewhat separate gene expression deregulation processes seem to be implicated in the development of PDAC cases in patients with and without identified BRCA mutations. Certain processes, such as mucin production, appear to play different roles in BRCA mutation non-carriers than in mutation carriers.

KEYWORDS

Pancreatic ductal adenocarcinoma, BRCA1/2 mutations, RNA-sequencing, Murine xenografts, Gene expression

List of Abbreviations

ACER2: alkaline ceramidase 2
AMY2B: amylase, alpha 2B
BRCA: breast cancer, early onset
CTRC: chymotrypsin C
CTSC: cathepsin C
CTSE: cathepsin E
DNA: deoxyribonucleic acid
ERP27: endoplasmic reticulum protein 27
FPKM: fragments per kilobase million
GKN1: gastrokine 1
GKN2: gastrokine 2
IDC: invasive ductal carcinomas
KRAS: Kirsten rat sarcoma viral oncogene homolog
MHC: major histocompatibility complex
MIR3615: microRNA 3615
MUC5AC: mucin 5AC
MUC6: mucin 6
PanIN: pancreatic intraepithelial neoplasia
PC: pancreatic cancer
PDAC: pancreatic ductal adenocarcinoma
PLAU: plasminogen activator, urokinase
PRSS1: protease, serine 1
RIN: RNA integrity number
RNA seq: RNA sequencing
RNA: ribonucleic acid
SCID: severe combined immunodeficiency
TAAR1: trace amine-associated receptor 1
TFF1: trefoil factor 1
TPM2: tropomyosin 2
VSIG1: V-set and immunoglobulin domain containing 1
WT: wild type

1.0 BACKGROUND

Although it has a low rate of occurrence, pancreatic cancer (PC) has one of the worst prognoses of any cancer, with a median survival of less than 6 months and a 5-year survival rate of 3-5%, and it is the fourth most common cause of cancer-related deaths across the globe [3, 49]. Reasons for the low survival rates and poor prognoses include late diagnoses, a propensity for rapid metastases, and resistance to most conventional and targeted therapies [5, 49, 50]. For this reason, the ability to optimize and improve upon existing diagnostic and treatment strategies is important for the overall clinical management of pancreatic cancer cases.

Pancreatic ductal adenocarcinomas (PDAC) are the most common histological type of pancreatic cancer, and make up about 85% of pancreatic cancer cases [8, 19]. These adenocarcinomas typically arise from premalignant lesions in the pancreatic ducts and pancreatic intraepithelial neoplasia (PanIN) [5, 16]. Some defining features of PDAC include very high KRAS mutation rates (>90% of PDAC cases), high propensity for local and distant metastasis, extensive stromal reactions, changes in cellular metabolism, and immunosuppression [5, 19]. In addition, PDAC genomes usually have a large number of chromosomal changes, including genetic alterations that induce tumourigenesis and determine disease phenotype [5, 19]. This means that characterizing and understanding these genetic changes and features of PDAC is important to understanding and managing the disease.

It is estimated that 5-10% of pancreatic cancer cases are familial. Breast cancer 1, early onset (BRCA1) and breast cancer 2, early onset (BRCA2) gene mutations, in particular, have been highly associated with hereditary cases of pancreatic cancer and increased incidence of pancreatic cancer [24, 26].

The BRCA1 and BRCA2 genes encode tumor suppressor proteins implicated in diverse cellular processes, especially DNA damage repair [51, 52]. BRCA1 is an E3 ubiquitin ligase essential to DNA repair, transcription regulation, cell-cycle progression, and X-inactivation [52]. BRCA2 is integral to homologous recombination, which is important to the repair of double-stranded DNA breaks and inter-strand crosslinks [53]. Both genes are therefore critical to the maintenance of genomic stability, as they are essential to transcription regulation and DNA damage repair [52, 54].

Tumor suppressor proteins have essential anti-cancer functions, such as controlling cell growth and proliferation or inducing cell death to keep cells in proper balance [29, 55]. Some tumor suppressor genes, such as BRCA1/2, are involved in DNA-repair processes, which are essential to preventing mutations in cancer-related genes [55]. The inactivation of these tumor suppressor proteins, through gene mutations or through suppression of gene expression, can result in uncontrolled cancer- or tumor-promoting phenomena [55]. For this reason, germ line mutations in BRCA1/2 genes that affect the functions of their proteins have been shown to predispose individuals to certain types of cancer, especially breast, ovarian and prostate cancers [55].

BRCA1 mutations account for most hereditary breast and ovarian cancer cases, up to 40-50% of hereditary breast cancer cases alone [54]. BRCA1 and BRCA2 gene mutations have also been associated with increased lifetime risks of pancreatic cancer [24, 26]. In fact, after breast, ovarian, and prostate cancers, pancreatic cancer is one of the most BRCA-associated cancers, and mutation carriers have increased risk of pancreatic cancer in both men and women [26, 30, 31].

According to the two-hit hypothesis [55], a mutation in one allele of the gene is usually inherited, and a somatic mutation then alters the second allele, resulting in two mutant alleles [52, 55]. Tumors arising from these mutations invariably have two mutant alleles, which demonstrates the significant difference that germ line mutations in these genes can make to the risk of developing certain kinds of tumors [55]. In addition, because only one additional mutation has to take place to inactivate the gene, patients with germ line mutations often present with cancer at a younger than average age of onset [55].

Differences in pathology and gene expression have been observed in breast and ovarian tumors from patients with BRCA1/2 mutations, in comparison with sporadic cases or with patients who are not carriers of BRCA mutations [56, 57]. Understanding these variations has already resulted in different clinical management strategies for patients with and without the mutations. This study therefore aims to determine whether significant gene expression differences can be observed between tumors from pancreatic cancer patients with known BRCA1/2 mutations and tumors from patients without the mutations. Understanding these differences may help in optimizing selected treatments for PDAC patients with or without BRCA mutations.

In this study, twelve tumors were resected from patients with pancreatic ductal adenocarcinoma (PDAC) and were individually expanded in murine xenografts. RNA was extracted from these isolated tumors and corresponding transcriptomic profiles were generated by RNA sequencing (RNA seq). Three of the tumors were from patients identified as having BRCA2 mutations, while one of the tumors was from a patient with a BRCA1 mutation. RNA from a healthy pancreatic ductal sample was also extracted and sequenced. Bioinformatics analysis was conducted on the RNA-seq datasets for each of the xenografts to quantify and compare gene expression. Genes differentially expressed between xenografts from patients with and without BRCA mutations were extracted and analyzed to elucidate pathways implicated in each tumor type.

Herein we demonstrate that specific differences in gene expression profiles do exist between patients with and without BRCA mutations, and that different pathways seem to be implicated in mutation carriers versus non-carriers. In particular, CTSE and mucin production, based on the expression of specific genes associated with mucin biosynthesis, appear to play a greater role in PDAC cases without BRCA mutations. In addition, the down-regulation of PRSS1 and associated genes seems to play a greater role in BRCA mutation carriers than in non-carriers.

2. METHODOLOGY

2.1 Tumor Resection and Xenografts

Tumors were resected from twelve patients with pancreatic ductal adenocarcinomas (PDAC). Three of these patients had BRCA2 mutations while one patient had a BRCA1 mutation. In addition, a healthy ductal sample was resected from a patient who did not have pancreatic cancer. Information about the samples, including the BRCA mutation status of the patient of origin, is summarized in supplementary table 1.

Resected tumors were then implanted subcutaneously into SCID beige mice. After 2-4 months, tumors developed in the mice. RNA was extracted from the first xenograft passage for each tumor sample and from the single normal sample. The 13 RNA samples were then sent for sequencing at the Beijing Genomics Institute (BGI). Before sequencing was performed, the quality of the RNA from each sample was assessed by measuring the RNA integrity number (RIN). The RIN for each of the samples is also shown in supplementary table 1.

The Illumina HiSeq 2000 sequencer was used for RNA sequencing, producing paired-end reads with an insert length of 200 bp and a read length of 90 bp. The approximate depth of the datasets was between 51 and 76 million paired reads per patient sample (supplementary table 1).

2.2 Bioinformatics Analysis

Alignment and Gene Expression Quantification

Once each of the RNA samples was sequenced, the quality of the reads in each of the RNA-seq datasets was checked using FastQC and was determined to be acceptable. Once quality control was completed, adaptor sequences were trimmed using Trimmomatic (v. 0.32) before continuing with gene expression analysis.

For gene expression analysis, a standard TopHat-Cufflinks pipeline [44, 58] was used for quantification and differential expression analysis. The paired-end RNA reads were mapped to the Ensembl human reference genome (GRCh38.p2) [59] using TopHat (v. 2.0.12) with no coverage search and a mean mate inner distance of 100.
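The alignment settings described above correspond to the TopHat options `--no-coverage-search` and `--mate-inner-dist`. The sketch below only assembles such a command as an argument list and does not run it; the index prefix, FASTQ file names, and output directory are hypothetical placeholders, and any other options used in the study are not stated here.

```python
def build_tophat_command(index_prefix, reads_1, reads_2, output_dir):
    """Assemble a TopHat paired-end command line as a list of arguments."""
    return [
        "tophat",
        "--no-coverage-search",        # skip the coverage-based junction search
        "--mate-inner-dist", "100",    # expected mean inner distance between mates
        "-o", output_dir,              # output directory
        index_prefix,                  # prefix of the genome index (hypothetical)
        reads_1,                       # first-mate reads
        reads_2,                       # second-mate reads
    ]


# Hypothetical file names for one xenograft sample.
command = build_tophat_command(
    "GRCh38_index", "sample_R1.fastq", "sample_R2.fastq", "tophat_out"
)
print(" ".join(command))
```

Building the command as a list keeps each option separate from its value, so the same list could be passed to a process launcher without shell quoting issues.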

Since RNA was extracted from mouse xenografts, it had to be ascertained that subsequent gene expression analysis was from human sequence reads. The percentage of reads aligned to the mouse genome (mm10) [59] versus the human genome for the first tumor xenograft, PDAC X1, was 11.2% and 82.1%, respectively. The remaining reads did not align to either the mouse or human genomes. There was a very small overlap between reads mapped to the human genome versus the mouse genome (~0.4%). Reads mapping to the human genome and not to the mouse genome were therefore used for gene expression analysis.
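The selection of human-specific reads described above amounts to a set difference on read identifiers. The sketch below is a toy illustration with hypothetical read IDs; it assumes the identifiers of reads aligned to each genome have already been collected, and it is not the procedure actually used to process the alignment files.

```python
def human_only_reads(human_mapped_ids, mouse_mapped_ids):
    """Return IDs of reads that map to the human genome but not to mouse."""
    return set(human_mapped_ids) - set(mouse_mapped_ids)


# Toy read IDs (hypothetical), not data from the study.
human_hits = {"read1", "read2", "read3", "read4"}
mouse_hits = {"read4", "read5"}

kept = human_only_reads(human_hits, mouse_hits)
ambiguous = human_hits & mouse_hits  # reads mapping to both genomes

print(sorted(kept))       # reads retained for expression analysis
print(sorted(ambiguous))  # reads excluded as ambiguous
```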

Once the alignment of the reads to the human genome was completed, Cufflinks (v. 2.1.1) [44] was used with the Ensembl human transcriptome (GRCh38.p2) as a guide to reconstruct and quantify transcripts in each of the samples. This allowed for increased accuracy of transcript reconstruction, as well as detection of potentially novel transcripts in each of the datasets [44].

Cuffdiff (v. 2.1.1) was then used to estimate FPKM (“fragments per kilobase of exon model per million mapped reads”) values and levels of differential gene expression [44].

2.3 Unsupervised Analysis: Hierarchical Clustering

To determine whether gene expression profiles in BRCA mutation carriers are different from non-carriers, an unsupervised hierarchical clustering was performed on two different gene lists to determine whether BRCA mutation carriers tended to cluster separately from non-carriers.

Filtering and Extraction

To increase the robustness of the hierarchical clustering, a number of filters were applied to the Cuffdiff output for each of the transcripts. First, only transcripts with an “OK” test status, i.e. transcripts successfully quantified because the concentration of reads in exons was neither too high nor too low, were kept [44]. In addition, transcripts with lengths lower than 150 bp or higher than 30,000 bp were filtered out. Finally, transcripts with FPKM values lower than 1 or higher than 500,000 were filtered out.
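The three filters above can be expressed as a single predicate. The following is a minimal sketch using the thresholds stated in the text; the record layout (`status`, `length`, `fpkm` keys) and the example transcripts are hypothetical and do not reflect the exact file format that was parsed.

```python
def passes_filters(transcript, min_len=150, max_len=30_000,
                   min_fpkm=1.0, max_fpkm=500_000.0):
    """Keep transcripts with OK status and length/FPKM within the stated bounds."""
    return (
        transcript["status"] == "OK"
        and min_len <= transcript["length"] <= max_len
        and min_fpkm <= transcript["fpkm"] <= max_fpkm
    )


# Hypothetical transcripts illustrating each filter.
transcripts = [
    {"id": "T1", "status": "OK", "length": 2100, "fpkm": 35.2},      # kept
    {"id": "T2", "status": "LOWDATA", "length": 900, "fpkm": 4.0},   # bad status
    {"id": "T3", "status": "OK", "length": 120, "fpkm": 12.0},       # too short
    {"id": "T4", "status": "OK", "length": 1500, "fpkm": 0.3},       # FPKM too low
]

kept_ids = [t["id"] for t in transcripts if passes_filters(t)]
print(kept_ids)
```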

Heat maps and Dendrograms

Once the gene expression files were filtered, log2 fold change values between each of the tumors and the normal ductal sample were used for hierarchical clustering. With only one normal ductal sample, the statistical significance of the gene expression changes between tumor and normal could not be determined. The log2 fold change values between each tumor and the normal sample, however, could be compared amongst the tumors. Using fold change values rather than raw FPKM values provided some normalization of expression values.
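The log2 fold change of a tumor relative to the normal ductal sample is the base-2 logarithm of the ratio of the two FPKM values. The sketch below uses hypothetical FPKM values; how zero values were handled is not described in the text, so this example simply rejects non-positive inputs.

```python
import math


def log2_fold_change(tumor_fpkm, normal_fpkm):
    """Return log2(tumor / normal) for two positive FPKM values."""
    if tumor_fpkm <= 0 or normal_fpkm <= 0:
        raise ValueError("FPKM values must be positive")
    return math.log2(tumor_fpkm / normal_fpkm)


# Hypothetical values: a 16-fold increase and a 64-fold decrease.
up = log2_fold_change(16.0, 1.0)
down = log2_fold_change(1.0, 64.0)
print(up, down)
```

A value of +4 therefore corresponds to a 16-fold higher expression in the tumor, and a value of -6 to a 64-fold lower expression.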

Two gene lists were used to make heat maps and dendrograms for the tumors. The gene lists contained genes with a log2 fold change of at least ±4 and ±6, respectively, in at least one of the tumor samples compared to normal. The gene lists had 592 and 183 genes, respectively. An unsupervised hierarchical clustering was performed for both lists using the heatmap.2 function in R (Figure 1).
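The gene-list selection described above can be sketched as follows. This is an illustration under stated assumptions: expression values are positive FPKMs (which the filtering step guarantees), the gene names and numbers are hypothetical, and the clustering itself, which was done with heatmap.2 in R, is not reproduced here.

```python
import math


def log2_fold_change(tumor_fpkm, normal_fpkm):
    """log2 of tumor expression over expression in the normal ductal sample."""
    return math.log2(tumor_fpkm / normal_fpkm)


def select_genes(tumor_fpkm_by_gene, normal_fpkm_by_gene, threshold):
    """Return genes whose |log2 fold change| reaches the threshold in at least one tumor."""
    selected = []
    for gene, tumor_values in tumor_fpkm_by_gene.items():
        normal = normal_fpkm_by_gene[gene]
        changes = [log2_fold_change(value, normal) for value in tumor_values]
        if any(abs(change) >= threshold for change in changes):
            selected.append(gene)
    return selected


# Hypothetical FPKM values for two tumors per gene, for illustration only.
tumors = {
    "geneA": [64.0, 2.0],    # log2 fold changes: 5, 0
    "geneB": [1.0, 128.0],   # log2 fold changes: 0, 7
    "geneC": [4.0, 8.0],     # log2 fold changes: 1, 2
    "geneD": [1.0, 40.0],    # log2 fold change of -6 in the first tumor
}
normal = {"geneA": 2.0, "geneB": 1.0, "geneC": 2.0, "geneD": 64.0}

list_4 = select_genes(tumors, normal, threshold=4)
list_6 = select_genes(tumors, normal, threshold=6)
```

As in the text, the stricter threshold yields a subset of the more permissive list.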

2.4 Supervised Clustering: BRCA mutation carriers vs. non-carriers

Supervised analysis comparing xenografts from BRCA mutation carriers and non-carriers was performed using Cuffdiff. The goal was to determine whether significant differences in gene expression patterns linked with the development and propagation of PDAC were present in the mutation carriers compared to non-carriers.

Four of the twelve patients from whom the tumors were extracted were identified as having BRCA mutations. Three of the patients (PDAC X2, PDAC X10, and PDAC X12) were identified as having BRCA2 mutations while one patient (PDAC X11) was identified as having a BRCA1 mutation (Supplementary Table 1).

It is important to note that due to low power (only one sample with a BRCA1 mutation), all the samples with BRCA mutations were analyzed together. This is notable since, although BRCA1 and BRCA2 mutation tumors tend to behave differently from sporadic tumors, they also tend to behave slightly differently from one another [24, 30, 56].

With the exception of PDAC X1, BRCA mutation carriers tended to cluster closer together than non-carriers in the resulting dendrograms. Although PDAC X1 was suspected to behave similarly to the BRCA mutation carriers, as the patient of origin presented at a younger than average age, a BRCA mutation was not identified in the patient. PDAC X1 was therefore put in the BRCA mutation non-carrier set for supervised analysis.

To determine which genes were significantly differentially expressed between the two sets of xenografts, the Cuffdiff (v. 2.1.1) tool was run on the two sets using upper quartile normalization [44, 60]. Genes determined by Cuffdiff to be significantly differentially expressed were extracted [44]. The resulting list contained 46 genes, 20 of which were identified as novel by Cufflinks and did not match known genes when aligned against the human genomic and transcript database using the blastn tool from NCBI [61, 62]. These 20 genes were removed from the list before continuing the analysis, leaving 26 genes. The expression data for all 46 differentially expressed genes are included in Supplementary Table 2.

Heat maps and Dendrogram

A heat map with a dendrogram was created for the remaining 26 genes (Figure 2). The average FPKM values were then extracted for the BRCA mutation carrier set, the non-carrier set and the single non-tumor sample (Table 1).

Co-expression analysis

Finally, as a means of validation and to elucidate enriched pathways, co-expression analysis was performed on eight genes that were highly differentially expressed between the two sets of xenografts. For each of these genes, a list of co-expressed genes was extracted using the GeneFriends: RNA Seq [63] tool. For each of the genes, a gene list was created for the co-expressed genes with a mutual rank less than or equal to ten [63]. Once these gene lists were extracted, the average FPKM values of each gene in each set were extracted from the Cuffdiff output.
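The mutual rank cut-off can be illustrated with a short sketch. It assumes the commonly used definition of mutual rank as the geometric mean of the two directional co-expression ranks (the rank of gene B in gene A's list and the rank of gene A in gene B's list); whether GeneFriends computes it in exactly this way is an assumption here. The gene labels and rank values are placeholders, not data from this study.

```python
import math


def mutual_rank(rank_of_b_for_a, rank_of_a_for_b):
    """Geometric mean of the two directional co-expression ranks (assumed definition)."""
    return math.sqrt(rank_of_b_for_a * rank_of_a_for_b)


def coexpressed_partners(query, ranks, cutoff=10):
    """Return genes whose mutual rank with the query gene is at most the cut-off."""
    partners = []
    for gene, rank_from_query in ranks[query].items():
        rank_from_gene = ranks[gene][query]
        if mutual_rank(rank_from_query, rank_from_gene) <= cutoff:
            partners.append(gene)
    return partners


# Placeholder ranks: ranks[a][b] is the rank of gene b in gene a's co-expression list.
ranks = {
    "Q": {"G1": 2, "G2": 5, "G3": 4},
    "G1": {"Q": 8},    # mutual rank sqrt(2 * 8)  = 4.0
    "G2": {"Q": 50},   # mutual rank sqrt(5 * 50) ~ 15.8
    "G3": {"Q": 25},   # mutual rank sqrt(4 * 25) = 10.0
}

partners = coexpressed_partners("Q", ranks)
```

A symmetric measure of this kind favours gene pairs that rank highly in both directions, rather than pairs where only one gene lists the other near the top.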

3. RESULTS

3.1 Unsupervised Analysis: Hierarchical Clustering

Hierarchical clustering performed on genes with a log2 fold change of at least ±4 and ±6 yielded two heat maps (Figure 1). Xenografts from patients with BRCA1/2 mutations (in red) tended to cluster closer together than xenografts from non-carriers.

Figure 1: Heat maps with dendrograms for gene lists with a) ±4 log2 fold change in at least one tumor (592 genes) and b) ±6 log2 fold change in at least one tumor (183 genes). Genes shown in green have more expression in tumor compared to the normal ductal sample, while genes shown in red have less expression in tumor compared to the normal sample. Sample names in red have been identified as being BRCA mutation carriers.

3.2 Supervised Analysis – Differential Gene Expression

The average expression data for the significantly differentially expressed genes in the two tumor xenograft sets as well as the normal sample were then extracted using Cuffdiff, and the 26 non-novel genes were used to create a heat map and dendrogram (Figure 2).

Figure 2: Heat map with dendrograms for the 26 most significantly differentially expressed genes. Boxes in green have a high overall expression while boxes in red have a lower overall expression. Sample names in red have been identified as being BRCA mutation carriers.

The average expression of each of the genes in the BRCA mutation carriers, non-carriers and normal ductal sample was also extracted from Cuffdiff (Table 1). Again, PDAC X1 tended to cluster with the BRCA mutation carriers.

19 of the 26 genes were more highly expressed in the BRCA wild-type (WT) samples compared to the BRCA mutation carriers and normal tissues. Two were more highly expressed in BRCA WT and normal tissue than in BRCA mutation carriers, and three were more highly expressed in BRCA mutation carriers than BRCA WT or normal tissue.

Of the 26 genes, 8 have been previously directly linked with pancreatic cancer, and 6 have been linked with one or more other types of cancer. In addition, nine of the genes were associated with cellular proliferation, DNA replication and/or transcription. Finally, three of the genes (CTSE, PRSS1, and CTSC) were associated with proteolysis.

| Gene | BRCA WT FPKM | BRCA Mut FPKM | Normal Duct FPKM | Cancer Associations | Associated Functions |
|---|---|---|---|---|---|
| CTSE | 1485.27 | 252.24 | 0.94 | Pancreatic [64, 65] | Proteolysis [66] |
| MIR3615 | 10776.2 | 131.22 | 42.12 | - | - |
| TFF1 | 6445.06 | 873.95 | 18.66 | Pancreatic [65, 67] | Cell proliferation [65, 66] |
| HIST1H2BJ | 545.78 | 44.04 | 46.2 | | DNA replication [65] |
| PRSS1 | 341.11 | 7.56 | HIDATA | Pancreatic [21, 65] | Proteolysis [65, 66] |
| CA2 | 320.73 | 35.66 | 33.76 | Esophageal, renal, lung [68] | - |
| PIGR | 263 | 31.85 | 11.34 | Hepatocellular [65] | - |
| CREB3L1 | 183.23 | 31.12 | 17.25 | - | Transcription regulation [65, 66] |
| RBP4 | 178.65 | 5.03 | 4.21 | - | - |
| HIST1H1C | 156.99 | 17.06 | 19.73 | Pancreatic [65] | DNA replication [65] |
| DMBT1 | 58.55 | 0.49 | 0.41 | Brain, lung, gastric [65] | Cell proliferation [65, 66] |
| HIST1H4H | 45.85 | 1.16 | 0.45 | - | DNA replication [65] |
| ADRA2A | 24.51 | 1.31 | 17.24 | - | DNA replication [65] |
| MXRA5 | 21.54 | 1.12 | 2.44 | Colorectal [69] | - |
| CPA1 | 16.19 | 0.1 | 0 | Pancreatic [65] | Proteolysis [65, 66] |
| XIST | 11.66 | 0.03 | 0.01 | Breast [70] | - |
| SPON1 | 9.01 | 0.39 | 0.93 | - | - |
| CREB3L3 | 6.92 | 0.35 | 0 | - | Transcription regulation [65, 66] |
| PIWIL1 | 3.84 | 0.05 | 0.01 | - | - |
| TRPM5 | 1.48 | 0.03 | 0.99 | - | - |
| MUC6 | 40.37 | 0.46 | 39.41 | Pancreatic [71] | - |
| CADPS | 4.04 | 0.01 | 7.23 | - | - |
| SERPINF1 | 0.75 | 23.32 | 44.48 | - | Cell proliferation [65] |
| PLAU | 14.45 | 118.78 | 0.89 | Pancreatic [65] | Hypoxia response [65, 66] |
| TPM2 | 38.21 | 217.85 | 30.36 | Pancreatic [18, 65] | - |
| CTSC | 98.55 | 503.99 | 25.23 | Breast [72] | Proteolysis [65, 66] |

Table 1: Average FPKM values for the 26 most significantly differentially expressed genes in the BRCA WT xenografts, BRCA1/2 mutated xenografts, and the single normal ductal sample. Dark green values represent values at least 100 FPKM greater than overall average, light green are less than 100 FPKM above average, dark red are at least 100 FPKM below average, and light red less than 100 FPKM below average. The last two columns contain published cancer associations and associated functions.


Co-Expression Analysis

As a means of validation of the gene expression data and to further understand differences in patient-derived xenografts with and without BRCA mutations, co-expression analysis was performed on 8 genes that were highly differentially expressed. These genes were CTSE, TFF1, HIST1H2BJ, PRSS1, MUC6, PLAU, TPM2 and CTSC. The MIR3615 microRNA was not assessed as the co-expression database did not have co-expression information for the gene [63].

Once co-expression analysis was performed on each of the selected genes, the genes were grouped in terms of known function or signaling pathways.

a) Genes associated with mucin production in gastrointestinal lining – CTSE, TFF1, MUC6

Three of the genes, cathepsin E (CTSE), trefoil factor 1 (TFF1), and mucin 6 (MUC6), were found on each other’s co-expression lists and have several co-expressed genes in common (Table 2). Some of the genes in the single normal sample have a NOTEST test status, which means that not enough reads aligned to the gene exons to reliably quantify expression [44].

Five of the genes on the combined co-expression gene list have been associated with pancreatic cancer and five have been associated with gastric or mucin-producing tumors (Table 2). In addition, the eight genes which are highlighted in green have been shown to be overexpressed in invasive ductal carcinomas (IDC) of the pancreas through microarray analysis in the Pancreatic Cancer Database [73]. Eight of the genes (CTSE, TFF2, GKN2, GKN1, MUC5AC, VSIG1, TFF1, MUC6) are usually expressed in gastric epithelial mucosa, while three of the genes (ACER2, TFF2 and GKN1) are associated with positive cell proliferation and/or negative cell-matrix adhesion [65, 66].


| Gene ID | CTSE P.C. | TFF1 P.C. | MUC6 P.C. | BRCA WT FPKM | BRCA Mut FPKM | Norm Duct FPKM | Cancer associations |
|---|---|---|---|---|---|---|---|
| CTSE | 1 | 0.79 | 0.75 | 1485.27 | 252.24 | 1.13 | Mucin-producing tumors [65] |
| TFF2 | 0.95 | 0.8 | 0.78 | 2117.32 | 630.75 | 8.60 | Gastric [74] |
| GKN2 | 0.93 | 0.83 | 0.73 | 20.60 | 0.04 | NOTEST | Gastric [65] |
| GKN1 | 0.92 | 0.83 | 0.72 | 22.02 | 0.06 | NOTEST | Gastric [65] |
| DPCR1 | 0.92 | 0.84 | 0.69 | 262.05 | 185.54 | 0.33 | |
| ANXA10 | 0.9 | 0.78 | - | 257.34 | 130.80 | 7.06 | Pancreatic [65] |
| MUC5AC | 0.91 | 0.82 | - | 94.42 | 18.25 | 0.00 | Colon and pancreatic [75] |
| VSIG1 | 0.89 | 0.79 | - | 175.25 | 59.14 | 0.16 | |
| ACER2 | 0.8 | 0.73 | - | 26.33 | 3.90 | 2.90 | |
| TFF1 | 0.79 | 1 | - | 6445.06 | 873.95 | 18.66 | Pancreatic cancer [65, 67] |
| CLDN18 | 0.89 | - | 0.7 | 529.28 | 75.49 | 0.16 | Infiltrating ductal adenocarcinomas [65] |
| VSIG2 | 0.85 | - | 0.75 | 181.02 | 40.98 | 7.26 | |
| SLC9A4 | 0.76 | - | 0.68 | 8.20 | 3.63 | 0.07 | |
| MUC6 | 0.75 | - | 1 | 40.37 | 0.46 | 39.41 | Pancreatic cancer [65, 71] |
| FER1L6 | 0.86 | - | - | 79.17 | 15.68 | 0.32 | |
| CAPN9 | 0.78 | - | - | 30.86 | 7.43 | 0.39 | |
| FAM101A | 0.78 | - | - | 120.43 | 69.91 | 53.85 | |
| AGR2 | 0.74 | - | - | 4527.58 | 1959.87 | 24.42 | |
| A4GNT | - | - | 0.89 | 0.28 | 0.15 | 1.67 | Gastric, mucin-producing [71] |
| GAST | - | - | 0.68 | 1.00 | 0.08 | 0.39 | Mucin production [65] |
| TAAR1 | - | - | 0.59 | 0.06 | 1.86 | NOTEST | |

Table 2: Co-expressed genes for CTSE, TFF1, and MUC6, with Pearson correlation (P.C.) scores for each gene, average expression values in BRCA WT and mutation carriers and the single normal sample, and cancer associations. Gene values in dark green have expression 100 FPKM more than average expression across tumors, while genes in light green have expression more than 10 FPKM but less than 100 FPKM compared to average expression. Gene values in red are under-expressed. Gene names highlighted in green have been demonstrated to have increased expression in invasive ductal carcinomas in the Pancreatic Cancer Database [73].

All the genes contained in the combined co-expression lists for CTSE, TFF1 and MUC6 had a much lower expression in the BRCA mutation carriers than in the mutation non-carriers, with the exception of TAAR1 (Figure 3).

Figure 3: Stacked graph showing relative expression of each of the 21 CTSE, TFF1 and MUC6 co-expressed genes as a percentage of total expression in BRCA mutation carriers (red) versus non-carriers (blue).
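The quantity plotted in a stacked graph of this kind can be sketched as below. This is an assumption about how the percentages were derived (each gene's two set averages scaled so that they sum to 100%); the function name is illustrative, and the FPKM values are taken from Table 2.

```python
def relative_expression(wt_fpkm, mut_fpkm):
    """Return (non-carrier %, carrier %) of a gene's combined average expression."""
    total = wt_fpkm + mut_fpkm
    if total == 0:
        return (0.0, 0.0)  # gene not expressed in either set
    return (100 * wt_fpkm / total, 100 * mut_fpkm / total)


# Average FPKM values (BRCA WT, BRCA Mut) from Table 2.
table2_values = {
    "CTSE": (1485.27, 252.24),
    "TFF1": (6445.06, 873.95),
    "TAAR1": (0.06, 1.86),
}

shares = {gene: relative_expression(wt, mut) for gene, (wt, mut) in table2_values.items()}
```

Under this assumption, carriers account for roughly 15% of combined CTSE expression but about 97% of combined TAAR1 expression, which matches TAAR1 being the stated exception.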

Since five of the genes up-regulated in BRCA wild-type samples were implicated in mucin-producing tumors, the expression of mucin genes was extracted from the Cufflinks results for the BRCA wild-type samples, BRCA mutation carriers, and the single normal duct sample (Table 3).

Four of the genes, MUC1, MUC5AC, MUC13, and MUC17, had higher expression in the BRCA mutation non-carriers than in the BRCA mutation carriers or the normal sample. Two of the genes, MUC6 and MUC5B, had lower levels of expression in the BRCA mutation carriers than in the non-carriers or the normal ductal sample (Table 3).

| Gene ID | BRCA WT FPKM | BRCA Mut FPKM | Normal Duct FPKM | Notes |
|---|---|---|---|---|
| MUC1 | 1636.65 | 1080.46 | 517.30 | Increased in PDAC [16] |
| MUC13 | 351.37 | 91.52 | 2.89 | Increased in PDAC [16] |
| MUC17 | 186.74 | 86.05 | 0.04 | Neoexpressed in PDAC [16] |
| MUC3A | 171.16 | 201.91 | 8.67 | |
| MUC20 | 110.07 | 75.20 | 28.84 | |
| MUC5AC | 94.42 | 18.25 | 0.00 | Neoexpressed in PanIN [16] |
| MUC6 | 40.37 | 0.46 | 39.41 | Increased in PanIN, decreased in PDAC [16] |
| MUC5B | 32.91 | 13.02 | 33.07 | Expressed in normal ductal [16] |
| MUC12 | 9.10 | 5.53 | 0.35 | |
| MUC4 | 7.07 | 12.27 | 0.56 | Increased in PDAC [16] |
| MUC2 | 6.99 | 2.01 | 0.02 | |
| MUC15 | 1.31 | 0.36 | 1.83 | |
| MUC16 | 0.43 | 13.43 | 0.15 | |

Table 3: Expression values for mucin genes in BRCA WT samples, BRCA mutation carriers and the single normal duct sample. The genes are highlighted based on the difference between the expression values for the particular set and the average gene expression across all 13 pancreatic samples. Genes highlighted in darker green and red have expression at least 50 FPKM greater or lesser than average, respectively. Lighter green and red values differ from the average by less than 50 FPKM but more than 10 FPKM.

b) Genes associated with methylation – HIST1H2BJ

Six of the top seven co-expressed genes were slightly overexpressed in the BRCA WT set, while one of the genes, HIST1H3A, was not expressed highly enough to be quantified by RNA-Seq (Table 4). Almost all the co-expressed histone cluster 1 genes had a higher expression in the BRCA mutation non-carriers than in the carriers (Figure 4).

| Gene ID | HIST1H2BJ P.C. | BRCA WT FPKM | BRCA Mut FPKM | Log2 Fold Change |
|---|---|---|---|---|
| HIST1H2BJ | 1 | 545.78 | 44.04 | 3.63 |
| HIST1H2BO | 0.85 | 9.32 | 1.98 | 2.23 |
| HIST1H3A | 0.85 | - | - | - |
| HIST1H2AE | 0.82 | 7.66 | 1.85 | 2.05 |
| HIST1H4I | 0.82 | 1.22 | 0.15 | 3.02 |
| HIST1H2BG | 0.75 | 6.58 | 1.67 | 1.97 |

Table 4: Co-expressed genes for HIST1H2BJ with Pearson correlation (P.C.) values, average FPKM values in the two tumor xenograft sets, and the log2 fold change of BRCA WT expression over BRCA mutation carriers.
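The log2 fold change column of Table 4 can be checked against the two FPKM columns with a few lines of code. This is a sketch for illustration: the function name is hypothetical, unquantified genes are represented as `None`, and small differences from the published column are expected because the table reports rounded FPKM values.

```python
import math


def log2_fc_wt_over_mut(wt_fpkm, mut_fpkm):
    """log2 of BRCA WT expression over BRCA mutation carrier expression, or None if unquantified."""
    if wt_fpkm is None or mut_fpkm is None:
        return None
    return math.log2(wt_fpkm / mut_fpkm)


# (BRCA WT FPKM, BRCA Mut FPKM, reported log2 fold change) from Table 4.
table4 = {
    "HIST1H2BJ": (545.78, 44.04, 3.63),
    "HIST1H2BO": (9.32, 1.98, 2.23),
    "HIST1H3A": (None, None, None),
    "HIST1H2AE": (7.66, 1.85, 2.05),
    "HIST1H4I": (1.22, 0.15, 3.02),
    "HIST1H2BG": (6.58, 1.67, 1.97),
}

recomputed = {gene: log2_fc_wt_over_mut(wt, mut) for gene, (wt, mut, _) in table4.items()}
```

A positive value means higher average expression in the BRCA WT set; every quantified gene in Table 4 is positive.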

Figure 4: Stacked graph showing relative expression of histone cluster 1 genes as a percentage of total expression in BRCA mutation carriers (red) versus non-carriers (blue).

c) PRSS1 and associated genes

All except one of the co-expressed genes for PRSS1 had a lower expression in BRCA mutation carriers than in non-carriers (Table 5). All the tumor xenografts, however, had much lower expression values (reduced more than 1000-fold in some cases) for most of the genes than the normal ductal sample. Some of the genes in the normal ductal sample had a “HIDATA” status in the Cuffdiff output, meaning that the number of reads aligned to gene exons was too high to reliably quantify those genes [44]. Hence, exact expression values could not be extracted for those particular genes. Most of the genes (10/15) on the co-expression list for PRSS1 have been demonstrated to have decreased expression in invasive ductal carcinomas in the Integrated Pancreatic Database [73]. With the exception of ERP27 and AMY2B, all the genes had a higher expression in BRCA mutation non-carriers than in mutation carriers (Figure 5).

| Gene ID | PRSS1 P.C. | BRCA WT FPKM | BRCA Mut FPKM | Normal Duct FPKM | log2 fold change |
|---|---|---|---|---|---|
| PRSS1 | 1 | 341.11 | 7.56 | HIDATA | 5.50 |
| CTRC | 1 | 4.83 | 0.05 | 28493 | 6.48 |
| PLA2G1B | 0.98 | 7.75 | 0.20 | 91355 | 5.24 |
| CPB1 | 0.98 | 10.70 | 0.07 | HIDATA | 7.18 |
| PNLIP | 0.98 | 1.81 | 0.02 | HIDATA | 6.47 |
| CELA2A | 0.98 | 0.28 | 0.04 | 33832 | 2.73 |
| CPA1 | 0.97 | 16.19 | 0.10 | HIDATA | 7.30 |
| ERP27 | 0.95 | 1.42 | 3.97 | 684 | -1.48 |
| AMY2B | 0.98 | 17.36 | 13.62 | HIDATA | 0.35 |
| CEL | 0.95 | 3.07 | 0.22 | HIDATA | 3.81 |
| CELA2B | 0.97 | 0.28 | 0.04 | 33832 | 2.73 |
| AMY2A | 0.97 | 0.69 | 0.00 | HIDATA | -inf |
| SYCN | 0.95 | 1.19 | 0.00 | 28046 | -inf |
| PRSS3 | 0.97 | 302.17 | 149.50 | 12280 | 1.02 |
| CLPS | 0.93 | 23.42 | 0.16 | HIDATA | 7.23 |

Table 5: Co-expressed genes for PRSS1, with Pearson correlation (P.C.) values, average FPKM in each set, test status for each gene in Cuffdiff, and the log2 fold change between the BRCA WT and mutation carriers. Genes highlighted in red have been shown to be under-expressed in invasive ductal carcinomas [73]. HIDATA indicates that too many reads mapped to exons for Cufflinks to quantify the gene accurately.

Figure 5: Stacked graph showing relative expression of PRSS1 co-expressed genes as a percentage of total expression in BRCA mutation carriers (red) versus non-carriers (blue).

d) PLAU – Overexpression in BRCA Mutation Carriers

For the plasminogen activator, urokinase (PLAU) gene, six of the top nine co-expressed genes had slightly higher expression in the BRCA mutation carriers. Three of the overexpressed genes have been recorded as having increased expression in invasive ductal carcinomas in the pancreatic database (Table 6). However, the absolute increases for the co-expressed genes were small (~1-12 FPKM). The overexpression of PLAU-associated genes was therefore not considered to be very robust, and could not be used to elucidate associated pathways.

| Gene Symbol | PLAU P.C. | BRCA WT FPKM | BRCA Mut FPKM | Log2 fold change |
|---|---|---|---|---|
| PLAU | 1 | 14.45 | 118.78 | -3.04 |
| C10orf55 | 0.68 | 0.13 | 0.45 | -1.79 |
| KRT37 | 0.58 | 0.00 | 0.00 | 0.00 |
| LOXL1-AS1 | 0.54 | 4.86 | 2.54 | 0.93 |
| HEPHL1 | 0.5 | 0.53 | 0.70 | -0.41 |
| COL13A1 | 0.5 | 0.19 | 0.02 | -3.17 |
| PHLDA1 | 0.45 | 6.87 | 18.31 | -1.41 |
| TMEM158 | 0.46 | 0.67 | 1.41 | -1.08 |
| KRT75 | 0.44 | 0.00 | 0.00 | 0.00 |

Table 6: Co-expressed genes for PLAU, with Pearson correlation (P.C.) values, average FPKM values in the two tumor xenograft datasets, and the log2 fold-change values of the BRCA WT dataset over the BRCA mutation carriers.

e) CTSC and TPM2 – Non-Uniform Gene Co-Expression

For the 6 genes co-expressed with TPM2 and the 6 genes co-expressed with CTSC, the direction of over- or under-expression was not uniform across the co-expressed genes. For TPM2, half the genes had greater expression in the BRCA WT set and the other half had greater expression in the BRCA mutation carriers. For the 6 genes in the CTSC co-expression list, 3 did not have sufficient expression for quantification by RNA-Seq, two were overexpressed in the BRCA WT xenografts, and one was overexpressed in BRCA mutated xenografts. For these genes, therefore, there does not seem to be a uniform trend of gene expression among associated genes, meaning that the over- or under-expression of these genes is likely not functionally significant.

4. DISCUSSION

From the results of the differential gene expression analysis, it is apparent that gene expression profiles for xenografts from BRCA1/2 mutation carriers and non-carriers have differences of interest. In addition, the results in this study suggest that gene expression deregulation plays a greater role in sporadic cases of pancreatic cancer than in cases where BRCA mutations are present.

Four main variations were observed between the BRCA mutation carriers and non-carriers. First, several gastric and mucin-related genes were overexpressed in the BRCA wild-type samples compared to the BRCA mutation carriers, i.e. CTSE, TFF1, TFF2, and MUC6. Second, although expression of PRSS1 and related genes, including CTRC, was lowered in all the PDAC xenografts, expression of these genes was much lower in BRCA mutation carriers. Third, a slightly higher expression of histone cluster 1 genes was observed in sporadic PDAC cases.

Finally, genes over-expressed in BRCA mutation carriers, i.e. TPM2 and CTSC, seem to have a much lower association with PDAC development and progression than genes over-expressed in BRCA mutation non-carriers, with PLAU as the only significantly over-expressed gene found to have co-expressed genes that may play a role in PDAC development or maintenance.

4.1 CTSE, MUC6, TFF1 and mucin production in sporadic PDAC cases

Three of the genes that were most significantly upregulated in BRCA mutation non-carriers were cathepsin E, mucin 6 and trefoil factor 1 (Table 1). These three genes were also closely related to one another in terms of co-expression, localization, and function. They not only shared co-expressed genes (Table 2) but have also been demonstrated to have increased expression in both gastric carcinomas and invasive ductal carcinomas (IDC) of the pancreas [73]. In addition, all three genes and several of their co-expressed genes are associated with mucin production, suggesting increased expression of mucins plays a greater role in PDAC cases without BRCA mutations than in cases with BRCA mutations.

CTSE and TFF1 are potential biomarkers for PDAC in BRCA mutation non-carriers

Cathepsin E (CTSE) is an aspartic protease present in immune cells, and found with the highest concentration in the surface of epithelial mucus-producing cells of the stomach [64-66, 76, 77]. Its exact function has not been fully determined; it does not appear to be involved in digestion of dietary protein but seems to play a role in MHC class II antigen processing [65, 66, 78]. It is not usually expressed in the normal pancreas but is expressed in most pancreatic ductal adenocarcinomas [64, 76, 79-81]. It has also been studied as a possible diagnostic marker for PDAC [76], has been shown to be highly and specifically expressed in more than 90% of PDAC cases [77], and has increased expression with disease progression [79]. One study found that CTSE was expressed in the pancreatic juice of eight out of eleven patients with PDAC and five out of ten patients with mucin-producing tumors [64]. CTSE may be expressed in relatively early stages of carcinogenesis in pancreatic lesions and its expression has therefore been suggested to be associated with pathogenesis of PDAC from precancerous pancreatic lesions [64].

In this study, the average expression of CTSE in the BRCA wild-type set was more than 1200 FPKM greater than in BRCA mutation carriers, and over 1400 FPKM greater than in the normal ductal sample (Table 1). This suggests that CTSE overexpression might play a much more significant role in sporadic cases of PDAC than in cases involving BRCA mutations. CTSE may therefore be a better diagnostic marker for PDAC cases in patients without BRCA mutations than in patients with BRCA mutations.

Estrogen-regulated protein trefoil factor 1 (TFF1) is a member of the trefoil family of proteins which, like CTSE, are found in the gastrointestinal mucosa [65, 66]. TFFs protect the mucosal linings by protecting epithelial cells from apoptotic death and increasing their motility and have a similar role in cancer cells [13, 65, 66]. Deregulation of TFF expression has therefore been associated with a number of gastrointestinal cancers [82]. It has been shown that TFF1 expression is usually low in normal pancreas but is much higher in pancreatic cancers, including PDACs [67, 83, 84]. TFF1 overexpression has been identified to occur early in pancreatic cancer development [67] and reported to increase cancer cell invasion and growth of stromal cells [67, 84]. Moreover, one study suggests that TFF1 expression does not affect growth of the primary tumor, but has a significant impact on tumor metastasis [67]. TFF1 expression could therefore be an important prognostic biomarker for PDAC cases. In addition, a recent paper studying urine biomarkers for PDAC listed TFF1 as one of three major biomarkers that could possibly be used as a diagnostic marker for early stages of PDAC [13].

In this study, average TFF1 expression in the BRCA wild-type set was greater than average expression in the BRCA mutation carriers by more than 5500 FPKM, and was greater than expression in the normal duct sample by more than 6400 FPKM (Table 1). Since expression in the normal ductal sample was 18.66 FPKM (Table 1), this demonstrates a marked increase in TFF1 expression in both BRCA mutation carriers and non-carriers. However, as with CTSE, it is apparent that overexpression of TFF1 is much more extreme in PDAC cases without BRCA mutations than with BRCA mutations. These data suggest that TFF1 expression is likely a better biomarker in PDAC cases without BRCA mutations.

Mucin production deregulation plays a role in pancreatic ductal adenocarcinoma development

Mucins are glycoproteins synthesized by secretory epithelial cells and are usually involved in protection of epithelial linings from chemical and mechanical aggressions [85]. Secreted and transmembrane mucins have been hypothesized to be intimately involved in inflammation and development of different adenocarcinomas, including pancreatic, colon, and breast [86-88], and have been proposed as diagnostic markers and potential therapeutic targets [86, 88].

Five mucins that were over-expressed in BRCA mutation non-carriers in this study - MUC1, MUC5AC, MUC6, MUC13, and MUC17 (Table 3) - have been associated with invasive pancreatic ductal carcinomas [16, 89-91]. Four of these genes, MUC1, MUC5AC, MUC13, and MUC17, have been demonstrated to have increased expression in PDAC [16, 73]. MUC6, however, has been demonstrated to have increased expression in pancreatic intraepithelial neoplasia (PanIN) and decreased expression levels in PDAC [16].

MUC1 is associated with tumor invasion and metastasis and has been shown to be frequently overexpressed in pancreatic cancer [91]. Its increased expression seems to be associated with disease progression and has been linked with poor prognosis in stage IV invasive ductal carcinoma (IDC) of the pancreas [91]. The expression of MUC13 has been shown to be much higher in pancreatic cancer samples compared to normal/non-neoplastic pancreatic tissues, and has been suggested to augment tumourigenesis in the pancreas [89]. MUC17 overexpression has also been observed in PDAC cases [16]. MUC5AC is overexpressed in the ductal region of some cases of human pancreatic cancer, but is not expressed in the normal pancreas, and seems to be associated with immunosuppression and progression of pancreatic cancer [75]. In this study, the expression of all four of these genes was higher in the BRCA wild-type set than in the BRCA mutation carriers, and much higher in the BRCA wild-type set than in the normal ductal sample (Table 3). Increased expression of these genes, therefore, seems to be associated with all PDAC tumors, but seems to be much greater in PDAC cases with BRCA wild-type genes than with BRCA mutations.

The mucin 6 gene (MUC6) encodes gastric mucin, a secreted glycoprotein expressed in normal gastric mucosa [16, 85]. Over-expression of MUC6 has already been observed in gastric carcinomas and in PanIN, which is a histological precursor to about 85% of PDAC cases [16, 17, 73, 92]. In contrast to MUC1, MUC6 positive expression in pancreatic tumors has been associated with decreased tumor invasiveness and significantly better survival [90]. In this study, we found that the expression of MUC6 was much lower in the BRCA mutation carrier set than in the BRCA wild-type set and the normal ductal sample. MUC6 expression could therefore be a better prognostic marker for PDAC in cases with BRCA mutations than in cases without BRCA mutations.

Overall, we report that expression of almost all mucin genes was lower in BRCA mutation carriers than in non-carriers, with the exception of MUC3A, MUC4 and MUC16 (Table 3). This suggests a greater role for mucin overexpression in PDAC cases with BRCA wild-type genes than in cases with BRCA mutated genes, with the exception of MUC6. The results of this study indicate that mucin gene expression profiles could be promising diagnostic and prognostic biomarkers for pancreatic ductal adenocarcinomas.

4.2 PRSS1 and co-expressed genes have decreased expression in chronic pancreatitis and PDAC

Protease, Serine, 1 (PRSS1) is a gene that encodes cationic trypsinogen, an enzyme secreted by the pancreas and hypothesized to promote apoptotic cell death [23, 65, 66]. Germline PRSS1 mutations have already been associated with chronic and hereditary pancreatitis, as well as an increased risk of PDAC [21, 22, 65]. Chronic pancreatitis is indeed one of the most recognized risk factors for the development of PDAC [21]. Gene expression studies have also demonstrated the under-expression of PRSS1 and several co-expressed genes in invasive ductal carcinomas of the pancreas (IDCP) [73].

The level of PRSS1 gene expression in this study was significantly lower in the BRCA mutation carriers than in the non-carriers, suggesting a greater role for PRSS1 under-expression in PDAC cases with BRCA mutations than in cases without BRCA mutations (Table 5). The expression of PRSS1 in all the tumors, however, was much lower than in the normal ductal sample (Table 5). In addition, the expression of another gene, chymotrypsin C (CTRC), was far lower in all the tumor samples than in the normal ductal sample (Table 5). This is of interest as mutations in CTRC have also been shown to be associated with chronic pancreatitis [93]. It is possible, therefore, that lowered expression of PRSS1, CTRC, and associated genes has a similar pathological effect to mutations of PRSS1 or CTRC, resulting in chronic pancreatitis that is involved in the development of PDAC.

The expression of PRSS1 and co-expressed genes was very low in the BRCA mutation carriers, with several gene expression levels close to zero FPKM (Table 5), suggesting the presence of an expression suppression mechanism in BRCA mutation carriers. The lowered expression of these genes in all the tumor xenografts, however, suggests that suppressed expression of PRSS1 and associated genes may play a role in the specific contexts of both sporadic and familial cases of PDAC.

4.3 Less Gene Expression Deregulation in BRCA Mutation Carriers

Finally, it is apparent from the results of this study that there are lower levels of gene expression deregulation in BRCA mutation carriers than non-carriers. This suggests that, similar to breast and ovarian cancers, PDAC development in BRCA mutation carriers seems to involve genetic or transcriptional deregulation rather than gene expression deregulation. The consequences of this are unclear, yet it may lead to differences in disease severity, variability in the detection of the tumors, and changes in the amount of stromal involvement and in the development of severe cancer cachexia in pancreatic cancer patients.

5. CONCLUSION

The results of this study suggest that gene expression deregulation plays a more significant role in PDAC cases from patients without BRCA mutations. Specifically, xenografts from patients without BRCA mutations tend to have higher expression of mucin and mucin-related genes, which are implicated in PDAC. In addition, most PDAC cases have a lowered expression of trypsinogen and co-expressed genes, but BRCA mutation carriers tend to have a much lower expression of these genes than non-carriers. CTSE, TFF1, mucin-producing genes, PRSS1, and CTRC have all been shown to be promising possible diagnostic and prognostic biomarkers for PDAC.

Overall, it is apparent that PDAC xenografts from patients with and without BRCA mutations have functionally different gene expression profiles, and that differing strategies may have to be employed when using gene expression data from each set to make diagnostic or prognostic predictions.

BIBLIOGRAPHY

1. Bacalbasa, N., Gireada, A., Balescu, I., Tumor markers in pancreatic cancer – literature review. HVM Bioflux, 2015. 7(2): p. 75-8.
2. Fokas, E., O'Neill, E., Gordon-Weeks, A., Mukherjee, S., McKenna, WG., Muschel, RJ., Pancreatic ductal adenocarcinoma: From genetics to biology to radiobiology to oncoimmunology and all the way back to the clinic. Biochim Biophys Acta, 2015. 1855(1): p. 61-82.
3. Yang, D., Zhu, Z., Wang, W., Shen, P., Wei, Z., Wang, C., & Cai, Q., Expression profiles analysis of pancreatic cancer. European Review for Medical and Pharmacological Sciences, 2013. 17(3): p. 311-317.
4. Ishiwata, T., Pancreatic Ductal Adenocarcinoma: Basic and Clinical Challenges for Better Prognosis. Journal of Carcinogenesis and Mutagenesis, 2013.
5. Ryan, D.P., Hong, T.S., Bardeesy, N., Pancreatic Adenocarcinoma. New England Journal of Medicine, 2014. 371: p. 1039-49.
6. Hezel, A., Kimmelman, AC., Stanger, BZ., Bardeesy N., and DePinho RA., Genetics and biology of pancreatic ductal adenocarcinoma. Genes and Development, 2006.
7. Xie, D., Xieb, K., Pancreatic cancer stromal biology and therapy. Genes and Diseases, 2015. 2(2): p. 133-43.
8. Society, A.C. What is pancreatic cancer? 2014 9 Jan 2015 [cited 2015 July 20].
9. Yang, J., Zeng, Y., Identification of miRNA-mRNA crosstalk in pancreatic cancer by integrating transcriptome analysis. Eur Rev Med Pharmacol Sci, 2015. 19(5): p. 825-834.
10. Duffy MJ, S.C., Lamerz R, Haglund C, Holubec VL, Klapdor R, Nicolini A, Topolcan O, Heinemann V., Tumor markers in pancreatic cancer: a European Group on Tumor Markers (EGTM) status report. Ann Oncol., 2010. 21(3): p. 441-7.
11. Mönkemüller, K., Fry, LC., Malfertheiner, P., Pancreatic cancer is 'always non-resectable'. Dig Dis., 2007. 25(3): p. 285-8.
12. Rückert F, P.C., Grützmann R., Serum tumor markers in pancreatic cancer-recent discoveries. Cancers (Basel), 2010. 2(2): p. 1107-24.
13. Tomasz, R.P., Massat, N.J., Jones, R., Alrawashdeh, W., Dumartin, L., Ennis, D., Duffy, S.W., Kocher, H.M., Pereira, S.P., ..., & Crnogorac-Jurcevic, T., Identification of a Three-Biomarker Panel in Urine for Early Detection of Pancreatic Adenocarcinoma. Clin Cancer Res., 2015. 21: p. 3512.
14. Yachida S., J.S., Bozic I, Antal T, Leary R, Fu B, Kamiyama M, Hruban RH, Eshleman JR, Nowak MA, Velculescu VE, Kinzler KW, Vogelstein B, Iacobuzio-Donahue CA., Distant metastasis occurs late during the genetic evolution of pancreatic cancer. Nature, 2010. 467(7319): p. 1114-7.
15. Ronga, I., Fernando, G., Riccardib, F., Uomo, G, Anorexia–cachexia syndrome in pancreatic cancer: Recent advances and new pharmacological approach. Advances in Medical Sciences, 2014. 59(1): p. 1-6.
16. Jonckheere, N., Skrypek, N., Van Seuningen, I., Mucins and Pancreatic Cancer. Cancers, 2010. 2: p. 1794-812.
17. Hruban, R.H., Maitra, A., Goggins, M., Update on Pancreatic Intraepithelial Neoplasia. Int J Clin Exp Pathol., 2008. 1(4): p. 306-16.
18. Logsdon C.D., S.D.M., Binkley C, Arumugam T, Greenson J.K., Giordano T.J., Misek D.E., Kuick R, Hanash S., Molecular profiling of pancreatic adenocarcinoma and chronic pancreatitis identifies multiple genes differentially regulated in pancreatic cancer. Cancer Res, 2003. 63(10): p. 2649-57.
19. Fokas, E., O'Neill, E., Gordon-Weeks, A., Mukherjee, S., McKenna, W.G., Muschel, R.J., Pancreatic ductal adenocarcinoma: From genetics to biology to radiobiology to oncoimmunology and all the way back to the clinic. Science Direct, 2015. 1855(1): p. 61-82.
20. Beger C, R.M., Meyer S, Leder G, Krüger M, Welte K, Gansauge F, Beger HG, Down-regulation of BRCA1 in chronic pancreatitis and sporadic pancreatic adenocarcinoma. Clin Cancer Res., 2004. 10: p. 3780-7.
21. Teich, N., Rosendahl, J., Tóth, M., Mössner, J., and Sahin-Tóth, M., Mutations of Human Cationic Trypsinogen (PRSS1) and Chronic Pancreatitis. Human Mutat., 2006. 27(8): p. 721-30.
22. Liu J, Z.H., A comprehensive study indicates PRSS1 gene is significantly associated with pancreatitis. Int J Med Sci, 2013. 10(8): p. 981-7.
23. Athwal, T., Huang, W., Mukherjee, R., Latawiec, D., Chvanov, M., Clarke, R., Smith, K., Campbell, F., Merriman, C., Criddle, D., Sutton, R., Neoptolemos, J., Vlatković, N., Expression of human cationic trypsinogen (PRSS1) in murine acinar cells promotes pancreatitis and apoptotic cell death. Cell Death Dis, 2015. 10(5).
24. Lynch, H.T., Deters, C. a., Snyder, C. L., Lynch, J. F., Villeneuve, P., Silberstein, J., Brand, R. E., BRCA1 and pancreatic cancer: Pedigree findings and their causal relationships. Cancer Genetics and Cytogenetics, 2005. 158(2): p. 119-125.
25. Holter S, B.A., Dodd A, Grant R, Semotiuk K, Hedley D, Dhani N, Narod S, Akbari M, Moore M, Gallinger S., Germline BRCA Mutations in a Large Clinic-Based Cohort of Patients With Pancreatic Adenocarcinoma. J Clin Oncol., 2015. 33(28): p. 3124-9.
26. Al-Sukhni, W., Rothenmund, H., Eppel Borgida, A., Zogopoulos, G., O’Shea, A. M., Pollett, A., & Gallinger, S., Germline BRCA1 mutations predispose to pancreatic adenocarcinoma. Human Genetics, 2008. 124(3): p. 271-278.
27. Murphy KM, B.K., Griffin C, Sollenberger JE, Petersen GM, Bansal R, Hruban RH, Kern SE., Evaluation of candidate genes MAP2K4, MADH4, ACVR1B, and BRCA2 in familial pancreatic cancer: deleterious BRCA2 mutations in 17%. Cancer Res, 2002. 62(13): p. 3789-93.
28.
Lucas AL, S.R., Lipsyc MD, Mitchel EB, Kumar S, Hwang C, Deng L, Devoe C, Chabot JA, Szabolcs M, Ludwig T, Chung WK, Frucht H., High prevalence of BRCA1 and BRCA2 germline mutations with loss of heterozygosity in a series of resected pancreatic adenocarcinoma and other neoplastic lesions. Clin Cancer Res., 2013. 19(13): p. 3967-403. 29. Health, N.I.o. BRCA2. 2015 [cited 2015. 30. Greer, J.B., & Whitcomb, D. C., Role of BRCA1 and BRCA2 mutations in pancreatic cancer. Gut, 2007. 56(5): p. 601-605. 31. Thompson, D., & Easton, D. F. , Cancer Incidence in BRCA1 mutation carriers. Journal of the National Cancer Institute, 2002. 94(18): p. 1358-1365. 32. Wang Y., L.Y., Analysis of molecular pathways in pancreatic ductal adenocarcinomas with a bioinformatics approach. Asian Pac J Cancer Prev, 2015. 16(6): p. 2561-7. 33. Ye S, Y.L., Zhao X, Song W, Wang W, Zheng S., Bioinformatics method to predict two regulation mechanism: TF-miRNA-mRNA and lncRNA-miRNA-mRNA in pancreatic cancer. Cell Biochem Biophys, 2014. 70(3): p. 1849-58. 34. Jones S, Z.X., Parsons DW, Lin JC, Leary RJ, Angenendt P, Mankoo P, Carter H, Kamiyama H, Jimeno A, Hong SM, Fu B, Lin MT, Calhoun ES, Kamiyama M, Walter K, Nikolskaya T, Nikolsky Y, Hartigan J, Smith DR, Hidalgo M, Leach SD, Klein AP, et al., Core signaling pathways in human pancreatic cancers revealed by global genomic analyses. Science, 2008. 321(5897): p. 1801-6. 35. Gutiérrez ML, C.L., Teodosio C, Sarasquete ME, del Mar Abad M, Iglesias M, Esteban C, Sayagues JM, Orfao A, Muñoz-Bellvis L., Identification and characterization of the gene expression profiles for protein coding and non-coding RNAs of pancreatic ductal adenocarcinomas. Oncotarget, 2015. 6(22): p. 19070-86. 36. Namkung, J., Kwon, W., Choi, Y., Yi, S.G., Han, S., Kang, M.J., Kim, S.W., Park, T., Jang, J.Y., Molecular subtypes of pancreatic cancer based on miRNA expression profiles have independent prognostic value. J Gastroenterol Hepatol, 2015: p. Epub.

63 37. Moffit, R., Marayati, R., Flate, EL., Volmar, KE., Loeza, SGH., et al., Virtual microdissection identifies distinct tumor- and stroma-specific subtypes of pancreatic ductal adenocarcinoma. Nature Genetics, 2015. 47: p. 1168-78. 38. Ayars, M., Goggins, M., Pancreatic cancer: Classifying pancreatic cancer using gene expression profiling. Nat Rev Gastroentero. & Hepato., 2015. 12: p. 613-4. 39. Buchholz, M., Honstein, T., Kirchhoff, S., Kreider, R., Schmidt, H., Sipos, B., et al. , A Multistep High-Content Screening Approach to Identify Novel Functionally Relevant Target Genes in Pancreatic Cancer. PLoS ONE, 2015. 10(4). 40. Franka, T., Suna, X., Zhang, Y., Yanga, J., Fishere, WE., Gingrase, MC., Lia, M., Genomic profiling guides the choice of molecular targeted therapy of pancreatic cancer. Cancer Lett., 2015. 363(1): p. 1-6. 41. Lyakhovich A, S.J., Disruption of the Fanconi anemia/BRCA pathway in sporadic cancer. Cancer Lett., 2006. 232(1): p. 99-106. 42. Konstantinopoulos PA, S.D., Karlan BY, Taniguchi T, Fountzilas E, Francoeur N, Levine DA, Cannistra SA., Gene expression profile of BRCAness that correlates with responsiveness to chemotherapy and with outcome in patients with epithelial ovarian cancer. J Clin Oncol., 2010. 28(22): p. 3555-61. 43. Finotello F, D.C.B., Measuring differential gene expression with RNA-seq: challenges and strategies for data analysis. Brief Funct Genomics, 2015. 14(2): p. 130-42. 44. Trapnell, C., Roberts, A., Goff, L., Pertea, G., Kim, D., Kelley, D. R., … Pachter, L. , Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nature Protocols, 2012. 7(3): p. 562–78. 45. Marioni JC, M.C., Mane SM, Stephens M, Gilad Y., An assessment of technical reproducibility and comparison with gene expression arrays. Genome Res, 2008. 18(9): p. 1509-17. 46. Hutchins AP, P.S., Fujii H, Miranda-Saavedra D., Discovery and Characterization of New Transcripts from RNA-seq Data in Mouse CD4+ T Cells. 
Genomics, 2012. 100(5): p. 303-13. 47. Lee J, J.Y., Liang S, Cai G, Müller P., On differential gene expression using RNA-Seq data. Cancer Inform, 2011. 10: p. 2015-15. 48. Kim D, P.G., Trapnell C, Pimentel H, Kelley R, Salzberg SL., TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol, 2013. 14(4): p. R36. 49. Hezel, A.F., Kimmelman, A.C., Stanger, B.Z., Bardeesy, N., Depinho, R.A., Genetics and biology of pancreatic ductal adenocarcinoma. Genes Dev, 2006. 20(10): p. 1218-49. 50. Lopez-Casas, P.P., Lopez-Fernandez, L.A., Gene-expression profiling in pancreatic cancer. Expert Review of Molecular Diagnostics, 2010. 10(5): p. 591-601. 51. Institute, N.C. BRCA1 and BRCA2: Cancer Risk and Genetic Testing. Genetics of Cancer April 1 2015 [cited 2015 July 24]. 52. Jasin, M., Homologous repair of DNA damage and tumorigenesis:the BRCA connection. Oncogene, 2002. 21(58): p. 8981-93. 53. Li X, H.W.D., Homologous recombination in DNA repair and DNA damage tolerance. Cell Res, 2008. 18(1): p. 99-113. 54. Zheng, L., Li, S., Boyer, T., Lee, W.H., Lessons learned from BRCA1 and BRCA2. Oncogene, 2000. 19(53): p. 6159-75. 55. Chial, H., Tumor Suppressor (TS) Genes and the Two-Hit Hypothesis. Scitable by Nature Education, 2008. 1(1): p. 177. 56. Honrado, E., Osorio, a, Palacios, J., & Benitez, J. , Pathology and gene expression of hereditary breast tumors associated with BRCA1, BRCA2 and CHEK2 gene mutations. Oncogene, 2006. 25 (43): p. 5837–5845. 57. Matros, E., BRCA1 promoter methylation in sporadic breast tumors: relationship to gene expression profiles. Breast Cancer Res. Treat, 2005. 91: p. 179–186.

64 58. Trapnell, C., Williams, B. a, Pertea, G., Mortazavi, A., Kwan, G., van Baren, M. J., ... Pachter, L. , Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. . Nature Biotechnology, 2010. 28(5)(5): p. 511–515. 59. Flicek, P., Ridwan Amode, M., et al. , Ensembl 2014, in Nucleic Acids Research Ensembl, Editor. 2014. 60. Marot, G., Castel, D., Estelle, J., Guernec, G., Jagla, B., Servant, N., … Schae, B. , A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Briefings in Bioinformatics, 2012. 14 (6 ): p. 671-683. 61. Korf, I., Yandell, M., & Bedell, J. , BLAST: An Essential Guide to the Basic Local Alignment Search Tool O'Reilly, Editor. 2003: Sebastopol, CA. 62. Madden, T., The BLAST sequence analysis tool. In NCBI Handbook, ed. J.M.a.J. Ostell. 2005, Bethesda, MD: National Library of Medicine. 63. Van Dam S, C.T., de Magalhães JP, GeneFriends: a human RNA-seq-based gene and transcript co-expression database. Nucleic Acids Res, 2015. 43. 64. Azuma T, H.M., Ito S, Yamamoto K, Taggart RT, Matsuba T, Yasukawa K, Uno K, Hayakumo T, Nakajima M., Expression of cathepsin E in pancreas: a possible tumor marker for pancreas, a preliminary report. Int J Cancer, 1996. 64(4): p. 492-497. 65. Rebhan, M., Chalifa-Caspi, V., Prilusky, J., and Lancet, D. , GeneCards: A novel functional genomics compendium with automated data mining and query reformulation support. Bioinformatics, 1998. 14: p. 656-664. 66. Lancet D, S.M., Olender T, Dalah I, Iny-Stein T, Inger A, Harel A and Stelzer G. , GeneCards tools for combinatorial annotation and dissemination of human genome information in GIACS Conference on Data in Complex Systems. 2008. 67. 
Arumugam T., H., R., Moore, T., Wang H., Brandt, W., Westley., B, and Logsdon, C., TFF1 is expressed early in pancreatic cancer and increases cancer cell invasion and growth and migration of stellate cells within the tumor microenvironment. Cancer Res, 2008. 75(13). 68. Yoshiura K, N.T., Nishishita T, Sato K, Yamamoto A, Shimada S, Saida T, Kawakami Y, Takahashi TA, Fukuda H, Imajoh-Ohmi S, Oyaizu N, Yamashita N., Carbonic anhydrase II is a tumor vessel endothelium-associated antigen targeted by dendritic cell therapy. Clin Cancer Res., 2005. 11(22): p. 8201-7. 69. Wang GH1, Y.L., Xu HW, Tang WT, Fu JH, Hu XF, Cui L, Xu XM., Identification of MXRA5 as a novel biomarker in colorectal cancer. Oncol Lett, 2013. 5(2): p. 544-8. 70. Sirchia, S.M., Tabano, S. Monti, L., Recalcati, M.P., Gariboldi, M., Grati, F.R., Porta, G., Finelli, P., Radice, P., and Miozzo M., Misbehaviour of XIST RNA in Breast Cancer Cells. PLoS ONE, 2009. 4(5). 71. Leir S.H., H.A., MUC6 mucin expression inhibits tumor cell invasion. Exp Cell Res, 2011. 317(17): p. 2408-19. 72. Ruffell, B., Role of Cathepsin C During Breast Cancer Metastasis. 2011, U.S. Army Medical Research and Material Command: University of California, San Francisco. 73. Thomas, J.K.e.a., Pancreatic Cancer Database: An integrative resource for pancreatic cancer. Cancer Biology and Therapy, 2014. 15(8). 74. Dhar DK, W.T., Maruyama R, Udagawa J, Kubota H, Fuji T, Tachibana M, Ono T, Otani H, Nagasue N., Expression of cytoplasmic TFF2 is a marker of tumor metastasis and negative prognostic factor in gastric cancer. Lab Invest, 2003. 83(9): p. 1343-52. 75. Hoshi H, S.T., Uchida M, Saito H, Iijima H, Toda-Agetsuma M, Wada T, Yamazoe S, Tanaka H, Kimura K, Kakehashi A, Wei M, Hirakawa K, Wanibuchi H., Tumor-associated MUC5AC stimulates in vivo tumorigenicity of human pancreatic cancer. Int J Oncol, 2011. 38(3): p. 619- 27. 76. Zaidi, N., Hermann, C., Herrmann, T., Kalbacher, H., Emerging functional roles of cathepsin E. 
Biochemical and Biophysical Research Communications, 2008. 377(2): p. 327-30.

65 77. Azuma, T., Yamada, M., Murakita, H., Nishikawa, Y., Kohli, Y., Yamamoto, K., and and H. Hori, Cathepsin E Expressed in Pancreatic Cancer. Advances in Experimental Medicine and Biology, 1995. 362: p. 363-6. 78. Chain, B.M., Free, P., Medd, P., Swetman, C., Tabor, A.B., Terrazzini, N., The expression and function of cathepsin E in dendritic cells. J of Immunol, 2005. 174(4): p. 1791-800. 79. Li H, L.Y., Cui L, Wang B, Cui W, Li M, Cheng Y, Monitoring pancreatic carcinogenesis by the molecular imaging of cathepsin E in vivo using confocal laser endomicroscopy. PLoS ONE, 2014. 9(9). 80. Keliher EJ, R.T., Earley S, Klubnick J, Tassa C, Lee AJ, Ramaswamy S, Bardeesy N, Hanahan D, Depinho RA, Castro CM, Weissleder R, Targeting cathepsin E in pancreatic cancer by a small molecule allows in vivo detection. Neoplasia, 2013. 2013(15): p. 7. 81. Azuma T, Y.M., Hajime M, Yasuyuki N, Kohli Y, Yamamoto K and Hori H, Cathepsin E Expressed in Pancreatic Cancer. Advances in Experimental Medicine and Biology, 1995. 362: p. 363-6. 82. Kastin, A., TFF (Trefoil Factor Family) Peptides, in Handbook of Biologically Active Peptides, A.J. Kastin, Editor. 2006, Elsevier: Burlington, MA. p. 1147-53. 83. Thiruvengadam, A., Brandt, W., Ramachandran, V., Moore, T.T., Wang, H., May, F.E., Westley, B.R., Hwang, R.F., Logsdon, C.D., Trefoil Factor 1 Stimulates Both Pancreatic Cancer and Stellate Cells and Increases Metastasis. Pancreas., 2011. 40(6): p. 815-22. 84. Ebert, M.P., Hoffmann, J., Haeckel, C., Rutkowski, K., Schmid, R.M., Wagner, M., Adler, G., Schulz, H.U., Roessner, A., Hoffmann, W., Malfertheiner, P., Induction of TFF1 gene expression in pancreas overexpressing transforming growth factor alpha. Gut, 1999. 45(1): p. 105-11. 85. 
Reis, C.A., David, L., Carvalho, F., Mandel, U., de Bolos, C., Mirgorodskaya, E., Clausen, H., Sobrinho-Simoes, M., Immunohistochemical Study of the Expression of MUC6 Mucin and Co- expression of Other Secreted Mucins (MUC5AC and MUC2) in Human Gastric Carcinomas. J Histochem Cytochem, 2000. 48(3): p. 377-88. 86. Byrd JC, B.R., Mucins and mucin binding proteins in colorectal cancer. Cancer Metastasis Rev, 2004. 23(1-2): p. 77-99. 87. Andrianifahanana M, M.N., Schmied BM, Ringel J, Friess H, Hollingsworth MA, Büchler MW, Aubert JP, Batra SK., Mucin (MUC) gene expression in human pancreatic adenocarcinoma and chronic pancreatitis: a potential role of MUC4 as a tumor marker of diagnostic significance. Clin Cancer Res., 2001. 7(12): p. 4033-40. 88. DW, K., Mucins in cancer: function, prognosis and therapy. Nat Rev Cancer, 2009. 9(12): p. 874-85. 89. Chauhan, S.C., Ebeling, M.C., Maher, D.M., Koch, M.D., Watanabe, A., Aburatani, H., Lio, Y., Jaggi, M., MUC13 mucin augments pancreatic tumorigenesis. Mol Cancer Ther., 2012. 11(1): p. 24-33. 90. Jinfeng, M., Kimura, W., Hirai, I., Sakurai, F., Moriya, T., Mizutani, M., Expression of MUC5AC and MUC6 in invasive ductal carcinoma of the pancreas and relationship with prognosis. Int J Gastrointest Cancer, 2003. 34(1): p. 9-18. 91. Hinoda, Y., Ikematsu, Y., Horinochi, M., Sato, S., Yamamoto, K., Nakano, T., Fukui, M., Suehiro, Y., Hamanaka, Y., Nishikawa, Y., Kida, H., Waki, S., Oka, M., Imai, K., Yonezawa, S., Increased expression of MUC1 in advanced pancreatic cancer. J Gastroenterol, 2003. 38(12): p. 1162-6. 92. Prasad, N.B., Biankin, A.V., Fukushima, N., Maitra, A., Dhara, S., Elkahloun, A.G., Hruban, R.H., Goggins, M., Leach, S.D., Gene expression profiles in pancreatic intraepithelial neoplasia reflect the effects of Hedgehog signaling on pancreatic ductal epithelial cells. Cancer Res, 2005. 65(5): p. 1619-26. 93. Zhou, J., and Sahin-Toth, M., Chymotrypsin C (CTRC) mutations in chronic pancreatitis. J Gastroenterol, 2012. 
26(8): p. 1238-46.

SUPPLEMENTARY TABLES

Table 1

Sample     RIN   BRCA mutation status   Pairs sequenced (mil)   Pairs aligned (mil)
PDAC X1    8.9   -                      67.2                    52.7
PDAC X2    9.5   BRCA2 mut              56.7                    42.4
PDAC X3    9     -                      64.5                    48.9
PDAC X4    9.5   -                      63.4                    36.1
PDAC X5    9.7   -                      76.4                    63.3
PDAC X6    9.9   -                      63.2                    41.9
PDAC X7    8.4   -                      53.8                    41
PDAC X8    8.7   -                      50.7                    42.3
PDAC X9    8.3   -                      6.72                    43.3
PDAC X10   7.6   BRCA2 mut              56.4                    56.4
PDAC X11   8.7   BRCA1 mut              56.4                    42.2
PDAC X12   9.1   BRCA2 mut              64                      43.1
DUCT X13   7.3                          63.9                    58.5

Table 2:

Gene         PDAC X1   PDAC X2   PDAC X3   PDAC X4   PDAC X5   PDAC X6   PDAC X7   PDAC X8   PDAC X9   PDAC X10  PDAC X11  PDAC X12  Duct
CTSE         336.90    77.10     346.90    1827.38   1038.39   2566.40   1289.25   1753.88   1111.19   467.37    13.16     432.02    0.94
PIGR         454.94    88.16     31.22     33.89     310.23    72.35     356.19    196.85    563.93    1.53      6.91      24.70     11.34
PLAU         3.50      319.12    33.04     15.01     9.37      8.90      6.90      23.70     7.20      7.10      132.96    3.68      0.89
ADRA2A       68.33     0.39      9.44      6.68      14.91     0.87      31.95     35.71     25.77     2.18      0.01      2.53      17.24
DMBT1        0.34      0.22      0.05      1.59      308.43    16.26     25.94     71.22     5.96      0.18      1.52      0.08      0.41
RBP4         0.10      3.87      74.63     45.31     215.50    613.76    41.59     174.87    15.55     10.40     2.42      3.37      4.21
SPON1        1.43      0.52      0.34      0.45      15.83     6.56      28.80     4.02      4.47      0.18      0.48      0.35      0.93
CREB3L1      75.42     4.05      36.45     385.86    116.02    90.10     190.24    170.75    243.78    80.07     7.59      32.73     17.25
MUC6         0.16      0.96      0.40      222.23    2.56      9.28      1.53      0.30      34.99     0.09      0.20      0.52      39.41
TRPM5        6.07      0.03      0.04      0.04      0.28      2.03      0.26      0.74      2.33      0.03      0.01      0.04      0.99
CTSC         54.82     616.21    123.28    90.32     52.96     77.10     96.04     101.34    128.28    130.83    269.54    922.69    25.23
PIWIL1       0.03      0.01      1.45      0.59      0.02      7.44      2.81      13.55     1.57      0.16      0.03      0.01      0.01
SERPINF1     0.46      2.49      0.63      0.48      0.20      0.21      0.88      1.18      1.59      0.99      93.36     0.18      44.48
MIR3615      151.02    117.93    86.48     69943.10  303.84    91.73     160.90    107.43    209.18    143.62    64.31     187.00    42.12
CREB3L3      0.66      0.13      0.38      1.69      16.39     7.45      6.53      12.17     4.35      0.49      0.02      0.73      0.00
TFF1         553.32    332.78    1642.94   10540.70  3938.96   7019.57   6940.61   3359.75   11117.00  892.09    35.02     2110.13   18.66
CADPS        2.58      0.00      0.05      1.82      5.48      1.83      2.76      2.49      13.58     0.03      0.02      0.00      7.23
HIST1H1C     71.69     11.56     66.91     606.42    20.27     75.39     81.89     47.29     118.05    5.29      35.58     15.78     19.73
HIST1H4H     13.81     0.64      8.21      214.12    11.25     21.87     20.46     4.32      14.91     0.46      1.43      2.01      0.45
HIST1H2BJ    238.53    20.22     246.07    2130.71   157.75    136.68    262.48    230.96    413.55    18.98     61.13     73.20     46.20
CPA1         0.06      0.19      0.02      0.03      0.04      0.04      0.11      0.64      129.81    0.09      0.08      0.04      0.00
PRSS1        0.77      10.51     41.20     1273.00   4.79      7.92      9.52      425.02    693.15    11.70     0.17      7.11      NOTEST
CA2          211.33    2.78      267.11    287.04    73.39     335.36    612.92    115.47    367.80    54.66     11.52     70.93     33.76
TPM2         31.18     527.34    65.03     2.82      13.65     55.83     38.30     34.52     39.04     125.06    165.64    31.92     30.36
MXRA5        5.66      1.56      28.66     64.48     12.27     24.35     4.27      6.03      3.15      2.43      0.44      0.04      2.44
XIST         0.00      0.05      3.36      0.06      0.00      0.02      72.91     0.10      0.00      0.04      0.01      0.00      0.01
XLOC_005829  0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      2.78      0.00      100.67    0.00
XLOC_009799  7.14      0.00      1.38      45.07     2.42      5.45      2.35      0.71      12.06     0.00      0.00      0.00      0.00
XLOC_028976  0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      1.18      0.00      0.00
XLOC_041859  0.00      0.27      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.96      0.01      0.00
XLOC_041871  0.00      0.85      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      1.42      0.00      0.00
XLOC_041874  0.00      0.59      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      1.41      0.02      0.00
XLOC_041875  0.00      0.64      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      2.66      0.00      0.03
XLOC_041881  0.00      0.22      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.93      0.00      0.00
XLOC_041888  0.00      0.28      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      1.82      0.01      0.00
XLOC_041897  0.00      2.65      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      8.77      0.02      0.00
XLOC_042346  0         1.09E+08  0         0         0         0         0         0         0         3.53E+07  0         4.80E+08  0
XLOC_050808  0.29      0.01      0.42      0.05      0.14      0.04      0.29      65746.40  0.15      0.15      0.48      0.26      0.02
XLOC_052150  0.00      0.95      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      1.46      0.05      0.01
XLOC_057303  0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      3.25      0.01      0.00
XLOC_072454  0.00      0.00      0.00      0.00      0.00      2.60      0.00      0.00      0.00      0.00      0.00      0.00      0.00
XLOC_072846  0.00      0.16      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      1.59      0.00      0.00
XLOC_072852  0.00      0.14      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      1.90      0.00      0.00
XLOC_072880  0.00      0.36      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      1.87      0.00      0.00
XLOC_072881  0.00      0.31      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      1.31      0.00      0.00
XLOC_084554  0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      1.01      0.03      0.00      0.00

ADDENDUM A:

Paper: TC-PTP regulates the IL-7 transcriptional response during early T cell development

This manuscript, TC-PTP regulates the IL-7 transcriptional response during early T cell development, is currently being drafted for publication. The paper discusses the phenotypic and gene expression differences observed in T-cells from TC-PTP knockout mice compared to T-cells from TC-PTP wild-type mice. As the manuscript is not yet complete, it could not be included in this thesis submission. The description below summarizes the bioinformatics background, methodology, and results produced by the author of this thesis document (N.H. Desai).

TC-PTP regulates the IL-7 transcriptional response during early T cell development

K.A. Pike1, S. Bussières-Marmen1,2, T. Hatzihristidis1, F. Robert1, N.H. Desai1,2, D. Miranda-Saavedra3, J. Pelletier1,2 and M.L. Tremblay1,2

1 Rosalind and Morris Goodman Cancer Research Center, McGill University, 1160 Pine Avenue, Montreal, Quebec H3A 1A3, Canada

2 Department of Biochemistry, McGill University, Montreal, Quebec, Canada

3 Centro de Biologia Molecular Severo Ochoa, CSIC/Universidad Autónoma de Madrid, 28049 Madrid, Spain

1. Background

The purpose of this paper was to determine the function of T-cell protein tyrosine phosphatase (TC-PTP) in mouse CD4+ T-cells. To this end, a T-cell-specific knockout of TC-PTP was generated in three mice, and gene expression in these cells was compared to that of T-cells from three TC-PTP wild-type mice.

2. Methodology

2.1 RNA Extraction and Sequencing:

To perform gene expression analysis, RNA was extracted from the CD4+ T-cells of TC-PTP-/- and TC-PTP+/+ mice and sequenced on the Illumina HiSeq 2000 sequencer. Bioinformatics analysis was then performed on each of the resulting RNA sequencing (RNA-seq) datasets to analyze differential gene expression between TC-PTP wild-type (WT) and knockout (KO) T-cells.

A standard TopHat-Cufflinks pipeline was used to determine which genes were significantly differentially expressed in the RNA-seq datasets. The RNA reads were mapped to the UCSC mouse genome (GRCm38/mm10) [1] using TopHat (version 2.0.12) with default parameters and no coverage search [2]. Once alignments were completed, Cufflinks (version 2.1.1) was used with default parameters to reconstruct transcripts and quantify expression [2,3]. The Cuffdiff tool was then used to determine which genes were significantly differentially expressed between the knockout and wild-type samples. The list was then filtered for genes with a p-value of 5E-05 or less and an absolute log2 fold change of at least 2.
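The final filtering step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script used for the analysis: it assumes the input follows the tab-delimited layout of Cuffdiff's gene_exp.diff output (columns such as status, log2(fold_change) and p_value), the function name filter_cuffdiff_rows is hypothetical, and the gene names and values in the example are made up. Restricting the list to genes with status "OK" is likewise an assumption rather than something stated in the text.

```python
import csv
import io
import math


def filter_cuffdiff_rows(rows, max_p=5e-05, min_abs_log2fc=2.0):
    """Keep rows with p-value <= max_p and |log2 fold change| >= min_abs_log2fc."""
    kept = []
    for row in rows:
        # Assumption: only genes that were successfully tested are considered.
        if row["status"] != "OK":
            continue
        p_value = float(row["p_value"])
        log2fc = float(row["log2(fold_change)"])
        # Skip undefined fold changes; infinite values (gene absent from one
        # condition) still pass the absolute-value threshold.
        if math.isnan(log2fc):
            continue
        if p_value <= max_p and abs(log2fc) >= min_abs_log2fc:
            kept.append(row)
    return kept


# Hypothetical excerpt in the layout of a gene_exp.diff file
# (only the columns used here; names and numbers are invented).
EXAMPLE = (
    "gene\tstatus\tlog2(fold_change)\tp_value\n"
    "GeneA\tOK\t2.5\t0.00004\n"
    "GeneB\tOK\t-3.1\t0.00001\n"
    "GeneC\tOK\t1.2\t0.00001\n"
    "GeneD\tOK\t4.0\t0.01\n"
    "GeneE\tNOTEST\t0\t1\n"
)

rows = list(csv.DictReader(io.StringIO(EXAMPLE), delimiter="\t"))
significant = [row["gene"] for row in filter_cuffdiff_rows(rows)]
print(significant)
```

In this toy input only GeneA and GeneB pass both thresholds: GeneC fails the fold-change cutoff, GeneD fails the p-value cutoff, and GeneE was not tested.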

3. Results

The genes with a p-value of 5E-05 or less and an absolute log2 fold change of at least 2 are listed in Table 1. Figure 1 contains a scatter plot of all the genes that were analyzed.

Table 1: List of genes that were significantly differentially expressed in TC-PTP-/- relative to TC-PTP+/+ mouse T-cells

Figure 1: Scatter plot of log2 expression of genes in TC-PTP-/- versus TC-PTP+/+ T-cells

Bibliography

1. Flicek, P., Ridwan Amode, M., et al., Ensembl 2014. Nucleic Acids Research, 2014.
2. Trapnell, C., Williams, B.A., Pertea, G., Mortazavi, A., Kwan, G., van Baren, M.J., ... Pachter, L., Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology, 2010. 28(5): p. 511-515.
3. Trapnell, C., Roberts, A., Goff, L., Pertea, G., Kim, D., Kelley, D.R., ... Pachter, L., Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nature Protocols, 2012. 7(3): p. 562-78.

ADDENDUM B:

Paper: Modulation of PTP1b and TC-PTP expression enhances dendritic cell activation and maturation

This publication, Modulation of PTP1b and TC-PTP expression enhances dendritic cell activation and maturation, is currently also being drafted for publication. As the paper is not completed, the completed manuscript could not be included in this thesis submission. The paper focuses on how expression of PTP1b and TC-PTP in dendritic cells can affect the activation of dendritic cells for immunotherapy applications. The description below summarizes the bioinformatics analysis that was done by the author of this thesis document (N.H. Desai).

Modulation of PTP1b and TC-PTP expression enhances dendritic cell activation and maturation

Matthew Feldhammer1,2, Claudia Penafuerte1,2, Nikita Desai1,2, George Zogopoulos1,3, and Michel L. Tremblay1,2

1 Rosalind and Morris Goodman Cancer Research Center, McGill University, 1160 Pine Avenue, Montreal, Quebec H3A 1A3, Canada

2 Department of Biochemistry, McGill University, Montreal, Quebec, Canada

3 The Research Institute of McGill University Health Center, McGill University, 1001 Decarie Boulevard, Montreal, Quebec, H4A 3J1, Canada

1. Background

In recent years, dendritic cell (DC)-based immunotherapy has become an increasingly promising treatment option for cancer [1-3]. Immunotherapy techniques can be used in conjunction with existing treatment strategies to improve cancer outcomes [1, 4]. Avoidance of immune recognition and the subsequent reduction in immunogenicity (the ability to provoke an immune response) is one of the hallmarks of most cancers [1, 4, 5]. The aim of cancer vaccines, therefore, is to reverse tolerance to tumour cells after cancer is established [2].

It has already been demonstrated that the efficacy of DC-based immunotherapy techniques is compromised in immunosuppressive tumour microenvironments [6], largely due to a lack of activation or maturation of infiltrating DCs. The purpose of this paper, therefore, was to study the activation and maturation of mouse DCs by modulating the expression of two closely related phosphatases, T-cell protein tyrosine phosphatase (TC-PTP) and protein tyrosine phosphatase 1b (PTP1B). Gene expression in DCs treated with a Merck inhibitor shown to affect both TC-PTP and PTP1B activity [7] was compared to that of an untreated sample to determine how inhibitor treatment affected gene expression in DCs.

2. Methodology

2.1 RNA extraction and microarray quantification:

Monocyte DCs (moDCs) were isolated from two wild-type mice and treated with a TC-PTP/PTP1B inhibitor [7]. A second set of moDCs was isolated from two wild-type mice and cultured as the untreated comparison. Following 8 days of culture, RNA was extracted from the cells for microarray analysis.

The Illumina Mouse WG-6 Expression BeadChip microarray was used for gene quantification at the Genome Quebec Innovation Center. Once the microarray data were collected, bioinformatics analysis was performed to determine which genes were over- or under-expressed in the inhibitor-treated DCs compared to the untreated DCs.

To extract differentially expressed genes as well as expression data for genes of interest, the FlexArray software (version 1.6.1) was used [8]. The Illumina sample probe data for the microarray samples were imported into the software along with the control probe report. Once imported, the lumi algorithm was used to perform a robust spline normalization, using negative controls for background correction [8]. A two-sample Bayesian t-test was used to determine which of the genes were significantly differentially expressed [8]. Finally, genes were considered significantly differentially expressed if they passed the following filters: a p-value of less than 0.05, an absolute T-statistic of at least 2.5, and an absolute log2 fold change of at least 1.5.
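As an illustration of how the three filters combine, the sketch below applies them to a handful of invented per-gene results and separates the surviving genes by direction of change. It is a minimal example under stated assumptions: the GeneResult record, the function names, and all values are hypothetical, and the statistics are taken as given rather than computed, since the normalization and Bayesian t-test were performed inside FlexArray.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneResult:
    """Hypothetical per-gene summary exported from the statistical test."""
    symbol: str
    log2_fold_change: float
    t_statistic: float
    p_value: float


def is_significant(result, max_p=0.05, min_abs_t=2.5, min_abs_log2fc=1.5):
    """Apply the three thresholds described in the text; all must hold."""
    return (
        result.p_value < max_p
        and abs(result.t_statistic) >= min_abs_t
        and abs(result.log2_fold_change) >= min_abs_log2fc
    )


def split_by_direction(results):
    """Return (over-expressed, under-expressed) gene symbols among significant hits."""
    hits = [r for r in results if is_significant(r)]
    up = [r.symbol for r in hits if r.log2_fold_change > 0]
    down = [r.symbol for r in hits if r.log2_fold_change < 0]
    return up, down


# Invented example values, for illustration only.
results = [
    GeneResult("GeneA", 2.1, 4.0, 0.001),   # passes, over-expressed
    GeneResult("GeneB", -1.8, -3.2, 0.01),  # passes, under-expressed
    GeneResult("GeneC", 1.7, 2.0, 0.04),    # T-statistic too small
    GeneResult("GeneD", 0.9, 5.0, 0.001),   # fold change too small
    GeneResult("GeneE", 2.4, 3.0, 0.2),     # p-value too large
]
up, down = split_by_direction(results)
print(up, down)
```

Separating the hits into over- and under-expressed sets mirrors how the filtered list is later examined for enriched gene ontology terms.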

Once a list of the most highly differentially expressed genes was extracted, the PantherDB tool [9] was used to extract the top gene ontology terms to determine what cellular processes were more represented in the over- or under-expressed genes.

Finally, the expression for certain DC maturation markers (IL12, TNFα, PI3K, MyD88, CIITA, IFNγR2, and IL10), as well as p-values for significance of differential expression, were extracted from the FlexArray results.

3. Results

Once gene expression was filtered to ensure significance of up- or down- regulation, a list of 164 genes was extracted. A heat map of genes was created in the FlexArray software (Figure 1a). In addition, a scatter plot for the genes was also generated (Figure 1b). It is evident from the list of the top 11 enriched gene ontology terms (Figure 1c) that several processes related to the positive regulation of immune responses were enriched in the gene list.

Figure 1: a) Heat map of top 164 up- and down- regulated genes in inhibitor treated DCs compared to untreated. b) Scatter plot of log2 expression of genes with significant genes in black c) gene ontology terms for top 164 up- and down- regulated genes from the Panther DB tool [9]

Finally, it is apparent from the expression values of the main DC maturation markers (Figure 2) that most of the maturation markers are significantly up-regulated in inhibitor-treated DCs.

Figure 2: Expression of main DC maturation markers

4. Conclusion

The results of the bioinformatic analysis for this paper suggest that the TC-PTP/PTP1B inhibitor increases dendritic cell maturation in mice.

Bibliography

1. Humphries, C., Honing that killer instinct. Nature Outlook, 2013. 504(7480): p. S13-S15.
2. Blankenstein, T., et al., The determinants of tumor immunogenicity. Nature Reviews Cancer, 2012. 12(4): p. 307-313.
3. Wank, R., et al., Benefits of a continuous therapy for cancer patients with a novel adoptive cell therapy by cascade priming (CAPRI). Immunotherapy, 2014. 6(3): p. 269-282.
4. Gravitz, L., Cancer immunotherapy. Nature Outlook, 2013. 504(7480): p. S6-S8.
5. Zitvogel, L., et al., Cancer despite immunosurveillance: immunoselection and immunosubversion. Nature Reviews Immunology, 2006. 6: p. 715-27.
6. Rosenberg, S.A., Progress in human tumour immunology and immunotherapy. Nature, 2001. 411(6835): p. 380-4.
7. Julien, S.G., et al., Protein tyrosine phosphatase 1B deficiency or inhibition delays ErbB2-induced mammary tumorigenesis and protects from lung metastasis. Nat Genet, 2007. 39(3): p. 338-46.
8. Michal B., Mathieu M., Robert N., FlexArray: A statistical data analysis software for gene expression microarrays. Genome Quebec, 2007. Montreal, Canada. URL http://genomequebec.mcgill.ca/FlexArray
9. Thomas, P.D., Campbell, M.J., Kerjariwal, A., Huaiyu, M., Karlak, B., Daverman, R., Diemer, K., Muruganujan, A., Narechania, A., PANTHER: a library of protein families and subfamilies indexed by function. Genome Res, 2003. 13: p. 2129-41.

ADDENDUM C:

Paper: A Drosophila-centric view of protein tyrosine phosphatases

This paper has been published in BMC Biology. The abstract for the paper is included below. The author of this thesis (N.H. Desai) contributed to the publication by creating Figure 3, a phylogenetic tree showing the relationship between Drosophila protein tyrosine phosphatases and their human orthologs.

A Drosophila-centric analysis of protein tyrosine phosphatases

Teri Hatzihristidis 1,2, Nikita Desai 1,3, Andrew P. Hutchins 4, Tzu-Ching Meng 5,6,7, Michel L. Tremblay 1,2,3,# and Diego Miranda-Saavedra 8,9,#

1 Goodman Cancer Research Centre, McGill University, 1160 Pine Avenue, Montreal, Quebec H3A 1A3, Canada

2 Department of Medicine, Division of Experimental Medicine, McGill University, Montreal, Quebec, Canada

3 Department of Biochemistry, McGill University, Montreal, Quebec, Canada

4 South China Institute for Stem Cell Biology and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, 190 Kaiyuan Avenue, Guangzhou 510663, China

5 Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan

6 Institute of Biochemical Sciences, National Taiwan University, Taipei, Taiwan

7 Institute of Biological Chemistry, Academia Sinica, Taipei, Taiwan

8 Fibrosis Laboratories, Institute of Cellular Medicine, Newcastle University Medical School, Framlington Place, Newcastle upon Tyne NE2 4HH, United Kingdom

9 World Premier International (WPI) Immunology Frontier Research Center (IFReC), Osaka University, 3-1 Yamadaoka, Suita 565-0871, Osaka, Japan

#Corresponding authors.

Email addresses:

TH: [email protected]

ND: [email protected]

APH: [email protected]

TCM: [email protected]

MLT: [email protected]

DMS: [email protected]

ABSTRACT

Background

Reversible tyrosine phosphorylation controls sophisticated regulatory systems in metazoans. Whereas tyrosine kinases are a well-understood group of enzymes, the study of protein tyrosine phosphatases (PTP) has lagged behind due to the difficulty of finding target proteins and developing specific biochemical inhibitors. In most cases PTPs are dominant over protein kinases in shaping the spatio-temporal patterns of protein phosphorylation networks, and hence the abnormal regulation of PTPs is responsible for many diseases. Most of our knowledge on PTPs comes from human pathologies and the generation of mouse knockouts, which largely correlate well with human disease phenotypes.

Results

Here we present the analysis of the PTP complement of the fruit fly and the complementary view that PTP studies in Drosophila will accelerate our understanding of PTPs in physiological and pathological conditions. With only 44 PTP genes, Drosophila presents a streamlined version of the human complement, with over 70% of human PTPs having an ortholog in the fly. Moreover, 75% of disease-associated PTP genes have an ortholog in Drosophila.

Conclusions

Our integrated analysis places the Drosophila PTPs into evolutionary and functional contexts, thereby providing a platform for the exploitation of the fly for research on PTPs and the transfer of knowledge onto other model systems. Finally, we discuss how the establishment of disease models in Drosophila will prove invaluable in accelerating the discovery of useful drug leads by non-traditional high-throughput screening methods, and thus overcome some of the technical difficulties associated with the development of PTP inhibitors.

KEYWORDS • Tyrosine phosphatase • Drosophila • Sequence analysis • Model system • Phosphorylation • Biochemical evolution
