Search for DNA methylation biomarkers in the circulating DNA of prostate and colorectal cancer

by

Mina Park

A thesis submitted in conformity with the requirements for the degree of Master of Science

Graduate Department of Pharmacology and Toxicology

University of Toronto

© Copyright by Mina Park (2012)

Search for DNA methylation biomarkers in the circulating DNA of prostate and colorectal cancer

Mina Park

Master of Science, 2012

Graduate Department of Pharmacology and Toxicology

University of Toronto

ABSTRACT

Early diagnosis is an effective way to improve patient prognosis in cancer. New opportunities for cancer diagnosis and screening may arise from the identification of cancer-specific epigenetic alterations in cell-free circulating DNA (cirDNA). This study investigated biomarkers at the level of DNA methylation in the plasma cirDNA of individuals affected with prostate cancer or colorectal cancer. A methylation-sensitive restriction enzyme-based method was used to enrich methylated DNA fractions, which were interrogated on CpG island and tiling microarrays. A number of genes and non-coding loci exhibited differential methylation between prostate cancer patients and controls. The candidate loci identified from these microarray experiments underwent verification by bisulfite modification coupled with pyrosequencing. Our results suggest that microarray-based studies of DNA methylation in cirDNA are a promising avenue for the identification of epigenetic biomarkers in cancer.


ACKNOWLEDGEMENTS

I would like to thank my supervisor, Dr. Art Petronis, for giving me the opportunity to work under his supervision for the past few years. The support and academic training I have received have been invaluable. My gratitude goes to Dr. Rene Cortese, to whom I am indebted for his wonderful guidance and mentorship throughout my degree. I would also like to thank all the other members of the Krembil Epigenetics Laboratory. Their assistance, encouragement, and camaraderie, both in and out of the lab, have been instrumental in the completion of this degree. My thanks also go to my advisor, Dr. Albert Wong, for his advice throughout my program.

Finally, I would like to thank my family and friends for their unrelenting support over the years. I am and will ever remain grateful, for it has truly made all the difference.


TABLE OF CONTENTS

Title
Abstract
Acknowledgements
Table of contents
List of tables
List of figures
Abbreviations

1.0 Introduction
  1.1 Overview of the problem
  1.2 Epigenetics
    1.2.1 DNA methylation
    1.2.2 DNA methylation and transcriptional repression
  1.3 DNA methylation changes in cancer
    1.3.1 Global genomic hypomethylation
    1.3.2 Single-locus DNA hypomethylation
    1.3.3 DNA hypermethylation
    1.3.4 CpG island methylator phenotype
  1.4 Cancer biomarkers
    1.4.1 Sensitivity and specificity
  1.5 Cell-free circulating DNA
    1.5.1 Origins of circulating DNA
    1.5.2 Mechanisms for DNA release into the circulation
    1.5.3 Circulating DNA in cancer
    1.5.4 Circulating DNA and cancer biomarkers
  1.6 Studies of DNA methylation in the circulating DNA of cancer patients
    1.6.1 Need for large scale studies of DNA methylation for identification of cancer biomarkers in the circulating DNA
  1.7 Prostate cancer
    1.7.1 Prostate specific antigen
    1.7.2 Studies of DNA methylation in the circulating DNA of prostate cancer patients
  1.8 Colorectal cancer
    1.8.1 Screening modalities used in colorectal cancer
    1.8.2 Studies of DNA methylation in the circulating DNA of colorectal cancer patients
  1.9 Research objectives

2.0 Materials and methods
  2.1 Samples
    2.1.1 Prostate cancer study
    2.1.2 Colorectal cancer study
  2.2 DNA extraction
    2.2.1 Prostate cancer study
    2.2.2 Colorectal cancer study
  2.3 DNA methylation detection
    2.3.1 Principle of DNA methylation detection
    2.3.2 DNA blunting
    2.3.3 Adaptor ligation
    2.3.4 DNA methylation-sensitive enzyme digest
    2.3.5 Adaptor-mediated PCR
  2.4 Microarray experiments and data analysis
    2.4.1 Prostate cancer study
      2.4.1.1 Microarrays
      2.4.1.2 Microarray data analysis
    2.4.2 Colorectal cancer study
      2.4.2.1 Microarrays
      2.4.2.2 Microarray data analysis
  2.5 Fine mapping of individual CpG locations
    2.5.1 Principle of fine mapping
    2.5.2 Bisulfite treatment and whole bisulfitome amplification
    2.5.3 Nested PCR
    2.5.4 Pyrosequencing

3.0 Results
  3.1 Microarray methylation analysis in the circulating DNA of prostate cancer
  3.2 Verification of microarray findings by fine mapping of cytosines on selected genes
    3.2.1 Genes showing statistically significant differential methylation by pyrosequencing
    3.2.2 Concordance of microarray and pyrosequencing data
    3.2.3 Predictive value of differential circulating DNA methylation in RNF219
  3.3 Microarray methylation analysis in the circulating DNA of colorectal cancer
    3.3.1 Potential candidate loci in the circulating DNA of colorectal cancer

4.0 Discussion
  4.1 DNA methylation differences in the plasma circulating DNA of cancer patients
    4.1.1 Prostate cancer study
    4.1.2 Colorectal cancer study
  4.2 Discovery of candidate markers
    4.2.1 Prostate cancer study
      4.2.1.1 Replication of microarray data by pyrosequencing
      4.2.1.2 Performance characteristics of RNF219
    4.2.2 Colorectal cancer study
  4.3 Future directions

5.0 References


LIST OF TABLES

Table 1. Primers used for amplification of the external locus in nested PCR

Table 2. Primers used for amplification of the internal locus in nested PCR

Table 3. Loci that were significantly differentially methylated and mapped to repetitive elements, in order of significance

Table 4. Loci that were significantly differentially methylated and mapped to unique sequences, in order of significance

Table 5. Loci selected for fine mapping of methylated CpG positions

Table 6. Top 48 loci located within genes exhibiting differential methylation between colorectal cancer samples and controls

Table 7. Top 52 loci located in intergenic regions that exhibit differential methylation between colorectal cancer samples and controls


LIST OF FIGURES

Figure 1. Principle of DNA methylation detection technology in plasma cirDNA

Figure 2. Volcano plot of microarray data in prostate cancer and control samples using FDR-adjusted p-value as statistics

Figure 3. Pyrosequencing results showing methylation status in CpG sites of RNF219 in prostate cancer and control samples

Figure 4. Pyrosequencing results showing methylation status in CpG sites of SIX3 in prostate cancer and control samples

Figure 5. Pyrosequencing results showing methylation status in CpG sites of KIAA1539 in prostate cancer and control samples

Figure 6. Sample pyrosequencing results showing methylation status of loci in prostate cancer and control samples

Figure 7. Differential methylation values obtained from microarrays and pyrosequencing for candidate loci

Figure 8. DNA methylation in the RNF219 gene in two independent sample sets

Figure 9. Predictive accuracy of cirDNA methylation level in the RNF219 gene

Figure 10. Volcano plot of microarray data in colorectal cancer and control samples


ABBREVIATIONS

BPH      benign prostatic hyperplasia
caC      carboxylcytosine
CIMP     CpG island methylator phenotype
cirDNA   cell-free circulating DNA
DNMT     DNA methyltransferase
FDR      false discovery rate
FOBT     fecal occult blood test
mC       methylated cytosine
PCR      polymerase chain reaction
PSA      prostate specific antigen
ROC      receiver operating characteristic


1.0 INTRODUCTION

1.1 Overview of the problem

Cancer is a group of diseases that share the central characteristic of uncontrolled cellular proliferation, and it is a major public health problem. In 2008, there were an estimated 12.7 million new cancer cases, and the risk of dying from cancer before the age of 75 was 11.2%. Cancer is also a leading cause of death: it was responsible for 7.6 million deaths, or 13% of all deaths, that year [1].

Early diagnosis of cancer is one of the best ways to reduce cancer-related mortality. When a tumour is detected at an early stage, the likelihood that available treatments will succeed increases dramatically [2-4]. For a number of cancers, diagnosis at an early stage of disease is associated with improved survival outcomes. In colorectal cancer, 5-year survival is 90.1% when the cancer is diagnosed while it is still localized to the colon, but falls to 11.7% if the cancer is diagnosed after it has metastasized. Similarly, 5-year survival is 100% for localized prostate cancer, but falls to 28.7% for metastatic prostate cancer [5].

Traditional cancer diagnosis is based on assessing the morphology of cancer cells [6]. This method is suitable for diagnosing cancer in sites of the body that are easily accessible, such as the cervix or blood. However, in cancers for which cells are not easily accessible, diagnosis by morphological assessment requires tumour biopsies obtained by invasive methods [7]. Discovering biomarkers that detect cancer-specific changes in peripheral, easily accessible tissues is one way to overcome the lack of access to adequate testing material for cancer diagnosis.


1.2 Epigenetics

New opportunities for cancer diagnosis and screening may arise from the identification of cancer-specific epigenetic alterations. Epigenetics refers to heritable changes in gene expression that are not based on the underlying DNA sequence [8]. In the human genome, the two main epigenetic mechanisms are modifications of histones, which are the main protein components of chromatin, and methylation of the cytosine nucleotide in DNA [9]. Alterations in chromatin structure are mediated through post-translational modifications of histones such as acetylation, methylation, and phosphorylation. These modifications can change the conformation of chromatin between an open, transcriptionally active form known as euchromatin and a condensed, transcriptionally inactive form known as heterochromatin [10]. DNA methylation refers to the covalent addition of a methyl group to position 5 of the cytosine pyrimidine ring [11]. It is a relatively stable and conserved mark, which makes it an appealing target for epigenetic studies.

1.2.1 DNA methylation

In humans, DNA methylation occurs primarily in the context of a cytosine followed by a guanine, which is known as a CpG dinucleotide. It is estimated that 1% of cytosine moieties and between 70% and 80% of CpGs are methylated in humans [12]. Methylated CpGs are located mainly in repetitive genomic regions [13]. In contrast, CpG islands, which are regions that show a high density of CpG sites and are typically associated with active transcription [14], contain largely unmethylated CpGs. Approximately 60% of genes are estimated to be associated with a CpG island in their promoter regions [15].
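To make the notion of "a high density of CpG sites" concrete, the following Python sketch counts CpG dinucleotides in a sequence and reports its GC content and observed/expected CpG ratio. It is an illustration only: the observed/expected ratio is a commonly used summary measure that is not defined in this thesis, no island-calling threshold is implied, and the function name and example sequences are hypothetical.

```python
def cpg_stats(seq: str) -> dict[str, float]:
    """Summarise CpG content of a DNA sequence (single strand, 5'->3')."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # a C immediately followed by a G
    gc_content = (c + g) / n if n else 0.0
    # Observed CpG count relative to the count expected if C and G
    # were distributed independently along the sequence (assumed measure).
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return {"cpg_count": cpg, "gc_content": gc_content, "obs_exp": obs_exp}


# Hypothetical sequences, not taken from any real locus.
cpg_rich = cpg_stats("ACGTCGCGAT")
cpg_poor = cpg_stats("ACCTGGATCA")
print(cpg_rich)
print(cpg_poor)
```

In this toy example the first sequence contains three CpG dinucleotides and the second contains none, even though both contain cytosines and guanines.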


In addition to methylated cytosine (mC), further modifications of cytosine have recently been discovered. The Tet family of enzymes has been found to oxidize mC to hydroxymethylated cytosine [16], and to further oxidize this substrate to formylcytosine and carboxylcytosine (caC) [17]. This conversion is hypothesized to play an important role in demethylation, through the excision of caC by thymine-DNA glycosylase to yield unmethylated cytosine [18]. However, because most of the techniques used in epigenetic studies of cancer have not differentiated mC from other forms of modified cytosine [19], the term “DNA methylation” will be used throughout this work to refer to such modifications.

DNA methylation is mediated by a family of enzymes called DNA methyltransferases (DNMTs). There are currently four identified DNMTs that play a role in DNA methylation [11]. DNMT1 is proposed to be the maintenance DNMT, responsible for copying methylation patterns from hemimethylated templates to daughter strands during DNA replication [11]. DNMT3a and DNMT3b are de novo methyltransferases, which set up the methylation patterns early in development [20]. DNMT3L is thought to facilitate de novo methylation by binding to DNMT3a and DNMT3b and stimulating their activity [21].

1.2.2 DNA methylation and transcriptional regulation

DNA methylation has been associated with transcriptional repression [22], and thereby plays an important role in the regulation of gene expression. One mode of repression is for DNA methylation to physically impede the binding of transcription factors [23]. Another mode is for methylated DNA to mediate transcriptional repression by attracting proteins that compact chromatin, disposing it to an inactive heterochromatic state [24]. A family of proteins containing a methyl-binding domain that is involved in this process has been characterized, the most studied of which is methyl CpG binding protein 2 (MeCP2) [25]. It is important to note, however, that the association between a high density of mC and suppression of transcriptional activity applies only to regulatory regions, such as promoters. One group of researchers investigating methylation on the X chromosome found patterns of gene body hypermethylation in the active X chromosome compared to the inactive X chromosome [26]. This finding suggests that DNA methylation in gene bodies is associated with gene expression, in contrast to the repressive effect that DNA methylation in promoter regions has on gene expression.

Given its important role in transcriptional regulation, DNA methylation is crucial for proper biological development and functioning. DNA methylation is essential for genomic imprinting [27], X-chromosome inactivation [28], and differentiation and maintenance of cellular identity [29-30]. Aberrant DNA methylation has been implicated in a large spectrum of human diseases, ranging from imprinting disorders such as Beckwith-Wiedemann syndrome and Prader-Willi syndrome [27], to complex diseases, the most studied of which is cancer [31].

1.3 DNA methylation changes in cancer

Abnormal patterns of DNA methylation are one of the most common alterations found in cancer [32-39]. Cancer cells exhibit a global loss of DNA methylation in addition to a gain of methylation in some CpG islands [39]. These alterations provide tumour cells with a growth advantage by elevating their genetic instability and allowing them to accrue progressive changes that support their continued proliferation and metastasis [27].


1.3.1 Global genomic hypomethylation

Loss of DNA methylation was the first epigenetic alteration identified in cancer cells [35]. Global genomic hypomethylation is largely due to loss of methylation in repetitive DNA sequences [40], and it has been seen universally across various cancers as well as in some pre-malignant adenomas [41]. Moreover, the degree of hypomethylation has been associated with disease severity and metastatic potential [35, 40].

There are many functional implications of global DNA hypomethylation as it relates to cancer. By weakening transcriptional repression, DNA hypomethylation can facilitate chromosomal instability, which is another hallmark of tumour cells [27]. Experiments in which methylation was depleted showed that loss of DNA methylation leads to aneuploidy and chromosomal rearrangements [42], which are thought to be primarily due to loss of methylated cytosines in centromeric or pericentric regions [43].

1.3.2 Single-locus DNA hypomethylation

Hypomethylation in coding sequences has also been observed in cancer [41]. A recent study found that CpG islands can be normally methylated in somatic tissues [44], and that the hypomethylation of these islands in cancer can activate nearby genes [45]. This has been found in genes with no known relationship to the disease, such as growth hormone (GH), α-chorionic gonadotropin (αHCG), and γ-globin (HBG1) in colorectal cancer [46]. However, hypomethylation has also been found in genes whose activation contributes to tumorigenesis [45].

There are several examples of genes activated by hypomethylation in cancer. They include oncogenes such as homeobox proto-oncogene (HOX11) in leukemia [47], v-myc myelocytomatosis viral oncogene homolog (C-MYC) in colorectal cancer [48], and v-Ha-ras Harvey rat sarcoma viral oncogene homolog (HRAS) in melanoma [36], as well as non-oncogenes such as trefoil factor 1 (pS2) in breast cancer, which is implicated in the control of cell proliferation [49], and carbonic anhydrase 9 (MN/CA9) in renal cell carcinoma [50]. Moreover, hypomethylation in genes can disrupt genomic imprinting through activation of the normally silent allele, and a vast array of cancers exhibit such loss of imprinting [34].

1.3.3 DNA hypermethylation

Hypermethylation of DNA in cancer occurs concomitantly with global genomic hypomethylation. Hypermethylation of DNA frequently occurs in the CpG islands of gene promoters and, in many cases, is associated with transcriptional silencing [40, 51]. It is estimated that an average of 600 out of the approximately 45,000 CpG islands in the genome are hypermethylated in cancer [52]. Hypermethylation of promoters is an important mechanism for inactivation of tumour suppressor genes [53], and aberrant hypermethylation and downregulation have been observed in genes involved in the cell cycle, DNA repair, cell signaling, chromatin remodeling, transcription, and apoptosis for almost every type of tumour [27].

Studies have found that patterns of CpG hypermethylation occur in a cancer type-specific fashion [52], in both sporadic and inherited cancers of the same tumour type [32]. In these studies, cancer-associated DNA methylation was found to vary with the kind of cancer under investigation [32, 52]. It is suggested that this may be due to different growth selection pressures or individual CpG island susceptibilities in each tumour type [52]. Promoter hypermethylation in certain CpG islands may confer a selective advantage for the survival of a specific cell type [54]. Hence, certain genes are downregulated in one type of cancer but not another because lack of expression of those genes has cellular consequences that promote the growth of tumours of a specific tissue [54]. Known hypermethylated genes in different cancers include glutathione S-transferase P (GSTP1) in prostate cancer [55], breast cancer 1, early onset (BRCA1) in breast and ovarian cancers [32, 56], and mutL homolog 1, colon cancer, nonpolyposis type 2 (hMLH1) in gastric, colorectal, and endometrial cancers [37, 57-58].

1.3.4 CpG island methylator phenotype

One theory suggests that there is a CpG island methylator phenotype (CIMP) in human cancers. This theory developed from studies in colorectal cancer that found a subset of cancers displaying a 3- to 5-fold increased frequency of aberrant hypermethylation at multiple loci; this pattern of methylation in a cluster of genes was not seen in the remaining cases [59]. According to this theory, CIMP cancers are biologically distinct from other cancers, with differences in genetics, histology, pathology, and clinical attributes [39]. However, this is still a very controversial concept, with no consensus on the genes that should be included in a panel to distinguish CIMP cancers from other types [45].

1.4 Cancer biomarkers

A biomarker is defined as “a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention” [60]. Biomarkers have many clinical applications in disease detection and monitoring. Biomarkers can be used to detect the presence of a disease [60]; an example of a commonly used screening tool is the Pap smear, in which abnormal cells may indicate cervical cancer [6]. Prognostic biomarkers predict the natural course of disease in an individual. For instance, the Breast Cancer Profiling, or “H/I”, test looks at the ratio of expression of homeobox B13 (HOXB13) to interleukin 17 receptor B (IL17RB) in tumour tissue and estimates the probability of disease recurrence in an individual after the original tumour has been resected, with increasing risk associated with higher ratios [61]. Predictive biomarkers are used to assess whether a patient will benefit from a particular treatment based on the characteristics of their disease [60]. They are also used in breast cancer: patients whose tumours overexpress the v-erb-b2 erythroblastic leukemia viral oncogene homolog 2 (HER2) gene may respond to treatment with trastuzumab, whereas those whose tumours express the estrogen receptor may benefit from treatment with tamoxifen [62]. In addition, biomarkers can be used to measure the treatment effects of a drug on the tumour; these are called pharmacodynamic biomarkers [63].

1.4.1 Sensitivity and specificity

A potential biomarker for cancer diagnosis has to distinguish between individuals who do and do not have cancer, and its performance is assessed by the proportion of individuals that the test classifies correctly. The sensitivity of a biomarker is the proportion of individuals with the disease who are correctly identified as positive by the test [64]. In contrast, the specificity of a biomarker is the proportion of individuals without the disease who are correctly identified as negative by the test [64]. These performance characteristics can be assessed from case-control studies. For example, prostate specific antigen, a commonly used biomarker in prostate cancer screening, has 80% sensitivity and 20% specificity [55]. Another example is the Pap test, which is widely used in screening for cervical cancer and has 58% sensitivity and 69% specificity [65].
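The two definitions can be illustrated with a minimal Python sketch that computes sensitivity and specificity from the counts of a case-control comparison. The class and function names are hypothetical, and the counts are invented solely to reproduce the 80% sensitivity and 20% specificity quoted above for prostate specific antigen; they are not data from any study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a hypothetical case-control evaluation of a biomarker."""
    true_pos: int   # cases with a positive test
    false_neg: int  # cases with a negative test
    true_neg: int   # controls with a negative test
    false_pos: int  # controls with a positive test


def sensitivity(c: ConfusionCounts) -> float:
    """Proportion of individuals with the disease who test positive."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Proportion of individuals without the disease who test negative."""
    return c.true_neg / (c.true_neg + c.false_pos)


# Invented counts for 100 cases and 100 controls (illustration only).
psa_like = ConfusionCounts(true_pos=80, false_neg=20, true_neg=20, false_pos=80)
print(f"sensitivity = {sensitivity(psa_like):.2f}")
print(f"specificity = {specificity(psa_like):.2f}")
```

With these counts, a test that flags most cases also flags most controls, which is why high sensitivity alone does not make a useful screening marker.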


1.5 Cell-free circulating DNA

Discovering blood-based biomarkers is an appealing option, as blood is a minimally invasive and easily accessible specimen [7]. There are several potential biomarker targets in blood that exhibit cancer-related differences, including DNA, RNA, and proteins [6]. Of these, DNA biomarkers are especially attractive, as DNA is easily stored and far more stable than RNA and proteins. Moreover, only small amounts of DNA are required for analysis, as the template can be amplified by PCR.

1.5.1 Origins of circulating DNA

Cell-free circulating DNA (cirDNA) refers to fragments of extracellular DNA that flow freely in the circulation. CirDNA is believed to originate from dead cells through apoptosis and necrosis [66-69]. Apoptosis is a process of programmed cell death involving the action of enzymes called caspases [66]. A major hallmark of apoptosis is internucleosomal cleavage of chromatin, which results in DNA fragments that exhibit a ladder-like pattern with a periodicity of 180 bp [70]. Necrosis is cell death that results from physical or chemical trauma [67]. High molecular weight DNA fragments are expected after necrosis, as it causes nonspecific and incomplete digestion of DNA [71]. The size distribution of DNA extracted from human plasma shows both 180-bp fragments and high molecular weight fragments, suggesting that cirDNA originates from both cell death processes [69]. Furthermore, experiments that induced apoptosis and necrosis in mouse liver cells found that DNA recovered from plasma showed a 180-bp ladder pattern and high molecular weight fragments (>10,000 bp), respectively [69]. DNA in plasma has also been shown to circulate in the form of nucleosomes [66, 69], which is expected after apoptosis, and being bound to protein likely protects the DNA from further enzymatic digestion in the bloodstream [70].

1.5.2 Mechanisms for DNA release into the circulation

Though the precise manner in which cells release DNA into the circulation is unknown, some theories regarding this mechanism have been postulated. Cells that die by apoptosis and necrosis are rapidly cleared from the circulation through phagocytosis by macrophages and other cellular scavengers [68, 70]. Macrophages may play a role in the release of DNA from cells that die by necrosis. Cell culture studies have shown that macrophages that engulf necrotic cells release digested DNA into the medium, in contrast to macrophages that engulf apoptotic cells, which do not [67]. During apoptosis, DNA gets fragmented and sequestered within blebs that move to the cell surface and can be released into circulation [72]. In vitro studies have shown that apoptotic cells release DNA spontaneously as they die [66]. These studies suggest that apoptotic cells may release DNA directly into the blood while the release of DNA from necrotic cells is dependent on other cellular factors [66].

Another possibility is that living cells can actively release DNA into the circulation [71]. Cell culture studies in lymphocytes show that they can release DNA into the supernatant in the absence of cell death [73]. Furthermore, it was shown that actively released DNA also displays a ladder-like pattern, suggesting that the ladder-like pattern seen in cirDNA may not be due solely to apoptosis [74]. Thus, there are several possible mechanisms by which DNA can enter the bloodstream and other bodily fluids.


1.5.3 Circulating DNA in cancer

CirDNA levels have been widely reported to be elevated in a number of cancers, including those of the colon [75-76], pancreas [76], prostate [77], breast [75, 78], lung [75, 79], ovary, uterus, and cervix [75]. Reported concentrations of cirDNA in plasma range from 0 to >1,000 ng/ml of blood in cancer patients [77-79], compared to between 0 and 100 ng/ml of blood in healthy subjects [79]. These values reflect considerable variation in cirDNA concentrations in both groups, which can be partly attributed to the different techniques used to quantify cirDNA as well as the different treatments of DNA employed by the different studies [80]. Averaged over multiple studies, cancer patients have 180 ng/ml of cirDNA while healthy subjects have 30 ng/ml [80].

It is thought that the high rates of apoptosis and necrosis in a tumour are related to the greater amounts of DNA found in the circulation of cancer patients. One explanation is that as tumours enlarge, they are likely to outgrow their blood supply, leading to hypoxia-induced cell necrosis and apoptosis in large regions of the tumour [68, 80]. This would lead to increased phagocytosis of tumour cells by macrophages and DNA release by apoptosis, corresponding to higher levels of cirDNA. Moreover, excessive cell death could lead to reduced clearance of circulating nucleic acids by the liver and kidney [80].

1.5.4 Circulating DNA and cancer biomarkers

CirDNA has been a focus of biomarker research in oncology ever since tumour cells were shown to release their DNA into blood in 1987 [81]. Studies aiming to discover the cellular origins of cirDNA in cancer were able to detect alterations in cirDNA that matched those of the primary tumour, which led to the conclusion that some of the DNA circulating in blood comes from tumour cells [69, 73, 82-83]. For example, Chen et al. were able to detect cancer-specific microsatellite instability in the plasma of small cell lung carcinoma patients [82]. However, there are additional cellular sources of cirDNA. Jahr et al. found that cirDNA derives from both tumour and nontumour cells, with T-cells and endothelial cells making minimal contributions [69]. Moreover, they hypothesized that the nontumour fraction of cirDNA originates from nontumour cells that lie in the vicinity of the tumour and degenerate as the tumour grows [69]. This is consistent with other findings showing that cirDNA containing cancer-related mutations represents only a small fraction of the total cirDNA detected in plasma [68].

Cancer biomarkers in cirDNA can be based on a number of different alterations, including microsatellite instability, mutations, integrity, and methylation [79-80]. Microsatellites are short repetitive nucleotide sequences in the genome that are 1 to 6 bp long and polymorphic for length [79, 84]. Microsatellite instability is characterized by discrepancies in the number of nucleotide repeats found within microsatellites in tumour versus normal DNA [85], and can be demonstrated as loss of heterozygosity or as a shift after gel separation [79, 86]. Microsatellite instability is a common trait of cancer, and several studies have demonstrated that it is possible to detect microsatellite instability in the cirDNA of cancer patients [82, 86-87]. However, studies looking at DNA extracted from the primary tumours and plasma of individuals have shown discrepancies regarding the detection of tumour-related loss of heterozygosity in cirDNA, and the overall sensitivity of tests for detecting microsatellite instability in cirDNA is very low, at only 0.5% [79].

Detecting tumour-specific gene mutations in cirDNA is another opportunity for biomarker development. The most frequently analyzed mutations are in v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) and tumor protein 53 (p53), as they are the most commonly mutated genes in cancer [79]. However, despite the high frequency of these mutations in tumours, assays for these alterations in the cirDNA of patients show inconsistent results [79]; several studies have found tumour-specific mutations in less than 10% of samples [88-91]. One reason for these results may be that the tumour-specific mutation is present infrequently in the cirDNA and is masked by the presence of wild-type DNA [80]. Moreover, mutations in KRAS can be found in the cirDNA of patients with non-neoplastic disease, such as chronic pancreatitis [92], as well as in that of healthy controls [93]. Thus, issues of sensitivity and specificity are major drawbacks to this approach [80].

Assessing cirDNA integrity has been a more recent advancement in cancer biomarker assay development. CirDNA integrity is measured as the ratio of longer to shorter DNA fragments [94], and studies looking at DNA integrity typically examine the integrity of repeat sequences such as ALU and LINE1 [80]. Greater cirDNA integrity has been associated with cancer in a number of studies [78, 94-96]. However, because repeat sequences are interspersed throughout the genome, the ability to specify a cancer type is lost in these assays, though the sensitivity of the test may be enhanced [80].
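The integrity measure described above reduces to a single ratio, sketched below in Python. The function name, the units, and the example quantities are hypothetical; the sketch assumes only that the amounts of a longer and a shorter fragment of the same repeat element have already been quantified in the same sample.

```python
def integrity_index(long_fragment_qty: float, short_fragment_qty: float) -> float:
    """Ratio of longer to shorter DNA fragment quantity in one sample."""
    if short_fragment_qty <= 0:
        raise ValueError("short-fragment quantity must be positive")
    return long_fragment_qty / short_fragment_qty


# Invented quantities (arbitrary units) for two hypothetical plasma samples.
control_index = integrity_index(long_fragment_qty=3.0, short_fragment_qty=10.0)
patient_index = integrity_index(long_fragment_qty=7.5, short_fragment_qty=10.0)
print(control_index, patient_index)
```

A higher index means that a larger share of the measured DNA is present as long fragments, which is the direction of change that the cited studies associate with cancer.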

1.6 Studies of DNA methylation in the circulating DNA of cancer patients

Detecting DNA methylation is a promising avenue for discovering biomarkers in cirDNA [80]. Assays for DNA methylation have several advantages. Tumour-specific DNA methylation represents a stable marker that will generally not be lost [79], and there are particular genes that are frequently methylated in certain cancers [80]. Hence, a plethora of studies has been dedicated to detecting cancer-specific DNA methylation markers in the circulation. The vast majority of these studies have employed a candidate gene approach based on existing knowledge of DNA methylation changes in tumour tissues. The candidate gene approach requires a priori knowledge and selection of cancer-related genes, drawn from genes that have already been found to be methylated in tumours [80]. Once selected, methylation at these loci can be detected by treatment with sodium bisulfite, which converts unmethylated cytosines, but not methylated cytosines, into uracil (which is read as thymine after subsequent PCR amplification). The modified DNA can then be analyzed by methylation specific PCR (MSP), which uses primer sets specific for methylated or unmethylated DNA [97], real-time PCR [79], or DNA sequencing [98].
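The effect of bisulfite treatment on a sequence can be illustrated with a small in silico sketch in Python. It models only the strand given, treats conversion as complete, and takes the methylated positions as an input; the function name and the 12-base fragment are hypothetical and do not correspond to any locus studied here.

```python
def bisulfite_convert(seq: str, methylated_positions: set[int]) -> str:
    """Return the sequence as read after bisulfite treatment and PCR.

    Unmethylated C is converted to U and read as T after PCR;
    methylated C (0-based indices in methylated_positions) stays C.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Hypothetical fragment with CpG sites at positions 2 and 8.
fragment = "ATCGTCAACGCA"
unmethylated_read = bisulfite_convert(fragment, set())
methylated_read = bisulfite_convert(fragment, {2, 8})
print(unmethylated_read)
print(methylated_read)
```

The two reads differ only at the CpG cytosines, and this sequence difference is what methylation-specific primers or sequencing detect.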

Using these methods, differential methylation of several genes in cirDNA has been found in numerous cancer types [79]. For example, one study in breast cancer found methylation of adenomatous polyposis coli (APC) in 29%, of Ras association domain family member 1 (RASSF1A) in 56%, and of death-associated protein kinase 1 (DAPK) in 35% of patients’ plasma (n = 35), and no methylation in any of the genes in the plasma DNA of 20 healthy controls and eight patients with benign breast disease [99]. Another study in lung cancer found methylation of APC in the plasma of 47% of lung cancer patients (n = 89) and in no healthy controls (n = 50) [100]. In ovarian cancer patients, BRCA1 was methylated in 18%, and RASSF1A in 40%, of plasma samples (n = 50), and methylation in these genes was not found in any healthy controls (n = 40) [101]. Similar studies have identified differential cirDNA methylation in the plasma of patients with gastrointestinal, renal, hepatocellular, colorectal, prostatic, esophageal, cervical, and bladder cancer [79].

To date, there have been few microarray-based studies of cirDNA methylation in cancer, and the ones that have been conducted have been for pancreatic [102], breast [103], colorectal [104], and ovarian cancers [105-106]. The arrays used in these studies were custom designed [107] and were limited to just 56 genes. However, they were able to find different methylation profiles in the plasma cirDNA of cancer patients and healthy controls. For instance, using a panel of five genes identified by this method, a sensitivity of 85% and a specificity of 61% were achieved for detecting ovarian cancer [106]. Of these genes, one had not previously been reported to be involved in cancer [106].
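
For reference, the performance figures quoted throughout this chapter follow the usual definitions of sensitivity and specificity. The short Python sketch below restates those definitions; the counts are hypothetical and were chosen only so that the output reproduces the 85% and 61% quoted above. They are not taken from [106].

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of cancer cases that the marker panel calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of cancer-free controls that the marker panel calls negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for 100 cases and 100 controls (illustration only).
sens = sensitivity(true_pos=85, false_neg=15)
spec = specificity(true_neg=61, false_pos=39)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

A specificity of 61% in this illustration means that 39 of every 100 unaffected individuals would be called positive, which is why both figures need to be reported together.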

1.6.1 Need for large scale studies of DNA methylation for identification of cancer biomarkers in the circulating DNA

A shortcoming of the studies conducted to date on the cirDNA methylome, which refers to the distribution of methylated cytosines across the entire genome [108], is that they have tested only a small number of selected genes. Moreover, the candidate-gene approach has precluded the discovery of novel biomarkers in cirDNA that are not yet known to be altered in cancer. Hence, it is likely that many informative markers have been missed. In order to find the best biomarkers, it is necessary to use an approach that can comprehensively analyze methylation patterns in the cirDNA of cancer patients. Such an approach can be used to identify specific and previously unknown regions that are differentially methylated in cirDNA [109], and the results from such studies can inform the selection of genes for future analysis.

1.7 Prostate cancer

Prostate cancer is the most common cancer for men in the developed world [1]. In 2010 in the United States alone, it accounted for 217,730 new cases and 32,505 deaths [110].

Prostate cancer occurs in the prostate gland and is classified as an adenocarcinoma, arising from prostate gland epithelial cells [111]. The leading risk factor for developing this disease is age [111]. Most cases of prostate cancer present asymptomatically, especially in the earlier stages, though urinary tract symptoms such as urgency, frequency, and incomplete emptying have been associated with prostate cancer [111].

1.7.1 Prostate specific antigen

Prostate cancer is one of the few cancers for which a molecular marker is routinely used for detection, risk stratification, and monitoring [112]. Prostate specific antigen (PSA) has been used since 1994 to screen asymptomatic populations beginning at 50 years of age [112]. PSA is a serine protease whose physiological role is believed to be liquefying the seminal fluid [113], and it is present in conditions of normal health [114]. The traditional threshold that is used for biopsy is 4.0 ng/ml of blood [111], with values above this amount warranting follow-up. For biopsy, samples of the prostate are taken using transrectal ultrasound-guided needles [111]. Patient management after a positive diagnosis of prostate cancer includes watchful waiting to see if the cancer progresses and is found to be more aggressive, radiation therapy to control cell growth, or surgical intervention to remove the prostate [111].

Despite the common use of PSA in clinical practice, the effectiveness of PSA as a diagnostic tool has come under strong questioning in recent years. The performance characteristics of PSA are mixed, with a sensitivity of 80% and a specificity of 20% [55]. As PSA is a marker that is specific to the prostate, not prostate cancer, increased PSA levels are detected in non-neoplastic conditions of the prostate, such as benign prostatic hyperplasia (BPH) and prostatitis [112]. This leads to unnecessary biopsies, which are costly and carry significant risks and stress for patients [115]. Another challenge to PSA screening is that it leads to the overdiagnosis and overtreatment of clinically insignificant tumours. Estimates based on autopsies of men who have died from unrelated causes suggest that 70% of men in their 60s have a latent form of prostate cancer [112, 116]. It is therefore possible that a significant proportion of positive PSA tests detect tumours that would otherwise not have any clinical impact on an individual, either because the person would die of other causes before the tumour could progress or because it is essentially benign [116]. Corroborating this theory is a meta-analysis of six randomized controlled trials totalling 387,286 participants, which found that PSA screening had no effect on mortality [117]. Therefore, there exists a real need for improved diagnostic markers for prostate cancer.

1.7.2 Studies of DNA methylation in the circulating DNA of prostate cancer patients

There have been several studies looking at DNA methylation in the cirDNA of prostate cancer patients. The most widely studied and promising methylation marker in the cirDNA of prostate cancer is GSTP1, whose methylation is a common pathological event in prostate cancer [55, 118]. A meta-analysis of 22 different studies concluded that the average specificity of GSTP1 methylation was 89%, much higher than that of PSA; however, its sensitivity was lower than that of PSA, at 52% [55]. Other studies have found additional differentially methylated regions. For example, one study looking at DNA methylation in metastatic prostate cancer found hypermethylation of ATP-binding cassette, sub-family B, member 1 (MDR1) in 15 (83.3%), endothelin receptor type B (EDNRB) in 9 (50%), retinoic acid receptor, beta (RARβ) in 7 (38.9%), GSTP1 in 5 (27.8%) and RASSF1A in 3 (16.7%) metastatic prostate serum samples [119].

However, as most of the earlier studies tested only a small number of selected genes, there is a high likelihood that the most informative markers have been missed.

1.8 Colorectal cancer

Colorectal cancer is the fourth most common cancer in the world [1], and in the United States, it was responsible for 102,900 new cancer cases and 51,370 deaths in 2010 [110]. There are a number of risk factors associated with developing colorectal cancer, including age, a diet rich in processed food and red meats, obesity, physical inactivity, high consumption of alcohol, smoking, and a family history of the disease [120]. Though there are several types of colorectal cancers, the vast majority of cases (> 90%) are adenocarcinomas [121], which originate from the lining of the colon [122]. The most common form of colorectal cancer presentation is to a primary care provider with non-urgent symptoms [123], such as rectal bleeding, diarrhea, constipation, weight loss, and abdominal pain [124], though a smaller percentage of colorectal cancer cases present as emergencies [125].

1.8.1 Screening modalities used in colorectal cancer

It is widely acknowledged that the stage of colorectal cancer at the moment of diagnosis is the main prognostic factor for this disease [2, 4]. Five-year survival for patients diagnosed when their tumours are limited to the colon exceeds 90%, but drops to below 70% once the cancer has spread to lymph nodes, and below 12% when the cancer has already metastasized [5].

The most commonly used screening modalities for colorectal cancer detection can be divided into those based on finding markers in stool and those based on structural examinations of the colon.

Fecal occult blood tests (FOBTs) are based on the observation that there are small but unobservable amounts of blood (occult blood) released into the bowel lumen in colorectal cancer [126]. FOBTs detect this blood loss in stool and are the main screening tool used in Europe and Canada [127]. There are two main types of FOBTs. Guaiac FOBTs (gFOBTs) work by detecting the peroxidase activity of hemoglobin when it interacts with guaiac in the presence of hydrogen peroxide. This interaction converts the colorless guaiac to a blue color [128]. In contrast, immunochemical FOBTs (iFOBTs) use antibodies directed against human globin to detect blood in stool. Though FOBTs represent a non-invasive diagnostic method, a limitation common to both of these tests is their poor sensitivity. The reported sensitivity of gFOBTs is generally low and varies widely, from 11 to 64%, and specificity varies from 90 to 98%, depending on the test brand used [129]. The performance characteristics of iFOBTs are improved compared to gFOBTs, but their sensitivities are still variable, ranging from 56 to 89%, with specificities of 91 to 97% [126]. Moreover, because FOBTs are designed to detect blood in stool, they are a nonspecific test for colorectal cancer, as gastrointestinal bleeding can result from other causes such as ulcers, inflammatory bowel disease, or the use of anticoagulant/antiplatelet medications, resulting in false positive results.

Colonoscopy is the gold standard for detecting colorectal cancer [126, 128], and it is the most commonly used screening modality in the United States [126]. The sensitivity of colonoscopies for detecting colorectal cancer is 95%, and the specificity ranges from 95 to 99% [126]. However, routine diagnosis of colorectal cancer by colonoscopy is impractical as the procedures are invasive, consume a great deal of healthcare resources, and present significant inconvenience and discomfort to patients. Given the effectiveness of early diagnosis and treatment in colorectal cancer and the performance gaps of current non-invasive colorectal cancer markers, the development of novel biomarkers could result in improved patient outcomes for colorectal cancer.

1.8.2 Studies of DNA methylation in the circulating DNA of colorectal cancer patients

Epigenetic studies of cirDNA of colorectal cancer have included the analysis of hMLH1, cyclin-dependent kinase inhibitor 2A (p16INK4a), septin 9 (SEPT9), and DAPK, with methylation differences at these loci being detected in 16% (n = 19), 36% (n = 58), 71% (n = 133), and 17% (n = 122), respectively, of examined samples [79, 130]. Of these genes, SEPT9 has been commercially developed and is marketed as Epi proColon® in Europe and the Middle East [131]. The reported sensitivity and specificity for this test range between 68 – 72% and 89 – 93%, respectively [132]. Though promising, the moderate sensitivity of this test could be improved upon in future studies.

1.9 Research objectives

Despite the potential for discovering novel DNA methylation biomarkers in the cirDNA of cancer patients, studies that comprehensively analyze the cancer circulating methylome have not been conducted. The moderate performance characteristics of the DNA methylation markers discovered to date show promise but also leave much room for improvement. Dedicated epigenome-wide studies of cirDNA could identify novel DNA methylation biomarkers in regions of the genome that are currently unexplored.

The key objective of this study was to perform a comprehensive study of the plasma cirDNA methylome in prostate cancer and colorectal cancer. Our general hypothesis driving this study was that cancer patients and controls would exhibit different methylation profiles in their plasma cirDNA.

For the prostate cancer study, which has been completed, a CpG island microarray-based scan covering 12,192 loci was performed on the methylation-enriched plasma cirDNA of prostate cancer patients (n = 20) and healthy controls (n = 20). Based on the results of the microarray data analysis, thirteen genes were selected for fine mapping using sodium bisulfite treatment and pyrosequencing, as well as replication in a second, independent sample set.

For the colorectal cancer study, which is still underway, high-resolution tiling microarrays were used to test plasma cirDNA samples from 93 colorectal cancer patients and 100 matched controls. The microarray platform was substantially expanded by using Affymetrix tiling arrays, which contained over 6.5 million probes covering the entirety of chromosomes 1 and 6. As in the prostate cancer study, the goal was to find genes and other regions that are significantly differentially methylated between colorectal cancer patients and controls, and to identify informative epigenetic markers.

2.0 MATERIALS AND METHODS

2.1 Samples

2.1.1 Prostate cancer study

The first sample set consisted of individuals with a confirmed diagnosis of prostate cancer (n = 20, ages 68.9 ± 6.2 yrs) and healthy controls (n = 20, ages 46.3 ± 6.4 yrs). Cases and controls were recruited from several hospitals in Novosibirsk, Russia. The second sample set consisted of 20 prostate cancer patients and 18 control individuals, who were diagnosed as having BPH. Cases and controls in the second sample set were matched by age (mean group ages were 68.7 ± 6.8 yrs and 69.1 ± 7.2 yrs for cases and controls, respectively). They were recruited from the Vilnius University Hospital, in Vilnius, Lithuania. All individuals across both sample sets were male Caucasians, and the tumour stage for prostate cancer patients was T2-3N0MX. All the participants provided written informed consent and the research protocol was approved by the research ethics boards of CAMH (Toronto, Canada), Vilnius University Hospital (Vilnius, Lithuania), and the Institute of Chemical Biology and Fundamental Medicine (Novosibirsk, Russia).

2.1.2 Colorectal cancer study

The sample set for the colorectal cancer study consisted of patients diagnosed with T1-4N0MX colorectal cancer (n = 93; ages 70.4 ± 12.6 yrs) and unaffected individuals (n = 100; ages 57.0 ± 8.1 yrs). There were approximately equal numbers of male and female individuals in both study groups (46 male and 47 female colorectal cancer patients; 50 male and 50 female unaffected controls). Plasma samples of the colorectal cancer patients were provided by the Ontario Tumor Bank (Toronto, ON), while samples from unaffected controls were provided by the Colon Family Registry (National Institutes of Health, USA). All participants provided written informed consent, and research protocols were approved by the research ethics boards of CAMH and the Ontario Institute for Cancer Research in Toronto, Canada.

2.2 DNA extraction

2.2.1 Prostate cancer study

Blood samples from all individuals were collected and the plasma fraction was separated by centrifugation and frozen at -80°C prior to nucleic acid extraction. Total cirDNA was isolated from 1 mL of plasma using the GF-1 Nucleic Acid Extraction Kit (Vivantis, Selangor, Malaysia) in the first sample set and QIAamp DNA Blood Mini Kit (Qiagen, Hilden, Germany) in the second sample set, according to manufacturers’ instructions. Isolated total plasma cirDNA was stored at -20°C until use.

2.2.2 Colorectal cancer study

Total cirDNA was isolated from 0.5 mL of plasma samples from cases and controls using the QIAamp Circulating Nucleic Acids Kit (Qiagen, Valencia, CA) according to the manufacturer’s instructions. CirDNA was eluted in 100 µL EB buffer and stored at -20°C until use.

2.3 DNA methylation detection

2.3.1 Principles of DNA methylation detection

The method for DNA methylation detection used in the prostate and colorectal cancer studies is based on the preferential enrichment of methylated DNA, followed by interrogation on microarray platforms (Figure 1). Briefly, adaptors were ligated to the ends of the isolated total cirDNA prior to an enzymatic digestion using methylation-sensitive enzymes, which do not cut in the presence of mC. After the enzymatic digestion, the only fragments that remain intact for subsequent PCR amplification are those that contain methylated CpG sites; hence, the methylated fraction becomes enriched. Moreover, as the PCR reaction conditions in this protocol were optimized for shorter templates, any large-size genomic DNA that might be present in the total plasma cirDNA was not amplified.

2.3.2 DNA blunting

DNA was enzymatically treated to create blunt double-stranded ends. Fifty ng of total cirDNA was incubated with 1X NEB Buffer 2, 100 µM dNTPs, 60 units of T4 DNA polymerase, and 2 ng of BSA in 112.2 µL final volume. The mixture was incubated at 12ºC for 20 minutes, then transferred to ice. Blunted DNA was purified using a standard phenol-chloroform DNA isolation with 120 µL of phenol:chloroform:isoamyl alcohol (25:24:1), followed by ethanol precipitation. After precipitation, samples were dissolved in 25 µL of water.

2.3.3 Adaptor ligation

Two types of blunt universal adaptors were prepared, each by annealing two oligonucleotides (prostate cancer study: oJW 102: GCGGTGACCCGGGAGATCTGAATTC and oJW 103: GAATTCAGATC; colorectal cancer study: RCB1: ATTTGAACCCCTTCATGGGTACCA and RCB2: TGGGGAAGTACCCATGGT). The adaptors were prepared in a reaction of 100 µL consisting of the two oligo sequences at 40 µM in 1 M Tris (pH 7.9). The solution was heated to 95ºC for 5 minutes, incubated at 70ºC for 2 minutes, cooled at 25ºC for 2 minutes, and then incubated for 16 hours at 4ºC. The annealed adaptors were stored at -20ºC prior to use. For ligation, samples were incubated at 16ºC for 8 hours in a reaction volume of 50.2 µL containing 1X NEB ligation buffer, 0.1 pmol of adaptors, and 200 units of T4 DNA ligase.
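
The annealing step relies on the shorter oligonucleotide being complementary to one end of the longer one. As an illustrative check only (not part of the laboratory protocol), the following Python sketch tests this for the oJW 102 / oJW 103 pair, using the sequences exactly as printed above. The helper names are introduced here for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def anneals_to_3prime_end(long_oligo: str, short_oligo: str) -> bool:
    """True if short_oligo is fully complementary to the 3' end of long_oligo."""
    return long_oligo.endswith(reverse_complement(short_oligo))


# Oligonucleotide sequences as printed in the text (prostate cancer study).
OJW102 = "GCGGTGACCCGGGAGATCTGAATTC"
OJW103 = "GAATTCAGATC"

print(reverse_complement(OJW103))             # GATCTGAATTC
print(anneals_to_3prime_end(OJW102, OJW103))  # True
```

The 11-mer pairs with the last 11 bases of the 25-mer, which would leave a duplex with one flush end and a 14-base single-stranded tail at the other.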

2.3.4 DNA methylation-sensitive enzyme digestion

Enzyme digestion was performed using methylation-sensitive restriction enzymes, which do not cut when the corresponding restriction sites are methylated. 10 U each of HpaII, HpyCH4IV, and HinP1, with 3 µL of 10X NEB Buffer 1, were added to one-half of the adaptor-ligated template, in a total reaction volume of 56 µL. The samples were incubated for 8 hours at 37ºC, then heated to 65ºC for 20 minutes to deactivate the enzymes.
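
The enrichment logic of this digestion can be sketched in silico. The Python example below is a simplified model, not the analysis used in this study: it assumes the standard recognition sequences of the three enzymes (CCGG for HpaII, ACGT for HpyCH4IV, GCGC for HinP1I, written HinP1 above), treats a site as protected whenever its central CpG cytosine is methylated, and ignores partial digestion. The fragment sequence and function names are hypothetical.

```python
# Recognition sequences of the three enzymes named in the text; each contains
# one central CpG, and cleavage is blocked when that CpG is methylated.
RECOGNITION_SITES = {"HpaII": "CCGG", "HpyCH4IV": "ACGT", "HinP1I": "GCGC"}


def find_sites(seq: str) -> list[tuple[str, int]]:
    """Return (enzyme, index of the CpG cytosine) for every recognition site."""
    hits = []
    for enzyme, site in RECOGNITION_SITES.items():
        start = seq.find(site)
        while start != -1:
            hits.append((enzyme, start + 1))  # the CpG cytosine is the 2nd base
            start = seq.find(site, start + 1)
    return hits


def survives_digestion(seq: str, methylated_cpgs: set[int]) -> bool:
    """A fragment stays intact only if every recognition site is methylated.

    Fragments without any site also stay intact (the non-informative fraction).
    """
    return all(cpg in methylated_cpgs for _, cpg in find_sites(seq))


# Hypothetical 24-bp fragment with one HpaII site and one HinP1I site.
fragment = "TTACCGGATTAGCGCATTTAAGGA"
print(find_sites(fragment))
print(survives_digestion(fragment, methylated_cpgs={4, 12}))  # fully methylated
print(survives_digestion(fragment, methylated_cpgs={4}))      # one site exposed
```

Only fragments that survive with adaptors on both ends can serve as templates in the subsequent adaptor-mediated PCR, which is what enriches the methylated fraction.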

2.3.5 Adaptor-mediated PCR

The digestion product was amplified under the following conditions: 1X PCR buffer, 2.8 mM MgCl2, 275 µM aminoallyl dNTP mix, 1.6 µM primer, and 25 units of Taq DNA polymerase. Amplification for the prostate cancer study samples was performed in 100 µL final volume with 25 µL of the digestion template and the oJW 102 primer. For the colorectal cancer samples, amplification was performed with the RCB2 primer in 400 µL final volume with 50 µL of the digested template, as a larger amount of PCR product was required for subsequent microarray hybridization. PCR conditions were 72ºC for 5 minutes, then 30 cycles of 95ºC for 1 minute, 94ºC for 40 seconds, and 67ºC for 2.5 minutes, followed by a final elongation stage of 72ºC for 5 minutes. PCR products were checked by 1% gel electrophoresis and purified with the Qiagen MinElute kit. DNA concentration and quality were assessed by spectrophotometry using the Nanodrop 2000, with a 260/280 nm absorbance ratio of approximately 1.8 indicating pure DNA.

Figure 1. Principle of DNA methylation detection technology in plasma cirDNA. After the enrichment of the methylated fraction of the cirDNA, PCR products will be obtained only in templates from cirDNA either containing methylated CpG positions (enriched methylated fraction) or lacking restriction sites (no informative fraction). DNA samples isolated from plasma consist of fragmented circulating DNA originating from apoptosis/necrosis in tumor cells (right) and larger size genomic DNA originating from circulating cells (i.e. lymphocytes) or other cellular sources (left). First, universal adaptors (magenta boxes) are ligated to the ends of all DNA molecules. Next, samples are digested with DNA methylation sensitive restriction enzymes. These enzymes will cut only at unmethylated CpG positions (white circles) but not in methylated CpG positions (black circles). Digested DNA is then amplified using primers that bind to the universal adaptors (green arrows). During the PCR reaction, DNA polymerase extends primers (dashed green lines) according to its processivity and the optimized reaction conditions. PCR products will be obtained only from undigested short templates that have ligated adaptors at both sides (mainly from tumour circulating DNA). In longer templates (which are expected from genomic DNA), the DNA polymerase cannot extend primers in the distance between 5’ and 3’ adaptors and therefore they will not be amplified.

2.4 Microarray experiments and data analysis

2.4.1 Prostate cancer study

2.4.1.1 Microarrays

The microarray experiments performed on the prostate cancer sample set utilized a two-channel, reference-pool-based experimental design. In this design, DNA from individual samples is labelled with the fluorescent dye Cy3, and DNA from a common reference pool is labelled with Cy5. After hybridization to a single microarray, the difference between the hybridization intensities can be used to determine methylation differences, and comparing relative changes in each sample against the common reference pool allows the comparison of samples across arrays [133].

For the reference pool, DNA was extracted from the white blood cells of 20 individuals (13 females, 7 males) who were unrelated to this project. Participants were recruited from CAMH in Toronto, Canada, and provided informed consent. DNA was extracted using phenol-chloroform isolation followed by ethanol precipitation. Isolated DNA was pooled and sheared by sonication to 200-500 bp fragments, prior to undergoing the methylation detection protocol.

Two technical replicates of microarrays were used for all patients and controls. 1.5 μg of purified DNA was labeled using Cy3 for the cirDNA samples and Cy5 for the reference pool, and was hybridized to University Health Network HCGI12K arrays containing 12,192 probes [134].

2.4.1.2 Microarray data analysis

Hybridized microarrays were scanned using the Axon 4000B scanner, and signals were obtained using the GenePix Pro software version 6.1.0.4. The intensity values that were obtained underwent extensive quality control. Data were corrected for background noise and normalized using a variance stabilization and normalization method, and raw p-values were obtained from a moderated t-statistic [135]. The raw p-values were corrected for multiple testing using false discovery rates (FDR), to yield FDR-adjusted p-values. After normalization, microarray signals were calculated for each probe by subtracting the signal intensities obtained from the cirDNA sample (Cy3) from the ones obtained from the reference pool (Cy5). Coefficients were calculated as the ratio of methylation between prostate cancer and control individuals, with positive values signifying hypermethylation and negative values indicating hypomethylation in prostate cancer.
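
To illustrate how raw p-values become FDR-adjusted p-values, the Python sketch below implements the Benjamini-Hochberg step-up adjustment. This is an assumption made for illustration: the text states only that false discovery rates were used and does not name the specific procedure, and the analysis itself was not run with this code. The p-values shown are hypothetical.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg FDR-adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.001, 0.008, 0.039, 0.041, 0.60]  # hypothetical raw p-values
adjusted = benjamini_hochberg(raw)
for p, q in zip(raw, adjusted):
    print(f"raw p = {p:.3f}  FDR-adjusted p = {q:.5f}")
```

In this toy example the third and fourth probes have raw p-values below 0.05 but adjusted values just above it, so they would not be reported as significant at an FDR threshold of 0.05.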

2.4.2 Colorectal cancer study

2.4.2.1 Microarrays

Samples from the colorectal cancer study set were hybridized to Affymetrix GeneChip® Human Tiling 2.0R A arrays, which have over 6.5 million probes covering chromosomes 1 and 6 at a 35-bp resolution (probes are 25 bp long with a 10 bp gap). Affymetrix microarrays use a single-channel system with only one fluorophore. The design enables the hybridization of a single sample to each microarray, after which signal intensities are compared across microarrays and between different sample groups [133]. 9 µg of purified cirDNA amplification product was used for DNA fragmentation, labelling and hybridization experiments, according to Affymetrix protocols. The GeneChip® Fluidics Station 450 and GeneChip® Operating Software were used.
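
The 35-bp resolution follows directly from the probe length and gap stated above (25 + 10 bp). The following Python sketch lays out probes over a region under that idealized spacing; the region coordinates are hypothetical, and actual probe placement on the array may deviate from a perfectly regular grid.

```python
PROBE_LENGTH = 25  # bp, from the text
GAP = 10           # bp between consecutive probes, from the text
STEP = PROBE_LENGTH + GAP  # 35-bp resolution


def probe_intervals(region_start: int, region_end: int) -> list[tuple[int, int]]:
    """Half-open [start, end) intervals of probes tiled across a region."""
    return [
        (start, start + PROBE_LENGTH)
        for start in range(region_start, region_end - PROBE_LENGTH + 1, STEP)
    ]


# Hypothetical 100-bp region starting at coordinate 1000.
print(probe_intervals(1000, 1100))
```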

2.4.2.2 Microarray data analysis

Hybridized microarrays were scanned using the Affymetrix GeneChip Scanner 3000 and CEL files were generated. Extensive data normalization was performed using the Affytiling and limma packages implemented in the Bioconductor suite (www.bioconductor.org). The raw intensities were corrected for background noise and then underwent quantile normalization [136]. Normalized intensity values for each probe were corrected for multiple testing by local FDR.
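
Quantile normalization forces the intensity distribution of every array to be identical, so that remaining differences at a given probe reflect rank changes rather than array-wide effects. The Python sketch below shows the basic procedure on hypothetical values. It is a minimal illustration, not the Bioconductor implementation used in this study, and it breaks ties by probe order rather than averaging them.

```python
def quantile_normalize(arrays: list[list[float]]) -> list[list[float]]:
    """Force every array (one list per microarray) onto the same distribution.

    Ties within an array are broken by probe order, a simplification.
    """
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must contain the same probes")
    # Reference distribution: mean of the k-th smallest value across arrays.
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [
        sum(col[k] for col in sorted_arrays) / len(arrays) for k in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        ranks = sorted(range(n_probes), key=lambda i: a[i])  # probe indices by value
        out = [0.0] * n_probes
        for k, probe in enumerate(ranks):
            out[probe] = reference[k]
        normalized.append(out)
    return normalized


# Three hypothetical arrays with four probes each.
arrays = [[5.0, 2.0, 3.0, 4.0], [4.0, 1.0, 4.5, 2.0], [3.0, 4.0, 6.0, 8.0]]
normalized = quantile_normalize(arrays)
for row in normalized:
    print([round(v, 3) for v in row])
```

After normalization, each array contains the same set of values, and only the assignment of those values to probes differs between arrays.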

2.5 Fine mapping of individual CpG locations

2.5.1 Principle of fine mapping

Fine mapping of individual CpG sites relies on the selective conversion of unmethylated cytosine to uracil after treatment with sodium bisulfite, while methylated cytosines remain unchanged. After PCR amplification, uracil is amplified as thymine while mC remains as cytosine, thereby allowing the detection of methylation at single-base-pair resolution. Thirteen loci were selected from the microarray experiment for further analysis and fine mapping of methylated cytosines. They were TRK-fused gene (TFG), atonal homolog 8 (ATOH8), SIX homeobox 3 (SIX3), NudC domain containing 3 (NUDCD3), protocadherin beta 1 (PCDHB1), KIAA1539 protein (KIAA1539), ring finger protein 219 (RNF219), heparanase 2 (HPSE2), discs large homolog 2 (DLG2), guanine nucleotide binding protein (G protein) gamma 7 (GNG7), core-binding factor, runt domain alpha subunit 2 translocated to 2 (CBFA2T2), zinc finger CCCH-type containing 4 (ZC3H4) and ArfGAP with GTPase domain ankyrin repeat and PH domain 1 (AGAP1; listed under its alias CENTG2 in Tables 1 and 2).
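
The conversion principle described above can be illustrated with a short in silico example. The Python sketch below models only the top strand and assumes complete conversion of unmethylated cytosines; the sequence, the methylated position, and the function names are hypothetical.

```python
def bisulfite_convert(seq: str, methylated: set[int]) -> str:
    """Sequence read after bisulfite treatment and PCR (top strand only).

    Unmethylated C -> U -> read as T; methylated C (index in `methylated`) stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def methylation_calls(original: str, converted: str) -> dict[int, bool]:
    """For every CpG cytosine in `original`, report whether it stayed C."""
    return {
        i: converted[i] == "C"
        for i in range(len(original) - 1)
        if original[i : i + 2] == "CG"
    }


# Hypothetical sequence with CpGs at indices 2 and 7; only the first is methylated.
template = "ATCGTACCGA"
converted = bisulfite_convert(template, methylated={2})
print(converted)                               # ATCGTATTGA
print(methylation_calls(template, converted))  # {2: True, 7: False}
```

Comparing the converted read with the reference sequence therefore reveals the methylation state of each CpG, while non-CpG cytosines (index 6 here) serve as a check on conversion.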

2.5.2 Bisulfite treatment and whole bisulfitome amplification

In this study, the remaining cirDNA from the samples used in the microarray studies was bisulfite-treated using the Qiagen Epitect kit. In order to enable the amplification of minimal amounts of template, bisulfite-treated cirDNA was amplified using the Epitect Whole Bisulfitome kit (Qiagen, Mississauga, ON), according to the manufacturer’s instructions. This kit allows the whole genome amplification of bisulfite-treated DNA (whole bisulfitome amplification).

2.5.3 Nested PCR

Nested PCR was used for single locus amplification, as it allows for greater yield of amplification product while reducing the chances of contamination. The technique involves two successive runs of PCR, with the second reaction amplifying a target within the amplicons from the first reaction. Briefly, 5 µL of bisulfite-treated and whole bisulfitome amplified-DNA was added to a reaction mix containing 1X Hotstart PCR buffer with 1.5mM MgCl2, 120 nM specific primers, 200 nM dNTPs and 0.65 units of Hotstart Taq polymerase. The DNA was first amplified in 10 cycles with an external primer, then in 40 cycles with an internal primer. The list of primers that were used for each gene can be found in Tables 1 and 2. The reaction conditions were 95°C for 15 minutes, followed by the appropriate number of cycles of 95°C for 1 min, 55°C for 45 sec, 72°C for 1 minute, and a final extension step of 72°C for 10 minutes.

Table 1. Primers used for amplification of the external locus in nested PCR.

External Fragment
Locus     Forward Primer              Reverse Primer
ATOH8     AGGAGGTAGGTTTTGGGTTAAG      CCTCCCTCTCTCCCTTTCT
CBFA2T2   TAAAAATATTTTGAGTTAGGGGGTT   CAAAACCAACTCCCATTAAACAC
CENTG2    TTGTATGAGATATTGAGAGTATTAT   ACAAAAAAAACCTATACCCTCTAA
DLG2      TGGGGGGTTTAAGTTTTTTTGAT     CCTCTTAAACTCTCTCTTCAAAAT
GNG7      TTGGTTGTTTTTGAGGTTGGGT      AAACCCCTTACAAAAAAAATAAACT
HPSE2     TAGTAGAGATAGGGTTTTTTTATGT   TCTTTCACTAATTATCCTCCACA
KIAA1539  TTAAAGGAGGAAGGAGGAGATA      AAACCCTCAAACTAATAACTTTAAC
NUDCD3    TAGGGTTATTTTTTAGGTTTAGGTA   TTTCTAAATAAACCCCTAACAAACT
PCDHB1    AAGTATGTGATTAAGTGGATATTTA   AACTCCTAACCTCAAATAATCT
RNF219    GTTATATTTGTTTGGGGAAGGTAA    ACCCAAATAAATCCATTAATCA
SIX3      TTGTTAGTTTTTTTGTTGGGAGAAAT  ACTTTCCCACCCCAACCCTA
TFG       GTTTTAAATTTTTTGAGAGTTGGTT   AAATAATTCACCCCCATTCCTA
ZC3H4     GATTTGAGGGAGAGAGGGAA        ACCTTCAACTCTTTCTAACTCTC

Table 2. Primers used for amplification of the internal locus in nested PCR. The prefix B- denotes a biotin-labelled primer.

Internal Fragment
Locus     Forward Primer               Reverse Primer
ATOH8     B-ATTGGGTTTTTGTGTAAATTGAGG   CTACCTCCTTACCAACATTTCT
CBFA2T2   B-GTTTGTATTTGGAGAATTTAGGTG   TATAACCAAAACAATAACCCAAACT
CENTG2    B-TTGGGATGAGGTAAAAAATAGA     ACACACACTCAAACAAATAACTAAA
DLG2      B-GTTGTTTGGGAATGTAGTTTAAA    TCAAAATTCTTTTCAACTTTCCCT
GNG7      B-GGGTTTTTTAGTTTGAGTTTTTAGT  TACCACCTCTATATAATCTACCA
HPSE2     B-GTGTTGGGATTGTAGGTATGA      AACACTAAATTTAACAACTATCTAC
KIAA1539  B-AGGAAGGAGGAGATAAAGTGAT     CCCCTCTAAACTTATCATCACA
NUDCD3    B-AGGGAGAATAGTTTTAGTTTTGTT   ATAAAAATATAACCACCCTCAAAC
PCDHB1    B-TTGTTGTGTTTATATAATATTGAAA  TAATCTCCCCACCTTAACCT
RNF219    B-GTGATTGTGGGTATAGTTATAAAAT  ACTACCCCCATCTCCCAAAA
SIX3      B-AGTAGAAATTTTTAGAGGAAGTTAA  TTCCCACCCCAACCCTAAA
TFG       B-TTTTGAGAGTTGGTTGTAGTAGA    TCAACTATTACTACAATAATCAACA
ZC3H4     B-GAGGAAGGGTGAGATGGGA        ATCTCTACCCCTCTCCTACA

2.5.4 Pyrosequencing

Pyrosequencing technology was used for site-specific DNA methylation analysis. Briefly, this technology involves a single-stranded biotinylated PCR fragment that is hybridized to a sequencing primer in a reaction containing DNA polymerase, ATP sulfurylase, luciferase and apyrase. When a dNTP complementary to the next base in the template is added to the reaction, its incorporation is catalyzed by DNA polymerase and this causes the release of pyrophosphate (PPi) in amounts equivalent to the number of added dNTPs. ATP sulfurylase converts PPi to ATP, which drives the generation of light by luciferase. This light is then detected by a camera and displayed as a peak on a Pyrogram [137]. PCR products were analyzed by pyrosequencing using a Qiagen PyroMark Q24 according to the manufacturer’s standard protocol. Methylation values at single CG positions were assessed using the PyroMark Q24 1.0.10 software. The primers used for pyrosequencing were the reverse primers used for amplification of the internal fragments in nested PCR (Table 2).
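
Because peak height is proportional to the number of nucleotides incorporated, the methylation level at a CpG position can be expressed as the share of signal coming from the methylated allele. The Python sketch below shows this commonly used ratio as an assumption; the exact calculation performed by the PyroMark software is not described here, and the peak heights are hypothetical.

```python
def percent_methylation(methylated_peak: float, unmethylated_peak: float) -> float:
    """Percentage of signal at one CpG position attributable to the methylated allele.

    The two arguments are the peak heights of the nucleotide reporting the
    methylated state and the nucleotide reporting the converted (unmethylated)
    state at the same position.
    """
    total = methylated_peak + unmethylated_peak
    if total <= 0:
        raise ValueError("no signal at this position")
    return 100.0 * methylated_peak / total


# Hypothetical peak heights (arbitrary light units) at a single CpG.
print(percent_methylation(30.0, 10.0))
```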

3.0 RESULTS

3.1 Microarray methylation analysis in the circulating DNA of prostate cancer

The methylation profiles of the plasma cirDNA of 40 individuals (20 with prostate cancer and 20 healthy controls) were interrogated through a microarray scan covering 12,192 regions. The experiment was based on a double-channel, reference pool design that hybridized the sample DNA and the reference DNA to a single microarray. The raw data were corrected for background noise, normalized within and between arrays, and FDR-corrected for multiple testing. The FDR was used in this study as it is less conservative than family-wise error rate corrections such as the Bonferroni method. Normalized signal intensities were then compared against the reference pool between the prostate cancer and control samples. As the profiling technology was based on enrichment of methylated DNA, higher intensity signals corresponded to higher levels of methylation.

In total, 197 regions were found to exhibit significant methylation differences between prostate cancer and controls, with FDR-adjusted p < 0.05. Figure 2 shows a volcano plot that compares the FDR-adjusted p-values against the differential methylation between prostate cancer patients and healthy controls across all the interrogated sites. Of the loci that reached statistical significance, 20 showed increased and 177 showed decreased DNA methylation in the prostate cancer samples. Of the 197 regions displaying significant differential methylation, 133 could be mapped to genomic positions. The remainder could not be mapped, as the probes for the microarrays were generated from a CpG island library for which the remapping of probes is not yet complete [134]. Of the 133 mapped loci, 85 corresponded to repetitive elements; 79 of these showed decreased methylation, while 6 showed increased methylation in prostate cancer compared to controls (Table 3). The remaining 48 significantly differentially methylated loci represented unique genomic regions. Of these, 6 and 42 showed increased and decreased cirDNA methylation in prostate cancer, respectively (Table 4).
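
The classification of loci described above depends on two quantities per probe: the sign of the coefficient and the FDR-adjusted p-value. The Python sketch below applies that rule to the first three rows of Table 3 and computes the –log10 value plotted on the Y-axis of Figure 2. The function name is introduced for illustration, and this is not the code used to produce the reported results.

```python
import math

FDR_CUTOFF = 0.05  # significance threshold used in the text


def classify(coefficient: float, fdr_p: float) -> str:
    """Label a probe by significance and direction of change in prostate cancer."""
    if fdr_p >= FDR_CUTOFF:
        return "not significant"
    return "increased" if coefficient > 0 else "decreased"


# First three rows of Table 3: (probe ID, coefficient, FDR-adjusted p-value).
rows = [
    ("UHNhscpg0001663", -0.776908816, 0.006432242),
    ("UHNhscpg0001649", -0.854934588, 0.006432242),
    ("UHNhscpg0001995", -0.838656678, 0.006432242),
]
for probe, coeff, fdr_p in rows:
    print(probe, classify(coeff, fdr_p), round(-math.log10(fdr_p), 2))

# Height of the horizontal cutoff line in the volcano plot.
print(round(-math.log10(FDR_CUTOFF), 2))
```

All three probes fall in the "decreased" group, consistent with the predominance of reduced methylation among the significant loci.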

Figure 2. Volcano plot of microarray data in prostate cancer and control samples using FDR-adjusted p-value as statistics. 197 regions were significantly differentially methylated between the groups (FDR-adjusted p-value < 0.05). The X-axis represents DNA methylation differences between groups, with coefficients expressed in a log2 scale. Samples with increased microarray signals in prostate cancer and control individuals had positive and negative coefficients, respectively. The Y-axis represents -log10-transformed p-values adjusted by multiple test correction. The number of counts represented by each point in the plot is shown in a color gradient, from light gray to black (representing 1 and 600 probes, respectively). The horizontal red line depicts the cutoff value for adjusted p-value (-log10(0.05) = 1.3). The vertical red line depicts the 0 value, i.e. no differences in DNA methylation between cases and controls.
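The coordinates and cut-offs described in the Figure 2 caption can be sketched in a few lines of Python. The function name `volcano_point` and the third, non-significant probe are hypothetical; the first two probes use the coefficient and FDR-adjusted p-value listed for them in Table 3.

```python
import math

ALPHA = 0.05
CUTOFF = -math.log10(ALPHA)  # about 1.30, the horizontal line in the plot


def volcano_point(coeff, fdr_p):
    """Map one probe to (x, y, call) on the volcano plot.

    x is the log2 coefficient (prostate cancer vs. control) and
    y is the -log10 of the FDR-adjusted p-value.
    """
    y = -math.log10(fdr_p)
    if fdr_p >= ALPHA:
        call = "not significant"
    elif coeff > 0:
        call = "higher in prostate cancer"
    else:
        call = "lower in prostate cancer"
    return coeff, y, call


probes = {
    "UHNhscpg0001663": (-0.776908816, 0.006432242),  # from Table 3
    "UHNhscpg0003296": (0.242600964, 0.019410882),   # from Table 3
    "hypothetical_probe": (0.10, 0.40),              # illustrative only
}
calls = {pid: volcano_point(c, p) for pid, (c, p) in probes.items()}
```

Points above the horizontal cut-off and to the left of the vertical zero line correspond to loci with significantly lower methylation signal in prostate cancer.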

Table 3. Loci that were significantly differentially methylated and mapped to repetitive elements, in order of significance.

Probe ID         Coeff.CAP/HEA  FDR P-value  Genome Location
UHNhscpg0001663  -0.776908816   0.006432242  chr10:38815228-39143413
UHNhscpg0001649  -0.854934588   0.006432242  chr10:42358712-42372843
UHNhscpg0001995  -0.838656678   0.006432242  chr10:42366914-42809822
UHNhscpg0009381  -0.456758451   0.006432242  chr18:24885626-24885931
UHNhscpg0002087  -0.756679085   0.006432242  chr2:63285911-63286740
UHNhscpg0008433  -0.908690792   0.006432242  chr4:49126830-49133776
UHNhscpg0008390  -0.750966962   0.006432242  chr5:140438825-140439039
UHNhscpg0006245  -0.274639503   0.006432242  chr7:44421116-44421769
UHNhscpg0001923  -0.678400414   0.006432242  chr9:35116280-35116819
UHNhscpg0001242  -0.851338397   0.00735105   chr4:49115772-49151756
UHNhscpg0001739  -0.87691482    0.007443746  chr10:42355056-42379972
UHNhscpg0003307  -0.253907813   0.007443746  chr12:100741038-100741172
UHNhscpg0002784  -0.859337309   0.007443746  chr7:61739508-61780443
UHNhscpg0004775  -0.880715886   0.007524734  chr10:42355036-42378863
UHNhscpg0001830  -0.870410508   0.007738687  chr10:42358712-42372843
UHNhscpg0007070  -0.874818478   0.00781097   chr10:38779672-38803046
UHNhscpg0007161  -0.508694665   0.00781097   chr10:38779672-38803046
UHNhscpg0002242  -0.844241141   0.00781097   chr10:42363430-42377935
UHNhscpg0000727  -0.991126733   0.00781097   chr10:42366279-42380277
UHNhscpg0001763  -0.856964563   0.00781097   chrY:13659134-13822521
UHNhscpg0002651  -0.837069895   0.008139026  chrY:58835946-58836139
UHNhscpg0000996  -1.023410867   0.008800031  chr10:42355136-42380328
UHNhscpg0005339  -0.776831569   0.008800031  chr10:42358812-42792009
UHNhscpg0009093  -0.819173232   0.008800031  chr10:42366914-42377710
UHNhscpg0001676  -0.816426023   0.009018402  chrY:13659134-13822521
UHNhscpg0001927  -0.870154957   0.009056594  chr4:49111868-49141790
UHNhscpg0001086  -0.952173533   0.009526305  chr10:42355136-42380328
UHNhscpg0001583  -0.889930823   0.009526305  chrY:13659134-13822521
UHNhscpg0001904  -0.770090381   0.009526305  chrY:58821881-58822367
UHNhscpg0002870  -0.504833287   0.010101982  chr8:58055777-58056307
UHNhscpg0001148  -0.925040661   0.011048652  chr4:49115772-49151756
UHNhscpg0001493  -0.861348633   0.011048652  chrY:13659134-13822521
UHNhscpg0002530  -0.69182099    0.012316297  chr11:66858359-66858543
UHNhscpg0001353  -0.811335918   0.012316297  chr7:27274952-27275734
UHNhscpg0002871  -0.67737808    0.012337815  chr7:61739508-61780443
UHNhscpg0002028  -0.460800671   0.012636124  chr6:138297562-138297805
UHNhscpg0000955  -0.874121343   0.014605895  chr10:42359650-42364767
UHNhscpg0000861  -0.775928698   0.015797721  chr10:42359650-42364767
UHNhscpg0002186  -0.794309878   0.015843303  chr16:46407279-46407971
UHNhscpg0005209  -0.819487548   0.015863217  chr10:42355036-42378863
UHNhscpg0000711  -0.734897069   0.016944462  chr16:46432719-46432999
UHNhscpg0003296  0.242600964    0.019410882  chr15:62682027-62683106
UHNhscpg0002624  -0.621745149   0.022185929  chr10:42363430-42377935
UHNhscpg0006803  -0.780366935   0.022185929  chr4:49106722-49146860
UHNhscpg0004509  0.465964309    0.024100801  chr1:143283172-143283376
UHNhscpg0003055  -0.690916263   0.024853989  chr10:42355176-42364767
UHNhscpg0010932  -0.242620904   0.025304622  chr3:174034864-174035019
UHNhscpg0001902  -0.671777288   0.025853566  chr10:42366914-42809822
UHNhscpg0005992  0.319287199    0.026092441  chr8:112856178-112856364
UHNhscpg0002558  -0.717472296   0.026092441  chrY:58835946-58836139
UHNhscpg0001752  -0.762578727   0.026292003  chr10:38872463-39135756
UHNhscpg0003146  -0.455937445   0.0270572    chr20:40321718-40322050
UHNhscpg0002938  -0.381101596   0.02838984   chr11:67739610-67740216
UHNhscpg0002121  -0.783775314   0.029886109  chr10:38815228-39143413
UHNhscpg0007442  0.4886568      0.031972991  chrY:28688083-28688251
UHNhscpg0003836  -0.337604756   0.032955276  chr12:121834388-121834627
UHNhscpg0003053  -0.324050329   0.033304712  chr11:118229643-118229956
UHNhscpg0002434  -0.570556705   0.036129219  chr10:42363430-42377935
UHNhscpg0002898  -0.450567595   0.036412747  chr10:42361241-42377935
UHNhscpg0001758  -0.284882417   0.037910509  chr9:115826381-115826844
UHNhscpg0002416  -0.393268049   0.03841807   chr13:100629915-100630525
UHNhscpg0000906  -0.755281284   0.03841807   chr16:46407279-46407971
UHNhscpg0005173  -0.451169711   0.03841807   chr18:69113910-69114008
UHNhscpg0011873  0.405059642    0.03841807   chr3:146513114-146513274
UHNhscpg0007050  -0.38012713    0.03841807   chr9:93144582-93144744
UHNhscpg0000365  -0.590624073   0.040211141  chr6:43682298-43682469
UHNhscpg0005686  -0.378026672   0.040211141  chr9:38481830-38482064
UHNhscpg0005550  -0.346576083   0.04170891   chr12:117292033-117292256
UHNhscpg0001579  -0.695937102   0.04271975   chr16:46429029-46429379
UHNhscpg0001000  -0.815391933   0.043763049  chr16:46407279-46407971
UHNhscpg0004767  -0.380561086   0.044168449  chr10:37804599-37805023
UHNhscpg0011677  -0.26341298    0.044168449  chr12:70518596-70518669
UHNhscpg0009063  -0.803540314   0.044168449  chr7:61745826-61784764
UHNhscpg0003177  -0.713268845   0.044973143  chr4:49120544-49145894
UHNhscpg0001821  -0.742585065   0.04523539   chr16:46407279-46407971
UHNhscpg0007350  0.320227819    0.046272392  chr4:65896113-65896293
UHNhscpg0000534  -0.247494501   0.046293452  chr10:6389195-6389913
UHNhscpg0000903  -0.330321424   0.046293452  chr6:49500311-49500714
UHNhscpg0011652  -0.215532108   0.046293452  chr8:127012853-127012909
UHNhscpg0001761  -0.748337915   0.046896733  chr16:46429029-46429379
UHNhscpg0002783  -0.651936342   0.047694522  chr7:61739508-61780443
UHNhscpg0002689  -0.742688217   0.047722832  chr7:61739508-61780437
UHNhscpg0002528  -0.589377455   0.048931951  chr10:42366614-42379868
UHNhscpg0002209  -0.849917654   0.049874637  chr10:38815228-39143413

Table 4. Loci that were significantly differentially methylated and mapped to unique sequences, in order of significance.

Probe ID         Coeff.CAP/HEA  FDR P-value  Genome Location            Associated Gene
UHNhscpg0004787  -0.688758593   0.006432242  chr12:12509906-12510708    LOH12CR2
UHNhscpg0007302  -0.554141456   0.006432242  chr12:53835864-53836647    PRR13
UHNhscpg0001950  -0.529255765   0.006432242  chr13:79233005-79233425    RNF219
UHNhscpg0001236  -0.271519549   0.006531116  chr20:35373931-35375109    NDRG3
UHNhscpg0001274  -0.734001686   0.00735105   chr11:124736261-124736800  ROBO3
UHNhscpg0003054  -0.394771176   0.007766859  chr3:186284800-186285518   TBCCD1
UHNhscpg0001954  -0.602254731   0.00781097   chrM:7586-8094             X64709
UHNhscpg0002867  -0.329073452   0.008139026  chr3:137892966-137893726   DBR1
UHNhscpg0001996  -0.455359622   0.009526305  chr11:27528310-27528744    LIN7C
UHNhscpg0005678  -0.306123084   0.009526305  chr22:38668265-38669098    TMEM184B
UHNhscpg0007291  -0.60835814    0.010101982  chr14:24615485-24616644    PSME2
UHNhscpg0008376  -0.747900752   0.012316297  chr11:64001388-64001774    DNAJC4
UHNhscpg0000695  -0.612009867   0.012960561  chr11:89956481-89956761    CHORDC1
UHNhscpg0001011  -0.946719818   0.014180023  chr16:46654601-46655581    SHCBP1
UHNhscpg0003048  -0.323606546   0.014612427  chr3:42001551-42001688     ULK4
UHNhscpg0007292  -0.48117522    0.014631291  chrX:6144076-6144388       NLGN4X
UHNhscpg0002041  -0.292466982   0.015797721  chr13:79233005-79233425    RNF219
UHNhscpg0001182  -0.592990551   0.015797721  chrM:7586-8094             X64709
UHNhscpg0002136  -0.43921499    0.016135905  chrM:7586-8094             X64709
UHNhscpg0006471  -0.257146857   0.023215171  chr2:37102227-37102496     STRN
UHNhscpg0001481  -0.305421729   0.023385351  chr6:163554934-163555456   PACRG
UHNhscpg0007075  0.251009473    0.025304622  chr14:31495719-31496005    AP4S1
UHNhscpg0010908  -0.203544084   0.025853566  chr1:180736035-180736254   XPR1
UHNhscpg0010014  0.295689537    0.026092441  chr6:100055498-100056055   PRDM13
UHNhscpg0002903  -0.421631986   0.028248777  chr18:20513364-20514287    RBBP8
UHNhscpg0002924  -0.541556081   0.029284067  chr18:20513364-20514288    RBBP8
UHNhscpg0001597  -0.495997757   0.033304712  chr13:27825245-27825856    RPL21
UHNhscpg0003842  -0.329591486   0.035651189  chr1:24841149-24841426     RCAN3
UHNhscpg0008412  -0.204755641   0.035651189  chr3:39447768-39448654     RPSA
UHNhscpg0002181  0.350544522    0.035651189  chr8:132916065-132916782   EFR3A
UHNhscpg0002006  -0.562472341   0.03841807   chr1:68516009-68516544     DIRAS3
UHNhscpg0003529  -0.410797671   0.04170891   chr10:100819518-100819599  HPSE2
UHNhscpg0003334  -0.340595815   0.04170891   chr10:99046266-99046411    ARHGAP19
UHNhscpg0011811  -0.515863226   0.043763049  chr2:201390913-201391243   SGOL2
UHNhscpg0009080  0.173667806    0.043763049  chr19:58446151-58446732    ZNF418
UHNhscpg0009351  -0.329435155   0.044168449  chr10:126137986-126138206  NKX1-2
UHNhscpg0002716  -0.536877928   0.045105824  chr3:10206365-10206999     IRAK2
UHNhscpg0002037  -0.466971356   0.045105824  chr16:14165558-14166308    MKL2
UHNhscpg0003288  -0.258991491   0.04523539   chr2:37304202-37304275     HEATR5B
UHNhscpg0010678  -0.328013347   0.046006946  chr5:126277183-126277247   MARCH3
UHNhscpg0010769  -0.373540101   0.046272392  chr7:124793521-124793834   AX746567
UHNhscpg0010847  -0.318473882   0.046272392  chr16:31105339-31106359    POL3S
UHNhscpg0000800  -0.549803834   0.046272392  chr20:32250745-32251549    C20orf144
UHNhscpg0002199  -0.557366205   0.046272392  chr20:56726209-56726447    C20orf85
UHNhscpg0005399  -0.474548861   0.046293452  chr10:94333594-94334436    IDE
UHNhscpg0001947  -0.526460487   0.047694522  chr16:14165558-14166308    MKL2
UHNhscpg0002962  0.422261394    0.048800768  chr19:2620935-2621226      GNG7
UHNhscpg0006273  0.195407654    0.048829155  chr15:63673617-63673868    CA12

3.2 Verification of microarray findings by fine mapping of cytosines on selected genes

To verify the microarray results, fine mapping of methylated cytosines using bisulfite treatment followed by pyrosequencing was performed. This was done on the same samples that were used for the microarrays as well as in an independent sample set. Thirteen genes showing differential methylation between prostate cancer patients and controls were chosen. These genes were selected on the basis of the statistical significance of the differences between cancer patients and control individuals in the microarray experiment, as well as on whether they were previously reported as relevant for tumour initiation and/or progression. The loci were mapped within the gene or in immediate upstream/downstream regions of the genes (Table 5).

Table 5. Loci selected for fine mapping of methylated CpG positions.

Candidate probe ID  Probe length  Associated gene  Position       Distance to TSS  CpG island  # CG positions  Restriction sites
UHNhscpg0008844     650 bp        TFG              Exon 1         175              Yes         2               HpyCH4IV
UHNhscpg0006615     277 bp        ATOH8            Exon 1         2                Yes         3               HinP1
UHNhscpg0002846     546 bp        SIX3             5' Intergenic  11558            No          4               HpaII
UHNhscpg0010587     402 bp        CBFA2T2          5'-UTR         237              Yes         9               HinP1, HpaII
UHNhscpg0001363     656 bp        ZC3H4            Intron 1       523              Yes         4               HpyCH4IV
UHNhscpg0006245     784 bp        NUDCD3           3' Intergenic  108995           No          3               HinP1
UHNhscpg0008390     257 bp        PCDHB1           3' Intergenic  8166             No          1               HpaII
UHNhscpg0001923     646 bp        KIAA1539         3' Intergenic  12202            Yes         5               HpaII, HpyCH4IV
UHNhscpg0001950     505 bp        RNF219           Intron 1       138              Yes         17              HinP1, HpaII
UHNhscpg0003529     98 bp         HPSE2            Intron 3       175449           No          1               HinP1
UHNhscpg0006294     67 bp         DLG2             Intron 2       143170           No          5               HpaII
UHNhscpg0002962     350 bp        GNG7             Intron 2       25281            No          3               HinP1, HpyCH4IV
UHNhscpg0002775     500 bp        AGAP1            Intron 15      539426           No          1               HpyCH4IV

3.2.1 Genes showing statistically significant differential methylation by pyrosequencing

Three genes showed statistically significant differential methylation between the prostate cancer and control samples. The results displayed are those obtained from the combined sample set. RNF219 encodes ring finger protein 219, a member of the RING-type zinc finger protein family [138]. Microarray data showed RNF219 to be hypomethylated in prostate cancer relative to control samples. Figure 3 shows the pyrosequencing results, which found this locus to be significantly hypomethylated in prostate cancer across all CpG sites analyzed (p = 0.0012, Mann-Whitney test). SIX3 encodes a homeobox-containing transcription factor that plays an important role in cell proliferation [139]. The microarray analysis showed SIX3 to be hypomethylated in prostate cancer, which was confirmed by pyrosequencing (p = 0.003, Mann-Whitney test) (Figure 4). KIAA1539 is a gene whose protein product is currently uncharacterized. Microarray data showed hypomethylation of this locus in prostate cancer, which was also confirmed by pyrosequencing (p = 0.02, Mann-Whitney test) (Figure 5).

Sample results from pyrosequencing of the remaining loci are presented in Figure 6. The methylation values obtained from these assays did not achieve significance when compared between prostate cancer and control samples.
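The Mann-Whitney comparisons reported in this section can be illustrated with a minimal Python sketch. The methylation percentages below are hypothetical, the function name is chosen for illustration, and the p-value relies on the normal approximation without continuity correction, so it is not necessarily how the reported values were computed.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Returns (U statistic for x, p-value). Tied values receive midranks
    and the variance includes the standard tie correction.
    """
    # Pool both groups, tagging each value with its group (0 = x, 1 = y).
    pooled = sorted((v, g) for g, grp in enumerate((x, y)) for v in grp)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                    # size of this group of tied values
        midrank = (i + 1 + j) / 2    # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u_x, p


# Hypothetical mean methylation percentages, for illustration only.
cases = [12, 20, 25, 31, 35, 38, 44, 52]
controls = [33, 41, 47, 55, 58, 63, 70, 76]
u, p = mann_whitney_u(cases, controls)
```

Because the test uses ranks rather than raw values, it makes no assumption that methylation percentages are normally distributed. Statistical packages typically offer exact p-values for small samples, which can differ slightly from this approximation.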

[Figure 3 plot: RNF219. Y-axis: % methylation; X-axis: CpG sites 1-17.]

Figure 3. Pyrosequencing results showing methylation status in CpG sites of RNF219 in prostate cancer and control samples. The Y-axis represents the percentage of mean methylation and the X-axis refers to the CpG sites that were being analyzed. Methylation at RNF219 was significantly different between prostate cancer and control samples (p = 0.0012, Mann-Whitney test).

[Figure 4 plot: SIX3. Y-axis: % methylation; X-axis: CpG sites 1-4.]

Figure 4. Pyrosequencing results showing methylation status in CpG sites of SIX3 in prostate cancer and control samples. The Y-axis represents the percentage of mean methylation and the X-axis refers to the CpG sites that were being analyzed. Methylation at SIX3 was significantly different between prostate cancer and control samples (p = 0.003, Mann-Whitney test).

[Figure 5 plot: KIAA1539. Y-axis: % methylation; X-axis: CpG sites 1-5.]

Figure 5. Pyrosequencing results showing methylation status in CpG sites of KIAA1539 in prostate cancer and control samples. The Y-axis represents the percentage of mean methylation and the X-axis refers to the CpG sites that were being analyzed. Methylation at KIAA1539 was significantly different between prostate cancer and control samples (p = 0.02, Mann-Whitney test).

[Figure 6 plots: panels AGAP1, ATOH8, CBFA2T2 and DLG2. Y-axis: % methylation; X-axis: CpG sites.]

Figure 6. Sample pyrosequencing results showing methylation status of loci in prostate cancer and control samples. The Y-axis represents the percentage of methylation and the X-axis refers to the CpG sites that were being analyzed. The methylation differences in these loci were not significantly different between prostate cancer and control samples.

3.2.2 Concordance of microarray and pyrosequencing data

Overall, pyrosequencing replicated 62% of the microarray results. Figure 7 shows the concordance between the microarray and pyrosequencing experiments by comparing, for each method, the ratio of DNA methylation in prostate cancer to that in controls.
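One way such a replication rate could be computed is sketched below in Python. The criterion used here, agreement in the direction of the case-versus-control difference, is an assumption, since the text does not define how replication was scored; the locus names and values are hypothetical.

```python
def direction_concordance(microarray, pyroseq):
    """Fraction of shared loci whose case-vs-control change has the same
    sign in both assays, plus the list of concordant loci.

    Both arguments map a locus name to a signed difference
    (positive = higher methylation in prostate cancer).
    """
    shared = sorted(set(microarray) & set(pyroseq))
    agree = [locus for locus in shared if microarray[locus] * pyroseq[locus] > 0]
    return len(agree) / len(shared), agree


# Hypothetical values, for illustration only.
microarray = {"locus_A": -0.53, "locus_B": -0.41, "locus_C": 0.42, "locus_D": -0.30}
pyroseq = {"locus_A": -0.20, "locus_B": 0.03, "locus_C": 0.05, "locus_D": -0.01}

fraction, concordant = direction_concordance(microarray, pyroseq)
```

A sign-based criterion ignores effect size, so a locus with a very small pyrosequencing difference still counts as replicated; a stricter definition would also require statistical significance in the second assay.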

3.2.3 Predictive value of differential circulating DNA methylation in RNF219

RNF219 was further investigated using bioinformatics methods to assess its potential performance as a biomarker for prostate cancer. In RNF219, differential methylation between prostate cancer cases and controls was detected at the restriction site as well as at the neighbouring CpG dinucleotides (Figure 3). In sample set 1, the mean amplicon methylation across all the CpG positions investigated in RNF219 was significantly lower in prostate cancer cases (35 ± 18%) than in controls (55 ± 17%; p = 7.5 × 10⁻⁴, Mann-Whitney test) (Figure 8). In the second sample set, the mean amplicon methylation was again lower in cases (35 ± 32%) than in controls (44 ± 25%), although the difference did not reach statistical significance (p = 0.54, Mann-Whitney test) (Figure 8). The ability of the cirDNA mean amplicon methylation in RNF219 to predict whether a sample was derived from an individual with prostate cancer was evaluated by receiver operating characteristic (ROC) curve analysis in sample set 1 (Figure 9). A cut-off for mean cirDNA methylation in RNF219 between 49.9% and 53.4% exhibited 89.5% sensitivity and 71% specificity for prostate cancer in the training cohort. Applying this analysis to sample set 2 resulted in 61% sensitivity and 71% specificity (Figure 9, dashed lines). The area under the curve (AUC) was calculated for both cohorts (0.79 and 0.56 for the training and testing cohorts, respectively); the AUC can be interpreted as the probability that the classifier correctly ranks a randomly chosen prostate cancer sample against a randomly chosen control sample.

Figure 7. Differential methylation values obtained from microarrays and pyrosequencing for candidate loci. The Y-axis shows the coefficients for methylation values obtained from prostate cancer samples/controls, with positive and negative values indicating higher and lower methylation in prostate cancer, respectively. The X-axis shows the loci that were being analyzed.

Figure 8. DNA methylation in the RNF219 gene in two independent sample sets. The studied region showed statistically significant differences in Sample Set 1 (yellow box plots) between prostate cancer (PCa) and control (Ctrl) individuals (Mann-Whitney test; p = 7.5 × 10⁻⁴). In Sample Set 2 (PCa and Ctrl, red boxes) prostate cancer patients and controls did not show statistically significant differences (Mann-Whitney test; p = 0.54).

Figure 9. Predictive accuracy of cirDNA methylation level in the RNF219 gene. Mean DNA methylation percentages from Sample Set 1 were used to train classifiers (blue) and from Sample Set 2 to test the classifiers (red). An optimal cut-off threshold was determined in the training cohort by computing the True Positive (TPR) and False Positive (FPR) rates for mean cirDNA methylation between 1% and 100% in 0.1% increments. The optimal cut-off value (49.9%) showed FPR = 0.2917 (dashed vertical line) and TPR = 0.8947 in the training cohort. Extrapolating the FPR value at the cut-off from the training to the testing cohort, a TPR of 0.6111 was obtained (dashed horizontal line). The area under the curve (AUC) was calculated for both cohorts (0.794 and 0.562 for training and testing cohorts, respectively) and represents the probability that the classifier will accurately distinguish a prostate cancer from a control sample.
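The threshold scan described in the Figure 9 caption can be sketched in Python as follows. Two points are assumptions rather than details given in the text: samples at or below the cut-off are called prostate cancer (because cases showed lower methylation), and the optimal cut-off is taken to maximize Youden's J (TPR minus FPR). The methylation values are hypothetical.

```python
def rates_at(threshold, cases, controls):
    """TPR and FPR when methylation <= threshold is called prostate cancer."""
    tpr = sum(v <= threshold for v in cases) / len(cases)
    fpr = sum(v <= threshold for v in controls) / len(controls)
    return tpr, fpr


def scan_thresholds(cases, controls):
    """Scan cut-offs from 1.0% to 100.0% in 0.1% steps.

    Returns (best threshold, TPR, FPR), where "best" maximizes
    Youden's J = TPR - FPR (an assumed criterion).
    """
    best = None
    for step in range(10, 1001):   # integer steps avoid floating-point drift
        thr = step / 10
        tpr, fpr = rates_at(thr, cases, controls)
        j = tpr - fpr
        if best is None or j > best[0]:
            best = (j, thr, tpr, fpr)
    return best[1:]


def auc(cases, controls):
    """Probability that a random case has lower methylation than a random
    control, counting ties as one half."""
    wins = sum((c < k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical training cohort (mean methylation, %), for illustration only.
train_cases = [18, 25, 31, 36, 42, 47, 49, 58]
train_controls = [35, 50, 52, 55, 60, 63, 71, 80]

threshold, tpr, fpr = scan_thresholds(train_cases, train_controls)
train_auc = auc(train_cases, train_controls)
```

In this sketch, sensitivity corresponds to the TPR and specificity to 1 minus the FPR at the chosen cut-off. The pairwise definition of the AUC used here is equivalent to the area under the ROC curve traced by the scan.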

3.3 Microarray methylation analysis in the circulating DNA of colorectal cancer

The methylation profiles in the cirDNA of colorectal cancer patients (n = 93) and healthy controls (n = 100) were examined using Affymetrix GeneChip® human tiling arrays covering chromosomes 1 and 6. For this experiment, a single-channel system was used in which one sample was hybridized per microarray, and the two groups were then compared.

Affymetrix tiling arrays contain 25-mer probes with a mean gap of 10 bp between probes, giving a resolution of 35 bp, measured as the distance between the central positions of two adjacent oligos. This platform allows for substantially higher genomic resolution than CpG island arrays. Moreover, repetitive elements are not included among the probes, allowing for a detailed interrogation of unique DNA sequences.

Figure 10 shows a volcano plot in which the significance of each difference is plotted against the magnitude of the methylation difference between cases and controls. Local FDR enabled the ranking of the probesets on the array according to the differential methylation between colorectal cancer cases and controls. The top 100 differentially methylated loci were mapped to the genome, and all were found to be hypermethylated in colorectal cancer. In contrast to the CpG island microarray used in the prostate cancer experiment, tiling arrays exclude repetitive sequences. As mentioned above, repetitive sequences have been reported to be hypomethylated in tumour cells [41]. The finding of increased cirDNA methylation in the plasma samples of colorectal cancer patients compared to unaffected controls is therefore in line with the reported hypermethylation of unique sequences in cancer cells [52]. Of the top 100 loci that were identified, 48 were located within genes (six of these within exons, two of which were within CpG islands) (Table 6), while the remainder were located in intergenic regions (Table 7). The loci are ranked in order of significance; the chromosomal location of each probe begins at the position listed in the probe ID and ends 25 bp downstream.
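The relationship between a probe ID and its genomic interval can be made concrete with a short Python sketch. The function name is hypothetical, and the interval end is taken simply as start + 25; whether the listed coordinates are 0- or 1-based is not stated in the text and is left open here.

```python
import re

PROBE_LENGTH = 25   # tiling-array probes are 25-mers
MEAN_GAP = 10       # mean gap between adjacent probes, in bp


def probe_interval(probe_id):
    """Parse an ID such as 'Hs:NCBIv36;chr1-213838851' into
    (genome build, chromosome, start, end), with end = start + 25.
    """
    match = re.fullmatch(r"Hs:(\w+);(chr\w+)-(\d+)", probe_id)
    if match is None:
        raise ValueError(f"unrecognised probe ID: {probe_id!r}")
    build, chrom, start = match.group(1), match.group(2), int(match.group(3))
    return build, chrom, start, start + PROBE_LENGTH


# Centre-to-centre distance between adjacent probes.
resolution = PROBE_LENGTH + MEAN_GAP

top_gene_probe = probe_interval("Hs:NCBIv36;chr1-213838851")      # first row of Table 6
top_intergenic_probe = probe_interval("Hs:NCBIv36;chr6-21842458")  # first row of Table 7
```

The build label in each ID (NCBIv36) indicates the genome assembly to which the coordinates refer, which matters when the loci are compared with annotations from a different assembly.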

Figure 10. Volcano plot of microarray data in colorectal cancer and control samples. The X-axis represents DNA methylation differences between groups, with coefficients expressed in a log2 scale. Samples with increased microarray signals in colorectal cancer and control individuals had positive and negative coefficients, respectively. The Y-axis represents -log10-transformed p-values. Each circle represents a probe on the array. The vertical line depicts the 0 value, i.e. no differences in DNA methylation.

Table 6. Top 48 loci located within genes exhibiting differential methylation between colorectal cancer samples and controls.

Probe ID                   P-value   Associated Gene  Position  CpG Island
Hs:NCBIv36;chr1-213838851  2.91E-10  ESRRG            Intron    No
Hs:NCBIv36;chr6-21054177   1.53E-09  CDKAL1           Intron    No
Hs:NCBIv36;chr1-31994380   3.50E-08  KHDRBS1          Intron    No
Hs:NCBIv36;chr6-90524907   1.59E-07  MDN1             Exon      Yes
Hs:NCBIv36;chr1-152510715  2.49E-07  RUSC1            Exon      Yes
Hs:NCBIv36;chr1-148148046  4.99E-07  SETDB1           Intron    No
Hs:NCBIv36;chr1-12805234   6.88E-07  LOC440563        Intron    No
Hs:NCBIv36;chr1-154941271  8.95E-07  FCRL2            Intron    No
Hs:NCBIv36;chr6-88847150   2.00E-06  CNR1             Exon      No
Hs:NCBIv36;chr6-159566695  2.03E-06  FNDC1            Intron    No
Hs:NCBIv36;chr1-22387712   2.06E-06  EPHA8            Intron    No
Hs:NCBIv36;chr6-33640852   3.47E-06  ITPR3            Intron    No
Hs:NCBIv36;chr1-98863913   4.88E-06  LPPR5            Intron    No
Hs:NCBIv36;chr6-46819678   5.47E-06  MEP1A            Intron    No
Hs:NCBIv36;chr1-117953214  6.78E-06  SPAG17           Intron    No
Hs:NCBIv36;chr6-158915058  7.20E-06  TMEM181          Intron    No
Hs:NCBIv36;chr6-69506081   7.56E-06  BAI3             Intron    No
Hs:NCBIv36;chr1-6095584    1.05E-05  ACOT7            Intron    No
Hs:NCBIv36;chr6-45281902   1.26E-05  SUPT3H           Intron    No
Hs:NCBIv36;chr1-20488704   1.40E-05  KIF17            Intron    No
Hs:NCBIv36;chr6-157481247  1.50E-05  ARID1B           Intron    No
Hs:NCBIv36;chr6-121505803  1.80E-05  C6ORF170         Intron    No
Hs:NCBIv36;chr6-56118092   2.09E-05  COL21A1          Intron    No
Hs:NCBIv36;chr6-162926468  2.62E-05  PARK2            Intron    No
Hs:NCBIv36;chr1-143671056  2.89E-05  PDZK1P1          Exon      No
Hs:NCBIv36;chr1-17049487   2.90E-05  PADI4            Intron    No
Hs:NCBIv36;chr6-117707951  3.74E-05  ROS1             Exon      No
Hs:NCBIv36;chr1-59872336   3.86E-05  C1ORF87          Intron    No
Hs:NCBIv36;chr1-166857156  3.89E-05  SELL             Intron    No
Hs:NCBIv36;chr1-242676608  4.04E-05  KIF26B           Intron    No
Hs:NCBIv36;chr1-77528245   4.30E-05  ZZZ3             Intron    No
Hs:NCBIv36;chr6-151848043  4.34E-05  C6ORF97          Intron    No
Hs:NCBIv36;chr1-116956942  4.57E-05  TTF2             Intron    No
Hs:NCBIv36;chr1-211733065  7.32E-05  PTPN14           Intron    No
Hs:NCBIv36;chr6-168746121  8.06E-05  SMOC2            Intron    No
Hs:NCBIv36;chr6-99915686   8.13E-05  PNISR            Intron    No
Hs:NCBIv36;chr6-12843794   9.05E-05  PHACTR1          Intron    No
Hs:NCBIv36;chr1-102909098  0.000107  COL1A1           Intron    No
Hs:NCBIv36;chr1-75791022   0.000113  MSH4             Intron    No
Hs:NCBIv36;chr1-41721479   0.000132  HIVEP3           Intron    No
Hs:NCBIv36;chr1-114874368  0.000135  TSHB             Intron    No
Hs:NCBIv36;chr6-16393344   0.000141  GMPR             Intron    No
Hs:NCBIv36;chr6-107032438  0.000166  AIM1             Intron    No
Hs:NCBIv36;chr6-146165843  0.000179  LOC100507557     Intron    No
Hs:NCBIv36;chr1-148235825  0.000197  BNIPL            Exon      No
Hs:NCBIv36;chr1-195911599  0.000208  PTPRC            Intron    No
Hs:NCBIv36;chr1-198051675  0.000228  CAMSAP2          Intron    No
Hs:NCBIv36;chr6-152733717  0.000236  SYNE1            Intron    No

Table 7. Top 52 loci located in intergenic regions that exhibit differential methylation between colorectal cancer samples and controls.

Probe ID                   P-value
Hs:NCBIv36;chr6-21842458   7.08E-10
Hs:NCBIv36;chr1-98087976   2.18E-08
Hs:NCBIv36;chr6-113291105  2.02E-07
Hs:NCBIv36;chr1-172432657  2.89E-07
Hs:NCBIv36;chr6-56748085   2.97E-07
Hs:NCBIv36;chr6-93386032   4.14E-07
Hs:NCBIv36;chr1-96297880   5.94E-07
Hs:NCBIv36;chr6-46219241   6.12E-07
Hs:NCBIv36;chr6-109144812  1.13E-06
Hs:NCBIv36;chr1-225127559  1.19E-06
Hs:NCBIv36;chr1-217670439  1.84E-06
Hs:NCBIv36;chr6-157647107  2.05E-06
Hs:NCBIv36;chr6-114702555  2.19E-06
Hs:NCBIv36;chr6-94594322   2.39E-06
Hs:NCBIv36;chr1-84903351   3.17E-06
Hs:NCBIv36;chr6-140437388  3.59E-06
Hs:NCBIv36;chr6-91063725   4.74E-06
Hs:NCBIv36;chr1-153331972  5.90E-06
Hs:NCBIv36;chr1-232095177  5.90E-06
Hs:NCBIv36;chr1-158641351  8.11E-06
Hs:NCBIv36;chr6-94420702   8.32E-06
Hs:NCBIv36;chr1-163573985  8.91E-06
Hs:NCBIv36;chr1-12428075   1.06E-05
Hs:NCBIv36;chr6-9688571    1.07E-05
Hs:NCBIv36;chr1-232193363  1.09E-05
Hs:NCBIv36;chr6-46790190   1.11E-05
Hs:NCBIv36;chr6-85519191   1.15E-05
Hs:NCBIv36;chr1-12360737   1.25E-05
Hs:NCBIv36;chr6-121697390  1.62E-05
Hs:NCBIv36;chr6-119156212  1.68E-05
Hs:NCBIv36;chr6-30314349   2.66E-05
Hs:NCBIv36;chr6-143814024  2.75E-05
Hs:NCBIv36;chr6-92012184   3.74E-05
Hs:NCBIv36;chr1-165918822  3.78E-05
Hs:NCBIv36;chr6-67527603   4.40E-05
Hs:NCBIv36;chr1-100582630  4.65E-05
Hs:NCBIv36;chr6-88716345   4.83E-05
Hs:NCBIv36;chr1-173602290  5.67E-05
Hs:NCBIv36;chr6-132263478  6.27E-05
Hs:NCBIv36;chr1-58906929   6.35E-05
Hs:NCBIv36;chr1-69038857   7.35E-05
Hs:NCBIv36;chr6-103677824  7.46E-05
Hs:NCBIv36;chr6-108654738  9.04E-05
Hs:NCBIv36;chr6-48156297   0.000101
Hs:NCBIv36;chr1-29378111   0.000109
Hs:NCBIv36;chr1-201325158  0.000109
Hs:NCBIv36;chr1-193168655  0.000118
Hs:NCBIv36;chr6-39125578   0.000152
Hs:NCBIv36;chr1-65494488   0.000187
Hs:NCBIv36;chr1-3810641    0.000209
Hs:NCBIv36;chr6-84145291   0.000218
Hs:NCBIv36;chr6-136045316  0.000221

3.3.1 Potential candidate loci in the circulating DNA of colorectal cancer

Of the loci located within genes, 10 were found to have putative roles in cancer and/or were previously reported as potential cancer biomarkers. These 10 genes were: absent in melanoma 1 (AIM1); brain-specific angiogenesis inhibitor 3 (BAI3); chromosome 6 open reading frame 97 (C6ORF97); collagen, type 1, alpha 1 (COL1A1); estrogen-related receptor gamma (ESRRG); Fc receptor-like 2 (FCRL2); protein tyrosine phosphatase, non-receptor type 14 (PTPN14); protein tyrosine phosphatase, receptor type, C (PTPRC); c-ros oncogene 1, receptor tyrosine kinase (ROS1); and SPARC related modular calcium binding 2 (SMOC2).

4.0 DISCUSSION

The search for DNA methylation biomarkers in the plasma cirDNA of cancer patients has been an active area of biomedical research. The vast majority of studies have examined DNA methylation in candidate loci selected on the basis of existing knowledge of their differential methylation in primary tumour samples. DNA methylation profiling in cancer has thus far been limited to studies on tissue samples and cell lines [119, 132, 140]; however, applying this type of approach to cirDNA could be a fruitful pathway for biomarker discovery.

As part of a large research effort, our group adapted the differential methylation hybridization technology [141] for DNA methylation profiling of plasma cirDNA. To establish a proof-of-concept of this method in clinical samples, we performed a microarray experiment interrogating 12,192 loci on the plasma cirDNA of prostate cancer patients and control individuals. The areas showing significant differences in DNA methylation were identified and thirteen loci were subsequently evaluated using a fine-mapping bisulfite pyrosequencing approach on the original and an independent sample set. To further improve the genomic coverage and resolution of the method, we investigated the plasma cirDNA from colorectal cancer patients and healthy controls by hybridizing the samples to human tiling arrays containing over 6.5 million probes.

4.1 DNA methylation differences in the plasma circulating DNA of cancer patients

Overall, this study identified epigenome-wide methylation differences in the plasma cirDNA of cancer patients. In cancer, there are extensive alterations in DNA methylation patterns. Global genomic decreases in methylation, likely due to the loss of methylation in repetitive elements, are seen universally throughout various tumours [40]. There is also hypo- and hypermethylation of single copy DNA sequences [41]. Most studies in cancer have looked at methylation of CpG islands in the promoters of genes, and the traditional understanding is that these CpG islands are unmethylated in normal conditions and tend to become hypermethylated in cancer, leading to the inactivation of tumour suppressor genes [40, 51]. However, this may reflect a bias in the literature towards studying hyper- rather than hypomethylation [41], as there have been several reports of cancer-related DNA hypomethylation in the promoters of tumour- or proliferation-related genes, leading to oncogene activation [41].

4.1.1 Prostate cancer study

The overall trend in our experiment on the prostate cancer samples was that the majority of the statistically significant differentially methylated loci showed decreased methylation in the cirDNA of prostate cancer patients. This can be partially explained by the fact that 85 of the 133 loci that mapped to genomic regions corresponded to repetitive elements, and the finding of decreased methylation in the vast majority of these loci (79 out of 85) is consistent with the understanding of cancer-related DNA hypomethylation in repetitive DNA sequences [41]. The decreased methylation found in the unique sequences in the cirDNA of prostate cancer could be related to hypomethylation-associated gain-of-function of various genes in this disease; however, this finding could also reflect the overall dysregulation of the epigenetic machinery in cancer.

4.1.2 Colorectal cancer study

Substantial differential methylation was also found in the cirDNA of colorectal cancer, and the trend in this study was increased methylation in colorectal cancer relative to controls. As repetitive elements have been removed from Affymetrix GeneChip® tiling array probes, all loci that were differentially methylated between colorectal cancer samples and controls corresponded to unique sequences. This may help explain why hypermethylation was the dominant pattern in the colorectal cancer samples, in contrast to the hypomethylation observed in the prostate cancer study, which used CpG island microarrays containing repetitive elements. Moreover, the hypermethylation observed in the colorectal cancer samples may be related to tumour suppressor gene loss-of-function as well as to the overall epigenetic dysregulation that is seen in cancer.

Of the top 100 loci exhibiting differential methylation between colorectal cancer patients and controls, roughly half were located within genes (48 out of 100). Of these, six were located within exons, two of which were in CpG islands, and the rest were in introns. The remaining loci (52 out of 100) were found in intergenic regions. Aberrant DNA methylation of unique sequences in cancer has been largely studied in the context of CpG island-associated promoters. However, a recent study conducted by Irizarry et al. in colorectal cancer found that the regions showing the most differential methylation were in sequences located up to 2 kb away from CpG islands, which they termed “CpG island shores” [38]. The ability of our method to detect methylation changes in the cirDNA of cancer in areas outside of CpG islands suggests that it could identify biomarkers in areas of the cancer methylome that have been largely unexplored but are highly relevant for identifying cancer.

From a functional perspective, though the role of DNA methylation of CpG-islands in silencing transcription has been well established [41], the function of methylation in intra- and intergenic DNA has not been widely studied. Recent research may bring to light the functional implications of these phenomena. One study found that intragenic methylation overlapped with potential alternative promoters within genes, suggesting a functional role for intron methylation in regulating alternative promoters [142]. Aberrant methylation in these regions could thus interfere with transcription by either preventing or mediating the binding of transcription factors to these alternative promoter sites. Aberrant methylation in intergenic regions could also have important functional consequences by interfering with transcriptional regulation involving sequences such as enhancers. Moreover, although intergenic regions are not well characterized, they can contain genes for non-coding RNAs that may have important regulatory functions [143].

One group of researchers investigated a differentially methylated intergenic region located next to a gene, maternally expressed 3 (MEG3), which codes for a noncoding RNA involved in suppressing tumour cell growth in pituitary adenomas [144]. Hypermethylation at this intergenic region was found to be associated with loss of MEG3 expression and the loss of pituitary function that is found in some pituitary adenomas [144]. Therefore, increased DNA methylation in both intra- and intergenic DNA can have important functional consequences in cancer, and this may be partially responsible for the results that were found in this study.

4.2 Discovery of candidate gene markers

The results of the microarray experiments identified numerous potential candidate loci in the prostate cancer and colorectal cancer sample sets. However, to account for inaccuracies in the microarray assessment of DNA methylation, it was important to verify the results obtained from the microarrays with an independent method. Bisulfite modification coupled with pyrosequencing, which allows for quantitative analysis of methylation at a single nucleotide resolution, was used to verify the microarray data from the prostate cancer study, and is the next step in the colorectal cancer study.

4.2.1 Prostate cancer study

Among the loci identified as differentially methylated in prostate cancer cases and controls, the reported physiological roles and/or involvement in carcinogenesis of some of them helped inform their inclusion in the fine mapping experiments. For instance, AGAP1 belongs to the ADP-ribosylation factor GTPase-activating protein family, involved in catalyzing the conversion of GTP to GDP [145]. Overexpression of these proteins has been found to lead to activation of gene transcription [146], which is a common event in cancer. Downregulation of GNG7 has been correlated with poor prognosis in esophageal cancer, and hypermethylation or loss of heterozygosity has been reported in 57% of tumours with GNG7 suppression [147]. Moreover, SIX3 was reported to be differentially expressed in gastric tumours compared to normal tissues [148] and was found to be hypermethylated in primary gliomas and cell lines [149].

4.2.1.1 Replication of microarray data by pyrosequencing

Eight of the thirteen loci examined by fine mapping showed concordant results with microarray data, and three of these loci displayed significant methylation differences between prostate cancer samples and healthy controls in the pyrosequencing experiments. Of these genes, RNF219 was found to be significantly hypomethylated in cancer by both the microarray and pyrosequencing techniques. The precise role of RNF219 has not yet been elucidated; however, its family of proteins includes the early onset breast cancer gene BRCA1 [150] and the polycomb ring finger oncogene BMI1 [151].

The moderate replication rate of the microarray data by bisulfite pyrosequencing (62%) can be partially explained by differences in the methodology of the two techniques. Bisulfite pyrosequencing allows for the quantitative assessment of methylation at individual CpG sites, which is not possible with microarrays. However, pyrosequencing assays are limited in size to 150 bp, whereas the probes on the CpG island microarrays used in this experiment are much longer (345 ± 237 bp). Therefore, a discrepancy may have arisen because the microarray probes interrogate much longer spans of the genome than the pyrosequencing assays.

This may help explain the disparity in methylation that was found between the prostate cancer and control samples by the two methods. One way to decrease this discrepancy is to use tiling microarrays, which contain much shorter probes (25 bp), allowing for more specific hybridization results. Tiling arrays were used in the subsequent methylation profiling experiment which analyzed the plasma cirDNA in colorectal cancer.

4.2.1.2 Performance characteristics of RNF219

Of the candidate genes identified by both the microarrays and pyrosequencing in the prostate cancer study, RNF219 was further evaluated for its performance characteristics as a potential biomarker. To verify whether the cirDNA methylation difference in RNF219 was an intrinsic property of sample set 1, a second sample set consisting of a similar number of individuals was also studied (20 cases and 18 controls). One difference between the two sample sets was that although the controls in the first set were disease-free, controls in the second set were diagnosed with BPH and may have a higher risk for prostate cancer [152-153]. Consistent with the results from the first sample set, prostate cancer samples in sample set 2 exhibited lower density of methylated cytosines compared to controls. However, this difference did not reach statistical significance, and this may be due to the bimodal distribution of methylation in the second set of prostate cancer samples.

To determine the predictive accuracy of the cirDNA methylation in RNF219, the first and second sample sets were used as training and testing cohorts, respectively. A specificity of 71% and a sensitivity of 61% were estimated for this single marker. These values give plasma cirDNA methylation in RNF219 an intermediate position amongst other biomarkers detected in blood samples. Specificity was much higher than that of PSA levels (20%), but slightly lower than that of differential methylation in GSTP1 (89%). In turn, sensitivity was lower than that of PSA levels (80%), but higher than that of differential methylation in GSTP1 (52%) [55]. Taken together, these results support the idea that a panel of biomarkers may provide higher specificity and sensitivity than individual markers alone [154-155].
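As an illustration of how sensitivity and specificity can be estimated from a training and a testing cohort, the sketch below selects a methylation cut-off on one cohort and evaluates it on another. All methylation values are hypothetical, and the threshold rule (maximizing sensitivity plus specificity on the training cohort) is an assumption made for this example; it is not necessarily the procedure used to obtain the figures reported above.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(cases: Sequence[float], controls: Sequence[float],
                            threshold: float) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for a hypomethylation marker.

    A sample is called positive when its methylation is below the threshold,
    because the marker is assumed to be less methylated in cancer.
    """
    true_pos = sum(1 for m in cases if m < threshold)
    true_neg = sum(1 for m in controls if m >= threshold)
    return true_pos / len(cases), true_neg / len(controls)


def pick_threshold(cases: Sequence[float], controls: Sequence[float]) -> float:
    """Choose the observed value that maximizes sensitivity + specificity."""
    candidates = sorted(set(cases) | set(controls))
    return max(candidates,
               key=lambda t: sum(sensitivity_specificity(cases, controls, t)))


# Hypothetical percent-methylation values, for illustration only
train_cases, train_controls = [20, 25, 30, 45], [40, 50, 55, 60]
test_cases, test_controls = [22, 35, 48], [38, 52, 58, 61]

threshold = pick_threshold(train_cases, train_controls)
sens, spec = sensitivity_specificity(test_cases, test_controls, threshold)
print(f"threshold={threshold}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Fixing the cut-off on the training cohort before the testing cohort is examined keeps the reported performance from being tuned to the data on which it is evaluated.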

Though the performance characteristics for RNF219 as a biomarker were moderate, this gene was identified from our pilot study, which was done on a small scale using CpG island microarrays. It is therefore probable that studies using more samples and more informative microarray platforms can yield biomarkers with enhanced performance characteristics.

Forthcoming experiments and analysis of data obtained from the colorectal cancer study, which used a much larger sample size (193 compared to 40) and was performed on tiling microarrays, will help to establish the clinical value of this new approach in cancer diagnosis.

4.2.2 Colorectal cancer study

Of the 48 loci located within genes that were identified in the top 100 significantly differentially methylated regions in colorectal cancer, 10 were previously reported to be associated with cancer. Of these, AIM1 is among the most studied for its role in cancer. AIM1 is a novel non-lens member of the βγ-crystallin superfamily [156]. It is believed to play a role in the suppression of melanoma as it is expressed at high levels in suppressed melanoma cells and maps to a locus for suppression of malignant melanoma on chromosome 6 [157]. AIM1 hypermethylation is postulated to play a role in a number of other cancers. Lower levels of methylation in the AIM1 promoter were found to predict decreased risk of cancer recurrence in prostate cancer patients who had undergone radical prostatectomy [158]. AIM1 methylation was also found in 84% of bladder cancer tumours (n = 93) and correlated with primary tumour invasion [159]. In a panel of other genes, AIM1 methylation was found to be associated with nasopharyngeal carcinoma [160]. A low correlation was found between AIM1 methylation and multiple myeloma (detected in 12.5% of samples, n = 51) [161]. In addition, hypermethylation of AIM1 was identified as a potential biomarker for lung cancer in serum cirDNA in a panel with 5 other markers [162].

Another cancer-related gene that was identified is BAI3, which encodes an angiogenesis inhibitor expressed in brain tissues. It is hypothesized to play a role in the suppression of glioblastoma [163], and this family of proteins is thought to have anti-tumour properties as expression is lost during tumour formation [164]. In another study, BAI3 expression levels were decreased in metastatic brain cancer [165]. The function of C6ORF97 is unknown; however, expression of this gene predicted improved disease-free survival in breast cancer, and was negatively correlated with expression of the estrogen receptor (ER) gene in ER-positive breast cancers [166]. Another gene that has been studied in breast cancer is ESRRG, which is thought to play an important role in modulating estrogen signalling in breast cancer cells [167]. COL1A1 encodes a major component of collagen and has been largely studied in the context of bone and connective tissue disorders [168]. However, it has also been implicated as a potential marker that can distinguish between malignant and premalignant gastric lesions [169], and overexpression of COL1A1 has been associated with gastric carcinoma [170]. In addition, high levels of FCRL2 are expressed by B cells in chronic lymphocytic leukemia [171-172], which has been correlated with clinical progression of the disease [171].

Protein tyrosine phosphatases (PTPs), such as PTPN14 and PTPRC, play important roles in signaling pathways that underlie tumourigenesis [173]. A study in colorectal cancer found expression of these genes to inhibit cell growth, and postulated a role for PTPs as tumour suppressor genes [173]. In contrast, ROS1 is a proto-oncogene that has been shown to be activated by chromosomal rearrangements in several cancers, including those of the brain, lung, and bile duct [174-175]. Finally, SMOC2 encodes a protein that acts as an angiogenic growth factor [176]. Interestingly, it was found to be downregulated in chemotherapy-resistant ovarian cancer and implicated in ovarian tumour suppression [177].

As significant differential DNA methylation of these genes was detected by the microarrays and they have been previously implicated in cancer, these loci represent promising candidates for future studies. The findings from previous research suggest that the differential methylation of these genes that was detected by the microarrays may stem from cancer-specific processes. Moreover, the cancer-associated lower expression levels observed in the majority of these genes seem consistent with the hypermethylation that was detected in the microarrays. As a further illustration of the utility of this approach for discovering novel markers, of all of the genes that were listed above, only AIM1 has been previously studied in the cirDNA. This demonstrates the ability of methylation profiling in cancer cirDNA to identify novel candidate markers that have been missed by previous studies.

4.3 Future directions

Overall, the results from this study indicate that microarray-based scans of the cancer circulating methylome are a promising way to discover candidate genes for biomarker research.

This study represents a significant improvement over existing research by its relatively unbiased approach as well as its direct interrogation of cirDNA. This is the first study to comprehensively analyze methylation patterns in cancer plasma cirDNA. Directly profiling the methylation patterns found in cirDNA, rather than in tumour tissue, was a major strength of our approach, as malignant cells shedding DNA into the circulation may carry different DNA methylation profiles than those found in the primary tumour [69]. Distinct disease-specific DNA methylation changes would argue that cells forming tumours versus malignant cells that are prone to apoptosis and necrosis represent different populations of tumour cells. It is possible that the cirDNA originating from tumour cells comes from cells that did not survive the evolutionary ‘bottlenecks’ of tumour growth. This may explain the relatively low concordance found in several studies between methylation of genes in cirDNA and that in the primary tumour [79]. For example, p16 methylation was found in 82% of esophageal squamous cell carcinoma tumours, but only in 23% of the corresponding serum samples [178]. Hence, it is important to conduct biomarker studies directly on cirDNA, rather than to select candidate markers based on results found in tumour tissues. Another strength of the approach used in this study was the verification of microarray results using an independent method, pyrosequencing. This step confirms that microarray-nominated candidate markers do show differential methylation before further validation efforts are undertaken.

In this study, the methylation profile of plasma cirDNA in cancer was examined using a restriction enzyme-based methylation detection technique. Though informative, using restriction enzymes limits the number of CpG sites that can be analyzed, as genomic regions that lack the recognition sites of the enzymes being used cannot be analyzed. Hence our analysis was based on CpG sites that contained these specific sequences. To establish a proof-of-principle of the method, a pilot study was performed using CpG island arrays. Upon establishing the utility of this method for identifying areas of differential methylation between cancer and controls, the experiments were scaled up by the use of tiling arrays, which are a substantially more informative and precise platform. However, the tiling arrays that were used were specific for chromosomes 1 and 6 and thus did not provide whole genome coverage. Applying next-generation sequencing technology is a way to achieve whole genome coverage as well as bypass the limitations of restriction enzyme-based approaches, though its current cost is prohibitive.
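The coverage limitation of restriction enzyme-based detection can be illustrated with a short sketch that counts how many CpG dinucleotides in a sequence fall inside a recognition site. HpaII (recognition site CCGG) is used here only as a familiar example of a methylation-sensitive enzyme; the enzyme choice, the function names, and the toy sequence are assumptions for illustration and do not describe the enzymes or loci used in this study.

```python
from typing import List


def cpg_positions(seq: str) -> List[int]:
    """Return the 0-based index of the C in every CpG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def cpgs_in_sites(seq: str, site: str = "CCGG") -> List[int]:
    """Return CpG positions that lie inside an occurrence of the recognition site.

    Assumes the site contains exactly one CG, as CCGG does.
    """
    seq = seq.upper()
    offset = site.index("CG")  # position of the CpG within the site
    covered = set()
    start = seq.find(site)
    while start != -1:
        covered.add(start + offset)
        start = seq.find(site, start + 1)  # continue past this occurrence
    return sorted(covered)


# Toy sequence, for illustration only
seq = "ATCGTTCCGGACGCGTACCGGTTCGA"
all_cpgs = cpg_positions(seq)
assayable = cpgs_in_sites(seq)
print(f"{len(assayable)} of {len(all_cpgs)} CpG sites lie within a CCGG site")
```

In this toy sequence only two of the six CpG sites fall within a recognition site, which mirrors the point made above: CpG sites outside the recognition sequences are invisible to the assay.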

To conclude, this study provides a basis for future efforts in the discovery of biomarkers in cirDNA. This study was able to detect differences in methylation across the genome as well as identify promising and novel candidate loci for future studies. Moreover, it represents the first study of its kind to comprehensively analyze the cirDNA methylome in cancer. Applying this approach to the discovery of candidate diagnostic markers in other types of cancer as well as for other types of biomarkers, such as those for predictive or prognostic applications, is a promising avenue for research.

5.0 REFERENCES

1. Ferlay J, S.H., Bray F, Forman D, Mathers C and Parkin DM, GLOBOCAN 2008 v1.2, Cancer Incidence and Mortality Worldwide: IARC CancerBase N.10. 2010.
2. Cerdan-Santacruz, C., et al., Colorectal cancer and its delayed diagnosis: have we improved in the past 25 years? Rev Esp Enferm Dig, 2011. 103(9): p. 458-63.
3. Pepe, M.S., et al., Phases of biomarker development for early detection of cancer. J Natl Cancer Inst, 2001. 93(14): p. 1054-61.
4. Roncoroni, L., et al., Delay in the diagnosis and outcome of colorectal cancer: a prospective study. Eur J Surg Oncol, 1999. 25(2): p. 173-8.
5. Howlader N, N.A., Krapcho M, Neyman N, Aminou R, Waldron W, Altekruse SF, Kosary CL, Ruhl J, Tatalovich Z, Cho H, Mariotto A, Eisner MP, Lewis DR, Chen HS, Feuer EJ, Cronin KA, Edwards BK, SEER Cancer Statistics Review, 1975-2008, National Cancer Institute.
6. Feng, Q., M. Yu, and N.B. Kiviat, Molecular biomarkers for cancer detection in blood and bodily fluids. Crit Rev Clin Lab Sci, 2006. 43(5-6): p. 497-560.
7. Ziegler, A., U. Zangemeister-Wittke, and R.A. Stahel, Circulating DNA: a new diagnostic gold mine? Cancer Treat Rev, 2002. 28(5): p. 255-71.
8. Henikoff, S. and M.A. Matzke, Exploring and explaining epigenetic effects. Trends Genet, 1997. 13(8): p. 293-5.
9. Bird, A., DNA methylation patterns and epigenetic memory. Genes Dev, 2002. 16(1): p. 6-21.
10. Jenuwein, T. and C.D. Allis, Translating the histone code. Science, 2001. 293(5532): p. 1074-80.
11. Bestor, T.H., The DNA methyltransferases of mammals. Hum Mol Genet, 2000. 9(16): p. 2395-402.
12. Ehrlich, M., et al., Amount and distribution of 5-methylcytosine in human DNA from different types of tissues of cells. Nucleic Acids Res, 1982. 10(8): p. 2709-21.
13. Yoder, J.A., C.P. Walsh, and T.H. Bestor, Cytosine methylation and the ecology of intragenomic parasites. Trends Genet, 1997. 13(8): p. 335-40.
14. Gardiner-Garden, M. and M. Frommer, CpG islands in vertebrate genomes. J Mol Biol, 1987. 196(2): p. 261-82.
15. Antequera, F. and A. Bird, Number of CpG islands and genes in human and mouse. Proc Natl Acad Sci U S A, 1993. 90(24): p. 11995-9.
16. Tahiliani, M., et al., Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science, 2009. 324(5929): p. 930-5.
17. Ito, S., et al., Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science, 2011. 333(6047): p. 1300-3.
18. He, Y.F., et al., Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA. Science, 2011. 333(6047): p. 1303-7.
19. Nestor, C., et al., Enzymatic approaches and bisulfite sequencing cannot distinguish between 5-methylcytosine and 5-hydroxymethylcytosine in DNA. Biotechniques, 2010. 48(4): p. 317-9.
20. Okano, M., et al., DNA methyltransferases Dnmt3a and Dnmt3b are essential for de novo methylation and mammalian development. Cell, 1999. 99(3): p. 247-57.

21. Suetake, I., et al., DNMT3L stimulates the DNA methylation activity of Dnmt3a and Dnmt3b through a direct interaction. J Biol Chem, 2004. 279(26): p. 27816-23.
22. Stein, R., A. Razin, and H. Cedar, In vitro methylation of the hamster adenine phosphoribosyltransferase gene inhibits its expression in mouse L cells. Proc Natl Acad Sci U S A, 1982. 79(11): p. 3418-22.
23. Watt, F. and P.L. Molloy, Cytosine methylation prevents binding to DNA of a HeLa cell transcription factor required for optimal expression of the adenovirus major late promoter. Genes Dev, 1988. 2(9): p. 1136-43.
24. Harikrishnan, K.N., et al., Brahma links the SWI/SNF chromatin-remodeling complex with MeCP2-dependent transcriptional silencing. Nat Genet, 2005. 37(3): p. 254-64.
25. Nan, X., F.J. Campoy, and A. Bird, MeCP2 is a transcriptional repressor with abundant binding sites in genomic chromatin. Cell, 1997. 88(4): p. 471-81.
26. Hellman, A. and A. Chess, Gene body-specific methylation on the active X chromosome. Science, 2007. 315(5815): p. 1141-3.
27. Robertson, K.D., DNA methylation and human disease. Nat Rev Genet, 2005. 6(8): p. 597-610.
28. Goto, T. and M. Monk, Regulation of X-chromosome inactivation in development in mice and humans. Microbiol Mol Biol Rev, 1998. 62(2): p. 362-78.
29. Jones, P.A. and P.W. Laird, Cancer epigenetics comes of age. Nat Genet, 1999. 21(2): p. 163-7.
30. Nagase, H. and S. Ghosh, Epigenetics: differential DNA methylation in mammalian somatic tissues. FEBS J, 2008. 275(8): p. 1617-23.
31. Feinberg, A.P., An epigenetic approach to cancer etiology. Cancer J, 2007. 13(1): p. 70-4.
32. Esteller, M., et al., DNA methylation patterns in hereditary human cancers mimic sporadic tumorigenesis. Hum Mol Genet, 2001. 10(26): p. 3001-7.
33. Esteller, M., et al., Inactivation of the DNA repair gene O6-methylguanine-DNA methyltransferase by promoter hypermethylation is a common event in primary human neoplasia. Cancer Res, 1999. 59(4): p. 793-7.
34. Feinberg, A.P., Imprinting of a genomic domain of 11p15 and loss of imprinting in cancer: an introduction. Cancer Res, 1999. 59(7 Suppl): p. 1743s-1746s.
35. Feinberg, A.P. and B. Vogelstein, Hypomethylation distinguishes genes of some human cancers from their normal counterparts. Nature, 1983. 301(5895): p. 89-92.
36. Feinberg, A.P. and B. Vogelstein, Hypomethylation of ras oncogenes in primary human cancers. Biochem Biophys Res Commun, 1983. 111(1): p. 47-54.
37. Fleisher, A.S., et al., Hypermethylation of the hMLH1 gene promoter in human gastric cancers with microsatellite instability. Cancer Res, 1999. 59(5): p. 1090-5.
38. Irizarry, R.A., et al., The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores. Nat Genet, 2009. 41(2): p. 178-86.
39. Issa, J.P., CpG island methylator phenotype in cancer. Nat Rev Cancer, 2004. 4(12): p. 988-93.
40. Esteller, M., Epigenetics in cancer. N Engl J Med, 2008. 358(11): p. 1148-59.
41. Ehrlich, M., DNA methylation in cancer: too much, but also too little. Oncogene, 2002. 21(35): p. 5400-13.

42. Karpf, A.R. and S. Matsui, Genetic disruption of cytosine DNA methyltransferase enzymes induces chromosomal instability in human cancer cells. Cancer Res, 2005. 65(19): p. 8635-9.
43. Eden, A., et al., Chromosomal instability and tumors promoted by DNA hypomethylation. Science, 2003. 300(5618): p. 455.
44. Strichman-Almashanu, L.Z., et al., A genome-wide screen for normally methylated human CpG islands that can identify novel imprinted genes. Genome Res, 2002. 12(4): p. 543-54.
45. Feinberg, A.P. and B. Tycko, The history of cancer epigenetics. Nat Rev Cancer, 2004. 4(2): p. 143-53.
46. Goelz, S.E., et al., Hypomethylation of DNA from benign and malignant human colon neoplasms. Science, 1985. 228(4696): p. 187-90.
47. Watt, P.M., R. Kumar, and U.R. Kees, Promoter demethylation accompanies reactivation of the HOX11 proto-oncogene in leukemia. Genes Chromosomes Cancer, 2000. 29(4): p. 371-7.
48. Sharrard, R.M., et al., Patterns of methylation of the c-myc gene in human colorectal cancer progression. Br J Cancer, 1992. 65(5): p. 667-72.
49. Martin, V., et al., Involvement of DNA methylation in the control of the expression of an estrogen-induced breast-cancer-associated protein (pS2) in human breast cancers. J Cell Biochem, 1997. 65(1): p. 95-106.
50. Cho, M., et al., Hypomethylation of the MN/CA9 promoter and upregulated MN/CA9 expression in human renal cell carcinoma. Br J Cancer, 2001. 85(4): p. 563-7.
51. Rauch, T.A., et al., High-resolution mapping of DNA hypermethylation and hypomethylation in lung cancer. Proc Natl Acad Sci U S A, 2008. 105(1): p. 252-7.
52. Costello, J.F., et al., Aberrant CpG-island methylation has non-random and tumour-type-specific patterns. Nat Genet, 2000. 24(2): p. 132-8.
53. Herman, J.G. and S.B. Baylin, Gene silencing in cancer in association with promoter hypermethylation. N Engl J Med, 2003. 349(21): p. 2042-54.
54. Esteller, M., CpG island hypermethylation and tumor suppressor genes: a booming present, a brighter future. Oncogene, 2002. 21(35): p. 5427-40.
55. Wu, T., et al., Measurement of GSTP1 promoter methylation in body fluids may complement PSA screening: a meta-analysis. Br J Cancer, 2011. 105(1): p. 65-73.
56. Esteller, M., et al., Promoter hypermethylation and BRCA1 inactivation in sporadic breast and ovarian tumors. J Natl Cancer Inst, 2000. 92(7): p. 564-9.
57. Esteller, M., et al., A gene hypermethylation profile of human cancer. Cancer Res, 2001. 61(8): p. 3225-9.
58. Herman, J.G., et al., Incidence and functional consequences of hMLH1 promoter hypermethylation in colorectal carcinoma. Proc Natl Acad Sci U S A, 1998. 95(12): p. 6870-5.
59. Toyota, M., et al., CpG island methylator phenotype in colorectal cancer. Proc Natl Acad Sci U S A, 1999. 96(15): p. 8681-6.
60. Biomarkers and surrogate endpoints: preferred definitions and conceptual framework. Clin Pharmacol Ther, 2001. 69(3): p. 89-95.
61. Marchionni, L., et al., Impact of gene expression profiling tests on breast cancer outcomes. Evid Rep Technol Assess (Full Rep), 2007(160): p. 1-105.

62. Finn, R.S., et al., PD 0332991, a selective cyclin D kinase 4/6 inhibitor, preferentially inhibits proliferation of luminal estrogen receptor-positive human breast cancer cell lines in vitro. Breast Cancer Res, 2009. 11(5): p. R77.
63. Sawyers, C.L., The cancer biomarker problem. Nature, 2008. 452(7187): p. 548-52.
64. Altman, D.G. and J.M. Bland, Diagnostic tests. 1: Sensitivity and specificity. BMJ, 1994. 308(6943): p. 1552.
65. Fahey, M.T., L. Irwig, and P. Macaskill, Meta-analysis of Pap test accuracy. Am J Epidemiol, 1995. 141(7): p. 680-9.
66. Choi, J.J., C.F. Reich, 3rd, and D.S. Pisetsky, Release of DNA from dead and dying lymphocyte and monocyte cell lines in vitro. Scand J Immunol, 2004. 60(1-2): p. 159-66.
67. Choi, J.J., C.F. Reich, 3rd, and D.S. Pisetsky, The role of macrophages in the in vitro generation of extracellular DNA from apoptotic and necrotic cells. Immunology, 2005. 115(1): p. 55-62.
68. Diehl, F., et al., Detection and quantification of mutations in the plasma of patients with colorectal tumors. Proc Natl Acad Sci U S A, 2005. 102(45): p. 16368-73.
69. Jahr, S., et al., DNA fragments in the blood plasma of cancer patients: quantitations and evidence for their origin from apoptotic and necrotic cells. Cancer Res, 2001. 61(4): p. 1659-65.
70. Fournie, G.J., et al., Plasma DNA as a marker of cancerous cell death. Investigations in patients suffering from lung cancer and in nude mice bearing human tumours. Cancer Lett, 1995. 91(2): p. 221-7.
71. van der Vaart, M. and P.J. Pretorius, Circulating DNA. Its origin and fluctuation. Ann N Y Acad Sci, 2008. 1137: p. 18-26.
72. Cline, A.M. and M.Z. Radic, Apoptosis, subcellular particles, and autoimmunity. Clin Immunol, 2004. 112(2): p. 175-82.
73. Anker, P., M. Stroun, and P.A. Maurice, Spontaneous release of DNA by human blood lymphocytes as shown in an in vitro system. Cancer Res, 1975. 35(9): p. 2375-82.
74. Stroun, M., et al., About the possible origin and mechanism of circulating DNA apoptosis and active DNA release. Clin Chim Acta, 2001. 313(1-2): p. 139-42.
75. Leon, S.A., et al., Free DNA in the serum of cancer patients and the effect of therapy. Cancer Res, 1977. 37(3): p. 646-50.
76. Shapiro, B., et al., Determination of circulating DNA levels in patients with benign or malignant gastrointestinal disease. Cancer, 1983. 51(11): p. 2116-20.
77. Allen, D., et al., Role of cell-free plasma DNA as a diagnostic marker for prostate cancer. Ann N Y Acad Sci, 2004. 1022: p. 76-80.
78. Sunami, E., et al., Quantification of LINE1 in circulating DNA as a molecular biomarker of breast cancer. Ann N Y Acad Sci, 2008. 1137: p. 171-4.
79. Fleischhacker, M. and B. Schmidt, Circulating nucleic acids (CNAs) and cancer--a survey. Biochim Biophys Acta, 2007. 1775(1): p. 181-232.
80. Schwarzenbach, H., D.S. Hoon, and K. Pantel, Cell-free nucleic acids as biomarkers in cancer patients. Nat Rev Cancer, 2011. 11(6): p. 426-37.
81. Stroun, M., et al., Isolation and characterization of DNA from the plasma of cancer patients. Eur J Cancer Clin Oncol, 1987. 23(6): p. 707-12.
82. Chen, X.Q., et al., Microsatellite alterations in plasma DNA of small cell lung cancer patients. Nat Med, 1996. 2(9): p. 1033-5.

83. Goessl, C., et al., Microsatellite analysis of plasma DNA from patients with clear cell renal carcinoma. Cancer Res, 1998. 58(20): p. 4728-32.
84. Cunningham, D., et al., Colorectal cancer. Lancet, 2010. 375(9719): p. 1030-47.
85. Worthley, D.L. and B.A. Leggett, Colorectal cancer: molecular features and clinical opportunities. Clin Biochem Rev, 2010. 31(2): p. 31-8.
86. Sozzi, G., et al., Detection of microsatellite alterations in plasma DNA of non-small cell lung cancer patients: a prospect for early diagnosis. Clin Cancer Res, 1999. 5(10): p. 2689-92.
87. Nawroz-Danish, H., et al., Microsatellite analysis of serum DNA in patients with head and neck cancer. Int J Cancer, 2004. 111(1): p. 96-100.
88. Hibi, K., et al., Molecular detection of genetic alterations in the serum of colorectal cancer patients. Cancer Res, 1998. 58(7): p. 1405-7.
89. Mayall, F., et al., Mutations of p53 gene can be detected in the plasma of patients with large bowel carcinoma. J Clin Pathol, 1998. 51(8): p. 611-3.
90. Silva, J.M., et al., Presence of tumor DNA in plasma of breast cancer patients: clinicopathological correlations. Cancer Res, 1999. 59(13): p. 3251-6.
91. Otsuka, J., et al., Detection of p53 mutations in the plasma DNA of patients with ovarian cancer. Int J Gynecol Cancer, 2004. 14(3): p. 459-64.
92. Castells, A., et al., K-ras mutations in DNA extracted from the plasma of patients with pancreatic carcinoma: diagnostic utility and prognostic significance. J Clin Oncol, 1999. 17(2): p. 578-84.
93. Trombino, S., et al., Mutations in K-ras codon 12 detected in plasma DNA are not an indicator of disease in patients with non-small cell lung cancer. Clin Chem, 2005. 51(7): p. 1313-4.
94. Umetani, N., et al., Prediction of breast tumor progression by integrity of free circulating DNA in serum. J Clin Oncol, 2006. 24(26): p. 4270-6.
95. Salani, R., et al., Measurement of cyclin E genomic copy number and strand length in cell-free DNA distinguish malignant versus benign effusions. Clin Cancer Res, 2007. 13(19): p. 5805-9.
96. Umetani, N., et al., Increased integrity of free circulating DNA in sera of patients with colorectal or periampullary cancer: direct quantitative PCR for ALU repeats. Clin Chem, 2006. 52(6): p. 1062-9.
97. Herman, J.G., et al., Methylation-specific PCR: a novel PCR assay for methylation status of CpG islands. Proc Natl Acad Sci U S A, 1996. 93(18): p. 9821-6.
98. Kristensen, L.S. and L.L. Hansen, PCR-based methods for detecting single-locus DNA methylation biomarkers in cancer diagnostics, prognostics, and response to treatment. Clin Chem, 2009. 55(8): p. 1471-83.
99. Dulaimi, E., et al., Tumor suppressor gene promoter hypermethylation in serum of breast cancer patients. Clin Cancer Res, 2004. 10(18 Pt 1): p. 6189-93.
100. Usadel, H., et al., Quantitative adenomatous polyposis coli promoter methylation analysis in tumor tissue, serum, and plasma DNA of patients with lung cancer. Cancer Res, 2002. 62(2): p. 371-5.
101. Ibanez de Caceres, I., et al., Tumor cell-specific BRCA1 and RASSF1A hypermethylation in serum, plasma, and peritoneal fluid from ovarian cancer patients. Cancer Res, 2004. 64(18): p. 6476-81.

102. Liggett, T., et al., Differential methylation of cell-free circulating DNA among patients with pancreatic cancer versus chronic pancreatitis. Cancer, 2010. 116(7): p. 1674-80.
103. Liggett, T.E., et al., Methylation patterns in cell-free plasma DNA reflect removal of the primary tumor and drug treatment of breast cancer patients. Int J Cancer, 2011. 128(2): p. 492-9.
104. Cassinotti, E., et al., DNA methylation patterns in blood of patients with colorectal cancer and adenomatous colorectal polyps. Int J Cancer, 2011.
105. Liggett, T.E., et al., Distinctive DNA methylation patterns of cell-free plasma DNA in women with malignant ovarian tumors. Gynecol Oncol, 2011. 120(1): p. 113-20.
106. Melnikov, A., et al., Differential methylation profile of ovarian cancer in tissues and plasma. J Mol Diagn, 2009. 11(1): p. 60-5.
107. Melnikov, A.A., et al., Array-based multiplex analysis of DNA methylation in breast cancer tissues. J Mol Diagn, 2008. 10(1): p. 93-101.
108. Wilson, I.M., et al., Epigenomics: mapping the methylome. Cell Cycle, 2006. 5(2): p. 155-8.
109. Gonzalgo, M.L., et al., Identification and characterization of differentially methylated regions of genomic DNA by methylation-sensitive arbitrarily primed PCR. Cancer Res, 1997. 57(4): p. 594-9.
110. Jemal, A., et al., Cancer statistics, 2010. CA Cancer J Clin, 2010. 60(5): p. 277-300.
111. Dunn, M.W. and M.W. Kazer, Prostate cancer overview. Semin Oncol Nurs, 2011. 27(4): p. 241-50.
112. Lilja, H., D. Ulmert, and A.J. Vickers, Prostate-specific antigen and prostate cancer: prediction, detection and monitoring. Nat Rev Cancer, 2008. 8(4): p. 268-78.
113. Lilja, H., A kallikrein-like serine protease in prostatic fluid cleaves the predominant seminal vesicle protein. J Clin Invest, 1985. 76(5): p. 1899-903.
114. Thompson, I.M. and D.P. Ankerst, Prostate-specific antigen in the early detection of prostate cancer. CMAJ, 2007. 176(13): p. 1853-8.
115. Hoffman, R.M., Viewpoint: limiting prostate cancer screening. Ann Intern Med, 2006. 144(6): p. 438-40.
116. Croswell, J.M., B.S. Kramer, and E.D. Crawford, Screening for prostate cancer with PSA testing: current status and future directions. Oncology (Williston Park), 2011. 25(6): p. 452-60, 463.
117. Djulbegovic, M., et al., Screening for prostate cancer: systematic review and meta-analysis of randomised controlled trials. BMJ, 2010. 341: p. c4543.
118. Goessl, C., et al., Fluorescent methylation-specific polymerase chain reaction for DNA-based detection of prostate cancer in bodily fluids. Cancer Res, 2000. 60(21): p. 5941-5.
119. Bastian, P.J., et al., CpG island hypermethylation profile in the serum of men with clinically localized and hormone refractory metastatic prostate cancer. J Urol, 2008. 179(2): p. 529-34; discussion 534-5.
120. Chan, A.T. and E.L. Giovannucci, Primary prevention of colorectal cancer. Gastroenterology, 2010. 138(6): p. 2029-2043 e10.
121. Laurent-Puig, P., H. Blons, and P.H. Cugnenc, Sequence of molecular genetic events in colorectal tumorigenesis. Eur J Cancer Prev, 1999. 8 Suppl 1: p. S39-47.
122. Gryfe, R., et al., Molecular biology of colorectal cancer. Curr Probl Cancer, 1997. 21(5): p. 233-300.


123. Barrett, J., et al., Pathways to the diagnosis of colorectal cancer: an observational study in three UK cities. Fam Pract, 2006. 23(1): p. 15-9.
124. Hamilton, W., et al., Clinical features of colorectal cancer before diagnosis: a population-based case-control study. Br J Cancer, 2005. 93(4): p. 399-405.
125. McArdle, C.S. and D.J. Hole, Emergency presentation of colorectal cancer is associated with poor 5-year survival. Br J Surg, 2004. 91(5): p. 605-9.
126. Bretthauer, M., Colorectal cancer screening. J Intern Med, 2011. 270(2): p. 87-98.
127. Rabeneck, L., et al., Cancer Care Ontario guaiac fecal occult blood test (FOBT) laboratory standards: evidentiary base and recommendations. Clin Biochem, 2008. 41(16-17): p. 1289-305.
128. Walsh, J.M. and J.P. Terdiman, Colorectal cancer screening: scientific review. JAMA, 2003. 289(10): p. 1288-96.
129. Young, G.P. and S. Cole, New stool screening tests for colorectal cancer. Digestion, 2007. 76(1): p. 26-33.
130. Lofton-Day, C., et al., DNA methylation biomarkers for blood-based colorectal cancer screening. Clin Chem, 2008. 54(2): p. 414-23.
131. Lao, V.V. and W.M. Grady, Epigenetics and colorectal cancer. Nat Rev Gastroenterol Hepatol, 2011.
132. deVos, T., et al., Circulating methylated SEPT9 DNA in plasma is a biomarker for colorectal cancer. Clin Chem, 2009. 55(7): p. 1337-46.
133. Patterson, T.A., et al., Performance comparison of one-color and two-color platforms within the MicroArray Quality Control (MAQC) project. Nat Biotechnol, 2006. 24(9): p. 1140-50.
134. Heisler, L.E., et al., CpG Island microarray probe sequences derived from a physical library are representative of CpG Islands annotated on the human genome. Nucleic Acids Res, 2005. 33(9): p. 2952-61.
135. Smyth, G.K., Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol, 2004. 3: p. Article3.
136. Bolstad, B.M., et al., A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics, 2003. 19(2): p. 185-93.
137. Royo, J.L., M. Hidalgo, and A. Ruiz, Pyrosequencing protocol using a universal biotinylated primer for mutation detection and SNP genotyping. Nat Protoc, 2007. 2(7): p. 1734-9.
138. Chasapis, C.T. and G.A. Spyroulias, RING finger E(3) ubiquitin ligases: structure and drug discovery. Curr Pharm Des, 2009. 15(31): p. 3716-31.
139. Del Bene, F., K. Tessmar-Raible, and J. Wittbrodt, Direct interaction of geminin and Six3 in eye development. Nature, 2004. 427(6976): p. 745-9.
140. An, Q., et al., Detection of p16 hypermethylation in circulating plasma DNA of non-small cell lung cancer patients. Cancer Lett, 2002. 188(1-2): p. 109-14.
141. Huang, T.H., M.R. Perry, and D.E. Laux, Methylation profiling of CpG islands in human breast cancer cells. Hum Mol Genet, 1999. 8(3): p. 459-70.
142. Maunakea, A.K., et al., Conserved role of intragenic DNA methylation in regulating alternative promoters. Nature, 2010. 466(7303): p. 253-7.
143. Mattick, J.S. and I.V. Makunin, Non-coding RNA. Hum Mol Genet, 2006. 15 Spec No 1: p. R17-29.


144. Gejman, R., et al., Selective loss of MEG3 expression and intergenic differentially methylated region hypermethylation in the MEG3/DLK1 locus in human clinically nonfunctioning pituitary adenomas. J Clin Endocrinol Metab, 2008. 93(10): p. 4119-25.
145. Meurer, S., et al., AGAP1, a novel binding partner of nitric oxide-sensitive guanylyl cyclase. J Biol Chem, 2004. 279(47): p. 49346-54.
146. Xia, C., et al., GGAPs, a new family of bifunctional GTP-binding and GTPase-activating proteins. Mol Cell Biol, 2003. 23(7): p. 2476-88.
147. Ohta, M., et al., Clinical significance of the reduced expression of G protein gamma 7 (GNG7) in oesophageal cancer. Br J Cancer, 2008. 98(2): p. 410-7.
148. Rajkumar, T., et al., Identification and validation of genes involved in gastric tumorigenesis. Cancer Cell Int, 2010. 10: p. 45.
149. Zhang, Z., et al., MiR-185 targets the DNA methyltransferases 1 and regulates global DNA methylation in human glioma. Mol Cancer, 2011. 10: p. 124.
150. Shakya, R., et al., BRCA1 tumor suppression depends on BRCT phosphoprotein binding, but not its E3 ligase activity. Science, 2011. 334(6055): p. 525-8.
151. Kikuchi, J., et al., Distinctive expression of the polycomb group proteins Bmi1 polycomb ring finger oncogene and enhancer of zeste homolog 2 in nonsmall cell lung cancers and their clinical and clinicopathologic significance. Cancer, 2010. 116(12): p. 3015-24.
152. Hammarsten, J. and B. Hogstedt, Calculated fast-growing benign prostatic hyperplasia--a risk factor for developing clinical prostate cancer. Scand J Urol Nephrol, 2002. 36(5): p. 330-8.
153. Alcaraz, A., et al., Is there evidence of a relationship between benign prostatic hyperplasia and prostate cancer? Findings of a literature review. Eur Urol, 2009. 55(4): p. 864-73.
154. Anglim, P.P., et al., Identification of a panel of sensitive and specific DNA methylation markers for squamous cell lung cancer. Mol Cancer, 2008. 7: p. 62.
155. Tsou, J.A., et al., Identification of a panel of sensitive and specific DNA methylation markers for lung adenocarcinoma. Mol Cancer, 2007. 6: p. 70.
156. Saida, T., Recent advances in melanoma research. J Dermatol Sci, 2001. 26(1): p. 1-13.
157. Ray, M.E., et al., Isolation and characterization of genes associated with chromosome-6 mediated tumor suppression in human malignant melanoma. Oncogene, 1996. 12(12): p. 2527-33.
158. Rosenbaum, E., et al., AIM1 promoter hypermethylation as a predictor of decreased risk of recurrence following radical prostatectomy. Prostate, 2011.
159. Brait, M., et al., Aberrant promoter methylation of multiple genes during pathogenesis of bladder cancer. Cancer Epidemiol Biomarkers Prev, 2008. 17(10): p. 2786-94.
160. Loyo, M., et al., A survey of methylated candidate tumor suppressor genes in nasopharyngeal carcinoma. Int J Cancer, 2011. 128(6): p. 1393-403.
161. de Carvalho, F., et al., TGFbetaR2 aberrant methylation is a potential prognostic marker and therapeutic target in multiple myeloma. Int J Cancer, 2009. 125(8): p. 1985-91.
162. Begum, S., et al., An epigenetic marker panel for detection of lung cancer using cell-free serum DNA. Clin Cancer Res, 2011. 17(13): p. 4494-503.
163. Shiratsuchi, T., et al., Cloning and characterization of BAI2 and BAI3, novel genes homologous to brain-specific angiogenesis inhibitor 1 (BAI1). Cytogenet Cell Genet, 1997. 79(1-2): p. 103-8.


164. Kaur, B., et al., Brain angiogenesis inhibitor 1 is differentially expressed in normal brain and glioblastoma independently of p53 expression. Am J Pathol, 2003. 162(1): p. 19-27.
165. Zohrabian, V.M., et al., Gene expression profiling of metastatic brain cancer. Oncol Rep, 2007. 18(2): p. 321-8.
166. Dunbier, A.K., et al., ESR1 is co-expressed with closely adjacent uncharacterised genes spanning a breast cancer susceptibility locus at 6q25.1. PLoS Genet, 2011. 7(4): p. e1001382.
167. Ijichi, N., et al., Estrogen-related receptor gamma modulates cell proliferation and estrogen signaling in breast cancer. J Steroid Biochem Mol Biol, 2011. 123(1-2): p. 1-7.
168. McArthur, G.A., Molecular targeting of dermatofibrosarcoma protuberans: a new approach to a surgical disease. J Natl Compr Canc Netw, 2007. 5(5): p. 557-62.
169. Zhao, Y., et al., A potential role of collagens expression in distinguishing between premalignant and malignant lesions in stomach. Anat Rec (Hoboken), 2009. 292(5): p. 692-700.
170. Oue, N., et al., Gene expression profile of gastric carcinoma: identification of genes and tags potentially involved in invasion, metastasis, and carcinogenesis by serial analysis of gene expression. Cancer Res, 2004. 64(7): p. 2397-405.
171. Li, F.J., et al., FCRL2 expression predicts IGHV mutation status and clinical progression in chronic lymphocytic leukemia. Blood, 2008. 112(1): p. 179-87.
172. Nuckel, H., et al., FCRL2 mRNA expression is inversely associated with clinical progression in chronic lymphocytic leukemia. Eur J Haematol, 2009. 83(6): p. 541-9.
173. Wang, Z., et al., Mutational analysis of the tyrosine phosphatome in colorectal cancers. Science, 2004. 304(5674): p. 1164-6.
174. Acquaviva, J., R. Wong, and A. Charest, The multifaceted roles of the receptor tyrosine kinase ROS in development and cancer. Biochim Biophys Acta, 2009. 1795(1): p. 37-52.
175. Gu, T.L., et al., Survey of tyrosine kinase signaling reveals ROS kinase fusions in human cholangiocarcinoma. PLoS One, 2011. 6(1): p. e15640.
176. Rocnik, E.F., et al., The novel SPARC family member SMOC-2 potentiates angiogenic growth factor activity. J Biol Chem, 2006. 281(32): p. 22855-64.
177. L'Esperance, S., et al., Gene expression profiling of paired ovarian tumors obtained prior to and following adjuvant chemotherapy: molecular signatures of chemoresistant tumors. Int J Oncol, 2006. 29(1): p. 5-24.
178. Hibi, K., et al., Molecular detection of p16 promoter methylation in the serum of patients with esophageal squamous cell carcinoma. Clin Cancer Res, 2001. 7(10): p. 3135-8.
