VU Research Portal

Analysis of chromosomal copy number aberrations in gastrointestinal cancer Haan, J.C.

2014

document version Publisher's PDF, also known as Version of record

Link to publication in VU Research Portal

citation for published version (APA) Haan, J. C. (2014). Analysis of chromosomal copy number aberrations in gastrointestinal cancer.

General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.

• Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain • You may freely distribute the URL identifying the publication in the public portal ?

Take down policy If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

E-mail address: [email protected]

Download date: 27. Sep. 2021 Chapter 2

Candidate driver in focal chromosomal aberrations of stage II colon cancer

Rebecca PM Brosens* Josien C Haan* Beatriz Carvalho François Rustenburg Heike Grabsch Philip Quirke Alexander F Engel Miguel A Cuesta Nicola Maughan Marcel Flens Gerrit A Meijer Bauke Ylstra

* These authors contributed equally to this work

Journal of Pathology 2010 Aug;221(4):411-24 Chapter 2

ABSTRACT

Chromosomal instable colorectal cancer is marked by speci!c large chromosomal copy number aberrations. Recently, focal aberrations of 3Mb or smaller have been identi!ed as a common phenomenon in cancer. Inherent to their limited size, these aberrations harbour one or few genes. The aim of this study is to identify recurrent focal chromosomal aberrations and their candidate driver genes in a well de!ned series of stage II colon cancers and assess their potential clinical relevance. High resolution DNA copy number pro!les were obtained from 38 formalin !xed paraf!n embedded colon cancer samples with matched normal mucosa as a reference using array comparative genomic hybridization. In total, 81 focal chromosomal aberrations were identi!ed that harboured 177 genes. Statistical validation of focal aberrations and identi!cation of candidate driver genes was performed by enrichment analysis and mapping copy number and mutation data of colorectal-, breast-, pancreatic cancer and glioblastomas to loci of focal aberrations in stage II colon cancer. This analysis demonstrated a signi!cant overlap with previously identi!ed focal ampli!cations in colorectal cancer, but not with cancers from other sites. In contrast, focal deletions seem less tumour type speci!c since they also show signi!cant overlap with focal deletions of other sites. Focal deletions detected are signi!cantly enriched for cancer genes and genes frequently mutated in colorectal cancer. The mRNA expression of these genes is signi!cantly correlated with DNA copy number status, supporting the relevance of focal aberrations. Loss of 5q34 and gain of 13q22.1 were identi!ed as independent prognostic factors of survival in this series of patients. In conclusion, focal chromosomal copy number aberrations in stage II colon cancer are enriched in cancer genes which contribute to and drive the process of colorectal cancer development. DNA copy number status of these genes correlate with mRNA expression and some are associated with clinical outcome.

30 Candidate driver genes in focal chromosomal aberrations

INTRODUCTION

Colorectal cancer results from alterations in the genome including mutations and chromosomal aberrations that affect oncogenes and tumour suppressor genes.1 Array comparative genomic hybridization (array CGH) allows to study chromosomal aberrations at a genome-wide level and a high spatial resolution in large series of tumours.2 Many of these aberrations concern large chromosomal regions which makes it dif!cult to identify actual driver genes.3 Increased resolution4 has led to the detection of focal chromosomal aberrations, de!ned as 3Mb or smaller in size.5-8 Inherent to their size, focal aberrations contain a small number of genes, facilitating driver identi!cation.6,9 Frequently, identi!cation of candidate driver genes is done by integrating chromosomal copy numbers with genome-wide expression array data.3,10 An alternative approach for identifying 2 candidate cancer genes at loci of genomic alterations is to analyse chromosomal copy numbers in conjunction with mutations, consistent with Knudsons two hit model.6 Speci!c chromosomal copy number aberrations have been linked to speci!c (sub-) types of cancers, well-de!ned disease stages and clinical outcome.11 It has recently been demonstrated that speci!c chromosomal copy number aberrations correlate with progression from colorectal adenoma to adenocarcinoma and response to therapy in advanced colorectal cancer (CRC).3,12 However, the relevance of focal chromosomal aberrations and their potential prognostic value in CRC has remained largely unexplored. In the Netherlands 40% of all colon cancer patients present with stage II disease, according to the TNM system.13 Stage II colon cancer patients are primarily treated by resection of their tumour. About 20-30% of stage II colon cancer patients will relapse, but standard adjuvant therapy is not recommended. A better understanding of disease biology could help to identify stage II CRC patients at high risk of relapse who may bene!t from closer follow up or additional therapy. The present study aims to identify focal chromosomal aberrations in a series of stage II FFPE colon cancer samples and evaluate their potential clinical value.

METHODS

Patients and tumour material Forty-six patients operated between 1990 and 2000 for stage II colon cancer (pT3 or pT4, pN0, pM0, R0 TNM classi!cation, !fth edition13) were selected for this study, 20 with and 26 without relapse. Patient and tumour characteristics are listed in Table 1. Twenty- one patients underwent resection of their primary tumour at the John Goligher Colorectal Unit, Leeds General In!rmary (UK) and 25 patients at the Zaans Medical Centre (NL). Medical records of all patients were reviewed retrospectively to obtain clinical data, patient characteristics and follow-up data. None of the patients received postoperative

31 Chapter 2 chemotherapy or radiotherapy. Tumour relapse was de!ned as the occurrence of distant metastasis, con!rmed by ultrasound, CT scan and/or histology within 36 months. Histopathological classi!cation included TNM staging13, grade of differentiation and number of nodes assessed.

Of one additional patient with colon cancer, both a fresh frozen sample and an FFPE sample were included for technical validation purposes. All samples were used in compliance with the respective institutional ethical regulations for surplus material and use of material from Leeds General In!rmary (UK) was approved by the Leeds (West) research ethics committee, unique identi!er CA02/014. DNA isolations DNA was isolated from 46 stage II CRC FFPE samples as previously described.14,15 For each tumour an area containing at least 70% tumour cells was selected for DNA extraction and marked on an H&E stained section. Four to six 10µm adjacent sections were cut, deparaf!nised, and macro dissected. DNA was extracted using a column based method (QIAamp microkit; Qiagen, Hilden, Germany).14 For all FFPE samples, reference DNA was extracted from resection margins or normal colon mucosa of the same patient at least 1cm distance of the tumour (matched normal). Normality was con!rmed for each reference by comparison with the array signals of pooled DNA samples isolated from blood obtained of eighteen healthy males.16 DNA from the frozen tissue sample was isolated using the Wizard® Genomic DNA Puri!cation Kit (Promega, Benelux, Leiden, The Netherlands). This sample was hybridized against a blood derived pooled reference DNA as previously described.16 MSI status Samples were analyzed for microsatellite instability (MSI) using the MSI Analysis System, Version 1.1 (Promega, Madison, USA). PCR products were separated by capillary electrophoresis using ABI PRISM® 3130 Sequencer and output data were analyzed using GeneScan 3100 (Applied Biosystems, Foster City, CA, USA). Samples with instability in two or more markers were classi!ed as MSI, samples with either no or one instability marker were classi!ed as microsatellite stable (MSS). DNA copy number analysis by array CGH Labelling was performed as previously described16 and was identical for FFPE and frozen samples, as well as Agilent and Nimblegen array platforms. CGH arrays for the 46 stage II samples contained 45,220 in situ synthesized 60-mer oligonucleotides representing 42,494 unique chromosomal locations evenly distributed over the genome (4x44k array, AMADID number 014950, Agilent Technologies, Palo Alto, CA, USA).

32 Candidate driver genes in focal chromosomal aberrations

Table 1. Patient and tumor characteristics of 46 stage II colon cancers

Mean Age (years) (range) 74.5 yrs (48.5-91.3) Mean number of nodes assessed (range) 8.2 (1-22) Sex Female 26 Male 20 MSI status MSI 14 MSS 32 Tumour location Right sided* 31 2 Left sided* 15 Depth of tumour invasion (pT) T3 43 T4 3 Differentiation Poor 6 Moderate 40 * Right –sited includes caecum, ascending colon, transverse colon; left-sited includes splenic !exure, descending colon, sigmoid colon. No rectum included.

Technical validation Technical validation of the performance of array CGH on FFPE was performed in two ways. First, DNA of one FFPE case that showed multiple focal aberrations on the Agilent platform was also analyzed on a 135K Nimblegen array (12x135k WG-T array) containing 134.937 unique in situ synthesized 60-mer oligonucleotides (Roche Nimblegen, Madison, USA). Second, of an additional case of which both fresh frozen and FFPE tissue was available, DNA of both samples was hybridized on 105K Agilent arrays (2x105k array, AMADID number 019015) containing over 99,000 unique in situ synthesized 60- mer oligonucleotides. Hybridizations were performed according to the manufacturer’s protocols for respectively Agilent or Nimblegen. Images of the arrays were acquired using a microarray scanner G2505B (Agilent technologies). Gene expression microarrays For 19 stage II patients fresh tumour tissue was available and used for RNA isolation (Ambion Inc., Austin, Texas, USA), 6 relapse, 11 no relapse and 2 unknown. Synthesis and cDNA labelling was performed according to manufacturers recommendations including QC (Agilent Technologies). Hybridization was performed on single 44K formatted expression arrays containing 41675 60-mer oligonucleotides representing over 27.000 well characterised full length or partial human genes and expressed sequence tag clusters (G4112A, Agilent Technologies).

33 Chapter 2

Statistical analysis Array CGH data analysis Image analysis was performed using feature extraction software version 9.1 for the Agilent arrays and NimbleScan version 2.5 for the Nimblegen array. Oligonucleotides were mapped according to the build NCBI36 (March 2006). Quality assessment of array CGH pro!les was based on the median-absolute deviation (MAD) of the log2 ratio of a without breakpoints.14 Pro!les with MAD values ≥ 0.20 were classi!ed as poor quality and excluded from further analysis. Of the 46 stage II samples, 38 array CGH pro!les passed this quality control (18 with and 20 without relapse). Data is made publicly available in GEO (accession number GSE17047). Downstream analysis was done using the statistical computing language R, version 2.6.1 (http://www.r-project. org). Chromosomal copy number aberrations were identi!ed using the package CGHcall17 with cellularity set to 0.7 and median normalization. Focal chromosomal aberrations were de!ned as regions less than 3 Mb as previously de!ned.6,8,9,18 Furthermore, only those focal deletions are taken into account that were detected 2 or more times and focal ampli!cations if they spanned 3 or more probes. Reason for using different criteria for focal deletions and focal ampli!cations, lies in the fact that these have different characteristics. Theoretically deletions can measure two copies while ampli!cation can be much higher. In the present series, the average DNA copy number ratio for focal deletions was -0.67, while for the focal ampli!cations the average amplitude was more than twice as high at 1.46. Statistical validation of focal aberrations and putative driver gene identi!cation To validate the focal aberrations, we tested whether they were enriched for loci of previously published focal aberrations in colorectal, breast, pancreatic cancer and glioblastoma.5-7 To this end enrichment analysis19 was implemented whereby 10.000 sets of simulated focal aberrations were randomly generated throughout the genome, with the same amount and length as the stage II focals, and at least one Agilent probe mapping within the region. Overlap was determined of either the randomly selected regions or the published focal aberrations with the published focal aberrations and signi!cance of enrichment expressed as a p-value. Identi!cation of candidate driver genes was performed using the same enrichment calculations. Enrichment of stage II focal aberrations or the randomly selected focal aberrations was calculated for published recurrent mutations5,7,20 or genes of the Cancer Census list.21 Statistical signi!cance of enrichment19 of overlap with these driver genes was again expressed as a p-value. The UCSC table browser data retrieval tool was used to obtain the list of overlapping UCSC genes within the experimental and simulated sets of focal aberrations.22 Where necessary the Batch Coordinate Conversion (liftOver) tool from the UCSC Genome Bioinformatics resource (http://genome.ucsc.edu/cgi-bin/hgLiftOver) was used to remap regions to the NCBI36 March 2006 freeze.23,24

34 Candidate driver genes in focal chromosomal aberrations

Correlation of focal aberrations with clinical data The relationship between any of the recurrent stage II focal aberrations, clinical variables age > or < 70 years, right or left-side tumour localisation, T-stage, number of nodes assessed (< or > 10), differentiation grade, MSI-status and gender with either overall survival (OS) or disease free survival (DFS) was calculated using univariate survival analysis with log rank statistics. To determine the independent effects of clinical variables and recurrent focal aberrations, multivariate Cox proportional hazard analysis was performed including only the signi!cant variables from the univariate survival analysis. To determine the association of DNA copy number alterations at loci of focal aberrations with survival or clinical variables, DNA copy number changes due to either focal aberrations or large chromosomal copy number aberrations overlapping with loci of focal aberrations were included. OS and DFS were de!ned as time from surgery to date of death due to any cause or to date until 2 relapse of tumour, respectively. Survival analysis was performed using Statistical Package for Social Sciences version 15.0 (SPSS, Inc., Chicago, IL). Gene expression in focal chromosomal aberrations Scanning and feature extraction of the expression arrays were performed using scanner version G2505B and feature extraction version A.7.5.1 (Agilent Technologies). Data is made publicly available in GEO (accession number GSE17047). The R-package LIMMA was used for background correction according to the Edward method25 and within array Lowess and between array quantile normalization according to Smyth and Speed.26 Expression values of candidate driver genes in or overlapping with the loci of focal aberrations were compared between tumours with and without DNA copy number aberrations of these genes. Of the 177 candidate driver genes in focal aberrations (Tables 2 and 3) 103 could be directly matched by name to the gene names of the Agilent expression arrays and were included in this analysis. The log 2 mRNA expression ratio in a focal aberrant region was rescaled from 0 to 100 per gene and statistical signi!cance of difference in average expression of these 103 genes between cases with normal copy number status and losses or gains, respectively, was calculated using a Wilcoxon rank test.

35 Chapter 2

RESULTS

Chromosomal copy number aberrations in FFPE stage II CRC samples An overview of large (> 3Mb) chromosomal copy number aberrations in the 38 stage II colon cancer samples is shown in Figure 1A. Most frequent aberrant regions were gains of chromosome 7, 8q, 13, 20 and deletions of chromosome 8p, 15, 17p and 18q, which is consistent with earlier observations. 3,27 Focal chromosomal copy number aberrations in stage II CRC patients Array CGH of 38 tumours yielded high quality DNA copy number pro!les that allowed for reliably detecting focal aberrations, typical examples of which are given in Figure 2. In total 176 focal chromosomal aberrations were detected in 81 genomic loci, comprising 47 deletions and 34 ampli!cations regions spread across the genome (Figure 1B, Tables 2 and 3). On average, tumours showed 3.7 focal deletions (range 0-19), containing on average 17 genes (range 1-130). The frequency of focal ampli!cations was considerably lower than that of focal deletions. Tumours had on average 0.95 (range 0-9) focal ampli!cations, containing on average 13 genes (range 1-50). a

b

Figure 1. Frequency plots of large scale and focal chromosomal aberrations in 38 stage II colon cancers. On the horizontal axis are plotted as alternating white and grey bands. Centromers are shown as dotted grey vertical lines. On the vertical axis frequencies of aberrations are shown and MSI and MSS tumours are depicted separately; dark colours are MSS, grey colours MSI. (a) frequency of the combined large scale and focal chromosomal aberrations and (b) frequency of focal chromosomal aberrations, de!ned as 3Mb or less in size.

36 Candidate driver genes in focal chromosomal aberrations 4 14 3 12 2

HLA- TRIM46 HOXA9 4 4 4 DVL3 C9orf23 HIST1H4J 4 1 1 1B1 3,14 BTNL2 H 2,4 HOXA6 HTR3E NC13B 2 ADAM15 4 U 1,

ST1 2,4 1 1 1,3,14 2 G O HI

1 C1 2 F4 2,4 HIST1H4I I focal deletions §,¶ focal M PIG 2 HOXA4 NOTCH4

E

4 1 MU 2B

Candidate driver genes at Candidate driver 2,11 3,4 1,14 1,4 C1orf2 1 4 HSD17B8 4 2 1,5

H 1 G 8 TLN1 3C IP 12 14 A1 R ST1 M T BX HI O HIST2H2AB H SLC39A7 HOXA3 ADAM18 ZMAT4 FANC P FEV GNAI2 USH2A THBS3 HIST1H4L VCP DRB5 2 ¶

II 5 7,8 11 11 11,12 Overlap of Stage Overlap focal deletions with focal published focals # § published focals FS* 0.6 0.2 0.2 0.008 0.2 0.02 0.4 0.2 0.2 0.2 0.9 0.02 0.4 0.07 0.2 0.7 0.2 0.4 0.2 0.07 D Log-rank Log-rank 0.3 0.08 0.1 0.01 0.2 0.02 0.3 0.07 0.3 0.1 0.8 0.02 0.2 0.09 0.3 0.9 0.3 0.2 0.1 0.09 OS* Log-rank Log-rank 3 3 0 0 0 2 1 1 0 1 1 0 1 0 2 1 2 1 1 1 Freq Freq expr 3 2 2 2 4 2 3 3 2 2 4 2 4 2 2 2 5 2 12 10 Freq 9 6 6 5 6 1 5 3 2 3 3 16 23 39 16 12 53 26 13 22 of No genes 8.7 b) 86.8 76.1 43.9 50.6 64.3 37.4 19.5 41.1 K Size 238.5 522 197.5 438.9 240.5 318 201.2 131.6 930.5 ( 1018.4 1219.3 b) 752 104 K End ( 27969 26233 59022 33286 27171 39700 40744 35820 50384 50641 31665 148126 185783 162851 144189 153490 219648 214745 62 b) inimal affected region 554 K M ( Start 27883 26140 58784 32268 27107 39262 40504 34600 50252 50621 31656 148082 185261 162800 144152 153172 219447 213814 Focal deletions in stage II colon cancers and candidate driver genes therein. 6p22.2 6p22.1 1p32.2 1q21.2 3q27.1 5q34 6p21.32 Chr 1p36.33 7p15.2 8p11.22 8p11.21 9p13.1 1q21.1 1q22 2q35 3p21.31 3p21.31 4p16.3 6p21.33 1q41 Table 2. 2. Table

37 Chapter 2 4 3,4 3 3

4 MVP 4 TNK1 4 3,4 PARP2 SPN 16 POLR2A 3,4 4 4 RNF40 GPS2 2,4 4 TGM5 3 2 HRAS CD19 3 2 OR4L1 ZBTB4 4 4 ITGAL 2 1,4,5,6,8,14 ATP1B2 2,14

4 focal deletions §,¶ focal DULLARD EPS8L2 TMEM62

4 RABEP2 TEN OR4N2 FUS 1,3 1,2 NLGN2 P 1 2 Candidate driver genes at Candidate driver 3,4

4 1 1 15 15 1 P 4 1 15 IFIT3 SEPHS2 14 1 2,4 14 1 3 4 2,3 1,2,3,4,7,8,14 K D M 4 IPR 53B 53 R L P P G A2BP1 WWOX STA LIPM MGMT TMEM16J O MTA1 T ATP2A1 ATA NXN CENTB1 T HEI10 TBX6 PLSCR3 SENP3 ZNF646 ¶

II , 11 , 10 ,9 ,9 ,6,8,9 5 6 5 6, 9 5 6 7 7,8 11 Overlap of Stage Overlap focal deletions with focal published focals # § published focals FS* 0.9 0.7 0.5 0.7 0.6 0.4 0.8 0.003 0.3 0.9 0.9 0.9 0.5 0.6 0.7 0.5 0.9 0.7 0.03 D Log-rank Log-rank 0.9 0.7 0.3 0.8 0.7 0.5 0.6 0.004 0.1 0.6 0.8 0.8 0.8 0.7 0.2 0.8 0.8 0.5 0.002 OS* Log-rank Log-rank 0 4 0 2 2 1 0 0 1 2 0 1 0 2 0 0 0 0 0 Freq Freq expr 2 9 2 2 2 2 4 3 2 2 2 3 2 2 3 2 6 5 2 Freq 6 1 1 1 1 2 2 9 4 5 6 21 51 24 12 15 34 62 of No 130 genes 0.5 2.6 6.7 b) 31.5 16.1 61.6 46.7 K Size 227.3 940.1 195.4 639.8 632.5 222.9 118.7 976.3 339.7 ( 1170.4 2649.8 1373 b) 827 6368 1025 7426 7517 K End ( 75248 77251 17798 91166 54799 19998 41487 41683 31104 89616 55262 24208 131451 105024 49 b) inimal affected region 187 6368 7086 7471 K M ( Start 74078 77024 17767 90226 54797 19365 41264 41676 28454 89497 55200 22835 131256 105008 Continued 12q21.1 16p13.3 16q23.1 Chr 10p12.33 10q23.31 10q26.3 11p15.5 12q11.2 14q11.2 14q32.33 15q15.3 15q15.3 16p11.2 10q23.31 16q12.2 17p13.3 17p13.1 17p13.1 17q11.2 Table 2. Table

38 Candidate driver genes in focal chromosomal aberrations

4

3,4 4

SPACA4 1 cations in S ! K PRR12 3 KCNA7 4 TS

;9: Homozygous 7 1,3 ; 13: Ampli ; 13: 3,4 5 SLC17A7 4 PLEKHA4 3 ;5: Homozygous deletions in 7 focal deletions §,¶ focal LOC126147 CPT1C 1 4 TEAD2 1 Candidate driver genes at Candidate driver 1 1,3,7 4 2 1 4 1 RASIP1 P P 4 D 32 A P CB S M D SETB U S FLJ10490 S TRPM4 FUT2 BCL2L12 2 ¶

II cations in pancreatic cancer ! ,11 ;8: Homozygous deletions in glioblastoma 5 ,6 5 7,9 10 ;4: Mutated genes in glioblastoma ;12: Ampli ;12: 5 Overlap of Stage Overlap 6 focal deletions with focal published focals # § published focals FS* 0.6 0.4 0.2 0.6 0.7 0.08 0.02 0.02 D Log-rank Log-rank 0.2 0.1 0.2 0.2 0.5 0.1 0.03 0.03 OS* cations in breast cancer Log-rank Log-rank ed by: ! ! 2 2 1 2 2 0 0 0 Freq Freq expr ;11: Ampli ;11: ; 3: Mutated genes pancreatic cancer 6 20 2 5 2 2 2 2 2 2 Freq ; 7: Homozygous deletions in pancreatic cancer 6 2 2 1 3 2 2 17 of No 108 genes . 15: Single gene located within focal aberration. ¶ Genes and numbers in bold have been described in CRC. ¶ Genes and numbers in bold have been described CRC. 15: Single gene located within focal aberration. . cations in CRC cations in CRC 21 ! b) 22 49.6 11.5 K Size 806.6 119.8 963.1 611.2 ( 1430.9 ; 10: Ampli ; 10: b) 29 1586 3013 K End ( 40536 14892 55632 38755 46814 55081 ;2: Mutated genes breast cancer 20 b) inimal affected region 975 3001 K M ( Start 39729 14772 55610 37792 46765 53650 ; 14: Cancer Census genes 7 Continued ;6: Homozygous deletions in breast cancer 6 18q12.3 20p12.1 Chr 17q23.2 18q12.3 18q21.1 19q13.33 20p13 20p13 Chr.= Chromosomal band; OS= Overall Survival; DFS=Disease-free survival. Freq = number of focal aberrations in the entire dataset; Chromosomal band; OS= Overall Survival; DFS=Disease-free survival. Chr.= expression arrays ran for with that aberration. P-values not corrected for multiple testing. *Log-rank statistic, indicated by number. found in published datasets, #Overlap with focal copy number data, § Focal chromosomal aberrations and candidate driver genes identi 1:Mutated genes CRC CRC Table 2. Table glioblastoma deletions in human cancers

39 Chapter 2

3

2 1,4 2,4

3 RARRES2

4 1 NOS3 1 11,14 2 2 3 HOOK3 4 P 2 RGS22 A 14 K ABP1 4 GIMAP7 A 14 DDHD

CCND1 RFC2 cations §,¶ 2,4 14 3 ! KIAA1862 IKBKB 4 1,11 2 3,14 3,4 4 COX6C 124 4 FGFR1 ampli ELN 14 GIMAP5 2 CCND2 1 2,4 GIMAP8 GPR ABCC11 14 2,4 4 3

2 ACRBP MRGPRD 4 2 ZNF425 MYST3 4 2 2 OSR2 1,2 1 T1 Candidate driver genes at focal genes at focal Candidate driver 2 1,3 4 4 2 B HI H LF5 ABCB8 WHSC1L1 REPIN1 GIMAP1 PHK CHD4 DACH1 K MYH11 PARP11 MTL5 KCNQ3 IGF2 POP1 WBSCR28 ZN EZ ANK1 PROSC PPBP

II ¶ # § cations with ! 10 12 11 6,13 11 11 11 Overlap of stage Overlap ampli published focals data published focals OS* 0.2 0.8 0.2 0.06 0.06 0.7 0.5 0.8 0.4 0.5 1.0 0.9 0.6 0.3 0.9 0.3 0.4 Log-rank Log-rank FS* 0.2 0.9 0.1 0.05 0.06 0.7 0.3 0.5 0.3 0.4 1.0 0.9 0.8 0.7 0.5 0.7 0.4 D Log-rank Log-rank 1 0 0 1 1 0 0 0 1 0 0 1 0 0 0 0 0 Freq Freq expr 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 Freq

9 6 1 5 5 2 4 of 13 17 13 21 17 24 14 48 21 26 No genes b) 96 38.2 56.3 K Size 143.7 796.5 350.7 158.3 ( 1349 2396.2 2689.2 1147.1 1358.2 2374.3 1207 2159.7 1465.8 1474.3 b) K 6626 4624 2149 End ( 47178 72491 72630 18454 69569 74124 38826 43097 38448 75122 133533 101234 100936 150358 cations in stage II colon cancer patients and candidate driver genes therein. ! inimal affected region b) K M 6482 3477 2110 ( Start 45829 70095 72534 15765 68211 98860 72917 38668 41631 36973 75066 132737 100586 148198 Focal ampli Chr 16p13.11 12p13.31 13q21.33 13q22.1 16p13.11 12p13.32 11q13.3 8q24.22 11p15.5 8q22.2 7q11.23 7q22.1 7q36.1 8p11.23 8p11.21 8p12 4q13.3 Table 3. 3. Table

40 Candidate driver genes in focal chromosomal aberrations

1 3 CACNB1

14 14 PSMD3 RARA cations §,¶ 3 LASP1 ! 4 10,11,14 2,4 WIRE ampli 4 BB2 R PCGF2 4 E CHD9 1

14 3 14 Candidate driver genes at focal genes at focal Candidate driver STAC2 ACACA MLLT6 RAPGEFL1 SLC4A1 CYLD SNRPB21 OTOR4 PCSK21,3,4 OTOR4 NKX2-23 FOXA24 CST42 FOXA24 HM132,10 BCL2L14 TPX21 DUSP153 XKR72 HCK3 TPX21 DUSP153 TM9SF43,4 CBFA2T23 SULF22,3 PREX13 ARFGEF22 CSE1L1 SULF22,3 PREX13 ZNF2171 2

II ¶ # § cations with ! ,11 10 7, 10 10 5 6 10,12 10 10 10 Overlap of stage Overlap ampli published focals data published focals OS* 0.9 0.7 0.8 0.8 0.8 0.2 0.5 0.5 0.7 0.7 0.7 0.2 0.1 0.2 0.2 0.2 0.1 Log-rank Log-rank FS* 0.9 0.4 0.5 0.5 0.5 0.2 0.6 0.6 0.8 0.7 0.7 0.2 0.1 0.2 0.2 0.2 0.09 D Log-rank Log-rank 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 Freq Freq expr 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 Freq

1 7 1 4 2 1 5 of 50 13 16 12 10 23 13 18 10 10 No genes b) 90.1 K Size 263.7 435 335.3 350.5 470.5 856 721 421 733.6 817.1 142.8 ( 1677.2 2536.8 1225.6 1309.7 1443.4 b) K 1868 End ( 33242 35404 35763 39928 51762 11428 16680 17402 22316 23763 29773 30526 31659 47163 51994 62135 inimal affected region b) K M 1604 ( Start 32807 33727 35428 39578 49225 10957 15824 16681 21091 22453 29352 29793 31569 45720 51177 61992 Continued Chr 20p13 17q12 17q12 17q21.2 17q21.31 16q12.2 20p12.2 20p13 20p12.1 20p11.22 20p11.21 20q11.21 20q11.21 20q11.22 20q13.12 20q13.2 20q13.33 Table 3. Table 2 Table Legends: see

41 Chapter 2

Technical validation of focal aberrations in FFPE samples Technical validation of focal aberrations in the FFPE samples was performed in two manners. First, for one of the 38 stage II colon cancers (sample 26), DNA isolated from FFPE was also hybridised on an array CGH platform of an independent manufacturer with a completely different probe design, i.e. the Nimblegen 135K array. In Figure 2, is shown with multiple large and focal chromosomal aberrations. Five of 7 focal aberrations were detected on both platforms. Focal deletion 2d, containing p53, was only detected by the Agilent platform, focal ampli!cation 2j, containing gene NPEPPS, was only detected by the Nimblegen platform. The discrepancy can be explained by the lack of probes on either platform for the particular chromosomal locations; i.e. no p53 probes are present on the Nimblegen 135K platform and only one probe was present for NPEPPS on the Agilent 44K platform. Secondly, for one additional colon cancer DNA was isolated separately from both an archival FFPE tissue block and a fresh frozen tissue sample, respectively. Both samples were hybridised to the same type of array CGH to verify if particularly the focal aberrations detected in the FFPE sample would also be detected in the frozen sample, and not just be artefacts.28 In fact, large and focal aberrations detected with DNA isolated form the fresh sample were identical to those observed with the DNA from the FFPE samples. A representative example of a focal deletion detected with either procedure is given in Figure 2. Statistical validation of focal aberrations using independent datasets Focal aberrations have previously been published for colorectal, breast, pancreatic cancer and glioblastoma.5-7 In addition, driver mutations have been mapped for these tumours by massively parallel sequencing.5,7,20 Finally, a list of human oncogenes and tumour suppressor genes has been published, designated the Cancer Census gene list.21 Of the 47 focal deletions in stage II CRC, 12 overlapped with published focal deletions in colorectal, breast, pancreatic cancer and glioblastoma,5-7,29 !ve of which overlapped with homozygous deletions previously reported in CRC.6 Of the 34 focal ampli!cations, 14 overlapped with published focal ampli!cations found of which 8 have previously been described for CRC.6 To validate focal chromosomal aberrations detected in stage II CRC we statistically tested if they were enriched for published focal aberrations (Table 4). The focal deletions identi!ed in stage II CRC displayed a signi!cant overlap with the homozygous deletions identi!ed in CRC by Leary et al. (p-value =0.007). They also showed signi!cant overlap with those identi!ed in glioblastoma (p-value = 0.04) and pancreatic cancers (p-value = 0.03) and showed a trend to overlap with focal deletions identi!ed in breast cancer (p-value =0.06). Overlap of focal deletions identi!ed in stage II CRCs with ampli!cations reported in published datasets was not signi!cant (p-value =0.3) (Table 4), thereby functioning as a negative internal control of the enrichment calculations. Focal ampli!cations seem to be much more tumour type speci!c compared to focal deletions, since a high and signi!cant overlap was observed with the ampli!cations published for the CRC dataset (p-value <0.001), but not with the other tumour sites.

42 Candidate driver genes in focal chromosomal aberrations

Known tumour suppressors genes, such as PTEN (Figure 2c), TP53 (Figure 2e), SMAD4 and OMA1 were located within the focal deletions detected. A focal deletion at chromosome 6p22.2 was most frequent, occurring in 12 of 38 (31%) patients, harbouring a cluster of HIST1H genes. Recurrent mutations of HIST1H genes have been described for colorectal, breast and pancreatic cancer5,20 and it was included in the Cancer Census list.21 The next most common focal deletion was identi!ed at 16p13.3, occurring in 9 of 38 (24%) patients. This focal deletion harbours one single gene, i.e. Ataxin-2-binding- -1 (A2BP1) (Figure 2a). Other, less frequent focal deletions also contained only one gene, like those at the loci of MGMT, WWOX, ZMAT4, STAM and USP32 (Table 2). Focal ampli!cations were identi!ed in genomic regions containing genes known to be involved in tumourigenesis, like FGFR1, CCND1 (Figure 2d), CCND2 and ERBB2 (Figure 2i). The two most frequent focal ampli!cations occurred in 2 of 38 patients (5%); one at 2 13q22.1 which harboured one single gene, KLF5 and one at 20q13.2 which harboured 5 genes, TSHZ2, AK056432, LOC391257, BCAS1 and ZNF217 (Table 3). To validate if focal aberrations in stage II CRC are enriched for frequently mutated genes or genes in the Cancer Census list5-7,21,29 we again performed enrichment analysis.19 These calculations showed a signi!cant overlap of genes located in focal deletion of stage II CRC with published driver mutations in CRC (p-value < 0.001) and other cancer types as well as with genes in the Cancer Census list (p-value = 0.01) (Table 4). Finally, a trend towards overlap of ampli!cations with genes in the Cancer Census list was observed (p-value = 0.05), emphasizing the signi!cance of focal ampli!cations in cancer development and progression (Table 4).

43 Chapter 2

44 Candidate driver genes in focal chromosomal aberrations

Figure 2. (as shown on the left) Representative examples of focal chromosomal aberrations of 4 stage II colon cancer patients. All array data are available in GEO. Sample 4, chromosome 16 has 2 focal deletions (a) containing A2BP1 and (b) containing ZFHX3. Sample 18, chromosome 10 has one focal deletion (c) containing PTEN. Sample 26 chromosome 17 has 5 focal aberrations; 2 focal deletions (d) containing TP53 and (e) MAP2k3 and 3 ampli!cations; containing (g) ACACA, (h) ERBB2 and (i) SLC4A1, respectively. Sample 26 was performed on two different array CGH platforms, i.e.; Agilent and Nimblegen; (f) was not recognized on the Agilent platform as a focal ampli!cation and (j) was not recognized by the Agilent platform by CGHcall for reasons explained by the technical design of the arrays (see text).17 Focal deletion (b) and (e) occurred only once in this dataset, were thus not counted as recurrent deletions and hence not listed in Table 1. Sample 50 is an additional colon cancer of which both frozen and FFPE material was hybridised to Agilent arrays. This case had one focal deletion on chromosome 4 (k) containing FAM190A that was detected in both the fresh and the FFPE sample. On the Y-axis the log2 tumour to normal ratio. On the X-axis chromosomal position in bp. 2 Centromers are visualized as dotted grey vertical lines and chromosomal ideograms are displayed in the top of the graphs.

Table 4. Signi!cance of enrichment of focal chromosomal aberrations identi!ed in 38 stage II colon cancers with published focal deletions and ampli!cations for colon, breast, lung, and pancreatic cancer and glioblastoma; genes of the Cancer Census list and homozygous deletions of the Cancer Genome project. Signi!cance of enrichment is expressed in p-value.

Published data § Focal deletions Focal ampli!cations Cancer Census list 21 0.01 0.05

Mutations Colorectal 20 <0.001 0.2 Breast 20 0.03 0.2 Pancreas 5 0.04 0.7 Glioblastoma 7 0.001 0.7

Homozygous deletions Colorectal 6 0.007 0.8 Breast 6 0.06 0.8 Pancreas 5 0.03 0.8 Glioblastoma 7 0.04 1.0 Cancer Genome project 29 0.004 1.0

Ampli!cations Colorectal 6 0.3 <0.001 Breast 6 0.2 0.4 Pancreas 5 0.7 0.5 Glioblastoma 7 1.0 0.8

§ Signi!cant correlations are highlighted in bold, see method section for p-value calculations

45 Chapter 2

Biological validation of focal aberrations Alterations in expression of one or few genes impose selection advantage for DNA copy number changes in tumours.6 Gene dosage effects can therefore be used for biological validation of focal aberrations. To this end, RNA expression microarray analysis was performed for those samples of which fresh frozen material was available. For reasons of statistical power, gene dosage effects of focal aberrations were not evaluated for individual genes, but data were pooled for 103 genes. Expression of each of these genes was scaled from 1 to 100 to achieve normalization across genes. Tumours with DNA copy number loss at loci of focal deletions had signi!cantly lower mRNA expression of the genes involved than tumours with normal DNA copy number (Figure 3, P= 0.008). Similarly, tumours with DNA copy number gains at loci of focal ampli!cations had signi!cantly higher mRNA expression of the genes involved than tumours with normal DNA copy number (Figure 3, P< 0.001).

Figure 3. Box plot of mRNA expression values of candidate driver genes on regions with focal deletions, focal ampli!cations and normal DNA copy number status. On the Y-axis rescaled mRNA expression levels are given from 0 to 100 and status of the focal aberrations on the X-axis.

Focal aberrations have implications for survival Because detailed clinical follow-up was available for all patients, the clinical relevance of focal aberrations could be assessed. This analysis identi!ed 7 focal deletions at 1q21.2, 5q34, 9p13.1, 12q11.2, 17q11.2, 20p13 and one focal ampli!cation at 13q22.1 to be signi!cantly associated with poor overall survival (OS). These focal deletions were also signi!cantly related to disease-free survival (DFS), except the ampli!cation at 13q22.1 (Tables 2 and 3). No signi!cant associations with overall or disease free survival were observed for any of the clinical variables. Cox proportional hazard analysis showed that both the focal deletion at 5q34 and the ampli!cation at 13q22.1 were independently associated with poor OS in this series of patients (p-value 0.02 with hazard ratio 4.7 and p-value 0.05 and hazard ratio 3.1 respectively), the 5q34 deletion was also independently associated with poor DFS (p-value 0.009 and hazard ratio 5.5).

46 Candidate driver genes in focal chromosomal aberrations

DISCUSSION

One hundred seventy-six focal chromosomal aberrations spread over 81 genomic loci were identi!ed in 38 stage II colon cancer patient samples. Using matched normal DNA ensured that array CGH pro!les were devoid of germ-line copy number variations,30 so that the focal events detected were genuine somatic aberrations. All focal aberrations were called as such by the CGHcall algorithm17 and stood out in the individual pro!les on visual inspection. Technical as well as statistical validation by enrichment analysis,19 using previously published focal copy number aberrations in colorectal and other cancers, practically excluded that these !ndings would be due to noise. The previously published data sets form an independent source of focal aberrations, since they were generated using different methods, namely single channel SNP array analysis and digital 2 karyotyping, performed at a different time, place, on different samples by different research groups.5-7 The overlap of focal aberrations in stage II colon cancer was most signi!cant with previously reported focal aberrations in CRC. Focal deletions also showed signi!cant overlap with focal deletions previously detected in glioblastoma, breast and pancreatic cancers, but ampli!cations seem more tumour speci!c. No distinction between homozygous and heterozygous focal aberrations was was made in the present study since tumour tissue samples, rather than cultured tumour cells,6 were used for the experiments. Culturing cells, prior to DNA isolation, ensures that the isolates will be devoid of normal cells, crucial for the distinguishing heterozygous and homozygous deletions. On the other hand, using DNA from FFPE tumour tissue from large archives with extensive clinical follow up allowed for selecting a stage II colon cancer series with detailed follow up. Using optimized DNA isolation protocols and strict quality criteria for array CGH data facilitated detection of descript focal aberrations that went undetected with previous platforms like BAC arrays.3,6 One hypothesis at the onset of this study was that focal aberrations could point to driver genes in chromosomal regions that so far mainly had been found to show larger gains or losses in colon cancer, complicating the identi!cation of driver genes. Indeed, in the present study focal deletions were highly enriched for genes frequently mutated in CRC, glioblastoma, breast, pancreatic cancers. Partly, focal aberrations were con!rmatory in this respect, e.g. focal aberrations containing TP53, SMAD4 or ERBB2. On the other hand, for a number of chromosomal regions that commonly show losses (e.g. 10p, 14q, 15q) or gains (e.g. 20q) focal aberrations certainly yielded candidate driver genes. Examples are PTEN on 10p and other genes listed in Table 2. In addition, mRNA expression of these candidate driver genes was signi!cantly associated with DNA copy number status, underlining the biological relevance of these aberrations.

47 Chapter 2

Moreover, some of these focal aberrations were correlated to clinical outcome in the setting of stage II colon cancer. In this series of patients, 5q34 was an independent prognostic factor for poor OS and DFS. This chromosomal region was previously found to correlate with worse prognosis in CRC.31 and two independent studies reported deletion of 5q in stage II CRC to be associated with liver metastasis.32,33 In conclusion, part of the relevant DNA copy number changes in stage II colon cancer are in the order of focal chromosomal aberrations, affecting mRNA expression of well known cancer genes at these loci, that consequently are candidates for driving these focal aberrations, but perhaps also for driving larger chromosomal aberrations commonly observed in colon cancer.

Acknowledgment The research presented in this article is !nancially supported by the Dutch Cancer Society, KWF-2007-3974, the Gastrointestinal Surgery research Foundation of the VU Medical Centre and the Translational Research Foundation of the VUmc Cancer Centre Amsterdam.

48 Candidate driver genes in focal chromosomal aberrations

REFERENCES

1. Fearon ER, Vogelstein B. A genetic model for colorectal tumorigenesis. Cell 1990;61(5):759-67. 2. Pinkel D, Albertson DG. Array comparative genomic hybridization and its applications in cancer. Nat Genet 2005;37 Suppl:S11-S17. 3. Carvalho B, Postma C, Mongera S et al. Multiple putative oncogenes at the chromosome 20q amplicon contribute to colorectal adenoma to carcinoma progression. Gut 2009;58(1):79-89. 4. Ylstra B, van den Ijssel P, Carvalho B et al. BAC to the future! or oligonucleotides: a perspective for micro array comparative genomic hybridization (array CGH). Nucleic Acids Res 2006;34(2):445-50. 5. Jones S, Zhang X, Parsons DW et al. Core signaling pathways in human pancreatic cancers revealed by global genomic analyses. Science 2008;321(5897):1801-6. 2 6. Leary RJ, Lin JC, Cummins J et al. Integrated analysis of homozygous deletions, focal ampli!cations, and sequence alterations in breast and colorectal cancers. Proc Natl Acad Sci U S A 2008;105(42):16224-9. 7. Parsons DW, Jones S, Zhang X et al. An integrated genomic analysis of human glioblastoma multiforme. Science 2008;321(5897):1807-12. 8. Weir BA, Woo MS, Getz G et al. Characterizing the cancer genome in lung adenocarcinoma. Nature 2007;450(7171):893-8. 9. Beroukhim R, Mermel CH, Porter D et al. The landscape of somatic copy-number alteration across human cancers. Nature 2010;463(7283):899-905. 10. Pollack JR, Sorlie T, Perou CM et al. Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proc Natl Acad Sci U S A 2002;99(20):12963-8. 11. Costa JL, Meijer G, Ylstra B et al. Array comparative genomic hybridization copy number pro!ling: a new tool for translational research in solid malignancies. Semin Radiat Oncol 2008;18(2):98-104. 12. Postma C, Koopman M, Buffart TE et al. DNA copy number pro!les of primary tumors as predictors of response to chemotherapy in advanced colorectal cancer. Ann Oncol 2009. 13. Sobin LH, Fleming ID. TNM Classi!cation of Malignant Tumors, !fth edition (1997). Union Internationale Contre le Cancer and the American Joint Committee on Cancer. Cancer 1997;80(9):1803-4. 14. Buffart TE, Tijssen M, Krugers T et al. DNA quality assessment for array CGH by isothermal whole genome ampli!cation. Cell Oncol 2007;29(4):351-9. 15. Weiss MM, Hermsen MA, Meijer GA et al. Comparative genomic hybridisation. Mol Pathol 1999;52(5):243-51. 16. Buffart TE, Israeli D, Tijssen M et al. Across array comparative genomic hybridization: a strategy to reduce reference channel hybridizations. Genes Chromosomes Cancer 2008;47(11):994-1004. 17. van de Wiel MA, Kim KI, Vosse SJ et al. CGHcall: calling aberrations for array CGH tumor pro!les. Bioinformatics 2007;23(7):892-4. 18. Albertson DG. Gene ampli!cation in cancer. Trends Genet 2006;22(8):447-55.

49 Chapter 2

19. Huang dW, Sherman BT, Lempicki RA. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res 2009;37(1):1-13. 20. Wood LD, Parsons DW, Jones S et al. The genomic landscapes of human breast and colorectal cancers. Science 2007;318(5853):1108-13. 21. Futreal PA, Coin L, Marshall M et al. A census of human cancer genes. Nat Rev Cancer 2004;4(3):177-83. 22. Karolchik D, Hinrichs AS, Furey TS et al. The UCSC Table Browser data retrieval tool. Nucleic Acids Res 2004;32(Database issue):D493-D496. 23. Hinrichs AS, Karolchik D, Baertsch R et al. The UCSC Genome Browser Database: update 2006. Nucleic Acids Res 2006;34(Database issue):D590-D598. 24. Kent WJ, Sugnet CW, Furey TS et al. The human genome browser at UCSC. Genome Res 2002;12(6):996-1006. 25. Ritchie ME, Silver J, Oshlack A et al. A comparison of background correction methods for two- colour microarrays. Bioinformatics 2007;23(20):2700-7. 26. Smyth GK, Speed T. Normalization of cDNA microarray data. Methods 2003;31(4):265-73. 27. Douglas EJ, Fiegler H, Rowan A et al. Array comparative genomic hybridization analysis of colorectal cancer cell lines and primary carcinomas. Cancer Res 2004;64(14):4817-25. 28. Mc Sherry EA, Mc GA, Kay EW et al. Formalin-!xed paraf!n-embedded clinical tissues show spurious copy number changes in array-CGH pro!les. Clin Genet 2007;72(5):441-7. 29. Cox C, Bignell G, Greenman C et al. A survey of homozygous deletions in human cancer genomes. Proc Natl Acad Sci U S A 2005;102(12):4542-7. 30. Feuk L, Carson AR, Scherer SW. Structural variation in the human genome. Nat Rev Genet 2006;7(2):85-97. 31. Kurashina K, Yamashita Y, Ueno T et al. Chromosome copy number analysis in screening for prognosis-related genomic regions in colorectal carcinoma. Cancer Sci 2008;99(9):1835-40. 32. Diep CB, Kleivi K, Ribeiro FR et al. The order of genetic events associated with colorectal cancer progression inferred from meta-analysis of copy number changes. Genes Chromosomes Cancer 2006;45(1):31-41. 33. Zeitoun G, Buecher B, Bayer J et al. Retention of chromosome arm 5q in stage II colon cancers identi!es 83% of liver metastasis occurrences. Genes Chromosomes Cancer 2006;45(1):94-102.

50