Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 1

Genomic Profiles Specific to Patient Ethnicity in Lung Adenocarcinoma Running title: Ethnic-specific chromosomal aberrations in NSCLC

Philippe Broët1,2,3, Cyril Dalmasso1, Eng Huat Tan4, Marco Alifano3, Shenli Zhang5, Jeanie Wu6, Ming Hui Lee6, Jean-François Régnard3,7, Darren Lim4, Heng Nung Koong4, Thirugnanam Agasthian8, Lance D. Miller1,9, Elaine Lim5, Sophie Camilleri-Broët2, Patrick Tan1,6,10

1 Genome Institute of Singapore, 60 Biopolis Street, #02-01, Genome, Singapore 138672;

2 Univ Paris-Sud, JE2492, Villejuif, 12 Avenue Paul-Vaillant-Couturier, Villejuif F-94807, France;

3 APHP (Assistance Publique Hôpitaux de Paris), 3 Avenue Victoria, 75004 Paris, France;

4 National Cancer Centre, 11 Hospital Drive Singapore 169610;

5 Yong Loo Lin School of Medicine, NUS; 10 Medical Drive, Singapore 117597;

6 Duke-NUS Graduate Medical School; 8 College Road Singapore 169857;

7 Faculty of Medicine Paris-Descartes, 15 rue de l’Ecole de Médecine, 75006 Paris, France;

8 Singapore General Hospital, Outram Road Singapore 169608;

9 Wake Forest University School of Medicine, Medical Center Boulevard Winston-Salem, N.C.

27157, USA;

10 Cancer Sciences Institute of Singapore, NUS, 28 Medical Drive, Singapore 117456.

Address for reprint requests:

Dr. Philippe Broët

Genome Institute of Singapore, 60 Biopolis Street, #02-01, Genome, Singapore 138672

Tel: (65) 6808 8000; E-mail: [email protected], [email protected]

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 2

Abstract

Purpose: East-Asian (EA) patients with Non Small Cell Lung Cancer (NSCLC) are associated with a high proportion of non-smoking women, EGFR activating somatic mutations, and clinical responses to tyrosine kinase inhibitors. We sought to identify novel molecular differences between EA and Western European (WE) NSCLCs.

Experimental Design: 226 EA (90) and WE (136) lung adenocarcinomas were analyzed for copy-number aberrations (CNAs) using a common high resolution SNP microarray platform. Univariate and multivariate analyses were performed to identify CNAs specifically related to smoking history, EGFR mutation status and ethnicity.

Results: The overall genomic profiles of EA and WE adenocarcinomas were highly similar. Univariate analyses revealed several CNAs significantly associated with ethnicity, EGFR mutation and smoking, but not to gender, KRAS or p53 mutations. A multivariate model identified four ethnic-specific recurrent CNAs - Significantly higher rates of copy number gain were observed on 16p13.13 and 16p13.11 in EA tumors, while higher rates of genomic loss on 19p13.3 and 19p13.11 were observed in WE tumors. We identified several potential driver in these regions, showing a positive correlation between cis-localized copy number changes and transcriptomic changes.

Conclusion: 16p copy number gains (EA) and 19p losses (WE) are ethnic-specific chromosomal aberrations in lung adenocarcinoma. Patient ethnicity should be considered when evaluating future NSCLC therapies targeting genes located on

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 3

these areas.

Keywords: Lung Cancer, Comparative Genomic, amplification

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 4

Statement of Translational Relevance

East-Asian (EA) and Western European (WE) patients with Non Small Cell Lung Cancer (NSCLC) several exhibit epidemiologic and molecular differences, including a high proportion of non-smoking women, EGFR activating somatic mutations, and clinical responses to tyrosine kinase inhibitor therapy in EA cancers. However, whether differences exist in the pattern of copy-number alterations (CNAs) between EA and WE populations is currently unknown. We report the first large scale comparative analysis of genomic alteration profiles between lung adenocarcinomas from EA and WE populations. We identify two ethnic-specific CNAs: 16p13 amplification (Asian-specific) and 19p13 deletion (Caucasian specific). We discuss candidate copy-number driven genes located on these areas. Our findings emphasize the need to take into account ethnicity in the design of future NSCLC clinical trials evaluating targeted therapies involving copy-number driven candidate genes located on those ethnic-specific genomic regions.

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 5

Introduction

Lung cancer is a common malignancy worldwide and the leading cause of global cancer mortality in both males and females (1). Clinical and epidemiological studies on non small cell lung carcinomas (NSCLC) from patients of Asian and Caucasian origin have revealed substantial clinical and molecular differences between the two populations, with the former exhibiting a higher proportion of non-smoking females, adenocarcinomas and epidermal growth factor receptor (EGFR) mutations (2-5). Some studies have shown that EGFR mutations, which are currently used as predictive markers of clinical response to EGFR tyrosine kinase inhibitors (TKis), are linked with EGFR gene amplifications (6-7). Since Asian patients are known to exhibit higher response rates to EGFR TKis, these findings raise the possibility that in addition to DNA mutations, recurrent copy-number aberrations (CNAs) (amplifications/deletions) specific to Asian and Caucasian NSCLC patients may also exist. To date, most reported large CNA analyses of NSCLC have largely comprised tumors of Caucasian origin, identifying important oncogenic loci such as Myc, TTF-1 (NKX2-1), and E2F (8-12). In contrast, similar CNA data on Asian NSCLCs has either been sparse, unavailable, or only available as small patient populations or low resolution technologies. Identifying discrete CNA differences between Asian and Caucasian NSCLCs would suggest that these two conditions represent distinct biological and clinical entities. The discovery of ethnic-specific CNAs would also argue for patient ethnicity being an important consideration in the design and evaluation of future targeted therapies in lung cancer, if the targeted genes involve ethnic specific CNAs.

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 6

Here, we report the first large scale comparative analysis of genomic alteration profiles between lung adenocarcinomas from East-Asian (EA) and Western Europe (WE). Using a common high-resolution SNP microarray platform, we found that the genomic profiles of lung adenocarcinomas from both populations were highly similar, except for a limited number of chromosomal aberrations that could be ascribed to specific differences in smoking history, EGFR mutation status or ethnicity. We also identified interesting copy-number driven candidate genes located on these ethnic-specific chromosomal aberrations 16p13 (EA) and 19p13 (WE). These results emphasize the need to take into account ethnicity in future lung adenocarcinoma clinical trials targeting oncogenic pathways that involve copy-number driven candidate genes located on those ethnic-specific genomic areas.

Material and Methods

Patients & Procedures

Frozen samples of 318 lung carcinomas were processed. Tumors from Western European patients (n=157, WE series) were collected at the Hotel-Dieu hospital (Paris, France) between 2000 and 2005. Tumors from East-Asian patients (n=162, predominantly Asian-Chinese, EA series) were collected at four centres (National University Hospital, Tan Tock Seng Hospital, National Cancer Centre Singapore, and Singapore General Hospital) between 2002 and 2006. All samples were surgical specimens and underwent pathological review. Tumor samples were immediately frozen at –80°C until retrieved. Only tissue samples with tumor cells >50% were selected. The two series were initially collected from different perspectives

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 7

(WE series: prognostic impact for Stage I adenocarcinomas, EA: chromosomal aberrations in lung tumors). Thus, the inclusion for the WE series was restricted to Stage I adenocarcinomas while the EA series was heterogeneous with respect to histology and disease stage (only half of the tumors were stage I). A preliminary analyze on both variables (stage, histology) on the EA series was performed. While genome-wide significant differences in CNAs were observed for histology (e.g. higher copy number gains in 3q for squamous carcinomas), there was no genome-wide significant relationship between CNAs and disease stage. These results suggest that CNAs observed in lung cancers from different disease stages are likely to be highly similar. Thus, for this work, we restricted our analysis only to lung adenocarciomas but included all stages to preserve statistical power. The present study included 226 adenocarcinoma samples: n=136 for WE (all stage I) and n=90 for EA (48 stage I). All study protocols were approved by both institutional and ethics review committees. DNA was extracted from frozen samples using the Nucleon DNA extraction kit (BACC2, Amersham Biosciences, Buckinghamshire, UK). All samples were processed and hybridized to Affymetrix Genome-Wide Human SNP 6.0 arrays according to the manufacturer’s specifications in the same center. All 318 patients were genotyped for EGFR, KRAS and p53 mutations. Primers (EGFR exons 18-21, KRAS exon 2, and p53 exons 5-9) were used to amplify the relevant regions and DNA sequencing was performed on an ABI3730xl Sanger sequencer. All mutations were confirmed by bidirectional sequencing. For MYH11 and LKB1 sequencing, we sequenced 43 samples (22 WE and 21 EA) amplified using a whole-genome amplification (WGA) protocol using the Qiagen REPLI-g Midi Kit (cat 150045,

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 8

Qiagen, Hilden, Germany) (14). Amplified DNA was diluted 1:100 and uses 2-3 μl of diluted DNA for each PCR. DNA sequencing was performed on an ABI3730xl Sanger sequencer. All mutations were confirmed by bidirectional sequencing. For gene expression analyses, total RNA was extracted from frozen (–80°C) tumor samples using Trizol. Samples were processed and hybridized to Human U133Plus 2.0 oligonucleotide arrays (Affymetrix, Santa Clara, CA) according to the manufacturer's protocol. In this study, RNA from 128 (WE series) and 71 (EA series) were of sufficient quality to enable reliable gene expression analysis.

Statistical analysis (Flow chart, Figure 1)

To detect DNA copy-number changes, we considered the log-ratio between the probe signal and a reference signal estimated from a pool of hybridized HapMap samples that were matched for ethnicity to rule out the hypothesis that the alterations identified could be normal copy number variations (CNVs) specific to the different ethnic populations. Systematic sources of variation (GC content, fragment length) were removed by subtracting for each log-ratio its estimated predicted values inferred from the least squares estimates of the parameters. Data from the sex were not considered. To reduce measurement variability while keeping a sufficient level of resolution, we averaged the intensities over fifty consecutive probes leading to a total of 17,769 genomic segments (average size of 150kb) providing tiling coverage of the . Inferences about the copy number status of each genomic segment were obtained using the modified CGHmix classification procedure (14). We employed a novel methodology for chromosomal pattern analysis to distinguish between “passenger” and the “driver”

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 9

chromosomal aberrations (15). From this classification, driver genomic areas were identified as genomic segments, having the highest frequency for amplification while also showing the lowest frequency for deletion (exclusively amplified copy-number alterations), or vice versa (exclusively deleted copy-number alterations). A set of univariate nonparametric analyses was used to compare the marginal trinomial distribution of genomic copy-number alterations to clinical (ethnicity, smoking history, gender) and molecular (EGFR, KRAS) factors. For each genomic segment, we computed the chi-square statistic (with Yates' correction for continuity) testing the null hypothesis of no difference between the considered groups for the chromosomal aberration distributions. Selected genomic segments were reported while controlling the False Discovery Rate for a classical threshold of 5% (16). To disentangle relationships between ethnicity and CNAs from other confounding factors, we performed multinomial logistic regression analyses. We compared the log-likelihoods of a null model (without ethnicity but including all clinico-molecular variables shown to be significantly associated to CNAs in univariate analysis) to those of a full model including ethnicity using an iterative least-square estimation procedure (VGAM R-package, http://cran.r-project.org/web/packages/VGAM, 17). We calculated the likelihood ratio statistics corresponding to twice the difference in the log-likelihoods that test the hypothesis of the lack of influence of ethnicity on the propensity for deletion and amplification. We then selected genomic segments exhibiting a FDR threshold of 5%. In order to focus on driver areas, these segments were further filtered only to regions that were exclusively amplified or deleted in only one population. In order to overcome any confounding effect related to disease stage, we also performed additional analyses restricted on stage I

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 10

lung adenocarcinomas (136 WE and 48 EA). Due to the reduction of the sample size of the EA series, we decided to relax the FDR threshold for feature selection and increase from 5% to 25% that keeps a good balance between not inflating the number of false positive discoveries while not sacrificing the detection of true discoveries. For gene expression, the microarray data were standardized and normalized using the robust multi-array average (RMA) procedure (18) implemented in the Bioconductor open source software (http://www.bioconductor.org). We tested the null hypothesis that no correlation exists between between cis-localized copy number changes and transcriptomic changes for the subset of genes located in the selected genomic areas. To address the multiple testing problem, we selected genes for a 5% FDR level. These genes were considered as copy-number-driven candidate genes.

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 11

Results

Clinical and molecular features from the two series are detailed in Table 1. As expected (19, 20), the EA series was associated with a higher rate of females, non-smokers, and EGFR mutations. In both series, smoking history showed a significant relationship with EGFR mutation with the highest mutation rate among non-smokers (p<10-3, 52.8% and 30.7% in EA and WE, respectively). There was a significant relationship between gender and smoking with a higher rate of non-smokers among women (p<10-10). Substitution mutations in EGFR exon 21 (EGFR_21) and small in-frame deletions in exon 19 (EGFR_19) are the most frequent and closely associated with response to EGFR-TKi therapy. Among the 47 cases with EGFR mutations, 33 cases (70%) were either EGFR_19 or EGFR_21, with a significantly higher rate for the EA series (p=0.002 and p=0.01, respectively). EGFR mutations in exons 20 and 18 were not significantly related to ethnicity. In our study, mutations in p53 occurred at similar frequencies between the two series. KRAS mutation rates were significantly higher in the WE series than the EA series. As previously described (21, 22), there was a significant relationship between smoking history and KRAS mutation status with a higher rate of mutation among smokers (p=0.002). KRAS mutations were not associated with gender. As expected from previous data (23, 24) there was a significant negative correlation between EGFR and KRAS mutations (p=0.02). We identified two cases with a co-mutation of EGFR and KRAS, but in these cases the EGFR mutations (exon 20 and 21) were synonymous nucleotide

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 12

alterations, and thus not expected to influence EGFR function. For each series, we evaluated overall patterns of chromosomal aberration complexity by calculating the fraction of genomic regions altered in each sample. The median chromosomal aberration rate was of 38.7% and 38.1% for EA and WE series, respectively. When comparing the two series, the fraction of genome altered (divided on quartiles) was not significantly different between the two series (p=0.14). Figure 2 displays the frequencies of copy gains and losses across the whole genome for the two series. Except for a limited number of genomic areas that are discussed later, visual inspection of the two series revealed a similar pattern of CNAs. From chromosomal pattern analysis, 15.5% (WE) and 14.5% (EA) of genomic segments were considered as exclusively amplified with a mixture of broad (1q, 5p, 6p, 7p, 8q) and focal (11q, 14q, 20q) contiguous areas. Exclusively deleted areas represent 25.3% (WE) and 28.7% (EA) of genomic sequences with large contiguous areas (3p, 6q, 8p, 9q, 13p). As previously reported (8-11), 5p15 and 1q21 were the most common exclusively amplified genomic regions, which are found in more than 60% of total samples. With a rate higher than 50%, 3p14 and 9q21 are the most common exclusively deleted genomic regions. We also confirmed well-known exclusive amplifications at 7p11 (EGFR), 8q24 (MYC), 11q13 (CCND1), 14q13 (NKX2-1/TTF1) and 20q11 (E2F). Exclusive deletions were identified at 3p14 (FHIT), 8p21 (DUSP4), 9p21 (CDKN2A), 10q, 15q, 16q23 (WWOX) and 17p13 (p53). All these genomic aberrations were common to both series.

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 13

Using univariate analyses, we then identified genomic segments associated with the following factors (table 2, figure 3): ethnicity, smoking history, gender, EGFR and KRAS. Ethnicity: 28 chromosomal cytobands exhibited CNAs significantly related to ethnicity. Significantly higher rates of copy number gain on 1 (1p36) and 16 (16p13, 16p12, 16p11) were observed in EA tumors. In contrast, higher rates of copy loss on (19p13) were identified in WE tumors. EGFR status: 59 chromosomal cytobands exhibited CNAs distributions significantly related to EGFR mutation. In EGFR mutant tumors, we identified significantly higher rates of copy number gain on the chromosome 1 (1p36, 1p35), 7 (7p22-21, 7p15-12) 16 (16p13, 16p12) and chromosome 14 (14q31-32). Conversely, higher rates of copy loss on 21q (21q21-22) were observed in tumors with EGFR wild type status. As previously reported, we observed a strong positive correlation between EGFR amplification and mutation (p<10-5). Smoking History: 16 chromosomal cytobands whose distributions of CNAs were significantly related with smoking history were selected. Significant higher rates of copy gains were observed on 1p (1p34) and 16p (16p13, 16p12, 16p11) among non-smokers. Gender and KRAS status : No significantly associated CNAs were identified for gender and KRAS mutation status. Some CNAs associated with ethnicity were also related to smoking history and EGFR mutation status, such as chromosome 1 (1p36, 1p35) and 16 (16p13, 16p12, 16p11). This finding suggests a complex interplay between these distinct factors that ultimately combine to shape the NSCLC genome. To unravel the genomic contributions of ethnicity, smoking and EGFR mutation status, we applied multivariate modeling to identify CNAs significantly related to ethnicity, while

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 14

simultaneously accounting for the potentially confounding effects of EGFR mutation status and smoking history. We identified 7 chromosomal cytobands significantly associated with ethnicity in the multivariate analysis (Table 2, in bold). It is worth noting that considering the multivariate model, genomic regions such as 1p36 and 1p35, initially identified as related to ethnicity in the univariate analysis, were no longer retained, suggesting that these regions are mainly related to other factors. Specifically, the high rate of amplification on 1p, previously observed among EA patients, is likely related to EGFR mutation. Similarly, several CNAs on 16p, were not retained in the multivariate model since they are mainly related to smoking history (16p11.2) or EGFR mutation (16p13.12, 16p12.2, 16p12.1). We then focused on ethnic-specific genomic segments that were either exclusively amplified or deleted in only one series but not the other (contrasting chromosomal aberration patterns). Using this filter, we identified four ethnic-specific genomic areas: 16p13.13 and 16p13.11 as exclusively amplified in EA tumors and 19p13.3 and 19p13.11 as exclusively deleted in WE tumors. In order to avoid any confounding effect related to disease stage, we performed additional analyses restricted on stage I lung adenocarcinomas (136 WE and 48 EA samples). This multivariate analysis identified 17 out of the 28 previously selected ethnic-specific genomic areas (Table 2). Moreover, the ten strongest signals for association with ethnicity were located on the following genomic areas: 1p31.3, 4p15.1, 6q14.1, 16p13.13, 16p13.11 and 19p13.3. When focusing specifically on exclusively amplified or deleted areas, we selected four ethnic-specific genomic areas: 16p13.13, 16p13.11 and 19p13.3 and 19p13.11. To identify potential candidate driver genes in these four regions, we combined the genomic data with gene expression information. In total, 29 genes located on these ethnic-specific genomic areas

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 15

showed a significant positive correlation between cis-localized CNAs and transcriptomic changes (copy-number-driven genes) (Table 3). Among the selected candidate genes, we discuss bellow LKB1/STK11 (19p1.3) which showed high rates of deletion in the WE series (36.8% of WE cases compared to 11% of EA cases) and MYH11 (16p13.11) harboring copy gain in the EA series (32% of EA cases compared to 12% of WE cases) and have been previously implicated in other human neoplasms. We confirmed the existence of a significant inverse relationship between gain of 16p13.11 and loss of 19p13.3 in both series (p=3*10-3 EA series, 2*10-5 WE series).

In an initial exploratory analysis, we performed DNA sequencing of the LKB1 and MYH11 genes on 43 samples, including 22 WE and 21 EA tumors respectively. We identified one LKB1 mutation (W308L) occurring in a WE tumor concurrently exhibiting 19p deletion. The finding of only one LKB1 mutation (W308L) confirms previous findings showing that LKB1 inactivation in lung cancer is more often linked with copy loss than somatic mutation (25). We also identified four cases of MYH11 genetic alterations occurring in EA tumors - of these, two were germline or constitutional alterations (N312S in both samples), while one was a somatic mutation (R280C).

Discussion

In this study, we analyzed 226 histologically homogeneous lung tumors (adenocarcinomas) from two distinct series. To the best of our knowledge, this is the first study to specifically compare CNAs from lung tumors of East-Asian and Western Europe origin. Our use of a common technology platform, with standardized DNA

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 16

extraction protocols, quality parameters and array profiles generated at a single center avoids the inherent difficulty of comparing results from different publications which are often based upon different technology platforms, data analysis methods, and study designs.

The two series considered in this study recapitulated several previously known clinical differences, with EA patients being associated with higher rates of females, non-smokers and tumors with EGFR mutations. Given these clinical differences and the complexity of chromosomal aberrations observed in individual lung tumors with more than one quarter of genomic regions showing non-random propensities for either copy loss or gain, it is perhaps remarkable that the overall genomic profiles of the two series were strikingly similar, with classical recurrent copy gains (1q, 5p) and copy losses (3p, 9q) observed in both series. The high similarities in genomic profiles suggest that the basic underlying oncogenic pathways regulating lung carcinogenesis are similar in both populations. Consistent with this notion, aberrations in well known lung cancer oncogenes such as MYC, CCND1 and TTF-1 were observed in both series. We analyzed the relationship between CNAs frequency and clinico-molecular characteristics. Univariate analyses revealed that ethnicity, EGFR mutation and smoking history were significantly associated with several chromosomal aberrations, while no significant relationships were identified for gender, KRAS, or p53 mutation. Taking into account EGFR mutation and smoking status revealed four ethnic-specific CNAs on 16p13.13, 16p13.11, 19p13.3 and 19p13.11. These areas are still selected (but considering a higher rate of false positives) when focusing only on stage I lung adenocarcinomas. However, it is worth noting that the top

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 17

association signals are mainly located on three areas: 16p13.13, 16p13.11, 19p13.3 whereas 19p13.11 shows weaker signals. It is worth noting that, in our study, 1p36-35 was selected in univariate analysis but not by the multivariate model. It suggests that the high rate of 1p36-35 copy gain in EA patients is more likely explained by its relationship with EGFR mutation rather than ethnicity. This idea is supported by recent work from a Japanese series from Shibata et al. (26) showing that 1p36 amplifications are correlated with EGFR mutation status. Among the candidate genes located on 1p36 is the mTOR/FRAP kinase that controls cell growth and is deregulated in many human cancers. mTOR inhibitors are currently being tested in cancer clinical trials (27), and it will be of interest to test the relationship between mTOR amplifications and resistance to EGFR TKis. 19p13.3 is exclusively deleted in the WE series. This strongly suggests the activity of a potential tumor suppressor gene in this population, with a highly plausible candidate gene being the LKB1/STK11 gene. Constitutional mutations in LKB1/STK11 have been associated with Peutz-Jeghers syndrome, an autosomal dominant disorder characterized by the growth of polyps in the gastrointestinal tract and a high frequency of somatic mutations in LKB1 have been described in NSCLC. Consistent with LKB1 acting as an important lung cancer tumor suppressor gene in Caucasian patients, LKB1 mutation rates are significantly higher in NSCLC tumors from the US compared to Korea (17% vs 5%) (28). 16p13.13, 16p13.11 and 16p12.3 exhibited exclusively amplified patterns in the EA series, strongly suggesting the activity of an oncogene. Although 16p13 amplifications have been suggested to be linked with non-smoking histories (11, 29), we demonstrate here that even after taking tobacco history into account, a high level of amplification is still observed in EA patients, suggesting

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 18

that 16p13 amplifications are also related to ethnicity. Among the potential oncogene candidates located on 16p13, the MYH11 gene is notable since activating MYH11 mutations have been shown to occur in human colorectal cancer; Moreover germline mutations in MYH11 have been shown to cause features similar to Peutz-Jeghers syndrome, similar to LKB1 (30). In a preliminary analysis, we have observed MYH11 somatic mutations in a subset of cases of our EA series. Specifically, one somatic mutation (R280C mutation (c->t)) was found in a EA case which was also amplified for 16p13. Additional results are needed to confirm MYH11 as the driver gene. Among other candidate genes, we should also note the RRN3 gene which is required for efficient transcription initiation by RNA polymerase I. However, to the best of our knowledge no amplification has been described in carcinomas. Our discovery of prominent, ethnic specific, copy-number aberrations in NSCLC indicate that patient ethnicity status should be an important consideration in the design of clinical trials evaluating NSCLC targeted therapies acting on biological pathways that involve genes located on these areas. Moreover, our results strongly argue that the recruitment and analysis of biological samples from diverse ethnic populations should represent an explicit goal of large-scale cancer genome projects such as the International Cancer Genome Consortium (31).

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 19

REFERENCES

1- Jemal A, Siegel R, Ward E, Hao Y, Xu J, Thun MJ. Cancer statistics 2009. CA Cancer J Clin 2009;59:225-49.

2- Barnes DJ. The Changing Face of Lung Cancer. Chest 2004;126:1718-21.

3- Bell DW, Brannigan BW, Matsuo K, et al. Increased prevalence of EGFR-mutant lung cancer in women and in East Asian populations: analysis of estrogen-related polymorphisms. Clin Cancer Res 2008;14:4079-84

4- Toh CK, Wong EH, Lim WT, et al. The impact of smoking status on the behavior and survival outcome of patients with advanced non-small cell lung cancer: A retrospective analysis. Chest 2004;126:1750-6.

5- Toh CK, Gao F, Lim WT, et al. Never-smokers with lung cancer: epidemiologic evidence of a distinct disease entity. J Clin Oncol 2006;24:2245-51.

6- Cappuzzo F, Hirsch FR, Rossi E, et al. Epidermal growth factor receptor gene and protein and gefitinib sensitivity in non-small cell lung cancer. J Natl Cancer Inst 2005;97:643–55

7- Hirsch FR, Varella-Garcia M, McCoy J, et al. Increased epidermal growth factor receptor gene copy number detected by fluorescence in situ hybridization associates with increased sensitivity to gefitinib in patients with bronchioloalveolar carcinoma subtypes: a Southwest Oncology Group Study. J Clin Oncol 2005;23:6838-45.

8- Balsara BR, Testa JR. Chromosomal imbalances in human lung cancer. Oncogene 2002;21:6877-83.

9- Garnis C, Lockwood WW, Vucic E, Ge Y, et al. High resolution analysis of non-small cell lung cancer cell lines by whole genome tiling path array CGH. Int J Cancer 2006;118:1556-64.

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 20

10- Tonon G, Wong KK, Maulik G, et al. High-resolution genomic profiles of human lung cancer. Proc Natl Acad Sci U S A 2005;102:9625-30.

11- Weir BA, Woo MS, Getz G, et al. Characterizing the cancer genome in lung adenocarcinoma. Nature 2007;450:893-89.

12- Broët P, Camilleri-Broët S, Zhang S, et al. Prediction of clinical outcome in multiple lung cancer cohorts by integrative genomics: implications for chemotherapy selection. Cancer Res 2009;69:1055-62.

13- Lim EH, Zhang SL, Li JL, et al. Using whole genome amplification (WGA) of low-volume biopsies to assess the prognostic role of EGFR, KRAS, p53, and CMET mutations in advanced-stage non-small cell lung cancer (NSCLC). J Thorac Oncol 2009;4:12-21.

14- Broët P, Richardson S. Detection of gene copy number changes in CGH microarrays using a spatially correlated mixture model. Bioinformatics 2006;22:911-8.

15- Broët P, Tan P, Alifano M, Camilleri-Broët S, Richardson S. Finding exclusively deleted or amplified genomic areas in lung adenocarcinomas using a novel chromosomal pattern analysis. BMC Med Genomics 2009;14:43.

16- Benjamini Y, Yekutieli D. The control of the false discovery rate in multiple testing under dependency. Ann Stat 2001;29:1165–88.

17- Yee TW, Wild CJ. Vector Generalized Additive Models, J. Roy. Statistical Society, Series B 1996;58:481-493.

18- Irizarry RA, Hobbs B, Collin F, et al. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 2003;4:249-64.

19- Sasaki H, Endo K, Takada M, et al. L858R EGFR mutation status correlated with clinico-pathological features of Japanese lung cancer. Lung

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 21

Cancer 2006;54:103-8.

20- Tokumo M, Toyooka S, Kiura K, et al. The relationship between epidermal growth factor receptor mutations and clinicopathologic features in non-small cell lung cancers. Clin Cancer Res 2005;11:1167-73.

21- Sun S, Schiller JH, Gazdar AF. Lung cancer in never smokers-a different disease. Nat Rev Cancer 2007;7:778-90.

22- Dacic S, Shuai Y, Yousem S, Ohori P, Nikiforova M. Clinicopathological predictors of EGFR/KRAS mutational status in primary lung adenocarcinomas. Mod Pathol 2010;23:159-68.

23- Chitale D, Gong Y, Taylor BS, et al. An integrated genomic analysis of lung cancer reveals loss of DUSP4 in EGFR-mutant tumors. Oncogene 2009;28:2773-83.

24- Ding L, Getz G, Wheeler DA, et al. Somatic mutations affect key pathways in lung adenocarcinoma. Nature 2008;455:1069-75.

25. Shah U, Sharpless NE, Hayes DN. LKB1 and lung cancer: more than the usual suspects. Cancer Res. 2008;68:3562-5.

26- Shibata T, Uryu S, Kokubu A, et al. Genetic classification of lung adenocarcinoma based on array-based comparative genomic hybridization analysis: its association with clinicopathologic features. Clin Cancer Res 2005;11:6177-85.

27- Wang X, Sun SY. Enhancing mTOR-targeted cancer therapy. Expert Opin Ther Targets 2009;13:1193-203.

28- Koivunen JP, Kim J, Lee J, et al. Mutations in the LKB1 tumour suppressor are frequently detected in tumours from Caucasian but not Asian lung cancer patients. Br J Cancer 2008;99:245-52.

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 22

29- Wong MP, Fung LF, Wang E, et al. Chromosomal aberrations of primary lung adenocarcinomas in nonsmokers. Cancer 2003;97:1263-70.

30- Alhopuro P, Phichith D, Tuupanen S, et al. Unregulated smooth-muscle myosin in human intestinal neoplasia. Proc Natl Acad Sci U S A 2008;105:5513-8.

31- Ledford H. Big science: The cancer genome challenge. Nature 2010;464:972-4.

Conflict of interest The authors declare no conflict of interest.

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 23

Acknowledgments. This work was supported by grants from the Agency for Science, Technology, and Research, the Duke-NUS Graduate Medical School and the French Cancer Research Association (ARC).

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 24

Table 1: Clinical characteristics and mutational status in WE and EA series

(number, %) Paris series Singapore series P-Value

(136) (90)

Sex Male 100 49 0.002 Females 36 (26.5%) 41 (46.7%)

Age Median (range) 63 (42-84) 66 (30-85) NS

Smoking history Smokers 123 37 9*10-10 Non-smokers 13 (9.5%) 35 (48.6%) NA 18

Mutational status EGFR 15 (11%) 29 (32.2%) 8*10-5 KRAS 28 (20.5%) 9 (10.0% 0.03 TP53 41 (30.1%) 27 (30.0%) NS

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 25

Table 2: Cytobands selected from univariate analysis Ethnicity 1p36.23; 1p36.22; 1p36.21; 1p36.13; 1p36.12; 1p36.11; 1p35.3; 1p31.1; 2p22.3; 4p15.1; 5q35.3; 6p21.32; 6q14.1; 16p13.3; 16p13.2; 16p13.13*; 16p13.12; 16p13.11*; 16p12.3; 16p12.2; 16p12.1; 16p11.2; 17q21.31; 19p13.3*; 19p13.2; 19p13.13; 19p13.12; 19p13.11* Smoking 1p34.3; 1p34.2; 5q31.1; 5q31.2; 5q31.3; 5q35.2; 5q35.3; 16p13.3; status 16p13.2; 16p13.13; 16p13.12; 16p13.11; 16p12.3; 16p12.2; 16p12.1; 16p11.2 Gender None EGFR 1p36.32; 1p36.31; 1p36.23; 1p36.22; 1p36.21; 1p36.13; 1p36.12; 1p36.11; 1p35.3; 1p35.2; 1p35.1; 1p34.3; 1p34.2; 1p34.1; 1p31.3; 2q12.2; 4p16.1; 4p15.33; 4p15.32; 4p15.31; 4p15.2; 4p15.1; 4p13; 7p22.2; 7p22.1; 7p21.3; 7p21.2; 7p21.1; 7p15.3; 7p15.2; 7p15.1; 7p14.3; 7p14.2; 7p14.1; 7p13; 7p12.3; 7p12.2; 7p12.1; 7p11.2; 12q21.2; 14q24.3; 14q31.1; 14q31.2; 14q31.3; 14q32.11; 14q32.12; 14q32.13; 15q26.2; 15q26.3; 16p13.3; 16p13.2; 16p13.13; 16p13.12; 16p13.11; 16p12.3; 16p12.2; 16p12.1; 21q21.3; 21q22.11 KRAS None In bold: ethnic-specific cytobands, selected by multivariate analysis; * Ethnic-specific cytobands that were exclusively amplified or deleted in one series but not in the other.

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 26

Table 3 : Selected ethnic-specific genomic areas and cis-localized candidate genes

Cytoband Start End Genes

16p13.13 11407460 11579617 LITAF

16p13.11 14597534 15984279 PDXDC1, NDE1, MYH11, NTAN1, RRN3, MPV17L,C16orf45

PTBP1, PRG2, AZU1,

19p13.3 1054702 1340803 PRTN3, ELA2, CFD, MED16, C19orf22, KISS1R, ARID3A, WDR18, GRIN3B, C19orf6, CNN2, ABCA7, HMHA1, POLR2E, GPX4, SBNO2, STK11, C19orf26, ATP5D, MIDN, C19orf23, CIRBP, C19orf24, EFNA2, MUM1, NDUFS7

19p13.11 17187183 17187183

USE1, OCEL1, NR2F6, USHBP1, C19orf62, ANKLE1, ABHD8, MRPL34, DDA1, ANO8, GTPBP3, PLVAP, BST2, FAM125A

In bold: copy-number-driven genes

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Ethnic-specific chromosomal aberrations in NSCLC 27

Figure legends

Figure 1: Flow Chart of the method The method was the following: 1) Univariate analysis for each of the clinical parameter or mutational status; 2) Multivariate analysis for selecting ethnic-specific clones; 3) Integrative analysis using genomic and transcriptomic information used to select copy-number driven genes; 4) Selection of genomic areas with contrasted exclusive chromosomal pattern; 5) Integrative analysis with selection of copy-number driven genes located within these later selected genomic areas.

Figure 2: Frequencies of chromosomal aberrations Frequencies of copy gain (positive values) and copy loss (negative values) for genomic sequences located across the whole genome are displayed for the WE (Paris) series (136 samples, upper panel) and EA (Singapore) series (90 samples, lower panel), according to genome position order from 1pter to 22qter. The exclusively amplified (up) or deleted (down) recurrent copy-number aberrations are indicated in orange.

Figure 3: Univariate analysis

The test statistics (-log10(p-value)) testing the null hypothesis of identical chromosomal distributions for the two tested groups are shown according to genome position order from 1pter to 22qter. Genomic segments with a FDR below 5% are highlighted in orange. Test statistics are shown for ethnicity (3A), EGFR (3B), smoking status (3C) and gender (3D).

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research. Author Manuscript Published OnlineFirst on April 26, 2011; DOI: 10.1158/1078-0432.CCR-10-2185 Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.

Genomic Profiles Specific to Patient Ethnicity in Lung Adenocarcinoma

Philippe Broet, Cyril Dalmasso, Eng Huat Tan, et al.

Clin Cancer Res Published OnlineFirst April 26, 2011.

Updated version Access the most recent version of this article at: doi:10.1158/1078-0432.CCR-10-2185

Author Author manuscripts have been peer reviewed and accepted for publication but have not yet been Manuscript edited.

E-mail alerts Sign up to receive free email-alerts related to this article or journal.

Reprints and To order reprints of this article or to subscribe to the journal, contact the AACR Publications Subscriptions Department at [email protected].

Permissions To request permission to re-use all or part of this article, use this link http://clincancerres.aacrjournals.org/content/early/2011/04/26/1078-0432.CCR-10-2185. Click on "Request Permissions" which will take you to the Copyright Clearance Center's (CCC) Rightslink site.

Downloaded from clincancerres.aacrjournals.org on September 23, 2021. © 2011 American Association for Cancer Research.