Genomic markers for malignant progression in pulmonary adenocarcinoma with bronchioloalveolar features

Sarit Aviel-Ronen*, Bradley P. Coe†‡, Suzanne K. Lau*§, Gilda da Cunha Santos*¶, Chang-Qi Zhu*, Dan Strumpf*, Igor Jurisica*§‖, Wan L. Lam†‡, and Ming-Sound Tsao*§¶**

*University Health Network, Ontario Cancer Institute and Princess Margaret Hospital Site, Toronto, ON, Canada M5G 2M9; †Department of Cancer Genetics and Developmental Biology, British Columbia Cancer Research Centre, Vancouver, BC, Canada V5Z 1L3; Departments of §Medical Biophysics, ¶Laboratory Medicine and Pathobiology, and ‖Computer Science, University of Toronto, Toronto, ON, Canada M5G 2C1; and ‡University of British Columbia, Vancouver, BC, Canada V6T 289

Edited by John D. Minna, University of Texas Southwestern Medical Center, Dallas, TX, and accepted by the Editorial Board April 30, 2008 (received for review October 9, 2007)

Bronchioloalveolar carcinoma (BAC), a subtype of lung adenocarcinoma (ADC) without stromal, vascular, or pleural invasion, is considered an in situ tumor with a 100% survival rate. However, the histological criteria for invasion remain controversial. BAC-like areas may accompany otherwise invasive adenocarcinoma, referred to as mixed type adenocarcinoma with BAC features (AWBF). AWBF are considered to evolve from BAC, representing a paradigm for malignant progression in ADC. However, the supporting molecular evidence remains forthcoming. Here, we have studied the genomic changes of BAC and AWBF by array comparative genomic hybridization (CGH). We used submegabase-resolution tiling set array CGH to compare the genomic profiles of 14 BAC or BAC with focal area suspicious for invasion with those of 15 AWBF. Threshold-filtering and frequency-scoring analysis found that genomic profiles of noninvasive and focally invasive BAC are indistinguishable and show fewer aberrations than tumor cells in BAC-like areas of AWBF. These aberrations occurred mainly at the subtelomeric chromosomal regions. Increased genomic alterations were noted between BAC-like and invasive areas of AWBF. We identified 113 genes that best differentiated BAC from AWBF and were considered candidate marker genes for tumor invasion and progression. Correlative expression analyses demonstrated a high percentage of them to be poor prognosis markers in early stage ADC. Quantitative PCR also validated the amplification and overexpression of PDCD6 and TERT on chromosome 5p and the prognostic significance of PDCD6 in early stage ADC patients. We identified candidate genes that may be responsible for and are potential markers for malignant progression in AWBF.

array comparative genomic hybridization | bronchioloalveolar carcinoma | microarray | non-small-cell lung carcinoma | prognostic markers

Author contributions: W.L.L. and M.-S.T. designed research; S.A.-R., B.P.C., S.K.L., and G.d.C.S. performed research; S.A.-R., B.P.C., S.K.L., and I.J. contributed new reagents/analytic tools; S.A.-R., B.P.C., S.K.L., G.d.C.S., C.-Q.Z., and D.S. analyzed data; and S.A.-R., B.P.C., S.K.L., C.-Q.Z., D.S., and M.-S.T. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. J.D.M. is a guest editor invited by the Editorial Board.
Freely available online through the PNAS open access option.
Data deposition: The data reported in this paper have been deposited in the Gene Expression Omnibus (GEO) database, www.ncbi.nlm.nih.gov/geo (accession no. GSE11945).
**To whom correspondence should be addressed at: Ontario Cancer Institute, Toronto, ON, Canada M5G 2M9. E-mail: [email protected].
This article contains supporting information online at www.pnas.org/cgi/content/full/0709618105/DCSupplemental.
MEDICAL SCIENCES
© 2008 by The National Academy of Sciences of the USA

Lung adenocarcinoma (ADC) accounts for ≈35% of all lung cancers and has an overall 5-year survival of 17% (1). The recent World Health Organization (WHO) classification recognized a particular subtype, bronchioloalveolar carcinoma (BAC), for its noninvasive features and excellent prognosis (2). BAC has a distinct histological pattern of tumor cells growing along preexisting alveolar framework without evidence of stromal, pleural, or vascular invasion. Yet, some invasive ADC, classified as mixed type, may have components or large areas of BAC-like pattern. Multistage development of adenocarcinoma putatively involves progression from atypical adenomatous hyperplasia (AAH) through BAC to invasive mixed type ADC with BAC features (AWBF) (3–5). Mice that express oncogenic KRAS develop histological changes that range from mild hyperplasia/dysplasia analogous to atypical adenomatous hyperplasia to alveolar adenomas and ultimately display overt ADC (6, 7). BAC-associated tumors have gained significant attention for their potentially greater sensitivity to treatment by epidermal growth factor receptor (EGFR) inhibitors (8). The initial studies that recognized BAC as a distinct entity reported 5-year survival rates of 100% (9–11), but more recent studies have reported lower 5-year survival rates of 83–86% for resected stage I BAC patients (12–14). These rates possibly reflect difficulties in applying the histological criteria of invasion in BAC or AWBF. Some studies have also reported that BAC with focal areas of microinvasion may also have excellent prognosis similar to noninvasive BAC (11, 15). The identification of / that may distinguish BAC from AWBF and are predictors of ADC with poor prognosis could be useful for the establishment of molecular pathological classification of lung ADC. In this study, we have used array comparative genomic hybridization (CGH) to test our hypothesis that BAC is molecularly distinguishable from AWBF by their differential genomic profiles and that marker genes for invasion and/or poor prognosis may be identified.

Results

Most chromosomal changes in both BAC and AWBF were subtle, indicating low levels of genomic alteration as well as partial attenuation by contaminating nonneoplastic host cells. The profiles of BAC and BAC with focal areas suspicious for invasion were indistinguishable and showed low copy gains at 1p, 2q, 5p, 7p, 11p, 11q, 12q, 16p, 16q, 17q, 20q, and 21q (Fig. 1B). Copy gains typically occurred at the subtelomeric regions. AWBF had similar chromosomal changes but with greater variability and frequency and longer segmental alterations. Deletions were also more common in AWBF and were observed mainly on 3p and 5q and to a lesser extent on 4q and 6q. In two patients with synchronous BAC and invasive AWBF, the BAC-like area of the latter showed greater aberrations than the BAC (Fig. 2A). In two other AWBF, greater alterations were also noted in invasive areas compared with BAC-like areas (Fig. 2B). Normal lung samples showed no alteration of these regions.

www.pnas.org/cgi/doi/10.1073/pnas.0709618105 PNAS | July 22, 2008 | vol. 105 | no. 29 | 10155–10160

Fig. 2. Increase in genomic instability. Shown is a karyotypic presentation of the log2 ratio of array CGH signals (SeeGH software). Normal genomic content is represented by the midline (blue); clonal gains deviate to the right (green lines) and deletions to the left (red lines). The lower images show progression of genomic instability represented by more chromosomal aberrations than the upper images (purple brackets). (A) Synchronous BAC and AWBF in the same patient. (B) AWBF sampled in BAC and invasive areas.
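The log2 ratios plotted in Fig. 2 are attenuated when tumor DNA is mixed with DNA from nonneoplastic cells, which is why subtle shifts can still reflect real copy number changes. The following is a minimal sketch of that dilution arithmetic, assuming a simple linear mixing model with diploid normal cells; the function name and the example purity values are illustrative assumptions and do not come from the paper or from SeeGH.

```python
import math


def expected_log2_ratio(tumor_copies: float, tumor_fraction: float,
                        normal_copies: float = 2.0) -> float:
    """Expected log2 ratio of a clone under a linear mixing model (assumption).

    tumor_copies   -- copy number of the locus in tumor cells
    tumor_fraction -- fraction of nuclei in the sample that are tumor (0..1)
    normal_copies  -- copy number in normal cells (diploid by default)
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    # Average copy number of the mixed sample, weighted by cell fraction.
    mixed = tumor_fraction * tumor_copies + (1.0 - tumor_fraction) * normal_copies
    return math.log2(mixed / normal_copies)


if __name__ == "__main__":
    # Hypothetical single-copy gain (3 copies) at several tumor fractions.
    for purity in (1.0, 0.75, 0.5):
        print(purity, round(expected_log2_ratio(3, purity), 3))
```

Under this model a single-copy gain in a sample with half tumor nuclei produces a noticeably smaller shift than the same gain in a pure tumor sample, so any fixed calling threshold becomes less sensitive as purity drops.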

Fig. 1. BAC compared with AWBF by histology, frequency scoring, and threshold filtering. (A1) BAC showing typical growth pattern of tumoral cells along the preexisting alveolar scaffold without evidence of invasion. (H&E staining; ×100.) (A2) AWBF has both BAC-like and invasive areas. (H&E staining; ×16.) (Inset) High-power view of the invasive component. (H&E staining; ×100.) (B) Frequency scoring of BAC (green) compared with AWBF (red) illustrates the percentage of cases at which a change in genomic content has occurred in each of the study groups. Some changes were shared by BAC and AWBF (yellow). The presentation is per array loci; gains are represented by the colors on the right, and deletions are represented by the colors on the left. Vertical black thick and thin lines represent 100% and 50% of the samples, respectively. Blue arrows highlight the chromosomal areas of most frequent changes in BAC. (C) Unsupervised hierarchical clustering (Genesis software) of 119 clones selected by threshold filtering shows complete segregation of the two study groups: BAC (green rectangle on the right) and AWBF (red rectangle on the left). The color code of data corresponds to log2 ratio of array CGH signals.

Using threshold-filtering, we identified 119 clones that distinguished BAC from AWBF. Hierarchical clustering of all cases using these clones separated BAC from AWBF samples (Fig. 1C). In addition, a Fisher’s Exact Test comparing the frequency of genomic changes between the BAC and AWBF groups yielded a list of 517 clones that best differentiated the two lesions. Integrating these two analyses was accomplished by applying a 10-clone ‘‘window’’ to identify shared regions [supporting information (SI) Text]. The result was a list of 256 candidate clones of high interest, from which a shorter list of 58 clones with gains in AWBF compared with BAC was selected. These clones included 113 unique amplified genes (Table S1) that could represent invasion and tumor progression markers for AWBF.

Quantitative polymerase chain reaction (QPCR) validated the gene content changes in 33 of the 113 candidate marker genes; 25 genes (75.8%) (Table S2) showed significantly higher gene copy number in AWBF compared with BAC. Among the evaluated genes were TERT and PDCD6, which we selected for further validation by QPCR and/or FISH based on their location on 5p, which showed prominent genomic changes (Fig. 1B). Measurement of both genes by QPCR demonstrated significant differences in gene copy number between BAC and AWBF (P = 0.03 for TERT and P = 0.02 for PDCD6), consistent with the array CGH results (Fig. 3A). Using FISH, we also studied the gene copy of TERT and in 21 tumors. The correlation coefficients were 0.76 between array CGH and QPCR, 0.50 between QPCR and FISH, and 0.53 between array CGH and FISH (Fig. 4). FISH appears the most sensitive in detecting the amplification levels and revealed the existence of chromosome 5 polysomy, especially in AWBF. Furthermore, FISH showed increased signal in the invasive area of AWBF compared with the BAC-like area in two samples, T41 and T46 (Fig. 4A). The coefficient of correlation for PDCD6 amplification between array CGH and QPCR was 0.94.

Using real-time QPCR (RT-QPCR), we showed that in 10 separate pairs of invasive ADC and their corresponding nonneoplastic lung tissues, PDCD6 was overexpressed in tumor compared with normal lung tissue (P < 0.01), with a mean 3-fold increase in expression (Fig. 3B). In a series of 85 resected (stage I–IIIA) non-small-cell lung carcinoma (NSCLC) samples, PDCD6 overexpression was an independent poor prognostic factor for overall survival in stage I–II ADC patients [hazard ratio (HR) = 4.94, 95% C.I. 1.22–8.52, P = 0.02] (Fig. 3C) as well as for the entire cohort of stage I–II NSCLC patients (HR = 3.82, 95% C.I. 1.26–11.6, P = 0.03).

We next performed a correlative gene expression study using external and our own lung ADC gene expression microarray

Fig. 3. PDCD6 validation and markers of poor prognosis. (A) QPCR performed on genomic DNA showed statistically significant differences in PDCD6 gene copy number between BAC and AWBF (P = 0.01), confirming the array CGH analysis that identified its amplification. (B) QPCR performed on 10 paired cDNAs of ADC-normal samples. PDCD6 was significantly overexpressed in tumor compared with normal lung tissue (P < 0.01), with a mean 3-fold higher expression. (C) Multivariate analysis adjusting for stage, histology, and differentiation that relied on QPCR of cDNA from 85 NSCLC samples found that PDCD6 was an independent poor prognostic factor for overall survival in stage I–II ADC patients. (D–F) Kaplan-Meier survival curves of SERPINE1 (D), GNB2 (E), and ST13 (F), based on gene expression data from the Duke database. Expression data were dichotomized at the median.
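The Kaplan-Meier curves in Fig. 3 D–F rest on two steps: splitting patients into high and low expressers at the median, then estimating survival within each group. The sketch below illustrates both steps with the standard product-limit estimator in plain Python. It is an illustration under stated assumptions, not the authors' pipeline: the function names and the toy follow-up values are hypothetical, and whether samples exactly at the median count as high or low is an arbitrary choice made here.

```python
from statistics import median


def dichotomize_at_median(values):
    """Return True for samples strictly above the median (assumed convention)."""
    cut = median(values)
    return [v > cut for v in values]


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time per patient
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each observed death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= (at_risk - deaths) / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical toy cohort: expression values, months of follow-up, death flags.
    expression = [1.0, 2.0, 3.0, 4.0]
    months = [20, 12, 8, 5]
    died = [1, 1, 0, 1]
    high = dichotomize_at_median(expression)
    for label, flag in (("high", True), ("low", False)):
        t = [m for m, h in zip(months, high) if h == flag]
        e = [d for d, h in zip(died, high) if h == flag]
        print(label, kaplan_meier(t, e))
```

Censored patients reduce the number at risk without lowering the curve, which is what distinguishes this estimate from a simple proportion of survivors.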

datasets, starting with the 113 amplified genes. Analysis of the Toronto, Harvard, and Michigan datasets discovered that 35%, 33%, and 28% of the genes were overexpressed; a fraction are expected to be based on gene amplification. These datasets included only 87, 59, and 42 of the 113 genes, respectively, and overexpression was noted in 42%, 36%, and 34% of them (Table 1 and Table S1). These results indicate a slight enrichment of the candidate amplified gene list for overexpression.

Univariate analysis of the Duke microarray dataset showed that 10,023 of 54,675 (18%) probe sets were prognostic for overall survival (P < 0.05), with 4,879 (9%) overexpressed genes associated with poor prognosis. Among our 113 candidate-amplified genes, 112 were represented by 227 probe sets on the U133 plus 2 array. The expression of 46/227 (20%) probe sets was significantly associated with prognosis and thus was not significantly different from the percentage of all microarray probe sets that were prognostic (P = 0.507). However, 34 of the 227 probe sets (15%), representing 27/113 (24%) putatively amplified and overexpressed genes (Table 2), were associated with poor prognosis. This percentage is significantly higher than the 9% of all probe sets (P = 0.002) with such association. The most prognostic overexpressed genes included SERPINE1 (HR = 6.02, 95% C.I. 1.98–16.23, P = 0.001), GNB2 (HR = 5.8, 95% C.I. 1.83–14.52, P = 0.002), and ST13 (HR = 5.37, 95% C.I. 1.67–13.05, P = 0.003) (Fig. 3 D–F).

Using frequency scoring, we identified the most common deletions (SI Text). The majority of deleted clones in AWBF were on 3p and 5q, and they showed more continuity in their chromosomal location than on the other chromosomes. The deleted clones on chromosome 3p and 5q included 149 genes (Table S3), among which are FHIT and DLEC1. Similar to the candidate gained genes, correlative gene expression analysis using external and our own lung ADC datasets found that 22%, 20%, and 16% of the genes in the Toronto, Harvard, and Michigan datasets were down-regulated. Among the 149 candidate genes with loss, only 113, 84, and 48, respectively, were represented in these three datasets. Down-regulation was found in 45%, 26%, and 20% of them (Table S4). These results also showed an enrichment of the candidate deleted gene list for down-regulation.

Discussion

We have demonstrated that the genomic profile of BAC is distinguishable from that of invasive AWBF, with the latter displaying greater genomic aberrations. We have also demonstrated that there is progression at the genomic level from BAC-like to invasive areas of AWBF. The 113 differentially gained genes in AWBF compared with BAC could represent candidate marker genes for tumor invasion and malignant progression. Correlative gene expression studies on microarray datasets suggest that a high percentage of these genes are prognostic markers for early stage ADC patients. Using QPCR, we validated the common amplification of 25 genes including TERT and PDCD6 and found PDCD6 overexpression to be an independent prognostic marker for poor overall survival in early stage ADC. Further validation could potentially lead to use of these genes as markers for differentiating aggressive AWBF from noninvasive and prognostically excellent BAC.

There have been many attempts to classify lung carcinomas at the molecular level through various techniques, including metaphase CGH studies (16), gene expression (17–19), and array CGH analyses (20). Our whole-genome submegabase array study shows that BAC is characterized by alterations at subtelomeric chromosomal regions. Such aberrations could be explained mechanistically by the breakage/fusion/bridge (BFB) cycle (21). Normally telomeres protect chromosome ends and prevent their fusion; however, telomere loss may lead to chromosome instability through cycles of sister-chromatid fusion during replication, formation of a chromosomal bridge, and breakage of the bridge in proximity to the site of initial fusion. This self-perpetuating process resolves through net gain of telomere by translocation of the ends of another chromosome, by small subtelomeric duplications of the end of the same chromosome, or by direct telomere addition. Consequently, gene amplification and

deletions occur preferentially at the subtelomeric chromosomal regions. BFB also predicts progressive accumulation of genetic alterations, as observed in both BAC and AWBF. It has been shown that, as BFB progresses and more chromosomal abnormalities accumulate, the breakpoints are more interstitial (21).

The differential genomic changes noted between BAC and invasive AWBF provide important evidence for a better understanding of the pathogenesis of ADC. We have used two independent algorithms to enhance the certainty of the profile that distinguishes BAC from invasive AWBF. Our inability to clearly differentiate BAC from BAC with focal area of invasion at the genomic level suggests that both may have a similar behavior with low metastatic potential and that early invasion is likely determined at gene expression levels by epigenetic mechanisms. The finding also suggests that BAC or BAC with focal invasion, which are negative for the overexpression of identified marker genes, could potentially be grouped into a single diagnostic entity with excellent prognosis (11, 15).

The 113 candidate marker genes that we identified may represent part of the ‘‘signature of chromosomal instability’’ for invasion and malignant progression in AWBF (22). The correlative gene expression validation rate (≈35%) in the Harvard and Michigan datasets was limited by the low number of probe sets in the microarray platform that matched the genomic gene list (less than half). Nevertheless, it confirms the importance of some of our candidate markers in lung carcinoma (Table S1) and the overexpression of others such as SAR1A (23), SYCP1 (24), and MCM7 (22), which have been linked to other malignancies as well as lung cancer. We have previously reported the poor prognostic significance of TERT gene amplification in NSCLC (25). Our present findings extend the importance of TERT amplification to AWBF and increased TERT gene copy because of chromosome 5 polysomy.

Fig. 4. TERT validation by QPCR and FISH. (A) TERT content measured by array CGH and QPCR on genomic DNA (relative to normal control) and FISH (mean gene copy number per nucleus) show high correlation between the different methods. The black lines connecting copy number of TERT (filled circles) and 5q (open squares) are drawn to highlight the difference in copy number between the two probes. The blue boxes mark the AWBF sampled in invasive areas. Samples T41A, T44A, and T46A represent the BAC-like area, and T41B, T44B, and T46B represent the invasive area of AWBF. (B) FISH performed on AWBF in BAC area (sample T195) using the dual-color FISH probe mix that contains the hTERT (5p15, green signal) and the control D5S89 probe (5q31, red signal). Gain of TERT is reflected by the increased number of green compared with red foci. [Panel A plot annotations: correlation P values, CGH & QPCR <0.01; QPCR & FISH 0.02; CGH & FISH 0.01. Axes: TERT log2 ratio (CGH & QPCR) and TERT copy number (FISH), by sample, grouped as BAC and AWBF.]

PDCD6, programmed cell death 6, or apoptosis-linked gene 2 (ALG-2) is located on chromosome 5pter–5p15.2 and is in close proximity to TERT. It encodes a 191-aa protein that was originally considered pro-apoptotic (26). PDCD6 belongs to the penta-EF hand Ca2+-binding protein family (27) and is ubiquitously expressed in the body. PDCD6 is required for T cell receptor-, glucocorticoid- (26), and FAS- (28) induced cell death. It interacts with the SH3-binding domain containing pro-apoptotic protein AIP1 (ALG-2-interacting protein-1) (29), (30), and annexin XI (31) in a Ca2+-dependent way as well as with DAPK1 (death-associated protein kinase 1) (32). During FAS-induced apoptosis, PDCD6, which is a 22-kDa protein, is cleaved in its N-terminus to yield a 19-kDa protein and translocates from the cytoplasmic membrane to the cytosol (28). More recent work questioned the need of PDCD6 for apoptosis, as it may be compensated by other functionally redundant proteins (33). Immunohistochemical staining has revealed high expression of PDCD6 in primary tumors compared with normal tissues of the breast, liver, and lung (34, 35). Both nuclear and cytoplasmic overexpression have been reported for lung cancer, especially metastatic ADC, indicating that it plays a role in survival pathways (35). We confirm that PDCD6 is significantly overexpressed in lung ADC (35). Moreover, we have also demonstrated that PDCD6 is a poor prognostic factor in both early stage NSCLC and ADC and thus may serve as one of the markers to differentiate more indolent from aggressive AWBF.

Potti et al. (19) reported a genomic strategy to refine prognosis for early stage NSCLC and identify patients at high risk of relapse after initial surgery. They constructed a lung metagene model based on gene expression data and showed that its prognostic accuracy surpasses that of a model based on traditional clinical data. Their model was applied to all histologic types of early stage disease but did not consider BAC as a special entity. Although none of the 122 genes in the published metagenes matched our 113 genes, analysis of our genes in their dataset showed that the overexpression of 27 genes (24%) was associated with poor prognosis in early stage ADC patients. Significantly higher gene copy number in AWBF compared with BAC was confirmed by QPCR on genomic DNA in 74% of these genes (20 of 27 genes, Table S2).
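The enrichment claim behind these prognostic markers compares an observed proportion (34 of 227 candidate probe sets associated with poor prognosis) with a background proportion (4,879 of 54,675 probe sets on the array). The test the authors used for that comparison is not stated in this section, so the sketch below substitutes an exact one-sided binomial tail as one plausible way to check it; the function name is hypothetical and the result need not equal the published P value.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed by direct summation."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    background = 4879 / 54675      # poor-prognosis fraction among all probe sets
    observed, total = 34, 227      # candidate probe sets, as reported in Results
    p_value = binomial_upper_tail(observed, total, background)
    print(f"background fraction = {background:.3f}")
    print(f"observed fraction   = {observed / total:.3f}")
    print(f"one-sided binomial P = {p_value:.4f}")
```

Treating the background fraction as fixed ignores its own sampling error, which is negligible here because the array contains far more probe sets than the candidate list.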

Table 1. Correlative gene expression validation results for the candidate amplified genes

Dataset | Toronto | Harvard (17) | Michigan (18)
No. of ADC samples | 39 | 127 | 86
No. of normal lung samples | 10 | 17 | 10
Array type | U133A | U95A | HuGeneFL
No. of genes/probe sets | 13,840/22,215 | 9,513/12,625 | 5,945/7,129
No. of genes/probe sets up-regulated | 4,885/6,251 | 3,175/3,635 | 1,692/1,909
No. of genes/probe sets down-regulated | 3,076/3,998 | 1,887/2,119 | 929/946
% of genes up-regulated | 35.29 | 33.37 | 28.46
% FDR used | 4.06 | 3.98 | 4.98
No. of genes from 113 gene list that are present on array | 87 | 59 | 42
No. of up-regulated genes that match 113 gene list | 38 | 22 | 15
No. of genes that match by mistake based on FDR | 1.54 | 0.88 | 0.75
Expected no. of up-regulated genes based on observed % of up-regulated genes | 30 | 19 | 12
Validation rate, % | 41.90 | 35.80 | 33.93

FDR, false discovery rate.
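The rows of Table 1 are arithmetically linked, although the table does not spell out the formula. The sketch below is a reconstruction under an assumption: that the number of genes matching by mistake is the matched count multiplied by the FDR, and that the validation rate is the remaining matches divided by the genes present on the array. That assumption reproduces the published rates to within about 0.01 percentage points, but it is inferred from the numbers and is not a description taken from the paper or its SI Text; the function names are hypothetical.

```python
def false_matches(matched_up: int, fdr_percent: float) -> float:
    """Matches expected to be false positives at the stated FDR (assumed formula)."""
    return matched_up * fdr_percent / 100.0


def validation_rate(matched_up: int, fdr_percent: float, present_on_array: int) -> float:
    """FDR-corrected percentage of candidate genes found up-regulated (assumed formula)."""
    corrected = matched_up - false_matches(matched_up, fdr_percent)
    return 100.0 * corrected / present_on_array


if __name__ == "__main__":
    # (matched up-regulated genes, % FDR, genes present on array) from Table 1.
    datasets = {
        "Toronto": (38, 4.06, 87),
        "Harvard": (22, 3.98, 59),
        "Michigan": (15, 4.98, 42),
    }
    for name, (matched, fdr, present) in datasets.items():
        print(name,
              round(false_matches(matched, fdr), 2),
              round(validation_rate(matched, fdr, present), 2))
```

Subtracting the expected false matches before dividing keeps the validation rate from being inflated by probe sets that pass the significance threshold by chance.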

Table 2. Markers of poor prognosis in early stage lung ADC, as identified in silico in the Duke microarray expression dataset (19)

Gene symbol | Gene name | Hazard ratio | 95% C.I. | P
AP1S1 | Adaptor-related protein complex 1, σ 1 subunit | 4.59 | 1.77–11.91 | 0.002
AP4M1 | Adaptor-related protein complex 4, μ 1 subunit | 3.68 | 1.5–9.01 | 0.004
BRD9 | Bromodomain containing 9 | 3.89 | 1.1–13.8 | 0.035
CCDC21 | Coiled-coil domain containing 21 | 6.46 | 1.15–36.14 | 0.034
CCL8 | Chemokine (C-C motif) ligand 8 | 1.74 | 1–3.04 | 0.050
COPS6 | COP9 constitutive photomorphogenic homolog subunit 6 (Arabidopsis) | 2.95 | 1.4–6.22 | 0.004
CSDE1 | Cold shock domain containing E1, RNA-binding | 3.61 | 1.1–11.84 | 0.034
EP300 | E1A binding protein p300 | 6.15 | 1.23–30.63 | 0.027
GNB2 | Guanine nucleotide binding protein (G protein), β polypeptide 2 | 6.69 | 2.1–21.29 | 0.001
HIPK1 | Homeodomain interacting protein kinase 1 | 2.98 | 1.1–8.11 | 0.032
HRSP12 | Heat-responsive protein 12 | 2.98 | 1.19–7.45 | 0.020
LAPTM4B | Lysosomal associated protein transmembrane 4β | 1.47 | 1.01–2.13 | 0.044
MCM7 | MCM7 minichromosome maintenance deficient 7 (Saccharomyces cerevisiae) | 2.74 | 1.45–5.19 | 0.002
MGC4677 | Hypothetical protein MGC4677 | 3.97 | 1.89–8.31 | <0.001
OLFM2 | Olfactomedin 2 | 4.45 | 1.5–13.15 | 0.007
POP7 | Processing of precursor 7, ribonuclease P subunit (S. cerevisiae) | 3.25 | 1.2–8.79 | 0.020
PPA1 | Pyrophosphatase (inorganic) 1 | 3.06 | 1.31–7.16 | 0.010
RABL4 | RAB, member of RAS oncogene family-like 4 | 3.88 | 1.12–13.43 | 0.032
RPL30 | Ribosomal protein L30 | 19.70 | 2.44–158.85 | 0.005
SERPINE1 | Serpin peptidase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1), member 1 | 3.43 | 1.84–6.4 | <0.001
SH3BGRL3 | SH3 domain binding glutamic acid-rich protein like 3 | 5.01 | 1.75–14.35 | 0.003
SLC25A17 | Solute carrier family 25 (mitochondrial carrier), member 17 | 3.68 | 1.32–10.26 | 0.013
ST13 | Suppression of tumorigenicity 13 | 2.02 | 1.21–3.37 | 0.007
TAF6 | TAF6 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 80 kDa | 3.99 | 1.52–10.52 | 0.005
TLE3 | Transducin-like enhancer of split 3 (E(sp1) homolog, Drosophila) | 5.09 | 1.45–17.83 | 0.011
TOB2 | Transducer of ERBB2, 2 | 3.37 | 1.01–11.32 | 0.049
ZNF561 | Zinc finger protein 561 | 8.20 | 1.41–47.74 | 0.019

The 27 putative markers that we identified include serpin peptidase inhibitor, clade E, member 1 (SERPINE1); guanine nucleotide-binding protein β-2 (GNB2); and suppression of tumorigenicity 13 (ST13). SERPINE1, also known as plasminogen activator inhibitor-1, is the primary physiological inhibitor of both tissue-type plasminogen activator (tPA) and urokinase-like PA (uPA) and thus promotes the stabilization and formation of thrombi. In addition to regulating the fibrinolytic system, SERPINE1 has de-adhesive properties and is capable of inducing cell detachment that is dependent on the presence of complexes of uPA:uPA-receptor matrix-engaged integrins (36). Interestingly, SERPINE1 high expression has been linked previously with poor prognosis in a number of malignancies (37), including lung ADC (17). High expression of SERPINE1 may activate cellular scattering, promote migration, and possibly enhance metastatic spread, all of which could account for the poor prognosis observed. Our study relates the high expression to amplification present at the genomic level. SERPINE1 is located on the same locus, 7q21.3–q22, as GNB2, which is a prognostic marker for lung ADC. GNB2 is the second of five possible genes encoding the β subunit of G proteins. As of yet, no other study associates GNB2 with lung cancer, but it is well established that G protein-coupled receptors can promote cancer progression and metastasis in a variety of tumors including NSCLC (38). ST13, whose aliases are P48, HOP, and Hsc70-interacting protein, acts as a cochaperone of heat-shock protein 70 (Hsp70) to stabilize its activity (39). Hsp70 is known to promote survival in cancer cells (40), thus making it reasonable to hypothesize that ST13 amplification would lead to tumor progression.

For the most part, the function and role in lung cancer of the remaining genes from the 113 candidate gene list are poorly characterized, and their involvement in the progression of lung ADC is worthy of further study. These genes are also promising markers for poor prognosis in early lung ADC and could serve as potential targets for future therapy. The finding that gene losses at certain chromosomal regions are more prominent in AWBF than BAC indicates that gene deletions may also play an important role in the progression of ADC. Our putatively deleted genes require further validation.

Materials and Methods

Study Materials. The study protocol was approved by the University Health Network Research Ethics Board and included 26 resected lung cancers (1996–2005) classified histologically as nonmucinous BAC or invasive-AWBF. For each case, the histology slides were reviewed independently by the study pathologists (S.A.-R. and M.-S.T.) and tumors were classified according to the 2004 WHO criteria (2). Twelve cases were classified as AWBF when they had not only prominent nonmucinous BAC-like pattern (>50% of the tumor) but also frank invasive ADC of other histological types, such as acinar, papillary, or solid (Fig. 1A). Fourteen cases were considered noninvasive BAC or BAC with possible focal microinvasive area. In 11 of the AWBF cases, tissue from the BAC-like area was sampled, and in three, additional tissue from a frankly invasive area was sampled separately. One case involved sampling from the invasive area only. Clinical characteristics of the samples are provided in Table S5. Eight corresponding normal lung tissues were selected arbitrarily as normal controls. For mRNA expression studies, we used matched tumor and normal tissues from the University Health Network snap-frozen lung tumor bank (41).

Tissue Sampling, DNA Isolation, and Array CGH. DNA was isolated from formalin-fixed, paraffin-embedded tissue. Guided by hematoxylin and eosin (H&E)-stained sections, we marked representative paraffin blocks with tumor areas containing >50% tumor cell nuclei and cored them by using the needle for tissue array (Beecher Instruments). The process of tissue sampling, DNA isolation, and array CGH is detailed in the SI Text.

Array CGH Data Analysis. Array CGH data analysis was based on two independent algorithms, threshold-filtering and frequency-scoring (42), using multiple software tools including SeeGH (43), Genesis (44), aCGH-Smooth (45), and FrequencyPlot (42). The algorithms and the overlap between them are described in SI Text. Our analysis concentrated on clone gains rather than losses because clone gains involved more chromosomes, their prevalence was higher (Fig. 1B), and occasionally they were of higher copy number (not limited to just two copies per clone).
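The two analysis steps named above, calling each clone as gained or lost and then comparing how often a clone is gained in each group, can be sketched in a few lines. This is an illustration only: the actual thresholds and scoring rules are given in the SI Text and are not reproduced here, so the `threshold=0.2` default, the function names, and the example counts are all hypothetical. The group comparison uses a two-sided Fisher's exact test, the test named in Results, written out from the hypergeometric distribution so that it needs only the standard library.

```python
from math import comb


def call_clone(log2_ratio: float, threshold: float = 0.2) -> int:
    """Return +1 for gain, -1 for loss, 0 for no change (threshold is hypothetical)."""
    if log2_ratio >= threshold:
        return 1
    if log2_ratio <= -threshold:
        return -1
    return 0


def gain_frequency(calls) -> float:
    """Fraction of samples in a group that show a gain at one clone."""
    return sum(1 for c in calls if c == 1) / len(calls)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= observed * (1 + 1e-9))


if __name__ == "__main__":
    # Hypothetical clone: gained in 11 of 15 AWBF samples and 2 of 14 BAC samples.
    print(fisher_exact_two_sided(11, 4, 2, 12))
```

Running the test once per clone across a tiling array yields thousands of P values, so a ranking or windowing step such as the 10-clone window described in Results is needed before individual clones are interpreted.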

Aviel-Ronen et al. PNAS | July 22, 2008 | vol. 105 | no. 29 | 10159

Validation by RT-QPCR. Gene copy numbers were evaluated for the DNA used in the array CGH studies by RT-QPCR, using primer sets for target and housekeeping genes. The evaluation of 33 genes, including TERT and PDCD6, was performed on all of the array CGH samples except two BACs (Tables S2 and S6). The mRNA expression study was carried out on two groups of samples: 10 pairs of matched ADC and their adjacent normal lung tissue and 85 NSCLC samples. Primer sets design, reaction conditions and analysis description, and patients' demographic information are included in SI Text and Tables S7 and S8.

Validation by FISH. The 21 cases studied by FISH included 7 BAC with or without suspicion for invasion and 14 AWBF; 3 of the latter were scored in both their BAC and invasive areas. An additional case of AWBF was scored only in the invasive area. Among these cases was one with synchronous BAC and invasive AWBF sampled from the BAC area. FISH failed in 6 samples. The FISH protocol is detailed in the SI Text.

The Toronto DNA Microarray Dataset. RNA was extracted by the phenol/chloroform method from 39 adenocarcinomas (Table S9) and 10 normal lung tissue samples. RNA quality was assessed by gel electrophoresis and Bioanalyzer (Agilent). cRNA synthesis, hybridization, and scanning were performed according to the manufacturer's protocol. The adenocarcinoma RNA was profiled on the U133A chip (Affymetrix) and the normal lung RNA on the U133A2 chip (Affymetrix). To ensure the compatibility of these two platforms, four of the 39 adenocarcinomas were reprofiled on the U133A2 chip.

Correlative Gene Expression Study. We validated the 113 amplified genes and the 149 deleted genes from array CGH analysis on the Toronto microarray dataset and on two publicly available lung cancer microarray expression datasets (17, 18), referred to as "Harvard" and "Michigan," respectively. For a detailed description of the analytic process and a summary of the validation see the SI Text and Tables S1 and S4.

In addition, univariate analysis was performed on microarray expression data of stage I ADC patient samples from a third dataset referred to as "Duke" (19) to identify prognostic markers and compare them with the 113 candidate markers, as detailed in the SI Text and Table S7.

Statistical Analysis. The Mann–Whitney test was used to compare the genomic copy number of 33 genes including TERT and PDCD6 (Table S2). Pearson correlation coefficients assessed the correlation between array CGH, QPCR, and FISH results. The Wilcoxon signed rank test was used to compare PDCD6 expression in the paired ADC-normal samples. Survival analysis of PDCD6 mRNA of 85 NSCLC patients and 34 stage I ADC patients from the Duke dataset is described in the SI Text.

ACKNOWLEDGMENTS. We thank Ni Liu, Olga Ludkovski, Paul Boutros, and Dr. Jeremy Squire for assistance and advice. This work was supported by the Canadian Cancer Society and National Cancer Institute of Canada Grant 015184, Genome Canada/British Columbia, Ontario Institute of Cancer Research, and the Canadian Institutes for Health Research. S.A.-R. is a fellow of the Canadian Institutes for Health Research Training Program for Clinician Scientists in Molecular Oncologic Pathology (STP-53912) and a recipient of the Ontario Cancer Institute Knudson Research Fellowship and the National Cancer Institute of Canada Terry Fox Foundation Clinical Research Fellowship. B.P.C. is supported by a scholarship from the Michael Smith Foundation for Health Research. M.-S.T. is the M. Qasim Choksi Chair in Lung Cancer Translational Research. I.J. is the recipient of the Canada Research Chair in Integrative Computational Biology.
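Two of the tests named under Statistical Analysis can be illustrated with a small standard-library sketch: the Mann–Whitney U statistic for comparing copy number between two groups, and the Pearson correlation coefficient for comparing measurements of the same samples on two platforms. The function names and the toy inputs are hypothetical, no p-values are computed, and this is not the software used in the study.

```python
import math
from typing import Sequence


def mann_whitney_u(a: Sequence[float], b: Sequence[float]) -> float:
    """U statistic for sample a: number of pairs (x in a, y in b) with x > y.

    Tied pairs contribute 0.5, so U(a, b) + U(b, a) == len(a) * len(b).
    """
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two paired measurements."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Invented copy-number values for one gene in two groups of tumors.
    group_1 = [2.0, 2.1, 1.9]
    group_2 = [2.8, 3.5, 3.0]
    print(mann_whitney_u(group_2, group_1))
    # Invented paired measurements of the same samples on two platforms.
    print(pearson_r([0.1, 0.4, 0.9], [2.2, 2.9, 4.1]))
```

In practice a statistics package would supply the p-values and the tie and small-sample corrections that this sketch leaves out.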

1. Travis WD, Travis LB, Devesa SS (1995) Lung Cancer. Cancer 75:191–202.
2. Travis WD, Brambilla E, Müller-Hermelink HK, Harris CC, eds (2004) WHO Classification of Tumors: Pathology and Genetics of Tumors of the Lung, Pleura, Thymus and Heart. (IARC Press, Lyon), pp 35–44.
3. Kitamura H, Kameda Y, Ito T, Hayashi H (1999) Atypical adenomatous hyperplasia of the lung. Implications for the pathogenesis of peripheral lung adenocarcinoma. Am J Clin Pathol 111:610–622.
4. Kim CF, et al. (2005) Identification of bronchioalveolar stem cells in normal lung and lung cancer. Cell 121:823–835.
5. Jackson EL, et al. (2001) Analysis of lung tumor initiation and progression using conditional expression of oncogenic K-ras. Genes Dev 15:3243–3248.
6. Johnson L, et al. (2001) Somatic activation of the K-ras oncogene causes early onset lung cancer in mice. Nature 410:1111–1116.
7. Guerra C, et al. (2003) Tumor induction by an endogenous K-ras oncogene is highly dependent on cellular context. Cancer Cell 4:111–120.
8. Miller VA, et al. (2004) Bronchioloalveolar pathologic subtype and smoking history predict sensitivity to gefitinib in advanced non-small-cell lung cancer. J Clin Oncol 22:1103–1109.
9. Yokose T, et al. (2000) Favorable and unfavorable morphological prognostic factors in peripheral adenocarcinoma of the lung 3 cm or less in diameter. Lung Cancer 29:179–188.
10. Suzuki K, et al. (2000) Prognostic significance of the size of central fibrosis in peripheral adenocarcinoma of the lung. Ann Thorac Surg 69:893–897.
11. Noguchi M, et al. (1995) Small adenocarcinoma of the lung. Histologic characteristics and prognosis. Cancer 75:2844–2852.
12. Rena O, et al. (2003) Stage I pure bronchioloalveolar carcinoma: Recurrences, survival, and comparison with adenocarcinoma of the lung. Eur J Cardiothorac Surg 23:409–414.
13. Ebright MI, et al. Clinical pattern and pathologic stage but not histologic features predict outcome for bronchioloalveolar carcinoma. Ann Thorac Surg 74:1640–1647.
14. Breathnach OS, et al. (2001) Bronchioloalveolar carcinoma of the lung: Recurrences and survival in patients with stage I disease. J Thorac Cardiovasc Surg 121:42–47.
15. Sakurai H, et al. (2004) Grade of stromal invasion in small adenocarcinoma of the lung: Histopathological minimal invasion and prognosis. Am J Surg Pathol 28:198–206.
16. Petersen I, Petersen S (2001) Towards a genetic-based classification of human lung cancer. Anal Cell Pathol 22:111–121.
17. Beer DG, et al. (2002) Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat Med 8:816–824.
18. Bhattacharjee A, et al. (2001) Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc Natl Acad Sci USA 98:13790–13795.
19. Potti A, et al. (2006) A genomic strategy to refine prognosis in early-stage non-small-cell lung cancer. N Engl J Med 355:570–580.
20. Shibata T, et al. (2005) Genetic classification of lung adenocarcinoma based on array-based comparative genomic hybridization analysis: Its association with clinicopathologic features. Clin Cancer Res 11:6177–6185.
21. Murnane JP, Sabatier L (2004) Chromosome rearrangements resulting from telomere dysfunction and their role in cancer. BioEssays 26:1164–1174.
22. Carter SL, Eklund AC, Kohane IS, Harris LN, Szallasi Z (2006) A signature of chromosomal instability inferred from gene expression profiles predicts clinical outcome in multiple human cancers. Nat Genet 38:1043–1048.
23. Difilippantonio S, et al. (2003) Gene expression profiles in human non-small and small-cell lung cancers. Eur J Cancer 39:1936–1947.
24. Tureci O, et al. (1998) Identification of a meiosis-specific protein as a member of the class of cancer/testis antigens. Proc Natl Acad Sci USA 95:5211–5216.
25. Zhu CQ, et al. (2006) Amplification of telomerase (hTERT) gene is a poor prognostic marker in non-small-cell lung cancer. Br J Cancer 94:1452–1459.
26. Vito P, Lacana E, D'Adamio L (1996) Interfering with apoptosis: Ca2+-Binding protein ALG-2 and Alzheimer's disease gene ALG-3. Science 271:521–525.
27. Maki M, Narayana SV, Hitomi K (1997) A growing family of the Ca2+-binding proteins with five EF-hand motifs. Biochem J 328:718–720.
28. Jung YS, et al. (2001) Apoptosis-linked gene 2 binds to the death domain of Fas and dissociates from Fas during Fas-mediated apoptosis in Jurkat cells. Biochem Biophys Res Commun 288:420–426.
29. Vito P, Pellegrini L, Guiet C, D'Adamio L (1999) Cloning of AIP1, a novel protein that associates with the apoptosis-linked gene ALG-2 in a Ca2+-dependent reaction. J Biol Chem 274:1533–1540.
30. Kitaura Y, Matsumoto S, Satoh H, Hitomi K, Maki M (2001) Peflin and ALG-2, members of the penta-EF-hand protein family, form a heterodimer that dissociates in a Ca2+-dependent manner. J Biol Chem 276:14053–14058.
31. Satoh H, Shibata H, Nakano Y, Kitaura Y, Maki M (2002) ALG-2 interacts with the amino-terminal domain of annexin XI in a Ca2+-dependent manner. Biochem Biophys Res Commun 291:1166–1172.
32. Lee JH, Rho SB, Chun T (2005) Programmed cell death 6 (PDCD6) protein interacts with death-associated protein kinase 1 (DAPk1): Additive effect on apoptosis via caspase-3 dependent pathway. Biotechnol Lett 27:1011–1015.
33. Jang IK, Hu R, Lacana E, D'Adamio L, Gu H (2002) Apoptosis-linked gene 2-deficient mice exhibit normal T-cell development and function. Mol Cell Biol 22:4094–4100.
34. Krebs J, Saremaslani P, Caduff R (2002) ALG-2: A Ca2+-binding modulator protein involved in cell proliferation and in cell death. Biochim Biophys Acta 1600:68–73.
35. la Cour JM, et al. (2003) Up-regulation of ALG-2 in hepatomas and lung cancer tissue. Am J Pathol 163:81–89.
36. Czekay RP, Loskutoff DJ (2004) Unexpected role of plasminogen activator inhibitor 1 in cell adhesion and detachment. Exp Biol Med (Maywood) 229:1090–1096.
37. Andreasen PA, Kjoller L, Christensen L, Duffy MJ (1997) The urokinase-type plasminogen activator system in cancer metastasis: A review. Int J Cancer 72:1–22.
38. Dorsam RT, Gutkind JS (2007) G-protein-coupled receptors and cancer. Nat Rev Cancer 7:79–94.
39. Nollen EA, et al. (2001) Modulation of in vivo HSP70 chaperone activity by Hip and Bag-1. J Biol Chem 276:4677–4682.
40. Ravagnan L, et al. (2001) Heat-shock protein 70 antagonizes apoptosis-inducing factor. Nat Cell Biol 3:839–843.
41. Barsyte-Lovejoy D, et al. (2006) The c-Myc oncogene directly induces the H19 noncoding RNA by allele-specific binding to potentiate tumorigenesis. Cancer Res 66:5330–5337.
42. Coe BP, et al. (2006) Gain of a region on 7p22.3, containing MAD1L1, is the most frequent event in small-cell lung cancer cell lines. Genes Chromosomes Cancer 45:11–19.
43. Chi B, DeLeeuw RJ, Coe BP, MacAulay C, Lam WL (2004) SeeGH–A software tool for visualization of whole genome array comparative genomic hybridization data. BMC Bioinformatics 5:13.
44. Sturn A, Quackenbush J, Trajanoski Z (2002) Genesis: Cluster analysis of microarray data. Bioinformatics 18:207–208.
45. Jong K, Marchiori E, Meijer G, Vaart AV, Ylstra B (2004) Breakpoint identification and smoothing of array comparative genomic hybridization data. Bioinformatics 20:3636–3637.

10160 | www.pnas.org/cgi/doi/10.1073/pnas.0709618105 Aviel-Ronen et al.