Supplementary Methods
Human lung tissues and tissue microarray (TMA)

All human tissues were obtained from the Lung Cancer Specialized Program of Research Excellence (SPORE) Tissue Bank at the M.D. Anderson Cancer Center (Houston, TX). A collection of 26 lung adenocarcinomas and 24 paired non-tumoral tissues was snap-frozen and preserved in liquid nitrogen for total RNA extraction. For each tissue sample, the percentage of malignant tissue was calculated and the cellular composition of the specimen was determined by histological examination (I.I.W.) following hematoxylin-eosin (H&E) staining. All malignant samples retained contained more than 50% tumor cells.

Specimens resected from patients with stage I-IV NSCLC who had received no prior chemotherapy or radiotherapy were used for TMA analysis by immunohistochemistry. Patients who had smoked at least 100 cigarettes in their lifetime were defined as smokers. Samples were fixed in formalin, embedded in paraffin, stained with H&E, and reviewed by an experienced pathologist (I.I.W.). The 413 tissue specimens collected from 283 patients included 62 normal bronchial epithelia, 61 bronchial hyperplasias (Hyp), 15 squamous metaplasias (SqM), 9 squamous dysplasias (Dys), and 26 carcinomas in situ (CIS), as well as 98 squamous cell carcinomas (SCC) and 141 adenocarcinomas. Normal bronchial epithelium, hyperplasia, squamous metaplasia, dysplasia, CIS, and SCC were considered to represent successive steps in the development of SCC. All tumors and lesions were classified according to the World Health Organization (WHO) 2004 criteria. The TMAs were prepared with a manual tissue arrayer (Advanced Tissue Arrayer ATA100, Chemicon International, Temecula, CA) using 1-mm-diameter cores in triplicate for tumors and 1.5- to 2-mm cores for normal epithelia and premalignant lesions.
Microarray data analysis

Raw data files and probe intensities of gene expression in cells of the human in vitro lung carcinogenesis model were initially processed with Perfect Match, which applies the positional-dependent nearest-neighbor (PDNN) model to match observed probe signals with model-fitted values (1). Gene features with differential expression greater than 1.65-fold in any of the analyzed cells compared with the NHBE cells were selected for further analysis (n=1221). Unsupervised clustering by average linkage was performed using Cluster v2.11, and results were visualized with TreeView (Michael Eisen Laboratory, Lawrence Berkeley National Laboratory and University of California, Berkeley; http://rana.lbl.gov/EisenSoftware.htm). Self-organizing map (SOM) analysis was then performed to identify clusters of genes whose expression varied from the normal NHBE cells to the tumorigenic 1170-I cells. Differentially expressed genes were also analyzed using Ingenuity Pathway Analysis® (http://www.ingenuity.com) to gain a pathway-level understanding of the modulated genes and to identify significant gene-interaction networks. The cancer microarray database and integrated data-mining platform Oncomine (2) was used to analyze the expression of UBE2C in publicly available microarray data sets of human lung carcinomas.

Raw data *.CEL files of the cells constituting the in vitro lung carcinogenesis model were also analyzed using BRB-ArrayTools v.3.7.0 Beta, developed by Dr. Richard Simon and the BRB-ArrayTools Development Team (Biometric Research Branch, National Cancer Institute; http://linus.nci.nih.gov/BRB-ArrayTools.html), to develop a gene signature reflecting differential and progressive gene expression across the in vitro model for subsequent analysis in published NSCLC data sets.
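The fold-change filter described in this section (greater than 1.65-fold in any analyzed cell line relative to NHBE cells) can be illustrated with a minimal Python sketch. The data layout, the function name, the placeholder cell line CELL_X, and the reading of "differential expression" as a change in either direction are assumptions made for illustration; the selection reported here was carried out on PDNN-processed intensities, not with this code.

```python
import math

FOLD_THRESHOLD = 1.65  # threshold stated in the text


def select_by_fold_change(expr, reference="NHBE", threshold=FOLD_THRESHOLD):
    """Return genes that differ from the reference cell line by more than
    `threshold`-fold (up or down) in at least one other cell line.

    `expr` maps gene -> {cell line -> positive, linear-scale expression}.
    This layout is hypothetical and chosen only for the example.
    """
    log_threshold = math.log2(threshold)
    selected = []
    for gene, values in expr.items():
        ref_value = values[reference]
        for cell_line, value in values.items():
            if cell_line == reference:
                continue
            # Compare on the log2 scale so up- and down-regulation are symmetric.
            if abs(math.log2(value / ref_value)) > log_threshold:
                selected.append(gene)
                break
    return selected


# Made-up values for illustration only (not data from the study).
toy = {
    "GENE_A": {"NHBE": 100.0, "CELL_X": 120.0, "1170-I": 400.0},  # 4-fold up
    "GENE_B": {"NHBE": 100.0, "CELL_X": 110.0, "1170-I": 90.0},   # below threshold
    "GENE_C": {"NHBE": 100.0, "CELL_X": 50.0, "1170-I": 95.0},    # 2-fold down
}
print(select_by_fold_change(toy))  # ['GENE_A', 'GENE_C']
```

Working on the log2 scale treats a 1.65-fold decrease the same way as a 1.65-fold increase; if only increases were intended, the absolute value would be dropped.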
Samples were normalized by Robust Multichip Average (RMA) normalization, and genes were selected if they showed at least a 2-fold expression difference between the NHBE and 1170-I cells, a p-value < 0.001 in the univariate t-test with permutation, and regular progressive variation across all cells of the in vitro model (n=584).

We analyzed raw microarray data files from several published NSCLC cohorts to assess the expression of the 584 genes in clinical samples of NSCLC. We used 361 adenocarcinomas from a recent multi-site blinded validation study by the Director’s Challenge Consortium for the Molecular Classification of Lung Adenocarcinoma (3) that were originally surgically resected at Memorial Sloan Kettering (n=104), University of Michigan (n=178), and H. Lee Moffitt Cancer Center (n=79) (National Cancer Institute Cancer Array database, experiment ID 1015945236141280:1; https://caarraydb.nci.nih.gov/caarray). From the same report, we excluded adenocarcinomas from the Dana-Farber Cancer Institute because of their reduced total gene expression levels, as well as samples that had been excluded in the original report (3). We also used 125 unique adenocarcinomas from the cohort by Bhattacharjee et al. (4), 58 adenocarcinomas and 53 SCCs from the published report by Bild et al. (5) (http://data.cgt.duke.edu/oncogene.php), and 129 SCCs from the report by Raponi et al. (6). All NSCLC samples analyzed included only those that passed the quality checks in the original published reports of these cohorts. In addition, each gene expression sample analyzed represents one unique patient. All raw data files were imported and analyzed using BRB-ArrayTools v.3.7.0 Beta. Genes were normalized independently by cohort.
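Normalizing genes independently by cohort can be sketched as follows. This minimal Python example median-centers each gene's log-scale values within each cohort, so that cohorts measured on different platforms share a common per-gene baseline before they are combined. Median-centering is the operation named in these methods for the combined data set; the data layout, function name, cohort labels, and toy values are hypothetical, and the actual processing was done in BRB-ArrayTools rather than with this code.

```python
from statistics import median


def median_center_by_cohort(cohorts):
    """Median-center each gene within each cohort.

    `cohorts` maps cohort name -> {gene -> list of log2 expression values,
    one value per sample}. Returns the same structure with each gene's
    within-cohort median subtracted from its values.
    """
    centered = {}
    for cohort, genes in cohorts.items():
        centered[cohort] = {}
        for gene, values in genes.items():
            gene_median = median(values)
            centered[cohort][gene] = [v - gene_median for v in values]
    return centered


# Made-up log2 values for illustration only (not data from the study).
toy = {
    "cohort_1": {"UBE2C": [8.0, 9.0, 10.0]},
    "cohort_2": {"UBE2C": [3.0, 5.0, 4.0, 6.0]},
}
centered = median_center_by_cohort(toy)
print(centered["cohort_1"]["UBE2C"])  # [-1.0, 0.0, 1.0]
print(centered["cohort_2"]["UBE2C"])  # [-1.5, 0.5, -0.5, 1.5]
```

After centering, a value of 0 means "at the cohort median for that gene" in every cohort, which is what makes samples from different arrays comparable in a joint clustering.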
Common genes present in all gene chip platforms (Affymetrix® HG-U133A, HG-U133 Plus 2.0, and U95A) were identified using NetAffx™ from Affymetrix (http://www.affymetrix.com/analysis/index.affx) by searching corresponding gene annotations in the platforms. Prior to integration of the selected genes with the analyzed NSCLC data sets into a mixed cell line-clinical NSCLC sample data set, gene expression ratios were normalized and median-centered separately on the different microarrays (7, 8). Hierarchical cluster analysis by average linkage was performed with Cluster v2.11. Kaplan-Meier and log-rank test survival analyses based on the 1170-I or NHBE clusters obtained from hierarchical clustering were performed using the R 2.6.0 statistical package (http://www.r-project.org/). Expression of the group of six selected genes (UBE2C, MCM2, MCM6, FEN1, TPX2, and SFN) was also analyzed in all of the aforementioned cohorts. Hierarchical cluster analysis by average linkage and log-rank statistics were then performed as described above to measure the survival differences among the identified patient clusters.

Quantitative Real-Time PCR

Quantitative real-time PCR was performed using TaqMan® Gene Expression Assay (Applied Biosystems, Foster City, CA) probes for UBE2C and for the housekeeping genes beta-actin (ACTB), ribosomal protein large P0 (RPLP0), glucuronidase beta (GUSB), transferrin receptor (TFRC), 18S, 28S, transactivating factor IID (TFIID), and glyceraldehyde phosphate dehydrogenase (GAPDH). All real-time PCR reactions were carried out in duplicate using TaqMan® Fast Universal PCR Master Mix and run on a 7500 Fast Real-Time PCR System according to the manufacturer’s instructions (Applied Biosystems). Normalization was based on a combination of the four best housekeeping genes among the eight tested.

References

1. Zhang L, Miles MF, Aldape KD. A model of molecular interactions on short oligonucleotide microarrays. Nat Biotechnol 2003;21:818-21.
2. Rhodes DR, Yu J, Shanker K, et al. ONCOMINE: a cancer microarray database and integrated data-mining platform. Neoplasia 2004;6:1-6.
3. Shedden K, Taylor JM, Enkemann SA, et al. Gene expression-based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study. Nat Med 2008;14:822-7.
4. Bhattacharjee A, Richards WG, Staunton J, et al. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc Natl Acad Sci U S A 2001;98:13790-5.
5. Bild AH, Yao G, Chang JT, et al. Oncogenic pathway signatures in human cancers as a guide to targeted therapies. Nature 2006;439:353-7.
6. Raponi M, Zhang Y, Yu J, et al. Gene expression signatures for predicting prognosis of squamous cell and adenocarcinomas of the lung. Cancer Res 2006;66:7466-72.
7. Kaposi-Novak P, Lee JS, Gomez-Quiroz L, Coulouarn C, Factor VM, Thorgeirsson SS. Met-regulated expression signature defines a subset of human hepatocellular carcinomas with poor prognosis and aggressive phenotype. J Clin Invest 2006;116:1582-95.
8. Lee JS, Heo J, Libbrecht L, et al. A novel prognostic subtype of human hepatocellular carcinoma derived from hepatic progenitor cells. Nat Med 2006;12:410-6.

Supplementary figure 1 legend: Increased UBE2C mRNA levels in NSCLC relative to adjacent normal lung tissue.
A. Normalized UBE2C mRNA expression levels in lung adenocarcinomas and SCCs relative to adjacent normal lung. UBE2C normalized expression levels were downloaded from four published microarray data sets available from the public microarray analysis platform Oncomine. The lead author of each published microarray cohort is indicated in each panel. The number of analyzed adenocarcinoma, SCC, or normal lung tissues is indicated below each bar. p-values represent statistical significance assessed by independent two-sided t-tests.
B. UBE2C mRNA levels in 26 adenocarcinomas and 24 adjacent normal lung tissues analyzed by quantitative real-time PCR. The p-value was determined by a two-sided t-test.

Supplementary figure 1 [figure not reproduced; panels A and B, with panel titles including Beer and Su; comparisons annotated p < 0.001]