Research Article

Combined cDNA Array Comparative Genomic Hybridization and Serial Analysis of Gene Expression Analysis of Breast Tumor Progression

Jun Yao,1,4 Stanislawa Weremowicz,3,4 Bin Feng,1,4 Robert C. Gentleman,6 Jeffrey R. Marks,7 Rebecca Gelman,2,5 Cameron Brennan,1 and Kornelia Polyak1,4

Departments of 1Medical Oncology and 2Biostatistical Sciences, Dana-Farber Cancer Institute; 3Department of Pathology, Brigham and Women’s Hospital; 4Harvard Medical School; and 5Harvard School of Public Health, Boston, Massachusetts; 6Program in Computational Biology, Fred Hutchinson Cancer Center, Seattle, Washington; and 7Department of Surgery, Duke University Medical Center, Durham, North Carolina

Abstract

To identify genetic changes involved in the progression of breast carcinoma, we did cDNA array comparative genomic hybridization (CGH) on a panel of breast tumors, including 10 ductal carcinoma in situ (DCIS), 18 invasive breast carcinomas, and two lymph node metastases. We identified 49 minimal commonly amplified regions (MCRs) that included known (1q, 8q24, 11q13, 17q21-q23, and 20q13) and several uncharacterized (12p13 and 16p13) regional copy number gains. With the exception of the 17q21 (ERBB2) amplicon, the overall frequency of copy number alterations was higher in invasive tumors than in DCIS, with several of them present only in invasive cancer. Amplification of candidate loci was confirmed by quantitative PCR in breast carcinomas and cell lines. To identify putative targets of amplicons, we developed a method combining array CGH and serial analysis of gene expression (SAGE) data to correlate copy number and expression levels for each gene within MCRs. Using this approach, we were able to distinguish a few candidate targets from a set of coamplified genes. Analysis of the 12p13-p12 amplicon identified four putative targets: TEL/ETV6, H2AFJ, EPS8, and KRAS2. The amplification of all four candidates was confirmed by quantitative PCR and fluorescence in situ hybridization, but only H2AFJ and EPS8 were overexpressed in breast tumors with 12p13 amplification compared with a panel of normal mammary epithelial cells. These results show the power of combined array CGH and SAGE analysis for the identification of candidate amplicon targets and identify H2AFJ and EPS8 as novel putative oncogenes in breast cancer. (Cancer Res 2006; 66(8): 4065-78)

Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).
C. Brennan is currently at the Neurosurgery Service, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, New York, NY 10021.
Requests for reprints: Kornelia Polyak, D740C, Department of Medical Oncology, Dana-Farber Cancer Institute, 44 Binney Street D740C, Boston, MA 02115. Phone: 617-632-2106; Fax: 617-632-4005; E-mail: [email protected].
©2006 American Association for Cancer Research. doi:10.1158/0008-5472.CAN-05-4083
www.aacrjournals.org. Cancer Res 2006; 66: (8). April 15, 2006. Downloaded from cancerres.aacrjournals.org on September 26, 2021.

Introduction

Gene amplification is one of the mechanisms underlying the activation of oncogenes, and it is often associated with poor prognosis, tumor progression, and acquired drug resistance (1). Therefore, identification of amplified oncogenes has potential diagnostic and therapeutic implications. In breast cancer, gene amplification occurs recurrently on some chromosomal locations, indicating the common activation of some oncogenes during tumor development. The most prominent and frequent amplicons have been reported on 1q, 8p12, 8q24, 11q13, 12q13, 17q21, 17q23, and 20q13, and several candidate targets have been proposed and verified (2). The best characterized breast cancer oncogene is ERBB2, located on 17q21 and amplified in 20% to 30% of breast carcinomas (3). Other oncogenes amplified in breast cancer include MYC (8q24); CCND1, EMS1, and EMSY (11q13); IGF1R (15q26); and STK15, AIB1, and ZNF217 (20q13), whereas PIK3CA is activated in 25% to 40% of breast carcinomas due to oncogenic mutations, although its amplification was also reported in a fraction of breast tumors (4–8). The successful treatment of ERBB2-amplified breast tumors with Herceptin, an inhibitor of ERBB2 activity, is one of the few examples of successful molecular-based therapy in breast cancer (9). Therefore, several large-scale genome resequencing and genomic approaches are aimed at the identification of novel tumor-specific genetic events, amplifications, or mutations, which could be targeted therapeutically.

Complicating the identification of relevant oncogenes is the fact that most amplicons are fairly large, can span several megabases, and in extreme cases involve whole chromosome arms (e.g., chromosome 1q). Thus, amplification of the targeted oncogene is inevitably associated with coamplification of many surrounding genes. For example, amplification of ERBB2 is frequently accompanied by amplification and overexpression of the nearby GRB7, TOP2A, and PIP4K2B genes (10). GRB7 encodes a protein that directly binds to ERBB2 and modulates ERBB2 signaling (11), TOP2A is a topoisomerase involved in DNA repair, and PIP4K2B is a lipid kinase that has been shown to enhance breast cancer cell growth (12). The function of these genes suggests that they may cooperate with ERBB2 and contribute to the malignant phenotype and therefore could also be considered targets of the 17q21 amplicon. Similar complexity is observed for the 20q13, 11q13, and other amplicons, suggesting that genetic selection for the overactivation of a group of genes is a general phenomenon. Thus, despite the fact that many amplified chromosomal regions have been identified, the characterization of their targets remains a difficult task.

Performing comprehensive genome-wide screens on a large set of tumors and analyzing both genetic and gene expression changes in the same tumors may help in resolving this problem because this combined approach facilitates the identification of genes that are both amplified and overexpressed. Using this approach, KCNK9 was found to be the target of a small (550 kb) amplicon at 8q24.3 because it was the only overexpressed gene within that region (13).

Array comparative genomic hybridization (CGH) is a technology suitable to study gene copy number changes on a genome-wide scale at a high resolution (14–16). Currently, three different array CGH platforms are used: bacterial artificial chromosome, cDNA, and long (60 bp) oligo arrays, with each having its own advantages and disadvantages. Previous cDNA array CGH studies of breast tumors and breast cancer cell lines revealed a strong correlation between copy number gain and increased gene expression and led to the identification of several new amplicons and their candidate targets, including HOXB7 at 17q21.3 (17–19). However, the limitations of these studies were the use of a reference cDNA mix for evaluating gene expression changes, the potential probe hybridization bias associated with the use of fairly long cDNA fragments on the arrays, and the restriction of their analysis to advanced-stage tumors.

In this study, we did cDNA array CGH on 30 breast tumors, including 10 preinvasive tumors (DCIS), and five nonmalignant cells purified from normal breast tissue or breast carcinomas, and in parallel, we used serial analysis of gene expression (SAGE) for evaluating gene expression patterns. Using this integrated approach, we confirmed known amplicons and their targets, such as ERBB2 at 17q21 and PAK1/EMSY at 11q13. We further identified many uncharacterized amplicons and their putative targets in both in situ and invasive carcinomas. Based on a targeted screen for the amplification of known oncogenes, KRAS2 in the 12p13-12 amplicon has previously been described as a gene amplified in a subset of breast carcinomas (20) and in one metastatic rectal tumor (21), but this amplicon has not been systematically characterized at high resolution. Our detailed characterization of the 12p13-p12 amplicon identified four putative candidate targets (ETV6, KRAS2, H2AFJ, and EPS8), and subsequent follow-up experiments confirmed H2AFJ and EPS8 as novel candidate oncogenes in breast cancer. Thus, these data show that the integration of SAGE with cDNA array CGH is a powerful approach for the identification of amplified candidate oncogenes.

Materials and Methods

Tissue specimens and cell lines. Tumor specimens were obtained from Brigham and Women's and Massachusetts General hospitals (Boston, MA), Duke University (Durham, NC), University Hospital Zagreb (Zagreb, Croatia), and the National Disease Research Interchange. All tissue was collected using protocols approved by the Dana-Farber Cancer Institute Institutional Review Board. Tissues were snap frozen on dry ice, stored at -80°C until use, or were immediately processed for immunomagnetic purification (22). Breast cancer cell lines were obtained from the American Type Culture Collection (Manassas, VA) or were generously provided by Dr. Steve Ethier (University of Michigan) and Dr. Arthur Pardee (Dana-Farber Cancer Institute). Cells were grown in media recommended by the provider.

cDNA array CGH profiling. Array CGH analysis was done essentially as previously described (23). Briefly, genomic DNA from normal and tumor tissues was fragmented and labeled according to published protocols (footnote 8). Labeled DNAs were hybridized to human cDNA microarrays containing 14,160 cDNA clones (Agilent Technologies, Palo Alto, CA), for which ~9,000 unique map positions were defined (National Center for Biotechnology Information, build 34). The median interval between mapped elements is 100 kb, with 92.8% of intervals <1 Mb and 98.6% <3 Mb. Log 2 ratios were calculated from Cy3/Cy5 fluorescence channels and further normalized by GC % content of the probe's genome region using local regression. These normalized profiles were then processed using Circular Binary Segmentation, a change-point identification technique developed for array CGH, to demarcate genomic segments with statistically uniform copy number (23, 24). Segments are assigned a log 2 ratio that is the median log 2 ratio of the contained probes. The data were then centered as previously described, so that the peak in the distribution of segment values lies at zero (23). Based on the histogram of log 2 ratio distribution among all normal samples and among all tumors, a minimum gain threshold was set to log 2 ratio of ≥0.09 (Supplementary Fig. S1), and the amplification threshold was set to log 2 ratio of 0.5. Similarly, segments with log 2 ratio less than -0.08 and -0.35 are considered to have a chromosomal region of "loss" and "deletion," respectively. The raw and segmented data sets are provided as Supplementary Data Files.

Identification of amplified loci and statistical analysis. Identification of priority loci was done in a similar way as described (23). Based on the histogram of log 2 ratio distribution among all normal samples and among all tumors, a minimum gain threshold was set to log 2 ratio of ≥0.09 (Supplementary Fig. S1), and the amplification threshold was set to log 2 ratio of 0.5. Minimal common regions (MCRs) of chromosome amplification were generated based on overlapping recurrence across samples using the same algorithm as previously described (23). MCRs were further prioritized by the presence of the following features: (a) recurrence of high-fold amplification events in more than one sample, (b) a peak segment value of >0.8 in at least one sample, or (c) statistically significant recurrence of low-level alteration. MCRs with one or more of these features are summarized in Table 1. Recurrence of array CGH gains and losses was compared between DCIS and invasive ductal carcinoma (IDC) sample groups, as well as between estrogen receptor-negative (ER-) and ER+ sample groups. Low-level thresholds, >0.09 and less than -0.08, were used to define gain and loss, respectively. At each probe location, total numbers of samples with gains and losses were counted in each sample group (e.g., DCIS versus IBC and ER+ versus ER- tumors), and significant difference was determined by Fisher's exact test (P < 0.05, not accounting for multiple testing).

Identification of best candidate targets in MCRs using SAGE. Breast cancer SAGE libraries were previously described (22, 25) and are also available online (footnote 9). SAGE libraries were normalized to 100,000 total tags. For each gene in the MCR, normalized SAGE tag numbers are listed for each tumor. For each gene in the MCR, amplification status was defined by the local segmented array CGH value in that sample's profile. Thus, for each gene, samples are divided into three groups: (a) amplified tumors, (b) nonamplified tumors, and (c) normal tissues. For each gene with SAGE data, four P values were generated reflecting the statistical significance of differences in tag numbers (a) between amplified tumors and normal group (P_A/N); (b) among amplified tumor, nonamplified tumor, and normal group (P_A/NA/N); (c) between amplified group and nonamplified plus normal group (P_A/NA,N); and (d) between all tumors (amplified and nonamplified) and normal group (P_A,NA/N). The last two P values (P_A/NA,N and P_A,NA/N) were only calculated if P_A/NA/N was significant (P < 0.05). These tests were done separately for each of three thresholds for defining "amplified" segment values: (a) + level (>0.09), (b) ++ level (>0.2), and (c) +++ level (>0.5) to confirm that results were stable across a range of CNA. Fold overexpression is estimated by dividing mean tag numbers from the tumor samples with amplification by mean tag numbers from the normal group. We used the following criteria for identifying a gene as a candidate target of the amplicon: (a) either P_A/N or P_A/NA/N is <0.05; (b) overexpression must be >2-fold in amplified tumors compared with normal; and (c) among tumors predicted to be amplified, the SAGE tag ratio of (tumor tag) / [max (normal tag)] must reach 0.67, 0.8, and 1.0 for levels of +, ++, and +++ amplification (see above), respectively.

Overall correlation between gene overexpression and amplification. SAGE data from each tumor with array CGH data were compared with those from two normal mammary epithelial cells using a previously described method (22, 26), and tags that satisfied the following two criteria were considered overrepresented in tumors: (a) the difference between the tag numbers in tumor and normal samples is statistically significant (P < 0.05) using the PK algorithm (26), and (b) normalized tumor tag number is at least 2-fold higher than the tag number in either of the two normal samples. Each tag with at least two copies/library in the SAGE libraries was assigned to the best matching gene using an online resource (footnote 9). The total number of overexpressed genes in each sample was estimated based on the number of overexpressed tags with unique best gene match. SAGE analysis was done for all genes located in predicted amplicons to determine the fraction of amplified genes that are also overexpressed, and the subset of overexpressed genes that are amplified. For each sample, the observed number of overexpressed genes within amplified regions is compared with the number expected by chance to give an odds ratio. Odds ratios are not calculated if the expected number of overexpressed genes in amplified regions is <2. For each sample, odds ratios are determined separately for all three levels of amplification, as above: >0.09 (+ level), >0.2 (++), and >0.5 (+++).

Quantitative real-time PCR. Quantitative PCR primers were designed to amplify products of 100 to 150 bp (sequences of primers for the genes analyzed are listed in Supplementary Table S4). Quantitative PCR was done on MJ Research Chromo 4 (Bio-Rad, Hercules, CA). Briefly, PCR reactions were done in a total volume of 25 µL composed of 1× PCR buffer [16.6 mmol/L (NH4)2SO4, 67 mmol/L Tris (pH 8.8), 6.7 mmol/L MgCl2, 10 mmol/L β-mercaptoethanol] containing 2 ng of genomic DNA, 0.5 µmol/L of each primer, 0.5 mmol/L deoxynucleotide triphosphates, 0.5 mg/mL bovine serum albumin, 1 µL 1:1,500 diluted SYBR Green I, and 0.2 µL Platinum Taq (Invitrogen, Carlsbad, CA). The cycling conditions were 10 minutes at 95°C followed by 40 cycles of 15 seconds at 95°C and 1 minute at 58.1°C with a plate read at the end of each cycle. Composition of PCR products was examined by generating melting curves. The relative gene copy number was calculated by the comparative Ct method (27) and normalized to normal human genomic DNA and to a nonamplified gene (PVR on chromosome 19q13) from the same sample. Based on array CGH data, PVR gene copy number had no particular changes in all the samples. Quantitative PCR using cDNA templates was similarly done using RPL39 (ribosomal protein L39) as control for normalization.

Fluorescence in situ hybridization. Bacterial artificial chromosome clones flanking or containing TEL/ETV6, KRAS2 (RP11-37P8), H2AFJ (RP-911J12), or EPS8 (RP-878D15) were obtained from Invitrogen/Research Genetics (Carlsbad, CA). The bacterial artificial chromosomes in the TEL/ETV6 probe are RP11-144O23 (AC006518) and RP11-267J23 (AC007537). The CEP4, CEP12 (D12Z3), and TEL/AML1 probes were obtained from Vysis, Inc. (Downers Grove, IL). The TEL/AML1 mix contains a TEL (spectrum green) probe at 12p13 that begins between exons 3 and 5 of TEL and extends ~350 kb towards the telomere of 12p and an AML1 (spectrum red) probe at 21q22 spanning the entire AML1 gene. Touch preparations from the frozen tissues were prepared as follows: tissue was cut with a razor blade, and the fresh cut surface was touched gently against the surface of a clean glass slide in several places and air-dried. Cells were fixed in cold 70% ethanol at 4°C for 2 hours, dehydrated in ascending ethanol series, and air-dried. After probe application, both tissue and probe DNAs were denatured simultaneously at 80°C for 2 minutes. Hybridization was carried out overnight at 37°C. Hybridizations of metaphase chromosomes obtained from the HCC1937 and ZR75-1 cell lines were done according to the method described (28). Metaphase chromosomes and interphase nuclei were stained with 4′,6-diamidino-2-phenylindole.

8 http://genomic.dfci.harvard.edu/array_cgh.htm.
9 http://cgap.nci.nih.gov/SAGE.

Results and Discussion

Array CGH analysis. cDNA array CGH was used to analyze copy number changes in 30 breast tumors (10 DCIS, 18 IDCs, and two lymph node metastases) along with five nonmalignant cells purified from normal breast tissue or breast carcinomas. The array used in this study (Agilent Human 1 clone set) covers >9,000 unique map positions with a median interval of about 100 kb between mapped elements. Typical array CGH profiles after normalization and Circular Binary Segmentation (see Materials and Methods for details) are depicted in Fig. 1A. Overall, individual cDNA log 2 ratios are scattered with the majority of log 2 ratios ranging from -0.5 to 0.5 even when using normal DNA. However, segmented array CGH data of nonmalignant samples clearly showed lack of statistically significant copy number gains and losses, whereas that of tumor samples identified multiple genetic alterations (Fig. 1A). To identify genomic areas that are recurrently amplified or deleted, we generated plots summarizing copy number alterations in all tumors and in separate DCIS and IDC groups (Fig. 1B). This overview shows that low-fold copy number gains and losses (segment values >0.09 or less than -0.08) affect nearly all chromosomes, whereas most high-fold amplifications and deletions (>0.5 or less than -0.35) correspond to previously identified regions (1q21, 8q24, 11q13, 12q13, 15q26, 17q21, 20q13 and 1p32, 11q11-12, 13q, 16q24, and 17p13, respectively). Increased log 2 ratio for chromosome X was observed in all samples due to the use of male genomic DNA as reference, whereas all breast tissue samples were obtained from females. The highest level amplification (>50-fold, calculated based on array CGH log 2 ratio; refs. 17, 18) was found at 15q26.3, presumably targeting IGF1R (29); but because this amplification was present in only one tumor (IDC-B17), it is not prominent in the recurrence chart (Fig. 1B), whereas 17q21 harboring ERBB2 clearly stands out. We also identified areas of amplifications, such as 12p13 and 16p13, that have not been characterized in detail and may harbor novel breast cancer oncogenes (Fig. 1B). The segmentation algorithm (Circular Binary Segmentation; ref. 24) effectively disregards single cDNA probes possessing aberrant high CGH log 2 ratios but identifies sets of adjacent probes with an altered average log 2 ratio. This segmental filtration can only detect amplicons that are composed of more than two genes. However, we cannot exclude the possibility that single highly aberrant log 2 ratios could represent very small focal amplifications not captured in Fig. 1B.

In addition to the combined analysis of all 30 tumors, we also analyzed the 10 DCIS and 18 invasive tumors as separate groups with the aim of identifying genetic alterations potentially involved in the in situ to invasive carcinoma transition. Copy number changes were readily detected by array CGH in DCIS and in some cases were extremely prevalent across the whole genome (e.g., in DCIS5; Fig. 1C), correlating with prior studies describing a high degree of genomic instability at this early stage of breast tumorigenesis (30). However, with the exception of 1q and 17q21/ERBB2 amplicons, an overall trend toward an increase in the number and amplitude of gains and losses from DCIS to IDC was observed (Fig. 1B, middle and lower). Unsupervised clustering of filtered raw data (log 2 ratio above 0.09), including all amplified genes with at least one sample having a log 2 ratio above 0.5, did not identify clear DCIS and IDC clusters nor did the tumors cluster according to grade or ER status (Fig. 1C). However, statistical analyses determined that gain of 5q, chromosome 7, 11q, 16p, and 20p was statistically significantly (P < 0.05, not corrected for multihypothesis testing) more likely to be detected in IDC than in DCIS, whereas loss of chromosome 9 preferentially occurred in DCIS (Fig. 1B). Similarly, ER+ invasive tumors were more likely to have 1q and 11q gain than ER- ones, whereas we did not detect any statistically significant association between a specific amplification event and tumor grade, potentially because the majority of our tumors (22 of 30) were high grade (Fig. 1C).

Identification of MCRs of amplifications. MCRs of amplifications from the predicted copy number gains were identified using a recently described algorithm (23). The 49 highest ranked MCRs and their known and candidate targets are listed in Table 1, and examples of corresponding array CGH profiles with the MCRs indicated are depicted in Fig. 1D and E. Among these highest ranked MCRs are many known amplicons frequently (10-30% of tumors) amplified in breast cancer, including 17q21 (ERBB2), 8q24 (MYC), 11q13 (GARP and EMSY), 8p12 (FGFR1), and 20q13 (BCAS1 and ZNF217). We also detected less frequent amplicons, including 5q35 (FGFR4) and 15q26 (IGF1R), that are amplified in 3% to 5% of tumors. Although the majority of MCRs are fairly large (0.5-10 Mb), a few of them are small (<500 kb) and contain a limited number (5–10) of genes representing attractive regions for further study.

To validate our array CGH results, we did quantitative real-time PCR analysis of selected candidate genes in primary breast tumors and breast cancer cell lines (Table 2). Amplification of most candidates was confirmed, although the fold amplification determined by array CGH and quantitative real-time PCR was not always in perfect agreement.

Overall correlation between gene amplification and overexpression. We also analyzed the overall contribution of copy number gain to gene expression changes by calculating the percentage of overexpressed genes that are located in amplicons and the percentage of genes present in amplicons that are overexpressed within each tumor. Among the 30 tumor samples with array CGH profiling data, we had SAGE libraries for 14 of them. In addition, we had SAGE libraries from two different cases of normal luminal mammary epithelial cells to be used for comparisons. Odds ratios for overexpressed genes within amplicons were determined for each sample (see Materials and Methods and Supplementary Table S1). For amplifications defined by segment values >0.09 (+ level), the mean odds ratio among all samples is 1.85 (0.52-2.96). In other words, genes in low-level amplified regions are

Figure 1. cDNA array CGH profiling of breast tumors. A, representative array CGH profiles of a normal and a tumor sample. Black dots represent raw log 2 ratios, and the red lines represent data after segmentation. B, recurrence of chromosomal alterations. Integer value recurrence of copy number alterations in segmented data (y axis) is plotted for each probe aligned along the x axis in chromosome order. Dark red or green bars denote gain or loss of chromosome material, respectively; bright red or green bars represent probes within regions of higher-level amplification or deletion, respectively (see Materials and Methods). Blue stars mark copy number alterations that are statistically significantly different between DCIS and IDC. C, clustering of normal and tumor samples based on copy number gain. "Gained" regions with corresponding segmented log 2 ratio of ≥0.09 were identified, and raw data from these regions were used along with a filter of at least one sample having a log 2 ratio of ≥0.5. Red and green signal represents genes with copy number gains and losses, respectively; black areas correspond to clones that were removed from the sample using the "Gain" filter. Depicted chromosome length is proportionate to the number of cDNA probes contained in each region. Normal samples (green) cluster together and are devoid of statistically significant chromosomal changes, whereas DCIS (blue), invasive (red), and metastatic (black) tumors do not form distinct clusters according to tumor stage. Colored rectangles indicate tumor grade (red, high; purple, intermediate; blue, low grade), lymph node (LN), ER, and HER2 status (red, positive; blue, negative; gray, unknown). D, identification of MCRs at 8q21 and 8q24. Raw array CGH log 2 ratios and segmented data of tumors IDC-C7, IDC-C2, and DCIS5. MCRs are marked with green lines, and their potential targets WWP1 and CMYC/PVT1 are indicated. E, identification of amplicons and MCRs at 12p13. Normalized array CGH log 2 ratios and segmented data of tumors IDC1, LN1, and IDC-C6. MCRs are marked with green lines, and their potential targets TEL/ETV6, H2AFJ, EPS8, and KRAS2 are indicated.

Table 1. List of MCRs of amplifications

Chr  Start position  Start gene  End position  End gene    Band             Size (Mb)  No. genes  Candidates and known targets
1    801450          FLJ22639    802851        LOC284591   1p36.33          0.00       2
1    148592826       RORC        150329171     S100A4      1q21             1.74       37
2    166880319       SCN9A       169484614     NOSTRIN     2q24             2.60       7
3    42800799        HIG1        44462437      ZNF445      3p22-p21         1.66       4
5    142639325       NR3C1       145807087     TCERG1      5q31             3.17       10
5    175319142       THOC3       176717480     RGS14       5q35.2-35.3      1.40       24         NSD1
6    26195427        HFE         29631375      UBD         6q21.3           3.44       79
6    138230274       TNFAIP3     146906521     RAB32       6q23-q24.3       8.68       33         C6orf115
7    30182906        0           54860934      EGFR        7p15.1-p12       24.68      111        KIAA0241, ELMO1, UCC1, GLI3
7    109897061       IMMP2L      115173977     TFEC        7q31             5.28       13
8    28681099        FLJ10871    30555575      GTF2E2      8p21.1-p12       1.87       10
8    33568398        MGC1136     39427722      ADAM3A      8p12             5.86       30         FGFR1
8    59628282        SDCBP       62578374      ASPH        8q12             2.95       8
8    82515121        PMP2        89122152      MMP16       8q21-q22         6.61       25         WWP1
8    124329884       ZHX1        131133535     DDEF1       8q24.1-q24.2     6.80       27         KIAA0196, MYC
9    125076688       HSPA5       127106700     GARNL3      9q33-q34.1       2.03       9
10   126075862       OAT         126480382     KIAA0157    10q26            0.40       5          KIAA0140, KIAA0157
11   74788222        RPS3        77489641      ALG8        11q13.3-q13.5    2.70       31         WNT11, E2IG4, CLNS1A, PTD015, GARP, EMSY
12   9896236         CLECSF2     15926613      STRAP       12p13.3-p12.3    6.03       92         H2AFJ, EPS8
12   24855562        BCAT1       32151452      BICD1       12p12.1-p11.21   7.30       54         TEL/ETV6, KRAS2
12   61324033        PPM1H       64504458      HMGA2       12q14-q15        3.18       18
12   66329021        DYRK2       75248048      OSBPL8      12q15            8.92       44         NUP107, CPSF6, FRS2, CCT2, MGC23401, KCNC2
15   94674950        NR2F2       99661657      PCSK6       15q26            4.99       26         IGF1R
16   1246283         TPSD1       4931075       KIAA0420    16p13.3          3.68       128        BAIAP3, CLCN7, KIAA0683, MAPK8IP3, NUBP2, NDUFB10, RAB26, LOC114984, PAQR4, HCFC1R1, FLJ14154, Magmas, C16orf5
16   6009133         A2BP1       19087139      LOC51760    16p13-p12        13.08      59
16   22264760        CDR2        23499838      NDUFAB1     16p12.3-p12.1    1.24       9
16   31408316        FLJ13868    31792664      ZNF267      16p11.2          0.38       6
17   7416868         EIF4A1      7435272       FXR2        17p13            0.02       5
17   22645233        WSB1        23718425      VTN         17q11            1.07       15
17   23924266        ALDOC       24977091      SSH2        17q11            1.05       30         SPAG5, SDF2, SUPT6H
17   27682505        NJMU-R1     28364218      ACCN1       17q11            0.68       7
17   33120548        TCF2        34143676      RNF110      17q12            1.02       10         MLLT6
17   34816380        PPARBP      35502567      NR1D1       17q21            0.69       21         PERLD1, ERBB2
17   35853236        IGFBP4      36285721      KRT20       17q21            0.43       13         CCR7
17   37562439        KCNH4       38088158      CNTNAP1     17q21            0.53       19
17   38256727        AOC3        38916860      DHX8        17q21            0.66       15
17   41327624        MAPT        41805904      NSF         17q21            0.48       4
17   44325162        ATP5G1      45858592      FLJ20920    17q21.32         1.53       33
17   52517620        AKAP1       53952615      PNUTL2      17q23            1.43       19         MSI2
17   53952615        PNUTL2      54188231      PPM1E       17p23            0.24       5
17   57910136        TLK2        59754145      PECAM1      17q23            1.84       28         LOC51204
17   71128808        MYO15B      71284112      H3F3B       17q25.1          0.16       6
20   4149816         ADRA1D      5043599       PCNA        20p13-p12        0.89       8
20   19141290        SLC24A3     20296765      INSM1       20p11            1.16       6
20   24397835        C20orf39    25176706      PYGB        20p11.2          0.78       8          ACAS2L
20   43004186        TOMM34      59983250      TAF4        20q13.1-q13.3    16.98      130        PPGB, SULF2, ADNP, DPM1, STX16, NPEPL1, NCOA3, BCAS4, ZNF217, BCAS1, CYP24A1
20   61630222        PTK6        62181932      OPRL1       20q13.3          0.55       25         UCKL1, TCEA2
21   36429337        CBR3        46879955      HRMT1L1     21q22.2-q22.3    10.45      127        DYRK1A, DSCR8, HMGN1, MX1, TFF3, TSGA2, ICOSL, ADARB1, POFUT2, C21orf56, LSS, PCNT2
22   48487799        BRD1        49353600      ARSA        22q13.33         0.87       28         ALG12, TUBGCP6, MAPK12, SBF1, ECGF1, MGC16635, MAPK8IP2

NOTE: The positions of genes are based on NCBI build 34. Candidates from statistical analysis of SAGE data are listed along with previously reported amplification targets (in bold).
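The per-sample comparison described under "Overall correlation between gene overexpression and amplification" can be illustrated with a short Python sketch. It is not the authors' code: the function names are hypothetical, and the expected count assumes that amplification and overexpression are independent (expected = amplified genes × overexpressed genes / all genes), which the text does not spell out. The three thresholds (+, ++, +++) and the rule of skipping samples with fewer than two expected genes are taken from Materials and Methods.

```python
from typing import Dict, List, Optional

# Amplification levels quoted in Materials and Methods:
# a gene counts as amplified when its segment log2 ratio exceeds the cutoff.
LEVELS: Dict[str, float] = {"+": 0.09, "++": 0.2, "+++": 0.5}

# Ratios are not reported when fewer than two overexpressed genes are expected.
MIN_EXPECTED = 2.0


def enrichment_ratio(segment_values: List[float],
                     overexpressed: List[bool],
                     threshold: float) -> Optional[float]:
    """Observed / expected number of overexpressed genes in amplified regions.

    Assumption (not stated in the article): "expected by chance" means that
    amplification and overexpression are independent, so
    expected = n_amplified * n_overexpressed / n_genes.
    Returns None when the expected count is below MIN_EXPECTED.
    """
    if len(segment_values) != len(overexpressed):
        raise ValueError("one segment value and one flag per gene are required")
    n_genes = len(segment_values)
    if n_genes == 0:
        return None
    amplified = [value > threshold for value in segment_values]
    n_amplified = sum(amplified)
    n_overexpressed = sum(overexpressed)
    observed = sum(1 for a, o in zip(amplified, overexpressed) if a and o)
    expected = n_amplified * n_overexpressed / n_genes
    if expected < MIN_EXPECTED:
        return None
    return observed / expected


def ratios_by_level(segment_values: List[float],
                    overexpressed: List[bool]) -> Dict[str, Optional[float]]:
    """Apply enrichment_ratio once per amplification level (+, ++, +++)."""
    return {name: enrichment_ratio(segment_values, overexpressed, cutoff)
            for name, cutoff in LEVELS.items()}


if __name__ == "__main__":
    # Toy sample: 100 genes, 20 in a segment at log2 ratio 0.3, 20 overexpressed,
    # 8 of which fall inside the amplified segment.
    values = [0.3] * 20 + [0.0] * 80
    flags = [True] * 8 + [False] * 12 + [True] * 12 + [False] * 68
    print(ratios_by_level(values, flags))
```

In the toy sample the segment at 0.3 qualifies at the + and ++ levels but not at +++, so the last level has no amplified genes and no ratio is reported for it.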

Downloaded from cancerres.aacrjournals.org on September 26, 2021. © 2006 American Association for Cancer Research. Cancer Research

Table 2. Quantitative PCR validation of selected genes from selected MCRs in primary breast tumors (A) and breast cancer cell lines (B)

A.

Chromosome Gene Band Sample CGH, log 2 ratio Quantitative PCR, AmpPredict copy no.

2 VAMP8 2p12 DCIS5 1.01 1.2 N IDC1 1.32 1 N LN1 0.94 1.1 N IDC-C47 0.95 0.8 N 8 TOX 8q12 IDC-C10 0.906 3.2 Y IDC-C22 0.858 2.2 Y IDC-C7 1.33 1.9 Y DCIS5 1.168 1.6 N 8 KIAA0196 8q24 DCIS5 2.292 5.8 Y IDC-C6 1.05 5.4 Y IDC-C10 1.4433 .4 Y DCIS-D9728 0.75 1 N 12 ETV6 12p13IDC-C6 0.87 2.8 Y IDC1 1.76 3.7 Y LN1 2.17 4.9 Y 12 KRAS2 12p12 IDC-C6 -0.31 N IDC1 1.39 3.6 Y LN1 1.834 Y 12 RAP1B 12q14 DCIS5 2.3ND Y IDC-C4 2.31 1.3 N IDC-C47 4.07 0.7 N 17 MLLT6 17q12 LN1 1.465 7.4 Y IDC1 1.1534.5 Y I-EPI-7 2.329 1.1 N 17 NGFR 17q21 IDC-C2 3.271 11 Y IDC-C32.3 74 5 Y D-EPI-30.98 3.4 Y DCIS4 0.9431 N 20 CEBPB 20q13IDC-B17 1.974 1.7 Y IDC-C7 0.852 ND Y DCIS5 0.782 2.5 Y D-EPI-7 1.032 7 Y 21 COL6A1 21q22 DCIS5 1.986 6.5 Y DCIS4 0.7631.3N 21 AIRE 21q22 DCIS5 1.05 2.9 Y DCIS4 0.557 1.1 N 22 HDAC10 22q13DCIS4 1.56 5.5 Y IDC-C22 0.982 1.3N 22 MLC1 22q13DCIS4 1.06 5.1 Y IDC5 0.834 1.5 N B.

Cell line    8q13      8q24      8q24      20q13     12p13     12p13     12p11     21q22     22q13     22q13
             BIG1      KIAA0196  ZHX1      CEBPB     ETV6      KRAS2     SURB7     COL6A1    HDAC10    MLC1
21MT1        2.1       2.7       3.6       1.4       1.4       1.5       2.3       0.7       1.2       1.6
21NT         2.4       2.0       2.2       0.7       0.8       1.3       1.1       0.3       1.2       1.3
BT-20        1.5       4.8       3.7       4.8       1.1       1.8       1.6       0.8       0.9       1.5
BT-474       1.3       1.1       1.9       4.4       0.8       1.4       1.2       0.7       0.7       1.3
BT-549       1.9       1.6       2.1       0.9       0.7       1.0       1.0       0.9       0.7       0.8
HCC1937      3.1       4.3       5.1       1.5       5.7       3.4       3.8       1.0       1.3       1.2
Hs578T       1.6       1.5       1.2       1.7       0.9       1.1       1.3       0.6       0.9       2.4
MCF10DCIS    0.7       1.5       1.3       0.5       0.9       1.3       0.9       0.6       1.6       1.2

MCF-7        1.1       3.9       3.1       2.0       0.9       1.3       1.1       1.7       0.8       0.6
MDA-MB-157   1.3       2.0       1.7       6.0       0.2       0.3       0.4       2.3       0.6       0.6
MDA-MB-231   0.7       0.3       1.6       0.7       0.8       0.8       0.7       0.7       0.7       0.8
MDA-MB-435S  1.3       1.8       1.6       1.4       0.6       0.8       0.9       1.3       1.4       1.5
MDA-MB-468   1.0       1.1       1.4       0.7       0.9       0.9       1.0       1.4       1.1       1.4
SK-BR-3      1.0       2.9       8.5       1.4       0.7       0.5       0.7       1.0       0.8       2.3
SUM-44       1.8       1.8       1.6       0.5       1.0       0.8       0.9       0.9       0.7       0.6
SUM-52       0.5       1.5       1.6       0.8       1.2       1.0       1.4       0.6       0.5       0.3
SUM-102      1.4       1.2       1.1       0.4       1.4       1.1       1.8       0.5       0.7       1.5
SUM-1315     1.2       1.2       1.3       0.4       0.6       0.5       0.7       0.7       0.9       1.8
SUM-149      1.1       1.5       1.8       1.0       0.8       1.7       1.9       0.9       1.8       2.1
SUM-159      1.1       0.7       0.2       0.9       0.9       0.9       1.1       0.8       1.0       1.5
SUM-185      0.8       0.8       1.0       0.6       0.7       0.6       0.9       1.0       1.0       1.0
SUM-190      2.3       1.5       1.4       0.7       1.2       1.1       1.3       1.4       1.8       1.7
SUM-225      0.9       3.2       3.3       1.3       1.0       1.1       1.4       1.8       9.9       13.6
SUM-229      1.4       1.8       1.8       1.2       1.0       0.6       0.7       1.3       1.1       0.8
T47-D        1.2       1.6       1.5       0.9       0.3       1.0       0.9       0.6       0.5       0.4
UACC-812     1.0       1.8       1.4       1.1       0.8       0.7       0.8       1.4       0.7       0.7
UACC-893     1.6       2.9       2.6       0.3       1.5       1.1       1.7       0.4       0.4       0.4
ZR-75-1      1.2       1.5       1.6       4.2       2.7       1.5       2.1       1.1       0.7       0.7

NOTE: Chromosomal location, genes, tumor or cell line name, and copy numbers predicted based on array CGH (A) and quantitative PCR are listed. The ‘‘AmpPredict’’ column denotes whether the gene is predicted to be amplified based on the segmentation algorithm.
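
The odds ratios reported in the text around this table relate overexpression to amplification within each tumor. The calculation can be sketched in Python as follows, assuming the standard 2×2 definition of an odds ratio. The function name and the gene counts are illustrative and do not come from the article, whose exact procedure is given in its Materials and Methods.

```python
def odds_ratio(amp_over, amp_not_over, nonamp_over, nonamp_not_over):
    """Odds ratio for overexpression in amplified vs. nonamplified genes.

    Assumes the standard 2x2 definition (a*d)/(b*c); the article's exact
    procedure is described in its Materials and Methods, not here.
    """
    if amp_not_over == 0 or nonamp_over == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (amp_over * nonamp_not_over) / (amp_not_over * nonamp_over)


# Hypothetical gene counts for one tumor (illustrative only).
example = odds_ratio(amp_over=20, amp_not_over=80,
                     nonamp_over=10, nonamp_not_over=90)
print(f"odds ratio = {example:.2f}")
```

Under this definition, a value near 2 means that the odds of overexpression for a gene inside an amplified region are about twice those for a gene outside one.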

nearly twice as likely to be overexpressed as genes in nonamplified areas. Raising the amplification threshold to 0.2 (++) and 0.5 (+++) levels leads to a stronger association, with odds ratios of 2.14 (0.27-4.47) and 3.97 (1.49-10.65), respectively, supporting that higher levels of amplification lead to greater overexpression of contained genes. Contrary to previous studies reporting high overall association between gene expression and amplification (18, 19), we found that the correlation between gene amplification and overexpression is highly variable among tumors, suggesting different mechanisms of gene activation depending on tumor subtype (Supplementary Table S1). The difference between our results and those of previous studies could be due to the use of different platforms for expression profiling, setting more stringent criteria for "overexpression," using purified normal mammary epithelial cells as reference, and the fact that SAGE tag numbers predict absolute mRNA copy numbers, whereas cDNA array data reflect relative mRNA levels.

Identification of candidate targets of MCRs. To identify candidate targets of amplicons based on gene expression patterns, we tested the genes in each MCR based on SAGE data obtained from the same samples by comparing to those from three normal controls: two from purified normal mammary epithelial cells and one from normal breast organoid (details of the statistical analysis are described in Materials and Methods). Using this combinatorial approach, in many cases we were able to narrow down the number of candidate targets to a few genes. A complete gene list and statistical analysis for each MCR is available as a Supplementary Data File. It is noteworthy that previous gene expression profiling and array CGH studies did not use normal mammary epithelial cells for reference (18, 19). Thus, we believe our analysis was improved and more reliable in detecting overexpressed oncogenes. On the other hand, a limitation of the SAGE method is that at the usual sequencing depth (∼50,000 tags per library), it is not able to identify targets with low overall expression levels. In addition, regardless of the method used, the magnitude of gene dosage effect is not a necessary predictor of functional relevance in tumorigenesis.

To validate our approach, we first applied our statistical analysis to fairly well characterized amplicons, including 17q21, 11q13, and 8q24. The 17q21/ERBB2 MCR contains 21 genes, and among the 14 tumors with SAGE data, eight showed amplification of this 17q21 MCR. Using our method, we predicted that the strongest candidate of this MCR was ERBB2 based on the P values obtained in the four different statistical tests and the fold overexpression (Supplementary Table S2, top). However, in addition to ERBB2, the neighboring gene PERLD1 was also identified as a potential candidate target. Correlating with our data, a recent detailed characterization of the ERBB2 amplicon at the copy number [fluorescence in situ hybridization (FISH) on tissue microarrays] and gene expression (real-time PCR) levels also found that ERBB2, PNMT, and PERLD1 (MGC9753) show the best correlation between amplification and overexpression (31). The same analysis done in the 11q13.3-5 MCR identified E2IG4 as the strongest candidate target, immediately adjacent to EMSY, a recently proposed target of this amplicon (refs. 32, 33; Supplementary Table S2, bottom). Interestingly, E2IG4 was previously identified as a


Table 3. Identification of candidate genes in the 12p13 amplicon using SAGE data

Chromosome 12 position, Gene, SAGE: N-EPI-1 N-EPI-2 N-ORG-1 DCIS1 D-EPI-3 DCIS4 DCIS5 D-EPI-6 D-EPI-7 DCIS8 IDC1
Amplification (DCIS1 to IDC1): − − − + − − − +++

9896236 CLECSF2 009 0000 0000 9992003 FLJ46363 000 0000 0000 10015281 MICL 000 0000 0000 10036942 CLEC2 000 0000 0000 10114349 CLEC1 000 0000 0000 10160649 CLECSF12 000 0000 0000 10202169 OLR1 000 0000 0000 10222898 FLJ31166 000 0000 0000 10351684 KLRD1 000 0000 0000 10416221 KLRK1 000 0000 0000 10451250 KLRC4 000 0000 0000 10633039 KLRA1 000 0000 0000 10648061 FLJ10292 17 0 0 0 0 0 0 0000 10662805 STYK1 000 0000 0000 10845480 TAS2R7 000 0000 0000 10853003 TAS2R9 000 0000 0000 10869212 TAS2R10 000 00000000 10889749 PRR4 000 0000 0000 10952253 TAS2R13 000 0000 0000 10982120 TAS2R14 000 0000 0000 11040812 TAS2R49 000 0000 0000 11065538 TAS2R48 000 0000 0000 11310125 PRB3 000 0000 0000 11396024 PRB1 000 0000 0000 11694055 ETV6 000 0000 0000 12115145 BCL2L14 000 0000 0000 12401322 LOH12CR1 000 0000 0000 12520098 DUSP16 000 0000 0000 12705263 GPR19 000 00000000 12761576 CDKN1B 11 0 0 130 0 0 89014 12829882 DKFZP434F0318 001400012 8000 12935620 RAI3 39164513 044 020 39 0 42 12984976 GPRC5D 000 0000 0000 13019071 HEBP1 000 00019 00160 13044635 KIAA1467 000 00012 0600 13127761 GSG1 000 0000 0000 13415290 FLJ33810 000 0000 0000 13605411 GRIN2B 000 0000 0000 14656843 GUCY2C 000 0000 0000 14818562 H2AFJ 90 48 18 208 26 51 168 87 233 65 885 14847773 MGC47869 000 0000 0000 14848851 LOC440087 000 0000 0000 14926095 MGP 344 177 14 13 333 163 56 630 487 28 14986232 ARHGDIB 000130170 0000 15017245 PDE6H 000 00000000 15046034 LOC440088 000 0000 0000 15151985 RERG 000 0000 0000 15366754 PTPRO 000 0000 0000 15664365 EPS8 23 0 50 85 0 34 37 81816162 15926613 STRAP 34 16 27 13 0 30 0 28 45 43 56 24855562 BCAT1 000 0000 0000 25037625 LOC196415 000 0000 0000 25152490 CASC1 000 0000 0000 25249447 KRAS2 00140000 001142 25520283 FLJ36004 000 0000 0000


Table 3. Identification of candidate genes in the 12p13 amplicon using SAGE data (Cont’d)

SAGE: IDC3 IDC5 IDC6 I-EPI-7 LN1 LN2
Amplification (IDC3 to LN2): − − − − +++ −
Tag
Level +: P(A/NA/N) P(A/NA,N) P(A,NA/N) P(A/N) Fold
Level +++: P(A/NA/N) P(A/NA,N) P(A,NA/N) P(A/N) Fold

00 0 0 0 0 AGAGGGAGTG NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 AAAATTTCAC NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 CATTTATTAC NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 CCTCGGAAAT NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 CCGTTTCCCA NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 TGCTGATTTG NA NA NA NA 0 NA NA NA NA 0 09 0 0 0 0 CATACTACAA 0.819 NA NA NA 0 0.864 NA NA NA 0 00 0 0 0 0 TTCCCATTTA NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 TCACTATGCC NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 TTGTATAAAT NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 CTTCTATAAA NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 CAGAAGAAAG NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 GGTTGGACAG NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 TATTGTTCAT NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 CATCTCTAAA NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 AGGGCCATAA NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 TTTTACTGTG NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 ACATTGAAAT NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 CTTTGTGAGA NA NA NA NA 0 NA NA NA NA 0 15 0 0 0 0 0 TGTGTATGTA 0.819 NA NA NA 0 0.864 NA NA NA 0 00 0 0 0 0 GCAAAGGATC NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 TATCCTTCAT NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 ACATTGGAAG NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 ACATTGGAAA NA NA NA NA 0 NA NA NA NA 0 09 0 0 0 0 GTGTTTTTGT 0.819 NA NA NA 0 0.864 NA NA NA 0 00 0 0 0 0 TGTTTCCACT NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 AAAATCTGAC NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 GTGGCATCTG NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 GCCAAAACTA NA NA NA NA 0 NA NA NA NA 0 00 0320 0 TTTTGTGCAT NA NA NA NA 1 0.907 NA NA 0.945 2 00 0 0 0 0 TCAAGCAATC 0.399 NA NA 0.423 1 0.732 NA NA NA 0 26 0 0 72 41 47 GTGGTGGCAG 0.215 NA NA 0.441 1 0.397 NA NA 0.452 1 00 0 0 0 0 GGTTTTCCTG NA NA NA NA 0 NA NA NA NA 0 00 270 0 0 TTCCATATAC 0.718 NA NA 0.4236 0.599 NA NA NA 0 70 0 0 0 13CTCCTTTCTT 0.681 NA NA 0.423 4 0.478 NA NA NA 0 00 0 0 0 0 TGCTTAAGCC NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 GCAGGTTGTG NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 TCCCTGGACG NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 AATCAGATGT NA NA NA 
NA 0 NA NA NA NA 0 90 147 106 29 1011 375 GAGGGCCGGT 0.005 0.129 0.062 0.103 13 0 0 0.062 0.006 18 00 0 0 0 9 TGCTATGTTA 0.819 NA NA NA 0 0.864 NA NA NA 0 00 0 0 0 0 AAAGACTTTA NA NA NA NA 0 NA NA NA NA 0 67 142 372 125 53 34 GTTTATGGAT NA NA NA NA 0 NA NA NA NA 0 034 0 140 0 CTGGCCCGAG 0.366 NA NA NA 0 0.486 NA NA NA 0 00 0 0 0 0 AGCTCGCTCA NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 AACGATTGGG NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 ACTTATTTTG NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 GATATACAAC NA NA NA NA 0 NA NA NA NA 0 730530272 17 AGTCAGCTGG 0.008 0.113 0.355 0.093 6 0.001 0.077 0.355 0.061 9 19 17 0 36 278 30 ATAAAGTAAC 0.365 NA NA 0.584 4 0.012 0.327 0.888 0.355 6 00 0 0 0 0 GAATAATTGT NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 CAGATAATCC NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 GTGAAAGACA 0.819 NA NA NA 0 0.864 NA NA NA 0 00 0 771 0 AACTGTACTA 0.014 0.217 0.065 0.19 8 0 0.066 0.065 0.082 12 00 0 0 0 0 AGAGGGTGAA NA NA NA NA 0 NA NA NA NA 0


Table 3. Identification of candidate genes in the 12p13 amplicon using SAGE data (Cont’d)

Chromosome 12 position, Gene, SAGE: N-EPI-1 N-EPI-2 N-ORG-1 DCIS1 D-EPI-3 DCIS4 DCIS5 D-EPI-6 D-EPI-7 DCIS8 IDC1
Amplification (DCIS1 to IDC1): − − − + − − − +++

26003229 C12orf2 00 00 000 0000 26164228 BHLHB3 00 0013012 0000 26239789 SSPN 00 140 000 0000 26381609 ITPR2 00 00 000 0000 26949385 FLJ10637 00 00 000 0000 26982583 FGFR1OP2 00 00 000 0000 27017390 TM7SF3 00 00 0012 8000 27066750 SURB7 00 00 000 0000 27135702 LOC440091 00 00 000 0000 27288373 STK38L 00 00 000 0000 27377255 ARNTL2 00 00 000 0000 27568312 PPFIBP1 00 001300 20 9 0 0 27740695 REP15 00 00 000 0000 27754996 MRPS35 00 00 000 81500 28002284 PTHLH 00 00 000 0000 28014639 LOC440092 11 0 0 0 0 0 0 0000 28301400 FLJ11088 00 00 000 0000 29267865 MLSTD1 00 001300 0000 29385227 PTX1 00 00 000 0000 29477951 OVCH1 00 00 000 0000 29550182 ARG99 00 00 000 0000 30675001 IPO8 00 00 000 0000 30753755 C1QDC1 00 00 000 0000 31118077 DDX11 00 00 000 0000 31324785 C12orf14 11 0 14 0 130 0 8600 31428985 MGC24039 00 00 000 0000 31703388 MGC50559 00 00 000 0000 31715340 LOC196394 00 00 000 0000 32029259 FLJ10652 00 00 000 00110 32151452 BICD1 00 00 000 0000

NOTE: Normalized SAGE data (tags per 100,000) from 14 breast tumors, two normal mammary epithelial cells (N-EPI-1 and N-EPI-2), and one normal organoid (N-ORG-1) were analyzed to identify statistically significant differences in gene expression among amplified, nonamplified, and normal samples. Four statistical tests were done to calculate P values for difference between amplified and normal (A/N); among amplified (bold), nonamplified, and normal (A/NA/N); between amplified and nonamplified plus normal (A/NA,N); and between tumors and normal (A,NA/N) tissues. Amplification status of tumors was predicted by segmented CGH values (−, +, ++, +++ for log 2 ratio <0.09, 0.09, 0.2, and 0.5, respectively). Statistical analyses were done at different levels in which tumors having only +, ++, +++, or above were considered as amplified (see Materials and Methods for details). Fold = fold increase in gene expression compared to normals. Tag column lists the SAGE tag sequences used for the calculations. Candidate target(s) are highlighted in italic. Abbreviation: NA, not applicable.
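
The amplification levels defined in the note above can be illustrated with a short Python sketch that maps a segmented log 2 ratio to −, +, ++, or +++ and then splits tumors into amplified and nonamplified groups at a chosen level. The function names and the sample values are hypothetical. How a value exactly equal to a threshold is treated is an assumption here, since the note does not say; an ASCII "-" stands in for the minus sign.

```python
# Thresholds on the segmented log2 ratio, taken from the table note.
# Assumption: a value equal to a threshold falls in the higher level.
LEVELS = [(0.5, "+++"), (0.2, "++"), (0.09, "+")]
ORDER = ["-", "+", "++", "+++"]


def amplification_level(segment_log2_ratio):
    """Map a segmented CGH log2 ratio to '-', '+', '++', or '+++'."""
    for threshold, label in LEVELS:
        if segment_log2_ratio >= threshold:
            return label
    return "-"


def split_by_amplification(segment_values, min_level="+"):
    """Partition tumors into (amplified, nonamplified) at a chosen level."""
    cutoff = ORDER.index(min_level)
    amplified, nonamplified = [], []
    for sample, value in segment_values.items():
        level = amplification_level(value)
        if ORDER.index(level) >= cutoff:
            amplified.append(sample)
        else:
            nonamplified.append(sample)
    return amplified, nonamplified


# Hypothetical segmented values; the sample names are placeholders.
values = {"tumor_a": 0.62, "tumor_b": 0.12, "tumor_c": 0.03}
print(split_by_amplification(values, min_level="+"))
print(split_by_amplification(values, min_level="+++"))
```

Raising `min_level` moves low-level gains into the nonamplified group, which mirrors how the table reports separate statistics for the + and +++ levels.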

gene induced by estrogen in the MCF-7 breast cancer cell line, and it encodes a secreted protein with leucine-rich repeats. Its exact function and potential role in breast cancer are unknown (34). EMSY was recently identified as a BRCA2-interacting protein and a target of the 11q13 amplicon in breast and ovarian carcinomas (32, 33). The amplification and overexpression of EMSY may compromise BRCA2 function in sporadic tumors. Although our SAGE data did not support EMSY as the target, the fact that our method pinpointed the neighboring gene E2IG4 confirmed the location of target(s) in this particular amplicon. Overexpression of E2IG4 and EMSY should be further validated using other means, such as quantitative RT-PCR or Northern hybridization.

On the other hand, our analysis of the 8q24.3 amplicon did not identify the known presumed target, MYC (Fig. 1D; Supplementary Table S3). Among the five tumors with amplification of this area, none showed significant overexpression of MYC compared with normal mammary epithelial cells or tumors that lacked amplification, suggesting that it may not be the best candidate target of this MCR in these breast tumors. A recent study analyzing myeloid malignancies with 8q24 amplification also concluded that MYC is not overexpressed in the amplified tumors and thus may not be the only target of this amplicon (35). Similarly, cDNA array CGH and gene expression analysis of breast carcinomas showed that only two of eight tumors with MYC amplification had increased expression of its mRNA, again raising the question whether MYC is the only target of the 8q24 amplicon in breast cancer (18). A previous study using an inducible MYC expression model showed that MYC expression was not fully required for tumor progression in


Table 3. Identification of candidate genes in the 12p13 amplicon using SAGE data (Cont’d)

SAGE: IDC3 IDC5 IDC6 I-EPI-7 LN1 LN2
Amplification (IDC3 to LN2): − − − − +++ −
Tag
Level +: P(A/NA/N) P(A/NA,N) P(A,NA/N) P(A/N) Fold
Level +++: P(A/NA/N) P(A/NA,N) P(A,NA/N) P(A/N) Fold

00 0 7 0 9 TTCACTAATT 0.647 NA NA NA 0 0.729 NA NA NA 0 00 0 0 0 0 CTATTTTTGT 0.513NA NA 0.4234 0.728 NA NA NA 0 00 0 0 0 0 GAGTAGCTGA NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 AGAAATTCAG NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 CAGGAGCAAA 0.819 NA NA NA 0 0.864 NA NA NA 0 00 0 0 0 0 GACTGGAGAG NA NA NA NA 0 NA NA NA NA 0 11 0 0 7 12 0 CCTGGAGTGG 0.2 NA NA 0.184 8 0.553NA NA 0.5 6 00 0 0 0 0 TAATACATTA NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 AAGCTCCCCC NA NA NA NA 0 NA NA NA NA 0 00 0 030 0 ATGCAAATTA 0.109 NA NA 0.423 10 0.017 0.5 0.336 0.5 15 00 0 0 0 0 GCTGCATTTA NA NA NA NA 0 NA NA NA NA 0 00 270 0 0 CCCGGCCCAA 0.365 NA NA NA 0 0.485 NA NA NA 0 00 0 0 0 0 CTGGAATGAT NA NA NA NA 0 NA NA NA NA 0 00 0 012 0 ACTGCTGTCT 0.685 NA NA 0.4234 0.455 NA NA 0.5 6 039 0 70 0 TAAAAATAAC 0.693NA NA NA 0 0.765 NA NA NA 0 00 0 0 0 0 CAATGTGAAA NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 CAAAAGATCA NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 CAGAATGGAG 0.819 NA NA NA 0 0.864 NA NA NA 0 00 0 0 0 0 GAATTGGAAA NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 CATATATGGG NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 TTCCCGCCTG NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 TGAGGCCTAT NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 ACTGATTGGT NA NA NA NA 0 NA NA NA NA 0 70 0 0 0 0 ACTATAGAGA 0.819 NA NA NA 0 0.864 NA NA NA 0 11 9 0 0 0 0 CACTTTGTAT NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 CCGGTAATCT NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 ATTTTAAATA NA NA NA NA 0 NA NA NA NA 0 00 0 0 0 0 AGCCAGTCTT NA NA NA NA 0 NA NA NA NA 0 00 0 7 0 0 TGTAAGAAAT 0.65 NA NA NA 0 0.731 NA NA NA 0 00 0 0 0 0 TCTTCCTTCC NA NA NA NA 0 NA NA NA NA 0

the presence of additional genomic changes, such as KRAS2 mutation (36), raising the possibility that MYC overexpression in tumors might be transiently maintained. We identified KIAA0196 as another potential target of this region, correlating with recent findings that KIAA0196 was both amplified and overexpressed in prostate cancer (37, 38).

Systematic characterization of the 12p13-p12 amplicon. In addition to these previously described and well-characterized amplicons, we also identified several chromosomal areas with high-level copy number gains that have not previously been characterized in detail in breast cancer. Among these, we further characterized a 1.8-Mb amplicon at 12p13-p12 found in three tumors (Fig. 1F; Table 3), because it showed one of the highest levels of amplification (comparable with that of the ERBB2 amplicon). KRAS2, an oncogene that is located in the 12p13-p12 amplicon, has previously been described as a gene amplified in a subset of breast carcinomas based on a limited array CGH screen for the amplification of known oncogenes (20), but this amplicon has not been systematically characterized at high resolution. Based on our integrated array CGH/SAGE analysis, we identified three candidate target genes in this region: H2AFJ, EPS8, and KRAS2. However, we also considered TEL/ETV6 as a candidate target, because it is a known oncogene (39–41), and translocation of TEL/ETV6 to NTRK3 on 15q25 was reported in >90% of secretory breast cancers (42, 43) and even in rare cases of IDCs (44). TEL/ETV6 encodes a member of the ETS family of transcription factors with an NH2-terminal oligomerization (PNT) and a COOH-terminal DNA binding (ETS) domain (40). TEL/ETV6 translocation was frequently found in myeloid and lymphoid leukemias and in solid tumors (39, 40, 45), and recently, amplification of TEL/ETV6 was reported in a myelodysplastic syndrome (46). KRAS2 is another attractive target due to the well-established roles of Ras family members in tumorigenesis. Mutation of RAS genes is infrequent in human breast carcinomas (47, 48), but amplification of KRAS2 has previously been reported in 10 of 27 cases of breast carcinomas and breast cancer cell lines (20) and in one case of metastatic rectal carcinoma (21).

Amplification of all four candidates in tumors predicted by array CGH was confirmed by quantitative PCR (Tables 2 and 4). Among 16 additional tumor samples screened that did not have array CGH data, only one tumor had amplification of H2AFJ, EPS8, and KRAS2. Among 26 breast cancer cell lines screened, only HCC1937 and ZR-75-1 showed copy number gain in this region (Tables 2 and 4). Thus, amplification of 12p13-p12 occurs with a low frequency in breast cancers. We further did FISH to confirm amplification of the candidates on the single-cell level and to determine whether TEL/ETV6 is involved in a translocation, as in the case of secretory breast cancer and other tumor types. FISH displayed dramatic amplification of all four candidates in tumors IDC1 and LN1, but not in the


Table 4. Validation of 12p13 amplicon candidate target genes in primary breast tumors and breast cancer cell lines

Sample       ETV6                H2AFJ               EPS8                KRAS2               HER2
             Copy no.  Exp       Copy no.  Exp       Copy no.  Exp       Copy no.  Exp       Copy no.  Exp

Normal
N050702      1.0       0.3       1.0       1.0       1.0       1.0       1.0       0.6       1.0       0.1
N051002      1.0       0.1       1.0       0.3       1.0       0.4       1.0       0.3       1.0       0.1
N052902      1.0       0.1       1.0       0.3       1.0       0.1       1.0       0.1       1.0       0.1
N061202      1.0       1.0       1.0       0.6       1.0       0.4       1.0       0.9       1.0       0.2
N062002      1.0       0.6       1.0       0.9       1.0       0.4       1.0       1.0       1.0       0.3

Tumors
BWH-T1       1.3       0.9       0.9       0.4       0.9       0.2       0.6       0.7       0.4       1.3
BWH-T3       1.4       1.5       1.5       0.6       1.2       1.2       0.7       0.9       0.5       2.9
BWH-T4       1.1       0.6       1.1       0.3       0.3       0.5       0.3       0.8       0.2       3.1
BWH-T7       0.6       1.0       0.4       1.0       0.0       1.1       0.2       2.8       0.1       6.8
BWH-T8       1.5       0.9       2.3       1.0       3.1       0.8       2.6       1.1       0.1       2.4
BWH-T15      0.5       0.3       0.7       0.4       0.6       0.6       0.3       0.7       0.3       4.6
BOT169       1.0       0.3       1.0       1.3       1.1       9.3       1.0       1.1       4.3       22.2
BWH-T18      0.8       0.3       0.7       0.9       0.7       0.4       0.5       0.4       0.4       0.9
BWH-T24      1.5       0.6       1.4       1.5       1.4       9.4       1.2       5.1       1.1       0.9
CT6          2.4       0.2       0.6       0.8       0.5       1.1       1.1       0.7       35.0      74.8
CT-36        0.8       1.5       1.2       7.5       1.3       1.1       0.6       1.0       0.9       23.1
CT-39        ND        0.2       ND        0.7       ND        0.6       ND        0.3       ND        0.4
CT-46        0.6       0.8       1.0       7.6       1.3       2.4       0.5       2.5       0.8       18.8
CT-47        0.8       0.6       1.0       0.9       1.0       0.9       0.6       0.5       0.8       55.4
CT-49        ND        1.6       ND        2.6       ND        2.1       ND        1.8       ND        5.6
IDC1         3.6       0.2       6.1       13.5      7.3       4.0       3.3       1.0       2.0       9.1
LN1          4.9       0.2       8.6       14.4      8.0       3.8       3.9       1.3       4.5       8.9
IDC2         ND        0.0       ND        ND        ND        ND        ND        0.0       ND        0.2
LN2          ND        0.0       0.7       15.7      1.0       0.3       ND        0.1       0.9       0.9
MGH-T3       0.8       0.1       1.1       0.4       1.6       0.1       0.3       0.1       0.1       1.6
MGH-T4       1.4       0.4       1.3       ND        0.3       ND        1.3       ND        1.3       107.0

Cell lines
21MT1        1.4       0.7       1.2       3.7       1.3       2.5       1.5       0.5       ND        142.9
21MT2        1.0       0.6       1.1       2.7       0.9       3.7       1.1       0.6       ND        70.0
21NT         0.8       0.0       1.0       2.4       0.8       3.9       1.3       0.3       16.4      96.1
21PT         0.7       1.0       0.8       2.2       0.6       4.5       0.8       1.3       8.5       177.3
BT-20        1.1       0.2       0.5       3.5       0.7       1.6       1.8       0.8       0.0       0.0
BT-549       0.7       0.1       0.8       2.4       0.9       1.1       1.0       ND        0.5       142.4
HCC1937      5.7       0.7       1.2       3.9       0.9       6.2       3.4       1.1       0.3       0.6
Hs578T       0.9       0.2       0.7       0.3       0.5       1.3       1.1       ND        0.6       0.4
MCF-10A      0.9       0.1       0.5       0.4       0.4       0.3       1.3       ND        ND        0.2
MCF10DCIS    0.9       0.2       0.5       0.4       0.4       0.4       1.3       ND        ND        0.7
MCF-7        ND        ND        0.7       ND        0.7       ND        ND        ND        ND        ND
MDA-MB-435   0.6       0.1       0.8       0.3       0.9       0.5       0.8       ND        ND        0.1
MDA-MB-468   0.9       0.1       0.5       0.5       0.3       0.1       0.9       ND        0.3       0.1
SUM-44       1.0       0.1       0.8       0.0       0.9       0.7       0.8       ND        ND        0.5
SUM-52       1.2       0.1       0.6       0.4       0.5       0.4       1.0       ND        ND        0.9
SUM-102      1.4       0.1       1.4       0.3       0.9       0.3       1.1       ND        0.3       0.1
SUM-1315     0.6       0.2       1.1       1.8       0.8       2.8       0.5       ND        0.5       0.6
SUM-149      0.8       0.1       1.6       0.6       1.3       0.7       1.7       ND        0.3       0.3
SUM-159      0.9       0.2       1.1       4.8       0.7       2.9       0.9       ND        0.5       1.2
SUM-185      0.7       0.3       1.5       1.4       1.3       1.0       0.6       ND        0.7       290.6
SUM-190      1.2       0.4       0.9       1.8       1.0       1.2       1.1       ND        58.7      331.8
SUM-225      1.0       0.2       0.7       0.6       0.5       0.8       1.1       ND        34.8      265.5
SUM-229      1.0       0.1       0.6       0.0       0.6       0.8       0.6       ND        ND        0.6
UACC-812     0.7       1.8       0.8       0.3       0.4       2.0       0.8       3.1       16.0      5,843.2
UACC-893     1.5       0.5       0.8       4.8       0.6       0.4       1.1       1.1       48.2      371.6
ZR-75-1      2.7       0.2       2.1       3.8       2.6       1.0       1.5       ND        ND        1.7

NOTE: Gene and sample names, copy numbers predicted based on quantitative PCR (copy no.) and overexpression determined by quantitative RT-PCR (Exp) are listed. Values predicting ≥2-fold copy number gain or overexpression are in boldface. Abbreviation: ND, not determined.
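
The association between amplification and overexpression that the article tests with a Fisher exact test can be sketched in Python as below. This is a minimal illustration: it applies the ≥2-fold cutoff from the note above to both measurements, which is an assumption about how the calls were made, and it uses hypothetical (copy number, expression) pairs rather than values from this table. The function names are placeholders, and the sketch makes no attempt to reproduce the reported P value.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))


def contingency(pairs, cutoff=2.0):
    """Count samples as (amp&over, amp only, over only, neither)."""
    a = b = c = d = 0
    for copy_no, expression in pairs:
        amp, over = copy_no >= cutoff, expression >= cutoff
        if amp and over:
            a += 1
        elif amp:
            b += 1
        elif over:
            c += 1
        else:
            d += 1
    return a, b, c, d


# Hypothetical (copy number, expression) pairs, not values from Table 4.
pairs = [(4.0, 20.0), (35.0, 70.0), (5.0, 9.0),
         (1.0, 0.9), (0.8, 1.2), (0.5, 0.4)]
table = contingency(pairs)
print(table, fisher_exact_two_sided(*table))
```

With few amplified samples, as is the case for the 12p13 genes, even a perfect association yields a large P value, which is consistent with the article's remark about small sample size.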


control tumor LN2, which was also negative by array CGH (Fig. 2). FISH using two bacterial artificial chromosomes flanking the TEL/ETV6 gene labeled in red and green showed adjacent red and green signals in the tumors, suggesting that the TEL/ETV6 gene is not disrupted in these tumors. In the HCC1937 cell line, the majority of cells carried five copies of TEL/ETV6. Staining of metaphase chromosomes using centromere-specific probes showed that three copies of ETV6 were associated with chromosome 12, whereas the other two copies associated with chromosome 4 and one yet unidentified chromosome. Again, the TEL/ETV6 gene is not disrupted in the HCC1937 cell line. Therefore, translocation and fusion of TEL/ETV6 to NTRK3 or other genes is not a common event in breast cancers.

We next did quantitative reverse transcription-PCR (RT-PCR) on cDNA samples from primary breast tumors, breast cancer cell lines, and five purified normal mammary epithelial cells as references to examine overexpression of the four putative targets (Table 4). TEL/ETV6 and KRAS2 were overexpressed in a subset of breast tumors, but this did not correlate with their amplification. Of the four genes tested, only H2AFJ and EPS8 were overexpressed in tumors in which they were also amplified, although the association between gene amplification and overexpression was statistically significant only for ERBB2 (P = 0.007, Fisher exact test) due to small sample size and the low frequency of amplification of the 12p13 target genes in breast tumors. Thus, based on these data, H2AFJ and EPS8 are the potential targets of this 12p13-p12 amplicon.

The H2AFJ gene encodes a member of the histone H2A superfamily. It has two isoforms generated by differential splicing. We detected the expression of only isoform 2 (NM_177925) by SAGE and quantitative RT-PCR. This isoform encodes a 129-amino-acid protein that is highly homologous (∼95%) to other H2A family members. The function of this particular H2A protein is not known, but presumably, it is involved in modulating chromatin structure and gene expression. A recent study described overexpression of H2AFJ in human metastatic melanoma lesions compared with common nevocellular nevi, suggesting a potential role for this gene in melanoma metastasis (49). Epidermal growth factor (EGF) pathway substrate 8 (EPS8) was originally identified as a substrate of EGF receptor (EGFR) that enhances mitogenic signaling from receptor tyrosine kinases, phorbol ester, and c-Src (50, 51). Constitutive tyrosine phosphorylation of EPS8 was observed in many tumor cell lines (52). Overexpression of EPS8 in murine C3H10T1/2 fibroblasts induced cellular transformation in the presence of EGF (53), whereas down-regulation of EPS8 by trichostatin A or small interfering RNA inhibited the growth of v-Src-transformed chicken cells (46). At the molecular level, EPS8 binds to internalized EGFR, controls EGFR trafficking, and relays signals from Ras, phosphatidylinositol 3-kinase to Rac. More recently, EPS8 was found to bind to the barbed ends of actin filaments and regulate actin polymerization and cell motility (54, 55). Public gene expression data suggest that EPS8 is overexpressed in breast and several other cancer types, including lung and pancreatic cancer. Further studies are required to confirm overexpression of EPS8 protein in breast tumors by immunochemistry and to evaluate its potential prognostic value.

In summary, we identified H2AFJ and EPS8 as novel candidate breast cancer oncogenes based on integrated cDNA array CGH and SAGE analyses. The combination of these two technologies seems to be powerful for the identification of candidate target genes of amplified loci as shown by the identification of a novel 12p13

Figure 2. FISH analysis of candidate targets of the 12p13 amplicon in breast tumors and breast cancer cell lines. FISH analysis using a commercially available TEL/AML probe in breast tumors IDC1 (A), LN1 (B), LN2 (C), IDC-C6 (D), and HCC1937 breast cancer cell line (E). A to E, only the TEL (green) signal was captured. E, inset, typical interphase nucleus observed in the majority of HCC1937 cells, showing five TEL/ETV6 signals per cell. FISH analysis using bacterial artificial chromosome probes centromeric (red) and telomeric (green) to TEL/ETV6 in tumor IDC-C6 (F) and metaphase chromosomes of HCC1937 breast cancer cell line (G). Colocalization of the two signals indicates integrity of the TEL/ETV6 chromosomal locus in both cases. H, FISH analysis of metaphase chromosomes of HCC1937 cell line. Green and red signals correspond to TEL/ETV6 and AML probes, respectively, whereas aqua signal (white arrow) marks the centromeres of the three chromosomes 4. The majority of the metaphase cells in the HCC1937 breast cancer cell line seem to be near tetraploid. Yellow arrow points to TEL/ETV6 (green signal) localized on a derivative chromosome 4. FISH analysis of H2AFJ and EPS8 in breast tumor LN1 (I and J, respectively) and ZR-75-1 breast cancer cell line (K and L, respectively). In both cases, red signal corresponds to the gene-specific bacterial artificial chromosome, whereas the green signal reflects hybridization using a chromosome 12 centromeric probe.

amplicon and its putative targets (H2AFJ and EPS8) detected in a subset of breast tumors. Further functional studies are necessary to validate the role of these new candidate oncogenes in breast tumorigenesis.

Acknowledgments

Received 11/14/2005; revised 1/8/2006; accepted 1/26/2006.

Grant support: National Cancer Institute Cancer Genome Anatomy Project and Specialized Program in Research Excellence in Breast Cancer at Dana-Farber/Harvard Cancer Center grant CA89393, Department of Defense Breast Cancer Center of Excellence grant DAMD17-02-1-0692 (K. Polyak), Department of Defense Postdoctoral Fellowship grant DAMD17-02-1-0363 (J. Yao), and grant CA93683.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

We thank Drs. Drazen Belina, Zrinka Pagon, and Jasminka Razumovic (University Hospital Rebro and Zagreb Medical School, Zagreb, Croatia) and Drs. Andrea Richardson, Gabriela Lodeiro, and Ruth Gomes (Brigham and Women’s Hospital, Boston, MA) for help with the acquisition of tumor samples and the current and past members of the Polyak laboratory for critical reading of the article and their constructive criticism throughout the execution of this project.


Cancer Res 2006; 66: (8). April 15, 2006 4078 www.aacrjournals.org

Downloaded from cancerres.aacrjournals.org on September 26, 2021. © 2006 American Association for Cancer Research.

Combined cDNA Array Comparative Genomic Hybridization and Serial Analysis of Gene Expression Analysis of Breast Tumor Progression

Jun Yao, Stanislawa Weremowicz, Bin Feng, et al.

Cancer Res 2006;66:4065-4078.

Updated version: Access the most recent version of this article at: http://cancerres.aacrjournals.org/content/66/8/4065

Supplementary Material: Access the most recent supplemental material at: http://cancerres.aacrjournals.org/content/suppl/2006/04/26/66.8.4065.DC1

Cited articles: This article cites 55 articles, 16 of which you can access for free at: http://cancerres.aacrjournals.org/content/66/8/4065.full#ref-list-1

