A

Cell line | p53 mutation | Type
UM-SCC1 | wt |
UM-SCC5 | Exon 5, 157 GTC --> TTC | Missense mutation by transversion (Valine --> Phenylalanine)
UM-SCC6 | wt |
UM-SCC9 | wt |
UM-SCC11A | wt |
UM-SCC11B | Exon 7, 242 TGC --> TCC | Missense mutation by transversion (Cysteine --> Serine)
UM-SCC22A | Exon 6, 220 TAT --> TGT | Missense mutation by transition (Tyrosine --> Cysteine)
UM-SCC22B | Exon 6, 220 TAT --> TGT | Missense mutation by transition (Tyrosine --> Cysteine)
UM-SCC38 | Exon 5, 132 AAG --> AAT | Missense mutation by transversion (Lysine --> Asparagine)
UM-SCC46 | Exon 8, 278 CCT --> CGT | Missense mutation by transversion (Proline --> Alanine)

B

Supplementary Methods

Cell Lines and Cell Culture

A panel of ten established HNSCC cell lines from the University of Michigan series

(UM-SCC) was obtained from Dr. T. E. Carey at the University of Michigan, Ann Arbor,

MI. The UM-SCC cell lines were derived from eight patients with SCC of the upper

aerodigestive tract (supplemental Table 1). Patient age at tumor diagnosis ranged from

37 to 72 years. The cell lines selected were obtained from patients with stage I-IV

tumors, distributed among oral, pharyngeal and laryngeal sites. All the patients had

aggressive disease, with early recurrence and death within two years of therapy. Cell

lines established from single isolates of a patient specimen are designated by a numeric

designation, and where isolates from two time points or anatomical sites were obtained,

the designation includes an alphabetical suffix (i.e., "A" or "B"). The cell lines were

maintained in Eagle's minimal essential medium supplemented with 10% fetal bovine

serum and penicillin/streptomycin.

Human normal keratinocytes (HKC) were obtained from four individuals

(Cascade Biologics Inc., Portland, OR), and were maintained in serum-free

medium 154CF containing 0.08 mM calcium chloride and supplemented with keratinocyte growth supplements (HKGS). The final concentrations of the HKGS components in the complete medium were 0.2% (v/v) bovine pituitary extract (BPE), 5 μg/ml bovine insulin, 0.18 μg/ml hydrocortisone, 5 μg/ml transferrin, and 0.2 ng/ml human recombinant EGF. All HKC were used within five passages. The cultures were incubated in a humidified cell culture incubator at 37°C and 5% CO2.

RNA Isolation, Labeling of cDNA, Microarray Hybridization and Image Collection

Total RNA was isolated from cultured UM-SCC cells and primary keratinocytes in log growth phase using Trizol reagent (Invitrogen, Carlsbad, CA) according to the manufacturer’s protocol and stored at -80°C. To make fluorescence-labeled target cDNA by reverse transcription, 50-100 µg of total RNA was incubated in a cocktail containing Cy3- or Cy5-dUTP (Amersham Pharmacia Biotech Inc, Piscataway, NJ) and SuperScript II RT (Gibco BRL Technologies, Inc, Gaithersburg, MD). Labeled targets were purified using a Microcon column (Millipore, Bedford, MA). The appropriate Cy3 and Cy5 targets were combined with 2 µl (20 µg) mouse COT-1 DNA, 1 µl (8-10 µg) polyA, 2.6 µl 20X SSC and 0.45 µl 10% SDS in a final volume of 15 µl. After denaturation, labeled targets were added to processed 24K human array chips developed by the National Human Genome Research Institute (NHGRI, Bethesda, MD), which were then placed in humidified chambers and incubated overnight (10-16 hours) at 65°C. The next day, slides were washed for 1 minute in 1X SSC, 1 minute in 0.2X SSC, and 10 seconds in 0.05X SSC, then spun dry. Each hybridization condition was repeated at least three times. Fluorescence images were captured with a GenePix 4000 microarray scanner (Axon Instruments, Union City, CA) using GenePix Pro software from the same company. The PMT voltage for each channel was adjusted within the range of 750 V to 890 V to give the same overall intensity in both channels, and the raw image was saved in TIFF format.

Data Collection, Filtering and Quality Control

The 24K cDNA microarray chips, containing a total of 23220 spots that include known genes, ESTs, hypothetical genes and control spots, were obtained from the National Human Genome Research Institute (Bethesda, MD). The study focused on the set of 12270 known genes on the array. Raw images were analyzed using the ArraySuite 2.1 extensions in the IPLab program (Scanalytics, Inc., Fairfax, VA) to calibrate relative ratios and develop confidence intervals for their significance (Chen et al., 1997). Fluorescence intensities for both dyes (Cy3 and Cy5) and local background-subtracted values for individual spots were obtained. Spots were filtered based on the RQuality score provided by the software and checked visually. The software extracts information regarding spot quality and assigns an RQuality score to each ratio measurement, with 0 as the lowest and 1 as the highest measurement quality. Any spots with an RQuality score below 0.5 were excluded. For each spot, the calibrated ratio (the median of the pixel-by-pixel ratios of pixel intensities after subtraction of the median background intensity) was used in subsequent analysis. Expression outliers were determined using +3.0 SD and a 99% confidence interval as cutoffs. The resulting gene list was further filtered to exclude EST and hypothetical clones and to retain only genes with expression values from at least three human normal keratinocyte samples and seven tumor cell lines, resulting in a total of 9273 genes.
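The two filtering rules described above (an RQuality cutoff of 0.5 per spot, and a minimum number of usable measurements per group) can be illustrated with a minimal sketch. The data layout, the function names and the sample values are hypothetical and are not part of ArraySuite or IPLab; only the thresholds come from the text.

```python
# Illustrative sketch only: the dictionary layout and function names are assumptions.
RQUALITY_CUTOFF = 0.5  # spots scoring below 0.5 are excluded (from the text)
MIN_NORMAL = 3         # at least three normal keratinocyte samples with a value
MIN_TUMOR = 7          # at least seven tumor cell lines with a value


def usable_ratio(spot):
    """Return the calibrated ratio if the spot passes the quality cutoff, else None."""
    if spot is None or spot["rquality"] < RQUALITY_CUTOFF:
        return None
    return spot["ratio"]


def passes_presence_filter(normal_spots, tumor_spots):
    """Keep a gene only if enough samples in each group have a usable ratio."""
    n_normal = sum(usable_ratio(s) is not None for s in normal_spots)
    n_tumor = sum(usable_ratio(s) is not None for s in tumor_spots)
    return n_normal >= MIN_NORMAL and n_tumor >= MIN_TUMOR


def filter_genes(table):
    """table maps a gene name to {"normal": [spots], "tumor": [spots]}."""
    return {
        gene: groups
        for gene, groups in table.items()
        if passes_presence_filter(groups["normal"], groups["tumor"])
    }


good = {"ratio": 1.2, "rquality": 0.9}
bad = {"ratio": 5.0, "rquality": 0.3}
example = {
    "GENE_A": {"normal": [good] * 4, "tumor": [good] * 10},
    "GENE_B": {"normal": [good] * 2 + [bad] * 2, "tumor": [good] * 10},
    "GENE_C": {"normal": [good] * 4, "tumor": [good] * 6 + [bad] * 4},
}
kept = filter_genes(example)
```

In this toy input only GENE_A is retained: GENE_B has two usable normal measurements and GENE_C has six usable tumor measurements, so both fall below the stated minimums.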

Data Normalization, Calculation and Statistical Analysis

PCA analysis was performed on normalized data using Partek Pro 5.1 software (Partek Inc., St. Louis, MO). Hierarchical clustering was carried out on the list of genes for which the average ratio of the ten tumor cell lines differed from that of normal keratinocytes by more than 2-fold with a t-test P < 0.01. The resulting 969 genes were analyzed with Cluster software, based on the clustering algorithm of Eisen et al. (Eisen et al., 1998), and the expression maps of clustered genes were visualized using Java Treeview software (Saldanha et al., 2004). Hierarchical agglomerative clustering was also performed with BRB-ArrayTools, developed by Dr. Richard Simon and Amy Peng (available at http://linus.nci.nih.gov/BRB-ArrayTools.html). It was applied to the normalized log ratios using both compact linkage and average linkage and both Euclidean and one minus Pearson correlation distance metrics. Normalized log ratios were median-centered within each gene for all of the cluster analyses. By visual inspection, compact linkage with one minus Pearson correlation distance applied to the 969 probe elements yielded the most distinctive clusters; attention was focused on the genes responsible for segregating the members of sub_A and sub_B identified by PCA analysis in Figure 1. The presence of significant clustering was also assessed by applying the global test of clustering proposed by McShane et al. to confirm the findings (McShane et al., 2002).
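The preprocessing and distance metrics named above can be written out in a few lines. This is a sketch rather than the BRB-ArrayTools implementation: the function names are hypothetical, and "compact linkage" is assumed here to mean complete linkage (the distance between two clusters is the largest pairwise distance between their members).

```python
import math
from statistics import median


def median_center(log_ratios):
    """Subtract the gene's median log ratio from each of its values."""
    m = median(log_ratios)
    return [x - m for x in log_ratios]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def one_minus_pearson(x, y):
    """Distance in [0, 2]: 0 for perfectly correlated profiles."""
    return 1.0 - pearson(x, y)


def euclidean(x, y):
    """Straight-line distance between two profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def complete_linkage(cluster_a, cluster_b, dist):
    """Assumed meaning of 'compact' linkage: the largest pairwise distance."""
    return max(dist(p, q) for p in cluster_a for q in cluster_b)


centered = median_center([1.0, 2.0, 3.0, 4.0])
d_same = one_minus_pearson([1, 2, 3, 4], [2, 4, 6, 8])
d_opposite = one_minus_pearson([1, 2, 3, 4], [4, 3, 2, 1])
```

One minus Pearson correlation ignores the magnitude of a profile and compares only its shape, which is why the two proportional profiles above are at distance 0 while Euclidean distance would separate them.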

Gene Ontology analysis was carried out with the Database for Annotation, Visualization, and Integrated Discovery (DAVID), a web-based client/server application that allows users to access a relational database of functional annotation (NIAID, NIH, Bethesda, MD). Functional annotations are derived primarily from LocusLink at the National Center for Biotechnology Information (NCBI). Annotation pedigrees are provided via direct links to the primary sources of annotation, which also provide additional gene-specific information. To analyze differential expression between the tumor and normal groups, statistical analysis of the data from the hierarchical clustering was performed using the class comparison tool in BRB-ArrayTools. Univariate F-tests (BRB-ArrayTools 2.0) were performed on these genes in the two classes. This tool also computes a global permutation test to exclude genes that differ significantly due to chance alone. Genes that showed significant differences (P < 0.001) after 2000 permutations were designated as differentially expressed genes.
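The idea behind a permutation-based class comparison can be sketched for a single gene as follows. This is not the BRB-ArrayTools procedure (its global test works across all genes at once); it is a simplified per-gene illustration in which the function names, the seed and the example log ratios are all assumptions. Only the two-class F statistic and the 2000 permutations follow the text.

```python
import random
from statistics import mean, variance


def f_statistic(group_a, group_b):
    """One-way ANOVA F statistic for two groups (between-group df = 1)."""
    a, b = list(group_a), list(group_b)
    na, nb = len(a), len(b)
    grand = mean(a + b)
    between = na * (mean(a) - grand) ** 2 + nb * (mean(b) - grand) ** 2
    within = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return between / within


def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Share of label shufflings whose F is at least as large as the observed F."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = f_statistic(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    na = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if f_statistic(pooled[:na], pooled[na:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical log ratios for one gene: four normal samples, ten tumor lines.
normal = [0.10, -0.20, 0.05, 0.00]
tumor = [2.10, 1.90, 2.30, 2.00, 2.20, 1.80, 2.40, 2.05, 1.95, 2.15]
p_value = permutation_p_value(normal, tumor)
```

With 2000 permutations the smallest attainable value of this estimate is 1/2001, which is just below the P < 0.001 threshold quoted in the text.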

A mixed-model-based F test was used to analyze differential gene expression among three groups of cells: HKC and subgroups 1 and 2 of UM-SCC cells (Tempelman, 2005; Wolfinger et al., 2001). Restricted maximum likelihood (REML) was used as the estimation method. The F test was performed in the mixed model using the SAS 9.1 program (SAS Institute Inc., Cary, NC). P < 0.05 was designated as a significant difference between two groups.

Real Time RT-PCR

Gene expression profiles generated from microarray were confirmed by real time quantitative RT-PCR. cDNA synthesis from total RNA was performed using the High-Capacity cDNA Archive Kit, and real time quantitative RT-PCR was performed using the Assays-on-Demand™ Gene Expression kits for specific genes and an ABI Prism 7700 Sequence Detection System (Applied Biosystems, Foster City, CA) according to the manufacturer’s protocols. PCR was performed together with an endogenous control, eukaryotic 18S ribosomal RNA (18S). Amplification was carried out by adding 30 ng of cDNA to 30 μl of PCR reaction mix (1.5 μl of 20X assay mix and 15 μl of 2X TaqMan Universal Master Mix). Amplification conditions were as follows: activation of enzymes for 2 min at 50°C and 10 min at 95°C, followed by 40 cycles of 15 sec at 95°C and 1 min at 60°C. Relative quantitation of expression was calculated by normalizing the target gene signals to the 18S endogenous control. An arbitrary unit was calculated after setting a CT of 40 as the undetectable expression level and used for normalization.
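One way to read the arbitrary-unit calculation is sketched below. The text does not give the formula, so the conversion 2^(40 - CT), which assumes a doubling of product per cycle, is an assumption made for illustration, as are the function names and the example CT values.

```python
CT_UNDETECTABLE = 40.0  # from the text: a CT of 40 is treated as undetectable


def arbitrary_units(ct):
    """Assumed conversion: 2 ** (40 - CT), so CT = 40 maps to 1 unit."""
    ct = min(ct, CT_UNDETECTABLE)  # CT values beyond 40 are capped
    return 2.0 ** (CT_UNDETECTABLE - ct)


def relative_expression(ct_target, ct_18s):
    """Target signal normalized to the 18S endogenous control."""
    return arbitrary_units(ct_target) / arbitrary_units(ct_18s)


# Hypothetical CT values for one sample.
example = relative_expression(ct_target=25.0, ct_18s=15.0)
```

Under this assumption the normalized value depends only on the CT difference between the target and 18S, so the example above equals 2 ** -10.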

In Situ Hybridization

Human larynx tumors and normal mucosa were obtained from the Cooperative Human

Tissue Network. Specimens were sectioned at 10 µm, mounted onto Poly-L-Lysine microscope slides (Polysciences, Inc, Warrington, PA), and the UM-SCC 11A xenograft

tumors were used as a positive control (Ricker et al., 2004). The cRNA probes were

synthesized from the following cDNA clones: BAG2 (659 base pairs, Invitrogen);

CCNB2 (1315 base pairs, Open Biosystems, Huntsville, AL); PCNA (154 base pairs);

and CCND1 (560 base pairs, from Dr. C Bondy’s laboratory at NICHD, NIH). In situ

hybridization was carried out according to the protocol published previously (Bondy, 1993).

Immunohistochemical Analysis of HNSCC Tissue Arrays and Specimens

Approval was obtained from the Office of Human Subjects Research, National Institutes

of Health, for studies using tissue arrays and tissues from anonymous donors obtained

from commercial sources and the Cancer Human Tissue Network. Formalin fixed and

paraffin embedded HNSCC tissue arrays were obtained from Cybrdi Inc. (Frederick,

MD). The array contains HNSCC tumor tissues from 20 individuals, spotted in triplicate,

plus normal mucosa tissues from six normal subjects spotted in duplicate. The tissues

were sectioned at 5 μm thickness, and each array spot was 1.5 mm in diameter. The presence of HNSCC was confirmed by a certified pathologist based on histological H&E and immunohistochemical pan-cytokeratin staining. The slides were de-paraffinized, fixed and stained with H&E following the manufacturer’s suggestions. The primary antibodies used for immunostaining were as follows: 2 μg/ml of mouse anti-human TP53 (DO-1, IgG2a, Oncogene, San Diego, CA), 1 μg/ml of mouse anti-human pan cytokeratin (Cat# NCL-PAN-CK, IgG1, Novocastra Lab, Newcastle upon Tyne, UK), or the isotype control mouse IgG (Cat# I2000, Vector Lab, Inc, Burlingame, CA), which were incubated with the sections overnight at 4°C. Vectastain Elite ABC kit (Vector Lab, Burlingame, CA) and

the protocol provided by the manufacturer were used for subsequent immunostaining

steps.

The frozen HNSCC tumor tissues were obtained from the Cooperative Human Tissue Network (CHTN), Midwestern Division, Columbus, OH. For immunostaining of frozen HNSCC tumors, 10 µm frozen sections were fixed with 4% paraformaldehyde and permeabilized using 0.2% Triton X-100. The sections were quenched with 3% hydrogen peroxide/methanol and blocked with 10% serum for 20 min. Appropriate dilutions of primary antibodies and isotype controls were first titrated on cell lines and then applied to tumor sections. The final concentrations of the primary antibodies used for immunostaining were as follows: 6.2 μg/ml of mouse anti-human Pan-Cytokeratin (IgG1, Vector Laboratories, Inc, Burlingame, CA), 1 μg/ml of mouse monoclonal anti-human TP53 (DO-1, IgG2a, Calbiochem, EMD Biosciences, San Diego, CA), 0.76 μg/ml of rabbit anti-human phospho-NF-κB p65 (Ser536, Cell Signaling Technology, Danvers, MA), 5 μg/ml of rabbit anti-human YAP (Cell Signaling), 2 μg/ml of rabbit anti-human CA-IX (H-120), 2 μg/ml of rabbit anti-human c-IAP1 (H-83, Santa Cruz Biotechnology, Santa Cruz, CA), and matched isotype control mouse IgG (Dako) and control rabbit IgG (ICN, Aurora, OH). The sections were incubated with primary antibodies overnight at 4°C. Vectastain Elite ABC kit (Vector Lab, Burlingame, CA) and the protocol provided by the manufacturer were used for the immunostaining. Sections were then counterstained with hematoxylin, dehydrated using graded alcohols and xylene, and mounted with Permount.

Immunohistochemical scores were based upon the product of the percentage of positive cells and stain intensity (0 = negative, 1 = weak, 2 = moderate, 3 = strong) in three different high power fields (400x) for each tumor section. A final IHC score was then given to each tumor for each primary antibody. To test whether the immunostaining intensities of TP53 and phospho-NF-κB p65 were significantly related, a "univariate" association between TP53 and phospho-NF-κB p65 was analyzed by plotting the data with least squares regression fits, and then calculating the Spearman rank correlation and the p-values corresponding to tests of whether the correlation is equal to zero.
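The scoring and correlation steps can be sketched as follows. Averaging the three field scores into the final IHC score is an assumption (the text does not say how the fields were combined), the function names and example values are hypothetical, and the p-value calculation is left out.

```python
import math


def field_score(percent_positive, intensity):
    """Percent positive cells (0-100) multiplied by stain intensity (0-3)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return percent_positive * intensity


def tumor_score(fields):
    """Assumed combination rule: mean of the per-field scores."""
    scores = [field_score(p, i) for p, i in fields]
    return sum(scores) / len(scores)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical tumor: three fields given as (percent positive, intensity).
score = tumor_score([(80, 3), (60, 2), (100, 1)])
rho = spearman_rho([1, 2, 3, 4, 5], [2, 4, 5, 9, 20])
```

Because Spearman correlation works on ranks, any monotonic relationship between the two sets of IHC scores gives a coefficient of 1 or -1 regardless of its shape.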

Chromatin Immunoprecipitation Assay (ChIP)

ChIP assays were performed using the EZ ChIP assay kit (Upstate Biotechnology, Waltham, MA) following the manufacturer’s directions. Briefly, cultured UM-SCC cells at 70-80% confluence were treated with or without TNF (2,000 units/ml) for four hours. Whole cell lysates were harvested, and DNA and proteins were cross-linked with 1% formaldehyde and sonicated using a SONICATOR XL2020 (Misonix Inc, Farmingdale, NY). Precleared chromatin was immunoprecipitated with a rabbit polyclonal antibody against the p65 subunit of human NF-κB (Upstate Biotechnology, Waltham, MA), or with normal rabbit IgG (Upstate Biotechnology, Waltham, MA) or no antibody as negative controls, followed by incubation with protein A-agarose saturated with salmon sperm DNA (Upstate Biotechnology, Waltham, MA). Precipitated DNA was analyzed by PCR (35 cycles) with Platinum Taq DNA Polymerase (Invitrogen, Carlsbad, CA). The PCR products were analyzed on a 2% Agarose E-Gel (Invitrogen, Carlsbad, CA). Primers were designed using OligoPerfect™ Designer (Invitrogen). The primers for the IL-8 promoter (-121 ~ +61 bp) were 5’-GGGCCATCAGTTGCAAATC-3’ (forward) and 5’-TTCCTTCCGGTGGTTTCTTC-3’ (reverse); for the IL-6 promoter (-203 ~ -60 bp), 5’-TGCACTTTTCCCCCTAGTTG-3’ (forward) and 5’-TCATGGGAAAATCCCACATT-3’ (reverse); for the YAP1 (Yes-associated protein 1) promoter (-309 ~ -164), 5’-TAGCAACTTGCAGCGAAAAG-3’ (forward) and 5’-GCCTCAAACGCCAAAACTAA-3’ (reverse); and for the CA9 (carbonic anhydrase IX) promoter (-433 ~ -175), 5’-GCCTGCCCTACCTCTTTACC-3’ (forward) and 5’-TGTGCACAGGCAGAAGGTTA-3’ (reverse). All of the above PCR-amplified regions contain a putative NF-κB or TP53 binding site predicted using Gene2Promoter of Genomatix Suite 3.4.1. The promoter length of each gene was set to the optimized promoter region, including 500 bp upstream and 100 bp downstream of the transcription start site (TSS). Default values were set for all other parameters used in promoter analysis.
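As a quick sanity check on the primer sequences listed above, basic properties such as length, GC content and a rough melting temperature can be computed directly. This sketch is not part of OligoPerfect Designer; the function names are hypothetical, and the Wallace rule used for the melting temperature (2 °C per A/T, 4 °C per G/C) is only a rule of thumb.

```python
# Primer pairs as (forward, reverse), copied from the text with line-break hyphens removed.
PRIMERS = {
    "IL-8": ("GGGCCATCAGTTGCAAATC", "TTCCTTCCGGTGGTTTCTTC"),
    "IL-6": ("TGCACTTTTCCCCCTAGTTG", "TCATGGGAAAATCCCACATT"),
    "YAP1": ("TAGCAACTTGCAGCGAAAAG", "GCCTCAAACGCCAAAACTAA"),
    "CA9": ("GCCTGCCCTACCTCTTTACC", "TGTGCACAGGCAGAAGGTTA"),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_content(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rule-of-thumb melting temperature in degrees C (Wallace rule)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def summarize(primers):
    """Per gene: (length, GC fraction, Wallace Tm) for forward and reverse primers."""
    return {
        gene: [(len(p), round(gc_content(p), 2), wallace_tm(p)) for p in pair]
        for gene, pair in primers.items()
    }


summary = summarize(PRIMERS)
```

The reverse primers are written 5’ to 3’ on the opposite strand, so `reverse_complement` gives the sequence that would be searched for on the promoter's forward strand.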

Supplemental Results

Gene Ontology Analysis of Clustered Genes

The differentially expressed genes were classified into functional categories according to

gene ontology DAVID database (NIAID, NIH). Many of the up-regulated genes include

molecules implicated in cancer and malignant progression, such as cytokine growth

factors and receptors (IL-6, IL-8, G protein coupled receptors), molecules regulating cell

death and apoptosis (BAG2 and BIRC2/c-IAP1), cell cycle progression (cyclin B2, cyclin

D1 and p18), signal transduction (YAP1, MAK-related kinase/ICK, creatine kinase/CKB

and PIK3R3), DNA replication (topoisomerase II alpha/TOP2A, and H2B members of the histone family), and adhesion and migration (TIMP2 and BMP7). The down-regulated genes include molecules involved in receptor signaling and adhesion (IL-1RII, IL-4R, CD44, CD53, and CD81), cell death and apoptosis (BOK and BAK1), cell cycle (CDK5R1, CDKN1A, and cyclin I), signal transduction pathways (AKT1 and PIK3R1), transcription factors (JunD and PEA15), and many adhesion and structural proteins. The

detailed gene lists with predicted TF binding sites of the cluster A or B from the up-

regulated genes in UM-SCC cells are presented in supplemental Table 3.

To better understand the factors affecting the segregation of the two major UM-SCC subgroups, we examined the gene ontology of the two major clusters that distinguished each group (clusters A and B, Fig. 3 and supplemental Table 3), based on Euclidean and Pearson correlation distance metrics in the gene set. Interestingly, most of the genes in cluster A, which were highly expressed in UM-SCC subgroup 1, are involved in regulation of structure, function, and repair, such as H2A and H2B histone family components, histone deacetylase 5 (HDAC5) and xeroderma pigmentosum A (XPA, Fig. 3, and supplemental Table 3). In addition, this group also includes genes that

regulate important functions related to calcium storage and the endoplasmic reticulum

(CALR, OLFM1), signal transduction (IGFBP2, BLK, RALGDS), extracellular matrix

(MATN2), and metabolism (PDK2, ATP1B3, CPS1).

Conversely, the cluster B genes highly expressed in subgroup 2 of the UM-SCC lines are involved in inflammation, cell death, signaling, and differentiation. These include interleukin-6 (IL-6), interleukin-8 (IL-8), Yes-associated protein 1 (YAP1), allograft inflammatory factor 1 (AIF1), baculoviral IAP repeat-containing 2 (BIRC2/c-IAP1), intercellular adhesion molecule 1 (ICAM1) and the keratin 8 gene (KRT8), which have been reported to be overexpressed in malignant and metastatic SCC of skin and head and neck (Casanova et al., 2004; Markey et al., 1991; Suo et al., 1993).

We used a mixed-model-based F test to examine statistical differences in gene expression among HKC and the subgroups of UM-SCC cells overexpressing cluster A and B genes. In cells expressing cluster A genes, a statistically significant difference in expression (P < 0.0001) was observed when compared with expression in UM-SCC cells with the cluster B pattern or with HKC (Fig. 3A). By contrast, in cells expressing cluster B genes, a difference in expression with P < 0.001 was obtained when compared with UM-SCC cells expressing cluster A genes or with HKC (Fig. 3B).

Table S1. Tumor, treatment, and outcome characteristics of patients providing human SCC cell lines a

Cell Line | Age (yr) b | Gender | Stage | TNM c | Primary Site d | Specimen Site e | Prior Therapy f | Status g | Survival (M) h
UM-SCC 1 | 72 | M | I | T1N0M0 | FOM | Local recur | R | DWOD | 15
UM-SCC 5 | 59 | M | III | T2N1M0 | Supraglottic Larynx | Pri bx | S | DOD | 8
UM-SCC 6 | 37 | M | II | T2N0M0 | Anterior Larynx | Pri bx | N | LTF i |
UM-SCC 9 | 72 | F | II | T2N0M0 | Tonsil/BOT | Local recur | R | DOD | 15
UM-SCC 11A | 65 | M | IV | T2N2aM0 | Hypopharynx | Pri bx | N | DOD | 14
UM-SCC 11B | | | | | | Pri resect | C | |
UM-SCC 22A | 59 | F | III | T2N1M0 | Hypopharynx | Pri | N | DOD | 10
UM-SCC 22B | | | | | | LN met | N | |
UM-SCC 38 | 60 | M | IV | T2N2aM0 | Tonsil/BOT | Pri | N | DOD | 11
UM-SCC 46 | 57 | F | III | No TNM given | Supraglottic Larynx | Local recur | R, S | DOD | 6

a The clinical information was kindly provided by Drs. Thomas E. Carey and Carol R. Bradford, and some information was presented previously in the literature.
b Age represents patient's age in years at diagnosis.
c TNM, tumor-node-metastasis system of staging.
d Primary sites: FOM, floor of mouth; BOT, base of tongue.
e Specimen site refers to origin of tissue used to establish cultures: recur, recurrence; pri, primary tumor site; bx, biopsy; resect, surgical resection specimen; LN, lymph nodes; met, metastasis.
f Prior therapy refers to therapy given before the specimen used for culture was obtained: N, none; R, radiation; C, chemotherapy; S, surgery.
g Status: NED, no evidence of disease; DWOD, died without disease; DOD, died with disease.
h Survival represents time in months from diagnosis to last follow-up.
i LTF, lost to follow-up.

Table S2A. Selected list of up-regulated genes with 2-fold or higher (P < 0.01) average expression ratio in UM-SCC cell lines when compared with HKC1

Gene Description2 | Symbol | Chromosome Location | Average Ratio3 | Number of Cell Lines4

Cell Cycle and Proliferation
cyclin B2 | CCNB2 | 15q21.2 | 2.62 | 9
epidermal growth factor receptor pathway substrate 8 | EPS8 | 12q23-q24 | 2.52 | 6
cyclin D1 (PRAD1: parathyroid adenomatosis 1) | CCND1 | 11q13 | 2.38 | 5
glycoprotein (transmembrane) nmb | GPNMB | 7p15 | 6.85 | 4
bone marrow stromal cell antigen 2 | BST2 | 19p13.2 | 2.32 | 3
cyclin-dependent kinase inhibitor 2C (p18, inhibits CDK4) | CDKN2C | 1p32 | 2.01 | 3

Cell Death and Apoptosis
BCL2-associated athanogene 2 | BAG2 | 6p12.3-p11.2 | 4.43 | 8
baculoviral IAP repeat-containing 2 | BIRC2 | 11q22 | 2.76 | 5
Tax1 (human T-cell leukemia virus type I) binding protein 1 | TAX1BP1 | 7p15 | 2.00 | 4
Fas (TNFRSF6)-associated via death domain | FADD | 11q13.3 | 2.07 | 3

Growth Factors, Cytokines and Inducers
tumor necrosis factor, alpha-induced protein 2 | TNFAIP2 | 14q32 | 9.21 | 10
interleukin 6 (interferon, beta 2) | IL6 | 7p21 | 4.18 | 5
tumor necrosis factor (ligand) superfamily, member 10 | TNFSF10 | 3q26 | 3.52 | 5
interleukin 8 | IL8 | 4q13-q21 | 5.21 | 4

Inflammation and Immune Response
mal, T-cell differentiation protein 2 | MAL2 | 8q23 | 3.68 | 9
interleukin 10 receptor, beta | IL10RB | 21q22.11 | 2.08 | 6
CED-6 protein | CED-6 | 2q32.3-q33 | 2.25 | 5
major histocompatibility complex, class II, DO alpha | HLA-DOA | 6p21.3 | 1.95 | 4

Adhesion, Migration and Angiogenesis
intercellular adhesion molecule 1 (CD54) | ICAM1 | 19p13.3-p13.2 | 2.39 | 10
tissue inhibitor of metalloproteinase 2 | TIMP2 | 17q25 | 3.59 | 10
tenascin XB | TNXB | 6p21.3 | 15.95 | 10
matrilin 2 | MATN2 | 8q22 | 4.90 | 7
bone morphogenetic protein 7 (osteogenic protein 1) | BMP7 | 20q13 | 5.04 | 7
mesothelin | MSLN | 16p13.3 | 2.06 | 3
carcinoembryonic antigen-related cell adhesion molecule 5 | CEACAM5 | 19q13.1-q13.2 | 3.46 | 3

Receptors and Surface Antigen
CD151 antigen | CD151 | 11p15.5 | 11.53 | 10
serologically defined breast cancer antigen 84 | SDBCAG84 | 20pter-q12 | 1.89 | 6
MAGEF1 protein | MAGEF1 | 3q13 | 2.94 | 6
melanoma antigen, family A, 2 | MAGEA2 | Xq28 | 6.47 | 6
interleukin 22 receptor | IL22RA1 | 1p36.11 | 3.13 | 6
melanoma antigen, family B, 2 | MAGEB2 | Xp21.3 | 1.91 | 5
mucin 4, tracheobronchial | MUC4 | 3q29 | 3.52 | 4
chemokine orphan receptor 1 | CMKOR1 | 15q14-q15.1 | 2.33 | 3

Tumor Suppressor and Oncogenes
HIV-1 Tat interactive protein 2, 30 kD | HTATIP2 | 11p15.1 | 2.54 | 9
v-myb myeloblastosis viral oncogene homolog (avian)-like 2 | MYBL2 | 20q13.1 | 2.71 | 7
pim-1 oncogene | PIM1 | 6p21.2 | 2.05 | 6
DEK oncogene (DNA binding) | DEK | 6p23 | 2.17 | 6
retinoic acid receptor responder (tazarotene induced) 3 | RARRES3 | 11q23 | 3.31 | 5

Signal Transduction Pathway
tumor-associated calcium signal transducer 1 | TACSTD1 | 4q | 6.35 | 10
phosphatidic acid phosphatase type 2C | PPAP2C | 19p13 | 4.29 | 9
FERM, RhoGEF (ARHGEF) and pleckstrin domain protein 1 (chondrocyte-derived) | FARP1 | 13q32.2 | 2.55 | 8
ADP-ribosylation factor-like 4 | ARL4 | 7p21-p15.3 | 2.46 | 7
creatine kinase, brain | CKB | 14q32 | 19.83 | 7
GABA(A) receptor-associated protein like 1 | GABARAPL1 | 12p13.31 | 2.31 | 7
Ral guanine nucleotide exchange factor RalGPS1A | RALGPS1A | 9q34.11-q34.12 | 4.35 | 7
AT-hook transcription factor AKNA | AKNA | 9q32 | 2.29 | 6
GATA binding protein 2 | GATA2 | 3q21 | 2.48 | 6
inositol 1,4,5-trisphosphate 3-kinase A | ITPKA | 15q14-q21 | 2.48 | 6
protein phosphatase 1, regulatory (inhibitor) subunit 12A | PPP1R12A | 12q15-q21 | 2.58 | 6
synuclein, gamma (breast cancer-specific protein 1) | SNCG | 10q23.2-q23.3 | 3.27 | 6
cellular retinoic acid binding protein 2 | CRABP2 | 1q21.3 | 2.89 | 5
creatine kinase, brain | CKB | 14q32 | 4.26 | 5
phosphoinositide-3-kinase, regulatory subunit, polypeptide 3 (p55, gamma) | PIK3R3 | 1pter-p32.1 | 2.32 | 5
ribosomal protein S6 kinase, 90kD, polypeptide 5 | RPS6KA5 | 14q31-q32.1 | 2.35 | 5
stathmin-like 3 | STMN3 | 20q13.3 | 2.44 | 5
A kinase (PRKA) anchor protein (gravin) 12 | AKAP12 | 6q24-q25 | 3.50 | 4
MAK-related kinase | ICK | 6p12.3-p11.2 | 1.98 | 4
Yes-associated protein 1, 65 kD | YAP1 | 11q13 | 2.85 | 4
calmodulin-like 3 | CALML3 | 10pter-p13 | 2.02 | 3
mitogen-activated protein kinase kinase 5 | MAP2K5 | 15q22.2-q22.31 | 4.00 | 3
neurotrophic tyrosine kinase, receptor, type 2 | NTRK2 | 9q22.1 | 4.67 | 3
regulator of G-protein signalling 2, 24kD | RGS2 | 1q31 | 2.44 | 3

Transcriptional and Nuclear Factors
ets variant gene 1 | ETV1 | 7p21.3 | 4.10 | 9
topoisomerase (DNA) II alpha (170kD) | TOP2A | 17q21-q22 | 2.37 | 9
SWI/SNF-related matrix-associated actin-dependent regulator of chromatin a3 | SMARCA3 | 3q25.1-q26.1 | 2.74 | 8
nucleolar protein ANKT | ANKT | 15q13.3 | 2.53 | 8
E74-like factor 3 (ets domain transcription factor, epithelial-specific) | ELF3 | 1q32.2 | 7.49 | 8
msh homeo box homolog 1 (Drosophila) | MSX1 | 4p16.3-p16.1 | 2.46 | 8
transcriptional activator of the c-fos promoter | CROC4 | 1q21.3 | 2.13 | 6
IGF-II mRNA-binding protein 3 | IMP-3 | 7p11 | 2.42 | 6
homeo box D9 | HOXD9 | 2q31-q37 | 2.41 | 6
zinc finger protein 239 | ZNF239 | 19q13.2-q13.3 | 2.41 | 6
Meis1, myeloid ecotropic viral integration site 1 homolog 2 (mouse) | MEIS2 | 15q13.2 | 2.24 | 4
homeo box A13 | HOXA13 | 7p15-p14 | 2.01 | 4
TAF15 RNA polymerase II, TATA box binding protein (TBP)-associated factor | TAF15 | 17q11.1-q11.2 | 2.45 | 3

DNA and RNA Synthesis and Processing
topoisomerase (DNA) II alpha (170kD) | TOP2A | 17q21-q22 | 2.37 | 9
H4 histone family, member G | HIST1H4C | 6p21.3 | 2.73 | 8
centrosomal protein 1 | CEP1 | 9q33-q34 | 2.47 | 7
forkhead box M1 | FOXM1 | 12p13 | 2.52 | 7
TAF7 RNA polymerase II, TATA box binding protein (TBP)-associated factor | TAF7 | 5q31 | 2.11 | 7
RNA binding protein S1, serine-rich domain | RNPS1 | 16p13.3 | 2.54 | 7
H2B histone family, member C | HIST1H2BL | 6p22-p21.3 | 3.68 | 6
H2B histone family, member D | HIST1H2BN | 6p22-p21.3 | 3.73 | 5
H2B histone family, member B | HIST1H2BD | 6p21.3 | 4.49 | 5
H2B histone family, member L | HIST1H2BC | 6p21.3 | 3.11 | 5
H2B histone family, member R | HIST1H2BJ | 6p21.33 | 3.28 | 5
H4 histone family, member I | HIST1H4B | 6p21.3 | 1.96 | 5
histone deacetylase 5 | HDAC5 | 17q21 | 2.02 | 4

Protein Processing
ubiquitin specific protease 11 | USP11 | 1q21 | 2.51 | 8
ubiquitin-conjugating enzyme E2C | UBE2C | 20q13.12 | 2.80 | 7
ubiquitin-conjugating enzyme HBUCE1 | UBE2D4 | 7p13 | 2.78 | 7
ubiquitin specific protease 13 (isopeptidase T-3) | USP13 | 3q26.2-q26.3 | 2.01 | 6

Metabolism
cytochrome b-245, alpha polypeptide | CYBA | 16q24 | 10.55 | 8
prostaglandin E synthase | PTGES | 9q34.3 | 2.70 | 6
sushi-repeat protein | SRPX2 | Xq21.33-q23 | 2.61 | 5

Structure Protein
myosin VB | MYO5B | 18q12 | 2.49 | 5
keratin 8 | KRT8 | 12q13 | 1.98 | 5
keratin 19 | KRT19 | 17q21 | 4.81 | 5
keratin 4 | KRT4 | 12q12-q13 | 5.40 | 4
keratin 1 (epidermolytic hyperkeratosis) | KRT1 | 12q12-q13 | 2.48 | 3

Table S2B. Selected list of down-regulated genes with 2-fold or lower (P < 0.01) average expression ratio in UM-SCC cell lines when compared with HKC1

Gene Description2 | Symbol | Chromosome Location | Average Ratio3 | Number of Cell Lines4

Cell Cycle and Proliferation
amphiregulin (schwannoma-derived growth factor) | AREG | 4q13-q21 | 0.19 | 10
cyclin-dependent kinase 5, regulatory subunit 1 (p35) | CDK5R1 | 17q11.2 | 0.42 | 9
cyclin-dependent kinase inhibitor 1A (p21, Cip1) | CDKN1A | 6p21.2 | 0.28 | 9
peripheral myelin protein 22 | PMP22 | 17p12-p11.2 | 0.37 | 9
tetraspan 5 | TM4SF9 | 4q23 | 0.25 | 9
Cdc42 effector protein 2 | CDC42EP2 | 11q13 | 0.48 | 7
cyclin I | CCNI | 4q21.21 | 0.52 | 6
tetraspan 2 | TSPAN-2 | 1p11.1 | 0.50 | 6

Cell Death and Apoptosis
mucosa associated lymphoid tissue lymphoma translocation gene 1 | MALT1 | 18q21 | 0.34 | 10
caspase recruitment domain protein 12 | CARD12 | 2p22.3 | 0.44 | 8
Bcl-2-related ovarian killer protein-like | BOK | 2q37.3 | 0.39 | 8
tumor necrosis factor receptor superfamily, member 6b, decoy | TNFRSF6B | 20q13.3 | 0.35 | 8
synuclein, alpha (non A4 component of amyloid precursor) | SNCA | 4q21 | 0.37 | 8
modulator of apoptosis 1 | MINK | 17p13.3 | 0.53 | 7
BCL2-antagonist/killer 1 | BAK1 | 6p21.3 | 0.55 | 7
death effector filament-forming Ced-4-like apoptosis protein | DEFCAP | 17p13.2 | 0.48 | 5

Growth Factors, Cytokines and Inducers
colony stimulating factor 2 (granulocyte-macrophage) | CSF2 | 5q31.1 | 0.23 | 9
teratocarcinoma-derived growth factor 1 | TDGF1 | 3p21.31 | 0.41 | 8
gastrin-releasing peptide | GRP | 18q21.1-q21.32 | 0.46 | 8
growth hormone releasing hormone | GHRH | 20q11.2 | 0.43 | 8
retinoic acid induced 2 | RAI2 | Xp22 | 0.50 | 7
neuregulin 2 | NRG2 | 5q23-q33 | 0.48 | 7
interleukin 1, alpha | IL1A | 2q14 | 0.34 | 7
cysteine-rich, angiogenic inducer, 61 | CYR61 | 1p31-p22 | 0.50 | 6
transforming growth factor, beta 2 | TGFB2 | 1q41 | 0.53 | 5

Inflammation and Immune Response
mal, T-cell differentiation protein 2 | MAL2 | 8q23 | 0.39 | 10
B-cell receptor-associated protein BAP29 | BAP29 | 7p | 0.44 | 9
interferon, alpha-inducible protein (clone IFI-6-16) | G1P3 | 1p35 | 0.43 | 7

Adhesion, Migration and Angiogenesis
endothelial and smooth muscle cell-derived neuropilin-like protein | ESDN | 3 | 0.19 | 10
laminin, alpha 3 (nicein (150kD), kalinin (165kD), epilegrin) | LAMA3 | 18q11.2 | 0.17 | 10
caldesmon 1 | CALD1 | 7q33 | 0.11 | 10
vimentin | VIM | 10p13 | 0.11 | 10
follistatin | FST | 5q11.2 | 0.13 | 10
integrin, alpha 5 (fibronectin receptor, alpha polypeptide) | ITGA5 | 12q11-q13 | 0.26 | 10
integrin, alpha 3 (antigen CD49C, alpha 3 subunit of VLA-3 receptor) | ITGA3 | 17q21.33 | 0.39 | 10
melanoma cell adhesion molecule | MCAM | 11q23.3 | 0.21 | 9
gap junction protein, alpha 5, 40kD (connexin 40) | GJA5 | 1q21.1 | 0.31 | 9
angiopoietin 1 | ANGPT1 | 8q22.3-q23 | 0.37 | 9
integrin, alpha 2 (CD49B, alpha 2 subunit of VLA-2 receptor) | ITGA2 | 5q23-31 | 0.44 | 9
endothelin 1 | EDN1 | 6p24.1 | 0.32 | 8
vascular endothelial growth factor C | VEGFC | 4q34.1-q34.3 | 0.43 | 8
integrin, beta 4 | ITGB4 | 17q11-qter | 0.41 | 8
gap junction protein, beta 2, 26kD (connexin 26) | GJB2 | 13q11-q12 | 0.43 | 7
integrin, beta 7 | ITGB7 | 12q13.13 | 0.50 | 7
integrin, beta 6 | ITGB6 | 2q24.2 | 0.51 | 7
collagen, type VII, alpha 1 | COL7A1 | 3p21.1 | 0.42 | 7
integrin, alpha 6 | ITGA6 | 2q31.1 | 0.43 | 7
collagen, type XVII, alpha 1 | COL17A1 | 10q24.3 | 0.41 | 6
collagen, type VI, alpha 3 | COL6A3 | 2q37 | 0.54 | 6

Receptors and Surface Antigen
interleukin 1 receptor, type II | IL1R2 | 2q12-q22 | 0.15 | 10
transmembrane 6 superfamily member 2 | TM6SF2 | 19p13.3-p12 | 0.45 | 9
seven transmembrane protein TM7SF3 | TM7SF3 | 12q11-q12 | 0.47 | 8
lung type-I cell membrane-associated glycoprotein | T1A-2 | 1p36 | 0.35 | 8
endothelial differentiation, sphingolipid G-protein-coupled receptor, 5 | EDG5 | 19p13.2 | 0.33 | 8
CD53 antigen | CD53 | 1p31-p12 | 0.31 | 8
nerve growth factor receptor (TNFR superfamily, member 16) | NGFR | 17q21-q22 | 0.53 | 8
GDNF family receptor alpha 3 | GFRA3 | 5q31.1-q31.3 | 0.47 | 7
vitamin D (1,25-dihydroxyvitamin D3) receptor | VDR | 12q12-q14 | 0.43 | 7
prostaglandin E receptor 2 (subtype EP2), 53kD | PTGER2 | 14q22 | 0.53 | 7
G-protein coupled receptor | GPR161 | 1q23.2 | 0.52 | 6
squamous cell carcinoma antigen recognized by T cell | SART2 | 6q22 | 0.51 | 6
CD44 antigen (homing function and Indian blood group system) | CD44 | 11p13 | 0.50 | 6
interleukin 4 receptor | IL4R | 16p11.2-12.1 | 0.44 | 6
CD81 antigen (target of antiproliferative antibody 1) | CD81 | 11p15 | 0.53 | 6
signal sequence receptor, beta (translocon-associated protein beta) | PEG3 | 19q13.4 | 0.55 | 5
tumor rejection antigen (gp96) 1 | TRA1 | 12q24.2-q24.3 | 0.53 | 5

Tumor Suppressor and Oncogenes
v-maf musculoaponeurotic fibrosarcoma (avian) oncogene family, protein K | MAFK | 7p22.3 | 0.22 | 10
Ras suppressor protein 1 | RSU1 | 10p13 | 0.43 | 9
loss of heterozygosity, 11, chromosomal region 2, gene A | LOH11CR2A | 11q23 | 0.36 | 9
arginine-rich, mutated in early stage tumors | ARMET | 3p21.1 | 0.34 | 9
downregulated in ovarian cancer 1 | DOC1 | 3q12.3 | 0.30 | 9
N-myc downstream regulated gene 1 | NDRG1 | 8q24 | 0.44 | 7
ubiquitin specific protease 6 (Tre-2 oncogene) | USP6 | 17p13 | 0.53 | 7

Signal Transduction Pathway
SH2 domain protein 2A | SH2D2A | 1q21 | 0.40 | 10
dual adaptor of phosphotyrosine and 3-phosphoinositides | DAPP1 | 4q25-q27 | 0.28 | 10
RAP1, GTP-GDP dissociation stimulator 1 | RAP1GDS1 | 4q23-q25 | 0.46 | 9
vasodilator-stimulated phosphoprotein | VASP | 19q13.2-q13.3 | 0.37 | 9
SHC (Src homology 2 domain containing) transforming protein 1 | SHC1 | 1q21 | 0.34 | 9
ADP-ribosylation factor-like 7 | ARL7 | 2q37.2 | 0.33 | 9
regulator of G-protein signalling 20 | RGS20 | 8 | 0.28 | 9
diacylglycerol kinase, alpha (80kD) | DGKA | 12q13.3 | 0.35 | 9
v-akt murine thymoma viral oncogene homolog 1 | AKT1 | 2 | 0.47 | 8
ras homolog gene family, member C | ARHC | 1p21-p13 | 0.49 | 8
rho/rac guanine nucleotide exchange factor (GEF) 2 | ARHGEF2 | 1q21-q22 | 0.49 | 8
annexin A6 | ANXA6 | 5q32-q34 | 0.44 | 8
ras homolog gene family, member C | ARHC | 1p13.1 | 0.49 | 8
poly(rC) binding protein 2 | PCBP2 | 12q13.12-q13.13 | 0.51 | 8
G protein, alpha inhibiting activity polypeptide 2 | GNAI2 | 3p21 | 0.47 | 7
SH3 domain binding glutamic acid-rich protein like | SH3BGRL | Xq13.3 | 0.46 | 7
latent transforming growth factor beta binding protein 2 | LTBP2 | 14q24 | 0.40 | 7
neurogranin (protein kinase C substrate, RC3) | PTPRG | 3p21-p14 | 0.48 | 7
guanine nucleotide binding protein beta subunit 4 | GNB4 | 3q27.1 | 0.52 | 7
IQ motif containing GTPase activating protein 2 | IQGAP2 | 5q13.1 | 0.5 | 7
DNA-dependent protein kinase catalytic subunit-interacting protein 2 | KIP2 | 15q24 | 0.49 | 7
S100 calcium binding protein A4 (Ca protein, calvasculin, metastasin) | S100A4 | 1q21 | 0.53 | 6
WW domain binding protein 1 | WBP1 | 2p12 | 0.48 | 6
Rho guanine nucleotide exchange factor (GEF) 10 | ARHGEF10 | 8p23 | 0.45 | 6
HSPC022 protein | RAC2 | 22 | 0.48 | 6
Polo-like kinase 4 (Drosophila) | PLK4 | 4q27-q28 | 0.53 | 6
protein phosphatase 1, regulatory subunit 3D | PPP1R3D | 20q13.3 | 0.52 | 6
phosphoinositide-3-kinase, regulatory subunit, polypeptide 1 (p85 alpha) | PIK3R1 | 5q12-q13 | 0.50 | 6
ras homolog gene family, member B | RHOB | 6p25 | 0.51 | 5

Transcriptional and Nuclear Factors
core promoter element binding protein | COPEB | 10p15 | 0.22 | 10
zinc finger protein 42 (myeloid-specific retinoic acid-responsive) | ZNF42 | 19q13.2-q13.4 | 0.19 | 10
PPAR binding protein | PPARBP | 17q12-q21.1 | 0.21 | 10
ELK3, ETS-domain protein (SRF accessory protein 2) | ELK3 | 12q23 | 0.18 | 10
myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila) | MLL | 11q23 | 0.36 | 10
YY1 associated factor 2 | YAF2 | 12q12 | 0.41 | 10
AHNAK nucleoprotein (desmoyokin) | AHNAK | 11q12-q13 | 0.45 | 9
tripartite motif-containing 22 | TRIM22 | 11p15 | 0.39 | 9
Rap guanine nucleotide exchange factor (GEF) 2 | RAPGEF2 | 4q32.1 | 0.37 | 8
B-cell CLL/lymphoma 11B (zinc finger protein) | BCL11B | 14q32.2 | 0.27 | 8
methyl-CpG binding domain protein 1 | MBD1 | 18q21 | 0.37 | 8
chromosome condensation 1-like | CHC1L | 13q14.3 | 0.43 | 7
jun D proto-oncogene | JUND | 19p13.2 | 0.51 | 7
phosphoprotein enriched in astrocytes 15 | PEA15 | 1q21.1 | 0.54 | 6
thymopoietin | RNH | 11p15.5 | 0.53 | 6
SRp25 nuclear protein | ARL6IP4 | 12q | 0.51 | 6
coactivator independent of AF-2 | NCOA5 | 20q12-q13.12 | 0.54 | 5


DNA and RNA Synthesis and Processing
RNA-binding protein gene with multiple splicing | RBPMS | 8p12-p11 | 0.37 | 8

Protein Processing
tropomodulin 3 (ubiquitous) | TMOD3 | 2q14-q21 | 0.36 | 10
serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 7 | SERPINB7 | 18q21.33 | 0.11 | 10
DnaJ (Hsp40) homolog, subfamily C, member 12 | DNAJC12 | 10q21.1 | 0.36 | 9
E3 ubiquitin ligase SMURF2 | SMURF2 | 17q22-q23 | 0.29 | 9
cystatin A (stefin A) | CSTA | 3q21 | 0.47 | 8
Kruppel-like factor 7 (ubiquitous) | KLF7 | 2q32 | 0.45 | 8
ubiquitin-conjugating enzyme E2H (UBC8 homolog, yeast) | UBE2H | 7q32 | 0.52 | 6
protease, serine, 2 (trypsin 2) | PRSS2 | 7q35 | 0.45 | 6

Metabolism
cytochrome c oxidase subunit VIIb | COX7B | Xq13.1 | 0.38 | 10
farnesyl-diphosphate farnesyltransferase 1 | FDFT1 | 8p23.1-p22 | 0.22 | 10
cytochrome P450, 51 (lanosterol 14-alpha-demethylase) | CYP51A1 | 7q21.2-q21.3 | 0.42 | 9
metallothionein 1G | MT1G | 16q13 | 0.41 | 8
metallothionein 1H | MT1H | 16q13 | 0.45 | 8
cytochrome b5 reductase 1 (B5R.1) | CYB5R1 | 1p36.13-q41 | 0.40 | 8

Structure Protein
myotubularin related protein 8 | MTMR9 | 8p23-p22 | 0.44 | 10
epithelial protein lost in neoplasm beta | EPLIN | 12q13 | 0.25 | 10
keratin, hair, acidic, 1 | KRTHA1 | 17q12-q21 | 0.16 | 10
microfibrillar-associated protein 3 | MFAP3 | 5q32-q33.2 | 0.39 | 9
microfibrillar-associated protein 2 | MFAP2 | 1p36.1-p35 | 0.35 | 9
microtubule-associated protein 4 | MAP4 | 3p21 | 0.47 | 8
keratin, hair, acidic, 2 | KRTHA2 | 17q12-q21 | 0.47 | 8
keratin 5 | KRT5 | 12q12-q13 | 0.42 | 8
small proline-rich protein 1B (cornifin) | SPRR1B | 1q21-q22 | 0.33 | 8
myosin, heavy polypeptide 9, non-muscle | MYH9 | 22q13.1 | 0.54 | 7
pannexin 1 | PANX1 | 11cen-q12.1 | 0.50 | 7
extracellular matrix protein 1 | ECM1 | 1q21 | 0.48 | 7
cytoskeleton-associated protein 4 | CKAP4 | 12q24.11 | 0.43 | 7
keratin 6A | KRT6A | 12q12-q13 | 0.39 | 7
immunoglobulin domain protein (myotilin) | TTID | 5q31 | 0.49 | 6
microtubule-actin crosslinking factor 1 | MACF1 | 1p32-p31 | 0.55 | 5
actin, gamma 2, smooth muscle, enteric | ACTG2 | 2p13.1 | 0.55 | 5

1 The table lists selected genes whose average expression ratio changed at least 2-fold when the ten HNSCC cell lines were compared with the four HKC cultures. 2 Gene Description and classification were based on NCBI GenBank gene information and DAVID ontology annotation. 3 Average Ratio is the fold change of the average fluorescence intensity ratio when the ten UM-SCC cell lines were compared with the four normal human HKC cultures. 4 Number of Cell Lines is the number of cell lines showing a 2-fold or greater change in gene expression.
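The two summary columns described in the footnotes can be illustrated with a minimal sketch. It assumes, since the footnotes do not spell this out, that each cell line's ratio is its intensity divided by the mean HKC intensity, that Average Ratio is the mean of those per-line ratios, and that a line is counted when its ratio is at least 2 or at most 0.5. The function name and the intensity values are hypothetical and are not taken from the study data.

```python
from statistics import mean


def summarize_gene(tumor_intensities, normal_intensities, fold=2.0):
    """Return (average_ratio, n_changed) for one gene.

    Assumption: ratios are taken against the mean of the normal samples.
    """
    normal_mean = mean(normal_intensities)
    # Per-cell-line ratio of tumor intensity to mean normal intensity.
    ratios = [t / normal_mean for t in tumor_intensities]
    average_ratio = mean(ratios)
    # Count lines changed at least `fold` in either direction.
    n_changed = sum(1 for r in ratios if r >= fold or r <= 1.0 / fold)
    return average_ratio, n_changed


# Hypothetical intensities: ten tumor cell lines, four normal cultures.
tumor = [20.0, 25.0, 30.0, 40.0, 45.0, 50.0, 35.0, 30.0, 60.0, 65.0]
normal = [100.0, 110.0, 90.0, 100.0]
average_ratio, n_changed = summarize_gene(tumor, normal)
```

Under these assumptions a gene can have an average ratio below 0.5 even when fewer than ten lines individually pass the 2-fold cutoff, which is why the table reports both columns.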

Table S3. Putative transcription factor binding sites and Gene Ontology annotation in clusters A and B of HNSCC

Gene Name | Gene Symbol | Gene Ontology annotation † | Number of TFBS in promoters ‡ (p53, NF-κB, AP-1, STAT)

Cluster A:

H2A histone family, member I | HIST1H2AL | chromosome organization and biogenesis | 1
ral guanine nucleotide dissociation stimulator | RALGDS | signal transduction | 1 2
pyruvate dehydrogenase kinase, isoenzyme 2 | PDK2 | glucose metabolism, signal transduction | 2
matrilin 2 | MATN2 | extracellular matrix assembly | 1
ATPase, Na+/K+ transporting, beta 3 polypeptide | ATP1B3 | transport | 3
olfactomedin 1 | OLFM1 | morphogenesis | 1 1 2
insulin-like growth factor binding protein 2 (36kD) | IGFBP2 | regulation of cell growth | 1
histone deacetylase 5 | HDAC5 | chromatin modification, B-cell differentiation |
calreticulin | CALR | cell adhesion | 1 1
H2B histone family, member B | HIST1H2BD | chromosome organization and biogenesis | 1 1 1
H2B histone family, member R | HIST1H2BJ | chromosome organization and biogenesis | 1
B lymphoid tyrosine kinase | BLK | protein kinase cascade | 1
H2B histone family, member L | HIST1H2BC | chromosome organization and biogenesis | 1 1
H2B histone family, member D | HIST1H2BN | chromosome organization and biogenesis | 2 3
carbamoyl-phosphate synthetase 1, mitochondrial | CPS1 | metabolism | 2 1 1 1
H2B histone family, member C | HIST1H2BL | chromosome organization and biogenesis | 1
xeroderma pigmentosum, complementation group A | XPA | DNA repair | 2 1

Cluster B:

DNA replication factor | CDT1 | cell cycle | 1
aldehyde dehydrogenase 1 family, member A3 | ALDH1A3 | metabolism | 2
cortactin binding protein 1 | SHANK2 | signal transduction | 2 3
carboxypeptidase, vitellogenic-like | CPVL | proteolysis and peptidolysis | 1
interleukin 6 (interferon, beta 2) | IL6 | signal transduction; cell proliferation | 1 1 1
interleukin 8 | IL8 | signal transduction; cell proliferation | 1 1 1
carbonic anhydrase IX | CA9 | one-carbon compound metabolism | 1 1
H19, imprinted maternally expressed untranslated mRNA | H19 | carcinogenesis; imprinting |
fatty-acid-Coenzyme A ligase, very long-chain 1 | SLC27A2 | fatty acid metabolism | 2 2
GATA binding protein 2 | GATA2 | transcription | 2 1
fatty-acid-Coenzyme A ligase, long-chain 5 | ACSL5 | fatty acid metabolism | 1 1
stanniocalcin 1 | STC1 | signal transduction | 1 4
protein tyrosine phosphatase, receptor type, J | PTPRJ | signal transduction | 1 1
fatty acid desaturase 3 | FADS3 | fatty acid metabolism | 3 1
intermediate conductance Ca-activated K channel, protein 1 | KCNN4 | transport | 1 3 1
A kinase (PRKA) anchor protein (gravin) 12 | AKAP12 | signal transduction | 1 2
ATP-binding cassette, sub-family G (WHITE), member 2 | ABCG2 | response to stimulus | 3 1 1
allograft inflammatory factor 1 | AIF1 | inflammatory response; cell cycle | 3
sushi-repeat protein | SRPX2 | electron transport | 2 3
Yes-associated protein 1, 65 kD | YAP1 | signal transduction | 1
ribophorin II | RPN2 | protein metabolism | 1 1 2
pro-oncosis receptor inducing membrane injury gene | PORIMIN | oncosis-like cell death | 1 1
baculoviral IAP repeat-containing 2 | BIRC2 | anti-apoptosis; signal transduction | 2 2 2
killer cell lectin-like receptor subfamily C, member 2 | KLRC2 | cellular defense response | 1 2
intercellular adhesion molecule 1 (CD54) | ICAM1 | cell adhesion | 2 1 1
melanophilin | MLPH | transport | 1
keratin 8 | KRT8 | cell organization and biogenesis | 2 2 1 1

† From Onto-Express (Draghici et al. 2003), AmiGO (www.godatabase.org) and NCBI (www.ncbi.nlm.nih.gov). ‡ Number of transcription factor binding sites (TFBS) in the proximal region of each promoter. The average length of these promoters was adjusted to approximately 600 bp (~500 bp upstream and ~100 bp downstream) using Genomatix Suite 3.4.1 (www.genomatix.de).
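As a rough illustration of the ~600 bp promoter window, the sketch below computes window coordinates around a reference position. It assumes the window is measured relative to the transcription start site (TSS), which the footnote does not state, and it uses 0-based half-open coordinates. The function name `promoter_window` is hypothetical, and this is not how Genomatix Suite defines or extracts promoters.

```python
def promoter_window(tss, strand, upstream=500, downstream=100):
    """Return (start, end) of a half-open window around a TSS.

    Assumption: `tss` is the 0-based position of the first transcribed base.
    The window covers `upstream` bases before the TSS and `downstream`
    bases starting at the TSS, in the direction of transcription.
    """
    if strand == "+":
        return tss - upstream, tss + downstream
    if strand == "-":
        # On the minus strand, upstream lies at higher coordinates.
        return tss - downstream + 1, tss + upstream + 1
    raise ValueError("strand must be '+' or '-'")


# Hypothetical TSS position on each strand.
plus_window = promoter_window(1000, "+")
minus_window = promoter_window(1000, "-")
```

Both windows are 600 bp long and contain the TSS base itself, so binding site counts from either strand refer to regions of the same size.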
