Supplemental data

A hypoxia metagene in head and neck cancer

Stuart Winter, Francesca M Buffa, Priyamal Silva, Crispin Miller, Helen Valentine, Helen Turley, Ketan Shah, Graham Cox, Rogan Corbridge, Jarrod Homer, Brian Musgrove, Nick Slevin, Philip Sloan, Pat Price, Catharine West, Adrian Harris

Table S1: List of genes reported in the literature to be hypoxia regulated

Symbol | Name
ABCB1 | ATP-binding cassette, sub-family B (MDR/TAP), member 1
ACAT1 | acetyl-Coenzyme A acetyltransferase 1 (acetoacetyl Coenzyme A thiolase)
ADFP | adipose differentiation-related protein
ADM | adrenomedullin
ADORA2B | adenosine A2b receptor
AK2 | adenylate kinase 2
AK3 | adenylate kinase 3
ALDH1A1 | aldehyde dehydrogenase 1 family, member A1
ALDH1A3 | aldehyde dehydrogenase 1 family, member A3
ALDOA | aldolase A, fructose-bisphosphate
ALDOC | aldolase C, fructose-bisphosphate
ANGPT2 | angiopoietin 2
ANGPTL4 | angiopoietin-like 4
ANXA1 | annexin A1
ANXA2 | annexin A2
ANXA5 | annexin A5
ARHGAP5 | Rho GTPase activating protein 5
ARSE | arylsulfatase E (chondrodysplasia punctata 1)
ART1 | ADP-ribosyltransferase 1
BACE2 | beta-site APP-cleaving enzyme 2
BCL2L1 | BCL2-like 1
BCL2L2 | BCL2-like 2
BHLHB2 | basic helix-loop-helix domain containing, class B, 2
BHLHB3 | basic helix-loop-helix domain containing, class B, 3
BIK | BCL2-interacting killer (apoptosis-inducing)
BIRC2 | baculoviral IAP repeat-containing 2
BNIP3 | BCL2/adenovirus E1B 19kDa interacting protein 3
BNIP3L | BCL2/adenovirus E1B 19kDa interacting protein 3-like
BPI | bactericidal/permeability-increasing protein
BTG1 | B-cell translocation gene 1, anti-proliferative
C11orf2 | chromosome 11 open reading frame 2
CA9 | carbonic anhydrase IX
CA12 | carbonic anhydrase XII
CALD1 | caldesmon 1
CCNG2 | cyclin G2
CCT6A | chaperonin containing TCP1, subunit 6A (zeta 1)
CD99 | CD99 antigen
CDC2 | cell division cycle 2, G1 to S and G2 to M
CDKN1A | cyclin-dependent kinase inhibitor 1A (p21, Cip1)
CDKN1B | cyclin-dependent kinase inhibitor 1B (p27, Kip1)
CITED2 | Cbp/p300-interacting transactivator, with Glu/Asp-rich carboxy-terminal domain, 2
CLK1 | CDC-like kinase 1
CNOT7 | CCR4-NOT transcription complex, subunit 7
COL4A5 | collagen, type IV, alpha 5
COL5A1 | collagen, type V, alpha 1

COL5A2 | collagen, type V, alpha 2
COL5A3 | collagen, type V, alpha 3
CP | ceruloplasmin (ferroxidase)
CTSD | cathepsin D (lysosomal aspartyl )
CXCR4 | chemokine (C-X-C motif) receptor 4
D4S234E | DNA segment on chromosome 4 (unique) 234 expressed sequence
DDIT3 | DNA-damage-inducible transcript 3
DDIT4 | DNA-damage-inducible transcript 4
DDX48 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 48
DEC1 | deleted in esophageal cancer 1
DKC1 | dyskeratosis congenita 1, dyskerin
DR1 | down-regulator of transcription 1, TBP-binding (negative cofactor 2)
EDN1 | endothelin 1
EDN2 | endothelin 2
EFNA1 | ephrin-A1
EGF | epidermal growth factor (beta-urogastrone)
EGR1 | early growth response 1
ELF3 | E74-like factor 3 (ets domain transcription factor, epithelial-specific)
ELL2 | elongation factor, RNA polymerase II, 2
ENG | endoglin (Osler-Rendu-Weber syndrome 1)
ENO1 | enolase 1, (alpha)
ENO3 | enolase 3 (beta, muscle)
ENPEP | glutamyl aminopeptidase (aminopeptidase A)
EPO | erythropoietin
ETS1 | v-ets erythroblastosis virus E26 oncogene homolog 1 (avian)
F3 | coagulation factor III (thromboplastin, tissue factor)
FABP5 | fatty acid binding protein 5 (psoriasis-associated)
FGF3 | fibroblast growth factor 3 (murine mammary tumor virus integration site (v-int-2) oncogene homolog)
FKBP4 | FK506 binding protein 4, 59kDa
FLT1 | fms-related tyrosine kinase 1 (vascular endothelial growth factor/vascular permeability factor receptor)
FN1 | fibronectin 1
FOS | v-fos FBJ murine osteosarcoma viral oncogene homolog
FTL | ferritin, light polypeptide
G22P1 | thyroid autoantigen 70kDa (Ku antigen)
GAPD | glyceraldehyde-3-phosphate dehydrogenase
GBE1 | glucan (1,4-alpha-), branching enzyme 1 (glycogen branching enzyme, Andersen disease, glycogen storage disease type IV)
GLRX | glutaredoxin (thioltransferase)
GPCR5A | G protein-coupled receptor, family C, group 5, member A
GPI | glucose phosphate isomerase
HAP1 | huntingtin-associated protein 1 (neuroan 1)
HBP1 | HMG-box transcription factor 1
HDAC1 | histone deacetylase 1
HDAC9 | histone deacetylase 9
HERC3 | hect domain and RLD 3
HERPUD1 | homocysteine-inducible, endoplasmic reticulum stress-inducible, ubiquitin-like domain member 1
HGF | hepatocyte growth factor (hepapoietin A; scatter factor)
HIF1A | hypoxia-inducible factor 1, alpha subunit (basic helix-loop-helix transcription factor)
HIG2 | hypoxia-inducible protein 2
HK1 | hexokinase 1
HK2 | hexokinase 2
HLA-DQB1 | major histocompatibility complex, class II, DQ beta 1
HMOX1 | heme oxygenase (decycling) 1

HMOX2 | heme oxygenase (decycling) 2
HSPA5 | heat shock 70kDa protein 5 (glucose-regulated protein, 78kDa)
HSPD1 | heat shock 60kDa protein 1 (chaperonin)
HSPH1 | heat shock 105kDa/110kDa protein 1
HYOU1 | hypoxia up-regulated 1
ICAM1 | intercellular adhesion molecule 1 (CD54), human rhinovirus receptor
ID2 | inhibitor of DNA binding 2, dominant negative helix-loop-helix protein
IFI27 | interferon, alpha-inducible protein 27
IGF2 | insulin-like growth factor 2 (somatomedin A)
IGFBP1 | insulin-like growth factor binding protein 1
IGFBP2 | insulin-like growth factor binding protein 2, 36kDa
IGFBP3 | insulin-like growth factor binding protein 3
IGFBP5 | insulin-like growth factor binding protein 5
IL6 | interleukin 6 (interferon, beta 2)
IL8 | interleukin 8
INSIG1 | insulin induced gene 1
IRF6 | interferon regulatory factor 6
ITGA5 | integrin, alpha 5 (fibronectin receptor, alpha polypeptide)
JUN | v-jun sarcoma virus 17 oncogene homolog (avian)
KDR | kinase insert domain receptor (a type III receptor tyrosine kinase)
KRT14 | keratin 14 (epidermolysis bullosa simplex, Dowling-Meara, Koebner)
KRT18 | keratin 18
KRT19 | keratin 19
LDHA | lactate dehydrogenase A
LDHB | lactate dehydrogenase B
LEP | leptin (obesity homolog, mouse)
LGALS1 | lectin, galactoside-binding, soluble, 1 (galectin 1)
LOX | lysyl oxidase
LRP1 | low density lipoprotein-related protein 1 (alpha-2-macroglobulin receptor)
MAP4 | microtubule-associated protein 4
MET | met proto-oncogene (hepatocyte growth factor receptor)
MIG-6 | Gene 33/Mig-6 (MIG-6)
MIF | macrophage migration inhibitory factor (glycosylation-inhibiting factor)
MMP13 | matrix metalloproteinase 13 (collagenase 3)
MMP2 | matrix metalloproteinase 2 (gelatinase A, 72kDa gelatinase, 72kDa type IV collagenase)
MMP7 | matrix metalloproteinase 7 (matrilysin, uterine)
MPI | mannose phosphate isomerase
MT1L | metallothionein 1L
MT-CO1 | mitochondrially encoded cytochrome c oxidase I
MT-CO2 | mitochondrially encoded cytochrome c oxidase II
MTL3 | metallothionein-like 3
MUC1 | mucin 1, transmembrane
MXI1 | MAX interactor 1
NDRG1 | N-myc downstream regulated gene 1
NFIL3 | nuclear factor, interleukin 3 regulated
NFKB1 | nuclear factor of kappa light polypeptide gene enhancer in B-cells 1 (p105)
NFKB2 | nuclear factor of kappa light polypeptide gene enhancer in B-cells 2 (p49/p100)
NOS1 | nitric oxide synthase 1 (neuronal)
NOS2A | nitric oxide synthase 2A (inducible, hepatocytes)
NOS2B | nitric oxide synthase 2B
NOS2C | nitric oxide synthase 2C
NOS3 | nitric oxide synthase 3 (endothelial cell)
NP | nucleoside phosphorylase
NR3C1 | nuclear receptor subfamily 3, group C, member 1 (glucocorticoid receptor)
NR4A1 | nuclear receptor subfamily 4, group A, member 1

NT5E | 5'-nucleotidase, ecto (CD73)
ODC1 | ornithine decarboxylase 1
P4HA1 | procollagen-proline, 2-oxoglutarate 4-dioxygenase (proline 4-hydroxylase), alpha polypeptide I
P4HA2 | procollagen-proline, 2-oxoglutarate 4-dioxygenase (proline 4-hydroxylase), alpha polypeptide II
PAICS | phosphoribosylaminoimidazole carboxylase, phosphoribosylaminoimidazole succinocarboxamide synthetase
PDGFB | platelet-derived growth factor beta polypeptide (simian sarcoma viral (v-sis) oncogene homolog)
PDK3 | pyruvate dehydrogenase kinase, isoenzyme 3
PFKFB1 | 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 1
PFKFB3 | 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 3
PFKFB4 | 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 4
PFKL | phosphofructokinase, liver
PGAM1 | phosphoglycerate mutase 1 (brain)
PGF | placental growth factor, vascular endothelial growth factor-related protein
PGK1 | phosphoglycerate kinase 1
PGK2 | phosphoglycerate kinase 2
PGM1 | phosphoglucomutase 1
PIM1 | pim-1 oncogene
PIM2 | pim-2 oncogene
PKM2 | pyruvate kinase, muscle
PLAU | plasminogen activator, urokinase
PLAUR | plasminogen activator, urokinase receptor
PLOD2 | procollagen-lysine, 2-oxoglutarate 5-dioxygenase 2
PNN | pinin, desmosome associated protein
POLM | polymerase (DNA directed), mu
PPARA | peroxisome proliferative activated receptor, alpha
PPAT | phosphoribosyl pyrophosphate amidotransferase
PROK1 | prokineticin 1
PRSS15 | protease, serine, 15
PSMA3 | proteasome (prosome, macropain) subunit, alpha type, 3
PSMD9 | proteasome (prosome, macropain) 26S subunit, non-ATPase, 9
PTGS1 | prostaglandin-endoperoxide synthase 1 (prostaglandin G/H synthase and cyclooxygenase)
PTGS2 | prostaglandin-endoperoxide synthase 2 (prostaglandin G/H synthase and cyclooxygenase)
PTPNS1 | protein tyrosine phosphatase, non-receptor type substrate 1
QSCN6 | quiescin Q6
RBPSUH | recombining binding protein suppressor of hairless (Drosophila)
RELA | v-rel reticuloendotheliosis viral oncogene homolog A, nuclear factor of kappa light polypeptide gene enhancer in B-cells 3, p65 (avian)
RIOK3 | RIO kinase 3 (yeast)
RNASEL | ribonuclease L (2',5'-oligoisoadenylate synthetase-dependent)
RNU3IP2 | RNA, U3 small nucleolar interacting protein 2
RPL36A | ribosomal protein L36a
RUTBC1 | RUN and TBC1 domain containing 1
SAT | spermidine/spermine N1-acetyltransferase
SERPINB2 | serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 2
SERPINE1 | serine (or cysteine) proteinase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1), member 1
SFRS6 | splicing factor, arginine/serine-rich 6
SIAH2 | seven in absentia homolog 2 (Drosophila)
SIN3A | SIN3 homolog A, transcription regulator (yeast)
SLC16A1 | solute carrier family 16 (monocarboxylic acid transporters), member 1

SLC16A2 | solute carrier family 16 (monocarboxylic acid transporters), member 2
SLC20A1 | solute carrier family 20 (phosphate transporter), member 1
SLC2A1 | solute carrier family 2 (facilitated glucose transporter), member 1
SLC2A3 | solute carrier family 2 (facilitated glucose transporter), member 3
SLC3A2 | solute carrier family 3 (activators of dibasic and neutral amino acid transport), member 2
SLC6A10 | solute carrier family 6 (neurotransmitter transporter, creatine), member 10
SLC6A16 | solute carrier family 6, member 16
SLC6A6 | solute carrier family 6 (neurotransmitter transporter, taurine), member 6
SLC6A8 | solute carrier family 6 (neurotransmitter transporter, creatine), member 8
SNFT | Jun dimerization protein p21SNFT
SORL1 | sortilin-related receptor, L(DLR class) A repeats-containing
SPP1 | secreted phosphoprotein 1 (osteopontin, bone sialoprotein I, early T-lymphocyte activation 1)
SSSCA1 | Sjogren's syndrome/scleroderma autoantigen 1
STC2 | stanniocalcin 2
STRA13 | stimulated by retinoic acid 13 homolog (mouse)
SYT7 | synaptotagmin VII
TBPL1 | TBP-like 1
TCEAL1 | transcription elongation factor A (SII)-like 1
TEK | TEK tyrosine kinase, endothelial (venous malformations, multiple cutaneous and mucosal)
TF | transferrin
TFF3 | trefoil factor 3 (intestinal)
TFRC | transferrin receptor (p90, CD71)
TGFA | transforming growth factor, alpha
TGFB1 | transforming growth factor, beta 1 (Camurati-Engelmann disease)
TGFB3 | transforming growth factor, beta 3
TGFBI | transforming growth factor, beta-induced, 68kDa
TGM2 | transglutaminase 2 (C polypeptide, protein-glutamine-gamma-glutamyltransferase)
TH | tyrosine hydroxylase
THBS1 | thrombospondin 1
THBS2 | thrombospondin 2
TIMM17A | translocase of inner mitochondrial membrane 17 homolog A (yeast)
TNFAIP3 | tumor necrosis factor, alpha-induced protein 3
TP53 | tumor protein p53 (Li-Fraumeni syndrome)
TPBG | trophoblast glycoprotein
TPD52 | tumor protein D52
TPI1 | triosephosphate isomerase 1
TRA1 | tumor rejection antigen (gp96) 1
TXN | thioredoxin
TXNIP | thioredoxin interacting protein
UMPS | uridine monophosphate synthetase (orotate phosphoribosyl transferase and orotidine-5'-decarboxylase)
VEGF | vascular endothelial growth factor
VEGFB | vascular endothelial growth factor B
VEGFC | vascular endothelial growth factor C
VIM | vimentin
VPS11 | vacuolar protein sorting 11 (yeast)

Information taken from Denko et al., 2003; Harris, 2002; Le et al., 2004; Papandreou et al., 2005; Semenza, 2003.

Table S2: The list of up-regulated genes

Seed genes (probesets): ADM (202912_at), AK3 (204347_at), CA9 (205199_at), ENO1 (201231_s_at), HK2 (202934_at), PGK1 (200737_at), SLC2A1 (201249_at), VEGF (210513_s_at)

Gene | Description | Affy ID | D | Correlations with seed genes
MTX1 | metaxin 1 | 210386_s_at | 0.9 | 0.46 0.52 0.49 0.41 0.45 0.50 0.43 0.51
ADORA2B | adenosine A2b receptor | 205891_at | 0.8 | 0.43 0.51 0.48 0.47 0.55 0.56 0.41 0.48
AK3 | adenylate kinase 3 | 230630_at | 0.8 | 0.57 0.89 0.52 0.49 0.57 0.55 0.73 0.47
ALDOA | aldolase A, fructose-bisphosphate | 238996_x_at | 0.8 | 0.52 0.54 0.56 0.43 0.57 0.65 0.59 0.62
ALDOA | aldolase A, fructose-bisphosphate | 214687_x_at | 0.8 | 0.41 0.46 0.60 0.45 0.54 0.66 0.43 0.61
ALDOA | aldolase A, fructose-bisphosphate | 200966_x_at | 0.8 | 0.39 0.44 0.59 0.44 0.53 0.65 0.42 0.61
ANGPTL4 | angiopoietin-like 4 | 221009_s_at | 0.8 | 0.48 0.43 0.60 0.48 0.54 0.58 0.63 0.57
ANGPTL4 | angiopoietin-like 4 | 223333_s_at | 0.8 | 0.47 0.45 0.53 0.54 0.50 0.59 0.61 0.53
C20orf20 | chromosome 20 open reading frame 20 | 218586_at | 0.8 | 0.43 0.52 0.61 0.56 0.48 0.60 0.46 0.49
MRPS17 | mitochondrial ribosomal protein S17 | 218982_s_at | 0.8 | 0.52 0.51 0.45 0.54 0.62 0.47 0.50
PGF | placental growth factor | 209652_s_at | 0.8 | 0.38 0.50 0.62 0.51 0.48 0.63 0.43 0.59
PGK1 | phosphoglycerate kinase 1 | 200738_s_at | 0.8 | 0.49 0.52 0.62 0.68 0.55 0.96 0.52 0.59
PGK1 | phosphoglycerate kinase 1 | 227068_at | 0.8 | 0.53 0.56 0.54 0.63 0.59 0.89 0.53 0.52
PGK1 | phosphoglycerate kinase 1 | 217356_s_at | 0.8 | 0.44 0.53 0.60 0.67 0.53 0.95 0.44 0.55
AFARP1 | AKR7 family pseudogene | 202235_at | 0.7 | 0.56 0.74 0.67 0.54 0.55 0.61 0.71
AFARP1 | AKR7 family pseudogene | 202234_s_at | 0.7 | 0.54 0.72 0.63 0.59 0.51 0.62 0.65
AFARP1 | AKR7 family pseudogene | 1557918_s_at | 0.7 | 0.40 0.62 0.61 0.47 0.51 0.57 0.63
AK3 | adenylate kinase 3 | 204348_s_at | 0.7 | 0.58 0.95 0.56 0.55 0.68 0.62 0.63
AK3 | adenylate kinase 3 | 225342_at | 0.7 | 0.48 0.88 0.52 0.48 0.60 0.57 0.63
ANLN | anillin, actin binding protein | 222608_s_at | 0.7 | 0.48 0.52 0.58 0.49 0.63 0.46
B4GALT2 | UDP-Gal | 209413_at | 0.7 | 0.47 0.44 0.44 0.50 0.45 0.46 0.49
BCAR1 | breast cancer anti-estrogen resistance 1 | 223116_at | 0.7 | 0.52 0.52 0.44 0.56 0.45 0.48 0.48
BMS1L | BMS1-like, ribosome assembly protein | 203082_at | 0.7 | 0.40 0.40 0.52 0.50 0.50 0.54
BNIP3 | BCL2/adenovirus E1B interacting protein 3 | 201848_s_at | 0.7 | 0.39 0.52 0.56 0.47 0.53 0.57 0.44
HOMER1 | homer homolog 1 (Drosophila) | 213793_s_at | 0.7 | 0.60 0.56 0.49 0.55 0.50 0.60
HSPC163 | HSPC163 protein | 218728_s_at | 0.7 | 0.58 0.51 0.45 0.65 0.62 0.46
IMP-2 | IGF-II mRNA-binding protein 2 | 218847_at | 0.7 | 0.56 0.57 0.42 0.56 0.49 0.44 0.53
KIAA1393 | KIAA1393 | 227653_at | 0.7 | 0.48 0.65 0.49 0.46 0.49 0.54
LDHA | lactate dehydrogenase A | 200650_s_at | 0.7 | 0.48 0.58 0.63 0.60 0.41 0.59 0.58

LDLR | low density lipoprotein receptor | 202068_s_at | 0.7 | 0.43 0.48 0.52 0.42 0.62 0.50 0.53
MGC2654 | hypothetical protein MGC2654 | 218945_at | 0.7 | 0.43 0.45 0.65 0.46 0.50 0.61 0.47
MNAT1 | menage a trois 1 (CAK assembly factor) | 203565_s_at | 0.7 | 0.42 0.55 0.53 0.52 0.44 0.57
NDRG1 | N-myc downstream regulated gene 1 | 200632_s_at | 0.7 | 0.51 0.63 0.52 0.58 0.54 0.56 0.67
NME1 | non-metastatic cells 1, protein (NM23A) | 201577_at | 0.7 | 0.36 0.42 0.57 0.52 0.61 0.39
P4HA1 | procollagen-proline, 2-oxoglutarate 4-dioxygenase, alpha polypeptide I | 207543_s_at | 0.7 | 0.48 0.51 0.69 0.53 0.66 0.45 0.58
PFKFB4 | 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 4 | 228499_at | 0.7 | 0.39 0.50 0.62 0.50 0.50 0.61 0.42
PGAM1 | phosphoglycerate mutase 1 | 200886_s_at | 0.7 | 0.41 0.57 0.56 0.49 0.52 0.48 0.56
PGK1 | phosphoglycerate kinase 1 | 200737_at | 0.7 | 0.54 0.59 0.68 0.70 0.58 0.56 0.59
PVR | poliovirus receptor | 212662_at | 0.7 | 0.54 0.52 0.52 0.60 0.47 0.52 0.49
SLC16A1 | solute carrier family 16 (monocarboxylic acid transporters), member 1 | 209900_s_at | 0.7 | 0.51 0.72 0.61 0.59 0.54 0.63 0.70
SLC16A1 | solute carrier family 16 (monocarboxylic acid transporters), member 1 | 202236_s_at | 0.7 | 0.49 0.66 0.61 0.49 0.59 0.58 0.68
SLC2A1 | glucose transporter 1 | 201250_s_at | 0.7 | 0.62 0.67 0.54 0.52 0.70 0.62 0.88
TEAD4 | TEA domain family member 4 | 41037_at | 0.7 | 0.47 0.53 0.52 0.45 0.46 0.48
TPBG | trophoblast glycoprotein | 203476_at | 0.7 | 0.37 0.56 0.53 0.49 0.47 0.41 0.54
TPI1 | triosephosphate isomerase 1 | 213011_s_at | 0.7 | 0.47 0.48 0.58 0.43 0.56 0.60 0.60
TPI1 | triosephosphate isomerase 1 | 200822_x_at | 0.7 | 0.47 0.47 0.58 0.42 0.54 0.60 0.60
TUBB2 | tubulin, beta, 2 | 208977_x_at | 0.7 | 0.40 0.50 0.44 0.54 0.50 0.47 0.49
VEGF | vascular endothelial growth factor | 210512_s_at | 0.7 | 0.71 0.48 0.41 0.54 0.60 0.50 0.90
VEZATIN | transmembrane protein vezatin | 223675_s_at | 0.7 | 0.46 0.52 0.46 0.44 0.45 0.56
AD-003 | AD-003 protein | 223368_s_at | 0.6 | 0.48 0.41 0.60 0.60 0.47
AK3 | adenylate kinase 3 | 204347_at | 0.6 | 0.54 0.59 0.50 0.58 0.59 0.64
ANKRD9 | ankyrin repeat domain 9 | 230972_at | 0.6 | 0.52 0.59 0.55 0.58 0.51 0.62
ANLN | anillin, actin binding protein | 1552619_a_at | 0.6 | 0.45 0.52 0.57 0.51 0.62 0.42
BNIP3 | BCL2/adenovirus E1B interacting protein 3 | 201849_at | 0.6 | 0.43 0.54 0.45 0.50 0.54 0.43
C14orf156 | chromosome 14 open reading frame 156 | 221434_s_at | 0.6 | 0.54 0.51 0.49 0.43 0.50
C15orf25 | chromosome 15 open reading frame 25 | 229208_at | 0.6 | 0.37 0.53 0.47 0.49 0.49
CA12 | carbonic anhydrase XII | 203963_at | 0.6 | 0.56 0.64 0.45 0.70 0.49 0.63
CA12 | carbonic anhydrase XII | 210735_s_at | 0.6 | 0.55 0.64 0.43 0.69 0.48 0.64
CA12 | carbonic anhydrase XII | 204508_s_at | 0.6 | 0.49 0.63 0.42 0.71 0.50 0.62
CA12 | carbonic anhydrase XII | 214164_x_at | 0.6 | 0.48 0.59 0.42 0.70 0.48 0.61
CA9 | carbonic anhydrase IX | 205199_at | 0.6 | 0.41 0.59 0.50 0.68 0.61 0.48
CDCA4 | cell division cycle associated 4 | 218399_s_at | 0.6 | 0.40 0.42 0.52 0.54 0.64 0.57
COL4A5 | collagen, type IV, alpha 5 (Alport syndrome) | 213110_s_at | 0.6 | 0.53 0.46 0.43 0.43 0.45

CORO1C | coronin, actin binding protein, 1C | 222409_at | 0.6 | 0.59 0.60 0.55 0.56 0.53 0.52
CTEN | C-terminal tensin-like | 230398_at | 0.6 | 0.41 0.49 0.49 0.45 0.43 0.53
DKFZP564D166 | putative ankyrin-repeat containing protein | 224952_at | 0.6 | 0.47 0.53 0.50 0.48 0.46 0.39
DPM2 | dolichyl-phosphate mannosyltransferase polypeptide 2, regulatory subunit | 209391_at | 0.6 | 0.39 0.54 0.58 0.43 0.50 0.57
EIF2S1 | eukaryotic translation initiation factor 2, subunit 1 alpha, 35kDa | 201144_s_at | 0.6 | 0.39 0.60 0.41 0.43 0.45 0.41
GAPD | glyceraldehyde-3-phosphate dehydrogenase | AFFX-HUMGAPDH/M33197_5_at | 0.6 | 0.45 0.41 0.46 0.47 0.56 0.56
GMFB | glia maturation factor, beta | 202543_s_at | 0.6 | 0.40 0.56 0.45 0.48 0.41 0.38
GSS | glutathione synthetase | 201415_at | 0.6 | 0.39 0.39 0.48 0.45 0.41
HES2 | hairy and enhancer of split 2 (Drosophila) | 214521_at | 0.6 | 0.40 0.52 0.44 0.44 0.49 0.50
HIG2 | hypoxia-inducible protein 2 | 1554452_a_at | 0.6 | 0.62 0.43 0.61 0.54 0.39 0.53
HSPC163 | HSPC163 protein | 223993_s_at | 0.6 | 0.54 0.51 0.69 0.60 0.41
IL8 | interleukin 8 | 202859_x_at | 0.6 | 0.48 0.45 0.49 0.61 0.61 0.47
KCTD11 | potassium channel tetramerisation domain containing 11 | 235857_at | 0.6 | 0.46 0.48 0.58 0.61 0.49 0.63
KRT17 | keratin 17 | 205157_s_at | 0.6 | 0.54 0.49 0.42 0.58 0.45 0.50
Kua | ubiquitin-conjugating enzyme variant Kua | 223186_at | 0.6 | 0.42 0.45 0.46 0.44 0.40 0.51
LOC149464 | hypothetical protein LOC149464 | 232823_at | 0.6 | 0.56 0.49 0.42 0.41 0.65 0.49
LOC56901 | NADH | 218484_at | 0.6 | 0.49 0.52 0.47 0.51 0.49 0.50
Lrp2bp | low density lipoprotein receptor-related protein binding protein | 227337_at | 0.6 | 0.46 0.38 0.50 0.48 0.40 0.54
MGC14560 | protein x 0004 | 218461_at | 0.6 | 0.40 0.37 0.47 0.58 0.41
MGC17624 | MGC17624 protein | 227806_at | 0.6 | 0.51 0.54 0.44 0.48 0.48 0.55
MGC2408 | hypothetical protein MGC2408 | 227103_s_at | 0.6 | 0.40 0.49 0.47 0.41 0.41
MIF | macrophage migration inhibitory factor (glycosylation-inhibiting factor) | 217871_s_at | 0.6 | 0.53 0.46 0.50 0.63 0.43 0.55
MRPL14 | mitochondrial ribosomal protein L14 | 225201_s_at | 0.6 | 0.39 0.43 0.44 0.43 0.40
NUDT15 | nudix (nucleoside diphosphate linked moiety X)-type motif 15 | 219347_at | 0.6 | 0.40 0.52 0.51 0.55 0.59 0.55
PAWR | PRKC, apoptosis, WT1, regulator | 226231_at | 0.6 | 0.42 0.59 0.50 0.44 0.43
PDZK11 | PDZ domain containing 11 | 223037_at | 0.6 | 0.49 0.60 0.43 0.64 0.41
PLAU | plasminogen activator, urokinase | 205479_s_at | 0.6 | 0.50 0.39 0.49 0.39 0.39
PLEKHG3 | pleckstrin homology domain containing, family G (with RhoGef domain) member 3 | 212821_at | 0.6 | 0.42 0.52 0.45 0.50 0.45 0.59
PPARD | peroxisome proliferative activated receptor, delta | 37152_at | 0.6 | 0.59 0.53 0.51 0.46 0.48 0.56
PPP2CZ | 2a, catalytic subunit, zeta isoform | 200885_at | 0.6 | 0.56 0.47 0.46 0.53 0.46 0.73
PPP4R1 | protein phosphatase 4, regulatory subunit 1 | 201594_s_at | 0.6 | 0.41 0.62 0.55 0.62 0.47 0.60
PSMA7 | proteasome (prosome, macropain) subunit, alpha type, 7 | 201114_x_at | 0.6 | 0.52 0.52 0.53 0.52 0.43
PSMB7 | proteasome (prosome, macropain) subunit, beta type, 7 | 200786_at | 0.6 | 0.38 0.51 0.48 0.52 0.53
PSMD2 | proteasome (prosome, macropain) 26S subunit, non-ATPase, 2 | 200830_at | 0.6 | 0.40 0.50 0.52 0.45 0.48
PTGFRN | prostaglandin F2 receptor negative regulator | 224950_at | 0.6 | 0.43 0.59 0.54 0.53 0.48 0.59
PTGFRN | prostaglandin F2 receptor negative regulator | 224937_at | 0.6 | 0.51 0.55 0.49 0.51 0.52 0.59
PYGL | phosphorylase, glycogen | 202990_at | 0.6 | 0.42 0.64 0.49 0.59 0.47 0.54
RAN | RAN, member RAS oncogene family | 200750_s_at | 0.6 | 0.48 0.50 0.42 0.48 0.44
RNF24 | ring finger protein 24 | 204669_s_at | 0.6 | 0.46 0.39 0.46 0.42 0.40 0.49
RNPS1 | RNA binding protein S1, serine-rich domain | 200060_s_at | 0.6 | 0.37 0.56 0.44 0.48 0.44 0.54
RUVBL2 | RuvB-like 2 (E. coli) | 201459_at | 0.6 | 0.43 0.47 0.50 0.56 0.46
S100A10 | S100 calcium binding protein A10 (annexin II ligand, calpactin I, light polypeptide (p11)) | 200872_at | 0.6 | 0.43 0.53 0.45 0.62 0.42 0.45
S100A3 | S100 calcium binding protein A3 | 206027_at | 0.6 | 0.50 0.53 0.52 0.46 0.42 0.48
SIP1 | survival of motor neuron protein interacting protein 1 | 211114_x_at | 0.6 | 0.42 0.44 0.51 0.42 0.55
SLC2A1 | solute carrier family 2 (facilitated glucose transporter), member 1 | 201249_at | 0.6 | 0.59 0.64 0.61 0.45 0.56 0.56
SLC6A10 | solute carrier family 6 (neurotransmitter transporter, creatine), member 10 | 215812_s_at | 0.6 | 0.53 0.48 0.68 0.50 0.48 0.64
SLC6A8 | solute carrier family 6 (neurotransmitter transporter, creatine), member 8 | 202219_at | 0.6 | 0.57 0.56 0.70 0.56 0.52 0.68
SLC6A8 | solute carrier family 6 (neurotransmitter transporter, creatine), member 8 | 210854_x_at | 0.6 | 0.52 0.47 0.68 0.51 0.48 0.65
SLC6A8 | solute carrier family 6 (neurotransmitter transporter, creatine), member 8 | 213843_x_at | 0.6 | 0.52 0.47 0.68 0.47 0.48 0.64
SLCO1B3 | solute carrier organic anion transporter family, member 1B3 | 206354_at | 0.6 | 0.47 0.42 0.53 0.50 0.55 0.46
SMILE | SMILE protein | 1560017_at | 0.6 | 0.39 0.46 0.48 0.41 0.52
SNX24 | sorting nexing 24 | 222716_s_at | 0.6 | 0.40 0.45 0.51 0.43 0.50 0.52
SPTB | spectrin, beta, erythrocytic (includes spherocytosis, clinical type I) | 229952_at | 0.6 | 0.56 0.54 0.49 0.55 0.48 0.45
TEAD4 | TEA domain family member 4 | 204281_at | 0.6 | 0.46 0.50 0.41 0.43 0.44
TFAP2C | transcription factor AP-2 gamma (activating enhancer binding protein 2 gamma) | 205286_at | 0.6 | 0.41 0.53 0.52 0.55 0.45 0.40
TIMM23 | translocase of inner mitochondrial membrane 23 homolog (yeast) | 218119_at | 0.6 | 0.37 0.59 0.42 0.43 0.48
TMEM30B | transmembrane protein 30B | 213285_at | 0.6 | 0.50 0.57 0.46 0.61 0.47 0.42
TPD52L2 | tumor protein D52-like 2 | 201379_s_at | 0.6 | 0.57 0.53 0.53 0.47 0.53 0.52
TUBB2 | tubulin, beta, 2 | 213726_x_at | 0.6 | 0.41 0.53 0.45 0.55 0.51 0.49
VAPB | VAMP (vesicle-associated membrane protein)-associated protein B and C | 202550_s_at | 0.6 | 0.59 0.58 0.49 0.41 0.60 0.40
VEGF | vascular endothelial growth factor | 212171_x_at | 0.6 | 0.68 0.50 0.60 0.62 0.50 0.95
VEGF | vascular endothelial growth factor | 211527_x_at | 0.6 | 0.64 0.50 0.59 0.63 0.47 0.98
XPO5 | exportin 5 | 223056_s_at | 0.6 | 0.41 0.48 0.45 0.48 0.57

Table S3: The list of down-regulated genes

Seed genes (probesets): ADM (202912_at), AK3 (204347_at), CA9 (205199_at), CCNG2 (202769_at), ENO1 (201231_s_at), HK2 (202934_at), PFKFB3 (202464_s_at), PGK1 (200737_at), SLC2A1 (201249_at), VEGF (210513_s_at)

Gene | Description | Affy ID | D | Correlations with seed genes
EVI2B | ecotropic viral integration site 2B | 211742_s_at | 0.8 | -0.56 -0.55 0.51 -0.48 -0.49 -0.51 -0.57 -0.45
GIMAP1 | GTPase, IMAP family member 1 | 1552316_a_at | 0.8 | -0.60 -0.59 0.53 -0.47 -0.50 -0.53 -0.56 -0.48
LOC91526 | hypothetical protein DKFZp434D2328 | 226641_at | 0.8 | -0.60 -0.59 -0.42 0.54 -0.60 -0.50 -0.59 -0.51
ZNFN1A1 | zinc finger protein, subfamily 1A, 1 (Ikaros) | 220704_at | 0.8 | -0.58 -0.63 0.51 -0.47 -0.53 -0.56 -0.49 -0.50
ARHGAP15 | Rho GTPase activating protein 15 | 244061_at | 0.7 | -0.56 -0.56 0.54 -0.52 -0.52 -0.51 -0.54
ARL6IP5 | ADP-ribosylation-like factor 6 interacting protein 5 | 200760_s_at | 0.7 | -0.51 -0.56 0.51 -0.60 -0.55 -0.48 -0.47
ATM | ataxia telangiectasia mutated (includes complementation groups A, C and D) | 212672_at | 0.7 | -0.55 -0.52 0.50 -0.54 -0.50 -0.56 -0.53
ATP8A1 | ATPase, aminophospholipid transporter (APLT), Class I, type 8A, member 1 | 213106_at | 0.7 | -0.56 -0.55 0.51 -0.48 -0.49 -0.46 -0.49
ATP8B1 | ATPase, Class I, type 8B, member 1 | 238055_at | 0.7 | -0.57 -0.65 -0.54 -0.56 -0.56 -0.57 -0.47
CD28 | CD28 antigen (Tp44) | 206545_at | 0.7 | -0.47 -0.60 0.49 -0.48 -0.49 -0.44 -0.45
CUGBP2 | CUG triplet repeat, RNA binding protein 2 | 242268_at | 0.7 | -0.48 -0.57 0.51 -0.47 -0.55 -0.50 -0.51
CUGBP2 | CUG triplet repeat, RNA binding protein 2 | 202157_s_at | 0.7 | -0.52 -0.64 -0.47 0.51 -0.59 -0.57 -0.54
DKFZP564O0823 | DKFZP564O0823 protein | 225809_at | 0.7 | -0.51 -0.60 -0.46 0.54 -0.51 -0.46 -0.52
DOCK2 | dedicator of cytokinesis 2 | 213160_at | 0.7 | -0.51 -0.58 0.54 -0.46 -0.50 -0.46 -0.54
ENPP2 | ectonucleotide pyrophosphatase/phosphodiesterase 2 (autotaxin) | 209392_at | 0.7 | -0.52 -0.63 -0.50 -0.53 -0.60 -0.55 -0.47
FRZB | frizzled-related protein | 203697_at | 0.7 | -0.57 -0.64 -0.49 0.47 -0.49 -0.57 -0.53
GIMAP7 | GTPase, IMAP family member 7 | 228071_at | 0.7 | -0.61 -0.61 0.51 -0.51 -0.53 -0.55 -0.52
HA | minor histocompatibility antigen HA | 212873_at | 0.7 | -0.60 -0.63 0.50 -0.52 -0.48 -0.52 -0.47
HEM1 | hematopoietic protein 1 | 238668_at | 0.7 | -0.56 -0.55 -0.48 -0.50 -0.48 -0.51 -0.47
INPP5D | inositol polyphosphate-5-phosphatase, 145kDa | 203332_s_at | 0.7 | -0.57 -0.53 0.58 -0.47 -0.48 -0.46 -0.43
PTPRC | protein tyrosine phosphatase, receptor type, C | 212588_at | 0.7 | -0.54 -0.58 0.49 -0.47 -0.53 -0.50 -0.56
TMEM1 | transmembrane protein 1 | 215269_at | 0.7 | -0.46 -0.51 -0.45 -0.56 -0.44 -0.46 -0.52
WASPIP | Wiskott-Aldrich syndrome protein interacting protein | 240154_at | 0.7 | -0.52 -0.61 -0.49 -0.51 -0.53 -0.62 -0.55
ZNFN1A1 | zinc finger protein, subfamily 1A, 1 (Ikaros) | 205038_at | 0.7 | -0.61 -0.65 0.52 -0.56 -0.49 -0.55 -0.49
ABCB1 | ATP-binding cassette, sub-family B (MDR/TAP), member 1 | 209994_s_at | 0.6 | -0.55 -0.51 0.56 -0.52 -0.57 -0.47
AP1GBP1 | AP1 gamma subunit binding protein 1 | 64418_at | 0.6 | -0.53 -0.54 0.49 -0.59 -0.51 -0.56
ARHGAP15 | Rho GTPase activating protein 15 | 218870_at | 0.6 | -0.59 -0.60 -0.51 -0.49 -0.56 -0.50
ARHGEF6 | Rac/Cdc42 guanine nucleotide exchange factor (GEF) 6 | 209539_at | 0.6 | -0.55 -0.60 -0.57 -0.47 -0.59 -0.47
BCL2 | B-cell CLL/lymphoma 2 | 203685_at | 0.6 | -0.52 -0.60 0.51 -0.50 -0.52 -0.55
BCL2 | B-cell CLL/lymphoma 2 | 232210_at | 0.6 | -0.54 -0.60 0.52 -0.48 -0.55 -0.57
C13orf18 | chromosome 13 open reading frame 18 | 44790_s_at | 0.6 | -0.50 -0.53 -0.59 -0.52 -0.58 -0.45
C6orf32 | chromosome 6 open reading frame 32 | 209829_at | 0.6 | -0.59 -0.54 -0.48 -0.45 -0.46 -0.46
CCL19 | chemokine (C-C motif) ligand 19 | 210072_at | 0.6 | -0.63 -0.59 0.53 -0.49 -0.47 -0.46
CD48 | CD48 antigen (B-cell membrane protein) | 204118_at | 0.6 | -0.56 -0.55 0.49 -0.47 -0.43 -0.55
CD79A | CD79A antigen (immunoglobulin-associated alpha) | 205049_s_at | 0.6 | -0.57 -0.59 0.50 -0.49 -0.49 -0.47
CUGBP2 | CUG triplet repeat, RNA binding protein 2 | 234151_at | 0.6 | -0.48 -0.55 0.48 -0.50 -0.53 -0.47
CUGBP2 | CUG triplet repeat, RNA binding protein 2 | 202158_s_at | 0.6 | -0.44 -0.62 -0.49 0.54 -0.57 -0.50
CUGBP2 | CUG triplet repeat, RNA binding protein 2 | 202156_s_at | 0.6 | -0.51 -0.66 0.51 -0.46 -0.53 -0.55
EVI2A | ecotropic viral integration site 2A | 204774_at | 0.6 | -0.43 -0.53 -0.48 -0.52 -0.52 -0.55
FBLN5 | fibulin 5 | 203088_at | 0.6 | -0.48 -0.63 0.45 -0.52 -0.55 -0.60
FLI1 | Friend leukemia virus integration 1 | 210786_s_at | 0.6 | -0.52 -0.53 0.52 -0.47 -0.49 -0.58
FLI1 | Friend leukemia virus integration 1 | 204236_at | 0.6 | -0.52 -0.51 0.55 -0.49 -0.51 -0.59
FLJ12895 | hypothetical protein FLJ12895 | 218312_s_at | 0.6 | -0.44 -0.47 -0.47 -0.53 -0.59 -0.56
FLJ20696 | hypothetical protein FLJ20696 | 218614_at | 0.6 | -0.50 -0.62 -0.55 -0.53 -0.43 -0.46
GIMAP1 | GTPase, IMAP family member 1 | 1552318_at | 0.6 | -0.56 -0.55 -0.52 -0.51 -0.50 -0.57
GYPC | glycophorin C (Gerbich blood group) | 202947_s_at | 0.6 | -0.58 -0.68 0.56 -0.53 -0.56 -0.64
HLA-DOB | major histocompatibility complex, class II, DO beta | 205671_s_at | 0.6 | -0.56 -0.58 0.50 -0.48 -0.52 -0.49
ICAM2 | intercellular adhesion molecule 2 | 213620_s_at | 0.6 | -0.55 -0.63 -0.46 -0.49 -0.51 -0.63
IL16 | interleukin 16 (lymphocyte chemoattractant factor) | 209827_s_at | 0.6 | -0.56 -0.60 0.49 -0.50 -0.49 -0.49
INPP5D | inositol polyphosphate-5-phosphatase, 145kDa | 1568943_at | 0.6 | -0.45 -0.59 -0.53 -0.55 -0.51 -0.46
IRF8 | interferon regulatory factor 8 | 204057_at | 0.6 | -0.55 -0.55 0.51 -0.49 -0.45 -0.54
ITGB2 | integrin, beta 2 (antigen CD18 (p95), lymphocyte function-associated antigen 1 | 229041_s_at | 0.6 | -0.60 -0.61 0.51 -0.51 -0.48 -0.54
ITM2A | integral membrane protein 2A | 202746_at | 0.6 | -0.56 -0.60 0.56 -0.49 -0.53 -0.58
KLHDC1 | kelch domain containing 1 | 1552733_at | 0.6 | -0.54 -0.55 -0.48 0.52 -0.47 -0.45
LMO2 | LIM domain only 2 (rhombotin-like 1) | 204249_s_at | 0.6 | -0.44 -0.50 0.52 -0.48 -0.60 -0.54
LOC91526 | hypothetical protein DKFZp434D2328 | 228471_at | 0.6 | -0.47 -0.55 0.49 -0.53 -0.47 -0.46
LRMP | lymphoid-restricted membrane protein | 35974_at | 0.6 | -0.54 -0.51 0.50 -0.51 -0.56 -0.49
NIFUN | NifU-like N-terminal domain containing | 209075_s_at | 0.6 | -0.55 -0.56 -0.50 -0.51 -0.56 -0.47
PBXIP1 | pre-B-cell leukemia transcription factor interacting protein 1 | 214177_s_at | 0.6 | -0.47 -0.63 0.50 -0.48 -0.56 -0.56
RGS5 | regulator of G-protein signalling 5 | 218353_at | 0.6 | -0.55 -0.52 -0.54 -0.48 -0.55 -0.59
RHOH | ras homolog gene family, member H | 236293_at | 0.6 | -0.49 -0.59 -0.49 -0.54 -0.48 -0.46
SLC25A20 | solute carrier family 25 (carnitine/acylcarnitine translocase), member 20 | 203658_at | 0.6 | -0.51 -0.51 0.50 -0.64 -0.59 -0.45
SYCP3 | synaptonemal complex protein 3 | 230364_at | 0.6 | -0.44 -0.48 -0.50 0.50 -0.49 -0.53
SYNE1 | spectrin repeat containing, nuclear envelope 1 | 209447_at | 0.6 | -0.55 -0.61 0.51 -0.52 -0.54 -0.62
SYNPO2 | synaptopodin 2 | 227662_at | 0.6 | -0.54 -0.61 -0.48 -0.56 -0.48 -0.50
TLR7 | toll-like receptor 7 | 220146_at | 0.6 | -0.48 -0.47 -0.51 -0.51 -0.46 -0.49
TNRC6B | trinucleotide repeat containing 6B | 237895_at | 0.6 | -0.59 -0.55 -0.51 -0.47 -0.58 -0.54

Table S4: Cox multivariate regression analysis of a published breast cancer dataset including the trained intrinsic classifier

Survival | Factor | p | HR | 95% CI lower | 95% CI upper
Metastasis-free | Increasing diameter (cm) | <0.001 | 1.61 | 1.28 | 2.06
Metastasis-free | No. positive lymph nodes | 0.002 | 1.15 | 1.05 | 1.26
Metastasis-free | No adjuvant therapy | <0.001 | 2.55 | 1.513 | 4.31
Metastasis-free | 70 gene trained profile | <0.001 | 4.95 | 2.73 | 8.93
Overall | Age (decreasing decades) | 0.014 | 1.54 | 1.09 | 2.16
Overall | Increasing diameter (cm) | 0.014 | 1.37 | 1.06 | 1.77
Overall | 70 gene trained profile | <0.001 | 5.41 | 2.39 | 12.20
Overall | HS-up (hypoxia metagene) | 0.003 | 1.41 (3.97) | 1.12 (1.59) | 1.77 (9.94)

Data from Chang et al 2005. Reduced models after backward likelihood selection are shown (entry p=0.05, removal p=0.10). Variables included in the analysis were: age (decreasing decade), increasing tumor diameter (mm), number of positive lymph nodes, grade (well/moderate/poor), estrogen receptor (ER) status (positive vs negative), mastectomy (no vs yes), adjuvant therapy (no vs yes), the intrinsic 70 gene trained profile (good vs bad), the cell line-trained wound-response signature and HS-up (see Table S3). The wound-response signature (as described in Table 1 of the Chang et al 2005 paper) and HS-up (a fraction varying from 0 to 1) were entered as continuous variables, and the HR reported represents the risk for an increasing HS-up quartile. The numbers in parentheses are the risks at the two ends of the spectrum.
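The relation between the per-quartile hazard ratio for HS-up and the parenthesised values can be checked with a short calculation. The sketch below assumes, as an interpretation not stated in the table note, that the two ends of the spectrum are four quartile-sized steps apart (HS-up running from 0 to 1 in steps of 0.25) and that the Cox model is log-linear in HS-up, so the hazard ratio over k steps is the per-step ratio raised to the power k. The helper name `scale_hr` is hypothetical.

```python
import math

# Values reported for HS-up (overall survival) in Table S4.
hr_per_quartile = 1.41
ci_per_quartile = (1.12, 1.77)

# Assumption: the full 0-1 range of HS-up spans four steps of 0.25.
steps = 4


def scale_hr(hr, k):
    """Hazard ratio over k steps of a covariate in a log-linear Cox model."""
    return math.exp(k * math.log(hr))


hr_full_range = scale_hr(hr_per_quartile, steps)
ci_full_range = tuple(scale_hr(bound, steps) for bound in ci_per_quartile)

print(round(hr_full_range, 2), [round(b, 2) for b in ci_full_range])
```

Under these assumptions the scaled values fall within about 2% of the parenthesised figures (3.97, 1.59, 9.94); the small differences are consistent with rounding of the per-quartile estimates.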


Flow chart 1: An in vivo hypoxia metagene was defined by clustering around the mRNA expression of 10 well-known hypoxia-regulated genes. The genes were chosen based on the extent of published data and on validation of the individual factors in vitro and/or in clinical studies. We chose genes representative of key pathways known to be regulated by hypoxia: angiogenesis (VEGF, ADM); glucose transport and glycolysis (Glut-1, PGK1, ENO1); the link between mitochondrial metabolism and glycolysis (HK2); regulation of the overall rate of glycolysis (PFKFB3, the most active of the four isozymes that regulate the concentration of fructose-2,6-bisphosphate, which activates the rate-limiting enzyme 6-phosphofructo-1-kinase); pH regulation (CA9); re-use of ATP breakdown products (AK3); and inhibition of cell cycle progression (CCNG2). These pathways cover a wide range, though not all, of the key pathways involved in hypoxia. We focused on glucose transport and its regulation because it is a key element defining the hypoxia signature. We also focused on genes whose classification as hypoxia responsive is supported by a substantial body of investigation, particularly promoter analysis and frequent reports in gene array analyses of hypoxia-induced cell lines.

To minimize random aggregation, probesets appearing in >50% of the clusters were selected. This conservative cut-off (p<0.001) was chosen after a Monte-Carlo simulation that computed cluster aggregation around randomly selected sets of transcripts. The simulation showed that the maximum proportion of clusters in which a transcript appeared by random chance was 40%, with the majority of transcripts (99%) appearing in no clusters.

Pearson correlation was used as the distance measure for the aggregation of probesets around the proband genes; this selects transcripts that are linearly associated with the proband genes. Other relevant transcripts may be hypoxia regulated without being linearly associated with the proband genes; this could be addressed in future studies by changing the distance measure used for clustering. To account for multiple testing, the local FDR was calculated for each probeset using SAMr (BioConductor); a local FDR<0.05 was used as the cut-off to determine cluster membership. This threshold guarantees that the maximum FDR for any transcript in the hypoxia metagene is 0.05; however, because only transcripts present in more than 50% of the clusters were selected, the final FDR for each transcript in the hypoxia metagene was much smaller than 0.05. This procedure is conservative and reflects the study aim of identifying a metagene with strong predictive power. A higher false negative rate was considered acceptable as a consequence of reducing the number of false positives; i.e. a smaller but more accurate classifier was chosen over a larger but less accurate one. Furthermore, a certain degree of redundancy has been observed in profiling in clinical studies, which again favors a higher false negative and lower false positive rate.
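The seed-based clustering, the D score and the random-seed simulation described above can be illustrated with a minimal sketch on synthetic data. This is not the authors' code: the function names are hypothetical, only positive correlations are considered (as for the up-regulated list), and a fixed Pearson correlation cut-off `r_cut` stands in for the local FDR criterion, which is an assumption made to keep the example self-contained.

```python
import numpy as np


def seed_clusters(expr, seed_idx, r_cut):
    """Return a boolean (n_seeds, n_probesets) matrix of cluster membership.

    expr has shape (n_probesets, n_samples). A probeset joins a seed's
    cluster when its Pearson r with that seed is >= r_cut (an assumed
    stand-in for the local FDR cut-off).
    """
    seed_idx = np.asarray(seed_idx)
    z = (expr - expr.mean(axis=1, keepdims=True)) / expr.std(axis=1, keepdims=True)
    corr = z[seed_idx] @ z.T / expr.shape[1]  # Pearson r, seeds x probesets
    member = corr >= r_cut
    member[np.arange(seed_idx.size), seed_idx] = False  # ignore each seed's self-match
    return member


def d_score(member):
    """Proportion of seed clusters in which each probeset appears."""
    return member.mean(axis=0)


def null_max_d(expr, n_seeds, r_cut, n_iter, rng):
    """Monte Carlo null: largest D score found around randomly chosen seeds."""
    maxima = np.empty(n_iter)
    for i in range(n_iter):
        seeds = rng.choice(expr.shape[0], size=n_seeds, replace=False)
        maxima[i] = d_score(seed_clusters(expr, seeds, r_cut)).max()
    return maxima


# Synthetic data: 59 samples; probesets 0-29 follow one latent "hypoxia"
# factor, probesets 30-199 are pure noise. Probesets 0-9 act as the seeds.
rng = np.random.default_rng(0)
hypoxia = rng.normal(size=59)
expr = rng.normal(scale=0.3, size=(200, 59))
expr[:30] += hypoxia

d = d_score(seed_clusters(expr, np.arange(10), r_cut=0.6))
selected = np.flatnonzero(d > 0.5)  # probesets present in >50% of the clusters
null = null_max_d(expr[30:], n_seeds=10, r_cut=0.6, n_iter=50, rng=rng)
print(selected.size, null.max())
```

Comparing the D scores obtained around the real seeds with the distribution in `null` mirrors the reasoning behind the >50% cut-off: a threshold is chosen above the largest D score seen when the seeds carry no shared signal.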


Flow chart 2.


Figure S1: Immunohistochemical expression of Ca9 and HIF-1α in 59 HNSCC. (A) Ca9 membranous expression. (B) Nuclear HIF-1α expression. (C) Correlation of Ca9 membrane and nuclear HIF-1α expression (Spearman ρ and significance are shown).


Figure S2: Analysis of 59 HNSCC samples clustered on CA9 RNA expression. The highest ranked genes that correlated positively and negatively with CA9 are shown.

[Figure S3 plot. y-axis: literature-validated genes in list (proportion), 0 to 0.7. x-axis: clusters for ADM, AK3, CA9, ENO1, HK2, PGK1, VEGF, CCNG2, SLC2A1 and PFKFB3, followed by the H* lists for D>0.1 to D>0.7.]

Figure S3: The level of overlap between genes in the 245 hypoxia-associated gene list and those in the clusters derived from individual genes (ADM, AK3, CCNG2, ENO1, HK2, PFKFB3, SLC2A1, VEGF; diamonds). Increasing the proportion of clusters in which a gene occurred (the D score) improved the level of overlap with the literature-validated genes (circles for up-regulated genes and triangles for down-regulated genes). A hypoxia metagene (the H* list of genes) was selected using D=0.5, i.e. genes that appeared in >50% of the 10 clusters.
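The overlap plotted in Figure S3 is the proportion of genes in a list that also appear in the literature-derived list. A minimal sketch of that calculation follows; it uses a small illustrative subset of genes with the highest D score listed for each in Table S2 and a subset of Table S1, so the printed proportions are not the values shown in the figure. The function name `overlap_proportion` is hypothetical.

```python
# Highest D score per gene, taken from a few rows of Table S2 (illustrative subset).
d_scores = {
    "MTX1": 0.9, "ADORA2B": 0.8, "AK3": 0.8, "ALDOA": 0.8,
    "PGK1": 0.8, "LDHA": 0.7, "CA12": 0.6, "XPO5": 0.6,
}

# Genes from the subset above that are listed in Table S1.
literature = {"ADORA2B", "AK3", "ALDOA", "PGK1", "LDHA", "CA12"}


def overlap_proportion(d_scores, literature, d_cut):
    """Fraction of genes with D > d_cut that are in the literature list."""
    kept = {gene for gene, d in d_scores.items() if d > d_cut}
    if not kept:
        return float("nan")  # no genes pass the cut-off
    return len(kept & literature) / len(kept)


for d_cut in (0.5, 0.6, 0.7):
    print(d_cut, overlap_proportion(d_scores, literature, d_cut))
```

Raising `d_cut` shrinks the list and, in this subset as in the figure, tends to raise the proportion of literature-validated genes.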
