Leukemia (2005) 19, 953–964 & 2005 Nature Publishing Group All rights reserved 0887-6924/05 $30.00 www.nature.com/leu New insights into MLL rearranged acute leukemias using profiling: shared pathways, lineage commitment, and partner

A Kohlmann1, C Schoch1, M Dugas2, S Schnittger1, W Hiddemann1, W Kern1 and T Haferlach1

1Laboratory for Leukemia Diagnostics, Department of Internal Medicine III, Ludwig-Maximilians University, Munich, Germany; and 2Department of Medical Informatics, Biometrics and Epidemiology, Ludwig-Maximilians University, Munich, Germany

Rearrangements of the MLL gene occur in both acute role in two main functional categories, namely signaling lymphoblastic and acute myeloid leukemias (ALL, AML). This molecules that normally localize to the cytoplasm/cell junctions study addressed the global gene expression pattern of these or nuclear factors implicated in regulatory processes of two leukemia subtypes with respect to common deregulated 7 pathways and lineage-associated differences. We analyzed 73 transcription. t(11q23)/MLL leukemias in comparison to 290 other acute With respect to the oncogenic activation of MLL in leukemia leukemias and demonstrate that 11q23 leukemias combined So and Cleary proposed two mechanisms. One subset of fusion are characterized by a common specific gene expression partners already displays the required transcriptional activation signature. Additionally, in unsupervised and supervised data potential required for leukemogenesis. The other subset acts via analysis algorithms, ALL and AML cases with t(11q23) segre- gate according to the lineage they are derived from, that is, their homodimerization or oligodimerization domains and myeloid or lymphoid, respectively. This segregation can be therefore can lead in a dimerization-dependent pathway to 8 explained by a highly differing transcriptional program. deregulated transcription. Through the use of novel biological network analyses, essential Interestingly, distinct MLL fusion partners suggest a possible regulators of early B cell development, PAX5 and EBF, were role in the tropism of the leukemia. Certain partner not shown to be associated with a clear B-lineage commitment in only convert MLL to an oncogenic fusion but also direct lymphoblastic t(11q23)/MLL leukemias. Also, the influence of the different MLL translocation partners on the transcriptional the lineage susceptibility for transformation. MLL-AF4 expres- program was directly assessed. Interestingly, gene expression sing leukemias are mainly diagnosed as pro B ALL (acute profiling did not reveal a clear distinct pattern associated with lymphoblastic leukemia), whereas, for example, fusion partners one of the analyzed partner genes. Taken together, the AF9, AF6, or AF10 are common in myelomonocytic or identified molecular expression pattern of MLL fusion gene monoblastic AML (acute myeloid leukemia) subtypes.9 samples and biological networks revealed new insights into the Microarrays simultaneously assess the abundance of thou- aberrant transcriptional program in 11q23/MLL leukemias. 10 Leukemia (2005) 19, 953–964. doi:10.1038/sj.leu.2403746 sands of mRNA transcripts. During the past few years powerful Published online 7 April 2005 algorithms have been developed and adapted to mine micro- 11 Keywords: acute leukemia; gene expression; microarray; MLL; array data. More recently also applications to interprete gene t(11q23) expression signatures in terms of pathways and networks have evolved. In this study, from a series of 363 acute leukemia patient samples hybridized to a set of high-density microarrays representing a near complete , analyses were Introduction performed to (i) identify t(11q23)/MLL gene signatures compared to numerous specific subtypes of acute leukemias, (ii) discrimi- The MLL gene (also termed ALL-1, HRX, and TRX1) located at nate t(11q23)/MLL-positive AML from t(11q23)/MLL ALL sam- band 11q23 is a recurrent target of chromosomal ples, (iii) investigate signatures correlated with MLL-AF9 and translocations in acute leukemias, particularly prevalent in other MLL partner genes, and (iv) decipher common biological infant leukemias and treatment-related secondary leukemias, networks. Specifically we addressed the question how the and associated with dismal prognosis.1–3 Reciprocal transloca- differing MLL partner genes influence the global gene expression tions associated with the MLL gene result in in-frame fusion signature and whether pathways could be identified to explain transcripts with various partner genes from at least 50 distinct the molecular determination of MLL leukemias occurring in gene loci.4 In addition, a partial tandem duplication of the MLL both the myeloid and lymphoid lineages. gene has been reported.5 The class of oncogenic MLL fusion proteins consists of the N-terminal portion of the MLL protein fused to C-terminal portions of a fusion partner. Experimental systems in which MLL Materials and methods fusion proteins were generated to induce leukemia in mice demonstrated that this fusion to a C-terminal partner is necessary Patient samples for immortalization. Two critical regions within MLL were identified: a region with three AT hook DNA-binding motifs and This study included bone marrow samples from 363 adult acute 6 the DNA methyltransferase homology region. The MLL fusion leukemia patients at diagnosis representing distinct precursor partners act via dominant gain of function and seem to play a B-ALL subtypes t(11q23)/MLL, t(8;14), t(9;22) and precursor T-ALL as well as AML subtypes with t(11q23)/MLL, t(8;21), Correspondence: A Kohlmann, Laboratory for Leukemia Diagnostics, t(15;17), inv(16), or complex aberrant karyotype (Supplementary Department of Internal Medicine III, Ludwig-Maximilians University, Table 1). In AML with t(11q23)/MLL, 15 out of 48 cases were Marchioninistr. 15, 81377 Munich, Germany; therapy-related. Seven out of these 15 cases followed exposure Fax: þ 49 89 7095 4971; E-mail: [email protected] to topoisomerase II inhibitors. All samples were sent between Received 25 October 2004; accepted 26 January 2005; Published December 1998 and February 2004 for reference diagnostics to online 7 April 2005 our laboratory and were registered in our leukemia database.12 Gene expression profiling in t(11q23)/MLL acute leukemias A Kohlmann et al 954 Prior to therapy, all patients gave their informed consent for graphically as nodes (genes) and edges (the biological relation- participation in the current evaluation after having been advised ships between the nodes). Nodes are further displayed using about the purpose and investigational nature of the study as well various shapes that represent the functional class of the gene as of potential risks. The study design adhered to the declaration product. Edges are displayed with various labels that describe the of Helsinki and was approved by the ethics committees of the nature of the relationship between the nodes. The length of an participating institutions prior to its initiation. The diagnosis was edge reflects the evidence supporting that node-to-node relation- performed by an individual combination of cytomorphology, ship, in that edges supported by more articles from the literature cytogenetics, fluorescence in situ hybridization (FISH), multi- are shorter. A detailed description of the procedure and various parameter-immunophenotyping, and molecular genetics. In legends are available online in the Supplementary section. particular, a thorough characterization of the t(11q23)/MLL samples was warranted. Cytogenetic characterization, FISH on interphase nuclei and/or metaphases, and MLL fusion transcripts Results PCR detection was performed as previously described.3 Distinct gene expression signatures in t(11q23)/MLL leukemias Gene expression profiling and statistical methods We first compared the expression profiles of all 73 adult Microarray analyses were performed as previously described t(11q23)/MLL-positive samples (n ¼ 25 ALL and n ¼ 48 AML utilizing the GeneChip System (Affymetrix, Santa Clara, CA, with t(11q23)/MLL) to 204 adult myeloid and 86 lymphoblastic USA) and the HG-U133 microarray set.13–16 Details on the leukemia samples with other defined genetic aberrations). In a procedure are provided in the online section. supervised data analysis approach, a robust set of differentially For supervised statistical analyses, samples were accordingly expressed genes was identified which accurately stratified the grouped and for each disease entity differential genes were samples according to their underlying cytogenetic and immuno- calculated by means of t-test-statistic (two-sample t-test, unequal phenotypic characteristics, that is, myeloid subclasses, precursor variances).17 We applied the software package R version 1.7.1 B-lineage, or precursor T-lineage ALL. In detail, for lympho- (http://www.r-project.org/). To address the multiple testing blastic leukemias, t(11q23)/MLL samples (n ¼ 25) were accu- problem, false discovery rates (FDR) of genes were calculated rately separated from precursor B-ALL cases with t(9;22) (n ¼ 42), according to Storey and Tibshirani.18 t(8;14) (n ¼ 12), and precursor T-ALL (n ¼ 32). Figure 1a displays The class prediction was performed using support vector a principal component analysis of 111 ALL samples based on the machines (SVM),19 because there is evidence that SVM-based differential expression of 262 genes. When projected into the prediction slightly outperforms other classification techni- expression space of these informative genes, the four distinct ALL ques.20,21 SVM models were built with libsvm (http:// subclasses accurately cluster together. The online section www.csie.ntu.edu.tw/~cjlin/libsvm/). For further details, see contains the top 50 genes with higher expression or lower Supplementary Information. expression, respectively, in ALL with t(11q23)/MLL, as well as As an additional method to extract differentially expressed genes, the SAM software program (MS Excel application) was 22 used. Microarray signal intensities were transformed as ab described above and subsequently imputed into the software. A stringent cutoff for significance (tuning parameter delta) for o1 false positive-rated gene was chosen. The resulting gene expression data was visualized using hierarchical cluster and principal component analysis (Gene- Maths XT, Applied Maths, Belgium). For visualization of unsupervised data analyses, a variation filter was applied. This filter aimed at removing probe sets that demonstrated minimal variation across the complete data set. Practically, for each gene the standard variance was calculated across all samples. Then ALL with t(9;22) AML with t(8;21) the data matrix was sorted according to the standard variances, AML with t(15;17) and probes, demonstrating a low variance were excluded from ALL with t(8;14) the subsequent analysis. precursor T-ALL AML with inv(16) ALL with t(11q23)/MLL AML with complex kt. AML with t(11q23)/MLL Biological networks analysis Figure 1 Principal component analyses including various acute Biological networks have been generated through the use of leukemia subtypes. The leukemia samples are plotted in a three- Ingenuity Pathways Analysis (August 2004 release version), a dimensional space using the three components capturing most of the variance in the original data set. Each patient sample is represented by web-delivered application that generates networks using differ- a single color-coded sphere. The labels and coloring of the classes entially expressed genes from microarray analyses. The networks were added after the analysis for means for better visualization. (a) address two different questions: (i) discrimination of t(11q23)/ Adult ALL of the four subcategories precursor B-ALL samples MLL from other genetically defined acute leukemia subtypes, comprising t(11q23)/MLL (n ¼ 25), t(9;22) (n ¼ 42), t(8;14) (n ¼ 12), and (ii) discrimination of ALL with t(11q23)/MLL from AML with and precursor T-ALL (n ¼ 32) are accurately separated based on 262 t(11q23)/MLL. In both analyses, a data set containing gene differentially expressed genes. (b) Adult AML samples including t(11q23)/MLL (n ¼ 48), t(8;21) (n ¼ 38), t(15;17) (n ¼ 42), inv(16) identifiers in probe set format and their corresponding fold (n ¼ 49), and complex aberrant karyotypes (n ¼ 75) are accurately change characteristics was uploaded as a tab-delimited text file separated based on 416 differentially expressed genes. Unsupervised into the Ingenuity pathway database. The networks are displayed analyses are available online.

Leukemia Gene expression profiling in t(11q23)/MLL acute leukemias A Kohlmann et al 955 data from an unsupervised analysis (Supplementary Table 2, Common MLL target genes Supplementary Figure 6). Likewise, by use of the differential expression of 416 genes, In order to identify common MLL target genes, both types the 252 AML samples could accurately be stratified. Specific of t(11q23)/MLL leukemias were grouped together and patterns in gene expression were correlated with t(11q23)/MLL were compared to the various types of precursor B- and T- (n ¼ 48), t(8;21) (n ¼ 38), t(15;17) (n ¼ 42), inv(16) (n ¼ 49), and lineage ALLs as well as to other cytogenetically defined AML AML samples with complex aberrant karyotypes (n ¼ 75). This subtypes. In doing so, a set of differentially expressed genes finding is also visualized by a principal component analysis specifically associated with t(11q23)/MLL leukemias was (Figure 1b). The online section contains the top 50 genes with specified. Relationships between these genes were further higher expression or lower expression, respectively, in AML examined using a network analysis application. As given with t(11q23)/MLL, as well as data from an unsupervised in Figure 3, HOXA9 as well as MEIS1 show up as genes with analysis (Supplementary Table 3, Supplementary Figure 7). higher expression in both t(11q23)/MLL leukemias. Other genes Thus, in both types of acute leukemias, t(11q23)/MLL-positive with higher expression in this network included NICAL and samples are clearly distinct from other subtypes of same cell chromatin remodeling actor RUNX2. Downregulated genes lineage, that is, myeloid or lymphoblastic. They have a included, for example, TNF-receptor superfamily members characteristic underlying expression signature compared to TNFRSF10A and TNFRSF10D,orMADH1, functioning down- other distinct acute leukemia subclasses. stream of TGF-beta receptor serine/threonine kinases. Three Subsequently, all samples were included into one comprehen- additional networks are available as online Supplementary sive analysis. A supervised data analysis algorithm was applied to material. They visualize networks containing other genes with identify genes that separate each of the nine subtypes from the known relationship with t(11q23)/MLL leukemias, for example, remaining classes. As shown in Figure 2, the nine distinct acute HOXA cluster genes (HOXA5, HOXA10), as well as the Hox leukemia subtypes can accordingly be separated. The hierarchical coregulator PBX3, or the tyrosine kinase FLT3. Other target clustering algorithm identified common expression signatures and genes with higher expression in t(11q23)/MLL leukemias orders the patient samples accurately by similarities. Interestingly, included HIP1, so far associated with prostate cancer progres- t(11q23)/MLL-positive samples are not found to cluster together sion, proto-oncogene FRAT1, TAF1B, playing a role in the but rather according to the lineage they are derived from, that is, a tumorigenesis of colorectal carcinomas, and ZFHX1B,a lymphoblastic t(11q23)/MLL cluster and a myeloid t(11q23)/MLL transcriptional corepressor. More detailed information on all cluster can be observed. In the top dendrogram, ALL samples with focus and nonfocus genes, as well as gene expression signal t(11q23)/MLL are grouped next to ALL with t(9;22) and t(8;14), intensities for all genes included in the four networks is available and AML with t(11q23)/MLL are grouped next to AML with in the online section. The top 50 genes with higher expression or t(15;17) or AML with t(8;21) cases. This finding can also be lower expression, respectively, in both leukemias with t(11q23)/ observed in unsupervised analyses using hierarchical clustering MLL combined are given in the online section (Supplementary (Supplementary Figure 8) and PCA (Supplementary Figure 9). Table 4). 40 50 60 70 80 90 100

ALL with t(11q23)/MLL AML with t(11q23)/MLL

Figure 2 Hierarchical cluster analysis of n ¼ 363 adult ALL and AML samples (columns). The normalized expression value for each gene (given in rows) is coded by color (standard deviation from mean). Red cells indicate high expression and green cells indicate low expression. The coloring of the groups is identical to Figure 1. The t(11q23)/MLL leukemias are highlighted by arrows. Corresponding unsupervised analyses are available online.

Leukemia Gene expression profiling in t(11q23)/MLL acute leukemias A Kohlmann et al 956 MEIS1

B B PBX2 B HOXA9*

B B B + CRKL E MADH1* SET B HOXC8 + B B E MLL B + B B CD34 B NEDD9 B B + SETBP1* RUNX2* + B SAP30 KIAA0601* B B

NICAL LMO2 B B B B B TAL1 B ZNF145 B RCOR B B B TCF7L1 B TNFRSF10D B B GATA2* + RANGAP1 B PML UBL1 B B B E TNFSF10 B A,E B

B + + B TNFRSF10C GFI1B TNFAIP3* RANBP2* A,B TNFRSF10B E E + B TNFRSF10A* REL (v-rel homolog) + E E SOD2

B

IER3

Figure 3 Biological network distinguishing t(11q23)/MLL leukemias from other acute leukemia subtypes. The network is graphically displayed with genes/gene products as nodes and the biological relationships between the nodes as edges. The intensity of the node color indicates the degree of differential gene expression. Green intensities correspond to a lower expression (downregulated) in t(11q23)/MLL cases compared to AML subtypes (inv(16), t(8;21), t(15;17), complex karyotypes) or ALL subtypes (t(9;22), t(8;14), T-ALL), respectively. Red intensities correspond to a higher expression in t(11q23)/MLL cases (upregulated), respectively. Nodes are displayed using various shapes that represent the functional classof the gene product (for detailed legend, see online section). Edges are displayed with labels that describe the nature of the relationship between the nodes (B for binding, E for expression, A for activation/deactivation). Note: Focus genes were included in the original text format file derived from the list of differentially expressed genes (for details, see Supplementary document). Nonfocus genes were derived from queries for interactions between focus genes and all other gene objects stored in the Ingenuity knowledge data base. Gene expression raw data is available online as Supplementary Data Set 2.

Unsupervised hierarchical cluster analysis of MLL was also able to distinguish between the different lineages they translocation positive acute leukemias are derived from. Both a principal component analysis and a two-dimensional hierarchical cluster analysis of the 25 ALL and Next, we addressed the question whether an unsupervised 48 AML with MLL gene translocation were performed. As analysis including exclusively MLL gene rearranged leukemias demonstrated in Figure 4a, although both types of acute

Leukemia Gene expression profiling in t(11q23)/MLL acute leukemias A Kohlmann et al 957 leukemias are characterized by an MLL gene rearrangement, an myeloid or lymphoblastic origin. Moreover, given the dendro- unsupervised data analysis approach clearly separates the gram from the unsupervised hierarchical cluster analysis, no samples according to their hematopoietic lineage, that is, clear subclustering of cases with identical MLL partner genes can be observed (Figure 4b). In ALL with t(11q23)/MLL, the MLL-ENL cases intercalate with the MLL-AF4 samples. In AML with t(11q23)/MLL, no obvious structure, neither according to a AML with FAB criteria nor to the MLL partner genes can be observed. The t(11q23) MLL-AF6, MLL-AF10, MLL-ELL, as well as rare cases (MLL-p300, MLL-AF17, MLL-SMAP1, MLL-X) are intercalated between the MLL-AF9 samples. Thus, two independent unsupervised algo- rithms consistently separate MLL gene rearranged leukemias into ALL and AML subgroups but not with respect to the partner genes.

ALL with Supervised analysis to discriminate t(11q23)/MLL t(11q23) translocation positive leukemias

We next directly compared expression signatures of ALL b AML MLL-X with t(11q23)/MLL to AML with t(11q23)/MLL in a supervised AML M5a MLL-AF10 algorithm. Among the differentially expressed genes, upregu- AML M5a MLL-ELL AML M5b MLL-AF6 lated candidates in lymphoblastic leukemias demonstrated a AML M5b MLL-AF9 AML M5b MLL-AF9 dominant pattern according to B-lineage commitment. PAX5, AML MLL-AF9 the B-cell-lineage-specific activator was designated as one AML M5b MLL-ELL AML M4 MLL-AF6 of the top-ranked differentially expressed genes. In line with AML M5 MLL-ENL AML M5a MLL-AF9 this finding, PAX5 target genes BLK and CD19 could also AML M4 MLL-AF6 be confirmed upregulated in ALL with t(11q23)/MLL by AML M4 MLL-X AML M4 MLL-SMAP1 microarray analysis. An upregulated expression of IGHM AML M4 MLL-X (encoding the IgM heavy chain), VPREB1 (surrogate light-chain, AML M4 MLL-AF9 AML M1 MLL-AF10 important for forming the pre-B-cell receptor), and CD22 or AML M5a MLL-AF9 AML M5a MLL-AF9 CD79A further elucidates the B-lineage commitment of ALL AML M5a MLL-AF10 with t(11q23)/MLL. AML M5a MLL-AF9 AML M5a MLL-X In addition, the list of differentially expressed genes was also AML M0 MLL-ENL imputed into a pathway analysis application. Various networks AML M5a MLL-AF9 AML M5a MLL-AF9 of functionally related genes were obtained. In Figure 5, a AML M5a MLL-AF9 AML MLL-AF9 biological network is represented. Seven additional networks AML M0 MLL-CBP are available as online Supplementary material. In this network, AML M5a MLL-AF9 AML M5b MLL-AF9 LEF1, a transcriptional regulator is connected to PAX5 and its AML M5a MLL-AF9 target CD79A, which is included in the B-cell antigen receptor. AML M5b MLL-p300 AML M5a MLL-AF9 These genes, as well as the transcriptional regulators MEF2A and AML M5b MLL-AF6 AML M4 MLL-AF9 TCF3 demonstrated a higher expression in ALL with t(11q23)/ AML M5a MLL-AF17 MLL profiles compared to AML with t(11q23)/MLL cases. In the AML M5a MLL-AF10 AML M5b MLL-AF9 Supplementary networks further interesting differentially ex- AML M4 MLL-AF9 AML M5b MLL-AF9 pressed genes with higher expression in t(11q23)/MLL-positive AML M0 MLL-AF9 ALL include BCL11A, also involved in lymphoid malignancies, AML M4 MLL-AF6 AML M1 MLL-ELL transcription regulator ETS2, chromatin-binding proteins CBX2 AML M1 MLL-AF6 and CBX4, and early B-cell factor EBF, which can restrict AML M4 MLL-AF6 AML M4 MLL-AF9 lymphopoiesis to the B-cell lineage and works in concert with Pro-B ALL MLL-AF4 Pro-B ALL MLL-AF4 PAX5 to activate genes required for B-cell differentiation. Pro-B ALL MLL-AF4 Pro-B ALL MLL-AF4 Pro-B ALL MLL-AF4 Pro-B ALL MLL-AF4 Pro-B ALL MLL-AF4 Pro-B ALL MLL-AF4 Figure 4 Unsupervised analysis of adult ALL and AML t(11q23)/ Pro-B ALL MLL-AF4 MLL samples. (a) Unsupervised analysis using a selection of 5000 Pro-B ALL MLL-AF4 Pro-B ALL MLL-ENL genes that showed the largest variance across all samples. In the three- Pro-B ALL MLL-ENL dimensional PCA plot, data points with similar characteristics will Pro-B ALL MLL-AF4 Pro-B ALL MLL-AF4 cluster together. Each patient’s expression pattern is represented by a ALL MLL-AF4 single color-coded sphere. ALL with t(11q23)/MLL samples are labeled Pro-B ALL MLL-AF4 mauve, AML with t(11q23)/MLL are labeled turquoise, respectively. (b) Pro-B ALL MLL-AF4 Pro B-ALL MLL-AF4 Enlarged dendrogram of ALL and AML t(11q23)/MLL samples when AML M5a MLL-AF9 analyzed by unsupervised hierarchical clustering (cluster algorithm: Pro-B ALL MLL-AF4 Ward; selected coefficient: Euclidean distance). For each sample, the Pro-B ALL MLL-AF4 Pro-B ALL MLL-ENL respective immunophenotype, FAB subtype and MLL fusion partner Pro-B ALL MLL-AF4 gene as confirmed by FISH and/or PCR-based molecular analyses is Pro-B ALL MLL-AF4 ALL MLL-ENL given. MLL-X indicates samples with unknown partner genes. Two of AML M4 MLL-p300 the 48 MLL gene rearranged AML are contained in the ALL branch of Pro-B-ALL MLL-AF4 the dendrogram.

Leukemia Gene expression profiling in t(11q23)/MLL acute leukemias A Kohlmann et al 958 + HMGB1 CDKN1B

B E B E + B TCF3* RB1 CEBPB CASP1* B A,B A,B

P + B,P A,T MEF2A* + + INSR PRKCE* TRA1 PRKCA B,I B + + P PLD1 A,B,P A,B PECAM1*

+ B MADH2 + + JAK2 B FCGR2A JUP + B

+ B CD79A B P GAB2 + LEF1*

+ CD36 B VDR* LYN* B E B B + + B PAX5 E FCER1G B B,P + MITF A,B KIT VEGF* + A,B WT1 PLAUR*

+ E B + B ITGAM B CD151 + + TNFRSF6* B ITGA4*

+ B B CD4

Figure 5 Differentially expressed genes between ALL with t(11q23)/MLL and AML with t(11q23)/MLL. A biological network is displayed graphically. Details for the legend can be obtained in Figure 3 or in the Supplementary document. Green intensities correspond to a lower expression in ALL with t(11q23)/MLL cases compared to AML with t(11q23)/MLL samples (downregulated). Red intensities correspond to a higher expression in ALL with t(11q23)/MLL cases compared to AML with t(11q23)/MLL samples (upregulated). Gene expression raw data is available online as Supplementary Data Set 3.

Reversely, genes with higher expression in t(11q23)/MLL- antigen, or CITED4, a CBP/p300-interacting transcriptional positive AML included the transcriptional activator CEBPB, transactivator. Also, a different repertoire of expression of protein tyrosine kinase KIT, MADH2, a transcription factor- suppressors of cytokine signaling (SOCS) family members as binding protein, and MITF, a transcriptional regulator (Figure 5). well as members of the tumor necrosis factor superfamily could As illustrated in the Supplementary networks, a myeloid be observed. commitment through higher expression in AML with t(11q23)/ More detailed information on all focus and nonfocus genes, as MLL could be demonstrated by differential expression of CEBPA well as gene expression signal intensities for all genes included (CCAAT/enhancer-binding protein-alpha), a transcription factor in the eight networks is given in the online section. required for differentiation of myeloid progenitors, as well as SPI1 (PU.1), a critical player in myeloid development, or granulocyte/macrophage colony-stimulating factor (GM-CSFR) Influence of MLL translocation partners on the gene and granulocyte colony-stimulating factor (G-CSFR) genes. expression signatures Other candidates with significantly higher expression in t(11q23)/MLL positive AML are FES, a tyrosine kinase oncogene, First, within the cohort of AML and t(11q23)/MLL samples, the MNDA, encoding the myeloid cell nuclear differentiation group of t(9;11)-positive cases (n ¼ 23) was compared to

Leukemia Gene expression profiling in t(11q23)/MLL acute leukemias A Kohlmann et al 959 non-t(9;11)-positive samples (n ¼ 25). Neither supervised nor Table 1 represents a confusion matrix of MLL subgroup unsupervised analyses revealed a specific expression signature predictions based on their gene expression signature using a 10- associated with the MLL translocation partner AF9. In Figure 6, fold crossvalidation approach. It can be observed that the SAM plots demonstrate that compared to the previous analysis classifier is good at predicting the MLL partner genes AF9 and of ALL with t(11q23)/MLL versus AML with t(11q23)/MLL no AF4, the two major groups in the AML and ALL patient cohorts, significantly differentially expressed genes clearly correlate to respectively. Other partner genes are not accurately identified. the MLL-AF9 translocation (left plot). The q-values of the top The misclassifications mainly occur in the corresponding differentially expressed genes ranged between 0.75 and 0.82, that is, calling this set of genes significant would result in an FDR of 475%. For comparison, a very high number of differentially expressed genes can be identified when comparing ALL with t(11q23)/MLL versus AML with t(11q23)/MLL (right plot). Furthermore, as demonstrated in Figure 7, the unsupervised data analysis approach including all t(11q23)/MLL samples also did not reveal any specific patterns associated with distinct MLL AML cluster partner genes. It is interesting to note that MLL-ENL samples, included both in the AML and ALL patient cohorts are separated. Four ALL cases with MLL-ENL intercalate with the MLL-AF4 samples, two AML with MLL-ENL samples are distributed between the various cases in the AML cluster. Additionally, in the AML cluster, no clear distinction between de novo cases and t-AML cases following exposure to topoisomerase II inhibitors ALL cluster can be observed (details are given in the online Supplementary Document). A more detailed analysis then aimed at mining the data supervised for differential gene expression between various MLL partner genes. Here, six groups of MLL patient samples were ALL with MLL-AF4 AML with MLL-ENL included: AML cases with t(9;11)/MLL-AF9 (n ¼ 23), t(6;11)/ ALL with MLL-ENL AML with MLL-p300 MLL-AF6 (n ¼ 7), t(10;11)/MLL-AF10 (n ¼ 4), and t(11;19)/MLL- ELL cases (n ¼ 3); as well as ALL samples with t(4;11)/MLL-AF4 AML with MLL-AF10 AML with MLL-SMAP1 (n ¼ 21) and t(11;19)/MLL-ENL (n ¼ 4). In this data set, no AML with MLL-AF6 AML with MLL-CBP statistically significant expression signatures were found to be AML with MLL-AF9 AML with MLL-X specifically correlated with one of the distinct partner genes. Predicting the respective partner gene based on differential gene AML with MLL-AF17 AML with MLL-ELL expression signatures was approached using Support Vector Figure 7 Unsupervised analysis of adult ALL and AML t(11q23)/ Machines (SVM). Taking the six different subgroups into MLL samples. The unsupervised analysis is based on 5000 genes that account, the complete data set was randomly separated into a showed the largest variance across all samples. For better visualiza- training cohort and an independent test cohort. Then differen- tion, the labels and coloring of the classes were added after the tially expressed genes were identified in the training set, analysis. In the three-dimensional PCA plot, data points with similar characteristics will cluster together. Here, each patient’s expression calculated by means of t-test-statistic, and an SVM model was pattern is represented by a single color-coded sphere. For each built based on the top 100 genes that demonstrate differential sample, the MLL fusion partner gene as confirmed by FISH and/or expression between the respective subclasses in the training set. PCR-based molecular analyses is given. MLL-X indicates samples with This SVM model was used to predict samples in the test cohort. unknown partner genes.

SAM Plot: SAM Plot: AML with MLL-AF9 vs. AML with non-MLL-AF9 ALL with t(11q23)/MLL vs. AML with t(11q23)/MLL Observed Observed

Expected Expected Significant : 6 Significant: 3336 Median # false significant: 0.968 Median # false significant: 0.497

Figure 6 Supervised identification of differentially expressed genes. The left plot shows a supervised analysis of AML samples comparing a group of t(9;11)-positive cases (n ¼ 23) to non-t(9;11)-positive samples (n ¼ 25). Here, no statistically significant differentially expressed genes were identified. The right plot shows a supervised comparison of ALL with t(11q23)/MLL versus AML with t(11q23)/MLL. Red dots indicate genes with higher expression in AML with t(11q23)/MLL and green dots indicate higher expressed genes in ALL with t(11q23)/MLL.

Leukemia Gene expression profiling in t(11q23)/MLL acute leukemias A Kohlmann et al 960 Table 1 MLL partner gene confusion matrix determined by 10-fold crossvalidation

Real

MLL-AF10 MLL-AF6 MLL-AF9 MLL-ELL MLL-AF4 MLL-ENL

Predicted MLL-AF10 1 MLL-AF6 1 1 MLL-AF9 4 7 20 2 MLL-ELL MLL-AF4 1 20 4 MLL-ENL 1 The matrix shows the predicted MLL fusion partner gene. Misclassified samples are given by bold letters. For example, of n ¼ 21 MLL-AF4 samples, 20 are accurately identified and one sample is classified as MLL-ENL. Likewise, MLL-AF10 or MLL-AF6 samples are classified as MLL- AF9 samples.

Table 2 MLL partner gene confusion matrix determined by resampling

Real

MLL-AF10 MLL-AF6 MLL-AF9 MLL-ELL MLL-AF4 MLL-ENL

Predicted MLL-AF10 0.05 0.27 MLL-AF6 0.44 0.47 0.2 MLL-AF9 0.95 1.55 7.09 0.8 0.09 MLL-ELL 0.1 0.01 MLL-AF4 0.01 0.07 6.59 0.84 MLL-ENL 0.31 0.16 The matrix shows the predicted MLL fusion partner gene as determined after 100 runs of SVM-based classifications. Misclassified samples are given by bold letters. Average numbers of predictions per run are given. For example, seven MLL-AF4 samples have been predicted by the algorithm 700 times (each sample 100 times). Of the 700 predictions, the class label MLL-AF4 has been given correctly 659 times, that is, on average 6.59 per run. In nine individual predictions, an MLL-AF4 sample has been predicted as MLL-AF9, in one prediction as MLL-ELL, and in 31 predictions as MLL-ENL, respectively.

myeloid or lymphoblastic compartment. Thus, there is only a tures. Furthermore, the present study was extended to include strong correlation with the lineage the MLL leukemias are AML cases with MLL gene rearrangements and shows that AML derived from. The gene expression profile does not support the with t(11q23)/MLL can also be associated with a distinct hypothesis of a clear distinct signature associated with one of the expression signature. various partner genes that can interact with the MLL gene. We also analyzed four differing adult ALL subtypes. Precursor In order to assess the robustness of partner gene prediction, a B-ALL with t(11q23)/MLL, t(9;22), or t(8;14) and precursor T-ALL resampling approach was applied, that is, the complete SVM all form distinct clusters in various data analysis approaches classification procedure was repeated for 100 times. The which reflect their highly differing underlying gene expression training set included 2/3 of patients, the test set 1/3. Here, the profiles. This is in line with previous reports showing that test set for each of the 100 runs included 20 samples which were pediatric and adult ALL with t(11q23)/MLL, t(9;22), or precursor randomly chosen from the total patient cohort to include one T-ALL samples, respectively, can be separated and also MLL-AF10, two MLL-AF6, eight MLL-AF9, one MLL-ELL, seven predicted with high accuracies using microarray techno- MLL-AF4, and one MLL-ENL sample. Given the differential gene logy.14,28,32 We demonstrated that in a comprehensive analysis expression mainly the MLL partner genes AF9 and AF4, including numerous classes of defined acute leukemia subtypes dominating the patient cohort, are given correct class labels t(11q23)/MLL patient samples were distinct. by the classification algorithm (Table 2). This study further aimed at identifying common targets of MLL chimeric fusion genes. In order to designate common target genes, both types of acute leukemias with MLL translocations Discussion were combined and were compared to various types of other precursor B- and T-lineage ALLs as well as to other cytogenetic- Recent studies successfully applied microarrays to classify ally defined AML subtypes. This supervised analysis resulted in a known hematological malignancies, as well as to discover list of statistical significant differentially expressed genes novel subtypes and to identify genetic differences associated irrespective of lineage. A closer examination of these genes with distinct prognostic subgroups.23–25 In pediatric and adult showed that also in our data a significantly overexpressed ‘Hox acute leukemias, distinct gene expression signatures were code’ was detectable, that is, overexpression of HOX-A cluster correlated with t(11q23)/MLL-positive cases.14,15,26–32 Using members.33 Other genes with higher expression in t(11q23)/MLL both a larger cohort of patients and an up-to-date microarray leukemias have also been previously reported to be implicated design, we confirmed our data that AML subtypes carrying the in MLL-related leukemogenesis, that is, MEIS1, and PBX3.34,35 specific balanced chromosomal aberrations t(8;21), t(15;17), However, here it could further be demonstrated how the and inv(16) demonstrate highly characteristic expression signa- t(11q23)/MLL leukemia-associated genes are related to each

Leukemia Gene expression profiling in t(11q23)/MLL acute leukemias A Kohlmann et al 961 other in a novel constellation. As given in the respective Taken together, a multitude of genes visualized a strong networks consistently upregulated candidates with oncogenic B-lineage commitment in lymphoblastic t(11q23)/MLL leuke- potential included, for example, RUNX2, HIP1, FRAT1, TAF1B, mias. Other interesting candidates with higher expression in ALL and ZFHX1. RUNX2 normally plays a key role in osteogenesis with t(11q23)/MLL for subsequent experimentation include but also a direct oncogenic role had been proposed.36,37 HIP1 CBX2 and CBX4, both components of the chromatin-associated encodes an endocytic protein with transforming properties that polycomb complex. Polycomb group proteins assemble to form is involved in a cancer-causing translocation and which is large multiprotein complexes that are thought to repress their overexpressed in a variety of human cancers.38 Proto-oncogene targets by modifying chromatin structure.48 It has been FRAT1 represents the human homologue to mouse proto- suggested that interference with CBX4 function can lead to oncogene Frat1, which promotes carcinogenesis through derepression of proto-oncogene transcription and subsequently activation of the Wnt/beta-catenin/TCF signaling pathway.39 to cellular transformation.49 TAF1B has been identified to play a role in the tumorigenesis of With respect to AML with t(11q23)/MLL, in another network a colorectal carcinomas with microsatellite instability.40 ZFHX1 transcriptional pattern for myeloid commitment was represented encoding Smad-interacting protein 1 (SIP1) directly represses through the higher expression of key players in myeloid E-cadherin gene transcription and activates cancer invasion via development, CEBPA and SPI1 (PU.1). The finding that the upregulation of the matrix metalloproteinase gene family.41 C/EBPalpha binds and activates the endogenous PU.1 gene in Consistently downregulated genes in t(11q23)/MLL leukemias myeloid cells further contributes to the specification of myeloid included TNF-receptor superfamily members required in TRAIL- progenitors.50 Also, genes encoding the receptors for GM-CSFR mediated apoptosis, TNFRSF10A and TNFRSF10D,42 or and G-CSFR clearly underline a completely differing transcript- MADH1 (SMAD1), functioning downstream of TGF-beta re- ional program since it has been suggested that G-CSFR signals ceptor serine/threonine kinases.43 However, it only can be may play a role in directing the commitment of primitive speculated whether the dysregulated expression of these genes hematopoietic progenitors to the common myeloid lineage.51 confer any resistance to apoptotic stimuli. Also, the downregulation of GM-CSFR represents a critical event The t(11q23)/MLL leukemias are generally associated with a in producing cells with a lymphoid-restricted lineage poten- high risk of treatment failure and therefore novel therapeutic tial.52 Other differentially expressed genes with higher expres- strategies are needed to improve the outcome in patients with sion in t(11q23)/MLL-positive AML included, for example, FES,a 11q23 abnormalities. Small molecule inhibitors of FLT3,a tyrosine kinase oncogene, implicated in signaling downstream receptor tyrosine kinase, may prove to be beneficial.44 In recent from hematopoietic cytokines.53 FES may be a key component studies, high levels of FLT3 expression in patients with MLL of the granulocyte differentiation machinery and contributes to rearrangements have been identified and FLT3 successfully has lineage determination at the level of multilineage hematopoietic been validated as a therapeutic target.26,45 We also can observe progenitors as well as the more committed granulo-monocytic an overexpression of FLT3 in both t(11q23)/MLL leukemias progenitors.54 Another gene that may be involved in myeloid compared to other acute leukemia classes. The corresponding differentiation is MNDA, encoding the myeloid cell nuclear network including the FLT3 gene is available online (network differentiation antigen.55 It is expressed exclusively in maturing MLL target 2). myeloid cells and cell lines, and is not expressed in lymphoid We further demonstrated that ALL and AML cases with cells. Recent data suggest that there is a strong correlation t(11q23)/MLL segregate according the lineage, that is, myeloid between MNDA expression and myeloid differentiation.56 Here, or lymphoblastic, respectively. In unsupervised data analyses MNDA expression further elucidates the myeloid lineage the cases with MLL gene translocations did not cluster as a specificity in t(11q23)/MLL-positive AML. Lastly, CITED4,a unique subgroup, but instead clustered according to their CBP/p300-interacting transcriptional transactivator, is signifi- lineage of origin. Therefore, we propose that MLL aberrations cantly higher expressed in AML with t(11q23)/MLL.57 It may lead to specific expression signatures but that there is a clear function as a coactivator for transcription factor AP-2 and identification of lymphoblastic lineage commitment for ALL possible roles for CITED4 in regulation of gene expression with t(11q23)/MLL. This has also recently been demonstrated by during development and differentiation of blood cells have been Ross et al29 in a pediatric cohort of patients. We now could implied.58 demonstrate that this cellular differentiation can be explained by A major goal of this study was to directly assess the influence a transcriptional program and further elucidated this through the of the different MLL translocation partners on the transcriptional use of biological network analysis. Among the top-ranked program. We performed a supervised comparison of MLL-AF9- differentially expressed genes to discriminate ALL and AML positive samples to MLL-AF9-negative samples in AML. No cases with t(11q23), PAX5 was represented. PAX5 restricts the statistically significant differences were found. Using SAM plots developmental options of lymphoid progenitors to the B-cell to visualize the degree of differences in their gene expression lineage by repressing the transcription of lineage-inappropriate pattern we observed that within AML the MLL-AF9-positive genes and simultaneously activating the expression of samples were very similar compared to the MLL-AF9-negative B-lymphoid signaling molecules.46 Its influence can also be samples. Furthermore, as demonstrated by an unsupervised data followed more downstream when we focused on PAX5 target analysis, no clear subclustering of t(9;11)/MLL-AF9-positive genes, also included in the list of top-ranked differential genes. It samples was observed. Instead of being distinct from other is known that, for example, BLK or CD19 are controlled by AML with differing MLL gene rearrangements, global gene PAX5. As visualized in the respective biological networks, these expression patterns of t(9;11)/MLL-AF9 intercalated with other and other B-lineage characteristic candidates (CD79A, VPREB1, AML with t(11q23)/MLL cases. This transcriptional concordance CD22) were grouped together, all with higher expression in MLL is an unexpected result. However, it would correlate with the gene rearranged ALL compared to AML samples. Interestingly, observation of comparable clinical outcome in those subset of not only PAX5 but also EBF a second essential regulator of early AML patients.3 B-cell development was higher expressed in ALL with t(11q23)/ We failed at identifying clearly differentially expressed genes MLL. Specific activities of these proteins include roles in when six different MLL partner genes, that is, MLL-AF9, MLL- chromatin remodeling and recruitment of partner proteins.47 AF6, MLL-AF10, and MLL-ELL in AML and MLL-AF4 as well as

Leukemia Gene expression profiling in t(11q23)/MLL acute leukemias A Kohlmann et al 962 MLL-ENL in ALL, respectively, were examined. At this step no comparable to the multipotent progenitor (MPP), the direct statistically significant expression signatures were found to be progeny of short-term HSC. When injected into sublethally specifically correlated with one of the distinct partner genes irradiated mice, the biphenotypic progenitors sequentially (additional data is given in the Supplementary online docu- further differentiated along the myeloid or lymphoid lineages ment). This also explains the failure of predicting the respective and induced AML or ALL, respectively. partner gene based on differential gene expression signatures In conclusion, our results underline that AML with t(11q23)/ using SVMs as classification algorithm. Given the presented MLL and ALL with t(11q23)/MLL are distinct entities as proposed data, the global gene expression profile analysis does not reveal in the current WHO classification of hematological malignan- a clear distinct pattern associated with one of the various partner cies.63 Both subtypes share a distinct gene expression signature genes in t(11q23)/MLL leukemias. but on the other hand vary substantially in the expression of Further experiments are required to investigate why most of genes determining the lymphoid or myeloid lineage. While a the MLL partner genes are strictly correlated with a specific clear gene expression pattern with respect to the lineage was leukemia subtype. Gene expression is determined not only by identified, a specific signature associated with the different MLL the available combination of transcription factors but also by the partner genes was not observed. Microarray technology structure of the local chromatin, which is the physiological demonstrated that based on a cohort of thoroughly character- substrate for all nuclear processes including transcription and ized leukemia samples, expression signatures lead to a better recombination.46 Therefore, it can be speculated that at the time understanding of biological features of these specific acute point of the chromosomal aberration, the hematopoietic leukemia subtypes. Novel networks of candidate genes were progenitor target cell already is committed to a myeloid or depicted and may inspire follow-up studies to elucidate the lymphoid lineage development. Given the differing chromatin events leading to these types of prognostically unfavorable acute structure and its accessibility to regulatory factors, thus only leukemias and may be exploited to identify new therapeutic certain genes would be suitable as fusion partner, for example, targets. AF4 in lymphoblastic, or AF9 in myeloid leukemias. On the other hand, if the progenitor target cell is not committed to a particular lineage, the fusion partner might be able to contribute Acknowledgements to cell-fate decisions. Then, the different MLL fusion proteins would dictate the respective differentiation pathway by facil- We thank Sonja Rauhut for excellent technical assistance and Dr itating the establishment of lineage-specific gene expression Sylvia Merk for help with unsupervised data analyses. This study is programs. In the gene expression patterns described here, a supported in part by German Jose´ Carreras Foundation; Grant strong association of lymphoid commitment in ALL with Number: DJCS-R00/13; Roche Diagnostics GmbH, ICCU, Penz- t(11q23)/MLL was observed. The coexpression of PAX5, the berg, Germany (all to TH). critical B-lineage commitment factor that restricts the develop- mental options of early progenitors to the B-cell pathway, and Supplementary Information early B-cell factor EBF in these samples suggests that the leukemogenic hit did occur in the earliest phase of B- Supplementary Information accompanies the paper on the lymphopoiesis. Leukemia website (http://www.nature.com/leu). Typically, the resultant t(11q23)/MLL leukemias display features of a maturation arrest at a later stage of differentiation. This has particularly been described by Cozzio et al.59 In this References model, purified progenitor subsets, that is, hematopoietic stem cells (HSC), common myeloid progenitors (CMP), and the lineal 1 Biondi A, Cimino G, Pieters R, Pui CH. Biological and therapeutic descendent granulocytic/monocytic-restricted progenitors aspects of infant leukemia. Blood 2000; 96: 24–33. (GMP) were susceptible to MLL fusion protein-mediated 2 Pui CH, Relling MV, Downing JR. Acute lymphoblastic leukemia. transformation. Regardless of the initiating cell, targeted by an N Engl J Med 2004; 350: 1535–1548. 3 Schoch C, Schnittger S, Klaus M, Kern W, Hiddemann W, MLL-ENL construct, the resultant leukemias displayed immuno- Haferlach T. AML with 11q23/MLL abnormalities as defined by phenotypes and gene expression profiles characteristic of the WHO classification: incidence, partner , FAB maturation arrest at an identical late state of myelomonocytic subtype, age distribution, and prognostic impact in an unselected differentiation downstream of the GMP. series of 1897 cytogenetically analyzed AML cases. Blood 2003; In human leukemias, MLL-ENL occurs in both AML and ALL. 102: 2395–2402. Interestingly, in our cohort, myeloid and lymphoblastic gene 4 Huret JL, Dessen P, Bernheim A. An atlas of chromosomes in hematological malignancies. Example: 11q23 and MLL partners. expression profiles of MLL-ENL samples were separated. The Leukemia 2001; 15: 987–989. t(11;19)(q23;p13.1) chromosomal translocation fuses the gene 5 Schnittger S, Kinkelin U, Schoch C, Heinecke A, Haase D, encoding transcriptional elongation factor ENL to the MLL Haferlach T et al. Screening for MLL tandem duplication in 387 gene.60 Recent data indicate that neoplastic transformation by unselected patients with AML identify a prognostically unfavorable the MLL-ENL fusion protein is likely to result from aberrant subset of AML. Leukemia 2000; 14: 796–804. transcriptional activation of MLL target genes.61 This finding 6 Ernst P, Wang J, Korsmeyer SJ. The role of MLL in hematopoiesis and leukemia. Curr Opin Hematol 2002; 9: 282–287. would further support the model of ‘lineage promiscuity’, a 7 Ayton PM, Cleary ML. Molecular mechanisms of leukemo- mechanism described for mixed lineage leukemias in the genesis mediated by MLL fusion proteins. Oncogene 2001; 20: context of MLL-GAS7.62 In their study, So et al had used a 5695–5707. retroviral MLL-GAS7 construct to model acute biphenotypic 8 So CW, Cleary ML. Dimerization: a versatile switch for oncogen- leukemia (ABL) in mice. Cells that were transformed in vitro esis. Blood 2004; 104: 919–921. were able to induce three different leukemias in vivo, that is, 9 Collins EC, Rabbitts TH. The promiscuous MLL gene links chromosomal translocations to cellular differentiation and tumour AML, ALL, and ABL, which also exhibited distinct gene tropism. Trends Mol Med 2002; 8: 436–442. expression profiles for a selection of transcripts. The progenitor 10 Lipshutz RJ, Fodor SP, Gingeras TR, Lockhart DJ. High density cells affected by the MLL oncogene were phenotypically most synthetic oligonucleotide arrays. Nat Genet 1999; 21: 20–24.

Leukemia Gene expression profiling in t(11q23)/MLL acute leukemias A Kohlmann et al 963 11 Slonim DK. From patterns to pathways: gene expression data 34 Rozovskaia T, Feinstein E, Mor O, Foa R, Blechman J, Nakamura T analysis comes of age. Nat Genet 2002; 32 (Suppl.): 502–508. et al. Upregulation of Meis1 and HoxA9 in acute lymphocytic 12 Dugas M, Schoch C, Schnittger S, Haferlach T, Danhauser-Riedl S, leukemias with the t(4:11) abnormality. Oncogene 2001; 20: Hiddemann W et al. A comprehensive leukemia database: 874–878. integration of cytogenetics, molecular genetics and microarray 35 Thorsteinsdottir U, Kroon E, Jerome L, Blasi F, Sauvageau G. data with clinical information, cytomorphology and immunophe- Defining roles for HOX and MEIS1 genes in induction of acute notyping. Leukemia 2001; 15: 1805–1810. myeloid leukemia. Mol Cell Biol 2001; 21: 224–234. 13 Kern W, Kohlmann A, Wuchter C, Schnittger S, Schoch C, 36 Ito Y. Oncogenic potential of the RUNX gene family: ’overview’. Mergenthaler S et al. Correlation of protein expression and gene Oncogene 2004; 23: 4198–4208. expression in acute leukemia. Cytometry 2003; 55B: 29–36. 37 Stewart M, Terry A, Hu M, O’Hara M, Blyth K, Baxter E 14 Kohlmann A, Schoch C, Schnittger S, Dugas M, Hiddemann W, et al. Proviral insertions induce the expression of bone-specific Kern W et al. Molecular characterization of acute leukemias by isoforms of PEBP2alphaA (CBFA1): evidence for a new use of microarray technology. Genes Chromosomes Cancer 2003; myc collaborating oncogene. Proc Natl Acad Sci USA 1997; 94: 37: 396–405. 8646–8651. 15 Kohlmann A, Schoch C, Schnittger S, Dugas M, Hiddemann W, 38 Hyun TS, Ross TS. HIP1: trafficking roles and regulation of Kern W et al. Pediatric acute lymphoblastic leukemia (ALL) gene tumorigenesis. Trends Mol Med 2004; 10: 194–199. expression signatures classify an independent cohort of adult ALL 39 Saitoh T, Mine T, Katoh M. Molecular cloning and expression of patients. Leukemia 2004; 18: 63–71. proto-oncogene FRAT1 in human cancer. Int J Oncol 2002; 20: 16 Schoch C, Kohlmann A, Schnittger S, Brors B, Dugas M, 785–789. Mergenthaler S et al. Acute myeloid leukemias with reciprocal 40 Kim NG, Rhee H, Li LS, Kim H, Lee JS, Kim JH et al. Identification rearrangements can be distinguished by specific gene expression of MARCKS, FLJ11383 and TAF1B as putative novel target genes in profiles. Proc Natl Acad Sci USA 2002; 99: 10008–10013. colorectal carcinomas with microsatellite instability. Oncogene 17 Yeang CH, Ramaswamy S, Tamayo P, Mukherjee S, Rifkin RM, 2002; 21: 5081–5087. Angelo M et al. Molecular classification of multiple tumor types. 41 Miyoshi A, Kitajima Y, Sumi K, Sato K, Hagiwara A, Koga Y et al. Bioinformatics 2001; 17 (Suppl 1): S316–S322. Snail and SIP1 increase cancer invasion by upregulating MMP 18 Storey JD, Tibshirani R. Statistical significance for genomewide family in hepatocellular carcinoma cells. Br J Cancer 2004; 90: studies. Proc Natl Acad Sci USA 2003; 100: 9440–9445. 1265–1273. 19 Vapnik V. Statistical Learning Theory. New York: Wiley, 1998. 42 Almasan A, Ashkenazi A. Apo2L/TRAIL: apoptosis signaling, 20 Furey TS, Cristianini N, Duffy N, Bednarski DW, Schummer M, biology, and potential for cancer therapy. Cytokine Growth Factor Haussler D. Support vector machine classification and validation Rev 2003; 14: 337–348. of cancer tissue samples using microarray expression data. 43 ten Dijke P, Goumans MJ, Itoh F, Itoh S. Regulation Bioinformatics 2000; 16: 906–914. of cell proliferation by Smad proteins. J Cell Physiol 2002; 191: 21 Brown MP, Grundy WN, Lin D, Cristianini N, Sugnet CW, Furey 1–16. TS et al. Knowledge-based analysis of microarray gene expression 44 Gilliland DG, Griffin JD. The roles of FLT3 in hematopoiesis and data by using support vector machines. Proc Natl Acad Sci USA leukemia. Blood 2002; 100: 1532–1542. 2000; 97: 262–267. 45 Armstrong SA, Kung AL, Mabon ME, Silverman LB, Stam RW, Den 22 Tusher VG, Tibshirani R, Chu G. Significance analysis of Boer ML et al. Inhibition of FLT3 in MLL. Validation of a microarrays applied to the ionizing radiation response. Proc Natl therapeutic target identified by gene expression based classifica- Acad Sci USA 2001; 98: 5116–5121. tion. Cancer Cell 2003; 3: 173–183. 23 Ebert BL, Golub TR. Genomic approaches to hematologic 46 Busslinger M. Transcriptional control of early B cell development1. malignancies. Blood 2004; 104: 923–932. Annu Rev Immunol 2004; 22: 55–79. 24 Grimwade D, Haferlach T. Gene-expression profiling in acute 47 Maier H, Hagman J. Roles of EBF and Pax-5 in B lineage myeloid leukemia. N Engl J Med 2004; 350: 1676–1678. commitment and development. Semin Immunol 2002; 14: 25 Haferlach T, Kohlmann A, Kern W, Hiddemann W, Schnittger S, 415–422. Schoch C. Gene expression profiling as a tool for the diagnosis of 48 Pirrotta V. Polycombing the genome: PcG, trxG, and chromatin acute leukemias. Semin Hematol 2003; 40: 281–295. silencing. Cell 1998; 93: 333–336. 26 Armstrong SA, Staunton JE, Silverman LB, Pieters R, Den Boer ML, 49 Satijn DP, Olson DJ, van der Vlag J, Hamer KM, Lambrechts C, Minden MD et al. MLL translocations specify a distinct gene Masselink H et al. Interference with the expression of a novel expression profile that distinguishes a unique leukemia. Nat Genet human polycomb protein, hPc2, results in cellular transformation 2002; 30: 41–47. and apoptosis. Mol Cell Biol 1997; 17: 6076–6086. 27 Bullinger L, Dohner K, Bair E, Frohling S, Schlenk RF, Tibshirani R 50 Kummalue T, Friedman AD. Cross-talk between regulators et al. Use of gene-expression profiling to identify prognostic of myeloid development: C/EBPalpha binds and activates subclasses in adult acute myeloid leukemia. N Engl J Med 2004; the promoter of the PU.1 gene. J Leukoc Biol 2003; 74: 350: 1605–1616. 464–470. 28 Ross ME, Zhou X, Song G, Shurtleff SA, Girtman K, Williams WK 51 Richards MK, Liu F, Iwasaki H, Akashi K, Link DC. Pivotal role of et al. Classification of pediatric acute lymphoblastic leukemia by granulocyte colony-stimulating factor in the development of gene expression profiling. Blood 2003; 102: 2951–2959. progenitors in the common myeloid pathway. Blood 2003; 102: 29 Ross ME, Mahfouz R, Onciu M, Liu HC, Zhou X, Song G et al. 3562–3568. Gene expression profiling of pediatric acute myelogenous 52 Iwasaki-Arai J, Iwasaki H, Miyamoto T, Watanabe S, Akashi K. leukemia. Blood 2004; 104: 3679–3687. Enforced granulocyte/macrophage colony-stimulating factor sig- 30 Rozovskaia T, Ravid-Amir O, Tillib S, Getz G, Feinstein E, Agrawal nals do not support lymphopoiesis, but instruct lymphoid H et al. Expression profiles of acute lymphoblastic and myelo- to myelomonocytic lineage conversion. J Exp Med 2003; 197: blastic leukemias with ALL-1 rearrangements. Proc Natl Acad Sci 1311–1322. USA 2003; 100: 7853–7858. 53 Sangrar W, Gao Y, Zirngibl RA, Scott ML, Greer PA. The fps/fes 31 Valk PJ, Verhaak RG, Beijen MA, Erpelinck CA, Barjesteh van proto-oncogene regulates hematopoietic lineage output. Exp Waalwijk van Doorn-Khosrovani, Boer JM et al. Prognostically Hematol 2003; 31: 1259–1267. useful gene-expression profiles in acute myeloid leukemia. N Engl 54 Kim J, Ogata Y, Feldman RA. Fes tyrosine kinase promotes survival J Med 2004; 350: 1617–1628. and terminal granulocyte differentiation of factor-dependent 32 Yeoh EJ, Ross ME, Shurtleff SA, Williams WK, Patel D, Mahfouz R myeloid progenitors (32D) and activates lineage-specific transcrip- et al. Classification, subtype discovery, and prediction of outcome tion factors. J Biol Chem 2003; 278: 14978–14984. in pediatric acute lymphoblastic leukemia by gene expression 55 Cousar JB, Briggs RC. Expression of human myeloid cell nuclear profiling. Cancer Cell 2002; 1: 133–143. differentiation antigen (MNDA) in acute leukemias. Leuk Res 33 Kumar AR, Hudson WA, Chen W, Nishiuchi R, Yao Q, Kersey JH. 1990; 14: 915–920. Hoxa9 influences the phenotype but not the incidence of Mll-AF9 56 Asefa B, Klarmann KD, Copeland NG, Gilbert DJ, Jenkins NA, fusion gene leukemia. Blood 2004; 103: 1823–1828. Keller JR. The interferon-inducible p200 family of proteins: a

Leukemia Gene expression profiling in t(11q23)/MLL acute leukemias A Kohlmann et al 964 perspective on their roles in cell cycle regulation and differentia- 60 Rubnitz JE, Morrissey J, Savage PA, Cleary ML. ENL, the gene fused tion. Blood Cells Mol Dis 2004; 32: 155–167. with HRX in t(11;19) leukemias, encodes a nuclear protein with 57 Braganca J, Swingler T, Marques FI, Jones T, Eloranta JJ, Hurst HC transcriptional activation potential in lymphoid and myeloid cells. et al. Human CREB-binding protein/p300-interacting transactivator Blood 1994; 84: 1747–1752. with ED-rich tail (CITED) 4, a new member of the CITED family, 61 Zeisig BB, Milne T, Garcia-Cuellar MP, Schreiner S, Martin ME, functions as a co-activator for transcription factor AP-2. J Biol Fuchs U et al. Hoxa9 and Meis1 are key targets for MLL-ENL- Chem 2002; 277: 8559–8565. mediated cellular immortalization. Mol Cell Biol 2004; 24: 58 Yahata T, Takedatsu H, Dunwoodie SL, Braganca J, Swingler T, 617–628. Withington SL et al. Cloning of mouse Cited4, a member of the 62 So CW, Karsunky H, Passegue E, Cozzio A, Weissman IL, Cleary CITED family p300/CBP-binding transcriptional coactivators: ML. MLL-GAS7 transforms multipotent hematopoietic progenitors induced expression in mammary epithelial cells. Genomics and induces mixed lineage leukemias in mice. Cancer Cell 2003; 2002; 80: 601–613. 3: 161–171. 59 Cozzio A, Passegue E, Ayton PM, Karsunky H, Cleary ML, 63 Jaffe ES, Harris NL, Stein H, Vardiman JW. World Health Weissman IL. Similar MLL-associated leukemias arising from self- Organization Classification of Tumours. Pathology and Genetics renewing stem cells and short-lived myeloid progenitors. Genes of Tumours of Haematopoietic and Lymphoid Tissues. Lyon, IARC Dev 2003; 17: 3029–3035. Press, 2001.

Leukemia