Leukemia (2011) 25, 671–680 © 2011 Macmillan Publishers Limited All rights reserved 0887-6924/11 www.nature.com/leu ORIGINAL ARTICLE

Targeted next-generation sequencing detects point mutations, insertions, deletions and balanced chromosomal rearrangements as well as identifies novel leukemia-specific fusion genes in a single procedure

V Grossmann1,3, A Kohlmann1,3, H-U Klein2, S Schindela1, S Schnittger1, F Dicker1, M Dugas2, W Kern1, T Haferlach1 and C Haferlach1

1MLL Munich Leukemia Laboratory, Munich, Germany and 2Department of Medical Informatics and Biomathematics, University of Münster, Münster, Germany

DNA sequence enrichment from complex genomic samples using microarrays enables targeted next-generation sequencing (NGS). In this study, we combined 454 shotgun pyrosequencing with long oligonucleotide sequence capture arrays. We demonstrate the detection of mutations including point mutations, deletions and insertions in a cohort of 22 patients presenting with acute leukemias and myeloid neoplasms. Importantly, this one-step methodological procedure also allowed the detection of balanced chromosomal aberrations, including translocations and inversions. Moreover, the genomic representation of only one of the partner genes of a chimeric fusion on the capture platform also permitted identification of the novel fusion partner genes. Using acute myeloid leukemias harboring RUNX1 abnormalities as a model system, three novel chromosomal fusion sequences and KCNMA1 as a novel RUNX1 fusion partner were detected. This assay has the strong potential to become an important method for the comprehensive genetic characterization of particular leukemias and other malignancies harboring complex genomes.

Leukemia (2011) 25, 671–680; doi:10.1038/leu.2010.309; published online 21 January 2011

Keywords: targeted next-generation sequencing; balanced chromosomal rearrangements; fusion genes

Correspondence: Dr C Haferlach, MLL Munich Leukemia Laboratory, Max-Lebsche-Platz 31, Munich 81377, Germany. E-mail: [email protected]
3These authors contributed equally to this work.
Received 21 May 2010; revised 16 September 2010; accepted 15 November 2010; published online 21 January 2011

Introduction

DNA sequence enrichment from complex genomic samples has been proposed to enable a targeted next-generation sequencing (NGS) workflow. Several methods for massively parallel enrichment of the sequencing templates exist. Hybridization to customized microarrays containing synthetic oligonucleotides that match the target sequence allows capturing templates from randomly sheared, adaptor-ligated genomic DNA with high specificity.1 Other methods are based on biotinylated RNA capture probes to capture size-selected genomic DNA in solution2 or allow a simultaneous amplification of up to 4000 targeted sequences using microdroplet technology.3

Today, the genetic characterization necessary for optimal treatment of leukemias requires a combination of different labor-intensive methods, such as chromosome banding analysis and fluorescence in situ hybridization (FISH).4 Characteristic leukemia-specific fusion genes, detected by reverse transcriptase-PCR (RT-PCR) or direct sequencing, not only allow stratification of patients into distinct prognostic risk groups,5,6 but also serve as molecular markers to monitor minimal residual disease.7–9

In this study, we combined 454 PicoTiterPlate (PTP) pyrosequencing with long oligonucleotide sequence capture arrays to evaluate whether this technique permits a comprehensive genetic characterization of a cancer genome. In particular, we addressed the question whether this combination of methods would detect not only point mutations, as well as deletions and insertions, but also capture target sequences that would reveal balanced chromosomal aberrations in a one-step procedure. This principle is proven by investigating the complex recombinome of leukemias harboring alterations in RUNX1,10–13 PDGFRB14 and MLL.15 At present, 32 chromosomal partner regions have been described for RUNX1 translocations, but the corresponding partner gene has only been identified in 17 translocations.12 Similarly, 104 partner regions have been described for the MLL gene, but only 64 partner genes are molecularly characterized.15

In this study, we further demonstrate that the contiguous genomic representation of only one of the partner genes of a chimeric fusion on the capture microarray assay was sufficient to identify also any potentially unknown partner gene from a balanced chromosomal aberration by subsequent shotgun sequencing.

Patients and methods

Patient cohort

In this study, we analyzed 19 acute leukemia cases (16 acute myeloid leukemias (AMLs), 3 acute lymphoblastic leukemias) and 3 patients with a myeloproliferative neoplasm sent to the MLL Munich Leukemia Laboratory for diagnostic procedures between October 2005 and September 2008 (Table 1). All samples in this study were obtained from untreated patients at the time of diagnosis. The study design adhered to the tenets of the Declaration of Helsinki and was approved by the institutional review board before its initiation. Patients were diagnosed using cytomorphology,16 banding analysis,17 FISH,18 molecular genetics and flow cytometry.

Targeted next-generation sequencing in leukemia V Grossmann et al 672

Table 1 Patients with chromosomal abnormalities

Case | Diagnosis | Fusion genes | Gender | Sample type | Cytogenetic correlate of fusions
N01 | AML M4eo | CBFB–MYH11 [a] | Female | Bone marrow | inv(16)(p13q22)
N03 | AML M5a | MLL–MLLT3 (AF9) [a] | Female | Bone marrow | t(9;11)(p22;q23)
N04 | AML M2 | RUNX1–RUNX1T1 [a] | Male | Bone marrow | t(8;21)(q22;q22)
N05 | AML M5b | MLL–ELL, –SFRS14 | Female | Bone marrow | t(11;19)(q23;p13)
N14 | AML | MLL–MLLT10 (AF10) | Male | Bone marrow | t(10;11)(p12;q23)
N16 | AML M4 | MLL–MLLT6 (AF17) | Female | Bone marrow | t(11;17)(q23;q12)
N17 | AML M5a | MLL–MLLT10 (AF10) | Male | Bone marrow | der(10)t(10;11)(p12;q22)inv(11)(q22q23), der(11)t(10;11)(p12;q22)
N20 | t-AML M5a | MLL–MLLT1 (ENL) | Male | Bone marrow | t(11;19)(q23;p13.3)
N21 | Pro-B-ALL | MLL–AFF1 (AF4) | Female | Bone marrow | t(4;11)(q21;q23)
N38 | AML M5a | MLL–MLLT10 (AF10) [a] | Male | Bone marrow | der(10)t(10;11)(p12;q11)inv(11)(q11q23), der(11)t(10;11)(p12;q23)
N39 | AML M5b | MLL–MLLT4 (AF6) [a] | Male | Bone marrow | t(6;11)(q27;q23)
N40 | t-Pro-B-ALL | MLL–AFF1 (AF4) [a] | Female | Bone marrow | t(4;11)(q21;q23)
N41 | t-AML | MLL–ELL [a] | Female | Bone marrow | t(11;19)(q23;p13.1)
N42 | AML M0 | MLL–MLLT1 (ENL) [a] | Female | Peripheral blood | t(11;19)(q23;p13.3)
N27 | AML M1 | RUNX1–chr. 17 | Female | Bone marrow | t(7;11;17;21)(p22;q11;q21;q22)
N28 | AML | RUNX1–KCNMA1 | Male | Peripheral blood | t(10;21)(q22;q22)
N29 | CMML-2 | RUNX1–chr. 5 | Male | Peripheral blood | t(5;21)(q11;q22)
N30 | AML M3v | RUNX1–chr. 10 | Female | Bone marrow | t(10;21)(q21;q22)
N33 | c-ALL | ETV6–RUNX1 [a] | Female | Bone marrow | t(12;21)(p13;q22)
N36 | HES/CEL | PDGFRB–DTD1 [a] | Male | Peripheral blood | t(5;20)(q33;p12)
N37 | HES | PDGFRB–MYO18A [a] | Male | Bone marrow | t(5;17)(q33;q11.2)

Abbreviations: AML, acute myeloid leukemia; CEL, chronic eosinophilic leukemia; CMML, chronic myelomonocytic leukemia; HES, hypereosinophilic syndrome.
[a] Known from standard routine procedures or previous observations.
Note: case N06 not displayed (AML with a normal karyotype).

Molecular genetics

Standard mutational analysis by Sanger sequencing was performed on the purified fraction of mononuclear cells after Ficoll density centrifugation. Isolation of mononuclear cells, genomic DNA (QIAamp DNA Mini kit, Qiagen, Hilden, Germany), mRNA (MagNA Pure LC mRNA HS Kit,

Roche Applied Science, Penzberg, Germany) and random primed complementary DNA synthesis was performed as described previously.19 The analysis of KRAS mutations in codons 12, 13 and 61 was carried out as previously described.20 Exons 2 and 3 were amplified by PCR, and PCR products were analyzed using the BigDye terminator v1.1 cycle sequencing kit (Applied Biosystems, Darmstadt, Germany). Analyses for NPM1 mutations, FLT3 internal tandem duplications, FLT3 tyrosine kinase domain mutations, KIT D816 mutations and KIT exon 8 mutations were performed as described previously.21–25

Targeted sequence capture microarray assay

Briefly, 20 μg genomic DNA was fragmented by nebulization to small sizes of 300–500 bp to generate blunt-ended fragments. The DNA was quantified and the fragment size population was assessed by electrophoresis (Agilent Bioanalyzer 2100 DNA Chip 7500, Agilent, Böblingen, Germany). The fragmented DNA was then processed according to the recommended NimbleGen protocol (Roche Applied Science, User Guide 3.1; July 2008). In brief, linkers were ligated to the polished fragments in the library to provide a priming site for post-enrichment amplification of the eluted fragment pool. The linker-terminated fragments were then denatured to produce single-stranded products that were exposed to the sequence capture microarray for hybridization for 72 h at 42 °C with active mixing using a MAUI hybridization station with mix mode B (Roche Applied Science). Any unbound DNA fragments were removed from the microarrays under stringent washing conditions and rinsed with Wash Buffers I, II and III (Roche Applied Science). Fragments captured on the microarrays were eluted with 125 mM NaOH and processed for amplification by ligation-mediated PCR using a primer complementary to the previously ligated linker oligonucleotides. The online section contains detailed information on the quality assessment of the different library preparation steps (Supplementary Figure 1, Supplementary Table S1).

To quantify enrichment of the genomic DNA, four regions were selected for quantitative PCR measuring SYBR green fluorescence according to the manufacturer's protocols (Supplementary Figure 2). The enriched and ligation-mediated PCR-amplified samples were compared against the non-enriched and ligation-mediated PCR-amplified samples, that is, not hybridized to a capture array, using a LightCycler LC480 real-time PCR system (Roche Applied Science).

Microarray designs

A high-density oligonucleotide microarray representing capture probes covering 1.91 Mb of genomic sequences was synthesized according to a standard microarray manufacturing protocol (Roche NimbleGen 385K format). Overlapping microarray probes of more than 60 bases each on the array spanned each target genome region, with a probe positioned every 10 bases for the forward strand of the genome.

A first array captured short segments corresponding to all exon regions of 92 distinct target genes (genome build hg18). The genes had been selected according to their relevance in leukemia and included, for example, KIT, NF1, KRAS, CEBPA, NPM1, FLT3, IKZF1 or TP53 (in total: 1559 exons). In addition, contiguous genomic regions were represented for three additional genes, that is, CBFB, MLL and RUNX1. On the 385K chip, 96% of bases were covered by the design (Supplementary Spreadsheet 1).

Another series of three array designs represented a contiguous genomic region of a single target gene only (MLL, RUNX1 and PDGFRB). The RUNX1 gene was covered according to the following start and end coordinates: Chr. 21; 36 160 052–36 421 677 (261.5 kb; hg19 assembly). On the 385K chip, 97.6% of bases were covered by the design and RUNX1 was represented by capture probes with a 19-fold coverage on the capture microarray. The MLL gene was covered with the start and end coordinates: Chr. 11; 117 812 370–117 901 177 (88.7 kb; hg18 assembly).26 On the 385K chip, 92.1% of bases were covered by the design and MLL was represented by capture probes with a 56-fold coverage on the capture microarray. The PDGFRB gene was covered with the following start and end coordinates: Chr. 5; 149 493 355–149 535 460 (40.1 kb; hg19 assembly). On the 385K chip, 98.9% of bases were covered by the design and PDGFRB was represented by capture probes with a 125-fold coverage on the capture microarray (Supplementary Figure 3–5).

Next-generation pyrosequencing

We applied NGS technology using 454 FLX Titanium chemistry according to the manufacturer's protocols (Roche Applied Science).27 Sequencing-compatible linkers, including molecular barcode identifiers, were ligated to the eluted samples from the capture microarrays (Supplementary Table S2). The libraries were subsequently diluted, amplified on beads using emulsion PCR and sequenced using the 454 FLX sequencing instrument. One run comprised three patients per lane on a two-lane PicoTiterPlate.

Sequencing data analysis

Basic raw data analysis was carried out using the GS Run Browser and GS Reference Mapper software version 2.0.01 (Roche Applied Science). Following in silico removal of the linker sequence, each sequence read was compared with the entire appropriate version of the human genome. Captured sequences that mapped uniquely back to regions within the target regions were considered sequencing hits. These were then used to calculate the percentage of reads that did hit target regions, and the fold sequencing coverage for the entire target region. All putative variances were first compared with published single-nucleotide polymorphism (SNP) data (dbSNP build 130; http://www.ncbi.nlm.nih.gov/projects/SNP). Sanger sequencing and melting curve analyses were used independently of and concurrently with NGS sequencing to compare sequencing results in all patients.

Detection of chromosomal translocations and inversions

A specific analysis pipeline was developed for the detection of translocations and inversions. After image and signal processing, the reads were demultiplexed and linker sequences were removed. We applied the Burrows–Wheeler Aligner's Smith–Waterman algorithm28 for mapping the trimmed reads against the reference genome. The Burrows–Wheeler Aligner's Smith–Waterman algorithm aligns chimeric reads and reports alignments largely non-overlapping on the query sequence, so that fewer alignment results enter the post-processing stage. To filter out any alignments of small interest with respect to translocations and inversions, we successively applied the following filtering steps:
(Step 1) extract reads that have exactly two local alignments;
(Step 2) extract reads that have at least one local alignment within the target region;
(Step 3) remove reads with a linker sequence between both local alignments;
(Step 4) remove reads whose local alignments are on the same chromosome with a distance <1000 bp (these reads indicate smaller deletions or insertions, which are not the focus of this pipeline but of the GS Mapper software);
(Step 5) remove duplicated reads. Duplicated reads share the same 5′ starting position and are likely caused by sequencing several PCR copies of the same original sequence.29

Subsequently, the obtained reads were hierarchically clustered based on the Euclidean distance of their break points in base pairs. The distance between two break points from different chromosomes was set to infinity. Each remaining cluster represented a putative translocation or inversion. Most clusters consisted of a single read that was probably artificially generated during the sample preparation steps.30 Detailed statistics summarizing dominant clusters and the subsequent interpretation of fusion gene events are available online.

Results

Sequence capture enrichment performance and NGS raw data

Performance of the capture-enrichment process was assessed by quantitative PCR assays. These assays acted as a proxy for estimating the enrichment of larger populations of capture nucleic acid targets. Supplementary Spreadsheet 2 lists the individual median enrichment values of the sequence capture assay and corresponding array designs.

Performance of the actual sequencing process was assessed by several parameters (Supplementary Spreadsheet 2). In median, 168 230 reads were generated per patient (range 34 651–254 299). The median sequence length of these high-quality reads was 324 per patient, and ranged from 214 to 384 bases. A median of 91.3% of reads mapped back uniquely to the genome. Of those reads that mapped back to the reference sequence, a median of 63% of reads were on-target for the chip design no. 1. Designs that only included a contiguous capture region for one gene each demonstrated a lower on-target read percentage (5.1%). Differences were also observed with respect to the occurrence of chimeric reads, wherein a median on-target read percentage of 44.9% for design no. 1 and 2.9% for the three subdesigns were detected, respectively. The median per-base coverage for each sample was 19-fold.

Detection of point mutations, deletions and small insertions

First, we analyzed five AML cases, processed with the custom 1.91 Mb microarray design no. 1, and aimed at confirming well-known AML-typical mutations. In some of the samples tested, established mutations had been previously identified in routine diagnostics operations by conventional methods. As demonstrated in Supplementary Figure 6, this targeted sequencing assay was able to detect the known insertion (case N06: FLT3 internal tandem duplication with 63 bp length mutation), deletion (case N01: KIT D419del) and point mutation (case N02: FLT3 tyrosine kinase domain D835Y; case N01: KRAS G12C).

Further, the enrichment assay also allowed identifying novel non-synonymous aberrations. Using sequence capture information from all 95 genes analyzed, in median 1701 variants were detected (case N01: 1984; case N03: 1745; case N04: 1500; case N05: 1701; case N06: 944). As highlighted in Supplemental Table S3, after stringent filtering according to known variations and SNPs only few exonic non-synonymous variants remained: case N01: 11; case N03: 6; case N04: 8; case N05: 6; case N06: 1. The final list of non-synonymous candidates included genes such as TET2 or PTPN11 and is currently being further investigated.

Detection of a balanced chromosomal inversion

To test the performance of the capture assay to reveal any balanced chromosomal aberrations, the microarray design no. 1 was

used to capture fragmented genomic DNA from an AML patient sample (case N01) harboring an inv(16)(p13q22) aberration, as detected by cytogenetics and FISH. On a molecular level, the CBFB gene was fused to MYH11, as confirmed by RT-PCR. Applying a special data analysis pipeline, reads that did not uniquely map back to the reference genome were not discarded, but were analyzed further for the detection of chimeric sequences (Figure 1a). In this case, a total of 12 reads were observed to carry sequence information that mapped both to the CBFB gene and the MYH11 gene, that is, four CBFB–MYH11 reads and eight MYH11–CBFB reads, respectively (Figure 1b). These 12 chimeric reads had formed a dominant cluster in a background of 481 reads passing the analysis steps (Supplemental Spreadsheet 3). For example, sequence read no. 1 as given in Table 2 had a total length of 429 bases after removal of linker sequences. Of these, bases 1–222 mapped to chromosome 16 with start coordinates 15 814 968 and end coordinates 15 815 189, representative for MYH11 (intron 32–33, ENST00000300036). The remaining base sequences (223–429) were detected to represent the CBFB gene with chromosome 16 starting at 67 120 882 and ending at 67 121 088 (intron 5–6, ENST00000290858). In addition, this assay allowed a high-resolution fine mapping of the break points (Table 2 and Supplementary Figure 7). As such, the combination of a sequence capture platform followed by NGS revealed a specific chimeric fusion gene resulting from a balanced inversion.

Detection of balanced translocations

Second, we investigated whether this method would also reveal any balanced translocations. In all, 11 cases were analyzed, in which previous cytogenetic and FISH analyses revealed balanced aberrations, involving translocations of the MLL, RUNX1 and PDGFRB genes, respectively. On a molecular level, PCR analyses had confirmed the presence of corresponding fusion genes (Table 1). Case no. N40 is an example; a total of five reads was observed to carry sequence information that mapped both to the MLL and the AFF1 gene (Figure 2). As such, by targeted capturing of MLL sequences using the chip design no. 2, subsequent shotgun sequencing confirmed the fusion partner gene as AFF1. The ability of this principle was also confirmed for 10 additional cases, that is, seven MLL fusions, one RUNX1 fusion and two PDGFRB fusions, respectively (Table 1). Detailed summary statistics on the respective cluster size distribution for each fusion and the total number of clusters calculated from chimeric reads obtained for a given sample are available online (Supplemental Spreadsheet 3 and Supplementary Table S4).

Identification of unknown fusion partner genes

We next investigated nine patients with an aberrant karyotype, in whom FISH analyses had indicated balanced rearrangements of MLL and RUNX1 genes. First, we analyzed four AML cases, in which the capturing assay had detected fusions of MLL–MLLT1, MLL–MLLT6 and MLL–MLLT10 (Table 1). Although the molecular testing of these fusion genes had been previously performed by RT-PCR during the initial diagnostic procedure, no fusion event was detected in these cases. However, NGS sequencing had revealed the occurrence of unusual break points, which did not allow the routine primers to anneal and thereby to enable the detection of the corresponding fusions.

[Figure 1 graphic. (a) Workflow for the inv(16)(p13q22) case: GS image/signal processing, raw sequences, local alignment to the reference genome (BWA-SW algorithm), locally aligned sequences, extraction of chimeric reads. (b) Orientation of chimeric reads, for example read FZY3Q2K01D6AHY, with one region mapped to CBFB (chr. 16q, intron) and the other to MYH11 (chr. 16p, intron).]

Figure 1 Identification of chimeric sequencing reads. (a) After processing of the raw image data, locally aligned sequences were identified using the Burrows–Wheeler Aligner’s Smith–Waterman (BWA-SW) algorithm.28 The local alignment is performed against the hg19 reference genome. In this exemplary case harboring an inv(16)(p13q22), 12 chimeric reads were detected mapping to distinct arms of chromosome 16. (b) Schematic orientation details for chimeric reads mapping to CBFB and MYH11 genes on chromosome 16. For CBFB–MYH11 and MYH11–CBFB fusions corresponding chromosomal start and end coordinates are given.
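The five filtering steps that precede the extraction of chimeric reads can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the `LocalAlignment` and `Read` records, the `linker_between` flag (assumed to be set by an upstream linker search) and the use of the first aligned segment's strand-aware start as the read's 5′ position are all assumptions made for the example. Only the step order and the 1000 bp threshold come from the text.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end), inclusive


@dataclass(frozen=True)
class LocalAlignment:
    chrom: str
    start: int        # smaller reference coordinate
    end: int          # larger reference coordinate
    strand: str       # "+" or "-"
    query_start: int  # first read base covered by this alignment


@dataclass(frozen=True)
class Read:
    name: str
    alignments: Tuple[LocalAlignment, ...]
    linker_between: bool = False  # hypothetical flag from an upstream check


def in_target(aln: LocalAlignment, targets: Iterable[Region]) -> bool:
    """True if the alignment overlaps any target region."""
    return any(aln.chrom == c and aln.start <= e and aln.end >= s
               for c, s, e in targets)


def five_prime_key(aln: LocalAlignment) -> Tuple[str, str, int]:
    """Reference position of the read's first base (assumed definition)."""
    pos = aln.start if aln.strand == "+" else aln.end
    return (aln.chrom, aln.strand, pos)


def filter_chimeric_reads(reads: Iterable[Read],
                          targets: Iterable[Region],
                          min_distance: int = 1000) -> List[Read]:
    targets = list(targets)
    kept: List[Read] = []
    seen = set()
    for read in reads:
        # Step 1: exactly two local alignments
        if len(read.alignments) != 2:
            continue
        first, second = sorted(read.alignments, key=lambda a: a.query_start)
        # Step 2: at least one alignment within the target region
        if not (in_target(first, targets) or in_target(second, targets)):
            continue
        # Step 3: no linker sequence between the two alignments
        if read.linker_between:
            continue
        # Step 4: drop same-chromosome pairs closer than min_distance
        if first.chrom == second.chrom:
            gap = max(first.start, second.start) - min(first.end, second.end)
            if gap < min_distance:
                continue
        # Step 5: drop duplicates sharing the same 5' starting position
        key = five_prime_key(first)
        if key in seen:
            continue
        seen.add(key)
        kept.append(read)
    return kept
```

Because each step only removes reads, the filters could be applied in any order with the same result except for Step 5, where the first read seen at a given 5′ position is the one retained.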

Table 2 Chimeric reads identified for a case (N01) with inv(16)(p13q22)

Read name | Length | Chromosome | Start | End | Gene symbol | Break point
FZY3Q2K01AD64J | 429 | 16 | 67 120 882 | 67 121 088 | CBFB | 67 121 088
 | | 16 | 15 814 968 | 15 815 189 | MYH11 | 15 815 189
FZY3Q2K01AQO3E | 489 | 16 | 67 120 821 | 67 121 086 | CBFB | 67 121 086
 | | 16 | 15 814 967 | 15 815 191 | MYH11 | 15 815 191
FZY3Q2K01BTAQU | 421 | 16 | 67 120 881 | 67 121 087 | CBFB | 67 121 087
 | | 16 | 15 814 974 | 15 815 189 | MYH11 | 15 815 189
FZY3Q2K01DT2V8 | 482 | 16 | 67 120 731 | 67 121 089 | CBFB | 67 121 089
 | | 16 | 15 815 066 | 15 815 189 | MYH11 | 15 815 189
FZY3Q2K01A321U | 472 | 16 | 15 815 190 | 15 815 620 | MYH11 | 15 815 190
 | | 16 | 67 121 088 | 67 121 129 | CBFB | 67 121 088
FZY3Q2K01CCF3J | 443 | 16 | 15 815 190 | 15 815 436 | MYH11 | 15 815 190
 | | 16 | 67 121 088 | 67 121 284 | CBFB | 67 121 088
FZY3Q2K01D6AHY | 488 | 16 | 15 815 190 | 15 815 365 | MYH11 | 15 815 190
 | | 16 | 67 121 088 | 67 121 400 | CBFB | 67 121 088
FZY3Q2K01D7V12 | 391 | 16 | 15 815 190 | 15 815 527 | MYH11 | 15 815 190
 | | 16 | 67 121 088 | 67 121 142 | CBFB | 67 121 088
FZY3Q2K01DM7AL | 455 | 16 | 15 815 190 | 15 815 506 | MYH11 | 15 815 190
 | | 16 | 67 121 088 | 67 121 226 | CBFB | 67 121 088
FZY3Q2K01E0TAO | 469 | 16 | 15 815 190 | 15 815 482 | MYH11 | 15 815 190
 | | 16 | 67 121 088 | 67 121 265 | CBFB | 67 121 088
FZY3Q2K01EWQ5E | 496 | 16 | 15 815 190 | 15 815 465 | MYH11 | 15 815 190
 | | 16 | 67 121 088 | 67 121 300 | CBFB | 67 121 088
FZY3Q2K01BH5T5 | 306 | 16 | 15 815 190 | 15 815 351 | MYH11 | 15 815 190
 | | 16 | 67 121 088 | 67 121 233 | CBFB | 67 121 088

Note: CBFB, chr16: 67 063 050–67 134 956 plus strand; MYH11, chr16: 15 796 994–15 950 887 minus strand (hg19).
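The break points in Table 2 make a convenient worked example for the clustering step described in the methods (Euclidean distance between break points, infinite distance across chromosomes). The sketch below is an illustration under stated assumptions, not the authors' implementation: the linkage method and the cut-off are not given in the text, so single linkage with a hypothetical 10 bp cut-off is used, each read is treated as a pair of break points ordered by chromosome and position, and the extra background read is invented for the example. Cutting a single-linkage dendrogram at a height is equivalent to taking connected components of the graph that links reads within that distance, which is what the union-find loop does.

```python
import math
from typing import Dict, List, Tuple

Breakpoint = Tuple[str, int]               # (chromosome, position)
ReadBreaks = Tuple[Breakpoint, Breakpoint]  # two break points per chimeric read


def distance(p: ReadBreaks, q: ReadBreaks) -> float:
    """Euclidean distance in bp; infinite if the chromosomes differ."""
    (c1, x1), (c2, x2) = sorted(p)
    (d1, y1), (d2, y2) = sorted(q)
    if c1 != d1 or c2 != d2:
        return math.inf
    return math.hypot(x1 - y1, x2 - y2)


def cluster_breakpoints(reads: List[ReadBreaks],
                        cutoff: float) -> List[List[int]]:
    """Single-linkage clusters (read indices), largest cluster first."""
    parent = list(range(len(reads)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(reads)):
        for j in range(i + 1, len(reads)):
            if distance(reads[i], reads[j]) <= cutoff:
                parent[find(i)] = find(j)

    groups: Dict[int, List[int]] = {}
    for i in range(len(reads)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values(), key=len, reverse=True)


# Break points of the 12 chimeric reads listed in Table 2 (chr. 16, hg19).
TABLE2_READS: List[ReadBreaks] = [
    (("16", 67121088), ("16", 15815189)),
    (("16", 67121086), ("16", 15815191)),
    (("16", 67121087), ("16", 15815189)),
    (("16", 67121089), ("16", 15815189)),
] + [(("16", 15815190), ("16", 67121088))] * 8

# Hypothetical single background read, not taken from the paper.
BACKGROUND: ReadBreaks = (("16", 67100000), ("3", 5000000))

clusters = cluster_breakpoints(TABLE2_READS + [BACKGROUND], cutoff=10.0)
dominant_size = len(clusters[0])
```

Under these assumptions the twelve Table 2 reads fall into one cluster and the invented background read remains a singleton, mirroring the dominant-cluster pattern reported for case N01.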

[Figure 2 graphic. Orientation of five chimeric reads (F5KQRMM02I78IM, F5KQRMM02FP4SS, F5KQRMM02HL109, F5KQRMM02IC94C, F5KQRMM02JVLEY) spanning AFF1 (chr. 4) and MLL (chr. 11) on the plus and minus strands.]

Figure 2 Detailed distribution of chimeric sequencing reads. A total of five chimeric fusion sequences were observed for an acute lymphoblastic leukemia case N40 with t(4;11)(q21;q23). Two reads detected the MLL–AFF1 fusion and three reads detected the AFF1–MLL fusion, respectively.

Second, in another case, a translocation t(11;19)(q23;p13) had been observed in chromosome banding analysis and the involvement of the MLL gene had been proven by FISH. However, using RT-PCR neither MLL–MLLT1 (ENL) nor MLL–ELL fusion transcripts were amplified to confirm any molecular aberration. In contrast, the NGS method identified chimeric reads. Five reads were observed to carry sequence information that mapped to the MLL gene on the plus strand of chromosome 11 and additional partner genes on the minus strand of chromosome 19 (Figure 3a). Two sequence reads corresponded to an MLL–ELL fusion. In addition to MLL and ELL reads, three chimeric reads were detected which were composed of SFRS14 (splicing factor, arginine/serine-rich 14), also located on 19p13 centromeric of ELL. By applying RT-PCR and subsequent Sanger sequencing with new primers designed based on the NGS data we were able to confirm both the MLL–ELL fusion and the novel SFRS14–MLL fusion. However, we did not detect a corresponding ELL–MLL fusion gene. This suggested that a deletion had occurred in the break point area, and thus prevented the formation of the reciprocal ELL–MLL fusion gene. To confirm this assumption, we performed a SNP microarray analysis (Affymetrix genome-wide human SNP array 6.0, Santa Clara, CA, USA) and

[Figure 3 graphic. (a) Orientation of fusion sequences joining MLL to ELL and to SFRS14. (b) Copy number state plot of chr. 19p13 showing the deletion, ELL and SFRS14.]

Figure 3 Discovery of a cryptic MLL rearrangement. (a) Orientation of fusion sequences detected in case N05. For each of the genes involved in the fusions corresponding start and end coordinates are given (hg19). (b) Genome-wide SNP microarray analyses in case N05 highlighting the chromosomal region chr.19p13. A 615 kb region is detected including the ELL gene on the centromeric end. SFRS14 is the first candidate gene outside the deleted fragment. The SNP microarray software algorithm indicated a copy number change from State 3 to State 2, in line with the patient’s karyotype (Table 1).

[Figure 4 graphic. (a) Ideogram of break points for cases N27, N28, N29 and N30. (b) RUNX1 exon–intron structure (E1–E8, I1-2 to I7-8) with the Runt domain and TAD, and partner regions 5q13.3 (N29), 10q22.3 with exon 20 of KCNMA1 (N28), 10q22 (N30) and 17q21 (N27).]

Figure 4 Novel chromosomal translocations involving the RUNX1 gene. (a) The ideogram depicts the distribution of the break points in four distinct patients and the corresponding chromosomal regions. (b) In four cases, fusion events involving the RUNX1 gene were observed with distinct functional consequences. The RUNX1 genomic structure and respective domains are shown (ENST00000300305). In each case break points are indicated by black circles. Solid lines indicate a partially translated RUNX1 protein. Vertical dashed lines indicate the RUNX1 gene sequence fused to the 3′ end of a partner chromosome. Horizontal dashed lines indicate sequences derived from a fusion partner chromosome. The arrow indicates the break point observed for the t(8;21) translocation. TAD, transactivation domain.

Leukemia Targeted next-generation sequencing in leukemia V Grossmann et al 677 Table 3 Molecular parameters and secondary cytogenetic aberrations in four cases with novel RUNX1 translocations

N27 (RUNX1–chr. 17q) N28 (RUNX1–KCNMA1) N29 (RUNX1–chr. 5q) N30 (RUNX1–chr. 10)

FLT3-ITD wt wt wt
MLL-PTD wt wt wt
NPM1 wt wt wt
CEBPA wt wt wt wt
JAK2 wt wt wt
KIT D816 wt wt wt wt
NRAS Pos. (codon 12) wt wt
KRAS wt Pos. (codon 12) wt wt
CBL wt wt wt wt
IDH1 wt wt wt wt
PML–RARA Pos.
RUNX1–RUNX1T1 wt
−7 Pos.
+21 Pos.
Abbreviations: ITD, internal tandem duplication; Pos., positive; wt, wild type.

data from the SNP microarray demonstrated a 615 kb deletion on 19p13, flanked by ELL and SFRS14. As such, a microdeletion was causative for the fusion of SFRS14 to MLL in the reciprocal setting and could be deciphered by NGS (Figure 3b).

In another series of four cases (three AML, one CMML) RUNX1 abnormalities were observed by metaphase cytogenetics and FISH (Figure 4a). Data from a comprehensive molecular mutation screening are given in Table 3. In this study, capture and enrichment of RUNX1 DNA fragments followed by shotgun sequencing enabled the discovery of four novel distinct fusions (Table 1). In one case, KCNMA1 was fused to RUNX1. This gene, a potassium large conductance calcium-activated channel family member on chromosome 10q22.3, has been reported to be fundamental to the control of smooth muscle tone and neuronal excitability.31–33 In our patient the RUNX1–KCNMA1 fusion, as confirmed by RT-PCR and Sanger sequencing, led to the disruption of the 'Runt homology domain' (RHD) of RUNX1 (Figure 4b). The break point in RUNX1 was detected in intron 3–4 (ENST00000300305), whereas in t(8;21) the genomic break point is located in intron 5, after the RHD domain.34,35 The break point in the KCNMA1 gene, encoded by 27 exons, was located in intron 19–20 (ENST00000286627). Because of the frameshift in exon 20 in KCNMA1, the second codon is predicted to be translated into a stop codon. Therefore, this translocation results in a chimera consisting of a truncated RUNX1 gene fused to a single amino acid of KCNMA1 (Figure 4b).

In three cases, RUNX1 was observed to be fused to genomic regions on chromosomes 17q21, 5q13.3 and 10q22, respectively (Figure 4b). Interestingly, these fusion sequences would not connect RUNX1 to a known candidate gene, and consequently were not detectable on a transcript level. The RUNX1–10q22 and the reciprocal 10q22–RUNX1 fusion were confirmed by PCR from genomic DNA and subsequent Sanger sequencing (Supplementary Table S5). In the latter two cases, only the reciprocal fusion events (chr. 17q–RUNX1 and chr. 5q–RUNX1) were detectable.

Discussion

In this study, we described the specific capture and enrichment of target nucleic acids, and the subsequent analysis of the enriched target nucleic acids for detecting balanced chromosomal aberrations, including translocations and inversions, in a leukemia genome. In particular, we were able to demonstrate that the format of representing one fusion partner gene on a capturing platform allowed the subsequent sequencing of chimeric nucleic acids, that is, single DNA molecules mapping to different regions of a genome, thereby enabling the identification of novel fusion partner genes occurring as a result of a balanced chromosomal translocation.

For target enrichment, a programmable high-density microarray platform with 385 000 probes was used. The probes were able to capture up to 5 Mb of total sequence. Recently, this workflow was shown to also allow analysis of all human exons using a 2.1 million feature capture array.36 In our study, the capture-enrichment process targeted genes reported to have a pathogenetic role in hematological malignancies, for example, as part of molecular fusion genes resulting from translocations or inversions. Overall, the capture and enrichment process provided a useful purification of unique genomic sequences away from repeats and other impurities that would confound, for example, the first emulsion PCR step of the 454 sequencing process. Therefore, in addition to the specificity of the assay, the high yields of the downstream DNA sequencing steps are consistently superior to the routine average performance of shotgun sequencing using non-captured DNA sources.

In principle, we observed that specific sequence enrichment by a 'larger' array design, that is, the one covering 1.9 Mb of targets and containing multiple genes and contiguous genomic regions, outperformed the smaller 'single gene' designs with probes querying only 41.5 kb (PDGFRB), 81.7 kb (MLL) and 255.2 kb (RUNX1), respectively. The larger target design resulted in a median on-target percentage of 63.0%, whereas lower on-target percentages were observed for the gene-wise designs (5.1%), as listed in Supplementary Spreadsheet 2. Thus, the capturing performance indeed can be improved if multiple genes and genomic loci are enriched on the same array. Yet, as we learned from these data, there might be a general limit in coverage, as both design strategies resulted in a similar median sequencing coverage: 19-fold coverage (large design) versus 17-fold median coverage (gene-wise design). One interpretation is that with the smaller designs the target is completely depleted from the DNA specimens, so that increased unspecific hybridizations occur and, as such, the on-target enrichment performance is less effective. This is further illustrated by case N16 (MLL design), in which the on-target capturing efficacy was as low as 6.8%, yet this experiment still resulted in 53-fold coverage of the targeted MLL gene, and yielded 31 chimeric on-target reads. Similarly, in case N27, a comparable number of 38 chimeric on-target reads were

generated for RUNX1, although in this enrichment assay only 1.6% of reads were observed to be on-target (onefold coverage; high amount of unspecific enrichment). Overall, the sequencing output is in line with recently published data from a similar design strategy, in which a 2 Mb sequence capture array was used to study autosomal recessive ataxia.37

In this study, an analysis pipeline was used both to map the obtained reads exactly against the human genome, in order to detect point mutations, small deletions and insertions, and to identify chimeric sequences mapping to different regions in the genome. Focusing on the first five patients, analyzed using the 1.9 Mb capture array for 95 gene targets, after stringent filtering a median of 247 exonic variants was observed. After filtering out known variations and untranslated or synonymous variants, a median of only six variants per case was observed. Yet, at this stage the majority of final non-synonymous variants, with the exception of those already described in the Supplementary Material in more detail, will remain putative and are currently under further investigation.

Second, by this approach, all fusion genes previously known from cytogenetics, FISH and standard molecular analyses were detected, for example, CBFB–MYH11, MLL–MLLT3, ETV6–RUNX1 and RUNX1–RUNX1T1, respectively. Moreover, this method allowed resolving fusion events with unusual break points, in which routine diagnostics primers were not annealing. For example, in case N05 cytogenetic analysis revealed a translocation t(11;19)(q23;p13), and FISH confirmed the MLL rearrangement. However, molecular PCR-based assays failed to reveal the translocation partner gene. The method presented in our study in combination with SNP microarray analysis elucidated MLL–ELL and a novel SFRS14–MLL fusion resulting from a t(11;19)(q23;p13), including a 615 kb deletion on 19p13. Importantly, the genomic representation of only one of the partner genes of a chimeric fusion on this capture assay was sufficient to identify also any potentially unknown partner gene as a result of a balanced chromosomal aberration. However, as break points can also occur in intron sequences, a contiguous genomic representation of all exons and introns is mandatory.

As such, we further aimed at elucidating the role and mechanism of RUNX1 translocations. RUNX1 is a crucial transcription factor involved in cell lineage differentiation during hematopoiesis.38 It contains a RHD and a transactivation domain. RUNX1 can function as activator or repressor of target gene expression, and three distinct modes of leukemogenesis due to acquired alterations of the RUNX1 gene have been recognized: point mutations, amplification and translocation.12 Most of the translocations involving RUNX1 lead to the formation of a fusion gene consisting of the 5′ part of RUNX1 fused to sequences on other chromosomal regions. Translocations that retain the RHD but remove the transcription activation domain have been reported to demonstrate a leukemogenic effect by acting as dominant-negative inhibitors of wild-type RUNX1 in transcription activation.12

In our cohort, one case (N30) was discovered to harbor a fusion sequence RUNX1–chr.10q22 that will be translated into a truncated RUNX1 protein with an intact RHD, but without transactivation domain because of the break point in intron 6–7. Because of the loss of the splice acceptor site, sequences of RUNX1 intron 6–7 and a part of the sequence on chromosome 10 will be translated until a stop codon is located. However, only the first 96 amino acids of intron 6–7 will be translated, as a stop codon is located even before the break point. Thus, this chromosomal translocation resulted in a truncated RUNX1 that resembles functionally more a point mutation with absence of a real fusion gene.

Second, nonsense and frameshift mutations in the N-terminal region result in partial deletion of the RHD and total loss of the C-terminal region. Thus, these types of alterations predict the loss of DNA-binding capability and trans-activating potential of RUNX1.39 In this study, we describe a novel RUNX1–KCNMA1 fusion (N28). To our knowledge, this is the first report of a RUNX1 fusion to a large-conductance calcium- and voltage-activated potassium channel.12 Recently, KCNMA1 has also been described to have a role in breast cancer invasion and metastasis to brain.32 In line with the mechanism described above, this translocation resulted in a predicted truncated RUNX1 protein with disrupted RHD fused to a single amino acid of KCNMA1.

These mutations generally do not have a dominant-negative effect on the wild-type RUNX1, as a truncated RUNX1 leads to a haploinsufficiency for this gene.40 Studies on knock-out mice have demonstrated that loss of one Runx1 allele causes a 50% reduction in the number of hematopoietic stem cells, suggesting that its function is dose dependent.40,41 In our study, two cases with loss of one RUNX1 allele due to chromosomal fusions can be postulated. In these cases, RUNX1 was fused to genomic regions on chromosomes 5q13.3 and 17q21, respectively. However, only the reciprocal fusion events, that is, chr. 17q–RUNX1 and chr. 5q–RUNX1, were detectable by PCR.

Although in both cases a second cooperative step is important, families with mutations acting simply via haploinsufficiency have a lower incidence of leukemia than families with mutations acting in a dominant-negative manner.42 Haploinsufficiency seems to be the basis for pathogenesis, but a mechanism other than genuine haploinsufficiency is important for leukemogenesis,41,42 as in addition to RUNX1 alterations second cooperating hits are required to drive leukemic mechanisms.43 Molecular genetic and cytogenetic analyses help to identify these second cooperating hits. For example, patients with RUNX1 point mutations show the following concurrent abnormalities: +21; −7; +8; +13.41,42 Loss of chromosome 7 is suspected to be a strong predisposing factor for the second hit in a RUNX1+/− background (case N29). FLT3 mutations are also cooperating genetic alterations. RUNX1 alterations result in a differentiation blockade, whereas FLT3 mutations are simultaneously responsible for growth stimulation.42 In our cohort, different secondary molecular mutations were observed and included aberrations in NRAS and KRAS.

In conclusion, we demonstrated that the combination of a targeted DNA sequence enrichment assay followed by NGS technology allowed characterizing a complex cancer genome. As such, this assay revealed balanced genomic aberrations, which are detectable thus far only by metaphase cytogenetics and FISH, that is, laboratory methods that typically are labor-intensive and require expert knowledge. Therefore, this NGS assay has the potential to become an important diagnostic method, especially for tumors in which cytogenetics cannot be applied successfully. Of note, thus far SNP microarrays are not capable of detecting balanced chromosomal aberrations either. Yet, implementation of such a technique in a clinical laboratory is challenging. Daily routine usage of such novel genomic approaches is currently limited by the high costs associated with this assay and, moreover, by the relatively long turn-around time. In an ideal setting, data from such a targeted enrichment assay would be available no earlier than eight laboratory working days, although allowing sequencing hundreds of genes in a targeted manner. On the other hand, it would not be technically possible to cope with this high number of amplicons if one were to perform conventional Sanger sequencing for this high number of targets. Related to balanced chromosomal

aberrations, fusion genes, point mutations, as well as deletions and insertions were detected in a one-step methodological approach. Finally, the genomic representation of only one of the partner genes of a chimeric fusion on this capture assay was sufficient to identify also any potentially unknown partner gene from a balanced chromosomal aberration.

Conflict of interest

CH, SuS, WK and TH are part owners of the MLL Munich Leukemia Laboratory GmbH. AK, VG, FD and SoS are employed by MLL Munich Leukemia Laboratory GmbH. Other authors declare no conflict of interest. A patent application has been filed under EP09-013670.6.

Acknowledgements

We thank H Fiegler and W Haagmans for supporting the initial phase of the study and the array design. We further thank B Kazak for excellent technical assistance and G Schramm, L Du and C Bartenhagen for help on data analysis. This work was supported in part by a grant from Roche Diagnostics GmbH (Penzberg, Germany).

Author contributions

VG and AK designed the study, carried out the experiments, interpreted the data and wrote the manuscript. H-UK and MD performed data analysis. SoS provided technical assistance. FD, SuS, WK, CH and TH provided assistance in the design of the study, characterized patient samples and critically reviewed the manuscript. All authors approved the final version submitted for publication.

References

1 Albert TJ, Molla MN, Muzny DM, Nazareth L, Wheeler D, Song X et al. Direct selection of human genomic loci by microarray hybridization. Nat Methods 2007; 4: 903–905.
2 Gnirke A, Melnikov A, Maguire J, Rogov P, LeProust EM, Brockman W et al. Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nat Biotechnol 2009; 27: 182–189.
3 Tewhey R, Warner JB, Nakano M, Libby B, Medkova M, David PH et al. Microdroplet-based PCR enrichment for large-scale targeted sequencing. Nat Biotechnol 2009; 27: 1025–1031.
4 Bacher U, Schnittger S, Haferlach C, Haferlach T. Molecular diagnostics in acute leukemias. Clin Chem Lab Med 2009; 47: 1333–1341.
5 Dohner H, Estey EH, Amadori S, Appelbaum FR, Buchner T, Burnett AK et al. Diagnosis and management of acute myeloid leukemia in adults: recommendations from an international expert panel, on behalf of the European Leukemia Net. Blood 2010; 115: 453–474.
6 Pui CH, Evans WE. Treatment of acute lymphoblastic leukemia. N Engl J Med 2006; 354: 166–178.
7 Grimwade D, Jovanovic JV, Hills RK, Nugent EA, Patel Y, Flora R et al. Prospective minimal residual disease monitoring to predict relapse of acute promyelocytic leukemia and to direct pre-emptive arsenic trioxide therapy. J Clin Oncol 2009; 27: 3650–3658.
8 Ommen HB, Schnittger S, Jovanovic JV, Ommen IB, Hasle H, Ostergaard M et al. Strikingly different molecular relapse kinetics in NPM1c, PML-RARA, RUNX1-RUNX1T1, and CBFB-MYH11 acute myeloid leukemias. Blood 2010; 115: 198–205.
9 Schnittger S, Weisser M, Schoch C, Hiddemann W, Haferlach T, Kern W. New score predicting for prognosis in PML-RARA+, AML1-ETO+, or CBFBMYH11+ acute myeloid leukemia based on quantification of fusion transcripts. Blood 2003; 102: 2746–2755.
10 Maki K, Yamagata T, Mitani K. Role of the RUNX1-EVI1 fusion gene in leukemogenesis. Cancer Sci 2008; 99: 1878–1883.
11 Matsuno N, Osato M, Yamashita N, Yanagida M, Nanri T, Fukushima T et al. Dual mutations in the AML1 and FLT3 genes are associated with leukemogenesis in acute myeloblastic leukemia of the M0 subtype. Leukemia 2003; 17: 2492–2499.
12 De BE, Ferec C, De BM. RUNX1 translocations in malignant hemopathies. Anticancer Res 2009; 29: 1031–1037.
13 Tang JL, Hou HA, Chen CY, Liu CY, Chou WC, Tseng MH et al. AML1/RUNX1 mutations in 470 adult patients with de novo acute myeloid leukemia: prognostic implication and interaction with other gene alterations. Blood 2009; 114: 5352–5361.
14 Erben P, Gosenca D, Muller MC, Reinhard J, Score J, Del VF et al. Screening for diverse PDGFRA or PDGFRB fusion genes is facilitated by generic quantitative reverse transcriptase polymerase chain reaction analysis. Haematologica 2010; 95: 738–744.
15 Meyer C, Kowarz E, Hofmann J, Renneville A, Zuna J, Trka J et al. New insights to the MLL recombinome of acute leukemias. Leukemia 2009; 23: 1490–1499.
16 Loffler H, Rastetter J. Atlas of Clinical Hematology. Springer, Berlin, 1999.
17 Schoch C, Schnittger S, Bursch S, Gerstner D, Hochhaus A, Berger U et al. Comparison of chromosome banding analysis, interphase- and hypermetaphase-FISH, qualitative and quantitative PCR for diagnosis and for follow-up in chronic myeloid leukemia: a study on 350 cases. Leukemia 2002; 16: 53–59.
18 Kern W, Voskova D, Schoch C, Hiddemann W, Schnittger S, Haferlach T. Determination of relapse risk based on assessment of minimal residual disease during complete remission by multiparameter flow cytometry in unselected patients with acute myeloid leukemia. Blood 2004; 104: 3078–3085.
19 Schnittger S, Schoch C, Dugas M, Kern W, Staib P, Wuchter C et al. Analysis of FLT3 length mutations in 1003 patients with acute myeloid leukemia: correlation to cytogenetics, FAB subtype, and prognosis in the AMLCG study and usefulness as a marker for the detection of minimal residual disease. Blood 2002; 100: 59–66.
20 Paulsson K, Horvat A, Strombeck B, Nilsson F, Heldrup J, Behrendtz M et al. Mutations of FLT3, NRAS, KRAS, and PTPN11 are frequent and possibly mutually exclusive in high hyperdiploid childhood acute lymphoblastic leukemia. Genes Chromosomes Cancer 2008; 47: 26–33.
21 Bacher U, Haferlach C, Kern W, Haferlach T, Schnittger S. Prognostic relevance of FLT3-TKD mutations in AML: the combination matters - an analysis of 3082 patients. Blood 2008; 111: 2527–2537.
22 Kohl TM, Schnittger S, Ellwart JW, Hiddemann W, Spiekermann K. KIT exon 8 mutations associated with core-binding factor (CBF)-acute myeloid leukemia (AML) cause hyperactivation of the receptor in response to stem cell factor. Blood 2005; 105: 3319–3321.
23 Schnittger S, Schoch C, Dugas M, Kern W, Staib P, Wuchter C et al. Analysis of FLT3 length mutations in 1003 patients with acute myeloid leukemia: correlation to cytogenetics, FAB subtype, and prognosis in the AMLCG study and usefulness as a marker for the detection of minimal residual disease. Blood 2002; 100: 59–66.
24 Schnittger S, Schoch C, Kern W, Mecucci C, Tschulik C, Martelli MF et al. Nucleophosmin gene mutations are predictors of favorable prognosis in acute myelogenous leukemia with a normal karyotype. Blood 2005; 106: 3733–3739.
25 Schnittger S, Kohl TM, Haferlach T, Kern W, Hiddemann W, Spiekermann K et al. KIT-D816 mutations in AML1-ETO-positive AML are associated with impaired event-free and overall survival. Blood 2006; 107: 1791–1799.
26 Hubbard TJ, Aken BL, Ayling S, Ballester B, Beal K, Bragin E et al. Ensembl 2009. Nucleic Acids Res 2009; 37: D690–D697.
27 Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature 2005; 437: 376–380.
28 Li H, Durbin R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 2010; 26: 589–595.
29 Shen Y, Wan Z, Coarfa C, Drabek R, Chen L, Ostrowski EA et al. A SNP discovery method to assess variant allele probability from next-generation resequencing data. Genome Res 2010; 20: 273–280.
30 Hasin Y, Olender T, Khen M, Gonzaga-Jauregui C, Kim PM, Urban AE et al. High-resolution copy-number variation map reflects human olfactory receptor diversity and evolution. PLoS Genet 2008; 4: e1000249.

31 Imlach WL, Finch SC, Miller JH, Meredith AL, Dalziel JE. A role for BK channels in heart rate regulation in rodents. PLoS One 2010; 5: e8698.
32 Khaitan D, Sankpal UT, Weksler B, Meister EA, Romero IA, Couraud PO et al. Role of KCNMA1 gene in breast cancer invasion and metastasis to brain. BMC Cancer 2009; 9: 258.
33 Long X, Tharp DL, Georger MA, Slivano OJ, Lee MY, Wamhoff BR et al. The smooth muscle cell-restricted KCNMB1 ion channel subunit is a direct transcriptional target of serum response factor and myocardin. J Biol Chem 2009; 284: 33671–33682.
34 Zhang Y, Strissel P, Strick R, Chen J, Nucifora G, Le Beau MM et al. Genomic DNA breakpoints in AML1/RUNX1 and ETO cluster with topoisomerase II DNA cleavage and DNase I hypersensitive sites in t(8;21) leukemia. Proc Natl Acad Sci USA 2002; 99: 3070–3075.
35 Miyoshi H, Shimizu K, Kozu T, Maseki N, Kaneko Y, Ohki M. t(8;21) breakpoints on chromosome 21 in acute myeloid leukemia are clustered within a limited region of a single gene, AML1. Proc Natl Acad Sci USA 1991; 88: 10431–10434.
36 Choi M, Scholl UI, Ji W, Liu T, Tikhonova IR, Zumbo P et al. Genetic diagnosis by whole exome capture and massively parallel DNA sequencing. Proc Natl Acad Sci USA 2009; 106: 19096–19101.
37 Hoischen A, Gilissen C, Arts P, Wieskamp N, van DVW et al. Massively parallel sequencing of ataxia genes after array-based enrichment. Hum Mutat 2010; 31: 494–499.
38 Cohen Jr MM. Perspectives on RUNX genes: an update. Am J Med Genet A 2009; 149A: 2629–2646.
39 Harada Y, Harada H. Molecular pathways mediating MDS/AML with focus on AML1/RUNX1 point mutations. J Cell Physiol 2009; 220: 16–20.
40 Agerstam H, Lilljebjorn H, Lassen C, Swedin A, Richter J, Vandenberghe P et al. Fusion gene-mediated truncation of RUNX1 as a potential mechanism underlying disease progression in the 8p11 myeloproliferative syndrome. Genes Chromosomes Cancer 2007; 46: 635–643.
41 Sun W, Downing JR. Haploinsufficiency of AML1 results in a decrease in the number of LTR-HSCs while simultaneously inducing an increase in more mature progenitors. Blood 2004; 104: 3565–3572.
42 Osato M. Point mutations in the RUNX1/AML1 gene: another actor in RUNX leukemia. Oncogene 2004; 23: 4284–4296.
43 Gilliland DG. Molecular genetics of human leukemias: new insights into therapy. Semin Hematol 2002; 39: 6–11.

Supplementary Information accompanies the paper on the Leukemia website (http://www.nature.com/leu)
