Genome-Wide Approach to Identify Risk Factors for Therapy-Related Myeloid Leukemia
Leukemia (2006) 20, 239–246
© 2006 Nature Publishing Group All rights reserved 0887-6924/06 $30.00
www.nature.com/leu

ORIGINAL ARTICLE

A Bogni1, C Cheng2, W Liu2, W Yang1, J Pfeffer1, S Mukatira3, D French1, JR Downing4, C-H Pui4,5,6 and MV Relling1,6

1Department of Pharmaceutical Sciences, The University of Tennessee, Memphis, TN, USA; 2Department of Biostatistics, The University of Tennessee, Memphis, TN, USA; 3Hartwell Center, The University of Tennessee, Memphis, TN, USA; 4Department of Pathology, The University of Tennessee, Memphis, TN, USA; 5Department of Hematology/Oncology St Jude Children’s Research Hospital, The University of Tennessee, Memphis, TN, USA; and 6Colleges of Medicine and Pharmacy, The University of Tennessee, Memphis, TN, USA

Using a target gene approach, only a few host genetic risk factors for treatment-related myeloid leukemia (t-ML) have been defined. Gene expression microarrays allow for a more genome-wide approach to assess possible genetic risk factors for t-ML. We assessed gene expression profiles (n = 12 625 probe sets) in diagnostic acute lymphoblastic leukemic cells from 228 children treated on protocols that included leukemogenic agents such as etoposide, 13 of whom developed t-ML. Expression of 68 probes, corresponding to 63 genes, was significantly related to risk of t-ML. Hierarchical clustering of these probe sets clustered patients into three groups with 94, 122 and 12 patients, respectively; 12 of the 13 patients who went on to develop t-ML were overrepresented in the latter group (P<0.0001). A permutation test indicated a low likelihood that these probe sets and clusters were obtained by chance (P<0.001). Distinguishing genes included transcription-related oncogenes (v-Myb, Pax-5), cyclins (CCNG1, CCNG2 and CCND1) and histone HIST1H4C. Common transcription factor recognition elements among similarly up- or downregulated genes included several involved in hematopoietic differentiation or leukemogenesis (Maz, PU.1, ARNT). This approach has identified several genes whose expression distinguishes patients at risk of t-ML, and suggests targets for assessing germline predisposition to leukemogenesis.
Leukemia (2006) 20, 239–246. doi:10.1038/sj.leu.2404059; published online 8 December 2005
Keywords: acute lymphoblastic leukemia; secondary myeloid leukemia; gene expression

Correspondence: Professor MV Relling, Department of Pharmaceutical Sciences, St Jude Children’s Research Hospital, 332 North Lauderdale, Memphis, TN 38105-2794, USA. E-mail: [email protected]
Received 18 July 2005; revised 3 October 2005; accepted 31 October 2005; published online 8 December 2005

Background

A major unpredictable complication in the treatment of acute lymphoblastic leukemia (ALL) is treatment-related myeloid leukemia (t-ML), which includes treatment-related acute myeloid leukemia (AML) and myelodysplastic syndrome (MDS), occurring in 1–10% of ALL patients.1,2 Characteristics of the ALL subtype do not appear to influence the risk of t-ML.3–9 Two major types of t-ML have been reported: the form associated with the use of topoisomerase II inhibitors,3,5,6 with characteristic balanced translocations of the MLL gene,10–12 and the form associated with alkylating agents7,8 associated with monosomy of chromosome 5 or 7.4 Although several treatment-related risk factors have been identified that interact with the primary leukemogens, few host factors are known. As survival of cancer therapy increases, the importance of identifying host factors for secondary neoplasms increases.

Because DNA microarrays interrogate multiple (>10 000) genes in one experiment, they allow for a ‘genome-wide’ assessment of genes that may predispose to leukemogenesis. DNA microarray analysis of gene expression has been used to identify distinct expression profiles that are characteristic of different leukemia subtypes.13,14 Studies using this method have led to the identification of molecular events in the progression of chronic myeloid leukemia (CML)15–18 and to the classification of AML and ALL subtypes.13,19 Microarray analysis has also been used in childhood ALL to identify gene expression patterns at diagnosis associated with risk of ALL relapse.13,20

Variation in gene expression is likely to partly reflect variation in germline DNA. Gene expression in lymphoid tissue reflects inherited traits in cell lines from large kindreds.21 That gene expression in ALL blasts differs by germline genetic polymorphisms,22 and can even predict which patients were likely to eventually develop a therapy-induced brain tumor,23 is also consistent with the hypothesis that gene expression in ALL cells partially reflects germline characteristics, and not only the acquired genetic signature of the ALL subtype. In an initial analysis, gene expression profiles of diagnostic ALL cells were also predictive of t-ML risk,13 but these data were analyzed in depth only among the largest molecularly defined ALL subgroup: those with Tel/AML1 translocations at diagnosis. Moreover, the time dependence of t-ML and competing risks for relapse were not considered.

In this study, we used a genome-wide approach to identify genes whose expression discriminated patients with ALL who were predisposed to t-ML, using cumulative incidence and Cox regression models for assessing the relationship between gene expression and time to t-ML. The two models identified 68 common probe sets, corresponding to 63 genes, whose expression in ALL cells differed significantly between patients who did versus those who did not develop t-ML. These results suggest that pretreatment gene expression profiling can provide insights into candidate genes involved as host factors in t-ML development.

Patients, materials and methods

Patients and laboratory tests

Protocols Total XIIIA (1991–1994) and Total XIIIB (1994–1998) of St Jude Children’s Research Hospital were used for the treatment of patients with newly diagnosed ALL. Both protocols included the administration of topoisomerase II inhibitors and alkylating agents.24,25 The patient cohort herein consisted of all 267 patients (228 with B-lineage and 39 with T-cell ALL) who were enrolled on the protocols and for whom gene expression data from ALL cells at diagnosis were available. The median length of follow-up was 6.1 years. In all, 14 patients developed t-ML: 13 had B-lineage ALL and one patient had T-cell ALL as the primary malignancy. The types of therapy-related leukemias were AML (n = 10), MDS (n = 3), and CML (n = 1). Herein, we restricted our analysis to those with B-lineage ALL for the primary analysis (228 patients), but an identical analysis for all 267 patients (those with T- or B-lineage ALL) is presented in the Supplementary Information.

Bone marrow blasts of ALL patients were cryopreserved on the date of diagnosis. Total RNA was extracted using Tri-Reagent (MRC, Cincinnati, OH, USA). Patients, their parents, or guardians gave informed consent and assent to participate in the study. The protocols and the current analysis of risk factors were approved by the Institutional Review Board of St Jude. RNA was submitted for microarray analysis as described.13 Labeled cRNA was hybridized to Affymetrix HG-U95Av2 GeneChips, which comprise more than 12 600 probe sets for more than 9600 unique genes. Gene expression was verified by a second method in a subset (see Supplementary Information).

Statistical analysis

Gene expression data were extracted from Affymetrix-generated image files using Microarray Suite (MAS) version 5.0 with default settings (raw data available on http://www.stjuderesearch.org/data/ALL1). The MAS 5.0-generated signal of each probe set was transformed by natural-log transformation (log_e). Statistical analysis was performed using SAS (SAS Inc., Cary, NC, USA), S-plus (Insightful, Seattle, WA, USA), StatXact 5 (Cytel, Cambridge, MA, USA) and R 1.6 (http://www.r-project.org/). For data visualization, principal component analysis (PCA) and hierarchical clustering were performed using Spotfire DecisionSite 7.0 (Spotfire, USA).

All analyses were performed on the entire cohort (n = 267) [...]

[...] set for t-ML, that risk can be so altered by that failure (or its attendant change in therapy) that it is improbable to observe a t-ML in such patients. In our particular study, no patient who developed t-ML experienced a competing adverse event before the t-ML. For this reason, we have also conducted a Cox regression model analysis, in which we censored patients with relapse (or other failures) at the times of such events, and used survival models, instead of cumulative incidence models, for purposes of gene/probe set selection. Probe sets with P≤0.01 were considered as ‘statistically significant.’

Cluster analysis

Hierarchical clustering (Ward’s minimum-variance method) was applied to cluster the patients into three groups, using the expression of probe sets selected by the above methods as features. As three major outcomes were present (no event, t-ML, adverse events other than t-ML), the cluster tree was cut at the level that defined k = 3 clusters. The cumulative incidence of t-ML among the three clusters was compared using Gray’s test, using adverse events other than t-ML as competing risks.29

Permutation test

To assess the probability that the discriminating probe sets could have been selected by chance, a permutation test was performed separately for each group of selected probe sets (using the methods described above). For these permutations, outcome was treated as a time-dependent event, allowing us to combine features of a ‘classification accuracy’ with a cumulative incidence comparison approach. In each of 1000 permutations, the paired time-to-t-ML development and outcome event were randomly reassigned to each patient, and gene selection was performed by applying either the cumulative incidence regression or the Cox regression model to the permuted data. The α value of 0.01 was the significance threshold, the same as used on the original data.