Published OnlineFirst April 5, 2019; DOI: 10.1158/1078-0432.CCR-18-4117
Translational Cancer Mechanisms and Therapy
Clinical Cancer Research

Therapeutic Targeting of Non-oncogene Dependencies in High-risk Neuroblastoma

Chen-Tsung Huang1, Chiao-Hui Hsieh2, Wen-Chi Lee2, Yen-Lin Liu3, Tsai-Shan Yang4, Wen-Ming Hsu4, Yen-Jen Oyang1, Hsuan-Cheng Huang5, and Hsueh-Fen Juan1,2,6
Abstract
Purpose: Neuroblastoma is a pediatric malignancy of the sympathetic nervous system with diverse clinical behaviors. Genomic amplification of MYCN oncogene has been shown to drive neuroblastoma pathogenesis and correlate with aggressive disease, but the survival rates for those high-risk tumors carrying no MYCN amplification remain equally dismal. The paucity of mutations and molecular heterogeneity has hindered the development of targeted therapies for most advanced neuroblastomas. We use an alternative method to identify potential drugs that target nononcogene dependencies in high-risk neuroblastoma.

Experimental Design: By using a gene expression–based integrative approach, we identified prognostic signatures and potentially effective single agents and drug combinations for high-risk neuroblastoma.

Results: Among these predictions, we validated in vitro efficacies of some investigational and marketed drugs, of which niclosamide, an anthelmintic drug approved by the FDA, was further investigated in vivo. We also quantified the proteomic changes during niclosamide treatment to pinpoint nucleoside diphosphate kinase 3 (NME3) downregulation as a potential mechanism for its antitumor activity.

Conclusions: Our results establish a gene expression–based strategy to interrogate cancer biology and inform drug discovery and repositioning for high-risk neuroblastoma.
1Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan. 2Institute of Molecular and Cellular Biology, National Taiwan University, Taipei, Taiwan. 3Department of Pediatrics, Taipei Medical University Hospital, Taipei, Taiwan. 4Department of Surgery, National Taiwan University Hospital, Taipei, Taiwan. 5Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan. 6Department of Life Science, National Taiwan University, Taipei, Taiwan.

Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).

Corresponding Authors: Hsueh-Fen Juan, National Taiwan University, 1, Sec. 4, Roosevelt Rd., Taipei, 106, Taiwan. Phone: 8862-3366-4536; Fax: 8862-2367-3374; E-mail: [email protected]; and Hsuan-Cheng Huang, [email protected]

doi: 10.1158/1078-0432.CCR-18-4117

©2019 American Association for Cancer Research.

Introduction

Neuroblastoma is a childhood cancer of the peripheral sympathetic nervous system. Several clinical and biological variables, including age at initial diagnosis, stage of disease, and amplification of the MYCN oncogene, are used to stratify patients into neuroblastoma risk groups (1). Although the survival rates from neuroblastoma have been improved substantially in recent decades, children bearing high-risk tumors, regardless of the presence of amplified MYCN, still have poor outcomes (2, 3). Owing to the lack of recurrent mutations and heterogeneity of mutational spectrum in neuroblastoma (4), current treatment approach for high-risk diseases is largely based on intensive combination chemotherapy, radiotherapy, stem cell transplant, immunotherapy, and differentiation therapy (1–3). Although MYCN has a well-established role in neuroblastoma development, pharmaceutical targeting of this oncogenic transcription factor remains challenging (5). However, alternative strategies that target the MYCN-mediated transcriptional program (6, 7), the regulators of MYCN protein stability (8), or the downstream effects of MYCN amplification (9, 10) have shown clinical promise. In addition to MYCN (amplified in 20% of neuroblastomas), several genomic alterations, for example, ALK-activating mutation (10%; ref. 11), ATRX-inactivating mutation or deletion (10%; ref. 12), and TERT promoter rearrangement (25%; ref. 13, 14), have been described in aggressive neuroblastoma, among which ALK is currently the only tractable oncogene for targeted therapy (15). Relapsed neuroblastoma, by contrast, was found to harbor more druggable mutations, most of which converged on the activation of the RAS–MAPK pathway (16, 17). Despite tremendous advances in understanding the cancer genome, new therapeutic approaches to effectively treat this heterogeneous, aggressive disease are in high demand.

Paralleling the dedicated efforts of researchers to identify driver mutations that confer selective growth advantage, the importance of acquired dependencies of cancer cells on the activities of certain nonmutated genes has been increasingly recognized (18, 19). This cancer's addiction to both oncogenes and nononcogenes is required to sustain the hallmark capabilities and tumorigenic state (20). In particular, leveraging the nononcogene addiction for therapeutic intervention has thus far proved beneficial in cancer treatment (18, 19, 21, 22).

Here, we used a gene expression–based approach to identify the cancer-related transcriptional signatures and inform potential therapeutics for treating high-risk neuroblastoma. This was achieved by correlating gene expression signatures between high-risk neuroblastoma and small-molecule perturbations (23). In this study, we first performed integrative analysis of the transcriptomes of primary neuroblastoma tumors obtained from multiple Gene Expression Omnibus (GEO) datasets. This process led to the identification of gene signatures that were prognostic for patient survival in an independent cohort, especially for children
with advanced-stage, MYCN-nonamplified tumors. We used the high-risk neuroblastoma gene signature to predict effective drugs or their combinations and validated some of these predictions in vitro. The in vivo efficacy of niclosamide, an anthelmintic drug approved by the FDA to treat tapeworm infections, was confirmed in neuroblastoma xenograft models. By further investigating the neuroblastoma proteome following niclosamide treatment, we identified downregulation of nucleoside diphosphate kinase 3 (NME3), an enzyme involved in the nucleotide biosynthesis, as a potential molecular mechanism of the drug's effects. Given the rarity of actionable mutations, our data present an alternative solution to target-based drug screening in this deadly pediatric neoplasm.

Translational Relevance

High-risk neuroblastoma has few recurrent somatic mutations and is still associated with poor outcomes despite intensive treatment. The lack of druggable oncogenes continues to restrict the development of targeted therapies for high-risk neuroblastoma. Here, we use an alternative, gene expression–based approach to identify potential drugs that target the nononcogene dependencies in high-risk neuroblastoma. Among the top predicted drugs by this approach, our work then investigates an FDA-approved anthelmintic drug, niclosamide, as an effective treatment for neuroblastoma through the regulation of nucleotide biosynthesis and nucleoside diphosphate kinase 3 (NME3) protein.

Materials and Methods

Data source, cross-platform normalization, and quality measures

Six gene expression datasets from different microarray platforms containing primary neuroblastomas were obtained from GEO with accession numbers listed as follows: GSE45547 (Agilent-020382 Human Custom Microarray 44k; n = 649), GSE3446 (Affymetrix Human Genome U133A Array; n = 117), GSE19274 (Illumina human-6 v2.0 expression beadchip; n = 100), GSE16254, GSE12460, and GSE16237 (Affymetrix Human Genome U133 Plus 2.0 Array; n = 88, 64, and 50, respectively). For each dataset, probe set IDs were mapped to gene names using an available R package (hgu133b.db, illuminaHumanv2.db, or hgu133plus2.db) or a GEO platform (GPL) annotation file (GPL16876 for GSE45547). For each sample, the median log expression value was taken for each gene mapped to multiple probes, and no missing value was found across all samples (negative expression values were replaced by a zero). The genes shared among all datasets were selected and combined into a matrix, followed by quantile normalization. This data integration process resulted in an intersection of 11,939 genes in 1,065 primary neuroblastomas.

For cross-platform data normalization, we used platform-independent latent Dirichlet allocation (PLIDA; ref. 24), which uses the generative probabilistic model latent Dirichlet allocation (25) to learn topic model decomposition from gene expression datasets from multiple platforms. The PLIDA model was learned using the MATLAB code released by Deshwar and Morris (24). The combined gene expression matrix (11,939 by 1,065) was taken by the model as input with hyperparameters set as follows: NUM (number of topics) = 2; GAMMAPARAM (scale b for the gamma distribution) = 4; BETA (beta for the Dirichlet distribution) = 10; and number of interactions = 100. After this cross-platform normalization, the resultant matrix was then quantile-normalized again to ensure that the gene expression distributions of all samples were identical.

To assess the quality of data after PLIDA transformation, we computed the following measures: (i) sample-wise principal component analysis (Supplementary Fig. S1A); (ii) a Spearman correlation coefficient (SCC) between a given gene vector before PLIDA and the same gene vector after PLIDA for each dataset category (within individual or across all GEO datasets; Supplementary Fig. S1B); (iii) SCCs between a gene and any other genes before and after PLIDA for each dataset category (Supplementary Fig. S1C); and (iv) gene expression values before and after PLIDA for each dataset category, for which a 1-way ANOVA or Kruskal–Wallis test was applied to determine whether the mean or median expression values of a gene were equivalent across all dataset categories, respectively (Supplementary Fig. S1D).

Clustering patients with high-risk neuroblastoma without MYCN amplification

We used a gene expression intensity-based similarity metric (26) to compute pairwise similarities among the patients with high-risk, MYCN-nonamplified neuroblastoma (HR-nonMNA) for clustering analysis (Supplementary Fig. S2A). This intensity-based similarity metric has proved superior to other commonly used metrics derived from the Pearson correlation or Euclidean distance in a clustering task of drugs with diverse mechanisms of action (MoA).

In brief, for each HR-nonMNA sample (MYCN-nonamplified, stage 4, >18 months; n = 156), we subtracted the median expression vector of all low-risk (LR) samples (MYCN-nonamplified, stage 1 or 2, <18 months; n = 247) from it to derive a "differential profile." The differential profiles of all HR-nonMNA samples were then applied to our gene expression intensity–based similarity framework to obtain the optimal parameter set for the intensity-based similarity metric (query gene set size b and decay factor s; Supplementary Fig. S2B). Instead of using the F1 score, an external clustering validity index given a ground-truth answer, we used the silhouette score, an internal clustering validity index based on the statistical properties of a clustering, because in this case there is no gold standard for these patients with HR-nonMNA. The clustering produced by the best silhouette score has proved to be highly correlated with that by the best F1 score in biomedical data analysis (27).

Differential expression analysis

For each comparative category ("MNA," "HR-MNA," "HR-nonMNA-subgroup1," or "HR-nonMNA-subgroup2," defined by age at diagnosis, INSS tumor stage, and MYCN amplification), we performed differential gene expression analysis using Significance Analysis of Microarrays (ref. 28; SAM; 2-class unpaired Mann–Whitney U test with 1,000 sample-level permutations). The differentially expressed (DE) genes (the 90th percentile FDR < 0.001) for each comparative category are provided in Supplementary Table S1.
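As an illustration only, the probe-to-gene collapsing and quantile normalization steps described under "Data source, cross-platform normalization, and quality measures" can be sketched in plain Python as follows. This is not the authors' code: the function names `collapse_probes` and `quantile_normalize`, the choice to clamp negative values before taking the median, and the tie handling (ties are ranked in input order) are assumptions made for the sketch, and PLIDA itself is not reproduced.

```python
from statistics import median


def collapse_probes(probe_values, probe_to_gene):
    """Collapse one sample's probe-level values to gene-level values.

    probe_values: dict mapping probe ID -> log expression value.
    probe_to_gene: dict mapping probe ID -> gene name (hypothetical annotation).
    Negative values are set to zero first (assumed order of operations),
    then the median over all probes of a gene is taken.
    """
    by_gene = {}
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue  # probe without a gene annotation is skipped
        by_gene.setdefault(gene, []).append(max(value, 0.0))
    return {gene: median(values) for gene, values in by_gene.items()}


def quantile_normalize(matrix):
    """Quantile-normalize a genes x samples matrix given as a list of rows.

    Each column (sample) is mapped onto a common reference distribution:
    the mean, across samples, of the values at each rank.
    """
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_samples)]
    sorted_columns = [sorted(col) for col in columns]
    reference = [
        sum(col[rank] for col in sorted_columns) / n_samples
        for rank in range(n_genes)
    ]
    result = [[0.0] * n_samples for _ in range(n_genes)]
    for j, col in enumerate(columns):
        # Gene indices ordered by increasing value within this sample.
        order = sorted(range(n_genes), key=lambda i: col[i])
        for rank, i in enumerate(order):
            result[i][j] = reference[rank]
    return result
```

After `quantile_normalize`, every sample (column) contains the same sorted set of values, which is the sense in which the expression distributions of all samples are made identical.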
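The "differential profile" used for clustering the HR-nonMNA samples can likewise be sketched in a few lines. This is a hedged illustration rather than the published implementation: `differential_profiles` is a hypothetical helper, and each sample is assumed to be an equal-length list of gene expression values in a shared gene order.

```python
from statistics import median


def differential_profiles(hr_samples, lr_samples):
    """Subtract the gene-wise median of low-risk samples from each high-risk sample.

    hr_samples, lr_samples: lists of samples, each sample a list of
    expression values with genes in the same order (an assumption).
    Returns one differential profile per high-risk sample.
    """
    n_genes = len(lr_samples[0])
    # Median expression vector over all low-risk (reference) samples.
    lr_median = [median(s[g] for s in lr_samples) for g in range(n_genes)]
    return [[s[g] - lr_median[g] for g in range(n_genes)] for s in hr_samples]
```

A positive entry in a profile then means the gene is expressed above the low-risk median in that high-risk sample, and a negative entry means it is expressed below it.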
Pathway–gene association analysis

We devised an approach based on gene set enrichment analysis (GSEA; ref. 29) to identify genes that might closely associate with the biology of high-risk neuroblastoma (Supplementary Fig. S3). We performed GSEA as described previously (29) using the KEGG pathway gene set collection (MSigDB v5.0, C2 collection) with default settings (1,000 sample-level permutations; the minimum and maximum gene set sizes for consideration were set to 10 and 500, respectively). Pathways are called statistically significant if FDR q values are <0.25. For each comparative category, we (i) performed "group-wise" enrichment analysis by testing the pathways enriched between the source (advanced) and reference patient groups, (ii) performed "gene-specific" enrichment analysis by testing the pathways enriched among those patients in the source patient group with the highest 25% and the lowest 25% expression of a given gene for each of the top 500 upregulated and 500 downregulated DE genes, and (iii) computed the proportion of enrichments for each gene as the ratio of the number of "consistently" enriched pathways in the gene-specific and group-wise analyses over the number of enriched pathways in the group-wise analysis (Supplementary Fig. S3A). A pathway is called consistently enriched in the gene-specific and group-wise enrichment analyses if (i) it has a positive normalized enrichment score (NES) in the group-wise analysis and has a positive NES for an upregulated gene or a negative NES for a downregulated gene in the gene-specific analysis; or (ii) it has a negative NES in the group-wise analysis and has a negative NES for an upregulated gene or a positive NES for a downregulated gene in the gene-specific analysis. The enriched pathways and the genes that contributed to the consistent enrichments for each comparative category are summarized in Supplementary Fig. S3B.

Gene correlation analysis

We used high-dimensional undirected graph estimation (HUGE; ref. 30, via R package HUGE version 1.2.7), a Gaussian graphical model that describes probabilistic relationships between variables in a multidimensional manner, to infer gene correlation relationships in high-risk neuroblastoma. For each comparative category, the top 500 upregulated and 500 downregulated DE genes were considered and an estimator of population inverse covariance matrix (also known as concentration or precision matrix) was learned using the graphical lasso algorithm (ref. 31; function HUGE with method, "glasso"; nlambda, 100; lambda.min.ratio, 0.1), which aims to maximize the Gaussian log likelihood with an ℓ1 regularizer for a sparse solution (i.e., many entries in the matrix will be zero). The parameter λ was learned using stability approach to regularization selection (ref. 32; stARS; function HUGE.select with criterion, "stars"; stars.thresh, 0.05; rep.num, 20), which has been shown to generate reasonable sparsity in the graph. The resulting estimator of population inverse covariance matrix was then used to derive the estimates of population covariance matrix and correlation matrix. The presence of gene correlation relationships was judged by nonzero entries in the estimated population inverse covariance matrix and the strength of the relationships was represented by the corresponding entries in the estimated correlation matrix. The inferred gene–gene correlation relationships for each comparative category and their overlap with 1,639 known and likely human transcription factors (33) are provided in Supplementary Table S2.

Genetic perturbation analysis

We have generated hundreds of thousands of recurrent perturbation–transcript regulatory associations among >7,000 chemical and genetic perturbagens and 12,494 transcripts across 10 cell types, while demonstrating the robustness of these recurrent relationships in general against cell-line variability (23). For each comparative category, we combined those recurrent regulatory associations of genetic perturbation type [corresponding to the exposure to short hairpin RNAs (shRNA) for 96 hours (sh96)] with the DE genes to infer gene regulatory relationships in high-risk neuroblastoma (Supplementary Fig. S4). The inferred gene regulatory relationships for each comparative category are provided in Supplementary Table S3.

Generating gene expression signatures

For each comparative category, we first scored the genes from integrated transcriptomic analysis. For differential expression analysis, we computed the fold change for an upregulated gene or the inverse of fold change for a downregulated gene times the square root of the absolute value of the penalized t-statistic reported by SAM. For pathway–gene association analysis, we mapped the proportion of enrichments (0%–100%) linearly to [0, 80] and the best FDR q value to [0, 20] as follows: [0.2, 0.25) to 2; [0.1, 0.2) to 4; [0.05, 0.1) to 6; [0.01, 0.05) to 10; [0.001, 0.01) to 14; [0.0001, 0.001) to 18; and [0, 0.0001) to 20. For gene correlation analysis, we calculated twice the sum of the absolute value of correlation scores between a given gene and all its first neighbors. For genetic perturbation analysis, we counted the number of directed links between a given gene and its first neighbors weighted by a factor of 0.2. In addition, we also incorporated human protein interactome from Menche and colleagues (34), which comprised a union of 13,460 proteins with 141,296 physical interactions, with the genes included in our integrated transcriptomic analysis and then counted the number of physical interactions between a given gene and its first neighbors weighted by a factor of 0.1. We note that this scoring scheme puts far more emphasis on pathway–gene association analysis than other components such that a gene with the proportion of enrichment of 100% and the best FDR q value < 0.0001 from pathway–gene association analysis has already contributed a score of 100 to the aggregate score. The aggregate score for each gene in each comparative category is provided in Supplementary Table S4. These aggregate scores were used to generate the gene signatures of high-risk neuroblastoma in Fig. 2A and Supplementary Figs. S6A and S7A (cutoff >50), in which the genes were ordered by the geometric mean of the ranks of aggregate scores. The significant enrichments of the entire MSigDB gene set collection (v5.0) using hypergeometric test (Benjamini–Hochberg (BH)-corrected P < 0.05) for each of the gene signatures are provided in Supplementary Table S5.

Prognostic significance of the gene expression signatures

We validated the prognostic value of the generated gene expression signatures in an independent cohort of 477 patients with neuroblastoma (ArrayExpress accession number E-MTAB-179; Agilent Custom Human Neuroblastoma Chip 251496110; ref. 35). The expression profile (using the processed data from E-MTAB-179.processed.1.zip) was log2-transformed (a negative value was replaced by a zero), and the median value was taken for each gene mapped to multiple probes, generating a 3,834 (mapped genes in hg19) by 477 (samples) data matrix, followed