PERSPECTIVES

downstream transcriptional programmes OPINION (FIG. 1b). Thus, although individual MR are highly tumour specific9–11, the The recurrent architecture of tumour resulting tumour checkpoint oncotecture is virtually tumour independent7. The term MR was initially introduced by initiation, progression and drug developmental biologists to describe products (mostly factors) at sensitivity the top of regulatory hierarchies that are both necessary and sufficient to induce Andrea Califano and Mariano J. Alvarez morphogenesis12 or specific steps in lineage differentiation13. By contrast, cancer biologists Abstract | Recent studies across multiple tumour types are starting to reveal a have adopted a looser definition of the term recurrent regulatory architecture in which genomic alterations cluster upstream to indicate genetic determinants or functional of functional master regulator (MR) proteins, the aberrant activity of which is both drivers, the aberrant activity of which is necessary and sufficient to maintain tumour state. These proteins form small, sufficient — yet not strictly necessary — to hyperconnected and autoregulated modules (termed tumour checkpoints) that induce transformation. According to the latter, TP53 (which encodes ) and ERBB2 are increasingly emerging as optimal biomarkers and therapeutic targets. Crucially, (which encodes HER2) are often presented as their activity is mostly dysregulated in a post-translational manner, rather than as cancer MRs14,15, even though many by mutations in their corresponding or by differential expression, the alternative genomic alterations may induce identification of MR proteins by conventional methods is challenging. In this transcriptional tumour states that are virtually Opinion article, we discuss novel methods for the systematic analysis of MR indistinguishable from those in which those (BOX 1) proteins and of the modular regulatory architecture they implement, including genes are mutated . For instance, several tumours classified as basal-like or their use as a valuable reductionist framework to study the genetic heterogeneity HER2+ by profile analysis of human disease and to drive key translational applications. lack the hallmark genetic alterations in TP53 and ERBB2, respectively16. In this Opinion article, not only do we adopt the stricter There is an intriguing yet largely unexplored are routinely used in the clinic to assess the MR definition from developmental biology paradox in cancer. On the one hand, risk of breast cancer recurrence following but we further require MRs to directly transcriptional programmes are highly hormonal therapy. regulate the transcriptional state of the conserved across samples that represent the The ability of tumours to present similar tumour cell (FIG. 1b). Under this definition, a same tumour subtype1 — even compared transcriptional signatures, despite having classic cancer driver such as p53 may not be with normal tissue (FIG. 1a) — suggesting radically distinct somatic mutational considered a MR unless it is necessary and the existence of relatively stable tumour profiles, suggests the existence of regulatory sufficient to implement a subset of tumours states. On the other hand, the genetic and mechanisms that are responsible for with a common transcriptional signature epigenetic alterations (henceforth referred to decoupling (that is, buffering) the tumour and it directly participates in its signature as genomic alterations) that determine these cell state from its genetic, epigenetic and regulation. By contrast, the proto- states are remarkably heterogeneous on a signalling determinants7,8. Specifically, this oncoprotein, the locus of which is translocated sample by sample basis (BOX 1). Opinion article provides an overview of the in 100% of Burkitt lymphomas17 and which Consistent with these observations, emergence of a universal gene-regulatory regulates their signature18, would emerge as a although genomic alterations represent architecture (henceforth referred to as bona fide MR in that context. valuable predictors of targeted inhibitor oncotecture), in which a handful of master This stricter terminology is not arbitrary. sensitivity against the corresponding regulator (MR) proteins — organized into Instead, it is necessary to address the currently oncoproteins, they have proved less effective tightly autoregulated tumour checkpoint ambiguous use of the term in the literature, in stratifying more general properties of modules — implement and maintain the thus making identification of MR proteins cancer, such as tumour subtype, metastatic transcriptional state of a tumour cell largely a more rational and systematic process. potential and clinical outcome2,3, which are independent of initiating events, as well as of Consistently, it separates the methodologies more often captured by transcriptional or endogenous and exogenous perturbations7,8. and studies discussed in this Opinion article proteomic profiles4,5 (BOX 1). For example, Much like a highway checkpoint, these from those aimed at identifying more immunohistochemistry assays, as well modular structures canalize the effect of loosely defined functional drivers or genetic as gene expression-based tests such as mutations and other aberrant signals in determinants of cancer, for instance, by Oncotype DX and MammaPrint6, their upstream pathways to implement key integrating mutational and gene expression

NATURE REVIEWS | CANCER ADVANCE ONLINE PUBLICATION | 1 ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

a COAD KIRC b Alternative mutations 40 20 8% 2% 20 0 90% 0

–20 Bypass –20 5% mutation PC-2, 9.6% of variance

–40 PC-2, 11.1% of variance –40 –20 0 20 40 –40 –20 0 20 PC-1, 12.6% of variance PC-1, 17.1% of variance

UCEC PRAD Tumour checkpoint 20 20 10 10 0 0 –10 –10 Proliferation –20 –20 Migration

PC-2, 7.3% of variance PC-2, 7.3% –30 PC-2, 11.5% of variance –30 –40 –20 0 20 –20 0 20 40 PC-1, 9.7% of variance PC-1, 19.1% of variance Other driver mutations Patient-specific Differentially in cancer cohort driver mutation Normal tissue Tumour tissue expressed gene Cancer signature gene Master regulator Passenger mutation

Figure 1 | The architecture of tumour checkpoints. a | The probability den- not upstream of MRs have no effect on tumour checkpointNature Reviews activity |and Cancer thus sities of normal and transformed cells are shown in a principal component on the specific phenotype that the checkpoint regulates. Arrows in this dia- (PC) projection that captures most of the sample variability of four tumour gram show regulatory and signalling interactions, that is, how one gene types: colorectal adenocarcinoma (COAD), kidney renal clear cell carcinoma product regulates other gene products. Black arrows represent crucial top- (KIRC), uterine corpus endometrial cancer (UCEC) and prostate adenocarci- down interactions leading from patient mutations first to activation of MR noma (PRAD). These distributions show a clear single-peak structure, sug- proteins in the tumour checkpoint and then to activation of downstream gesting that the regulatory logic of the tumour cell is effective in avoiding genetic programmes that are required for tumour phenotype presentation. occupancy of states that are far away from the mean. Considering that can- Grey and blue arrows represent additional regulatory interactions that do cer tissue may also be contaminated by extensive lymphocytic and stromal not affect and are not affected by tumour checkpoint MRs, respectively. cell infiltration, the variance of the normal and tumour-associated distribu- Dashed arrows represent feedback loops implemented either between the tions is of quite comparable magnitude. A comprehensive inventory of all MR layer and the upstream modulators or between genes regulated by MR tumour types in The Cancer Genome Atlas (TCGA) reveals that only a handful proteins and upstream MR modulators. The MR protein module in the — such as head and neck squamous cell carcinoma (HNSC), kidney renal shaded area represents the tumour checkpoint. Pink spheres represent papillary cell carcinoma (KIRP) and liver hepatocellular carcinoma (LIHC) — genes that are differentially expressed as a result of the aberrant activity of present with substantially greater variance than the corresponding normal MR proteins in the tumour checkpoint (that is, the tumour gene expression tissue. b | The proposed regulatory architecture implemented by master signature). Lightning bolts represent potential therapeutic interventions regulator (MR) proteins in tumour checkpoints is shown. MRs (blue spheres using pharmacological inhibitors. Inhibiting oncoproteins mutated in a large in shaded area) represent proteins the concerted, aberrant activity of which fraction (for example, 90%) of tumour subclones will cause relapse owing to is both necessary and sufficient for cancer cell state maintenance. Their aber- the presence of rare, alternative subclones harbouring either alternative or rant activity is induced by genes in their upstream pathways that are mutated bypass mutations. A bypass mutation is a mutation that activates the path- in a specific patient (purple spheres) selected from a larger repertoire of can- way downstream of the pharmacological intervention point. By contrast, didate driver genes (green spheres), the mutation of which is recurrently inhibiting the tumour checkpoint may represent a more effective strategy, detected in large cohorts. Passenger mutations (pale blue spheres) that are as it captures the effect of all upstream mutations.

profiles19, by network hub analysis20 or by Cellular homeostasis and cancer of endogenous and exogenous variables. other integrative network analyses21. A specific A large subset of the cell’s regulatory logic This concept plays an especially important advantage of such stricter nomenclature is its presides over homeostasis, a concept that can part in cell physiology, where its main role ability to differentiate proteins representing be traced back to the French physiologist is to collapse a universe of potential cell effective pharmacological targets because of Claude Bernard in 1865 and its refinement states into a finite number of energetically their necessary and direct role in maintaining by Walter Cannon in 1929 (REF. 22). and physiologically viable ones, each tumour state from those that may be only Homeostasis was originally defined as representing a distinct cell lineage or sufficient, thus representing less crucial the set of regulatory processes that ensure the precursor stage. This process, better known targets. overall stability of a system, independent as Waddington’s canalization23, was initially

2 | ADVANCE ONLINE PUBLICATION www.nature.com/nrc ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

Box 1 | The tumour subtype paradox

The past decade has witnessed marked proliferation of tumour subtypes, of variability: noisy gene expression profiles (part c), independent mostly identified by gene expression clustering. For example, breast cancer clustering and heterogeneity of genetic alterations (parts a and b), has been divided into 4–10 subtypes4,5 and glioma into 3 or 4 subtypes3,79. potential coexistence of distinct tumour states in the same mass (tumour Yet, there is little consistency between expression-based subtypes and heterogeneity107) and stroma infiltration. Indeed, most tumour subtypes stratification based on genetic alterations. For example, consider the breast (including those of Perou) were identified by analysing a subset of the cancer cohort in The Cancer Genome Atlas (TCGA; see Further genes rather than by unbiased genome-wide analysis — with slight information). We used standard unsupervised cluster analysis108 to identify changes in gene selection producing significant variability in subtype the best four clusters based on distinct data types — somatic coding identification. mutations (SCMs; part a of the figure), copy number variants (CNVs; part b To show the poor reproducibility of gene expression measurement, we of the figure), gene expression profiles (GEPs; part c of the figure) and plotted the distribution of the 10 most differentially expressed genes in master regulator activity (MRA; part d of the figure). This enables each sample across all other same-subtype samples (part f of the figure). comparison of clustering results across these four data types and also As shown, high differential expression in one sample is not predictive of against the clinically relevant Perou classification16 — including luminal A differential expression across the full subtype. By contrast, the rank of the (LumA), luminal B (LumB), HER2+ (HER2) and basal-like (Basal). Silhouette 10 most differentially active proteins in each sample, as inferred by Virtual score (shaded waterfall plots) and Perou classification (colour coded bars) Inference of Protein Activity by Enriched Regulon Analysis (VIPER), is are shown above and below the heat maps in parts a–d, respectively. extremely well conserved across the corresponding subtypes . Surprisingly, but consistent with the tumour checkpoint hypothesis, Consistently, the most differentially active proteins in the LumA and three out of four of the clusters identified by either SCM or CNV analysis LumB subtypes — see, for example, forkhead box A1 (FOXA1), GATA presented relatively low silhouette score (SS) significance, suggesting low binding protein 3 (GATA3), oestrogen 1 (ESR1) — represent internal consistency, and relatively low overlap with GEP-based clusters. established luminal lineage driver and subtype-specific tumour

One SCM cluster was not even statistically significant (SSSCM = –0.01, P = 1). dependencies. As such, they are bona fide members of the LumA and Furthermore, part e shows the fraction of overlapping samples (above the LumB tumour checkpoints. expected background), for each cluster

compared with each –79 –80 –43 –7 –4 –7 –12 a 2.5 x 10 1 8.9 x 10 2.2 x 10 b 5.1 x 10 1.9 x 10 2.3 x 10 2.2 x 10 Perou subtype. As shown, the Perou classification was poorly recapitulated TP53 VCY PIK3CA ERBB2 by SCM and CNV analysis, CDH1 ORMDL3 GATA3 FAM90A7 except for the Basal MAP3K1 DEFA11P (SCM) and LumA (CNV) subtypes. Compared with SCM and CNV analysis, unsupervised GEP 1.2 x 7.4 x analysis improved cluster –62 –27 –27 –150 –269 –244 c 8.6 x 10 5.2 x 10 1.5 x 10 1.8 x 10 d 1.4 x 10 10–42 10–26 1.6 x 10 significance as well as Perou classification consistency, yet failed to fully differentiate PEG10 NPAS1 MUC5B ZNF703 between LumA, LumB KRT5 RARA GABRP ANKRD30A and HER2 samples (parts MUCL1 ZNF552 LTF XBP1 c and e of the figure), PIP FOXA1 CYP4Z1 ZNF442 which are intermixed in CLEC3A TCEAL1 CEACAM6 ZNF396 three out of four clusters. SCGB2A2 EMX1 SCGB1D2 ESR1 By contrast, MRA DHRS2 GATA3 BMPR1B AFF3 clustering (part d of the CYP2B7P1 BHLHE40 AGR2 GATA6 figure) not only improved TFF1 FOXC1 TFF3 BCL11A the statistical significance SLC7A2 NKX1–2 GFRA1 ZFP42 of the four inferred ESR1 TLX1 SCUBE2 POU4F1 clusters — a measure PGR PRDM13 CPB1 MSL3 of cluster self-consistency NPY1R CBX2 — but also most closely recapitulated the Perou classification, with highly –4 –2 0 2 4 Basal HER2 LumA LumB –10 –5 0 5 10 significant enrichment mRNA expression Protein activity of cluster 1 (Basal, −17 e SCMs CNVs GEPs MRA f P = 3.1 × 10 ), cluster 2 1.0 −19 Basal 45% 23% 76% 85% (LumA, P = 3.1 × 10 ), 0.8 cluster 3 (LumB, LumA 31% 46% 31% 95% 0.6 −4 P = 3.6 × 10 ) and cluster 4 0.4 −14 LumB 18% 19% 18% 18% (HER2, P = 2.6 × 10 ). 0.2 These data suggest that HER2 –2.4% 31% 18% 70% Relative rank 0.0 subtype proliferation may Basal LumA LumB HER2 have resulted, at least in 0 25 50 75 100 part, from three sources Overlap (%) Gene expression Tumour checkpoint

NATURE REVIEWS | CANCER ADVANCE ONLINENature Reviews PUBLICATION | Cancer | 3 ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

introduced to explain the robustness and representation of the transcriptional state of these tumour checkpoints and associated stability of developmental phenotypes the original tumour, despite aberrant ploidy, MRs would constitute a novel class of when confronted by genetic, epigenetic and mutational drift and massive changes in the mechanistic biomarkers and therapeutic environmental perturbations. It was later repertoire of microenvironment signals to targets11,29–31 and would further link cancer broadened to include the global mechanisms which they are exposed27. Similarly, cancer and development. that preside over the stability of cellular cells are easier to culture outside their phenotypes24. original milieu than normal cells. Indeed, The tumour checkpoint hypothesis The general view is that homeostasis is more than 1,000 cancer cell lines have been Diffuse large B cell lymphoma (DLBCL) disrupted in disease, including in cancer25,26 successfully established, compared with represents an ideal context in which to (FIG. 2). And yet, tumour-associated states only a handful of cell lines representing explore the tumour checkpoint hypothesis. seem to be remarkably stable (FIG. 1a), immortalized normal tissue28. Indeed, although never mutated in this suggesting that some form of stability Despite overwhelming evidence, malignancy, the canonical nuclear factor‑κB control, albeit a dysregulated one, must however, a systematic investigation of the (NF‑κB) heterodimer (p50–RELA) represents still be at play. Indeed, similar to their dystatic (short for dysregulated homeostatic) a crucial dependency of the activated B cell normal counterpart, cancer cell states — logic that ensues from transformation has (ABC) but not of the germinal centre B as represented by their transcriptional not been undertaken. Recent results suggest cell (GCB) subtype of DLBCL32. Indeed, signatures — are rather impervious to that dystatic cell states may be implemented aberrant NF‑κB activity was shown to arise endogenous and exogenous variability, by a handful of tumour-specific MR proteins from several mutations in its upstream from germline and somatic variants, to — working cooperatively within tight pathways, including in TNF-induced pharmacological perturbations, microenvi- modular structures (tumour checkpoints) protein 3 (TNFAIP3), myeloid differentiation ronment signals, temperature changes and — the aberrant activity of which is necessary primary response 88 (MYD88), TNF receptor nutrient availability. For instance, xenograft and sufficient for tumour phenotype associated factor 2 (TRAF2), TRAF5, CD79A, tumours in mice and even cancer cell lines presentation and maintenance7,9–11,29,30. If CD79B, MAP3K7 (also known as TAK1), can maintain a relatively high-fidelity this hypothesis were further confirmed, TNFRSF11A (which encodes RANK) and caspase recruitment domain family member 11 (CARD11)8,33,34 (FIG. 3). Moreover, Canalization Physiological phenotype landscape NF‑κB activity was essential for tumour HSC viability only in DLBCL cells harbouring 8 MLP these events . Finally, NF‑κB transcriptional Homeostasis targets (as assessed by a literature search) Regulatory • Resilience to endogenous network HSC Regulatory were significantly differentially expressed B cell and exogenous perturbations 8 network MLP • Germline variants, nutrients in the ABC-DLBCL signature , supporting and temperature its direct role in regulating subtype-specific Regulatory network B cell programmes. Taken together, these findings support the role of the p50–RELA complex Aberrant 35 canalization as a non-oncogene dependency and MR Pathological phenotype landscape protein, consistent with our definition. Although intriguing, however, these Tumour findings are not easily extended to plasticity Aberrant (tumour) homeostasis other tumour types. Specifically, initial identification of NF‑κB was more the product Regulatory of investigator ingenuity than of generalizable network DLBCL methodologies32. Additionally, detailed | Nature Reviews | Cancer Figure 2 Dysregulation of homeostatic control following malignant transformation and acti‑ information about pathways upstream of vation of dystasis control mechanisms that are responsible for the stability of tumour cell arbitrary candidate MR proteins is frequently state. This figure shows how normal cell physiology is determined by the energetic landscape of its regulatory networks, enabling cells to follow specific developmental trajectories that are highly insen- unavailable. Thus, systematic elucidation sitive to genetic, epigenetic and environmental variability, thus achieving stable end point states. This of MR proteins requires the development of process, also known as Waddington canalization23, is illustrated in a cartoon showing differentiation more effective methodologies to reconstruct from haematopoietic stem cell (HSC), to multi-lymphoid progenitor (MLP) to a fully differentiated and interrogate the genome-wide regulatory human B cell as a set of transitions to states of progressively lower energy and thus higher stability. logic of the cancer cell. Disruption of this regulatory landscape by genetic alterations and environmental signals leads to physiological­ state loss and emergence of novel, stable disease states, for example, diffuse large B cell MR and tumour checkpoint elucidation lymphoma (DLBCL). When multiple, quasi-isoenergetic states emerge, they can lead to coexistence of Defining MR proteins as mechanistic cells representing distinct tumour phenotypes in the same tumour mass or to tumour cell reprogram- regulators of the transcriptional signatures ming to different states following treatment, a process known as tumour plasticity. For example, it has that define distinct tumour states helps with been shown that cells representing both the mesenchymal and the proneural subtype of glioma can coexist in the same tumour107 and that a small fraction of cells treated with tumour necrosis factor their systematic identification. Specifically, (TNF)-related apoptosis-inducing ligand (TRAIL) transiently reprogramme to a TRAIL-resistant state133. it implies that the transcriptional targets of Whereas normal cell homeostasis presides over the stability of physiological cell states, by making them a bona fide MR protein must be enriched difficult to escape, we propose that a dysregulated form of these stability control processes (that is, in genes that are differentially expressed in tumour dystasis) is responsible for the stability of tumour-associated cell states and is mechanistically the cellular phenotype of interest (for implemented by a small number of master regulator (MR) proteins in a tumour checkpoint. example, tumour subtype, tumorigenesis,

4 | ADVANCE ONLINE PUBLICATION www.nature.com/nrc ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

Chronic BCR Autocrine signalling cytokine signalling Constitutive MYD88 signalling IFNβ IL-6 or IL-10 Cell membrane

IRAK4 Cytoplasm CD79B MYD88 inhibitors TIR domain SFK SYK Y Y CD79A or mutation P CD79B MYD88 inhibitors SYK Y Y TYK2 JAK1 P ITAM mutation CARD11 IRAK4 IRAK4 CC domain P IRAK1 IRAK1 P BTK Ibrutinib mutation P P P CARD11 STAT1 STAT3 PI3K PKCβ TRAF6 inhibitors PI3K MALT1 MALT BCL-10 inhibitors • Ruxolitinib • Enzastaurin Death • SB15108 • Sotrastaurine AKT–mTOR NF-κB IFN BTK pathway IRF4 pathway pathway A20 AKT loss inhibitors AKT Bortezomib Lenalidomide • Everolimus mTOR Survival • Temsirolimus Figure 3 | Diverse genetic alterations in upstream pathways contrib‑ clinical development of several BCR pathway inhibitors, such as ibrutinib, a ute to aberrant NF‑κB activity in DLBCL. Systematic analysis of genes Bruton tyrosine kinase (BTK) inhibitor. CARD11,Nature caspase Reviews recruitment | Cancer in pathways upstream of the nuclear factor‑κB (NF‑κB) complex revealed a domain family member 11; IFN, interferon; IL, interleukin; IRAK, IL‑1 r large repertoire of diffuse large B cell lymphoma (DLBCL)-specific genetic eceptor-associated kinase; IRF4, interferon regulatory factor 4; ITAM, alterations in B cell receptor (BCR) and myeloid differentiation primary immune receptor tyrosine-based activation motif; JAK1, Janus kinase 1; response 88 (MYD88) pathways. The presence of these mutations leads to MALT1, mucosa-associated lymphoid tissue lymphoma translocation pro- aberrant activation of the canonical p50–RELA heterodimer and associated tein 1; PKC, protein kinase C; STAT, signal transducer and activator of tran- tumour dependency. These mutations, which are more frequent in the acti- scription; TIR, Toll–interleukin receptor; TRAF6, TNF receptor associated vated B cell (ABC) subtype of DLBCL, have provided the rationale for the factor 6. Adapted with permission from REF. 134, Nature Publishing Group.

metastatic progression and drug sensitivity). Most of the published studies on the architecture in neurological phenotypes More specifically, the positively regulated unbiased inference of MR proteins as — including amyotrophic lateral sclerosis targets of an aberrantly activated MR mechanistic regulators of disease cell state (ALS)53, Alzheimer disease7,54, Parkinson protein should be overexpressed, whereas (even excluding those from our laboratory disease55 and alcohol addiction56 — and in the negatively regulated (that is, repressed) to avoid bias) either use the MARINa and developmental phenotypes, from regulation ones should be underexpressed (FIG. 4). VIPER algorithms directly (for example, of germinal centre formation39 to stem This provides a straightforward rationale see REFS 40–44) or rely on slightly modified cell pluripotency57. As briefly summarized for computational MR protein inference, versions (for example, see REFS 45–47). below, these examples further outline using enrichment analysis methods such By contrast, as mentioned above, there the role of tumour checkpoint MRs in as the Kolmogorov–Smirnov test36 — as are many network-based algorithms to regulating disease dystasis and their unique implemented by gene set enrichment identify key regulators, drivers and pathways nature compared with their physiological analysis (GSEA)37 or by its analytical involved in tumorigenesis that are not based counterpart involved in homeostatic control. derivatives, such as analytical rank-based on mechanistic regulation of target genes Clearly, accurate transcriptional enrichment analysis (aREA)38. Indeed, when (for example, see REFS 19,21). These are target inference, using reverse engineering coupled with accurate, de novo identification extensively covered by earlier reviews48,49 methodologies, is crucial to the effective of tissue-specific protein targets (regulons), and will not be further discussed here. elucidation of candidate MR proteins and this represents the conceptual foundation of MR analyses have helped elucidate merits explicit mention (see also BOX 2). the Master Regulator Inference Algorithm proteins that regulate tumour-associated Postgenomic research has seen intense (MARINa)10,39 and of its extension to single phenotypes as diverse as predisposition40,44, interest in the systematic dissection of sample analysis, Virtual Inference of Protein subtype-specific tumorigenesis10,43,50,51, the transcriptional and signalling logic of Activity by Enriched Regulon Analysis progression to aggressive or metastatic cancer cells (henceforth termed regulatory (VIPER)38. To further increase accuracy disease9,11,12,45,47,52, stroma-specific logic)48,53,55–57, including its transcriptional58–60, and sensitivity, these algorithms specifically regulation of tumour outcome41 and drug post-transcriptional61–63 and post- account for target directionality (that is, resistance29,30, most of which have been translational64–68 layers. These methods activated versus repressed targets), regulon experimentally validated. In addition, their were first developed and validated in overlap between candidate MR protein pairs use in non-cancer phenotypes has helped to bacteria69 and yeast70,71 and then extended to and target inference confidence38. elucidate an equivalent disease checkpoint mammalian cells21,59, thus complementing

NATURE REVIEWS | CANCER ADVANCE ONLINE PUBLICATION | 5 ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

a DNA b engineering methodologies. Indeed, based EGFR Inactive on thorough benchmarking, activity changes Transcription Transcriptomics RNA can be accurately measured for only ~70% of mRNA decay transcriptional regulators and ~60% of signal abundance RNA transduction proteins on average38. Although this is certainly adequate to support Protein discovery of many novel mechanisms, it is degradation Inactive still susceptible to substantial false-positive protein and false-negative inferences. Consistent with Proteomics Post-translational these observations, the use of tissue-specific Protein or modification regulatory models (interactomes) has proved isoform Underexpressed Overexpressed abundance genes genes crucial, as statistical significance of inferred MRs decreases when mismatched models are used9,38. Indeed, differential protein activity Subcellular localization c following short hairpin RNA (shRNA)- EGFR Active mediated silencing of Active proteins in different tumour contexts was protein complex Cofactor effectively detected using regulatory models VIPER interaction generated from the same tissue context Protein activity but not from mismatched tissues (BOX 2). Active proteins Interestingly, the advent of efficient and economical single-cell profiling may herald Regulon a new era in network reverse engineering as it will avoid averaging over key processes Downstream direct or (for example, cell cycle) and cell states (for indirect transcriptional targets Underexpressed Overexpressed genes genes example, tumour and stroma). This will help to avoid typical limitations imposed by Figure 4 | Protein activity inference from the expression of its regulatoryNature targets. Reviews a | ProteinCancer intratumoural heterogeneity, such as stroma activity is the ultimate result of a complex cascade of molecular processes, from transcription and and immune cell infiltration, thus producing translation, to post-translational modification, complex formation and localization to appropriate more accurate gene regulatory models and subcellular compartments. As a result, there are no individual assays that can accurately measure improving the accuracy of MR identification, protein activity in -wide fashion. Instead, we have proposed that an accurate estimator of which is substantially degraded when protein activity is represented by the gene expression of its transcriptional targets, that is, its regulon. tumour cellularity drops below 70%. This rationale is implemented by the Virtual Inference of Protein Activity by Enriched Regulon An additional limitation is the complexity Analysis (VIPER) algorithm, based on transcriptional targets inferred by reverse engineering algo- of inferring the activity of proteins that rithms such as Algorithm for the Accurate Reconstruction of Cellular Networks (ARACNe). b | When a protein is inactive its targets are randomly distributed in terms of differential expression. c | By share a substantial subset of their regulatory contrast, when the same protein is aberrantly activated, its positively regulated targets become sig- targets. Although specific solutions to nificantly overexpressed and its repressed targets become underexpressed. This can be effectively this problem have been proposed38,39, the and quantitatively assessed by gene expression enrichment analysis methods. EGFR, epidermal combinatorial complexity of target regulation growth factor receptor. is still not completely addressed, potentially resulting in false-positive findings requiring substantial experimental validation. Finally, pre-existing literature-curated models72 or from concomitant genetic and small a crucial issue is represented by the inference and large-scale experimental assays73. The molecule perturbations9. Reverse engineering of protein activity in a dynamic context — for Algorithm for the Accurate Reconstruction methods have been extensively discussed in example, following a drug perturbation — of Cellular Networks (ARACNe)59 was several reviews48,49,76 and will not be further rather than in regimes approaching cellular the first of its kind to be experimentally discussed in this Opinion article. equilibrium. In these transient conditions, validated in mammalian cells10,39,57,74 and the correlation between mRNA abundance has been among the most frequently used Limitations and potential improvements. and protein abundance may be time delayed to infer regulatory targets for MR analysis. An important question arises as to the or even inverted. Ultimately, we foresee However, various alternative approaches have technical limitations of MR analysis that more specialized kinetic models will be also been popular, from Bayesian networks75 methods and their reliance on accurate required to describe the relationship between to co‑expression network integration with identification of transcriptional targets. A target mRNA abundance and protein activity, chromatin immunoprecipitation–sequencing key problem is the considerable dependence leading to a more effective generation of (ChIP–seq) assays47. To achieve high of protein targets on tissue context. dynamic MR inference algorithms. accuracy, all these methods require relatively Indeed, MR targets are affected both by large collections (n ≥ 100) of gene expression their tissue-specific epigenetic activation MR discovery in cancer biology profiles representing a specific cellular or inactivation and by the availability or The following sections describe some crucial context, under conditions that induce broad mutation of cofactors participating in applications of MR analysis and some of gene expression variability, such as from regulatory complexes66,77,78. This highlights the general principles that have consistently genetically heterogeneous tumour samples10 the need for more sophisticated reverse emerged from multiple MR studies.

6 | ADVANCE ONLINE PUBLICATION www.nature.com/nrc ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

Box 2 | Alternative procedures for network reverse engineering and analysis Reverse engineering of regulatory networks represents a hallmark which is highly context specific) except forkhead box M1 (FOXM1), of systems biology, resulting in the development of hundreds of the regulatory programme of which is highly conserved across all algorithms. Most of these algorithms fit one of the following tested tissues owing to its role in cell cycle regulation131 (part c of the five categories109: experimental maps, describing physical figure). ARACNe, Algorithm for the Accurate Reconstruction of Cellular molecular interactions, such as protein–protein67,110,111 and protein–DNA Networks; BCELL, B cell lymphoma; BLCA, bladder urothelial cancer; interactions112,113; optimization-based approaches, searching for BRCA, breast invasive carcinoma; CESC, cervical squamous cell network topologies that optimally explain the observed data70,114; carcinoma and endocervical adenocarcinoma; COAD, colorectal differential equation-based kinetic models, representing the adenocarcinoma; ESCA, oesophageal carcinoma; GBM, glioblastoma; concentration of relevant molecular species as a function of time HNSC, head and neck squamous cell carcinoma; KIRC, kidney renal and/or space115–117; correlation and information-theoretical clear cell carcinoma; KIRP, kidney renal papillary cell carcinoma; LAML, approaches, producing genome-wide context-specific maps of acute myeloid leukaemia; LGG, brain lower grade glioma; LIHC, liver functional molecular relationships59,118,119; and integrative hepatocellular carcinoma; LUAD, lung adenocarcinoma; LUSC, lung approaches, leveraging partially independent clues for regulatory squamous cell carcinoma; MEF2B, myocyte enhancer factor 2B; OV, network reconstruction39,120. ovarian serous cystadenocarcinoma; PAAD, pancreatic adenocarcinoma; Although all these methods have been benchmarked on their ability PCPG, pheochromocytoma and paraganglioma; PRAD, prostate to reconstruct known topologies121, the dialogue for the reverse adenocarcinoma; PTTG1, pituitary tumour-transforming 1; READ, engineering assessments and methods (DREAM), a conference dedicated rectal adenocarcinoma; SARC, sarcoma; SKCM, skin cutaneous to evaluating systems biology algorithms based on double-blind melanoma; STAD, stomach adenocarcinoma; STAT3, signal transducer benchmarking criteria122,123, has shown that performance depends and activator of transcription 3; TGCT, testicular germ cell tumour; strongly on the data type and target network size. For instance, kinetic THCA, thyroid carcinoma; THYM, thymoma; UCEC, uterine corpus models have the highest level of granularity but their large number of endometrial cancer; XBP1, X-box binding protein 1. parameters limits their applicability to small networks with only a handful of molecules116. Optimization-based approaches, including Bayesian networks, provide an optimal choice for small to middle-sized networks70. a b Accuracy (%) Correlation and information-theoretical methods are best 100 75 50 25 0

suited to genome-wide regulatory network inference109 MEF2B BCL-6 BCL-6 FOXM1 STAT3 XBP1 FOXM1 PTTG1 Protein Cell line but provide only qualitative interaction models. BCELL GBM 1/633 17/13623/12927/109 Genome-wide networks can also be obtained by BRCA integrating experimental data from the literature or UCEC MYB ST486 high-throughput experiments, but these models generally KIRP CESC combine evidence from multiple, often non-physiological LAML cellular contexts that may lack tumour-context specificity. LUSC 1/1223 2/36 7/9 STAD BCL-6 Ly7 Information theory-based approaches have been used BLCA to produce genome-wide representations of yeast LIHC regulatory networks124,125 and context-specific models PCPG 2/1223 8/29 44/66 of mammalian cell regulation39,59,126. However, THCA ESCA BCL-6 Pfeiffer regardless of the level of complexity, the regulatory LUAD B cell lymphoma models they produce are context specific126,127. SARC PAAD Different cellular lineages express different genes and HNSC 2/633 3/1294/1366/932/15 even different protein isoforms, as expression products READ FOXM1 ST486 of alternative splice variants. Crucially, differential OV THYM availability of scaffolding proteins or adaptors can KIRC profoundly reshape (rewire) the topology of protein– COAD 1/835 26/7713/3771/12989/136 protein interaction networks128,129, and differential TGCT PRAD STAT3 SNB19 availability of cofactors and chromatin state between Ly7

cells can lead to strong rewiring of transcriptional Glioblastoma MCF7 MCF7 ST486 SNB19 P3HR1

58,66,78 Pfeiffer

networks . Thus, reliance on context-free, MDA231 literature-based pathway models130, arising from the biologist’s desire to linearize and visualize the cell’s ARACNe Ingenuity overlap ENCODE regulatory logic, can be misleading. The need for 0 25 50 75 100 ChEA Ingenuity activity context-specific models is highlighted in parts a and b Accuracy (%) of the figure, where we show that the ability of Virtual c Inference of Protein Activity by Enriched Regulon Analysis

(VIPER) to correctly assess the activity changes of BLCA BRCA COAD HNSC KIRC LGG LUAD OV PRAD SKCM THCA UCEC BLCA BRCA COAD HNSC KIRC LGG LUAD OV PRAD SKCM THCA UCEC 20 proteins encoded by silenced genes in lymphoma, BLCA BLCA glioma and breast carcinoma cells depends strongly on BRCA BRCA 38 COAD 15 COAD using the correct tissue-matched interactome . HNSC HNSC Moreover, as also discussed in REF. 38, regulatory KIRC KIRC 113 LGG 10 LGG models based on ChIP enrichment analysis (ChEA) LUAD LUAD and Encyclopedia of DNA Elements (ENCODE)60 OV OV PRAD 5 PRAD protein–DNA interaction data, as well as canonical SKCM SKCM pathways from the Ingenuity Pathway Analysis THCA 0 THCA platform72 failed to properly detect differential activity UCEC Regulon UCEC of most proteins (see for example, BCL‑6, the regulon of BCL-6 conservation score FOXM1

Nature Reviews | Cancer

NATURE REVIEWS | CANCER ADVANCE ONLINE PUBLICATION | 7 ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

The tumour checkpoint of mesenchymal more of these programmes. In glioma, analysis of aggressive prostate cancer glioblastoma. Glioblastoma (GBM) presents for instance, MES-GBM MRs were checkpoint MRs, using data from both with multiple molecular subtypes. Despite shown to regulate migration, invasion human biopsies and genetically engineered classification differences79,80, there is global and angiogenesis but not proliferation or mouse models9. Gene expression profiles consensus that the mesenchymal subtype apoptosis. As a result, although still crucially for interactome generation were obtained (MES-GBM) is associated with the poorest necessary for tumour state maintenance and from a high-quality human cohort, for prognosis7,79. The first reported example in vivo tumour viability, their co‑activation which both expression and outcome data of systematic MR analysis and validation is not sufficient for tumorigenesis. This were available84, and from prostate tissue elucidated a MES-GBM-specific tumour is easily explained: tumour checkpoint harvested from 13 distinct genetically checkpoint, including six MR proteins — analysis identifies MRs that are necessary engineered mouse models (including β and δ isoforms of CCAAT/enhancer- and sufficient to implement a specific wild-type controls) following in vivo binding protein (CEBP), signal transducer differential phenotype, as captured by the treatment with 14 compounds targeting and activator of transcription 3 (STAT3), corresponding differential gene expression prostate-relevant pathways (including class E basic helix–loop–helix protein 40 signature. Thus, the analysis will miss MRs dimethylsulfoxide (DMSO) as control (BHLHE40; also known as BHLHB2), of programmes that are unchanged in medium). A gene expression signature runt-related transcription factor 1 the transition. The differential expression of aggressive human prostate cancer was (RUNX1) and Fos-related antigen 2 signature used in the GBM study represents obtained by comparison of gene expression (FOSL2). These were validated as positive the difference between its mesenchymal in tumours with high Gleason score with mechanistic regulators of >74% of and its proneural subtypes, which are confirmed biochemical recurrence within MES-GBM signature genes. A seventh distinct, yet both tumour-associated states. 3 years, against gene expression of tumours MR — protein 238 (ZNF238) — Hence, tumorigenesis programmes are with low Gleason score with no biochemical was identified as an aberrantly inactivated essentially removed from the analysis. This is recurrence. Equivalent signatures were MR and was independently confirmed as important, because the specific selection of obtained from four distinct genetically a fibroblast-to‑neuron reprogramming cellular states that define the gene expression engineered mouse models associated with factor81. Surprisingly, these MRs were signature ultimately determines the role aggressive disease (harbouring Myc, Trp53, not among the 500 most differentially of identified MR proteins, and selection of Braf and Kras mutations). Strikingly, 8 of expressed genes and only one of them inappropriate signatures can drastically the top 20 human MRs identified by the (CEBPD) was later found to be focally affect the outcome of MR studies. analysis were also identified from each amplified in 30% of MES-GBMs7. This study highlighted six hallmark of the four mouse model-derived gene Further analysis revealed a highly properties of tumour checkpoints that were expression signatures. This included the only modular architecture, with CEBPβ, CEBPδ also confirmed in follow‑up studies, both in predicted synergistic MR pair — forkhead and STAT3 regulating each other as well cancer7,9,29,30,83 and in other phenotypes39,53,55–57. box M1 (FOXM1) and centromere protein F as the other MRs and jointly regulating Specifically, MR proteins form highly (CENPF) — the universal synthetic lethality the MES-GBM gene expression signature modular and tightly regulated structures of which in aggressive prostate cancer was through a large number of feed-forward loops (tumour or disease checkpoint modules) validated in vitro and in vivo. Synthetic (FIG. 5). Although only ~6% of CEBPβ and (FIG. 5); implement stable, redundant lethality of this MR pair in malignant STAT3 target genes were mesenchymal, control of downstream cancer programmes prostate cancer emerged as virtually based on , 36% of targets through multiple feed-forward loops; independent of both genetic and organism regulated by both CEBPβ and STAT3 regulate each other, both directly and via background, thus further supporting the were mesenchymal, suggesting synergistic autoregulatory loops, thus increasing tumour hypothesis that checkpoint MRs uncouple interaction39. Indeed, ectopic co‑expression checkpoint stability; are highly enriched in tumour phenotypes from upstream . of CEBPβ and STAT3 (but not of either tumour-specific dependencies, including These results were further confirmed in a one in isolation) was sufficient to induce essential MRs and/or synthetic lethal MR pairs; cohort of >900 patients, for which tissue reprogramming of neural stem cells along an are differentially active but not generally microarrays (TMAs) were available. Although aberrant mesenchymal lineage. Conversely, differentially expressed; and only rarely almost no patients with double negative shRNA-mediated silencing of both genes harbour genetic alterations. Among these, immunohisto­chemistry (IHC) staining for (but not of either one in isolation) was the first four points also represent crucial FOXM1 and CENPF at diagnosis died of the sufficient to abrogate viability of MES-GBM hallmarks of homeostatic control. Indeed, disease for up to 20 years from diagnosis, tumours in mouse xenograft models. Pure because of their inherent stability, tumour those with double positive FOXM1 and feed-through regulation, for example, checkpoint collapse generally requires CENPF IHC constituted 90% of prostate as implemented by an ‘AND’ or ‘OR’ co‑inhibition of two or more proteins (that cancer-associated deaths, suggesting that logic, cannot explain CEBPβ and STAT3 is, synthetic lethality). This decreases the co‑activation of FOXM1 and CENPF is dependency for both module activation and chances of detecting key synthetic lethal MRs, necessary and sufficient to implement the inactivation. As was later confirmed7, this such as STAT3 and CEBPβ in MES-GBM, most aggressive form of these tumours that implies existence of an autoregulatory logic by pooled RNA interference (RNAi) or are associated with poor prognosis. implemented by feedback loops. CRISPR-based functional screens that target Notably, cancer states represent the individual genes. Integration with functional screens. superposition of multiple independent MR analysis may also be used effectively hallmarks82. This requires the potential Cross-species MR analysis. The hallmarks to mitigate discovery of idiosyncratic coexistence of multiple tumour checkpoints, of a tumour checkpoint oncotecture were dependencies from pooled RNAi screens. each one regulating or buffering one or further recapitulated by cross-species For instance, isogenic MCF‑10A breast

8 | ADVANCE ONLINE PUBLICATION www.nature.com/nrc ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

CEBP

PLAU STAT3 BHLHE40 IL32 MYL9 PTRF ITGA7 FOSL2 HOMER3 STEAP3

IFITM3 PLAUR

IFITM2 B4GALT1 SHC1 ANGPTL4 PVRL2 NRP1 ANGPT2 S100A11 ANPEP SLC39A8 TAGLN RUNX1

PAPPA EFEMP2 EFNB2 PI3 TMEPAI EMP3 EMP1 C1QTNF1 PDLIM1 PDLIM4 OSMR MYH9 TIMP1 THBD SGSH RRBP1 SLC16A3 SLC25A37 ICAM1 ITGA3 ZYX ECE1 COL4A1 SERPINH1 SOCS3 CD97 FCGR2C TPP1 ESM1 PDPN BACE2 ACTA2 EHD2 TNC C1R NRP2 SERPINE1 ITGA5 C1RL FAM38A BCL3 FPRL1 HRH1 CD151 SERPINA1 THADA CHI3L1 PDLIM7 PLA2G5 FCGR2A LIF PELO SLC22A18 FLNA MMP14 CA12 MVP COL4A2 ACTN1

Figure 5 | Tumour checkpoint architecture of the mesenchymal sub‑ chitinase 3 like 1; COL4A1, collagen type IV 1 chain; ECE1, endothelin con- α Nature Reviews | Cancer type of glioblastoma. Transcription factors involved in the activation of verting enzyme 1; EFEMP2, EGF containing fibulin like extracellular matrix mesenchymal glioblastoma (MES-GBM) subtype are shown in purple. protein 2; EFNB2, ephrin B2; EHD2, EH domain containing 2; EMP1, epithelial Overall, the six transcription factors shown in this figure — CCAAT/enhancer- membrane protein 1; ESM1, endothelial cell specific molecule 1; FCGR2A, Fc binding protein‑β (CEBPβ) and CEBPδ are represented by CEBP, for simplicity, fragment of IgG receptor IIa; FLNA, filamin A; FOSL2, Fos-related antigen 2; as they form homodimers and heterodimers — control 74% of the genes in FPRL1, formyl peptide receptor-like 1; HRH1, histamine receptor H1; ICAM1, the mesenchymal signature of high-grade glioma. A region between 2 kb intercellular adhesion molecule 1; IFITM, interferon induced transmembrane upstream and 2 kb downstream of the transcription start site of the target protein; IL32, interleukin‑32; ITGA7, integrin subunit α7; LIF, leukaemia inhib- genes identified by Algorithm for the Accurate Reconstruction of Cellular itory factor; MMP, matrix metalloproteinase; MVP, major vault protein; MYH9, Networks (ARACNe) was analysed for the presence of putative binding sites. myosin heavy chain 9; MYL9, myosin light chain 9; NRP2, neuropilin 2; OSMR, When combined with analysis of gene expression profiles following short oncostatin M receptor; PAPPA, pappalysin 1; PDLIM4; PDZ and LIM domain hairpin RNA (shRNA)-mediated silencing of these transcription factors, the 4; PDPN, podoplanin; PELO, pelota homologue; PI3, peptidase inhibitor 3; latter were shown to bind and regulate the large majority of MES-GBM sig- PLA2G5, phospholipase A2 group V; PLAU, plasminogen activator, urokinase; nature genes (shown in pink). In addition, CEBP (both β and δ subunits) and PLAUR, PLAU receptor; PVRL2, poliovirus receptor-related 2; PTRF, polymer- signal transducer and activator of transcription 3 (STAT3) were shown to ase I and transcript release factor; RRBP1, binding protein 1; regulate the other three transcription factors in the tumour checkpoint and RUNX1, runt-related transcription factor 1; SGSH, N-sulfoglucosamine sul- to synergistically regulate the state of MES-GBM cells. ACTA2, actin α2; fohydrolase; S100A11, S100 calcium binding protein A11; SLC39A8, solute ACTN1, actinin α1; ANGPT2, angio­poietin 2; ANPEP, alanyl aminopeptidase; carrier family 39 member 8; SOCS3, suppressor of cytokine signalling 3; BACE2, β-site APP-cleaving enzyme 2; B4GALT1, β-1,4‑galactosyltransferase TAGLN, transgelin; THBD, thrombomodulin; TIMP1, tissue inihibitor of met- 1; BHLHE40, class E basic helix–loop–helix protein 40; CA12, carbonic anhy- alloproteinase 1; TMEPAI, transmembrane prostate androgen-induced pro- drase 12; C1QTNF1, C1q and tumour necrosis factor related protein 1; C1R, tein; TNC, tenascin C; TPP1, tripeptidyl peptidase 1; ZYX, zyxin. Adapted with complement C1r; C1RL, complement C1r subcomponent like; CHI3L1, permission from REF. 10, Nature Publishing Group.

epithelial cell lines, with and without ectopic which were validated as MRs of trastuzumab kinase (JAK)–STAT3 autoregulatory HER2 expression (MCF‑10A/HER2 versus (a HER2‑inhibiting antibody) resistance. loop, independent of trastuzumab- MCF‑10A), were screened in 2D and 3D Indeed, their silencing in trastuzumab- mediated inhibition of HER2. Consistently, cultures using a pooled, genome-wide RNAi treated, resistant tumours elicited synthetic trastuzumab and ruxolitinib (a US Food library. Integrative analysis of RNAi screen lethality in vitro and in vivo29, leading to the and Drug Administration (FDA)-approved and MR analysis identified only two genes discovery of an autocrine STAT3–interleukin JAK inhibitor) combination therapy in common (STAT3 and agrin (AGRN)), 6 (IL‑6)–IL‑6 receptor (IL6R)–Janus effectively abrogated the viability of

NATURE REVIEWS | CANCER ADVANCE ONLINE PUBLICATION | 9 ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

trastuzumab-resistant patient-derived despite initial and often striking responses tumours. For example, subclonal expansion xenografts (PDXs)29, providing the rationale to therapy, most cancer patients will of drug-resistant clones, for instance, is a for a phase I/II clinical trial85. relapse with poor prognosis tumours that well-studied relapse mechanism. Indeed, no longer respond to chemotherapy or because of the extensive genomic diversity of The oncotecture of tumour relapse targeted therapy. Thus, there is a pressing individual tumour cells, it is only a matter Progression to drug-resistant disease need for novel methodologies to identify of time before one or more drug-resistant represents perhaps the most substantial therapeutic targets that either predict subclones emerges with bypass or challenge in cancer treatment. Indeed, relapse or rescue drug sensitivity in relapsed alternative mutations86,87. However, many additional adaptive mechanisms exist that modulate drug sensitivity epigenetically, 88 89 Glossary including transient or stable epigenetic reprogramming, as well as by triggering Bayesian networks tested to assess whether they are mutated in a cancer mechanisms of resistance that are purely cohort, thus increasing the overall probability that any one Probabilistic models representing the probabilities and 29 conditional probabilities of variables representing nodes of them may be mutated by chance. regulatory in nature — for instance, from in a directed acyclic graph, that is, a type of network activation of autocrine feedback loops, with nodes connected by directionally defined edges, Oncogene addiction hypothesis as discussed above. In each case, drug usually represented as arrows, with no circular paths The hypothesis that tumours become dependent on the resistance is associated with a stable and aberrant activity of proteins encoded by mutated (cycles). This model is often used to determine the molecularly distinct tumour cell state. It probability of an unknown event corresponding to one oncogenes and that pharmacological inhibition of these node as a function of the probability of other known proteins will cause tumour demise. is thus reasonable to consider whether events in the network. these states might also be implemented Oncotecture and maintained by aberrant activity Dystatic The regulatory architecture responsible for implementing of specific tumour checkpoints and Describes the aberrant processes responsible for tumour dystasis, comprising one or more tumour implementing and maintaining a stable disease state, checkpoints that are responsible for integrating the effect of whether pharmacological targeting of the independent of most genetic, epigenetic and multiple mutations and aberrant signals in their upstream corresponding MRs might rescue sensitivity environmental perturbations. pathways to implement a conserved repertoire of or even prevent relapse. downstream transcriptional programmes that are necessary and sufficient for tumour phenotype presentation. Feed-forward loops MRs of glucocorticoid resistance in Regulatory structures in which a gene product (A) regulates a second gene product (B) and, together, A and B regulate Regulatory logic T cell acute lymphoblastic leukaemia. one or more target gene products. These structures The full complement of transcriptional, post-transcriptional The tumour checkpoint–MR conceptual constitute a directed acyclic graph. and post-translational molecular interactions that framework was applied to the study of determine cell behaviour. glucocorticoid (GC) resistance in T cell Homeostasis The set of regulatory processes that ensures the overall Regulons acute lymphoblastic leukaemia (T‑ALL). stability of a system state, independent of endogenous and The full complement of transcriptional targets that are Despite their highly heterogeneous genetic exogenous variables. regulated by a protein. These can include either direct determinants90, first-line therapy for these physical targets, for transcription factors and cofactors, or tumours universally relies on a class of indirect targets for signal transduction. Interactomes high-affinity, cognate binding agonists of Sets of molecular interactions that constitute the regulatory logic of cells representative of a specific tissue Reverse engineering the GC receptor subfamily and organism. Systematic dissection of the molecular interactions that 3 group C member 1 (NR3C1). Although comprise the regulatory logic of the cell. never mutated in leukaemias, NR3C1 Isobologram regulates apoptosis by direct transcriptional A two-dimensional map representing the viability of a cell Sample barcoding following perturbation with a combination of two A technique enabling incorporation of a predefined nucleic activation of the pro-apoptotic factor compounds. Each isobologram axis represents the acid sequence (barcode) to tag either DNA or RNA BCL‑2‑like protein 11 (BCL2L11; also concentration of an individual compound. This diagram is molecules coming from a common sample so that they can known as BIM). Remarkably, 70% of used to quantitatively assess whether the effect of the be sequenced as a pool while retaining the ability to GC‑treated T‑ALLs undergo long-term deconvolute which sample they came from. compound pair is additive, synergistic or antagonistic. This remission. However, 30% of them relapse assessment is based on whether the isoboles (that is, curves in the diagram representing isopotency of the Silhouette score within weeks with GC‑resistant disease and combination) are linear or nonlinear. Linear isoboles A measure, from –1 to 1, of how similar an object is to its poor prognosis. indicate additive behaviour, whereas nonlinear isoboles own cluster compared with the closest other cluster. To identify MRs of GC‑therapy indicate synergistic (supra-additive) or antagonistic resistance, we analysed gene expression (sub-additive) behaviour. Synthetic lethal An interaction between two genes in which knockout of signatures collected at diagnosis (that is, Master regulator either gene in isolation has no (or minimal) negative impact before treatment) from 22 patients with (MR). Type of protein participating in a modular on cell viability whereas knockout of both genes is lethal. GC‑resistant disease and 10 patients regulatory structure (that is, tumour checkpoint), the with GC‑sensitive disease. Differential Tissue microarrays aberrant activity of which is both necessary and sufficient gene expression of resistant versus sensitive for tumour cell state implementation and maintenance (TMAs). Paraffin blocks in which up to 1,000 separate and which directly controls the transcriptional state of a tissue cores are embedded in a grid to support multiplexed tumours was then used for MR analysis, tumour cell. immunohistological analyses. using a T‑ALL-specific interactome. The analysis identified nine MR proteins, three Multiple hypothesis testing Tumour checkpoint of which — serine/threonine protein When a test is repeated multiple times — for instance, a A small, autoregulated module comprising one or more pair of dice are thrown multiple times — the probability of master regulator (MR) proteins, the concerted activity of phosphatase 2A 56 kDa regulatory subunit‑δ an outcome must be adjusted. This is a common problem which is both necessary and sufficient for the (PPP2R5δ), galactosylgalactosylxylosyl- in genetics, for example, where a large number of loci are implementation and maintenance of a tumour cell state. protein 3‑β‑glucuronosyltransferase 3

10 | ADVANCE ONLINE PUBLICATION www.nature.com/nrc ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

(B3GAT3) and AKT1 — rescued sensitivity assays rather than to discovery assays. By a putative adaptor protein in the cullin 3 to dexamethasone (a NR3C1 agonist), contrast, classic drug discovery methods (CUL3) ubiquitin ligase complex — in ~50% when silenced in GC‑resistant cells. screen for targets and inhibitors using cell of the MES-GBM tumours. Experimental AKT1‑mediated NR3C1 phosphorylation lines or animal models, the relevance of validation and independent cohort analysis at Ser134, in particular, prevented NR3C1 which to human disease is questionable, not confirmed that competent KLHL9 activity is nuclear translocation and transcriptional quantitatively assessed and often subjective, crucial for ubiquitin-dependent proteasomal activation of BCL2L11 (REF. 30). Consistently, thus reducing the probabilities that results degradation of CEBPβ and CEBPδ, as well treatment with MK2006 (an AKT1 may translate to clinical studies. as for STAT3 dephosphorylation. Consistent inhibitor) rescued dexamethasone sensitivity with this result, IHC analysis of KLHL9−/− in vitro and in vivo, including in PDXs Genetic determinants of MR activity tumours showed significant increase in transplanted with leukaemic cells from Similar to p50–RELA in ABC-DLBCL, lack CEBPβ and CEBPδ protein abundance, patients with GC‑resistant disease30. AKT1 of differential MR expression or mutations and their in vivo viability was dramatically thus emerged as a key tumour checkpoint suggests that aberrant activity of MES-GBM reduced by ectopic KLHL9 expression7. MR downstream of recurrent T‑ALL MRs (that is, CEBPβ, CEBPδ and STAT3) Similar analyses were also carried out in mutations, including in PI3K and PTEN and may result from genomic alteration in breast cancer7 using different MRs, inferred further mediating resistance by inducing their upstream pathways. Although several at the individual sample level. mTOR-mediated increase in the expression algorithms can detect association of These results define two distinct of the MCL1 antiapoptotic gene91. genomic alterations with differential gene protein classes: the first class includes MR A key aspect of this study is that it expression (genetical-genomics)94 or perform proteins (for example, CEBPβ and STAT3), confirmed the ability to extend MR analyses network-based clustering of genomic co‑activation of which is strictly necessary to signal transduction proteins, based alterations95 (see REF. 48 for a review), there and sufficient for tumour phenotype on identification of their least-indirectly are no methods for the genome-wide, presentation. The second class includes regulated transcriptional targets, as also tumour-specific exploration of alterations in MR modulators, dysregulation of which shown in REF. 38. signalling pathways upstream of candidate by mutations or aberrant signals induces MR proteins. aberrant MR activity (for example, KLHL9 MR‑based model fidelity assessment The Driver-gene Inference by Genetical- and CEBPδ). Proteins in the latter class, which The above-discussed studies highlight a key Genomics and Information Theory includes most of the established oncogenes property of tumour checkpoints, specifically (DIGGIT) pipeline7, available from and tumour suppressors, may be sufficient their conservation, independent of specific Bioconductor96, was developed precisely to but not necessary for tumour checkpoint driver mutation patterns and even across address this challenge. DIGGIT relies on the activity and phenotype presentation and are different organisms. This is important Modulator Inference by Network Dynamics thus not predicted to be MRs by MARINa. because, compared with their human (MINDy) algorithm66 to systematically For instance, KLHL9 is deleted in only ~50% counterparts, driver mutations and microen- identify all candidate modulators of one of MES-GBM samples, suggesting that this vironment signals in cell lines and preclinical or more MR proteins in their upstream event is not strictly required for MES-GBM models are highly idiosyncratic. This suggests pathways. One may argue that extensive subtype implementation. In rare cases in that tumour checkpoint conservation, as feedback control — including via paracrine which aberrant MR activity results from objectively assessed by MR enrichment interactions with the stroma — may genetic alterations at its locus (for example, analysis, may support more effective use dramatically inflate the number of upstream CEBPD), these two classes may overlap. of these models as high-fidelity, predictive pathway proteins. Yet, MINDy-based When combined with identification experimental surrogates. This would be quite inference reveals only a relatively small of mutations in pathways upstream of valuable, as validation in cell lines and in number (10–100) of statistically significant NF‑κB in DLBCL and with conservation preclinical models — including genetically modulators of any candidate MR, most of of prostate cancer MRs in four different engineered mouse models and xenografts which are experimentally confirmed, thus genetically engineered mouse models — is considered mandatory on the path to dramatically reducing the potential search with distinct driver mutations, these clinical translation92. space for mutations that may affect MR studies further confirm that tumour More specifically, these results support activity7,66. checkpoints represent the cellular logic that the idea of a human‑to‑mouse‑to‑human For instance, among the multitude of is responsible for integrating the effect of (H2M2H) methodology31,93, in which genes of which the expression was affected large and heterogeneous genomic alteration preclinical models are optimally matched by recurrent copy number alterations repertoires into virtually indistinguish- to human tumours based on the statistical (CNAs) in the MES-GBM subtype, MINDy able tumour states. This also supports significance of their tumour checkpoint MR inferred that only 92 genes were either MRs the role of tumour checkpoint MRs as overlap. This would increase the probability or upstream MR-activity modulators. Of highly conserved, mechanistic therapeutic of results from preclinical studies using these these, only 21 showed significant association targets and biomarkers. Indeed, based on models being relevant to human disease. between their corresponding CNA and the the proposed oncotecture model, tumour As all the steps required for MR inference activity of MES-GBM MRs. Conditional checkpoints emerge as the regulatory pillars — including network reverse engineering association analysis of these 21 genes of tumour dystasis and thus, potentially, and MR analysis — rely exclusively on revealed only two statistically independent as the universal Achilles heels of cancer. patient-derived tissue, MR identification is alterations, including amplification of Consistent with this observation, these and completely free of bias from using cell line CEBPD, itself an MR, in ~30% of MES-GBM other studies have shown that collapse of and model organisms. This limits the use tumours, and homozygous deletion of tumour checkpoint activity — as induced of MR‑matched models only to validation kelch-like family member 9 (KLHL9) — from genetic9,10 or pharmacological11,29–31

NATURE REVIEWS | CANCER ADVANCE ONLINE PUBLICATION | 11 ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

inhibition of individual MRs or MR pairs high-penetrance driver mutations — such have shown that systematic perturbational — induces crucial loss of tumour state and as BCR–ABL translocation events in CML98, assays, either in vitro104,105 or in vivo31, abrogation of tumour viability or rescue of KRAS mutations in up to 90% of pancreatic using tumour-representative cell lines drug sensitivity, in vitro and in vivo. This ductal adenocarcinomas101, BRAF mutations or mouse models, can help to prioritize has important consequences in terms of in almost 50% of melanomas102 and ERBB2 targeted inhibitors optimally suited to the potential use of tumour checkpoints amplification in 20–30% of breast ductal the abrogation of tumour checkpoint MR as more universal (that is, mutation adenocarcinomas103 — we are starting to activity, including MRs that are usually independent) tumour vulnerabilities that are run out of such low-hanging fruits. It is thought to be undruggable11,31. The rationale intrinsically less susceptible to typical relapse thus reasonable to expect a long tail of more for these studies is that, although crucial mechanisms. Indeed, whenever relapse is complex (additive or epistatic) genomic tumour dependencies may be substantially induced by alternative or bypass mutations alteration patterns that will be increasingly different in the human tumour context versus or by epigenetic mechanisms that act less recurrent and thus harder to dissect via in vitro and in vivo models, drug MOA is upstream of the same tumour checkpoint, its statistical inference methods. Among other generally reasonably well conserved. Indeed, direct targeting should mitigate relapse29,30. things, this suggests that existing targeted pharmacological compound MOA is typically inhibitors could be used more effectively elucidated in vitro. Of course, compounds Extending oncogene dependency in the clinics if oncoprotein activity could that modulate the immune system — such Tumorigenesis arises from the aberrant be assessed directly, rather than indirectly as the recent repertoire of cytotoxic T activity of multiple gene products working based on mutations (BOX 3). lymphocyte-associated antigen 4 (CTLA4) in concert. Activating mutations represent Indeed, VIPER-inferred oncoprotein and programmed cell death protein 1 (PD1; a highly reproducible predictor of aberrant activity systematically outperformed also known as PDCD1)–PD1 ligand 1 (PDL1; oncogene activity and can thus be used to activating mutations in predicting sensitivity also known as CD274) inhibitors — would pinpoint valuable pharmacological targets, to targeted inhibitors by an order of have to be investigated either in immune as proposed by the oncogene addiction magnitude in statistical significance on system-related cell lines or in transgenic hypothesis97. Indeed, following the clinical average. For instance, the P value of erlotinib mouse models with an intact immune system. success of imatinib — an inhibitor of sensitivity prediction in 80 Cancer Cell Line The ability of a compound to inhibit the BCR–ABL fusion protein in chronic Encyclopedia (CCLE) lung cancer cell lines specific MR proteins is directly reflected in myelogenous leukaemia (CML)98 — oncogene is P = 3.3 × 10–2 when the mutational status of the inversion of expression of its regulon. As addiction has become a de facto paradigm for epidermal growth factor receptor (EGFR; the a result, the same regulons used to identify targeted therapy in precision medicine. target of erlotinib) is used, and P = 3.2 × 10–3 tumour MRs represent ideal quantitative Yet, direct mutations are only one of many when VIPER-inferred EGFR activity is used reporter assays to prioritize their candidate ways to induce aberrant protein activity, (BOX 3). This suggests that the oncogene inhibitors, specifically by measuring and there are innumerable alternatives addiction paradigm could be complemented enrichment of their regulons in genes to achieve a virtually identical objective. and extended by direct measurement or that are differentially expressed following These include mutation in upstream inference of protein activity. compound perturbation, for example, as modulators7,8, cognate binding partners99 discussed in REF. 38. and autoregulatory loops29, among others8,100. Pharmacological MR targeting To validate this approach, we performed Even assuming a single functional mutation Elucidation of tumour checkpoint MRs and systematic perturbation of human DLBCL per gene product, the activity of oncoproteins MR pairs that represent either individual cells, including OCI‑LY7 cells using 14 regulated by ≥100 other gene products, (essential) or synergistic (synthetic lethal) compounds and then of OCI‑LY3, OCI‑LY7 such as MYC66, could be dysregulated by dependencies, independent of the associated and U-293 cells using 92 FDA-approved any combination of mutated oncogenes out genomic alterations, provides a largely novel compounds, at different time points and at of the 100 possible ones. This accounts for and more universal repertoire of targets for multiple concentrations. Gene expression more than 1030 distinct genomic alteration pharmacological intervention. However, profiles from these assays were then used patterns, as computed by the binomial many of the MRs and MR pairs that to systematically elucidate the MOA of the coefficient formula. Even if most of them determine tumour checkpoint activity are compounds105 and their ability to target failed to induce pathogenic activity levels, transcription factors, which are considered disease-specific MR genes, either individually it is still reasonable to expect the existence challenging or even undruggable targets. As or in combination (synergistically)104. This of a large number of distinct, non-recurrent a result, even if MR proteins were effectively successfully demonstrated that interrogation yet tumorigenic mutational patterns. For validated as valuable therapeutic targets, of gene regulatory networks reconstructed instance, PTEN activity in glioma was shown development of small molecule inhibitors de novo in context-specific tumours can help to be dysregulated by multiple deletions of that modulate their activity may require an to elucidate compound MOA and identify any among 13 frequently deleted genes61, entirely novel chemistry. compounds and compound combinations each individually affecting PTEN expression Yet, the current repertoire of FDA- that target the activity of DLBCL subtype- by less than 20%. Unfortunately, owing to approved drugs includes ~2,000 compounds, specific MR proteins, thus reducing multiple hypothesis testing, evaluating the with an additional ~400 oncology-related tumour viability. Surprisingly, for instance, statistical significance of such polygenic investigational compounds in late-stage at a low false discovery rate (FDR = 8%), patterns would require prohibitive cohort (phase II/III) clinical studies. Many of these >60% of predicted synergistic compound sizes. have mechanisms of action (MOAs) that combinations were experimentally Two important observations emerge are only partially elucidated and may thus confirmed by isobologram analysis104,106. These from these considerations. First, despite represent unexpected, tumour-specific data were valuable for the identification of successful identification of key recurrent, inhibitors of MR proteins. Specifically, we drugs and drug combinations that abrogate

12 | ADVANCE ONLINE PUBLICATION www.nature.com/nrc ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

viability of aggressive follicular lymphoma could be dramatically reduced without loss approaches that explore the mechanistic cells following transformation to DLBCL11, of accuracy. This would enable cost-effective architecture of disease state maintenance by targeting experimentally validated MRs screening of large compound libraries, downstream of the genes associated with and synergistic MR pairs. In vivo, we recently including, for example, FDA-approved and its aetiology. In this Opinion article, showed that a similar approach could late-stage investigational compounds. we have explored a recently proposed identify drug combinations that abrogate regulatory architecture in which MR FOXM1 and CENPF activity in multiple Conclusions proteins implement tightly autoregulated transgenic mouse models of prostate cancer, Despite considerable progress in integrative tumour checkpoint modules that are in which these were validated as synergistic and systems biology approaches, responsible for the stability of cancer MR pairs31. The most statistically significant computational assessment of gene–function states, independent of specific genetic drug combination identified by the analysis relationships is still mostly carried out mutations, using mechanisms reminiscent produced a 9‑fold decrease in tumour one gene at a time. The use of pathways is of physiological homeostasis. Such dystatic volume in vivo, in tumour xenografts that generally limited to visual diagrams that regulatory oncotecture may provide a failed to respond to conventional therapy. poorly represent their context-specific conceptual framework to help unravel the A key value of network-based approaches nature and their true connectivity. This is complexity and profound heterogeneity is that MR inference can tolerate very low particularly ineffective in studying complex of cancer genetics. For instance, it may RNA sequencing depths, down to 500,000 polygenic dependencies, including synthetic provide a tractable set of constraints reads, with virtually no decrease in the lethal interactions that cannot be gleaned on to interpret the evolutionary selection accuracy of MR activity prediction (Pearson’s an individual gene basis and yet are crucially of additional mutational events that correlation coefficient ≥0.95)38. This suggests relevant for the systematic development of may confer selective advantage to the that with appropriate sample barcoding combination therapies. corresponding cancer cell subclones, by techniques for the creation of pooled Recent progress in the systematic and affecting upstream modulators of tumour RNA-seq libraries, the cost of individual context-specific mapping of regulatory checkpoint MRs that control proliferative profiles from compound perturbations and signalling interactions supports novel programmes.

Box 3 | Protein activity inference complements analysis of genetic alterations in assessing aberrant oncoprotein activity Cancer is driven by the aberrant activity of proteins (oncoproteins) working in concert to regulate tumour cell states. a AUC GSEA P value Indeed, pharmacological inhibition of aberrantly activated oncoproteins can elicit oncogene dependency, which represents Activity 3.2 × 10–3 the basis for the oncogene addiction paradigm and the foundation of current efforts in precision cancer medicine97. Mutations 3.3 10–2 Activating genetic alterations have thus emerged as an important × rationale for the identification of therapeutic targets in individual patients. Yet, activating mutations represent only one of many AUC possible ways of dysregulating protein activity, which may also result from a vast complement of genetic and epigenetic events Activity in cognate binding partners99 and upstream regulators7. For instance, lung adenocarcinoma (LUAD) cells with activating Mutations mutations in a specific oncogene (for example, the epidermal growth factor receptor (EGFR)) may be sensitive to the Mutated but not corresponding targeted inhibitor (erlotinib). Yet cells lacking Activated but not mutated activated these mutations may also present equivalent sensitivity28,132 (see part a of the figure, which shows 80 Cancer Cell Line Encyclopedia (CCLE) lung cancer cell lines ranked by their –3 –1.5 0 1.5 3 –14 –7 0 7 14 sensitivity to erlotinib — the expanded inset shows the 25 cell Protein activity (NES) AUC (z-score) lines with the highest sensitivity to erlotinib). Conversely, owing to homeostatic feedback mechanisms and epigenetic allele silencing, activating mutations may not guarantee aberrant b GSEA Activity P value oncoprotein activity. As a result, the direct or inferred assessment of protein activity may constitute a valuable complement to Mutations 7.8 × 10–5 genetic analysis for prioritization of targeted therapy, especially as a large proportion of cancer patients present no actionable oncogene mutations. Indeed, samples from The Cancer Genome Activity Atlas (TCGA) LUAD cohort show striking enrichment of activating EGFR mutations in samples with the highest Virtual Inference of Mutations Protein Activity by Enriched Regulon Analysis (VIPER)-inferred EGFR activity (part b of the figure — the expanded inset shows samples with the highest activity). However, many samples showed comparably high activity but no activating mutations, Activated but not mutated suggesting that they may also respond to targeted EGFR inhibitors, as shown for the cell lines. AUC, area under the curve; –6 –3 0 3 6 GSEA, gene set enrichment analysis; NES, normalized expression score. Protein activity (NES) Nature Reviews | Cancer

NATURE REVIEWS | CANCER ADVANCE ONLINE PUBLICATION | 13 ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

There are several crucial unaddressed characterized by abnormalities in PDGFRA, IDH1, 31. Mitrofanova, A., Aytes, A., Shen, C., Abate-Shen, C. & EGFR, and NF1. Cancer Cell 17, 98–110 (2010). Califano, A. A systems biology approach to predict issues implied by such a tumour- 4. Curtis, C. et al. The genomic and transcriptomic drug response for human prostate cancer based on independent oncotecture. These include architecture of 2,000 breast tumours reveals novel in vivo preclinical analyses of mouse models. Cell Rep. subgroups. Nature 486, 346–352 (2012). 12, 1–12 (2015). whether tumour heterogeneity supports 5. Cancer Genome Atlas Network. Comprehensive 32. Davis, R. E., Brown, K. D., Siebenlist, U. & the plastic coexistence of distinct cancer molecular portraits of human breast tumours. Nature Staudt, L. M. Constitutive nuclear factor kappaB 490, 61–70 (2012). activity is required for survival of activated B cell-like cell states in the same tumour, which 6. Oakman, C., Santarpia, L. & Di Leo, A. Breast cancer diffuse large B cell lymphoma cells. J. Exp. Med. 194, may require pharmacological targeting of assessment tools and optimizing adjuvant therapy. 1861–1874 (2001). Nat. Rev. Clin. Oncol. 7, 725–732 (2010). 33. Ngo, V. N. et al. Oncogenically active MYD88 mutations multiple distinct tumour checkpoint MRs, 7. Chen, J. C. et al. Identification of causal genetic in human lymphoma. Nature 470, 115–119 (2011). for example, using alternating treatment drivers of human disease through systems-level 34. Davis, R. E. et al. Chronic active B‑cell-receptor analysis of regulatory networks. Cell 159, 402–414 signalling in diffuse large B‑cell lymphoma. Nature schedules. For instance, in glioma, (2014). 463, 88–92 (2010). mesenchymal and proneural subtype cells 8. Compagno, M. et al. Mutations of multiple genes 35. Luo, J., Solimini, N. L. & Elledge, S. J. Principles of cause deregulation of NF‑ B in diffuse large B‑cell cancer therapy: oncogene and non-oncogene 107 κ coexist plastically in the same tumour , lymphoma. Nature 459, 717–721 (2009). addiction. Cell 136, 823–837 (2009). even though these states have been shown 9. Aytes, A. et al. Cross-species regulatory network 36. Wayne, D. W. Kolmogorov–Smirnov One-sample Test: analysis identifies a synergistic interaction between Applied Nonparametric Statistics 2nd edn 319–330 to be regulated by orthogonal tumour FOXM1 and CENPF that drives prostate cancer (PWS-Kent, 1990). checkpoints10. In addition, it is unclear how malignancy. Cancer Cell 25, 638–651 (2014). 37. Subramanian, A. et al. Gene set enrichment analysis: 10. Carro, M. S. et al. The transcriptional network for a knowledge-based approach for interpreting genome- many distinct tumour checkpoints may mesenchymal transformation of brain tumours. Nature wide expression profiles. Proc. Natl Acad. Sci. USA coexist even in the same tumour cell. For 463, 318–325 (2010). 102, 15545–15550 (2005). 11. Bisikirska, B. et al. Elucidation and pharmacological 38. Alvarez, M. J. et al. Functional characterization of instance, tumour checkpoints associated with targeting of novel molecular drivers of follicular somatic mutations in cancer using network-based proliferation and epithelial-to‑mesenchymal lymphoma progression. Cancer Res. 76, 664–674 inference of protein activity. Nat. Genet. 48, 838–847 (2016). (2016). transition are probably distinct, given the 12. Yang, J. et al. Twist, a master regulator of morphogenesis, 39. Lefebvre, C. et al. A human B‑cell interactome serial nature of the cancer initiation and plays an essential role in tumor metastasis. Cell 117, identifies MYB and FOXM1 as master regulators of 927–939 (2004). proliferation in germinal centers. Mol. Syst. Biol. 6, progression steps associated with their 13. Nambu, J. R., Lewis, J. O., Wharton, K. A. Jr & 377 (2010). activation. It is thus possible that a relatively Crews, S. T. The Drosophila single-minded gene 40. Castro, M. A. et al. Regulators of genetic risk of breast encodes a helix–loop–helix protein that acts as a cancer identified by integrative network analysis. Nat. small number of molecularly distinct tumour master regulator of CNS midline development. Cell Genet. 48, 12–21 (2016). checkpoints exist and that their activation in 67, 1157–1167 (1991). 41. Zhang, S. et al. Stroma-associated master regulators 14. Resnick, M. A., Tomso, D., Inga, A., Menendez, D. & of molecular subtypes predict patient prognosis in distinct tumour subtypes depends on distinct Bell, D. Functional diversity in the gene network ovarian cancer. Sci. Rep. 5, 16066 (2015). genetic alteration patterns. Combinatorial controlled by the master regulator p53 in humans. 42. Yepes, S., Torres, M. M. & Lopez-Kleine, L. Regulatory Cell Cycle 4, 1026–1029 (2005). network reconstruction reveals genes with prognostic superposition of two or more of these 15. Klapper, L. N., Kirschbaum, M. H., Sela, M. & value for chronic lymphocytic leukemia. BMC mechanisms in each individual tumour may Yarden, Y. Biochemical and clinical implications of the Genomics 16, 1002 (2015). ErbB/HER signaling network of growth factor 43. Remo, A. et al. Systems biology analysis reveals thus explain the large variety of reported receptors. Adv. Cancer Res. 77, 25–79 (2000). NFAT5 as a novel biomarker and master regulator of tumour subtypes and of associated clinical 16. Perou, C. M. et al. Molecular portraits of human inflammatory breast cancer. J. Transl Med. 13, 138 breast tumours. Nature 406, 747–752 (2000). (2015). phenotypes and drug sensitivities. 17. Dalla-Favera, R. et al. Human c‑myc oncogene is 44. Fletcher, M. N. et al. Master regulators of FGFR2 Ultimately, we propose that tumour located on the region of 8 that is signalling and breast cancer risk. Nat. Commun. 4, translocated in Burkitt lymphoma cells. Proc. Natl 2464 (2013). checkpoint and MRs may be leveraged to Acad. Sci. USA 79, 45. Tanaka, H. & Ogishima, S. Network biology approach simultaneously identify crucial tumour 7824–7827 (1982). to epithelial-mesenchymal transition in cancer 18. Klein, U. et al. Transcriptional analysis of the B cell metastasis: three stage theory. J. Mol. Cell Biol. 7, dependencies, as well as the specific drugs germinal center reaction. Proc. Natl Acad. Sci. USA 253–266 (2015). and drug combinations that are optimally 100, 2639–2644 (2003). 46. Piao, G. et al. A computational procedure for 19. Akavia, U. D. et al. An integrated approach to identifying master regulator candidates: a case study suited to inducing their collapse, on the uncover drivers of cancer. Cell 143, 1005–1017 on diabetes progression in Goto–Kakizaki rats. BMC basis of an individual sample and even an (2010). Syst. Biol. 6 (Suppl. 1), S2 (2012). 20. Gu, J. et al. Gene module based regulator inference 47. Liang, Y. et al. Transcriptional network analysis individual cell. This would have obvious identifying miR‑139 as a tumor suppressor in identifies BACH1 as a master regulator of breast application to complementing oncogene colorectal cancer. Mol. Biosyst. 10, 3249–3254 cancer bone metastasis. J. Biol. Chem. 287, (2014). 33533–33544 (2012). addiction and immune checkpoint 21. Ergun, A., Lawrence, C. A., Kohanski, M. A., 48. Califano, A., Butte, A. J., Friend, S., Ideker, T. & paradigms in precision medicine. Brennan, T. A. & Collins, J. J. A network biology Schadt, E. Leveraging models of cell regulation and approach to prostate cancer. Mol. Syst. Biol. 3, 82 GWAS data in integrative network-based association (2007). studies. Nat. Genet. 44, 841–847 (2012). Andrea Califano is at the Department of Systems 22. Cannon, W. B. Organization for physiological 49. Pe’er, D. & Hacohen, N. Principles and strategies for Biology, Columbia University, and the Departments of homeostasis. Physiol. Rev. 9, 399–431 (1929). developing network models in cancer. Cell 144, Biomedical Informatics, Biochemistry and Molecular 23. Waddington, C. H. Canalization of development and 864–873 (2011). Biophysics, JP Sulzberger Columbia Genome Center, genetic assimilation of acquired characters. Nature 50. Tovar, H., Garcia-Herrera, R., Espinal-Enriquez, J. & Herbert Irving Comprehensive Cancer Center, Columbia 183, 1654–1655 (1959). Hernandez-Lemus, E. Transcriptional master regulator 24. Waddington, C. H. Genetic assimilation. Adv. Genet. analysis in breast cancer genetic networks. Comput. University, New York, New York 10032, USA. 10, 257–293 (1961). Biol. Chem. 59B, 67–77 (2015). 25. Siegel, P. M. & Massague, J. Cytostatic and apoptotic 51. Sartor, I. T., Zeidan-Chulia, F., Albanus, R. D., Mariano J. Alvarez is at DarwinHealth, Inc., actions of TGF-β in homeostasis and cancer. Nat. Rev. Dalmolin, R. J. & Moreira, J. C. Computational analyses 3960 Broadway, Suite 540, New York, Cancer 3, 807–821 (2003). reveal a prognostic impact of TULP3 as a transcriptional New York 10032, USA. 26. Carter, S. B. Tissue homeostasis and the biological master regulator in pancreatic ductal adenocarcinoma. basis of cancer. Nature 220, 970–974 (1968). Mol. Biosyst. 10, 1461–1468 (2014). Correspondence to A.C. [email protected] 27. Young, S. R. et al. Establishment and serial passage of 52. Lim, W. K., Lyashenko, E. & Califano, A. Master doi:10.1038/nrc.2016.124 cell cultures derived from LuCaP xenografts. Prostate regulators used as breast cancer metastasis classifiers. 73, 1251–1262 (2013). Pac. Symp. Biocomput. 14, 504–515 (2009). Published online 15 Dec 2016 28. Barretina, J. et al. The Cancer Cell Line Encyclopedia 53. Ikiz, B. et al. Dissecting the regulatory machinery of enables predictive modelling of anticancer drug neurodegeneration in an in vitro model of amyotrophic 1. Hoadley, K. A. et al. Multiplatform analysis of 12 sensitivity. Nature 483, 603–607 (2012). lateral sclerosis. Cell Rep. 12, 335–345 (2015). cancer types reveals molecular classification within and 29. Rodriguez-Barrueco, R. et al. Inhibition of the 54. Aubry, S. et al. Assembly and interrogation of across tissues of origin. Cell 158, 929–944 (2014). autocrine IL‑6–JAK2–STAT3–calprotectin axis as Alzheimer’s disease genetic networks reveal novel 2. Shah, S. P. et al. The clonal and mutational evolution targeted therapy for HR–/HER2+ breast cancers. regulators of progression. PLoS ONE 10, e0120352 spectrum of primary triple-negative breast cancers. Genes Dev. 29, 1631–1648 (2015). (2015). Nature 486, 395–399 (2012). 30. Piovan, E. et al. Direct reversal of glucocorticoid 55. Brichta, L. et al. Identification of neurodegenerative 3. Verhaak, R. G. et al. Integrated genomic analysis resistance by AKT inhibition in acute lymphoblastic factors using translatome-regulatory network analysis. identifies clinically relevant subtypes of glioblastoma leukemia. Cancer Cell 24, 766–776 (2013). Nat. Neurosci. 18, 1325–1333 (2015).

14 | ADVANCE ONLINE PUBLICATION www.nature.com/nrc ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.

PERSPECTIVES

56. Repunte-Canonigo, V. et al. Identifying candidate 85. US National Library of Medicine. Clinicaltrials.gov 114. Needham, C. J., Bradford, J. R., Bulpitt, A. J. & drivers of alcohol dependence-induced excessive https://clinicaltrials.gov/ct2/show/NCT02066532 (2014). Westhead, D. R. Inference in Bayesian networks. Nat. drinking by assembly and interrogation of brain-specific 86. Tzoneva, G. et al. Activating mutations in the NT5C2 Biotechnol. 24, 51–53 (2006). regulatory networks. Genome Biol. 16, 68 (2015). nucleotidase gene drive chemotherapy resistance in 115. Yeung, M. K., Tegner, J. & Collins, J. J. Reverse 57. Kushwaha, R. et al. Interrogation of a context-specific relapsed ALL. Nat. Med. 19, 368–371 (2013). engineering gene networks using singular value transcription factor network identifies novel regulators 87. Woyach, J. A. et al. Resistance mechanisms for the decomposition and robust regression. Proc. Natl of pluripotency. Stem Cells 33, 367–377 (2015). Bruton’s tyrosine kinase inhibitor ibrutinib. N. Engl. Acad. Sci. USA 99, 6163–6168 (2002). 58. Ravasi, T. et al. An atlas of combinatorial J. Med. 370, 2286–2294 (2014). 116. Martinez, M. R. et al. Quantitative modeling of the transcriptional regulation in mouse and man. Cell 88. Flusberg, D. A. & Sorger, P. K. Modulating cell‑to‑cell terminal differentiation of B cells and mechanisms of 140, 744–752 (2010). variability and sensitivity to death ligands by lymphomagenesis. Proc. Natl Acad. Sci. USA 109, 59. Basso, K. et al. Reverse engineering of regulatory co‑drugging. Phys. Biol. 10, 035002 (2013). 2672–2677 (2012). networks in human B cells. Nat. Genet. 37, 382–390 89. Fenner, A. Prostate cancer: next-generation RNA 117. Polynikis, A., Hogan, S. J. & di Bernardo, M. (2005). sequencing identifies gene signature of Comparing different ODE modelling approaches for 60. ENCODE Project Consortium. Identification and neuroendocrine differentiation in prostate tumors. gene regulatory networks. J. Theor. Biol. 261, analysis of functional elements in 1% of the human Nat. Rev. Urol. 9, 8 (2012). 511–530 (2009). genome by the ENCODE pilot project. Nature 447, 90. Van Vlierberghe, P. & Ferrando, A. The molecular basis 118. Stuart, J. M., Segal, E., Koller, D. & Kim, S. K. A gene- 799–816 (2007). of T cell acute lymphoblastic leukemia. J. Clin. Invest. coexpression network for global discovery of conserved 61. Sumazin, P. et al. An extensive microRNA-mediated 122, 3398–3406 (2012). genetic modules. Science 302, 249–255 (2003). network of RNA–RNA interactions regulates 91. Wei, G. et al. Gene expression-based chemical 119. Horvath, S. & Dong, J. Geometric interpretation of established oncogenic pathways in glioblastoma. Cell genomics identifies rapamycin as a modulator of MCL1 gene coexpression network analysis. PLoS Comput. 147, 370–381 (2011). and glucocorticoid resistance. Cancer Cell 10, Biol. 4, e1000117 (2008). 62. Lewis, B. P., Shih, I. H., Jones-Rhoades, M. W., 331–342 (2006). 120. Jansen, R. et al. A Bayesian networks approach for Bartel, D. P. & Burge, C. B. Prediction of mammalian 92. Rubin, E. H. & Gilliland, D. G. Drug development and predicting protein-protein interactions from genomic microRNA targets. Cell 115, 787–798 (2003). clinical trials—the path to an approved cancer drug. data. Science 302, 449–453 (2003). 63. Friedlander, M. R. et al. Discovering from Nat. Rev. Clin. Oncol. 9, 215–222 (2012). 121. He, F., Balling, R. & Zeng, A. P. Reverse engineering and deep sequencing data using miRDeep. Nat. 93. Shelanski, M. et al. A systems approach to drug verification of gene networks: principles, assumptions, Biotechnol. 26, 407–415 (2008). discovery in Alzheimer’s disease. Neurotherapeutics and limitations of present methods and future 64. Rual, J. F. et al. Towards a proteome-scale map of the 12, 126–131 (2015). perspectives. J. Biotechnol. 144, 190–203 (2009). human protein-protein interaction network. Nature 94. Schadt, E. E. et al. An integrative genomics approach 122. Stolovitzky, G., Prill, R. J. & Califano, A. Lessons from 437, 1173–1178 (2005). to infer causal associations between gene expression the DREAM2 Challenges. Ann. NY Acad. Sci. 1158, 65. Bandyopadhyay, S. et al. A human MAP kinase and disease. Nat. Genet. 37, 710–717 (2005). 159–195 (2009). interactome. Nat. Methods 7, 801–805 (2010). 95. Hofree, M., Shen, J. P., Carter, H., Gross, A. & 123. Marbach, D. et al. Revealing strengths and 66. Wang, K. et al. Genome-wide identification of post- Ideker, T. Network-based stratification of tumor weaknesses of methods for gene network inference. translational modulators of transcription factor activity in mutations. Nat. Methods 10, 1108–1115 (2013). Proc. Natl Acad. Sci. USA 107, 6286–6291 (2010). human B cells. Nat. Biotechnol. 27, 829–839 (2009). 96. Alvarez, M. J., Chen, J. C. & Califano, A. DIGGIT: a 124. Bussemaker, H. J., Li, H. & Siggia, E. D. Regulatory 67. Zhang, Q. C. et al. Structure-based prediction of Bioconductor package to infer genetic variants driving element detection using correlation with expression. protein-protein interactions on a genome-wide scale. cellular phenotypes. Bioinformatics 31, 4032–4034 Nat. Genet. 27, 167–171 (2001). Nature 490, 556–560 (2012). (2015). 125. Fiedler, D. et al. Functional organization of the 68. AlQuraishi, M., Koytiger, G., Jenney, A., MacBeath, G. 97. Weinstein, I. B. Cancer. Addiction to oncogenes—the S. cerevisiae phosphorylation network. Cell 136, & Sorger, P. K. A multiscale statistical mechanical Achilles heal of cancer. Science 297, 63–64 (2002). 952–963 (2009). framework integrates biophysical and genomic data to 98. Druker, B. J. et al. Effects of a selective inhibitor of the 126. Mani, K. M. et al. A systems biology approach to assemble cancer networks. Nat. Genet. 46, 1363– Abl tyrosine kinase on the growth of Bcr-Abl positive prediction of oncogenes and molecular perturbation 1367 (2014). cells. Nat. Med. 2, 561–566 (1996). targets in B‑cell lymphomas. Mol. Syst. Biol. 4, 169 69. Faith, J. J. et al. Large-scale mapping and validation of 99. Wang, X., Haswell, J. R. & Roberts, C. W. Molecular (2008). Escherichia coli transcriptional regulation from a pathways: SWI/SNF (BAF) complexes are frequently 127. Ideker, T. & Krogan, N. J. Differential network biology. compendium of expression profiles. PLoS Biol. 5, e8 mutated in cancer—mechanisms and potential Mol. Syst. Biol. 8, 565 (2012). (2007). therapeutic insights. Clin. Cancer Res. 20, 21–27 (2014). 128. Barrios-Rodiles, M. et al. High-throughput mapping of 70. Friedman, N. Inferring cellular networks using 100. Vazquez, M., Valencia, A. & Pons, T. Structure-PPi: a a dynamic signaling network in mammalian cells. probabilistic graphical models. Science 303, 799–805 module for the annotation of cancer-related single- Science 307, 1621–1625 (2005). (2004). nucleotide variants at protein-protein interfaces. 129. Saez-Rodriguez, J. et al. Comparing signaling 71. Kundaje, A. et al. Learning regulatory programs that Bioinformatics 31, 2397–2399 (2015). networks between normal and transformed accurately predict differential expression with 101. Morris, J. P. t., Wang, S. C. & Hebrok, M. KRAS, hepatocytes using discrete logical models. Cancer MEDUSA. Ann. NY Acad. Sci. 1115, 178–202 (2007). Hedgehog, Wnt and the twisted developmental Res. 71, 5400–5411 (2011). 72. Kramer, A., Green, J., Pollard, J. Jr & Tugendreich, S. biology of pancreatic ductal adenocarcinoma. Nat. 130. Menashe, I. et al. Pathway analysis of breast cancer Causal analysis approaches in Ingenuity Pathway Rev. Cancer 10, 683–695 (2010). genome-wide association study highlights three Analysis. Bioinformatics 30, 523–530 (2014). 102. Dhomen, N. & Marais, R. BRAF signaling and targeted pathways and one canonical signaling cascade. Cancer 73. Maher, B. ENCODE: the human encyclopaedia. Nature therapies in melanoma. Hematol. Oncol. Clin. North Res. 70, 4453–4459 (2010). 489, 46–48 (2012). Am. 23, 529–545 (2009). 131. Laoukili, J. et al. FoxM1 is required for execution of 74. Palomero, T. et al. NOTCH1 directly regulates c‑MYC 103. Davoli, A., Hocevar, B. A. & Brown, T. L. Progression the mitotic programme and chromosome stability. and activates a feed-forward-loop transcriptional and treatment of HER2‑positive breast cancer. Cancer Nature Cell Biol. 7, 126–136 (2005). network promoting leukemic cell growth. Proc. Natl Chemother. Pharmacol. 65, 611–623 (2010). 132. Basu, A. et al. An interactive resource to identify Acad. Sci. USA 103, 18261–18266 (2006). 104. Bansal, M. et al. A community computational cancer genetic and lineage dependencies targeted by 75. Affara, M. et al. Vasohibin‑1 is identified as a master- challenge to predict the activity of pairs of compounds. small molecules. Cell 154, 1151–1161 (2013). regulator of endothelial cell apoptosis using gene Nat. Biotechnol. 32, 1213–1222 (2014). 133. Flusberg, D. A., Roux, J., Spencer, S. L. & Sorger, P. K. network analysis. BMC Genomics 14, 23 (2013). 105. Woo, J. H. et al. Elucidating compound mechanism of Cells surviving fractional killing by TRAIL exhibit 76. Werner, H. M., Mills, G. B. & Ram, P. T. Cancer systems action by network dysregulation analysis in pertubed transient but sustainable resistance and inflammatory biology: a peek into the future of patient care? Nat. cells. Cell 162, 441–451 (2015). phenotypes. Mol. Biol. Cell 24, 2186–2200 (2013). Rev. Clin. Oncol. 11, 167–176 (2014). 106. Tallarida, R. J. An overview of drug combination 134. Roschewski, M., Staudt, L. M. & Wilson, W. H. Diffuse 77. Saito, M. et al. BCL6 suppression of BCL2 via Miz1 analysis with isobolograms. J. Pharmacol. Exp. Ther. large B‑cell lymphoma-treatment approaches in the and its disruption in diffuse large B cell lymphoma. 319, 1–7 (2006). molecular era. Nat. Rev. Clin. Oncol. 11, 12–23 (2014). Proc. Natl Acad. Sci. USA 106, 11294–11299 (2009). 107. Patel, A. P. et al. Single-cell RNA-seq highlights 78. Luscombe, N. M. et al. Genomic analysis of regulatory intratumoral heterogeneity in primary glioblastoma. Acknowledgements network dynamics reveals large topological changes. Science 344, 1396–1401 (2014). This work was supported by the National Cancer Institute Nature 431, 308–312 (2004). 108. Park, H. S. & Jun, C. H. A simple and fast algorithm for (NCI) Cancer Target Discovery and Development Program 79. Phillips, H. S. et al. Molecular subclasses of high-grade K‑medoids clustering. Expert Systems With Appl. 36, (1U01CA168426), NCI Outstanding Investigator Award glioma predict prognosis, delineate a pattern of 3336–3341 (2009). (R35CA197745) for A.C., NCI Research Centers for Cancer disease progression, and resemble stages in 109. Lefebvre, C., Rieckhof, G. & Califano, A. Reverse- Systems Biology Consortium (1U54CA209997) and the NIH neurogenesis. Cancer Cell 9, 157–173 (2006). engineering human regulatory networks. Wiley instrumentation grants (S10OD012351 and S10OD021764). 80. Brennan, C. W. et al. The somatic genomic landscape Interdiscip. Rev. Syst. Biol. Med. 4, 311–325 (2012). of glioblastoma. Cell 155, 462–477 (2013). 110. Linding, R. et al. Systematic discovery of in vivo Competing interests statement 81. Wapinski, O. L. et al. Hierarchical mechanisms for phosphorylation networks. Cell 129, 1415–1426 The authors declare competing interests: see Web version for direct reprogramming of fibroblasts to neurons. Cell (2007). details. 155, 621–635 (2013). 111. Srivas, R. et al. Assembling global maps of cellular 82. Hanahan, D. & Weinberg, R. A. Hallmarks of cancer: function through integrative analysis of physical and DATABASES the next generation. Cell 144, 646–674 (2011). genetic networks. Nat. Protoc. 6, 1308–1323 (2011). CCLE: http://software.broadinstitute.org/software/ 83. Chudnovsky, Y. et al. ZFHX4 interacts with the NuRD 112. Rhodes, D. R. et al. Mining for regulatory programs in cprg/?q = node/11 core member CHD4 and regulates the glioblastoma the cancer . Nat. Genet. 37, 579–583 tumor-initiating cell state. Cell Rep. 6, 313–324 (2005). FURTHER INFORMATION (2014). 113. Lachmann, A. et al. ChEA: transcription factor regulation The Cancer Genome Atlas: http://cancergenome.nih.gov 84. Taylor, B. S. et al. Integrative genomic profiling of inferred from integrating genome-wide ChIP‑X ALL LINKS ARE ACTIVE IN THE ONLINE PDF human prostate cancer. Cancer Cell 18, 11–22 (2010). experiments. Bioinformatics 26, 2438–2444 (2010).

NATURE REVIEWS | CANCER ADVANCE ONLINE PUBLICATION | 15 ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved. ©2016 Mac millan Publishers Li mited, part of Spri nger Nature. All ri ghts reserved.