
ARTICLE

https://doi.org/10.1038/s41467-019-09695-9 OPEN

Metabolic profiling of cells reveals genome-wide crosstalk between transcriptional regulators and metabolism

Karin Ortmayr1, Sébastien Dubuis1 & Mattia Zampieri1

Transcriptional reprogramming of cellular metabolism is a hallmark of cancer. However, systematic approaches to study the role of transcriptional regulators (TRs) in mediating cancer metabolic rewiring are missing. Here, we chart a genome-scale map of TR-metabolite associations in human cells using a combined computational-experimental framework for large-scale metabolic profiling of adherent cell lines. By integrating intracellular metabolic profiles of 54 cancer cell lines with transcriptomic and proteomic data, we unraveled a large space of associations between TRs and metabolic pathways. We found a global regulatory signature coordinating glucose- and one-carbon metabolism, suggesting that regulation of carbon metabolism in cancer may be more diverse and flexible than previously appreciated. Here, we demonstrate how this TR-metabolite map can serve as a resource to predict TRs potentially responsible for metabolic transformation in patient-derived tumor samples, opening new opportunities in understanding disease etiology, selecting therapeutic treatments and in designing modulators of cancer-related TRs.

1 Institute of Molecular Systems Biology, ETH Zurich, Otto-Stern-Weg 3, CH-8093 Zurich, Switzerland. Correspondence and requests for materials should be addressed to M.Z. (email: [email protected])

NATURE COMMUNICATIONS | (2019) 10:1841 | https://doi.org/10.1038/s41467-019-09695-9 | www.nature.com/naturecommunications 1 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-09695-9

Transcriptional regulators (TRs) are at the interface between the cell's ability to sense and respond to external stimuli or changes in internal cell-state1. In cancer as well as other human diseases2, alterations in the activity of TRs, such as transcription factors, chromatin modifiers or co-regulators, can remodel the cellular signaling landscape and trigger metabolic reprogramming3 to meet the requirements for fast cell proliferation and cell transformation4,5. However, evidence linking alterations of cancer metabolism to TR dysfunction is often based on molecular profiling technologies, like transcriptomics and chromatin modification profiling6 or the identification of TR-binding sites upstream of metabolic enzymes3, that do not report on the functional consequences of detected interactions. Mass spectrometry-based metabolomics approaches are powerful tools for the direct profiling of cell metabolism and to uncover mechanisms of transcriptional (in)activation of metabolic pathways7,8. Nevertheless, because of limitations imposed by commonly used workflows, such as coverage, scalability, and comparability between molecular profiles of largely diverse cell types, simultaneously quantifying the activity of TRs and metabolic pathways at genome- and large-scale remains a major challenge. Here, we develop a unique experimental workflow for the parallel profiling of the relative abundance of more than 2000 putatively annotated metabolites in morphologically diverse adherent mammalian cells. This approach overcomes several of the major limitations in generating large-scale comparative metabolic profiles across cell lines from different tissue types or in different conditions, and was applied here to profile 54 adherent cell lines from the NCI-60 panel9.

To understand the interplay between transcriptional regulation and emerging metabolic phenotypes, we generated a genome-scale map of TR-metabolite associations. To this end, we implemented a robust and scalable computational framework that integrates metabolomics profiles with previously published transcriptomics10 and proteomics11 datasets to resolve the flow of signaling information across multiple regulatory layers in the cell. This computational framework enables (i) systematically exploring the regulation of metabolic pathways, (ii) reverse-engineering TR activity from in vivo metabolome profiles and (iii) predicting post-translational regulatory interactions between metabolites and TRs. Beyond contributing to the understanding of genome-wide associations between changes in TR activities and rewiring of metabolism in cancer, we demonstrate how genome-scale TR-metabolite associations can introduce a new paradigm in the analysis of patient-derived metabolic profiles and the development of alternative therapeutic strategies to counteract upstream reprogramming of cellular metabolism.

Results

Large-scale metabolic profiling of cancer cells. Tumor cells, in spite of similar genetic background or tissue of origin, can exhibit profoundly diverse transcriptional and metabolic phenotypes12,13. Exploiting the naturally occurring variability across a large set of diverse cancer cell lines could reveal the interplay between aberrant tumor metabolism and gene regulation. However, in spite of significant advancements in the rapid generation of high-resolution spectral profiles of cellular samples14,15, the accurate comparative profiling of intracellular metabolites across heterogeneous cell line panels is still a major challenge. There are two major bottlenecks for in vitro large-scale cell metabolic profiling: (i) limited throughput of classical techniques16 featuring large cultivation formats, time-consuming and laborious extraction/measuring protocols, and (ii) lack of accurate and rapid normalization procedures to make metabolic profiles of morphologically diverse cell types comparable. This last step is often implemented by the additional quantification of total protein abundance and is based on the assumption that protein content scales with cell volumes, even across largely diverse cell types or conditions17 (Supplementary Figs 1, 2). To overcome these limitations, we present an innovative and robust workflow enabling large-scale metabolic profiling in adherent mammalian cells alongside a scalable computational framework to normalize and compare molecular signatures across cell types with large differences in morphology and size (Supplementary Figs 1, 2). In contrast to classical metabolomics techniques16, we use a 96-well plate cultivation format, rapid in situ metabolite extraction, automated time-lapse microscopy and flow-injection time-of-flight mass spectrometry14 (FIA-TOFMS) for high-throughput profiling of cell extract samples (Fig. 1a).

Specifically, we optimized each step, from cultivation and extraction to MS analysis, to be compatible with parallel 96-well processing. Different cell lines are seeded in triplicate at low cell density in 96-well microtiter plates, and are grown to confluence within 5 days (37 °C, 5% CO2). Growth is continuously monitored by automated acquisition of bright-field microscopy images, and replicate 96-well plates are sampled for metabolome analysis every 24 h. To facilitate sampling, increase the throughput and reduce the risk of sample processing artifacts, we collect metabolomics samples directly in the 96-well cultivation plate without prior cell detachment (Supplementary Fig. 1). Finally, cell extracts are analyzed by FIA-TOFMS14, allowing rapid full-spectral acquisition within less than one minute per sample.

Normalization of large-scale non-targeted metabolomics profiles is a fundamental aspect of data analysis, which becomes particularly challenging when comparing mammalian cell lines with large differences in cell size. Our approach consists of two main steps. First, we quantified relative metabolite abundance per cell using a multiple linear regression scheme. To this end, we quantified cell numbers in each sample directly from automated analysis of bright-field microscopy images (see Supplementary Note and Supplementary Fig. 1), and for each cell line, related extracted cell number to ion intensity (Fig. 1b). By combining MS profiles throughout cell growth (Supplementary Fig. 1), we decoupled cell line-specific metabolic signatures from differences in extracted cell numbers. This procedure enables selecting only annotated ions exhibiting a linear dependency between measured intensities and extracted cell number (Fig. 1b), that are hence amenable to accurate relative quantification. Moreover, by integrating MS readouts at multiple cell densities and time points, the resulting estimates of relative metabolite abundances are invariant to cultivation time and cell densities, enabling the direct comparison between different cell lines and conditions18. In the second step, a subset of metabolites identified to report on cell volume, mostly intermediates of fatty acid metabolism, were used to correct for differences in total volume of sampled cultures (see Methods section and Supplementary Fig. 2).

Linking expression profiles to metabolic diversity. Here, we used our framework to profile the intracellular metabolomes of 54 adherent cell lines from eight different tissue types in the NCI-60 cancer cell line panel. For 2181 putatively annotated ions exhibiting a significant linear dependency between extracted cell number and ion intensities (linear regression p-value ≤ 3.4e−7, Bonferroni-adjusted threshold), we report Z-score normalized relative abundances fitted over approximately 15 individual measurements per cell line (Fig. 1b and Supplementary Data 1). This new data set illustrates the technological advances of our methodology over previous metabolomics approaches and similar comparative resources (Supplementary Fig. 3), in that it enables


higher throughput, sensitivity, coverage and systematic removal of biases associated with differences in cell size.

Fig. 1 Comparative metabolome profiling in 54 adherent cancer cell lines. a Schematic overview of the combined workflow for high-throughput metabolome profiling in adherent cell lines. Multiple cell lines are cultivated in parallel to collect cell extracts for MS-based metabolome profiling (Supplementary Fig. 1). A new software tool for the segmentation of bright-field microscopy images (Supplementary Note) allows automated cell number quantification. b Raw MS data for adenosine triphosphate (ATP) in three different cell lines before cell volume correction. Slopes (α) and standard errors from the linear fitting of ion intensity and cell number are reported, together with the p-value significance. Background intensities were obtained from measurements of cell-free extraction solvent. c Pairwise similarity (Spearman correlation) of metabolome profiles among 53 cell lines from the NCI-60 panel. Circle size on the diagonal is proportional to cell line doubling times (f). d Examples for metabolites exhibiting a significantly (ANOVA q-value ≤ 0.05, corrected for multiple tests) tissue-dependent abundance (Supplementary Fig. 4, Supplementary Data 1). Boxplots of Z-score normalized metabolite levels are grouped by the cell line tissue of origin. Box edges correspond to 25th and 75th percentiles, whiskers include extreme data points, and outliers are shown as red plus signs. e Signatures of tissue type in intracellular metabolome profiles compared to transcriptome10, proteome11, drug sensitivity31, growth rate and the uptake- and secretion rates of 140 metabolites (CORE profiles13). Receiver operating characteristic curve analysis quantifies the likelihood of cell lines from the same tissue type to feature similar molecular profiles. f Cell line doubling times (mean ± standard deviation), grouped by tissue. Doubling times were determined from continuous monitoring of cell confluence in 96-well plates. g Correlations between metabolite levels and gene expression in relation to their distance in the stoichiometric network of human metabolism21. The black line represents the average pairwise enzyme-metabolite distance (y-axis) at different levels of correlation between metabolite abundance and enzyme gene expression (x-axis). The blue line represents the average enzyme-metabolite distance in 10,000 randomized networks. Dot color reflects significance (q-value from permutation test) of enzyme-metabolite proximity in the stoichiometric network as compared to the randomized networks

Pairwise similarity analysis of cell line metabolome profiles (Fig. 1c) reveals widespread heterogeneity in the metabolome of cell lines even from the same tissue type. We found only 70 metabolites whose intracellular levels exhibited a significant dependency (ANOVA, q-value ≤ 0.05) on cell-line tissue (Fig. 1d, Supplementary Fig. 4, Supplementary Data 1). In a few cases, these patterns highlighted expected tissue-specific functions, such as for elevated levels of a derivative of vitamin D3 in melanoma cells (Fig. 1d), while other metabolites were ubiquitously present among cell lines but at different levels depending on the tissue of origin. While previously published transcriptome10 and proteome11 profiles for 53 of the herein profiled cell lines revealed a stronger molecular signature of the tissue of origin (Fig. 1e), heterogeneity in metabolome profiles across cell lines from the same tissue type is consistent with large differences in doubling times (Fig. 1f, Supplementary Fig. 4, Supplementary Data 1), previous data on exchange rates of nutrients and metabolic byproducts13,19 (Fig. 1e) and metabolic differences across mouse tissues20 (Supplementary Fig. 3).

How can such large phenotypic and metabolic diversity emerge under the same environmental condition? We hypothesized that differential regulation of gene expression could subserve metabolic heterogeneity. To test our hypothesis, we correlated mRNA levels of metabolic genes with metabolite abundances, and related enzyme-metabolite correlation to their proximity in the metabolic network. We used a genome-scale stoichiometric model of human metabolism21 to derive the distance between each enzyme-metabolite pair as the minimum number of

reactions separating the two in the network. We found that enzyme gene expression tends to more strongly correlate with levels of proximal metabolites (Fig. 1g), reflecting a direct dependency between enzyme- and related metabolite levels22,23. By expanding the search for correlation between gene expression- and metabolite levels beyond enzyme-encoding genes, we observed that the two principal components of metabolic variance explaining 38% of total variance (Supplementary Fig. 4) most strongly correlated (Spearman |R| > 0.37) with transcripts in signal transduction pathways regulating cell proliferation, adaptation, cell adhesion and migration (e.g., HIF-1, PI3K-Akt, AMPK, Supplementary Fig. 4). Altogether, these results suggest that phenotypic heterogeneity observed in vitro can emerge as a result of different transcriptional regulatory programs, and that metabolite abundances can be used as intermediate functional readouts linking gene expression profiles to different strategies in allocating metabolic resources for energy generation and growth.

Systematic inference of TR-metabolite associations. To study the flow of signaling information between transcriptome and metabolome, we sought to quantify the functional interplay between metabolic phenotypes and different transcriptional programs mediated by the activity of transcriptional regulators (TRs). Here, we use the term TR formally for any regulator capable of modulating gene expression, including transcription factors, chromatin remodelers, and co-regulators, i.e., adopting the inclusion criteria from a curated repository of gene regulatory network links24 (TRRUST database). TRs can directly regulate metabolic fluxes by modulating enzyme abundance, i.e., changing maximum flux capacity, or by indirectly affecting substrate availability of proximal metabolic reactions, which can in turn result in local changes of fluxes25,26. The activated form of a transcriptional regulator, rather than its expression/protein level, regulates gene promoters, and a TR's activity is imprinted in the expression levels of its target genes. Hence, because TR activity is governed by complex post-transcriptional and post-translational mechanisms, monitoring TR gene expression or protein levels is an inadequate proxy of their activity27,28 (Supplementary Fig. 5). Because methodologies to directly measure TR activity, such as GFP or luciferase constructs, are limited in scalability, we used network component analysis (NCA)29 to derive a relative estimate of TR activity for each cell line directly from the combined expression levels of TR-gene targets. By quantifying TR activities from transcriptomic profiles and a network model of transcriptional regulation, NCA enables simultaneously comparing the activity of TRs across different cell lines on a genome-scale level.

Hence, by integrating previously published transcript abundance data10 for 53 cell lines with a genome-scale network of literature-curated TR-target gene interactions (TRRUST database24), we derived a relative estimate of TR activity for 728 transcriptional regulators (Fig. 2a, Supplementary Fig. 5). To account for potentially confounding effects of different growth rates and the incompleteness of the TR-regulatory network, we introduced an artificial TR that, by virtually targeting all genes, simulates pleiotropic effects of growth on gene expression, and used a bootstrapping approach to sample different combinations of TR regulatory interactions. Within an unknown scaling factor, TR activities are hence robustly derived as the median across more than 400 estimates per TR (Supplementary Fig. 5, Supplementary Data 2).

To generate an empirical network of associations between TRs and metabolites, we systematically correlated the activity of 728 TRs with relative levels of individual metabolites across cell lines (Fig. 2a, Supplementary Data 3, Supplementary Fig. 5). First, we investigated whether metabolites correlating with TR activity locate in the proximity of TR-enzyme targets. To this end, we estimated the distance of each TR to metabolites by taking the minimum distance between known24 TR enzyme-targets and metabolites in the metabolic network. We found that metabolites exhibiting the strongest correlations with TR activity (|Spearman correlation| > 0.45) are in the significant (q-value ≤ 0.05, permutation test) vicinity of TR-enzyme targets (Supplementary Fig. 5). Such local dependencies support our hypothesis that metabolite levels can be used as an intermediate readout to study the functional interplay between regulation of gene expression and cell metabolism.

To test the extent to which our conclusions hold true beyond the specific TR-regulatory network and cell lines tested here, we expanded the TR-gene target network to include any enzyme whose transcript levels correlate with TR activity, i.e., enzymes directly or indirectly regulated by a TR. To that end, we applied NCA to resolve TR activity across an independent large compendium of transcriptome data, monitoring gene expression in a panel of 1037 human cancer cell lines30 (Cancer Cell Line Encyclopedia), and identified enzyme levels that correlate with individual TR activity profiles (|Spearman correlation| > 0.5, Supplementary Data 6). We call the resulting association network an augmented TR-gene network. By repeating the analysis of TR-metabolite distance using the augmented TR-gene network, we found an even stronger vicinity between TRs and correlating metabolites (Fig. 2b). Hence, the analysis of this independent panel of cancer cell line expression data not only reinforces our previous results, but it also suggests that the TR-metabolite association network derived from the NCI-60 tumor cell lines can be generalized to a largely diverse panel of cell lines. Our results demonstrate that while models of TR-target genes are far from complete, even relatively few known gene targets can be sufficient to estimate TR activities. Remarkably, while in human cells most of the known TR binding sites map to genes in signaling- and disease-related pathways, our TR-metabolite network unraveled a complementary large space of associations involving intermediates in central metabolic pathways (Fig. 2c).

Mapping TR activity to metabolic phenotypes. Here, we asked whether coordinated changes in TR activity and metabolite abundances indirectly inform on changes in proximal metabolic fluxes12 (see Supplementary Fig. 6 and Supplementary Discussion). As a proof of concept, we measured rates of glucose uptake and lactate secretion in each cell line as a proxy for glycolytic flux. As previously observed13,22, glucose consumption strongly correlated with lactate secretion, and on average approximately 70% of the incoming glucose carbon is secreted as lactate (Fig. 2d and Supplementary Fig. 3). Next, we related intracellular metabolite abundances to measured fluxes by estimating the average correlation with glucose and lactate exchange rates (Fig. 2e). Indeed, metabolites that strongly correlate with glycolytic flux (Spearman R > 0.5) were enriched for intermediates of proximal glycolytic pathways, oxidative phosphorylation and HIF-1 signaling pathway (Fig. 2f), including metabolites such as glyceraldehyde 3-phosphate, acetyl-CoA and ATP (Fig. 2e). Surprisingly, metabolites that anti-correlated with glucose uptake, i.e., that are increased with low glycolytic flux, were significantly (hypergeometric test, q-value ≤ 0.001) enriched for one-carbon metabolism (Fig. 2f), hinting at a direct functional dependency between one-carbon metabolism and glycolytic flux. To test this prediction, we correlated glucose consumption and lactate secretion with the susceptibility to 430 drugs with known mechanisms of action31. Consistent with the hypothesized functional dependency between glucose and one-carbon metabolism,


we found that among all drug classes, the strongest correlation with glycolytic flux was observed with cell line sensitivities to antifolate drugs and inhibitors of dihydrofolate reductase (DHFR), i.e., cell lines with lower glucose uptake are more sensitive to inhibitors of folate biosynthesis (Fig. 2g).

Fig. 2 Inferring a TR-metabolite association network by integrating transcriptome and metabolome data. a Schematic overview of the computational framework for finding TR-metabolite associations. Activity of 728 human TRs was calculated using Network Component Analysis29 on expression profiles of the NCI-60 cell lines10 and the TRRUST transcriptional regulatory network24. Correlation analysis was used to find associations between TRs and metabolites. b Correlations between metabolite levels and TR activity profiles in relation to the distance between TRs and enzyme targets in the metabolic network21. The black line represents average TR-metabolite distance (y-axis) at different levels of absolute TR-metabolite correlation (x-axis). The blue line represents average TR-metabolite distance in 10,000 randomized networks. Dot color indicates significance (q-value, corrected for multiple tests) of TR-metabolite proximity compared to the randomized networks. c Distribution of TR-metabolite associations (0.1% FDR, left section) and TR-gene regulatory interactions (right section) across KEGG pathways. Edge-size connecting the TR hub and metabolic pathways reflects the number of links found in either network. d Glucose uptake vs lactate secretion. Mean ± standard deviation of rates estimated over three biological replicates and five time points. e, f Metabolites whose abundance correlated with glycolytic flux (i.e., glucose uptake and lactate secretion, e) were tested for significant (q-value, corrected for multiple tests) enrichment in KEGG metabolic pathways (f), separately for positively and negatively correlated metabolites (yellow to red and purple to blue color ranges, respectively). g Boxplot of the correlation between glucose uptake/lactate secretion and sensitivity (i.e., GI50, drug concentration causing 50% growth rate reduction) to 430 drugs31 grouped by mechanism of action. Box edges reflect 25th and 75th percentiles, center dots indicate medians, whiskers include extreme data points, and outliers are gray circles. h Association between TR activity and metabolites reporting on glycolytic flux. TR names are shown for the 10 TRs with the highest score. i For each KEGG pathway we calculated the average association score (h) among TRs with known enzyme targets in the pathway (x-axis). Significance (y-axis) is estimated using a permutation test and corrected for multiple tests (q-value). Only pathways with a q-value ≤ 0.01 are labeled

Next, we sought to use our TR-metabolite network to find TRs associated with glycolytic flux that potentially coordinate glucose and one-carbon metabolism. To this end, we searched for TRs that correlate with metabolites reporting on glycolytic flux by calculating the sum of the dot product between TR-metabolite- and metabolite-glycolytic flux correlation vectors, normalized by the absolute sum of TR-metabolite correlation coefficients (Fig. 2h). Notably, TRs with the highest scores were enriched (permutation test, q-value ≤ 0.05) for enzyme targets in ubiquinone biosynthesis, insulin signaling, TCA cycle and glycolysis/gluconeogenesis (Fig. 2i). Several of the top 10 predicted TRs (Fig. 2h) are associated with the regulation of key steps in glycolysis, such as the regulatory factor X3 (RFX3) regulating the glucokinase gene32, the nuclear factor erythroid 2-related factor 1 (NFE2L1) and its interacting partner MAFG, involved in the regulation of oxidative stress response and diverse glycolytic genes33. Interestingly, we also found TRs important for the survival and proliferation of cancer cells under nutrient limitation or stress, such as the activating transcription factor 5 (ATF5)34 and the SNF2-related CBP activator protein (SRCAP), a direct regulator of phosphoenolpyruvate carboxykinase 2 (PCK2)35, a gluconeogenic enzyme essential to maintain cell proliferation under limited glucose conditions in cancer cells36.
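The TR score used to rank regulators by their association with glycolytic flux can be illustrated with a short sketch. It assumes one plausible reading of the description above: for each TR, the dot product of its TR-metabolite correlation vector with the metabolite-flux correlation vector, divided by the sum of the absolute TR-metabolite correlation coefficients. The function name `tr_flux_scores`, the array layout and the toy values are hypothetical and are not taken from the study.

```python
import numpy as np

def tr_flux_scores(tr_met_corr, met_flux_corr):
    """Score each TR by how well its metabolite correlations align with flux.

    tr_met_corr:   (n_TRs, n_metabolites) TR-metabolite correlation matrix.
    met_flux_corr: (n_metabolites,) metabolite-flux correlation vector.
    Assumed formula: sum_m C[t, m] * f[m] / sum_m |C[t, m]|.
    """
    tr_met_corr = np.asarray(tr_met_corr, dtype=float)
    met_flux_corr = np.asarray(met_flux_corr, dtype=float)
    # Dot product of each TR's correlation vector with the flux vector.
    numerator = tr_met_corr @ met_flux_corr
    # Normalize by the absolute sum of the TR's correlation coefficients.
    denominator = np.abs(tr_met_corr).sum(axis=1)
    # TRs without any non-zero correlation receive a score of zero.
    return np.divide(numerator, denominator,
                     out=np.zeros_like(numerator), where=denominator > 0)

# Toy data (hypothetical values): three TRs, two metabolites.
toy_corr = np.array([[0.5, -0.5], [0.2, 0.2], [0.0, 0.0]])
toy_flux = np.array([0.8, -0.4])
toy_scores = tr_flux_scores(toy_corr, toy_flux)
```

With this normalization, a TR whose metabolite correlations mirror the sign pattern of the flux correlations scores high regardless of how many metabolites it correlates with, which is presumably why the score is divided by the absolute sum rather than left as a raw dot product.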


Consistent with our previous observation of a dependency between glycolytic flux and intermediates in one-carbon metabolism, SRCAP activity levels across the 1037 CCLE cancer cell lines strongly correlate (R ≥ 0.5) with the expression of three genes (Supplementary Data 2): MTHFD2, a mitochondrial enzyme in folate metabolism, asparagine synthetase (ASNS) and ATF4, an upstream regulator of serine biosynthesis and mitochondrial folate enzyme transcription, involved in the response to amino acid starvation (Supplementary Fig. 6). It is tempting to speculate that cancer cell line diversity in glucose uptake and lactate secretion observed in vitro potentially reflects regulatory programs acquired during an earlier adaptation to in vivo nutrient availability and stresses. Altogether, these findings independently support the functional relevance of our TR-metabolite association network in resolving the interplay between transcriptional regulation and metabolic phenotype. Notably, the inferred associations between metabolic intermediates and TRs are complementary to the information derived from TR-gene regulatory networks. While the latter report on the regulatory capability of TRs to change enzyme abundance, coordinated changes between TR activity and metabolite levels can reveal the functional implications of these regulatory events.

Predicting functional roles of TRs in metabolism. Because correlation does not imply causation, we cannot resolve whether changes in metabolite levels are responsible for changes in TR activity or vice-versa. However, often only a few metabolic intermediates of metabolic pathways, typically the end product37, can allosterically regulate TRs. Hence, multiple proximal metabolic intermediates correlating with an individual TR's activity can reveal the functional impact of TRs on overall pathway activity. By analyzing TR-metabolite associations at the pathway level, we can hence predict the functional roles of individual TRs in potentially regulating distinct metabolic pathways. For 677 TRs, we discovered a significant enrichment (hypergeometric test, q-value ≤ 0.05) of TR-associated metabolites in at least one KEGG pathway (Fig. 3a, Supplementary Fig. 6, Supplementary Data 4), including 145 cancer-related TRs38 (as defined by the COSMIC cancer gene census). In this analysis, metabolic pathways with the highest number of associated TRs were arachidonic and fatty acid metabolism, followed by arginine and proline metabolism and the degradation of branched-chain amino acids (Supplementary Fig. 6). These predicted interactions potentially reflect a large space of yet unexplored regulatory interactions that can mediate the adaptation to varied micro-environmental conditions and fast proliferation under diverse nutrient availability (see Supplementary Discussion).

Even in cases where the roles of TR-target genes have been extensively characterized, the herein-proposed TR-metabolite associations can help refine the condition-specific functional role of TRs in metabolism. For example, hypoxia-inducible factor 1 alpha (HIF-1A)39 is reported to act on regulatory elements upstream of nearly 50 enzymes in several central metabolic pathways (Fig. 3a). However, among KEGG pathways with known HIF-1A target genes, we found a significant enrichment of metabolic intermediates almost confined to the TCA cycle (hypergeometric test, q-value < 0.05), suggesting that changes in HIF-1A activity alone are sufficient to affect the TCA cycle. In order to validate this association, we monitored dynamic intracellular metabolic changes upon HIF-1A mRNA degradation (i.e., siRNA knockdown) in IGROV1 ovarian cancer cells, exhibiting an average basal level of HIF-1A activity (Supplementary Fig. 5). While the earliest time-points are dominated by minor and fluctuating metabolic changes, likely reflecting a general response to siRNA transfection, strong metabolic changes gradually emerged 68 h post-transfection, and were most pronounced at 111 h after HIF-1A silencing (Fig. 3b, Supplementary Data 3). The most prominent metabolic changes (Fig. 3c) involve the accumulation of metabolites in the oxidative branch of the TCA cycle (Fig. 3d), including citrate, oxalosuccinate, and N-acetyl glutamate, a downstream product of 2-oxoglutarate. The increase in abundance of TCA cycle intermediates upon HIF-1A knockdown is consistent with the previously reported regulatory role of HIF-1A as a repressor of oxidative metabolism40 via induction of pyruvate dehydrogenase kinase (PDK), and is in agreement with the enrichment analysis of functional associations in our TR-metabolite association network (Fig. 3a). Moreover, out of 728 TRs in the TR-metabolite association network, HIF-1A was recovered within the top 2% of TRs with the strongest associations to metabolites affected by HIF-1A knockdown (Fig. 3e and Supplementary Fig. 7).

Hence, besides generating experimentally testable hypotheses on condition-specific regulatory roles of TRs in metabolism (Fig. 3a), the herein-established TR-metabolite association network might serve as a guide to extract signatures of TR deregulation directly from metabolome profiles. Such an approach would be particularly attractive for the interpretation of large collections of in vivo metabolome profiles acquired from tissue samples in patient cohort studies. In the following, we verify whether TR-metabolite associations inferred in vitro are relevant also in an in vivo context, by testing the well-known role of HIF-1A as a key oncodriver in clear-cell renal cell carcinoma.

Predicting TRs mediating in vivo metabolic reprogramming. Here, we ask whether the map of TR-metabolite associations found in vitro recapitulates metabolic rearrangements in an in vivo setting. To this end, we searched for TRs potentially responsible for metabolic differences between healthy and tumor tissue by evaluating the dot product between the TR-metabolite correlation matrix derived in vitro, and in vivo metabolite fold-changes between normal and cancer tissues. We applied this approach to analyze differences in metabolite abundances between clear-cell renal cell carcinoma (ccRCC) and proximal normal tissue samples in a cohort of 138 patients7 (Supplementary Data 3). Because this independent metabolomics data set contains only a subset of metabolites (i.e., 134) detected in our TR-metabolite association network, for each patient, we estimated the significance of a TR in explaining the observed metabolic changes using a permutation test, and ranked the 728 TRs according to the median q-value across patients (Fig. 4a, Supplementary Data 3).

Loss-of-function mutations in the von Hippel-Lindau (VHL) gene are the most frequent and specific genetic events observed in ccRCC41, entailing a hyper-activation of hypoxia-inducible factors (HIF-1, HIF-2, and HIF-3)39,40. In agreement with the genetic basis of ccRCC, we identified VHL and HIF-1A among the top 1% of TRRUST-listed24 transcriptional regulators that potentially mediate metabolic rearrangement in ccRCC (Fig. 4a). Other top-ranking TRs include YY1, which has been shown to interact with hypoxia-inducible factors42 (see also Supplementary Discussion and Supplementary Fig. 7). These results independently demonstrate the in vivo relevance of the previously inferred in vitro map of TR-metabolite associations, and support its potential clinical applicability to decipher metabolic rearrangements in tumor tissue samples.

To further illustrate the potential of TR-metabolite associations in aiding the interpretation of tumor-specific metabolic changes, we collected data from two additional studies monitoring tumor metabolic reprogramming in cohorts of 10 and 21 patients with colon12 and lung cancer43, respectively. In contrast to ccRCC,


[Figure 3, panels a–e. a Heatmap of KEGG pathway enrichment q-values for selected cancer-related TRs, with circles scaled to the number of known target genes. b Principal component analysis score plot (PC1, 73%; PC2, 9.7%) for non-targeting siRNA and HIF-1A siRNA at 10, 25, and 50 nM. c Volcano plot: median log2 fold-change vs. −log10 product of p-values. d Time courses (0–96 h) of log2 fold-changes for metabolites in and proximal to the TCA cycle. e Relative score ranking of the 728 TRs, with HIF-1A highlighted.]

Fig. 3 Predicted functional association of TRs to metabolic pathways. a Significant over-representation in KEGG metabolic pathways for metabolites associated with 15 selected cancer-related TRs38 (as defined by the COSMIC cancer gene census). The heatmap shows metabolic pathways with a significant enrichment (hypergeometric test, q-value < 0.05). The size of black circles scales with the number of known TR-gene targets24 in the metabolic pathway. b Principal component analysis of dynamic metabolome changes in IGROV1 ovarian cancer cells transfected with three different concentrations (10, 25, and 50 nM) of HIF-1A siRNA, and a non-targeting siRNA as negative control. c Volcano plot of metabolic changes at 111 h after HIF-1A siRNA transfection. Each putatively annotated metabolite is associated with a fold-change and significance (product of p-values across three siRNA concentrations). Only ions annotated to KEGG identifiers are shown. Dot size reports on the strength of the association predicted between HIF-1A activity and the metabolite in our TR-metabolite association network, while the color indicates the sign of the association. d Dynamic changes of metabolites in and proximal to the TCA cycle upon siRNA transfection (with time-point 0 representing the negative control). Data are mean ± standard deviation across three replicates. Gray lines represent dynamic changes of all other detected metabolites. e The 728 TRs in the TR-metabolite association network were ranked by the median overlap of TR-metabolite associations with the metabolic profiles of the HIF-1A knockdown (111 h post-transfection). For each TR, we calculated a score representing the dot product of the TR-metabolite association vector and metabolite fold-changes, normalized by the sum of absolute correlations in the association network. The scores are shown as boxplots across the three siRNA concentrations (see also Supplementary Fig. 7), with blue dots indicating the median score, and edges showing 25th and 75th percentiles.

the most recurrent genetic events in colon and lung are mutations in p5344,45, similarly to many different tumor types. Metabolome-based predictions suggest that deregulation of TRs other than p53, such as NF-Y or Mllt10 (Fig. 4b, c and Supplementary Discussion), more directly associates with metabolic changes observed in lung and colon tissue samples. While we do not have direct experimental evidence, the predicted TRs can function as effectors up- or downstream of p53, and unveil key tumor-specific characteristics that can be exploited in a therapeutic setting.

Consistent with this hypothesis, by analyzing the dependency between TR activity and the sensitivity of NCI-60 tumor cell lines to 130 FDA-approved drugs31, we found 392 TRs for which activity significantly (linear regression p-value ≤ 2.67e−4, Bonferroni-adjusted threshold) associates with at least one drug sensitivity profile (Supplementary Data 4). To disentangle the differences in drug sensitivity relating to variable TR activity from those attributable to the tissue of origin, we used a multivariate statistical approach that uncovers the potential association between individual TRs and drug susceptibility. When testing these associations between groups of TRs regulating similar cellular processes and drugs with a shared mode of action (MoA), we found expected and potentially new functional associations (Fig. 4d). For example, cell lines differentially susceptible to mTOR inhibitors that can induce cell cycle arrest46 exhibited similar patterns in the activity of TRs involved in regulating cell cycle progression (linear regression p-value ≤ 1e−4, Bonferroni-adjusted threshold). Even stronger but less intuitive is the predicted association between regulators of calcium homeostasis and inhibitors of the MAPK signaling pathway (Fig. 4d). In the more specific cases described above, our analysis of HIF-1A and VHL activities across cell lines recapitulates the action of HIF-1A inhibitor vorinostat47 and HIF-1A inducer imiquimod48 (Fig. 4e, f).

Overall, by analyzing cell line responses to largely diverse drugs, we uncovered numerous TRs whose activity correlates with drug sensitivity, providing independent experimental evidence that diverse transcriptional programs can affect the survival and drug tolerance of cancer cells. Hence, similarly to gene expression signatures used clinically to guide treatment decisions49, metabolome-based signatures of TR deregulation could open new possibilities in aiding the selection of personalized


[Figure 4, panels a–f. a–c Median score (x-axis) vs. median −log10 q-value (y-axis) for each TR, with dot size indicating mutation frequency. d Heatmap of enrichment p-values for clusters of functionally related TRs vs. drug modes of action. e, f Slope of the HIF1A-drug and VHL-drug relationship (λ) vs. −log10 p-value for each drug, colored by the relative sensitivity of kidney tissue.]

Fig. 4 Inferring TRs as mediators of in vivo metabolic changes. a–c Prediction results in three previously published data sets comprising metabolic profiles of tumor vs. proximal healthy tissue from 138, 21, and 10 patients with clear-cell renal cell carcinoma7 (a), lung cancer43 (b), and colon cancer12 (c). Each dot represents the strength (x-axis, median score) and significance (y-axis, q-value, corrected for multiple tests, blue to purple color range) of a TR in mediating in vivo metabolic changes. Dot-size indicates the frequency of mutations in the TR-encoding gene for the respective cancer type, derived from the Catalog of Somatic Mutations In Cancer (COSMIC)38. Gene-name labels are shown for the top 1% most significant hits (see Supplementary Discussion and Supplementary Fig. 7). d 131 clusters of functionally related TRs (Supplementary Data 3) were tested against 82 different drug modes of action (MoA) to find a significant over-representation of drug-TR associations. Colored squares indicate the significance of the enrichment analysis. Only TR clusters and drug MoAs with at least one significant association (p-value < 1e−4, Bonferroni-adjusted threshold) are shown in the heatmap.
e, f Each dot represents one of the tested FDA-approved drugs. Estimated influence of changes in TR activity on drug susceptibility (i.e., λ) and corresponding p-value significance are reported. Dot size is proportional to the number of TRs that exhibit a significant association (linear regression p-value ≤ 2.67e−4, Bonferroni-adjusted threshold) with the drug, while dot color reflects the estimated average susceptibility of renal cell lines (i.e., βkidney) with respect to the remaining seven tissue types.

therapeutic treatments. Taken together, in vitro associations between TRs and metabolites not only allow predicting TR regulatory functions in metabolism, but could also complement genomic information and guide the analysis, molecular classification and interpretation of metabolome profiles from large patient cohorts.

Systematic prediction of potential modulators of TR activity. While we have shown that metabolic rearrangements in cancer can be correlated to changes in TR activity, the origin of such changes often remains elusive. Mutations in genes encoding transcriptional regulators can be directly responsible for altered TR functionality, and in some cases even explain disease etiology50. However, the activation of new transcriptional programs is often an indirect response to changes in the abundance of internal effectors of cell signaling51,52. In silico models have proven extremely powerful in finding new allosteric interactions that can regulate enzyme activity53 and in testing their in vivo functionality54, but little progress has been made in the systematic mapping of effectors of TR activity. Here, we integrated three layers of biological information (metabolome, proteome, and transcriptome) to obtain first insights into how cells can activate distinct transcriptional programs in response to changes in the abundance of specific intracellular metabolites.

Analysis of phosphorylation interaction networks in yeast55 and human56 revealed that TRs, as compared to enzymes, interact on average with many more kinases (Kolmogorov–Smirnov test, p-value ≤ 0.001, Fig. 5a, b), reflecting a key role of TRs as mediators in phosphorylation signaling cascades and emphasizing the importance of elucidating post-translational modulators of TR activity. While the impact of phosphorylation has been systematically studied55,56, much less is known about the potential role of metabolites in the allosteric regulation of TR activity. Despite the increased interest in metabolites as signaling molecules and their potential role in driving cellular transformation51, resolving the influence of metabolites on TR activity has remained a daunting task. Here, we established an in silico framework for generating hypotheses on regulatory interactions between TRs, metabolites and kinases (Fig. 5c). To that end, we used model-based fitting analysis to integrate TR activity and metabolome profiles with proteome data11 measuring the abundance of 100 TRs and 64 kinases/phosphatases across 53 cell lines. For each TR, we applied non-linear regression analysis to determine whether variation in TR activity could be modeled as a function of TR protein


[Figure 5, panels a–f. a, b Probability density (pdf) of the number of kinase interactions for TRs and enzymes. c Schematic of non-linear model fitting that combines TR activity (100 TRs), TR abundance, metabolite abundance (260 metabolites) and protein abundance of 64 kinases across 53 cell lines to predict modulators of TR activity. d Predicted network of transcriptional regulators, kinases and metabolites. e, f Pie charts of the number of known allosteric enzyme interactions (1 to >7) for all metabolites and for predicted metabolites.]

Fig. 5 Modeling TR activity using an ensemble of non-linear models. a, b Probability density function (pdf) of the number of kinases reported to interact with TRs (red) and metabolic enzymes (blue) in yeast55 (a) and human56 (b). P-values report on the significance of the difference between TRs and enzyme pdfs (Kolmogorov–Smirnov test). c Schematic overview of the computational framework. Three layers of biological information are integrated using a model-based fitting approach: protein abundance of 100 TRs and 64 kinases, respective TR activities derived from transcriptome data using NCA and relative levels of 260 metabolites across cell lines (see Methods section). d Predicted network of modulatory interactions between TRs, metabolites and kinases. Most interactions involve the combined action of a metabolite and a kinase (93% of interactions, Supplementary Data 5). e, f The pie chart represents the distribution of metabolites allosterically regulating one or more enzymatic reactions. All metabolites (1633) with at least one known allosteric interaction in human26 are reported in e, while only metabolites predicted to modulate the activity of at least one TR are shown in panel f. Metabolites that regulate more than seven enzymatic reactions are lumped together.

abundance and the activating or inhibiting action of individual metabolites and/or kinases. In total, we tested 6,753,600 models, and found 1888 interactions which significantly (FDR ≤ 0.1%) improved the explained variance in the activity of 96 TRs out of the 100 TRs for which protein abundance data was available11. Consistent with the dominant role of phosphorylation in modulating TR activity (Fig. 5a, b), we found that on average TRs are predicted to interact with more kinases than metabolites (Supplementary Fig. 8). However, most of the inferred regulatory interactions (93%) involved the combined action of a kinase and a metabolite, while only a few metabolites or kinases could alone significantly improve model-fitting (Fig. 5d, Supplementary Data 5). This observation suggests that multiple coordinated regulatory mechanisms possibly underlie the post-transcriptional regulation of TR activity. Moreover, metabolites predicted to affect TR activity are significantly enriched (hypergeometric test, p-value 3.1e−4, Supplementary Fig. 8) for key signaling molecules in the cell known to allosterically regulate multiple enzymatic reactions, such as glutathione, glutamate or ATP (Fig. 5e, f). Among the most highly connected metabolites predicted by our model-based fitting analysis is choline (118 interactions with 36 TRs), in agreement with recent studies showing that altered choline levels are a metabolic hallmark of malignant transformation57 (see Supplementary Discussion).

Overall, our approach opens the door for a systematic investigation of a previously largely unexplored58 interaction space between transcriptional regulators and signaling effectors in human cells. An extensive network of post-translational regulatory interactions has emerged in model organisms such as E. coli26 or yeast55,59, and based on our findings we expect a similar picture to hold true in human cells.

Discussion
In recent years, increasing efforts have been made to understand the signals driving metabolic changes in cancer12. Transcriptional regulation is at the basis of the decision-making process of a cell and its ability to allocate resources necessary for cell transformation and proliferation. Genome sequencing and transcriptome technologies have revealed an intricate network of TR-gene regulatory interactions in which mutations in TRs often associate with disease states and aberrant metabolic phenotypes7,60. However, it is important to emphasize that gene regulatory interactions between enzymes and TRs per se are not sufficient to functionally regulate the activity of metabolic pathways. Key to unambiguously resolving regulatory circuits at the intersection with metabolism are methods searching for coordinated behavior between the different levels. To this end, we developed a combined experimental and computational framework that overcomes important limitations in large-scale metabolome screenings, including (i) the limited throughput and laborious sample preparation of classical metabolomics approaches in mammalian cell cultures, (ii) the lack of scalable methods to adequately normalize metabolomics data across morphologically diverse cell types, and (iii) the need for systematic data integration strategies. The herein-proposed workflow for large-scale metabolome profiling is directly applicable to the study of dynamic metabolic responses to external stimuli18, and can scale to larger cohorts that are now within reach of other molecular profiling platforms61.

By combining cross-sectional omics data from diverse tumor cell lines, we constructed a global network model across three layers of biological information: the transcriptome, the proteome, and the metabolome, exploiting the naturally occurring phenotypic diversity in an in vitro cell line system. By analyzing the coordinated changes in baseline transcriptome, proteome and metabolome with the aid of a gene regulatory network and model-based fitting analysis, we investigated the bi-directional exchange of signaling information between TRs and metabolic pathways. For many TRs, we predict potentially new regulatory associations with central metabolic pathways, suggesting a large space of transcriptional solutions by which cells can fulfill the anabolic and catabolic requirements for rapid proliferation and

adaptation to nutrient limitations. The correlation between glycolytic flux and activity of TRs involved in the response to nutrient limitation and stress hints at the ability of cancer cells to sidetrack transcriptional programs activated by nutrient scarcity or stresses to potentially fulfill carbon demand for fast proliferation. Moreover, we observed a global coordination between glucose and one-carbon metabolism, which revealed a selective sensitivity to antifolate drugs in cell lines with low glucose uptake and might serve as a diagnostic marker for cancer cells that are more likely to respond to folate synthesis inhibitors.

Because measuring intracellular fluxes is still a major challenge, and these measurements are typically limited to central metabolic pathways, our metabolome profiling technique offers an alternative tool to probe metabolic regulation at genome scale and high throughput in cancer cells. In light of the central regulatory role of TRs in cellular organization, targeting transcriptional regulators is an extremely attractive way to counteract global gene expression changes that underlie cancer survival and development62,63. Endogenous metabolites capable of modulating TR activity could become invaluable chemical scaffolds to design new therapeutic molecules targeting oncogenic TRs, with the potential to overcome difficulties related to targeting kinase-mediated signaling cascades63. Altogether, our work also suggests that, while clearly far from typical in vivo conditions, in vitro cell line systems represent an invaluable discovery tool to investigate metabolic regulatory mechanisms that can still generalize to in vivo conditions and clinical settings. The experimental and computational framework proposed in this study is applicable to other systems or diseases, providing us with an unprecedented tool to investigate the origin of metabolic dysregulation in human diseases.

Methods
Cell cultivation. The NCI-60 cancer cell lines were obtained from the National Cancer Institute (NCI, Bethesda, MD, USA). The full panel consists of 60 cell lines, 54 of which grow adherently and were used in this study. Of note, cell line MDA-N had previously been excluded from the panel and was recently replaced by the NCI with MDA-MB-468 cells. After thawing, the 54 adherent cell lines were expanded in cell culture flasks (Nunc T75, Thermo Scientific) at 37 °C and 5% CO2 in RPMI-1640 (Biological Industries, cat.no. 01-101-1A) supplemented with 5% fetal bovine serum (FBS, Sigma Aldrich, F6178), 2 mM L-glutamine (Gibco, cat.no. 25030024), 2 g/L D-glucose (Sigma Aldrich, cat.no. G8644), and 100 U/mL penicillin/streptomycin (P/S, Gibco, cat.no. 15140122). After two passages, the cells were transferred to fresh medium where FBS was replaced by dialyzed FBS (dFBS, Sigma Aldrich, cat.no. F0392) with a reduced content of low molecular weight compounds, to improve the accuracy of metabolite quantification. Twenty cell lines were exemplarily tested and confirmed mycoplasma-free: SF268, IGROV1, UACC62, HCC2998, DU145, NCI-ADRRESS, OVCAR3, SNB19, SKMEL18, NCI-H23, SF539, HOP62, UACC157, NCI-H322M, NCI-H522, OVCAR4, EKVX, UO31, CAKI-1, MDAMB231.

Cells were maintained in medium with dFBS throughout all experiments. The starting cell density for metabolomics experiments in 96-well plates (Nunc cat.no. 167008, Thermo Scientific) was determined for each cell line. To this end, cells were plated in triplicate at eight different starting cell densities and incubated at 37 °C and 5% CO2 for 3 days. On the third day, the medium was changed in all wells by aspirating the spent medium using a multichannel aspirator, washing once with phosphate-buffered saline solution (PBS, pH 7.4, 37 °C, Gibco, cat.no. 10010015) using a multichannel dispensing pipet, and finally filling each well again with 150 µL of fresh medium. The plate was imaged to determine the confluence (see below) immediately before and after media change, and after 72 h. The starting cell density for metabolomics experiments was then chosen to guarantee a minimum of 20-30% cell confluence after media change, and approx. 80% confluence after 72 h.

Cell imaging and image analysis. We monitor cell growth by measuring cell confluence (i.e., area of the well covered by cells in percentage) directly from 96-well plates (Nunc cat.no. 167008, Thermo Scientific) using automated time-lapse microscopy imaging. Every 1.5 h, bright-field microscopy images of each well were acquired using a TECAN Spark 10 M plate reader. In addition, we developed an image analysis framework to segment cells and determine the characteristic cell size area (i.e., average surface area of single adherent cells) for each cell line (Supplementary Fig. 1 and Supplementary Note). To quantify the number of extracted cells, confluence was divided by the characteristic cell size (Supplementary Fig. 1). A detailed description and validation of the algorithm used for estimating cell numbers from bright-field microscopy images is provided in the Supplementary Note, alongside Matlab code. Of note, this approach has several important advantages, in that it is non-destructive, and allows quantifying cell growth and cell numbers without any manual sample manipulation.

Metabolomics experiments. Cells were plated in triplicate in 150 µL of RPMI-1640 medium (5% dFBS, 2 g/L glucose, 2 mM glutamine, 1% P/S) in 96-microtiter well plates. After an initial growth phase, the medium in each well was renewed on the third day, and the cultures were subsequently monitored for four more days (96 h). Ten replicate plates were prepared in each experiment to allow for generating metabolomics samples at five different time-points (immediately before media change, and at 24, 48, 72 and 96 h after media change), and one additional plate for continuous growth monitoring (TECAN Spark 10 M, 37 °C, 5% CO2).

At each sampling time point, two replicate 96-well plates were processed (plates A and B). Plate A was used to generate cell extracts, by (1) removing the spent medium, (2) washing once with 75 mM ammonium carbonate (pH 7.4, 37 °C), and (3) adding ice-cold extraction solvent (40% methanol, 40% acetonitrile, 20% water, 25 µM phenyl hydrazine64). Finally, the plate was sealed, incubated at −20 °C for one hour, and subsequently stored at −80 °C until MS analysis. Plate B underwent the same processing steps, except for the last one, where each well was filled with PBS (pH 7.4, 37 °C), and the plate was immediately imaged to determine cell confluence for subsequent normalization of MS spectra.

Immediately prior to MS analysis, the plates were thawed on ice, and the extracted cells were scraped off the bottom of each well using a multichannel pipet with wide-bore tips. Next, the cell extracts were transferred to 96-well plates with conical bottom and centrifuged at 4 °C, 4000 rpm for 5 min to separate cell debris. Finally, pooled cell extracts for each experiment (pooled from five cell lines processed within the same experiment) as well as aliquots of cell-free extraction solvent were added to each measurement plate as control samples, and the plates were sealed and stored at 4 °C until injection.

Metabolome profiling using FIA-TOFMS. Cell extract samples were analyzed by flow-injection analysis time-of-flight mass spectrometry (FIA-TOFMS) on an Agilent 6550 iFunnel Q-TOF LC-MS System (Agilent Technologies, Santa Clara, CA, USA), as described by Fuhrer et al.14. This method generates high-resolution spectral profiles in less than one minute per sample, allowing for sensitive high-throughput profiling of large sample collections. In brief, a defined sample volume of 5 µL is injected using a Gerstel MPS2 autosampler into a constant flow of isopropanol/water (60:40, v/v) buffered with 5 mM ammonium carbonate (pH 9), containing two compounds for online mass axis correction: 3-Amino-1-propanesulfonic acid (138.0230374 m/z, Sigma Aldrich, cat. no. A76109) and hexakis(1H,1H,3H-tetrafluoropropoxy)phosphazine (940.0003763 m/z, HP-0921, Agilent Technologies, Santa Clara, CA, USA). The sample plug is delivered directly to the ion source for ionization in negative mode (325 °C source temperature, 5 L/min drying gas, 30 psig nebulizer pressure, 175 V fragmentor voltage, 65 V skimmer voltage). Mass spectra were recorded in the mass range 50–1000 m/z in 4 GHz high-resolution mode with an acquisition rate of 1.4 spectra per second. Raw MS profiles were processed to align spectra and pick centroid ion masses using an in-house data processing environment in Matlab R2015b (The Mathworks, Natick).

Multiple hypothesis testing correction. Given a sufficient number of tests, the Storey65 method for correction of multiple hypothesis testing was adopted (i.e., q-value). However, this methodology typically requires a large number of null tests to derive an accurate estimate of π0 (estimate of the proportion of true null p-values)65. In cases where the number of tests is inadequate for q-value correction, we adopted the Bonferroni correction. Only in two cases did we apply a resampling confidence interval-based method: (i) to generate Fig. 2c of strongest TR-metabolite associations, and (ii) to select the strongest TR-metabolite-kinase predicted interactions (Fig. 5d). In these cases, to avoid any a priori assumptions of the underlying distributions, we determine the false discovery rate (FDR) from the bulk distribution of correlation or mean squared error values, respectively, derived from random sampling. The procedure is described in detail in the respective sections.

Metabolite annotation. Measured ions were putatively annotated by matching mass-to-charge ratios to a reference list of calculated masses of metabolites listed in the Human Metabolome Database (HMDB) and the genome-scale reconstruction of human metabolism21 (Recon2) within 0.003 amu mass accuracy. The reference list was generated from the respective sum formulae, considering deprotonation as the most prevalent mode of ionization in the chosen acquisition conditions. To allow for the annotation of α-keto acid derivatives formed in presence of phenyl hydrazine in the extraction solvent64, sum formulae for the phenylhydrazone derivatives (+C6H8N2−H2O) of a total of 30 α-keto acid compounds (selected via KEGG SimComp search http://www.genome.jp/tools/simcomp/) were added to the metabolite list for annotation. The final list of putatively annotated metabolites consisted of 689 and 5949 unique compound IDs from Recon2 and HMDB, respectively.
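The mass-matching step described under Metabolite annotation can be illustrated with a minimal sketch. It assumes deprotonation ([M−H]−) as the only ionization mode and the 0.003 amu tolerance stated above. The three-entry reference list, the function name `annotate` and the example m/z values are hypothetical and stand in for the HMDB and Recon2 lists; this is not the in-house Matlab environment used in the study.

```python
PROTON_MASS = 1.007276  # Da; removing one proton yields the [M-H]- ion

# Hypothetical mini reference list: name -> neutral monoisotopic mass (Da).
REFERENCE_MASSES = {
    "glucose (C6H12O6)": 180.06339,
    "lactate (C3H6O3)": 90.03169,
    "glutamate (C5H9NO4)": 147.05316,
}


def annotate(mz_values, reference=REFERENCE_MASSES, tol=0.003):
    """Return, for each measured m/z, the reference names whose
    deprotonated mass lies within +/- tol of the measurement."""
    annotations = {}
    for mz in mz_values:
        hits = [
            name
            for name, neutral_mass in reference.items()
            if abs(mz - (neutral_mass - PROTON_MASS)) <= tol
        ]
        annotations[mz] = hits
    return annotations


if __name__ == "__main__":
    for mz, names in annotate([179.0561, 89.0244, 500.0]).items():
        print(mz, names or "no match")
```

An ion can match several reference compounds (isomers share a sum formula), which is why each m/z maps to a list and the annotation stays putative.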


Data normalization. We corrected for systematic errors using a two-step regression model to disentangle the contributions of extracted cell numbers, plate-to-plate variance, instrumental and background noise from the actual variance in metabolite abundances between cell types (Supplementary Fig. 2). To this end, raw data were first corrected for instrument drift by normalizing for possible batch/plate effects. Each plate contains 12 pooled cell extract samples prepared from the 5 different cell lines in each experiment (i.e., batch). Measured intensities for each annotated ion are modeled as follows:

$$ I_{i,j,p} = \gamma_{p} \cdot M_{i,j} \tag{1} $$

$$ \log\left(I_{i,j,p}\right) = \log\left(\gamma_{p}\right) + \log\left(M_{i,j}\right) \tag{2} $$

where I_{i,j,p} is the measured intensity for ion i in pooled sample j and plate p, γ_p is the scaling factor associated with each plate and M_{i,j} represents the actual abundance of metabolite i in sample j. By using a linear regression scheme we can estimate both parameters (γ_p and M_{i,j}) within an unknown scaling factor. After correcting for possible instrumental artifacts, we implemented a second step in order to derive comparative measurements of metabolite abundance for each cell line. Here, we follow each cell line along the linear growth phase, sampling every 24 h across 5 days. We typically obtain 15 data points for each cell line at different cell densities. The expectation is that the signal measured for any ion of biological origin (i.e., a genuine metabolite) would increase linearly as the number of cells in the sample increases. The proportionality (i.e., α parameter) between ion intensity and extracted cells depends on the intracellular concentration of the metabolite. Here, by implementing a multiple regression scheme, we estimate the relative abundance of a given metabolite in each of the cell lines, α, alongside its standard error (Fig. 1b, Supplementary Fig. 2). A linear regression model describes variation in ion intensity as a linear function of cells extracted (α) and a constant parameter (β) that captures MS background noise. For each ion, the α values are specific to each cell line while the constant term in the model is fixed (i.e., expected ion signal at zero confluence that is independent of the cell line).

Because of the large number of measurements for each cell line at different cell densities, we can apply a multiple regression analysis (fitlm function in Matlab) to infer all model parameters αs and β at once, by minimizing the Euclidean distance between measured metabolite intensities and model predictions. For each metabolite, we solve the following linear model, including all 54 cell lines:

$$
\begin{bmatrix}
I_{cell_1,1} \\ I_{cell_1,2} \\ I_{cell_1,3} \\ \vdots \\ I_{cell_2,1} \\ I_{cell_2,2} \\ I_{cell_2,3} \\ \vdots \\ I_{cell_c,s}
\end{bmatrix}
=
\begin{bmatrix}
N_{cell_1,1} & 0 & \cdots & 0 & 1 \\
N_{cell_1,2} & 0 & \cdots & 0 & 1 \\
N_{cell_1,3} & 0 & \cdots & 0 & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & N_{cell_2,1} & \cdots & 0 & 1 \\
0 & N_{cell_2,2} & \cdots & 0 & 1 \\
0 & N_{cell_2,3} & \cdots & 0 & 1 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & N_{cell_c,s} & 1
\end{bmatrix}
\cdot
\begin{bmatrix}
\alpha_{cell_1} & \alpha_{cell_2} & \cdots & \alpha_{cell_c} & \beta
\end{bmatrix}^{T}
\tag{3}
$$

where I_{cell_c,s} is the measured metabolite intensity in sample s of cell line c, N_{cell_c,s} is the number of cells extracted in sample s of cell line c, and αs (for each cell line) and β are the unknown parameters to be fitted. The number of cells per sample is derived from the confluence measurements at sampling, divided by the average cell size area determined using our image segmentation analysis (see Supplementary Fig. 2 and Supplementary Note). After this step, we retained 2181 ions with a regression p-value below a threshold value of 3.4e−7 (adjusted by the number of cell lines and ions) in at least one cell line, and that showed a significant dependency with the extracted cell number in more than 80% of cell lines (Supplementary Fig. 2). Of note, we found that prior to normalization, the variance across three biological replicates at the same time-point was equally low in cell confluence (median: 7.4% CV) and raw ion intensities (median: 13%, Supplementary Fig. 2), reflecting the high quality of MS measurements.

In the third and last step, we take into account systematic changes in metabolite abundances related to differences in cell size (i.e., cell volume) between the 54 cell lines to derive comparative estimates of intracellular metabolite concentration. Principal component analysis of relative metabolite abundance per cell revealed a strong trend across the 54 cell lines (PC1, 58.9% explained variance, Supplementary Fig. 2), which strongly correlates with cell line volumes derived from cell diameters measured in ref. 66 as well as with the herein determined adherent cell size area (Supplementary Fig. 2, see Supplementary Note). The transitive correlation between adherent cell size and the spherical cell volume in suspension indicates that adherent cell height can be approximated as a constant. To correct MS data for differences in cell line volumes, we selected 987 ions that showed a significant and strong correlation (Pearson's r > 0.8, p < 0.05, Supplementary Fig. 2) with PC1. KEGG pathway enrichment analysis showed that these putatively annotated metabolites were strongly overrepresented in fatty acid metabolism (Supplementary Fig. 2), consistent with the expected linear dependency between cell membrane surface (i.e., phospholipid content) and cell volume of adherent cells. For each ion, cell line-specific α-values of the selected metabolites across all 54 cell lines were used to calculate a consensus correction factor for each cell line by taking the mean across the 987 ions. To apply the cell volume correction to the full data set, we divided the cell line-specific α-values for each ion by the consensus correction factor (Supplementary Data 1).

Finally, the corrected α-values were normalized using Z-score normalization:

$$ Z_{\alpha} = \frac{\alpha_{cell} - \bar{\alpha}}{\sqrt{\frac{1}{n}\sum_{cell=1}^{n}\left(\alpha_{cell} - \bar{\alpha}\right)^{2}}} \tag{4} $$

where n is the number of cell lines (i.e., 54).

The final normalized data set is provided in Supplementary Data 1 alongside p-values and standard errors derived from regression analysis. Missing values (NaN) correspond to cases where the measured ion abundance for the annotated metabolite was close to the background level in the cell line, and can be considered as zero for further analysis. In cell lines where a significant dependency (p-value < 3.4e−7, Bonferroni-adjusted threshold) of a given metabolite's abundance with cell number could be robustly determined and exceeded the background noise, relative standard errors of α calculated during fitting analysis were below 20% (median: 11%, Supplementary Fig. 2).

Analysis of tissue signatures across four omics data types. A complete description of the analysis can be found in Supplementary Methods.

Estimating TR activity by network component analysis (NCA). Originally established by Liao et al.29, NCA provides a mathematical framework for reconstructing TR regulatory signals (TR activity) from gene expression profiles. Here, we adopted the sparseNCA implementation by Noor et al.67 (Matlab code downloaded from https://sites.google.com/site/aminanoor/softwares). This methodology adopts a mathematical model to approximate TR-target regulatory interactions and integrates prior network information with the expression of target genes across multiple conditions to regress the activity of the respective TRs, delivering a relative measure of TR activity. We obtained normalized gene expression profiles across the NCI-60 cell lines from Gene Expression Omnibus (accession number GSE32474), containing 54,675 mRNA probes. The TRRUST database24 served as the source of TR-target gene interactions relevant in human, including 748 human TRs and 1975 non-TR gene targets. Intersecting these two resources, we assembled a network of 2209 unique genes corresponding to 5490 mRNA probes that match target genes of 728 TRs in the TRRUST database (Supplementary Fig. 5). We implemented a bootstrapping approach to account for incompleteness of the regulatory network (i.e., missing regulatory interactions), and the fact that there may be multiple optima in the solution space. Of note, even if the current knowledge of TR-target genes is incomplete, a few gene targets can be sufficient to estimate TR relative activities using this approach. To this end, for each TR we randomly selected 48 additional TRs and constructed a sub-network containing the 49 TRs and their target genes. Because growth-rate has a pleiotropic effect on gene expression, here reflected in the correlation between the first principal component of gene expression data and cell line growth rates (Supplementary Fig. 5), we decouple TR activity from the confounding effect of growth-rate by adding an additional TR that targets all genes. This fictitious TR mimics the general effect of proliferation rates on transcription. As a result, each TR is embedded in a sub-network of 50 TRs and their target genes from the full network. Ten such subnetworks were created randomly for each TR to apply NCA. In this bootstrapping scheme, each TR was sampled in on average 490 subnetworks (permutations, min. 423, max. 556 data points per TR). In the final data set, we normalized the calculated TR activity to the maximum across all permutations, and finally calculated the median TR activity and its standard deviation for each TR and cell line (Supplementary Fig. 5). It is worth noting that the estimates we obtain with this approach are correct within an unknown scaling factor, and hence we determine a relative measure of activity for each of the 728 TRs across the NCI-60 cell lines (Supplementary Data 2).

TR-metabolite association network. In order to find metabolites whose relative abundances correlate with TR activity, we calculated pairwise Spearman correlations between all 2181 annotated metabolites and 728 TRs across the 54 cell lines (Supplementary Data 3).

For visualization in Fig. 2c, we controlled the false discovery rate (FDR) among network links at 0.1% using a bootstrapping approach to calculate the 99.9% confidence interval of correlation coefficients after randomizing the data set. To that end, the cell lines in the metabolome data set were randomized by resampling 100 times with replacement, and pairwise Spearman correlation coefficients were calculated for each randomized data set. Correlation coefficients yielding a 99.9% confidence interval (0.1% FDR) were obtained from the pooled list of absolute correlation coefficients by finding the smallest correlation coefficient that exceeds the maximum value among the 99.9% lowest correlation coefficients.

Measurement of glucose and lactate exchange rates. A complete description of the experiments and techniques used to determine glucose uptake and lactate secretion rates can be found in Supplementary Methods.
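The second regression step under Data normalization (Eq. 3), with one slope α per cell line and a single shared background term β, can be sketched in plain Python. This is an illustrative least-squares solution obtained by eliminating each α from the normal equations, not the Matlab fitlm call used in the study; it returns point estimates only, without the standard errors and p-values that fitlm reports. The function name, data layout and example numbers are assumptions.

```python
def fit_shared_intercept(samples):
    """Least-squares fit of intensity = alpha_c * n_cells + beta.

    samples: dict mapping cell line -> list of (n_cells, intensity) pairs.
    Returns (alphas, beta): one slope per cell line and one shared intercept.
    Requires that cell numbers vary within at least one cell line.
    """
    n_total = 0
    sum_i = 0.0
    stats = {}
    for line, pairs in samples.items():
        s_n = sum(n for n, _ in pairs)        # sum of cell numbers
        s_nn = sum(n * n for n, _ in pairs)   # sum of squared cell numbers
        s_ni = sum(n * i for n, i in pairs)   # sum of cell number * intensity
        stats[line] = (s_n, s_nn, s_ni)
        n_total += len(pairs)
        sum_i += sum(i for _, i in pairs)

    # For fixed beta, alpha_c = (s_ni - beta * s_n) / s_nn. Substituting this
    # into the zero-gradient condition for beta gives a closed-form solution.
    num = sum_i - sum(s_ni * s_n / s_nn for s_n, s_nn, s_ni in stats.values())
    den = n_total - sum(s_n * s_n / s_nn for s_n, s_nn, _ in stats.values())
    beta = num / den
    alphas = {
        line: (s_ni - beta * s_n) / s_nn
        for line, (s_n, s_nn, s_ni) in stats.items()
    }
    return alphas, beta


if __name__ == "__main__":
    # Synthetic, noise-free example: slopes 2.0 and 0.5, background 10.
    data = {
        "line_A": [(1, 12.0), (2, 14.0), (3, 16.0)],
        "line_B": [(1, 10.5), (2, 11.0), (3, 11.5)],
    }
    print(fit_shared_intercept(data))
```

Fitting all cell lines jointly is what lets the background term be estimated from every sample at once while the slopes remain comparable between cell lines.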

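The third normalization step (cell volume correction followed by the Z-score of Eq. 4) can be sketched as follows. The sketch assumes the consensus correction factor is the plain mean of the α-values of the selected marker ions in each cell line, as the text describes, and it ignores missing values (NaN), which the real data set contains. Function and variable names are hypothetical.

```python
from statistics import mean, pstdev


def volume_correct(alpha, marker_ions):
    """Divide every ion's alpha values by a per-cell-line consensus factor.

    alpha: dict mapping ion -> list of alpha values, one per cell line.
    marker_ions: ions whose alpha values track cell volume.
    """
    n_lines = len(next(iter(alpha.values())))
    factor = [mean(alpha[ion][c] for ion in marker_ions) for c in range(n_lines)]
    return {ion: [v / f for v, f in zip(vals, factor)] for ion, vals in alpha.items()}


def zscore(values):
    """Z-score using the population standard deviation (1/n), as in Eq. 4."""
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]


if __name__ == "__main__":
    alpha = {"marker": [2.0, 4.0, 8.0], "ion_x": [1.0, 4.0, 4.0]}
    corrected = volume_correct(alpha, ["marker"])
    print(corrected["ion_x"])          # alpha per unit of the volume proxy
    print(zscore(corrected["ion_x"]))  # comparable across ions
```

`pstdev` is used rather than `stdev` because Eq. 4 divides by n, not n − 1.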

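The resampling-based cutoff described for the TR-metabolite association network can be sketched as below. The sketch rests on one reading of the procedure, stated here as an assumption: in each round the cell-line order of the metabolome data is resampled with replacement, which breaks its pairing with TR activity, and the cutoff is the smallest pooled absolute Spearman coefficient lying above the lowest 99.9% of values. All names and the toy data are hypothetical.

```python
import random


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    out = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            out[k] = mean_rank
        pos = end + 1
    return out


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:  # constant vector: correlation undefined, treat as 0
        return 0.0
    return cov / (sx * sy)


def null_threshold(metabolites, tr_activities, n_rounds=100, quantile=0.999, seed=0):
    """Cutoff on |rho| derived from randomized metabolome data."""
    rng = random.Random(seed)
    n = len(tr_activities[0])
    pooled = []
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cell lines
        for m in metabolites:
            randomized = [m[k] for k in idx]
            for t in tr_activities:
                pooled.append(abs(spearman(randomized, t)))
    pooled.sort()
    cut = min(len(pooled) - 1, int(quantile * len(pooled)))
    return pooled[cut]
```

Only TR-metabolite pairs whose observed absolute correlation exceeds the returned value would be drawn as network links under this reading.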
Pathway enrichment in TR-metabolite correlation signatures. To assess the over-representation of TR-metabolite associations in KEGG metabolic pathways, the pairwise correlations between TR activity and metabolite relative abundance across cell lines were rank-transformed. A statistical score that models the probability of a KEGG pathway to be significantly associated with a TR is based on the collective activities of multiple metabolites in a pathway following the approach described in ref. 68. The significance of the rank distribution of all metabolites within the same KEGG pathway is tested by means of an iterative hypergeometric test, indicating the statistical significance of metabolic intermediates of a common metabolic pathway (e.g., TCA cycle) being distributed toward the top ranking ones. P-values were corrected for multiple tests by q-value estimation69 (Supplementary Data 3).

siRNA transfection and HIF-1A knockdown. A complete description of the experimental procedures used to knock down the TR HIF-1A in IGROV-1 ovarian cancer cells, and quantify metabolic changes in response to the knockdown can be found in Supplementary Methods.

Inferring TR involvement from in vivo metabolic changes. Based on the TR-metabolite association network, the procedure used to estimate the contribution of each of the 728 TRs in mediating metabolic changes observed in vivo consists of 3 main steps: (i) for each patient p, log2 fold-changes of detected metabolites between cancer and adjacent normal tissue are estimated (FC^p), (ii) the dot product between metabolite fold-changes and the TR-metabolite correlation vector (C_TR, product of correlation R and log10 p) estimated from in vitro cell lines is computed for each TR (S^p_TR), (iii) the significance is estimated using a permutation test, where metabolite order is shuffled 10,000 times, the dot product is estimated for each random permutation (S̃^p_TR) and p-values are estimated as follows:

$$ S^{p}_{TR} = \frac{C_{TR} \cdot FC^{p}}{\lVert C_{TR} \rVert_{1}} \tag{5} $$

$$ p\text{-value}_{TR} = \frac{\sum_{k}^{10{,}000}\left(\tilde{S}^{p}_{TR} \geq S^{p}_{TR}\right)}{10{,}000} \tag{6} $$

P-values are corrected for multiple tests by q-value estimation69, and the median across patients is calculated (Supplementary Data 3). Notably, when analyzing the data published in ref. 7 we excluded all detected metabolites with more than 10 missing values across patient samples.

Associations between TR activity and drug action. A complete description of the linear regression model used to quantify associations between variation in drug sensitivity and TR activity can be found in Supplementary Methods.

Prediction of metabolite-TR effectors. A complete description of the non-linear model fitting analysis for the prediction of metabolite or kinase effectors of TRs can be found in Supplementary Methods.

Data availability
All data generated or analyzed during this study are included in this published article as Supplementary Data.

Code availability
A detailed description of all data analysis steps is published in this article in the Methods section and Supplementary Information. Matlab code for the image analysis software is available for download at http://www.imsb.ethz.ch/research/zampieri-group/resources.html.

Received: 31 January 2019 Accepted: 22 March 2019

References
1. Kiviet, D. J. et al. Stochasticity of metabolism and growth at the single-cell level. Nature 514, 376–379 (2014).
2. Green, D. R., Galluzzi, L. & Kroemer, G. Metabolic control of cell death. Science 345, 1250256 (2014).
3. Yuneva, M. O. et al. The metabolic profile of tumors depends on both the responsible genetic lesion and tissue type. Cell Metab. 15, 157–170 (2012).
4. Boroughs, L. K. & DeBerardinis, R. J. Metabolic pathways promoting cancer cell survival and growth. Nat. Cell Biol. 17, 351–359 (2015).
5. Reznik, E., Wang, Q., La, K., Schultz, N. & Sander, C. Mitochondrial respiratory gene expression is suppressed in many cancers. eLife 6, e21592 (2017).
6. Cheung, P. et al. Single-cell chromatin modification profiling reveals increased epigenetic variations with aging. Cell 173, 1385–1397.e14 (2018).
7. Hakimi, A. A. et al. An integrated metabolic atlas of clear cell renal cell carcinoma. Cancer Cell 29, 104–116 (2016).
8. Bartel, J. et al. The human blood metabolome-transcriptome interface. PLoS Genet. 11, e1005274 (2015).
9. Shoemaker, R. H. The NCI60 human tumour cell line anticancer drug screen. Nat. Rev. Cancer 6, 813–823 (2006).
10. Pfister, T. D. et al. Topoisomerase I levels in the NCI-60 cancer cell line panel determined by validated ELISA and microarray analysis and correlation with indenoisoquinoline sensitivity. Mol. Cancer Ther. 8, 1878–1884 (2009).
11. Guo, T. et al. Rapid proteotyping reveals cancer biology and drug response determinants in the NCI-60 cells. Preprint at https://www.biorxiv.org/content/10.1101/268953v2 (2018).
12. Hu, J. et al. Heterogeneity of tumor-induced gene expression changes in the human metabolic network. Nat. Biotechnol. 31, 522–529 (2013).
13. Jain, M. et al. Metabolite profiling identifies a key role for glycine in rapid cancer cell proliferation. Science 336, 1040–1044 (2012).
14. Fuhrer, T., Heer, D., Begemann, B. & Zamboni, N. High-throughput, accurate mass metabolome profiling of cellular extracts by flow injection–time-of-flight mass spectrometry. Anal. Chem. 83, 7074–7080 (2011).
15. Paglia, G. & Astarita, G. Metabolomics and lipidomics using traveling-wave ion mobility mass spectrometry. Nat. Protoc. 12, 797–813 (2017).
16. Dettmer, K. et al. Metabolite extraction from adherently growing mammalian cells for metabolomics studies: optimization of harvesting and extraction protocols. Anal. Bioanal. Chem. 399, 1127–1139 (2011).
17. Milo, R. What is the total number of protein molecules per cell volume? A call to rethink some published values. Bioessays 35, 1050–1055 (2013).
18. Dubuis, S., Ortmayr, K. & Zampieri, M. A framework for large-scale metabolome drug profiling links coenzyme A metabolism to the toxicity of anti-cancer drug dichloroacetate. Commun. Biol. 1, 101 (2018).
19. Dubuis, S. et al. Metabotypes of breast cancer cell lines revealed by non-targeted metabolomics. Metab. Eng. 43, 173–186 (2017).
20. Sugimoto, M. et al. MMMDB: mouse multiple tissue metabolome database. Nucleic Acids Res. 40, D809–D814 (2012).
21. Thiele, I. et al. A community-driven global reconstruction of human metabolism. Nat. Biotechnol. 31, 419–425 (2013).
22. Tanner, L. B. et al. Four key steps control glycolytic flux in mammalian cells. Cell Syst. 7, 49–62.e8 (2018).
23. Zelezniak, A., Sheridan, S. & Patil, K. R. Contribution of network connectivity in determining the relationship between gene expression and metabolite concentration changes. PLoS Comput. Biol. 10, e1003572 (2014).
24. Han, H. et al. TRRUST: a reference database of human transcriptional regulatory interactions. Sci. Rep. 5, srep11432 (2015).
25. Cascante, M. et al. Metabolic control analysis in drug discovery and disease. Nat. Biotechnol. 20, 243–249 (2002).
26. Reznik, E. et al. Genome-scale architecture of small molecule regulatory networks and the fundamental trade-off between regulation and enzymatic activity. Cell Rep. 20, 2666–2677 (2017).
27. Vaquerizas, J. M., Kummerfeld, S. K., Teichmann, S. A. & Luscombe, N. M. A census of human transcription factors: function, expression and evolution. Nat. Rev. Genet. 10, 252–263 (2009).
28. Bellis, A. D. et al. Cellular arrays for large-scale analysis of transcription factor activity. Biotechnol. Bioeng. 108, 395–403 (2011).
29. Liao, J. C. et al. Network component analysis: Reconstruction of regulatory signals in biological systems. Proc. Natl Acad. Sci. USA 100, 15522–15527 (2003).
30. Barretina, J. et al. The cancer cell line encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483, 603–607 (2012).
31. Reinhold, W. C. et al. CellMiner: a web-based suite of genomic and pharmacologic tools to explore transcript and drug patterns in the NCI-60 cell line set. Cancer Res. 72, 3499–3511 (2012).
32. Ait-Lounis, A. et al. The transcription factor Rfx3 regulates β-cell differentiation, function, and glucokinase expression. 59, 1674–1685 (2010).
33. Hirotsu, Y. et al. Transcription factor NF-E2-related factor 1 impairs glucose metabolism in mice. Genes Cells 19, 650–665 (2014).
34. Angelastro, J. M. Targeting ATF5 in Cancer. Trends Cancer 3, 471–474 (2017).
35. Monroy, M. A. et al. Regulation of cAMP-responsive element-binding protein-mediated transcription by the SNF2/SWI-related protein, SRCAP. J. Biol. Chem. 276, 40721–40726 (2001).
36. Vincent, E. E. et al. Mitochondrial phosphoenolpyruvate carboxykinase regulates metabolic adaptation and enables glucose-independent tumor growth. Mol. Cell 60, 195–207 (2015).
37. Chubukov, V., Zuleta, I. A. & Li, H. Regulatory architecture determines optimal regulation of gene expression in metabolic pathways. Proc. Natl Acad. Sci. USA 109, 5127–5132 (2012).


38. Forbes, S. A. et al. COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res. 45, D777–D783 (2017).
39. Semenza, G. L. HIF-1: upstream and downstream of cancer metabolism. Curr. Opin. Genet. Dev. 20, 51–56 (2010).
40. Papandreou, I., Cairns, R. A., Fontana, L., Lim, A. L. & Denko, N. C. HIF-1 mediates adaptation to hypoxia by actively downregulating mitochondrial oxygen consumption. Cell Metab. 3, 187–197 (2006).
41. The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of clear cell renal cell carcinoma. Nature 499, 43–49 (2013).
42. Petrella, B. L. & Brinckerhoff, C. E. PTEN suppression of YY1 induces HIF-2α activity in von Hippel Lindau null renal cell carcinoma. Cancer Biol. Ther. 8, 1389–1401 (2009).
43. Chaudhri, V. K. et al. Metabolic alterations in lung cancer-associated fibroblasts correlated with increased glycolytic metabolism of the tumor. Mol. Cancer Res. MCR 11, 579–592 (2013).
44. The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 487, 330–337 (2012).
45. The Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature 511, 543–550 (2014).
46. Kawada, J. et al. mTOR Inhibitors induce cell-cycle arrest and inhibit tumor growth in Epstein–Barr virus–associated T and natural killer cell lymphoma cells. Clin. Cancer Res. 20, 5412–5422 (2014).
47. Zhang, C. et al. Vorinostat suppresses hypoxia signaling by modulating nuclear translocation of hypoxia inducible factor 1 alpha. Oncotarget 8, 56110–56125 (2017).
48. Huang, S.-W. et al. Targeting aerobic glycolysis and HIF-1α expression enhance imiquimod-induced apoptosis in cancer cells. Oncotarget 5, 1363–1381 (2014).
49. Cardoso, F. et al. 70-Gene signature as an aid to treatment decisions in early-stage breast cancer. N. Engl. J. Med. 375, 717–729 (2016).
50. Gossage, L., Eisen, T. & Maher, E. R. VHL, the story of a tumour suppressor gene. Nat. Rev. Cancer 15, nrc3844 (2014).
51. Sciacovelli, M. et al. Fumarate is an epigenetic modifier that elicits epithelial-to-mesenchymal transition. Nature 537, 544 (2016).
52. Sciacovelli, M. & Frezza, C. Oncometabolites: unconventional triggers of oncogenic signalling cascades. Free Radic. Biol. Med. 100, 175–181 (2016).
53. Link, H., Kochanowski, K. & Sauer, U. Systematic identification of allosteric protein-metabolite interactions that control enzyme activity in vivo. Nat. Biotechnol. 31, 357–361 (2013).
54. Hackett, S. R. et al. Systems-level analysis of mechanisms regulating yeast metabolic flux. Science 354, aaf2786 (2016).
55. Sharifpoor, S. et al. A quantitative literature-curated gold standard for kinase-substrate pairs. Genome Biol. 12, R39 (2011).
56. Huang, K.-Y. et al. RegPhos 2.0: an updated resource to explore protein kinase-substrate phosphorylation networks in mammals. Database J. Biol. Databases Curation 2014, bau034 (2014).
57. Glunde, K., Bhujwalla, Z. M. & Ronen, S. M. Choline metabolism in malignant transformation. Nat. Rev. Cancer 11, 835 (2011).
58. Alam, M. T. et al. The self-inhibitory nature of metabolic networks and its alleviation through compartmentalization. Nat. Commun. 8, ncomms16018 (2017).
59. Piazza, I. et al. A Map of protein-metabolite interactions reveals principles of chemical communication. Cell 172, 358–372.e23 (2018).
60. Chen, J. C. et al. Identification of causal genetic drivers of human disease through systems-level analysis of regulatory networks. Cell 159, 402–414 (2014).
61. Subramanian, A. et al. A next generation connectivity map: L1000 platform and the first 1,000,000 profiles. Cell 171, 1437–1452.e17 (2017).
62. Illendula, A. et al. A small-molecule inhibitor of the aberrant transcription factor CBFβ-SMMHC delays leukemia in mice. Science 347, 779–784 (2015).
63. Gonda, T. J. & Ramsay, R. G. Directly targeting transcriptional dysregulation in cancer. Nat. Rev. Cancer 15, 686–694 (2015).
64. Zimmermann, M., Sauer, U. & Zamboni, N. Quantification and mass isotopomer profiling of α-keto acids in central carbon metabolism. Anal. Chem. 86, 3232–3237 (2014).
65. Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003).
66. Dolfi, S. C. et al. The metabolic demands of cancer cells are coupled to their size and protein synthesis rates. Cancer Metab. 1, 20 (2013).
67. Noor, A., Ahmad, A. & Serpedin, E. SparseNCA: Sparse Network Component Analysis for Recovering Transcription Factor Activities with Incomplete Prior Information. IEEE/ACM Trans. Comput. Biol. Bioinform. 15, 387–395 (2015).
68. Konig, R. et al. A probability-based approach for the analysis of large-scale RNAi screens. Nat. Methods 4, 847–849 (2007).
69. Storey, J. D. A direct approach to false discovery rates. J. R. Stat. Soc. Ser. B 64, 479–498 (2002).

Acknowledgements
We thank the National Cancer Institute (NCI) for providing the cancer cell lines. This work was supported by a Worldwide Cancer Research (WCR-15-1058) project funding to M.Z., K.O. was funded by the Austrian Science Fund (FWF): FWF P26603 and FWF W1224 Doctoral Program BioToP - Biomolecular Technology of Proteins. We thank Uwe Sauer and Nicola Zamboni for supporting this work and providing laboratory facilities, Dimitris Christodoulou, Maren Diether, Nicola Zamboni and Victor Chubukov for helpful feedback and discussions.

Author contributions
M.Z. designed the project. K.O. and S.D. performed the metabolome experiments. K.O. and M.Z. analysed the data. M.Z. and S.D. designed and implemented the image analysis framework. M.Z. and K.O. wrote and revised the manuscript. All authors contributed to preparing the manuscript.

Additional information
Supplementary Information accompanies this paper at https://doi.org/10.1038/s41467-019-09695-9.

Competing interests: The authors declare no competing interests.

Reprints and permission information is available online at http://npg.nature.com/reprintsandpermissions/

Journal peer review information: Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2019
