Intratumor heterogeneity alters most effective drugs in designed combinations

Boyang Zhaoa,b, Michael T. Hemannb,, and Douglas A. Lauffenburgerb,c,d,1

aComputational and Systems Biology Program, bThe David H. Koch Institute for Integrative Cancer Research, and Departments of cBiology and dBiological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139

Edited by Gordon B. Mills, The University of Texas, M. D. Anderson Cancer Center, Houston, TX, and accepted by the Editorial Board June 13, 2014 (received for review December 26, 2013)

The substantial spatial and temporal heterogeneity observed in patient tumors poses considerable challenges for the design of effective drug combinations with predictable outcomes. Currently, the implications of tissue heterogeneity and sampling bias during diagnosis are unclear for selection and subsequent performance of potential combination therapies. Here, we apply a multiobjective computational optimization approach integrated with empirical information on efficacy and toxicity for individual drugs with respect to a spectrum of genetic perturbations, enabling derivation of optimal drug combinations for heterogeneous tumors comprising distributions of subpopulations possessing these perturbations. Analysis across probabilistic samplings from the spectrum of various possible distributions reveals that the most beneficial (considering both efficacy and toxicity) set of drugs changes as the complexity of genetic heterogeneity increases. Importantly, a significant likelihood arises that a drug selected as the most beneficial single agent with respect to the predominant subpopulation in fact does not reside within the most broadly useful drug combinations for heterogeneous tumors. The underlying explanation appears to be that heterogeneity essentially homogenizes the benefit of drug combinations, reducing the special advantage of a particular drug on a specific subpopulation. Thus, this study underscores the importance of considering heterogeneity in choosing drug combinations and offers a principled approach toward designing the most likely beneficial set, even if the subpopulation distribution is not precisely known.

systems biology | cancer | combination therapy

Significance

Tumors within each cancer patient have been found to be extensively heterogeneous both spatially across distinct regions and temporally in response to treatment. This poses challenges for prognostic/diagnostic biomarker identification and rational design of optimal drug combinations to minimize reoccurrence. Here we present a computational approach incorporating drug efficacy and drug side effects to derive effective drug combinations and study how tumor heterogeneity affects drug selection. We find that considering subpopulations beyond just the predominant subpopulation in a heterogeneous tumor may result in nonintuitive drug combinations. Additional analyses reveal general properties of effective drugs. This study highlights the importance of optimizing drug combinations in the context of intratumor heterogeneity and offers a principled approach toward their rational design.

Author contributions: B.Z., M.T.H., and D.A.L. designed research; B.Z. performed research; B.Z. and D.A.L. analyzed data; and B.Z., M.T.H., and D.A.L. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. G.B.M. is a guest editor invited by the Editorial Board.
Freely available online through the PNAS open access option.
1To whom correspondence should be addressed. Email: [email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1323934111/-/DCSupplemental.

Genetic intratumor heterogeneity has long been appreciated as present in cancer patients (1). Recent sequencing studies further revealed the extent of this tumor diversity, arising from highly complex clonal evolutionary processes (2). This phenomenon has been observed in many solid (3–5) and hematopoietic cancers (6–8). Moreover, treatments also may have dramatic effects on tumor composition, with a preexisting subclone at diagnosis often becoming dominant at relapse (9–12). To meet this challenge, rational drug combination treatments must be designed so as to account for intratumor heterogeneity to better predict reoccurrence.

The history of theoretical studies aimed at drug optimization in cancer therapy is long. Some studies used differential equation models formulated as deterministic optimal control problems (13–16), whereas others used stochastic birth–death process models (17–21), to examine treatment regimens for tumors comprising a sensitive population along with a few resistant subpopulations. However, these studies, many of which dealt only with generic scenarios, focused primarily on drug scheduling, investigating the effects of frequency and dose intensity of drugs on resistance potential. Practical applications of such results have been limited, although some of these strategies recently were combined with experimental validation to examine single-drug scheduling in vitro and in xenograft models (22, 23).

The extensive number of chemotherapy and targeted therapeutics in clinical and preclinical use presents a large search space to determine even the choice of component drugs in a drug combination for best treatment of a heterogeneous tumor. Perhaps the closest endeavor was a theoretical analysis of a minimal set of drugs that maximize the coverage of molecular target variants, with respect to structural considerations (24). Nonetheless, to our knowledge, there are no previous systematic studies of combination drug design examining efficacy and toxicity in the context of tumor heterogeneity and aimed at (i) how heterogeneity affects the utility of drug combinations and (ii) how the current clinical practice of tumor diagnosis, which biases toward focus on the predominant subpopulation, affects the design of drug combinations (Fig. 1A).

To address these questions, we use a computational optimization approach in concert with previously reported experimental data across a range of specific drug effects on a spectrum of particular RNAi perturbations characterizing genetic variations in model tumors. Our goal is to determine the best drug combinations to minimize all subpopulations significantly present in a heterogeneous tumor. We previously experimentally validated our model and the computational optimization predictions regarding a two-drug combination with three-component heterogeneous tumors in vitro and in vivo in the presence of information about the subpopulation proportions (25). Here, we dramatically extend our model to examine more complex heterogeneous tumor compositions and a greater number of component drugs and, most importantly, to integrate over samplings from all possible subpopulation distributions via computational simulation. The latter integration facilitates our understanding of which drugs should be included in the combination treatment, even in the absence of quantitative information on the
subpopulation distributions for a given tumor, as would be typical in clinical situations. Among the insights gained, a key principle is that significant differences may exist between the combinations predicted to be most beneficial for the predominant single subpopulation and those more beneficial when multiple subpopulations are taken into account. This suggests that gaining at least some degree of information concerning the heterogeneity of a primary tumor upon diagnosis, even if not quantitative or complete, may lead to an indicated drug regimen different from what would be viewed as best if only the predominant genetic signature was obtained. Moreover, consistent with our recent experimental findings (26), our simulations demonstrate that drug effects are homogenized by tumor heterogeneity over essentially a full range of conceivable subpopulation distributions. Consequently, the effectiveness of drugs most likely to be found beneficial in combination for heterogeneous tumors may be characterized by average efficacy and robustness across the subpopulations rather than by extreme efficacy in a particular subpopulation. Taken together, our results offer conceptual principles for designing drug combinations in the context of intratumor heterogeneity.

www.pnas.org/cgi/doi/10.1073/pnas.1323934111 PNAS | July 22, 2014 | vol. 111 | no. 29 | 10773–10778

Results

Conceptual and Computational Approach. Our premise is that a typical tumor comprises multiple diverse subclones along with a dominant subpopulation, and that dynamic changes in the distribution of these populations may be influenced by drug treatments. Currently, the prevalent clinical practice for diagnosing tumor type and drug regimen is based on histological and/or biomarker identification and generally is biased toward the predominant subpopulation as a result of its greater representation in the population (2, 27). However, minor subpopulations are important in determining how a tumor responds to treatments and may be responsible for resistance and relapse (28). An immediate critical question is, how should consideration of heterogeneity, instead of simply the predominant subpopulation, affect drug combination design (Fig. 1A)?

Our approach integrates a theoretical framework with empirical experimental information. The theoretical framework is a multiobjective linear optimization algorithm that simultaneously incorporates efficacy (the desired tumor cell killing) and toxicity (adverse side effects in patients) for drug combination effects on many heterogeneous tumor compositions (see SI Appendix for methodological details). The calculations involved in the optimization algorithm incorporate the effects of each drug on each tumor subpopulation, sampling from among the conceivable compositions of subpopulations and combinations of drugs (Fig. 1B). These compositions and drugs come from previous experimental information (26, 29). The tumor subpopulations are taken from a battery of shRNA knockdowns of genes in the DNA damage repair and apoptosis pathways in the murine Eμ-myc; p19Arf−/− lymphoma cell line [derived from a well-established preclinical Eμ-myc mouse model of human Burkitt lymphoma (30, 31)], and the drugs are taken from a series of commonly used chemotherapeutics and targeted therapeutics (Fig. S1). A key finding in the previous work is that the overall efficacy of multiple drugs in combinations typically is approximated by linear combinations of the individual drug efficacies when they are applied together at overall LD80–90 concentration. Such additivity allows for the use of a linear objective function for efficacy with linear constraints in our optimization algorithm.

[Fig. 1 graphic. Panel labels: (A) biopsy sampling of a heterogeneous tumor, predominant subpopulation only vs. all subpopulations, most efficacious drug combinations ("How different are the optimal drug combinations?"); (B) Monte Carlo sampling, drug optimization, statistical analyses; (C) static deterministic linear multi-objective optimization (given: tumor composition; maximize: overall efficacy; minimize: overall toxicity), augmented weighted Tchebycheff method, drug combinations at Pareto frontier, with drug efficacy, drug toxicity, and tumor composition as inputs, optionally informed by prior knowledge or machine learning (e.g., PLSR, k-NN, SVM, elastic net, naive Bayes).]

Fig. 1. Schematic of computational model. (A) Current clinical practice for diagnosis may result in sampling bias, with analysis based on only/mostly the predominant subpopulation. As such, a key question we try to address here is how intratumor heterogeneity (and specifically, consideration of the entire heterogeneity vs. just the predominant subpopulation) affects the resulting optimized drug combinations. (B) To examine the effects of intratumor heterogeneity on optimal drug combinations, Monte Carlo sampling was applied to sample 10,000 heterogeneous tumor compositions. For each tumor composition, we mathematically optimized for a set of drug combinations. Statistical and sensitivity analyses were applied to the sampling results. (C) Schematic for the optimization model. For a given tumor composition, drug combinations were optimized to maximize efficacy and minimize toxicity using a multiobjective optimization approach, scalarized using an augmented weighted Tchebycheff method, and solved iteratively using (SI Appendix). Additional drug or tumor properties (derived from prior knowledge or machine learning) also may be incorporated into this framework. Solving the multiobjective optimization model leads to a set of solutions, or a Pareto optimal set, on the Pareto frontier, which represents a surface along which any improvement in one objective comes at the expense of the other objective.

Of course, the design of drug combinations involves constraints and conflicting objectives, resulting in multiple tradeoffs to consider simultaneously. For instance, the tradeoff between efficacy and toxicity prohibits maximizing the number of drugs without exceeding the tolerable level of toxicity. Previous analyses of optimal control theory-based design of chemotherapeutic regimens have long used toxicity (e.g., in terms of maximal/cumulative drug concentrations) as constraints or secondary objectives in their mathematical formulations (14). Here, our framework instead posits multiobjective optimization to maximize overall efficacy and minimize overall toxicity concomitantly (Fig. 1C and SI Appendix).

As one of the primary considerations in clinical drug combination regimens is nonoverlapping toxicity (again, adverse side effects), we formulated our drug toxicity model as a linear model, with the goal of minimizing overall toxicity with constraints on a maximal allowable toxicity for each side effect (e.g., myelosuppression, gastrointestinal effects). This formulation effectively captures nonlinear behaviors as a result of nonoverlapping drug combinations. Subsequent analyses first are based on a symmetric toxicity profile; that is, each drug has the same toxicity unit of 1. Later, this is relaxed to include the actual asymmetric toxicity profile of drugs in our efficacy dataset.

Solving a multiobjective optimization problem, concomitantly incorporating efficacy and toxicity, results in a discrete set (known as a "Pareto optimal set") of feasible solutions (i.e., optimal drug combinations) rather than a single "absolute best" solution (Fig. S2). The set is discrete, because each solution represents an integer number of drugs included in the combination, but it resides upon a continuous curve termed the "Pareto frontier." Each solution in the Pareto optimal set has the property that improving one objective (say, raising efficacy) worsens at least one other objective (then, raising toxicity). We enumerate the Pareto optimal solutions via linear programming calculations (see SI Appendix for technical details).

Effects of Tumor Heterogeneity on Drug Combination Efficacy and Toxicity Tradeoffs. As the first manifestation of our analysis, we used our empirical efficacy dataset (26, 29) in concert with a symmetric toxicity profile. In this circumstance, we expect the algorithm to produce as optimal solutions drug combinations with the most efficacious drug plus additional drugs incorporated
into the combination in order of decreasing efficacy. Partnering a symmetric toxicity profile with decreasing efficacy would suggest a deviation from linearity from the Pareto frontier curve as the number of drugs in the combination is increased. Fig. 2A shows a representative example of this anticipated tradeoff between toxicity and efficacy on the Pareto frontier for the homogeneous tumor population (100% shp53). Interestingly, however, as heterogeneity is introduced to the population (55% shp53 plus 13 minor subpopulations), the shape of the Pareto frontier becomes surprisingly linear (Fig. 2B and Table S2). This behavior, the shift from nonlinear Pareto frontier curve to linear as a tumor becomes more heterogeneous, also was observed with other subpopulation distributions (Fig. S4 and Table S3). The meaning of this in tumor treatment terms is that heterogeneity homogenizes the benefit of drug combinations, because each additional drug has differential effects on each subpopulation.

[Fig. 2 graphic. Panels A (100% shp53) and B (55% shp53 + others) plot toxicity against efficacy, each on a 0–1 scale, with the nadir, compromise, and utopia points marked. Panel C shows the objective space and panel D the solution space (regimens 1–6 against the drugs So, Su, Dox, Etop, CDDP, MMC, CBL, CPT, 6-TG, TMZ, MTX, 5-FU, HU, VCR, Taxol, ActD, Rapa, SAHA, DAC, Rosco, NSC).]

Fig. 2. Intratumor heterogeneity linearizes the Pareto frontier and homogenizes drug combination efficacy. (A) Representative objective space showing the Pareto frontier and the tradeoff between toxicity and efficacy for a homogeneous population of murine Eμ-myc; p19Arf−/− lymphoma cells infected with shp53 hairpin. The plot was generated based on the optimization described in Fig. 1C, using efficacy data composed of single-drug efficacy for individual subpopulations (Fig. S1) and a symmetric toxicity profile for each drug. (B) Representative objective space showing the effects of a heterogeneous population containing a predominant shp53 subpopulation and 13 minor subpopulations on the Pareto frontier. (C and D) Representative objective and solution space (shown for the same population as in B), with a maximum toxicity constraint of 6, which is set based on the number of drugs present in commonly used combination regimens. In the context of the mathematical model, this refers to the value for the parameter a_max (SI Appendix). Compromise solution (colored green) refers to the point closest to the utopia based on an L1 norm distance metric.

In fact, most clinically used chemotherapeutic regimens consist of multiple drugs, such as ABVD (doxorubicin, vinblastine, bleomycin, and dacarbazine), Stanford V (vinblastine, doxorubicin, vincristine, bleomycin, mechlorethamine, etoposide, and prednisone), and BEACOPP (bleomycin, etoposide, doxorubicin, cyclophosphamide, vincristine, procarbazine, and prednisone) for Hodgkin lymphoma. Accordingly, we imposed an upper limit of six drugs in our subsequent analyses. The consequence of this limit for the same heterogeneous tumor in Fig. 2B is shown in Fig. 2C (the corresponding Pareto frontier) for direct comparison. Fig. 2D enumerates the drugs elicited by the algorithm for each of the regimens, from one drug to six drugs.

Monte Carlo Analyses of Drug Combinations in Diverse Heterogeneous Tumor Populations. The results above were obtained for single heterogeneous tumors possessing a specified subpopulation distribution. In practice, from a clinical tumor biopsy, one might gain data on the genetic variants present but not with explicit quantitative distribution proportions. To deal with this challenge, we performed Monte Carlo sampling of heterogeneous tumors from among the possible quantitative distributions for a given set of genetic variants and analyzed the resulting frequency distribution of component drugs in six-drug combinations. Overall, 10,000 heterogeneous populations were generated with each population sampled, based first on the number of subpopulations, followed by the subpopulation genetic variants, and subsequently the subpopulation proportions. Upon drug combination optimization, we obtain predictions of the six-drug combinations most likely to be beneficial for a heterogeneous tumor with the specified genetic variants, averaged over potential unknown quantitative proportions. These genetic variants are characterized by a set of up to 30 shRNA knockdowns, targeting genes in the DNA damage repair and apoptosis pathways (Fig. S1). The component drugs most frequently included in the optimal drug combination for this class of heterogeneous tumor were rapamycin, 5-fluorouracil (5-FU), vincristine, and sunitinib (Fig. 3A); we term these the "dominant" drugs. Although these drugs do not covary based on clustering analysis over all populations (Fig. S5), when we analyzed the frequency of drug inclusion as a function of tumor complexity (i.e., the number of subpopulations in the heterogeneous tumor), we observed a strong dependence for a select number of drugs: some positively and some negatively correlating with tumor complexity (Fig. 3 B and C). For instance, with increasing tumor complexity, we observed an increase in the frequencies of inclusion of the four drugs noted above but a decrease in the frequencies of some other drugs (such as NSC3852, temozolomide, roscovitine, cisplatin, mitomycin C, and camptothecin). In Fig. 3B, these relationships are illustrated with positive or negative trends, respectively, and Fig. 3C quantifies them in terms of Spearman correlation coefficient values. Furthermore, tumor complexity is a crucial characteristic, because this strong correlation is abrogated when we control for tumor complexity (by examining the subsequent partial correlation) (Fig. S6).

[Fig. 3 graphic. (A) Bar chart of drug frequency in six-drug combinations. (B) Scatter plots of frequency against number of subpopulations for each drug, annotated with Spearman correlations: So 0.21, Su 0.65, Dox 0.1, Etop 0.2, CDDP −0.73, MMC −0.66, CBL −0.42, CPT −0.64, 6-TG −0.43, TMZ −0.86, MTX 0.19, 5-FU 0.63, HU 0.14, VCR 0.77, Taxol 0.02, ActD 0.25, Rapa 0.79, SAHA 0.17, DAC −0.45, Rosco −0.77, NSC −0.87. (C) Heat map of those Spearman correlations by drug. (D) Heat map of Spearman correlations between regimens 1–6.]

Fig. 3. Optimal drugs dominate at higher tumor complexity. (A) Frequency distribution of component drugs in optimal drug combinations based on Monte Carlo sampling results, specifically for six-drug combinations (regimen 6) and optimized using efficacy data (Fig. S1) and a symmetric toxicity profile. Analyses shown in B and C also are based on the six-drug combination. (B) Frequency of component drugs as a function of tumor complexity (i.e., the number of subpopulations in the heterogeneous tumor). The trend for each scatter plot is quantified with Spearman correlation, the values of which are shown below the corresponding component drug title. (C) Heat map of the Spearman correlations (in B) between component drug frequency and tumor complexity. Together, B and C reveal that selecting a set of drugs (e.g., rapamycin, 5-FU, vincristine, sunitinib) strongly depends on tumor complexity. (D) Spearman correlation of component drug frequencies across different drug regimens in the Pareto optimal set. The high correlation illustrates that the frequency distribution of drugs for the six-drug combination (shown in A) is comparable to other regimens (with N-drug combinations).

To expand the generalizability of our insights further, we also analyzed combinations of numbers of drugs other than six; i.e., representing other regimens along a Pareto frontier. The
frequency distributions of the component drugs were found to be similar among the various N-drug combinations (annotated as "Regimen N"), as calculated in terms of the correlation between the drug frequency distribution in Regimen N vs. that in Regimen M (Fig. 3D and Fig. S8). In other words, the drugs more and less dominant in a regimen combining N drugs are similarly more and less dominant in a regimen combining M drugs. Thus, the behavior of drug dominance among a constellation of potential drugs as a function of heterogeneous tumor complexity is consistent across a broad range of multidrug regimens.

We next asked how the drug frequency distributions might change if we determine the optimal combinations based solely on the predominant tumor subpopulation instead of considering all the subpopulations present in each heterogeneous tumor. Using the same set of 10,000 theoretical tumors, we found only small changes in the overall distribution of drug frequencies within optimal combinations under this approach (Fig. 4A, in comparison with Fig. 3A). However, quite surprisingly, the correlation between drug dominance and tumor complexity is abrogated dramatically (Fig. 4B in comparison with Fig. 3B, and Fig. 4C in comparison with Fig. 3C). For example, the choice of whether to include drugs such as rapamycin, 5-FU, vincristine, and sunitinib no longer strongly depends on tumor complexity, but instead varies sensitively with the specific individual predominant subpopulation. As a consequence, the drug distributions found in optimal combinations across N-drug regimens may be very different from those for M-drug regimens (Fig. 4D and Fig. S9, in striking contrast to Fig. 3D and Fig. S8). In addition, changes in correlations are minimal when compared before and after controlling for tumor complexity, further suggesting that tumor complexity no longer is an important factor (Fig. S7). Hence, the optimal design of combination drugs based on only a predominant subpopulation (i.e., a tumor assumed to be homogeneous) potentially may exclude drugs that work effectively for highly heterogeneous tumors.

We therefore proceeded to analyze in detail the effects of these two alternative optimization perspectives (consideration of the entire heterogeneous tumor vs. just the predominant subpopulation) on the drug combinations predicted to be optimal for tumors whose complexity resides within the sampled 10,000 populations. We observed that there was a large proportion of tumors for which the optimal drug combinations were disparate between the two perspectives (Fig. 5A), with the fraction of optimal drug combinations for a given tumor found to be different when taking heterogeneity into account vs. not taking it into account, increasing from about 40% for one-drug regimens to almost 70% for six-drug regimens. Quantitatively, the difference in efficacy between the drug combinations optimized based on the two perspectives followed a roughly exponential profile (Fig. 5B). This means that cases for which the efficacy difference falls toward the tail of the profile will exhibit dramatic disparities in therapeutic outcome when only the predominant subpopulation is considered in determining the best drug.

Now, it is conceivable that a drug most frequently deemed worthy of inclusion within a regimen optimized for a predominant subpopulation might generally also be found frequently included within a regimen determined from considering the entire tumor complexity. Therefore, we probed this notion again using our 10,000 heterogeneous tumor sample, for each case asking whether the drug combination optimized based on the entire tumor complexity also contained the single-best drug for the predominant subpopulation. Notably, there were many regimens for which the drug combination optimized based on the entire heterogeneity did not contain the single-best drug for the predominant subpopulation (Fig. 5C). This proportion expectedly decreased with an increasing number of drugs in the combinations; for instance, for two- and three-drug regimens, 39% and 23% of the cases, respectively, did not include the drug that would have been best for the predominant subpopulation. We obtained similar conclusions when we performed the same analyses on drug combinations optimized based also on the second largest subpopulations (Fig. S10). To confirm the robustness of these conclusions, we performed the same analyses but decomposed the results with respect to the predominant and second largest subpopulation proportions (Fig. S11). We found that the proportion of regimens with different drug combinations, as well as those containing single-best drugs for predominant (and second largest) subpopulations, is robust near boundary cases across different predominant subpopulation proportions, and begins to decrease expectedly only when the predominant subpopulation becomes very large compared with the remaining subpopulations (Fig. S12). We also performed Monte Carlo sampling with more than 13,000 heterogeneous populations near the boundaries, with similar results (Fig. S13). This firmly demonstrates that selecting drug combinations for a heterogeneous tumor often may not follow from simple intuition, which seemingly would involve including the best drug for the predominant subpopulation.

Taken together, these analyses indicate that intratumor heterogeneity may influence drug combinations dramatically such that the chosen regimen may have strongly disparate outcomes based on whether we examine the entire heterogeneity or just the predominant subpopulation. This insight argues for the necessity of considering at least even qualitative features of heterogeneity in selecting combination regimens.

Sensitivity Analysis of Monte Carlo Sampling. Next we sought to examine in greater detail the specific dependencies of a particular drug choice on the tumor subpopulations present. To accomplish this, we performed a global sensitivity analysis by calculating the correlation between a given tumor subpopulation (i.e., shRNA knockdown) and the categorical outcome of whether a particular drug was chosen by the optimization algorithm to be included in the combination regimen for the

[Fig. 4 graphic. (A) Bar chart of drug frequency in six-drug combinations. (B) Scatter plots of frequency against number of subpopulations for each drug, annotated with Spearman correlations: So 0.01, Su −0.01, Dox 0.12, Etop 0.16, CDDP −0.02, MMC −0.01, CBL 0.09, CPT −0.09, 6-TG 0.08, TMZ 0.12, MTX 0.06, 5-FU 0.22, HU −0.17, VCR 0.11, Taxol 0.16, ActD 0.3, Rapa 0.22, SAHA 0.15, DAC 0.25, Rosco −0.09, NSC 0.22. (C) Heat map of those Spearman correlations by drug. (D) Heat map of Spearman correlations between regimens 1–6.]

Fig. 4. Drug optimization based on consideration of only the predominant subpopulation abrogates drug dominance at greater tumor complexity. Statistical analyses are formatted similar to those in Fig. 3. However, here the analyses are based on drug optimization by considering only the predominant subpopulation in each heterogeneous tumor population. (A) Frequency distribution of component drugs in optimal drug combinations, based on Monte Carlo sampling results, specifically for six-drug combinations. Analyses shown in B and C also are based on the six-drug combination. (B) Frequency of component drugs as a function of tumor complexity. The trend for each scatter plot is quantified with Spearman correlation, the values of which are shown below the corresponding component drug title. (C) Heat map of the Spearman correlations (in B) between component drug frequency and tumor complexity. Together, B and C reveal that in contrast to Fig. 3 B and C, the drugs no longer depend on tumor complexity when only the predominant subpopulation is considered in the drug optimization. (D) Frequency of component drugs in relation to other drug regimens in the Pareto optimal set. In contrast to Fig. 3D, the lower correlation values suggest that the drug frequencies are less consistent across the different regimens.
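The Spearman values quoted in the Fig. 3 and Fig. 4 captions are rank correlations between how often a drug appears in the optimal combination and the number of subpopulations in the tumor. As a minimal sketch of that statistic, assuming hypothetical toy frequencies rather than the study's data, and using a hand-written rank correlation in place of whatever statistical software the authors used:

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: number of subpopulations vs. how often a drug
# was included in the optimal six-drug combination.
complexity = [2, 4, 6, 8, 10, 12, 14]
freq_rising = [50, 80, 95, 140, 150, 210, 260]  # strictly increasing
freq_falling = [90, 70, 75, 40, 30, 10, 5]      # mostly decreasing

rho_rising = spearman(complexity, freq_rising)
rho_falling = spearman(complexity, freq_falling)
```

With these toy inputs, `rho_rising` is 1 (a perfectly monotonic trend) and `rho_falling` is about −0.96, which mirrors the positive and negative trends the captions describe for individual drugs.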

10,000 sampled tumors; this is termed a point-biserial correlation. In general, there were varying degrees of subpopulation dependency for the different drugs, negative as well as positive (Fig. 6A). In addition, the more frequently chosen drugs exhibit broader distributions of point-biserial correlation values than do the less frequently chosen drugs (Fig. 6B). Interestingly, the less frequently chosen drugs typically were characterized by a very tight range of dependencies, and were characterized most distinctly by outliers with respect to a particular tumor. In other words, the more frequently chosen drugs are relatively robust to the uncertainty concerning quantitative features of tumor heterogeneity, whereas the less frequently chosen drugs are problematically sensitive to precisely which subpopulations are present and in what quantitative proportions. This insight can be captured by the kurtosis of the distribution of the point-biserial correlation values, which was found to be strongly negatively correlated with the drug’s frequency (Fig. 6C). In addition to robustness, we also observed that, as expected, drug frequency was correlated with mean efficacy averaged over the entire set of 10,000 tumor samples. Taken together, all the results from the sensitivity analysis indicate that the most frequently chosen drugs for optimal drug combinations for heterogeneous tumors of uncertain subpopulation distribution are characterized by their mean and robustness in efficacy.

Fig. 5. Drug combinations optimized based on consideration of the entire heterogeneity may be nonintuitive. (A) Breakdown of the optimal drug combination regimens for 10,000 Monte Carlo-sampled heterogeneous tumor populations, showing proportions of solutions that are the same or different depending on whether the entire heterogeneity is considered vs. just the predominant subpopulation. (B) Distribution of the difference in efficacy between drug combinations optimized based on the entire heterogeneity vs. just the predominant subpopulation for the six-drug combination. (C) For each regimen in the Pareto optimal set, breakdown of solutions that are different based on the two optimization approaches, showing the proportions for which drug combinations optimized based on entire heterogeneity still contains the single best drug for the predominant subpopulation. (Panel axis labels: Frequency; No. of drugs in combination; Difference in efficacy. Legend entries: same drug combination; different drug combination optimized based on entire heterogeneity versus predominant subpop. only; contains single-best drug for predominant subpopulation; does not contain single-best drug for predominant subpopulation.)

Fig. 6. Sensitivity analysis reveals optimal drugs are most robust and, on average, most efficacious. (A) Point-biserial correlation (rpb) showing the dependence of component drug choice on subpopulation. The drugs are ordered according to their frequency in the optimal drug combination (Fig. 3A). (B) Distribution of point-biserial correlations for each component drug. (C) Correlation matrix of various drug characteristics. Component drug frequency is strongly positively correlated with mean efficacy and negatively correlated with kurtosis of rpb, a metric used here for robustness. (Panel axis labels: Point-biserial correlation (rpb); Spearman correlation. Subpopulation labels: shBmf, shCtrl, shATM, shPuma, shATX, shJNK1, shp38, shBok, shBad, shBPR, shp53, shChk2, shATR, shChk1, shDNAPK, shBax, shBak, shBim, shNoxa, shbclb, shMil1, shA1, shBid, shBclg, shBclw, shMule, shBclx, shHrk, shBik, shBNIP. Drug labels: Rapa, 5-FU, VCR, Su, ActD, So, SAHA, Etop, 6-TG, MTX, HU, Dox, Taxol, CBL, TMZ, DAC, CDDP, NSC, Rosco, MMC, CPT. Panel C variables: freq. of drugs, mean efficacy of drugs, mean rpb, range rpb, kurtosis rpb.)

Multiobjective Optimization Comprising Particular Drug Toxicity Along with Particular Drug Efficacy. Our multiobjective optimization model allows for the incorporation of multiple objectives in deriving optimal drug combinations, with whatever sophistication in objective might be desired. As such, we extended our analysis with greater nuance by recognizing that the toxicities differ (that is, are asymmetric) across systemic tissue types for each given drug and include this recognition in our objective functions. Accordingly, we incorporated actual dose-limiting toxicity information for each drug in the dataset (Fig. S15A). We observed that now a larger set of drug choices could be incorporated for each multidrug regimen along the Pareto frontier (Fig. S14) and that for myelosuppression and gastrointestinal effects, two of the most common side effects of the drugs in our dataset, the distribution in the number of overlaps between drugs in toxicity was effectively shifted to minimize such overlaps (Fig. S15 B and C and S16). Hence, a multiobjective optimization approach enables the incorporation of multiple drug characteristics in examining tradeoffs in the rational design of drug combinations in the context of intratumor heterogeneity.

Discussion

The substantial spatial and temporal intratumor heterogeneity that inevitably exists in cancer patients presents a fundamental challenge for the rational design of effective drug combinations. Here, we applied a multiobjective optimization approach, grounded in empirical experimental data. These data comprised quantitative effects of commonly used chemotherapeutics and targeted therapeutics on subpopulations of the murine Eμ-myc; p19Arf−/− lymphoma cell line generated via shRNA knockdowns of genes in the DNA damage repair and apoptosis pathways. The cell line was derived from the well-established Eμ-myc mouse model of human Burkitt lymphoma (30), and the existing p19Arf−/− potentially captures the early events in tumor evolution (31). Our computational analysis on these data generated predictions of optimal drug combinations for tumors in which the genetic heterogeneity is qualitatively characterized (in terms of the shRNA knockdowns) but quantitatively uncertain with respect to the various relative proportions in the tumor. An initial key result is that tumor heterogeneity can effectively homogenize drug efficacy, so the most effective drug combinations are those that best kill the broadest range of subpopulations. Notably, we found that optimizing drug combinations based on consideration of the entire tumor heterogeneity instead of just the predominant subpopulation may result in nonintuitive optimal drug combinations. As such, knowledge of the single “best agent” for each subpopulation does not promise that it will be an optimal choice for inclusion within the overall drug combination when the entire heterogeneous tumor is considered. We recently demonstrated successful validation of our optimization approach for selected tumor subpopulation distributions in both in vitro cell culture and in vivo mouse contexts (25).

The substantial tumor complexity that exists in patients and the detection limit of diagnostic tools undoubtedly cast uncertainty and incomplete information on the underlying tumor composition for each patient. However, our statistical sampling and sensitivity analyses offer guiding principles for the characteristics of effective drugs under tumor diversity. In particular, we discovered that a certain set of drugs dominated the solutions

for optimal drug combination at increasing tumor complexity, and such drugs were associated with greater average efficacy and robustness. This indicates that in the absence of complete knowledge of a tumor’s composition, we nonetheless can apply a de novo design of optimal drug combinations based on drugs’ average efficacy and robustness properties. Ultimately this requires experimental validation in a clinically relevant model. Although our dataset, acquired based on in vitro assays using clinically relevant genes and chemotherapies, is comprehensive with regard to knowledge of the efficacy of single agents in single subpopulations, such complete knowledge has yet to be realized in the clinical setting for experimental validation. Nevertheless, recent large-scale efforts, such as The Cancer Genome Atlas and the Cancer Cell Line Encyclopedia (32), are beginning to elucidate patient tumor composition and subpopulation responses. The accumulation of this type of information may provide opportunities to deconvolute and derive efficacy data for a single drug on individual subpopulations, allowing this methodology to be used in the derivation of drug combinations sampled based on a prior distribution of known genetic variants of a cancer type. This would enable the experimental validation of many clinical samples, with responses tracked longitudinally following treatment.

Additional considerations of drug–drug interactions, including nonlinear drug–drug efficacy (e.g., drug synergy or antagonism) and toxicity interactions, and tumor dynamics undoubtedly add to the complexity of drug combination optimization. Specific genetic alterations also may affect subpopulation interactions and mutation rates, with consequences on the evolutionary trajectories of the tumor, in the absence of drug selection. These additional rate parameters, generally used in stochastic and deterministic differential equation-based models of tumor evolutionary dynamics, would be particularly useful to consider in the design of sequential treatment strategies. Moreover, although our model optimized for overall efficacy by targeting all subpopulations, an optimal drug combination would require further understanding of how differential selective pressures imposed by drugs affect the potential for secondary resistance. This issue has been studied particularly in antibiotic resistance, in which strong selection by synergistic drug combinations actually may increase the risk of resistance (33, 34). Ultimately, just as multiobjective optimization approaches may be applied to derive small molecule structures with the desired polypharmacological profiles (35), a multiobjective optimization with considerations of these additional properties provides a potentially promising approach to optimizing drug combination in the context of tumor heterogeneity.

Methods

Full details of methods and mathematical models may be found in SI Appendix. Briefly, the efficacy dataset was derived previously by using an in vitro competition assay with murine Eμ-myc; p19Arf−/− lymphoma cells partially infected with a specific shRNA and treated with a broad range of chemotherapies (26, 29). The toxicity dataset was a binary matrix of known dose-limiting toxicities for each drug. The optimization was formulated as a multiobjective optimization model with the objective of maximizing efficacy while minimizing toxicity. The optimization was solved iteratively with linear programming using MATLAB 2012b (MathWorks) and CPLEX 12.5. Monte Carlo sampling was performed with simulation of 10,000 heterogeneous tumors drawn from a specified distribution of subpopulations, followed by drug combination optimization on each tumor composition. Statistical and sensitivity analyses were performed using standard packages in MATLAB. All MATLAB codes are publicly available at http://icbp.mit.edu/data-models.

ACKNOWLEDGMENTS. We thank Dane Wittrup and Leona Samson for their insightful comments. Funding was provided by Integrative Cancer Biology Program Grant U54-CA112967 (to M.T.H. and D.A.L.). B.Z. is supported by National Institutes of Health/National Institute of General Medical Sciences Interdepartmental Biotechnology Training Program 5T32GM008334.
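The Monte Carlo sampling and point-biserial sensitivity analysis summarized in Methods can be illustrated with a minimal Python sketch. It is not the authors' MATLAB/CPLEX implementation. The flat Dirichlet prior over subpopulation proportions, the additive and toxicity-free scoring rule (under which the optimum reduces to the top-k drugs by proportion-weighted efficacy), the toy efficacy matrix, and all function names are assumptions made purely for illustration.

```python
import random


def sample_tumor(n_subpops, rng):
    """Draw subpopulation proportions from a flat Dirichlet (assumed prior)."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in range(n_subpops)]
    total = sum(draws)
    return [d / total for d in draws]


def top_k_drugs(efficacy, proportions, k):
    """Pick the k drugs with the highest proportion-weighted efficacy.

    Assumes additive, non-interacting drug effects and ignores toxicity,
    so this only stands in for the paper's linear-programming model.
    """
    weighted = [sum(e * p for e, p in zip(row, proportions)) for row in efficacy]
    ranked = sorted(range(len(efficacy)), key=lambda d: weighted[d], reverse=True)
    return set(ranked[:k])


def point_biserial(binary, continuous):
    """Pearson correlation between a 0/1 variable and a continuous variable."""
    n = len(binary)
    mean_b = sum(binary) / n
    mean_c = sum(continuous) / n
    cov = sum((b - mean_b) * (c - mean_c) for b, c in zip(binary, continuous))
    var_b = sum((b - mean_b) ** 2 for b in binary)
    var_c = sum((c - mean_c) ** 2 for c in continuous)
    if var_b == 0 or var_c == 0:
        return 0.0  # correlation undefined (e.g. drug never chosen); report 0
    return cov / (var_b * var_c) ** 0.5


def sensitivity(efficacy, k, n_tumors=10_000, seed=0):
    """Return drug selection frequencies and the drug-by-subpopulation r_pb matrix."""
    rng = random.Random(seed)
    n_drugs, n_subpops = len(efficacy), len(efficacy[0])
    tumors = [sample_tumor(n_subpops, rng) for _ in range(n_tumors)]
    chosen = [top_k_drugs(efficacy, t, k) for t in tumors]
    freq = [sum(d in c for c in chosen) / n_tumors for d in range(n_drugs)]
    rpb = [
        [
            point_biserial([int(d in c) for c in chosen], [t[s] for t in tumors])
            for s in range(n_subpops)
        ]
        for d in range(n_drugs)
    ]
    return freq, rpb


# Hypothetical efficacy matrix (rows: drugs, columns: subpopulations).
# Drug 0 is moderately effective everywhere; drugs 1 and 2 are specialists.
efficacy = [
    [0.6, 0.6, 0.6],
    [0.9, 0.1, 0.1],
    [0.1, 0.9, 0.1],
]
freq, rpb = sensitivity(efficacy, k=1, n_tumors=2000)
print("selection frequency per drug:", freq)
print("r_pb per drug and subpopulation:", rpb)
```

In this toy setting the broadly effective drug is selected for most sampled tumors, while each specialist is selected only when its target subpopulation dominates, which mirrors the qualitative contrast between robust and subpopulation-sensitive drugs described in the sensitivity analysis.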

1. Nowell PC (1976) The clonal evolution of tumor cell populations. Science 194(4260):23–28.
2. Marusyk A, Almendro V, Polyak K (2012) Intra-tumour heterogeneity: A looking glass for cancer? Nat Rev Cancer 12(5):323–334.
3. Snuderl M, et al. (2011) Mosaic amplification of multiple receptor tyrosine kinase genes in glioblastoma. Cancer Cell 20(6):810–817.
4. Navin N, et al. (2011) Tumour evolution inferred by single-cell sequencing. Nature 472(7341):90–94.
5. Gerlinger M, et al. (2012) Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N Engl J Med 366(10):883–892.
6. Notta F, et al. (2011) Evolution of human BCR-ABL1 lymphoblastic leukaemia-initiating cells. Nature 469(7330):362–367.
7. Anderson K, et al. (2011) Genetic variegation of clonal architecture and propagating cells in leukaemia. Nature 469(7330):356–361.
8. Campbell PJ, et al. (2008) Subclonal phylogenetic structures in cancer revealed by ultra-deep sequencing. Proc Natl Acad Sci USA 105(35):13081–13086.
9. Ding L, et al. (2012) Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing. Nature 481(7382):506–510.
10. Keats JJ, et al. (2012) Clonal competition with alternating dominance in multiple myeloma. Blood 120(5):1067–1076.
11. Misale S, et al. (2012) Emergence of KRAS mutations and acquired resistance to anti-EGFR therapy in colorectal cancer. Nature 486(7404):532–536.
12. Mullighan CG, et al. (2008) Genomic analysis of the clonal origins of relapsed acute lymphoblastic leukemia. Science 322(5906):1377–1380.
13. Swan GW (1990) Role of optimal control theory in cancer chemotherapy. Math Biosci 101(2):237–284.
14. Shi J, Alagoz O, Erenay FS, Su Q (2011) A survey of optimization models on cancer chemotherapy treatment planning. Ann Oper Res, 10.1007/s10479-011-0869-4.
15. Martin RB (1992) Optimal control drug scheduling of cancer chemotherapy. Automatica 28(6):1113–1123.
16. Swan GW, Vincent TL (1977) Optimal control analysis in the chemotherapy of IgG multiple myeloma. Bull Math Biol 39(3):317–337.
17. Coldman AJ, Goldie JH (1983) A model for the resistance of tumor cells to cancer chemotherapeutic agents. Math Biosci 65(2):291–307.
18. Beckman RA, Schemmann GS, Yeang C-H (2012) Impact of genetic dynamics and single-cell heterogeneity on development of nonstandard personalized medicine strategies for cancer. Proc Natl Acad Sci USA 109(36):14586–14591.
19. Bozic I, et al. (2013) Evolutionary dynamics of cancer in response to targeted combination therapy. eLife 2:e00747.
20. Foo J, Michor F (2009) Evolution of resistance to targeted anti-cancer therapies during continuous and pulsed administration strategies. PLOS Comput Biol 5(11):e1000557.
21. Komarova, Wodarz D (2005) Drug resistance in cancer: principles of emergence and prevention. Proc Natl Acad Sci USA 102(27):9714–9719.
22. Das Thakur M, et al. (2013) Modelling vemurafenib resistance in melanoma reveals a strategy to forestall drug resistance. Nature 494(7436):251–255.
23. Chmielecki J, et al. (2011) Optimization of dosing for EGFR-mutant non-small cell lung cancer with evolutionary cancer modeling. Sci Transl Med 3(90):90ra59.
24. Radhakrishnan ML, Tidor B (2008) Optimal drug cocktail design: Methods for targeting molecular ensembles and insights from theoretical model systems. J Chem Inf Model 48(5):1055–1073.
25. Zhao B, Pritchard JR, Lauffenburger DA, Hemann MT (2014) Addressing genetic tumor heterogeneity through computationally predictive combination therapy. Cancer Discov 4(2):166–174.
26. Pritchard JR, et al. (2013) Defining principles of combination drug mechanisms of action. Proc Natl Acad Sci USA 110(2):E170–E179.
27. Swanton C (2012) Intratumor heterogeneity: Evolution through space and time. Cancer Res 72(19):4875–4882.
28. Su K-Y, et al. (2012) Pretreatment epidermal growth factor receptor (EGFR) T790M mutation predicts shorter EGFR tyrosine kinase inhibitor response duration in patients with non-small-cell lung cancer. J Clin Oncol 30(4):433–440.
29. Jiang H, Pritchard JR, Williams RT, Lauffenburger DA, Hemann MT (2011) A mammalian functional-genetic approach to characterizing cancer therapeutics. Nat Chem Biol 7(2):92–100.
30. Adams JM, et al. (1985) The c-myc oncogene driven by immunoglobulin enhancers induces lymphoid malignancy in transgenic mice. Nature 318(6046):533–538.
31. Schmitt CA, McCurrach ME, de Stanchina E, Wallace-Brodeur RR, Lowe SW (1999) INK4a/ARF mutations accelerate lymphomagenesis and promote chemoresistance by disabling p53. Genes Dev 13(20):2670–2677.
32. Barretina J, et al. (2012) The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483(7391):603–607.
33. Pena-Miller R, et al. (2013) When the most potent combination of antibiotics selects for the greatest bacterial load: The smile-frown transition. PLoS Biol 11(4):e1001540.
34. Hegreness M, Shoresh N, Damian D, Hartl D, Kishony R (2008) Accelerated evolution of resistance in multidrug environments. Proc Natl Acad Sci USA 105(37):13977–13981.
35. Besnard J, et al. (2012) Automated design of ligands to polypharmacological profiles. Nature 492(7428):215–220.

www.pnas.org/cgi/doi/10.1073/pnas.1323934111 | Zhao et al.