Spring 2020 – Systems Biology of Reproduction
Discussion Outline (Systems Biology)
Michael K. Skinner – Biol 475/575
Weeks 1 and 2 (January 23, 2020)

Systems Biology Primary Papers

1. Westerhoff & Palsson (2004) Nat Biotech 22:1249-1252
2. Joyner (2011) J Appl Physiol 111:335-342
3. Clarke, et al. (2019) Endocr Relat Cancer. 26(6):R345-368

Discussion

Student 1 - Ref #1 above
- How does this support evolutionary systems biology?
- What was the convergence discussed?
- Give an example that supports this perspective.

Student 2 - Ref #2 above
- What is the problem with reductionism?
- What is the void?
- What is the solution?

Student 3 - Ref #3 above
- What is multiscale modeling?
- What is the role of mathematical modeling?
- What network analysis insights into endocrine cancers are described?

HISTORICAL PERSPECTIVE

The evolution of molecular biology into systems biology

Hans V Westerhoff1 & Bernhard O Palsson2

1 Departments of Molecular Cell Physiology and Mathematical Biochemistry, BioCentrum Amsterdam, De Boelelaan 1085, NL-108, HV Amsterdam, the Netherlands.
2 Department of Bioengineering, University of California-San Diego, 9500 Gilman Drive, La Jolla, California 92093-0412, USA.
Correspondence should be addressed to H.V.W. ([email protected]) or B.O.P. ([email protected]).
Published online 6 October 2004; doi:10.1038/nbt1020
© 2004 Nature Publishing Group http://www.nature.com/naturebiotechnology

Systems analysis has historically been performed in many areas of biology, including ecology, developmental biology and immunology. More recently, the genomics revolution has catapulted molecular biology into the realm of systems biology. In unicellular organisms and well-defined cell lines of higher organisms, systems approaches are making definitive strides toward scientific understanding and biotechnological applications. We argue here that two distinct lines of inquiry in molecular biology have converged to form contemporary systems biology.

Whereas the foundations of systems biology-at-large are generally recognized as being as far apart as 19th century whole-organism embryology and network mathematics, there is a school of thought that systems biology of the living cell has its origin in the expansion of molecular biology to genome-wide analyses. From this perspective, the emergence of this ‘new’ field constitutes a ‘paradigm shift’ for molecular biology, which ironically has often focused on reductionist thinking. Systems thinking in molecular biology will likely be dominated by formal integrative analysis going forward rather than solely being driven by high-throughput technologies.

It is, however, incorrect to state that integrative thinking is new to molecular biology. The first molecular regulatory circuits were mapped out over 40 years ago. The feedback inhibition of amino acid biosynthetic pathways was discovered in 1957 (refs. 1,2), and the transcriptional regulation associated with the glucose-lactose diauxic shift led to the definition of the lac operon and the elucidation of its regulation (ref. 3). With the study of these regulatory mechanisms, admittedly on a small scale, molecular biologists began to apply systems approaches to unravel the molecular components and logic that underlie cellular processes, often in parallel with the characterization of individual macromolecules. High-throughput technologies have made the scale of such inquiries much larger, enabling us to view the genome as the ‘system’ to study. Thus, the popular contemporary view of systems biology may be synonymous with ‘genomic’ biology.

This article discusses two historical roots of systems biology in molecular biology (Fig. 1). Although we briefly outline the more familiar first root (which stemmed from fundamental discoveries about the nature of genetic material, structural characterization of macromolecules and later developments in recombinant and high-throughput technologies), more emphasis is placed on the second root, which sprung from nonequilibrium thermodynamics theory in the 1940s, the elucidation of biochemical pathways and feedback controls in unicellular organisms and the emerging recognition of networks in biology. We conclude by discussing how these two lines of work are now merging in contemporary systems biology.

Scaling-up molecular biology

In the decades following its foundational discoveries of the structure and information coding of DNA and protein, molecular biology blossomed as a field, with a series of breathtaking discoveries (Fig. 1). The description of restriction enzymes and cloning were major breakthroughs in the 1970s, ushering in the era of genetic engineering and biotechnology. In the 1980s, we began to see the scale-up of some of the fundamental experimental approaches of molecular biology. Automated DNA sequencers began to appear and reached genome-scale sequencing in the mid-1990s (refs. 4,5). Automation, miniaturization and multiplexing of various assays led to the generation of additional ‘omics’ data types (refs. 6,7).

The large volumes of data generated by these approaches led to rapid growth in the field of bioinformatics, again largely emanating from the reductionist perspective. Although this effort was mostly focused on statistical models and object classification approaches in the late 1990s, it was recognized that a more formal and mechanistic framework was needed to analyze multiple high-throughput data types systematically (refs. 8,9). This need led to efforts toward genome-scale model building to analyze the systems properties of cellular function.

Molecular self-organization

Even before the first key events in the history of molecular biology, several lines of reasoning revealed that integration of multiple molecular processes is fundamental to the living cell. Biochemical processes necessitate the production of entropy (chaos in the thermodynamic sense) as driving force. The paradox felt by many, but expressed by Schrödinger in his war-time lectures (ref. 10), was how one could explain the progressive ordering that occurs in developmental biology (that is, the ‘self-organization,’ decrease in chaos) when entropy (‘chaos’) must be increased.

The answer was that one process could produce order (negative entropy or negentropy) provided it was coupled to a second process that produced more chaos (entropy): coupling, another word for integration of processes, is therefore essential for life. Onsager (ref. 11) provided the basis for this concept by stressing the significance of the coupling of dissimilar processes. He is also relevant because he discovered a law for such systems of coupled processes: close to equilibrium the dependence of the one process rate on the driving force of the other process should equal the dependence of the other process rate on the

NATURE BIOTECHNOLOGY VOLUME 22 NUMBER 10 OCTOBER 2004

[Figure 1: two parallel timelines. Upper timeline (1944 to 2000): genetic material; DNA structure; recombinant DNA technology; automated sequencing; first genome sequenced (Haemophilus influenzae, 1995); human genome sequenced (2001); high-throughput, 'data rich' biology at genome scale. Lower timeline (1931 to 2000): non-equilibrium thermodynamics; self-organization; dissipative structures, energy coupling; feedback regulation in metabolism; analog simulation, bioenergetics, lac operon; MCA and BST; large-scale simulators; ‘data poor’ in silico biology, models of viruses, red blood cell; genome-scale models and analysis, large-scale kinetic models. Central label: systems analysis critical to molecular biology.]

Figure 1 Two lines of inquiry led from the approximate onset of molecular biological thinking to present-day systems biology. The top timeline represents the root of systems biology in mainstream molecular biology, with its emphasis on individual macromolecules. Scaled-up versions of this effort then induced systems biology as a way to look at all those molecules simultaneously, and consider their interactions. The lower timeline represents the lesser-known effort that constantly focused on the formal analysis of new functional states that arise when multiple molecules interact simultaneously.

former driving force. Caplan, Essig and Rottenberg (ref. 12) later defined a coupling coefficient, which quantifies the extent to which two processes are coupled in a system and showed that this coefficient must range between 0 and 1.

These approaches were called nonequilibrium thermodynamics and constituted a prelude to systems biology at the cell and molecular levels in that they (i) dealt with integration quantitatively and (ii) aimed to discover general principles rather than just being descriptive. An improved procedure for describing ion movement and energy transduction in biological membranes, termed mosaic nonequilibrium thermodynamics, further progressed towards systems thinking in that it (iii) established a connection to molecular mechanisms and (iv) enabled the determination of the stoichiometry of membrane energy transduction from system data (ref. 13). Peter Mitchell’s chemiosmotic coupling principle (ref. 14) was another early case of systems analysis in cell and molecular biology. It stated that ATP synthesis was coupled in quite an indirect way to respiration, involving an entire intracellular system, including a volume surrounded by an ion-impermeable membrane and proton movement across it. Indeed, for eukaryotes, this provided much of the rationale for the organization of the mitochondrion. In his calculations verifying that the proposed chemiosmotic mechanisms transferred sufficient free energy to empower ATP synthesis, Mitchell demonstrated the sort of quantitative thinking that would eventually prove crucial to the study of biochemical systems (ref. 14).

The problem of biological self-organization was to understand how structures, oscillations or waves arise in a steady and homogenous environment, a phenomenon called symmetry breaking. Turing (ref. 16) led the way, but the Prigogine school (ref. 17) and others developed the topic from the perspective of nonequilibrium thermodynamics in molecular contexts such as biochemical reactions involved in sugar metabolism (glycolysis). They demonstrated how having a sufficient number of nonlinearly interacting chemical processes in a single system such as the Zhabotinski reaction, a developing tissue, or glycolysis, could lead to symmetry-breaking as a result of self-amplification of random fluctuations. Of course, more recent molecular developmental biology studies have shown that reality is even more complicated; pre-specification by external (maternally specified) gradients of morphogens may substitute for the random fluctuations, increasing the robustness of development (ref. 18). Perhaps more importantly, Prigogine searched for and found a law (on minimum entropy production). Although it is strictly valid only in Onsager’s near-equilibrium domain, it testified to the systems scientists’ quest for the principles underlying systems, rather than just for their appearances.

Early on, oscillations in yeast glycolysis were the experimental systems of choice. Although intact cells were studied (ref. 19), more often measurements were made using cell extracts (ref. 20). Reductionist biochemical thinking proclaimed that a single pacemaker enzyme should be responsible for the oscillations. Only relatively recently has systems-based analysis in one of our laboratories (H.V.W.) been used to reveal that the oscillations are simultaneously controlled by many steps in the intracellular network (ref. 21) and how the oscillations in the individual cells synchronize actively (ref. 22). Of course, with the more recent experimental capability to inspect single cells dynamically, more and more cells are


seen to exhibit asynchronous oscillations of all sorts and some of these cases are up for systems biology analysis. Slime mold aggregation was another early case where a network of reactions was shown to be essential for systems biology reaching one step beyond cell biology, again by combining mathematical modeling with experimental molecular information (ref. 23).

Building large-scale models

Following the events of the late 1950s and early 1960s, researchers undertook efforts that were not well publicized and formulated mathematical models to simulate the functions of newly discovered regulatory circuits in cells. Even before digital computers became available, simulations of integrated molecular functions were performed on analog computers (ref. 24). These efforts grew in scale to dynamic simulation of large metabolic networks in the 1970s (refs. 25–27). Following the pathway-centered kinetic models in the seventies (ref. 28), cell-scale flux models of the human red cell were published by the late 1980s (ref. 29), and by the early 1990s genome-scale models of viruses and large-scale models of mitosis were formulated (ref. 30). With the advent of genome-scale sequencing, the first genome-scale, constraint-based metabolic models for bacteria were constructed (ref. 31). These models describe reconstructed networks and their possible functional states (phenotypes) and are now available at the genome-scale for a growing number of organisms. They treat the ‘genome’ as the ‘system.’

Progress toward the development of detailed kinetic models at a large scale has proven to be slower. Some of these models approach computer replicas of pathways of metabolism, signal transduction and gene expression, and are active on the web, ready for experimentation (compare http://www.siliconcell.net/). Obtaining in vivo numerical values for kinetic constants remains a key challenge.

Metabolic control analysis

We have argued that contemporary systems biology has an historical root outside mainstream molecular biology, ranging from basic principles of self-organization in nonequilibrium thermodynamics, through large-scale flux and kinetic models to ‘genetic circuit’ thinking in molecular biology. ‘Systems thinking’ differs from ‘component thinking’ and requires the development of new conceptual frameworks.

Metabolic control analysis (MCA), developed in the early seventies (refs. 28,32), presented a key example of approaches to characterize properties of networks of interacting chemical reactions. At this time, thinking in biochemistry was dominated by the concept that there had to be a single ‘rate-limiting’ step at the beginning of all metabolic pathways. Criteria used to establish whether a given enzyme was rate-limiting referred to it as being far from equilibrium, strongly regulated by various metabolic factors or causing pathway flux to decrease when inhibited.

However, the application of these criteria to some metabolic pathways suggested that they contained more than a single rate-limiting step. Network thinking through MCA helped to resolve this paradox. First, mathematical models of metabolic pathways were developed both for inspiration and discovery, and subsequently used to check numerically the principles they conjectured (refs. 28,32). Second, quantitative definitions were developed to describe the extent to which a step limited the flux through a pathway. This ‘flux-control coefficient’ of a particular step corresponded to the sensitivity coefficient of the pathway flux with respect to the activity of the particular enzyme. Third, these investigators looked for proof of the concept that there should be a single rate-limiting enzyme in a pathway that should have a flux-control coefficient of unity, with all others having flux control coefficients of zero. Instead, they found a theorem stating that all the flux-control coefficients must sum to unity (refs. 28,32). This result then suggested that there need not be a single rate-limiting step to a pathway and that instead many enzymes can contribute simultaneously to the control of the network. Thus, control was not a component property but a network property. The network nature of regulation was shown experimentally to be the case for mitochondrial ATP generation, where control was indeed distributed over more than three steps, and quite notably not particularly strong, neither for the first nor for the irreversible step of the pathway (ref. 33).

An important aspect of systems biology is to relate the system properties to the molecular properties of components that comprise a network. The kinetics-based sensitivity analysis by MCA, and its close relative, biochemical systems theory proposed by Savageau (ref. 34), showed that by focusing on the properties of an individual component, one cannot properly decipher its role in the context of a whole network. The connectivity laws proven by MCA (refs. 28,34; see other references in ref. 35) pinpointed how that distribution of control relates to network structure and the kinetic properties of all network components simultaneously. Similarly, the topological analyses of network structure by our groups (refs. 31,36) have revealed the existence of network-based definitions of pathways that can be used mathematically to represent all possible functional states of reconstructed networks (ref. 37). Thus, a growing number of methods now exist to analyze the properties mathematically of the large-scale networks that we are now able to reconstruct based on high-throughput data.

Convergence and integration

Figure 1 presents our interpretation of the history of systems analysis in cell and molecular biology. Events in the upper timeline have been much more to the fore of scientific thinking than those in the lower timeline. In one sense, the dazzling stream of discoveries and exciting technologies (most recently with genome-wide data) provides the ‘biology’ root to contemporary systems biology. In contrast, scientific progress in the lower timeline has never gained much notoriety, although work in this area was much more prominent in European science throughout this period. This latter branch might be thought of as the ‘systems’ root of systems biology.

Systems modeling and simulation in molecular biology was once seen as purely theoretical and not particularly relevant to understanding ‘real’ biology. However, now that molecular biology has become such a data-rich field, the need for theory, model building and simulation has emerged. The systems-directed root always had the ambition of discovering fundamental principles and laws, such as those of nonequilibrium thermodynamics and MCA. This ambition should now extend to systems biology.

All too often, the field has been perceived as just pattern recognition and phenomenological modeling. Systems biology is a thorough science with its own quest for scientific principles at the interface of physics, chemistry and biology, with its remarkable mixture of functionality, hysteresis, optimization and physical chemical limitations. In silico analysis of complex cellular processes (whether for data description, genetic engineering or scientific discovery), with its focus on elucidating system mechanisms, has in fact become critical for progress in biology.

The historical dichotomy in approaches to molecular biology must now be reconciled with the need to corral resources and expertise in systems approaches. Although the reductionist molecular biological root has been the focus of a plethora of investigations, literature sources and curricula, the same is not true for the systems molecular biology root. There is now a need for development of theoretical and


analytical approaches, curricula and educational materials to advance understanding of the systems in cell and molecular biology. Unknown to many, the ‘pre-online PDF’ era contains answers to many of the current challenges and pitfalls facing the field. So although systems biology has an intellectually exciting future ahead of it, the leaders in the field should try to minimize rediscovery and focus on the newer challenges facing us, particularly those that come with the application of existing concepts to genome-scale problems and identification of the new issues that arise from the study of cellular functions on this scale.

Where has this history brought us? We now have the growing and general recognition that systems analysis is important to the future evolution of cell and molecular biology. Some reeducation of workers in the field may be in order (http://www.systembiology.net/). Over the near term, it is likely that successes with practical applications of systems biology will be confined to unicellular systems. We are now seeing successful applications of systems biology to microbes, including pathway engineering (e.g., see our recent publications, refs. 37,38), network-based drug design (e.g., H.V.W. and colleagues, ref. 39), and prediction of the outcome of complex biological processes, such as adaptive evolution (B.O.P. and colleagues, ref. 40). Although the mathematical modeling of whole-body human systems cannot yet be linked to genome-wide data and models, data analysis and modeling are likely to contribute to the success of realizing the goal of individualized medicine. Even if we have to rely on less precise models than the currently available genome-scale models of microorganisms, systems biology may soon lead to better diagnosis and dynamic therapies of human disease than the qualitative methodology presently in use.

ACKNOWLEDGMENTS
We thank Adam Arkin for comments and Timothy Allen for editing. B.O.P. serves on the scientific advisory board of Genomatica, Inc.

COMPETING INTERESTS STATEMENT
The authors declare that they have no competing financial interests.

Published online at http://www.nature.com/naturebiotechnology/

1. Umbarger, H.E. & Brown, B. Threonine deamination in Escherichia coli. II. Evidence for two L-threonine deaminases. J. Bacteriol. 73, 105–12 (1957).
2. Yates, R.A. & Pardee, A.B. Control by uracil of formation of enzymes required for orotate synthesis. J. Biol. Chem. 227, 677–692 (1957).
3. Beckwith, J.R. Regulation of the lac operon. Recent studies on the regulation of lactose metabolism in Escherichia coli support the operon model. Science 156, 597–604 (1967).
4. Hunkapiller, T. et al. Large-scale and automated DNA sequence determination. Science 254, 59–67 (1991).
5. Rowen, L., Magharias, G. & Hood, L. Sequencing the human genome. Science 278, 605–607 (1997).
6. Scherf, M., Klingenhoff, A. & Werner, T. Highly specific localization of promoter regions in large genomic sequences by PromoterInspector: a novel context analysis approach. J. Mol. Biol. 297, 599–606 (2000).
7. Uetz, P. et al. A comprehensive analysis of protein–protein interactions in Saccharomyces cerevisiae. Nature 403, 623–627 (2000).
8. Ge, H., Walhout, A.J. & Vidal, M. Integrating ‘omic’ information: a bridge between genomics and systems biology. Trends Genet. 19, 551–560 (2003).
9. Palsson, B.O. In silico biology through ‘omics’. Nat. Biotechnol. 20, 649–650 (2002).
10. Schrödinger, E. What is life? The physical aspects of the living cell. Based on Lectures Delivered under the Auspices of the Dublin Institute for Advanced Studies at Trinity College, Dublin, in February 1943. (Cambridge University Press, Cambridge, UK, 1944). http://home.att.net/∼p.caimi/oremia.html
11. Onsager, L. Reciprocal relations in irreversible processes. Phys. Rev. 37, 405–426 (1931).
12. Rottenberg, H., Caplan, S.R. & Essig, A. Stoichiometry and coupling: theories of oxidative phosphorylation. Nature 216, 610–611 (1967).
13. Westerhoff, H.V. & Van Dam, K. Thermodynamics and Control of Biological Free-Energy Transduction (Elsevier, Amsterdam, 1987).
14. Mitchell, P. Coupling of phosphorylation to electron and hydrogen transfer by a chemiosmotic type of mechanism. Nature 191, 144–148 (1961).
15. Mitchell, P. Chemiosmotic Coupling in Oxidative and Photosynthetic Phosphorylation. (Glynn Research Ltd., Bodmin, UK, 1966).
16. Turing, A. The chemical basis of morphogenesis. Phil. Trans. Roy. Soc. London, Ser. B 237, 37–72 (1952).
17. Glansdorff, P. & Prigogine, I. Structure, Stabilité et Fluctuations (Masson, Paris, 1971).
18. Lawrence, P.A. The Making of a Fly (Blackwell, London, 1992).
19. Chance, B., Estabrook, R.W. & Ghosh, A. Damped sinusoidal oscillations of cytoplasmic reduced pyridine nucleotide in yeast cells. Proc. Natl. Acad. Sci. USA 51, 1244–1251 (1964).
20. Hess, B. & Boiteux, A. Oscillatory phenomena in biochemistry. Annu. Rev. Biochem. 40, 237–258 (1971).
21. Teusink, B., Bakker, B.M. & Westerhoff, H.V. Control of frequency and amplitudes is shared by all enzymes in three models for yeast glycolytic oscillations. Biochim. Biophys. Acta. 1275, 204–212 (1996).
22. Wolf, J. et al. Transduction of intracellular and intercellular dynamics in yeast glycolytic oscillations. Biophys. J. 78, 1145–1153 (2000).
23. Tyson, J.J. & Murray, J.D. Cyclic AMP waves during aggregation of Dictyostelium amoebae. Development 106, 421–426 (1989).
24. Goodwin, B.C. Oscillatory Organization in Cells, a Dynamic Theory of Cellular Control Processes (Academic Press, New York, 1963).
25. Garfinkel, D. et al. Computer applications to biochemical kinetics. Annu. Rev. Biochem. 39, 473–498 (1970).
26. Loomis, W. & Thomas, S. Kinetic analysis of biochemical differentiation in Dictyostelium discoideum. J. Biol. Chem. 251, 6252–6258 (1976).
27. Wright, B.E. The use of kinetic models to analyze differentiation. Behavioral Sci. 15, 37–45 (1970).
28. Heinrich, R., Rapoport, S.M. & Rapoport, T.A. Progr. Biophys. Mol. Biol. 32, 1–83 (1977).
29. Joshi, A. & Palsson, B.O. Metabolic dynamics in the human red cell. Part I—A comprehensive kinetic model. J. Theor. Biol. 141, 515–528 (1989).
30. Novak, B. & Tyson, J.J. Quantitative analysis of a molecular model of mitotic control in fission yeast. J. Theor. Biol. 173, 283–305 (1995).
31. Edwards, J.S. & Palsson, B.O. Systems properties of the Haemophilus influenzae Rd metabolic genotype. J. Biol. Chem. 274, 17410–17416 (1999).
32. Kacser, H. & Burns, J.A. In Rate Control of Biological Processes (ed., Davies, D.D.) 65–104 (Cambridge University Press, Cambridge, 1973).
33. Groen, A.K., Wanders, R.J.A., Van Roermund, C., Westerhoff, H.V. & Tager, J.M. Quantification of the contribution of various steps to the control of mitochondrial respiration. J. Biol. Chem. 257, 2754–2757 (1982).
34. Savageau, M.A. Biochemical Systems Analysis (Addison-Wesley, Reading, MA, 1976).
35. Westerhoff, H.V. & Chen, Y. How do enzyme activities control metabolite concentrations? An additional theorem in the theory of metabolic control. Eur. J. Biochem. 142, 425–430 (1984).
36. Westerhoff, H.V., Hofmeyr, J.H. & Kholodenko, B.N. Getting to the inside of cells using metabolic control analysis. Biophys. Chem. 50, 273–283 (1994).
37. Papin, J.A., Price, N.D., Wiback, S.J., Fell, D.A. & Palsson, B.O. Metabolic pathways in the post-genome era. Trends Biochem. Sci. 28, 250–258 (2003).
38. Kholodenko, B.N. & Westerhoff, H.V. (eds.) Metabolic Engineering in the Post Genomics Era (Horizon Bioscience, UK, 2004).
39. Bakker, B.M. et al. Network-based selectivity of antiparasitic inhibitors. Mol. Biol. Rep. 29, 1–5 (2002).
40. Ibarra, R.U., Edwards, J.S. & Palsson, B.O. Escherichia coli K-12 undergoes adaptive evolution to achieve in silico predicted optimal growth. Nature 420, 186–189 (2002).
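
Discussion aid (not part of the article): the summation theorem described in the metabolic control analysis section of the Westerhoff & Palsson paper, which states that the flux-control coefficients of a pathway sum to unity, can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: the two-step pathway S -> X -> P, the reversible mass-action rate laws, the function names and every parameter value are hypothetical choices made here, not taken from the paper.

```python
# Toy two-step pathway: S -> X -> P, with S and P held fixed.
# All rate laws and parameter values are illustrative assumptions.

def steady_state_flux(e1, e2, S=10.0, P=1.0, k1=1.0, K1=2.0, k2=1.0, K2=5.0):
    """Return the steady-state flux J through S -> X -> P.

    Reversible mass-action rates scaled by enzyme activities e1 and e2:
        v1 = e1 * k1 * (S - X / K1)
        v2 = e2 * k2 * (X - P / K2)
    Setting v1 == v2 gives a linear equation for the intermediate X.
    """
    X = (e1 * k1 * S + e2 * k2 * P / K2) / (e1 * k1 / K1 + e2 * k2)
    return e2 * k2 * (X - P / K2)


def flux_control_coefficients(e1=1.0, e2=1.0, h=1e-6):
    """Estimate C_i = (e_i / J) * dJ/de_i by central finite differences."""
    J = steady_state_flux(e1, e2)
    dJ1 = steady_state_flux(e1 * (1 + h), e2) - steady_state_flux(e1 * (1 - h), e2)
    dJ2 = steady_state_flux(e1, e2 * (1 + h)) - steady_state_flux(e1, e2 * (1 - h))
    return dJ1 / (2 * h * J), dJ2 / (2 * h * J)


c1, c2 = flux_control_coefficients()
print(f"C1 = {c1:.3f}, C2 = {c2:.3f}, sum = {c1 + c2:.3f}")
# prints: C1 = 0.667, C2 = 0.333, sum = 1.000
```

With these assumed parameters neither step has a coefficient of 1, so neither is "the" rate-limiting step; control is shared between the two enzymes, which is the network-level point the article makes.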

J Appl Physiol 111: 335–342, 2011. First published June 2, 2011; doi:10.1152/japplphysiol.00565.2011. Review

Edward F. Adolph Distinguished Lectureship

Giant sucking sound: can physiology fill the intellectual void left by the reductionists?

Michael J. Joyner
Department of Anesthesiology, Mayo Clinic, Rochester, Minnesota
Submitted 3 May 2011; accepted in final form 31 May 2011

Joyner MJ. Giant sucking sound: can physiology fill the intellectual void left by the reductionists? J Appl Physiol 111: 335–342, 2011. First published June 2, 2011; doi:10.1152/japplphysiol.00565.2011.

Molecular reductionism has so far failed to deliver the broad-based therapeutic insights that were initially hoped for. This form of reductionism is now being replaced by so-called “systems biology.” This is a nebulously defined approach and/or discipline, with some versions of it relying excessively on hypothesis-neutral approaches and only minimally informed by key physiological concepts such as homeostasis and regulation. In this context, physiology is uniquely positioned to continue to provide impressive levels of both biological and therapeutic insight by using hypothesis-driven “classical” approaches and concepts to help frame what might be described as the “pieces of the puzzle” that emerge from molecular reductionism. The strength of physiology as a “bridge” between reductionism and epidemiology, along with its unparalleled ability to generate therapeutic insights and opportunities justifies increased attention and emphasis on our discipline into the future. Arguments relevant to this set of assertions are advanced and this paper, which was based on the 2011 Adolph Lecture, represents an effort to fill the intellectual void left by reductionism and improve scientific progress.

Keywords: homeostasis; regulation; integrative

Address for reprint requests and other correspondence: M. J. Joyner, 200 First St. SW, Dept. of Anesthesiology, Mayo Clinic, Rochester, MN 55905 (e-mail: [email protected]).
http://www.jap.org 8750-7587/11 Copyright © 2011 the American Physiological Society

THIS PAPER REFLECTS IDEAS that were presented as part of the 2011 Adolph Lecture at the Experimental Biology meeting that was held in Washington, DC. The goal of the talk was to share a physiologist’s perspective on what reductionism in general and the “omic” revolution in particular has or has not done for biomedical research and associated therapeutic insights or advances. The main ideas highlighted in the lecture were the following.

1) Reductionism via various flavors of molecular biology and “omics” has so far failed to deliver its self-promoted revolution in clinical medicine.
2) Systems biology has a cell-centric focus that is marked by a limited understanding of and application to biology beyond the cell.
3) The failure of systems biology to recognize and use key concepts from physiology about homeostasis, regulation, redundancy, feedback control, and acclimation/adaptation is a major limitation of this poorly defined approach.
4) While all the attention has been focused on reductionism and more recently systems biology, physiology continues to provide important biomedical insights that lead to therapeutic advances.

As the title demonstrates, my goal in the Adolph Lecture and in this paper was and is to be intentionally provocative and hopefully generate a dialogue with the reductionists. In this context, and because I am “taking sides”, I have adopted what might be called a conversational approach to this paper.

BIOLOGICAL ORTHOPEDIC SURGERY

A key idea or theme that seems to underpin the impetus for reductionism and various flavors of “omics” as applied to biomedical problems might be described as biological orthopedic surgery: “the gene is broken → fix the broken gene → cure the patient.” This thinking clearly seems to explain the enthusiasm about gene therapy that emerged after the discovery of the genetic defect responsible for the most common form of cystic fibrosis and more recently ideas about a limited number of common gene variants explaining the risk for common conditions like atherosclerosis and diabetes (10–12, 43, 51). The line of thinking described above flows from what Denis Noble has critically termed “Neo-Darwinian” thinking about the relationship between genes and phenotype (45, 46). It is exemplified by two quotes, the first from 1989 and second from Francis Collins (the current director of NIH), one of the people involved in the cystic fibrosis gene discovery.

The implications of this research are profound; there will be large spin offs in basic biology, especially cell physiology, but the largest impact will be biomedical (51).

PHYSIOLOGY AS AN ANTIDOTE TO REDUCTIONISM

“Here we are in 1997, eight years later, and the management of her disease has not changed. . . . But I will predict that in the course of the next 10 years management of CF will change. . . . The healthy form of the gene itself may even be used in so-called gene therapy (12).”

What is interesting to note is that while gene therapy for cystic fibrosis has failed to materialize in the 20+ years since the gene defect was identified, there are traditional ion channel-based drugs that target the CFTR protein in clinical trials that show promise in cystic fibrosis (18, 66). At one level, the development of these drugs was likely facilitated by the genetic discoveries because they permitted the development of models that advanced the understanding of the biophysics and ultimately pharmacology of the defective channel. However, one is tempted to speculate, for cystic fibrosis and perhaps other diseases, that much faster therapeutic progress might have been made if traditional physiological and pharmacological approaches had been a bigger area of focus. Perhaps the optimism and drive for gene therapy was an example of what might be termed “silver bullet” thinking that I will discuss below.

REDUCTIONISM IS SEDUCTIVE

The type of reductionism that I have termed “biological orthopedic surgery” has a number of attractive features and is at some level very seductive. It is easy to understand, and when it delivers it is associated with a heroic narrative by a lone scientist or team of scientists making a fundamental discovery that solves a problem. This is the sort of silver bullet thinking mentioned above. However, it has been known for some time that both the easy to understand elements and heroic narratives associated with reductionism are mirages. In this context, when the factors that contribute to biomedical breakthroughs were subjected to analysis by Comroe and Dripps (13) in the late 1960s and early 1970s via the “retrospectoscope,” the breakthroughs proved to be more nuanced, incremental, and serendipitous than the heroic narrative of reductionism suggests.

HEMOGLOBIN IS A SHIFTY MOLECULE

Homeostasis, the ability to regulate key bodily functions within a narrow range in response to either internal (e.g., exercise) or external (e.g., harsh environmental conditions) challenges, is one of the fundamental (perhaps the fundamental) concepts in physiology (7). Homeostasis is also subserved by ideas about regulated systems, feedback control, redundant control mechanisms, and adaptation and acclimation over time. These physiological concepts and mechanisms contribute to what might be described as emergent properties, so that the behavior of the system is far more complex (and likely more robust) than might be predicted on the basis of a single reductionist property (35).

A good, and early, example of this concept comes from the textbook description of the right shift in the oxygen-hemoglobin dissociation curve that occurs at high altitude or during other forms of hypoxia. The standard teaching is that under these conditions there is a rise in 2,3-DPG that allosterically modifies the oxygen-hemoglobin dissociation curve and creates a right shift that facilitates the unloading of oxygen at the tissues. However, when measurements of the oxygen-hemoglobin dissociation curve are made in humans who have traveled to high altitude (Fig. 1), under many circumstances there is in fact a net left shift in the oxygen-hemoglobin dissociation curve. This left shift is facilitated by the rise in pH and fall in CO2 caused by the hyperventilation driven by systemic hypoxia. Additionally, under some circumstances, it is driven further leftward by a fall in body temperature (68). Furthermore, it is of interest to note that all genetically adapted high altitude animals and the human fetus in the hypoxic intrauterine environment also have left shifted oxygen-hemoglobin dissociation curves, some with P50 values in the teens.

These observations make it seem likely that the main adaptive strategy is to shift the oxygen-hemoglobin dissociation curve to the left to facilitate the “loading” of oxygen at the lung in conditions (altitude) where oxygen availability is limited. This strategy also takes advantage of the fact that the mitochondria in the tissues can work efficiently at very low PO2 values (and that under specific needs such as muscular exercise in hypoxia, local increases in [H+] and temperature will reduce the leftward shift in muscle capillaries so that “unloading” of oxygen and tissue O2 levels can be facilitated). It is also of note that the left shift in the oxygen-hemoglobin dissociation curve has been “known” since at least the 1920s. Along these lines, the sequencing of hemoglobin and the understanding of its biophysical properties was one of the earliest triumphs of what has come to be described as molecular biology (55). However, when the interpretation of such discoveries is too narrow, key physiological insights can be missed. The 2,3-DPG story is also an excellent and early example of how physiology trumps reductionist molecular biology as multiple systems and regulatory strategies interact to regulate homeostasis for the whole organism.

Fig. 1. Oxygen-hemoglobin dissociation curve demonstrating a left shift among sojourners (Œ) to high altitude and natives. The left shift in the oxygen-hemoglobin dissociation curve under these circumstances demonstrates that the combined effects of hypocapnia, increased pH, and cold override the simple effects of 2,3-DPG on the oxygen-hemoglobin dissociation curve. These data are an outstanding example of the limits of single mechanism reductionism. They are also consistent with the left shift seen in many genetically adapted animals that are native to high altitude. [Reprinted from Ref. 68, with permission from Elsevier.]
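The loading argument for a left-shifted curve can be made concrete with a small numerical sketch. The paper gives no equation for the dissociation curve; the Hill equation used here is a common textbook approximation, and the Hill coefficient (2.7), the standard P50 (26.6 mmHg), the left-shifted P50 (19 mmHg), and the lung PO2 (35 mmHg) are illustrative assumptions rather than values from the text or from Fig. 1.

```python
def hill_saturation(po2_mmhg: float, p50_mmhg: float, n: float = 2.7) -> float:
    """Fractional O2 saturation of hemoglobin from the Hill equation.

    p50_mmhg is the PO2 at which hemoglobin is 50% saturated; a lower
    P50 corresponds to a left-shifted dissociation curve.
    """
    ratio = (po2_mmhg / p50_mmhg) ** n
    return ratio / (1.0 + ratio)


# Assumed, illustrative values (not taken from the paper):
STANDARD_P50 = 26.6      # mmHg, common textbook value for adult blood
LEFT_SHIFTED_P50 = 19.0  # mmHg, an example of a P50 "in the teens"
LUNG_PO2_HYPOXIA = 35.0  # mmHg, hypothetical PO2 at the lung at high altitude

for label, p50 in [("standard", STANDARD_P50), ("left-shifted", LEFT_SHIFTED_P50)]:
    saturation = hill_saturation(LUNG_PO2_HYPOXIA, p50)
    print(f"{label:>12} curve (P50 = {p50} mmHg): saturation = {saturation:.2f}")
```

Under these assumptions the left-shifted curve reaches a higher saturation at the same low lung PO2, which is the “loading” advantage described in the text. The sketch models a single curve parameter only and ignores the interacting effects of pH, CO2, temperature, and 2,3-DPG that the paper emphasizes.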

J Appl Physiol • VOL 111 • AUGUST 2011 • www.jap.org Review

PREDICTIVE POWERS OF GENES?

In addition to gene therapy and other molecular treatments for rare diseases, reductionism also made promises about its ability to provide insight about who gets what complex disease like atherosclerosis, diabetes, hypertension, etc. As the quote below demonstrates, this idea became extremely popular after the sequencing of the human genome, and scientific funding agencies like the National Institutes of Health have invested huge sums of money in so-called “genome wide association studies” (GWAS) and other efforts to determine if a few genetic variants are harbingers of future disease in the population as a whole (10, 12, 43).

“. . . because it been known all along that virtually every disease tends to track in families. What has changed is that . . . we are now beginning to see possible therapeutic approaches based on gene discoveries that will change the way medicine is practiced (12).”

One attractive element of this paradigm was that if a few common variants explained much of the risk for diseases like diabetes, then it should be possible to identify those at risk and target them for early intervention. So far, the data from many, if not most or even all, of these studies have been underwhelming (43). First, a large number of variants seem to cause a statistically significant increase in risk, but this increase is small compared with behavioral and environmental factors. An increased risk of several percent also seems likely to fall below what might be described as a phenotypic signal-to-noise ratio. Second, when the gene variants (single nucleotide polymorphisms, SNPs) that have been identified via GWAS or other experimental approaches are tested in large populations, the distribution of risk SNPs is typically strikingly similar in populations with and without disease (50, 63; Fig. 2). Third, when so-called genetic risk scores for disease are compared with predictive algorithms based on traditional risk factors (family history, lifestyle, age, etc.), the genetic risk scores are far less predictive than traditional phenotype-based risk scores. Furthermore, addition of genetic risk elements to phenotypically based scores adds little or no additional predictive power (50, 63). Finally, the idea of identifying prospective genetic risks for complex diseases that include a number of lifestyle and environmental factors (and increasingly even prenatal factors) is fundamentally wishful thinking, because behavioral health issues and culture play such a dominant role in determining who gets what disease when, and it is unclear if people will change their behavior in a positive way if they know prospectively they are at increased risk (24). Paradoxically, perhaps those at reduced genetic risk would pay less attention to behavioral risks.

Fig. 2. Distribution of so-called high risk genes for cardiovascular disease in women with and without known coronary artery disease. The distribution of risk genes is similar, and construction of a genetic risk score for cardiovascular disease is thus problematic. This is just one example of the limited predictive power of “genomics” as it relates to the ability of relatively common gene variants to predict common diseases. [Borrowed with permission from Ref. 50. Copyright © 2010 American Medical Association. All rights reserved.]

SUCCESS IN PHARMACOGENOMICS AND ANTHROPOLOGY

So far, this paper has offered a sharp critique of the reductionists and taken the position that they over-sold what their technology had to offer both on the individual (gene therapy) basis and in terms of population risk and intervention. However, there have been some notable successes stemming from application of this technology, and two seem especially worthy of comment. For example, there has been success in so-called pharmacogenomics. It has been well-known for some time that there are “responders” and “non-responders” to many forms of drug therapy. In many cases, this is related to how rapidly drugs are metabolized. In the case of tamoxifen, which had a dramatic effect on the recurrence of breast cancer, individuals with decreased drug metabolism appear to be at increased risk for recurrence. This is especially important for drugs like tamoxifen, which are ingested as pro-drugs with one or more metabolites that are active (56).

Another field where “omic” approaches have yielded dividends is anthropology. Two good examples include discoveries related to the independent development of lactase persistence into adulthood in areas of the world that were early adopters of herding (23, 34). In this context, one can imagine that the ability to digest lactose into adulthood provided the affected individuals a significant survival advantage and thus became the dominant genotype in only a few generations. Another good example that is perhaps counterintuitive relates to the individuals who migrated to the Tibetan plateau. These individuals do not develop chronic mountain sickness even with lifelong living at 3,000–4,000 m of elevation. These responses contrast with those of the high altitude natives in the Andes Mountains, who do develop chronic mountain sickness (58, 61, 70). Along these lines, those who migrated to the Tibetan plateau appear to have had selection pressure that favored a less functional variant of the hypoxia-inducible factor that, among other things, prevents them from developing excessive polycythemia, which plays a critical role in chronic mountain sickness.

INTERIM SUMMARY

So far, I have provided a general critique of what might broadly be termed “molecular reductionism”. I have presented evidence that its failure to live up to its self-generated hype is in reality a failure to recognize larger ideas about homeostasis


and regulation that are central to physiology. This includes the specific example of the idea of gene therapy for relatively common genetic disorders like cystic fibrosis and also the limited predictive power of gene variants for common diseases. The question now is whether there is some way out of this problem and a better way to use potentially powerful technologies championed by the reductionists in a biomedical context.

IS SYSTEMS BIOLOGY THE ANSWER?

One idea to address the “failure” of molecular reductionism described above is to use a new approach called systems biology. The idea is that if powerful modeling tools and other data analysis techniques could be applied to the data generated via high throughput molecular reductionism, then somehow more meaningful insights would be generated and ultimately exploited for predictive or therapeutic purposes. The rationale for systems biology comes from a sampling of the comments on the www.systemsbiology.org web site (34a).

“Systems biology is the study of an organism, viewed as an integrated and interacting network of genes, proteins and biochemical reactions which give rise to life. Instead of analyzing individual components or aspects of the organism, such as sugar metabolism or a cell nucleus, systems biologists focus on all the components and the interactions among them, all as part of one system. These interactions are ultimately responsible for an organism’s form and functions.

Traditional biology, the kind most of us studied in high school and college, and that many generations of scientists before us have pursued, has focused on identifying individual genes, proteins and cells, and studying their specific functions. But that kind of biology can yield relatively limited insights about the human body.

Biologists, geneticists, and doctors have had limited success in curing complex diseases such as . . . diabetes because traditional biology generally looks at only a few aspects of an organism at a time.”

To a physiologist, there are obvious problems with systems biology. The problems start with the fact that physiology has been attempting for hundreds of years to understand the integrated function of organs and whole organisms, an effort that culminated in the unifying big ideas about homeostasis and regulation discussed earlier. It is also clear that the type of biology that physiologists have been interested in, starting with Harvey and the circulation, has been about systems and has used modeling and computational techniques (1, 32, 57). Additionally, at this time the concept of systems biology and how it is defined remains very nebulous (52). Is systems biology a new discipline, an approach, a collection of tools, or merely a new name for integrative physiology generated by individuals who are generally unaware that our field exists (2, 28, 34a, 36, 40, 41, 45, 57)? Clearly physiology has provided and continues to provide insight about human disease, including insight that has led to vast therapeutic advances in recent years (37). Perhaps the obvious question for the advocates of the cell-centric view of systems biology is: did they skip physiology as part of their course work as students?

The concerns about systems biology outlined above at some level are about definitions and perhaps intellectual ownership. However, it also seems fair to ask what the long-term outlook for cell-centric systems biology is as an approach to making sense out of the vast amounts of data that can be generated using modern “omic” technology. In this context, there are key intellectual issues related to how data elements are generated, their spatial and temporal relationships, and how many ways they might interact (Fig. 3) that question the very fundamental assumptions about systems biology and its reliance on “bottom up” or “hypothesis neutral” modeling (2, 6, 15, 27, 35, 36, 38, 48, 67). It seems to me that without a narrative approach that includes hypothesis testing and key concepts like homeostasis, systems biology runs the risk of becoming scientific “Abstract Expressionism”. Given the issues discussed earlier with gene therapy and GWAS approaches and the hype that surrounds systems biology, these concerns raise questions about what kind of science and scientific approaches deserve our future attention and funding (2, 24, 35).

REDUCTIONISM STALLS, PHYSIOLOGY PROGRESSES

This is not the place for a comprehensive review of the contributions of physiology to biomedical research and therapeutic progress over the last 20–30 years. However, a few highlights that were initially seen as counterintuitive seem warranted. An obvious one is the discovery of EDRF and nitric oxide (25). This observation, which challenged the idea of the endothelium as merely a barrier, led to the discovery of gas-based signaling mechanisms and new therapeutic targets for conditions as diverse as erectile dysfunction and pulmonary hypertension. Would gas-based signaling mechanisms have been discovered by sequencing genes? Physiology has also helped redefine the optimal strategy used during mechanical ventilation in patients with adult respiratory distress syndrome (ARDS; 26). This has led to abandonment of strategies associated with high airway pressures and maintenance of arterial blood gases, in favor of so-called permissive hypercapnia, alternate forms of mechanical ventilation, and pressure support. Importantly, these new strategies that emphasize the avoidance of barotrauma have been associated with significant reductions in morbidity and mortality for ARDS. While part of the conventional wisdom now, this strategy was initially seen as counterintuitive.

Fig. 3. Simulation of the number of possible combinations of gene-gene interactions depending on the number of genes per biological function (x-axis) and the total number of genes in the organism. For biological functions with roughly 50 genes, ~10^150 possible combinations exist for most mammals. This figure shows the immense challenge associated with hypothesis-neutral systems biology and “bottom up” modeling. [Borrowed with permission from Ref. 46.]
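The order of magnitude quoted in the Fig. 3 caption can be checked with a short calculation. The following is a minimal sketch that assumes the count is the number of unordered sets of genes (a binomial coefficient) and assumes a genome of about 20,000 genes; neither assumption is stated in the caption, and the function name is hypothetical.

```python
import math


def log10_combinations(total_genes: int, genes_per_function: int) -> float:
    """log10 of the number of ways to choose genes_per_function genes
    out of total_genes, ignoring order (binomial coefficient)."""
    return math.log10(math.comb(total_genes, genes_per_function))


# Assumption: roughly 20,000 genes for a mammalian genome (not from the caption).
ASSUMED_TOTAL_GENES = 20_000

for genes_per_function in (2, 10, 50, 100):
    exponent = log10_combinations(ASSUMED_TOTAL_GENES, genes_per_function)
    print(f"{genes_per_function:>3} genes per function: about 10^{exponent:.0f} combinations")
```

With these assumptions, 50 genes per function gives an exponent close to 150, in line with the caption. The exponent depends on the assumed genome size, so the result should be read as an order-of-magnitude illustration of why exhaustive, hypothesis-neutral searches are impractical.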


Another example of a counterintuitive physiologically based clinical strategy was the use of beta-blockers in congestive heart failure. For many years these drugs were contraindicated in congestive heart failure (CHF) because it was felt that high sympathetic drive to the heart was required to maintain an adequate cardiac output in CHF. In reality, high sympathetic activity to the heart over time contributed to the progression of the disease and promoted a downward spiral of cardiac remodeling and reduced function (20). Thus the use of beta-blockers along with vasodilator therapy has been revolutionary and can interrupt or slow the downward spiral noted above in patients with congestive heart failure (Fig. 4). Again, the conventional wisdom was turned on its head and provided new insights that ultimately led to improved therapy. In the case of ARDS and congestive heart failure there has also been a two-way street between observations from clinical research conducted “at the bedside” and more fundamental observations in the laboratory.

Fig. 4. Demonstration that beta-blockade can improve ventricular function (%EF) in humans with congestive heart failure over time. Standard therapy was associated with stable ventricular ejection fraction over 3 mo. By contrast, metoprolol (β-blockade) increased ventricular ejection fraction by >50% over 3 mo (*P < 0.05 vs. baseline). This finding, while initially counterintuitive, was based on sound physiological reasoning and along with other therapies has improved outcomes for patients with congestive heart failure. [Adapted from Ref. 20, with permission from Wolters Kluwer Health.]

Three other examples of more straightforward physiologically based therapeutic successes in recent years include the long story of improved outcomes for premature infants cared for in the neonatal ICU, including altered ventilatory strategies, avoidance of oxygen toxicity, and surfactant therapy (9, 60). These improved outcomes, in the littlest ICU survivors, continue to seem miraculous to individuals who care for these patients and practiced medicine or nursing prior to their use. A second example has been oral rehydration solutions that are lifesaving in infants and children with diarrheal disease, especially in developing countries where it is a primary and frequent cause of death (8). Finally, in the developed world, where obesity and physical inactivity are leading to a pandemic of type 2 diabetes, physical activity (especially walking training in middle-aged people) has been proven to be highly effective in preventing, limiting, and in some cases reversing type 2 diabetes (16, 29). Each of these therapeutic successes is based on a foundation of physiologically based experimental evidence and insights.

REDUNDANCY, FEEDBACK, AND ACCLIMATION/ADAPTATION

Why has physiology continued to contribute in the era of reductionism? Physiologists are well versed in the overall concept of homeostasis, regulation, feedback, redundancy, and acclimation/adaptation. A classic example of redundancy comes from the coronary circulation, where coronary vasodilation is tightly linked to myocardial oxygen demand. In this context, a number of vasodilator systems likely contribute to this response. However, pharmacological blockade of one system, or in fact multiple systems, fails to alter this fundamental relationship between coronary vasodilation and myocardial oxygen demand in most species (19, 64; Fig. 5). This suggests that multiple redundant pathways contribute to this critical physiological response so that when one is blocked or absent, oxygen supply to the heart is not threatened when demand rises.

Fig. 5. Myocardial oxygen demand on the x-axis and coronary blood flow on the y-axis. Note that coronary blood flow rises in proportion to myocardial oxygen demand and that this rise is unaffected by triple inhibition of K+ATP channels, nitric oxide synthase, and adenosine receptors. This is a classic example of the concept of physiological redundancy. This well-known phenomenon may also explain why the absence of many so-called critical genes or proteins has limited impact on overall organ or organism function. This is because so-called redundant systems are able to alter their function and “upregulate” when one or more systems is blocked. [Borrowed with permission from Ref. 64.]

The fundamental relationship between coronary vasodilation and myocardial oxygen demand is also an observation that has had vast therapeutic implications and explains in large part why age specific death rates for cardiovascular disease have fallen dramatically over the last 30–40 years. There are drugs that reduce myocardial oxygen demand, mechanical therapy like stents, bypass surgery that improves myocardial oxygen delivery, and other drugs and lifestyle interventions that can affect both elements of the equation over time (30, 44). This physiological narrative and the progress that has flowed from it are in stark contrast to the relative lack of progress against cancer, where there does not seem to be a unifying physiologically based story or model that can be exploited to address the general problem of cancer.

One of the classic feedback control mechanisms in physiology is the arterial baroreflex. While barodenervated animals have relatively normal blood pressure over a given 24 h period, their blood pressure becomes much more variable (14). The relative stability of blood pressure in the long run shows the power of redundant control via renal regulation of arterial pressure. However, for short-term adaptations, essential for things like exercise or changes in posture, feedback control


from arterial baroreflexes is essential for normal physiological responses.

An outstanding example of how humans acclimatize and adapt to physiological stress comes from studies that demonstrate that the ability of individuals to exercise in the heat can be remarkably improved by a few weeks of training in the heat (54). This improved exercise tolerance in the heat is associated with expanded plasma volume, increased sweating, and altered thermoregulatory skin blood flow. Another outstanding example is what might be called the adaptability of insulin sensitivity and glucose uptake in skeletal muscle. These variables are extremely sensitive to exercise and changes in daily activity and seem especially relevant in the era of the physical inactivity/obesity pandemic (29, 49, 53, 65).

Ideas about redundancy, feedback control, and acclimation/adaptation are also why physiologists are not that surprised by the ability of various gene knockout animals to survive and thrive (33). At some level this approach is conceptually similar to the classic denervation or high dose pharmacological blockade studies used by physiologists for generations and primarily shows the power of the regulatory mechanisms highlighted above to preserve both long term phenotype and homeostasis despite the loss of one or more critical pathways or mechanisms (17). In this context, it is not surprising that yeast can survive without 80% of their genes and that the function of these genes only becomes apparent when the organism is stressed (33). Is it too cynical to point out that knockout animals are essentially a “can’t lose” experimental approach? If the knockout is lethal or leads to significant phenotypic dysfunction, the gene is essential. If the animal survives, then genetic or other compensatory mechanisms were upregulated to overcome the absence of the essential gene.

Physiology or physiologically based tests can also provide insight into the risk of future disease and/or predictive outcomes. For example, the blood pressure responses to common sympathoexcitatory stress can be used to define those at risk for future hypertension in a way that is potentially much more predictive than any current genetic test. Additionally, tests of autonomic function are strong predictors of outcomes in large populations of humans, and cardiorespiratory fitness is an especially good predictor of all-cause mortality.

TOOLS VS. BIG IDEAS

At some level molecular reductionism and systems biology are at existential crossroads. Are they in fact real disciplines informed by big ideas like homeostasis and regulation, or are they essentially tools and approaches that will facilitate the work of disciplines informed by bigger ideas and, more importantly, bigger questions and more comprehensive strategies? Based on the concepts and examples highlighted in this paper, I would argue that until the vast amounts of data generated by modern “omic” techniques are put in a physiological context it will be an exercise in what Sydney Brenner has deemed “low input, high throughput, no output biology” (6). Along these lines, I want to end on an optimistic note with examples of how physiology is making a difference by applying reductionist tools as part of a more comprehensive approach to important questions. Because the Adolph lecture is sponsored by the Exercise and Environmental Physiology section of the American Physiological Society, relevant examples from related areas will be used. In each case there seems to be an overall hypothesis and a strategy that exploits what might be called responders and non-responders to an intervention.

Britton and Koch and colleagues (39, 69) have used selective breeding strategies to develop rats with vastly different inherent aerobic endurance capacities (Fig. 6). These animals have been used in a variety of studies to better understand gene-environment interactions. In many instances the animals selected for low intrinsic aerobic capacity seem to be at increased risk for complex diseases like diabetes, obesity, and heart disease. Additionally, studies using these animals have begun to identify genetic and transcriptional factors and networks that explain in part this increased risk (39).

Fig. 6. Selective breeding of rats with divergent aerobic capacities. These data show that animals selected for their running capacity diverge dramatically after a few generations and that this divergence is sustained for many generations. Importantly, at the same time body weight also began to diverge, as did a number of risk factors for cardiometabolic disease. Phenotypic studies conducted on these animals in conjunction with more targeted forms of “omic” approaches and other types of molecular reductionism are providing new insights about gene-environment interactions. These findings may also have applicability to physically active and inactive humans. The approach of Britton and Koch is a classic example of using reductionist tools in a physiological context to gain new insights with direct applicability to human health and disease. [Reprinted from Ref. 39 with permission from Macmillan Publishers Ltd. Obesity Suppl. copyright 2008.]

Another example of how physiologists are using tools from the “new biology” is the HERITAGE study, which broadly seeks to understand the genetic basis for the differing physiological responses to exercise training in a large number of humans exposed to a standard protocol (3–5). This is an excellent example of how what might be called “high resolution” physiologically based phenotyping in conjunction with . This hypothesis-driven approach also includes uses


various “omic” and systems biology approaches and was initiated by physiologists before the terms genomics or systems biology existed. Additionally, like the examples from pharmacogenomics and anthropology discussed earlier, it takes advantage of the fact that there are responders and non-responders in response to a given intervention or environmental stressor.

Finally, my collaborator John Eisenach and I along with our colleagues have performed carefully controlled studies on how common genetic variants in the β2-adrenergic receptor influence a number of physiological responses and how any genotype-based differences might be influenced by dietary sodium (21, 22, 31, 59). These studies were initiated because epidemiological evidence suggested that genetic variation in the β2-adrenergic receptor influenced blood pressure in large populations. In our studies only homozygotes for the genetic variant of interest were recruited in an effort to see the maximum potential physiological effect of the variants. Using this approach, it appears that there are genotype-specific patterns associated with increased cardiac output responses to exercise that may interact with NO-mediated β2-adrenergic receptor peripheral vasodilation. These responses clearly link and mechanistically define how a common gene variant in a key regulatory system can influence a physiological response in humans. They may also provide physiological explanations relevant to the original epidemiological observations on blood pressure and other outcomes, including those in patients with the acute coronary syndrome (42).

SUMMARY

In this paper and in the Adolph Lecture I have highlighted some of the claims associated with molecular reductionism and more recently systems biology. In both cases I have argued that the apparent inability and/or unwillingness of the advocates of these approaches to use key concepts from physiology and ultimately use their tools in a physiological context has limited the contribution of the approaches they advocate. By contrast physiology has continued to use new tools in the service of its big ideas and also continued to provide biomedical insight and therapeutic advances. As the final examples show, it is possible to incorporate reductionist tools in a physiological context to gain broader biomedical insights. Hopefully these insights will fuel the next wave of physiologically inspired therapeutic advances.

ACKNOWLEDGMENTS

The author thanks Drs. Nisha Charkoudian, Doug Seals, and Jerry Dempsey for critical reviews of the manuscript.

GRANTS

My laboratory has been funded continuously by National Institutes of Health since the early 1990s (HL-46493, HL-83947), and our work has also been facilitated by the former Mayo GCRC grant and more recently the CTSA grant (RR-024150). I have also been supported by the American Heart Association, the Mayo Foundation, and the Frank and Shari Caywood Professorship. I would also like to thank the fellows and collaborators who have worked with me, my outstanding technical support staff, and the many subjects who have volunteered for our studies.

DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the author.

REFERENCES

1. Auffray C, Noble D. Origins of systems biology in William Harvey’s masterpiece on the movement of the heart and the blood in animals. Int J Mol Sci 10: 1658–1669, 2009.
2. Beard DA, Kushmerick MJ. Strong inference for systems biology. PLoS Comput Biol 5: 1–10, 2009.
3. Bouchard C, An P, Rice T, Skinner JS, Wilmore JH, Gagnon J, Perusse L, Leon AS, Rao D. Familial aggregation of V̇O2max response to exercise training: results from the HERITAGE Family Study. J Appl Physiol 87: 1003–1008, 1999.
4. Bouchard C, Leon AS, Rao DC, Skinner JS, Wilmore JH, Gagnon J. Aims, design, and measurement protocol. Med Sci Sports Exerc 27: 721–729, 1995.
5. Bouchard C, Sarzynski MA, Rice TK, Kraus WE, Church TS, Sung YJ, Rao DC, Rankinen T. Genomic predictors of maximal oxygen uptake response to standardized exercise training programs. J Appl Physiol 110: 1160–1170, 2011.
6. Brenner S. Sequences and consequences. Phil Trans R Soc B 365: 207–212, 2010.
7. Cannon WB. Organization for physiological homeostasis. Physiol Rev 9: 399–431, 1929.
8. Cheng AC, McDonald JR, Thielman NM. Infectious diarrhea in developed and developing countries. J Clin Gastroenterol 39: 757–773, 2005.
9. Clements JA, Avery ME. Lung surfactant and neonatal respiratory distress syndrome. Am J Respir Crit Care Med 157: S59–S66, 1998.
10. Collins FS. Contemplating the end of the beginning. Genome Res 11: 641–643, 2001.
11. Collins FS. Cystic fibrosis: molecular biology and therapeutic implications. Science 256: 774–779, 1992.
12. Collins FS. The human genome project and the future of medicine. Ann NY Acad Sci USA 882: 42–55, 1999.
13. Comroe JH Jr, Dripps RD. Ben Franklin and open heart surgery. Circ Res 35: 661–669, 1974.
14. Cowley AW Jr, Liard JF, Guyton AC. Role of the baroreceptor reflex in daily control of arterial blood pressure and other variables in dogs. Circ Res 32: 564–576, 1973.
15. Csete ME, Doyle JC. Reverse engineering of biological complexity. Science 295: 1664–1669, 2002.
16. Diabetes Prevention Program Research Group. Reduction in the incidence of type 2 diabetes with lifestyle intervention or metformin. N Engl J Med 346: 393–403, 2002.
17. Donald DE, Milburn SE, Shepherd JT. Effect of cardiac denervation on the maximal capacity for exercise in the racing greyhound. J Appl Physiol 19: 849–852, 1964.
18. Donaldson SH, Boucher RC. Sodium channels and cystic fibrosis. Chest 132: 1631–1636, 2007.
19. Duncker DJ, Bache RJ. Regulation of coronary blood flow during exercise. Physiol Rev 88: 1009–1086, 2008.
20. Eichhorn EJ, Bristow MR. Medical therapy can improve the biological properties of the chronically failing heart. Circulation 94: 2285–2296, 1996.
21. Eisenach JH, Barnes SA, Pike TL, Sokolnicki LA, Masuki S, Dietz NM, Rehfeldt KH, Turner ST, Joyner MJ. Arg 16/Gly beta-2 adrenergic receptor polymorphism alters the cardiac output response to isometric exercise. J Appl Physiol 99: 1776–1781, 2005.
22. Eisenach JH, McGuire AM, Schwingler RM, Turner ST, Joyner MJ. The Arg16/Gly β-2 adrenergic receptor polymorphism is associated with altered cardiovascular responses to isometric exercise. Physiol Genomics 16: 323–328, 2004.
23. Enattah NS, Jensen TGK, Nielsen M, Lewinski R, Kuokkanen M, Rasinpera H, El-Shanti H, Seo JK, Alifrangis M, Khalil IF, Natah A, Ali A, Natah S, Comas D, Mehdi SQ, Groop L, Vestergaard EM, Imtiaz F, Rashed MS, Meyer B, Troelsen J, Peltonen L. Independent introduction of two lactase-persistence alleles into human populations reflects different history of adaptation to milk culture. Am J Hum Genet 82: 57–72, 2008.
24. Evans JP, Meslin EM, Marteau TM, Caulfield T. Genomics. Deflating the genomic bubble. Science 331: 861–862, 2011.
25. Furchgott RF. The discovery of endothelium-derived relaxing factor and its importance in the identification of nitric oxide. JAMA 276: 1186–1188, 1996.


26. Gillette MA, Hess DR. Ventilator-induced lung injury and the evolution of lung-protective strategies in acute respiratory distress syndrome. Respir Care 46: 130–148, 2001.
27. Golub TR. Counterpoint: Data first. Nature 464: 679, 2010.
28. Greenhaff PL, Hargreaves M. “Systems biology” in human exercise physiology: is it something different from integrative physiology? J Physiol 589: 1031–1036, 2011.
29. Henriksen EJ. Exercise effects of muscle insulin signaling and action. Invited review: effects of acute exercise and exercise training on insulin resistance. J Appl Physiol 93: 788–796, 2002.
30. Heron M, Hoyert DL, Murphy SL, Xu J, Kochanek KD, Tejada-Vera B. Deaths: Final data for 2006. National Vital Statistics Reports; vol 57, no 14. Hyattsville, MD: National Center for Health Statistics, 2009, pp 1–117.
31. Hesse C, Schroeder DR, Nicholson WT, Hart EC, Curry TB, Penheiter AR, Turner ST, Joyner MJ, Eisenach JH. Beta-2 adrenoceptor gene variation and systemic vasodilatation during ganglionic blockade. J Physiol 588: 2669–2678, 2010.
32. Hester RL, Iliescu R, Summers R, Coleman TG. Systems biology and integrative physiological modelling. J Physiol 589: 1053–1060, 2011.
33. Hillenmeyer ME, Fung E, Wildenhain J, Pierce SE, Hoon S, Lee W, Proctor M, St. Onge RP, Tyers M, Koller D, Altman RB, Davis RW, Nislow C, Giaever G. The chemical genomic portrait of yeast: uncovering a phenotype for all genes. Science 320: 362–365, 2008.
34. Ingram CJE, Mulcare CA, Itan Y, Thomas MG, Swallow DM. Lactose digestion and the evolutionary genetics of lactase persistence. Hum Genet 124: 579–591, 2009.
34a. Institute for Systems Biology. (Online). http://www.systemsbiology.org [June 22, 2011].
35. Joyner MJ, Pedersen BK. Ten questions about systems biology. J Physiol 589: 1017–1030, 2011.
36. Joyner MJ. Physiology: alone at the bottom, alone at the top. J Physiol 589: 1005, 2011.
37. Joyner MJ. Why physiology matters in medicine. Physiology 26: 72–75, 2011.
38. Kell DB, Oliver SG. Here is the evidence, now what is the hypothesis? The complementary roles of inductive and hypothesis-driven science in the post-genomic era. Bioessays 26: 99–105, 2004.
39. Koch LG, Britton SL. Development of animal models to test the fundamental basis of gene-environmental interactions. Obesity 16, Suppl 3: S28–S32, 2008.
40. Kohl P, Crampin EJ, Quinn TA, Noble D. Systems biology: an approach. Clin Pharmacol Ther 88: 25–33, 2010.
41. Kuster DWD, Merkus D, Van Der Velden J, Verhoeven AJM, Duncker DJ. Integrative Physiology 2.0: integration of systems biology into physiology and its application to cardiovascular homeostasis. J Physiol 589: 1037–1045, 2011.
42. Lanfear DE, Jones PJ, Marsh S, Cresci S, McLeod HL, Spertus JA. Beta2-adrenergic receptor genotype and survival among patients receiving beta-blocker therapy after an acute coronary syndrome. JAMA 294: 1526–1533, 2005.
43. Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A, Cho JH, Guttmacher AE, Kong A, Kruglyak L, Mardis E, Rotimi CN, Slatkin M, Valle D, Whittemore AS, Boehnke M, Clark AG, Eichler EE, Gibson G, Haines JL, Mackay TFC, McCarroll SA, Visscher PM. Finding the missing heritability of complex diseases. Nature 461: 747–753, 2009.
44. Nelson RR, Gobel FL, Jorgensen CR, Wang K, Wang Y, Taylor HL. Hemodynamic predictors of myocardial oxygen consumption during static and dynamic exercise. Circulation 50: 1179–1189, 1974.
45. Noble D. Neo-Darwinism, the Modern Synthesis and selfish genes: are they of use in physiology? J Physiol 589: 1007–1015, 2011.
46. Noble D. Differential and integral views of genetics in computational systems biology. Interface Focus 1: 7–15, 2011.
47. Noble D. Genes and causation. Phil Trans R Soc A 366: 3001–3015, 2008.
48. Nurse P, Hayles J. The cell in an era of systems biology. Cell 144: 850–854, 2011.
49. Olsen RH, Krogh-Madsen R, Thomsen C, Booth FW, Pedersen BK. Metabolic responses to reduced daily steps in healthy nonexercising men. JAMA 299: 1261–1263, 2008.
50. Paynter NP, Chasman DI, Pare G, Buring JE, Cook NR, Miletich JP, Ridker PM. Association between a literature-based genetic risk score and cardiovascular events in women. JAMA 303: 631–637, 2010.
51. Pearson H. One gene, twenty years. Nature 460: 165–169, 2009.
52. Powell A, O’Malley MA, Muller-Wille S, Calvert J, Dupre J. Disciplinary baptisms: a comparison of the naming stories of genetics, molecular biology, genomics, and systems biology. Hist Phil Life Sci 29: 5–32, 2007.
53. Rogers MA, Yamamoto C, King DS, Hagberg JM, Ehsani AA, Holloszy JO. Improvement in glucose tolerance after 1 wk of exercise in patients with mild NIDDM. Diabetes Care 11: 613–618, 1988.
54. Robinson S, Turrell ES, Belding HS, Horvath SM. Rapid acclimatization to work in hot climates. Am J Physiol 140: 168–176, 1943.
55. Rossman MG. Chapter 3: Recollection of the events leading to the discovery of the structure of haemoglobin. J Mol Biol 392: 23–32, 2009.
56. Schroth W, Goetz MP, Hamann U, Fasching PA, Schmidt M, Winter S, Fritz P, Simon W, Suman VJ, Ames MM, Safgren SL, Kuffel MJ, Ulmer HU, Bolander J, Strick R, Beckmann MW, Koelbl H, Weinshilbum RM, Ingle JN, Eichelbaum M, Schwab M, Brauch H. Association between CYP2D6 polymorphisms and outcomes among women with early stage breast cancer treated with tamoxifen. JAMA 302: 1429–1436, 2009.
57. Secomb TW, Pries AR. The microcirculation: physiology at the mesoscale. J Physiol 589: 1047–1052, 2011.
58. Simonson TS, Yang Y, Huff CD, Yun H, Qin G, Witherspoon DJ, Bai Z, Lorenzo FR, Xing J, Jorde LB, Prchal JT, Ge R. Genetic evidence for high-altitude adaptation in Tibet. Science 329: 72–75, 2010.
59. Snyder EM, Beck KC, Dietz NM, Eisenach JH, Joyner MJ, Turner ST, Johnson BD. Arg16Gly polymorphism of the β2-adrenergic receptor is associated with differences in cardiovascular function at rest and during exercise in humans. J Physiol 571: 121–130, 2006.
60. Sol RF. Current trials in the treatment of respiratory failure in preterm infants. Neonatology 95: 368–372, 2009.
61. Storz JF. Genes for high altitudes. Science 329: 40–41, 2010.
63. Talmud PJ, Hingorani AD, Cooper JA, Marmot MG, Brunner EJ, Kumari M, Kivimaki M, Humphries SE. Utility of genetic and non-genetic risk factors in prediction of type 2 diabetes: Whitehall II prospective cohort study. Br Med J 340: b4838, 2010.
64. Tune JD, Richmond KN, Gorman MW, Feigl EO. K(ATP)(+) channels, nitric oxide, and adenosine are not required for local metabolic coronary vasodilation. Am J Physiol Heart Circ Physiol 280: H868–H875, 2001.
65. van Dieren S, Beulens JW, van der Schouw YT, Grobbee DE, Neal B. The global burden of diabetes and its complications: an emerging pandemic. Eur J Cardiovasc Prev Rehabil 17, Suppl 1: S3–S8, 2010.
66. Verkman AS, Galietta LJV. Chloride channels as drug targets. Nat Rev 8: 153–171, 2009.
67. Weinberg RA. Point: Hypotheses first. Nature 464: 678, 2010.
68. Winslow RM. The role of hemoglobin oxygen affinity in oxygen transport at high altitude. Respir Physiol Neurobiol 158: 121–127, 2007.
69. Wisloff U, Najjar SM, Ellingsen O, Haram PM, Swoap S, Al-Share Q, Fernstrom M, Rezaei K, Lee SJ, Koch LG, Britton SL. Cardiovascular risk factors emerge after artificial selection for low aerobic capacity. Science 307: 418–420, 2005.
70. Yi X, Liang Y, Huerta-Sanchez E, Jin X, Cuo ZXP, Pool JE, Xu X, Jiang H, Vinckenbosch N, Korneliussen TS, Zheng H, Liu T, He W, Li K, Luo R, Nie X, Wu H, Zhao M, Cao H, Zou J, Shan Y, Li S, Yang Q, Asan Ni P, Tian G, Xu J, Liu X, Jiang T, Wu R, Zhou G, Tang M, Qin J, Wang T, Feng S, Li G, Huasang Luosang J, Wang W, Chen F, Wang Y, Zheng X, Li Z, Bianba Z, Yang G, Wang X, Tang S, Gao G, Chen Y, Luo Z, Gusang L, Cao Z, Zhang Q, Ouyang W, Ren X, Liang H, Zheng H, Huang Y, Li J, Bolud L, Kristiansen K, Li Y, Zhang Y, Zhang X, Li R, Li S, Yang H, Nielsen R, Wang J, Wang J. Sequencing of 50 human exomes reveals adaptation to high altitude. Science 329: 75–78, 2010.


Endocrine-Related Cancer 26:6 R345–R368
R Clarke et al. · Multiscale modeling in cancer systems biology

REVIEW

Systems biology: perspectives on multiscale modeling in research on endocrine-related cancers

Robert Clarke1, John J Tyson2, Ming Tan3, William T Baumann4, Lu Jin1, Jianhua Xuan5 and Yue Wang5

1 Department of Oncology, Georgetown University Medical Center, Washington, District of Columbia, USA
2 Department of Biological Sciences, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA
3 Department of Biostatistics, Bioinformatics & Biomathematics, Georgetown University Medical Center, Washington, District of Columbia, USA
4 Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA
5 Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, Virginia, USA

Correspondence should be addressed to R Clarke: [email protected]

Abstract

Drawing on concepts from experimental biology, computer science, informatics, mathematics and statistics, systems biologists integrate data across diverse platforms and scales of time and space to create computational and mathematical models of the integrative, holistic functions of living systems. Endocrine-related cancers are well suited to study from a systems perspective because of the signaling complexities arising from the roles of growth factors, hormones and their receptors as critical regulators of cancer cell biology and from the interactions among cancer cells, normal cells and signaling molecules in the tumor microenvironment. Moreover, growth factors, hormones and their receptors are often effective targets for therapeutic intervention, such as estrogen biosynthesis, estrogen receptors or HER2 in breast cancer and androgen receptors in prostate cancer. Given the complexity underlying the molecular control networks in these cancers, a simple, intuitive understanding of how endocrine-related cancers respond to therapeutic protocols has proved incomplete and unsatisfactory. Systems biology offers an alternative paradigm for understanding these cancers and their treatment. To correctly interpret the results of systems-based studies requires some knowledge of how in silico models are built, and how they are used to describe a system and to predict the effects of perturbations on system function. In this review, we provide a general perspective on the field of cancer systems biology, and we explore some of the advantages, limitations and pitfalls associated with using predictive multiscale modeling to study endocrine-related cancers.

Key Words: systems biology, mathematical biology, computational biology, predictive modeling

Endocrine-Related Cancer (2019) 26, R345–R368

https://erc.bioscientifica.com
https://doi.org/10.1530/ERC-18-0309
© 2019 Society for Endocrinology. Published by Bioscientifica Ltd. Printed in Great Britain.

Introduction

Over the past few decades, many advances in endocrine-related cancers have come from the experimental fields of cellular and molecular biology and from their translation into clinical applications. Generally speaking, cellular and molecular studies have taken a mostly reductionist approach, focusing on mechanistic studies of specific genes and proteins, linear signaling pathways, and particular anticancer drugs and other interventions. A systems-based approach builds on this important work by providing a more holistic account of the complex networks of interacting genes, proteins and metabolites that determine how a cancer cell survives and thrives within the tumor microenvironment and how the host responds to the tumor. From this viewpoint, molecular networks and the subcellular processes they regulate are seen to interact with activities occurring within the tumor cell, its microenvironment and the cancer-bearing organism. A holistic view, where interactions can have both local and distant effects, is nothing new for endocrinologists and experts in some other fields. However, in what is now often referred to as the ‘post-genomic era’, the tools and technologies available to effectively study any cancer as a systems-disease have changed dramatically. In concert with these advances has come greater insight into the remarkable complexity of signaling, its integration and the coordination evident in controlling and executing cellular functions.

In this review article, we hope to introduce a broad readership to the potentials and limitations of a systems approach to improve our understanding and treatment of endocrine-related cancers. The scope of endocrine-related cancer systems biology is large and complex, and we acknowledge that some issues in this field are addressed here at a relatively simplistic level. Nonetheless, we believe that a systems approach, including computational and mathematical modeling of new data streams, is essential to transform data into actionable knowledge that leads to fundamental improvements in human health. An overview of the organization of this review is provided in Fig. 1. We begin with a section on why models are needed, how modelers generally approach building their models, and some considerations regarding the specific goals of modeling. Next, we describe how models may be based on a modular structure, and how modularity can lead to emergent behaviors, as consequences of the dynamical properties of signaling networks. We discuss deterministic, stochastic and Bayesian models, and how their parameters are estimated from data and provided with error bounds. We then discuss model performance, potential sources of error, the importance of independently validating model predictions and modeling drug interactions. Subsequent sections discuss examples of a knowledge-guided computational tool for building networks, a mathematical model of the estrogen receptor landscape and some insights into interpreting models.

For our purposes in this review, a system is a collection of interacting components that produces a defined biological output in response to specific inputs.

Figure 1 Representation of data streams and how these relate to computational and mathematical modeling in the context of systems biology. The four primary sections of this review contain specific insights into different aspects of modeling that reflect how modeling uses data streams to build multiscale models. We first describe why models are needed in ‘Why build models’. The second section ‘Multiscale modeling’ introduces several critical aspects of modeling, beginning with some basic goals of modeling and then describing how models can use a modular structure that can explain the emergent properties of biological systems. Deterministic, stochastic, and Bayesian models are then presented, as are strategies to model drug interactions, a critical feature for cancer therapies. These subsections are followed by a discussion of types of error in models, assessing model performance, and validating model predictions. The final two subsections within the section on multiscale modeling provide specific examples of tools or approaches to modeling: a knowledge-guided computational tool for building networks, and a mathematical model of the estrogen receptor landscape. The penultimate section ‘Interpreting models’ provides some insights into the challenges and pitfalls of interpreting model solutions. The final section ‘Future directions’ offers some brief insights into where the authors see the field going in the next few years.

To be useful, such input–output models must adequately capture the complexity of the system. Complexity does not necessarily mean ‘big’ (many nodes and edges). Relatively small networks can exhibit non-intuitive signal-processing capabilities due to inherent feedforward and feedback loops and non-linear kinetic rate laws, for which small changes in input produce disproportionately large changes in output.

Most biological systems are open, complex, dynamic and adaptive. While these fundamental properties may be missed in work that adopts a solely reductionist perspective, there would be little for systems biologists to model without the data and insights obtained from reductionist studies. Systems biologists acknowledge both the complexity of biological systems and the fact that much of what must be modeled and interpreted is still poorly understood. Computational and mathematical models are often used to analyze and integrate data from multiple technological platforms into new representations of system function. These new representations can expand our understanding of complex regulatory systems (Lavrik & Zhivotovsky 2014, Wang & Deisboeck 2014, Altrock et al. 2015, Peng et al. 2016, Janes et al. 2017, Ji et al. 2017). Ultimately, systems-based insights into the biology of endocrine-related cancers may lead to better treatments and outcomes for patients (Werner et al. 2014, Jinawath et al. 2016, Ji et al. 2017).

While the idea of generating mathematical models of signal flow in a biological system is not new (Le 2007, Ji et al. 2017), the sources and magnitude of data for multiscale modeling, and many of the computational/mathematical tools available, have changed dramatically in recent years. Many of the newer technologies fall into the rapidly developing fields of omics (genomics, transcriptomics, proteomics, metabolomics), an increasing number of sub-omic technologies and quantitative microscopy including gene expression in single cells (Sandberg 2014, Buettner et al. 2015, Kanter & Kalisky 2015). Central to our ability to analyze and integrate these new data streams and to build new mathematical models and computational representations of the data, are the analytical approaches and software tools that continue to be developed by computer scientists, mathematicians and statisticians. Rather than being identified with any of these particular specializations, systems biology sits uniquely at their nexus.

We will focus our discussion on the use of computational and mathematical approaches to model system function in the context of endocrine-related cancer biology. For the purposes of this review, we consider a ‘mathematical model’ as using differential equations and stochastic algorithms to create dynamic, semi-mechanistic models of control networks of limited scope (dozens of genes and their products). Of course, such dynamical models must ultimately be simulated on a digital computer, but we consider a ‘computational model’ as something different: as using machine-learning tools to explore high-dimensional data (hundreds or thousands of genes and/or proteins).

Mathematical models may be deterministic or stochastic in nature, depending on the role of random events in the system being modeled. In either case, all models ultimately entail a statistical evaluation of how well the model’s output fits the available experimental data. Both stochastic and deterministic models can be useful when used appropriately (Twycross et al. 2010). At present, deterministic models are usually the initial approach taken to provide a description of molecular events in cellular control systems. However, considering the paucity of informative data within the flood of omics results, the unavoidable noise in biological measurements, and our ignorance of latent variables in regulatory networks, stochastic (Wilkinson 2009) or hybrid models (Twycross et al. 2010) are being applied more widely. Some of the general limitations in modeling have been discussed elsewhere (Di et al. 2006, Wilkinson 2009, Twycross et al. 2010) and will not be reiterated here.

From a clinical perspective, useful in silico models will have to be multiscale. For example, drug action at the molecular scale must be linked to clinical outcomes at the tissue or organism scale. Multiscale models use many different data types from multiple sources, spanning scales from DNA to RNA to protein, from metabolites to cells to tissues, from tissues to organisms and even to interacting populations. Modeling based only on genome and/or transcriptome data can be limited because approximately 50% of changes found in the transcriptome may not be present in the proteome (Vogel & Marcotte 2012); an even smaller percentage of changes in the genome may filter through to the proteome. Hence, spanning scales (provided necessary data are available) may improve the models and provide new insights into cancer physiology (Deisboeck et al. 2011).

In this review, we explore some of the basic concepts and challenges in applying computational and mathematical modeling to endocrine-related cancer research. Rather than providing detailed descriptions of tools-of-the-trade, we discuss a variety of computational and mathematical approaches that are often applied, the advantages and limitations of each, and the specific challenges for using them correctly and usefully. Since we will not discuss specific experimental designs here, readers interested in exploring the many tools, workflows and frameworks and emerging standards for systems-based research may find the following sources useful (Brazma et al. 2006, Swertz & Jansen 2007, Gehlenborg et al. 2010, Ghosh et al. 2011, Wu & Stein 2012, Hofree et al. 2013, Sedgewick et al. 2013, Wen et al. 2013, Cheng et al. 2014a,b, Hoadley et al. 2014, Creixell et al. 2015, Leiserson et al. 2015, Dimitrova et al. 2017, Nam 2017, Keenan et al. 2018, Miryala et al. 2018). Similarly, there are many sources of cancer omics data in the public domain that are too numerous to capture here. However, we provide examples of some widely used large omics datasets that include data from breast and other endocrine-related cancers in Table 1.

Given clear evidence of a significant lack of reproducibility in biomedical research (Begley 2013, Mobley et al. 2013, Hatzis et al. 2014) and the potential for systems approaches to both reduce and exacerbate this problem, an appreciation of some of the key challenges – for which there may or may not be adequate current solutions – is timely. While we cannot address all the major issues in such an interdisciplinary subject, we hope that our perspective will be pertinent to using systems biology to attain a better understanding of endocrine-related cancers.

Why build quantitative models of biological systems?

‘The statistician knows, for example, that in nature there never was a normal distribution, there never was a straight line, yet with normal and linear assumptions, known to be false, he can often derive results which match, to a useful approximation, those found in the real world.’

George E P Box (1919–2013)

To extract new insights and build integrated, predictive models, particularly from experiments that generate ‘big data’, requires some form of in silico analysis to deal with the complexity of the data. For biological systems, complexity can arise from dimensionality (many genes and their interactions) and from general properties of the system that reflect its topology (feedforward and feedback loops), adaptability (redundancy, degeneracy), multimodality (concurrent performance of multiple integrated and coordinated tasks) and dynamism (changes in time and space) (Clarke et al. 2008, Tyson et al. 2011). Complexity can also arise at the cellular level. A notable feature of several endocrine-related cancers is their cellular heterogeneity, which creates a dynamic microenvironment of many cell types in addition to the cancer cell component and can also affect a tumor’s response to treatment (Junttila & de Sauvage 2013, Meacham & Morrison 2013, Martelotto et al. 2014). Often, models are built with transcriptome data that reflect averaged expression values, since tissue microdissection prior to collecting omic data remains relatively uncommon. When applying computational and mathematical modeling to study cell type/tissue type in data from complex tissue samples, data deconvolution using either supervised or unsupervised approaches is a prerequisite. Supervised data deconvolution can be performed by integrating tissue-specific gene or protein expression profiles (Newman et al. 2015) from the Genotype-Tissue Expression program (GTex Consortium 2015) and The Human Protein Atlas (Ponten et al. 2011). Alternatively, in the more challenging case of intra-tumor heterogeneity where subclone-specific markers are often unknown, an unsupervised data deconvolution approach such as Convex Analysis of Mixtures can be exploited to uncover the hidden subclone specificity (Wang et al. 2015, 2016, Herrington et al. 2018). While tools for supervised (Zuckerman et al. 2013, Hart et al. 2015) and unsupervised deconvolution of averaged data from heterogeneous tissues (Chen et al. 2011, Wang et al. 2016) can be used as a data processing step prior to modeling, this preprocessing step remains uncommon.

The properties of high-dimensional data, particularly data from omics technologies, present unique challenges (Clarke et al. 2008) that are often inadequately addressed or fully appreciated. Nonetheless, the purpose of in silico analysis is to apply tools to extract meaningful results from high-dimensional data for the purposes of generating and testing biological hypotheses (Tyson et al. 2011). For instance, we may wish to understand and predict under what conditions a cancer cell will begin to proliferate in situ or migrate to a new location. Extracting such knowledge from large datasets by intuitive reasoning alone can be difficult or impossible and is often associated with a high risk of operator bias and/or error. Thus, new tools and approaches continue to emerge to deal with the challenges of working in high-dimensional data spaces and to enable integrating the spatial, temporal and cell context-specific nature of regulatory networks (Hoadley et al. 2014, Leiserson et al. 2015, Masoudi-Nejad et al. 2015, Tape 2016, Barberis & Verbruggen 2017, Dimitrova et al. 2017). New concepts, such as ‘master regulator proteins’ that may determine the transcriptional state of a cancer cell, also continue to arise (Califano & Alvarez 2017).
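As an illustration only, supervised deconvolution of an averaged (bulk) expression profile can be sketched as a non-negative least-squares problem: given reference expression signatures for each cell type, find non-negative mixing fractions that best reproduce the bulk measurement. The sketch below uses a plain projected-gradient solver; it is not the algorithm used by any of the tools cited above, and the signature values, gene count and function name `deconvolve` are hypothetical choices made for the example.

```python
def deconvolve(signatures, bulk, n_iter=2000):
    """Estimate non-negative cell-type fractions f with signatures * f ~= bulk.

    signatures: list of rows (one per gene), each row holding the reference
                expression of that gene in every cell type.
    bulk:       list of averaged expression values, one per gene.
    """
    n_genes = len(signatures)
    n_types = len(signatures[0])
    # Safe step size: 1 / trace(S^T S) never exceeds 1 / (largest eigenvalue).
    step = 1.0 / sum(v * v for row in signatures for v in row)
    f = [0.0] * n_types
    for _ in range(n_iter):
        # Residual of the current fit, gene by gene.
        residual = [sum(row[k] * f[k] for k in range(n_types)) - bulk[g]
                    for g, row in enumerate(signatures)]
        # Gradient of 0.5 * ||S f - b||^2 is S^T (S f - b).
        grad = [sum(signatures[g][k] * residual[g] for g in range(n_genes))
                for k in range(n_types)]
        # Gradient step followed by projection onto f >= 0.
        f = [max(0.0, f[k] - step * grad[k]) for k in range(n_types)]
    total = sum(f)
    return [x / total for x in f] if total > 0 else f


# Hypothetical signatures: 5 marker genes (rows) x 3 cell types (columns).
SIGNATURES = [
    [10.0, 1.0, 1.0],
    [1.0, 8.0, 1.0],
    [1.0, 1.0, 9.0],
    [5.0, 5.0, 1.0],
    [2.0, 2.0, 2.0],
]
TRUE_FRACTIONS = [0.6, 0.3, 0.1]  # made-up mixture used to simulate bulk data
BULK = [sum(s * f for s, f in zip(row, TRUE_FRACTIONS)) for row in SIGNATURES]

ESTIMATE = deconvolve(SIGNATURES, BULK)
print([round(x, 3) for x in ESTIMATE])
```

With noise-free simulated data the estimated fractions should match the mixture used to generate the bulk profile; with real measurements, noise and imperfect signatures would limit how well the fractions are recovered.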

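The distinction drawn earlier between deterministic and stochastic mathematical models can be made concrete with a minimal sketch, assuming a single molecular species that is synthesized at a constant rate and degraded in proportion to its abundance. The deterministic version solves the rate equation dx/dt = k_syn − k_deg·x; the stochastic version simulates individual synthesis and degradation events with the Gillespie algorithm. The rate constants and function names are hypothetical and are not taken from any model in this review.

```python
import math
import random


def deterministic(k_syn, k_deg, x0, t):
    """Closed-form solution of dx/dt = k_syn - k_deg * x at time t."""
    x_ss = k_syn / k_deg  # steady-state level
    return x_ss + (x0 - x_ss) * math.exp(-k_deg * t)


def gillespie_time_average(k_syn, k_deg, x0, t_end, rng):
    """Stochastic simulation of the same synthesis/degradation process.

    Returns the time-averaged copy number over the interval [0, t_end].
    """
    t, x, area = 0.0, x0, 0.0
    while True:
        birth = k_syn          # propensity of a synthesis event
        death = k_deg * x      # propensity of a degradation event
        total = birth + death
        dt = rng.expovariate(total)  # waiting time to the next event
        if t + dt >= t_end:
            area += x * (t_end - t)
            return area / t_end
        area += x * dt
        t += dt
        # Choose which event fires, in proportion to its propensity.
        x += 1 if rng.random() * total < birth else -1


# Hypothetical rate constants giving a steady state of 100 molecules.
K_SYN, K_DEG = 10.0, 0.1
print(deterministic(K_SYN, K_DEG, 0.0, 200.0))
print(gillespie_time_average(K_SYN, K_DEG, 100, 2000.0, random.Random(1)))
```

The deterministic curve settles at k_syn/k_deg, while the stochastic trajectory fluctuates around that level; the fluctuations become relatively more important as copy numbers fall, which is one reason stochastic or hybrid models are applied when molecule counts are small.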

Table 1 Examples of the most commonly used endocrine-related breast cancer public omic datasets.

Database                           URL                                                       Data spaces
CPTAC                              https://proteomics.cancer.gov/data-portal                 Proteome
EGA                                https://ega-archive.org/datasets                          Genome, Transcriptome
EMBL-EBI                           https://www.ebi.ac.uk/services/all                        Genome, Transcriptome, Proteome, Metabolome
GNPS/Massive                       https://gnps.ucsd.edu/ProteoSAFe/static/gnps-splash.jsp   Metabolome
ICGC                               https://dcc.icgc.org/                                     Genome, Transcriptome
MassIVE                            https://massive.ucsd.edu/ProteoSAFe/static/massive.jsp    Proteome
Metabolomics Workbench             https://www.metabolomicsworkbench.org/                    Metabolome
NCBI-GEO                           https://www.ncbi.nlm.nih.gov/gds                          Genome, Transcriptome
ONCOMINE                           https://www.oncomine.org/resource/login.html              Genome, Transcriptome
ProteomeXchange (PX) Consortium    http://www.proteomexchange.org/                           Proteome
ProteomicsDB                       https://www.proteomicsdb.org/                             Proteome
TCGA                               https://portal.gdc.cancer.gov/                            Genome, Transcriptome

Primary data and metadata quality vary across and within these sites. For example, clinical metadata for human subjects are often limited. The platform used for data collection in each omics space also can vary across and within these sites. While most provide access to the raw (unprocessed) data, ONCOMINE primarily exposes only processed data; the method of data processing can vary across individual studies.

Computational modeling can provide unbiased results from large data sets, allowing us to visualize complex signaling relationships within the data (Gehlenborg et al. 2010). Some of the more useful approaches in this area come from applications of graph theory. Graphs are mathematical structures that represent pairwise relationships between nodes. Each gene/protein is a node (or vertex) and each connection with another gene/protein is an edge. Graphical representations of molecular signaling are readily available on the web. For example, signal transduction pathways may be found at the community-based Kyoto Encyclopedia of Genes and Genomes (KEGG; http://www.genome.jp/kegg) or the commercially supported Biocarta Pathways Project (http://www.biocarta.com/genes/index.asp). These graphical representations are mostly assembled intuitively from the literature to provide a static reflection of the topological features of mostly canonical signaling networks.

Static maps are widely used to represent complex signaling networks and to guide largely intuitive interpretations of signaling, but they are of limited use for predicting signal flows through edges of the network in a living cell responding to signals received from its environment. Limited dynamic information may be evident in the directionality of signal flow (such as, protein A upregulates the production of protein B), but the consequences of many such interactions in a complex, interconnected network are challenging to predict by intuitive reasoning alone. Appropriate computational models can help to uncover complex associations hidden in the data and often may provide a statistical assessment of the strength of any predicted association. For example, gene set enrichment analysis can rapidly probe a large database of genes and their hierarchically annotated functions to suggest signaling pathways closely affiliated with a list of differentially expressed genes (Subramanian et al. 2005); for example, see http://software.broadinstitute.org/gsea/index.jsp. A pathways database and a search tool are also provided by the Gene Ontology Consortium (see http://geneontology.org/page/go-enrichment-analysis).

Given adequate data, both computational and mathematical models can make quantitative predictions of the biological state under investigation. One of the primary uses of quantitative models is to perform in silico experiments where the values of specific nodes or edges are changed and the model is used to predict how the change affects other nodes in the network. It is possible to run hundreds or thousands of such simulations to explore both model performance and how specified changes in node/edge values affect the distribution of predicted outcomes. For example, in silico modeling can be used to compare multiple drug combinations including the effects of scheduling and dosing that would be very difficult in animal models or even in some cell culture models (Tang & Aittokallio 2014, Ryall & Tan 2015, Ledzewicz & Schaettler 2016). Appropriate quantitative models, when effectively applied to sufficient, high-quality data, can enable investigators to explore questions in ways that would otherwise not be possible. Visualization of the outputs from computational analysis of high-dimensional data can be an indispensable aid in interpreting the biological significance of the data (Gehlenborg et al. 2010, Cirillo et al. 2017, Pavlopoulos et al. 2017, Robinson et al. 2017). Thus, multiscale modeling enables investigators to

explore complex datasets and signaling in new ways that are both tractable and productive.

Multiscale modeling

‘Numquam ponenda est pluralitas sine necessitate’ (Plurality should not be proposed unnecessarily)

William of Occam (c. 1287–1347)

‘Since all models are wrong the scientist cannot obtain a ‘correct’ one by excessive elaboration. On the contrary following William of Occam he should seek an economical description of natural phenomena.’

George E P Box (1919–2013)

All models are abstract representations of the system they are built to portray. The types of models we consider here are not intended to explain all of cancer biology. Rather, we use models to learn something new about how a specific function may operate, be controlled and interact with other cellular functions to affect a specific biological outcome. For example, we may wish to understand how estrogens affect the decision of some breast cancer cells to enter and complete a turn of the cell cycle. Understanding this function could then lead to addressing larger goals, such as developing new therapeutic interventions to block cell cycling or predicting which patients would receive the greatest benefit from blocking this action of estrogens. Thus, the primary goals of modeling are to give insights into how a control system works at the molecular level and to make robust, reliable predictions about how the system responds to a variety of natural situations and medical interventions.

For molecular signaling studies, the latter goal can be achieved by changing the values of parameters in the model and experimentally validating the predicted outcomes. Given a perturbation or rewiring of a control network, the output of a model is a prediction of the changed state of the cell (for example, alive or dead; proliferating or growth arrested). When simulations of a model under a variety of realistic conditions inadequately reflect what is already known to occur in cells, the model must be modified or extended. For example, a model may predict that reducing the expression of one gene should increase the expression of another, but the observed result of this experiment (perhaps using an RNAi approach) is the opposite. By considering how to resolve this discrepancy between model and experiment, new insights may be gained into how the control system works, and new predictions will be generated that can be used to test the modified assumptions.

A suitable framework to guide the modeling effort is a key starting point. The framework describes, at a high level, what is generally known about the system in the context of integrated modules that perform specific cellular functions. Thus, a modular function, such as cell death, may be explained by a model of the signaling that controls and executes one or more forms of cell death, such as apoptosis. Where there is sufficient knowledge of an individual module, a reasonably detailed influence diagram of known or predicted signaling relationships can be created to guide construction of the mathematical equations. This knowledge can be gained from specific experimental data available in the laboratory, from the literature, or perhaps based on a static canonical model as might be obtained from KEGG or Biocarta. Where a canonical model does not exist (or there is good reason to believe that canonical signaling is inadequate), computational modeling can be used to formulate new hypotheses about the topology of a control module from high-dimensional data (Clarke et al. 2011). Where there is sufficient knowledge of the components and interactions of a control system, the interaction diagram can be translated into a set of mathematical equations that quantitatively represent dynamical fluxes through the network (readers interested in exploring specific in silico models can find examples in several databases including JWS Online, available at http://jjj.biochem.sun.ac.za/index.html, and the Biomodels Database http://www.ebi.ac.uk/biomodels-main/). An example of such a framework can be seen in our roadmap for systems modeling of endocrine responsiveness in breast cancer (Tyson et al. 2011).

At some level, useful models need to address the open, complex, dynamic and adaptive nature of biological systems. While we do not intend to provide a detailed description of the concepts and methods of model building, we can mention some general, widely applicable principles. First of all, we must keep our end-goal in mind (what aspect of cancer cell physiology are we trying to understand) as well as our starting point (what is our working hypothesis about the underlying control system). Then, ideally, we would like to get from the working hypothesis to accurate predictions of cell behavior with a model that is as simple as possible, but not so simple as to leave out crucial features of the molecular biology or cell physiology. Of course, these are vague and often antithetical requirements (what is simple? what is crucial?), but it is the job of the modeler

to make informed decisions about how much detail can and should be included in the mathematical model. Often these uncertainties can be addressed by an iterative approach, involving knowledge-guided trial-and-error or the use of multiple feature selection tools (as an example see the Feature Selection functions by MathWorks, http://www.mathworks.com/help/stats/feature-selection.html).

To highlight issues that may be useful for the non-expert wishing to evaluate published models and/or to collaborate with modelers, we next address the utility and methodology of mathematical and computational modeling. In our studies, we use computational tools to extract small, robust and information-rich topological features from high-dimensional data sets. These features can then be tested and validated experimentally, and at this stage, a simple mathematical model may be useful in capturing this knowledge, working out its implications, and making predictions to guide further laboratory experiments (Clarke et al. 2011). This iterative approach requires a modeling framework (a network diagram), some relevant experimental data, and a basic understanding of how components of the network may interact to produce observed physiological responses of cells. The network diagram guides the construction of the mathematical model, which can be used to compute the expected behavior of the simulated cells. To carry out simulations, we must first estimate the values of the parameters (such as rate constants and binding constants) in the mathematical model. Parameter estimation is a difficult problem, but it can (and must) be carried out in light of existing experimental data (Tyson et al. 2011). There would be no rationale to include a parameter without some data or direct evidence of its involvement in reactions, and these data can provide bounds on parameter values in the mathematical model. Once an initial model adequately accounts for the existing data, it can be used to predict specific outcomes of new experiments that can be run to confirm, extend or adjust the model. Thus, iterative modeling with the addition of new data allows both testing and refining of the model, which leads to new biological insights (Clarke et al. 2011).

Examples of modeling goals

Cancer systems biology studies tend to focus either on classification, where the goal is to predict a phenotype or outcome based on data, or on mechanistic modeling, where the goal is to learn something new about how the system (a tumor, a cancer cell or a signaling network within the cell) functions (Clarke et al. 2011).

An example of the classification task would be the use of gene expression data from a patient’s tumor to predict the patient’s prognosis and/or to determine the best choice of treatment. Among the simplest examples is the heuristic guide for the treatment of breast cancer patients based on a three-gene classification scheme: estrogen receptor alpha (ER), progesterone receptor (PGR) and HER2. Knowledge of the expression of these three genes defines three molecular subgroups: ER and/or PGR-positive (can be treated with an endocrine therapy), HER2-positive (can be treated with an anti-HER2 therapy, approximately half of these also express ER and/or PR and may also receive an endocrine therapy) and absence of expression of all three – often referred to as triple-negative breast cancer (TNBC) – which is usually treated with cytotoxic chemotherapy. A similar goal is exemplified by using a panel of clinical/pathological measures to predict prognosis in breast cancer; an example being the semi-quantitative assessment that produces the Nottingham Prognostic Index (Galea et al. 1992). Classifiers based on omics data are also available and in common clinical use, including the 70-gene signature that comprises the MammaPrint prognostic predictor (Bedard et al. 2009) and the prognostic PAM50 gene signature (Parker et al. 2009). Signatures that have not yet become adopted widely in the clinic continue to emerge (Wu & Stein 2012, Cheng et al. 2013). The output from these types of models is a prediction of the future behavior of the cancer – a clinical outcome such as an estimate of patient survival (prognosis) – often within a defined time period.

Omics-based classifiers (most frequently transcriptomic) are usually built using a supervised approach, where a training set of data from samples with known outcomes is used and the predictive model is subsequently validated in independent datasets. Classification models often rely primarily on the statistical properties of each measurement/input variable and do not require that these properties derive specifically from any biological function of the system (Clarke et al. 2008). The literature contains many different attempts to build classification schemes in breast cancer but often with varying results and robustness, even for some of the most widely used tools (Mackay et al. 2011, Venet et al. 2011). While some schemes produce comparable outcomes on a common dataset, the features selected for classification by each scheme often have little overlap (Imamov et al. 2005). Given the complexities in molecular signaling and the selection of genes based on their statistical properties to support classifier performance, it is not clear whether this observation reflects different genes representing

similar underlying processes (Imamov et al. 2005) or a lack of robustness in feature selection unrelated to biology.

Network-based classification can also be performed on individual patient data (Creixell et al. 2012). The key is to develop a quantitative metric based on the topology of a learned network that can be applied to new observations to determine if the new observation is likely to share the same topology. For example, once phenotype-specific networks are learned, a model-based likelihood measure can be calculated to determine which topological hypothesis is more likely generating the new observation, where the learned variance of network topology is used to support such likelihood-based hypothesis testing.

The second goal of a systems analysis of data is to generate new insights into mechanistic aspects of the cancer phenotype. For example, the model may be used to understand why patients respond differently to a specific therapy or how molecular signaling regulates or executes a specific phenotype. Hence, the analysis may be structured to test if a series of proposed features might be true (hypothesis testing) or to discover new features that might explain mechanism (hypothesis generation). While these models also frequently use the statistical properties of the measurements to find signaling features of interest, there is an explicit assumption that the measurements, and any changes in their values across phenotypes, are derived from relevant biological properties of the system.

Among the more common approaches for mechanistic studies is the use of transcriptome data to build gene regulatory networks, as exemplified by a network of transcription factors (TFs) and the target genes that they are known, or predicted, to regulate. Insights from models built primarily from transcriptome data can be limited by the often low frequency with which transcriptome changes translate into similar expression changes in the proteome (Vogel & Marcotte 2012). The target genes for TFs are identified either in silico (predicted using DNA sequence data; see MotifDb at (http://www.bioconductor.org/packages/release/bioc/html/MotifDb.html) as an example of a tool for performing this function) or experimentally (chromatin immunoprecipitation-based methods; ChIP). These studies often produce small and mostly unidirectional maps (TF→target) and they can be noisy. For example, in silico predictions of targets based only on promoter sequences do not account for DNA structure/accessibility and are often incomplete. Experimentally measured promoter occupancy (such as by ChIP) does not always reflect functional regulation of the adjacent gene. Correlations of measured (ChIP/ChIPseq) or predicted promoter sequence binding with differential mRNA regulation in microarray data are often used to validate these signatures. Studies with RNAi or cDNA overexpression, mostly done using cell lines growing in vitro, may also be used to further establish the influence of gene expression on target gene regulation.

These approaches may not account fully for the complexity of a given target gene’s transcriptional regulation, such as whether factors other than the protein complex that is detected as being bound to a specific promoter element are driving the measured differential expression of the target gene. For example, TF1→Target Gene could still be driven through a latent variable(s), since the same experimental outcomes could be seen if TF1 was knocked down and the true relationship was TF1→TF2→Target Gene or even TF1⊣TF2⊣Target Gene. Hence, both false-positive and false-negative regulatory events may be obtained in addition to true events. For in silico modeling, including data on TF2 may or may not affect model function. Where it does not, the measurements of TF2 are superfluous and, in the interests of parsimony, can be eliminated from the model. Alternatively, there may be technical reasons that make the measurements of TF2 more reproducible than those of TF1. In this case, when TF1 and TF2 capture the same information, the model may perform better with TF2 measurements than using those for TF1.

Modules and emergent behavior

System models can be constructed as a network of integrated and interacting modules that perform the system’s component operations in a coordinated manner (Tyson et al. 2011). The topology of signaling for a module can be extracted de novo from the data, with functions being implied from any known activities of their member nodes (Wu & Stein 2012). However, for modeling known functions where there is significant data and prior knowledge, modules can be viewed more discretely as integrated network components that regulate and/or execute a specific function (Tyson et al. 2011). For example, apoptosis could be considered as a module that performs a cell death function; apoptosis can then be modeled as a discrete process, perhaps as a closed, input–output device. Cells have other modules that perform similar functions, including autophagy (which can produce prodeath or prosurvival outcomes). These modules represent biological redundancy because if an irreversible cell fate decision is made in favor of death, one of several differently constituted modules can execute that decision. Some genes may play key, but not

necessarily similar, functions in more than one of these modules. For example, BCL2 can regulate the activation of the autophagy module through its ability to sequester BECN1, while also affecting execution of the apoptosis module through its effects on mitochondrial membrane permeability (Clarke et al. 2012). Cell fate may depend on the amount of BCL2 present and its subcellular location. For example, BCL2 bound to BECN1 may be unable to protect the mitochondria, with BCL2:BECN1 complexes effectively preventing the initiation of prosurvival autophagy (BECN1) and concurrently not preventing apoptosis (BCL2). Since other prosurvival BCL2 family members can also bind to BECN1, the balance of prosurvival-to-prodeath BCL2 family members (there is potentially significant signaling degeneracy within apoptosis), the concentration of free BECN1 remaining available to activate autophagy, and their respective subcellular localization(s) may all contribute to the final cell fate decision. The potential for cell context-specific wiring (and rewiring in response to stress) is evident.

A clear understanding of these interactions in ER+ breast cancer cells requires both significant insight and quantitative data from wet laboratory studies. Predicting cell fate outcomes robustly in the presence of various endocrine stressors (estrogen withdrawal, exposure to SERMs/SERDs) is unlikely to be successful without adequate in silico modeling. An effective dynamic model of these relationships could also be used to predict optimal drug dosing and scheduling to drive maximal cell death and potentially limit the emergence of drug resistance (Tang & Aittokallio 2014, Ryall & Tan 2015).

Integration of modular functions allows a cancer cell to coordinate and execute the activities it needs to proliferate, survive, move and invade locally, respond to stress and manage its metabolism to support these actions. Modules can be combined differentially in time and space, creating some of the phenotypic diversity that is characteristic of breast cancer cells. When modules interact in complex feedback and feedforward loops, they can exhibit redundancy (different modules performing similar functions), degeneracy (different signaling routes allowing a module to perform the same function in different ways) and novelty (the ability to perform new functions or old functions in new ways). This plasticity of the response characteristics of modular networks is the origin of their ‘emergent’ properties (Bhalla & Iyengar 1999). For example, an apoptosis module may be blocked in a cell but the cell death decision may now be executed by an autophagy module. The ability to recombine signaling features in complex regulatory networks in response to specific stresses is likely the emergent property that drives both the phenotypic plasticity often attributed to cancer cells and the development of resistance to anticancer drugs. From an intuitive point-of-view, emergent properties are challenging because they are difficult to deduce from a knowledge of the individual components of the system, and the relationships between the emergent property and its component parts may be non-linear and dynamic (changing over time). To deal reliably with these complexities requires comprehensive and accurate mathematical models to guide our thinking and predictions.

Emergence may underlie many novel behaviors of cancer cells that cannot easily be foreseen from knowledge of the system’s individual components. In evolutionary biology, emergence can reflect the development of larger or more complex functions or behaviors derived from the interactions among, but not shared with, individual smaller or less complex features (Okasha 2012, Gho & Lee 2017). New behaviors in tumors likely arise through changes that affect interactions within and among modules. For example, changes in signaling from within the tumor microenvironment (adaptive) or the acquisition of a genetic/epigenetic change (such as activating or inactivating mutations) could alter the level of expression, function or subcellular location of a molecule or the activity of a pathway in a network. Consequently, this pathway may now connect different modules that perform a new cellular function or continue to perform an existing function in a different manner. Where these new emergent properties confer a biological advantage, they are expected to experience positive selection (in a Darwinian sense) (Enriquez-Navas et al. 2015). Acquired drug resistance may be an example of a new emergent property that is not evident in the initial cell population. Such resistance could be mutational (ER mutations that confer resistance to aromatase inhibitors in breast cancer) or adaptive (activation and integration of the unfolded protein response module with a prosurvival autophagy module that act together to confer resistance to antiestrogens) (Clarke et al. 2011, 2012).

The emergent properties of cells in a system like an ER+ breast tumor likely explain, in part, the phenotypic heterogeneity of some breast tumors and also the diversity of responses that confer drug resistance (Clarke et al. 2012). The property of emergence with respect to acquired multiple drug resistance (a function that is likely subject to positive selection), and the potential that some complex functions may never stabilize (the rate of appearance of

new metastatic foci may continue to increase throughout the disease process), may underlie the high prevalence of distant recurrences that are poorly responsive to available systemic therapies, and so are generally fatal.

Dynamics

One of the major strengths of quantitative mathematical modeling is the ability to capture the dynamic nature of a system (Aldridge et al. 2006, Anderson & Quaranta 2008, Toettcher et al. 2009, Spencer & Sorger 2011, Molinelli et al. 2013). In particular, models of endocrine-related cancers have provided new insights into the temporal development of invasive, metastatic cells (Quaranta et al. 2008, Gallaher et al. 2014), drug-treatment responses and drug-resistant states (Chen et al. 2013, 2014, Parmar et al. 2013, McKenna et al. 2017) and the origins of network plasticity (Tavassoly et al. 2015, Picco et al. 2017). Examples of some of the methods used in mathematical modeling are provided in Table 2 (Tyson et al. 2019).

Despite their evident utility, dynamic models in molecular cell biology must be interpreted cautiously. Model predictions can be very accurate when restricted to conditions close to the experimental conditions on which the model was built, but less reliable when extrapolated far beyond the range for which they have been verified. Nonetheless, like weather prediction, mathematical models of cellular regulatory systems can be very useful for short-term forecasting of local activity without being reliable predictors of long-term ‘weather’ patterns on a ‘global’ scale.

Parameters

To simulate a mathematical model, we must first estimate the values of all kinetic parameters from experimental observations. Examples of parameters include reaction rate constants (such as protein synthesis and degradation, or phosphorylation and dephosphorylation) and binding or dissociation constants (for example, Michaelis constants for enzyme-catalyzed reactions). Estimation of these parameter values is often the most difficult aspect of building a useful mathematical model (Liepe et al. 2014, Kimura et al. 2015). The goal of parameter estimation is often not to find the ‘optimal’ set of parameter values for fitting a selection of experimental results but rather to find a representative collection of parameter sets that all provide an ‘acceptable’ fit to the data (Tavassoly et al. 2015).

When faced with the dimensionality of data from an omics platform, a mathematical model with thousands of variables would be difficult to formulate and almost impossible to parametrize. Currently, high-dimensional data are more effectively explored using computational

Table 2 Methods of mathematical modeling.

| Method | Dynamic variables | Time | Example |
|---|---|---|---|
| Boolean networks | X(t) = 0 or 1; Y(t) = 0 or 1 | t = integer (0, 1, 2, …) | X inhibits synthesis of Y and Y inhibits synthesis of X: $X(t+1) = \neg Y(t)$, $Y(t+1) = \neg X(t)$ |
| Ordinary differential equations | X(t) = positive real number; Y(t) = positive real number | t = real number (t ≥ 0) | X inhibits synthesis of Y and Y inhibits synthesis of X: $\frac{dX}{dt} = \frac{k_{sx}}{1 + Y^{p}} - k_{dx} X$, $\frac{dY}{dt} = \frac{k_{sy}}{1 + X^{q}} - k_{dy} Y$ |
| Stochastic models | M(t) = positive integer | t = real number (t ≥ 0) | Propensity of mRNA synthesis = $k_{sm}$; propensity of mRNA degradation = $k_{dm} M$. Probability density function for number of mRNA molecules in the cell is $P(M) = e^{-\lambda} \frac{\lambda^{M}}{M!}$, where $\lambda = k_{sm}/k_{dm}$ |
| Hybrid deterministic-stochastic models | M(t) = positive integer; P(t) = positive real number | t = real number (t ≥ 0) | Genetic regulatory network: simulate mRNA fluctuations, M(t), with a stochastic model and protein dynamics, P(t), with ordinary differential equations |

Additional information can be found in Tyson et al. (2019). Reprinted from Journal of Theoretical Biology, Vol 462, Tyson JJ, Laomettachit T & Kraikivski P, Modeling the dynamic behavior of biochemical regulatory networks, Pages 514–527, Copyright (2019), with permission from Elsevier.
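As an illustration of the first three rows of Table 2, the following minimal Python sketch encodes the mutual-inhibition example as a Boolean network and as a pair of ordinary differential equations, and evaluates the Poisson distribution given for mRNA copy number. It is a sketch under stated assumptions rather than the implementation used by Tyson et al. (2019): the function names, the parameter values (k_sx = k_sy = 2, k_dx = k_dy = 1, p = q = 4, k_sm = 10, k_dm = 1) and the use of forward-Euler integration are all hypothetical choices made here for illustration only.

```python
import math


def boolean_step(x, y):
    """Synchronous Boolean update: X(t+1) = NOT Y(t), Y(t+1) = NOT X(t)."""
    return int(not y), int(not x)


def simulate_ode(x0, y0, k_sx=2.0, k_sy=2.0, k_dx=1.0, k_dy=1.0,
                 p=4, q=4, dt=0.01, t_end=20.0):
    """Integrate the mutual-inhibition ODEs of Table 2 by forward Euler.

    dX/dt = k_sx / (1 + Y**p) - k_dx * X
    dY/dt = k_sy / (1 + X**q) - k_dy * Y
    All parameter values are illustrative assumptions, not fitted values.
    """
    x, y = x0, y0
    for _ in range(int(round(t_end / dt))):
        dx = k_sx / (1.0 + y ** p) - k_dx * x
        dy = k_sy / (1.0 + x ** q) - k_dy * y
        x, y = x + dt * dx, y + dt * dy
    return x, y


def mrna_pmf(m, k_sm=10.0, k_dm=1.0):
    """Poisson probability of m mRNA molecules, with lambda = k_sm / k_dm."""
    lam = k_sm / k_dm
    return math.exp(-lam) * lam ** m / math.factorial(m)


if __name__ == "__main__":
    # Boolean model: (1, 0) is a fixed point; (0, 0) flips to (1, 1).
    print(boolean_step(1, 0), boolean_step(0, 0))
    # ODE model: a small initial bias toward X is amplified by the mutual inhibition.
    print(simulate_ode(1.5, 0.5))
    # Stochastic model: mean mRNA copy number under the stationary distribution.
    print(sum(m * mrna_pmf(m) for m in range(101)))
```

With these assumed parameters the ODE model settles into a state with X high and Y low when X starts with the advantage, which mirrors the (1, 0) fixed point of the Boolean version; swapping the initial conditions gives the opposite state.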

Endocrine-Related Cancer. R Clarke et al. Multiscale modeling in cancer systems biology. 26:6 R355. https://erc.bioscientifica.com https://doi.org/10.1530/ERC-18-0309 © 2019 Society for Endocrinology. Published by Bioscientifica Ltd. Printed in Great Britain. Downloaded from Bioscientifica.com at 12/11/2019 05:39:20PM via Washington State University.

modeling where the assumptions of the model are higher level and less demanding of detailed kinetic information. For example, machine-learning techniques can learn the features of molecular networks and their relationships from the data. Bayesian approaches are common in this regard and are discussed below.

Deterministic and stochastic models

Deterministic models, defined usually by differential equations, produce specific outcomes for a given set of parameter values and initial conditions, without any evidence of randomness. In contrast, stochastic models evolve in time with significant random fluctuations (Singhania et al. 2011, Barik et al. 2016). For example, a gene regulatory network, where TFs regulate specific targets, could be modeled deterministically or stochastically. In a deterministic model, the rate of gene transcription would have a definite value determined by the activity of the transcription factor. In a stochastic model, the activity of the TF would determine only the propensity (probability per unit time) of transcribing the gene into an mRNA molecule. In this case, a stochastic model represents more accurately the noisy process of gene transcription in individual cells, but a deterministic model may capture adequately the average rate of expression of the gene over a population of cells responding to an external stimulus that is activating the TF. If we have data on the noise associated with gene transcription in individual cells, then a stochastic model may be warranted and needed. Stochastic models have been useful for exploring the dynamic responses of endocrine-related cancers (Jain et al. 2011, Chen et al. 2014, Morken et al. 2014). A deterministic model is simpler and more appropriate if we have only gross transcriptome data on populations of cells under constant conditions.

Bayesian models

A general objective of computational tools is to find patterns (correlation structures) within data. For example, with transcriptomic data an algorithm may look for patterns of changes in gene expression that are correlated with each other and with the phenotype(s) or function(s) of interest (Dutta et al. 2016, Anafi et al. 2017, Califano & Alvarez 2017). Some measure of the statistical strength of these correlations, using either a Bayesian (conditional probabilistic) or frequentist (parametric or non-parametric probabilistic) approach, is usually applied to help identify the associations most likely to be correct.

Whichever approach is selected, statistical models (Bayesian or frequentist) have assumptions that can be violated and parameters (even non-parametric probabilistic tools have parameters; these are not fixed in advance but obtained from the data) that can be affected by the data structure and that can influence performance. While it is not always evident which statistical model is most appropriate for the data being analyzed, understanding what the model outputs represent is important for correctly inferring biological meanings or appreciating the uses and limitations of the output.

An increasingly common approach for computational modeling is to build models that incorporate prior knowledge of the system (Tian et al. 2014b, 2015). Prior knowledge can be as simple as looking at the expression levels of genes already known to contribute to the phenotype, at known interactions among molecules such as protein–protein or protein–DNA interactions (PPIs or PDIs) or at relationships reported in canonical signaling pathway representations. Incorporation of prior knowledge, depending on the quality of the knowledge, can greatly improve the performance of algorithms to build Bayesian networks. Indeed, a major challenge in constructing Bayesian networks is the selection of appropriate prior probability distributions (priors) for the variables in the model. How these parameters are estimated for a Bayesian approach affects its outcomes (Lampinen & Vehtari 2001). Poorly estimated priors (relative to ground truth – which is often unknown) may provide fits to the data that are statistically acceptable and intuitively logical, but solutions that are, nonetheless, noisy and lead to incorrect biological interpretations. Influence of the prior can be reduced using Bayesian hierarchical models and robust priors (Berger 2010).

In Bayesian networks, the edges are directed but the sign is not specified. Consequently, whether the edge is positive (such as driving) or negative (such as inhibiting) must be inferred from sources external to the model and/or established experimentally. A further limitation is that edges cannot be interpreted as necessarily reflecting direct interactions. While some interactions may well be direct, latent variables can also create direct edges in the model solution where none exist in the biological system. For example, the predicted edge of A→B in the model may really be A→C→B (see also the discussion of modules and emergent behavior, above). Inferring feedback loops can also be difficult, such as A→C→B→A.

For gene network modeling, the quality of the knowledge and its incorporation into the selection of priors will improve the predictions. Two implications
follow from this observation. Firstly, a team with better biological understanding of a system may build a Bayesian-model based algorithm that outperforms others on the analysis of this specific system (because the model’s priors are more correctly defined by the team’s existing knowledge) but produces less robust/accurate predictions than other algorithms when it is applied to related systems. Secondly, detailed prior knowledge of a system limits what new knowledge can be discovered. The more that is understood about the system ahead of time, the better the model will perform. However, the model will be making predictions in a shrinking space where there is less new knowledge to be discovered. In reasonably well understood systems, these latter models may have most utility in building our confidence that what we believe to be true may indeed be true. In systems that are inadequately known, the new knowledge space can be large and the predictions noisy; the extent to which something is now believed to be true may require careful evaluation. Overall, the primary advantages of modeling include the ability to integrate significant amounts of knowledge, to help researchers to understand confounding events seen in the data and to answer questions of combinatorial complexity for which experimentation within the wet laboratory is prohibitive.

Error, performance and validation

Some workflows may include the output of one algorithm as a means to guide parameter estimation for another. For example, in building a gene regulatory network from expression data, an investigator could take the output predictions from a tool that predicts a TF and its targets as a means to define the priors for a Bayesian network modeling analysis of how these molecules are related in the data from a gene expression study. Intuitively, even if the TF output is statistically noisy, it might be expected to outperform a model with uninformed priors where equal probabilities are assigned to each outcome. Nonetheless, some of the predictions will be wrong and represent errors in the prior that may be worse than uninformative; these types of errors will be propagated from the output of one tool to the output of the next. Since the variables and their relationships (as captured in their priors) were thought to be intuitively correct, if these incorrect variables persist as key features of the Bayesian model solution, they could create the trap of self-fulfilling prophecy (Clarke et al. 2008). Predictions from one tool will also be associated with a level of error (variability), and this type of error will also propagate when the outputs are used as input variables for another tool in a workflow.

Here, error propagation represents the effects of the variability in the input variables on their respective functions and on model output (Mangado et al. 2016). Estimating (and reporting) uncertainty propagation and its implications is an important consideration in assessing model calibration and interpretation (Vanlier et al. 2012). Methods to estimate uncertainty propagation continue to be developed and applied (Ades & Lu 2003, Welton & Ades 2005, Dubois 2010, Moseley 2013, Mangado et al. 2016). In his discussion of error propagation in metabolomics studies, Moseley notes that both derived and propagated uncertainty should be reported along with the results (Moseley 2013).

Measurement errors, as they apply to the relationship between a measured variable and its covariate, are additive (Eckert et al. 1997). Integrative analyses across workflows in multiscale modeling, as may occur when combining data from DNA sequence, RNA sequence/abundance and/or PPI studies, include many relationships between the measured variables (such as mRNA and protein expression levels) and covariates (such as a clinical outcome or changes in phenotype). Such analyses may be prone to error propagation and to error additivity or even amplification. For example, agglomerative techniques (such as some hierarchical clustering), growing decision trees (such as some random forest methods) or the network propagation algorithms that have begun to attract increased attention (Cowen et al. 2017) may be sensitive to error propagation. Once an error (node-edge connection) is made during the graph build, it may remain and affect the accuracy of subsequent local connections and of the overall model solution. A build error that remains can lead to a model solution that reaches convergence and appears ‘globally correct’ but contains features that are ‘locally wrong’. The challenge here is that it is the local connections that are used to guide individual wet laboratory experiments.

Studies that apply bioinformatic/biostatistic tools to solve problems in large data spaces are likely to be at greatest risk of experiencing the various types of errors described above. The ‘hairball’ models often produced are rarely robustly tested for local error, especially when the global model fit provides an apparently minuscule P value. For example, independent datasets showing the same topologies are often not shown, frequently because the data are not available to do independent validation. The internal topology of individual cliques is rarely tested, even using a simple n-fold cross-validation. Global solutions are also rarely tested by an analogous n-fold cross-validation, such as removing entire cliques at random. Since the overall topology of the solution
is likely to be influenced by the relationships among discrete discovered features, without testing the effects of removing features on the remaining structures, there are few ways to determine topological robustness. While these ‘hairballs’ will likely have met the statistical requirements for global algorithmic convergence, how many of the local structures are correct, either internally within each feature or externally within the global solution, is often left to human intuition and the risks therein (Clarke et al. 2008).

Appropriate assessments of model robustness and validation are critical to the successful use of a systems biology approach (Steyerberg et al. 2001). There are many tools to assess model performance and validation and a detailed technical discussion is beyond our scope. Here, we use performance to denote assessments of the robustness or reproducibility of model predictions. For performance, biostatistical assessments of model fit are usually incorporated into the workflow. Examples of approaches to assess performance include use of a receiver operating characteristic analysis and estimates of the positive predictive value and negative predictive value. An internal n-fold cross-validation is commonly used, particularly when data are limited (Waljee et al. 2014). A random portion of the data is withheld at each iteration as a ‘validation set’, and the remaining data are used as a ‘training set’ for running the model. Multiple iterations are run and the performance for each iteration is compared to assess the overall model performance. A model can be tuned by adjusting its parameters until the predictions from the training and internal validation sets become sufficiently comparable. Since this approach can lead to model overfitting, the most informative assessment of model performance is obtained from the use of independent datasets not used in model building and any internal performance analyses. A robust model is expected to produce broadly similar predictions in all comparable data sets. For classification studies using human tumors, the use of independent datasets may also be the only tractable option for validating model predictions.

For models that are used to predict system function in a biological context, mechanistic or functional validation of a prediction is almost always required. Here, validation refers to experimental validation in the form of appropriate wet laboratory studies. These validation studies are often done in cell lines and/or animal models and can include applying perturbations to the experimental system and then measuring whether the changes predicted by the model occur. A common approach is to knock down a target gene in cells where it is overexpressed, overexpress the gene in cells where its expression is low and then determine if the biological function(s) is altered as the model predicts. Knockdown is commonly achieved by an RNAi method such as siRNA or shRNA transfection. A gene may also be eliminated using CRISPR (Yin et al. 2019). How often a cell totally loses a gene or its expression likely requires careful consideration. Total loss of a protein’s expression, as would usually occur with CRISPR, could alter a signaling feature in a manner that does not occur when expression is lowered but not eliminated in the phenotype(s) of interest. While CRISPR is often preferred over RNAi, for genes where downregulation rather than total loss is the primary biological observation, RNAi may offer a more physiologically relevant validation approach. A similar caveat applies to the use of cDNA transfection to produce overexpression of a gene. The level of overexpression may be outside the range seen in the phenotype(s) under study, and so also produce changes in network features that are not physiologically relevant. These types of events could lead to misinterpretations of the validation experiments. For example, the phenotype predicted by the in silico model is not observed, or further studies to determine the effects of the manipulation of a gene on signaling identify new relationships that are signaling artifacts from a physiological relevance perspective.

As an example of a biological validation strategy, consider a prediction by an in silico model that an antiestrogen should induce autophagy through altering expression of BECN1 in ER+ breast cancer cells. One approach to mechanistic validation of this prediction could be to apply the drug and its vehicle control to ER+ and ER− cells (negative control), measure changes in BECN1 and autophagy and then use a molecular approach to study if BECN1 knockdown or overexpression altered the regulatory effects of the antiestrogen on autophagy. An underappreciated challenge with these types of studies is that the experimental validation may be frustrated by a high proportion of intuitively rational, statistically significant, but biologically incorrect in silico model predictions (the wet lab validation experiments show the predictions to be invalid).

Modeling drug interactions

Another area of significant potential for a systems approach is the search for drug combinations for treating a specific cancer in the context of a multicomponent signaling network within the cancer cells (Tang & Aittokallio 2014,
Ryall & Tan 2015). Effective combination therapy, which is a hallmark of current cancer treatment, requires an adequate understanding of signal complexity. Developing and evaluating drug combinations is difficult because the complexity of the problem increases combinatorially with the number of constituent drugs proposed to address an integrated driver pathway of the cancer. When the possibility of sequencing drugs at different times relative to one another is added to the mix, complexity again increases dramatically. Progress has been made using a systems biology approach. For example, the joint effects of multidrug combinations can be evaluated based on the mechanisms of action of the drugs (Fitzgerald et al. 2006). If the constituent drugs in a combination therapy exert their effects through known mechanisms that feed into common pathways, the joint effect of the combination may be assessed by the ‘Loewe additivity’. If the drugs act non-exclusively on multiple targets, the effect may be assessed by the ‘Bliss additivity’ (Baeder et al. 2016). Knowledge of the biological system can be used for experimental design and data analysis. Thus, drugs with different mechanisms of action, as revealed by systems biology modeling, may exhibit different shapes of their dose–response relationships. Such information can be augmented by experimental data on a single drug to optimally design the experiments on the joint effect of the drug combinations.

Because the complexity of the problem increases rapidly with the number of constituent drugs, even the development of systems-based methods for the design and analysis of three-drug combinations has been only recent (Fang et al. 2017). The case of three-drug combinations is fundamentally more difficult than two-drug combinations. The doses of the combination, the number of combinations and the replicates needed to detect departures from additivity depend on the dose–response shapes of each of the constituent drugs. Thus, different classes of drugs with different dose–response shapes must be treated as separate cases. We designed and analyzed a combination study of three anticancer drugs (PD184, HA14-1 and CEP3891) that inhibit the H929 myeloma cell line. The three-drug combinations study used the original 4D dose–response surface formed by the dose ranges of the three drugs (Fang et al. 2017).

Methods for screening large numbers of drug combinations are being developed to reduce the problem to one that is more experimentally manageable by using the experimental data from dose–response studies of single drugs and from a few combinations along with a systems analysis of pathway/network information to obtain an estimate of the signaling network model parameters and the functional structure of the dose–response relationship (Fang et al. 2016). This model comprises a Hill equation for signals arriving at each receptor, a generic enzymatic rate equation to describe the transmission of signals among connecting genes, and a logistic equation to represent the cumulative effect of genes implicated in the onset of the cell death machinery. These statistical models generate a global drug sensitivity index based on the joint dose–response characteristics. Only the few terms with large global-sensitivity indices, much like principal components, are kept and subject to further experimental validation. Recently, the experimental design required for such subsequent experimentation has also been worked out (Fang et al. 2016, Huang et al. 2018).

An example of computational modeling: KDDN

Cancers are often characterized by dysregulation of molecular signaling (Barabasi et al. 2011, Tyson et al. 2011, Creixell et al. 2012). Significant rewiring of molecular networks can drive key phenotypic transitions that can occur in both a tumor and its microenvironment (Califano 2011, Roy et al. 2011, Ideker & Krogan 2012). The impact of a treatment can spread through the network and alter the activity of functionally relevant gene products (Roy et al. 2011, Creixell et al. 2012). Most molecular components exert their functions through interactions with other molecular components (Li et al. 2008, Gong & Miller 2013). How cancer cells differ from each other in their responses to environments or treatments is intrinsically context specific (Mitra et al. 2013) and identifying such differences may represent a ‘wicked’ problem for the research community (Rittle & Webber 1973, Courtney 2001, Clarke et al. 2011). Changes in molecular interdependencies across cancer phenotypes may reveal novel hub genes and pathways, which may be suitable targets for drug development. Instead of asking ‘which genes are differentially expressed?’ the question here is ‘which genes are differentially connected?’ (Hudson et al. 2009). Studies on network-attacking events will shed new light on whether network rewiring is a general principle of cancer cell responses, as most molecular therapies target proteins and their networks but not genes (Califano 2011). Novel hypotheses inferred from the rewired TFs and their distal enhancers or partners can be proposed and examined (Creixell et al. 2012, Mitra et al. 2013).

While multiscale omics data and the prior knowledge that provide insight into complex interactions are increasingly available, models and analysis methods to
functionally integrate this information are still sorely needed. In particular, systematic efforts to characterize selectively activated regulatory components and mechanisms must effectively distinguish significant network rewiring from random background fluctuations. Most published biological network inferences were obtained from molecular datasets acquired under a single condition, for which the statistically significant network rewiring across different conditions is unknown or unreported (Mitra et al. 2013). The inability to identify significant rewiring in biological networks represents a major limitation on the use of these results for molecular signaling studies. The Knowledge-fused Differential Dependency Network (KDDN) method has been developed to infer significant rewiring of complex biological dependency networks, via sparse modeling and data-knowledge integration (Zhang et al. 2009, 2011, Tian et al. 2013, 2014a,b, 2015). Specifically, KDDN formulates the inference of differential dependency networks (Zhang et al. 2009, 2011, Tian et al. 2014a) that incorporate both conditional data and prior knowledge as a convex optimization problem (Zhang & Wang 2010, Tian et al. 2011) and uses an efficient learning algorithm to jointly infer the conserved biological network and significant rewiring across different conditions (Tian et al. 2014b, 2015). KDDN uses a minimax strategy to maximize the benefit of prior knowledge while confining its negative impact under the worst-case scenario. Furthermore, KDDN matches the values of model parameters to the expected false-positive rates on network edges at a specified significance level and assesses edge-specific P values on each of the differential connections.

Tests on synthetic data have shown that KDDN produces biologically plausible results (Zhang et al. 2009, 2016, Herrington et al. 2018) and can reveal statistically significant rewiring in biological networks. The utility of KDDN is evident following its application to a variety of real gene and protein expression datasets including yeast cell lines (Tian et al. 2014b), breast cancer (Tian et al. 2014b), ovarian cancer (Zhang et al. 2016) and medulloblastoma (Tian et al. 2014a). The method efficiently leverages data-driven evidence and existing biological knowledge while remaining robust to false-positive edges in the prior knowledge. The network rewiring events identified by KDDN reflect previous studies in the literature and provide new mechanistic insight into the biological system(s) that extends beyond this earlier work.

To study how gene networks may rewire during the transition from normal to neoplastic breast cells, we have focused on understanding how ER+ breast cancer cells adapt to the stresses of endocrine-based therapies. Our central hypothesis invokes a gene network that coordinately regulates those functions of a cell module that determine and execute the cell’s fate decision. Using the KDDN tool, we identified three small topological features and then overlaid these onto the canonical apoptosis pathway from KEGG (Fig. 2). The largest of the three features reflected much of our prior knowledge, despite not explicitly incorporating this knowledge into the models (Zhang et al. 2009). Following the predictions of this topology, we uncovered some fundamentally new insights into molecular signaling; for example, the direct regulation of BCL2 by XBP1 and the requirement of NFκB for XBP1 signaling to regulate the prosurvival cell fate outcome in the context of antiestrogen treatment and resistance (Clarke et al. 2011, Tyson et al. 2011, Hu et al. 2015). In applying KDDN to data from a rodent model, we found that exposure to estrogens in utero induces a rewired network in the mammary glands of the offspring that predicts for resistance to endocrine therapies in tumors that arise in these glands during adulthood. Subsequent studies showing that tumors in these mammary glands are less responsive to tamoxifen (TAM) provided the first direct demonstration of why many ER+ breast cancers may be pre-programmed to fail to respond to TAM treatment or respond and later recur (Hilakivi-Clarke et al. 2017).

We further pursued the functional evidence of the hidden dependencies/crosstalk inferred by KDDN. For example, KDDN analysis of global protein expression data from 122 TCGA ovarian cancer samples (selected based on homologous recombination deficiency, HRD, a phenotype with distinct prognosis and response to therapies) resulted in a number of phenotype-dependent modules of co-expressed proteins. Several of the member proteins in the modules were known to be involved in histone modification. With the additional evidence of HRD status-dependent acetylation or deacetylation of histone proteins in the same samples, we were able, using patient population data, to support what has been shown in cells (Gong & Miller 2013, Tang et al. 2013) that histone protein acetylation affects the choice of DNA double-strand break repair pathways (between homologous recombination and non-homologous end-joining) (Zhang et al. 2016).

An example of mathematical modeling: ER landscape

Dynamic mathematical models track a system as it evolves in time. A key use of such models is to optimize
therapeutic protocols. For example, instead of applying a given drug or combination of drugs continuously for a specified overall duration, the drug(s) can be applied for fixed durations with rest intervals in between. Alternatively, several drugs can be applied in a repeating sequence for fixed durations. Optimizing the durations and dosing of drugs is a combinatorial problem that is difficult to solve experimentally, but relatively simple to solve via computer simulation, assuming an accurate dynamical model is available. Impressive results have been obtained in prostate cancer and glioblastoma using two-compartment models that simulate the temporal development of the sensitive and drug-resistant populations of cancer cells (Jain et al. 2011, Leder et al. 2014, Morken et al. 2014).

Figure 2 Differential dependency network focused on the KEGG apoptosis pathway (Kanehisa & Goto 2000). Recurrent breast cancers (uniquely featured by red edges) showed the imbalance between apoptosis and survival with only one route into the cell through IL1B-induced inhibition of proapoptotic CASP3. Non-recurrent breast cancer (uniquely featured by green edges) had a cascade of signaling pathways inside the cell that provides the balance between apoptosis and survival. Copyright Kanehisa Laboratories. Reproduced with permission from KEGG.

In the case of ER+ breast cancer and antiestrogens, the resistance character of the cells changes with time in response to the drugs. Hence, it is necessary first to model the dynamics of development of drug resistance in individual cells, then to model the dynamics of a population of treated cells by linking the cellular scale to the population scale, and finally to consider strategies for optimizing drug therapy. A proof of concept of this idea considered estrogen deprivation therapy (Chen et al. 2014). ER+ cells were presumed to exist in three different states: an estrogen-sensitive state (growth driven by the estrogen receptor bound to estrogen), an estrogen-hypersensitive state (growth driven by membrane-associated estrogen receptor (ERM) bound to estrogen) and an estrogen-independent state (growth driven by growth factor receptors (GFRs)). Transitions between the states were governed by the estrogen level (high, low, trace) in which the cells were grown. If cells were growing in a high (physiological) concentration of estrogen, most cells would transition to the estrogen-sensitive state. If the estrogen concentration dropped to a low level, sensitive cells would begin to die, but some would transition to a hypersensitive state and continue growing.

To model the transitions among these states, we developed a stochastic differential equation model of an individual cell. States were characterized in the model by ERM activity (high or low) and GFR activity (high or low). The model qualitatively matched observations in the literature concerning sensitivity transitions in breast cancer cells as the estrogen level was varied.
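
The idea of linking estrogen-dependent state transitions to the size of a treated cell population can be illustrated with a minimal sketch. It is not the stochastic differential equation model of Chen et al. (2014): it assumes a simple discrete-time update of expected cell counts in the three states, and every number in `TRANSITIONS` and `GROWTH`, as well as the function names `step` and `simulate`, is a hypothetical placeholder chosen only so that the example runs. Treatment steps are assumed to correspond to low estrogen and rest intervals to high estrogen.

```python
# Illustrative three-state population sketch (all parameter values are
# hypothetical placeholders, not values from Chen et al. 2014).
STATES = ("sensitive", "hypersensitive", "independent")

# Assumed per-step transition probabilities, keyed by estrogen level.
# Row i gives the probabilities of moving from state i to each state.
TRANSITIONS = {
    "high": [[0.98, 0.01, 0.01],
             [0.10, 0.89, 0.01],
             [0.05, 0.01, 0.94]],
    "low":  [[0.90, 0.09, 0.01],
             [0.01, 0.94, 0.05],
             [0.01, 0.02, 0.97]],
}

# Assumed net growth factor per step for each state (values below 1 mean net death).
GROWTH = {
    "high": [1.02, 1.00, 1.01],
    "low":  [0.90, 1.02, 1.01],
}


def step(pop, level):
    """Advance expected cell counts by one step at the given estrogen level."""
    # First apply net growth or death within each state.
    grown = [n * g for n, g in zip(pop, GROWTH[level])]
    # Then redistribute cells among states using the transition probabilities.
    matrix = TRANSITIONS[level]
    return [sum(grown[i] * matrix[i][j] for i in range(3)) for j in range(3)]


def simulate(t_treat, t_break, n_steps, initial=(1000.0, 0.0, 0.0)):
    """Cycle t_treat steps of deprivation ('low') with t_break steps of rest ('high').

    Returns the total expected population at every step, including step 0.
    """
    pop = list(initial)
    totals = [sum(pop)]
    period = t_treat + t_break
    for t in range(n_steps):
        level = "low" if (t % period) < t_treat else "high"
        pop = step(pop, level)
        totals.append(sum(pop))
    return totals


if __name__ == "__main__":
    # Compare a few hypothetical schedules by their final total population.
    for t_treat, t_break in [(10, 0), (10, 5), (5, 10)]:
        totals = simulate(t_treat, t_break, n_steps=200)
        print(t_treat, t_break, round(totals[-1], 1))
```

Tracking expected counts rather than individual cells keeps the sketch short; scanning `t_treat` and `t_break` over a grid would be one simple way to explore schedules, under the stated assumptions.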

The fact that resistance to estrogen deprivation was reversible if resistant cells were transferred back to estrogen-rich medium for a sufficiently long time was also captured. Using techniques from statistical physics, it is possible to visualize this model as a landscape upon which the system makes spontaneous transitions among three low-lying basins (Fig. 3A), which represent the three states of estrogen sensitivity. Random fluctuations in the cells can occasionally cause transitions from one basin to another, representing the natural heterogeneity seen in a cell population. However, the system typically resides in the lowest basin, as determined by the estrogen level. It is not efficient to simulate large numbers of these ‘model cells’ for long periods of time in order to compute how a population would evolve in response to changes in estrogen dose. To circumvent this problem, a cell-level model was used to compute the transition probabilities among states as a function of estrogen concentration. These probabilities were then used to create a population model that efficiently tracked the number of cells in each state.

A treatment regimen consisting of cycles of estrogen deprivation followed by a drug holiday was considered, and the deprivation and break durations were optimized to drive the cancer cell population as low as possible. Results are shown in Fig. 3B and C for the situation where the cancer population is initially 1000 cells. For the parameters in the model, the cancer cannot be eradicated. However, over a suitable range of therapeutic parameters, the disease can be kept in check (similar to increasing duration of the recurrence-free survival period).

This example provides a possible roadmap for how modeling a molecular understanding of the response of a cancer cell to a drug can be transitioned to a tissue-level model and used for therapy optimization. While the situation in patients is certainly more complicated than the model systems described here, the success of simple compartment models to guide therapy in simulated tumors provides hope that more complicated, molecularly-based, multiscale models will ultimately be useful in guiding therapy in the clinic.

Interpreting models: caveat lector

‘A little learning is a dang’rous thing; drink deep, or taste not the Pierian spring: there shallow draughts intoxicate the brain, and drinking largely sobers us again.’
Alexander Pope (1688–1744)

The qualitative and quantitative models that we have described above produce results that can be difficult to interpret correctly and usefully. Correct interpretation is important, of course, because no one wants to spend time and precious experimental resources failing to validate an incorrect understanding of the results of a computational and/or mathematical analysis of a cellular

Figure 3 (A) The estrogen-response landscape for a particular level of estrogen stimulation. There are four basins of attraction for the cell state corresponding to sensitive (ERM−/GFR−), hypersensitive (ERM+/GFR−) and independent (GFR+). (B) A sample intermittent treatment regimen (top panel) produces varying proportions of cells in different states (second panel; cyan = sensitive, green = hypersensitive, blue = independent) and a varying proliferation index of the overall cell population (third panel; yellow indicates death and red indicates growth). The overall population level, starting from 1000 cells, is shown in the bottom panel. (C) Plot of the average value of cell number over the interval $t \in (2 \times 10^{4},\ 3 \times 10^{4})$ as a function of $T_{\mathrm{treat}}$ and $T_{\mathrm{break}}$. The white dot indicates the case in (B). Any combination of $T_{\mathrm{treat}}$ and $T_{\mathrm{break}}$ that puts the system within the $\log_{10} N = 3$ contour will suppress cancer growth. This figure is adapted, with permission, from Figs 3 and 6 of Chen et al. (2014).
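The population-level step summarized in Fig. 3B and C can be illustrated with a small compartment model. The sketch below tracks cell numbers in the three states under a periodic deprivation/break schedule and scans a grid of schedules for the lowest late-time average population. It is a minimal sketch under stated assumptions, not the model of Chen et al. (2014): the linear rate equations, the explicit Euler integration, the late-time averaging window, the function names, and every number in `GROWTH_*` and `SWITCH_*` are hypothetical placeholders chosen only so that the example runs.

```python
import numpy as np

STATES = ("sensitive", "hypersensitive", "independent")

# Hypothetical per-day rates (NOT taken from Chen et al. 2014).
GROWTH_TREAT = [-0.05, 0.01, 0.03]   # net growth during estrogen deprivation
GROWTH_BREAK = [0.03, -0.04, -0.01]  # net growth during the drug holiday
SWITCH_TREAT = [[0, 0.005, 0], [0, 0, 0.002], [0, 0, 0]]  # drift toward resistance
SWITCH_BREAK = [[0, 0, 0], [0.01, 0, 0], [0, 0.005, 0]]   # resistance reverses


def rate_matrix(growth, switch):
    """Return A such that dN/dt = A @ N; switch[i][j] is the i -> j rate."""
    switch = np.array(switch, dtype=float)
    np.fill_diagonal(switch, 0.0)
    growth = np.asarray(growth, dtype=float)
    # inflow from other states - outflow to other states + net growth
    return switch.T - np.diag(switch.sum(axis=1)) + np.diag(growth)


def simulate(n0, a_treat, a_break, t_treat, t_break, t_end, dt=0.5):
    """Explicit-Euler integration under a periodic treat/break schedule."""
    n = np.array(n0, dtype=float)
    period = t_treat + t_break
    steps = int(round(t_end / dt))
    totals = np.empty(steps + 1)
    totals[0] = n.sum()
    for k in range(steps):
        in_treatment = (k * dt) % period < t_treat
        a = a_treat if in_treatment else a_break
        n = n + dt * (a @ n)
        totals[k + 1] = n.sum()
    return n, totals


def scan_schedules(n0, a_treat, a_break, treat_grid, break_grid, t_end):
    """Late-time average population for every (t_treat, t_break) pair."""
    avg = np.empty((len(treat_grid), len(break_grid)))
    for i, t_treat in enumerate(treat_grid):
        for j, t_break in enumerate(break_grid):
            _, totals = simulate(n0, a_treat, a_break, t_treat, t_break, t_end)
            avg[i, j] = totals[len(totals) // 2:].mean()  # second half of the run
    i, j = np.unravel_index(np.argmin(avg), avg.shape)
    return avg, (treat_grid[i], break_grid[j])


A_TREAT = rate_matrix(GROWTH_TREAT, SWITCH_TREAT)
A_BREAK = rate_matrix(GROWTH_BREAK, SWITCH_BREAK)

if __name__ == "__main__":
    grid = [30, 60, 120, 240]  # schedule durations in days, hypothetical
    avg, best = scan_schedules([1000, 0, 0], A_TREAT, A_BREAK, grid, grid, t_end=2000)
    print("late-time average population (rows: T_treat, cols: T_break)")
    print(np.round(avg, 1))
    print("schedule with the lowest average:", best)
```

Because the per-state transition and growth rates are supplied as plain matrices, rates computed from a cell-level model at each estrogen level could in principle be substituted for the placeholders without changing the rest of the sketch.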

control network. For example, gene set enrichment is a powerful tool to explore high-dimensional data sets (Subramanian et al. 2005), but how its results are interpreted and used requires a thorough understanding of what the results do and do not imply. In this case, gene set enrichment analysis provides a static representation of canonical signaling pathways, which are highly idealized views of the most frequently observed events in a signal transduction pathway. However, once an event is identified and reported, it is more likely to be studied further and eventually to be considered as being canonical. Moreover, these canonical signaling maps (examples include KEGG and Biocarta) are often assembled from a variety of sources (cell types, tissues, and species). Consequently, these graphical representations may be relevant only in part to the signaling processes under consideration in the specific cell context that a researcher is studying experimentally and trying to model computationally.

Researchers are often limited to applying reductionist wet laboratory technologies to validate the predictions of models that attempt to explain some or all of the complexity in the biological system under investigation. Often, the cost in time and resources needed to validate experimentally the predictions of multiscale models can be prohibitive, making the ability to select among multiple solutions a necessity. Most algorithms, given input variables in the correct format, will produce outputs/predictions, but these often include false positives and false negatives that are not easily identified.

While model outputs are usually associated with probability estimations, the results of any significance tests generally provide an evaluation only of how well the model fits the available data. This statistical evaluation is not necessarily an estimate of how well the predictions reflect biological truth. Moreover, when sorting through multiple apparently statistically significant predictions, an investigator can be left relying on subjective intuition, perhaps guided by an incomplete, inadequate, or incorrect understanding of the system. Since model predictions should generally be consistent with the experimental data and/or the (sometimes) limited knowledge of the system currently available, the trap of self-fulfilling prophecy becomes almost unavoidable (Clarke et al. 2008).

A gene set enrichment algorithm may produce several predicted pathways and functions associated with a single set of differentially expressed genes. While some genes can certainly participate in more than one pathway or regulate more than one cellular function, the investigator must determine which output(s) (which pathway, module, or function) is most likely to represent the truth. A statistical assessment, usually a P value, accompanies each model prediction, and understanding what these assessments represent is important in evaluating the results. It is not unusual for a model to provide many predictions for which the P values are small (highly statistically significant), but how easily or appropriately these statistical estimates can be used to guide biological interpretation is not always clear. Primarily, P values reflect how well each model output fits the data input, subject to the parameters and assumptions in the statistical model used. Thus, the use of different statistical tools with the same input gene list and the same database may give different outputs, or the same outputs with different P values, because the parameters and assumptions in each statistical model are different. Also, if some pathways in the database are larger, better annotated, or more fully (and correctly) understood than others, the P values associated with these pathways could be smaller (implying a more statistically significant fit) than those of less well represented pathways that may be a better reflection of the underlying biological truth.

When the decision as to which is likely to be the correct solution is left entirely to intuition, it is not surprising that the solution that best supports the current hypothesis, or that is most easily explained by the operator’s existing knowledge, is often selected over other statistically significant outputs that are not easily understood or may even refute the hypothesis. In such cases, the investigator is likely to fall into the trap of self-fulfilling prophecy (Clarke et al. 2008). To be able to interpret model outputs appropriately, it is often critical to understand both what the data represent and some of the basic principles of how the model works. This prior knowledge is particularly important when the correct interpretation is counterintuitive or inconsistent with the hopes or expectations of the study designers.

Cell context, by which we mean the unique patterns of genes, proteins, and metabolites that are expressed in a cell and that interact to influence the physiology of that cell (Clarke & Brünner 1996), is one of the central determinants of how signaling and function are related in biological systems. Context is clearly related to the cell/tissue type, local microenvironment, status of the host, and other external and internal influences. Some aspects of cellular context may be highly conserved, as can be seen from the DNA sequence of some genes through to the basic signaling topology of some highly conserved functions. Nonetheless, there can also be substantial diversity, even within closely related species, tissues, or cells. For cancer research, the differences between the normal and neoplastic state in the same tissue or cell type

is where we look most often for molecular targets that can be diagnostic, prognostic, and/or therapeutic. Here, even small changes in cell context can have substantial implications for the ability to address a specific hypothesis. Despite the often fundamental importance of cellular context, it is frequently ignored.

Some modeling approaches are of limited utility either scientifically or clinically and need to be re-addressed. The complex hairball models often generated by some computational tools may (or may not) contain valid insights into regulatory biology. However, their complexity can be so high and the noise sufficiently extensive and the errors undefined that these models cannot be tested meaningfully or interpreted reliably. Employment of a razor to shave away those components that do not add to the utility, robustness, or accuracy of a model may be desirable, but only if its application is tractable and it is evident what ‘whiskers’ can be removed without a significant loss of predictive power. Proactively incorporating feature elimination tools during modeling (for example, applying a support vector machine with recursive feature elimination for classification; Guyon et al. 2003) may help to address this concern by attempting to arrive at the smallest model that meets predetermined requirements of convergence and statistical significance. Nonetheless, the need for human intuition to interpret outcomes remains central to many study designs, and, consequently, the risk of falling into the trap of self-fulfilling prophecy must be carefully avoided (Clarke et al. 2008).

Future directions

The properties of high-dimensional data spaces, and the challenges and opportunities these provide (Clarke et al. 2008), remain central to the performance of many computational modeling approaches and bioinformatic tools and workflows. Tools designed to manage these properties explicitly, such as support vector machines, and workflows to address high dimensionality, such as including dimensionality reduction as a preprocessing step in data analysis, are likely to remain in use. New and more powerful tools and workflows are likely to continue to emerge, increasing the power and accuracy of predictive models, the quality and accuracy of data interpretation, and the utility of the new knowledge gained. Deep learning, a subset of approaches within the broader field of machine learning that generally applies neural network-based modeling, has gained recent attention as a potentially powerful approach to extracting new features from high-dimensional data spaces (Hosny et al. 2018, Zou et al. 2019). It is likely that deep-learning approaches will be more commonly applied in the near future to guide knowledge discovery within the framework of cancer systems biology.

Another area that has attracted renewed interest is the heterogeneity arising from the presence of multiple cell types and the consequent complexity of interactions within tumor microenvironments. For molecular signaling, a key issue in this context is whether the events identified as being associated with a biological outcome or phenotype are intrinsic or extrinsic to the cancer cells and/or other cells within the microenvironment. While most therapeutic interventions attempt to induce cell death programs that are executed within the cancer cell (intrinsic), many of the signals that initiate this intrinsic activity are generated by activities originating in stromal or immune cells (extrinsic). Single-cell RNAseq can address some of these issues, but this is not always feasible and the technology has its own limitations (Cheng et al. 2014a, Saliba et al. 2014). Moreover, many public omics datasets are populated with data representing averaged signals from multiple cell types, as might be expected from a study that used tumor biopsies as the primary material. Some form of data deconvolution is then required. Tools to achieve deconvolution continue to emerge, but for many of these datasets the tools must be effective when applied in an unsupervised manner because data that could supervise the analysis are often absent. Tools that can accurately and robustly perform unsupervised data deconvolution are likely to become more widely used in the near future.

The application of systems biology approaches to critical questions in endocrine-related and other cancers may provide new insights into cancer biology and lead to new treatments. Many signaling networks and the biological processes that they regulate often prove to be too complex for biostatistics, bioinformatics or mathematical biology alone to unravel. However, the integrated use of these approaches can support the building of predictive multiscale models from a systems perspective. The virtuous cycle of in silico model prediction, validation in appropriate wet laboratory experiments, with validated results feeding back to improve model predictions, can then drive new discovery of complex systems in a manner that often outstrips intuitive reasoning. In those cancers where hormone and growth factor receptors and their signaling play a major role, systems approaches may offer the best means to address the complexity and dynamic nature of signaling and how it responds to therapeutic

interventions that affect the cancer cells and their interactions within their microenvironments.

Declaration of interest

Robert Clarke is an Associate Editor of Endocrine-Related Cancer. Robert Clarke was not involved in the review or editorial process for this paper, on which he is listed as an author. The other authors have nothing to disclose.

Funding

This work was supported in part by Public Health Service Awards U01-CA184902, U54-CA149147, and DoD-BCRP-CA171885 to R Clarke, the Georgetown-Lombardi Comprehensive Cancer Center grant (P30-CA51008-19), R01-CA164717 to M Tan, and R01-CA201092 to W T Baumann.

References

Ades AE & Lu G 2003 Correlations between parameters in risk models: estimation and propagation of uncertainty by Markov chain Monte Carlo. Risk Analysis 23 1165–1172. (https://doi.org/10.1111/j.0272-4332.2003.00386.x)
Aldridge BB, Burke JM, Lauffenburger DA & Sorger PK 2006 Physicochemical modelling of cell signalling pathways. Nature Cell Biology 8 1195–1203. (https://doi.org/10.1038/ncb1497)
Altrock PM, Liu LL & Michor F 2015 The mathematics of cancer: integrating quantitative models. Nature Reviews: Cancer 15 730–745. (https://doi.org/10.1038/nrc4029)
Anafi RC, Francey LJ, Hogenesch JB & Kim J 2017 CYCLOPS reveals human transcriptional rhythms in health and disease. PNAS 114 5312–5317. (https://doi.org/10.1073/pnas.1619320114)
Anderson AR & Quaranta V 2008 Integrative mathematical oncology. Nature Reviews: Cancer 8 227–234. (https://doi.org/10.1038/nrc2329)
Baeder DY, Yu G, Hoze N, Rolff J & Regoes RR 2016 Antimicrobial combinations: Bliss independence and Loewe additivity derived from mechanistic multi-hit models. Philosophical Transactions of the Royal Society of London: Series B, Biological Sciences 371 20150294. (https://doi.org/10.1098/rstb.2015.0294)
Barabasi AL, Gulbahce N & Loscalzo J 2011 Network medicine: a network-based approach to human disease. Nature Reviews: Genetics 12 56–68. (https://doi.org/10.1038/nrg2918)
Barberis M & Verbruggen P 2017 Quantitative systems biology to decipher design principles of a dynamic cell cycle network: the ‘Maximum Allowable mammalian Trade-Off-Weight’ (MAmTOW). NPJ Systems Biology and Applications 3 26. (https://doi.org/10.1038/s41540-017-0028-x)
Barik D, Ball DA, Peccoud J & Tyson JJ 2016 A stochastic model of the yeast cell cycle reveals roles for feedback regulation in limiting cellular variability. PLoS Computational Biology 12 e1005230. (https://doi.org/10.1371/journal.pcbi.1005230)
Bedard PL, Mook S, Piccart-Gebhart MJ, Rutgers ET, Van’t Veer LJ & Cardoso F 2009 MammaPrint 70-gene profile quantifies the likelihood of recurrence for early breast cancer. Expert Opinion on Medical Diagnostics 3 193–205. (https://doi.org/10.1517/17530050902751618)
Begley CG 2013 Six red flags for suspect work. Nature 497 433–434. (https://doi.org/10.1038/497433a)
Berger JO 2010 Statistical Decision Theory and Bayesian Analysis. New York, NY, USA: Springer-Verlag.
Bhalla US & Iyengar R 1999 Emergent properties of networks of biological signaling pathways. Science 283 381–387. (https://doi.org/10.1126/science.283.5400.381)
Brazma A, Krestyaninova M & Sarkans U 2006 Standards for systems biology. Nature Reviews: Genetics 7 593–605. (https://doi.org/10.1038/nrg1922)
Buettner F, Natarajan KN, Casale FP, Proserpio V, Scialdone A, Theis FJ, Teichmann SA, Marioni JC & Stegle O 2015 Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells. Nature Biotechnology 33 155–160. (https://doi.org/10.1038/nbt.3102)
Califano A 2011 Rewiring makes the difference. Molecular Systems Biology 7 463. (https://doi.org/10.1038/msb.2010.117)
Califano A & Alvarez MJ 2017 The recurrent architecture of tumour initiation, progression and drug sensitivity. Nature Reviews: Cancer 17 116–130. (https://doi.org/10.1038/nrc.2016.124)
Chen L, Chan TH, Choyke PL, Hillman EM, Chi CY, Bhujwalla ZM, Wang G, Wang SS, Szabo Z & Wang Y 2011 CAM-CM: a signal deconvolution tool for in vivo dynamic contrast-enhanced imaging of complex tissues. Bioinformatics 27 2607–2609. (https://doi.org/10.1093/bioinformatics/btr436)
Chen C, Baumann WT, Clarke R & Tyson JJ 2013 Modeling the estrogen receptor to growth factor receptor signaling switch in human breast cancer cells. FEBS Letters 587 3327–3334. (https://doi.org/10.1016/j.febslet.2013.08.022)
Chen C, Baumann WT, Xing J, Xu L, Clarke R & Tyson JJ 2014 Mathematical models of the transitions between endocrine therapy responsive and resistant states in breast cancer. Journal of the Royal Society, Interface 11 20140206. (https://doi.org/10.1098/rsif.2014.0206)
Cheng WY, Ou Yang TH & Anastassiou D 2013 Development of a prognostic model for breast cancer survival in an open challenge environment. Science Translational Medicine 5 181ra50. (https://doi.org/10.1126/scitranslmed.3005974)
Cheng F, Jia P, Wang Q, Lin CC, Li WH & Zhao Z 2014a Studying tumorigenesis through network evolution and somatic mutational perturbations in the cancer interactome. Molecular Biology and Evolution 31 2156–2169. (https://doi.org/10.1093/molbev/msu167)
Cheng F, Jia P, Wang Q & Zhao Z 2014b Quantitative network mapping of the human kinome interactome reveals new clues for rational kinase inhibitor discovery and individualized cancer therapy. Oncotarget 5 3697–3710. (https://doi.org/10.18632/oncotarget.1984)
Cirillo E, Parnell LD & Evelo CT 2017 A review of pathway-based analysis tools that visualize genetic variants. Frontiers in Genetics 8 174. (https://doi.org/10.3389/fgene.2017.00174)
Clarke R & Brünner N 1996 Acquired estrogen independence and antiestrogen resistance in breast cancer: estrogen receptor-driven phenotypes? Trends in Endocrinology and Metabolism 7 291–301. (https://doi.org/10.1016/S1043-2760(96)00127-0)
Clarke R, Ressom HW, Wang A, Xuan J, Liu MC, Gehan EA & Wang Y 2008 The properties of very high dimensional data spaces: implications for exploring gene and protein expression data. Nature Reviews: Cancer 8 37–49. (https://doi.org/10.1038/nrc2294)
Clarke R, Shajahan AN, Wang Y, Tyson JJ, Riggins RB, Weiner LM, Baumann WT, Xuan J, Zhang B, Facey C, et al. 2011 Endoplasmic reticulum stress, the unfolded protein response, and gene network modeling in antiestrogen resistant breast cancer. Hormone Molecular Biology and Clinical Investigation 5 35–44. (https://doi.org/10.1515/hmbci.2010.073)
Clarke R, Cook KL, Hu R, Facey CO, Tavassoly I, Schwartz JL, Baumann WT, Tyson JJ, Xuan J, Wang Y, et al. 2012 Endoplasmic reticulum stress, the unfolded protein response, autophagy, and the integrated regulation of breast cancer cell fate. Cancer Research 72 1321–1331. (https://doi.org/10.1158/0008-5472.CAN-11-3213)
Courtney JF 2001 Decision making and knowledge management in inquiring organizations: towards a new decision-making paradigm for DSS. Decision Support Systems 31 17–38. (https://doi.org/10.1016/S0167-9236(00)00117-2)

Cowen L, Ideker T, Raphael BJ & Sharan R 2017 Network propagation: a universal amplifier of genetic associations. Nature Reviews: Genetics 18 551–562. (https://doi.org/10.1038/nrg.2017.38)
Creixell P, Schoof EM, Erler JT & Linding R 2012 Navigating cancer network attractors for tumor-specific therapy. Nature Biotechnology 30 842–848. (https://doi.org/10.1038/nbt.2345)
Creixell P, Reimand J, Haider S, Wu G, Shibata T, Vazquez M, Mustonen V, Gonzalez-Perez A, Pearson J, Sander C, et al. 2015 Pathway and network analysis of cancer genomes. Nature Methods 12 615–621. (https://doi.org/10.1038/nmeth.3440)
Deisboeck TS, Wang Z, Macklin P & Cristini V 2011 Multiscale cancer modeling. Annual Review of Biomedical Engineering 13 127–155. (https://doi.org/10.1146/annurev-bioeng-071910-124729)
Di Ventura B, Lemerle C, Michalodimitrakis K & Serrano L 2006 From in vivo to in silico biology and back. Nature 443 527–533. (https://doi.org/10.1038/nature05127)
Dimitrova N, Nagaraj AB, Razi A, Singh S, Kamalakaran S, Banerjee N, Joseph P, Mankovich A, Mittal P, DiFeo A, et al. 2017 InFlo: a novel systems biology framework identifies cAMP-CREB1 axis as a key modulator of platinum resistance in ovarian cancer. Oncogene 36 2472–2482. (https://doi.org/10.1038/onc.2016.398)
Dubois D 2010 Representation, propagation, and decision issues in risk analysis under incomplete probabilistic information. Risk Analysis 30 361–368. (https://doi.org/10.1111/j.1539-6924.2010.01359.x)
Dutta A, Le Magnen C, Mitrofanova A, Ouyang X, Califano A & Abate-Shen C 2016 Identification of an NKX3.1-G9a-UTY transcriptional regulatory network that controls prostate differentiation. Science 352 1576–1580. (https://doi.org/10.1126/science.aad9512)
Eckert RS, Carroll RJ & Wang N 1997 Transformations to additivity in measurement error models. Biometrics 53 262–272. (https://doi.org/10.2307/2533112)
Enriquez-Navas PM, Wojtkowiak JW & Gatenby RA 2015 Application of evolutionary principles to cancer therapy. Cancer Research 75 4675–4680. (https://doi.org/10.1158/0008-5472.CAN-15-1337)
Fang HB, Huang H, Clarke R & Tan M 2016 Predicting multi-drug inhibition interactions based on signaling networks and single drug dose-response information. Journal of Computational Systems Biology 2 101.
Fang HB, Chen X, Pei XY, Grant S & Tan M 2017 Experimental design and statistical analysis for three-drug combination studies. Statistical Methods in Medical Research 26 1261–1280. (https://doi.org/10.1177/0962280215574320)
Fitzgerald JB, Schoeberl B, Nielsen UB & Sorger PK 2006 Systems biology and combination therapy in the quest for clinical efficacy. Nature Chemical Biology 2 458–466. (https://doi.org/10.1038/nchembio817)
Galea MH, Blamey RW, Elston CE & Ellis IO 1992 The Nottingham prognostic index in primary breast cancer. Breast Cancer Research and Treatment 22 207–219. (https://doi.org/10.1007/BF01840834)
Gallaher J, Babu A, Plevritis S & Anderson ARA 2014 Bridging population and tissue scale tumor dynamics: a new paradigm for understanding differences in tumor growth and metastatic disease. Cancer Research 74 426–435. (https://doi.org/10.1158/0008-5472.CAN-13-0759)
Gehlenborg N, O'Donoghue SI, Baliga NS, Goesmann A, Hibbs MA, Kitano H, Kohlbacher O, Neuweger H, Schneider R, Tenenbaum D, et al. 2010 Visualization of omics data for systems biology. Nature Methods 7 S56–S68. (https://doi.org/10.1038/nmeth.1436)
Gho YS & Lee C 2017 Emergent properties of extracellular vesicles: a holistic approach to decode the complexity of intercellular communication networks. Molecular BioSystems 13 1291–1296. (https://doi.org/10.1039/c7mb00146k)
Ghosh S, Matsuoka Y, Asai Y, Hsin KY & Kitano H 2011 Software for systems biology: from tools to integrated platforms. Nature Reviews: Genetics 12 821–832. (https://doi.org/10.1038/nrg3096)
Gong F & Miller KM 2013 Mammalian DNA repair: HATs and HDACs make their mark through histone acetylation. Mutation Research 750 23–30. (https://doi.org/10.1016/j.mrfmmm.2013.07.002)
GTEx Consortium 2015 Human genomics. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science 348 648–660. (https://doi.org/10.1126/science.1262110)
Guyon J, Weston J, Barnhill MD & Vapnik V 2003 Gene selection for cancer classification using support vector machines. Machine Learning 46 389–422.
Hart Y, Sheftel H, Hausser J, Szekely P, Ben-Moshe NB, Korem Y, Tendler A, Mayo AE & Alon U 2015 Inferring biological tasks using Pareto analysis of high-dimensional data. Nature Methods 12 233–235. (https://doi.org/10.1038/nmeth.3254)
Hatzis C, Bedard PL, Birkbak NJ, Beck AH, Aerts HJ, Stern DF, Shi L, Clarke R, Quackenbush J & Haibe-Kains B 2014 Enhancing reproducibility in cancer drug screening: how do we move forward? Cancer Research 74 4016–4023. (https://doi.org/10.1158/0008-5472.CAN-14-0725)
Herrington DM, Mao C, Parker SJ, Fu Z, Yu G, Chen L, Venkatraman V, Fu Y, Wang Y, Howard TD, et al. 2018 Proteomic architecture of human coronary and aortic atherosclerosis. Circulation 137 2741–2756. (https://doi.org/10.1161/CIRCULATIONAHA.118.034365)
Hilakivi-Clarke LA, Wärri A, Bouker KB, Zhang X, Cook KL, Jin L, Zwart A, Nguyen N, Hu R, Cruz MI, et al. 2017 Effects of in utero exposure to ethinyl estradiol on tamoxifen resistance and breast cancer recurrence in a preclinical model. Journal of the National Cancer Institute 109 djw188. (https://doi.org/10.1093/jnci/djw188)
Hoadley KA, Yau C, Wolf DM, Cherniack AD, Tamborero D, Ng S, Leiserson MDM, Niu B, McLellan MD, Uzunangelov V, et al. 2014 Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin. Cell 158 929–944. (https://doi.org/10.1016/j.cell.2014.06.049)
Hofree M, Shen JP, Carter H, Gross A & Ideker T 2013 Network-based stratification of tumor mutations. Nature Methods 10 1108–1115. (https://doi.org/10.1038/nmeth.2651)
Hosny A, Parmar C, Coroller TP, Grossmann P, Zeleznik R, Kumar A, Bussink J, Gillies RJ, Mak RH & Aerts HJWL 2018 Deep learning for lung cancer prognostication: a retrospective multi-cohort radiomics study. PLoS Medicine 15 e1002711. (https://doi.org/10.1371/journal.pmed.1002711)
Hu R, Warri A, Jin L, Zwart A, Riggins R & Clarke R 2015 NFkappaB signaling is required for XBP1 (U and S) mediated effects on antiestrogen responsiveness and cell fate decisions in breast cancer. Molecular and Cellular Biology 35 390. (https://doi.org/10.1128/MCB.00847-14)
Huang HZ, Fang HB & Tan MT 2018 Experimental designs for multidrug combination studies using signaling networks. Biometrics 74 538–547. (https://doi.org/10.1111/biom.12777)
Hudson NJ, Reverter A & Dalrymple BP 2009 A differential wiring analysis of expression data correctly identifies the gene containing the causal mutation. PLoS Computational Biology 5 e1000382. (https://doi.org/10.1371/journal.pcbi.1000382)
Ideker T & Krogan NJ 2012 Differential network biology. Molecular Systems Biology 8 565. (https://doi.org/10.1038/msb.2011.99)
Imamov O, Shim GJ, Warner M & Gustafsson JA 2005 Estrogen receptor beta in health and disease. Biology of Reproduction 73 866–871. (https://doi.org/10.1095/biolreprod.105.043497)
Jain HV, Clinton SK, Bhinder A & Friedman A 2011 Mathematical modeling of prostate cancer progression in response to androgen ablation therapy. PNAS 108 19701–19706. (https://doi.org/10.1073/pnas.1115750108)
Janes KA, Chandran PL, Ford RM, Lazzara MJ, Papin JA, Peirce SM, Saucerman JJ & Lauffenburger DA 2017 An engineering design approach to systems biology. Integrative Biology 9 574–583. (https://doi.org/10.1039/C7IB00014F)
Ji Z, Yan K, Li W, Hu H & Zhu X 2017 Mathematical and computational modeling in complex biological systems. BioMed Research International 2017 5958321. (https://doi.org/10.1155/2017/5958321)

Jinawath N, Bunbanjerdsuk S, Chayanupatkul M, Ngamphaiboon N, Asavapanumas N, Svasti J & Charoensawan V 2016 Bridging the gap between clinicians and systems biologists: from network biology to translational biomedical research. Journal of Translational Medicine 14 324. (https://doi.org/10.1186/s12967-016-1078-3)
Junttila MR & de Sauvage FJ 2013 Influence of tumour micro-environment heterogeneity on therapeutic response. Nature 501 346–354. (https://doi.org/10.1038/nature12626)
Kanehisa M & Goto S 2000 KEGG: Kyoto Encyclopedia of genes and genomes. Nucleic Acids Research 28 27–30. (https://doi.org/10.1093/nar/28.1.27)
Kanter I & Kalisky T 2015 Single cell transcriptomics: methods and applications. Frontiers in Oncology 5 53. (https://doi.org/10.3389/fonc.2015.00053)
Keenan AB, Jenkins SL, Jagodnik KM, Koplev S, He E, Torre D, Wang Z, Dohlman AB, Silverstein MC, Lachmann A, et al. 2018 The library of integrated network-based cellular signatures NIH program: system-level cataloging of human cells response to perturbations. Cell Systems 6 13–24. (https://doi.org/10.1016/j.cels.2017.11.001)
Kimura A, Celani A, Nagao H, Stasevich T & Nakamura K 2015 Estimating cellular parameters through optimization procedures: elementary principles and applications. Frontiers in Physiology 6 60. (https://doi.org/10.3389/fphys.2015.00060)
Lampinen J & Vehtari A 2001 Bayesian approach for neural networks – review and case studies. Neural Networks 14 257–274. (https://doi.org/10.1016/S0893-6080(00)00098-8)
Lavrik IN & Zhivotovsky B 2014 Systems biology: a way to make complex problems more understandable. Cell Death and Disease 5 e1256. (https://doi.org/10.1038/cddis.2014.195)
Le NN 2007 The long journey to a systems biology of neuronal function. BMC Systems Biology 1 28. (https://doi.org/10.1186/1752-0509-1-28)
Leder K, Pitter K, Laplant Q, Hambardzumyan D, Ross BD, Chan TA, Holland EC & Michor F 2014 Mathematical modeling of PDGF-driven glioblastoma reveals optimized radiation dosing schedules. Cell 156 603–616. (https://doi.org/10.1016/j.cell.2013.12.029)
Ledzewicz U & Schaettler H 2016 Optimizing chemotherapeutic anti-cancer treatment and the tumor microenvironment: an analysis of mathematical models. Advances in Experimental Medicine and Biology 936 209–223. (https://doi.org/10.1007/978-3-319-42023-3_11)
Leiserson MD, Vandin F, Wu HT, Dobson JR, Eldridge JV, Thomas JL, Papoutsaki A, Kim Y, Niu B, McLellan M, et al. 2015 Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes. Nature Genetics 47 106–114. (https://doi.org/10.1038/ng.3168)
Li H, Xuan J, Wang Y & Zhan M 2008 Inferring regulatory networks. Frontiers in Bioscience 13 263–275. (https://doi.org/10.2741/2677)
Liepe J, Kirk P, Filippi S, Toni T, Barnes CP & Stumpf MP 2014 A framework for parameter estimation and model selection from experimental data in systems biology using approximate Bayesian computation. Nature Protocols 9 439–456. (https://doi.org/10.1038/nprot.2014.025)
Mackay A, Weigelt B, Grigoriadis A, Kreike B, Natrajan R, A’Hern R, Tan DS, Dowsett M, Ashworth A & Reis-Filho JS 2011 Microarray-based class discovery for molecular classification of breast cancer: analysis of interobserver agreement. Journal of the National Cancer Institute 103 662–673. (https://doi.org/10.1093/jnci/djr071)
Mangado N, Piella G, Noailly J, Pons-Prats J & Ballester MÁ 2016 Analysis of uncertainty and variability in finite element computational models for biomedical engineering: characterization and propagation. Frontiers in Bioengineering and Biotechnology 4 85. (https://doi.org/10.3389/fbioe.2016.00085)
Martelotto LG, Ng CK, Piscuoglio S, Weigelt B & Reis-Filho JS 2014 Breast cancer intra-tumor heterogeneity. Breast Cancer Research 16 210. (https://doi.org/10.1186/bcr3658)
Masoudi-Nejad A, Bidkhori G, Hosseini Ashtiani S, Najafi A, Bozorgmehr JH & Wang E 2015 Cancer systems biology and modeling: microscopic scale and multiscale approaches. Seminars in Cancer Biology 30 60–69. (https://doi.org/10.1016/j.semcancer.2014.03.003)
McKenna MT, Weis JA, Barnes SL, Tyson DR, Miga MI, Quaranta V & Yankeelov TE 2017 A predictive mathematical modeling approach for the study of doxorubicin treatment in triple negative breast cancer. Scientific Reports 7 5725. (https://doi.org/10.1038/s41598-017-05902-z)
Meacham CE & Morrison SJ 2013 Tumour heterogeneity and cancer cell plasticity. Nature 501 328–337. (https://doi.org/10.1038/nature12624)
Miryala SK, Anbarasu A & Ramaiah S 2018 Discerning molecular interactions: a comprehensive review on biomolecular interaction databases and network analysis tools. Gene 642 84–94. (https://doi.org/10.1016/j.gene.2017.11.028)
Mitra K, Carvunis AR, Ramesh SK & Ideker T 2013 Integrative approaches for finding modular structure in biological networks. Nature Reviews: Genetics 14 719–732. (https://doi.org/10.1038/nrg3552)
Mobley A, Linder SK, Braeuer R, Ellis LM & Zwelling L 2013 A survey on data reproducibility in cancer research provides insights into our limited ability to translate findings from the laboratory to the clinic. PLoS One 8 e63221. (https://doi.org/10.1371/journal.pone.0063221)
Molinelli EJ, Korkut A, Wang W, Miller ML, Gauthier NP, Jing X, Kaushik P, He Q, Mills G, Solit DB, et al. 2013 Perturbation biology: inferring signaling networks in cellular systems. PLoS Computational Biology 9 e1003290. (https://doi.org/10.1371/journal.pcbi.1003290)
Morken JD, Packer A, Everett RA, Nagy JD & Kuang Y 2014 Mechanisms of resistance to intermittent androgen deprivation in patients with prostate cancer identified by a novel computational method. Cancer Research 74 3673–3683. (https://doi.org/10.1158/0008-5472.CAN-13-3162)
Moseley HN 2013 Error analysis and propagation in metabolomics data analysis. Computational and Structural Biotechnology Journal 4 e201301006. (https://doi.org/10.5936/csbj.201301006)
Nam S 2017 Databases and tools for constructing signal transduction networks in cancer. BMB Reports 50 12–19. (https://doi.org/10.5483/BMBRep.2017.50.1.135)
Newman AM, Liu CL, Green MR, Gentles AJ, Feng W, Xu Y, Hoang CD, Diehn M & Alizadeh AA 2015 Robust enumeration of cell subsets from tissue expression profiles. Nature Methods 12 453–457. (https://doi.org/10.1038/nmeth.3337)
Okasha S 2012 Emergence, hierarchy and top-down causation in evolutionary biology. Interface Focus 2 49–54. (https://doi.org/10.1098/rsfs.2011.0046)
Parker JS, Mullins M, Cheang MC, Leung S, Voduc D, Vickery T, Davies S, Fauron C, He X, Hu Z, et al. 2009 Supervised risk predictor of breast cancer based on intrinsic subtypes. Journal of Clinical Oncology 27 1160–1167. (https://doi.org/10.1200/JCO.2008.18.1370)
Parmar JH, Cook KL, Shajahan-Haq AN, Clarke PA, Tavassoly I, Clarke R, Tyson JJ & Baumann WT 2013 Modelling the effect of GRP78 on anti-oestrogen sensitivity and resistance in breast cancer. Interface Focus 3 20130012. (https://doi.org/10.1098/rsfs.2013.0012)
Pavlopoulos GA, Paez-Espino D, Kyrpides NC & Iliopoulos I 2017 Empirical comparison of visualization tools for larger-scale network analysis. Advances in Bioinformatics 2017 1278932. (https://doi.org/10.1155/2017/1278932)
Peng H, Tan H, Zhao W, Jin G, Sharma S, Xing F, Watabe K & Zhou X 2016 Computational systems biology in cancer brain metastasis. Frontiers in Bioscience 8 169–186. (https://doi.org/10.2741/s456)
Picco N, Gatenby RA & Anderson ARA 2017 Stem cell plasticity and niche dynamics in cancer progression. IEEE Transactions on Bio-Medical Engineering 64 528–537. (https://doi.org/10.1109/TBME.2016.2607183)
Ponten F, Schwenk JM, Asplund A & Edqvist PH 2011 The Human Protein Atlas as a proteomic resource for biomarker discovery. Journal

https://erc.bioscientifica.com © 2019 Society for Endocrinology https://doi.org/10.1530/ERC-18-0309 Published by Bioscientifica Ltd. Printed in Great Britain Downloaded from Bioscientifica.com at 12/11/2019 05:39:20PM via Washington State University, Washington State University, Washington State University and Washington State University Endocrine-Related R Clarke et al. Multiscale modeling in cancer 26:6 R367 Cancer systems biology

of Internal Medicine 270 428–446. (https://doi. Tian Y, Zhang B, Shih IM & Wang Y 2011 Knowledge-guided differential org/10.1111/j.1365-2796.2011.02427.x) dependency network learning for detecting structural changes in Quaranta V, Rejniak KA, Gerlee P & Anderson AR 2008 Invasion emerges biological networks. In BCB '11: Proceedings of the 2nd ACM from cancer cell adaptation to competitive microenvironments: Conference on Bioinformatics, Computational Biology and Biomedicine, quantitative predictions from multiscale mathematical models. pp 254–263. New York, NY, USA: ACM. (https://doi. Seminars in Cancer Biology 18 338–348. (https://doi.org/10.1016/j. org/10.1145/2147805.2147833) semcancer.2008.03.018) Tian Y, Chen L, Zhang B, Zhang Z, Yu G, Clarke R, Xuan J, Shih IM & Rittle HWJ & Webber MM 1973 Dilemmas in a general theory of Wang Y 2013 Genomic and network analysis to study the origin of planning. Policy Sciences 4 155–169. (https://doi.org/10.1007/ ovarian cancer. Systems Biomedicine 1 55–64. (https://doi. BF01405730) org/10.4161/sysb.25313) Robinson JT, Thorvaldsdottir H, Wenger AM, Zehir A & Mesirov JP 2017 Tian Y, Wang SS, Zhang Z, Rodriguez OC, Petricoin E, III, Shih IeM, Variant review with the integrative genomics viewer. Cancer Research Chan D, Avantaggiati M, Yu G, Ye S, et al. 2014a Integration of 77 e31–e34. (https://doi.org/10.1158/0008-5472.CAN-17-0337) network biology and imaging to study cancer phenotypes and Roy S, Werner-Washburne M & Lane T 2011 A multiple network responses. IEEE/ACM Transactions on Computational Biology and learning approach to capture system-wide condition-specific Bioinformatics 11 1009–1019. (https://doi.org/10.1109/ responses. Bioinformatics 27 1832–1838. 
(https://doi.org/10.1093/ TCBB.2014.2338304) bioinformatics/btr270) Tian Y, Zhang B, Hoffman EP, Clarke R, Zhang Z, Shih IeM, Xuan J, Ryall KA & Tan AC 2015 Systems biology approaches for advancing the Herrington DM & Wang Y 2014b Knowledge-fused differential discovery of effective drug combinations. Journal of Cheminformatics dependency network models for detecting significant rewiring in 7 7. (https://doi.org/10.1186/s13321-015-0055-9) biological networks. BMC Systems Biology 8 87. (https://doi. Saliba AE, Westermann AJ, Gorski SA & Vogel J 2014 Single-cell RNA- org/10.1186/s12918-014-0087-1) seq: advances and future challenges. Nucleic Acids Research 42 Tian Y, Zhang B, Hoffman EP, Clarke R, Zhang Z, Shih IeM, Xuan J, 8845–8860. (https://doi.org/10.1093/nar/gku555) Herrington DM & Wang Y 2015 KDDN: an open-source cytoscape Sandberg R 2014 Entering the era of single-cell transcriptomics in app for constructing differential dependency networks with biology and medicine. Nature Methods 11 22–24. (https://doi. significant rewiring. Bioinformatics 31 287–289. (https://doi. org/10.1038/nmeth.2764) org/10.1093/bioinformatics/btu632) Sedgewick AJ, Benz SC, Rabizadeh S, Soon-Shiong P & Vaske CJ 2013 Toettcher JE, Loewer A, Ostheimer GJ, Yaffe MB, Tidor B & Lahav G Learning subgroup-specific regulatory interactions and regulator 2009 Distinct mechanisms act in concert to mediate cell cycle arrest. independence with PARADIGM. Bioinformatics 29 i62–i70. (https:// PNAS 106 785–790. (https://doi.org/10.1073/pnas.0806196106) doi.org/10.1093/bioinformatics/btt229) Twycross J, Band LR, Bennett MJ, King JR & Krasnogor N 2010 Singhania R, Sramkoski RM, Jacobberger JW & Tyson JJ 2011 A hybrid Stochastic and deterministic multiscale models for systems biology: model of mammalian cell cycle regulation. PLoS Computational an auxin-transport case study. BMC Systems Biology 4 34. (https://doi. Biology 7 e1001077. 
(https://doi.org/10.1371/journal.pcbi.1001077) org/10.1186/1752-0509-4-34) Spencer SL & Sorger PK 2011 Measuring and modeling apoptosis in Tyson JJ, Baumann WT, Chen C, Verdugo A, Tavassoly I, Wang Y, single cells. Cell 144 926–939. (https://doi.org/10.1016/j. Weiner LM & Clarke R 2011 Dynamic modeling of oestrogen cell.2011.03.002) signalling and cell fate in breast cancer cells. Nature Reviews Cancer Steyerberg EW, Harrell FE Jr, Borsboom GJ, Eijkemans MJ, Vergouwe Y & 11 523–532. (https://doi.org/10.1038/nrc3081) Habbema JD 2001 Internal validation of predictive models: Tyson JJ, Laomettachit T & Kraikivski P 2019 Modeling the dynamic efficiency of some procedures for logistic regression analysis. Journal behavior of biochemical regulatory networks. Journal of Theoretical of Clinical Epidemiology 54 774–781. (https://doi.org/10.1016/S0895- Biology 462 514–527. (https://doi.org/10.1016/j.jtbi.2018.11.034) 4356(01)00341-9) Vanlier J, Tiemann CA, Hilbers PA & van Riel NA 2012 An integrated Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, strategy for prediction uncertainty analysis. Bioinformatics 28 Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, et al. 1130–1135. (https://doi.org/10.1093/bioinformatics/bts088) 2005 Gene set enrichment analysis: a knowledge-based approach for Venet D, Dumont JE & Detours V 2011 Most random gene expression interpreting genome-wide expression profiles. PNAS 102 signatures are significantly associated with breast cancer outcome. 15545–15550. (https://doi.org/10.1073/pnas.0506580102) PLoS Computational Biology 7 e1002240. (https://doi.org/10.1371/ Swertz MA & Jansen RC 2007 Beyond standardization: dynamic software journal.pcbi.1002240) infrastructures for systems biology. Nature Reviews: Genetics 8 Vogel C & Marcotte EM 2012 Insights into the regulation of protein 235–243. (https://doi.org/10.1038/nrg2048) abundance from proteomic and transcriptomic analyses. 
Nature Tang J & Aittokallio T 2014 Network pharmacology strategies toward Reviews: Genetics 13 227–232. (https://doi.org/10.1038/nrg3185) multi-target anticancer therapies: from computational models to Waljee AK, Higgins PD & Singal AG 2014 A primer on predictive experimental design principles. Current Pharmaceutical Design 20 models. Clinical and Translational Gastroenterology 5 e44. (https://doi. 23–36. (https://doi.org/10.2174/13816128113199990470) org/10.1038/ctg.2013.19) Tang J, Cho NW, Cui G, Manion EM, Shanbhag NM, Botuyan MV, Wang Z & Deisboeck TS 2014 Mathematical modeling in cancer drug Mer G & Greenberg RA 2013 Acetylation limits 53BP1 association discovery. Drug Discovery Today 19 145–150. (https://doi. with damaged chromatin to promote homologous recombination. org/10.1016/j.drudis.2013.06.015) Nature Structural and Molecular Biology 20 317–325. (https://doi. Wang N, Gong T, Clarke R, Chen L, Shih IeM, Zhang Z, Levine DA, org/10.1038/nsmb.2499) Xuan J & Wang Y 2015 UNDO: a Bioconductor R package for Tape CJ 2016 Systems biology analysis of heterocellular signaling. Trends unsupervised deconvolution of mixed gene expressions in tumor in Biotechnology 34 627–637. (https://doi.org/10.1016/j. samples. Bioinformatics 31 137–139. (https://doi.org/10.1093/ tibtech.2016.02.016) bioinformatics/btu607) Tavassoly I, Parmar J, Shajahan-Haq AN, Clarke R, Baumann WT & Wang N, Hoffman EP, Chen L, Chen L, Zhang Z, Liu C, Yu G, Tyson JJ 2015 Dynamic modeling of the interaction between Herrington DM, Clarke R & Wang Y 2016 Mathematical modelling autophagy and apoptosis in mammalian cells. CPT: Pharmacometrics of transcriptional heterogeneity identifies novel markers and and Systems Pharmacology 4 263–272. (https://doi.org/10.1002/ subpopulations in complex tissues. Scientific Reports 6 18909. psp4.29) (https://doi.org/10.1038/srep18909)


Welton NJ & Ades AE 2005 Estimation of Markov chain transition probabilities and rates from fully and partially observed data: uncertainty propagation, evidence synthesis, and model calibration. Medical Decision Making 25 633–645. (https://doi.org/10.1177/0272989X05282637)
Wen Z, Liu ZP, Liu Z, Zhang Y & Chen L 2013 An integrated approach to identify causal network modules of complex diseases with application to colorectal cancer. Journal of the American Medical Informatics Association 20 659–667. (https://doi.org/10.1136/amiajnl-2012-001168)
Werner HM, Mills GB & Ram PT 2014 Cancer systems biology: a peek into the future of patient care? Nature Reviews: Clinical Oncology 11 167–176. (https://doi.org/10.1038/nrclinonc.2014.6)
Wilkinson DJ 2009 Stochastic modelling for quantitative description of heterogeneous biological systems. Nature Reviews: Genetics 10 122–133. (https://doi.org/10.1038/nrg2509)
Wu G & Stein L 2012 A network module-based method for identifying cancer prognostic signatures. Genome Biology 13 R112. (https://doi.org/10.1186/gb-2012-13-12-r112)
Yin H, Xue W & Anderson DG 2019 CRISPR-Cas: a tool for cancer research and therapeutics. Nature Reviews Clinical Oncology 16 281–295. (https://doi.org/10.1038/s41571-019-0166-8)
Zhang B & Wang Y 2010 Learning structural changes of gaussian graphical models in controlled experiments. In Proceedings of the 26th Conference on Uncertainty in Artificial Intelligence, UAI 2010, pp 701–708. Waterloo, ON, Canada: AUAI Press. (available at: https://event.cwi.nl/uai2010/papers/UAI2010_0221.pdf)
Zhang B, Li H, Riggins RB, Zhan M, Xuan J, Zhang Z, Hoffman EP, Clarke R & Wang Y 2009 Differential dependency network analysis to identify condition-specific topological changes in biological networks. Bioinformatics 25 526–532. (https://doi.org/10.1093/bioinformatics/btn660)
Zhang B, Tian Y, Jin L, Li H, Shih IeM, Madhavan S, Clarke R, Hoffman EP, Xuan J, Hilakivi-Clarke L, et al. 2011 DDN: a caBIG(R) analytical tool for differential network analysis. Bioinformatics 27 1036–1038. (https://doi.org/10.1093/bioinformatics/btr052)
Zhang H, Liu T, Zhang Z, Payne SH, Zhang B, McDermott JE, Zhou JY, Petyuk VA, Chen L, Ray D, et al. 2016 Integrated proteogenomic characterization of human high grade serous ovarian cancer. Cell 166 755–765. (https://doi.org/10.1016/j.cell.2016.05.069)
Zou J, Huss M, Abid A, Mohammadi P, Torkamani A & Telenti A 2019 A primer on deep learning in genomics. Nature Genetics 51 12–18. (https://doi.org/10.1038/s41588-018-0295-5)
Zuckerman NS, Noam Y, Goldsmith AJ & Lee PP 2013 A self-directed method for cell-type identification and separation of gene expression microarrays. PLoS Computational Biology 9 e1003189. (https://doi.org/10.1371/journal.pcbi.1003189)

Received in final form 25 March 2019 Accepted 8 April 2019 Accepted Preprint published online 8 April 2019
