EXPLORATION OF CAUSAL AND CORRELATIONAL MODELLING

IN CANCER: GLIOBLASTOMA CASE STUDY

by

SOMPOP SAENGPHUENG

Submitted in partial fulfillment of the requirements

For the degree of Doctor of Philosophy

Dissertation Advisor: Dr. Sree N. Sreenath

Department of Electrical Engineering & Computer Science

CASE WESTERN RESERVE UNIVERSITY

May 2015

CASE WESTERN RESERVE UNIVERSITY

SCHOOL OF GRADUATE STUDIES

We hereby approve the dissertation of

SOMPOP SAENGPHUENG

candidate for the Doctor of Philosophy degree *

Dr. Sree N. Sreenath Dissertation Advisor Professor,

Dr. James W. Jacobberger Professor,

Dr. Mihajlo Mesarovic Professor,

Dr. Vira Chankong Associate Professor,

Dr. Evren Cavusoglu Assistant Professor,

July 8th, 2014

*We also certify that written approval has been obtained for any proprietary material contained therein.

Table of Contents

Table of Contents
List of Tables
List of Figures
Acknowledgement
Abstract

1 Introduction
    1.1 Objective and Challenge
    1.2 Biological Problem
    1.3 Hallmark of Cancer and Angiogenesis
    1.4 Cause and Correlation of Angiogenesis in Glioblastoma
    1.5 Computer-aided analysis of causal structure
    1.6 Glioblastoma
    1.7 Summary

2 Background
    2.1 Glioblastoma or Glioblastoma multiforme (GBM)
        2.1.1 Glioblastoma
        2.1.2 Causes of glioblastoma
        2.1.3 Treatment
        2.1.4 Glioblastoma, angiogenesis, and VEGF
        2.1.5 Activation of VEGF pathway in glioblastoma
        2.1.6 Influence of VEGF on glioblastoma microvasculature
        2.1.7 Summary of VEGF in glioblastoma
    2.2 Understanding Angiogenesis and the VEGF Ligand
        2.2.1 Angiogenesis is regulated by the VEGF ligand
        2.2.2 The VEGF ligand is continuously expressed
        2.2.3 VEGF expression during tumor development
        2.2.4 Strategies for inhibiting the VEGF pathway
        2.2.5 Extracellular targeting of the VEGF ligand
        2.2.6 Intracellular targeting of the VEGF receptor
        2.2.7 Inhibition of new and recurrent tumor vessel growth
        2.2.8 Regression of existing tumor vasculature
        2.2.9 Regrowth of tumor vasculature
        2.2.10 The rationale for continuing VEGF inhibition
        2.2.11 An evolving understanding of tumor biology
    2.3 Cancer stem cells
    2.4 Observational Model
    2.5 Causal Model
    2.6 Translating from causal to statistical model
    2.7 d-separation
    2.8 Summary

3 Model Selection: Glioblastoma model
    3.1 Glioblastoma model
        3.1.1 Glucose
        3.1.2 Oxygen
        3.1.3 Transforming Growth Factor alpha
        3.1.4 Vascular Endothelial Growth Factor (VEGF)
        3.1.5 Fibronectin
    3.2 Other published works of brain cancer, angiogenesis, and cancer cell
    3.3 Summary

4 Methodology
    4.1 Translating from causal to statistical model
    4.2 Statistical control and physical control
    4.3 Path analysis and d-separation
        4.3.1 d-separation test
        4.3.2 Independence of d-separation statements
        4.3.3 Testing for probabilistic independence
    4.4 Structural Equation Modeling (SEM)
        4.4.1 Steps to perform SEM analysis
        4.4.2 Path analysis and maximum likelihood
        4.4.3 Nested models and multilevel model
    4.5 The causal expression of SEM
        4.5.1 Assumptions and representations
        4.5.2 Causal assumptions in nonparametric models
        4.5.3 Intervention and causal effects
    4.6 Exploration, discovery and equivalents of causal graph
        4.6.1 Exploring hypothesis space
        4.6.2 The shadow cause revisited
        4.6.3 Obtaining the undirected dependency graph
        4.6.4 Hypothesis setting
        4.6.5 Causal inference for brain cancer
    4.7 Analysis of randomized experiment through SEM
    4.8 Develop a randomized experiment to address causal issues
    4.9 Advantages, Disadvantages, and Limitation
    4.10 Summary

5 Results
    5.1 Part 1: Glioblastoma without TKI treatment
    5.2 Part 2: Glioblastoma with TKI treatment
    5.3 Estimation of path coefficient

6 Conclusion
    6.1 Interpretation of the result
    6.2 Suggestion for future research
    6.3 Contribution
    6.4 Summary

Appendix A The first Appendix
    A.1 SEM and causality
    A.2 Search algorithms using TETRAD
        A.2.1 PC algorithm
        A.2.2 FCI algorithm
        A.2.3 SEM parametric models
        A.2.4 SEM instantiated model
    A.3 Multilevelness of Angiogenesis

Bibliography

List of Tables

2.1 The effects of the VEGF ligand
3.1 Initial values and parameter values
3.2 Other published works of brain cancer
4.1 A basis set for the DAG
5.1 Mean coefficient and fit statistics of models without TKI
5.2 Mean coefficient and fit statistics of models with TKI

List of Figures

2.1 VEGF is known to express throughout the tumor life cycle
2.2 Strategies for inhibiting the VEGF pathway
2.3 Observational Relationships and Causal Relationships
2.4 Statistical Model
2.5 A directed graph describing the causal relationships
2.6 A directed graph used to illustrate the notion of d-separation
3.1 Multilevel phenotype switch functioning in the brain cancer
3.2 Probability for endothelial tip cells migration
4.1 The translation from a causal model to an observational model
4.2 Alternative causal model
4.3 A directed acyclic graph (DAG) involving six variables
4.4 Nesting model: Model B is nested within A
4.5 A simple SEM, and its associated diagrams
4.6 The diagrams associated with model
4.7 Path diagram vs Undirected graph
5.1 Causal Model: Without TKI treatment
5.2 Alternative Models: Without TKI treatment
5.3 Causal Model: With TKI treatment
5.4 Alternative Models 1,2: With TKI treatment
5.5 Alternative Models 3,4: With TKI treatment
5.6 Estimation of path coefficient in causal model: With TKI treatment
5.7 Implied correlation matrix of causal model: With TKI treatment
5.8 Estimation of path coefficient in alternative model: With TKI treatment
5.9 Implied correlation matrix of alternative model: With TKI treatment
A.1 SEM methodology depicted as an inference engine
A.2 Causal Structure using PC algorithm

ACKNOWLEDGEMENTS

My dissertation would not have been finished without my professor, Dr. Mihajlo Mesarovic, who gave me guidance and inspired me in every possible way. He has not only given me knowledge but has also inspired and shaped my life. I have learned many things from him, and I am forever grateful for our relationship both inside and outside the classroom. For four years I was a visitor at his home, and for four years I enjoyed our conversations about my dissertation. I would like to thank him for supporting me and devoting himself to my study from the beginning.

I would like to thank Dr. Sree N. Sreenath for his kind efforts to support me and for providing indispensable advice, information, and support on different aspects of my dissertation. Thank you for your patience with me over the last four years of my study. Your support was essential for my success here.

Additionally, I would like to thank Dr. Vira Chankong, who gave me the opportunity to pursue my dream. I cannot imagine where I would be if he had not been there in the first place. Without him I would not have had the chance to see the further successful steps of my life.

Finally, I would like to thank the committee, Dr. James W. Jacobberger and Dr. Evren Cavusoglu, whose work demonstrated to me that cancer treatment requires another approach that can help solve the problems, and that it requires a diversity of knowledge.

Thank you.

Exploration of Causal and Correlational Modelling in Cancer: Glioblastoma Case Study

Abstract

by

SOMPOP SAENGPHUENG

Physicians observe and study human diseases to gain insight into these complex biological systems and to develop appropriate therapies. Understanding the disease mechanism, that is, which factors regulate or control others, may lead us to methods for stopping or reducing the severity of its symptoms or causes by using drugs or alternative treatments. The treatment of a disease has often relied on a chance discovery that intake of a compound or prior vaccination affects the outcome. There are many examples of this, such as Jenner's discovery that milkmaids did not suffer the full effects of smallpox. The discovery that nitrogen mustards (chemical weapons) suppressed lymphoid cell proliferation led to the first successful treatment for lymphoma long before the causes of lymphoma were understood. Several factors may contribute to causing a disease, and determining the causes is fundamental to fully describing it. Based on a statistical method from Shipley's book "Cause and Correlation in Biology", we fashion an approach that describes the functioning of a biological phenomenon in a statistical way. Providing effective treatment depends on observing that exposure to certain factors (such as chemicals or radiation) produces a positive outcome in diseased individuals. Improving treatments may benefit from causal models that permit simulations exploring alternative treatment interventions. Providing a causal description of the phenomenon starting from statistical experiments is a challenge: when we analyze all factors statistically and try to identify their relationships from their correlations, the correlations cannot tell us which are causes and which are effects. When we select the factors that are highly correlated with the output, e.g. in a regression model, we cannot conclude that every such factor or regressor is a cause of the output; rain is the cause of mud, but mud is not the cause of rain.

Measurable criteria help to classify tumors into groups that help physicians predict tumor behavior. Among these are tumor size, metastatic status, and pathological considerations (cellular features) [116]. Currently, there is a large effort to identify molecular markers that may help to refine these predictions and, more importantly, may guide targeted therapy [29, 36]. There are many tens of thousands of potential markers. More markers (genetic, epigenetic, and biochemical measurements, as well as higher-level cellular measurements of process and function) have provided more measurements, but making sense of these measurements constitutes a major problem in biomedical science [34, 56, 69, 88, 114, 151].

For this reason, we use the method proposed by Shipley to provide a fundamental understanding of how brain cancer functions [119, 120]. Shipley's method relies on structural equation modeling (SEM) and the software TETRAD V to assist with identifying the structural view of a system during a statistical experiment. This approach also provides useful information for differentiating healthy from unhealthy states in terms of causality. In an effort to begin to address this problem in a novel way, we have explored an approach to identify causality once we determine the most influential inputs.

Chapter 1

Introduction

1.1 Objective and Challenge

The objective of the thesis is to explore a novel statistical approach to observational data that could improve our understanding of cancer complexity. Understanding cancer with the aim of therapy relies on knowledge of cellular biochemistry; cell, tissue, and organ physiology; genetic and epigenetic evolution; and whole-organism health considerations. All of these aspects are complex. To begin to understand these complexities from a systems viewpoint, we should consider ranked variables that impact cancer. Ideally, in this endeavor, the behavior of the overall system is considered [90]. The following section provides the reasoning for choosing the variables that I will use in a causal model.

Angiogenesis, the formation of new capillaries, can facilitate the growth of a tumor above 1 cm, and it is a significant contributor to cancer development. During the angiogenesis process, in response to low oxygen tension, endothelial cells divide and form networks of blood vessels. Therefore, variables derived from measures that correlate with angiogenesis may be important to a causal model. Some measured variables that are available from patients, such as oxygenation or blood vessel density from pathological slides, are collected directly. For variables that we cannot measure directly, e.g. the molecular compounds that mediate the signaling pathways, we can use values from the literature. Once we obtain these variables, we use them to construct a causal structure. In the causal model, variables from the cellular environment, oxygen and glucose, are important to angiogenesis because they help maintain cell function and cell survival [54]. Therefore, oxygen and glucose have a great impact on the cell's phenotype (apoptosis, quiescence, migration, proliferation, etc.) in a causal model. In the extracellular space, VEGF and fibronectin act as the chemotactic factor and the haptotactic factor of angiogenesis, respectively. Both VEGF and fibronectin have a great impact on the increase in vessels [132]. For intracellular activities, all signaling pathways that we cannot measure from patients are estimated or taken from the literature. These signaling pathways influence the cell's genotype and therefore have a great impact on the cell's phenotype [123]. Proliferating cells promote the growth of new blood vessels [55]. The number of proliferating cells can be measured using a cytometry-based cell counting method [7, 11, 148]. In a causal model, we assume that the cell's phenotype affects the increase in the number of blood vessels. Consequently, the network of blood vessels grows, which could provide a chance for vessels to merge with one another.

The thesis aims to identify causality from statistical data by creating a causal model of tumor growth, using the d-separation methodology of Shipley [119], and simulating experiments with hypothetical data for key variables identified from the literature. The study focuses on glioblastoma because it is one of the most difficult cancers to cure and it commonly involves overexpression of oncogenes that greatly impact signaling pathways [92, 100]. These pathways, e.g. EGFR, are used as targets for treatment because blocking the pathways of the cancer cells may help to reduce their growth rate. Angiogenesis has been identified as a key variable that is required for the growth of large brain tumors [25, 92, 99, 112]. Angiogenesis can be understood best at the cellular level, and the related processes or states are tissue oxygen tension, cytokine regulation, energy metabolism, cell division rate, cell death rate, cell migration, and cell quiescence [16, 21, 25, 41, 59, 76, 101]. Each of these processes can be approximated by specific measurements that can be obtained for individual patients. These are oxygenation [18, 95], cytokine profiles [9, 101], glucose uptake [22, 47], mitotic indices [14, 70], and apoptotic markers [36, 124]. To obtain information on the rates of cell migration or the fraction of quiescent cells, molecular markers need to be used. The markers for these latter two categories are not purely related to the variables, and therefore their measurements are of the lowest quality. In the thesis, the states or processes are described in a causal structure model, which will be used to obtain statistical data. The causal structure is based on the biological system, shows the closest relationships among the measured variables, and is expressed as a cause-and-effect structure. The measured variables that have a direct impact on disease progression are described based on several literature reviews.

We use standardized data so that the analysis does not depend on the units of the variables. This allows us to adjust the values of the measured variables, which can represent the actual measures under the assumption that if the causal model is true, then the actual measures should be consistent with it.
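
A minimal sketch of what standardization does, assuming z-scores (subtract the mean, divide by the sample standard deviation). The oxygen readings and the helper name `standardize` are hypothetical and are not taken from the thesis data. The same measurements expressed in two different units give the same standardized values, which is why the unit no longer matters:

```python
import math


def standardize(values):
    """Return z-scores: (x - mean) / sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    sd = math.sqrt(variance)
    return [(x - mean) / sd for x in values]


# Hypothetical oxygen tension readings, first in mmHg and then in kPa.
oxygen_mmhg = [12.0, 18.0, 25.0, 31.0, 40.0]
oxygen_kpa = [x * 0.133322 for x in oxygen_mmhg]

z_mmhg = standardize(oxygen_mmhg)
z_kpa = standardize(oxygen_kpa)

# The two lists agree up to floating-point rounding.
print([round(z, 3) for z in z_mmhg])
print([round(z, 3) for z in z_kpa])
```

Each standardized variable has mean 0 and unit sample variance, so coefficients estimated from such data can be compared across variables measured on different scales.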

The "cause and correlation" approach proposed by Shipley differs substantially from traditional multivariate statistical methods. The former is explanatory: all inputs are causes of the output. The latter is based mainly on the correlation of variables: the inputs are highly correlated with the output, but some inputs may not be real causes of it.
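
The difference can be illustrated with a small simulation. This is a sketch under stated assumptions, not the thesis model: the variables `z`, `x`, and `y`, the coefficients, and the noise levels are all hypothetical. Here `z` is a common cause of `x` and `y`, and `x` has no effect on `y`. A correlational analysis still finds `x` strongly correlated with `y`, while the partial correlation given `z` is close to zero:

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y given z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical causal structure: z -> x and z -> y, with no arrow x -> y.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + rng.gauss(0, 0.6) for zi in z]
y = [0.8 * zi + rng.gauss(0, 0.6) for zi in z]

r_xy = pearson(x, y)
r_xy_given_z = partial_corr(r_xy, pearson(x, z), pearson(y, z))

print("corr(x, y)     =", round(r_xy, 3))
print("corr(x, y | z) =", round(r_xy_given_z, 3))
```

A regression that selected inputs by correlation alone would keep `x` as a predictor of `y`, even though intervening on `x` would not change `y` in this simulated system.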

1.2 Biological Problem

Biologists are challenged to explain real biological phenomena without using models. Because of their complexity, it is difficult to represent biological phenomena in full detail. In particular cases, biological models can help modelers gain insight into real biology. Most models are used as explanatory tools, to represent the state of knowledge more explicitly, or to predict results. Essentially, a model is a representation of reality but may not encompass the whole [75].

All models have pros and cons. We need to ensure that the essential aspects of the system or phenomenon are well represented. In biological phenomena, we would like to capture the behavior of a population of cells by using models that represent the interactions within the cellular level and with its environment. Human understanding of biological systems is limited, among other things, by that very complexity and by the problems that arise when attempting to examine a given system: the classical observer vs. observed problem, wherein the process of observation may change the behavior of the observed. This makes us question how to understand relationships between the biological variables as they occur in reality. Models are used for different purposes; thus, the criteria for selecting good or appropriate models vary with the field of interest.

Basically, any conceptual idea of some biological phenomenon counts as a model. In general, cartoons and block diagrams are used to represent metabolic, signaling, or regulatory pathways. These are qualitative models: they show the relationships among elements that are important to the biological phenomenon. Qualitative models ignore certain details (e.g., about kinetics), implicitly asserting that excluding such details does not make the model irrelevant. Many biologists also model implicitly when they use statistical tests. Statistical tests require null hypotheses, and each null hypothesis is based on a particular model from which its probability distribution is obtained [75, 119].

In cell biology, models and simulations are used to examine the behavior of the cell both structurally and dynamically, rather than only the characteristics of its individual parts. Models must consider inherent stochastic and deterministic processes, modular design, alternative pathways, and emergent behavior in the biological hierarchy.

The goal of studying biology is to obtain an understanding of the interactions or molecular processes that are responsible for cellular behavior. To do so, a quantitative model of the cell must be developed that incorporates measurements taken at many different levels.

Model development is an iterative process. It begins with a rough model of the cell based on some knowledge of the components and the possible interactions among them, as well as prior biochemical and genetic knowledge. Although the assumptions underlying the model may be insufficient or even inappropriate for the system being investigated, the rough model provides a hypothesis about the structure of the interactions that govern the cell's behavior [75].

Implicit in the model are predictions about the cell's response under different kinds of perturbation. Perturbations, caused by genetic and environmental factors, are introduced into the cell, and the cell's responses are measured with tools that capture changes at the relevant levels of biological information.

The adjusted model that accurately predicts the new set of measurements can be used as the initial model for the next iteration. Through this process, model and measurement become consistent, in that the model's predictions match the biological responses to perturbation. Modeling and experimental data should be linked so that we know what needs to be identified in order to construct a related description and create a theoretical framework to explain the behavior of a biological system [75].

1.3 Hallmark of Cancer and Angiogenesis

Highly regulated, precise cell division that replicates the mother cell with high fidelity during development, organism homeostasis, and wound repair has been selected for over evolutionary time. Some of the consequences of deregulation are expressed as cancer. Genetic alterations are responsible for the progressive transformation of normal human cells into malignant derivatives [54]. Cancer cell genotypes are the manifestation of alterations in six cell physiological processes. The hallmarks, or physiological alterations, are: self-sufficiency in growth signals, insensitivity to growth-inhibitory signals, evasion of programmed cell death (apoptosis), limitless replicative potential, sustained angiogenesis, and tissue invasion and metastasis. These capabilities are common to most human tumors.

Angiogenesis is a hallmark of tumor development. Even though there are several types of cancer, their angiogenic mechanisms are quite similar. The normal balance of pro- and anti-angiogenic factors keeps angiogenesis under control during development and wound repair. During tumor development, this balance can be disrupted, which is called the angiogenic switch. The result is the creation and maintenance of a growing vascular network. Tumors use localized heterotypic interactions between the cells of the blood vessels (including endothelial cells, pericytes, and smooth muscle cells) and nonvascular cells in the tumor to assemble their own blood supply [25]. The release of vascular endothelial growth factor (VEGF), together with chemotactic signals from the tumor, helps recruit circulating endothelial cells to assemble vasculature [145]. The production of VEGF is governed by the availability of oxygen. When a group of cells, including cancer cells, is hypoxic (lacking oxygen), the cells release VEGF and other angiogenic factors and provoke the growth of capillaries.

1.4 Cause and Correlation of Angiogenesis in Brain Cancer: Glioblastoma case

In statistics, the phrase "correlation does not imply causation" is used to emphasize that correlation between two variables does not automatically imply that one causes the other. Even though correlation is necessary for linear causation and can indicate possible causes or areas for further investigation, causation cannot be assumed from correlation alone [119].

Conversely, "correlation proves causation" is a logical fallacy in which two events that occur together are claimed to have a cause-and-effect relationship.

Similarly, the relationship between angiogenesis and brain cancer must involve a transitive statement, or intermediate event, that links them. If angiogenesis (A) causes brain cancer (B) only through its effect on an intermediate event (I), then the causal influence of A on B is blocked if event I is prevented from responding to A. Causation is also an asymmetric logical statement: if angiogenesis is a cause of brain cancer, then brain cancer cannot simultaneously be a cause of angiogenesis.
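
The blocking and asymmetry properties can be sketched with a simulated chain A -> I -> B. This is an illustration under stated assumptions only: the linear equations, the coefficients, and the helper `simulate_means` are hypothetical and do not come from the thesis model. Fixing a variable from outside (an intervention) replaces its own equation with a constant:

```python
import random


def simulate_means(n, rng, do_a=None, do_i=None, do_b=None):
    """Sample the hypothetical chain A -> I -> B and return (mean A, mean B).

    A do_* argument replaces that variable's own equation with a fixed value,
    which mimics an external intervention on that variable.
    """
    sum_a = 0.0
    sum_b = 0.0
    for _ in range(n):
        a = rng.gauss(0, 1) if do_a is None else do_a
        i = (0.8 * a + rng.gauss(0, 0.6)) if do_i is None else do_i
        b = (0.8 * i + rng.gauss(0, 0.6)) if do_b is None else do_b
        sum_a += a
        sum_b += b
    return sum_a / n, sum_b / n


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 5000

# Raising A shifts B, because the effect travels through I.
_, mean_b_do_a = simulate_means(n, rng, do_a=1.0)

# Same change in A, but I is held fixed and cannot respond: B no longer shifts.
_, mean_b_blocked = simulate_means(n, rng, do_a=1.0, do_i=0.0)

# Asymmetry: forcing B to a new value leaves A where it was.
mean_a_do_b, _ = simulate_means(n, rng, do_b=1.0)

print("mean B when A is set to 1:            ", round(mean_b_do_a, 3))
print("mean B when A is set to 1 and I to 0: ", round(mean_b_blocked, 3))
print("mean A when B is set to 1:            ", round(mean_a_do_b, 3))
```

In this sketch only the first intervention moves its downstream variable away from zero, which matches the two statements above: the effect of A on B disappears once I cannot respond, and acting on B does nothing to A.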

1.5 Computer-aided analysis of causal structure

The assumptions for the selected model that we study are that the observations are (i) independent draws generated by (ii) the same causal process. Samples from populations with different causal processes, differing either in causal structure or in the quantitative strength of the relationships between the variables, require a model that explicitly accounts for the causal relationships across the different groups.

1.6 Glioblastoma

Glioblastoma, or glioblastoma multiforme (GBM), is a fast-growing malignant primary brain tumor that forms from glial tissue of the brain and spinal cord [145]. Glioblastoma is caused by genetic alterations, which result from intrinsic and environmental factors [92, 100, 130]. Glioblastoma is difficult to treat because tumor cells are resistant to conventional therapies and the brain has a limited capacity to repair itself [72].

1.7 Summary

This chapter has detailed the basic idea of angiogenesis and why it is considered one of the hallmarks of cancer. Genetic mutation causes the selection of a cell's phenotype. Biological problems can be addressed using models that represent real phenomena. Models help capture how biological behavior changes in response to input adjustments and other perturbations, and they provide outputs for biologists to interpret. We have thus outlined how cause and correlation can be used to analyze angiogenesis in brain cancer (in this case glioblastoma). As this study is based on causal structures, computer-aided analysis is used to facilitate our understanding of the causes of biological phenomena in brain tumors, which in turn have a great impact on the process of angiogenesis.

Chapter 2

Background

2.1 Glioblastoma or Glioblastoma multiforme (GBM)

2.1.1 Glioblastoma

Glioblastomas are nonepithelial tumors arising from cells that form various components of the central and peripheral nervous systems. They are often termed neuroectodermal tumors to reflect their origins [145]. In the World Health Organization (WHO) classification of central nervous system (CNS) tumors, glioblastomas are the most frequent and most malignant brain tumors and are classified as grade IV [130].

Treatment currently relies on chemotherapy, radiation, and surgery. The median survival of patients was 47 weeks from implantation and 94 weeks from initial resection [72]. The location of tumors in the brain makes surgical treatment difficult. Radiotherapy in combination with temozolomide is a standard treatment for patients diagnosed with GBM [130].

2.1.2 Causes of glioblastoma

In about one third of glioblastoma cases, the EGFR has been found to be truncated and to have lost an extracellular domain [145]. Most glioblastoma tumors are heterogeneous and occur without any known genetic predisposition. The risk factors for glioblastoma are sex, age, race, and genetic disorders [20, 100].

2.1.3 Treatment

Treatment of glioblastoma consists of both symptomatic interventions (which focus on relieving symptoms and improving the patient's neurological function) and palliative interventions (which aim to improve quality of life and achieve a longer survival time).

Surgery

The first stage of glioblastoma treatment is surgery to remove the tumor, followed by radiotherapy and chemotherapy to extend the patient's survival. Removal of the tumor has been associated with a significantly longer period of healthy survival [71]. Tumor cells have already infiltrated the brain at diagnosis, and most patients with glioblastoma later develop recurrent tumors, either near the original site or as more distant lesions within the brain [71].

Radiotherapy

Radiotherapy is the main treatment for patients with glioblastoma after surgery. Patients who received radiation had a median survival more than double that of those who did not [139]. Tissues in glioblastoma suffer from hypoxia, and hypoxic tissue is resistant to radiotherapy. Several investigations have used oxygen diffusion-enhancing compounds, such as trans sodium crocetinate, as radiosensitizers [see 116]. Ishikawa et al. treated newly diagnosed patients with radiotherapy together with autologous formalin-fixed tumor vaccine (AFTV) and temozolomide (TMZ). The results showed a median overall survival of 22.2 months, and the treatment regime was well tolerated by all patients [see 63].

Chemotherapy

Several studies have found that using chemotherapy in conjunction with radiotherapy increases the survival of patients with glioblastoma. One study comparing standard radiation with radiation plus temozolomide chemotherapy showed that patients treated with temozolomide survived a median of 14.6 months, as opposed to 12.1 months for patients receiving radiation alone [see 129]. This treatment regime is now used as a standard for most cases of glioblastoma [129].

Gene transfer

The treatments mentioned above have drawbacks, as they can destroy healthy cells while treating malignant ones. Gene transfer is a newer approach to treating glioblastoma that aims to kill cancer cells without harming healthy cells. In 2005, a group of researchers at UCLA reported a long-term survival benefit in an experimental brain tumor animal model using a gene delivery method [see 133].

2.1.4 Glioblastoma, angiogenesis, and VEGF

The main characteristics of glioblastoma are high proliferation of tumor cells, increased cellularity, necrosis, and proliferation of capillary endothelial cells. Angiogenesis is an important process in the progression of gliomas to glioblastoma.

Angiogenesis is responsible for the growth of potentially metastatic tumors. Glioblastoma rarely metastasizes; instead, it almost always recurs locally because of diffuse infiltration resulting from angiogenesis [see 83]. Angiogenesis is regulated by growth factors that bind to their specific receptors on endothelial cells, thus inducing the proliferation and migration of endothelial cells.

Many pro- and anti-angiogenic cellular factors regulate angiogenesis in glioblastoma. Among them, VEGF has been implicated as a major paracrine mediator in the pathogenesis of glioblastoma.

VEGF concentrations were measured in the sera, tumor extracts, and tumor cyst fluid of brain tumor patients. Significantly higher concentrations of VEGF were found in glioblastoma tissue than in other tumors and normal brain [see 134]. Although VEGF concentration in the serum was not correlated with that of tumor tissue, VEGF concentration in the glioblastoma cyst fluid was 200- to 300-fold higher than in serum and normal brain [28]. Additionally, vascular density in the tumors, measured by counting vessels positive for von Willebrand factor (an endothelial cell marker), was significantly correlated with VEGF concentration (r = 0.76).

Takano et al. have shown that glioma cells released VEGF in the culture medium in a time-dependent manner, and this correlated with the induction of endothelial cell migration (r=0.96) [see 134].

Additionally, human glioblastoma cells endogenously express three different VEGF variants, or isoforms: VEGF121, VEGF165, and VEGF189. Cheng et al. conducted a study to determine whether there are biologically functional differences among these isoforms by overexpressing each isoform individually in glioblastoma cells implanted in mouse brains. VEGF121 and VEGF165 were the predominant forms of VEGF expressed in glioblastoma cells. In contrast to mice that received VEGF189-overexpressing cells, mice that received VEGF121- or VEGF165-overexpressing cells developed intracranial hemorrhage after 60 to 90 hours. Mice with VEGF189-overexpressing cells had only slightly larger tumors soon after implantation; however, after longer periods of growth, enhanced angiogenesis and tumorigenesis were apparent. With VEGF121 and VEGF165 cells, rapid blood vessel growth and breakdown around the tumor vasculature were noted, whereas with VEGF189 cells rapid vessel growth, but not breakdown, was observed [28].

2.1.5 Activation of VEGF pathway in glioblastoma

In glioblastoma, cells located near necrotic areas are thought to up-regulate VEGF secondary to hypoxia, a common characteristic observed in tumor microenvironments.

To further explain this relationship in glioblastoma and other cancers, Ziemer et al. conducted a study using EF5, a known marker for hypoxia. The results of this study showed that, in all tumors that were highly positive for VEGF, the VEGF mRNA signal pattern was directly related to the percentage of EF5 binding (indicating hypoxic regions) [122, 152].

Conversely, for all tumors studied, regions with relatively low levels of EF5 binding had relatively low or undetectable VEGF mRNA [see 152].

2.1.6 Influence of VEGF on glioblastoma microvasculature

VEGF has been shown to increase vascular permeability by inducing the formation of structural abnormalities. In a study by Robert et al., tumors implanted in subcutaneous (SC) tissue versus intracerebral (IC) tissue were examined to see whether the host microvasculature determines the morphology, and therefore the function, of the tumor vasculature or whether VEGF overrides all host microvascular endothelial input [112].

The study revealed that when tumors were implanted in IC tissue, the microvasculature underwent a remarkable transformation in both structure (increase in endothelial gaps and fenestrations) and function (increase in permeability). Additionally, while the tumors produced similar amounts of VEGF in the two conditions, the expression of VEGF receptors was increased in the IC tissue [see 112]. The authors conclude that these results indicate a drastic difference in the neovasculature of identical tumor types when they are grown in different locations. These findings are important, as the differences in tumor vessel morphology affect tumor permeability [112].

2.1.7 Summary of VEGF in glioblastoma

VEGF is an important mediator of angiogenesis, which has been identified as a key characteristic of glioblastoma development. Numerous studies have demonstrated the prevalence of VEGF expression and its isoforms in glioblastoma, as well as its associated outcomes, including increased vascularization, increased vessel density, worse tumor grade, and poor survival rate [99, 112, 134].

2.2 Understanding Angiogenesis and the VEGF Ligand

2.2.1 Angiogenesis is regulated by the VEGF ligand

Several pro-angiogenic factors have been characterized. One of these factors is VEGF, which has been identified as a predominant regulator of tumor angiogenesis [see 57]. VEGF may affect tumor vasculature in three essential ways (Table 2.1). Early in tumor development, VEGF may help new vasculature become established. Specifically, VEGF has been shown to stimulate tumor growth at both primary and metastatic sites through the recruitment of bone marrow-derived progenitor cells that form the building blocks of a new vascular network. As this network develops, VEGF may continue to help new vasculature grow, providing the blood supply needed to drive further tumor growth and metastasis. Throughout tumor development, VEGF may also help existing vasculature survive, allowing tumors to sustain their metabolic requirements over their entire life cycle [see 36]. A tumor needs its own blood supply in order to grow beyond 1-2 mm in diameter. To fulfill its needs, a tumor secretes growth factors that recruit new vasculature from existing blood vessels. This process continues even as the tumor matures.

2.2.2 The VEGF ligand is continuously expressed and genetically stable

The VEGF ligand is known to be present throughout the tumor life cycle (Fig. 2.1) [25, 66, 76, 102]. As the tumor progresses, secondary angiogenic pathways are activated, such as basic fibroblast growth factor (bFGF), transforming growth factor beta (TGFβ), placental growth factor (PlGF), and platelet-derived endothelial cell growth factor (PD-ECGF). Even when these secondary pathways are activated, the VEGF ligand continues to be expressed and remains one of the critical mediators of angiogenesis [17, 45, 55, 66, 111].

VEGF is expressed throughout the tumor life cycle. As the tumor matures, secondary angiogenic factors may be increasingly produced. The observation that some tumors are 1) highly dependent on VEGF early in development and 2) continuously dependent on VEGF throughout their life cycle is reflected by preclinical research with VEGF inhibitors. In these experiments, VEGF inhibition has demonstrated significant antitumor effects when administered throughout tumor development [50, 51]. TGFα increases human mesenchymal stem cell-secreted VEGF by MEK and PI3-K mechanisms [140]. Although the amount of VEGF that is produced and released may change in response to certain stimuli within the tumor environment, VEGF is thought to be a genetically stable protein that may be relatively insusceptible to mutation [57, 96]. This genetic stability may make continued targeting of the VEGF ligand a rational antitumor strategy.

Table 2.1: The effects of the VEGF ligand

 | Helps tumor vessels ESTABLISH | Helps tumor vessels GROW | Helps tumor vessels SURVIVE
Mechanism | Recruitment of progenitor cells to primary and metastatic sites | Stimulation of endothelial cell proliferation, migration, and invasion | Inhibition of endothelial cell apoptosis
Effect on tumor growth | Helps tumor cells seed and form premetastatic niches | Provides the blood supply needed for tumors to grow beyond 1 to 2 mm | Maintains a vascular network that fuels continued tumor growth and survival

[Figure 2.1 graphic: along the tumor growth axis, the set of expressed angiogenic factors expands from VEGF alone to include bFGF, TGF-β1, PlGF, PD-ECGF, and pleiotrophin.]

Figure 2.1: VEGF is known to be expressed throughout the tumor life cycle (adapted from [66])

2.2.3 High levels of VEGF expression have been observed across a wide range of solid tumors

Expression of the VEGF ligand has been observed across a range of tumor types and has been widely correlated with tumor development and/or poor prognosis [50, 51].

Because it drives tumor growth through multiple stages of development, direct and continuous inhibition of the VEGF ligand is a rational antitumor strategy.

2.2.4 Strategies for inhibiting the VEGF pathway

While VEGF is a predominant mediator of angiogenesis, there are different strategies for inhibiting its pathway. The two primary strategies include inhibiting either the VEGF ligand or the VEGF receptor (Fig. 2.2) [111].

Figure 2.2: Strategies for inhibiting the VEGF pathway (adapted from [111])

1. Some anti-VEGF strategies target the ligand, allowing for specific inhibition of the VEGF pathway.

2. Other strategies target the receptor. Some of these approaches (e.g., tyrosine kinase inhibitors (TKIs)) have a wider range of inhibitory effects beyond the VEGF pathway.

2.2.5 Extracellular targeting of the VEGF ligand

Anti-VEGF strategies that directly target the VEGF ligand include ligand-binding antibodies and soluble receptors. These agents produce specific inhibition of the VEGF pathway via extracellular communication and inhibit angiogenesis without disrupting other non-VEGF-related pathways [66]. VEGF also promotes angiogenesis by signaling through neuropilin, a co-receptor to VEGFR-1 and -2 on endothelial cells. The presence of neuropilin is associated with tumor aggressiveness and poor prognosis. Neuropilin receptors lack a targetable intracellular kinase domain. Therefore, anti-VEGF strategies that target the ligand extracellularly may be capable of attenuating VEGF signaling that is mediated through neuropilin [31, 36, 57].

Neuropilin receptors lack an intracellular kinase domain but are able to induce signaling by recruiting ligands to the cell membrane. There is also evidence to suggest that neuropilin may be able to transduce VEGF signals in the absence of tyrosine kinase VEGF receptors [59]. Accordingly, prevention of neuropilin-mediated signaling may require an extracellular strategy of VEGF inhibition. Considerable variation exists in the level of binding affinity among agents that target the VEGF ligand directly. In particular, soluble receptors have been observed to have a higher affinity for VEGF than monoclonal antibodies. In preclinical models, attempts to make molecules with higher VEGF binding affinity have not yielded an improved antitumor effect [see 50].

2.2.6 Intracellular targeting of the VEGF receptor

Anti-VEGF strategies that target the VEGF receptor include TKIs and receptor antibodies. Agents that target the VEGF receptor intracellularly, such as TKIs, have a wider range of inhibitory effects beyond the VEGF pathway and may inhibit other pathways that are also mediated through receptor kinases [66, 68, 94]. Evidence also suggests that the specificity of an agent may impact its ability to target any one pathway [132]. In particular, while agents that work broadly may be capable of inhibiting multiple pathways, their ability to inhibit activity through individual pathways and receptors may vary.

2.2.7 Inhibition of new and recurrent tumor vessel growth

Direct targeting of the VEGF ligand may result in ongoing inhibition of both new and recurrent tumor vessel growth (Table 2.2). It has been proposed that these effects may help inhibit tumor growth and metastasis. Research also suggests that blockade of VEGF signaling may help inhibit tumor growth by preventing progenitor cells from initiating new vessel growth at both primary and metastatic sites [26, 79, 102].

2.2.8 Regression of existing tumor vasculature

Based on preclinical and clinical models, it has also been proposed that direct VEGF inhibition may regress existing tumor vessels (Table 2.2). These reductions in microvascular density have been associated with a reduction in tumor volume and weight [33, 143, 146].

Table 2.2: Summary of proposed effects of direct VEGF inhibition

Tumor vessel inhibition | Tumor vessel regression
Interferes with the ability of VEGF to help tumor vessels establish and grow | Interferes with the ability of VEGF to help tumor vessels survive
Associated with reduced tumor growth and decreased metastatic potential | Associated with reduction in microvascular density and tumor volume

2.2.9 Regrowth of tumor vasculature may occur when VEGF inhibition is stopped

While continued VEGF inhibition is thought to maintain important anti-angiogenic effects that keep tumor cells from growing and spreading, cessation of VEGF suppression may diminish those effects. In preclinical models, withdrawal of an anti-VEGF agent has been shown to result in regrowth of tumor vasculature [10, 61, 84, 138].

While the cessation of VEGF inhibition with anti-VEGF antibodies has been shown to result in eventual regrowth of tumor vessels, it has not been shown to result in a rebound effect (i.e., more aggressive or accelerated growth following discontinuation). In one series of preclinical experiments, the tumor recovery rate following cessation of VEGF inhibition was found to be slower than in control tumors in 25 of 26 (96%) tumor cell lines studied. This lack of tumor rebound was observed whether VEGF inhibition was applied alone or in combination with chemotherapy [1, 8, 135].

2.2.10 The rationale for continuing VEGF inhibition

Research suggests that continuing direct VEGF inhibition alone may help preserve antitumor effects achieved following initial combination with chemotherapy [25]. In preclinical models, continuation of VEGF inhibition following initial combination with chemotherapy inhibited tumor recurrence and prolonged survival in mice [8, 81].

2.2.11 An evolving understanding of tumor biology brings new hypotheses about the mechanisms of tumor progression

In devising effective antitumor strategies over time, it is important to consider the possible reasons for tumor progression, which may vary widely among different modalities (Table 2.3). Tumor cells are typically genetically unstable. With agents that target tumor cells directly, such as chemotherapy and some TKIs, progression may therefore occur as a result of acquired resistance. Mechanisms of acquired resistance include activation of efflux pumps (which prevents the accumulation of an agent in the cell) and mutations on or inside the cell surface (which reduce the ability of an agent to bind to or inhibit its target). As a result, the effectiveness of many tumor-targeting agents may diminish over time. Based on preclinical observations, it has been proposed that both endothelial cells and the VEGF ligand are genetically stable. Therefore, escape from agents that directly target the VEGF ligand is generally not thought to occur through acquired resistance but rather through the activation of secondary pathways. For example, while VEGF is present throughout the tumor life cycle, secondary angiogenic factors that function independently of VEGF can become upregulated over time (Fig. 2.1). This activation of secondary pathways is thought to be innate to tumor biology and not directly related to the targeting of the VEGF ligand.

Table 2.3: Current understanding of the mechanisms that drive tumor progression

 | Acquired resistance | Activation of secondary pathways
Involves mutational pathways? | Yes | No
Results in loss of binding/inhibitory effects? | Yes | No
Observed with agents that target VEGF extracellularly? | No | Yes

While other factors may become activated, the VEGF ligand remains a predominant mediator of angiogenesis. Because it is genetically stable and continually expressed, direct VEGF ligand inhibition remains a rational strategy throughout tumor development. One strategy being evaluated is maintaining direct VEGF ligand inhibition over time while selectively targeting other relevant pathways as they emerge.

2.3 Cancer stem cells

Cancer stem cells are cancer cells found within tumors that possess characteristics associated with normal stem cells, specifically the ability to give rise to all cell types found in a particular cancer. In many tumors, a small proportion of the neoplastic cell population is composed of self-renewing tumor stem cells. These cells spawn the bulk of the cancer cells in tumors, which have many of the properties of normal progenitor or transit-amplifying cells [145]. Cancer stem cells are therefore tumorigenic (tumor-forming), in contrast to other non-tumorigenic cancer cells. Cancer stem cells may generate tumors through the stem cell processes of self-renewal and differentiation into multiple cell types. Such cells are proposed to persist in tumors as a distinct population and cause relapse and metastasis by giving rise to new tumors. The development of specific therapies targeting cancer stem cells holds hope for improving the survival and quality of life of cancer patients, especially patients with metastatic disease.

The existence of cancer stem cells in many solid tumors has profound implications for the evaluation of many types of anti-cancer treatments. If a drug is able to eliminate the stem cells in a tumor while leaving the bulk cancer cell population intact, the tumor mass may appear to be unaffected by the treatment and will begin to shrink slowly only as the remaining transit-amplifying cells gradually senesce and die in the following weeks.

2.4 Observational Model

It is important to understand how translations are made between causal processes and probability distributions in order to avoid treating the two as scientifically equivalent. To achieve that goal, one needs to describe the relationship between the variables that are involved in a causal process and the probability distribution of these variables that the causal process generates. The difference between a causal model, an observational model, and a statistical model is shown using the simple example of the statement "rain causes mud", which implies an asymmetric relationship: the rain will create mud, but the mud will not create rain. The symbol '→' will be used to show causal relationships. If there is no arrow between two variables, there is no direct causal relationship between them, or they are causally independent. The symbol '−' is used to show observational relationships (Fig. 2.3).

A. The causal relationships between rain, mud and other causes of mud: Rain → Mud ← Other causes of mud

B. The observational relationships between rain, mud and other causes of mud: Rain − Mud − Other causes of mud

Figure 2.3: Observational Relationships and Causal Relationships

The observational model that relates to this causal model is the statement that 'having observed rain will give us information about what we will observe concerning mud'. This statement concerns information only, not causation, and it is not asymmetric. If we learn that it has rained, then we have added information concerning the mud. However, observing the mud will also give us information about whether or not it has rained (Fig. 2.4).

A. A statistical model relating rain and mud: Mud(cm) = 0.1 Rain(cm) + N(0, 0.1)

B. Another statistical model relating rain and mud: Rain(cm) = 10 Mud(cm) + N(0, 1)

Figure 2.4: Statistical Model

The statistical model specifies the mathematical relationship between the variables and the probability distribution of the variables. The equivalence operator '=' refers to quantitative equivalence in a mathematical statement. Statement A says that the value obtained by measuring the depth of the mud is equivalent to the value obtained by measuring the amount of rain that falls, multiplying this value by 0.1, and adding a random value taken from a normal distribution whose population mean is zero and whose population standard deviation is 0.1.
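The point that a statistical model carries no causal direction can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the uniform rain distribution, the sample size, and the helper names `simulate_rain_mud` and `ols_slope` are hypothetical and not taken from the source. Data are generated from statement A, and a least-squares line is then fitted in both directions.

```python
import numpy as np

def simulate_rain_mud(n=5000, seed=0):
    """Generate rain (cm) and mud (cm) with Mud = 0.1 * Rain + N(0, 0.1)."""
    rng = np.random.default_rng(seed)
    rain = rng.uniform(0.0, 10.0, size=n)            # assumed rain distribution
    mud = 0.1 * rain + rng.normal(0.0, 0.1, size=n)  # statement A of Figure 2.4
    return rain, mud

def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    x_centered = x - x.mean()
    return float(x_centered @ (y - y.mean()) / (x_centered @ x_centered))

rain, mud = simulate_rain_mud()
slope_mud_on_rain = ols_slope(rain, mud)  # recovers a value near 0.1
# The reverse regression is an equally valid statistical fit, although mud
# does not cause rain. Its slope is below 1 / 0.1 = 10 because mud is noisy.
slope_rain_on_mud = ols_slope(mud, rain)
```

Both fits describe the same joint distribution equally well, so the arrow in "rain causes mud" has to come from outside the statistical model.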

However, the symbols '→' and '=' are not the same thing when referring to 'cause', and mistranslating one into the other leads to nonsensical causal claims. The strategy for translation is to use directed graphs to express causal relationships and to use d-separation, which was developed in [105], to convert the statements of the directed graph into statements of conditional independence of random variables obeying a particular probability distribution.

2.5 Causal Model

A causal graph, or directed graph, is shown in Figure 2.5.

[Directed graph with vertices A, B, C, D, E, F and edges A → C, B → C, C → D, C → E, D → F, E → F]

Figure 2.5: A directed graph describing the causal relationships

In Figure 2.5, A and B are causally independent: changing either will not affect the value of the other. C, D, E, and F are causally dependent on A and B, either directly (C) or indirectly (D, E, F). Causal dependence means that changes in either A or B will provoke changes in each of C, D, E, and F, but changes in any of these will not provoke changes in either A or B. A and B are direct causes of C because changes in A or B will provoke changes in C. A and B are indirect causes of D, E, and F because changes in A or B will only provoke changes in these variables by causing changes in C: if C is blocked, then A and B will no longer cause changes in these three other variables. C is a direct common cause of D and E and an indirect cause of F through its effect on D and E. Both E and D are direct causes of F; because they share the common cause C, they are not themselves causally independent.
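The relationships described for Figure 2.5 can be checked mechanically. The following is a minimal sketch, assuming the graph has exactly the edges named in the text (A → C, B → C, C → D, C → E, D → F, E → F); the function names are hypothetical helpers introduced for illustration.

```python
# Directed edges of the graph in Figure 2.5, read as (cause, effect).
EDGES = [("A", "C"), ("B", "C"), ("C", "D"), ("C", "E"), ("D", "F"), ("E", "F")]

def parents(node, edges=EDGES):
    """Direct causes of `node`."""
    return {cause for cause, effect in edges if effect == node}

def children(node, edges=EDGES):
    """Direct effects of `node`."""
    return {effect for cause, effect in edges if cause == node}

def _reach(node, step, edges):
    """Collect every vertex reachable from `node` by repeatedly applying `step`."""
    found, stack = set(), [node]
    while stack:
        for nxt in step(stack.pop(), edges):
            if nxt not in found:
                found.add(nxt)
                stack.append(nxt)
    return found

def descendants(node, edges=EDGES):
    return _reach(node, children, edges)

def ancestors(node, edges=EDGES):
    return _reach(node, parents, edges)

def indirect_causes(node, edges=EDGES):
    """Ancestors that are not direct causes."""
    return ancestors(node, edges) - parents(node, edges)
```

For example, `indirect_causes("D")` returns A and B, while `descendants("A")` does not contain B, which matches the statement that A and B are causally independent.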

In biology, a directed graph is used to express causal relationships between variables. The notation 'X → Y' means that X is a direct cause of Y. The notation 'X ← Y' means that Y is a direct cause of X. The notation 'X ↔ Y' means that neither X nor Y is a cause of the other but both share common unknown causes, that is, unspecified latent variables.

A and B are the causal parents of C, and C is their causal child. A is an indirect cause of D because its causal effect is conditional on the behavior of C. A and B are causal ancestors of D, and D is a causal descendant of both A and B. As an example, consider the murder of a victim by a gunman, where the direct cause of the victim's death is the gunman's action, which can be written as 'Gunman's action → Murder of victim'. However, if the bullet is included in the description, the bullet becomes the direct cause of death and the gunman an indirect cause, which can be written as 'Gunman's action → Bullet → Murder of victim'. In building a causal explanation, the trick is to choose a level of causal complexity that is sufficiently detailed to meet the goal of the study while remaining applicable in practice.

A directed path in a causal graph is an ordered sequence of variables linked by edges that all point in the same direction. If there is no directed path between two variables, neither is a cause of the other. If more than one directed path links two variables, the relationship is one of causal conditional independence. An undirected path is the same as a directed path except that it ignores the direction of the edges (head to tail); colliders along such a path are treated separately.

A collider is a vertex with arrows pointing into it from both directions. For example, in Figure 2.5, F in the undirected path D → F ← E is a collider. A vertex that is a collider along an undirected path is inactive in its normal (unconditioned) state, which means that a collider blocks the transmission of causal effects along such a path. Conversely, a non-collider is active in its normal (unconditioned) state, which means that it permits the transmission of causal effects along the path. This is like the 'ON' and 'OFF' states in an electrical circuit.

An unshielded collider is a set of three variables A → B ← C along a path such that B is a collider and, additionally, there is no edge between A and C. If there is an edge between A and C, then B is a shielded collider.

The natural state of a non-collider is the active (ON) state, and the natural state of a collider is the inactive (OFF) state. It is possible for a vertex (variable) to be active along one path and inactive along another. For instance, notice the vertex F along the path D → F ← E in Figure 2.5. Conditioning on a vertex in a causal graph changes its state: if it was active, conditioning inactivates it, and if it was inactive, conditioning activates it.
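The claim that conditioning activates a collider can be seen numerically. The sketch below uses the unshielded collider A → C ← B from Figure 2.5 and assumes a linear Gaussian model, C = A + B + noise, chosen purely for illustration; the coefficients, sample size, and the helper `partial_corr` are assumptions, not part of the source.

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing z out of both."""
    def residual(v):
        slope = np.cov(v, z)[0, 1] / np.var(z, ddof=1)
        return v - v.mean() - slope * (z - z.mean())
    return float(np.corrcoef(residual(x), residual(y))[0, 1])

rng = np.random.default_rng(1)
n = 20000
a = rng.normal(size=n)                       # cause A
b = rng.normal(size=n)                       # cause B, generated independently of A
c = a + b + rng.normal(scale=0.5, size=n)    # collider C with A -> C <- B

marginal = float(np.corrcoef(a, b)[0, 1])    # collider unconditioned (OFF)
conditional = partial_corr(a, b, c)          # collider conditioned on (ON)
```

Under these assumptions `marginal` stays close to zero, whereas `conditional` is strongly negative: once C is known, a large A implies a small B, so the path through the collider transmits information.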

2.6 Translating from causal to statistical model

It is important to understand how translations are made between causal processes and probability distributions to avoid treating the two as scientifically equivalent. To achieve that goal, one must describe the relationship between the variables involved in a causal process and the probability distribution of these variables that the causal process generates. The difference between a causal model, an observational model, and a statistical model is shown using the simple example of the statement "rain causes mud", which implies an asymmetric relationship: the rain will create mud, but the mud will not create rain. The symbol '→' will be used to show causal relationships. If there is no arrow between two variables, there is no direct causal relationship between them, or they are causally independent. The symbol '−' is used to show observational relationships.

2.7 d-separation

The definition of d-separation by Judea Pearl in 1988 says that:

”A set S of nodes is said to block a path p if either (1) p contains at least one arrow-emitting node that is in S, or (2) p contains at least one collision node that is outside S and has no descendant in S. If S blocks all paths from set X to set Y , it is said to ”d-separate X and Y ,” and then, it can be shown that variables X and Y are independent given S, written X ⊥ Y |S”.

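[Directed graphs over Z, X, and Y with exogenous variables UZ, UX, and UY. (a) Z → X → Y, with UZ → Z, UX → X, and UY → Y. (b) The same variables with X set to the value x0.]

Figure 2.6: A directed graph used to illustrate the notion of d-separation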

To illustrate, the path UZ → Z → X → Y in Figure 2.6(a) is blocked by S = {Z} and by S = {X}, since each emits an arrow along that path. As a result, we can infer that the conditional independencies UZ ⊥ Y | Z and UZ ⊥ Y | X will be satisfied in any probability function that this model can generate, regardless of how we parameterize the arrows. Similarly, the path UZ → Z → X ← UX is blocked by the null set ∅, but it is not blocked by S = {Y}, since Y is a descendant of the collision node X. Consequently, the marginal independence UZ ⊥ UX will hold in the distribution, but UZ ⊥ UX | Y may or may not hold [105, 106].
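The two blocking rules in the definition can be applied by brute force on a small graph. The sketch below enumerates every undirected path and tests each interior node against rules (1) and (2). It assumes the edge set UZ → Z, UX → X, UY → Y, Z → X, X → Y for Figure 2.6(a); all function names are hypothetical, and path enumeration is practical only for small graphs such as this one.

```python
# Assumed edges of Figure 2.6(a), written as (cause, effect).
EDGES = {("UZ", "Z"), ("UX", "X"), ("UY", "Y"), ("Z", "X"), ("X", "Y")}

def descendants(node, edges):
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for cause, effect in edges:
            if cause == current and effect not in found:
                found.add(effect)
                stack.append(effect)
    return found

def undirected_paths(start, goal, edges, path=None):
    """Yield every simple path from start to goal, ignoring edge direction."""
    path = [start] if path is None else path
    if start == goal:
        yield path
        return
    for cause, effect in edges:
        if start in (cause, effect):
            nxt = effect if cause == start else cause
            if nxt not in path:
                yield from undirected_paths(nxt, goal, edges, path + [nxt])

def path_is_blocked(path, s, edges):
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        is_collider = (prev, node) in edges and (nxt, node) in edges
        if is_collider:
            # Rule (2): a collision node outside S with no descendant in S blocks.
            if node not in s and not (descendants(node, edges) & s):
                return True
        elif node in s:
            # Rule (1): an arrow-emitting node that is in S blocks.
            return True
    return False

def d_separated(x, y, s, edges=EDGES):
    s = set(s)
    return all(path_is_blocked(p, s, edges) for p in undirected_paths(x, y, edges))
```

With this edge set, `d_separated("UZ", "Y", {"Z"})` and `d_separated("UZ", "UX", set())` both hold, while `d_separated("UZ", "UX", {"Y"})` does not, in line with the example in the text.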

2.8 Summary

This chapter discussed the literature surrounding the study of glioblastoma and angiogenesis. Background knowledge of observational models and causal models was introduced to clarify the distinction between them. The method that we use to translate between the observational model and the causal model is d-separation. Additional information on the hallmarks of cancer and angiogenesis is important for extending the framework of the study because it provides knowledge necessary for understanding the mechanisms by which tumors become cancerous.

Chapter 3

Model Selection : Glioblastoma model

3.1 Glioblastoma model

We selected the brain cancer model from [132]. We implemented the model by describing the cellular activities and the locations where the activities take place. In our model, multilevelness was framed to show differences in cellular processes.

A description of the multilevel phenotype switch functioning in the brain cancer is shown in Figure 3.1:

Initially, glucose diffuses from blood vessels into the extracellular microenvironment (Figure 3.1). A tumor cell, represented as an agent, consumes glucose and evaluates the concentration of glucose at its current location. If the concentration is greater than the cell active threshold, the agent becomes active and uses its EGFR signaling pathway to determine its phenotype. If the concentration of glucose is less than the dead threshold, the cell dies. If the concentration is between the active and dead thresholds, the agent takes on a quiescent state [132]. Each active agent evaluates its migration potential (MP) as described in equation (3.1).

\[ MP[PLC_\gamma] = \frac{d[PLC_\gamma]}{dt} \tag{3.1} \]

[Figure 3.1 flowchart. Box labels: glucose, oxygen, TGFα; blood vessel, blood vessel's wall, diffusion, extracellular matrix; active/quiescent tumor cells; EGFR pathway; MP[PLCγ] threshold test (yes/no); cell cycle; cdh1 and thr2 test (yes/no); division; migration; free space (yes/no); proliferation; new cells; quiescent; apoptosis. Levels shown: extracellular micro-environment level, molecular level, cellular level.]

Free space = empty neighborhood site for cells to reside
Migration = phenotype of cells
Division = cell divides into two cells
Proliferation = phenotype of cells (the reproduction of similar forms)
Quiescent = inactive cells
Cell cycle = the series of events that take place in a cell leading to its division and duplication

Figure 3.1: Multilevel phenotype switch functioning in the brain cancer

Here d[PLCγ]/dt is the rate of change of the PLCγ concentration. PLCγ belongs to a class of enzymes that cleave phospholipids just before the phosphate group and play an important role in the intracellular transduction of receptor-mediated tyrosine kinase activators. If MP is greater than a threshold σ_PLCγ, the agent chooses the migration phenotype. If MP is less than σ_PLCγ, the agent starts to proliferate. If the concentration of CDH1 (Cadherin-1, a protein that in humans is encoded by the CDH1 gene, a tumor suppressor gene) is less than a threshold thr1 and the concentration of cycCDK (a family of protein kinases first discovered for their role in regulating the cell cycle) is greater than the threshold thr2, the cell divides [see 132].

After that, the cell chooses the most attractive free site (location for cells to reside) in the neighborhood to deliver its offspring. If there is no empty neighborhood, the cell turns into a reversible quiescent state until free space becomes available [see 132].
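The decision rules described above can be summarized as a small state function. This is a sketch under stated assumptions, not the implementation used in [132]: the threshold values, the handling of values exactly equal to a threshold, the state labels, and the names `Thresholds`, `migration_potential`, and `choose_state` are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Thresholds:
    # Placeholder values for illustration only; none are taken from the source.
    glucose_active: float = 1.0
    glucose_dead: float = 0.2
    sigma_plc: float = 0.5      # migration threshold on MP
    thr1_cdh1: float = 0.3
    thr2_cyccdk: float = 0.7

def migration_potential(plc_now, plc_prev, dt):
    """Finite-difference estimate of d[PLCgamma]/dt, i.e. MP in equation (3.1)."""
    return (plc_now - plc_prev) / dt

def choose_state(glucose, mp, cdh1, cyccdk, has_free_site, th=Thresholds()):
    """Return the agent state implied by the rules in the text."""
    if glucose < th.glucose_dead:
        return "dead"
    if glucose <= th.glucose_active:
        return "quiescent"              # between the dead and active thresholds
    if mp > th.sigma_plc:
        return "migrate"
    # Below the PLCgamma threshold the agent proliferates. It divides only if
    # the cell-cycle conditions hold and a free neighboring site exists.
    if cdh1 < th.thr1_cdh1 and cyccdk > th.thr2_cyccdk:
        return "divide" if has_free_site else "quiescent"
    return "proliferate"
```

Keeping the thresholds in one immutable object makes it easy to swap in calibrated values without touching the decision logic.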

Each agent chooses the most attractive location mentioned above according to the following probability:

\[ P_j = \frac{\psi G_j}{F_j} + (1 - \psi)\,\epsilon_j, \tag{3.2} \]

where G_j is the glucose concentration at location j, F_j is the fibronectin concentration at j, ε_j ∼ N(0, 1) is a normally distributed error term, and the parameter ψ ∈ (0, 1) represents the extent of the search precision [132].
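A minimal sketch of how equation (3.2) could be evaluated over candidate sites is given below. It treats P_j as an attractiveness score and picks the largest one, which is an interpretation rather than a detail confirmed by the source; the function names are hypothetical, and fibronectin concentrations are assumed to be strictly positive so that the ratio is defined.

```python
import numpy as np

def site_scores(glucose, fibronectin, psi, rng):
    """Evaluate P_j = psi * G_j / F_j + (1 - psi) * eps_j for every candidate site."""
    glucose = np.asarray(glucose, dtype=float)
    fibronectin = np.asarray(fibronectin, dtype=float)   # assumed > 0
    eps = rng.standard_normal(glucose.shape)             # eps_j ~ N(0, 1)
    return psi * glucose / fibronectin + (1.0 - psi) * eps

def pick_site(glucose, fibronectin, psi, rng):
    """Index of the most attractive site, i.e. the largest P_j."""
    return int(np.argmax(site_scores(glucose, fibronectin, psi, rng)))

rng = np.random.default_rng(0)
chosen = pick_site([1.0, 4.0, 2.0], [1.0, 1.0, 1.0], psi=0.7, rng=rng)
```

As ψ approaches 1 the choice is driven almost entirely by the glucose-to-fibronectin ratio, and as ψ approaches 0 it becomes a random search.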

3.1.1 Glucose

Glucose diffuses from blood vessels into tumor cells via the extracellular microenvironment. The process is represented in the equation below:

\[ \frac{\partial G}{\partial t} = D_G \Delta G + X_{ves}(t,x)\, q_G\, (G_{blood} - G) - X_{tum}(t,x)\, U_G \tag{3.3} \]

where

G = glucose concentration in tumor (the tumor can consume glucose)

G_blood = glucose concentration in blood (constant)

D_G = diffusivity of glucose (constant; shows how fast glucose can penetrate the blood vessel)

Δ = Laplace operator (Δ ≡ ∇²)

q_G = 2πrρ_G (constant)

ρ_G = vessel permeability for glucose (constant)

r = average blood vessel radius (constant)

U_G = glucose uptake rate of surrounding cells (constant)

X_ves(t, x) = time-dependent characteristic function of the blood vessels:

\[ X_{ves}(t,x) = \begin{cases} 1 & \text{if a blood vessel is present at } x \\ 0 & \text{otherwise} \end{cases} \]

3.1.2 Oxygen

Angiogenesis is regulated by changes in oxygen tension. Endothelial cells (ECs) and smooth muscle cells form the first-line interface with the blood, and these cells have mechanisms to sense differences in the oxygen supply [47]. Oxygen permeates the blood vessels and diffuses into tumor cells. The process is described in the equation below:

\[ \frac{\partial C}{\partial t} = D_C \Delta C + X_{ves}(t,x)\, q_C\, (C_{blood} - C) - X_{tum}(t,x)\, U_C \tag{3.4} \]

where

C = oxygen concentration in tumor (the tumor can consume oxygen)

C_blood = oxygen concentration in blood (constant)

D_C = diffusivity of oxygen (constant; shows how fast oxygen can penetrate the blood vessel)

Δ = Laplace operator (Δ ≡ ∇²)

q_C = 2πrρ_C

ρ_C = vessel permeability for oxygen (constant)

r = average blood vessel radius (constant)

U_C = oxygen uptake rate of surrounding cells (constant)

X_ves(t, x) = time-dependent characteristic function of the blood vessels:

\[ X_{ves}(t,x) = \begin{cases} 1 & \text{if a blood vessel is present at } x \\ 0 & \text{otherwise} \end{cases} \]

X_tum(t, x) = time-dependent characteristic function of the tumor cells:

\[ X_{tum}(t,x) = \begin{cases} 1 & \text{if oxygen diffuses into the tumor region at } x \\ 0 & \text{otherwise} \end{cases} \]
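Equations of this form (diffusion, a vessel source term, and a tumor uptake term) can be advanced in time with a simple explicit finite-difference scheme. The sketch below is a one-dimensional illustration only: the grid, the dimensionless parameter values, the zero-flux boundary, and the clipping of negative concentrations are assumptions made for the example, and the source does not state which numerical scheme was used.

```python
import numpy as np

def diffusion_step(conc, vessel_mask, tumor_mask, D, q, blood_conc, uptake, dx, dt):
    """One explicit Euler step of dC/dt = D*Lap(C) + X_ves*q*(C_blood - C) - X_tum*U."""
    padded = np.pad(conc, 1, mode="edge")                 # zero-flux boundary (assumed)
    laplacian = (padded[:-2] - 2.0 * conc + padded[2:]) / dx**2
    rate = D * laplacian + vessel_mask * q * (blood_conc - conc) - tumor_mask * uptake
    return np.maximum(conc + dt * rate, 0.0)              # keep concentrations non-negative

# Hypothetical setup: a vessel at the left edge and a tumor region in the interior.
n = 50
conc = np.zeros(n)
vessel = np.zeros(n)
vessel[0] = 1.0
tumor = np.zeros(n)
tumor[30:40] = 1.0
for _ in range(2000):
    # D*dt/dx**2 = 0.2 keeps the explicit scheme within its stability limit of 0.5.
    conc = diffusion_step(conc, vessel, tumor, D=1.0, q=0.5,
                          blood_conc=1.0, uptake=0.001, dx=1.0, dt=0.2)
```

The same function applies to glucose by substituting D_G, q_G, G_blood, and U_G; a production model would use the two- or three-dimensional Laplacian and the dimensional values of Table 3.1.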

3.1.3 Transforming Growth Factor alpha

TGFα is secreted by tumor cells and can act in a paracrine manner (having an effect only in the vicinity of the cell secreting it) and in a juxtacrine manner, which requires the cell producing the effector to be in direct contact with the cell containing the appropriate receptor [132, 150]. The equation below shows the process:

\[ \frac{\partial T}{\partial t} = D_T \Delta T + X_{ves}(t,x)\, q_T\, T + X_{tum}(t,x)\, S_T - \delta_T T \tag{3.5} \]

where

T = TGFα concentration

D_T = TGFα diffusivity

q_T = 2πrρ_T, where ρ_T is the vessel permeability to TGFα

S_T = the cell's net production rate of TGFα

δ_T = natural decay rate of TGFα

Zero flux is assumed along the boundary of the considered domain [132]. All parameters and initial conditions of the equations are listed in Table 3.1.

3.1.4 Vascular Endothelial Growth Factor (VEGF)

Tumor cells secrete VEGF into the extracellular environment (Figure 3.1). After that, VEGF diffuses into the surrounding corneal tissue (fibrous, tough, unyielding, and perfectly transparent) and is consumed by endothelial cells [132]. The process of VEGF consumption is described by the following equation:

Table 3.1: Initial values and parameter values used in the current model

Symbol | Value | Unit | Description
D_G | 6.7 × 10^-7 | cm^2 s^-1 | Diffusion coefficient of glucose
D_C | 8 × 10^-5 | cm^2 s^-1 | Diffusion coefficient of oxygen
D_T | 5.18 × 10^-7 | cm^2 s^-1 | Diffusion coefficient of TGFα
D_V | 2.9 × 10^-7 | cm^2 s^-1 | Diffusion coefficient of VEGF
S_T | 0.2 | nM/h | Secretion rate of TGFα
S_V | 0.6 | nM/h | Secretion rate of VEGF
U_G | 0.28 | mmol/h | Uptake rate of glucose
U_C | 6.25 × 10^-8 | DC/h | Uptake rate of oxygen
P_G | 3.0 × 10^-5 | cm/s | Permeability of glucose
P_C | 3.0 × 10^-4 | cm/s | Permeability of oxygen
P_V | 0.1 × 10^-4 | cm/s | Permeability of VEGF
β | 0.05 | DC | Production rate of fibronectin
γ | 0.1 | DC | Uptake rate of fibronectin
r | 10 | µm | Average radius of micro blood vessel

\[
\frac{\partial V}{\partial t} = D_V \Delta V - X_{\mathrm{ves}}(t, x)\, q_V + X_{\mathrm{tum}}(t, x)\, S_V - \delta_V V \tag{3.6}
\]

where

$V$ = VEGF concentration

$D_V$ = diffusivity of VEGF

$q_V$ = the vessel permeability for VEGF

$S_V$ = the cell's VEGF secretion rate

$\delta_V$ = the natural decay rate of VEGF

3.1.5 Fibronectin

Fibronectin is a high-molecular-weight (~440 kDa) glycoprotein of the extracellular matrix that binds to membrane-spanning receptor proteins called integrins. In the corneal tissue it is secreted by endothelial cells. Fibronectin can promote endothelial cell migration through a chemotactic effect [109]. The tumor cells can consume fibronectin. This process is described by the equation below:

\[
\frac{\partial F}{\partial t} = X_{\mathrm{ves}}(t, x)\, \beta - X_{\mathrm{tum}}(t, x)\, \gamma F \tag{3.7}
\]

where

$F$ = fibronectin concentration

$\beta$ = production rate of fibronectin

$\gamma$ = uptake rate of fibronectin
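A short worked case may help to read equation (3.7). At a location occupied by both a vessel and tumor cells, both characteristic functions equal 1 and the equation reduces to a linear ordinary differential equation. The symbol $F_0$ for the initial concentration is introduced here only for illustration, and the numerical value uses β and γ as listed in Table 3.1, in whatever units that table intends.

```latex
\[
\frac{dF}{dt} = \beta - \gamma F
\quad\Longrightarrow\quad
F(t) = \frac{\beta}{\gamma} + \left(F_0 - \frac{\beta}{\gamma}\right) e^{-\gamma t},
\qquad
F^{*} = \frac{\beta}{\gamma} = \frac{0.05}{0.1} = 0.5 .
\]
```

Where only a vessel is present, the same equation gives linear growth at rate β; where only tumor cells are present, it gives exponential decay at rate γ.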

The process starts with vessel budding (the early developmental stage) from pre-existing, or parent, vessels and includes cell proliferation, migration, anastomosis, and the budding of new sprouts off newly formed vessels. A probabilistic description is used to capture the uncertainty of cell migration and the branching of vessels in two dimensions, based on real experiments [132]. Assume that a source of angiogenic stimulus (VEGF) exists at some distance from the existing microvessels. The existing vasculature is a single parent vessel located at y = 0. New buds are located along the parent vessel. All buds begin to grow off the parent vessel at the same time (t = 0), and no budding off the parent vessel occurs at later times [see 132]. The path of each capillary is determined by the trajectory of an actively migrating cell located at its tip. All other cells in the sprout follow directly behind the tip cell, so the sprout is computed as a single-cell migration in which the position of the entire sprout follows only the movement of the tip cell [132]. The endothelial cells within a sprout proliferate to provide additional cells for the expansion of the sprout. Cells are able to move from one sprout into another attached sprout. Parameter values for the tip endothelial cell's random motility and proliferation are obtained independently by direct measurement or by estimation from experimental observations. This helps to investigate whether the chemotactic response of endothelial cells (chemotaxis is the movement of single cells or multicellular organisms directed by certain chemicals in their environment) is important to the directional development of the microvessel networks commonly observed in vivo. The optimal level of chemotaxis for effective microvessel network formation is important to the study, as it can suggest a possible way to control the growth of the neovasculature that feeds a tumor [132]. It should be investigated and compared with laboratory measurements. The effects of the rate of cell proliferation were studied to investigate whether the proliferation rate governs the rate of vessel growth.

In response to VEGF stimulation in the microenvironment, endothelial cells located at the pre-existing blood vessel begin to proliferate. Endothelial tip cells begin to migrate out of the existing vessels into neighboring areas in the direction of the VEGF secretion source (at the tumor site) [132]. The migration of tip cells is described in Figure 3.2.

Several endothelial tip cells line up to form a solid sprout, and cells move from one sprout to another. As a result, the vessels keep growing even when cell proliferation is prevented. The endothelial cells in the sprout proliferate, producing more cells for the continued expansion of the sprout. One sprout runs into another and anastomoses (anastomosis is the connection of separate parts of a branching system, such as blood vessels and their branches, to form a network) to form loops [2]. More sprouts keep growing and forming new vessels, which facilitates the expansion of the microvascular network. The new blood vessels allow blood to flow throughout the new vasculature as it develops. Finally, pericytes (contractile cells that wrap around the endothelial cells of capillaries and venules throughout the body) take up position along the abluminal surface (the surface that faces away from the lumen) of the capillaries [2, 145].

During the process of angiogenesis, tumor cells secrete vascular endothelial growth factor (VEGF) into the microenvironment to induce and sustain new capillary sprouts migrating from the pre-existing vasculature towards the tumor. This helps to maintain the tumor cells' metabolism by supplying them with glucose and oxygen. The process allows the endothelial cells to migrate away from the blood vessel and toward the direction from which the tumor secretes VEGF. The leading endothelial cells are called sprout tips; other endothelial cells divide, migrate, align, and form tubes of endothelial cells surrounding a vascular lumen. The vessels link with one another to form a network of loops in a process called anastomosis [2]. Once new blood vessels are formed, they can provide the tumor with a direct supply of oxygen and nutrients. The fresh nutrient supply allows a new stage of rapid tumor growth into the surrounding tissue.

Understanding the coordination of cellular behaviors is helpful for therapeutic manipulation of the process. The study focuses on the role of endothelial cells in vessel growth. The random motility of endothelial cells is important for the growth rate of new capillaries and for the structure of the new vasculature. Randomness in the direction of cell migration is necessary for vessel anastomosis (communication between vessels by collateral channels) and capillary loop formation. The rate of vessel outgrowth is determined by the migration rate of the tip endothelial cell.
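The probabilistic tip-cell rule described above can be sketched as a biased random walk on a two-dimensional grid. This is only an illustration, not the model of [132]: the move probabilities, the grid size, and the assumption that the VEGF source lies at large y are all hypothetical, and sprout branching and anastomosis are left out.

```python
import random

# Hypothetical move probabilities; Figure 3.2 names P1..P4 but gives no values here.
MOVES = {
    "stay":  ((0, 0), 0.10),
    "left":  ((-1, 0), 0.20),
    "right": ((1, 0), 0.20),
    "down":  ((0, -1), 0.05),
    "up":    ((0, 1), 0.45),   # biased toward the VEGF source (assumed to lie at large y)
}


def migrate_tip_cell(n_steps, width=50, height=50, start_x=25, seed=0):
    """Return the path of one sprout tip starting on the parent vessel at y = 0."""
    rng = random.Random(seed)
    names = list(MOVES)
    weights = [MOVES[name][1] for name in names]
    x, y = start_x, 0
    path = [(x, y)]
    for _ in range(n_steps):
        name = rng.choices(names, weights=weights, k=1)[0]
        dx, dy = MOVES[name][0]
        # Clamp to the grid so the tip never leaves the domain.
        x = min(max(x + dx, 0), width - 1)
        y = min(max(y + dy, 0), height - 1)
        path.append((x, y))
    return path


tip_path = migrate_tip_cell(n_steps=100)
```

Because the whole sprout is assumed to follow its tip, the returned path doubles as the shape of the capillary. Raising or lowering the "up" weight is one simple way to explore how the strength of chemotaxis affects directional growth.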

3.2 Other published works of brain cancer, angiogenesis, and cancer cell

Table 3.2 lists other published papers related to brain cancer, angiogenesis, and cancer cells. These papers influence current and future studies of brain cancer. The development of therapeutic strategies is the main focus of brain cancer research.

Table 3.2: Other published works of brain cancer, angiogenesis, and cancer cell

Focus of study   Authors
Brain cancer     [26], [58, 62, 144], [3, 22, 27, 38, 82, 85, 132, 135, 150?], [23, 29, 30, 35, 39, 42, 44, 48, 52, 78, 98, 104, 108]
Angiogenesis     [5, 15, 16, 25, 43, 47, 59, 60, 77, 80, 87, 102, 107, 109, 117, 121, 126, 127, 140], [6, 18, 19, 24, 32, 37, 41, 49, 65, 67, 74, 91, 97, 101, 110, 115, 128, 131, 137]
Cancer cell      [4, 12–14, 40, 53, 64, 73, 76, 79, 86, 93, 103, 107, 113, 117, 118, 124, 136, 142]

3.3 Summary

In this chapter, we discussed the criteria for picking a model for understanding brain cancer. We do not pursue deterministic models or processes that use differential equations to describe cellular behaviors. We focus instead on a causal model that is based on linear relationships. The equations shown and described in this chapter are intended to provide an understanding of brain cancer modeling in a biological context.

We also described the phenomenon of angiogenesis in brain cancer and drew attention to the factors that much of the research identifies as involved in brain cancer. These factors will be used in our causal model in the next chapter.

[Figure 3.2 is a flowchart. Its inputs are VEGF, corneal tissue, endothelial cells, and fibronectin, which feed into the probability of migration. A chain of yes/no decisions (Pk = P1, Pk = P2, Pk = P3, Pk = P4, otherwise Pk = average of the Pi) selects one of the sprout moves: move down, move right, move up, move left, or stay the same. For sprout cells, the age of the vessel (> 18 hr) and free site availability decide between sprout branching and sprout migrating.]

Figure 3.2: Probability for endothelial tip cells migration

Chapter 4

Methodology

4.1 Translating from causal to statistical model

The strategy for translating a causal model into an observational model has three steps. First, express the causal hypothesis in a mathematical language (a directed graph) that can properly express the asymmetric types of relationship that imply causality. Second, use the translation device (d-separation) to translate this directed graph into probability theory, which expresses the notion of association. Finally, determine the types of (conditional) independence relationships that must occur in the resulting joint probability distribution. Figure 4.1 shows the translation process:

[Figure 4.1 is a three-box diagram: (1) express one's causal hypothesis in the form of a directed graph; (2) translate using d-separation; (3) derive the observational 'shadows' of the causal graph and express these in the language of probability theory.]

Figure 4.1: The translation from a causal model to an observational model
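The second step, reading independence claims off a directed graph, can be sketched in a few lines of Python. The sketch below uses the standard moralisation criterion for d-separation (restrict the graph to the ancestors of the variables involved, connect co-parents, drop arrow directions, then check whether the conditioning set blocks every path). The dissertation does not prescribe an algorithm, so this choice, the function names, and the small example graph are all assumptions made for illustration.

```python
from itertools import combinations


def ancestors(dag, nodes):
    """Return the given nodes together with all of their ancestors.

    `dag` maps each node to the set of its parents.
    """
    found = set(nodes)
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for parent in dag.get(node, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def d_separated(dag, x, y, given=()):
    """True if x and y are d-separated by `given` in the DAG."""
    given = set(given)
    keep = ancestors(dag, {x, y} | given)
    # Moralise the ancestral subgraph: link each node to its parents and "marry" co-parents.
    neighbours = {node: set() for node in keep}
    for child in keep:
        parents = [p for p in dag.get(child, ()) if p in keep]
        for parent in parents:
            neighbours[child].add(parent)
            neighbours[parent].add(child)
        for p, q in combinations(parents, 2):
            neighbours[p].add(q)
            neighbours[q].add(p)
    # x and y are d-separated iff every undirected path between them passes through `given`.
    seen, stack = {x}, [x]
    while stack:
        node = stack.pop()
        for nxt in neighbours[node]:
            if nxt == y:
                return False
            if nxt not in seen and nxt not in given:
                seen.add(nxt)
                stack.append(nxt)
    return True


# Hypothetical example graph: X -> Z -> Y and X -> W <- Y.
example_dag = {"Z": {"X"}, "Y": {"Z"}, "W": {"X", "Y"}}
```

In the example graph, X and Y are dependent marginally, become independent given Z, and become dependent again if the common effect W is added to the conditioning set. Each such statement is a claim about the joint probability distribution that can be tested on observational data.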

4.2 Statistical control and physical control

We already know from the previous section how to translate a causal hypothesis into a statistical hypothesis. First, transcribe the causal hypothesis into a causal graph showing how variables are causally linked to other variables in the form of direct and indirect effects. Second, use the d-separation criterion to predict what types of probabilistic independence relationships must exist when we observe a random sample of units that obey such a causal process. The relationship between control through external manipulation and probability distributions is given by the Manipulation Theorem [125].

For example, suppose someone has randomly sampled herbaceous plants growing in the understory of an open stand of trees. The measured variables are the light intensity experienced by the herbaceous plants, their photosynthetic rates, and the concentration of anthocyanin (a red pigment) in their leaves.

d-separation predicts the pattern of probabilistic independencies in this causal system for each of two alternative causal explanations of the data (Figure 4.2 (A, B)). Notice that anthocyanin concentration is d-separated from photosynthetic rate according to the first hypothesis both in the manipulated system (Figure 4.2 (B)), where the light intensity is experimentally fixed, and in the unmanipulated system (Figure 4.2 (A)), where light intensity is statistically fixed by conditioning. Statistical and experimental controls are alternative ways of doing the same thing: predicting how the associations between variables will change once other sets of variables are 'held constant'. This does not mean that the two types of control always predict the same type of observational independency in our data. Once we have a way of measuring how closely the predictions agree with the observations, we have a way of testing, and potentially falsifying, causal hypotheses even in cases where we cannot physically control the variables of interest.

[Figure 4.2 has two panels, each showing two causal graphs over tree canopy density, light intensity, anthocyanin concentration, and photosynthetic rate. A. Two different causal scenarios linking the same four variables. B. Experimental manipulation of the causal systems shown in A, in which a light bulb is added as a cause of light intensity.]

Figure 4.2: Alternative causal model
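The equivalence of the two kinds of control under the first hypothesis can be illustrated with simulated data. The sketch below assumes that the first hypothesis is the one in which light intensity is a common cause of anthocyanin concentration and photosynthetic rate; the linear form, the coefficients, and the noise levels are invented for the example and are not estimates from any data set.

```python
import numpy as np


def partial_corr(x, y, z):
    """Correlation between x and y after linearly regressing each on z."""
    design = np.column_stack([np.ones_like(z), z])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return np.corrcoef(res_x, res_y)[0, 1]


rng = np.random.default_rng(0)
n = 20_000

# Unmanipulated system under the assumed first hypothesis (all coefficients hypothetical).
canopy = rng.normal(size=n)
light = -0.8 * canopy + rng.normal(scale=0.6, size=n)
anthocyanin = 0.7 * light + rng.normal(scale=0.7, size=n)
photosynthesis = 0.6 * light + rng.normal(scale=0.8, size=n)

raw_r = np.corrcoef(anthocyanin, photosynthesis)[0, 1]
statistical_r = partial_corr(anthocyanin, photosynthesis, light)   # statistical control

# Manipulated system: a light bulb sets the same light level for every plant.
light_fixed = np.full(n, 1.0)
anthocyanin_m = 0.7 * light_fixed + rng.normal(scale=0.7, size=n)
photosynthesis_m = 0.6 * light_fixed + rng.normal(scale=0.8, size=n)
physical_r = np.corrcoef(anthocyanin_m, photosynthesis_m)[0, 1]    # physical control
```

In this simulated system the raw correlation is clearly positive, while both the partial correlation given light and the correlation under the fixed light level are close to zero. Under a different causal structure, for example one in which anthocyanin itself affects photosynthesis, the two controlled correlations would not vanish.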

4.3 Path analysis and d-separation

One important use of a path diagram is to 'decompose' an association between variables into different types of causal relationship. This is the goal of Wright's original method of path coefficients [147]. The types of effects are: direct causal effects, indirect causal effects, effects due to shared causal ancestors, and unknown causal relationships [119]. This decomposition of a statistical association into different types of causal relationship is based on the fundamental association linking causality with probability distributions. The association between two variables is simply the overall correlation or covariance between them. This overall association can be generated by a number of different causal relationships at the same time. The consequences of interventions or manipulations will depend on these different types of relationship; therefore, it is important to be able to distinguish and quantify them.

The importance of decomposing effects becomes clear when considering the standardized path coefficients. The overall effects (and thus the overall predicted correlation) between variables can be used as an indicator of the prediction.

The magnitude of a direct effect is measured by the path coefficient on the arrow going from the parent to the child. The units of this effect are the same as those used to measure the variables. If the variables are not standardized, then the path coefficient measures the number of unit changes in the child per unit change in the parent. If the variables are standardized, then the units are standard deviations from the mean.

Since path analysis looks similar to multiple regression, it is possible to look at a multiple regression as a path model [119]. A multiple regression equation uses a series of predictor variables to predict, or account for, the observed variation in the dependent variable. It is preferable, but not necessary, for the predictor variables to be independent. In the corresponding path diagram, the partial regression coefficients estimated by multiple regression are the direct effects of each predictor on the dependent variable. The indirect effects, which arise from the unresolved causal relationships between the predictors, are ignored. If the free covariances are zero, then the direct effects will also be the overall effects, but the regression equation will not tell us this.

Multiple regression helps us to predict but not to explain. It cannot help us decide whether the causal assumptions of the model are correct, nor can it tell us whether the predictor variables are causes of the output or response. Causal conclusions must come from elsewhere. The best way would be to conduct a controlled randomized experiment in which the values of the variables are randomly assigned to the experimental units, since that would give good reason to assume that the free covariances between them are zero. If this is not possible, then we have to construct our model, collect our observations, constrain the pattern of covariation based on our causal hypothesis, and then test the constraints.

4.3.1 d-separation test

Structural equation modeling (SEM), which is based on maximum likelihood techniques, is used here to test causal models that include variables that cannot be directly measured (latent variables) or variables that contain measurement error. However, Shipley introduces the d-sep test, which is derived from the notion of d-separation. The method can be used with small sample sizes, non-normally distributed data, or non-linear functional relationships. The disadvantage of the d-sep test is that it is not applicable to causal models that include latent variables.

The link between causal conditional independence, given by d-separation, and probabilistic independence provides a way to test a causal model: simply list all of the d-separation statements that are implied by the causal model and then test each of these statements using an appropriate test of conditional independence.

According to Sewall Wright [147], the method of path analysis was not a statistical test based upon standard formulae such as correlation or regression. Instead, the path coefficients were interpretative parameters for measuring direct and indirect causal effects based on a causal system that had already been determined. His method was a statistical translation of a biological system that obeys asymmetrical causal relationships.

However, using d-separation in path analysis, though based on structural equation modeling, still has some drawbacks. First, the inferential tests are asymptotic and require large sample sizes. Second, the functional relationships must be linear. Last, data that are not multivariate normal are difficult to treat. Additionally, d-separation is not applicable to causal models that include latent variables (variables that cannot be directly observed or measured).

The steps of using d-separation in path analysis are:

1. Construct a directed acyclic graph that involves all variables and excludes latent variables.

2. List all pairs of variables, their conditional relationships, and the d-separation statements.

3. Test each predicted independence using an inferential test.

4. Choose the test according to the nature of the variables: if the two variables involved in an independence statement are normally distributed and linearly related, then we can use the Pearson partial correlation coefficient to test their independence. For instance, if the coefficient is zero, then the variables are independent of each other.
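Step 4 can be sketched as a small function that computes a Pearson partial correlation and a p-value for the null hypothesis that it is zero. The p-value here comes from Fisher's z-transform, which is one common large-sample choice and is an assumption of this sketch rather than something the text prescribes. The function name, the column layout, and the simulated chain A -> B -> C are likewise hypothetical.

```python
import math
import numpy as np


def partial_corr_test(data, i, j, cond=()):
    """Pearson partial correlation of columns i and j given columns `cond`.

    Returns (r, p_value), using Fisher's z-transform for the two-sided p-value.
    """
    cols = [i, j, *cond]
    corr = np.corrcoef(data[:, cols], rowvar=False)
    precision = np.linalg.inv(corr)
    # Partial correlation from the inverse correlation matrix.
    r = -precision[0, 1] / math.sqrt(precision[0, 0] * precision[1, 1])
    n = data.shape[0]
    z = 0.5 * math.log((1.0 + r) / (1.0 - r)) * math.sqrt(n - len(cond) - 3)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return r, p_value


# Simulated chain A -> B -> C (hypothetical coefficients), which implies A _||_ C | B.
rng = np.random.default_rng(1)
n = 5000
a = rng.normal(size=n)
b = 0.8 * a + rng.normal(size=n)
c = 0.8 * b + rng.normal(size=n)
chain = np.column_stack([a, b, c])

r_marginal, p_marginal = partial_corr_test(chain, 0, 2)            # A and C, no conditioning
r_partial, p_partial = partial_corr_test(chain, 0, 2, cond=(1,))   # A and C given B
```

In the simulated chain the marginal correlation between A and C is clearly non-zero, while the partial correlation given B is close to zero, as the d-separation statement predicts. In practice the coefficient is never exactly zero, so the decision rests on the p-value rather than on the coefficient itself.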

4.3.2 Independence of d-separation statements

d-separation is used to predict a set of conditional probabilistic independencies that must be true if the causal model is true. The minimum set of d-separation statements that is sufficient to predict the entire set of d-separation statements is called a basis set. Figure 4.3 shows a directed acyclic graph (DAG) involving six variables, and Table 4.1 shows the basis set for this DAG.

Table 4.1: A basis set for the DAG along with the implied d-separation statements

Non-adjacent variables   Parent variables of either non-adjacent variable   d-separation statement
A, C                     B                                                  A ⊥ C | B
A, D                     B                                                  A ⊥ D | B
A, E                     C, D                                               A ⊥ E | C, D
A, F                     None                                               A ⊥ F
B, E                     A, C, D                                            B ⊥ E | A, C, D
B, F                     A                                                  B ⊥ F | A
C, D                     B                                                  C ⊥ D | B
C, F                     B                                                  C ⊥ F | B
D, F                     B                                                  D ⊥ F | B

[Figure 4.3 shows a directed graph over the six variables A, B, C, D, E, and F.]

Figure 4.3: A directed acyclic graph (DAG) involving six variables
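A basis set of the kind shown in Table 4.1 can be generated mechanically: for every pair of non-adjacent variables, condition on the union of their parents. The sketch below does this for a six-variable graph. The arrows of Figure 4.3 are not reproduced in the text, so the edge list used here (A -> B, B -> C, B -> D, C -> E, D -> E, E -> F) is an assumption, and the conditioning sets it produces depend on that assumption.

```python
from itertools import combinations

# Assumed edge list for a six-variable DAG (the parents of each variable).
PARENTS = {
    "A": set(),
    "B": {"A"},
    "C": {"B"},
    "D": {"B"},
    "E": {"C", "D"},
    "F": {"E"},
}


def basis_set(parents):
    """List (x, y, conditioning_set) for every non-adjacent pair of variables."""
    statements = []
    for x, y in combinations(sorted(parents), 2):
        adjacent = x in parents[y] or y in parents[x]
        if adjacent:
            continue
        # Condition on the parents of either variable.
        conditioning = (parents[x] | parents[y]) - {x, y}
        statements.append((x, y, sorted(conditioning)))
    return statements


for x, y, cond in basis_set(PARENTS):
    given = " | " + ",".join(cond) if cond else ""
    print(f"{x} _||_ {y}{given}")
```

With six variables and six edges there are nine non-adjacent pairs, so the basis set has nine statements. Changing the edge list changes both which pairs are non-adjacent and which parents appear in each conditioning set.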

4.3.3 Testing for probabilistic independence

The difference between the value of a random quantity $X_i$ and its expected value $\mu_X$ is $(X_i - \mu_X)$. This number can be either negative or positive. The squared deviation around the expected value is $(X_i - \mu_X)^2$. The expected value of this squared difference is the variance: $E[(X_i - \mu_X)^2] = E[(X_i - \mu_X)(X_i - \mu_X)]$. The covariance is simply a generalization of the variance. For two different random variables, X and Y, measured on the same observational units, the covariance between them is defined as $E[(X_i - \mu_X)(Y_i - \mu_Y)]$. If X and Y behave independently of each other, then large positive deviations of X from its mean ($\mu_X$) are equally likely to be paired with large or small, negative or positive, deviations of Y from its mean ($\mu_Y$). In the long run, the expected value of the product of these two deviations, $E[(X_i - \mu_X)(Y_i - \mu_Y)]$, will be zero. As a result, probabilistic independence of X and Y implies a zero population covariance. A Pearson correlation coefficient is simply a standardized covariance. A covariance has no upper or lower bound, and a variance has no upper bound; changing the units of measurement changes both the variance and the covariance.
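The effect of the units of measurement can be checked numerically. In the sketch below the two variables and their values are invented for illustration; the only point is that rescaling one variable rescales the covariance by the same factor while leaving the Pearson correlation unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
height_m = rng.normal(loc=1.7, scale=0.1, size=1000)            # hypothetical heights in metres
mass_kg = 40.0 * height_m + rng.normal(scale=5.0, size=1000)    # hypothetical masses in kilograms

height_cm = 100.0 * height_m                                    # same data, different units

cov_m = np.cov(height_m, mass_kg)[0, 1]
cov_cm = np.cov(height_cm, mass_kg)[0, 1]       # 100 times cov_m
r_m = np.corrcoef(height_m, mass_kg)[0, 1]
r_cm = np.corrcoef(height_cm, mass_kg)[0, 1]    # identical to r_m
```

This is why tests of independence are usually phrased in terms of correlations, which are bounded between -1 and 1, rather than raw covariances.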

4.4 Structural Equation Modeling (SEM)

4.4.1 Steps to perform SEM analysis

Model specification

The model must be specified correctly when SEM is used as a confirmatory technique, based on the type of analysis that we are attempting to confirm. Two types of variables, exogenous and endogenous, are used when building the model. The difference between them is whether or not the variable regresses on another variable. A variable that regresses on another variable is called an endogenous variable, even if other variables regress on it. In a directed graph, an endogenous variable is any variable receiving an arrow.

The modeler can specify two types of relationships (pathways) in a model: (i) free pathways, in which hypothesized causal relationships between variables are tested and are therefore left 'free' to vary, and (ii) relationships between variables that already have an estimated relationship, usually based on previous studies, which are 'fixed' in the model. We always specify a set of theoretically plausible models in order to assess whether the proposed model is the best of the available models. To identify the model, one must take into account the theoretical reasons for building the model as it is, the number of data points, and the number of parameters that the model must estimate.

Estimate of free parameters

Parameter estimation is performed by comparing the actual covariance matrices representing the relationships between variables with the estimated covariance matrices of the best-fitting model. The estimates are obtained through numerical maximization of a fit criterion, as provided by maximum likelihood estimation, weighted least squares, or asymptotically distribution-free methods. A specialized SEM analysis program is normally used; in our study we use TETRAD V. The importance of TETRAD to the foundations of structural equation modeling (SEM) is established in much multivariate behavioral research [119, 120][105].

Assessment of model and model fit

It is necessary to interpret the model after performing model estimation. Estimated paths may be tabulated and/or presented graphically as path models. The impact of variables is assessed using path tracing rules (see path analysis). It is important to examine the "fit" of an estimated model to determine how well it models the data. This is a basic task in SEM: it forms the basis for accepting or rejecting models and, more usually, for accepting one competing model over another. The output of SEM programs includes matrices of the estimated relationships between variables in the model. Assessment of fit essentially calculates how similar the predicted data are to matrices containing the relationships in the actual data [46]. As with statistical hypothesis tests in general, SEM model tests are based on the assumption that the correct and complete relevant data have been modeled. There are differing approaches to assessing fit. Different measures of fit capture different elements of the fit of the model; therefore, it is appropriate to report a selection of different fit measures.

Model modification

In order to improve the fit, a model may need to be modified, thereby estimating the most likely relationships between variables. TETRAD V provides modification indices, which may guide minor modifications. Modification indices report the change in χ² that results from freeing fixed parameters, usually by adding to the model a path that is currently set to zero. Modifications that improve model fit may be flagged as potential changes that can be made to the model. Modifications to a model, especially the structural model, are changes to the theory claimed to be true. Modifications must therefore make sense in terms of the theory being tested, or be acknowledged as limitations of that theory. Changes to the measurement model are effectively claims that the data are impure indicators of the latent variables specified by the theory.

Sample size and power

The complexity that drives information demands in structural model estimation increases with the number of potential combinations of latent variables, while the information supplied for estimation increases with the number of measured parameters times the number of observations in the sample; both relationships are non-linear. Sample size in SEM can be computed by two methods: the first as a function of the ratio of indicator variables to latent variables, and the second as a function of minimum effect, power, and significance level.

Interpretation

Using the best-fitting model, the results are then interpreted so that claims about the constructs can be made. Caution should always be taken when making claims of causality, even when experimentation or time-ordered studies have been done. The term causal model must be understood to mean "a model that conveys causal assumptions," not necessarily a model that produces validated causal conclusions. Good fit by a model consistent with one causal hypothesis invariably entails equally good fit by another model consistent with an opposing causal hypothesis.

4.4.2 Path analysis and maximum likelihood

The basic steps for doing the path analysis are:

1. Specify the hypothesized causal structure of the relationships between the variables.

2. Translate the causal model into an observational model. Write down the set of linear equations that follow this structure and specify which parameters are to be estimated from the data and which are fixed based on the causal hypothesis.

3. Derive the predicted variances and covariances using covariance algebra. Covariance algebra gives the rules of path analysis that Wright derived [147].

4. Estimate the free parameters using maximum likelihood or related methods, while respecting the values of the fixed parameters. This estimation is done by minimizing the difference between the observed covariances of the variables in the data and the covariances of the variables that are predicted by the causal model.

5. Calculate the probability of having observed the measured minimum difference between the observed and predicted covariances, assuming that the observed and predicted covariances are identical except for random sampling variation.

6. If the calculated probability that the remaining differences between the observed and predicted covariances are due only to sampling variation is sufficiently small (less than 0.05), we conclude that the observed data were not generated by the causal process specified by the hypothesis and that the proposed model should be rejected. If, on the contrary, the probability is sufficiently large (more than 0.05), we conclude that the data are consistent with such a causal process.
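The six steps can be traced on the smallest non-trivial case, a three-variable chain X1 -> X2 -> X3. The sketch below is an illustration under stated assumptions, not the procedure used by any particular SEM program. For this recursive model the maximum likelihood estimates have a closed form, so no numerical optimizer is needed; the test statistic is taken as (N - 1) times the maximum likelihood discrepancy, which is one common convention; and the simulated data and coefficients are hypothetical.

```python
import math
import numpy as np


def fit_chain(data):
    """Fit the chain X1 -> X2 -> X3 and test it against the observed covariances.

    Returns (chi_square, p_value) with one degree of freedom.
    """
    n = data.shape[0]
    S = np.cov(data, rowvar=False, bias=True)    # observed covariance matrix
    # For this recursive model the ML estimates reproduce every element of S
    # except cov(X1, X3), which the model predicts from the two path coefficients.
    sigma = S.copy()
    sigma[0, 2] = sigma[2, 0] = S[0, 1] * S[1, 2] / S[1, 1]
    # Maximum likelihood discrepancy between observed and predicted covariances.
    p = S.shape[0]
    f_ml = (math.log(np.linalg.det(sigma)) + np.trace(S @ np.linalg.inv(sigma))
            - math.log(np.linalg.det(S)) - p)
    chi_square = (n - 1) * f_ml
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(max(chi_square, 0.0) / 2.0))
    return chi_square, p_value


rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(size=n)
x3_chain = 0.7 * x2 + rng.normal(size=n)                # data that obey the chain
x3_direct = 0.7 * x2 + 0.5 * x1 + rng.normal(size=n)    # data with an extra X1 -> X3 effect

chi_ok, p_ok = fit_chain(np.column_stack([x1, x2, x3_chain]))
chi_bad, p_bad = fit_chain(np.column_stack([x1, x2, x3_direct]))
```

The model has six observed variances and covariances and five free parameters, which leaves one degree of freedom. For the data generated with an extra direct effect, the statistic is large and the probability falls far below 0.05, so the chain model is rejected, which corresponds to the first outcome in step 6.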

4.4.3 Nested models and multilevel model

Multigroup SEM is used to analyze different causal processes and to compare the causal relationships across different groups. Given two SEM models with the same set of variables, one model is nested within a second one if:

1. All fixed parameters in the first model are also fixed to the same value in the second, or

2. Some of the free parameters in the first are still fixed in the second

[Figure 4.4 shows two path diagrams, (A) and (B), each over the variables X1, X2, and X3 with error terms ε2 and ε3.]

Figure 4.4: Nesting model: Model B is nested within A

In Figure 4.4, Model A has two fixed parameters: the path coefficients for the edges between X1 and X2 and between X1 and X3 have been fixed to zero, so there are no edges between X1 and X2 or between X1 and X3. All other parameters in Model A are freely estimated. Model B has only one fixed parameter, the path coefficient for the edge between X1 and X3, which is still fixed to zero; all others, including the path from X1 to X2, are freely estimated. The fixed parameters of Model B are therefore a subset of those in Model A, and Model B is nested within Model A.

The strength of multigroup SEM is the ability to compare groups statistically and to determine which parts of the models in each group are the same and which parts are different. In this sense, multigroup SEM is analogous to ANOVA except that, rather than testing for differences in means between groups, it tests for differences in the covariance structure between groups. To do this, one must construct a series of nested multigroup models [120].

4.5 The causal expression of SEM

4.5.1 Assumptions and representations

Figure 4.5 outlines inference using SEM, which consists of linear equations and their nonparametric counterparts, encoded using diagrams. Consider the linear structural equations

\[
y = \beta x + u_Y, \qquad x = u_X \tag{4.1}
\]

where x is the level of a disease, y is the level of a symptom, and $u_Y$ stands for all factors that could possibly affect Y when X is held constant. The physical-process interpretation of this equation is that the values of all variables in the domain are examined and variable Y is assigned the value $y = \beta x + u_Y$. To explain the occurrence of the disease X, we write $x = u_X$, where $U_X$ stands for the factors affecting X, which may include factors in $U_Y$.

Variables $U_X$ and $U_Y$ are 'exogenous', which means they represent observed or unobserved background factors that remain unexplained, that is, factors that influence but are not influenced by the other (endogenous) variables in the model. In SEM, unobserved exogenous variables are called 'disturbances' or 'errors', and they are different from the residual terms in regression equations. The residuals ($\epsilon_X$, $\epsilon_Y$) in regression equations are artifacts of analysis that are uncorrelated with the regressors.

If correlation is presumed possible, we connect the two variables, $U_Y$ and $U_X$, by a dashed double arrow, as shown in Figure 4.5(b). Allowing for correlation among omitted factors in effect encodes latent variables that affect both X and Y, as shown in Figure 4.5(c). If we focus on causal relations among observed rather than latent variables, we do not distinguish between correlated errors and interrelated latent variables. When error terms are uncorrelated, we simply eliminate them from the diagram, with the understanding that every variable, X, is subject to the influence of an independent disturbance $U_X$.

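[Figure 4.5 shows the equations $x = u_X$ and $y = \beta x + u_Y$ together with three path diagrams, (a), (b), and (c), each with an arrow labeled β from X to Y and the disturbances $U_X$ and $U_Y$.]

Figure 4.5: A simple SEM, and its associated diagrams, showing (a) independent unobserved variables, (b) dependent exogenous variables, and (c) an equivalent in which latent variables are enclosed in ovals.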

In path diagrams, causal assumptions are encoded not in the links but in the missing links. An arrow merely indicates the possibility of a causal connection, the strength of which remains to be determined (from data); a missing arrow represents a claim of zero influence, and a missing double arrow represents a claim of zero covariance. Both assumptions are causal, not statistical, since neither can be determined from the joint density of the observed variables, X and Y, though both can be tested in experimental settings (randomized trials).
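The weight carried by a missing double arrow can be seen in a small simulation of equation (4.1). The value of β, the sample size, and the disturbance covariance below are hypothetical. When the disturbances are independent, the regression slope of y on x recovers β; when they are correlated, the same regression is biased, even though the structural equation has not changed.

```python
import numpy as np


def regression_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


rng = np.random.default_rng(0)
n = 20_000
beta = 0.5      # hypothetical structural coefficient

# (a) Independent disturbances: no double arrow between U_X and U_Y.
u_x = rng.normal(size=n)
u_y = rng.normal(size=n)
x = u_x
y = beta * x + u_y
slope_independent = regression_slope(x, y)

# (b) Correlated disturbances: a double arrow with cov(U_X, U_Y) = 0.5 (assumed).
cov_u = np.array([[1.0, 0.5], [0.5, 1.0]])
u = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov_u, size=n)
x_c = u[:, 0]
y_c = beta * x_c + u[:, 1]
slope_correlated = regression_slope(x_c, y_c)
```

In case (b) the slope estimates β plus the disturbance covariance (here about 1.0 rather than 0.5), and nothing in the joint distribution of X and Y alone reveals which case generated the data.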

One advantage of using latent variables is that it reduces the dimensionality of the data. A large number of observable variables can be aggregated in a model to represent an underlying concept, making it easier to understand the data. In this sense, they serve a function similar to that of scientific theories. At the same time, latent variables link observable ("sub-symbolic") data in the real world to symbolic data in the modeled world.

4.5.2 Causal assumptions in nonparametric models

SEM is extendable to models involving discrete variables, nonlinear dependencies, and heterogeneous effect modifications. The term 'effect' does not mean the coefficient in an equation, but rather the transmission of changes among variables [106]. For example, the nonparametric interpretation of the diagram in Figure 4.6(a) corresponds to a set of three unknown functions, each corresponding to one of the observed variables:

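[Figure 4.6 shows two path diagrams over Z, X, and Y with disturbances $U_Z$, $U_X$, and $U_Y$: (a) the original model and (b) the model in which X is set to $x_0$.]

Figure 4.6: The diagrams associated with (a) the structural model of equation (4.2) and (b) the modified model of equation (4.3), representing the intervention $do(X = x_0)$.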

\[
\begin{aligned}
z &= f_Z(u_Z) \\
x &= f_X(z, u_X) \\
y &= f_Y(x, u_Y)
\end{aligned}
\tag{4.2}
\]

where in this example $U_Z$, $U_X$, and $U_Y$ are assumed to be jointly independent but otherwise arbitrarily distributed. Each of these functions represents a causal process (or mechanism) that determines the value of the variable on the left (the output) from the values of the variables on the right (the inputs). The absence of variable Z from the arguments of $f_Y$ conveys the empirical claim that variations in Z will leave Y unchanged, as long as variables $U_Y$ and X remain constant.

4.5.3 Intervention and causal effects

We can claims causal effects even if we do not know their functional and distribution forms [106].This can be done by using a mathematical operator called do(X), which simulates physical interventions by deleting certain from the model, replacing them with a constant X = x, while keeping the rest of the model unchanged. For instance, to emulate an intervention do(x0) that holds X constant (atX = x0) in model M of

Figure 4.6(a), we replace the equation for x in equation (2) with x = x0, and obtain a new model, Mx0,

z = f_Z(u_Z)

x = x0    (4.3)

y = f_Y(x, u_Y)

the graphical description of which is shown in Figure 4.6(b).
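The replacement of one equation by a constant can be sketched in a few lines of Python. The functional forms and the normal disturbances below are assumptions made for illustration; only the act of overwriting the equation for x corresponds to the do-operator described above.

```python
import random

def simulate(n, seed=0, do_x=None):
    """Sample (z, x, y) from model M, or from M_x0 when do_x is given."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u_z, u_x, u_y = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        z = u_z                          # z = f_Z(u_Z), assumed form
        if do_x is None:
            x = 2.0 * z + u_x            # x = f_X(z, u_X), assumed form
        else:
            x = do_x                     # intervention: equation for x replaced by a constant
        y = x ** 2 + u_y                 # y = f_Y(x, u_Y), assumed form
        rows.append((z, x, y))
    return rows

observational = simulate(1000)             # draws from P(x, y, z)
interventional = simulate(1000, do_x=1.0)  # draws from P(z, y | do(x0)) with x0 = 1
```

With the same seed, the z values are identical in both samples, reflecting that the intervention on X leaves the mechanism for Z untouched.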

P(z, y | do(x0)) is the joint distribution associated with the modified model and describes the postintervention distribution of variables Y and Z (also called the "controlled" or experimental distribution), to be distinguished from the preintervention distribution, P(x, y, z), associated with the original model of equation (4.2). For example, if X represents a treatment variable, Y a response variable, and Z some covariate that affects the amount of treatment received, then the distribution P(z, y | do(x0)) gives the proportion of individuals that would attain response level Y = y and covariate level Z = z under the hypothetical situation in which treatment X = x0 is administered uniformly to the population. In general, we can formally define the postintervention distribution by the equation

P_M(y | do(x)) = P_{M_x}(y)    (4.4)

In the framework of model M, equation (4.4) defines the postintervention distribution of outcome Y as the probability that model M_x assigns to each outcome level Y = y. From this distribution we can assess treatment efficacy by comparing aspects of the distribution at different levels of x0 [106]. A central question in the analysis of causal effects, however, is that of identification in partially specified models: given an assumption set A (as embodied in the model), can the controlled (postintervention) distribution, P(Y = y | do(x)), be estimated from data governed by the preintervention distribution P(z, x, y)?

In linear parametric settings, the question of identification reduces to asking whether some model parameter, β, has a unique solution in terms of the parameters of P (say, the population covariance matrix). In the nonparametric formulation, the notion of "has a unique solution" does not directly apply, since quantities such as Q(M) = P(y | do(x)) have no parametric signature and are defined procedurally by simulating an intervention in a causal model M, as in equation (4.3).

4.6 Exploration, discovery and equivalents of causal graph

4.6.1 Exploring hypothesis space

Suppose that all of the data are generated by the same unknown causal process, that no latent variables are responsible for any observed association, and that the data accurately reflect the causal process. How many different causal graphs could exist under these conditions? It is impossible to explore every possible pattern of relationships among the variables [119]. As a result, we need search strategies that quickly find hypotheses containing the correct answer. These strategies must explore the hypothesis space efficiently, and they should rely on d-separation to translate from causal graphs to probability distributions.
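To give a sense of the size of the hypothesis space, the sketch below counts labelled directed acyclic graphs using Robinson's recurrence. Restricting the count to acyclic graphs is an assumption made here for illustration; the cited text may have a wider class of graphs in mind.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labelled directed acyclic graphs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
               for k in range(1, n + 1))

# Number of candidate DAGs for 1 to 10 variables.
sizes = {n: count_dags(n) for n in range(1, 11)}
```

The count grows faster than exponentially in the number of variables, which is why exhaustive enumeration is replaced by constraint-based or score-based search.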

4.6.2 The shadow cause revisited

When we measure a correlation in a sample of data, we use its value to infer what the correlation might be in the population from which the sample was randomly drawn. The causal process sometimes casts ambiguous correlational shadows, so we have to find a way of deducing causal processes from correlational shadows, and we have to take into account the inaccuracies caused by using sample correlations to infer population correlations [119]. To keep these two problems apart, we first use correlations as clues to causes as if there were no sampling variation; in other words, we consider asymptotic methods. The exploratory methods resemble search algorithms in that they exploit the mathematical relationships between causal graphs and probability distributions, using d-separation to translate between them. The purpose of d-separation is to convert causal claims into claims about probability distributions [119, 120].

4.6.3 Obtaining the undirected dependency graph

In acyclic graphs without latent variables, the undirected dependency graph is simply the directed acyclic graph (DAG) with all arrows replaced by lines lacking arrowheads (Figure 4.7). In the general case, the steps are:

1. If there is not already an arrow or curved double-headed arrow between any two observed variables, but d-separation of the pair requires conditioning on latent variables, then draw a line between the pair.

2. If there are curved double-headed arrows between any pairs of variables, then replace these with a line.

3. Remove the latent variables and also any arrows going into, or out of, these latent variables.

4. Change all remaining arrows to lines.

Figure 4.7: Path diagram vs Undirected graph
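The four steps for obtaining the undirected dependency graph can be sketched as follows. This is a simplified illustration, not the procedure as implemented in any particular software: step 1 is approximated under the assumption that each latent variable is adjacent only to observed variables, so that two observed neighbours of a latent stay linked unless both are parents of that latent. The node names in the example are hypothetical.

```python
from itertools import combinations

def undirected_dependency_graph(directed, bidirected, latents):
    """Return the undirected edges (as frozensets) among observed variables.

    directed   -- iterable of (cause, effect) arrows
    bidirected -- iterable of (a, b) curved double-headed arrows
    latents    -- names of latent variables
    """
    latents = set(latents)
    edges = set()
    # Step 1 (simplified): link observed neighbours of a latent unless both are its parents.
    for lat in latents:
        parents = {a for a, b in directed if b == lat and a not in latents}
        children = {b for a, b in directed if a == lat and b not in latents}
        for u, v in combinations(sorted(parents | children), 2):
            if u in children or v in children:
                edges.add(frozenset((u, v)))
    # Step 2: curved double-headed arrows become lines.
    for a, b in bidirected:
        if a not in latents and b not in latents:
            edges.add(frozenset((a, b)))
    # Steps 3 and 4: drop arrows touching latents; remaining arrows become lines.
    for a, b in directed:
        if a not in latents and b not in latents:
            edges.add(frozenset((a, b)))
    return edges

example = undirected_dependency_graph(
    directed=[("A", "B"), ("L", "B"), ("L", "C"), ("C", "D")],
    bidirected=[("A", "D")],
    latents={"L"},
)
```

In the example, B and C become adjacent because their only separating variable, L, is latent and is removed.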

To discover the undirected dependency graph from the data when we do not know what the true directed acyclic graph looks like, we use the following assumptions as clues:

1. Causal homogeneity: every unit of the population is governed by the same causal process.

2. Faithfulness: the probability distribution of the observed variables measured on each unit is faithful to some cyclic or acyclic causal graph.

For each possible association or partial association, we can then tell whether it exists by looking at its value: if the value differs from zero, the association exists; if not, it does not.
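As a small worked example of a partial association, the sketch below computes a first-order partial correlation from three pairwise correlations. The chain X -> Z -> Y and the coefficient values are assumptions chosen for illustration; in a real sample the values are never exactly zero, so a statistical test would be needed.

```python
from math import sqrt

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y given Z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Chain X -> Z -> Y with assumed standardized path coefficients a and b.
a, b = 0.4, 0.5
marginal = a * b                             # implied correlation between X and Y
conditional = partial_corr(marginal, a, b)   # association of X and Y given Z
```

Here the marginal association is nonzero while the partial association given Z vanishes, so no line between X and Y is drawn in the undirected dependency graph.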

4.6.4 Hypothesis setting

The structural equation model of brain cancer in Chapter 3 raises the question of whether pro-angiogenic factors (TGFα, VEGF) cause angiogenesis in brain cancer. How much of these factors is needed to cause brain cancer? If these factors are changed, how much do we expect angiogenesis to increase? How do we account for endothelial cells that are influenced by angiogenic factors yet manage to function without becoming malignant?

4.6.5 Causal inference for brain cancer

Causal inference is often elusive in the behavioral sciences. Causal effects seem to implicate primary causes. In brain cancer, effects usually have multiple causes, such as family factors and the biological environment. The effects of causes are not always constant: they can vary with developmental stage and with interventions.

4.7 Analysis of randomized experiment through SEM

Consider the equation Y = b0 + b1X + e, where X takes one of two values indicating whether a subject received the treatment (X = 1) or the control placebo (X = 0). b1 estimates D. Because the assignment is randomized, X is expected to be uncorrelated with the residual causes of Y. Randomization justifies the pseudo-isolation condition.

The randomized experiment also reminds us that between-subject comparisons can provide information on the average within-subject effects. We can contemplate what would have happened if a given subject had been assigned to a different group.
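The following sketch simulates a randomized two-group experiment and fits Y = b0 + b1X + e by ordinary least squares. The true effect size, the intercept, the noise distribution, and the sample size are assumptions made for illustration. With a binary X, the fitted slope b1 coincides with the difference between the treated and control group means.

```python
import random
from statistics import mean

def ols_slope(xs, ys):
    """Least-squares slope of y on x for a single predictor."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(1)
true_effect = 0.5                                   # assumed treatment effect
xs = [rng.randint(0, 1) for _ in range(2000)]       # randomized assignment
ys = [1.0 + true_effect * x + rng.gauss(0, 1) for x in xs]

b1 = ols_slope(xs, ys)
treated = [y for x, y in zip(xs, ys) if x == 1]
control = [y for x, y in zip(xs, ys) if x == 0]
mean_difference = mean(treated) - mean(control)
```

Because assignment is random, the residual causes of Y are uncorrelated with X in expectation, so b1 is an unbiased estimate of the assumed effect.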

We need to specify every causal factor that is correlated with X, the causal variable of interest.

Y = b0 + b1 W1 + b2 W2 + b3 X + e    (4.5)

For example, if Y is distress, and X is exposure to stress, W2 is some measure of previous distress, W3 is the social class and W4 is coping skills, then all four exogenous (predictor) variables might be considered as possible causes of distress. In this model, variables W2, W3, and W4 are introduced as additional causes of Y that are distinct from, but not necessarily uncorrelated with, X. If the list of covariates is complete, then the condition of pseudo-isolation will be justified. In practice, we never know if the list of competing covariates is complete; SEM analyses become credible as they withstand the alternative explanations advanced by their critics [119].

We will see that the systems of equations imply certain patterns of correlations (covariances) among the variables in the model. Estimates are obtained by fitting the sample covariance matrix rather than the individual observations.

4.8 Develop a randomized experiment to address causal issues

Procedure: we generate data using the software TETRAD V as follows. First, we construct a causal graph, assuming that the causal structure is known, as a directed graph whose edges can be interpreted as linear relationships. Second, we construct a parameterized model for the causal graph. Third, we construct the instantiated model of the causal graph and use it to generate a data set.

The data sets that we generate are continuous. We generate them for use with the search algorithms, which we try out on simulated data: data are simulated with the simulate-data template, and an appropriate search procedure is then run on them. The available search options depend on the type of simulated data. The search algorithms take in data and return information about a collection of alternative causal graphs that can explain features of the data.
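The three steps (graph, parameterized model, instantiated model) can be mimicked outside TETRAD with a short Python sketch. This does not use or reproduce the TETRAD interface; the three-node graph, the coefficient values, and the unit error standard deviations are hypothetical and serve only to show how a linear model with Gaussian errors generates continuous data.

```python
import random

# Step 1: a causal graph, listed in causal order as child -> list of parents.
graph = {"A": [], "B": ["A"], "C": ["A", "B"]}

# Step 2: a parameterized model: one coefficient per edge, one error s.d. per node.
coef = {("A", "B"): 0.8, ("A", "C"): -0.5, ("B", "C"): 0.3}
error_sd = {"A": 1.0, "B": 1.0, "C": 1.0}

# Step 3: instantiate the model and generate a data set.
def generate(n, seed=0):
    """Return n rows, each a dict of node -> simulated value."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        row = {}
        for node, parents in graph.items():
            signal = sum(coef[(p, node)] * row[p] for p in parents)
            row[node] = signal + rng.gauss(0, error_sd[node])
        data.append(row)
    return data

data = generate(500)
```

A search algorithm would then be given only `data`, without the graph or the coefficients, and asked to recover candidate causal structures.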

4.9 Advantages, Disadvantages, and Limitations of Shipley's approach for glioblastoma case study

The advantage of Shipley's approach is that, once we have set up a hypothesized causal structure, it maps every d-separation statement to a statistical independence in the observational data. In other words, the causal relationships between variables in the glioblastoma model determine the correlational relationships between them. Additionally, a causal model based on cause and correlation is self-explanatory, so we can interpret the cause of a phenomenon by reading the causal graph. In this case, we consider the number of anastomoses, a phenomenon that occurs during the progress of angiogenesis. Another good reason for using Shipley's method is that if variables in a causal model are overidentified, they appear in the causal structure without any relationship (zero correlation) in the model. The cause and correlation method is based on structural equation modeling (SEM), which assumes that all variables have linear relationships. As a result, all nonlinear relationships are converted to linear relationships. The disadvantage, however, is that if the causal model is constructed from wrong assumptions, the interpretation of the causal structure might not be useful. In that case, the causal structure must be reconsidered, relying on the prior knowledge of the modelers or on biological facts. The approach is used for inferring causes without randomized experiments, a use that has already been justified [119].

A limitation of Shipley's method lies in the level of complexity of the model. If the causal model, here the glioblastoma model, comprises too many levels of relationships (molecules, cells, tissues, organs), it will be difficult to determine causality because some variables may be generated by different causal stimuli. For instance, in the glioblastoma case, anastomosis (tissue level) was not stimulated by oxygen or glucose (molecular level). Instead, anastomosis was stimulated by adjacent variables (VEGF, fibronectin, vessels). This suggests that different groups of observations might be generated by partially different causal processes. If we ignore the different levels of structure, we will obtain incorrect probability estimates [119].

4.10 Summary

In this chapter, we have learned how to translate a causal model into a statistical or observational model using the d-separation method, which was created by Verma and Pearl [105]. Path analysis, or SEM, is used in the analysis to decompose the association between two variables into different types of causal relationships. We pointed out the pros and cons of using path analysis, and we learned how to follow the structural equation modeling procedure to interpret a model once it has been constructed.

A brief explanation of nested models and multilevel models was provided. We have learned how to explore and discover causal graphs, and how to state a causal hypothesis and test it accordingly. Finally, we developed a randomized experiment to address causal issues for the brain cancer model. The advantages, disadvantages, and limitations of Shipley's approach exhort us to be cautious in using this method.

Chapter 5

Results

In our study, we separate the analysis into two parts. Part 1 analyzes the glioblastoma model without treatment (TKI treatment). Part 2 analyzes the glioblastoma model with TKI treatment. The investigation uses path analysis to show the relationships between variables in the causal model. The hypothetical model in path analysis usually involves two types of variables: observable or manifest (endogenous or dependent) variables and latent (exogenous or non-observable) variables. Observable variables serve as indicators of the underlying constructs represented by the latent variables, and latent variables are usually theoretical constructs that cannot be observed directly.

There are two goals of path analysis: (1) understanding patterns of correlations among the regions; (2) explaining as much of the regional variation as possible with the model specified. The focus in path analysis is usually on a decision about the whole model: reject, modify, or accept it. This differs from statistical testing in other techniques, such as multiple regression and ANOVA [120].

The coefficient between two variables in the path model is not a correlation coefficient. Suppose we have a network with a path from region A to region B. A path coefficient theta of, for example, 0.81 means that if region A increases by one standard deviation from its mean, region B is expected to increase by 0.81 of its own standard deviations from its own mean, while all other relevant regional connections are held constant. With a path coefficient of -0.16, when region A increases by one standard deviation from its mean, region B is expected to decrease by 0.16 of its own standard deviations from its own mean, again holding all other relevant regional connections constant.
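The interpretation of a standardized path coefficient can be made concrete with a short calculation. The means and standard deviations below are hypothetical values chosen for illustration; only the coefficients 0.81 and -0.16 come from the text.

```python
def expected_b(a, mean_a, sd_a, mean_b, sd_b, theta):
    """Expected value of B when A = a, other inputs held constant,
    for a standardized path coefficient theta on the path A -> B."""
    z_a = (a - mean_a) / sd_a          # A expressed in standard deviations from its mean
    return mean_b + theta * z_a * sd_b

# Assumed scales: A has mean 10 and s.d. 2; B has mean 50 and s.d. 5.
# A = 12 is one standard deviation above its mean.
rise = expected_b(12.0, 10.0, 2.0, 50.0, 5.0, 0.81)    # B moves up 0.81 s.d.
fall = expected_b(12.0, 10.0, 2.0, 50.0, 5.0, -0.16)   # B moves down 0.16 s.d.
```

On standardized data (mean 0, s.d. 1) the same function reduces to theta times the value of A.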

5.1 Part 1: Glioblastoma without TKI treatment

Three structural models were fit to the data generated by the causal model. All variables are continuous and normally distributed (standardized data).

The data serve as an observational data set with 500 observations. We observe the behavior of Anastomosis by changing the level of the variables (increase, decrease, or remain constant) in the selected alternative model that shows signs of significance. Even though the significance level may seem unacceptable, owing to restrictions of the algorithms used to search for causal graphs, we can still draw inferences from the selected model.

Using path analysis, we can estimate Anastomosis from the coefficients of the paths connected to it. An increase of one standard deviation from the mean of Anastomosis is caused by an increase of the same number of standard deviations in all variables pointing to it. As mentioned earlier, path analysis is not a tool for model selection, but it provides evidence to reject, accept, or modify the model by showing whether the pattern of correlations among the variables can explain as much of the regional variation as possible with the specified model.

In the glioblastoma model, we include the chemoattractant effect (VEGF) and the haptotactic effect of Fibronectin because these effects act at the tissue level. The true causal model is used as a null model against which other models are compared, in order to make specific estimates of how the inputs affect Anastomosis.

The true causal model that we used to generate the data (without TKI treatment) is shown in Figure 5.1.

Figure 5.1: Causal Model: Without TKI treatment

Among the alternative models obtained from the search box in TETRAD V, the searches give different patterns of correlations. Some patterns include arrow directions that do not make sense biologically, so we need to modify the directions using conditional independence (d-separation). The alternative models that were fit to each trial are shown in Figure 5.2. Each model shows a different structure for the effects of phenotype selection, nutrient conditions (oxygen and glucose levels), growth factors, the chemoattractant (VEGF), and the haptotactic factor (Fibronectin) used to estimate Anastomosis in the glioblastoma model.

For alternative model 1, there are two possible causes of misfit. First, the linear relationship of Migration, Proliferation, VEGF, and Fibronectin to the variable Vessels is not well described. Second, the assumption of zero covariances between all variables is very strict and can therefore contribute to model misfit.

Alternative models 2 and 3 provide two alternative ways in which the cell phenotypes and their related variables influence the number of Vessels. In alternative model 2, Oxygen and Glucose have direct effects on Apoptosis, while TGFα and EGFR are mediated through the combined substance TGFα−EGFR via the EGFR pathway within the cell.

Figure 5.2: Alternative Models: Without TKI treatment

Consequently, TGFα−EGFR influences the cell to pick its phenotype, Proliferation or Migration. Quiescence is a common cause of Apoptosis and Migration. The alternative model suggests orienting the path from Migration to Proliferation, which is the opposite of the causal model. Fibronectin does not have an effect on any variable. Coincidentally, the significance of the estimation model for alternative model 2 is acceptable (less than 0.05).

In alternative model 3, the model suggests orienting the path from Quiescence to Glucose. This is absurd to a biologist, since it makes no sense for the quiescence of tumor cells to influence the level of Glucose. We can therefore reject this orientation even though TETRAD recommends it; we can simply inspect the pattern of the graph and make a decision. Apoptosis does not influence Quiescence, but the relationship between Proliferation and Migration is correct according to the causal graph. The p-value for the linear relationship of the whole model is not good (more than 0.05).

What we can infer from the alternative models rests on the interpretation of each model. If alternative model 1 were preferred, the theoretical conclusion would be that changes in the chemotactic factor (caused by VEGF) and the haptotactic factor (caused by Fibronectin) cause changes in the number of vessels (Vessels). Cell proliferation (Proliferation) not only has a direct effect on the number of vessels (Vessels), but is also mediated through cell Migration to affect the number of vessels. Glucose affects only Apoptosis; likewise, Oxygen influences only Quiescence. Anastomosis is caused by the number of vessels in the tumor.

In alternative model 2, however, Apoptosis is a common effect of Glucose and Oxygen. Quiescence is a common cause of Apoptosis and Migration. The number of quiescent cells does not directly affect cell proliferation; its effect is mediated through cell migration before proliferation occurs. If alternative model 2 were preferred, the theoretical conclusion would be that there is no possibility for a coefficient linking the variable Fibronectin to affect the estimation of the parameters of Vessels. Thus, in alternative model 2, the number of vessels is independent of Fibronectin.

In alternative model 3, the graph looks similar to alternative model 1 except for the orientation between Proliferation and Migration; the direction of the path from Quiescence to Glucose is nonsensical. Similarly, apoptotic cells cannot become quiescent cells as suggested by the model. If alternative model 3 were preferred, the theoretical conclusion could be that cells pick their proliferation phenotype and then affect the estimated number of vessels, or that cells pick their proliferation phenotype and the effect is then mediated through cell migration before affecting the estimated number of vessels. Each model was fit to each of the 500 individual trials. When the correlations between the phenotypes (Proliferation, Migration) are minimized, the bias in the coefficients is minimized. The mean coefficients and fit statistics over all related variables for the brain cancer model without TKI treatment are shown in Table 5.1 (search for alternative models when there is no TKI treatment).

Table 5.1: Mean coefficient and fit statistics of models without TKI

Path  Model 1  Model 2  Model 3
Vessels → Anastomosis  0.9716  -1.0181  0.7370
Apoptosis → Quiescence  1.3717
Fibronectin → Vessels  0.8291
Migration → Quiescence
Migration → TGFα−EGFR
Migration → Vessels  -0.9779  -1.2135
Oxygen → Quiescence  -0.1968
Proliferation → Migration  0.1571  0.7456
Proliferation → Vessels  1.1765  -0.6340  0.3480
Quiescence → Apoptosis  -0.2798  -0.1476
Quiescence → Glucose  -1.0622
Quiescence → Proliferation  -1.0671
Quiescence → Migration  0.3356  1.2760  -1.0610
TGFα−EGFR → Migration  -0.2930
TGFα−EGFR → Proliferation  1.1454  -0.3239
TGFα → TGFα−EGFR  -1.4218  1.2083  -1.4059
VEGF → Vessels  -0.6502  0.1272  0.8279
EGFR → TGFα−EGFR  -1.0808  -1.3332  -0.8905
Glucose → Apoptosis  0.0872  1.2995  0.4100
Migration → Proliferation  -0.7614
Oxygen → Apoptosis  0.3081  -0.9456
DOF  63  67  65
χ2  74.4433  87.8433  73.9522
p-value  0.1534  0.0447  0.2091

5.2 Part 2: Glioblastoma with TKI treatment

We explore hypotheses by running different search algorithms. We use the PC, FCI, and GES algorithms in TETRAD V to search for alternative models. If any search result shows an irregular direction, we condition on variables or use prior knowledge about the causal relationship.

In this section we find alternative causal graphs from the true causal model. The model is the same as in section 5.1 except that it includes the effect of TKI treatment. The true causal graph is shown in Figure 5.3.

Next we search for alternative causal models. The alternative models are shown in Figures 5.4 and 5.5.

Figure 5.3: Causal Model: With TKI treatment

Figure 5.4: Alternative Models 1,2: With TKI treatment

Figure 5.5: Alternative Models 3,4: With TKI treatment

In alternative model 1, everything looks the same as in the true causal graph except that some paths are eliminated. The path from PLCγ differs from the true causal graph in that it influences the combined substance TKI−EGFR. In biology, PLCγ has an important role in cell physiology, in particular in signal transduction pathways. PLCγ is activated by receptor tyrosine kinases (TK) in response to growth factors and hormones and plays an important role in regulating cell proliferation and differentiation. It has been observed in the literature that the level of PLCγ increases during tumor growth [132, 141]. If model 1 were preferred, the theoretical conclusion would be that PLCγ is a common cause of Proliferation and Migration, which is acceptable in biology, but it is unclear whether PLCγ is a cause of TKI−EGFR. TKI is a common cause of TKI−EGFR and Proliferation, which is also acceptable in biology. Proliferation can directly affect Fibronectin, and its effect can be mediated through Vessels.

Alternative model 2 looks the same as alternative model 1, except that there is no path between Proliferation and Fibronectin. The orientation between TGFα and TGFα−EGFR is opposite to that in the causal model. If model 2 were preferred, the theoretical conclusion would be that TGFα−EGFR is a common cause of TGFα, Proliferation, and Migration.

In alternative model 3, Quiescence is the only common effect of Glucose and Oxygen. If alternative model 3 were preferred, the theoretical conclusion would be that Vessels is influenced by both VEGF and Fibronectin, and by Proliferation.

In alternative model 4, VEGF plays no part in estimating Anastomosis. If this model were preferred, the theoretical conclusion would be that Proliferation has a direct effect on Vessels and is also a common effect of Quiescence, TGFα−EGFR, PLCγ, and TKI. VEGF does not influence Vessels.

The mean coefficients and fit statistics over all related variables for the brain cancer model with TKI treatment are shown in Table 5.2 (search for alternative models when there is TKI treatment).

As a caution, the chi-squared test above assumes that the maximum likelihood fitting function over the measured variables has been minimized. Under that assumption, the null hypothesis for the test is that the population covariance matrix over all of the measured variables is equal to the estimated covariance matrix over all of the measured variables, written as a function of the free model parameters, that is, the unfixed parameters for each directed edge (the linear coefficient for that edge), each exogenous variable (the variance of the error term for that variable), and each bi-directed edge (the covariance of the exogenous variables it connects). The model is explained in Structural Equations with Latent Variables [119, 120]. Degrees of freedom are calculated as m(m+1)/2 − d, where m is the number of measured variables and d is the number of linear coefficients, variance terms, and error covariance terms that are not fixed in the model.
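The degrees-of-freedom formula can be checked with a short calculation. The counts used in the example (13 measured variables, 13 error variances, and 15 path coefficients) are hypothetical values chosen for illustration and are not read from the fitted models.

```python
def degrees_of_freedom(m, d):
    """Degrees of freedom m(m+1)/2 - d for m measured variables
    and d free parameters (coefficients, variances, error covariances)."""
    return m * (m + 1) // 2 - d

# Hypothetical counts: 13 variables give 91 distinct variances and covariances;
# 13 error variances plus 15 path coefficients give d = 28 free parameters.
dof = degrees_of_freedom(13, 13 + 15)
```

The first term counts the distinct entries of the covariance matrix, so each additional free path in a model lowers the degrees of freedom by one.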

5.3 Estimation of path coefficient

The estimated path coefficients, which show the strength of the relationship between each pair of variables, are reported in the causal structure. We also need the implied correlation matrix generated by the program to justify the relationships between variables in the causal model. The path coefficients of the glioblastoma model with TKI treatment are shown in Figure 5.6.

Table 5.2: Mean coefficient and fit statistics of models with TKI

Path  Model 1  Model 2  Model 3  Model 4
Vessels → Anastomosis  1.0867  1.1834  0.1472  0.8317
Vessels → Fibronectin  -0.8536
Vessels → Migration  0.3147
Fibronectin → Vessels  -0.6456  1.4043  0.7621
Oxygen → Quiescence  1.2268  -0.7557
Oxygen → Apoptosis  0.9211  -0.5792  -1.0270
PLCγ → Migration  1.3082  1.3834  -0.8998  -1.3124
PLCγ → Proliferation  -1.0310  -0.2308  -0.3759  0.2173
PLCγ → TKI−EGFR  -0.2289  0.3758  1.0263
PLCγ → Vessels  -0.0456
Proliferation → Fibronectin  0.1194
Proliferation → Vessels  1.3753  1.4996  0.2069  -0.0712
Quiescence → Apoptosis  1.2104  -0.6547  -1.3757  -0.2787
Quiescence → Proliferation  -1.1177  1.2429  -1.3808  -0.2468
Quiescence → Migration  1.2722
TGFα−EGFR → Migration  1.1836  -0.2671  1.1464  -0.8754
TGFα−EGFR → Proliferation  -0.3358  0.0037  -0.3957  0.4164
TGFα−EGFR → TGFα  1.5720  -1.1423
TGFα−EGFR → Vessels  0.8915
TKI → Proliferation  -0.9404  1.3356
TKI → TKI−EGFR  0.2006  1.1237  -1.2621  1.4931
TKI−EGFR → PLCγ  0.3725
TKI−EGFR → Proliferation  0.1096
TGFα → TGFα−EGFR  -0.5068  -0.9698
VEGF → Vessels  1.2717
EGFR → TGFα−EGFR  0.4204
EGFR → TKI−EGFR  -0.4806  -0.1957  0.2379  1.4780
Glucose → Quiescence  0.5488  1.0076  -1.1217  0.3628
DOF  102  103  103  101
χ2  108.9856  107.9395  107.6395  112.5644
p-value  0.2999  0.3501  0.3576  0.2030

Figure 5.6: Estimation of path coefficient in causal model: With TKI treatment

From the estimated path coefficients of the causal model, we can obtain a regression equation for each variable and use it to predict outputs. The equations obtained from the causal structure are as follows:

Apoptosis = 0.14 × Glucose + 0.38 × Oxygen + 0.14 × Quiescence − 0.13

Quiescence = −1.20 × Glucose − 0.15 × Oxygen − 0.087

(TGFα−EGFR) = −0.72 × TGFα − 0.077 × EGFR + 0.009

(TKI−EGFR) = −0.21 × EGFR − 0.44 × TKI − 0.035

PLCγ = 0.36 × (TKI−EGFR) − 0.038

Proliferation = 0.63 × Quiescence − 1.26 × (TGFα−EGFR) + 0.82 × PLCγ − 0.07

Migration = 0.21 × Quiescence − 1.25 × (TGFα−EGFR) + 1.39 × PLCγ − 0.05

Vessels = 0.47 × Fibronectin + 1.11 × Proliferation − 0.3 × Migration − 0.16 × VEGF − 0.07

Anastomosis = −0.63 × Vessels + 0.05    (5.1)

The correlation matrix obtained from TETRAD (Figure 5.7) shows how the variables are influenced by their parent nodes. If a correlation is very small or close to zero, we can say that the two variables are independent; otherwise they are related (they covary).
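The regression equations (5.1) can be evaluated in causal order to predict the outputs for given inputs. The sketch below simply transcribes the equations; the function name, the argument names, and the choice of zero as the default for every input (the mean of standardized data) are assumptions made for illustration.

```python
def predict(glucose=0.0, oxygen=0.0, tgfa=0.0, egfr=0.0, tki=0.0,
            vegf=0.0, fibronectin=0.0):
    """Evaluate equations (5.1) in causal order for standardized inputs."""
    quiescence = -1.20 * glucose - 0.15 * oxygen - 0.087
    apoptosis = 0.14 * glucose + 0.38 * oxygen + 0.14 * quiescence - 0.13
    tgfa_egfr = -0.72 * tgfa - 0.077 * egfr + 0.009
    tki_egfr = -0.21 * egfr - 0.44 * tki - 0.035
    plcg = 0.36 * tki_egfr - 0.038
    proliferation = 0.63 * quiescence - 1.26 * tgfa_egfr + 0.82 * plcg - 0.07
    migration = 0.21 * quiescence - 1.25 * tgfa_egfr + 1.39 * plcg - 0.05
    vessels = (0.47 * fibronectin + 1.11 * proliferation
               - 0.3 * migration - 0.16 * vegf - 0.07)
    anastomosis = -0.63 * vessels + 0.05
    return {"Quiescence": quiescence, "Apoptosis": apoptosis,
            "TGFa-EGFR": tgfa_egfr, "TKI-EGFR": tki_egfr, "PLCg": plcg,
            "Proliferation": proliferation, "Migration": migration,
            "Vessels": vessels, "Anastomosis": anastomosis}

baseline = predict()
more_fibronectin = predict(fibronectin=1.0)   # Fibronectin raised by one s.d.
```

Comparing the two calls shows how a change in one input propagates: Vessels shifts by the Fibronectin coefficient, and Anastomosis by that shift times the Vessels coefficient.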

Figure 5.7: Implied correlation matrix of causal model: With TKI treatment

The alternative causal model 4 that we have selected shows a different scenario as shown in Figure 5.8. It provides alternative choices based on prior knowledge and algorithms that are embedded in TETRAD.

Figure 5.8: Estimation of path coefficient in alternative model: With TKI treatment

From Figure 5.8, we can see that the direction of the path involving PLCγ differs from the true causal model. This can be explained as feedback control of TKI−EGFR, which is influenced by PLCγ. We tend to see a high level of PLCγ concentration during tumor progression [141]. This suggests that the causal structure represents causes in a way that agrees with our knowledge of biology. The model not only tells modelers about cause and effect, but also shows the feedback control of variables within the causal structure. The regression equations from the alternative causal model are shown below.

Apoptosis = −1.10 × Oxygen − 0.23 × Quiescence + 0.08

Quiescence = −0.08 × Glucose − 0.057

(TKI−EGFR) = 1.50 × EGFR + 1.41 × TKI + 0.94 × PLCγ + 0.26

Proliferation = −0.33 × Quiescence + 0.39 × (TGFα−EGFR) + 0.13 × (TKI−EGFR) + 0.20 × PLCγ + 0.07

Migration = −0.75 × (TGFα−EGFR) − 1.23 × PLCγ + 0.27 × Vessels + 0.04

Vessels = 0.80 × Fibronectin − 0.03 × Proliferation + 0.80 × (TGFα−EGFR) − 0.09 × PLCγ + 0.004

Anastomosis = 0.83 × Vessels − 0.04    (5.2)
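In a linear path model without cycles, the total effect of one variable on another is the sum, over all directed paths between them, of the products of the path coefficients. The sketch below applies this rule to the coefficients of equation (5.2); the dictionary layout, the ASCII variable names, and the function name are assumptions made for illustration.

```python
# Path coefficients read from equation (5.2): (cause, effect) -> coefficient.
paths = {
    ("Oxygen", "Apoptosis"): -1.10, ("Quiescence", "Apoptosis"): -0.23,
    ("Glucose", "Quiescence"): -0.08,
    ("EGFR", "TKI-EGFR"): 1.50, ("TKI", "TKI-EGFR"): 1.41, ("PLCg", "TKI-EGFR"): 0.94,
    ("Quiescence", "Proliferation"): -0.33, ("TGFa-EGFR", "Proliferation"): 0.39,
    ("TKI-EGFR", "Proliferation"): 0.13, ("PLCg", "Proliferation"): 0.20,
    ("TGFa-EGFR", "Migration"): -0.75, ("PLCg", "Migration"): -1.23,
    ("Vessels", "Migration"): 0.27,
    ("Fibronectin", "Vessels"): 0.80, ("Proliferation", "Vessels"): -0.03,
    ("TGFa-EGFR", "Vessels"): 0.80, ("PLCg", "Vessels"): -0.09,
    ("Vessels", "Anastomosis"): 0.83,
}

def total_effect(source, target):
    """Sum over directed paths of the product of coefficients (acyclic graphs only)."""
    if source == target:
        return 1.0
    return sum(c * total_effect(child, target)
               for (parent, child), c in paths.items() if parent == source)

fibronectin_on_anastomosis = total_effect("Fibronectin", "Anastomosis")  # 0.80 * 0.83
tgfa_egfr_on_vessels = total_effect("TGFa-EGFR", "Vessels")  # 0.80 + 0.39 * (-0.03)
```

The second example combines a direct path with an indirect path through Proliferation, which is the decomposition of effects that path analysis is designed to expose.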

The implied correlation matrix of the alternative model is shown in Figure 5.9.

Figure 5.9: Implied correlation matrix of alternative model: With TKI treatment

In this chapter, structural equation modeling (or path analysis) provided a method for estimating the parameters of systems of linear equations, which can be used to explain angiogenesis in a glioblastoma model. We performed the analysis of the causal model using path analysis. The estimation models are based on the information obtained from the true causal model. We have shown how to pick the estimated model to predict the output for the models both with and without TKI treatment. In the path model analysis, we treat the model as linear, so that the coefficients of direct paths can be used as coefficients of predictors in a linear model. Each model is statistically significant. The causal model cannot tell us whether the model is right or wrong, but it tells us about cause and effect if we use the selected model. If prior knowledge is provided, the causal model will be more meaningful.

Chapter 6

Conclusion

6.1 Interpretation of the result

The results show that cause and correlation analysis can provide more useful explanations than statistical analysis alone. Statistical analysis gives us the significance level of a prediction but cannot tell us whether the predictors are causes of the output (response). Path analysis (SEM) is a promising approach because it can indicate the reasons why something happens, which shows how useful cause and correlation analysis is.

In the path analysis of the selected model (model 2) in section 5.1, the coefficient between Vessels and Anastomosis is negative. This could be interpreted as a one-unit decrease in Vessels causing a one-unit decrease in Anastomosis. The unit itself is not important in this case because we use standardized data to create the estimated model. According to Shipley, path analysis can be treated as a multiple regression model, and the coefficients in path analysis are analogous to the coefficients in the regression model. We can tell from the estimated model that a decrease in the number of vessels causes Anastomosis to decrease as well. All other paths can be explained in a similar way. When Proliferation decreases, it causes the number of Vessels to decrease. When Quiescence is reduced, we tend to see a reduction in Apoptosis, but the relationship is weak, which means that Apoptosis is not very sensitive to changes in Quiescence. Quiescence seems to have a large impact on Migration; this is plausible because other influences on a quiescent cell may make it active again, so that it can pick another phenotype. TGFα has a direct impact on the TGFα−EGFR signaling pathway: when the level of TGFα is high, TGFα−EGFR signaling tends to increase (path coefficient = 1.2083). The coefficient between EGFR and TGFα−EGFR is negative (coefficient = −1.3332), which shows that when EGFR decreases by one unit, TGFα−EGFR tends to decrease by one unit. However, the relationship between Glucose and Apoptosis needs more biological evidence to explain why an increase in Glucose would cause Apoptosis to increase as well. The relationships between Migration and Proliferation and between Oxygen and Apoptosis seem biologically reasonable.

For the brain cancer model with TKI treatment in section 5.2, we have four alternative models. None of the estimated models is significant, but model 4 seems to be the best of the four. In alternative model 4, the path estimate between Vessels and Anastomosis is very strong, so Vessels is important for estimating Anastomosis. TKI is clearly a cause of TKI−EGFR (coefficient = 1.4931): when more TKI is present, more of it can bind to EGFR on the cell surface and act through the EGFR pathways. Once TKI is bound to EGFR, it affects the number of proliferating cells (path coefficient = 0.1096). This indicates that TKI is an indirect cause of Proliferation.
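As a worked illustration of the indirect effect of TKI, the standard path-tracing rule takes the effect transmitted along a chain of direct paths to be the product of the path coefficients on that chain. The calculation below is a sketch under two assumptions: that the two coefficients quoted for alternative model 4 lie on the same chain TKI → TKI−EGFR → Proliferation, and that the variables are standardized. The symbols p are introduced here only for illustration.

\[
\text{indirect effect of TKI on Proliferation}
= p_{\text{TKI} \to \text{TKI-EGFR}} \times p_{\text{TKI-EGFR} \to \text{Proliferation}}
= 1.4931 \times 0.1096 \approx 0.164
\]

Under these assumptions, a one-unit (standardized) increase in TKI would be associated with an increase of about 0.16 units in Proliferation through this path alone.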

6.2 Suggestion for future research

We recommend developing alternative methods for explaining phenotype selection in glioblastoma, so that they can be compared with our results. In reality, we do not know exactly how an individual cell picks its phenotype because several causes are involved, for example extracellular influences on the cell, intracellular influences, and the cellular environment. These influences play a large role in cell phenotype selection. One possible approach is to use fuzzy sets [149] to describe how cells decide which phenotype to select. This could be helpful for future research.

6.3 Contribution

We have proposed a methodology based on Shipley’s approach for better understanding cancers and exploring the feasibility of possible therapies. Shipley’s “cause and correlation” method is widely accepted by biologists and modelers because it helps them arrive at a good causal model, which in turn can improve treatment options. In this study, we applied Shipley’s approach to a glioblastoma case study to examine the feasibility of TKI (tyrosine kinase inhibitor) treatment. Shipley’s approach can be applied to other problems that require the inference of causal relationships from correlational data. The translation between causal relationships, which are encoded in directed graphs, and correlational relationships, which are encoded in probability distributions, can be developed using Shipley’s method.

6.4 Summary

In this chapter, we have learned that cause and correlation analysis can help explain the brain cancer (glioblastoma) model. It can help to identify the real causes of Anastomosis, which is a phenomenon of angiogenesis in brain cancer. Path analysis, a special case of SEM, gives the coefficients of the linear relationships between predictors and output. Structural equation modeling is another promising approach for studying biological phenomena in brain cancer.

Appendix A

The first Appendix

A.1 SEM and causality

In SEM, the structural interpretation of the equation y = a + bx + ε is not the conditional distribution of y given x. It conveys causal information that is orthogonal to the statistical properties of x and y. SEM provides the formal mathematical basis from which the potential-outcome notation draws its legitimacy [106]. According to Pearl, the interpretation of SEM methodology that emerges from the nonparametric perspective makes these specifications explicit. Under this interpretation, SEM is an inference method that takes three inputs and produces three outputs. The three inputs are:

1. A set A of qualitative causal assumptions, which the investigator is prepared to defend on scientific grounds, and a model M_A that encodes these assumptions. (Typically, M_A takes the form of a path diagram or a set of structural equations with free parameters. A typical assumption is that certain omitted factors, represented by error terms, are uncorrelated with some variables or among themselves, or that no direct effect exists between a pair of variables.)

2. A set Q of queries concerning causal and counterfactual relationships among variables of interest. Traditionally, Q concerned the magnitudes of structural coefficients but, in general models, Q will address causal relations more directly. Theoretically, each query Q_i ∈ Q should be computable from a fully specified model M in which all functional relationships are given. Noncomputable queries are inadmissible.

3. A set D of experimental or nonexperimental data, governed by a joint probability distribution presumably generated by a process consistent with A.

The three outputs are:

1. A set A∗ of statements that are the logical implications of A, separate from the data at hand; for instance, that X has no effect on Y if we hold Z constant, or that Z is an instrument relative to {X, Y}.

2. A set C of data-based claims concerning the magnitudes or likelihoods of the target queries in Q, each conditional on A. C may contain, for example, the estimated mean and variance of a given structural parameter, or the expected effect of a given intervention. Auxiliary to C, SEM also generates an estimand Q_i(P) for each query in Q, or a determination that Q_i is not identifiable from P.

3. A list T of testable statistical implications of A, and the degree g(T_i), T_i ∈ T, to which the data agree with each of those implications. A typical implication would be the vanishing of a specific partial correlation; such constraints can be read from the model M_A and confirmed or disconfirmed quantitatively by the data.
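The vanishing partial correlation mentioned in the third output can be illustrated with a small sketch. It assumes a hypothetical standardized chain X → Z → Y with path coefficients a and b, which are illustrative values and not estimates from this study. The model then implies r_XY = a·b, so the first-order partial correlation of X and Y given Z is zero; an observed correlation that departs from a·b gives a nonzero value, which is the kind of disagreement that g(T_i) would measure.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y given Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical standardized chain X -> Z -> Y (illustrative coefficients).
a, b = 0.6, 0.5
r_xz, r_yz = a, b
r_xy_implied = a * b  # X influences Y only through Z

# Implied by the model: the partial correlation vanishes.
implied = partial_corr(r_xy_implied, r_xz, r_yz)

# A made-up observed correlation that disagrees with the model.
observed = partial_corr(0.45, r_xz, r_yz)
```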

The structure of this inferential exercise is shown in Figure A.1.

Figure A.1: SEM methodology depicted as an inference engine converting assumptions (A), queries (Q), and data (D) into logical implications (A∗), conditional claims (C), and data-fitness indices (g(T))

SEM is not a traditional statistical methodology, typified by hypothesis testing or estimation, because its claims and assumptions are not expressed in terms of probability functions. All claims produced by SEM are conditional on the validity of A and should be expressed in conditional format: “If A then C_i” for any claim C_i ∈ C. It is important to emphasize that in SEM one must start with a model in which all causal relations are presumed known. Passing a goodness-of-fit test is not a prerequisite for the validity of the conditional claim “If A then C_i”, nor for the validity of C_i. The assertion “If A then C_i” is informative in a decision-making context because each C_i conveys quantitative information extracted from the data rather than qualitative assumptions. Additionally, SEM methodology cannot protect C from the inevitability of contradictory equivalent models.

A.2 Search algorithms using TETRAD

A.2.1 PC algorithm

The PC algorithm, created by Spirtes and Glymour [125], is a pattern search. It assumes that the underlying causal structure of the input data is acyclic and that no two variables are caused by the same latent (unmeasured) variable. In addition, it assumes that the input data set is either entirely continuous or entirely discrete; if the data set is continuous, it is assumed that the causal relation between any two variables is linear and that each variable is normally distributed. Finally, the sample should ideally be i.i.d. Simulations show that PC and several of the other algorithms described here often succeed when these assumptions, which are needed to prove their correctness, do not strictly hold.

The PC algorithm will sometimes output double-headed edges. In the large-sample limit, double-headed edges in the output indicate that the adjacent variables have an unrecorded common cause, but PC tends to produce false positive double-headed edges on small samples.

The PC algorithm is correct whenever decision procedures for independence and conditional independence are available. The procedure conducts a sequence of independence and conditional independence tests and efficiently builds a pattern from the results of those tests. As implemented in TETRAD, PC is intended for multinomial and approximately Normal distributions with i.i.d. data. For continuous variables, PC uses tests of zero correlation or zero partial correlation for independence or conditional independence, respectively. For discrete or categorical variables, PC uses either a chi-square or a G-square test of independence or conditional independence. In either case, the tests require an alpha value, adjustable by the user, for rejecting the null hypothesis, which is always a hypothesis of independence or conditional independence. The procedures (PC, CPC, JPC, JCPC, FCI, and other testing searches) make no adjustment for multiple testing.

Consider a discrete data set with the following underlying causal structure:

Figure A.2: Causal Structure using PC algorithm

The output is a pattern in this case, though this is not always so. The PC algorithm sometimes outputs graphs with cycles and bidirected edges; under ideal conditions, a bidirected edge between two variables indicates a latent common cause. The algorithm is deterministic, so given the same input and parameters, the output pattern will always be the same. You can, however, change the parameters and thereby possibly change the output.

The choice of appropriate alpha values requires some experience and depends on the sample size and the number of variables. For small models, such as the one illustrated, and small samples (< 500), the default alpha value of 0.05 is commonly used. Alpha values should be decreased as the sample size increases. In general, smaller alpha values give sparser graphs. If, for example, the user is concerned to avoid false positive edges, the alpha value should be lowered. Again, because of multiple hypothesis testing in the search, the alpha value should be lowered to reduce false positives (we do NOT, however, recommend using the Bonferroni adjustment). For example, in simulations with 5,000 variables and sample sizes of 250, an alpha value of 10^-7 on data from a sparse model (5,000 edges) finds the preponderance of edges with a 7 to 8 percent false positive rate. It is wise to test PC, or any other search, on the best simulations you can produce that are similar to your actual data in distribution family, sample size, and number of variables. See the section on Simulation.

The two parameters on the left side of the window, Alpha and Depth, determine the way in which Tetrad performs independence tests between variables. The alpha value is a threshold for independence; the higher it is set, the less discerning Tetrad is when determining the independence of two variables. The depth value specifies the maximum size of the subsets of variables that Tetrad can condition on when testing for independence. If you check the Aggressively Prevent Cycles box and then click Execute, the search will rerun and the output graph will be acyclic. If you click the Calc Stats button, the algorithm will run again, and two tabs will appear next to the Pattern tab: DAG in Pattern and DAG Model Statistics. The DAG in Pattern tab provides the same function as the option of that name in the graph manipulation box. The DAG Model Statistics tab provides the p-value, degrees of freedom, chi-square, and BIC score of the search for the chosen DAG.

Under the Independence tab, you can specify whether Tetrad uses the chi-square or the G-square test for independence of variables (for discrete data), or the Fisher Z test, the Cramer T test, or linear regression (for continuous data). In general, the Graph tab in the search box functions in the same way as it does in the graph box. There are, however, a few additional functions in the search box. In particular, the Meek Orientation option orients the edges of the graph according to the Meek rules [89]. The Global Score-based Reorientation option runs the GES edge orientation algorithm on only the edges present in the graph.
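To show how the alpha value interacts with sample size in a test of zero (partial) correlation, the sketch below implements the textbook Fisher Z test. It is not Tetrad’s implementation; the function names are hypothetical, and the statistic √(n − k − 3)·|z| with k conditioning variables is the standard large-sample form, assumed here for illustration.

```python
import math


def fisher_z_pvalue(r, n, k=0):
    """Two-sided p-value for the null hypothesis of zero (partial) correlation.

    r: sample correlation or partial correlation, strictly between -1 and 1
    n: sample size
    k: number of conditioning variables (0 for a plain correlation)
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    dof = n - k - 3
    if dof <= 0:
        raise ValueError("sample too small for this conditioning set")
    z = 0.5 * math.log((1 + r) / (1 - r))  # Fisher transformation
    stat = math.sqrt(dof) * abs(z)         # approximately standard normal under H0
    # 2 * (1 - Phi(stat)) written with the complementary error function
    return math.erfc(stat / math.sqrt(2))


def judged_independent(r, n, k=0, alpha=0.05):
    """Independence is retained unless the p-value falls below alpha."""
    return fisher_z_pvalue(r, n, k) > alpha


# The same weak correlation at two sample sizes and two alpha values.
small_sample = judged_independent(0.1, 250, alpha=0.05)
large_sample = judged_independent(0.1, 1000, alpha=0.05)
large_sample_strict = judged_independent(0.1, 1000, alpha=1e-7)
```

With a fixed alpha, a weak correlation that is treated as independence in a small sample is treated as dependence (an edge) in a larger one, which is why a smaller alpha value gives sparser graphs as the sample size grows.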

A.2.2 FCI algorithm

Like the PC search, the FCI search takes as input a data set whose underlying causal structure is assumed to be represented by a DAG. However, the causal structure of the data set input into the FCI algorithm may include unknown latent variables (variables not present in the data set) or sample selection bias. It is assumed that no relationship between variables is deterministic. For continuous data, the input is assumed to be generated by a linear SEM with Normal error terms. As in PC output, a bidirected edge between two variables indicates the presence of a latent common cause of the two. The FCI output, however, also has an undetermined edge mark: a small circle at the end of an edge indicates that Tetrad cannot determine whether or not that end of the edge should have an arrowhead.

The functionality of the FCI window is much the same as that of the PC window, with the exception of the three check boxes underneath the alpha and depth parameters on the left. If the Use complete rule set box is left unchecked, the original FCI algorithm is run; if it is checked, the algorithm runs using the complete (and more extensive) rule set for orienting edges. The Do possible DSEP search option is checked by default. This operation can take a lot of time, so you may wish to uncheck this option for further executions of the algorithm. The Max reachable path length number indicates how long the path from a collider to a conditioning variable may be in d-separation; when it is −1, the path length is unbounded.

A.2.3 SEM parametric models

The parametric model of a structural equation model (SEM PM) takes as input a SEM graph. SEM PMs represent causal structures in which all variables are continuous. Consider, for example, a variable that represents the length of time in seconds it takes for a ball thrown from point A to reach point B. It might take three seconds, or ten, or 5,320,192. It might even take a fractional number of seconds. There are infinitely many possible lengths of time. A SEM PM contains two components: the graphical causal structure of the model, and a list of parameters used in a set of linear equations representing the causal structure of the model. Each variable in a SEM PM is a linear function of a subset of the other variables and of an error term drawn from a Normal distribution.
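The last statement, that each variable is a linear function of its parents plus a Normal error term, can be sketched as a small data generator. This is a minimal illustration and not Tetrad code; the function name, the three-variable chain A → B → C, and its coefficients are hypothetical.

```python
import random


def simulate_linear_sem(order, parents, coef, error_sd, n, seed=0):
    """Draw n samples from a linear SEM with Normal error terms.

    order:    variable names in causal (topological) order
    parents:  dict mapping a variable to the list of its direct causes
    coef:     dict mapping (parent, child) to the linear coefficient
    error_sd: dict mapping a variable to the standard deviation of its error
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {}
        for v in order:
            value = rng.gauss(0.0, error_sd[v])  # error term
            for p in parents.get(v, []):
                value += coef[(p, v)] * row[p]   # linear contribution of each parent
            row[v] = value
        rows.append(row)
    return rows


# Hypothetical chain A -> B -> C with illustrative coefficients.
order = ["A", "B", "C"]
parents = {"B": ["A"], "C": ["B"]}
coef = {("A", "B"): 0.8, ("B", "C"): -0.5}
data = simulate_linear_sem(order, parents, coef, {"A": 1.0, "B": 1.0, "C": 1.0}, n=200)
```

Processing the variables in causal order guarantees that every parent already has a value when its child is computed, which is possible only because the graph is acyclic.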

A.2.4 SEM instantiated model

A SEM instantiated model (SEM IM) is a SEM parametric model in which the parameters and error terms have defined values. Using this box, you can specify the ranges of values from which the coefficients, covariances, and variances are drawn for the parameters in the model. For example, with a coefficient range of 0.5 to 1.5 that is symmetric about zero, all linear coefficients will be between −1.5 and −0.5 or between 0.5 and 1.5. If you uncheck symmetric about zero, they will only be between 0.5 and 1.5.

You can then manually edit the values of parameters in one of two ways. Double-clicking a parameter in the graph opens a small text box in which you can overwrite its value. Alternatively, you can click the Tabular Editor tab, which shows all of the parameters in an editable table. In an estimator box, the Tabular Editor tab provides statistics showing how robust the estimation of each parameter is. This is the function of the SE, T, and P columns. Our SEM IM, however, is in an instantiated model box, so these columns are empty.

The Implied Matrices tab shows matrices of different kinds of relationships between variables in the model. In this tab, you can view the covariance or correlation matrix for all variables (including latents) or for just the measured variables. You can choose the matrix you wish to view from the drop-down menu at the top of the window. Only half of any matrix is shown because, in a well-formed acyclic model, both halves should be identical. The cells in the Implied Matrices tab cannot be edited.

In an estimator box, the Model Statistics tab provides goodness-of-fit statistics for the SEM IM that has been estimated. Our SEM IM, however, is in an instantiated model box, so no estimation has occurred, and the Model Statistics tab is empty.
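The two ideas in this section, drawing coefficients from a range that may be symmetric about zero and reading off an implied covariance matrix, can be sketched as follows. How Tetrad actually samples parameter values is not described here, so the uniform draw is an assumption; the function names and the two-variable model Y = b·X + e are hypothetical.

```python
import random


def draw_coefficient(rng, low=0.5, high=1.5, symmetric=True):
    """Draw a coefficient from [low, high], optionally mirrored about zero."""
    magnitude = rng.uniform(low, high)
    if symmetric and rng.random() < 0.5:
        return -magnitude
    return magnitude


def implied_covariance(b, var_x, var_e):
    """Implied covariance matrix of (X, Y) for the model Y = b*X + e."""
    cov_xy = b * var_x
    var_y = b * b * var_x + var_e
    return [[var_x, cov_xy],
            [cov_xy, var_y]]


rng = random.Random(42)
symmetric_draws = [draw_coefficient(rng) for _ in range(1000)]
positive_draws = [draw_coefficient(rng, symmetric=False) for _ in range(1000)]
matrix = implied_covariance(b=1.0, var_x=1.0, var_e=1.0)
```

The implied matrix is symmetric by construction, which is why displaying only one half of it loses no information.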

A.3 Multilevelness of Angiogenesis

In complex systems, multilevelness is the key organizing principle for understanding the system (Mesarovic, 2004). Angiogenesis is a complex system because it involves functional relationships between a higher level (tissue) and a lower level (cells). Changes on the higher level depend solely on the interaction of subsystems on the lower level; that is, the behavior on the higher level depends on the coordination and interaction of the subsystems on the lower level. The tissue level and the cellular level are considered two adjacent levels. Two modes can be recognized. In the first mode, functioning or change on the lower level does not impact the functioning on the higher level. In the second mode, the functioning on the lower level impacts the higher level [90]. Similarly, in angiogenesis, if changes on the cellular level do not have an impact on the tissue level, the state is referred to as normal or healthy. If the cellular level impacts the tissue level, the state is referred to as pathological.

Bibliography

[1] Tomas Alarcon, Helen Byrne, Philip Maini, and Jasmina Panovska. Multidisciplinary Approaches to Theory in Medicine, volume 3 of Studies in Multidisciplinarity. Elsevier, 2005.

[2] Bruce Alberts, Alexander Johnson, Julian Lewis, Martin Raff, Keith Roberts, and . Molecular Biology of the Cell. Garland Science, 4 edition, 2002.

[3] Alexander R A Anderson. A hybrid mathematical model of solid tumour invasion: the importance of cell adhesion. Mathematical medicine and biology : a journal of the IMA, 22(2):163–86, June 2005.

[4] Alexander R A Anderson, Katarzyna A Rejniak, Philip Gerlee, and Vito Quaranta. Microenvironment driven invasion: a multiscale multimodel investigation. Journal of mathematical biology, 58(4-5):579–624, April 2009.

[5] Satoshi Arima, Koichi Nishiyama, Toshiyuki Ko, Yuichiro Arima, Yuji Hakozaki, Kei Sugihara, Hiroaki Koseki, Yasunobu Uchijima, Yukiko Kurihara, and Hiroki Kurihara. Angiogenic morphogenesis driven by dynamic and heterogeneous collective endothelial cell movement. Development (Cambridge, England), 138(21):4763–76, November 2011.

[6] Robert Auerbach, Rachel Lewis, Brenda Shinners, Louis Kubai, and Nasim Akhtar. Angiogenesis assays: a critical overview. Clinical chemistry, 49(1):32–40, 2003.

[7] Jayant Avva, Michael C Weis, R Michael Sramkoski, Sree N Sreenath, and James W Jacobberger. Dynamic expression profiles from static cytometry data: component fitting and conversion to relative, “same scale” values. PloS one, 7(7):e38275, January 2012.

[8] Anil Bagri, Leanne Berry, Bert Gunter, Mallika Singh, Ian Kasman, Lisa A Damico, Hong Xiang, Maike Schmidt, Germaine Fuh, Beth Hollister, Oliver Rosen, and Greg D Plowman. Effects of anti-VEGF treatment duration on tumor growth, tumor regrowth, and treatment efficacy. Clinical cancer research : an official journal of the American Association for Cancer Research, 16(15):3887–900, August 2010.

[9] Alexander M Bailey, Bryan C Thorne, and Shayn M Peirce. Multi-cell agent-based simulation of the microvasculature to study the dynamics of circulating inflammatory cell trafficking. Annals of biomedical engineering, 35(6):916–36, July 2007.

[10] Peter Baluk, Hiroya Hashizume, and Donald M McDonald. Cellular abnormalities of blood vessels as targets in cancer. Current opinion in genetics & development, 15(1):102–11, February 2005.

[11] H T Banks, Karyn L Sutton, W Clayton Thompson, Gennady Bocharov, Dirk Roose, Tim Schenkel, and Andreas Meyerhans. Estimation of cell proliferation dynamics using CFSE data. Bulletin of mathematical biology, 73(1):116–50, January 2011.

[12] Britta Basse, BC Baguley, and ES Marshall. A mathematical model for analysis of the cell cycle in human tumour. 2002.

[13] Britta Basse, BC Baguley, and ES Marshall. A mathematical model for analysis of the cell cycle in cell lines derived from human tumors. Journal of mathematical ... , 47:295–312, 2003.

[14] Britta Basse, Bruce C Baguley, Elaine S Marshall, Graeme C Wake, and David J N Wall. Modelling cell population growth with applications to cancer therapy in human tumour cell lines. Progress in biophysics and molecular biology, 85(2-3):353–68, 2004.

[15] Amy L Bauer, Trachette L Jackson, and Yi Jiang. A cell-based model exhibiting branching and anastomosis during tumor-induced angiogenesis. Biophysical journal, 92(9):3105–21, May 2007.

[16] Amy L Bauer, Trachette L Jackson, and Yi Jiang. Topography of extracellular matrix mediates vascular morphogenesis and migration speeds in angiogenesis. PLoS computational biology, 5(7):e1000445, July 2009.

[17] Gabriele Bergers, Rolf Brekken, Gerald McMahon, and Thiennu H Vu. Matrix metalloproteinase-9 triggers the angiogenic switch during carcinogenesis. Nature cell ... , 2(10):737–744, 2000.

[18] Frederique Billy, Benjamin Ribba, Olivier Saut, Helene Morre-Trouilhet, Thierry Colin, Didier Bresch, Jean Pierre Boissel, Emmanuel Grenier, and Jean Pierre Flandrois. A pharmacologically based multiscale mathematical model of angiogenesis and its use in investigating the efficacy of a new cancer treatment strategy. Journal of Theoretical Biology, 260(4):545–562, 2009.

[19] E T Bishop, G T Bell, S Bloor, I J Broom, N F Hendry, and D N Wheatley. An in vitro model of angiogenesis: basic features. Angiogenesis, 3(4):335–344, 1999.

[20] Melissa L Bondy, Michael E Scheurer, Beatrice Malmer, Jill S Barnholtz-Sloan, Faith G Davis, Dora Il’yasova, Carol Kruchko, Bridget J McCarthy, Preetha Rajaraman, Judith A Schwartzbaum, Siegal Sadetzki, Brigitte Schlehofer, Tarik Tihan, Joseph L Wiemels, Margaret Wrensch, and Patricia A Buffler. Brain tumor epidemiology: consensus from the Brain Tumor Epidemiology Consortium. Cancer, 113(7 Suppl):1953–68, October 2008.

[21] Emmanuel Boucrot and Tomas Kirchhausen. Mammalian cells change volume during mitosis. PloS one, 3(1):e1477, January 2008.

[22] Matthew Brown, Emma Bowring, and Shira Epstein. Applying multi-agent techniques to cancer modeling. Proceedings of the 6th . . . , 2011.

[23] Lei Cao and Matthew J. During. What Is the Brain-Cancer Connection?, 2012.

[24] Aurelie Carlier, Liesbet Geris, Katie Bentley, Geert Carmeliet, Peter Carmeliet, and Hans Van Oosterwyck. MOSAIC: A Multiscale Model of Osteogenesis and Sprouting Angiogenesis with Lateral Inhibition of Endothelial Cells. PLoS Computational Biology, 8(10), 2012.

[25] Peter Carmeliet and RK Jain. Angiogenesis in cancer and other diseases. Nature, 407:249–257, 2000.

[26] Nikki A Charles, Eric C Holland, Richard Gilbertson, Rainer Glass, and Helmut Kettenmann. The brain tumor microenvironment. Glia, 59(8):1169–80, August 2011.

[27] L Leon Chen, Le Zhang, Jeongah Yoon, and Thomas S Deisboeck. Cancer cell motility: optimizing spatial search strategies. Bio Systems, 95(3):234–42, March 2009.

[28] SY Cheng and Motoo Nagane. Intracerebral tumor-associated hemorrhage caused by overexpression of the vascular endothelial growth factor isoforms VEGF121 and VEGF165 but not VEGF189. Proceedings of the . . . , 94(October):12081–12087, 1997.

[29] Michael F Clarke. At the root of brain cancer. Nature, 432(7015):281–2, 2004.

[30] Jason J. Corso, Eitan Sharon, Shishir Dube, Suzie El-Saden, Usha Sinha, and Alan Yuille. Efficient multilevel brain tumor segmentation with integrated Bayesian model classification. IEEE Transactions on Medical Imaging, 27(5):629–640, 2008.

[31] Nikolaos A Dallas, Michael J Gray, Ling Xia, Fan Fan, George Van Buren II, Puja Gaur, Shaija Samuel, Sherry J Lim, Thiruvengadam Arumugam, Vijaya Ramachandran, Huamin Wang, and Lee M Ellis. Neuropilin-2 Mediated Tumor Growth and Angiogenesis in Pancreatic Adenocarcinoma. pages 8052–8060, 2008.

[32] Josephine T. Daub and Roeland M H Merks. A Cell-Based Model of Extracellular-Matrix-Guided Endothelial Cell Migration During Angiogenesis. Bulletin of Mathematical Biology, 75(8):1377–1399, 2013.

[33] Paxton V Dickson, John B Hamner, Thomas L Sims, Charles H Fraga, Catherine Y C Ng, Surender Rajasekeran, Nikolaus L Hagedorn, M Beth McCarville, Clinton F Stewart, and Andrew M Davidoff. -induced transient remodeling of the vasculature in neuroblastoma xenografts results in improved delivery and efficacy of systemically administered chemotherapy. Clinical cancer research : an official journal of the American Association for Cancer Research, 13(13):3942–50, July 2007.

[34] Travis Dunckley, Keith D Coon, and Dietrich A Stephan. Discovery and development of biomarkers of neurological disease. Drug discovery today, 10(5):326–34, March 2005.

[35] Chris Eliasmith, Terrence C Stewart, Xuan Choo, Trevor Bekolay, Travis DeWolf, Yichuan Tang, Charlie Tang, and Daniel Rasmussen. A large-scale model of the functioning brain. Science (New York, N.Y.), 338(6111):1202–5, 2012.

[36] Lee M Ellis and Daniel J Hicklin. VEGF-targeted therapy: mechanisms of anti-tumour activity. Nature reviews. Cancer, 8(8):579–91, August 2008.

[37] Sabine A. Eming, Bent Brachvogel, Teresa Odorisio, and Manuel Koch. Regulation of angiogenesis: Wound healing as a model. Progress in Histochemistry and Cytochemistry, 42(3):115–170, 2007.

[38] Heiko Enderling, Alexander R A Anderson, Mark A J Chaplain, Afshin Beheshti, Lynn Hlatky, and Philip Hahnfeldt. Paradoxical dependencies of tumor dormancy and progression on basic cell kinetics. Cancer research, 69(22):8814–21, November 2009.

[39] Sabrina Facchino, Mohamed Abdouh, and Gilbert Bernier. Brain Cancer Stem Cells: Current Status on Glioblastoma Multiforme, 2011.

[40] Babak Faryabi and Golnaz Vahedi. Constrained intervention in a cancerous mammalian cell cycle network. Genomic Signal Processing and Statistics, 2008. GENSiPS 2008. IEEE International Workshop on, pages 2–3, 2008.

[41] Moritz Felcht, Robert Luck, Alexander Schering, Philipp Seidel, Kshitij Srivastava, Junhao Hu, Arne Bartol, Yvonne Kienast, Christiane Vettel, Elias K. Loos, Simone Kutschera, Susanne Bartels, Sila Appak, Eva Besemfelder, Dorothee Terhardt, Emmanouil Chavakis, Thomas Wieland, Christian Klein, Markus Thomas, Akiyoshi Uemura, Sergij Goerdt, and Hellmut G. Augustin. Angiopoietin-2 differentially regulates angiogenesis through TIE2 and integrin signaling, 2012.

[42] Daniel P. Fitzgerald, Diane Palmieri, Emily Hua, Elizabeth Hargrave, Jeanne M. Herring, Yongzhen Qian, Eleazar Vega-Valle, Robert J. Weil, Andreas M. Stark, Alexander O. Vortmeyer, and Patricia S. Steeg. Reactive glia are recruited by highly proliferative brain metastases of breast cancer and promote tumor cell colonization. Clinical and Experimental Metastasis, 25(7):799–810, 2008.

[43] . Is angiogenesis an organizing principle in biology and medicine? Journal of pediatric surgery, 42(1):1–11, January 2007.

[44] Elena I. Fomchenko and Eric C. Holland. Stem cells and brain cancer, 2005.

[45] G Fontanini, S Vignati, L Boldrini, S Chinè, V Silvestri, M Lucchi, A Mussi, C A Angeletti, and G Bevilacqua. Vascular endothelial growth factor is associated with neovascularization and influences progression of non-small cell lung carcinoma. Clinical cancer research : an official journal of the American Association for Cancer Research, 3(6):861–5, June 1997.

[46] John Fox. Structural Equation Modeling With the sem Package in R. Structural Equation Modeling, 13(3):465–486, 2006.

[47] Peter Fraisl, Massimiliano Mazzone, Thomas Schmidt, and Peter Carmeliet. Regulation of angiogenesis by oxygen and metabolism. Developmental cell, 16(2):167–79, February 2009.

[48] R. Ganguly and I. K. Puri. Mathematical model for the cancer stem cell hypothesis. Cell Proliferation, 39(1):3–14, 2006.

[49] Ray F Gariano and Thomas W Gardner. Retinal angiogenesis in development and disease. Nature, 438(7070):960–966, 2005.

[50] Hans-Peter Gerber and Napoleone Ferrara. Pharmacology and Pharmacodynamics of Bevacizumab as Monotherapy or in Combination with Cytotoxic Therapy in Preclinical Studies. pages 671–680, 2005.

[51] Hans-Peter Gerber, Xiumin Wu, Lanlan Yu, Christian Wiesmann, Xiao Huan Liang, Chingwei V Lee, Germaine Fuh, Christine Olsson, Lisa Damico, David Xie, Y Gloria Meng, Johnny Gutierrez, Racquel Corpuz, Bing Li, Linda Hall, Linda Rangell, Ron Ferrando, Henry Lowman, Franklin Peale, and Napoleone Ferrara. Mice expressing a humanized form of VEGF-A may provide insights into the safety and efficacy of anti-VEGF antibodies. Proceedings of the National Academy of Sciences of the United States of America, 104(9):3478–83, February 2007.

[52] Isabelle Germano, Victoria Swiss, and Patrizia Casaccia. Primary brain tumors, neural stem cell, and brain tumor cancer cells: Where is the link?, 2010.

[53] Fabrizio Griffero, Antonio Daga, Daniela Marubbi, Maria Cristina Capra, Alice Melotti, Alessandra Pattarozzi, Monica Gatti, Adriana Bajetto, Carola Porcile, Federica Barbieri, Roberto E Favoni, Michele Lo Casto, Gianluigi Zona, Renato Spaziante, Tullio Florio, and Giorgio Corte. Different response of human glioma tumor-initiating cells to epidermal growth factor receptor kinase inhibitors. The Journal of biological chemistry, 284(11):7138–48, March 2009.

[54] Douglas Hanahan, Robert A Weinberg, and San Francisco. The Hallmarks of Cancer Review University of California at San Francisco. Cell, 100:57–70, 2000.

[55] Vickie Hanrahan, Margaret J Currie, Sarah P Gunningham, Helen R Morrin, Prudence A E Scott, Bridget A Robinson, and Stephen B Fox. The angiogenic switch for vascular endothelial growth factor (VEGF)-A, VEGF-B, VEGF-C, and VEGF-D in the adenoma-carcinoma sequence during colorectal cancer progression. The Journal of pathology, 200(2):183–94, June 2003.

[56] Jan Hasenauer, Julian Heinrich, Malgorzata Doszczak, Peter Scheurich, Daniel Weiskopf, and Frank Allgöwer. A visual analytics approach for models of heterogeneous cell populations. EURASIP journal on bioinformatics & systems biology, 2012(1):4, January 2012.

[57] Daniel J Hicklin and Lee M Ellis. Role of the vascular endothelial growth factor pathway in tumor growth and angiogenesis. Journal of clinical oncology : official journal of the American Society of Clinical Oncology, 23(5):1011–27, February 2005.

[58] Taylor Hines. Mathematically Modeling the Mass-Effect of Invasive Brain Tumors. SIAM Undergraduate Research Online, pages 135–158, 2010.

[59] A Hoeben and Bart Landuyt. Vascular endothelial growth factor and angiogenesis. Pharmacological Reviews, 56(4):549–580, 2004.

[60] Peter Horak, Andrew R Crawford, Douangsone D Vadysirisack, Zachary M Nash, M Phillip DeYoung, Dennis Sgroi, and Leif W Ellisen. Negative feedback control of HIF-1 through REDD1-regulated ROS suppresses tumorigenesis. Proceedings of the National Academy of Sciences of the United States of America, 107(10):4675–80, March 2010.

[61] Tetsuichiro Inai, Michael Mancuso, Hiroya Hashizume, Fabienne Baffert, Amy Haskell, Peter Baluk, Dana D Hu-Lowe, David R Shalinsky, Gavin Thurston, George D Yancopoulos, and Donald M McDonald. Inhibition of vascular endothelial growth factor (VEGF) signaling in cancer causes loss of endothelial fenestrations, regression of tumor vessels, and appearance of basement membrane ghosts. The American journal of pathology, 165(1):35–52, July 2004.

[62] David W Infanger, Maureen E Lynch, and Claudia Fischbach. Engineered culture models for studies of tumor-microenvironment interactions. Annual review of biomedical engineering, 15:29–53, July 2013.

[63] Eiichi Ishikawa, Yoshihiro Muragaki, Tetsuya Yamamoto, Takashi Maruyama, Koji Tsuboi, Soko Ikuta, Koichi Hashimoto, Youji Uemae, Takeshi Ishihara, Masahide Matsuda, Masao Matsutani, Katsuyuki Karasawa, Yoichi Nakazato, Tatsuya Abe, Tadao Ohno, and Akira Matsumura. Phase I/IIa trial of fractionated radiotherapy, temozolomide, and autologous formalin-fixed tumor vaccine for newly diagnosed glioblastoma. Journal of neurosurgery, pages 1–11, July 2014.

[64] Z Jackiewicz, B Zubik-Kowal, and B Basse. Finite-difference and pseudo-spectral methods for the numerical simulations of in vitro human tumor cell population kinetics. Mathematical biosciences and engineering : MBE, 6(3):561–72, July 2009.

[65] Trachette Jackson and Xiaoming Zheng. A cell-based model of endothelial cell migration, proliferation and maturation during corneal angiogenesis. Bulletin of Mathematical Biology, 72(4):830–868, 2010.

[66] Rakesh K Jain, Dan G Duda, Jeffrey W Clark, and Jay S Loeffler. Lessons from phase III clinical trials on anti-VEGF therapy for cancer. Nature clinical practice. Oncology, 3(1):24–40, January 2006.

[67] B M Kenyon, E E Voest, C C Chen, E Flynn, J Folkman, and R J D’Amato. A model of angiogenesis in the mouse cornea. Investigative ophthalmology & visual science, 37(8):1625–1632, 1996.

[68] Robert S Kerbel. Antiangiogenic therapy: a universal chemosensitization strategy for cancer? Science (New York, N.Y.), 312(5777):1171–5, May 2006.

[69] Hwan-Young Kim, Hye-Ran Kim, Min-Gu Kang, Nguyen Thi Dai Trang, Hee-Jo Baek, Jae-Dong Moon, Jong-Hee Shin, Soon-Pal Suh, Dong-Wook Ryang, Hoon Kook, and Myung-Geun Shin. Profiling of Biomarkers for the Exposure of Polycyclic Aromatic Hydrocarbons: Lamin-A/C Isoform 3, Poly[ADP-ribose] Polymerase 1, and Mitochondria Copy Number Are Identified as Universal Biomarkers. BioMed research international, 2014:605135, January 2014.

[70] Sean H J Kim, Jayanta Debnath, Keith Mostov, Sunwoo Park, and C Anthony Hunt. A computational approach to resolve cell level contributions to early glandular epithelial cancer progression. BMC systems biology, 3:122, January 2009.

[71] M Lacroix, D Abi-Said, D R Fourney, Z L Gokaslan, W Shi, F DeMonte, F F Lang, I E McCutcheon, S J Hassenbusch, E Holland, K Hess, C Michael, D Miller, and R Sawaya. A multivariate analysis of 416 patients with glioblastoma multiforme: prognosis, extent of resection, and survival. Journal of neurosurgery, 95(2):190–8, August 2001.

[72] H Christopher Lawson, Prakash Sampath, Eileen Bohan, Michael C Park, Namath Hussain, Alessandro Olivi, Jon Weingart, Lawrence Kleinberg, and Henry Brem. Interstitial chemotherapy for malignant gliomas: the Johns Hopkins experience. Journal of neuro-oncology, 83(1):61–70, May 2007.

[73] Fuhai Li, Xiaobo Zhou, Jinwen Ma, and S Wong. Multiple nuclei tracking using integer programming for quantitative cancer cell cycle analysis. Medical Imaging, IEEE ... , 29(1):96–105, 2010.

[74] Anne Limbourg, Thomas Korff, L Christian Napp, Wolfgang Schaper, Helmut Drexler, and Florian P Limbourg. Evaluation of postnatal arteriogenesis and angiogenesis in a mouse model of hind-limb ischemia. Nature protocols, 4(12):1737–1746, 2009.

[75] HS Lin and JC Wooley. Catalyzing inquiry at the interface of computing and biology. 2005.

[76] L A Liotta and E C Kohn. The microenvironment of the tumour-host interface. Nature, 411(6835):375–9, May 2001.

[77] Gang Liu, Amina A Qutub, Prakash Vempati, Feilim Mac Gabhann, and Aleksander S Popel. Module-based multiscale simulation of angiogenesis in skeletal muscle. Theoretical biology & medical modelling, 8(1):6, January 2011.

[78] M. Lorger, H. Lee, J. S. Forsyth, and B. Felding-Habermann. Comparison of in vitro and in vivo approaches to studying brain colonization by breast cancer cells. Journal of Neuro-Oncology, 104(3):689–696, 2011.

[79] Girieca Lorusso and Curzio Rüegg. The tumor microenvironment and its contribution to tumor evolution toward metastasis. Histochemistry and cell biology, 130(6):1091–103, December 2008.

[80] John Lowengrub, Qing Nie, Vittorio Cristini, and Xiangrong Li. Nonlinear three-dimensional simulation of solid tumor growth. Discrete and Continuous Dynamical Systems - Series B, 7(3):581–604, February 2007.

[81] Seiji Mabuchi, Yoshito Terai, Kenichiro Morishige, Akiko Tanabe-Kimura, Hiroshi Sasaki, Masanori Kanemura, Satoshi Tsunetoh, Yoshimichi Tanaka, Masahiro Sakata, Robert A Burger, Tadashi Kimura, and Masahide Ohmichi. Maintenance treatment with bevacizumab prolongs survival in an in vivo ovarian cancer model. Clinical cancer research : an official journal of the American Association for Cancer Research, 14(23):7781–9, December 2008.

[82] P Macklin and M E Edgerton. Agent-based cell modelling: application to breast cancer. In Multiscale Modeling of Cancer: An Integrated Experimental and Mathematical Modeling Approach, chapter 10. 2009.

[83] Amit Maity, Nabendu Pore, Jerry Lee, Don Solomon, and Donald M O'Rourke. Epidermal Growth Factor Receptor Transcriptionally Up-Regulates Vascular Endothelial Growth Factor Expression in Human Glioblastoma Cells via a Pathway Involving Phosphatidylinositol 3'-Kinase and Distinct from That Induced by Hypoxia. 2000.

[84] Michael R Mancuso, Rachel Davis, Scott M Norberg, Shaun O'Brien, Barbara Sennino, Tsutomu Nakahara, Virginia J Yao, Tetsuichiro Inai, Peter Brooks, Bruce Freimark, David R Shalinsky, Dana D Hu-Lowe, and Donald M McDonald. Rapid vascular regrowth in tumors after reversal of VEGF inhibition. 116(10), 2006.

[85] Y Mansury, M Kimura, J Lobo, and TS Deisboeck. Emerging patterns in tumor systems: simulating the dynamics of multicellular clusters with an agent-based spatial agglomeration model. Journal of Theoretical ... , pages 343–370, 2002.

[86] Yuri Mansury and Thomas S. Deisboeck. The impact of search precision in an agent-based tumor model. Journal of Theoretical Biology, 224(3):325–337, October 2003.

[87] Nikos V Mantzaris, Steve Webb, and Hans G Othmer. Mathematical modeling of tumor-induced angiogenesis. Journal of mathematical biology, 49(2):111–87, August 2004.

[88] Richard Mayeux. Biomarkers: potential uses and limitations. NeuroRx : the journal of the American Society for Experimental NeuroTherapeutics, 1(2):182–8, April 2004.

[89] Christopher Meek. Causal inference and causal explanation with background knowledge. Eleventh Conference on Uncertainty in Artificial Intelligence, pages 403–410, 1995.

[90] MD Mesarovic, SN Sreenath, and JD Keene. Search for organising principles: understanding in systems biology. Systems Biology, pages 19–27, 2004.

[91] F Milde, M Bergdorf, and P Koumoutsakos. A hybrid model for three-dimensional simulations of sprouting angiogenesis. Biophysical journal, 95(7):3146–3160, 2008.

[92] Paul S Mischel and Timothy F Cloughesy. Targeted molecular therapy of GBM. Brain pathology (Zurich, Switzerland), 13(1):52–61, January 2003.

[93] Rodolfo Molina-Peña and Mario Moisés Alvarez. A simple mathematical model based on the cancer stem cell hypothesis suggests kinetic commonalities in solid tumor growth. PloS one, 7(2):e26233, January 2012.

[94] Alessandro Morabito, Ermelinda De Maio, Massimo Di Maio, Nicola Normanno, and Francesco Perrone. Tyrosine kinase inhibitors of vascular endothelial growth factor receptors in clinical trials: current status and future directions. The oncologist, 11(7):753–64, 2006.

[95] W Mueller-Klieser, J P Freyer, and R M Sutherland. Influence of glucose and oxygen supply conditions on the oxygenation of multicellular spheroids. British journal of cancer, 53(3):345–53, March 1986.

[96] Debabrata Mukhopadhyay and Kaustubh Datta. Multiple regulatory pathways of vascular permeability factor/vascular endothelial growth factor (VPF/VEGF) expression in tumors. Seminars in cancer biology, 14:123–130, 2004.

[97] Sukriti Nag. The blood-brain barrier and cerebral angiogenesis: Lessons from the cold-injury model, 2002.

[98] Sara M. Nolte, Chitra Venugopal, Nicole McFarlane, Olena Morozova, Robin M. Hallett, Erin O’Farrell, Branavan Manoranjan, Naresh K. Murty, Paula Klurfan, Edward Kachur, John P. Provias, Forough Farrokhyar, John A. Hassell, Marco Marra, and Sheila K. Singh. A cancer stem cell model for studying brain metastases from primary lung cancer. Journal of the National Cancer Institute, 105(8):551–562, 2013.

[99] Naoki Oka, Akio Soeda, Akihito Inagaki, Masafumi Onodera, Hidekazu Maruyama, Akira Hara, Takahiro Kunisada, Hideki Mori, and Toru Iwama. VEGF promotes tumorigenesis and angiogenesis of human glioblastoma stem cells. Biochemical and biophysical research communications, 360(3):553–9, August 2007.

[100] Antonio Omuro and Lisa M DeAngelis. Glioblastoma and other malignant gliomas: a clinical review. JAMA : the journal of the American Medical Association, 310(17):1842–50, November 2013.

[101] Mayumi Ono. Molecular links between tumor angiogenesis and inflammation: Inflammatory stimuli of macrophages and cancer cells as targets for therapeutic strategy, 2008.

[102] CR Ozawa and Andrea Banfi. Microenvironmental VEGF concentration, not total dose, determines a threshold between normal and aberrant angiogenesis. Journal of Clinical ... , 113(4), 2004.

[103] J C Panetta, W E Evans, and M H Cheok. Mechanistic mathematical modelling of mercaptopurine effects on cell cycle of human acute lymphoblastic leukaemia cells. British journal of cancer, 94(1):93–100, January 2006.

[104] Eun Joo Park, Yong Zhi Zhang, Natalia Vykhodtseva, and Nathan McDannold. Ultrasound-mediated blood-brain/blood-tumor barrier disruption improves outcomes with trastuzumab in a breast cancer brain metastasis model. Journal of Controlled Release, 163(3):277–284, 2012.

[105] Judea Pearl. TETRAD and SEM, 1998.

[106] Judea Pearl. The causal foundations of structural equation modeling. Handbook of structural equation modeling, (June):68–91, 2012.

[107] Holger Perfahl, HM Byrne, Tingan Chen, and Veronica Estrella. Multiscale modelling of vascular tumour growth in 3D: the roles of domain size and boundary conditions. PloS one, 1:1–15, 2011.

[108] Sara G M Piccirillo, Elena Binda, Roberta Fiocco, Angelo L Vescovi, and Khalid Shah. Brain cancer stem cells. Journal of molecular medicine (Berlin, Germany), 87(11):1087–1095, 2009.

[109] M. J. Plank and B. D. Sleeman. Tumour-Induced Angiogenesis: A Review. Journal of Theoretical Medicine, 5(3-4):137–153, 2003.

[110] Amina Qutub, Feilim Mac Gabhann, Emmanouil Karagiannis, Prakash Vempati, and Aleksander Popel. Multiscale models of angiogenesis. In IEEE Engineering in Medicine and Biology Magazine, volume 28, pages 14–31, 2009.

[111] Brian I Rini and Eric J Small. Biology and clinical development of vascular endothelial growth factor-targeted therapy in renal cell carcinoma. Journal of clinical oncology : official journal of the American Society of Clinical Oncology, 23(5):1028–43, February 2005.

[112] W G Roberts, J Delaat, M Nagane, S Huang, W K Cavenee, and G E Palade. Host microvasculature influence on tumor vascular morphology and endothelial gene expression. The American journal of pathology, 153(4):1239–48, October 1998.

[113] Tiina Roose, S. Jonathan Chapman, and Philip K. Maini. Mathematical Models of Avascular Tumor Growth. SIAM Review, 49(2):179–208, January 2007.

[114] David M Roy and Logan A Walsh. Candidate prognostic markers in breast cancer: focus on extracellular proteases and their inhibitors. Breast cancer (Dove Medical Press), 6:81–91, January 2014.

[115] Rajkumar Savai, Alexander Claus Langheinrich, Ralph Theo Schermuly, Soni Savai Pullamsetti, Rio Dumitrascu, Horst Traupe, Wigbert Stephan Rau, Werner Seeger, Friedrich Grimminger, and Gamal Andre Banat. Evaluation of angiogenesis using micro-computed tomography in a xenograft mouse model of lung cancer. Neoplasia (New York, N.Y.), 11(1):48–56, 2009.

[116] Jason Sheehan, Christopher P Cifarelli, Kasandra Dassoulas, Claire Olson, Jessica Rainey, and Shaojie Han. Trans-sodium crocetinate enhancing survival and glioma response on magnetic resonance imaging to radiation and temozolomide. Journal of neurosurgery, 113(2):234–9, August 2010.

[117] JA Sherratt and MAJ Chaplain. A new mathematical model for avascular tumour growth. Journal of Mathematical Biology, 43:291–312, 2001.

[118] Qihui Shi, Lidong Qin, Wei Wei, Feng Geng, Rong Fan, Young Shik Shin, Deliang Guo, Leroy Hood, Paul S Mischel, and James R Heath. Single-cell proteomic chip for profiling intracellular signaling pathways in single tumor cells. Proceedings of the National Academy of Sciences of the United States of America, 109(2):419–24, January 2012.

[119] Bill Shipley. Cause and Correlation in Biology: A User’s Guide to Path Analysis, Structural Equations, and Causal Inference. Cambridge UP, Cambridge, UK, 2000.

[120] Bill Shipley. Confirmatory path analysis in a generalized multilevel context. Ecology, 90(2):363–368, 2009.

[121] Abbas Shirinifard, J Scott Gens, Benjamin L Zaitlen, Nikodem J Popławski, Maciej Swat, and James A Glazier. 3D multi-cell simulation of tumor growth and angiogenesis. PloS one, 4(10):e7190, January 2009.

[122] D Shweiki, A Itin, D Soffer, and E Keshet. Vascular endothelial growth factor induced by hypoxia may mediate hypoxia-initiated angiogenesis. Nature, 1992.

[123] F R Sidoli, A Mantalaris, and S P Asprey. Modelling of Mammalian cells and cell culture processes. Cytotechnology, 44(1-2):27–46, January 2004.

[124] Lorenzo Spinelli, Alessandro Torricelli, Paolo Ubezio, and Britta Basse. Modelling the balance between quiescence and cell death in normal and tumour cell populations. Mathematical biosciences, 202(2):349–70, August 2006.

[125] Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search. 81, 1993.

[126] Jordan R Stern, Scott Christley, Olga Zaborina, John C Alverdy, and Gary An. Integration of TGF-β- and EGFR-based signaling pathways using an agent-based model of epithelial restitution. Wound repair and regeneration : official publication of the Wound Healing Society [and] the European Tissue Repair Society, 20(6):862–71, 2012.

[127] C L Stokes, D A Lauffenburger, and S K Williams. Migration of individual microvessel endothelial cells: stochastic model and parameter measurement. Journal of cell science, 99 (Pt 2):419–30, June 1991.

[128] Brian R. Stoll, Cristiano Migliorini, Ananth Kadambi, Lance L. Munn, and Rakesh K. Jain. A mathematical model of the contribution of endothelial progenitor cells to angiogenesis in tumors: Implications for antiangiogenic therapy. Blood, 102(7):2555–2561, 2003.

[129] Roger Stupp, Warren P Mason, Martin J van den Bent, Michael Weller, Barbara Fisher, Martin J B Taphoorn, Karl Belanger, Alba A Brandes, Christine Marosi, Ulrich Bogdahn, Jürgen Curschmann, Robert C Janzer, Samuel K Ludwin, Thierry Gorlia, Anouk Allgeier, Denis Lacombe, J Gregory Cairncross, Elizabeth Eisenhauer, and René O Mirimanoff. Radiotherapy plus concomitant and adjuvant temozolomide for glioblastoma. The New England journal of medicine, 352(10):987–96, March 2005.

[130] Dominik Sturm, Sebastian Bender, David T W Jones, Peter Lichter, Jacques Grill, Oren Becher, Cynthia Hawkins, Jacek Majewski, Chris Jones, Joseph F Costello, Antonio Iavarone, Kenneth Aldape, Cameron W Brennan, Nada Jabado, and Stefan M Pfister. Paediatric and adult glioblastoma: multiform (epi)genomic culprits emerge. Nature reviews. Cancer, 14(2):92–107, February 2014.

[131] Shuyu Sun, Mary F. Wheeler, Mandri Obeyesekere, and Charles W. Patrick. A deterministic model of growth factor-induced angiogenesis. Bulletin of Mathematical Biology, 67(2):313–337, 2005.

[132] Xiaoqiang Sun, Le Zhang, Hua Tan, Jiguang Bao, Costas Strouthos, and Xiaobo Zhou. Multi-scale agent-based brain cancer modeling and prediction of TKI treatment response: Incorporating EGFR signaling pathway and angiogenesis. BMC bioinformatics, 13(1):218, January 2012.

[133] Chien-Kuo Tai, Wei Jun Wang, Thomas C Chen, and Noriyuki Kasahara. Single-shot, multicycle suicide gene therapy by replication-competent retrovirus vectors achieves long-term survival benefit in experimental glioma. Molecular therapy : the journal of the American Society of Gene Therapy, 12(5):842–51, November 2005.

[134] Shingo Takano, Yoshihiko Yoshii, Shinichi Kondo, Hideo Suzuki, Tooru Maruno, Shizuo Shirai, and Tadao Nose. Concentration of Vascular Endothelial Growth Factor in the Serum and Tumor Tissue of Brain Tumor Patients. pages 2185–2190, 1996.

[135] Bryan C Thorne, Alexander M Bailey, Douglas W DeSimone, and Shayn M Peirce. Agent-based modeling of multicell morphogenic processes during development. Birth defects research. Part C, Embryo today : reviews, 81(4):344–53, December 2007.

[136] Q Tian, N D Price, and L Hood. Systems cancer medicine: towards realization of predictive, preventive, personalized and participatory (P4) medicine. Journal of internal medicine, 271(2):111–21, February 2012.

[137] Rui D M Travasso, Eugenia Corvera Poire, Mario Castro, Juan Carlos Rodríguez-Manzaneque, and A. Hernandez-Machado. Tumor angiogenesis and vascular patterning: A mathematical model. PLoS ONE, 6(5), 2011.

[138] Silvia Vosseler, Nicolae Mirancea, Peter Bohlen, Margareta M Mueller, and Norbert E Fusenig. Angiogenesis Inhibition by Vascular Endothelial Growth Factor Receptor-2 Blockade Reduces Stromal Matrix Metalloproteinase Expression, Normalizes Stromal Tissue, and Reverts Epithelial Tumor Phenotype in Surface Heterotransplants. 2005.

[139] MD Walter, E Alexander, and WE Hunt. Evaluation of BCNU and/or radiation therapy in the treatment of anaplastic gliomas. J Neurosurg, 49:333–343, 1978.

[140] Y Wang and PR Crisostomo. TGF-α increases human mesenchymal stem cell-secreted VEGF by MEK- and PI3-K- but not JNK- or ERK-dependent mechanisms. American Journal of Physiology, 46202:1115–1123, 2008.

[141] Zhihui Wang, Le Zhang, Jonathan Sagotsky, and Thomas S Deisboeck. Simulating non-small cell lung cancer with a multiscale agent-based model. Theoretical biology & medical modelling, 4:50, January 2007.

[142] J P Ward and J R King. Mathematical modelling of avascular-tumour growth. II: Modelling growth saturation. IMA journal of mathematics applied in medicine and biology, 16(2):171–211, June 1999.

[143] R S Warren, H Yuan, M R Matli, N A Gillett, and N Ferrara. Regulation by vascular endothelial growth factor of human colon cancer tumorigenesis in a mouse model of experimental liver metastasis. The Journal of clinical investigation, 95(4):1789–97, April 1995.

[144] R Wechsler-Reya and MP Scott. The developmental biology of brain tumors. Annual review of neuroscience, (Rather 1978):385–428, 2001.

[145] Robert A. Weinberg. The Biology of Cancer. Garland Science, 1 edition, 2006.

[146] Christopher G Willett, Yves Boucher, Emmanuelle di Tomaso, Dan G Duda, Lance L Munn, Ricky T Tong, Daniel C Chung, Dushyant V Sahani, Sanjeeva P Kalva, Sergey V Kozin, Mari Mino, Kenneth S Cohen, David T Scadden, Alan C Hartford, Alan J Fischman, Jeffrey W Clark, David P Ryan, Andrew X Zhu, Lawrence S Blaszkowsky, Helen X Chen, Paul C Shellito, Gregory Y Lauwers, and Rakesh K Jain. Direct evidence that the VEGF-specific antibody bevacizumab has antivascular effects in human rectal cancer. Nature medicine, 10(2):145–7, February 2004.

[147] Sewall Wright. The method of path coefficients. The Annals of Mathematical Statistics, 5(3):161–215, 1934.

[148] Nicole A Young, David K Flaherty, David C Airey, Peter Varlan, Feyi Aworunse, Jon H Kaas, and Christine E Collins. Use of flow cytometry for high-throughput cell population estimates in brain tissue. Frontiers in neuroanatomy, 6(July):27, January 2012.

[149] L.A. Zadeh. Feature Extraction: Foundations and Applications. Springer Berlin Heidelberg, 1 edition, 2006.

[150] Le Zhang, Chaitanya A Athale, and Thomas S Deisboeck. Development of a three-dimensional multiscale agent-based tumor model: simulating gene-protein interaction profiles, cell phenotypes and multicellular patterns in brain cancer. Journal of theoretical biology, 244(1):96–107, January 2007.

[151] Zhitu Zhu, Diane C Wang, Laurentiu M Popescu, and Xiangdong Wang. Single-cell transcriptome in the identification of disease biomarkers: opportunities and challenges. Journal of translational medicine, 12(1):212, August 2014.

[152] L S Ziemer, C J Koch, A Maity, D P Magarelli, A M Horan, and S M Evans. Hypoxia and VEGF mRNA expression in human tumors. Neoplasia (New York, N.Y.), 3(6):500–8, 2001.
