PANI: a Novel Algorithm for Fast Discovery of Putative Target Nodes in Signaling Networks
Total Page:16
File Type:pdf, Size:1020Kb
PANI: A Novel Algorithm for Fast Discovery of Putative Target Nodes in Signaling Networks Huey-Eng Chua§ Sourav S Bhowmick§ Lisa Tucker-Kellogg‡ Qing Zhao§ C F Dewey, Jr† Hanry Yu¶ §School of Computer Engineering, Nanyang Technological University, Singapore ‡Mechanobiology Institute, National University of Singapore, Singapore ¶Department of Physiology, National University of Singapore, Singapore †Division of Biological Engineering, Massachusetts Institute of Technology, USA chua0530|assourav|[email protected], LisaTK|[email protected], [email protected] ABSTRACT of disease [26]. Observation-based approaches screen drug Biological network analysis often aims at the target identifi- compounds in vitro, ex vivo, or in vivo; and measure em- cation problem, which is to predict which molecule to inhibit pirical outcome for determining which drugs are “active” (or activate) for a disease treatment to achieve optimum ef- against the disease. In contrast, target-based approaches ficacy and safety. A related goal, arising from the increasing identify a particular molecule (e.g., enzyme, receptor) that availability of semi-automated assays and moderately par- functions prominently in a validated mechanism of the dis- allel experiments, is to suggest many molecules as potential ease, and then synthesize a drug compound to interact specif- targets. The target prioritization problem is to predict a sub- ically with that target molecule. One key expectation in set of molecules in a given disease-associated network which target-based drug development is that specificity for one contains successful drug targets with highest probability. disease-causing molecule and lack of binding to other Sensitivity analysis prioritizes targets in a dynamic network molecules will minimize toxicity. Another emerging research model according to principled criteria, but fails to penal- area is the network-based drug discovery approaches which ize off-target effects, and does not scale for large networks. exploit knowledge of disease mechanism at a systems level We describe Pani (Putative TArget Nodes PrIoritization), [226]. a novel method that prunes and ranks the possible target There is an ongoing debate on the value of observation- nodes by exploiting concentration-time profiles and network based, target-based, and network-based approaches in drug structure (topological) information. Pani and two sensitiv- development [39, 107, 97]. As the debate continues, in- ity analysis methods were applied to three signaling net- novations in experimental methods are chipping away at works, mapk-pi3k; myosin light chain (mlc) phosphoryla- the previous distinctions between these approaches. Hy- tion and sea urchin endomesoderm gene regulatory network brid strategies are increasingly accessible, in part because of which are implicated in ovarian cancer; atrial fibrillation cost-effective methods for parallelizing experiments on tis- and embryonic deformity. Predicted targets were compared sues and living cells. Technologies, such as microfluidic cell against a reference set: the molecules known to be targeted culture arrays, and high-content imaging [220, 45, 40], fa- by drugs in clinical use for the respective diseases. Pani is cilitate investigations that bridge between screening empiri- orders of magnitude faster and prioritizes the majority of cal outcomes, targeted study of disease mechanisms, and/or known targets higher than sensitivity analysis. This high- network-based systems biology [155]. The specific meth- lights a potential disagreement between absolute mathemat- ods are diverse and rapidly changing, but one clear trend is ical sensitivity and our intuition of influence. We conclude an increase in customized [155] designs for high-throughput that empirical, structural methods like Pani, which demand experiments, meaning that individual investigators decide almost no run time, offer benefits not available from quan- not only the treatments and controls for the input samples, titative simulation and sensitivity analysis. but they also decide which measurements to perform on the samples (e.g., which genes to measure, which behaviors to 1. INTRODUCTION quantify). This level of design is particularly challenging Drug discovery research has gradually shifted from when experiments have enough “high throughput” coverage observation-based approaches with phenotypic screening, to- to exceed the biological expertise of any single investigator, ward target-based research aimed at molecular mechanisms but not enough coverage to skip the decision-making step and simply measure every variable. Experimental innovation often creates novel computational problems, including immature research topics where sim- ple algorithms might be effective. Experimental trends in Permission to make digital or hard copies of all or part of this work for drug discovery research are now creating demand for com- personal or classroom use is granted without fee provided that copies are putational automation to assist in the selection of molecule not made or distributed for profit or commercial advantage and that copies sets for multiplex assays. This paper addresses molecule bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific selection in a manner that deliberately spans the gap from permission and/or a fee. network-based and target-based computation to observation- Copyright 20XX ACM X-XXXXX-XX-X/XX/XX ...$10.00. based goals for final outcome. The first step of this work is outcome of disease is not addressed well by sensitivity anal- to formalize “target prioritization”, the problem of choosing ysis. a set of putative target molecules for further study. Next, we present a fast and novel algorithm called Pani (Putative 2. RELATED WORK TArget Nodes PrIoritization), which uses network infor- Sensitivity analysis [296, 193, 108] is a family of closely mation and simple empirical scores to prioritize and rank related methods that is frequently proposed for target iden- biologically relevant target molecules in signaling networks. tification. Sensitivity analysis measures the effect of a pa- A putative target node in a signaling network is a protein rameter perturbation (e.g., a kinetic rate constant change) that when perturbed is able to achieve desirable efficacy on the output node and assigns sensitivity values to a node and safety in terms of regulation of a particular output node. based on the extent of output node perturbation. The pa- Informally, an output node is a protein that is either involved rameters are ranked according to the sensitivity value and in some biological processes (e.g., proliferation) which may sensitive parameters (parameters with high sensitivity val- be deregulated, resulting in manifestation of a disease (e.g., ues) are then selected as potential targets [108]. The param- cancer) or be of interest due to its physiological role in the eter values of a real biological network vary depending on disease. An example of an output node for the application genetics, cellular environment and cell type. Thus, no single of cancer drug design might be Akt (a node of interest in mapk pi3k “true" nominal parameter value exists. Hence, more appro- the - network [93]). Constitutive activation of Akt priate are gsa-based methods [296], such as sobol [238] and was shown to be oncogenic and may be targeted to disrupt mpsa [296], which measure the effect on the output node ovarian tumor cell growth [8]. Regulation of the output when all parameters are varied simultaneously. node provides a means to restore normalcy to the diseased Although gsa-based methods can identify sensitive pa- network [106]. rameters of the system, they have several limitations. First, Pani is a generic algorithm applicable to any biological Pani they require simulating the network behavior for a combina- signaling network. The algorithm starts with a pre- torial number of different parameter combinations, making processing step that prunes the candidate nodes (nodes be- it computationally expensive, especially for larger networks. ing considered for analysis) based on a reachability rule to sobol mlc Pani analysis takes about 21 hours for the phospho- reduce computation cost. Then, in its main phase, rylation network [160] with 105 nodes. For larger network prioritizes nodes using a score based on a kinetic property pssd (e.g., the endomesoderm network [138] with 622 nodes), it called profile shape similarity distance ( ) and two net- fails to complete the analysis due to memory problem. Sec- work structural properties, namely target downstream effect tde bc ond, these methods generally identify parameters resulting ( ) and bridging centrality ( ) [111]. Putative target in maximum output node perturbation without considering nodes are nodes with high ranking score. off-target effects. Third, these methods may miss “insensi- Profile shape similarity distance measures the similarity tive” nodes that may be important drug targets, since they between concentration-time series profiles (plot of a node’s only consider one property (sensitivity) in their ranking. For concentration against time) using a customized distance mea- instance, mpsa [296] and sobol [238] ignore Akt as a target sure that, for example, permits delays and inversions. pssd node although active Akt can inhibit activation of erk in is a distance measure which identifies the most relevant up- differentiated myotubes via Raf-Akt interaction [214]. Pani stream regulators.