Manuscript under review

Comprehensive data-driven analysis of the impact of chemoinformatic structure on the genome-wide biological response profiles of cancer cells to 1159 drugs

Suleiman A. Khan 1,§, Ali Faisal 1, John Patric Mpindi 2, Juuso A. Parkkinen 1, Tuomo Kalliokoski 3, Antti Poso 4, Olli P. Kallioniemi 2, Krister Wennerberg 2, Samuel Kaski 1,5,§

1 Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, PO Box 15400, Espoo, 00076, Finland.
2 Institute for Molecular Medicine Finland, University of Helsinki, PO Box 20, Helsinki, 00014, Finland.
3 CADD, Global Discovery Chemistry, Novartis Institute for Biomedical Research, Basel, CH4002, Switzerland.
4 School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland, PO Box 1627, Kuopio, 70211, Finland.
5 Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, PO Box 68, Helsinki, 00014, Finland.

§ Corresponding authors. Contact:
[email protected],
[email protected]

Abstract

Motivation: Detailed and systematic understanding of the biological effects of millions of available compounds on living cells is a significant challenge. As most compounds impact multiple targets and pathways, traditional methods for analyzing structure-function relationships are not comprehensive enough. Therefore, more advanced integrative models are needed for predicting the biological effects elicited by specific chemical features. As a step towards creating such computational links, we developed a data-driven chemical systems biology approach to comprehensively study the relationship between 76 structural 3D descriptors (VolSurf, chemical space) of 1159 drugs and the gene expression responses (biological space) they elicited in three cancer cell lines.