PQSAR: the Membrane Quantitative Structure-Activity Relationships in Cheminformatics

Expert Systems With Applications 54 (2016) 219–227
Journal homepage: www.elsevier.com/locate/eswa

Ammar Adl a,∗, Moustafa Zein b, Aboul Ella Hassanien b,c

a Faculty of Computers and Information, Beni-Suef University, Egypt
b Faculty of Computers and Information, Cairo University, Cairo, Egypt
c Scientific Research Group in Egypt (SRGE)

∗ Corresponding author. Tel.: +20 1092546901. URL: http://www.egyptscience.net (A.E. Hassanien)
http://dx.doi.org/10.1016/j.eswa.2016.01.051
0957-4174/© 2016 Elsevier Ltd. All rights reserved.

Keywords: Quantitative structure activity relationships (QSAR); Similarity measurements; Similarity searching strategy; P System; Chemical search space; Drug discovery

Abstract

Quantitative structure activity relationships (QSAR) are used to establish a correlation between structure and biological response. Similarity searching is one of the major phases of QSAR. Devising new strategies for similarity searching is an urgent task in cheminformatics research for three reasons: (i) the increasing size of the chemical search space of compound databases; (ii) the importance of similarity measurements to 2D and 3D QSAR models; and (iii) the fact that similarity searching is a time-consuming process in drug discovery. In this study, we introduce a theoretical similarity searching strategy based on membrane computing that addresses the time consumption problem. We adopt a ranking sorting algorithm with a P System to rank probabilities of similarity according to a predefined similarity threshold. This bio-inspired model, which simulates a biological living cell, is a high-performance parallel processing system that we call PQSAR. It relies on a set of rules to apply the ranking algorithm to probabilities of similarity. The simulated experiments show that the PQSAR method significantly enhances the performance of similarity searching and introduces a standard ranking algorithm for similarity searching.

© 2016 Elsevier Ltd. All rights reserved.

1. Introduction

QSAR models are effective virtual screening (VS) tools, and several successful applications based on QSAR models have been introduced as solutions to structure activity relationship problems (Luo, Wang, Roth, Golbraikh, & Tropsha, 2014; Varnek & Tropsha, 2008). QSAR is employed to explore chemical databases, which include the structures and biological activities of the compounds involved. The goal of building statistical and predictive models is to explore chemical databases and predict the biological activities of untested compounds. QSAR approaches have been applied to guide lead optimization and to study mechanisms of chemical-biological interaction in modern drug discovery (Shahlaei, 2013). QSAR modeling consists of four fundamental phases: data preparation, descriptor discovery, model building, and model validation.

The initial phase in QSAR or quantitative structure property relationships (QSPR) is to represent the structures of compounds by descriptors such as fingerprints. A variety of structure descriptors have been developed to describe molecules in 2D and/or 3D structure (Gasteiger, 2014; Todeschini & Consonni, 2009). Cheminformatics-based methods deploy a set of descriptors for building QSAR models. Molecular descriptors encode a representation of a molecule into a numeric representation such as a fingerprint. Fingerprints encode the structure representation of compounds as binary values (Karunaratne, Boström, & Norinder, 2013). A similarity coefficient can convert the binary values of compounds into a probability of similarity. Similarity measurements are the backbone of QSAR models in drug discovery. Furthermore, they constitute principal concepts in chemical reasoning and analysis (Maggiora & Shanmugasundaram, 2011). The importance of similarity searching has grown rapidly in cheminformatics, and it now plays a dominant role in lead optimization and in medicinal and industrial discovery programs. Its importance keeps growing, resulting in advanced 2D- and 3D-based similarity in current chemical information systems (Parkesh et al., 2012; Willett, 2013).

The increasing size of the chemical search space of compound databases and the importance of similarity measurements to drug discovery are main factors in chemical studies and research (Vogt & Bajorath, 2012). Similarity searching in large chemical databases is a time-consuming task in the drug discovery process and poses many difficulties (Schenone, Dančík, Wagner, & Clemons, 2013). Several searching strategies and algorithms have been introduced to deal with similarity searching (Bento et al., 2014; Vogt & Bajorath, 2012), but they were not efficient enough. There is still a great need for more efficient searching algorithms. Similarity has two types: structure similarity and activity similarity. Both are represented by probabilities that can be used in the searching process (Randić, 2014). These probabilities must be screened in order to identify the molecules that exceed a predefined similarity threshold (Backman & Girke, 2014; Vogt & Bajorath, 2012). Probabilities of similarity are ranked according to a similarity threshold. Ranking n probabilities is the last step in all similarity searching methods, and it is performed by a ranking sorting algorithm. This step is time-consuming, and its cost grows nonlinearly with n.

Time consumption is a major challenge in drug discovery; therefore, those algorithms lead to more difficulties in drug discovery. The objective of the proposed strategy is to introduce the P System as a solution to the time consumption problem of ranking sorting and searching algorithms. A P System is a computing model that abstracts from the way living cells process objects (chemical compounds) within their compartmental structure according to given rules. A P System can solve computationally hard problems in a feasible time using parallelism (Membrane Computing, 2014a; 2014b). The ranking sorting P System is a true parallel algorithm, introduced by Dragoş Sburlan as a bio-inspired model. The ranking sorting algorithm simulates biological living cells (Sburlan, 2003). It ranks integer numbers with linear time complexity. P Systems are formal models based on rules that are applied to objects in a maximally parallel way. To date, there is no real implementation of P Systems, either in the electronics industry or in biology. Membrane computing is considered an innovative solution that is involved in intelligent systems and applications such as sorting, approximate algorithms, hard problems, P automata, public key protocols, and computer graphics (Ciobanu, Păun, & Pérez-Jiménez, 2006; Enguix & López, 2006; Georgiou, Gheorghe, & Bernardini, 2006; Michel & Jacquemard, 2006; Pérez-Jiménez, Romero-Jiménez, & Sancho-Caparrini, 2006).

QSAR is employed to handle many molecular data modeling applications in medicinal chemistry (Cherkasov et al., 2014). PQSAR is a first step toward developing significant QSAR models based on parallelism. It belongs to the family of expert systems and computational models that are considered a source of alternative predictors of in vivo effects in both animals and humans. Membrane computing assumes that the processes taking place in the compartments of a living cell can be interpreted as computations (Blakes et al., 2014). The P System ranking sorting algorithm provides high-performance similarity searching because of its linear time complexity and true parallel processing (Alhazov & Sburlan, 2005; Sburlan, 2003). That algorithm is modified here for use in the searching and ranking processes. Ranking and searching are main functions in building 2D and 3D QSAR models. Fig. 1 shows the phases of building a QSAR model and how it is affected by the similarity searching and ranking processes.

Fig. 1. (2D) or (3D) QSAR steps with P System.

Innovative similarity searching and ranking strategies are created based on membrane computing. We adopt the ranking sorting P System to rank probabilities of similarity. The ranking and searching P System results in a ranked list of exceeded probabilities of

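To show why ranking lends itself to parallel evaluation, the following Python sketch ranks a list of similarity probabilities by counting: the rank of each value is the number of values that must precede it. This is a sequential stand-in written for illustration, not the P System rule set from the paper; the function names and the sample probabilities are hypothetical. Each rank depends only on the input list, so all ranks could in principle be computed at the same time, which is the kind of independence a maximally parallel model can exploit. Run sequentially, as here, the counting costs O(n^2) comparisons.

```python
def rank_descending(values: list) -> list:
    """Rank each value by counting how many values must come before it.

    Higher values get lower ranks; ties are broken by original position.
    Every entry of the result is computed independently of the others.
    """
    n = len(values)
    ranks = [0] * n
    for i in range(n):
        ranks[i] = sum(
            1
            for j in range(n)
            if values[j] > values[i] or (values[j] == values[i] and j < i)
        )
    return ranks


def ranked_list(values: list) -> list:
    """Place each value at the position given by its rank."""
    ranks = rank_descending(values)
    out = [None] * len(values)
    for value, rank in zip(values, ranks):
        out[rank] = value
    return out


# Hypothetical similarity probabilities that already passed a threshold.
probabilities = [0.62, 0.91, 0.55, 0.91, 0.78]
ranks = rank_descending(probabilities)
ordered = ranked_list(probabilities)
```

Because ties are broken by position, the ranks form a permutation of 0..n-1 and every value lands in a distinct slot of the output list.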