Spinout Equinox Pharma Speeds up and Reduces the Cost of Drug Discovery

Spinout Equinox Pharma [1] provides a service to the pharma and biopharma industries that helps in the development of new drugs. The company, established by researchers from Imperial College London in 2008, uses its own proprietary software to speed up and reduce the cost of discovery processes.

IMPACT SUMMARY

- Spinout Equinox Pharma provides services to the agri-tech and pharmaceutical industries to accelerate the discovery of molecules that could form the basis of new products such as drugs or herbicides.
- The company arose from BBSRC bioinformatics and structural biology research at Imperial College London.
- Technology commercialisation company NetScientific Ltd has invested £250K in Equinox.
- Equinox customers include major pharmaceutical companies such as the Japanese pharmaceutical company Astellas and agrochemical companies such as Syngenta, as well as SMEs based in the UK, Europe, Japan and the USA.
- The company is also using its software in-house to develop new antibiotics.

Equinox is based on BBSRC-funded collaborative research conducted at Imperial College London by Professors Stephen Muggleton [2], Royal Academy Chair in Machine Learning; Mike Sternberg [3], Chair of Structural Bioinformatics; and Paul Freemont [4], Chair of Protein Crystallography.

The technology developed by the company takes a logic-based approach to discovering novel drugs, using computers to learn from a set of biologically active molecules. It combines knowledge of traditional drug discovery methods with computer-based analyses, and focuses on the key stages of drug discovery: target identification, lead validation and lead optimisation.

The software can screen tens of millions of potential molecules in only a few hours, and is more accurate than other software-based approaches at identifying promising leads for drug discovery.

The technology also has applications in the agrochemical sector, and is being used by the global agri-tech company Syngenta.

"The power of a logic-based approach is that it can propose chemically novel molecules that can have enhanced properties and are able to be the subject of novel 'composition of matter' patents," says Muggleton.

Proving the technology

A three-year BBSRC grant in 2003 funded Muggleton and Sternberg's initial development of the technology, and BBSRC follow-on funding was provided in 2006 for a further year.

BBSRC support enabled Dr Ata Amini, a researcher at Imperial College London, to prove the technology and publish the details [5]. The researchers developed a unique machine learning method known as SVILP (Support Vector Inductive Logic Programming) and filed patents for it; these were granted in Japan and the USA and remain pending in Europe [6].

Imperial Innovations, part of the university, and the technology commercialisation company NetScientific Ltd subsequently each invested £250,000 in Equinox to in-license (an agreement to collaborate between companies with different strengths) the drug discovery technology and develop the business [7].

Dr Amini was subsequently employed by Equinox, where he used the concepts behind the technology to develop the company's proprietary software for drug discovery, called INDDEx (Investigational Novel Drug Discovery by Example) [8].

More recently, further development work has been undertaken by the chemoinformatics PhD student Chris Reynolds. His work, supported by a BBSRC CASE studentship between Equinox and Imperial College London, involves integrating chemical synthesis into in silico (performed by computer simulation) 'hit' discovery of drugs.

Logic-based drug discovery

In recent years, computer-based virtual screening has become an integral part of the drug discovery process. As part of this development, the INDDEx software uses its logic-based rules to search databases of millions of small molecules for those that could potentially form the basis of new drugs. In addition, the software can perform 'scaffold hopping' – a process that enables novel compounds to be found while maintaining the known activity of an original substance. By identifying novel molecular structures that have little similarity to the original substances, the software increases the chance of discovering patentable molecules.

INDDEx uses artificial intelligence to create rules that identify the chemical properties of molecules responsible for drug activity. In-house information, and information provided by clients, act as 'training' datasets of substances that have already been screened against a target molecule. From the training dataset, INDDEx generates a set of logic-based rules about the relative positions of the parts of a molecule that display desirable characteristics.

On completion of screening and searching of small molecules by the software, medicinal chemists are presented with rules and promising lead molecules, which can then be tested in the laboratory. The results are in a form that is readily understood, helping to direct the next step of drug discovery.

[Photo caption: Scientists examine a computer model of a biologically active molecular structure.]

50% prediction rate

Existing high-throughput conventional screening is expensive and limited to a few compounds at a time. Computer-based virtual screening does not have these limitations, but not all in silico approaches are as effective as the one developed by Equinox. Some achieve only a 15% (or lower) hit rate, whereas INDDEx can achieve up to a 50% prediction rate. In view of this substantial advantage, INDDEx can screen large numbers of compounds rapidly and cost-effectively; in some instances, up to 40 million compounds have been screened by Equinox in a matter of hours – with important implications for speeding up drug discovery.

Equinox follows a mixed business model of 'fee-for-service' work and in-house drug discovery. Currently, Equinox is applying its technology to develop novel antibiotics. In addition, INDDEx has been used at Imperial College London to progress the development of inhibitors of SIRT2 – a potential target for the treatment of Parkinson's disease [9].

Customers for the services provided by Equinox include major international pharmaceutical and agrochemical companies and SMEs based in the UK, Europe, Japan and the USA. One customer at Syngenta [10], an agrochemical company with 29,000 employees in over 90 countries, says: "We were impressed with the ability of INDDEx to produce good structure-activity relationship models for activity and selectivity in a series with very difficult molecules to analyse."

"It's incredibly exciting to transfer the results of academic research into a company and have a commercial framework to apply the approach to drug discovery," says Professor Sternberg. "We've proved the technology with commercial and academic partners, and our next project will be to apply INDDEx to search for novel antibiotics, which will address one of the global challenges in healthcare."

REFERENCES

1. Equinox Pharma Limited
2. Professor Stephen Muggleton: http://www.imperial.ac.uk/people/s.muggleton
3. Professor Michael Sternberg: http://www.imperial.ac.uk/people/m.sternberg
4. Professor Paul Freemont: http://www.imperial.ac.uk/people/p.freemont
5. Amini, A., Muggleton, S. H., Lodhi, H. & Sternberg, M. J. (2007). A novel logic-based approach for quantitative toxicology prediction. J Chem Inf Model 47, 998-1006.
6. Patent: Muggleton, S. H., Lodhi, H. M., Sternberg, M. J. E. & Amini, A. (2006). Support Vector Inductive Logic Programming. PCT/GB2006/003320. Granted in the USA and Japan; pending in the EU.
7. Imperial Innovations: Investment in Equinox Pharma
8. Reynolds, C. R., Amini, A. C., Muggleton, S. H. & Sternberg, M. J. E. (2012). Assessment of a Rule-Based Virtual Screening Technology (INDDEx) on a Benchmark Data Set. Journal of Physical Chemistry B 116, 6732-6739.
9. Di Fruscia, P., Zacharioudakis, E., Liu, C., Moniot, S., Laohasinnarong, S., Khongkow, M., Harrison, I. F., Koltsida, K., Reynolds, C. R., Schmidtkunz, K., Jung, M., Chapman, K. L., Steegborn, C., Dexter, D. T., Sternberg, M. J. E., Lam, E. W. F. & Fuchter, M. J. (2015). The Discovery of a Highly Selective 5,6,7,8-Tetrahydrobenzo[4,5]thieno[2,3-d]pyrimidin-4(3H)-one SIRT2 Inhibitor that is Neuroprotective in an in vitro Parkinson's Disease Model. ChemMedChem 10, 69-82.
10. Syngenta: http://www3.syngenta.com/country/uk/en/about/businesses/Pages/Our_UK_businesses.aspx
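To make the logic-based screening idea concrete, the following toy sketch illustrates the general workflow described above – learn rules from known actives, then rank a library of candidates by how many rules they satisfy. This is not Equinox's INDDEx or SVILP implementation; the feature labels, positions and molecules are all invented for illustration, and real pharmacophore geometry is three-dimensional rather than the simplified one-dimensional axis used here.

```python
# Toy illustration of rule-based virtual screening (NOT the INDDEx/SVILP
# algorithm). A molecule is a set of labelled features at 1-D positions; a
# "rule" is a pair of features required to lie a given (bucketed) distance
# apart. All data below is hypothetical.
from itertools import combinations

def pair_signatures(molecule, tol=0.5):
    """Return (feature_a, feature_b, bucketed_distance) triples for a molecule."""
    sigs = set()
    for (fa, xa), (fb, xb) in combinations(sorted(molecule.items()), 2):
        sigs.add((fa, fb, round(abs(xa - xb) / tol)))
    return sigs

def learn_rules(actives, tol=0.5):
    """Keep only the feature-pair constraints shared by every active molecule."""
    rules = pair_signatures(actives[0], tol)
    for mol in actives[1:]:
        rules &= pair_signatures(mol, tol)
    return rules

def screen(library, rules, tol=0.5):
    """Rank library molecules by how many learned rules they satisfy."""
    scored = [(len(rules & pair_signatures(mol, tol)), name)
              for name, mol in library.items()]
    return sorted(scored, reverse=True)

# Two known actives share a donor/acceptor pair roughly 3 units apart.
actives = [
    {"donor": 0.0, "acceptor": 3.0, "aromatic": 5.0},
    {"donor": 1.0, "acceptor": 4.1, "halogen": 7.0},
]
rules = learn_rules(actives)

# A chemically different scaffold that preserves the donor-acceptor geometry
# still matches the learned rule - the spirit of 'scaffold hopping'.
library = {
    "novel_scaffold": {"donor": 2.0, "acceptor": 5.0, "sulfone": 9.0},
    "inactive_like":  {"donor": 0.0, "acceptor": 9.0},
}
ranking = screen(library, rules)
print(ranking[0][1])  # prints: novel_scaffold
```

The design point mirrored here is that the learned rules are interpretable constraints on relative feature positions rather than an opaque similarity score, which is why medicinal chemists can read them alongside the proposed lead molecules.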