Inductive Logic Programming with Gradient Descent for Supervised Binary Classification

by Nicholas Wu

B.S., Massachusetts Institute of Technology (2019)

Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Master of Engineering in Electrical Engineering and Computer Science at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY

February 2020

© Massachusetts Institute of Technology 2020. All rights reserved.

Author: Department of Electrical Engineering and Computer Science, February 2020

Certified by: Andrew W. Lo, Charles E. and Susan T. Harris Professor, Sloan School of Management, Thesis Supervisor

Accepted by: Katrina LaCurts, Chairman, Master of Engineering Thesis Committee

Abstract

As machine learning techniques have become more advanced, interpretability has become a major concern for models making important decisions. In contrast to approaches such as Local Interpretable Model-Agnostic Explanations (LIME), which explain existing blackbox models, this thesis seeks to develop an interpretable model using logical rules. We extend recent inductive logic programming methods developed by Evans and Grefenstette [3] to develop a gradient descent-based inductive logic programming technique for supervised binary classification. We start by developing our methodology for binary input data, and then extend the approach to numerical data using a threshold-gate based binarization technique.
We test our implementations on datasets with varying pattern structures and noise levels, and select our best-performing implementation. We then present an example where our method generates an accurate and interpretable rule set, whereas the LIME technique fails to generate a reasonable model. Further, we test our original methodology on the FICO Home Equity Line of Credit dataset. We run a hyperparameter search over differing numbers of rules and rule sizes. Our best-performing model achieves 71.7% accuracy, which is comparable to multilayer perceptron and random forest models. We conclude by suggesting directions for future applications and potential improvements.

Thesis Supervisor: Andrew W. Lo
Title: Charles E. and Susan T. Harris Professor, Sloan School of Management

Acknowledgments

My journey through MIT hasn’t been linear, and there have been so many twists and turns along the way. As such, there are so many people I have to thank for their guidance, advice, and support.

To start, I extend my heartfelt gratitude to Professor Andrew Lo for all his help through this thesis. I am incredibly grateful for all the guidance he has provided for me along the way. Whenever I had questions, whenever I seemed to get stuck or confused, Professor Lo would always offer a different angle or another way forward. His mentorship throughout the entire research process helped make this thesis possible.

I also recognize the help I received from the staff at the Laboratory for Financial Engineering. I sincerely thank Jayna Cummings, Crystal Myler, and Mavanee Nealon for all their behind-the-scenes work coordinating meetings and making sure my research process went smoothly.

Additionally, I want to thank some of my close friends and classmates who have always been listening and helping me along my way throughout MIT. I’ve been helped by so many people, whether it be inspiring me, listening to my ideas, providing general advice, or teaching me things I don’t know.
To that end, I want to thank Alap Sahoo, Luis Sandoval, Haris Brkic, Evan Tey, Justin Yu, and Henry La Soya for all they’ve done to help me through this thesis, and especially for being good friends to me throughout this time.

I also want to thank Jenny Shi for all her support over the past two years. From late-night proofreading to motivating me to work and getting me past mental stumbling blocks on my research path, she has done so much to help me get here.

Lastly, I have to thank my parents, Daniel and Li, and my sister, Jackie, for their continual support throughout everything in the past four and a half years. Throughout this entire time, my family has stood by me, helping me through all my struggles and celebrating my successes. Without them, I likely would never have even gotten the opportunity to come to MIT, and I extend my greatest thanks for their constant support.

Contents

1 Introduction
  1.1 Research Goals
  1.2 Thesis Structure and Result Summary
    1.2.1 Chapter 2
    1.2.2 Chapter 3
    1.2.3 Chapter 4
    1.2.4 Chapter 5
    1.2.5 Chapter 6
    1.2.6 Chapter 7
2 Inductive Logic Programming
  2.1 Notation and Definition
  2.2 Early methods for Inductive Logic Programming
    2.2.1 RLGG refinements
    2.2.2 Top-down approaches
    2.2.3 Inverse Entailment
  2.3 Recent Approaches
    2.3.1 Boolean Satisfiability Reduction
    2.3.2 Learning Process
    2.3.3 Approach Results
3 Explainable Artificial Intelligence
  3.1 Why Explainability?
  3.2 Blackbox Interpretability Methods
    3.2.1 LIME
    3.2.2 Gradient Approximation
    3.2.3 General Approach Shortcomings
  3.3 Inductive Methods versus Regression
4 Model Development
  4.1 Approximating Logical Structures
    4.1.1 Parametrization
    4.1.2 Implementing AND
    4.1.3 General Learning Procedure
  4.2 Experiments
    4.2.1 Data Construction
    4.2.2 Model Comparisons
  4.3 Discussion
5 Inducing Rules on Numerical Data
  5.1 Approach Details
    5.1.1 Feature Expansion
    5.1.2 Interpretability
  5.2 Experiments
    5.2.1 Experimental Results
  5.3 Discussion
    5.3.1 Comparison to LIME
6 FICO Home Equity Line of Credit Tests
  6.1 Dataset
    6.1.1 Weight-of-Evidence Encoding
  6.2 Binary Data Rule Learning
    6.2.1 Experiments
  6.3 Numerical Data Rule Learning
    6.3.1 Results and Analysis
  6.4 Discussion
    6.4.1 Comparisons to Related Work
    6.4.2 Generalizability
    6.4.3 Approach Shortcomings
7 Conclusion and Next Steps
  7.1 Key Ideas
  7.2 Future Work
A Tables

List of Figures

4-1 Relation between number of features and model performance
4-2 Relation between number of rules and model performance
4-3 Relation between rule size and model performance
4-4 Relation between noise level and model performance
5-1 Relation between error rate and numerical model performance
5-2 Relation between number of rules and numerical model performance
5-3 Relation between rule size and numerical model performance
6-1 Example Training Run, Plotting Accuracy and Loss

List of Tables

4.1 List of all constructed dataset configurations for binary data rule learning
4.2 Model Run Results, Constructed Dataset 1
5.1 List of all constructed dataset configurations for numerical rule learning
5.2 LIME Coefficients around test point
6.1 Rules for learning non-creditworthiness
6.2 Rules for learning creditworthiness
6.3 Rules for learning non-creditworthiness, numerical data
6.4 Rules for learning creditworthiness, numerical data
6.5 Comparison between our descent-based inductive logic programming and other models
A.1 Model Run Results, Varying Number of Features
A.2 Model Run Results, Varying Number of Rules
A.3 Model Run Results, Varying Rule Size
A.4 Model Run Results, Varying Noise Level
A.5 Numerical Model Run Results, Varying Noise Level
A.6 Numerical Model Run Results, Varying Number of Rules
A.7 Numerical Model Run Results, Varying Number of Rules
A.8 Hyperparameter search, predicting non-creditworthiness with HELOC Binarized Data
A.9 Hyperparameter search, predicting creditworthiness using HELOC Binarized Data
A.10 Hyperparameter search, predicting non-creditworthiness using HELOC Numerical Data
A.11 Hyperparameter search, predicting creditworthiness using HELOC Numerical Data

Chapter 1
Introduction

In the past decade, the proliferation of machine learning and artificial intelligence techniques has allowed such models to outperform many traditional methods in a myriad of applications, from image processing to natural language processing. However, many successful machine learning models function as blackboxes, especially since these models frequently produce extremely intricate functions in which no individual parameter has an intuitive meaning. Further, as models become more complex, the number of parameters in a model such as a deep neural network can exceed several million, making it difficult to understand model behavior. As such, there has been interest in producing machine learning models that are explainable to humans without sacrificing accuracy.

One of the interesting approaches to developing explainable artificial intelligence comes from the field of inductive logic programming. Inductive logic programming deals with the development of a hypothesis that logically entails a set of background examples. This approach addresses the explainability issue explicitly in that any logic program consists of a set of Horn clauses, which are explainable as logical rules.

1.1 Research Goals

This research seeks to extend work in the field of inductive logic programming, adapting its methods to the general task of supervised binary classification.
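
To make the idea of logical rules acting as a binary classifier more concrete, the following minimal Python sketch shows a rule set written as a disjunction of conjunctions over binary features, a threshold gate that binarizes numerical features, and one possible differentiable relaxation of AND. All names (`binarize`, `rule_fires`, `classify`, `soft_and`), the example thresholds, and the product-form relaxation are illustrative assumptions; they are not taken from the parametrization developed in this thesis.

```python
from typing import List, Sequence


def binarize(x: Sequence[float], thresholds: Sequence[float]) -> List[int]:
    """Threshold gate (assumed form): bit i is 1 when x[i] >= thresholds[i]."""
    return [1 if xi >= ti else 0 for xi, ti in zip(x, thresholds)]


def rule_fires(bits: Sequence[int], rule: Sequence[int]) -> bool:
    """A rule is a conjunction (AND) over the feature indices it lists."""
    return all(bits[i] == 1 for i in rule)


def classify(bits: Sequence[int], rules: Sequence[Sequence[int]]) -> int:
    """The rule set is a disjunction (OR): predict 1 if any rule fires."""
    return int(any(rule_fires(bits, rule) for rule in rules))


def soft_and(values: Sequence[float], weights: Sequence[float]) -> float:
    """One hypothetical relaxation of AND, suitable for gradient descent.

    weights[i] in [0, 1] says how strongly feature i belongs to the rule.
    With 0/1 weights and 0/1 values this reduces to the hard AND above.
    """
    out = 1.0
    for v, w in zip(values, weights):
        out *= 1.0 - w * (1.0 - v)
    return out


if __name__ == "__main__":
    bits = binarize([0.7, 0.2, 5.0], thresholds=[0.5, 0.5, 3.0])
    print(bits)
    print(classify(bits, rules=[[0, 2]]))
    print(soft_and(bits, weights=[1.0, 0.0, 1.0]))
```

Because each rule names the features it tests, a learned rule set of this kind can be read directly as "if feature 0 and feature 2 hold, predict positive", which is the sense of interpretability the thesis is after.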