Discovering and Using Functions via Semantic Querying

Lander Noterman

Supervisor: Prof. dr. ir. Ruben Verborgh
Counsellors: Ben De Meester, Anastasia Dimou

Master's dissertation submitted in order to obtain the academic degree of Master of Science in Computer Science Engineering

Department of Electronics and Information Systems
Chair: Prof. dr. ir. Koen De Bosschere
Faculty of Engineering and Architecture
Academic year 2017-2018

Preface

The author gives permission to make this master's dissertation available for consultation and to copy parts of this master's dissertation for personal use. In the case of any other use, the copyright terms have to be respected, in particular with regard to the obligation to state expressly the source when quoting results from this master's dissertation.

June 1, 2018

Word of thanks

Firstly, I would like to thank Ben De Meester and Anastasia Dimou for acting as counsellors for this work. Ben’s supervision in particular was tremendously helpful during the process of writing this thesis. His knowledge of the subject helped me learn a lot about the Semantic Web and its technologies, and his feedback and advice were immensely valuable for the successful completion of this work.

I would also like to thank Prof. dr. ir. Ruben Verborgh for being the supervisor for this dissertation. His lessons introduced me to the Semantic Web and its powerful capabilities.
Finally, I would like to thank my friends, my family, and especially my girlfriend for supporting me throughout my studies and during the making of this dissertation. Their continued support helped me to stay focussed and successfully complete even the more challenging parts of the past five years.

Abstract

On today’s web, functions are available in the form of code snippets, packages and Web APIs. However, existing solutions for searching functions on the web lack the ability to search these functions by type signature, so keyword search is used instead, which is imprecise. Package managers partially automate the process of acquiring functions, but they do not automate their invocation. This work focusses on the problem of querying and automated usage of functions. Using Linked Data and Semantic Web technologies, we created the FunctionHub: a system for searching for functions on the web and invoking them using a uniform interface. In this system, functions and implementations are semantically described using RDF, hence they can be queried using SPARQL.
Our evaluation demonstrates four improvements that result from linking semantic descriptions of functions to those of implementations: (i) more precise search abilities, like searching by type signature, (ii) automated invocation of implementations, (iii) abstracted function processing, so that the system can be used for at-runtime discovery and invocation of functions and inference of knowledge from RDF data, and (iv) redundancy of implementations, which avoids a single point of failure. Finally, abstracted function processing brings us closer to a future where intelligent agents can not only understand the data on the Semantic Web, but act upon it using these functions.

Keywords: Semantic Web, Linked Data, automated function processing, semantic querying

Discovering and Using Functions via Semantic Querying

Lander Noterman

Supervisor: Prof. dr. ir. Ruben Verborgh
Counsellors: Ben De Meester, Anastasia Dimou

Abstract: On today's web, functions are available in the form of code snippets, packages and Web APIs. However, existing solutions for searching functions on the web lack precise methods of finding a desired function. Acquiring functions is partially automated by package managers, but they do not automate their invocation. This work focusses on the problem of querying and automated usage of functions. Using Linked Data and Semantic Web technologies, we created the FunctionHub: a system for searching for functions on the web and invoking them using a uniform interface. In this system, functions and implementations are semantically described using RDF, hence they can be queried using SPARQL. Our evaluation demonstrates several improvements that result from linking semantic descriptions of functions to those of implementations, like more precise search abilities, abstracted function processing and redundancy of implementations. Finally, abstracted function processing brings us closer to a future where intelligent agents can not only understand the data on the Semantic Web, but act upon it using these functions.

I. INTRODUCTION

With the advent of the World Wide Web, reusing code in software projects became much easier. Developers can download libraries, find code in GitHub (http://www.github.com) repositories, download packages from NPM (http://www.npmjs.com), etc. However, this approach cannot yet be fully automated: finding code requires searching for it on the web, and using libraries or packages from package managers requires reading documentation.

We identified four main problems with the current situation of searching for and using functions on the web: (i) Searching for functions is mostly keyword-based, which is an imprecise way of finding functions. (ii) Using functions is environment specific: functions should be implemented in the correct programming language. (iii) Functions are not machine processable: available functions do not have a semantic description, hence they cannot be discovered and invoked automatically. (iv) Functions are stored and provided from a centralized location, which might cause problems, as there is a single point of failure.

While solving these problems would improve searching for and using functions on the web for developers, machines could benefit even more from these improvements. On the Semantic Web, intelligent agents are envisioned to be able to use the web to accomplish tasks [1]. Such tasks can be accomplished using functions, hence this work aims to enable machines to use them.

II. RELATED WORK

Apart from the main technologies of the Semantic Web, some specific technologies are used in this work. Additionally, code search engines and package managers offer functionality that this work seeks to improve.

A. Describing and instantiating NPM modules

For finding and using NPM modules, this work makes use of Linked Software Dependencies (LS(D)). LS(D) presents two technologies: the Object-Oriented Components ontology and Components.js [2]. These technologies are used to describe and instantiate NPM modules, respectively.

B. Describing Web APIs

Hydra is an ontology to semantically describe REST web services. It describes the operations that can be executed using the web service in a semantic way, hence automating the usage of the service becomes possible [3].

C. Describing abstract functions

A number of ontologies enable describing functions; however, these descriptions are closely related to their implementation (e.g., Object-Oriented Components describes NPM modules, Hydra describes REST Web APIs). The Function Ontology (FnO) is an ontology to describe abstract functions: the descriptions do not contain details regarding the implementation, hence they can describe functions in any implementation [4].

D. Code search engines

Code search engines allow users to search through source code. In this work, our solution is compared to searchcode.com and GitHub code search. Both of these tools offer a web interface for searching through source code.

E. Package managers

Package managers are used to find and download functionality in the form of software packages available on the web. Examples of package managers are NPM,

1) Describing functions with Linked Data enables search capabilities beyond keyword search.
2) Linking abstract function descriptions with specific implementations enables the use of a uniform interface to invoke functions.
3) Using Linked Data to describe functions, intelligent agents can make use of them to enable automated processing of information.
4) Following the Linked Data principles enables distribution of storage and responsibility.

The utility of these first three hypotheses can be demonstrated using some specific use cases. A number of use cases are formulated to clarify the hypotheses and enable evaluation of this work.

1) A search engine for functions in which we can search not only by keywords, but also by type signature. Additionally, this search engine not only enables finding code, but also web services implementing this functionality.
2) A package manager that allows us to find functions using type signature and is not limited to one programming language.