Semantic Web 0 (0) 1–41 1 IOS Press

Bringing Relational into the Semantic Web: A Survey

Editor(s): Jie Tang, Tsinghua University, China Solicited review(s): Zi Yang, Carnegie Mellon University, USA; Yuan An, Drexel University, USA; Ling Chen, University of Technology, Sydney, Australia; Juanzi Li, Tsinghua University, China

Dimitrios-Emmanuel Spanos a,∗, Periklis Stavrou a and Nikolas Mitrou a a National Technical University of Athens, School of Electrical & Computer Engineering, 9, Heroon Polytechneiou str., 15773 Zografou, Athens, Greece E-mail: {dspanos,pstavrou}@cn.ntua.gr,[email protected]

Abstract. Relational databases are considered one of the most popular storage solutions for various kinds of data and they have been recognized as a key factor in generating huge amounts of data for Semantic Web applications. Ontologies, on the other hand, are one of the key and main vehicle of knowledge in the Semantic Web research area. The problem of bridging the gap between relational databases and ontologies has attracted the interest of the Semantic Web community, even from the early years of its existence and is commonly referred to as the -to-ontology mapping problem. However, this term has been used interchangeably for referring to two distinct problems: namely, the creation of an ontology from an existing database instance and the discovery of mappings between an existing database instance and an existing ontology. In this paper, we clearly define these two problems and present the motivation, benefits, challenges and solutions for each one of them. We attempt to gather the most notable approaches proposed so far in the literature, present them concisely in tabular format and group them under a classification scheme. We finally explore the perspectives and future research steps for a seamless and meaningful integration of databases into the Semantic Web.

Keywords: Relational Database, Ontology, Mapping, OWL, Survey

1. Introduction sciences, environmental monitoring, cultural heritage, e-Government and business process management. This Over the last decade and more, the Semantic Web evident progress is the result of years-long research (SW) has grown from an abstract futuristic vision, and it comes as no surprise that, nowadays, Seman- mainly existing in the head of its inspirer, Tim Berners- tic Web is perceived as a multidisciplinary research Lee, into an ever approaching reality of a global web of field on its own, combining and gaining expertise from interlinked data with well-defined meaning. Standard other scientific fields, such as artificial intelligence, in- and technologies have been proposed and formation science, algorithm and complexity theory, are constantly evolving in order to serve as the build- database theory and computer networks, to name a few. ing blocks for this “next generation” Web, relevant The participation of databases and their role in this tools are being developed and gradually reaching ma- evolving Web setting has been investigated from the turity, while numerous real world applications already very beginning of the Semantic Web conception, not give an early taste of the benefits the Semantic Web only because it was initially compared to “a global database” [23], but also because this – new at the is about to bring in various domains, as diverse as life time – research field could take advantage of the great experience and maturity of the database field. How- *Corresponding author. E-mail: [email protected]. ever, the collaboration and exchange of ideas between

0000-0000/0-1900/$00.00 ⃝c 0 – IOS Press and the authors. All rights reserved 2 D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey these two fields was not unidirectional: the database relational database and seeks ways to extract informa- community quickly recognized the opportunities aris- tion and render it suitable for use from a Semantic Web ing from a close cooperation with the Semantic Web perspective. In this case, motivation is shifted from field [113] and how the latter could offer solutions to the efficient storage and querying of existing ontolog- long-standing issues and provide inspiration to sev- ical structures to problems, such as database integra- eral database subcommunities, interested in heteroge- tion, , mass generation of SW data, neous database integration and interoperability, dis- ontology-based data access and semantic annotation of tributed architectures, deductive databases, conceptual dynamic Web pages. These problems have been inves- modelling and so on. tigated in the relevant literature, each one touching on Attempts to combine these two different worlds a different aspect of the database to ontology mapping originally focused on the reconciliation of the discrep- problem. Unfortunately, this term has been freely used ancies among the two most representative and domi- to describe most of the aforementioned issues, creating nant technologies of each world: relational databases slight confusion regarding the goal and the challenges and ontologies. This problem is also known as the faced for each one of them. Hence, in this paper, we database to ontology mapping problem, which is sub- take a look at approaches that do one or more of the sumed by the broader object-relational impedance following: mismatch problem and is due to the structural differ- – create from scratch a new ontology based on in- ences among relational and object-oriented models. formation extracted from a relational database, Correspondences between the relational model and the – generate RDF statements that conform to one or RDF graph model, which is a key component of the Se- more predefined ontologies and reflect the con- mantic Web, were also investigated and a W3C Work- tents of a relational database, ing Group1 has been formed to examine this issue and – answer semantic queries directly against a rela- propose related standards. Nevertheless, the definition tional database, and of a theoretically sound mapping or transformation be- – discover correspondences between a relational tween the mentioned models is not an end on its own. database and a given ontology The motivation driving the consideration of mappings among relational databases and Semantic Web tech- We attempt to define and distinguish between these closely related but distinct problems, analyze meth- nologies is multifold, leading to separate problems, ods and techniques employed in order to deal with where mappings are discovered, defined and used in a them, identify features of proposed approaches, clas- different way for each problem case. sify them accordingly and present a future outlook for Originally, database systems were considered by the this much researched issue. Semantic Web community as an excellent means for The paper is structured as follows: Section 2 men- efficient ontology storage [19], because of their known tions the motivations and benefits of the database and well evidenced performance benefits. 
This consid- to ontology mapping problem, while Section 3 gives eration has led to the development and production of a short background and defines some terminology several database systems, especially optimized for the of database systems and Semantic Web technologies. persistent storage, maintenance and querying of SW Section 4 presents a total classification of relational data [47]. Such systems are informally known as triple database to ontology mapping approaches and lists de- stores, since they are specifically tailored for the stor- scriptive parameters used for the description of each age of RDF statements, which are also referred to as approach. The structure of the rest of paper is largely triples. This sort of collaboration between database and based on this taxonomy. Section 5 deals with the Semantic Web specifies a data and information flow problem of creating a new ontology from a relational from the latter to the former, in the form of population database and is further divided in subsections that dis- of specialized databases, that often have some prede- tinguish among the creation of a database schema on- fined structure, with SW data. tology (Section 5.1) and a domain-specific ontology A different, perhaps more interesting research line (Section 5.2) as well as among approaches that rely ex- takes as starting point an existing and fully functional tensively on the analysis of the database schema (Sec- tion 5.2.2) and approaches that do not (Section 5.2.1). 1RDB2RDF Working Group: http://www.w3.org/2001/ Section 6 investigates the problem of discovering and sw/rdb2rdf/ defining mappings between a relational database and D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey 3 one or more existing ontologies. Section 7 sums up the form interface request. It has been argued that, due to main points of the paper, giving emphasis on the chal- the infeasibility of manual annotation of every single lenges faced by each category of approaches, while dynamic page, a possible solution would be to “anno- Section 8 gives some insight on future directions and tate” directly the underlying database schema, insofar requirements for database to ontology mapping solu- as the owner is willing to reveal the structure tions. of his database. This “annotation” is simply a set of correspondences between the elements of the database schema and an already existing ontology that fits the 2. Motivation and Benefits domain of the dynamic page content [125]. Once such mappings are defined, it would be fairly trivial to gen- The significance of databases from a Semantic Web erate dynamic semantically annotated pages with the perspective is evident from the multiple benefits and embedded content annotations derived in an automatic use cases a database to ontology mapping can be used fashion. in. After all, the problem of mapping a database into Heterogeneous database integration. The resolution Semantic Web did not emerge as a mere exercise of of heterogeneity is one of the most popular, long- transition from one representation model to another. It standing issues in the database research field, that re- is important to identify the different motivations and mains, to a large degree, unsolved. Heterogeneity oc- problems implicating interactions between relational curs between two or more database systems when they databases and SW technologies, in order to succeed a use different software or hardware infrastructure, fol- clear separation of goals and challenges. 
That is not to low different syntactic conventions and representation say that methods and approaches presented here cor- models, or when they interpret differently the same respond strictly to one particular use case, as there are or similar data [114]. Resolution of the above forms quite a few versatile tools that kill two (or more) birds of heterogeneity allows multiple databases to be inte- with one stone. In the following, we present some of grated and their contents to be uniformly queried. In the benefits that can be achieved from the interconnec- typical database integration architectures, one or more tion of databases and ontologies. conceptual models are used to describe the contents Semantic annotation of dynamic web pages. An as- of every source database, queries are posed against pect of the Semantic Web vision is the transformation a global conceptual schema and, for each source of the current Web of documents to a Web of data. A database, a wrapper is responsible to reformulate the straightforward way to achieve this would be to anno- query and retrieve the appropriate data. Ontology- tate HTML pages, which specify the way their con- based integration employs ontologies in lieu of con- tent is presented and are only suitable for human con- ceptual schemas and therefore, correspondences be- sumption. HTML pages can be semantically annotated tween source databases and one or more ontologies with terms from ontologies, making their content suit- have to be defined [126]. Such correspondences con- able for processing by software agents and web ser- sist of mapping formulas that express the terms of a vices. Such annotations have been facilitated consid- source database as a conjunctive query against the on- erably since the proposal of the RDFa recommenda- tology (Local as view or LAV mapping), express on- tion2 that embeds in XHTML tags references to ontol- tology terms as a conjunctive query against the source ogy terms. However, this scenario does not work quite database (Global as view or GAV mapping), or state well for dynamic web pages that retrieve their content an equivalence of two queries against both the source directly from underlying databases: this is the case for database and the ontology (Global Local as view or content management systems (CMS), fora, wikis and GLAV mapping) [80]. The type of mappings used in other Web 2.0 sites [14]. Dynamic web pages repre- an integration architecture influences both the com- sent the biggest part of the , forming plexity of query processing and the extensibility of the the so called Deep Web [54], which is not accessible to entire system (e.g. in the case of GAV mappings, query search engines and software agents, since these pages processing is trivial, but the addition of a new source are generated in response to a web service or a web database requires redefinition of all the mappings; the inverse holds for LAV mappings). Thus, the discov- 2RDFa Primer: http://www.w3.org/TR/ ery and representation of mappings between relational --primer/ database schemas and ontologies constitute an integral 4 D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey part of a heterogeneous database integration scenario data on the World Wide Web, a solution for the genera- [8]. tion of a critical mass of SW data would be the, prefer- ably automatic, extraction of relational databases’ con- Ontology-based data access. Much like in a database tents in RDF. 
This would create a significant pool of integration architecture, ontology-based data access SW data, that would alleviate the inhibitions of soft- (OBDA) assumes that an ontology is linked to a source ware developers and tool manufacturers and, in turn, database, thus acting as an intermediate layer between an increased production of SW applications would be the user and the stored data. The objective of an OBDA anticipated. The term database to ontology mapping system is to offer high-level services to the end user has been used in the literature to describe such trans- of an information system who does not need to be formations as well. aware of the obscure storage details of the underlying data source [98]. The ontology provides an abstraction Ontology learning. The process of manually de- of the database contents, allowing users to formulate veloping from scratch an ontology is difficult, time- queries in terms of a high-level description of a domain consuming and error-prone. Several semi-automatic of interest. In some way, an OBDA engine resembles a ontology learning methods have been proposed, ex- wrapper in an information integration scenario in that tracting knowledge from free and semi-structured text it hides the data source-specific details from the up- documents, vocabularies and thesauri, domain experts per levels by transforming queries against a concep- and other sources [56]. Relational databases are struc- tual schema to queries against the local data source. tured information sources and, in case their schema This query rewriting is performed by the OBDA en- has been modelled following standard practices [51] gine, taking into account mappings between a database (i.e. based on the design of a conceptual model, such and a relevant ontology describing the domain of in- as UML or the Extended Entity Relationship Model), terest. The main advantage of an OBDA architecture is they constitute significant and reliable sources of do- the fact that semantic queries are posed directly against main knowledge. This is true especially for busi- a database, without the need to replicate its entire con- ness environments, where enterprise databases are fre- tents in RDF. quently maintained and contain timely data [129]. Apart from OBDA applications, a database to ontol- Therefore, rich ontologies can be extracted from rela- ogy mapping can be useful for semantic rewriting of tional databases by gathering information from their SQL queries, where the output is a reformulated SQL schemas, contents, queries and stored procedures, as query better capturing the intention of the user [21]. long as a domain expert supervises the learning pro- This rewriting is performed by substitution of terms cess and enriches the final outcome. Ontology learn- used in the original SQL query with synonyms and re- ing is a common motivation driving database to ontol- lated terms from the ontology. Another notable related ogy mapping when there is not an existing ontology application is the ability to query relational data using for a particular domain of interest, a situation that fre- as context external ontologies [45]. This feature has quently arose not so many years ago. Nevertheless, as been implemented in some database management sys- years pass by, ontology learning techniques are mainly tems3, allowing SQL queries to contain conditions ex- used to create a wrapping ontology for a source rela- pressed in terms of an ontology. 
tional database in an ontology-based data access [110] or database integration [29] context. Mass generation of Semantic Web data. It has been argued that one of the reasons delaying the Seman- Definition of the intended meaning of a relational tic Web realization is the lack of successful tools and schema. As already mentioned, standard database applications showcasing the advantages of SW tech- design practices begin with the design of a conceptual nologies [73]. The success of such tools, though, is model, which is then transformed, in a step known as directly correlated to the availability of a sufficiently logical design, to the desired relational model. How- large quantity of SW data, leading to a “chicken-and- ever, the initial conceptual model is often not kept egg problem” [62], where cause and effect form a vi- alongside the implemented relational database schema cious circle. Since relational databases are one of the and subsequent changes to the latter are not propa- most popular storage media holding the majority of gated back to the former, while most of the times these changes are not even documented at all. Usually, this 3Oracle and OpenLink Virtuoso are the most representative ex- results in databases that have lost the original intention amples. of their designer and are very hard to be extended or D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey 5 re-engineered to another logical model (e.g. an object- physical database design [102]. Conceptual design oriented one). Establishing correspondences between a usually follows after a preliminary phase where data relational database and an ontology grounds the orig- and functional requirements for potential database inal meaning of the former in terms of an expres- users are analyzed. The output of the conceptual de- sive conceptual model, which is crucial not only for sign is a conceptual schema, which describes from a database maintenance but also for the integration with high-level perspective the data that will be stored in the other data sources [38], and for the discovery of map- database and the constraints holding over it. The con- pings between two or more database schemas [7,49]. ceptual schema is expressed in terms of a conceptual In the latter case, the mappings between the database model, the most widely used for this purpose being the and the ontology are used as an intermediate step and Entity-Relationship (ER) model, proposed by Peter a reference point for the construction of inter-database Chen [39]. The main features of the ER model are: schema mappings. – entity sets, which are collections of similar ob- Integration of database content with other data sources. jects, called entities. Transforming relational databases into a universal de- – attributes associated with each entity set. An scription model, as RDF aspires to be, enables seam- attribute can be either single-valued or multi- less integration of their contents with information valued. All the entities belonging to a given entity already represented in RDF. This information can set are described by the same attributes. For every originate from both structured and unstructured data entity set, a key is specified as a minimal set of sources that have exported their contents in RDF, thus attributes, the values of which uniquely identify overcoming possible syntactic disparities among them. an entity. 
The Linked Data paradigm [60], which encourages – relationship sets, which are sets of relationship RDF publishers to reuse popular vocabularies (i.e. on- instances, connecting n entities with each other. tologies), to define links between their dataset and n is called the degree of the relationship set. A re- other published datasets and reuse identifiers that de- lationship set can have descriptive attributes, just scribe the same real-world entity, further facilitates global data source integration, regardless of the data like an entity set. source nature. Given the uptake of the Linked Data – cardinality constraints, which specify the mini- movement during the last few years, which has re- mum and maximum number of entities from a sulted in the publication of voluminous RDF content given entity set that can participate in a relation- (in the order of billion statements) from several do- ship instance from a given relationship set. mains of interest, the anticipated benefits of the inte- – weak entity sets, which consist of weak entities, gration of this content with data currently residing in the existence of which depends on the presence relational databases as well as the number of potential of another entity, called the owner entity. The re- applications harnessing it are endless. lationship that connects a weak entity with its owner entity is called identifying relationship. Since its original definition, the ER model has been 3. Preliminaries further extended to include more characteristics that In this section, we make a short introduction on enhanced its expressiveness. The extended version some aspects of database theory and Semantic Web of the ER model is known as the Extended Entity- technologies and define concepts that will be used Relationship (EER) model, which allows the defini- throughout the paper. We first focus on conceptual tion of subclass/superclass relationships among entity and logical database design by visiting the Entity- sets and introduces, among others, the union type and Relationship and relational models and then, we briefly aggregation features. A formal definition of the ER touch upon the RDF graph model and OWL ontology model can be found in [33]. . Logical database design consists of building a log- ical schema, which conforms to a representational 3.1. Database theory fundamentals model that conceptually sits between the ER model and the physical model describing the low-level stor- The standard database design process mainly con- age details of the database. Such models include the sists of three steps: a) conceptual, b) logical and c) network, hierarchical and relational models, the most 6 D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey popular one being the latter, proposed by Edgar Codd most prominent types of dependencies met in rela- [41]. The main elements of the relational model are: tional schemas are functional, inclusion, multi-valued and join dependencies. Depending on the type of de- – relations, which intuitively can be thought of as pendencies eliminated, the decomposed relations are tables. said to be in a specific normal form. Normal forms – attributes, which can be regarded as columns of a based on functional dependencies are the first, the table. Every attribute belongs to a given relation. second, the third and the Boyce-Codd normal form. A minimal set of attributes that uniquely define Fourth and fifth normal form are based on multi-valued an element of a relation (i.e. 
a tuple or informally, 4 a row of the table) is a candidate key for that rela- and join dependencies respectively . Without going tion. In case there exist more than one candidate into further detail, it suffices to say that normalization, keys, one of them is specified as the primary key as an additional design step, incurs further complica- of the relation and is used for the identification of tions to the database reverse engineering problem. this relation’s tuples. A foreign key for a given re- Much of the success of relational databases is due lation, called parent relation, is a set of attributes to SQL, which stands for Structured Query Language that reference a tuple in another relation, called [1] and is the standard query language for relational child relation. databases. Despite the fact that SQL does not strictly 5 – domains, which are sets of atomic constant val- follow Codd’s relational model , it is relationally com- ues. Every attribute is related to a specific domain plete, i.e. all of the relational algebra operations (e.g. and all of its values are members of this domain. selection, projection, join, union) can be expressed as an SQL query. SQL serves not only as a declarative Again, a formal definition of the relational model can query language – the subset of SQL called Data Ma- be found in [3]. A relation schema is simply the name nipulation Language (DML) – but also as a database of a relation, combined with the names and domains of schema definition language – the subset of SQL called its attributes and a set of primary key and foreign key Data Definition Language (DDL). Thus, a relational constraints. A relation instance is a set of tuples that database schema can often be expressed as an SQL have the same attributes as the corresponding relation DDL statement. schema and satisfy the relation schema constraints. A relational database schema is the set of relation 3.2. Semantic Web technologies schemas for all relations in a database, while a rela- tional database instance is a set of relation instances, Moving on to the Semantic Web field, we begin with one for every relation schema in the database schema. RDF, probably one of the most significant SW build- The transition from a conceptual schema expressed ing blocks. The RDF model is a graph-based model in terms of the EER model to a relational schema is based on the concepts of resources and triples or state- performed according to well established standard algo- ments. A resource can be anything that can be talked rithms [51], that define correspondences between con- about: a web document, a real-world entity or an ab- structs of the two models. A very rough sketch of these stract and is identified by a URI (Uniform Re- correspondences is presented in Table 1, which, al- source Identifier). RDF triples consist of three compo- though incomplete, clearly reveals that the transforma- nents: a subject, a predicate and an object. An RDF tion of the EER model to the relational one is not re- triple is an assertion stating that a relationship, de- versible. This is the source of the database reverse en- noted by the predicate, holds between the subject and gineering problem i.e. the problem of recovering the the object. An RDF graph is a set of RDF triples that initial conceptual schema from the logical one. The relational schema obtained from a direct trans- form a directed labelled graph. 
Subjects and objects of lation of the conceptual schema is not always opti- RDF triples are represented by nodes, while predicates mal for data storage and may cause problems during data update operations. Thus, the relational schema 4For more information on dependency types and normalization, is further refined in an additional normalization step, see standard database theory textbooks [3,51,102]. 5 which minimizes the redundancy of stored informa- The SQL standard uses the terms table, column and row, instead of the equivalent relational model terminology of relation, attribute tion. In short, the normalization process decomposes and tuple. It also allows the presence of identical rows in the same ta- relations of a relational database schema, in order to ble, a case frequently observed in derived tables that hold the results eliminate dependencies among relation attributes. The of an SQL query. D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey 7

Table 1 EER to relational model transformation

EER model Relational model

Entity set Relation Single-valued Attribute Attribute Multi-valued Relation and foreign key Binary 1:1 or 1:N Foreign key or relation relationship set M:N Relation n-degree relationship set (n > 2) Relation and n foreign keys One relation per entity set and foreign keys Hierarchy of entity sets or one relation for all entity sets are arcs pointing from the subject to the object of the distinguish between knowledge representation, upper- triple. A special case of nodes are blank nodes, which level, domain and application ontologies [56]. Domain are not identified by a URI. An RDF graph can be se- (or domain-specific) ontologies are the most common rialized in various formats, some of which are suitable ones and provide a vocabulary for concepts of a spe- mainly for machine consumption (such as the XML- cific domain of interest and their relationships. We will based RDF/XML) and some are more easily under- see in Section 5 that there are approaches that map re- standable by humans (such as the plain text and lational databases to domain ontologies, where the do- N3 syntaxes6). main can be either the relational model itself or the Another fundamental concept not only for Seman- topic on which the contents of the database touch. tic Web, but in general, is the con- Several ontology languages have been proposed for cept of ontology. In the context of computer science, the implementation of ontologies, but RDF Schema several definitions have been given for ontology, from (RDFS) and (OWL) are the Tom Gruber’s “ontology is an explicit specification of most prominent ones, especially in the Semantic Web a conceptualization”[58] and Nicola Guarino’s “ontol- domain. RDFS is a minimal ontology language built ogy is a logical theory accounting for the intended on top of RDF that allows the definition of: meaning of a formal vocabulary”[59] to more concrete formal ones [46,69]. The abundance of ontology defi- – classes of individual resources. nitions stems from the broadness of the term, which in- – properties, connecting two resources. cludes ontologies of several logic formalisms and rep- – hierarchies of classes. resentation languages. Therefore, we slightly adapt the – hierarchies of properties. definition of Maedche and Staab [86] which is general – domain and range constraints on properties. enough to describe most ontologies. According to [86], OWL, on the other hand, provides ontology design- an ontology is a set of primitives containing: ers with much more complex constructs than RDFS – a set of concepts. does and its popularity is constantly rising. As in – a set of relationships between concepts, described RDFS, the basic elements of OWL are classes, prop- by domain and range restrictions. erties and individuals, which are members of classes. – a taxonomy of concepts with multiple inheritance. OWL properties are binary relationships and are dis- – a taxonomy of relationships with multiple inheri- tinguished in object and datatype properties. Object tance. properties relate two individuals, while datatype prop- – a set of axioms describing additional constraints erties relate an individual with a literal value. Hierar- on the ontology that allow to infer new facts from chies of classes and properties, property domain and explicitly stated ones. 
range restrictions, as well as value, cardinality, existen- Ontologies can be classified according to the subject tial and universal quantification restrictions on the in- of the conceptualization: hence, among others, we can dividuals of a specific class can be defined, among oth- ers. OWL is based on Description Logics (DL) [15], a

6The most popular among the two, Turtle, has been put on W3C family of languages that are subsets of first-order logic, Recommendation track (Working Draft: http://www.w3.org/ and offers several dialects of differing expressiveness. TR/turtle/). Following the DL paradigm, statements in OWL can 8 D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey be divided in two groups: the TBox, containing inten- can not expressed in OWL. The related W3C Recom- sional knowledge (i.e. axioms about classes and prop- mendation is RIF9 which introduces three dialects that erties) and the ABox, containing extensional knowl- share a lot in common with several existing rule lan- edge (i.e. statements involving individuals). Daring an guages. The goal is to provide a uniform format for analogy with database systems, TBox could be com- exchange of rules of different kinds. Despite its Rec- pared to the database schema and ABox to the database ommendation status, RIF has not been widely adopted instance. The combination of TBox and ABox, to- yet, and instead, the Semantic Web community seems gether with a provider of reasoning services, is often to favour the de facto standard of SWRL10, which al- also called a . lows rules to be expressed as OWL axioms and thus, The first generation of the OWL language family, be distributed as part of OWL ontologies. also known as OWL 1, defined three OWL species: OWL Lite, OWL DL and OWL Full, in order of increasing expressiveness. However, these species 4. A Classification of Approaches turned out not to be very practical for real-world ap- plications. Therefore, the second generation of OWL The term database to ontology mapping has been language, OWL 2, introduced three profiles of dif- loosely used in the related literature, encompassing di- ferent expressiveness, with overlapping feature sets: verse approaches and solutions to different problems. OWL 2 EL, OWL 2 RL and OWL 2 QL, each one In this section, we give a classification which will help best suited for a different application scenario7. Fur- us categorize and analyze in an orderly manner these thermore, OWL 2 DL and Full were defined as syntac- approaches. Furthermore, we introduce the descriptive tic extensions of OWL 1 DL and Full respectively in- parameters to be used for the presentation of each ap- cluding additional features, such as punning, property proach. chains and qualified cardinality constraints. Classification schemes and descriptive parameters The expressiveness of an ontology language influ- for database to ontology mapping methods have al- ences the complexity of the reasoning algorithms that ready been proposed in related work. The distinction can be applied to an ontology. Such algorithms can, between classification criteria and merely descriptive among others, infer new implicit axioms based on measures is often not clear. Measures that can act as explicitly specified ones, check for satisfiability of classification criteria should have a finite number of classes, compute hierarchies of classes and properties values and ideally, should separate approaches in non and check consistency of the entire ontology. As there overlapping sets. Such restrictions are not necessary is a trade-off between the richness of an ontology lan- for descriptive features, which can sometimes be qual- guage and the complexity of reasoning tasks, there itative by nature instead of quantifiable measures. 
Ta- is also an ongoing quest for ontology languages that ble 2 summarizes classification criteria and descrip- strike a balance between these two measures. tive parameters proposed in the related literature, de- As with ontology definition languages, more than a spite the fact that some of the works mentioned have a few ontology query languages exist, but the de facto somewhat different (either narrower or broader) scope query language for RDF graphs is SPARQL. SPARQL than ours. As it can be seen from Table 2, there is a uses RDF graphs expressed in Turtle syntax as query certain level of consensus among classification efforts, patterns and can return as output variable bindings regarding the most prominent measures characterizing (SELECT queries), RDF graphs (CONSTRUCT and database to ontology mapping approaches. DESCRIBE queries) or yes/no answers (ASK queries). The taxonomy we propose and which we are go- SPARQL is constantly evolving so as to allow update ing to follow in the course of our analysis is based on operations as well8 and it has already been proved to some of the criteria mentioned in Table 2. The selec- be as expressive as relational algebra [10]. tion of these criteria is made so as to create a total clas- An alternative way of modeling knowledge are sification of all relevant solutions in mutually disjoint rules, which can sometimes express knowledge that classes. In other words, we partition the database to

7For more details on OWL 2 Profiles, see http://www.w3. 9Rule Interchange Format Overview: http://www.w3.org/ org/TR/owl2-profiles/. 2005/rules/wiki/Overview 8SPARQL 1.1 Update: http://www.w3.org/TR/ 10Semantic Web Rule Language: http://www.w3.org/ sparql11-update/ Submission/SWRL/ D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey 9

Table 2 Classification criteria and descriptive parameters identified in literature

Work Classification Criteria Values Descriptive Parameters

Auer et al. Automation in the creation of mapping Automatic, Semi-automatic, Manual Mapping representation a (2009) [14] Source of considered Existing domain ontologies, Database, Database and User language Access paradigm Extract-Transform-Load (ETL), SPARQL, Linked Data Domain reliance General, Dependent Barrasa Ro- Existence of ontology Yes (ontology reuse), No (created ad-hoc) driguez, Architecture Wrapper, Generic engine and declarative definition – Gomez-Perez Mapping exploitation Massive upgrade (batch), Query driven (on demand) (2006) [17]

Ghawi, Cullot Existence of ontology Yes, No Automation in the instance (2007) [55] Complexity of mapping definition Complex, Direct export process Ontology population process Massive dump, Query driven Automation in the creation of mapping Automatic, Semi-automatic, Manual

Data sourceb, Data exposi- et al. – – Hellmann tion, Data synchronization, (2011) [61] Mapping language, Vocab- ulary reuse, Mapping au- tomation, Requirement of domain ontology, Existence of GUI

Konstantinou, Existence of ontology Yes, No Ontology language, Spanos, Mitrou Automation in the creation of mapping Automatic, Semi-automatic, Manual RDBMS supported, Se- (2008) [72] Ontology development Structure driven, Semantics driven mantic query language, Database components mapped, Availability of consistency checks, User interaction Sahoo et al. Same as in Auer et al. (2009) with the addition of: (2009) [105] Query implementation SPARQL, SPARQL→SQL Mapping accessibility, Ap- Data integration Yes, No plication domain

Sequeda et al. – – Correlation of primary and c (2009) [111] foreign keys, OWL and RDFS elements mapped

Zhao, Chang Database schema analysis Yes, No Purpose, Input, Output, (2007) [129] Correlation analysis of database schema elements, Consideration of database instance, application source code and other sources aIn this work, approaches are also divided in 4 categories: a) alignment, b) database mining, c) integration approaches and d) mapping languages/servers. bApart from relational data sources, this report reviews from XML and CSV data sources as well. cThis report investigates only approaches that create a new ontology by reverse engineering the database schema. ontology mapping problem space in distinct categories The first major division of approaches is based on containing uniform approaches. The few exceptions whether an existing ontology is required for the perfor- we come across are customizable software tools that mance of a method. Therefore, the first classification incorporate multiple workflows, with each one falling criterion we employ is the existence of an ontology under a different category. as a requirement for the mapping process, distinguish- ing among methods that establish mappings between a We categorize solutions to the database to ontology given relational database and a given existing ontology mapping problem to the classes shown in Figure 1. (presented in Section 6) and methods that create a new Next to each class, we append the applicable motiva- ontology from a given relational database (presented tions and benefits, as mentioned in Section 2. Associ- in Section 5). In the former case, the ontology to be ation of a specific benefit to a given class denotes that mapped to the database should model a domain that is there is at least one work in this class mentioning this compatible with the domain of the database contents in benefit. order to be able to define meaningful mappings. This 10 D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey

Relational DBs to Semantic Web

New ontology Existing ontology

√ semantic annotation of dynamic web pages √ mass generation of SW data Database schema Domain-specific √ definition of meaning of relational schema ontology ontology √ heterogeneous database integration √ ontology based data access √ semantic annotation of dynamic web pages √ integration with other data sources √ ontology based data access √ mass generation of SW data √ heterogeneous database integration

No database reverse Database reverse engineering engineering

√ ontology based data access √ heterogeneous database integration √ mass generation of SW data √ ontology learning √ heterogeneous database integration √ ontology based data access √ integration with other data sources

Fig. 1. Classification of approaches. is the reason why the ontology, in this case, is usually The last classification criterion we consider is the selected by a human user who is aware of the nature of application of database reverse engineering tech- data contained in the database. In the case of methods niques in the process of creating a new domain-specific creating a new ontology, the existence of a domain on- ontology. Approaches that apply reverse engineering tology is not a prerequisite, including use cases where try to recover the initial conceptual schema from the an ontology for the domain covered by the database is relational schema and translate it to an ontology ex- not available yet or even where the human user is not pressed in a target language (see Section 5.2.2). On familiar with the domain of the database and relies on the other hand, there are methods that, instead of re- the mapping process to discover the semantics of the verse engineering techniques, apply few basic transla- database contents. tion rules from the relational to the RDF model (de- The class of methods that create a new ontology tailed in Section 5.2.1) and/or rely on the human expert for the definition of complex mappings and the enrich- is further subdivided in two subclasses, depending ment of the generated ontology model. on the domain of the generated ontology. On the By inspection of Figure 1, we can also see that, one hand, there are approaches (presented in Section with a few exceptions, there is not significant corre- 5.1) that develop an ontology with domain the rela- lation between the taxonomy classes and the motiva- tional model. The generated ontology consists of con- tions and benefits presented in Section 2. This happens cepts and relationships that reflect the constructs of because, as we argued above, this taxonomy catego- the relational model, mirroring the structure of the in- rizes approaches based on the nature of the mapping put relational database. We will call such an ontol- and the techniques applied to establish the mapping. ogy a “database schema ontology”. Since the gen- On the contrary, most benefits state the applications erated ontology does not implicate any other knowl- of the already established mappings (e.g. database in- edge domain, these approaches are mainly automatic tegration, web page annotation), which do not de- relying on the input database schema alone. On the pend on the mapping process details. Notable excep- other hand, there are methods that generate domain- tions to the above observation are the ontology learn- specific ontologies (presented in Section 5.2), depend- ing and definition of semantics of database contents ing on the domain described by the contents of the in- motivations. The former, by definition, calls for ap- put database. proaches that produce an ontology by analyzing a re- D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey 11 lational database, while the latter calls for approaches ized group of RDF statements. The is that ground the meaning of a relational database to rewritten to an SQL query, which is executed and its clearly defined concepts and relationships of an exist- results are transformed back as a response to the se- ing ontology. mantic query. This access mode is also known as query Given the taxonomy of Figure 1 and the three clas- driven or on demand and characterizes ontology-based sification criteria employed in it, we choose among the data access approaches. 
Finally, the Linked Data value rest of the features and criteria mentioned in Table 2 means that the result of the mapping is published ac- the most significant ones as descriptive parameters for cording to the Linked Data principles: all URIs use the the approaches reviewed in this paper. The classifica- HTTP scheme and, when dereferenced, provide useful tion criteria and descriptive parameters adopted in this information for the resource they identify11. paper are shown in Figure 2 and elaborated in the fol- The data accessibility feature is strongly related to lowing: the data synchronization measure, according to which, Level of automation. This variable denotes the ex- methods are categorized in static and dynamic ones, tent of human user involvement in the process. Pos- depending on whether the mapping is executed only sible values for it are automatic, semi-automatic and once or on every incoming user query, respectively. We manual. Automatic methods do not involve the human believe that the inclusion of data synchronization as user in the process, while semi-automatic ones require an additional descriptive feature would be redundant, some input from the user. This input may be necessary since the value of data accessibility uniquely deter- for the operation of the process, constituting an integral mines whether a method maintains static or dynamic part of the algorithm, or it may be optional feedback mappings. Hence, an ETL value signifies static map- for the validation and enrichment of the result of the pings, while SPARQL and Linked Data values denote mapping process. In manual approaches, the mapping dynamic mappings. is defined in its entirety by the human user without Mapping language. This feature refers to the lan- any feedback or suggestions from the application. In guage in which the mapping is represented. The val- some cases, automation level is a characteristic that is ues for this feature vary considerably, given that, un- common among approaches within the same taxonomy til recently, no standard representation language ex- class. For instance, methods that create a new database isted for this purpose and most approaches used propri- schema ontology are purely automatic, while the ma- etary formats. W3C recognized this gap and proposed jority of approaches that map a relational database to R2RML12 (RDB to RDF Mapping Language) as a rep- an existing ontology are manual, given the complexity resentation language for relational database to ontol- of the discovery of correspondences between the two. ogy mappings. As R2RML is currently under develop- Data accessibility. Data accessibility describes how ment, it is expected to be adopted by software tools in the mapping result is accessed. This variable is also the near future. The mapping language feature is ap- known in the literature as access paradigm, mapping plicable only to methods that need to reuse the map- implementation or data exposition. Possible values ping. A counterexample is the class of ontology learn- for this variable are ETL (Extract, Transform, Load), ing methods that create a new domain ontology. Meth- SPARQL or another ontology query language and ods of this class usually do not represent the mapping Linked Data. 
ETL means that the result of the map- in any form, since their intention is simply the gen- ping process, be it an entire new ontology or a group eration of a new domain ontology, having no interest of RDF statements referencing existing ontologies, is in storing correspondences with database schema ele- generated and stored as a whole in an external stor- ments. age medium (often a triple store but it can be an ex- Ontology language. This parameter states the lan- ternal file as well). In this case, the mapping result is guage on which the involved ontology is expressed. said to be materialized. Other terms used for ETL is batch transformation or massive dump. An alternative 11The use of the term Linked Data may be slightly misleading, way of accessing the result of the mapping process is as both the ETL export of RDF data dumps and the SPARQL ac- through SPARQL or some alternative semantic query cess scheme are considered alternative ways of Linked Data pub- language. In such a case, only a part (i.e. the answer lishing [60]. Nevertheless, we follow the convention of [14,61,105], where the term Linked Data denotes an RDF access scheme based to the query) of the entire mapping result is accessed on HTTP dereferenceable URIs. and no additional storage medium is required since 12R2RML Specification: http://www.w3.org/TR/ the database contents are not replicated in a material- r2rml/ 12 D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey

Existence of ontology Ontology domain Application of database • Yes • Relational model reverse engineering • No • Other • Yes • No

Automation level Data Accessibility Mapping Language • Automatic • ETL • SQL • Semi-automatic • SPARQL • RDF/XML • Manual • Linked Data • Custom language etc.

Ontology Language Vocabulary Reuse Software availability • RDFS • Yes • Yes • OWL dialect • No • No • Commercial

Purpose Graphical user interface • Mass generation of SW data • Yes • Ontology learning • No • Ontology based data access • Database integration etc .

Fig. 2. Classification criteria and descriptive parameters used.

Depending on the kind of approach, it may refer to the viding practical solutions to the database to ontology language of the ontology generated by the approach or mapping problem. Therefore, this feature serves (to- to the language of the existing ontology required. Since gether with the purpose feature) as an index of the the majority of methods reviewed implicate Semantic available software that should be considered from a Web ontologies, the values for this feature range from potential user seeking support for a particular problem. RDFS to all flavours of OWL. When the expressive- Commercial software, which is not offering freely its ness of the OWL language used is not clear from the full functionality to the general public, is indicated ap- description of the method, no specific profile or species propriately. is mentioned. Graphical user interface. This feature applies to Vocabulary reuse. This parameter states whether a approaches accompanied with an accessible software relational database can be mapped to more than one implementation (with the exception of unavailable re- existing ontologies or not. This is mainly true for man- search prototypes that have been stated to have a user ual approaches where the human user is usually free to interface) and shows whether the user can interact with reuse terms from existing ontologies at will. Reusing the system via a graphical user interface.This param- terms from popular vocabularies is one of the prereq- eter is significant especially for casual users or even uisites for Semantic Web success and hence, is an im- database experts who may not be familiar with Se- portant aspect for database to ontology mapping ap- mantic Web technologies but still want to bring their proaches. Of course, methods allowing for vocabulary database into Semantic Web. A graphical interface that reuse do not enforce the user to do so, thus this vari- guides the end user through the various steps of the able should not be mistaken with the existence of on- mapping process and provides suggestions is consid- tology requirement that was used as a classification cri- ered as essential for inexperienced users. terion in Figure 1. Once again, values of this param- Purpose. This is the main motivation stated by the eter may be common among all approaches within a authors of each approach. This is not to say that the certain class, e.g. methods generating a new database motivation stated is the only applicable and that the ap- schema ontology do not directly allow for vocabulary proach cannot be used in some other context. Usually, reuse, since the terms used are specified a priori by the all the benefits and motivations showcased in Figure 1 method and are not subject to change. apply to all tools and approaches of the same category. Software availability. This variable denotes whether We note once again that any of the above measures an approach is accompanied with freely accessible could have been considered as a criterion for the def- software, thus separating purely theoretical methods or inition of an alternative taxonomy. However, we ar- unreleased research prototypes from approaches pro- gue that the criteria selected lead to a partition of the D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey 13 entire problem space in clusters of approaches that present increased intra-class similarity. 
For example, File storage Relational Mapping some works [14,17,55,105] categorize methods based database engine on the data accessibility criterion to static and dynamic 1 Persistent storage ones. This segregation is not complete though, as there Mappings SQL 2 are methods that restrict themselves in finding corre- Rules spondences between a relational database and an on- 2 3 3 tology without migrating data to the ontology nor of- fering access to the database through the ontology. This was the main reason behind the omission of this distinction in the classification of Figure 1. The values Fig. 3. Generation of ontology from a relational database. of the above descriptive parameters for all approaches mentioned in this paper are presented in Table 8. lution response or an appropriate HTTP response, de- Summing up the categories of methods recognized, pending on the occasion. we will deal with the following, in the next sections of Before we delve into the different categories of on- the paper: tology creation approaches, we describe as briefly as 1. Creation of database schema ontology (Section possible an elementary transformation method of a re- 5.1) lational database to an RDF graph proposed by Tim 2. Creation of domain-specific ontology without Berners-Lee [22] that, albeit simple, has served as a database reverse engineering (Section 5.2.1) reference point for several ontology creation meth- 3. Creation of domain-specific ontology with database ods. This method has been further extended [19] so reverse engineering (Section 5.2.2) as to produce an RDFS ontology and is also infor- 4. Definition or discovery of mappings between mally known as “table-to-class, column-to-predicate”, a relational database and an existing ontology but for brevity we are going to refer to it as the basic (Section 6) approach throughout this paper. According to the basic approach: 1. Every relation R maps to an RDFS class C(R). 5. Ontology Creation from Existing Database 2. Every tuple of a relation R maps to an RDF node of type C(R). In this section, we deal with the issue of generating 3. Every attribute att of a relation maps to an RDF a new ontology from a relational database and option- property P (att). ally populating the former with RDF data that are ex- 4. For every tuple R[t], the value of an attribute att tracted from the latter. Schematically, this process is maps to a value of the property P (att) for the roughly shown on Figure 3. The mapping engine inter- node corresponding to the tuple R[t]. faces with the database to extract the necessary infor- mation and, according to manually defined mappings As subjects and predicates of RDF statements must or internal heuristic rules, generates an ontology which be identified with a URI, it is important to come up can also be viewed as a syntactic equivalent of an RDF with a URI generation scheme that guarantees unique- graph. This RDF graph can be accessed in three ways, ness for distinct database schema elements. As ex- as described in Section 4: ETL, SPARQL or Linked pected, there is not a universal URI generation scheme, Data (lines 1, 3 and 2 in Figure 3, respectively). In but most of them follow a hierarchical pattern, much ETL, the RDF graph can be stored in a file or in a per- like the one presented in Table 3, where for conve- sistent triple store and be accessed directly from there, nience multi-column primary keys are not considered. 
while in SPARQL and Linked Data access modes, the RDF graph is generated on the fly from the contents residing in the database. In these modes, a SPARQL query or an HTTP request respectively are transformed to an appropriate SQL query. Finally, the mapping engine retrieves the results of this SQL query (this is not depicted in Figure 3) and formulates a SPARQL solution response or an appropriate HTTP response, depending on the occasion.

Before we delve into the different categories of ontology creation approaches, we describe as briefly as possible an elementary transformation method of a relational database to an RDF graph proposed by Tim Berners-Lee [22] that, albeit simple, has served as a reference point for several ontology creation methods. This method has been further extended [19] so as to produce an RDFS ontology and is also informally known as "table-to-class, column-to-predicate", but for brevity we are going to refer to it as the basic approach throughout this paper. According to the basic approach:

1. Every relation R maps to an RDFS class C(R).
2. Every tuple of a relation R maps to an RDF node of type C(R).
3. Every attribute att of a relation maps to an RDF property P(att).
4. For every tuple R[t], the value of an attribute att maps to a value of the property P(att) for the node corresponding to the tuple R[t].

As subjects and predicates of RDF statements must be identified with a URI, it is important to come up with a URI generation scheme that guarantees uniqueness for distinct database schema elements. As expected, there is not a universal URI generation scheme, but most of them follow a hierarchical pattern, much like the one presented in Table 3, where for convenience multi-column primary keys are not considered. An important property that the URI generation scheme must possess and is indispensable for SPARQL and Linked Data access is reversibility, i.e. every generated URI should contain enough information to recognize the database element or value it identifies.
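To make the four rules and the role of a reversible URI scheme concrete, the following minimal Python sketch exports a hypothetical staff relation to plain (subject, predicate, object) triples, using a hierarchical URI pattern in the spirit of Table 3 below; the base URI, relation and column names are invented for this illustration and do not correspond to any particular tool.

    BASE = "http://www.example.org/univ"   # hypothetical base URI

    def class_uri(rel):
        return "%s/%s" % (BASE, rel)

    def property_uri(rel, attr):
        return "%s/%s#%s" % (BASE, rel, attr)

    def tuple_uri(rel, pk, pkval):
        # hierarchical, reversible pattern in the spirit of Table 3
        return "%s/%s/%s=%s" % (BASE, rel, pk, pkval)

    def export_relation(rel, pk, columns, rows):
        """Yield (subject, predicate, object) triples following rules 1-4."""
        yield (class_uri(rel), "rdf:type", "rdfs:Class")                 # rule 1
        for col in columns:
            yield (property_uri(rel, col), "rdf:type", "rdf:Property")   # rule 3
        for row in rows:
            subject = tuple_uri(rel, pk, row[pk])
            yield (subject, "rdf:type", class_uri(rel))                  # rule 2
            for col, value in row.items():
                yield (subject, property_uri(rel, col), value)           # rule 4

    def parse_tuple_uri(uri):
        """Reversibility: recover (relation, pk, pkval) from a tuple URI."""
        rel, key = uri[len(BASE) + 1:].split("/", 1)
        pk, pkval = key.split("=", 1)
        return rel, pk, pkval

    rows = [{"id": "278", "name": "J. Doe", "address": "Athens"}]
    for triple in export_relation("staff", "id", ["id", "name", "address"], rows):
        print(triple)
    print(parse_tuple_uri(tuple_uri("staff", "id", "278")))

Because the primary key name and value are embedded in each tuple URI, the last function can map a URI back to the database row it identifies, which is precisely the property that SPARQL and Linked Data access modes rely on.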

Table 3 Typical URI generation scheme

Database Element    URI Template (a)    Example

Database     {base_URI}/{db}/                    http://www.example.org/univ/
Relation     {base_URI}/{db}/{rel}               http://www.example.org/univ/staff/
Attribute    {base_URI}/{db}/{rel}#{attr}        http://www.example.org/univ/staff#address
Tuple        {base_URI}/{db}/{rel}/{pk}={pkval}  http://www.example.org/univ/staff/id=278

(a) The name of the database is denoted db, rel refers to the name of a relation, attr is the name of an attribute and pk is the name of the primary key of a relation, while pkval is the value of the primary key for a given tuple of a relation.

The basic approach is generic enough to be applied to any relational database instance and to a large degree automatic, since it does not require any input from the human user, except from the base URI. However, it accounts for a very crude export of relational databases to RDFS ontologies that does not permit more complex mappings nor does it support more expressive ontology languages. Moreover, the resulting ontology looks a lot like a copy of the database relational schema as all relations are translated to RDFS classes, even the ones which are mere artifacts of the logical database design (e.g. relations representing a many-to-many relationship between two entities). Other notable shortcomings of the basic approach include its URI generation mechanism, where new URIs are minted even when there are existing URIs fitting for the purpose, and the excessive use of literal values, which seriously degrade the quality of the RDF graph and complicate the process of linking it with other RDF graphs [31]. Nevertheless, as already mentioned, several methods refine and build on the basic approach, by allowing for more complex mappings and discovering domain semantics hidden in the database structure. Such methods are presented in the next subsections.

5.1. Generation of a database schema ontology

We first review the class of approaches that create a new database schema ontology. Database schema ontologies have as domain the relational model and do not model the domain that is referred to by the contents of the database. In other words, a database schema ontology contains concepts and relationships pertaining to the relational model, such as relation, attribute, primary key and foreign key, and simply mirrors the structure of a relational database schema.

The process of generating such an ontology is entirely automated, since all the information needed for the process is contained in the database schema and additional domain knowledge from a human user or external information sources is not needed. Both ETL and SPARQL based approaches are met in this category regarding data accessibility, while the mapping itself is rarely represented in some form. This is due to the fact that the correspondences between the database schema and the generated ontology are straightforward, as every element of the former is usually mapped to an instance of the respective class in the latter. For example, a relation and each one of its attributes in a database schema are mapped to instances of some Relation and Attribute classes in the generated ontology respectively, with the Relation instance being connected with the Attribute instances through an appropriate hasAttribute property. The names of classes and property in the above example are of course figurative, as every approach defines its own database schema ontology.
An issue frequently arising when modeling a database schema ontology is the need to represent, apart from the database schema, the schema constraints that are applied to the database contents. In the majority of cases, the motivation is the check of database schema constraints using ontology reasoning engines. This requires not only representation of the database schema but also description of the data itself. The most straightforward and natural design pattern that achieves this represents schema-level database concepts as instances of the database schema ontology classes and data is then represented as instances of the schema layer concepts. This design makes use of more advanced metamodeling ontology features. Informally, metamodeling allows an ontology resource to be used at the same time as a class, property and individual. Metamodeling violates the separation restriction between classes, properties, individuals and data values that is enforced in OWL (1/2) DL, the decidable fragment of the OWL language, and leads to OWL Full ontologies, which have been shown to be undecidable [89]. The introduction of the punning feature in OWL 2 DL may allow different usages of the same term, e.g. both as a class and an individual, but the standard OWL 2 DL semantics and reasoners supporting it consider these different usages as distinct terms and thus, entailments anticipated based on OWL (1/2) Full semantics are missed.

Approaches that make use of metamodeling for building a database schema ontology [95,121] create OWL Full ontologies and do not go as far as to check database constraints with appropriate ontology reasoners. This may be due to the popular misbelief that such a task can be carried out only by reasoners that can compute all OWL Full entailments of an ontology. On the contrary, widely available OWL 2 RL reasoners13 or first order logic theorem proving systems may suffice for the task [108].

13 OWL 2 RL Rules (http://www.w3.org/TR/owl2-profiles/#Reasoning_in_OWL_2_RL_and_RDF_Graphs_using_Rules) offer a partial axiomatization of the OWL 2 Full semantics and thus, can be used for developing incomplete reasoning systems that extract some OWL 2 Full entailments.

The vast majority of methods in this group map a relational database schema to strictly one ontology and do not reuse terms from external vocabularies, since their domain of interest is the relational model. The obvious disadvantage of this group of approaches is the lack of data semantics in the generated database schema ontology, a fact that makes them only marginally useful for integration with other data sources. Thereby, some approaches complement the database schema ontology with manually defined mappings to external domain-specific ontologies. These ontology mappings may take the form of ontology (rdfs:subClassOf and owl:equivalentClass) axioms [74,75], SPARQL CONSTRUCT queries that output a new RDF graph with terms from known ontologies [96] or SWRL rules that contain terms from the database schema ontology in their body and terms from popular ontologies in their head [53]. Such approaches, even indirectly, manage to capture the semantics of database contents and thus, increase the semantic interoperability among independently developed relational databases.

The most popular and dominant work in this category is Relational.OWL [95], which is also reused in other approaches and tools (e.g. ROSEX [43] and DataMaster [93]). Relational.OWL embraces the design pattern commented above and makes use of the OWL Full metamodeling features. Every relation and attribute in the relational model gives rise to the creation of a corresponding instance of meta-classes Table and Column. Data contained in the database is represented according to the rules of the basic approach: every tuple of a relation is viewed as an instance of a schema representation class, while tuple values are viewed as values of properties-instances of the Column class. This lack of separation among classes, properties and individuals makes the produced ontology OWL Full. Moreover, OWL classes denoting the concepts of primary key and database schema are defined, as well as OWL properties that tie together database schema elements (such as the hasColumn property connecting instances of the Table and Column classes and the references property describing foreign key links). As with most database schema ontologies, the generation of the ontology is performed automatically, without any user intervention. To overcome the lack of domain semantics in the database schema ontology, the authors also propose the use of Relational.OWL as an intermediate step for the creation of an RDF graph that reuses terms from external ontologies [96]. The RDF graph is produced as the result of SPARQL CONSTRUCT queries executed on the Relational.OWL ontology. Apart from standard ETL access, the addition of the RDQuery wrapper system [97] allows for querying a relational database through SPARQL queries using terms from the Relational.OWL ontology.
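As an illustration of this design pattern, the following Python sketch builds, as plain string triples, a schema-layer description of a hypothetical staff relation and then reuses the very same resource as a class for its tuples. The dbs: vocabulary (Table, Column, hasColumn, isPrimaryKeyOf) is invented for the example and only loosely follows the spirit of Relational.OWL; it is not the exact terminology of that or any other tool.

    DBS = "http://example.org/dbschema#"   # hypothetical schema vocabulary
    DB = "http://www.example.org/univ/"    # URIs minted for this database

    triples = [
        # schema layer: the relation and its attributes are *instances*
        (DB + "staff", "rdf:type", DBS + "Table"),
        (DB + "staff#id", "rdf:type", DBS + "Column"),
        (DB + "staff#name", "rdf:type", DBS + "Column"),
        (DB + "staff", DBS + "hasColumn", DB + "staff#id"),
        (DB + "staff", DBS + "hasColumn", DB + "staff#name"),
        (DB + "staff#id", DBS + "isPrimaryKeyOf", DB + "staff"),
        # data layer: the very same resource is now also used as a class and
        # the columns as properties, which is the metamodeling step
        (DB + "staff/id=278", "rdf:type", DB + "staff"),
        (DB + "staff/id=278", DB + "staff#name", '"J. Doe"'),
    ]

    for t in triples:
        print(t)

The last two triples are the crucial point: staff is used both as an individual (an instance of Table) and as a class for its tuples, which is what pushes such ontologies into OWL Full unless OWL 2 punning is invoked.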
DataMaster [93] is a tool that exports the schema and the contents of a relational database in an ontology using the Relational.OWL terminology. The novelty though is that it also offers two alternative modeling versions of Relational.OWL that manage to stay within the syntactic bounds of OWL DL. In the first one, the schema representation layer of Relational.OWL is completely dropped and thereby, database relations and attributes are translated to OWL classes and properties respectively (much like in the basic approach). In the second one, it is the data representation layer that is missing, since relations and attributes are translated to instances of the Table and Column Relational.OWL classes respectively, but the latter are not translated to properties as well, therefore avoiding the metamodeling feature used in Relational.OWL.

ROSEX [43] also uses a slightly modified version of the Relational.OWL ontology to represent the relational schema of a database as an OWL ontology, thus ignoring the data representation layer of the latter. The generated database schema ontology is mapped to a domain-specific ontology that is generated automatically by reverse engineering the database schema (much like approaches of Section 5.2.2 do) and this mapping is used for the rewriting of SPARQL queries expressed over the domain ontology to SQL queries expressed over the relational database. Although it is argued that the insertion of the database schema ontology as an intermediate layer between the domain ontology and the database can contribute in tracking changes of the domain ontology as the database schema evolves, it is not clear how this is actually achieved.

An approach similar to Relational.OWL that produces a new database schema OWL ontology is RDB2ONT [121]. The classes and properties defined follow the same metamodeling paradigm as in Relational.OWL, but the attempt to model relational schema constraints as OWL constraints is erroneous. Although relational schema constraints refer to the contents of a relation, the authors translate these constraints by applying them on the schema-level OWL classes, in an unsuccessful effort to avoid OWL Full. Furthermore, it is not clear whether information from the ER conceptual schema of the database is used in the design of the ontology, since the exact cardinalities of relationships are assumed in this work to be given, while it is known that they cannot always be deduced by inspection of the relational schema alone.
Avoidance of metamodeling is successful in the work of Kupfer and colleagues [75], where representation of the database contents is not considered. The output of this approach is an OWL-Lite ontology reflecting the database schema, which is later manually enriched with terms from a domain ontology through the use of standard rdfs:subClassOf and owl:equivalentClass axioms.

Automapper [53] (now part of the commercial Asio SBRD tool) allows a relational database to be queried through SPARQL using terms from a domain ontology. Despite the fact that it is mainly an ontology-based data access solution, we present it in this section, since it involves a database schema ontology as well. Automapper creates an OWL ontology enhanced with SWRL rules to describe the relational schema and express its constraints, such as the uniqueness of the primary key or attribute datatype restrictions. The generation of the schema ontology is automatic and follows the basic approach of one class per relation and one property per attribute. Automapper is the only tool in this category that stores explicitly the mapping among the database schema and the generated ontology. The mapping is expressed in terms of a mapping ontology heavily influenced by D2RQ [27], which is described in Section 5.2. Once again, the construction of a database schema ontology is supplemented with connections to a domain ontology, this time via SWRL rules. SPARQL queries using terms from the domain ontology are translated to SQL queries, taking into account details of the stored mapping, such as the schema-specific URI generation mechanism.

A similar workflow is described in the work of Dolbear and Hart [48], which focuses on spatial databases. Their automatically generated data ontology is enriched with manually defined ontological expressions that essentially correspond to SQL queries in the underlying database. These complex expressions are then linked through ordinary rdfs:subClassOf and owl:equivalentClass axioms with an existing domain ontology.

FDR2 [74] diverges considerably from the basic approach: every attribute of a relation is mapped to an RDFS class, a relation corresponds to an RDF list of these classes, while every pair of attributes of the same relation is mapped to a so-called virtual property. A database value is mapped as instance of its respective "column class" and a tuple is represented indirectly, with the interlinkage of values of the same row via virtual properties. This design choice leads to a certain amount of redundancy in the representation of database contents, especially in the case of relations with n attributes, where n(n − 1) virtual attributes are created. Correspondences of the created database schema ontology with an existing domain ontology are defined manually as rdfs:subClassOf and rdfs:subPropertyOf axioms.
An interesting summary of solutions for the transformation of relational tuples and hence relational schemas in RDFS ontologies is given in the work of Lausen [78], with ultimate goal the identification of the optimal scheme allowing for database constraint checking, while staying clear of OWL Full solutions. In this work, four different alternatives are presented. The tuple-based mapping is essentially the same as the basic approach, with the only difference being the use of blank nodes for each tuple, instead of a generated URI. The obvious disadvantage of this approach is the encoding of all database values as literals, which ignores the presence of foreign keys and leads to duplication of data in the output RDF graph. This is overcome in value-based mapping, where URIs are generated for the values of key attributes and thus, "foreign key" properties can directly reference these values in the output RDF graph. The URI-based mapping produces a URI for every tuple of a relation, based on the values of the primary key of the relation, pretty much following the URI scheme of Table 3. Also, since primary key values are encoded in the tuple URI, the primary key of a relation is not explicitly mapped to an RDF property. The drawback of this type of mapping is the fact that foreign key constraints may also implicitly be encoded as dependencies between URIs in the RDF graph, in case the primary key of a relation is also a foreign key to some other relation, a result that is highly undesirable. The object-based mapping sticks to the assignment of URIs to the values of key attributes (as in the case of value-based mapping), but foreign keys are represented as properties linking the blank node identifiers of the tuples being linked via the foreign key relationship. Again, foreign key constraints are difficult to be checked in a straightforward way, since a property path should be followed in the RDF graph in order to ensure referential integrity. Therefore, the solution adopted by Lausen is an RDFS ontology that employs the value-based mapping approach and uses a custom vocabulary of classes and properties for the denotation of primary and foreign keys. This generated database schema ontology is in fact used to check database constraints via execution of appropriate SPARQL queries in [79].
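The difference between the first two of these alternatives can be sketched in a few lines of Python; the enrolment and student relations, property names and URIs below are hypothetical and merely contrast a literal-only export with one that mints URIs for key values.

    BASE = "http://www.example.org/univ"

    def tuple_based(rows):
        # every tuple becomes a blank node and every value a literal, so the
        # link to the referenced student survives only as an opaque string
        for i, row in enumerate(rows):
            subject = "_:enrolment%d" % i
            yield (subject, BASE + "/enrolment#student_id", '"%s"' % row["student_id"])
            yield (subject, BASE + "/enrolment#course_id", '"%s"' % row["course_id"])

    def value_based(rows):
        # values of key attributes get URIs of their own, so the "foreign key"
        # property can point at the URI minted for the referenced key value
        for i, row in enumerate(rows):
            subject = "_:enrolment%d" % i
            yield (subject, BASE + "/enrolment#student_id",
                   BASE + "/student/id=" + row["student_id"])
            yield (subject, BASE + "/enrolment#course_id", '"%s"' % row["course_id"])

    rows = [{"student_id": "278", "course_id": "SW101"}]
    print(list(tuple_based(rows)))
    print(list(value_based(rows)))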
tics for the interpretation of ontology axioms that act Levshin [81] focuses on database schema constraint as integrity constraints. In fact, in [115], it is argued checking as well. His approach constructs an OWL that validation of database integrity constraints can ontology that follows the basic approach, but he also be performed through the execution of SPARQL con- enriches it with vocabulary for database concepts and straints, therefore giving right to the efforts described constraint checking. Special care is taken for null val- in [79,81]. ues, which are explicitly marked in the generated OWL Slight variations of the tuple-based mapping are pre- ontology, contrary to common practice, which simply sented in CROSS [35] and in the work of Chatter- ignores database null values during the transformation jee and Krishna [36]. CROSS exports an OWL rep- of database contents in RDF. Not null, primary, foreign resentation of a relational database schema and acts key and anti-key14 constraints are encoded as SWRL as a wrapper for the data contained in the database, rules, which are executed as soon as the target on- i.e. it employs ETL access to the schema and query tology is generated. Then, the source of violation of based access to the data. For the data component of these constraints, if any, is retrieved through SPARQL the created ontology, CROSS adopts a variant of the queries. This approach is slightly different to the one basic approach: one class for every database relation of Lausen and colleagues [79], in which only SPARQL is constructed, tuples are translated to instances of the queries are used for the inspection of constraint vio- relation’s class, and values are translated to literals lations. Both of these works distinguish constraints to that are not directly linked to the tuple instances, but positive and negative ones, depending on whether the through an intermediate blank node instead. Database presence or the lack of some data causes the violation schema constraints are interpreted as OWL axioms of the constraint. While Lausen and colleagues pro- and thus, their validity (with the exception of foreign pose an extension to SPARQL so as to be able to check key constraints) can be checked with application of an the violation of all possible constraints, Levshin suc- OWL reasoner. Chatterjee and Krishna [36] construct ceeds in doing so by definition of two distinct prop- an RDF graph that follows the principles of tuple- erties (violates and satisfies) for the two type of con- based mapping and they enrich it with vocabulary per- taining to the relational and ER models. No hints are given though as to how this RDF representation could 14An anti-key constraint for a number of attributes in a relation is satisfied if there is at least one pair of tuples with identical values for be used for ensuring the validity of data with respect these attributes. to database schema constraints. 18 D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey

OntoMat-Reverse [125] is a semantic annotation tool for dynamic web pages (now part of the OntoMat-Annotizer tool), supporting two different workflow scenarios: one for the annotation of the web presentation and another for direct annotation of the database schema. While the latter workflow is a typical example of a mapping discovery approach between a database schema and an existing ontology (and is therefore described in Section 6), the former involves the construction of a database schema ontology and is of interest in the current Section. The database schema ontology constructed in this approach is a simple RDFS ontology that represents just the structure of the database. Terms from this ontology are used in RDFS representations of SQL queries that, when executed, return values to be used in dynamically generated web pages and which are also embedded in the web presentation. The manual annotation that is then performed through a graphical interface essentially relates an ontology term to the column of the result set of an SQL query. In this case, the constructed database schema ontology is used as a mere provider of column identifiers for the annotation task, thus one could question its necessity, since simpler solutions could have been equally successful for this task.

The relational schema ontology of Spyder15 is another example of a database schema ontology. Spyder is a tool that outputs a domain-specific ontology according to manually specified mappings and therefore, does not quite fit in this category. However, it is mentioned here because, as an intermediate step, it generates an ontology that describes a database schema. This ontology provides a vocabulary for the database terms that are going to be referred to in the specification of mappings.

In Table 4, we summarize the design choices of the mentioned approaches by explicitly presenting the ontology constructs used for each of the relational database schema elements (approach-specific ontology vocabulary is marked with italics), including schema constraints for attributes, such as domain and not null constraints. We also state whether approaches consider the enrichment of their generated database schema ontology with domain semantics, usually in the form of correspondences with an external domain ontology.
5.2. Generation of a domain-specific ontology

Instead of creating an ontology that mirrors the database instance and adding at a later stage domain semantics, it is preferable to create directly an ontology that refers to the subject domain of the contents of the database. In this section, we deal with approaches that create domain-specific ontologies, in contrast with the database schema ontologies considered in Section 5.1. These domain-specific ontologies do not contain any concepts or relationships that are related to the relational or ER models but instead, concepts and relationships pertaining to the domain of interest that is described by a relational database instance. Naturally, the expressiveness of the generated ontology largely depends on the amount of domain knowledge incorporated in the procedure. The two main sources of domain knowledge are the human user and the relational database instance itself. This is the reason why we distinguish between approaches that mainly rely on reverse engineering the relational schema of a database for the generation of a domain-specific ontology (but can also accept input from a human user) and approaches that do not perform extensive schema analysis but instead are mainly based on the basic approach and optionally, input from a human expert.

5.2.1. Approaches not performing reverse engineering

In this section, we take a look at approaches that generate a new domain-specific ontology from an existing relational database without performing extensive database schema analysis for the extraction of domain semantics. Approaches of this category mainly use the basic approach for exporting a relational database instance to an RDFS ontology and, in the majority of cases, allow a human user who is familiar with the meaning of the database contents to define semantically sound mappings, expressed in a custom mapping representation language. Approaches in this category are usually accompanied with working software applications and this is one of the reasons why this category of tools is by far the most popular and why several relevant commercial tools also begin to emerge16.

The automation level of this category of tools varies and depends on the level of involvement of the human user in the process. Most of these tools support all of the three possible workflows: from the fully automated

15 Spyder homepage: http://www.revelytix.com/content/spyder-beta
16 A notable example is Anzo Connect: http://www.cambridgesemantics.com/products/anzo_connect

Table 4 Overview of approaches of Section 5.1

Relational schema elements Link with Approach domain Relation Attribute Tuple Primary Key Foreign Key Attribute Constraints ontology Automapper Class Datatype prop- Instance of a re- SWRL rule – owl:allValuesFrom and Yes [53] erty lation’s class owl:cardinality = 1 ax- ioms for domain and not null constraints respec- tively CROSS Subclass of Row Datatype prop- Instance of Row Object property Object property rdfs:range restriction on No [35] class erty class the attribute’s property for domain constraints Class Datatype prop- Instance of a re- – Instance of For- – No erty lation class eignKey class DataMaster Instance of Ta- Instance of Col- – Instance of Pri- references prop- rdfs:range restriction on No [93]a ble class umn class maryKey class erty hasXSDType property for domain constraints FDR2 [74] RDF list of at- Class RDF list of val- – – – Yes tributes’ classes ues Kupfer et Instance of Re- Instance of At- – – isAssociatedWith consistsOf property for Yes al. (2006) lation class tribute class property domain constraints [75] Lausen Class Property Blank node Instance of Key Instance of – No (2007) [78] class FKey class Levshin Subclass of Tu- Datatype prop- Instance of a re- SWRL rule SWRL rules rdfs:range restriction on No (2009) [81] ple class erty lation’s class the attribute’s property for domain constraints OntoMat- Instance of Ta- Instance of Col- – Instance of Pri- – type property for domain Yes Reverse ble class umn class maryKey class constraints [125] RDB2ONT Instance of the Property and in- Instance of a re- Instance of referenced At- isNullable and type Yes [121] Relation class stance of the At- lation’s class PrimaryKeyAt- tribute property properties for not null tribute class tribute class and domain constraints respectively Relational Class and in- Datatype prop- Instance of a re- Instance of Pri- references prop- rdfs:range restriction on Yes [96] OWL [95] stance of Table erty and in- lation’s class maryKey class erty the attribute’s property meta-class stance of Col- for domain constraints umn meta-class ROSEX Instance of Ta- Instance of Col- – Instance of Pri- Instance of For- – Yes [43] ble class umn class maryKey class eignKey class Spyder Instance of Ta- Instance of Col- – Instance of Pri- Instance of For- dataType, defaultValue No [88] ble class umn class maryKey class eignKey class and nullable properties for domain, default value and not null constraints respectively

aDataMaster offers three alternative operation modes, one of them being the same as Relational.OWL and therefore not mentioned in the table. standard basic approach with or without inspection and real world use cases is needed. Typical mapping rep- fine-tuning of the final result by the human user to resentation languages are able to specify sets of data manually defined mappings. Regarding data accessi- from the database as well as transformation on these bility, all three modes are again met among reviewed sets that, eventually, define the form of the resulting approaches with a strong inclination towards SPARQL RDF graph. It comes as no surprise that, until today, based data access. with R2RML not being yet finalized, every tool used The language used for the representation of the map- to devise its own native mapping language, each with ping is particularly relevant for methods of this group. unique syntax and features. This resulted in the lock- Unlike the case of tools outputting a database schema in of mappings that were created with a specific tool ontology, where the correspondences with database el- and could not be freely exchanged between different ements are obvious, concepts and relationships of a parties and reused across different mapping engines. It domain-specific ontology may correspond to arbitrar- was this multilingualism that has propelled the devel- ily complex database expressions. To express these opment of R2RML as a common language for express- correspondences, a rich mapping representation lan- ing database to ontology mappings. Plans for adoption guage that contains the necessary features to cover of R2RML are already under way for some tools and 20 D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey the number of adopters is about to increase in the near OpenLink Virtuoso Universal Server is an integra- future. tion platform that comes in commercial and open- Since the majority of the tools we deal with in this source flavours and offers an RDF view over a rela- section rely on the basic approach for the generation tional database with its RDF Views feature [28], that of an ontology, the output ontology language is usually offers similar functionality to D2RQ. That is, it sup- simple RDFS. The goal of these tools is rather to gen- ports both automatic and manual operation modes. In erate a lightweight ontology that possibly reuses terms the former, an RDFS ontology is created following from other vocabularies for increased semantic inter- the basic approach, while in the latter, a mapping ex- operability, than to create a highly expressive ontology pressed in the proprietary Virtuoso Meta-Schema lan- structure. Vocabulary reuse is possible in the case of guage is manually defined. This mapping can cover manually defined mappings, but the obvious drawback complex mapping cases as well, since Virtuoso’s map- is the fact that, in order to select the appropriate on- ping language has the same expressiveness to D2RQ’s, tology terms that better describe the semantics of the allowing to assign any subset of a relation to an RDFS database contents, the user should also be familiar with class and to define the pattern of the generated URIs. popular Semantic Web vocabularies. The main motiva- ETL, SPARQL based and Linked Data access modes tion that is driving the tools of this Section is the mass are supported. 
One downside of both D2RQ and Virtu- generation of RDF data from existing large quantities oso RDF Views is the fact that a user should familiarize of data residing in relational databases, that will in turn himself with these mapping languages in order to per- allow for easier integration with other heterogeneous form the desired transformation of data into RDF, un- data. less he chooses to apply the automatic basic approach. SquirrelRDF Spyder D2RQ [27] is one of the most prominent tools in In the same vein, [109] and create an RDFS view of a relational database, allow- the field of relational database to ontology mapping. ing for the execution of SPARQL queries against it. It supports both automatic and user-assisted operation Both tools support all the workflows mentioned ear- modes. In the automatic mode, an RDFS ontology is lier, that cover the entire range of automation levels. generated according to the rules of the basic approach Their automatic mode is typically based on the basic and additional rules, common among reverse engi- approach and they define their own mapping represen- neering methodologies, for the translation of foreign tation languages, although very different in their ex- keys to object properties and the identification of M:N pressiveness. SquirrelRDF’s mapping language is ex- (many-to-many) relationships. In the manual mode, tremely lightweight and lacks several essential fea- the contents of the relational database are exported to tures, as it only supports mapping relations to RDFS an RDF graph, according to mappings specified in a classes and attributes to RDF properties. On the con- custom mapping language also expressed in RDF. A trary, Spyder’s mapping language is much richer, while semi-automatic mode is also possible, in case the user the tool also supports a fair amount of R2RML fea- builds on the automatically generated mapping in order tures. The main data access scheme for both tools is to modify it at will. D2RQ’s mapping language offers SPARQL based, however SquirrelRDF does not sup- great flexibility, since it allows for mapping virtually port certain kinds of SPARQL queries that, when trans- any subset of a relation in an ontology class and several formed to SQL, prove to be very computationally ex- useful features, such as specification of the URI gen- pensive. eration mechanism for ontology individuals or defini- Tether [31] also outputs an RDF graph from a rela- tion of translation schemes for database values. D2RQ tional database. In this work, several shortcomings of supports both ETL and query based access to the data. the basic approach are identified, some of them men- Therefore, it can serve as a programming interface and tioned in the introduction of Section 5, and fixes for as a gateway that offers ontology-based data access these shortcomings are proposed. However, this work to the contents of a database through either Seman- is targeted specifically to cultural heritage databases tic Web browsers or SPARQL endpoints. and the majority of the suggestions mentioned depend that uses D2RQ mappings to translate requests from largely on the database schema under examination and external applications to SQL queries on the relational the cultural heritage domain itself. The decision as to database is called D2R Server [25]. Vocabulary reuse whether the proposed suggestions are applicable and is also supported in D2RQ through manual editing of meaningful in the database under consideration is to be the mapping representation files. 
made by a domain expert.

As the basic approach is an integral part of the work- shows considerably less features from other more ad- flow of most tools in this category, the RDB2RDF vanced mapping languages, it contains all the neces- Working Group, which is responsible for the R2RML sary information for propagating updates of the RDF specification, has been chartered in order to also pro- graph back to the database. OntoAccess offers data pose a basic transformation of relational databases to access only through SPARQL and is one of the few RDFS ontologies, called direct mapping [24]. The di- approaches that focus on the aspect of updating RDF rect mapping is supposed to standardize the informal views and ensuring that database contents are modi- basic approach that has been used so far with slight fied accordingly. Nevertheless, it does not allow the se- variations from various tools. Therefore, taking into lection of an arbitrary subset of database contents like account the structure of a database, the direct mapping most approaches do, but only supports trivial relation produces an RDF graph that can be later modified to to class and attribute and link relations (representing achieve more complex mappings. many-to-many relationships) to property mappings. Triplify [14] is another notable RDF extraction tool A very crude overview of some of the features sup- from relational schemas. SQL queries are used to se- ported by the mapping definition languages of the tools lect subsets of the database contents and map them to mentioned in this section is shown at Table 5. Among URIs of ontology classes and properties. The mapping the features included are: is stored in a configuration file, the modification of which can be performed manually and later enriched – Support for conditional mappings, allowing for so as to reuse terms from existing popular vocabular- the definition of complex mappings that go be- ies. The generated RDF triples can be either materi- yond the trivial “relation-to-class” case, by en- alized or published as Linked Data, thus allowing for abling the selection of a subset of the tuples of the dynamic access. Triplify also accommodates for ex- relation, according to some stated condition. ternal crawling engines, which ideally would want to – Customization of the URI generation scheme for know the update time of published RDF statements. class individuals, allowing for the definition of To achieve this, Triplify also publishes update logs in URI patterns that may potentially use values from RDF that contain all RDF resources that have been up- attributes other than the primary key (therefore, dated during a specific time period, which can be con- generalizing the basic approach URI generation figured depending on the frequency of updates. The scheme). popularity of the Triplify tool is due to the fact that – Application of transformation functions to at- it offers predefined configurations for the (rather sta- tribute values for the customization of the RDF ble) database schemas used by popular Web applica- representation. Such transformations may include tions. An additional advantage is the use of SQL query string patterns combining two or more database for the mapping representation, which does not require values or more complex SQL functions. users to learn a new custom mapping language. – Support for creation of classes from attribute val- A tool that follows a somehow different rationale ues. is METAmorphoses [118]. 
METAmorphoses uses a – Foreign key support, which interprets foreign key two-layer architecture for the production of a material- attribute values as properties that link two indi- ized RDF graph that can potentially reuse terms from viduals with each other, instead of simple literals. other popular ontologies. Apart from the typical map- A more in-depth comparison of mapping representa- ping layer, where a mapping is defined in a custom tion languages can be found at [64]. XML-based language, there is also a template layer, where the user can specify in terms of another XML- 5.2.2. Approaches performing reverse engineering based language the exact output of the RDF graph by The tools presented in Section 5.2.1 mainly rely on reusing definitions from the mapping layer. This ar- the human user as a source of domain knowledge and chitecture introduces unnecessary complexity for the support him on the mapping definition task, by pro- benefit of reusing mapping definitions, a feature that is viding an initial elementary transformation often based achieved much more elegantly in R2RML. on the basic approach, that can be later customized. Another approach that needs to be mentioned at On the contrary, approaches presented in this section this point is OntoAccess [63], which also introduces consider the relational database as the main source a mapping language, R3M, for the generation of an of domain knowledge, complemented with knowledge RDF graph from a relational database. Although R3M from other external sources and optionally, a human 22 D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey

Table 5 Comparison of language features for tools of Section 5.2.1

Language Conditional URI genera- Transformation Attribute val- Foreign key mappings tion scheme functions ues to classes support for individu- for attribute als values √ √ √ √ √ D2RQ [27] √ √ METAmorphoses – – – [118] √ √ √ √ √ R2RML √ √ R3M [63] – – – √ √ √ √ √ Spyder SquirrelRDF [109] – – – – – √ √ √ √ √ Virtuoso Meta- Schema Language [28] expert as well. Examples of such sources are database sumptions they are based on. These assumptions re- queries, data manipulation statements, application pro- fer to the normal form of the input relational schema, grams accessing the database, thesauri, lexical reposi- the amount of information that is known with respect tories, external ontologies and so on. to the relational schema, the target model the concep- The majority of methods in this category gain in- tual schema is expressed in and the external sources spiration from traditional database reverse engineering of information used. Most methods assume that the in- work. The process of reverse engineering the logical put relational schema is on third normal form (3NF), schema of a database to acquire a conceptual one is which is the least strict normal form that allows for de- a well known problem of database systems research, composing relations into relations that preserve some having attracted the interest of many researchers since interesting properties [102]. Nevertheless, there are the early 90s. Since the nature of the problem is rather methods that are less restrictive and are able to man- broad, as it encompasses different logical and concep- age relational schemas in second normal form (2NF) tual models, the related literature is extremely rich and [103]. Moreover, some approaches do not consider as solutions proposed show considerable variance. Sev- given the candidate, primary and foreign keys of each eral methodologies for the reverse engineering of a re- relation and therefore, try to discover as well func- lational database to a conceptual schema that follows tional and inclusion dependencies by analyzing the the Entity-Relationship or other object-oriented mod- data of the database (e.g. [6,40]). The knowledge of els have been proposed and have later been adapted keys in a relational schema is a very important is- accordingly for providing solution to the relational sue, that greatly simplifies the reverse engineering pro- database to ontology mapping problem. Still though, there exists no silver bullet for the accurate retrieval cess. As such, it is indispensable for most approaches of the conceptual schema that underpins a relational (e.g. [67,87]), although exceptions do exist [101,103]. database. This is due not only to the unavoidable loss Some methodologies focus on specific aspects of the of semantics that takes place in the logical database conceptual model, such as cardinalities of participation design phase, as different constructs of the ER model constraints in relationships [6] or generalization hierar- map to elements of the same type in the relational chies [77], and attempt to identify them. Finally, meth- model, but also to the fact that database designers often ods differ with respect to the target output model con- do not use standard design practices nor do they name sidered: some extract an EER model (e.g. [40]), while relations and attributes in a way that gives insight to others target object-oriented models (e.g. [20,101]). As their meaning [101]. 
As a consequence, reverse engi- hinted earlier, most reverse engineering methods are neering a relational schema is more of an art and less semi-automatic, as they require validation from a hu- of a formal algorithm with guaranteed results. man expert and can use additional information sources, Traditional relational database reverse engineering such as queries that are executed against the database solutions vary significantly because of the different as- or relevant domain thesauri [70]. D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey 23

Despite the variance of the above methodologies, stitutes an odd minority, compared to the large number they can all prove quite useful and relevant when con- of approaches that directly create a domain ontology. sidering the generation of a domain ontology from Approaches of this category are mainly automatic, relational databases, since Description Logics – on since they rely, to a large extent, on a set of general which OWL is based – bear some similarities with rules that are applied on every possible input relational object-oriented models. The approaches mentioned in schema. Of course, the result of this procedure cannot this category accept as input a relational schema, usu- be always correct and represent faithfully the mean- ally in the form of SQL Data Definition Language ing of the relational schema and thereby, a human ex- (DDL) statements, which contain information about pert is often involved in the process to validate the re- primary and foreign keys as well as domain constraints sulting ontology and, optionally, enrich it with links for relational attributes, and generate a domain ontol- to other domain ontologies either manually or semi- ogy by mining information not only from the database automatically by following suggestions produced by instance, but also from other sources of domain knowl- some algorithm [82]. In some cases, the discovery of edge. The generation of the ontology is based on a set correspondences between the generated ontology and of heuristic rules, that essentially try to implement the other domain ontologies or lexical databases, such as inverse transform of the logical database design pro- WordNet, is performed automatically (e.g. in [16] and cess, sketched in Table 1. The variety of assumptions [68] respectively). An ER model can also be taken of traditional reverse engineering approaches that was into account for the validation of the resulting ontol- mentioned before is less evident here, as the vast ma- ogy [4]. However, in contrast with early reverse en- jority of methodologies imply the existence of a 3NF gineering methods, no approach utilizes queries and relational schema with full knowledge of keys and do- SQL DML statements in order to extract more accu- rate domain ontologies. A summary of the information main constraints, while the output is an ontology usu- sources considered by approaches of this category for ally expressed in OWL or another DL language, in the the generation of a domain ontology is depicted in Ta- general case. ble 6. Apart from this typical line of work, there are also Regarding data accessibility, all approaches output a couple of other relevant groups of methods. The first a materialized ontology. This is due to the fact that one considers an ER model as the input to the ontol- the main motivation of this group of approaches is the ogy creation process, while the second one uses inter- learning of a new ontology that can be later used to mediate models for the transition from the relational to other applications and thus, needs to be materialized. the ontology model. In the first case, the input for the This is also reflected in the absence of mapping rep- process is often not available, as conceptual schemas resentation for the majority of these approaches, since usually get lost after the logical design phase or sim- the relational database is used as a knowledge source ply exist in the minds of database designers. 
There- in this context and the storage of mappings is of little fore, the practical utility of these tools is arguable, al- interest as they will not be reused. As far as the ontol- though they tend to generate more expressive ontolo- ogy language of the resulting ontology, there seems to gies, compared to approaches that work on a relational be consensus about the use of OWL, with very few ex- schema. This is natural, since an ER model is a con- ceptions [11,84,117]. Vocabulary reuse is not possible ceptual model and most, if not all, of its features can in this category of approaches, although some of them, be represented in an OWL ontology. Approaches, such as mentioned before, try to establish links with existing as ER2WO [127], ERONTO [122], ER2OWL [52] and domain ontologies, in order to better ground the terms the work of Myroshnichenko and Murphy [92] gen- of the newly generated ontology. A shortcoming that is erate ontologies that cover the entire range of OWL common among all approaches that apply reverse en- species – from OWL Lite to OWL Full – by applying gineering techniques is the structural similarity of the a set of rules that translate constructs of the ER model generated ontology with the database schema. to OWL axioms. Thus, these tools do not need to ap- In contrast with the tools of Section 5.2.1, which ply reverse engineering in the relational schema and export the database contents as RDF graphs and, in are restricted to a simple model translation. The sec- fact, create instances of RDFS classes, approaches ond group of approaches that deserves a mention uses that perform reverse engineering do not always popu- an intermediate model for the generation of an OWL late the generated ontology classes with data from the ontology from a relational database [13,65] and con- database, thus restricting themselves in the generation 24 D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey

Table 6 [120], in which a set of Horn rules that covers all pos- List of input information sources for approaches of Section 5.2.2 sible relation cases is proposed. Input sources Approach As the sets of rules are, to a large extent, repeated Schema Data Other User across different approaches, we refrain from present- Sources √ √ √ ing separately each approach and instead present in- Astrova (2004) formally the gist of the most significant categories of [11] √ rules proposed: Astrova (2009) [12] √ √ 1. Default rules. These are the rules describing and Buccella et al. extending the basic approach for the case of the (2004) [29] √ OWL model, i.e. a relation maps to an OWL DB2OWL [55] √ class, non-foreign key attributes are mapped to DM-2-OWL [5] √ OWL datatype properties and foreign key at- Juric,´ Banek, WordNet tributes are mapped to OWL object properties Skocirˇ [68] √ √ with domain and range the OWL classes corre- Lubyte, Tessaris (2009) [84] sponding to the parent and child relations respec- √ R2O [116] ER model for tively. Moreover, a relation tuple maps to an in- validation dividual of the respective OWL class. √ √ √ RDBToOnto [34] 2. Hierarchy rules. Such rules identify hierarchies √ ROSEX [43] of classes in the relational schema. According √ Shen et al. (2006) to them, when two relations have their primary [112] √ √ keys linked with a foreign key constraint, they SOAM [82] Lexical are mapped to two OWL classes, the one being repositories √ √ √ a superclass of the other. More specifically, the SQL2OWL [4] ER model for class that corresponds to the parent relation is su- validation √ √ perclass of the class that corresponds to the child L. Stojanovic, N. Stojanovic, Volz relation. This kind of rules often conflicts with (2002) [117] fragmentation rules. √ Tirmizi, Sequeda, 3. Binary relationships rules. These rules identify Miranker (2008) a relation that can be mapped to an OWL object [120] property. The primary key of such a relation is composed of foreign keys to two other relations of an ontology TBox. Approaches that do populate the that are mapped to OWL classes, while the re- generated ontology with instances tend to materialize lation itself has no additional attributes. One of them, with the exception of DB2OWL [55] and Ultra- these OWL classes constitutes the domain and wrap [110] tools that offer dynamic SPARQL based the other one the range of the generated object access. property, with the choice being left to the hu- The heuristic rules on which the ontology genera- man user who may supervise the procedure. In tion are often based on the separation of relations, ac- order to overcome this obstacle, automatic meth- cording to the number of attributes that constitute the ods often create two inverse object properties. In primary key, the number of foreign keys and the inter- case a relation has additional attributes, it cor- section of primary and foreign keys of a relation. In responds to a binary relationship with descrip- most cases, these rules are described informally [29, tive attributes. Nevertheless, as this feature is not 55,82,112,116,117] and sometimes, only through con- directly supported in OWL, it is mapped to an venient examples that cannot be generalized [11,12]. OWL class with appropriate datatype properties, Moreover, few methods yield semantically sound re- much like in the default rules case. sults, as they often contain faulty rules that misin- 4. Weak entity rules. 
These rules identify weak en- terpret the semantics of the relational schema, while tities and their corresponding owner entities. A in other cases, OWL constraints are misused. Two weak entity is usually represented with a relation of the most complete and formalized approaches are that has a composite primary key that contains a SQL2OWL [4] and the work of Tirmizi and colleagues foreign key to another relation, which represents D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey 25

Such relations are still mapped to OWL classes, however the semantics of the identifying relationship that connects a weak entity with its owner is difficult to specify. Some rules, in fact, choose to interpret the relationship between the two relations as a pair of inverse hasPart and isPartOf object properties. This kind of rule also conflicts with the multi-valued property rules.
5. n-ary relationship rules. These rules identify n-ary relationships, usually in cases where the primary key of a relation is composed of foreign keys to more than two other relations that are mapped to OWL classes. Essentially, such rules are not different from default rules, given that n-ary relationships cannot be directly expressed in OWL, hence they are mapped to an OWL class.
6. Fragmentation rules. These rules identify relations that have been vertically partitioned and, in fact, describe the same entity. The relations identified are mapped to the same OWL class. The identification relies on the presence of the same primary key in all implicated relations, much like in the rules that discover hierarchies of classes.
7. Multi-valued property rules. These rules try to identify relations that act as multi-valued attributes. As there are quite a few ways to encode a multi-valued attribute in a relational schema, it is difficult to recognize such an attribute. Rules of this kind often assume that a relation that has a composite primary key containing a foreign key referencing a parent relation can be translated to a datatype property with domain the class that corresponds to the parent relation. As mentioned before, these rules are based on the same assumptions as the weak entity rules.
8. Constraint rules. These rules exploit additional schema constraints, which are present in SQL DDL statements. Such schema constraints include restrictions on non-nullable and unique attributes, as well as domain constraints regarding an attribute's datatype. These are usually mapped to appropriate OWL cardinality axioms on datatype properties, inverse functional properties and either global (rdfs:range axioms) or local universal quantification axioms (e.g. an owl:allValuesFrom axiom applied to a specific OWL class), respectively.
9. Datatype translation rules. These rules are essentially translation tables defining correspondences between SQL datatypes on the one hand and XML Schema types, which are typically used as datatypes in RDF and OWL, on the other. Such correspondences are defined in detail in the SQL standard [2].

In some cases, these rules fail to correctly interpret the semantics of the relational schema, while in other cases, OWL constraints are misused. Two of the most complete and formalized approaches are SQL2OWL [4] and the work of Tirmizi and colleagues [120].

The rule categories used by every method performing reverse engineering for the generation of a domain-specific ontology are shown in Table 7, together with a note on whether the created ontology is populated with instances from the database. It should be noted that this categorization of rules is somewhat rough, thus slight differences between approaches (e.g. how approaches handle specific constraints, such as primary key, uniqueness, not null constraints etc.) cannot be revealed in Table 7. A similar, more analytic overview of a limited number of approaches, as well as a rather technical table that gathers the precise RDFS and OWL elements used by each approach, can be found at [111]. Approaches that include conflicting rule categories, as the ones pinpointed earlier, often rely on the decision of the human user or introduce further premises in order to differentiate between them. However, the validity of these proposals is arguable, since relational schemas that serve as counterexamples can easily be thought of.

It can easily be seen that the rules mentioned before only investigate the relational database schema and do not consider database contents, as some reverse engineering approaches do by exploiting data correlations in the database instance and employing data mining techniques. Until now, very few methods belonging to this category have incorporated some form of data analysis. Astrova [11] examines correlation among key attributes, non-key attributes and data in key attributes between two relations, in order to extract various types of class hierarchies (e.g. both single and multiple inheritance hierarchies). However, this work is structured around informal convenient examples and its results cannot be generalized.
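To make the flavour of the constraint and datatype translation rules (categories 8 and 9 above) concrete, the following minimal sketch shows how such a rule-based translator might turn the declaration of a single column into OWL/RDFS axioms serialized as Turtle. It is only an illustration of the general idea under simplifying assumptions; the table, column and namespace names are hypothetical and the snippet does not reproduce the algorithm of any particular tool.

    # Illustrative sketch of constraint (8) and datatype translation (9) rules.
    # Hypothetical example: column "email" of table "person".

    XSD_TYPES = {"VARCHAR": "xsd:string", "INTEGER": "xsd:integer", "DATE": "xsd:date"}

    def translate_column(table, column, sql_type, not_null=False, unique=False):
        """Emit Turtle axioms for one column of a relation already mapped to a class."""
        cls = "ex:" + table.capitalize()          # class produced by the default rule
        prop = "ex:" + table + "_" + column       # datatype property for the column
        axioms = [prop + " a owl:DatatypeProperty ; rdfs:domain " + cls +
                  " ; rdfs:range " + XSD_TYPES.get(sql_type, "rdfs:Literal") + " ."]
        if not_null:
            # NOT NULL: every instance carries at least one value for the property.
            axioms.append(cls + " rdfs:subClassOf [ a owl:Restriction ; "
                          "owl:onProperty " + prop + " ; owl:minCardinality 1 ] .")
        if unique:
            # UNIQUE: inverse functional property (OWL Full on a datatype property).
            axioms.append(prop + " a owl:InverseFunctionalProperty .")
        return axioms

    if __name__ == "__main__":
        for axiom in translate_column("person", "email", "VARCHAR",
                                      not_null=True, unique=True):
            print(axiom)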

Table 7 Rules applied to each approach of Section 5.2.2

Approach | Rule categories applied (of 9) | Instance population
Astrova (2004) [11] | 7 | Yes
Astrova (2009) [12] | 7 | Yes
Buccella et al. (2004) [29] | 6 | No
DB2OWL [55] | 5 | Yes
DM-2-OWL [5] | 5 | No
Jurić, Banek, Skočir [68] | 4 | No
Lubyte, Tessaris (2009) [84] | 4 | No
R2O [116] | 7 | No
RDBToOnto [34] | 3 | Yes
ROSEX [43] | 3 | No
Shen et al. (2006) [112] | 6 | Yes
SOAM [82] | 8 | Yes
SQL2OWL [4] | 8 | Optional
L. Stojanovic, N. Stojanovic, Volz (2002) [117] | 6 | Yes
Tirmizi, Sequeda, Miranker (2008) [120] | 6 | Yes

A much more interesting effort is RDBToOnto [34], which is the only tool of this category that proposes a formal algorithm for exploiting data from the database in order to extract class hierarchies. The main intuition behind RDBToOnto is the discovery of attributes that present the lowest variance of contained values. The assumption is that these attributes are probably categorical attributes that can split horizontally a relation R initially mapped to a class C(R), separating its tuples into disjoint sets, which in turn can be mapped to disjoint subclasses of the initial C(R) class. This data analysis step is applied to attributes that contain some lexical clue in their name, which can potentially classify them as categorical attributes. RDBToOnto combines this data analysis step with some of the most common heuristic rules already mentioned and generates an OWL ontology.

As most of the approaches that belong to this group are based on heuristics, their efficiency and ability to interpret correctly the full meaning of a relational schema is doubtful, since they cannot capture all possible database design practices and guess the original intention of the designer. The most significant problems with this kind of approaches are the following.

Identification of class hierarchies. Typically, hierarchies in relational databases are modeled by the presence of a primary key that is also a foreign key referring to the primary key of another relation. We have already pointed out the fact that this design can also imply that a relation has been vertically partitioned and information on the same instance has been split across several relations. However, this is only one of the possible alternative patterns that can be used for expressing generalization hierarchies in the relational model [77]. As we have seen, identification of categorical attributes and analysis of data may provide hints for mining subclasses.

Interpretation of primary keys. Primary keys and, generally, attributes with unique values are usually transformed to inverse functional datatype properties. The introduction of such axioms leads to OWL Full ontologies, which seem to intimidate ontology designers. Despite the fact that state of the art reasoners, such as Pellet, support inverse functional datatype properties, some methods, in order to avoid OWL Full, model the primary key of a relation as a separate class and translate the primary key constraint to an inverse functional object property. It is also quite surprising that the OWL 2 hasKey property is not considered by any approach for the modeling of primary key attributes, although this may be due to the fact that most methods of this group were proposed before the introduction of OWL 2. The owl:hasKey property also provides a natural solution for composite primary keys, the transformation of which has not been mentioned in any of the methods reviewed.

Elicitation of highly expressive ontology axioms. So far, no method exists that is able to map successfully expressive OWL features, such as symmetric and transitive properties, or even more advanced OWL 2 constructs like property chains, since these features cannot be elicited by inspection of the database schema alone. Furthermore, among the approaches reviewed, there are divergent opinions on the expression of domain constraints for attributes. Some approaches suggest translating these constraints to rdfs:range axioms on the corresponding OWL datatype property, while others suggest the use of universal quantification axioms for a specific class with the owl:allValuesFrom construct.
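To illustrate the two modelling options for keys that were just discussed, the short sketch below prints, for a hypothetical relation enrolment(student_id, course_id) mapped to a class ex:Enrolment, both the pre-OWL 2 encoding based on inverse functional datatype properties and the OWL 2 owl:hasKey alternative, which also covers composite keys. The names and namespaces are illustrative assumptions, not taken from any of the reviewed tools.

    # Two alternative encodings of a (possibly composite) primary key.

    def key_axioms_inverse_functional(key_props):
        """Pre-OWL 2 style: each key attribute becomes an inverse functional
        datatype property, which pushes the ontology into OWL Full."""
        return ["ex:" + p + " a owl:DatatypeProperty , owl:InverseFunctionalProperty ."
                for p in key_props]

    def key_axioms_has_key(cls, key_props):
        """OWL 2 style: a single owl:hasKey axiom, handling composite keys naturally."""
        return ["ex:" + cls + " owl:hasKey ( " +
                " ".join("ex:" + p for p in key_props) + " ) ."]

    if __name__ == "__main__":
        composite_key = ["studentId", "courseId"]
        for axiom in key_axioms_inverse_functional(composite_key):
            print(axiom)
        for axiom in key_axioms_has_key("Enrolment", composite_key):
            print(axiom)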

Fig. 4. Mapping an existing ontology to a relational database.

6. Mapping a Database to an Existing Ontology

So far, we have come across tools and methods that do not require a priori knowledge of existing ontologies. In this section, we examine approaches that take for granted the existence of one or more ontologies and either try to establish mappings between a relational database and these ontologies or exploit manually defined mappings for generating an RDF graph that uses terms from these ontologies. These two alternatives are depicted in Figure 4 (lines 2 and 1 respectively). The underlying assumption adopted by all approaches is that the selected ontologies model the same domain as the one modeled by the relational database schema. Therefore, on the one hand, there are approaches that apply schema matching algorithms or augment reverse engineering techniques like the ones discussed in Section 5.2.2 with linguistic similarity measures, in order to discover mappings between a relational database and an appropriately selected ontology. These mappings can then be exploited in various contexts and applications, such as heterogeneous database integration or even database schema matching [7,49]. On the other hand, there are tools that share quite a lot with the tools of Section 5.2.1, receiving user-defined mappings between the database schema and one or more existing ontologies and generating an RDF graph that essentially populates the ontologies with instance data from the database.

Given that the database schema and the ontology are developed independently, they are expected to differ significantly and, often, the domains modeled by the two are not identical, but overlapping. Hence, mappings are usually not trivial or straightforward and, in the general case, they are expressed as complex expressions and formulas. Thus, in any case, tools should be able to discover and express complex mappings, with the first task being particularly challenging and very hard to automate. As a consequence, the vast majority of approaches in this category are either manual or semi-automatic. In the latter case, some user input is required by the matching algorithm, usually in the form of initial simple correspondences between elements of the database schema and the ontology. A purely automated procedure would need to make oversimplifying assumptions on the lexical proximity of corresponding element names in the database schema and the ontology that are not always true. Such assumptions tend to overestimate the overall efficiency of mapping discovery methods, as for example in MARSON [66] and OntoMat-Reverse [125].

As far as data accessibility is concerned, unlike all categories of tools encountered so far, it is not applicable to approaches that simply output a set of mapping expressions (line 2 in Figure 4), an observation that prevented us from including data accessibility as a classification criterion in Figure 1. For such tools, the main aim is the generation of mappings and, thereby, they do not offer RDF dumps of database contents or semantic query based access. These tools left aside, this category supports all three data access scenarios.

Regarding the mapping language parameter, although reusability of mappings is important for this category of approaches, the same comment can be made as in Section 5.2.1, i.e. every tool uses its own proprietary mapping language. Again, expectations are that the level of adoption of R2RML will increase in the near future. All approaches support OWL ontologies, while vocabulary reuse is not possible for some tools that require strictly one ontology to carry out the mapping process. That being said, the vocabulary reuse parameter is also not applicable to approaches that merely discover mappings between a relational database and an ontology, for the same reason as in data accessibility. Another distinguishing feature of tools in this category is the presence of a graphical user interface in the majority of cases. Usually, the interface allows the user to either graphically define the mappings between the database and the ontology, supported by the visualization of their structure, or validate automatically discovered mappings.
Starting off with approaches where mappings are manually defined, we mention R2O [18], a declarative mapping language that allows a domain expert to define complex mappings between a relational database instance and one or more existing ontologies. R2O is an XML-based language that permits the mapping of an arbitrary set of data from the database – as it contains features that can approximate an SQL query – to an ontology class. R2O's expressiveness is comparable to the most advanced mapping languages discussed in Section 5.2.1, supporting features such as conditional mappings or the specification of URI generation schemes. The ODEMapster engine [17] exploits mappings expressed in R2O in order to create ontology individuals from the contents of the database, allowing for either materialized or query-driven access.

Manual mappings between a database schema and a given ontology are defined in DartGrid [38] and its successor Linked Data Mapper [130]. DartGrid defines a typical database integration architecture, a critical component of which is the definition of mappings between each local database and a global RDFS ontology. Correspondences between components of the two models are defined via a graphical user interface and Local as View (LAV) mappings (database relations expressed as a conjunctive query against the ontology) are generated in the background and stored in an RDF/XML format. DartGrid also allows ontology-based data access, by performing translation of SPARQL queries to SQL, exploiting the mappings defined. Linked Data Mapper integrates DartGrid's interface with Virtuoso Universal Server, therefore taking advantage of the capabilities and rich features of the latter and enabling, among others, ontology-based data access to the contents of a relational database.

Graphical user interfaces for the manual definition of mappings are also included in the VisAVis [71] and RDOTE [123] tools. In both approaches, SQL is used as the means for the specification of the data subset that will be mapped to instances of an ontology class. In the case of VisAVis, an SQL query is stored as a literal value of a special purpose datatype property for the corresponding ontology class, while in RDOTE the appropriate SQL query is stored in a configuration file together with possible transformation functions applied to elements of the database. VisAVis allows for query based access, while RDOTE outputs an RDF graph dump that uses classes and properties from the existing ontology. The results of SQL queries are also used for the population of a template of an RDF/XML file in the RDB2Onto tool [76].

The RDB2OWL framework [30] adopts a slightly more complex road for the storage and execution of complex mappings between a relational database and one or more existing ontologies. RDB2OWL stores the mappings in an external database schema, especially tailored for the purpose. This database schema is, in fact, a mapping meta-schema and contains relations and attributes that store the description of the database schema and ontologies to be mapped, as well as the mappings themselves. RDB2OWL allows, similarly to previously mentioned tools, for the selection of a data subset from the database, thus simulating a basic SQL query. Such SQL queries are in fact produced by SQL meta-level queries and executed for the population of the OWL ontologies considered.
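The following sketch illustrates, in a simplified form, the common idea behind these manually defined mappings: an SQL query selects the data subset, a URI generation scheme mints the subject resources and the selected columns are mapped to properties of an existing ontology class. The mapping structure shown here is a hypothetical stand-in and does not follow the actual syntax of R2O, RDOTE or RDB2OWL.

    # Simplified illustration of a manual "SQL query to ontology class" mapping.
    import sqlite3

    MAPPING = {
        "class": "ex:Professor",
        "sql": "SELECT id, name FROM staff WHERE position = 'professor'",
        "uri_template": "http://example.org/person/{id}",
        "properties": {"name": "foaf:name"},
    }

    def materialize(conn, mapping):
        """Execute the mapping's SQL query and emit triples for every returned row."""
        cursor = conn.execute(mapping["sql"])
        columns = [c[0] for c in cursor.description]
        for row in cursor:
            record = dict(zip(columns, row))
            subject = "<" + mapping["uri_template"].format(**record) + ">"
            yield subject + " a " + mapping["class"] + " ."
            for column, prop in mapping["properties"].items():
                yield subject + " " + prop + " \"" + str(record[column]) + "\" ."

    if __name__ == "__main__":
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE staff (id INTEGER, name TEXT, position TEXT)")
        conn.execute("INSERT INTO staff VALUES (1, 'Ada Lovelace', 'professor')")
        for triple in materialize(conn, MAPPING):
            print(triple)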
Moving on to approaches that discover mappings in a semi-automatic or even automatic way, we first mention the MAPONTO tool [8]. MAPONTO is a semi-automatic tool that receives as input from a human expert initial correspondences between relation attributes and ontology properties. The goal of MAPONTO is the extraction of LAV mappings of the form R(X̄) :- Φ(X̄, Ȳ), where R(X̄) is some relation and Φ(X̄, Ȳ) is a conjunctive formula over ontology concepts, with X̄, Ȳ being sets of variables or constants. Such mappings define the meaning of database relations in terms of an ontology and can be harnessed in a database integration context. In short, MAPONTO views the relational schema as a graph (where foreign key links are the connecting edges) and, for each relation in it, tries to build a tree that defines the semantics of the relation. Foreign key links are traversed, adding visited relations to the tree, while the algorithm is aware of the various relation structures that may represent binary or n-ary relationships or weak entities (following the typical reverse engineering rules mentioned in Section 5.2.2). By taking into account the information initially provided by the user on the correspondences between relation attributes and ontology properties, MAPONTO is able to encode, in a straightforward manner, the generated tree in the relational graph in a conjunctive formula that uses ontology terms. With slight modifications, the same algorithm can be used for investigating the opposite direction, i.e. the discovery of Global as View (GAV) mappings that express concepts of the ontology as conjunctive queries over the database. Such GAV mappings can then be directly exploited for the population of an existing ontology with database instances.
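As a purely illustrative instance of such a LAV mapping (the relation and ontology terms below are hypothetical and not drawn from the MAPONTO evaluation), a relation Employee(ssn, name, deptId) could be described against an ontology with concepts Person and Department as follows:

    Employee(ssn, name, deptId) :-
        Person(x), hasSSN(x, ssn), hasName(x, name),
        worksFor(x, d), Department(d), hasDeptId(d, deptId)

Read as a LAV mapping, the rule states that every Employee tuple denotes a Person who works for some Department, with the relation's attributes bound to the corresponding ontology properties; inverting the direction of such rules yields the GAV mappings mentioned above.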

Another interesting approach in this group is the one used in MARSON [66]. At first, relations are categorized into groups, according to whether they represent entities or relationships, following the reverse engineering process of Chiang and colleagues [40]. Vectors of coefficients, called virtual documents, are then constructed for every relation and attribute of the database schema. The description of a relation takes into account the description of relations connected to it via foreign keys, while the description of an attribute incorporates the description of its parent relation and of other relations connected to the latter, as well as the attribute's domain. Likewise, virtual documents for every class and property of the ontology are constructed, and similarity between the elements of the database schema and the ontology is computed with pairwise calculation of their identifying vectors' cosine distance. Essentially, the intuition is that the relational schema is interpreted as a graph and some form of elementary graph matching is performed. MARSON adopts the simplest case, in which an element is described only by its name, an obvious shortcoming that eliminates the possibility of matching e.g. a relation named academic_staff with a faculty class. As a next step, MARSON uses these simple correspondences between relations and OWL classes in order to test the consistency of the attribute-to-property mappings discovered, while it exploits database data and ontology individuals to discover more complex mappings, e.g. finding the most appropriate categorical attribute of a relation that splits it into subsets, which in turn correspond to disjoint OWL subclasses of a superclass. The discovery of the categorical attribute in a relation is accomplished by computing the information gain (a measure known from the information theory field) for every attribute in a relation, a procedure that raises questions regarding the computational efficiency of the approach, as the number of string similarity checks needed for the computation of the information gain is directly related to the data volume of the database. Nevertheless, MARSON is the only effort made so far to completely automate the process of discovering mappings between a relational database and a given ontology. As such, it relies on the assumption of lexical proximity between the names of elements in the database and the ontology during the calculation of similarity coefficients and on the assumption of the existence of a sufficient number of individuals in the ontology, required for the computation of the information gain.

OntoMat-Reverse [125] is part of the OntoMat-Annotizer semantic annotation framework, already mentioned in Section 5.1, where the scenario of the indirect annotation of web presentation was described. The direct annotation of the database schema, from which the data of a dynamic web page is retrieved, is the second possible workflow scenario of this framework. To this end, OntoMat-Reverse discovers mappings between the database schema and an existing ontology by complementing the reverse engineering rules of Stojanovic and colleagues [117] with a lexical similarity metric. In essence, it recognizes special structures in the database schema (in much the same way as the methods in Section 5.2.2) and tries to match them lexically with the classes and properties of the existing ontology. The discovered mappings then trigger the population of the existing ontology with instances from the database. The efficiency of this tool is bounded by both the inherent inability of reverse engineering methods to interpret correctly all database design choices and the reliance on lexical methods to find the best match. Naturally, validation from an expert is required at the end of the process.

Other tools that rely on lexical similarity measures between database and ontology elements are RONTO [94], D2OMapper [128] and AuReLi [100]. In RONTO, linguistic similarity measures are computed between corresponding structural elements of a database and an ontology. Therefore, relation, non foreign key and foreign key attribute names in the database are compared to class, datatype and object property names in the ontology, respectively, to find the closest lexical matches, which are then validated by the user. A similar matching approach is followed by D2OMapper, which uses the ER2WO [127] translation rules from an ER model to an OWL ontology. However, as these rules present some peculiarities, such as the mapping of binary relationships to OWL classes, it is arguable whether D2OMapper is able to discover mappings between a relational database and an ontology that has not been produced by the ER2WO tool. AuReLi, on the other hand, uses several string similarity measures to compare relation attributes with properties of several a priori specified ontologies. Furthermore, AuReLi tries to link database values with ontology individuals by issuing queries to the DBPedia dataset (http://dbpedia.org/). The results of the algorithm are then presented to the human user for validation. AuReLi is implemented as a D2R Server [25] extension offering Linked Data and SPARQL data access.

In contrast with the string-based techniques described in the previous paragraph, StdTrip [106] proposes a structure-based framework that incorporates existing tools, in order to find ontology mappings between a rudimentary ontology that is generated from a relational database following the basic approach and a set of given ontologies. As in AuReLi, the results of the ontology alignment algorithm are presented as suggestions to the human user, who selects the most suitable ontology mapping.
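A toy sketch of the similarity computations that the approaches above rely on is given below: two bag-of-words "virtual documents", one for a schema element and one for an ontology element, are compared with the cosine measure. The token sets are invented for illustration; real descriptions, as noted for MARSON, also fold in neighbouring relations, attribute domains and, in the string-based tools, dedicated lexical similarity metrics.

    # Toy cosine similarity between bag-of-words descriptions of a relation and a class.
    import math
    from collections import Counter

    def cosine(tokens_a, tokens_b):
        a, b = Counter(tokens_a), Counter(tokens_b)
        dot = sum(a[t] * b[t] for t in set(a) & set(b))
        norm = math.sqrt(sum(v * v for v in a.values())) * \
               math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    # Hypothetical virtual documents built from element names and their neighbours.
    relation_doc = ["academic", "staff", "department", "name", "email"]
    class_doc = ["faculty", "member", "department", "name", "email"]

    print(round(cosine(relation_doc, class_doc), 3))
    # A description limited to the bare names ("academic_staff" vs. "faculty")
    # would score 0, which is exactly the shortcoming noted for MARSON above.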
We close this section by mentioning MASTRO [32], a framework that enables the definition of mappings between a relational database and an ontology and the issuing of conjunctive queries expressed in EQL, an SQL-like language. The MASTRO framework is an ontology-based data access tool suite that reformulates a conjunctive query expressed over a DL-LiteA ontology to an SQL query that is evaluated by the relational database. (DL-LiteA belongs to the DL-Lite family of languages, on which the OWL 2 QL profile is based, and is a proper subset of OWL DL.) DL-LiteA has been proved to allow for tractable reasoning, while an algorithm has been proposed for the reformulation of a conjunctive query to SQL [98]. This algorithm differs considerably from the typical SPARQL to SQL rewriting methods [37,44,50,83] that are employed by other tools offering dynamic data access. MASTRO is also complemented with the OBDA plugin for the popular ontology editor Protégé [99], which allows for the definition of GAV mappings between a database and an ontology via a graphical interface.

7. Discussion

Having gone through all the different categories of approaches that pertain to the database to ontology mapping issue, we attempt to summarize and stress some of the main points addressed throughout the paper regarding the main challenges faced by each group of approaches. We also argue on why a common evaluation methodology for all proposed solutions is very difficult, if not impossible, to develop, as some categories of approaches can only be evaluated on a qualitative basis, rather than in terms of an impartial quantified metric. Table 8 lists all database to ontology mapping tools and approaches mentioned throughout the paper, together with values for each of the descriptive parameters identified in Section 4.

Except for the points already discussed in Sections 5 and 6, we point out the limited software availability, as less than half of the approaches are accompanied by publicly available software and even fewer among them are actively maintained and have reached a certain level of maturity. We can also make an informal observation with respect to the correlation of the level of automation of approaches with their semantic awareness, i.e. the amount of data semantics they elicit correctly. This happens mainly because a human expert is the most reliable source of semantics of a database schema. Hence, purely manual tools, where mappings are given by the human user, capture by definition the true meaning of the logical schema, whereas purely automatic tools can rarely achieve such levels of successful interpretation. Semi-automatic tools, which interact with the human user, fall somewhere in between, thus we could say that there is some trade-off between the level of automation of an approach and the level of semantic awareness it achieves.

Going into further detail with respect to the challenges and efficiency of each category of approaches, we could observe that the generation of a database schema ontology (Section 5.1) is rarely a goal per se, and the majority of methods reviewed corroborate this statement. This happens because a database schema ontology is a faithful representation of the relational schema (and in some cases, of the contents as well) of a database. As such, it does not contain any other domain semantics that are usually of interest to a human user or an external application, which would rather abstract the low-level organization details of a database and interact with it in terms of higher-level concepts. That is why most approaches complement the generated ontology with additional axioms that link its terms with concepts from a domain-specific ontology and offer access to the database contents by allowing an external human or software agent to issue queries expressed over this domain-specific ontology. The transformation of the issued semantic queries to SQL queries over the database is performed by taking into account the defined correspondences between the terms of the database schema ontology and the domain-specific one. However, the benefits of adopting a database schema ontology as an in-between layer, instead of selecting a direct mapping definition between a database and an ontology, have not yet been shown in practice for the task of query rewriting.

On the contrary, reasoning mechanisms are used in cases where an ontological representation of a relational database is employed to check the validity of database schema constraints. The challenge with this kind of methods (e.g. [79,81]) is, on the one hand, to bridge the gap between the closed world semantics of database systems and the open world assumption of ontologies and, on the other hand, to achieve an ontological representation that does not increase the complexity of reasoning tasks beyond acceptable levels.

Table 8: Descriptive parameters of approaches reviewed (one row per approach, with columns for Level of Automation, Data Accessibility, Mapping Language, Ontology Language, Vocabulary Reuse, Software Availability, Graphical User Interface and Main Purpose).
Methods dealing with this problem use a variety of technologies, including rules, SPARQL queries, as well as the metamodeling and the inverse functional datatype property features of OWL. However, there are strong doubts regarding the practical value of such solutions, given that validation of database schema constraints is already performed efficiently by database management systems, does not constitute an important database performance factor, and it is highly unlikely that solutions involving ontology reasoning engines will prove more efficient. In fact, none of these works offers a practical evaluation of their proposals, which seem more like theoretical exercises examining the interactions of the relational model with OWL ontologies. A suitable evaluation methodology would probably need to assess the efficiency of each approach with respect to the number of relations and constraints in a database schema as well as the size of the data in the database. For each test case, it could then calculate the total time needed for constraint validation and compare it with the time needed by a database management system.

Moving on to approaches that create a new domain ontology (Section 5), the main challenges faced by tools of this category include the definition of feature-rich mapping languages that allow for a fully customizable export of database contents to RDFS ontologies and for the storage of information required by query rewriting algorithms, the search for efficient query rewriting algorithms, the extraction of domain knowledge from the database schema and the, as automated as possible, enrichment of the generated ontology with knowledge from other relevant domain sources. However, these high-level goals depend on the motivation considered by each approach and, usually, cannot all be sought after together. The first two of them are normally faced by the tools of Section 5.2.1, the majority of which support dynamic query-based data access, while the third one is the subject of the approaches presented in Section 5.2.2.

The definition of new expressive and user-friendly database to ontology mapping languages has been a goal for researchers in the past, but the standardization process of R2RML that is currently under way has already moved the focus away from research on new languages. On the contrary, research on efficient query rewriting algorithms still remains a challenge for tools aiming at ontology-based data access and constitutes an important success factor for them. In fact, the emergence of efficient strategies for the transformation of SPARQL queries to equivalent SQL ones (e.g. [37,50]) has led to the consideration of such database to ontology mapping tools as an alternative to triple stores, at least in the case of data that resides in relational databases and is not native RDF. Since relational database management systems, due to their robust optimization strategies, outperform triple stores in the query answering task by factors of up to one hundred [26], it is not surprising that, as long as SPARQL to SQL rewriting algorithms do not introduce a large delay, the combination of a relational database with a dynamic access database to ontology mapping tool will still outperform triple stores. This is true for at least two such tools, namely D2R Server [25] and the RDF Views feature of Virtuoso Universal Server [28], which have been found to perform better than conventional triple stores [26] for large datasets.

Although the efficiency of database to ontology mapping systems that offer SPARQL data access can be quantified by calculating the response time to a query, it is not a trivial task to come up with an objective benchmark procedure for the fair and thorough comparison of the performance of such systems. Indeed, some of the evaluation methodologies that have been proposed present contradictory results regarding the performance of mapping tools and native triple stores with respect to SPARQL query answering [26,57]. A proper benchmarking methodology for this category of tools usually involves the construction of data generators outputting datasets of various sizes with desired properties and the proposal of sequences of queries of different complexity that are issued against these datasets. The efficiency of a system is then evaluated in terms of the response time for a sequence of these queries or, equivalently, in terms of the number of queries that can be executed by the system in a given time period. A procedure like the above is implemented in the Berlin SPARQL Benchmark [26], which is one of the most acclaimed benchmarks for storage systems that expose SPARQL endpoints and, as such, it also includes D2R Server and Virtuoso RDF Views in its tests. Another similar notable benchmark is the SP2Bench SPARQL performance benchmark [107], which can be applied to both triple stores and SPARQL to SQL rewriting systems, although none of the latter have been included in its tests.

Another measure that is more relevant to approaches that materialize the generated RDF graph is the time needed for the production of an RDF statement. Two of the factors that influence this measure are the total size and the complexity of the generated RDF graph [119]. Therefore, a suitable evaluation strategy would, once again, involve the consideration of RDF graphs of different sizes and complexities. However, this measure is usually of little interest, especially to approaches that are driven by the motivation of learning a domain ontology from a relational database and reusing it in some other application. In such cases, the production time of the ontology is not the key factor for success, since this procedure will usually be performed only once. What is more interesting for the end user and, at the same time, challenging for the tool is the elicitation of domain knowledge from the relational database instance, as well as knowledge extraction from additional information sources in case the relational database proves insufficient for the purpose.

As far as the above goals are considered, the efficiency of tools that learn an ontology from a relational database is very hard to quantify and measure objectively. Some systems measure the efficiency of their approach by comparing the generated ontology with an a priori known ER model corresponding to the input relational database. The above rationale could be somehow quantified, if pairs of relational database schemas and "gold standard" ontologies were specified, with the latter capturing the intended domain semantics contained in the database. Ideally, these test pairs would need to include relational schemas following non-standard design practices often met in real world applications. The efficiency of an approach would then be dependent on the amount of ontology constructs identified as expected and on the total number of test cases handled correctly.

Such a collection of test cases could also be employed for the evaluation of the performance of tools that discover mappings between a relational database and an ontology, presented in Section 6. In fact, a short collection of pairs of database schemas and ontologies has been created for the purpose of the evaluation of the MAPONTO tool [8] and has also been applied in the evaluation of MARSON [66]. This collection of schemas and ontologies, together with the related collection gathered in the relational-to-RDF version of the THALIA testbed (http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/THALIATestbed), could serve as the basis for the development of a full benchmark that will incorporate more complex real world schemas and ontologies from a variety of subject domains.

As the goal of mapping discovery tools is to automate as much as possible the mapping discovery process, a true challenge for them is to rely less on lexical similarity techniques, which pose restrictions on their effectiveness, and to try to solve the synonym and homonym problems with the aid of appropriate lexical reference sources and techniques that compare the structure of the two models. As far as approaches that support the manual definition of mappings between a database and an existing ontology are concerned, also presented in Section 6, the challenges are more or less the same as the ones previously mentioned for the tools of Section 5.2.1, i.e. the search for an expressive mapping language and efficient query rewriting for the case of tools that support dynamic data access. In addition to these high-level goals, tools of this kind also face the challenge of presenting to the end user an intuitive graphical interface that will allow him to define complex mappings, without requiring him to be familiar with the underlying mapping representation language. This is a very important prerequisite for the widespread use of such tools that, unfortunately, is usually not given enough attention, in favour of other more tangible indicators of performance.

From the above, it becomes evident that a non-superficial and thorough analysis of the efficiency of the reviewed approaches needs special care and cannot be performed in the context of the current paper. An additional practical difficulty to the ones already mentioned is the lack of a common standard application interface for software implementations, with the exception of tools that expose a SPARQL endpoint, which are easily pluggable to an external benchmark application, regardless of the implementation's programming language. This is not true for other categories of tools, which express their output in various forms, i.e. different ontology languages and/or different mapping representation formats, making their comparison cumbersome. As already stated, the only relevant comparative evaluation available is the Berlin SPARQL Benchmark and we encourage the interested reader to consult [26] for the detailed performance analysis of D2R Server and Virtuoso RDF Views. As far as evaluation of other tool categories is concerned, we gathered in one place (http://orpheus.cn.ece.ntua.gr/rdb-rdf) all the experimental datasets mentioned in the literature and used in the evaluation of individual approaches, hoping that they come of use for the evaluation of new relevant methods.
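For readers unfamiliar with the rewriting step that the systems measured by these benchmarks perform, the toy sketch below rewrites a single SPARQL triple pattern into SQL through a basic attribute-to-property mapping. It is a deliberately minimal, assumption-laden illustration; production engines such as D2R Server or Virtuoso RDF Views handle joins, filters, optional patterns and URI construction in a far more elaborate way.

    # Toy rewriting of one SPARQL triple pattern into SQL via a basic mapping.

    MAPPING = {
        # property -> (table, subject key column, value column); hypothetical schema.
        "ex:hasName": ("student", "id", "name"),
    }

    def rewrite_triple_pattern(subject_var, predicate, object_var):
        """Rewrite '?s <predicate> ?o' for a predicate mapped to a table column."""
        table, key, column = MAPPING[predicate]
        return ("SELECT " + table + "." + key + " AS " + subject_var.lstrip("?") +
                ", " + table + "." + column + " AS " + object_var.lstrip("?") +
                " FROM " + table)

    if __name__ == "__main__":
        print(rewrite_triple_pattern("?s", "ex:hasName", "?name"))
        # -> SELECT student.id AS s, student.name AS name FROM student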
8. Future Directions

In this paper, we tried to present the wealth of research work marrying the worlds of relational databases and the Semantic Web. We illustrated the variety of different approaches and identified the main challenges that researchers of this field face, as well as the proposed solutions. We close this paper by mentioning some problems that have only been lightly touched upon by database to ontology mapping solutions, as well as some aspects that need to be considered by future approaches.

1. Ontology-based data update. A lot of the approaches mentioned offer SPARQL based access to the contents of the database. However, this access is unidirectional. Since the emergence of SPARQL Update, which allows update operations on an RDF graph, the idea of issuing SPARQL Update requests that will be transformed to appropriate SQL statements and executed on the underlying relational database has become more and more popular. Some early work has already appeared in the OntoAccess prototype [63] and in the extensions of the D2RQ tool, D2RQ/Update (D2RQ/Update and D2R/Update Server homepage: http://d2rqupdate.cs.technion.ac.il/) and D2RQ++ [104]. However, as SPARQL Update is still under development and its semantics is not yet well defined, there is some ambiguity regarding the transformation of some SPARQL Update statements. Moreover, only basic (relation-to-class and attribute-to-property) mappings have been investigated so far. The issue of updating relational data through SPARQL Update is similar to the classic database view update problem, therefore porting already proposed solutions would contribute significantly to dealing with this issue (a rough illustration is sketched at the end of this section).

2. Mapping update. Database schemas and ontologies constantly evolve to suit changing application and user needs. Therefore, established mappings between the two should also evolve, instead of being redefined or rediscovered from scratch. This issue is closely related to the previous one, since modifications in either participating model do not simply incur adaptations to the mapping but also cause some necessary changes to the other model as well. So far, only few solutions have been proposed for the case of the unidirectional propagation of database schema changes to a generated ontology [42] and the consequent adaptation of the mapping [9]. The inverse direction, i.e. modification of the database as a result of changes in the ontology, has not been investigated thoroughly yet. On a practical note, both database trigger functions and mechanisms like the Link Maintenance Protocol (WOD-LMP) from the Silk framework [124] could prove useful for solutions to this issue.

3. Generation of Linked Data. A fair number of approaches support vocabulary reuse, a factor that has always been important for the progress of the Semantic Web, while a few other approaches try to discover automatically the most suitable classes or properties from popular vocabularies that can be mapped to a given database schema. Nonetheless, these efforts are still not adequate for the generation of RDF graphs that can be smoothly integrated in the Linking Open Data (LOD) Cloud (diagram at http://richard.cyganiak.de/2007/10/lod/). For the generation of true Linked Data, the real world entities that database values represent should be identified and links between them should be established, in contrast with the majority of current methods, which translate database values to RDF literals. Lately, a few interesting tools that handle the transformation of spreadsheets to Linked RDF data by analyzing the content of spreadsheet tables have been presented, with the most notable examples being the RDF extension for Google Refine [85] and T2LD [91]. Techniques like the ones applied in these tools can certainly be adapted to the relational database paradigm.

These aspects, together with the challenges enumerated in Section 7, mark the next steps for database to ontology mapping approaches. Although a lot of ground has been covered during the last decade, it looks like there is definitely some interesting road ahead in order to seamlessly integrate relational databases with the Semantic Web, turning it into reality at last.
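The rough illustration promised in the first item above: a single SPARQL INSERT DATA triple is pushed down to SQL through a basic attribute-to-property mapping. The mapping, the URI scheme and the reduction of the insert to an UPDATE statement are simplifying assumptions made for the sake of the example; actual prototypes such as OntoAccess or D2RQ/Update must additionally confront the view update problem for non-trivial mappings.

    # Minimal sketch: pushing one inserted triple down to SQL via a basic mapping.

    PROPERTY_MAP = {"ex:hasEmail": ("person", "email")}   # property -> (table, column)

    def subject_key(uri):
        """Extract the key from a URI minted as http://example.org/person/{id}."""
        return int(uri.strip("<>").rsplit("/", 1)[-1])

    def insert_data_to_sql(subject, predicate, literal):
        table, column = PROPERTY_MAP[predicate]
        sql = "UPDATE " + table + " SET " + column + " = ? WHERE id = ?"
        return sql, (literal, subject_key(subject))

    if __name__ == "__main__":
        statement, params = insert_data_to_sql("<http://example.org/person/7>",
                                               "ex:hasEmail", "ada@example.org")
        print(statement, params)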


Acknowledgements [12] I. Astrova, Rules for Mapping SQL Relational Databases to OWL Ontologies, in M.-A. Sicilia and M. D. Lytras, eds., and Semantics, pp. 415–424, Springer, 2009. Dimitrios-Emmanuel Spanos wishes to acknowl- [13] P. Atzeni, S. Paolozzi and P. Del Nostro, Ontologies and edge the financial support of ‘Alexander S. Onas- Databases: Going Back and Forth, in Proceedings of the sis Public Benefit Foundation’, under its Scholarships 4th International VLDB Workshop on Ontology-based Tech- Programme. niques for Databases in Information Systems and Knowledge Systems (ODBIS 2008), pp. 9 – 16, 2008. [14] S. Auer, S. Dietzold, J. Lehmann, S. Hellmann and D. Au- mueller, Triplify: Light-Weight Linked Data Publication from Relational Databases, in Proceedings of the 18th Interna- References tional Conference on World Wide Web (WWW’09), pp. 621– 630, ACM, 2009. [1] ISO/IEC 9075-1:2008, SQL Part 1: Framework [15] F. Baader, D. L. McGuinness, P. F. Patel-Schneider and (SQL/Framework), International Organization for Standard- D. Nardi, The Handbook: Theory, Imple- ization, 27 January 2009. mentation, and Applications, Cambridge University Press, [2] ISO/IEC 9075-14:2008, SQL Part 14: XML-Related Specifi- 2nd ed., 2007. cations (SQL/XML), International Organization for Standard- [16] M. Baglioni, M. V. Masserotti, C. Renso and L. Spin- ization, 27 January 2009. santi, Building Geospatial Ontologies from Geographical [3] S. Abiteboul, R. Hull and V. Vianu, Foundations of Databases, in F. Fonseca, M. A. Rodríguez and S. Levashkin, Databases, Addison-Wesley, New York City, NY, 1st ed., eds., GeoSpatial Semantics: Second International Conference 1995. (GeoS 2007), Lecture Notes in Computer Science, vol. 4853, [4] N. Alalwan, H. Zedan and F. Siewe, Generating OWL Ontol- pp. 195–209, Springer, 2007. ogy for Database Integration, in P. Dini, J. Hendler, J. Noll [17] J. Barrasa - Rodriguez and A. Gómez-Pérez, Upgrading Re- et al., eds., Proceedings of the Third International Conference lational Legacy Data to the Semantic Web, in Proceedings on Advances in Semantic Processing (SEMAPRO 2009), pp. of the 15th International Conference on World Wide Web 22–31, IEEE, 2009. (WWW’06), pp. 1069–1070, ACM, 2006. [5] K. M. Albarrak and E. H. Sibley, Translating Relational & [18] J. Barrasa, O. Corcho and A. Gómez-Pérez, R2O, an Ex- Object-Relational Database Models into OWL Models, in tensible and Semantically Based Database-to-Ontology Map- S. Rubin and S.-C. Chen, eds., Proceedings of the 2009 IEEE ping Language, in Second International Workshop on Seman- International Conference on Information Reuse & Integration tic Web and Databases (SWDB 2004), 2004. (IRI 2009), pp. 336–341, IEEE, 2009. [19] D. Beckett and J. Grant, SWAD-Europe Deliverable 10.2: [6] R. Alhajj, Extracting the Extended Entity-Relationship Model Mapping Semantic Web Data with RDBMSes, avail- from a Legacy Relational Database, Information Systems, able at: http://www.w3.org/2001/sw/Europe/ 28(6), pp. 597–618, 2003. reports/scalable_rdbms_mapping_report/, [7] Y. An, A. Borgida, R. J. Miller and J. Mylopoulos, A Seman- Technical Report, 2003. tic Approach to Discovering Schema Mapping Expressions, [20] A. Behm, A. Geppert and K. R. Dittrich, On The Migration in R. Chirkova and V. Oria, eds., Proceeding of 2007 IEEE of Relational Schemas and Data to Object-Oriented Database 23rd International Conference on Data Engineering (ICDE Systems, in Proceeding of the 5th International Conference 2007), pp. 206–215, IEEE, 2007. 
on Re-Technologies for Information Systems (ReTIS 1997), [8] Y. An, A. Borgida and J. Mylopoulos, Discovering the Se- pp. 13–33, 1997. mantics of Relational Tables through Mappings, Journal on [21] C. Ben Necib and J.-C. Freytag, Semantic Query Transforma- Data Semantics, VII, pp. 1–32, 2006. tion Using Ontologies, in B. C. Desai and G. Vossen, eds., [9] Y. An, X. Hu and I.-Y. Song, Round-Trip Engineering for Proceedings of 9th International Database Engineering & Maintaining Conceptual-Relational Mappings, in Z. Bellah- Application Symposium (IDEAS 2005), pp. 187–199, IEEE, sène and M. Léonard, eds., Advanced Information Systems 2005. Engineering: 20th International Conference (CAiSE 2008), [22] T. Berners-Lee, Relational Databases on the Semantic Web, Lecture Notes on Computer Science, vol. 5074, pp. 296–311, available at: http://www.w3.org/DesignIssues/ Springer, 2008. RDB-RDF., 1998. [10] R. Angles and C. Gutierrez, The Expressive Power of [23] T. Berners-Lee, Semantic Web Road map, available at: SPARQL, in A. Sheth, S. Staab, M. Dean, M. Paolucci, http://www.w3.org/DesignIssues/Semantic. D. Maynard, T. Finin and K. Thirunarayan, eds., The Seman- html, 1998. tic Web - ISWC 2008: 7th International Semantic Web Con- [24] A. Bertails and E. G. Prud’hommeaux, Interpreting Relational ference, Lecture Notes in Computer Science, vol. 5318, pp. Databases in the RDF Domain, in M. A. Musen and O. Cor- 114–129, Springer, 2008. cho, eds., Proceedings of the 2011 Knowledge Capture Con- [11] I. Astrova, Reverse Engineering of Relational Databases ference (K-CAP 2011), pp. 129–135, ACM, 2011. to Ontologies, in C. J. Bussler, J. Davies, D. Fensel and [25] C. Bizer and R. Cyganiak, D2R Server - Publishing Relational R. Studer, eds., The Semantic Web: Research and Appli- Databases on the Semantic Web, poster in 5th International cations: First European Semantic Web Symposium (ESWS Semantic Web Conference (ISWC 2006), 2006. 2004), Lecture Notes in Computer Science, vol. 3053, pp. [26] C. Bizer and A. Schultz, The Berlin SPARQL Benchmark, In- 327–341, Springer, 2004. ternational Journal On Semantic Web and Information Sys- 38 D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey

tems, 5(2), pp. 1–24, 2009. 387, 1970. [27] C. Bizer and A. Seaborne, D2RQ - Treating non-RDF [42] O. Curé and R. Squelbut, A Database Trigger Strategy to Databases as Virtual RDF Graphs, poster in 3rd International Maintain Knowledge Bases Developed via Data Migration, in Semantic Web Conference (ISWC 2004), 2004. C. Bento, A. Cardoso and G. Dias, eds., Progress in Artificial [28] C. Blakeley, Virtuoso RDF Views Getting Started Guide, Intelligence: 12th Portuguese Conference on Artificial Intelli- available at: http://www.openlinksw.co.uk/ gence (EPIA 2005), Lecture Notes in Computer Science, vol. virtuoso/Whitepapers/pdf/Virtuoso_SQL_ 3808, pp. 206–217, Springer, 2005. to_RDF_Mapping.pdf, OpenLink Software, 2007. [43] C. Curino, G. Orsi, E. Panigati and L. Tanca, Accessing and [29] A. Buccella, M. R. Penabad, F. R. Rodriguez, A. Farina and Documenting Relational Databases through OWL Ontolo- A. Cechich, From Relational Databases to OWL Ontologies, gies, in T. Andreasen, R. R. Yager, H. Bulskov, H. Chris- in Proceedings of 6th Russian Conference on Digital Li- tiansen and H. Legind Larsen, eds., Flexible Query Answering braries (RCDL 2004), 2004. Systems: 8th International Conference (FQAS 2009), Lecture [30] G. Bumans¯ and K. Cerˇ ans,¯ RDB2OWL : a Practical Approach Notes in Computer Science, vol. 5822, pp. 431–442, Springer, for Transforming RDB Data into RDF/OWL, in A. Paschke, 2009. N. Henze and T. Pellegrini, eds., Proceedings of the 6th In- [44] R. Cyganiak, A Relational Algebra for SPARQL, Technical ternational Conference on Semantic Systems (I-SEMANTICS Report HPL-2005-170, Hewlett-Packard, 2005. 2010), ACM, 2010. [45] S. Das and J. Srinivasan, Database Technologies for RDF, in [31] K. Byrne, Having Triplets - Holding Cultural Data as RDF, S. Tessaris, E. Franconi, T. Eiter, C. Gutierrez, S. Handschuh, in M. Larson, K. Fernie, O. J. and J. Cigarran, eds., Proceed- M.-C. Rousset and R. A. Schmidt, eds., Reasoning Web. Se- ings of the ECDL 2008 Workshop on Information Access to mantic Technologies for Information Systems: 5th Interna- Cultural Heritage, 2008. tional Summer School 2009, Lecture Notes in Computer Sci- [32] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, ence, vol. 5689, pp. 205–221, Springer, 2009. A. Poggi, M. Rodriguez-Muro, R. Rosati, M. Ruzzi and D. F. [46] J. de Bruijn, F. Martin-Recuerda, D. Manov and M. Ehrig, Savo, The MASTRO System for Ontology-based Data Ac- State-of-the-art Survey on Ontology Merging and Aligning, cess, Semantic Web Journal, 2(1), pp. 43–53, 2011. SEKT Project, Deliverable D4.2.1, 2004. [33] D. Calvanese, M. Lenzerini and D. Nardi, Unifying Class- [47] M. del Mar Roldan-Garcia and J. F. Aldana-Montes, A Sur- Based Representation Formalisms, Journal of Artificial Intel- vey on Disk Oriented Querying and Reasoning on the Seman- ligence Research, 11, pp. 199–240, 1999. tic Web, in R. S. Barga and X. Zhou, eds., Proceedings of [34] F. Cerbah, Mining the Content of Relational Databases to 22nd International Conference on Data Engineering Work- Learn Ontologies with Deeper Taxonomies, in Y. Li, G. Pasi, shops (ICDEW’06), IEEE, 2006. C. Zhang, N. Cercone and L. Cao, eds., Proceedings of 2008 [48] C. Dolbear and G. Hart, Ontological Bridge Building - Using IEEE/WIC/ACM International Conference on Web Intelli- Ontologies to Merge Spatial Datasets, in D. L. McGuinness, gence and Technology Workshops(WI-IAT P. Fox and B. Brodaric, eds., Semantic Scientific Knowledge 2008), pp. 553–557, IEEE, 2008. 
Integration: AAAI Spring Symposium (AAAI/SSS Workshop), [35] P.-A. Champin, G.-J. Houben and P. Thiran, CROSS: an OWL pp. 15–20, 2008. Wrapper for Reasoning on Relational Databases, in C. Par- [49] E. Dragut and R. Lawrence, Composing Mappings between ent, K.-D. Schewe, V. C. Storey and B. Thalheim, eds., Con- Schemas using a Reference Ontology, in R. Meersman and ceptual Modeling - ER 2007: 26th International Conference Z. Tari, eds., On the Move to Meaningful Systems on Conceptual Modeling, Lecture Notes in Computer Science, 2004: CoopIS, DOA, and ODBASE, Lecture Notes in Com- vol. 4801, pp. 502–517, Springer, 2007. puter Science, vol. 3290, pp. 783–800, Springer, 2004. [36] N. Chatterjee and M. Krishna, of Het- [50] B. Elliott, E. Cheng, C. Thomas-Ogbuji and Z. M. Ozsoyoglu, erogeneous Databases on the Web, in B. B. Bhattacharyya, A Complete Translation from SPARQL into Efficient SQL, C. A. Murthy, B. B. Chaudhuri and B. Chanda, eds., Inter- in B. C. Desai, ed., Proceedings of the 2009 International national Conference on Computing: Theory and Applications Database Engineering & Applications Symposium (IDEAS (ICCTA 2007), pp. 325–329, IEEE, 2007. ’09), pp. 31–42, ACM, 2009. [37] A. Chebotko, S. Lu and F. Fotouhi, Semantics Preserving [51] R. Elmasri and S. B. Navathe, Fundamentals of Database Sys- SPARQL-to-SQL Translation, Data & Knowledge Engineer- tems, The Benjamin/Cummings Publishing Company, Inc., ing, 68(10), pp. 973–1000, 2009. San Francisco, CA, USA, 6th ed., 2010. [38] H. Chen, Z. Wu, Y. Mao and G. Zheng, DartGrid: a Semantic [52] M. Fahad, ER2OWL: Generating OWL Ontology from ER Infrastructure for Building Database Grid Applications, Con- Diagram, in Z. Shi, E. Mercier-Laurent and D. Leake, eds., currency and Computation: Practice and Experience, 18(14), Intelligent Information Processing IV: 5th IFIP International pp. 1811–1828, 2006. Conference on Intelligent Information Processing, pp. 28–37, [39] P. P.-S. Chen, The Entity-Relationship Model - Toward a Uni- Springer, 2008. fied View of Data, ACM Transactions on Database Systems [53] M. Fisher, M. Dean and G. Joiner, Use of OWL and SWRL (TODS), 1(1), pp. 9–36, 1976. for Semantic Relational Database Translation, in K. Clark and [40] R. H. Chiang, T. M. Barron and V. C. Storey, Reverse Engi- P. F. Patel-Schneider, eds., Proceedings of the Fourth OWLED neering of Relational Databases: Extraction of an EER Model Workshop on OWL: Experiences and Directions, 2008. from a Relational Database, Data & , [54] J. Geller, S. A. Chun and Y. J. An, Toward the Semantic Deep 12(2), pp. 107–142, 1994. Web, Computer, 41(9), pp. 95–97, 2008. [41] E. F. Codd, A Relational Model of Data for Large Shared [55] R. Ghawi and N. Cullot, Database-to-Ontology Mapping Data Banks, Communications of the ACM, 13(6), pp. 377– Generation for Semantic Interoperability, in 3rd International D.E. Spanos et al. / Bringing Relational Databases into the Semantic Web: A Survey 39

[55] R. Ghawi and N. Cullot, Database-to-Ontology Mapping Generation for Semantic Interoperability, in 3rd International Workshop on Database Interoperability (InterDB 2007), held in conjunction with VLDB 2007, 2007.
[56] A. Gomez-Perez, O. Corcho-Garcia and M. Fernandez-Lopez, Ontological Engineering, Springer-Verlag New York, Inc., Secaucus, NJ, USA, 1st ed., 2003.
[57] A. J. G. Gray, N. Gray and I. Ounis, Can RDB2RDF Tools Feasibly Expose Large Science Archives for Data Integration?, in L. Aroyo, P. Traverso, F. Ciravegna et al., eds., The Semantic Web: Research and Applications: 6th European Semantic Web Conference (ESWC 2009), Lecture Notes in Computer Science, vol. 5554, pp. 491–505, Springer, 2009.
[58] T. Gruber, Toward Principles for the Design of Ontologies Used for Knowledge Sharing, International Journal of Human-Computer Studies, 43(5-6), pp. 907–928, 1995.
[59] N. Guarino, Formal Ontology and Information Systems, in Formal Ontology in Information Systems: Proceedings of the 1st International Conference (FOIS'98), pp. 3–15, IOS Press, 1998.
[60] T. Heath and C. Bizer, Linked Data: Evolving the Web into a Global Data Space, Morgan & Claypool Publishers, San Rafael, 2011.
[61] S. Hellmann, J. Unbehauen, A. Zaveri, J. Lehmann, S. Auer, S. Tramp, H. Williams, O. Erling, T. Thibodeau Jr, K. Idehen, A. Blumauer and H. Nagy, Report on Knowledge Extraction from Structured Sources, LOD2 Project, Deliverable 3.1.1, available at: http://static.lod2.eu/Deliverables/deliverable-3.1.1.pdf, 2011.
[62] J. Hendler, Web 3.0: Chicken Farms on the Semantic Web, IEEE Computer, 41(1), pp. 106–108, 2008.
[63] M. Hert, G. Reif and H. C. Gall, Updating Relational Data via SPARQL/Update, in F. Daniel, L. Delcambre, F. Fotouhi et al., eds., Proceedings of the 2010 EDBT/ICDT Workshops, ACM, 2010.
[64] M. Hert, G. Reif and H. C. Gall, A Comparison of RDB-to-RDF Mapping Languages, in C. Ghidini, A.-C. Ngonga Ngomo, S. Lindstaedt and T. Pellegrini, eds., Proceedings of the 7th International Conference on Semantic Systems (I-SEMANTICS 2011), pp. 25–32, ACM, 2011.
[65] G. Hillairet, F. Bertrand and J. Y. Lafaye, MDE for Publishing Data on the Semantic Web, in F. Silva Parreiras, J. Z. Pan, U. Assmann and J. Henriksson, eds., Transforming and Weaving Ontologies in Model Driven Engineering: Proceedings of the 1st International Workshop (TWOMDE 2008), pp. 32–46, 2008.
[66] W. Hu and Y. Qu, Discovering Simple Mappings between Relational Database Schemas and Ontologies, in K. Aberer, K.-S. Choi, N. Noy et al., eds., The Semantic Web: 6th International Semantic Web Conference, 2nd Asian Semantic Web Conference (ISWC 2007 + ASWC 2007), Lecture Notes in Computer Science, vol. 4825, pp. 225–238, Springer, 2007.
[67] P. Johannesson, A Method for Transforming Relational Schemas into Conceptual Schemas, in Proceedings of the 10th International Conference on Data Engineering (ICDE 1994), pp. 190–201, IEEE, 1994.
[68] D. Jurić, M. Banek and Z. Skočir, Uncovering the Deep Web: Transferring Relational Database Content and Metadata to OWL Ontologies, in I. Lovrek, R. J. Howlett and L. C. Jain, eds., Knowledge-Based Intelligent Information and Engineering Systems: 12th International Conference (KES 2008), Lecture Notes in Computer Science, vol. 5177, pp. 456–463, Springer, 2008.
[69] Y. Kalfoglou and M. Schorlemmer, Ontology Mapping: the State of the Art, The Knowledge Engineering Review, 18(1), pp. 1–31, 2003.
[70] V. Kashyap, Design and Creation of Ontologies for Environmental , in First Agricultural Service Ontology (AOS) Workshop, 2001.
[71] N. Konstantinou, D.-E. Spanos, M. Chalas, E. Solidakis and N. Mitrou, VisAVis: An Approach to an Intermediate Layer between Ontologies and Relational Database Contents, in F. Frasincar, G.-J. Houben and P. Thiran, eds., Proceedings of the CAiSE'06 3rd International Workshop on Web Information Systems Modeling (WISM'06), pp. 1050–1061, 2006.
[72] N. Konstantinou, D.-E. Spanos and N. Mitrou, Ontology and Database Mapping: A Survey of Current Implementations and Future Directions, Journal of Web Engineering, 7(1), pp. 1–24, 2008.
[73] N. Konstantinou, D.-E. Spanos, P. Stavrou and N. Mitrou, Technically Approaching the Semantic Web Bottleneck, International Journal of Web Engineering and Technology, 6(1), pp. 83–111, 2010.
[74] M. Korotkiy and J. L. Top, From Relational Data to RDFS Models, in N. Koch, P. Fraternali and M. Wirsing, eds., Web Engineering: 4th International Conference (ICWE 2004), Lecture Notes in Computer Science, vol. 3140, pp. 430–434, Springer, 2004.
[75] A. Kupfer, S. Eckstein, K. Neumann and B. Mathiak, Handling Changes of Database Schemas and Corresponding Ontologies, in J. F. Roddick, V. R. Benjamins, S. S.-S. Cherfi et al., eds., Advances in Conceptual Modeling - Theory and Practice: ER 2006 Workshops, Lecture Notes in Computer Science, vol. 4231, pp. 227–236, Springer, 2006.
[76] M. Laclavík, RDB2Onto: Relational Database Data to Ontology Individuals Mapping, in P. Návrat, P. Bartoš, M. Bieliková, L. Hluchý and P. Vojtáš, eds., Tools for Acquisition, Organisation and Presenting of Information and Knowledge, pp. 86–89, 2006.
[77] N. Lammari, I. Comyn-Wattiau and J. Akoka, Extracting Generalization Hierarchies from Relational Databases: A Reverse Engineering Approach, Data & Knowledge Engineering, 63(2), pp. 568–589, 2007.
[78] G. Lausen, Relational Databases in RDF: Keys and Foreign Keys, in V. Christophides, M. Collard and C. Gutierrez, eds., Semantic Web, Ontologies and Databases: VLDB Workshop (SWDB-ODBIS 2007), Lecture Notes in Computer Science, vol. 5005, pp. 43–56, Springer, 2007.
[79] G. Lausen, M. Meier and M. Schmidt, SPARQLing Constraints for RDF, in A. Kemper, P. Valduriez, N. Mouaddib et al., eds., Advances in Database Technology: Proceedings of the 11th International Conference on Extending Database Technology (EDBT '08), pp. 499–509, ACM, 2008.
[80] M. Lenzerini, Data Integration: A Theoretical Perspective, in Proceedings of the Twenty-First ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pp. 233–246, 2002.
[81] D. Levshin, Mapping Relational Databases to the Semantic Web with Original Meaning, in D. Karagiannis and Z. Jin, eds., Knowledge Science, Engineering and Management: Third International Conference (KSEM 2009), Lecture Notes in Computer Science, vol. 5914, pp. 5–16, Springer, 2009.

[82] M. Li, X. Du and S. Wang, A Semi-Automatic Ontology Acquisition Method for the Semantic Web, in W. Fan, Z. Wu and J. Yang, eds., Advances in Web-Age Information Management: 6th International Conference (WAIM 2005), Lecture Notes in Computer Science, vol. 3739, pp. 209–220, Springer, 2005.
[83] J. Lu, F. Cao, L. Ma, Y. Yu and Y. Pan, An Effective SPARQL Support over Relational Databases, in V. Christophides, M. Collard and C. Gutierrez, eds., Semantic Web, Ontologies and Databases: VLDB Workshop (SWDB-ODBIS 2007), Lecture Notes in Computer Science, vol. 5005, pp. 57–76, Springer, 2007.
[84] L. Lubyte and S. Tessaris, Automatic Extraction of Ontologies Wrapping Relational Data Sources, in S. S. Bhowmick, J. Küng and R. Wagner, eds., Database and Expert Systems Applications: 20th International Conference (DEXA 2009), Lecture Notes in Computer Science, vol. 5690, pp. 128–142, Springer, 2009.
[85] F. Maali, R. Cyganiak and V. Peristeras, Re-using Cool URIs: Entity Reconciliation Against LOD Hubs, in Proceedings of the 4th Linked Data on the Web Workshop (LDOW 2011), 2011.
[86] A. Maedche and S. Staab, Ontology Learning for the Semantic Web, IEEE Intelligent Systems, 16(2), pp. 72–79, 2001.
[87] P. Martin, J. R. Cordy and R. Abu-Hamdeh, Information Capacity Preserving Translations of Relational Schemas Using Structural Transformation, External Technical Report, ISSN 0836-0227-95-392, Ontario, Canada, 1995.
[88] A. Miller and D. McNeil, Revelytix RDB Mapping Language Specification, available at: http://www.knoodl.com/ui/groups/Mapping_Ontology_Community/wiki/User_Guide/media/RDB_Mapping_Specification_v0.2, Revelytix, 2010.
[89] B. Motik, On the Properties of Metamodeling in OWL, Journal of Logic and Computation, 17(4), pp. 617–637, 2007.
[90] B. Motik, I. Horrocks and U. Sattler, Bridging the Gap between OWL and Relational Databases, in C. Williamson and M. E. Zurko, eds., Proceedings of the 16th International Conference on World Wide Web (WWW 2007), pp. 807–816, ACM Press, 2007.
[91] V. Mulwad, T. Finin, Z. Syed and A. Joshi, Using Linked Data to Interpret Tables, in O. Hartig, A. Harth and J. Sequeda, eds., Proceedings of the First International Workshop on Consuming Linked Data (COLD 2010), 2010.
[92] I. Myroshnichenko and M. C. Murphy, Mapping ER Schemas to OWL Ontologies, in S.-C. Chen, R. Glasberg, J. Heflin et al., eds., Proceedings of the 2009 IEEE International Conference on Semantic Computing (ICSC 2009), pp. 324–329, IEEE, 2009.
[93] C. Nyulas, M. O'Connor and S. Tu, DataMaster - a Plug-in for Importing Schemas and Data from Relational Databases into Protégé, in 10th International Protégé Conference, 2007.
[94] P. Papapanagiotou, P. Katsiouli, V. Tsetsos, C. Anagnostopoulos and S. Hadjiefthymiades, RONTO: Relational to Ontology Schema Matching, AIS SIGSEMIS Bulletin, 3(3-4), pp. 32–36, 2006.
[95] C. Pérez De Laborda and S. Conrad, Relational.OWL - A Data and Schema Representation Format Based on OWL, in S. Hartmann and M. Stumptner, eds., Proceedings of the Second Asia-Pacific Conference on Conceptual Modeling (APCCM2005), pp. 89–96, 2005.
[96] C. Pérez De Laborda and S. Conrad, Database to Semantic Web Mapping using RDF Query Languages, in D. W. Embley, A. Olivé and S. Ram, eds., Conceptual Modeling - ER 2006: 25th International Conference on Conceptual Modeling, Lecture Notes in Computer Science, vol. 4215, pp. 241–254, 2006.
[97] C. Pérez De Laborda, M. Zloch and S. Conrad, RDQuery - Querying Relational Databases on-the-fly with RDF-QL, poster in the 15th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2006), 2006.
[98] A. Poggi, D. Lembo, D. Calvanese, G. De Giacomo, M. Lenzerini and R. Rosati, Linking Data to Ontologies, Journal on Data Semantics, 10, pp. 133–173, 2008.
[99] A. Poggi, M. Rodriguez-Muro and M. Ruzzi, Ontology-based Database Access with DIG-MASTRO and the OBDA Plugin for Protégé, in K. Clark and P. F. Patel-Schneider, eds., Proceedings of the Fourth OWLED Workshop on OWL: Experiences and Directions, 2008.
[100] S. Polfliet and R. Ichise, Automated Mapping Generation for Converting Databases into Linked Data, in A. Polleres and H. Chen, eds., Proceedings of the ISWC 2010 Posters & Demonstrations Track: Collected Abstracts, pp. 173–176, 2010.
[101] W. J. Premerlani and M. R. Blaha, An Approach for Reverse Engineering of Relational Databases, Communications of the ACM, 37(5), pp. 42–49, 1994.
[102] R. Ramakrishnan and J. Gehrke, Database Management Systems, McGraw-Hill, New York City, NY, 3rd ed., 2002.
[103] S. Ramanathan and J. Hodges, Extraction of Object-Oriented Structures from Existing Relational Databases, ACM SIGMOD Record, 26(1), pp. 59–64, 1997.
[104] S. Ramanujam, V. Khadilkar, L. Khan, M. Kantarcioglu, B. Thuraisingham and S. Seida, Update-Enabled Triplification of Relational Data into Virtual RDF Stores, International Journal of Semantic Computing, 4(4), pp. 423–451, 2010.
[105] S. Sahoo, W. Halb, S. Hellmann, K. Idehen, T. Thibodeau, S. Auer, J. Sequeda and A. Ezzat, A Survey of Current Approaches for Mapping of Relational Databases to RDF, W3C RDB2RDF Incubator Group Report, 2009.
[106] P. E. Salas, K. K. Breitman, J. F. Viterbo and M. A. Casanova, Interoperability by Design using the StdTrip Tool: an a priori Approach, in A. Paschke, N. Henze and T. Pellegrini, eds., Proceedings of the 6th International Conference on Semantic Systems (I-SEMANTICS 2010), ACM, 2010.
[107] M. Schmidt, T. Hornung, G. Lausen and C. Pinkel, SP2Bench: A SPARQL Performance Benchmark, in J. Li and P. S. Yu, eds., Proceedings of the 25th International Conference on Data Engineering (ICDE 2009), pp. 222–233, IEEE, 2009.
[108] M. Schneider and G. Sutcliffe, Reasoning in the OWL 2 Full Ontology Language using First-Order Automated Theorem Proving, in N. Bjørner and V. Sofronie-Stokkermans, eds., Automated Deduction - CADE-23: 23rd International Conference on Automated Deduction, Lecture Notes in Computer Science, vol. 6803, pp. 461–475, Springer, 2011.
[109] A. Seaborne, D. Steer and S. Williams, SQL-RDF, in W3C Workshop on RDF Access to Relational Databases, 2007.
[110] J. F. Sequeda, R. Depena and D. P. Miranker, Ultrawrap: Using SQL Views for RDB2RDF, poster in 8th International Semantic Web Conference (ISWC 2009), 2009.

[111] J. F. Sequeda, S. H. Tirmizi, O. Corcho and D. P. Miranker, Direct Mapping SQL Databases to the Semantic Web: A Survey (Technical Report TR-09-04), University of Texas, Austin, Department of Computer Sciences, 2009.
[112] G. Shen, Z. Huang, X. Zhu and X. Zhao, Research on the Rules of Mapping from Relational Model to OWL, in B. Cuenca-Grau, P. Hitzler, C. Shankey and E. Wallace, eds., Proceedings of the OWLED'06 Workshop on OWL: Experiences and Directions, 2006.
[113] A. Sheth and R. Meersman, Amicalola Report: Database and Information Systems Research Challenges and Opportunities in Semantic Web and Enterprises, ACM SIGMOD Record, 31(4), pp. 98–106, 2002.
[114] A. P. Sheth and J. A. Larson, Federated Database Systems for Managing Distributed, Heterogeneous, and Autonomous Databases, ACM Computing Surveys, 22(3), pp. 183–236, 1990.
[115] E. Sirin and J. Tao, Towards Integrity Constraints in OWL, in R. Hoekstra and P. F. Patel-Schneider, eds., Proceedings of the 6th International Workshop on OWL: Experiences and Directions (OWLED 2009), 2009.
[116] K. Sonia and S. Khan, R2O Transformation System: Relation to Ontology Transformation for Scalable Data Integration, in J. Bernardino and B. C. Desai, eds., Proceedings of the 2008 International Database Engineering & Applications Symposium (IDEAS '08), pp. 291–295, ACM, 2008.
[117] L. Stojanovic, N. Stojanovic and R. Volz, Migrating Data-Intensive Web Sites into the Semantic Web, in G. B. Lamont, ed., Proceedings of the 2002 ACM Symposium on Applied Computing (SAC'02), pp. 1100–1107, ACM, 2002.
[118] M. Svihla and I. Jelinek, Two Layer Mapping from Database to RDF, in Proceedings of the Sixth International Scientific Conference Electronic Computers and Informatics (ECI 2004), pp. 270–275, 2004.
[119] M. Svihla and I. Jelinek, Benchmarking RDF Production Tools, in R. Wagner, N. Revell and G. Pernul, eds., Database and Expert Systems Applications: 18th International Conference (DEXA 2007), Lecture Notes in Computer Science, vol. 4653, pp. 700–709, Springer, 2007.
[120] S. H. Tirmizi, J. F. Sequeda and D. P. Miranker, Translating SQL Applications to the Semantic Web, in S. S. Bhowmick, J. Küng and R. Wagner, eds., Database and Expert Systems Applications: 19th International Conference (DEXA 2008), Lecture Notes in Computer Science, vol. 5181, pp. 450–464, Springer, 2008.
[121] Q. Trinh, K. Barker and R. Alhajj, RDB2ONT: A Tool for Generating OWL Ontologies From Relational Database Systems, in T. Atmaca, P. Dini, P. Lorenz and J. Neuman de Sousa, eds., Proceedings of the Advanced International Conference on Telecommunications and International Conference on Internet and Web Applications and Services (AICT-ICIW'06), IEEE, 2006.
[122] S. R. Upadhyaya and P. S. Kumar, ERONTO: a Tool for Extracting Ontologies from Extended E/R Diagrams, in L. M. Liebrock, ed., Proceedings of the 2005 ACM Symposium on Applied Computing (SAC'05), pp. 666–670, ACM, 2005.
[123] K. N. Vavliakis, T. K. Grollios and P. A. Mitkas, RDOTE - Transforming Relational Databases into Semantic Web Data, in A. Polleres and H. Chen, eds., Proceedings of the ISWC 2010 Posters & Demonstrations Track: Collected Abstracts, pp. 121–124, 2010.
[124] J. Volz, C. Bizer, M. Gaedke and G. Kobilarov, Discovering and Maintaining Links on the Web of Data, in A. Bernstein, D. R. Karger, T. Heath, L. Feigenbaum, D. Maynard, E. Motta and K. Thirunarayan, eds., The Semantic Web - ISWC 2009: 8th International Semantic Web Conference, Lecture Notes in Computer Science, vol. 5823, pp. 650–665, Springer, 2009.
[125] R. Volz, S. Handschuh, S. Staab, L. Stojanovic and N. Stojanovic, Unveiling the Hidden Bride: Deep Annotation for Mapping and Migrating Legacy Data to the Semantic Web, Web Semantics: Science, Services and Agents on the World Wide Web, 1(2), pp. 187–206, 2004.
[126] H. Wache, T. Vögele, U. Visser, H. Stuckenschmidt, G. Schuster, H. Neumann and S. Hübner, Ontology-Based Integration of Information - A Survey of Existing Approaches, in A. Gómez-Pérez, M. Gruninger, H. Stuckenschmidt and M. Uschold, eds., Proceedings of the IJCAI-01 Workshop on Ontologies and Information Sharing, pp. 108–117, 2001.
[127] Z. Xu, X. Cao, Y. Dong and W. Su, Formal Approach and Automated Tool for Translating ER Schemata into OWL Ontologies, in H. Dai, R. Srikant and C. Zhang, eds., Advances in Knowledge Discovery and Data Mining: 8th Pacific-Asia Conference (PAKDD 2004), Lecture Notes in Computer Science, vol. 3056, pp. 464–475, Springer, 2004.
[128] Z. Xu, S. Zhang and Y. Dong, Mapping between Relational Database Schema and OWL Ontology for Deep Annotation, in T. Nishida, Z. Shi, U. Visser et al., eds., Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence, pp. 548–552, IEEE, 2006.
[129] S. Zhao and E. Chang, From Database to Semantic Web Ontology: An Overview, in R. Meersman, Z. Tari and P. Herrero, eds., On the Move to Meaningful Internet Systems: OTM 2007 Workshops, Lecture Notes in Computer Science, vol. 4806, pp. 1205–1214, Springer, 2007.
[130] C. Zhou, C. Xu, H. Chen and K. Idehen, Browser-based Semantic Mapping Tool for Linked Data in Semantic Web, in C. Bizer, T. Heath, K. Idehen and T. Berners-Lee, eds., Proceedings of the WWW 2008 Workshop on Linked Data on the Web (LDOW 2008), 2008.