Logics for the Semantic Web

Total Page:16

File Type:pdf, Size:1020Kb

Logics for the Semantic Web LOGICS FOR THE SEMANTIC WEB Pascal Hitzler, Jens Lehmann, Axel Polleres Reader: Patrick Hayes 1 INTRODUCTION A major international research effort is currently under way to improve the exist- ing World Wide Web (WWW), with the intention to create what is often called the Semantic Web [Berners-Lee et al., 2001; Hitzler et al., 2010]. Driven by the World Wide Web Consortium (W3C) and its director Sir Tim Berners-Lee (inven- tor of the WWW), and heavily funded by many national and international research funding agencies, Semantic Web has become an established field of research. It integrates methods and expertise from many subfields of Computer Science and Artificial Intelligence [Studer, 2006], and it has now reached sufficient maturity for first industrial scale applications [Hamby, 2012; Hermann, 2010]. Correspondingly, major IT companies are starting to roll out applications involving Semantic Web technologies; these include Apple's Siri, IBM's Watson system, Google's Knowedge Graph, Facebook's Open Graph Protocol, and schema.org as a collaboration be- tween major search engine providers including Microsoft, Google, and Yahoo!. The Semantic Web field is driven by the vision to develop powerful methods and technologies for the reuse and integration of information on the Web. While current information on the Web is mainly made for human consumption, it shall in the future be made available for automated processing by intelligent systems. This vision is based on the idea of describing the meaning|or semantics|of data on the Web using metadata|data that describes other data|in the form of so-called ontologies [Hitzler et al., 2010]. Ontologies are essentially knowledge bases rep- resented using logic-based knowledge representation languages. This shall enable access to implicit knowledge through logical deduction [Hitzler and van Harme- len, 2010], and its use for search, integration, browsing, organization, and reuse of information. Of course, the idea of adopting knowledge representation languages raises the question which of the many approaches discussed in the literature should be adopted and promoted to Web standards (officially called W3C Recommenda- tions). In this chapter, we give an overview of the most important present stan- dards as well as their origins and history. The idea that the World Wide Web shall have capabilities to convey information for processing by intelligent systems, and not only by humans, has already been part of its original design [Berners-Lee, 1996]. The World Wide Web was initiated 2 Pascal Hitzler, Jens Lehmann, Axel Polleres in 1990, and immediately showed exponential growth [Berners-Lee, 1996]. In the meantime, it has become a very significant technological infrastructure of modern society. In the 1990s, the Semantic Web vision1 was mainly driven by the W3C Meta- data Activity [W3C Metadata, revision of 23 August 2002] which produced the first version of the Resource Description Framework (RDF) which we will discuss in Section 2. The Semantic Web Activity [W3C Semantic Web, revision of 19 June 2013] replaced the Metadata Activity in 2001, and has installed several stan- dards for representing knowledge on the Web, most noteably two revisions of RDF, the Web Ontology Language (OWL) discussed in Section 3, and the Rule Inter- change Format (RIF) discussed in Section 4. In Section 5, we discuss some of the particular challenges which must be faced when adopting logic-based knowledge representation languages for the Semantic Web, and in Section 6 we discuss some of the more recent research developments and questions. Note that we give a more detailed technical account for RDF than for OWL and RIF, because the latter are closely related to description logics and logic programming, respectively, and the reader is refered to the corresponding chapters in this volume for additional background and introductions. 2 RDF AND RDF SCHEMA The Resource Description Framework (RDF) [Manola et al., 2004; Hayes, 2004] comprises a simple data format as well as a basic schema language, called RDF Schema [Brickley and Guha, 2004]. While historically often termed a \medadata" standard, that is, an exchange format for data about documents and resources, in the meantime, RDF has been well established as a universal data exchange format for classical data integration scenarios, and particularly for publishing and exchanging structured data on the Web [Polleres et al., 2011]. Informally, all RDF data can be understood as a set of subject{predicate{object triples, where all subjects and predicates are Uniform Resource Identifiers (URIs)2 [Berners-Lee et al., 2005], and in the object position both URIs and literal values (such as numbers, strings, etc.) are allowed. Such a simple, triple based format was chosen since on the one hand, it can accomodate for any kind of metadata in the form of predicate-value pairs, and on the other hand, any more complex relational or object-oriented data can be decomposed into such triples in a fairly straightforward manner [Berners-Lee, 2006].3 Last, but not least, since URIs are being used as constant symbols in the lan- guage of RDF, any RDF triple may likewise be viewed as a generalization of 1The term Semantic Web became popular in the aftermath of the widely cited popular science article [Berners-Lee et al., 2001]. We were able to trace the term Semantic, in relation to Web, back to a 1994 presentation by Tim Berners-Lee [Berners-Lee, 1994]. 2URIs are a generalization of URLs. 3See also the discussion of semantic networks in the chapter on description logics, in this volume. Logics for the Semantic Web 3 http://www.w3.org <DesignIssues/TimBook-old/History.html> dc:creator <People/Berners-Lee/card#i> Figure 1. An RDF triple a \link" on the Web; as opposed to plain links on the traditional Web, on the Semantic Web, links can be associated with an arbitrary binary relation, which again is represented by a URI. We note that this idea of \typed" links was already part of Tim-Berners Lee's original design ideas of the Web [Berners-Lee, 1993; Berners-Lee and Fischetti, 1999]. For instance, on the website of the W3C (http: //www.w3.org, if one wants to state that the page with the URI http://www.w3. org/DesignIssues/TimBook-old/History.html was created by Tim Berners- Lee, this fact may be viewed as such a typed link, and consequently as an RDF subject-predicate-object triple, as shown in Fig. 1. Here, the URI dc:creator is used to denote the has-creator relation between a resource and its creator. This common view of typed links as labeled edges linking between resources also leads to sets of RDF triples often being called \RDF graphs." A distinguished relation within RDF, represented by the URI rdf:type is the is-a relation, that allows to denote membership to a certain class, where classes are again represented by URIs, for instance the class foaf:Person. Another important feature of RDF is that so called \blank nodes" can be used in the subject or object positions of triples to denote unnamed or unknown resources. This allows to model incomplete information in the form of existentials. For instance, the RDF graph in Fig. 2 extends the information in Fig. 1 by the fact that Tim Berners-Lee is a Person, is named \Timothy Berners-Lee" and knows some person named \Dan Brickley". In the following, after giving a brief history of the RDF standard (Section 2.1), we will present the RDF Data model along with a short discussion of different syntactic representations and the semantics of RDF (Section 2.2). We continue with a discussion of RDF Schema in Section 2.3 and the query language SPARQL in Section 2.4. 4 Pascal Hitzler, Jens Lehmann, Axel Polleres http://www.w3.org <DesignIssues/TimBook-old/History.html> dc:creator <People/Berners-Lee/card#i> foaf:name rdf:type foaf:knows foaf:Person “Timothy Berners-Lee” rdf:type foaf:name “Dan Brickley” Figure 2. An RDF graph with a \blank" (anonymous) node 2.1 Brief History of RDF The standardisation of RDF has been preceded by two earlier proposals for meta- data standards in the form of W3C member submissions, namely (i) the Chan- nel Definition Format [Ellerman, 1997] and (ii) the Meta Content Framework (MCF) [Guha and Bray, 1997]. While the former comprised an XML format with a fixed term of metadata properties for describing information channels on the Web, (somewhat similar to RSS nowadays), the latter (MCF) was strictly extensible and evolved into the first version of RDF [Lassila and Swick, 1999], published in 1999. Another important metadata initiative from the digital libraries community, Dublin Core [Nilsson et al., 2008], which started around the same time but outside of W3C, later on adopted RDF as a representation syntax, becoming one of the most prominent RDF vocabularies, see Section 2.3 below. The first official standard recommendation of RDF from 1999 was extended in 2004 by a formal definition of the Semantics of RDF [Hayes, 2004], decoupling the syntactical representation in XML from the RDF data model. Since then RDF has been used in various contexts and experienced wide adop- tion. In 2009 the W3C held a workshop on future directions of RDF [Herman, 2009], discussing several extensions but also simplifications of the standard. These extensions are currently under discussion in the ongoing W3C RDF 1.1 working group.4 4See http://www.w3.org/2011/rdf-wg/. Logics for the Semantic Web 5 2.2 Different Syntactic Representations and Semantics There are various serialisations for RDF. Fig. 3 shows different syntactic represen- tations of the six triples in the RDF graph from Fig. 2 in some of these serialisation syntaxes: N-Triples [Beckett and McBride, 2004a], cf.
Recommended publications
  • Metadata for Semantic and Social Applications
    etadata is a key aspect of our evolving infrastructure for information management, social computing, and scientific collaboration. DC-2008M will focus on metadata challenges, solutions, and innovation in initiatives and activities underlying semantic and social applications. Metadata is part of the fabric of social computing, which includes the use of wikis, blogs, and tagging for collaboration and participation. Metadata also underlies the development of semantic applications, and the Semantic Web — the representation and integration of multimedia knowledge structures on the basis of semantic models. These two trends flow together in applications such as Wikipedia, where authors collectively create structured information that can be extracted and used to enhance access to and use of information sources. Recent discussion has focused on how existing bibliographic standards can be expressed as Semantic Metadata for Web vocabularies to facilitate the ingration of library and cultural heritage data with other types of data. Harnessing the efforts of content providers and end-users to link, tag, edit, and describe their Semantic and information in interoperable ways (”participatory metadata”) is a key step towards providing knowledge environments that are scalable, self-correcting, and evolvable. Social Applications DC-2008 will explore conceptual and practical issues in the development and deployment of semantic and social applications to meet the needs of specific communities of practice. Edited by Jane Greenberg and Wolfgang Klas DC-2008
    [Show full text]
  • From the Semantic Web to Social Machines
    ARTICLE IN PRESS ARTINT:2455 JID:ARTINT AID:2455 /REV [m3G; v 1.23; Prn:25/11/2009; 12:36] P.1 (1-6) Artificial Intelligence ••• (••••) •••–••• Contents lists available at ScienceDirect Artificial Intelligence www.elsevier.com/locate/artint From the Semantic Web to social machines: A research challenge for AI on the World Wide Web ∗ Jim Hendler a, , Tim Berners-Lee b a Tetherless World Constellation, RPI, United States b Computer Science and AI Laboratory, MIT, United States article info abstract Article history: The advent of social computing on the Web has led to a new generation of Web Received 24 September 2009 applications that are powerful and world-changing. However, we argue that we are just Received in revised form 1 October 2009 at the beginning of this age of “social machines” and that their continued evolution and Accepted 1 October 2009 growth requires the cooperation of Web and AI researchers. In this paper, we show how Available online xxxx the growing Semantic Web provides necessary support for these technologies, outline the challenges we see in bringing the technology to the next level, and propose some starting places for the research. © 2009 Elsevier B.V. All rights reserved. Much has been written about the profound impact that the World Wide Web has had on society. Yet it is primarily in the past few years, as more interactive “read/write” technologies (e.g. Wikis, blogs and photo/video sharing) and social network- ing sites have proliferated, that the truly profound nature of this change is being felt. From the very beginning, however, the Web was designed to create a network of humans changing society empowered using this shared infrastructure.
    [Show full text]
  • V a Lida T in G R D F Da
    Series ISSN: 2160-4711 LABRA GAYO • ET AL GAYO LABRA Series Editors: Ying Ding, Indiana University Paul Groth, Elsevier Labs Validating RDF Data Jose Emilio Labra Gayo, University of Oviedo Eric Prud’hommeaux, W3C/MIT and Micelio Iovka Boneva, University of Lille Dimitris Kontokostas, University of Leipzig VALIDATING RDF DATA This book describes two technologies for RDF validation: Shape Expressions (ShEx) and Shapes Constraint Language (SHACL), the rationales for their designs, a comparison of the two, and some example applications. RDF and Linked Data have broad applicability across many fields, from aircraft manufacturing to zoology. Requirements for detecting bad data differ across communities, fields, and tasks, but nearly all involve some form of data validation. This book introduces data validation and describes its practical use in day-to-day data exchange. The Semantic Web offers a bold, new take on how to organize, distribute, index, and share data. Using Web addresses (URIs) as identifiers for data elements enables the construction of distributed databases on a global scale. Like the Web, the Semantic Web is heralded as an information revolution, and also like the Web, it is encumbered by data quality issues. The quality of Semantic Web data is compromised by the lack of resources for data curation, for maintenance, and for developing globally applicable data models. At the enterprise scale, these problems have conventional solutions. Master data management provides an enterprise-wide vocabulary, while constraint languages capture and enforce data structures. Filling a need long recognized by Semantic Web users, shapes languages provide models and vocabularies for expressing such structural constraints.
    [Show full text]
  • Semantics Developer's Guide
    MarkLogic Server Semantic Graph Developer’s Guide 2 MarkLogic 10 May, 2019 Last Revised: 10.0-8, October, 2021 Copyright © 2021 MarkLogic Corporation. All rights reserved. MarkLogic Server MarkLogic 10—May, 2019 Semantic Graph Developer’s Guide—Page 2 MarkLogic Server Table of Contents Table of Contents Semantic Graph Developer’s Guide 1.0 Introduction to Semantic Graphs in MarkLogic ..........................................11 1.1 Terminology ..........................................................................................................12 1.2 Linked Open Data .................................................................................................13 1.3 RDF Implementation in MarkLogic .....................................................................14 1.3.1 Using RDF in MarkLogic .........................................................................15 1.3.1.1 Storing RDF Triples in MarkLogic ...........................................17 1.3.1.2 Querying Triples .......................................................................18 1.3.2 RDF Data Model .......................................................................................20 1.3.3 Blank Node Identifiers ..............................................................................21 1.3.4 RDF Datatypes ..........................................................................................21 1.3.5 IRIs and Prefixes .......................................................................................22 1.3.5.1 IRIs ............................................................................................22
    [Show full text]
  • A New Generation Digital Content Service for Cultural Heritage Institutions
    A New Generation Digital Content Service for Cultural Heritage Institutions Pierfrancesco Bellini, Ivan Bruno, Daniele Cenni, Paolo Nesi, Michela Paolucci, and Marco Serena Distributed Systems and Internet Technology Lab, DISIT, Dipartimento di Ingegneria dell’Informazione, Università degli Studi di Firenze, Italy [email protected] http://www.disit.dsi.unifi.it Abstract. The evolution of semantic technology and related impact on internet services and solutions, such as social media, mobile technologies, etc., have de- termined a strong evolution in digital content services. Traditional content based online services are leaving the space to a new generation of solutions. In this paper, the experience of one of those new generation digital content service is presented, namely ECLAP (European Collected Library of Artistic Perform- ance, http://www.eclap.eu). It has been partially founded by the European Commission and includes/aggregates more than 35 international institutions. ECLAP provides services and tools for content management and user network- ing. They are based on a set of newly researched technologies and features in the area of semantic computing technologies capable of mining and establishing relationships among content elements, concepts and users. On this regard, ECLAP is a place in which these new solutions are made available for inter- ested institutions. Keywords: best practice network, semantic computing, recommendations, automated content management, content aggregation, social media. 1 Introduction Traditional library services in which the users can access to content by searching and browsing on-line catalogues obtaining lists of references and sporadically digital items (documents, images, etc.) are part of our history. With the introduction of web2.0/3.0, and thus of data mining and semantic computing, including social media and mobile technologies most of the digital libraries and museum services became rapidly obsolete and were constrained to rapidly change.
    [Show full text]
  • Enterprise Data Classification Using Semantic Web Technologies
    Enterprise Data Classification Using Semantic Web Technologies David Ben-David1, Tamar Domany2, and Abigail Tarem2 1 Technion – Israel Institute of Technology, Haifa 32000, Israel [email protected] 2 IBM Research – Haifa, University Campus, Haifa 31905, Israel {tamar,abigailt}@il.ibm.com Abstract. Organizations today collect and store large amounts of data in various formats and locations. However they are sometimes required to locate all instances of a certain type of data. Good data classification allows marking enterprise data in a way that enables quick and efficient retrieval of information when needed. We introduce a generic, automatic classification method that exploits Semantic Web technologies to assist in several phases in the classification process; defining the classification requirements, performing the classification and representing the results. Using Semantic Web technologies enables flexible and extensible con- figuration, centralized management and uniform results. This approach creates general and maintainable classifications, and enables applying se- mantic queries, rule languages and inference on the results. Keywords: Semantic Techniques, RDF, Classification, modeling. 1 Introduction Organizations today collect and store large amounts of data in various formats and locations. The data is then consumed in many places, sometimes copied or cached several times, causing valuable and sensitive business information to be scattered across many enterprise data stores. When an organization is required to meet certain legal or regulatory requirements, for instance to comply with regulations or perform discovery during civil litigation, it becomes necessary to find all the places where the required data is located. For example, if in order to comply with a privacy regulation an organization is required to mask all Social Security Numbers (SSN) when delivering personal information to unauthorized entities, all the occurrences of SSN must be found.
    [Show full text]
  • 15 Years of Semantic Web: an Incomplete Survey
    Ku¨nstl Intell (2016) 30:117–130 DOI 10.1007/s13218-016-0424-1 TECHNICAL CONTRIBUTION 15 Years of Semantic Web: An Incomplete Survey 1 2 Birte Glimm • Heiner Stuckenschmidt Received: 7 November 2015 / Accepted: 5 January 2016 / Published online: 23 January 2016 Ó Springer-Verlag Berlin Heidelberg 2016 Abstract It has been 15 years since the first publications today’s Web: Uniform Resource Identifiers (URIs) for proposed the use of ontologies as a basis for defining assigning unique identifiers to resources, the HyperText information semantics on the Web starting what today is Markup Language (HTML) for specifying the formatting known as the Semantic Web Research Community. This of Web pages, and the Hypertext Transfer Protocol (HTTP) work undoubtedly had a significant influence on AI as a that allows for the retrieval of linked resources from across field and in particular the knowledge representation and the Web. Typically, the markup of standard Web pages Reasoning Community that quickly identified new chal- describes only the formatting and, hence, Web pages and lenges and opportunities in using Description Logics in a the navigation between them using hyperlinks is targeted practical setting. In this survey article, we will try to give towards human users (cf. Fig. 1). an overview of the developments the field has gone through In 2001, Tim Berners-Lee, James Hendler, and Ora in these 15 years. We will look at three different aspects: Lassila describe their vision for a Semantic Web [5]: the evolution of Semantic Web Language Standards, the The Semantic Web is not a separate Web but an evolution of central topics in the Semantic Web Commu- extension of the current one, in which information is nity and the evolution of the research methodology.
    [Show full text]
  • Web Semantics in Cloud Computing
    International Journal of Scientific & Engineering Research, Volume 7, Issue 2, February-2016 ISSN 2229-5518 378 Review: Web Semantics in Cloud Computing Rupali A. Meshram, Assistant professor, CSE Department, PRMIT&R, Badnera Komal R. Hole, Assistant professor, CSE Department, PRMIT&R, Badnera Pranita P. Deshmukh, Assistant professor, CSE Department, PRMIT&R, Badnera Roshan A. karwa, Assistant professor, CSE Department, PRMIT&R, Badnera Abstract – on web service and the web services are more efficient This paper shows how cloud computing on the so as to attract more users. background of Semantic Web is going to important. Many cloud computing services are Cloud computing has become interesting area both in implementing on distributed web environment and academia and industry. Semantic Web has been an provides virtualized resources, which focuses on some important research area in academic and industrial aspects like the semantic ambiguity, distributed researchers. Many applications will need to work with semantic reasoning, distributed semantic models large amounts of data, one particular application would construction, distributed metadata storage, parallel certainly not exist without the capability of accessing and semantic computing, etc. It also introduces many real processing arbitrary amounts of metadata: search world applications like semantic analysis in science engines that locate the data and services that other computing, cloud resource discovery, accurate applications need. To solve this problem, semantic web advertise recommendation, and web context services used with cloud computing. In this paper we understanding in commercial services. present the use of well established semantic technologies The Semantic Web is gaining immense in Cloud computing. The purpose of this report is to give popularity.
    [Show full text]
  • A Multimodal Approach for Automatic WS Semantic Annotation
    MATAWS: A Multimodal Approach for Automatic WS Semantic Annotation Cihan Aksoy1,2, Vincent Labatut1, Chantal Cherifi1,3 and Jean-François Santucci3, 1 Galatasaray University, Computer Science Department, Ortaköy/İstanbul, Turkey 2 TÜBİTAK, Gebze/Kocaeli, Turkey 3 University of Corsica, Corte, France [email protected] Abstract. Many recent works aim at developing methods and tools for the processing of semantic Web services. In order to be properly tested, these tools must be applied to an appropriate benchmark, taking the form of a collection of semantic WS descriptions. However, all of the existing publicly available collections are limited by their size or their realism (use of randomly generated or resampled descriptions). Larger and realistic syntactic (WSDL) collections exist, but their semantic annotation requires a certain level of automation, due to the number of operations to be processed. In this article, we propose a fully automatic method to semantically annotate such large WS collections. Our approach is multimodal, in the sense it takes advantage of the latent semantics present not only in the parameter names, but also in the type names and structures. Concept-to-word association is performed by using Sigma, a mapping of WordNet to the SUMO ontology. After having described in details our annotation method, we apply it to the larger collection of real-world syntactic WS descriptions we could find, and assess its efficiency. Keywords: Web Service, Semantic Web, Semantic Annotation, Ontology, WSDL, OWL-S. 1 Introduction The semantic Web encompasses technologies which can make possible the generation of the kind of intelligent documents imagined ten years ago [1].
    [Show full text]
  • Semantic Computing
    SEMANTIC COMPUTING Lecture 12: Ontology Learning: Introduction Dagmar Gromann International Center For Computational Logic TU Dresden, 19 December 2018 Overview • Defining ontology • Introduction to ontology learning Dagmar Gromann, 19 December 2018 Semantic Computing 2 What is an ontology? Dagmar Gromann, 19 December 2018 Semantic Computing 3 What is an ontology? • Ontology (no plural = uncountable): philosophical study of being (existence, reality, categories of beings, etc.) • ontology (ontologies = countable): “formal, explicit specification of shared conceptualizations.’ (Studer et al. 1998: 186) • computational artifact designed with a purpose in mind • represented in a formal language to allow for the processing, reusing, and sharing of knowledge among humans and machines Studer, Rudi, Benjamins, Richard V., and Fensel, Dieter (1998), ’Knowledge Engineering: Principles and Methods’, Data & Knowledge Engineering, 25 (1-2), 161-198. Dagmar Gromann, 19 December 2018 Semantic Computing 4 Specification? Formal? A formal, explicit specification of shared conceptualizations; ideally an ontology: • is a model of (some aspect of) the world • captures a shared understanding of the domain of interest, a shared conceptualization • defines a vocabulary relevant to the domain and interpreted the same way by different users • specifies the meaning of the vocabulary in an explicit manner and often in a formal specification language Two main parts: • structure of the model = set of axioms • particular objects and situations = set of instances Dagmar Gromann,
    [Show full text]
  • Exploiting Semantic Web Knowledge Graphs in Data Mining
    Exploiting Semantic Web Knowledge Graphs in Data Mining Inauguraldissertation zur Erlangung des akademischen Grades eines Doktors der Naturwissenschaften der Universit¨atMannheim presented by Petar Ristoski Mannheim, 2017 ii Dekan: Dr. Bernd Lübcke, Universität Mannheim Referent: Professor Dr. Heiko Paulheim, Universität Mannheim Korreferent: Professor Dr. Simone Paolo Ponzetto, Universität Mannheim Tag der mündlichen Prüfung: 15 Januar 2018 Abstract Data Mining and Knowledge Discovery in Databases (KDD) is a research field concerned with deriving higher-level insights from data. The tasks performed in that field are knowledge intensive and can often benefit from using additional knowledge from various sources. Therefore, many approaches have been proposed in this area that combine Semantic Web data with the data mining and knowledge discovery process. Semantic Web knowledge graphs are a backbone of many in- formation systems that require access to structured knowledge. Such knowledge graphs contain factual knowledge about real word entities and the relations be- tween them, which can be utilized in various natural language processing, infor- mation retrieval, and any data mining applications. Following the principles of the Semantic Web, Semantic Web knowledge graphs are publicly available as Linked Open Data. Linked Open Data is an open, interlinked collection of datasets in machine-interpretable form, covering most of the real world domains. In this thesis, we investigate the hypothesis if Semantic Web knowledge graphs can be exploited as background knowledge in different steps of the knowledge discovery process, and different data mining tasks. More precisely, we aim to show that Semantic Web knowledge graphs can be utilized for generating valuable data mining features that can be used in various data mining tasks.
    [Show full text]
  • Semantic Data Mining: a Survey of Ontology-Based Approaches
    Semantic Data Mining: A Survey of Ontology-based Approaches Dejing Dou Hao Wang Haishan Liu Computer and Information Science Computer and Information Science Computer and Information Science University of Oregon University of Oregon University of Oregon Eugene, OR 97403, USA Eugene, OR 97403, USA Eugene, OR 97403, USA Email: [email protected] Email: [email protected] Email: [email protected] Abstract—Semantic Data Mining refers to the data mining tasks domain knowledge can work as a set of prior knowledge of that systematically incorporate domain knowledge, especially for- constraints to help reduce search space and guide the search mal semantics, into the process. In the past, many research efforts path [8], [9]. Further more, the discovered patterns can be have attested the benefits of incorporating domain knowledge in data mining. At the same time, the proliferation of knowledge cleaned out [49], [48] or made more visible by encoding them engineering has enriched the family of domain knowledge, espe- in the formal structure of knowledge engineering [76]. cially formal semantics and Semantic Web ontologies. Ontology is To make use of domain knowledge in the data mining pro- an explicit specification of conceptualization and a formal way to cess, the first step must account for representing and building define the semantics of knowledge and data. The formal structure the knowledge by models that the computer can further access of ontology makes it a nature way to encode domain knowledge for the data mining use. In this survey paper, we introduce and process. The proliferation of knowledge engineering (KE) general concepts of semantic data mining.
    [Show full text]