Master's Thesis

Total Page:16

File Type:pdf, Size:1020Kb

Master's Thesis The present work was submitted to the Institute of Information Systems and Databases RWTH Aachen University Faculty 1 - Mathematics, Computer Science and Natural Sciences Course of Studies - Computer Science Master’s thesis An Efficient Semantic Search Engine for Research Data in an RDF-based Knowledge Graph communicated by Prof. Dr. Stefan Decker Examiners: Author: Prof. Dr. Stefan Decker Sarah Bensberg, 378908 Prof. Dr. Matthias Müller Dr. Marius Politze Aachen, October 16, 2020 Statutory Declaration in Lieu of an Oath Bensberg, Sarah 378908 Last Name, First Name Matriculation No. (optional) I hereby declare in lieu of an oath that I have completed the present Master thesis entitled An Efficient Semantic Search Engine for Research Data in an RDF-based Knowledge Graph independently and without illegitimate assistance from third parties (such as academic ghostwriters). I have used no other than the specified sources and aids. In case that the thesis is additionally submitted in an electronic format, I declare that the written and electronic versions are fully identical. The thesis has not been submitted to any examina- tion body in this, or similar, form. Aachen , October 16, 2020 City, Date Signature Official Notification: Para. 156 StGB (German Criminal Code): False Statutory Declarations Whoever before a public authority competent to administer statutory declarations falsely makes such a declaration or falsely testifies while referring to such a declaration shall be liable to impris- onment not exceeding three years or a fine. Para. 161 StGB (German Criminal Code): False Statutory Declarations Due to Negligence (1) If a person commits one of the offences listed in sections 154 through 156 negligently the penalty shall be imprisonment not exceeding one year or a fine. (2) The offender shall be exempt from liability if he or she corrects their false testimony in time. The provisions of section 158 (2) and (3) shall apply accordingly. I have read and understood the above official notification: Aachen , October 16, 2020 City, Date Signature Acknowledgements At this point, I would like to thank everyone who supported me during the preparation of this master thesis. First of all, I would like to thank my supervisor Dr. Marius Politze for giving me the opportunity to conduct my master thesis in the Process and Application Development Research group at the IT Center of RWTH Aachen University, where the infrastructure used in the thesis was provided. I am grateful for the challenging tasks, much support, and helpful advice. My thanks go to Prof. Dr. Stefan Decker and Prof. Dr. Matthias Müller, who kindly agreed to supervise and review my work. I would like to thank them very much for their helpful suggestions and constructive criticism during the creation of this work. Furthermore, I deeply thank all members of the department for their interest in my work, their patience, and their helpfulness. Finally, I thank my family and my friends for supporting me, keeping me motivated, and always being there for me, not only during the time of my master thesis. Abstract An Efficient Semantic Search Engine for Research Data in an RDF-based Knowledge Graph Sarah Bensberg Digital transformation affects all areas of society: More data is produced and workflows rely on data analysis leading to new challenges in data management. Within the context of research data management, the National Research Data Infrastructure aims to system- atize these data stocks and make them accessible. RWTH Aachen University supports this effort with the development of the research data management platform CoScInE. Re- search data is made accessible independent of the actual storage location and described with metadata that allows a structured search. The goal is to make the data findable, accessible, interoperable, and reusable, according to the FAIR Guiding Principles. The W3C standards for the semantic web, such as the data model RDF, the corresponding query language SPARQL, and related technologies, provide the means to describe digital resources but require a significant amount of technical knowledge. This thesis deals with the implementation of a research data search in an RDF knowledge graph that is less dependent on the users’ background in knowledge engineering. Different approaches are considered under several evaluation criteria and it is investigated whether mapping the RDF data into a search index for use in a search engine improves the quality of the search and results. Such an implementation is opposed to the systematic generation of a SPARQL query. In the course of the thesis, a transformation of RDF graphs into a search index using application profiles and rules was developed. This allows the use of all functionalities and search syntaxes provided by the search engine applied. The evaluation shows that in most cases an approach is either easy to use but slow or ineffective, or, on the contrary, fast and effective but difficult to use. With the presented transformation, a solution was found that combines these two contradictory properties. Zusammenfassung Eine effiziente semantische Suchmaschine für Forschungsdaten in einem RDF-basierten Wissensgraphen Sarah Bensberg Der digitale Wandel wirkt sich auf alle Bereiche der Gesellschaft aus: Es werden stetig mehr Daten produziert und Arbeitsprozesse sind auf Datenanalysen angewiesen. Dies führt zu neuen Herausforderungen im Datenmanagement. Im Rahmen des Forschungsdatenman- agements zielt die Nationale Forschungsdateninfrastruktur darauf ab, diese Datenbestände zu systematisieren und zugänglich zu machen. Die RWTH Aachen University unter- stützt dieses Vorhaben mit der Entwicklung der Forschungsdatenmanagement Plattform CoScInE. Forschungsdaten werden unabhängig von ihrem eigentlichen Speicherort zugäng- lich gemacht und mit Metadaten beschrieben, die eine strukturierte Suche ermöglichen. Ziel ist es die Daten nach den FAIR Leitprinzipien auffindbar, zugänglich, interoperabel und wiederverwendbar zu machen. Die W3C-Standards für das semantische Web, wie das Datenmodell RDF, die zugehörige Abfragesprache SPARQL und verwandte Technologien, bieten die Mittel zur Beschreibung digitaler Ressourcen, erfordern jedoch ein erhebliches Maß an technischem Wissen. Diese Arbeit befasst sich mit der Implementierung einer Forschungsdatensuche in einem RDF- Wissensgraphen, die unabhängig vom fachlichen Hintergrund des Nutzers ist. Dazu werden verschiedene Ansätze unter mehreren Evaluationskriterien betrachtet und der Hypothese nachgegangen, ob die Abbildung der RDF Daten in einen Suchindex zur Verwendung in einer Suchmaschine die Qualität der Suche und Ergebnisse verbessern. Eine solche Implementierung steht der systematischen Generierung einer SPARQL Abfrage gegenüber. Im Rahmen der Arbeit wurde eine Transformation von RDF Graphen in einen Suchindex mithilfe von Applikationsprofilen und Regeln entwickelt. Hierdurch können alle durch die verwendete Suchmaschine bereitgestellten Funktionalitäten und Suchsyntaxen verwendet werden. Die Evaluation zeigt, dass ein Ansatz in den meisten Fällen entweder einfach zu bedienen, jedoch langsam oder ineffektiv ist oder im Gegenteil schnell und effektiv, dafür aber schwierig zu benutzen ist. Mit der vorgestellten Transformation wurde ein Lösung gefunden, welche diese beiden konträren Eigenschaften vereint. Contents 1 Introduction1 1.1 NFDI.......................................1 1.2 Motivation....................................1 1.3 RDM at RWTH Aachen University.......................2 1.4 CoScInE......................................3 1.5 Research Goals and Questions..........................4 1.6 Methodology and Structure...........................6 2 Fundamentals9 2.1 Metadata and Related Concepts........................9 2.2 Semantic Web and Linked Data......................... 10 2.2.1 RDF Data Model............................. 10 2.2.2 RDF Vocabularies............................ 11 2.2.3 Validate and Constrict RDF Data using SHACL and DASH..... 12 2.2.4 Query and Update RDF Data using SPARQL............. 13 2.2.5 Infer over RDF Data........................... 14 2.2.6 Open and Closed World Assumption.................. 14 2.3 Information Retrieval.............................. 16 2.3.1 Unstructured, Structured, and Semi-Structured Data......... 16 2.3.2 Search Queries.............................. 16 2.3.3 Different Approaches of Information Retrieval............. 16 2.4 Semantic Search................................. 17 2.4.1 Entity Retrieval Model for Web Data................. 17 2.4.2 Entity Attribute-Value Model...................... 18 2.4.3 Search Model for Web Data....................... 18 2.5 Related Work................................... 19 3 Technical Details of CoScInE 25 3.1 Data Model.................................... 25 3.2 Database Model................................. 29 3.3 Components and Processes........................... 29 3.4 Search Requirements............................... 30 3.4.1 General Conditions............................ 30 3.4.2 Effectiveness............................... 31 3.4.3 Usability................................. 31 3.4.4 Complexity of Search Request for the User.............. 31 3.4.5 Efficiency................................. 31 3.4.6 Response Time.............................. 31 3.4.7 Scalability................................. 32 3.4.8 Additional Effort............................. 32 4 Different Approaches 33 4.1 Literal and Additional Rules........................... 33 4.2 Full-Text Search in RDF Literals........................ 37 4.3 SPARQL Query Builder............................
Recommended publications
  • V a Lida T in G R D F Da
    Series ISSN: 2160-4711 LABRA GAYO • ET AL GAYO LABRA Series Editors: Ying Ding, Indiana University Paul Groth, Elsevier Labs Validating RDF Data Jose Emilio Labra Gayo, University of Oviedo Eric Prud’hommeaux, W3C/MIT and Micelio Iovka Boneva, University of Lille Dimitris Kontokostas, University of Leipzig VALIDATING RDF DATA This book describes two technologies for RDF validation: Shape Expressions (ShEx) and Shapes Constraint Language (SHACL), the rationales for their designs, a comparison of the two, and some example applications. RDF and Linked Data have broad applicability across many fields, from aircraft manufacturing to zoology. Requirements for detecting bad data differ across communities, fields, and tasks, but nearly all involve some form of data validation. This book introduces data validation and describes its practical use in day-to-day data exchange. The Semantic Web offers a bold, new take on how to organize, distribute, index, and share data. Using Web addresses (URIs) as identifiers for data elements enables the construction of distributed databases on a global scale. Like the Web, the Semantic Web is heralded as an information revolution, and also like the Web, it is encumbered by data quality issues. The quality of Semantic Web data is compromised by the lack of resources for data curation, for maintenance, and for developing globally applicable data models. At the enterprise scale, these problems have conventional solutions. Master data management provides an enterprise-wide vocabulary, while constraint languages capture and enforce data structures. Filling a need long recognized by Semantic Web users, shapes languages provide models and vocabularies for expressing such structural constraints.
    [Show full text]
  • Deciding SHACL Shape Containment Through Description Logics Reasoning
    Deciding SHACL Shape Containment through Description Logics Reasoning Martin Leinberger1, Philipp Seifer2, Tjitze Rienstra1, Ralf Lämmel2, and Steffen Staab3;4 1 Inst. for Web Science and Technologies, University of Koblenz-Landau, Germany 2 The Software Languages Team, University of Koblenz-Landau, Germany 3 Institute for Parallel and Distributed Systems, University of Stuttgart, Germany 4 Web and Internet Science Research Group, University of Southampton, England Abstract. The Shapes Constraint Language (SHACL) allows for for- malizing constraints over RDF data graphs. A shape groups a set of constraints that may be fulfilled by nodes in the RDF graph. We investi- gate the problem of containment between SHACL shapes. One shape is contained in a second shape if every graph node meeting the constraints of the first shape also meets the constraints of the second. Todecide shape containment, we map SHACL shape graphs into description logic axioms such that shape containment can be answered by description logic reasoning. We identify several, increasingly tight syntactic restrictions of SHACL for which this approach becomes sound and complete. 1 Introduction RDF has been designed as a flexible, semi-structured data format. To ensure data quality and to allow for restricting its large flexibility in specific domains, the W3C has standardized the Shapes Constraint Language (SHACL)5. A set of SHACL shapes are represented in a shape graph. A shape graph represents constraints that only a subset of all possible RDF data graphs conform to. A SHACL processor may validate whether a given RDF data graph conforms to a given SHACL shape graph. A shape graph and a data graph that act as a running example are pre- sented in Fig.
    [Show full text]
  • Using Rule-Based Reasoning for RDF Validation
    Using Rule-Based Reasoning for RDF Validation Dörthe Arndt, Ben De Meester, Anastasia Dimou, Ruben Verborgh, and Erik Mannens Ghent University - imec - IDLab Sint-Pietersnieuwstraat 41, B-9000 Ghent, Belgium [email protected] Abstract. The success of the Semantic Web highly depends on its in- gredients. If we want to fully realize the vision of a machine-readable Web, it is crucial that Linked Data are actually useful for machines con- suming them. On this background it is not surprising that (Linked) Data validation is an ongoing research topic in the community. However, most approaches so far either do not consider reasoning, and thereby miss the chance of detecting implicit constraint violations, or they base them- selves on a combination of dierent formalisms, eg Description Logics combined with SPARQL. In this paper, we propose using Rule-Based Web Logics for RDF validation focusing on the concepts needed to sup- port the most common validation constraints, such as Scoped Negation As Failure (SNAF), and the predicates dened in the Rule Interchange Format (RIF). We prove the feasibility of the approach by providing an implementation in Notation3 Logic. As such, we show that rule logic can cover both validation and reasoning if it is expressive enough. Keywords: N3, RDF Validation, Rule-Based Reasoning 1 Introduction The amount of publicly available Linked Open Data (LOD) sets is constantly growing1, however, the diversity of the data employed in applications is mostly very limited: only a handful of RDF data is used frequently [27]. One of the reasons for this is that the datasets' quality and consistency varies signicantly, ranging from expensively curated to relatively low quality data [33], and thus need to be validated carefully before use.
    [Show full text]
  • Bibliography of Erik Wilde
    dretbiblio dretbiblio Erik Wilde's Bibliography References [1] AFIPS Fall Joint Computer Conference, San Francisco, California, December 1968. [2] Seventeenth IEEE Conference on Computer Communication Networks, Washington, D.C., 1978. [3] ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, Los Angeles, Cal- ifornia, March 1982. ACM Press. [4] First Conference on Computer-Supported Cooperative Work, 1986. [5] 1987 ACM Conference on Hypertext, Chapel Hill, North Carolina, November 1987. ACM Press. [6] 18th IEEE International Symposium on Fault-Tolerant Computing, Tokyo, Japan, 1988. IEEE Computer Society Press. [7] Conference on Computer-Supported Cooperative Work, Portland, Oregon, 1988. ACM Press. [8] Conference on Office Information Systems, Palo Alto, California, March 1988. [9] 1989 ACM Conference on Hypertext, Pittsburgh, Pennsylvania, November 1989. ACM Press. [10] UNIX | The Legend Evolves. Summer 1990 UKUUG Conference, Buntingford, UK, 1990. UKUUG. [11] Fourth ACM Symposium on User Interface Software and Technology, Hilton Head, South Carolina, November 1991. [12] GLOBECOM'91 Conference, Phoenix, Arizona, 1991. IEEE Computer Society Press. [13] IEEE INFOCOM '91 Conference on Computer Communications, Bal Harbour, Florida, 1991. IEEE Computer Society Press. [14] IEEE International Conference on Communications, Denver, Colorado, June 1991. [15] International Workshop on CSCW, Berlin, Germany, April 1991. [16] Third ACM Conference on Hypertext, San Antonio, Texas, December 1991. ACM Press. [17] 11th Symposium on Reliable Distributed Systems, Houston, Texas, 1992. IEEE Computer Society Press. [18] 3rd Joint European Networking Conference, Innsbruck, Austria, May 1992. [19] Fourth ACM Conference on Hypertext, Milano, Italy, November 1992. ACM Press. [20] GLOBECOM'92 Conference, Orlando, Florida, December 1992. IEEE Computer Society Press. http://github.com/dret/biblio (August 29, 2018) 1 dretbiblio [21] IEEE INFOCOM '92 Conference on Computer Communications, Florence, Italy, 1992.
    [Show full text]
  • SHACL Satisfiability and Containment
    SHACL Satisfiability and Containment Paolo Pareti1 , George Konstantinidis1 , Fabio Mogavero2 , and Timothy J. Norman1 1 University of Southampton, Southampton, United Kingdom {pp1v17,g.konstantinidis,t.j.norman}@soton.ac.uk 2 Università degli Studi di Napoli Federico II, Napoli, Italy [email protected] Abstract. The Shapes Constraint Language (SHACL) is a recent W3C recom- mendation language for validating RDF data. Specifically, SHACL documents are collections of constraints that enforce particular shapes on an RDF graph. Previous work on the topic has provided theoretical and practical results for the validation problem, but did not consider the standard decision problems of satisfiability and containment, which are crucial for verifying the feasibility of the constraints and important for design and optimization purposes. In this paper, we undertake a thorough study of different features of non-recursive SHACL by providing a translation to a new first-order language, called SCL, that precisely captures the semantics of SHACL w.r.t. satisfiability and containment. We study the interaction of SHACL features in this logic and provide the detailed map of decidability and complexity results of the aforementioned decision problems for different SHACL sublanguages. Notably, we prove that both problems are undecidable for the full language, but we present decidable combinations of interesting features. 1 Introduction The Shapes Constraint Language (SHACL) has been recently introduced as a W3C recommendation language for the validation of RDF graphs, and it has already been adopted by mainstream tools and triplestores. A SHACL document is a collection of shapes which define particular constraints and specify which nodes in a graph should be validated against these constraints.
    [Show full text]
  • INPUT CONTRIBUTION Group Name:* MAS WG Title:* Semantic Web Best Practices
    Semantic Web best practices INPUT CONTRIBUTION Group Name:* MAS WG Title:* Semantic Web best practices. Semantic Web Guidelines for domain knowledge interoperability to build the Semantic Web of Things Source:* Eurecom, Amelie Gyrard, Christian Bonnet Contact: Amelie Gyrard, [email protected], Christian Bonnet, [email protected] Date:* 2014-04-07 Abstract:* This contribution proposes to describe the semantic web best practices, semantic web tools, and existing domain ontologies for uses cases (smart home and health). Agenda Item:* Tbd Work item(s): MAS Document(s) Study of Existing Abstraction & Semantic Capability Enablement Impacted* Technologies for consideration by oneM2M. Intended purpose of Decision document:* Discussion Information Other <specify> Decision requested or This is an informative paper proposed by the French Eurecom institute as a recommendation:* guideline to MAS contributors on Semantic web best practices, as it was suggested during MAS#9.3 call. Amélie Gyrard is a new member in oneM2M (via ETSI PT1). oneM2M IPR STATEMENT Participation in, or attendance at, any activity of oneM2M, constitutes acceptance of and agreement to be bound by all provisions of IPR policy of the admitting Partner Type 1 and permission that all communications and statements, oral or written, or other information disclosed or presented, and any translation or derivative thereof, may without compensation, and to the extent such participant or attendee may legally and freely grant such copyright rights, be distributed, published, and posted on oneM2M’s web site, in whole or in part, on a non- exclusive basis by oneM2M or oneM2M Partners Type 1 or their licensees or assignees, or as oneM2M SC directs.
    [Show full text]
  • Using Rule-Based Reasoning for RDF Validation
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Ghent University Academic Bibliography Using Rule-Based Reasoning for RDF Validation Dörthe Arndt, Ben De Meester, Anastasia Dimou, Ruben Verborgh, and Erik Mannens Ghent University - imec - IDLab Sint-Pietersnieuwstraat 41, B-9000 Ghent, Belgium [email protected] Abstract. The success of the Semantic Web highly depends on its in- gredients. If we want to fully realize the vision of a machine-readable Web, it is crucial that Linked Data are actually useful for machines con- suming them. On this background it is not surprising that (Linked) Data validation is an ongoing research topic in the community. However, most approaches so far either do not consider reasoning, and thereby miss the chance of detecting implicit constraint violations, or they base them- selves on a combination of dierent formalisms, eg Description Logics combined with SPARQL. In this paper, we propose using Rule-Based Web Logics for RDF validation focusing on the concepts needed to sup- port the most common validation constraints, such as Scoped Negation As Failure (SNAF), and the predicates dened in the Rule Interchange Format (RIF). We prove the feasibility of the approach by providing an implementation in Notation3 Logic. As such, we show that rule logic can cover both validation and reasoning if it is expressive enough. Keywords: N3, RDF Validation, Rule-Based Reasoning 1 Introduction The amount of publicly available Linked Open Data (LOD) sets is constantly growing1, however, the diversity of the data employed in applications is mostly very limited: only a handful of RDF data is used frequently [27].
    [Show full text]
  • Rule Interchange Format: the Framework
    Rule Interchange Format: The Framework Michael Kifer State University of New York at Stony Brook 1 Outline What is Rule Interchange Format (RIF)? RIF Framework Basic Logic Dialect (BLD) 2 What is RIF? A collection of dialects Rule system 1 (rigorously defined rule languages) semantics preserving Intended to facilitate rule mapping sharing and exchange RIF dialect X Dialect consistency Sharing of RIF machinery: semantics preserving XML syntax mapping Presentation syntax Semantics Rule system 2 3 Why Rule Exchange ? (and not The One True Rule Language ) Many different paradigms for rule languages Pure first-order Logic programming/deductive databases Production rules Reactive rules Many different features, syntaxes Different commercial interests Many egos, different preferences, ... 4 Why RIF Dialects ? (and not just one dialect ) Again: many paradigms for rule languages First-order rules Logic programming/deductive databases Reactive rules Production rules Many different semantics Classical first-order Stable-model semantics for negation Well-founded semantics for negation ... ... ... A carefully chosen set of interrelated dialects can serve the purpose of sharing and exchanging rules over the Web 5 Current State of RIF Dialects Need your feedback! Advanced LP . Advanced LP dialect 1 dialect N RIF-PRD (Production Rules Dialect) Basic LP dialect RIF-BLD (Basic Logic Dialect) - ready to go - under development - future plans RIF-Core 6 Why Is RIF Important? Best chance yet to bring rule languages into mainstream Can make
    [Show full text]
  • XBRL and the Semantic
    How can we exploit XBRL and Semantic Web technologies to realize the opportunities? Track 3: Case Studies in XBRL Solutions 19th XBRL International Conference, Paris, 23rd June 2009 Dave Raggett <[email protected]> W3C Fellow – Financial data & Semantic Web Member – XBRL INT Technical Standards Board With some slides from Diane Mueller, JustSystems 1 Outline XBRL: adding semantics to business reports World Wide Adoption of XBRL Users and use cases for XBRL Realizing the potential Feeding the Semantic Web ◦ XBRL, XLink, RDF, Turtle, SPARQL, OWL ◦ Web APIs, Smart Searches, Web Scale Queries ◦ Findings June 2009 2 So What is XBRL? eXtensible Business Reporting Language ◦ a freely available electronic language for financial reporting. ◦ based on XML, XML Schema and XLink ◦ based on accepted financial reporting standards and practices to transport financial reports across all software, platforms and technologies Business reporting includes, but is not limited to: ◦ financial statements, ◦ financial and non-financial information ◦ general ledger transactions ◦ regulatory filings such as annual and quarterly financial statements. “XBRL allows software vendors, programmers and end users who adopt it as a specification to enhance the creation, exchange, and comparison of business reporting information” from xbrl.org June 2009 3 Not just a number XBRL binds each reported fact to a concept in a reporting taxonomy e.g. US GAAP, IFRS ◦ Each concept can be bound to a description and its definition in the accounting literature Hierarchy of Terse label, EN Currency, amount Reporting period concepts Impairment of goodwill: $M21 3 months to 2009-04-30 Description Impairment of goodwill: Loss recognized during the period that results from the write-down of goodwill after comparing the implied fair value of reporting unit goodwill with the carrying amount of that goodwill.
    [Show full text]
  • 15 Years of Semantic Web: an Incomplete Survey
    Ku¨nstl Intell (2016) 30:117–130 DOI 10.1007/s13218-016-0424-1 TECHNICAL CONTRIBUTION 15 Years of Semantic Web: An Incomplete Survey 1 2 Birte Glimm • Heiner Stuckenschmidt Received: 7 November 2015 / Accepted: 5 January 2016 / Published online: 23 January 2016 Ó Springer-Verlag Berlin Heidelberg 2016 Abstract It has been 15 years since the first publications today’s Web: Uniform Resource Identifiers (URIs) for proposed the use of ontologies as a basis for defining assigning unique identifiers to resources, the HyperText information semantics on the Web starting what today is Markup Language (HTML) for specifying the formatting known as the Semantic Web Research Community. This of Web pages, and the Hypertext Transfer Protocol (HTTP) work undoubtedly had a significant influence on AI as a that allows for the retrieval of linked resources from across field and in particular the knowledge representation and the Web. Typically, the markup of standard Web pages Reasoning Community that quickly identified new chal- describes only the formatting and, hence, Web pages and lenges and opportunities in using Description Logics in a the navigation between them using hyperlinks is targeted practical setting. In this survey article, we will try to give towards human users (cf. Fig. 1). an overview of the developments the field has gone through In 2001, Tim Berners-Lee, James Hendler, and Ora in these 15 years. We will look at three different aspects: Lassila describe their vision for a Semantic Web [5]: the evolution of Semantic Web Language Standards, the The Semantic Web is not a separate Web but an evolution of central topics in the Semantic Web Commu- extension of the current one, in which information is nity and the evolution of the research methodology.
    [Show full text]
  • Finding Non-Compliances with Declarative Process Constraints Through Semantic Technologies
    Finding Non-compliances with Declarative Process Constraints through Semantic Technologies Claudio Di Ciccio2, Fajar J. Ekaputra1, Alessio Cecconi2, Andreas Ekelhart1, Elmar Kiesling1 1 TU Wien, Favoritenstrasse 9-11, 1040 Vienna, Austria, [email protected] 2 WU Vienna, Welthandelsplatz 1, 1020 Vienna, Austria fclaudio.di.ciccio,[email protected] Abstract. Business process compliance checking enables organisations to assess whether their processes fulfil a given set of constraints, such as regulations, laws, or guidelines. Whilst many process analysts still rely on ad-hoc, often handcrafted per- case checks, a variety of constraint languages and approaches have been developed in recent years to provide automated compliance checking. A salient example is DE- CLARE, a well-established declarative process specification language based on tem- poral logics. DECLARE specifies the behaviour of processes through temporal rules that constrain the execution of tasks. So far, however, automated compliance check- ing approaches typically report compliance only at the aggregate level, using binary evaluations of constraints on execution traces. Consequently, their results lack gran- ular information on violations and their context, which hampers auditability of pro- cess data for analytic and forensic purposes. To address this challenge, we propose a novel approach that leverages semantic technologies for compliance checking. Our approach proceeds in two stages. First, we translate DECLARE templates into state- ments in SHACL, a graph-based constraint language. Then, we evaluate the resulting constraints on the graph-based, semantic representation of process execution logs. We demonstrate the feasibility of our approach by testing its implementation on real- world event logs. Finally, we discuss its implications and future research directions.
    [Show full text]
  • Integrating Semantic Web Technologies and ASP for Product Configuration
    Integrating Semantic Web Technologies and ASP for Product Configuration Stefan Bischof1 and Gottfried Schenner1 and Simon Steyskal1 and Richard Taupe1,2 Abstract. Currently there is no single dominating technol- own proprietary specification language. Also in the product ogy for building product configurator systems. While research configuration community there is currently no standard way often focuses on a single technology/paradigm, building an to specify product configuration problems although the topic industrial-scale product configurator system will almost al- of ontologies and product configuration is over 20 years old ways require the combination of various technologies for dif- (cf. [21]) and pre-dates the Semantic Web. ferent aspects of the system (knowledge representation, rea- In Section2 we describe the technologies used for this pa- soning, solving, user interface, etc.) This paper demonstrates per. In Section3 we show how to define a product configurator how to build such a hybrid system and how to integrate var- knowledge base with RDF and SHACL, that allows checking ious technologies and leverage their respective strengths. of constraints and interactive solving. Due to the increasing popularity of the industrial knowledge For solving product configuration problems, RDF and graph we utilize Semantic Web technologies (RDF, OWL and SHACL are combined with Answer Set Programming in Sec- the Shapes Constraint Language (SHACL)) for knowledge tion4. In Section5 we illustrate how reasoning with OWL representation and integrate it with Answer Set Programming can help to integrate the product configuration solutions into (ASP), a well-established solving paradigm for product con- the knowledge graph and facilitate reuse of ontologies. figuration.
    [Show full text]