Ontology-Based Big Data Management
Recommended publications
-
Applications
Applications. CSE 595 – Semantic Web. Instructor: Dr. Paul Fodor, Stony Brook University. http://www3.cs.stonybrook.edu/~pfodor/courses/cse595.html Lecture Outline: GoodRelations, BBC Artists, BBC World Cup Website, Government Data, New York Times, Sig.ma and Sindice, Swoogle, OpenCalais, Schema.org, data.world, Elsevier, Audi, Data Integration, Swiss Life, EnerSearch, E-Learning, Web Services, Multimedia Collection Indexing at Scotland Yard, Online Procurement at Daimler, Device Interoperability at Nokia, Publication Management. @ Semantic Web Primer. GoodRelations: E-commerce, and in particular Business-to-Consumer (B2C) e-commerce, has been one of the main drivers behind the rapid adoption of the World Wide Web in everyday life. It is now commonplace to see URLs listed on storefronts and goods vehicles. Taking the UK as an example, the B2C market has grown from £87 million in April 2000 to £68.4 billion by the end of 2009, a thousand-fold increase over a single decade. The USA B2C market in 2017 was $660 billion, but growth is slowing (statista.com). GoodRelations: The e-commerce marketplace suffers from all the deficits of the traditional web. E-commerce websites are typically generated from structured information systems that list price, availability, type of product, delivery options, etc., but by the time this information reaches the company's web pages, it has been turned into HTML and all machine-interpretable structure has disappeared, with the result that machines can no longer distinguish a price from a product code -
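The loss of machine-interpretable structure described in the excerpt above can be illustrated with a small sketch. It builds a Schema.org `Product` description as JSON-LD, so that the price and the product code stay in separate, typed fields. The helper name `product_jsonld` and the sample product values are hypothetical and are not taken from the lecture; only the Schema.org terms (`Product`, `Offer`, `sku`, `price`, `priceCurrency`) are standard vocabulary.

```python
import json


def product_jsonld(name, sku, price, currency):
    """Build a Schema.org Product description as a JSON-LD dictionary.

    The function name and arguments are illustrative assumptions.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        # The product code lives in its own property ...
        "sku": sku,
        "offers": {
            "@type": "Offer",
            # ... and so does the price, together with its currency.
            "price": f"{price:.2f}",
            "priceCurrency": currency,
        },
    }


# Hypothetical sample data, for illustration only.
doc = product_jsonld("Example kettle", "KT-1001", 24.5, "GBP")
print(json.dumps(doc, indent=2))
```

A page that embeds such a block alongside its HTML lets a crawler read `sku` and `price` directly instead of guessing which string on the page is which.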
Product Schema, SEO, Structured Data | Caliber Media Group & Schema.Org Products | Product Schema | SEO
ProductCamp: Product Schema, SEO, Structured Data | Caliber Media Group & Schema.org Products | Product Schema | SEO. Caliber Media Group presented at Product Camp Southern California 2014 and also volunteered its services toward the conference's success. Slide titles (ProductCamp SoCal 2014 & Schema): Schema | Examples | Resources | Tools | Readings; Google, after Schema; Google via Disconnect & Yahoo, after Schema; How did Caliber help ProductCamp beat all of the other events?; So what was inside the pages?; Search => Schema, with JSON-LD; When & Who? Very short version: Martin Hepp, http://www.heppnetz.de/projects/goodrelations/; When & Who? Short version: here they come; GoodRelations added; Why, and what does Schema do for me? http://www.schema.org; Product, since this is ProductCamp: http://schema.org/Product; An actual Product, since this is ProductCamp; Recall, Products as Structured Data: Structure, Hierarchy, Structured Data; Show me Schema examples; Schema: Product ontology | microdata | Knowledge Graph; Product; Google; Product Schema locally; Product locally; Local Schema gone wild; Search & Product Schema | Knowledge Graph from Google Knowledge Vault: where?; Search & Product Schema | Knowledge Graph: where?; Google Tables. -
The Application of Semantic Web Technologies to Content Analysis in Sociology
The Application of Semantic Web Technologies to Content Analysis in Sociology. Master Thesis. Tabea Tietz, matriculation number: 749153, Faculty of Economics and Social Science, University of Potsdam. First reviewer: Alexander Knoth, M.A. Second reviewer: Prof. Dr. rer. nat. Harald Sack. Potsdam, August 2018. © Tabea Tietz, August 2018. Abstract: In sociology, texts are understood as social phenomena and provide a means to analyze social reality. Over the years, a broad range of techniques has evolved to perform such analysis: qualitative and quantitative approaches as well as completely manual analyses and computer-assisted methods. The development of the World Wide Web and social media, as well as technical developments like optical character recognition and automated speech recognition, contributed to the enormous increase of text available for analysis. This also led sociologists to rely more on computer-assisted approaches for their text analysis, including statistical Natural Language Processing (NLP) techniques. A variety of techniques, tools and use cases has developed, but there is no uniform way of standardizing these approaches. Furthermore, this problem is coupled with a lack of standards for reporting studies with regard to text analysis in sociology. Semantic Web and Linked Data provide a variety of standards to represent information and knowledge. Numerous applications make use of these standards, including possibilities to publish data and to perform Named Entity Linking, a specific branch of NLP. This thesis discusses the question of to what extent the standards and tools provided by the Semantic Web and Linked Data community may support computer-assisted text analysis in sociology. First, these tools and standards will be briefly introduced and then applied to the use case of constitutional texts of the Netherlands from 1884 to 2016. -
Internship Report — Mpri M2 Advances in Holistic Ontology Alignment
Internship Report | Mpri M2: Advances in Holistic Ontology Alignment. Antoine Amarilli, supervised by Pierre Senellart, Télécom ParisTech. General Context: The development of the semantic Web has given birth to a large number of data sources (ontologies) with independent data and schemas. These ontologies are usually generated from existing relational databases or extracted from semi-structured data. Ontology alignment is a technique to automatically integrate them by discovering overlap in their instances and similarities in their schemas. Such integration makes it possible to access the combined information of all these data sources simultaneously rather than separately. Ontology alignment must deal with numerous challenges: the schemas are heterogeneous, the volume of data is huge, and the information can be partially incomplete or inconsistent. Problem Studied: My internship focuses on the Paris (Probabilistic Alignment of Relations, Instances, and Schema) ontology alignment system [SAS11] developed by Fabian Suchanek, Pierre Senellart, and Serge Abiteboul at Inria Saclay's Webdam team. Paris is a generic system which aligns data and schema simultaneously (hence the adjective "holistic") and uses each of these alignments to cross-fertilize the other. The alignments produced are annotated with a confidence score and have a probabilistic interpretation. Paris was able to align large real-world datasets from the semantic Web. The aim of my internship is to propose improvements to Paris on some problems (both theoretical and practical) that were left open by the original system: achieving a better understanding of the behavior of Paris's equations, aligning heterogeneous schemas, being tolerant to differences in the strings of both ontologies, and improving the overall performance of the system. -
Wiktionary Matcher
Wiktionary Matcher. Jan Portisch 1,2 [0000-0001-5420-0663], Michael Hladik 2 [0000-0002-2204-3138], and Heiko Paulheim 1 [0000-0003-4386-8195]. 1 Data and Web Science Group, University of Mannheim, Germany fjan, [email protected] 2 SAP SE Product Engineering Financial Services, Walldorf, Germany fjan.portisch, [email protected] Abstract. In this paper, we introduce Wiktionary Matcher, an ontology matching tool that exploits Wiktionary as an external background knowledge source. Wiktionary is a large lexical knowledge resource that is collaboratively built online. Multiple current language versions of Wiktionary are merged and used for monolingual ontology matching by exploiting synonymy relations and for multilingual matching by exploiting the translations given in the resource. We show that Wiktionary can be used as an external background knowledge source for the task of ontology matching with reasonable matching and runtime performance. Keywords: Ontology Matching · Ontology Alignment · External Resources · Background Knowledge · Wiktionary. 1 Presentation of the System. 1.1 State, Purpose, General Statement: The Wiktionary Matcher is an element-level, label-based matcher which uses an online lexical resource, namely Wiktionary. The latter is "[a] collaborative project run by the Wikimedia Foundation to produce a free and complete dictionary in every language". The dictionary is organized similarly to Wikipedia: everybody can contribute to the project and the content is reviewed in a community process. Compared to WordNet [4], Wiktionary is significantly larger and also available in languages other than English. This matcher uses DBnary [15], an RDF version of Wiktionary that is publicly available. The DBnary data set makes use of an extended LEMON model [11] to describe the data. -
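To make the idea of an element-level, label-based matcher with synonym lookup more concrete, here is a minimal sketch. It is not the Wiktionary Matcher implementation: the small in-memory `SYNONYMS` table is a hypothetical stand-in for the synonymy relations the paper obtains from Wiktionary via DBnary, and the function names, entity identifiers, and labels are invented for illustration.

```python
def normalize(label):
    """Lower-case a label and collapse underscores and extra whitespace."""
    return " ".join(label.replace("_", " ").lower().split())


def match_labels(labels_a, labels_b, synonyms):
    """Match entities of ontology A to entities of ontology B by label.

    A pair matches if the normalized labels are equal, or if a listed
    synonym of the A label equals the B label.
    """
    # Index ontology B by normalized label (first entity wins on clashes).
    index_b = {}
    for entity, label in labels_b.items():
        index_b.setdefault(normalize(label), entity)

    matches = {}
    for entity, label in labels_a.items():
        key = normalize(label)
        # Try the label itself first, then its synonyms in sorted order.
        candidates = [key] + sorted(synonyms.get(key, ()))
        for candidate in candidates:
            if candidate in index_b:
                matches[entity] = index_b[candidate]
                break
    return matches


# Hypothetical synonym table standing in for a lexical resource.
SYNONYMS = {"car": {"automobile"}, "automobile": {"car"}}

labels_a = {"a:Car": "Car", "a:Person": "Person"}
labels_b = {"b:Automobile": "automobile", "b:Person": "person", "b:Tree": "tree"}

matches = match_labels(labels_a, labels_b, SYNONYMS)
print(matches)
```

Exact label matches are tried before synonyms so that the background knowledge only fills in where the labels themselves disagree.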
The Social Semantic Web John G
The Social Semantic Web. John G. Breslin · Alexandre Passant · Stefan Decker. John G. Breslin, Electrical and Electronic Engineering, School of Engineering and Informatics, National University of Ireland, Galway, Nuns Island, Galway, Ireland, [email protected]. Alexandre Passant, Digital Enterprise Research Institute (DERI), National University of Ireland, Galway, IDA Business Park, Lower Dangan, Galway, Ireland, [email protected]. Stefan Decker, Digital Enterprise Research Institute (DERI), National University of Ireland, Galway, IDA Business Park, Lower Dangan, Galway, Ireland, [email protected]. ISBN 978-3-642-01171-9, e-ISBN 978-3-642-01172-6, DOI 10.1007/978-3-642-01172-6. Springer Heidelberg Dordrecht London New York. Library of Congress Control Number: 2009936149. ACM Computing Classification (1998): H.3.5, H.4.3, I.2, K.4. © Springer-Verlag Berlin Heidelberg 2009. This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. -
Designing an Ontology for Managing the Diets of Hypertensive Individuals
Designing an Ontology for Managing the Diets of Hypertensive Individuals. A thesis submitted to the College of Communication and Information of Kent State University in partial fulfillment of the requirements for the Master of Library and Information Science and Master of Science dual degree program. By Julaine Clunis, January 2016. Thesis written by Julaine Clunis: BSc., Northern Caribbean University, 2005; M.L.I.S., Kent State University, 2016; M.S., Kent State University, 2016. Approved by: Marcia Lei Zeng, Advisor; Jeffrey W. Fruit, Director, School of Library and Information Science; Amy Reynolds, Dean, College of Communication and Information. Table of Contents: List of Figures; List of Tables; List of Acronyms and Abbreviations; Glossary of Terms; Terms Relating to Health and Medicine; Acknowledgements -
Term-Based Ontology Alignment
Term-Based Ontology Alignment. Virach Sornlertlamvanich, Canasai Kruengkrai, Shisanu Tongchim, Prapass Srichaivattana, and Hitoshi Isahara. Thai Computational Linguistics Laboratory, National Institute of Information and Communications Technology, 112 Paholyothin Road, Klong 1, Klong Luang, Pathumthani 12120, Thailand {virach,canasai,shisanu,prapass}@tcllab.org, [email protected] Abstract. This paper presents an efficient approach to automatically align concepts between two ontologies. We propose an iterative algorithm that finds the most appropriate target concept for a given source concept based on the similarity of shared terms. Experimental results on two lexical ontologies, the MMT semantic hierarchy and the EDR concept dictionary, are given to show the feasibility of the proposed algorithm. 1 Introduction: In this paper, we propose an efficient approach for finding alignments between two different ontologies. Specifically, we derive the source and the target ontologies from available language resources, i.e. machine-readable dictionaries (MRDs). In our context, we consider the ontological concepts as groups of lexical entries having similar or related meanings organized on a semantic hierarchy. The resulting ontology alignment can be used as semantic knowledge for constructing multilingual dictionaries. Typically, bilingual dictionaries provide the relationship between their native language and English. One can extend these bilingual dictionaries to multilingual dictionaries by exploiting English as an intermediate source and associations between two concepts as semantic constraints. Aligning concepts between two ontologies is often done by humans, which is an expensive and time-consuming process. This motivates us to find an automatic method to perform this task. However, the hierarchical structures of two ontologies are quite different. The structural inconsistency is a common problem [1]. -
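The idea of choosing a target concept by the similarity of shared terms can be sketched as follows. This is only an illustration under stated assumptions: it uses Jaccard overlap as the similarity measure and a single greedy pass, whereas the excerpt does not specify the paper's actual measure or its iterative, hierarchy-aware procedure. The concept identifiers, term sets, and the `threshold` parameter are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap of two term sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def align(source, target, threshold=0.2):
    """Map each source concept to the target concept sharing the most terms.

    source and target map concept ids to sets of terms. Concepts whose
    best score falls below the threshold are left unaligned.
    """
    alignment = {}
    for s_id, s_terms in source.items():
        best_id, best_score = None, 0.0
        for t_id, t_terms in target.items():
            score = jaccard(s_terms, t_terms)
            if score > best_score:
                best_id, best_score = t_id, score
        if best_id is not None and best_score >= threshold:
            alignment[s_id] = (best_id, best_score)
    return alignment


# Hypothetical toy ontologies: concept id -> set of lexical terms.
source = {"S1": {"dog", "hound", "puppy"}, "S2": {"car", "automobile"}}
target = {
    "T1": {"dog", "puppy", "canine"},
    "T2": {"automobile", "car", "vehicle"},
    "T3": {"tree"},
}

result = align(source, target)
print(result)
```

A fuller version would presumably also use the position of each concept in its hierarchy, since the excerpt notes that structural inconsistency between the two ontologies is a common problem.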
Wiktionary Matcher Results for OAEI 2020
Wiktionary Matcher Results for OAEI 2020. Jan Portisch 1,2 [0000-0001-5420-0663] and Heiko Paulheim 1 [0000-0003-4386-8195]. 1 Data and Web Science Group, University of Mannheim, Germany fjan, [email protected] 2 SAP SE Product Engineering Financial Services, Walldorf, Germany [email protected] Abstract. This paper presents the results of the Wiktionary Matcher in the Ontology Alignment Evaluation Initiative (OAEI) 2020. Wiktionary Matcher is an ontology matching tool that exploits Wiktionary as an external background knowledge source. Wiktionary is a large lexical knowledge resource that is collaboratively built online. Multiple current language versions of Wiktionary are merged and used for monolingual ontology matching by exploiting synonymy relations and for multilingual matching by exploiting the translations given in the resource. This is the second OAEI participation of the matching system. Wiktionary Matcher has been improved and is the best performing system on the knowledge graph track this year. Keywords: Ontology Matching · Ontology Alignment · External Resources · Background Knowledge · Wiktionary. 1 Presentation of the System. 1.1 State, Purpose, General Statement: The Wiktionary Matcher is an element-level, label-based matcher which uses an online lexical resource, namely Wiktionary. The latter is "[a] collaborative project run by the Wikimedia Foundation to produce a free and complete dictionary in every language". The dictionary is organized similarly to Wikipedia: everybody can contribute to the project and the content is reviewed in a community process. Compared to WordNet [2], Wiktionary is significantly larger and also available in languages other than English. This matcher uses DBnary [13], an RDF version of Wiktionary that is publicly available. -
Data Models for Home Services
Proceeding of the 13th Conference of FRUCT Association. Data Models for Home Services. Vadym Kramar, Markku Korhonen, Yury Sergeev, Oulu University of Applied Sciences, School of Engineering, Raahe, Finland {vadym.kramar, markku.korhonen, yury.sergeev}@oamk.fi Abstract: The widespread penetration of communication technologies that allow web access has enriched the concept of smart homes with new paradigms of home services. Modern home services range far beyond such notions as Home Automation or Use of Internet. The services show their ubiquitous nature by being integrated into smart environments and provisioned through a variety of end-user devices. Computational intelligence requires the use of knowledge technologies, and within a given domain, compliance with modern web architecture is essential. This is where Semantic Web technologies excel. This work presents an overview of important terms, vocabularies, and data models that may be used in data and knowledge engineering with respect to home services. Index Terms: Context, Data engineering, Data models, Knowledge engineering, Semantic Web, Smart homes, Ubiquitous computing. I. INTRODUCTION: In recent years, the use of Semantic Web technologies to build a giant information space has shown certain benefits. The rapid development of Web 3.0 and the use of its principles in web applications is the best evidence of such benefits. Traditional database design is still widely used in web applications and will continue to be. One of the most important reasons for that is the vast number of databases developed over the years and used in applications ranging from simple web services to enterprise portals. According to Forrester Research, however, the growing number of document or knowledge bases, such as NoSQL, is no longer just hype [1]. -
LNCS 3532, Pp
Activity Based Metadata for Semantic Desktop Search. Paul Alexandru Chirita, Rita Gavriloaie, Stefania Ghita, Wolfgang Nejdl, and Raluca Paiu. L3S Research Center / University of Hanover, Deutscher Pavillon, Expo Plaza 1, 30539 Hanover, Germany {chirita, gavriloaie, ghita, nejdl, paiu}@l3s.de Abstract. With increasing storage capacities on current PCs, searching the World Wide Web has ironically become more efficient than searching one's own personal computer. The recently introduced desktop search engines are a first step towards coping with this problem, but not yet a satisfying solution. The reason is that desktop search is actually quite different from its web counterpart. Documents on the desktop are not linked to each other in a way comparable to the web, which means that result ranking is poor or even nonexistent, because algorithms like PageRank cannot be used for desktop search. On the other hand, desktop search could potentially profit from a lot of implicit and explicit semantic information available in emails, folder hierarchies, browser cache contexts and others. This paper investigates how to extract and store this activity-based context information explicitly as RDF metadata and how to use it, as well as additional background information and ontologies, to enhance desktop search. 1 Introduction: The capacity of our hard-disk drives has increased tremendously over the past decade, and so has the number of files we usually store on our computer. It is no wonder that sometimes we cannot find a document any more, even when we know we saved it somewhere. Ironically, in quite a few of these cases nowadays, the document we are looking for can be found faster on the World Wide Web than on our personal computer. -
Master Thesis Innovation Dynamics in Open Source Software
Master thesis: Innovation dynamics in open source software. Author: Remco Bloemen. Student number: 0109150. Email: [email protected] Telephone: +316 11 88 66 71. Supervisors and advisors: prof. dr. Stefan Kuhlmann, Email: [email protected], Telephone: +31 53 489 3353, Office: Ravelijn RA 4410 (STEPS); dr. Chintan Amrit, Email: [email protected], Telephone: +31 53 489 4064, Office: Ravelijn RA 3410 (IEBIS); dr. Gonzalo Ordóñez-Matamoros, Email: [email protected], Telephone: +31 53 489 3348, Office: Ravelijn RA 4333 (STEPS). Abstract: Open source software development is a major driver of software innovation, yet it has thus far received little attention from innovation research. One of the reasons is that conventional methods such as survey-based studies or patent co-citation analysis do not work in open source communities. This thesis shows that open source development is very accessible to study, due to its open nature, but that it requires special tools. In particular, this thesis introduces the method of dependency graph analysis to study open source software development on the grandest scale. A proof-of-concept application of this method has delivered many significant and interesting results. Contents: 1 Open source software; 1.1 The open source licenses; 1.2 Commercial involvement in open source; 1.3 Open source development; 1.4 The intellectual property debates; 1.4.1 The software patent debate; 1.4.2 The open source blind spot; 1.5 Literature search on network analysis in software development