The Knowledge Organization of Dbpedia: a Case Study M. Cristina
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Rdfa in XHTML: Syntax and Processing Rdfa in XHTML: Syntax and Processing
RDFa in XHTML: Syntax and Processing RDFa in XHTML: Syntax and Processing RDFa in XHTML: Syntax and Processing A collection of attributes and processing rules for extending XHTML to support RDF W3C Recommendation 14 October 2008 This version: http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014 Latest version: http://www.w3.org/TR/rdfa-syntax Previous version: http://www.w3.org/TR/2008/PR-rdfa-syntax-20080904 Diff from previous version: rdfa-syntax-diff.html Editors: Ben Adida, Creative Commons [email protected] Mark Birbeck, webBackplane [email protected] Shane McCarron, Applied Testing and Technology, Inc. [email protected] Steven Pemberton, CWI Please refer to the errata for this document, which may include some normative corrections. This document is also available in these non-normative formats: PostScript version, PDF version, ZIP archive, and Gzip’d TAR archive. The English version of this specification is the only normative version. Non-normative translations may also be available. Copyright © 2007-2008 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply. Abstract The current Web is primarily made up of an enormous number of documents that have been created using HTML. These documents contain significant amounts of structured data, which is largely unavailable to tools and applications. When publishers can express this data more completely, and when tools can read it, a new world of user functionality becomes available, letting users transfer structured data between applications and web sites, and allowing browsing applications to improve the user experience: an event on a web page can be directly imported - 1 - How to Read this Document RDFa in XHTML: Syntax and Processing into a user’s desktop calendar; a license on a document can be detected so that users can be informed of their rights automatically; a photo’s creator, camera setting information, resolution, location and topic can be published as easily as the original photo itself, enabling structured search and sharing. -
Linked Data Overview and Usage in Social Networks
Linked Data Overview and Usage in Social Networks Gustavo G. Valdez Technische Universitat Berlin Email: project [email protected] Abstract—This paper intends to introduce the principles of Webpages, also possibly reference any Object or Concept (or Linked Data, show some possible uses in Social Networks, even relationships of concepts/objects). advantages and disadvantages of using it, aswell as presenting The second one combines the unique identification of the an example application of Linked Data to the field of social networks. first with a well known retrieval mechanism (HTTP), enabling this URIs to be looked up via the HTTP protocol. I. INTRODUCTION The third principle tackles the problem of using the same The web is a very good source of human-readable informa- document standards, in order to have scalability and interop- tion that has improved our daily life in several areas. By using erability. This is done by having a standard, the Resource De- rich text and hyperlinks, it made our access to information scription Framewok (RDF), which will be explained in section something more active rather than passive. But the web was III-A. This standards are very useful to provide resources that made for humans and not for machines, even though today are machine-readable and are identified by a URI. a lot of the acesses to the web are made by machines. This The fourth principle basically advocates that hyperlinks has lead to the search for adding structure, so that data is should be used not only to link webpages, but also to link more machine-readable. -
Towards a Korean Dbpedia and an Approach for Complementing the Korean Wikipedia Based on Dbpedia
Towards a Korean DBpedia and an Approach for Complementing the Korean Wikipedia based on DBpedia Eun-kyung Kim1, Matthias Weidl2, Key-Sun Choi1, S¨orenAuer2 1 Semantic Web Research Center, CS Department, KAIST, Korea, 305-701 2 Universit¨at Leipzig, Department of Computer Science, Johannisgasse 26, D-04103 Leipzig, Germany [email protected], [email protected] [email protected], [email protected] Abstract. In the first part of this paper we report about experiences when applying the DBpedia extraction framework to the Korean Wikipedia. We improved the extraction of non-Latin characters and extended the framework with pluggable internationalization components in order to fa- cilitate the extraction of localized information. With these improvements we almost doubled the amount of extracted triples. We also will present the results of the extraction for Korean. In the second part, we present a conceptual study aimed at understanding the impact of international resource synchronization in DBpedia. In the absence of any informa- tion synchronization, each country would construct its own datasets and manage it from its users. Moreover the cooperation across the various countries is adversely affected. Keywords: Synchronization, Wikipedia, DBpedia, Multi-lingual 1 Introduction Wikipedia is the largest encyclopedia of mankind and is written collaboratively by people all around the world. Everybody can access this knowledge as well as add and edit articles. Right now Wikipedia is available in 260 languages and the quality of the articles reached a high level [1]. However, Wikipedia only offers full-text search for this textual information. For that reason, different projects have been started to convert this information into structured knowledge, which can be used by Semantic Web technologies to ask sophisticated queries against Wikipedia. -
A Visual Notation for the Integrated Representation of OWL Ontologies
A Visual Notation for the Integrated Representation of OWL Ontologies Stefan Negru1 and Steffen Lohmann2 1Faculty of Computer Science, Alexandru Ioan Cuza University, General Berthelot 16, 700483 Iasi, Romania 2Institute for Visualization and Interactive Systems (VIS), Universit¨atsstraße 38, 70569 Stuttgart, Germany [email protected], [email protected] Keywords: Ontology Visualization, Visual Notation, OWL, Semantic Web, Knowledge Visualization, Ontologies. Abstract: The paper presents a visual notation for the Web Ontology Language (OWL) providing an integrated view on the classes and individuals of ontologies. The classes are displayed as circles, with the size of each circle representing its connectivity in the ontology. The individuals are represented as sections in the circles so that it is immediately clear from the visualization how many individuals the classes contain. The notation can be used to visualize the property relations of either the classes (conceptual layer) or of selected individuals (integrated layer), while certain elements are always shown for a better understanding. It requires only a small number of graphical elements and the resulting visualizations are comparatively compact. Yet, the notation is comprehensive, as it defines graphical representations for all OWL elements that can be reasonably visualized. The applicability of the notation is illustrated by the example of the Friend of a Friend (FOAF) ontology. 1 INTRODUCTION reasoning. Even though it provides great means to de- scribe and integrate information on the Web, its text- The Web of Data has attracted large interest for both based representation is difficult to understand for av- data publishers and consumers in the last years. -
Mapping Between Digital Identity Ontologies Through SISM
Mapping between Digital Identity Ontologies through SISM Matthew Rowe The OAK Group, Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, Sheffield S1 4DP, UK [email protected] Abstract. Various ontologies are available defining the semantics of dig- ital identity information. Due to the rise in use of lowercase semantics, such ontologies are now used to add metadata to digital identity informa- tion within web pages. However concepts exist in these ontologies which are related and must be mapped together in order to enhance machine- readability of identity information on the web. This paper presents the Social identity Schema Mapping (SISM) vocabulary which contains a set of mappings between related concepts in distinct digital identity ontolo- gies using OWL and SKOS mapping constructs. Key words: Semantic Web, Social Web, SKOS, OWL, FOAF, SIOC, PIMO, NCO, Microformats 1 Introduction The semantic web provides a web of machine-readable data. Ontologies form a vital component of the semantic web by providing conceptualisations of domains of knowledge which can then be used to provide a common understanding of some domain. A basic ontology contains a vocabulary of concepts and definitions of the relationships between those concepts. An agent reading a concept from an ontology can look up the concept and discover its properties and characteristics, therefore interpreting how it fits into that particular domain. Due to the great number of ontologies it is common for related concepts to be defined in separate ontologies, these concepts must be identified and mapped together. Web technologies such as Microformats, eRDF and RDFa have allowed web developers to encode lowercase semantics within XHTML pages. -
Improving Knowledge Base Construction from Robust Infobox Extraction
Improving Knowledge Base Construction from Robust Infobox Extraction Boya Peng∗ Yejin Huh Xiao Ling Michele Banko∗ Apple Inc. 1 Apple Park Way Sentropy Technologies Cupertino, CA, USA {yejin.huh,xiaoling}@apple.com {emma,mbanko}@sentropy.io∗ Abstract knowledge graph created largely by human edi- tors. Only 46% of person entities in Wikidata A capable, automatic Question Answering have birth places available 1. An estimate of 584 (QA) system can provide more complete million facts are maintained in Wikipedia, not in and accurate answers using a comprehen- Wikidata (Hellmann, 2018). A downstream appli- sive knowledge base (KB). One important cation such as Question Answering (QA) will suf- approach to constructing a comprehensive knowledge base is to extract information from fer from this incompleteness, and fail to answer Wikipedia infobox tables to populate an ex- certain questions or even provide an incorrect an- isting KB. Despite previous successes in the swer especially for a question about a list of enti- Infobox Extraction (IBE) problem (e.g., DB- ties due to a closed-world assumption. Previous pedia), three major challenges remain: 1) De- work on enriching and growing existing knowl- terministic extraction patterns used in DBpe- edge bases includes relation extraction on nat- dia are vulnerable to template changes; 2) ural language text (Wu and Weld, 2007; Mintz Over-trusting Wikipedia anchor links can lead to entity disambiguation errors; 3) Heuristic- et al., 2009; Hoffmann et al., 2011; Surdeanu et al., based extraction of unlinkable entities yields 2012; Koch et al., 2014), knowledge base reason- low precision, hurting both accuracy and com- ing from existing facts (Lao et al., 2011; Guu et al., pleteness of the final KB. -
Microformats the Next (Small) Thing on the Semantic Web?
Standards Editor: Jim Whitehead • [email protected] Microformats The Next (Small) Thing on the Semantic Web? Rohit Khare • CommerceNet “Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards.” — Microformats.org hen we speak of the “evolution of the is precisely encoding the great variety of person- Web,” it might actually be more appropri- al, professional, and genealogical relationships W ate to speak of “intelligent design” — we between people and organizations. By contrast, can actually point to a living, breathing, and an accidental challenge is that any blogger with actively involved Creator of the Web. We can even some knowledge of HTML can add microformat consult Tim Berners-Lee’s stated goals for the markup to a text-input form, but uploading an “promised land,” dubbed the Semantic Web. Few external file dedicated to machine-readable use presume we could reach those objectives by ran- remains forbiddingly complex with most blog- domly hacking existing Web standards and hop- ging tools. ing that “natural selection” by authors, software So, although any intelligent designer ought to developers, and readers would ensure powerful be able to rely on the long-established facility of enough abstractions for it. file transfer to publish the “right” model of a social Indeed, the elegant and painstakingly inter- network, the path of least resistance might favor locked edifice of technologies, including RDF, adding one of a handful of fixed tags to an exist- XML, and query languages is now growing pow- ing indirect form — the “blogroll” of hyperlinks to erful enough to attack massive information chal- other people’s sites. -
From FOAF to English: Linguistic Contribution to Web Semantics
From FOAF to English: Linguistic Contribution to Web Semantics Max Silberztein Université de Franche-Comté [email protected] Abstract which sentence to produce. This article presents the linguistic component of such a system. This paper presents a linguistic module capable of generating a set of English sentences that 2 The linguistic framework correspond to a Resource Description Framework (RDF) statement; I discuss how a Based on the principles of linguistic approaches to generator can control the linguistic module, as generation laid out by Danlos (1987), I have used well as the various limitations of a pure the NooJ1 platform to construct a set of linguistic linguistic framework. resources that parses a sequence of RDF statements and produces a corresponding set of English 1 Introduction sentences. NooJ allows linguists to construct structured sets of linguistic resources (“modules”) Automatic Natural Language Generation Software in the form of dictionaries and grammars2 to aims to express information stored in a knowledge formalize a large gamut of linguistic phenomena: database as a Natural Language text. The orthography and spelling, inflectional, derivational information at the source typically consists of a and agglutinative morphology, local and structural series of simple atomic statements, such as “John syntax, transformational syntax, lexical and owns a house”, “Mary lives in Manchester”, “Peter predicative semantics. All linguistic analyses are is Ann’s cousin”. These statements are usually performed sequentially by adding and/or removing stored in a knowledge database or ontology in a linguistic annotations to/from a Text Annotation formal notation such as Prolog (e.g. Structure (TAS); at each level of the analysis, each “Own(John,House)”) or XML. -
Semantic Interaction with Music Content Using FOAF
Semantic Interaction with Music Content using FOAF Oscar` Celma1, Miquel Ram´ırez1, and Perfecto Herrera1 Music Technology Group, Institut Universitari de l’Audiovisual Universitat Pompeu Fabra Ocata 1, 08003 Barcelona, Spain. {ocelma, mramirez, pherrera}@iua.upf.es http://www.iua.upf.es/mtg Abstract. SIMAC (Semantic Interaction with Music Audio Contents) is an European Commission project (IST-2.3.1.7, Semantic-based knowl- edge systems) which aims to develop a set of prototypes to describe, semi-automatically, music audio content. SIMAC is about music meta- data, about what you can say of a piece of music, on what is hidden in a music file, in a collection of music files, and in the collective knowledge of a community of music lovers. This document proposes to enhance/extend the FOAF definition to model user musical tastes. One of the goals of the project is to explore music content discovery, based on both user profiling and content-based descriptions. 1 Introduction The main goal of SIMAC1 is doing research on semantic descriptors of music contents, in order to use them, by means of a set of prototypes, for providing song collection exploration, retrieval and recommendation services. These services are meant for “home” users, music content producers and distributors and academic users. One special feature is that these descriptions are composed by semantic descriptors. Music will be tagged using a language close to the user’s own way of describing its contents —moving the focus from low-level to higher-level (i.e. semantic) descriptions. SIMAC is about music metadata. We assume that metadata is all what one can say about a single music piece or a collection of music pieces. -
Viewing FOAF – Development of a Metadata Explorer 75
ViewingViewing FOAF FOAF – Development – Development of a Metadata of a Metadata Explorer Explorer Josef Petrák Josef Petr´ak Faculty of Informatics and Statistics, University of Economics, Prague Winston Churchill Sq. 4, 130 67, Praha 3, Czech Republic Faculty of Informatics and Statistics, University of Economics, Prague [email protected] Winston Churchill Sq. 4, 130 67, Praha 3, Czech Republic [email protected] Abstract. Social networks are widely accepted application of Semantic Web technologies and are also interesting for general public. Nowadays there is a lack of quality user-friendly browser which could express meaning of the metadata stored in the most of FOAF profiles and present this knowledge in human-understandable form. The aim of the article is to inform about development of the AFE (Advanced FOAF Explorer) which is intended to perform these services and supersede one similar project. Keywords: social networks, Semantic Web, RDF, FOAF, PHP5 1 Introduction Friend Of A Friend vocabulary (FOAF) [1], an ontology used for description of personal profiles and relations among people, is well-known and popular in the Semantic Web [2] community. Thank to the user-friendly tools such as FOAF-a- Matic [5] people who are not familiar with RDF [14] can create their own profiles and publish them on the Internet. This is a good because may start spreading of the Semantic Web technologies to the public. But if these users create such profiles, they also require having a chance to view their profiles or to browse the profiles of other people. Nowadays there are several “viewers” but their features are limited. -
Learning a Large Scale of Ontology from Japanese Wikipedia
2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology Learning a Large Scale of Ontology from Japanese Wikipedia Susumu Tamagawa∗, Shinya Sakurai∗, Takuya Tejima∗, Takeshi Morita∗, Noriaki Izumiy and Takahira Yamaguchi∗ ∗Keio University 3-14-1 Hiyoshi, Kohoku-ku, Yokohama-shi, Kanagawa 223-8522, Japan Email: fs tamagawa, [email protected] yNational Institute of Advanced Industrial Science and Technology 2-3-26 Aomi, Koto-ku, Tokyo 135-0064, Japan Abstract— Here is discussed how to learn a large scale II. RELATED WORK of ontology from Japanese Wikipedia. The learned on- tology includes the following properties: rdfs:subClassOf Auer et al.’s DBpedia [4] constructed a large information (IS-A relationships), rdf:type (class-instance relationships), database to extract RDF from semi-structured information owl:Object/DatatypeProperty (Infobox triples), rdfs:domain resources of Wikipedia. They used information resource (property domains), and skos:altLabel (synonyms). Experimen- tal case studies show us that the learned Japanese Wikipedia such as infobox, external link, categories the article belongs Ontology goes better than already existing general linguistic to. However, properties and classes are built manually and ontologies, such as EDR and Japanese WordNet, from the there are only 170 classes in the DBpedia. points of building costs and structure information richness. Fabian et al. [5] proposed YAGO which enhanced Word- Net by using the Conceptual Category. Conceptual Category is a category of Wikipedia English whose head part has the form of plural noun like “ American singers of German I. INTRODUCTION origin ”. They defined a Conceptual Category as a class, and defined the articles which belong to the Conceptual It is useful to build up large-scale ontologies for informa- Category as instances of the class. -
Mapping RDF Graphs to Property Graphs
Mapping RDF Graphs to Property Graphs Shota Matsumoto1, Ryota Yamanaka2, and Hirokazu Chiba3 1 Lifematics Inc., Tokyo 101-0041, Japan [email protected] 2 Oracle Corporation, Bangkok 10500, Thailand [email protected] 3 Database Center for Life Science, Chiba 277-0871, Japan [email protected] Abstract. Increasing amounts of scientific and social data are published in the Resource Description Framework (RDF). Although the RDF data can be queried using the SPARQL language, even the SPARQL-based operation has a limitation in implementing traversal or analytical algo- rithms. Recently, a variety of graph database implementations dedicated to analyses on the property graph model have emerged. However, the RDF model and the property graph model are not interoperable. Here, we developed a framework based on the Graph to Graph Mapping Lan- guage (G2GML) for mapping RDF graphs to property graphs to make the most of accumulated RDF data. Using this framework, graph data described in the RDF model can be converted to the property graph model and can be loaded to several graph database engines for further analysis. Future works include implementing and utilizing graph algo- rithms to make the most of the accumulated data in various analytical engines. Keywords: RDF · Property Graph · Graph Database 1 Introduction Increasing amounts of scientific and social data are described as graphs. As a format of graph data, the Resource Description Framework (RDF) is widely used. Although RDF data can be queried using the SPARQL language in a flexible way, SPARQL is not dedicated to traversal of graphs and has a limitation in arXiv:1812.01801v1 [cs.DB] 5 Dec 2018 implementing graph analysis algorithms.