Building Information Science Ontology (OIS) with Methontology and Protégé

Journal of Internet Technology and Secured Transactions (JITST), Volume 1, Issue 4, December 2012
Copyright © 2012, Infonomics Society

Ahlam F. Sawsaa, Joan Lu
School of Computing & Engineering, University of Huddersfield, Huddersfield, UK

Abstract

Ontology is the backbone of the semantic web and can overcome semantic barriers. A domain ontology provides a common understanding of the knowledge of a particular domain. Information Science, meanwhile, is an interdisciplinary science that is yet to be defined. It is therefore necessary to develop an Ontology of Information Science (OIS) to represent the unified domain knowledge.

This paper presents a representation of specific domain knowledge by providing a definition, scope, and boundaries of Information Science (IS). The methodology followed is Methontology, which is based on the IEEE standard for the development of a software life-cycle process, using the ontology editor Protégé. The OIS ontology has fourteen facets: actors, method, practice, studies, mediator, kinds, domains, resources, legislation, philosophy and theories, societal, tools, time and space. It provides a broad base of classes that offers the opportunity to enrich the OIS ontology, and it could serve as a basis for building multiple ontologies. OIS is structured at the class and subclass levels but does not provide the individual level. The model was evaluated by domain experts against specific criteria, and with the FaCT++ reasoner, to check the usefulness of the ontology and how it could be transferred into an application ontology for Information Science education. The paper then discusses the OIS ontology, particularly its structure and development.

Keywords: ontology, ontology engineering, knowledge visualization, knowledge representation, semantic web, information science

1. Introduction

In recent years, ontology has received attention from both academic and industrial fields. The word ontology has been defined from different perspectives, having originated in the field of philosophy, where it is used to mean the basic characteristics of existence in the world. Information science is a multidisciplinary field consisting of a number of different branches, including library science, computer science, and archival science. Thus, it lacks a unified model of domain knowledge.

Inconsistencies in the structure of the domain have led to difficulties in using and sharing data at the syntactic and semantic levels. Many technologies offer good data-sharing solutions for the syntactic level, for example XML, but do not work effectively at the semantic level. Ontology offers a good solution for using and sharing data at the semantic level. Ontology is a modeling tool that provides a formal description of concepts and their relations, as a foundation for semantic integration and interoperability [1; 2].

In this paper, we focus on the ontology used in the semantic web as a compatible independent model. The aims of this study are as follows:

• to provide a visualization of the IS area
• to share a common understanding of IS theory
• to describe the terminology and a conceptual model of IS, including concepts, examples of them, and the relationships between them [17], in other words a logical model of OIS.

The Ontology of Information Science (OIS) is a new research direction within the field of Information Science (IS). It provides a formal semantic explanation for IS data. This paper is organized as follows: In section 2, we discuss the theoretical foundations of ontology. In section 3, we describe the method used to build the IS ontology. In section 4, we present its development and implementation. Section 5 contains our evaluation and discussion of the results. Section 6 concludes and presents suggestions for future work.

2. Background

The World Wide Web (WWW) is enormous and the semantic web is still in an early phase. The semantic web needs semantic interoperability between the metadata connected with web information. The WWW needs smart tools to improve information retrieval and to integrate all the information that users need. Metadata makes it easier for search engines to find web pages.

Nowadays, many sources of data and information are available on the Internet, and therefore it has become imperative for computer scientists and the IA community to facilitate information access and improve information retrieval on the Internet by using techniques, applications and languages such as Extensible Markup Language (XML), Resource Description Framework (RDF, RDFS) and Web Ontology Language (OWL). The semantic web provides a good opportunity for researchers to access the information they require. Particularly with the increase in information on the Internet, it can be said that the semantic web has revolutionized searching and browsing on the Internet.

The semantic web is defined as "an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation" [3, p. 580]. Its aim is to convert the large amount of data and information resources available on the Internet from mere units of bits into data that computer programs can understand, using the languages mentioned above.

Tim Berners-Lee, in his article describing the semantic web, said it is an attempt to develop languages that express information in a form accessible to human understanding. This raises the question: what is the importance of the semantic web? Berners-Lee is correct that Google is a wonderful tool for humans, but it does not serve machines: its results are understandable only to humans, and we also need these pages and their contents to be understandable to machines [4].

The semantic web offers semantic annotations that describe web resources explicitly. These annotations are based on ontologies that represent domain knowledge by defining concepts and the semantic relations between those concepts. They also provide a machine-processable representation of the ontology. The standards for this purpose, such as RDF and OWL, have been defined by the W3C. Ontology is a foundation of the semantic web and is central to its growth, providing common knowledge for correspondence and communication among heterogeneous systems. Furthermore, it allows different applications to share information among heterogeneous data resources [5].

Therefore, an ontological infrastructure to support the semantic web has been developed. Particularly with the rise of ontology in the artificial intelligence domain, it can be seen as an almost inevitable development in computer science and AI in general. Ontology plays an important role as a source of shared, defined terms; such metadata can be used in a specific domain.

A. Ontology of Information Science (OIS)

An ontology of IS facilitates data exchange, information integration, and search of IS data. Nowadays, ontologies have become mainstream in several domains. Ontology has been defined from different perspectives, those of computer science and philosophy, and has various definitions in the literature [6; 7; 8; 9]. The philosophical perspective defines ontology as the science or study of being, of what exists, as introduced by Aristotle [10; 11]. The term has been borrowed by computer science and is used to represent knowledge or understanding of the world. The Artificial Intelligence (AI) community defines the term as "a formal explicit specification of shared conceptualization" [12]; that is, an ontology is a specification of a conceptualization. Ontology represents the knowledge of a specific domain in such a way that concepts are defined in a unique manner and connected by relationships.

According to Gruber's definition, OIS is the formal explanation of a shared conceptualization of the domain of IS. That is, the concepts of IS are represented by the ontology model. More interestingly, IS knowledge is conceptualized into defined classes and relationships to make it machine readable.

The OIS ontology has been developed to address the problematic overlaps of the field. Information science has interdisciplinary relationships with different sciences, and it needs to be determined how it will be defined.

Furthermore, IS is still seeking to establish its identity and its boundaries against other fields, because a lack of scientific methodology and philosophy has led it into major problems, particularly when information scientists attempted to establish the basic area of the science [13]. IS is concerned with collecting and organizing information resources to be retrieved by users in information centers and libraries [1].

B. Theoretical foundation of OIS ontology

Theories can help to define formal ontological properties that contribute to characterizing the concepts. Ontologists nowadays have a choice of formal frameworks, which derive from formal logic as well as algebra, category theory, mereology, set theory and topology. The methodology here is based on category theory. Ontology adopts a categorical framework, meaning that it searches for what is universal in both specific and general domains [14]. The ontology's meaning emerges essentially from its reliance on category theory as a grounding of mathematics [15].