A Semantically Enabled Architecture for Crowdsourced Linked Data Management

Total Page:16

File Type:pdf, Size:1020Kb

A Semantically Enabled Architecture for Crowdsourced Linked Data Management A semantically enabled architecture for crowdsourced Linked Data management Elena Simperl Maribel Acosta Barry Norton Institute AIFB Institute AIFB Ontotext AD Karlsruhe Institute of Karlsruhe Institute of Bulgaria Technology Technology [email protected] Germany Germany [email protected] [email protected] ABSTRACT a central building block of the Linked Data technology stack. It is Increasing amounts of structured data are exposed on the Web us- a graph-based data model based on the idea of making statements ing graph-based representation models and protocols such as RDF about (information and non-information) resources on the Web in subject predicate object 2 and SPARQL. Nevertheless, while the overall volume of such open, terms of triples of the form . or easily accessible, data sources reaches critical mass, the abil- The object of any RDF triple may be used in the subject posi- ity of potential consumers to use them in novel applications and tion in other triples, leading to a directed, labeled graph typically services is predicated on the availability of purposeful means to referred to as an ’RDF graph’. Both nodes and edges in such query and manage the data, while taking into account and master- graphs are identified via URIs; nodes represent Web resources, ing its essential features in terms of decentralization, heterogene- while edges stand for attributes of such resources or properties ity of schema, varying quality, and scale. Many aspects of these connecting them. Schema information can be expressed using lan- challenges are necessarily tackled through a combination of algo- guages such as RDFS and OWL, by which resources can be typed rithmic techniques and manual effort. In the literature on tradi- as classes described in terms of domain-specific attributes, proper- ties and constraints.3 RDF graphs can be natively queried using the tional data management the theoretical and technical groundwork 4 to realize and manage such combinations is being established. In query language SPARQL. A SPARQL query is composed of graph this paper we build upon these ideas and propose a semantically patterns and can be stored as RDF triples together with any RDF domain model using SPIN to facilitate the definition of constraints enabled architecture for crowdsourced data management systems 5 which uses formal representations of tasks and data to automati- and inference rules in ontologies. cally design and optimize the operation and outcomes of human Over the past years Linked Data has established itself as the de computation projects. The architecture is applied to the context of facto means for the publication of structured data over the Web, Linked Data management to address specific challenges of Linked enjoying amazing growth in terms of the number of organizations Data query processing such as identity resolution and ontological committing to use its core principles and technologies for exposing and interlinking data sets for seamless exchange, integration and classification. Starting from a motivational scenario we explain 6 how query-processing tasks can be decomposed and translated into reuse. More and more ICT ventures offer innovative data man- MTurk projects using our semantic approach, and roadmap the ex- agement services on top of Linked (Open) Data. A first notable tensions to graph-based data management technology that are re- example comes from the public administration sector, with a wide quired to achieve this vision. array of governmental institutions in many countries worldwide exposing their data according to open-access policies and Linked Data principles and technologies. Other equally promising applica- 1. INTRODUCTION tion domains include life sciences, cultural heritage, and media and Linked Data refers to a set of guidelines and best practices for broadcasting, with the BBC, the New York Times and the Guardian publishing and accessing structured data on the Web.1 It builds adopting the same approach to publish metadata about their content upon established Web technologies, in particular HTTP and URIs, in order to increase Web traffic, improve site navigation and facili- extended with Semantic Web representation formats and protocols tate the integration of self-owned and public data sets. such as RDF, RDFS, OWL and SPARQL, by which data from dif- Per design Linked Data management solutions aim to reach a ferent sources can be shared, interconnected and used beyond the high level of automation with respect to the processing of an open application scenarios for which it was originally created. RDF is and decentralized data space bringing together data sources pub- lished by different parties, of varying quality and using heteroge- 1http://www.w3.org/DesignIssues/LinkedData. html neous conceptual schemas and vocabularies. This is achieved by using a minimal set of established Web technologies known for their robustness and scalability, paired with a uniform data model 2http://www.w3.org/RDF/ 3http://www.w3.org/TR/rdf-schema/, http://www.w3.org/TR/owl-ref/ 4 Copyright c 2012 for the individual papers by the papers’ authors. Copy- http://www.w3.org/TR/rdf-sparql-query/ ing permitted for private and academic purposes. This volume is published 5http://spinrdf.org/ and copyrighted by its editors. 6See also recent statistics of the Linked Open Data Cloud at CrowdSearch 2012 workshop at WWW 2012, Lyon, France http://www4.wiwiss.fu-berlin.de/lodcloud/ Figure 1: Generic architecture for applications consuming Linked Data (according to [5]) Figure 2: Overview of our approach and links between datasets facilitating exploration and integration. management envisions the enhancement of core components of the Nevertheless, while the core technological building blocks of this architecture presented in Figure 1 with integrated crowdsourcing approach are maturing, and large amounts of useful data are con- features that deal with the packaging of specific tasks as MTurk tinuously made available on the Web, the actual experience in de- HITs, and with the integration of the crowdsourcing results with veloping Linked Data applications reveals that many aspects of existing semi-automatic functionality. Figure 2 schematically de- Linked Data management remain, for principled or technical rea- picts this idea for a query processing scenario, which is further sons, heavily reliant on human intervention. Figure 1 shows a discussed in Section 3. Dealing with MTurk includes user inter- generic architecture of applications consuming data exposed us- face management capabilities, in order to produce optimal human- ing Linked Data principles and technologies [5]. The two layers readable descriptions of specific tasks operating on specific data, to that lend itself to LD-specific tasks are the publication layer, and increase workers’ productivity, and to reduce unintended behavior the data access, integration and storage layer, respectively. Both such as spam, but also workflow design capabilities for handling contain various aspects amenable to crowdsourcing, as we will see complex tasks. As described in [2] we propose to use SPARQL later on: (i) publishing legacy data as RDF: identifying suitable graph patterns to describe tasks and data to drive the generation vocabularies and extending them, conceptual modeling, defining of HITs interfaces, as well as extensions of Linked Data manage- mapping rules from the legacy sources to Linked Data; (ii) Web ment languages and components, most prominently query process- data access: discovering or adding missing information about a re- ing, mapping and identity resolution, and Linked Data wrappers source; (iii) Vocabulary mapping and identity resolution: defining with crowdsourcing operators. The corresponding Linked Data correspondences between related resources (iv) Query processing components need to interact with the MTurk platform to post spe- over integrated data sets: aligning different styles or attributes of cific HITs, assess the quality of the outcomes, and exploit these in data integrated from heterogeneous data sources, detecting lack of a particular application. This interaction can occur offline, when knowledge. In these scenarios, human contributions can serve dif- crowdsourcing input is being used to gradually improve the quality ferent purposes, from undertaking the task itself, to generating data of computational methods, and online, which may require specific to train specific automatic techniques, and validating (intermedi- optimizations to predict the time-to-completion of crowdsourced ary) outcomes of such techniques. tasks. Recent approaches in the area of database management systems Each task that is subject to a crowdsourcing exercise is describes [1, 4, 9] are laying out the theoretical and technical foundations for using graph patterns from the SPARQL query language represent- designing and building crowdsourcing-enabled data management ing input to the human task and its desired output. The input pattern technology. In this paper we build upon these ideas and propose a expresses a query over the existing collection of Linked Data, nec- semantically enabled architecture for crowdsourced data manage- essary to provide the input to the MTurk task. The output pattern ment systems which uses formal representations of tasks and data represents a query that should succeed when the additional asser- to automatically design and optimize the operation and outcomes tions that result from the human task have been added to the knowl- of human computation projects. The architecture
Recommended publications
  • Mapping Spatiotemporal Data to RDF: a SPARQL Endpoint for Brussels
    International Journal of Geo-Information Article Mapping Spatiotemporal Data to RDF: A SPARQL Endpoint for Brussels Alejandro Vaisman 1, * and Kevin Chentout 2 1 Instituto Tecnológico de Buenos Aires, Buenos Aires 1424, Argentina 2 Sopra Banking Software, Avenue de Tevuren 226, B-1150 Brussels, Belgium * Correspondence: [email protected]; Tel.: +54-11-3457-4864 Received: 20 June 2019; Accepted: 7 August 2019; Published: 10 August 2019 Abstract: This paper describes how a platform for publishing and querying linked open data for the Brussels Capital region in Belgium is built. Data are provided as relational tables or XML documents and are mapped into the RDF data model using R2RML, a standard language that allows defining customized mappings from relational databases to RDF datasets. In this work, data are spatiotemporal in nature; therefore, R2RML must be adapted to allow producing spatiotemporal Linked Open Data.Data generated in this way are used to populate a SPARQL endpoint, where queries are submitted and the result can be displayed on a map. This endpoint is implemented using Strabon, a spatiotemporal RDF triple store built by extending the RDF store Sesame. The first part of the paper describes how R2RML is adapted to allow producing spatial RDF data and to support XML data sources. These techniques are then used to map data about cultural events and public transport in Brussels into RDF. Spatial data are stored in the form of stRDF triples, the format required by Strabon. In addition, the endpoint is enriched with external data obtained from the Linked Open Data Cloud, from sites like DBpedia, Geonames, and LinkedGeoData, to provide context for analysis.
    [Show full text]
  • Discovering Alignments in Ontologies of Linked Data∗
    Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence Discovering Alignments in Ontologies of Linked Data∗ Rahul Parundekar, Craig A. Knoblock, and Jose´ Luis Ambite University of Southern California Information Sciences Institute 4676 Admiralty Way Marina del Rey, CA 90292 Abstract The problem of schema linking (aka schema matching in databases and ontology alignment in the Semantic Web) has Recently, large amounts of data are being published received much attention [Bellahsene et al., 2011; Euzenat and using Semantic Web standards. Simultaneously, Shvaiko, 2007; Bernstein et al., 2011; Gal, 2011]. In this there has been a steady rise in links between objects paper we present a novel extensional approach to generate from multiple sources. However, the ontologies alignments between ontologies of linked data sources. Sim- behind these sources have remained largely dis- ilar to previous work on instance-based matching [Duckham connected, thereby challenging the interoperabil- and Worboys, 2005; Doan et al., 2004; Isaac et al., 2007], ity goal of the Semantic Web. We address this we rely on linked instances to determine the alignments. Two problem by automatically finding alignments be- concepts are equivalent if all (or most of) their respective in- tween concepts from multiple linked data sources. stances are linked (by owl:sameAs or similar links). How- Instead of only considering the existing concepts ever, our search is not limited to the existing concepts in in each ontology, we hypothesize new composite the ontology. We hypothesize new concepts by combining concepts, defined using conjunctions and disjunc- existing elements in the ontologies and seek alignments be- tions of (RDF) types and value restrictions, and tween these more general concepts.
    [Show full text]
  • Linked Data Enables Such Web of Data Global Identifier
    Lift your data Introduction to Linked Data Ghislain Atemezing1 , Boris Villazón-Terrazas2 1 EURECOM, France [email protected] 2 Ontology Engineering Group , FI , UPM [email protected] Slides available at: http://www.slideshare.net/atemezing/ INSPIRE Conference 2012, Istanbul Acknowledgements: Oscar Corcho, Asunción Gómez-Pérez, Luis Vilches, Raphaël Troncy, Olaf Hartig, Luis Vilches and many others that we may have omitted. Workdistributed under the license Creative Commons Attribution- Noncommercial-Share Alike 3.0 Agenda Linked Data Geospatial LD Datasets 5-star deployment scheme for Linked Open Data Tutorial “Lift your data”, Istanbul 2012 2 Agenda Linked Data Geospatial LD Datasets 5-star deployment scheme for Linked Open Data Tutorial “Lift your data”, Istanbul 2012 3 Current Web Geospatial Database (Spain) Data exposed to the Web via HTML, pdf, etc. Statistical Database (Spain) © Slide adapted from “5min Introduction to Linked Data”- Olaf Hartig Tutorial “Lift your data”, Istanbul 2012 4 Information from ssgeingle pages can be found via sear ch eegngin es © Slide adapted from “5min Introduction to Linked Data”- Olaf Hartig Tutorial “Lift your data”, Istanbul 2012 5 Complex queries over muliltipl e pages/data sources? © Slide adapted from “5min Introduction to Linked Data”- Olaf Hartig Tutorial “Lift your data”, Istanbul 2012 6 What do we actually want? • Use the Web like a single global database • web of documents -> web of data Geospatial Statistical Database Database (Spain) (Spain) © Slide adapted from “5min Introduction to Linked Data”- Olaf Hartig 7 Tutorial “Lift your data”, Istanbul 2012 7 Linked Data enables such Web of Data Global Identifier: URI (Uniform Resource Identifier), which is a string of characters used to identify a name or a resource on the Internet.
    [Show full text]
  • Historical Gazetteer System Integration: CHGIS, Regnum Francorum, and Geonames
    p. 1 Historical Gazetteer System Integration: CHGIS, Regnum Francorum, and GeoNames Merrick Lex Berman (CGA, Harvard University) Johan Åhlfeldt (Regnum Francorum Online) Marc Wick 24 Jul 2014 [draft chapter for submission to publisher] Abstract: Integrating digital gazetteers involves the matching of place name records, disambiguation of unique places and conflation of duplicates or variant place names. The challenge of mapping between historical instances of place names is also an ongoing concern for several important projects dealing with ancient place names. Here the matching of historical place names from two unrelated datasets (for China and France) to gazetteer web services is undertaken using a basic geospatial and geonomial algorithm. The quantitative results of the matching trials are considered, problems in dealing with vernacular scripts considered, and practical implications for integrating historical gazetteers discussed. 1. Gazetteer Web Services and Digital Historical Gazetteers Online gazetteers provide essential services. They serve as authority records for geocoding place names and for retrieval of place names associated with real world coordinates (reverse geocoding). For example, the expanding interconnections of LinkedOpenData [LOD] 1 on the semantic web have consistently placed GeoNames 2 at or near the center of the semantic web's social graph. The centrality of GeoNames is largely due to three factors: first, it is a free and open Application Programming Interface [API]; second, the API is simple and easy-to-use; and third, it is currently the only global geographic resource with stable URIs. Traffic for the GeoNames web service topped twenty million requests per day (in 2012); and since half of these requests are from smart phones, 3 it is clear that geographic information retrieval [GIR] is being p.
    [Show full text]
  • The Resource Description Framework and Its Schema Fabien Gandon, Reto Krummenacher, Sung-Kook Han, Ioan Toma
    The Resource Description Framework and its Schema Fabien Gandon, Reto Krummenacher, Sung-Kook Han, Ioan Toma To cite this version: Fabien Gandon, Reto Krummenacher, Sung-Kook Han, Ioan Toma. The Resource Description Frame- work and its Schema. Handbook of Semantic Web Technologies, 2011, 978-3-540-92912-3. hal- 01171045 HAL Id: hal-01171045 https://hal.inria.fr/hal-01171045 Submitted on 2 Jul 2015 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. The Resource Description Framework and its Schema Fabien L. Gandon, INRIA Sophia Antipolis Reto Krummenacher, STI Innsbruck Sung-Kook Han, STI Innsbruck Ioan Toma, STI Innsbruck 1. Abstract RDF is a framework to publish statements on the web about anything. It allows anyone to describe resources, in particular Web resources, such as the author, creation date, subject, and copyright of an image. Any information portal or data-based web site can be interested in using the graph model of RDF to open its silos of data about persons, documents, events, products, services, places etc. RDF reuses the web approach to identify resources (URI) and to allow one to explicitly represent any relationship between two resources.
    [Show full text]
  • Aligning Ontologies of Linked Data
    Chapter 1 Aligning Ontologies of Linked Data Rahul Parundekar University of Southern California Craig A. Knoblock University of Southern California Jos´eLuis Ambite University of Southern California 1.1 Introduction ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: 1 1.2 Linked Data Sources with Heterogeneous Ontologies :::::::::::::::::::: 3 1.3 Finding Alignments Across Ontologies ::::::::::::::::::::::::::::::::::: 3 1.3.1 Source Preprocessing :::::::::::::::::::::::::::::::::::::::::::::: 4 1.3.2 Aligning Atomic Restriction Classes :::::::::::::::::::::::::::::: 4 1.3.3 Aligning Conjunctive Restriction Classes ::::::::::::::::::::::::: 6 1.3.4 Eliminating Implied Alignments :::::::::::::::::::::::::::::::::: 7 1.3.5 Finding Concept Coverings ::::::::::::::::::::::::::::::::::::::: 10 1.3.6 Curating Linked Data ::::::::::::::::::::::::::::::::::::::::::::: 11 1.4 Results :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: 12 1.4.1 Representative Examples of Atomic Alignments ::::::::::::::::: 12 1.4.2 Representative Examples of Conjunctive Alignments :::::::::::: 14 1.4.3 Representative Examples of Concept Coverings :::::::::::::::::: 14 1.4.4 Outliers :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: 15 1.4.5 Precision and Recall of Country Alignments ::::::::::::::::::::: 15 1.4.6 Representative Alignments from Other Domains ::::::::::::::::: 16 1.5 Related Work ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: 16 1.6 Conclusion :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
    [Show full text]
  • Geolod: a Spatial Linked Data Catalog and Recommender
    big data and cognitive computing Article GeoLOD: A Spatial Linked Data Catalog and Recommender Vasilis Kopsachilis *,† and Michail Vaitis † Department of Geography, University of the Aegean, GR-811 00 Mytilene, Greece; [email protected] * Correspondence: [email protected] † Current address: Department of Geography, University Hill, GR-811 00 Mytilene, Greece. Abstract: The increasing availability of linked data poses new challenges for the identification and retrieval of the most appropriate data sources that meet user needs. Recent dataset catalogs and recommenders provide advanced methods that facilitate linked data search, but none exploits the spatial characteristics of datasets. In this paper, we present GeoLOD, a web catalog of spatial datasets and classes and a recommender for spatial datasets and classes possibly relevant for link discovery processes. GeoLOD Catalog parses, maintains and generates metadata about datasets and classes provided by SPARQL endpoints that contain georeferenced point instances. It offers text and map- based search functionality and dataset descriptions in GeoVoID, a spatial dataset metadata template that extends VoID. GeoLOD Recommender pre-computes and maintains, for all identified spatial classes in the Web of Data (WoD), ranked lists of classes relevant for link discovery. In addition, the on-the-fly Recommender allows users to define an uncatalogued SPARQL endpoint, a GeoJSON or a Shapefile and get class recommendations in real time. Furthermore, generated recommendations can be automatically exported in SILK and LIMES configuration files in order to be used for a link discovery task. In the results, we provide statistics about the status and potential connectivity of spatial datasets in the WoD, we assess the applicability of the recommender, and we present the outcome of a system usability study.
    [Show full text]
  • Introduction Au Web Sémantique
    INTRODUCTION AU WEB SÉMANTIQUE Philippe GENOUD – Danielle ZIEBELIN - LIG-Steamer [email protected] 1 https://www.w3.org/standards/semanticweb/ In addition to the classic “Web of documents” W3C is helping to build a technology stack to support a “Web of data,” the sort of data you find in databases. The ultimate goal of the Web of data is to enable computers to do more useful work and to develop systems that can support trusted interactions over the network. The term “Semantic Web” refers to W3C’s vision of the Web of linked data. Semantic Web technologies enable people to create data stores on the Web, build vocabularies, and write rules for handling data. Linked data are empowered by technologies such as RDF, SPARQL, OWL, and SKOS. 2 WEB architecture URL www.google.com Hyper Link HTML <a href="http://fr.wikipedia.oorg/wiki/Georges_Brassens>..</a> fr.wikipedia.org 3 Web Sémantique – Ontologies P. Genoud D. Ziébelin Web evolution * * ? *d’après le cours Technologie du Web – A. Hombiat - Licence Professionnelle Études Statistiques et Systèmes d'Information Géographique (ESSIG) – UPMF - 2015 4 Web Sémantique – Ontologies P. Genoud D. Ziébelin Web evolution * * ? "The web of human-readable document is being merged with a web of machine understandable data. The potential of the mixture of humans and machines working together and communication through the web could be immense." Tim Berners-Lee, The World Wide Web: A very short personal history, May 1998 http://www.w3.org/People/Berners-Lee/ShortHistory.html 5 Web Sémantique – Ontologies P. Genoud D. Ziébelin From Web of Documents to Web of Data "The web of human-readable document is being merged with a web of machine understandable data." • The traditional web (Web of Documents) is for humans – based on the HTML markup language – HTML describes • what information is presented + how it's presented in conjunction with CSS • how information is linked • but not what the information means – meaning (semantics) of information is derived from available information 6 Web Sémantique – Ontologies P.
    [Show full text]
  • A Semantic Enhanced Model for Effective Spatial Information Retrieval Adeyinka Akanbi, Olusanya Agunbiade, Olumuyiwa Dehinbo, Sadiq Kuti
    A Semantic Enhanced Model for Effective Spatial Information Retrieval Adeyinka Akanbi, Olusanya Agunbiade, Olumuyiwa Dehinbo, Sadiq Kuti To cite this version: Adeyinka Akanbi, Olusanya Agunbiade, Olumuyiwa Dehinbo, Sadiq Kuti. A Semantic Enhanced Model for Effective Spatial Information Retrieval: Un modèle sémantique améliorée for Effective Information Retrieval spatiale. Proceedings of the World Congress on Engineering and Computer Science WCECS 2014, IAENG, Oct 2014, San Francisco, United States. pp.6. hal-01252644 HAL Id: hal-01252644 https://hal.archives-ouvertes.fr/hal-01252644 Submitted on 8 Jan 2016 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Proceedings of the World Congress on Engineering and Computer Science 2014 Vol I WCECS 2014, 22-24 October, 2014, San Francisco, USA A Semantic Enhanced Model for Effective Spatial Information Retrieval Adeyinka K. Akanbi, Olusanya Y. Agunbiade, Olumuyiwa J. Dehinbo, and Sadiq T. Kuti Abstract—A lot of information on the web is geographically detailed sematic referencing showing the spatial relationship referenced. Discovering and retrieving this geographic among entities of the geospatial data for easy accessibility information to satisfy various users needs across both open and by the user. According to [2], Spatial Information Retrieval distributed Spatial Data Infrastructures (SDI) poses eminent is mainly concerned with the provision of access to geo- research challenges.
    [Show full text]
  • The Basic Message Was to Leave out Anything the Provider Does Not
    Europeana Data Model – Mapping Guidelines v2.3 18/11/2016 Co-funded by the European Union Document Scope This is The EDM Mapping Guidelines document. It is part of the family of documents about the Europeana Data Model (EDM) aimed at different audiences. They can all be found at http://pro.europeana.eu/edm-documentation, except the object templates which are at https://github.com/europeana/corelib/wiki/EDMObjectTemplatesProviders The EDM Mapping Guidelines – give guidance for providers wanting to map their data to EDM. They show which property relates to which class and contains definitions of the properties, the data types that can be used as values and the obligation level of each property. It also has an example of original data, the same data converted to EDM and diagrams showing the distribution of the properties amongst the classes. The full set of EDM classes and properties are being implemented incrementally and the Mapping Guidelines is the reference document showing which are currently available. The EDM Definition – the formal specification of the Europeana Data Model and lists the classes and properties that could be used in Europeana. Not all of these are currently implemented. Please refer to the Mapping Guidelines for the current subset of classes and properties in use. The EDM Primer – the “story” of EDM and explains how the classes and properties may be used together to model data and support Europeana functionality. The XML schema and the Schematron rules - this is the XML schema and the required Schematron rules for validating the current implementation of EDM. The EDM object templates – a working document is a simple wiki listing that shows which properties apply to which class and states the data types and obligation of the values.
    [Show full text]
  • Instructions for the Preparation of a Camera-Ready Paper in MS Word
    An RDF guide for the Darwin Core standard Editor(s): Solicited review(s): Open review(s): Steven J. Baskaufa,*, John Wieczorekb, John Deckc, and Campbell O. Webbd aDepartment of Biological Sciences, PMB 351634, Vanderbilt University, Nashville, TN 37146, USA E-mail: [email protected] bMuseum of Vertebrate Zoology, University of California at Berkeley, 3101 Valley Life Sciences Building, Berkeley, CA 94720, USA E-mail: [email protected] c Berkeley Natural History Museums, University of California at Berkeley, 1001 Valley Life Sciences Building, Berkeley, CA 94720, USA Email: [email protected] dArnold Arboretum of Harvard University, 1300 Centre St., Boston 02131, USA E-mail: [email protected] Abstract. The Darwin Core vocabulary is widely used to transmit biodiversity data in the form of simple text files. In order to support expression of biodiversity data in the Resource Description Framework (RDF), a guide was created as a non-normative addition to the Darwin Core standard. The guide resolves a number of issues that arise from adapting terms designed to have literal values for use with URI references. Although there are some problems that are beyond the scope of the guide, the guide is an important step towards enabling the biodiversity informatics community to participate in broader Linked Data and Se- mantic Web efforts. Keywords: biodiversity, informatics, standards, semantics, interoperability 1. Introduction Darwin Core roughly follows the Dublin Core Ab- stract Model,4 which was built on Resource Descrip- Darwin Core (DwC) [1] is a technical standard1 of tion Framework5 (RDF) concepts. A DwC term is Biodiversity Information Standards2 (TDWG). Since designated as a URI, which can be dereferenced to its ratification in 2009, Darwin Core has been widely access its normative RDF/XML metadata.
    [Show full text]
  • Linked Sensor Data
    Wright State University CORE Scholar The Ohio Center of Excellence in Knowledge- Kno.e.sis Publications Enabled Computing (Kno.e.sis) 5-2010 Linked Sensor Data Harshal Kamlesh Patni Wright State University - Main Campus Cory Andrew Henson Wright State University - Main Campus Amit P. Sheth Wright State University - Main Campus, [email protected] Follow this and additional works at: https://corescholar.libraries.wright.edu/knoesis Part of the Bioinformatics Commons, Communication Technology and New Media Commons, Databases and Information Systems Commons, OS and Networks Commons, and the Science and Technology Studies Commons Repository Citation Patni, H. K., Henson, C. A., & Sheth, A. P. (2010). Linked Sensor Data. 2010 International Symposium on Collaborative Technologies and Systems, 362-370. https://corescholar.libraries.wright.edu/knoesis/545 This Conference Proceeding is brought to you for free and open access by the The Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis) at CORE Scholar. It has been accepted for inclusion in Kno.e.sis Publications by an authorized administrator of CORE Scholar. For more information, please contact library- [email protected]. Linked Sensor Data Harshal Patni, Cory Henson, Amit Sheth Kno.e.sis Center, Department of Computer Science and Engineering, Wright State University, Dayton, Ohio, USA {harshal, cory, amit}@knoesis.org ABSTRACT the WindSpeed measurements from these sensors. However, the data is only accessible to the organization A number of government, corporate, and academic involved in the collection of such data. Such data is very organizations are collecting enormous amounts of data useful for research and analysis, and having access to provided by environmental sensors.
    [Show full text]