Component Report
Project Acronym: OpenUp!
Grant Agreement No: 270890
Project Title: Opening up the Natural History Heritage for Europeana

C3.2.1 Domain specific vocabularies for EUROPEANA - interim
Concept for inclusion of domain specific metadata vocabularies and contribution to improving access to scientific information via EDM

Revision: Version 2a (final)

Authors (in alphabetical order):
Benda Odo, AIT Forschungsgesellschaft mbH
Höller Astrid, AIT Forschungsgesellschaft mbH
Koch Gerda, AIT Forschungsgesellschaft mbH
Koch Walter, AIT Forschungsgesellschaft mbH

Project co-funded by the European Commission within the ICT Policy Support Programme

Dissemination Level
P  Public
C  Confidential, only for members of the consortium and the Commission Services  x

AIT, 2012 C3.2.1 version 2a p. 1

Revision History

Revision  Date        Author                   Organisation             Description
Draft     2012-07-18  W. Koch, G. Koch         AIT                      Concept and Draft, EDM
Draft     2012-07-23  A. Höller, G. Koch       AIT                      Update, vocabularies
Draft     2012-07-24  A. Höller                AIT                      Description of Work update
Draft     2012-08-02  O. Benda                 AIT                      Update, technical concept
1         2012-08-03  G. Koch                  AIT                      Annexes I and II
1a        2012-08-20  O. Benda                 AIT                      Technical concept refined, TMG comments integrated, examples
1b        2012-08-21  A. Höller                AIT                      Including EOL-APIs (Annex III)
2         2012-08-24  W. Koch, G. Koch         AIT                      Description of Work update, chapter 5, final editing
2a        2012-09-25  A. Michel, P. Böttinger  Coordination Team, BGBM  Minor editing

Statement of Originality
This deliverable contains original unpublished work except where clearly indicated otherwise. Acknowledgement of previously published material and of the work of others has been made through appropriate citation, quotation or both.

Distribution

Recipient                                                         Date        Version  Accepted YES/NO
TMG                                                               2012-08-03  1
Project Coordinator                                               2012-08-03  1
TMG (AIT, BGBM, GBIF, IBSAS, MFN, MRAC, NHM, NHMW, RBGK, UH)      2012-08-24  2
Project Coordinator (W.Berendsohn, BGBM)                          2012-08-24  2a       Yes
Table of Contents

1 DESCRIPTION OF WORK
2 INTRODUCTION TO THE EDM CLASSES
3 THE EDM CONTEXTUAL CLASSES AND RELEVANT VOCABULARIES
   3.1 Who? (edm:Agent)
   3.2 Where? (edm:Place)
   3.3 When? (edm:TimeSpan)
   3.4 What? (skos:Concept)
   3.5 edm:Event
   3.6 edm:PhysicalThing
4 THE EDM CORE CLASSES AND ABCD(EFG)
   4.1 edm:ProvidedCHO
   4.2 edm:Aggregation
   4.3 edm:WebResource
5 EDM DATA ENRICHMENT WITH ADDITIONAL DATA SOURCES
6 THE ONTOLOGY DATA GATEWAY
   6.1 Gateway Protocol
7 EXTENSIONS OF PENTAHO TRANSFORMATIONS
8 HELPER APPLICATIONS FOR ONTOLOGY
   8.1 Vocabulary Registry module
   8.2 Caching module
   8.3 Usage Directory module
9 ANNEX I
10 ANNEX II
11 ANNEX III
   11.1 search
   11.2 pages
12 LIST OF FIGURES
13 LIST OF TABLES
14 LIST OF REFERENCES

1 DESCRIPTION OF WORK

Based on the analysis of EDM and various domain specific vocabularies, a first concept for inclusion of metadata vocabularies and metadata enrichment was worked out.
For this purpose, existing tools for building and deploying semantic knowledge representations were evaluated.

Figure 1 Ingesting records into Europeana (overall workflow)

In Task 3.4, a survey template for vocabularies was established and a selection of currently available and relevant vocabularies for metadata enrichment was carried out, specifically vocabularies for vernacular names, person names, locations, time etc. The findings of this survey have been incorporated in chapter 3 of this document (relevant vocabularies for the EDM contextual classes). The survey template is depicted in Annex I of this document. In addition to the survey results, the information provided by WP6 in C6.1.1 and C6.3.1 has been integrated into chapter 3.

The selection of vocabularies was further based on the following criteria: availability of documentation, provision of a test URL, and open access (rights). It was driven by the EDM requirements described in chapter 2.

A new component (EDM Management Subsystem) is under development and will be based on semantic technologies and standards such as the XML Topic Maps standard (XTM, ISO/IEC 13250). This component will be integrated into the workflow of the Natural History Aggregator.

Chapter 4 depicts a first mapping of the EDM core classes to the ABCD(EFG) standard and identifies relevant fields for enrichment; chapter 5 provides an outlook on EDM data enrichment possibilities; and chapters 6 to 8 outline the envisaged technical implementation of vocabulary integration.

2 INTRODUCTION TO THE EDM CLASSES

The Europeana Data Model (short: EDM) 1 is a comprehensive metadata model that will be the favoured data standard for metadata integration into the Europeana portal (http://www.europeana.eu) in future. The model introduces its own elements and re-uses elements from various namespaces. Among these are:
- The Resource Description Framework (RDF) and the RDF Schema (RDFS) namespaces (http://www.w3.org/TR/rdf-concepts/)
- The OAI Object Reuse and Exchange (ORE) namespace (http://www.openarchives.org/ore)
- The Simple Knowledge Organization System (SKOS) namespace (http://www.w3.org/TR/skos-reference/)
- The Dublin Core namespaces for elements (http://purl.org/dc/elements/1.1/, abbreviated as DC), terms (http://purl.org/dc/terms/, abbreviated as DCTERMS) and types (http://purl.org/dc/dcmitype/, abbreviated as DCMITYPE)

The EDM is a theoretical data model that allows data to be presented in different ways according to the practices of the various domains that contribute data to Europeana. A process has been undertaken to translate the model from a specification into a practical implementation. This has necessitated a selection process. Initially, the essential classes that could realistically be taken forward to a first implementation were selected. Following this, the properties from EDM that could apply to each of those classes were identified. The final stage was to select a sub-set of those properties that would be incorporated in the initial XML schema.2

Figure 2 The EDM Class hierarchy3

Figure 2: The classes introduced by EDM are shown in light blue rectangles. The classes in the white rectangles are re-used from other schemas; the schema is indicated before the colon.

1 EDM documentation. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.
2 Europeana Data Model Mapping Guidelines. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.
3 Europeana Data Model Mapping Guidelines. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.

EDM data integration will be realized step by step in the Europeana portal4. For the initial implementations a