COMPONENT REPORT

Project Acronym: OpenUp!

Grant Agreement No: 270890

Project Title: Opening up the Natural History Heritage for Europeana

C3.2.1 Domain specific vocabularies for EUROPEANA - interim Concept for inclusion of domain specific metadata vocabularies and contribution to improving access to scientific information via EDM

Revision: Version 2a (final)

Authors (in alphabetical order):
Benda Odo, AIT Forschungsgesellschaft mbH
Höller Astrid, AIT Forschungsgesellschaft mbH
Koch Gerda, AIT Forschungsgesellschaft mbH
Koch Walter, AIT Forschungsgesellschaft mbH

Project co-funded by the European Commission within the ICT Policy Support Programme

Dissemination Level
P | Public |
C | Confidential, only for members of the consortium and the Commission Services | x

AIT, 2012 C3.2.1 version 2a p. 1

Revision History

Revision | Date | Author | Organisation | Description
Draft | 2012-07-18 | W. Koch, G. Koch | AIT | Concept and Draft, EDM
Draft | 2012-07-23 | A. Höller, G. Koch | AIT | Update, vocabularies
Draft | 2012-07-24 | A. Höller | AIT | Description of Work update
Draft | 2012-08-02 | O. Benda | AIT | Update, technical concept
1 | 2012-08-03 | G. Koch | AIT | Annexes I and II
1a | 2012-08-20 | O. Benda | AIT | Technical concept refined, TMG comments integrated, examples
1b | 2012-08-21 | A. Höller | AIT | Including EOL-APIs (Annex III)
2 | 2012-08-24 | W. Koch, G. Koch | AIT | Description of Work update, chapter 5, final editing
2a | 2012-09-25 | A. Michel, P. Böttinger | Coordination Team BGBM | Minor editing

Statement of Originality

This deliverable contains original unpublished work except where clearly indicated otherwise. Acknowledgement of previously published material and of the work of others has been made through appropriate citation, quotation or both.

Distribution

Recipient | Date | Version | Accepted YES/NO
TMG | 2012-08-03 | 1 |
Project Coordinator | 2012-08-03 | 1 |
TMG (AIT, BGBM, GBIF, IBSAS, MFN, MRAC, NHM, NHMW, RBGK, UH) | 2012-08-24 | 2 |
Project Coordinator (W.Berendsohn, BGBM) | 2012-08-24 | 2a | Yes


Table of Contents

1 DESCRIPTION OF WORK
2 INTRODUCTION TO THE EDM CLASSES
3 THE EDM CONTEXTUAL CLASSES AND RELEVANT VOCABULARIES
3.1 Who? (edm:Agent)
3.2 Where? (edm:Place)
3.3 When? (edm:TimeSpan)
3.4 What? (skos:Concept)
3.5 edm:Event
3.6 edm:PhysicalThing
4 THE EDM CORE CLASSES AND ABCD(EFG)
4.1 edm:ProvidedCHO
4.2 edm:Aggregation
4.3 edm:WebResource
5 EDM DATA ENRICHMENT WITH ADDITIONAL DATA SOURCES
6 THE ONTOLOGY DATA GATEWAY
6.1 Gateway Protocol
7 EXTENSIONS OF PENTAHO TRANSFORMATIONS
8 HELPER APPLICATIONS FOR ONTOLOGY
8.1 Vocabulary Registry module
8.2 Caching module
8.3 Usage Directory module
9 ANNEX I
10 ANNEX II
11 ANNEX III
11.1 search
11.2 pages
12 LIST OF FIGURES
13 LIST OF TABLES
14 LIST OF REFERENCES


1 DESCRIPTION OF WORK

Based on the analysis of EDM and various domain specific vocabularies a first concept for inclusion of metadata vocabularies and metadata enrichment was worked out. For this purpose existing tools for building and deploying semantic knowledge representations were evaluated.

Figure 1 Ingesting records into Europeana (overall workflow)

In Task 3.4 a survey template for vocabularies was established, and a selection of currently available and relevant vocabularies for metadata enrichment was carried out, specifically vocabularies for vernacular names, person names, locations, time etc. The findings of this survey have been incorporated in chapter 3 of this document (relevant vocabularies for the EDM contextual classes). The survey template is depicted in Annex I of this document. In addition to the survey results, the information provided by WP6 in C6.1.1 and C6.3.1 has been integrated into chapter 3. The selection of vocabularies was further based upon the following criteria: availability of documentation, provision of a test URL and open access (rights). It was driven by the EDM requirements as described in chapter 2.

A new component (EDM Management Subsystem) is under development and will be based on semantic technologies and standards like the XML Topic Map standard (XTM, ISO/IEC 13250). This component will be integrated into the workflow of the Natural History Aggregator.

Chapter 4 depicts a first mapping of the EDM core classes to the ABCD(EFG) standard and identifies relevant fields for enrichment, chapter 5 provides an outlook on EDM data enrichment possibilities, and chapters 6 to 8 outline the envisaged technical implementation of vocabulary integration.


2 INTRODUCTION TO THE EDM CLASSES

The Europeana Data Model (short: EDM) 1 is a comprehensive metadata model that will in future be the favoured data standard for metadata integration into the Europeana portal (http://www.europeana.eu). The model introduces its own elements and re-uses elements from various namespaces. Among these are:
• The Resource Description Framework (RDF) and the RDF Schema (RDFS) namespaces (http://www.w3.org/TR/rdf-concepts/)
• The OAI Object Reuse and Exchange (ORE) namespace (http://www.openarchives.org/ore)
• The Simple Knowledge Organization System (SKOS) namespace (http://www.w3.org/TR/skos-reference/)
• The Dublin Core namespaces for elements (http://purl.org/dc/elements/1.1/, abbreviated as DC), terms (http://purl.org/dc/terms/, abbreviated as DCTERMS) and types (http://purl.org/dc/dcmitype/, abbreviated as DCMITYPE)
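As a rough illustration of how these namespaces are typically handled in code, the following Python sketch maps prefixes to namespace URIs and expands a prefixed name such as dc:title. The Dublin Core URIs are the ones given above. The RDF, RDFS, ORE, SKOS and EDM namespace URIs are the commonly published ones and are not stated in this document (which cites the specification pages instead), so they should be read as assumptions. The helper name expand_qname is hypothetical.

```python
# Prefix-to-namespace map for the vocabularies re-used by EDM.
# The DC, DCTERMS and DCMITYPE URIs come from the text above; the others are
# the commonly published namespace URIs and are assumptions of this sketch.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "ore": "http://www.openarchives.org/ore/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "dcmitype": "http://purl.org/dc/dcmitype/",
    "edm": "http://www.europeana.eu/schemas/edm/",
}


def expand_qname(qname: str) -> str:
    """Expand a prefixed name such as 'dc:title' into a full URI."""
    prefix, _, local = qname.partition(":")
    if not local or prefix not in NAMESPACES:
        raise ValueError(f"unknown or malformed prefixed name: {qname!r}")
    return NAMESPACES[prefix] + local


if __name__ == "__main__":
    print(expand_qname("dc:title"))
    print(expand_qname("edm:ProvidedCHO"))
```

Keeping the prefixes in one table means that mapping code can refer to properties by their short names throughout, as the tables in this document do.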

The EDM is a theoretical data model that allows data to be presented in different ways according to the practices of the various domains who contribute data to Europeana. A process has been undertaken to translate the model from a specification into a practical implementation. This has necessitated a selection process. Initially the essential classes that could realistically be taken forward to a first implementation were selected. Following this, the properties from EDM that could apply to each of those classes were identified. The final stage was to select a sub-set of those properties that would be incorporated in the initial XML schema.2

Figure 2 The EDM Class hierarchy3

Figure 2: The classes introduced by EDM are shown in light blue rectangles. The classes in the white rectangles are re-used from other schemas; the schema is indicated before the colon.

1 EDM documentation. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.
2 Europeana Data Model Mapping Guidelines. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.
3 Europeana Data Model Mapping Guidelines. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.


EDM data integration will be realized step by step in the Europeana portal4. For the initial implementations a set of seven classes has been selected:

Core classes:
• the provided cultural heritage object (edm:ProvidedCHO)
• the web resource that is the digital representation (edm:WebResource)
• the aggregation that groups the classes together (ore:Aggregation)

Contextual classes:
• who (edm:Agent)
• where (edm:Place)
• when (edm:TimeSpan)
• what (skos:Concept)

The contextual classes support the modelling of semantic enrichment. They make it possible to present information that is distinct from the provided cultural heritage object itself and to give additional details on, e.g., the collector of the data or the place of gathering. In Europeana the inclusion of this additional data is realized via a “proxy” mechanism, which supports this function without distorting the original data received from the providers. Usually the values of the properties of these classes are taken from controlled vocabularies and thesauri in the form of identifiers that link to further information on the vocabulary term (e.g. the longitude/latitude of the place of finding, the birth date of the collector etc.). The enrichment processes in OpenUp! will therefore have the task of providing the values of the properties of the EDM contextual classes.

Figure 3 Two contextual classes5

4 Europeana portal. http://www.europeana.eu 18 Jul. 2012.
5 Europeana Data Model Mapping Guidelines. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.


3 THE EDM CONTEXTUAL CLASSES AND RELEVANT VOCABULARIES

3.1 Who? (edm:Agent)

“This class comprises people, either individually or in groups, who have the potential to perform intentional actions for which they can be held responsible.” 6 The following table shows the properties that will be applied in the implementation of edm:Agent.

Property | Note | Value type | Obligation and Occurrence
dc:date | A significant date associated with the Agent. | reference | min 0, max unbounded
dc:identifier | An identifier of the agent. | reference | min 0, max unbounded
edm:begin | The date the agent was born/established. | literal | min 0, max 1
edm:end | The date the agent died/terminated. | literal | min 0, max 1
edm:hasMet | The identifier of another entity which the agent has "met" in a broad sense. | reference | min 0, max unbounded
edm:isRelatedTo | The identifier of other entities, particularly other agents, with whom the agent is related in a generic sense. | reference | min 0, max unbounded
edm:wasPresentAt | The identifier of an event at which the agent was present. | reference | min 0, max unbounded
foaf:name | The name of the agent as a simple textual string. | literal | min 0, max unbounded
owl:sameAs | The URI of an agent. | reference (of an Agent) | min 0, max unbounded
rdaGr2:biographicalInformation | Information pertaining to the life or history of the agent. | literal | min 0, max unbounded
rdaGr2:dateOfBirth | The date the agent (person) was born. | literal | min 0, max 1
rdaGr2:dateOfDeath | The date the agent (person) died. | literal | min 0, max 1
rdaGr2:dateOfEstablishment | The date on which the agent (corporate body) was established or founded. | literal | min 0, max 1
rdaGr2:dateOfTermination | The date on which the agent (corporate body) was terminated or dissolved. | literal | min 0, max 1
rdaGr2:gender | The gender with which the agent identifies. | literal | min 0, max 1
rdaGr2:professionOrOccupation | The profession or occupation in which the agent works or has worked. | literal or reference | min 0, max unbounded
skos:altLabel, skos:hiddenLabel | Alternative forms of the name of the agent. | literal | min 0, max unbounded
skos:note | A note about the agent e.g. biographical notes. | literal | min 0, max unbounded
skos:prefLabel | The preferred form of the name of the agent. | literal | min 0, max 1 per lang tag

Table 1 Properties for edm:Agent7
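The occurrence constraints of Table 1 can be checked mechanically before an agent record is passed on. The following Python sketch shows one possible way of doing so; it is not part of the OpenUp! implementation. The function name check_agent, the dictionary layout and the sample record are hypothetical, only a subset of the properties is listed, and skos:prefLabel is left out because its limit applies per language tag.

```python
from typing import Dict, List, Optional

# Maximum occurrences per property, taken from Table 1 (None = unbounded).
# Only a subset of the table is listed to keep the sketch short.
AGENT_MAX_OCCURS: Dict[str, Optional[int]] = {
    "dc:date": None,
    "dc:identifier": None,
    "edm:begin": 1,
    "edm:end": 1,
    "foaf:name": None,
    "owl:sameAs": None,
    "rdaGr2:dateOfBirth": 1,
    "rdaGr2:dateOfDeath": 1,
    "rdaGr2:gender": 1,
    "skos:altLabel": None,
    "skos:note": None,
}


def check_agent(agent: Dict[str, List[str]]) -> List[str]:
    """Return a list of human-readable cardinality violations."""
    problems = []
    for prop, values in agent.items():
        if prop not in AGENT_MAX_OCCURS:
            problems.append(f"{prop}: not a known edm:Agent property in this sketch")
            continue
        limit = AGENT_MAX_OCCURS[prop]
        if limit is not None and len(values) > limit:
            problems.append(f"{prop}: {len(values)} values, at most {limit} allowed")
    return problems


if __name__ == "__main__":
    # Hypothetical record; the identifier is a placeholder, not a real URI.
    record = {
        "foaf:name": ["Example Collector"],
        "owl:sameAs": ["http://example.org/agent/1"],
        "edm:begin": ["1900", "1901"],  # violates "max 1"
    }
    for line in check_agent(record):
        print(line)
```

All minimum occurrences in Table 1 are 0, so only the upper bounds need to be checked.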

Info sources mentioned by Europeana:
• some relevant datasets, with their own schemas or extensions: VIAF, Amsterdam Museum persons, DNB authority files as linked data, constructs used at data.bnf.fr
• vocabularies: FOAF, RDA group 2 elements and relations, MADS/RDF

6 Europeana Data Model Mapping Guidelines. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.
7 Properties marked in bold will be used for the first implementation of EDM. Properties with grey background will not be implemented.


VIAF Virtual International Authority File
Institution: Online Computer Library Center (OCLC)
Access (URL): http://viaf.org/viaf/search
Documentation/Test (URL): http://oclc.org/developer/documentation/virtual-international-authority-file-viaf/using-api
Languages: MUL
Usage and Rights: 4 April 2012: VIAF (Virtual International Authority File), a project that virtually combines multiple name authority files into a single name authority service, has transitioned to become an OCLC service. OCLC will continue to make VIAF openly accessible and will also work to incorporate VIAF into various OCLC services. The new Agreement confirms the free re-use of VIAF data, including the commercial re-use of data according to the ODC-By license. http://oclc.org/developer/services/viaf

FOAF Friend of a Friend standard: Use this standard for sharing person name data.

Info sources mentioned by OpenUp!: VIAF and FOAF are also mentioned by C6.1.1 and C6.3.1.

EML Agent Role Vocabulary
Institution: GBIF
Access (URL): http://rs.gbif.org/vocabulary/gbif/agent_role.xml
Languages: EN, ES, FR
Usage and Rights: http://data.gbif.org/tutorial/datauseagreement
Comments: The “EML Agent Role Vocabulary” is a vocabulary for agent roles used in EML. It includes the following concepts:
• author (an agent who is an author of a publication that used the dataset, or author of a data paper)
• contentProvider (an agent who contributed content to a dataset; the dataset being described may be a composite)
• custodianSteward (an agent who is responsible for/takes care of the dataset)
• distributor (an agent involved in the publishing/distribution chain of a dataset)
• editor (an agent associated with editing a publication that used the dataset, or a data paper)
• metadataProvider (an agent responsible for providing the metadata)
• originator (an agent who originally gathered/prepared the dataset)
• owner (an agent who owns the dataset; may or may not be the custodian)
• pointOfContact (an agent to contact for further information about the dataset)
• principalInvestigator (a primary scientific contact associated with the dataset)
• processor (an agent responsible for any post-collection processing of the dataset)
• publisher (the agent associated with the publishing of some entity, for example paper, article, book, etc. based on the dataset, or of a data paper)
• user (an agent that makes use of the dataset)
• programmer (an agent providing informatics/programming support related to the dataset)


3.2 Where? (edm:Place)

“A spatial location identified by the provider and named according to some vocabulary or local convention.” 8

Property | Note | Value type | Obligation and Occurrence
dcterms:hasPart | Identifier of a place that is part of the place being described. | reference (to a Place) | min 0, max unbounded
dcterms:isPartOf | Identifier of a place that the described place is part of. | reference (to a Place) | min 0, max unbounded
owl:sameAs | URI of a Place | reference (to a Place) | min 0, max unbounded
skos:altLabel, skos:hiddenLabel | Alternative forms of the name of the place. | literal | min 0, max unbounded
skos:note | Information relating to the place. | literal | min 0, max unbounded
skos:prefLabel | The preferred form of the name of the place. | literal | min 0, max 1 per lang tag
wgs84_pos:alt | The altitude of a spatial thing (decimal metres above the reference) | literal | min 0, max 1
wgs84_pos:lat | The latitude of a spatial thing (decimal degrees). | literal | min 0, max 1
wgs84_pos:lat_long | A comma separated representation of a latitude, longitude coordinate. | literal | min 0, max 1
wgs84_pos:long | The longitude of a spatial thing (decimal degrees). | literal | min 0, max 1

Table 2 Properties for edm:Place9
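The wgs84_pos:lat_long literal in Table 2 is simply the latitude and longitude in decimal degrees joined by a comma. A minimal Python sketch of building and reading such a literal follows. The function names are hypothetical, and the range checks (latitude between -90 and 90, longitude between -180 and 180) are the usual bounds for decimal degrees rather than something Table 2 prescribes.

```python
from typing import Tuple


def format_lat_long(lat: float, lon: float) -> str:
    """Build the comma separated 'latitude,longitude' literal."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    return f"{lat},{lon}"


def parse_lat_long(value: str) -> Tuple[float, float]:
    """Split a 'latitude,longitude' literal into two floats."""
    parts = value.split(",")
    if len(parts) != 2:
        raise ValueError(f"expected 'lat,long', got: {value!r}")
    lat, lon = float(parts[0]), float(parts[1])
    format_lat_long(lat, lon)  # reuse the range checks
    return lat, lon


if __name__ == "__main__":
    literal = format_lat_long(48.2, 16.37)
    print(literal)
    print(parse_lat_long(literal))
```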

Info sources mentioned by Europeana: some relevant datasets: Geonames, EConnect's gazetteer

Info sources mentioned by OpenUp!: GeoNames is also mentioned by C6.1.1 and C6.3.1.

GeoNames Country Subdivision / reverse geocoding
Access (URL): api.geonames.org/countrySubdivision?
Documentation/Test (URL): http://www.geonames.org/export/web-services.html#countrysubdiv
Languages: EN
Usage and Rights: http://www.geonames.org/export/
Comment: The ISO country code and the administrative subdivision of any given point.

GeoNames Find nearby populated place / reverse geocoding
Access (URL): api.geonames.org/findNearbyPlaceName?
Documentation/Test (URL): http://www.geonames.org/export/web-services.html#findNearby
Languages: EN
Usage and Rights: http://www.geonames.org/export/
Comment: Returns the closest populated place for the lat/lng query as an XML document. The unit of the distance element is 'km'.
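To illustrate how the two GeoNames reverse geocoding services listed above might be addressed, the following Python sketch only assembles the request URLs and sends nothing. The query parameters lat, lng and username follow the GeoNames web service documentation linked above; the username must be a registered GeoNames account, and "demo" is a placeholder. The function name geonames_url is hypothetical.

```python
from urllib.parse import urlencode

GEONAMES_BASE = "http://api.geonames.org"


def geonames_url(service: str, lat: float, lng: float, username: str) -> str:
    """Build a request URL for a GeoNames reverse geocoding service.

    `service` is the endpoint name, e.g. 'countrySubdivision' or
    'findNearbyPlaceName'. No network request is made here.
    """
    query = urlencode({"lat": lat, "lng": lng, "username": username})
    return f"{GEONAMES_BASE}/{service}?{query}"


if __name__ == "__main__":
    # Coordinates and account name are placeholders for illustration.
    print(geonames_url("countrySubdivision", 47.3, 9.0, "demo"))
    print(geonames_url("findNearbyPlaceName", 47.3, 9.0, "demo"))
```

Separating URL construction from the actual HTTP call makes it easier to cache or log requests in an enrichment workflow.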

8 Europeana Data Model Mapping Guidelines. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.
9 Properties marked in bold will be used for the first implementation of EDM. Properties with grey background will not be implemented.


ISO 3166-1 Alpha2 Country Codes
Access (URL): http://rs.gbif.org/vocabulary/iso/3166-1_alpha2.xml
Documentation/Test (URL): http://www.iso.org/iso/country_codes/iso_3166_code_lists/country_names_and_code_elements.htm
Languages: EN, FR
Usage and Rights: ISO provides the alpha-2 country codes for free. The full standard containing the alpha-2, alpha-3 and numeric-3 codes as well as details of the administrative language can be purchased.
Comment: ISO 3166-1 alpha-2 codes are two-letter country codes defined in ISO 3166-1, part of the ISO 3166 standard published by the International Organization for Standardization (ISO), to represent countries, dependent territories, and special areas of geographical interest.

3.3 When? (edm:TimeSpan)

“A period of time having a beginning, an end and a duration.” 10

Property | Note | Value type | Obligation and Occurrence
crm:P79F.beginning_is_qualified_by | Qualifying information about the start of the timespan such as degree of certainty, precision, source etc. | literal | min 0, max unbounded
crm:P80F.end_is_qualified_by | Qualifying information about the end of the timespan such as degree of certainty, precision, source etc. | literal | min 0, max unbounded
dcterms:hasPart | The identifier of a timespan which is part of the described timespan. | reference (to a Time Span) | min 0, max unbounded
dcterms:isPartOf | The identifier of a timespan of which the described timespan is a part. | reference (to a Time Span) | min 0, max unbounded
edm:begin | The date the timespan started. | literal | min 0, max 1
edm:end | The date the timespan finished. | literal | min 0, max 1
owl:sameAs | The URI of a timespan | reference (to a Time Span) | min 0, max unbounded
skos:altLabel, skos:hiddenLabel | Alternative forms of the name of the timespan or period. | literal | min 0, max unbounded
skos:note | Information relating to the timespan or period. | literal | min 0, max unbounded
skos:prefLabel | The preferred form of the name of the timespan or period. | literal | min 0, max 1 per lang tag

Table 3 Properties for edm:TimeSpan11

Info sources mentioned by Europeana: some relevant datasets: Borys' time periods, OWL time ontology (probably way too complex and not with the right focus), CIDOC-CRM. CARARE may have something.

Info sources mentioned by OpenUp!:

GeoTime:ChronoStrat
Institution: GBIF
Access (URL): http://vocabularies.gbif.org/services/gbif/geo_chronostrat
Documentation/Test (URL): http://vocabularies.gbif.org/vocabularies/geo_chronostrat

10 Europeana Data Model Mapping Guidelines. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.
11 Properties marked in bold will be used for the first implementation of EDM. Properties with grey background will not be implemented.


Languages: EN
Usage and Rights: http://data.gbif.org/tutorial/datauseagreement
Comment: Standard stratigraphic nomenclature based on palaeontological intervals of time defined by recognised fossil assemblages.

3.4 What? (skos:Concept)

“A unit of thought or meaning that comes from an organised knowledge base (such as subject terms from a thesaurus or controlled vocabulary) where URIs or local identifiers have been created to represent each concept. In the cultural heritage world there are many such controlled vocabularies such as the Library of Congress Subject Headings or AAT.” 12

For OpenUp! purposes it is planned to use this class for the integration of common/vernacular names vocabularies, or type vocabularies.

Property | Note | Value type | Obligation and Occurrence
skos:altLabel, skos:hiddenLabel | Alternative forms of the name of the concept. | literal | min 0, max unbounded
skos:broader, skos:narrower, skos:related | The identifier of a related concept in the same thesaurus or controlled vocabulary. | ref | min 0, max unbounded
skos:broadMatch, skos:narrowMatch, skos:relatedMatch | The identifier of broader, narrower or related matching concepts from other concept schemes. | ref | min 0, max unbounded
skos:exactMatch, skos:closeMatch | The identifier of close or exactly matching concepts from other concept schemes. | ref | min 0, max unbounded
skos:inScheme | The URI of a concept scheme | ref | min 0, max unbounded
skos:notation | The notation in which the concept is represented. This may not be words in natural language for some knowledge organisation systems e.g. algebra | string (+ rdf:datatype attribute) | min 0, max unbounded
skos:note | Information relating to the concept. | literal | min 0, max unbounded
skos:prefLabel | The preferred form of the name of the concept. | literal | min 0, max 1 per lang tag

Table 4 Properties for skos:Concept13
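For vernacular names the constraint "max 1 per lang tag" on skos:prefLabel matters in practice, because a taxon often has several common names in the same language. The following Python sketch shows one possible way of distributing names over skos:prefLabel and skos:altLabel. The rule that the first name seen per language becomes the preferred label is an assumption of this sketch and not an OpenUp! decision; the function name build_concept and the URI are hypothetical.

```python
from typing import Dict, List, Tuple

Name = Tuple[str, str]  # (vernacular name, language tag)


def build_concept(uri: str, names: List[Name]) -> Dict[str, object]:
    """Distribute names over skos:prefLabel (one per language) and skos:altLabel."""
    pref: Dict[str, str] = {}
    alt: List[Name] = []
    for text, lang in names:
        if lang not in pref:
            pref[lang] = text          # first name per language is preferred
        else:
            alt.append((text, lang))   # further names become alternatives
    return {"uri": uri, "skos:prefLabel": pref, "skos:altLabel": alt}


if __name__ == "__main__":
    # Placeholder URI; vernacular names of Bellis perennis as sample data.
    concept = build_concept(
        "http://example.org/concept/bellis-perennis",
        [("daisy", "en"), ("Gänseblümchen", "de"), ("common daisy", "en")],
    )
    print(concept["skos:prefLabel"])
    print(concept["skos:altLabel"])
```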

Info sources mentioned by OpenUp!:

Vernacular Names Botanical, Zoological
Institution: GBIF
Access (URL): http://vocabularies.gbif.org/services/gbif/extension?identifier=127063
Documentation/Test (URL): http://rs.gbif.org/extension/gbif/1.0/vernacularname.xml
Languages: EN
Usage and Rights: http://data.gbif.org/tutorial/datauseagreement
Comment: Extension to core taxa that lists vernacular names for a scientific taxon.

EOL Application Programming Interface (API)
Institution: EOL (Encyclopedia of Life)

12 Europeana Data Model Mapping Guidelines. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.
13 Properties marked in bold will be used for the first implementation of EDM. Properties with grey background will not be implemented.


Access (URL): http://eol.org/api/docs/pages and http://eol.org/api/docs/search
Documentation/Test (URL): http://eol.org/info/api_overview
Usage and Rights: http://eol.org/info/api_terms
Comment: A more detailed description of the EOL APIs can be found in Annex III of the present document.

Darwin Core Type Vocabulary
Institution: GBIF
Access (URL): http://rs.gbif.org/vocabulary/dwc/basis_of_record.xml
Documentation/Test (URL): http://rs.gbif.org/vocabulary/dwc/basis_of_record.xml
Languages: EN
Usage and Rights: http://data.gbif.org/tutorial/datauseagreement
Comment: A recommended set of values to use for the basisOfRecord term to categorize Darwin Core resources. The Darwin Core Type Vocabulary extends and refines terms from the Dublin Core Type Vocabulary to describe and categorize resources more specifically for biodiversity applications.

3.5 edm:Event

“An event is a change “of states in cultural, social or physical systems, regardless of scale, brought about by a series or group of coherent physical, cultural, technological or legal phenomena” (E5 Event in CIDOC CRM) or a “set of coherent phenomena or cultural manifestations bounded in time and space” (E4 Period in CIDOC CRM).” 14

This class will not be implemented in the near future.

Property | Note | Value type | Obligation and Occurrence
dc:identifier | | string | min 0, max unbounded
dcterms:hasPart | | reference (to an Event) | min 0, max unbounded
dcterms:isPartOf | | reference (to an Event) | min 0, max unbounded
edm:happenedAt | | reference (to a Place) | min 0, max unbounded
edm:hasType | | literal or reference (to a Concept) | min 0, max unbounded
edm:isRelatedTo | | reference | min 0, max unbounded
edm:occurredAt | | reference (to a TimeSpan) | min 0, max unbounded
owl:sameAs | | reference (to an Event) | min 0, max unbounded
skos:altLabel, skos:hiddenLabel | | literal | min 0, max unbounded
skos:note | | literal | min 0, max unbounded
skos:prefLabel | | literal | min 0, max 1 per lang tag

Table 5 Properties for edm:Event15

14 Europeana Data Model Definition v5.2.3. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.
15 Properties marked in bold will be used for the first implementation of EDM. Properties with grey background will not be implemented.


3.6 edm:PhysicalThing

“A persistent physical item such as a painting, a building, a book or a stone. Persons are not items. This class represents cultural heritage objects known to Europeana to be physical things (such as Mona Lisa) as well as all physical things Europeana refers to in the descriptions of cultural heritage objects (such as the Rosetta Stone). This class allows to capture the distinction between a physical object and a digital representation of that object. This class is the domain of edm:realizes.” 16

This class will not be implemented in the near future. The potential properties are the same as for the CHO: the focus here is on Cultural Physical Objects. Only edm:unstored and edm:type do not apply.

Property | Note | Value type | Obligation and Occurrence
dc:contributor | | literal or reference | min 0, max unbounded
dc:coverage | | literal or reference | min 0, max unbounded
dc:creator | | literal or reference | min 0, max unbounded
dc:date | | literal or reference | min 0, max unbounded
dc:description | | literal or reference | min 0, max unbounded
dc:format | | literal or reference | min 0, max unbounded
dc:identifier | | literal | min 0, max unbounded
dc:language | | literal | min 0, max unbounded
dc:publisher | | literal or reference | min 0, max unbounded
dc:relation | | literal or reference | min 0, max unbounded
dc:rights | | literal or reference | min 0, max unbounded
dc:source | | literal or reference | min 0, max unbounded
dc:subject | | literal or reference | min 0, max unbounded
dc:title | | literal | min 0, max unbounded
dc:type | | literal or reference | min 0, max unbounded
dcterms:alternative | | literal | min 0, max unbounded
dcterms:conformsTo | | literal or reference | min 0, max unbounded
dcterms:created | | literal or reference | min 0, max unbounded
dcterms:extent | | literal or reference | min 0, max unbounded
dcterms:hasFormat | | literal or reference | min 0, max unbounded
dcterms:hasPart | | literal or reference | min 0, max unbounded
dcterms:hasVersion | | literal or reference | min 0, max unbounded
dcterms:isFormatOf | | literal or reference | min 0, max unbounded
dcterms:isPartOf | | literal or reference | min 0, max unbounded
dcterms:isReferencedBy | | literal or reference | min 0, max unbounded
dcterms:isReplacedBy | | literal or reference | min 0, max unbounded
dcterms:isRequiredBy | | literal or reference | min 0, max unbounded
dcterms:issued | | literal or reference | min 0, max unbounded
dcterms:isVersionOf | | literal or reference | min 0, max unbounded
dcterms:medium | | literal or reference | min 0, max unbounded
dcterms:provenance | | literal or reference | min 0, max unbounded
dcterms:references | | literal or reference | min 0, max unbounded

16 Europeana Data Model Definition v5.2.3. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.


Property | Note | Value type | Obligation and Occurrence
dcterms:replaces | | literal or reference | min 0, max unbounded
dcterms:requires | | literal or reference | min 0, max unbounded
dcterms:spatial | | literal or reference | min 0, max unbounded
dcterms:tableOfContents | | literal or reference | min 0, max unbounded
dcterms:temporal | | literal or reference | min 0, max unbounded
edm:currentLocation | | reference | min 0, max 1
edm:hasMet | | reference | min 0, max unbounded
edm:hasType | | reference or literal | min 0, max unbounded
edm:incorporates | | reference | min 0, max unbounded
edm:isDerivativeOf | | reference | min 0, max unbounded
edm:isNextInSequence | | reference | min 0, max 1
edm:isRelatedTo | | reference or literal | min 0, max unbounded
edm:isRepresentationOf | | reference | min 0, max 1
edm:isSimilarTo | | reference | min 0, max unbounded
edm:isSuccessorOf | | reference | min 0, max unbounded
edm:realizes | | reference | min 0, max unbounded
edm:wasPresentAt | | reference | min 0, max unbounded

Table 6 Properties for edm:PhysicalThing17

4 THE EDM CORE CLASSES AND ABCD(EFG)

All data provided for an OpenUp! Europeana harvest is first transformed from the native databases into the ABCD standard and provided via the BioCASE18 provider software. The ABCD data is then harvested from the BioCASE providers with the GBIF HIT-Tool. Afterwards the data has to be transformed into Europeana-valid metadata. For the first OpenUp! ingestions of data to Europeana the transformation was based on the Europeana Semantic Elements19. This baseline transformation will be developed further and will in future allow the transformation of ABCD data into EDM data.

Chapter 4 depicts a first mapping of the ABCDv2.06 standard20 and the ABCD extension EFG21 to the core classes of the Europeana Data Model. The mapping in this document focuses on the selection of additional ABCD fields relevant for the Europeana Data Model and of fields that qualify for vocabulary enrichment processes. The standard ABCD(EFG) to ESE mapping was developed within task T3.1 and can be found on the following web page: http://open-up.eu/content/transformation-and-standard-conformance-europeana

17 Properties marked in bold will be used for the first implementation of EDM. Properties with grey background will not be implemented.
18 BioCASE Provider Software. http://www.biocase.org/products/provider_software/ 19 Jul. 2012.
19 Europeana Semantic Elements (ESE) documentation http://pro.europeana.eu/web/guest/technical-requirements 19 Jul. 2012.
20 ABCD - Access to Biological Collection Data. http://wiki.tdwg.org/ABCD 19 Jul. 2012.
21 ABCDEFG - Access to Biological Collection Databases Extended for Geosciences. http://wiki.tdwg.org/twiki/bin/view/ABCD/DesignAbcdExtensions 23 Jul. 2012.


4.1 edm:ProvidedCHO

Definition: “The ProvidedCHO is the cultural heritage object which has given rise to and is the subject of the package of data that has been submitted to Europeana. Its properties are those of the original cultural heritage object with a few Europeana-specific ones added. [This means that they are the attributes of the original cultural heritage object (CHO) itself, not the digital representation of it.] In the model it is the class of resource that is the object of the edm:aggregatedCHO statement. There is an exact match between ProvidedCHOs and the items that can appear from a search.”22

In the OpenUp! and ABCD(EFG) context this means that each unit record with a unique identifier (Unit ID) within the data source constitutes one potential cultural heritage object (ProvidedCHO) for Europeana.
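The one-unit-per-ProvidedCHO rule can be sketched in Python as follows. This is an illustration only and not the OpenUp! transformation, which is built on other tooling. The sample XML is heavily abbreviated and hypothetical, the namespace URI for ABCD 2.06 is an assumption of this sketch, and the function name units_to_chos is made up. The element path used for the collector name is the one listed for dc:contributor in the mapping table of this section.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Assumed ABCD 2.06 namespace URI (not stated in this document).
ABCD_NS = {"abcd": "http://www.tdwg.org/schemas/abcd/2.06"}

# Abbreviated, hypothetical ABCD fragment with two units.
SAMPLE = """<abcd:DataSets xmlns:abcd="http://www.tdwg.org/schemas/abcd/2.06">
  <abcd:DataSet>
    <abcd:Units>
      <abcd:Unit>
        <abcd:UnitID>EX-0001</abcd:UnitID>
        <abcd:Gathering><abcd:Agents><abcd:GatheringAgent><abcd:Person>
          <abcd:FullName>Example Collector</abcd:FullName>
        </abcd:Person></abcd:GatheringAgent></abcd:Agents></abcd:Gathering>
      </abcd:Unit>
      <abcd:Unit>
        <abcd:UnitID>EX-0002</abcd:UnitID>
      </abcd:Unit>
    </abcd:Units>
  </abcd:DataSet>
</abcd:DataSets>"""

COLLECTOR_PATH = (
    "abcd:Gathering/abcd:Agents/abcd:GatheringAgent/abcd:Person/abcd:FullName"
)


def units_to_chos(abcd_xml: str) -> List[Dict[str, object]]:
    """Create one ProvidedCHO-like record per ABCD unit that has a Unit ID."""
    root = ET.fromstring(abcd_xml)
    chos = []
    for unit in root.findall("abcd:DataSet/abcd:Units/abcd:Unit", ABCD_NS):
        unit_id = unit.findtext("abcd:UnitID", default="", namespaces=ABCD_NS).strip()
        if not unit_id:
            continue  # no identifier, so no potential ProvidedCHO
        collectors = [
            el.text.strip()
            for el in unit.findall(COLLECTOR_PATH, ABCD_NS)
            if el.text
        ]
        chos.append({"dc:identifier": unit_id, "dc:contributor": collectors})
    return chos


if __name__ == "__main__":
    for cho in units_to_chos(SAMPLE):
        print(cho)
```

The collector names extracted this way are the literal values that an enrichment step could later try to match against an authority source such as VIAF.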

22 Europeana Data Model Mapping Guidelines. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.


Provided CHO23 (EDM property with note, value type and occurrence; ABCD2.06 mapping with enrichment source and role)

dc:contributor
Note: For contributors to the CHO. If possible supply the identifier of the contributor from an authority source.
Value type: literal or reference
Occurrence: min 0, max unbounded
ABCD2.06 | Enrichment | Role
/DataSets/DataSet/Units/Unit/Gathering/Agents/GatheringAgent/Person/FullName | VIAF | collector
/DataSets/DataSet/Units/Unit/Gathering/Agents/GatheringAgentsText | VIAF | collector
/DataSets/DataSet/Units/Unit/Gathering/Agents/GatheringAgent/AgentText | VIAF | collector
/DataSets/DataSet/Units/Unit/Identifications/Identification/Identifiers/Identifier/PersonName/FullName | VIAF | identifier
/DataSets/DataSet/Units/Unit/Identifications/Identification/Identifiers/IdentifiersText | VIAF | identifier

dc:date
Note: Use for a significant date in the life of the CHO. Consider the sub-properties of dcterms:created or dcterms:issued.
Value type: literal or reference
Occurrence: min 0, max unbounded
ABCD2.06: same as standard ABCD-ESE mapping

dc:description
Note: A description of the CHO. Either dc:description or dc:title must be provided.
Value type: literal or reference
Occurrence: min 0, max unbounded
ABCD2.06: same as standard ABCD-ESE mapping

dc:identifier
Note: An identifier of the original CHO.
Value type: literal
Occurrence: min 0, max unbounded
ABCD2.06: same as standard ABCD-ESE mapping

23 Properties marked in bold will be used for the first implementation of EDM. Properties with grey background will not be implemented.

dc:relation
Note: The name or identifier of a related resource, generally used for other related CHOs. Cf edm:isRelatedTo
Value type: literal or reference
Occurrence: min 0, max unbounded
ABCD2.06:
/DataSets/DataSet/Units/Unit/Associations/UnitAssociation/AssociatedUnitID
/DataSets/DataSet/Units/Unit/Associations/UnitAssociation/AssociatedUnitSourceInstitutionCode
/DataSets/DataSet/Units/Unit/Associations/UnitAssociation/AssociatedUnitSourceName
(together)
Append:
/DataSets/DataSet/Units/Unit/Associations/UnitAssociation/AssociationType
/DataSets/DataSet/Units/Unit/Associations/UnitAssociation/Comment
/DataSets/DataSet/Units/Unit/Assemblage/UnitAssemblage/AssemblageID
/DataSets/DataSet/Units/Unit/Assemblage/UnitAssemblage/AssemblageName
(together)

dc:rights
Note: Use to give the name of the rights holder of the CHO if possible or for more general rights information. Note the difference between this property and the use of the controlled edm:rights property which relates to the digital objects (see WebResource and Aggregation tables).
Value type: literal or reference
Occurrence: min 0, max unbounded
ABCD2.06:
/DataSets/DataSet/Units/Unit/IPRStatements/IPRDeclarations/IPRDeclaration/Text
/DataSets/DataSet/Units/Unit/IPRStatements/IPRDeclarations/IPRDeclaration/Details
/DataSets/DataSet/Units/Unit/IPRStatements/IPRDeclarations/IPRDeclaration/URI
/DataSets/DataSet/Units/Unit/IPRStatements/Copyrights/Copyright/Text
/DataSets/DataSet/Units/Unit/IPRStatements/Copyrights/Copyright/Details
/DataSets/DataSet/Units/Unit/IPRStatements/Copyrights/Copyright/URI

AIT, 2012 C3.2.1 version 2a p. 14

Provided CHO23 EDM ABCD2.06 Enrichment role

/DataSets/DataSet/Units/Unit/IPRStatements/Licenses/License/Text /DataSets/DataSet/Units/Unit/IPRStatements/Licenses/License/Details /DataSets/DataSet/Units/Unit/IPRStatements/Licenses/License/URI /DataSets/DataSet/Units/Unit/IPRStatements/TermsOfUseStatements/Term sOfUse/Text /DataSets/DataSet/Units/Unit/IPRStatements/TermsOfUseStatements/Term sOfUse/Details /DataSets/DataSet/Units/Unit/IPRStatements/TermsOfUseStatements/Term sOfUse/URI /DataSets/DataSet/Units/Unit/IPRStatements/Disclaimers/Disclaimer/Text /DataSets/DataSet/Units/Unit/IPRStatements/Disclaimers/Disclaimer/Details /DataSets/DataSet/Units/Unit/IPRStatements/Disclaimers/Disclaimer/URI /DataSets/DataSet/Units/Unit/IPRStatements/Acknowledgements/Acknowled gement/Text /DataSets/DataSet/Units/Unit/IPRStatements/Acknowledgements/Acknowled gement/Details /DataSets/DataSet/Units/Unit/IPRStatements/Acknowledgements/Acknowled gement/URI /DataSets/DataSet/Units/Unit/IPRStatements/Citations/Citation/Text /DataSets/DataSet/Units/Unit/IPRStatements/Citations/Citation/Details /DataSets/DataSet/Units/Unit/IPRStatements/Citations/Citation/URI dc:source The source of the literal or reference min 0, max same as standard ABCD-ESE mapping original CHO. This unbounded property should no longer be used for the name of the content holder: for this, see edm:dataProvider in

AIT, 2012 C3.2.1 version 2a p. 15

Provided CHO23 EDM ABCD2.06 Enrichment role

the ore:Aggregation table below.

dc:subject
  EDM: The subject of the CHO. One of dc:subject or dc:coverage or dc:type or dcterms:spatial must be provided
  Value type: literal or reference
  Cardinality: min 0, max unbounded
  ABCD2.06:
    Integration of the common names (Enrichment: OpenUp! Common name services)
    Integration of standard higher taxa from the botanical name services (Enrichment: OpenUp! Taxa name services; role: Botanical)
    Integration of standard higher taxa from the zoological name services (Enrichment: OpenUp! Taxa name services; role: Zoological)
    Integration of the vernacular names (Enrichment: GBIF Vernacular Names; role: Botanical, Zoological)

dc:title
  EDM: The title of the CHO. Either dc:title or dc:description must be provided.
  Value type: literal
  Cardinality: min 0, max unbounded
  ABCD2.06: same as standard ABCD-ESE mapping

dc:type
  EDM: The nature or genre of the CHO. Ideally the term(s) will be taken from a controlled vocabulary. One of dc:type or dc:subject or dc:coverage or dcterms:spatial must be provided
  Value type: literal or reference
  Cardinality: min 0, max unbounded
  ABCD2.06: same as standard ABCD-ESE mapping

dcterms:hasPart
  EDM: A resource that is included either physically or logically in the CHO.
  Value type: literal or reference
  Cardinality: min 0, max unbounded
  ABCD2.06:
    DataSets/DataSet/Units/Unit/Sequences/Sequence/SequencedPart

dcterms:isReferencedBy
  EDM: Another resource that references, cites or otherwise points to the CHO.
  Value type: literal or reference
  Cardinality: min 0, max unbounded
  ABCD2.06: same as standard ABCD-ESE mapping

dcterms:medium
  EDM: The material or physical carrier of the CHO.
  Value type: literal or reference
  Cardinality: min 0, max unbounded
  ABCD2.06: same as standard ABCD-ESE mapping

dcterms:provenance
  EDM: A statement of changes in ownership and custody of the CHO since its creation. Significant for authenticity, integrity and interpretation.
  Value type: literal or reference
  Cardinality: min 0, max unbounded
  ABCD2.06: same as standard ABCD-ESE mapping

dcterms:references
  EDM: Other resources referenced, cited or otherwise pointed to by the CHO.
  Value type: literal or reference
  Cardinality: min 0, max unbounded
  ABCD2.06:
    DataSets/DataSet/Units/Unit/Identifications/Identification/Identifiers/IdentificationSource/TitleCitation
    DataSets/DataSet/Units/Unit/Identifications/Identification/Identifiers/IdentificationSource/CitationDetail
    DataSets/DataSet/Units/Unit/Identifications/Identification/Identifiers/IdentificationSource/URI

dcterms:spatial
  EDM: Spatial characteristics of the CHO. i.e. what the CHO represents or depicts in terms of space (e.g. a location, co-ordinate or place). Either dcterms:spatial or dc:type or dc:subject or dc:coverage must be provided
  Value type: literal or reference
  Cardinality: min 0, max unbounded
  ABCD2.06:
    same as standard ABCD-ESE mapping (Enrichment: GeoNames Country Subdivision / reverse geocoding; GeoNames Find nearby populated place / reverse geocoding)
    /DataSets/DataSet/Units/Unit/Gathering/SiteCoordinateSets/SiteCoordinates/CoordinatesLatLong/LatitudeDecimal (Enrichment: GeoNames Country Subdivision / reverse geocoding; GeoNames Find nearby populated place / reverse geocoding; role: Gathering)
    /DataSets/DataSet/Units/Unit/Gathering/SiteCoordinateSets/SiteCoordinates/CoordinatesLatLong/LongitudeDecimal (Enrichment: GeoNames Country Subdivision / reverse geocoding; GeoNames Find nearby populated place / reverse geocoding; role: Gathering)
    /DataSets/DataSet/Units/Unit/Gathering/Country/ISO3166Code (Enrichment: ISO 3166-1 Alpha2 Country Codes)

dcterms:temporal
  EDM: Temporal characteristics of the CHO. i.e. what the CHO is about or depicts in terms of time (e.g. a period, date or date range.)
  Value type: literal or reference
  Cardinality: min 0, max unbounded
  ABCD2.06:
    same as standard ABCD-ESE mapping (Enrichment: http://vocabularies.gbif.org/vocabularies/geo_chronostrat)
    DataSets/DataSet/Units/Unit/PalaeontologicalUnit/TimeRange (Enrichment: http://vocabularies.gbif.org/vocabularies/geo_chronostrat)

edm:currentLocation
  EDM: The geographic location whose boundaries presently include the CHO. If the name of a repository, building, site, or other entity is used then it should include an indication of its geographic location.
  Value type: reference
  Cardinality: min 0, max 1
  ABCD2.06:
    /DataSets/DataSet/Units/Unit/Owner/Organisation/Name/Representation/Text (Enrichment: ?)

edm:hasMet
  EDM: The identifier of an agent, a place, a time period or any other identifiable entity that the CHO may have "met" in its life.
  Value type: reference
  Cardinality: min 0, max unbounded
  Note: put in here the identifiers of the various thesaurus terms connected to the CHO

edm:hasType
  EDM: The identifier of a concept, or a word or phrase from a controlled vocabulary (thesaurus etc) giving the type of the CHO. E.g. Painting from the AAT thesaurus. This property can be seen as a super-property of e.g. dc:format or dc:type to support "What" questions.
  Value type: reference or literal
  Cardinality: min 0, max unbounded
  ABCD2.06:
    /DataSets/DataSet/Units/Unit/RecordBasis (Enrichment: Darwin Core Type Vocabulary - identifier)

edm:isRelatedTo
  EDM: The identifier or name of a concept or other resource to which the described CHO is related. E.g. Moby Dick is related to XIX Century literature. Cf dc:relation.
  Value type: reference or literal
  Cardinality: min 0, max unbounded
  ABCD2.06:
    /DataSets/DataSet/Units/Unit/Associations/UnitAssociation/AssociatedUnitID
    /DataSets/DataSet/Units/Unit/Associations/UnitAssociation/AssociatedUnitSourceInstitutionCode
    /DataSets/DataSet/Units/Unit/Associations/UnitAssociation/AssociatedUnitSourceName
    (together)
    /DataSets/DataSet/Units/Unit/Assemblage/UnitAssemblage/AssemblageID
    /DataSets/DataSet/Units/Unit/Assemblage/UnitAssemblage/AssemblageName
    (together)

edm:realizes
  EDM: If the CHO described is of type edm:PhysicalThing it may realize an information object. E.g. a copy of the Gutenberg publication realizes the Bible.
  Value type: reference
  Cardinality: min 0, max unbounded
  ABCD2.06:
    Integration of the common names (Enrichment: OpenUp! Common name services)
    Integration of the vernacular names (Enrichment: GBIF Vernacular Names; role: Botanical, Zoological)

edm:type
  EDM: The provided object is one of the types accepted by Europeana and will govern which facet it appears under in the portal - TEXT, VIDEO, SOUND, IMAGE, 3D. (For 3D see also dc:format)
  Value type: literal (TEXT-VIDEO-SOUND-IMAGE-3D)
  Cardinality: min 1, max 1
  ABCD2.06: same as standard ABCD-ESE mapping

edm:wasPresentAt
  EDM: The identifier of an event at which the described object was present. E.g. the Stone of Scone was present at the coronation of King James I of England.
  Value type: reference
  Cardinality: min 0, max unbounded
  ABCD2.06:
    DataSets/DataSet/Units/Unit/Gathering/Project/ProjectTitle

owl:sameAs
  EDM: Use to point to your own (linked data) representation of the object, if you have already minted a URI identifier for it. It is also possible to provide URIs minted by third-parties for the object.
  Value type: reference
  Cardinality: min 0, max unbounded

rdf:type
  EDM: Use to indicate if this resource is of a given "real-world object" type - it could be edm:PhysicalThing or a more specific class.
  Value type: reference
  Cardinality: min 0, max unbounded

Table 7 edm:ProvidedCHO and ABCD(EFG)


4.2 edm:Aggregation

Definition: “These are the properties that can be used for the class of ore:Aggregation. This means that they are attributes that apply to the whole set of related resources about one particular provided cultural heritage object. For each ore:Aggregation a set of these properties should be provided.”24

24 Europeana Data Model Mapping Guidelines. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.

Aggregation | EDM | ABCD2.06

ore:aggregates
  EDM: This property exists in principle only as it is stated through edm:aggregatedCHO and edm:hasView statements.
  Value type: ref
  Cardinality: min 0, max unbounded

edm:aggregatedCHO
  EDM: The identifier of the source object e.g. the Mona Lisa itself. This could be a full linked open data URI or an internal identifier.
  Value type: ref
  Cardinality: min 1, max 1
  ABCD2.06:
    /DataSets/DataSet/Units/Unit/SourceInstitutionID
    /DataSets/DataSet/Units/Unit/SourceID
    /DataSets/DataSet/Units/Unit/UnitID
    (together)

edm:dataProvider
  EDM: The name or identifier of the data provider of the object (i.e. the organisation providing data to an aggregator). Identifiers will not be available until Europeana has implemented its Organisation profile.
  Value type: literal or ref
  Cardinality: min 1, max 1
  ABCD2.06: same as standard ABCD-ESE mapping

edm:hasView
  EDM: The URL of a web resource which is a digital representation of the CHO. This may be the source object itself in the case of a born digital cultural heritage object. edm:hasView should only be used where there are several views of the CHO and one (or both) of the mandatory edm:isShownAt or edm:isShownBy properties have already been used. It is for cases where one CHO has several views of the same object. (e.g. a shoe and a detail of the label of the shoe)
  Value type: ref
  Cardinality: min 0, max unbounded
  ABCD2.06:
    enter here the multiple "DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/FileURI" appearing after the first one (used for isShownBy)
    enter here the multiple "DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/ProductURI" appearing after the first one (used for isShownAt)

edm:isShownBy
  EDM: The URL of a web view of the object. Either edm:isShownAt or edm:isShownBy is mandatory. For the rights that will apply to previews please see edm:rights below.
  Value type: ref
  Cardinality: min 0, max 1
  ABCD2.06: same as standard ABCD-ESE mapping

edm:isShownAt
  EDM: The URL of a web view of the object in full information context. Either edm:isShownAt or edm:isShownBy is mandatory. For the rights that will apply to previews please see edm:rights below.
  Value type: ref
  Cardinality: min 0, max 1
  ABCD2.06: same as standard ABCD-ESE mapping

edm:object
  EDM: The URL of a representation of the CHO which will be used for generating previews for use in the Europeana portal. This may be the same URL as edm:isShownBy. See Europeana Portal Image Guidelines (http://pro.europeana.eu/technical-requirements) for information regarding the specifications of previews.
  Value type: ref
  Cardinality: min 0, max 1
  ABCD2.06: same as standard ABCD-ESE mapping

edm:provider
  EDM: The name or identifier of the provider of the object (i.e. the organisation providing data directly to Europeana). Identifiers will not be available until Europeana has implemented its Organisation profile.
  Value type: literal or ref
  Cardinality: min 1, max 1
  ABCD2.06: same as standard ABCD-ESE mapping

edm:rights
  EDM: This is a mandatory property and the value given here should be the rights statement that applies to the digital representation at the URL given in edm:object or edm:isShownAt/By. The value should be taken from one of those listed in the Europeana Rights Guidelines (http://pro.europeana.eu/technical-requirements). The rights statement given in this property will also apply to the previews used in the portal and will be the source of:
    * the entry in the Rights facet in the portal
    * the license badge that appears under the preview on the result page
  Where there are several web resources attached to one edm:ProvidedCHO the rights statement given here will be regarded as the "reference" value for all the web resources so a suitable value should be chosen if the rights statements vary between different resources. In future implementations it is hoped to handle rights statements for separate web resources associated with one CHO separately.
  Value type: ref
  Cardinality: min 1, max 1
  ABCD2.06: same as standard ABCD-ESE mapping

Table 8 edm:Aggregation and ABCD(EFG)
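
The edm:hasView rule in Table 8 (the first FileURI goes to edm:isShownBy, the first ProductURI to edm:isShownAt, all further ones to edm:hasView) can be illustrated with a minimal Python sketch. The function names and the dictionary layout are hypothetical and chosen only for illustration; the actual mapping is implemented in the PDI transformation.

```python
from typing import Dict, List, Optional, Tuple


def split_views(uris: List[str]) -> Tuple[Optional[str], List[str]]:
    """Return the first non-empty URI and the list of all remaining ones."""
    cleaned = [u.strip() for u in uris if u and u.strip()]
    if not cleaned:
        return None, []
    return cleaned[0], cleaned[1:]


def map_aggregation_views(file_uris: List[str],
                          product_uris: List[str]) -> Dict[str, object]:
    """Assumed illustration of the Table 8 rule for multimedia URIs.

    file_uris    -> values of .../MultiMediaObject/FileURI
    product_uris -> values of .../MultiMediaObject/ProductURI
    """
    shown_by, extra_files = split_views(file_uris)
    shown_at, extra_products = split_views(product_uris)
    return {
        "edm:isShownBy": shown_by,          # first FileURI (max 1)
        "edm:isShownAt": shown_at,          # first ProductURI (max 1)
        "edm:hasView": extra_files + extra_products,  # everything after the first
    }
```

A record with two image files and one product page would thus yield one edm:isShownBy, one edm:isShownAt and a single edm:hasView entry.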

4.3 edm:WebResource

Definition: “These are the properties that can be used for the class of edm:WebResource. This means that they are attributes of the digital representation of the provided cultural heritage object, not the cultural heritage object itself. There may be more than one edm:WebResource for each edm:ProvidedCHO and they will be associated via the ore:Aggregation using edm:isShownBy, edm:isShownAt, edm:hasView or edm:object. Each web resource provided should have its own set of properties.”25

25 Europeana Data Model Mapping Guidelines. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.

Web Resource | EDM | ABCD2.06

dc:description
  EDM: Use for an account or description of this digital representation
  Value type: literal or ref
  Cardinality: min 0, max unbounded
  ABCD2.06:
    DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/Context
    DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/Comment
    DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/CaptureEquipment

dc:format
  EDM: Use for the format of this digital representation. (Use the value "3D-PDF" if appropriate to trigger portal icon display.)
  Value type: literal or ref
  Cardinality: min 0, max unbounded
  ABCD2.06:
    DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/Format
    DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/ImageResolution

dc:rights
  EDM: Use for the name of the rights holder of this digital representation if possible or for more general rights information. Note the difference between this property and the use of the mandatory, controlled edm:rights property below.
  Value type: literal or ref
  Cardinality: min 0, max unbounded
  ABCD2.06:
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/IPRDeclarations/IPRDeclaration/Text
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/IPRDeclarations/IPRDeclaration/Details
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/IPRDeclarations/IPRDeclaration/URI
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/Copyrights/Copyright/Text
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/Copyrights/Copyright/Details
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/Copyrights/Copyright/URI
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/Licenses/License/Text
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/Licenses/License/Details
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/TermsOfUseStatements/TermsOfUse/Text
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/TermsOfUseStatements/TermsOfUse/Details
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/TermsOfUseStatements/TermsOfUse/URI
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/Disclaimers/Disclaimer/Text
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/Disclaimers/Disclaimer/Details
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/Disclaimers/Disclaimer/URI
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/Acknowledgements/Acknowledgement/Text
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/Acknowledgements/Acknowledgement/Details
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/Acknowledgements/Acknowledgement/URI
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/Citations/Citation/Text
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/Citations/Citation/Details
    /DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/Citations/Citation/URI

dcterms:extent
  EDM: The size or duration of the web resource.
  Value type: literal or ref
  Cardinality: min 0, max unbounded
  ABCD2.06:
    DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/ImageSize
    DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/ImageSize/Width
    DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/ImageSize/Height

dcterms:created
  EDM: Date of creation of the web resource.
  ABCD2.06:
    DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/CreatedDate

edm:rights
  EDM: The value in this element will indicate the usage and access rights that apply to this digital representation. For the first implementation of EDM only the rights associated with the ore:Aggregation class will be implemented. Later implementations will be able to use the edm:rights associated with individual WebResources so it is strongly recommended that a value is supplied for this property for each instance of a WebResource. See also edm:rights in the ore:Aggregation table. The value is a URI in a controlled form. See the Rights guidelines at http://pro.europeana.eu/technical-requirements
  Value type: ref
  Cardinality: min 0, max 1
  ABCD2.06: same as standard ABCD-ESE mapping

Table 9 edm:WebResource and ABCD(EFG)

5 EDM DATA ENRICHMENT WITH ADDITIONAL DATA SOURCES

The integration of additional data sources into an EDM object depends mainly on the capabilities offered by the data source. As an example, the integration of bibliographic data related to an OpenUp! data unit is outlined here. The Biodiversity Heritage Library (BHL) offers “Developer Tools and API” (http://biodivlib.wikispaces.com/Developer+Tools+and+API), which provide the Web service “Bibliography by URL”.

------ snippet ------
Bibliography by URL
To easily link into a list of all pages containing a given scientific name, use the following URL:
http://www.biodiversitylibrary.org/name/Scientific_name
Where Scientific_name is any uninomial, binomial, or trinomial. Replace spaces with the underscore ( _ ) character.
Examples:
http://www.biodiversitylibrary.org/name/Orchidaceae (Orchid family)
http://www.biodiversitylibrary.org/name/Carcharodon_carcharias (Great white shark)
http://www.biodiversitylibrary.org/name/Phalacrocorax_carbo_maroccanus (Great Cormorant)
------ snippet ------
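
The URL rule quoted from BHL (append the scientific name, with spaces replaced by underscores) can be sketched in a few lines of Python. The function name is hypothetical; the base URL is the one given in the BHL snippet and may have changed since this report was written.

```python
from urllib.parse import quote

# Base URL as quoted in the BHL "Bibliography by URL" description.
BHL_NAME_BASE = "http://www.biodiversitylibrary.org/name/"


def bhl_bibliography_url(scientific_name: str) -> str:
    """Build the BHL bibliography URL for a uninomial, binomial or trinomial."""
    parts = scientific_name.split()
    if not parts:
        raise ValueError("scientific name must not be empty")
    # Spaces are replaced by underscores; other unsafe characters are escaped.
    return BHL_NAME_BASE + quote("_".join(parts))


print(bhl_bibliography_url("Carcharodon carcharias"))
```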


The bibliography for “carassius gibelio” can be accessed via the URL http://www.biodiversitylibrary.org/name/carassius%20gibelio and delivers the following output:

Figure 4 Bibliography for “carassius gibelio” derived from BHL Web service “Bibliography by URL”

This Web service will be further investigated for its use within task T3.5 of the OpenUp! project.


Another useful example is provided by the Encyclopedia of Life (EOL). EOL offers Web services (http://eol.org/info/api_overview) which can be used to add information to an OpenUp! data unit. The “EOL API: Search” provides URL access via a scientific name, e.g.:
http://eol.org/api/search/1.0/carassius%20gibelio
delivers:
… Carassius gibelio 215509 Carassius gibelio (Bloch, 1782); Carassius auratus gibelio; Carassius bucephalus; Carassius ellipticus; Cyprinus gibelio; Carassius gibelio; Cyprinus amarus ...
The id “215509” can then be used as input parameter for the “EOL API: Pages”:
http://eol.org/api/pages/1.0/215509?common_names=1&details=1&images=2&subjects=all&text=2
In addition to the XML response of this Web service, a more readable presentation is available at:
http://eol.org/pages/215509/overview
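
The two-step EOL lookup (search by name, then fetch the page by id) amounts to building two URLs. The following Python sketch only constructs them and performs no network request; the function names are hypothetical, and the URL patterns are taken from the examples above, so they reflect the EOL API as it was documented in 2012.

```python
from urllib.parse import quote, urlencode

# Base path as used in the example URLs of this chapter.
EOL_API_BASE = "http://eol.org/api"


def eol_search_url(scientific_name: str) -> str:
    """URL for the 'EOL API: Search' call for a scientific name."""
    return f"{EOL_API_BASE}/search/1.0/{quote(scientific_name)}"


def eol_pages_url(page_id: int, **options: object) -> str:
    """URL for the 'EOL API: Pages' call; options become query parameters."""
    url = f"{EOL_API_BASE}/pages/1.0/{page_id}"
    query = urlencode(options)
    return f"{url}?{query}" if query else url


print(eol_search_url("carassius gibelio"))
print(eol_pages_url(215509, common_names=1, details=1, images=2,
                    subjects="all", text=2))
```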

Figure 5 Information on “carassius gibelio” derived from EOL Web service “EOL API: Search”


Further external Web services will be evaluated and tested for their suitability for EDM metadata enrichment in the future, among them:
http://www.sp2000.org/index.php?option=com_content&task=view&id=40&Itemid=49
http://data.gbif.org/tutorial/services
http://www.gbif.org/informatics/
http://www.gbif.org/informatics/standards-and-tools/using-data/web-services/
http://www.itis.gov/web_service.html
http://www.itis.gov/ws_searchApiDescription.html#getItisTermsfmSciName
http://commons.wikimedia.org/wiki/Carassius_gibelio?uselang=de
https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Web-service
http://www.ubio.org/index.php?pagename=sample_applications

6 THE ONTOLOGY DATA GATEWAY

The Ontology Data Gateway serves as a broker between the PDI Transformation and the vocabulary services. It provides access to the controlled vocabularies, for instance the Common Names service provided by NHM Vienna, the Virtual International Authority File (VIAF), the vocabularies provided by the Global Biodiversity Information Facility, and so on. The gateway has three main features.

Figure 6 Integration of the Ontology Data Gateway in the OpenUp! process

1. First, it is a well-suited point at which to introduce a common technical interface. This abstraction keeps the transformation component free from knowledge about the access protocols of the different vocabularies used.

Figure 7 Components of the Ontology Data Gateway

2. Another problem when dealing with web-based services is access speed. When transforming millions of records, network latency quickly becomes an issue. Therefore the gateway, which usually runs on the same host as the transformation or at least in the same LAN, provides a cache of all requests sent to it. Since vocabulary terms rarely change, and neither do the vocabularies themselves (apart from the addition of new terms), this will speed up the periodically running transformations considerably.

3. The third issue the gateway tackles is the logging of vocabulary usage. The gateway keeps track of the associations between the data object and the vocabulary terms used in the transformed object. Since the transformation is a stateless procedure, this functionality is placed alongside the gateway.

A further description of these features can be found in the chapter "Helper Applications for Ontology".
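
The caching behaviour described in point 2 can be sketched as a thin wrapper around whatever function actually contacts the remote vocabulary service. This is a minimal in-memory illustration in Python; the class name and the `fetch` callable are hypothetical, and the real gateway persists its cache in the local store rather than in a dictionary.

```python
from typing import Callable, Dict


class CachingGateway:
    """Serve repeated vocabulary requests from a local cache (illustrative only)."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch               # function that calls the remote service
        self._cache: Dict[str, str] = {}  # request URL -> response body
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        if url in self._cache:
            # Served locally: no network latency is incurred.
            self.hits += 1
            return self._cache[url]
        # First request for this URL: forward it and remember the answer.
        self.misses += 1
        response = self._fetch(url)
        self._cache[url] = response
        return response
```

Because the transformation runs periodically over largely the same terms, most requests after the first run would be answered from the cache.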

6.1 Gateway Protocol

The gateway is accessed through a REST interface. Because it is meant to be a technical service, its primary response format is XML or JSON, depending on the request. Two functions are available on the gateway: one presents a list of all available vocabularies, the other queries a vocabulary for terms. Both are described in detail below.

6.1.1 Getting an overview of the available Vocabularies

The gateway will present a list of the available vocabularies when addressed without a vocabulary name.

Request:
http://test117.ait.co.at:8080/Vocabulary/rest/

Response:
{
  "vocabulary": [
    { "@name": "CN", "displayName": "Common Names", "version": 1 },
    { "@name": "COL", "displayName": "Catalogue of Life - 2000", "version": 1 },
    { "@name": "VIAF", "displayName": "Virtual Internet Authority File", "version": 1 },
    { "@name": "GBIF-DWC", "displayName": "Global Biodiversity Information Facility - Darwin Core", "version": 1 },
    { "@name": "ITIS", "displayName": "Integrated Taxonomic Information System", "version": 1 }
  ]
}
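
A client can turn this vocabulary list into a simple lookup table. The Python sketch below parses a response of the shape shown above; the function name is hypothetical, and the sample string is an abbreviated copy of the response, not live service output.

```python
import json
from typing import Dict


def list_vocabularies(response_text: str) -> Dict[str, str]:
    """Map each vocabulary's short name (@name) to its display name."""
    data = json.loads(response_text)
    return {entry["@name"]: entry["displayName"]
            for entry in data.get("vocabulary", [])}


sample = """
{ "vocabulary": [
    { "@name": "CN",  "displayName": "Common Names", "version": 1 },
    { "@name": "COL", "displayName": "Catalogue of Life - 2000", "version": 1 }
] }
"""
for name, label in list_vocabularies(sample).items():
    print(name, "->", label)
```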

6.1.2 Querying a vocabulary

Request parameters:
http://test117.ait.co.at:8080/Vocabulary/rest/?q=&for=&qs=&lim=&app=&set=&target=&lang=&cache=

Parameter   Meaning
(path)      The identification name of the vocabulary queried, given after /rest/. E.g. Catalogue of Life, Vernacular Names, ..
q           Depending on the query syntax this is either a simple text query or a complex query structure.
for         [optional] association (requires target object, application name, collection/archive name) or no reason.
qs          Query syntax is one of simple (default), json
lim         Maximum number of terms returned or unlimited
app         The application identifier. E.g. OpenUp
set         The qualified archive and/or collection identifier.
target      The (possibly local) object identifier.
lang        Requesting term and response language.
cache       [default: off] Use the query cache: on or off
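
Assembling a query URL from these parameters might look as follows. This Python sketch is an assumption about client-side usage, not part of the gateway itself: the function name is hypothetical, the base URL is the test host quoted in this chapter, and the wildcard character `*` is deliberately left unescaped.

```python
from typing import Dict, Optional
from urllib.parse import quote, urlencode

# Test host as quoted in this chapter; a production deployment will differ.
GATEWAY_BASE = "http://test117.ait.co.at:8080/Vocabulary/rest/"


def build_query_url(vocabulary: str, q: str,
                    params: Optional[Dict[str, object]] = None) -> str:
    """Build a gateway query URL: vocabulary in the path, the rest as query string.

    `params` may hold any of for, qs, lim, app, set, target, lang, cache.
    A dict is used because `for` is a reserved word in Python.
    """
    query: Dict[str, object] = {"q": q}
    for key, value in (params or {}).items():
        if value is not None:       # omit parameters that were not set
            query[key] = value
    return GATEWAY_BASE + quote(vocabulary) + "?" + urlencode(query, safe="*")


print(build_query_url("COL", "Papil*", {"lim": 2, "app": "OpenUp"}))
```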

A sample request looks like: http://test117.ait.co.at:8080/Vocabulary/rest/COL?q=Papil*&start=5&lim=2

Response:

Depending on what is accepted by the client it returns either application/xml (default) or application/json.

The XML is formatted according to the schema:


Sample XML response.

tns:name tns:broader tns:narrower tns:use tns:useFor tns:scope

Sample JSON response.

{
  "term": [
    {
      "@about": "6108649",
      "name": [
        { "#text": "Papilio (Achillides) maackii shimogorii", "@lang": "" }
      ],
      "narrower": [],
      "broader": [
        { "@href": "2294666", "#text": "Papilio (Achillides)" }
      ],
      "scope": {
        "record_id": 6108649.0,
        "lsid": null,
        "name": "Papilio (Achillides) maackii shimogorii",
        "name_with_italics": "Papilio (Achillides) maackii shimogorii",
        "taxon": "Infraspecies",
        "name_code": "Gar-S-82",
        "parent_id": 2294666.0,
        "sp2000_status_id": 5.0,
        "database": {
          "database_name_displayed": "GloBIS (GART): Global Butterfly Information System",
          "record_id": 46.0,
          "database_name": "GloBIS (GART)",
          "database_full_name": "Global Butterfly Information System",
          "web_site": "http://www.science4you.org/platform/lex/globis/home/index.do",
          "organization": "State Museum of Natural History, Stuttgart, Germany ",
          "contact_person": "C Häuser, J Holstein & A Steiner (eds)",
          "taxa": "Swallowtails and whites",
          "taxonomic_coverage": "Animalia - Arthropoda - Insecta - Lepidoptera - Papilionidae, Pieridae",
          "Abstract": "The GloBIS/GART database provides the taxonomic backbone for a global information system on butterflies. In addition to the data presented in the Catalogue of Life, the original GloBIS/GART database provides information about the type material such as type locality, status and current location of the type specimens and images, as well as the original and current taxonomic placement. Currently data for the families Papilionidae or \"Swallowtails\" (554 recognized species, more than 2,400 taxa), and Pieridae or \"Whites\" (1091 recognized species, more than 2,400 taxa) are available, and other butterfly families to follow.",
          "version": "0.2, Nov 2008",
          "release_date": {},
          "SpeciesCount": 553.0,
          "SpeciesEst": 0.0,
          "authors_editors": "C. Häuser, J. Holstein & A. Steiner (eds)",
          "accepted_species_names": 1635.0,
          "accepted_infraspecies_names": 0.0,
          "species_synonyms": 462.0,
          "infraspecies_synonyms": 1975.0,
          "common_names": 0.0,
          "total_names": 4072.0,
          "is_new": false
        },
        "is_accepted_name": 0.0,
        "is_species_or_nonsynonymic_higher_taxon": 1.0,
        "scientific_names": [
          {
            "record_id": 6108649.0,
            "name_code": "Gar-S-82",
            "web_site": null,
            "genus": "Papilio (Achillides)",
            "species": "maackii",
            "infraspecies": "shimogorii",
            "infraspecies_marker": null,
            "author": "Fujioka, 1997",
            "accepted_name_code": "Gar-176",
            "comment": null,
            "scrutiny_date": null,
            "sp2000_status": "synonym",
            "database_id": 46.0,
            "specialist_name": null,
            "family_id": 19408.0,
            "is_accepted_name": 0.0
          }
        ],
        "common_names": [
          {
            "record_id": null,
            "name_code": null,
            "common_name": null,
            "language": null,
            "country": null,
            "reference": null,
            "database_id": null,
            "is_infraspecies": null
          }
        ],
        "scientific_name_references": [
          {
            "record_id": 7584543.0,
            "name_code": "Gar-S-82",
            "reference_type": "NomRef",
            "reference": {
              "record_id": 2071167.0,
              "author": "Fujioka, T., Tsukiyama, H. & Chiba, H.",
              "year": "1997",
              "title": "Japanese Butterflies and their Relatives in the World I. [in Japanese].",
              "source": "Fujioka, T., Tsukiyama, H. & Chiba, H. (1997) Japanese Butterflies and their Relatives in the World I. [in Japanese]. 3 volumes, 301 pp., 196 pp., 162 plates, [Japan].",
              "database_id": 46.0
            }
          }
        ],
        "distributions": [
          { "record_id": null, "name_code": null, "distribution": null }
        ]
      }
    },
    {
      "@about": "6108647",
      "name": [
        { "#text": "Papilio (Achillides) maackii kitawakii", "@lang": "" }
      ],
      "narrower": [],
      "broader": [
        { "@href": "2294666", "#text": "Papilio (Achillides)" }
      ],
      "scope": {
        "record_id": 6108647.0,
        "lsid": null,
        "name": "Papilio (Achillides) maackii kitawakii",
        "name_with_italics": "Papilio (Achillides) maackii kitawakii",
        "taxon": "Infraspecies",
        "name_code": "Gar-S-84",
        "parent_id": 2294666.0,
        "sp2000_status_id": 5.0,
        "database": {
          "database_name_displayed": "GloBIS (GART): Global Butterfly Information System",
          "record_id": 46.0,
          "database_name": "GloBIS (GART)",
          "database_full_name": "Global Butterfly Information System",
          "web_site": "http://www.science4you.org/platform/lex/globis/home/index.do",
          "organization": "State Museum of Natural History, Stuttgart, Germany ",
          "contact_person": "C Häuser, J Holstein & A Steiner (eds)",
          "taxa": "Swallowtails and whites",
          "taxonomic_coverage": "Animalia - Arthropoda - Insecta - Lepidoptera - Papilionidae, Pieridae",
          "Abstract": "The GloBIS/GART database provides the taxonomic backbone for a global information system on butterflies. In addition to the data presented in the Catalogue of Life, the original GloBIS/GART database provides information about the type material such as type locality, status and current location of the type specimens and images, as well as the original and current taxonomic placement. Currently data for the families Papilionidae or \"Swallowtails\" (554 recognized species, more than 2,400 taxa), and Pieridae or \"Whites\" (1091 recognized species, more than 2,400 taxa) are available, and other butterfly families to follow.",
          "version": "0.2, Nov 2008",
          "release_date": {},
          "SpeciesCount": 553.0,
          "SpeciesEst": 0.0,
          "authors_editors": "C. Häuser, J. Holstein & A. Steiner (eds)",
          "accepted_species_names": 1635.0,
          "accepted_infraspecies_names": 0.0,
          "species_synonyms": 462.0,
          "infraspecies_synonyms": 1975.0,
          "common_names": 0.0,
          "total_names": 4072.0,
          "is_new": false
        },
        "is_accepted_name": 0.0,
        "is_species_or_nonsynonymic_higher_taxon": 1.0,
        "scientific_names": [
          {
            "record_id": 6108647.0,
            "name_code": "Gar-S-84",
            "web_site": null,
            "genus": "Papilio (Achillides)",
            "species": "maackii",
            "infraspecies": "kitawakii",
            "infraspecies_marker": null,
            "author": "Shimogori & Fujioka, 1997",
            "accepted_name_code": "Gar-176",
            "comment": null,
            "scrutiny_date": null,
            "sp2000_status": "synonym",
            "database_id": 46.0,
            "specialist_name": null,
            "family_id": 19408.0,
            "is_accepted_name": 0.0
          }
        ],
        "common_names": [
          {
            "record_id": null,
            "name_code": null,
            "common_name": null,
            "language": null,
            "country": null,
            "reference": null,
            "database_id": null,
            "is_infraspecies": null
          }
        ],
        "scientific_name_references": [
          {
            "record_id": 7584623.0,
            "name_code": "Gar-S-84",
            "reference_type": "NomRef",
            "reference": {
              "record_id": 2071167.0,
              "author": "Fujioka, T., Tsukiyama, H. & Chiba, H.",
              "year": "1997",
              "title": "Japanese Butterflies and their Relatives in the World I. [in Japanese].",
              "source": "Fujioka, T., Tsukiyama, H. & Chiba, H. (1997) Japanese Butterflies and their Relatives in the World I. [in Japanese]. 3 volumes, 301 pp., 196 pp., 162 plates, [Japan].",
              "database_id": 46.0
            }
          }
        ],
        "distributions": [
          { "record_id": null, "name_code": null, "distribution": null }
        ]
      }
    }
  ]
}
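
For enrichment purposes a client typically needs only a few fields of such a response: the term identifier, its names and its broader terms. The following Python sketch extracts them from a response of the shape shown in the sample JSON; the function name and the summary layout are hypothetical, and the embedded sample is a heavily shortened copy of the response.

```python
import json
from typing import Dict, List


def summarize_terms(response_text: str) -> List[Dict[str, object]]:
    """Reduce a gateway JSON response to id, names and broader terms."""
    data = json.loads(response_text)
    summary = []
    for term in data.get("term", []):
        summary.append({
            "id": term["@about"],
            "names": [n["#text"] for n in term.get("name", [])],
            "broader": [b["#text"] for b in term.get("broader", [])],
        })
    return summary


sample = """
{ "term": [
    { "@about": "6108649",
      "name": [ { "#text": "Papilio (Achillides) maackii shimogorii", "@lang": "" } ],
      "narrower": [],
      "broader": [ { "@href": "2294666", "#text": "Papilio (Achillides)" } ] }
] }
"""
for entry in summarize_terms(sample):
    print(entry["id"], entry["names"], entry["broader"])
```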

7 EXTENSIONS OF PENTAHO TRANSFORMATIONS

The transformation of the OpenUp! metadata from the ABCD format into the ESE/EDM format is done with the Pentaho Data Integration (PDI, also known as Kettle) tool. In order to generate metadata suitable for Europeana's EDM format, the ESE transformation routine is extended with the ontology data. This is done by calling the Ontology Data Gateway's REST service from within the transformation.

Figure 8 PDI Transformation to ESE


Steps for integrating the vocabulary services into the ESE transformation:

Figure 9 Pentaho Ontology Service access

Figure 10 Pentaho Variables for Ontology Service


1. Building the request URL.

Figure 11 Pentaho building Ontology Service Request URL

2. Sending the request.

Figure 12 Pentaho Ontology Service Request


3. Accessing the service result.

Figure 13 Pentaho Ontology Service accessing response

Figure 14 Pentaho Ontology Service accessing response content


Figure 15 Pentaho Ontology Service accessing response fields
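
The three steps above can be sketched outside Pentaho as a short Python script. This is only an illustration under stated assumptions, not the PDI transformation itself: the gateway address, the query parameter names (vocabulary, term) and the flat JSON layout with a common_names list are hypothetical, with the field names borrowed from the sample response shown before this chapter.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical address: the real gateway URL is held in Pentaho variables
# (Figure 10) and is not given in the text.
GATEWAY_BASE_URL = "http://localhost:8080/ontology-gateway/term"


def build_request_url(base_url, vocabulary, term):
    """Step 1: build the request URL (parameter names are assumptions)."""
    query = urlencode({"vocabulary": vocabulary, "term": term})
    return f"{base_url}?{query}"


def send_request(url, timeout=30):
    """Step 2: send the request and return the response body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def extract_common_names(response_text):
    """Step 3: read the fields needed for the target record from the JSON."""
    document = json.loads(response_text)
    names = []
    for entry in document.get("common_names", []):
        if entry.get("common_name"):  # skip empty placeholder entries
            names.append((entry["common_name"], entry.get("language")))
    return names
```

In a real run the three functions would be chained per record: build the URL from the scientific name, send it, and copy the extracted fields into the ESE/EDM output row.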


8 HELPER APPLICATIONS FOR ONTOLOGY

The Ontology local store contains three helper applications (modules). Each of them can use a MySQL database to store its data. The local store separates data storage from the ontology middleware (the protocol translation component of the gateway), which only processes data and does not store anything. It is also the place where the caching and usage recording functions of the gateway are realized.

8.1 Vocabulary Registry module

All vocabulary services are configured in the registry module, which is purely a parameter store. The vocabulary components of the gateway keep the service access information they need there, for instance the service base URL and any accounts required to access the service.

8.2 Caching module

The sole purpose of this module is to speed up the transformation process by serving ontology terms from a local cache instead of forwarding the same request to a remote service over and over again. Since transformations run periodically, this also benefits archives that request distinct ontology terms, because a term fetched in one run can be served locally in later runs.
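
A minimal sketch of this caching behaviour, assuming a Python dictionary as a stand-in for the MySQL cache table. The class name, the key layout (vocabulary, term) and the remote_calls counter are hypothetical and serve only to illustrate the idea.

```python
class TermCache:
    """Serve ontology terms locally; call the remote service only on a miss."""

    def __init__(self, fetch_remote):
        self._fetch_remote = fetch_remote  # callable: (vocabulary, term) -> data
        self._store = {}                   # stand-in for the MySQL cache table
        self.remote_calls = 0              # counts how often the service was hit

    def get(self, vocabulary, term):
        key = (vocabulary, term)
        if key not in self._store:
            # Cache miss: ask the remote service once and remember the answer.
            self.remote_calls += 1
            self._store[key] = self._fetch_remote(vocabulary, term)
        return self._store[key]
```

With a persistent table instead of the dictionary, the same lookup would also survive between periodic transformation runs.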

8.3 Usage Directory module

Each metadata record that uses a term from the ontology is recorded in the usage directory. To create a unique identifier for the record, the set notation (collection:archive:country) is combined with the internal record identifier within the set. This record identifier is then stored together with the term identifier of the vocabulary. The directory thus provides information about how often and where terms from the ontologies are used in the data.
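
The identifier scheme and the two questions the directory answers (how often, and where) can be sketched as follows. This is an illustration only: the colon used to append the internal record identifier, the in-memory structure and all names and sample values are assumptions, and the real module stores its data in MySQL.

```python
from collections import defaultdict


def record_identifier(collection, archive, country, internal_id):
    """Combine the set notation collection:archive:country with the record id."""
    return f"{collection}:{archive}:{country}:{internal_id}"


class UsageDirectory:
    """Remember which records use which vocabulary term."""

    def __init__(self):
        self._usage = defaultdict(set)  # term id -> set of record identifiers

    def record(self, term_id, record_id):
        self._usage[term_id].add(record_id)

    def count(self, term_id):
        """How often the term is used (number of distinct records)."""
        return len(self._usage.get(term_id, ()))

    def records(self, term_id):
        """Where the term is used (sorted record identifiers)."""
        return sorted(self._usage.get(term_id, ()))
```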


9 ANNEX I

The following overview template was created within task T3.4 and was sent out to the OpenUp! Consortium in May 2012.

Table 10 T3.4 Overview on current available and relevant vocabularies for metadata enrichment

Columns: Category (1 = OpenUp!, 2 = GBIF, 3 = Other) | Institution | Name / Domain | Access (URL) | Documentation/Test (URL) | Type | Languages | Usage and rights | Recommendation (0 = not recommended). Two of the columns are marked "required" and "required if known" in the template.

Use of the content for publications and databases by individuals and organisations for not-for-profit usage is encouraged, on condition that Catalogue of REST X Species 2000 provides web services http://www.cat full and precise credit is given Species2000, Life http://webservice.catal SOAP o to facilitate automated access to the alogueoflife.or EN at three levels on all occasions 1 ITIS Botanical, ogueoflife.org/ Other: 0 - 1 - 2 - 3 - 4 - 5 Annual Checklist by computer g/services/ that records are shown. The Zoological specify programs. three part credit the complete work, the contributing database of the record, and the expert who provides taxonomic scrutiny of the individual record.

Vernacular http://vocabularies.gbi http://rs.gbif.or REST X Extension to core taxa that lists Names f.org/services/gbif/ext g/extension/gb SOAP o http://data.gbif.org/tutorial/data GBIF EN vernacular names for a scientific 1 Botanical, ension?identifier=127 if/1.0/vernacul Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 taxon. Zoological 063 arname.xml specify


Users of the University of Helsinki Open University web service pledge to use the service in compliance with legislation and good practices. The web service is copyright protected in Finland in compliance with valid copyright legislation. The copyright of this service belongs to the website producer. Only the original creators have copyright to the material contained on this site, and the producer will not hand http://www.avo REST o over the rights to third parties, You need to register before using the University of in.helsinki.fi/op SOAP o EN, FINNISH unless otherwise agreed. webservice. 1 Helsinki en_university/cOther: 0 - 1 - 2 - 3 - 4 - 5 Service users are not entitled to Name and Access URL are missing opy.htm specify distribute, publish, copy, make accessible to the public or otherwise commercially utilise the material without express consent from the producer. If the site is linked to other online material, the publisher and creator must be clearly displayed in line with good practice. Right to changes reserved. Users are liable for all costs related to or incurred from service use.

REST o Estonian http://unite.ut.ee/eesti http://elurikkus Estonian Species Registry is a SOAP o EN, UT-NHM Species _loomakogud/andmeb.ut.ee/index.ph database of taxa found in Estonia. 1 Other: ESTONIAN 0 - 1 - 2 - 3 - 4 - 5 Registry aasist.php p?lang=eng Rights are missing specify


The Berlin Model is based on the IOPI model and various later http://www.bgb implementations of the basic The Berlin REST o http://www.bgbm.org/ m.org/BioDivIn principles laid out therein. It fully Taxonomic SOAP o http://www.bgbm.org/disclaim_ FUB-BGBM BioDivInf/Docs/bgbm- f/Docs/bgbm- EN incorporates "potential taxa" (taxa as 1 Information Other: e.htm 0 - 1 - 2 - 3 - 4 - 5 model/download.htm model/docume circumscribed by a reference) as well Model specify ntation.htm as the full complexity of botanical names according to the rules of botanical nomenclature.

http://www.geo Country REST o names.org/exp The iso country code and the Subdivision / api.geonames.org/cou SOAP o http://www.geonames.org/expo GeoNames ort/web- EN administrative subdivision of any 1 reverse ntrySubdivision? Other: rt/ 0 - 1 - 2 - 3 - 4 - 5 services.html# given point. geocoding specify countrysubdiv

Find nearby http://www.geo REST o Returns the closest populated place populated names.org/exp api.geonames.org/find SOAP o http://www.geonames.org/expo for the lat/lng query as xml document. GeoNames place / ort/web- EN 1 NearbyPlaceName? Other: rt/ 0 - 1 - 2 - 3 - 4 - 5 The unit of the distance element is reverse services.html#f specify 'km'. geocoding indNearby

The GBIF Backbone http://gbrds.gbi (Nub) is an automatically synthesised GBIF f.org/browse/a REST o management classification with Backbone http://ecat- gent?uuid=d7dSOAP o http://code.google.com/projecth limited manual curating. Information GBIF Taxonomy dev.gbif.org/repository EN 2 ddbf4-2cf0- Other: osting/terms.html 0 - 1 - 2 - 3 - 4 - 5 presented here does not represent a Botanical, /checklist-export1.zip 4f39-9b2a- specify consistent taxon but may conflict with Zoological bb099caae36c other nub "usages" in many cases to a trained taxonomists eye.


Vocabulary to categorise paragraphs Description http://rs.gbif.or REST o of a taxon description. The individual Type GBIF http://rs.gbif.org/vocab g/vocabulary/g SOAP o http://data.gbif.org/tutorial/data terms are selected according to GBIF Vocabulary ulary/gbif/descriptionT EN 2 bif/description Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 common usage and in some cases Botanical, ype/ Type.xml specify are overlapping or broader/narrower Zoological terms of others.

https://spreads Establishme heets.google.c Each part of the web services is nt Means REST o http://rs.gbif.org/vocabom/pub?key=t provided by a web service API, which GBIF SOAP o http://data.gbif.org/tutorial/data GBIF ulary/gbif/establishme Vs- EN defines the name, input to, and output 2 Vocabulary Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 nt_means/ UWMXnkD3sl from the service for a particular data Botanical, specify wIE8T336w&gi request. Zoological d=4 Life Form REST o http://rs.gbif.or A life form vocabulary targeted at GBIF http://rs.gbif.org/vocab SOAP o http://data.gbif.org/tutorial/data GBIF g/vocabulary/g EN and based on Raunkiær's 2 Vocabulary ulary/gbif/life_form/ Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 bif/life_form/ definitions. Botanical specify Life Stage REST o GBIF http://rs.gbif.or Simple vocabulary to represent http://rs.gbif.org/vocab SOAP o http://data.gbif.org/tutorial/data GBIF Vocabulary g/vocabulary/g EN organism life stages across all 2 ulary/gbif/life_stage/ Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 Botanical, bif/life_stage/ kingdoms. specify Zoological Nomenclatur http://rs.gbif.or REST o http://rs.gbif.org/vocab al Codes g/vocabulary/g SOAP o http://data.gbif.org/tutorial/data GBIF recommended terms for GBIF ulary/gbif/nomenclatur EN 2 Botanical, bif/nomenclatu Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 denoting a nomenclatural code. al_code.xml Zoological ral_code.xml specify Sex GBIF REST o http://rs.gbif.or Vocabulary http://rs.gbif.org/vocab SOAP o http://data.gbif.org/tutorial/data A short vocabulary representing an GBIF g/vocabulary/g EN 2 Botanical, ulary/gbif/sex.xml Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 organisms sex. bif/sex.xml Zoological specify


Document Name / Category Institution Access (URL) ation/Test Type Languages Usage and rights Recommendation Domain (URL) http://rs.gbif.or REST o Controlled vocabulary following the Acquisition http://rs.gbif.org/vocabg/vocabulary/g SOAP o http://data.gbif.org/tutorial/data Multi-Crop Passport standard for term GBIF source ulary/germplasm/0.1/ ermplasm/0.1/ EN 2 Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 22, Collection or acquisition source Botanical AcquisitionSource.xml AcquisitionSou specify (COLLSRC). rce.xml http://rs.gbif.or Biological http://rs.gbif.org/vocabg/vocabulary/g REST o Controlled vocabulary following the status of ulary/germplasm/0.1/ ermplasm/0.1/ SOAP o http://data.gbif.org/tutorial/data Multi-Crop Passport standard for term GBIF EN 2 sample BiologicalStatusOfSa BiologicalStatuOther: useagreement 0 - 1 - 2 - 3 - 4 - 5 20, Biological Status Of Sample Botanical mple.xml sOfSample.xm specify (SAMPSTAT). l Systema http://sn2000.t REST o http://sn2000.taxonom Describes the syntax of the habitat of Naturae axonomy.nl/Sy SOAP o http://data.gbif.org/tutorial/data GBIF y.nl/SyntaxHabitat.ht EN a taxon and lists the habitat codes for 2 2000 Habitat ntaxHabitat.ht Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 m Systema Naturae 2000. Vocabulary m specify

A recommended set of values to use for the basisOfRecord term to categorize Darwin Core resources. http://rs.gbif.or REST o Darwin Core http://rs.gbif.org/vocab The Darwin Core Type Vocabulary g/vocabulary/d SOAP o http://data.gbif.org/tutorial/data GBIF Type ulary/dwc/basis_of_re EN extends and refines terms from the 2 wc/basis_of_reOther: useagreement 0 - 1 - 2 - 3 - 4 - 5 Vocabulary cord.xml Dublin Core Type Vocabulary to cord.xml specify describe and categorize resources more specifically for biodiversity applications.

http://rs.gbif.or REST o EML Agent http://rs.gbif.org/vocab g/vocabulary/g SOAP o http://data.gbif.org/tutorial/data Vocabulary for agent roles used in GBIF Role ulary/gbif/agent_role.x EN, ES, FR 2 bif/agent_role. Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 EML. Vocabulary ml xml specify http://rs.gbif.or REST o Describes the occurence of a Occurrence http://rs.gbif.org/vocab g/vocabulary/g SOAP o http://data.gbif.org/tutorial/data species: present, common, rare, GBIF Status GBIF ulary/gbif/occurrence_ EN 2 bif/occurrence Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 irregular, doubtful, absent or Vocabulary status/ _status/ specify excluded.


The method used to preserve Preservation http://rs.gbif.or REST o http://rs.gbif.org/vocab specimens. This vocabulary was Method g/vocabulary/g SOAP o http://data.gbif.org/tutorial/data GBIF ulary/gbif/preservation EN based on the 2 GBIF bif/preservatio Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 _method PreservationMethodClassVoc from Vocabulary n_method specify the BioCASe MetaProfile.

REST o Taxonomic http://rs.gbif.or http://rs.gbif.org/vocab SOAP o http://data.gbif.org/tutorial/data Common taxonomic ranks and their GBIF Rank GBIF g/vocabulary/g EN, LT, DE 2 ulary/gbif/rank Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 alternative representations. Vocabulary bif/rank specify http://rs.gbif.or REST o Reference http://rs.gbif.org/vocab g/vocabulary/g SOAP o http://data.gbif.org/tutorial/data Vocabulary to categorise (literature) GBIF Type GBIF ulary/gbif/referenceTy EN 2 bif/referenceTyOther: useagreement 0 - 1 - 2 - 3 - 4 - 5 references. Vocabulary pe pe specify GBIF http://rs.gbif.or REST o http://rs.gbif.org/vocab Resource g/vocabulary/g SOAP o http://data.gbif.org/tutorial/data A vocabulary classifying resources GBIF ulary/gbif/resource_ty EN 2 Type bif/resource_ty Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 indexed by GBIF. pe Vocabulary pe specify http://rs.gbif.or REST o Taxonomic http://rs.gbif.org/vocab g/vocabulary/g SOAP o http://data.gbif.org/tutorial/data Simple vocabulary to describe the GBIF Status GBIF ulary/gbif/taxonomicSt EN 2 bif/taxonomicS Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 taxonomic status of a name. Vocabulary atus/ tatus/ specify REST o GNA http://vocabula This extension supports the http://vocabularies.gbi SOAP o http://data.gbif.org/tutorial/data GBIF Vernacular ries.gbif.org/no EN publication of vernacular name data 2 f.org/node/127063 Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 Names de/127063 to the Global Names Architecture. specify


ITIS produced data and information are in the public domain. While the content of many ITIS web pages is in the public domain, some ITIS pages contain material that is copyrighted by others and used by ITIS with permission. You ITIS Web may need to obtain permission Service - from the copyright owner for Access to other uses. Furthermore, some the USGS REST o ITIS data, products, and The ITIS Web Services provide the Integrated http://www.itis. http://www.itis.gov/ITI SOAP o information linked, or referred ability to search and retrieve data ITIS Taxonomic gov/ws_descri EN 3 SWebService.xml Other: to, from this site may be 0 - 1 - 2 - 3 - 4 - 5 from ITIS by providing access to the Information ption.html specify protected under U.S. and data behind theITIS web site. System foreign copyright laws. You database may need to obtain permission Botanical, from the copyright owner to Zoological acquire, use, reproduce, or distribute these materials. It is the sole responsibility of you, the user of this site, to carefully examine the content of ITIS and all linked pages for copyright restrictions and to secure all necessary permissions.

Wikipedia http://en.wikipe A dwc archive of all english wikipedia REST o http://wikimediafoundation.org/ Species http://dl.dropbox.com/ dia.org/wiki/Wi species pages containing the taxobox SOAP o wiki/Terms_of_Use%20(2012)/ Wikipedia Page u/457027/wikipedia- kipedia:WikiPr EN template. See 3 Other: en?utm_source=TOU_top_Test 0 - 1 - 2 - 3 - 4 - 5 Botanical, en-dwca.zip oject_Tree_of_ http://en.wikipedia.org/wiki/Template: specify Clone2 Zoological life Taxobox.


Wikipedia A dwc archive of all German http://de.wikipeREST o Species http://dl.dropbox.com/ wikipedia species pages containing dia.org/wiki/Wi SOAP o http://wikimediafoundation.org/ Wikipedia Page u/457027/wikipedia- DE the taxobox template. See 3 kipedia:Taxob Other: wiki/Nutzungsbedingungen 0 - 1 - 2 - 3 - 4 - 5 Botanical, de-dwca.zip http://de.wikipedia.org/wiki/Wikipedia: oxen specify Zoological Taxoboxen.

http://www.niis GISIN http://www.niiss.org/c s.org/cwis438/ REST o Species wis438/websites/GISI websites/GISI http://www.niiss.org/cwis438/Us SOAP o Describes whether a species is GISIN Status Origin NDirectory/tech/Conc NDirectory/tec EN erManagement/NIISSDataUse. 3 Other: 0 - 1 - 2 - 3 - 4 - 5 indigenious or exotic (or unknown). Vocabulary ept_Info.php?Concepth/Concept_Inf php?WebSiteID=1 specify Botanical ID=1 o.php?Concep tID=1 REST o ISO 639-1 http://rs.gbif.or http://rs.gbif.org/vocab SOAP o http://data.gbif.org/tutorial/data Codes for the representation of ISO Language g/vocabulary/is EN 3 ulary/iso/639-1.xml Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 names of languages (two letters). Codes o/639-1.xml specify REST o ISO 639-2 http://iso.org/6 SOAP o http://data.gbif.org/tutorial/data Codes for the representation of ISO Language http://iso.org/639-2 EN 3 39-2 Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 names of languages (three letters). Codes specify

SO 3166-1 alpha-2 codes are two- letter country codes defined in ISO ISO 3166-1 REST o 3166-1, part of the ISO 3166 standard http://rs.gbif.org/vocabhttp://rs.gbif.or Alpha2 SOAP o http://data.gbif.org/tutorial/data published by the International ISO ulary/iso/3166- g/vocabulary/is EN, FR 3 Country Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 Organization for Standardization 1_alpha2.xml o/639-2.xml Codes specify (ISO), to represent countries, dependent territories, and special areas of geographical interest.


ISO 3166-1 alpha-2 codes are two- letter country codes defined in ISO 3166-1, part of the ISO 3166 standard published by the International Organization for Standardization ISO 3166-1 REST o (ISO), to represent countries, http://iso.org/is Alpha2 http://iso.org/iso3166- SOAP o http://data.gbif.org/tutorial/data dependent territories, and special ISO o3166- EN 3 Country 1/alpha2 Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 areas of geographical interest. They 1/alpha2 Codes specify are the most widely used of the country codes published by ISO (the others being alpha-3 and numeric), and are used most prominently for the Internet's country code top-level domains (with a few exceptions).

http://rs.gbif.or REST o IUCN http://rs.gbif.org/vocabg/vocabulary/i SOAP o http://data.gbif.org/tutorial/data IUCN Habitat EN Describes the habitat of a species. 3 ulary/iucn/habitat.xml ucn/habitat.xm Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 Vocabulary l specify http://rs.gbif.or REST o IUCN Threat http://rs.gbif.org/vocab g/vocabulary/i SOAP o http://data.gbif.org/tutorial/data Describes the threat status of a IUCN Status ulary/iucn/threat_statu EN 3 ucn/threat_stat Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 species. Vocabulary s.xml us.xml specify MIxS MIxS REST o (Minimum http://gensc.or Environment http://gensc.org/ns/mi SOAP o http://data.gbif.org/tutorial/data Describes key aspects of Information g/ns/mixs/voc/ EN 3 al Package xs/voc/env_package Other: useagreement 0 - 1 - 2 - 3 - 4 - 5 environmental context (habitat). about any (x) env_package vocabulary specify Sequence)


As a user or developer you can use http://www.marREST o http://www.marinespe the WoRMS webservice to feed your AphiaName inespecies.org SOAP x WORMS cies.org/aphia.php?p= EN own application with standard 3 Service /aphia.php?p= Other: 0 - 1 - 2 - 3 - 4 - 5 soap&wsdl=1 WoRMS taxonomy. webservice specify Rights are missing

http://www.niis GISIN http://www.niiss.org/c s.org/cwis438/ REST o Species wis438/websites/GISI websites/GISI Describes whether a species is SOAP o GISIN Status NDirectory/tech/Conc NDirectory/tec EN (potentially) harmful or not. 3 Other: 0 - 1 - 2 - 3 - 4 - 5 Harmful ept_Info.php?Concepth/Concept_Inf Rights are missing specify Vocabulary ID=8 o.php?Concep tID=8


The Virtual International Authority File (VIAF) is an international service 4 April 2012—VIAF (Virtual designed to provide convenient International Authority File), a access to the world's major name project that virtually combines authority files. Its creators envision multiple name authority files the VIAF as a building block for the into a single name authority Semantic Web to enable switching of service, has transitioned to http://oclc.org/ the displayed form of names for become an OCLC service. Online developer/doc persons to the preferred language Virtual REST o OCLC will continue to make Computer umentation/virt and script of the Web user. VIAF International http://viaf.org/viaf/sear SOAP o VIAF openly accessible and will Library ual- MUL began as a joint project with the 3 Authority File ch Other: also work to incorporate VIAF 0 - 1 - 2 - 3 - 4 - 5 Center international- Library of Congress (LC), the (VIAF) SRU into various OCLC services. (OCLC) authority-file- Deutsche Nationalbibliothek (DNB), The new Agreement confirms viaf/using-api the Bibliothèque nationale de France the free re-use of VIAF data, (BNF) and OCLC. It has, over the including the commercial re- past decade, become a cooperative use of data according to the effort involving an expanding number ODC-By license. of other national libraries and other http://oclc.org/developer/servic agencies. At the beginning of 2012, es/viaf contributors include 20 agencies from 16 countries.


10 ANNEX II

Figure 16 Visual EDM representation of two related OpenUp! cultural heritage objects


Figure 17 Sample: edm:ProvidedCHO


Figure 18 Sample: ore:Aggregation


Figure 19 Sample: edm:WebResource


11 ANNEX III

The aim of this annex is to provide an overview of the application programming interface (API) of EOL [26]. The mission of the Encyclopedia of Life is: “To increase awareness and understanding of living nature through an Encyclopedia of Life that gathers, generates, and shares knowledge in an open, freely accessible and trusted digital resource.” [27] The EOL API offers several methods (see Figure 20):

Figure 20 Methods and descriptions of APIs [28]

In this example the two methods “search” and “pages” are used to gather information on common names (compare the descriptions in Figure 20). Both requests follow a fixed URL format. The following sections walk through the lookup process with an example.

[26] http://eol.org/ 21 Aug. 2012.
[27] http://eol.org/info/about 21 Aug. 2012.
[28] http://eol.org/api 21 Aug. 2012.


11.1 search

In this example the butterfly Argynnis paphia is to be found. The search request for this name is: http://eol.org/api/search/1.0/Argynnis%20paphia Opening this link returns the following information (see Figure 21).

Figure 21 Searching Argynnis paphia

In Figure 21 the id of the entry “Argynnis paphia” is marked. This id is needed in the next step to find the common name(s) of this butterfly. IMPORTANT: A search request must not end with a full stop or any other punctuation mark, even though many scientific names end with a full stop, e.g. “Chenopodium bonus-henricus L.”. If a wrong URL is entered, the following message is shown (see Figure 22).

Figure 22 Error message after entering the wrong URL
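
A small helper can build the search URL and enforce the punctuation rule automatically. The sketch below uses the URL pattern from the request above; the function name and the exact set of trailing characters that is stripped are assumptions.

```python
from urllib.parse import quote

EOL_SEARCH_URL = "http://eol.org/api/search/1.0/"


def build_search_url(scientific_name):
    """Return the search URL for a name, without trailing punctuation."""
    # Remove surrounding whitespace, then any trailing punctuation marks.
    cleaned = scientific_name.strip().rstrip(".,;:!? ")
    # Percent-encode the name so that spaces become %20.
    return EOL_SEARCH_URL + quote(cleaned)
```

For “Argynnis paphia” this reproduces the request shown above. For a name such as “Chenopodium bonus-henricus L.” the trailing full stop is dropped before the name is encoded.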

One can also search by entering the search term in the search bar at the top of the EOL homepage (see Figure 23).


Figure 23 Searching on the EOL homepage

Entering the term returns a list of possible matches, with the most likely result at the top (see Figure 24).

Figure 24 Result after searching for “Argynnis paphia”

Clicking on the first result, “Argynnis paphia”, opens a page with detailed information on this butterfly (see Figure 25).


Figure 25 Information on the butterfly Argynnis paphia

Several tabs are shown at the top of the page. Clicking on “Names” first shows the classifications and rank of the butterfly (see Figure 26). The numbers of “related names”, “common names” and “synonyms” are given on the left side.

Figure 26 “Names” section with classifications


Clicking on “6 common names” lists the common names in different languages together with their source and status (see Figure 27). Note that common names are not available for every species.

Figure 27 Common names of Argynnis paphia

These common names can also be found with the “pages” method of the EOL API.

11.2 pages

In Figure 21 the butterfly Argynnis paphia was found with a search request. The id returned by that request can be used to display the common names of the butterfly. To do this, the following request, which includes the id, must be entered in the browser's address bar: http://eol.org/api/pages/1.0/154538?common_names=1&details=1&images=2&subjects=all&text=2 The number 154538 is the id of Argynnis paphia. The id changes with every species, but the rest of the URL stays the same. This URL returns the common names of the butterfly together with their language (xml:lang) in XML format (see Figure 28).


Figure 28 Finding common names with the id of the search request
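
Since only the id changes between species, the pages request can be generated from the id found by the search step. The sketch below takes its parameter list from the request given above; the function name is hypothetical, and parsing of the returned XML is left out because its exact structure is only shown in Figure 28.

```python
from urllib.parse import urlencode

EOL_PAGES_URL = "http://eol.org/api/pages/1.0/"

# Fixed query parameters, in the order used in the request above.
PAGES_PARAMETERS = [
    ("common_names", 1),
    ("details", 1),
    ("images", 2),
    ("subjects", "all"),
    ("text", 2),
]


def build_pages_url(page_id):
    """Return the pages URL for the id obtained from a search request."""
    return f"{EOL_PAGES_URL}{int(page_id)}?{urlencode(PAGES_PARAMETERS)}"
```

Calling build_pages_url(154538) yields the request for Argynnis paphia. Any other id from the matching test can be substituted in the same way.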

The following table shows a first matching test of some OpenUp! data against EOL API information. Further tests with the whole current OpenUp! data repository are planned.

Table 11 Matching test OpenUp! data and EOL

Nr. | from collection | Scientific name | Number of common names | Common name (language) | ID
1 | ZOBODAT | Brachythecium | 1 | brachythecium moss (en) | 53861
2 | ZOBODAT | Scapania undulata | 0 | | 608677
3 | Herbarium Berolinense | Prunus spinosa | 20 | Schlehe (de); Schwarzdorn (de); blackthorn (en); sloe (en); endrino (es); espino negro (es); oratuomi (fi); Épine noire (fr); Prunus spinosa (it); prugnolo (it); spino nero (it); Тёрн (ru); Ternovka (ru); tern (ru); ternovnik (ru); Терновка (ru); терн (ru); терновник (ru); Slån (sv); slånbär (sv) | 638071
4 | Herbarium Berolinense | Phleum alpinum L. | 5 | Alpen-Lieschgras (de); alpine timothy (en); Mountain Timothy (en); Phléole des Alpes (fr); Codolina alpina (it) | 1114571
5 | BATS (ETI) | Nyctalus leisleri | 7 | netopýr stromový (cs); letouň lesní (cs); netopýr Leislerův (cs); Lesser Noctule (en); Leisler's Bat (en); Nóctulo pequeño (es); Noctule de Leisler (fr) | 1038527
6 | SPNEA (ETI) | Leuconia ananas | 0 | | 332054
7 | SPNEA (ETI) | Isodictya palmata | 1 | Mermaid's Glove (en) | 539311
8 | SHARKS (ETI) | Apristurus microps | 13 | ري غص ن ي ن يع ال (ar); 小眼光尾鲨 (cnm); 小眼光尾鯊 (cnm); Kleinäugiger Katzenhai (de); Kleinaugen-Tiefwasserkatzenhai (de); smalleye catshark (en); Smalleye cat shark (en); Pejegato puerco (es); Holbiche porc (fr); mažaakis juodasis katryklis (lt); Kleinoogkathaai (nl); малоглазая чёрная кошачья акула (ru); Акула кошача чорна дрібноока (uk) | 1002661
9 | CRABSJ (ETI) | Charybdis lucifera | 0 | | 2982733
10 | ORCHNG (ETI) | Calanthe villosa | 0 | | 1087713
11 | ORCHNG (ETI) | Bulbophyllum patella | 0 | | 346008
12 | HIFN-2 (ETI) | Caltha palustris palustris | 13 | Dotterblume (de); Sumpfdotterblume (de); marsh marigold (en); Cowslip (en); Marsh-marigold (en); yellow marsh marigold (en); yellow marsh-marigold (en); yellow marshmarigold (en); Hierba centella (es); Calta (es); Populage (fr); Calta palustre (it); calcea calului (ro) | 596646
13 | HIFN-2 (ETI) | Apera interrupta | 4 | dense silkybent (en); Dense Silky Bentgrass (en); dense silky-bent (en); silky bentgrass (en) | 1114148
14 | FNAM (ETI) | Bythites islandicus | 3 | 冰岛深蛇鳚 (cnm); 冰島深蛇鳚 (cnm); Blámævill (is) | 605382
15 | FNAM (ETI) | Osmerus mordax | 3 | rainbow smelt (en); Èperlan arc-en-ciel (fr); Éperlan arc-en-ciel (fr) | 357054
16 | INTBUTEU (ETI) | Argynnis paphia | 6 | Kaisermantel (de); Silver-washed Fritillary (en); Le Tabac d'Espagne (fr); Nagy gyöngyházlepke (hu); keizersmantel (nl); Obična sedefica (sr) | 154538
17 | Notebooks (UH) | Lepidoptera | 13 | Schmetterlinge (de); Butterflies and moths (en); butterflies (en); butterflies, butterflies and moths, and moths (en); moths (en); Hétérocères (fr); papillons (fr); papillons de nuit (fr); pepepe (mi); purerehua (mi); Mariposa (pt); Borboleta (pt); Чешуекрылые (rz) | 747
18 | Sahlberg (UH) | Haliplus variegatus Sturm, 1834 | 0 | | 3480985
19 | GloBIS | Colias erate | 2 | Eastern Pale Clouded Yellow (en); Stepski poštar (sr) | 173187
20 | GloBIS | rumina (Linnaeus, 1758 | 2 | Spanish Festoon (en); Borboleta-carnaval (pt) | 130552
21 | Dataflos | Agropyron cristatum (Schreb.) Pal. Beauv. var. incanum Nábělek | 4 | crested wheatgrass (en); crested wheat grass (en); žitnjak grebenčatyj (ru); житняк гребенчатый (ru) | 966731
22 | Dataflos | Cousinia nabelekii Bornm. | 0 | | 6267218
23 | Sound Archive | Phylloscopus collybita | 125 | Brown Leaf Warbler (en); Pouillot véloce (fr); Weiden-Laubsänger (de); Luì piccolo (it); etc. | 1052649
24 | Animal Sound Archive | Nucifraga caryocatactes | 33 | Tannenhäher (de); Cascanueces Común (es); Cassenoix moucheté (fr); Nocciolaia (it); etc. | 917366
25 | nbgb-openup | Meriania claussenii Triana | 0 | | 5440965
26 | nbgb-openup | Caryota mitis (Lour.) | 2 | Burmese fishtail palm (en); fishtail palm (en) | 1090464
27 | Kinorrhynca | Cephalorhyncha liticola Sørensen, 2008 | 0 | | 15537966
28 | Kinorrhynca | Echinoderes truncatus Higgins, 1983 | 0 | | 393052
29 | Herbarium W | Cyrtandra fulvovillosa Rech. | 0 | | 5642818
30 | Herbarium W | Eragrostis cylindriflora Hochst. | 1 | curlyleaf (en) | 1115637
31 | Herbarium W | Mimosa acerba Benth. | 0 | | 640750
32 | Herbarium WU | Thalictrum peltatum DC. | 0 | | 5530018
33 | Herbarium WU | Salix cavaleriei H. Lév. | 0 | | 2872044
34 | Herbarium WU | Carex hirta L. | 4 | Behaarte Segge (de); hammer sedge (en); Laiche hérissée (fr); Carice villosa (it) | 1124276
35 | Paleontologie Fauna | Reteporella beaniana (King, 1846) | 0 | | 600929
36 | Paleontologie Fauna | NO RESULTS | | |
37 | Paleontologie Flora | NO RESULTS | | |
38 | Entomology | Logima zetterstedti Jezek | 0 | | 434426
39 | Entomology | Lobaspis strigatipes Bolivar, I. 1898 | 0 | | 609919
40 | Botany | Alchemilla gracillima Rothm. | 0 | | 414389
41 | Botany | Escallonia illinita C. Presl | 0 | | 5553437
42 | | Heckel, 1843 | 6 | 波斯尼亚小鱥 (cnm); 波斯尼亞小鱥 (cnm); 裸体副鱥 (cnm); 裸體副鱥 (cnm); Slunka adriatická (cs); Naked (en) | 214485
43 | Zoology | ohridanus (Karaman, 1928) | 1 | Ohrid spirlin (en) | 4630050
44 | Zoology | Barbus rebeli Koller, 1926 | 2 | Western balkan barbel (en); Mrena e Fanit (sq) | 4624942
45 | Anthropology | Homo sapiens | 4 | human (en); man (en); Человек разумный (ru); Человек разумный современный (ru) | 327955
46 | RBINSopenup | Rosalia formosa | 0 | | 3434256
47 | RBINSopenup | Stigmodera gratiosa | 0 | | 3222265
48 | Herbarium Specimen | speciosa Hiern | 3 | florist's gloxinia (en); Brazilian gloxinia (en); gloxinia (en) | 5650163
49 | Herbarium Specimen | Gesneria purpurascens Urb. | 0 | | 5644278
50 | Herbarium Specimen | Yucca rupicola Scheele | 4 | Texas yucca (en); Twist-leaf Yucca (en); Twisted-leaf Yucca (en); twistedleaf yucca (en) | 1083606

12 LIST OF FIGURES

Figure 1 Ingesting records into Europeana (overall workflow)

Figure 2 The EDM Class hierarchy

Figure 3 Two contextual classes

Figure 4 Bibliography for “carassius gibelio” derived from BHL Web service “Bibliography by URL”

Figure 5 Information on “carassius gibelio” derived from EOL Web service “EOL API: Search”

Figure 6 Integration of the Ontology Data Gateway in the OpenUP process

Figure 7 Components of the Ontology Data Gateway

Figure 8 PDI Transformation to ESE

Figure 9 Pentaho Ontology Service access

Figure 10 Pentaho Variables for Ontology Service

Figure 11 Pentaho building Ontology Service Request URL

Figure 12 Pentaho Ontology Service Request

Figure 13 Pentaho Ontology Service accessing response

Figure 14 Pentaho Ontology Service accessing response content

Figure 15 Pentaho Ontology Service accessing response fields

Figure 16 Visual EDM representation of two related OpenUp! cultural heritage objects

Figure 17 Sample: edm:ProvidedCHO

Figure 18 Sample: ore:Aggregation

Figure 19 Sample: edm:WebResource

Figure 20 Methods and descriptions of APIs

Figure 21 Searching Argynnis paphia

Figure 22 Error message after entering the wrong URL

Figure 23 Searching on the EOL homepage

Figure 24 Result after searching for “Argynnis paphia”

Figure 25 Information on the butterfly Argynnis paphia

Figure 26 “Names” section with classifications

Figure 27 Common names of Argynnis paphia

Figure 28 Finding common names with the id of the search request

13 LIST OF TABLES

Table 1 Properties for edm:Agent

Table 2 Properties for edm:Place

Table 3 Properties for edm:TimeSpan

Table 4 Properties for skos:Concept

Table 5 Properties for edm:Event

Table 6 Properties for edm:PhysicalThing

Table 7 edm:ProvidedCHO and ABCD(EFG)

Table 8 edm:Aggregation and ABCD(EFG)

Table 9 edm:WebResource and ABCD(EFG)

Table 10 T3.4 Overview on current available and relevant vocabularies for metadata enrichment

Table 11 Matching test OpenUp! data and EOL

14 LIST OF REFERENCES

ABCD - Access to Biological Collection Data. http://wiki.tdwg.org/ABCD 19 Jul. 2012.

ABCDEFG - Access to Biological Collection Databases Extended for Geosciences. http://wiki.tdwg.org/twiki/bin/view/ABCD/DesignAbcdExtensions 23 Jul. 2012.

BioCASE Provider Software. http://www.biocase.org/products/provider_software/ 19 Jul. 2012.

EDM documentation. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.

Europeana Data Model Mapping Guidelines. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.

Europeana Data Model Definition v5.2.3. http://pro.europeana.eu/web/guest/edm-documentation 18 Jul. 2012.

Europeana Semantic Elements (ESE) documentation. http://pro.europeana.eu/web/guest/technical-requirements 19 Jul. 2012.
