Metadata Standards Directory WORKING GROUP

Metadata Standards Directory

Alex Ball

University of Bath

RDA Europe Webinar, 16 February 2016

Outline

Motivation: Why use metadata standards, and why many don’t

Prior work: Finding a perfect match

Methodology: The Working Group

Results: The Metadata Standards Directory

Next steps: The Metadata Standards Catalog

Acknowledgements

Why should I use a metadata standard?

Better discovery

Better context

Better reuse

Better ecosystem

- Less working things out from scratch
- More complete metadata
- Benefits of practising
- Better documentation of the standards
- Concentration of development attention and effort
- Better time-saving tools
- etc., etc.

So why doesn’t everyone use a metadata standard?

No suitable standard?

Metadata standards used, by share of responses (N = 1205/1329):

None       56.1%
My lab     22.1%
ISO         8.0%
Open GIS    8.0%
EML         7.9%
FGDC        7.9%
Other       6.8%
DC          2.2%
DwC         1.7%
DIF         1.0%

Figure: Metadata for scientific data (Source: Tenopir et al. 2011)
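The chart reports shares rather than counts. As a rough worked example, the shares can be converted to approximate response counts. This assumes the percentages are relative to the 1205 responses given in the axis label, which the slide does not state explicitly. All variable and function names are illustrative only.

```python
# Shares of responses per metadata standard, as printed on the slide.
SHARES = {
    "None": 56.1,
    "My lab": 22.1,
    "ISO": 8.0,
    "Open GIS": 8.0,
    "EML": 7.9,
    "FGDC": 7.9,
    "Other": 6.8,
    "DC": 2.2,
    "DwC": 1.7,
    "DIF": 1.0,
}

# Assumption: percentages are relative to the 1205 responses on the axis label.
N_RESPONSES = 1205


def approximate_counts(shares, n):
    """Convert percentage shares into rounded response counts."""
    return {name: round(pct * n / 100) for name, pct in shares.items()}


counts = approximate_counts(SHARES, N_RESPONSES)
total_share = sum(SHARES.values())

for name, count in counts.items():
    print(f"{name:10s} {count:4d}")
print(f"Sum of shares: {total_share:.1f}%")
```

The shares add up to more than 100%, which suggests that respondents could select more than one option.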

Too many standards?

(Source: Randall Munroe, CC BY-NC)

‘The nice thing about standards is that you have so many to choose from’ — Tanenbaum (1988)

Isn’t that, like, really hard?

Just fill out this simple form…

[Form: a metadata entry template with placeholder prompts: Title goes here; Author name goes here; Author; dataset; Dataset; Publisher name goes here; Language name; ISO 639-2b code; MIME type goes here, repeat as necessary; born digital; Number of records in your database, or size of file in bytes; Abstract goes here; Keyword goes here, repeat as necessary; Spatial coordinates; Temporal extent; Spatial extent in words; ID goes here; Location of record; Location for download; Usage restrictions or permissions; Record of related item; Sample citation goes here; Required software goes here; List of coordinates, comma separated; Type of coordinates goes here.]

Science Data Literacy Project

http://sdl.syr.edu/?page_id=32

Scientific Data Application Profile Scoping Study

http://www.ukoln.ac.uk/projects/sdapss/

Scientific Data Application Profile Scoping Study Report

Document details

Author: Alexander Ball, UKOLN, University of Bath
Date: 3rd June 2009
Version: 1.1
Document Name: sdapss.
Notes: Changes from version 1.0: Typographical corrections made. References added. Conclusions expanded.

Seeing Standards

[Figure: the poster ‘Seeing Standards: A Visualization of the Metadata Universe’. Content: Jenn Riley. Design: Devin Becker. Work funded by the Indiana University Libraries’ White Professional Development Award. Copyright 2009-2010 Jenn Riley. This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License. Each sliver of the graphic is a category; font size and color saturation indicate the strength of a standard’s connection to that category (weak, semi-weak, semi-strong or strong). The standards listed closest to the center of a sliver are those most strongly connected to the given category, and stars represent the standards that are used most often.]

Summary and Purpose

The sheer number of metadata standards in the cultural heritage sector is overwhelming, and their inter-relationships further complicate the situation. This visual map of the metadata landscape is intended to assist planners with the selection and implementation of metadata standards.

Each of the 105 standards listed here is evaluated on its strength of application to defined categories in each of four axes: community, domain, function, and purpose. The strength of a standard in a given category is determined by a mixture of its adoption in that category, its design intent, and its overall appropriateness for use in that category.

The standards represented here are among those most heavily used or publicized in the cultural heritage community, though certainly not all standards that might be relevant are included. A small subset of the standards plotted on the main visualization also appear as highlights above the graphic. These represent the most commonly known or discussed standards for cultural heritage metadata.

http://www.dlib.indiana.edu/~jenlrile/metadatamap/

BioSharing

https://biosharing.org/standards/

MMI Content Standard References

https://marinemetadata.org/conventions/content-standards

GEOSS Standards and Interoperability Registry

https://www.earthobservations.org/gci_sr.shtml

CINERGI

http://earthcube.org/group/cinergi

MSDWG Goals

1. Develop an RDA Metadata Standards Directory listing standards relevant for research data
   - Comprehensive
   - Easy for anyone to contribute or update
2. Define and develop use cases for research metadata
3. Develop a plan for long-term growth and maintenance of the directory

Stakeholder engagement

Stakeholders
- researchers
- data managers
- data scientists
- research support staff
- tool developers
- repositories
- funders
- publishers

Outreach
- CAMP-4-DATA at DC-2013, Lisbon: 26 participants from 15 countries
- International Digital Curation Conference 2014, San Francisco (A. Ball)
- RDA-EU Working Group Core Meeting, 2014, Garching, Munich (J. Greenberg)
- Working Group mailing list

Disciplinary Metadata | Digital Curation Centre

http://www.dcc.ac.uk/resources/metadata-standards

[Screenshot of the DCC ‘Disciplinary Metadata’ page (Home > Resources for digital curators > Disciplinary Metadata), labelled ‘Disciplinary Metadata Catalogue’ on the slide. See Ball (2013) for details.]

Disciplinary Metadata

While data curators, and increasingly researchers, know that good metadata is key for research data access and re-use, figuring out precisely what metadata to capture and how to capture it is a complex task. Fortunately, many academic disciplines have supported initiatives to formalise the metadata specifications the community deems to be required for data re-use. This page provides links to information about these disciplinary metadata standards, including profiles, tools to implement the standards, and use cases of data repositories currently implementing them.

For those disciplines that have not yet settled on a metadata standard, and for those repositories that work with data across disciplines, the General Research Data section links to information about broader metadata standards that have been adapted to suit the needs of research data.

Search by Discipline: Biology; Earth Science; General Research Data; Physical Science; Social Science & Humanities

Search by Resource Type
- Metadata Standards: Specifications for the minimum information that should be collected about research data in order for it to be re-used.
- Profiles and Extensions: Standards that have been adapted for use in particular types of repositories, or for particular types of data.
- Use cases: Institutional repositories and data portals using standards to determine which metadata should be collected upon data deposit.
- Tools: Software that has been developed to capture or store metadata conforming to a specific standard.

Issues to address

- UK selection bias
- Incompleteness
→ Conduct a worldwide survey to fill in the gaps

- Process for maintenance slow and opaque
- Little scope for future development
→ Migrate to a new platform to
  - make it easier and more transparent to contribute
  - support a larger team of volunteer editors
  - allow for future development

MSDWG Survey

- Conducted by Sean Chen and Cristina Perez
- Pilot phase: September 2013
- First phase: 8–22 October 2013
- Second phase: November 2013 – April 2014
- 41 responses:
  - 14 new standards
  - 4 new profiles
  - 13 new metadata tools
  - 19 new use cases
  - 18 updates
- See Perez (2013) for details

http://bit.ly/1fToaqd

Platform migration

- Site setup: Dec 2013 – Mar 2014, by Sean Chen
- Content migration: May – Aug 2014, by Kate Anne Alderete and Sean Chen
- Delivery preparation: Jan – Feb 2015, by Dustin Allen, Alex Ball, Sean Chen and Adrian Ogletree
- Launch: Mar 2015, at RDA Plenary 5

http://rd-alliance.github.io/metadata-directory/

Data model

[Diagram: the data model comprises six entity types: Metadata standard, More specific profile, Helper tool, Use case, Subject, and Discipline.]

Metadata standard record

[Screenshot: the RDA Metadata Directory page for MIDAS-Heritage, with links to view and add standards, extensions, tools and use cases, and to browse by subject area.]

MIDAS-Heritage

A British cultural heritage standard for recording information on buildings, archaeological sites, shipwrecks, parks and gardens, battlefields, areas of interest and artefacts. Sponsored by the Forum on Information Standards in Heritage, MIDAS Version 1.1 was released in October 2012.

Standard Website: http://www.english-heritage.org.uk/publications/midas-heritage/
Specification: http://www.english-heritage.org.uk/content/publications/publicationsNew/guidelines-standards/midas-heritage/midas-heritage-2012-v1_1.pdf
Related Vocabularies: INSCRIPTION
Subjects: Arts and Humanities; Social and Behavioral Sciences
Disciplines: Archaeology; Architecture; Building Conservation; Heritage Studies; Historical and Philosophical Studies; History by Area
Extensions: CARARE metadata schema. An application profile of the MIDAS Heritage standard intended for delivering metadata to the CARARE service environment about an organisation’s online collections, monument inventory database and digital objects.

- Page is generated from a simple, easy-to-edit text file.
- ‘Edit’ links give access to source.
- ‘Add’ links for quickly adding profiles, tools or use cases.
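A minimal sketch of how such a text file could be read, assuming the file opens with a header delimited by `---` lines that holds `key: value` fields and `- item` lists. The field names (`title`, `slug`, `subjects`, `disciplines`) are those of the MIDAS-Heritage record in this deck; the `StandardRecord` class and both functions are hypothetical, and the parser handles only this small subset of YAML (no nested mappings, no multi-line values). It is not the Directory's actual build code.

```python
from dataclasses import dataclass, field


@dataclass
class StandardRecord:
    """Hypothetical in-memory form of one directory record."""
    title: str = ""
    slug: str = ""
    subjects: list = field(default_factory=list)
    disciplines: list = field(default_factory=list)


def parse_front_matter(text):
    """Parse 'key: value' scalars and '- item' lists between '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("no front matter found")
    data = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # end of the header
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            # List item: attach it to the most recent key with an empty value.
            if isinstance(data.get(current_key), list):
                data[current_key].append(stripped[2:].strip())
            continue
        key, sep, value = stripped.partition(":")
        if sep:
            current_key = key.strip()
            value = value.strip()
            # A key with no inline value is assumed to start a list.
            data[current_key] = value if value else []
    return data


def to_record(data):
    """Build a StandardRecord, ignoring fields the sketch does not model."""
    def as_list(value):
        return list(value) if isinstance(value, list) else []

    return StandardRecord(
        title=str(data.get("title", "")),
        slug=str(data.get("slug", "")),
        subjects=as_list(data.get("subjects")),
        disciplines=as_list(data.get("disciplines")),
    )


SAMPLE = """---
title: MIDAS-Heritage
slug: midas-heritage
subjects:
- arts-and-humanities
- social-and-behavioral-sciences
disciplines:
- archaeology
- architecture
---
"""

record = to_record(parse_front_matter(SAMPLE))
print(record.title, record.subjects)
```

A real implementation would more likely use a full YAML parser; the hand-written subset here only keeps the sketch free of third-party dependencies.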

Metadata standard record

    ---
    title: MIDAS-Heritage
    slug: midas-heritage
    description:
      A British cultural heritage standard for recording information on buildings, archaeological ...
      MIDAS Version 1.1 was released in October 2012.
    website: http://www.english-heritage.org.uk/publicatio...
    subjects:
    - arts-and-humanities
    - social-and-behavioral-sciences
    disciplines:
    - history-area
    - heritage-studies
    - building-conservation
    - historical-and-philosophical-studies
    - architecture
    - archaeology
    specification_url: http://www.english-heritage.org.uk/...
    related_vocabularies:
    - name: INSCRIPTION
      url: http://fishforum.weebly.com
    ---

- Page is generated from a simple, easy-to-edit text file.
- ‘Edit’ links give access to source.
- ‘Add’ links for quickly adding profiles, tools or use cases.

Browsing the Directory

Indexes by subject for
- standards
- profiles
- tools
- use cases

Discipline-specific lists of all four

Index of disciplines by subject

Contributing

Usage

GitHub

DCC Website

[Pie chart: page views, Jan 1 – Dec 31, 2015. Total page views: 485 309. Segments: events, resources/how-guides, resources/metadata-standards, other. Shares shown: 71%, 14%, 9%, 6%.]

But we can do better than that…

Use cases (1)

Data providers and custodians would like to use the Directory
- to search or browse for metadata standards by what they describe – physical artifacts, video, etc.
- to compare standards side by side, especially to identify commonalities between the standards of different communities.
- to obtain recommendations of standards to use based on criteria they provide.
- to look up the persistent ID (PID) for a standard, for robust linking.

Librarians would like to use the Directory
- to inspect existing profiles of a standard as a first step to constructing their own.

Use cases (2)

Journal editors would like to use the Directory
- to check the maturity and level of support of existing standards, so they know which to recommend to authors.

Funders would like to use the Directory
- to find out which standards they have funded the development of, whether those standards are widely used, whether they have been kept up to date, and whether they might be merged into other standards.

Tool developers would like
- to submit a whole or partial dataset and retrieve a list of metadata standards which could be used to document it.
- to generate a ‘first attempt’ crosswalk between schemas automatically.

Use cases (3)

Tool developers would like

- to submit a set of field names to the Directory and retrieve the metadata standard from which they originate.
- to request from the Directory a sample of metadata records adhering to a specific standard.
- to retrieve a list of appropriate metadata standards based on the partial content from a draft data management plan.
- to submit a PID for a metadata standard to the Directory and retrieve the specification for the standard.
- to submit a pair of PIDs for metadata standards to the Directory and retrieve a suggested migration pathway.

Can you think of any others?

From Directory to Catalog
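To make the gap concrete, consider the tool-developer use case of submitting a set of field names and retrieving the standard they come from. The sketch below assumes element-level information is available, which the Directory records do not hold. The `ELEMENTS` table is a hypothetical, hand-written stand-in containing the fifteen Dublin Core elements and a few Darwin Core terms; `rank_standards` is likewise illustrative and not an existing API.

```python
# Hypothetical element lists for illustration only; the Darwin Core entry
# is a small subset of that standard's terms.
ELEMENTS = {
    "Dublin Core": {
        "title", "creator", "subject", "description", "publisher",
        "contributor", "date", "type", "format", "identifier",
        "source", "language", "relation", "coverage", "rights",
    },
    "Darwin Core": {
        "scientificName", "basisOfRecord", "decimalLatitude",
        "decimalLongitude", "eventDate", "occurrenceID",
    },
}


def rank_standards(field_names):
    """Return (standard, matched_count) pairs, best match first.

    Matching is case-insensitive and exact; standards with no matching
    field names are omitted.
    """
    wanted = {name.strip().lower() for name in field_names}
    scores = []
    for standard, elements in ELEMENTS.items():
        known = {element.lower() for element in elements}
        overlap = len(wanted & known)
        if overlap:
            scores.append((standard, overlap))
    # Sort by descending overlap, then alphabetically for stable output.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    print(rank_standards(["Title", "creator", "eventDate"]))
```

Exact name matching is the simplest possible choice; it cannot tell apart standards that share element names, which is one reason richer, element-level descriptions would be needed.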

Can only hope to satisfy such use cases
- with more detail about each standard/profile
  - information at the element level?
  - specifications in ‘native’ form?
  - specifications in a normalized form?
  - converters in a normalized form?
  - data types for which the standard is used?
  - whether tied to a given format, whether RDF-friendly?
- with an API for automated access to the information
- with structured outputs so tools can act on the information

Also need to make it even easier for people to contribute, directly or via tools.

Metadata Standards Catalog Working Group

- Recognized and endorsed on 18 January 2016
- By 18 July 2016, collect and analyse use cases to produce requirements and technical specification for the Catalog
- By 18 January 2017, develop prototype Catalog and identify adopters
- By 18 July 2017, evaluate and validate Catalog; add mappings from selection of standards to functional ‘packages’; refine user interface and API.

Please join us!

https://rd-alliance.org/groups/metadata-standards-catalog-working-group.

Acknowledgements

Fellow MSDWG co-chairs
- Jane Greenberg, MRC
- Keith Jeffery
- Rebecca Koskela, DataONE

DCC Disciplinary Metadata Catalogue
- Liz Bedford, DCC

Survey and GitHub work
- Sean Chen, MRC
- Cristina Perez, MRC
- Kate Anne Alderete, DataONE
- Adrian Ogletree, MRC

Thank you to Working Group members who suggested directory entries, provided use cases, and helped to steer the work.

References (1)

Ball, A. (2009), Scientific Data Application Profile Scoping Study Report, version 1.1, Scoping study (Bath, UK: UKOLN, University of Bath, 3 June), http://www.ukoln.ac.uk/projects/sdapss/papers/ball2009sda-v11.pdf.

Ball, A. (2013), ‘The DCC Disciplinary Metadata Catalogue’, Paper presented at the CAMP-4-DATA Workshop, International Conference on Dublin Core and Metadata Applications 2013, Lisbon, Portugal, http://dcevents.dublincore.org/IntConf/dc-2013/paper/view/203.

Perez, C. I. (2013), ‘The RDA’s Metadata Standards Directory: Information gathering’, Unpublished master’s paper (University of North Carolina, Chapel Hill).

References (2)

Qin, J., Small, R., and D’Ignazio, J. (2008), ‘Metadata standards’, Syracuse University, Science Data Literacy Project, http://sdl.syr.edu/?page_id=32.

Riley, J. and Becker, D. (2010), ‘Seeing Standards: A Visualization of the Metadata Universe’, Indiana University Libraries, http://www.dlib.indiana.edu/~jenlrile/metadatamap/.

Tanenbaum, A. S. (1988), Computer Networks, (2nd edn., Upper Saddle River, NJ: Prentice-Hall), ISBN: 0-13-162959-X.

Tenopir, C. et al. (2011), ‘Data Sharing by Scientists: Practices and Perceptions’, PLoS ONE, 6/6: e21101, doi: 10.1371/journal.pone.0021101.

Thank you for your attention

Alex Ball: http://alexball.me.uk/

Metadata Standards Directory Working Group: https://rd-alliance.org/groups/metadata-standards-directory-working-group.html