Lessons Learned While Building the Vocabulary Mapping Tool Cocoda ELAG Berlin

Lessons Learned While Building the Vocabulary Mapping Tool Cocoda ELAG Berlin

Lessons learned while building the vocabulary mapping tool Cocoda ELAG Berlin Jakob Voß Stefan Peters Uma Balakrishnan 2019-05-08 Lessons learned building Cocoda (2019-05-08) Introducing coli-conc Lessons learned building Cocoda (2019-05-08) Colibri, the mother of all Context generation and linguistic tools for bibliographic retrieval interfaces, conducted by our colleauge Ulrike Reiner since 2002 I automatic DDC classification (coli-auto) I automatic checking of DDC number correctness (coli-corr) I automatic analysis and synthesis of DDC numbers (coli-ana) I semi-automatic creation of DDC concordance (coli-conc) ) Domain expert and previous works to build on Lessons learned building Cocoda (2019-05-08) coli-ana, an example Lessons learned building Cocoda (2019-05-08) coli-conc, an example Lessons learned building Cocoda (2019-05-08) I in particular library classifications I Dewey Decimal Classification (DDC) I Regensburger Verbundklassifikation (RVK) I Basisklassifikation (BK) I several local classification schemes I … coli-conc, the idea I a mapping-tool to avoid aching hands I facilitate creation and management of concordances between knowledge organization systems (KOS) Lessons learned building Cocoda (2019-05-08) coli-conc, the idea I a mapping-tool to avoid aching hands I facilitate creation and management of concordances between knowledge organization systems (KOS) I in particular library classifications I Dewey Decimal Classification (DDC) I Regensburger Verbundklassifikation (RVK) I Basisklassifikation (BK) I several local classification schemes I … Lessons learned building Cocoda (2019-05-08) 2. Collect and publish existing mappings I DDC/RVK/GND/BK/STW/LCSH/IxTheo 384.491 mappings I Wikidata 3.607.683 mappings (as of 6/2018) I additional mappings not converted yet 3. Create mapping tool ) Cocoda! coli-conc, to-do list 1. Collect KOS metadata ) BARTOC registry Lessons learned building Cocoda (2019-05-08) 3. Create mapping tool ) Cocoda! coli-conc, to-do list 1. Collect KOS metadata ) BARTOC registry 2. Collect and publish existing mappings I DDC/RVK/GND/BK/STW/LCSH/IxTheo 384.491 mappings I Wikidata 3.607.683 mappings (as of 6/2018) I additional mappings not converted yet Lessons learned building Cocoda (2019-05-08) coli-conc, to-do list 1. Collect KOS metadata ) BARTOC registry 2. Collect and publish existing mappings I DDC/RVK/GND/BK/STW/LCSH/IxTheo 384.491 mappings I Wikidata 3.607.683 mappings (as of 6/2018) I additional mappings not converted yet 3. Create mapping tool ) Cocoda! Lessons learned building Cocoda (2019-05-08) Cocoda 2014: first prototype with AngularJS Lessons learned building Cocoda (2019-05-08) Cocoda 2015-2016 I Write project grant and receive additional funding I Continue with specification (esp. JSKOS data format) I Create a new implementation Lessons learned building Cocoda (2019-05-08) JSKOS data format for Knowledge Organization Systems I JSON-LD for SKOS (prefLabel, broader, narrower…) I Additional properties from other ontologies (url, next, previous, startDate, endDate…) I Additional classes for mappings, concordances, registries… ) https://gbv.github.io/jskos/ Lessons learned building Cocoda (2019-05-08) Getting data into JSKOS format 1. CSV, MARCXML, SKOS, … −! Cleanup 2. CSV, MARCXML, SKOS −! JSKOS Using at least three different tools: I skos2jskos (Perl) I jskos-convert (NodeJS) I mc2skos (Python) I scripts for minor JSON adjustments (jq) Lessons learned building Cocoda (2019-05-08) Cocoda 2015-2016 I Write project grant and receive additional funding I Continue with specification (especially JSKOS data format) I Create a new implementation Lessons learned building Cocoda (2019-05-08) I and throw it away afterwards Cocoda 2015-2016 I Write project grant and receive additional funding I Continue with specification (especially JSKOS data format) I Create a new implementation I as monolithical Java application Lessons learned building Cocoda (2019-05-08) Cocoda 2015-2016 I Write project grant and receive additional funding I Continue with specification (especially JSKOS data format) I Create a new implementation I as monolithical Java application I and throw it away afterwards Lessons learned building Cocoda (2019-05-08) coli-conc 2018: start new from scratch (Node & Vue) Lessons learned building Cocoda (2019-05-08) coli-conc 2019: curent layout Lessons learned building Cocoda (2019-05-08) Live Demo https://coli-conc.gbv.de/cocoda/app/ Lessons learned building Cocoda (2019-05-08) Infrastructure I jskos-server & DANTE terminology registry (JSKOS-API) I mapping suggestions (OpenRefine Reconciliation API) I login-server (OAuth) Lessons learned building Cocoda (2019-05-08) Lessons learned Lessons learned building Cocoda (2019-05-08) ) spreadsheets are not perfect but actually useful People stick to spreadsheets I limited I 1-to-n mappings I repeatable fields I leading zeroes I … I at least better than MS Word I you think CSV, they think Excel Lessons learned building Cocoda (2019-05-08) People stick to spreadsheets I limited I 1-to-n mappings I repeatable fields I leading zeroes I … I at least better than MS Word I you think CSV, they think Excel ) spreadsheets are not perfect but actually useful Lessons learned building Cocoda (2019-05-08) ) Communicate! Software development is communication I listen I what is actually wanted? I how do the involved parties work? I understand I face2face meetings to find a common language I most problems are communication problems I explain I pros and cons of technical decisions I basic such as URIs and Open Data I not all features are implemented first Lessons learned building Cocoda (2019-05-08) Software development is communication I listen I what is actually wanted? I how do the involved parties work? I understand I face2face meetings to find a common language I most problems are communication problems I explain I pros and cons of technical decisions I basic such as URIs and Open Data I not all features are implemented first ) Communicate! Lessons learned building Cocoda (2019-05-08) ) Never trust any data you haven’t validated! No schema, no data quality I Notations and identifiers must match regular expressions I JSON Schema helped to find inconsistencies in JSON data I Additional constraints not expressible in JSON Schema I empty strings, empty arrays, null values… Lessons learned building Cocoda (2019-05-08) No schema, no data quality I Notations and identifiers must match regular expressions I JSON Schema helped to find inconsistencies in JSON data I Additional constraints not expressible in JSON Schema I empty strings, empty arrays, null values… ) Never trust any data you haven’t validated! Lessons learned building Cocoda (2019-05-08) Easy to replace parts of the infrastructure I RVK API ! own DB ! DANTE ! jskos-server I PHP ! Perl ! JavaScript ) split your application into decoupled services! Holy decoupling I service oriented architecture (SOA) I APIs and data formats matter most I Things will break anyway Lessons learned building Cocoda (2019-05-08) ) split your application into decoupled services! Holy decoupling I service oriented architecture (SOA) I APIs and data formats matter most I Things will break anyway Easy to replace parts of the infrastructure I RVK API ! own DB ! DANTE ! jskos-server I PHP ! Perl ! JavaScript Lessons learned building Cocoda (2019-05-08) Holy decoupling I service oriented architecture (SOA) I APIs and data formats matter most I Things will break anyway Easy to replace parts of the infrastructure I RVK API ! own DB ! DANTE ! jskos-server I PHP ! Perl ! JavaScript ) split your application into decoupled services! Lessons learned building Cocoda (2019-05-08) ) Agile development requires some agile users! Look out for beneficial beta-users I users prefer a mature product instead buggy prototype I real use-cases and outcome instead of click-around testing I use-cases will drive development in different directions Lessons learned building Cocoda (2019-05-08) Look out for beneficial beta-users I users prefer a mature product instead buggy prototype I real use-cases and outcome instead of click-around testing I use-cases will drive development in different directions ) Agile development requires some agile users! Lessons learned building Cocoda (2019-05-08) ) Good luck! Patience and luck I Colibri started in 2002 I coli-conc started in 2013 I Current development cycle started in 2018 Lessons learned building Cocoda (2019-05-08) Patience and luck I Colibri started in 2002 I coli-conc started in 2013 I Current development cycle started in 2018 ) Good luck! Lessons learned building Cocoda (2019-05-08) Summary Lessons learned building Cocoda (2019-05-08) We build… I JSKOS data format to unifiy KOS & concordances data I several web services to process and share mapping data I a web application based on the format and services Lessons learned building Cocoda (2019-05-08) We learned to… I stop worrying about spreadsheets I listen, understand, and explain I inist on schemas for data quality I decouple and rewrite services I work together with beta-user I be more patient and use lucky chances Lessons learned building Cocoda (2019-05-08) Feedback is welcome! https://coli-conc.gbv.de/ the project https://coli-conc.gbv.de/cocoda/ the mapping-tool https://github.com/gbv/cocoda/ the source code https://gbv.github.io/cocoda/ the documentation https://github.com/gbv/cocoda/issues the issue tracker Lessons learned building Cocoda (2019-05-08).

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    39 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us