Linked Open Genealogy

Porting Genealogy Commons to the Web of Data
© Bernard Vatant, 2019

Genealogy: data about people
• People are entities with data properties
  – name, date of birth, date of death ...
• Linked together by object properties
  – mother, father, spouse ...
• Linked to other entities
  – places, events, organisations, works ...
• Genealogy is naturally linked data!

Genealogy as Big Data Business
• Ancestry.com
  – 10 billion records, 3 million paying subscribers
• MyHeritage
  – 9 billion records, 35 million family trees
• Geni.com
  – Over 100 million records, 11 million users
  – Owned by MyHeritage since 2012
• And many more ...
  – Genealogy is a trendy market
• Source: https://en.wikipedia.org/wiki/List_of_genealogy_databases

Big Business means Data Silos
• Proprietary data formats and APIs
  – As everywhere ...
• De facto standard: GEDCOM
  – Non-extensible data model
  – Strong cultural-religious bias (WASP)
  – No support for linked entities (places etc.)
  – No support for standard identifiers (ISNI, VIAF)

Tragedy of the Genealogy Commons
• Genealogy is about our common history
  – The oldest ancestors are the most widely shared
  – Some are shared today by millions of descendants!
• Genealogy Commons should be open data!
  – And of course, free, standard, open linked data!
• Some efforts to regain the Commons
  – WikiTree: 20 million records, 600,000 members
  – WeRelate: 2.5 million records
  – Open data, but still not part of the Web of Data

Genealogy in Wikidata
• Over 5 million "items" of type "human"
  – Based on (fuzzy) "notability criteria"
  – ~ 400,000 hold at least one parenthood property
  – ~ 200,000 hold the "has father" relationship
  – ~ 44,000 hold the "has mother" relationship
  – ~ 38,000 are linked to both parents
• External identifiers from genealogy databases
  – ~ 60,000 WikiTree identifiers
  – ~ 25,000 WeRelate identifiers
Figures as of April 2019

A real challenge
• Genealogy Linked Open Data is still small
  – 3 orders of magnitude below business databases
• Scaling from the million to the billion range...
  – ... needs a strong business model
• Some tricky legal and ethical issues
  – Privacy of living people
  – Meaning of notability
  – Time limits of the Genealogy Commons

Way forward #1
• Adding more genealogy to Wikidata
  – Native linked data
  – Extensible vocabulary
  – Links to many databases and identifiers
• Issues
  – Notability policy
  – No support for privacy
  – Scalability

Way forward #2
• Support development of WikiTree
  – Open data
  – Good collaboration interface
  – Active community
• Issues
  – No linked data publication so far
  – Uses GEDCOM as interchange format
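
As an illustration of the "genealogy is naturally linked data" idea from the opening slide, here is a minimal Python sketch that stores people as (subject, property, value) statements and follows the parent links. Everything in it is an assumption made for illustration: the `ex:` identifiers, the sample people, the property names and the helper functions are hypothetical and do not come from the slides or from any particular vocabulary. It also simplifies by keeping a single value per property.

```python
from collections import defaultdict

# Hypothetical sample data: each statement is a (subject, property, value) triple.
# Data properties hold literals; object properties point at other entities.
TRIPLES = [
    ("ex:anna", "name", "Anna Example"),
    ("ex:anna", "birthDate", "1850-03-01"),
    ("ex:paul", "name", "Paul Example"),
    ("ex:marie", "name", "Marie Example"),
    ("ex:marie", "father", "ex:paul"),
    ("ex:marie", "mother", "ex:anna"),
    ("ex:marie", "birthPlace", "ex:lyon"),
    ("ex:jean", "name", "Jean Example"),
    ("ex:jean", "father", "ex:paul"),
]

# Properties whose value is another entity rather than a literal (assumed list).
OBJECT_PROPERTIES = {"father", "mother", "spouse", "birthPlace"}


def index_by_subject(triples):
    """Group statements by subject; simplification: one value per property."""
    index = defaultdict(dict)
    for subject, prop, value in triples:
        index[subject][prop] = value
    return index


def parents(index, person):
    """Return the known parent links of a person as {role: identifier}."""
    props = index.get(person, {})
    return {role: props[role] for role in ("father", "mother") if role in props}


def links(index, person):
    """Return every object property of a person, i.e. its links to other entities."""
    props = index.get(person, {})
    return {p: v for p, v in props.items() if p in OBJECT_PROPERTIES}


def count_with_both_parents(index):
    """Count entities that are linked to both a father and a mother."""
    return sum(1 for props in index.values() if "father" in props and "mother" in props)


index = index_by_subject(TRIPLES)
print(parents(index, "ex:marie"))
print(links(index, "ex:marie"))
print(count_with_both_parents(index))
```

The last function mirrors, on toy data, the kind of count quoted on the Wikidata slide (people linked to both parents). A real dataset would need multi-valued properties and shared, resolvable identifiers instead of the made-up `ex:` names.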
