Web Open Standards for Linked Data and Knowledge Graphs As Enablers of EU Digital Sovereignty
Fabien Gandon, http://fabien.info

PROFILE
. Graduated Engineer INSA Applied Math, DEA/Master Image & Vision
. PhD & HDR (Habilitation) in computer science
. Research Director / Senior researcher, Inria
. Leader Wimmics (UCA, Inria, CNRS, I3S) on Campus Sophia Antipolis
. Advisory Committee of W3C
. Responsible research convention French Ministry of Culture – Inria
. Vice-head of Science for Inria Sophia Antipolis

WIMMICS TEAM
DR/Professors:
. Fabien GANDON, Inria, AI, KRR, Semantic Web, Social Web, K. Graphs
. Nhan LE THANH, UCA, Logics, KR, Emotions, Workflows, K. Graphs
. Peter SANDER, UCA, Web, Emotions
. Andrea TETTAMANZI, UCA, AI, Logics, Evo, Learning, Agents, K. Graphs
. Marco WINCKLER, UCA, Human-Computer Interaction, Web, K. Graphs
CR/Assistant Professors:
. Michel BUFFA, UCA, Web, Social Media, Web Audio, K. Graphs
. Elena CABRIO, UCA, NLP, KR, Linguistics, Q&A, Text Mining, K. Graphs
. Olivier CORBY, Inria, KR, AI, Sem. Web, Programming, K. Graphs
. Catherine FARON-ZUCKER, UCA, KR, AI, Semantic Web, K. Graphs
. Damien GRAUX, Inria, Linked Data, Sem. Web, Querying, K. Graphs
. Serena VILLATA, CNRS, AI, Argumentation, Licenses, Rights, K. Graphs
Research engineer:
. Franck MICHEL, CNRS, Linked Data, Integration, DB, K. Graphs
External:
. Andrei Ciortea (University of St. Gallen) Agents, WoT, Sem. Web, K. Graphs
. Nicolas DELAFORGE (Mnemotix) Sem. Web, KM, Integration, K. Graphs
. Alain GIBOIN (Retired CR Inria), Interaction Design, KE, User & Task, K. Graphs
. Freddy LECUE (Thales, Montreal) AI, Logics, Mining, Big Data, S. Web, K. Graphs

STANDARDS FOR DATA & KNOWLEDGE GRAPHS ON THE WEB
[Figure: standards for data and knowledge graphs on the Web: URI, IRI, URL, HTTP, HTML, RDF, Linked Data, RDFa, SHACL, SPARQL, LDP, XML, JSON, GRDDL, JSON-LD, R2RML, CSV-LD, Turtle/N3, N-Triple, RDFS, RDF/XML, TriG, N-Quad, OWL]

(1/8) Web open standards

World Wide Web Consortium
an international community leading the Web to its full potential since 1994, i.e. building an open, interoperable Web that works for everyone, by developing freely available and open standards for it.
In 2016, Tim Berners-Lee received the Turing Award for his invention of the Web.
. Over 430 member organizations around the world
. The not-for-profit organization's staff of 50 supported by Membership dues
. Over 12,000 developers worldwide
. 38 working groups + 10 interest groups + 350 Business Groups and Community Groups
. Hundreds of open technologies that power… browsers, smart phones, ebook readers, set top boxes, automobiles, search engines, social media, trillions of dollars of online commerce, and more than a billion Web sites
[Figure: examples of former or current members]
[Figure: examples of standards, for instance… html, uri, http, url, css, xml, xpath, xslt, xquery, xlink, xpointer, xproc, dom, svg, png, mathml, woff, ttml, soap, wsdl, rdf, rdfs, owl, sparql, skos, rdfa, grddl, rif, sawsdl, powder, earl, aria, wcag, uaag, atag, iri, …]

(2/8) Web open standards for… distributed, interoperable hypermedia

A HYPERMEDIA linking everything…

three components of the Web architecture
1. identification (URI) & address (URL)
   ex. http://www.inria.fr
2. communication / protocol (HTTP)
   GET /centre/sophia HTTP/1.1
   Host: www.inria.fr
3. representation language (HTML)
   Fabien works at <a href="http://inria.fr">Inria</a>
[Figure: the WEB as a triangle of URL (address), HTTP (communication) and HTML (reference)]
[Tim Berners-Lee et al., 1994]

(3/8) Web open standards for… distributed, interoperable identifiers

Universal Resource Locator / Identifier
[Figure: the WEB triangle, with URL replaced by URI]
URL: identify what exists on the web, ex. http://my-site.fr
URI: identify, on the web, what exists, ex. http://animals.org/this-zebra
• URI for Paris in DBpedia: http://dbpedia.org/resource/Paris
• URI for name of Victor Hugo in the Library of Congress: http://id.loc.gov/authorities/names/n79091479
• The MUC18 protein at UniProt: http://www.uniprot.org/uniprot/P43121
• Xavier Dolan in Wikidata: https://www.wikidata.org/wiki/Special:EntityData/Q551861
• The book with doi:10.1007/3-540-45741-0_18: http://dx.doi.org/10.1007/3-540-45741-0_18
• URIs for everything, e.g. identifying 1025 car configurations [François-Paul Servant et al. ESWC 2012]

(4/8) Web open standards for… distributed, interoperable data

RDF: a Web standard for knowledge graphs
[Figure: the WEB triangle, with HTML replaced by RDF: URI (address), HTTP (communication), RDF (reference)]

a Web approach to data publication
« http://fr.dbpedia.org/resource/Paris » ???...
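The idea of RDF as a graph whose nodes and arcs are named by URIs can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not a real RDF toolkit: the `Triple` type, the helper names and the `http://example.org/vocab#locatedIn` property are hypothetical, and the two statements are written by hand for the example rather than retrieved from DBpedia.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One arc of the graph: subject --predicate--> object."""
    subject: str                  # URI of the resource being described
    predicate: str                # URI of the property
    obj: str                      # URI of another resource, or a plain string value
    obj_is_literal: bool = False  # True when obj is a string value, not a URI


def format_term(value: str, is_literal: bool = False) -> str:
    """Render a URI as <...> or a plain literal as a quoted, escaped string."""
    if not is_literal:
        return f"<{value}>"
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(triples) -> str:
    """Serialize triples one per line, in the style of the N-Triples syntax."""
    return "\n".join(
        f"{format_term(t.subject)} {format_term(t.predicate)} "
        f"{format_term(t.obj, t.obj_is_literal)} ."
        for t in triples
    )


PARIS = "http://dbpedia.org/resource/Paris"
FRANCE = "http://dbpedia.org/resource/France"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
LOCATED_IN = "http://example.org/vocab#locatedIn"  # hypothetical property

# Hand-written, illustrative statements (not fetched from DBpedia).
graph = [
    Triple(PARIS, RDFS_LABEL, "Paris", obj_is_literal=True),
    Triple(PARIS, LOCATED_IN, FRANCE),
]

ntriples = to_ntriples(graph)
print(ntriples)
```

Because every subject, property and linked resource is a URI, two datasets that reuse the same URI for Paris describe the same node, which is what lets independently published graphs connect.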
[Figure, built in three steps: an HTTP GET on the URI returns HTML, … or RDF]

linked data
The MUC18 protein at UniProt http://www.uniprot.org/uniprot/P43121

linked open data(sets) cloud on the Web
[Figure: number of linked open datasets on the Web; y-axis from 0 to 1400, x-axis from 5/1/2007 to 1/26/2017]

Smarter Cities' knowledge graphs
IBM Dublin [Lécué et al., 2015]
(also for private KGs behind firewalls)

(5/8) Web open standards for… distributed interoperable access

SPARQL: Get Data, Not Documents
ex. DBpedia: 31 185 377 686 RDF triples extracted and mapped

DBPEDIA.FR
. 180 000 000 arcs in an encyclopedic knowledge graph
. public dumps, endpoints, interfaces, APIs…
. number of queries per day: 70 000 on average, 2.5 million max

COVID LINKED DATA [Gandon, Michel, Gazzotti, Mayer, Cabrio, Corby, Menin, Winckler, Villata et al. 2020]
. integrate multiple datasets in heterogeneous formats
. perform information extraction, inferences, validation
. provide a public end-point and visualization services

(6/8) Web open standards for… distributed interoperable validation

SHACL is a language for describing and validating pieces (shapes) of RDF knowledge graphs
e.g. every Person must have one and only one name
used for validation, description, interaction, integration, code generation,… [Corby et al., 2019]

ONTOLOGY FOR AI ITSELF (AI4EU)
This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement 825619.
. ontology and metadata of AI resources
. SHACL to validate these RDF graphs
. online endpoint http://corese.inria.fr
. predefined SPARQL queries, SHACL shapes, display

(7/8) Web open standards for… distributed, interoperable vocabularies

RDFS to declare classes of resources and properties of your knowledge graph and organize their hierarchy
[Figure: example hierarchy with the classes Document, Report and Person and the properties creator and author]

OWL in one…
[Figure: OWL constructs: union, intersection, complement, disjunction, disjoint union, restriction, value restriction, cardinality, qualified cardinality, enumeration, equivalence, algebraic properties, disjoint properties, chained prop., individual prop. neg., keys, …]

PRIMEGE
Sample record:
  Sex: M
  Date: 25/04/2012
  Cause: tetanus vaccine
  CISP2: A44
  ...
  History: appendicitis
  Observations: EN CP - good general condition - clear lung auscultation; regular heart sounds without murmur - eardrums OK

Element                      Number
Patients                     55 823
Consultations                364 684
Past medical history         187 290
Biometric data               293 908
Semiotics                    250 669
Diagnosis                    117 442
Row of prescribed drugs      847 422
Symptoms                     23 488
Health care procedures       11 850
Additional examination       871 590
Paramedical prescription     17 222
Observations/notes           56 143

PREDICT HOSPITALIZATION [Gazzotti, Faron et al. 2020]
. Predict hospitalization from physicians' records (classification)
. Augment records data with Web knowledge graphs
. Study impact on prediction

SKOS
thesaurus, lexicon
[Figure: the concepts #Mathematics, #Algebra and #LinearAlgebra linked by skos:broader and skos:narrower, together with the derived skos:broaderTransitive and skos:narrowerTransitive links]
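The relation between the direct SKOS links and their transitive counterparts can be sketched as a small graph traversal. This is an illustrative sketch only, assuming the three concepts of the figure form the chain #LinearAlgebra → #Algebra → #Mathematics through skos:broader. The dictionary layout and the function names `broader_transitive` and `narrower_transitive` are hypothetical, and a real thesaurus would be stored as RDF and queried with a triple store.

```python
# Direct skos:broader links, keyed by the narrower concept.
# Assumed from the figure: LinearAlgebra is narrower than Algebra,
# which is narrower than Mathematics.
broader = {
    "#LinearAlgebra": {"#Algebra"},
    "#Algebra": {"#Mathematics"},
}


def broader_transitive(concept, broader_links):
    """Return every concept reachable through one or more broader links."""
    reached = set()
    to_visit = list(broader_links.get(concept, ()))
    while to_visit:
        current = to_visit.pop()
        if current not in reached:  # also guards against cycles in the data
            reached.add(current)
            to_visit.extend(broader_links.get(current, ()))
    return reached


def narrower_transitive(concept, broader_links):
    """Return every concept that has `concept` among its transitive broader concepts."""
    return {
        candidate
        for candidate in broader_links
        if concept in broader_transitive(candidate, broader_links)
    }


ancestors = broader_transitive("#LinearAlgebra", broader)
descendants = narrower_transitive("#Mathematics", broader)
print(sorted(ancestors))
print(sorted(descendants))
```

Keeping only the direct links in the data and deriving the transitive ones on demand mirrors the way the figure separates skos:broader from skos:broaderTransitive.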