Linked Data past, present and futures
Pierre-Yves Vandenbussche ESSnet May 27th 2019
@pyvandenbussche
Introduction – Python notebook
Outline
• 1. Semantics on the Web
  − Vision and construction
• 2. Linked Data: Pragmatic use
• 3. Promising futures
1. Semantics on the Web: Vision and construction
Creation of the Web (1989)
• A “Graph” view of the Web with semantic relations between documents and entities
https://www.w3.org/History/1989/proposal.html
Theory towards the Semantic Web (1994)
• “Flat” view of the Web for a computer. No typed relations
• A Web for agents
• Coordination needed:
https://www.w3.org/Talks/WWW94Tim/
https://videos.cern.ch/record/2671957
Construction of the Semantic Web (1999-2012)
• SPARQL – Querying
• OWL – Formal semantics
• RDF-S – Semantics
• RDF – Representation
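The bottom and top layers of this stack (representation as triples, querying by pattern) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the `ex:` names are hypothetical, the `match` helper is not part of any RDF library, and a `None` argument stands in for a SPARQL variable.

```python
from typing import Iterator, Optional, Set, Tuple

# An RDF statement is a (subject, predicate, object) triple.
Triple = Tuple[str, str, str]

# Hypothetical example graph; the "ex:" identifiers are made up for illustration.
GRAPH: Set[Triple] = {
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:alice", "ex:attends", "ex:semweb101"),
    ("ex:bob", "rdf:type", "ex:Person"),
}


def match(
    graph: Set[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching a pattern; None acts as a wildcard,
    loosely like an unbound variable in a SPARQL triple pattern."""
    for triple in graph:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


# Roughly: SELECT ?x WHERE { ?x rdf:type ex:Student }
students = sorted(t[0] for t in match(GRAPH, p="rdf:type", o="ex:Student"))
print(students)
```

A real deployment would use an RDF store and a SPARQL engine; the point here is only that the data model is a set of typed edges and that a query is a pattern over them.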
2. Linked Data: Pragmatic use of SW
Technology adoption ≠ Original thoughts
Shift from Semantic Web to Linked Data
• Ontology / AI implication fails to address most real-world data, which is uncertain, incomplete, inconsistent and includes errors
  − Relations between concepts can be defined as logical entailments in a formal system (Student(?x) => Person(?x))
• Linked Data
  − Emphasis on sharing information in the form of a graph
  − Facilitating data integration through common vocabularies with light formal commitment
  − Constraints
    o ShEx (extension for Wikidata, May 2019) / SHACL
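The entailment Student(?x) => Person(?x) mentioned above can be sketched as a small forward-chaining loop. This is a minimal illustration, assuming the rule is encoded as an `rdfs:subClassOf` triple; the `ex:` names and the `rdfs_closure` function are hypothetical and only the type-propagation rule is applied, not the full RDFS rule set.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def rdfs_closure(graph: Set[Triple]) -> Set[Triple]:
    """Apply one entailment rule until nothing new is derived:
    (C rdfs:subClassOf D) and (x rdf:type C)  =>  (x rdf:type D)."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in inferred if p == "rdfs:subClassOf"]
        typed = [(s, o) for s, p, o in inferred if p == "rdf:type"]
        new = {
            (x, "rdf:type", parent)
            for child, parent in subclass_pairs
            for x, cls in typed
            if cls == child
        }
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred


# Hypothetical data: Student(?x) => Person(?x), plus one more level.
graph: Set[Triple] = {
    ("ex:Student", "rdfs:subClassOf", "ex:Person"),
    ("ex:Person", "rdfs:subClassOf", "ex:Agent"),
    ("ex:alice", "rdf:type", "ex:Student"),
}

closure = rdfs_closure(graph)
print(sorted(t for t in closure if t not in graph))
```

The sketch also hints at the limitation noted in the slide: every derived fact is only as reliable as the asserted triples, so a single wrong `rdf:type` statement propagates through the whole class hierarchy.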
Shift from Semantic Web to Linked Data
• Reflected in industry
  − Search engines:
    o Google 2012 (Peter Norvig) vs
    o Google 2012 (Ramanathan V. Guha)
  − Wikipedia (Jimmy Wales) https://www.youtube.com/watch?v=MY4s8uuHmy0
• W3C broadened its Semantic Web activity (https://www.w3.org/2001/sw/), giving rise to the Data Activity / Web of Data (https://www.w3.org/2013/data/)
[Figure: Gartner hype cycle, 2012 and 2015. Labels: “Peak of inflated expectations”, “Trough of disillusionment”, “2012 Semantic Web”, “2015 Linked Data”]
Linked Data successes / limitations
• Common semantics for a community:
  − Schema.org: Web pages metadata / Enhanced search engines
  − Museum and art – Getty
  − Library – DC/Bibframe
  − Statistical data – ESSnet!
• Linked Open Data
  − DBpedia, Wikidata, Eurostat, etc.
• Semantic pipeline
  − BBC
  − Thomson Reuters
• Limitations
  − Cost / incentive
  − Use Feedback
  − Tools and maintenance
Jem Rayfield, https://www.slideshare.net/JemRayfield/dsp-bbcjem-rayfieldsemtech2011
The Rise of Graph data
• Enterprise Knowledge Graphs
  − Google Knowledge Vault
  − Microsoft Academic KG
  − Facebook Graph Search
  − …
• Graph mining
• Convergence with Property Graphs
  − RDF* (https://www.w3.org/Data/events/data-ws-2019/)
3. Promising futures: Knowledge Discovery and Sentient Web
Knowledge Discovery
• Literature-based discovery - Swanson 1980
• Panama papers (2016)
  − Neo4j
  − Linkurious
• Discovery of Cancer related protein interactions
“Artificial Intelligence to win Nobel Prize and Beyond” Hiroaki Kitano - ISWC 2016
Discovery of Cancer related protein interactions
[Figure: Data and prediction of new phosphorylation relations]
• Data sources: unstructured data and LOD
• Sample unstructured text (labelled 18060184): “Angiogenesis was expressed in the majority of cases. In CRC, the microvascular density (MVD) was higher than that from ACC. The ratio CD31/CD105 was 1 in ACC and 3 in CRC. VEGF was positive in 25% of ACC and 80% of CRC. In CRC were more mature vessels, marked only with CD31 than immature vessels or endothelial isolated cells marked with both CD31 and CD105. In ACC prevailed the neoformed vessels marked with both CD31 and CD105.”
• Graph nodes: ESR1, PIK3CA, SGK1, AKT1, PDPK1, Tamoxifen, Copanlisib, GSK650394
• Site labels: S102, T37, T291, S74, T65, T369, S529
• Legend: Known Links, Fujitsu Prediction
Sentient Web (Graphs + IoT + AI/ML)
“Ecosystems of services with awareness of the world through sensors, and reasoning based upon graph data & rules together with graph algorithms and machine learning”
Dave Raggett (W3C/ERCIM) / Michael N. Huhns (University of South Carolina)
• Combining symbolic information with statistics based upon prior knowledge and past experience
  − Large range of reasoning techniques
    o Deductive, inductive, abductive, causal, counterfactual, temporal, spatial, etc.
    o Together with efficient graph algorithms
  − Continuous learning
    o Heuristics, simulated annealing, reinforcement learning
Thank you
Pierre-Yves Vandenbussche
@pyvandenbussche
Linked Data example
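[Figure: RDF graph around Bulgaria; recoverable labels listed below]
• Nodes: wd:Q219 (BULGARIA), dic/geo#BG (BULGARIA), R05_1 (Participation rates of 4-year-olds in education)
• Classes: ramon:NUTSRegion (NUTS Region), wd:Q3624078 (Sovereign State)
• Properties: rdfs:label, eus:geo, wdt:P605 (NUTS code), NUTS code
• Literals: “Bulgaria”@en, “BG”, “2012”, “74.61”, “79.5”, “8031.59”, “7,265,115”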
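One way to read the example is that two independently published graphs become joinable because both carry the same NUTS code literal, “BG”. The sketch below illustrates that idea only; it assumes the Eurostat side exposes the code through a property named `ex:nutsCode` (hypothetical, the slide shows only the label “NUTS code”), and `ex:obs1` is a made-up observation identifier. The numeric literals from the slide are left out because their mapping to properties is not recoverable.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Wikidata side: identifiers and the NUTS code property (wdt:P605) as on the slide.
wikidata: Set[Triple] = {
    ("wd:Q219", "rdfs:label", '"Bulgaria"@en'),
    ("wd:Q219", "wdt:P605", "BG"),
}

# Eurostat side: "ex:nutsCode" and "ex:obs1" are hypothetical names for illustration.
eurostat: Set[Triple] = {
    ("dic/geo#BG", "ex:nutsCode", "BG"),
    ("ex:obs1", "eus:geo", "dic/geo#BG"),
}


def link_by_code(
    left: Set[Triple], left_prop: str, right: Set[Triple], right_prop: str
) -> List[Tuple[str, str]]:
    """Pair subjects from two graphs that share the same code literal."""
    by_code: Dict[str, List[str]] = {}
    for s, p, o in left:
        if p == left_prop:
            by_code.setdefault(o, []).append(s)
    links = [
        (left_subject, s)
        for s, p, o in right
        if p == right_prop
        for left_subject in by_code.get(o, [])
    ]
    return sorted(links)


links = link_by_code(wikidata, "wdt:P605", eurostat, "ex:nutsCode")
print(links)
```

Once such a link exists, any observation pointing at `dic/geo#BG` through `eus:geo` can be enriched with whatever Wikidata states about `wd:Q219`, which is the data-integration benefit the talk attributes to common vocabularies.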