Linked Data past, present and futures

Pierre-Yves Vandenbussche ESSnet May 27th 2019

@pyvandenbussche Introduction – Python notebook


Outline

• 1. Semantics on the Web
− Vision and construction
• 2. Linked Data: Pragmatic use
• 3. Promising futures

1. Semantics on the Web: Vision and construction

Creation of the Web (1989)

• A “Graph” view of the Web with semantic relations between documents and entities

https://www.w3.org/History/1989/proposal.html

Theory towards the Semantic Web (1994)

• “Flat” view of the Web for a computer: no typed relations
• A Web for agents
• Coordination needed:

https://www.w3.org/Talks/WWW94Tim/
https://videos.cern.ch/record/2671957

Construction of the Semantic Web (1999-2012)

• SPARQL – Querying
• OWL – Formal semantics
• RDF-S – Semantics
• RDF – Representation
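As a rough illustration of how these layers stack, the sketch below stores RDF-style triples as Python tuples and answers a SPARQL-like basic pattern with wildcards. It is a toy stand-in in plain Python, not a real RDF library; the `ex:` names and the `match` helper are hypothetical and chosen only for illustration.

```python
# Toy illustration of the RDF data model: a graph is a set of
# (subject, predicate, object) triples. All "ex:" names are made up.
triples = {
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:alice", "ex:attends", "ex:essnet2019"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:Student", "rdfs:subClassOf", "ex:Person"),  # RDF-S level statement
}


def match(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts like a query variable."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Roughly: SELECT ?s WHERE { ?s rdf:type ex:Student }
students = [s for s, _, _ in match(triples, p="rdf:type", o="ex:Student")]
print(students)
```

The same `match` call with no arguments returns the whole graph, which is the sense in which representation (RDF) and querying (SPARQL) are separate layers over the same triples.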

2. Linked Data: Pragmatic use of SW

Technology adoption ≠ Original thoughts

Shift from Semantic Web to Linked Data

• Ontology / AI implication fails to address most real-world data, which is uncertain, incomplete, inconsistent and includes errors
− Relations between concepts can be defined as logical entailments in a formal system (Student(?x) => Person(?x))

• Linked Data
− Emphasis on sharing the information in the form of a graph
− Facilitating data integration through common vocabularies with light formal commitment
− Constraints
o ShEx (extension for May 2019) / SHACL
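The contrast between the two approaches can be sketched in a few lines: an entailment rule adds new facts to the graph, whereas a constraint only reports data that does not fit an expected shape. The code below is a minimal plain-Python sketch under that reading; it does not use ShEx or SHACL syntax, and the `ex:` names, the sample data and the `entail` / `validate` helpers are all hypothetical.

```python
# Toy contrast: entailment adds facts, validation reports problems.
# All names and data are hypothetical.
facts = {
    ("ex:alice", "rdf:type", "ex:Student"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "rdf:type", "ex:Student"),  # note: bob has no ex:name
}


def entail(graph):
    """Apply the rule Student(?x) => Person(?x) until nothing new appears."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p == "rdf:type" and o == "ex:Student":
                new = (s, "rdf:type", "ex:Person")
                if new not in inferred:
                    inferred.add(new)
                    changed = True
    return inferred


def validate(graph, target_class, required_property):
    """Return members of target_class that lack required_property."""
    members = {s for s, p, o in graph if p == "rdf:type" and o == target_class}
    having = {s for s, p, _ in graph if p == required_property}
    return sorted(members - having)


closure = entail(facts)
persons = sorted(s for s, p, o in closure if p == "rdf:type" and o == "ex:Person")
violations = validate(closure, "ex:Person", "ex:name")
print(persons)      # subjects inferred to be persons
print(violations)   # persons missing a name
```

Here the incomplete record for `ex:bob` is still usable: the check flags it rather than making the whole graph inconsistent, which is the kind of light commitment the slide describes.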

Shift from Semantic Web to Linked Data

• Reflected in industry
− Search Engines:
o Google 2012 (Peter Norvig) vs
o Google 2012 (Ramanathan V. Guha)
− () https://www.youtube.com/watch?v=MY4s8uuHmy0
• W3C broadened its Semantic Web activity (https://www.w3.org/2001/sw/), giving rise to the Data Activity / Web of Data (https://www.w3.org/2013/data/)

[Figure: Gartner hype cycle, 2012 and 2015. Recoverable labels: Semantic Web, Linked Data, Peak of inflated expectations, Trough of disillusionment]

Linked Data successes / limitations

• Common semantics for a community:
− Schema.org: Web pages / Enhanced search engines
− Museum and art – Getty
− Library – DC/BIBFRAME
− Statistical data – ESSnet!
• Linked
− DBpedia, Wikidata, Eurostat, etc.
• Semantic pipeline
− BBC
− Thomson Reuters
• Limitations
− Cost / incentive
− Use Feedback
− Tools and maintenance

Jem Rayfield, https://www.slideshare.net/JemRayfield/dsp-bbcjem-rayfieldsemtech2011

The Rise of Graph data

• Enterprise Knowledge Graphs
− Google Knowledge Vault
− Microsoft Academic KG
− Facebook Graph Search
− …
• Graph mining
• Convergence with Property Graphs
− RDF* (https://www.w3.org/Data/events/data-ws-2019/)

3. Promising futures: Knowledge Discovery and Sentient Web

Knowledge Discovery

• Literature-based discovery - Swanson 1980
• Panama papers (2016)
− Neo4j
− Linkurious
• Discovery of Cancer related protein interactions
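Literature-based discovery is often explained with an "ABC" pattern: if term A is linked to B in one body of literature and B is linked to C in another, then A and C form a candidate hidden connection. The sketch below is one simple way to express that idea over a co-occurrence graph; the link data is a toy set loosely modelled on the fish oil example commonly associated with Swanson, and the `abc_candidates` helper and its ranking are assumptions, not the method used in any of the projects listed above.

```python
from collections import defaultdict

# Toy co-occurrence pairs, as if mined from paper abstracts (illustrative only).
links = [
    ("fish oil", "blood viscosity"),
    ("fish oil", "platelet aggregation"),
    ("blood viscosity", "Raynaud's syndrome"),
    ("platelet aggregation", "Raynaud's syndrome"),
    ("fish oil", "omega-3"),
]

# Build an undirected adjacency map.
neighbours = defaultdict(set)
for a, b in links:
    neighbours[a].add(b)
    neighbours[b].add(a)


def abc_candidates(start):
    """Rank terms C not directly linked to `start` by their shared B terms."""
    bridges = defaultdict(set)
    for b in neighbours[start]:
        for c in neighbours[b]:
            if c != start and c not in neighbours[start]:
                bridges[c].add(b)
    # More intermediate terms first, then alphabetical for stable output.
    return sorted(
        ((c, sorted(bs)) for c, bs in bridges.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )


candidates = abc_candidates("fish oil")
print(candidates)
```

Each candidate is returned together with the intermediate terms that support it, so a domain expert can judge whether the suggested connection is plausible.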

“Artificial Intelligence to win Nobel Prize and Beyond” Hiroaki Kitano - ISWC 2016

Discovery of Cancer related protein interactions

Data

Unstructured data (18060184):

“Angiogenesis was expressed in the majority of cases. In CRC, the microvascular density (MVD) was higher than that from ACC. The ratio CD31/CD105 was 1 in ACC and 3 in CRC. VEGF was positive in 25% of ACC and 80% of CRC. In CRC were more mature vessels, marked only with CD31 than immature vessels or endothelial isolated cells marked with both CD31 and CD105. In ACC prevailed the neoformed vessels marked with both CD31 and CD105.”

Prediction of new phosphorylation relations

[Figure: graph distinguishing Known Links from Fujitsu Prediction. Node labels: ESR1, Tamoxifen, Copanlisib, PIK3CA, GSK650394, SGK1, AKT1, PDPK1, LOD. Site labels: S102, T37, T291, S74, T65, T369, S529]

Sentient Web (Graphs + IoT + AI/ML)

“Ecosystems of services with awareness of the world through sensors, and reasoning based upon graph data & rules together with graph algorithms and machine learning” Dave Raggett (W3C/ERCIM) / Michael N. Huhns (University of South Carolina)

• Combining symbolic information with statistics based upon prior knowledge and past experience
− Large range of reasoning techniques
o Deductive, inductive, abductive, causal, counterfactual, temporal, spatial, etc.
o Together with efficient graph algorithms
− Continuous learning
o Heuristics, simulated annealing, reinforcement learning

Thank you

Pierre-Yves Vandenbussche

@pyvandenbussche

Linked Data example

[Figure: RDF graph describing Bulgaria across two sources. Resources: dic/geo#BG (BULGARIA), wd:Q219 (BULGARIA), R05_1 (Participation rates of 4-years-olds in education). Classes: ramon:NUTSRegion (NUTS Region), wd:Q3624078 (Sovereign State). Properties: rdfs:label, eus:geo, wdt:P605 (NUTS code). Literals: “Bulgaria”@en, “BG”, “2012”, “74.61”, “79.5”, “8031.59”, “7,265,115”]
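The point of this example, two descriptions of the same country joined through a shared NUTS code, can be sketched with plain Python sets of triples. The identifiers come from the labels in the example, but the exact edges are assumptions made for illustration: `ex:nutsCode` is a hypothetical property, and `wdt:P31` is used here as the Wikidata "instance of" property.

```python
# Simplified triples inspired by the example; the edges are assumptions.
eurostat = {
    ("dic/geo#BG", "rdf:type", "ramon:NUTSRegion"),
    ("dic/geo#BG", "ex:nutsCode", "BG"),  # hypothetical property name
}
wikidata = {
    ("wd:Q219", "rdfs:label", "Bulgaria"),
    ("wd:Q219", "wdt:P605", "BG"),          # NUTS code
    ("wd:Q219", "wdt:P31", "wd:Q3624078"),  # instance of: sovereign state
}


def objects(graph, s, p):
    """All objects o such that (s, p, o) is in the graph."""
    return {o for s2, p2, o in graph if s2 == s and p2 == p}


def subjects(graph, p, o):
    """All subjects s such that (s, p, o) is in the graph."""
    return {s for s, p2, o2 in graph if p2 == p and o2 == o}


def same_place(region):
    """Find Wikidata entities sharing a NUTS code with a Eurostat region."""
    codes = objects(eurostat, region, "ex:nutsCode")
    return sorted(e for code in codes for e in subjects(wikidata, "wdt:P605", code))


matches = same_place("dic/geo#BG")
labels = sorted(l for m in matches for l in objects(wikidata, m, "rdfs:label"))
print(matches, labels)
```

Once the link is established, anything known about the Wikidata entity (label, type) can be attached to statistics published against the Eurostat region, without either source changing its own model.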
