<<

Semantic Web 8 (2017) 767–772 767 DOI 10.3233/SW-170284 IOS Press Editorial

Weaving a Web of linked resources

Fabien Gandon a,*, Marta Sabou b and Harald Sack c a Wimmics, Université Côte d’Azur, Inria, CNRS, I3S, Sophia Antipolis, France E-mail: [email protected] b CDL-Flex, Vienna University of Technology, Austria E-mail: [email protected] c FIZ Karlsruhe, Leibniz Institute for Information Infrastructure, KIT Karlsruhe, Germany E-mail: Harald.Sack@fiz-Karlsruhe.de

Abstract. This editorial introduces the special issue based on the best papers from ESWC 2015. And since ESWC’15 marked 15 years of research, we extended this editorial to a position paper that reflects the path that we, as a community, traveled so far with the goal of transforming the Web of Pages to a Web of Resources. We discuss some of the key challenges, research topics and trends addressed by the Semantic Web community in its journey. We conclude that the symbiotic relation of our community with the Web requires a truly multidisciplinary research approach to support the Web’s diversity.

Keywords: Semantic Web, trends, challenges

1. From a Web of pages to a Web of resources a first wave of deployment of the Semantic Web with the principles and the Linked Open Data As we write this article Tim Berners-Lee just re- 5-Star rules [3] leading to the publication and growth ceived the Turing Award “for inventing the World of linked open datasets towards the Web of Data, as Wide Web, the first , and the fundamental depicted in Fig. 1 and Fig. 2 protocols and algorithms allowing the Web to scale” Twenty five years after its invention, the Web has [1]. But Tim not only invented the Web: he kept de- exceeded its initial status of a distributed document- fending it year after year and continuously works to centric space. Following its numerous evolutions, it lead it to its full potential. In particular, a bit less has become a virtual place where people and soft- than ten years after he submitted his proposal for an ware can co-operate within hybrid communities [10]. information management system in March 1989 [6], It supports a mixed society where humans and Web Tim wrote in September 1998 the Semantic Web Road robots interact in particular via shared . Web map [2] to give a high-level plan of the architecture sites such as Wikipedia, DBpedia, and Wikidata are of the Semantic Web and as a continuation to his the most prominent examples of this space, where soft- wish in 1994 to provide on the Web “more machine- ware (Web robots) and humans interact in hybrid com- oriented semantic information, allowing more sophis- munities on a worldwide scale. ticated processing” [4]. This vision has been made vis- One of the major evolutions of the Web was the ad- ible to a broad audience in 2001 with an article in the vent of the Web of Data based on Linked Data princi- Scientific American [5]. And again a few years later, ples. It transformed the document centered Web into a Tim was instrumental in pushing what can be seen as distributed . Linked Data provides an open in- terface based on widely used W3C standards, such as *Corresponding author. E-mail: [email protected]. URIs (Uniform Resource Identifiers) as universal iden-

1570-0844/17/$35.00 © 2017 – IOS Press and the authors. All rights reserved 768 F. Gandon et al. / Weaving a Web of linked resources

Fig. 1. Number of linked open datasets on the Web plotted from 2007 to 2017 with data from [13]and[16]. tifiers, HTTP (Hypertext Transfer Protocol) for web- rity, streaming, etc. In terms of data management, scal- based access, and RDF (Resource description Frame- ability of storage and efficiency of querying are ac- work) as common standard for data encoding. These tive research and development domains to improve pillars enable efficient identification, access, and inter- RDF store performances. In terms of data access, many linking of data on the Web. Since its start in 2007, the approaches now exist on a continuum from HTTP Web of Data has grown up to almost 10,000 interlinked gets, to, Linked Data Fragments, Linked Data Plat- datasets covering a broad range of topics, as e.g., bib- form REST approach, and SPARQL services, proto- liographical data, geographical data, lexicographical col and language. Moreover, to provide reliable, per- data, biomedical data, or social networking data with sistent, and trustworthy Linked Data services, topics DBpedia, the Linked Data version of the popular on- such as access control, version management, long term line encyclopedia Wikipedia, as its referential central preservation are of utmost importance. But again ef- hub (cf. Fig. 1,Fig.2)[12]. ficiency, federation and hybridization remain hot re- search questions.

2. This is for everything 2.2. Formal knowledge and artificial intelligence

During his participation in the London 2012 Sum- On top of these core topics of data management mer Olympics opening ceremony, Tim Berners-Lee and access, the Semantic Web community has always tweeted “This is for everyone” as a statement about the been interested in providing intelligent processing of Web. With its fundamental idea of using URIs to lit- the linked data of the Web, starting with reasoning. erally identify and henceforth describe “every thing” This remains a challenging and important topic with around us, the Web is now used in every human ac- hard problems in scaling, approximating and distribut- tivity. As a result many challenges have emerged from ing reasoning. In addition to classical logical deriva- different visions of the Semantic Web and from its dif- tion, many other artificially intelligent behaviours are ferent contexts of use and the constraints they bring studied in the community including machine learning with them. and data mining, induction of knowledge, deontic rea- soning on data licences, etc. In particular, hybrid ap- 2.1. From V to S in data management proaches investigating whether and how techniques of and reasoning can be combined with The so called “Vs” of big data (velocity, variety, statistical machine learning are in the focus of current volume, veracity) translate to many “S” of Seman- research with the promise to further improve artificial tic Web: scalability, storage, search, , secu- intelligence beyond the current state-of-the-art. F. Gandon et al. / Weaving a Web of linked resources 769

Fig. 2. The Linked Open Data Cloud diagram shows major datasets published in Linked Data format as of 2017-02-20 [16].

2.3. Heterogeneity of graph types and life cycles data in a smart city), etc. And each type of graph of the Web is not an isolated island. Graphs interact with The initial graph of linked pages of the Web has each other: the networks of communities influence the been extended by a growing number of other graphs message flows, their subjects and types, the semantic including: sociograms capturing social network struc- links between terms interact with the links between tures, workflows specifying decision paths to be fol- sites and vice versa, the small changing graphs of sen- lowed, browsing logs capturing trails of navigation, au- sors are joined to the large stable geographical graphs tomata of service compositions specifying distributed that position them, etc. Not only do we need meth- processing, linked open data from distant datasets, etc. ods to represent and analyse each kind of graph, we Moreover, these graphs are distributed over many dif- also require the means to combine them and to perform ferent sources with very different characteristics. Some multi-criteria analyses on their combinations. sub-graphs are public (e.g. DBpedia), while others are private (e.g. semantic intrawebs). Some sub-graphs are 2.4. A truly open-world assumption small and local (e.g., a user’s profile on a device), and some are huge and hosted on clusters (e.g., Wikipedia). One of the major changes when trying to port Some are largely stable (e.g., a thesaurus for Latin), database and approaches to the Web some change several times per second (e.g., sensor is the open world assumption (OWA) principle that un- 770 F. Gandon et al. / Weaving a Web of linked resources derlies Semantic Web technologies. Many existing re- cepted. With the advent of Linked Data technologies, sults from more established research domains, such as our community increasingly addressed large-scale , knowledge based systems or model-based data integration problems with enterprises knowledge engineering, have to be revisited to account for this graphs. Recent years have seen the uptake of Seman- open-world assumption. But in a broader sense, the tic Web technologies in application domains that span open world of linked data on the Web also adds addi- the borders of the digital and physical world. As a re- tional challenges to address uncertainty, data quality, sult we explored the use of our technologies as integral data and processing traceability in order to be able to part of cyber-physical systems such as adaptive traffic provide reasons for the users to trust the systems and control systems in smart cities [8] or flexible assembly their results. lines in smart factories [7].

2.5. Human–machine partnership 3. Leading the Semantic Web to its full potential This expanding Web of data, together with the schemas, ontologies and vocabularies used to structure 3.1. Dynamics of research topics in Semantic Web and link it, form a formal Semantic Web with which we have to design new interaction means to support The previous topics are only a taste of the many im- the next generation of Web applications. The Seman- portant topics now addressed at international venues tic Web has a role to play in addressing challenges at such as ISWC, ESWC and WWW conference series. the intersection of knowledge-based interactions and AsshowninFig.3 these topics were born, grew Web-augmented interactions [11]: there is not only a and went mainstream very dynamically over the past need for Human Computer Interaction (HCI) to pro- 14 years as we moved from solved to new challenges. vide methods to design human-data interaction applied Starting out with a small number of key topics fo- to linked data on the Web, but also inversely a need cused on representation languages (rdf, web ontology for the Semantic Web community to investigate how language), query answering, and linked data and the intelligent inferences they support learning primarily, Semanting Web research has been can improve human-machine interactions. On the Web, extended with further topics including: semantic web large-scale interactions also create many challenges, services (with a peak of activity between 2002–2009), and in particular the ongoing need to reconcile the linked data (which represents an major research share formal semantics of computer science (logics, ontolo- since 2007) as well as data management topics such as gies, typing systems, etc.) on which the Web architec- ontology matching and SPARQL. ture is built, with the soft semantics of people (posts, This special issue is based on the top articles se- tags, status, and so on) through which a lot of the Web lected from the 12th ESWC in 2015. Besides having content is created. And as the Web becomes a ubiq- a main focus on advances in Semantic Web research uitous infrastructure reflecting all the objects of our and technologies, we, the Chairs of ESWC 2015, de- world, we witness ever-increasing frictions between cided to broaden the scope to span other relevant re- formal semantics and social semantics. A promising search areas as well. In particular, as a response to the research avenue to span the gap between formal and emerging human-machine partnership challenge, the social semantics is the use of Human Computation and core tracks of the research conference were comple- Crowdsourcing techniques to involve large, distributed mented with new tracks focusing on linking machine groups of users in the various stages of the knowl- and human computation at Web scale (Cognition and edge engineering life-cycle, from ontology modeling Semantic Web, Human Computation and Crowdsourc- and verification to annotation, data curation and entity ing). The current special issues contains two extended linking [15]. papers from the Reasoning track and one extended pa- per from the new track on Cognition: 2.6. Conquering new application domains – SPARQL with Property Paths on the Web. This pa- The benefits of Semantic Web technologies had per extends work in the area of querying Linked been firstly apparent for information-intensive do- Data on the Web, thus addressing challenges re- mains such as medicine, library science or cultural her- lated to data management and graph type het- itage where taxonomy structures were already well ac- erogeneity. To query Linked Data on the Web, F. Gandon et al. / Weaving a Web of linked resources 771

Fig. 3. Rexplore showing the evolution of major topics and keywords in the Semantic Web community over the last 14 years [14].

a range of approaches have been proposed span- – FrameBase: Enabling Integration of Heteroge- ning from centralized to fully distributed query- neous Knowledge addresses data management ing. SPARQL property paths are an extension of issues while revisiting our current assumptions SPARQL that allow graph navigation and there- about formal knowledge representation practices fore better capture the distributed and graph-like adopted in the Semantic Web. The representation feature of Linked Data datasets. Yet, the seman- of knowledge that involves more than two en- tics of SPARQL restricts the applicability of this tities (i.e., n-ary relations) can be achieved us- new construct to a single RDF graph. Hartig ing a number of different approaches when re- and Pirro address this gap and propose the for- lying on the subject-predicate-object representa- mal foundations for evaluating property paths on tions as defined by the RDF model. This increases the Web, in a distributed fashion. They propose semantic heterogeneity and hampers integration two different semantics, reachability and context- across knowledge sources. As a potential solu- based, and experimentally evaluate these. Inter- tion, this paper proposes the FrameBase knowl- estingly, their results show that some queries can- edge base schema. Derived from the combina- not be evaluated over the Web in practice and mo- tion of the FrameNet and WordNet linguistic re- tivate introducing the notion of Web-safe queries. sources, FrameBase aims to offer a more concise – Ontology Understanding without Tears: The Sum- and expressive approach for representing n-ary marization Approach. To deal with the increas- relations compared to triple-based approaches. ing size and complexity of RDF knowledge ba- The authors, subsequently demonstrate the sup- sis, methods are necessary for representing these port of FrameBase for integrating heterogeneous graphs in a more concise manner in order to knowledge. This is achieved through integration aid their quick understanding and further foster rules that transform data from external knowledge the human-machine partnership. A family of ap- bases into FrameBase instances. proaches in this direction are summarization ap- proaches which identify a subgraph in the knowl- 3.2. Multidisciplinary approaches to support edge base which includes the most representative diversity concepts of the schema. This paper presents two algorithms for summarization which take into ac- “We need diversity of thought in the world to face count both the schema and the instances in an the new challenges.” – Tim Berners-Lee RDF knowledge base in order to provide a con- The participatory nature of the Web makes it emerge cise summary that captures crucial information as an openly co-constructed global and inherently het- within that knowledge base. These algorithms are erogeneous artifact. The “world-wide way” of deploy- part of the RDF Digest platform that automati- ing the Web everywhere and for everything implies cally produces and visualizes high quality sum- that, as the Web is spreading into the world, the world maries of RDF/S Knowledge Bases (KBs). is spreading into the Web. The resulting world “wild” 772 F. Gandon et al. / Weaving a Web of linked resources

Web that is being created and is evolving every day [6] T.J. Berners-Lee, Information management: A proposal, Tech- is affected by, and at the same time reflects, the com- nical report, 1989, http://cds.cern.ch/record/1405411/files/ plexity of our world. From a Semantic Web point of ARCH-WWW-4-010.pdf. view, as soon as we want to model, analyse and com- [7] S. Biffl and M. Sabou, Semantic Web Technologies for Intel- bine these many facets of one Web, we face the gen- ligent Engineering Applications, Springer, 2016. doi:10.1007/ 978-3-319-41490-4. eral challenge of its diversity and span. This complex- [8] M. d’Aquin, J. Davies and E. Motta, Smart cities’ data: ity implies that a huge challenge for Web development Challenges and opportunities for semantic technologies, IEEE in general, and for the Semantic Web community in Computing 19(6) (2015), 66–70. doi:10.1109/MIC. particular, is the resulting need for large-scale multi- 2015.130. disciplinary cooperation: the three ‘W’s of the World [9] F. Gandon, The three ‘W’ of the call for Wide Web call for the three ‘M’s of a Massively Mul- the three ‘M’ of a Massively Multidisciplinary Methodology, tidisciplinary Methodology [9], and the Semantic Web in: International Conference on Web Information Systems and is no exception to this. Technologies, Springer, 2014, pp. 3–15. [10] F. Gandon et al., Challenges in bridging social semantics and formal semantics on the Web, in: International Conference on References Enterprise Information Systems, Springer, 2013, pp. 3–15. [11] F. Gandon and A. Giboin, Paving the WAI: Defining Web- [1] ACM, 2016 A.M. Turing Award Citation for invent- augmented interactions, in: International Conference Web Sci- ing the World Wide Web, the first Web browser, and ence, ACM, 2017. the fundamental protocols and algorithms allowing the [12] Linked Data Stats, 2017, http://stats.lod2.eu/. Web to scale, 2017, http://amturing.acm.org/award_winners/ [13] Linked Open Data, 2017, http://linkeddata.org/. berners-lee_8087960.cfm. [14] F. Osborne, E. Motta and P. Mulholland, Exploring scholarly [2] T. Berners-Lee, Semantic Web road map, 1998, https://www. data with Rexplore, in: International Semantic Web Confer- w3.org/DesignIssues/Semantic.html. ence, Springer, 2013, pp. 460–477. [3] T. Berners-Lee, Linked Data, 2006, https://www.w3.org/ DesignIssues/LinkedData.html. [15] C. Sarasua, E. Simperl, N. Noy, A. Bernstein and [4] T. Berners-Lee et al., The World-Wide Web, Commun. ACM J.M. Leimeister, Crowdsourcing and the Semantic Web: 37 (1994), 76–82. doi:10.1145/179606.179671. A research manifesto, Human Computation 2(1) (2015), 3–17. [5] T. Berners-Lee, J. Hendler and O. Lassila, The Semantic doi:10.15346/hc.v2i1.2. Web, Scientific American 284(5) (2001), 28–37. doi:10.1038/ [16] The Linked Open Data cloud diagram, 2017, http://lod-cloud. scientificamerican0501-28. net/.