Visualisation of Linked Data – Reprise

Semantic Web 8 (2017) 1–21 1 DOI 10.3233/SW-160249 IOS Press Editorial

Aba-Sah Dadzie a,* and Emmanuel Pietriga b a KMi, The Open University, UK E-mail: [email protected] b INRIA, France E-mail: [email protected]

Abstract. Linked Data promises to serve as a disruptor of traditional approaches to data management and use, promoting the push from the traditional Web of documents to a Web of data. The ability for data consumers to adopt a follow your nose approach, traversing links defined within a dataset or across independently-curated datasets, is an essential feature of this new Web of Data, enabling richer knowledge retrieval thanks to synthesis across multiple sources of, and views on, inter-related datasets. But for the Web of Data to be successful, we must design novel ways of interacting with the corresponding very large amounts of complex, interlinked, multi-dimensional data throughout its management cycle. The design of user interfaces for Linked Data, and more specifically interfaces that represent the data visually, play a central role in this respect. Contributions to this special issue on Linked Data visualisation investigate different approaches to harnessing visualisation as a tool for exploratory discovery and basic-to-advanced analysis. The papers in this volume illustrate the design and construction of intuitive means for end-users to obtain new insight and gather more knowledge, as they follow links defined across datasets over the Web of Data.

Keywords: Linked Data, linked open data, visualisation, visual analysis, exploratory discovery, visual querying

1. Introduction creasingly data- and knowledge-driven world are be- coming more and more dependent on applications that Linked Data is a key component of the Seman- build upon the capability to transparently fetch het- tic Web vision. The ability for data consumers to erogeneous yet implicitly connected data from mul- adopt a follow your nose approach, traversing links de- tiple, independent sources. It has become clear that ﬁned within a dataset or across independently-curated we must rethink how those users interact with the datasets, is an essential feature of what we now call very large amounts of this complex, interlinked, multi- the Web of Data, enabling richer knowledge retrieval dimensional data, throughout its management cycle, thanks to synthesis across multiple sources of, and from generation to capture, enrichment in use and views on, inter-related datasets. Since its early days, reuse, and sharing beyond its original context. Linked Data (LD) promised to serve as a disruptor of The Linked Data community introduces its founda- traditional approaches to data management and use, tional concept as follows [30]: promoting the push from the traditional Web of documents to this Web of data [3,4,40]. But the advan- “Linked Data is about using the Web to connect re- tages of following LD principles have now become lated data that wasn’t previously linked, or using even more apparent [21,39], as the boundary between the Web to lower the barriers to linking data cur- the data available on the (online) Web and users’ own rently linked using other methods.” (personal) data gets ever more fuzzy. Users in our in- This sentiment is echoed across academia, state organisations and industry [see 2,3,13,22,39,40,53,54, *Corresponding author. E-mail: [email protected]. among others]. But despite the initiative’s recognised

1570-0844/17/$35.00 © 2017 – IOS Press and the authors. All rights reserved 2 A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise value, and even though it is now well past its infancy, institutional and governmental levels as part of various Linked (Open) Data1 extension and adoption beyond open data initiatives, as well as in industry, where ef- the Semantic Web (SW) community remains limited. forts include investigating LD as a way to more effi- This is due largely to three key challenges: ciently handle big data [39,40]. At the simpler level of individual data consumers, 1. The very large amount of available LD, which lay (non-expert) users may eventually employ Linked keeps increasing [see, e.g., 27,39,54]. For in- Data as a portal to timely, context-based knowledge stance, as at mid-2016 Ermilov et al. [14] report for everyday activities, doing so either implicitly or ex- 9,960 RDF datasets containing over 130 billion plicitly. Such activities could range from simple ones, triples, with connectedness up to 40% from 3% such as comparing the price of groceries or planning a in 2011. These measures originate from LOD- journey using public transport, to more complex ones Stats [34], which provides the most up-to-date involving important decisions, such as purchasing life set of statistics about LOD datasets referenced on insurance or a new home. LD has the potential to make datahub.io [50] and selected public and open gov- such activities more efficient by giving users access ernment stores. to a rich Web of interlinked data relevant to the task 2. The inherently heterogeneous nature of LD, com- at hand; however, this should not be accompanied by prising independently-curated datasets backed by technical barriers that are difficult to overcome by a both standard and custom ontologies, with varia- majority of users. tion in granularity, quality, completeness and ad- The design of user interfaces for LD [22,24,54], and herence to the five-star model of Linked Data de- more specifically interfaces that represent the data vi- sign [3]. This is further complicated by differ- sually, play a central role in that respect. Well-designed ences in source and target domains, and the im- visualisations harness the powerful capabilities of the pact this has on data structure and content. human perceptual system, providing users with rich 3. The focus in the SW community on automated, representations of the data. Combined with appropri- machine-driven data processing, which has so far ate interaction techniques, they enable users to navi- resulted in data management and use solutions that require an advanced level of technical know- gate through, and make sense of, large and complex how to properly generate, clean and link new data datasets, help spot outliers and anomalies, recognise into the Linked Data cloud. Importantly, this has patterns, identify trends [7,49]. Visualisation can pro- also translated into the need for a correspondingly vide effective support for confirming hypotheses, and advanced level of technical know-how at the other favours deriving new insight. Visual analytics systems end of the workflow, in end-user-driven explo- complement human cognitive and perceptual abili- ration, analysis and presentation to diverse audi- ties with automated data processing and mining tech- ences of LD-backed content. Even though custom niques [9,39,47,55]. Such systems not only support the tools can be, and often are, developed for specific presentation of data, but processes for the full data use cases and datasets [2,9,37,47,54], this situa- management cycle as well, from data capture to analy- tion constitutes a significant obstacle to the adop- sis, enrichment and reuse. tion of LD as a framework by a wider range of However, visualisation in and of itself is not an ab- data producers and consumers. solute solution to the challenges inherent to complex data sense-making, analysis and reuse. To contribute Linked Data already spans a wide range of applica- to the realisation of LD’s potential, visualisations have tion areas, a strong indication that its potential value to be tailored to specific tasks, and effectively support is already largely acknowledged. The open challenges users in the performance of these tasks by providing mentioned here, among others, are being tackled by visual representations of the data at a relevant level the research community which, beyond the publication of detail and abstraction. Following Shneiderman’s of academic papers, is developing and making publicly mantra [49], users should be able to navigate from an available numerous pieces of software to test differ- overview of the data to detailed representations of spe- ent approaches to the management and use of LD. The cific items of interest. But the system should also pro- adoption and promotion of LD can also be observed at vide relevant representations of the data at these various levels of detail [7]. This is a major challenge, 1We use LD and LOD interchangeably throughout this paper, the considering the follow-your-nose approach to data ex- latter mainly to stress where data is also open to public (re)use. ploration mentioned earlier, and given the potentially A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise 3 high level of heterogeneity of the interlinked datasets ration, as a key initial step to improving access, espe- involved in most LD-backed applications. Only then cially for non-SW and non-domain experts. Benedetti will the results of visual exploration and analysis lead et al. [2], Haag et al. [16] discuss the need to sup- to new insights about LD, with an increased level of port visual querying to overcome technical challenges confidence in drawing conclusions from the data and in the formulation of formal queries by lay users the supporting visual analysis, and in decision-making and, as complexity grows, even technical experts. Mai based on the knowledge thus acquired. et al. [37] further demonstrate the added value in We review the state of the art in Section 2, and look providing multiple visual perspectives that support at increasing efforts to evaluate front-end tools with the follow your nose paradigm when exploring query target end-users in Section 3. In Section 4 we discuss results across multiple, heterogeneous, albeit linked, on-going research in Linked Data visualisation and the datasets. Halpin and Bria [17] describe the process fol- range of challenges tackled by contributors to the spe- lowed to map how Digital Social Innovation is be- cial issue, as part of the process followed in designing ing used throughout the European Union (EU) across and building effective visual presentation and analysis multiple aspects of people’s lives, and measure the solutions for Linked Data. We conclude in Section 5 resulting social impact. They illustrate the benefits with a vision for the role of visualisation in advanc- gained in employing SW technology to overcome sev- ing research in, analysis of, and applications based on eral challenges – challenges posed by differences in Linked Data. structure, granularity and completeness of the data sources involved. Indeed, multiple sources have to be queried to capture all data required to create a 2. The state of the art: Linked Data visualisation DSI map (of the EU), a process that may be used to generate a linked open dataset synthesising this Tackling the challenges associated with the repre- data. sentation of Linked Data and its visual exploratory Beyond tools that help initiate exploration by iden- analysis requires considering related, supporting tech- tifying entry points into LD, users should also be pro- nologies. Halpin and Bria [17], Klímek et al. [27], Mai vided with support for identifying and selecting more et al. [37], Mitchell and Wilson [39], for instance, focused resources. Depending on the use case, this highlight reliance on data structuring, typically using might involve increasingly customised (or customis- ontologies as a backbone; the identification and encod- able) tools for in-depth exploration and analysis, along ing of links via ontology mapping and alignment; and with support for reviewing content and sharing both the need to integrate such technologies into the visual this content and the results of discovery and analysis discovery and analysis pipeline. Formal querying is a beyond its initial point of generation or use. Toward powerful tool for extracting specific information either this aim, it may be instructive to follow best practice from one or across multiple datasets. However, formu- and verified frameworks and models for the design of lating any but very simple queries in SPARQL (the de visualisation and visual analytics such as in [47,49,55], facto RDF query language [52]), or any other formal adapting these where necessary for the specific case of query language, is tedious even for SW experts. It is LD and SW data in general. As in [9], we use Shnei- also error-prone, especially with each additional graph derman’s seminal visual information-seeking mantra and optional or required relationship [9,39]. Further, as a guiding principle in evaluating the usability and the textual results returned by query engines, which utility of visualisation-driven approaches and tools for typically take the form of machine-readable RDF, are different users and their tasks – looking at support for: not easily interpreted by users, even if they have a good (i) generating overviews on the underlying and related level of domain understanding. Even when consider- data; (ii) filtering out noise and less important data ing expert users, meaningful overviews are required to in order to focus on ROIs (Regions of Interest); and obtain a good understanding of data structure, content (iii) visualising ROIs in detail. and relationships for any but very small result sets. Pre- We review selected work in the SW community that defined query templates serve as an aid here, and as a illustrate advancement of the state of the art in Linked foundation on which more intuitive support for guided Data visualisation in the five years since the 2011 sur- querying and question-answering may be built. vey about LD visualisation [9], looking at models, ap- Valsecchi et al. [54] demonstrate the advantages proaches and tools that focus on one or more of the gained in identifying suitable entry points to LD explo- following visualisation-driven tasks: 4 A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise

– identification of suitable entry points into this and identification of both implicit and explicit links large, complex data space, across data using non-standard or unambiguous lan- – navigation inside individual, and across multiple, guage and terminology [18]. linked datasets to support exploratory discovery, Research on and about the use of ontologies is well – analysis of data structure and alignment, established and significantly predates LD. However, – basic to advanced data querying and question- employing ontologies to structure LD and support the answering, creation of links within and between linked datasets, – sense-making and guided, in-depth analysis, as with other ontology engineering tasks, is mostly re- – content enrichment through data annotation and stricted to technical and domain experts. This is often identification or derivation of new links, because of the dearth of effective user interfaces (UIs), – presentation and sharing of data and results de- which are needed to reduce complexity in all but the rived from their analysis to different audiences. simplest of ontological structures. This complexity increases when considering the alignment and mapping This review of the state of the art differs from the 2011 of ontologies and LD [24,27,36,37], and impacts cor- survey [9] in two key areas. First, here we look in rectness and interpretation of data and links between more detail at the larger visual exploration and anal- those data. While ideally all LD would follow the five- ysis process, examining supporting technologies that, star LOD design guidelines [3], in practice, the qual- e.g., minimise how much pre-processing is required ity and granularity of data on the Web varies signifi- before the data can be visualised. Secondly, we con- cantly. Ontologies are especially useful in such cases, sider text-based browsers only where they make signif- where independently curated datasets with incompati- icant use of graphics to generate effective visual rep- ble structure and representation – but with related con- resentations. This is not to detract from the value and tent – must be linked. Research on ontology visuali- utility of text-based browsers in general, but to allow sation is therefore both relevant to, and often closely more in-depth focus on the merits and challenges of linked to, research on LD visualisation and interactive visualisation-driven approaches. analysis. Sections 2.1, 2.2 and 2.3 look at the roles of each Weise et al. [57], for instance, extend VOWL, the of (i) ontologies, (ii) formal querying and exploration Visual Notation for OWL Ontologies [36], to visualise and (iii) content analysis and presentation respectively, linked dataset schemas. They aim at supporting users and their role in the broader task of LD visualisation. in getting an understanding of LD content, especially However, the discussion that follows shows a good where datasets do not conform strictly to defined onto- amount of overlap between all three aspects, reinforc- logical structures or other defined schemas. Their web- ing the benefits in integrating these component parts based tool, LD-VOWL, follows principles of ontology of the process and/or building seamless pipelines be- engineering to extract schema information using a set tween them. of SPARQL queries, and visualises the results using a graph layout. Kazemzadeh et al. [26] demonstrate 2.1. The role of ontologies in Linked Data the benefits of defining a unifying ontology to allow visualisation the synthesis of biological data, to support more complete and far-reaching analysis of the large amounts Dowd [13], Hitzler and Janowicz [21], Ivanova of detailed-but-disconnected experimental data gener- et al. [24], Mitchell and Wilson [39], Welter et al. ated in this field. Halpin and Bria [17] collated data [58], among others, highlight the importance of on- to map digital social innovation across the EU. They tologies in the effective use of SW technology and concluded that an ontology provided the most flexible other data-driven research and applications. Ontolo- means for capturing, in a single structure, the varied gies provide structure and a practical means to cap- information required to map this information space. ture, encode, categorise, index and retrieve data. Fo- They also used ontologies as a support for queries re- cusing specifically on LD, ontologies are a critical el- quired during the subsequent analysis steps, involv- ement in linking independently-created datasets. By ing the Linked Data generated by merging the multi- also supporting the encoding of colloquial, domain- ple data resources describing this activity. Further, they and community-specific use of language and terminol- found that structuring data using an ontology allowed ogy using commonly-agreed, formal terms and con- them to continue to add new data “on the fly”, includ- cepts, ontologies aid correct interpretation of content ing their own survey data. Finally, it also enabled them A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise 5 to create social networks from the information gleaned out maps types and sub-types defined in backing on- about individuals and organisations, due to the inher- tologies to pseudo-geographical regions and their sub- ent network structure of the LD encoded in RDF. divisions. Instances of a type are mapped to cities We see a cyclical relationship between ontologies within their (parent) region. Disconnected (isolated) and LD, with ontologies guiding initial structuring dur- types and untyped data lie on islands outside the ing data capture, and the structured LD feeding back mainland cluster of inter-linked data. From this ini- into ontology evolution. This can be seen as a virtuous tial overview, keyword search and interaction with the circle, where LD and the backing ontologies contribute map (including click and zoom) transparently trigger to refining one another. This process further aids align- SPARQL queries to retrieve more detailed information ment and linking of disparate, yet related datasets that about ROIs and support jumping via links to other re- reuse the ontologies in question. lated entities in the same region or in other regions. The user may swap between layers over the base map 2.2. Visual querying & exploratory discovery to reveal additional visual cues about data density and structure (e.g., data and object properties, class depth Haag et al. [16], recognising the challenges faced within the ontology). by users in querying LD and RDF in general without Similarly, Halpin and Bria [17], mentioned earlier, good knowledge of the SPARQL query language [52], use a map of the European Union to represent infor- extend the concept of “Filter/Flow Graphs” [59]to mation about digital social innovation (DSI) captured enable relatively sophisticated visual SPARQL query- as Linked Data. Complementing SPARQL queries ing. The authors carry out a usability evaluation with backed by a dedicated ontology with a combination a small group of users familiar with the Semantic of map-based representation and social network lay- Web, but who were not SW experts. Their findings out showing the links between collaborating organisa- demonstrate high learnability of the UI, with users tions, they were able to carry out a relatively detailed self-correcting as the visual representation illustrates analysis that revealed both new information about DSI the effect of flow construction on the result-set ob- and areas on the map where information was missing. tained. Benedetti et al. [2]intheLODeX prototype, The ultimate aim of the project is to feed the results which is targeted at both non-technical users and tech- of the analysis into EU and state policy on opening up nical experts, explore requirements for easier construc- access to funding and other resources to improve DSI. tion of SPARQL queries. LODeX, which serves as a query construction guide for users, combines a detailed 2.3. Visual content analysis & presentation table view on focused elements with a graph layout summarising the schema of linked datasets. Being a large, continuously growing, heteroge- Mai et al. [37] tackle the challenges inherent in neous, multi-dimensional collection of independently linking and exploring “data silos”. Among these chal- generated, albeit inter-linked, datasets, LOD con- lenges, they study the need for seamless transition be- tributes to today’s big data [39], bringing with it both tween different perspectives on heterogeneous data, value and challenges for content analysis. One aspect and questions related to fusing differences in data in of managing LD involves the collection of statistics terms of accuracy and granularity, by using a federated and other metrics that describe the data itself, help- approach to exploration. By following the links within ing to obtain an overview of content and structure and a dataset and across datasets to related data, they pro- a means for identifying points of entry into the data. vide user support for successively revealing more de- Open review and access for the Semantic Web Jour- tail about an entity of interest while switching between nal [23] allow capture of the full review and editorial coupled representations of query results as tables and process in addition to metadata (e.g., topics) about pa- graphs. When items in the result-set feature geoloca- per content as Linked Data. Hu et al. [22] illustrate tion information, the data may additionally be plotted the benefits in this approach that simplifies linking to on a map. further information such as author networks, to en- Valsecchi et al. [54] use a cartographic metaphor to able, among others, tracking current and predicting fu- generate overviews, aiming to help users obtain an un- ture trends in research. The authors use a semantically- derstanding of data structure and content. They rely on enabled journal portal [31] to present scientomet- interaction techniques that follow Shneiderman’s vi- rics derived from analysis of data retrieved from the sual information-seeking mantra [49]. The visual lay- SPARQL endpoint for this dataset and the results of 6 A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise statistical, trend and topical analysis. The interactive have difficulty formulating formal (SPARQL) queries, portal uses a number of visualisation types in a dash- Kazemzadeh et al. [26] provide an interactive visuali- board to provide multiple perspectives for browsing sation dashboard to enable intuitive search and explo- content and the metrics derived from the data. By link- ration, employing pre-specified query templates that ing to other LD describing research activity, the dash- return results using a colour-coded, force-based net- board also illustrates spatial distribution of author in- work layout that depicts component type, interaction fluence based on network and citation analysis. type, and strength of the relationship between compo- The field of Bioinformatics emerged as a result of nents. the introduction of computational techniques to aid the Another area outside the SW community where LD analysis and use of the very large, open, multi-media plays a key role is in data journalism, due to the databases generated in the Biological and Life Sci- clearly recognised benefits in following (structured) ences, due to the introduction of advanced technology links between related-but-independent sources of data for experimentation. This has enabled significant re- and knowledge. LD can contribute to improving ex- search and discoveries such as the mapping of the hu- ploration and enriching information retrieval, comple- man genome [51], which in turn have led to the gen- menting new approaches to curating and delivering eration of further data, enabling further scientific re- news stories that threaten to disrupt traditional jour- search. Welter et al. [58] discuss the benefits of adopt- nalistic practice. Dowd [13] highlights the need to ing SW technology to aid data structuring and increase (re)train and upskill a new generation of investiga- automation in the periodic update of the growing cat- tive journalists to enable them to transform big and alogue for the Human Genome project [see 51]. They open data into basic to advanced graphics and in- demonstrate additional benefits such as a reduction in teractive visualisation, to meet expectations for re- the reliance on manual generation of representative vi- defining and augmenting traditional news reporting sual overviews of the data, and, further, the ability to and storytelling. The study also recognises how LOD, create additional, interactive visualisations by appro- along with the use of data collected via social me- priating other technology, using the data encoded to dia platforms, enables the ordinary “interested” pub- simplify and widen (re)use. Welter et al. [58] illustrate lic to aggregate existing and/or create and report their extensibility of the approach to capture relevant infor- own news stories, with both professional and such mation from other related, open biological data stores “lay journalists” typically appropriating the medium to and publications, as well as seamless linking to and provide additional context for stories using statistical more expressive querying of data both within and out- charts and geo-visualisation. Professionals with access with the field. For example, by extending their ontolo- to high-end display hardware such as smart screens are gies, domain experts are able to retrieve key informa- further able to improve their art by overlaying this re- tion not captured in experimental data and study related data and corresponding analytics on the visual sults, such as the ethnicity of participants, by linking story. to geographical and ethnology databases. Data journalism, whether by professionals, lay jour- Kazemzadeh et al. [26] also discuss the challenges nalists, as well as other content creators and aggrega- in obtaining overviews of biological data generated in tors such as bloggers, employs tools for data explo- different ways and for different purposes. Although ration, retrieval, recommendation and annotation. Re- structured using standard ontologies, these data sel- fer [53] is an example of a text-based, visual explo- dom define explicit links between related attributes. ration prototype that uses Named Entity Linking to re- The authors therefore make the case for unifying trieve and recommend entities for annotating text us- frameworks that support synthesis of and analysis ing data retrieved from DBpedia [12,29]. By exploiting across these datasets, to enable more comprehensive the links present in LD, refer provides end-users with analysis that considers relevant information captured support for navigating from their start point to explore by different experiments and in disconnected datasets. other related entities, by traversing data across four They use the construction of a custom domain ontol- categories, colour-coded to aid recognition as Per- ogy describing links across data of interest and LD son, Place, Event and the general, abstract con- principles to demonstrate how this may be achieved, cept of Thing. to improve the discovery of biological interaction at Javed et al. [25], similarly, illustrate the importance the molecular level, a key component of research into of intuitive navigation of LD in VIZ-VIVO, which uses human diseases. To support domain experts who may radial network maps and sunburst diagrams to gen- A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise 7 erate information overviews describing research com- One benefit in the pipeline or integrated approach munities, by building on the structure of a custom on- to tackling requirements for LD visualisation is that tology. The aim is to support exploratory navigation the process itself serves as a knowledge retrieval task through the data graph, and, as a result, richer, context- and a form of evaluation of both the approach and driven information retrieval. Javed et al. [25] demon- the tools and techniques that feed into it [see, e.g., strate how this approach enables users to answer ques- 21,37,47]. The development team can only progress to tions about interaction within research communities the next stage in what is often an iterative pipeline or and measure the impact of research at different lev- workflow if they have been successful in: (a) correctly els. identifying relevant resources; (b) extracting data and Rakhmawati et al. [45] illustrate the application of metadata that meet the user’s context and the require- LD principles to financial data reporting and the col- ments of their task or end goal; (c) preprocessing this lection of information on crowdsourcing, to promote data, where necessary, to feed into the visualisation more equal distribution of funds to mosques in Indone- tools available and the analysis required; (d) carrying sia, where such institutions also serve a social func- out initial exploratory analysis, using analytical and/or tion. The authors use a web-based portal and a Twitter mining models, statistical and/or visualisation-driven portal to collect data about individual and joint contri- methods. These initial steps will confirm whether the butions to mosques, and merge the financial data into input data and preprocessing will feed into solving the a (centralised) LD store. This approach enables further problem at hand, and therefore guide the identification data enrichment by linking to geolocation and other of suitable approaches for more detailed analysis, or relevant information, allowing the aggregated finan- may need to be reviewed to improve the process or cor- cial information to be displayed on a map of mosques rect issues found. coupled with statistical charts displaying more detail, We therefore now see an increase in usability eval- highlighting where non-uniform distribution of fund- uation of tools built on SW technology. Further, the ing occurs. Rakhmawati et al. [45] indicate the poten- open data initiative has spurred the sharing of what is tial to carry out further analysis as their approach sim- typically considered personal or in-house research data plifies the porting of the resulting LD to other applica- (such as interim analysis and evaluation data) along tions. with final analysis results, to encourage verification of research by third parties and contribute to benchmark datasets. Tietz et al. [53], for instance, make available 3. Evaluating usability & utility online anonymised evaluation data for their tool refer. They evaluate usability of the text visualisation tool Recognition of the need to support human end-users with a range of regular Web users, a quarter of whom (in addition to the focus on machines) is receiving described themselves as LD experts, while the oth- greater traction within the SW community, especially ers had passing to no knowledge of Linked Data. The for applications targeted at lay users and even domain study found that while the UI was, overall, seen as use- experts who are not necessarily technological experts. ful, and navigation sufficiently intuitive to allow both This is critical to extending reach of the technology user types to discover the information sought, some UI beyond the SW community, and especially important features for revealing additional information were not as expectations of end users grow with advances in recognised especially by the lay users. An interesting technology. Further, as data size and complexity in- finding: criteria for categorising the data and propos- crease exponentially with ubiquitous use of said tech- ing recommendations, a key component in obtaining nology [39,58, among others], the need for intuitive, more complete understanding of relationships revealed user- and task-focused tools for navigating through the as users navigate through data, was assumed by some noise to identify and make effective use of relevant users to be the whole document, rather than the entity knowledge, within the user’s context of use and field of with the focus. This highlights a significant knowledge reference, increases. This is true for domain and tech- gap between SW experts and lay users, with the push nical experts and lay users, the latter of whom range in the SW from a Web of documents to a Web of data, from technophobic or low technology awareness, to and the finer granularity available with the latter. those who routinely use a range of advanced techno- Benedetti et al. [2] in what they describe as qual- logical devices and software in their personal, social itative evaluation, involving both lay and technical and working lives. users, report high satisfaction and accuracy across both 8 A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise user types for tasks requiring browsing of their visual Linked Data, open or closed, evokes discussions schema summaries (91% with up to 5 min to com- about big data, within the SW community and further plete), and successful task completion based on visual afield, in government and in industry, due to its size, query generation (90% with up to 15 min to complete). complexity, heterogeneity, dynamicity, distributed na- It should be noted that the authors restrict the tasks to ture and differences in granularity, quality and verac- those that can rely on browsing the schema summaries, ity. But LD goes beyond simply being big, serving as and do not compare their results with benchmarks, nor an enabler with potential to address some of the key mention whether or not relevant benchmarks exist. issues in managing vast amounts of data. While not re- Valsecchi et al. [54] target technological experts quired to follow a schema, a situation that contributes and evaluate their tool only with users who have a to challenges in its use, the inherent structure of LD Computer Science background. Participants worked supports identification of the links within a dataset and with two maps generated from highly inter-connected across to other independently generated data, enabling DBpedia [12] data and very disconnected Linked- more seamless exploration and richer information dis- MDB [19,32] data, respectively. The evaluation so- covery. The authors in this special issue highlight the licited qualitative and quantitative information on in- importance of following guidelines and best practice tuitiveness and learnability of the UI, and how well for curating Linked Data as well as those for visual it supported users in obtaining an understanding of information presentation and analysis, to work toward data structure, including connectedness and content. realising the full potential of LD. Results were mixed: success and correctness in com- Fu et al. discuss the need to evaluate visualisation- pleting tasks ranged from 65% to 100%, with higher based techniques used in ontology understanding, a scores overall for the DBpedia layout. However, par- task that contributes to the process of ontology map- ticipants were able to discover only some of the key ping, which is in turn a contributor to linking indepen- functions, including a subset of those required for nav- dent datasets. The authors argue that such evaluation is igation. A key finding was the importance of provid- key to the design of usable, intuitive, user-facing tools. ing additional textual detail to complement and con- Nuzzolese et al. mine the link structure in LD to firm the conclusions drawn from the visual overviews. rank importance of data associated with a knowledge This is not an unusual finding: Ware [56], for instance, entity of interest, based on popularity with respect to illustrates how text may be used to complement visual the contributions of the crowd to encyclopaedic knowl- thinking and understanding. edge. The results are used to generate visual sum- We conclude this section with a summary in Table 1 maries about the topic of interest and support step-wise of prototypes built and new approaches described in navigation to other related data. Bikakis et al. aggre- this section and in Section 2, by comparing these ac- gate LD hierarchically based on its backing ontolog- cording to a set of criteria for usable visualisation, de- ical structure, to improve navigation through the data rived from guidelines for interactive visualisation and using a variety of visualisation techniques built on top visual information seeking found in [49], [see also 9]. of the resulting aggregates. Scheider et al. harness the It should be noted that the assessment is made using familiar metaphor provided by maps to improve navi- authors’ descriptions where an implemented prototype gation through LD, and hence, exploration and query- is not available publicly; such cases are highlighted ing of LD with spatio-temporal attributes. with an asterisk after the tool name. A number of one-off visualisation solutions exist. However, low to no reusability limits their usefulness. An important challenge being tackled is, therefore, the 4. The special issue design of workflows, pipelines and integrated, multi- perspective solutions that tackle different stages in the The call for papers for a special issue of the Se- complete Linked Data management cycle, typically fo- mantic Web Journal on “Visual Exploration and Anal- cusing on data selection and pre-processing, initial ex- ysis of Linked Data” was posted in June 2014. We re- ploration and a selection of tools for more detailed ceived 13 formal expressions of interest and 11 com- analysis of ROIs and for presenting the results of anal- plete submissions. Of these, 6 were accepted for pub- yses. Brasoveanu et al. describe the task-centred pro- lication, addressing between them the three key com- cess they followed to build a visualisation dashboard ponents identified in the call, that feed into effective for decision support, comprising a set of coupled tools Linked Data visualisation (see also Section 2). for aggregation and analysis of linked open statistical A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise 9

Table 1 Features addressed for the LD visualisation approaches and tools reviewed. “x*”: partial implementation or limited scope in tool/approach; “ ”: feature is not provided but could be implemented for the tool/approach; “–”: feature not relevant to or within scope for tool/approach Usability Criterion LD- Sparql- LODeX NHGRI LinkedPPI EarthCube OpenData- Linked VIZ- SWJ Sciento- refer VOWL FilterFlow GWAS GeoLink ZIS* Data Maps VIVO* metrics Diagram Visual overview x x* xxxx Multiple xxxxxx perspectives Coordinated views x x x Data overview x* Graph xx x x* visualisation Ontology/RDF xx x x x graph view Detail on demand x x* xxxxxxx* xx RDF URIs & x* xx x labels Highlight links in xx xx x x xx x x data Support for x* x* x* xx* xx* scalability Query (formal x syntax) Query x* xx* xx* xx (forms/keyword) Visual querying x x x x* x* Filtering x x x x x x x x x* x History x* x* Presentation xx x x x templates Keyword/DMI xx x x x x entry point Non-domain xx x x* x* xx x* x specific Reusable output x* xx Target – Lay-users x* xx* x* xx* Target – Domain xxxx x x experts Target – xx x x* xxx* Tech-users data. Del Rio et al. extend the five-star methodology of 1812 [38]. The classic map requires mainly human for generating high quality LD. They define a seam- perception to understand its content; expert knowledge lessness metric for visual analytics, with the goal of in history and cartography may provide some advan- improving reuse of the interim and final results ob- tages but are not strong requirements. However, mak- tained during visual analysis, by embedding semantic ing the visualisation and its content searchable in a data into the process and the analyses’ output. digital data store requires additional annotations, both Cartography is an oft-used metaphor when generat- textual and using visual overlays, to be stored with ing visualisations, as it takes advantage of users’ fa- the map. Scheider, Degbelo, Lemmens, van Elzakker, miliarity with and consequent ability to read maps to Zimmerhof, Kostic, Jones & Banhatti in Exploratory aid exploratory discovery. For example, a user may Querying of SPARQL Endpoints in Space and Time ex- wish to explore Napoleon’s march to Russia in the War amine challenges of this type, faced by users in for- 10 A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise

Fig. 1. Building queries using the click-and-select UI in SPEX, the visual query editor built to support non-technical end users exploring Linked Data. mal querying of data stores as part of the exploration guidelines they specify. The authors report between 45 and question-answering process. Even assuming such and 90 minutes for familiarisation with the tool using annotations exist, end users, whether technological ex- aprintoutofitshelpfile,followedby30to45min- perts or lay persons, require additional expertise to suc- utes to complete a set of exploratory tasks. Participants cessfully retrieve information required to answer their made good use of all query features, including func- questions; namely, the need for expertise in formal tionality for reuse of previous results. However, none query languages such as SPARQL, at least basic under- were successful at completing all tasks, with an over- standing of data structure and content, and depending all success rate of 52%. The authors note, though, that on the data types involved, understanding of complex even where participants were not successful in com- data constructs. pleting the tasks, in some cases they tried alternative The authors therefore look at two data types – spa- strategies to answer the questions, with some of these tial (physical or geographical space) and temporal alternatives leading to partial success or a correct an- (time) – that occur frequently across many domains. swer(s). Scheider et al. employ space and time as features that Key challenges faced by the participants were re- may be used to explore individual data sets and as a lated to the unfamiliar use of terminology and label- bridge between data silos, by harnessing the everyday ing of data – which, derived directly from the RDF, metaphor of maps coupled with a timeline. They also would be relatively easily interpreted by a SW expert, follow Shneiderman’s mantra [49] to define a set of but did not translate easily otherwise, especially for design requirements to support intuitive, visual explo- the more ambiguous terms. A simple example is dis- ration. They aim to enable a broad range of users, that tinguishing between a map entity and the (visual) geo- do not necessarily have much domain or technologi- graphical map. Other challenges were due to user in- cal expertise, to progressively build and edit queries as terface design issues such as what were found to be re- they explore linked datasets along attributes defining dundant steps to trigger events in the dialogs, and how “space and time”. to generate sub-queries using the timeline. Scheider et al. describe a qualitative usability eval- The authors conclude the paper with future direc- uation carried out with six participants: two librarians, tions for research to more effectively translate their de- two PhD students in Geography, and two ITC lectur- sign guidelines into more intuitive support for a non- ers. The goal of this study was to assess the extent technical, lay user audience. to which their prototype SPEX,theSPatio-temporal Ontologies typically serve as a backbone for struc- Content Explorer (shown in Fig. 1), satisfies the design turing Linked Data, and play a key role in defining A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise 11 links within a dataset and across datasets to other re- visualisation support for completing the task while related data. Tools for constructing and exploring ontolo- ducing confounding factors, on the basis that domain- gies and especially for mapping and alignment of con- or SW experts would require less intuitive support to cepts and properties therefore contribute to effective complete the tasks successfully. Further, the visualisa- management and (re)use of LD. Fu, Noy & Storey in tions were generated to be as representative of current Eye Tracking the User Experience – An Evaluation of applications as possible, by implementing the most Ontology Visualization Techniques recognise a gap be- typical cues for navigation and browsing, focusing on tween work to develop ontology visualisation tools and ROIs, filtering and hiding data. Also, each task started the usability evaluation required to ensure best prac- with only the root node visible, so that participants had tices are followed in the design and development of full control over their navigation paths. such tools. They highlight the resulting negative im- The results, based on 31 participants (five were dis- pact on the appropriation of useful and usable tech- carded due to incomplete eye tracking data), found niques that map to user expertise and knowledge re- the indented lists to be more effective for searching, quirements in ontology engineering and related knowl- while the graphs better supported information process- edge management tasks, Linked Data being one obvi- ing. Participants completed tasks more quickly using ous example. the lists, at least in part due to the more compact visu- Fu et al. use eye tracking to collect empirical evi- alisation requiring less time to scan. However, they did dence about usability in a comparative study of two of not improve on accuracy over the use of the graphs. the most widely used visualisation techniques in on- Evidence on which of the two resulted in higher cog- tology understanding: indented lists and trees. The au- nitive load was inconclusive, with no statistical differ- thors argue that eye tracking, being able to directly ence between the two visualisation techniques. Over- capture user actions, provides more in-depth measures, all, participants spent more time on information search at a micro level, to help determine why an approach or than in processing; Fu et al. suggest that more research tool may be effective, as opposed to reliance only on be conducted about building support for search in on- traditional measures of performance during usability tology visualisation. evaluation such as time to completion and success rate, The paper concludes with pointers to a more de- that provide indirect means for capturing usability. tailed investigation of affordances for information The study examined four hypotheses: (i) search, search and processing in ontology visualisation tools, (ii) information processing, (iii) ability to reduce cog- including more efficient use of screen space and inves- nitive load, (iv) why a particular technique supports tigation of the specific requirements of different user more efficient and/or effective use by particular types types across a wider set of visualisation techniques. of users. In Aemoo: Linked Data Exploration based on The study involved 36 participants, graduates and Knowledge Patterns Nuzzolese, Presutti, Gangemi, undergraduates from a wide range of disciplines within Peroni & Ciancarini extend previous work on the use the Sciences. They were instructed to perform map- of knowledge patterns to counter challenges faced ping tasks between two sets of ontologies; each ses- by humans and machines attempting to explore the sion lasted approximately 2 hours. Participants were knowledge content of Linked Data. Nuzzolese et al. required to verify whether existing mappings were cor- argue that encyclopaedic knowledge patterns (EKPs), rect and complete (identifying missing mappings oth- being derived from a collective intelligence store span- erwise). Visualisations of each ontology were provided ning multiple domains, provide a “cognitively sound” to aid task completion, which relied on exploration base from which to carry out exploratory discovery of each pair of ontologies, both created from stan- of complex, often dynamic, heterogeneous data such dard benchmark datasets. One pair was simple and de- as LD. scribed conferences. The second pair from the biomed- EKPs are essentially mini-ontologies that define a ical domain, describing organisms, was more complex. class, and by mining the links to instances in Wiki- The authors note deliberate effort to reduce participant pedia, retrieve relationships to other classes ranked bias due to familiarisation with the data domains and based on popularity or frequency of use. EKPs map to the visualisation techniques, in order to simulate as how humans [contributors to Wikipedia] view (knowl- closely as possible challenges faced by novices (with edge about) entities and inter-relationships within this no knowledge of SW technologies and especially on- knowledge. By retrieving the most important/highly tologies). This restriction allowed measurement of the ranked relationships (above a variable threshold), 12 A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise

Fig. 2. Illustrating from left to right, navigation from the EKP for the entity retrieved for the search term “Napoleon” to the EKP corresponding to the “French Invasion of Russia” in “The War of 1812” (see lower snapshot), through an entity of type Military Conﬂict.

EKPs can be used to summarise knowledge about an reflects all updates to the knowledge base. This is in entity. addition to dynamic data retrieved from Twitter and Nuzzolese et al. describe Aemoo (see Fig. 2), a Google news that correspond to the entity in question. tool targeted at users with neither particular knowl- Aemoo includes a switch that allows users to retrieve edge of the structure or content of a data store, nor what is described as peculiar knowledge or curiosi- expertise in SW technologies. Aemoo demonstrates ties – entities linked to the focus that fall below the how EKPs may be used to explore knowledge bases. popularity threshold. This enables users to also visu- The aim is to help users focus on the most “impor- alise information about an entity that would usually not tant” information about an entity and then navigate to be considered important or relevant. other related information. The authors employ a con- Nuzzolese et al. present the results of a comparative cept map coupled with a text-based detail window, ar- evaluation, taking Google as a baseline for exploratory guing that mapping the visualisation to the data struc- search, and RelFinder [35] to test Aemoo’s automatic ture of EKPs aids end-users in forming mental mod- summarisation against manual filtering and existing els that support exploration. However, acknowledging approaches to visual querying and exploration. Thirty- potential challenges in the use of the large, complex, two non-technical undergraduates from two universi- unwieldy graphs entailed by the inter-linking of large- ties completed three tasks using Aemoo and, alter- scale, multi-dimensional datasets, they restrict their nately, one of the other two tools. The authors report concept maps to a depth one level below the root. Their that each of the three tools performed better than the approach filters out less relevant or important infor- other two for one task – RelFinder at summarisation, mation, thereby also reducing noise. Because Aemoo Aemoo at discovering related entities, and Google at queries Wikipedia on-the-fly, the information retrieved discovering relations between entities. Overall, how- A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise 13

Fig. 3. Visualising the results of hierarchical aggregation of Linked Data in the SynopsViz prototype. Clockwise from top, left: dataset statistics; selected classes in a treemap; selected spatial attributes. Bottom, left: middle snapshot zooms in to show detail, before zooming back out and including another temporal attribute. ever, Aemoo outperformed both RelFinder and Google and analysis, hierarchical or not, complemented by sta- search, with RelFinder performing worst. While Ae- tistical summaries of the input data. The authors note moo also scored higher for learnability and usability, also that while they focus on data encoded as RDF, the based on the SUS questionnaire [see 6], there was no approach is reusable for data in other forms, provided significant difference with the other tools. it is numerical and/or temporal. Based on these evaluation results, Nuzzolese et al. Figure 3 shows SynopsViz, a prototype built to are considering, among others, reusing functionality demonstrate the results of applying this hierarchical in RelFinder to present relationships in Aemoo. They aggregation model to Linked Data. Three visualisation conclude with pointers to additional work to improve options, a timeline, a treemap and chart views, are pro- presentation of, and navigation in, knowledge bases. vided for visualising the output. Bikakis, Papastefanatos, Skourla & Sellis in AHi- Bikakis et al. assess the effectiveness and efficiency erarchical Aggregation Framework for Efficient Multi- of their approach by comparing it with a baseline de- level Visual Exploration and Analysis tackle the chal- scribed as FLAT – displaying all detail for a dataset. lenges faced in exploratory analysis of today’s big, het- The test measured (initial) response time for data pre- erogeneous, dynamic data, to support immediate and processing and subsequent response time for display- on-going analysis online. Bikakis et al. posit that a hi- ing data of interest, using the DBpedia 2014 dataset erarchical approach to data processing and aggregation [10]. Construction and response time for the two hi- enables customisable construction of data overviews erarchical approaches presented were roughly equiva- on the fly, from both static and especially dynamic, lent, with minor improvement for the range-based over numerical and temporal data. Followed by support for the content-based tree. For data up to 10,000 triples, identifying and drilling into the details of ROIs, the au- FLAT performance approaches that of hierarchical ag- thors aim to provide support for end users in navigat- gregation. Significant cost in construction of the hier- ing through and making sense of big data. archical approaches was found to be due to communi- Their generic model takes as input data regardless of cation, notable for small datasets, but decreasing pro- structure and generates output structured to allow the portionally for larger datasets due to additional time use of varied visualisation techniques for presentation required to render the latter. To improve overall re- 14 A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise sponse time, the authors also enable incremental con- 1–3 star mundane data to low-cost, inter-connected, struction for the hierarchical approach. With respect to semantically-enriched, 4–5 star data. They extend the data properties, the test results show that FLAT was PROV ontology [28] to capture data and the analytical restricted to a maximum of 305,000 triples, while the process, and to illustrate how inter-connected, seman- hierarchical approach successfully processed all data tic, Linked Data may be used to build a bridge between subsets. Further, the latter was found to outperform independent datasets and the results of analyses, thus FLAT for all tests, with increasing difference in re- lowering the cost of performing those analyses. sponse time with dataset size. Del Rio et al. describe an application scenario where Usability evaluation of the SynopsViz tool was also a student with technological expertise analyses data carried out with ten participants with a background in from multiple stores to obtain information about de- Computer Science. The evaluation involved three sets commissioned satellites and other equipment aban- of tasks across all three approaches to measure users’ doned in space. The scenario illustrates the need to ability to (i) discover specific information, (ii) discover switch constantly between semantically-rich data de- data within defined ranges and (iii) compare and rank scribing the student’s focus, and more mundane rep- data subsets. The comparative evaluation was carried resentations of this data in visual form, that are eas- out with two datasets, one containing 970 triples and ier to interpret but whose content cannot be reused for the second almost 38,000. For all three approaches, the further analysis. The analysis is therefore constrained, authors found significant differences in task comple- with significant cost incurred by moving between mention time for the larger dataset size, due to the much tal models of the data formed as the analysis pro- longer time spent navigating through data structure gresses and alternative representations that break these and content. The hierarchical approaches significantly models. Some of these require further data munging, outperformed FLAT for the larger datasets, because resulting in attention loss with respect to the data and data clustering reduced the number of navigation steps their semantics. The student then passes on her initial required to find matching responses. For the smaller results to a fellow student, who reviews and extends the dataset, however, time to completion for FLAT ap- analysis to confirm her results and answer open ques- proached that of the hierarchical approach, due mainly tions. The scenario illustrates further difficulty reusing to the need for a larger number of navigation steps the initial results, due to limited provenance informa- through the hierarchy. The range task was also partic- tion about the data and the lack of traces about the an- ularly difficult to perform using the FLAT approach – alytical process followed. Cost again increases when cognitive load may have contributed to this as partic- the first student attempts to reuse the analysis results ipants had to browse a significant portion of the data of her colleague; while the new visualisations provide (displayed in detail) to complete the task. The range- further information, they are not directly connected to based and content-based approaches differed slightly the underlying linked and semantically-enriched data. in performance time depending on which approach Del Rio et al. utilise the outcomes of this scenario better matched data clustering to task type. The authors to derive an application seamlessness metric that takes note also a key concern of participants that hampered into account the initial cost of analysis and the cost of navigation: getting lost during exploration. reusing the results of prior analyses. The metric is de- The authors conclude with pointers to future work, fined using ontology restrictions that assign costs rang- including a hybrid version that would employ a ing from one to five stars. The costs reflect the impact weighted combination of each of the content-based of employing a set of applications or following specific and range-based approaches to aggregation. approaches in the data analysis process. The proposal Del Rio, Lebo, Fisher & Salisbury in A Five-Star harnesses the strengths of both the VA and SW com- Rating Scheme to Assess Application Seamlessness as- munities for managing data to avoid pain points and re- sess the cost of coupling multiple visualisation tools duce cost in data analysis, instead increasing effective- together on different platforms, with the aim to pro- ness. A key component of the approach is maintain- mote seamless reuse of visualisation solutions and ing the links between the analyst, the analytical tools therefore more effective analysis. in use, and input data (including previous analysis re- The authors use the munging task in Visual Ana- sults). lytics (VA) – processing of input data to fit a spe- The authors conclude with proposals for future work cific tool’s requirements – to illustrate variation in cost to take into account also analysts’ expertise and back- due to data structure and semantics, from high-cost ground in measuring the cost of using selected tools or A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise 15 approaches. They also discuss refining their guidelines tions industries. They then summarise the results of to capture more effectively the cost difference between exploratory evaluation of their ETIHQ dashboard, to individual star ratings, as opposed to ranges, in their measure usability against traditional approaches to seamlessness metric. cross-domain analysis for decision support, and iden- Today’s inter-connected, data-driven economy re- tify additional functionality required by target users. quires decision-makers to consider the impact of mul- Sixteen participants took part in the evaluation, all tiple external factors using data from a variety of in their usual work environment: ten tourism experts sources and increasingly outside their domain, in order based in different countries and six researchers. They to feed wider contextual data into informed decision- completed two activities with the dashboard, com- making. However, the cost of access to third party data, prising preset and user-defined tasks, and a third ac- especially where proprietary, and support for effec- tivity using their usual working tools. The ETIHQ tive analysis of the resulting heterogeneous datasets, dashboard was observed to perform significantly bet- are often beyond the means of smaller organisations ter than manual (traditional) approaches to question- and research institutions. Brasoveanu, Sabou, Scharl, answering, with participants providing more detailed Hubmann-Haidvogel & Fischl in Visualizing Statisti- and precise responses, accompanied by a gain in task cal Linked Knowledge Sources for Decision Support completion time of almost 30%. Recommendations for look at the potential of the growing collection of sta- improvements to the dashboard included, among oth- tistical Linked Open Datasets to fill this need. These ers, the use of a simpler workflow, the implementa- datasets, including the Eurostat database [11] and the tion of search features, and the use of domain termi- World Bank Catalog [41] are built on the RDF Data nology (as opposed to terminology reflecting the use Cube Vocabulary [8]. of SW technology for encoding data). Participants also Brasoveanu et al. identify a set of challenges in- requested the inclusion of news and social media data. herent to the exploration and analysis of large-scale, The authors conclude with a description of updates heterogeneous, inter-connected data, made more com- to the dashboard based on feedback from the usability plex by the unique challenges found in statistical LD evaluation. analysis – including variation in data structure and We summarise in Table 2 the contributions in this content, inconsistent and poor use of data schemas, special issue by looking at the set of visual presenta- and frequent failures of dynamic data portals such tion and analysis features that was also used in Table 1 as SPARQL endpoints. They therefore highlight the to compare other related work. For completeness, two need for integrated visual, multi-perspective analysis additional features relevant to LD visualisation and solutions that enable complex question-answering over analysis are included: faceted search/browse and sup- such data, that scale, and that are reusable beyond spe- port for editing underlying data. Neither of these fea- cific use cases. tures was implemented in any of the tools in Table 1, The authors describe the iterative, scenario-driven and we only see limited implementation here. process they followed through requirements collection, design and prototyping, following best practices 4.1. Reviewers and guidelines for visual analysis, with a focus on the specific requirements of Linked Data and statisti- This special issue would not have been possible cal data. This process resulted in a dashboard contain- without the support of our reviewers, who provided de- ing multiple, coordinated components for data prepro- tailed and informed feedback to authors throughout the cessing and integration, visualisation-driven analysis, multiple iterations on each paper. We thank all review- and for effective presentation of the often also multi- ers for their contribution to the special issue, and list dimensional analysis results. Based on their findings – them below in alphabetical order. challenges faced and successful application of re- James Burton University of Brighton, England, UK search – they propose a set of guidelines and work- Mark Gahegan University of Auckland, New Zealand flows for tackling the challenges in building decision Roberto García González Universitat de Lleida, support systems over Linked Statistical Data. Spain Brasoveanu et al. illustrate the advantages in pro- Luc Girardin Macrofocus GmbH, Switzerland viding multiple perspectives on the data (slices) and Florian Haag University of Stuttgart, Germany the results of analyses using a set of scenarios about John Howse University of Brighton, England, UK decision support in the tourism and telecommunica- Valentina Ivanova Linköping University, Sweden 16 A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise

Table 2 Design features for visual presentation and analysis of Linked Data addressed in contributions to the special issue. “x*”: partial implementation/limited scope; “ ”: could be implemented; “–”: not relevant/out of scope Usability Criterion Scheider et al. Fu et al. * Nuzzolese et al. Bikakis et al. Del Rio et al. * Brasoveanu et al. * Visual overview x* x* x* xx x Multiple perspectives x x x x x Coordinated views x* xx Data overview – x x* Graph visualisation x x x Ontology/RDF graph view x x x* Detail on demand x x x x x x RDF URIs & labels x x x Highlight links in data x x x x Support for scalability x x* x* Query (formal syntax) x – Query (forms/keyword) x x x Visual querying x x x Faceted Search/Browse x* x Filtering x* x* xx x History x x x Presentation templates x Keyword/DMI entry point x x x x* x Non-domain speciﬁc x x x x x x Edit underlying data x* Reusable output x* x* Target – Lay-users x* xx* Target – Domain experts x* xxx Target – Tech-users x x* x* xx

Tomi Kauppinen Aalto University School of Sci- users, primarily through interactive, visual represen- ence, Finland tations of this novel form of data whose characteris- Steffen Lohmann Fraunhofer IAIS, Germany tics are quite unique, as discussed earlier. While these Heiko Paulheim University of Mannheim, Germany characteristics are what make LD such enablers,they Jan Polowinski TU Dresden, Germany are also the reason why it is proving so difficult to de- Mariano Rico Universidad Politecnica de Madrid, sign effective user interfaces for LD and for the Web Spain of Data at large. Bernhard Schandl mySugr GmbH, Austria As demonstrated in the papers in this special issue, Thomas Wischgoll Wright State University, USA we continue to see advances in research on the use of visualisation for making LD more accessible. Ef- 5. Future challenges for Linked Data visualisation forts have been moving from producing simple, basic demonstrators of LD browsing to more elaborate tools In their introduction to the Semantic Web Journal’s that integrate data capture, exploration, analysis and/or first set of LD description papers in 2013, Hitzler and presentation in a visualisation-driven front-end. How- Janowicz [21] identify Linked Data as “enablers in re- ever, there is still significant room for improvement. search”, stating its importance. We reiterate this state- Hendler [20] in 2008 raised the “chicken-and-egg ment confidently: even if many challenges remain, LD problem” regarding the adoption of Semantic Web has significant value as a rich, structured source of technology outside the community. The technology knowledge for both research and practical applica- was not sufficiently proven at the time, even though tions. RDF and SPARQL had been introduced several years The challenges discussed in this special issue re- before, and examples of success using ontologies to volve mainly around making LD accessible to end- describe data already existed. Government, state-run A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise 17 institutions and industry were reluctant to dedicate made over the years to visualise these tree and graph resources to encoding and sharing their varied data structures. Such approaches, however, have had lim- to grow this new “Web of Data”, but data was re- ited success; even those that have explored alternatives quired to feed into the development of sophisticated to the typical representation based on node-link dia- tools for managing and interacting with this new Web. grams, such as OntoTrix, which investigated the poten- Seven years later, Neish [40] describes several exam- tial of a hybrid representation involving adjacency ma- ples of successful use of Linked Data in different do- trices [1]. While node-link diagrams may be effective mains outside the Semantic Web community, in gov- representations when considering very small graphs, ernment, education and industry. He stresses, however, they fail to provide meaningful visualisations beyond a that links across independently created datasets rarely few dozen nodes and edges, even when styled to better span across domains. Further, technological barriers convey the resources’ and properties’ semantics [43]. continue to hamper the adoption of LD principles in Benedetti et al. [2], for instance, report that evalua- data generation and sharing, with a corresponding in- tion participants wondered how usable their graphical crease in economic and other risks following these schema-based summaries would be for large datasets. principles for data management. In this special issue, Nuzzolese et al. acknowledge this User interface design for LD is still in its infancy. challenge by restricting the concept maps they use to This is reflected in publications describing design and visualise encyclopaedic knowledge to only a single prototypes developed within the research community level below the root (and focus), in addition to setting a for LD visualisation, which are targeted predominantly threshold below which data directly linked to the focus at workshops rather than journals or research tracks in is hidden. conferences. However, workshops dedicated to discus- Indeed, node-link diagrams quickly become very sion of research on interactive visualisation tools for cluttered and illegible, and generate a lot of viscos- LD are one avenue that will help foster research as the ity [15] both in terms of interactive exploration and generation and consumption of LD increases, and as content manipulation. Furthermore, and perhaps more LD use spreads over more and more application ar- importantly, exposing the data’s graph structure might eas. Interestingly, Neish [40] concludes that incremen- not always be relevant. Doing so may therefore have tal adoption of LD may help avoid the acknowledged the unintended consequence of over-emphasising the “chicken-and-egg problem”, by ensuring that the tech- data model to the detriment of the content. In many nology is well developed and proven before the Linked cases, the model is of little importance to users, and Data initiative starts to play a key role in helping to re- it is interesting to observe that early attempts at cre- solve newer challenges such as those posed by today’s ating general-purpose LD browsers, no matter the tar- complex, distributed, heterogeneous, big data [see also geted level of audience expertise, have tended to rely 39,58]. on presentation solutions such as Fresnel [44], that do not expose the graph structure to users, but rather pro- 5.1. Breaking away from graph visualisation vide lenses on this structure that employ, for example, metaphors that relate more closely to end users’ tasks The review of the state of the art (Section 2) and context. shows a predominance of LD representations based on More recent work, including some of the articles in graph visualisation techniques. This is probably due this special issue, are continuing to investigate this av- to the following key factors that encourage the use enue, moving away from node-link diagrams and rep- of node-link diagrams to depict tree and graph struc- resentations of the low-level graph structure in gen- tures: (i) ontologies are often hierarchically structured, eral. Instead, LD front-ends now tend to provide richer, rooted at Thing or another general, abstract topic or more versatile sets of user interface components, each domain concept; (ii) RDF’s data model is a directed designed for different types of data attributes: inter- labeled graph, “ipso facto we use graphs to repre- active map components for resources featuring geolo- sent it” [48]; (iii) network analysis is one of the more cation information, timelines for resources featuring common visualisation-driven tasks carried out within temporal information, etc. Network visualisations can the field, to explore, e.g., collaborations and other in- be part of such suites of UI components, but they terrelationships between researchers and within re- should no longer serve as the main representation of search data, and social networks at large. It is thus not the data. They should rather provide the means to re- particularly surprising that many attempts have been veal some particular aspects of the data, and be used 18 A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise only when appropriate, i.e., when the way the data is and empower them through the resulting visualisations structured at a low level is of importance to the user, and underlying data. Hart and Dolbear [18], Neish or when higher-level hierarchical or network structures [40] discuss the potential for LD to be used to bridge can be derived from it. We see this in, e.g., Halpin and differences in terminology, standards and conventions Bria [17], who in their exploration of DSI overlay links across domains and specialised fields, to ensure that between institutions on a map of the EU to reveal the solutions that employ LD map to users’ context rather collaboration network resulting from, among others, than to the technology. It is important that LD visu- joint projects. This view affords policy-makers, the key alisation is designed with, not just for, the user. Pe- target audience of the study’s outcomes, to gain a quick trelli et al. [42] illustrate how a process of user-centred overview on regions of active participation in DSI, and design, closely involving target users in the collec- how institutions may be working together in this re- tion of requirements and the design process, revealed gard. A second visualisation showing a traditional so- marked differences in expectations and terminology cial network allows a focus on network analysis. How- use between two highly specialised communities of ever, while based on the same data, it loses the addi- practice. Cognisant of this, they employed SW princi- tional contextual cues (found in the map), which are ples to bridge this (not unusual) gap. This ensured that of much greater value to users, both policy makers and their visualisation-driven solutions fully captured the the general public interested in these questions. target users’ environment, and translated data structure and content such that it enabled interaction with the 5.2. Bridging knowledge and semantic gaps between content in ways that mapped to users’ mental models, communities of practice goals and expectations. Further, by effectively employ- Visualisation, whether purely for presenting data or ing SW technologies, the tools helped to trigger new as part of the visual analytics process, is rarely void of insight, by enabling the end users to carry out more text [56, among others]. At its most basic, text provides in-depth investigation of their own data, enriched with descriptive labels of elements in the view. Appropri- related knowledge extracted from additional sources. ately used, supplementary text provides additional detail that increases understanding of complex data and 5.3. Achieving better synergy with the InfoVis and helps to confirm conclusions drawn from visualisation. Visual Analytics communities In the visual analytics process, users will often also interact with the underlying data using snapshots (for Important concepts to take into account in user in- Linked Data, encoded in text as RDF), in summary terface design are simplicity and representativeness. form, or encoded into filters, to vary input or examine Graph visualisation quickly becomes complex, and detail in ROIs. as discussed earlier, the low-level structure of RDF A challenge faced by both lay users and domain ex- graphs, or even the higher-level network formed by LD perts outside the Semantic Web in such cases, how- datasets, will often not be the most relevant representa- ever, is the correct interpretation of such text [18]. Us- tion of the data in terms of content. The graph structure ability evaluation (see Section 3) reveals this to be due provides much expressive power and facilitates inter- largely to differences in mental models of data between linking, but it has been designed with software agents tool developers and end users, when developers de- in mind, not human users. The fact that graph visual- fault to presentation using SW terminology and con- isation offers a direct mapping to SW experts’ mental ventions. For instance, prefix:Entity or pre- model [33] of the data has probably contributed signif- fix:Property are correctly interpreted by a SW icantly to its early use, and as more and more SW users expert, with the (ontology) prefix providing additional, were exposed to such visualisations, the correspond- discriminating context. For a non-expert, this only in- ing increase in visual literacy [5,46,56] may have in creases visual clutter and cognitive load scanning the turn contributed to restricting the range of visualisation view; ontology prefixes use abbreviations and terse techniques considered for LD and the Web of Data at forms that require domain knowledge to interpret cor- large. rectly. Entity, data types and properties vary between Research in visualisation may arguably be reaching self-descriptive labels and terse, technical terminology, a plateau when it comes to the design of new visu- both of which are sometimes ambiguous to lay users. alisation techniques [see, e.g., 55]. But there is quite Carpendale [7] illustrates the importance of a wide range of existing visualisation techniques that grounding visualisation in users’ context, to enable have not yet been considered by the SW community, A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise 19 that could potentially provide more effective support at tive use of it in different contexts, through the use of representing LD and enable users to interact more in- visualisation [55]. Mitchell and Wilson [39] refer to tuitively with it. Conversely, better synergy with the vi- LD as a “broker” that may be used both to reduce cost sualisation and visual analytics communities could be and increase the value of today’s big data. Hitzler and achieved by getting that community to consider LD as Janowicz [21] identify LD as an “enabler”. Neish [40] an opportunity to enable more open-ended visual ex- illustrates how LD is enabling richer, more effective ploration, discovery and analytic processes, thanks to use of independent and related datasets. the unique properties of this form of data. Contributors to this special issue on Linked Data vi- Bridging the gap between these communities is im- sualisation confirm the potential of LD as a unifier and portant. However this will only be beneficial if fur- a bridge across large scale, complex, distributed, het- ther appropriation of existing and new research in vi- erogeneous data and independent tools. By harness- sualisation and visual analytics results in demonstra- ing visualisation as a tool for exploratory discovery ble, positive impact on research and applications in the and basic-to-advanced analysis, the papers in this vol- LD community [39,55]. Following best practices and ume illustrate the design and construction of intuitive guidelines in verified visualisation and visual analy- means for target end users, from a variety of domains sis frameworks and models should help to work suc- and with limited-to-advanced knowledge of technol- cessfully toward this goal. Further, because visualisa- ogy, to obtain new insights and enrich existing knowl- tion design guidelines target development for the hu- edge, as they follow links defined across Linked Data. man end-user, they inherently include an element of We look forward to a growing library of shared knowl- user-centred design principles, reinforcing the benefits edge and visualisation-driven tools that break down in weaving these into the process of LD visualisation. technological barriers, promoting instead richer explo- van Wijk [55] examines the particular challenges ration and intuitive, insightful analysis of users’ per- for effective, insightful visualisation of the increas- sonal context, myriad, shared situations and complex ingly large amounts of complex data by technologi- problems captured in Linked Data, and enable end cal and domain experts as well as lay users, by con- users to draw confident conclusions about data and sit- sidering “The Value of Visualization” from the per- uations and add value to their everyday, knowledge- spectives of (i) efficiency and effectiveness or util- driven tasks. ity in technology; (ii) cost and benefit in economics; and (iii) as a field within the academic disciplines of Art and Science. The paper was written just over Acknowledgement ten years ago, as marked advances in technology had started to trigger the “data explosion” now described Aba-Sah Dadzie is funded by the EU project EDSA as big data. We see today another step in technolog- (EC no. 643937). ical advancement; its findings, however, still remain very relevant, with one that is key to LD – extending research findings beyond the specific academic com- References munity and translating these to practical application in the wider world. The paper also presents a seminal [1] B. Bach, E. Pietriga and I. Liccardi, Visualizing populated ontologies with OntoTrix, International Journal on Semantic model of data-and context-driven, user-focused visual- Web and Information Systems (IJSWIS) 9(4) (2013), 17–40. isation that has been extended more recently to capture doi:10.4018/ijswis.2013100102. also the challenges in triggering insight and support- [2] F. Benedetti, S. Bergamaschi and L. Po, Visual querying ing sense-making and knowledge acquisition through LOD sources with LODeX, in: K-CAP 2015: Proc., 8th In- exploration and perception in the iterative visual ana- ternational Conference on Knowledge Capture,K.Barker and J.M. Gómez-Pérez, eds, 2015, pp. 12:1–12:8. doi:10. lytics process [see 47]. 1145/2815833.2815849. We have used the term value with reference to [3] T. Berners-Lee, Linked Data – Design issues, 2006, https:// Linked Data several times in this paper. However, www.w3.org/DesignIssues/LinkedData, (Accessed 14.10. whether we will in reality reap the value of LD is 2016). directly impacted by the cost of realising this value. [4] C. Bizer, The emerging web of Linked Data, IEEE Intelligent Systems 24(5) (2009), 87–92. doi:10.1109/MIS.2009.102. Cost in terms of time, financial cost, human effort and [5] E. Blevis, E. Churchill, W. Odom, J. Pierce, D. Roedl and skill – technological and domain, and other resources R. Wakkary, Visual thinking & digital imagery, in: CHI EA’12: required to extract this value and make timely, effec- Extended Abstracts, 2012 International Conference on Hu- 20 A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise

man Factors in Computing Systems, J.A. Konstan E.H. Chi portal for scientometrics, in: ISWC 2013: The Semantic Web – and K. Höök, eds, 2012, pp. 2715–2718. doi:10.1145/2212776. Proc., 12th International Semantic Web Conference, Part II, 2212703. H. Alani, L. Kagal, A. Fokoue, P. Groth, C. Biemann, J.X. Par- [6] J. Brooke, SUS: A retrospective, Journal of Usability Studies reira, L. Aroyo, N. Noy, C. Welty and K. Janowicz, eds, 2013, 8(2) (2013), 29–40. pp. 114–129. doi:10.1007/978-3-642-41338-4_8. [7] S. Carpendale, Innovations in visualization, in: GI’13: Proc., [23] IOS Press Journal: Semantic Web – Interoperability, Usabil- Graphics Interface 2013, F.F. Samavati and K. Hawkey, eds, ity, Applicability, http://semantic-web-journal.net, (Accessed Canadian Information Processing Society, 2013, pp. 1–8. 19.10.2016). [8] R. Cyganiak, D. Reynolds and J. Tennison, The RDF data [24] V. Ivanova, P. Lambrix, S. Lohmann and C. Pesquita (eds), cube vocabulary, Technical report, The World Wide Web VOILA! 2016: Proc., ISWC2016 International Workshop on Consortium (W3C), 2014, http://www.w3.org/TR/vocab-data- Visualizations and User Interfaces for Ontologies and Linked cube. Data, 2016, http://ceur-ws.org/Vol-1704. [9] A.-S. Dadzie and M. Rowe, Approaches to visualising Linked [25] M. Javed, S. Payette, J. Blake and T. Worrall, VIZ-VIVO: To- Data: A survey, Semantic Web 2(2) (2011), 89–124. doi:10. wards visualizations-driven linked data navigation, in: VOILA! 3233/SW-2011-0037. 2016: Proc., ISWC2016 International Workshop on Visualiza- [10] Data set 2014 – DBpedia, http://wiki.dbpedia.org/data-set- tions and User Interfaces for Ontologies and Linked Data, 2014, (Accessed 26.10.2016). V. Ivanova et al., eds, 2016, pp. 80–92. http://ceur-ws.org/ [11] Database – Eurostat, http://ec.europa.eu/eurostat/data/ Vol-1704/paper7.pdf. database, (Accessed 26.10.2016). [26] L. Kazemzadeh, M.R. Kamdar, O.D. Beyan, S. Decker and [12] DBpedia, http://dbpedia.org, (Accessed 17.10.2016). F. Barry, LinkedPPI: Enabling intuitive, integrative protein– [13] C. Dowd, The new order of news and social media en- protein interaction discovery, in: LISC2014: Proc., 4th Inter- terprises: Visualisations, linked data, and new methods national Workshop on Linked Science – Making Sense Out of and practices in journalism, Communication Research and Data, J. Zhao, M. van Erp, C. Keßler, T. Kauppinen, J. van Os- Practice 2(1) (2016), 97–110. doi:10.1080/22041451.2016. senbruggen and W.R. van Hage, eds, 2014, pp. 48–59, http:// 1155339. ceur-ws.org/Vol-1282/lisc2014_submission_4.pdf. [14] I. Ermilov, J. Lehmann, M. Martin and S. Auer, LODStats: [27]J.Klímek,M.Necaský,˘ B. Kostov, M. Blasko˘ and P. Kremen,˘ The data web census dataset, in: ISWC 2016: The Seman- Efficient exploration of linked data cloud, in: DATA-2015: tic Web – Proc., 15th International Semantic Web Confer- Proc., 4th International Conference on Data Management ence, Part II, P. Groth, E. Simperl, A. Gray, M. Sabou, Technologies and Applications, M. Helfert, A. Holzinger, M. Krötzsch, F. Lecue, F. Flöck and Y. Gil, eds, 2016, pp. 38– O. Belo and C. Francalanci, eds, 2015, pp. 255–261. doi:10. 46. doi:10.1007/978-3-319-46547-0_5. 5220/0005558002550261. [15] T.R.G. Green and M. Petre, Usability analysis of visual pro- [28] T. Lebo, S. Sahoo, D. McGuinness, K. Belhajjame, J. Cheney, gramming environments: A ‘cognitive dimensions’ frame- D. Corsar, D. Garijo, S. Soiland-Reyes, S. Zednik and J. Zhao, work, Journal of Visual Languages & Computing 7(2) (1996), PROV-O: The PROV Ontology, Technical report, The World 131–174. doi:10.1006/jvlc.1996.0009. Wide Web Consortium (W3C), 2013, https://www.w3.org/TR/ [16] F. Haag, S. Lohmann, S. Bold and T. Ertl, Visual SPARQL sparql11-overview. querying based on extended filter/flow graphs, in: AVI’14: [29] J. Lehmann, R. Isele, M. Jakob, A. Jentzsch, D. Kontokostas, Proc., International Working Conference on Advanced Visual P.N. Mendes, S. Hellmann, M. Morsey, P. van Kleef, S. Auer Interfaces, P. Paolini and F. Garzotto, eds, 2014, pp. 305–312. and C. Bizer, DBpedia – A large-scale, multilingual knowledge doi:10.1145/2598153.2598185. base extracted from Wikipedia, Semantic Web 6(2) (2015), [17] H. Halpin and F. Bria, Crowdmapping digital social innova- 167–195. doi:10.3233/SW-140134. tion with linked data, in: ESWC 2015: The Semantic Web. Lat- [30] Linked Data – Connect distributed data across the Web, http:// est Advances and New Domains – 12th European Semantic linkeddata.org, (Accessed 21.10.2016). Web Conference, F. Gandon, M. Sabou, H. Sack, C. d’Amato, [31] Linked Data portal for the Semantic Web Journal, http:// P. Cudré-Mauroux and A. Zimmermann, eds, 2015, pp. 606– semantic-web-journal.com/SWJPortal, (Accessed 19.10. 620. doi:10.1007/978-3-319-18818-8_37. 2016). [18] G. Hart and C. Dolbear, Geographic information in an open [32] LinkedMDB, http://linkedmdb.org, (Accessed 19.10.2016). world, Linked Data: A Geographic Perspective, CRC Press, [33] Z. Liu and J.T. Stasko, Mental models, visual reasoning 2013, Chapter 4. doi:10.1201/b13877-5. and interaction in information visualization: A top-down per- [19] O. Hassanzadeh and M.P. Consens, Linked movie data base, spective, IEEE Transactions on Visualization and Computer in: LDOW 2009: Proc., Linked Data on the Web Work- Graphics 16(6) (2010), 999–1008. doi:10.1109/TVCG.2010. shop at WWW’2009, C. Bizer, T. Heath, T. Berners-Lee and 177. K. Idehen, eds, 2009, http://ceur-ws.org/Vol-538/ldow2009_ [34] LODStats project, http://lodstats.aksw.org, (Accessed 14.10. paper12.pdf. 2016). [20] J. Hendler, Web 3.0: Chicken farms on the Semantic Web, [35] S. Lohmann, P. Heim, T. Stegemann and J. Ziegler, The Computer 41(1) (2008), 106–108. doi:10.1109/MC.2008.34. RelFinder user interface: Interactive exploration of rela- [21] P. Hitzler and K. Janowicz, Linked Data, Big Data, and the 4th tionships between objects of interest, in: IUI’11: Proc., paradigm, Semantic Web 4(3) (July 2013), 233–235. doi:10. 15th International Conference on Intelligent User Interfaces, 3233/SW-130117. C. Rich, Q. Yang, M. Cavazza and M.X. Zhou, eds, 2010, [22] Y. Hu, K. Janowicz, G. McKenzie, K. Sengupta and P. Hit- pp. 421–422. ISBN 978-1-60558-515-4. doi:10.1145/1719970. zler, A linked-data-driven and semantically-enabled journal 1720052. A.-S. Dadzie and E. Pietriga / Visualisation of Linked Data – Reprise 21

[36] S. Lohmann, S. Negru, F. Haag and T. Ertl, Visualizing on- ics 20(12) (2014), 1604–1613. doi:10.1109/TVCG.2014. tologies with VOWL, Semantic Web 7(4) (2015), 399–419. 2346481. doi:10.3233/SW-150200. [48] m.c. schraefel and D. Karger, The pathetic fallacy of RDF, in: [37] G. Mai, K. Janowicz, Y. Hu and G. McKenzie, A Linked Data SWUI 2006: Proc., 3rd International Semantic Web User Inter- driven visual interface for the multi-perspective exploration of action Workshop, 2006, http://swui.semanticweb.org/swui06/ data across repositories, in: VOILA! 2016: Proc., ISWC2016 papers/Karger/Pathetic_Fallacy.html. International Workshop on Visualizations and User Interfaces [49] B. Shneiderman, The eyes have it: A task by data type taxon- for Ontologies and Linked Data, V. Ivanova et al., eds, 2016, omy for information visualizations, in: Proc., 1996 IEEE Sym- pp. 93–101. http://ceur-ws.org/Vol-1704/paper?pdf. posium on Visual Languages, IEEE, IEEE Computer Society, [38] C.-J. Minard, Carte figurative des pertes successives en 1996, pp. 336–343. doi:10.1109/VL.1996.545307. hommes de l’Armée Française dans la campagne de Russie [50] The Datahub, http://datahub.io, (Accessed 14.10.2016). 1812–1813 (Figurative Map of the successive losses in men [51] The International Human Genome Mapping Consortium, A of the French Army in the Russian campaign 1812–1813), physical map of the human genome, Nature (2001), 934–941. in: Tableaux graphiques et cartes figuratives, Ecole nationale doi:10.1038/35057157. des ponts et chaussées, Fol. 10975, Autogr.Regnier et Dour- [52] The W3C SPARQL Working Group, SPARQL 1.1 overview, det (Paris) edition, 1862. Available at Bibliothèque numérique W3C recommendation 21 March 2013, Technical report, The patrimoniale des ponts et chaussés, (Accessed 28.10.2016). World Wide Web Consortium (W3C), 2013, https://www.w3. [39] I. Mitchell and M. Wilson, Linked Data – Connecting and org/TR/sparql11-overview. exploiting big data, Technical report, Fujitsu Services Lim- [53] T. Tietz, J. Jäger, J. Waitelonis and H. Sack, Semantic anno- ited, 2012. https://www.fujitsu.com/uk/Images/Linked-data- tation and information visualization for blogposts with refer, connecting-and-exploiting-big-data-(v1.0).pdf. in: VOILA! 2016: Proc., ISWC2016 International Workshop on [40] P. Neish, Linked Data: What is it and why should you care? Visualizations and User Interfaces for Ontologies and Linked The Australian Library Journal 64(1) (2015), 3–10. doi:10. Data, V. Ivanova et al., eds, 2016, pp. 28–40. http://ceur-ws. 1080/00049670.2014.974004. org/Vol-1704/paper3.pdf. [41] Open Data Catalog – The World Bank, http://datacatalog. [54] F. Valsecchi, M. Abrate, C. Bacciu, M. Tesconi and worldbank.org/, (Accessed 26.10.2016). A. Marchetti, Linked Data maps: Providing a visual entry [42] D. Petrelli, A.-S. Dadzie and V.Lanfranchi, Mediating between point for the exploration of datasets, in: IESD 2015: Proc., AI and highly specialized users, AI Magazine 30(4) (2009), 4th International Workshop on Intelligent Exploration of Se- 95–102. Special Issue on “Usable AI”. mantic Data, D. Thakker, D. Schwabe, K. Kozaki, R. Garcia, [43] E. Pietriga, Semantic web data visualization with graph style M. Brambilla and V. Dimitrova, eds, 2015, http://ceur-ws.org/ sheets, in: SoftVis’06: Proc., 2006 ACM Symposium on Soft- Vol-1472/IESD_2015_paper_2.pdf. ware Visualization, E. Kraemer, M.M. Burnett and S. Diehl, [55] J.J. van Wijk, The value of visualization, in: IEEE VIS 05: eds, 2006, pp. 177–178. doi:10.1145/1148493.1148532. Proc., Visualization 2005, Oct. 2005, pp. 79–86. doi:10.1109/ [44] E. Pietriga, C. Bizer, D. Karger and R. Lee, Fresnel – A VISUAL.2005.1532781. browser-independent presentation vocabulary for RDF, in: [56] C. Ware, Visual Thinking: For Design, Morgan Kaufmann Pub- ISWC2006: The Semantic Web, Proc., 5th International Se- lishers Inc., Burlington, USA, 2010. mantic Web Conference, I.F. Cruz, S. Decker, D. Allemang, [57] M. Weise, S. Lohmann and F. Haag, LD-VOWL: Extracting C. Preist, D. Schwabe, P. Mika, M. Uschold and L. Aroyo, eds, and visualizing schema information for linked data endpoints, 2006, pp. 158–171. doi:10.1007/11926078_12. in: VOILA! 2016: Proc., ISWC2016 International Workshop on [45] N.A. Rakhmawati, R.P. Wibowo and M.I.H. Amir, Visual- Visualizations and User Interfaces for Ontologies and Linked isation application development for mosque financial report Data, V. Ivanova et al., eds, 2016, pp. 120–127. http://ceur-ws. using linked data and crowd-sourcing, in: Procedia Com- org/Vol-1704/paper11.pdf. puter Science: The Third Information Systems International [58] D. Welter, J. MacArthur, J. Morales, T. Burdett, P. Hall, Conference 2015, Vol. 72, 2015, pp. 374–381. doi:10.1016/ H. Junkins, A. Klemm, P. Flicek, T. Manolio, L. Hindorff j.procs.2015.12.152. and H. Parkinson, The NHGRI GWAS catalog, a curated [46] P. Ruchikachorn and K. Mueller, Learning visualizations resource of SNP-trait associations, Nucleic Acids Research by analogy: Promoting visual literacy through visualization 42(D1) (2014), D1001–D1006. doi:10.1093/nar/gkt1229. morphing, IEEE Transactions on Visualization and Com- [59] D. Young and B. Shneiderman, A graphical filter/flow repre- puter Graphics 21(9) (2015), 1028–1044. doi:10.1109/TVCG. sentation of boolean queries: A prototype implementation and 2015.2413786. evaluation, Journal of the American Society for Information [47] D. Sacha, A. Stoffel, F. Stoffel, B.C. Kwon, G. Ellis and Science 44(6) (July 1993), 327–339. doi:10.1002/(SICI)1097- D.A. Keim, Knowledge generation model for visual analyt- 4571(199307)44:6<327::AID-ASI3>3.0.CO;2-J. ics, IEEE Transactions on Visualization and Computer Graph-