The Publishing Workflow Ontology (PWO)
Total Page:16
File Type:pdf, Size:1020Kb
Semantic Web 0 (0) 1–12 1 IOS Press The Publishing Workflow Ontology (PWO) Editor(s): Victor de Boer, Vrije Universiteit Amsterdam, Netherlands; Agnieszka Ławrynowicz, Poznan´ University of Technology, Poland Solicited review(s): Boris Villazón-Terrazas, iSOCO, Spain; Ondrejˇ Zamazal, Prague University of Economics, Poland; Raúl Palma, Poznan´ Supercomputing and Networking Center, Poland Aldo Gangemi a, Silvio Peroni b;∗, David Shotton c, Fabio Vitali d a STLab-ISTC, Consiglio Nazionale delle Ricerche (Italy) Laboratoire d’Informatique de Paris Nord, Université Paris 13 (France) [email protected] b Department of Computer Science and Engineering, University of Bologna (Italy) [email protected] c Oxford e-Research Centre, University of Oxford (UK) [email protected] d Department of Computer Science and Engineering, University of Bologna (Italy) [email protected] Abstract. In this paper we introduce the Publishing Workflow Ontology (PWO), i.e., an OWL 2 DL ontology for the description of workflows that is particularly suitable for formalising typical publishing processes such as the publication of articles in journals. We support the presentation with a discussion of all the ontology design patterns that have been reused for modelling the main characteristics of publishing workflows. In addition, we present two possible application of PWO in the publishing and legislative domains. Keywords: PWO, ODP, publishing process, workflow description 1. Introduction Journal2, as RDF statements in the Linked Open Data Cloud, that allow software agents and applications to Keeping track of publication processes is a crucial check and reason on them, and to infer new infor- task for publishers. This action allows them to pro- mation. However, the description of processes, for in- duce statistics on their goods (e.g., books, authors, ed- stance the peer-review process or the publishing pro- itors) and to understand whether and how their pro- cess, is something that is not currently handled – al- duction changes over time. Organisers of particular though sources of related raw data exist (e.g., Easy- events, such as academic conferences, have similar Chair metadata). Furthermore, a flexible model for de- needs. Tracking the number of submissions in the cur- scribing these data, developed according to guidelines rent edition of a conference, the number of accepted for guaranteeing its reusability in different domains, is papers, the review process, etc., are important statis- also needed. Having these types of data publicly avail- tics that can be used to improve the review process in able and described according to such model would re- future editions of the conference. sult in: Some communities have started to publish data, e.g., the Semantic Web Dog Food1 and the Semantic Web – increasing transparency of the (publishing) pro- cess; *Corresponding author. E-mail: [email protected]. 1Semantic Web Dog Food: http://data.semanticweb.org. 2Semantic Web Journal: http://semantic-web-journal.com. 1570-0844/0-1900/$27.50 c 0 – IOS Press and the authors. All rights reserved 2 The Publishing Workflow Ontology (PWO) – enabling the use of such data for statistical analy- PWO, describing how it extends the aforementioned sis; patterns in order to handle the main components of – facilitating the integration, alignment and adap- workflows. In Section 5 we show how to use PWO for tation of both the data and the model accord- modelling data related to the publication process of an ing to the needs and constraints of different article of the Semantic Web Journal and to the codi- domains (publishing, academic conferences, re- fication process of laws in US legislation. Finally, in search funding, etc.). Section 6 we conclude the paper sketching out some future works. In this paper, which extends our work presented at the 5th International Workshop on Ontology and Se- 3 mantic Web Patterns [10] , we introduce the Publish- 2. Workflows and the Semantic Web ing Workflow Ontology (PWO), that we developed in order to reach the aforementioned goals. This ontol- The use and managing of workflows is a relevant as- ogy is one of the Semantic Publishing and Referencing pect in different applicative domains. For instance, for- 4 (SPAR) Ontologies [20] (which have been created for malised approaches for workflow definitions and ex- the description of different aspects of the publishing ecutions have been adopted in industrial applications domain), and allows one to describe the logical steps for managing and delivering products5, XML-based in a workflow, as for example the process of publi- scholarly publishing [19], and Wiki-oriented publica- cation of a document. Each step may involve one or tion mechanisms6. more events that take place at a particular phase of the In the last years the Semantic Web community workflow (e.g., authors are writing the article, the ar- have also started on working and proposing mod- ticle is under review, a reviewer suggests to revise the els for the formalisation and description of generic article, the article is in printing, the article has been workflows, and have shown several applications of published, etc.). This ontology has been developed in these models/theories within the publishing domain. order to allow its use with other SPAR Ontologies as Maybe the first huge-impact project on these topic well as other models and existing data. has been Wf4Ever (STREP FP7-ICT-2007-6 270192)7 The rest of the paper is organised as follows. In Sec- [12]. This project addresses challenges related to the tion 2 we discuss some related works on workflows preservation of scientific experiments through the def- within the Semantic Web domain. In Section 3 we pro- inition of models and ontologies for describing scien- vide the definitions of workflow we have used as start- tific experiments, to the collection of best practices for ing point for modelling our ontology, and discuss the the creation and management of Research Objects8 [2], use of some existing ontology design patterns for ad- and to the analysis and management of decay in scien- dressing the modelling issues related to the main char- tific workflows. acteristics of workflows. In Section 4 we introduce As already stated, one of the outcomes of the project has been the proposal for workflow-centric Research 3This paper was selected by the members of the organising com- Objects [1], i.e., an OWL ontology9 for linking to- mittee as one of the best papers of the workshop for being part of a gether scientific workflows, the provenance of their ex- fast-track submission to the Semantic Web journal’s Special Issue on ecutions, interconnections between workflows and re- Ontology Design Patterns. However, more than proposing a new pat- lated resources (e.g., datasets, publications, etc.), and tern, our work extends the published workshop paper and focuses on the reuse of design patterns for developing an ontology for publish- social aspects related to such scientific experiments. ing workflows. The additional material provided in this revised ver- Another interesting proposal for describing work- sion concerns: the description of the design pattern adopted, which flows is the work done by Garijo and Gil [11]. In this has been extended in order to clarify several parts and that now in- work, they describe a framework to publish computa- cludes a precise specification of the requirements for such model; the extension of the section describing our ontology, that now includes some statistics and definitions that were not present in the original 5SDL LiveContent 2014: http://docs.sdl.com/LiveContent/content/ article, as well as the introduction of several new parts of the ontol- en-US/SDL%20LiveContent%20full%20documentation-v1/GUID- ogy that have been added as consequence of reviews and comments 867B6863-ADAD-40C9-A3F7-775FE29FAFF3 we received during the workshop; the introduction of a new exam- 6http://extensions.xwiki.org/xwiki/bin/view/Extension/XWiki+ ple of use in the legislative domain, and an example of SPARQL Publication+Workflow+Application queries for answering some relevant questions within the publishing 7Wf4Ever project homepage: http://www.wf4ever-project.org. domain. 8Research Object website: http://www.researchobject.org. 4SPAR Ontologies website: http://www.sparontologies.net. 9Research Object OWL ontology: http://purl.org/wf4ever/ro. The Publishing Workflow Ontology (PWO) 3 tional workflows, which includes the specification of In order to gather minimal requirements, we take into a particular OWL ontology, i.e., the Open Provenance account both general definitions of workflows, and ex- Model for Workflows (OPMW)10, for the description of isting workflow models, as, e.g., generalised in [27]. workflow traces and their templates. Along the lines Minimal requirements will then be used to retrieve or of the aforementioned work, the same authors recently create appropriate ontology design patterns [9], which published the Ontology for Provenance and Plans (P- have proved to substantially improve ontology design Plan)11. P-Plan is an OWL 2 DL ontology that extends quality [3], due to their ability to address the semantics the Provenance Ontology [16] in order to represent the of requirements, and to their modularity. plans that guided the execution of scientific processes, Definitions are quite convergent, e.g., the Oxford describing how such plans are composed and their cor- Dictionaries site defines workflow as follows: respondence to provenance records that describe the execution itself. “The sequence of