The Document Components Ontology (DoCO)

Editor(s): Oscar Corcho, Universidad Politécnica de Madrid, Spain
Solicited review(s): Francesco Ronzano, Universitat Pompeu Fabra, Barcelona, Spain; Almudena Ruiz Iniesta, Universidad Politécnica de Madrid, Spain; one anonymous reviewer

Alexandru Constantin (a), Silvio Peroni (b, c, *), Steve Pettifer (d), David Shotton (e), and Fabio Vitali (b)

(a) École Polytechnique Fédérale de Lausanne, EPFL IC IIF LSIR, BC 159 (Bâtiment BC), Station 14, CH-1015 Lausanne, Switzerland
[email protected]
(b) Department of Computer Science and Engineering, University of Bologna, Mura Anteo Zamboni 7, 40127 Bologna (BO), Italy
[email protected],
[email protected]
(c) Semantic Technology Laboratory, Institute of Cognitive Sciences and Technologies, National Research Council, Via Nomentana 56, 00161 Rome (RM), Italy
(d) School of Computer Science, University of Manchester, Kilburn Building, M13 9PL Manchester, United Kingdom
[email protected]
(e) Oxford e-Research Centre, University of Oxford, 7 Keble Road, OX1 3QG Oxford, United Kingdom
[email protected]

Abstract. The availability of machine-readable descriptions of document structure and of document discourse (e.g. the scientific discourse within scholarly articles) is crucial for facilitating semantic publishing and the overall comprehension of documents by both users and machines. In this paper we introduce DoCO, the Document Components Ontology, an OWL 2 DL ontology that provides a general-purpose structured vocabulary of document elements for describing both structural and rhetorical document components in RDF. In addition to the formal description of the ontology, this paper showcases its practical utility in a variety of our own applications and in other activities of the Semantic Publishing community that rely on DoCO to annotate and retrieve document components of scholarly articles.