Lecture Notes in Computer Science 6378 Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan Van Leeuwen

Lecture Notes in Computer Science 6378 Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen Editorial Board David Hutchison Lancaster University, UK Takeo Kanade Carnegie Mellon University, Pittsburgh, PA, USA Josef Kittler University of Surrey, Guildford, UK Jon M. Kleinberg Cornell University, Ithaca, NY, USA Alfred Kobsa University of California, Irvine, CA, USA Friedemann Mattern ETH Zurich, Switzerland John C. Mitchell Stanford University, CA, USA Moni Naor Weizmann Institute of Science, Rehovot, Israel Oscar Nierstrasz University of Bern, Switzerland C. Pandu Rangan Indian Institute of Technology, Madras, India Bernhard Steffen TU Dortmund University, Germany Madhu Sudan Microsoft Research, Cambridge, MA, USA Demetri Terzopoulos University of California, Los Angeles, CA, USA Doug Tygar University of California, Berkeley, CA, USA Gerhard Weikum Max Planck Institute for Informatics, Saarbruecken, Germany Deborah L. McGuinness James R. Michaelis Luc Moreau (Eds.) Provenance and Annotation of Data and Processes Third International Provenance andAnnotation Workshop IPAW 2010, Troy, NY, USA, June 15-16, 2010 Revised Selected Papers 13 Volume Editors Deborah L. McGuinness Tetherless World Constellation Rensselaer Polytechnic Institute 110 8th Street, Troy, NY 12180, USA E-mail: [email protected] James R. Michaelis Tetherless World Constellation Rensselaer Polytechnic Institute 110 8th Street, Troy, NY 12180, USA E-mail: [email protected] Luc Moreau University of Southampton School of Electronics and Computer Science Southampton SO17 1BJ, United Kingdom E-mail: [email protected] Library of Congress Control Number: 2010940987 CR Subject Classification (1998): H.3-4, D.4.6, I.2, H.5, K.6, K.4, C.2 LNCS Sublibrary: SL 3 – Information Systems and Application, incl. Internet/Web and HCI ISSN 0302-9743 ISBN-10 3-642-17818-9 Springer Berlin Heidelberg New York ISBN-13 978-3-642-17818-4 Springer Berlin Heidelberg New York This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law. springer.com © Springer-Verlag Berlin Heidelberg 2010 Printed in Germany Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India Printed on acid-free paper 06/3180 In Memoriam, Eleanor Louise McGuinness, 1917 - 2010 Preface Interest in and needs for provenance are growing as data proliferate. Data are increasing in a wide array of application areas, including scientific workflow systems, logical reasoning systems, text extraction, social media, and linked data. As data volumes expand and as applications become more hybrid and distributed in nature, there is growing interest in where data came from and how they were produced in order to understand when and how to rely on them. Provenance, or the origin or source of something, can capture a wide range of information. This includes, for example, who or what generated the data, the history of data stewardship, manner of manufacture, place and time of manufacture, and so on. Annotation is tightly connected with provenance since data are often commented on, described, and referred to. These descriptions or annotations are often critical to the understandability, reusability, and reproducibility of data and thus are often critical components of today’s data and knowledge systems. Provenance has been recognized to be important in a wide range of areas including databases, workflows, knowledge representation and reasoning, and dig- ital libraries. Thus, many disciplines have proposed a wide range of provenance models, techniques, and infrastructure for encoding and using provenance. One timely challenge for the broader community is to understand the range of strengths and weaknesses of different approaches sufficiently to find and use the best models for any given situation. This also comes at a time when a new incubator group has been formed at the World Wide Web Consortium (W3C) to provide a state-of-the- art understanding and develop a roadmap in the area of provenance for Semantic Web technologies, development, and possible standardization. The Third International Provenance and Annotation Workshop (IPAW 2010) built on the success of previous workshops held in Salt Lake City (2008), Chicago (2006, 2002), and Edinburgh (2003). It was held during June 15–16, in Troy, New York, at Rensselaer Polytechnic Institute. IPAW 2010 brought together computer scientists from different areas and provenance users to discuss open problems related to the provenance of computational and non-computational artifacts. A total of 59 people attended the workshop. These attendees came from the United States (USA), the United Kingdom (UK), the Netherlands, Germany, Brazil, and Japan. We received 36 submissions in response to the initial call for papers. Each of these submissions was reviewed by at least three reviewers. Overall, 7 submissions were accepted as full papers, 11 were accepted as medium-length papers, 7 were accepted as demo papers, and 6 were accepted as short papers. In addition, a follow-up call for late-breaking work in the form of a poster and abstract was issued, which resulted in 10 additional contributions being made. VIII Preface The workshop was organized as a single-track event with paper, poster, and demo sessions interleaved. Susan Davidson (University of Pennsylvania) presented a keynote address on provenance and privacy. Prior to IPAW 2010, on June 14, 28 attendees participated in a Provenance Hackathon, organized by Paul Groth. The aim of the Provenance Hackathon was to see whether participants, grouped in teams, could quickly build end-user applications that demonstrate unique benefits of provenance, through leveraging existing infrastructure and provenance models. Each application was evaluated based on provenance usage and usefulness by a panel of three judges. Details of participating teams, as well as provenance strategies and solutions, can be found at http://thinklinks.wordpress.com/2010/06/15/provenance-hackathon/ Immediately following IPAW 2010, a group of 28 researchers met to discuss plans for the fourth and last Provenance Challenge. The Provenance Challenge series was initiated to understand and compare expressiveness of provenance systems; it evolved into an interoperability challenge, in which provenance information is exchanged between systems. The Second Provenance challenge led to the specification of a common provenance model, the Open Provenance Model [1], which was tested in the Third Provenance Challenge. The purpose of the fourth and last Provenance Challenge is to apply the Open Provenance Model to a broad end-to-end scenario, and demonstrate novel functionality that can only be achieved by the presence of an interoperable solution for provenance. The participants successfully identified a scenario and provenance queries as well as a draft schedule. Details can be found at http://twiki.ipaw.info/bin/view/Challenge/ FourthProvenanceChallenge After the challenge planning meeting, a group of 20 researchers met to discuss evolving issues with one provenance Interlingua (PML). The workshop was organized by Paulo Pinheiro da Silva. Applications and tools were presented and use cases were articulated to help motivate and prioritize language extensions. IPAW 2010 and associated workshops were a very successful event with much enthusiastic discussion and many new ideas generated. We are grateful for the support of STI Innsbruck for sponsoring the Provenance Hackathon, of Microsoft for sponsoring the banquet, and for Rensselaer for providing meeting space and staff support. We also thank the Program Committee members for their thorough reviews. Reference [1] Luc Moreau, Ben Clifford, Juliana Freire, Joe Futrelle, Yolanda Gil, Paul Groth, Natalia Kwasnikowska, Simon Miles, Paolo Missier, Jim Myers, Beth Plale, Yogesh Simmhan, Eric Stephan, and Jan Van den Bussche. The open provenance model core specification (v1.1). Future Generation Computer Systems, July 2010. (DOI: 10.1016/j.future.2010.07.005) (URL: http://eprints.ecs.soton.ac.uk/21449/) Organization IPAW 2010 was organized by the Tetherless World Constellation at Rensselaer Polytechnic Institute. Workshop Co-chairs Deborah L. McGuinness Rensselaer Polytechnic Institute, USA Luc Moreau University of Southampton, UK Program Committee Christian Bizer Freie Universität Berlin, Germany James Cheney University of Edinburgh, UK Richard Cyganiak DERI, Ireland Susan Davidson University of Pennsylvania, USA Li Ding Rensselaer Polytechnic Institute, USA Ian Foster University of Chicago,USA Peter Fox Rensselaer Polytechnic Institute, USA Juliana Freire University of Utah, USA Alyssa Glass Stanford University, USA Paul Groth Vrije Universiteit Amsterdam, The Netherlands Olaf Hartig Universität zu Berlin, Germany Michael Hausenblas DERI, Ireland Bertram Ludaescher University of California, Davis, USA Marta Mattoso UFRJ, Brazil Simon Miles Kings College, UK Paolo Missier University of Manchester, UK Jim Myers NCSA, USA Paulo

Lecture Notes in Computer Science 6378 Commenced Publication in 1973 Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan Van Leeuwen

AAAI/SSS 2008 Program - SSKI Detailed Program (Stanford University, History Building)

The 8Th ACM Web Science Conference 2016

The Role of Virtual Observatories and Data Frameworks in an Era of Big Data

Data Life Cycle: Introduction, Definitions and Considerations

The Earth System Grid: a Visualisation Solution

A Neuroimaging Data Model to Share Brain Mapping Statistical Results

Web Approach for Ontology-Based Classification, Integration, and Interdisciplinary Usage of Geoscience Metadata

Metadata and Semantics Workshop Summary

Speaker and Moderator Biographical Information

The Foundations for Provenance on the Web

Response to NSF 20-015 Request for Information on Data-Focused

Peter Fox Chair, Tetherless World Constellation Professor of Earth and Environmental Science and Computer Science at Rensselaer Polytechnic Institute