Common Motifs in Scientific Workflows: an Empirical Analysis Daniel Garijo, Pinar Alper, Khalid Belhajjame, Oscar Corcho, Yolanda Gil, Carole Goble

Total Page:16

File Type:pdf, Size:1020Kb

Common Motifs in Scientific Workflows: an Empirical Analysis Daniel Garijo, Pinar Alper, Khalid Belhajjame, Oscar Corcho, Yolanda Gil, Carole Goble Common motifs in scientific workflows: An empirical analysis Daniel Garijo, Pinar Alper, Khalid Belhajjame, Oscar Corcho, Yolanda Gil, Carole Goble To cite this version: Daniel Garijo, Pinar Alper, Khalid Belhajjame, Oscar Corcho, Yolanda Gil, et al.. Common motifs in scientific workflows: An empirical analysis. Future Generation Computer Systems, Elsevier, 2014, 36, 10.1016/j.future.2013.09.018. hal-01342933 HAL Id: hal-01342933 https://hal.archives-ouvertes.fr/hal-01342933 Submitted on 7 Jul 2016 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Common Motifs in Scientific Workflows: An Empirical Analysis Daniel Garijo∗, Pinar Alper y, Khalid Belhajjamey, Oscar Corcho∗, Yolanda Gilz, Carole Gobley ∗Ontology Engineering Group, Universidad Politecnica´ de Madrid. fdgarijo, ocorchog@fi.upm.es ySchool of Computer Science, University of Manchester. falperp, khalidb, [email protected] zInformation Sciences Institute, Department of Computer Science, University of Southern California. [email protected] Abstract—While workflow technology has gained momentum [14] and CrowdLabs [8] have made publishing and finding in the last decade as a means for specifying and enacting compu- workflows easier, but scientists still face the challenges of re- tational experiments in modern science, reusing and repurposing use, which amounts to fully understanding and exploiting the existing workflows to build new scientific experiments is still a daunting task. This is partly due to the difficulty that scientists available workflows/fragments. One difficulty in understanding experience when attempting to understand existing workflows, workflows is their complex nature. A workflow may contain which contain several data preparation and adaptation steps in several scientifically-significant analysis steps, combined with addition to the scientifically significant analysis steps. One way various other data preparation activities, and in different to tackle the understandability problem is through providing implementation styles depending on the environment and abstractions that give a high-level view of activities undertaken within workflows. As a first step towards abstractions, we report context in which the workflow is executed. The difficulty in in this paper on the results of a manual analysis performed over understanding causes workflow developers to revert to starting a set of real-world scientific workflows from Taverna and Wings from scratch rather than re-using existing fragments. systems. Our analysis has resulted in a set of scientific workflow Through an analysis of the current practices in scientific motifs that outline i) the kinds of data intensive activities that are observed in workflows (data oriented motifs), and ii) the different workflow development, we could gain insights on the creation manners in which activities are implemented within workflows of understandable and more effectively re-usable workflows. (workflow oriented motifs). These motifs can be useful to inform Specifically, we propose an analysis with the following objec- workflow designers on the good and bad practices for workflow tives: development, to inform the design of automated tools for the generation of workflow abstractions, etc. 1) To reverse-engineer the set of current practices in work- flow development through an analysis of empirical evi- I. INTRODUCTION dence. 2) To identify workflow abstractions that would facilitate Scientific workflows have been increasingly used in the last understandability and therefore effective re-use. decade as an instrument for data intensive scientific analysis. 3) To detect potential information sources and heuristics In these settings, workflows serve a dual function: first as that can be used to inform the development of tools for detailed documentation of the method (i. e. the input sources creating workflow abstractions. and processing steps taken for the derivation of a certain data item) and second as re-usable, executable artifacts for In this paper we present the result of an empirical analysis data-intensive analysis. Workflows stitch together a variety performed over 177 workflow descriptions from Taverna [10] of data manipulation activities such as data movement, data and Wings [3]. Based on this analysis, we propose a catalogue transformation or data visualization to serve the goals of the of scientific workflow motifs. Motifs are provided through i) scientific study. The stitching is realized by the constructs a characterization of the kinds of data-oriented activities that made available by the workflow system used and is largely are carried out within workflows, which we refer to as data- shaped by the environment in which the system operates and oriented motifs, and ii) a characterization of the different man- the function undertaken by the workflow. ners in which those activity motifs are realized/implemented A variety of workflow systems are in use [10] [3] [7] [2] within workflows, which we refer to as workflow-oriented serving several scientific disciplines. A workflow is a software motifs. It is worth mentioning that, although important, motifs artifact, and as such once developed and tested, it can be that have to do with scheduling and mapping of workflows shared and exchanged between scientists. Other scientists can onto distributed resources [12] are out the scope of this paper. then reuse existing workflows in their experiments, e.g., as The paper is structured as follows. We begin by providing sub-workflows [17]. Workflow reuse presents several advan- related work in Section II, which is followed in Section III by tages [4]. For example, it enables proper data citation and brief background information on Scientific Workflows, and the improves quality through shared workflow development by two systems that were subject to our analysis. Afterwards we leveraging the expertise of previous users. Users can also describe the dataset and the general approach of our analysis. re-purpose existing workflows to adapt them to their needs We present the detected scientific workflow motifs in Section [4]. Emerging workflow repositories such as myExperiment IV and we highlight the main features of their distribution across the analyzed workflows in section V. Finally, we distill experiments instead of scientific workflows (in silico data the main findings of our study (in Section VI) and conclude oriented experiments). by outlining our future plans in Section VII. III. PRELIMINARIES For the purposes of our analysis, we use workflows specified II. RELATED WORK and used in the Taverna [10] and Wings [3] workflow systems. Our motifs could be observed as higher-level patterns ob- The choice of these two systems is due to : served over scientific workflow datasets. ”Workflow patterns” 1) The similarity in the kinds of workflow modeling con- have been extensively studied in the last two decades [15]. structs they provide. When compared to other scientific Work in this area is primarily focused on outlining the workflow systems both Taverna and Wings are observers inventory of workflow development constructs provided by of the pure data-flow paradigm, they provide no explicit different workflow languages and the ways of combining those mechanism for even the most basic control constructs constructs. Classification models have also been developed to such as conditionals and looping, while there are implicit detect additional patterns in structure, usage and data [13]. ways of doing this. Scientific workflows are characterized by the lack of complex 2) The variety of execution environments in which these control constructs, where the order of execution is determined systems operate. While Taverna provides users with a by the availability of data. As observed by a recent study means to specify workflows that mainly make use of [9], these systems largely support data-flow patterns1 and even autonomous third party services, Wings provides an bring-about new ones with their varied handling of data tokens. environment in which users have relatively more control Data-flow patterns outline ways of managing data resources over the resources and the analysis operations that sci- during workflow execution, such as visibility, data interaction entists use in their experiments. Within Wings, analysis and transfer. These patterns are orthogonal to the scientific tools and data artifacts are made part of the environment data-oriented function undertaken by the processing steps i.e. prior to workflow design, workflows are then composed our motifs. In this regard our work complements the workflow of using these encapsulated software/tools. patterns research with its focus on pinpointing characteristic data-oriented activities, rather than an analysis of workflow A. Taverna and Wings workflow systems languages, token handling or data dependencies. Our work is Taverna2 is an open-source workflow system, which has also based on an analysis of empirical evidence of how data- been initially devised for in-silico experimentation in the life- intensive activities have been implemented against different sciences. Recently, it has been extended
Recommended publications
  • Cybersecurity Through Nimble Task Allocation: Semantic Workflow Reasoning for Mission-Centered Network Models
    Cybersecurity Through Nimble Task Allocation: Semantic Workflow Reasoning for Mission-Centered Network Models Yolanda Gil Information Sciences Institute University of Southern California [email protected] http://www.isi.edu/~gil USC Information Sciences Institute Yolanda Gil 1 Knowledge Technologies at ISI: From Basic Research to Transition AFOSR DARPA DOD 2000 2005 TRELLIS: Basic JIST: Capturing decisions Transition to IC’s research on information and evidence in RDEC experimental analysis capture intelligence analysis platform NSF AFRL DOD 2002 2008 Semantic workflows for Windward: Scalable Transition to IC’s large-scale scientific data Knowledge Discovery RDEC experimental analysis through Distributed Analysis platform AFOSR W3C 2005 2010 MathTrust: Basic research Provenance Incubator on social trust of open leading to standards for information sources provenance on the web AFOSR 2011 Mission - Centered USC InformationNetwork Sciences Models Institute Yolanda Gil 2 http://www.w3.org/2005/Incubator/prov/wiki W3C Provenance Standard Need standard representation for origins of information • Trust in open systems (Web), computational science, process validation and compliance W3C Provenance Incubator launched in 2009 (Y. Gil, Chair) • Collected use cases for provenance • Developed provenance requirements for the Web • Created mappings for existing provenance vocabularies • State-of-the-art report • Provenance in Web architecture Final report in Dec 2012 proposed charter for a Working Group on provenance WG started 4/2011 to develop PROV
    [Show full text]
  • A 20-Year Community Roadmap for Artificial Intelligence Research in the US
    A 20-Year Community Roadmap for Begin forwarded message: From: AAAI Press <[email protected]> Subject: Recap Date: July 8, 2019 at 3:52:40 PM PDT To: Carol McKenna Hamilton <[email protected]>Artificial Intelligence Research in the US Could you check this for me? Hi Denise, While it is fresh in my mind, I wanted to provide you with a recap on the AAAI proceedings work done. Some sections of the proceedings were better than others, which is to be expected because authors in some subject areas are less likely to use gures, the use of which is a signicant factor in error rate. The vision section was an excellent example of this variability. The number of problems in this section was quite high. Although I only called out 13 papers for additional corrections, there were many problems that I simply let go. What continues to be an issue, is why the team didn’t catch many of these errors before compilation began. For example, some of the errors involved the use of the caption package, as well as the existence of setlength commands. These are errors that should have been discovered and corrected before compilation. As I’ve mentioned before, running search routines on the source of all the papers in a given section saves a great deal of time because, in part, it insures consistency. That is something that has been decidedly lacking with regard to how these papers have been (or haven’t been) corrected. It is a cause for concern because the error rate should have been improving, not getting worse.
    [Show full text]
  • Annotated Table of Contents V B I P
    AI MATTERS, VOLUME 1, ISSUE 1!AUGUST 2014 AI Matters Annotated Table of Contents Welcome to AI Matters ! Kiri Wagstaff, Editor Full article: http://doi.acm.org/10.1145/2639475.2639476 A welcome from the Editor of AI Matters and an en- couragement to submit for the next issue. Artificial Intelligence: No Longer Just for O You and Me Yolanda Gil, SIGAI Chair Full article: http://doi.acm.org/10.1145/2639475.2639477 The Chair of SIGAI waxes enthusiastic about the cur- Drop-in Challenge games at RoboCup. rent state of and future prospects for AI developments V Peter Stone, Patrick MacAlpine, Katie Genter, and innovations. She also reports on high school and Sam Barrett. Full image and details: student projects featured at the 2014 Intel Science http://doi.acm.org/10.1145/2639475.2655756 and Engineering Fair. Using Agent-Based Modeling and Cul- Announcing the SIGAI Career Network I N and Conference tural Algorithms to Predict the Location Sanmay Das, Susan L. Epstein, and Yolanda of Submerged Ancient Occupational Gil Sites Full article: http://doi.acm.org/10.1145/2639475.2649581 Robert G. Reynolds, Areej Salaymeh, John O'Shea, and Ashley Lemke SIGAI has created a career networking website and Full article: http://doi.acm.org/10.1145/2639475.2639479 annual conference for the benefit of early career sci- entists. Benefits include mentoring, networking, and A collaboration between archaeologists and artificial job connections. intelligence experts has discovered ancient hunting sites submerged in over 120 feet of water in Lake Future Progress in Artificial Intelligence: Huron. This is the oldest known hunting ground in the world.
    [Show full text]
  • AAAI Members Elect New President-Elect & Executive
    For press inquiries only, contact: Sara Hedberg Emergent, Inc. (425) 444-7272 [email protected] FOR IMMEDIATE RELEASE AAAI members elect new President-Elect & Executive Councilors MENLO PARK, CA. – July 13, 2005. The American Association for Artificial Intelligence (AAAI), the nation’s scientific AI society, announces the new President-Elect and Executive Councilors elected by the society’s membership. Eric Horvitz of Microsoft Research has been elected AAAI President-Elect and will become President in 2007 for a two-year term. Horvitz joins President Alan MacWorth, University of British Columbia, and Ron Brachman, immediate Past President, on the presidential team. Tom Mitchell of Carnegie Mellon University completes his six years of service to AAAI. The newly elected Councilors include: Maria Gini, University of Minnesota Kevin Knight, USC/Information Sciences Institute Peter Stone, University of Texas at Austin Sebastian Thrun, Stanford University Councilors serve a three year term, and one-third of the council rotates off each year. Other Council members include: (through 2006): Steve Chien, JPL / California Institute of Technology Yolanda Gil, USC / ISI Haym Hirsh, University of Rutgers Andrew Moore, Carnegie Mellon University (through 2007): Oren Etzioni, University of Washington Lise Getoor, University of Maryland, College Park Karen Myers, SRI International Illah Nourbakhsh, NASA Ames Research Center / Carnegie Mellon University Councilors who complete their three years of service this year include: Carla Gomes, Cornell University Michael Littman, Rutgers University Maja Mataric, University of Southern California Yoav Shoham, Stanford University (more) “We are very grateful to Tom Mitchell and to these retiring Councilors who have given – and continue to give -- so generously of their time and talents to the organization,” says Carol Hamilton, AAAI Executive Director.
    [Show full text]
  • Ricky J. Sethi Curriculum Vitae 2020
    Ricky J. Sethi Curriculum Vitae 2020 CONTACT Fitchburg State University w: research.sethi.org 160 Pearl Street e: [email protected] Fitchburg, MA 01420 p: 978.665.3703 RESEARCH My research uses fundamental ideas from machine learning and computational science INTERESTS in social computing (fact-checking misinformation and virtual communities), data sci- ence (semantic workflows for digital humanities and reproducibility), and computer vision (physics-based methods for group and crowd analysis). EDUCATION Ph.D., Computer Science . 2004 - 2009 • University of California, Riverside Adviser: Amit K. Roy-Chowdhury Committee: Eamonn J. Keogh and Christian R. Shelton Area of Study: Artificial Intelligence/Computer Vision M.S., Physics/Business (Information Systems) . 1999 - 2001 • University of Southern California B.A., Molecular and Cellular Biology, Neurobiology (Physics minor) . 1996 • University of California, Berkeley ACADEMIC APPOINTMENTS Associate Professor of Computer Science . 2014 - Present • Associate Professor, 2018 - Present Assistant Professor, 2014 - 2018 Fitchburg State University Director of Research . 2013 - Present • The Madsci Network Consulting Scientist . 2018 - Present • US National Academy of Sciences (NAS)’s The Science and Entertainment Exchange Team Lead and Adjunct Professor . 2013 - Present • Southern New Hampshire University Research Scientist. .2014 • Postdoctoral Associate UMass Amherst/UMass Medical School NSF Computing Innovation Fellow . 2010 - 2013 • University of California, Los Angeles University of Southern
    [Show full text]
  • Towards the Geoscience Paper of the Future: Best Practices for Documenting and Sharing Research from Data to Software to Provenance
    Accepted for publication in the Earth and Space Science Journal, 2016. Towards the Geoscience Paper of the Future: Best Practices for Documenting and Sharing Research from Data to Software to Provenance Yolanda Gil1 (orcid.org/0000-0001-8465-8341) Information Sciences Institute and Department of Computer Science, University of Southern California Cédric H. David (orcid.org/0000-0002-0924-5907) Jet Propulsion Laboratory, California Institute of Technology Ibrahim Demir (orcid.org/0000-0002-0461-1242) IIHR Hydroscience & Engineering Institute, University of Iowa Bakinam T. Essawy (orcid.org/0000-0003-2295-7981) Department of Civil and Environmental Engineering, University of Virginia Robinson W. Fulweiler (orcid.org/0000-0003-0871-4246) Department of Earth and Environment, Department of Biology, Boston University Jonathan L. Goodall (orcid.org/0000-0002-1112-4522) Department of Civil and Environmental Engineering, University of Virginia Leif Karlstrom (orcid.org/0000-0002-2197-2349) Department of Geological Sciences, University of Oregon, Eugene, Oregon, 97403 Huikyo Lee (orcid.org/0000-0003-3754-3204) Jet Propulsion Laboratory, California Institute of Technology Heath J. Mills (orcid.org/0000-0002-2882-9238) Division of Natural Sciences, University of Houston Clear Lake Ji-Hyun Oh (orcid.org/0000-0001-8989-7691) Computer Science Department of the University of Southern California & Jet Propulsion Laboratory, California Institute of Technology Suzanne A Pierce (orcid.org/0000-0002-3050-1987) Texas Advanced Computing Center and Jackson School of Geosciences, University of Texas at Austin Allen Pope (orcid.org/0000-0001-9699-7500) National Snow and Ice Data Center, University of Colorado Boulder & Polar Science Center, Applied Physics Lab, University of Washington Mimi W.
    [Show full text]
  • Arxiv:1705.02607V2 [Cs.SE] 18 May 2017 0002-3444-7615
    REPORT ON THE FOURTH WORKSHOP ON SUSTAINABLE SOFTWARE FOR SCIENCE: PRACTICE AND EXPERIENCES (WSSSPE4) DANIEL S. KATZ(1), KYLE E. NIEMEYER(2), SANDRA GESING(3), LORRAINE HWANG(4), WOLFGANG BANGERTH(5), SIMON HETTRICK(6), RAY IDASZAK(7), JEAN SALAC(8), NEIL CHUE HONG(9), SANTIAGO NÚÑEZ-CORRALES(10), ALICE ALLEN(11), R. STUART GEIGER(12), JONAH MILLER(13), EMILY CHEN(14), ANSHU DUBEY(15), PATRICIA LAGO(16) Abstract. This report records and discusses the Fourth Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE4). The report includes a description of the keynote presentation of the workshop, the mission and vision statements that were drafted at the workshop and finalized shortly after it, a set of idea papers, position papers, experience papers, demos, and lightning talks, and a panel discussion. The main part of the report covers the set of working groups that formed during the meeting, and for each, discusses the participants, the objective and goal, and how the objective can be reached, along with contact information for readers who may want to join the group. Finally, we present results from a survey of the workshop attendees. (1) National Center for Supercomputing Applications (NCSA) & Department of Computer Science & Department of Electrical and Computer Engineering & School of Information Sciences (iSchool), University of Illinois, Urbana– Champaign, IL, USA; [email protected]; ORCID: 0000-0001-5934-7525. (2) School of Mechanical, Industrial, and Manufacturing Engineering, Oregon State University, Corvallis, OR, USA; [email protected]; ORCID: 0000-0003-4425-7097. (3) Center for Research Computing & Department of Computer Science and Engineering, University of Notre Dame, IN, USA; [email protected]; ORCID: 0000-0002-6051-0673.
    [Show full text]
  • The Transformative Potential of AI for Science
    The Transforma,ve Poten,al of AI for Science: Will AI Write the Scien,fic Papers of the Future? Yolanda Gil Informaon Sciences Ins.tute and Department of Computer Science University of Southern California h<ps://.nyurl.com/yolandagil-aaai2020 This license lets others distribute, remix, tweak, and build upon your work, even commercially, as long as the original source is credited. Source: “Will AI Write the Scien.fic Papers of the Future?”, Yolanda Gil, 2020 Presiden.al Address, Associaon for the Advancement of Ar.ficial Intelligence (AAAI), February 2020. A Personal Perspecve on AI A Personal Perspecve on AI • The AI community has always been • Visionary • Broad • Inclusive • Interdisciplinary • Determined • And dare I say • Successful A Personal View of Watershed Moments in AI: (I) 1980s Scripts Plans Goals and Understanding Roger C. Schank and Robert P. Abelson A Personal View of Watershed Moments in AI: 1990s 1990: Sphynx shows speaker independent large vocabulary connuous speech recognion – used it to write my PhD thesis 1992: Soar flies helicopter teams in simulaons and is indisnguishable to commanders from human-controlled aircra 1992: TD-Gammon autonomously learns to play backgammon at human player levels 1995: SKICAT idenfies five new quasars in the Second Palomar Sky Survey 1995: Navlab is the first trained car to drive autonomously on highways to cross the United States 1997: Deep Blue defeats human chess world champion and gets grandmaster-level rang 1999: RAX flew a spacecraC autonomously, demonstrang planning, monitoring, and fault
    [Show full text]
  • Prodigy Bidirectional Planning
    Prodigy Bidirectional Planning Eugene Fink Jim Blythe Computer Science Information Sciences Institute Carnegie Mellon University University of Southern California Pittsburgh, pa 15213 Marina del Rey, ca 90292 e.fi[email protected] [email protected] www.cs.cmu.edu/∼eugene www.isi.edu/∼blythe Abstract The prodigy system is based on bidirectional planning, which is a combination of goal-directed reasoning with simulated execution. Researchers have implemented a se- ries of planners that utilize this search strategy, and demonstrated that it is an efficient technique, a fair match for other successful planners; however, they have provided few formal results on the common principles underlying the developed algorithms. We formalize bidirectional planning, elucidate some techniques for improving its ef- ficiency, and show how different strategies for controlling search complexity give rise to different versions of prodigy. In particular, we demonstrate that prodigy is incom- plete and discuss advantages and drawbacks of its incompleteness. We then develop a complete bidirectional planner and compare it experimentally with prodigy.We show that it is almost as fast as prodigy and solves a wider range of problems. 1 Introduction Newell and Simon [1961; 1972] invented means-ends analysis during their work on the General Problem Solver (gps), back in the early days of artificial intelligence. Their technique combined goal-directed reasoning with forward chaining from the initial state. The authors of later planning systems [Fikes and Nilsson, 1971; Warren, 1974; Tate, 1977] gradually abandoned forward search and began to rely exclusively on backward chaining. Researchers studied several types of backward chainers [Minton et al., 1994] and showed that least commitment improves the efficiency of goal-directed reasoning, which gave rise to tweak [Chapman, 1987], abtweak [Yang et al., 1996], snlp [McAllester and Rosenblitt, 1991], ucpop [Penberthy and Weld, 1992; Weld, 1994], and other least-commitment planners.
    [Show full text]
  • AI MAGAZINE AAAI News
    AAAI News AAAI News Spring News from the Association for the Advancement of Artificial Intelligence AAAI Announces New Senior Members! 2017 Feigenbaum Prize Awarded! AAAI congratulates the following indi - AAAI is delighted to announce that Yoav Shoham (Stanford viduals on their election to AAAI Senior University, Google), has been selected as the recipient of the Member status: 2017 AAAI Feigenbaum Prize. Shoham is being recognized in Alessandro Cimatti (Fondazione Bruno particular for high-impact basic research in artificial intelli - Kessler, Italy) gence — including knowledge representation, multiagent sys - tems, and computational game theory — and translating the basic research Xuelong Li (Chinese Academy of Sci - into impactful and innovative commercial products. ences, China) The AAAI Feigenbaum Prize was established in 2010 and is awarded bien - Nathan R. Sturtevant (University of nially to recognize and encourage outstanding Artificial Intelligence Denver, USA) research advances that are made by using experimental methods of com - This honor was announced at the puter science. The associated cash prize of $10,000 is provided by the recent AAAI-17 Conference in San Feigenbaum Nii Foundation. Francisco. Senior Member status is designed to recognize AAAI members who have achieved significant accom - plishments within the field of artificial cial Intelligence, held in 1999 in Orlan - heads the UW Robotics and State Esti - intelligence. To be eligible for nomina - do, Florida, USA. The 2017 recipient of mation Lab (RSE-Lab). Fox is a Fellow tion for Senior Member, candidates the AAAI Classic Paper Award was: of the AAAI and IEEE, and he received must be consecutive members of AAAI Monte Carlo Localization: Efficient several best paper awards at major for at least five years and have been Position Estimation for Mobile Robots.
    [Show full text]
  • The Open Provenance Model Core Specification (V1.1)
    Future Generation Computer Systems 27 (2011) 743–756 Contents lists available at ScienceDirect Future Generation Computer Systems journal homepage: www.elsevier.com/locate/fgcs The Open Provenance Model core specification (v1.1) Luc Moreau a,∗, Ben Clifford, Juliana Freire b, Joe Futrelle c, Yolanda Gil d, Paul Groth e, Natalia Kwasnikowska f, Simon Miles g, Paolo Missier h, Jim Myers c, Beth Plale i, Yogesh Simmhan j, Eric Stephan k, Jan Van den Bussche f a U. of Southampton, United Kingdom b U. Utah, United States c National Center for Supercomputing Applications, United States d Information Sciences Institute, USC, United States e VU University Amsterdam, The Netherlands f U. Hasselt and Transnational U. Limburg, Belgium g King's College, London, United Kingdom h U. of Manchester, United Kingdom i Indiana University, United States j Microsoft, United States k Pacific Northwest National Laboratory, United States article info a b s t r a c t Article history: The Open Provenance Model is a model of provenance that is designed to meet the following Received 21 December 2009 requirements: (1) Allow provenance information to be exchanged between systems, by means of a Received in revised form compatibility layer based on a shared provenance model. (2) Allow developers to build and share tools that 14 June 2010 operate on such a provenance model. (3) Define provenance in a precise, technology-agnostic manner. (4) Accepted 10 July 2010 Support a digital representation of provenance for any ``thing'', whether produced by computer systems or Available online 16 July 2010 not. (5) Allow multiple levels of description to coexist.
    [Show full text]
  • An Interview with Yolanda Gil
    Successful Research in AI Reflections on Successful Research in Artificial Intelligence: An Interview with Yolanda Gil Yolanda Gil, Biplav Srivastava, Ching-Hua Chen, Oshani Seneviratne This article contains the observations ditorial Team: What are your observations on how has of Yolanda Gil, director of knowledge the nature of AI research changed and evolved over technologies and research professor at the years? the Information Sciences Institute of E Gil: I worked on my thesis in the 1980s, when AI research the University of Southern California, methodology was very different than it is today. Back then, USA, and president of Association for AI research was very broad and cross-disciplinary. The Advancement of Artificial Intelligence who was recently interviewed about machine learning conferences I attended would include the factors that could influence suc- cognitive psychologists developing new capabilities for edu- cessful AI research. The editorial team cation and learning, human computer interaction researchers interviewers included members of the doing work on knowledge acquisition, and mathematicians special track editorial team from IBM focused on the statistics of machine learning and formal Research (Biplav Srivastava, Ching-Hua aspects of learning. Because of this, AI researchers like Chen) and Rensselaer Polytechnic Insti- myself would naturally mingle with researchers from other tute (Oshani Seneviratne). disciplines. In contrast, today we see research that is more narrow and focused. This could be a sign of a maturing field, because sometimes when a field grows people tend to spe- cialize into very focused areas. But having a narrow view of AI is not necessarily a good thing in my opinion.
    [Show full text]