A Graph Model of Data and Workflow Provenance

Total Page:16

File Type:pdf, Size:1020Kb

A Graph Model of Data and Workflow Provenance A graph model of data and workflow provenance Umut Acar Peter Buneman Max-Planck Institute for Software Systems University of Edinburgh James Cheney Jan Van den Bussche Natalia Kwasnikowska University of Edinburgh Hasselt University Hasselt University Stijn Vansummeren Universite´ Libre de Bruxelles Abstract that these models share a great deal of structure once they are defined in a common language and data model [8, 6]. Provenance has been studied extensively in both database Provenance models have also been developed for a va- and workflow management systems, so far with little riety of workflow systems, such as Chimera [10], Tav- convergence of definitions or models. Provenance in erna [20], Kepler [1], Karma [24], and ZOOM [26]; databases has generally been defined for relational or also, many other systems such as PASS and PASOA em- complex object data, by propagating fine-grained an- ploy similar ideas [14, 23]. These systems model and notations or algebraic expressions from the input to record provenance as a directed acyclic graph that, in- the output. This kind of provenance has been found formally, describes the macroscopic computation steps useful in other areas of computer science: annotation (e.g., whole program executions) performed in construct- databases, probabilistic databases, schema and data in- ing intermediate and final results. Recently, the Open tegration, etc. In contrast, workflow provenance aims to Provenance Model (OPM) [22] has been developed as a capture a complete description of evaluation – or enact- consensus exchange format for representing provenance ment – of a workflow, and this is crucial to verification graphs. in scientific computation. Workflows and their prove- Workflow systems employ a much wider variety of nance are often presented using graphical notation, mak- programming constructs than databases, including con- ing them easy to visualize but complicating the formal currency, procedures, service calls, and queries to exter- semantics that relates their run-time behavior with their nal databases. However, these systems are seldom ac- provenance records. We bridge this gap by extending a companied by formal specifications of their intended se- previously-developed dataflow language which supports mantics, with or without provenance. As a result, it can both database-style querying and workflow-style batch be hard to understand provenance information produced processing steps to produce a workflow-style provenance by a workflow system, since the meaning intended by the graph that can be explicitly queried. We define and implementer may not match the expectations of the user. describe the model through examples, present queries This is a particularly vexing problem because some users that extract other forms of provenance, and give an exe- and implementers might not even be aware of the possi- cutable definition of the graph semantics of dataflow ex- bility of misinterpretation, leading to further confusion. pressions. The scarcity of clear specifications of the semantics and provenance behavior of workflow systems makes it 1 Introduction difficult to integrate database and workflow provenance or compare provenance graphs generated by different A number of standard database provenance models systems. Therefore, we believe that it is essential to study tailored to relational, complex-object or XML query the semantics of workflow provenance models and relate languages have emerged. These models include lin- them to existing models of database provenance. eage [9], where-provenance [3, 2], why-provenance [3, To compare and unify these different techniques, we 4], and more recent innovations such as dependency- need to define a common provenance model. Database provenance [7], how-provenance [13, 11], and prove- provenance models can be visualized as graphs. Where- nance traces [6]. These models have been presented in provenance, lineage, and dependency-provenance can be a number of different ways and founded on several dif- visualized as bipartite graphs linking parts of the out- ferent motivations. Recently, further study has revealed put with parts of the input. How-provenance and why- provenance are more complex, but can also be visualized such as images or data files. Functions include primitive as directed acyclic graphs linking parts of the output to operations on basic data types, such as integer addition parts of the input, where nodes are labeled with symbolic and equality. Furthermore, functions can also represent algebraic operations. Graphs provide a natural common large computational steps such as external program or formalism for workflow and database provenance. service calls: for example, to model the first Provenance We also need a common language that can express Challenge workflow we might use base types such as both database queries and workflows. In this paper, we Image, Header, or WarpFile and function symbols such use a core calculus for dataflows, called DFL, based on as align warp : Image × Header × Image × Header ! the Nested Relational Calculus (NRC). DFL has been WarpFile or reslice : WarpFile ! Image × Header to previously introduced by Hidders et al. [16, 17] and we represent the macroscopic computation steps. also build upon some prior work on provenance in this The remaining syntactic constructs above are standard setting [19]. We develop a graphical model of prove- components of the Nested Relational Calculus: we in- nance for both database queries and simple workflows clude record and field projection operations, booleans in a uniform way. This should provide a foundation for and conditionals, and set operations. We employ the syn- studying more complex workflow language features such tax fe0(x) j x 2 eg for the “for-loop” or set comprehen- as nondeterminism, concurrency and while-loops. sion operation which evaluates e to a set fv1; : : : ; vng 0 0 The structure of the rest of this paper is as follows. In and returns the set of values fe (v1); : : : ; e (vn)g ob- 0 Section 2 we review the dataflow calculus DFL. In Sec- tained by evaluating e with x bound to each vi. The tion 3 we describe the structure of provenance graphs and expression S e flattens a nested collection. The expres- give examples showing how typical dataflows are trans- sion empty?(e) tests whether collection e is empty. lated to graphs. In Section 4 we describe the provenance We will use ordered-pair syntax (e1; e2) to abbreviate semantics of DFL programs, and give an executable, yet hfst : e1; snd : e2i, and write fst(e) or snd(e) instead of still high-level implementation in Haskell. Section 5 dis- πfst(e) or πsnd(e), respectively, for the first and second cusses how to express queries over the graphs, partic- projections of an ordered pair. We also assume a fixed, ularly inspired by where- and and why-provenance in finite set of attribute names Attr. databases, and outlines some future directions. Section 6 DFL and NRC are statically typed languages with an discusses related work and Section 7 concludes. arbitrary but fixed collection of atomic types, and an ar- Note. Our graphical model is fundamentally very bitrary but fixed signature that assigns types to the con- similar to the trace model developed in prior, unpub- stants and function symbols [5]. The static typing disci- lished work with Acar and Ahmed [6]. However, we pline ensures that expressions are always well-defined on make a different contribution: namely, we feel our graph- input values of the correct type. For ease of presentation theoretic presentation is more widely accessible than in what follows, we will ignore typing issues and silently the syntactic traces and operational semantics rules em- restrict attention to expressions and evaluations that are ployed to simplify proofs of their main results in [6]. well-defined in the conventional sense [5]. So whenever we apply, for example, e1 [ e2, e1 and e2 are assumed to correctly evaluate to sets. 2 Background The dataflow language DFL [17] is an extension of the 3 Value, evaluation and provenance graphs Nested Relational Calculus (NRC) that includes atomic values and functions. As we can only briefly introduce DFL expressions are normally evaluated over complex DFL here due to space reasons, we encourage readers values, which are nested combinations of atomic data unfamiliar with DFL and NRC to consult [5, 17] for more values d, tuples of complex values hA1 : v1;:::;An : details. In brief, the syntax of DFL is as follows: vni, and sets of complex values fv1; : : : ; vng. As we show in Section 3.1, we can easily represent complex 0 0 e; e ::= x j let x = e in e j c j f(e1; : : : ; en) values as trees or (with sharing) as directed acyclic graphs. Using such value graphs, we are going to rep- j πA(e) j hA1:e1;:::;An:eni j empty?(e) resent the evaluation of a DFL expression by means of j True j False j if e then e else e 1 2 a provenance graph in Section 3.3. A provenance graph 0 [ j ; j feg j e1 [ e2 j fe j x 2 eg j e is a two-sorted graph, consisting of a value graph and an evaluation graph (introduced in Section 3.2), that docu- Here, c denotes a constant atomic data value, drawn from ments the evaluation of a program. Moreover, there is a a set D, and f denotes a function. Atomic data values connection between the evaluation graph and the value may be values of base types such as integers or booleans graph in that each evaluation node is linked to a value or strings, but they may also be more complicated objects node. 2 3.1 Value graphs As with value graphs, we write labl(m) to indicate that l 0 A value graph G is a directed acyclic graph (V; E) with node x has label l and m ! m to indicate that nodes m 0 labels on the nodes and edges.
Recommended publications
  • Are We Losing Our Ability to Think Critically?
    news Society | DOI:10.1145/1538788.1538796 Samuel Greengard Are We Losing Our Ability to Think Critically? Computer technology has enhanced lives in countless ways, but some experts believe it might be affecting people’s ability to think deeply. OCIETY HAS LONG cherished technology alters the way we see, hear, the ability to think beyond and assimilate our world—the act of the ordinary. In a world thinking remains decidedly human. where knowledge is revered and innovation equals Rethinking Thinking Sprogress, those able to bring forth Arriving at a clear definition for criti- greater insight and understanding are cal thinking is a bit tricky. Wikipedia destined to make their mark and blaze describes it as “purposeful and reflec- a trail to greater enlightenment. tive judgment about what to believe or “Critical thinking as an attitude is what to do in response to observations, embedded in Western culture. There experience, verbal or written expres- is a belief that argument is the way to sions, or arguments.” Overlay technolo- finding truth,” observes Adrian West, gy and that’s where things get complex. research director at the Edward de For better or worse, exposure to technology “We can do the same critical-reasoning Bono Foundation U.K., and a former fundamentally changes how people think. operations without technology as we computer science lecturer at the Uni- can with it—just at different speeds and versity of Manchester. “Developing our formation can easily overwhelm our with different ease,” West says. abilities to think more clearly, richly, reasoning abilities.” What’s more, What’s more, while it’s tempting fully—individually and collectively— it’s ironic that ever-growing piles of to view computers, video games, and is absolutely crucial [to solving world data and information do not equate the Internet in a monolithic good or problems].” to greater knowledge and better de- bad way, the reality is that they may To be sure, history is filled with tales cision-making.
    [Show full text]
  • Autumn 2Copy2:First Draft.Qxd
    news THE ROYAL SOCIETY OF EDINBURGH ISSUE 26 AUTUMN/WINTER 2009 RESOURCE THE NEWSLETTER OF SCOTLAND’ S NATIONAL ACADEMY SCOTLAND’S RESEARCH TALENT In September 2009 over 70 researchers, mostly in the early stages of their careers, were invited to attend the RSE Annual Research Awards Ceremony. The Awards were presented by RSE President, Lord Wilson of Tillyorn, KT GCMG and Professor Alan Miller, RSE Research Awards Convener. Former BP Research Fellow, Professor Miles Padgett, (pictured below right) who held his Fellowship at the University of St Andrews between 1993 and 1995, also addressed the meeting. Professor Padgett now holds a personal Chair in Physics and is head of the Optics Group at the University of Glasgow. He was elected to the Fellowship of the RSE in 2001 and served as Young People’s Convener from 2005 to 2008, remaining an active member of that programme today. Lord Wilson is pictured with Dr Sinead Rhodes from the Department of Psychology, University of St Andrews whose research proposal was granted a small project fund in the Scottish Crucible programme. Full details of all the awards can be found inside. International Links Climate Change Debate Education Programme Scotland’s Research Talent Lessells Travel Scholarships Cormack Vacation Piazzi Smyth Bequest Dr Spela Ivekovic Scholarships Research Scholarship School of Computing, University of Dundee Dominic Lawson James Henderson Swarm Intelligence and Projective Department of Physics and Astronomy, Department of Physics, Geometry for Computer Vision University of Glasgow
    [Show full text]
  • The Best Nurturers in Computer Science Research
    The Best Nurturers in Computer Science Research Bharath Kumar M. Y. N. Srikant IISc-CSA-TR-2004-10 http://archive.csa.iisc.ernet.in/TR/2004/10/ Computer Science and Automation Indian Institute of Science, India October 2004 The Best Nurturers in Computer Science Research Bharath Kumar M.∗ Y. N. Srikant† Abstract The paper presents a heuristic for mining nurturers in temporally organized collaboration networks: people who facilitate the growth and success of the young ones. Specifically, this heuristic is applied to the computer science bibliographic data to find the best nurturers in computer science research. The measure of success is parameterized, and the paper demonstrates experiments and results with publication count and citations as success metrics. Rather than just the nurturer’s success, the heuristic captures the influence he has had in the indepen- dent success of the relatively young in the network. These results can hence be a useful resource to graduate students and post-doctoral can- didates. The heuristic is extended to accurately yield ranked nurturers inside a particular time period. Interestingly, there is a recognizable deviation between the rankings of the most successful researchers and the best nurturers, which although is obvious from a social perspective has not been statistically demonstrated. Keywords: Social Network Analysis, Bibliometrics, Temporal Data Mining. 1 Introduction Consider a student Arjun, who has finished his under-graduate degree in Computer Science, and is seeking a PhD degree followed by a successful career in Computer Science research. How does he choose his research advisor? He has the following options with him: 1. Look up the rankings of various universities [1], and apply to any “rea- sonably good” professor in any of the top universities.
    [Show full text]
  • Curating the CIA World Factbook 29 the International Journal of Digital Curation Issue 3, Volume 4 | 2009
    Curating the CIA World Factbook 29 The International Journal of Digital Curation Issue 3, Volume 4 | 2009 Curating the CIA World Factbook Peter Buneman, Heiko Müller, School of Informatics, University of Edinburgh Chris Rusbridge, Digital Curation Centre, University of Edinburgh Abstract The CIA World Factbook is a prime example of a curated database – a database that is constructed and maintained with a great deal of human effort in collecting, verifying, and annotating data. Preservation of old versions of the Factbook is important for verification of citations; it is also essential for anyone interested in the history of the data such as demographic change. Although the Factbook has been published, both physically and electronically, only for the past 30 years, we appear in danger of losing this history. This paper investigates the issues involved in capturing the history of an evolving database and its application to the CIA World Factbook. In particular it shows that there is substantial added value to be gained by preserving databases in such a way that questions about the change in data, (longitudinal queries) can be readily answered. Within this paper, we describe techniques for recording change in a curated database and we describe novel techniques for querying the change. Using the example of this archived curated database, we discuss the extent to which the accepted practices and terminology of archiving, curation and digital preservation apply to this important class of digital artefacts.1 1 This paper is based on the paper given by the authors at the 5th International Digital Curation Conference, December 2009; received November 2009, published December 2009.
    [Show full text]
  • Transforming Databases with Recursive Data Structures
    TRANSFORMING DATABASES WITH RECURSIVE DATA STRUCTURES Anthony Kosky A DISSERTATION in COMPUTER AND INFORMATION SCIENCE Presented to the Faculties of the University of Pennsylvania in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy. 1996 Susan Davidson— Supervisor of Dissertation Peter Buneman— Supervisor of Dissertation Peter Buneman— Graduate Group Chairperson c Copyright 2003 by Anthony Kosky iii To my parents. iv v WARRANTY Congratulations on your acquisition of this dissertation. In acquiring it you have shown yourself to be a computer scientist of exceptionally good taste with a true appreciation for quality. Each proof, algorithm or definition in this dissertation has been carefully checked by hand to ensure correctness and reliability. Each word and formula has been meticulously crafted using only the highest quality symbols and characters. The colours of inks and paper have been carefully chosen and matched to maximize contrast and readability. The author is confident that this dissertation will provide years of reliable and trouble free ser- vice, and offers the following warranty for the lifetime of the original owner: If at any time a proof or algorithm should be found to be defective or contain bugs, simply return your disser- tation to the author and it will be repaired or replaced (at the author’s choice) free of charge. Please note that this warranty does not cover damage done to the dissertation through normal wear-and-tear, natural disasters or being chewed by family pets. This warranty is void if the dissertation is altered or annotated in any way. Concepts described in this dissertation may be new and complicated.
    [Show full text]
  • A Survey on Scientific Data Management
    A Survey on Scientific Data Management References [1] Ilkay Altintas, Chad Berkley, Efrat Jaeger, Matthew Jones, Bertram Ludascher,¨ and Steve Mock. Kepler: An extensible system for design and execution of scientific workflows. In Proceedings of the 16th International Conference on Scientific and Statistical Database Management (SSDBM’04), pages 21– 23, 2004. [2] Alexander Ames, Nikhil Bobb, Scott A. Brandt, Adam Hiatt, Carlos Maltzahn, Ethan L. Miller, Alisa Neeman, and Deepa Tuteja. Richer file system metadata using links and attributes. In Proceedings of the 22nd IEEE / 13th NASA Goddard Conference on Mass Storage Systems and Technologies (MSST’05), pages 49–60, 2005. [3] Bill Anderson. Mass storage system performance prediction using a trace-driven simulator. In MSST ’05: Proceedings of the 22nd IEEE / 13th NASA Goddard Conference on Mass Storage Systems and Technologies, pages 297–306, Washington, DC, USA, 2005. IEEE Computer Society. [4] Phil Andrews, Bryan Banister, Patricia Kovatch, Chris Jordan, and Roger Haskin. Scaling a global file system to the greatest possible extent, performance, capacity, and number of users. In MSST ’05: Proceedings of the 22nd IEEE / 13th NASA Goddard Conference on Mass Storage Systems and Tech- nologies, pages 109–117, Washington, DC, USA, 2005. IEEE Computer Society. [5] Grigoris Antoniou and Frank van Harmelen. A Semantic Web Primer. The MIT Press, 2004. [6] Scott Brandt, Carlos Maltzahn, Neoklis Polyzotis, and Wang-Chiew Tan. Fusing data management services with file systems. In PDSW ’09: Proceedings of the 4th Annual Workshop on Petascale Data Storage, pages 42–46, New York, NY, USA, 2009. ACM. [7] Peter Buneman, Adriane Chapman, and James Cheney.
    [Show full text]
  • Towards a Multi-Discipline Network Perspective
    Manifesto from Dagstuhl Perspectives Workshop 12182 Towards A Multi-Discipline Network Perspective Edited by Matthias Häsel1, Thorsten Quandt2, and Gottfried Vossen3 1 Otto Group – Hamburg, DE, [email protected] 2 Universität Hohenheim, DE, [email protected] 3 Universität Münster, DE, [email protected] Abstract This is the manifesto of Dagstuhl Perspectives Workshop 12182 on a multi-discipline perspective on networks. The information society is shaped by an increasing presence of networks in various manifestations, most notably computer networks, supply-chain networks, and social networks, but also business networks, administrative networks, or political networks. Online networks nowadays connect people all around the world at day and night, and allow to communicate and to work collaboratively and efficiently. What has been a commodity in the private as well as in the enterprise sectors independently for quite some time now is currently growing together at an increasing pace. As a consequence, the time has come for the relevant sciences, including computer science, information systems, social sciences, economics, communication sciences, and others, to give up their traditional “silo-style” thinking and enter into borderless dialogue and interaction. The purpose of this Manifesto is to review where we stand today, and to outline directions in which we urgently need to move, in terms of both research and teaching, but also in terms of funding. Perspectives Workshop 02.–04. May, 2012 – www.dagstuhl.de/12182 1998 ACM Subject Classification A.0 General, A.2 Reference, H. Information Systems, J.4 So- cial and Behavioral Sciences, K.4 Computers and Society Keywords and phrases Networks, network infrastructure, network types, network effects, data in networks, social networks, social media, crowdsourcing Digital Object Identifier 10.4230/DagMan.2.1.1 Executive Summary The information society is shaped by an increasing presence of networks in various mani- festations.
    [Show full text]
  • O. Peter Buneman Curriculum Vitæ – Jan 2008
    O. Peter Buneman Curriculum Vitæ { Jan 2008 Work Address: LFCS, School of Informatics University of Edinburgh Crichton Street Edinburgh EH8 9LE Scotland Tel: +44 131 650 5133 Fax: +44 667 7209 Email: [email protected] or [email protected] Home Address: 14 Gayfield Square Edinburgh EH1 3NX Tel: 0131 557 0965 Academic record 2004-present Research Director, Digital Curation Centre 2002-present Adjunct Professor of Computer Science, Department of Computer and Information Science, University of Pennsylvania. 2002-present Professor of Database Systems, Laboratory for the Foundations of Computer Science, School of Informatics, University of Edinburgh 1990-2001 Professor of Computer Science, Department of Computer and Information Science, University of Pennsylvania. 1981-1989 Associate Professor of Computer Science, Department of Computer and Information Science, University of Pennsylvania. 1981-1987 Graduate Chairman, Department of Computer and Information Science, University of Pennsyl- vania 1975-1981 Assistant Professor of Computer Science at the Moore School and Assistant Professor of Decision Sciences at the Wharton School, University of Pennsylvania. 1969-1974 Research Associate and subsequently Lecturer in the School of Artificial Intelligence, Edinburgh University. Education 1970 PhD in Mathematics, University of Warwick (Supervisor E.C. Zeeman) 1966 MA in Mathematics, Cambridge. 1963-1966 Major Scholar in Mathematics at Gonville and Caius College, Cambridge. Awards, Visting positions Distinguished Visitor, University of Auckland, 2006 Trustee, VLDB Endowment, 2004 Fellow of the Royal Society of Edinburgh, 2004 Royal Society Wolfson Merit Award, 2002 ACM Fellow, 1999. 1 Visiting Resarcher, INRIA, Jan 1999 Visiting Research Fellowship sponsored by the Japan Society for the Promotion of Science, Research Institute for the Mathematical Sciences, Kyoto University, Spring 1997.
    [Show full text]
  • The Hyperview Approach to the Integration of Semistructured Data
    The HyperView Approach to the Integration of Semistructured Data Lukas C. Faulstich1 Dissertation am Fachbereich Mathematik und Informatik der Freien Universität Berlin Eingereicht am: 16. November 1999 Verteidigt am: 15. Februar 2000 Gutachter: Prof. Dr. Heinz Schweppe Prof. Dr. Herbert Weber (TU Berlin) Prof. Dr. Hartmut Ehrig (TU Berlin) Betreuer: Prof. Dr. Heinz Schweppe Prof. Dr. Herbert Weber (TU Berlin) Prof. Dr. Hartmut Ehrig (TU Berlin) Dr. Ralf-Detlef Kutsche (TU Berlin) Dr. Gabriele Taentzer (TU Berlin) 1Supported by the German Research Society, Berlin-Brandenburg Graduate School on Distributed Infor- mation Systems (DFG grant no. GRK 316) ii to Myra iii Abstract In order to use the World Wide Web to answer a specific question, one often has to collect and combine information from multiple Web sites. This task is aggravated by the structural and se- mantic heterogeneity of the Web. Virtual Web sites are a promising approach to solve this problem for particular, focused application domains. A virtual Web site is a Web site that serves pages containing concentrated information that has been extracted, homogenized,andcombined from several underlying Web sites. The goal is to save the user from tediously searching and browsing multiple pages at all these sites. The HyperView approach to the integration of semistructured data sources presented in this thesis provides a methodology, a formal framework, and a software environment for building such virtual Web sites. To achieve this kind of integration, data has to be extracted from external Web documents, integrated into a common representation, and then presented to the user in form of Web documents.
    [Show full text]
  • File Size for Two Types of Query As the Retrieved Resultset Increases
    PROVENANCE SUPPORT FOR SERVICE-BASED INFRASTRUCTURE Shrija Rajbhandari School of Computer Science Cardiff University This thesis is submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy April 2007 UMI Number: U585009 All rights reserved INFORMATION TO ALL USERS The quality of this reproduction is dependent upon the quality of the copy submitted. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if material had to be removed, a note will indicate the deletion. Dissertation Publishing UMI U585009 Published by ProQuest LLC 2013. Copyright in the Dissertation held by the Author. Microform Edition © ProQuest LLC. All rights reserved. This work is protected against unauthorized copying under Title 17, United States Code. ProQuest LLC 789 East Eisenhower Parkway P.O. Box 1346 Ann Arbor, Ml 48106-1346 Declaration This work has not previously been accepted in substance for any degree and is not concurrently submitted in candidature for any degree. Signed (candidate) Date STATEMENT 1 This thesis is being submitted in partial fulfillment of the requirements for the degree of PhD. C \ - j r s Signed ................. 2 2 2 2 ^ ......................... (candidate) D ate...... STATEMENT 2 This thesis is the result of my own investigations, except otherwise stated. Other sources are acknowledged by explicit references. Signed ................ 2 2 ^ 2 ........................ (candidate) Date ..... STATEMENT 3 I hereby give consent for my thesis, if accepted, to be available for photocopying and for interlibrary loan, and for the title and summary to be made available to outside organisations.
    [Show full text]
  • Using Links to Prototype a Database Wiki
    Using Links to prototype a Database Wiki James Cheney, Sam Lindley Heiko Muller¨ University of Edinburgh Tasmanian ICT Centre CSIRO , [email protected] Hobart, Australia [email protected] [email protected] ABSTRACT versioning, annotation, and provenance, which are especially im- Both relational databases and wikis have strengths that make portant for scientific databases [7]. Many of these features can be them attractive for use in collaborative applications. In the last added to existing systems in ad hoc ways, but this can be expensive. decade, database-backed Web applications have been used exten- Wikis are user-editable Web sites that have grown very popular sively to develop valuable shared biological references called cu- in the last decade. A prime example is Wikipedia, which is displac- rated databases. Databases offer many advantages such as scala- ing standard print reference works. Wikis allow users to edit their bility, query optimization and concurrency control, but are not easy content almost as casually as they search or browse. Wikis support to use and lack other features needed for collaboration. Wikis have collaboration and transparency by recording detailed change histo- become very popular for early-stage biocuration projects because ries and allowing space for discussion. Because they are free and they are easy to use, encourage sharing and collaboration, and pro- relatively simple to set up and configure, wikis are becoming popu- lar for nascent biological database projects: for example, the Gene vide built-in support for archiving, history-tracking and annotation. 1 However, curation projects often outgrow the limited capabilities of Wiki Portal lists over 15 biological wiki projects, and a wiki was used to coordinate scientific response to the swine flu outbreak in wikis for structuring and efficiently querying data at scale, necessi- 2 tating a painful phase transition to a database-backed Web applica- early 2009 .
    [Show full text]
  • RSE Fellows Ordered by Area of Expertise As at 11/10/2016
    RSE Fellows ordered by Area of Expertise as at 11/10/2016 HRH Prince Charles The Prince of Wales KG KT GCB Hon FRSE HRH The Duke of Edinburgh KG KT OM, GBE Hon FRSE HRH The Princess Royal KG KT GCVO, HonFRSE A1 Biomedical and Cognitive Sciences 2014 Professor Judith Elizabeth Allen FRSE, FMedSci, Professor of Immunobiology, University of Manchester. 1998 Dr Ferenc Andras Antoni FRSE, Honorary Fellow, Centre for Integrative Physiology, University of Edinburgh. 1993 Sir John Peebles Arbuthnott MRIA, PPRSE, FMedSci, Former Principal and Vice-Chancellor, University of Strathclyde. Member, Food Standards Agency, Scotland; Chair, NHS Greater Glasgow and Clyde. 2010 Professor Andrew Howard Baker FRSE, FMedSci, BHF Professor of Translational Cardiovascular Sciences, University of Glasgow. 1986 Professor Joseph Cyril Barbenel FRSE, Former Professor, Department of Electronic and Electrical Engineering, University of Strathclyde. 2013 Professor Michael Peter Barrett FRSE, Professor of Biochemical Parasitology, University of Glasgow. 2005 Professor Dame Sue Black DBE, FRSE, Director, Centre for Anatomy and Human Identification, University of Dundee. ; Director, Centre for Anatomy and Human Identification, University of Dundee. 2007 Professor Nuala Ann Booth FRSE, Former Emeritus Professor of Molecular Haemostasis and Thrombosis, University of Aberdeen. 2001 Professor Peter Boyle CorrFRSE, FMedSci, Former Director, International Agency for Research on Cancer, Lyon. 1991 Professor Sir Alasdair Muir Breckenridge CBE KB FRSE, FMedSci, Emeritus Professor of Clinical Pharmacology, University of Liverpool. 2007 Professor Peter James Brophy FRSE, FMedSci, Professor of Anatomy, University of Edinburgh. Director, Centre for Neuroregeneration, University of Edinburgh. 2013 Professor Gordon Douglas Brown FRSE, FMedSci, Professor of Immunology, University of Aberdeen. 2012 Professor Verity Joy Brown FRSE, Provost of St Leonard's College, University of St Andrews.
    [Show full text]