Myexperiment: Defining the Social Virtual Research Environment
Total Page:16
File Type:pdf, Size:1020Kb
myExperiment: Defining the Social Virtual Research Environment David De Roure 1, Carole Goble 2, Jiten Bhagat 2, Don Cruickshank 1, Antoon Goderis 2, Danius Michaelides 1 and David Newman 1 1School of Electronics & Computer Science, 2School of Computer Science, University of Southampton University of Manchester [email protected] [email protected] Abstract 1. It should facilitate the management and sharing of The myExperiment Virtual Research Environment Research Objects – these are the digital supports the sharing of research objects used by commodities that are used and reused by scientists, such as scientific workflows. For researchers, ranging from data and methods to researchers it is both a social infrastructure that scholarly publications. encourages sharing and a platform for conducting 2. It should support the social model : producers of research, through familiar user interfaces. For research objects should have incentives to make developers it provides an open, extensible and them available; consumers need to be able to participative environment. We describe the design, discover and reuse them; all will benefit from self- implementation and deployment of myExperiment and and community-curation. suggest that its four capabilities – research objects, 3. It should provide an open, extensible environment social model, open environment and actioning to permit ease of integration with other software, research – are necessary characteristics of an effective tools and services, and benefit from participative Virtual Research Environment for e-research and open contribution of software. science. 4. It should provide a platform to action research, for example to deliver research objects to remote 1. Introduction services and software. It should be straightforward to create customised, task specific tools and Scientific advance relies on a social process in environments. which scientists share ideas, methods and data. Capability (1) is a repository function, (2) is Traditionally this discourse is mediated by the characteristic of social web sites and (3) is related to scholarly publishing process, but scientists are open source community development. (4) is what increasingly turning to blogs, wikis and social makes these into a research environment – the research networks to facilitate this process, a phenomenon objects are not just stored and exchanged but they are sometimes characterised as Science 2.0 [1]. With this used in the conduct of research (we describe them as we also see a movement to open science where large actionable ). We note that the tenets of open science – scale, open distributed collaboration is enabled by open data, open access and open source – are making data, methods and results freely available on consistent with this definition. Implicit in all these the Web. capabilities is the notion that the interface, be it human The purpose of a Virtual Research Environment or programmatic, must be familiar and easy to use. (VRE) is “to provide researchers with the tools and With a view to establishing these four capabilities services they need to do research of any type as we have designed and built myExperiment, a social efficiently and effectively as possible” [2]. Reflecting web site for scientists which directly supports their our observations on the social process of science, we research. The myExperiment.org service went live in suggest that an effective virtual research environment November 2007 and has attracted considerable interest. should provide four key capabilities, and we propose In the period January-July 2008 the site received over these as the definition of the “Social Virtual Research 8,500 unique visitors and achieved over 1,000 Environment”: registered users. myExperiment is distinctive because it majors on the social dimension, and it can itself be invocation logs and documentation. They may also seen as an experiment to explore whether scientific collect multiple workflows together for sharing. These communities share sufficiently in order to benefit from collections are manifest to users as packs and are a the network effects of a social web site. distinctive feature of myExperiment. In this paper we report for the first time on the As the user communities of myExperiment increase construction and usage of the site, and the insights in number and breadth we are developing support for gained into achieving the four capabilities. Section 2 new research objects, such as experimental plans and presents the myExperiment project within the statistical models. framework of the capabilities and positions it with respect to related work. We then look at the software 2.1.2 Social model. To support producers of research design in Section 3 and the implementation and objects in contributing to myExperiment we provide deployment in Section 4. After an analysis of usage in members of the site with support for credit and Section 5 we close in Section 6 by revisiting the four attribution, and fine control over the visibility and capabilities and reflecting on our experience of ‘the sharing of research objects. Early user feedback myExperiment experiment’ at this stage in its revealed this to be the most critical factor in making a development. social web site acceptable for use by scientists. Other members of the site ‘consuming’ the research 2. The myExperiment VRE objects can view, download, tag, review and ‘favourite’ them, which aids their discovery and Scientific workflows are valuable commodities enhances reputation of the producers. Additionally, which require expertise to build [3]. myExperiment content exposed publicly is discoverable through was motivated by observing a clear need to share search engines. workflows – to reduce reinvention, propagate best Unless they are maintained, workflows and other practice and enable scientists to concentrate on science research objects can cease to be reusable over time – – amongst a fairly decoupled community of workflow they effectively ‘decay’, though in fact it is their users. It was also motivated by a frustration with context that is changing. For example, a recent change existing systems which: (a) missed the social in gene identifiers by one service provider led to a dimension, merely making things available rather than myExperiment announcement for users of the affected encouraging and controlling sharing; (b) presented workflows. Useful workflows will be curated by the complex user interfaces out of line with the popular community that uses them, and the original authors are web sites that people are using on an everyday basis, also encouraged to curate because they are getting thereby demanding further skill. The motivation and credit for use of their work. Workflow decay is a rationale for the myExperiment project is discussed in difficult problem and myExperiment provides a new detail in [4-5] and the design principles in [6]. approach through community curation. 2.1. myExperiment capabilities 2.1.3 Open environment. myExperiment has paid as much attention to its developer community as it has to myExperiment addresses the four capabilities of our designing the user interface. VRE definition as follows: By creating tools to manage the API, the exposed functionality is highly customisable in response to 2.1.1 Research Objects. Our key research objects are requirements. The API has enabled new interfaces to scientific workflows and their associated objects (such be built, such as Google Gadgets and Facebook Apps. as data and documentation). We have extra support for It also enables existing interfaces to incorporate specific workflow formats so that we can ‘look inside’ myExperiment functionality, such as a wiki or the these compound objects to extract metadata, provide a Taverna workflow workbench. graphical rendering and possibly identify the services myExperiment always prefers reuse to reinvention that are used. We already provide this full range of and can easily access other services. It is designed to support for Taverna workflows [7] and we are be part of the scholarly knowledge cycle and is currently developing support for other systems. compatible with Open Archives Initiative protocols. Significantly myExperiment also supports research While it provides a workflow repository function, objects which are collections of other objects, because much of the associated information – such as data and researchers work with collections of items associated publications – may be held in other repositories, so with an experiment – for example, a specific version of myExperiment makes it easy to refer to external a workflow together with input and output data, service content. In addition to the API, the myExperiment codebase Inforsense’s commercial Customer Hub is open source and can be used by anyone to set up (www.chub.inforsense.com) has the ambition of their own myExperiment instance. enabling Inforsense workflow users to share best practices and leverage community knowledge. 2.1.4 Actioning research. myExperiment is designed However, it does not rely on the social model. to call upon external services to process research Galaxy (galaxy.psu.edu) provides a public site objects. Taverna workflows are executed by where biologists can run analyses and for developers it myExperiment submitting a collection of research provides an open-source framework for tool and data objects for remote processing to an enactor, and the integration. It does not provide social infrastructure to results are automatically collected back into support