9th International Workshop on Science Gateways (IWSG 2017), 19-21 June 2017

Enacting Open Science by gCube

Massimiliano Assante, Leonardo Candela, Donatella Castelli, Gianpaolo Coro, Francesco Mangiacrapa, Pasquale Pagano, Costantino Perciante
Istituto di Scienza e Tecnologie dell’Informazione “A. Faedo”, Consiglio Nazionale delle Ricerche
via G. Moruzzi, 1, 56124, Pisa, Italy
Email: {name.surname}@isti.cnr.it

Abstract: The Open Science movement is promising to revolutionise the way science is conducted, with the goal of making it fairer, more solid and more democratic. This revolution is destined to remain just a wish if it is not supported by changes in culture and practices as well as in enabling technologies. This paper describes the gCube offering to enact Open Science-friendly Virtual Research Environments. In particular, the paper describes how a complete solution suitable for realising Open Science practices is achieved by a social networking collaborative environment in conjunction with a shared workspace, an open data analytics platform, and a catalogue enabling FAIR principles on every research artefact.

Keywords: Open Science; gCube; D4Science; Social Networking; Analytics; Publishing

I. INTRODUCTION

Open Science is promising to revolutionise the entire science enterprise by envisioning and proposing new practices aiming at making science better [1]–[3]. The promised benefits include (i) better interpretation, understanding and reproducibility of research activities and results; (ii) enhanced transparency in the science lifecycle and improved detection of “scientific fraud”; (iii) reduction of the overall cost of research, by promoting re-use of results; (iv) introduction of comprehensive and fair scientific reward criteria capturing all facets and contributions in the research lifecycle; and (v) better identification and assessment of scientific results within the “tsunami of scientific literature” witnessed by scientists.

These benefits are destined to remain a wish if the research community as a whole (funding agencies, research performing organizations, publishers, research infrastructures, scientists, as well as citizens) does not fully embrace Open Science and put in place efforts and initiatives aiming at making it the norm. The good news is that the movement is gaining momentum and consensus. In fact, funding agencies are developing policies supporting its implementation and are supporting the development of infrastructures and services. Moreover, research infrastructures are starting to offer services and facilities going in the direction of Open Science. Scientists try to overcome the limitations of scholarly communication practices by relying on services and technologies to “publish” non-traditional research artefacts (datasets, workflows, software). The bad news is that the implementation of this movement is confronted with a number of barriers including: (a) cultural factors, e.g. the fear of losing control over datasets and their exploitation; (b) cost-based factors, e.g. the extra effort needed to make a research artefact exploitable by a user other than the initial owner; and (c) disincentive factors, e.g. the effort spent in “publishing” research artefacts going beyond papers is given little or no value in researchers’ assessment and career success.

This paper describes the solution proposed by gCube [4] to overcome some of the above-mentioned barriers (in particular those tied to cost-based arguments) by providing researchers and practitioners with a working environment where Open Science practices are transparently promoted. gCube is a software system specifically conceived to enable the construction and development of Virtual Research Environments (VREs) [5], i.e. web-based working environments tailored to support the needs of their designated community working on a research question. Besides providing their users with the domain-specific facilities, i.e. datasets and services suitable for the research question, each VRE is equipped with basic services supporting collaboration and cooperation among its users, namely: (i) a shared workspace to store and organise any version of a research artefact; (ii) a social networking area to have discussions on any topic (including working versions and released artefacts) and be informed on happenings; (iii) a data analytics platform to execute processing tasks, either provided by a user or provided by others, to be applied to users’ cases and datasets; and (iv) a catalogue-based publishing platform to make the existence of a certain artefact public and disseminated. These facilities are at the fingertips of VRE users. They continuously and transparently capture research activities, authors and contributors, as well as every by-product resulting from every phase of a typical research lifecycle, thus reducing the issues related to Open Science and its communication [6], [7].

II. RELATED WORKS

There are plenty of tools and approaches supporting Open Science [8]. They include (a) repositories maintaining different versions of datasets and software to promote their citation and reuse, e.g. Dryad, GitHub, Zenodo, figshare; (b) tools aiming at promoting and enacting the production of new forms of publications to make the release of research results more effective and comprehensive, e.g. interactive notebooks and enhanced publications [9]; (c) tools aiming at making the research assessment process more open, transparent, holistic and participative, e.g. open peer review, post-publication review, annotation and commenting tools, social networks for scientists like ResearchGate [10]. One of the major barriers preventing the systematic exploitation and uptake of these tools by scientific communities and application contexts is related to their “fragmentation”: scientists have to jump across several platforms to get a complete picture of a research activity and its current and future results. Whenever possible, the “pieces” resulting from a research activity are linked to each other, either by explicit links or by derived links, e.g. [11]. However, this link-based mechanism is quite fragile and costly to keep healthy and up to date, and this is leading to research packaging formats [12].

Scientific workflow technologies [13] are adopted by an increasing number of communities to automate scientific methods and procedures. Environments for publishing and sharing workflows exist, yet there are limited guarantees that the method captured by a workflow works seamlessly in settings other than the original one. Moreover, the act of publishing workflows is not systematic across communities and contexts.

The Open Science Framework is a web-based service that makes it possible to keep files, data, and protocols pertaining to any user-defined project in a single, shared place. It provides for credits, citation and versioning as well as for carefully deciding what is going to be shared with whom. This platform shares commonalities with the gCube-based set of facilities described in this paper, e.g. a project is like a VRE, yet the VRE-based approach makes available to its users in one place the entire set of facilities and services they need to perform their research, thus going beyond the pure sharing of material.

III. VIRTUAL RESEARCH ENVIRONMENTS BASED SCIENTIFIC WORKFLOWS

Figure 1 depicts the main components and facilities offered by gCube to support collaborative activities and enable Open Science practices, namely a shared workspace (cf. Sec. III-A), a social networking area (cf. Sec. III-B), a data analytics platform (cf. Sec. III-C), and a publishing platform (cf. Sec. III-D). These components are all interrelated and realize a “system” where (i) research artefacts seamlessly flow across the various components to be “managed” according to the components’ purpose, e.g. being openly discussed by social networking practices, and (ii) research artefacts are continuously enriched and enhanced with metadata capturing their entire lifecycle, their versions, and the detailed list of authors and tasks performed leading to the current development status and shape.

Fig. 1. gCube-based Open Science Framework

All these components (a) are conceived to operate in a well-defined application context corresponding to the Virtual Research Environment they are serving, i.e. the VRE members are the primary researchers and practitioners expected to have fully-fledged access to the artefacts shared by a VRE; (b) are conceived to open up research artefacts, independently of their maturity level, beyond VRE boundaries (no lock-in) yet according to artefact owner policies, i.e. it is up to artefact owners to decide when a certain item resulting from a research activity is “ready” to be released and how (only metadata, role-based access to payloads, usage license); (c) are operated by relying on an infrastructure that guarantees a known quality of service, thus promoting community uptake, i.e. scientists might be reluctant to migrate their working environment towards innovative and “cloud”-based ones [14], so the proposed facilities should be as easy to use as possible, protect consolidated practices, and guarantee that scientists continue to get on with their daily job.

Fig. 2. gCube-based Open Science Workflow

A prototypical and simplified scientific workflow enacted by these components is (cf. Fig. 2): (i) Dr. Smith wishes to investigate the impact of a certain