<<

9th International Workshop on Gateways (IWSG 2017), 19-21 June 2017

Enacting by gCube

Massimiliano Assante, Leonardo Candela, Donatella Castelli, Gianpaolo Coro, Francesco Mangiacrapa, Pasquale Pagano, Costantino Perciante Istituto di Scienza e Tecnologie dell’Informazione “A. Faedo” – Consiglio Nazionale delle Ricerche via G. Moruzzi, 1 – 56124, Pisa, Italy Email: {name.surname}@isti.cnr.it

Abstract—The Open Science movement is promising to rev- the initial owner; and (c) disincentive factors, e.g. the effort olutionise the way science is conducted with the goal to make spent in “ artefacts going beyond papers it more fair, solid and democratic. This revolution is destined is receiving little or no value in researchers assessment and to remain just a wish if it is not supported by changes in culture and practices as well as in enabling technologies. This career success. paper describes the gCube offering to enact Open Science- This paper describes the solution proposed by gCube [4] to friendly Virtual Research Environments. In particular, the paper overcome some of the above mentioned barriers – in particular describes how a complete solution suitable for realising Open those tied to cost-based arguments – by providing researchers Science practices is achieved by a social networking collaborative and practitioners with a working environment where Open environment in conjunction with a shared workspace, an analytics platform, and a catalogue enabling FAIR principles Science practices are transparently promoted. gCube is a on every research artefact. software system specifically conceived to enable the con- Keywords—Open Science; gCube; ; Social Network- struction and development of Virtual Research Environments ing; Analytics; Publishing; (VREs) [5], i.e. web-based working environments tailored to support the needs of their designated community working on I.INTRODUCTION a research question. Beside providing their users with the Open Science is promising to revolutionise the entire sci- domain-specific facilities, i.e. datasets and services suitable ence enterprise by envisioning and proposing new practices for the research question, each VRE is equipped with basic aiming at making science better [1]–[3]. The promised ben- services supporting collaboration and cooperation among its efits include (i) better interpretation, understanding and re- users, namely: (i) a shared workspace to store and organise producibility of research activities and results; (ii) enhanced any version of a research artefact; (ii) a social networking area in science lifecycle and improvement in “sci- to have discussions on any topic (including working version entific fraud” detection; (iii) reduction of the overall cost of and released artefacts) and be informed on happenings; (iii) research, by promoting re-use of results; (iv) introduction of a data analytics platform to execute processing tasks either comprehensive and fair scientific reward criteria capturing all provided by a user or provided by others to be applied to facets and contributions in research life-cycle; and (v) better users’ cases and datasets; and (iv) a catalogue-based publishing identification and assessment of scientific results within the platform to make the existence of a certain artefact public and “tsunami of scientific literature” witnessed by scientists. disseminated. These facilities are at the fingerprint of VRE These benefits are destined to remain a wish if the research users. They continuously and transparently capture research community as a whole (funding agencies, research performing activities, authors and contributors, as well as every by-product organizations, publishers, research infrastructures, scientists, resulting from every phase of a typical research lifecycle as well as citizens) does not fully embrace Open Science thus reducing the issues related with Open Science and its and put in place efforts and initiatives aiming at making it communication [6], [7]. the norm. The good news is that the movement is gaining momentum and consensus. In fact, funding agencies are de- II. RELATED WORKS veloping policies supporting its implementation as well as are There are plenty of tools and approaches supporting Open supporting the development of infrastructures and services. Science [8]. They include (a) repositories maintaining different Moreover, research infrastructures start offering services and versions of datasets and software to promote their citation facilities going in the direction of Open Science. Scientists try and reuse, e.g. , GitHub, , figshare; (b) tools to overcome the limitations of prac- aiming at promoting and enacting the production of new tices by relying on services and technologies to “publish” non- forms of publications to make the release of research results traditional research artefacts (datasets, workflows, software). more effective and comprehensive, e.g. interactive notebooks The bad news is that the implementation of this movement is and enhanced publications [9]; (c) tools aiming at making confronting with a number of barriers including: (a) cultural more open, transparent, holistic and participative the research factors, e.g. the fear to lose the control and exploitation of assessment process, e.g. open , post-publication datasets; (b) cost-based factors, e.g. the extra-effort needed review, annotation and commenting tools, social networks for to make a research-artefact exploitable by a user other than scientists like ResearchGate [10]. One of the major barrier 9th International Workshop on Science Gateways (IWSG 2017), 19-21 June 2017

Fig. 1. gCube-based Open Science Framework preventing the systematic exploitation and uptake of these the components’ purpose, e.g. being openly discussed by social tools by scientific communities and application contexts is networking practices, (ii) research artefacts are continuously related to their “fragmentation”, scientists have to jump across enriched and enhanced with metadata capturing their entire several platforms to get a complete picture of a research lifecycle, their versions, and the detailed list of authors and activity and its current and future results. Whenever possible, tasks performed leading to the current development status and the “pieces” resulting from a research activity are linking each shapes. other, either by explicit links or by derived links, e.g. [11]. All these components (a) are conceived to operate in a However this link-based mechanism is quite fragile and costly well defined application context corresponding to the Virtual to keep healthy and up to date, and this is leading to research Research Environment they are serving, i.e. the VRE members packaging formats [12]. are the primary researchers and practitioners expected to have Scientific workflows technologies [13] are adopted by an fully-fledged access to the artefact shared by a VRE; (b) increasing number of communities to automate scientific meth- are conceived to open up research artefacts, independently of ods and procedures. Environments enabling to publish and their maturity level, beyond VRE boundaries (no lock-in) yet share workflows exist, yet the guarantees that the method according to artefact owner policies, i.e. it is up to artefact captured by the workflow seamlessly works in settings other owners to decide when a certain item resulting from a research then the originator ones are limited. Moreover, the act of activity is “ready” to be released and how (only metadata, role- publishing workflows is not systematic across communities based access to payloads, usage license); (c) are operated by and contexts. relying on an infrastructure that guarantees a known quality of The Open Science Framework is a web-based service en- service thus promoting community uptake, i.e. scientists might abling to keep files, data, and protocols pertaining to any be reluctant to migrate their working environment towards user defined project in a single, shared place. It provides for innovative and “cloud”-based ones [14], the proposed facilities credits, citation and versioning as well as for carefully deciding should be as much as possible easy to use, protect consolidated what is going to be shared with whom. This platform share practices, and guarantee that scientists continue to get on with commonalities with the gCube based set of facilities described their daily job. in this paper, e.g. a project is like a VRE, yet the VRE based approach make available to its users in one place the entire set of facilities and services they need to perform their research, thus going beyond the pure sharing of material.

III. VIRTUAL RESEARCH ENVIRONMENTS BASED SCIENTIFIC WORKFLOWS Figure 1 depicts the main components and facilities offered by gCube to support collaborative activities and enable Open Science practices, namely a shared workspace (cf. Sec. III-A), a social networking area (cf. Sec. III-B), a data analytics platform (cf. Sec. III-C), and a publishing platform (cf. Sec. Fig. 2. gCube-based Open Science Workflow III-D). These components are all correlated each other and realize a “system” where (i) research artefacts seamlessly flow A prototypical and simplified scientific workflow enacted across the various components to be “managed” according to by these components is (cf. Fig. 2): (i) Dr. Smith is willing to

2 9th International Workshop on Science Gateways (IWSG 2017), 19-21 June 2017 investigate the impact of a certain alien species in the Mediter- ranean sea and announces this willingness by a post (social networking); (ii) Dr. Green and Dr. Rossi start collaborating with Dr. Smith by organising and populating a shared folder with suitable material, e.g. datasets, notes, papers (workspace); (iii) Dr. Smith and Dr. Rossi propose two diverse models aiming at capturing the effects of the selected species on Mediterranean sea ecosystem, they implement and make them available (data analytics); (iv) the availability of these early- results suggests Dr. Bahl to start a study on another species he developed a model for in the past and leads Dr. Bahl to create another workspace folder with specific material and produce another version of Dr. Rossi’s model; (vi) Dr. Smith, Dr. Green and Dr. Rossi execute a large set of concurrent experiments, Fig. 3. gCube Workspace screenshot make available every dataset resulting from them (workspace, publishing), and announce their findings by also preparing integrating the workspace facility in other applications (e.g. it a paper. Meanwhile, Dr. Wang start re-using the model(s) is exploited by the Analytics Platform Portlet to give seamless produced by Dr. Smith et al. as well as Dr. Bahl’s one to access to workspace items), and (ii) a RESTful API suitable analyse certain datasets she owns, spot a potential implemen- for any web-based programmatic access. tation issue affecting all of the models, produces and publishes corrected versions, and “annotate” the initial models with her findings; (vii) being alerted by Dr. Wang annotation, Dr. Smith et al. decide to re-execute their experiments on other datasets by using both their version of the model and Dr. Wang’s one to realise that Dr. Wang model better suits with their initial hypothesis (all of this happen well before their paper being published). This representative workflow can be easily and effectively implemented only by relying on a suite of facilities like those offered by gCube where the “place” where research activity is conducted and the “place” the activity is published and immediately communicated are the same. In other settings Fig. 4. gCube Workspace Platform Architecture where there is a decoupling of the “place” where research is performed (the scientists workbench) from the place where The distinguishing features of this platform for Open Sci- research is communicated, e.g. papers containing links to ence are the following: (i) every workspace item is equipped supporting material, the implementation of this scenario is with an actionable unique identifier that can be used for more challenging and expensive, if feasible at all. citation and access purposes; (ii) every workspace item is versioned and a new version is automatically produced when- A. The Workspace Platform ever the item is explicitly changed by the user or any appli- Figure 3 depicts the user interface of the workspace facility, cation/service of the VRE on behalf of an authorised user; i.e. the area VRE users rely on to organise their material and (iii) every item, be it a single item or a folder, is equipped have access to the material shared with others. It resembles with rich and extensible metadata (“key, value” pairs) that a typical file system with files organised in folders, yet it capture descriptive features as well as lineage features; (iv) supports an open-ended set of items that are (a) equipped with three typologies of folders are supported: private, content is rich and extensible metadata and (b) actually stored by an array available for the owner only; shared, content is available for of storage solutions [4]. selected users decided by the owner; and VRE folder, content Figure 4 depicts the software architecture characterising the is available to VRE members; (v) the workspace is tightly workspace facility. This facility relies on Apache Jackrabbit integrated with both the social networking and catalogue for for storing and managing workspace items – actually their easing the dissemination of its artefact (either single items or metadata – by means of specific node types and attributes as groups of items). “key, value” pairs. Items payload is stored by relying on a hybrid storage solution [4] that, by means of ad-hoc plugins, B. The Social Networking Collaborative Platform exploits various storage solutions suitable for diverse typolo- Figure 5 depicts the user interface of the social networking gies of content, e.g. MongoDB for binary files, GeoServer area, i.e. the area VRE users rely on to communicate with and THREDDS Data Servers for geospatial data, RDB for their VRE co-workers and be informed on others achieve- tabular data. In addition to the portlet previously discussed, ments, discussions and opinions. It resembles a social network the workspace facility is offered by (i) a widget suitable for with posts, tags, mentions, comments and reactions, yet its

3 9th International Workshop on Science Gateways (IWSG 2017), 19-21 June 2017 integration with the rest makes it a powerful and flexible C. The Data Analytics Platform communication channel for scientists. Figure 7 depicts the user interface of the data analytics area, i.e. the area VRE users rely on to perform their analytics tasks. It resembles a stand-alone analytics platform, e.g. Weka, with a collection of ready-to-use algorithms and methods, yet it relies on a distributed and heterogeneous computing infrastructure enacting to executed complex tasks.

Fig. 5. gCube Social Networking screenshot

Figure 6 depicts the software architecture characterising the Fig. 7. gCube Data Analytics Platform screenshot social networking collaborative platform. This facility relies on the Social Networking Engine, a Cassandra database [15] Figure 8 depicts the software architecture characterising the for storing social networking related data and Elasticsearch analytics platform. The DataMiner Master is a web service [16] for the retrieval of social networking data. The Engine in charge to accept requests for executing processes and exposes its facilities by an HTTP REST Interface and com- executing them, either locally or by relying on the DataMiner prises two services: (i) the Social Networking Service that Worker(s) depending from the specific process. The service is efficiently store and accesses social networking data (Posts, conceived to work in a cluster of replica services operating Comments, Notifications, etc.) in the underlying Cassandra behind a proxy acting as load balancer. It is offered by a Cluster. and (ii) the Social Networking Indexer Service that standard web-based protocol, i.e. OGC WPS1; The DataMiner builds Elasticsearch indices to perform search operations over Worker is a web service in charge to execute the processes it the social networking data. is assigned to by a Master. The service is conceived to work in a cluster of replica services and is offered by a standard web- based protocol, i.e. OGC WPS. Both the services are conceived to be deployed and operated by relying on various providers, e.g. Master and Worker instances can be deployed on private or public cloud providers. DataMiner Master and Worker instances execute processes based on an open set of algorithms hosted by a dedicated repository, the DataMiner Algorithms Repository. Two kinds of algorithms are hosted: “local” and “distributed” algorithms. Local algorithms are directly exe- cuted on a DataMiner Master instance and possibly use parallel processing on several cores and a large amount of memory. Distributed algorithms use with a Map- Fig. 6. gCube Social Networking Collaborative Platform Architecture Reduce approach and rely on the DataMiner Worker instances in the Worker cluster. The Algorithm Importer portlet and the Algorithm Publisher service enable users to inject new The distinguishing features of this platform for Open Sci- algorithms into the platform by using various programming ence are the following: (i) every item is equipped with an languages [17]. actionable unique identifier that can be used for citation and The distinguishing features of this platform for Open Sci- access purposes; (ii) the discussion patterns enabled are really ence are the following: (i) every process hosted by the platform transparent and open; every (re)action performed by a user – is equipped with an actionable unique identifier that can be be it a new post, a reply to a post, or the rating of a certain post used for citation and access purposes; (ii) the offering and or post reply – is carefully captured and documented; (iii) there publication of user provided processes (e.g. scripts, compiled is no pre-defined way to structure a discussion; users can start programs) by an as-a-Service standard-based approach (pro- new discussion threads, annotate them with tags for easing the cataloguing and discovery, refer to other threads and material 1OGC Web Processing Servicehttp://www.opengeospatial.org/standards/ both internally stored and available on the web. wps

4 9th International Workshop on Science Gateways (IWSG 2017), 19-21 June 2017

Fig. 8. gCube Analytics Platform Architecture cesses are described and exposed by the OGC Web Processing Service standard); (iii) the ability to manage and support processes produced by using several programming languages (e.g. , Java, Fortran, Phyton); (iv) the automatic production of a detailed provenance record for every analytics task executed by the platform, i.e. the overall input/output data, parameters, and metadata that would allow to reproduce and repeat the task are stored into the workspace and documented by a Fig. 9. gCube Platform screenshot PROV-O-based accompanying record; (v) integration with the shared workspace to implement collaborative experimental spaces, e.g. users can easily share datasets, methods, code;(vi) directives on how to exploit attributes for items organisa- support for using a Map-Reduce approach tional purposes, e.g. automatically transform values in tags for computing and data intensive processing; (vii) extensibility or exploit the values for creating collections or groups of of the platform to quasi-transparently rely on and adapt to a items. On top of this Catalogue Service, gCube offers several distributed, heterogeneous and elastically provided array of components to make publication of items easier for VRE users workers to execute the processing tasks. and services. A Catalogue Portlet, accessible in each VRE, allows to navigate the catalogue content as well as to publish D. The Publishing Platform content by exploiting the Publishing Widget. This widget is Figure 9 depicts the user interface of the publishing plat- also embedded into the Workspace portlet, so users can publish form, i.e. the facility VRE users rely on to announce and be folders and/or files directly from there. External services can informed on the availability of certain artefacts at diverse ma- access the catalogue content and publish new items via the turity levels. It resembles a catalogue of artefacts with search gCube Catalogue RESTful APIs. The Catalogue Service relies and browse, yet the with respect to the typologies on the Workspace and Storage Hub (cf. Sec. III-A) for storing of products published, the metadata to document them as well the payload of the published items. as the integration with the rest make it a flexible environment. Every published item in the catalogue is characterised by (i)a type, which highlights its features and allows an easier search, (ii) an open ended set of metadata which carefully describe the item, and (iii) optional resource(s) representing the actual payload of the item. Figure 10 depicts the software architecture characterising the publishing platform. This platform primarily relies on CKAN technology, i.e. an software enabling to 2 build and operate open data portals / catalogues . This core Fig. 10. gCube Publishing Platform Architecture technology has been wrapped and extended by means of the Catalogue Service, a component realising the business logic The distinguishing features of this platform for Open Sci- of the publishing platform. The Catalogue Service enact the ence are the following: (i) every catalogue item is equipped management of Catalogue Item Types, i.e. specifications of with an actionable, persistent, unique identifier that can be diverse typologies of items supported. Each catalogue item used for citation and access purposes; (ii) whenever a cata- type carefully defines the metadata elements characterising logue item is published, the associated payload(s) is stored in the item typology by specifying the names of the attributes, a persistent storage area to guarantee its long-term availability; the possible values, whether an attribute is single instance (iii) every catalogue item is equipped with a license carefully or repeatable. In addition to that, each item type contains characterising the possible (re-)uses; (iv) every publication of an item leads to the automatic production of a post in the social 2CKAN technology website https://ckan.org/ networking area of the VRE to inform its members; (v) every

5 9th International Workshop on Science Gateways (IWSG 2017), 19-21 June 2017

catalogue item is equipped with rich and open metadata, i.e. [8] B. Kramer and J. Bosman, “Innovations in scholarly communication - it is possible to carefully customise the typologies of products global survey on research tool usage [version 1; referees: 2 approved],” F1000Research, vol. 5, no. 692, 2016. and the accompanying metadata to the community needs. [9] A. Bardi and P. Manghi, “Enhanced publications: Data models and information systems,” LIBER Quarterly, vol. 23, no. 4, pp. 240–273, IV. CONCLUSION AND FUTURE WORK 2014. [10] M. Thelwall and K. Kousha, “ResearchGate: Disseminating, commu- This paper described a suite of tools overall realising Open nicating, and measuring scholarship?” Journal of the Association for Science-friendly working environments. These tools support Information Science and Technology, vol. 66, no. 5, pp. 876–889, 2015. all the phases of typical research lifecycles and transparently [11] A. Burton, H. Koers, P. Manghi, M. Stocker, M. Fenner, A. Aryani, S. La Bruzzo, M. Diepenbroek, and U. Schindler, “The scholix frame- inject practices aiming at making the entire process leading work for interoperability in data-literature information exchange,” D-Lib to a certain version of a research artefact more transparent Magazine, vol. 23, no. 1/2, 2017. and repeatable without posing additional requirements for [12] S. Bechhofer, I. Buchan, D. De Roure, P. Missier, J. Ainsworth, J. Bha- gat, P. Couch, D. Cruickshank, M. Delderfield, I. Dunlop, M. Gamble, scientists. They are conceived to make the “publishing” act an D. Michaelides, S. Owen, D. Newman, S. Sufi, and C. Goble, “Why easy, dynamic, comprehensive, lossless and holistic task where is not enough for scientists,” Future Generation Computer owners retain the control of and credit for every published Systems, vol. 29, no. 2, pp. 599–611, 2013. [13] C. S. Liew, M. P. Atkinson, M. Galea, T. F. Ang, P. Martin, and artefact that, being interlinked with other artefacts and the J. I. V. Hemert, “Scientific workflows: Moving across paradigms,” ACM working environment exploited for their production, cater for Computing Surveys, vol. 49, no. 4, 2016. their effective understanding and reuse. is the [14] M. Armbrust, A. Fox, R. Griffith, A. D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, “A view beginning of a research task rather than the concluding ones. of cloud computing,” Communications of the ACM, vol. 53, no. 4, pp. These tools are an essential part of the gCube Open Source 50–58, Apr. 2010. technology [4]. They are offered as-a-Service by means of the [15] J. Carpenter and E. Hewitt, Cassandra: The Definitive Guide, 2nd Edition. O’Reilly Media, 2016. D4Science.org infrastructure [18]. Concrete exploitation cases [16] C. Gormley and Z. Tong, Elasticsearch: The Definitive Guide. O’Reilly and experiences demonstrate their effectiveness, e.g. [19]–[23]. Media, 2015. Future work include the integration with recommender [17] G. Coro, L. Candela, P. Pagano, A. Italiano, and L. Liccardo, “Paralleliz- ing the execution of native data mining algorithms for computational systems [24], [25], scientific workflows [13], and research biology,” Concurrency and Computation: Practice and Experience, objects [12] to enlarge the possible exploitation cases and vol. 27, no. 17, pp. 4630–4644, 2015. scenarios. [18] L. Candela, D. Castelli, A. Manzi, and P. Pagano, “Realising Vir- tual Research Environments by Hybrid Data Infrastructures: the D4Science Experience,” in International Symposium on Grids and ACKNOWLEDGMENT Clouds (ISGC) 2014 23-28 March 2014, Academia Sinica, Taipei, Taiwan, PoS(ISGC2014)022, ser. Proceedings of Science, 2014. This work has received funding from the European Union’s [19] R. Froese, J. T. Thorson, and R. B. J. Reyes, “A bayesian approach Horizon 2020 research and innovation programme under the for estimating length-weight relationships in fishes,” Journal of Applied AGINFRA PLUS project (Grant agreement No. 731001), Ichthiology, vol. 30, no. 1, pp. 78–85, 2014. [20] G. Coro, C. Magliozzi, A. Ellenbroek, and P. Pagano, “Improving the BlueBRIDGE project (Grant agreement No. 675680), the data quality to build a robust distribution model for architeuthis dux,” ENVRI PLUS project (Grant agreement No. 654182), and the Ecological Modelling, vol. 305, pp. 29–39, 2015. EOSCpilot project (Grant No. 739563). [21] G. Coro, S. Large, C. Magliozzi, and P. Pagano, “Analysing and forecasting fisheries time series: purse seine in indian ocean as a case study,” ICES Journal of Marine Science: Journal du Conseil, p. fsw131, REFERENCES 2016. [1] B. Fecher and S. Friesike, “Open science: One term, five schools of [22] G. Coro, P. Pagano, and A. Ellenbroek, “Combining simulated expert thought,” in Opening Science, S. Bartling and S. Friesike, Eds. Springer knowledge with neural networks to produce ecological niche models for International Publishing, 2014, pp. 17–47. latimeria chalumnae,” Ecological modelling, vol. 268, pp. 55–63, 2013. [2] S. Bartling and S. Friesike, “Towards another scientific revolution,” in [23] G. Coro, T. J. Webb, W. Appeltans, N. Bailly, A. Cattrijsse, and Opening Science. Springer International Publishing, 2014, pp. 3–15. P. Pagano, “Classifying degrees of species commonness: North sea fish [3] B. A. Nosek, G. Alter, G. C. Banks, D. Borsboom, S. D. Bowman, as a case study,” Ecological Modelling, vol. 312, pp. 272–280, 2015. S. J. Breckler, S. Buck, C. D. Chambers, G. Chin, G. Christensen, [24] H. Avancini, L. Candela, and U. Straccia, “Recommenders in a Personal- M. Contestabile, A. Dafoe, E. Eich, J. Freese, R. Glennerster, D. Go- ized, Collaborative Environment,” Journal of Intelligent roff, D. P. Green, B. Hesse, M. Humphreys, J. Ishiyama, D. Karlan, Information Systems, vol. 28, no. 3, pp. 253–283, June 2007. A. Kraut, A. Lupia, P. Mabry, T. Madon, N. Malhotra, E. Mayo-Wilson, [25] J. Bobadilla, F. Ortega, A. Hernando, and A. Guti´errez, “Recommender M. McNutt, E. Miguel, E. Levy Paluck, U. Simonsohn, C. Soderberg, systems survey,” Knowledge-Based Systems, vol. 46, pp. 109–132, 2013. B. A. Spellman, J. Turitto, G. VandenBos, S. Vazire, E. J. Wagenmakers, R. Wilson, and T. Yarkoni, “Promoting an culture,” Science, vol. 348, no. 6242, 2015. [4] M. Assante, L. Candela, D. Castelli, G. Coro, L. Lelii, and P. Pagano, “Virtual research environments as-a-service by gcube,” PeerJ , 2016. [5] L. Candela, D. Castelli, and P. Pagano, “Virtual research environments: an overview and a research agenda,” Data Science Journal, vol. 12, pp. GRDI75–GRDI81, 2013. [6] B. A. Nosek and Y. Bar-Anan, “Scientific utopia: I. opening scientific communication,” Psychological Inquiry, vol. 23, no. 3, pp. 217–243, 2012. [7] M. Assante, L. Candela, D. Castelli, P. Manghi, and P. Pagano, “Science 2.0 repositories: Time for a change in scholarly communication,” D-Lib Magazine, vol. 21, no. 1/2, 2015.

6