
Inspired ISSUE 28 SEPTEMBER 2017 news from the EGI community

TOP STORIES
What is the European Cloud? page 2
What is FAIR? page 4
The EGI Marketplace page 5
Results of the EGI-Engage project page 6

MORE

01 DI4R 2017 registration and call for abstracts

07 Introducing the EOSC-hub project

08 Achievements of Competence Centres

10 A new interface for the EGI Federated Cloud

11 Outcomes of the INDIGO-DataCloud project

12 The EDISON community for data science enthusiasts

Advanced Computing for Research
www.egi.eu

Welcome to issue 28!

In the new edition of our newsletter, we focus on the end of the EGI-Engage project and the achievements of the last 30 months. Your feedback and suggestions are always welcome! Send an email to Sara & Iulia at: [email protected]

DI4R 2017: Connecting the building blocks for Open Science

Brussels: 30 November - 1 December 2017

The Digital Infrastructures for Research conference will take place this year in Brussels, Belgium, from 30 November to 1 December 2017.

Europe's leading e-infrastructures, EGI, EUDAT, GÉANT, OpenAIRE, PRACE and RDA Europe, invite all researchers, developers and service providers for two days of brainstorming and discussions under the theme “Connecting the building blocks for Open Science”.

The 2017 edition of the DI4R conference will showcase the policies, processes, best practices, data and services that, leveraging today’s initiatives – national, regional, European and international – are the building blocks of the European Open Science Cloud and European Data Infrastructure.

The main goal of DI4R 2017 is to demonstrate how open science, higher education and innovators can benefit from these building blocks, and ultimately to advance integration and cooperation between initiatives.

The event is collocated with the EOSCpilot 1st Stakeholder Engagement Event taking place on the 28 & 29 November 2017.

Open Call

The Programme Committee, chaired by Franciska de Jong from CLARIN, welcomes abstracts to be submitted to the following Topic Areas:

> Interoperability
> Data science and skills
> Impact evaluation and metrics
> Security, trust and identity
> The EOSC & EDI building blocks
> Business models, sustainability and policies

Registration

Online registration for the event is now open. Early-bird rates are available until 30 October.

More information

DI4R website
www.digitalinfrastructures.eu

DI4R Call for Abstracts (INDICO website)
indico.egi.eu/indico/event/3455

DI4R Registration
https://www.digitalinfrastructures.eu/registration

Inspired // Issue #28, September 2017

What is the European Open Science Cloud?

Iulia Popescu on what we know so far about the European initiative

The idea of a European Open Science Cloud (EOSC) took shape in 2015, as a vision of the EC of a large infrastructure to support and develop open science and open innovation in Europe and beyond. The EOSC is projected to become a reality by 2020 and will be Europe’s virtual environment for all researchers to store, manage, analyse and re-use data for research, innovation and educational purposes.

What are open science and open innovation?

In 2015, the European Commission set three goals for research and innovation policy within the European Union: Open Innovation, Open Science and Open to the World.

These concepts promote the idea of opening up European research and innovation systems to move towards a reality where knowledge is created through global collaborations, where “the digital and physical are coming together” as described by Carlos Moedas, the Commissioner for Research, Science and Innovation.

One year later, the notions of open science & open innovation took a more concrete form as they became a strategic aim for Europe’s scientific landscape outlined in an official EC report: “Open Innovation, Open Science, Open to the World – a vision for Europe”.

Open Science

In the report, open science is defined as a new approach to scientific progress based on sharing all available knowledge using new collaborative tools and digital technologies. The outcome would be a shift in the modus operandi of doing research from the early stages and publication of results to sharing them at a larger, global scale. Open science evokes “a change in the scientific landscape towards a public funded science to be more accessible, transparent, collaborative and closer to citizens” (Europe’s future: Open Innovation, Open Science, Open to the World).

Open Innovation

The premise behind open innovation is to allow knowledge to circulate more freely and create a culture of new products and markets as well as shared social and economic values.

Towards the European Open Science Cloud

The European Open Science Cloud is envisioned by the European Commission as a supporting landscape to foster open science and open innovation: a network of organisations and infrastructures from various countries and communities that supports the open creation and dissemination of knowledge and scientific data (Report on the governance and financial schemes for the European Open Science Cloud).

The creation of EOSC is aimed at removing technical, policy and human barriers, leading to knowledge creation and economic prosperity in Europe.

The European Commission’s “European Cloud initiative” publication, issued in April 2016, set an ambitious vision for the European Open Science Cloud: “to give Europe a global lead in scientific data infrastructures and to ensure that European scientists reap the full benefits of data-driven science.”

EOSC-related Publications

European Cloud Initiative - Building a competitive data and knowledge economy in Europe (2016) http://go.egi.eu/eci

Realising the European Open Science Cloud (2016) http://go.egi.eu/hleg-eosc

Open Innovation, Open Science, Open to the World – a vision for Europe (2016) http://go.egi.eu/ooo

Europe’s future: Open Innovation, Open Science, Open to the World (second report, 2017) http://go.egi.eu/ooo2

Report on the governance and financial schemes for the European Open Science Cloud (2017) http://go.egi.eu/OSPPrep

The EOSC declaration (forthcoming)

EGI and EOSC

EGI endorses the principles of the EOSC Declaration and commits to contribute to the implementation of the European Open Science Cloud:

Governance and funding

> Support the definition, implementation and operation of the EOSC structure with more than 300 data centres in 50 countries.
> Contribute its best practices and experience to the definition of the EOSC policies and ensuring interoperability among suppliers at a global scale.

Data culture and FAIR data

> Provide and improve implementation guidelines for FAIR services (Findable, Accessible, Interoperable, and Re-usable) in the area of advanced compute, federated identity provisioning, authentication and authorisation.
> Develop the certification schemes and skills necessary to become users or operators of digital research infrastructures and EOSC involving multiple research communities, technology experts and service providers.

Research data services and architecture

> Manage the EOSC-hub Service Integration and Management system.
> Offer advanced compute (Cloud and High-Throughput) and data services from publicly-funded and commercial organisations.
> Operate federated identity provisioning, authentication and authorisation services for the EOSC users and service providers.
A vision in action

The European Open Science Cloud is intended to get off the ground by federating existing scientific data infrastructures that are now spread across disciplines and EU member states. This will make access to scientific data easier and more efficient (Realising the European Open Science Cloud).

The EOSCpilot project

The EOSCpilot project supports the first phase in the development of the EOSC. The project brings together stakeholders from research infrastructures and e-Infrastructure providers and will engage with funders and policy makers to propose and trial EOSC’s governance framework.

The project has already selected 10 science demonstrators functioning as high-profile pilots that integrate services and infrastructures to show interoperability and its benefits in a number of scientific domains: life & earth sciences, high-energy physics, social sciences, physics and astronomy.

The EOSC-hub project

The EOSC-hub project was successfully reviewed by the European Commission and is expected to start in January 2018. The scope of EOSC-hub is to create the integration and management structure of the European Open Science Cloud.

The project will enable open access to research resources from a myriad of scientific disciplines via a digital hub: an integration system of software and services from major European e-infrastructures and research infrastructures. The hub will act as an entry point for researchers and innovators to discover, access, and use a variety of advanced data-driven resources.

The consortium of the project is led by the EGI Foundation and brings together more than 100 beneficiaries and linked third parties including research infrastructures, e-Infrastructure providers, SMEs and academic institutions. The project follows the EOSC guidelines recently released by the EC in the EOSC declaration.

More information

Iulia Popescu is a Communications Officer at the EGI Foundation

What is FAIR?

Gergely Sipos summarises the FAIR principles and how EGI is contributing to their implementation

Findable, Accessible, Interoperable, Reusable – FAIR: an acronym that recently became inevitable for anyone involved in research data management, or in any of the initiatives relating to the European Open Science Cloud.

Digital scientific data, tools, workflows and services are becoming available at increased speed and unprecedented scale. Unfortunately, a large segment of these digital objects remains unnoticed, unaccessed or unused beyond their producer team, limiting our ability to extract maximum benefit and knowledge from these research investments.

The FAIR principles were first introduced in a workshop held in Leiden in 2014, where a group of like-minded academic and private stakeholders met to discuss ways to overcome obstacles in data discovery and reuse.

FAIR consists of 15 elements that define the characteristics needed to enable reuse by third parties. For example, to be Findable (F), data should:

> have a persistent identifier
> be described by metadata.

Although the elements of the FAIR principles are related they are also independent and separable. The principles may be adhered to in any combination and incrementally, as providers’ publishing environments evolve to increasing degrees of ‘FAIRness’.

The FAIR principles precede implementation choices, and do not enforce or recommend any specific technology, standard, or implementation-solution. The principles are also not a standard or a specification. They establish a concise and measurable set that can act as a common denominator across institutes, across data and service providers and across disciplines. This means that they can be used as a guide to help data and tool owners to evaluate if their data, tools and services are findable, accessible, interoperable, and reusable.

How EGI contributes to a FAIR digital world

1. We provide technologies, tools and related training for developers who want to create FAIR services (e.g. workflows, tools, VREs), and support them in operating those services within pan-European infrastructures.

2. We facilitate the integration of FAIR data into EGI to offer those data for processing applications that harness the HTC, Cloud and Container services offered within the infrastructure.

3. We provide consultancy and training about service management and sustainability planning for scientific and developer teams to ensure their research output is professionally managed and secured for the long term.

If you want to contribute to the evolution and implementation of the FAIR principles, then a good place to start is the next FORCE11 Conference in Berlin between October 25-27, or the DI4R Conference in Brussels between November 30 – December 1.

More information

Gergely Sipos is the EGI Foundation Customer and Technical Outreach Manager

More about FAIR
Wilkinson, M. D. et al. doi:10.1038/sdata.2016.18

The 15 FAIR principles
https://www.force11.org/
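As a rough illustration of the two Findable criteria named in this article (a persistent identifier and descriptive metadata), the sketch below checks a dataset record in Python. The record layout, the field names and the choice to accept only DOI-shaped identifiers are assumptions made for this example; none of them come from the FAIR principles or from any EGI tool, and the metadata values are placeholders.

```python
import re

# Assumption: a record is a dict with an "identifier" string and a "metadata"
# dict. These field names are hypothetical, not defined by the FAIR principles.
# Assumption: only DOI-shaped identifiers count as persistent in this sketch.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def has_persistent_identifier(record):
    """Return True if the record carries something shaped like a DOI."""
    identifier = record.get("identifier", "")
    if identifier.lower().startswith("doi:"):
        identifier = identifier[4:]
    return bool(DOI_PATTERN.match(identifier))


def is_described_by_metadata(record):
    """Return True if the record has at least one non-empty metadata field."""
    metadata = record.get("metadata") or {}
    return any(str(value).strip() for value in metadata.values())


def findability_report(record):
    """Evaluate the two Findable criteria separately."""
    return {
        "persistent_identifier": has_persistent_identifier(record),
        "described_by_metadata": is_described_by_metadata(record),
    }


# The DOI is the one cited in this article; the metadata is a placeholder.
record = {
    "identifier": "doi:10.1038/sdata.2016.18",
    "metadata": {"title": "placeholder title", "creator": "placeholder creator"},
}
print(findability_report(record))
```

Reporting each criterion on its own, rather than a single pass or fail, mirrors the point that the principles are separable and can be adopted incrementally.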

The EGI Marketplace

Diego Scardaci writes about one of the key results of the EGI-Engage

The EGI Marketplace will be launched in production in the next few weeks.

This tool will be the platform where EGI-related services, delivered by EGI providers and partners, can be promoted, discovered, shared, ordered and accessed. It will include EGI services as well as discipline and community-specific tools and services enabled by EGI and/or provided by third parties.

The EGI Marketplace is designed as an electronic market: it's a platform where services can be advertised and where customers can easily order and access them. In addition, the Marketplace will enhance visibility for resource and service providers, raising awareness of what they can provide as well as helping to promote cross-disciplinary research.

The Marketplace will also act as the main web interface to access the Applications on Demand service (AoDS). Orders for the AoDS can be submitted only from the Marketplace and will be managed via an automatic workflow guaranteeing quick and smooth access to the applications.

The EGI Marketplace was developed using the PrestaShop technology, a free e-commerce solution largely adopted in the commercial world with a wide community behind it.

Marketplace workflow

> Authentication: Managed through the EGI Check-in service, which allows customers to use the credentials of their home organization. Customers are required to register during their first login into the Marketplace to create a customer profile in the database. Part of the data is retrieved by the EGI Check-in service and additional data is gathered through a form.
> Discover and order services: The customer browses the Marketplace, finds what is needed and selects the services s/he needs to order.
> Check-Out: The service orders are submitted with information complemented with the customer profile. The order then follows the appropriate service order management, according to the EGI Integrated Management System (IMS) processes and procedures.

The first testing phase after the Marketplace becomes operational will include only services provided by EGI. After this is completed, the Marketplace will be opened to the whole EGI collaboration and partners, in particular for publishing thematic community services (i.e. services provided by third parties that rely on EGI services).

The EGI Marketplace will be further enhanced in the EOSC-hub project and will become a key component of the future European Open Science Cloud, facilitating the discovery, ordering and access of a large set of services provided by several stakeholders. To reach this aim, several activities are already planned in four main areas:

> Technical development of interfaces to retrieve and publish service data to/from other tools (e.g. the eInfraCentral service registry);
> Publishing of thematic community services;
> Service order management automation;
> Pay-for-use: launch of first commercial offers.

More information

EGI Marketplace
http://marketplace.egi.eu

Diego Scardaci is part of the EGI Customer and Technical Outreach Team
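The three workflow steps described in this article (authentication with registration at first login, discovery and selection, check-out) can be pictured with a small Python model. This is only a sketch under stated assumptions: the class names, method names, catalogue entries and the single "submitted" status are all hypothetical, and the code does not use PrestaShop or the EGI Check-in service.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerProfile:
    # Assumption: part of the profile comes from the login service and the
    # rest from a registration form, as the article describes.
    user_id: str
    home_organisation: str
    form_data: dict = field(default_factory=dict)


@dataclass
class Order:
    customer: CustomerProfile
    services: list
    status: str = "submitted"  # hypothetical status name


class Marketplace:
    """Toy model of the workflow; not the real Marketplace implementation."""

    def __init__(self, catalogue):
        self.catalogue = set(catalogue)
        self.profiles = {}

    def login(self, user_id, home_organisation, form_data=None):
        """Authenticate; create the customer profile on first login only."""
        if user_id not in self.profiles:
            self.profiles[user_id] = CustomerProfile(
                user_id, home_organisation, dict(form_data or {})
            )
        return self.profiles[user_id]

    def discover(self, keyword):
        """Return catalogue entries whose name contains the keyword."""
        return sorted(s for s in self.catalogue if keyword.lower() in s.lower())

    def check_out(self, user_id, selected):
        """Submit an order that carries the customer profile with it."""
        if user_id not in self.profiles:
            raise PermissionError("customer must log in before ordering")
        unknown = [s for s in selected if s not in self.catalogue]
        if not selected or unknown:
            raise ValueError(f"cannot order: {unknown or 'empty selection'}")
        return Order(self.profiles[user_id], list(selected))


# Placeholder catalogue entries and customer data, for illustration only.
market = Marketplace(["Cloud Compute", "Online Storage", "Applications on Demand"])
market.login("alice", "Example University", {"country": "NL"})
found = market.discover("cloud")
order = market.check_out("alice", found)
print(order.status, order.services)
```

Attaching the profile to the order at check-out reflects the statement that orders are "complemented with the customer profile" before they enter order management.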

EGI-Engage: a list of key exploitable results

Tiziana Ferrari lists the main achievements of the past 30 months

The EGI-Engage project (full name: Engaging the Research Community towards an Open Science Commons) ran from March 2015 to August 2017 with funds from the European Union (EU) Horizon 2020 programme (grant number 654142).

The project brought together 43 partners with a mission to expand the capabilities of a backbone of federated services for compute, storage, data, communication, knowledge and expertise, complementing community-specific capabilities.

What were the key results of EGI-Engage?

Update of the strategy, governance & procurement procedures

The EGI Federation now has a new EGI strategy and a new governance model adapted to the recent evolutions of the e-Infrastructure landscape in Europe. The team also worked on an analysis of opportunities and barriers for cross-border procurement of e-Infrastructure services.
http://go.egi.eu/strategy2020

Integrated Management System and Certification

Following 18 months of work to define a system to plan, implement, monitor and continually improve all business processes under its responsibility, the EGI Foundation was awarded ISO 9001:2015 and ISO/IEC 20000-1:2011 certifications.
http://go.egi.eu/cert

Security policies

The security team updated the policy frameworks to follow the technical evolution of the EGI services and also to make them more general and re-usable by other initiatives.

Improved EGI service portfolio

The EGI service portfolio was redesigned during EGI-Engage, with improved service definitions, and divided into:

External service catalogue
Aimed at researchers, research communities and businesses, it contains compute, storage and data, training and applications services provided by the EGI Federation.
http://go.egi.eu/SCpdf

Internal service catalogue
This catalogue contains tools designed to facilitate coordination and improve how the EGI Federation works together. The EGI internal services are provided for the benefit of the EGI Council members and affiliated organisations.

Tools for federated service management

The project supported technological innovation and new services in the area of Service Registry and resource allocation. During EGI-Engage, the EGI Accounting Portal was redeveloped and improved with a new user interface, new views and features. The operational tools were also continuously improved and adapted to satisfy new requirements from service providers and user communities.

Thematic services integrated

Throughout the project, the team worked with research communities and research infrastructures to co-design and co-develop new services. Most of these services are now offered as integrated scientific applications with EGI’s e-Infrastructure services.
http://go.egi.eu/sat

EGI DataHub platform

During the project, the team developed a platform designed to make data discoverable and available in an easy way across all EGI federated resources. The EGI DataHub will offer scalable data access and compute capabilities around scientific datasets for scientific groups at large scale. Once in production, the EGI DataHub will enable data processing in hybrid environments like public and private clouds.

Expanded Federated Cloud computing

The EGI Federated Cloud was expanded with new IaaS capabilities. It now integrates existing commercial and public IaaS Cloud deployments and e-Infrastructures with the current EGI production infrastructures.

EGI Marketplace

A platform where EGI-related services, delivered by EGI providers and partners, can be promoted, discovered, shared, ordered and accessed. It will include EGI services as well as discipline and community-specific tools and services enabled by EGI and/or provided by third parties.

Applications on Demand

A service providing researchers dedicated access to computational and storage resources, as well as other facilities needed to run scientific applications.
http://go.egi.eu/aod

More information

Tiziana Ferrari is the Technical Director of the EGI Foundation and was project coordinator of EGI-Engage

Introducing the EOSC-hub project

Integrating and managing services for the European Open Science Cloud

EOSC-hub is the H2020 EINFRA12 (A) project proposal submitted by a consortium of 74 partners under the coordination of EGI, EUDAT and INDIGO-DataCloud. The action was positively reviewed by the European Commission and the project is planned to start in early 2018.

The EOSC-hub mission is to contribute to the EOSC implementation by enabling seamless and open access to a system of research data and services provided across nations and multiple disciplines. The project will offer these resources via the Hub – an integration and management system of the European Open Science Cloud, acting as a European-level entry point for all stakeholders. The Hub will deliver a catalogue of services, software and data from the EGI Federation, EUDAT CDI, INDIGO-DataCloud and research e-Infrastructures. The Hub builds on mature processes, policies and tools from the leading European federated e-Infrastructures to cover the whole life-cycle of services, from planning to delivery. The Hub aggregates services from local, regional and national e-Infrastructures in Europe and worldwide. The Hub will act as a contact point for researchers and innovators to discover, access, use and reuse a broad spectrum of resources for advanced data-driven research.

The catalogue will cover services in four broad areas: Common, Thematic, Collaborative and Federation.

The catalogue will be open and progressively extended to include data and thematic services from external partners willing to collaborate with the project. In order to do so, the project will run a network of Competence Centres involving early adopters, and a stakeholder engagement programme aiming at reaching out to new user groups and service providers. The project aims at evolving the service catalogue according to the users’ requirements and the latest technological developments. Through the virtual access mechanism, more scientific communities and users will have access to services for their scientific discovery and collaboration across disciplinary and geographical boundaries.

The project will improve skills and knowledge among researchers and service operators by delivering specialised trainings and by establishing competence centres to co-create solutions. The project creates a Joint Digital Innovation Hub that stimulates an ecosystem of industry/SMEs, service providers and researchers to support business pilots, market take-up and commercial boost strategies.

Competence Centres: results in EGI-Engage

Gergely Sipos outlines the main achievements of the Competence Centres

EGI-Engage pioneered a new model of engagement and support for Research Infrastructures, based on distributed centres where national initiatives, user communities, technology and service providers join forces to collect and analyse requirements, integrate community-specific applications into state-of-the-art services, foster interoperability across e-Infrastructures, and evolve services through a user-centric development model.

We called them the Competence Centres – or CCs for short.

EGI-Engage launched eight CCs 2.5 years ago and while these continue to operate beyond the project, I am pleased to report that the initiative was a success and we plan to take the CC model into the new EOSC-hub project, which is due to start in January 2018.

More information

Gergely Sipos led the EGI-Engage Competence Centre programme.

Further information, including technical details, milestones and deliverables, is available at http://go.egi.eu/cc

Here is a summary of what we achieved together with the research communities:

ELIXIR – Life sciences

The ELIXIR CC aimed at evaluating, adopting and promoting technologies and resources from EGI to the wider ELIXIR research community. The team collected representative life science use cases that could benefit from EGI services and then set up a federated cloud infrastructure combining the EGI Federated Cloud with ELIXIR cloud providers and with the ELIXIR Authentication and Authorisation system to implement those use cases. The results were:

> The cBioPortal from CESNET is now ported to and hosted 24/7 on the CESNET cloud site.
> The compute-intensive part of the META-Pipe metagenomics pipeline use case from CSC and the Marine metagenomics use case from EMBL-EBI were successfully ported to the federated cloud resources.
> Life scientists are now able to instantiate their own data analysis environment in the cloud (Insyght Comparative Genomics use case from CNRS IFB and PhenoMeNal project use case from EMBL-EBI).
> Users in the US and in Europe can access the same tools and run them on local clouds (JetStream interoperability use case from the US University of Indiana).

BBMRI - Biobanking

This CC was set up to develop and pilot data processing workflows for sensitive personal data. The work resulted in:

> Expansion of the BiobankCloud platform with the authentication and authorization mechanisms to allow integration with common AAIs (e.g., BBMRI-ERIC AAI, EGI Check-in).
> A demonstrator ported to the private cluster of the MMCI hospital (in the Czech Republic), where the data analysis workflow has been performed on data of real patients.
> The selection of the biobank workflows most suited as use cases (from CZ, NL, SE).

MoBrain – Structural biology

The CC aimed at lowering barriers for scientists to access online portals and tools for structural biology, building on the work of the WeNMR/INSTRUCT and NeuGrid4You teams. The CC:

> Implemented GPGPU-enabled web interfaces for the AMBER and DisVis online portals, providing an enhanced service by exploiting the faster performance of accelerated computing.
> Deployed the Scipion cloud framework into the EGI Federated Cloud to allow researchers to obtain 3D maps of macromolecular complexes.
> Continued to support a continuous robust use of HTC resources - the HADDOCK portal, for example, has been sending about 10 million jobs per year.
> In collaboration with the INDIGO-DataCloud project, put into production two new web portals making use of the available grid GPGPU resources via Docker containers: DisVis (114 registered users) and PowerFit (79 registered users).

DARIAH – Arts and humanities

The goal of the DARIAH CC was to raise awareness of e-Infrastructures’ benefits. To achieve this the CC:

> Established a VO to collect compute and storage resources for the DARIAH community provided by the EGI data centres and cloud sites.
> Developed and deployed the DARIAH Science Gateway with applications (Simple Semantic Search Engine, Parallel Semantic Search Engine, and DBO@Cloud) and three services (Cloud Access, Workflow Development, File Transfer), and enabled a federated login.
> Established a Working Group within the DARIAH-ERIC community to provide advisory support and promote the benefits of using Cloud infrastructure and the DARIAH CC services beyond the time limits of the EGI-Engage project.
> Coordinated participation and contribution to 15 external events to promote and disseminate the achievements.

LifeWatch – Biodiversity sciences

The CC's goal was to assess and implement requirements of LifeWatch research communities for e-infrastructure services. Throughout the project, the CC:

> Integrated pattern recognition tools and data flow handlers with the IFCA cloud site.
> Compiled a LifeWatch service catalogue of 16 services covering support for ecological observatories, workflows, virtual labs and .
> Supported the deployment of services via Federated Cloud resources and used 5.5 million CPU hours during the project.

EISCAT_3D – Ionosphere and atmosphere observatory

The CC worked on the development of the EISCAT_3D user portal backed by EGI federated HTC and cloud services. This portal will provide scientists with services to discover, access and analyse (e.g. visualise, mine) data generated by EISCAT_3D.

The EISCAT_3D portal has a working access control and interfaces for data discovery and download as well as a function for analysis job submissions. Moreover, the system facilitated the development of data models and modelling tools within the EISCAT_3D community, and the applicability of operating a central portal service for scientists to interact and compute with EISCAT data.

EPOS – Earth sciences

The CC collected, analysed and compared community needs with EGI technical offerings, resulting in three pilots:

> AAI: demonstrated interoperability between the EPOS AAI and the EGI Check-in service; the prototype was developed based on the UNITY IDM technology and interfaced with Check-in.
> Earthquake simulation (MISFIT): showed how an existing seismology application can be improved by integration with the EGI Federated Cloud.
> Satellite Data: set up an environment with EGI to support the development of new services for satellite data processing. The pilot deployed an EPOS service on top of the Geohazard Thematic Exploitation Platform by Terradue from the satellite data TCS, and linked it to the EGI Federated Cloud to exploit its computing and storage resources.

Disaster mitigation

The CC worked to develop customised IT services to support climate and disaster mitigation researchers in Asia and produced:

> Two web portals to simulate tsunami wave propagation (iCOMCOT) and weather conditions (WRF).

The two portals provide stand-alone and easy-to-use simulation tools for the entire lifecycle of a tsunami event and numerical weather prediction.

A new interface for the EGI Federated Cloud

Enol Fernández describes the new features of the EGI Applications DataBase

The EGI Applications Database (AppDB) has recently expanded its core functionality with a new dashboard: the Virtual Machine Operations (VMOps) dashboard. VMOps provides a Graphical User Interface (GUI) that performs Virtual Machine (VM) management operations on the EGI Federated Cloud.

The dashboard introduces a user-friendly environment where users can create and manage VMs along with associated storage devices on any of the EGI Federated Cloud providers. It provides a complete view of the deployed applications and a unified resource management experience, independently of the technology driving each of the resource centres of the federation.

Users can create new infrastructure topologies, which include a set of VMs, their associated storage and contextualization, using a wizard-like builder that guides them through the selection of the virtual appliances, virtual organisation, resource provider, and the final customisation of the VMs that will be deployed. Its tight integration with the AppDB Cloud Marketplace allows for an automatic discovery of the appliances which are supported at each resource provider. Once a topology has been created, VMOps allows management actions to be applied both to the whole set of VMs comprising a topology and, as fine-grained actions, to each individual VM.

The dashboard removes the need for users to own X.509 certificates. They can now log in through the EGI Check-in service and use their institutional credentials to see details about their membership of virtual organisations. VMOps then accesses resources on behalf of the user by employing temporary credentials, which are obtained via the RCAuth MasterPortal, or by employing Per-User-Sub-Proxy technologies, depending on the level of integration of each VO.

The VMOps dashboard also integrates with the EGI ARGO Monitoring Service and with the EGI GOCDB to present the status of the providers and any scheduled downtimes in a single view, allowing users to select the most appropriate providers.

The VMOps dashboard is developed and hosted by the Institute of Accelerating Systems and Applications (IASA) and is built on top of the EGI Federated Cloud. VMOps interacts with providers via the TOSCA standard and uses the Infrastructure Manager (IM) as an IaaS federated access tool. It has been designed with a scalable architecture composed of a front-end and several back-ends for load balancing, and provides a RESTful API which other services can use for integration.

More information

Enol Fernández leads the cloud development activities at the EGI Foundation.

VMOps dashboard
https://dashboard.appdb.egi.eu/vmops
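To make the notion of a topology more concrete, here is a minimal Python sketch of a set of VMs with storage and contextualization, plus an action that can target either the whole topology or a single VM. All class names, field names, action names and sample values are hypothetical; the sketch does not use the VMOps RESTful API, TOSCA or the Infrastructure Manager, whose actual interfaces are not described in this article.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualMachine:
    # Hypothetical fields chosen to echo the article: appliance, storage
    # and contextualization for each VM.
    name: str
    appliance: str
    storage_gb: int = 0
    contextualization: str = ""
    state: str = "running"


@dataclass
class Topology:
    virtual_organisation: str
    provider: str
    vms: list = field(default_factory=list)

    def apply(self, action, vm_name=None):
        """Apply an action to one VM, or to every VM when vm_name is None."""
        transitions = {"stop": "stopped", "start": "running"}  # assumed actions
        if action not in transitions:
            raise ValueError(f"unsupported action: {action}")
        targets = [vm for vm in self.vms if vm_name is None or vm.name == vm_name]
        if not targets:
            raise KeyError(vm_name)
        for vm in targets:
            vm.state = transitions[action]
        return [vm.name for vm in targets]


# Placeholder values for illustration only.
topology = Topology(
    virtual_organisation="example-vo",
    provider="example-provider",
    vms=[
        VirtualMachine("vm-1", "example-appliance", storage_gb=50),
        VirtualMachine("vm-2", "example-appliance", contextualization="#!/bin/sh"),
    ],
)
stopped = topology.apply("stop")            # whole topology
restarted = topology.apply("start", "vm-1")  # one VM only
print(stopped, restarted, [vm.state for vm in topology.vms])
```

Using one method with an optional VM name keeps the topology-wide and per-VM cases on the same code path, which is one simple way to model the two levels of action the article mentions.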

Outcomes of the INDIGO-DataCloud project

Davide Salomoni and Giacinto Donvito on how INDIGO lives up to the Better Software for Better Science motto

INDIGO-DataCloud is an EU-funded project that ran with the objective of developing a new cloud software platform for the scientific community. With this in mind, the team developed tools to facilitate the exploitation of distributed cloud and storage resources through public or private infrastructures.

The project's 30 months were exciting and ripe with results. We believe that the foundations laid by INDIGO will continue to find proper development and adoption in a wide variety of fields, public and private, at the service of science and for the benefit of the general public. The key achievements are:

1) The involvement of scientific user communities to define and track their requirements: the INDIGO team categorised requests, identified requirements and classified them into three areas: storage, computational and infrastructural.

2) The identification of technology gaps linked to concrete use cases. These gaps helped the team to validate the technical implementations and to define the INDIGO technical architecture: a modular framework, fully based on open standards, covering all areas of the cloud stack (IaaS, PaaS, SaaS).

3) Two major software releases: MidnightBlue (announced in August 2016) and ElectricIndigo (in April 2017). ElectricIndigo now consists of about 40 open modular components, 50 Docker containers and 170 software packages, all supporting up-to-date open operating systems. This result was accomplished by exploiting key European know-how, reusing and extending open source software and contributing code to upstream projects.

4) The release of two service catalogues: a short one, with a high-level description of the INDIGO solutions, and a longer version, with details about components and reports of sample applications.

5) The creation of two large distributed testbeds to support development activities and pre-production applications. The testbeds allowed the communities to integrate INDIGO components into scientific applications now deployed in production over public or private infrastructures. So far, scientific communities that integrated INDIGO components belong to the domains of life sciences, physics, structural biology, earth sciences and cultural heritage, among others.

6) The establishment of collaborations with IBM, ATOS and T-Systems: the INDIGO team worked with industry leaders to facilitate the adoption and enhancement of INDIGO components.

7) Participation in the EOSC-hub project: the INDIGO team will nominate the project’s Technical Coordinator. INDIGO will also contribute to the EOSC-hub service catalogue with many of its components: identity and access management, token translation, virtual filesystems (Onedata), advanced IaaS services, neutral access to heterogeneous Cloud resources (Infrastructure Manager), web frontend services and user-level containers.

8) The positive evaluation of two spin-off projects: eXtreme-DataCloud and DEEP-HybridDataCloud, due to start in late 2017 / early 2018. These projects will continue to develop many INDIGO components in areas such as data lifecycle management, smart caching, flexible metadata management for big data sets, PaaS-level access to HPC resources and real-time, streaming-based data ingestion and processing. We expect that these developments, once matured to production level, will eventually find a place in the European Open Science Cloud service catalogue, to further enhance and facilitate the work of scientists and resource providers.

More information

Davide Salomoni and Giacinto Donvito led the INDIGO-DataCloud project.

https://www.indigo-datacloud.eu/

The EDISON community for data science enthusiasts

Themis Athanassiadou introduces the Data Science Pro portal

The EDISON project, where the EGI Foundation is a consortium member, wants to accelerate the establishment of the data scientist profession. A data scientist is defined as an expert who can extract meaningful value from the data collected and also manage the whole lifecycle of data, including supporting scientific data e-Infrastructures.

Figure: Homepage of the Data Science Pro website

One of the project’s goals is to bring prospective data scientists closer to industry and training organisations and to create an interactive environment where these groups can collaborate and share their knowledge and experiences.

To make this happen, the EDISON team created Data Science Pro, a community portal for data science enthusiasts. The portal’s vision is based on the EDISON Data Science Framework, a collection of documents and guidelines set by EDISON members to define the data science profession.

What does Data Science Pro do?

Data Science Pro gathers a collection of data science courses offered by universities, professors and experts in the field and is built as a dynamic marketplace. It's a place where educators can create trainings, universities can post their programs adhering to EDISON, employers can search for professionals with the right profiles, and job seekers can improve their skills whilst getting the training they need for the jobs they want.

The community portal will be launched in September 2017 with support from the EGI Federated Cloud and the e-infrastructure. The Data Science Pro portal will provide access to a Virtual Labs environment hosted by EGI community partners and built around practical data sets for educational purposes. This is possible by integrating the EGI AAI to allow researchers to use their organisational IDs to access the portal and EGI Federated Cloud resources.

As of February 2017, the implementation phase was finalised with the support of UKIM, a member of MARGI (the Macedonian arm of EGI), the Research Institute for Telecommunications and Cooperation, and Engineering Ingegneria Informatica.

The portal is now being tested and validated by a set of users including the EDISON project partners, EDISON Liaison Group members, and selected universities.

Through Data Science Pro, the EDISON team hopes to bring forth the project’s value proposition, which is tailoring each offering according to the supply and demand needs of the Data Science profession and further promoting the EDISON legacy.

More information

Data Science Pro
https://www.datasciencepro.eu

EDISON project
http://edison-project.eu/

The EDISON project is funded by the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 675419.
