OpenStack in the European Open Science Cloud

Enol Fernández (EGI Foundation), Jérôme Pansanel (IPHC), Boris Parák (CESNET)

EGI is a federation of National e-infrastructures

22 countries + 1 EIRO (CERN)

EGI Foundation (Amsterdam, NL): coordination body https://www.egi.eu/about/egi-foundation

EGI: largest distributed e-infrastructure in the world

70,600 computing cores; 356 PB disk & 380 PB tape storage

666 open research publications from Jan 2018 (OpenAIRE stats)

EGI Service Catalogue

The list of services that EGI as a federation offers. Full details at https://www.egi.eu/services

EGI Cloud Federation (aka FedCloud)

● Multi-cloud IaaS with Single Sign-On based on Virtual Organisations (VO)
  ○ Allows research communities to bring computing near their data
  ○ Allows providers to easily support international collaborations

● Federation features:
  ○ VO-scoped Virtual Machine Image catalogue with replication across providers
  ○ Centralised usage accounting
  ○ Resource discovery
  ○ Availability/Reliability monitoring
  ○ Unified GUI dashboard
  ○ Supporting diverse IaaS technology: OpenStack, OpenNebula and Federated dashboard

EGI Cloud Federation infrastructure

20 providers supporting 11 SLAs:

● 15 OpenStack
● 4 OpenNebula (2 of them moving to OpenStack)
● 1 Synnefo (moving to OpenStack)

Growing this year with the EOSC-hub project and other related EOSC activities:

● 6 new OpenStack providers in the pipeline
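
The provider counts above can be tallied in a few lines. The sketch below assumes (this is not stated on the slides) that all three announced migrations complete and that the six pipeline providers come on top of the current 20; the function and variable names are illustrative only.

```python
# Current provider counts by IaaS technology, as listed on the slide.
current = {"OpenStack": 15, "OpenNebula": 4, "Synnefo": 1}

# Announced migrations to OpenStack (2 OpenNebula sites, 1 Synnefo site).
migrating = {"OpenNebula": 2, "Synnefo": 1}

# New OpenStack providers in the pipeline.
pipeline_openstack = 6


def project(current, migrating, new_openstack):
    """Return provider counts, assuming every migration and onboarding completes."""
    projected = dict(current)
    for tech, count in migrating.items():
        projected[tech] -= count          # site leaves its old technology
        projected["OpenStack"] += count   # and is counted as OpenStack
    projected["OpenStack"] += new_openstack
    return projected


projected = project(current, migrating, pipeline_openstack)
total_now = sum(current.values())
total_projected = sum(projected.values())

print(f"now: {current['OpenStack']}/{total_now} providers run OpenStack")
print(f"projected: {projected['OpenStack']}/{total_projected} providers run OpenStack")
```

Under these assumptions the OpenStack share would rise from 15 of 20 providers to 24 of 26.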

Looking to expand beyond academic providers

EGI FedCloud at IPHC

The Hubert Curien Pluridisciplinary Institute

● A 400-person research institute based in Strasbourg, France
● Covering a large array of scientific domains (from particle physics to ecology)
● Hosting a scientific computing platform called SCIGNE
● For more information, visit our website: http://www.iphc.cnrs.fr

The SCIGNE platform

● A platform for on-demand scientific computing
● Managed by a team of 8 engineers
● Several collaborations (University of Strasbourg, CNRS, France Grilles, French Bioinformatics Institute, CERN and EGI)
● Provides several services (grid and cloud computing, data management, disk and tape storage, ...)
● Additional information available at https://www.grand-est.fr

The OpenStack Cloud Infrastructure

● Started in 2013 with OpenStack Grizzly
● CentOS 7 & OpenStack Pike (RDO)
● 520 cores and 3.5 TB RAM (edge-scale)
● 300 TB storage for Cinder powered by
● Used for scientific computing (no CPU overallocation, NVIDIA GPU, ...)
● Available through the CLI and the Web dashboard (Horizon)
● Configured and maintained with Quattor - https://www.quattor.org
● Federated to the EGI, France Grilles and IFB Clouds
● Availability > 99 %

Integration into the EGI Federated Cloud - 1/2

● Certified in 2015
● Extensive documentation (previously hosted on the EGI wiki) - https://egi-federated-cloud-integration.readthedocs.io/en/latest/
● The following EGI components are in use:
  ○ EGI Checkin, Keystone-VOMS (single sign-on)
  ○ cASO / APEL (centralised usage accounting)
  ○ Cloudkeeper and Cloudkeeper-OS (image synchronization with the VO-scoped catalogue)
  ○ Cloud-info-provider (resource discovery)

Integration into the EGI Federated Cloud - 2/2

● The components are mostly easy to install and configure
● Cloudkeeper is also used by the French Cloud federation (FG-Cloud)
● Take care when upgrading OpenStack!
● Some hacks are required with recent versions of OpenStack (OOI/OCCI is not an independent service)
● This issue will be solved in the short term (OCCI is not mandatory anymore)

FedCloud use cases and projects

● EGI use cases hosted at IPHC
  ○ ELIXIR, Biomed, NBIS, ...
  ○ Life and health sciences
  ○ Managed through SLA (also for opportunistic access)
  ○ Running on resources funded by France Grilles and IFB

● Several projects are ongoing
  ○ Deployment of the EGI Notebook service
  ○ Development of a new version of Cloudkeeper-OS
  ○ Container-as-a-service at the European scale
  ○ Lease manager for VMs (os-vm-expire, -lease-it)
  ○ Creation of an on-demand TensorFlow service

Moving to OpenStack

Partners in Cloud

● e-Infrastructure established by public universities in the Czech Republic
● Network, compute, and data services for research
● Involved with GÉANT, EGI, EOSC-hub, ELIXIR, et al.

● National centre operating computing and data storage infrastructure
● Emphasis on experimental use of e-infrastructure resources for research
● Involved with ELIXIR, BBMRI, West-life, et al.

Legacy Infrastructure

● Offering HPC cloud resources to communities from multiple disciplines
● Integrated in EGI Federated Cloud, ELIXIR Compute Platform
● Resources
  ○ 6k CPU cores
  ○ 65 TB of RAM
  ○ Local RAIDs and Ceph / IBM Spectrum Scale
  ○ 250 nodes
  ○ GP GPU and SR-IOV InfiniBand
  ○ Both provider and overlay networks
● Running on OpenNebula, 7+ years

Motivation

● OpenStack APIs de facto standard
● Large existing ecosystem of tools
● Popular demand, mostly international communities
● Growing demands on service portfolio diversification

Rules

● Try to learn as much as possible
● Train a new team of flying cloud-monkeys
● Avoid vendor lock-in (software or hardware)
● Dead-ends are expected, within reason
● Experimental features are allowed
● Uptime and reliability are not everything
● Push for production in one year, Q1/2019
● Try not to get murdered by angry users

Technical Titbits

● Deployment in containers, from The Kolla Project
● Custom tooling mixing Puppet and Ansible, minimalistic
● Adventures with OVN (networking-ovn for Neutron)
● Fun and games with federated identity from multiple IdPs

Stay Tuned!

What are we missing in OpenStack?

● Overall quite happy :), but would appreciate:
● Better support for federation
  ○ Hierarchical projects auto-provisioning
  ○ Better OpenID Connect support
● Deprovisioning
● Better documentation on how OpenStack services interact
● Better tracing of user actions (e.g. for auditing)
● Nicer policy management

Become a provider

https://www.eosc-hub.eu/join-as-service-provider

Levels of Integration

Level of integration and provider responsibilities:

● HIGH: Internal Catalogue (supporting services, federation services)
  ○ Follow EOSC-hub processes (EOSC-hub SMS)
● MEDIUM: External Catalogue (services from participating e-Infrastructures, thematic services)
  ○ Run own SMS and integrate with EOSC (SPM, SLM, CRM, RDM, SACM, ISM, ISRM)
● LOW: Other services wishing to participate in EOSC-hub
  ○ Support EOSC-hub processes (SPM, SLM, CRM, RDM, SACM, ISM, ISRM) with an external SMS

More information about the Service Catalogues: https://wiki.eosc-hub.eu/display/EOSC/EOSC-hub+service+catalogue

Conclusions

EOSC-hub establishes key elements for the European Open Science Cloud

● First set of services, including the EGI Cloud Federation ● Service request, provisioning and management processes

OpenStack is the main IaaS technology in the current EOSC landscape and will keep growing in the near future

EOSC-hub is open to new providers, join us!

Thanks!

Questions?

Backup

EOSC

Objectives and actions:

● Increase the ability to exploit research data across scientific disciplines and between the public & private sector
  ○ Publish, discover, access services and resources for all scientific disciplines
  ○ Open to national, regional, pan-European providers, and supports different exploitation models (e.g. free at point of use, commercial)
● Increase interoperability, interconnect the existing and the new research digital infrastructures across Europe
  ○ Provide thematic services integrated with European compute/data platforms for data exploitation
  ○ Single sign on, integrated access and order
● Support open science
  ○ Services to share and discover research artefacts (publications, datasets, software, workflows etc.), research artefacts data sources (publication repositories, publishers, data archives, software archives, etc.)

Some of the usage of EGI Cloud