Potential use of supercomputing resources in Europe and in Spain for CMS (including some technical points)

Presented by Jesus Marco (IFCA, CSIC-UC, Santander, Spain) on behalf of the IFCA team, with technical contributions from Jorge Gomes (LIP, Lisbon, Portugal)
CMS First Open Resources Computing Workshop, CERN, 21 June 2016

Some background…

The supercomputing node at the University of Cantabria, named ALTAMIRA, is hosted and operated by IFCA. It is not large (2,500 cores), but it is part of the Spanish Supercomputing Network, which has significant resources (more than 80K cores) that will be doubled over the next year. As these resources are granted on the basis of the scientific interest of the request, CMS teams in Spain could benefit directly from their use “for free”. However, the request needs to show the need for HPC resources (multicore architecture in the CMS case) and the possibility to run in an “existing” framework (OS, username/password access, etc.). My presentation aims to cover these points and to ask for experience from other teams and experts on exploiting this possibility. The EU PRACE initiative joins many different systems at different levels (Tier-0, Tier-1) and could also be explored.

Evolution of the Spanish Supercomputing Network (RES)

See https://www.bsc.es/marenostrum-support-services/res

Initially (2005) most of the resources were based on IBM Power + Myrinet.
Since 2012 many of the centers have evolved to use x86 + InfiniBand:
• First one was ALTAMIRA@UC:
– 2,500 Xeon E5 cores with FDR IB to 1 PB on GPFS
• Next: MareNostrum 3 @ BSC
– >50,000 Xeon E5 cores with FDR
• Latest one: FinisTerrae II @ CESGA
– >7,000 Xeon E5 2680v3 cores + FDR
NEXT YEAR: MareNostrum4 @ BSC
• x4 current capacity
• MN3 nodes will remain available!

Access to the Spanish Supercomputing Network (RES)


The RES is a distributed virtual infrastructure of supercomputers located at different sites, each of which contributes to the total processing power available to users from R&D groups in Spain, or to activities based in another country but developed with the participation of Spanish researchers.

RES Call for Projects: Participants are encouraged to submit proposed activities where RES resources and expertise help to accelerate relevant contributions to research areas where intensive computing is an enabling technology.

Each activity is evaluated by an Access Committee, advised by a Technical Experts Panel and a Scientific Experts Panel. The proposal must provide detailed objectives and an approximate timeline for completion.


The call for activities is continuously open, and evaluation takes place every four months. Activities whose computing needs require access to resources for more than one period can request access for up to two periods, reporting the intermediate results to the Access Committee so that it can confirm access for the second period. For activities where access is granted, RES will provide CPU hours on the supercomputers (hardware and basic system software) and the know-how of the Operations team (Systems and User Support team) to ensure the correct performance of the user applications.

Interest for CMS

• Addressing peaks of demand on simulation
– Easiest use case?
– Can be planned in advance and submitted to calls with details

• Opportunistic use for high granularity jobs
– In these busy systems, where large parallel jobs take up to 3 days, there is a significant “waiting” time for free
– Experience in Altamira: we use this time to provide opportunistic resources to the long tail of users requesting jobs with 1-16 cores
(Initial support by Regional/University funds: “USO OPORTUNISTA DE RECURSOS DE SUPERCOMPUTACION EN SIMULACION OPORTUSIM” [opportunistic use of supercomputing resources in simulation], “Incentivación de Proyectos de I+D, SODERCAN”)

• Large Data “fast processing”
– Exploit access to storage via InfiniBand FDR (for example, at IFCA we share GPFS volumes with the Tier-2 system)
– Most nodes are linked to Geant via dark fiber, so data can be moved fast
– Can be linked to “hot” analysis with high impact if needed

Potential Issues

• POLITICAL/STRATEGY (but likely not to be discussed now…)
– Resources are limited
– RES calls are oversubscribed
– Arguments: scientific impact, detailed use cases
– Next year the MN node renewal will provide at least 2x resources
– Experience could be extended to other countries, or towards PRACE
• Local support
– As stated in the RES call
– Can be complemented by WLCG experts close to RES
• Key word: Cost / Efficiency of using supercomputing resources
– In many cases the cost of a supercomputing node is similar (IB = 10 Gb Ethernet)
– Scale factor is key (operating 50K cores is << 50x operating 1K cores)
• Need to adapt to a “fixed” supercomputing environment
– Use a single group for “production” (ssh, not certificates but username/password)
– “Encapsulate” (“virtualize”) over the existing platform
– Start a “bridging” project based on Cloud technology
• How? See next…

udocker

Example of an (already proven) path to integration into a “supercomputing framework” using “a la cloud” techniques being explored in the INDIGO-DataCloud project (contribution from Jorge Gomes, LIP team, Lisbon, Portugal):

• tool to execute the content of docker containers in user space when docker is not available
• enables download of docker containers from dockerhub
• enables execution of docker containers by non-privileged users
• can be used to execute the content of docker containers in batch systems and interactive clusters managed by others
• wrapper around other tools to mimic docker capabilities
• current version uses proot to provide a chroot-like environment without privileges (runs on CentOS 6, CentOS 7, Fedora, Ubuntu)
• More info at:
– https://www.gitbook.com/book/indigo-dc/udocker/details
– https://indigo-dc.gitbooks.io/udocker/content/doc/user_manual.html

• Examples:

# download, but could also import or load a container exported/saved by docker
$ udocker.py pull ubuntu:latest
$ udocker.py create --name=myubuntu ubuntu:latest

# make the host homedir visible inside the container and execute something
$ udocker.py run -v $HOME myubuntu /bin/bash <


• Examples:

# download, but could also import or load a container exported/saved by docker
$ udocker.py pull fedora:latest
$ udocker.py create --name=myfed fedora:latest

# install something and run
$ udocker.py run myfed /bin/bash
yum install -y firefox
$ udocker.py run --bindhome --hostauth --hostenv -v /sys -v /proc \
    -v /var/run -v /dev --user=myusername --dri myfed firefox
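
For batch use, command lines like the ones above can be assembled programmatically. The following is a minimal Python sketch and not part of udocker itself: the helper names (`udocker_pull`, `udocker_create`, `udocker_run`) are hypothetical, `udocker.py` is assumed to be on the PATH, and only options that appear on these slides are used. The helpers build argument lists without executing anything.

```python
import shlex

UDOCKER = "udocker.py"  # assumption: the script is on PATH, as in the slides


def udocker_pull(image):
    """Argument list for downloading an image, e.g. 'fedora:latest'."""
    return [UDOCKER, "pull", image]


def udocker_create(name, image):
    """Argument list for creating a named container from a pulled image."""
    return [UDOCKER, "create", f"--name={name}", image]


def udocker_run(container, command, volumes=(), flags=(), user=None):
    """Argument list for running `command` inside `container`.

    volumes: host paths made visible inside the container with -v
    flags:   extra switches taken from the slides, e.g. '--bindhome'
    user:    optional value for --user=
    """
    args = [UDOCKER, "run", *flags]
    for path in volumes:
        args += ["-v", path]
    if user is not None:
        args.append(f"--user={user}")
    return args + [container, *command]


if __name__ == "__main__":
    steps = [
        udocker_pull("fedora:latest"),
        udocker_create("myfed", "fedora:latest"),
        udocker_run(
            "myfed",
            ["firefox"],
            volumes=["/sys", "/proc", "/var/run", "/dev"],
            flags=["--bindhome", "--hostauth", "--hostenv", "--dri"],
            user="myusername",
        ),
    ]
    for step in steps:
        # Only print the shell-quoted commands here; replacing this with
        # subprocess.run(step, check=True) would actually execute them.
        print(shlex.join(step))
```

Keeping each command as a list avoids shell quoting problems when container names or paths come from job parameters.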


• Everything is stored in the user home dir or some other location
• Container layers are downloaded to the user home
• Directory trees can be created/extracted from these container layers
• proot uses the debugger ptrace mechanism to change pathnames and execute transparently inside a directory tree
• No impact on read/write or execution, only impact on system calls using pathnames (e.g. open, chdir, etc.)
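
The pathname rewriting described above can be illustrated with a toy model. This is a sketch of the idea only: the `translate` function and the directory names are hypothetical, symbolic links are ignored, and nothing here intercepts system calls (proot does that through ptrace). It shows how a path used inside the container could be mapped onto the extracted directory tree, or onto a host directory made visible with `-v`.

```python
import posixpath


def translate(guest_path, rootfs, binds=None):
    """Map a path as seen inside the container to a path on the host.

    guest_path: absolute path used by the containerised program
    rootfs:     host directory holding the extracted container tree
    binds:      {guest_prefix: host_path} for host directories made
                visible inside the container
    """
    # Normalise first so that '..' cannot climb above the guest root.
    guest = posixpath.normpath("/" + guest_path.lstrip("/"))
    for prefix, host in (binds or {}).items():
        prefix = posixpath.normpath("/" + prefix.lstrip("/"))
        inside = guest.startswith(prefix.rstrip("/") + "/")
        if guest == prefix or inside:
            rest = guest[len(prefix):].lstrip("/")
            return posixpath.normpath(posixpath.join(host, rest))
    # Everything else is resolved inside the container directory tree.
    return posixpath.normpath(posixpath.join(rootfs, guest.lstrip("/")))


if __name__ == "__main__":
    # Hypothetical location of an extracted container tree.
    root = "/home/alice/containers/myfed/ROOT"
    print(translate("/etc/passwd", root))
    # → /home/alice/containers/myfed/ROOT/etc/passwd
    print(translate("/../../etc/passwd", root))
    # → /home/alice/containers/myfed/ROOT/etc/passwd
    print(translate("/home/alice/data.txt", root, {"/home/alice": "/home/alice"}))
    # → /home/alice/data.txt
```

Only the path arguments change in this model, which matches the point that reads, writes and execution themselves are unaffected.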

• Does not require installation of software in the host system:
– udocker is a python script
– proot is statically compiled

And then…PRACE ?

PRACE Research Infrastructure (http://www.prace-ri.eu/) is the top level of the European HPC ecosystem
• Again, focus is on HPC for very large parallel applications
• But again, one could try to exploit similar ideas
• More HPC-oriented architectures (GPU, Power, Blue Gene)
• Calls are regularly launched
• There are two levels of resources:
– Tier-0 systems (>1 Petaflop = >50K cores approx.)
– Also Tier-1 systems
• Initial focus: opportunistic use?

THANKS FOR YOUR INTEREST!