CERN Openlab VI Project Agreement “Exploration of Google Technologies
Total Page:16
File Type:pdf, Size:1020Kb
CERN openlab VI Project Agreement “Exploration of Google technologies for HEP” between The European Organization for Nuclear Research (CERN) and Google Switzerland GmbH (Google) CERN K-Number Agreement Start Date 01/06/2019 Duration 1 year Page 1 of 26 THE EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH (“CERN”), an Intergovernmental Organization having its seat at Geneva, Switzerland, duly represented by Fabiola Gianotti, Director-General, and Google Switzerland GmbH (Google), Brandschenkestrasse 110, 8002 Zurich, Switzerland , duly represented by [LEGAL REPRESENTATIVE] Hereinafter each a “Party” and collectively the “Parties”, CONSIDERING THAT: The Parties have signed the CERN openlab VI Framework Agreement on November 1st, 2018 (“Framework Agreement”) which establishes the framework for collaboration between the Parties in CERN openlab phase VI (“openlab VI”) from 1 January 2018 until 31 December 2020 and which sets out the principles for all collaborations under CERN openlab VI; Google Switzerland GmbH (Google) is an industrial Member of openlab VI in accordance with the Framework Agreement; Article 3 of the Framework Agreement establishes that all collaborations in CERN openlab VI shall be established in specific Projects on a bilateral or multilateral basis and in specific agreements (each a “Project Agreement”); The Parties wish to collaborate in the “exploration of applications of Google products and technologies to High Energy Physics ICT problems related to the collection, storage and analysis of the data coming from the Experiments” under CERN openlab VI (hereinafter “Exploration of Google technologies for HEP”); AGREE AS FOLLOWS: Article 1 Purpose and scope 1. This Project Agreement establishes the collaboration of the Parties in Exploration of Google technologies for HEP, hereinafter the “Project”). The detailed tasks and responsibilities of the Parties for this Project are described in the Statement of Work, contained in Annex 1 of this Project Agreement. Page 2 of 26 2. The Parties’ collaboration under this Project Agreement is subject to the terms and conditions set forth in the Framework Agreement, unless otherwise specified in this Project Agreement. Article 2 Contributions The Parties shall make the following contributions to the Project: 1. Google Switzerland GmbH (Google) contributions: • Deployment of technical experts to the Project; • Secondment of or grants for recruiting dedicated researchers (at the master and doctoral level) at CERN for the execution of the projects; • Credits towards the use of cloud technologies Google Switzerland GmbH (Google) shall make the above contributions and issue the credit memos, as applicable, to CERN within ninety working days of receipt of the corresponding invoice from CERN. 2. CERN contributions: • Deployment of technical experts to the Project; • Supervision of personnel funded by Google Switzerland GmbH (Google); • Deployment of interns for the CERN Summer Student 2019-2020 program to work on activities related to the Project; • Access to CERN’s technical environment (hardware, software) as required for the Project; • Support to public relations and communications activities in accordance with Annex 1 of the Framework Agreement and based on Google Switzerland GmbH (Google)’s openlab VI membership status. Article 3 Contact points The contact persons for the Project shall be: For Google Switzerland GmbH (Google): Page 3 of 26 XXXXXXX For CERN: Federico Carminati, CERN openlab Maria Girone, CERN openlab For the individual use cases Use case 1.1 Tim Bell, Ricardo Rocha (IT-CM) Use case 1.2 Felice Pantaleo, Andrea Bocci (CMS) Use case 2.1 Sofia Vallecorsa (CERN openlab) Use case 2.2 Jennifer Ngadiuba, Maurizio Pierini (CMS) Use case 2.3 Luca Canali (IT-DB), Viktor Khristenko (CERN openlab) Use case 2.4 Sofia Vallecorsa, Taghi Aliyev (CERN openlab), xxxxx (UNOSAT) Use Case 2.5 LHCb Use case 3.1 Federico Carminati, Sofia Vallecorsa (CERN openlab) Use case 3.2 Federico Carminati, Sofia Vallecorsa (CERN openlab) CERN 1 Esplanade des Particules CH 1211 Geneva 23, Switzerland Article 4 Entry into force, duration and termination 1. This Project Agreement shall enter into force on the date of signature by the last Party to sign with effect of June 1st, 2019. It shall remain in force for as long as necessary to give effect to the Parties’ respective rights and obligations under this Project Agreement. 2. Except in case of force majeure, each Party may only terminate this Project Agreement in the event that the other Party fails to honour any of its obligations thereunder. Page 4 of 26 Signed on ………… The European Organization for Nuclear Research (CERN) ……………………………… (Signature) ……………………………… (Full Name) ……………………………… (Email address) Page 5 of 26 Signed on ………… Google Switzerland GmbH (Google) ……………………………… (Signature) ……………………………… (Full Name) ……………………………… (Email address) Page 6 of 26 Annex 1: Project Statement of Work Page 7 of 26 OVERVIEW Start date June 1st, 2019 Duration 12 months with option for renewal upon mutual agreement Members CERN, ATLAS, Google Total effort Total cash contributions Total in-kind contributions SUMMARY OF RESOURCE CONTRIBUTIONS FTEs Facilities1 Hardware Software Other Cash (Summer (FTE+travel, students, other, etc.) events) CERN Google Switzerland GmbH (Google) Total2 PROJECT DESCRIPTION The following Programme of Work is organized in three main areas with specific use cases of interest. Section 1 describes explorations in data center and computing infrastructures technologies (storage, workflow managements, containers, hybrid infrastructures integration, etc.). Section 2 focuses on application of machine learning/deep learning tools and algorithms. Finally Section 3 deals with quantum computing explorations. 1. Data Center technologies 1 Office space, rack space, access to infrastructure or facilities, etc. 2 Provide estimated values for facilities, hardware and software contributions Page 8 of 26 Kubernetes performance and scalability (use case 1.1) Large computing environments require high levels of automation for both resource and workload management, as well as careful resource planning to meet the needs of daily operations and periodic spikes on resource requests while minimizing cost. Balancing the two can often be challenging. Looking forward to the upcoming challenges of the High Luminosity LHC and its significant increase in the amount of data to be processed, this use case will explore technologies with potential to ease this task while improving resource usage. In addition to the more traditional scale out options using well established network and compute provisioning methods, it will focus on the Kubernetes project and ecosystem, where Google and more recently CERN have been investing significantly along a much larger community. The main focus will be on extending the ongoing efforts to scale out the CERN data center to the Google Cloud Platform (GCP) (already done at scale for one time demos like Kubecon Barcelona 2019), exploring Kubernetes federation and service mesh technologies (Istio). The Batch use case can be taken as the primary use case to offer a larger CPU capacity to CERN users. Another interesting area of collaboration are accelerators, and how we can offer GPUs (currently available in a very limited amount at CERN) and TPUs transparently to our users building on top of the same Kubernetes layer. The final result should include a detailed report on the technologies used as well as cost models and efficiency in the areas of compute, networking and storage. The current CERN capacity is in the order of 300.000 hardware hyper-threads (cores) and 100s of Petabytes of storage. For a successful evaluation of this model’s viability we expect to have a significant fraction of these resources available in GCP at least for a short period of time. Additional areas of interest include research on serverless models to simplify the LHC computing workflows. The existing workflows launched by data arriving from the detectors are a good match for pipelines triggered by the storage systems, with each step executed as an independent function. Technologies like Knative (the backend of Google’s Cloud Run service offering) should allow us to again build on the same Kubernetes layer, allowing experiment users to focus more on their computing tasks and less on resource management. It should also allow Google to get a large use case being deployed in this new cloud service offering. The end result is expected to be a prototype of one of the experiment’s data reconstruction relying on a single pipeline backed by GCS/S3 and Knative. Milestone Duration H/W S/W Approx. Criteria Outcome effort (FTE) M1 6 months x x 1 Working CERN Batch system prototype running on Kubernetes, with resources both at CERN and in the Google Cloud Platform (GCP) Page 9 of 26 M2 6 month 0.5 - Report on cost models and service efficiency for resources running on GCP M3 6 months x 1 - Experiment data reconstruction deployed using a serverless pipeline Composable data centers for efficient HEP computing workflows (use case 1.2) This use-case deals, in a novel way, with event processing, pushing back the frontiers of technologies that can find application in many other areas of science and technology. At High Luminosity LHC and FCC, the higher proton-proton interaction rate, pileup and event processing rate present an unprecedented challenge to the real-time and offline event reconstruction, requiring a processing power orders of magnitude larger than today. This exceeds by far the