CERN Cloud Benchmark Suite
D. Giordano, C. Cordeiro (CERN)
HEPiX Benchmarking Working Group - Kickoff Meeting, June 3rd 2016

Aim
• Benchmark computing resources
  – Run in any cloud IaaS (virtual or bare-metal)
  – Run several benchmarks
  – Collect, store and share results
• Satisfy several use cases
  – Procurement / acceptance phase
  – Stress tests of an infrastructure
  – Continuous benchmarking of provisioned resources
  – Prediction of job performance

~1.5 Years of Experience
• Running in a number of cloud providers
• Collected ~1.2 M tests
• Adopted for
  – CERN commercial cloud activities
  – Azure evaluation
  – CERN OpenStack tests
• Focus on CPU performance
  – To be extended to other areas
• Further documentation
  – White Area Lecture (Dec 18th) …

Strategy
• Allow collection of a configurable number of benchmarks
  – Compare benchmark outcomes under similar conditions
• Provide prompt feedback about executed benchmarks
  – In production this can suggest deletion and re-provisioning of underperforming VMs
• Generalize the setup to run the benchmark suite in any cloud
• Ease data analysis and resource accounting
  – Examples of analysis with IPython are shown in the next slides
  – Compare resources on a cost-to-benefit basis
• Mimic the usage of cloud resources for experiment workloads
  – Benchmark VMs of the same size used by the VOs (1 vCPU, 4 vCPUs, etc.)
  – Probe randomly assigned slots in a cloud cluster, without knowing what the neighbour is doing

Cloud Benchmark Suite

A Scalable Architecture
• A configurable sequence of benchmarks to run
• Results are collected in an Elasticsearch cluster and monitored with Kibana
  – Metadata: VM UID, CPU architecture, OS, cloud name, IP address, …
• Detailed analysis performed with IPython analysis tools

The cern-benchmark* Package
• The cern-benchmark suite is available in ai6-stable.repo and also in GitLab [1]
  – cern-benchmark-docker is also available, extending the former with a container execution mode
• The benchmark suite can run one or more benchmarks, with the option to publish (or not) the final results to Elasticsearch at CERN
  – The currently available benchmarks are: ATLAS KV, Fast Benchmark and Whetstone (run in dedicated CentOS6 containers)

Example output:
  ===============================================
  RESULTS OF THE OFFLINE BENCHMARK FOR CLOUD
  ===============================================
  Machine classification: i7_1_f6m26s3_mhz2266.746
  Whetstone Benchmark: ……

[Diagram: cern-benchmark workflow; runs kv, fastBmk and whetstone, writes results.json and publishes to ES]

[1] https://gitlab.cern.ch/cloud-infrastructure/cloud-benchmark-suite and https://gitlab.cern.ch/cloud-infrastructure/cloud-benchmark-suite-docker
More information at http://bmkwg.web.cern.ch/bmkwg/

Benchmarking Approach
• Run the benchmark(s) in parallel in a configurable number of threads (a minimal sketch follows this slide)
• The benchmark suite can be run at any time during the VM lifecycle
  – Only at the beginning
  – At each job cycle
  – Sequentially: all configured benchmarks in a row
  – Synchronized: specific benchmarks at specific points in time
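The parallel-execution idea can be illustrated with a short sketch. This is not the actual cern-benchmark implementation; the benchmark command, thread count and result fields are illustrative assumptions.

  # Minimal sketch of running a benchmark in a configurable number of
  # parallel threads, as described above. NOT the actual cern-benchmark
  # code: the command line and result fields are assumptions.
  import json
  import subprocess
  from concurrent.futures import ThreadPoolExecutor

  def run_one(cmd):
      """Run a single benchmark process and capture its output."""
      proc = subprocess.run(cmd, capture_output=True, text=True)
      return {"cmd": " ".join(cmd), "rc": proc.returncode,
              "out": proc.stdout.strip()}

  def run_parallel(cmd, n_threads=4):
      """Launch n_threads copies of the benchmark, one per vCPU slot."""
      with ThreadPoolExecutor(max_workers=n_threads) as pool:
          return list(pool.map(run_one, [cmd] * n_threads))

  if __name__ == "__main__":
      # Hypothetical invocation; the real suite wraps kv/fastBmk/whetstone.
      results = run_parallel(["sysbench", "cpu", "run"], n_threads=4)
      print(json.dumps(results, indent=2))  # would be published to ES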
Benchmarks Used So Far
• HEP related
  – LHCb Fast Benchmark (fastBmk)
    • Original Python code, modified by A. Wiebalck to use Python multiprocessing
    • Very fast; based on Gaussian random-number generation
  – ATLAS G4 simulation via the KV toolkit
• Open-source Phoronix benchmarks, adopted by DBCE to "commoditize" resources
  – Adopted in the past; proved useful to generate high and continuous load
• Open-source Whetstone
• Studied the possibility of including HS06
  – Difficult: licensing aspects, no open-source distribution, long running time, etc.
  – Recently SPEC has released a Cloud IaaS toolkit (more comments later)

ATLAS KV Reference Workload
• Which workload to use for benchmarking?
  – CPU time/event differs for each workload
  – Measured that, within ~10%, the relative CPU time/event performance does not depend on the specific workload
    • Confirmed also using a different approach: HammerCloud jobs
  – Study done in the pre-procurement phase (CloudA, CloudB)
• Preferred workload: G4 single muon, for its fast running time of O(few mins)
  – NB: the CPU time/event does not include the first event, to avoid bias from the initialization process

The DBCE Phoronix Benchmarks

A Few Examples
• OpenStack @ CERN
• Microsoft Azure
• Procurement
• CPU benchmark vs job

Probing the OpenStack Compute Environment
• Probe performance of VMs in the OpenStack compute environment
  – Where resources are assigned to the experiments for CERN cloud activities
  – Tenant with ~200 single-core VMs
• Make sure VMs are provisioned on different hypervisors
• Run the synchronized benchmarking suite
[Figure: benchmark schedule across VM1, VM2, … over time; number of VMs per pnode]

Profiling Results
[Figure: average KV performance vs time, and KV performance per pnode; mean, 5th and 95th percentiles of KV sec/evt]

Case Study: OpenStack at CERN
Work done in collaboration with J. Van Eldik
• Evaluate the effect of hypervisor load on the performance of single-vCPU VMs
  – Extracted 5 nodes from a pool of compute nodes
    • Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
  – Load phases: create a targeted number of VMs per hypervisor
    • 1 VM per KVM
    • 16 VMs per KVM
    • 30 VMs per KVM
  – VM image: Scientific Linux CERN SLC release 6.6 (Carbon)
  – Run a sequence of benchmarks
• Used Phoronix open-source benchmarks to produce load

Qualitative Look at the Data
• Larger dispersion of the KV and fastBmk values in the highest-load region
[Figure: KV and fastBmk value distributions for the different VMs-per-hypervisor load phases]

More Quantitative Analysis: fastBmk vs KV
• Correlation study in the 16- and 30-VM regions (see the analysis sketch after this slide)
  – NB: the fastBmk metric is transformed into value^-1 [s]
  – The average performance degradation differs per hypervisor and per benchmark used
[Figure: ratio mean(30 VMs)/mean(16 VMs) aggregated per hypervisor for KV and fastBmk; evolution of a single VM in the 2D fastBmk-vs-KV parameter space with projections; a single KVM highlighted]
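The degradation-ratio study above could be reproduced in an IPython session along these lines. The column names and the CSV export are assumptions for illustration; the real data live in the Elasticsearch cluster.

  # Sketch of the per-hypervisor degradation and correlation analysis.
  # Column names and the input file are illustrative assumptions.
  import pandas as pd

  df = pd.read_csv("benchmark_results.csv")      # hypothetical export from ES
  df["fastbmk_inv"] = 1.0 / df["fastbmk_score"]  # transform fastBmk into value^-1 [s]

  # Mean KV sec/evt per hypervisor and load phase (VMs per hypervisor)
  means = (df.groupby(["hypervisor", "vms_per_node"])["kv_sec_per_evt"]
             .mean().unstack())

  # Average performance degradation: ratio mean(30 VMs) / mean(16 VMs)
  print(means[30] / means[16])

  # Correlation between the two independent metrics in the high-load region
  high = df[df["vms_per_node"].isin([16, 30])]
  print(high[["kv_sec_per_evt", "fastbmk_inv"]].corr())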
And the Other Benchmarks?
• The ability to discriminate different hypervisor performance depends on the specific test
[Figure: LAME mp3 encoding and 7Zip compression results; in some tests the single KVM stands out clearly, in others it cannot be identified]
https://zenodo.org/record/48495#.V1FJUud95FV

A Few Examples
• OpenStack @ CERN
• Microsoft Azure
• Procurement
• CPU benchmark vs job

Cost-Benefit Analysis for VM Flavours (Series)
• Goal: compare the VM offers on a cost/event basis (a worked sketch follows at the end of this section)
  – CPU time/event × VM cost/hour → cost/event
  – Adopt KV; Fast Benchmark (fastBmk) used as a cross-check
• Measurements cover
  – Three Azure data centres: Central US, North EU, West EU
  – Two series of VMs
    • Standard_A3 (4 cores), Standard tier
    • Standard_D3 (4 cores), optimized compute
  – Pricing in USD/(h·core) from the Azure pricing website

Azure Data Centre and CPU Model Comparison
• The performance and effective cost of a given VM series depend on the CPU model
  – D3 VMs come with either E5-2660 or E5-2673 CPUs
• The effective normalized cost of D3 VMs (E5-2660) is high
  – Even compared with the A series

KV vs fastBmk
• Good linearity between the two independent benchmarks, KV and fastBmk, across the A3 and D3 series
• The measured effect does not depend on specific compiler flags
[Figure: KV vs fastBmk scatter for the A1, A3 and D3 series]

A Few Examples
• OpenStack @ CERN
• Microsoft Azure
• Procurement
• CPU benchmark vs job

Rank Cloud Providers by Benchmark
• When: in the tendering phase
• Who: the cloud providers
• How: they run our performance test
  – Results are collected and used to verify compliance with the Technical Requirements

Benchmarking & Service Credit Compensation
• Poor performance gives rise to Service Credit Compensation
  – Fixed limits: minimum desired performance (KV 1.2 s/evt) and tolerated performance (KV 1.5 s/evt)
  – Compensation ∝ lost performance × penalty
[Figure: KV performance vs time (days) for a poorly performing and a well performing cloud, with the compensation region marked]

A Few Examples
• OpenStack @ CERN
• Microsoft Azure
• Procurement
• CPU benchmark vs job

Correlation with Job Performance
• Job vs KV benchmark, as measured in the same VM

Next Steps
• Continue studies on benchmarking of
  – CERN private cloud resources
  – Commercial cloud resources
    • T-Systems – CERN cloud production activity, with 4 experiments involved
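Returning to the cost-benefit and service-credit slides above, a worked sketch of both figures of merit follows. All numbers and the linear penalty shape are illustrative assumptions, not the tender's actual terms.

  # Sketch of cost/event and of a compensation proportional to lost
  # performance, under the limits quoted above (KV 1.2 / 1.5 s/evt).
  # Prices and the penalty shape are illustrative assumptions.

  def cost_per_event(cpu_sec_per_evt, usd_per_core_hour):
      """cost/event = CPU time/event * VM cost per core-hour."""
      return cpu_sec_per_evt / 3600.0 * usd_per_core_hour

  def compensation_fraction(kv_sec_per_evt, desired=1.2, tolerated=1.5):
      """Fraction of the maximum service credit, proportional to the
      performance lost between the desired and tolerated limits."""
      if kv_sec_per_evt <= desired:
          return 0.0                              # performing as required
      lost = min(kv_sec_per_evt, tolerated) - desired
      return lost / (tolerated - desired)         # 0..1 of the max penalty

  # Hypothetical comparison of two flavours:
  print(cost_per_event(1.1, 0.12))    # faster but pricier core
  print(cost_per_event(1.4, 0.08))    # slower but cheaper core
  print(compensation_fraction(1.35))  # 0.5: half of the maximum credit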