CERN Cloud Suite

D. Giordano, C. Cordeiro (CERN)

HEPiX Benchmarking Working Group - Kickoff Meeting June 3rd 2016

Aim

• Benchmark computing resources
  – Run in any cloud IaaS (virtual or bare-metal)
  – Run several benchmarks
  – Collect, store and share results

• Satisfy several use cases
  – Procurement / acceptance phase
  – Stress tests of an infrastructure
  – Continuous benchmarking of provisioned resources
  – Prediction of job performance

~1.5 Years of Experience

• Running in a number of cloud providers
• Collected ~1.2 M tests
• Adopted for
  – CERN commercial cloud activities
  – Azure evaluation
  – CERN OpenStack tests
• Focus on CPU performance
  – To be extended to other areas

• Further documentation
  – White Area Lecture (Dec 18th) …

Strategy

• Allow collection of a configurable number of benchmarks
  – Compare the benchmark outcomes under similar conditions

• Provide prompt feedback about executed benchmarks
  – In production, this can suggest deletion and re-provisioning of underperforming VMs

• Generalize the setup to run the benchmark suite in any cloud

• Ease data analysis and resource accounting
  – Examples of analysis with IPython are shown in the next slides
  – Compare resources on a cost-to-benefit basis

• Mimic the usage of cloud resources for experiment workloads
  – Benchmark VMs of the same size used by VOs (1 vCPU, 4 vCPUs, etc.)
  – Probe randomly assigned slots in a cloud cluster, without knowing what the neighbor is doing

Cloud Benchmark Suite

A Scalable Architecture

• A configurable sequence of benchmarks to run
• Results are collected in an Elasticsearch cluster and monitored with Kibana
  – Metadata: VM UID, CPU architecture, OS, cloud name, IP address, … (see the sketch below)
• Detailed analysis performed with IPython analysis tools
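As a minimal sketch, a published result document could look as follows. The field names, index name and host are illustrative assumptions, not the suite's actual schema; direct HTTP indexing is shown only to make the document shape concrete, whereas the suite publishes through CERN's supported messaging infrastructure.

```python
import json
import urllib.request

# Illustrative result document: metadata fields mirror those listed
# above (VM UID, CPU architecture, OS, cloud name, IP address).
doc = {
    "vm_uid": "0b32c1de-77aa-4e8e-9f10-1234567890ab",
    "cloud": "CERN-OpenStack",
    "cpu_arch": "x86_64",
    "os": "SLC 6.6",
    "ip": "188.184.10.20",
    "benchmark": "fastBmk",
    "value": 12.3,
    "timestamp": "2016-06-03T10:00:00Z",
}

# Index the document in Elasticsearch (host/index are assumptions).
req = urllib.request.Request(
    "http://es-host.example.ch:9200/benchmark-results/result",
    data=json.dumps(doc).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(req)
```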

The cern-benchmark* package

• The cern-benchmark suite is available in the ai6-stable.repo and also in GitLab [1]
  – cern-benchmark-docker is also available as an extension of the former, providing a container execution mode
• The benchmark suite can run one or more benchmarks, with the option to publish (or not) the final results to ES at CERN (a sketch follows below)
  – The currently available benchmarks are: ATLAS KV, Fast Benchmark and Whetstone
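The sketch below only mirrors the behaviour described above (run a configurable list of benchmarks, write a local report, optionally publish); the real cern-benchmark CLI options are not reproduced here, and run_one() is a hypothetical stand-in for the per-benchmark drivers.

```python
import json
import time

def run_one(name):
    """Hypothetical stand-in for a per-benchmark driver (kv, fastBmk, whetstone)."""
    t0 = time.time()
    # ... the actual benchmark would run here ...
    return {"benchmark": name, "wall_time_s": time.time() - t0}

def run_suite(benchmarks, publish=False):
    report = {"results": [run_one(b) for b in benchmarks]}
    if publish:
        print("would publish to ES at CERN")  # messaging call would go here
    with open("results.json", "w") as f:      # local report, as in the diagram
        json.dump(report, f, indent=2)
    return report

run_suite(["kv", "fastBmk", "whetstone"])
```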

[Diagram: inside the VM, cern-benchmark runs kv, fastBmk and whetstone in dedicated CentOS6 containers, writes results.json and publishes to ES. Sample output: "====== RESULTS OF THE OFFLINE BENCHMARK FOR CLOUD ======", machine classification: i7_1_f6m26s3_mhz2266.746]

[1] https://gitlab.cern.ch/cloud-infrastructure/cloud-benchmark-suite and https://gitlab.cern.ch/cloud-infrastructure/cloud-benchmark-suite-docker

More information at http://bmkwg.web.cern.ch/bmkwg/

Benchmarking Approach

• Run the benchmark(s) in parallel in a configurable number of threads
• The benchmark suite can be run at any time during the VM lifecycle
  – Only at the beginning
  – At each job cycle
• Two execution modes (see the sketch after this list):
  – Sequential: all configured benchmarks in a row
  – Synchronized: specific benchmarks at specific points in time
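A minimal sketch of the two execution modes; run_bmk() stands in for any configured benchmark and the function names are our own, not the suite's API.

```python
import time

def run_bmk(name):
    print("running", name)

def run_sequential(benchmarks):
    """All configured benchmarks in a row."""
    for b in benchmarks:
        run_bmk(b)

def run_synchronized(benchmark, start_at):
    """Sleep until an agreed wall-clock time, so all VMs probe together."""
    time.sleep(max(0.0, start_at - time.time()))
    run_bmk(benchmark)

run_sequential(["kv", "fastBmk", "whetstone"])
run_synchronized("kv", time.time() + 5)  # e.g. a common start 5 s from now
```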

Benchmarks Used So Far

• HEP related
  – LHCb Fast Benchmark (fastBmk)
    • Original Python code modified by A. Wiebalck to use python multiprocessing
    • Very fast; based on a Gaussian random-number generator (a minimal sketch follows below)
  – ATLAS G4 simulation via the KV toolkit
• Open-source Phoronix benchmarks, adopted by DBCE to “commoditize” resources
  – Adopted in the past; proved useful to generate high and continuous load
• Open-source Whetstone
• Studied the possibility of including HS06
  – Difficult: license aspects, no open-source distribution, long running time, etc.
  – Recently SPEC has released a Cloud IaaS toolkit (more comments later)
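A minimal fastBmk-like kernel, assuming the pattern described above: time a tight loop of Gaussian random draws in parallel copies. The loop count and score normalisation are illustrative, not the real fastBmk ones.

```python
import random
import time
from multiprocessing import Pool

def gauss_kernel(_):
    """Time a tight loop of Gaussian random draws; return draws/second."""
    n = 1000 * 1000
    t0 = time.time()
    for _ in range(n):
        random.normalvariate(10.0, 1.0)
    return n / (time.time() - t0)

if __name__ == "__main__":
    n_copies = 4                  # e.g. one copy per vCPU on a 4-core VM
    with Pool(n_copies) as pool:  # python multiprocessing, as in fastBmk
        scores = pool.map(gauss_kernel, range(n_copies))
    print("per-copy:", scores, "total:", sum(scores))
```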

ATLAS KV Reference Workload

• Which workload to use for benchmarking?
  – CPU time/event differs for each workload
  – Measured that, within ~10%, the relative CPU time/event performance does not depend on the specific workload
• Confirmed also using a different approach: HammerCloud jobs
• Study done in the pre-procurement phase

[Figure: relative CPU time/event per workload, Cloud A vs Cloud B]

• Preferred workload: G4 single muon, with a fast running time of O(few minutes)
  – NB: the CPU time/event does not include the first event, to avoid bias from the initialization process (see the sketch below)
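A short sketch of the first-event exclusion; the per-event times are illustrative numbers, not a real KV run.

```python
# Per-event CPU times [s/evt] from one illustrative KV run: the first
# event carries the initialization overhead, so it is excluded from
# the average, as noted above.
cpu_times = [42.7, 1.31, 1.28, 1.35, 1.29, 1.33]

kv_score = sum(cpu_times[1:]) / len(cpu_times[1:])
print("KV CPU time/event: %.2f s (first event dropped)" % kv_score)
```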

The DBCE Phoronix benchmarks

Few Examples

• OpenStack @ CERN
• Azure
• Procurement
• CPU bmk vs Job

Probing the OpenStack Compute Environment

• Probe the performance of VMs in the OpenStack Compute environment
  – Where resources are assigned to the experiments for CERN cloud activities
  – Tenant with ~200 single-core VMs
• Make sure VMs are provisioned on different hypervisors
• Run the synchronized benchmarking suite

[Figures: synchronized benchmark timeline per VM; number of VMs per pnode]

Profiling Results

[Figures: average KV performance vs time, and KV performance per pnode; mean KV s/evt shown with 5th and 95th percentile bands]

Case Study: OpenStack at CERN

Work done in collaboration with J. Van Eldik

• Evaluate the effect of hypervisor load on the performance of single-vCPU VMs
  – Extracted 5 nodes from the pool of computing nodes
    • Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
  – Load phases: create a targeted number of VMs per hypervisor
    • 1 VM per KVM
    • 16 VMs per KVM
    • 30 VMs per KVM
  – VM image: Scientific Linux CERN SLC release 6.6 (Carbon)
  – Run a sequence of benchmarks

• Used open-source Phoronix benchmarks to produce load

Qualitative Look at Data

• Larger dispersion in KV and FastBmk values in the highest-load region

[Figure: KV and FastBmk values across the load phases (1, 16, 20, 25, 30 VMs per hypervisor)]

More Quantitative Analysis: FastBmk vs KV

• Correlation study in the 16- and 30-VM regions
  – NB: the FastBmk metric is transformed into value⁻¹ [s]
  – The average performance degradation differs per hypervisor and per benchmark used (see the sketch below)

[Figures: FastBmk vs KV in a 2D parameter space, with Projection-X, Projection-Y and Profile-X aggregated per hypervisor; ratio mean(30 VMs)/mean(16 VMs); the evolution of a single VM traces out a single KVM]
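A sketch of the degradation ratio above; the data and column names are illustrative, not the measured values. FastBmk reports a rate, so it is inverted (value⁻¹) before being compared with KV s/evt.

```python
import pandas as pd

# One row per benchmark run; "phase" is the number of VMs per hypervisor.
df = pd.DataFrame({
    "hypervisor": ["hv1"] * 4 + ["hv2"] * 4,
    "phase": [16, 16, 30, 30] * 2,
    "kv": [1.10, 1.12, 1.30, 1.34, 1.08, 1.09, 1.22, 1.25],  # s/evt
})

means = df.groupby(["hypervisor", "phase"])["kv"].mean().unstack("phase")
ratio = means[30] / means[16]  # > 1: slower under the 30-VM load
print(ratio)
```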

And the Other Benchmarks?

• The ability to discriminate different hypervisor performance depends on the specific test

[Figures: LAME mp3 encoding ("Where is the single KVM??") and 7-Zip compression ("A single KVM!!")]

Reference: https://zenodo.org/record/48495#.V1FJUud95FV

Few Examples

• OpenStack @ CERN
• Microsoft Azure
• Procurement
• CPU bmk vs Job

Cost-Benefit Analysis for VM Flavours (Series)

• Goal: compare the VM offers on a cost/event basis
  – CPU time/event × VM cost/hour → cost/event (a worked example follows below)
  – Adopt KV; Fast Benchmark (fastBmk) as a cross-check
• Measurements cover
  – Three Azure data centres: Central US, North EU, West EU
  – Two series of VMs
    • Standard_A3 (4 cores): Standard Tier
    • Standard_D3 (4 cores): Optimized Compute
  – VM cost in USD/(h·core) taken from the Azure pricing website
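A worked example of the cost/event figure of merit; the numbers are illustrative, not actual Azure prices or measured KV scores.

```python
kv = 1.3      # KV CPU time/event on one core [s/evt]
price = 0.20  # VM price per core [USD/(h*core)]

cost_per_event = kv / 3600.0 * price  # [USD/evt] per core
print("cost/event: %.2e USD" % cost_per_event)  # ~7.2e-05 USD
```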

Azure Data Centre and CPU Model Comparison

• Performance and effective cost of a given VM series depend on the CPU model

[Figures: D3 performance and effective cost split by CPU model, E5-2660 vs E5-2673]

• The effective normalized cost of D3 VMs with the E5-2660 CPU is high
  – Even compared with the A series

KV vs fastBmk

• Good linearity between the two independent benchmarks, KV and fastBmk
• The measured effect does not depend on specific compiler flags

[Figure: fastBmk vs KV correlation for the A1, A3 and D3 series]

Few Examples

• OpenStack @ CERN
• Microsoft Azure
• Procurement
• CPU bmk vs Job

Rank Cloud Providers by Benchmark

• When: in the tendering phase
• Who: cloud providers
• How: run our performance test
  – Results are collected and used to verify compliance with the Technical Requirements

Benchmarking & Service Credit Compensation

• Poor performance gives rise to Service Credit Compensation
  – Fixed limits: minimum desired (KV 1.2 s/evt) and tolerated (KV 1.5 s/evt) performance
  – Compensation ∝ lost performance · penalty (a sketch follows below)
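A sketch of the compensation rule above. The two limits come from the slide; the penalty constant and the capping at the tolerated limit are illustrative assumptions, not the contractual formula.

```python
DESIRED, TOLERATED = 1.2, 1.5  # fixed KV limits [s/evt]
PENALTY = 100.0                # assumed credit units per lost s/evt

def service_credit(kv_s_per_evt):
    """Compensation proportional to the performance lost beyond DESIRED."""
    lost = max(0.0, min(kv_s_per_evt, TOLERATED) - DESIRED)
    return lost * PENALTY

print(service_credit(1.35))  # poorly performing: 15.0 credit units
print(service_credit(1.10))  # well performing: 0.0
```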

[Figures: KV performance vs time [day] for a poorly performing cloud and a well performing cloud, with the compensation region marked]

Few Examples

• OpenStack @ CERN
• Microsoft Azure
• Procurement
• CPU bmk vs Job

Correlation with Job Performance

• Job vs KV benchmark, as measured in the same VM

Next Steps

• Continue studies on benchmarking of
  – CERN private cloud resources
  – Commercial cloud resources
    • T-Systems: CERN cloud production activity
      – 4 experiments involved for the next 3 months to run production workloads
      – Benchmarking of each provisioned VM will be performed

• Add other benchmarks, if agreed in this WG
  – Representative of the applications of the other experiments
  – Suggested in this WG

Next Steps (II)

• Evaluate what others are doing, keeping in mind HEP-specific needs
• Recently started to look into:
  – PerfKitBenchmarker
    • Long list of benchmarks available (hpcc, fio, netperf, objStorage, scimark, speccpu, specsfs, unixbench, …)
  – SPEC Cloud IaaS 2016
    • Currently only 2 benchmarks available: Cassandra for I/O, map-reduce for CPU
    • It is under license

Conclusions

• Benchmarking of cloud resources is here to stay
  – And will become more and more crucial

• Need to extend to other areas
  – Network, storage

• Have a unified approach for the several use cases
  – Procurement, continuous benchmarking/optimization, calibration, …

• Adopt supported IT tools
  – Messaging system, ES, Kibana, Analytics
