CERN Cloud Benchmark Suite
D. Giordano, C. Cordeiro (CERN)
HEPiX Benchmarking Working Group - Kickoff Meeting, June 3rd 2016

Aim
• Benchmark computing resources
  – Run in any cloud IaaS (virtual or bare-metal)
  – Run several benchmarks
  – Collect, store and share results
• Satisfy several use cases
  – Procurement / acceptance phase
  – Stress tests of an infrastructure
  – Continuous benchmarking of provisioned resources
  – Prediction of job performance

~1.5 Years of Experience
• Running in a number of cloud providers
• Collected ~1.2 M tests
• Adopted for
  – CERN commercial cloud activities
  – Azure evaluation
  – CERN OpenStack tests
• Focus on CPU performance
  – To be extended to other areas
• Further documentation
  – White Area Lecture (Dec 18th) …

Strategy
• Allow collection of a configurable number of benchmarks
  – Compare benchmark outcomes under similar conditions
• Provide prompt feedback about executed benchmarks
  – In production this can suggest deletion and re-provisioning of underperforming VMs
• Generalize the setup to run the benchmark suite in any cloud
• Ease data analysis and resource accounting
  – Examples of analysis with IPython are shown in the next slides
  – Compare resources on a cost-to-benefit basis
• Mimic the usage of cloud resources for experiment workloads
  – Benchmark VMs of the same size used by the VOs (1 vCPU, 4 vCPUs, etc.)
  – Probe randomly assigned slots in a cloud cluster, without knowing what the neighbour is doing

Cloud Benchmark Suite

A Scalable Architecture
• A configurable sequence of benchmarks to run
• Results are collected in an Elasticsearch cluster and monitored with Kibana
  – Metadata: VM UID, CPU architecture, OS, cloud name, IP address, …
• Detailed analysis performed with IPython analysis tools

The cern-benchmark* Package
• The cern-benchmark suite is available in ai6-stable.repo and also in GitLab [1]
  – cern-benchmark-docker is also available, extending the former with a container execution mode
• The benchmark suite can run one or more benchmarks, with the option to publish (or not) the final results to Elasticsearch at CERN
  – The currently available benchmarks are: ATLAS KV, Fast Benchmark and Whetstone (run in dedicated CentOS6 containers)

Example output:
  ===============================================
  RESULTS OF THE OFFLINE BENCHMARK FOR CLOUD
  ===============================================
  Machine classification: i7_1_f6m26s3_mhz2266.746
  Whetstone Benchmark: ……

[Diagram: cern-benchmark workflow; runs kv, fastBmk and whetstone, writes results.json and publishes to ES]

[1] https://gitlab.cern.ch/cloud-infrastructure/cloud-benchmark-suite and https://gitlab.cern.ch/cloud-infrastructure/cloud-benchmark-suite-docker
More information at http://bmkwg.web.cern.ch/bmkwg/

Benchmarking Approach
• Run the benchmark(s) in parallel in a configurable number of threads (a minimal sketch follows this slide)
• The benchmark suite can be run at any time during the VM lifecycle
  – Only at the beginning
  – At each job cycle
  – Sequentially: all configured benchmarks in a row
  – Synchronized: specific benchmarks at specific points in time
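The parallel-execution idea can be illustrated with a short sketch. This is not the actual cern-benchmark implementation; the benchmark command, thread count and result fields are illustrative assumptions.

  # Minimal sketch of running a benchmark in a configurable number of
  # parallel threads, as described above. NOT the actual cern-benchmark
  # code: the command line and result fields are assumptions.
  import json
  import subprocess
  from concurrent.futures import ThreadPoolExecutor

  def run_one(cmd):
      """Run a single benchmark process and capture its output."""
      proc = subprocess.run(cmd, capture_output=True, text=True)
      return {"cmd": " ".join(cmd), "rc": proc.returncode,
              "out": proc.stdout.strip()}

  def run_parallel(cmd, n_threads=4):
      """Launch n_threads copies of the benchmark, one per vCPU slot."""
      with ThreadPoolExecutor(max_workers=n_threads) as pool:
          return list(pool.map(run_one, [cmd] * n_threads))

  if __name__ == "__main__":
      # Hypothetical invocation; the real suite wraps kv/fastBmk/whetstone.
      results = run_parallel(["sysbench", "cpu", "run"], n_threads=4)
      print(json.dumps(results, indent=2))  # would be published to ES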
Benchmarks Used So Far
• HEP related
  – LHCb Fast Benchmark (fastBmk)
    • Original Python code, modified by A. Wiebalck to use Python multiprocessing
    • Very fast; based on Gaussian random-number generation
  – ATLAS G4 simulation via the KV toolkit
• Open-source Phoronix benchmarks, adopted by DBCE to "commoditize" resources
  – Adopted in the past; proved useful to generate high and continuous load
• Open-source Whetstone
• Studied the possibility of including HS06
  – Difficult: licensing aspects, no open-source distribution, long running time, etc.
  – Recently SPEC has released a Cloud IaaS toolkit (more comments later)

ATLAS KV Reference Workload
• Which workload to use for benchmarking?
  – CPU time/event differs for each workload
  – Measured that, within ~10%, the relative CPU time/event performance does not depend on the specific workload
    • Confirmed also using a different approach: HammerCloud jobs
  – Study done in the pre-procurement phase (CloudA, CloudB)
• Preferred workload: G4 single muon, for its fast running time of O(few mins)
  – NB: the CPU time/event does not include the first event, to avoid bias from the initialization process

The DBCE Phoronix Benchmarks

A Few Examples
• OpenStack @ CERN
• Microsoft Azure
• Procurement
• CPU benchmark vs job

Probing the OpenStack Compute Environment
• Probe performance of VMs in the OpenStack compute environment
  – Where resources are assigned to the experiments for CERN cloud activities
  – Tenant with ~200 single-core VMs
• Make sure VMs are provisioned on different hypervisors
• Run the synchronized benchmarking suite
[Figure: benchmark schedule across VM1, VM2, … over time; number of VMs per pnode]

Profiling Results
[Figure: average KV performance vs time, and KV performance per pnode; mean, 5th and 95th percentiles of KV sec/evt]

Case Study: OpenStack at CERN
Work done in collaboration with J. Van Eldik
• Evaluate the effect of hypervisor load on the performance of single-vCPU VMs
  – Extracted 5 nodes from a pool of compute nodes
    • Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
  – Load phases: create a targeted number of VMs per hypervisor
    • 1 VM per KVM
    • 16 VMs per KVM
    • 30 VMs per KVM
  – VM image: Scientific Linux CERN SLC release 6.6 (Carbon)
  – Run a sequence of benchmarks
• Used Phoronix open-source benchmarks to produce load

Qualitative Look at the Data
• Larger dispersion of the KV and fastBmk values in the highest-load region
[Figure: KV and fastBmk value distributions for the different VMs-per-hypervisor load phases]

More Quantitative Analysis: fastBmk vs KV
• Correlation study in the 16- and 30-VM regions (see the analysis sketch after this slide)
  – NB: the fastBmk metric is transformed into value^-1 [s]
  – The average performance degradation differs per hypervisor and per benchmark used
[Figure: ratio mean(30 VMs)/mean(16 VMs) aggregated per hypervisor for KV and fastBmk; evolution of a single VM in the 2D fastBmk-vs-KV parameter space with projections; a single KVM highlighted]
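The degradation-ratio study above could be reproduced in an IPython session along these lines. The column names and the CSV export are assumptions for illustration; the real data live in the Elasticsearch cluster.

  # Sketch of the per-hypervisor degradation and correlation analysis.
  # Column names and the input file are illustrative assumptions.
  import pandas as pd

  df = pd.read_csv("benchmark_results.csv")      # hypothetical export from ES
  df["fastbmk_inv"] = 1.0 / df["fastbmk_score"]  # transform fastBmk into value^-1 [s]

  # Mean KV sec/evt per hypervisor and load phase (VMs per hypervisor)
  means = (df.groupby(["hypervisor", "vms_per_node"])["kv_sec_per_evt"]
             .mean().unstack())

  # Average performance degradation: ratio mean(30 VMs) / mean(16 VMs)
  print(means[30] / means[16])

  # Correlation between the two independent metrics in the high-load region
  high = df[df["vms_per_node"].isin([16, 30])]
  print(high[["kv_sec_per_evt", "fastbmk_inv"]].corr())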
And the Other Benchmarks?
• The ability to discriminate different hypervisor performance depends on the specific test
[Figure: LAME mp3 encoding and 7Zip compression results; in some tests the single KVM stands out clearly, in others it cannot be identified]
https://zenodo.org/record/48495#.V1FJUud95FV

A Few Examples
• OpenStack @ CERN
• Microsoft Azure
• Procurement
• CPU benchmark vs job

Cost-Benefit Analysis for VM Flavours (Series)
• Goal: compare the VM offers on a cost/event basis (a worked sketch follows at the end of this section)
  – CPU time/event × VM cost/hour → cost/event
  – Adopt KV; Fast Benchmark (fastBmk) used as a cross-check
• Measurements cover
  – Three Azure data centres: Central US, North EU, West EU
  – Two series of VMs
    • Standard_A3 (4 cores), Standard tier
    • Standard_D3 (4 cores), optimized compute
  – Pricing in USD/(h·core) from the Azure pricing website

Azure Data Centre and CPU Model Comparison
• The performance and effective cost of a given VM series depend on the CPU model
  – D3 VMs come with either E5-2660 or E5-2673 CPUs
• The effective normalized cost of D3 VMs (E5-2660) is high
  – Even compared with the A series

KV vs fastBmk
• Good linearity between the two independent benchmarks, KV and fastBmk, across the A3 and D3 series
• The measured effect does not depend on specific compiler flags
[Figure: KV vs fastBmk scatter for the A1, A3 and D3 series]

A Few Examples
• OpenStack @ CERN
• Microsoft Azure
• Procurement
• CPU benchmark vs job

Rank Cloud Providers by Benchmark
• When: in the tendering phase
• Who: the cloud providers
• How: they run our performance test
  – Results are collected and used to verify compliance with the Technical Requirements

Benchmarking & Service Credit Compensation
• Poor performance gives rise to Service Credit Compensation
  – Fixed limits: minimum desired performance (KV 1.2 s/evt) and tolerated performance (KV 1.5 s/evt)
  – Compensation ∝ lost performance × penalty
[Figure: KV performance vs time (days) for a poorly performing and a well performing cloud, with the compensation region marked]

A Few Examples
• OpenStack @ CERN
• Microsoft Azure
• Procurement
• CPU benchmark vs job

Correlation with Job Performance
• Job vs KV benchmark, as measured in the same VM

Next Steps
• Continue studies on benchmarking of
  – CERN private cloud resources
  – Commercial cloud resources
    • T-Systems – CERN cloud production activity, with 4 experiments involved
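Returning to the cost-benefit and service-credit slides above, a worked sketch of both figures of merit follows. All numbers and the linear penalty shape are illustrative assumptions, not the tender's actual terms.

  # Sketch of cost/event and of a compensation proportional to lost
  # performance, under the limits quoted above (KV 1.2 / 1.5 s/evt).
  # Prices and the penalty shape are illustrative assumptions.

  def cost_per_event(cpu_sec_per_evt, usd_per_core_hour):
      """cost/event = CPU time/event * VM cost per core-hour."""
      return cpu_sec_per_evt / 3600.0 * usd_per_core_hour

  def compensation_fraction(kv_sec_per_evt, desired=1.2, tolerated=1.5):
      """Fraction of the maximum service credit, proportional to the
      performance lost between the desired and tolerated limits."""
      if kv_sec_per_evt <= desired:
          return 0.0                              # performing as required
      lost = min(kv_sec_per_evt, tolerated) - desired
      return lost / (tolerated - desired)         # 0..1 of the max penalty

  # Hypothetical comparison of two flavours:
  print(cost_per_event(1.1, 0.12))    # faster but pricier core
  print(cost_per_event(1.4, 0.08))    # slower but cheaper core
  print(compensation_fraction(1.35))  # 0.5: half of the maximum credit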