Turbo Boost Up, AVX Clock Down: Complications for Scaling Tests
Steve Lantz
12/15/2017
What Is CPU Turbo? (Sandy Bridge)
[Figure omitted; legend: marked level = nominal frequency]
http://www.hotchips.org/wp-content/uploads/hc_archives/hc23/HC23.19.9-Desktop-CPUs/HC23.19.921.SandyBridge_Power_10-Rotem-Intel.pdf
Factors Affecting Turbo Boost 2.0 (SNB)
“The processor’s rated frequency assumes that all execution cores are active and are at the sustained thermal design power (TDP). However, under typical operation not all cores are active or at executing a high power workload...Intel Turbo Boost Technology takes advantage of the available TDP headroom and active cores are able to increase their operating frequency.
“To determine the highest performance frequency amongst active cores, the processor takes the following into consideration to recalculate turbo frequency during runtime:
• The number of cores operating in the C0 state.
• The estimated core current consumption.
• The estimated package prior and present power consumption.
• The package temperature.
“Any of these factors can affect the maximum frequency for a given workload. If the power, current, or thermal limit is reached, the processor will automatically reduce the frequency to stay with its TDP limit.”
https://www.intel.com/content/www/us/en/processors/core/2nd-gen-core-family-mobile-vol-1-datasheet.html
How Long Can SNB Sustain Turbo Boost?
Does It Affect Tests on phiphi?
• Yes, in principle, for 1-4 threads
  – Faster clock, even in steady state?
  – More likely if a test is < 2 minutes
  – More likely if it has a cold(ish) start
• Why do we care?
  – It throws off parallel scaling tests
  – Baseline with 1 thread is too high
• We can try to measure the effect
  – Results in a few slides...

slantz@phiphi ~$ sudo cpupower frequency-info
analyzing CPU 0:
  ...
  boost state support:
    Supported: yes
    Active: yes
    2400 MHz max turbo 4 active cores
    2400 MHz max turbo 3 active cores
    2500 MHz max turbo 2 active cores
    2500 MHz max turbo 1 active cores
slantz@phiphi ~$ grep MHz /proc/cpuinfo
cpu MHz : 2000.205
...
slantz@phiphi ~$ grep -c ida /proc/cpuinfo
24
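To make the "baseline is too high" point concrete, here is a minimal sketch in C. It assumes a compute-bound test that scales perfectly with thread count, and that each run actually reaches the max-turbo limit reported by cpupower on this slide (2500 MHz with 1 active core, 2400 MHz with 4). Both assumptions are illustrative, and the function names are hypothetical.

```c
#include <assert.h>

/* Apparent speedup of an ideally scalable, compute-bound test when the
   clock of the n-thread run differs from the clock of the 1-thread baseline.
   With equal clocks this is just nthreads. */
double apparent_speedup(int nthreads, double mhz_1thread, double mhz_nthreads)
{
    assert(nthreads > 0 && mhz_1thread > 0.0 && mhz_nthreads > 0.0);
    return nthreads * (mhz_nthreads / mhz_1thread);
}

/* Apparent parallel efficiency = apparent speedup / nthreads. */
double apparent_efficiency(int nthreads, double mhz_1thread, double mhz_nthreads)
{
    return apparent_speedup(nthreads, mhz_1thread, mhz_nthreads) / nthreads;
}
```

Under these assumptions, apparent_speedup(4, 2500.0, 2400.0) comes out near 3.84, an apparent efficiency of about 96% for code that in fact scales perfectly. The shortfall comes entirely from the faster single-thread clock.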
Does It Affect Tests on mic0 or phi2?
• No, Turbo Boost is disabled
  – No IDA flags in /proc/cpuinfo
  – No Intel Dynamic Acceleration
• IDA test works best
  – Based on CPUID
  – Root isn’t needed
• Others do use IDA!
  – Stampede2 KNL
  – Stampede2 Skylake

slantz@phiphi ~$ ssh mic0 grep -c ida /proc/cpuinfo
0
slantz@phiphi ~$ micsmc --turbo status mic0
mic0 (turbo):
  Turbo mode is disabled
slantz@phi2 ~$ grep -c ida /proc/cpuinfo
0
slantz@phi2 ~$ sudo cpupower frequency-info
analyzing CPU 0:
  ...
  boost state support:
    Supported: no
    Active: no
slantz@phi2 ~$ cat /sys/devices/system/cpu/intel_pstate/no_turbo
1
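The grep -c ida test can also be done from inside a test program, without root. The following is a rough sketch in C, not anything shown in the slides: it counts the logical CPUs whose flags line in a cpuinfo-style file lists a given flag. The helper names are hypothetical. Unlike plain grep, it matches the flag only as a whole word.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `token` appears in `line` as a whole whitespace-delimited word. */
int line_has_flag(const char *line, const char *token)
{
    assert(line != NULL && token != NULL);
    size_t n = strlen(token);
    if (n == 0) return 0;
    const char *p = line;
    while ((p = strstr(p, token)) != NULL) {
        int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\t' || p[n] == '\n';
        if (starts && ends) return 1;
        p += 1;  /* keep scanning after a partial-word match */
    }
    return 0;
}

/* Count "flags" lines in a cpuinfo-style file that list `token`.
   Returns -1 if the file cannot be opened. Assumes each flags line
   fits in the buffer, which is an assumption, not a guarantee. */
int count_cpus_with_flag(const char *path, const char *token)
{
    char buf[8192];
    int count = 0;
    FILE *f = fopen(path, "r");
    if (!f) return -1;
    while (fgets(buf, sizeof buf, f)) {
        if (strncmp(buf, "flags", 5) == 0 && line_has_flag(buf, token))
            count++;
    }
    fclose(f);
    return count;
}
```

Calling count_cpus_with_flag("/proc/cpuinfo", "ida") should then mirror the grep results on these slides: a positive count where the flag is present, 0 where it is absent.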
Strategy for Measuring Turbo Boost*
• Write a loop that takes a known, fixed number of cycles
  – Must always execute sequentially (no reordering by CPU)
  – Must not involve any memory operations
  – Can use SSE operations, but not AVX (we’ll look at that later)
• Run copies of it on some number of threads with OpenMP
• Time the loop on all threads; do this for many trials
  – Take minimum across threads and trials to remove OpenMP overhead
• Calculate max frequency observed (for nthreads ≤ ncores)
*https://stackoverflow.com/questions/11706563/how-can-i-programmatically-find-the-cpu-frequency-with-c/25400230#25400230
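The last two steps of this strategy reduce to a small calculation, sketched here in C. It assumes the per-thread, per-trial elapsed times have already been collected into one flat array, and that the loop's total cycle count is known. The function names and the array layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Shortest elapsed time over all (trial, thread) samples. Taking the
   minimum discards samples inflated by thread startup or scheduling. */
double min_elapsed(const double *seconds, size_t nsamples)
{
    assert(seconds != NULL && nsamples > 0);
    double best = seconds[0];
    for (size_t i = 1; i < nsamples; i++) {
        if (seconds[i] < best) best = seconds[i];
    }
    return best;
}

/* Highest observed clock in Hz: known cycle count / shortest time. */
double max_observed_hz(double loop_cycles, const double *seconds, size_t nsamples)
{
    double t = min_elapsed(seconds, nsamples);
    assert(t > 0.0);
    return loop_cycles / t;
}
```

As an illustration with made-up numbers, a loop of 1.0e9 cycles whose fastest sample takes 0.40 s corresponds to an observed clock of about 2.5 GHz.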
Inner Loop: SpinALot128
• Simple C code with intrinsics
  – Based on 128-bit SSE operations

static int inline SpinALot128(int spinCount)
{
  __m128 x = _mm_setzero_ps();
  for(int i=0; i