Accurate Measurement of Small Execution Times – Getting Around Measurement Errors

Carlos Moreno, Sebastian Fischmeister
Electrical and Computer Engineering Department, University of Waterloo, Canada
Abstract—Engineers and researchers often require accurate measurements of small execution times or duration of events in a program. Errors in the measurement facility can introduce important challenges, especially when measuring small intervals. Commonly used mitigating approaches exhibit several issues; in particular, they only reduce the effect of the error and never eliminate it. In this letter, we propose a technique to effectively eliminate measurement errors and obtain a robust statistical estimate of the execution time or duration of events in a program. The technique is simple to implement, yet it entirely eliminates the systematic (non-random) component of the measurement error, as opposed to simply reducing it. Experimental results confirm the effectiveness of the proposed method.

I. MOTIVATION

Software engineers and researchers often require accurate measurements of small execution times or duration of events in a program. These measurements may be necessary, for example, for performance analysis or comparison, or for measurement-based worst-case execution time (WCET) analysis. Obtaining execution times analytically from the assembly code is increasingly difficult with modern architectures, and often the most practical alternative is actual measurement during execution.

Measurement errors and uncertainties introduce an important difficulty, especially for short intervals. For example, a basic approach to measure the execution time of a given function or fragment of code F is the following:

    time start = get_current_time()
    Execute F
    time end = get_current_time()
    // execution time = end - start
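As a concrete illustration (not taken from the letter), the basic approach might be coded in C with POSIX clock_gettime() as the get_current_time() facility; work() is a hypothetical stand-in for the fragment F:

    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <time.h>

    /* Hypothetical stand-in for the fragment F being measured. */
    static void work(void)
    {
        volatile double x = 0.0;
        for (int i = 0; i < 1000; i++)
            x += i * 0.5;
    }

    int main(void)
    {
        struct timespec start, end;

        clock_gettime(CLOCK_MONOTONIC, &start);
        work();                                   /* Execute F */
        clock_gettime(CLOCK_MONOTONIC, &end);

        /* Elapsed time in nanoseconds; as discussed next, this still
           includes the unknown overhead of clock_gettime() itself. */
        long long elapsed = (long long)(end.tv_sec - start.tv_sec) * 1000000000LL
                            + (end.tv_nsec - start.tv_nsec);

        printf("measured: %lld ns\n", elapsed);
        return 0;
    }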
Unfortunately, the resulting measurement includes the actual execution time of F plus an unknown amount of time corresponding to the internal processing time of get_current_time(). With modern OSs, invocation of the get_current_time() facility can even involve a context switch to kernel mode and back. This unknown overhead comprises a systematic (non-random) component and a noise (random) component. The systematic error is in general related to the execution time of the get_current_time() facility, whereas the random component can be due to a variety of factors such as measurement based on clock ticks, scheduling issues, or even electrical noise in the case of measurement based on pin-toggling. Though our work addresses both, our focus is on the systematic error, which is the one that commonly used approaches fail to effectively eliminate.

Some of the common mitigating approaches only reduce the effect of these errors and never completely eliminate it. In our example, a typical mitigation approach consists of executing F multiple times, as shown below:

    time start = get_current_time()
    Repeat N times:
        Execute F
    time end = get_current_time()
    // execution time = (end - start) / N

The execution time that we obtain is subject to the measurement error divided by N; by choosing a large enough N, we can make this error arbitrarily low. However, this approach has several potential issues: (1) repeating N times can in turn introduce an additional unknown overhead, if coded as a for loop; furthermore, this is subject to uncertainty in that the compiler may or may not implement loop unrolling, meaning that the user could be unaware of whether this overhead is present; (2) the efficiency of the error reduction process is low, that is, we require a large N (thus potentially long experiment times) to significantly reduce the error; and (3) re-executing F may require re-initialization through some call to an additional function, which then introduces a new uncertainty:

    time start = get_current_time()
    Repeat N times:
        initialize_parameters()
        Execute F
    time end = get_current_time()
    // execution time = (end - start) / N

In this case the measured time corresponds to the execution time of F plus the execution time of initialize_parameters(), and it is not possible to determine either of the two values.

Other mitigating approaches include profiling the execution time of the get_current_time() facility. For example, measure the execution time of a null fragment; the measured average value then corresponds to the systematic error or overhead in get_current_time():

    Repeat N times:
        start = get_current_time()
        end = get_current_time()
        sum = sum + (end - start)
    overhead = sum / N

This approach, however, can be ineffective for modern architectures, where low-level hardware aspects such as caches and pipelines can cause a difference in the execution time of get_current_time() when executed in succession vs. when executed with other code in between calls.

Embedded engineers often measure execution time by observing a signal at an output pin. They instrument the code with an output port instruction to toggle a pin, and measure the time between edges to obtain execution times. Though this mechanism in general yields higher accuracy, it is still subject to the same types of measurement errors discussed above.

II. RELATED WORK

To the best of our knowledge, no concrete effective approaches have been suggested to get around these issues related to measurement errors. Some common ideas and themes seem to be present, some of them oriented to measuring variations in timing parameters (see for example [1]). The effect of outliers is a recurrent theme; typical approaches involve statistics that are immune to the effect of outliers, such as the median.

Jain [2] and Laplante [3] both discuss performance analysis, including estimation of execution times and statistical analysis of experimental data. Oliveira et al. [4] proposed a system where statistically rigorous measurements are extracted under a carefully controlled environment, effectively getting around some of the issues that can disrupt the parameters being measured. These works, however, focus on the extraction and analysis of experimental data for performance evaluation, and not on the actual measurement of execution times.

Stewart [5] covers basic techniques for measuring execution time and their applicability to WCET and performance analysis. Lilja [6] also covers some of the basic techniques and provides a good set of definitions, terminology, and modeling of measurement errors and their sources; in particular, [6] presents a good discussion on the notions of accuracy, precision, and resolution. CPU clock cycle counters in modern processors (e.g., Intel [7] and ARM [8]) can provide high resolution, but they don't necessarily avoid the measurement artifacts that reduce accuracy and precision in the measurements.

In the context of WCET analysis, hybrid approaches [9] use static analysis to determine WCET in terms of execution times of fragments. These times are measured with pin-toggling instrumentation, assisted by special logic-analyzer hardware. Our proposed approach exhibits important advantages over the pin-toggling instrumentation approach, yet it can benefit from the static analysis component, which can compensate for complex hardware features that introduce a relationship where execution paths affect the execution times being measured.

III. GETTING AROUND MEASUREMENT ERRORS

We start by addressing the systematic error only; in the next sections we include the random part (noise) in the analysis. For N consecutive executions of F, the measured execution time T corresponds to T = N·T_e + ε, where T_e is the execution time of F and ε is the overhead from the invocations of get_current_time(). We notice that it is impossible to determine T_e from a single measurement, since we have two unknowns.

The key observation is that with an additional measurement taken over a different number of consecutive executions of F, the difference between the two measurements is free of the systematic error: measuring one execution yields T_e + ε and measuring two consecutive executions yields 2·T_e + ε, so subtracting the former from the latter gives T_e exactly. Each of these differential measurements is still subject to noise. However, we can reduce the effect of noise by repeating this differential measurement multiple times. We remark that coding these repetitions as a for loop is not an issue, since the loop overhead occurs outside of the measurements. The idea is illustrated below:

    Repeat N times:
        time T1 = get_current_time()
        Execute F
        time T2 = get_current_time()
        Execute F
        Execute F
        time T3 = get_current_time()
        total += (T3 - T2) - (T2 - T1)
    // execution time = total / N
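As an illustration (not part of the original letter), a minimal C sketch of this differential measurement using POSIX clock_gettime() might look as follows; work() is a hypothetical stand-in for F, and N is chosen arbitrarily:

    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <time.h>

    #define N 1000  /* number of repetitions of the differential measurement */

    /* Hypothetical stand-in for the fragment F being measured. */
    static void work(void)
    {
        volatile double x = 0.0;
        for (int i = 0; i < 100; i++)
            x += i * 0.5;
    }

    static long long ns(const struct timespec *t)
    {
        return (long long)t->tv_sec * 1000000000LL + t->tv_nsec;
    }

    int main(void)
    {
        struct timespec t1, t2, t3;
        long long total = 0;

        for (int i = 0; i < N; i++) {
            clock_gettime(CLOCK_MONOTONIC, &t1);
            work();                              /* one execution of F  */
            clock_gettime(CLOCK_MONOTONIC, &t2);
            work();                              /* two executions of F */
            work();
            clock_gettime(CLOCK_MONOTONIC, &t3);

            /* (T3 - T2) - (T2 - T1) = (2*Te + eps) - (Te + eps) = Te */
            total += (ns(&t3) - ns(&t2)) - (ns(&t2) - ns(&t1));
        }

        printf("estimated execution time: %.1f ns\n", (double)total / N);
        return 0;
    }

Note that the for loop's own bookkeeping runs between the timed regions of consecutive iterations, so, as remarked above, it does not contaminate the measurements.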
This technique is simple to implement, and by choosing a sufficiently large N, we can make the effect of the noise arbitrarily low. However, as our experimental results confirm, the technique presented in the next section produces measurements with higher precision for the same total number of measurements.

IV. STRAIGHT LINE FITTING

We now present a technique that is more efficient in terms of reduction of the noise for a given total number of measurements. Without loss of generality, we assume a zero-mean model for the measurement noise. In the simpler case where no re-initialization is necessary before each invocation of F, a statistical estimate of T_e can be obtained through a straight line fitting given multiple points (N_k, T_k), where T_k is the measured time when executing F N_k consecutive times. The slope of this line corresponds to T_e.

We can set up a scheme where M measurements are taken: the first time we measure one execution of F, then two executions, then three, and so on until measuring M executions of F. Following the above notation (N_k, T_k), this corresponds to the case where N_k = k, with 1 ≤ k ≤ M. Since we do not require a large number of repetitions of F (i.e., M can be a relatively low value), it is feasible to execute this sequence without requiring a for loop. This avoids the issue of introducing additional unknown overhead, as mentioned in Section I.

The straight line y = ax + b for a set of M points (x_k, y_k) can be easily determined [10]. In our case, we obtain the value of T_e (the slope, a) by substituting x_k (= N_k) = k and y_k = T_k. Figure 1 shows an example with M = 20 using POSIX's clock_gettime() to measure a ≈40 ns execution time (the best-fitting line is y = 40.4x + 18.8).

One of the important advantages of our method is its robustness against sporadic measurements with large errors.
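To make the fitting step concrete (this sketch is not from the letter), the least-squares slope and intercept for the points (k, T_k) can be computed as follows; the T[] values shown are hypothetical measurements used only to exercise the code:

    #include <stdio.h>

    /* Least-squares fit of y = a*x + b through the points (k, T[k-1]),
     * k = 1..M.  The slope a is the statistical estimate of Te; the
     * intercept b absorbs the systematic overhead of the timing facility. */
    static void fit_line(const double *T, int M, double *a, double *b)
    {
        double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;

        for (int k = 1; k <= M; k++) {
            double x = (double)k;
            double y = T[k - 1];
            sx  += x;
            sy  += y;
            sxx += x * x;
            sxy += x * y;
        }

        *a = (M * sxy - sx * sy) / (M * sxx - sx * sx);  /* slope: estimate of Te */
        *b = (sy - *a * sx) / M;                         /* intercept: overhead   */
    }

    int main(void)
    {
        /* Hypothetical measurements T_k (in ns) for k = 1..5 consecutive
         * executions of F: roughly k*40 ns plus about 20 ns of overhead. */
        double T[] = { 61.0, 99.0, 140.0, 182.0, 219.0 };
        double a, b;

        fit_line(T, 5, &a, &b);
        printf("estimated Te = %.1f ns, overhead = %.1f ns\n", a, b);
        return 0;
    }

With these example values the fit reports a slope of about 39.9 ns, while the roughly 20 ns of timer overhead ends up in the intercept rather than in the estimate of T_e.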