The Interplay of Compile-Time and Run-Time Options for Performance Prediction Luc Lesoil, Mathieu Acher, Xhevahire Tërnava, Arnaud Blouin, Jean-Marc Jézéquel
Total Page:16
File Type:pdf, Size:1020Kb
The Interplay of Compile-time and Run-time Options for Performance Prediction Luc Lesoil, Mathieu Acher, Xhevahire Tërnava, Arnaud Blouin, Jean-Marc Jézéquel To cite this version: Luc Lesoil, Mathieu Acher, Xhevahire Tërnava, Arnaud Blouin, Jean-Marc Jézéquel. The Interplay of Compile-time and Run-time Options for Performance Prediction. SPLC 2021 - 25th ACM Inter- national Systems and Software Product Line Conference - Volume A, Sep 2021, Leicester, United Kingdom. pp.1-12, 10.1145/3461001.3471149. hal-03286127 HAL Id: hal-03286127 https://hal.archives-ouvertes.fr/hal-03286127 Submitted on 15 Jul 2021 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. The Interplay of Compile-time and Run-time Options for Performance Prediction Luc Lesoil, Mathieu Acher, Xhevahire Tërnava, Arnaud Blouin, Jean-Marc Jézéquel Univ Rennes, INSA Rennes, CNRS, Inria, IRISA Rennes, France [email protected] ABSTRACT Both compile-time and run-time options can be configured to reach Many software projects are configurable through compile-time op- specific functional and performance goals. tions (e.g., using ./configure) and also through run-time options (e.g., Existing studies consider either compile-time or run-time op- command-line parameters, fed to the software at execution time). tions, but not both and the possible interplay between them. For in- Several works have shown how to predict the effect of run-time stance, all run-time configurations are measured using a unique ex- options on performance. However it is yet to be studied how these ecutable of the system, typically compiled with the default compile- prediction models behave when the software is built with different time configuration (i.e., using ./configure without overriding compile- compile-time options. For instance, is the best run-time configu- time options’ values). Owing to the cost of measuring configura- ration always the best w.r.t. the chosen compilation options? In tions, this is perfectly understandable but it is also a threat to valid- this paper, we investigate the effect of compile-time options on the ity. In particular, we can question the generality of these models if performance distributions of 4 software systems. There are cases we change the compile-time options when building the software where the compiler layer effect is linear which is an opportunity system: Do compile-time options change the performance distribu- to generalize performance models or to tune and measure runtime tion of run-time configurations? If yes, to what extent? Is the best performance at lower cost. We also prove there can exist an inter- run-time configuration always the best? Are the most influential play by exhibiting a case where compile-time options significantly run-time options always the same whatever the compile-time op- alter the performance distributions of a configurable system. tions used? Can we reuse a prediction model whatever the build has been? In short: (RQ1) Do compile-time options change the CCS CONCEPTS performance distributions of configurable systems? In this paper we investigate the effect of compile-time options • Software and its engineering ! Software product lines; Soft- together with runtime options on the performance distributions ware product lines; Software performance; • Computing method- of 4 software systems. For each of these systems, we measure a ologies ! Machine learning. relevant performance metrics for a combination of =12 compile =1 ¸=1 time options and =1A run-time options (yielding 2 2 A possi- ble configurations) over a number of different inputs. We show 1 INTRODUCTION that the compile-time options can alter the run-time performance An ideal software system is expected to deliver the right function- distributions of software systems, and that it is worth tuning the ality on time, using as few resources as possible, in every possible compile-time options to improve their performances. We thus ad- circumstances whatever the hardware, the operating system, the dress a second research question: (RQ2) How to tune software compiler, or the data fed as input. For fitting such a diversity of performances at the compile-time level? needs, it is increasingly common that software comes in many Our contributions are as follows : variants and is highly configurable through configuration options. • We construct a protocol and carry out an empirical study Numerous works have studied the effects of configuration op- investigating the effects of compile-time options on run-time tions on performance. The outcome is a performance model that performances, for 4 software systems, various workloads, can be used to predict the performance of any configuration, to find and non-functional properties; an optimal configuration, or to identify, debug, and reason about in- • We provide a Docker image, a dataset of measurements, as fluential options of a system [19, 23, 28, 29, 38–40, 44, 49, 53, 56, 63]. well as analysis scripts1; There are however numerous mechanisms to implement and de- • We exhibit a case (namely nodeJS and its operation rate) for liver options: configuration files, command-line parameters, feature which the compile-time options interact with the run-time toggles or flags, plugins, etc. A useful distinction to make is between options. We also present and implement a simple case of compile-time and run-time options. On the one hand, compile-time cross-layer tuning, providing a set of configuration options options can be used to build a custom system that can then be (run-time & compile-time) improving the operation rate of executed for a variety of usages. The widely used "./configure && the default configuration of nodeJS. make" is a prominent example for configuring a software project Remainder. Section2 discusses the motivation of this paper; at compile-time. On the other hand, run-time options are used to Section3 presents the experimental protocol; Section4 evaluates parameterize the behavior of the system at load-time or during the and analyses the results w.r.t. this protocol; Section5 discusses execution. For instance, users can set some values to command-line arguments for choosing a specific algorithm or tuning the execu- 1Companion repository; Zenodo data: https://zenodo.org/record/4706963; Docker: tion time based on the particularities of a given input to process. https://hub.docker.com/r/anonymicse2021/splc21 Luc Lesoil, Mathieu Acher, Xhevahire Tërnava, Arnaud Blouin, Jean-Marc Jézéquel the results, their limitations and future work; Section6 describes model has many applications and use-cases [4, 19, 22–24, 28, 29, 38– research works related to the topic of this paper; Section7 details 40, 44, 49, 53, 56, 60, 63, 65]: optimization (tuning) and finding of the threats to validity; Section8 concludes our paper. the best configuration, specialization of the configuration space, un- derstanding and debugging of the effects of configuration options, 2 BACKGROUND AND MOTIVATION verification (e.g., non-regression testing), etc. The performance of a 2.1 Compile-time and run-time options system, such as execution time, is usually measured using the same compiled executable (binary). In our case, we consider that the Many software systems are highly-configurable both at compile way the software is compiled is subject to variability; there time and at run time. For instance, the "./configure && make" is are two configuration spaces. Formally, given a software sys- widely used to build a custom system i.e., an executable binary. tem with a compile-time configuration space C and a run-time Considering the x264 video encoder (see Figure1), the compile- configuration space R, a performance model is a black-box func- time option –disable-asm can be used to disable "platform-specific tion 5 : C × R ! R that maps each run-time configuration A 2 R assembly optimizations". There are compile-time options to activate to the performance of the system compiled with a compile-time the needed functionality (e.g., –disable-avs), to fit requirements configuration 2 2 C. The construction of a performance model con- related to the hardware or operating system, etc. Some of these sists in running the system in a fixed compile-time setting 2 2 C on options have a suggested impact on performance. For the same various configurations A 2 R, and record the resulting performance subject system x264, it is also possible to use run-time options values ? = 5 ¹2, Aº. such as me, mbtree and cabac with possible effects on the size Learning performance models. Measuring all configurations of and encoding time. Run-time options are used to parameterize the a configurable system is the most obvious path to, for example, behavior of the system at load time or during the execution. find a well-suited configuration. It is however too costly or infea- sible in practice. Machine-learning techniques address this issue by measuring only a subset of configurations (known as sample) and then using these configurations’ measurements to build a per- formance model capable of predicting the performance of other configurations (i.e., configurations not measured before). Research work thus follow a "sampling, measuring, learning" process [19, 22– 24, 29, 38–40, 44, 49, 53, 60, 63, 65]. The training data for learning a performance model of a system compiled with a configuration 2 2 C is then 퐷2 = f¹R#8, P#8º j 8 2 1 : = g where = is the number of measurements. Statistical learningJ techniques,K such as linear regression [51], decision trees [48], or random forests [42], use this training set to build prediction models.