Multi-Layer Software Configuration
Total Page:16
File Type:pdf, Size:1020Kb
Multi-layer Software Configuration: Empirical Study on Wordpress Mohammed Sayagh, Bram Adams Polytechnique Montreal, Canada fmohammed.sayagh, [email protected] Abstract—Software can be adapted to different situations systems consist of multiple layers, each of which hides the and platforms by changing its configuration. However, incorrect complexity of a lower layer, and has its own objects and configurations can lead to configuration errors that are hard configuration mechanisms. Since the behaviour of the system to resolve or understand, especially in the case of multi-layer architectures, where configuration options in each layer might as a whole requires neighbouring layers to collaborate, one contradict each other or be hard to trace to each other. Hence, needs to understand each layer’s configuration as well as how this paper performs an empirical study on the occurrence of configuration options in each layer interfere with each other. multi-layer configuration options across Wordpress (WP) plugins, Let’s consider the case of WP (Figure 1), which is currently WP, and the PHP engine. Our analyses show that WP and its the most popular content management system, and a typical plugins use on average 76 configuration options, a number that increases across time. We also find that each plugin uses on example of a multi-layer system consisting of a LAMP stack average 1.49% to 9.49% of all WP database options, and 1.38% (Linux, Apache, MySQL and PHP), the WP PHP application to 15.18% of all WP configurable constants. 85.16% of all WP and a myriad of WP plugins. One example of a cross-layer database options, 78.88% of all WP configurable constants, and configuration error was the inability of WP plugins to send 52 PHP configuration options are used by at least two plugins emails, due to a misconfiguration in lower layers related at the same time. Finally, we show how the latter options have 2 a larger potential for questions and confusion amongst users. to the PHP configuration option sendmail path . A second example was the case where the NextGen plugin was no longer I. INTRODUCTION able to upload images3 until someone pointed out that the Configuration is the means to adapt a software application script downloading the images was blocked by a configuration to different contexts and environments. For example, the Linux option in the PHP layer (memory limit). In both examples, kernel can be customized to different users by selecting only configuration options in lower layers impacted the behaviour the features that are of interest. Similarly, the kernel can of the top layer plugins. be customized to a specific hardware platform by providing Whereas existing work focuses on software configuration the details of processor, hard disk, and other devices. Each and configuration errors within a single layer of a software software system has its own mechanism for configuration, system, this paper represents a first empirical study towards ranging from hardcoded constants to global variables, property understanding the configuration options used by and shared files or dedicated databases, typically with a graphical user between different layers in the WP multi-layer system, as well interface to hide the underlying storage mechanism. as potential links with comprehension problems of users. The An incorrect value of a configuration option could result results can then be used in a follow-up study on multi-layer in incorrect behavior of a system, which we refer to as configuration errors. In particular, we address two preliminary configuration errors. Such errors occur often, typically are research questions to understand the evolution of configuration severe in nature, hard to debug, but they are actionable [1]. options across time, and two questions analyzing the usage of Such bugs are severe because they can have a catastrophic options across the studied layers: impact. For example, due to a misconfiguration, Facebook RQ1: What is the proportion of usage of each configu- was left inaccessible for about two hours1, depriving more ration mechanism in each layer? than 500 million users from access to the Facebook website. WP uses configurable constants and database con- Configuration errors are also hard to debug, since they need figuration options equally, while WP plugins prefer expertise in the failing application. However, if one is able (87%) configuration options stored in the database. to track down the cause of such an error, then the error RQ2: How does configuration mechanism usage evolve is actionable since a maintainer just needs to update the across time in each layer? configuration, usually without recompiling. Generally, the number of configuration options grows The major challenge for resolving software configuration across time, especially when new features are added errors is to find the violating configuration options, a chal- lenge that is aggravated in multi-layer systems. Multi-layer 2https://wordpress.org/support/topic/plugin-contact-form-7-wont-connect- to-smtp-server 1https://www.facebook.com/notes/facebook-engineering/more-details-on- 3https://wordpress.org/support/topic/plugin-nextgen-gallery-error-exceed- todays-outage/431441338919 memory-limit 978-1-4673-7529-0/15 c 2015 IEEE 31 SCAM 2015, Bremen, Germany Accepted for publication by IEEE. c 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. 52 in a layer. Configuration options that are not used categories. Arshad et al. [3] provide a characterization of anymore are removed after a while. configuration problems for two Java EE application servers, RQ3: How many configuration options defined in lower GlassFish and JBoss, by analyzing 281 bugs-reports. Hubaux layers are used by WP plugins? et al. [4] conduct two surveys respectively among Linux and On average, 1.49% to 9.49% of all WP database eCos users to understand configuration challenges. Jin et al. configuration options and 1.38% to 15.18% of all [5] analyze two open source applications and one industrial WP configurable constants are used by the plugins. application to quantify the challenges that configurability Furthermore, 1.30 PHP configuration options are creates for software testing and debugging. Other studies focus used by plugins, but only 0.40 ones are modified. on predicting configuration bugs. Using textual information in These large numbers are confirmed by the Stack- bug reports, Xia et al. [6] built a model to predict whether a Overflow and WP Exchange fora, where 12.19% bug is a configuration error or not. of all plugin conversations and 9.49% of all WP Many studies have been conducted to resolve configuration conversations related to configuration mention multi- errors. Keller et al. [7] proposed the tool ConfErr that aims at layer configuration options. quantifying the resilience of a software system to configuration RQ4: How many plugins share the same configuration errors caused by spelling mistakes, structural errors, and options of lower layers? semantic errors. Zhang et al. [8] built a tool to identify the 78.88% of all WP configurable constants and 85.16% root cause of a configuration error in Java programs. Zhang of all WP database options are used by at least two et al. [1] provide the tool ConfSuggester, which suggests plugins. For PHP, 52 PHP configuration options are the configuration option responsible for introducing a bug used by at least two plugins. We found a strong in a new version. The suggestion is generated based upon correlation of up to 0.55 between the number of the control flow of a system. Elsner et al. [9] propose a plugins using a configuration option and the number framework to detect configuration inconsistencies. It allows of fora conversations mentioning it. a user to specify the possible inconsistencies in a software The paper is organized as follows: section II presents the application, which will be combined with a model built from background and related work, and section III presents our the configuration files to find the inconsistencies. Attariyan et methodology. Section IV provides the results of our study, al. [10] built the tool ConfAid, which aims at pointing out while section V discusses the threats to validity. Finally, the root cause of configuration errors, again by analyzing the section VI concludes the paper and presents future work. control flow. Tartler et al. [11] propose an approach to resolve the inconsistencies between the configuration model and its II. BACKGROUND AND RELATED WORK implementation in the Linux source code. In this section, we provide background information about While the above research provides a set of configuration op- software configuration, multi-layer systems, and the WP tions that should be changed in order to fix a bug, Wang et al. ecosystem, and we discuss related work. [12] present an ordered set of configuration options to change in order to fix a configuration error, based on user feedback. A. Software Configuration Similarly, Xiong et al. [13] propose an approach that yields Configuration is a mechanism to adapt software systems the configuration options to change and a range of possible to a context or an infrastructure and is used to customize a values. Lillack et al. [14] evaluate the tool Lotrack, which system’s behaviors. A configuration option is a pair consisting explains for each code fragment which load-time configuration of an option name and its value, where the value has a specific options should be active for it to be executed. Nadi et al. [15] type (typically boolean or categorical, but sometimes numeric propose a static approach to extract and validate configuration or even a string).