IBM Research Report

RC25675 (WAT1803-107) March 29, 2018 Computer Science IBM Research Report ConfEx: An Analytics Framework for Text-Based Software Configurations in the Cloud Ozan Tuncer1, Nilton Bila2, Canturk Isci2, Ayse K. Coskun1 1Boston University Boston, MA 02215 USA 2IBM Research Division Thomas J. Watson Research Center P.O. Box 218 Yorktown Heights, NY 10598 USA Research Division Almaden – Austin – Beijing – Brazil – Cambridge – Dublin – Haifa – India – Kenya – Melbourne – T.J. Watson – Tokyo – Zurich ConfEx: An Analytics Framework for Text-based Software Configurations in the Cloud Ozan Tuncer Nilton Bila Canturk Isci Ayse K. Coskun Boston University IBM Research IBM Research Boston University Boston, MA 02215 Yorktown Heights, NY 10598 Yorktown Heights, NY 10598 Boston, MA 02215 Email: [email protected] Email: [email protected] Email: [email protected] Email: [email protected] Abstract—Modern cloud applications are designed in a highly Configurations are traditionally validated by applications configurable way to provide increased reusability and portability. during startup. However, recent work has shown that 14-93% With the growing complexity of these applications, configuration of configuration parameters in today’s cloud software do not errors (i.e., misconfigurations) have become major sources of service outages and disruptions. While some research has so far have any special code for checking their correctness during ap- focused on automatically detecting errors on configurations that plication initialization [11]. To detect misconfigurations before are represented as well-structured key-value pairs, discovering deployment, researchers have developed various tools to auto- and extracting configurations remain a challenge for a wide range matically check for errors in application configurations (e.g., of cloud applications that store their configurations in loosely- [12], [13]). Among such tools, statistical and learning-based structured text files. This paper proposes ConfEx, a framework that enables dis- techniques (e.g., [14], [15], [16]) have gained popularity as covery and analysis of text-based configurations in multi-tenant low overhead configuration checkers that can be applied in an cloud platforms and cloud image repositories. Our framework application-agnostic manner. Statistical configuration checkers uses a novel vocabulary-based discovery technique to identify train on a corpus of configurations and learn common patterns. text-based configuration files in cloud system instances with These methods can then identify configurations that deviate unlabeled content. We show that, even for labeled configuration files, widely-used and expert-maintained configuration parsing from the norm as potential errors. Such statistical methods tools lack the consistency and robustness needed for meaning- are powerful in practice because they do not require intrusive ful statistical analysis of configurations. We introduce a novel static/dynamic analysis or application instrumentation. disambiguation technique that resolves the inconsistencies in the In order to perform statistical and learning-based config- configuration-related data extracted by existing parsers. When uration analysis in multi-tenant cloud platforms, it is essen- tested on 4581 popular Docker Hub images, ConfEx achieves over 98% precision and recall in identifying configuration files, tial to extract configuration information from cloud system and consistently improves the efficacy of misconfiguration detec- instances (i.e., images, VMs, and containers) without losing tion through outlier analysis as well as syntactic configuration any information that is crucial for detecting errors. This validation. is challenging because cloud instance contents are largely unlabeled. One needs to discover which files are configuration I. INTRODUCTION files and also figure out to which applications these files be- Cloud software is complex and highly customizable. To long. Furthermore, cloud software configurations are typically function correctly, securely, and with high performance, cloud stored in loosely-structured text files where each software has applications often depend on precise tuning of hundreds of its own custom configuration syntax. For effective statistical configuration parameters [1]. In typical cloud services that analysis, the information extracted from these files needs to consist of multi-tiered software stacks, ensuring the desired be represented in a consistent format that allows comparison operation often requires correctly configuring thousands of of individual configuration parameters across a large number parameters [2]. of cloud instances. Errors in software configurations have been reported as In this work, we propose ConfEx, a novel software con- causes of service disruptions and outages at Facebook [3], figuration analytics framework that enables robust analysis LinkedIn [4], Microsoft Azure [5], Amazon EC2 [6], and of loosely-structured text-based configurations in multi-tenant Google [7]. Moreover, the affordability offered by the cloud cloud platforms and image repositories. ConfEx discovers and the prevalence of open-source software have enabled new configuration files of known applications in cloud instances levels of agility, where small teams of developers can deliver and parses these files to produce consistent configuration data new cloud services and functionality in short periods of time. for corpus-based analysis. We demonstrate two use cases of This newfound agility has led to a trend where service devel- ConfEx on a corpus of 4581 popular Docker Hub images: (1) opers and operators may lack the expertise needed to precisely detecting injected misconfigurations through outlier analysis tune all software components of a multi-tier architecture. As a and (2) syntactic configuration validation. Our contributions result, misconfigurations have become one of the lead causes can be summarized as follows: of cloud software failures [8], [9], [10]. • We design and implement ConfEx, a configuration an- TABLE I alytics framework that enables discovery and extraction COMMON CONFIGURATION ERROR TYPES AND EXAMPLE CONSTRAINTS of consistent configuration data and robust configuration THAT LEAD TO ERRORS UPON VIOLATION. analysis in multi-tenant cloud platforms. We demonstrate that ConfEx enables the use of existing configuration Error type Example configuration constraint analysis tools, which are designed for key-value pairs, In PostgreSQL, parameter values that are not simple with text-based software configurations in the cloud. Illegal entries identifiers or numbers must be single-quoted. • As part of our framework, we develop a vocabulary-based Variables must be in certain types (e.g., float). configuration file discovery technique to identify text- In PHP, mysql.max_persistent must be no based software configuration files in cloud instances with larger than the max_connections in MySQL. unlabeled content. Our approach can identify application Inconsistent In Cloudshare, service’s redis.host entry entries configuration files with over 98% precision and recall. (an IP address) must be a substring of Nginx’s • We show that the outputs of existing configuration file upstream.msg.server entry (IP address:port). parsers often lack the consistency and robustness needed Invalid When using PHP in Apache, recode.so must for statistical analysis, and introduce a disambiguation ordering be defined before mysql.so. technique for parser outputs to resolve this problem. In MySQL, maximum allowed table size must be The rest of this work starts with an overview of configura- smaller than the memory available in the system tion analysis and management techniques in the cloud. Sec. II Environmental inconsistency In httpd, Apache user permissions must be set provides a background on configuration files and common correctly to enable file uploads for website visitors. misconfigurations. Section III gives the details of our proposed ConfEx framework. Section IV explains our experimental Missing In OpenLDAP, a configuration entry must include parameter ppolicy.schema to enable password policy. methodology, and Section V presents our experimental find- MySQL’s Autocommit parameter must be set to ings. Finally, we conclude in Section VII. Valid entries False to avoid poor performance under “insert” that cause intensive workloads. II. BACKGROUND ON TEXT-BASED CONFIGURATIONS performance In this section, we explain how cloud applications and ser- or security Debug-level logging must be disabled to avoid issues performance degradation. vices typically store their configurations. Then, we categorize common configuration errors to give some insight on the type of information required for effective configuration analysis. statement. While parsing these lines, one needs to retain the A. Text-based Configurations relational information between the parameters defined within Most cloud applications and system services store their the conditional statement, indicating that User and Group configurations in human-readable text files or in configuration belong to the IfModule unixd_module section. stores such as etcd and Windows registry. We focus on text In some configuration files, the file schema is not embedded file based configurations as this type of storage is prevalent in the file itself and requires domain knowledge to understand. for many of the building blocks of cloud applications (e.g., One such example is the Linux filesystem configuration file MySQL, Nginx, and Redis). (/etc/fstab), which defines available filesystems and their Figure 1 shows a snippet from an Apache

Load more