Four Languages and Lots of Macros: Analyzing Autotools Build Systems

Four Languages and Lots of Macros: Analyzing Autotools Build Systems Jafar M. Al-Kofahi Suresh Kothari Christian Kästner Electrical and Computer Engineering Electrical and Computer Engineering School of Computer Science Department Department Carnegie Mellon University Iowa State University Iowa State University Abstract maintenance burden [22, 39], making build systems more de- Build systems are crucial for software system development. fect prone, and would make migration more challenging [40]. However, there is a lack of tool support to help with their Comprehension is a challenge as build tools tend to use do- high maintenance overhead. GNU Autotools are widely used main specific, powerful, and complex languages. in the open-source community, but users face various chal- Generator-based build tools can be particularly challeng- lenges from its hard to comprehend nature and staging of ing to understand as they have multiple stages involved in multiple code -generation steps, often leading to low quality the build process, where each stage generates build artifacts and error-prone build code. In this paper, we present a plat- to be executed in a later stage during the build. For exam- form, AutoHaven, to provide a foundation for developers ple, in GNU Autotools the M4 preprocessor creates a shell to create analysis tools to help them understand, maintain, script that subsequently generates a Makefile from a tem- and migrate their GNU Autotools build systems. Internally it plate, which is eventually executed by make. Such staged uses approximate parsing and symbolic analysis of the build build process adds to the system’s complexity. Such systems logic. We illustrate the use of the platform with two tools: are not easy to analyze, neither for humans nor machines. ACSense helps developers to better understand their build In this work, we target GNU Autotools [7] as an instance of systems and ACSniff detects build smells to improve build particularly difficult generator-based build tools that requires code quality. Our evaluation shows that AutoHaven can sup- an analysis of multiple languages and their interactions. In port most GNU Autotools build systems and can detect build contrast to prior tools for maintaining and understanding smells in the wild. build systems that focused mostly on specific tasks [19– 21, 31, 41], we aim to build a generic analysis infrastructure CCS Concepts • Software and its engineering → Soft- that can be used for various maintenance tasks. We demon- ware maintenance tools; strate our infrastructure by providing two tools for build comprehension and build-smell detection. Keywords build-system, GNU Autotool, Autoconf, build Both build comprehension and build-smell detection ad- maintenance dress real issues for users of Autotools. We noticed that ACM Reference format: Autotools users frequently complain about maintaining and Jafar M. Al-Kofahi, Suresh Kothari, and Christian Kästner. 2017. making changes to their Autoconf scripts. Developers face Four Languages and Lots of Macros:. In Proceedings of 16th ACM challenges in understanding their build system, specifically SIGPLAN International Conference on Generative Programming: Con- their configuration script and how it fits into the bigpic- cepts and Experiences, Vancouver, Canada, October 23–24, 2017 (GPCE’17), ture. A trial-and-error approach to build system maintenance 11 pages. leads to overly complicated and hard to maintain configu- https://doi.org/10.1145/3136040.3136051 ration scripts. In line with prior work [39], we identified that developers neglect to properly maintain their build system, causing various issues within their build system adding 1 Introduction to their maintenance overhead, such as inconsistencies be- Build systems are a crucial part of software development. tween user documentation and how is the build system im- Prior work has shown that build systems are complex [18, plemented. We also identified the existence of build smells, 28, 39], come with high maintenance overhead [29, 30], and which are issues related to the quality of the build system that are defect prone as they keep evolving with the software adds to its technical debt and complexity but does not break system [34]. Those studies emphasize the importance of build the build, such as the existence of unused variables, depen- systems and call for better tool support to help developers dencies. These build smells exists within specific parts of the understand, maintain, and migrate their build systems. build script but can also span multiple languages, including For the developers to properly maintain or migrate their configuration scripts, build logic, and the target code. build systems, they first need to have a solid understand- We approach the analysis of generator-based build sys- ing of their build system mechanics and how it builds the tems written with Autotools by attempting to parse and software. A lack of understanding would lead to increased GPCE’17, October 23–24, 2017, Vancouver, Canada Jafar M. Al-Kofahi, Suresh Kothari, and Christian Kästner recognize common patterns in un-preprocessed build code 2.1 GNU Autotools (i.e., without running the generators). We subsequently sym- To understand our solution, we first need to introduce how bolically execute the configuration logic to identify options, GNU Autotools and their build systems work. Figure 1 shows their interactions, and their build actions on the remainder of the structure of such build systems. For a given system, the the build process. A build action is any action executed (i.e., developers provide Makefile.am Automake files, to describe check for dependency, generate artifacts) to accomplish the how to build the system at a high level; they also provide a build process. For build-smell detection, we further extract configure.ac Autoconf file to describes the system’s external the targets of build actions in templates for the build logic dependencies and the build-system configurations. These and in the target C code itself and define analyses for build declared configurations can alter the build process to include smells that compare the various extracted information. We or exclude system features, at the file level within the build package our analysis infrastructure providing a structural script, or at a more granular level in the source code. representation of the build system created by parsing and GNU Automake processes the Makefile.am files to iden- symbolic analysis as a tool platform called AutoHaven. We tify the source files to be built, and then it generates Make- further explain how the extracted information can be used to files.in templates that hold the build logic for the system. support developers in build comprehension tasks and build a Special placeholders annotate these templates, which are tool for build-smell detection. We evaluate infrastructure on later used to adjust the build logic according to the selected ten real-world build system written with GNU Autotools and build configurations; these placeholders are called substitu- identify that our approximation is practical and useful for tion variables. They are used by the configuration script to build analysis. Our evaluation shows that we can correctly pass configuration related settings (i.e. which file to include) capture the configuration actions and detect build smells in and environment configurations (i.e., which compiler to use, real build systems. compiler flags). Overall, we contribute: The Autoconf configure.ac script is written using GNU M4 macros [12], shell scripting, and can also have snippets of • An analysis infrastructure for generator-based build C, C++, or Perl programming languages to check for certain systems written in GNU Autotools that extracts infor- libraries or features in the environment. The M4 language mation from configuration scripts with approximate is macro based, where each macro expands to a predefined parsing and symbolic execution of the configuration snippet of text; in the Autoconf case, these M4 macros are ex- logic. panded to snippets of shell scripts. GNU Autoconf processes • A comprehension tool support for developers to better the configure.ac file, expand the M4 macros and generate understand their GNU Autotools build systems. the configure shell script file. The configure script consumes • A definition of build smells, and a detection tool that the Makefile.in templates and generates concrete Makefiles. identifies build smells, including issues that span mul- For example, a template would have the following line: CC tiple languages. = @CC@. This variable holds the C compiler command for • An evaluation of our infrastructure on ten real-world the build process. When the configure script is executed, it build systems, demonstrating accuracy despite approx- identifies the default C compiler, then substitute the @CC@ imations and the ability to identify real build-smells placeholder in the templates with the C compiler command in the wild. (e.g., CC = gcc). In GNU-Autotools-based build systems, developers can This paper extends a prior workshop paper on the Auto- control what to compile from the source code at file level Haven infrastructure [22]. In that workshop paper, we ar- with the Makefiles, or on a more granular level using con- gued for the need of such an infrastructure and sketched a ditional compilation. In conditional compilation, a block of possible solution. We also surveyed criticism of Autotools code would have an associated condition, and if the condition users in practice and presented the first version of our pars- is met then the block of code is included in the source code ing approach for configuration logic. In this paper, we extend compilation; otherwise, it gets excluded. For conditional com- our prior work by adding a symbolic analysis of the config- pilation, developers use C-Preprocessor macros to surround uration logic, by building two tools (comprehension and blocks of code with CPP control constructs (i.e., #ifdef’s) and build-smell detection) on top of our infrastructure, and by declare the condition using CPP macros. evaluating our approach with ten real-world build systems.

Four Languages and Lots of Macros: Analyzing Autotools Build Systems

Source Code Trees in the VALLEY of THE

Red Hat Enterprise Linux 6 Developer Guide

GNU Build System

The GNOME Desktop Environment

Why and How I Use Lilypond Daniel F

Using GNU Autotools to Manage a “Minimal Project”

The Starlink Build System SSN/78.1 —Abstract I

Project1: Build a Small Scanner/Parser

Autoconf.Pdf

Autotools Tutorial

GNU M4, Version 1.4.7 a Powerful Macro Processor Edition 1.4.7, 23 September 2006

1 What Is Gimp? 3 2 Default Short Cuts and Dynamic Keybinding 9