Auto-configuration by File Construction: Configuration Management with Newfig
William LeFebvre and David Snyder – CNN Internet Technologies
2004 LISA XVIII – November 14-19, 2004 – Atlanta, GA

ABSTRACT

A tool is described that provides for the automatic configuration of systems from a single description. The tool, newfig, uses two simple concepts to provide its functionality: boolean logic for making decisions and file construction for generating the files. Newfig relies heavily on external scripts for anything beyond the construction of files. This simple yet powerful design provides a mechanism that can easily build on other tools rather than a single monolithic stand-alone program. This provides for a great deal of flexibility while maintaining simplicity. The language is a combination of boolean logic and output statements, and also provides for macros and other essential elements. All output is written to channels: an abstraction which provides for extensive configurability. Examples are provided that show the tool’s power and flexibility. A description is also provided of the efforts undergone at CNN to integrate this tool in to a sizable infrastructure. The paper concludes with a discussion of future improvements.

Introduction

As the number of systems in an infrastructure grows, the administration problem grows with it. Unless the engineering staff grows accordingly, at some point the management of the systems will need to be automated. This realization is nothing new, and several successful tools have been developed to meet this need.

Configuration data used by a system to control how it behaves can be broken down into three basic categories: information which is unique to a system, information which is common to a proper subset of systems, and information which is common to all systems in the infrastructure. Unique information includes a system’s hostname, local disk partitions, and network address assignments. Global information includes such files as /etc/services and /etc/networks. The partially global information is the most interesting: it is common across systems which share a similar function but not necessarily others. The task of configuring a system requires that all of this information be in place and correct. The most complicating factor in achieving a correct configuration is in the management of files which combine data from more than one category.

A file that only contains unique data can be copied into place when a system is installed. Such files are rarely touched after system installation. Files that contain only global information can be easily updated from a central repository. Likewise, files that contain partially global information can be updated from a central authority provided there is some mechanism that distinguishes among the different classes of systems and is able to determine which data are appropriate for the system. However, when a file contains a mixture of these data categories, its maintenance becomes significantly more difficult. This is most noticeable in files such as fstab, inetd.conf, hosts.allow, rc script directories, and sometimes passwd.

When we sought out a tool for automated systems configuration, we were looking for certain properties that we believe would be beneficial in our environment. We wanted the system to be idempotent, congruent, deterministic, transportable, extensible, and fail-safe. We wanted a specification language that was both clear and concise, to minimize training and maximize understanding.

When a tool could not be found that met all of these criteria, we set out to design a new solution to the problem. The result, newfig, takes a new approach to the problem while providing all of the qualities that we believe are important. As a result, very little is built in to newfig. Instead it is a framework which can utilize other tools and scripts to accomplish results. Rather than provide a wide range of built-in mechanisms, newfig uses file construction as its only primitive operation. Its configuration is a purely declarative language based on boolean logic. The files that newfig constructs, called channels, can be used to replace existing files on the system or as input to external programs (including scripts). As a result, the system is naturally extensible. Newfig is used to generate input for and monitor the execution of the programs and scripts which perform the actual modifications to the system. It provides a structure around which system administrators can do what they do best: automate through scripting. We believe that the resulting system meets all our initial design goals and provides us with an excellent platform for automated configuration.

Related Work

Prescriptions [8] is a declarative language for describing the desired state of configuration for distributed systems. It provides mechanisms for specifying operations that may be used to bring systems into conformance with a specification. However, it does not seem to provide mechanisms for file distribution and synchronization. We have not been able to find recent work on this project since Thornton’s 1994 technical report.

CFengine [1] is the most widely known work in this area. It is to automated systems configuration what awk(1) is to scripting. The main concepts are patterns, actions, and an execution context that is managed by an interpreter. It provides a rich body of built-in actions, but has little room for expansion beyond that set. The configuration drives actions to perform on a system, whereas newfig provides a description of the desired target for the system and enforces conformity to that description. CFengine implements convergence by providing a mechanism that brings a system closer to an ideal. In our environment we sought a tool that implements congruence, which is a tighter standard than convergence. The idea of file construction is not central to CFengine, and a CFengine configuration that uses such a methodology tends to be cumbersome. CFengine is also not able to remove changes implicitly: such steps must be explicitly given in the configuration.

Psgconf [6] takes a highly modular approach to configuration management, providing hooks for various data stores, policy rules, and actions. The configuration parsing is order dependent, making psgconf a procedural configuration tool rather than declarative, which has some advantages and some drawbacks.

Site [3] uses declarative statements to describe the configuration of a computing site at three levels of abstraction. At the lowest levels, drivers written in C are used to construct the contents of configuration files. The paper describes a prototype implementation only. In contrast, newfig provides no restrictions on the language used to construct channels, which is good for admins who have long since shed their systems programming scales (or never had them to begin with).

PIKT [5] is an interpreted scripting language, preprocessor, and scheduler that is primarily intended to monitor systems, reporting problems and taking corrective action when possible. Over time, it has been extended to include configuration management features, but most of the terminology in the language is built around monitoring. For example, scheduling periodic execution of a script involves adding it to the "alarms" section of the alerts.cfg file.

Radmind [2] takes a file based approach to configuration management by integrating intrusion detection with centralized system management.

[...] changes is provided to the tool, and it ensures that those changes are carried out on each system in an exact order. ISconf version 2 requires that the description be created and maintained manually, whereas later versions provide more automated ways for generating the description. The basic premise of ISconf is that "order matters": it is easier to replicate the order in which operations are conducted than it is to determine and accommodate the interactions of those operations. Like newfig, ISconf implements congruence. The difficulty with ISconf is the monotonic increase to the description and the steps required to recreate a system. As changes are piled on top of changes, the time required to build or rebuild a system continues to increase. Although preservation of the order of changes is sufficient to achieve congruence, it is our belief that it is not necessary.

Design

Newfig is a system designed to provide for the automatic configuration of individual machines from a common description. Boolean algebra is used to control the generation of output to a number of channels. Each channel can be used to control the contents and characteristics of a file. Channels can also be used as input to scripts for operations which are more complicated than basic file construction. All channel definitions are part of the configuration, allowing the functionality of newfig to be extended with ease. Newfig is designed to be idempotent, transportable, extensible, conformant (rather than convergent), and fail-safe.

The configuration consists of a series of boolean phrases interspersed with output statements. Boolean algebra is used as the logical structure for the newfig configuration language. Clauses are used to infer the logical value of a symbol from other symbols. The algebra supports the three basic logical operations: and, or, not. Parentheses are also recognized for grouping operations. Between the boolean clauses are statements that send lines to channels. Each channel must be explicitly defined in the configuration along with its characteristics. A channel can be associated with a file, in which case its contents becomes that of the file. A channel can also be associated with an external command or script, in which case the script is used to process the channel’s contents. External commands are also used to perform syntax and semantic checks of channels’ content to ensure correctness.

Processing is performed in several distinct phases in newfig: read, intrinsic definition, inference,