Version Models for Software Configuration Management REIDAR CONRADI Norwegian University of Science and Technology, Trondheim AND BERNHARD WESTFECHTEL Aachen University of Technology

After more than 20 years of research and practice in software configuration management (SCM), constructing consistent configurations of versioned software products still remains a challenge. This article focuses on the version models underlying both commercial systems and research prototypes. It provides an overview and classification of different versioning paradigms and defines and relates fundamental concepts such as revisions, variants, configurations, and changes. In particular, we focus on intensional versioning, that is, construction of versions based on configuration rules. Finally, we provide an overview of systems that have had significant impact on the development of the SCM discipline and classify them according to a detailed taxonomy.

Categories and Subject Descriptors: D.2.2 [Software Engineering]: Tools and Techniques—computer-aided software engineering; D.2.6 [Software Engineering]: Programming Environments; D.2.9 [Software Engineering]: Management—software configuration management; H.2.3 [Database Management]: Languages—database (persistent) programming languages; H.2.8 [Database Management]: Database Applications; I.2.3 [Artificial Intelligence]: Deduction and Theorem Proving—deduction, logic programming General Terms: Languages, Management Additional Key Words and Phrases: Changes, configuration rules, configurations, revisions, variants, versions

1. INTRODUCTION Maturity Model (CMM) developed by Software configuration management the Software Engineering Institute (SCM) is the discipline of managing the (SEI) [Humphrey 1989; Paulk et al. evolution of large and complex software 1997]. CMM defines levels of maturity systems [Tichy 1988]. The importance of in order to assess software development SCM has been widely recognized, as processes in organizations. Here SCM is reflected in particular in the Capability seen as one of the key elements for

This work was partially performed during a sabbatical in which the second author spent August through October 1995 in Trondheim. Support from NTNU is gratefully acknowledged. Authors’ addresses: R. Conradi, Department of Computer and Information Science, Norwegian Univer- sity of Science and Technology (NTNU), N-7034 Trondheim, Norway; B. Westfechtel, Department of Computer Science III, Aachen University of Technology, D-52056 Aachen, Germany. Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. © 1998 ACM 0360-0300/98/0600–0232 $05.00

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 233

CONTENTS control (by establishing strict proce- 1. INTRODUCTION dures to be followed when performing 2. PRODUCT SPACE a change), status accounting (record- 2.1 Software Objects 2.2 Relationships ing and reporting the status of compo- 2.3 Representations of the Product Space nents and change requests), and audit 3. VERSION SPACE and review (quality assurance func- 3.1 Versions, Versioned Items, and Deltas tions to preserve product consistency). 3.2 Extensional and Intensional Versioning Thus SCM is seen as a support disci- 3.3 Intents of Evolution: Revisions, Variants, and Cooperation pline for project managers. 3.4 Representations of the Version Space: Version —As a development support discipline, Graphs and Grids SCM provides functions that assist 3.5 State-Based and Change-Based Versioning 4. INTERPLAY OF PRODUCT SPACE AND VERSION developers in performing coordinated SPACE changes to software products. This 4.1 AND/OR Graphs view of SCM is described, for exam- 4.2 Granularity of Versioning ple, in the textbook by Babich [1986]. 4.3 Deltas To support developers, SCM is in 4.4 Relations Between Version Model and Data Model charge of accurately recording the 5. INTENSIONAL VERSIONING composition of versioned software 5.1 Problem: Combinability Versus Consistency products evolving into many revisions Control and Manageability and variants, maintaining consis- 5.2 Conceptual Framework for Intensional Version- ing tency between interdependent compo- 5.3 Configuration Rules nents, reconstructing previously re- 5.4 Configurators: Tools for Evaluating Configura- corded software configurations, tion Rules building derived objects (compiled 5.5 Tools code and executables) from their 6. VERSION MODELS IN SCM SYSTEMS 6.1 Overview sources (program text), and construct- 6.2 Taxonomy-Based Classification ing new configurations based on de- 6.3 Descriptions of SCM Systems scriptions of their properties. 7. RELATED WORK 7.1 Related Work on Version Models In this article, SCM is primarily con- 7.2 Related Disciplines 8. CONCLUSION sidered a development support disci- pline. We provide an overview of version models implemented both in commercial systems and research prototypes. A ver- moving from “initial” (undefined pro- sion model defines the objects to be ver- cess) to “repeatable” (project manage- sioned, version identification and orga- ment, SCM, and quality assurance have nization, as well as operations for come into operation). Furthermore, retrieving existing versions and con- SCM plays an important role in achiev- structing new versions. Software objects ing ISO 9000 conformance. and their relationships constitute the SCM serves different needs [Feiler product space, their versions are orga- 1991a]. nized in the version space. A versioned object base combines product and ver- —As a management support discipline, sion space. A specific version model is SCM is concerned with controlling characterized by the way the version changes to software products. It is space is structured, by the decision of this view of SCM that is addressed in which objects are versioned both exter- the classical textbook by Bersoff et al. nally (from the user’s point of view) and [1980] and the IEEE standard [IEEE internally (within the versioned object 1983; IEEE 1988]. According to the base), by the relationships among ver- latter, SCM covers functionalities sion spaces of different objects, and by such as identification of product com- the way reconstruction of old and con- ponents and their versions, change struction of new versions are supported.

ACM Computing Surveys, Vol. 30, No. 2, June 1998 234 • R. Conradi and B. Westfechtel

SCM systems express version models is to give definitions and introduce a in varying ways. Many systems, includ- taxonomy. Furthermore, it provides a ing most commercial ones [Rigg et al. survey of the current state of the art by 1995], are file-based and apply version- describing SCM systems in a unified ing to files and directories [Rochkind terminology and classifying them ac- 1975; Tichy 1985; Fowler et al. 1994]. cording to the taxonomy. In this way, it Various language-based approaches prepares the ground for developing a have been developed as well based on uniform version model, that is, a com- modular programming languages [Lamp- mon framework in which specific ver- son and Schmidt 1983b], module inter- sion models may be expressed. A uni- connection languages [Prieto-Diaz and form model would assist developers not Neighbors 1986], or system modeling only in constructing SCM systems but languages [Marzullo and Wiebe 1986]. also in tailoring them more flexibly to These languages are typically used to the needs of their users. Multiple para- represent versions of modules and rela- digms could be supported in parallel, tionships such as imports, include de- allowing users to switch back and forth pendencies, and the like. Finally, sev- as required. Furthermore, as noted in eral SCM systems are founded on Brown et al. [1991], a uniform model databases and manage versions of ob- would constitute a common foundation jects and relationships stored in the da- for integrating heterogeneous SCM sys- tabase. Different data models have been tems in a federated architecture. used, including EER [Dittrich et al. In the interest of a thorough discus- 1986; Oquendo et al. 1989], object-ori- sion, we focus on core issues of version- ented [Estublier and Casallas 1994], ing, namely, the organization of the ver- and deductive [Zeller and Snelting sion space, the interrelations of product 1995] ones. space and version space, and the con- The variety of formalisms makes it struction of consistent configurations. difficult to compare the version models Other issues considered essential parts of different SCM systems with one an- other. In addition, each system comes of SCM are only discussed briefly, in with its own terminology. On the other particular, management of workspaces, hand, the underlying concepts are often construction of derived objects, coopera- very similar. In order to reveal these tion, and distribution: all these issues concepts, we introduce a unified termi- are related to version management, but nology. Furthermore, to describe ver- elaborating on them goes beyond the sion models in a uniform, “canonical” scope of this article. formalism, we use graphs at many The article is structured as follows. places in the article. A graph consists of Before introducing versions, the product nodes and edges representing entities space is described in Section 2. Subse- and (binary) relationships, both of quently, we discuss the version space which may be decorated with attributes. without making any assumptions about Graphs are well suited to represent the the product space (Section 3). The inter- organization of a versioned object base, play of product and version space is even if the corresponding system is not addressed in Section 4. Section 5 is de- graph-based. For example, SCCS [Roch- voted to intensional versioning (i.e., con- kind 1975] and RCS [Tichy 1985] are struction of versions based on rules de- both file-based, but the version space of scribing consistent combinations). a text file may be represented naturally Section 6 provides an overview of sys- as a version graph. Other formalisms tems that have had significant impact are used as required, such as textual on the development of the SCM disci- languages for expressing configuration pline. Related work is discussed in Sec- rules. tion 7, and a short conclusion is given in The main contribution of this article Section 8.

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 235

2. PRODUCT SPACE following, we represent the contents of software objects in long attributes at- The product space describes the struc- tached to nodes of the product graph. ture of a software product without tak- Software objects may have different ing versioning into account (in other representations, depending on the types words, we assume only one version of a of tools operating on them. In toolkit software product). The product space environments, software objects are can be represented by a product graph stored as text files [Rochkind 1975]. In whose nodes and edges correspond to contrast, syntax trees [Habermann and software objects and their relationships, Notkin 1986] or graphs [Nagl 1996] are respectively. Different version models used in structure-oriented environ- vary in their assumptions with respect ments. As discussed later, these repre- to the product space. These differences sentations influence the functionality of refer to the types of software objects an SCM system. For example, a and relationships, to the granularity of command for comparing two versions of object representations, and to the se- a program module returns differing text mantic levels of modeling. lines in the case of text files and differ- ing syntactic units in the case of syntax 2.1 Software Objects trees, respectively. Independently of the representation A software object records the result of a chosen for software objects, we may dis- development or maintenance activity. tinguish between domain-independent An SCM system has to manage all kinds and domain-specific models of the prod- of software objects created throughout uct space. Domain-independent models the software life cycle, including re- make no assumptions about the types of quirements specifications, designs, doc- software objects to be maintained. All umentations, program code, test plans, software objects produced throughout test cases, user manuals, project plans, the whole software lifecycle project are and the like. subject to [Tichy 1985]. Identification is an essential function Domain-specific models are tailored to- provided by SCM. Thus, each software wards specific types of software objects object carries an object identifier (OID) (e.g., abstract data types in algebraic that serves to identify it uniquely specifications) [Ehrig et al. 1989]. within a certain context. An external OID is a name assigned by the user, 2.2 Relationships whereas a system-generated, unique OID may be used internally. Software objects are connected by vari- Software objects are rather coarse- ous types of relationships. Composition grained units that are structured inter- relationships are used to organize soft- nally. For example, a program module is ware objects with respect to their gran- composed of declarations and state- ularity. For example, a software product ments, and a documentation consists of may be composed of subsystems, which sections and paragraphs. Thus a soft- in turn consist of modules. Objects that ware object is composed of more fine- are decomposed are called composite ob- grained units. jects or configurations. Objects residing The data model used for representing at the leaves of the composition hierar- the product space may or may not dis- chy are denoted atomic objects. Note tinguish explicitly between coarse- that an “atomic” software object is still grained and fine-grained units. For structured internally; that is, it has a example, file systems make this distinc- fine-grained content. The root of a com- tion, whereas many object-oriented data position hierarchy is called the (soft- models represent coarse- and fine- ware) product. grained units in a uniform way. In the The semantics of composite objects

ACM Computing Surveys, Vol. 30, No. 2, June 1998 236 • R. Conradi and B. Westfechtel varies significantly across different object depends on the available tool sup- modeling approaches. Furthermore, a port. Furthermore, software objects may specific approach may support customi- be partially derived and be partially zable semantics. As a least common de- constructed manually. For example, the nominator, a composite object is defined skeleton of a module body may be cre- as an object o that represents a sub- ated automatically from its interface graph of the product graph. All objects and subsequently filled in by a pro- that are transitively reachable from o grammer. via composition relationships belong to The process of creating derived ob- this subgraph. There may be structural jects from source and other derived ob- constraints with respect to the composi- jects is called system building. The ac- tion hierarchy (trees, DAGs), and the tions to be performed are specified by existence of a component may depend build rules. The build tool has to ensure on the existence of its superobject(s). that build steps corresponding to these Furthermore, there may be constraints rules are executed in the correct order; with respect to long attributes (e.g., that is, build dependencies must be long attributes may be attached only to taken into account. In contrast, source leaves of the composition hierarchy). Fi- dependencies represent relationships nally, a composite object may act as a between source objects (e.g., lifecycle unit with respect to structural opera- dependencies as mentioned previously). tions (e.g., copy or delete), abstraction (encapsulation of components), concur- 2.3 Representations of the Product Space rency control (locking of a composite object includes locking of its compo- Figure 1 illustrates different represen- nents), and version control. tations of a sample software product foo Dependency relationships (simply which is implemented in the program- called dependencies in the following) es- ming language C. Part (a) shows the tablish directed connections between ob- modules of foo and their import depen- jects that are orthogonal to composition dencies. The top-level module main im- relationships. They include, for exam- ports from a and b, which both import ple, lifecycle dependencies between re- from c. foo may be represented in differ- quirements specifications, designs, and ent ways. Some examples are given in module implementations, import or in- (b), (c), and (d): clude dependencies between modules, and build dependencies between com- —In (b), foo is stored in the file system. piled code and source code. Each module is represented by multi- The source and the target of a depen- ple files. The suffixes .h, .c, .o, and dency correspond to a dependent and a .exe denote header files, body files, master object, respectively. A depen- compiled code, and executables, re- dency implies that the contents of the spectively. Dependencies and build dependent must be kept consistent with steps are stored in a text file (the the contents of the master. Thus the system model sys, e.g., a make file dependent may have to be changed [Feldman 1979]). when the master is modified. —In (c), we assume a data model that Software objects are further classified supports typed objects and relation- into source objects and derived objects. A ships (e.g., an EER model as used in source object is created by a human who PCTE [Oquendo et al. 1989]). As in is supported by interactive tools, for ex- the file system representation, there ample, text editors or graphical editors. is still a composition tree whose A derived object is created automati- leaves correspond to single files. How- cally by a tool, for example, a compiler ever, dependencies are not repre- or linker. Note that the classification of sented in a separate text file. Rather, a software object as a source or derived the tree is augmented with relation-

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 237

Figure 1. Different representations of a software product.

ships reflecting include dependencies. ules.1 This organization, which has Build dependencies are not given ex- been realized, for example, in POEM plicitly because they can be computed [Lin and Reiss 1995], corresponds di- automatically from source dependen- rectly to the logical structure dis- cies and composition relationships. played in (a). —In (d), there is no longer a spanning tree, and all files making up a module are summarized in one object. Fur- thermore, only a single type of rela- 1 This relationship may be annotated by an at- tionship is used, which represents tribute that distinguishes between include depen- source dependencies between mod- dencies emanating from header and body.

ACM Computing Surveys, Vol. 30, No. 2, June 1998 238 • R. Conradi and B. Westfechtel

3. VERSION SPACE under version control. In contrast, only one state is maintained for an unver- A version model defines the items to be sioned item; that is, changes are per- versioned, the common properties formed by overwriting. Versioning re- shared by all versions of an item, and quires a sameness criterion; that is, the deltas, that is, the differences be- tween them. Furthermore, it deter- there must be some way to decide mines the way version sets are orga- whether two versions belong to the nized. To this end, it introduces same item. This decision can be per- dimensions of evolution such as revi- formed with the help of a unique identi- sions and variants, it defines whether a fier, for example, an OID in the case of version is characterized in terms of the software objects. state it represents or in terms of some Within a versioned item, each version changes relative to some baseline, it must be uniquely identifiable through a selects a suitable representation for the version identifier (VID). Many SCM sys- version set (e.g., version graphs), and it tems automatically generate unique also provides operations for retrieving version numbers and offer additional old versions and constructing new ver- symbolic (user-defined) names serving sions. as primary keys. However, a version The characterization of version mod- can also be identified by an expression, els given in this section is still incom- which is the identification scheme used plete. Although the previous section de- by intensional versioning. scribed the product space without All versions of an item share common taking versioning into account, the cur- properties called invariants. These in- rent section conversely focuses on the variants can be represented, for exam- version space, abstracting from the ple, by unversioned attributes or rela- product space. Thus we are not con- tionships. Which invariants are shared cerned with the kinds of items put un- by versions depends on the specific ver- der version control, and we also con- sion model or the way it is customized sider versions of a single item only. to a certain application. At one end of However, a version model needs to ad- the spectrum, versions virtually share dress the interplay between product only a common OID. For example, in space and version space as well (Section systems such as SCCS and RCS ver- 4). sions of a text file may differ in arbi- trary ways. At the other end of the spectrum, versions must share semantic 3.1 Versions, Versioned Items, and Deltas properties. For example, version control A version v represents a state of an in algebraic specification [Ehrig et al. evolving item i. v is characterized by a 1989] enforces that all versions of a pair v ϭ (ps, vs), where ps and vs module body realize the shared inter- denote a state in the product space and face. a point in the version space, respec- Versions differ with respect to specific tively. The term item covers anything properties (e.g., represented by ver- that may be put under version control, sioned attributes). The difference be- including, for example, files and directo- tween two versions is called a delta. ries in file-based systems, objects stored This term suggests that differences in object-oriented databases, entities, should be small compared to invariants. relationships, and attributes in EER da- Delta can be defined in two ways (Fig- tabases, and so on. Versioning can be ure 2): applied at any level of granularity, ranging from a software product down —a symmetric delta between two ver- to text lines. sions v1 and v2 consists of properties گ A versioned item is an item that is put specific to both v1 and v2 (v1 v2 and

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 239

Figure 2. Deltas: (a) symmetric; (b) directed.

گ گ v2 v1, respectively, where denotes sion control is heavily influenced by the set difference); or way V is defined. Extensional versioning —a directed delta, also called a change, means that V is defined by enumerating is a sequence of (elementary) change its members: operations op1...opm which, when V ϭ ͕v ,...,v ͖. applied to one version v1, yields an- 1 n other version v2 (note the correspon- dence to transaction logs in databases). Extensional versioning supports re- In practice, deltas are not necessarily trieval of previously constructed ver- small. In the worst case, the common sions (which is a necessary requirement to any version model). All versions are part of v1 and v2 may even be empty. In fact, items may undergo major changes, explicit and have been checked in once and the common properties may become before. Each version is typically identi- smaller and smaller the more versions fied by a unique number. The user in- are created. For example, it is usually teracting with the SCM system re- unrealistic to assume that all versions trieves some version vi, performs of module bodies realize the same inter- changes on the retrieved version, and face (this assumption is made, e.g., in finally submits the changed version as a the algebraic approach already cited new version viϩ1. To ensure safe re- [Ehrig et al. 1989] and in the Gandalf trieval of previously constructed ver- system [Kaiser and Habermann 1983]). sions, versions can be made immutable. On the other hand, common properties In many systems, all versions are made do have to be asserted because other- immutable when they are checked into wise it does not make sense to group the object base [Rochkind 1975]; in oth- versions at all. ers, explicit operations are provided to A way out of this dilemma is multi- freeze mutable versions [Westfechtel level versioning; that is, a version may 1996]. Furthermore, immutability may have versions themselves. For example, be enforced selectively (e.g., by distin- in Adele [Estublier 1985] a module has guishing between mutable and immuta- multiple versions of interfaces each of ble attributes [Estublier and Casallas which is realized by a set of body ver- 1995]). sions. DAMOKLES [Dittrich et al. 1986] Intensional versioning is applied generalizes this idea and supports re- when flexible automatic construction of cursive versioning; that is, any version consistent versions in a large version may be versioned in turn. space needs to be supported. Instead of 3.2 Extensional and Intensional Versioning enumerating its members, the version set is defined by a predicate: A versioned item is a container for a set V of versions. The functionality of ver- V ϭ ͕v͉c͑v͖͒.

ACM Computing Surveys, Vol. 30, No. 2, June 1998 240 • R. Conradi and B. Westfechtel

In this case, versions are implicit and 3.3 Intents of Evolution: Revisions, many new combinations are constructed Variants, and Cooperation on demand. The predicate c defines the Versioning is performed with different constraints that must be satisfied by all intents. A version intended to supersede members of V. A specific version v is its predecessor is called a revision. Revi- described intensionally by its properties sions evolve along the time dimension (e.g., the Unix version supporting the and may be created for various reasons, X11 window system). A version is con- such as fixing bugs, enhancing or ex- structed in response to some query. For tending functionality, adapting to example, such a query may simply con- changes in base libraries, and the like. sist of a tuple of attribute values (which Instead of performing modifications by may be considered the VID of the ver- overwriting, old revisions are preserved sion). In this case, a query corresponds to support maintenance of software de- to a (partial or total) function q that livered to customers, to recover from creates versions in V based on at- erroneous updates, and so on. tributes ranging over the domains Versions intended to coexist are A1...An: called variants. For example, variants 3 of data structures may differ with re- q : A1 x ...xAn V. spect to storage consumption, run-time efficiency, and access operations. Fur- Here the term “attribute” is used in a thermore, a software product may sup- general way; for example, attributes port multiple operating systems or win- may identify variants (e.g., an os at- dow systems. tribute determining the operating sys- Finally, versions may also be main- tem) or changes (e.g., a Boolean at- tained to support cooperation. In this tribute Fix to indicate whether a certain case, multiple developers work in paral- bug fix should be included or omitted). lel on different versions. Each developer The difference between extensional operates in a workspace [Estublier and intensional versioning may be illus- 1996] that contains the versions created trated by comparing SCCS [Rochkind and used. Cooperation policies regulate 1975] and RCS [Tichy 1985] to condi- when versions are exported from or im- tional compilation as, for example, sup- ported into a workspace. These issues ported with the C programming lan- are closely related to software process guage [Kernighan and Ritchie 1978]. management and in the following are SCCS and RCS store and reconstruct only discussed in passing (see also Sec- versions of text files (extensional ver- tion 7.2). sioning). The preprocessor used for con- ditional compilation constructs any source file based on the values of pre- 3.4 Representations of the Version Space: processor variables (intensional version- Version Graphs and Grids ing). All fragments of the source file are Many SCM systems use version graphs excluded whose conditions evaluate to for representing version spaces. A ver- false. sion graph consists of nodes and edges From SCCS/RCS and conditional com- corresponding to (groups of) versions pilation, SCM systems have been devel- and their relationships, respectively. oped that differ significantly in their Since a version graph assumes an ex- versioning capabilities. On the other plicitly given set of versions, it is ap- hand, it must be emphasized that exten- plied in conjunction with extensional sional and intensional versioning are by versioning. Despite the limitations dis- no means mutually exclusive, but can cussed in the following, they are widely (and should) be combined into a single used in practice. SCM system. Although a rich variety of version

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 241

Figure 3. Version graphs (one-level organization): (a) sequence; (b) tree; (c) acyclic graph. graphs is conceivable, the mainstream sions. Here at least two relationship of SCM systems is based on a small types are required, called successor number of graph types. (within a branch) and offspring (be- In the simplest case (one-level organi- tween branches) in Figure 4. This orga- zation), a version graph consists of a set nization is applied, for example, in RCS. of versions connected by relationships of ClearCase [Leblang 1994] goes beyond a single type, called successor relation- the RCS organization by recording ships. A version graph of this type pri- merges in the version graph. By means marily represents the evolution history of merging, changes performed on one of a versioned item. “v2 is a successor of branch can be propagated to another v1” means that v2 has been derived from branch. Essentially, this results in a v1, for example, by modifying a copy of directed acyclic graph. However, the v1. branches are not joined; rather, each of Version graphs may have different them continues to exist. shapes (Figure 3). In the most restric- Version graphs as presented previ- tive case, versions are organized into a ously support management of variants sequence of revisions. In a version tree, only to a limited extent. Variants can be successors of nonleaf versions may be represented by branches as long as created, for example, in order to main- their number is small. In the case of tain old versions that have already been multidimensional variation, this ap- delivered to a customer. In an acyclic proach breaks down because the num- graph, a version may have multiple pre- ber of branches explodes combinatori- decessors, for example, in order to ex- ally. Let us assume that each dimension press that a bug fix in an old version is is modeled by an attribute with domain merged with the currently developed Ai. Then the number of branches b is version. dominated by the product of the domain Several SCM systems use one of these cardinalities: different kinds of one-level organiza- Յ ͉ ͉ ͉ ͉ tion, for example, sequences in NSE b A1 ...An . [Adams et al. 1989] and acyclic graphs in PCTE [Oquendo et al. 1989]. To illustrate this, let us assume that DAMOKLES [Dittrich et al. 1986] sup- our sample product foo varies with re- ports user-defined structural con- spect to the operating system (DOS, straints. The structure of a version Unix, VMS), the database system (Ora- graph may be defined in the database cle, Informix, dbase), and the window schema as a sequence, a tree, or a di- system (X11, SunViews, Windows). In rected acyclic graph. this case, up to 27 branches would be In two-level organization, a version required. graph is composed of branches, each of This problem can be solved in the which consists of a sequence of revi- following ways.

ACM Computing Surveys, Vol. 30, No. 2, June 1998 242 • R. Conradi and B. Westfechtel

Figure 4. Version graphs (two-level organization).

—Version graphs may be generalized in based models, a version is described in order to support multidimensional terms of changes applied to some base- variation. In Figure 5(a), versions are line. To this end, changes are assigned organized into clusters that are used change identifiers (CID) and potentially to construct classification hierarchies further attributes to characterize the [Dittrich and Lorie 1988]. reasons and the nature of a change. —Alternatively, versions may be ar- Change-based versioning provides a ranged in a grid (Figure 5b), that is, nice link to change requests: a change an n-dimensional space whose dimen- request is implemented by a (possibly sions correspond to variant attributes composite) change. Thus a version may [Sciore 1994]. be described in terms of the change requests it implements. Figure 5 illustrates only the variant “State- versus change-based” is or- space, assuming that there is no evolu- thogonal to “extensional versus inten- tion along the time axis. Revisions can sional.” State-based extensional version- be represented in the grid by adding a ing is provided, for example, by SCCS time dimension (orthogonal version and RCS and state-based intensional management [Reichenberger 1994]). In versioning can be realized, for example, the case of the version graph, one level by conditional compilation: preprocessor may be added at the bottom and succes- variables can be used to represent vari- sor relationships may be introduced to ants, and a state may be specified by represent histories. tuples of values for these variables. However, we emphasize that in general 3.5 State-Based and Change-Based conditional compilation can be used for Versioning both state- and change-based versioning In Section 3.1, a version has been de- (the latter is done, e.g., in the COV fined as a state of an evolving item. system [Gulla et al. 1991]). Version models that focus on the states Change-based versioning comes in of versioned items are called state- two forms. In the case of change-based based. In state-based versioning, ver- extensional versioning, the version set is sions are described in terms of revisions defined explicitly by enumerating its and variants. members and each version is described Changes provide an alternative way by the changes relative to some base- of characterizing versions. In change- line. Thus changes are used only for

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 243

Figure 5. n-dimensional variant space: (a) version graph; (b) grid. documentation. The OVUM report versioning are the COV system [Gulla about SCM systems [Rigg et al. 1995] et al. 1991], PIE [Goldstein and Bobrow uses the term change package to denote 1980], DaSC [MacKay 1995], and As- this form of change-based versioning. gard [Micallef and Clemm 1996]. Several SCM systems support change The change space (version space packages by annotating versions in ver- structured in terms of changes) can be sion graphs with change identifiers represented in different ways. The COV (e.g., ClearCase and PCMS). system arranges versions in an n-di- In change-based intensional version- mensional grid called option space. ing, changes are combined freely to con- Each change corresponds to a Boolean struct new versions as required. There- option that is set to true (false)ifthe fore a change is considered a partial change is applied (omitted). The Aide- function c : V 3 V, where V denotes the de-Camp documentation introduces a set of all potential versions of some matrix representation as shown in Fig- item. A version v is constructed by ap- ure 6(a). Lines and columns correspond plying a sequence of changes c1 ...cn to to versions and changes, respectively. a baseline b: The application of a change is indicated by a circle at an intersection point. For v ϭ c o ...oc͑b͒ ϭ c ͑...c ͑b͒...͒. 1 n n 1 example, v1 is constructed by applying The OVUM report (and also the sur- c1, c2, and c3 in order. vey by Feiler [1991a]) adopts the termi- Figure 6(b) illustrates the relations to nology coined by the Aide-de-Camp sys- the version graphs introduced in Fig- tem [Software Maintenance and ures 3 and 4, respectively. The versions Development Systems 1990] and calls shown in (a) are arranged in a graph this form of versioning the change set whose nodes and edges correspond to model. Further examples of systems versions and changes, respectively. b supporting change-based intensional denotes the baseline, and intermediate

ACM Computing Surveys, Vol. 30, No. 2, June 1998 244 • R. Conradi and B. Westfechtel

Figure 6. Change space: (a) matrix representation; (b) version graph with explicit changes.

versions are anonymous. For example, items put under version control. As the path from b to v1 again contains the mentioned earlier, a version model changes c1, c2, and c3. This version needs to address the interplay between graph explicitly expresses all changes product space and version space as well. included in a certain version, and there- In the following, we discuss those as- fore provides more information than the pects of version models that are con- version graphs shown in Figures 3 and cerned with this interplay. In particu- 4. In particular, it makes explicit when lar, we investigate which items are put certain changes were applied to multi- under version control, at what granular- ple versions. For example, c4 was used ity versioning is applied both externally to construct both v2 and v3. In contrast, and internally, how versions of different state-based versioning does not name items are interrelated, what models are the changes; that is, the changes are used for representing versioned object anonymous. As a consequence, the bases, and how the version model may changes applied to a version must be be related to the data model. deduced from the topology of the version graph. This may become difficult if 4.1 AND/OR Graphs merging is applied in extensive and complicated ways (or even impossible if AND/OR graphs [Tichy 1982a] provide merges are not recorded at all). a general model for integrating product space and version space. An AND/OR 4. INTERPLAY OF PRODUCT SPACE AND graph contains two types of nodes, VERSION SPACE namely, AND nodes and OR nodes. Analogously, a distinction is made be- In Section 2, the product space was tween AND and OR edges, which ema- described under the assumption that nate from AND and OR nodes, respec- only one version of each item is main- tively. An unversioned product graph tained. In Section 3, basic definitions can be represented by an AND/OR for versioning were given without con- graph consisting exclusively of AND sidering the product space. The current nodes/edges. Versioning of the product section combines product space and ver- graph is modeled by introducing OR sion space into a versioned object base. nodes. Versioned objects and their ver- So far, we have considered versioning sions are represented by OR nodes and of a single item only, and we have made AND nodes, respectively. no assumptions concerning the kinds of Note that the version graph illus-

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 245 trated in Figure 5(a) can be regarded as well. Each version of foo corresponds to an AND/OR graph as well: version clus- a bound configuration. In Figure 7(c) ters and versions correspond to OR references to components are generic, nodes and AND nodes, respectively. and the versions of foo therefore repre- However, in the following we are not sent generic configurations. concerned with the capabilities of Using this figure, we may classify ver- AND/OR graphs to model the version sion models according to the selection space of a single object. Rather, we are order during the configuration process: interested in the relations between ver- sions of different objects, abstracting for —Product first (Figure 7(a)) means that the moment from the ways in which the product structure is selected first; their version spaces are structured. subsequently, versions of components Therefore in the following a versioned are selected. This approach is fol- object is simply represented by an OR lowed, for example, by SCCS and node whose outgoing edges point to its RCS. It suffers from the restriction versions. that structural versioning cannot be AND edges are used to represent both expressed (the product structure is composition and dependency relation- the same for all configurations). ships. A relationship is bound to a spe- —Version first (Figure 7(b)) inverts this cific version if the corresponding AND approach: the product version is se- edge ends at an AND node; otherwise it lected first and uniquely determines is called generic. A configuration is rep- the component versions. Different resented by a subgraph spanned by all product versions may be structured in nodes that are transitively reachable different ways. For example, a version from the root node of the configuration. of component c is contained only in If all AND edges belonging to this sub- foo.1 (version 1 of product foo). PCTE graph are bound, the configuration is [Oquendo et al. 1989] is an example of called bound as well; otherwise it is an SCM system using this organiza- called generic. Furthermore, we may tion. distinguish between partially and to- —Intertwined (Figure 7(c)) means that tally generic configurations. In the first AND and OR selections are performed case, there are both bound and generic in alternating order. The intertwined AND edges; in the latter case, all AND organization is used, for example, by edges are generic. A bound configura- ClearCase [Leblang 1994], which ver- tion can be constructed from a generic sions both files and directories. Again, configuration by eliminating the OR this selection scheme supports struc- nodes, that is, by selecting one succes- tural versioning. sor of each OR node reached during traversal from the root node. Thus, “version first” and “inter- In the following, we return to the twined” both take into account that dif- examples of product graphs in Figure 1 ferent versions of an object may vary and compare several approaches to ver- with respect to their relationships to sioning the product graph. Figure 7 (versions of) other objects. This means shows AND/OR graphs that are all that in addition to objects, relationships based on Figure 1(b).2 In Figure 7(a) are versioned as well. only atomic software objects are ver- So far, we have applied versioning sioned. foo represents a generic configu- only to the product graph organization ration. In Figure 7(b) foo is versioned as shown in Figure 1(b). In Figure 8, the alternatives “intertwined” and “version first” are illustrated for the product 2 Module versions are identified by numbers. For the sake of simplicity, each module version is graph of Figure 1(d) (To save space, represented by a single node (e.g., no distinction “product first” has been omitted). In between header files and bodies). contrast to the previous figure, AND

ACM Computing Surveys, Vol. 30, No. 2, June 1998 246 • R. Conradi and B. Westfechtel

Figure 7. Different kinds of AND/OR graphs and selection orders (I): (a) product first; (b) version first; (c) intertwined. edges represent dependencies instead of sions 2Ј of a, b, and main, respectively). composition relationships. Intertwined This effect is called version prolifera- selection is performed, for example, in tion. Note that version proliferation Adele [Estublier 1985], whereas “ver- need not involve physical copying (e.g., sion first” is realized, for example, in multiple versions may share the same POEM [Lin and Reiss 1995]. source file through pointers). However, To illustrate the differences between this does not solve the problem at the Figures 8(a) and (b), let us assume that logical level (the user is confronted with a bug is fixed in module c that does not a combinatorially exploding number of affect its interface. A new version c.2 is versions). To get rid of version prolifer- created that is to be included in the new ation at the logical level, we have to release of our sample product foo.3 In distinguish between versions of modules (a), a new configuration is constructed and versions of configurations (in (b), in which c.2 is selected instead of c.1. the versions of main play both roles However, in (b) new versions of all mod- simultaneously). ules above c have to be created (ver- 4.2 Granularity of Versioning

3 The current release (the starting point for the In the previous examples we considered bug fix) is represented in Figure 8 by filled nodes. versioning only at the coarse-grained

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 247

Figure 8. Different kinds of AND/OR graphs and selection orders (II): (a) intertwined; (b) version first. level. This means that we have applied sioning. An SCM system provides an versioning to product graphs as intro- external interface to the versioned ob- duced in Section 2. As explained earlier, ject base that offers versioned items as product graphs detail the product struc- well as identification and selection of ture only down to the level of software versions to its users. At the external objects such as interfaces and bodies of interface, software objects are the items modules; the fine-grained contents are subject to version control. The internal represented as long attributes. granularity may be much smaller (see However, in Section 3 we introduced the following). the term “item” in a more general way External versioning may be applied in to denote anything that can be ver- different ways to the composition hier- sioned, including entities, relationships, archy of software objects. Component and attributes. In particular, versioning versioning means that only atomic ob- can be applied at any level of granular- jects are put under version control. ity. Each object has its own version space, To clarify this issue, let us further modeled, for example, by a version elaborate the notion of granularity. graph. Total versioning applies to all First, version granularity refers to the levels of the composition hierarchy. size of a version and second, delta gran- Product versioning differs from total ularity refers to the size of those units versioning by arranging versions of all in terms of which deltas are recorded: in objects in a uniform, global version RCS version and delta granularity are space. at the level of text files and text lines, This classification is illustrated in respectively. In this case, the delta Figure 9, where externally versioned ob- granularity is much finer than the ver- jects are surrounded by boxes. The sion granularity. AND/OR graph in Figure 9(a) was taken With respect to version granularity, from Figure 7(a), and the graphs shown we need to distinguish further between in (b) and (c) were both copied from external versioning and internal ver- Figure 7(b). Note that the topology of an

ACM Computing Surveys, Vol. 30, No. 2, June 1998 248 • R. Conradi and B. Westfechtel

Figure 9. External versioning: (a) component versioning; (b) total versioning.

AND/OR graph shows which objects are same symbolic name (or more generally versioned, but externally and internally through configuration rules referring to versioned objects are not distinguished. revisions and variants). Thus a configu- Thus a given AND/OR graph can be ration is represented implicitly through accessed in different ways, depending an attribute value rather than as a on the version model presented to the first-class entity. user. In particular, in the case of prod- uct versioning the user may select ver- Total Versioning. Total versioning sions of nonroot objects, but this is done generalizes component versioning in in the version space attached to the that all objects are versioned rather whole product. than only the leaves of the composition hierarchy. In contrast to component ver- Component Versioning. In compo- sioning, versions of composite objects nent versioning, the product structure are represented explicitly as entities; is selected first. Typically, versions of atomic and composite objects are ver- components are organized into version sioned uniformly. Whereas component graphs (see, e.g., RCS [Tichy 1985]). A versioning implies that the product configuration is constructed by assem- structure is selected first, total version- bling versions of components; this is ing may be combined with both the “ver- called composition model in Feiler sion first” and the “intertwined” selec- [1991a]. Note that the granularity of tion order and can therefore express composition is an “atomic” software ob- structural versioning. For example, ject (rather than fine-grained units such PCTE [Oquendo et al. 1989] supports as statements or text lines). extensional versioning of composite ob- Frequently, the version graphs of dif- jects (“version first”). In contrast, in ferent components are related only ClearCase [Leblang 1994] AND/OR se- weakly to each other. To illustrate this, lections are intertwined: a version of a Figure 10 shows a sample configuration directory references a set of versioned whose components are located at differ- files (or directories) rather than specific ent places in the respective version versions of these. A single-version view 4 graphs. The selection problem can be on the versioned file system is provided alleviated by tagging all components of through dynamically evaluated configu- some consistent configuration with the ration rules (intensional versioning with dynamic binding). 4 AND nodes representing versions are placed in- side the OR node representing the versioned com- Product Versioning. Product ver- ponent. Furthermore, the AND/OR graph is aug- sioning differs from total versioning by mented with successor relationships. arranging versions of all objects in a

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 249

Figure 10. AND/OR graph augmented with successor relationships. uniform, global version space. It may be Product versioning has become more regarded as a layer that simplifies se- and more popular because of its global lection against a versioned object base. view on the software product. A poten- Product versioning sets up a transpar- tial problem consists of lacking modu- ent single-version view that hides inter- larity of the version space. All changes, nally maintained versions of objects and revisions, and variants are global. For relationships from users and tools. example, if there are two implementa- Product space and version space are tion variants of a procedure Sort (e.g., orthogonal to each other: a given model QuickSort and HeapSort), we must intro- of the product space can be combined duce an attribute at the global level in with different models of the version order to enable choices between these space and vice versa. The integration of variants. An approach to structure the product space and version space in the version space (in the COV system) is state-based SCM system VOODOO described in Munch [1996]. [Reichenberger 1994] is illustrated in Figure 11, where versioned objects, re- 4.3 Deltas visions, and variants are organized into orthogonal dimensions. To represent versions in the object base, All change-based systems (e.g., the deltas are used both at the coarse- COV system [Gulla et al. 1991], Aide- grained and the fine-grained levels. The de-Camp [Software Maintenance and selection of an appropriate delta repre- Development Systems 1990], and DaSC sentation is driven by different require- [MacKay 1995]) support product ver- ments to be considered at the physical sioning to overcome the limitations of and logical level, respectively. Storage component versioning. In general, a and run-time efficiency have to be change to a software product may affect achieved at the physical level. At the multiple components. Since the compo- logical level, deltas are used to compare sition model falls short of recording versions in such a way that the user is these cross-dependencies, it is difficult provided with a high-level description of to incorporate a change into a product differences. Furthermore, intensional version by selecting the respective com- versioning relies on deltas as well. By ponent versions. Therefore in change- processing the deltas, a version is con- based approaches product versions are structed that must meet the require- described in terms of global changes. ments stated in its intensional descrip- Note that combination of changes oper- tion. Accordingly, we distinguish between ates at a finer granularity (e.g., text physical and logical deltas. Although lines) than composition of components. many SCM systems use a single delta

ACM Computing Surveys, Vol. 30, No. 2, June 1998 250 • R. Conradi and B. Westfechtel

Figure 11. Product versioning. representation for both purposes [Munch by applying backward deltas on the 1993], physical and logical deltas may main and forward deltas on the also be separated (e.g., syntactic merg- branches. RCS deltas are fine-grained, ing of program modules stored as text whereas change-based SCM systems files [Buffenbarger 1995]). such as PIE [Goldstein and Bobrow The distinction between directed and 1980] and DaSC [MacKay 1995] employ symmetric deltas, as illustrated in Fig- directed deltas at both the coarse- ure 2, can be transferred directly to the grained and the fine-grained level. In implementation level. Using directed both systems, the modifications per- deltas [Tichy 1982b], a version is con- formed in a change are stored in a layer. structed by applying a sequence of A product version is composed of a se- changes to some base version. In the quence of layers that are stacked on top case of embedded deltas, all versions are of each other. (For further details, see stored in an overlapping manner so that the description of PIE in Section 6, in common fragments are shared. Either particular, Figure 21.) each version points to its fragments Since versions are stored in an over- [Fraser and Myers 1986], or the frag- lapping manner, AND/OR graphs can be ments are decorated with control expres- classified as embedded deltas. As de- sions for determining the versions in which they are visible (interleaved del- scribed so far, versions can be combined tas [Rochkind 1975; Leblang and freely to derive bound configurations McLean 1985]). from some generic configuration. This Please note that the internal delta “combinability” can be constrained by representation is orthogonal to the ex- control expressions (configuration rules) ternal version model. For example, in- stating, for instance, that a certain ver- terleaved deltas may be used to realize sion can be selected only when configur- both extensional and intensional ver- ing a Unix variant. sioning as well as both state- and Adding control expressions results in change-based versioning. In the follow- an interleaved delta scheme that is real- ing, the different types of delta repre- ized in SCCS [Rochkind 1975] and sentations are discussed in turn. DSEE [Leblang and McLean 1985] RCS [Tichy 1985], which is based on among others. Note that both systems directed deltas, reconstructs versions of implement a version model similar to text files from the most recent version RCS, which is based on directed deltas. on the main trunk of the version graph In SCCS and DSEE, fragments of text

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 251

Figure 12. Interleaved deltas: (a) fine-grained level; (b) coarse-grained level. lines are tagged with the set of versions although it may still have to be tested in which they are included. in multiple versions. SCCS and DSEE support extensional On the other hand, editing of source versioning (of text files) but conditional files cluttered with control expressions compilation employs interleaved deltas may confuse the user. Multiversion edi- for intensional versioning. For example, tors overcome this problem by hiding Figure 12(a) illustrates conditional com- control expressions (see, e.g., P-Edit pilation in a source file varying with [Kruskal 1984], MVPE [Sarnak et al. respect to the operating system. Figure 1988], and COV [Gulla et al. 1991]). A 12(b) demonstrates that conditional read filter selects a single version, re- compilation can be applied at the moving all control expressions; a write coarse-grained level as well. The figure filter constrains the set of versions to contains a cutout of a product descrip- which the change is applied. For exam- tion written in the Proteus Configura- ple, a general change in common parts tion Language (PCL [Tryggeseth et al. of the source file may be performed by 1995]). The selected version of the files setting up a universal write filter and component depends on the value of the selecting some arbitrary version by the os attribute, which refers to the operat- read filter (e.g., the Unix version). If ing system. In contrast to conditional Unix is also selected for writing, the compilation (single-source versioning), control expressions of all changed parts different versions of this component are are set up so that all changes are spe- stored in separate files (version segrega- cific to the Unix version. tion) [Mahler 1994]. Version segregation is vulnerable to the multiple maintenance problem 4.4 Relations Between Version Model and [Babich 1986]: a change to a common Data Model fragment must be applied to all versions in turn. This is even the case when the To conclude this section, we discuss the fragment is shared at the physical level interplay between product space and (e.g., interleaved deltas in SCCS). version space in terms of its implica- Merge tools may reduce the multiple tions for database management. When maintenance problem by partly auto- designing a database management sys- mating change propagation. However, tem for software engineering, the de- single-source versioning is a more ele- signer must decide how to support ver- gant solution since the change needs to sioning. In particular, there are several be performed only once for all versions, alternatives concerning the relations

ACM Computing Surveys, Vol. 30, No. 2, June 1998 252 • R. Conradi and B. Westfechtel

Figure 13. Definition of a versioned object type. between the data model and the version language that provides customized con- model. structs for defining versioned object types. In addition, the query language Version Model on Top of the Data is modified so that queries against Model. In this case, version manage- a versioned database can be written ment is seen as an ordinary database in a convenient and natural way. application. Thus the version model is DAMOKLES [Dittrich et al. 1986], EX- represented by a schema whose under- TRA-V [Sciore 1994], and Adele [Estub- lying data model is not aware of ver- lier and Casallas 1994] follow this ap- sioning. This solution has been adopted, proach. for example, by PCTE [Oquendo et al. If the version model is defined on top 1989] and CoMa [Westfechtel 1996], of the data model, different types are which are based on an EER and a graph required for versioned objects and their data model, respectively. Its main ad- versions. As argued in Estublier and vantage is that the data model is kept Casallas [1994] and Sciore [1994], this simple and potentially application-spe- distinction is awkward and complicates cific extensions are avoided (a widely both schemata and queries. Figure 13 accepted uniform version model does illustrates how this problem is solved in not yet exist). Adele. In this example, only one object However, implementing version man- type is defined for a versioned interface. agement completely on top of a data- Adele distinguishes between common base management system has a number attributes shared by all revisions, modi- of limitations. For example, there is no fiable attributes whose values are revi- support for storing versions efficiently, sion-specific, and immutable attributes, the transaction manager does not take where each update triggers the creation versioning into account, and so on. of a new revision. Furthermore, the at- Therefore several database manage- tribute realization is declared as ver- ment systems provide predefined classes sioned, meaning that an interface revi- for version management (e.g., O 2 sion may have multiple realization [GOODSTEP 1995]). In this case, the variants. data model is still not aware of version- ing, but components of the database management system such as storage Data Model on Top of the Version and transaction manager are modified Model. So far, the version model de- to support version management effi- pends heavily on the data model, being ciently. either defined on top of or built into the data model. In a few SCM systems, such Version Model Built into the Data as ICE [Zeller and Snelting 1995] and Model. If the data model is extended COV [Munch 1996], the version model is with versioning, applications can be completely orthogonal to the data supported through a data-definition model. This may be achieved through an

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 253 instrumentable version engine that pro- performed both in the product space vides a basic delta storage and configu- and the version space. Having discussed ration rules by means of which specific the interplay between product space version models may be expressed [Con- and version space in Section 4, we are radi and Westfechtel 1997]. The version now ready to elaborate on rule-based engine is not aware of the data model version construction, a topic that could and can thus be combined with any data have been addressed only partially in model (e.g., EER, object-oriented, or Section 3. simply files). Both ICE and COV are based on in- 5.1 Problem: Combinability Versus terleaved deltas, but use different logics Consistency Control and Manageability for control expressions. In ICE, version- ing is applied to file-based data. The Configuration rules are used to config- COV system applies versioning to an ure consistent versions from a versioned EER data model. To this end, a layer object base. Rules are required to ad- that takes care of the consistency con- dress the combinability problem. The straints inherent in the data model is number of potential versions explodes placed on top of the version engine (e.g., combinatorially; only a few are actually a relationship is visible only when both consistent or relevant. The combinabil- ends belong to the currently selected ity problem has to be solved in any version). Thus this architecture differs version model. considerably from the architecture of For example, the (state-based) compo- conventional database management sys- sition model [Feiler 1991a] applies ex- tems where version model and data tensional versioning at the component model are rather entangled. level; that is, previously constructed component versions are reused. Rule- 5. INTENSIONAL VERSIONING based construction of configurations realizes intensional versioning at the In Section 3.2 we distinguished between configuration level; that is, new combi- extensional and intensional versioning. nations of component versions are as- Extensional versioning is concerned sembled into configurations. Without with the reconstruction of previously any constraints, the number of potential created versions and requires version configurations is very large. For a prod- identification, immutability, and effi- uct consisting of m modules existing in cient storage. On the other hand, inten- v versions, there exist vm potential con- sional versioning deals with the construc- figurations (i.e., the number of potential tion of new versions from property-based configurations grows polynomially in v). descriptions. Intensional versioning is On the one hand, change-based ver- very important for large version spaces, sioning reduces the combinability prob- where a software product evolves into lem by grouping logically related modi- many revisions and variants and many fications of multiple components. Thus changes have to be combined. we do not have to worry about which In order to support intensional ver- versions of components actually fit to- sioning, an SCM system must provide gether. However, the selection problem for both combinability—any version has has not disappeared. Rather, it has to be constructed on demand—and con- been moved from the product space to sistency control—a constructed version the version space [Munch 1996]. In the must meet certain constraints. The con- case of unconstrained combination of struction of a version may be viewed as changes (each change may either be ap- a selection against a versioned object plied or skipped), there are 2c potential base. The selection is directed by config- configurations for c changes; that is, the uration rules, which constitute an es- number of potential configurations sential part of a version model, and is grows exponentially in c.

ACM Computing Surveys, Vol. 30, No. 2, June 1998 254 • R. Conradi and B. Westfechtel

Figure 14. Intensional versioning.

The challenge of intensional version- framework for intensional versioning. ing consists of providing for consistency First let us define the central notion of control while still supporting combin- configuration rule: a configuration rule ability. The space of all potential ver- guides or constrains version selection sions is much larger than the space of for a certain part of a software product. consistent ones. The problem of consis- Thus a configuration rule consists of a tency control can be addressed both in product part, which determines its the version space and in the product scope in the product space, and a ver- space. In the version space, configura- sion part, which performs a selection in tion rules are used to eliminate incon- the version space (see Section 5.3 for sistent combinations; in the product further details). space, the knowledge about software ob- A versioned object base combines jects, their contents, and their relation- product space and version space and ships is enriched in order to check and stores all versions of a software product, ensure product constraints. SCM sys- relying, for example, on interleaved del- tems tend to solve the problem in the tas. The versioned object base is aug- version space because they frequently mented with a rule base of stored config- only have limited knowledge of the uration rules (e.g., control expressions product space (typically, software ob- as shown in Figure 12). jects are represented as text files). A query consists of a set of submitted Even if a sophisticated tool for con- configuration rules, each composed of a structing a version is employed, the product part and a version part. A con- user must be warned if a new version is figurator is a tool that constructs a ver- created that has never before been con- sion by evaluating a query against a figured. Although old versions can be versioned object base and a rule base. assigned levels of “confidentiality” (e.g., The constructed version has to satisfy tested or released), a new version can- both version constraints (e.g., consistent not be trusted blindly. Therefore the selection of the Unix version) and prod- configured version is subject to quality uct constraints (e.g., syntactic or seman- assurance (e.g., testing). Potentially, tic consistency). Configurators are dis- changes to the constructed version need to be performed (correction delta). cussed further in Sections 5.4 and 5.5. The configuration process is con- 5.2 Conceptual Framework for Intensional cerned with the binding of generic refer- Versioning ences. Often, its result is a bound con- figuration, but it may also deliver a Figure 14 illustrates our conceptual configuration that is partially generic.

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 255

The latter results in a multistage config- Mostly, build rules are deterministic uration process. with respect to the result of building. Binding can be performed at different Nondeterminism deals only with the points in time. Static binding means order in which build steps are exe- that the configurator resolves all un- cuted (the build plan imposes only a bound references before any component partial order on the build steps). is accessed. To this end, the configura- Some authors advertise the advan- tor constructs a table that maps each tages of nondeterministic build rules component name to a specific version. (e.g., “build some sort program no In dynamic binding, each reference is matter what algorithm and which evaluated on demand only (e.g., when a compiler is used” [Rich and Solomon source file is read by a compiler). 1991]). However in many situations it Binding may even be deferred until is crucial to control the details of a runtime. In this case, a program is con- build without leaving the freedom for figured dynamically without stopping nondeterministic choices (e.g., even its execution. A popular example is Java the functional behavior of a program applets, which are loaded dynamically may depend on whether it is compiled when they are activated through a Web with or without optimization). browser. Dynamic configuration has been studied in the context of distrib- Finally, Figure 14 suggests that the uted systems (see Kramer [1993] for an rule base is not put under version con- overview). Up to now, the relations to trol. On the other hand, versioning of SCM have not been investigated thor- the rule base is desirable as well be- oughly. Only recently have a few ap- cause the configuration rules may proaches been proposed that apply ver- evolve along with the software product. sion-selection techniques from SCM to In the case of our sample software prod- dynamic configuration [Warren and uct foo the set of supported operating Sommerville 1995; Schmerl and Marlin systems, window systems, and database 1995]. However, elaborating on this systems may evolve, as may the con- topic goes beyond the scope of this arti- straints on combining these dimensions cle. of variation. Currently, versioning of To sharpen the focus, we also refrain the rule base is handled at best in a from discussing system building [Bori- rudimentary way, such as by storing son 1989]. Although the conceptual configuration rules in text files that are framework illustrated in Figure 14 can subject to version control. As argued in be applied to both source and derived Conradi and Westfechtel [1997], simple objects, the problems to be considered time-stamped revisions of the rule base are different in the following respects. may suffice to reconstruct not only old product versions, but also the configura- —In source version construction, we tion rules valid at that time. Some fur- have a selection problem in both prod- ther remarks on this topic follow in uct and version space. The selection Section 7.2. must be performed so that the out- come of the configuration process obeys all configuration rules and 5.3 Configuration Rules product constraints. Here nondeter- Intensional versioning is driven by con- minism may have to be taken into figuration rules that are classified in account and the configurator may the following. First, built-in rules are have to backtrack from wrong selec- hardwired into the respective SCM sys- tions. tem and cannot be changed by the user. —When constructing derived versions, For example, a built-in rule may enforce we primarily have to consider the effi- that at most one version of a software ciency and accuracy of the build. object is contained in any constructed

ACM Computing Surveys, Vol. 30, No. 2, June 1998 256 • R. Conradi and B. Westfechtel

Figure 16. Scoping of configuration rules.

Product Parts of Configuration Rules. So far we have discussed only the version parts of configuration rules. The product part describes the scope of a configuration rule in the product Figure 15. Version parts of configuration rules. space. In Figure 16, a configuration rule is written in the form p : v, where p and v denote product part and version part, configuration. User-defined rules are respectively. Rule (1) applies to a single supplied by the user (e.g., “select the module a (local rule) and selects the latest version before May 22nd”). version checked out by the current user. The star in Rule (2) indicates global Version Parts of Configuration Rules. application to all modules so that the Configuration rules take on different same variant is selected throughout the forms depending on how the version whole product. The product part of Rule space is structured. Figure 15 provides (3) denotes all modules reachable from some typical examples that refer to the b by a reflective and transitive closure revision, variant, and change space, re- over relationships of type DependsOn; spectively. Configuration rules are that is, the rule applies to b and to all stated as logical formulas, abstracting modules on which b depends transi- from the actual syntax as implemented tively. in different SCM systems. In the revision-space category, config- Ordering of Configuration Rules. uration rules refer to the time dimen- Configuration rules can be ordered in sion. Rule (1) selects the latest revision strictness classes. A constraint is a man- by the maximal time stamp. Rule (2) datory rule that must be satisfied. Any refers to a revision by its number. violation of a constraint indicates an In the variant space, configuration inconsistency. For example, Rule (1) in rules refer to values of variant at- Figure 17 ensures that all selected ver- tributes. Rule (3) identifies a variant by sions belong to a given variant. A pref- specifying operating system, window erence is an optional rule that is applied system, and database system. Rule (4) only when it can be satisfied. For exam- expresses a constraint on the combina- ple, Rule (2) states that a checked-out tion of attribute values: the X11 window version is selected provided it is avail- system is not available under the DOS able. Finally, a default is also an op- operating system. tional rule, but is weaker than a prefer- In the change space, Rule (5) specifies ence: a default rule is applied only when a version in terms of the changes to be no unique selection could be performed applied. Rules (6) and (7) specify further otherwise. For example, Rule (3) selects relationships that describe consistent the most recent version as the default. change combinations. Rule (6) states Strictness classes determine the order that inclusion of change c2 implies that in which configuration rules are evalu- c1 is included as well (c2 is based on ated (constraints 3 preferences 3 de- c1). Rule (7) states that changes c1, c2, faults). In addition, rules may be given and c3 are mutually exclusive (operator priorities. Rules with high priorities are V) (i.e., at most one of these changes considered before low-priority rules. A may be applied). priority may be assigned explicitly or

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 257

flexible and powerful base layer pro- viding for delta storage (interleaved deltas) and configuration rules. Any desired version model may be realized on top of configuration rules. In the case of version graphs, revision Figure 17. Strictness classes. chains and branches may be ex- pressed by implication and mutual ex- clusion, respectively (see Rules (6) and (7) in Figure 15). may be defined implicitly by textual or- dering. Priorities may be combined with strictness classes such as by assigning 5.4 Configurators: Tools for Evaluating priorities to preferences [Lavency and Configuration Rules Vanhoedenaghe 1988]. However they can also be used without strictness A configurator constructs a version by classes. For example, in ClearCase evaluating configuration rules against a [Leblang 1994] rules are evaluated ac- versioned object base. Construction can cording to their textual order until a be performed in different computational (unique) match is found. frameworks. In a functional framework, Finally, we may distinguish between intensional versioning is modeled by ap- stored and submitted configuration plying a (potentially partial) function q rules. The rules given in Figure 17 (query) to its arguments a1 ...an; that would typically be submitted in a query is, a version v is constructed by evaluat- in order to specify properties requested ing the expression q(a1,..., an). This by the user. Rule (4) in Figure 15 is an approach assumes that version con- example of a stored configuration rule struction delivers a deterministic result expressing a constraint that must be (i.e., version selection is unique). fulfilled by all configured versions. In a rule-based framework, version construction is modeled as the evalua- Relations Between Version Graphs tion of a query against a deductive data- and Configuration Rules. To conclude base [Ramamohanarao and Harland this section, we discuss briefly how ver- 1994; Ramakrishnan and Ullman 1995]. sion graphs and configuration rules are The deductive database consists of a related in different SCM systems: versioned object base and a rule base —Configuration rules on top of version containing stored configuration rules. graphs. This is the classical solution, Since a query may not specify the ver- called composition model in Feiler sion to be constructed in a unique way, [1991a]. Version graphs are main- a rule-based configurator has to cope tained for components (extensional with nondeterminism. Ambiguous choices versioning); configurations are con- can be resolved either automatically or structed by selecting component ver- interactively. In any case, the configu- sions with the help of configuration rator needs to explore a search space of rules (intensional versioning). The potential solutions. To this end, it may composition model is realized, for ex- use different search strategies such as ample, in DSEE [Leblang and “depth-first” or “breadth-first.” McLean 1985] and ClearCase [Leb- Configurators may be classified not lang 1994]. only with respect to the computational —Version graphs on top of configuration paradigm (functional, rule-based) but rules. This inverted approach is fol- also with respect to the underlying ver- lowed by a few research prototypes sion model (state- or change-based). such as ICE [Zeller 1995] and COV This results in four combinations, as [Munch 1996]. Both systems rely on a described in the following.

ACM Computing Surveys, Vol. 30, No. 2, June 1998 258 • R. Conradi and B. Westfechtel

State-Based Functional Configura- rules. As version selections are per- tors. A state-based functional configu- formed, constraints are added that nar- rator operates on a versioned object row down further choices. For example, base typically represented by inter- if version uniqueness is required, we leaved deltas. Conditional compilation may not select a different successor in C [Kernighan and Ritchie 1978] and each time a certain OR node is visited. configuration construction at the Variant selection serves as another ex- coarse-grained level in PCL [Tryggeseth ample: after having selected a certain et al. 1995] may be cited as examples.5 variant of an operating system, we must The versioned object base is usually tra- ensure that this selection is performed versed in a “context-free” manner (e.g., consistently for all further nodes sequences of text lines in conditional reached in the course of the configura- compilation and trees of components in tion process. If this turns out to be PCL). Configuration rules take the form impossible, the configurator has to of control expressions as shown, for ex- backtrack from the wrong selection. ample, in Figure 12. The configurator is A simple example is given in Figure supplied with a tuple of attribute values 18(a) which shows an AND/OR graph (a1,..., an) and constructs some ver- whose AND nodes (versions) are anno- sion v. Control expressions are evalu- tated with configuration rules.6 For ated against this tuple to select the each version, a triple of attribute values relevant components of v. When applied denotes the variants to which it belongs; stands for any value. a depends on the ء to a composite object co, expressions may also be used to transform (a1,..., window system, b on the database sys- Ј Ј an) into a new tuple (a 1,..., a m), tem, and c on the file system; main can which is then employed recursively to be used with any variant. Figure 18(b) configure co. This means that attribute shows a query that leaves all attribute values are transformed and propagated values unspecified. Below, a sample through the composition hierarchy. In trace of the configuration process is this way, it is possible to enforce certain given. After selection of a.1, the os at- constraints (e.g., selection of the same tribute is bound to Unix. Thus the Unix variant throughout the whole configura- version of c is selected. However, selec- tion process). However rule-based con- tion of b fails and triggers backtracking. figurators are in general more powerful in enforcing constraints or detecting in- Change-Based Functional Configura- consistencies. tors. A change-based functional con- figurator is supplied with a sequence of State- and Rule-Based Configura- changes. Internally, the object base may tors. A state- and rule-based configu- be stored using either interleaved or rator supports nondeterminism through directed deltas (e.g., Aide-de-Camp an appropriate search strategy (e.g., [Software Maintenance and Develop- depth-first search with backtracking). It ment Systems 1990] and PIE [Goldstein typically operates on some AND/OR and Bobrow 1980], respectively). Con- graph that is traversed along “context- ceptually, the changes are applied in sensitive” relationships (e.g., dependen- order to some baseline. Inconsistencies cies between modules in Adele [Estub- are detected when a change operation lier 1985]; see also Figure 8). For each fails, for example, insertion of a text OR node reached in the course of the line at an undefined location. configuration process, a successor is se- lected with the help of configuration Change- and Rule-based Configura- tors. A change- and rule-based con-

5 Please recall that conditional compilation can be used for change-based versioning as well (see Sec- 6 Note that only variants are considered in this tion 3.5). example (no revisions).

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 259

Figure 18. Rule-based construction of a configuration: (a) AND/OR graph with configuration rules; (b) construction of a configuration.

figurator differs from its functional fact rather fuzzy.7 In particular, state- counterpart by considering stored con- based SCM systems often offer merge figuration rules constraining change tools [Buffenbarger 1995] to combine combinations. Configuration rules can versions/changes. Although mostly used be used to detect inconsistent change in state-based systems, they have been combinations; furthermore, they can be applied in change-based systems as employed in a constructive manner to well. complete a partial specification. This is Merge tools combine versions or done, for example, in the COV system changes. They may be classified as fol- [Munch 1996]. Consider the mutual ex- lows (Figure 19). clusion rule (7) in Figure 15. If the user —Raw merging simply applies a change insists on applying both c1 and c2, the in a different context. For example, in configurator reports an inconsistency. Figure 19(a) change c2 was originally Alternatively, the configurator may au- performed independently of change c1 tomatically disable change c2 (and c3) and is later combined with c1 to pro- after c1 has been selected. duce version v4. Raw merging is sup- ported by SCCS [Rochkind 1975] among others. It was later general- 5.5 Merge Tools ized in change-based SCM systems such as Aide-de-Camp [Software So far, we have distinguished between Maintenance and Development Sys- state-based and change-based configu- rators. Although state- and change- 7 This was already demonstrated in the discussion based SCM systems may differ consider- of the relationships between version graphs and ably at first glance, the borderline is in change-based versioning (Section 3; see Figure 6).

ACM Computing Surveys, Vol. 30, No. 2, June 1998 260 • R. Conradi and B. Westfechtel

Figure 19. Types of merging: (a) raw merging; (b) two-way merging; (c) three-way merging.

tems 1990] and the COV system quences of change operations. A con- [Munch et al. 1993]. flict arises if two operations do not commute (e.g., contradictory changes —A two-way merge tool (Figure 19(b)) compares two alternative versions a1 to the name of a procedure). In raw and a2 and merges them into a single merging, one change wins and the version m. To this end, it displays the other is overridden. differences to the user who has to select the appropriate alternative. A Merge tools can be characterized by two-way merge tool can merely detect the semantic level at which merging is differences, and cannot resolve them performed (i.e., their knowledge about automatically. the product space): —To reduce the number of decisions to be performed by the user, a three-way —Textual merging is applied to text merge tool (Figure 19(c)) consults a files [Adams et al. 1986]. Almost all common baseline b if a difference is commercial SCM systems support tex- detected. If a change has been applied tual merging [Rigg et al. 1995]. Al- in only one version, this change is though we can expect only an arbi- incorporated automatically. Other- trary text file as the result of the wise, a conflict is detected that can be merge (and, e.g., not a legal C pro- resolved either manually or automat- gram) and only physical conflicts can ically (the latter is not recommended). be detected, textual merging seems to Notably, the change-based system yield good results in practice [Leblang Aide-de-Camp offers a three-way 1994]. In particular, it works well merge tool in addition to raw merg- when small, local changes to large ing. well-structured programs are com- In comparing raw merging and three- bined and changes have been coordi- way merging, we have to distinguish nated beforehand so that semantic between inconsistencies and conflicts: conflicts are unlikely to occur. —In raw merging, a change c is applied —Syntactic merging exploits the con- in a different context. A change is a text-free (or even context-sensitive) sequence of change operations, say syntax of the versions to be merged. op1 ... opm. If any opi fails (e.g., Therefore, it can guarantee a syntac- because it is applied to a nonexistent tically correct result and can perform object), there is an inconsistency. more intelligent merge decisions. —In addition to inconsistencies, three- However, syntactic merging has been way merging can detect conflicts, that realized only in a few research proto- is contradictory changes. Three-way types [Buffenbarger 1995; Westfech- merging attempts to combine two se- tel 1991].

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 261

—Semantic merging takes the seman- 6. VERSION MODELS IN SCM SYSTEMS tics of programs into account [Berzins After having characterized version mod- 1994, 1995; Binkley et al. 1995; Hor- els in general, we now take a look at witz et al. 1989]. Semantic merge concrete systems. We draw a picture of tools perform sophisticated analyses the SCM landscape by describing the in order to detect conflicts between contributions of a representative selec- changes. However, it is a hard prob- tion of influential SCM systems. These lem to come up with a definition of descriptions are based primarily on the semantic conflict that is neither too published scientific literature. Although strong nor too weak (and is decid- the following survey includes commer- able). Furthermore, the merge algo- cial systems in addition to research pro- rithms developed so far are applicable totypes, we do not provide a comprehen- only to simple programming lan- sive overview of SCM systems available guages (not C or Cϩϩ). For these in the commercial marketplace (see reasons, semantic merge tools have Rigg et al. [1995] for such an overview). not (yet?) made their way into prac- Furthermore, our main goal is to illus- tice. trate different version models rather than to evaluate the functionalities pro- Finally, operation-based merging vided by SCM systems. [Lippe and van Oosterom 1992] comes The survey focuses on the core issues up with a general algorithm that makes of versioning discussed in Sections 2 no assumptions about the product through 5. Further topics such as sys- space. The algorithm takes two se- tem building, cooperation support, quences of change operations and com- workspace management, and distribu- bines them into a single sequence, de- tion are mentioned briefly at best. As a tecting both inconsistencies and conflicts. consequence, influential systems whose However, the application of this algo- main merits lie in these fields are not rithm is complex because a huge search included. For example, Make [Feldman space of potential merged operation se- 1979], Odin [Clemm 1995], and CAPITL quences must be considered. Because of [Adams and Solomon 1995] are con- its generality, the merge tool cannot cerned with system building, and NSE rely on any hints as to which operation [Adams et al. 1989] has contributed to should be appended next and where workspace management (hierarchy of conflicting operations are positioned in workspaces) and cooperation support the input sequences. (optimistic concurrency control). Note that merge tools try to detect conflicts in the product space (by detect- ing noncommuting operations). Alterna- 6.1 Overview tively, conflict detection can be per- We have selected more than 20 SCM formed in the version space to some systems that vary widely in their under- extent. This approach is followed in lying version models. The evolution the COV system [Munch 1996], where graph in Figure 20 illustrates the evolu- configuration rules are used to con- tion of SCM since the early ’70s. Each strain change combinations. Conflicting node corresponds to a specific system merges can be excluded by constraints V and briefly describes its original contri- of the form c1 c2 (mutual exclusion of bution(s). Incoming edges express the c1 and c2). Once such a constraint has most important influences of previous been set up, attempts to combine c1 and systems. c2 can be detected as erroneous before Before delving into the details, let us performing the actual merge. However, make some global remarks. before c1 and c2 are known to conflict, it is frequently necessary to combine them —Initially, SCM was supported through (by raw merging) and test the result. isolated tools covering specific aspects

ACM Computing Surveys, Vol. 30, No. 2, June 1998 262 • R. Conradi and B. Westfechtel

Figure 20. Evolution graph of SCM systems.

(e.g., SCCS [Rochkind 1975] for ver- more recent systems increasingly use sioning of source objects). Later on, database technology. This is demon- integrated systems were developed strated by the Adele system, for ex- that combined the functionalities of ample, which evolved from a file- individual tools into a coherent envi- based system with an ad hoc hard- ronment (e.g., DSEE [Leblang and wired data model [Estublier 1985] McLean 1985] or ClearCase [Leblang into an active object-oriented data- 1994]). base system supporting historical, —The pioneers addressed many prob- logical, and cooperative versioning lems in an ad hoc manner. Subse- (revisions, variants, and workspaces, quent systems addressed topics such respectively) [Estublier and Casallas as intensional versioning in a more 1994]. systematic and general way (e.g., The evolution graph is traversed in compare version selection by check- chronological order in Section 6.3. In- out options to full-fledged logic-based stead of reading all system descriptions approaches such as ICE [Zeller sequentially, the reader may focus on 1995]). specific systems. Furthermore, the —Many fundamental ideas are rather edges of the evolution graph may serve old. For example, although change- as “hypertext links.” based versioning has been attracting To ease orientation, we have clustered significant attention only recently, SCM systems into families such as “ver- the idea can be traced back at least to sion graphs,” “conditional compilation,” the PIE system (change-based ver- “change-based versioning,” and “pro- sioning of Smalltalk programs [Gold- gramming-in-the-large” (systems for stein and Bobrow 1980]), which was configuring modular programs). These already developed in the late ’70s. families should be viewed just as exam- —Although early systems/tools were im- ples. They are neither disjoint (e.g., As- plemented on top of the file system, gard supports change-based versioning

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 263

Table I. Classification of SCM Systems (I)

on top of version graphs), nor do they —Product versioning: although all cover all SCM systems. change-based systems happen to sup- port global changes, there is no inher- 6.2 Taxonomy-Based Classification ent reason why change-based version- ing could not be applied at the In this section, we classify SCM systems component level. Conversely, there according to a taxonomy derived from are state-based systems (e.g., VOO- Sections 2 through 5. This taxonomy DOO [Reichenberger 1994]) that provides a much more detailed and sys- maintain versions at the product tematic classification than the catego- level. ries introduced in the previous section. Furthermore, we claim that this classi- SCM systems are classified in Tables fication is orthogonal. Indeed, an impor- I and II. These tables may be crossrefer- tant contribution of this article is to enced while reading the system descrip- distinguish among different aspects tions in the next section. For each sys- that used to be mingled together. tem, there is a corresponding row in the For example, according to our defini- table. The columns are organized in a tion, change-based versioning just means three-level hierarchy. The first level de- that versions are described in terms of fines categories (“general,” “product changes relative to some baseline. On space,” etc.). Each category serves to the other hand, the term has other con- group multiple features each of which notations, resulting from the models ac- corresponds to one dimension of the tually realized in change-based systems: classification scheme (“environment,” —Intensional versioning: as explained “object management,” etc.). in Section 3.5, change-based version- A feature may have values from some ing is not necessarily intensional (see enumeration type (e.g., “tool-kit,” “lan- the discussion of change packages). guage-based,” “structure-oriented”). Each The converse is also not true (see, feature is either single-ormultivalued. e.g., the classical composition model, In the first case, the feature is anno- which is state-based). tated with V and its value is indicated

ACM Computing Surveys, Vol. 30, No. 2, June 1998 264 • R. Conradi and B. Westfechtel

Table II. Classification of SCM Systems (II)

by an x sign in the respective column. In that do not deal with the fine-grained the second case, we use * to annotate level (e.g., PCTE), and a few that do not the feature and a ϩ for each of its consider the coarse-grained level (e.g., values. the multiversion text editor MVPE). Some features are defined only par- The latter have undefined entries for tially; they are not applicable to all coarse-grained relationships. All other SCM systems. For example, “selection systems support composition relation- order” is only meaningful for systems ships, but some of them do not take based on AND/OR graphs, and not all dependencies into account. systems support intensional versioning. Undefined values are indicated by Version Space. The structure feature hatched entries. refers to the way the version space is The tables are explained briefly in the modeled. In some cases (e.g., Gandalf), following, ordered by categories. both version graphs and grids are suit- able for representing the version space. General. Following Dart et al. Similarly, extensional and intensional [1987], we distinguish among tool-kit, versioning are nonexclusive (version language-based, and structure-oriented set). Virtually all systems support ex- environments; most SCM systems be- tensional versioning (i.e., reconstruction long to the first class. For object man- of old versions, conditional compilation agement of the versioned object base, a being a remarkable exception), but file system or a database system may be some do not consider intensional ver- used. sioning. Finally, most SCM systems are state- rather than change-based (ver- Product Space. The domain is spe- sion specification). A few systems pro- cific if the SCM system deals with spe- vide base mechanisms for both para- cific types of software objects; other- digms (e.g., ICE and COV). wise, it is general. All systems classified as specific deal with certain kinds of Interplay of Product and Version programs; however the underlying con- Space. The selection order (during the cepts may still be fairly general (e.g., configuration process) can be applied conditional compilation). Concerning only to SCM systems based on AND/OR granularity, there are some systems graphs; in particular, it is not applied to

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 265 systems derived from conditional compi- text is interspersed with preprocessor lation. Note that the selection order fol- directives that refer to the values of lows from the topology of AND/OR preprocessor variables (see Figure 12). graphs, which can be defined freely in Thus conditional compilation uses inter- some systems (e.g., DAMOKLES). The leaved deltas with visible control ex- external granularity is classified into pressions. A specific source version is product, component, and total version- constructed by the preprocessor by fil- ing. Finally, the deltas feature refers to tering out all fragments whose control the efficient representation of atomic expressions evaluate to false. Source software objects only, that is, sharing at version construction follows a functional the coarse-grained level is not consid- paradigm (no nondeterminism/back- ered. tracking). Note that conditional compi- lation is a low-level and general mecha- Intensional Versioning.8 With re- nism on top of which different version spect to the underlying computational models can be implemented (both state- paradigm, we distinguish between func- and change-based ones). tional configurators (e.g., conditional compilation) and rule-based configura- 6.3.2 SCCS. SCCS [Rochkind 1975] tors (e.g., DSEE or ClearCase). Classes manages versions of text files. (Binaries of configuration rules refer to the strict- can also be stored, but delta storage is ness classes constraints, preferences, not supported.) Versions are arranged and defaults and are applied only if the in a tree. To work on a version, it is SCM system distinguishes among dif- physically copied into a workspace (di- ferent strictness classes. Finally, all rectory). When the user is done with his SCM systems supporting intensional or her changes, the modified version is versioning (must) provide automatic checked back into the SCCS repository. configurators. In some systems, the con- SCCS uses interleaved deltas to store figurator may also be used in interactive versions in a space-efficient manner. mode (e.g., ICE or Adele). However, in contrast to conditional com- pilation, this data structure is hidden. 6.3 Descriptions of SCM Systems Thus there are interleaved deltas at two In the following, each system is de- levels when a C file with preprocessor scribed briefly in turn. The system de- directives is stored under SCCS control. scriptions are not organized according Although SCCS primarily supports to the tables presented in the previous state-based versioning, it does provide section (proceeding through the table some low-level commands for control- entries would be rather boring). In- ling delta applications (and even fixing stead, we focus on the specific contribu- deltas) in a change-based fashion. tions of each system and its relations to other systems. 6.3.3 PIE. An early approach to change-based versioning has been de- 6.3.1 Conditional Compilation. Con- veloped at XEROX PARC [Goldstein ditional compilation supports inten- and Bobrow 1980]. PIE manages config- sional versioning at the fine-grained urations of Smalltalk programs that are level and has become popular particu- internally represented by graph-like larly in conjunction with the C program- data structures. Changes are considered ming language [Kernighan and Ritchie global and may affect multiple compo- 1978]. Although it was originally devel- nents of a software product (product oped for a specific domain, the underly- versioning). Each change is placed in a ing idea is fairly general. The source layer, and layers are aggregated into contexts that act as search paths 8 Note that merge tools (Section 5.5) were not through the layers. (DaSC [MacKay taken into account here. 1995] is based on a similar approach.)

ACM Computing Surveys, Vol. 30, No. 2, June 1998 266 • R. Conradi and B. Westfechtel

Figure 21. Layers and contexts in PIE.

This is illustrated in Figure 21, where version attributes); in particular, the layers contain different implementa- product structure is selected first so tions of methods m1, m2, m3 exported that structural versioning cannot be by some class c. modeled. When constructing a context, there are two degrees of freedom: each layer 6.3.5 Gandalf. The Gandalf project may either be included or omitted; and [Habermann and Notkin 1986] was ded- the included layers can be arranged in icated to the generation of structure- any sequential order. PIE provides rela- oriented software development environ- tionships to document conditions for the ments based on abstract syntax trees. combination of layers. For example, A Gandalf C is an environment instance depends_on B implies that each context that supports C at the programming-in- containing A should include B as well. the-small level. Programming-in-the- However PIE does not enforce any con- large and version control are handled by straints. Rather, the documented rela- the SVCE subsystem [Kaiser and tionships are merely used to warn the Habermann 1983], which is based on user of possible inconsistencies. the Intercol module interconnection lan- guage [Tichy 1979]. Each module has a 6.3.4 RCS. RCS [Tichy 1982b, 1985] unique and immutable interface and po- differs from its predecessor SCCS in tentially multiple realization variants, several ways. RCS stores versions of each of which evolves into a sequence of text files using directed deltas. The lat- revisions. Thus revisions and variants est revision on the main trunk can be are separated rather than intermixed accessed directly (and therefore effi- (SCCS, RCS). A realization of an inter- ciently), whereas all other revisions are face can be provided either by writing C reconstructed by applying backward code, or can be composed of other mod- and forward deltas. Furthermore, RCS ules that jointly provide the resources provides a set of built-in attributes such listed in the export interface (composi- as version status and symbolic name. In tions). Gandalf enforces syntactic con- particular, symbolic names may be at- sistency of all interfaces and realiza- tached to all components belonging to a tions deposited into the public database. consistent configuration. In this way, Therefore it can guarantee syntactic symbolic names define threads through consistency of all constructed configura- the version graphs and make recon- tions. struction of configurations easier (see Gandalf supports intensional con- Figure 10). However, support for inten- struction of source configurations sional versioning is rather limited (op- through a configurator that performs tions of checkout commands referring to intertwined AND/OR selections. The

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 267

Figure 22. DSEE.

configurator is driven by simple config- lects the product structure first,9 which uration rules that are classified into is described in the system model, and constraints, preferences, and defaults. adds version bindings, resulting in a Constraints and preferences are at- bound configuration thread that is used tached to imports or compositions and for system building. select specific variants/revisions. If no selection rules are given, the standard 6.3.7 P-Edit/MVPE. P-Edit [Kruskal variant/revision is selected by default. 1984] and its successor MVPE [Sarnak et al. 1988] support simultaneous edit- ing of multiple versions of a text file. A 6.3.6 DSEE. DSEE [Leblang and text file is composed of fragments (se- McLean 1985; Leblang and Chase 1984; quences of words). As in conditional Leblang et al. 1988] integrates func- compilation, control expressions are tions that were previously provided in- used to determine the visibilities of dependently by tools such as Make and fragments. Unlike conditional compila- SCCS/RCS. Furthermore, DSEE sup- tion, P-Edit and MVPE hide these con- ports rule-based construction of source trol expressions when the text file is configurations and improves system displayed. Furthermore, control expres- building by maintaining a cache of de- sions are maintained automatically. A rived objects, using more accurate dif- write filter (edit set) controls which ver- ference predicates than Make and par- sions will be affected by a change, a allelizing builds over a network of read filter (view) selects a single version workstations. to be displayed to the user. The version Source code versions are arranged in space is modeled as a grid. version graphs as shown in Figure 4. A In Figure 23, the table on the left- configuration is specified by a system hand side lists all versions determined model and a thread through the version by the attributes os, ws, and db. A view graphs of components (Figure 22). The corresponds to a single row of the table, system model describes a software sys- and an edit set is specified in a query- tem in terms of source objects and their by-example style with the help of regu- relationships; furthermore, it also con- lar expressions (e.g., the edit set in Fig- tains build rules. Versions are not refer- enced in the system model. Rather, they are selected by configuration rules in 9 However, it should be noted that the system the configuration thread. Rules are or- model itself is version-controlled: thus, a version dered sequentially according to their of the system model is selected in an initial step priorities. The DSEE configurator se- not shown in Figure 22.

ACM Computing Surveys, Vol. 30, No. 2, June 1998 268 • R. Conradi and B. Westfechtel

Figure 23. MVPE. ure 23 denotes all Unix versions). When Gandalf, Adele explicitly distinguishes displaying a view, the editor highlights three different classes of rules, namely, fragments belonging to all versions in constraints, preferences, and defaults. the edit set (a change should be con- However, the configuration rules are fined to these fragments). more sophisticated and allow attribute- based version selection. Using these 6.3.8 Cedar. The Cedar environment rules, short intensional descriptions can [Teitelman 1984] supports program- be given for large and complex configu- ming in Cedar, a modular language de- rations. veloped at XEROX PARC. Modules of Cedar programs contain generic refer- ences to other modules. Notably, a cli- 6.3.10 DAMOKLES. DAMOKLES ent module may use multiple realiza- [Dittrich et al. 1986; Gotthard 1988] was tion versions of an imported interface among the first systems applying data- simultaneously. The Cedar System base technology to SCM. DAMOKLES is Modeler [Lampson and Schmidt 1983b, based on an EER data model featuring 1983a] takes care of compiling and link- composite objects, structural inheri- ing Cedar programs. A system model in tance at both type and instance level, Cedar differs considerably from what is versioning, and database hierarchies. also called system model in DSEE. In Composite objects may overlap at the particular, a Cedar system model binds instance level (acyclic graphs), and they generic references to specific versions. may be defined recursively at the type System models describe configuration level. Objects may carry both short and versions in terms of immutable source long attributes, and the smallest granu- files and therefore roughly correspond larity (leaves of the composition hierar- to bound configuration threads in DSEE chy) may be chosen as desired (e.g., (extensional versioning, “version-first” coarse-grained objects such as modules selection). or fine-grained objects such as state- ments). 6.3.9 Adele I. In contrast to Gandalf, The version model, which is built into Adele I [Estublier 1985; Belkhatir and the data model rather than defined on Estublier 1986] was developed in the top of it, is partly influenced by SCCS/ mid-’80s for tool-kit environments (a RCS. Versions of one object are ar- posteriori integration of file-based ranged in a version graph, which may tools). Adele generalizes Gandalf’s ap- be defined as a sequence, tree, or DAG proach to configuring modular programs in the database schema (see also Figure in several ways. Whereas each Gandalf 3). Any object type may be defined as module has a unique interface, an Adele versioned (i.e., total versioning of ob- family may have multiple versions of its jects at all levels of the composition interface. Each interface version and its hierarchy). Versions may even have ver- realization variants and revisions corre- sions themselves (recursive versioning). spond to a module in Gandalf. Like Any structure of an AND/OR graph may

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 269 be represented (product first, version is passed to a compiler or linker (struc- first, intertwined). A version inherits tural variation), which options are from its generic object all attributes and passed to a preprocessor (single-source components (delegation). variation, conditional compilation), and which flags are passed to a compiler or 6.3.11 PCTE. PCTE [Wakeman and linker (derivation variations). In this Jowett 1993] is a standard for open way, Shape integrates a blend of heter- repositories that provides an interface ogeneous mechanisms previously han- for implementing software engineering dled separately from each other. tools. PCTE is based on a data model combining concepts from the EER model 6.3.13 Aide-de-Camp. Aide-de-Camp and the UNIX file system. Each object [Cronk 1992; Software Maintenance may have at most one long attribute for and Development Systems 1990] de- which UNIX-like file operations are pro- scribes versions of products in terms of vided. Links between objects are classi- change sets relative to a baseline. A fied into predefined categories, includ- change set describes logically related, ing composition links for representing global changes that may affect multiple composite objects. files. The finest grain of change is a text PCTE offers basic versioning facili- line. In contrast to layers in PIE, ties [Oquendo et al. 1989]. Unlike change sets are totally ordered accord- DAMOKLES, the version model is de- ing to their creation times. If change set fined on top of the data model. Versions c1 was created before c2, c1 will be of composite objects represent bound applied before c2 when both change sets configurations (“version-first” selection). are included in some product version. In A new version is created by recursively the case of overlaps, c2 overrides the copying the whole composition hierar- changes in c1. chy and establishing successor relation- Each change set may be viewed as a ships between all components. Incoming switch that can be turned either on or and outgoing links are copied selectively off (see Figure 6). Aide-de-Camp detects (depending on link categories and cardi- inconsistencies when reconstructing a nalities). product version from a baseline and a sequence of change sets (e.g., modifica- 6.3.12 Shape. Shape [Lampen and tions to nonexisting text lines). Further- Mahler 1988; Mahler and Lampen 1988] more, Aide-de-Camp provides a three- is an SCM system that combines ideas way merge tool for conflict detection. drawn from Make, DSEE, and RCS. An Unlike PIE, Aide-de-Camp does not attributed file system stores versioned support relationships that can be used files using directed deltas [Obst 1987]. to detect inconsistent combinations of A Shape file consists of both version change sets. selection rules and build rules and roughly corresponds to a Make file plus 6.3.14 COV. COV [Lie et al. 1989; a DSEE configuration thread. Derived Munch et al. 1993] denotes the version objects are stored in a cache. model underlying the SCM subsystem of The original contribution of Shape the EPOS software engineering environ- lies in its integrated variant manage- ment [Conradi et al. 1994]. EPOS is ment [Mahler 1994]. Variant attributes based on an EER data model, where are used in version selection rules. Fur- files are represented by entities with thermore, a set of bindings is attached long attributes. A versioned database to each variant. Each binding associates consists of fragments corresponding, for the name of an attribute with a value. example, to groups of attributes or se- Bindings control in which order directo- quences of text lines. Similarly to condi- ries are searched for source objects tional compilation, each fragment is an- (variant segregation), which set of files notated with a control expression called

ACM Computing Surveys, Vol. 30, No. 2, June 1998 270 • R. Conradi and B. Westfechtel visibility. Visibilities refer to global SQL-like manner. Configuration rules Boolean options that constitute an n- in queries are classified into constraints dimensional version grid. Like MVPE, and preferences, the latter of which can COV distinguishes between read and be ordered sequentially. Preferences act write filters, called ambition and choice, as filters that are applied only if the respectively. A choice is a complete set resulting set of versions is not empty. of option bindings (single version). An In addition, the rule base contains ambition contains a subset of the option constraints specified by compatibility bindings of the choice and corresponds rules. A compatibility rule is an asser- to some region of the version space. The tion in a restricted first-order predicate ambition defines the versions to be af- calculus. The conditions under which fected by a change, and the choice deter- two versions from different modules are mines the version presented to the user. compatible are stated in terms of ver- COV stands for change-oriented ver- sion attributes. Due to the restricted sioning, which suggests a specific inter- form of constraints, SIO can efficiently pretation of options: each option corre- check for contradictions between them. sponds to a global change that can be either included or omitted when config- 6.3.16 Inscape. Inscape [Perry 1989] uring a product version. In fact, both goes beyond Gandalf and Adele in ad- state- and change-based version models dressing semantic rather than syntactic can be expressed with options [Conradi consistency. Operations exported by a and Westfechtel 1997]. For example, module are annotated with Hoare-like version graphs may be represented with pre- and postconditions. The Inscape en- the help of configuration rules [Gulla et vironment assists the user in construct- al. 1991] that constrain delta applica- ing semantically consistent programs in tion (e.g., implication for revision chains various ways (inference of pre- and and mutual exclusion for branches). In postconditions from the implementation particular, these configuration rules of an operation, detection of unsatisfied distinguish COV from change-based preconditions at call sites of operations, systems, such as Aide-de-Camp or PIE, etc.). which provide little support for exclud- Version control [Perry 1987] is per- ing inconsistent change combinations formed at a semantic level and ad- [Conradi and Westfechtel 1996]. dresses substitutability of operation ver- In COV, configuration rules are not sions. To this end, Inscape defines (and only used passively for detecting incon- checks) various compatibility predicates sistent selections. In Munch [1996], a (between versions v1 and v2) that en- version-selection tool is presented that sure either global or local substitutabil- actively supports the user in setting up ity. In the global case, v1 can be substi- consistent ambitions and choices by au- tuted for v2 without having to inspect tomatically deducing option bindings the call sites. In contrast, local substi- from configuration rules. tutability refers to specific call sites.

6.3.15 SIO. SIO [Bernard et al. 6.3.17 POEM. POEM [Lin and Reiss 1987; Lavency and Vanhoedenaghe 1995, 1996] is an environment for pro- 1988] extends relational database tech- gramming in C/Cϩϩ that strives to sim- nology with deductive rules for version plify SCM at the user interface. To this selection. In SIO, a software product end, SCM is provided in terms of mod- consists of a set of modules each of ules; that is, SCM matches the logical which is represented by a relation. Each abstractions made by programmers. All tuple of a relation corresponds to a sin- components of a module (including gle version characterized by a set of source code, object code, and documen- attributes used for version selection. tation) are aggregated into a single ob- Configurations are described in an ject called a software unit. Software

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 271 units are connected by relationships sions only files; however, ClearCase that represent dependencies between manages versions of directories as well. source objects (see Figure 1(d)). All kinds of versioned objects are uni- For each software unit, a set of opera- formly denoted as elements. The ver- tions is provided for editing, building, sioned file system may be accessed and version control. Versions of one through a single-version view (configu- software unit are arranged in a version ration thread) defined by a configura- tree. A version of a software unit tion description. The view is a filter that uniquely determines all versions of provides tools with the illusion of work- units on which it depends (“version- ing in a single-version environment (vir- first” selection, see Figure 8(b)). tual file system). In contrast to DSEE, generic refer- 6.3.18 CoMa. CoMa [Westfechtel ences to elements are bound dynami- 1994, 1996] manages configurations of cally only when an element is accessed engineering design documents and has (there is no precomputed version map). been applied to both software engineer- Furthermore, in ClearCase the system ing and mechanical engineering. The model is accessed in the same way as CoMa model integrates composition hi- any other element (whereas in DSEE it erarchies, dependencies, and versions is used for constructing a bound config- into a coherent framework based on a uration thread; see also Figure 22). small set of concepts. Configurations Configuration rules are similar to are versioned objects whose components those offered by DSEE and are used to are connected by dependencies. Version establish workspaces for developers and graphs are maintained for both docu- to control change propagation between ments and configurations. these workspaces. A stable work envi- CoMa is based on attributed graphs. ronment may be set up by static rules The underlying model was defined by a (referring to specific versions). Dynamic programmed graph rewriting system rules are used to see recent changes using the PROGRES specification lan- performed by other developers. guage [Schu¨ rr et al. 1995]. At the imple- To support distributed SCM, ClearCase mentation level, the database system assigns ownerships to branches in ver- GRAS [Kiesel et al. 1995] is used, which sion graphs [Allen et al. 1995]. Each site offers primitives for version control appends revisions to its allocated branch. (graph deltas) but does not introduce a After having imported new revisions from version model (this is done on top of the another site, three-way merging is used data model). to combine local and remote changes. Finally, for software engineering ap- plications, a structure-oriented merge 6.3.20 PCL. PCL [Tryggeseth et al. tool [Westfechtel 1991] provides for 1995], the configuration language devel- three-way merging of versions of soft- oped in the PROTEUS project, is influ- ware documents (e.g., requirements def- enced by conditional compilation and initions, software architectures, module module interconnection languages (in implementations) that are internally particular, SySL [Sommerville and represented as abstract syntax graphs. Thomson 1989]). PCL is designed to The merge tool preserves context-free manage different types of variants at correctness and detects certain kinds of the coarse-grained level. Variation of context-sensitive conflicts by analyzing the logical structure refers to the de- bindings of identifiers to their declara- composition of a system into logical tions. components (an example was given in part (b) of Figure 12). Variation of the 6.3.19 ClearCase. ClearCase [Leb- physical structure means that a logical lang 1994] differs from its predecessor component may be mapped in different DSEE in several aspects. DSEE ver- ways into physical components (files re-

ACM Computing Surveys, Vol. 30, No. 2, June 1998 272 • R. Conradi and B. Westfechtel siding in directories). Finally, variation general facilities for composite objects, of system building occurs when a source versioning, workspaces, and process object is compiled in different ways with management. On top of these, dedicated different compilation switches. SCM tools may be built (e.g., the Adele All kinds of variations are controlled configurator described earlier). with a single mechanism, namely, at- Adele distinguishes between three or- tributes. In a configuration description, thogonal dimensions of versioning [Es- some of these attributes are assigned tublier and Casallas 1995]. Historical specific values. The configuration pro- versioning refers to the time dimension cess then proceeds top-down, resolving and introduces temporal database sup- logical and physical variation by means port. A versioned object evolves linearly of attributes (functional configurator). along the time axis. Attributes are di- When all physical components have vided into three classes: common at- been determined, a Make file is gener- tributes shared by all versions, version- ated as a final step. specific mutable attributes, and version- specific immutable attributes (see also 6.3.21 VOODOO. VOODOO [Reichen- Figure 13). Updates to immutable at- berger 1994, 1995] is a file-based SCM tributes result in creating a new ver- system that supports orthogonal version sion. Logical versioning (variants) is management. Similarly to Gandalf and supported through set-valued attributes Adele, revisions and variants are sepa- (e.g., a module may have multiple real- rated from each other rather than inter- ization variants coexisting at a given mixed in version graphs. However, time). Cooperative versioning is realized VOODOO inverts the selection order with the help of typed workspaces [Es- and selects the revision first. Further- tublier 1996]. A workspace type defines more, versioning is applied at the prod- the types of objects and relationships it uct level rather than the component contains, as well as propagation policies level. Through a carefully designed user for exchange of versions between neigh- interface, VOODOO tries to make ver- bors (both vertically and horizontally). sion management as simple as possible. For a given revision, software objects 6.3.23 Asgard. Asgard [Micallef and are organized hierarchically in a project Clemm 1996], which has been realized tree whose leaves represent versioned on top of ClearCase, provides change- components. A set of globally defined based versioning on top of version variants is associated with each version. graphs. Thus it inverts the approach Based on the three-dimensional model followed by COV, where version graphs shown in Figure 11, VOODOO provides may be introduced on top of change- different views on a software product. based versioning by defining constraints For example, when the user first selects on the combination of changes. In As- a product revision and then a variant, a gard, these constraints are derived from project tree is displayed that is purged the version graphs of components. from all components not belonging to Each component version is tagged this variant. with the name of the activity (Asgard’s term for change) that created it. A 6.3.22 Adele II. Adele II, the current workspace is defined by a baseline and version of Adele [Estublier and Casallas a set of activities A, one of which is 1994, 1995; Estublier 1996], differs con- designated as the current activity. If a siderably from the initial version de- version was created by some activity scribed earlier in this section. In partic- a ʦ A, all versions on the path from the ular, the data-modeling capabilities baseline must have been created by have been improved and generalized. some other activity aЈ ʦ A (complete Now Adele may be viewed as an active selection). Furthermore, there must be object-oriented database system with a unique maximal element (unique se-

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 273 lection). In the case of a selection error, 7. RELATED WORK the user must either add further activi- We have given a comprehensive descrip- ties to A or apply a merge tool to resolve tion of the current state of the art of an ambiguity. version models for SCM. We have fo- cused mainly on the organization of the 6.3.24 ICE. Like COV, ICE [Zeller version space and the flexible construc- and Snelting 1995] is derived from con- tion of consistent configurations from ditional compilation and represents a intensional specifications. Furthermore, versioned object base as a set of frag- we have described and classified a sig- ments that are tagged with control ex- nificant number of representative SCM pressions. COV and ICE differ with re- systems. In Section 7.1, we discuss re- spect to their underlying logic calculus. lated surveys conducted earlier (1988 ICE is based on feature logic: a feature through 1991). Subsequently, we point corresponds to an attribute whose value out how the work presented here is re- is defined by a feature term. For exam- lated to other disciplines. ple, [ws : X11] means that the ws fea- ture has the value X11. In general, a 7.1 Related Work on Version Models feature term denotes a set of potential values and may be composed by a wide Two overviews were presented in 1988 range of operators such as unification, at the first SCM workshop [Winkler 1988], which has launched a (still ongo- subsumption, negation, and the like 10 [Zeller 1996]. Probably the most impor- ing) series of follow-ups. Tichy’s tant of these is unification, which is [1988] paper introduces basic notions used to compose configurations (the fea- such as software object, source and de- ture terms of component versions are rived object, and the like, and discusses unified). A configuration is inconsistent version graphs, system building, and if unification fails (empty intersection of version selection based on AND/OR value sets as, e.g., in [ws : X11] u [ws : graphs. Furthermore, the paper at- Windows]). tempts to unify the terminology in the Feature logic may be employed as a SCM field by means of a glossary. Es- base mechanism on top of which differ- tublier [1988] complements Tichy’s pre- sentation by focusing on the construc- ent version models may be realized tion of consistent configurations. In (uniform version model). In Zeller combination, these papers reflect the [1995] and Zeller and Snelting [1997], state of the art as of 1988. feature logic is used to realize the The Software Engineering Institute checkout/checkin model, the composi- (SEI) has published several papers that tion model, the long transaction model, review and survey the state of the art in and the change set model as introduced SCM [Brown et al. 1991; Dart 1991, by Feiler [1991a] (see also Section 7.1). 1992b]. Perhaps the most influential of Like COV, ICE supports multiversion these was written in 1991 by Feiler editing. However, there is no distinction [1991a], who classified the models un- between read and write filters. Rather, derlying SCM systems into four catego- ICE presents all versions to be edited to ries corresponding to different ways in the user, using the syntax of conditional which a user interacts with an SCM compilation. To this end, feature terms system. In the checkout/checkin model, are mapped onto preprocessor directives component versions are transferred in- [Zeller 1996]. Partial evaluation is used dividually between repository and work- to remove all fragments whose feature space. The composition model supports terms cannot be unified with the sub- mitted query. Like ClearCase, ICE sup- 10 Please see Tichy [1989], Feiler [1991b], Feld- ports a virtual file system to enable man [1993], Estublier [1995], Sommerville [1996], smooth tool integration. and Conradi [1997].

ACM Computing Surveys, Vol. 30, No. 2, June 1998 274 • R. Conradi and B. Westfechtel version selection through rules and as- sioning is provided only at the configu- sists the user in selecting consistent ration level, resulting in the composi- combinations of component versions tion model introduced by Feiler. As we (“product first”). In the long transaction have shown, there are radically differ- model, a user connects to a long trans- ent approaches such as conditional com- action and operates on a configuration pilation that are not based on version version (“version first”). Finally, in the graphs at all. Furthermore, Katz pri- change set model a configuration is de- marily discusses state-based version- scribed in terms of change sets each of ing. In contrast, this article covers which aggregates all modifications per- change-based versioning as well and formed in response to some change re- investigates the relations among these quest. complementary approaches. Finally, con- The classification proposed by Feiler figuration rules and the consistency helps in understanding different para- problems of intensional versioning are digms underlying SCM systems. Fur- discussed only briefly by Katz. thermore, the models are evaluated by discussing their merits and shortcom- 7.2 Related Disciplines ings. Unfortunately, the classification is not orthogonal. The composition model Version management is related to many and the checkout/checkin model are not other disciplines of computer science. In alternatives; rather, the former is built the following, these relations are de- on top of the latter. Furthermore, long scribed briefly. A major challenge of fu- transactions can be used in combination ture research consists of clarifying the with any approach to specifying a con- relations to these disciplines. figuration. Temporal databases [Tansel et al. In 1990, Katz [1990] surveyed version 1993; Snodgrass 1992] record the evolu- models for engineering databases. Katz tion history of data such that previous primarily considers electrical engineer- states can be retrieved in addition to ing (CAD) and only mentions a few soft- the current state. Temporal databases ware engineering approaches. Although focus solely on the time dimension and both domains have evolved almost inde- cover neither variants nor change-based pendently for a long time, many paral- versioning. Furthermore, they often dis- lels do exist [Dart 1992a]. Katz intro- tinguish between valid time (time in the duces the following concepts: organizing real world) and transaction time (time the version set (histories), dynamic con- of recording a fact in the database). This figuration mechanisms (binding of ge- distinction is not relevant for SCM be- neric references), hierarchical composi- cause software objects do not represent tions (versions of composite objects), real-world objects existing indepen- version-grouping mechanisms (to repre- dently of the database. sent variants), instances versus defini- Different approaches have been devel- tions (instances of component versions oped to accommodate changes to the may have properties that depend on the database schema. In the case of schema respective contexts in configurations), evolution, only the current version of change notification and propagation the schema is valid, and all data must (when and where to propagate changes), be converted (in lazy or eager mode) in and object sharing mechanisms (work- order to maintain the consistency of the spaces). database. In contrast, schema version- In several aspects, our view of version ing [Roddick 1995] makes it possible to models goes beyond the work presented view the data under different versions by Katz. His paper focuses on ap- of the schema. In SCM systems, ver- proaches based on version graphs and sioning of the schema (and other meta- restricted to extensional versioning at data such as configuration rules) is the component level. Intensional ver- rarely considered seriously. On the

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 275 other hand, schema versioning often gramming [Sutton et al. 1995], and hy- does not take the versioning of instance brids of these. In order to integrate data into account. SCM and process modeling, functional Deductive databases [Das 1992; Ra- overlap has to be considered (e.g., be- mamohanarao and Harland 1994; Ra- tween build tools and rule-based process makrishnan and Ullman 1995] provide engines such as Marvel [Kaiser et al. for persistent storage of facts and rules 1988]). Furthermore, the definition of and are usually based on a Prolog-like “product space” must be widened and data model. Deductive capabilities are must cover process models as well. Fi- urgently needed for intensional version- nally, dynamic interactions between ing. On the other hand, deductive data- product and process need to be taken bases have been employed only rarely in into account (e.g., replanning of task SCM [Zeller 1995; Bernard et al. 1987; nets after changes to the product struc- Lavency and Vanhoedenaghe 1988]. ture [Liu and Conradi 1993]). Rather, many SCM systems incorporate Tool integration [Wasserman 1990] is home-grown deductive components that provided by SCM systems through have been developed in an ad hoc man- workspaces that hide versioning from ner. the tools. Workspaces can be separated It has been recognized for a long time physically from the versioned database that the ACID principle cannot be [Rochkind 1975], or be realized as up- transferred from short to long transac- datable database views (virtual work- tions [Barghouti and Kaiser 1991; Kai- spaces, e.g., virtual file systems [Leb- ser 1995; Feiler 1991a]. Rather, precom- lang 1994; Fowler et al. 1994]). Current mit cooperation is required in order to SCM systems focus on integration of coordinate long-lasting development file-based tools and offer poor support and maintenance tasks. Customizable for integrating tools operating on data- policies have been developed to control bases (e.g., CASE tools [Wallnau 1992]). cooperation. Many approaches to long The problem of integrating heteroge- transactions do not take versioning into neous database systems without sacri- account [Barghouti and Kaiser 1991]. ficing their autonomy is addressed by This is a severe restriction since ver- federated database systems [Sheth and sions play a crucial role in cooperation Larson 1990] and data warehouses control [Estublier and Casallas 1995]. [Hammer et al. 1995]. Furthermore, dif- So far, only a few SCM systems support ferent kinds of platforms or frameworks long transactions [Conradi and Malm offer plug-in interfaces for tool integra- 1991; Godart et al. 1995]. Many others tion, for example, broadcast message merely provide workspaces and mecha- servers [Reiss 1990] and object request nisms for controlling change propaga- brokers [Soley and Kent 1995]. Future tion between them [Estublier 1996]. generations of SCM systems need to Software process modeling [Finkel- interface with these frameworks. stein et al. 1994; Curtis et al. 1992; Current SCM systems are severely Rombach and Verlage 1995] is con- limited with respect to managing depen- cerned with the definition, analysis, and dencies between software objects. First, enactment of models of real-world soft- they are mainly concerned with depen- ware processes. Many different para- dencies between source code modules digms have been applied to process rather than with dependencies among modeling [Conradi et al. 1991], includ- any kinds of software objects produced ing active databases [Estublier and Ca- in the software lifecycle. Second, they sallas 1994], rules [Kaiser et al. 1988; are not capable of representing fine- Peuschel and Scha¨fer 1992], nets [Deit- grained dependencies, which is crucial ers and Gruhn 1990; Bandinelli et al. to provide for detailed traceability 1993; Jaccheri and Conradi 1993; throughout the whole lifecycle. These Heimann et al. 1996], imperative pro- problems can partly be addressed by

ACM Computing Surveys, Vol. 30, No. 2, June 1998 276 • R. Conradi and B. Westfechtel applying the concepts of hypertext sys- of source and derived versions, as well tems [Conklin 1987] to the software en- as workspaces and long transactions gineering domain. into a coherent framework [Conradi and The emerging discipline of software Westfechtel 1997]. This framework is architectures [Shaw and Garlan 1996] not expected to provide “the” model; stresses the importance of a high-level rather, it must be customizable to suit description of software products above the needs of a specific application. the source code level. The software ar- In the future, we expect that more chitecture acts as a central document and more SCM systems will be built for impact analysis, planning of devel- with the help of database technology. opment and maintenance activities, di- Having gained a better understanding vision of labor, understanding the inter- of version models, versioning can be faces between different components, pulled out of SCM systems and moved and so on [Nagl 1990]. SCM may benefit into database systems. In particular, in- from software architectures in some tensional versioning will benefit from ways. First, SCM systems focus primar- the powerful facilities of an underlying ily on source code and represent the deductive database system. product structure by rather low-level system models that are mainly used to ACKNOWLEDGMENTS drive system building. Architecture-ori- ented SCM will improve the software Many thanks go to the EPOS team in Trondheim and to the IPSEN team in Aachen. Furthermore, process through a high-level product de- we would like to acknowledge the competent and scription. Second, the architecture of comprehensive comments of the unknown review- the SCM system is a major research ers. Finally, the references were prepared partly challenge as well. In order to design an using the bibliography of Hal Render, University appropriate architecture, a clear under- of Colorado at Boulder. standing of the relations among an SCM system and other software components (e.g., process management systems, REFERENCES broadcast message servers, object re- ADAMS E., GRAMLICH, W., MUCHNICK, S., AND TIRF- quest brokers) needs to be developed. ING, S. 1986. SunPro: Engineering a practi- cal program development environment. In Proceedings of the International Workshop on 8. CONCLUSION Advanced Programming Environments (Trondheim, June) R. Conradi, T. M. Didrik- Over the past 20 years, many ap- sen, and D. H. Wanvik, Eds., LNCS 244, proaches to versioning have been devel- Springer-Verlag, 86–96. oped. Now we have gained a sufficient ADAMS,E.W.,HONDA, M., AND MILLER,T.C. level of understanding to classify these 1989. Object management in a CASE envi- approaches. Initial attempts to develop ronment. In Proceedings of the Eleventh Inter- national Conference on Software Engineering a uniform model have been undertaken (Pittsburgh, PA, May), IEEE Computer Soci- [Zeller 1995; Zeller and Snelting 1997]. ety Press, Los Alamitos, CA, 154–163. Furthermore, the recent evolution of ADAMS,P.AND SOLOMON, M. 1995. An overview SCM systems shows that their underly- of the CAPITL software development environ- ing version models are converging to an ment. In Software Configuration Manage- increasing extent. For example, change- ment: Selected Papers SCM-4 and SCM-5 (Se- attle, WA, April), J. Estublier, Ed., LNCS based versioning has been realized on 1005, Springer-Verlag, 1–34. top of version graphs and vice versa. ALLEN, L., FERNANDEZ, G., KANE, K., LEBLANG, D., Based on the material presented in MINARD, D., AND POSNER, J. 1995. ClearCase this article, we believe that a version MultiSite: Supporting geographically-distrib- model can be developed that integrates uted software development. In Software Con- figuration Management: Selected Papers extensional and intensional versioning, SCM-4 and SCM-5, (Seattle, WA, April), J. state-based and change-based version- Estublier, Ed., LNCS 1005, Springer-Verlag, ing, revisions and variants, construction 194–214.

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 277

BABICH, W. A. 1986. Software Configuration and survey. IEEE Comput. 20, 9 (Sept.), 17– Management. Addison-Wesley, Reading, MA. 41. BANDINELLI,S.C.,FUGGETTA, A., AND GHEZZI, CONRADI,R.ED. 1997. Software Configuration C. 1993. Software process model evolution Management: ICSE’97 SCM-7 Workshop (Bos- in the SPADE environment. Trans. Softw. ton, MA, May), LNCS 1235, Springer-Verlag. Eng. 19, 12 (Dec.), 1128–1144. CONRADI, R., ET AL. 1994. EPOS: Object-ori- BARGHOUTI,N.S.AND KAISER, G. E. 1991. Con- ented and cooperative process modelling. In currency control in advanced database appli- Software Process Modelling and Technology, cations. ACM Comput. Surv. 23, 3 (Sept.), Advanced Software Development Series. A. 269–317. Finkelstein, J. Kramer, and B. Nuseibeh, Eds., Research Studies Press (John Wiley), BELKHATIR, N., AND ESTUBLIER, J. 1986. Experi- Chichester, UK, 33–70. ence with a data base of programs. In Pro- ceedings of the ACM SIGSOFT/SIGPLAN CONRADI,R.AND MALM, C. 1991. Cooperating Software Engineering Symposium on Practi- transactions and workspaces in EPOS: Design cal Software Development Environments, and preliminary implementation. In Proceed- ings of the Third International Conference on (Palo Alto, CA, Dec.) ACM SIGPLAN Not. 22, Advanced Information Systems Engineering 1, 84–91. (CAISE ’91) R. Andersen, J. A. Bubenko, and BERNARD, Y., LACROIX, M., LAVENCY, P., AND VAN- A. Solvberg, Eds. (Trondheim, May), LNCS HOEDENAGHE, M. 1987. Configuration man- 498, Springer-Verlag, 375–392. agement in an open environment. In Proceed- CONRADI, R., AND WESTFECHTEL, B. 1996. Con- ings of the 1st European Software Engineering figuring versioned software products. In Soft- Conference (Straßburg, Sept.), G. Goos and J. ware Configuration Management: ICSE’96 Hartmanis, Eds., LNCS 289, Springer-Verlag, SCM-6 Workshop (Berlin, March), I. Sommer- 35–43. ville, Ed., LNCS 1167, Springer-Verlag, 88– BERSOFF,E.H.,HENDERSON,V.D.,AND SIEGEL, 109. S. G. 1980. Software Configuration Man- CONRADI,R.AND WESTFECHTEL, B. 1997. To- agement: An Investment in Product Integrity. wards a uniform version model for software Prentice-Hall, Englewood Cliffs, NJ. configuration management. In Software Con- BERZINS, V. 1994. Software merge: Semantics figuration Management: ICSE’97 SCM-7 of combining changes to programs. ACM Workshop (Boston, MA, May), R. Conradi, Trans. Program. Lang. Syst. 16, 6 (Nov.), Ed., LNCS 1235, Springer-Verlag, 1–17. 1875–1903. CONRADI, R., LIU, C., AND JACCHERI, M. L. 1991. BERZINS, V., ED. 1995. Software Merging and Process modeling paradigms: An evaluation. Slicing. IEEE Computer Society Press, Los In Proceedings of the Seventh International Alamitos, CA. Software Process Workshop (Yountville, CA, Oct.), IEEE Computer Society Press, Los BINKLEY, D., HORWITZ, S., AND REPS, T. 1995. Alamitos, CA, 51–54. Program integration for languages with pro- cedure calls. ACM Trans. Softw. Eng. Meth- CRONK, R. D. 1992. Tributaries and deltas. odol. 4, 1 (Jan.), 3–35. BYTE 17, 1 (Jan.), 177–186. CURTIS, B., KELLNER,M.I.,AND OVER, J. 1992. BORISON, E. A. 1989. Program changes and the cost of selective recompilation. Tech. Rep. Process modeling. Commun. ACM 35, 9 (Sept.), 75–90. CMU-CS-89-205 (July), Department of Com- puter Science, Carnegie Mellon University, DART, S. 1991. Concepts in configuration man- Pittsburgh, PA. agement systems. In Proceedings of the Third International Workshop on Software Configu- BROWN, A., DART, S., FEILER, P., AND WALLNAU,K. ration Management (Trondheim, Norway, 1991. The state of automated configuration June), P. H. Feiler, Ed., ACM Press, New management. Tech. Rep. ATR92 (Sept.), Soft- York, 1–18. ware Engineering Institute, Carnegie-Mellon DART, S. A. 1992a. Parallels in computer-aided University, Pittsburgh, PA. design frameworks and software development BUFFENBARGER, J. 1995. Syntactic software environments efforts. IFIP Trans. A 16, 175– merging. In Software Configuration Manage- 189. ment: Selected Papers SCM-4 and SCM-5 (Se- DART, S. A. 1992b. The past, present, and fu- attle, WA, April), J. Estublier, Ed., LCNS ture of configuration management. Tech. Rep. 1005, Springer-Verlag, 153–172. CMU/SEI-92-TR-8 (July), Software Engineer- CLEMM, G. 1995. The Odin system. In Software ing Institute, Carnegie-Mellon University, Configuration Management: Selected Papers Pittsburgh, PA. SCM-4 and SCM-5 (Seattle, WA, April), J. DART,S.A.,ELLISON,R.J.,FEILER,P.H.,AND Estublier, Ed., LNCS 1005, Springer-Verlag, HABERMANN, A. N. 1987. Software develop- 241–262. ment environments. IEEE Computer 20, 11 CONKLIN, J. 1987. Hypertext: An introduction (Nov.), 18–28.

ACM Computing Surveys, Vol. 30, No. 2, June 1998 278 • R. Conradi and B. Westfechtel

DAS, S. K. 1992. Deductive Databases and Tech. Rep. CMU/SEI-91-TR-7 (March), Soft- Logic Programming. Addison-Wesley, Read- ware Engineering Institute, Carnegie-Mellon ing, MA. University, Pittsburgh, PA. DEITERS,W.AND GRUHN, V. 1990. Managing FEILER,P.H.,ED. 1991b. Proceedings of the software processes in the environment MEL- Third International Workshop on Software MAC. In Proceedings of the Fourth ACM SIG- Configuration Management (Trondheim, Nor- SOFT Symposium on Practical Software De- way, June), ACM Press, New York. velopment Environments (Irvine, CA, Dec.), FELDMAN, S., ED. 1993. Proceedings of the ACM SIGSOFT Softw. Eng. Not. 15, (6) R. Fourth International Workshop on Software Taylor, Ed., 193–205. Configuration Management (Preprint) (Balti- DITTRICH, K., GOTTHARD, W., AND LOCKEMANN,P. more, MD, May). 1986. DAMOKLES, a database system for FELDMAN, S. I. 1979. Make—A program for software engineering environments. In Pro- maintaining computer programs. Softw. ceedings of the International Workshop on Ad- Pract. Exper. 9, 4 (April), 255–265. vanced Programming Environments (Trond- FINKELSTEIN, A., KRAMER, J., AND NUSEIBEH, B., heim, June), R. Conradi, T. M. Didriksen, and EDS. 1994. Software Process Modelling and D. H. Wanvik, Eds., LNCS 244, Springer- Technology. Advanced Software Development Verlag, 353–371. Series. Research Studies Press (Wiley), Chich- DITTRICH,K.R.AND LORIE, R. A. 1988. Version ester, UK. support for engineering database systems. IEEE Trans. Softw. Eng. 14, 4 (April), 429– FOWLER, G., KORN, D., AND RAO, H. 1994. n- 437. DFS: The multiple dimensional file system. In Configuration Management, W. F. Tichy, Ed., EHRIG, H., FEY, W., HANSEN, H., LO¨WE, M., AND Vol. 2 of Trends in Software, Wiley, New JACOBS, D. 1989. Algebraic software devel- York, 135–154. opment concepts for module and configuration families. In Proceedings of the Ninth Confer- FRASER,C.AND MYERS, E. 1986. An editor for ence on Foundation of Software Technology revision control. ACM Trans. Program. Lang. and Theoretical Computer Science (Bangalore, Syst. 9, 2 (April), 277–295. Dec.), V. Madhavan, Ed., LNCS 405, Spring- GODART, C., CANALS, G., CHAROY, F., AND MOLLI, er-Verlag, 181–192. P. 1995. About some relationships between configuration management, software process, ESTUBLIER, J. 1985. A configuration manager: The Adele data base of programs. In Proceed- and cooperative work: The COO environment. ings of the Workshop on Software Engineering In Software Configuration Management: Se- Environments for Programming-in-the-Large lected Papers SCM-4 and SCM-5 (Seattle, (Harwichport, MA, June), 140–147. WA, April), J. Estublier, Ed., LNCS 1005, Springer-Verlag, 173–178. ESTUBLIER, J. 1988. Configuration manage- ment: The notion and the tools. In Proceed- GOLDSTEIN,I.P.AND BOBROW, D. G. 1980. A ings of the International Workshop on Soft- layered approach to software design. Tech. ware Version and Configuration Control Rep. CSL-80-5, XEROX PARC, Palo Alto, CA. (Grassau, Germany), J. F. H. Winkler, Ed., GOODSTEP 1995. The GOODSTEP Project— Teubner Verlag, 38–61. Final Report. GOODSTEP ESPRIT Project ESTUBLIER, J. 1996. Workspace management in 6115. software engineering environments. In Soft- GOTTHARD, W. 1988. Datenbanksysteme fu¨r ware Configuration Management: ICSE’96 Software-Produktionsumgebungen. IFB 193, SCM-6 Workshop (Berlin, March) I. Sommer- Springer-Verlag, Berlin. ville, Ed., LCNS 1167, Springer-Verlag. GULLA, B., KARLSSON, E.-A., AND YEH, D. 1991. ESTUBLIER,J.ED. 1995. Software Configuration Change-oriented version descriptions in Management: Selected Papers SCM-4 and EPOS. Softw. Eng. J. 6, 6 (Nov.), 378–386. SCM-5 (Seattle, WA, April), LNCS 1005, HABERMANN,N.AND NOTKIN, D. 1986. Gandalf: Springer-Verlag. Software development environments. IEEE ESTUBLIER,J.AND CASALLAS, R. 1994. The Adele Trans. Softw. Eng. 12, 12 (Dec.), 1117–1127. configuration manager. In Configuration HAMMER, J., GARCIA-MOLINA, H., WIDOM, J., LABIO, Management, W. F. Tichy, Ed., Vol. 2 of W., AND ZHUGE, Y. 1995. The Stanford data Trends in Software, Wiley, New York, 99– warehousing project. Data Eng. Bull. 18, 2, 134. 41–48. ESTUBLIER,J.AND CASALLAS, R. 1995. Three di- HEIMANN, P., JOERIS, G., KRAPP, C.-A., AND WEST- mensional versioning. In Software Configura- FECHTEL, B. 1996. DYNAMITE: Dynamic tion Management: Selected Papers SCM-4 and task nets for software process management. SCM-5 (Seattle, WA, April), J. Estublier, Ed., In Proceedings of the Eighteenth International LNCS 1005, Springer-Verlag, 118–135. Conference on Software Engineering (Berlin, FEILER, P. H. 1991a. Configuration manage- March), IEEE Computer Society Press, Los ment models in commercial environments. Alamitos, CA, 331–341.

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 279

HORWITZ, S., PRINS, J., AND REPS, T. 1989. Inte- Practical use of a practical polymorphic appli- grating non-interfering versions of programs. cative language. In Tenth Annual ACM Sym- ACM Trans. Program. Lang. Syst. 11, 3 posium on Principles of Programming Lan- (July), 345–387. guages (Austin, TX, Jan.), ACM SIGPLAN HUMPHREY, W. S. 1989. Managing the Software Not. 18, 1, 237–255. Process. SEI Series in Software Engineering, LAVENCY,P.AND VANHOEDENAGHE, M. 1988. Addison-Wesley, Reading, MA. Knowledge based configuration management. IEEE 1983. IEEE Standard for Software Con- In Proceedings of the 21st Annual Hawaii figuration Management Plans: ANSI/IEEE International Conference on System Sciences Std 828-1983. IEEE, New York. (Hawaii, Jan.), B. Shriver, Ed., IEEE Com- IEEE 1988. IEEE Guide to Software Configu- puter Society Press, Los Alamitos, CA, 83–92. ration Management: ANSI/IEEE Std 1042- LEBLANG, D. 1994. The CM challenge: Configu- 1987. IEEE, New York. ration management that works. In Configura- tion Management, W. F. Tichy, Ed., Vol. 2 of JACCHERI,M.L.AND CONRADI, R. 1993. Tech- niques for process model evolution in EPOS. Trends in Software, Wiley, New York, 1–38. IEEE Trans. Softw. Eng. 19, 12 (Dec.), 1145– LEBLANG,D.B.AND CHASE, R. P. 1984. Com- 1156. puter-aided software engineering in a distrib- uted workstation environment. In Proceedings KAISER, G., FEILER, P., AND POPOVICH, S. 1988. Intelligent assistance for software develop- of the ACM SIGSOFT/SIGPLAN Software ment and maintenance. IEEE Softw. 5, 3 Engineering Symposium on Practical Soft- (May), 40–49. ware Development Environments, ACM SIG- PLAN Not. 19, 5 (May), 104–112. KAISER, G. E. 1995. Cooperative transactions for multiuser environments. In Modern Data- LEBLANG,D.B.AND MCLEAN, G. D. 1985. Con- base Systems (Reading, MA), W. Kim, Ed., figuration management for large-scale soft- Addison Wesley, 409–433. ware development efforts. In Proceedings of the Workshop on Software Engineering Envi- KAISER,G.E.AND HABERMANN, A. N. 1983. An ronments for Programming-in-the-Large (Har- environment for system version control. In wichport, MA, June), 122–127. Digest of Papers of Spring CompCon ’83, IEEE Computer Society Press, Los Alamitos, LEBLANG,D.B.,CHASE,R.P.,JR., AND SPILKE, CA, 415–420. H. 1988. Increasing productivity with a parallel configuration manager. In Proceed- KATZ, R. H. 1990. Toward a unified framework ings of the International Workshop on Soft- for version modeling in engineering data- ware Version and Configuration Control bases. ACM Comput. Surv. 22, 4 (Dec.), 375– (Grassau, Germany), J. F. H. Winkler, Ed., 408. Teubner Verlag, 21–37. KERNIGHAN,B.W.AND RITCHIE, D. M. 1978. The LIE, A., CONRADI, R., DIDRIKSEN, T., KARLSSON, E., C Programming Language. Prentice-Hall, HALLSTEINSEN,S.O.,AND HOLAGER, P. 1989. Englewood Cliffs, NJ. Change oriented versioning. In Proceedings of ¨ KIESEL, N., SCHURR, A., AND WESTFECHTEL,B. the Second European Software Engineering 1995. GRAS, a graph-oriented software engi- Conference (Coventry, UK, Sept.), C. Ghezzi neering database system. Inf. Syst. 20, 1 and J. A. McDermid, Eds., LNCS 387, Spring- (Jan.), 21–51. er-Verlag, 191–202. IM D K , W., E . 1995. Modern Database Systems. LIN, Y.-J. AND REISS, S. P. 1995. Configuration Addison-Wesley, Reading, MA. management in terms of modules. In Software KRAMER, J. 1993. Special issue on configurable Configuration Management: Selected Papers distributed systems. Softw. Eng. J. 8, 2 SCM-4 and SCM-5 (Seattle, WA, April), J. (March), 51–52. Estublier, Ed., LNCS 1005, Springer-Verlag, KRUSKAL, V. 1984. Managing multi-version pro- 101–117. grams with an editor. IBM J. Res. Dev. 28, 1, LIN, Y.-J. AND REISS, S. P. 1996. Configuration 74–81. management with logical structures. In Pro- LAMPEN, A., AND MAHLER, A. 1988. Shape—A ceedings of the Eighteenth International Con- software configuration management tool. In ference on Software Engineering (Berlin, Proceedings of the International Workshop on March), IEEE Computer Society Press, Los Software Version and Configuration Control Alamitos, CA, 298–307. (Grassau, Germany), J. F. H. Winkler, Ed., LIPPE, E., AND VAN OOSTEROM, N. 1992. Opera- Teubner Verlag, 228–243. tion-based merging. In Proceedings of ACM LAMPSON,B.W.AND SCHMIDT, E. E. 1983a. Or- SIGSOFT ’92: Fifth Symposium on Software ganizing software in a distributed environ- Development Environments (SDE5), (Tyson’s ment. In ACM Symposium on Programming Corner, VA, Dec.) ACM SIGSOFT Softw. Eng. Language Issues in Software Systems, ACM Not. 17 5, 78–87. SIGPLAN Not. 18, 6 (June), 1–13. LIU, C., AND CONRADI, R. 1993. Automatic re- LAMPSON,B.W.AND SCHMIDT, E. E. 1983b. planning of task networks for supporting pro-

ACM Computing Surveys, Vol. 30, No. 2, June 1998 280 • R. Conradi and B. Westfechtel

cess model evolution in EPOS. In Proceedings manis, Eds., LNCS 289, Springer-Verlag, 64– of the European Software Engineering Confer- 68. ence ’93 (Garmisch-Partenkirchen, Germany, OQUENDO, F., BERRADO, K., GALLO, F., MINOT, R., Sept.), I. Sommerville and M. Paul, Eds., AND THOMAS, I. 1989. Version management Springer-Verlag, 434–450. in the PACT integrated software engineering MACKAY, S. A. 1995. The state-of-the-art in environment. In Proceedings of the Second concurrent, distributed configuration manage- European Software Engineering Conference, ment. In Software Configuration Manage- (Coventry, UK, Sept.), C. Ghezzi and J. A. ment: Selected Papers SCM-4 and SCM-5 (Se- McDermid, Eds., LNCS 387, Springer-Verlag, attle, WA, April), J. Estublier, Ed., LNCS 222–242. 1005, Springer-Verlag, 180–194. PAULK,M.C.,WEBER,C.V.,CURTIS, B., AND CHRIS- MAHLER, A. 1994. Variants: Keeping things to- SIS, M. B. 1997. The Capability Maturity gether and telling them apart. In Configura- Model—Guidelines for Improving the Soft- tion Management, W. F. Tichy, Ed., Vol. 2 of ware Process. Addison-Wesley, Reading, MA. Trends in Software, Wiley, New York, 73–98. PERRY, D. 1989. The Inscape environment. In MAHLER,A.AND LAMPEN, A. 1988. An inte- Proceedings of the Eleventh International grated toolset for engineering software config- Conference on Software Engineering (Pitts- urations. In Proceedings of the ACM SIG- burgh, PA, May), IEEE Computer Society SOFT/SIGPLAN Software Engineering Press, Los Alamitos, CA, 2–12. Symposium on Practical Software Develop- PERRY, D. E. 1987. Version control in the In- ment Environments (Boston, MA, Nov.), ACM scape environment. In Proceedings of the Softw. Eng. Not. 13, 5, 191–200. Ninth International Conference on Software MARZULLO,K.AND WIEBE, D. 1986. Jasmine: A Engineering (Monterey, CA, March), IEEE software system modelling facility. In Pro- Computer Society Press, Los Alamitos, CA, ceedings of the ACM SIGSOFT/SIGPLAN 142–149. Software Engineering Symposium on Practi- PEUSCHEL, B., AND SCHA¨ FER, W. 1992. Concepts cal Software Development Environments (Palo and implementation of a rule-based process Alto, CA, Dec.), ACM SIGPLAN Not. 22, 1, engine. In Proceedings of the Fourteenth Inter- 121–130. national Conference on Software Engineering MICALLEF, J., AND CLEMM, G. 1996. The Asgard (Melbourne, Australia, May), IEEE Computer system: Activity-based configuration manage- Society Press, Los Alamitos, CA, 262–279. ment. In Software Configuration Manage- PRIETO-DIAZ, R., AND NEIGHBORS, J. 1986. Mod- ment: ICSE’96 SCM-6 Workshop (Berlin, ule interconnection languages. J. Syst. Softw. March), I. Sommerville, Ed., LNCS 1167, 6, 4 (Nov.), 307–334. Springer-Verlag, 175–186. RAMAKRISHNAN, R., AND ULLMAN, J. D. 1995. A MUNCH, B. 1996. HiCOV: Managing the version survey of deductive database systems. J. space. In Software Configuration Manage- ment: ICSE’96 SCM-6 Workshop (Berlin, Logic Program. 23, 2 (May), 125–149. March), I. Sommerville, Ed., LNCS 1167, RAMAMOHANARAO, K., AND HARLAND, J. 1994. An Springer-Verlag, 110–126. introduction to deductive database languages and systems. VLDB J. 3, 2 (April), 107–122. MUNCH, B. P. 1993. Versioning in a software engineering database—the change oriented REICHENBERGER, C. 1994. Concepts and tech- way. Ph.D. Thesis, NTNU Trondheim, Nor- niques for software version control. Softw. way. Concepts Tools 15, 3 (July), 97–104. MUNCH,B.P.,LARSEN, J.-O., GULLA, B., CONRADI, REICHENBERGER, C. 1995. VOODOO—a tool for R., AND KARLSSON, E.-A. 1993. Uniform ver- orthogonal version management. In Software sioning: The change-oriented model. In Pro- Configuration Management: Selected Papers ceedings of the Fourth International Workshop SCM-4 and SCM-5 (Seattle, WA, April), J. on Software Configuration Management (Bal- Estublier, Ed., LNCS 1005, Springer-Verlag, timore, MD, May), S. Feldman, Ed., (Preprint) 61–79. 188–196. REISS, S. 1990. Interacting with the FIELD en- NAGL, M. 1990. Softwaretechnik: Methodisches vironment. Softw. Pract. Exper. 20, S1 (June), Programmieren-im-Großen. Springer-Verlag, 89–115. Berlin. RICH, A., AND SOLOMON, M. 1991. A logic-based NAGL,M.ED. 1996. Building Tightly-Inte- approach to system modelling. In Proceedings grated Software Development Environments: of the Third International Workshop on Soft- The IPSEN Approach. LNCS 1170, Springer- ware Configuration Management (Trondheim, Verlag, Berlin. Norway, June), P. H. Feiler, Ed., ACM Press, OBST, W. 1987. Delta technique and string-to- New York, 84–93. string correction. In Proceedings of the First RIGG, W., BURROWS, C., AND INGRAM, P. 1995. European Software Engineering Conference Configuration Management Tools. Ovum Ltd., (Straßburg, Sept., 1987), J. Goos and J. Hart- London.

ACM Computing Surveys, Vol. 30, No. 2, June 1998 Version Models • 281

ROCHKIND, M. J. 1975. The source code control ware process programming. ACM Trans. system. IEEE Trans. Softw. Eng. 1, 4 (Dec.), Softw. Eng. Methodol. 4, 3 (July), 221–286. 364–370. TANSEL,A.U.,CLIFFORD, J., GADIA, S., JAJODIA, S., RODDICK, J. F. 1995. A survey of schema ver- SEGEV, A., AND SNODGRASS, R. 1993. Tem- sioning issues for database systems. Inf. poral Databases—Theory, Design, and Imple- Softw. Technol. 37, 7 (July), 383–393. mentation. Benjamin/Cummings, Redwood ROMBACH,H.D.AND VERLAGE, M. 1995.- City, CA. Directions in software process research. In TEITELMAN, W. 1984. A tour through Cedar. In Advances in Computers, M. V. Zelkowitz, Ed., Proceedings of the Seventh International Con- Vol. 41, Academic Press, San Diego, 1–63. ference on Software Engineering (Orlando, FL, SARNAK, N., BERNSTEIN, R., AND KRUSKAL,V. March), IEEE Computer Society Press, Los 1988. Creation and maintenance of multiple Alamitos, CA, 181–195. versions. In Proceedings of the International TICHY, W. F. 1979. Software development con- Workshop on Software Version and Configura- trol based on module interconnection. In Pro- tion Control (Grassau, Germany), J. F. H. ceedings of the IEEE Fourth International Winkler, Ed., Teubner Verlag, 264–275. Conference on Software Engineering (Pitts- SCHMERL,B.R.,AND MARLIN, C. D. 1995. De- burgh, Sept.), IEEE Computer Society Press, signing configuration management facilities Los Alamitos, CA, 29–41. for dynamically bound systems. In Software TICHY, W. F. 1982a. A data model for program- Configuration Management: Selected Papers ming support environments. In Proceedings of SCM-4 and SCM-5 (Seattle, WA, April), J. the IFIP WG 8.1 Working Conference on Auto- Estublier, Ed., LNCS 1005, Springer-Verlag, mated Tools for Information System Design 88–100. and Development (New Orleans, Jan.), North- SCHU¨ RR, A., WINTER, A., AND ZU¨ NDORF, A. 1995. Holland, 31–48. Graph grammar engineering with PROGRES. TICHY, W. F. 1982b. Design, implementation, In Proceedings of the European Software En- and evaluation of a . In gineering Conference (ESEC ’95) (Barcelona, Proceedings of the Sixth International Confer- Sept.), W. Scha¨fer and P. Botella, Eds., LNCS ence on Software Engineering (Tokyo, Sept.), 989, Springer-Verlag, 219–234. IEEE Computer Society Press, Los Alamitos, SCIORE, E. 1994. Version and configuration CA, 58–67. management in an object-oriented data TICHY, W. F. 1985. RCS—A system for version model. VLDB J. 3, 1 (Jan.), 77–106. control. Softw. Pract. Exper. 15, 7 (July), 637– SHAW, M., AND GARLAN, D. 1996. Software Ar- 654. chitecture—Perspectives on an Emerging Dis- TICHY, W. F. 1988. Tools for software configura- cipline. Prentice-Hall, Englewood Cliffs, NJ. tion management. In Proceedings of the Inter- SHETH,A.P.AND LARSON, J. A. 1990. Federated national Workshop on Software Version and database systems for managing distributed, Configuration Control (Grassau, Germany), heterogeneous, and autonomous databases. J. F. H. Winkler, Ed., Teubner Verlag, 1–20. ACM Comput. Surv. 22, 3 (Sept.), 183–236. TICHY,W.F.,ED. 1989. Proceedings of the Sec- SNODGRASS, R. T. 1992. Temporal databases. In ond International Workshop on Software Con- Theories and Methods of Spatio-Temporal figuration Management (Princeton, NJ, Nov.), Reasoning in Geographic Space (Pisa, Sept.), ACM Softw. Eng. Not. 14, 7. A. Frank, I. Campari, and U. Formentini, TICHY,W.F.,ED. 1994. Configuration Manage- Eds., LNCS 639, Springer-Verlag, 22–64. ment, Vol. 2 of Trends in Software. Wiley, SOFTWARE MAINTENANCE AND DEVELOPMENT New York. YSTEMS S . 1990. Aide-de-Camp Product Over- TRYGGESETH, E., GULLA, B., AND CONRADI,R. view. Software Maintenance and Develop- 1995. Modelling systems with variability us- ment Systems, Concord, MA. ing the PROTEUS configuration language. In SOLEY,R.M.AND KENT, W. 1995. The OMG Software Configuration Management: Selected object model. In Modern Database Systems, Papers SCM-4 and SCM-5 (Seattle, WA, (Reading, MA), W. Kim, Ed., Addison-Wesley, April), J. Estublier, Ed., LNCS 1005, Spring- Reading, MA, 18–41. er-Verlag, 216–240. SOMMERVILLE, I., ED. 1996. Software Configura- WAKEMAN, L., AND JOWETT, J. 1993. PCTE—The tion Management: ICSE’96 SCM-6 Workshop Standard for Open Repositories. Prentice- (Berlin, March) LNCS 1167, Springer-Verlag. Hall, Englewood Cliffs, NJ. SOMMERVILLE, I., AND THOMSON, R. 1989. An ap- WALLNAU, K. C. 1992. Issues and techniques of proach to the support of software evolution. CASE integration with configuration manage- Comput. J. 32, 5 (Dec.), 386–398. ment. Tech. Rep. CMU/SEI-92-TR-5 (March), SUTTON,S.M.,HEIMBIGNER, D., AND OSTERWEIL, Software Engineering Institute, Carnegie- L. J. 1995. APPL/A: A language for soft- Mellon University, Pittsburgh, PA.

ACM Computing Surveys, Vol. 30, No. 2, June 1998 282 • R. Conradi and B. Westfechtel

WARREN, I., AND SOMMERVILLE, I. 1995. Dy- design documents. Int. J. Softw. Eng. Knowl- namic configuration abstraction. In Proceed- edge Eng. 6, 4 (Dec.), 549–583. ings of the European Software Engineering WINKLER,J.F.H.,ED. 1988. Proceedings of the Conference (ESEC ’95) (Barcelona, Sept.), W. International Workshop on Software Version Scha¨fer and P. Botella, Eds., LNCS 989, and Configuration Control (Grassau, Germa- Springer-Verlag, 173–190. ny), Teubner Verlag. WASSERMAN, A. 1990. Tool integration in soft- ZELLER, A. 1995. A unified version model for ware engineering environments. In Proceed- configuration management. In Proceedings of ings of the Second International Workshop on the ACM SIGSOFT ’95 Symposium on the Software Engineering Environ. (Chinon, Foundations of Software Engineering (Wash- France, Sept.), F. Long, Ed., LNCS 467, ington, Oct), ACM Softw. Eng. Not. 20, 4, Springer-Verlag, 137–149. 151–160. WESTFECHTEL, B. 1991. Structure-oriented merg- ing of revisions of software documents. In ZELLER, A. 1996. Smooth operations with Proceedings of the Third International Work- square operators—the version set model in shop on Software Configuration Management ICE. In Software Configuration Management: (Trondheim, Norway, June), P. H. Feiler, Ed., ICSE’96 SCM-6 Workshop (Berlin, March), I. ACM Press, New York, 68–79. Sommerville, Ed., LNCS 1167, Springer-Ver- lag. WESTFECHTEL, B. 1994. Using programmed graph rewriting for the formal specification of ZELLER,A.AND SNELTING, G. 1995. Handling a configuration management system. In Pro- version sets through feature logic. In Proceed- ceedings WG ’94 Workshop on Graph-Theo- ings of the Fifth European Software Engineer- retic Concepts in Computer Science (Herrsch- ing Conference (Barcelona, Sept.), W. Scha¨fer ing, Germany, June), E. Mayr, G. Schmidt, and P. Botella, Eds., LNCS 989, Springer- and G. Tinhofer, Eds., LNCS 903, Springer- Verlag, 191–204. Verlag, 164–179. ZELLER, A., AND SNELTING, G. 1997. Unified ver- WESTFECHTEL, B. 1996. A graph-based system sioning through feature logic. ACM Trans. for managing configurations of engineering Softw. Eng. Methodol. 6, 4 (Oct.), 397–440.

Received October 1996; revised July 1997; accepted August 1997

ACM Computing Surveys, Vol. 30, No. 2, June 1998