Proceedings of the Third International Workshop on Software Evolution

Edited by Tom Mens, Kim Mens, Ellen Van Paesschen, Maja D'Hondt

2007 Technical Papers



Reuse-Based Analysis of Open-Source Software Component Evolution and Its Application to Version Control *

Liguo Yu 1, Kai Chen 2 and Srini Ramaswamy 3

1. Computer Science and Informatics, Indiana University South Bend, South Bend, IN 46615, USA. Email: [email protected]
2. Motorola Inc., Schaumburg, IL 60196, USA. Email: [email protected]
3. Computer Science Department, University of Arkansas at Little Rock, Little Rock, AR, USA. Email: [email protected]

Abstract: Software evolution and configuration management are two important issues in software project management. This paper applies a two-dimensional reuse-based evolution model, which describes the evolution of an individual component from two perspectives (white-box reuse and black-box reuse), to the version control of open-source components. In a case study, we investigate the evolution of twelve Linux kernel components and explain how the reuse-based evolution model can facilitate the configuration management of Linux kernel components.

Keywords: Component Evolution, Version Control, Open-Source Software

1 Introduction

Software evolution [1] [2] is the process by which a software system changes from a simpler or worse state to a higher or better state [3]. Since changes must be made to software to fix defects or to satisfy new requirements, evolution is inevitable. It was originally considered, and is still commonly agreed, that the evolution process of a software system is a maintenance process of that system. On the other hand, software reuse always occurs during software evolution, whether as ad hoc, small-scale reuse or as systematic, large-scale reuse. Therefore, software evolution can also be considered a reuse process. Traditionally, the reuse and maintenance processes are treated as two different processes. The reuse process is usually considered to occur during software development: reusable components are identified and adapted to the new system. The maintenance process is considered to occur during software evolution, that is, changes to the software product after it is delivered. Therefore, in the develop-then-evolve model of closed-source software development, reuse, which happens during development, and maintenance, which happens during subsequent evolution, are largely viewed as two independent processes. Recently, open-source software, i.e., software whose source code is available under a copyright license that permits users to freely study, change, and use it, has had a growing impact on both the software engineering community and our society. Because of its importance, considerable research has been performed to study the evolution patterns of open-source software [4] [5]. The evolution process of open-source software differs from that of closed-source software.

* An earlier version of this paper appeared in the Proceedings of the 18th International Conference on Software Engineering and Knowledge Engineering [13].


Usually, the documentation of open-source software with respect to requirements, analysis, and design is limited and incomplete [6]. The quality assurance of open-source software is not achieved through systematic testing by the programmers; instead, it is user feedback that usually contributes to the quality improvement of open-source software. Accordingly, open-source software is characterized by the philosophy of "release early and release often" [7]. For example, from 1994 to 2006, Linux released over 500 versions. Because open-source software is updated so frequently, its evolution process is also a development process. In the develop-and-evolve model of open-source software, software reuse, which usually happens during development, is intermixed with maintenance, which usually happens during evolution. Therefore, considering software evolution as a reuse process is especially important for research on open-source software systems. On the other hand, configuration management is a disciplined approach to managing the evolution of software and its components [8] [9] [10]. One of the most important issues in configuration management is version control (also called revision control or source control), which manages multiple revisions of the same unit of information. Examples of traditionally widely used version control systems are CVS [11] and PRCS [12]. In previous work [13], we investigated the evolution of open-source software components from the reuse perspective and proposed a reuse-based evolution model. In this paper, this model is applied to the version control of open-source software components in order to improve their configuration management and evolution process.

2 Software Component Evolution and Version Control

Software evolution is a process of making changes to software artifacts in order to improve their functional or non-functional properties. The evolution of a software product consists of the evolution of its individual components. Because closed-source and open-source software systems have different properties, the evolution processes of their components also differ. This, in turn, influences the choice of version control techniques.

2.1 Closed-source component evolution and version control

Basili [14] first proposed three evolution models for closed-source software: quick-fix, iterative-enhancement, and full-reuse. All three models consider reuse to be the major activity in software development. They are reuse-oriented software development models and lay the foundations for component-based software development (CBSD). Both Basili's reuse models and the later CBSD model require complete documentation of requirements, analysis, and design, which is a property of closed-source software. However, these development models do not fit open-source software well, because (1) open-source software does not have complete documents on requirements, analysis, design, and systematic testing; and (2) open-source software does not promote planned reuse, and there is no artifact repository for open-source software. Just like traditional software development, reuse-based closed-source software development (either CBSD or a product line) also needs configuration management, because a component or a product may have many versions. Considerable research has been performed on configuration management in CBSD. For example, Mei et al. [15] proposed a configuration model by analyzing the objects that need to be managed in order to improve the CBSD process. In their model, components, as the integral logical constituents of a system, are managed as the basic configuration items.


Relationships between and among components are defined and maintained to support the version control of large objects and the logical modeling of the system architecture. Generally, in reuse-based software development, a linear structure is used for the version control of software components. Figure 1 shows an example of the evolution of three components (C1, C2, and C3) in a repository. In the figure, each component is versioned separately, because components in closed-source component-based software are usually independent, which makes them easy to adapt and reuse. Branching and merging also occur in the evolution of component C3, where version V1.1 is branched into two versions, V1.2 and V1.1.1, and versions V1.3 and V1.1.3 are merged into a single version V2.0.

Figure 1. The versioning scheme of components in closed-source component-based development.

The versioning scheme shown in Figure 1 fits closed-source component-based software development well, because an individual component in closed-source component-based software has few dependencies on others (except for inclusion relationships, such as C1 being part of C2) and evolves independently.

2.2 Open-source component evolution and version control

As mentioned before, open-source software is characterized by frequent releases [5]. Although it has many versions, each new release is based on a previous release, so its evolution can be modeled as an evolutionary streamline. For example, suppose an open-source software product has nine consecutive versions, V1 through V9. Its evolution can be represented as the one-dimensional streamline shown in Figure 2. To simplify the problem, in this paper we ignore the branching and merging of software versions and consider only consecutive versions.

Figure 2. One-dimensional evolution process for open-source software [13].

Figure 2 shows the evolution of the entire software system from a maintenance perspective. Every new release (version) is based on a modification of the previous version, which is a widely accepted view of open-source software evolution. A more complete model may include version branching and merging, which yields a two-dimensional tree structure. However, both the one-dimensional streamline model and the two-dimensional tree model are maintenance-based, and the reuse process is largely ignored. Most importantly, the model in Figure 2 only captures the evolution of the product as a whole; different components within the product may evolve differently.


For example, while some components may never be changed, others may be changed several times during the same period. Therefore, the maintenance-based evolution model for open-source software does not accurately represent the evolution of individual components. In fact, an open-source software product may have many versions, and the individual components of a product may also have many sub-versions. Hence, it is important for open-source software to have an efficient version control scheme. Here, we review the version control of three leading open-source software systems, Linux [16], FreeBSD [17], and Apache HTTP [18], which employ different versioning schemes. In Linux, all components share the same version number as the product. A new product release results in a new version number not only for the entire product but also for each component. Figure 3 shows the versioning scheme of Linux, in which a dashed arrow indicates that the difference between two consecutive versions of the same component is unknown; in other words, we do not know whether a component has been changed in a new release or not. Hence, this versioning scheme has the following drawbacks: (1) we cannot determine whether a component has been changed in a new release merely from the version number of the component, because the version number of a component always increases whether or not the component is changed in a new release; (2) we do not know which components are modified together in one release. Because components in open-source software are usually not designed for reuse, there are many interdependencies among components: a change to one component may require changes to other components in the same maintenance activity. Therefore, information on which components are modified together is especially useful for predicting change propagation in open-source software [19] [20].

Figure 3. The versioning scheme of Linux.

Figure 4 presents the versioning scheme of FreeBSD: the version of each individual component is managed separately from that of the entire product. A solid arrow in the figure indicates that two versions of the same component differ. The version number of a component is updated only when the component itself is changed. The product has its own version number, which is independent of the version numbers of its components. One version of the product may consist of different versions of components. For example, as shown in Figure 4, three components (C1, C2, and C3) in product version V2 have three different version numbers (V4, V3, and V5, respectively). This versioning scheme solves the first drawback of the Linux scheme, but it still fails to indicate which components are modified together in one maintenance activity. Figure 5 describes the versioning scheme of the Apache HTTP Server: each maintenance activity is assigned a revision number, and all components modified in the same maintenance activity are given the same revision number. If a component is not involved in a maintenance activity, it keeps its original revision number. In this scheme, the revision numbers of a component may be discontinuous, and a component is assigned a new revision number only if it is modified during a maintenance activity.


The advantage of this scheme is that, from the revision numbers of the components, it is easy to determine which components were modified together. However, the revision number of a component has no relation to the version number of the product, so it is difficult to know whether two components appear in the same product release just from their revision numbers.

Figure 4. The versioning scheme of FreeBSD.

Figure 5. The versioning scheme of Apache HTTP.

Based on the discussion above, we conclude that these three leading open-source software systems use similar one-dimensional versioning schemes adopted from closed-source software. However, as discussed above, all of them have certain weaknesses for the configuration management of open-source software.

3 Reuse-Based Evolution Model

A software system consists of a collection of components. Our study of open-source software evolution is component-oriented, which means we are primarily interested in the evolution of individual software components. Software reuse always happens in the software evolution process: in each release, some components are modified, while others remain unchanged. In order to study software evolution from the perspective of reuse, we proposed a two-dimensional reuse-based evolution model [13], in which the reuse of software components is classified into two categories: white-box reuse and black-box reuse. In one software evolution step, which is marked by a new version release, components that are modified in the new release are classified as white-box reused, while those that are unchanged are classified as black-box reused. Therefore, the evolution of a component over the software lifetime can be represented as a trace in a two-dimensional diagram. Figure 6 shows the evolution of a component C1 over nine consecutive versions of the software. The value on the vertical axis indicates how many times a component has been black-box reused, while the value on the horizontal axis gives the number of times a component has been white-box reused in the study period.
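
To make this classification concrete, the following sketch (ours, not from the paper) derives the two-dimensional trace of a single component from a sequence of release directories. It assumes a component counts as "modified" in a release exactly when its source file differs from the previous release (checked here with a file hash); the function and path names are illustrative only.

import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    """Hash a component's source file so we can tell whether it changed."""
    return hashlib.sha1(path.read_bytes()).hexdigest()

def reuse_trace(release_dirs, component):
    """Return the (white-box, black-box) trace of one component across releases.

    release_dirs: list of Path objects, one per consecutive release.
    component: relative path of the component inside each release, e.g. 'kernel/sched.c'.
    """
    white, black = 0, 0
    trace = [(white, black)]                     # position at the first release
    previous = file_digest(release_dirs[0] / component)
    for release in release_dirs[1:]:
        current = file_digest(release / component)
        if current != previous:                  # modified in this release: white-box reuse
            white += 1
        else:                                    # unchanged in this release: black-box reuse
            black += 1
        trace.append((white, black))
        previous = current
    return trace

Plotting such a trace with white-box reuse on the horizontal axis and black-box reuse on the vertical axis gives exactly the kind of diagram shown in Figure 6.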



Figure 6. The reuse-based evolution model for component C1 in nine versions of the product [13]. (Axis labels: white-box reuse on the horizontal axis, black-box reuse on the vertical axis; the trace passes through versions V1–V9.)

Different components in a product may be modified in different evolution steps and, accordingly, may have different evolution traces. The reuse-based evolution model depicts the evolution trace of each individual component instead of the entire product. Compared to the maintenance-based model, this reuse-based model has the following advantages: (1) the vertical steepness of a component's trace indicates the potential of that component to be reused without change, so by studying the reuse-based evolution of software components we can identify potentially reusable components; (2) the reuse-based evolution model incorporates the maintenance process, because the white-box reuse of a component also depicts its maintenance history. In fact, our reuse-based evolution model for open-source software components can be applied to version control. Here, we propose a two-dimensional versioning scheme derived from the reuse-based evolution model. In this scheme, the version number of a component is a 2-tuple (x, y), where x records the number of white-box reuses and y the number of black-box reuses. We use the two components C1 and C2 shown in Figure 7 as an example to explain how this versioning scheme works. The dashed line indicates the release of a new version of the whole product. In each new release, every component is given a new version number: if a component is modified in the new release, its x value is increased by one while its y value remains unchanged; otherwise, its x value remains unchanged while its y value is increased by one. This two-dimensional versioning scheme has the following advantages (a small code sketch follows the list):

• It is easy to judge whether a component was modified between two versions of a product from the version number of the component. For example, the source code of C1(2, 2) and C1(2, 5) is the same because the x values of their version numbers are equal, while the source code of C1(2, 6) and C1(3, 6) differs because their x values differ.

• It is easy to relate the component version to the product version. The component with version number (x, y) is included in the product whose version number equals x + y − 1 (assuming the product is versioned sequentially). For example, C1(4, 2) is included in the 5th product version.

• It is easy to judge which versions of two components appear in the same release. If two components have the same total version value (x plus y), they must appear in the same release. For example, C1(3, 1) and C2(2, 2) are in the same product release.

• It is easy to judge whether two components were modified together in the same release. For example, in Figure 7, C1(4, 2) and C2(2, 4) are modified in the same release because they appear in the same product release and their x values are both increased in the next product release.
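
As an illustration (ours, not part of the paper), the following sketch implements the 2-tuple scheme and the checks listed above; the class and function names are hypothetical.

from dataclasses import dataclass

@dataclass(frozen=True)
class ComponentVersion:
    x: int  # number of white-box reuses (modifications)
    y: int  # number of black-box reuses (releases without modification)

    def bump(self, modified: bool) -> "ComponentVersion":
        """Version number assigned to this component in the next product release."""
        return ComponentVersion(self.x + 1, self.y) if modified else ComponentVersion(self.x, self.y + 1)

    @property
    def product_release(self) -> int:
        # A component versioned (x, y) appears in product release x + y - 1
        # (assuming the product is versioned sequentially starting at 1).
        return self.x + self.y - 1

def same_source(a: ComponentVersion, b: ComponentVersion) -> bool:
    """Two versions of the same component have identical source iff their x values match."""
    return a.x == b.x

def same_release(a: ComponentVersion, b: ComponentVersion) -> bool:
    """Two component versions appear in the same product release iff x + y is equal."""
    return a.product_release == b.product_release

# Examples taken from the text above:
assert same_source(ComponentVersion(2, 2), ComponentVersion(2, 5))
assert not same_source(ComponentVersion(2, 6), ComponentVersion(3, 6))
assert ComponentVersion(4, 2).product_release == 5
assert same_release(ComponentVersion(3, 1), ComponentVersion(2, 2))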



Figure 7. Two-dimensional versioning scheme.

To summarize, our reuse-based versioning scheme addresses several drawbacks of the existing versioning schemes used by Linux, FreeBSD, and Apache HTTP, and it aids the study of the evolution of open-source software systems.

4 The Evolution of Linux Kernel Components

Linux, one of the most active open-source projects, has released about 600 versions. Considerable research [5] [7] has been performed on the evolution pattern of Linux, including the growth of its size, changes to the kernel structure, and community properties. From version 1.0.0 to version 2.6.11, there are 597 formal releases of Linux. If considered as a maintenance process, the evolution of the whole Linux product can be represented as a streamline with 597 nodes (again, ignoring branching and merging). Due to the simplicity of this streamline structure, it does not provide much useful information about the evolution pattern of Linux. Therefore, we study the evolution of individual components using the two-dimensional reuse-based evolution model (Figure 6) described in Section 3. The most important components of Linux are the kernel modules (components), which contain the key functions of the operating system and are architecture-independent. Linux version 1.0.0 has 23 kernel modules. During the evolution, some kernel modules were removed while new modules were added. Version 2.6.11 has 58 kernel modules, of which only 12 have evolved continuously from version 1.0.0; that is, 12 kernel modules have experienced the entire evolution of Linux. Our next step is to study the evolution of these 12 kernel modules across all 597 releases. Our research investigates the entire evolution history of Linux. First, the 597 versions are indexed from 1 to 597, as shown in Table 1. Next, we consider the evolution of these 12 kernel modules from the perspective of reuse. Table 2 shows the numbers of white-box and black-box reuses of the 12 kernel modules in the 597 releases. It indicates that dma.c has the smallest number of white-box reuses, while sched.c has the largest.



Table 1. The version index of Linux [13].

Version   1.0    1.1     1.2      1.3      2.0      2.1      2.2      2.3      2.4      2.5      2.6
Index     1-12   13-108  109-122  123-223  224-264  265-397  398-424  425-477  478-509  510-585  586-597

Table 2. The number of white-box reuses and black-box reuses of the 12 kernel modules [13].

Number of        m1   m2   m3   m4   m5   m6   m7   m8   m9   m10  m11  m12
White-box reuse  21   165  196  26   97   48   86   41   234  110  149  63
Black-box reuse  576  432  401  571  500  549  511  556  363  487  448  534

Key to modules: m1: dma.c; m2: exit.c; m3: fork.c; m4: itimer.c; m5: module.c; m6: panic.c; m7: printk.c; m8: ptrace.c; m9: sched.c; m10: signal.c; m11: sys.c; m12: time.c

Figure 8 presents the evolution traces of the 12 kernel modules using the two-dimensional reuse-based evolution model. From the figure, we find that the traces of dma.c, itimer.c, ptrace.c, and time.c have longer vertical paths than the others. This means that these modules (components) have experienced more black-box reuse in the past, and it is also reasonable to assume that they have more potential to be black-box reused in the future.

Figure 8. The evolution of 12 kernel modules from the perspective of reuse.

Furthermore, by observing Figure 8, we find that the recent versions of fork.c and sched.c are more frequently white-box reused than black-box reused. As mentioned before, the kernel modules of Linux are its most important modules and are architecture-independent; hence, they should be designed with greater potential for reuse. According to our study of the 12 kernel modules that have experienced the entire Linux evolution, each module has a different reuse history, which is probably due to differences in reusability. Because white-box reuse always requires more developer effort than black-box reuse, to improve the evolution process and reduce the reuse cost of Linux, modules such as exit.c, fork.c, and sched.c may need to be redesigned to enable more black-box reuse.
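
As a simple illustration (ours, not from the paper), the redesign candidates can be ranked directly from the Table 2 data by their white-box fraction; printing the top three is an arbitrary choice.

# White-box and black-box reuse counts from Table 2 (597 releases in total).
reuse = {
    "dma.c": (21, 576),    "exit.c": (165, 432),  "fork.c": (196, 401),
    "itimer.c": (26, 571), "module.c": (97, 500), "panic.c": (48, 549),
    "printk.c": (86, 511), "ptrace.c": (41, 556), "sched.c": (234, 363),
    "signal.c": (110, 487), "sys.c": (149, 448),  "time.c": (63, 534),
}

def white_box_fraction(white: int, black: int) -> float:
    """Fraction of evolution steps in which the module had to be modified."""
    return white / (white + black)

# Modules with a high white-box fraction are the weakest black-box reuse candidates.
candidates = sorted(reuse, key=lambda m: white_box_fraction(*reuse[m]), reverse=True)
for module in candidates[:3]:
    print(module, round(white_box_fraction(*reuse[module]), 3))
# Expected order of the top three: sched.c, fork.c, exit.c.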



In addition, applying the two-dimensional versioning scheme derived from the two-dimensional reuse-based evolution model, Figure 9 shows part of the versioning and evolution history of the 12 Linux kernel modules. For example, it shows that product version 50 (1.1.36) contains sched.c (10, 41), fork.c (8, 43), exit.c (9, 42), sys.c (8, 43), signal.c (4, 47), module.c (3, 48), printk.c (1, 50), time.c (2, 49), panic.c (1, 50), ptrace.c (4, 47), dma.c (2, 49), and itimer.c (1, 50). It also shows that sys.c was modified in the release of product version 51 (1.1.37), because the x value of its version number increases in that release.

Figure 9. Part of the evolution path of the 12 Linux kernel modules.
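
As a quick consistency check (ours, not from the paper), the tuples listed above can be verified against the x + y − 1 rule from Section 3:

modules_v50 = {"sched.c": (10, 41), "fork.c": (8, 43), "exit.c": (9, 42), "sys.c": (8, 43),
               "signal.c": (4, 47), "module.c": (3, 48), "printk.c": (1, 50), "time.c": (2, 49),
               "panic.c": (1, 50), "ptrace.c": (4, 47), "dma.c": (2, 49), "itimer.c": (1, 50)}
# Every module version (x, y) in product release 50 must satisfy x + y - 1 == 50.
assert all(x + y - 1 == 50 for x, y in modules_v50.values())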

5 Conclusions and Limitations

In this paper, we extended our previous work on studying the evolution of open-source software components from the reuse perspective and applied the two-dimensional reuse-based evolution model to the version control of open-source components. The kernel modules of Linux were used as an application study. As with other research, applying the two-dimensional reuse-based evolution model to version control has its limitations. First, our model currently does not support branching and merging; it is common for one version of a component to branch into multiple versions and for multiple versions to merge into a single version. Second, our model applies well to open-source software systems, in which the components evolve with the entire software system.


It does not work as well for software with component repositories, a major property of closed-source software systems, because each component in a component repository may evolve independently of the other components and of the entire software system.

6 References

1. M. M. Lehman, "Programs, Life Cycles, and Laws of Software Evolution", Proceedings of the IEEE (Special Issue on Software Engineering), 1980, pp. 1060–1076.
2. D. E. Perry, "Dimensions of Software Evolution", Proceedings of the International Conference on Software Maintenance, Sorrento, Italy, 1994, pp. 296–303.
3. L. J. Arthur, Software Evolution: The Software Maintenance Challenge, John Wiley and Sons, New York, NY, USA, 1988.
4. M. W. Godfrey and Q. Tu, "Evolution in Open Source Software: A Case Study", Proceedings of the International Conference on Software Maintenance, San Jose, CA, 2000, pp. 131–142.
5. S. R. Schach, B. Jin, D. R. Wright, G. Z. Heller, and A. J. Offutt, "Maintainability of the Linux Kernel", IEE Proceedings - Software, vol. 149, 2002, pp. 18–23.
6. K. Chen, S. R. Schach, L. Yu, J. Offutt, and G. Z. Heller, "Open-Source Change Logs", Empirical Software Engineering, vol. 9, no. 3, 2004, pp. 197–210.
7. E. S. Raymond, The Cathedral & the Bazaar, First Edition, O'Reilly, February 2001.
8. C. Burrows, G. George, and S. Dart, Configuration Management, Ovum Ltd., 1996.
9. S. Dart, Spectrum of Functionality in Configuration Management Systems, Technical Report CMU/SEI-90-TR-11, Software Engineering Institute, Pittsburgh, Pennsylvania, December 1990.
10. S. Dart, "Concepts in Configuration Management Systems", Proceedings of the Third International Workshop on Software Configuration Management, ACM SIGSOFT, 1991, pp. 1–18.
11. Concurrent Versions System website, http://www.nongnu.org/cvs/.
12. Project Revision Control System website, http://prcs.sourceforge.net/.
13. L. Yu and K. Chen, "Two Perspectives on Open-Source Software Evolution: Maintenance and Reuse", Proceedings of the 18th International Conference on Software Engineering and Knowledge Engineering, San Francisco, CA, July 2006, pp. 737–742.
14. V. R. Basili, "Viewing Maintenance as Reuse-Oriented Software Development", IEEE Software, vol. 7, no. 1, 1990, pp. 19–25.
15. H. Mei, L. Zhang, and F. Yang, "A Software Configuration Management Model for Supporting Component-Based Software Development", ACM SIGSOFT Software Engineering Notes, vol. 26, no. 2, 2001, pp. 53–58.
16. The Linux website, http://www.linux.org/.
17. The FreeBSD website, http://www.freebsd.org/.
18. The Apache HTTP Server website, http://httpd.apache.org/.
19. A. T. T. Ying, R. Ng, M. C. Chu-Carroll, and G. C. Murphy, "Predicting Source Code Changes by Mining Change History", IEEE Transactions on Software Engineering, vol. 30, no. 9, 2004, pp. 574–586.
20. T. Zimmermann, P. Weissgerber, S. Diehl, and A. Zeller, "Mining Version Histories to Guide Software Changes", IEEE Transactions on Software Engineering, vol. 31, no. 6, 2005, pp. 429–445.


Software Component Reengineering for their Context-Aware Deployment in Ubiquitous Environments

Gautier Bastide1, Abdelhak Seriai2 and Mourad Oussalah3

1 [email protected], 2 [email protected]
Ecole des Mines de Douai, 941 rue Charles Bourseul, 59508 Douai, France

3 [email protected]
LINA, Université de Nantes, 2 rue de la Houssinière, 44322 Nantes, France

Abstract: Deployment is a crucial issue for ubiquitous applications. In fact, due to perpetual context evolution, an application must adapt its deployment in order to guarantee its service continuity. In component-based software engineering, where applications are built from existing components, a flexible deployment of components can be required in many cases, e.g., in resource-constrained environments where the software architecture must always match the hardware architecture. To deal with this issue, we propose an approach aiming at generating a self-adaptive software component from an existing component. This property allows a component to adapt its deployment according to its current context.

Keywords: Software component, deployment, restructuring, context-awareness

1 Introduction

For several years, ubiquitous and pervasive computing have emerged as a challenging field for application design. In fact, due to the huge development of mobile devices, application designers have to take the execution context into account in order to adapt their applications. For example, a user's device can have scarce resources, such as low battery power, a slow CPU, or limited memory, so the application must be able to adapt itself to these constraints. In addition, more and more applications are created by assembling reusable parts. In Component-Based Software Engineering (CBSE), an application consists of existing components [Szy98]. So, to be executed in ubiquitous environments, where the context evolves perpetually, a component must be able to adapt itself to its context from its deployment onwards, because of the volatility of device connections, resource constraints, etc. Adaptation can concern component behavior or component deployment. Most existing work focuses on adapting component behavior, and few works address deployment adaptation. Moreover, a component is usually considered a monolithic deployment unit, so existing work related to application deployment focuses on the placement of components within an infrastructure. However, in many cases it can be useful to achieve a flexible deployment of component services. To illustrate this, let us consider the example of a component deployed on a resource-constrained device that is part of a distributed infrastructure.

Imagine that, during runtime, a sudden fall in the available memory threatens the continuity of its services. Due to the possible disconnection of the user device, deploying the entire component on a remote site cannot be envisaged. Instead, the component can be adapted by redeploying some of its services on the available infrastructure. Based on these considerations, we propose an approach aiming at generating, from an existing component, a component whose structure can adapt itself according to its use context. This adaptation is achieved through a context-aware deployment of its services. The rest of this paper is organized as follows. Section 2 presents the context-aware deployment of software components. Section 3 details the decision-making mechanisms. Section 4 reviews related work. Conclusions and future work are presented in Section 5.

2 Context-aware deployment of software component services

2.1 Context-aware deployment overview

Context-aware component deployment consists in deploying component services on selected sites in order to guarantee service continuity or to improve service quality. The site selection takes into account all context elements on which the component structure depends:

Hardware architectural context: first, component deployment depends on the technical features of the infrastructure. In fact, a component service can be deployed on a site only when the resources required by this service are provided by that site. Besides, as some of these resources evolve perpetually, they can become insufficient to ensure the continuity of the services deployed on a site. So, these data must be used to initiate an adaptation process. For example, if the available memory falls below a rate defined by the component designer, an adaptation phase has to be initiated in order to guarantee service continuity.

Environmental context: second, component deployment depends on the component's environment. In fact, the services whose use probability is highest have to be deployed on the user device or in its neighborhood, provided the required technical features are sufficient to ensure service continuity. The use probability is determined by comparing the similarity between the target user profile and the current user profile, and between the foreseen use conditions and the current conditions. Neighboring nodes are determined according to infrastructure features (e.g., best bandwidth). This strategy can be useful in ubiquitous environments because of connection volatility and variations in network performance. Furthermore, fixed devices should be favored so as to limit the risk of connection/disconnection. So, the environmental context contains the user profile and the use conditions (i.e., data used to characterize the situation of users).

Software context: finally, component deployment depends on the component structure (e.g., existing components, connections) and behavior (e.g., service calls, adaptation-release context). For example, in order to optimize component execution, component services that are closely dependent should be deployed on the same site.

Once each service is associated with a deployment site according to the current context, the component must adapt its structure by generating new sub-components (i.e., a fragmentation of the initial component), each of which provides the services associated with one site. For example, Figure 1 shows a component, called C1, which is deployed on a resource-constrained device called site1.


Let us consider that, during component runtime, the existing load balancing has become unfit because of the evolution of the deployment infrastructure. In this case, C1 must adapt itself in order to guarantee service continuity. A solution can be based on fragmenting the component into four new components, called C2, C3, C4, and C5, each providing a subset of the services provided by C1. Then, the new components are deployed on the available infrastructure: C2 is deployed on site3, C3 and C4 on site2, and finally C5 on site1. Thus, C1 preserves its service continuity on site1 although its sub-components are distributed over the available infrastructure (composite distribution mechanisms are described in [SBO06]).

Figure 1: Context-aware deployment of software components

2.2 Self-adaptive component model

To achieve a context-aware deployment, a component must be able to take its context into account to generate an adapted structure specification, and it must then be able to reconfigure its structure. So, such a component must be self-adaptive and structurally adaptive. In order to be structurally adaptive, a component must conform to a model, defined in [BSO06], which allows it to update its structure. Such a component is one in which each provided interface is reified into a primitive sub-component called an interface-component. In order to provide self-adaptation mechanisms, we introduce three new components:

Context-manager: first, a self-adaptive component must be able to describe its service use cases. So, each interface-component is assigned, via the context-manager, a description which contains two kinds of data: first, the list of resources required for the component's deployment; second, data related to the target user profile and the use conditions. Data related to required resources are specified by the component designer or by the component itself, whereas data related to the target user profile and use conditions are specific to the application in which the component is assembled and are therefore specified by the application administrator. All these contextual data are acquired using dedicated interfaces. Second, the component must be able to acquire data related to its execution context. On the one hand, it must be able to collect data related to its structure and its behavior: structural data are obtained using controller interfaces through component introspection, while behavioral data are acquired by saving metadata about the messages sent among components, using dedicated connectors between the composite-component and its sub-components or among sub-components. On the other hand, the component must be able to acquire data related to its external context (i.e., the environmental and hardware architectural context).


This acquisition is achieved using sensors, which can be software (e.g., battery) or hardware (e.g., GPS). All these contextual data are interpreted, aggregated, and then sent to the decision engine, which aims at determining the best adaptation strategy.

Decision engine: this component provides decision-making mechanisms allowing the component to automatically specify the adaptation script used by the structural adaptation process [BSO06] for generating the adapted component. This script is a description of the component structure, formulated in an ADL defined in [BSO06]. It contains a description of the components to generate: for each component, its services and its deployment site are specified. For example, the script presented below can be used to generate, from a component called C1, a composite-component composed of two sub-components: C1' providing services S1 and S3, deployed on site2; and C1'' providing services S2 and S4, deployed on site1.

<component name="C1">
  <component name="C1'" site="site2">
    <service name="S1"/>
    <service name="S3"/>
  </component>
  <component name="C1''" site="site1">
    <service name="S2"/>
    <service name="S4"/>
  </component>
</component>

Adaptation engine: once the script defining the new structure has been generated, the structural adaptation has to be achieved by the component using the reconfiguration engine and then the deployer engine. The functionalities of these two engines are described in [BSO06]. To generate such a component from an existing one, we use our structural adaptation process described in [SBO06]. This process, achieved automatically, is composed of three stages (Fig. 2): first, the initial component is fragmented into a set of new components, each providing only one service, while guaranteeing component integrity and coherence. Then, the interface-components are assembled, taking into account their dependences (i.e., horizontal assembly). Next, the obtained assembly is encapsulated into a composite-component providing the same services as the initial component (i.e., vertical assembly). Finally, the new components dedicated to the self-adaptation of the component are introduced.
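
For illustration only (this is not the authors' tooling), a script of the form above could be emitted from a service-to-site assignment as follows; the function name and the dictionary layout are assumptions.

def adaptation_script(component: str, fragments: dict) -> str:
    """Emit an ADL-like adaptation script.

    fragments maps each sub-component name to a (site, [services]) pair,
    e.g. {"C1'": ("site2", ["S1", "S3"]), "C1''": ("site1", ["S2", "S4"])}.
    """
    lines = [f'<component name="{component}">']
    for name, (site, services) in fragments.items():
        lines.append(f'  <component name="{name}" site="{site}">')
        lines.extend(f'    <service name="{s}"/>' for s in services)
        lines.append('  </component>')
    lines.append('</component>')
    return "\n".join(lines)

print(adaptation_script("C1", {"C1'": ("site2", ["S1", "S3"]),
                               "C1''": ("site1", ["S2", "S4"])}))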

Figure 2: Structural-adaptive component generation

3 Context-aware component deployment decision-making

As explained previously, to achieve a context-aware deployment, a component has to acquire information related to its execution context. These context data are then interpreted to initiate an adaptation phase and to set up an adaptation strategy.


3.1 Context-aware deployment strategy

Contextual data are used by the component to generate a specification corresponding to a new component structure which matches the current context: each service provided by the component to adapt must be associated with a deployment site, and each set of services associated with a site is merged into the same sub-component. To generate such a specification of the new component structure, we identify three kinds of classification tasks, which differ according to their impact on the component structure. First, the environmental context is used to classify the services provided by the adapted component according to their priority of deployment on the user device. In fact, this priority depends on the similarity between the target context (defined by the application administrator) and the current context (acquired using sensors or user data). However, these classification tasks cannot be generated automatically, because the events and decisions are specific to the application. In this case, the selection strategy must be designed by the application administrator using rules of the form (<condition> ⇒ <action>). The second task consists in selecting the sites on which services are able to be deployed, taking into account the resources available on the different nodes of the distributed infrastructure. This task is achieved by matching the service requirements (defined by the component designer) with the contextual data. We aim at generating a configuration which maximizes the number of high-priority services deployed on the user device or in its neighborhood. Each subset of services associated with a site corresponds to a component generated during the structural adaptation, which will be redeployed on the corresponding site. However, this selection task cannot be entirely automated because of the specificity of the resources required by each service; in fact, this selection is achieved using adaptation policies defined by the component designer. The last task consists in classifying the component services according to data related to the component structure and behavior. This classification aims at optimizing the service distribution by evaluating the dependences among services. In fact, the optimization is based on minimizing remote connections between the newly specified components by merging the most dependent services within sub-components, which are deployed on sites according to their dependences with other application components. Contrary to the two preceding tasks, where treatment specific to the application is required, this task can be entirely automated. This operation is detailed below.

3.2 Service-dependence awareness

The objective of service-dependence awareness is to merge the most dependent services within sub-components deployed on a single device. Dependences between services are of two kinds [SBO06]: functional dependences and dependences related to resource sharing. They require the introduction of remote communications between components (e.g., remote service calls, shared-resource synchronization), whose cost can be substantial. So, the sub-components must be specified taking these dependences into account in order to minimize remote communications.

3.2.1 Evaluable context elements

To set up such selection mechanisms, we need to quantitatively evaluate the dependences among components. To do so, we use the software context.


This context contains a history of the communications among the sub-components. The data collected for each service $S_i$ of the adapted component are the following:

1. the probability that service $S_i$ calls $S_j$, with $S_j \in S_{provided} \cup S_{required}$, noted $P_{use}(S_i,S_j)$; this probability relates the number of calls to $S_j$ during the execution of $S_i$ (direct or indirect calls) to the number of calls to $S_i$;
2. the average number of calls from $S_i$ to $S_j$, with $S_j \in S_{provided} \cup S_{required}$, noted $M_{aver}(S_i,S_j)$;
3. the average number of parameters used when a service $S_j$ is called by $S_i$, noted $Nb_{param}(S_i,S_j)$, as well as the average memory size (in bytes) of these parameters, noted $T_{param}(S_i,S_j)$;
4. the probability of updating a resource, in $S_i$, shared with $S_j$ such that $S_j \in S_{provided}$, noted $P_{update}(S_i,S_j)$;
5. the probability of initiating a critical section in $S_i$ related to a resource shared with $S_j$ such that $S_j \in S_{provided}$, noted $P_{critical}(S_i,S_j)$.

3.2.2 Service-dependence evaluation

The software-context elements are used to evaluate the proximity between the various services provided by the adapted component. The proximity between two services depends on their coupling (i.e., an evaluation of their functional dependences) and on their cohesion (i.e., an evaluation of their dependences related to resource sharing). The coupling between two different services $S_i$ and $S_j$, noted $C_{coupling}(S_i,S_j)$, is evaluated according to the probable number of calls of service $S_j$ during the execution of $S_i$ (and inversely), weighted by the number and type of the parameters exchanged between these two services. This weight is computed from the average number of parameters used for the service call and their average memory size.

$C_{coupling}(S_i,S_j) = \alpha(S_i,S_j)\,\beta(S_i,S_j) + \alpha(S_j,S_i)\,\beta(S_j,S_i)$

where $\alpha(x,y) = T_{param}(x,y) \cdot (Nb_{param}(x,y) + 1)$ and $\beta(x,y) = M_{aver}(x,y) \cdot P_{use}(x,y)$.

The cohesion between two different services $S_i$ and $S_j$, noted $C_{cohesion}(S_i,S_j)$, is evaluated according to the number of critical sections started in each service and the frequency of updates of the resources shared between $S_i$ and $S_j$, weighted by the number and type of these resources. The weight related to the type of resources corresponds to their average memory size, expressed in bytes and noted $T_{sr}(S_i,S_j)$; the number of shared resources is noted $Nb_{sr}(S_i,S_j)$.

$C_{cohesion}(S_i,S_j) = \gamma(S_i,S_j)\,\eta(S_i,S_j) + \chi(S_i,S_j)$

where $\chi(x,y) = P_{critical}(x,y) + P_{critical}(y,x)$, $\eta(x,y) = P_{update}(x,y) + P_{update}(y,x)$, and $\gamma(x,y) = Nb_{sr}(x,y) \cdot T_{sr}(x,y)$.

The proximity between two services $S_i$ and $S_j$ contained in a set $S$ is a binary relation, noted $Pr(S_i,S_j)$, defined as follows:

$$Pr(S_i,S_j) = \begin{cases} 1 & \text{if } S_i = S_j \\[4pt] \dfrac{\alpha \cdot C'_{coupling}(S_i,S_j) + \beta \cdot C'_{cohesion}(S_i,S_j)}{\alpha + \beta} & \text{otherwise} \end{cases}$$

where

$$C'_{coupling}(S_i,S_j) = \begin{cases} 0 & \text{if } C_{coupling\,max} = 0 \\[4pt] \dfrac{C_{coupling}(S_i,S_j)}{C_{coupling\,max}} & \text{otherwise} \end{cases} \qquad C'_{cohesion}(S_i,S_j) = \begin{cases} 0 & \text{if } C_{cohesion\,max} = 0 \\[4pt] \dfrac{C_{cohesion}(S_i,S_j)}{C_{cohesion\,max}} & \text{otherwise} \end{cases}$$

with $C_{coupling\,max} = \max\{\,C_{coupling}(x,y) \mid x,y \in S,\ x \neq y\,\}$ and $C_{cohesion\,max} = \max\{\,C_{cohesion}(x,y) \mid x,y \in S,\ x \neq y\,\}$.

The proximity between two services varies from zero, when the two services are not dependent, to one; by convention, the value one means that the two services are the same. The closer the proximity between two services is to one, the more dependent these services are. $\alpha$ and $\beta$ are the impact factors for coupling and cohesion, respectively. According to usage requirements, the application administrator instantiates these factors by giving more weight to coupling ($\alpha > \beta$) or to cohesion ($\alpha < \beta$). By default, we assume that these two kinds of dependences have identical impacts ($\alpha = 1$ and $\beta = 1$).
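
The following sketch (ours, not the authors' implementation) restates the coupling, cohesion, and proximity formulas in code; the record layout used to hold the measured quantities is an assumption.

from dataclasses import dataclass

@dataclass
class PairMetrics:
    """Measured software-context data for an ordered pair of services (Si, Sj)."""
    p_use: float       # P_use(Si, Sj)
    m_aver: float      # M_aver(Si, Sj)
    nb_param: float    # Nb_param(Si, Sj)
    t_param: float     # T_param(Si, Sj), in bytes
    p_update: float    # P_update(Si, Sj)
    p_critical: float  # P_critical(Si, Sj)
    nb_sr: float       # number of resources shared between Si and Sj
    t_sr: float        # average memory size of those shared resources, in bytes

def coupling(ij: PairMetrics, ji: PairMetrics) -> float:
    alpha = lambda m: m.t_param * (m.nb_param + 1)
    beta = lambda m: m.m_aver * m.p_use
    return alpha(ij) * beta(ij) + alpha(ji) * beta(ji)

def cohesion(ij: PairMetrics, ji: PairMetrics) -> float:
    gamma = ij.nb_sr * ij.t_sr                 # shared-resource weight (symmetric)
    eta = ij.p_update + ji.p_update
    chi = ij.p_critical + ji.p_critical
    return gamma * eta + chi

def proximity(coup: float, coh: float, coup_max: float, coh_max: float,
              alpha: float = 1.0, beta: float = 1.0) -> float:
    """Normalized proximity of two distinct services (equals 1 only for Si = Sj)."""
    c_norm = coup / coup_max if coup_max else 0.0
    h_norm = coh / coh_max if coh_max else 0.0
    return (alpha * c_norm + beta * h_norm) / (alpha + beta)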

3.2.3 Service-clustering algorithm

The evaluation of the proximities between the services provided by the adapted component is used to merge the services into subsets containing the services that are most dependent on each other. Each subset constitutes a component whose provided services are the ones contained in this subset. In addition, the dependences between provided and required services can be used to determine the deployment site of each generated component. The goal is to minimize remote communications with the other application components that provide the services required by the component to adapt. In fact, the generated components must be deployed on the sites which provide services close to the services provided by the adapted component. To obtain an interface partition whose elements are associated with a deployment site, we use a hierarchical clustering algorithm (Fig. 3). It requires as a parameter an array whose cells contain an evaluation of the proximities between provided services and with the deployment nodes of the infrastructure. Note that there are two kinds of clusters: on the one hand, service-clusters, which contain a set of services provided by the adapted component, and, on the other hand, site-clusters, which contain the set of services that are provided by the components deployed on a site and that are required by the adapted component.

Cl_init ← {set of initial clusters}
While ∃ Cl_i ∈ Cl_init such that Cl_i ∩ Sites ≠ ∅ Do
    ∀ Cl_i ∈ Cl_init, ∀ Cl_j ∈ Cl_init ∪ Sites: T[Cl_i, Cl_j] ← Pr(Cl_i, Cl_j)
    Find Cl_maxi and Cl_maxj such that ∀ Cl_i ∈ Cl_init, ∀ Cl_j ∈ Cl_init ∪ Sites,
        Cl_i ≠ Cl_j and T(Cl_maxi, Cl_maxj) ≥ T(Cl_i, Cl_j) and |Cl_maxi ∩ Sites| ≤ 1
    Cl_init = Cl_init ∪ {(Cl_maxi, Cl_maxj)}
    If Cl_maxj ∈ Cl_init then
        Cl_init = Cl_init − {Cl_maxi, Cl_maxj}
    If Cl_maxj ∈ Sites then
        Cl_init = Cl_init − {Cl_maxi}
        Sites = Sites − {Cl_maxj}
Return Cl_init

Figure 3: Service-clustering algorithm

The main idea behind this algorithm is to merge the services provided by the adapted component into clusters according to their proximity, where each cluster must be associated with only one deployment site. Initially, the maximal value of the array is searched for. Two cases can occur: if this value corresponds to the proximity between two service-clusters, these two clusters are merged into a single one; if this value corresponds to the proximity between a service-cluster and a site-cluster, then the service-cluster is associated with the corresponding site. If the cluster is already associated with a site, this value is ignored and the algorithm searches for the maximal value elsewhere in the array.

Once a fusion or an association has been achieved, the proximity array is re-evaluated according to the new clusters. Two solutions are possible to calculate the proximities between two clusters: either all proximity values (coupling and cohesion) are evaluated again according to the services contained in each cluster, or an approximation of the proximity between two clusters is made. The former cannot be considered because of its complexity ($O(n^6)$), which is not acceptable for runtime adaptation. That is why we chose a lower-complexity strategy ($O(n^2)$) based on the average proximity between the cluster elements:

$$Pr(Cl_1,Cl_2) = \frac{1}{|Cl_1|\,|Cl_2|} \sum_{S_i \in Cl_1,\ S_j \in Cl_2} Pr(S_i,S_j)$$

These operations are reiterated until each cluster is associated with a deployment site. The result then consists of the clusters obtained and their associated sites.
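
The following sketch (ours, not the authors' code) implements the greedy clustering loop under two simplifying assumptions: base proximities are supplied as a symmetric dictionary over services and sites, and cluster-level proximity is the average defined above. All names are illustrative. Run on the proximity values of the example in Section 3.2.4, it should reproduce the grouping {S1, S3} -> site2 and {S2, S4} -> site1.

from itertools import combinations

def cluster_services(services, sites, pr):
    """Greedy service-clustering sketch.

    services: iterable of service names provided by the component to adapt.
    sites: iterable of candidate site names.
    pr: dict mapping frozenset({a, b}) -> proximity between two base items
        (service-service or service-site), as computed in Section 3.2.2.
    Returns a list of (set_of_services, assigned_site) pairs; assumes enough
    sites or merge partners exist so every cluster can eventually be assigned.
    """
    def base_pr(a, b):
        return pr.get(frozenset((a, b)), 0.0)

    def avg_pr(cluster, other):                      # average-linkage approximation
        pairs = [(s, o) for s in cluster for o in other]
        return sum(base_pr(s, o) for s, o in pairs) / len(pairs)

    clusters = [(frozenset([s]), None) for s in services]
    free_sites = set(sites)

    while any(site is None for _, site in clusters):
        best, best_value = None, -1.0
        # Candidate 1: merge two clusters (at most one of them already has a site).
        for (i, (ci, si)), (j, (cj, sj)) in combinations(enumerate(clusters), 2):
            if si is not None and sj is not None:
                continue
            value = avg_pr(ci, cj)
            if value > best_value:
                best, best_value = ("merge", i, j), value
        # Candidate 2: associate an unassigned cluster with a still-free site.
        for i, (ci, si) in enumerate(clusters):
            if si is not None:
                continue
            for site in free_sites:
                value = avg_pr(ci, [site])
                if value > best_value:
                    best, best_value = ("assign", i, site), value

        if best[0] == "merge":
            _, i, j = best
            (ci, si), (cj, sj) = clusters[i], clusters[j]
            merged = (ci | cj, si if si is not None else sj)
            clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        else:
            _, i, site = best
            clusters[i] = (clusters[i][0], site)
            free_sites.discard(site)

    return [(set(c), s) for c, s in clusters]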

3.2.4 Example of service-clustering

Consider the following application part: three components, called C1 (provided services: S1, S2, S3, and S4; required services: S5 and S6; deployment site: site 1), C2 (provided service: S5; deployment site: site 1), and C3 (provided service: S6; deployment site: site 2), are assembled (Fig. 4). When the structural adaptation is released, we first evaluate the proximities between the service-clusters ((S1), (S2), (S3), and (S4)) and with the site-clusters (Site1 = {S5} and Site2 = {S6}). We obtain the following result:

Pr       (S1)   (S2)     (S3)     (S4)     Site1    Site2
(S1)     1      0.1203   0.4972   0        0.059    0
(S2)     -      1        0.0409   0.1622   0.0218   0
(S3)     -      -        1        0.0727   0        0.3454
(S4)     -      -        -        1        0.1909   0

First step: the maximal proximity value between different clusters corresponds to the proximity between the two service-clusters (S1) and (S3). These two clusters are therefore merged into a new one, and the proximities are evaluated again.

Pr        (S1,S3)   (S2)     (S4)     Site1    Site2
(S1,S3)   1         0.0806   0.0363   0.0295   0.1727
(S2)      -         1        0.1622   0.0218   0
(S4)      -         -        1        0.1909   0

Second step: in this case, the maximal proximity value between different clusters corresponds to the proximity between a service-cluster, (S4), and a site-cluster, (Site1). So, the cluster (S4) is associated with site 1.

Pr           (S1,S3)   (S2)     (S4)→Site1   Site2
(S1,S3)      1         0.0806   0.0363       0.1727
(S2)         -         1        0.1622       0
(S4)→Site1   -         -        1            0

Third step: the cluster (S1,S3) is associated with site 2.

Pr              (S1,S3)→Site2   (S2)     (S4)→Site1
(S1,S3)→Site2   1               0.0806   0.0363
(S2)            -               1        0.1622
(S4)→Site1      -               -        1

Fourth step: the two clusters (S2) and (S4) are merged into a new one, which is associated with site 1 because (S4) has already been associated with this site.

Pr              (S1,S3)→Site2   (S2,S4)→Site1
(S1,S3)→Site2   1               0.0585
(S2,S4)→Site1   -               1


Figure 4: Service clustering according to their dependences

As all service-clusters are associated with a site, the algorithm terminates. We obtain two clusters: the first one, associated with site 2, contains services S1 and S3; the second one, which contains S2 and S4, is associated with site 1. So, the adaptation of component C1 consists in its fragmentation into two new components, called C1' (provided services: S1 and S3) and C1'' (provided services: S2 and S4), which will be deployed on site 2 and site 1, respectively.

4 Related work

We classify related work according to two criteria. First, we present the works related to the goal of our approach, namely context-aware deployment. The second criterion covers works which propose creating self-adaptive components in order to take the execution context into account.

4.1 Context-aware deployment

Many works focus on context-aware deployment. Most of them are based on determining the best strategy for placing application components onto sites according to the context. Dearle et al. propose in [DKM04] a framework for the deployment and subsequent autonomic management of component-based applications. The deployment goal is specified by an administrator using a declarative language. Then, a constraint solver is used to generate a valid configuration. Each solution takes the form of a configuration, which describes a particular mapping of components to sites and an interconnection topology that satisfies the constraints. AMPROS [ATSB04] is a platform for the context-aware deployment of component-based applications in mobile environments. The deployment adaptation is carried out by a set of adaptive components able to modify the application architecture together with the component attributes. However, few approaches offer a flexible strategy for deploying a component on a device whose resources are not sufficient to ensure service continuity; such approaches partition components in order to distribute them.

The Coign project [HS99] is a partitioning system based on Microsoft's COM model. It aims at transforming applications built from distributable COM components into distributed ones. Coign constructs a graph model of the component communications through scenario-based profiling. Then, it applies a graph-cutting algorithm to partition the components according to the infrastructure. However, Coign does not take all partitioning issues into account. Jamwal and Iyer propose in [JI05] creating breakable objects, called BoBs, in order to build flexible architectures. A BoB is an entity that can be split into sub-entities. It has a simple structure which allows easy code partitioning and redeployment. However, BoBs impose restrictions on breakable objects (e.g., no public fields). All existing approaches which aim at partitioning component structures are based on specific component models. Moreover, no approach proposes generating a new component structure that takes service dependences into account, although this can be useful for improving service quality.

4.2 Self-adaptive components

Many approaches have been proposed to achieve self-adaptive components. Safran [DL03] is an extension of the Fractal component model to support the development of self-adaptive components. Safran is based on the introduction of a reflective extension in order to modify the component behavior according to the context. The adaptation process is achieved using an adaptation policy for each component. Each adaptation policy is a set of ECA rules which can be updated at runtime. Adaptation actions consist of re-assembling the components. The K-Component [DC01] model aims at creating dynamic software architectures for building adaptive systems. The reified architecture is presented as a graph whose nodes represent interfaces and are annotated with the name of the component in which they are contained; each arc represents a connector between two interfaces and is annotated with connection properties. Reconfiguration consists of graph transformations. However, only annotations can be modified, so this system allows only component replacement or connection updates. In spite of the diversity of proposed approaches, all those interested in component adaptation focus on behavior adaptation, and few works aim at adapting component structures. In addition, to our knowledge, all these works focus on adapting the component implementation by replacing one implementation with another at runtime. The result is that no approach proposes techniques for restructuring software components, although this adaptation can be required in many cases.

5 Conclusion and future work

We have presented in this paper an approach aiming at reconfiguring a component's structure to allow a flexible deployment of its services. Such components must conform to a canonical format which is based on interface reification. Besides, a component must contain the components enabling it to acquire and analyze its context in order to determine an adapted structure, guaranteeing its service continuity and improving its service quality. Then, the specified structure is generated by encapsulating interface-components within new sub-components which can be redeployed.


The context-aware deployment process has been implemented using the Julia software component framework, which is a Java implementation of the Fractal component model [BCL+04]. As mentioned previously, our approach aims at guaranteeing the service continuity of components. However, our adaptation process involves an overhead related to the management of the communication and synchronization between the generated sub-components and to the decision-making mechanisms. We are currently evaluating this overhead. Our approach allows us to adapt software component deployment. However, as an application is built by assembling components, adapting an application requires coordinating the adaptation of its components. This issue is left for future work.

Bibliography

[ATSB04] D. Ayed, C. Taconet, N. Sabri, G. Bernard. Context-Aware Distributed Deployment of Component-Based Applications. In Proc. of OTM Workshops. Pp. 36–37. 2004.

[BCL+04] E. Bruneton, T. Coupaye, M. Leclercq, V. Quéma, J.-B. Stefani. An Open Component Model and Its Support in Java. In Proc. of the Symp. on Component-Based Software Engineering (CBSE). Pp. 7–22. 2004.

[BSO06] G. Bastide, A. Seriai, M. Oussalah. Dynamic adaptation of software component structures. In Proc. of the Int. Conf. on Information Reuse and Integration (IRI). Pp. 404–409. 2006.

[DC01] J. Dowling, V. Cahill. The K-Component Architecture Meta-model for Self-Adaptive Software. In Proc. of the Int. Conf. on Metalevel Architectures and Separation of Crosscutting Concerns. Pp. 81–88. 2001.

[DKM04] A. Dearle, G. N. C. Kirby, A. J. McCarthy. A Framework for Constraint-Based Deployment and Autonomic Management of Distributed Applications. In Proc. of the Int. Conf. on Autonomic Computing. Pp. 300–301. 2004.

[DL03] P.-C. David, T. Ledoux. Towards a Framework for Self-adaptive Component-Based Applications. In Proc. of the Int. Conf. on Distributed Applications and Interoperable Systems (DAIS). Pp. 1–14. 2003.

[HS99] G. C. Hunt, M. L. Scott. The Coign automatic distributed partitioning system. In Proc. of the Int. Symp. on Operating Systems Design. Pp. 187–200. 1999.

[JI05] V. Jamwal, S. Iyer. BoBs: breakable objects. In Proc. of OOPSLA '05, systems, languages, and applications. Pp. 98–99. ACM Press, New York, NY, USA, 2005.

[SBO06] A. Seriai, G. Bastide, M. Oussalah. Transformation of Centralized Software Components into Distributed Ones by Code Refactoring. In Proc. of the Int. Conf. on Distributed Applications and Interoperable Systems (DAIS). Pp. 332–346. 2006.

[Szy98] C. Szyperski. Component software: beyond object-oriented programming. ACM Press/Addison-Wesley, 1998.


On the resilience of classes to change

Rajesh Vasa1, Jean-Guy Schneider1, Oscar Nierstrasz2 and Clinton Woodward1

1 Faculty of Information & Communication Technologies Swinburne University of Technology P.O. Box 218, Hawthorn, VIC 3122, AUSTRALIA [email protected], [email protected], [email protected]

2 Institute of Computer Science University of Bern Bern, CH-3012, SWITZERLAND [email protected]

Abstract: Software systems evolve incrementally over time and sections of code are modified. But how much does code really change? Lehman's laws suggest that software must be continuously adapted to remain useful. We have studied the evolution of several public domain object-oriented software systems and analyzed the rate as well as the amount of change that individual classes undergo as they evolve. Our observations suggest that although classes are modified, the majority of changes are minor and only a small proportion of classes undergo significant modification.

Keywords: Open-source, change, metrics.

1 Introduction

It is a well-established fact that software systems change and become more complex over time as they are used in practice [LB85]. However, it is less well-understood how change and complexity are distributed over time, and there has been little research conducted into understanding how change is distributed over the parts of object-oriented software systems. In previous work, we have studied typical growth and change patterns in open-source, object-oriented software systems and shown that although software grows and changes over time, the structure and scope of both growth and change is, in general, predictable rather than erratic or purely random [VSWC05, VLS07, VSN07]. This leads us to ask whether we can gain a more detailed insight into where change occurs, and the degree to which change can be expected. Not only do we need a suitable distance measure to indicate how much a class or component changes, but we should also collect information about change at a fine-grained level and perform a general landscape analysis. This will then allow us to address questions like:

• What proportion of a release contains code that has never been touched since creation?

• What is the probability that a class is modified after it is created?

• How is modification frequency distributed for classes that do change?


• Does a class or component tend to change a lot, or are most modifications minor adjustments?

Continuing our previous work, we have analyzed a number of open-source applications that have evolved over at least 18 releases during a period of at least 28 months. For each of these applications we have collected information on the amount of change individual classes go through as they evolve over time. The key results of our studies show that:

1. Most classes will be modified at least once during their lifetime, but a substantial proportion of classes stay unchanged during their entire history.

2. Of the classes that are modified, the probability that a class is modified multiple times is quite low.

3. The amount of change that most classes undergo is also minimal. However, a small proportion of classes is modified significantly.

The rest of this paper is organized as follows: in Section 2 we provide an overview of our experimental method, and we justify the selection of the case studies. Section 3 presents the results of our studies. In Section 4 we suggest possible interpretations and consequences of our observations, followed by a discussion of some limitations of our approach in Section 5. Section 6 provides a brief overview of related work. We conclude in Section 7 with some remarks about future work.

2 Experimental method

In this section, we briefly present the systems studied. Next, we describe the means by which measurements are performed, followed by a discussion about the measures that we collected. Finally, we illustrate the approach used to detect clones and measure change.

2.1 Input data set selection

As in our previous work [VSWC05, VLS07, VSN07], we have restricted our study to open-source software developed using the Java programming language. The main reasons for selecting open-source software are their availability, access to change logs (such as developer release notes), as well as licensing practices that allow access to both source and object code. The choice of systems using the Java programming language was influenced by its use in a variety of application domains, as well as by the availability of a suitable infrastructure to implement the necessary metrics tool. Although there is a large pool of candidate systems, we have limited our selection to 12 representative systems (cf. Table 1) for this study. Our data set contains a total of 310 releases. All systems analyzed in this study have at least 18 releases and a development history of 28 months or more. A Release Sequence Number (RSN) [CL66] is used to uniquely and consistently identify each release version of a given system. The first version of a system is numbered 1 and each subsequent version increases the number by one. Hence, RSNs are universally applicable and independent of any release numbering schedule and/or scheme.


Name        Releases  Time Span  Initial Size  Current Size  Description
Axis        23        65 mo.     166           636           Apache SOAP server
Azureus     21        41 mo.     103           4780          BitTorrent client
Castor      27        48 mo.     483           691           Data binding framework
Checkstyle  26        75 mo.     18            309           Coding standard checker
Findbugs    20        36 mo.     308           839           Automated bug finding application
Groovy      20        38 mo.     170           886           Dynamic language for JVM
Hibernate   47        73 mo.     120           1055          Object-relational mapping framework
Jung        21        44 mo.     157           705           Universal network/graph framework
Spring      42        43 mo.     386           1570          Light-weight container
Struts      18        28 mo.     106           300           Servlet/JSP framework
Webwork     20        36 mo.     75            473           Web application framework
Wicket      25        30 mo.     181           631           Web application framework

Table 1: Systems under analysis for this study with size being the number of classes and interfaces.

2.2 Extracting Measures

In order to perform the analysis, we developed a metrics extraction tool [VSWC05], which analyzes Java Bytecode and extracts data to capture the degree of change of a system with respect to its size and complexity. Java Bytecode generally reveals almost as much about a system as its source code, and only some subtle changes to a software system cannot be detected using this approach (e.g., the use of local variables). Our metrics extraction tool takes as input the core JAR files for each release of a system, and extracts metrics by processing the raw Java Bytecode. This approach allows us to avoid running a potentially complex build process for each release, and limits analysis to "code" that has been correctly compiled as the developers intended. For each class1 in a system under analysis, we extract simple measures such as the number of methods, fields and branches, as well as the set of classes that this class depends upon, either as direct client or direct subclass. Type dependency graph analysis [VCS02] can then be used to compute other measures such as Fan-In [VSN07].
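The authors' extraction tool is not distributed with the paper; purely as an illustration, the following sketch shows how this kind of bytecode-level counting could be implemented, assuming the ASM bytecode library (org.objectweb.asm). The class name SimpleClassMetrics and its fields are ours, not the authors' code, and only three of the 43 measures are counted.

import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.FieldVisitor;
import org.objectweb.asm.Label;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

import java.io.IOException;
import java.io.InputStream;

/** Illustrative sketch: count methods, fields and branch instructions of one class file. */
public class SimpleClassMetrics extends ClassVisitor {

    int methodCount;
    int fieldCount;
    int branchCount;
    String className;

    public SimpleClassMetrics() {
        super(Opcodes.ASM9);
    }

    @Override
    public void visit(int version, int access, String name, String signature,
                      String superName, String[] interfaces) {
        className = name;
    }

    @Override
    public FieldVisitor visitField(int access, String name, String descriptor,
                                   String signature, Object value) {
        fieldCount++;
        return null;
    }

    @Override
    public MethodVisitor visitMethod(int access, String name, String descriptor,
                                     String signature, String[] exceptions) {
        methodCount++;
        // Count conditional and unconditional jump instructions as "branches".
        return new MethodVisitor(Opcodes.ASM9) {
            @Override
            public void visitJumpInsn(int opcode, Label label) {
                branchCount++;
            }
        };
    }

    /** Reads one .class file (e.g. an entry of a release JAR) and returns its counts. */
    public static SimpleClassMetrics of(InputStream classFile) throws IOException {
        SimpleClassMetrics metrics = new SimpleClassMetrics();
        new ClassReader(classFile).accept(metrics, 0);
        return metrics;
    }
}

In the same spirit, the remaining count measures of Table 2 would be gathered by overriding further visitor callbacks (load/store instructions, try/catch blocks, inner classes, and so on).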

2.3 Software measures

Using the metrics extraction tool, we have extracted 43 different count measures for each class in each system analyzed. The names we have given to these measures are listed in Table 2 and include, amongst others, Fan-Out and Branch Count (i.e., the number of branch instructions in the Java Bytecode), Load Instruction Count (i.e., the number of load instructions), and Store Instruction Count (i.e., the number of store instructions). A detailed discussion of all these measures is beyond the scope of this work. However, they naturally align with the most common instructions of the Java Bytecode as well as covering some basic structural relationships. Furthermore, to support a more detailed comparison, we store the name of each class, its superclass name, all method names (including full signatures), field names and the names of all other classes that a class depends upon.

1 To improve readability we will refer to “classes” when we mean “classes or interfaces”. We will only refer to “types” in the context of the formal measures.


abstractMethodCount branchCount constantLoadCount exceptionCount externalMethodCallCount fanOutCount fieldCount finalFieldCount finalMethodCount iLoadCount incrementOpCount innerClassCount interfaceCount internalFanOutCount internalMethodCallCount isAbstract isException isInterface isPrivate isProtected isPublic iStoreCount loadFieldCount localVarCount methodCallCount methodCount privateFieldCount privateMethodCount protectedFieldCount protectedMethodCount publicFieldCount publicMethodCount refLoadOpCount refStoreOpCount staticFieldCount staticMethodCount storeFieldCount superClassCount synchronizedMethodCount throwCount tryCatchBlockCount typeInsnCount zeroOpInsnCount

Table 2: Java Bytecode measures extracted for analysis.

This information is then used in our clone detection method discussed in the next section.

2.4 Detecting clones and measuring change

In order to perform our analysis, in particular to detect clones and measure changes between versions of a given system, we consider the following information for each class under analysis: (i) the fully qualified class name, including its class modifiers (i.e., public, private, protected, interface, final or abstract), (ii) the name of its direct super class, (iii) the names, types and class modifier(s) of all fields, (iv) the name, signature and class modifier(s) of all methods, and (v) the set of classes this class depends upon. Additionally, we extract all 43 class level measures for each version of a class under investigation. For our analysis, we consider two classes to be clones of each other (we treat them as being identical) if all metrics for each class are identical (i.e., all 43 measures have the same value, same class name with the same modifiers, etc.). Furthermore, the distance between two classes is defined as the number of measures (out of the 43) that differ (i.e., if two of the 43 measures differ between two classes, then they have a distance of 2); a code sketch of this comparison is given after the list below. Note that even if two classes have a distance of 0, they may not be identical (e.g., the name of a field is modified). However, our analysis has revealed that this is only rarely the case and over all 310 versions we analyzed, at most 4% of modified classes have a distance of 0. Finally, we consider the history of a system as the set of versions, ordered by RSN, where a version is the set of all classes contained in a particular release. We compare the final version of each system with all previous versions (i.e., over the entire evolutionary history) in order to compute the following information:

• The proportion of classes in the final version that remain unchanged since creation as well as the proportion that are modified after being created.

• The number of times a class has been modified since its creation.

• The amount of modification that each class has undergone since its creation (i.e., modification amplitude).
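As a rough illustration of the clone test and distance measure defined above, the following sketch stores the identity information and the 43 count measures per class version and compares two versions accordingly. The type and field names are ours, not the authors' implementation.

import java.util.Arrays;
import java.util.Objects;
import java.util.Set;

/** Illustrative sketch of the clone test and distance measure of Section 2.4. */
final class MetricsRecord {
    final String qualifiedName;          // fully qualified class name, incl. class modifiers
    final String superClassName;
    final Set<String> memberSignatures;  // field and method names, types and modifiers
    final Set<String> dependencies;      // classes this class depends upon
    final int[] measures;                // the 43 count measures of Table 2

    MetricsRecord(String qualifiedName, String superClassName,
                  Set<String> memberSignatures, Set<String> dependencies, int[] measures) {
        this.qualifiedName = qualifiedName;
        this.superClassName = superClassName;
        this.memberSignatures = memberSignatures;
        this.dependencies = dependencies;
        this.measures = measures;
    }

    /** Two class versions are clones when all identity information and all 43 measures match. */
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof MetricsRecord)) return false;
        MetricsRecord r = (MetricsRecord) other;
        return qualifiedName.equals(r.qualifiedName)
                && Objects.equals(superClassName, r.superClassName)
                && memberSignatures.equals(r.memberSignatures)
                && dependencies.equals(r.dependencies)
                && Arrays.equals(measures, r.measures);
    }

    @Override
    public int hashCode() {
        return Objects.hash(qualifiedName, superClassName, memberSignatures, dependencies)
                * 31 + Arrays.hashCode(measures);
    }

    /** Distance: the number of the 43 measures whose values differ between two versions. */
    int distanceTo(MetricsRecord other) {
        int distance = 0;
        for (int i = 0; i < measures.length; i++) {
            if (measures[i] != other.measures[i]) distance++;
        }
        return distance;
    }
}

Note how equals() can return false while distanceTo() returns 0, which is exactly the rare "distance 0 but not identical" case mentioned above.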


The information gathered from the first point gives us an overall picture of the number of classes that stay unchanged over their lifetime. The last point indicates if changes are substantial, and can provide an indication of the distribution of small and large changes.

3 Observations

We now summarize our observations from analyzing the 12 systems listed in Table 1. First, we analyze the percentage of classes that have not changed since being added into the respective system. We then consider the set of modified classes and analyze the number of times they have been modified. Finally, we illustrate how much modified classes change over time.

3.1 Probability of change

At a class level, systems evolve by the addition of new classes, the modification of existing classes, or the removal of classes. Hence, given an evolutionary history, for any given version we can identify three types of classes: new classes, classes that have been modified at some point in their life, and classes that have not changed since their creation. In our analysis, we use the last release of a given system as the base-line for comparison. We consider a class as being unchanged if it is a clone of its first release, i.e., the release in which it was first defined. Any potential changes in between are not analyzed in our approach. Similarly, we consider a class as having changed (at least once) if it is not identical to its first release. A class is considered to be new if it did not appear in any release before the last one. The results of our analysis are illustrated in Figure 1. Except for Webwork and Jung, the percentage of unchanged classes is lower than the percentage of modified classes. However, it is interesting to see that the percentage of unchanged classes ranges between 20% and 55%, depending on the system. The lower end of the range indicates that in general there is a certain proportion of classes that are never touched, and the average, which is closer to 40%, suggests that there is a certain inherent resistance to change in many of the classes once they have been added. Also, our analysis shows that the number of new classes introduced in the last release is small and generally tends to be under 5%.
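A minimal sketch of this classification, re-using the hypothetical MetricsRecord type from Section 2.4 above, could look as follows. How the two release maps are built is our assumption, not taken from the paper; in particular, the first map is assumed to cover all releases except the last one, so that classes appearing only in the last release come out as NEW.

import java.util.Map;

/** Illustrative classification of a class found in the final release. */
enum ChangeStatus { NEW, UNCHANGED, MODIFIED }

class ChangeClassifier {
    /**
     * firstSeenBeforeLast maps each class name to its record in the release where it was
     * first defined, considering all releases except the last; finalRelease maps class
     * names to their records in the last release under study.
     */
    static ChangeStatus classify(String className,
                                 Map<String, MetricsRecord> firstSeenBeforeLast,
                                 Map<String, MetricsRecord> finalRelease) {
        MetricsRecord first = firstSeenBeforeLast.get(className);
        if (first == null) {
            return ChangeStatus.NEW;                        // never seen before the last release
        }
        MetricsRecord last = finalRelease.get(className);
        return first.equals(last) ? ChangeStatus.UNCHANGED  // clone of its first definition
                                  : ChangeStatus.MODIFIED;  // changed at least once
    }
}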

3.2 Rate of modification

As noted in the previous section, only a small proportion of all classes undergoes some change. However, how often are these classes modified? What does the distribution of this modification look like? Again, we use the last release as our base-line and only report on classes that are part of this release; any classes that have been removed before then are not considered. From our previous work [VSN07] we know that only about 2% to 5% of classes fall into this category. Considering only classes that have been modified at some point in their life cycle and counting the number of times that they have been modified, we can observe that, on average, 50% thereof are modified less than 3 times. In Jung, for example, 90% of modified classes are changed less than 3 times. Furthermore, as shown in Figure 2, all of the systems have a similar modification profile. Due to the way we have charted the values using absolute modification counts, Axis and Jung show up as minor outliers.
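The paper does not spell out exactly how the modification count is computed; the following sketch (again our own naming, building on the hypothetical MetricsRecord type above) shows one plausible reading, counting a modification whenever a class differs from its record in the immediately preceding release.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch: count, per class, how many consecutive-release transitions changed it. */
class ModificationCounter {

    /** history is the list of releases ordered by RSN; each release maps class names to records. */
    static Map<String, Integer> countModifications(List<Map<String, MetricsRecord>> history) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 1; i < history.size(); i++) {
            Map<String, MetricsRecord> previous = history.get(i - 1);
            for (Map.Entry<String, MetricsRecord> entry : history.get(i).entrySet()) {
                MetricsRecord before = previous.get(entry.getKey());
                // Only count classes present in both releases whose records differ (not clones).
                if (before != null && !before.equals(entry.getValue())) {
                    counts.merge(entry.getKey(), 1, Integer::sum);
                }
            }
        }
        return counts;
    }
}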



Figure 1: Proportion of new, unchanged, and modified classes using final version as base-line.

On average, 5% or less of all modified classes are changed more than 8 times in their life cycle. This suggests that the probability that a class is modified multiple times is quite low and that this probability decreases non-linearly, as shown in Figure 2. Please note that only 0.01% of classes were modified more than 25 times.

3.3 Distribution of the amount of change

As software evolves, we have observed that there is a proportion of code that changes, but the number of modifications is in most cases minimal. But how much change does happen, i.e., how many measures do actually change? Is there a typical profile for most systems, like the one we observe in the modification frequency? In order to detect the amount of change, we focus on the classes in the final version that have been modified at least once in their life cycle. We then compute the number of measures (out of the 43 possible) that have changed between the version in which the class was first defined and the final version. This high-level approach allows us to see if there is a common profile across the various systems, but does not indicate how big the changes in each of the 43 measures are. Our observations for all systems under analysis are shown in Figure 3. The reader may first notice that a few classes do not indicate a change in any of the measures at all. This is possible since the clone detection technique that we apply also takes into consideration method signatures, field names, etc. (cf. Section 2.4). However, there is a probability that some other measure that we do not collect has changed in such a situation. On average, in our data less than 4% of the classes fall into this category. When exploring the data for a profile, we observed that 6 systems are much more closely aligned than the other 6 systems, and one of the systems, Groovy, has a substantially different profile. However, despite the broad range, much of the cumulative growth is fairly linear for 90% of the classes across multiple systems, and only 10% of the modified classes have more than 23 measures modified. The cumulative growth for the remaining 10% is no longer linear and needs further analysis.



Figure 2: Cumulative distribution of the modification frequency of classes that have undergone a change in their lifetime.

The reader may note that as some of the measures are closely related (e.g., a store instruction is generally aligned with at least one load instruction), it can be expected that they change in a similar way. Early analysis indeed suggests that the number of load instructions and the number of store instructions have a very close relationship. However, further analysis is required to clarify the correlation between the various measures.

4 Interpretation

Probability of change: In all of the systems that we have studied, a good proportion of classes remained unchanged. This indicates that some abstractions tend to stay very stable after they are created. The range of values for different systems suggests that this stability depends on the domain and possibly the development approach as well as the architectural style that the team has adopted early in their life cycle.

Rate of modification: Of the classes that have changed, most have been touched a few times – only a small proportion is modified several times. This modification profile is very similar in all systems under analysis, suggesting that most classes in a system tend to reach a stable state very quickly. Combined with our earlier observation that a good proportion of code is never touched, this suggests that development teams tend to create a stable set of abstractions very quickly. Our findings provide further support to a similar conclusion reached by Kemerer et al. [KS97].



Figure 3: Number of measures that change for modified classes.

Distribution of the modifications: Modifications are unavoidable as systems evolve. However, very few of the classes tend to experience a high level of modification (as computed by the number of measures that change). Although our approach does not reveal the actual amount of code changed, it provides us with a broad indicator which may serve as a starting point for further analysis.

5 Limitations

In order to place our study in context, we highlight some of the known limitations of our approach in addition to our findings. The clone detection method used in this work may pick up false positives since there may be changes to a class that our 43 metrics are unable to detect. We intend to improve this by adding further metrics and additional aspects like the method calls and the call sequence. Furthermore, our distance measure compares the initial version of a class to the final version. This may miss edits where a class is modified and then returned to its original shape, as seen by the metrics. This limitation can be addressed by looking at distance incrementally. As a consequence, the analysis approach would have to be adjusted to take into consideration the way the metric information is being collected. Our method of computing modification misses classes that have been renamed; they will be considered as a deletion and an addition. This is a weakness that we are addressing in future work by improving clone detection to accept a certain level of name changes (e.g., package name changes). We have restricted our input data set to open-source software targeting the Java VM. This limits the ability to interpret our findings in other languages, since every language tends to promote a certain culture, and a different language may have a different outcome. Furthermore, commercially developed software systems, or systems that are substantially larger than those in our investigation, may reveal different patterns of stability and maintenance profiles.

6 Related work

Kemerer and Slaughter have studied the profile of software maintenance in five business systems at the granularity of modules. They conclude that very few modules change frequently, and those that do are considered to be strategic [KS97]. Gîrba has noted that classes that have changed in the past are also those most likely to change in the future, but these classes are in the minority [Gir05]. A number of researchers have studied methods of detecting the existence of a class in previous versions (i.e., a clone) using a range of different techniques, from string matching [DNR06, Joh94, RD03] and abstract syntax trees [BYM+98] to metrics-based fingerprinting [ACCD00, KDM+96]. In our study we detect clones by combining metrics as well as string matching. We collect string information to the extent possible by bytecode analysis (for example, method signatures and dependent type names). Although definitions and methods to detect clones have been provided, a comprehensive study with the intention of understanding change and stability using clone detection has not previously been done. Origin analysis [ZG05] uses a semantic perspective of the context and usage of code in order to determine where, why and how changes have occurred. The technique also makes use of version control log data to determine the true origin of code components and changes. Their technique is in contrast to metrics-based methods, such as our own and those proposed by Demeyer et al. [DDN00], which are designed to identify specific components rather than origins. Barry et al. [BKS03] describe software volatility as a concept with 3 dimensions: amplitude (size of change), periodicity (frequency of change) and deviation (consistency of change). Using a phase sequence analysis approach, they detected a distinct set of volatility patterns. Their data set involved studying maintenance activity logs from 23 different systems. In our approach, we have focused on information that can be collected automatically rather than by parsing log files.

7 Conclusions and Future Work

In this paper, we have investigated where change occurs within 12 open-source Java systems that have evolved over a period of at least 2 years, at the granularity of classes, through bytecode analysis. Our study shows that when we look at the latest version of a given system, around a third (or more) of the classes are unchanged since being added to the code base. Of the modified classes, very few are changed multiple times, suggesting an inherent resistance to change. Further analysis suggests that only a small minority of the classes tend to undergo substantial change. These findings show that maintenance effort, which is considered to be a substantial proportion of the development effort (post initial versions), is spent on adding new classes. Furthermore, when existing classes need to be modified, the probability of large alterations is generally quite low.


Our work leads us to ask the following questions as possible future work: Is there an inherent profile for classes that tend to change significantly? Are there any factors that correlate with classes becoming stable? When changes are made, are large modifications made in a single version or do they tend to be made over multiple versions? When a class is modified a few times, are the changes minor? Is there any strong correlation that suggests that either size or complexity plays a role in the tendency for a class to be modified? Can one characterize the nature of frequently occurring changes?

Acknowledgements: The authors would like to thank Orla Greevy for her comments on a draft of this paper. Oscar Nierstrasz gratefully acknowledges the financial support of the Swiss National Science Foundation for the project “Analyzing, capturing and taming software change” (SNF Project No. 200020-113342, Oct. 2006 - Sept. 2008).

Bibliography

[ACCD00] G. Antoniol, G. Canfora, G. Casazza, A. De Lucia. Information Retrieval Models for Recovering Traceability Links between Code and Documentation. In Proceedings of the International Conference on Software Maintenance (ICSM 2000). Pp. 40–49. 2000. doi:10.1109/ICSM.2000.883003

[BKS03] E. J. Barry, C. F. Kemerer, S. A. Slaughter. On the Uniformity of Software Evolution Patterns. In Proceedings of the International Conference on Software Engineering (ICSE 2003). Pp. 106–113. 2003. doi:10.1109/ICSE.2003.1201192

[BYM+98] I. Baxter, A. Yahin, L. Moura, M. Sant'Anna, L. Bier. Clone Detection Using Abstract Syntax Trees. In Proceedings of the International Conference on Software Maintenance (ICSM 1998). Pp. 368–377. IEEE Computer Society, Washington, DC, USA, 1998. doi:10.1109/ICSM.1998.738528

[CL66] D. Cox, P. Lewis. The Statistical Analysis of Series of Events. In Monographs on Applied Probability and Statistics. Chapman and Hall, 1966.

[DDN00] S. Demeyer, S. Ducasse, O. Nierstrasz. Finding Refactorings via Change Metrics. In Proceedings of 15th International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA ’00). Pp. 166–178. ACM Press, New York NY, 2000. Also appeared in ACM SIGPLAN Notices 35 (10). doi:10.1145/353171.353183

[DNR06] S. Ducasse, O. Nierstrasz, M. Rieger. On the Effectiveness of Clone Detection by String Matching. Journal of Software Maintenance and Evolution: Research and Practice (JSME) 18(1):37–58, Jan. 2006. doi:10.1002/smr.317


[Gir05] T. Gîrba. Modeling History to Understand Software Evolution. PhD thesis, University of Berne, Berne, Nov. 2005.

[Joh94] J. H. Johnson. Substring Matching for Clone Detection and Change Tracking. In Proceedings of the International Conference on Software Maintenance (ICSM 94). Pp. 120–126. 1994. doi:10.1109/ICSM.1994.336783

[KDM+96] K. Kontogiannis, R. DeMori, E. Merlo, M. Galler, M. Bernstein. Pattern Matching for Clone and Concept Detection. Journal of Automated Software Engineering 3:77–108, 1996. doi:10.1007/BF00126960

[KS97] C. F. Kemerer, S. A. Slaughter. Determinants of Software Maintenance Profiles: An Empirical Investigation. Software Maintenance: Research and Practice 9(4):235–251, 1997.

[LB85] M. Lehman, L. Belady. Program Evolution: Processes of Software Change. Academic Press, London, 1985.

[RD03] F. V. Rysselberghe, S. Demeyer. Reconstruction of Successful Software Evolution Using Clone Detection. In Proc. of International Workshop on Principles of Software Evolution (IWPSE). Pp. 126–130. 2003.

[VCS02] S. Valverde, R. F. Cancho, R. Sole. Scale-free networks from optimal design. Europhysics Letters 60(4):512–517, 2002.

[VLS07] R. Vasa, M. Lumpe, J.-G. Schneider. Patterns of Component Evolution. In Lumpe and Vanderperren (eds.), Proceedings of the 6th International Symposium on Software Composition (SC 2007). Pp. 244–260. Springer, Braga, Portugal, Mar. 2007.

[VSN07] R. Vasa, J.-G. Schneider, O. Nierstrasz. The Inevitable Stability of Software Change. In Proceedings of 23rd IEEE International Conference on Software Maintenance (ICSM ’07). IEEE Computer Society, Los Alamitos CA, 2007. To appear.

[VSWC05] R. Vasa, J.-G. Schneider, C. Woodward, A. Cain. Detecting Structural Changes in Object-Oriented Software Systems. In Verner and Travassos (eds.), Proceedings of 4th International Symposium on Empirical Software Engineering (ISESE ’05). Pp. 463–470. IEEE Computer Society Press, Noosa Heads, Australia, Nov. 2005. doi:10.1109/ISESE.2005.1541855

[ZG05] L. Zou, M. Godfrey. Using Origin Analysis to Detect Merging and Splitting of Source Code Entities. IEEE Transactions on Software Engineering 31(2):166–181, 2005. doi:10.1109/TSE.2005.28


Refactoring of UML models using AGG

Alessandro Folli, Tom Mens

Université de Mons-Hainaut

Abstract: Model refactoring is an emerging research topic that is heavily inspired by refactoring of object-oriented programs. Current-day UML modeling environments provide poor support for evolving UML models and applying refactoring techniques at the model level. As UML models are intrinsically graph-based in nature, we propose to use graph transformations to specify and apply model refactorings. In particular, we use a specific graph transformation tool, AGG, and provide recommendations on how AGG may be improved to better support model refactoring.

Keywords: UML, Model Refactoring, AGG, Graph Transformation

1 Introduction

Model-driven engineering (MDE) is a software engineering approach that aims to accelerate development, to improve system quality, and to enable reuse. Its goal is to tackle the complexity of developing, maintaining and evolving complex software systems by raising the level of abstraction from source code to models. The mechanism of model transformation is at the heart of this approach, and represents the ability to transform and manipulate models [SK03]. Model transformation definition, implementation and execution are critical aspects of this process. The problem goes beyond having languages to represent model transformations, because the transformations also need to be reused and to be integrated into software development methodologies and development environments that make full use of them. The term refactoring was originally introduced by Opdyke in his seminal PhD dissertation [Opd92] in the context of object-oriented programming. Martin Fowler [Fow99] defines this activity as "the process of changing a software system in such a way that it does not alter the external behaviour of the code, yet improves its internal structure". Current-day refactorings focus primarily on the source code level and do not take into account the earlier stages of design. A need exists for tools that enable designers to better manipulate their models, not just their source code. Furthermore, there is a need to synchronise and maintain consistency between models and their corresponding code; source code refactorings may need to be supplemented with model-level refactorings to ensure their consistency. This article will focus on the problem of model refactoring, which is a particular kind of model transformation. The Unified Modeling Language (UML) [Obj05b, Obj05a] can be used to specify, visualize, and document models of software systems, including their structure and design. UML allows software models to be manipulated using model transformations. Therefore, the goal of this article is to explore the refactoring of UML models. We use graphs to represent UML models and graph transformations to specify and apply model transformations. This choice is motivated by the fact that graphs are a natural representation of models that are intrinsically graph-based in nature (e.g., Class diagrams, State Machine diagrams, Activity diagrams, Sequence diagrams).

Graph transformation theory has been developed over the last three decades as a suite of techniques and tools for formal modeling and very high-level visual programming. It allows complex transformations to be represented in a compact visual way. Moreover, graph transformation theory provides a formal foundation for the analysis and the automatic and interactive application of model transformations. One of the advantages of this is the ability to formally reason about such refactorings, for example by analysing their parallel and sequential dependencies [MTR07, Men06]. AGG [Tae99, TU 06] is a rule-based visual programming environment supporting an algebraic single-pushout approach [EM93] to graph transformation. AGG may be used as a general purpose graph transformation engine in high-level Java applications employing graph transformation methods. In this article, we show that the use of graph transformations for the purpose of model refactoring is possible and useful. As a proof of concept, we implement a number of complex model refactorings in AGG. Based on our experience, we provide recommendations on how AGG may be improved.

2 Motivating Example

In the field of software engineering, the Unified Modeling Language (UML) is a non-proprietary specification language for object modeling. UML is a general-purpose modeling language that includes a standardized graphical notation used to create an abstract model of a system, referred to as a UML model. UML is officially defined by the Object Management Group (OMG) [Obj05b, Obj05a] by means of the UML metamodel. UML has been designed to specify, visualize, construct, and document software-intensive systems. As an example, Figure 1 shows the metamodel of UML State Machine diagrams. Model refactoring is a special kind of model transformation that aims to improve the structure of the model while preserving (certain aspects of) its behaviour. Like the process of source code refactoring, the process of model refactoring is a complex activity. In [Fol07] we have discussed eight primitive model refactorings for UML Class diagrams and UML State Machine diagrams. This clearly shows that it is possible to formalise the specification and execution of model refactorings using graph transformation rules. Table 1 shows the list of model refactorings that we have discussed and implemented.

UML Class diagram      UML State Machine diagram
Pull Up Operation      Introduce Initial Pseudostate
Push Down Operation    Introduce Region
Extract Class          Remove Region
Generate Subclass      Flatten State Transitions

Table 1: List of model refactorings


Figure 1: UML State Machine Diagram Metamodel

In [Fol07] each model refactoring has been formalised, explained and motivated using a concrete example. A detailed explanation of when it should be used and how it can be realized precedes a discussion of the list of mechanisms to accomplish the refactoring itself. In this article, we shall explore the Introduce Initial Pseudostate model refactoring in more detail in order to explain and illustrate the main concepts. As suggested by the name, it adds an initial pseudostate to a composite state, or region. The Introduce Initial Pseudostate refactoring is used to improve the structure of a State Machine diagram. Figures 2 and 3 show a simple example of this kind of refactoring. An initial pseudostate has been added to the ACTIVE composite state. The target of the transition that initially referred to the Ready state has been redirected to its enclosing region. An automatic transition has been defined between the initial pseudostate and the Ready state. The Ready state has thus become the default initial state of the ACTIVE region; a transition whose target is the ACTIVE state will lead the State Machine to the Ready state. For reasons of simplicity, the representation of UML models used in this article does not consider the actions attached to states, such as do, entry and exit actions.


Figure 2: UML State Machine Diagram - Before refactoring

Figure 3: UML State Machine Diagram - After refactoring

Figure 4 shows the control flow of this model refactoring, which is composed of more primitive refactoring actions. The syntax of UML Interaction Overview diagrams has been used to formally depict the control flow and to specify in which order the actions must be executed. Some custom notations have been added to enrich the diagram with all necessary information. In particular, input and output parameters for each atomic step have been specified. In order to apply the refactoring, it is necessary to provide two input parameters r and s. The parameter r specifies which composite state, or region, will be modified by the refactoring. The parameter s specifies which will be the default state of the region. Before applying the refactoring it is necessary to verify that the composite state does not contain an initial pseudostate; this check will be implemented as a pre-condition. If the pre-condition is respected, the refactoring proceeds by creating the initial pseudostate inside the composite state.


Figure 4: Introduce Initial Pseudostate model refactoring, specified as UML Interaction Overview diagram

Subsequently, the refactoring changes the target of all transitions pointing to the default state. The new target state of those transitions will become the composite state that contains the region r. For technical reasons, a final cleanup phase is needed in order to remove auxiliary elements that have been added during the transformation process.
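The actual rules are the AGG graph transformations presented in the next section; purely to make the intended effect concrete, the following plain-Java sketch performs the same steps on a toy state-machine model. All type and method names below are ours and are not part of AGG or the UML metamodel.

import java.util.ArrayList;
import java.util.List;

/** Toy model: a region contains vertices (states and pseudostates) and knows all transitions. */
class Vertex {
    final String name;
    final boolean isInitialPseudostate;
    Vertex(String name, boolean isInitialPseudostate) {
        this.name = name;
        this.isInitialPseudostate = isInitialPseudostate;
    }
}

class Transition {
    Vertex source, target;
    Transition(Vertex source, Vertex target) { this.source = source; this.target = target; }
}

class Region {
    final Vertex owningCompositeState;   // the composite state that contains this region
    final List<Vertex> vertices = new ArrayList<>();
    final List<Transition> transitions = new ArrayList<>();  // all transitions of the state machine
    Region(Vertex owningCompositeState) { this.owningCompositeState = owningCompositeState; }
}

class IntroduceInitialPseudostate {

    /** Applies the refactoring to region r, making s the default state. */
    static void apply(Region r, Vertex s) {
        // Pre-condition (the NAC of the first rule): no initial pseudostate in the region yet.
        for (Vertex v : r.vertices) {
            if (v.isInitialPseudostate)
                throw new IllegalStateException("region already has an initial pseudostate");
        }
        // Step 1: create the initial pseudostate inside the region.
        Vertex initial = new Vertex("initial", true);
        r.vertices.add(initial);

        // Step 2: redirect transitions that target the default state from outside the region
        //         to the enclosing composite state.
        for (Transition t : r.transitions) {
            if (t.target == s && !r.vertices.contains(t.source)) {
                t.target = r.owningCompositeState;
            }
        }
        // Step 3: add the automatic transition from the initial pseudostate to the default state.
        r.transitions.add(new Transition(initial, s));
        // (The auxiliary "Refactoring" marker node used by the AGG rules is not needed here.)
    }
}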

3 Formal representation as graph transformation

UML models can be represented as a graph-based structure, and the graphs must conform to the corresponding type graph just like the models must conform to their metamodel. A type graph corresponding to the UML metamodel is required to formally represent the UML models as graphs and to formally define the UML model refactoring.1 For the purpose of this article, we have chosen to take into account a subset of the concepts defined by the UML metamodel. In particular, we focus on UML State Machine diagrams only. Figure 5 shows the type graph corresponding to the UML metamodel of Figure 1. This type graph has been created using the AGG graph transformation tool. AGG offers many concepts that are useful to define a type graph very close to the corresponding UML metamodel. AGG allows enrichment of the type graph with a generalisation relation between nodes, and each node type can have one or more direct ancestors (parents) from which it inherits attributes and edges. Moreover, it is also possible to define a node type as an abstract type, thereby prohibiting the creation of instance nodes of this abstract type.

1 For a detailed account on the relation between refactoring and graph transformation, we refer to [Men05].


Figure 5: UML State Machine diagram – Type Graph

The primitive refactoring actions shown in Figure 4 can be implemented by means of graph transformation rules. Transformation rules are expressed mainly by two object structures: the left-hand side (LHS) of the rule specifies a sub-graph to search for, while the right-hand side (RHS) describes the modifications generated by the transformation. The LHS and the RHS of a rule are related by a partial graph morphism. The applicability of a rule can be further restricted by additional negative application conditions (NACs). The LHS or a NAC may contain constants or variables as attribute values, but no Java expressions, in contrast to an RHS. The first step, named Create Initial Pseudostate, is shown in Figure 6. It contains a NAC to ensure that the region does not contain an initial pseudostate. The state s provided as input parameter (node number 3 in the figure) will become the default state of the region. The strName variable used in the rule is not an input parameter but is needed in order to define the name of the initial pseudostate.

Figure 6: Introduce Initial Pseudostate - Create Initial Pseudostate Input Parameters r : Region ⇒ node1; s : State ⇒ node3


The state s provided as input parameter must be part of a composite state, otherwise this kind of refactoring cannot be applied. If the pre-condition is respected, the transformation rule marks the default state with an auxiliary "Refactoring" node in order to recognize it during the execution of the subsequent steps. The second step, named Move Incoming Transition, is shown in Figure 7. It takes into account the transitions which have the default state defined as target (the auxiliary "Refactoring" node is used to identify the default state). The transformation rule replaces the target edge of a transition by one pointing to the composite state. For this transformation rule a NAC has been added in order to ensure that only the transitions that are defined outside the region will be modified. The rule must be repeated as long as possible (i.e., until no further match can be found).

Figure 7: Introduce Initial Pseudostate - Move Incoming Transition

The last step, named Remove Temporary Reference, is shown in Figure 8. It removes the auxiliary "Refactoring" node that was attached to the default state during the execution of the first rule.

Figure 8: Introduce Initial Pseudostate - Remove Temporary Reference

4 Tool support

In this section, we illustrate the possibility of developing model refactoring tools using graph transformations. For this purpose, we have developed a prototype application that serves as a feasibility study. The AGG graph transformation engine [TU 06] is delivered together with an API (Application Programming Interface) that allows the internal graph transformation engine to be integrated into other environments.


We used this API to develop our prototype application. That way, it was possible to define the graph transformation rules presented in Section 3. Figure 9 shows the graphical user interface of the model refactoring application that we developed in Java by making use of the AGG API.

Figure 9: Model Refactoring Application

The application internally loads a file containing the model refactoring specifications and the necessary graph transformation rules. It then allows the user to open files containing the UML models to be refactored, which must respect the type graph. Using the "Refactoring" context menu, the user can apply the different model refactorings. When necessary, the user will be prompted to enter the input parameter values and possibly to supply a match if the model refactoring can be applied to different parts of the UML model. The representation of the control flow explained in Figure 4 has been a crucial point for the implementation of the prototype application. The control flow describes the order in which the individual graph transformation rules of each model refactoring have to be executed. At the moment, the AGG tool does not provide a satisfactory solution for organizing and combining rules, and the supplied mechanisms were not sufficient for describing model refactorings. The prototype application avoids the underlying problem by using a custom control structure that represents the control flow of model refactorings. Based on the UML Interaction Overview diagram syntax, we have represented the control flow as a graph structure, which is used to drive the application of graph transformation rules. Figure 10 presents the type graph that we have defined in AGG in order to represent this control flow. When the prototype application needs to apply a model refactoring, it first loads the corresponding graph representing the control flow. It searches for the starting point and walks through the graph to determine which graph transformation rules have to be applied. It continues exploring the graph until it reaches a final point, and reports the result to the user.
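The AGG API calls used by the prototype are not listed in the paper, so the sketch below abstracts the engine behind a hypothetical RuleEngine interface and only illustrates the graph walk itself; all names are ours, not AGG's API, and the node kinds mirror the start, rule, decision and final elements of the Interaction Overview type graph.

import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical abstraction of the graph transformation engine. */
interface RuleEngine {
    /** Applies the named rule (as long as possible); returns false if a mandatory rule finds no match. */
    boolean applyRule(String ruleName, Map<String, Object> parameters);
}

class FlowNode {
    enum Kind { START, RULE, DECISION, FINAL }
    final Kind kind;
    final String ruleName;                          // for RULE nodes
    final Predicate<Map<String, Object>> condition; // for DECISION nodes
    FlowNode trueNext, falseNext, next;             // outgoing edges of the control flow

    FlowNode(Kind kind, String ruleName, Predicate<Map<String, Object>> condition) {
        this.kind = kind;
        this.ruleName = ruleName;
        this.condition = condition;
    }
}

class RefactoringInterpreter {
    /** Walks the control-flow graph from its start node until a final node is reached. */
    static boolean run(FlowNode start, RuleEngine engine, Map<String, Object> parameters) {
        FlowNode current = start;
        while (current != null && current.kind != FlowNode.Kind.FINAL) {
            switch (current.kind) {
                case START:
                    current = current.next;
                    break;
                case RULE:
                    // In this sketch every rule node is treated as mandatory; a failing
                    // pre-condition rule aborts the refactoring.
                    if (!engine.applyRule(current.ruleName, parameters)) {
                        return false;
                    }
                    current = current.next;
                    break;
                case DECISION:
                    current = current.condition.test(parameters) ? current.trueNext : current.falseNext;
                    break;
                default:
                    return false;
            }
        }
        return current != null;   // reached a final node: report success to the user
    }
}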


Figure 10: Interaction Overview Diagram – Type Graph

Figure 11 shows the control flow we have implemented for the Introduce Initial Pseudostate refactoring. It corresponds to the UML Interaction Overview diagram reported in Figure 4.

Figure 11: Control Flow – Introduce Initial Pseudostate

The prototype application has been enriched with an interpreter in order to evaluate the expressions of decision points. This makes the implementation of complex transformations possible.

5 Discussion

For the purpose of model refactoring, an important advantage of graph transformation is that rules may yield a concise visual representation of complex transformations. Unfortunately, as identified by [HJV06], the current state-of-the-art in graph transformation does not suffice to easily define model refactorings, and their expressive power must be increased. Two mechanisms have been proposed so far: one for cloning, and one for expanding nodes by graphs.

Concerning the AGG tool in particular, there are some limitations. The AGG tool does not allow concepts like Aggregation and Composition, used by the UML metamodel, to be represented. Therefore, the type graph has been simplified by using the more generic concept of association. Moreover, AGG does not have the notion of an Enumeration type. The property kind of the Pseudostate element has been represented using a String value. The AGG tool does not allow optional patterns to be specified inside graph transformation rules. Therefore, it is necessary to create similar graph transformation rules that take into account the different optional patterns. For example, a "Guard" node may or may not be associated with a transition. In order to match transitions with an associated "Guard" and transitions without a "Guard", two different graph transformation rules have to be created. The AGG tool does not provide a satisfactory control structure for organizing and combining rules, and the supplied mechanisms for composing rules were not sufficient to describe model refactorings. In order to reach our goal we have represented the execution order of rules by means of graphs that are used to drive the control flow of model refactorings. That way, we have added the notion of "controlled" graph transformation, which was not previously available in AGG. As an alternative to AGG, we have also explored the MOFLON meta-modeling framework [Rea06] in order to give better advice about the possibility of implementing model refactorings using graph transformations. The MOFLON tool allows a better representation of UML models, due to the similarities between the UML metamodel and the MOFLON concepts. Moreover, the MOFLON tool supports a number of advanced transformation concepts and additional forms for structuring rule sets.

6 Conclusion and Related work

[MT04] provided a detailed survey of research on software refactoring, and suggested model refactoring as one of the future challenges. In this article, we have shown how the formalism of graph transformation can be used as an underlying foundation for the specification of model refactorings. We have developed a prototype application in order to verify the usability of graph transformations for the purpose of model refactoring. The prototype application shows that it is possible to develop model refactoring tools this way. However, it is necessary to improve the graph transformation notation in order to better support the specification of model refactorings. Future work should formally explore the characteristics of model refactoring, paying more attention to the preservation of behaviour. Model refactoring is a rather recent research issue and such definitions of behaviour preservation properties have not yet been completely given. There are some proposals about behaviour preservation but, in the context of the UML, such definitions do not exist because there is no consensus on a formal definition of behaviour. A UML model is composed of different diagrams that address different aspects of a software system. The application of model refactorings may generate inconsistencies between these UML diagrams.

Future work should explore the possibility of preserving the consistency among different kinds of UML models after the application of model refactorings, expressing inconsistency detections and their resolutions as graph transformation rules. Mens, Van Der Straeten and D'Hondt [MVD06] propose to express inconsistency detection and resolution as graph transformation rules, and to apply the theory of critical pair analysis to analyse potential dependencies between the detection and resolution of model inconsistencies. The Object Constraint Language (OCL) is a formal language used to impose additional well-formedness constraints on UML models. These constraints typically specify invariants that must hold for the system being modeled. OCL expressions need to be taken into account when specifying model refactorings in AGG. This can be achieved by expressing them using so-called "graph constraints" in AGG. Other researchers are exploring the possibility to implement the automatic conversion of UML models to graphs and vice versa. One approach in particular, by Laurent Scolas, presents a possible way to convert UML State Machine diagrams to graphs [Sco07]. Future work should integrate model refactoring support in standard UML modeling environments. For example, the Tiger EMF transformation framework, developed in Berlin by the AGG team, is a tool environment that allows an editor plugin to be generated based on the Eclipse Modeling Framework (EMF) and the Graphical Editing Framework (GEF), using AGG as the underlying engine [BEK+06].

Bibliography

[BEK+06] E. Biermann, K. Ehrig, C. Köhler, G. Kuhns, G. Taentzer, E. Weiss. Graphical Definition of In-Place Transformations in the Eclipse Modeling Framework. In Proc. Int'l Conf. MoDELS 2006. Lecture Notes in Computer Science 4199, pp. 425–439. Springer-Verlag, 2006.

[EM93] H. Ehrig, M. Löwe. Parallel and distributed derivations in the single-pushout approach. Theoretical Computer Science 109:123–143, 1993.

[Fol07] A. Folli. UML Model Refactoring using Graph Transformation. Master's thesis, Université de Mons-Hainaut, 2007.

[Fow99] M. Fowler. Refactoring: Improving the Design of Existing Code. Addison-Wesley, 1999.

[HJV06] B. Hoffmann, D. Janssens, N. Van Eetvelde. Cloning and Expanding Graph Transformation Rules for Refactoring. Proc. Int'l Workshop on Graph and Model Transformation (GraMoT 2005) 152:53–67, 2006.

[Men05] T. Mens. On the use of graph transformations for model refactoring. In R. Lämmel (ed.), Generative and transformational techniques in software engineering. Pp. 67–98. Departamento de Informática, Universidade do Minho, 2005.

[Men06] T. Mens. On the Use of Graph Transformations for Model Refactoring. In GTTSE. Pp. 219–257. 2006.


[MT04] T. Mens, T. Tourwé. A Survey of Software Refactoring. IEEE Transactions on Software Engineering 30(2):126–162, February 2004.

[MTR07] T. Mens, G. Taentzer, O. Runge. Analysing refactoring dependencies using graph transformation. Software and Systems Modeling (SoSyM), 2007.

[MVD06] T. Mens, R. Van Der Straeten, M. D'Hondt. Detecting and Resolving Model Inconsistencies Using Transformation Dependency Analysis. In Proc. Int'l Conf. MoDELS 2006. Lecture Notes in Computer Science 4199, pp. 200–214. Springer-Verlag, October 2006.

[Obj05a] Object Management Group. Unified Modeling Language: Infrastructure version 2.0. formal/2005-07-05, August 2005. www.omg.org

[Obj05b] Object Management Group. Unified Modeling Language: Superstructure version 2.0. formal/2005-07-04, August 2005. www.omg.org

[Opd92] W. F. Opdyke. Refactoring: A Program Restructuring Aid in Designing Object-Oriented Application Frameworks. PhD thesis, University of Illinois at Urbana-Champaign, 1992.

[Rea06] Real-Time Systems Lab, Darmstadt University of Technology. MOFLON, version 1.0.0. December 15 2006. www.moflon.org

[Sco07] L. Scolas. Conversion de modèles UML en format de graphes. Master's thesis, Université de Mons-Hainaut, 2007.

[SK03] S. Sendall, W. Kozaczynski. Model transformation: The heart and soul of model-driven software development. IEEE Software, pp. 42–45, 2003.

[Tae99] G. Taentzer. AGG: A tool environment for algebraic graph transformation. In Applications of Graph Transformations with Industrial Relevance. Lecture Notes in Computer Science 1779, pp. 481–488. Springer-Verlag, 1999.

[TU 06] TU Berlin. The Attributed Graph Grammar System, version 1.5.0. 2006. tfs.cs.tu-berlin.de/agg/index.html


Lightweight Visualisations of COBOL Code for Supporting Migration to SOA

Joris Van Geet1 and Serge Demeyer2

[email protected], [email protected] http://www.lore.ua.ac.be/ University of Antwerp, Belgium

Abstract: In this age of complex business landscapes, many enterprises turn to Service Oriented Architecture (SOA) for aligning their IT portfolio with their business. Because of the enormous business risk involved with replacing an enterprise's IT landscape, a stepwise migration to SOA is required. As a first step, they need to understand and assess the current structure of their legacy systems. Based on existing reverse engineering techniques, we provide visualisations to support this process for COBOL systems and present preliminary results of an ongoing industrial case study.

Keywords: SOA, migration, legacy, reverse engineering, visualisation, COBOL

1 Introduction

Mismatches between business and IT pose a threat to the agility with which enterprises can adapt to changing requirements. In an attempt to ensure their competitive advantage, some enterprises are turning to Service Oriented Architectures as a means to align their IT portfolio with their business. Because of the enormous business risk involved with replacing an enterprise's IT landscape, a stepwise migration to SOA is required. The Service Oriented Migration and Reuse Technique (SMART) [LMS06] is an existing methodology for defining a migration process consisting of five distinct phases. First you should (1) establish the stakeholder context, (2) describe the existing capabilities, and (3) describe the target SOA state in an iterative and incremental manner. Based on these results you can (4) analyse the gap between the existing capabilities and the target state, after which you can (5) develop a migration strategy. All these phases can, and should, be supported by reverse engineering techniques and tools. As a first step, enterprises need to assess the current structure of their legacy system(s). Such a First Contact [DDN02] typically aims at building an overall mental model of the system at a high level of abstraction. During this process exceptional entities become apparent, which can be studied afterwards. As building a mental model of large legacy systems is not a trivial task, this process is usually supported by (visualisation) tools [SFM97]. Studies show that COBOL mainframes process more than 83% of all transactions worldwide and over 95% of finance and insurance data [AAB+00]. Needless to say, maintaining these systems is of vital importance, which is why we focus our efforts on COBOL. Unfortunately, there is not one standard COBOL language: many variants and dialects have manifested themselves over the years.

1 / 12 Volume X (2007) there is not one standard COBOL language: many variants and dialects have manifested them- selves over the years. To cope with this plethora of COBOL variations, lightweight parsing techniques are preferred. In this position paper we present such a lightweight technique for visualising functional depen- dencies and data dependencies between COBOL programs. After a short introduction to services and what it could (should?) mean for reengineering legacy systems (Section 2) we present our data model on COBOL, define structural properties of this data and explain their usefulness in the context of migration to SOA (Section 3). Then we define views on this structure and explain how they should be interpreted (Section 4). We provide an initial experience report of an ongo- ing industrial case study in Section 5 and suggest improvements to our approach in Section 6. Related work is discussed in Section 7 after which we conclude.

2 The Service Concept

As a result of the increasing interest in SOA, many interpretations have been given to the concept of a service. These range from an enabler of a capability accessible through a prescribed interface and consistent with predefined constraints and policies [OAS06] to technical, network-centric implementations typically based on web services1. In fact, the term services science has been coined, as the concept reaches far beyond software engineering or even computer science [SR06].

In an enterprise context, however, a service is best described as a way to specify encapsulated business functionality independent from concrete implementations. In this context, a service is more of a business concept than an IT concept. This means that we cannot simply identify services from the source code of a legacy system, because a thorough understanding of the organisation and the domain is required. We can, however, make the functional dependencies within the legacy system explicit and highlight the exceptional entities. This information can generate discussion with the domain experts and enterprise architects to find out which of these entities pose a threat to migrating to SOA. In this context a service would, ideally, be implemented as a loosely coupled component [KBS04]. Indeed, the effort required to extract a 'service' can reasonably be expected to increase with the number of dependencies originating from or targeting the associated source code.

3 Characterising Source Dependencies

3.1 COBOL Artefacts

As a means to characterise functional and data dependencies in COBOL source code, the following artefacts are of interest.

• A program is a functional block of COBOL code uniquely identified by a Program-ID. Programs are the basic building blocks of a COBOL system.
• A copybook is a reusable fragment of COBOL code, contained within one source file, that usually consists of a data declaration to be shared by different programs.

1 http://www.w3.org/2002/ws/


• A CALL statement is responsible for invoking a program from within another program using its unique Program-ID. The thread of execution is then passed on to the called program until the execution ends and control returns to the calling program.
• A COPY statement is responsible for copying the contents of a copybook directly into the COBOL source. This construct enables code-level reuse of data declarations.
• A missing program or copybook is a program or copybook that is referenced from within the system under study but is not part of the system itself. They have been identified as IDs used in CALL and COPY statements which we could not map to available source files.

Currently, we extract this information from the COBOL sources with a PERL script using simple regular expression functionality as implemented in the standard UNIX tool GREP. This provides us with the necessary robustness for parsing COBOL. Note that there are typically two usage modes of COBOL programs on a mainframe, namely online and in batch. In online mode programs interact via COBOL calls; in batch these programs can also be invoked from mainframe scripts, usually JCL2. We have not taken the JCL artefacts into account, thus we are missing some, perhaps important, information.
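The paper describes this extraction only informally (a PERL script with GREP-style regular expressions). The following Python sketch merely illustrates that idea; the file extensions, naming conventions and regular expressions are assumptions, not details taken from the paper.

```python
import re
from collections import defaultdict
from pathlib import Path

# Hypothetical patterns for static CALLs, COPYs and Program-IDs; real COBOL
# dialects need more cases (dynamic calls, REPLACING clauses, continuation
# lines, ...), which is exactly why the authors favour lightweight matching.
CALL_RE = re.compile(r"\bCALL\s+'([A-Z0-9-]+)'")
COPY_RE = re.compile(r"\bCOPY\s+([A-Z0-9-]+)")
PROG_ID_RE = re.compile(r"\bPROGRAM-ID\s*\.\s*([A-Z0-9-]+)")

def extract_facts(source_dir):
    """Collect programs, copybooks and the CALL/COPY references per source unit."""
    calls, copies = defaultdict(list), defaultdict(list)
    programs, copybooks = set(), set()
    for path in Path(source_dir).rglob("*"):
        if path.suffix.lower() not in (".cbl", ".cob", ".cpy"):
            continue
        text = path.read_text(errors="ignore").upper()
        unit = path.stem.upper()
        if path.suffix.lower() == ".cpy":
            copybooks.add(unit)
        programs.update(PROG_ID_RE.findall(text))
        calls[unit].extend(CALL_RE.findall(text))
        copies[unit].extend(COPY_RE.findall(text))
    return programs, copybooks, calls, copies

# References that cannot be mapped to an available source file correspond to
# the 'missing programs or copybooks' defined above:
# missing = {t for ts in calls.values() for t in ts} - programs
```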

3.2 Structural Properties

Using the same PERL script, we extract the following dependencies from these source code artefacts.

• A functional dependency is a relationship between two programs implemented by a CALL statement in the first program referencing the second program. This dependency has a weight associated with it, equalling the number of CALL statements from the first to the second program.
• A data dependency is a relationship between a program and a copybook implemented by a COPY statement in the program referencing the copybook. The same dependency can occur between two copybooks.

These functional dependencies are interesting because they provide a measure for the ease of separating programs: a group of programs sharing a lot of functional dependencies will be harder to separate than a group of programs sharing little or no dependencies. Furthermore, strongly interdependent programs might be working together to implement a certain functionality. In the same way, data dependencies might reveal groups of programs working on the same data.

Besides these dependencies, we also define metrics on a COBOL program.

• The number of incoming calls (NIC) measures the total number of CALL statements referencing that program.
• The number of outgoing calls (NOC) measures the total number of CALL statements within that program to any other program.

We study these metrics because they are an indicator for the types of usage of COBOL programs3. Programs with a high NIC can be seen as suppliers, as they supply a functionality that is popular among other programs. As a special case of suppliers, programs with a high NIC and a NOC of zero can be seen as libraries, as they are likely to supply core functionalities since they do not rely on other programs. Programs with a high NOC, on the other hand, can be seen as clients, as they use a lot of functionality from other programs. Putting these concepts in a service-oriented mindset, suppliers (and libraries) might provide good starting points for services, whereas clients will most likely be service consumers.
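As a rough illustration of how NIC and NOC can be derived from the extracted call relation, the sketch below (continuing the hypothetical extraction sketch above) counts incoming and outgoing CALL references; the classification threshold is an arbitrary assumption, not a value from the paper.

```python
from collections import Counter

def nic_noc(calls):
    """NIC/NOC per program from a mapping {program: [called program IDs]}."""
    noc = Counter({prog: len(targets) for prog, targets in calls.items()})
    nic = Counter(target for targets in calls.values() for target in targets)
    return nic, noc

def classify(nic, noc, threshold=10):
    """Label programs as supplier, library, client or regular (illustrative cut-off)."""
    roles = {}
    for prog in set(nic) | set(noc):
        if nic[prog] >= threshold and noc[prog] == 0:
            roles[prog] = "library"   # popular, but relies on no other program
        elif nic[prog] >= threshold:
            roles[prog] = "supplier"  # popular functionality
        elif noc[prog] >= threshold:
            roles[prog] = "client"    # uses a lot of other programs
        else:
            roles[prog] = "regular"
    return roles
```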

2 JCL stands for Job Control Language.
3 The terminology of suppliers, clients and libraries has been adopted from [MM06].


4 Views

We use a visual representation of this information as it provides the ability to comprehend large amounts of data [War00]. More specifically, directed graphs have been shown to be natural representations of software systems. Enriching these entities with metrics information provides valuable visual clues for early structure comprehension [LD03]. To obtain such visualisations we export the structural information to GDF (Guess Data Format), after which it can be visualised in GUESS [Ada06], a flexible graph exploration tool. This environment assists a user in exploring graphs by providing capabilities such as applying graph layouts, highlighting, zooming, moving and filtering. In what follows we define two views based on the structural relations defined in Subsection 3.2, namely the Functional Dependency View and the Data Dependency View. For each view we identified some visual indicators of structural properties and grouped them according to a specific task. We motivate the task, present visual indicators and interpret them in light of the specific task.
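To make the export step concrete, a minimal sketch follows. It assumes the NIC/NOC counters and call relation from the sketches in Section 3, and the exact GDF column set is an assumption (GUESS accepts several nodedef/edgedef variants), so treat it as illustrative rather than the authors' actual script.

```python
from collections import Counter

def write_gdf(path, nic, noc, calls, missing):
    """Emit a GDF file: node height/width encode NIC/NOC, missing programs are black."""
    with open(path, "w") as gdf:
        gdf.write("nodedef>name VARCHAR,width DOUBLE,height DOUBLE,color VARCHAR\n")
        for prog in set(nic) | set(noc) | set(missing):
            colour = "black" if prog in missing else "white"
            width = 1 + noc.get(prog, 0)   # wide nodes: many outgoing calls (clients)
            height = 1 + nic.get(prog, 0)  # tall nodes: many incoming calls (suppliers)
            gdf.write(f"{prog},{width},{height},{colour}\n")
        gdf.write("edgedef>node1 VARCHAR,node2 VARCHAR,directed BOOLEAN,weight DOUBLE\n")
        for caller, targets in calls.items():
            for callee, weight in Counter(targets).items():
                gdf.write(f"{caller},{callee},true,{weight}\n")
```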

4.1 Functional Dependency View

The building blocks of this view are COBOL programs and the functional dependencies between them. To distinguish programs from missing programs we use white rectangles and black rectangles respectively. A functional dependency between two programs is implemented as a directed edge between two nodes. Furthermore, the height and width of the nodes are relative to the NIC and NOC metrics of the corresponding program respectively. We use the Graph EMbedder (GEM) algorithm [FLM94] as a layout algorithm because it tries to minimise the sum of the edge lengths in order to provide an easy to interpret layout. This layout positions highly connected nodes closer together, thereby providing a visual indication of possible clusters and groups.

4.1.1 Overall Design

Intent — Get a feel for the overall design of the system.
Motivation — Functional dependencies reveal first hand information on the complexity of the design. Identifying strongly interdependent groups of programs can reveal opportunities or risks early in the project.
Visual Indicators (Figure 1) — Isolated nodes (a) have no edges, whereas a monolith (d) has an abundance of edges between all nodes. Disconnected clusters (b) are isolated groups of interconnected nodes, whereas with connected clusters (c) the groups are connected by few edges.


Figure 1: Overall Design Indicators. (a) Isolated nodes; (b) Disconnected clusters; (c) Connected clusters; (d) Monolith

Interpretation — Assuming that functional dependencies are the only way to access a program, isolated nodes and small disconnected clusters would reveal dead code. On the other hand, if there are other functional dependencies which are not part of our model, these nodes might reveal clearly separated functional blocks. Connected but loosely coupled clusters of nodes indicate functional blocks that are easy to separate.

4.1.2 Exceptional Entities

Intent — Locate clients, suppliers and libraries.
Motivation — Easily detecting exceptional entities helps in quickly gaining focus in the usual abundance of information.
Visual Indicators (Figure 2) — Clients are visually represented as very wide nodes (a), usually with many outgoing edges, whereas suppliers can easily be identified as very tall nodes (b), usually with many incoming edges. A combination of both results in a tall and wide node (c). Libraries are a special form of supplier as they have no outgoing edges (d).

Figure 2: Exceptional Entities. (a) Client; (b) Supplier; (c) Client & Supplier; (d) Library

Interpretation — Suppliers and libraries, both characterised by a high NIC, indicate a popular functionality in the system. In the best case it could be an interface into an important functionality, although it might as well be a smell indicating bad design. Clients (characterised by a high NOC) indicate more high-level functionalities. Typically, (graphical) user interfaces exhibit such characteristics. A combination of both client and supplier possibly indicates an interface into lower level functionality (thus, making it a supplier) but delegating the real work to the core programs (thus, making it a client). A library is a lowest level functionality as it does not delegate anything, thus indicating a core functionality of the system.

4.2 Data Dependency View

The building blocks of this view are COBOL programs, (missing) COBOL copybooks and the data dependencies between them. To distinguish programs from copybooks we use the rectangular and circular shape respectively. The distinction between copybooks and missing copybooks is implemented using white and black nodes respectively. A data dependency between a program and a copybook or between two copybooks is implemented as a directed edge between two nodes. Note that we do not show missing programs in this view, as they will never contain any data dependencies. As with the Functional Dependency View, we also use the GEM layout for positioning the nodes. Also note that the colour and the dimensions of a program node are intentionally kept in accordance with the Functional Dependency View to easily identify the same programs in the two views.

4.2.1 Data Usage

Intent — Get a feel for the data usage between different programs.
Motivation — Especially in data driven environments, data dependencies can provide an entry point into the organisational structure of a company. Quickly identifying common data dependencies or facades clearly separating data usage can provide necessary focus in the remainder of the project.
Visual Indicators (Figure 3) — A common data concept is visually represented as a group of programs directly connected to the same copybook (a). A data facade is represented as a program connected to a set of copybooks which are only connected to that program. If the copybooks are mainly missing copybooks we call it an external data facade (c), otherwise it is an internal data facade (b).

Figure 3: Data Usage. (a) Common Data; (b) Internal Data Facade; (c) External Data Facade

Interpretation — As opposed to grouping programs by shared functional dependencies (Subsubsection 4.1.1), we can also group them according to shared data dependencies. Programs sharing data are likely to work on a similar concept, especially when the data represents a business concept. On the other hand, programs hogging data for themselves are likely to encapsulate certain data and the corresponding functionality, making them easier to separate.

5 Experience Report

This section describes some preliminary results we obtained during an ongoing case study at a Belgian insurance company. The system under study is a document generation and management system consisting of 200k lines of COBOL code with embedded SQL (including whitespace and comments), divided over 401 COBOL programs and 336 COBOL copybooks, and 38k lines of JCL (from which we did not extract any information). Extraction of the COBOL artefacts and creation of the data model, as described in Section 3, was completed in less than five seconds, indicating that the approach is indeed very lightweight and likely to scale well on even bigger systems.

Figure 4: Functional Dependency View; the height and width of the nodes represent NIC and NOC respectively.

Figure 4 depicts the Functional Dependency View of the system. The first thing you notice is the typical monolithic nature of the legacy system. Almost everything seems to be interconnected with no apparent order or hierarchy. There are some isolated nodes on the bottom and the right (A) that would seem to constitute dead code, as they have no functional dependencies. Although closer investigation did reveal some empty files, not all nodes are by definition dead code, as they can be called directly from JCL scripts running on mainframe. The small disconnected clusters on the right show a distinct pattern (B): one parent node functionally depending on three child nodes. Closer investigation revealed that each cluster is responsible for accessing and modifying a specific part of the documents database. Each group of three child nodes is responsible for respectively inserting, deleting and updating corresponding records.

When looking for exceptional entities, the most pertinent supplier is the tallest white node (S1). Closer investigation revealed that this program is indeed a supplier, as it provides the interfacing program into all the disconnected clusters on the right. It collects all user and application requests for adjusting the documents database and forwards those requests to the correct cluster. Although one would expect a functional dependency between this supplier and the clusters, this is not the case because processing the requests is performed asynchronously (in batch). Therefore, the functional dependency is not visible in the COBOL files, but rather in JCL scripts running on mainframe. Other suppliers include a program for visualising errors (S2) and an interface into other databases (S3). Besides the white suppliers there are also missing programs (black nodes) that are classified as suppliers. The tallest of them (S4), for example, provides an interface into core mainframe functionality. Many other tall black nodes (e.g., S5, S6) are core programs of the organisation.

Besides the very pertinent suppliers, Figure 4 also contains two clients that stand out. The first one (C1), connected to several missing programs, is responsible for automatically filling in documents with information from different databases outside the system under study. This program is classified as a client as it seems to be extensively using data services of other systems. Another client (C2) is responsible for checking the correctness of documents. Although it delegates its functionalities to three child nodes, the main contribution for classifying it as a client are the numerous calls to a module that visualises error messages (S2).

One last node that really stands out is the big white rectangle (D), classified as both a client and a supplier. It is the interface into the core of the system under study. It is a supplier for all the end programs that feed the (graphical) user interface, which makes it tall, and it is wide because it delegates a lot of its responsibilities to lower level suppliers.

Figure 5: Data Dependency View.

Figure 5 depicts the Data Dependency View of the system. Besides the monolithic structure on the left, there is one group of nodes that is clearly separated from the rest. This cluster reveals one data concept (E) used by several programs. Closer investigation revealed that the programs in this cluster are the same programs that are responsible for managing the documents database (the disconnected clusters in Figure 4). The central copybook has the capability of uniquely defining one row within this entire documents database. So this database management responsibility is not only functionally clearly separated from the rest of the system, but also with regard to the data dependencies. When we consulted the domain experts, they supported this observation and explained that this is actually a subsystem used by other systems as well. For historical reasons it is part of this bigger legacy system, but it was designed to be used as a separate system.

Client C3 has many data dependencies with copybooks that have no dependencies with other programs; it therefore acts as an internal data facade. Closer investigation revealed that C3 is the only program responsible for (re)creating and correctly formatting all types of documents. Client program C1, on the other hand, has a lot of data dependencies with missing copybooks (the black nodes above). This is the program that retrieves information from other systems to automatically fill in documents and was characterised by a lot of functional dependencies with missing programs in Figure 4. This link between functional and data dependencies over the two views is not surprising, since data definitions are necessary to communicate with these programs. When looking at group1 of copybooks, C1 acts like a data facade for the external data. The copybooks from group2, on the other hand, are also used by other programs, thereby apparently violating the facade property. Closer investigation revealed that the shared copybooks (group2) are necessary for creating the terminal screens, indicating that C1 not only implements business logic on top of external data, but also its GUI functionality.

6 Room for Improvement

Both views suffer from unnecessary cluttering. We will try to eliminate this by identifying and removing omnipresent programs [MM06]. Furthermore, some of the programs classified as missing programs result from unresolved dynamic calls rather than really being programs outside the system scope. We will try to resolve them if possible (using static techniques), otherwise we will remove them from the model.

After uncluttering the views, we believe more meaningful visual patterns will become apparent. Also, the synergy between the two views (e.g., the link between depending on missing programs and using missing copybooks) is something we would like to make more explicit, maybe by using a clearly focussed combined view. Furthermore, our model is far from complete, therefore we would like to take other source code artefacts into account as well, for example the functional dependencies resulting from the JCL scripts or the data dependencies implemented by global data accesses.

Finally, we would like to merge our scripts with FETCH [DV07], an open-source fact extraction tool chain currently geared towards C(++) but easily allowing extension via its pipe and filter architecture, as we can then take advantage of more advanced querying mechanisms and a collection of views already available in FETCH.

7 Related Work

We based our views on the ideas of Lanza's polymetric views [LD03], aiming at both a lightweight and a visual approach. Although the views as proposed by Lanza are independent from the implementation language, they are mainly targeted at the object oriented paradigm. We apply this polymetric view concept to COBOL and target our views at analysing the reuse potential in a service oriented context.

Nyáry et al. [NPHK05] support the maintenance of legacy COBOL applications by setting up a COBOL repository. While they use much the same artefacts and dependencies as we do, they go into more COBOL detail (including file references and data fields). Their tool focuses on the repository with simultaneous code browsing, resulting in low level abstractions, whereas we aim to create more high level abstractions and uncover visual patterns.

Another tool capable of reverse engineering COBOL code is RIGI [MTW93, WTMS95]. It also presents graph based visualisations of the system. But while it aims at more general program comprehension activities, our approach is specifically targeted at finding opportunities and risks for migrating to SOA. Furthermore, RIGI has a finer-grained approach to extracting COBOL artefacts, making our approach more lightweight.

O'Brien et al. [OSL05] support migration to services using software architecture reconstruction techniques. While they also visualise functional and data dependencies (using their tool ARMIN), they perform these analyses mainly on object-oriented systems and not on legacy COBOL systems.

8 Conclusion

While enterprises are turning to Service Oriented Architectures for aligning their IT portfolio with their business, migrating the current systems to such an architecture is not trivial. Therefore a stepwise migration is necessary. This research constitutes the initial steps for investigating COBOL systems in the early stage of such a migration project. In preparation for talking to domain architects, it is important for a reengineer to quickly gain an understanding of the COBOL source code structure. The explorative views presented in this paper are a first contribution towards this goal, as they aim at identifying areas obstructing or facilitating migration to SOA. We have already described some noticeable phenomena and used them to make concrete observations. In the future we will conduct more empirical studies to evaluate this approach. The logical next step in the migration process would be to inject domain and organisational knowledge into the views. This way we can see to what degree the structure of the existing systems corresponds to the target SOA state as proposed by the domain experts.

Acknowledgements: This work has been carried out in the context of the ‘Migration to Service Ori- ented Architectures’ project sponsored by AXA Belgium NV and KBC Group NV.


References

[AAB+00] E. Arranga, I. Archbell, J. Bradley, P. Coker, R. Langer, C. Townsend, M. Wheatley. In Cobol's Defense. IEEE Software 17(2):70–72, 75, 2000. doi:10.1109/MS.2000.10014

[Ada06] E. Adar. GUESS: a language and interface for graph exploration. In CHI ’06: Proceedings of the SIGCHI conference on Human Factors in computing systems. Pp. 791–800. ACM Press, New York, NY, USA, 2006. doi:10.1145/1124772.1124889

[DDN02] S. Demeyer, S. Ducasse, O. Nierstrasz. Object Oriented Reengineering Patterns. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2002.

[DV07] B. Du Bois, B. Van Rompaey. Supporting Reengineering Scenarios with FETCH: an Experience Report. In Third International ERCIM Symposium on Software Evolution. October 2007. To appear.

[FLM94] A. Frick, A. Ludwig, H. Mehldau. A Fast Adaptive Layout Algorithm for Undirected Graphs. In GD '94: Proceedings of the DIMACS International Workshop on Graph Drawing. Pp. 388–403. Springer-Verlag, London, UK, 1994.

[KBS04] D. Krafzig, K. Banke, D. Slama. Enterprise SOA: Service-Oriented Architecture Best Practices (The Coad Series). Prentice Hall PTR, Upper Saddle River, NJ, USA, 2004.

[LD03] M. Lanza, S. Ducasse. Polymetric Views - A Lightweight Visual Approach to Reverse Engineering. IEEE Trans. Softw. Eng. 29(9):782–795, 2003. doi:10.1109/TSE.2003.1232284

[LMS06] G. Lewis, E. Morris, D. Smith. Analyzing the Reuse Potential of Migrating Legacy Components to a Service-Oriented Architecture. In CSMR '06: Proceedings of the Conference on Software Maintenance and Reengineering. Pp. 15–23. IEEE Computer Society, Washington, DC, USA, 2006. doi:10.1109/CSMR.2006.9

[MM06] B. S. Mitchell, S. Mancoridis. On the Automatic Modularization of Software Systems Using the Bunch Tool. IEEE Trans. Softw. Eng. 32(3):193–208, 2006. doi:10.1109/TSE.2006.31

[MTW93] H. A. Müller, S. R. Tilley, K. Wong. Understanding software systems using reverse engineering technology perspectives from the Rigi project. In CASCON '93: Proceedings of the 1993 conference of the Centre for Advanced Studies on Collaborative research. Pp. 217–226. IBM Press, 1993.

[NPHK05] E. Nyáry, G. Pap, M. Herczegh, Z. Kolonits. Supporting the Maintenance of legacy COBOL Applications with Tools for Repository Management and Viewing. In ICSM (Industrial and Tool Volume). Pp. 5–10. 2005.

[OAS06] OASIS Consortium. Reference Model for Service Oriented Architecture 1.0. July 2006. http://www.oasis-open.org/committees/download.php/16587/wd-soa-rm-cd1ED.pdf

[OSL05] L. O'Brien, D. Smith, G. Lewis. Supporting Migration to Services using Software Architecture Reconstruction. In Proceedings of STEP 2005. Pp. 81–91. 2005. doi:10.1109/STEP.2005.29

[SFM97] M.-A. D. Storey, F. D. Fracchia, H. A. Müller. Cognitive Design Elements to Support the Construction of a Mental Model during Software Visualization. In WPC '97: Proceedings of the 5th International Workshop on Program Comprehension. P. 17. IEEE Computer Society, Washington, DC, USA, 1997. doi:10.1109/WPC.1997.601257

[SR06] J. Spohrer, D. Riecken. Services science. Commun. ACM 49(7), July 2006.

[War00] C. Ware. Information visualization: perception for design. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2000.

[WTMS95] K. Wong, S. R. Tilley, H. A. Müller, M.-A. D. Storey. Structural Redocumentation: A Case Study. IEEE Software 12(1):46–54, 1995. doi:10.1109/52.363166


Package Evolvability and its Relationship with Refactoring

A. Mubarak1, S. Counsell1, R.M. Hierons1 and Y. Hassoun2

1 {asma.mubarak, steve.counsell, rob.hierons} @brunel.ac.uk Department of Information Systems and Computing Brunel University, Uxbridge, Middlesex, UK

2 [email protected] Department of Computer Science, King’s College, London, UK

Abstract In this paper, we address a set of research questions investigating trends in changes to an open-source system (OSS). An interesting 'peak and trough' trend was found to exist in the system studied, suggesting that developer activity comprises a set of high and low periods. Trends in overall changes applied to the system were complemented with empirical evidence in refactoring data for the same system; this showed a similar peak and trough effect but at different versions of the same system. This result suggests a contrasting motivation between regular maintenance practice and that of refactoring. Our analysis of high-level package trends informed some interesting cross-comparisons with refactoring practice, and some insights into why refactoring might be applied after a burst of regular change activity, rather than consistently.

1. Introduction
A software system is modified and developed many times throughout its lifetime to maintain its effectiveness. In general, it grows and changes to support increases in information technology demands. While, from a research perspective, we know a reasonable amount about facets of Object-Oriented (OO) and procedural system evolution (Belady and Lehman (1976), Bieman et al. (2003), Arisholm et al. (2006)), relatively less well understood is whether changes at the package level exhibit any specific trends. The benefits of such a study are clear. Understanding changes at higher levels of abstraction may give a project manager a much more broad-brush idea of likely future maintenance or refactoring opportunities. In particular, such a study may also be able to focus developer effort in specific areas of packages susceptible to large numbers of changes. A further area of interest to OO researchers and practitioners is the link between maintenance as part of every system's evolution and that dedicated to refactoring (Fowler (1999), Mens and Tourwe (2004)).
In this paper, we investigate the trends in versions of the 'Velocity' Open-Source System (OSS), with respect to added classes, methods, attributes and lines of code. To support our analysis, we also looked at empirical refactoring data for the same system and associated trends for two other Java OSSs. We suggest that if the set of regular (i.e., essential) maintenance changes exhibits specific characteristics, then a set of specific refactorings will also exhibit similar features. Results showed an interesting discrepancy between trends in those regular changes made to the system studied and those made as part of a specific set of changes according to refactorings specified in Fowler (1999).

2. Related Work
From an empirical stance, the relationship between OO classes and packages is not well defined. Ducasse et al. (2005) suggest that it is necessary, for the re-engineering and development of OO systems, to recognize and investigate both sets of classes and packages. Focusing on the latter provides a means of comparison with previous work focusing on class changes only. The same authors also suggest that packages have varying functions: they may include utility classes or they may include some fundamental sub-classes enlarging a framework. Ducasse et al. (2004) suggest that the cost of modifying a program may be influenced by the relationship between packages and their enclosed classes. To support the developer in achieving a mental image of an OO system and understanding its packages, they introduce a top-down engineering method based on visualization. Consequently, they raise the abstraction level by detecting packages rather than classes; classifying packages at a high level prevents developers from being flooded with information.
Bieman et al. (2003) found a relationship between design structures and development and maintenance changes. The same study tried to examine whether potential changes to a class could be predicted by the architectural design context of a class. They found that in four of five case studies, classes that play a role in design patterns (Gamma et al. (1995)) were modified more frequently than other classes. In terms of the architecture of a system, Bieman et al. (2001) found that classes belonging to a design pattern were the most change-prone classes in a system (this might also suggest that change-prone classes are implemented by design patterns). Finally, Demeyer et al. (2000) identified refactoring indicators when comparing different releases of a software system. They used four heuristics to find refactorings; each was identified as a mixture of change metrics.

3. Data Analysis

The main objective of the research described here is to assess how a system changes, through the analysis of packages in the system, and to compare that data with corresponding refactoring results for the same system. Knowledge of trends and changes within packages is a starting point for understanding how effective the original design may have been and how susceptible types of packages may be to change, and it can also inform our knowledge of facets of software such as coupling and cohesion. To this end, a case study approach was adopted using multiple versions of an evolving system. This system was a large OSS called 'Velocity', a template engine allowing web designers to access methods defined in Java. For each version, we collected the number of added classes, lines of code (LOC), methods and attributes. Hereafter, we define a LOC as a single executable statement; we therefore disregard comment lines and white space from the calculation of LOC.
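As an illustration of this data collection (the paper does not describe its tooling in detail), the following Python sketch counts executable LOC in Java files and the classes added between two versions; the comment handling is deliberately approximate and the directory layout is an assumption.

```python
import re
from pathlib import Path

CLASS_RE = re.compile(r"\b(?:class|interface)\s+(\w+)")

def executable_loc(java_file):
    """Count lines that are neither blank nor comments (block comments handled only roughly)."""
    loc, in_block = 0, False
    for line in Path(java_file).read_text(errors="ignore").splitlines():
        stripped = line.strip()
        if in_block:
            in_block = "*/" not in stripped
            continue
        if not stripped or stripped.startswith("//"):
            continue
        if stripped.startswith("/*"):
            in_block = "*/" not in stripped
            continue
        loc += 1  # treated as an executable (non-comment, non-blank) line
    return loc

def declared_classes(version_dir):
    """All class/interface names declared in one version's source tree."""
    names = set()
    for f in Path(version_dir).rglob("*.java"):
        names.update(CLASS_RE.findall(f.read_text(errors="ignore")))
    return names

# Added classes between two consecutive (hypothetically named) version checkouts:
# added = declared_classes("velocity-v2") - declared_classes("velocity-v1")
```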

3.1 The Three Research Questions


• RQ1: Does the number of new classes over the course of the nine selected versions increase constantly? This question is based on the notion that a system will grow over time in a constant fashion in response to regular changes in requirements.
• RQ2: Is the increase in LOC over the course of the nine versions constant? This question is based on the assumption that the change in LOC over the nine versions will always increase due to evolutionary forces.
• RQ3: Is the increase in the number of attributes and methods in a package constant across the versions of a system? This question is posed on the assumption that the change in the number of attributes and methods will increase consistently over time in response to constant changes in requirements.

3.1.1. Research Question 1 (RQ1)

Table 1 shows the number of packages in each of the nine versions, the number of new classes across those packages, the number of new classes in total, the maximum increase in classes and the name of the package where that increase took place. In each of the nine versions, new classes were added to packages, and the number added varied significantly from one version to another. Between versions three and four, and between versions six and seven, relatively little change can be seen, while the peak of added classes is reached in the fifth version, with 2032 new classes added. Clearly, the addition of classes to this system over the versions investigated is not constant. Interestingly, the version with the highest number of new classes was also accompanied by a drop in the number of packages (from 42 to 36). Equally, some of the largest additions of classes were made after only minor changes to the numbers of packages. Both effects may possibly be due to classes being moved around in the same package and simply renamed.

Table 1. Packages and the new classes over the course of nine versions

Version | No. of packages | No. of new classes | Max inc. | Package name
1st | 28 | 788 | 176 | Editor
2nd | 32 | 1116 | 207 | Java
3rd | 38 | 17 | 5 | Core
4th | 42 | 11 | 3 | Javadoc
5th | 36 | 2032 | 329 | Debuggerjpda
6th | 39 | 45 | 13 | Openide
7th | 39 | 297 | 92 | Core
8th | 38 | 1274 | 357 | Web
9th | 39 | 1386 | 217 | Core

A feature not immediately apparent from the data in Table 1 is the peak and trough effect in this data. A graph was therefore used to present the changes in the number of newly added classes (Figure 1). We suggest that this trend is symptomatic of a burst of developer change activity followed by a period of relative stability and accumulation of new requirements, before another burst of change activity. For RQ1, we conclude that the number of new classes over the course of the nine versions increases at an inconsistent rate, rather than remaining constant. It is not the case that there is constant addition of classes to the Velocity system over the nine versions investigated; RQ1 cannot thus be supported.

Figure 1. Line chart of new classes added to the packages over the nine versions

3.1.2. Research Question 2 (RQ2)

To investigate RQ2, the 'maximum' increase in the number of LOC among all the versions was used. The data is presented in Table 2. It can be seen that there are increases in LOC over the course of the versions, but these increases fluctuate wildly. Interestingly, the Core and Vcscore packages were the packages that saw the maximum increases in LOC for five of the versions. The Core package is the only package common to Tables 1 and 2, suggesting that the addition of a large number of classes does not necessarily imply the addition of a correspondingly large number of LOC. One explanation for this feature might simply be that one class has been split into two (cf. the 'Extract Class' refactoring of Fowler (1999)).

Table 2. Max. increase in the number of LOC over the course of the nine versions

Version | Max inc. in LOC | Package name
1st | 3955 | Core
2nd | 5077 | Form
3rd | 889 | Vcscore
4th | 910 | Javacvs
5th | 6985 | Vcscore
6th | 1109 | Vcsgeneric
7th | 369 | Core
8th | 6418 | Core
9th | 6743 | Schema2beans

Figure 2. Line chart of the max increase in the number of LOC (nine versions)


Figure 2 confirms that the increases in the number of lines of code over the course of the nine versions fluctuate across versions. Again, the peak and trough effect is apparent from the figure. The most significant changes to Vcscore appear in the first five versions and those of Core appear in the seventh and eighth versions; RQ2 cannot be supported either.

3.1.3. Research Question 3 (RQ3)

For RQ3, the maximum increase in the number of attributes and the maximum increase in the number of methods for each version were used. This data is presented in Table 3 and shows that over the course of the nine versions there are increases in the number of attributes (A) and the number of methods (M) in every version. However, the size of these increases varies from one version to another. The largest increase in the number of attributes and methods is at version five. Once again, two packages dominate Table 3, those being Core and Vcscore (seven of the eighteen entries in columns 4 and 7 relate to these two packages). As per Table 2, the maximum increase in methods occurs at earlier versions for Vcscore and, for Core, towards later versions.

Table 3. Summary of the increase in the no. of attributes and methods over the nine versions

Version | Inc. in A | Max inc. in A | Package name (A) | Inc. in M | Max inc. in M | Package name (M)
1st | 153 | 26 | Core | 228 | 36 | Vcscvs
2nd | 262 | 49 | Form | 335 | 84 | Vcscore
3rd | 25 | 7 | Jndi | 46 | 11 | Vcscore
4th | 24 | 7 | Diff | 22 | 6 | Diff
5th | 325 | 51 | Form | 489 | 70 | Vcscore
6th | 39 | 10 | Debuggercore | 73 | 14 | Openide
7th | 17 | 4 | I18n | 29 | 6 | Core
8th | 238 | 57 | Core | 371 | 150 | Core
9th | 226 | 34 | Java | 378 | 76 | Xml

Figure 3. Inc. in attributes and methods
Figure 4. Max. inc. in attributes and methods

Figure 3 shows that the number of attributes and the number of methods both increase during the course of the nine versions, but at a fluctuating rate. Version four shows that more attributes were added than methods; the pattern for all other versions is the opposite. In contrast with the previous analysis, Figure 4 shows that version eight appears to be the source of the largest increase in methods. In keeping with the results from RQ1 and RQ2, we conclude for RQ3 that the increase in attributes and methods is not constant across the nine versions investigated.
While the results so far give a fairly intuitive understanding of how a system might evolve, what is not so clear is the relationship between the 'regular' set of changes as we have described them and the opportunities for undertaking a set of changes such as those associated with refactoring techniques (Opdyke (1992), Tourwe and Mens (2003)). These are both interesting and potentially fruitful areas of refactoring research, as well as challenges facing the refactoring community (Mens and van Deursen (2003)).

4. Refactoring Relationships

A current, open research question is whether there is any identifiable relationship between changes as we have described them so far and specific refactorings. Beck (1999) suggests that a developer should refactor 'mercilessly' and hence consistently. We would therefore expect refactorings for the Velocity system to be consistently applied across all versions.

4.1 Velocity

Table 4 shows the fifteen refactorings collected by a software tool as part of a full study of refactoring in seven Java OSS systems by Advani et al. (2006). The fifteen refactorings were chosen by two developers with industrial experience and reflected, in their opinion and in consultation with Fowler's text, the common refactorings likely to be made by developers over the course of a system's life.

Table 4. Refactorings for the Velocity system across nine versions

No. | Refactoring | Ver1 | Ver2 | Ver3 | Ver4 | Ver5 | Ver6 | Ver7 | Ver8 | Ver9
1. | AddParameter | 0 | 0 | 14 | 0 | 1 | 2 | 0 | 0 | 1
2. | EncapsulateDowncast | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
3. | HideMethod | 0 | 2 | 1 | 0 | 0 | 1 | 0 | 0 | 0
4. | PullUpField | 0 | 0 | 4 | 0 | 2 | 4 | 0 | 0 | 0
5. | PullUpMethod | 0 | 4 | 13 | 0 | 24 | 5 | 0 | 0 | 9
6. | PushDownField | 0 | 0 | 0 | 0 | 7 | 0 | 0 | 0 | 0
7. | PushDownMethod | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 4
8. | RemoveParameter | 0 | 0 | 3 | 0 | 1 | 0 | 0 | 0 | 3
9. | RenameField | 0 | 3 | 14 | 0 | 1 | 2 | 0 | 0 | 3
10. | RenameMethod | 0 | 5 | 11 | 0 | 15 | 14 | 0 | 0 | 10
11. | EncapsulateField | 0 | 5 | 4 | 0 | 0 | 0 | 0 | 0 | 0
12. | MoveField | 0 | 0 | 18 | 0 | 1 | 2 | 0 | 0 | 0
13. | MoveMethod | 0 | 3 | 16 | 0 | 3 | 3 | 0 | 0 | 2
14. | ExtractSuperClass | 0 | 1 | 3 | 0 | 8 | 1 | 0 | 0 | 2
15. | ExtractSubClass | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0

Versions 3, 5 and 6 can be seen as the main points when refactoring effort was applied to the Velocity system. In versions 1, 4, 7 and 8, zero refactorings were applied to this system. Figure 5 shows Table 4 in graphical form (with the 'per version sum' of the fifteen refactorings on the y-axis). The figure shows that refactoring effort is applied most significantly at one version (in this case version 3) and thereafter a peak and trough effect can be seen. Comparing the trend in Figure 5 with that in Figures 1-4 suggests that the majority of the refactoring effort occurred between versions where significant changes in classes, LOC, methods and attributes took place; version 3 is a trough in terms of these added features. Conversely, version 5 in Table 4 shows significant refactoring effort to have been applied, coinciding with large changes in the aforementioned features. Version 6 activity (again a trough in terms of Figures 1-4) also shows relatively large amounts of refactoring effort.
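To make the 'per version sum' used for Figure 5 explicit, the minimal sketch below recomputes those totals from the Table 4 data; only a few rows are transcribed here, the remaining rows would be added in the same way.

```python
# Per-version refactoring counts transcribed from Table 4 (a subset of the
# fifteen rows, versions 1 to 9 left to right).
refactorings = {
    "AddParameter": [0, 0, 14, 0, 1, 2, 0, 0, 1],
    "PullUpMethod": [0, 4, 13, 0, 24, 5, 0, 0, 9],
    "RenameMethod": [0, 5, 11, 0, 15, 14, 0, 0, 10],
    "MoveField":    [0, 0, 18, 0, 1, 2, 0, 0, 0],
    "MoveMethod":   [0, 3, 16, 0, 3, 3, 0, 0, 2],
}

# The y-axis of Figure 5 is the sum over all refactorings for each version.
per_version_sum = [sum(counts) for counts in zip(*refactorings.values())]
print(per_version_sum)  # versions 1, 4, 7 and 8 sum to zero, matching the text
```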

A number of conclusions can be drawn from this analysis. Firstly, it is clear that developers do not seem to refactor consistently across the versions of the system studied (Velocity). Secondly, while there is some evidence of peaks in refactoring effort occurring at the same time as large changes in classes, LOC, methods and attributes, refactoring seems to occur largely after a peak of the same type of changes. One of the claims by Fowler as to why developers do not do refactoring is that they simply do not have the time. Finally, in the preceding analysis, and from Figure 5, it is not in the initial versions that the majority of regular change activity is applied to a system. We note, as an aside, that a valuable and interesting analysis of the specific dates of version release remains a topic of future work.
The first question that naturally arises is why refactoring changes tend to follow the regular changes applied to a system. After all, it is quite feasible for refactoring to be carried out at the same time as other changes (there is limited evidence of this occurring in the data). Moreover, the opportunity for refactoring often arises as part of other maintenance activity, and we would thus expect developers to spot opportunities for refactoring as they undertake other work on a system. There is one relatively straightforward explanation for this phenomenon. All of the fifteen refactorings in Table 4 are semantics-preserving and do not explicitly add large numbers of classes, LOC, methods or attributes as part of their mechanics. For example, the 'Move Field' and 'Move Method' refactorings would have no net effect on the number of fields or methods in a system, on a package basis. Simple renaming refactorings such as 'Rename Field' and 'Rename Method' do not, per se, add any LOC to the system. Equally, none of the inheritance-related refactorings explicitly add LOC to a system. One further suggestion as to why refactoring occurs at different versions is that, after a burst of regular maintenance effort and a new version being released, the decay to the system that those changes have caused may need to be remedied. In other words, after a concerted effort to modify the system through regular maintenance, developers may feel that only then is refactoring necessary.

Figure 5. Refactorings in the nine versions of Velocity


This does not explain, however, why for the Velocity system there is significant refactoring effort in version 5 coinciding with a large set of changes in terms of added classes, LOC, methods and attributes. One explanation could be that developers refactor during the course of normal maintenance but without explicitly recognizing it as refactoring. In other words, they may 'tidy up' in situ. We could hypothesise that while for Velocity (and the refactorings we have extracted) refactoring effort is not applied consistently, there are two key occasions when, consciously or sub-consciously, it is applied.

4.2 PDFBox and Antlr

The question we could then ask is whether refactoring effort is consistent in terms of the versions where it is undertaken and whether a similar trend in refactoring appears in other systems. Figure 6 shows the versions where refactorings were undertaken for the PDFBox system. Versions 3 and 6 appear to be where the majority of the refactoring effort was invested. Although we do not have the dataset of regular maintenance changes applied to the PDFBox system, it is interesting that a peak and trough effect is clearly visible for this system as well as for Velocity.

Figure 6. Refactorings for PDFBox
Figure 7. Refactorings for Antlr

Figure 7 shows the refactoring trends for the Antlr system. Version two appears to be the point when most refactoring effort was invested, reinforcing the view that relatively more refactoring seems to be undertaken at early versions of a system's life (but not at its inception). It is interesting that across all three systems, version one seems to have been the subject of virtually no refactoring effort. One explanation might be that version one is simply too early in the life of a system for refactoring effort to be applied. On the other hand, it appears that version two or three is when the bulk of refactoring occurs.
The question that then arises is whether the numbers of each type of refactoring in each of the three systems were similar. Inspection of the raw data reveals a common trend for refactoring 1 (Add Parameter) and refactorings 9, 10, 12 and 13 (Rename Field, Rename Method, Move Field and Move Method). We hypothesize that these types of refactoring have been applied relatively more frequently than the other refactorings of the fifteen because they 'tidy up' a system with relatively little effort being required. After a significant amount of maintenance effort has been applied to a system, minor modifications are bound to be necessary. This may further explain why there is no coincidence between regular maintenance effort and that of refactoring. In the analysis of changes made at the package level, a significant number of methods and attributes were added over the versions studied.

Based on the refactoring evidence, we could claim that the five stated refactorings were a direct response to the problems associated with the addition of so many attributes and methods. For example, the motivation for the 'Move Field' refactoring is when 'a field is, or will be, used by another class more than the class on which it is defined'. In such a case, the field needs to be moved to the place 'where it is being used most'. Equally, the 'Move Method' refactoring is applicable when 'a method is, or will be, using or used by more features of another class than the class on which it is defined'. For the Velocity system, the large number of these two refactorings at version three suggests that the correspondingly large number of fields and methods added were the cause of the required subsequent refactoring. In other words, simple refactorings may have been undertaken to remedy the problems associated with such an intense set of added fields and methods. We also note that these two refactorings were popular across all three systems studied (and at specific points), which adds weight in support of this argument. The same principle applies to the simple renaming of fields and methods. It is perfectly reasonable to suggest that when large numbers of attributes and methods have been added to a system, a certain amount of refactoring may be necessary subsequently to disambiguate and clarify the role and meaning of those fields and methods. Fowler (1999) suggests that the 'Move Method' refactoring is the 'bread and butter of refactoring'. Equally, 'Move Field' is the 'very essence of refactoring'. Similarly, Fowler reveals an interesting point about the 'Rename Method' refactoring: 'Life being what it is, you won't get your names right the first time'. One explanation for the lack of the more 'structurally-based' refactorings (i.e., those that manipulate the inheritance hierarchy) in the systems studied might be that package access provides the necessary inter-class access that inheritance might otherwise provide. The 'Extract Subclass' and 'Extract Superclass' refactorings would fall into this category. One final point relates to why versions two and three were the source of the most refactoring effort (as opposed to later versions, across all three systems). One explanation is that when a system is at the early stages of its lifetime, the design documentation is more likely to be up-to-date. Consequently, the system is relatively easy to modify from a refactoring perspective. As the system ages, increasing amounts of effort and time need to be devoted to changes as the code 'decays'.

5. Conclusions
The goal of this research was to investigate how a system evolved at the package level, and this goal was achieved through the use of a case study. A set of three research questions investigated trends in changes over nine versions of an open-source Java system. A bespoke tool was written to extract data relating to changes across those nine versions. An interesting 'peak and trough' trend was found to exist in the system studied at specific versions of the system, suggesting that developer activity comprises a set of high and low periods. A contrast was found between those regular changes and those associated with refactoring activity. The results address a hitherto unknown area: that of the relationship between regular changes made to a system as part of maintenance and that of refactoring. While the study describes only a limited sample of systems and evidence of the peak and trough effect is similarly restricted (both threats to study validity), we view the research as a starting point for further replicated studies and for an in-depth and generalised analysis of coupling/refactoring, both inter- and intra-package.

Bibliography
Advani, D., Hassoun, Y. and Counsell, S. (2006) Extracting refactoring trends from open-source software and a possible solution to the related refactoring conundrum. Proceedings of the ACM Symposium on Applied Computing, Dijon, France, April 2006.
Arisholm, E. and Briand, L.C. (2006) Predicting Fault-prone Components in a Java Legacy System. Proceedings of the 2006 ACM/IEEE Intl. Symposium on Empirical Software Engineering.
Beck, K. (1999) Extreme Programming Explained: Embrace Change, Addison Wesley.
Belady, L. and Lehman, M. (1976) A model of large program development, IBM Systems Journal, 15(3), pages 225-252.
Bieman, J. M., Straw, G., Wang, H., Munger, P. and Alexander, R. (2003) Design patterns and change proneness: an examination of five evolving systems. In Proc. of the Ninth Intl. Software Metrics Symposium, Sydney, Australia, pp. 40-49.
Bieman, J. M., Jain, D. and Yang, J. Y. (2001) OO design patterns, design structure, and program changes: an industrial case study. In Proceedings of the Intl. Conf. on Software Maintenance, Florence, Italy, pp. 580-589.
Demeyer, S., Ducasse, S. and Nierstrasz, O. (2000) Finding Refactorings via Change Metrics. In OOPSLA '00: Proc. of the 15th ACM SIGPLAN OOPSLA, Minneapolis, Minnesota, United States, pp. 166-177. ACM Press, New York, NY, USA.
Ducasse, S., Lanza, M. and Ponisio, L. (2005) Butterflies: a visual approach to characterize packages. In Proceedings of the 11th Intl. Software Metrics Symposium, Como, Italy, pp. 7-16.
Ducasse, S., Lanza, M. and Ponisio, L. (2004) A top-down program comprehension strategy for packages. Technical Report IAM-04-007, University of Berne, Institute of Applied Mathematics and Computer Sciences.
Fowler, M. (1999) Refactoring (Improving the Design of Existing Code). Addison Wesley.
Gamma, E., Helm, R., Johnson, R. and Vlissides, J. (1995) Design patterns: elements of reusable object-oriented software. Massachusetts, Addison-Wesley.
Mens, T. and van Deursen, A. (2003) Refactoring: Emerging Trends and Open Problems. Proceedings of the First International Workshop on REFactoring: Achievements, Challenges, Effects (REFACE). University of Waterloo.
Mens, T. and Tourwe, T. (2004) A Survey of Software Refactoring, IEEE Transactions on Software Engineering 30(2): 126-139.
Opdyke, W. (1992) Refactoring object-oriented frameworks, Ph.D. Thesis, Univ. of Illinois.
Tourwe, T. and Mens, T. (2003) Identifying Refactoring Opportunities Using Logic Meta Programming, Proc. 7th European Conf. on Software Maintenance and Re-Engineering, Benevento, Italy, 2003, pages 91-100.


Supporting Reengineering Scenarios with FETCH: an Experience Report

Bart Du Bois1 and Bart Van Rompaey1, Karel Meijfroidt2 and Eric Suijs3

1 [email protected], [email protected] Lab On Reengineering, University of Antwerp, Belgium

2 [email protected] Alcatel-Lucent, Antwerp, Belgium

3 [email protected] Philips Medical Systems, Best, The Netherlands

Abstract: The exploration and analysis of large software systems is a labor-intensive activity in need of tool support. In recent years, a number of tools have been developed that provide key functionality for standard reverse engineering scenarios, such as (i) metric analysis; (ii) anti-pattern detection; (iii) dependency analysis; and (iv) visualization. However, these tools either support merely a subset of this list of scenarios, are not made available to the research community for comparison or extension, or impose strict restrictions on the source code. Accordingly, we observe a need for an extensible and robust open source alternative, which we present in this paper. Our main contributions are (i) a comparison among existing solutions; (ii) a clarification of useful reverse engineering scenarios; and (iii) an experience report on four recent cases illustrating the usefulness of tool support for these scenarios in an industrial setting.

Keywords: Static Analysis, Quality Assurance, Industrial Experience

1 Introduction

To support maintenance scenarios for large software systems, where manual effort is error prone or does not scale, developers are in need of tools. In the past, several engineering techniques have been examined for their contribution to unraveling the evolution knot. In the identification of maintenance hot spots, metrics provide quantitative indicators. Visualization serves a key role in understanding the system composition and in facilitating design/architecture communication. Patterns have been successfully introduced as a means to document both best practices (e.g., recurring programmatic needs [GHJV94]) and worst practices (e.g., anti-patterns documenting historical design decisions evaluated as suboptimal [BMMM98]). In assistance to impact and effort analysis, dependency analysis teaches the developer about component interactions. Together, these techniques form the key means to reverse engineer a given system from source.

All of these techniques require, at least implicitly, the presence of a model of the software system. Such a model abstracts from programming language and variant-specific properties, enabling the user to reason in terms of higher-level building blocks. In research on such models and the techniques using them, tools have been implemented. In this work, we combined existing components and best practices of earlier tools into FETCH. Implemented as a chain of tools, following a pipes and filters architecture, we support key reverse engineering scenarios in each of the analysis domains: (i) metric calculation; (ii) (anti-)pattern detection; (iii) impact analysis; and (iv) design/architecture visualization.

FETCH aims to be flexible, lightweight, robust as well as open. Firstly, the pipes and filters architecture eases adaptation towards software systems with other characteristics, such as a new implementation environment, as well as new analysis scenarios, e.g., new visualization techniques. Secondly, lightweight tools minimize the mental load imposed by tool usage and employ heuristics to achieve good performance. Thirdly, the tool chain must also be robust, in order to cope with variations and evolutions in the development tools of the target software system. Finally, due to the careful selection of components, we were able to make FETCH open source, encouraging industrial and academic partners to adopt, extend and integrate with FETCH. To illustrate the need for, and usefulness of, a tool chain satisfying the above properties, we report on the application of FETCH in recent industrial collaborations.

This paper is structured as follows. The set of existing (C++) fact extractors is discussed in Section 2. As a recent alternative to these existing solutions, FETCH is presented in Section 3. In Section 4, we report on the application of FETCH in recent industrial collaborations. Finally, in Section 5, we conclude and discuss future work.
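As a purely illustrative sketch of the pipes-and-filters style referred to above (this is not FETCH's actual implementation, and all stage names and the model representation are hypothetical), each analysis step can be written as a filter that consumes and enriches a shared model:

```python
from functools import reduce

def parse_sources(model):
    model["entities"] = ["ClassA", "ClassB"]    # placeholder fact extraction
    return model

def compute_metrics(model):
    model["metrics"] = {e: {"LOC": 0} for e in model["entities"]}
    return model

def detect_antipatterns(model):
    model["antipatterns"] = []                  # placeholder detection step
    return model

def export_views(model):
    print(model)                                # stand-in for visualization output
    return model

def pipeline(*filters):
    """Compose filters left-to-right; each one consumes and returns the model."""
    return lambda model: reduce(lambda m, f: f(m), filters, model)

fetch_like_chain = pipeline(parse_sources, compute_metrics,
                            detect_antipatterns, export_views)
fetch_like_chain({})
```

Swapping a stage (e.g., a parser for a different language variant) only requires replacing one filter, which is the adaptability argument made in the paragraph above.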

2 Related Work

Table 1 compares the usage scenarios provided by related C++ fact extractors. This non-exhaustive list has been composed with the information available to us at the time of writing. As C++ fact extractors, these tools invariably process C++ source code and generate an abstract, queryable representation.

Table 1: Comparison of a non-exhaustive list of C++ fact extractors

Fact Extractor               Scenarios (of MC, PD, DA, V)   Key strengths             Open source?
Open Visualization Toolkit   √ √ √ √                        Aggregation               No
TkSee/SN                     √ √                            Source navigation         Yes
iPlasma                      √ √ √                          Quality assurance         No
CPPDM extractor              √ √ √                          System decomposition      No
Rigi                         √ √ √ √                        Aggregation               Partially
Acacia                       √ √ √ √                        Dependency analysis       No
SPOOL                        √ √ √ √                        Design pattern recovery   No
CppX/SWAG Kit                                               Architecture evaluation   Yes
# per scenario               MC: 8, PD: 4, DA: 6, V: 6

We use the following abbreviations: (i) MC for Metric Calculation; (ii) PD for Pattern Detection; (iii) DA for Dependency Analysis; and (iv) V for Visualization. As not all fact extractors were built to support these features, we also indicate other key strengths of the tools listed. Additionally, we indicate whether the tools are open source – i.e., whether they can be freely used and extended. The latter is relevant for readers wishing to use an extractor for their own projects.


Open Visualization Toolkit – uses Source Navigator (SN) in combination with the Open Inventor C++ toolkit [TMR02]. The usefulness of the toolkit for reverse architecting large systems has been demonstrated on several software systems from Nokia. A key contribution of this toolkit is the ability to aggregate lower-level facts on a higher hierarchical level.

TkSee/SN – uses SN to generate a GXL representation of the Dagstuhl Middle Model (DMM). This tool mostly supports navigation and querying. While TkSee/SN is indicated in [SHE02] to be open source, we were unable to find any reference to the source code nor any binary release.

iPlasma – The Integrated Platform for Software Modelling and Analysis particularly targets quality analysis [MMM+05]. Instances of iPlasma's meta-model (Memoria [Rat04]) can be queried using the Static Analysis Interrogative Language (SAIL) [MMG05].

CPPDM extractor – [MW03] discusses a C++ fact extractor combining Source Navigator1 (SN) with Rigi [MK88]. The output model conforms to a C/C++ domain model (CPPDM), and is presented in Rigi Standard Format (RSF). Since the authors do not provide a name for their tool, we abbreviate it as CPPDM extractor.

Rigi – presents a scriptable environment for visualizing the structure of large software systems. Parsers are provided for multiple languages (C, C++ and COBOL) [MK88]. The Rigi C++ parser is based on IBM VisualAge for C++ [Mar99].

Acacia – the C++ Information Abstraction System developed by AT&T [CNR90]. A tool named Ccia transforms C++ source code into a relational database, which is queried using Ciao [CFKW95].

SPOOL – The SPOOL environment transforms source code into the Datrix/TA-XML intermediate format using the Datrix C++ parser [SKL+02]. This XML format is then traversed using an XML parser. SPOOL was designed to support pattern-based reverse engineering for program comprehension, and was applied on industrial C++ systems.

CppX/SWAG Kit – transforms the internal graph structure of gcc to a GXL (Graph eXchange Language) representation of Datrix [DMH01]. The dependency on an outdated gcc (v3.0) enforces stringent requirements on the source code, which may not be easily satisfied in industrial research projects. CppX is part of the SWAG Kit, which furthermore entails the Grok [HWW02] fact manipulation engine as well as a graph visualization environment.

In [SHE02], a C++ fact extractor benchmark (CppETS) is presented. Four extractors (Ccia, CppX, the Rigi C++ parser and TkSee/SN) were evaluated using so-called test buckets, presenting challenging C++ fragments. This evaluation addressed the accuracy and robustness of the extractors. In contrast, we are specifically looking for high-level services provided on top of the raw fact extraction. The resulting comparison2 is presented in Table 1.

3 Tool Chain

The C++ fact extractors discussed in the previous section successfully supported reverse engineering activities, as demonstrated by the experience reports and case studies in their respective publications. However, researchers and practitioners wishing to reuse these extractors on their

1 http://sourcenav.sourceforge.net 2 Disclaimer: whether or not an extractor supports a scenario is extracted from publications/documentation.

projects typically face at least two problems. First, these extractors most often support merely a subset of the features mentioned in Table 1. Second, these extractors are more often than not unavailable: we were only able to find the source code for CppX/SWAG Kit and Rigi. The latter is a problem in case one wishes to apply the extractors, as we do. To overcome these problems, we propose FETCH3, a 100% open-source tool chain. FETCH bundles and unifies a set of open source tools supporting key issues in reverse engineering. Figure 1 illustrates how FETCH4 combines these tools to statically analyze source code. Indeed, FETCH currently does not provide a solution for dynamic analysis.

[Figure 1 (diagram omitted): Source Navigator, CCCC and home-made scripts for namespaces and conditional compilation feed the snav-to-famix converter, which produces a CDIF/Famix model; the model is translated to RSF and queried with Crocopat, whose output is fed to Guess (GDF), R (TXT) or Moose.]

Figure 1: The chain of tools comprising FETCH

Source Navigator (SN) – A source code analysis tool initiated by Red Hat. [TD01] provides a nice overview of the supported functionality. SN has been used previously to develop metric tools [SE01, SKN05] and even some of the C++ fact extractors presented earlier. One of the main benefits of SN is that it parses the source code for structural programming constructs through robust, lexical analysis, making it tolerant towards variants of C and C++. The results of a parse are stored in database tables together with location and scoping information. Subsequently, relations between such structural constructs, such as method invocations and attribute accesses, are obtained via a cross-referencing step.

CCCC – A metrics tool developed by Littlefair [Lit01]. These metrics concern characteristics not deducible from the structure of the source code as derived by SN, such as Lines of Code and Comment Lines.

Crocopat – A graph query engine written by Beyer and Noack. Crocopat supports the querying of large software models in RSF using Prolog-like Relation Manipulation Language (RML) scripts. In contrast to typical binary relation query languages such as Grok, Crocopat's support for n-ary relations is particularly expressive for software patterns. Moreover, the highly efficient internal data structure enables superior performance to Prolog for calculating closures (e.g., a call graph) and patterns with multiple roles (e.g., design patterns) [BNL05].

Guess – A graph exploration system developed by Adar [Ada06]. The input graphs, described in an ASCII format (GDF), can specify various properties of nodes and edges as required for polymetric views [LD03]. Various layout algorithms are provided. A tree hierarchy layout is missing, yet can be contributed.

3 We emphasize the names of home-made tools in a different STYLE to distinguish them from existing ones. 4 http://www.lore.ua.ac.be/Research/Artefacts/fetch


R – The R project for statistical computing5, initiated by Bell Laboratories. Typical statistical analyses such as calculating spread, identifying outliers, comparing data sets and evaluating predictive models are supported through its vast library.

The first phase of any usage scenario consists of generating an abstract model corresponding to the Famix (FAMOOS6 Information Exchange Model) specification. The contents of this model (see Table 2) require input from (i) Source Navigator; (ii) CCCC; and (iii) home-made scripts to extract namespace scopes and conditional compilation directives. The facts provided by these three tools are combined and interconnected by SNAVTOFAMIX7, resulting in the generation of a CDIF (Case Data Interchange Format [Che03]) representation of the Famix model.

Table 2: Abstract model content.
Entity                        Origin
File, Include                 Source Navigator
ConditionalCompilation        SCRIPT1
Package                       SCRIPT2
Class, Inheritance, Typedef   Source Navigator
Method, Function              Source Navigator
Attribute, Global Variable    Source Navigator
Invocation, Access            Source Navigator + SNAVTOFAMIX
Measurement                   CCCC + Source Navigator

Each of FETCH's usage scenarios requires querying the abstract model. To facilitate querying, the abstract model is translated to a graph format (RSF, the Rigi Standard Format). Crocopat internalizes this RSF model in an efficient in-memory data structure, and executes given RML (Relation Manipulation Language) scripts as queries. These queries result in ASCII output, which we use to generate either (i) reports; (ii) a filtered/extended/altered RSF model; (iii) a visualization in GDF format (Guess Data Format); or (iv) tab-separated data listings. For visualizations, such as polymetric views [LD03], we employ Guess. Guess provides a flexible means to explore visualizations by zooming and navigating, various layout algorithms, and a query/manipulation mechanism for properties of the visualized information. In case more advanced statistical analysis is required, tab-separated data listings are processed using R scripts.
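To make the querying step more concrete, the sketch below scans an RSF model, assumed here to consist of simple whitespace-separated triples of the form "Relation elementA elementB", for all elements that reference a given target entity. It is an illustrative stand-in written in Java rather than an actual RML script; the relation and entity names, and the assumption that entity names contain no embedded whitespace, are ours and not part of FETCH.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.*;

// Illustrative sketch only: lists, per relation, the source elements that
// point to a given target entity in a whitespace-separated RSF file.
public class RsfQuery {
    public static void main(String[] args) throws IOException {
        String rsfFile = args[0];   // e.g. model.rsf (hypothetical file name)
        String target = args[1];    // e.g. "T_Reference.line" (hypothetical entity)
        Map<String, Set<String>> sourcesByRelation = new TreeMap<>();
        for (String line : Files.readAllLines(Paths.get(rsfFile))) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 3) continue;              // skip malformed lines
            String relation = parts[0], from = parts[1], to = parts[2];
            if (to.equals(target)) {
                sourcesByRelation.computeIfAbsent(relation, r -> new TreeSet<>()).add(from);
            }
        }
        sourcesByRelation.forEach((relation, sources) ->
            sources.forEach(src -> System.out.println(relation + ": " + src + " -> " + target)));
    }
}
```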

4 Experience Report

In this section, we present an experience report. This work is carried out in collaboration with two industrial partners in the ITEA SERIOUS8 project. As of March 2007, we used FETCH to support several reengineering projects in multi-million line software systems. These reports present applications of the four usage scenarios presented in Table 1, and indicate the relevance of FETCH in industrial projects in which reverse engineering was called for.

5 http://www.r-project.org 6 http://www.iam.unibe.ch/∼famoos 7 http://sourceforge.net/projects/snavtofamix 8 https://lore.cmi.ua.ac.be/serious

4.1 Case 1: Internal Quality Monitoring

To describe the use case at the heart of this usage of FETCH, we reuse a template provided by [Coc95]. Table 3 presents a characteristic description and lists the main success scenario.

Table 3: Basic Use Case Template for Internal Quality Monitoring
Goal in Context: User requests comparison of two versions of the same software system with regard to metrics of choice.
Preconditions: A model (abstract representation) of the software system is available.
Success End Condition: User receives file comparing metrics between the two versions.
Failed End Condition: User receives message explaining failed attempts.
Primary Actor: Quality Assurance
Trigger: User launches comparison script.
Main success scenario:
0. User identifies metrics to be used for the comparison and selects associated script to compare metrics.
1. User launches comparison script.
2. Comparison script generates plots and report comparing metric values between the two versions.

To support management in monitoring the improvement of quality characteristics during currently running refactoring efforts, the authors were asked to provide objective data. At large, the refactoring scenario can be summarized as a partial transition from an overly complex implementation in C to a more well-balanced design in C++, serving to (i) introduce abstraction; and improve (ii) understandability; and (iii) testability. The module in question concerned a protocol implementation at the data link layer, comprising 8.6 kSLOC of C code, of which the lion's share (6.9 kSLOC) was contained in a single source file. Key characteristics of the structural decomposition before and after refactoring are listed in Table 4.

Table 4: Structural comparison.
Entity            # before   # after   ∆
File              9          38        +29
Class             24         46        +22
Method            0          270       +270
Function          255        214       -41
Attribute         191        282       +91
Global Variable   66         60        -6

After refactoring, the largest implementation file was reduced by 40.1% to 4.1 kSLOC by encapsulating functionality in newly introduced abstractions. This encapsulation is clearly noticeable in Table 4. What is not deducible from Table 4 is the extent to which this encapsulation represents a redistribution of functionality. Accordingly, Figure 2 compares the distribution of methods and functions across files before and after refactoring. The before image on the left indicates that all methods and functions were contained in merely two files (a highly centralized implementation). After refactoring, the functionality is decentralized through a distribution across multiple files, each containing fewer methods and functions. Table 4 indicates that as many as 270 methods were introduced, increasing the total number of invokable entities (methods + functions) from 255 before refactoring to 484 after refactoring. Accordingly, we ask ourselves whether this refinement improved readability and testability. The average size and complexity of methods and functions was considerably reduced, as can be seen from Figure 3. Size was shortened from an average of 27 LOC (median=17) to 14 LOC

Proc. Software Evolution 2007 6 / 14 ECEASST

[Two histogram panels omitted (Before, After); x-axis: number of methods and functions per file, y-axis: frequency.]

Figure 2: Distribution of methods and functions before/after refactoring

(median=8). Complexity (measured using an approximation of McCabe's cyclomatic complexity) was reduced from an average of 6.3 (median=1) to 3.3 (median=4). Cyclomatic complexity calculates the number of independent paths through a method or function, and is thus directly related to testing effort, since each independent path will need to be covered by a test case to ensure adequate coverage. In the reduction of the spread, we observe that the variation in size and complexity is harmonized as well.

[Boxplot panels omitted; (a) size in LOC, (b) complexity in MVG, each shown before and after refactoring.]
Figure 3: Variation of method/function size/complexity before and after refactoring

Summarizing, our observations reveal that the refactoring efforts led to a redistribution of functionality. This distribution was accompanied with a reduction of the size and complexity of methods and functions, resulting in an implementation that is both easier to understand and test. The tools within FETCH providing the required facts for this scenario are Source Navigator (providing information on the structural composition) and CCCC (providing size and complexity measurements). The integration of these two data sources is implemented in SNAVTOFAMIX. This scenario also demonstrates that relatively primitive information can suffice to verify the achievement of quality improvements during software evolution.
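As an illustration of the comparison step behind Table 3's main success scenario, the following Java sketch contrasts per-entity metric values of two versions and prints the deltas. It is not the authors' actual comparison script: the tab-separated input format and the file names are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.*;

// Illustrative sketch only: reads two tab-separated exports of per-entity metric
// values ("entity<TAB>value"), one per system version, and prints the delta.
public class MetricDelta {
    static Map<String, Double> load(String file) throws IOException {
        Map<String, Double> metrics = new TreeMap<>();
        for (String line : Files.readAllLines(Paths.get(file))) {
            String[] parts = line.split("\t");
            if (parts.length == 2) metrics.put(parts[0], Double.parseDouble(parts[1]));
        }
        return metrics;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Double> before = load(args[0]);   // e.g. loc-before.tsv (hypothetical)
        Map<String, Double> after = load(args[1]);    // e.g. loc-after.tsv (hypothetical)
        Set<String> entities = new TreeSet<>(before.keySet());
        entities.addAll(after.keySet());
        for (String entity : entities) {
            double b = before.getOrDefault(entity, 0.0);
            double a = after.getOrDefault(entity, 0.0);
            System.out.printf("%s\t%.1f\t%.1f\t%+.1f%n", entity, b, a, a - b);
        }
    }
}
```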

4.2 Case 2: Estimating Refactoring and Test Effort

In a second case FETCH was used to estimate the effort involved in a refactoring task inside a 67 kSLOC subsystem, implementing a protocol at the network layer. The associated use case is characterized in Table 5.

Table 5: Basic Use Case Template for Estimating Refactoring and Test Effort
Goal in Context: User requests report about impact and effort of a refactoring task, testing inclusive.
Preconditions: A model (abstract representation) of the software system is available.
Success End Condition: User receives a report with impacted locations and the corresponding effort in lines of code.
Failed End Condition: User receives message explaining failed attempts (e.g. a faulty description of the modification, or a refactoring of non-existent entities).
Primary Actor: Developer
Trigger: User launches estimation script with a description of the modifications in terms of model entities.
Main success scenario:
1. User identifies script to perform impact and effort analysis.
2. User translates refactoring task in terms of operations on model entities.
3. User launches script.

Developers noted that the data members of a particular C struct, called T_Reference, were directly passed on, accessed and modified throughout the subsystem. One of the features for the next release translated, at code level, into the introduction of additional data members that need to change along with the existing members. Developers proposed a proper interface to avoid change ripples in the expected future evolution. In a first step, we determined the change impact of the refactoring. Typically, developers used text search tools such as grep [LH02] to obtain an overview of the usage of such a struct via locations where the type is declared (as formal parameter, local variable, etc.). We used FETCH on a model of this subsystem to obtain a list of functions that access data members of T_Reference instances. A traditional text search does not suffice to identify these accesses, as cross-references between instances of structural entities are required. Part of the (disguised) report is represented in Listing 1. The observation that three of the data members are accessed together in most of the functions confirms the need for the refactoring. Replacing each of these direct accesses with an interface invocation is a one-line modification. Therefore, we estimated the refactoring effort at 55 SLOC (the number of accesses detected in this case).

Listing 1: Change Impact Report for T_Reference generated by FETCH

Looking up use of T_Reference
For T_Reference in file types_ifc.hh
Function to data access:
  stack.c:frameRecv(buffer*, uint32*) -> T_Reference.line
  stack.c:frameRecv(buffer*, uint32*) -> T_Reference.vci
  stack.c:frameRecv(buffer*, uint32*) -> T_Reference.vpi
  ...

Method to data access:
  configControl.hpp:ConfigControl.changeOperate(T_Bool) -> T_Reference.line
  configControl.hpp:ConfigControl.changeOperate(T_Bool) -> T_Reference.vci
  configControl.hpp:ConfigControl.changeOperate(T_Bool) -> T_Reference.vpi
  ...

Secondly, we used this report to estimate the testing effort, which had to be started from scratch. The testing approach corresponding to this refactoring task was chosen to be on a per-function basis. Within testing, we distinguish the efforts of (i) stubbing external subsystems; (ii) setup operations to bring the unit under test in the desired state; and (iii) the actual test code. To estimate the number of lines of test code, we incorporate, per function, the number of input parameters (NOP), McCabe's cyclomatic complexity (MVG) [McC76] as well as FanOut [CK91]. Table 6 shows the effort estimation for the test code based upon the incorporated metrics.

Table 6: Test Effort Estimation Scheme
Task       NOP   MVG      FanOut
Stubbing   -     -        10
Setup      2     -        1
Testing    -     0.5*15   -

We reason that testing one path through a function requires about 15 SLOC, but as we only want to test paths affected by the refactoring (also see [WM96] on testing using cyclomatic complexity), we roughly estimate that half of the paths require testing, resulting in the 0.5 * 15 * MVG formula for the actual test code. Summed up over the affected functions, our estimation amounts to 1730 SLOC of stubbing, 257 SLOC of setup and 1582.5 SLOC of testing effort. Using FETCH in an effort estimation context enabled us to calculate the expected change impact objectively, in contrast with the typical combination of gut feeling and search tools. The availability of a software model (provided by Source Navigator), enriched with complexity and coupling metrics (provided by CCCC), and a powerful query mechanism (Crocopat) were essential to support this scenario.
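A minimal sketch of this estimation scheme, applied to a single affected function, is shown below. The per-task weights follow Table 6; the metric values in the example are hypothetical, and the code illustrates the arithmetic rather than reproducing the authors' actual script.

```java
// Illustrative sketch of the effort model in Table 6, not the authors' tooling.
// Per affected function: stubbing ~ 10 * FanOut, setup ~ 2 * NOP + 1 * FanOut,
// test code ~ 0.5 * 15 * MVG (half of the independent paths, ~15 SLOC per path).
public class TestEffortEstimate {
    static double estimate(int nop, int mvg, int fanOut) {
        double stubbing = 10.0 * fanOut;
        double setup = 2.0 * nop + 1.0 * fanOut;
        double testing = 0.5 * 15.0 * mvg;
        return stubbing + setup + testing;
    }

    public static void main(String[] args) {
        // Hypothetical metric values (NOP=2, MVG=6, FanOut=3) for one function.
        System.out.printf("estimated test SLOC: %.1f%n", estimate(2, 6, 3));
    }
}
```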

4.3 Case 3: Detecting anti-patterns: Dead code

A use case description for this case is presented in Table 7.

Table 7: Basic Use Case Template for Detecting Dead Code
Goal in Context: User requests a list of design or architecture elements involved in a given (anti-)pattern.
Preconditions: A model (abstract representation) of the software system is available.
Success End Condition: User receives file listing occurrences of a given (anti-)pattern.
Failed End Condition: User receives message explaining failed attempts.
Primary Actor: Quality Assurance
Trigger: QA engineer (or an automated process) launches a set of detection scripts for given (anti-)patterns.
Main success scenario:
1. User identifies (anti-)pattern to be calculated.
2. User identifies associated script to calculate metrics.
3. User launches detection script.
4. Calculation script generates file reporting on (anti-)pattern occurrences.

Dead code has many faces. One may find (i) blocks of code that are commented out; (ii) functions that are not called anymore; and (iii) code that is not compiled due to compiler directives. A particular instance of the latter category uses the #if 0 directive, a condition that never results

in an inclusion of the nested code after preprocessing. As one of the developers noted that the latter approach is common behaviour in a 110 kSLOC subsystem (implementing a network protocol) under study, we introduced the concept of a conditional compilation directive into the FAMIX meta-model [RVW07], in order to reason about conditional compilation in general and to remove dead code hidden behind compilation directives in particular. We queried the model for all block conditions that contain the condition "0" at least as a subcondition, thereby identifying 911 SLOC out of the total 110 kSLOC that should be removed, following the developers' opinion that such a construct should not be used. Rather, the versioning system should be used to store code that becomes dead or redundant at a certain point in time. FETCH helped to efficiently identify such locations, in what would otherwise have been a cumbersome task of manually identifying blocks that hold this compilation condition. Especially when such blocks are nested, determining their scope becomes hard. Note that information on conditional compilation is not provided by Source Navigator (see Table 2).
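As a rough illustration of the bookkeeping this saves, the sketch below counts the source lines enclosed in #if 0 ... #endif blocks of a single file, taking nested conditionals into account. It is a simplified, line-based stand-in for FETCH's model-based query, not part of FETCH itself, and it deliberately ignores #else/#elif branches.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Illustrative sketch (not FETCH): counts non-empty source lines nested inside
// "#if 0 ... #endif" blocks, keeping track of nested conditional directives.
public class DeadIfZeroCounter {
    public static void main(String[] args) throws IOException {
        int deadSloc = 0;
        int depth = 0;   // nesting depth inside an "#if 0" block, 0 = outside
        for (String raw : Files.readAllLines(Paths.get(args[0]))) {
            String line = raw.trim();
            if (depth > 0) {
                if (line.startsWith("#if")) depth++;           // nested #if/#ifdef/#ifndef
                else if (line.startsWith("#endif")) depth--;   // closes one level
                else if (!line.isEmpty()) deadSloc++;          // dead source line
            } else if (line.matches("#if\\s+0\\b.*")) {
                depth = 1;                                     // entering a dead block
            }
        }
        System.out.println("SLOC under #if 0: " + deadSloc);
    }
}
```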

4.4 Case 4: Detecting key domain abstractions using visualization

Table 8 characterizes the FETCH use case for detecting key domain abstractions. As part of our First Contact with a large C++ software system (a medical imaging application of around 400 kSLOC), we applied a heuristic for detecting key domain abstractions. This heuristic consists of finding concepts that exist in many variations. Typically, sub-variations of a concept are modeled through inheritance, enabling white-box reuse. Templates, however, enable black-box reuse, thereby introducing another way of modeling conceptual variations. In this particular software system, template instantiations were made explicit using typedef declarations. Accordingly, we proposed to decorate the inheritance hierarchy with typedef relations to represent both types of variation relations.

Table 8: Basic Use Case Template for Detecting key domain abstractions
Goal in Context: User requests overview of the genericity of classes in variation graph.
Preconditions: A model (abstract representation) of the software system is available.
Success End Condition: User receives visualization of variation graph.
Failed End Condition: User receives message explaining failed attempts.
Primary Actor: (Newly introduced) maintainer or developer
Trigger: User launches visualization script.
Main success scenario:
1. User launches visualization script.
2. Visualization script generates GDF file representing variation graph.
3. User loads GDF file into Guess.

The resulting variation graph, part of which is represented in Figure 4, contains black circles and edges to represent classes and the inheritance relations between them, and yellow rectangles and edges to represent typedefs and their relation to the base types for which they introduce an alias. Superclasses in the inheritance hierarchy on the left were confirmed by the maintainers to represent high-level concepts that encapsulate generic functionality. The two classes in the hierarchy on the right were confirmed as representations of elements in a data model. Both sets of classes thus indeed represent key domain abstractions, and the latter indicated concepts with many variations not previously observed in the inheritance hierarchy.

Figure 4: Variation graph decorating a traditional inheritance hierarchy with typedefs

The input required to compose this visualization is fully provided by Source Navigator. The visual recognition of relevant classes is facilitated through Guess’s layout algorithms. For this scenario, FETCH’s contribution lies in the integration of these two tools.
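To give an impression of what such a decorated graph looks like on disk, the sketch below emits a small variation graph in a simplified GDF form. The class and typedef names are invented, and the exact GDF attribute declarations may differ from what FETCH generates; the point is merely that inheritance and typedef edges end up in one graph that Guess can lay out.

```java
import java.io.PrintWriter;
import java.util.List;

// Illustrative sketch: writes a small variation graph (classes, typedefs,
// inheritance and typedef edges) in a simplified GDF format for Guess.
// All entity names are hypothetical.
public class VariationGraphGdf {
    record Edge(String from, String to, String kind) {}

    public static void main(String[] args) throws Exception {
        List<String> classes = List.of("Image", "VolumeImage", "SliceImage");
        List<String> typedefs = List.of("CtVolume", "MrVolume");
        List<Edge> edges = List.of(
            new Edge("VolumeImage", "Image", "inherits"),
            new Edge("SliceImage", "Image", "inherits"),
            new Edge("CtVolume", "VolumeImage", "typedef"),
            new Edge("MrVolume", "VolumeImage", "typedef"));

        try (PrintWriter out = new PrintWriter("variation.gdf")) {
            out.println("nodedef>name VARCHAR,kind VARCHAR");
            classes.forEach(c -> out.println(c + ",class"));
            typedefs.forEach(t -> out.println(t + ",typedef"));
            out.println("edgedef>node1,node2,kind VARCHAR");
            edges.forEach(e -> out.println(e.from() + "," + e.to() + "," + e.kind()));
        }
    }
}
```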

5 Conclusion and Future Work

In this work we present the following contributions. First, we list four features highly relevant for reverse engineering tools, namely (i) metric calculation; (ii) (anti-)pattern detection; (iii) impact analysis; and (iv) visualization. Second, we compare current C++ fact extractors with regard to these features, indicating the need for a freely available integrated solution. Third, we introduce FETCH, a tool chain integrating existing open source tools that support the above listed features. Finally, we report on four cases in which FETCH supported industrial reengineering projects. Since no quantitative data was available to serve as an evaluation, this paper merely suggests the industrial usefulness of an integrated tool chain such as FETCH, and by no means attempts to validate its efficacy. The validation of FETCH's contribution in each of the reverse engineering scenarios mentioned in this paper will be addressed in the next research steps.

Acknowledgements: This work has been sponsored by the Flemish Instituut voor Innovatie door Wetenschap en Technologie under grants of the ITEA project if04032 entitled Software Evolution, Refactoring, Improvement of Operational & usable Systems (SERIOUS) of the Eureka Σ 2023 Programme.

Bibliography

[Ada06] E. Adar. GUESS: a language and interface for graph exploration. In CHI ’06: Proceedings of the SIGCHI conference on Human Factors in computing systems. Pp. 791–800. ACM Press, New York, NY, USA, 2006. [BMMM98] W. Brown, R. Malveau, H. McCormick, T. Mowbray. AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis. Wiley, 1998. [BNL05] D. Beyer, A. Noack, C. Lewerentz. Efficient Relational Calculation for Software Analysis. IEEE Trans. Softw. Eng. 31(2):137–149, 2005.

[CFKW95] Y.-F. R. Chen, G. S. Fowler, E. Koutsofios, R. S. Wallach. Ciao: a graphical navigator for software and document repositories. In ICSM '95: Proceedings of the International Conference on Software Maintenance. P. 66. IEEE Computer Society, Washington, DC, USA, 1995.

[CGK98] Y.-F. Chen, E. R. Gansner, E. Koutsofios. A C++ Data Model Supporting Reachability Analysis and Dead Code Detection. IEEE Trans. Softw. Eng. 24(9):682–694, 1998.

[Che03] M. Chen. CASE data interchange format (CDIF) standards: introduction and evaluation. In HICSS'03: Proceedings of the 26th Hawaii International Conference on System Sciences. Pp. 31–40. IEEE Computer Society, Washington, DC, USA, 2003.

[CK91] S. R. Chidamber, C. F. Kemerer. Towards a metrics suite for object oriented design. In OOPSLA '91: Conference proceedings on Object-oriented programming systems, languages, and applications. Pp. 197–211. 1991.

[CNR90] Y.-F. Chen, M. Y. Nishimoto, C. V. Ramamoorthy. The C Information Abstraction System. IEEE Trans. Softw. Eng. 16(3):325–334, 1990.

[Coc95] A. Cockburn. Basic use case template. Technical report HaT.TR.95.1, Humans and Technology, 84121, Salt Lake City, Utah, 1995.

[DDN02] S. Demeyer, S. Ducasse, O. Nierstrasz. Object-Oriented Reengineering Patterns. Morgan Kaufmann, 2002.

[DMH01] T. R. Dean, A. J. Malton, R. Holt. Union Schemas as a Basis for a C++ Extractor. In WCRE '01: Proceedings of the Eighth Working Conference on Reverse Engineering (WCRE'01). P. 59. IEEE Computer Society, Washington, DC, USA, 2001.

[GHJV94] E. Gamma, R. Helm, R. Johnson, J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1994.

[HWW02] R. C. Holt, A. Winter, J. Wu. Towards a Common Query Language for Reverse Engineering. Technical report, Institute for Computer Science, Universität Koblenz-Landau, Koblenz, Germany, 2002.

[KSRP99] R. K. Keller, R. Schauer, S. Robitaille, P. Pagé. Pattern-based reverse-engineering of design components. In ICSE '99: Proceedings of the 21st international conference on Software engineering. Pp. 226–235. IEEE Computer Society Press, Los Alamitos, CA, USA, 1999.

[LD03] M. Lanza, S. Ducasse. Polymetric Views – A Lightweight Visual Approach to Reverse Engineering. IEEE Transactions on Software Engineering 29(9):782–795, 2003.

[LH02] T. T. Lethbridge, F. Herrera. Advances in software engineering: comprehension, evaluation and evolution. Chapter Assessing the usefulness of the TkSee software exploration tool, pp. 73–93. Springer-Verlag New York, Inc., 2002.


[Lit01] T. Littlefair. An Investigation into the Use of Software Code Metrics in the Industrial Software Development Environment. PhD thesis, Faculty of communications, Health and Science, Edith Cowan University, June 2001.

[Mar99] J. Martin. Leveraging IBM visual age for C++ for reverse engineering tasks. In CAS- CON ’99: Proceedings of the 1999 conference of the Centre for Advanced Studies on Collaborative research. P. 6. IBM Press, 1999.

[MB89] T. J. McCabe, C. W. Butler. Design complexity measurement and testing. Commun. ACM 32(12):1415–1425, 1989.

[McC76] T. McCabe. A Complexity Measure. IEEE Transactions on Software Engineering 2(4):308–320, December 1976.

[MK88] H. A. Müller, K. Klashinsky. Rigi – A system for programming-in-the-large. In ICSE '88: Proceedings of the 10th international conference on Software engineering. Pp. 80–86. IEEE Computer Society Press, Los Alamitos, CA, USA, 1988.

[MMG05] C. Marinescu, R. Marinescu, T. Girba. Towards a Simplified Implementation of Object-Oriented Design Metrics. In METRICS ’05: Proceedings of the 11th IEEE International Software Metrics Symposium (METRICS’05). P. 11. IEEE Computer Society, Washington, DC, USA, 2005.

[MMM+05] C. Marinescu, R. Marinescu, P. F. Mihancea, D. Ratiu, R. Wettel. iPlasma: An Integrated Platform for Quality Assessment of Object-Oriented Design. In Proceedings of the 21st International Conference on Software Maintenance (ICSM'05). Pp. 77–80. IEEE Computer Society, Washington, DC, USA, 2005.

[MW03] D. L. Moise, K. Wong. An Industrial Experience in Reverse Engineering. In WCRE ’03: Proceedings of the 10th Working Conference on Reverse Engineering. P. 275. IEEE Computer Society, Washington, DC, USA, 2003.

[NDG05] O. Nierstrasz, S. Ducasse, T. Gîrba. The story of Moose: an agile reengineering environment. In ESEC/FSE-13: Proceedings of the 10th European software engineering conference held jointly with 13th ACM SIGSOFT international symposium on Foundations of software engineering. Pp. 1–10. ACM Press, New York, NY, USA, 2005.

[Rat04] D. Ratiu. Memoria: A Unified Meta-Model for Java and C++. Master's thesis, Politehnica University of Timisoara, Romania, 2004.

[Riv02] C. Riva. Architecture Reconstruction in Practice. In WICSA 3: Proceedings of the IFIP 17th World Computer Congress - TC2 Stream / 3rd IEEE/IFIP Conference on Software Architecture. Pp. 159–173. Kluwer, B.V., Deventer, The Netherlands, The Netherlands, 2002.

[RVW07] M. Rieger, B. Van Rompaey, R. Wuyts. Teaching FAMIX about the Preprocessor. In 1st Workshop on FAMIX and Moose in Reengineering (FAMOOSr’07). Pp. 13–16. 2007.

[SE01] M. Stojanovic, K. E. Emam. ES2: A Tool for Collecting Object-oriented Design Metrics for the C++ and Java Source Code. Technical report NRC/ERB-1088, National Research Council Canada, 2001.

[SHE02] S. E. Sim, R. C. Holt, S. Easterbrook. On Using a Benchmark to Evaluate C++ Extractors. In IWPC '02: Proceedings of the 10th International Workshop on Program Comprehension. P. 114. IEEE Computer Society, Washington, DC, USA, 2002.

[SKL+02] R. Schauer, R. K. Keller, B. Laguë, G. Robitaille, S. Robitaille, G. Saint-Denis. Advances in software engineering: comprehension, evaluation and evolution. Chapter The SPOOL design repository: architecture, schema, and mechanisms, pp. 269–294. Springer-Verlag New York, Inc., New York, NY, USA, 2002.

[SKN05] S. Sarkar, A. C. Kak, N. S. Nagaraja. Metrics for Analyzing Module Interactions in Large Software Systems. In APSEC '05: Proceedings of the 12th Asia-Pacific Software Engineering Conference (APSEC'05). Pp. 264–271. IEEE Computer Society, Washington, DC, USA, 2005.

[TD01] S. Tilley, M. DeSouza. Spreading Knowledge about Gnutella: A Case Study in Understanding Net-Centric Applications. In IWPC '01: Proceedings of the 9th International Workshop on Program Comprehension. Pp. 189–198. IEEE Computer Society, Los Alamitos, CA, USA, 2001.

[TMR02] A. Telea, A. Maccari, C. Riva. An Open Visualization Toolkit for Reverse Architecting. In IWPC '02: Proceedings of the 10th International Workshop on Program Comprehension. P. 3. IEEE Computer Society, Washington, DC, USA, 2002.

[TTDS07] P. Tonella, M. Torchiano, B. Du Bois, T. Systä. Empirical studies in reverse engineering: state of the art and future trends. To appear in Journal on Empirical Software Engineering, 2007.

[WM96] A. Watson, T. McCabe. Structured Testing: A Testing Methodology Using the Cyclomatic Complexity Metric. Technical report, National Institute of Standards and Technology, USA, 1996.


The Use of Executable FIT Tables to support Maintenance and Evolution Tasks

Filippo Ricca1, Marco Torchiano2, Massimiliano Di Penta3, Mariano Ceccato4 and Paolo Tonella5

1 fi[email protected] Unit CINI at DISI, Genova, Italy

2 [email protected] Politecnico di Torino, Italy

3 [email protected] University of Sannio, Benevento, Italy

4 [email protected] 5 [email protected] Fondazione Bruno Kessler—IRST, Trento, Italy

Abstract: Acceptance testing is a kind of testing performed prior to software delivery. In the agile approach, acceptance test cases can be specified by analysts and customers during the requirement elicitation phase and used to support the development/maintenance activities. This paper reports a pilot experiment that investigates the usefulness of executable acceptance test cases, developed by using FIT (Framework for Integrated Test), during software maintenance and evolution activities. The preliminary results indicate that FIT tables help developers to correctly perform the maintenance/evolution tasks without affecting productivity.
Keywords: Empirical studies, acceptance testing, FIT tables.

1 Introduction

FIT (Framework for Integrated Test) is an open source framework used to express executable acceptance test cases in a simple way. FIT lets analysts write acceptance tests (FIT tables) using simple HTML tables. Programmers write code (Fixtures) to link the test cases with the system to be verified. Then, in a test-driven development scenario, they perform their development or maintenance task supported by the execution of these test cases. In this paper we describe a controlled experiment aimed at assessing whether FIT tables are helpful in maintenance tasks. We asked master students to execute some corrective maintenance and evolution interventions, providing them with the systems to be maintained, with and without the FIT tables. The research questions that we are interested in answering are:

RQ1: Does the presence of FIT tables help programmers to execute maintenance tasks?


RQ2: Does the presence of FIT tables improve the productivity in the execution of maintenance interventions?

The dependent variable “correctness” was measured by exercising an alternative JUnit1 acceptance test suite, and the variable “productivity” using time sheets where students annotated start and stop times expressed in minutes. Preliminary results of our experiment show that FIT tables help developers to correctly perform the four given maintenance/evolution tasks without affecting productivity (the difference between the two groups in the time to complete the tasks was not significant). Although there are several papers [Aar06, RMM05] and books [MC05] describing acceptance testing with FIT tables, only a few works report empirical studies about FIT. The most related work is the paper by Melnik et al. [MRM04]. It is a study focused on the use of FIT user acceptance tests for specifying functional requirements. It has been conducted at the University of Calgary and at the Southern Alberta Institute of Technology. In this experiment, the authors showed that the use of FIT tables and the possibility to execute them improve the comprehension of requirements. In another preliminary study [TRD07] some of the authors of the present paper found statistically significant evidence that the availability of FIT tables allows programmers to complete more maintenance tasks. However, they did not measure, as we did in the present study, whether completed maintenance tasks were correct. The paper is organized as follows: Section 2 briefly presents the Framework for Integrated Test (FIT). Section 3 describes the design of the empirical study that we conducted. Results are presented in Section 4, while conclusions and future work are given in Section 5.

2 FIT tables, Fixtures and Test Runner

The FIT tables serve as the input and expected output for the tests. Figure 1 shows an example of a Column FIT table, a particular kind of table (see [MC05] for the other types, such as action, row, etc.) where each row represents a test case. The first five columns are input values (Name, Surname, Address, Date of birth and Credit/Debit) and the last column represents the corresponding expected output value (Member number()). Developers write the Fixtures to link the test cases with the system to be verified. A component in the framework, the Test Runner, compares FIT table data with actual values obtained from the system. The Test Runner highlights the results with colors (green = correct, red = wrong). The relationships among FIT tables, Fixtures, Test Runner and System under test are shown in Figure 2.
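A column fixture in Java is a class extending fit.ColumnFixture: public fields receive the input cells of a row, and public methods (the columns whose names end in parentheses) compute the expected-output cells that the Test Runner compares against the table. The sketch below is a hypothetical fixture written for illustration; it is not taken from the LaTazza or AveCalc material, and the class and column names are invented.

```java
import fit.ColumnFixture;

// Hypothetical fixture for a column FIT table with two input columns and two
// output columns. Requires the FIT library (fit.jar) on the classpath.
public class SellSmallBagsFixture extends ColumnFixture {
    // input columns of the table are bound to these public fields
    public int bagsRequested;
    public int bagsInStock;

    // output column: the Test Runner compares the returned value with the
    // expected value written in the corresponding table cell
    public boolean saleAccepted() {
        return bagsRequested > 0 && bagsRequested <= bagsInStock;
    }

    // second output column: remaining stock after the (attempted) sale
    public int bagsLeft() {
        return saleAccepted() ? bagsInStock - bagsRequested : bagsInStock;
    }
}
```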

3 Experiment definition, design and settings

We conceived and designed the experiment following the guidelines by Wohlin et al. [WRH+00]. The goal of the study is twofold: to analyze the use of FIT tables with the purpose of evaluating their usefulness during maintenance tasks, and to measure the additional effort (if any) they require. The perspective is both that of researchers, evaluating how effective FIT tables are during maintenance activities, and of project managers, evaluating the possibility of adopting FIT tables in her/his

1 http://www.junit.org/


organization. The context of the experiment consists of two objects – two Java systems – and of 13 subjects, students from a master course. All the material of the experiment (sources, documents, questionnaire, etc.) will be available for replications on a Website soon.

Figure 1: Example of a Column FIT table. Column names without parentheses represent input; parentheses indicate output.

3.1 Hypotheses

The null hypotheses for the study are the following:

• H0a The availability of FIT test cases does not significantly improve the correctness of the maintained source code.

• H0b The availability of FIT test cases does not significantly affect the effort in the maintenance task.

The context in which we investigate the above questions has the following characteristics: (1) system requirements have been written in detail, (2) automated acceptance tests have been produced in the form of FIT tables and (3) some change requirements are expressed only in textual form, while others also include an automated FIT test case.

3.2 Treatments

The treatments for the main factor (availability of test cases) are:

• (+) textual change requirements enhanced with FIT tables and fixtures, thus enabling test case execution;

• (-) only textual change requirements.


Figure 2: The complete picture

Other independent variables (not accounted for in this paper) to be considered could be: the objects, the labs and the subjects' ability, if available.

3.3 Objects

The objects of the study are two simple Java programs realized by students: LaTazza and AveCalc. LaTazza is a coffee-maker management support application. LaTazza helps a secretary to manage the sale and the supply of small-bags of beverages (Coffee, Tea, Lemon-tea, etc.) for the coffee maker. The application supports two kinds of clients: visitors and employees (university employees and professors). Employees can purchase beverages with cash or on credit, visitors only with cash. The secretary can: sell small-bags to clients, buy boxes of beverages (a box contains 50 beverages of the same kind), manage credit and debt of the employees, check the inventory and check the cash account. The system consists of 18 Java classes for a total of 1121 LOCs. Its requirement document comprises 9 requirements (see Table 1 for the first four requirements) complemented with a total of 16 FIT tables. AveCalc is a simple desktop application that manages an electronic register (record book) for master students. A student can add a new exam to the register, remove an existing exam and remove all exams. An exam has a name, a CFU (a positive number that represents the university credits) and an (optional) vote. An exam without a vote is an exam not yet taken. The vote must be between 0 and 30 (inclusive). If the vote is >= 18 then the vote is positive, otherwise it is negative. It is possible to save the register and to load it (all data or only positive exams). AveCalc computes some statistics: average of the exams passed, total number of CFU, number of exams passed, (hypothetical) degree vote and whether the student has passed a number of exams sufficient to defend his/her thesis. The system consists of 8 Java classes for a total of 1827 LOCs.


Table 1: Some Requirements for LaTazza.
R1 The secretary can sell small-bags of Coffee, Arabic Coffee, Tea, Lemon-tea and Camomile-tea. The cost of each small-bag is 0.62 euro. The secretary can select the kind of beverage and the number of small-bags and select the button Sell. If there are enough small-bags then the sale is done, otherwise the sale cannot be done.
R2 The secretary can register a payment. She/He has to select the employee that performs the payment. This payment can extinguish a debt of the employee or it can be used in the future as an advance fee. The payment must be > 0.
R3 The secretary can buy boxes of beverages. A box contains 50 small-bags of beverages, all of the same kind (i.e., 50 Coffee or 50 Arabic Coffee, etc.). Each box costs 31 euro.
R4 The secretary can request the list of debtors with their debts.

Its requirement document comprises 10 requirements, complemented with a total of 19 FIT tables.

3.4 Population

The subjects were 13 students from the course of Laboratory of Software Analysis, in the last year of the master degree in computer science at the University of Trento. The students had a good knowledge of programming, in particular Java, and an average knowledge of software engineering topics (e.g. design, testing, software evolution). Subjects had been trained in the meaning and usage of FIT tables and FitNesse2, i.e., the tool that implements the FIT table approach used in the experiment.

3.5 Variables and experiment design

The dependent variables to be measured in the experiment are the code correctness and the effort required to perform the maintenance task. The code correctness is assessed by executing a JUnit acceptance test suite — developed by someone different from who developed the FIT tables — and measuring the percentage of test cases passed and failed. The effort was measured by means of time sheets (students marked start and stop times for each change requirement implemented). Time is expressed in minutes. We adopted a balanced experiment design (see [WRH+00]) intended to fit two lab sessions (2 hours each). Subjects were split into four groups, each one working in Lab 1 on all tasks of one system with one treatment, and in Lab 2 on the other system with the other treatment (see Table 2).

3.6 Material and Procedure

As already mentioned, the test cases are written in the form of FIT tables and the supporting environment is a FitNesse wiki. The development environment is based on the Eclipse IDE with the FitNesse plugin3. For each group we prepared an Eclipse project containing the software

2 http://www.fitnesse.org 3 http://www.bandxi.com/fitnesse/


Table 2: Experimental design (S1 = LaTazza, S2 = AveCalc; + = with FIT tables, - = without FIT tables).
        Group A   Group B   Group C   Group D
Lab 1   S1+       S1-       S2-       S2+
Lab 2   S2-       S2+       S1+       S1-

Table 3: Change requirements for LaTazza.
CR1 There is an error in show debtors. Only employees with a negative balance must be visualized. Fix the error.
CR2 There is an error in update employees. Not all the fields are updated. Fix the error.
CR3 The vendor of boxes of beverages changed his selling policy. For every five bought boxes, one is added as a gift.
CR4 Change the price of small-bags. Now the total price of the beverages that an employee would like to buy depends on (i) the number of small-bags bought and (ii) whether the beverage is seasonal or not. If an employee buys fewer than 5 small-bags, no discount is applied. If an employee buys between 5 and 10 small-bags of a seasonal beverage, no discount is applied; but if the beverages are not seasonal, a 1 euro discount is applied.

and a FitNesse wiki with both requirements and change requirements. The projects were zipped and made available on a Web server. The experiment was introduced as a lab assignment about FitNesse. Every subject received:

• a summary description of the application;

• instructions to set up the assignment (download the zipped Eclipse project, import it, and start the embedded FitNesse server);

• a post-experiment questionnaire.

For each lab the subjects had two hours available to complete the four maintenance tasks CR1–CR4 (see Table 3). The first two change requirements (corrective maintenance) are very easy to implement, while the third and fourth require more work to locate the code to be changed and to implement the change (evolution). The maintenance/evolution tasks for the two different systems are very similar and of comparable difficulty. The post-experiment questionnaire aimed at both gaining insights about the students' behavior during the experiment and finding justifications for the quantitative results. It included questions about the task and system complexity, the adequacy of the time allowed to complete the tasks and the perceived usefulness of the provided FIT tables. Before the experiment, subjects were trained by means of introductory lectures (2 lessons of 2 hours each) and laboratories (4 hours) on FIT. After subjects were randomly assigned to the four groups, the experiment execution followed the steps reported below:

1. We delivered a sheet containing the description of the system.

Proc. Software Evolution 2007 6/10 ECEASST

2. Subjects had 10 minutes to read the description of the system and understand it.

3. Subjects had to write their name and start time on the delivered sheet.

4. Subjects had to download the Eclipse project from the given URL and import it.

5. Subjects had to launch the Fitnesse wiki of the application.

6. Subjects had to write the stop time for installing the application.

7. For each change requirement (CR1-CR4):

(a) Subjects had to fix the application code (LaTazza or AveCalc) in order to make the test cases pass (treatment +) or to satisfy the change requirement (treatment -). (b) Subjects had to record the time they used to apply the change task (start/stop time).

8. Subjects were asked to fill in the post-experiment questionnaire.

4 Experimental results

There were 13 subjects, divided into three groups of three and one group of four. They took a median of 5 minutes to set up the environment and they worked for a median of 73 minutes on the tasks. On average, the subjects deemed 2.75 of the four assigned tasks complete. The subjects worked on each task for a time ranging from 11 to 39 minutes, with an average of 21. The distributions of passed tests and of the time required to complete tasks are not normal (Shapiro-Wilk test p=0.026 and p=6.9 · 10^-6 respectively); therefore we use the Mann-Whitney test for both hypotheses.

4.1 Data analysis

To test the first hypothesis (H0a) we compared the number of acceptance tests passed by the programs whose change requirements included FIT tables with those that did not. The boxplot summarizing the percentage (expressed as a fraction) of passed test cases is presented in Figure 3. By applying a one-tailed Mann-Whitney test, we found the difference to be statistically significant (p-value=0.03); therefore we can reject the null hypothesis. The second hypothesis can be tested by looking at the time required to complete the tasks. Since not all students completed all the tasks and since the tasks' difficulty varied both among tasks and systems, we analyzed the time for each task. Figure 4 shows the boxplot of the times used by subjects to complete each task; filled boxes correspond to the presence of FIT tables. To test the second hypothesis we used a Mann-Whitney test. Table 4 reports the p-values of the Mann-Whitney tests for each task. Overall, in 5 cases out of 8 (see Figure 4) we observe a reduction of time (considering the median) when FIT tables are present, but the only significant difference (highlighted in boldface in Table 4) is found for the first task on system AveCalc. With only this data we cannot reject the null hypothesis H0b. Further experiments are necessary to answer our second research question.

[Boxplot omitted: fraction of tests passed, grouped by whether FIT tables were present (no/yes).]

Figure 3: Boxplot of fraction of passed tests.

               AveCalc                              LaTazza
Task   p-value   median yes   median no    p-value   median yes   median no
1      0.01      8            18           0.83      12           15.5
2      0.33      6            12           0.57      15           9
3      1.00      40           43           0.53      39           29
4      0.63      28           17           0.45      10           26

Table 4: Analysis results on times to complete tasks.

4.2 Analysis of Survey Questionnaires

The analysis of the survey questionnaires that the subjects filled in after each experiment can be useful to better understand the experimental results. In this paper the analyses are supported only by descriptive statistics. Answers are on a Likert scale [Opp92] from 1 (strongly agree) to 5 (strongly disagree). Overall, all subjects agreed they had enough time to perform the tasks (I had enough time to perform the lab tasks, overall mean = 2.35) and the objectives were clear enough (The objectives of the lab were perfectly clear to me, overall mean = 1.73). The descriptions of the systems were clear (overall mean = 2.08), as were the change requirements (overall mean = 2.35). Similarly to Melnik et al. [MRM04], we can observe that the students deemed the FIT tables and the capability of running tests automatically useful enough. The possibility of executing FIT tables as tests was perceived as useful for performing the change (Running FIT tables are useful in maintenance/evolution tasks, mean = 1.69). Moreover, FIT tables were also considered useful per se to clarify change requirements (FIT tables are useful to clarify change requirements, mean = 1.92). See [RTCT07] for another experiment with students treating the research question: “Are FIT tables able to clarify (change) requirements?”.

[Boxplots omitted: time to complete each task in minutes, grouped by FIT presence (no/yes), for tasks 1–4 of AveCalc and LaTazza.]

Figure 4: Boxplot of time required to complete task.

4.3 Threats to Validity

Threats to conclusion validity can be due to the sample size (only 13 subjects), which may limit the capability of statistical tests to reveal any effect. Threats to external validity can be related to (i) the simple Java systems chosen and (ii) the use of students as experimental subjects. Another threat to external validity is that (iii) the results are limited to FIT-based acceptance test suites, which may be rather different from other approaches to acceptance testing. Further studies with larger systems and more experienced developers are needed to confirm or contrast the obtained results.

5 Conclusion and Future Works

This paper reported a controlled experiment aimed at assessing the use of FIT executable acceptance test suites in the context of maintenance and evolution tasks. The obtained results indicate that FIT tables significantly help developers to correctly perform the maintenance tasks. Other than looking at requirements, developers continuously execute FIT test cases to (i) ensure that the FIT tables related to the change requirements pass and (ii) use the application-requirement FIT tables to regression-test the existing pieces of functionality. Regarding productivity, FIT tables may or may not help: on the one hand, they provide a guideline to perform the maintenance task; on the other hand, they require time to be understood

and executed. Further investigation is anyway necessary to answer our second research question. Future work will aim at replicating this study with a larger population of students, with professionals, and by using larger and more realistic experimental objects. Also, other metrics (e.g., number of change requirements completed) and other factors such as subjects' ability and experience will be taken into account.

Bibliography

[Aar06] J. Aarniala. Acceptance testing. In whitepaper. www.cs.helsinki.fi/u/jaarnial/jaarnial-testing.pdf. October 30 2006.

[MC05] R. Mugridge, W. Cunningham. Fit for Developing Software: Framework for Inte- grated Tests. Prentice Hall, 2005.

[MRM04] G. Melnik, K. Read, F. Maurer. Suitability of FIT user acceptance tests for specifying functional requirements: Developer perspective. In Extreme programming and agile methods - XP/Agile Universe 2004. Pp. 60–72. August 2004.

[Opp92] A. N. Oppenheim. Questionnaire Design, Interviewing and Attitude Measurement. Pinter, London, 1992.

[RMM05] K. Read, G. Melnik, F. Maurer. Examining Usage Patterns of the FIT Acceptance Testing Framework. In Proc. 6th International Conference on eXtreme Programming and Agile Processes in Software Engineering (XP2005). Lecture Notes in Computer Science, Vol. 3556, Springer Verlag, pp. 127–136, June 18-23 2005.

[RTCT07] F. Ricca, M. Torchiano, M. Ceccato, P. Tonella. Talking Tests: an Empirical Assessment of the Role of Fit Acceptance Tests in Clarifying Requirements. In 9th International Workshop On Principles of Software Evolution (IWPSE 2007). Pp. 51–58. IEEE, September 2007.

[TRD07] M. Torchiano, F. Ricca, M. Di Penta. ”Talking tests”: a Preliminary Experimental Study on Fit User Acceptance Tests. In IEEE International Symposium on Empirical Software Engineering and Measurement. (to appear) 2007.

[WRH+00] C. Wohlin, P. Runeson, M. Höst, M. Ohlsson, B. Regnell, A. Wesslén. Experimentation in Software Engineering - An Introduction. Kluwer Academic Publishers, 2000.


The evolution of the Linux build system

Bram Adams1, Kris De Schutter3, Herman Tromp2 and Wolfgang De Meuter4

1 [email protected] 2 [email protected] GH-SEL, INTEC, Ghent University Sint-Pietersnieuwstraat 41, 9000 Ghent, Belgium

3 [email protected] 4 [email protected] PROG, Vrije Universiteit Brussel Pleinlaan 2, 1050 Brussels, Belgium

Abstract: Software evolution entails more than just redesigning and reimplementing functionality of, fixing bugs in, or adding new features to source code. These evolutionary forces induce similar changes on the software's build system too, with far-reaching consequences for both overall developer productivity and software configurability. In this paper we take a look at this phenomenon in the Linux kernel from its inception up until the present day. We do this by analysing the kernel's build traces with MAKAO, our re(verse)-engineering framework for build systems, which helps in detecting interesting idioms and patterns in the dynamic build behaviour. Finding a good balance between obtaining a fast, correct build system and migrating in a stepwise fashion turns out to be the general theme throughout the evolution of the Linux build system.
Keywords: build system evolution, case study, Linux kernel, MAKAO

1 Introduction

The majority of software evolution research is targeted at direct participants in the evolution process like source code and design artifacts. While these play —of course!— a major role, much can be learnt from indirect evolution partners. The build system is one of them. A build system takes care of two things:

• it decides which components should be built as well as establishes any platform-dependent information needed to do so, and then

• it incrementally builds the system taking dependencies into account.

Before 1977, most people wrote their own ad hoc build and install scripts for this. Then, Feldman introduced a dedicated build tool named "make" [Fel79]. It was based on explicit declarative specifications of the dependencies between targets (executables, object files, etc.) in textual "makefiles", combined with imperative "recipes" (lists of shell commands) for building a target. A time stamp-based updating algorithm considerably improved incremental compilation of software projects and enhanced the quality of builds.
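To make this concrete, the following is a minimal Java sketch of the time stamp-based update rule just described. The Target class and its fields are illustrative assumptions for this sketch, not make's actual implementation.

import java.util.List;

// A minimal sketch of time stamp-based updating: a target is rebuilt when it is
// missing or older than one of its prerequisites, after those prerequisites have
// themselves been brought up to date.
class Target {
    String name;
    long timestamp;            // last modification time; 0 if the target does not exist yet
    List<Target> prerequisites;
    Runnable recipe;           // the imperative commands that (re)build this target

    Target(String name, long timestamp, List<Target> prerequisites, Runnable recipe) {
        this.name = name;
        this.timestamp = timestamp;
        this.prerequisites = prerequisites;
        this.recipe = recipe;
    }

    // Post-order traversal: bring prerequisites up to date first, then decide locally.
    void bringUpToDate() {
        boolean outOfDate = (timestamp == 0);
        for (Target p : prerequisites) {
            p.bringUpToDate();
            if (p.timestamp > timestamp) {
                outOfDate = true;
            }
        }
        if (outOfDate) {
            recipe.run();
            timestamp = System.currentTimeMillis();
        }
    }
}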

The underlying philosophy of "make" has influenced lots of other build tools, but real "make" systems are still in wide use today. One example of such a widely used build system can be found in the Linux kernel. Linux is a (by now famous) operating system, used both by enthusiasts as well as in industrial settings. It started out in 1991, when Linus Torvalds sent an email to the Minix newsgroup, stating that he had developed a free operating system he wanted to share with everyone. It was a monolithic kernel, in which device drivers were hardcoded (e.g. Finnish keyboard), with user space programs ported over from Minix. In a little over fifteen years, Linux has grown from this one-man hobby project into what is probably the largest open-source project on the planet.

Linux has been the target of many studies. The one which is of most relevance to this paper was done by Godfrey and Tu in [GT00], where its source code was studied from the perspective of software evolution. Contrary to Lehman's laws [LB85], the kernel exhibited a superlinear growth in size.

Source code, however, cannot evolve in isolation. We claim that the build system co-evolves with the source code. Every time new source code gets added, or existing modules are moved around, one is forced to deal with the build system in order to get a newly compiled software system. An agile build system gives developers more freedom to restructure and refactor the source code. Hence, traditional evolution steps should have an immediate impact on the build system. Yet, no work to quantify this claim exists yet (the build system was ignored in [GT00]). In this paper, we will therefore examine this relation by means of the Linux kernel. Considering various major kernel releases from the very beginning to the recent versions, we will investigate the various forces on its build system.

In Section 2 we will explain the setup of our case study on the Linux kernel and its build system. Our measurements will enable us to make three important observations. First (Section 3), we will investigate whether and how the build system evolves. Then (Section 4), we will look at the build system's complexity. Finally (Section 5), we will zoom in on build system maintenance activities. As this is still preliminary work, we will point out future research directions (Section 6) before we summarise this paper's contributions in Section 7.

2 Setup of the case study

The input for this case study consists of most of the pre-1.0 releases of Linux, as well as the major stable post-1.0 releases (up to the 2.6 series). This spans some 15 years of real-world development time. To each of these versions we applied David A. Wheeler's SLOCCount tool in order to calculate the physical, uncommented SLOC (Source Lines Of Code) of source code (.c, .cpp, etc.), build scripts (Makefile, Kbuild, etc.), configuration data (config.in, Kconfig, etc.) and supporting build files (.sh, .awk, etc.). Next, we compiled each of the kernels using an initial configuration we reused and enhanced with new kernel features as needed throughout the measurements. The build traces were fed into MAKAO [ADTD07], our re(verse)-engineering framework for build systems, in order to obtain the corresponding dependency graphs, as well as figures on the number of targets and dependencies. Finally, each dependency graph was loaded into MAKAO to visualise, query and filter it.

1 http://www.dwheeler.com/sloccount/


Figure 1: Linux kernel complexity (SLOC). [Plot: SLOC on a logarithmic scale versus date, September 1991 to September 2006, with one series each for source code, build, config and rest.]

More details on MAKAO can be found in [ADTD07], which explains the rationale behind it as well as its applicability in the realms of build maintenance. Now, we focus on one of the future work topics we mentioned there, i.e. learning things about the source code by looking at the build system. Therefore, it is important to know that MAKAO is built on top of GUESS [Ada06], a graph manipulation tool in which nodes and edges are objects with their own state and behaviour. They can be controlled using the embedded Gython scripting language. We can also reify the graph as Prolog facts and write logic rules to manipulate them. As such, dependency graphs can be inspected and studied in a more controllable way than by merely reading the build scripts.

3 Observation 1: the build system evolves

Figure 1 shows the growth over time of the number of non-comment and non-blank lines which make up both the source code, as well as the build system. The latter is made up of three parts: (1) the actual makefiles (build), (2) the configuration files which drive the selection process of what should get built (config), and (3) other tools and scripts which assist the actual build process (rest). This last category contains build-time scripts to extract symbol tables, install kernel components, etc. The observation to be made from this graph is that the build system evolves over time. Over the course of fifteen years the SLOC of the makefiles has grown by a factor of about 52. What is striking from this figure is the amount of work which has been put into the configuration and support system. These have grown from nothing in the first version to almost 60000 and 10000 lines resp., the former even getting ahead of the core build files. Based on this data a full evaluation of the build system should also include introspection of these assets, but this is not in the scope of this paper. The graph also gives a strong indication of co-evolution between source code and build system, as both more or less follow the same growth pattern. More data, e.g. at the subsystem level [GT00], is needed to identify the exact relationship.

Figure 2: Linux build system targets. [Plot: number of targets (#targets) versus date, September 1991 to September 2006, for the modules, dep and all/vmlinux build phases.]

4 Observation 2: complexity increases

In order to assess the evolution of the complexity of the build system we will take a look at some figures relating to the actual makefiles (as noted in the previous section, we will ignore configuration files and other supporting scripts). Figure 2 presents the growth over time of the number of targets which participate in a build. This point needs some elaboration. The Linux build process is divided into a number of phases, of which we will consider the most important ones. The kernel image is either built by a phase named all or vmlinux (starting from version 2.6.0 in 2003), while modules are built within the modules stage. Extraction of dependencies occurs during the dep-phase. As MAKAO works on dynamic traces of the build system [ADTD07], what Figure 2 shows is the number of targets checked by the build process during a concrete run of each of the five phases2.

2 Every kernel was built from scratch, i.e. we did not measure incremental builds.

Figure 3: Linux build explicit dependencies. [Plot: number of explicit dependencies (#deps) versus date, September 1991 to September 2006, for the modules, dep and all/vmlinux build phases.]

Overall, Figure 2 reflects the point made in the previous section: the build system grows, not only in lines of code, but also in the number of tasks it attempts to complete. Starting from the 2.6 kernel in 2003, the dep-phase disappears as it is subsumed by the other two phases.

Figure 3 and Figure 4 show the growth over time of the number of explicit and implicit build dependencies respectively, for each of the five different build runs. Again, what we see here is a huge growth up to September 2000 (version 2.4 of the kernel), followed by a serious dip, the reason for which will become clear in the next section. We also notice that eventually the number of dependencies rises again, albeit at a slower pace. On the other hand, the number of targets does not face a similar crisis; it just keeps increasing.

The first point to note here is related to what dependencies stand for. Basically, dependencies capture the relationships between targets. As their number grows, so does the complexity of understanding the relationships between the targets, which in their turn relate back to physical components of the software system and their interconnections. By "physical components", we mean the decomposition of the source code in a sufficiently succinct way such that the build system is able to create a working application from it. Hence, the growth of the number of dependencies shows at least that understanding the build system becomes harder and harder. More investigation is needed to verify whether this is also a symptom of growing complexity in the source code.

This problem becomes compounded when we consider the implications of Figure 4. What this shows is that there is also a steady growth in the number of implicit dependencies. This means that the number of relationships the build system knows nothing about is on the rise. This is not only problematic when trying to understand the build system; it also constitutes a potential source of build errors, and (at best) may lead to suboptimal builds.

3 Implicit dependencies are relationships which are not explicitly declared as makefile rule dependencies, but rather are buried inside a rule’s command list, e.g. as arguments of shell scripts.

Figure 4: Linux build implicit dependencies. [Plot: number of implicit dependencies (#deps) versus date, September 1991 to September 2006, for the modules, dep and all/vmlinux build phases.]

5 Observation 3: maintenance as driver of build evolution

Having observed that the build system evolves and that its complexity increases throughout, we will now examine efforts of the Linux build developers to reduce this growing complexity. From Figure 1, we can deduce that in the pre-1.0 era (1992-1993) and between 2.2.0 and 2.4.0 (1999-2000) effort has been spent to mitigate build complexity, as both periods show a decrease in SLOC of resp. fourteen and nine percent. In the first case, a new recursion scheme had been introduced, while in the second case the majority of the makefiles was rewritten in a more concise "list-style" manner [lina]. Figure 3 also points at some reduction attempts, e.g. between 1.2.0 and 2.0.0 (1995-1996) when common build logic was extracted into a shared build script called "Rules.make". However, the biggest maintenance step occurred between 2.4.0 and 2.6.0 (2001-2003), because:

• the separate dependency extraction step has vanished and is subsumed in the other phases;

• the number of explicit dependencies has dropped drastically to one third of the 2.4.0 level;

• the number of targets in the vmlinux build has increased by a factor of 2.3.


Figure 5: Linux 2.4.0 build process in phase all (without header file targets). [Dependency graph visualisation; node types: .o, .c.]

Figure 6: Linux 2.6.0 build process (phase vmlinux). [Dependency graph visualisation; node types: .o, .c, .h, dir.]


Figure 7: Circular dependency chain in the Linux 2.6.0 build system. [Graph with nodes __build, composite, simple1, simple2 and dir1, dir2, dir3, connected by numbered arrows 1 to 5 as discussed below.]

Figure 8: Phase vmlinux of the Linux 2.6.0 build process after filtering of complex idioms. [Dependency graph visualisation; node types: .o, .c, .h, dir.]

When looking at the dependency graph of the 2.4.0 kernel image build (all) in Figure 5, we clearly see the various object and source files clustered per directory. Figure 6 shows the corresponding build dependency graph of release 2.6.0. We see a very dense build, almost one huge cluster of nodes. Paradoxically, the dependency graph looks much more complex after the maintenance activities than before. There are various reasons for the graph's compactness. First, dependency generation now occurs as part of the build instead of as a separate phase. Second, a lot of dependencies are implicit and represent relations between object files and temporary files generated during the build, which were not there yet in the 2.4.0 case. Third, a couple of build idioms are in use, which tie together a lot of build targets. The most obvious one is the "FORCE" idiom.

4 We elided header file targets for clarity. 5 Header files belong to two localised clusters, and are not hidden for that reason.


Target FORCE is a so-called "phony" target, i.e. it should always be rebuilt as it is never up-to-date. The majority of build rules depends on FORCE to make sure that all dependencies are checked. To avoid that everything is recompiled, the rules' recipes always call a custom make function which decides whether the build target is still up-to-date (a rough sketch of this check is given at the end of this section). "Make" is solely used to express build dependencies and not to determine when rebuilding is necessary. Removing FORCE and the associated edges opens up the graph.

There are still some other peculiarities. The kernel image consists of a number of modules which are giant object files linked together from various smaller ones. It is easy to detect these "composite objects" [linb], as their only dependencies are object files, while their parent node is always the same target named __build. This one is actually the centre of the graph. The upper four nodes of Figure 7 illustrate this pattern.

In combination with these composites, a phenomenon we named "circular dependency chain" occurs. Figure 7 shows the basic pattern and its control flow. The central (phony) __build target depends on all composite objects (arrow 1), which in turn depend on simple objects (arrow 2). Strangely, the latter depend on every possible subdirectory of their enclosing directory (arrow 3), and there is a bidirectional dependency between these and __build (arrows 4 and 5). The following note in the main Makefile gives us a hint: "We are using a recursive build, so we need to do a little thinking to get the ordering right.". The circular dependency chain is actually a clever iteration strategy the Linux developers added to their recursive build in order to get rid of typical side-effects [Mil97]. Indeed, to avoid any influence of the evaluation order of targets, every subdirectory target will recheck __build to make sure that earlier targets are rebuilt if some of their dependencies have changed in the meantime. "Make"'s time stamp mechanism makes sure that this recursive algorithm stops.

Figure 8 shows the resulting graph after filtering out the above idioms using logic rules. E.g. the circular dependency chain of Figure 7 has been hidden. The result resembles Figure 5's structure, which is plausible, as the basic directory structure has remained stable between 2.4.0 and 2.6.0, and each directory more or less corresponds to a composite object. Only the recursive process has changed in the meantime to produce a higher-quality build at the expense of extra complexity, both during the build as well as for the developer.
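The following Java sketch models the FORCE idiom abstractly: every target is revisited on each build, and a custom check decides whether the recipe actually runs. The check shown here (prerequisites newer than the target, or a changed command line) is a plausible assumption for illustration, not the kernel's actual make code; the BuildTarget type and command database are hypothetical.

import java.util.List;
import java.util.Map;

// A rough model of the FORCE idiom: make's own time stamp comparison is defeated by a
// phony, never up-to-date prerequisite, so every rule is revisited and a custom check
// decides whether rebuilding is really needed.
class ForceStyleBuilder {
    // Records the command line used the last time each target was built (assumption).
    private final Map<String, String> commandDb;

    ForceStyleBuilder(Map<String, String> commandDb) {
        this.commandDb = commandDb;
    }

    void visit(BuildTarget t) {
        // Always called, because the phony FORCE prerequisite makes the target "out of date".
        if (needsRebuild(t)) {
            t.runRecipe();
            commandDb.put(t.name(), t.commandLine());
        }
    }

    private boolean needsRebuild(BuildTarget t) {
        boolean depNewer = t.prerequisites().stream()
                .anyMatch(p -> p.lastModified() > t.lastModified());
        boolean commandChanged = !t.commandLine().equals(commandDb.get(t.name()));
        return depNewer || commandChanged;
    }
}

interface BuildTarget {
    String name();
    String commandLine();
    long lastModified();
    List<BuildTarget> prerequisites();
    void runRecipe();
}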

6 Future work

We consider future work on this topic to consist of three major tracks. For one, we should broaden the scope of the study by also looking at other real-world systems, both open- and closed-source. This is needed in order to check whether our observations hold for a larger class of software.

Also, it is important to stress that we limited ourselves to one representative kernel configuration. As one of the reviewers noted, performing a configuration-aware study could teach us a lot too. Other potential sources of information are source code repositories, bug databases, etc.

The final track is to delve more deeply into this case. There are two main questions which have been left open in this paper, and which we believe need answering: (1) what part do the configuration files and supporting scripts play in the evolution of this build system, and (2) to what degree can we show a causal link between the evolution of the source code and that of the build system? As a corollary: can one claim that source code re-engineering approaches can only be effective if the corresponding build system consequences have been taken into account?

7 Conclusion

This paper presented a case study of the evolution of a real-world build system, namely that of the Linux kernel. We analysed its growth in number of source lines of code (SLOC), as well as in the number of targets and dependencies (both explicit and implicit) which are part of the makefiles. We also delved into some of the idioms which have shown up in recent kernel versions. From this case study we can make the following observations: (1) the build system evolves, (2) as it evolves it grows in complexity, and (3) it is maintained in order to deal with this growing complexity. These observations are in line with Lehman's laws of software evolution. They also strengthen our conviction that in order to modify/re-engineer the source code one will have to modify/re-engineer the build system. That is, part of the source code evolution bill is implicitly spent on evolving the build system.

Acknowledgements: The authors want to thank the anonymous reviewers for their sugges- tions. Bram Adams is supported by a BOF grant from Ghent University. Kris De Schutter received support from the Belgian research project AspectLab, sponsored by the IWT, Flanders.

Bibliography

[Ada06] E. Adar. GUESS: a language and interface for graph exploration. In CHI '06: Proceedings of the 2006 Conference on Human Factors in Computing Systems. Pp. 791–800. Montréal, Québec, Canada, April 2006.

[ADTD07] B. Adams, K. De Schutter, H. Tromp, W. De Meuter. Design recovery and maintenance of build systems. In ICSM '07: Proceedings of the 23rd International Conference on Software Maintenance. Paris, France, October 2007. To appear.

[Fel79] S. I. Feldman. Make - A Program for Maintaining Computer Programs. Software - Practice and Experience, 1979.

[GT00] M. W. Godfrey, Q. Tu. Evolution in Open Source Software: A Case Study. In ICSM '00: Proceedings of the International Conference on Software Maintenance. P. 131. IEEE Computer Society, Washington, DC, USA, 2000.

[LB85] M. M. Lehman, L. A. Belady (eds.). Program evolution: processes of software change. Academic Press Professional, Inc., San Diego, CA, USA, 1985.

[lina] Linux-kbuild mailing list. http://www.torque.net/kbuild/archive/.

[linb] Linux kernel build documentation. Linux 2.6.0 edition.

[Mil97] P. Miller. Recursive make considered harmful. Australian UNIX and Open Systems User Group Newsletter 19(1):14–25, 1997.


Extraction of Component-Based Architecture From Object-Oriented Systems

Sylvain Chardigny1, Abdelhak Seriai1, Mourad Oussalah2, Dalila Tamzalit2

1 Ecole des Mines de Douai 941 rue Charles Bourseul 59508 Douai, France {chardigny,seriai}@ensm-douai.fr

2 LINA, université de Nantes 2 rue de la Houssinière 44322 Nantes, France {mourad.Oussalah,dalila.Tamzalit}@univ-nantes.fr

Abstract: Software architecture modeling and representation are a main phase of the development process of complex systems. In fact, software architecture representation provides many advantages during all phases of the software life cycle. Nevertheless, for many systems, like legacy or eroded ones, there is no available representation of their architectures. In order to benefit from this representation, we propose, in this paper, an approach called ROMANTIC which focuses on extracting a component-based architecture from an existing object-oriented system. The main idea of this approach is to propose a quasi-automatic process of architecture recovery based on semantic and structural characteristics of software architecture concepts.

Keywords: Architecture recovery, software component, object-oriented, metrics, clustering.

Software architecture modeling and representation are a main phase of the development process of complex systems [BBGM05]. Architecture representation is based on the concepts of components and connectors in order to describe system structures at a high abstraction level. This representation provides many advantages during the software life cycle [SG96]. Indeed, having a representation of the software architecture makes exchanges between software architects and programmers easier. During maintenance and evolution phases, this representation helps to localize software defects and to reduce the risk of misplacing new system functionalities. Moreover, the distinction which exists, in this representation, between components and connectors makes the separation between functional and communication aspects explicit and consequently makes system comprehension and evolution easier. Finally, a component-based architecture is also useful in order to facilitate the reuse of some system parts represented as components.

However, most existing systems do not have a reliable architecture representation. Indeed, these systems could have been designed without an architecture design phase, as is the case for most legacy systems. In other systems, the available representation can diverge from the system implementation. This divergence between the representation and the reality of the system is the result of the erosion phenomenon. It appears, first, during the implementation phase due to gaps between the expected architecture and the implemented one. These gaps become greater because of the lack of synchronization between software documentation and implementation.

Taking into account the previous considerations, we propose an approach called ROMANTIC. It focuses on extracting a component-based architecture from object-oriented systems. Our extraction approach differs from existing ones in that it uses several guides, such as the semantics of architectural elements or architectural quality properties, in order to drive the extraction process.

The remainder of this paper is structured as follows. Section 1 introduces the principles of our extraction process. In Section 2, we present an evaluation of component semantic characteristics. The extraction process and a case study are respectively presented in Sections 3 and 4. Section 5 discusses related work. Conclusions and future work are given in Section 6.

1 ROMANTIC principles of software architecture extraction

1.1 Principle 1: extraction of software architecture from object-oriented systems

During the design process of the architecture, an architect uses requirement specifications and his skill and knowledge in order to determine the architectural elements: components, which describe functional computing, connectors, which describe interactions, and the configuration, which represents the topology of connections between components.

Architecture extraction is the reverse of the design process. Indeed, the extraction process uses the existing implementation code and the architect's skills to obtain a system abstraction. Consequently, extracting an architecture from an object-oriented system consists of finding a correspondence between object concepts (i.e. classes, interfaces, packages, etc.) and architectural ones (i.e. components, connectors, interfaces, etc.).

In ROMANTIC, we define a solution to the architecture extraction problem as a partition of the system classes. Consequently, an architecture is defined according to the following correspondences. Firstly, a component to be extracted corresponds to a collection of classes which can belong to different object-oriented packages. Each of these collections is called a "shape". We call "shape interface" the classes which have a link with some classes from the outside of the shape, e.g. a method call to the outside (cf. Fig. 1, left side). We identify the component interface set with the shape interface (cf. Fig. 1, right side). Secondly, we consider that connectors are all the links existing between shapes. Finally, we consider the architecture configuration as the collection of identified shapes. In order to avoid the duplication of objects in several components, we impose that each class must belong to only one shape. Consequently, the architecture configuration constitutes a partition of the system classes.
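As an illustration of this correspondence, the following minimal Java sketch models a shape as a set of classes and derives its shape interface. The OoClass interface and its linkedClasses() relation are assumptions made for the example, not part of ROMANTIC.

import java.util.HashSet;
import java.util.Set;

// A shape is a set of classes; its interface is the subset of classes that have at
// least one link to a class outside the shape.
class Shape {
    final Set<OoClass> classes = new HashSet<>();

    // Shape interface: classes linked (e.g. by a method call) to classes outside the shape.
    Set<OoClass> shapeInterface() {
        Set<OoClass> itf = new HashSet<>();
        for (OoClass c : classes) {
            for (OoClass target : c.linkedClasses()) {
                if (!classes.contains(target)) {
                    itf.add(c);
                    break;
                }
            }
        }
        return itf;
    }
}

interface OoClass {
    // Classes this class refers to (method calls, attribute accesses, parameters, ...).
    Set<OoClass> linkedClasses();
}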

1.2 Principle 2: guides of the architecture extraction process

The first principle answers how to extract architectural elements from object-oriented ones. However, the issue of how to extract a relevant architecture remains. The second principle gives an answer to that issue. Indeed, we consider that an architecture is relevant if it respects four guides (cf. Fig. 2).

1 ROMANTIC: Re-engineering of Object-oriented systeMs by Architecture extractioN and migraTIon to Component-based ones.


Figure 1: Object/component correspondences

Firstly, it must be semantically correct. Secondly, it must have good quality properties. Thirdly, it must respect precisely the recommendations of the architect and, as far as possible, the specifications and constraints defined in the documentation. Finally, it must be adapted to the properties of the hardware architecture.

Figure 2: Extraction guides

We think that, using all these guides, our process provides a relevant component-based architecture representation. However, among all these guides, the one which uses the semantics of architectural elements is preponderant. That is why, in this work, we start by focusing on defining an evaluation function of software architecture semantics according to the measurable properties of its characteristics.


2 Evaluation of software component semantic: a proposition

In order to define our evaluation function, we first characterise what a semantically correct architecture is. An architecture is semantically correct if its elements (components, connectors and configuration) are semantically correct. Nevertheless, the study of connector and configuration semantics constitutes future work. For these reasons, we study, in the following, component semantic characteristics and define functions in order to measure them. This study is based on the most commonly accepted definitions of software components, which include those associated with architectural ones. Indeed, this choice will make the extracted architecture easier to reify, which can be helpful in order to achieve a migration from an object-oriented system to a component-based one, for example. Finally, we use the evaluation functions of the semantic characteristics of components in order to define our evaluation function of software architecture semantics.

2.1 Semantic of a software component

Many definitions exist, each of which characterizes a component somewhat differently. Nonetheless, some important commonalities exist among the most prevalent definitions. Szyperski defines, in [Szy98], a component as a unit of composition with contractually specified interfaces and explicit context dependencies only; a software component can be deployed independently and is subject to composition by third parties. In [HC01], Heineman and Councill define a component as a software element that conforms to a component model and can be independently deployed and composed without modification according to a composition standard. In combining and refining the common elements of these definitions and others commonly accepted [LH02], we obtain the following definition of a component:

A component is a software element that (a) can be composed without modification, (b) can be distributed in an autonomous way, (c) encapsulates an implementation of functionality, and (d) adheres to a component model.

In our approach, the definition of a component model is the one of Luer [LH02]: a component model is the combination of (a) a component standard that governs how to construct individual components and (b) a composition standard that governs how to organize a set of components into an application and how those components globally communicate and interact with each other. As compared to the definitions of Luer and of Heineman and Councill, we intentionally do not include the criterion that a component must adhere to a composition theory, nor the properties that a component be self-descriptive, pre-packaged, and easy to install and uninstall. These are covered through the criterion that a component must adhere to a component model and do not need to be repeated. In conclusion, according to our software component definition, we identify three semantic characteristics of software components: composability, autonomy and specificity, the latter meaning that a component must contain a limited number of functionalities.


2.2 Evaluation of the software component semantic correctness

In the previous section, we have identified three semantic characteristics that we propose to evaluate. To do so, we use the characteristic measurement model given by the norm ISO-9126 [ISO01] (cf. Fig. 3.A). According to this model, we can measure the characteristic semantic correctness, which is refined into the previous three semantic characteristics, which are consequently considered as sub-characteristics. We define a set of component measurable properties (e.g. number of provided/required interfaces) for each sub-characteristic. Nevertheless, we cannot define metrics which measure these component properties directly on shapes. Consequently, we use our correspondence model (cf. Fig. 3.B) and match these component properties with shape properties. Finally, we propose metrics in order to measure these shape properties.

Figure 3: Characteristic measurement model in norm ISO-9126 and our approach

2.2.1 Component semantic sub-characteristics vs. component properties

Based on the study of the semantic sub-characteristics, we refine them into a set of component measurable properties. Thus, a component is autonomous if it has no required interface. Consequently, the property number of required interfaces should give us a good measure of component autonomy. Then, a component can be composed by means of its provided and required interfaces. However, a component will be more easily composed with another if the services in each interface are cohesive. Thus, the property average of service cohesion by component interface should be a correct measure of component composability. Finally, the evaluation of the number of functionalities is based on the following statements. Firstly, a component which provides many interfaces may provide various functionalities; indeed, each interface can offer different services. Thus, the higher the number of interfaces, the higher the number of functionalities can be. Secondly, if interfaces (resp. services in each interface) are cohesive (i.e. share resources), they probably offer closely related functionalities. Thirdly, if the code of the component is closely coupled (resp. cohesive), the different parts of the component code use each other (resp. common resources). Consequently, they probably work together in order to offer a small number of functionalities.

From these statements, we refine the specificity sub-characteristic into the following properties: number of provided interfaces, average of service cohesion by component interface, component interface cohesion, and component cohesion and coupling.

2.2.2 Component properties vs. shape properties

Measuring component properties directly on shapes is impossible. Consequently, according to our evaluation model (cf. Fig. 3.B), we link the component properties to shape measurable properties. Firstly, according to our correspondence model (cf. Fig. 1), the component interface set is linked to the shape interface. Consequently, the average of the interface-class cohesion gives a correct measure of the average of service cohesion by component interface. Secondly, the component interface cohesion, the internal component cohesion and the internal component coupling can respectively be measured by the properties interface-class cohesion, shape-class cohesion and shape-class coupling. Thirdly, in order to link the number of provided interfaces property to a shape property, we associate a component provided interface to each shape-interface class having public methods. Thanks to this hypothesis, we can measure the number of provided interfaces using the number of shape interface classes having public methods. Finally, the number of required interfaces can be evaluated by using the coupling between the component and the outside. This coupling is linked to the shape external coupling. Consequently, we can measure this property using the property shape external coupling.

2.3 Definition of the metrics used for the component semantic evaluation

In the previous sections, we have established links between component sub-characteristics and shape properties. In order to measure these properties, we need to define metrics according to the particularities of our approach.

The properties shape class coupling and shape external coupling require a coupling measurement. We define the metric "Coupling(E)", which measures the coupling of a shape E, and "CouplingExt(E)", which measures the coupling of E with the rest of the classes. They measure three types of dependencies between objects: method calls, use of attributes, and parameters of another class. In addition, they are percentages and are related through the equation couplingExt(E) = 100 − coupling(E). Due to lack of space, we cannot detail these metrics.

The shape properties average of interface-class cohesion, interface-class cohesion, and shape-class cohesion require a cohesion measurement. The first one requires a class cohesion measurement whereas the last ones require a class set cohesion measurement. For all of them, the cohesion measurement can be obtained by computing the cohesion of a method and attribute set, respectively of the class or of the class set. The metric "Loose Class Cohesion" (LCC), proposed by Bieman and Kang [BK95], measures the percentage of pairs of methods which are directly or indirectly connected. Two methods are connected if they use directly or indirectly a common attribute. Two methods are indirectly connected if a chain of connected methods connects them. This metric satisfies all our needs for the cohesion measurement: it reflects all sharing relations, i.e. sharing attributes in an object-oriented system, and it is a percentage. Consequently, we use this metric to compute cohesion in our process.
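The following Java sketch shows one straightforward way to compute LCC as described above, assuming a hypothetical MethodInfo abstraction that exposes the attributes a method uses directly or indirectly; it is not Bieman and Kang's original tooling.

import java.util.*;

// LCC: the percentage of method pairs that are directly or indirectly connected,
// where two methods are directly connected if they use a common attribute.
class LooseClassCohesion {

    static double lcc(List<MethodInfo> methods) {
        int n = methods.size();
        if (n < 2) return 100.0;

        // Union-find over methods: merge methods that share at least one attribute.
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                if (!Collections.disjoint(methods.get(i).usedAttributes(),
                                          methods.get(j).usedAttributes()))
                    union(parent, i, j);

        // Count pairs that end up in the same connected component (indirect connections
        // are captured transitively by the union-find structure).
        long connectedPairs = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                if (find(parent, i) == find(parent, j)) connectedPairs++;

        long allPairs = (long) n * (n - 1) / 2;
        return 100.0 * connectedPairs / allPairs;
    }

    private static int find(int[] p, int i) { return p[i] == i ? i : (p[i] = find(p, p[i])); }
    private static void union(int[] p, int i, int j) { p[find(p, i)] = find(p, j); }
}

interface MethodInfo {
    Set<String> usedAttributes(); // attributes read or written, directly or indirectly
}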


2.4 Evaluation of component semantic correctness

In the previous section, we have linked each sub-characteristic to a set of shape metrics. We exploit these links in order to define an evaluation function for each sub-characteristic. Then, we use these functions to define an evaluation function of the semantic correctness characteristic.

According to the links previously established between the sub-characteristics and the shape properties, we define the evaluation functions Spe, A and C, respectively of specificity, autonomy and composability, where nbPub(I) is the number of interface classes having a public method and |I| is the shape interface cardinality:

\[ Spe(E) = \frac{1}{5}\left(\frac{1}{|I|}\sum_{i \in I} LCC(i) + LCC(I) + LCC(E) + Coupling(E) - nbPub(I)\right) \]

\[ A(E) = couplingExt(E) = 100 - Coupling(E) \]

\[ C(E) = \frac{1}{|I|}\sum_{i \in I} LCC(i) \]

The evaluation of the semantic correctness characteristic is based on the evaluation of each sub-characteristic. That is why we define this function as a linear combination of the sub-characteristic evaluation functions (Spe, A, and C):

\[ S(E) = \frac{1}{\sum_i \lambda_i}\left(\lambda_1 \cdot C(E) + \lambda_2 \cdot A(E) + \lambda_3 \cdot Spe(E)\right) \qquad (1) \]

This linear form allows us to consider the sub-characteristics uniformly. The weight associated with each function allows the software architect to modify, as needed, the importance of each sub-characteristic.
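Equation (1) can be transcribed directly; the short Java sketch below assumes the three sub-characteristic scores have already been computed for a shape and simply combines them with the architect-chosen weights.

// A direct transcription of equation (1); the class name and method names are illustrative.
class SemanticCorrectness {
    final double lambda1, lambda2, lambda3; // weights chosen by the software architect

    SemanticCorrectness(double lambda1, double lambda2, double lambda3) {
        this.lambda1 = lambda1;
        this.lambda2 = lambda2;
        this.lambda3 = lambda3;
    }

    // composability = C(E), autonomy = A(E), specificity = Spe(E)
    double s(double composability, double autonomy, double specificity) {
        double sum = lambda1 + lambda2 + lambda3;
        return (lambda1 * composability + lambda2 * autonomy + lambda3 * specificity) / sum;
    }
}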

3 Component identification process

In order to identify components, we use a two-step process based on a clustering technique, which aims at organizing objects into groups whose members are similar in some way. The first step of our process is a hierarchical clustering whose result is a binary tree where each node represents a cluster, each leaf an initial element, and the root the final cluster. In a second step, we exploit this tree in order to obtain a partition of the classes.

The principle of the hierarchical clustering algorithm, defined by S. C. Johnson [Joh67], is shown in Fig. 4.A. At the beginning, this algorithm assigns each initial element to a cluster (line 4). Then, in each step, the two most similar clusters (function nearestCluster) are merged into a single cluster (lines 5 to 10). The process stops when all items are merged into a single cluster. From this single cluster we obtain a dendrogram which represents the shape hierarchy. Figure 4.B presents a dendrogram. In this example, the process steps are (a,c), (a,c,e), then (b,d) and finally (a,b,c,d,e).
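The agglomerative step can be sketched as follows in Java (the types are hypothetical, not ROMANTIC's implementation); in ROMANTIC, the similarity of two clusters would be the evaluation function S applied to the shape formed by their union.

import java.util.*;
import java.util.function.BiFunction;

// Agglomerative hierarchical clustering: start with one cluster per element and
// repeatedly merge the two most similar clusters until a single cluster remains,
// recording each merge as a dendrogram node.
class HierarchicalClustering<T> {

    static class Node<T> {
        final Set<T> items;
        final Node<T> left, right; // null for leaves
        Node(Set<T> items, Node<T> left, Node<T> right) {
            this.items = items; this.left = left; this.right = right;
        }
    }

    Node<T> cluster(Collection<T> elements, BiFunction<Set<T>, Set<T>, Double> similarity) {
        List<Node<T>> clusters = new ArrayList<>();
        for (T e : elements) clusters.add(new Node<>(Set.of(e), null, null));

        while (clusters.size() > 1) {
            int bestI = 0, bestJ = 1;
            double best = Double.NEGATIVE_INFINITY;
            for (int i = 0; i < clusters.size(); i++)
                for (int j = i + 1; j < clusters.size(); j++) {
                    double s = similarity.apply(clusters.get(i).items, clusters.get(j).items);
                    if (s > best) { best = s; bestI = i; bestJ = j; }
                }
            Node<T> a = clusters.get(bestI), b = clusters.get(bestJ);
            Set<T> merged = new HashSet<>(a.items);
            merged.addAll(b.items);
            // Remove the higher index first so the lower index stays valid.
            clusters.remove(bestJ);
            clusters.remove(bestI);
            clusters.add(new Node<>(merged, a, b));
        }
        return clusters.get(0); // root of the dendrogram
    }
}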


Figure 4: Hierarchical clustering algorithm

We use this algorithm to merge the system's object classes. The evaluation of cluster similarity consists of applying our evaluation function S(E) to the shape composed of the classes belonging to the union of the two clusters. In order to obtain a partition of the classes, we have to select nodes from the hierarchy resulting from the previous step. This selection is done by an algorithm based on a depth-first search which selects the nodes representing the shapes that will make the best components. For each node, we compare the result of the evaluation function S for the node's shape and for its children. If the node's result is lower than the average of the results of its two children, the algorithm continues with the children; otherwise, the node is identified as a shape, added to the partition, and the algorithm proceeds with the next node.
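The selection step can be sketched as a recursive traversal of the dendrogram produced by the previous sketch, where the score function plays the role of S(E); the types are again illustrative rather than ROMANTIC's actual code.

import java.util.*;
import java.util.function.Function;

// Descend the dendrogram and keep a node as a shape when its score is at least the
// average score of its two children; otherwise keep descending into the children.
class ShapeSelection<T> {

    List<Set<T>> select(HierarchicalClustering.Node<T> root, Function<Set<T>, Double> score) {
        List<Set<T>> partition = new ArrayList<>();
        visit(root, score, partition);
        return partition;
    }

    private void visit(HierarchicalClustering.Node<T> node,
                       Function<Set<T>, Double> score, List<Set<T>> partition) {
        if (node.left == null || node.right == null) { // leaf: a single class
            partition.add(node.items);
            return;
        }
        double own = score.apply(node.items);
        double childAvg = (score.apply(node.left.items) + score.apply(node.right.items)) / 2.0;
        if (own < childAvg) {           // the children look better: keep descending
            visit(node.left, score, partition);
            visit(node.right, score, partition);
        } else {                        // this node becomes a shape of the partition
            partition.add(node.items);
        }
    }
}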

4 Case study: Jigsaw

We have experimented with our extraction process on the Jigsaw software. Jigsaw is a Java-based web server. This software has been used to present architecture extraction processes in previous works [MJ06]. Consequently, the architecture of Jigsaw is well known and presented in Figure 5. Jigsaw contains around 300 classes. The result of the first step of the ROMANTIC tool is a dendrogram whose leaves are Jigsaw classes. The second step of our process extracts 12 components from this dendrogram. The study of this result showed that it can be improved by applying two simple rules: if the graph of a component's classes is disconnected, each connected part can form a new component; if a component is connected to only one other component, the two components are merged together (see the sketch below). After the application of these rules to the previous result, we obtain an architecture whose score for our evaluation function is 80%.

Figure 5 presents the Jigsaw architecture as envisaged, together with the architecture extracted by ROMANTIC. Our approach extracted 16 more components than expected. However, the comparison of these architectures shows that most of our components are sub-components of the envisaged ones.
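A possible reading of these two post-processing rules, reusing the OoClass abstraction from the earlier shape sketch, is given below. It performs a single pass and does not recompute component neighbours after each merge, so it is only an illustration of the idea, not ROMANTIC's implementation.

import java.util.*;

class PostProcessing {

    // Rule 1: if the class graph of a component is disconnected, each connected part
    // becomes a component of its own. Connectivity is explored through the outgoing
    // linkedClasses() relation only, which is a simplification.
    static List<Set<OoClass>> splitDisconnected(Set<OoClass> component) {
        List<Set<OoClass>> parts = new ArrayList<>();
        Set<OoClass> unvisited = new HashSet<>(component);
        while (!unvisited.isEmpty()) {
            Deque<OoClass> todo = new ArrayDeque<>();
            todo.push(unvisited.iterator().next());
            Set<OoClass> part = new HashSet<>();
            while (!todo.isEmpty()) {
                OoClass c = todo.pop();
                if (!unvisited.remove(c)) continue; // already visited or not in this component
                part.add(c);
                for (OoClass n : c.linkedClasses())
                    if (unvisited.contains(n)) todo.push(n);
            }
            parts.add(part);
        }
        return parts;
    }

    // Rule 2: a component connected to exactly one other component is merged into it.
    // The neighbour map is assumed to be precomputed from the inter-component links.
    static void mergeSingleNeighbour(Map<String, Set<String>> neighbours,
                                     Map<String, Set<OoClass>> components) {
        for (String name : new ArrayList<>(components.keySet())) {
            Set<String> ns = neighbours.getOrDefault(name, Set.of());
            if (ns.size() != 1) continue;
            Set<OoClass> target = components.get(ns.iterator().next());
            if (target != null && components.containsKey(name))
                target.addAll(components.remove(name));
        }
    }
}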


Figure 5: Focus result (envisaged architecture) and ROMANTIC result (extracted architecture)

Few of them are included, at the same time, in several envisaged components. The superposition of the architecture extracted by ROMANTIC and the envisaged one is shown in Fig. 6. The study of these results shows that the ROMANTIC components included in expected ones define a functionality which is a sub-functionality of the expected components. For example, the envisaged component protocolFrame includes a ROMANTIC component which manages frames on the web page.

5 Related work

Various works have been proposed in the literature to extract an architecture from an object-oriented system [PDP+07]. We distinguish these works according to two criteria: the process input and the technique used to extract the architecture.

The inputs of the extraction approaches are various. Most often an approach works from source code representations, but it may also consider other kinds of information, most of which are non-architectural. We can cite, for example, human expertise, which is used in an interactive way in order to guide the process [MJ06, KOV01], and physical organization, e.g. files, folders and packages [HRY95]. Some works use architectural input, like styles for example. Medvidovic [MJ06] uses styles in Focus in order to infer a conceptual architecture that will be mapped to a concrete architecture extracted from the source code. Finally, most works are based on human expertise: some use the expertise of the architect who uses the tool as an input, whereas others use the expertise of those who proposed the approach. In our approach we use architectural semantics in order to avoid this need for human expertise. This input will be completed by using the other guides and in particular the documentation, which is used, in existing works, through human expertise. Moreover, ROMANTIC will use several inputs, through several guides, whereas most works use only one or two inputs.

The techniques used to extract architectures are various and can be classified according to their automation level.


Figure 6: Superposition of extracted architectures

Most approaches are quasi-manual or semi-automatic. For example, Focus [MJ06] proposes a guideline for a hybrid process which regroups classes and maps the extracted entities to an idealized architecture obtained from an architectural style according to human expertise. Semi-automatic techniques automate repetitive aspects of the extraction process, but the reverse engineer steers the iterative refinement or abstraction, leading to the identification of architectural elements. This automation can be used in a bottom-up way, as in Dali [KOV01], or in a top-down way, as in ManSART [HRY95]. Finally, few techniques are quasi-automatic. Most of these techniques use clustering algorithms, as in our approach. The main difference between these approaches and ours is the definition of the similarity function. For example, in [AFL99], the similarity is defined in order to produce cohesive clusters that are loosely interconnected. In ROMANTIC, we refine the commonly used definitions of components into semantic characteristics, measurement models and, finally, a similarity function.

6 Conclusion

We propose, in this paper, an approach for the extraction of a component-based architecture from an object-oriented system. This approach is based on the use of component semantic characteristics. These characteristics guide the partitioning of the system classes and the abstraction of each shape into a component. The main difference with existing works on architecture extraction is that we refine the commonly used definitions of components into semantic characteristics and measurement models, whereas other works use the expertise of the authors in order to define rules driving the process.

The extraction of connectors is a first piece of future work. We need to study their semantics and decide when we will extract them: before, after or during the component extraction process. Finally, as we mentioned previously, the semantic characteristics that we use are not the only guide envisaged.

The use of other guides will be considered in our future work. We have started the study of architecture quality properties and the definition of an evaluation function considering these properties. We will also consider the use of design documents and of any additional information provided by the architect.

Bibliography

[AFL99] N. Anquetil, C. Fourrier, T. C. Lethbridge. Experiments with Clustering as a Software Remodularization Method. In Proc. of the Sixth Working Conf. on Reverse Engineering. P. 235. IEEE Computer Society, 1999.

[BBGM05] A. Bertolino, A. Bucchiarone, S. Gnesi, H. Muccini. An Architecture-Centric Approach for Producing Quality Systems. In QoSA/SOQUA. Pp. 21–37. 2005.

[BK95] J. M. Bieman, B.-K. Kang. Cohesion and reuse in an object-oriented system. In Proc. of the Symp. on Software Reusability, SSR '95. Pp. 259–262. 1995.

[HC01] G. Heineman, W. Councill. Component-based software engineering. Addison-Wesley, 2001.

[HRY95] D. R. Harris, H. B. Reubenstein, A. S. Yeh. Reverse Engineering to the Architectural Level. In Proc. of the Int. Conf. on Software Engineering, ICSE-17. Pp. 186–195. ACM, Inc., 1995.

[ISO01] ISO/IEC-9126-1. In Software engineering - Product quality - Part 1: Quality Model. ISO-IEC, 2001.

[Joh67] S. Johnson. Hierarchical clustering schemes. Psychometrika 32:241–245, 1967.

[KOV01] R. Kazman, L. O'Brien, C. Verhoef. Architecture Reconstruction Guidelines. Technical report, 2001.

[LH02] C. Luer, A. van der Hoek. Composition Environments for Deployable Software Components. Technical report, 2002.

[MJ06] N. Medvidovic, V. Jakobac. Using software evolution to focus architectural recovery. Automated Software Engineering 13:225–256, April 2006.

[PDP+07] D. Pollet, S. Ducasse, L. Poyet, I. Alloui, S. Cimpan, H. Verjus. Towards A Process-Oriented Software Architecture Reconstruction Taxonomy. In Proc. of the 11th European Conf. on Software Maintenance and Reengineering. Pp. 137–148. 2007.

[SG96] M. Shaw, D. Garlan. Software architecture: perspectives on an emerging discipline. Prentice-Hall, USA, 1996.

[Szy98] C. Szyperski. Component Software. ISBN: 0-201-17888-5. Addison-Wesley, 1998.


Exploring a Method to Detect Behaviour-Preserving Evolution Using Graph Transformation

Javier Pérez 1, Yania Crespo 2

1 [email protected], www.infor.uva.es/∼jperez 2 [email protected], www.infor.uva.es/∼yania 1,2 Departamento de Informática, ETSI Informática, Universidad de Valladolid, Valladolid, Spain

Abstract: One of the problems of documenting software evolution appears with the extensive use of refactorings. This paper explores a method, based on graph transformation, to detect whether the evolution between two versions of a software system can be expressed by means of a sequence of refactoring operations. For this purpose we extend a graph representation format to use it for simple Java programs, and we show a sample implementation of the method using the AGG graph transformation tool. In case a refactoring sequence exists, our technique can help reveal the functional equivalence between the two versions of the system, at least as far as refactorings can assure behaviour preservation.

Keywords: finding refactoring sequences, documenting software evolution, behaviour preservation, functional equivalence, graph transformation

1 Introduction

Refactorings [FBB+99, Opd92] are structural transformations that can be applied to a software system to perform design changes without modifying its behaviour. Efforts to include refactorings as a regular technique in software development have led refactoring support to be commonly integrated into development environments (e.g. the Eclipse Development Platform, IntelliJ IDEA, NetBeans, etc.). Finding refactorings, now that they are extensively used, is one of the problems of documenting and understanding software evolution [DDN00].

We explore a method, based on graph transformation [EEKR99, Roz97], to search for a refactoring sequence which can describe the evolution of a software system. A toy Java system will be used, as an example, to explain our approach. The AGG graph transformation tool [ERT99] is used for a sample implementation of the method.

The paper is organised as follows. Firstly, in Section 2 we show the example used to illustrate our proposed method. Section 3 introduces an extension of a graph format to allow representing simple Java programs as graphs, and Java refactorings as graph transformation rules. In Section 4 we describe our method to find refactorings using graph parsing. Section 5 describes a sample implementation using the AGG graph transformation tool and, in Section 6, we show the test run of this implementation with the previously formulated example. The analysis and comparison to related work of our approach is performed in Section 7. Finally, in Section 8 we present our future work and conclude.


2 A short example

We will use a toy Java system, slightly inspired by the LAN simulation example [JDM03, MT04], as a running example to illustrate our approach. Our original system (see Figure 1(a)) was designed to model printers for different document types and a hub to connect them, providing a single access point. A hierarchy of default and specific printers allows more printers to be added when needed. When the administrator notices that no documents other than PDF are sent, the original system is rashly modified to simplify the printer hierarchy (see Figure 1(b)). When a new system administrator arrives, he finds the two versions of the system, and no documentation about the changes performed between them. The problem here is not only to document the changes performed, but mainly to test whether the new system is functionally equivalent to the old one. For space reasons we have simplified the example by removing some method bodies.

(a) The original printing system:

//------ PrinterHub.java
public class PrinterHub {
    public void printWithPDF(PDFPrinter p){
        p.printPDF();
    }
}

//------ PDFPrinter.java
public class PDFPrinter extends Printer {
    public void printPDF(){
        // body of printPDF method
    }
}

//------ Printer.java
public class Printer {
    public String content;
    public void setContent(String c){
        this.content = c;
    }
    public void printDefault(){
        // body of printDefault method
    }
}

(b) The modified printing system:

//------ PrinterHub.java
public class PrinterHub {
    public void print(Printer p){
        p.print();
    }
}

//------ Printer.java
public class Printer {
    public String content;
    public void setContent(String c){
        this.content = c;
    }
    public void print(){
        // body of print method
    }
}

Figure 1: The printing system example

We know that the new system was obtained after applying the following sequence of refactorings over the old one:

1. removeMethod applied to Printer.printDefault();
2. pullUpMethod applied to PDFPrinter.printPDF(), placing it at Printer.printPDF();
3. renameMethod executed over Printer.printPDF(), which is renamed as Printer.print;
4. renameMethod executed over PrinterHub.printWithPDF(), which gets renamed as PrinterHub.print;
5. useSuperType applied over PrinterHub.print(PDFPrinter p), resulting in the new method signature PrinterHub.print(Printer p);
6. removeClass performed over the class PDFPrinter.

This sequence uses only refactorings supported by a refactoring tool. This restriction implies that it is composed of widely tested refactorings, which are also widely recognised to be behaviour-preserving. This guarantees that the sequence is behaviour-preserving and that both versions of the system are functionally equivalent. These transformations belong to the set supported by the Eclipse Development Platform. The removeMethod and removeClass refactorings, whose definitions can be found, for example, in [Opd92], are not provided by Eclipse. These refactorings can be performed manually, checking the preconditions and deleting the desired source code.

1 Eclipse Development Platform homepage: http://www.eclipse.org


3 Representing refactorings with graph transformation

To tackle the problem we use a formal representation for refactorings and object-oriented software. Refactorings involve modification of the system structure, and refactoring definitions must include preconditions and postconditions in order to guarantee that these changes are behaviour-preserving. Therefore, we have selected graph transformation [EEKR99, Roz97] as the underlying formalism, because it is naturally focused on the description and manipulation of structural information and also allows refactoring conditions to be modelled. The formalisation of refactorings with graph transformation is described and validated in [EJ05, MVDJ05]. In that work, programs and refactorings are represented with graphs, in a language-independent way, using a kind of abstract syntax trees with an expressiveness level which is adequate for the problem.

3.1 Java programs as graphs

We use the part of the formalism from [MVDJ05] referred to as "program graphs", related to software representation, to represent Java programs. As is suggested in that work, it is necessary to extend the graph format when representing programs containing specific elements and constructions of a particular language. Bearing this fact in mind, we have developed an extension to represent simple Java programs which we have named "Java program graphs". We describe the edges and nodes which build up this representation format in Tables 2(a) and 2(b), including the elements of the original format and our modifications. Regarding the contribution of our extension, it can be noticed, for example, that some new node types have been added to represent packages, interfaces, literals and operators, together with their relationship edges. We also included a type attribute on E nodes, in order to represent control structures and language constructions like a return or an instantiation (new) expression. The seq attributes added to E and P nodes allow us to represent the order of the sentences within a method body and the order of the parameters within the signature of a method. Using our extension, we can represent the old version of the printing system (see Figure 1(a)) as displayed in Figure 3. The graph shows that a prefix s: has been added to each node and edge type label. This prefix is needed by the implementation that will be described later.
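A hypothetical in-memory encoding of such a Java program graph could look as follows; the node and edge kinds mirror the labels of Figure 2, but the classes themselves are illustrative and are not AGG's representation.

import java.util.*;

// A typed graph: nodes carry a kind (package, class, method, expression, ...) plus
// attributes, and edges carry a kind (inheritance, membership, call, ...).
class ProgramGraph {
    // Lowercase edge constants deliberately mirror the labels used in Figure 2.
    enum NodeKind { PKG, I, C, M, MD, V, VD, P, E, O, L }
    enum EdgeKind { i, o, t, m, l, p, e, d, c, a, u }

    static class Node {
        final NodeKind kind;
        final Map<String, String> attributes = new HashMap<>(); // e.g. name, type, seq
        Node(NodeKind kind) { this.kind = kind; }
    }

    static class Edge {
        final EdgeKind kind;
        final Node source, target;
        Edge(EdgeKind kind, Node source, Node target) {
            this.kind = kind; this.source = source; this.target = target;
        }
    }

    final List<Node> nodes = new ArrayList<>();
    final List<Edge> edges = new ArrayList<>();

    Node addNode(NodeKind kind, String name) {
        Node n = new Node(kind);
        if (name != null) n.attributes.put("name", name);
        nodes.add(n);
        return n;
    }

    Edge addEdge(EdgeKind kind, Node source, Node target) {
        Edge e = new Edge(kind, source, target);
        edges.add(e);
        return e;
    }
}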

3.2 Refactorings as graph transformation rules

Graph transformation rules are similar to the derivation rules of string grammars. To apply a rule, a graph transformation system looks for a subgraph, within the graph currently being transformed, which matches the left-hand side of the rule. Then, the matched subgraph is transformed into the subgraph of the right-hand side. Rule constraints, such as negative application conditions, can be added. Negative application conditions (NACs) describe subgraphs which forbid the application of a rule if a match for the NAC exists.

The basis of formalising refactorings as graph transformation rules is presented in [EJ05, MVDJ05]. Figure 4(c) shows how we specify a simplified version of the removeMethod refactoring with the AGG graph transformation tool. The leftmost part of Figure 4(c) is the left-hand side of the rule. It represents a method signature (node type M) dynamically linked (edge type l) to a method definition (node type MD) which is a member (edge type m) of a class (node type C). The right-hand side describes the result of the rule application: all the elements that represent the method found by the left-hand-side matching subgraph have been removed from that subgraph. Figure 4(b) describes a NAC: if any expression (node type E) exists which represents a call (edge type c) to the method in the left-hand side, the rule cannot be applied. The numbers appearing in the graph elements stand for identity mappings between elements from the different parts of the rule (left-hand side, right-hand side and NAC). Attribute values, like name=mName, specify variables, i.e. parameters of the rule, which are substituted with values from the graph to find a match.


Label Edge Description i: C −→ C inheritance relationship between classes (reference on an “extends” Concept Node Attributes Description subexpression) package PKG Java package I −→ I inheritance relationship between interfaces (reference on an “extends” name: string name subexpression) interface I (same as package) Java interface C −→ I class which implements an interface (reference on an “implements” class C Java class subexpression) name: string name o: C −→ I an interface is observable from a class (“import” subexpression) visibility: enum visibility: [non-public, public, C −→ C a class is observable from a class (“import” subexpression) protected, private] I −→ I an interface is observable from an interface (“import” subexpression) hierarchy: enum order within a inheritance hierar- I −→ C a class is observable from an interface (“import” subexpression) chy: [abstract, middle, final] t: M −→ C type of a method static: boolean is static?: [true, false] V −→ C type of a variable or attribute method M (same as class) signature of a method P −→ C type of a formal parameter MD implementation of a method L −→ C type of a literal/constant variable V (same as class) atttribute or local variable m: MD −→ C body of method MD appears in class C VD attribute or local variable decla- VD −→ C declaration of an attribute within a class ration VD −→ MD declaration of a local variable within a method’s body parameter P formal parameter PKG −→ PKG a package belongs to another package seq: integer position of the parameter within C −→ PKG a class belongs to a package the parameter list I −→ PKG an interface belongs to a package expression E expression, subexpression or M −→ I method’s signature belongs to an interface sentence VD −→ I attribute declaration belongs to an interface type: enum type of expression: [new, return, l: M −→ MD binding of method’s signature to a method’s body if, else, for, loop, ... ] V −→ VD binding of an attribute or a local variable to its declaration seq: integer order within a sequence of p: M −→ P formal parameter declaration within a method’s signature subexpressions sharing the root e: MD −→ E expression within a method’s body node, i.e. sentence order E −→ E subexpression or hierarchical relationship between expressions operator O operator d: E −→ VD local variable declaration in subexpression type: string string that represents the opera- c: E −→ M method call in subexpression tor E −→ O operator used in subexpression literal L literal or constant a: E −→ V access to a local variable or attribute value: string string representation of the lit- E −→ P access to a parameter eral value E −→ L access to a literal (a) nodes and attributes u: E −→ V variable update expression E −→ P parameter update expression (b) edges Figure 2: Description of nodes and edges of Java program graphs right-hand side describes the result of the rule application. All the elements that represent the method found by the left-hand side matching subgraph, have been removed from that subgraph. Figure 4(b) describes a NAC: if any expression (node type E) exists, which represent a call (edge type c) to the method in the left-hand side, the rule can not be applied. The numbers appearing in the graph elements stand for identity mappings between elements from the different parts of the rule (left-hand side, right-hand side and NAC). Attribute values, like name=mName, specify variables, parameters of the rule, which are substituted with values from the graph to find a match. 
When the left-hand side of the rule is matched to a subgraph and a match for the NAC can not be found, the subgraph is transformed into the right-hand side of the rule. Application of the rule of Figure 4(c) removes a method from the class where it is declared.
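To make the rule structure concrete, here is a hedged sketch (using the networkx encoding assumed above, not AGG's actual rule format) of how the left-hand side and the NAC of removeMethod could be checked against a program graph; the helper names has_match and remove_method_applicable are invented, and a recent networkx version is assumed.

import networkx as nx
from networkx.algorithms import isomorphism

NODE_MATCH = isomorphism.categorical_node_match("kind", None)
EDGE_MATCH = isomorphism.categorical_multiedge_match("label", None)

def has_match(host, pattern):
    gm = isomorphism.MultiDiGraphMatcher(host, pattern,
                                         node_match=NODE_MATCH,
                                         edge_match=EDGE_MATCH)
    # a monomorphism suffices: the pattern may occur inside a larger context
    return gm.subgraph_is_monomorphic()

# LHS: method signature M linked (l) to a definition MD that is a member (m) of a class C
lhs = nx.MultiDiGraph()
lhs.add_nodes_from([("m", {"kind": "M"}), ("md", {"kind": "MD"}), ("c", {"kind": "C"})])
lhs.add_edge("m", "md", label="l")
lhs.add_edge("md", "c", label="m")

# NAC: some expression E calls (c) a method
nac = nx.MultiDiGraph()
nac.add_nodes_from([("e", {"kind": "E"}), ("m2", {"kind": "M"})])
nac.add_edge("e", "m2", label="c")

def remove_method_applicable(host):
    # Simplified: a faithful NAC is evaluated relative to each concrete LHS match
    # (the call must target the matched method), not globally over the host graph.
    return has_match(host, lhs) and not has_match(host, nac)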

4 A method to detect refactoring sequences

Using the representation format described in the previous section, we can apply graph parsing algorithms to search for a refactoring sequence between two different versions of a software system. We address the problem of finding this transformation sequence as a state space search problem. With this approach we identify: the original (old) system as the start state, refactoring operations as state-changing operations (edges), the refactored (new) system as the goal state, the problem of whether a refactoring sequence exists as a reachability problem, and a refactoring sequence as the path from the start state to the goal state. We will refer to the old version of the system as "source", to the new version of the system as "target", and we will use "current" for the state being explored at a certain point. The "current" state represents a system that is being transformed into the new version during the searching process. We propose a basic search algorithm to search for refactoring sequences (see Figure 5). In order to allow some kind of guided search and the future addition of heuristics, we use preconditions and postconditions of refactorings. Our approach therefore needs refactoring definitions which include preconditions and postconditions, as they are used in [KK04].

Figure 3: The original system of the example

Figure 4: Rule to search removeMethod. (a) NAC: method does not exist in target; (b) NAC: method call does not exist in source; (c) Rule: removeMethod.

5 Implementation in the AGG graph transformation tool

We have prepared a prototype with version 1.6.0 of AGG [ERT99]. AGG is a rule-based visual language, and tool, supporting an algebraic approach to graph transformation2. We have chosen AGG mainly because it allows rapid prototyping of graph transformation systems. Additionally, it supports graph parsing, which can be used to perform depth-first search with backtracking, allowing us to explore our algorithm directly, with little implementation effort.

2 AGG home page, graph grammar group, Technische Universität Berlin: http://tfs.cs.tu-berlin.de/agg


Graph source, current, target
List rules_to_apply, ruleset, sequence
Rule rule; Pair(Graph, List) node; Stack stack

initialise source, target, ruleset
copy source to current
rules_to_apply = find_rules(current, target, ruleset)
rule = rules_to_apply.remove_first
sequence = empty
while current is not isomorphic to target and
      rules_to_apply is not empty
do
    if rule is not empty then                        // candidate rule
        sequence.add_last(rule)
        current = apply_rule(current, rule)
        rules_to_apply = find_rules(current, target, ruleset)
        rule = rules_to_apply.remove_first
        stack.push(node(current, rules_to_apply))    // state
    else
        sequence.remove_last
        node = stack.pop                             // recover a previous state
        if node is not empty then
            current = node.graph; rules_to_apply = node.rules
            rule = rules_to_apply.remove_last
        end //if
    end //if
end //while
if current is isomorphic to target then
    return sequence    //success
else
    return empty       //fail
end //if

List find_rules(Graph source, Graph target, List ruleset)
do
    List rules_to_apply = empty;
    Rule rule;
    foreach rule in ruleset do
        if rule.precondition holds in source and
           rule.postcondition holds in target
        then
            rules_to_apply.add_last(rule)
        end //if
    end //foreach
    sort(rules_to_apply, random)
    return rules_to_apply
end //find_rules

Figure 5: Basic refactoring searching algorithm
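For readers who prefer an executable form, the following Python sketch mirrors the spirit of Figure 5. It is an assumption-laden illustration, not the authors' implementation: the Rule interface (precondition_holds, postcondition_holds, apply) is an invented name, and graph equality is delegated to networkx.

import random
import networkx as nx

def isomorphic(g1, g2):
    # attribute-aware graph-equality test, standing in for AGG's graph comparison
    return nx.is_isomorphic(g1, g2,
                            node_match=lambda a, b: a == b,
                            edge_match=lambda a, b: a == b)

def find_rules(current, target, ruleset):
    # candidate rules: precondition holds in the current graph,
    # postcondition holds in the target graph
    rules = [r for r in ruleset
             if r.precondition_holds(current) and r.postcondition_holds(target)]
    random.shuffle(rules)
    return rules

def search_refactoring_sequence(source, target, ruleset):
    current, sequence, stack = source, [], []
    rules_to_apply = find_rules(current, target, ruleset)
    while not isomorphic(current, target) and (rules_to_apply or stack):
        if rules_to_apply:
            rule = rules_to_apply.pop(0)
            stack.append((current, rules_to_apply))  # remember pre-application state
            sequence.append(rule)
            current = rule.apply(current)
            rules_to_apply = find_rules(current, target, ruleset)
        else:
            sequence.pop()                           # backtrack
            current, rules_to_apply = stack.pop()
    return sequence if isomorphic(current, target) else None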

AGG graph parsing, which is mainly oriented towards the parsing of visual languages [RS97], allows checking whether or not a particular graph belongs to the graph language generated by a graph grammar. A graph grammar includes an initial graph and a set of graph transformation rules; the set of graphs generated by a graph grammar defines a graph language. To perform graph parsing, AGG needs a “parsing grammar”, a grammar whose transformation rules are reversed so that they can transform a given graph in order to reduce it to the initial graph of the grammar. The tool needs at least: a set of graph transformation rules, the graph to be transformed, called the “host graph”, and the initial graph, called the “stop graph”. The parsing process is based on a simple backtracking algorithm. The tool randomly chooses graph transformation rules from the parsing grammar and iteratively applies them to the host graph. The process stops when there are no more applicable rules or when the host and stop graphs become isomorphic. To implement our method we use the source system representation as the host graph, and the target system representation as the stop graph. The host graph also acts as the “current” graph, the graph being transformed during the searching process. To guide the search, we have added another subgraph to the host and to the stop graph: a guidance graph, which is another copy of the target system graph. This graph allows us to prioritise the selection of refactorings whose “effects” or postconditions can be found within the target system. At each iteration, to select a refactoring for the sequence, its preconditions are searched for within the “current” graph and its postconditions within the guidance graph, which does not change during the parsing process. To distinguish the two subgraphs, we have added the prefix s: to labels in the source and target system graph, and t: to labels in the guidance graph. This can be seen in the rule of Figure 4. First, we implement a refactoring operation by specifying the transformation source and context in the left-hand side of the rule and the result on the right-hand side. These elements have their type labels prefixed with s: (see Figure 4(c)). Additional refactoring preconditions, which do not form part of the refactoring context, must be expressed as NACs formulated over the current graph (see Figure 4(b)). Refactoring postconditions are specified as conditions over the guidance graph. Positive postconditions are included within the left-hand side of the rule, while negative postconditions must be expressed as NACs. Figure 4(a) expresses that the removed method does not exist in the target system graph.
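A minimal sketch of this graph set-up, under the same assumptions as the earlier networkx examples (this is not how AGG stores its graphs), might look as follows; the helpers prefixed, build_host_graph and build_stop_graph are invented names.

import networkx as nx

def prefixed(graph, prefix):
    # copy a program graph, prefixing node identifiers, node kinds and edge labels
    g = nx.MultiDiGraph()
    for n, attrs in graph.nodes(data=True):
        a = dict(attrs)
        a["kind"] = prefix + a.get("kind", "")
        g.add_node(f"{prefix}{n}", **a)
    for u, v, attrs in graph.edges(data=True):
        a = dict(attrs)
        a["label"] = prefix + a.get("label", "")
        g.add_edge(f"{prefix}{u}", f"{prefix}{v}", **a)
    return g

def build_host_graph(source_graph, target_graph):
    # the "s:" part is transformed during parsing; the "t:" guidance part never changes
    return nx.union(prefixed(source_graph, "s:"), prefixed(target_graph, "t:"))

def build_stop_graph(target_graph):
    # parsing stops when the transformed "s:" part matches the target system
    return nx.union(prefixed(target_graph, "s:"), prefixed(target_graph, "t:"))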

6 Parsing the example

We launch the parser using the graph representation of the old system (see Figure 3) as the host graph, the graph representation of the modified system as the stop graph, and the search rules for the refactorings mentioned in Section 2 as parsing rules. In order to explore our approach, we have only implemented the rules needed by the example. Given the small size of the experiment, the resulting state space is finite and thus the parser has no problem finding a valid sequence. Termination of the algorithm depends on the size of the state space, which we believe can be restricted through the use of refactoring postconditions in the searching process. To test our approach we just needed raw output from the parser. We used the parser's debugging information to learn which rules have been applied, when backtracking occurs, the intermediate graphs, etc. From that debugging information, we extracted that the parser found the refactoring sequence 1 → 4 → 3 → 2 → 5 → 6. This sequence differs from the one we proposed in Section 2, but it is an equivalent one for our purposes and thus a valid result. The main goal of our approach was not to find the exact refactoring sequence which was originally applied; our goal is to find whether the system evolution was behaviour-preserving and to try to formulate this evolution in terms of a suitable refactoring sequence. This result is enough to detect the functional equivalence and to document the general intention of the changes applied. Even if the developer is not aware of having applied a behaviour-preserving sequence of changes, if this has been the case, it can be revealed. More experiments, with more refactoring rules, bigger systems and different graph transformation tools, are planned to test the scalability and obtain performance details of the approach.

7 Discussion

7.1 Related work

As far as we know, there are not many works dealing with finding refactorings, and most efforts have been addressed to mining refactorings that occur mixed with other changes which are not behaviour-preserving. Our approach focuses on computing whether or not a refactoring sequence exists between two versions of a software system. What we offer, through this refactoring sequence calculation, is a kind of test for functional equivalence between systems, as long as the refactorings we support assure behaviour preservation. Demeyer et al. [DDN00] use metrics to reveal refactorings with the purpose of helping a programmer to understand the evolution of a system. Their technique applies to a scenario where human examination of the source code is performed after the automatic analysis. The main problems of the approach are that it does not behave well with renamings and that it loses effectiveness when many changes have been applied to the same piece of code. With our approach, multiple refactorings executed over the same element can be found easily.


To apply a refactoring searching rule we initially impose the condition of finding the refactoring postcondition in the refactored system graph. The refactoring postconditions included in a rule definition can be more or less restrictive. Less restrictive postconditions allow the rule to be applied whether or not the refactoring's immediate result can be observed in the refactored system graph. The more restrictive the postconditions included in the refactoring searching rule definitions, the smaller the state space to search. We can fine-tune the refactoring searching rules to balance state space size against searching capability. Handling renamings is not a problem in our approach either; the previous argument applies in this case too. Renamings are expressed in the same way as any other refactoring, and we can increase the capability of finding renamings by making their postconditions less restrictive. Görg and Weißgerber [GW05] present an approach to find refactorings from consecutive transactions in CVS repositories. A transaction, in this context, is each new version of a system submitted to the repository. This work is based on pairing and comparing elements (classes and methods) between consecutive transactions, using element attributes such as name, parameters, return type, visibility, etc. The method presents good empirical results, but cannot detect refactorings when more than one change has been applied to a program element. This work also shows an interesting technique of change visualisation that helps in software comprehension. As already said, our approach allows us to fine-tune the searching rules to maximise the searching capability. We can adjust a rule so that it can be selected even if the effects of a refactoring are overridden by another one, making the first disappear from the refactored system graph. Dig et al. [DCMJ06] have developed an effective approach which has been shown to offer good empirical results. Their technique of shingles computing is oriented to finding refactorings that affect software APIs and behaves well at that level of detail. The limitation of this method is that it cannot deal with changes performed within method bodies. The advantage of our approach lies in its graph transformation basis: it is based on a structural representation of the source code which can be as detailed as needed in order to support refactorings at any level of detail.

7.2 Known problems and limitations

Our approach presents a technique that helps to check the functional equivalence between two versions of a software system. A parsing analysis with an affirmative result will reveal behaviour preservation between two versions of a system and will generate a refactoring sequence. The main limitation of our approach arises when analysing two versions which are not functionally equivalent, because the state space we search is probably not finite. Our searching algorithm is only partially correct: an affirmative result will undoubtedly generate a valid refactoring sequence, but the algorithm is not complete because termination cannot be guaranteed. Our parsing grammar does not belong to a graph grammar category with a termination condition, such as layered graph grammars [EEL+05], so it is not guaranteed that our searching method terminates. We believe that this can be overcome if we formulate the refactoring searching rules in a way that restricts the search to a finite state space. In spite of the parsing support provided by AGG, the tool lacks some key features needed to fully specify arbitrary refactoring searching rules. The rules that we have presented are sufficient to show the validity of our method, but it is quite obvious that they do not fully conform

to “real” refactoring operations implemented in a development tool. In rule removeMethod (see Figure 4), for example, a method is specified just by its name, and its context is defined just by its container class (which is also identified just by its name). This problem appears because the AGG graph language allows only one context to be specified per rule. We need to be able to specify a set of contexts within a single rule, and more expressiveness is needed for that. This can be achieved using, for example, path expressions as they are supported in the PROGRES graph transformation tool [Mün99]. It has also been difficult to represent some kinds of rules in AGG, namely those with an iterative nature, which take an undetermined number of steps to execute. The transformation control that AGG has supported was based on rule layering, and this is not sufficient to formulate that kind of rule. More complex execution control has recently been implemented in the latest versions of AGG, which allows a set of rules to be gathered and a main rule designated to fire them.

8 Results and future work

We have presented a technique based on graph transformation to find whether or not a refactoring sequence exists between two versions of a software system. Despite some limitations, the sample implementation that has been reviewed can be seen as a proof of concept, and it offers very promising results. The main limitation of our approach is that our searching method is only partially correct, and it may not terminate in certain cases. In order to solve this problem an exhaustive analysis of the state space must be done; we must determine whether we can formulate the refactoring searching rules so as to restrict the search to a finite state space. Once we have proved the validity of our approach, our immediate objective is to implement refactoring searching rules to support more refactoring operations and to measure the scalability of our technique on industrial-size systems. This will include improving the rule descriptions to take advantage of features in the newest versions of the AGG tool. From the raw information the parser outputs we are able to identify the refactoring sequence it finds, but this is only adequate for the purpose of testing our approach. There is a clear need to develop a front-end tool to present the refactoring sequence in a more convenient way. We also believe that building tool support is a fundamental step in demonstrating an approach. This is therefore planned to be provided as a plugin for the Eclipse Development Platform, which has strong refactoring support. To date, we have already developed an initial Eclipse plugin [MA06] to automatically obtain a Java program graph representation and to show the parser results.

Bibliography

[DCMJ06] D. Dig, C. Comertoglu, D. Marinov, R. Johnson. Automatic Detection of Refactorings in Evolving Components. In ECOOP 2006 - Object-Oriented Programming; 20th European Conference, Nantes, France, July 2006, Proceedings. 2006.

[DDN00] S. Demeyer, S. Ducasse, O. Nierstrasz. Finding refactorings via change metrics. In OOPSLA. Pp. 166–177. 2000.


[EEKR99] H. Ehrig, G. Engels, H.-J. Kreowski, G. Rozenberg (eds.). Handbook of Graph Grammars and Computing by Graph Transformations, Volume II: Applications, Languages and Tools. Volume 2. World Scientific, 1999.

[EEL+05] H. Ehrig, K. Ehrig, J. de Lara, G. Taentzer, D. Varró, S. Varró-Gyapay. Termination Criteria for Model Transformation. In Cerioli (ed.), FASE. Lecture Notes in Computer Science 3442, pp. 49–63. Springer, 2005.

[EJ05] N. V. Eetvelde, D. Janssens. Refactorings as Graph Transformations. Technical report, Universiteit Antwerpen, 2005.

[ERT99] C. Ermel, M. Rudolf, G. Taentzer. The AGG approach: language and environment. Volume 2 in [EEKR99], chapter 14, pp. 551–603, 1999.

[FBB+99] M. Fowler, K. Beck, J. Brant, W. Opdyke, D. Roberts. Refactoring: Improving the Design of Existing Code. Object Technology Series. Addison-Wesley, 1999.

[GW05] C. Görg, P. Weißgerber. Detecting and Visualizing Refactorings from Software Archives. In IWPC. Pp. 205–214. 2005.

[JDM03] D. Janssens, S. Demeyer, T. Mens. Case Study: Simulation of a LAN. Electr. Notes Theor. Comput. Sci. 72(4), 2003.

[KK04] G. Kniesel, H. Koch. Static Composition of Refactorings. Science of Computer Programming 52(1-3):9–51, 2004. Special issue edited by Ralf Lämmel, ISSN: 0167-6423, http://dx.doi.org/10.1016/j.scico.2004.03.002.

[MA06] B. Martín Arranz. Conversor de Java a grafos AGG para Eclipse. Master's thesis, Escuela Técnica Superior de Ingeniería Informática, Universidad de Valladolid, September 2006.

[MT04] T. Mens, T. Tourwé. A Survey of Software Refactoring. IEEE Transactions on Software Engineering 30(2):126–139, 2004.

[Mün99] M. Münch. PROgrammed Graph REwriting System PROGRES. In Nagl et al. (eds.), AGTIVE. Lecture Notes in Computer Science 1779, pp. 441–448. Springer, 1999.

[MVDJ05] T. Mens, N. Van Eetvelde, S. Demeyer, D. Janssens. Formalizing refactorings with graph transformations. Journal on Software Maintenance and Evolution: Research and Practice 17(4):247–276, July/August 2005.

[Opd92] W. Opdyke. Refactoring Object-Oriented Frameworks. PhD thesis, Department of Computer Science, University of Illinois at Urbana-Champaign, 1992. Also Technical Report UIUCDCS-R-92-1759.

[Roz97] G. Rozenberg (ed.). Handbook of Graph Grammars and Computing by Graph Transformations, Volume I: Foundations. Volume 1. World Scientific, 1997.

[RS97] J. Rekers, A. Schürr. Defining and Parsing Visual Languages with Layered Graph Grammars. J. Vis. Lang. Comput. 8(1):27–55, 1997.


Evolutionary success of Open Source software: an investigation into exogenous drivers

Karl Beecher, Cornelia Boldyreff, Andrea Capiluppi and Stephen Rank1

1 (kbeecher,cboldyreff,acapiluppi,srank)@lincoln.ac.uk Centre of Research on Open Source Software – CROSS Department of Computing and Informatics University of Lincoln, UK

Abstract: The “success” of a Free/Libre/Open Source Software (FLOSS) project has often been evaluated through the number of commits made to its configuration management system, the number of developers and the number of users. Based on SourceForge, most studies have concluded that the vast majority of projects are failures. This paper argues that the relative success of a FLOSS project also depends on the chosen forge and distribution: given a random sample of 50 projects contained within a popular FLOSS forge (Debian, which is the basis of the successful Debian distribution), we compared these with a similar sample from SourceForge, using product and process metrics such as size achieved and developers involved. The results show, first, that depending on the forge of FLOSS projects, researchers can draw different conclusions on the overall concept of success of FLOSS software. Secondly, the projects included in the Debian distribution benefit, on average, from greater evolutionary activity and a larger number of developers than the comparable projects on SourceForge. Finally, the Debian projects benefit from more activity and more developers from the point at which they joined this distribution.

Keywords: FLOSS, repositories, metrics, success, evolvability

1 Introduction

In terms of Lehman's first law of software evolution, it can be anticipated that a useful and widely used real-world software system, known as an evolutionary (or E-type) software system, must undergo continuing change, i.e. that it must evolve [LRW+97]. Some well-known Open Source projects, such as the so-called LAMP stack (Linux, Apache, MySQL, Perl), the Debian family, and the *BSDs, have achieved higher evolvability than others [MFH02]; these systems are categorised as E-type. Their evolvability is made possible by these projects attracting a large community of users as well as a strong base of developers. The user community initiates the need for change while the developers make it happen; both are key factors in the evolution process. The “success” of FLOSS projects has often been empirically evaluated via endogenous characteristics, such as the amount of development activity, the number of developers, or by using proxies for their pool of users. Moreover, the FLOSS literature has traditionally tackled this research topic by sampling well-known FLOSS forges (mostly SourceForge), and concluding that the vast majority are “unsuccessful” or “dead” projects [ES07]. Perhaps because they are

too specialised in their functionality, or perhaps through lack of publicity, they have never achieved widespread usage, nor attracted the developers that would drive their evolution. In the terms used in the literature, those projects never reached a ‘bazaar’ state, where users join in a self-sustaining cycle and become developers on the project [SM04, RG06, CM07]. Unless an open source evolutionary software project has enough developers to satisfy its users' needs for change, it is likely to fail. An interesting case arises when one open source project becomes incorporated into another, larger one; this is often found in Open Source operating system projects, such as Debian, e.g. in the case of new packages. In such a case, the incorporated package project becomes as widely distributed as the incorporating project and is potentially able to reach the same user base and benefit from the developer base of the incorporating project. This paper investigates the exogenous drivers of FLOSS evolvability, and studies whether the inclusion of a specific project in the same forge and distribution as a successful FLOSS project (Debian) has an influence on its evolutionary characteristics. In order to understand the influence of these drivers, we randomly sampled 50 projects from each of the two forges, Debian and SourceForge, and studied their evolution. Our goal is to determine whether the visibility given by inclusion in Debian increases the number of developers and their activity, compared to the SourceForge sample. We also studied the “entry point” of each project into the Debian forge and distribution, and evaluated its activity both prior to and after this event. This paper is structured as follows: Section 2 reviews related work in the area of FLOSS characterisation and section 3 introduces the traditional Goal-Question-Metric approach [BCR94], as applied to the research topic of this paper. Two major questions will be introduced and later instantiated in several hypotheses in section 4. Section 4.1 empirically evaluates and tests the hypotheses derived from the first question, while section 4.2 presents the results for the second question. Section 5 discusses threats to validity, and Section 6 presents our conclusions.

2 Related work

There are two main types of FLOSS literature, tentatively termed external and internal to the FLOSS phenomenon. Based on the availability of FLOSS data, the former has traditionally used FLOSS artefacts in order to propose models [HG05], test existing or new frameworks [CCP07, LHMI07], or build theories [ACPM01] to provide advances in software engineering. The latter includes several other studies that have analysed the FLOSS phenomenon per se ([SAOB02, Cap03, Ger04]), with their results aimed both at building a theory of FLOSS and at characterising the results and their validity specifically as inherent to this type of software and style of development. In this section we review some of the works of the latter category. The success and failure of FLOSS projects has been extensively studied in the past: some specific forges were analysed, and metrics were computed or extracted from the forges themselves. Examples include the use of the vitality and popularity indexes, computed by the SourceForge maintainers, which were used to predict other factors on the same forges [SA02], or the comparison of the status of projects between two different observations [FFH+02]. Data was also collected from SourceForge about community size, bug-fixing time and the popularity of projects, and used to apply some popular measures of success from information systems research to the FLOSS case [CAH03a]. The popularity of FLOSS projects has also been assessed using web

search engines [Wei05]. Other studies observed projects from SourceForge and inferred their activity or success within a sample from their release numbers [CAH03b], while other researchers sampled the whole SourceForge data space and concluded that the vast majority of FLOSS projects should be considered failures [RG05]. Finally, other researchers have created five categories for the overall SourceForge site, based on dynamic growth attributes and using the terms “success” and “tragedy” within FLOSS development; again, it was shown that some 50% of FLOSS projects should be considered tragedies [ES07]. There are several tools and data sources which are used to analyse FLOSS projects. FLOSSmole1 is a single point of access to data gathered from a number of FLOSS forges (e.g., SourceForge, Freshmeat, Rubyforge). While FLOSSmole provides a simple querying tool, its main function is to act as a source of data for others to analyse. CVSAnalY2 is a tool which is used to measure and analyse large FLOSS projects [RKG04]. It is used in this paper to determine information such as the number of commits and developers associated with a particular project.

3 Goal, Question, Metrics – GQM

The Goal-Question-Metric (GQM) method evaluates whether a goal has been reached by associating that goal with questions that explain it from an operational point of view and by providing the basis for applying metrics to answer these questions. The aim of the method is to determine the information and metrics needed to be able to draw conclusions on the achievement of the goal. In the following, we apply the GQM method to first identify the overall goal of this research; we then formulate a number of questions related to the FLOSS projects and their success relative to the host forge and distribution to which they belong; finally, we collect adequate product and process metrics to determine whether the goal has been achieved.

Goal: The long-term objective of this paper is to evaluate metrics to identify successful FLOSS projects, and to provide guidelines to FLOSS developers about practical actions to foster the successful evolution of their applications. Based on two samples from Debian and SourceForge, a comparison of their product and process characteristics will be evaluated to determine which sample should be considered more successful in terms of its evolution. This will also give an indication of the forges and distributions in which developers should include their projects so that they may achieve the best outcomes for their project's future development.

Question: The purpose of this study is to establish differences between samples of FLOSS projects extracted from Debian and SourceForge. Two sets of questions will be evaluated, one comparative and one internal to Debian: the first deals with a direct comparison of the evolutionary characteristics achieved by the projects in the two samples, while the latter studies the projects in the Debian sample and evaluates whether their evolution after being included in the distribution differs from that before this date. The date when a FLOSS project was inserted into the Debian distribution will be termed its “entry point”. The difference before and after the entry point will be evaluated by comparing the activity and number of developers in each phase. In summary, the two main questions underlying this study can be formulated as follows:

1. Are projects in Debian statistically different from projects in SourceForge?

1 http://ossmole.sourceforge.net/
2 http://cvsanaly.tigris.org/


2. After being inserted into the Debian forge and distribution, do FLOSS projects leverage more activity and developers than before?

In section 4, the first question will be articulated in four research hypotheses, while the second question will lead to two further hypotheses.

Metrics: This study uses three sources of information to assess the above questions: the SourceForge and Debian forges, to select two random samples of projects; each project's own repository (either CVS or SVN); and, among the projects within the Debian sample, their entry into Debian. Each of these sources has been analysed to obtain the metrics needed to perform the investigation; the metrics for the study will be introduced in each section below.

3.1 Debian and SourceForge samples

The Debian forge (http://www.debian.org/) hosts a large number of FLOSS projects under a common name. At the time of writing, more than 20,000 projects are listed under the “stable” label of the latest version. Using a randomiser, we selected 50 of these stable projects. A summary of the projects retrieved from Debian can be found in the first column of table 1. The SourceForge site (http://sourceforge.net/) hosts more than 150,000 projects. In order to draw an accurate comparison, the sample from SourceForge was extracted only from the pool of the “stable” projects, i.e. those projects whose core developers labelled the status of the project with the tag “Production/Stable”. The number of projects from Debian and SourceForge in this category is comparable (around 22,000). A summary of the projects that have been chosen from the SourceForge site can be found in the first column of table 2.
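A hedged sketch of this sampling step in Python (the actual project lists used in the paper are those in Tables 1 and 2; the population lists below are placeholders):

import random

# placeholder populations standing in for the real forge listings
debian_stable = [f"debian-pkg-{i}" for i in range(20000)]
sf_stable = [f"sf-project-{i}" for i in range(22000)]

random.seed(2007)  # arbitrary seed, only to make the sketch reproducible
debian_sample = random.sample(debian_stable, 50)
sf_sample = random.sample(sf_stable, 50)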

3.2 Code repositories

The CVS/SVN repository of each project from the Debian or SourceForge sample was searched: in the sample of 50 Debian projects, 42 existing repositories were found. In order to provide a similar sample, 42 repositories were also selected from the SourceForge sample. The following concepts and attributes were used to build a table of results for each project.
Commit: the atomic action of a developer checking in one or more files (whether source code or other) into a central repository.
Modules and subsystems: at a fine-grained level, both CVS and SVN repositories record activity on files (here termed “modules”) and their containing folder (termed the “subsystem”).
Date: CVS/SVN repositories record the time when a module and its subsystem were modified or created from scratch. A date with ISO formatting “YYYY-MM-DD” was recorded.
Developers: we recorded this information in two ways: firstly, by assigning the activity to the actual committer who placed the file into the repository; secondly, by including any further developers mentioned in the commit as involved in the coding or patching. This information was used to characterise the input provided to each project.
Touch: since many modules and subsystems can be committed in the repository within the same commit, and the same module could have been modified by more than one developer in the same commit, the term “touch” is used to denote the atomic combination of a unique date, a unique module-subsystem pair, and a unique developer.
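As a small illustration of the "touch" definition (the paper extracted this information with CVSAnalY; the records below are invented), touches can be computed as the set of unique (date, subsystem/module, developer) combinations:

from datetime import date

log = [  # (date, subsystem, module, developer) -- invented sample records
    (date(2003, 7, 29), "libclamav", "matcher.c", "alice"),
    (date(2003, 7, 29), "libclamav", "matcher.c", "bob"),    # second developer, same file and commit
    (date(2003, 7, 29), "libclamav", "matcher.c", "alice"),  # duplicate entry, not a new touch
]

touches = {(d, f"{subsys}/{mod}", dev) for d, subsys, mod, dev in log}
developers = {dev for _, _, _, dev in log}

print(len(touches))     # 2 touches
print(len(developers))  # 2 distinct developers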


project          oldest date  entry date   newest date  days  touches  Dev.  SLOC
acpidump         2003-05-01   2005-09-26   2003-05-01   1     34       1     2349
apmud            2001-12-07   2000-05-23   2001-12-24   18    95       1     2502
clamav           2003-07-29   2002-05-09   2007-06-02   1405  5382     9     116731
dia              1998-10-01   1998-09-02   2007-06-07   3172  12828    126   146550
EtoileWildMenus  2006-03-04   2006-10-03   2007-04-16   409   46       3     1711
fte              2000-01-30   1996-12-25   2007-03-15   2602  1937     16    51498
geomview         2000-08-15   1998-08-02   2007-05-21   2471  7777     6     101844
grass6           1999-12-29   2003-11-10   2007-06-02   2713  42135    77    107648
gwenview         2006-06-20   2001-09-16   2007-06-06   352   449      5     4580
kdegames         1997-09-11   1997-09-20   2007-06-07   3557  19659    243   118479
kdenetwork       1997-11-26   1997-10-19   2007-06-06   3480  43130    818   272576
kmouth           2003-01-17   2004-01-30   2007-06-05   1601  647      31    5240
liboil           2004-01-07   2004-11-04   2007-05-29   1239  3106     4     52996
mimedecode       2006-06-19   1996-11-29   2006-06-19   1     16       1     631
mod auth kerb    2002-05-01   2004-02-21   2006-11-22   1667  349      2     119
myphpmoney       2002-11-20   2003-01-15   2007-05-27   1650  741      5     19434
octaveforge      2001-10-10   2001-02-25   2007-06-02   2062  16044    48    78150
Pike             1996-09-22   2002-05-05   2007-05-30   3903  21449    69    173196
prelude-manager  2001-08-23   2002-04-11   2007-05-02   2079  1557     44    10854
ProofGeneral     1996-03-15   2002-09-03   2007-05-25   4089  10425    20    48692
ruby             1998-01-16   2003-08-23   2007-06-05   3428  23968    143   419942
scid             2002-04-04   2001-02-13   2003-12-12   618   633      2     89402
shorewall        2002-05-01   2001-12-30   2007-06-06   1863  79498    4     25159
skel             2001-05-20   2003-07-13   2007-01-24   2076  219      13    120
sylpheed         2005-01-12   2000-09-30   2007-06-04   874   2719     2     106087
tcl              1998-03-26   1997-08-19   2007-05-30   3353  39124    109   165306
tdb              2000-08-14   2001-05-07   2005-08-02   1815  295      9     3261
tiobench         2000-03-23   2000-11-08   2003-12-22   1370  110      3     1689
txt2html         2007-01-15   2001-03-30   2007-05-10   116   4        1     3623
vlc              1999-08-08   2000-03-13   2007-06-06   2860  34736    113   401256
wxWidgets        1998-05-20   2000-02-13   2007-06-01   3300  246022   104   2142713
xmakemol         1998-04-03   2001-10-31   2006-09-23   3096  1386     4     18724
yaml4r           2002-06-22   2003-08-23   2003-04-24   307   498      1     10728
fig2ps           2005-11-16   2003-10-28   2007-02-19   461   105      1     397
syncekde         2003-02-11   2003-08-15   2006-11-26   1385  622      5     21684
noteedit         2004-09-15   2001-07-01   2005-07-30   319   590      4     63456
grub             1999-02-28   1997-11-19   2007-02-22   2917  5101     76    3536
libsoup          2000-12-06   2003-03-19   2007-06-01   2369  1548     42    15012
prcs1            2001-06-25   1997-03-28   2005-02-07   1324  858      5     37360
kphoneSI         2005-10-12   2002-12-20   2007-05-23   589   1630     1     41829
cdparanoia       1999-08-15   1998-05-16   2006-11-15   2650  297      6     9182
rlplot           2002-06-06   2004-04-16   2007-05-28   1818  1405     1     69493

Table 1: Summary of attributes of the Debian projects: in bold, the projects where there is a recorded evolution before and after the entry-point


project          oldest date  newest date  days  touches  developers  SLOC
audiobookcutter  2006-05-06   2007-05-22   381   958      2           4229
pf               2005-10-03   2006-08-29   330   3207     2           84489
seagull          2006-06-06   2007-05-31   359   707      5           62875
csUnit           2002-12-16   2006-08-14   1337  2147     1           16241
fitnesse         2005-03-26   2007-06-04   800   5172     12          39503
galeon           2000-07-06   2007-04-20   2479  10839    82          93374
expreval         2006-08-29   2007-04-11   225   282      1           3588
cdlite           2005-12-06   2007-04-12   492   46       1           1116
txt2xml          2002-04-27   2006-07-26   1551  155      3           1345
wxactivex        2005-01-26   2005-01-27   1     59       1           3264
ustl             2003-03-21   2007-03-31   1471  11470    1           11416
neocrypt         2003-05-23   2005-06-25   764   108      2           2135
cpia             2000-03-02   2004-10-17   1690  429      15          22954
moses            2002-05-07   2007-02-20   1750  5170     8           105955
critical care    2002-01-18   2002-09-22   247   1708     5           38994
xmlnuke          2006-03-27   2007-03-07   345   888      2           57944
jtrac            2006-03-18   2007-06-06   445   1577     1           12771
QPolymer         2006-01-10   2007-05-24   499   459      1           86971
kasai            2004-08-31   2007-05-30   1002  673      3           8786
fourever         2005-02-23   2007-05-28   824   1795     2           15163
xqilla           2005-11-01   2007-05-28   573   8867     3           107320
uniportio        2006-05-29   2007-02-03   250   32       1           1096
genromfs         2002-01-18   2005-08-18   1308  94       3           654
Beobachter       2006-08-31   2006-12-10   101   376      1           2715
perpojo          2003-06-10   2003-07-31   51    70       1           1677
oliver           2004-07-22   2006-01-14   541   187      3           1429
hge              2005-11-18   2007-03-18   485   1183     3           45654
fnjavabot        2004-06-18   2007-06-05   1082  660      8           10142
ozone            2001-12-17   2005-12-12   1456  6108     7           63790
juel             2006-05-13   2007-04-25   347   990      1           7284
edict            2002-12-06   2006-12-28   1483  82       1           2556
Aquila           2004-05-04   2004-05-28   24    78       1           893
swtjasperviewer  2004-11-21   2007-05-21   911   188      1           3214
eas3pkg          2006-10-26   2007-05-22   208   274      2           43724
formproc         2001-05-10   2004-12-22   1322  1338     1           3514
toolchest        2002-01-03   2005-07-16   1290  15       1           494
ogce             2006-11-27   2007-06-03   188   26596    3           350997
simplexml        2002-08-23   2002-08-23   0     64       1           1691
intermezzo       2000-11-12   2003-09-30   1052  2276     15          34792
whiteboard       2003-06-15   2003-06-27   12    49       1           4910
modaspdotnet     2004-07-16   2007-03-02   959   688      1           2445
kpictorial       2002-05-09   2002-06-04   26    339      1           18214

Table 2: Summary of attributes of the SourceForge projects


3.3 Entry date

Every project within the Debian distribution has its own page on the Debian website, where the ChangeLog (typically an unstructured list of amendments to the project) shows the first entry in terms of changes made since the introduction into Debian. By manually investigating and recording this date, we determined the lifecycle of each project “before” and “after” its inclusion into Debian. For instance, the Debian ChangeLog for “clamav” is shown at http://tinyurl.com/2njfon. At the bottom of the page, the first date indicates that this project entered Debian on May 9th, 2002, in its 0.11-1 release. All history before that date is treated as the pre-Debian lifecycle, and all history after that date as the post-Debian lifecycle.
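A minimal sketch of splitting a project's touches around its entry point (the date and the tuple layout follow the clamav example and the touch definition above; the helper name is invented):

from datetime import date

entry_date = date(2002, 5, 9)   # e.g. clamav entered Debian on 2002-05-09

def split_by_entry(touches, entry):
    """touches: iterable of (date, module, developer) tuples."""
    pre = [t for t in touches if t[0] < entry]
    post = [t for t in touches if t[0] >= entry]
    return pre, post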

4 Hypotheses and Results

Hypotheses have been formed concerning the two questions derived from the GQM approach. They are grouped below by the question to which they belong, along with their results.

4.1 Empirical evaluation of question 1

The first research question was designed as a direct comparison between the Debian and SourceForge samples, and its objective was to highlight any significant difference in the selected characteristics. Each of the following hypotheses is evaluated empirically: given the null hypothesis in the second column of table 3, a statistical test will either reject it or not. A summary of the tests and their results is provided at the end of this section to wrap up the relevant conclusions.

4.1.1 Hypothesis 1.1 – Period of Activity

This hypothesis posits that the duration of time over which projects from each forge have evolved differs significantly, measured by the number of days for which activity could be observed in a project's repository. The null hypothesis states that Debian and SourceForge projects have a similar time-span, and should be rejected if the sample projects display a significant difference. Table 3 shows that, apart from the minimum values (just 1 day of activity recorded in the repository), the two samples have different medians, different quartiles Q1 and Q3, and different maximum values. Applying both the t-test and the Wilcoxon test for two independent samples, we can reject the null hypothesis with 99.99% confidence for each test.
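For illustration, such a comparison could be run as follows in Python with scipy (the paper reports using R and the Wilcoxon W statistic; scipy exposes the equivalent two-sample rank test as the Mann-Whitney U test, and the lists below are placeholders, not the full samples of Tables 1 and 2):

from scipy import stats

debian_days = [1, 18, 1405, 3172, 409]         # placeholder values only
sourceforge_days = [381, 330, 359, 1337, 800]  # placeholder values only

t_stat, t_p = stats.ttest_ind(debian_days, sourceforge_days)
u_stat, u_p = stats.mannwhitneyu(debian_days, sourceforge_days, alternative="two-sided")

print(f"t-test: t = {t_stat:.3f}, p = {t_p:.4f}")
print(f"Mann-Whitney/Wilcoxon rank-sum: U = {u_stat:.1f}, p = {u_p:.4f}")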

4.1.2 Hypothesis 1.2 – Size Achieved

The second hypothesis postulates that the typical size of a project, in terms of SLOC (source lines of code), differs significantly for each forge, with the null hypothesis stating that projects in both forges have similar sizes, to be rejected if project sizes are shown to be significantly different. The results were evaluated on the extracted repositories using the R programming language. They show that projects from Debian are larger than those in SourceForge (although we found several outliers in the Debian distribution of sizes). The size of Debian packages also has a greater range, with a greater number of outliers of larger magnitude found in the Debian distribution, implying the presence of larger communities.


Hypothesis: days of evolution
  H0: Debian and sf.net projects have a similar time span
  H1: Debian projects have a longer time span
             min    Q1      median   Q3        max
  debian     1      588     1740     2916      4088
  sf.net     1      247     495.5    1290      2479
  t-test:    t = -5.279, D.F. = 82, p ≤ 1.142 × 10^-6
  Wilcoxon:  W = 1320, p ≤ 3.248 × 10^-5

Hypothesis: distribution of size
  H0: Debian and sf.net projects have a similar size
  H1: Debian projects are larger than sf.net projects
             min    Q1      median   Q3        max
  debian     119    3,782   48,686   110,945   2,142,554
  sf.net     654    2,346   11,416   49,854    106,478
  t-test:    t = -1.627, D.F. = 82, p ≤ 0.11
  Wilcoxon:  W = 1550, p ≤ 0.036

Hypothesis: distinct developers
  H0: Debian and sf.net projects have a similar amount of developers
  H1: Debian projects have more developers than sf.net projects
             min    Q1      median   Q3        max
  debian     1      2       5.5      48        818
  sf.net     1      1       2        3         82
  t-test:    t = -2.294, D.F. = 82, p ≤ 0.02436
  Wilcoxon:  W = 1343, p ≤ 7.829 × 10^-5

Hypothesis: overall touches
  H0: Debian and sf.net projects have a similar amount of touches
  H1: Debian projects have more touches than sf.net projects
             min    Q1      median   Q3        max
  debian     1      2       5.5      48        818
  sf.net     1      1       2        3         82
  t-test:    t = -2.029, D.F. = 82, p ≤ 0.04577
  Wilcoxon:  W = 1548.5, p ≤ 0.03475

Table 3: Summary of the hypotheses, tests and results of the tests


From Table 3 we see that the two samples show different distributions in terms of size achieved. The null hypothesis was based on the assumption that the two samples come from the same population, and therefore have the same average: based on the tests, we can reject the null hypothesis with 89% and 96% confidence (for the t-test and the Wilcoxon test, respectively).

4.1.3 Hypothesis 1.3 – Developers

This hypothesis posits that the number of developers that a project attracts is, on average, significantly different for each forge, measured by the number of unique developers who have contributed source code. The null hypothesis states that Debian and SourceForge projects have approximately equal numbers of contributing developers, to be rejected if this is not the case. The developers column of tables 1 and 2 shows the number of distinct developers (CVS or SVN committers or external developers acknowledged during a specific commit). We found several outliers in the Debian sample, which bring the sample average to 51 developers, while the SourceForge sample has an average of only about 5 developers. Table 3 summarises the boxplot evaluation, as well as the results of the t-test (with 82 degrees of freedom) and the Wilcoxon test. Since the null hypothesis is that the two samples have the same median, the two tests show that, with a confidence of 97.5% and 99.99% (for the t-test and the Wilcoxon test, respectively), we can reject the null hypothesis. This means that there is a statistically significant difference in the distribution of the number of developers in the two samples.

4.1.4 Hypothesis 1.4 – Activity (Touches)

The final hypothesis for question 1 postulates that the amount of activity (or output) observed differs between the forges. Specifically, the null hypothesis states that, on average, individual Debian projects and individual SourceForge projects have a total number of file touches that does not differ significantly. We may reject this if it is shown that either forge tends to harbour significantly more active projects than the other. The results are summarised in tables 1 and 2. As seen in the previous test, some projects (notably wxWidgets and shorewall) clearly skew the distribution of the Debian sample. The two tests show (p ≤ 0.046 for the t-test and p ≤ 0.035 for the Wilcoxon test) that there is a statistically significant difference in the distribution of the activity (in terms of the overall number of touches) between the two samples.

4.2 Empirical Evaluation of Question 2

This section examines Debian only, and investigates whether it can be considered an external driver for achieving better software evolvability. Each project in this sample was analysed with regard to the two phases of its lifecycle, i.e. before and after the date when it was first included in Debian. If the projects exhibited two statistically different behaviours before and after the entry date, we could conclude that the “Debian treatment” is responsible for this difference. As shown in table 1, the bold entries in the entry date column represent the date of each project's first appearance in Debian. This entry point, e, has been used to separate each project

into two phases, so the dependent variable in each hypothesis can be measured both between the earliest available date and e, and between e and the latest available date. For some projects the entry date precedes any data collected in their repository, hence there is no data from which to draw a comparison of activities and developers before and after the entry date. For hypotheses 2.1 and 2.2, only projects with data available both before and after the entry point will be considered.

4.2.1 Hypothesis 2.1 – Developers

The first hypothesis for this question postulates that the number of contributing developers a project has before insertion into Debian is significantly different from that figure after insertion. The null hypothesis assumes that no significant difference will be observed between the two periods. The metric used is the number of distinct developers. The results are shown in the D (pre) and D (post) columns of Table 4. As can be observed, the number of distinct developers in the second part of the lifecycle is always greater than or equal to that of the first part. In 18 projects out of 22, the number of distinct developers after the introduction into Debian is strictly larger than before, while 4 projects out of 22 have the same number of developers both before and after the inclusion. These latter projects are relatively small, with at most 1 or 2 developers currently responsible for the overall development. Since the majority of the observed projects showed a larger number of developers in the second part of the lifecycle, we rejected the null hypothesis.

4.2.2 Hypothesis 2.2 – Activity (Touches)

The second hypothesis posits that the amount of activity a project displays before appearing in Debian is significantly different from that after the event. The null hypothesis states that no significant difference in activity is apparent between the two periods. As in hypothesis 1.4, activity is measured in number of file touches. The results are summarised in table 4: each project was given an ID, and the overall number of touches before, T (pre), and after, T (post), is reported. Considering these unadjusted values, all of the projects considered show a larger number of touches after joining Debian: these results lead us to reject the null hypothesis. Considering the dates shown in table 1, however, some projects had a longer time span within the Debian distribution than outside it. To account for this, table 4 also reports an adjusted value, given by the touches divided by the relative interval of time spent either outside or inside Debian (T/D (pre) and T/D (post) respectively). As shown by the ticks in the final column, 12 projects out of 22 experienced an adjusted number of touches which was larger before joining Debian than afterwards. This did not allow us to reject the null hypothesis, and hence to attribute the observed differences in the two phases to the applied treatment. In summary, the only case where the null hypothesis could be rejected concerns the overall amount of touches made to projects within the Debian forge. In general, projects achieved a larger amount of activity after insertion into Debian, but this was not accomplished with the same number of touches per day (i.e., productivity) as before the entry point.


ID   D (pre)  D (post)  T (pre)  T (post)  T/D (pre)  T/D (post)  marks
1    9        10        10       1417      1.11       12.88       √ √ √
2    17       41        10       88        0.59       2.15        √ √ √
3    1        2         14       18        14         6           √ √ X
4    5        44        15       117       3          2.66        √ √ X
5    1        1         18       18        18         18          √ √ √
6    2        2         22       22        11         7.33        √ √ X
7    22       42        25       86        1.14       2.05        √ √ √
8    1        1         31       37        31         37          √ √ √
9    2        243       32       954       16         3.93        √ √ X
10   9        31        40       41        4.44       1.32        √ √ X
11   10       13        43       50        4.3        3.85        √ √ X
12   7        9         44       55        6.29       6.11        √ √ X
13   2        2         46       49        23         24.5        √ √ √
14   2        5         49       63        24.5       12.6        √ √ X
15   2        4         53       74        26.5       18.5        √ √ X
16   1        5         60       82        60         16.4        √ √ X
17   1        1         67       67        67         67          √ √ √
18   1        4         160      576       160        144         √ √ X
19   14       20        436      779       31.14      38.95       √ √ √
20   61       69        1666     1673      27.31      24.25       √ √ X
21   50       76        3972     6429      79.44      84.59       √ √ √
22   41       104       6923     18595     168.85     178.8       √ √ √

Table 4: Summary of the number of distinct developers and overall touches in the two samples

5 Threats to Validity

The limited information (for hypothesis 2) affects the ability to demonstrate a temporal relationship; the data that exist do not consistently confirm that cause precedes effect. However suggestive the data are of such a relation, more measures applied to these and other forges of similar prestige would help form a stronger opinion of temporal precedence. The ability to generalise from this study may be threatened by the Debian forge (and others of similar prestige) possessing attributes unique to them that may adversely affect their ability to, for example, attract new developers. To provide more confidence in generalisability, it would need to be established that other forges acting as repositories, yet providing different services (such as Mozilla or KDE), exhibit the same characteristics measured here.

6 Conclusions

This paper has investigated the presence of exogenous drivers of software evolvability, and proposed that inclusion in a successful FLOSS forge (Debian) has an influence on the evolutionary characteristics of FLOSS projects. The intended audience of this paper comprises both researchers and practitioners: on the researchers' side, it aimed to show that investigating and comparing different FLOSS forges is likely to produce different results, and to characterise the FLOSS phenomenon differently. On the practitioners'

side, the paper shows that FLOSS developers, if interested in further fostering the development of their project, should consider inclusion in a large distribution or forge like Debian. The paper leveraged the well-known GQM method: two research questions were formulated, the first based on a direct comparison between the two samples, the second regarding the Debian sample only. The first question postulated that the Debian forge would have significantly different attributes from SourceForge. Debian projects were shown to have a longer period of evolution, to be larger in size, to attract more developers and to experience greater activity than SourceForge projects. All of the designed hypotheses showed a difference between the two random samples, and this positively assessed the first overall research question: Debian projects indeed show different characteristics from projects in SourceForge. The second research question was based on the Debian sample only and assessed the presence of two phases of evolution, i.e. before and after inclusion in the Debian forge. In statistical terms, we studied whether there existed differences before and after applying a treatment to the sample. The first hypothesis proposed that there are more developers after being inserted into Debian, and the majority of projects showed this to be the case. The second hypothesis concerned the activity before and after the entry point: from the results we gathered, we could not conclude that there was a statistically significant difference before and after the treatment. This could be a consequence of measuring activity in terms of touches. Further research is required to substantiate the more general proposition that widespread distribution builds the user base of a FLOSS project, thus driving its evolution, while incorporation into a distribution with an existing developer base provides the basis for sustainable evolution.

Bibliography

[ACPM01] G. Antoniol, G. Casazza, M. D. Penta, E. Merlo. Modeling Clones Evolution Through Time Series. In Proc. IEEE Intl. Conf. on Software Maintenance 2001 (ICSM 2001). Pp. 273–280. Florence, Italy, Nov 2001.

[BCR94] V. R. Basili, G. Caldiera, D. H. Rombach. The Goal Question Metric Approach. In Encyclopedia of Software Engineering. Pp. 528–532. John Wiley & Sons, 1994. See also http://sdqweb.ipd.uka.de/wiki/GQM.

[CAH03a] K. Crowston, H. Annabi, J. Howison. Defining Open Source Software Project Success. In Proceedings of ICIS 2003. Seattle, Washington, USA, Dec. 2003.

[CAH03b] K. Crowston, H. Annabi, J. Howison. Defining open source software project success. In ICIS 2003. Proceedings of International Conference on Information Systems. 2003.

[Cap03] A. Capiluppi. Models for the Evolution of OS Projects. In Proceedings of ICSM 2003. Pp. 65–74. Amsterdam, Netherlands, 2003.

[CCP07] G. Canfora, L. Cerulo, M. D. Penta. Identifying Changed Source Code Lines from Version Repositories. Mining Software Repositories 0:14, 2007.

[CM07] A. Capiluppi, M. Michlmayr. From the Cathedral to the Bazaar: An Empirical Study of the Lifecycle of Volunteer Community Projects. In Feller et al. (eds.), Open Source Development, Adoption and Innovation. Pp. 31–44. Springer, 2007.

[ES07] R. English, C. Schweik. Identifying Success and Tragedy of FLOSS Commons: A Preliminary Classification of Sourceforge.net Projects. In Proceedings of the 1st International Workshop on Emerging Trends in FLOSS Research and Development. Minneapolis, MN, 2007.

Proc. Software Evolution 2007 12 / 13 ECEASST

[FFH+02] J. Feller, B. Fitzgerald, F. Hecker, S. Hissam, K. Lakhani, A. van der Hoek (eds.). Character- izing the OSS process. ACM, 2002. [Ger04] D. M. German. Using software trails to reconstruct the evolution of software. Journal of Software Maintenance and Evolution: Research and Practice 16(6):367–384, 2004. [HG05] A. Hindle, D. M. German. SCQL: a formal model and a query language for source control repositories. SIGSOFT Softw. Eng. Notes 30(4):1–5, 2005. [LHMI07] S. Livieri, Y. Higo, M. Matushita, K. Inoue. Very-Large Scale Code Clone Analysis and Visualization of Open Source Programs Using Distributed CCFinder: D-CCFinder. In ICSE ’07: Proceedings of the 29th International Conference on Software Engineering. Pp. 106– 115. IEEE Computer Society, Washington, DC, USA, 2007. [LRW+97] M. M. Lehman, J. F. Ramil, P. D. Wernick, D. E. Perry, W. M. Turski. Metrics and Laws of Software Evolution—The Nineties View. In El Eman and Madhavji (eds.), Elements of Software Process Assessment and Improvement. Pp. 20–32. IEEE CS Press, Albuquerque, New Mexico, 5–7 Nov. 1997. [MFH02] A. Mockus, R. T. Fielding, J. Herbsleb. Two case studies of open source software develop- ment: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology 11(3):309–346, 2002. [RG05] A. Rainer, S. Gale. Evaluating the Quality and Quantity of Data on Open Source Software Projects. In Feller et al. (eds.), First International Conference on Open Source Systems. 2005. [RG06] G. Robles, J. M. Gonzlez-Barahona. Contributor Turnover in Libre Software Projects. In Damiani et al. (eds.), OSS. IFIP 203, pp. 273–286. Springer, 2006. [RKG04] G. Robles, S. Koch, J. M. Gonzalez-Barahona.´ Remote analysis and measurement of libre software systems by means of the CVSAnalY tool. In Proceedings of the 2nd ICSE Workshop on Remote Analysis and Measurement of Software Systems (RAMSS ’04). 26th International Conference on Software Engineering. Edinburgh, UK, May 2004. [SA02] K. J. Stewart, T. Ammeter. An Exploratory Study of Factors Influencing the Level of Vi- tality and Popularity of Open Source Projects. In ICIS 2002. Proceedings of International Conference on Information Systems 2002. 2002. [SAOB02] I. Stamelos, L. Angelis, A. Oikonomou, G. L. Bleris. Code Quality Analysis in Open-Source Software Development. Information Systems Journal 12(1):43–60, 2002. [SM04] A. Senyard, M. Michlmayr. How to Have a Successful Free Software Project. In Proceedings of the 11th Asia-Pacific Software Engineering Conference. Pp. 84–91. Busan, Korea, 2004. [Wei05] D. Weiss. Measuring Success of Open Source Projects Using Web Search Engines. In Scotto and Succi (eds.), Proceedings of The First International Conference on Open Source Systems (OSS 2005), Genova, Italy. Pp. 93–99. 2005.


ECEASST

A Requirements-Based Taxonomy of Software Product Line Evolution

Klaus Schmid, Holger Eichelberger

Software Systems Engineering. Universität Hildesheim Marienburger Platz 22, D-31141 Hildesheim {schmid, eichelberger}@sse.uni-hildesheim.de

Abstract: Software product lines are, by their very nature, complex software systems. Due to the interconnectedness of the various products in the product line any form of evolution becomes significantly more complex than in a single system situation. So far most work on product line evolution focused on specific approaches to supporting special cases of the evolution problem. In this paper, we take a different approach and provide a broad taxonomy of requirements-driven evolution in software product lines. This serves as a basis for the identification of requirements on evolution support.

Keywords: Software Product Lines, Evolution, Traceability, Requirements

1 Introduction Software product line (SPL) engineering [CN02, LSR07] is an important approach to the efficient development of large numbers of software systems that promises major improvements in terms of time, costs, and quality based on large-scale reuse. Experience shows that product development costs can be reduced by 80% and that cost of quality can be improved by 50% [LSR07]. However, as a result, the various products in a SPL become interconnected: they share various assets, and any change that is relevant to one product may actually have ramifications for several other products due to the product interdependencies. This range of shared assets is also called the product line infrastructure and covers the whole range of product development, starting from requirements through implementation and up to test assets. As a consequence of the interconnection of the various product developments by the product line infrastructure, adequate support for SPL evolution is very important. The evolution problem is further magnified by the facts that a SPL needs to integrate all the changes relevant to any product it supports and that the SPL as a whole is usually much longer-lived than an individual product [SV02]. The combination of these issues leads to a situation where the evolution problem is much more severe than in single system development. While the classical treatment of software evolution prefers the categorization into perfective, corrective, and adaptive maintenance [Hat07] (sometimes also preventive), we will focus here on requirements-driven evolution, i.e., those SPL evolution tasks that can arise in a SPL context based on new or changed requirements, e.g., driven by new products that need to be supported. Usually, these belong to the categories of perfective or adaptive maintenance. In this paper, we will focus on a requirements perspective of evolution. Thus, we will in particular refrain from addressing issues of refactoring the product line infrastructure.


The background for our approach is our interest in identifying key requirements that future product line engineering tools, especially requirements engineering tools, will need to fulfil in order to adequately support product line evolution.

In the next section, we will discuss related work from the area of evolution in general and from the area of SPL evolution in particular. In section 3, we will introduce a basic terminology for SPL and provide a high level discussion of different possible situations for product line evolution. This will be the basis for our refined discussion in section 4, where we will summarize our requirements for evolution support for SPL. Finally, in section 5 we will conclude.

2 Related Work In single system development, evolution scenarios are typically categorized as perfective, corrective, and adaptive maintenance [Hat07]. There is a considerable amount of work on the classification of the evolution of individual artifacts, e.g. for
• requirements (clarification, retraction, splitting and merging) [PT93].
• scenarios in the sense of use cases [BSB05]. Several static relationships between scenarios, like contained in or precondition of, as well as dynamic operations on scenarios were identified. Inter-scenario operations include split into others, specialization, extension and consolidation; intra-scenario operations include inclusion, modification and deletion.
• goals in requirements engineering [RSE04].
• design diagrams, like the taxonomy of changes for UML diagrams in [BLO03].
• source code, like [Gus03] or [MCS05] from the viewpoint of traceability links.
An approach to the evolution of architectures, which relies on various types of traceability links, was given in [MRP06]. Existing work on the evolution of SPL can be structured into domain, artifact / feature and architecture evolution, and combinations of these categories. On the domain level, the domain evolution problem [DLS05, MSG96] arises: when existing product line architectures (PLAs) are extended and/or refactored, unanticipated requirements may occur, or changes to the architecture to improve customer satisfaction must be considered. In [MGN05], the general evolution of domain concepts was classified into vertical, domain-specific evolution (i.e. extension by new features) and horizontal evolution of the domain assets (e.g. for improving application functionality). A case for collaborative evolution of domains was also introduced, i.e. federated development of partner domains. On the artifact level, migration of artifacts between the core asset base and concrete products in a SPL can be described using the evolution tree model known from single system development [ST00]. The evolution of features, variation points and dependent artifacts in terms of basic change operations (addition, deletion and modification) was described in [Myl01]. The evolution of product feature maps was discussed in [HHS05]. Feature models can also be used as a basis for describing evolution in requirements, design and implementation [Rie04]. For the SPL architecture level, a taxonomy for evolution was given in [SB99]: requirements changes (like the decision for a new SPL, or improvement of functionality or quality attributes), architectural evolution (e.g. split of a SPL) and component evolution.


Probably the approach which is closest to ours was presented in the context of the ConIPF methodology [HWK+06]. There, addition, deletion, and modification of artifacts are explicitly used as a basic categorization. Each category was then further separated according to individual artifacts. While the various descriptions of product line evolution provided significant contributions, none so far provided a consistent categorization of product line change on all three levels: ranging from individual requirements over products to product lines. In the next three sections we will discuss each of these levels in detail.

3 Evolution Categories In this section, we introduce the basic categories of our taxonomy. Roughly, requirements-driven evolution can result in changes on three levels:
• Requirements level change – change to an individual requirement (or a small group of requirements).
• Product level change – change in terms of the products that must be supported.
• Product line level change – change to whole groups of products that must be supported.
As a basis for our discussion we categorize the requirements of a product in a SPL into three categories:
• Commonalities – requirements that are common to all products in the SPL.
• Variabilities – requirements that are not common to all products, but vary systematically among the products, i.e., a variability is usually relevant to multiple products.
• Product-specific – requirements that are relevant only to an individual product.
In general, the various products will include requirements from all three categories, i.e., the resulting set of requirements will usually be composed of commonalities + selected variabilities + product-specific requirements. The product line infrastructure (i.e., the reusable artifacts) will contain only commonalities (including the implementation) and any variabilities. The product-specific parts (requirements and implementation) are only regarded as part of the products. As the basis for our analysis, we also need to make certain assumptions regarding the way variability in software product lines is modeled. Here, we base our terminology on a decision-based framework similar to the one described in [SJ04]; however, our results can easily be transferred to feature-tree-based modeling approaches like [CCKK06]. In particular, we will distinguish between a variability (e.g., an alternative) and a variability resolution (e.g., a specific case in an alternative). Further, we will denote the usage of a specific alternative in a context as a variability usage. Individual variabilities may be related by means of constraints. As described above, we differentiate changes on the level of individual requirements, on the level of products, and on the level of product lines as a whole. Roughly, the different changes can be subdivided into addition, modification, and deletion. In the remainder of this section, we will structure our discussion according to these categories.
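To make this terminology concrete, the following C# sketch shows one possible encoding of the three requirement categories together with variabilities, variability resolutions, variability usages and the product line infrastructure as plain data structures. The type and member names are our own illustrative assumptions and are not taken from [SJ04], [CCKK06] or any existing tool; later sketches in this paper reuse these types.

using System.Collections.Generic;

// Category of a requirement within the software product line.
public enum RequirementCategory { Commonality, Variability, ProductSpecific }

// A requirement; a variability additionally owns its possible resolutions and may be
// related to other variabilities by constraints.
public class Requirement
{
    public string Id = "";
    public string Text = "";
    public RequirementCategory Category;
    public List<VariabilityResolution> Resolutions = new();   // only populated for variabilities
    public List<string> ConstraintIds = new();                 // constraints relating variabilities
}

// One concrete way of resolving a variability (e.g. one case of an alternative).
public class VariabilityResolution
{
    public string Id = "";
    public string Description = "";
}

// The usage of a specific variability resolution in the context of one product.
public class VariabilityUsage
{
    public string ProductId = "";
    public string VariabilityId = "";
    public string ResolutionId = "";
}

// A product consists of all commonalities, the selected variability resolutions and
// its product-specific requirements.
public class Product
{
    public string Id = "";
    public List<VariabilityUsage> SelectedResolutions = new();
    public List<Requirement> ProductSpecificRequirements = new();
}

// The product line infrastructure (the reusable artifacts) contains only the
// commonalities and the variabilities.
public class ProductLineInfrastructure
{
    public List<Requirement> Commonalities = new();
    public List<Requirement> Variabilities = new();
}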


3.1 Requirements Level Changes These changes can be categorized as adding, deleting or modifying individual requirements (or groups of requirements). Below, we further discuss their implications.
• Adding requirements – adding requirements can happen on any of the levels: product-specific, variability, or commonality. While a product-specific addition has no ramifications on other products or the product line infrastructure (except that the addition must conform to the restrictions of the product line infrastructure), the addition of a variability or commonality will impact the product line infrastructure and thus have ramifications on other products as well.
• Deleting requirements – again this can happen on any of the levels: product-specific, variability, or commonality. For product-specific requirements there are no ramifications for further products. Variability and commonality deletions affect the product line infrastructure. Deleting commonalities and variabilities that are relevant to other products also has an impact on these other products.
• Modifying requirements – modifying requirements can happen on any of the levels: product-specific, variability, or commonality. In particular, a modification may change the category of the requirement, as shown in [BCM+03].
o Product-specific modifications can lead to a mere modification of content or can also change a requirement into a variability.
o Variability modifications can change a variability into a commonality or a product-specific requirement. In particular, it needs to be determined what happens to other products that share the variability: the change may affect either all products in the same manner or only some products, leading, for example, to the introduction of an additional variability resolution. Particular forms of variability modification are that the relevant constraints are modified or that attributes of the variability, like its binding time, are modified.
o Commonality modifications may either impact the commonality per se, changing it for all products, or affect only some products and thus turn the commonality into a variability.

An overview of these types of changes is given in Figure 2. Some changes may also modify the type of a requirement in terms of commonality, variability, and product-specific. In Figure 2 the boxes represent the SPL requirement categories as introduced in Section 3. The arrows visualize the intent of a change to a requirement, e.g. to make a product-specific requirement more generic and thereby turn it into a variability. Furthermore, the arrows denote the impact in terms of the number of products affected by changing a requirement: product-specific changes impact individual products, variability changes impact multiple products, and commonality changes impact all products.

Figure 2: Summary of Requirements-Level Changes.
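As a small, self-contained illustration of the impact arrows in Figure 2, the following C# sketch estimates how many products a requirement change touches. It is not taken from any product line tool; the inputs are plain collections and the names are hypothetical.

using System.Collections.Generic;
using System.Linq;

public static class ChangeImpact
{
    public enum Category { Commonality, Variability, ProductSpecific }

    // usages: for each product id, the set of variability ids whose resolutions the product uses.
    public static int AffectedProducts(
        Category category,
        string changedRequirementId,
        string owningProductId,                                  // only relevant for product-specific requirements
        IReadOnlyDictionary<string, HashSet<string>> usages)
    {
        switch (category)
        {
            case Category.ProductSpecific:
                return usages.ContainsKey(owningProductId) ? 1 : 0;                  // impact on an individual product
            case Category.Variability:
                return usages.Count(kv => kv.Value.Contains(changedRequirementId));  // products using the variability
            default:
                return usages.Count;                                                  // commonality: impact on all products
        }
    }
}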

3.2 Product-Level Change Product-level changes describe additions and deletions of whole products. Modifications of individual products can be described as requirements-level changes. As defined in [HSH06], there is a need to distinguish between the marketed and the engineered product line in product line engineering. A marketed product line denotes a set of products that are marketed together as sharing a common set of features, whereby different products often substitute each other (e.g. different versions of the Windows operating system). An engineered product line is designed and developed for and with reuse in order to share major parts of the implementation. Conversely, different products in an engineered product line may belong to different marketed product lines (e.g. navigation systems of one producer may be sold under different brands). Here, we focus on the engineered product line only.
• Adding a product – this requires the definition of the selected variabilities and of the product-specific concerns. Thus, it only establishes uses relations for the various variabilities (and commonalities); a sketch of this operation follows at the end of this section.
• Deleting a product – this leads to the deletion of the corresponding product-specific requirements. Usually the variabilities remain in existence in the product line infrastructure, even if they are no longer used.
Modifying a product directly maps to modifying requirements as discussed above.
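Continuing the illustrative data model sketched at the beginning of Section 3 (hypothetical types, not part of an existing tool), the following snippet shows the two product-level operations: adding a product only establishes usage relations and product-specific requirements, and deleting a product removes the product-specific requirements while the variabilities remain in the infrastructure.

using System.Collections.Generic;
using System.Linq;

public static class ProductLevelOperations
{
    // Adding a product only creates uses relations to existing variabilities (and commonalities)
    // plus its product-specific requirements; the infrastructure itself is not changed.
    public static Product AddProduct(string id,
                                     IEnumerable<VariabilityUsage> selectedResolutions,
                                     IEnumerable<Requirement> productSpecific)
    {
        return new Product
        {
            Id = id,
            SelectedResolutions = selectedResolutions.ToList(),
            ProductSpecificRequirements = productSpecific.ToList()
        };
    }

    // Deleting a product drops its product-specific requirements; the variabilities it used
    // stay in the product line infrastructure, even if they are no longer used.
    public static void DeleteProduct(List<Product> portfolio, string id)
    {
        portfolio.RemoveAll(p => p.Id == id);
    }
}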

3.3 Product-Line-Level Change A complex product portfolio is sometimes represented in terms of a number of product lines. These product lines can either be independent, in the sense that they do not explicitly share any product-line-level assets, or they can be structured, in the sense that they share some assets. The latter is sometimes also called a product population [vO05]. As a consequence, we can distinguish the following cases as a refinement of the product / product line operations that were defined in the context of [BCM+03]:
• Adding a product line – a new product line is established. This may either be a completely new and independent product line or a sub product line in the sense of a product population as defined above. In the latter case, there is a relation between the sub product line and the upper-level product line, which is similar to the relation between an individual product and its product line.


• Removing a product line – the product line is removed. Again this can be subdivided into two cases: either an independent product line is removed, or a sub product line of a product population is removed.
• Merging two product lines – sometimes it is relevant to merge different product lines (e.g., in the case of an acquisition, or if they become increasingly similar over time).
• Splitting a product line – the opposite operation is the split of a product line. This may lead either to two independent product lines or to two product lines that are part of a product population. In fact, this case can be handled by moving the appropriate product-specific elements into the target SPL while variabilities and commonalities are simply duplicated.

4 Requirements for Evolution Support of Software Product Lines In the preceding sections we thoroughly discussed the different situations that may occur in product line evolution. Now we turn to the core of our paper: we build on this taxonomy and derive requirements on evolution support. The underlying assumption is that a systematic characterization of changes offers a higher potential for achieving a complete and consistent set of evolution support requirements than an ad-hoc approach does. In the following table, we summarize the major operators that must be supported for product line evolution. In addition, the table describes the traceability relations that are relevant to supporting the described operations and that go beyond the usual relationships in single system development.

Situation / Evolution operator: traceability information

Requirements level – Add requirement
• Add product-specific requirement: ---
• Add new variability: relation between product and infrastructure
• Add new resolution in an existing variability: relation to the other variability resolutions; relation between product and infrastructure
• Add new commonality: relation between product and infrastructure
Requirements level – Modify requirement
• Modify content of a product-specific requirement: ---
• Modify content of a variability for some products (either introduces additional resolutions or introduces product-specifics): relation to the other variability resolutions; relation between product and infrastructure
• Modify content of a variability for all products: relation between product and infrastructure; relation to the usages of this variability resolution
• Modify variability constraints: relation to the usages of the variability
• Modify variability properties (e.g., binding time): relation to the usages of the variability
Requirements level – Delete requirement
• Delete product-specific requirement: ---
• Delete variability: relation to the usages of the variability
• Delete variability resolution: relation to the usages of the variability resolution
Product level – Add product
• Add a product: relation to commonalities and selected variability resolutions; relation between product and product-specific requirements
Product level – Delete product
• Delete a product: relation to commonalities and selected variability resolutions; relation between product and product-specific requirements
Product line level – Adding a product line
• Add an independent product line: ---
• Add a sub product line: relation to commonalities and selected variability resolutions (of the higher level SPL); relation of the product line to specific commonalities and variabilities
Product line level – Removing a product line
• Remove a product line: ---
• Remove a sub product line: relation to commonalities and selected variability resolutions (of the higher level SPL); relation of the product line to specific commonalities and variabilities
Product line level – Merging two product lines
• Merge two independent product lines: --- (see footnote 1)
• Merge two sub product lines: relation to commonalities and selected variability resolutions (of the higher level SPL)
Product line level – Splitting a product line
• Split a product line: relation to commonalities and selected variability resolutions (of the higher level SPL)

1 While traceability information among the different product lines would be very useful for merging, we cannot assume that it exists, as by definition these are independent product lines.
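The rows of the table could be encoded as data in a product line environment, for instance as sketched below. This is only an illustration under our own naming assumptions; it is not part of any existing requirements engineering tool.

// Levels of the taxonomy and the kinds of traceability links the operators rely on.
public enum EvolutionLevel { RequirementsLevel, ProductLevel, ProductLineLevel }

public enum TraceLinkKind
{
    ProductToInfrastructure,
    ResolutionToSiblingResolutions,
    VariabilityToUsages,
    ResolutionToUsages,
    ProductToProductSpecificRequirements,
    SubProductLineToHigherLevelAssets
}

// A concrete traceability relation between two identified elements.
public record TraceLink(TraceLinkKind Kind, string SourceId, string TargetId);

// One row of the operator table: the level, the operator name and the traceability
// information that must be available to apply it.
public record EvolutionOperator(
    EvolutionLevel Level,
    string Name,
    TraceLinkKind[] RequiredTraceability);

// Example: the "Add new variability" row of the table.
// var addVariability = new EvolutionOperator(
//     EvolutionLevel.RequirementsLevel, "Add new variability",
//     new[] { TraceLinkKind.ProductToInfrastructure });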


As a result of the above discussion, we believe we can provide a theoretically more sound analysis of the operations and traceability relations that a requirements-based product line environment shall support.
• The various change operators identified above shall be supported. In particular, the nesting of these operators shall be respected and correspondingly supported.
• Traceability among the usages in different products and between products and the infrastructure should be supported.
• In addition, it would be useful to be able to work with multiple versions of a variability model, in order to enable products to rely on different versions and to defer the update of products until an appropriate time.
• Changes in variability constraints should be identifiable with respect to their impacts.
• Changes in terms of additional attributes shall be supported (e.g., binding time).
• The difference between requirement and realization (e.g., required binding time vs. established binding time) shall be represented.

5 Discussion We provided a systematic discussion of different product line evolution operations from the viewpoint of requirements engineering. A taxonomy of evolution operations was given, structured into requirements-level, product-level and product-line-level changes. Based on the discussion in this paper, we outlined the traceability information influenced by the individual evolution operations and derived a set of requirements to be fulfilled by a product line environment in order to support the systematic evolution of SPL. To our knowledge, no support has so far been provided which is in accordance with all these requirements. We thus stipulate that the presented taxonomy can be used as a reference for product line evolution support. Some interesting issues for further research are: to realize a prototypical product line environment that systematically supports the operations and requirements outlined above, to validate the taxonomy and its completeness by applying the environment and the proposed evolution operations, and to evaluate SPL case studies, e.g. to derive the frequency of occurrence of the individual evolution operations of our taxonomy.

References
[BCM+03] G. Böckle, P. Clements, J. McGregor, D. Muthig, and K. Schmid. A Cost Model for Software Product Lines. In Proceedings of the 5th International Workshop on Product Family Engineering (PFE'5), number 3014 in LNCS, pages 310–316. Springer, 2003.
[BLO03] L. C. Briand, Y. Labiche, and L. O'Sullivan. Impact Analysis and Change Management of UML Models. In Proceedings of the IEEE International Conference on Software Maintenance 2003, pages 256–265. IEEE Computer Society, September 2003.
[BSB05] K. K. Breitman, J. Cesar Sampaio do Prado Leite, and D. M. Berry. Supporting scenario evolution. Requirements Engineering, 10(2):112–131, 2005.
[CCKK06] K. Czarnecki, H. Chang, P. Kim, and K. Kalleberg. Feature models are views on ontologies. In Proceedings of the 10th International Software Product Line Conference, pages 41–51, Aug. 2006.
[CN02] P. Clements and L. Northrop. Software Product Lines: Practices and Patterns. Addison-Wesley, Boston, MA, 2002.


[DLS05] G. Deng, G. Lenz, and D. C. Schmidt. Addressing Domain Evolution Challenges in Model-Driven Software Product-line Architectures. In Proceedings of the ACM/IEEE MoDELS 2005 Workshop on MDD for Software Product-lines: Fact or Fiction?, Jamaica, pages 247–261, October 2005.
[Gus03] J. Gustavsson. A Classification of Unanticipated Runtime Software Changes in Java. In Proceedings of the IEEE International Conference on Software Maintenance 2003, pages 4–12. IEEE Computer Society, 2003.
[Hat07] L. Hatton. How Accurately Do Engineers Predict Software Maintenance Tasks? IEEE Computer, 40(2):64–69, February 2007.
[HHS05] A. Helferich, G. Herzwurm, and S. Schockert. Developing Portfolios of Enterprise Applications using Software Product Lines. In Turowski and Zaha [TZ05], pages 71–85.
[HSH06] A. Helferich, K. Schmid, and G. Herzwurm. Reconciling Marketed and Engineered Software Product Lines. In Proceedings of the 10th International Software Product Line Conference, pages 23–27, 2006.
[HWK+06] L. Hotz, K. Wolter, T. Krebs, S. Deelstra, M. Sinnema, J. Nijhuis, and J. MacGregor. Configuration in Industrial Product Families - The ConIPF Methodology. Akademische Verlagsgesellschaft Aka, Berlin, 2006.
[LSR07] F. van der Linden, K. Schmid, and E. Rommes. Software Product Lines in Action - The Best Industrial Practice in Product Line Engineering. Springer, 2007. http://www.spl-book.net/.
[MCS05] J. I. Maletic, M. L. Collard, and B. Simoes. An XML Based Approach to Support the Evolution of Model-to-Model Traceability Links. In Proceedings of TEFSE 2005, Long Beach, California, USA, November 2005.
[MGN05] J. Meinecke, M. Gaedke, and M. Nussbaumer. A Web Engineering Approach to Model the Architecture of Inter-Organizational Applications. In Turowski and Zaha [TZ05], pages 125–137.
[MRP06] P. Mäder, M. Riebisch, and I. Philippow. Aufrechterhaltung von Traceability Links während evolutionärer Softwareentwicklung. Softwaretechnik-Trends, 26(3):89–90, May 2006. Proceedings of the 8th Workshop Software-Reengineering, Bad Honnef, May 3-5, 2006.
[MSG96] R. R. Macala, L. D. Stuckey Jr., and D. C. Gross. Managing Domain-Specific, Product-Line Development. IEEE Software, 13(3):57–67, May 1996.
[Myl01] T. Myllymäki. Variability Management in Software Product Lines. Technical report, Tampere University of Technology, Software Systems Laboratory, 2001.
[PT93] C. Potts and K. Takahashi. An active hypertext model for system requirements. In J. C. Wileden, editor, Proceedings of the 7th International Workshop on Software Specification and Design, pages 62–68. IEEE Computer Society, December 1993.
[Rie04] M. Riebisch. Supporting Evolutionary Development by Feature Models and Traceability Links. In Proceedings of the 11th Annual IEEE International Conference and Workshop on the Engineering of Computer Based Systems (ECBS 2004), Brno, Czech Republic, May 24-26, 2004, pages 370–377, 2004.
[RSE04] C. Rolland, C. Salinesi, and A. Etien. Eliciting Gaps in Requirements Change. Requirements Engineering Journal, 9:1–15, 2004.
[SB99] M. Svahnberg and J. Bosch. Evolution in Software Product Lines. Journal of Software Maintenance, 11(6):391–422, 1999.
[SJ04] K. Schmid and I. John. A Customizable Approach to Full Lifecycle Variability Management. Science of Computer Programming, 53(3):259–284, 2004.
[ST00] S. Schach and A. Tomer. Development/Maintenance/Reuse: Software Evolution in Product Lines. In P. Donohoe, editor, The Israeli Workshop on Programming Languages & Development Environments, organized by IBM Haifa Research Lab, Haifa University, Israel, July 2000, pages 437–450. Kluwer, 2000.
[SV02] K. Schmid and M. Verlage. The Economic Impact of Product Line Adoption and Evolution. IEEE Software, 19(6):50–57, 2002.
[TZ05] K. Turowski and J. M. Zaha, editors. Component-Oriented Enterprise Applications, Lecture Notes in Informatics, 2005.
[vO05] R. van Ommering. Software reuse in product populations. IEEE Transactions on Software Engineering, 31(7):537–550, July 2005.


Supporting Model-Based Iterative Software Development

László Angyal, László Lengyel and Hassan Charaf

Budapest University of Technology and Economics {angyal, lengyel, hassan}@aut.bme.hu

Abstract: Current software development practice requires efficient model-based iterative solutions. The high costs of maintenance and evolution during the life cycle of the software can be reduced by tool-aided iterative development. This paper presents how model-based iterative software development can be supported through efficient model-code change propagation. The presented approach facilitates bi-directional synchronization between the modified source code and the refined initial models. The synchronization technique is based on three-way abstract syntax tree (AST) differencing and merging. The AST-based solution enables syntactically correct merge operations. OMG's Model-Driven Architecture describes a proposal for platform-specific model creation and source code generation. We extend this vision with a synchronization feature to assist iterative development. Furthermore, a case study is provided.

Keywords: MDA, MIC, Model-Based Iterative Development, Three-Way AST Differencing, AST Merging.
1 Introduction Current model-based development approaches emphasize the use of models at all stages of software development. Models are used to describe all artifacts of the system, i.e., user interfaces, interactions, and properties of all the components that comprise the system. The MDA [1] approach separates the platform-independent application specification from the platform-dependent constructs and assumptions. A complete MDA application consists of a definitive platform-independent model (PIM) and one or more platform-specific models (PSMs) for those platforms that the application developer decides to support. The platform-independent artifacts are mainly Unified Modeling Language (UML) [2] and other software models containing enough specification to generate the platform-dependent artifacts automatically by model compilers. UML is a widespread general-purpose modeling language, but it is too general to support efficient source code generation. By closely focusing on a specific problem domain, domain-specific languages (DSLs) cover a narrow domain-specific area at a high abstraction level. Therefore, a greater part of the source code generation can be covered by domain-specific model processors. Model-Integrated Computing (MIC) [3] provides a framework for software production using both metamodeling environments and model interpreters. The domain-specific concepts of MIC are similar to those of MDA. MIC supports the flexible creation of modeling environments by means of metamodeling, and helps track the changes of the models. Using the MIC technique, the domain-specific models can be converted into another format such as executable code. One of the most important challenges of model-based development is round-trip engineering [4]. Almost every existing modeling tool supports generating code skeletons


and synchronizing the skeletons with the models. However, this is insufficient in most cases. Usually, the generated source code does not contain enough logic to be used effectively without further manual programming. Current reverse engineering tools mostly focus on inter-procedural calls or pattern recognition, and there is a lack of tool support for the statement-level handling of method bodies, although this is also important. Method bodies cannot be supported without comprehension of their logic. Text-based solutions are unable to extract enough information; therefore, an AST-based approach is required. In order to support model-driven iterative development, software models and the source code should be kept synchronized. This work provides a model-code synchronization solution based on three-way AST difference analysis and merging. The remainder of the paper is organized as follows. The next section discusses the motivation. Section 3 provides background information and related work, Section 4 describes our model-code synchronization approach and a concrete platform-specific solution with a case study. Finally, conclusions and future work are elaborated.

2 Motivation Our motivation is to support automatic, iterative application generation and synchronization with the same platform-independent model set for different platforms, e.g. for different mobile phone platforms (Fig. 1). The model space consists of the platform-independent models (PIMs) and the platform-specific models (PSMs). The PIMs are composed of static resource and other dynamic models (Fig. 1), which describe the user interface and the behaviour of the application. A PSM represents a concrete implementation of the PIMs in a selected platform. Several development platforms are available for mobile phones, e.g. Symbian [5], Java [6], and Windows Mobile [7]. Generally, any PSM can be produced from the PIM using graph rewriting-based model transformations [8]. PSMs are constructed to facilitate source code generation.

Fig. 1. Motivation principles: model transformations map the PIMs (resource and dynamic diagrams) to platform-specific models, from which Java, Symbian C++ and Windows Mobile C# source code is generated and kept synchronized.

Visual Modeling and Transformation System (VMTS) [9] is an n-layer metamodeling environment which supports visual editing of models according to their metamodels, and allows specifying Object Constraint Language (OCL) constraints in model transformation rules. Both the models and the transformation rules are represented internally as directed, labeled graphs. VMTS is also a model transformation system, which transforms models using graph rewriting techniques and supports code generation from models.


The relations between the AST elements can be considered the metamodel of the language, and a concrete AST of source code written in that language is an instance model of it. According to [10], an AST metamodel is powerful because the manipulation of programs becomes universal. The code can be regenerated from an AST model via model traversal, and parsed source code can be stored in such a model. Thus, choosing an AST model as the PSM can be advantageous. Both the PIMs and the source files generated from the PSMs can be modified by the developers. The model and the source code are modified independently of each other over time. To address the synchronization problem, this paper introduces how VMTS supports model-based iterative software development. In order to avoid error-prone manual change propagation, a synchronization feature between these states is essential. Bi-directional, non-destructive model-code change propagation enables iterative and incremental software development (IISD). Non-destructive means that previously committed changes on both sides are preserved. The whole development process can be split into iterations instead of sequential phases. After each iteration, developers have executable software to use, to draw conclusions from, and to incorporate the observations into the next iteration of the project.

Fig. 2 depicts a typical scenario of IISD. At the beginning of the first iteration, we generate code from a model. The development affects both the model and the code. At the end of each iteration, before the start of the next one, we should bring the model and the code into a consistent state. In model-driven iterative development, bi-directional non-destructive synchronization between the PSMs and the source code should be supported. Due to the incremental synchronization, the model is always up to date, and developers can choose between further source code modifications and additional model refinements. This paper describes a possible solution to model-based IISD.

Fig. 2. Iterative model-based development: initial code generation from the model, independent manual modifications of the model and the user-updated source code, and bi-directional non-destructive synchronization at the end of each iteration.

3 Backgrounds and Related Work Considerable research has been conducted on source code differencing and merging over the past 30 years. Software development projects are typically based on teamwork, where the team members work on many kinds of shared files. The methods of source code comparison and merging have always been, and will remain, a relevant problem. The comparison of two files as sequences originates in string operations, like finding the longest common subsequence (LCS) [11] or the longest common substring [12]. The two algorithms cover different matching approaches. Traditional file comparison tools usually use modified versions of longest common subsequence algorithms. The difference analysis of two strings (S1, S2) or trees (T1, T2) produces an edit script [11]. An edit script contains the operations needed to transform S1 into S2 or T1 into T2. The analysis


first matches the common parts and then generates a script that contains as few operations as possible. For example, the popular UNIX diff utility [13] finds the longest common subsequence of the lines of the two files and then creates a sequence of insert and delete operations. Diff works well on general text files, but when differencing source code files it can produce useless edit scripts, because diff does not know the specialities of programming languages. To compare two pieces of source code correctly, the algorithm must take the grammar of the language into consideration. Such algorithms work on the parse trees of the files and use different approaches. Hierarchical structures such as trees cannot be handled as simple sequences. [14] provides an algorithm for change detection in hierarchically structured information that has linear time complexity. This approach is intended for ordered trees with a large amount of data. The FastMatch and EditScript algorithms are provided for the two sub-problems of the change detection problem. FastMatch uses heuristic solutions, e.g. equality comparison of two labeled nodes containing a value, for computing the unique maximal matching. The EditScript algorithm supports the insert/delete/update and the complementary move operations. The improved versions [15] of these algorithms work on unordered trees and allow copy/glue operations as well. A merge reconciles the changes made in several revisions of a software artifact. A distinction can be made based on the number of artifacts, e.g. files, processed by the merge. Two-way merge works on two revisions (A1 and A2) of the same software artifact. Typically this is error-prone and requires human intervention and verification; therefore, this technique is rarely used in practice. In the three-way merge approach, the common ancestor (A0) of the two modified artifacts (A1 and A2) is used to detect and resolve more conflicts; e.g. a two-way merge is unable to distinguish an insert from a delete. A three-way merge is performed after a difference analysis between A1 and A2 that takes A0 into account. This approach is the most reliable in practice. Three-way comparison with automatic conflict resolution allows automated merging. The best-known tool that uses the three-way merge approach is diff3 [16]. Further information about other software merging techniques can be found in [17]. Merge tools that work on parse trees have to output the reconciled source code from the tree. Pretty-printing [18] is a technique that makes an internal structure human readable. It traverses the parse tree and visits every node while printing the output. It facilitates source code generation from an AST. The tokens belonging to the nodes are printed in order. The inserted white spaces and blank lines make the output pretty. It is also known as a code beautifier, which makes code well-indented. Many papers have been written on the issue of three-way merging of trees; we summarize some of the closely related results. There are several general model-related merge approaches that differ significantly from ours and are therefore not listed below. [19] addresses the conflicts during a three-way merge of hierarchically structured documents. The eleven possible combinations of operations on pairs of child nodes during a merge are enumerated. 3DM [20] and DeltaXML [21] are tools for performing differencing and three-way merging of XML files. 3DM applies heuristic tree matching between trees. It performs node similarity measurements, i.e. content similarity and child list similarity, when no exact matches exist. The tree matching of DeltaXML uses an optimized longest common subsequence (LCS) algorithm, which matches the nodes at each level of the trees, and then the differences are reconciled.
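As an illustration of the sequence-based starting point discussed above, the following C# sketch implements a minimal LCS-based comparison of two token (or line) sequences and emits an insert/delete edit script. It is a didactic example in the spirit of diff, not the algorithm of [11], [13] or [14]; the tree-based algorithms generalize the same idea from flat sequences to labeled trees and add update and move operations.

using System;
using System.Collections.Generic;

public static class LcsDiff
{
    // Returns an edit script that transforms sequence a into sequence b.
    public static List<string> Diff(string[] a, string[] b)
    {
        int n = a.Length, m = b.Length;
        var lcs = new int[n + 1, m + 1];                 // lcs[i, j] = LCS length of a[i..] and b[j..]
        for (int i = n - 1; i >= 0; i--)
            for (int j = m - 1; j >= 0; j--)
                lcs[i, j] = a[i] == b[j] ? lcs[i + 1, j + 1] + 1
                                         : Math.Max(lcs[i + 1, j], lcs[i, j + 1]);

        var script = new List<string>();
        int x = 0, y = 0;
        while (x < n && y < m)
        {
            if (a[x] == b[y]) { x++; y++; }              // common element, nothing to record
            else if (lcs[x + 1, y] >= lcs[x, y + 1]) { script.Add($"DELETE {x}: {a[x]}"); x++; }
            else { script.Add($"INSERT {y}: {b[y]}"); y++; }
        }
        while (x < n) { script.Add($"DELETE {x}: {a[x]}"); x++; }
        while (y < m) { script.Add($"INSERT {y}: {b[y]}"); y++; }
        return script;
    }
}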


Syntactic merge techniques use trees or graphs as the underlying data structure. [22] proposes a program merging system that consists of a syntax-based comparator, a synchronous pretty-printer, and a merging editor. The synchronous pretty-printer traverses the trees previously matched by the comparator and produces an intermediate merged file that can be edited further with the merging editor; the differences are coloured in the editor so that the user can verify the merge. Another interesting approach is presented in [23], where graph-based software merging using a category theory approach is described. The proposed approach does not depend on any specific programming language: the programs are represented as graphs and the approach uses categorical formalisms. The novelty of our work is the combination of three-way tree differencing, edit-script-controlled tree patching (merge) and a model-driven development approach to achieve the synchronization of the model and code spaces. Most of the tools and cutting-edge research in the field of synchronization focus on differencing and merging between the same kinds of artifacts (mostly source code).
4 Contributions First of all, in this section a novel PSM-code synchronization approach is elaborated that can perform an automated, syntactically correct three-way merge. This is followed by a concrete platform-dependent solution in VMTS with a case study.

4.1 The General Synchronization Method The synchronization method depicted in Fig. 3 is based on AST differencing. The files containing the source code (S0, S1) are parsed into ASTs. In the model space the PSM (M1) describes the implementation on a concrete platform. The PIMs and PSMs are stored in the model database, but Fig. 3 represents only the PSMs. In our case the PSM is an AST model that can directly contain the AST representation of the platform-specific code. This makes it easier to match and then merge the two different representations.

Fig. 3. Block diagram of the model-code synchronization: in the code space, the last synchronized code S0 and the modified code S1 are parsed into the ASTs A0 and A1; in the model space (database), M0 is the last synchronized AST model and M1 the refined AST model; differencing and merging produce the next synchronized AST A2, the synchronized code S2 (via the pretty printer) and the next synchronized AST model M2.


The following method is based on the assumption that the structure of the PSM and the AST built by the parser are conformable, or at least very similar. Obviously, the complex AST model (PSM), which is based on the model containment hierarchy, is not editable directly by the developer. Due to the large number of nodes and the absence of layout, it is nearly impossible to visualize it. PSMs that contain an AST representation are modified indirectly via model transformations or model-code synchronization. Our solution uses a three-way approach starting from the persistently stored artifacts (S0, S1, M1). The four phases of our synchronization method are (i) the auxiliary tree building phase, (ii) the edit script generation phase applying tree differencing, (iii) the edit script transformation phase, and finally (iv) the three-way tree merge phase. The last synchronized AST model (M0) is logically equal to the common ancestor (S0), which reflects the last synchronization state. It is important that both of them contain the same implementation state, but in different representations. Both the model and the code can be edited independently, and the synchronization means change propagation between them. The resulting artifacts of the merge are S2 and M2. The modified code (S1) and the refined model (M1) will be overwritten by the synchronized ones (S2 and M2). The others are temporary in-memory objects (A0, A1, A2).

Fig. 4. The details of synchronization: in-memory auxiliary trees (T0, T1) support the tree differencing (d0, d1) against the AST of the last common ancestor (A0); the resulting edit scripts (ε1, ε2) are transformed (ε1′, ε2′) and applied (a0, a1) to produce the next synchronized AST (A2) and the next synchronized AST model (M2), which are syntactically equal.

The difference analysis (d0, d1) between the two modified software artifacts (S1 and M1) and the common ancestor (S0) finds and identifies the applied edit operations. The source files (S0, S1) are parsed (p0, p1) into abstract syntax trees (A0, A1), and the PSM (M1) is already an AST structure. The differencing (d0, d1) produces the merge instructions, which are used during the merge phase (m). Then, using the pretty-printing technique (p2), the source code (S2) can be generated, i.e. the in-memory AST is serialized. The model (M2) is updated directly; no other operations are needed. In order to provide further details on the presented method, Fig. 4 illustrates a detailed version of the synchronization process depicted in Fig. 3. The trees denoted by A0, A1, A2 are in the code space, while M1, M2 are in the model space according to Fig. 3, and T0, T1 are temporary trees that help the difference analysis.
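The data flow just described can be summarized by the following hypothetical interface sketch. The type and method names are ours, not those of the VMTS implementation; the sketch only illustrates how parsing, differencing, merging and pretty-printing are composed.

using EditScript = System.Collections.Generic.List<string>;   // placeholder for a real edit script type

public interface ISyntaxTree { }
public interface IParser        { ISyntaxTree Parse(string sourceText); }                        // p0, p1
public interface ITreeDiffer    { EditScript Diff(ISyntaxTree ancestor, ISyntaxTree revised); }  // d0, d1
public interface ITreeMerger    { ISyntaxTree Apply(ISyntaxTree target, EditScript changes); }   // m
public interface IPrettyPrinter { string Print(ISyntaxTree tree); }                              // p2

public sealed class SyncPipeline
{
    // s0: last synchronized source, s1: modified source, m1: refined AST model (seen as a tree).
    public (string S2, ISyntaxTree M2) Synchronize(
        string s0, string s1, ISyntaxTree m1,
        IParser parser, ITreeDiffer differ, ITreeMerger merger, IPrettyPrinter printer)
    {
        ISyntaxTree a0 = parser.Parse(s0);                // common ancestor A0
        ISyntaxTree a1 = parser.Parse(s1);                // modified code A1
        EditScript codeChanges  = differ.Diff(a0, a1);    // edits made in the code space
        EditScript modelChanges = differ.Diff(a0, m1);    // edits made in the model space
        ISyntaxTree a2 = merger.Apply(a1, modelChanges);  // propagate model edits to the code AST
        ISyntaxTree m2 = merger.Apply(m1, codeChanges);   // propagate code edits to the model
        return (printer.Print(a2), m2);                   // S2 is serialized by pretty-printing, M2 is updated directly
    }
}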


The first phase of the synchronization builds two auxiliary trees (T0, T1) from the two different representations of the implementation, using the visitor pattern [24] and model traversal. In the second phase, the modifications made to the source code and the model refinements are identified, and edit scripts (ε1 and ε2) are produced. The differencing algorithm (d0, d1) works on the auxiliary trees instead of directly comparing the AST in memory and the PSM in the database. The homogeneous auxiliary tree representation is designed to fit the tree matching algorithm; the heterogeneous structure of an AST is difficult to traverse with a general algorithm, because the different types of nodes are not comparable. The advantage of this method is that it hides the concrete way the AST is stored and how the model is traversed: the differencing (d0, d1) and the edit script processing algorithms (a0, a1) are the same for arbitrary programming languages. The edit operations supported by our method are the following: (i) moving a subtree, (ii) inserting, (iii) deleting, and (iv) updating a node, because these are the typical AST modification operations. Since the model representation is in a database, recognizing subtree movements can keep the number of executed row inserts/deletes low during the synchronization. A node update occurs, e.g., when the name of a method or its return type is changed. Reordering two statements in a method body causes a subtree move operation. Each operation refers to exactly one tree node affected by the operation, which can be the root of a subtree, together with its parent node and its index.
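A minimal sketch of the homogeneous auxiliary tree node and of the four edit operations could look as follows. The type names and fields are illustrative assumptions, not the actual VMTS data structures.

using System.Collections.Generic;

// Homogeneous auxiliary tree node: the label is the type of the AST element, the value
// holds its content (e.g. an identifier name).
public sealed class LabeledNode
{
    public string Label = "";                 // e.g. "MethodDeclaration"
    public string Value = "";                 // e.g. the method name
    public LabeledNode? Parent;
    public List<LabeledNode> Children = new();
}

// The four supported edit operations.
public enum EditKind { InsertNode, DeleteNode, UpdateNode, MoveSubtree }

// Each operation refers to exactly one affected node (the root of a subtree for moves),
// its parent and the index within the parent's child list.
public sealed class EditOperation
{
    public EditKind Kind;
    public int NodeId;
    public int ParentId;
    public int Index;
    public string? NewValue;                  // payload for inserts and updates (e.g. a renamed method)
}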

A merged edit script could be created from the two existing scripts (εm = ε1 + ε2). This script could then be executed on the common ancestor (A0), and the result would be a synchronized AST that contains all modifications. In our case, however, there is no representation of A0 in each space, since it is stored only as a file (S0). Therefore, instead of using εm, we produce two transformed edit scripts (ε1′ and ε2′) in order to execute them on the opposite side (A1 or M1) and thus create the synchronized artifacts denoted by A2 and M2. This does not require storing the ancestors in both spaces, only a file in the code space. While transforming the edit scripts, it is important to note that operations from different edit scripts can affect each other. Since operations store positions, and inserting/deleting a node shifts the indices of the children of its parent node, positions must be maintained so that they are suitable for application on the opposite AST. The conversion of the edit scripts (i) updates the positions of the contained operations if necessary, (ii) removes overlapping delete operations and (iii) resolves conflicts between two edit operations. The conflict resolution requires labeling one side as the master side. Since, from the model-based development point of view, the model representation is the more relevant one, models are by default the primary artifacts. After the script conversion, in the merge phase the updated edit scripts are applied to both sides, which performs the change propagation. In this approach the edit scripts are used as inputs by the merging algorithms to control the merge. The merge can be considered as tree patching: it executes the modifications described by the edit operations. Formally, S2 = S1 + ε2′ and M2 = M1 + ε1′. The model in the database (M2) and the AST in memory (A2) have to contain the same elements both syntactically and semantically. This synchronized state (S2) is stored as the common ancestor for the next iteration. It is also important to remark that the synchronization should work with empty models or empty source files: for instance, if the model is empty, its content can be built from scratch during the synchronization.
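The edit script conversion could be sketched as follows, reusing the hypothetical EditOperation and EditKind types from the previous listing. The index maintenance and the model-as-master conflict rule follow the description above, but the code is an illustration under our own assumptions, not the actual VMTS algorithm.

using System.Collections.Generic;
using System.Linq;

public static class EditScriptTransformer
{
    // Transforms an edit script so that it can be applied on the opposite side, where the
    // operations of the other script have already taken effect.
    public static List<EditOperation> TransformForOppositeSide(
        List<EditOperation> toTransform,      // e.g. the model-side script to be applied on the code AST
        List<EditOperation> alreadyApplied,   // the edits already present on that opposite side
        bool toTransformIsMasterSide)         // by default the model side is the master
    {
        var result = new List<EditOperation>();
        foreach (var op in toTransform)
        {
            // (ii) an identical delete on both sides must not be executed twice
            bool duplicateDelete = op.Kind == EditKind.DeleteNode &&
                alreadyApplied.Any(o => o.Kind == EditKind.DeleteNode && o.NodeId == op.NodeId);
            if (duplicateDelete) continue;

            // (iii) conflicting updates of the same node: keep only the master side
            bool conflictingUpdate = op.Kind == EditKind.UpdateNode &&
                alreadyApplied.Any(o => o.Kind == EditKind.UpdateNode && o.NodeId == op.NodeId);
            if (conflictingUpdate && !toTransformIsMasterSide) continue;

            // (i) maintain positions: siblings inserted or deleted under the same parent at a
            // lower index shift the position at which this operation must be applied
            int shift = alreadyApplied.Count(o => o.Kind == EditKind.InsertNode &&
                                                  o.ParentId == op.ParentId && o.Index <= op.Index)
                      - alreadyApplied.Count(o => o.Kind == EditKind.DeleteNode &&
                                                  o.ParentId == op.ParentId && o.Index < op.Index);
            result.Add(new EditOperation
            {
                Kind = op.Kind, NodeId = op.NodeId, ParentId = op.ParentId,
                Index = op.Index + shift, NewValue = op.NewValue
            });
        }
        return result;
    }
}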


4.2 A Platform-Specific Solution One of the platforms supported by VMTS is Windows Mobile; this solution is presented in Fig. 5. VMTS uses the CodeDOM technology [25] as a language-independent model representation of the source code, which means that code generation is just syntax tree composition. [26] introduces how VMTS generates source code using model transformation methods and the CodeDOM metamodel. PIMs are transformed by model processors into CodeDOM instance models. The code generation technology of CodeDOM has several internal pretty-printers, e.g. for the C#, VB.NET, and C++ languages, but it can be extended to support other languages.

Fig. 5. Synchronization in the Windows Mobile platform: the common ancestor and the modified C# files are parsed into CodeDOM trees, differenced and merged with the refined CodeDOM model in the database, and the synchronized C# source file is emitted by the CodeDOM code generator.

We have created a demonstrative implementation of the proposed approach. An open source C# parser [27] is used that is capable of building a CodeDOM object tree in memory from parsed code. The CodeDOM tree and the CodeDOM model (PSM) in the database are traversed, and ordered, labeled trees are built from them, which are easy to process due to their homogeneous structure. The labels of the nodes are the types of the AST elements, to help the matching. The tree differencing and edit script computation are performed by the algorithm of [14], because it was easy to implement and allows the use of heuristics (node similarity) during the matching. The edit scripts are transformed by our algorithm, which works as mentioned previously. There are two different edit script execution units in the merger: one patches the CodeDOM object tree and the other handles the CodeDOM model in the database. These execution units are explicitly controlled by the edit scripts.

The reliability criteria of the merging approach are the following: (i) all changes must be detected, (ii) all changes must be propagated to the other side, (iii) syntactic correctness is ensured, (iv) there are no compile errors after the merge, i.e. semantic correctness, (v) model consistency is preserved, (vi) the program (business) logic is unharmed, and (vii) conflicts are handled properly, e.g. automatic decision making, overlapping changes, detection and resolution of identical changes (duplication), and reordering of newly inserted nodes. Criteria iv, vi and some aspects of vii have not been addressed yet. A general limitation of syntax-based techniques is that they are unable to detect undeclared variables, since the code is still syntactically correct. Some object-oriented behaviour, i.e. dynamic binding, cannot be resolved by purely syntactic approaches [17].

We tried the synchronization on source code that was auto-generated by the VMTS plugin generator [28]. The source file contained 2100 lines of code and 55 classes, its size was 104 kB, and the number of tree elements affected by the differencing was above 7000. We made two copies of that file to create the common ancestor and the two revisions. Some changes were made to the revisions without creating a conflict: a method body was removed and an extra class was added. The files and the edit scripts are available on the VMTS web site [9]. Currently one file is generated directly from a PSM. All of the addressed criteria were satisfied.


The drawback of reading source code into an AST for manipulation is that the subsequent serialization loses the formatting and comments, unless the parser, the pretty-printer and the AST take care of formatting information and comments. Although CodeDOM contains a comment node, only namespace, class and class member nodes handle comments. Fortunately, the chosen parser reads the comments but stores them separately from the AST nodes. However, a complete synchronization approach should deal with changes in the comments as well; this depends on the parsing technology and the design of the AST. CodeDOM has some other limitations: it does not support all of the language elements introduced by C#. Using code snippet nodes, the need for the unsupported elements can be bypassed. It works well on files that are generated from domain-specific models by VMTS. These limitations hardly reduce the applicability of our solution, because model-driven development relies mainly on the models and not on the generated code. The changes are typically performed in the model, but the developer has the opportunity to carefully modify the code as well.
5 Conclusions and Future Work The presented synchronization involves structural model-code differencing and three-way tree merging. The main advantage is that, in contrast to typical code generation approaches, it permits modifying the generated code; instead of losing the changes, they are synchronized back to the models. The incremental approach is more effective and keeps the previous modifications on both sides. The modular design also allows model-model and code-code synchronization. CodeDOM models are applied as PSMs, which enables easy code generation. With some minor changes, the illustrated method can be a solution to the synchronization challenge of other platforms as well. We separated the specific algorithms from the general ones; the differencing and the edit script transformation algorithms, which contain the core logic of the synchronization, are reusable. For instance, to support the Java platform, a parser for the Java language has to be written that is capable of building a CodeDOM tree in memory, or a converter is required that creates a CodeDOM tree from the AST of a Java source. Code generation for Java can be performed by the J# code provider [29] of CodeDOM. Obviously, any other language-dependent or language-independent code syntax scheme can be used as PSM instead of CodeDOM. VMTS facilitates effective modeling with easy domain-specific modeling language definition and customizable domain-specific visualization. The introduced model-code synchronization method affects not only the higher-level parts of the classes, i.e. methods and attributes, but also the whole source code space including statements. In the code-to-model direction, this approach is of little use without model transformations, because the complex CodeDOM model has to be transformed into domain-specific models in order to make the approach usable in practice. One drawback is that if the granularity of the transformation rules is not fine enough, some small changes in the code will not affect the PIMs directly. Syntactic correctness is ensured, since the change propagation works on the ASTs of the selected language and the elements of an AST are based on the grammar of the language. Semantically correct synchronization and support for more software platforms are the subject of our future work.



Acknowledgements

The fund of the "Mobile Innovation Centre" has supported, in part, the activities described in this paper. The parsing of C# source code into CodeDOM is performed by the free, open-source NRefactory parser [27].

References

[1] OMG, MDA Guide Version 1.0.1, document number: omg/2003-06-01, www.omg.org/docs/omg/03-06-01.pdf
[2] OMG, UML 2.0 Specification, http://www.omg.org/uml/
[3] J. Sprinkle, "Model-Integrated Computing", IEEE Potentials, 23(1), 2004, pp. 28-30.
[4] R. Kollman, P. Selonen, E. Stroulia, T. Systä, and A. Zundorf, "A Study on the Current State of the Art in Tool-Supported UML-Based Static Reverse Engineering", Ninth Working Conference on Reverse Engineering, Washington, 2002, pp. 22-34.
[5] C. Enrique Ortiz, "Introduction to Symbian OS for Palm OS developers", http://www.metrowerks.com/pdf/IntroSymbianOSforPalmDevelopers.pdf
[6] Sun, Java 2 Platform, Micro Edition (J2ME), http://java.sun.com/j2me/index.jsp
[7] Microsoft Windows Mobile, http://www.microsoft.com/windowsmobile
[8] G. Rozenberg (ed.), Handbook on Graph Grammars and Computing by Graph Transformation: Foundations, Vol. 1, World Scientific, Singapore, 1997.
[9] VMTS web site, http://vmts.aut.bme.hu
[10] Philip Newcomb, "Abstract Syntax Tree Metamodel Standard ASTM Tutorial", OMG's Second Annual Architecture-Driven Modernization Workshop, Alexandria, USA, 2005.
[11] Eugene W. Myers, "An O(ND) difference algorithm and its variations", Algorithmica, 1(2), 1986, pp. 251-266.
[12] W. F. Tichy, "The string-to-string correction problem with block moves", ACM Transactions on Computer Systems, 2(4), November 1984, pp. 309-321.
[13] UNIX diff manual, web site: http://unixhelp.ed.ac.uk/CGI/man-cgi?diff
[14] S. Chawathe, A. Rajaraman, H. Garcia-Molina, and J. Widom, "Change detection in hierarchically structured information", In Proceedings of the ACM SIGMOD International Conference on Management of Data, 25(2), Montreal, Quebec, June 1996, pp. 493-504.
[15] S. Chawathe and H. Garcia-Molina, "Meaningful change detection in structured data", Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data, 26(2), May 1997, pp. 26-37.
[16] S. Khanna, K. Kunal, and B. C. Pierce, "A Formal Investigation of Diff3", Manuscript, University of Pennsylvania, 2006.
[17] T. Mens, "A State-of-the-Art Survey on Software Merging", IEEE Transactions on Software Engineering, 28(5), May 2002, pp. 449-462.
[18] D. C. Oppen, "Prettyprinting", ACM Transactions on Programming Languages and Systems, 2(4), 1980, pp. 465-483.
[19] U. Asklund, "Identifying Conflicts During Structural Merge", Proceedings of the Nordic Workshop on Programming Environment Research '94, Lund University, 1994, pp. 231-242.
[20] Tancred Lindholm, "A three-way merge for XML documents", Proceedings of the 2004 ACM Symposium on Document Engineering, October 28-30, 2004, Milwaukee, Wisconsin, USA, pp. 1-10.
[21] R. la Fontaine, "Merging XML files: a new approach providing intelligent merge of XML data sets", In Proceedings of XML Europe 2002, Barcelona, Spain, 2002.
[22] W. Yang, "How to merge program texts", Journal of Systems and Software, Vol. 27, No. 2, 1994, pp. 129-135.
[23] N. Niu, S. Easterbrook, and M. Sabetzadeh, "A Category-Theoretic Approach to Syntactic Software Merging", 21st International Conference on Software Maintenance (ICSM'05), Budapest, Hungary, Sept. 2005, pp. 197-206.
[24] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides, "Design Patterns", Addison-Wesley, Massachusetts, 1994.
[25] Microsoft's CodeDOM Web Site, http://msdn2.microsoft.com/en-us/library/system.codedom.aspx
[26] L. Lengyel, T. Levendovszky, G. Mezei, B. Forstner, H. Charaf, "Metamodel-Based Model Transformation with Aspect-Oriented Constraints", Electronic Notes in Theoretical Computer Science, Volume 152, pp. 111-123.
[27] #develop web site: http://sharpdevelop.net/
[28] G. Mezei, T. Levendovszky, H. Charaf, "A Presentation Framework for Metamodeling Environments", Workshop in Software Model Engineering, Montego Bay, Jamaica, 2005.
[29] Visual J# code provider web site: http://msdn2.microsoft.com/en-us/library/w5e3ax1a(VS.80).aspx


An Active Domain Meta Layer for Documented Software Malleability

Dirk Deridder1, Sofie Goderis2, Isabel Michiels2 and Viviane Jonckers1

1 [email protected], http://ssel.vub.ac.be/, System and Software Engineering Lab, Vrije Universiteit Brussel, Belgium

2 http://prog.vub.ac.be/, Programming Technology Lab, Vrije Universiteit Brussel, Belgium

Abstract: In order to cope with the ever-increasing adaptability and flexibility requirements for software we introduce the notion of a Concept-Centric Coding (C3) environment. The aim is to provide an active domain meta layer that can be used as a basic infrastructure to set up a documented malleable implementation. We illustrate this by means of COBRO, a proof-of-concept C3 environment developed in VisualWorks Smalltalk.

Keywords: Run-time Adaptation, Documentation, Agile Development, Co-Evolution

1 Introduction

In order to survive in today’s highly dynamic marketplace, companies must show a continuous and ever-increasing ability to adapt. This level of agility, necessary to keep up with the business side of the adaptability challenge, consequently imposes itself on the software side. As a result the current generation of software developers needs to service extreme adaptability and flexibility requirements in an environment of extremely strict deadlines, short time-to-market expectations, and limited availability of project resources [CDBD04].

The kind of systems targeted by this research are E-type object-oriented applications written in a class-based programming language. Such E-type applications are known to be in a constant state of change [Leh96] and typically call for a malleable implementation to facilitate adaptations as swiftly as possible. Additionally such systems are closely connected to a real world domain, which means that there exists a tight two-way interaction between changes in the real world and changes to the application [Pfl98]. As a result of this close connection to the real world, a lot of domain expertise is involved in the development of E-type systems.

Since a lot of this domain expertise is lost over time or remains implicit in a software implementation, we present an approach that provides support to make it explicit in connection to the code. Moreover, to support a programmer in developing a malleable implementation, we incorporate a mechanism that renders the active use of this knowledge transparent to the programmer. Both elements will enable developers to cope with the adaptability challenge referred to above.


The approach is labelled Concept-Centric Coding (C3)1 and is supported by a tool suite named the Concept to Code Browser (COBRO) [Der06]. In Section 2 we identify three main sources of problems that form a bottleneck for evolving object-oriented E-type applications. These are described from a worst case perspective and result in a considerable overhead for the developers. In Section 3 we introduce the C3 environment to counter the problems identified. In essence we propose an environment that provides support to make domain knowledge explicit, to couple it to the code, and to use it in an active fashion. The domain knowledge thus forms an integral and essential part of the implementation and can be used as a basic building block for creating a malleable implementation. This results in an active domain meta layer where special care is taken to reduce the overhead perceived by developers to an absolute minimum. This is based upon a symbiotic integration2 of the domain meta layer with the underlying programming language which lets programmers handle domain concepts and implementation objects transparently.

2 The Need for a Concept-Centric Coding Environment

2.1 Implicit Domain Knowledge

Up to 60 percent of the time spent in software evolution is dedicated to program comprehension [ABDT04]. Canfora et al. even mention estimates as high as 90 percent [CC05]. So before any adaptation actually takes place, a lot of effort is put into trying to understand the software. Therefore, in program comprehension a major amount of time is dedicated to rediscovering and reconstructing the domain knowledge3 that was available to the original developer. The fact that a lot of it remains implicit is clearly problematic for evolution.

Even though this is a rather plain observation, there is also empirical evidence that domain concept descriptions are amongst the most essential information needed by software maintainers [KSP04]. Koskinen et al. for example observed that domain concept descriptions and connected domain-program-situation knowledge are among the top three most frequent information types needed by software maintainers.

Broadly speaking, the main factors why knowledge about a software system becomes implicit can be summarised as follows [Der06]:

• High personnel turnover
Software developers and domain experts are a volatile asset. They switch companies or project teams frequently. Moreover, depending on the stage the software is in, it will become less likely to find people that are well-acquainted with the system.

• Complexity of software systems
As a result of the complexity and the size of contemporary software systems, the peculiarities of a particular component or implementation are often lost over time. This is not only a result of the difficulty people have to grasp the entire system. It is also a consequence of the large number of dependencies between the different elements.

• Short-term software development economics
Spending time on documenting a system is time that cannot be spent on development. As a consequence, code documentation is often neglected by developers in favor of producing code that contributes to the next release of the system.

• Kind of knowledge
Even in a model-centric approach to software development not all knowledge is made explicit. For instance, the motivation for a particular design choice is often left undocumented. Also, common sense knowledge about the business domain in which the system operates is often unavailable in the code and the models. This is especially true in the case of a malleable generic implementation.

• Genericity of the implementation
Continued development is also about refactoring the software to make it more susceptible to future changes. A basis for enhancing the flexibility of an implementation lies in writing generic code. Unfortunately, generic code results in less explicit code since a lot of details become hidden in the data that is loaded at runtime. Moreover, less explicit code is not helpful for code comprehension since it has a negative impact on readability [Fow01].

• Reduced information quality
Each time information is passed on in a communication chain, there is a deterioration of its quality. Such a deterioration also occurs when an evolution request is passed on from a domain expert to a developer, since the developer will have to map the business need onto the implementation. Moreover, each evolution step results in certain parts of the implementation being rewritten, which also causes a recurrent loss of information.

Footnotes:
1 http://ssel.vub.ac.be/c3/
2 With symbiotic we refer to a mutually beneficial relationship between the domain meta layer and the code layer. With integration we refer to the fact that both are closely combined so that they form a whole.
3 The set of knowledge that captures the concepts and rules that govern a particular domain. This includes business domain knowledge, which is rooted in the business (e.g., insurance, banking), and application domain knowledge, which is rooted in the software used in a particular business.

We have briefly illustrated that domain knowledge becomes implicit for different reasons. This is problematic in the context of software evolution, since it is one of the main assets for program comprehension. Given this observation we conclude that there is a clear need for an explicit medium to capture the relevant domain knowledge. Next we discuss the importance of having a connection between the code and the domain knowledge.

2.2 Detached Domain Knowledge

Making the domain knowledge available in some explicit form would only be a partial solution to the problem. Developers also need to know to which part of the implementation the knowledge applies. Traditionally however, when knowledge is made explicit about an implementation, it is typically done detached from the implementation. This means that there is no explicit link available between the knowledge and the code. The evidence by Koskinen et al. mentioned before also confirms that program-domain-situation knowledge is highly ranked by developers. This is not so surprising since developers are actually performing a continuous mapping of elements from the problem domain into the programming domain [MV95].


Mapping the domain knowledge back to the code is inherently difficult. This is in part because the domain knowledge cannot always be brought back to one specific location in the code. This manifests itself especially in what is known as distributed domains, since the corresponding domain functionality typically crosscuts the entire application. As the mapping process between the knowledge and the code is difficult and time-consuming, it is important that a developer can attach his findings to the code once the (manual) mapping is complete.

To worsen the situation, there also still exists a clear dichotomy of the traditional analysis-design-implementation phases. Broadly speaking, this refers to the fact that the artifacts are decoupled each time control is passed from one phase to another. As a consequence, knowledge that is available in an explicit form at a certain point in time will become detached from the implementation in the end.

In this section we have seen that developers not only require explicit domain knowledge. They also need to know to which part of the implementation the knowledge applies. This indicates the need for a mechanism that helps developers to make the link between their domain knowledge and the implementation explicit. Next we discuss why it is important that developers do not perceive domain knowledge as a passive asset and as an overhead to their development activities.

2.3 Passive Domain Knowledge and Overhead for the Developer

When domain knowledge is made explicit and when it is linked to the implementation, developers need to keep this information up-to-date. Yet this is not straightforward, as a result of short-term software development economics.

First of all, this is due to the fact that developers do not perceive documenting as an activity that contributes to meeting their next deadline. As a consequence, it is often neglected in favor of writing code. This is because the contribution of writing code to the next release of the system is clearly visible. You could say that the domain knowledge plays a passive role with respect to the execution of the system. We mean this in the sense that it does not participate actively in providing the functionality of the system in any way.

Secondly, writing down the domain knowledge associated with a system is mostly done outside a development environment. This is perceived as an overhead by developers since they need to switch frequently between environments. A seamless integration of tools in such a case is beneficial for reducing the overhead introduced by the documentation system.

From this you could conclude that the domain knowledge documentation and the implementation live in a state of antibiosis. This means that an adverse relationship exists between both sides, in which one is adversely affected by the other. In essence this boils down to the fact that spending time on the documentation results in less time available for the code level.

In the context of agile software development, the fact that domain knowledge plays a passive role poses an even bigger problem. Here, requirements are elicited and refined on the go during the implementation of the system. Hence the software grows incrementally and is under constant revision. As a consequence, it is unfeasible to have a conventional documentation system. If domain knowledge is made explicit in such a context, it will always be out-of-date unless the developers spend a lot of time on updating it each time the code is changed. It is clear that unless the domain knowledge contributes to the implementation, developers will not or cannot dedicate any time to it.


In this section we have seen that one of the main reasons why domain knowledge becomes implicit and detached is that it plays a passive role. As a result it is not kept up-to-date and its usefulness as documentation will quickly degrade. This indicates the need for a mechanism that enables and promotes the activation of the domain knowledge. Moreover the mechanism should not be perceived as an overhead to the developer.

2.4 Summary

In summary, the problems discussed bring us to the following requirements that need to be addressed by the concept-centric coding environment we describe in the next section:

Implicit Domain Knowledge: A C3 environment must provide a mechanism to capture domain knowledge in an explicit form. This will reduce the possible loss of domain knowledge.

Detached Domain Knowledge: A C3 environment must provide a mechanism to couple the domain knowledge to the implementation. This will reduce the effort spent by developers on retroactively matching the knowledge to the corresponding implementation.

Passive Domain Knowledge and Overhead for the Developer: A C3 environment must provide a way to involve the domain knowledge actively in order to provide the functionality of the software system. This empowers a developer to improve the malleability of the implementation by devoting time to the domain knowledge documentation. Moreover, a C3 environment must be built in a way that it is not perceived by developers as an overhead.

3 COBRO: a Concept-Centric Coding Environment

In this section we couple the four key points related to the absence of domain knowledge to the different elements of our solution. In short, the C3 environment is based upon a symbiotic integration between an active domain meta layer and a programming environment. In the following subsections we discuss how the C3 solution takes us from a situation where domain knowledge is implicit, detached, and passive towards a situation where it is explicit, coupled, and active. Moreover, special care is taken to make the use of the C3 solution by a developer as transparent as possible. Throughout the discussion we refer to COBRO, which is a proof-of-concept C3 tool suite developed in VisualWorks Smalltalk. For a detailed discussion of C3 and COBRO we refer to [Der06].

3.1 Setting up an Explicit Domain Meta Layer

Making implicit knowledge explicit requires some kind of formalism in which the developer can express this knowledge. For our application context we opted for an open and lightweight formalism. This is intended to minimise the overhead perceived by developers when using it. Moreover, the openness of the formalism is required to enable the developer to customise it according to a particular task at hand. As we will discuss in Section 3.4, we prefer to use the same syntax and interaction mechanisms of the programming language (i.e. Smalltalk) to manipulate and access the domain meta layer. This is in line with minimising the difficulties encountered by users as described by Shipman and Marshall [SM99]. First of all, they suggest to minimise the cognitive overhead perceived by the users (e.g. by using a syntax that is familiar to the user).


Secondly, they stress that you should not force the user to commit to a premature structure (e.g. by not enforcing the use of a class hierarchy if it is better to model in a different way). Finally, they highlight the importance of enabling the user to tailor the representation according to the situation in which it is being used (e.g. by providing an open formalism that can be extended if necessary).

We developed a frame-based representation named COBRO-CML in which domain concepts are related to each other within a domain meta layer. This layer can be considered to act as an ontology for both the developers and the implementation, since it provides an explicit shared representation of a domain conceptualisation. In COBRO a concept is defined by a definition frame that contains definition entries. We present an example below in COBRO-CML:

(Concepts new: #{Adult})
    hasLabel: 'Adults age category';
    superconcept: Concepts.AgeCategory;
    ageRange: '#(25 64)';
    save.

In the example we illustrate the definition of the Adult concept, which is an AgeCategory concept. Note that the syntax and interaction mechanisms of COBRO-CML are identical to Smalltalk for the purpose of transparency. The concepts from the domain meta layer are stored in a relational database, but this is not visible to the developer since they are first-class citizens.

The definition entries consist of two elements: a relation (e.g. superconcept) and a destination (e.g. Concepts.AgeCategory). Destination values can be another concept or a terminal. A terminal represents a value about which we do not wish to record extra information in the concept network other than its value (e.g. #(25 64)). Hence its definition frame does not contain definition entries but just the value that it represents. In essence, the choice of representing an element as a terminal or a concept defines the ontological granularity of the domain meta layer. The types of terminals that are supported by COBRO are extensible. Hence it is possible to adapt the concept representation according to the particular needs of the developer (e.g. to relate a concept to a drawing). This is realised through the notion of valueInterpreters in the domain meta layer, which define how a particular terminal value should be handled. Note that the relations that are used in the definition entries are also defined as concepts. As a consequence, the domain meta layer is said to be self-describing. So in the example, the hasLabel, superconcept, and ageRange relations are also defined as concepts. This results in a highly extensible environment in which developers can tailor the meta layer to their needs.

A prototype-based approach is followed for the domain meta layer. In a class-based approach the concepts would be represented as classes. Hence a concept could only exist if there already existed a class that described the general characteristics of that type of concept. In contrast, the prototype-based approach simply represents the actual concepts without this restriction. This is analogous to prototype-based programming languages and results in a more natural and highly dynamic way of interaction. For instance, you do not have to switch between class and object views to alter a concept definition. The choice for a prototype-based approach follows naturally from our goal not to force users into a premature structure. A class-based approach, on the contrary, would force the user to commit to a taxonomy of concepts from the very beginning. As a consequence, a mismatch is often created if the pre-defined interpretation of, for example, is-a is not in line with the user's intention [WF94].

Often a distinction is made between ontology-aware and ontology-driven applications [Gua98]. This distinction is based upon whether the concepts from the ontology are used either passively or actively by the application. As we shall see in the following sections, we move towards an environment that is best characterised as ontology-driven.
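As an illustration of this self-describing property, a relation such as ageRange could itself be introduced with the same mechanism as the Adult concept above. The following sketch is only indicative: the label text and the Concepts.Relation superconcept are illustrative assumptions and not part of COBRO's published vocabulary.

(Concepts new: #{ageRange})
    hasLabel: 'Relates an age category to its lower and upper age bound';    "illustrative label"
    superconcept: Concepts.Relation;                                         "assumed parent concept for relations"
    save.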

3.2 Coupling the Domain Meta Layer to the Implementation

The coupling between domain concepts and code entities is done by representing the code entities (e.g. classes, methods) at the domain meta layer. The process of generating a concept representation for a code level entity is referred to as code level entity conceptification. In COBRO this is implemented through an extension of the Smalltalk language to support an up-down mechanism. For example, you can 'up' a GameRating class to the domain meta layer by sending the message asConcept to it. This triggers the COBRO conceptification mechanism, which will compute the concept representation for the meta layer. Note that the link between the code level entity and its domain concept counterpart remains explicit at all times. The conceptification process requires access to information about the code level. Hence it requires support for reflection from the programming language. Smalltalk excels in its reflective capabilities, which was to our advantage when building the COBRO kernel.

Deciding what to represent at the domain meta layer is the first choice to make with respect to the granularity of conceptification. You can restrict, for example, the available slots at the meta layer to instance variables, methods, and a superclass. A second granularity decision is how to represent the information you compute for the code entity. This boils down to specifying whether the value of a computed slot refers to a terminal or a concept. You could for example represent the value for a superclass slot as a concept and the value for an instance variable slot as a terminal.

Since conceptification is an automated process, it follows a prescription based on the granularity that was chosen. This prescription defines what to represent and how to represent it, and is referred to as the intension of a concept. In COBRO this prescription can be easily customised, since part of the COBRO implementation was realised by using the domain meta layer in an active fashion. At the domain meta layer, there is no difference between code level concepts and domain level concepts. This implies that they can be related to each other by adding a relation between them. Thus you can annotate the class GameRating at the domain meta layer by relating it to domain concepts such as Adult, Youngster, or Infant.
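To make this concrete, the following Smalltalk sketch 'ups' the GameRating class and annotates its concept counterpart with a domain concept. Only asConcept is taken from COBRO; the suitableFor: relation used for the annotation is an illustrative assumption in the style of the earlier definition entries.

| ratingConcept |
"Conceptify the class: compute its representation at the domain meta layer."
ratingConcept := GameRating asConcept.
"Annotate the conceptified class with a domain concept (illustrative relation)."
ratingConcept
    suitableFor: Concepts.Adult;
    save.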

3.3 Activating the Domain Meta Layer for Malleability

The way we activate the concepts in the meta layer is by making them accessible from within the code level. For this purpose COBRO extends the Smalltalk language kernel so that it becomes possible to write concept statements within normal code that consult, create, update or delete domain concepts. We only illustrate this for consulting the domain meta layer by the following code snippet of the ageCategory method from the LibraryMember class:

LibraryMember>>ageCategory
    ^ Concepts.AgeCategory allSubconcepts
        detect: [:each |
            (self age >= each ageRange first) & (self age <= each ageRange second)]

The different AgeCategory concepts are first retrieved from the domain meta layer, after which they are iterated over to find the one that matches the age of a particular library member. Another method can now make use of this information to verify whether the user is allowed to borrow an item:

MediaLibrary>>canBorrowItem: aMember
    ^ aMember ageCategory is: Concepts.Adult

This shows that domain concepts can indeed be involved in the computation at the code level. The resulting code is more generic, and changes can be applied at the domain meta layer (e.g. adding age categories) without having to change the underlying code. Moreover, instead of losing the domain knowledge as a result of the genericity, we have actually embedded the domain knowledge explicitly within the implementation.

COBRO also provides support to invoke conceptification by writing a statement at the code level (by sending the message asConcept). This gives a developer the means to 'up' a code level entity to the meta layer during the execution of a method. This enables a developer to write code that consults its own concept representation to perform a given task (e.g. to consult its annotations at the domain meta layer). A corresponding 'down' operation also exists (asSmalltalk), which returns the code level counterpart of a conceptified entity.

When an active meta layer is installed, a developer can begin to refactor the implementation so it becomes driven by the domain meta layer. As a consequence, the concepts play an active role in the functioning of the software system, and certain adaptations can be realised by changing the concepts instead of the code. This results in developer support for run-time malleability, which was already used for ensuring a flexible and extensible implementation of the COBRO tool base. Note that the activation of the meta layer is the enabler to turn the antibiosis between the concept and code level into a symbiosis. Spending time on the domain concepts thus becomes advantageous to the code level and vice versa. As a consequence, the domain knowledge documentation is no longer tied to a passive promise of benefits for future evolution efforts.
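For instance, adding an age category then becomes a change at the domain meta layer only, expressed in the style of the earlier Adult definition. The Senior concept and its age range in the following sketch are illustrative assumptions, not part of the published example set.

(Concepts new: #{Senior})
    hasLabel: 'Seniors age category';        "illustrative concept"
    superconcept: Concepts.AgeCategory;
    ageRange: '#(65 120)';                   "illustrative age bounds"
    save.

Once this concept is saved, the ageCategory method shown above resolves it for matching library members without any modification at the code level.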

3.4 Transparency for the Developer

As already illustrated, a first step towards transparency is to use the same syntax to represent concepts as the syntax of the programming language. As a result, the developer is not required to learn another one. Moreover, there is no need to switch between syntaxes each time a developer works on either the concepts or the code. Hence the programming experience is not disturbed by the concept language, since syntax-wise there is no visible difference between concepts and code. This is the reason why we did not use an existing ontology language such as OWL or RDF.

A second step to obtain transparency is to ensure that interacting with concepts and objects is done by using the same mechanisms. Hence, COBRO manipulates concepts by sending messages. This was already indicated in the code snippets, where the ageRange slot of the Adult concept was accessed by sending the corresponding message. Moreover, in order to be able to refer to concepts, a referencing scheme is needed that is similar to the one used by the programming language (e.g. Concepts.Adult).

The third step to achieve transparency lies in the close integration between the programming environment and the domain meta layer. This means that all the tools available for interacting with concepts are closely integrated with the existing programming tools. Consequently, there is no need for a developer to continually switch between environments. Moreover, it is desirable to provide an open, malleable implementation of the concept environment. This enables the developer to adapt the environment to fit more closely to the particular task at hand. As a consequence, the openness of the environment contributes to the fact that developers will no longer experience the domain knowledge documentation as an overhead.

4 Conclusion

We discussed COBRO, a concept-centric coding environment, which is based upon a symbiotic integration between a domain meta layer and a programming environment. The following requirements were detailed for our C3 environment:

Explicit Domain Meta Layer: COBRO provides a mechanism that makes it possible to capture domain knowledge in an explicit form. This reduces the possible loss of domain knowledge. The domain meta layer that captures the knowledge is represented in a frame-based, prototype-based ontology.

Coupling between the Domain Meta Layer and the Implementation: COBRO provides a mechanism to couple the domain knowledge to the implementation. This reduces the effort spent by developers on retroactively matching the knowledge to the corresponding implementation. Code level entities are deified towards the domain meta layer by a conceptification process that follows a prescription to generate the concept representation. This is referred to as the intension of the concept, which is based on the granularity chosen for what is represented at the domain meta layer and how it is represented.

Activation of the Domain Meta Layer: COBRO provides a way to involve the domain knowledge actively so as to provide the functionality of the software system. This motivates developers not to neglect the domain knowledge, since it potentially yields a short-term benefit. Moreover, an enhanced malleability of the software can be achieved by devoting time to the domain meta layer. Concept statements can be written within normal code so that domain concepts and conceptified code entities can be involved in the computation at the code level.

Transparency and Reduced Overhead for the Developer: COBRO is built in a way that it is not perceived by developers as an overhead. This means that the interaction with the concept environment is set up as transparently as possible. The syntax of Smalltalk is used to represent concepts at the code level. The mechanisms to interact with concepts are based on the mechanisms to interact with normal objects. The programming environment and the concept environment are implemented in close symbiotic integration with each other.

Acknowledgements: This research is funded by the IAP Programme of the Belgian State.

Bibliography

[ABDT04] A. Abran, P. Bourque, R. Dupuis, L. Tripp. Guide to the Software Engineering Body of Knowledge (Ironman Version). Technical report, IEEE Computer Society, 2004.

[CC05] G. Canfora, A. Cimitile. Software Maintenance. IT Metrics and Productivity Journal, November 2005. Online article - e-Zine.

[CDBD04] T. Cleenewerck, D. Deridder, J. Brichau, T. D’Hondt. On the evolution of IMedia Implementations. Proceedings of the European Workshop on the Integration of Knowledge, Semantics and Digital Media Technology, 2004.

[Der06] D. Deridder. A Concept-Centric Environment for Software Evolution in an Agile Context. PhD thesis, Vrije Universiteit Brussel, Programming Technology Lab, 2006.

[Fow01] M. Fowler. To Be Explicit. IEEE Software, November, December 2001.

[Gua98] N. Guarino. Formal Ontology and Information Systems. In Proceedings of FOIS 1998, Trento, Italy. IOS Press, Amsterdam, June 1998.

[KSP04] J. Koskinen, A. Salminen, J. Paakki. Hypertext Support for the Information Needs of Software Maintainers. Journal of Software Maintenance and Evolution: Research and Practice 16:187–215, 2004.

[Leh96] M. Lehman. Laws of Software Evolution Revisited. In European Workshop on Software Process Technology. Pp. 108–124. 1996.

[MV95] A. von Mayrhauser, A. Vans. Program Comprehension During Software Maintenance and Evolution. IEEE Computer, August 1995.

[Pfl98] S. L. Pfleeger. The Nature of System Change. IEEE Software, pp. 87–90, May/June 1998.

[SM99] F. M. Shipman III, C. C. Marshall. Formality Considered Harmful: Experiences, Emerging Themes, and Directions. Computer-Supported Cooperative Work 8(4):333–352, 1999.

[WF94] C. A. Welty, D. A. Ferrucci. What’s in an Instance? Technical report, RPI Computer Science, 1994.


Organizing Software Evolution According to Principles

Isabelle Côté1, Maritta Heisel2, and Holger Schmidt3

1 [email protected] 2 [email protected] 3 [email protected] Department of Computational and Cognitive Sciences (CoCoS), Working Group Software Engineering, University Duisburg-Essen, Germany

Abstract: We present a number of software evolution principles that serve to adapt existing software to new or changed requirements. These principles are represented in a uniform way according to a template. Furthermore, they are classified according to different high-level objectives. To support the systematic application of the principles, different relations between them are identified. These principles form the basis for a systematic software evolution method.

Keywords: Software Evolution, Evolution Principles

1 Introduction

One of the most costly phases of software development is software maintenance and evolution. Evolution means to adapt existing software to new or changed requirements. In fact, software systems need to evolve continually to cope with changing requirements or environmental conditions. In this paper, we investigate evolution on the source code level. Any additional documentation to the source code is considered as a bonus. Hence, the starting point for evolving a software system is its plain source code. We propose to organize the evolution according to evolution principles, e.g. Inspect classes and their features or Solve by analogy. These principles are classified according to different high-level objectives, namely locating the source code parts relevant for the evolution, modifying the located parts, and determining whether the modification is correct and complete, i.e. validating the evolution. Furthermore, we identify relations between the different evolution principles: the application of one principle may involve applying another principle, another principle may be alternatively applied, or a principle may be supplemented by applying another principle.

The principles we elaborated are not new, nor do we claim to have invented them. Instead, we can summarize our contribution as:

• making the evolution principles explicit by representing them in a uniform way
• classifying them according to high-level objectives
• relating the different evolution principles to each other
• establishing a basis for a systematic software evolution method

Other software engineering disciplines, such as refactoring [Fow00], already successfully apply similar concepts, however with different intentions.


An evolution method based on our principles involves the repeated application of the principles with the objectives locate, modify, and validate. The paper is structured as follows: in Sect. 2, we present the evolution principles and their relations. We furthermore sketch their practical application. Section 3 treats related work. In Sect. 4 we summarize our results and give a prospect to future research.

2 Principles

In social studies [BT02], a principle is described by:

“Principle: A basic rule that guides or influences thought or action.”

This definition can be adapted to our understanding of software evolution, because it describes well what we intend to provide with our principles: guidance for an engineer while carrying out an evolution task, by providing him/her with a collection of basic rules, i.e. principles. This is comparable to a plumber’s toolbox. The equivalent for the evolution engineer is therefore the set of principles. We elaborated the evolution principles by carrying out some evolution tasks and deriving abstract descriptions of them. To provide further support for an engineer, we classify the principles according to different objectives (Sect. 2.1) and represent them in a uniform way (Sect. 2.2). We support the application of the different evolution principles by describing relations between the different principles (Sect. 2.3). Finally, we sketch the practical application of the principles and report on a case study where we performed some evolution tasks on an open-source software system (Sect. 2.4).

2.1 Principle Objectives

While carrying out different evolution tasks, we recognized that it is possible to identify three main objectives an evolution process consists of. These high-level objectives are:

Locate: Find the source code parts that need to be changed to fulfill the new/changed requirement(s).
Modify: Change the source code parts identified through the localization.
Validate: Decide when the modifications are sufficient to implement the new/changed requirement(s).

To achieve a certain high-level objective, it is necessary to perform activities that aim at accomplishing that objective. The activities, in turn, differ in the circumstances under which they are applicable. For example, searching for a file name is a promising activity when a file base with meaningful file names exists. If this is not the case, but a technical documentation is available, reviewing that documentation is an alternative. We classify the elaborated principles according to the high-level objectives. This results in a set of location principles, a set of modification principles, and (for the time being) one validation principle.


2.2 Principle Representation

To be able to find appropriate principles for a task, it is not sufficient to just classify the principles. A good means to characterize the principles is to equip them with meaningful names, similarly to design patterns [GHJV95]. Thus, it is possible to refer to or search for a particular evolution principle according to its name. A uniform representation additionally helps to structure the principles and make them comparable. We propose the following template to represent the derived evolution principles (see Figs. 1 and 2 for examples):

Name specifies the purpose of the principle. We also indicate if it is a location principle (L), a modification principle (M), or a validation principle (V).
Situation describes the circumstances under which the principle can be applied. In other words, this can be considered as the precondition of the principle.
Goal describes the goals to be achieved by the principle. In other words, this describes the postcondition that is established after the principle has been applied.
Activity describes the activities which have to be performed in order to meet the goal.
Remark gives further information concerning the given principle.

Figure 1 (left): Principle Extract design documents (L)
Situation: Development artifacts of the design phase are not present, or it is not known if they still reflect the actual state of the software system.
Goal: Obtain a set of reconstructed design documents, e.g. call graphs or class diagrams.
Activity: Try to use tools such as Doxygen [Hee07] to generate design artifacts automatically. Should the extracted artifacts not suffice, manual extraction must be performed.
Remark: We strongly recommend to always apply this principle.

Figure 1 (right): Principle Inspect code (L)
Situation: Other location principles have revealed code parts which will potentially evolve.
Goal: Gain an understanding of the purpose of the investigated source code part.
Activity: Inspect the code considering:
• Purpose: What is the code part needed for?
• Structure: How is the code part implemented?
• Relevance: Will it be modified? Will something be added or removed?
Remark: It is useful to combine the principle Inspect code with the principle Locate output functionality or Locate input functionality whenever output or input functionality is looked for.

Figure 1: The location principles Extract design documents and Inspect code

Table 1 provides an overview of the currently available principles. Describing all evolution principles is not possible due to space limitations. Therefore, we limit ourselves to describing the principles given in Figs. 1 and 2 as examples in more detail.

The purpose of the location principles Extract design documents and Inspect code is to identify those code parts that will have to be altered to accomplish the evolution task. To achieve this, they use different mechanisms. When we apply the principle Extract design documents, we use code analyzers such as Doxygen [Hee07] to extract information out of the source code. In this way, some documents are generated that capture the current state of the system. Even if other documents are available, it is still advisable to apply this principle. The reason is that the generated documentation can be used to check whether the documents at hand and the just generated documentation are coherent. With the tools, it is also possible to create different documents, e.g., call graphs, class diagrams etc. that depict the source code in a graphical representation. They help to gain an overview of the system and thus also help to identify the code parts relevant for the evolution. Additionally, it is possible to navigate through the sources in a comfortable way.

In contrast to Extract design documents, the principle Inspect code does not rely on tools to generate documentation. The approach here is to have a close look at the source code to understand what it realizes and how it does it. This investigation will also reveal if the code part under consideration needs to be modified. Additionally, we gain an impression of the programming style and see to what extent refactoring will be necessary.

Figure 2 (left): Principle Re-use code (M)
Situation: Some code fragments already realize parts of the functionality requested by an evolution requirement.
Goal: Re-use existing code fragments.
Activity: Check if existing code fragments can be re-used for the modification task. It will be necessary to adapt the identified code fragments. Also adjust existing design documents. If none are present, create the corresponding documents.
Remark: It should be avoided to introduce duplicated code! A high modularity facilitates the application of this principle.

Figure 2 (right): Principle Solve by analogy (M)
Situation: New features may use similar mechanisms to realize the functionality as features which are already implemented.
Goal: Create the new feature based on existing solutions.
Activity: Look how features for similar functionality have been implemented. Use this knowledge to implement the new feature. Also adjust existing design documents. If none are present, create the corresponding documents.
Remark: –

Figure 2: The modification principles Solve by analogy and Re-use code

We now describe the principles Re-use code and Solve by analogy as representatives for the modification principles. The principles of this class are used to modify those code parts which have been identified by the location principles before. Depending on the kind of modification, the corresponding modification principle is selected.


The principle Re-use code is applied whenever code parts already exist that (partially) realize functionality needed for accomplishing the evolution task. These parts are then re-used. In some cases it might be necessary to alter these code parts to cope with the new functionality. The principle Solve by analogy, on the other hand, uses the concept underlying already implemented features. This concept is then used to introduce the new feature following the idea of the already implemented one.

2.3 Principles’ Relations

Table 1 relates each of the fourteen principles (rows) to the same fourteen principles (columns, in the same order): 1 Inspect documentation, 2 Inspect file name, 3 Inspect classes and their features, 4 Use debugger, 5 Locate output functionality, 6 Locate input functionality, 7 Locate by search, 8 Extract design documents, 9 Inspect code, 10 Re-use code, 11 Modify existing code, 12 Create new code, 13 Solve by analogy, 14 Test. An entry gives the relation of the row principle to the column principle.

1 Inspect documentation (Locate): 2: a,h; 3: a,h; 4: a; 5: a,h; 6: a,h; 7: a,h; 8: a; 9: h
2 Inspect file name (Locate): 1: a,h; 3: a,h; 4: a; 5: a,h; 6: a,h; 7: a,h; 8: a; 9: h
3 Inspect classes and their features (Locate): 1: a,h; 2: a,h; 4: a; 5: a,h; 6: a,h; 7: a,h; 8: a; 9: h
4 Use debugger (Locate): 1: a; 2: a; 3: a; 5: a; 6: a; 7: a; 8: a; 9: a,h
5 Locate output functionality (Locate): 1: a; 2: a; 3: a,h; 4: a; 7: a; 8: a; 9: h
6 Locate input functionality (Locate): 1: a; 2: a; 3: a,h; 4: a; 7: a; 8: a; 9: h
7 Locate by search (Locate): 1: a; 2: a; 3: a,h; 4: a; 5: a,h; 6: a,h; 8: a; 9: h
8 Extract design documents (Locate): 1: a; 2: a; 3: a,h; 4: a; 5: a,h; 6: a,h; 7: a,h; 9: h
9 Inspect code (Locate): 4: a
10 Re-use code (Modify): 11: a; 12: a; 13: a; 14: i
11 Modify existing code (Modify): 10: a; 12: a; 13: a; 14: i
12 Create new code (Modify): 10: a; 11: a; 13: a; 14: i
13 Solve by analogy (Modify): 10: a; 11: a; 12: a; 14: i
14 Test (Validate): no entries

a = alternative, h = helpful, i = involves

Table 1: Objective and relation of the different principles


So far, the principles possess a uniform representation and are related to an objective. In addition, it is possible to identify different relations between the evolution principles. For example, extracting calling relationships out of a given source code (cf. evolution principle Extract design documents, left-hand side of Fig. 1) helps in inspecting source code parts to find out their purpose (cf. evolution principle Inspect code, right-hand side of Fig. 1).

Analyzing all pairs of principles in a similar way, we were able to identify three types of relations, namely alternative, helpful, and involves. Table 1 shows an overview of the evolution principles and their relations. The table also shows the classifications of the different evolution principles. The principles with a light gray background belong to the category location principle. The principles with a dark gray background belong to the category modification principle. Finally, the principle Test, indicated with a black background, belongs to the category validation principle. The different entries in Tab. 1 are read as follows:

a (alternative): The principle written horizontally (first column) is an alternative for the principle written vertically (first row). This means that before we apply the principle in the first column, we should check if another principle in the first row is perhaps better suited. For example, principle Use debugger (fourth entry in first column) is an alternative to the principle Inspect documentation (first entry in first row). Note that the relation alternative is symmetric, i.e., if principle x is an alternative to principle y, then principle y is also an alternative to principle x.

h (helpful): The horizontally written principle in the first column is helpful for the vertically written principle mentioned in the first row. This means that its additional application should be considered. For example, the principle Inspect classes and their features (third entry in first column) is helpful for the principle Inspect code (cf. right-hand side of Fig. 1; ninth entry in first row). In contrast to the alternative relation, the helpful relation is not symmetric. Even though Inspect classes and their features is helpful for Inspect code, the opposite is not necessarily true.

i (involves): The principle mentioned in the first column involves the principle mentioned in the first row. This means that whenever the first principle is applied, the second one must be applied as well. For example, the principle Re-use code (cf. left-hand side of Fig. 2) in the first column involves the principle Test in the first row.

a,h (alternative, helpful): Whenever the principle in the first row is not applicable, the principle mentioned in the first column is an alternative to it. In the case that the principle in the first row is applicable, the principle in the first column is helpful. For example, the principle Inspect documentation is an alternative to the principle Locate output functionality, if Locate output functionality is not applicable. If Locate output functionality is applicable, the principle Inspect documentation is helpful for the principle Locate output functionality, and it should be considered as well.


We can see in Tab. 1 that location principles are only related to location principles and that modification principles are only related to modification principles. This is because the two classes of principles serve different purposes. Furthermore, Tab. 1 reveals that the location principles complement each other to a large extent. For this reason, it is advisable to apply several principles of this category to consolidate the relevant source code parts for a given evolution task. Considering the modification principles, we can see that all the principles of this category are alternative to each other. They either consider existing code (e.g., principles Re-use code and Modify existing code) or code to be newly written (e.g., principle Create new code). Moreover, all the principles of this category involve the principle Test.

2.4 Practical Application

To illustrate the usefulness of the evolution principles, we sketch how they are applied in a practical evolution task. The initial situation for the evolution task is the presence of a software system with its corresponding source code. Furthermore, at least one evolution requirement (ER) must be given. An ER is a requirement that the software system is not able to fulfill in its current state. In the case that several ERs are given, the procedure is applied to every requirement, i.e., we treat only one requirement at a time.

The first step is to locate those parts of the source code which need to be modified in order to establish the ER under consideration. For this purpose, we have a look at the set of location principles. All the other evolution principles belonging to a different class do not have to be considered, as they serve to achieve a different objective (cf. clusters in Tab. 1). Depending on whether or not additional documentation besides the source code is present, we select a subset of the location principles. Let us assume that no documentation other than the source code is available. The source code is written in an object-oriented programming language, and the classes and their features possess meaningful names. Therefore, we decide to choose the principles Extract design documents, Inspect classes and their features, and Inspect code. The combination of these evolution principles then reveals potential parts within the source code that (may) change. As no further development documents exist, we apply the location principle Extract design documents. We let the code analyzer generate a full set of all supported document types out of the source code to gain a first overview. Keeping the ER in mind, we start to investigate the just generated documents, especially the classes, methods, and attributes, using location principle Inspect classes and their features. An example for a potential modification candidate, found by applying this principle, could be a method with a name similar to the feature we want to add. To verify whether the method body holds what the name promises, we apply the location principle Inspect code to investigate what the method does, i.e., find out its purpose.

After the location step, we select one of the identified code parts and start to modify it. For this purpose, we take a look at the modification principles. Let us assume in this case that we want to integrate a new external tool into the existing software system and that we know that another external tool is already used. On this basis, we select the modification principle Solve by analogy (cf. right-hand side of Fig. 2). All calls to the new external software tool have to be added at the appropriate places. This may happen with the aid of the other modification principles. When the modification has been performed, it is necessary to find those code parts which depend on the just modified one. An example here could be that the signature of a method has been modified. Hence, all calls to this method need to be adapted. The dependent code parts are identified by applying once more the location principles, followed by application of modification principles. This procedure is repeated until no further changes are necessary. To validate that the changes are complete, it is necessary to perform tests by applying validation principle Test. All in all, this procedure is repeated until all identified code parts have been modified and validated. At the end, one final validation is performed to ensure that the ER has been correctly and completely integrated into the software system.

We have applied our evolution principles to evolve the open-source software Doxygen [Hee07]. Doxygen allows automated extraction of documentation out of a given source code. It supports several programming languages, such as C++, Java, Python, etc. The documentation can be generated in different output formats, e.g., HTML or PDF. Among other documentation support mechanisms, Doxygen offers the capability of drawing different diagram types, e.g. call graphs.

We have performed several different evolution tasks on the Doxygen software system. For example, we extended call graphs and caller graphs by combined call-caller graphs, improved the readability of class lists, and added sequence diagrams known from UML [For05] as a newly supported graph type. It turned out that locating, modifying, and validating the relevant source code parts was astonishingly easy when using our evolution principles: it was sufficient to concentrate on a small fraction of classes, methods, etc. within the overall source code base.

3 Related Work

Several different approaches to software evolution exist. Examples are the work of Gall et al. [GHJ98] and Ducasse et al. [DGF05], which put emphasis on the analysis of software evolution based on change histories. Ducasse et al. [DGF05] propose a meta-model for software evolution analysis. Our approach does not concentrate on such an analysis. Instead, we introduce evolution principles for manipulating source code in a systematic way to enable source code evolution. An approach that appears to be similar to ours is the work of Demeyer et al. [DDN02]. However, their intention is to provide a pattern system for object-oriented re-engineering tasks. This seems similar because software evolution and software re-engineering are not disjoint disciplines. In fact, software evolution usually involves some re-engineering and refactoring [Fow00] effort. Hence, it is only natural to rely on the experience and techniques of those disciplines.

4 Conclusion and Future Work

Software evolution is a challenging task. Therefore, it is necessary to perform a software evolution task in a systematic way to keep the effort feasible. To achieve this goal, we propose to organize the evolution task according to principles. We categorize these principles according to their purpose, i.e., locate, modify, and validate, and we represent them on the basis of a template to ease their comprehension and usage. Furthermore, we determined different relations between the principles of a category: a principle can be an alternative for another, it can be helpful for another, or it can involve the application of another principle. This structure helps developers to identify and apply appropriate principles while carrying out an evolution task with a manageable effort. We validated our approach with several evolution tasks in an open-source context. We think that our approach is promising and that it helps to reduce the effort of a source-code-centric evolution task. In the future, we intend to further elaborate the approach presented in this paper, e.g., by adding more validation principles. We are also working on developing a method based on the evolution principles. So far, the principles presented here were derived, validated, and used on object-oriented software systems. In the future, we would like to investigate our principles for evolving other types of software systems, such as embedded or distributed systems, to see how well they perform in different contexts.

Bibliography

[BT02] J. W. Bayer, K. Tamarkin. McGraw-Hill’s GED Social Studies: The Most Comprehensive and Reliable Study Program for the GED Social Studies Test. McGraw Hill Higher Education, 2002. p. 419.

[DDN02] S. Demeyer, S. Ducasse, O. Nierstrasz. Object-Oriented Reengineering Patterns. Morgan Kaufmann, 2002.

[DGF05] S. Ducasse, T. Gîrba, J.-M. Favre. Modeling Software Evolution by Treating History as a First Class Entity. Electr. Notes Theor. Comput. Sci. 127(3):75–86, 2005.

[For05] UML Revision Task Force. OMG Unified Modeling Language 2.0: Superstructure. August 2005. http://www.uml.org.

[Fow00] M. Fowler. Refactoring: Improving the Design of Existing Code. Addison-Wesley, 2000.

[GHJ98] H. Gall, K. Hajek, M. Jazayeri. Detection of Logical Coupling Based on Product Release History. In Proceedings of the International Conference on Software Maintenance 1998 (ICSM’98). Pp. 190–198. 1998.

[GHJV95] E. Gamma, R. Helm, R. Johnson, J. Vlissides. Design Patterns – Elements of Reusable Object-Oriented Software. Addison Wesley, 1995.

[Hee07] D. v. Heesch. Doxygen - A Source Code Documentation Generator Tool. 2007. http://www.stack.nl/~dimitri/doxygen.


Living on the Cutting Edge: Automating Continuous Customer Configuration Updating

Slinger Jansen1, Sjaak Brinkkemper1, and Tijs van der Storm2

1 Information and Computing Sciences Institute, Utrecht University, Utrecht, The Netherlands, {slinger.jansen, sjaak.brinkkemper}@cs.uu.nl
2 Centrum voor Wiskunde en Informatica, Amsterdam, The Netherlands, [email protected]

Abstract: Product software vendors cannot continuously update their end-users’ configurations. By not automating continuous updating, the costs to test, release, and update a software product remain exorbitantly high. This paper shows the feasibility and profitability of continuous customer configuration updating, with the help of two practical case studies. Automating continuous updating enables vendors and customers to define flexible policies for release, delivery and deployment. Such flexibility enables customers and vendors to be in full control of how and when product knowledge and updates are delivered, tested, and deployed, speeding up the process of software evolution.
Keywords: Customer configuration updating, agile development, continuous integration, product software knowledge delivery

1 Introduction

Manufacturing product software is an expensive and non-trivial task for software vendors. Software vendors face the challenge of dividing their resources to develop, release, and deliver high-quality software. Unfortunately, there are always more requirements and opportunities than ten times the current number of developers could implement. Extreme competition forces software vendors to reduce time to market and to release new features as often as possible. As products and updates are changed and released more often, effort is saved by automating Customer Configuration Updating (CCU). In earlier work, CCU has been positioned as the release, delivery, deployment, and usage and activation processes of product software [JB06]. In this paper we focus on the automation of these processes to reduce the overhead per release. The release process is how, and how frequently, products and updates are made available to testers, pilot customers, and customers. The delivery process consists of the method and frequency of update and knowledge delivery from vendor to customer and from customer to vendor. The deployment process is how a system or customer configuration evolves between component configurations due to the installation of products and updates. Finally, the activation and usage process concerns license activation and knowledge creation on the end-user side.

The importance of CCU is often underestimated as systems nowadays are increasingly supplied to customers as an on-line service, thus requiring no more software delivery and deployment on the customer side. There are multiple sides to this service orientation trend, however. These days, services are deployed on home user systems as well. Simultaneously, an increase is seen in the use of product software on mobile devices, requiring different deployment mechanisms. Examples can even be found of services and products that are deployed on mobile and embedded devices, offered as services, and on customer PCs. One such example is TomTom, offering its product on embedded devices in the TomTom Go, as an online service with TomTom Maps, and deployed on your portable device with TomTom Navigator. Another example is developed by Google, offering the Google Mini embedded device, Google on-line websearch, and Google Desktop for the home PC. Interesting blends of software are becoming more common than traditional retail software products, increasing the need for smart CCU.

Automating steps in the CCU process contributes to product software vendors in four ways. First, software vendors serve a larger number of customers when less overhead is required per customer [JB06]. Secondly, due to more frequent integration builds and automated test runs, developers see the results of their contributions quicker and testers can test more recent versions of the software, improving the vendor’s internal development process. Also, when deployment is automated, developers and testers lose less time (re)deploying the application locally after different developers have contributed to the project. Finally, managing specific policies on both the customer and vendor side enables advanced release, delivery, and (re-)deployment scenarios. Vendors can flexibly define when a product must be released, whereas customers can decide to periodically check for new updates and not install them until formal approval has been given.

In section 2 we introduce the concept of continuous CCU, define the research approach, and identify tools that potentially automate continuous CCU. One such tool, Pheme, is described in section 3 and is specifically built to serve this purpose. In section 4 we show, by means of two case studies, that continuous CCU can be automated with relatively little effort. Finally, in section 5 we conclude that continuous CCU tools provide more flexible processes at lower costs.

2 Continuous CCU

By automating steps in the CCU process as much as possible, CCU cost and effort can be reduced effectively and larger numbers of customers can be served [JB06]. Automating steps in the CCU process ultimately contributes to Continuous CCU (C-CCU). C-CCU is defined as being able to continuously provide any stakeholder of a software product with any release of the software, at different levels of quality. This way developers, testers, and even end-users can always be fully up to date. Before an organization can properly set up C-CCU, however, policies must be defined for all processes in the software product life cycle. These policies are displayed in Figure 1, which models the C-CCU process for any type of product release, such as a new product, a major update, or a minor bug fix. In the software product life cycle, different policies define the method and frequency of the release, delivery, deployment, logging, feedback, and debug processes.

C-CCU must not restrict customers or vendors in any way; instead, it should be a mere facilitator for more responsive software product management. C-CCU enables a software vendor to become more responsive to changes in customer and market demands. It provides developers and testers with the most recent (working) version available. C-CCU is an organizational driver for continuous integration, continuous testing, and continuous quality control. As such, customers are provided with better guarantees of quality when code is released. As long as the software vendor and customer share knowledge and manage it explicitly, CCU becomes less error-prone and less time-consuming. The current trend towards agile development only strengthens the belief that C-CCU is essential in the current market and that the return on investment in C-CCU automation is high.
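To make the notion of per-process policies more concrete, the following minimal Java sketch models a customer-side policy set. The class, field, and enum names (CcuPolicies, deliveryCheckInterval, DeployTrigger, etc.) are illustrative assumptions and are not part of any tool described in this paper.

    // Minimal, hypothetical sketch of customer-side C-CCU policies, roughly one
    // per process in Figure 1 (delivery, deployment, logging, feedback).
    import java.time.Duration;

    public class CcuPolicies {
        enum DeployTrigger { ON_ARRIVAL, ON_SHUTDOWN, ON_APPROVAL }

        Duration deliveryCheckInterval = Duration.ofHours(1);     // "check hourly for updates"
        DeployTrigger deployTrigger = DeployTrigger.ON_APPROVAL;  // wait for formal approval
        boolean sendUsageFeedback = true;                         // feedback policy
        boolean logDeployments = true;                            // logging policy

        boolean mayDeployNow(boolean approvalGiven, boolean productShuttingDown) {
            switch (deployTrigger) {
                case ON_ARRIVAL:  return true;
                case ON_SHUTDOWN: return productShuttingDown;
                case ON_APPROVAL: return approvalGiven;
                default:          return false;
            }
        }

        public static void main(String[] args) {
            CcuPolicies policies = new CcuPolicies();
            System.out.println("deploy now? " + policies.mayDeployNow(true, false));
        }
    }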

2.1 Research Approach

The aim of this research is to find out whether automating C-CCU is feasible and profitable for software products. C-CCU automation feasibility is established by showing two cases in which we have automated C-CCU. Profitability is established by showing the low overhead of automating C-CCU. Two cases were selected with different development technology, working on different platforms. Then a tool selection was made to automate and implement the C-CCU process for these applications, following an inventory of C-CCU automation tools. Finally, some aspects of C-CCU were automated for the two applications, and the implementation time was recorded per feature. The C-CCU automation was considered successful when the following three criteria were met. First, updates to the software had to be seamless, without any manual intervention. Second, the development and delivery process had to be improved significantly. Finally, the investment had to be small.

Figure 1: C-CCU and its Policy Context

Threats to validity of this study [Yin03] are that the applications are not representative and that process changes are underestimated. Two applications were selected, Joomla1 and the Meta-Environment2, both freely available as open source products. The applications are very different with regard to development methodology (monolithic vs. component-based) and technology (PHP/MySql vs. Java, C and several domain-specific languages).

2.2 Related Work and Tools

The tools evaluated for this research range from scientific open source prototypes to commercial products with a very high turnover. For a more extensive discussion on product update tools (such as package managers, knowledge delivery tools, etc.) we direct the reader to [JBB05].

1 http://www.joomla.org/
2 http://www.meta-environment.org/


Nix - Nix [DJV04] is a package management system that ensures safe and complete installation of packages for Unix-based systems. Nix approaches software deployment as a memory management problem, concurrently storing different versions and variants of components (values) identified by unique hashes (pointers). Nix ensures atomic updates and rollbacks that guarantee that existing component dependencies never break. Such updates can be downloaded from channels in which updates for specific components are published, and are deployed as binary patches and as source patches. Because Nix manages all dependencies of every component, it derives continuous releases from continuous integration. With regard to C-CCU, Nix supports deployment policy management. However, writing Nix expressions that describe dependencies amongst components in its domain-specific language is not a trivial task. Furthermore, Nix is currently a scientific research tool that is restricted to Unix-like software environments.

Sisyphus - Sisyphus (http://sisyphus.sen.cwi.nl:8080/) is a component-based continuous integration and release tool. Component versions are built in an incremental fashion, i.e., only if there are affecting changes, sharing previous build results if possible. Accurate bills of materials (BOMs) are maintained in a database that allows derivation of release packages for every successful build [vdS05, vdS07]. Passing the integration build is but the first QA milestone. Because every build corresponds to an (internal) release, testers can easily update their configuration on a regular basis. Formally releasing a product simply consists of labelling a particular build that satisfies the required quality properties. This allows the development organisation to set up different channels that have different frequencies of release.

FLEXnet - Macrovision’s FLEXnet (http://www.macrovision.com) is a suite of release, delivery, and deployment products, such as InstallShield, InstallAnywhere, FLEXnet Connect, and AdminStudio. Macrovision’s product suite provides tooling to create releases for any platform, install them on any platform, and let them be managed by a system administrator. Their tools provide licensing, copy protection, and patch delivery solutions. Macrovision’s strength can be found in the fact that they manually support many of the process steps that are part of C-CCU.

Software Dock - The Software Dock [HHW99] is a system of loosely coupled, cooperating, distributed components that are connected by a wide area messaging and event system. The components include field docks for maintaining site-specific configuration information by consumers, release docks for managing the configuration and release of software systems by producers, and a variety of agents for automating the deployment process. The Software Dock is an early attempt to correctly, consistently, and automatically deliver and deploy software. The Software Dock does not focus on release management or continuous release practices and is lacking in the area of knowledge delivery from customer to software vendor.

The tools and research projects discussed in this section are each focused on one particular aspect of the customer configuration updating process (see Table 1; these properties have been determined by reading documentation and testing the tools with small cases; A means that the tool supports the process automatically, i.e., no user intervention is required and policies can be flexibly defined, and M means that the tool requires manual steps from the user). Sisyphus supports continuous release, but has no feedback loop from the customer to the vendor. Nix is primarily geared towards deployment, as is the Software Dock; both are lacking in the areas of release and customer feedback. MacroVision’s FLEXnet supports many of the processes manually, but not automatically. The next section introduces the knowledge distribution framework Pheme that is designed to remedy this situation. In Greek mythology, Pheme was the personification of fame and renown, described as “she who initiates and furthers communication”3.

Nix Sisyphus FlexNet SWDock Pheme Release A M M A Delivery A A A Deployment M M M A Logging M A Feedback A M A Debug M Table 1: Manual (M) or Automated (A) Tool Support for Policies

3 The Pheme Delivery Hub

Pheme is an infrastructure that enables a software vendor to communicate about software products with end-users and enables system administrators to perform remote deployment, policy propagation, and policy adjustment. The infrastructure consists of a server tool (Pheme), a protocol between software product and Pheme, a protocol between Phemes, and a GUI. The Pheme server resides on each system that acquires and distributes software knowledge through subscribe/unsubscribe channels. The server can accept and distribute all types of knowledge, including policies concerning software knowledge delivery and deployment that describe behaviour of the Pheme tool. These policies can be manipulated securely and remotely. Pheme enables software vendors to publish software knowledge in the form of licenses (for one end-user), software updates (for a group of end-users), software content, and software news (for another group of end-users). Pheme enables customers to send knowledge in the form of usage and error feedback. A system administrator can use Pheme to instruct other Phemes, change and distribute delivery and deployment policies, control all communication between end-users and vendor, and redistribute software (knowledge). Finally, an end-user can edit policies, execute deployment policies (such as remove/install/update a software product), determine when and how feedback will be sent to the vendor, and refresh all types of knowledge such as licenses. The processes in Figure 1 are all covered by advanced tools and methods, such as release by Sisyphus and deployment by Nix. However, none of the tools fully cover deployment policy and delivery policy management for multiple participants in a software supply network (SSN) [JFB07]. Pheme was created to enable these participants to explicitly manage and share knowledge about software components, and thus provide coverage for the release publication, feedback, and knowledge delivery process steps.

3 http://en.wikipedia.org/wiki/Pheme/

3.1 Pheme Architecture

Pheme’s architecture is modelled in Figure 2. The core components are policy management, package management, user management, and channel management. The policy management component enables the user of Pheme to define delivery policies (check for and download software updates on a weekly basis, for instance) and deployment policies (check every time when the product is shut down whether any new updates have been downloaded, and deploy the newest one). The package management component supplies Pheme with knowledge package support, in the form of files, reports, facts about the product (version numbers, dependencies, etc.) and human-readable product news. The user management component enables different types of users to be known to the Pheme instance. Such users can be local administrators, knowledge suppliers, and knowledge consumers. The user management component also enables the Pheme administrator to contact other Phemes and indirectly change policies of other Pheme instances. Finally, the channel management component manages different channels that can automatically and manually push or pull knowledge packages to and from other Phemes and URLs.

Figure 2: Pheme Architecture

Pheme is interacted with through its user interface, through another Pheme, and through the software product itself using SOAP procedure calls. Software products use Pheme as a gateway to vendor release and feedback repositories. Typically such interactions include receiving updates and product news, and sending feedback and usage statistics. Pheme handles knowledge packages as opaque artifacts; however, in some cases where software product knowledge is required, the package handling component can be used as a fact base. These facts are then used to store, for instance, a product configuration. Pheme then serves as a knowledge base from which both the vendor and the customer can get information. Furthermore, this enables a vendor to develop a specific fix for a specific group of customers. The suitability of the update can then be decided by Pheme. An example setting is presented in Figure 3, where four different instances of Pheme on four different hosts are shown. A COTS vendor sells components to a vendor, who can distribute knowledge through its channels to the system administrator, such as the announcement of a new release. The system administrator can communicate with the end-user(s) of the product. Finally, end-users send information regarding the day-to-day use of the product back to the vendor. The case study, presented in the next section, takes place in a similar setting.
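A rough sketch of how a product-side deployment policy such as "deploy the newest downloaded update when the product shuts down" could be wired up is given below. The PhemeGateway interface and its methods are hypothetical stand-ins for Pheme's SOAP interface, which is not specified in this paper.

    // Hypothetical sketch of a product-side shutdown hook that asks a Pheme
    // instance for downloaded updates and deploys the newest one. The
    // PhemeGateway interface is an assumed stand-in, not taken from the paper.
    import java.util.List;

    interface PhemeGateway {
        List<String> downloadedUpdates();   // e.g. paths of downloaded update bundles
        void deploy(String updateBundle);   // apply a downloaded bundle
        void sendFeedback(String message);  // usage/error feedback channel
    }

    public class ShutdownDeployPolicy {
        private final PhemeGateway pheme;

        public ShutdownDeployPolicy(PhemeGateway pheme) { this.pheme = pheme; }

        // Registered by the product so the policy runs on every shutdown.
        public void install() {
            Runtime.getRuntime().addShutdownHook(new Thread(this::onShutdown));
        }

        void onShutdown() {
            List<String> updates = pheme.downloadedUpdates();
            if (!updates.isEmpty()) {
                String newest = updates.get(updates.size() - 1); // assume sorted oldest-first
                pheme.deploy(newest);
                pheme.sendFeedback("deployed " + newest + " at shutdown");
            }
        }
    }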

4 Case Study Results

Figure 3: Example Environment for Pheme

4.1 Joomla

Joomla is a leading open source content management system that is widely adopted for its intuitive user interface and low barrier to acceptance. It has a large user base and a large team of developers. Furthermore, a large collection of components and user-generated content is available. Joomla can be installed on any web server that supports PHP and MySQL. Traditionally, Joomla applications are updated by hand. Minor releases exclude data model changes, so these can overwrite current deployments. For major releases, tutorials are released that explain the proper method for updating the data model and the source code. The manual process of updating is error-prone, since it is possible to overwrite source files from a Joomla configuration with newer versions. This leads to many Joomla deployments that are never updated, which increases the risk of hacks. The main advantages for the Joomla product in automating C-CCU are found in effort savings. Each developer, tester, and customer will at some point have to deal with an evolving instance of the product because they want new features or wish to plug a security leak. Their time can be saved by not having to manually evolve the configuration of their Joomla instance. Furthermore, we speculate that the Joomla development team can automatically generate and test releases and publish them on-line, which will no longer require the manual effort of checking the release for completeness and such. Finally, Joomla customers can be sure that their CMS contains the most recent security updates.

To automate C-CCU for Joomla, the processes of release, delivery, and deployment are automated. In order to prepare a release, the source code is checked out from the Joomla SourceForge repository periodically. To deliver the source code to the customer, the source bundle is sent to the customer system. A policy control system is set up on the customer system to automatically deploy any new versions that come in for Joomla. To automate deployment, the system automatically calculates the differences between a new version and the currently deployed version. The changes are then made to the deployed data model and source code.

Automating C-CCU for Joomla - To automate C-CCU for Joomla the tools Pheme4, Subversion5, DataDiff6, and MySqlDiff7 are used. In this case study a release, an administrator, and a customer system are used. The release system locally prepares the releases by simply bundling the source code, data model, and data into one compressed file. The administrator system is used to instruct the customer system with delivery and deployment policies, such as “check for Joomla bundles from the release system every hour” and “deploy a new Joomla version as soon as it arrives”. The release system obtains the sources on a daily basis, by using an automated check-out script. Furthermore, the release system zips the contents of the new release, and places

4 http://www.cs.uu.nl/Pheme/
5 http://subversion.tigris.org/
6 http://freshmeat.net/projects/datadiff/
7 http://www.mysqldiff.org/

it in a Pheme release channel. The system administrator system also runs Pheme, to instruct the customer system with new policies. The customer system runs Joomla, Apache, and MySql for its daily operations. Pheme is used to check hourly for new updates from the release system and to run the appropriate commands when a new release bundle comes in. When a new release bundle comes in, the release is deployed as if it were a “fresh” deployment. The configuration script, configuration., is overwritten with the configuration script from the deployed version, with a change in the database name. The data model is then created, using joomla.sql. The difference between the data models is calculated using MySqlDiff and a data model update script is generated. The update process is a three-part process where the database structure is first updated, the new data is added, and then the changed files are overwritten. Software delivery and deployment, software knowledge delivery, and feedback delivery were automated for Joomla. Software delivery and deployment cost two full days. The delivery of knowledge cost 4 hours of development time, including the adjustments to Joomla. Finally, implementing logging also cost 4 hours of development time.

Technical Challenges - Our approach is not safe. The automatic data model update does not apply any knowledge from the development process, which can lead to data loss, empty columns, and other data update problems [Sjo93]. Furthermore, there is no assurance that the downloaded bundle is of reliable quality. Also, since updates occur at runtime, the system can become unstable. For such runtime updating, state-safe techniques can provide solutions [VEBD06]. To complete the C-CCU process, an extension was built to Joomla’s error handling mechanism, such that errors and warnings were reported back to Pheme and stored for further analysis.

Process Challenges - With C-CCU in place, a number of process changes are required from the Joomla team and its customers. To begin with, the release process needs to be redesigned. To guarantee that high-quality code is released to customers, a number of criteria can be devised to delay releases, such as test failures. Joomla can also, instead of opting for continuous CCU, define intervals at which new versions will be released (such as a once-weekly update). After all, C-CCU is a concept that provides maximum flexibility to customers and developers, and should not introduce extra risks or effort. For a future deployment system, knowledge about “preferred update routes” must be specified and shared between the Joomla release team and the Joomla customer, such that quality guarantees can be provided for defined sequences of updates.

Joomla’s Software Supply Network - To display news from Pheme in the Joomla backend, a module has been created called ModPhemeNews. This module only works with approved versions of Joomla, meaning that the ModPhemeNews team must approve any new version of Joomla before an update of the full Joomla package can take place on the customer side, which is a common way of working with add-ons. If C-CCU were to be automated in such a process, customers could possess new code quicker because this dependency can be specified and shared. The Joomla creators could, for instance, provide an early release for module and add-on builders.
Furthermore, the ModPhemeNews creators could automatically download any new releases for Joomla immediately, automatically test the new combination of Joomla and ModPhemeNews, and notify customers of its approval of the Joomla update.
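The customer-side update step described in this case study (create the new data model, compute a diff, apply it, overwrite changed files) can be sketched roughly as follows. The concrete command lines, database names, and paths are illustrative assumptions; they are not the exact invocations used in the case study.

    // Rough sketch of the customer-side Joomla update step: compute a data model
    // diff with MySqlDiff, apply it, then overwrite the changed files. The
    // commands and paths below are assumptions for illustration only.
    import java.io.IOException;

    public class JoomlaUpdateStep {

        static int run(String... command) throws IOException, InterruptedException {
            return new ProcessBuilder(command).inheritIO().start().waitFor();
        }

        public static void main(String[] args) throws Exception {
            // 1. Create the new data model in a scratch database (assumed name).
            run("sh", "-c", "mysql joomla_new < bundle/joomla.sql");

            // 2. Generate an update script from the differences between the
            //    deployed and the new data model (assumed mysqldiff usage).
            run("sh", "-c", "mysqldiff joomla_live joomla_new > update.sql");

            // 3. Apply the structural changes to the live database.
            run("sh", "-c", "mysql joomla_live < update.sql");

            // 4. Overwrite the changed source files with the new release.
            run("sh", "-c", "cp -r bundle/src/. /var/www/joomla/");
        }
    }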


4.2 The Meta-Environment

The second case study concerns the Meta-Environment, an integrated development environment (IDE) for language development, source code analysis and transformation. It is a component-based application consisting of tools for parsing, transforming, pretty printing and analyzing programming language sources, connected together using dedicated middleware. The Meta-Environment is an open framework that can easily be extended or customized with third-party components. For the Meta-Environment, the introduction of C-CCU decreases development, release creation, and deployment effort, which is valuable in a setting where the developers are all full-time researchers. Before introducing C-CCU, the Meta-Environment was built using a daily build system, whereas today the Sisyphus continuous integration system is used, which builds complete systems on every change to the source control (Subversion). Every component contains a manifest listing its dependencies. The manifest is used by the build system to determine how a component should be built and which earlier build result can be reused to satisfy these dependencies. Before the introduction of Sisyphus, releasing amounted to manually changing the version numbers of all components and the dependency specification accordingly, and creating a release package using the tool AutoBundle [dJ02]. However, since such compositions did not directly follow from the build system and were not formally tracked (i.e., in a database of builds), many mistakes were made in this process. This turned out to be a serious impediment to schedules employing more frequent releases, let alone continuous releases.

Automating C-CCU for the Meta-Environment - For the Meta-Environment, the release, delivery, and deployment processes were automated. The release process provides a new release as soon as one of the Meta-Environment components is changed. With regard to release, Sisyphus is used to automatically release the most recent version that fully passed an integration build. To automate delivery, Pheme has been installed on the end-user system to automatically download the latest binary distribution on an hourly basis and deploy it on a nightly basis. Pheme checks whether the Meta-Environment is running and, if it is not, the new release is installed. Since installations of the Meta-Environment are stateless, old files are overwritten. The implementation of Sisyphus for the release process took a developer five full days. Furthermore, automating delivery and deployment with Pheme cost a developer two days.

Technical Challenges - The Meta-Environment does not support any error recovery or reporting at runtime. To build such error recovery, the architecture needs to be redesigned in such a way that the component configuration remains robust and can send an error report at runtime before a crash occurs. A complication is that the Meta-Environment is a heterogeneous system; i.e., the components are implemented in different languages (C, Java, proprietary).

Process Challenges - The C-CCU process for the Meta-Environment is not perfect. To begin with, the quality criteria (it must pass the integration build and a large number of unit tests) are weak. Sisyphus has the infrastructure to provide for stronger criteria. Some manual intervention is required in order to label successful builds with the accompanying quality attributes.
By publishing such releases at specific URLs, customers can choose among different release channels, such as “cutting edge”, “daily”, “weekly”, “minor”, “major”, etc. Furthermore, there is no facility for users to report problems back to the developers except through the regular media such as e-mail and bugzilla. However, this way users have to manually link their error report to the version they have installed. Subsequently, developers have to read the email and manually reproduce the problem. One would wish for the automatic identification of the installed version whenever a problem occurs at the customer site.

Meta-Environment’s Software Supply Network - The Meta-Environment is a component composition, with different open source components coming from different development locations. When a Meta-Environment release is created, stable versions of components are explicitly labelled to be part of the next release. This method of releasing reduces risk, although it does force a customer to install components that are potentially outdated. In future releases of the Meta-Environment, automation of Sisyphus and strict quality checking enables releases with the most up-to-date third-party components and a decrease in integration effort. The components of the Meta-Environment themselves are not strongly coupled, meaning that in the future components could be updated independently.
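The nightly deployment check described above (install the newest downloaded release only when the Meta-Environment is not running, overwriting the stateless installation) could look roughly like the sketch below. The pidfile location, bundle path, and installation directory are assumptions made purely for illustration.

    // Hypothetical sketch of the nightly deployment check for the
    // Meta-Environment case study. All file and directory names are assumed.
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class NightlyDeploy {
        public static void main(String[] args) throws Exception {
            // Assumed convention: a pidfile exists while the Meta-Environment runs.
            if (Files.exists(Paths.get("/var/run/meta-environment.pid"))) {
                System.out.println("Meta-Environment is running; postponing deployment.");
                return;
            }
            Path bundle = Paths.get("/var/pheme/downloads/meta-latest.tar.gz"); // assumed location
            if (Files.exists(bundle)) {
                // Installations are stateless, so the old files can simply be overwritten.
                new ProcessBuilder("tar", "-xzf", bundle.toString(), "-C", "/opt/meta-environment")
                        .inheritIO().start().waitFor();
            }
        }
    }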

5 Conclusions and Future work

This paper presents the concept of continuous customer configuration updating. Secondly, a new tool that facilitates delivery of software knowledge between vendors and customers is presented. Also, two cases are presented for which, under certain conditions, automation of C-CCU improves development. Automating C-CCU reduces overhead from the release, delivery, deployment, and usage processes, enabling software developers to focus on their product instead of the supporting processes and tools. Furthermore, automating and implementing C-CCU in a software vendor’s organization enables it to become more responsive to change [Boe02].

The time to automate some of the C-CCU processes (three days for Joomla and seven days for the Meta-Environment) is negligible compared to the time saved. The quick automation, however, could only be achieved with the availability of strong development support tools such as Sisyphus and Pheme. With the availability of such tools, software vendors can more easily adopt automatic CCU, since it enables software vendors to pace the heartbeat of software development and delivery [vdS07]. Because C-CCU encourages releasing often, the quality assurance process must be designed to continuously test new releases. Furthermore, by delivering straight from the source repositories, the cases only provide crude examples of what can be done by automating C-CCU. In practice, these automations would be used to provide recent releases to development, test, and integration personnel.

Automation of C-CCU in such a manner that updates are safely and securely deployed (possibly at runtime) requires knowledge about product structure and architecture. The two presented case studies do not make use of product architecture information and can thus not guarantee safe and secure updates at run-time. A generic tool could be built that automates the update process for different types of software architectures, such as SOAs, plug-in systems, and monolithic systems. We consider this as future work. Furthermore, part of our future work is to build the Pheme prototype into a commercial product in cooperation with a number of industrial partners.

Bibliography

[Boe02] B. Boehm. Get Ready for Agile Methods, with Care. Computer 35(1):64–69, 2002.


[DJV04] E. Dolstra, M. de Jonge, E. Visser. Nix: A Safe and Policy-Free System for Software Deployment. In Proceedings of the 18th Conference on Systems Administration (LISA 2004), Atlanta, USA, November 14-19, 2004. Pp. 79–92. 2004.

[HHW99] R. S. Hall, D. Heimbigner, A. L. Wolf. A Cooperative Approach to Support Software Deployment Using the Software Dock. In International Conference on Software Engineering. Pp. 174–183. 1999.

[JB06] S. Jansen, S. Brinkkemper. Definition and Validation of the Key Process Areas of Release, Delivery and Deployment of Product Software Vendors: Turning the Ugly Duckling into a Swan. In Proceedings of the International Conference on Software Maintenance (ICSM 2006, Research Track). September 2006.

[JBB05] S. Jansen, S. Brinkkemper, G. Ballintijn. A Process Framework and Typology for Software Product Updaters. In Ninth European Conference on Software Maintenance and Reengineering. Pp. 265–274. IEEE, 2005.

[JFB07] S. Jansen, A. Finkelstein, S. Brinkkemper. Providing Transparency in the Business of Software: A Modelling Technique for Software Supply Networks. In Proceedings of the 8th IFIP Working Conference on Virtual Enterprises. 2007.

[dJ02] M. de Jonge. Source Tree Composition. In Proceedings: Seventh International Conf. on Software Reuse. LNCS 2319, pp. 17–32. Springer-Verlag, Apr. 2002.

[Sjo93] D. Sjoberg. Quantifying Schema Evolution. Inf. Softw. Tech. 35(1):35–44, 1993.

[vdS05] T. van der Storm. Continuous Release and Upgrade of Component-Based Software. In Proceedings of the 12th International Workshop on Software Configuration Management (SCM-12). 2005.

[vdS07] T. van der Storm. The Sisyphus Continuous Integration System. In Proceedings of the Conference on Software Maintenance and Reengineering (CSMR’07). IEEE Computer Society Press, 2007. To Appear.

[VEBD06] Y. Vandewoude, P. Ebraert, Y. Berbers, T. D’Hondt. An alternative to Quiescence: Tranquility. In Proc. of the 22nd IEEE International Conference on Software Maintenance. Pp. 73–82. IEEE Computer Society, Washington, DC, USA, 2006.

[Yin03] R. K. Yin. Case Study Research - Design and Methods. SAGE Pubs., 3rd ed., 2003.


Evolutionary Problems in Aspect-Oriented Software Development2

Kim Mens1 and Tom Tourwé2

1 [email protected], http://www.info.ucl.ac.be/~km/ Département d’Ingénierie Informatique, Université catholique de Louvain, Belgium

2 [email protected], http://www.win.tue.nl/~ttourwe/ Software Engineering & Technology Lab, Eindhoven University of Technology, Netherlands

Abstract: When adopting aspect technology, regardless of whether it is to develop a new software system or to migrate a legacy one, the developers will eventually be confronted with evolutionary problems. This paper highlights some of these problems in aspect-oriented programming. Although many of the problems have already been identified by researchers and some promising techniques are currently being investigated, this research domain is still relatively immature and warrants further investigation by the community.
Keywords: aspect-oriented software development, software evolution, aspect exploration, aspect extraction, aspect evolution

1 Introduction

Just like the industrial adoption of object-oriented programming in the early nineties led to a demand for migrating software systems to an object-oriented solution — triggering a boost of research on software evolution, reverse engineering, reengineering and restructuring — the same is currently happening for the aspect-oriented paradigm. Aspect-oriented software development is a novel paradigm that aims to provide a solution for the software developers’ inability to represent certain concerns in a given software system in a modular way, when those concerns crosscut the chosen decomposition of the software into modules.

In the absence of aspect-oriented software development techniques, crosscutting concerns lead to duplicated code fragments throughout the software system, which is believed to negatively impact evolvability, maintainability and understandability of the software. Aspect-oriented software development proposes a solution to this acclaimed problem by introducing the notion of aspects, which are designated language constructs that allow a developer to localise a concern’s implementation, and thus improve modularity, understandability, maintainability and evolvability of the code.

Adopting a new software development technology brings about particular risks, however, and aspect-oriented programming forms no exception. In our view, one of the most important remaining obstacles for adopting aspect-oriented software development technology is the fact that it has to be introduced into existing software systems. Most software systems today are not developed from scratch, but rather are enhanced and maintained legacy systems. We predict that real widespread adoption of aspect-oriented programming will be achieved only if the risks and consequences of adopting aspect-oriented programming in existing software are studied, and if the necessary tools and techniques become available for dealing with those risks.

This paper addresses some of the issues and challenges related to such adoption of aspect-oriented software development from a software evolution perspective. Although we will not present any solutions to the problems mentioned3, the overview of evolutionary problems given in this paper can be useful to adopters of aspect technology to get a better idea of the evolution issues they may confront sooner or later and of the risks involved. For researchers it can serve as a map of interesting problems remaining to be studied.

2 This paper is a reduced version of a 28-page book chapter on “Evolution Issues in Aspect-Oriented Software Development” by the same authors that will appear in a software evolution book later this year. [MT07]


Figure 1: Software evolution and aspect-oriented software development.

As illustrated by Figure 1, the discussed issues range from the exploration of crosscutting concerns in legacy code, via the extraction of this code into an aspect-oriented solution, to the evolution of the final aspect-oriented program over time. In the next section we give a brief introduction to aspect-oriented programming. Sections 3 to 5 will then discuss the three activities of aspect exploration, extraction and evolution, respectively, in more detail. In particular, we will focus on the challenges and risks that need to be dealt with or will be encountered when conducting those activities.

2 A crash course in aspect-oriented programming

The goal of aspect-oriented programming is to provide an advanced modularisation scheme to separate the core functionality of a software system from system-wide concerns that cut across the implementation of this core functionality. To this extent, AOP introduces a new abstraction mechanism, called an aspect. An aspect is a special kind of module that represents a crosscutting concern. Aspects are defined independently from the core functionality of the system and integrated with that base program by means of a dedicated aspect weaver, a tool similar to a compiler that merges aspect code and base code in the appropriate way. Figure 2 illustrates this idea.

Figure 2: Aspect-oriented programming.

In most current-day aspect languages, of which AspectJ (http://www.eclipse.org/aspectj/) is the most well-known, aspects are composed of pointcuts and advices. Whereas advices correspond to the code fragments that would otherwise crosscut an entire program, pointcuts correspond to the locations in the source code of the program where the advice will be applied (i.e., where the crosscutting code will be woven). A pointcut essentially specifies a set of joinpoints, which are well-defined locations in the structure or execution flow of a program where an aspect can weave in its advice code. Typical AspectJ joinpoints are method invocations or field accesses, for example. Pointcuts which merely enumerate the signatures of all methods they need to capture are called extensional or enumeration-based pointcuts. Such pointcuts are brittle and can break easily when the base program evolves. More robust pointcut definitions can be obtained by mentioning explicitly only the information that is absolutely required and using wildcard patterns or other mechanisms to abstract over certain implementation details. Such pointcuts are referred to as intensional or pattern-based pointcuts.

In summary, pointcuts, whether they are extensional or intensional, specify those places in the code or its execution where the advice code needs to be woven. This means that aspects are not explicitly invoked by the program. The base program (i.e., the program without the aspects) is not aware of the aspects that apply to it. Instead, it is the aspects themselves that specify when and where they act on the program. This property, which has been referred to as the obliviousness property of aspect orientation [FF00], is one of the most essential characteristics of an aspect-oriented programming language.
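To make these notions concrete, the following minimal sketch shows a logging aspect written in AspectJ's Java-compatible annotation style. The Account class, the pointcut expressions, and the log messages are illustrative assumptions; the example is not taken from the paper.

    // A minimal logging aspect in AspectJ's annotation style (valid Java syntax,
    // woven by the AspectJ compiler). The base class Account is oblivious to the
    // aspect: it contains no reference to it. All names are illustrative.
    import org.aspectj.lang.JoinPoint;
    import org.aspectj.lang.annotation.Aspect;
    import org.aspectj.lang.annotation.Before;
    import org.aspectj.lang.annotation.Pointcut;

    class Account {
        void deposit(int amount)  { /* core functionality */ }
        void withdraw(int amount) { /* core functionality */ }
    }

    @Aspect
    public class LoggingAspect {

        // Extensional (enumeration-based) pointcut: lists the captured methods
        // one by one, and therefore breaks as soon as a method is added or renamed.
        @Pointcut("execution(void Account.deposit(int)) || execution(void Account.withdraw(int))")
        void enumeratedAccountOperations() {}

        // Intensional (pattern-based) pointcut: abstracts over the method names
        // with a wildcard and is more robust against base-program evolution.
        @Pointcut("execution(void Account.*(int))")
        void anyAccountOperation() {}

        // Advice: the crosscutting logging code, woven at every selected joinpoint.
        @Before("anyAccountOperation()")
        public void logCall(JoinPoint jp) {
            System.out.println("entering " + jp.getSignature());
        }
    }

Weaving this aspect with the AspectJ compiler inserts the logging call before every matched method execution; removing the aspect from the build leaves the base program untouched, which is exactly the obliviousness property discussed above.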

3 Aspect exploration

Before introducing aspects in existing software, one should explore whether that software actually exhibits any crosscutting concerns that are worth being extracted into aspects. The tyranny of the dominant decomposition [TOHSMS99] implies that large software is likely to contain crosscutting concerns. During the aspect exploration phase we try to discover aspect candidates in the software, i.e., we try to discover what the crosscutting concerns are, where and how they are implemented, and what their impact on the software’s quality is.

However, migrating a legacy software system into an aspect-oriented one is a non-trivial endeavour. The sheer size and complexity of many existing systems, combined with the lack of documentation and knowledge of such systems, render it practically infeasible to manually transform their crosscutting concerns into aspects. To alleviate this problem, a growing body of research exists that proposes tools and techniques to assist software engineers in semi-automatically migrating crosscutting concerns to aspects. Most of these approaches distinguish two phases in this migration process: aspect exploration and aspect extraction. Whereas Section 4 will focus on the aspect extraction phase, the current section discusses issues and challenges related to aspect exploration.

We define aspect exploration as the activity of identifying and analysing the crosscutting concerns in a non-aspect-oriented system. A distinction can be made between manual exploration supported by special-purpose browsers and source-code navigation tools, on the one hand, and aspect mining techniques that try to automate this process of aspect discovery and propose one or more aspect candidates to their user, on the other hand [KMT07]. When exploring an existing software system for relevant crosscutting concerns that could potentially be turned into aspects, the following questions need to be addressed:

What (kind of) crosscutting concerns can be discovered? Examples of crosscutting concerns that are often mentioned in literature, and exemplified by small-scale example projects, include simple and basic functionalities like tracing, logging or precondition checking. Do such simple concerns actually occur in industrial code? Is aspect-oriented programming only suited to implement such simple concerns? Do industrial software systems contain more complex crosscutting concerns? How good is aspect-oriented programming at tackling those?

How are crosscutting concerns implemented in the absence of aspects? Since crosscutting concerns in a traditional software system are per definition not well-localised, they need to be implemented over and over again. To minimise this implementation overhead, developers tend to rely on a variety of programming idioms and naming and coding conventions (a small sketch of such an idiom follows after these questions). What (kind of) crosscutting concerns are implemented by which programming idioms and conventions?

(How) do crosscutting concerns affect software quality? When crosscutting concerns occur in a software system, (how) do they affect the quality of that software? How can we measure their impact on quality factors like understandability, maintainability and adaptability? How can these measures help us assess whether extracting the concerns into aspects is beneficial?

How to find where crosscutting concerns are located in the code? When we want to turn a crosscutting concern into an aspect, we need to know where exactly it is located in the code. This knowledge is important to determine an appropriate pointcut for the extracted aspect. How can we be sure that we have found all relevant crosscutting concerns, that we have covered them completely, that there are no false positives or negatives?
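As a concrete, deliberately simplified illustration of such an idiom, the Java fragment below shows tracing code that is scattered and duplicated across methods and recognisable only through a coding convention. The class and method names are hypothetical and serve illustration only.

    // Hypothetical fragment showing how a crosscutting concern (tracing) is
    // typically implemented without aspects: the same snippet is duplicated at
    // the start of every method and marked only by the idiom itself.
    public class OrderService {

        public void placeOrder(String id) {
            Trace.enter("OrderService.placeOrder");   // duplicated tracing idiom
            // ... core functionality ...
        }

        public void cancelOrder(String id) {
            Trace.enter("OrderService.cancelOrder");  // same idiom, copied again
            // ... core functionality ...
        }
    }

    // Helper used by the convention; an aspect miner could look for calls to it.
    class Trace {
        static void enter(String location) { System.out.println("TRACE enter " + location); }
    }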


Over the last few years, aspect exploration has become quite an active research domain and a whole range of different approaches, techniques and tools for supporting or automating the activity of aspect exploration have been proposed. Several researchers have studied complex crosscutting concerns that occur in real-world industrial software systems, and they are starting to get an idea about how such concerns are implemented in the absence of aspect-oriented programming techniques. Nevertheless, more such studies on industrial-size software systems are needed. Similarly, although preliminary research attempts have been undertaken for the remaining two challenges (i.e., how crosscutting concerns affect software quality and how to locate them in the software), more research is needed in order to come up with satisfying answers and solutions.

For example, more empirical work and quantitative studies are needed on how crosscutting concerns affect software quality. The impact of crosscutting concerns on software quality factors like evolvability is not yet clear, and has been investigated mostly on small-scale example software only. Part of the problem stems from the fact that aspect-oriented programming is a relatively young paradigm, and hence little historical information (in the form of revision histories etc.) is available for study. Another problem is that, more often than not, a traditional version and an aspect-oriented programming version of the same software system are not available, making it hard to conduct objective comparisons.

As for the identification of crosscutting concerns, all known techniques are only partly automated and still require a significant amount of user intervention [KMT07]. In addition, most aspect mining techniques are only academic prototypes and, with few exceptions, have not been validated on industrial-size software yet. Although this may hinder industrial adoption, the existence of such techniques is obviously a step forward as opposed to having no tool support at all. Another issue with applying automated aspect mining techniques is that preferably the user should have some knowledge about the system being mined for aspects. Indeed, different aspect mining techniques rely on different assumptions about how the crosscutting concerns are implemented.

4 Aspect extraction

Once the crosscutting concerns have been identified and their impact on software quality has been assessed, we can consider migrating the software to an aspect-oriented version. We refer to this activity as aspect extraction. If we do decide to migrate the software towards an aspect-oriented solution, we need a way of turning the aspect candidates, i.e., the crosscutting concerns that were identified in the exploration phase, into actual aspects. At the same time, we need techniques for testing the migrated software to make sure that the new version of the software still works as expected, as well as techniques to manage the migration step, for example to ensure that we can still keep on using the software during the transition phase.

Aspect extraction is the activity of separating the crosscutting concern code from the original code, by moving it to one or more newly-defined aspects, and removing it from the original code. Since an aspect is typically defined as a collection of pointcuts and associated advice code, extraction entails the identification of suitable pointcuts and the definition of the appropriate advice code corresponding to the crosscutting concern code.


Research on aspect extraction thus focusses on how to automate the activity of extracting aspects from existing source code. Only with an automated approach can an extraction be both efficient and accurate. Existing software systems often consist of millions of lines of code, and a real-world crosscutting concern thus easily consists of thousands of lines of code. Manually extracting aspects from these crosscutting concerns, if feasible at all, would not only be very time-consuming, but prone to errors as well. A correct aspect weaves the appropriate code at the appropriate joinpoints, and hence requires correct advice code and correct pointcuts. These are hard to construct, given the scattered and tangled nature of crosscutting concerns and the size of current-day software systems. In order to be able to extract crosscutting concern code from the original code into the appropriate aspects, the following questions need to be addressed:

How to separate crosscutting concerns from the original source code? Sophisticated program analysis and manipulation techniques are needed to separate crosscutting concern code, since by definition such code is tangled with other code. Depending on the kind and the amount of tangling, some code is easier to separate than other code. Tracing code, for example, is often relatively independent of the code surrounding it, whereas Bruntink et al.’s experiment [BvDT06] showed that exception handling code exhibits significantly more tangling.

How to determine appropriate joinpoint(s) for the extracted aspects? An aspect needs to specify the exact location where advice code needs to be woven, by means of a (set of) pointcut(s) that select(s) the appropriate joinpoints. However, aspect languages impose certain restrictions on the locations in the static or dynamic software structure that can be made available as joinpoints. Hence, determining the appropriate joinpoints requires significant attention.

How to determine the appropriate pointcut(s) for the extracted aspects? Assuming that appropriate joinpoints can be found, the next problem is that of determining the appropriate pointcut(s) that describes these joinpoints. Additionally, the pointcuts need to expose the appropriate context information for the advice code (a small sketch of such context exposure follows after these questions).

How to determine the appropriate advice code for the extracted aspects? The crosscutting concern code typically cannot be transformed “as is” into advice code. Small modifications are often required, due to the code being defined in a different context, but also due to small variations in the scattered snippets of crosscutting concern code.

How to ensure correctness of the extracted code? The correctness requirement is of course related to behaviour preservation, i.e., extracting aspects from existing source code is expected to preserve the external behaviour of that code. Even for traditional refactorings this is already considered a non-trivial problem [Opd92]; when extracting aspects from traditional programs the problem only becomes harder. Obviously, automating the transformations that are applied can help meet this requirement, as automated transformations can be proven correct by using preconditions [Opd92]. Additionally, appropriate test suites are of great value, but are not always present in (legacy) software. Furthermore, since the extraction process affects the original code structure, certain tests that rely on that structure may need to be restructured as well. In particular, certain tests may need to be transformed into their aspect-oriented equivalent.

How to manage the migration step? A final issue that is of particular value for industrial software is how to ensure that we can keep on using the software during the transition phase.

Research on aspect extraction is still in its infancy, as most researchers focussed primarily on aspect exploration first. Nonetheless, work exists that contributes to the growing body of aspect extraction research [BDvDT07, MF04, MF05, EV04, BCH+05, HMK05, HOU03]. Unfortunately, although the major issues and problems related to aspect extraction seem to have been identified by the aspect-oriented community, no satisfactory solutions exist yet. Most existing techniques touch upon a specific part of a particular problem, but no single technique provides a complete solution to all problems identified. This is no surprise, as research on aspect extraction is only just emerging.

First of all, the level of automation of current extraction techniques is poor, and most could benefit significantly from more automation. Second, the issue of preserving the behaviour of the software after extraction has not yet been tackled explicitly. Proving the correctness of aspects that were extracted manually is practically impossible. Automated techniques, however, could be proven correct. Given Opdyke's experience in this matter [Opd92], it is clear that constructing formal proofs for the complex extraction transformations is far from trivial. However, formally defining the necessary preconditions for such transformations should be feasible, but has currently not yet been realised.

Related to verifying the behaviour-preservation of aspect extraction, the issue of migrating the original test suites to the migrated software system remains. Unfortunately, little work exists on testing aspect-oriented systems (notable exceptions are the works of Xu and Xu [XX06] and Xie and Zhao [XZ06]), let alone on the migration of the original tests to their aspect-oriented equivalent.

Finally, little or no empirical validation of the proposed techniques on large-scale, real-world software systems has been performed. This makes it hard to assess whether the techniques actually work in practice, what their advantages and disadvantages are, whether they scale to large industrial software, and whether the extraction actually improves the quality of the software.

5 Aspect evolution

According to Belady and Lehman's first law of software evolution [LB85], every software system that is used will continuously undergo changes or become useless after a period of time. There is no reason to believe that this law does not hold for aspect-oriented software too. But to what extent is evolution of aspect-oriented software different from evolution of traditional software? Can the same techniques that are used to support evolution of traditional software be applied to aspect-oriented software? Do the new abstraction mechanisms introduced by aspect-oriented programming give rise to new types of evolution problems that require radically different solutions?


Once the crosscutting concerns in the original software system have been explored and the system has been migrated to a new aspect-oriented version, the system enters a new phase in which it will need to be continuously maintained and evolved in order to cope with changing requirements and environments. In this section, we highlight some of the issues and problems related to such evolution of aspect-oriented systems.

As was argued in the introductory section, aspect-oriented software development overcomes some of the problems related to software evolution, in particular the problems related to maintaining and evolving independently the different (crosscutting) concerns in a system. But since all software systems are subject to evolution (remember the first law of software evolution), aspect-oriented systems themselves too will eventually need to evolve. We define aspect evolution as the process of progressively modifying the elements of an aspect-oriented software system in order to improve or maintain its quality over time, under changing contexts and requirements.

While research on aspect exploration is only starting to produce its first results and research on aspect extraction is still gaining momentum, research on aspect evolution is even younger. This is largely due to the fact that few large-scale aspect-oriented software systems exist today. Even if they did exist, they would be too young to be the subject of a rigorous scientific study regarding their long-term evolution problems. Despite the immaturity of the field, some initial research questions have been raised, related to how the evolution of aspect-oriented software differs from evolving traditional software and whether techniques and tools that are successful in supporting traditional software evolution can still be applied to the evolution of aspect-oriented software. It seems that the very techniques that aspect-oriented programming provides to solve or limit some of the evolution problems with traditional software actually introduce a series of new evolution problems. This phenomenon is sometimes called the evolution paradox of aspect-oriented programming [TBG03].

To conclude, it is clear that aspect evolution is still an emerging research area, in which not all important research questions have been identified, let alone answered. Nevertheless, it is important to mention that an awareness of the problem is growing inside the aspect-oriented software development community, and that more and more researchers in that community are starting to investigate such problems [HOU03, MF05, KS04, SG05, SGS+05, MBG06, HNB06, LHBL06].

6 Summary

In this paper, we highlighted some important evolution-specific issues and challenges related to the adoption of aspect-oriented programming. We identified three different stages that adopters of aspect-oriented programming may need to go through: exploration, extraction and evolution. The overall conclusion that can be drawn for each of these stages is that aspect-oriented software development is still a young paradigm that needs to mature and requires much more rigorous research. Nonetheless, it is a promising paradigm that receives a lot of attention, and gives rise to several tools and techniques that already provide at least some kind of support for early adopters. For a more detailed overview of these tools and techniques we refer to our book chapter [MT07].


Bibliography

[BCH+05] David Binkley, Mariano Ceccato, Mark Harman, Filippo Ricca, and Paolo Tonella. Automated refactoring of object oriented code into aspects. In Proc. Int'l Conf. Software Maintenance (ICSM), pages 27–36. IEEE Computer Society, 2005.

[BDvDT07] Magiel Bruntink, Maja D'Hondt, Arie van Deursen, and Tom Tourwé. Simple crosscutting concerns do not exist. In Proc. Int'l Conf. Aspect-Oriented Software Development (AOSD). ACM Press, 2007.

[BvDT06] Magiel Bruntink, Arie van Deursen, and Tom Tourwé. Discovering faults in idiom-based exception handling. In Proc. Int'l Conf. Software Engineering (ICSE). ACM Press, 2006.

[EV04] Ran Ettinger and Mathieu Verbaere. Untangling: a slice extraction refactoring. In Proc. Int’l Conf. Aspect-Oriented Software Development (AOSD), pages 93–101. ACM Press, March 2004.

[FF00] R. Filman and D. Friedman. Aspect-oriented programming is quantification and obliviousness. October 2000, Minneapolis. http://ic-www.arc.nasa.gov/ic/darwin/oif/leo/filman/text/oif/aop-is.pdf

[HMK05] Jan Hannemann, Gail C. Murphy, and Gregor Kiczales. Role-based refactoring of crosscutting concerns. In Proc. Int'l Conf. Aspect-Oriented Software Development (AOSD), pages 135–146. ACM Press, 2005.

[HNB06] W.K. Havinga, I. Nagy, and L.M.J. Bergmans. An analysis of aspect composi- tion problems. In Proceedings of the Third European Workshop on Aspects in Software, August 2006, University of Bonn, pages 1–8, 2006.

[HOU03] Stefan Hanenberg, Christian Oberschulte, and Rainer Unland. Refactoring of aspect-oriented software. In Proc. Int'l Conf. Object-Oriented and Internet-based Technologies, Concepts, and Applications for a Networked World (Net.ObjectDays), pages 19–35. Springer-Verlag, 2003.

[KMT07] A. Kellens, K. Mens, and P. Tonella. A survey of automated code-level aspect mining techniques. Trans. AOSD, 2007. To be published.

[KS04] Christian Koppen and Maximilian Störzer. PCDiff: Attacking the fragile pointcut problem. In Kris Gybels, Stefan Hanenberg, Stephan Herrmann, and Jan Wloka, editors, European Interactive Workshop on Aspects in Software (EIWAS), September 2004.

[LB85] Meir M. Lehman and L. A. Belady. Program Evolution: Processes of Software Change. Apic Studies In Data Processing. Academic Press, 1985.


[LHBL06] Roberto E. Lopez-Herrejon, Don S. Batory, and Christian Lengauer. A disciplined approach to aspect composition. In PEPM, pages 68–77, 2006.

[MBG06] Kim Mens, Johan Brichau, and Kris Gybels. Managing the evolution of aspect-oriented software with model-based pointcuts. In D. Thomas, editor, Proc. European Conf. Object-Oriented Programming (ECOOP), volume 4067 of Lecture Notes in Computer Science, pages 501–525. Springer-Verlag, 2006.

[MF04] Miguel P. Monteiro and João M. Fernandes. Object-to-aspect refactorings for feature extraction. In Proc. Int'l Conf. Aspect-Oriented Software Development (AOSD). ACM Press, 2004.

[MF05] Miguel P. Monteiro and João M. Fernandes. Towards a catalog of aspect-oriented refactorings. In Proc. Int'l Conf. Aspect-Oriented Software Development (AOSD), pages 111–122. ACM Press, 2005.

[MT07] Kim Mens and Tom Tourwé. Evolution Issues in Aspect-Oriented Programming, page 28. 2007. To be published.

[Opd92] William F. Opdyke. Refactoring: A Program Restructuring Aid in Designing Object-Oriented Application Frameworks. PhD thesis, University of Illinois at Urbana-Champaign, 1992.

[SG05] Maximilian Stoerzer and Juergen Graf. Using pointcut delta analysis to support evolution of aspect-oriented software. In Proc. Int'l Conf. Software Maintenance (ICSM), pages 653–656. IEEE Computer Society, 2005.

[SGS+05] K. Sullivan, W. G. Griswold, Y. Song, Y. Chai, M. Shonle, N. Tewari, and H. Rajan. On the criteria to be used in decomposing systems into aspects. In Symp. Foundations of Software Engineering joint with the European Software Engineering Conf. (ESEC/FSE 2005). ACM Press, 2005.

[TBG03] Tom Tourwé, Johan Brichau, and Kris Gybels. On the existence of the AOSD-evolution paradox. In Lodewijk Bergmans, Johan Brichau, Peri Tarr, and Erik Ernst, editors, SPLAT: Software engineering Properties of Languages for Aspect Technologies, March 2003.

[TOHSMS99] Peri Tarr, Harold Ossher, William Harrison, and Stanley M. Sutton Jr. N degrees of separation: multi-dimensional separation of concerns. In Proc. Int'l Conf. Software Engineering (ICSE), pages 107–119. IEEE Computer Society Press, 1999.

[XX06] Dianxiang Xu and Weifeng Xu. State-based incremental testing of aspect-oriented programs. In Proc. Int'l Conf. Aspect-Oriented Software Development (AOSD), pages 180–189. ACM Press, 2006.

[XZ06] Tao Xie and Jianjun Zhao. A framework and tool supports for generating test inputs of AspectJ programs. In Proc. Int'l Conf. Aspect-Oriented Software Development (AOSD), pages 190–201, 2006.


Improved Program Comprehension with Revision Control System Repository Analysis

Andrew Burn

Department of Computer Science, Durham University, UK

Abstract: Program comprehension is a crucial element in software maintenance. A maintainer can use analysis and visualisation tools to gain a degree of understanding of an unfamiliar project, but analysis of a single version of that project is limited in the scope of the information it can produce. Analysis of a number of versions of that project can provide a greater depth of information, as well as providing insights into the behaviour of previous developers and the development process itself. Revision control systems such as SubVersion or CVS are an integral part of any software development project, especially in a distributed development environment. Because an RCS repository contains numerous revisions of the project, not just the current working version, analysis of that repository can be used to derive information on the evolution and development of that project, information which can greatly aid in program comprehension. This paper presents Perceive, a system for analysing RCS repositories and generating reports and visualisations to aid in program comprehension. A case study in which Perceive is applied to the source repositories from a collection of Software Engineering Group (SEG) projects from Durham University's Computer Science degree demonstrates that Perceive provides useful results that can aid in the comprehension and assessment of these projects.
Keywords: Visualisation, SubVersion, Version Control, Program Comprehension

1 Introduction

The evolution and increasing complexity of software over its lifetime [LR01] dictate that software maintenance can be a very difficult process. This is exacerbated when maintainers find themselves working with no prior knowledge of the software, poor documentation and no access to the original developers [BR00]. Without analytical tools to aid them, maintainers can easily introduce more problems than they address. Thankfully, a number of tools exist to aid in program comprehension [PPBH91] and impact analysis [Aji95] - i.e. helping the maintainer understand the software and assess the effects of their change activities. Visualisation models that help the maintainer build a mental model of the software are frequently integral parts of these tools. The two types of software analysis - static and dynamic - can be used in these tools, but static code analysis typically proves more useful for program comprehension [PRW03]. However, even code analysis has limits to the information it can provide. By analysing the history of the project as well as the current version, a program comprehension tool can provide an extra dimension of information to the maintainer. This history can be likened to the software documentation - it

allows the maintainer to understand not just the code "as-is" but why the code was written that way, how it arrived at its current state and what processes the developers went through to reach the current version.

Revision control systems (RCS) are vital tools in software development. They provide a central source of software assets (e.g. source code or documentation), offer backup functionality and provide a workflow process for multiple users to work on the same project simultaneously. As well as being a useful development tool, the repository created and used by a revision control system contains the whole history of a project's implementation. This history can provide a great deal of information about that project, including developer behaviour and patterns, project structure and details on workflow and collaboration, and it can even be used to make predictions and guide further work.

This paper presents Perceive, a system for data-mining a version repository and presenting information on the evolution of a project over the course of its development, from inception through to maintenance. This information is presented both in textual/tabular formats and via a variety of visualisations, all of which can be customised dynamically to best suit the user and the task at hand. The paper concludes with a case study; the system is applied to a series of projects from a university second year group-based software development module. This case study evaluates what benefits instructors can gain from repository analysis when assessing the projects, which maps to the benefits that would be gained by maintainers working on commercial projects. To further evaluate Perceive, it is also applied to some mature, open source projects and the results are compared to those from the students' projects.

2 Related Work

Data mining of version repositories to analyse project change is not a new concept. Gall et al. [GJKT97] created a database of changes to a large software project, characterising various aspects of the project and discovering differences in the evolution of various modules, going so far as to suggest candidates for restructuring or reengineering.

Weissgerber et al. [WD05] have presented a method, ROSE, for using a version repository to perform impact analysis, a task typically conducted through a static analysis of source code. ROSE demonstrates that analysis of version repositories can provide useful, reliable results comparable with existing methods, but more importantly it demonstrates that historical data is a way of accessing the experience of other developers. For a maintainer with little experience of the project being developed, this access can be invaluable and can help to overcome the potential problems of poor documentation and lack of access to the original developers. Other research [Gla06], [RD04], [WMSL04] has shown that repository analysis can greatly aid program comprehension and maintenance.

Studies and surveys have repeatedly shown [Kos03], [BK01], [SWM00] that visualisation tools are an invaluable aid to program comprehension and software development, but that care must be taken to ensure that the visualisations employed are relevant and intuitive to the user. Other research has demonstrated that the manner in which a visualisation is deployed is as important as the tool itself [OBM05]. The sheer amount of information to be presented in the source code of a typical project can

be immense, and this is amplified when attempting to visualise multiple versions of that code. To overcome this, flexible, navigable visualisations are employed. A number of methods can be used to focus on some areas of the code while collapsing or removing others [JY04], [PC05], but these can fail if the user is not familiar with the topology used.

3 Perceive

Perceive is a system and a proof-of-concept tool for analysing version repositories and extracting useful information from them that might not otherwise be available. This section details the methods and visualisations Perceive employs to generate and present this information.

3.1 Analysis

An important aspect of Perceive is that a subset of revisions can be selected for analysis and compared with the overall set. This subset can be selected based on a range (not necessarily contiguous) of revisions, or by selecting one or more developers. For example, someone using Perceive might only be interested in the set of revisions from a single developer between two dates, or of all developers from one team for the duration of the project.

The first function Perceive offers is the project summary, which shows information including the start and end dates of the project, the project duration, the number of files, users, revisions and changes, the number of messages (and the message-revision ratio), as well as some metrics on the number of changes per revision, such as mean, median, minimum and maximum. If a revision subset is selected, then each of these items is accompanied by the same measure for the selection, as well as the percentage the selection represents of the total. This makes it trivial, for example, to get an overview of a single developer's contribution to the project. This functionality allows a maintainer to quickly understand the overall form of a project.

Next are two breakdowns of revision times. Firstly, the revision times are aggregated by the hour in which the revision was made. This enables developers to see at what times of day activity takes place. Secondly, there is a grouping of revisions into days, which enables developers to chart how many revisions or changes were made across the lifetime of the project (or the selected section). Timeline analysis is a feature more relevant to project managers than to maintainers, but it has some benefits for program comprehension. The ability to determine where activity has been concentrated, or where development has perhaps been rushed, will indicate areas of code of more interest to a maintainer. Factors such as a spike in the number of changes per day or an increase in "out of office hours" work could lead a maintainer to find more problematic code.

Next is the Impact Analysis section. This allows the user to select a file and Perceive will suggest other files which may be related, indicating that if the user is changing the selected file, they may wish to change these files as well. This system is similar to - but more lightweight than - ROSE, developed by Weissgerber et al. [WD05]. Like ROSE, Perceive provides "support" and "confidence" with each suggestion. Support is the number of revisions that the selected file (A) and the suggested file (B) both appeared in, while confidence is the ratio of "appearances of both A and B" to "appearances of A". Obviously, if A only appears in one revision then every other file changed in that revision will be given a confidence of 100%. To counter this, Perceive

allows users to specify threshold levels for both support and confidence. Like ROSE, Perceive can merge consecutive revisions if they are from the same user and occur within a specific time period - this can help to create additional links between files that might have otherwise been omitted.

The impact analysis carried out by Perceive is lightweight compared to ROSE - it can only work at the file level, while ROSE can give suggestions for changes to individual code entities. On the other hand, Perceive is entirely language agnostic and can find links between any types of files without having to know how to parse them. Also, the ability to restrict the impact analysis to data from a subset of revisions or users is unique to Perceive and provides an additional layer of analysis. For various reasons, a maintainer may want to ignore data from certain users (perhaps to restrict suggestions to data gained from the developer most experienced with the affected files) or time spans (for example, in the case of the introduction of new development practices, a maintainer may want to use data from that moment onwards).
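For illustration, the following minimal C++ sketch shows how support and confidence values of this kind could be computed from a change log. The Revision record, the toy history and all names are hypothetical and are not taken from Perceive's actual implementation.

    #include <cstddef>
    #include <iostream>
    #include <map>
    #include <set>
    #include <string>
    #include <vector>

    // Illustrative revision record: the files that were committed together.
    struct Revision {
        std::string author;
        std::vector<std::string> files;
    };

    int main() {
        // Toy history; in a real setting this would be read from the RCS log.
        // Consecutive revisions by the same author within a short time window
        // could be merged into one logical revision before this point.
        std::vector<Revision> history = {
            {"alice", {"a.cpp", "a.h"}},
            {"alice", {"a.cpp", "b.cpp"}},
            {"bob",   {"a.cpp", "a.h", "c.cpp"}},
        };

        // Index: file -> set of revisions in which it appears.
        std::map<std::string, std::set<std::size_t>> appearances;
        for (std::size_t r = 0; r < history.size(); ++r)
            for (const std::string& f : history[r].files)
                appearances[f].insert(r);

        const std::string selected = "a.cpp";            // the file the user picked (A)
        const std::set<std::size_t>& revsA = appearances[selected];

        for (const auto& entry : appearances) {
            if (entry.first == selected) continue;
            // support = number of revisions containing both A and B
            std::size_t support = 0;
            for (std::size_t r : entry.second)
                if (revsA.count(r)) ++support;
            // confidence = support / number of appearances of A
            double confidence = revsA.empty() ? 0.0
                                : static_cast<double>(support) / revsA.size();
            std::cout << selected << " -> " << entry.first
                      << "  support=" << support
                      << "  confidence=" << confidence << '\n';
        }
        return 0;
    }

Threshold filtering, as described above, would then simply discard suggestions whose support or confidence falls below user-chosen minima.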

3.2 Visualisation

As well as the textual and tabular information presented so far, Perceive also generates a number of visualisations which can be manipulated or explored by the user.

3.2.1 Modified Icicle Plot (MIPVis)

As discussed previously, one major problem with visualising a software project is the amount of data to be presented. Allowing a user to view an entire project at once and focus on various areas of it is an invaluable feature, and can be made possible by segregating all of the files in a project into groups using some grouping or clustering algorithm and presenting them in an icicle plot [BN01]. The obvious method for grouping files is to create a hierarchy based on directory structure. Other, more useful or relevant mechanisms may exist, but this method has the advantage that the topology it creates is immediately familiar and accessible to the user and maps directly to their understanding of the project.

Perceive uses a horizontal, colour-coded icicle plot to present the project. Each file and group of files is represented by a block, which can be expanded or collapsed as desired. This ability for users to navigate the visualisation ensures that the entire project can be visualised at once while reducing the problems of scale by allowing the user to focus on the areas relevant to them.

When a file is selected, the visualisation performs an impact analysis of a change to that file and highlights all other visible files that will be affected (see Figures 1(a) and 1(b)). This highlighting is based on the confidence of the suggestion, so a slight effect will have a lighter shading and will not stand out as much as a strong suggestion with a heavy shading. As well as highlighting files, each group box that is visible but has not been expanded (a "leaf" group) is shaded based on an aggregate of the analysis of all files contained within it. This enables the user to easily see effects that may happen in unexpected areas of code, without having to expand the whole project to find them. As with all other areas of Perceive, the MIPVis can be constrained based on the selection of users and revisions, so the impact analysis can be limited to the user's choices.


Figure 1: MIPVis. (a) The root of a MIPVis. (b) A MIPVis with a file selected.
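As a rough sketch of the directory-based grouping described above, the following C++ fragment builds a file hierarchy from path strings; the Node type and all names are illustrative and not part of Perceive.

    #include <iostream>
    #include <map>
    #include <memory>
    #include <sstream>
    #include <string>

    // Illustrative tree node: a directory group whose children are
    // sub-groups or files (files simply have no children).
    struct Node {
        std::string name;
        std::map<std::string, std::unique_ptr<Node>> children;
    };

    // Insert a path such as "src/gui/main.cpp" into the hierarchy.
    void insertPath(Node& root, const std::string& path) {
        Node* current = &root;
        std::istringstream segments(path);
        std::string part;
        while (std::getline(segments, part, '/')) {
            std::unique_ptr<Node>& child = current->children[part];
            if (!child) {
                child = std::make_unique<Node>();
                child->name = part;
            }
            current = child.get();
        }
    }

    // Print the tree with indentation, one level per directory.
    void print(const Node& node, int depth = 0) {
        std::cout << std::string(depth * 2, ' ') << node.name << '\n';
        for (const auto& entry : node.children)
            print(*entry.second, depth + 1);
    }

    int main() {
        Node root;
        root.name = "/";
        for (const char* p : {"src/gui/main.cpp", "src/gui/dialog.cpp", "doc/readme.txt"})
            insertPath(root, p);
        print(root);
        return 0;
    }

Each node of such a tree would correspond to one (collapsible) block of the icicle plot, with leaf-group shading aggregated over the confidences of the files below it.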

3.2.2 The Radial Visualisation (RadVis)

While the MIPVis allows a user to view an entire project, navigating to areas of interest, it is also useful to have an overview of the whole project at once, with nothing collapsed or hidden. Trading detail for scale, the Radial Visualisation displays the project as a graph with its nodes arranged in concentric circles. The nodes are the file groups, radiating out from the root group, and the files exist as children of their parent groups. At each "level" in the graph, nodes are given a "slice" of the graph proportional to their total size, which is calculated based on the number of descendants (see Figure 2(a)).

Figure 2: RadVis. (a) RadVis, centred on the root directory. (b) The "net" shows inferred relations between files.

The nodes are connected by two sets of edges. Firstly, and most simply, the topological links are drawn (linking groups to their children); secondly, links are drawn between files using the same relational method as the MIPVis (see Figure 2(b)). This allows a full, project-wide visualisation showing the entire structure of the project and all the significant links between files. The visualisation offers some interaction: users can select specific nodes, which displays their name and highlights all of their links (see Figure 3(a)); in addition, each component (files, groups, radial edges and relational edges) is optional, and can be drawn or hidden at will.


The most important form of interactivity, however, is that when a selection of users or revisions is made, instead of restricting the visualisation to just that selection, each group is shaded based on the selection's contribution to that group (see Figure 3(a)). So if a single user is selected, it becomes immediately obvious which areas of the project they worked on, and how much. Alternatively, if a range of revisions is selected, it is easy to see how much of the project was affected by that range.

Figure 3: RadVis: the effects of selecting a subset of developers on the visualisation. (a) A project with no selection made. (b) A subset of developers selected.

The RadVis is extremely useful in gaining a broad understanding of the links between various areas of the project - the impact analysis “net” clearly shows areas that are closely linked, and its absence shows areas that stand alone. This is useful for both maintainers and developers, especially in a distributed development environment such as cross-site collaboration.
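A minimal sketch of the slice-per-descendant layout described at the start of this subsection is given below; the Node type, the recursion and the toy hierarchy are illustrative assumptions, not Perceive's actual layout code.

    #include <iostream>
    #include <string>
    #include <vector>

    // Illustrative node of the group/file hierarchy drawn by the radial layout.
    struct Node {
        std::string name;
        std::vector<Node> children;
    };

    // Total number of descendants (groups and files) below a node.
    int descendants(const Node& n) {
        int count = 0;
        for (const Node& c : n.children)
            count += 1 + descendants(c);
        return count;
    }

    // Give each child a slice of [start, end) proportional to its subtree size.
    void assignSlices(const Node& n, double start, double end) {
        std::cout << n.name << ": [" << start << ", " << end << ") degrees\n";
        int total = descendants(n);
        if (total == 0) return;
        double cursor = start;
        for (const Node& c : n.children) {
            double share = (end - start) * (1 + descendants(c)) / total;
            assignSlices(c, cursor, cursor + share);
            cursor += share;
        }
    }

    int main() {
        Node a{"a.cpp", {}}, b{"b.cpp", {}}, readme{"readme.txt", {}};
        Node src{"src", {a, b}};
        Node doc{"doc", {readme}};
        Node root{"root", {src, doc}};
        assignSlices(root, 0.0, 360.0);   // the root occupies the full circle
        return 0;
    }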

3.2.3 The Flow Visualisation (FlowVis)

Both the MIPVis and RadVis show only a single version, or an aggregate, of the project being visualised, trading information on evolution for topology. In other words, there is no depiction of change or project flow in the previous visualisations, but the project structure is more readily apparent. FlowVis addresses this by providing a display that reduces the emphasis on topology in favour of showing every revision of the project. The model is simply a 2D grid, with the files on the X-axis and the revisions on the Y-axis. Each point on the grid is colour coded depending on the state of that file in that particular revision - absent, present, created, deleted, copied or changed (see Figure 4(a)). Further colour coding is used if a range of revisions or developers is selected. In this case, any revision not included in the selection is dimmed, highlighting the selection in the visualisation. If the user selects a code group (a directory), then all files (or entities) within that group are highlighted (see Figure 4(b)). This two-way selection allows a very quick overview of the project and of what contributions were made where, when and by whom.

An obvious drawback of this visualisation is that it can quickly become cumbersome - a project that has seen years of active development is likely to have several thousand revisions and files. This, by necessity, makes FlowVis images very large - too large for an average monitor. Coupled with the fact that a single pixel is a single data point, it cannot easily be scaled down. A possible solution to this is to restrict the ranges drawn, perhaps using only a subset of files, or a specific time span.


Figure 4: FlowVis. (a) No selection made. (b) A selection of developers highlighted.
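The underlying data model of such a grid can be sketched as follows in C++; the FileState values, the toy change log and the character rendering are illustrative assumptions rather than Perceive's implementation.

    #include <cstddef>
    #include <iostream>
    #include <map>
    #include <string>
    #include <utility>
    #include <vector>

    // Possible states of a file in a revision, as colour-coded by the grid.
    enum class FileState { Absent, Created, Changed, Copied, Present, Deleted };

    int main() {
        // Toy change log: one entry per revision, listing (file, state) pairs.
        std::vector<std::vector<std::pair<std::string, FileState>>> log = {
            {{"a.cpp", FileState::Created}},
            {{"a.cpp", FileState::Changed}, {"b.cpp", FileState::Created}},
            {{"b.cpp", FileState::Deleted}},
        };

        // Assign each file a column on the X-axis.
        std::map<std::string, std::size_t> column;
        for (const auto& rev : log)
            for (const auto& change : rev)
                column.emplace(change.first, column.size());

        // Build the grid, carrying Present/Absent forward between changes.
        std::vector<std::vector<FileState>> grid(
            log.size(), std::vector<FileState>(column.size(), FileState::Absent));
        for (std::size_t r = 0; r < log.size(); ++r) {
            if (r > 0)
                for (std::size_t c = 0; c < column.size(); ++c)
                    grid[r][c] = (grid[r - 1][c] == FileState::Absent ||
                                  grid[r - 1][c] == FileState::Deleted)
                                     ? FileState::Absent
                                     : FileState::Present;
            for (const auto& change : log[r])
                grid[r][column[change.first]] = change.second;
        }

        // Render one character per cell; each row of output is a revision.
        const char symbol[] = {'.', '+', '#', 'c', '|', 'x'};
        for (const auto& row : grid) {
            for (FileState s : row)
                std::cout << symbol[static_cast<int>(s)];
            std::cout << '\n';
        }
        return 0;
    }

Dimming unselected revisions or highlighting a selected directory would then amount to choosing a different colour (here, a different character) per cell based on the current selection.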

4 Case Study and Evaluation

4.1 Software Engineering Group Project (SEG)

As part of its second year Software Engineering course, Durham University's Computer Science Department includes a software engineering group (SEG) project [BD02]. In past years this has been limited to Durham students, but recently it has partnered with Newcastle University. This two-campus project creates groups which include students from both universities, requiring much more use of collaboration technology and practices [DD06]. For the 2006/07 projects there were twelve teams, each comprising a number of students mixed between Durham and Newcastle universities. To aid in collaboration during the implementation phase of the project, each team was supplied with a SubVersion repository, use of which was not compulsory. Perceive is applied to the SEG repositories as it would be by an assessor or instructor, with an emphasis on program and developer comprehension, analogous to the use of the system by maintainers and managers.

4.2 Real-World Projects

Many open source projects use SubVersion as their revision management system and a few have made complete dumps of their repositories available for public download - two such projects are Putty (http://www.chiark.greenend.org.uk/~sgtatham/putty/) and CapiSuite (http://www.capisuite.de). Both of these projects are mature and developed by experienced programmers, and so provide a suitable basis for comparison with the smaller, less experienced SEG projects.


4.3 The SEG Case Study

4.3.1 Use of SubVersion

A feature of SubVersion is the ability to leave a message with each revision, giving other users an indication of what was changed. Putty and CapiSuite have a 100% message/revision ratio - every single revision, no matter how small, has a message attached. This behaviour is typical for mature projects and provides a useful form of documentation. In the SEG projects, however, the average ratio was a mere 31%. Proper use of revision control systems, especially in a distributed development environment, is vital, as the message logs can facilitate more rapid program comprehension, and the ability to search through the messages allows more efficient maintenance.

4.3.2 Visualising SEG

The visualisation tools in Perceive are designed to either improve program comprehension or perform impact analysis. MIPVis is primarily designed for impact analysis, but the SEG projects reveal that the method used (basing suggestions on past developer behaviour) is ineffective over such a short time period, as there is little time to develop meaningful relationships between areas of code. However, the work done by Weissgerber et al. [WD05] shows that the technique is effective in the long term, and thorough evaluation of their technique supports its use.

RadVis, however, is much more useful at improving program comprehension, even on smaller projects. Even at first glance, it provides a large amount of information about the structure of a project, and simple manipulation of the revision/user selection quickly reveals which users contributed to which aspects of the project. Figure 3(b) shows a SEG project with the developers from a single site selected; a user can quickly see which areas of the project that site was responsible for, and by selecting a few high-level groups the names of the affected groups are quickly determined.

FlowVis provides another extremely useful analysis of workload distribution in SEG projects. Because it highlights revisions based on selections, it gives immediate and intuitive feedback on how and when work was done. Figure 4(b) shows the FlowVis for a SEG project with the developers from a single site selected. This reveals several things - how the work of one site varied over time and how that work appears to significantly outweigh the other site, but it also shows which files were most active at any given time. There are large groups of files and directories that are created, occasionally worked on, and then left for the remainder of the project. Such results could provide useful suggestions about which files to filter out of analysis and visualisations in future, as they take up significant space but have little relevance to the needs of many potential users of Perceive.

5 Conclusions and Future Research

Perceive has been developed as a model for analysis and visualisation of revision control system repositories, with a wide range of applications in mind. The case study has shown that Perceive does indeed provide a number of useful methods of exploring a project and its history which are

applicable to a range of users and tasks. Importantly, they are greatly useful in aiding program comprehension, a vital process for any maintainer. By providing insight into the development of a project as well as its current state, Perceive can help a maintainer understand what has led to the current version of the software. Although the case study was focussed on small, academic projects, the applications demonstrated are equally useful to commercial projects, where assessors and students are replaced by maintainers, managers and developers.

It is intended that future research will involve upcoming SEG projects to create a pool of data large enough to draw statistically significant results from, and also to involve the students themselves in assessing the use of Perceive in their projects. Feedback from the developers will help guide the direction Perceive takes in the future as well as assessing the validity of some of the assumptions made and results produced. Perceive will also continue to be used with real world projects to further guide its development and ensure that it is applicable to as many users and as many tasks as possible.

Bibliography

[Aji95] S. Ajila. Software Maintenance: An Approach to Impact Analysis of Objects Change. Softw. Pract. Exper. 25(10):1155–1181, 1995.

[BD02] L. Burd, S. Drummond. Forging Planned Inter Year Co-operation Through a Peer Mentor System For Group Work Projects. In 3rd Annual LTSN-ICS Conference. 2002.

[BK01] S. Bassil, R. K. Keller. Software Visualization Tools: Survey and Analysis. In IWPC ’01: Proceedings of the 9th International Workshop on Program Comprehension. Pp. 7–17. IEEE Computer Society, 2001.

[BN01] T. Barlow, P. Neville. A Comparison of 2-D Visualizations of Hierarchies. In INFOVIS '01: Proceedings of the IEEE Symposium on Information Visualization 2001 (INFOVIS'01). Pp. 131–138. IEEE Computer Society, Washington, DC, USA, 2001.

[BR00] K. H. Bennett, V. T. Rajlich. Software maintenance and evolution: a roadmap. In ICSE ’00: Proceedings of the Conference on The Future of Software Engineering. Pp. 73–87. ACM Press, New York, NY, USA, 2000.

[DD06] S. Drummond, M. Devlin. Software Engineering Students' Cross-site Collaboration: An Experience Report. In HEA ICS 7th Annual Conference. 2006.

[GJKT97] H. Gall, M. Jazayeri, R. Klosch, G. Trausmuth. Software Evolution Observations Based on Product Release History. In ICSM ’97: Proceedings of the International Conference on Software Maintenance. Pp. 160–170. IEEE Computer Society, 1997.

[Gla06] L. Glassy. Using version control to observe student software development processes. J. Comput. Small Coll. 21(3):99–106, 2006.


[JY04] J. Abello, S. G. Kobourov, R. Yusufov. Visualizing Large Graphs with Compound-Fisheye Views and Treemaps. In Pach (ed.), Graph Drawing, New York, 2004. Pp. 431–441. Springer, 2004.

[Kos03] R. Koschke. Software visualization in software maintenance, reverse engineering, and re-engineering: a research survey. Journal of Software Maintenance 15(2):87–109, 2003.

[LR01] M. M. Lehman, J. F. Ramil. Rules and Tools for Software Evolution Planning and Management. Annals of Software Engineering 11(1):15–44, 2001.

[OBM05] C. O'Reilly, D. Bustard, P. Morrow. The war room command console: shared visualizations for inclusive team coordination. In SoftVis '05: Proceedings of the 2005 ACM symposium on Software visualization. Pp. 57–65. ACM Press, New York, NY, USA, 2005.

[PC05] P. Neumann, S. Schlechtweg, M. S. T. Carpendale. ArcTrees: Visualizing Relations in Hierarchical Data. In Proceedings of Eurographics, IEEE VGTC Symposium on Visualization (EuroVis 2005). Pp. 53–60. Eurographics, 2005.

[PPBH91] S. Paul, A. Prakash, E. Buss, J. Henshaw. Theories and techniques of program understanding. In CASCON '91: Proceedings of the 1991 conference of the Centre for Advanced Studies on Collaborative research. Pp. 37–53. IBM Press, 1991.

[PRW03] M. J. Pacione, M. Roper, M. Wood. A Comparative Evaluation of Dynamic Visualisation Tools. In WCRE '03: Proceedings of the 10th Working Conference on Reverse Engineering. P. 80. IEEE Computer Society, 2003.

[RD04] F. V. Rysselberghe, S. Demeyer. Mining Version Control Systems for FACs (Frequently Applied Changes). In Proceedings of the International Workshop on Mining Software Repositories. Pp. 48–52. IEEE Computer Society, 2004.

[SWM00] M.-A. D. Storey, K. Wong, H. A. Müller. How do program understanding tools affect how programmers understand programs? Sci. Comput. Program. 36(2-3):183–207, 2000.

[WD05] T. Zimmermann, P. Weissgerber, S. Diehl, A. Zeller. Mining Version Histories to Guide Software Changes. IEEE Trans. Softw. Eng. 31(6):429–445, 2005.

[WMSL04] X. Wu, A. Murray, M.-A. Storey, R. Lintern. A Reverse Engineering Approach to Support Software Maintenance: Version Control Knowledge Extraction. In WCRE ’04: Proceedings of the 11th Working Conference on Reverse Engineering (WCRE’04). Pp. 90–99. IEEE Computer Society, 2004.


Refactoring for Performance: An Experience Report

Matthias Rieger1, Bart Van Rompaey1, Bart Du Bois1, Karel Meijfroidt2 and Paul Olievier2

1 [email protected], [email protected], [email protected], http://www.lore.ua.ac.be/
Lab On Reengineering, University of Antwerp, Belgium

2 [email protected], [email protected]
Access Network Division, Alcatel-Lucent, Belgium

Abstract: Object-oriented implementation techniques are highly debated in the embedded world, as they are said to be memory and performance hungry. This paper reports on a pilot project where the evolution of a C++ subsystem was threatened by a lack of performance. We identified some specific parts of the system that were contributing the most to the memory overhead and which had the least motivation to be represented by objects. Using a structured approach we refactored these objects into basic types, resulting in a 51% reduction in memory footprint as well as a 33% improvement in startup time.
Keywords: performance anti-patterns, refactoring, object-orientation

1 Introduction

The world of embedded software, with its severe constraints on memory and runtime performance, has long been a strong domain for "lower level" languages like C which allow tight control over these aspects. The breakthrough of object-oriented programming into this domain has been hindered by its reputation of being memory and performance hungry. With time, however, the OO paradigm, especially in the "better C" that is C++, is finding its proponents in this world as well. We have found, however, that an overeager application of OO design principles can indeed lead to the above mentioned problems.

In this paper, we describe a pilot project in which an OO transport system of a network access node, which was threatening to become obsolete due to its memory footprint and the consequent slow startup, was "saved". We selectively refactored certain key parts of the object representation, moving from a fully OO design to a hybrid solution which keeps objects and primitive data side by side.

This paper is structured as follows. Section 2 introduces related work. In Section 3 we refine the pilot project's context and motivate the problem statement. Next, in Section 4 we describe the structured approach we followed while defining and isolating the problem as well as the actual activities undertaken to salvage the system. We discuss the lessons learned and draw conclusions in Section 5.


2 Related Work

Chung et al. propose a decomposition of performance in a soft-goal interdependency graph [2], in accordance with the characteristics of efficiency suggested by ISO 9126 [7]. Yu et al. and Andreopoulos propose the use of such soft-goal interdependency graphs to guide refactoring towards improved performance without sacrificing simplicity [13, 1].

Smith and Williams have published a series of papers on the subject of Software Performance Anti-Patterns [9, 10, 11]. They list the originating problems and indicate the role of refactoring in the associated solutions.

Temmerman et al. illustrate a refactoring scenario that optimizes the energy consumption of embedded systems by removing redundant data and their accesses as well as moving methods closer to the data [12]. Demeyer demonstrates that refactoring large conditionals into polymorphic calls may improve the performance of C++ programs [4].

Overbey et al. suggest automating hand optimizations common in supercomputer programming by a refactoring engine, and deferring their execution until build time, in order to preserve the maintainability of the original code base [8]. Examples of these refactorings are the unrolling of loops, and the optimization of data structures based on the machine's cache size.

3 Context

Our pilot project consists of a 130 kSLOC C++ subsystem managing configuration parameters for network lines. The system is implemented in a distributed fashion; middleware ensures communication between the separate processors. A framework has been developed which provides middleware services such as memory management and persistency, among other things. A home-grown code generator turns specifications into framework classes that can be used by client code. One design goal for this framework was to make every data element look uniform so that data manipulation, be it serializing for communication or persistency, could be handled in a common manner (simplifying, for example, the code generator). In particular, this meant representing primitive types as objects too, as illustrated in Figure 1. In addition, each such primitive type class has an associated fixed or variable-length array class, depending on how the primitive type is being used in the system.

One of the system's main tasks is to load the configuration data, in the form of spectrum profiles, from the database at startup and distribute it to the remote extensions. In the current implementation, the chosen representation is a hierarchical object tree. The spectrum profile and its parts are described in a simple specification language, and the code generator creates the framework classes; once instantiated, they can be sent over the network immediately.

The system has been around for three years and is already dangerously close to exhausting the memory and runtime reserves. New requirements ask for the number of spectrum profiles to be doubled. A complete rewrite is necessary if we cannot find a way to keep the system alive by making it perform under the new requirements.


Figure 1: Part of the primitive class hierarchy that derives from ULong.

4 A Structured Approach

In this section we work through the individual steps of the pilot project, from the selection of the problem to the concrete refactoring measures taken to solve it. We focus on the design aspect of choosing an object-oriented representation for the spectrum profile data.

4.1 Getting a Specific Problem Description

We learnt about the problem using a two-pronged approach: we performed static problem detection analysis on the source code, and we did an experience-based assessment by interviewing the current maintainers of the system. What the static analysis could not deliver, i.e. information about performance, was provided by the "war stories" of the maintainers [5, 3]. Their awareness of the problems came from comparisons with other subsystems performing similar tasks in a more efficient way, and also from hands-on debugging, revealing the many levels of indirection that the service implementations exhibited. Both information sources led us to believe that an indiscriminate use of objects was to blame for much of the performance problems. Consequently we looked for ways to remodel parts of the data structure without objects.

4.2 Identifying Improvement Opportunities

We employed two criteria while looking through the spectrum profile data structure to find parts which could be improved:

• ”No substantial new functionality” is associated with the data. Mere data holders that do not provide operations—beyond getters and setters—on their members will only introduce an additional level of indirection. Since they do not make use of object-oriented features, remodeling them using basic types would only marginally affect the client code.


Note that we disregard the inherited framework services here. We needed to sacrifice some of the framework’s uniform interface to achieve the required performance enhancements.

• A large number of these objects must be present in the system. Only then will refactoring them contribute substantially to the reduction of the memory overhead.

Using these criteria we identified two such parts of the spectrum profile. For each case, we explain what the proposed restructuring consists of.

Wrapped Primitives. The leaf nodes of the spectrum profile object hierarchy are instances of a set of Wrapped Primitives classes which each encapsulate a single data member of a basic type. The only operations that Wrapped Primitives offer are the ones that already exist for the basic types. Wrapped Primitives are the basis of all complex objects and are therefore abundantly present in the object hierarchy (1500 objects of this kind need to be created at startup). As such, this design resembles the Excessive Dynamic Allocation performance anti-pattern [9], yet it is different as (i) the Wrapped Primitives are mostly instantiated at startup; and (ii) due to the nature of the contained data, other solutions than e.g. an object pool are possible. We call the observed performance anti-pattern Redundant Objectification. As a refactoring, we propose to revert the Wrapped Primitives to primitive data types again, freeing up the per-instance overhead of object alignment and virtual table pointer space. Most operations on the object, such as access and calculations, are immediately or easily supported on primitives as well.

PSDShapes. The PSDShape is a set of configuration data for an external device. The class representing the PSDShape does not contain any additional operation. In fact, the PSDShape data is only passed from the database to the external device after a short inspection step, which needs to determine whether the data has changed with respect to a previous instance. The semantics of the individual fields of the data are therefore not needed. A PSDShape is the largest part of a spectrum profile (it is about 40 times bigger than the average data member), and is contained more than a dozen times in each profile. If we can reduce its size, we achieve a significant impact on the memory overhead. The PSDShape problem is a good illustration of the design space optimization challenge, in terms of memory footprint and number of data accesses, that Temmerman et al. pose: laying out the data structure in memory (called Associative Array), offering direct data access, versus a Sequence characterized by minimal storage [12]. Realizing that the direct data access offered by the old layout is not really needed by the current system, we can emphasize the memory-optimal side of the equation without encountering drawbacks. This is another example of the Redundant Objectification anti-pattern. We propose to replace the object representation of the PSDShapes with a simple byte array. Comparison of two arrays, the only operation performed on PSDShapes, is readily done with a simple iteration over all fields of the array.
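A minimal sketch of this proposal, assuming a hypothetical fixed PSDShape size and names that do not appear in the actual system, could look as follows:

    #include <algorithm>
    #include <array>
    #include <cstdint>
    #include <iostream>

    // Illustrative: a PSDShape kept as an opaque byte array instead of an
    // object tree; the inspection step only needs to know whether it changed.
    constexpr std::size_t kPsdShapeBytes = 64;   // hypothetical size
    using PSDShape = std::array<std::uint8_t, kPsdShapeBytes>;

    // Field-by-field comparison is a plain element-wise iteration
    // (std::memcmp would work equally well here).
    bool unchanged(const PSDShape& previous, const PSDShape& current) {
        return std::equal(previous.begin(), previous.end(), current.begin());
    }

    int main() {
        PSDShape fromDatabase{};   // zero-initialised toy data
        PSDShape cached{};
        cached[3] = 7;             // pretend one field differs

        std::cout << std::boolalpha
                  << "changed: " << !unchanged(cached, fromDatabase) << '\n';
        return 0;
    }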


In both cases, the framework services (serialization, ...) will have to be provided by independent functions for the now primitive data types. This implies moving the implementations that were formerly part of the wrapper classes to global functions.

Figure 2: A comparison of the memory allocated for an instance of the basic type int and the wrapped primitive provided by the framework.
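To make the comparison of Figure 2 and the proposed reversion concrete, the following before/after sketch contrasts a wrapped primitive with the raw primitive plus a free serialization function; the class, the typedef and the function names are illustrative and are not taken from the actual framework.

    #include <cstdint>
    #include <iostream>

    // Before (illustrative): every configuration value is a full object with
    // virtual framework services, costing a vptr plus alignment per instance.
    class ULong {
    public:
        explicit ULong(std::uint32_t v = 0) : value_(v) {}
        virtual ~ULong() {}
        virtual void serialise(std::ostream& out) const { out << value_; }
        std::uint32_t get() const { return value_; }
        void set(std::uint32_t v) { value_ = v; }
    private:
        std::uint32_t value_;
    };

    class LineId : public ULong {          // a typical leaf of the hierarchy
    public:
        using ULong::ULong;
    };

    // After (illustrative): the wrapper is reverted to the primitive itself,
    // and the framework service moves to a free function for that type.
    typedef std::uint32_t LineIdT;

    void serialise(LineIdT id, std::ostream& out) { out << id; }

    int main() {
        LineId wrapped(42);
        LineIdT plain = 42;
        std::cout << "wrapped: " << sizeof(wrapped) << " bytes, "
                  << "plain: " << sizeof(plain) << " bytes\n";
        serialise(plain, std::cout);
        std::cout << '\n';
        return 0;
    }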

4.3 Estimating Effort, Risks and Gains

4.3.1 Effort

We split up the effort estimation between refactoring (i) the code generator and (ii) the application code.

The code generator is a central element in the use of the framework, as about 75% of the code of the system is generated. To introduce primitives, the internals of the generator had to undergo a complex structural change: representing primitive and complex types instead of only complex ones. As the behavior of the generated code needed to remain unchanged, we estimated that testing would require a considerable effort as well.

As the Wrapped Primitives were a key part of the framework and form the leaf nodes of the profile data hierarchy, they were used extensively in the non-generated application code. In an attempt to quantify the modification effort, we located about 3100 occurrences of the Wrapped Primitives names in the source code using common instance names in regular expressions (e.g. LineId instances were typically named lineid or tmpLineid). The assessment confirmed our expectation that the objects were only used in a few categories: data access and modification, and the framework services. Next, these occurrence categories were fitted into a model estimating the required effort.

No change required. Since the objects were designed to behave like primitives, and because we typedef'd the names of the wrapper classes to the names of the primitive types, many occurrences did not require any change. 93.4% of occurrences, such as declarations of local variables and attributes as well as formal parameters, fall into this category.

Trivial, repetitive changes. An example of such a change is refactoring a method call on the original object into a function call with the primitive as first parameter. A large part of these changes could be automatically performed. This category represents about 5.5% of the occurrences.


Complex changes. In a limited number of cases (1.1%), classes inheriting from the same Wrapped Primitive override base class operations differently. When reverting to primitives, we had to relay all original framework calls to the correct corresponding function implementations generated by the adapted code generator.
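For the trivial, repetitive category described above, a hypothetical before/after fragment (the names and the free function are illustrative, not the project's actual code) looks as follows:

    #include <iostream>

    // Hypothetical generated free function replacing the wrapper's method.
    void serialise(unsigned long lineid, std::ostream& out) { out << lineid; }

    int main() {
        unsigned long lineid = 17;   // formerly a LineId wrapper object
        // Before the refactoring the call was, illustratively:
        //     lineid.serialise(std::cout);
        // After reverting LineId to a primitive it becomes a plain function
        // call with the primitive as the first parameter:
        serialise(lineid, std::cout);
        std::cout << '\n';
        return 0;
    }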

4.3.2 Risks

The restructuring would involve complex actions on the code generator, which was unfamiliar to us. The generator itself was not covered by any tests, except indirectly by the subsystem tests covering the generated code. By concentrating on the generated code, the best exhibition of the generator's behavior, and comparing it with the previous output after each restructuring step, we were able to quickly establish test coverage.

For the application code, we had to be especially careful with changes in the complex category. In those cases we had to replace the original method invocation with a call to the right replacement function, which required understanding of the context. Especially overloaded operators, a C++ construct frequently used in this subsystem, had to be checked against the standard primitive operator implementations. Three Wrapped Primitives fell under these conditions, requiring emulation of the original functionality and corresponding adaptation of client code.

4.3.3 Gains

Before proceeding to the actual restructuring, the gain was estimated for both memory footprint and startup time. For the memory footprint, we relied on a static, quantitative analysis of objects in memory. For every node in the object hierarchy the size in number of objects as well as in bytes was calculated, based on the memory model of Figure 2. The results were compared with a target design that did not contain the complex objects any more. A part of the estimation can be seen in Table 1.

Parameter Name           Parameter Type   Old Size   New Size
ProfileIndex             Int                     8          4
ProfileName              String                 48         48
ProfileScope             Scope                  12          8
ProfileStatus            Int                     8          4
LineOpMode               OpModes                20         20
LineDownstream           Int                     8          4
LineCarrierDownstream    Array Uchar           524         76
...
Total                                         2236        828

Table 1: Extract of the estimation of memory reduction in the CommonData part of a spectrum profile. Size is given in bytes.

In order to estimate the gain in startup time, which was of a lesser concern to us, we calculated the number of constructor invocations. This is a simplistic model, and ultimately was not very accurate in predicting actual gain. The estimated improvement of 81% less memory was however

deemed worthwhile to give a go-ahead for the restructuring.

4.4 Refactoring Steps

4.4.1 Testing the Code Generator

We automated the test environment for the code generator as much as possible: after having extracted from the makefiles all the configurations in which the generator was run during a build, we could script execution, collection of results, comparison with the output from the previous step, and check-in into a versioning system in case no difference was detected. Many of the smaller steps would not change the produced code at all, which let us assert the unchanged behavior of the generator with diff. The restructurings that did affect the produced code generally resulted in differences that could be captured with one or two patterns. These were easily recognizable when browsing the differences with a visual diff viewer, even with the approximately 70,000 lines of generated code. After the correctness was asserted, a check-in was triggered by hand.

4.4.2 Changing the Code Generator

We changed the code generator in two phases. In a first phase, we employed "refactor to understand" [5] in various cleanup actions. For example, we increased the information hiding of the generator's class hierarchy by abstracting oft-repeated idioms. Although these actions were not strictly necessary to perform our restructuring, we found the selective increase of program understanding invaluable [6]. Since most of the cleanup steps were proper refactorings, i.e. they did not alter the behavior of the generator, we could rely completely on our simple testbed to determine the validity of our actions.

Next, we proceeded to the actual refactoring operation of splitting up the class hierarchy. Versioning often, especially before a complicated refactoring step, proved to be valuable in cases where a wrong turn was taken in the restructuring. A detailed record of all steps, including descriptions, will also help to give explanations to maintainers who know the code from before, especially if the change has been so thorough that it leaves little of the original code untouched and makes a simple before-after comparison useless.

4.4.3 Refactoring the Application Code

In order to gain confidence in our effort estimation model, we started by refactoring the Wrapped Primitives that occurred the least, and refactored gradually until all of them were replaced. Although manually refactoring the client code initially seemed a daunting task, it proved to be a lot easier and more efficient than initially indicated by the calculated frequency of occurrence. First of all, the analysis pointed out that most of the occurrences required either no or only trivial changes. Secondly, many occurrences required similar changes, which were either automatable or easy to repeat after the first application. For the PSDShapes, we simplified the member-by-member comparison to a "blind" array comparison, where each field is compared to its corresponding field in the partner array. The comparison was done fully in the application code, which meant that we did not have to touch the code generator at all.
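The following is an illustrative sketch of that simplification; the names and the field layout of a PSDShape are assumptions, since the paper does not show the actual data structure.

    import java.util.Arrays;

    // Hypothetical sketch of the PSDShape simplification: instead of comparing wrapped
    // members one by one, both shapes expose their values as plain arrays and the
    // comparison becomes a single "blind" field-by-field array comparison.
    public class PsdShapeComparison {
        static boolean sameShape(int[] shape, int[] other) {
            return Arrays.equals(shape, other);
        }

        public static void main(String[] args) {
            int[] a = {0, 10, 20, 35};
            int[] b = {0, 10, 20, 35};
            System.out.println(sameShape(a, b));   // true, no code generator involvement
        }
    }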

4.5 Results

Measuring the performance of the system before and after the restructuring, we assessed the following improvements:

Memory Reduction: The memory taken up by a spectrum profile was reduced by 51%.

Startup time Reduction: The startup time per spectrum profile could be reduced by 33%.

Both of these improvements contributed to the prolonged lifetime of the system, making a full re-implementation of the middleware framework unnecessary.

5 Conclusions

In this pilot project we were confronted with a system that suffered from a performance problem typical of embedded software, similar to and illustrating earlier performance anti-pattern descriptions. We contribute an anti-pattern named Redundant Objectification. We employed a structured approach to identify and tackle these performance problems. Starting from a generic problem description, we used a combination of developer interviews and static analysis to identify the most beneficial areas of improvement. Note that, mostly due to the deep knowledge that we could tap into by asking developers familiar with the system, it was not necessary to employ dynamic analysis to locate the center of the problem. We subsequently proposed a more efficient design and assessed the expected gain as well as the effort and possible pitfalls of the refactoring operation. This allowed us to select and apply a development strategy ensuring the behavior-preserving nature of the restructuring. As a contribution to the debate about using object technology in embedded systems, we state that performance considerations must be heeded when dealing with the constraints of embedded systems. It may become necessary to relinquish the elegance and simplicity of a uniformly object-oriented design and seek a hybrid objects-and-primitives solution. This requires detailed knowledge of the problem domain, to decide where the tradeoff of improved performance vs. the hassle of mixing programming paradigms is most favorable. An interesting question in this respect is where the cohesion of the data-behavior combination represented by an object is weakest, due to either the simplicity of the data structure, the lightweight nature of the behavior half, or both characteristics combined.

Acknowledgments: This work was executed in the context of the ITEA project if04032, entitled Software Evolution, Refactoring, Improvement of Operational & Usable Systems (SERIOUS), and has been sponsored by IWT, Flanders.


Bibliography

[1] B. Andreopoulos. Satisficing the conflicting software qualities of maintainability and performance at the source code level. In WER - Workshop em Engenharia de Requisitos, pages 176–188, December 2004.
[2] L. Chung, B. Nixon, E. Yu, and J. Mylopoulos. Non-Functional Requirements in Software Engineering. Kluwer Academic Publishers, 2000.
[3] C. Del Rosso. Continuous evolution through software architecture evaluation: a case study. Journal of Software Maintenance and Evolution: Research and Practice (JSME), 18(5):351–383, 2006.
[4] S. Demeyer. Refactor conditionals into polymorphism: What is the performance cost of introducing virtual calls? In Proceedings of the International Conference on Software Maintenance, pages 627–630, 2005.
[5] S. Demeyer, S. Ducasse, and O. Nierstrasz. Object-Oriented Reengineering Patterns. Morgan Kaufmann, 2002.
[6] B. Du Bois, S. Demeyer, and J. Verelst. Does the "refactor to understand" reverse engineering pattern improve program comprehension? In Proceedings of CSMR'05, pages 334–343, Manchester, UK, 2005.
[7] ISO/IEC. 9126-1:2001 Software engineering - Product quality - Part 1: Quality model, 2001.
[8] J. Overbey, S. Xanthos, R. Johnson, and B. Foote. Refactorings for Fortran and high-performance computing. In Second International Workshop on Software Engineering for High Performance Computing System Applications, colocated with ICSE, 2005.
[9] C. U. Smith and L. G. Williams. Software performance antipatterns: Common performance problems and their solutions. In Int. CMG Conference, pages 797–806, 2001.
[10] C. U. Smith and L. G. Williams. New software performance antipatterns: More ways to shoot yourself in the foot. In Int. CMG Conference, pages 667–674, 2002.
[11] C. U. Smith and L. G. Williams. More new software antipatterns: Even more ways to shoot yourself in the foot. In Int. CMG Conference, pages 717–725, 2003.
[12] M. Temmerman, E. G. Daylight, F. Catthoor, S. Demeyer, and T. Dhaene. Optimizing data structures at the modeling level in embedded multimedia. Journal of Systems Architecture (Elsevier JSA), 53(8):539–549, August 2007.
[13] Y. Yu, J. Mylopoulos, E. Yu, J. Leite, L. L. Liu, and E. D'Hollander. Software refactoring guided by multiple soft-goals. In The First International Workshop on REFactoring: Achievements, Challenges, Effects, colocated with WCRE, 2003.


A case study in model refactoring to improve component-based software architecture

Khalid Allem† ‡

† Service de Génie Logiciel, Université de Mons-Hainaut ‡ SmalS-MvM/Egov, Rue du Prince Royal 102, 1050 Bruxelles, Belgique

Abstract: In model-driven engineering (MDE), the activity of model refactoring for J2EE applications is very important for improving quality and decreasing complexity. Architecture refactoring aims to improve the structure of a software architecture while preserving its external behaviour and other quality characteristics such as adaptability, understandability and performance. In this short article we introduce the concept of refactoring software architectures using UML 2 component diagrams. Based on our industrial experience, we shed more light on the current state of the art in this research domain and propose a case study showing architecture refactoring for improved maintainability, performance, and security.

Keywords: architecture refactoring, Component-Based Software Architecture, UML 2 component diagrams

1 Introduction
In recent years, Model-Driven Engineering (MDE) has attracted considerable interest from both research and industry, and it is fast becoming a principal paradigm in Software Engineering. MDE is an approach to software development where the primary focus is on models. With model-driven engineering, software is not built from scratch, but rather by making use of existing reusable models. Models are built to represent different views on a software system; a model is an artefact that conforms to a metamodel and represents a given aspect of a system. One of the most important operations in MDE is model transformation: a model transformation language is used to define how a set of source models is visited and transformed into a set of target models. One essential technique to evolve and transform models is model refactoring. This technique restructures a model while preserving its behavioural properties. The aim is to improve the models by making them more generic, modular and reusable.

One of the emerging trends in MDE is the use of UML for system specification. UML provides two kinds of physical diagrams to describe the physical relationships among software elements, the various components in a system and their dependencies: deployment diagrams and component diagrams; we use both in this paper. In the present work we introduce the concept of what we call "Architecture Refactoring". In Architecture Refactoring we refactor the architecture itself; the aim is to improve the software evolution process and to prevent the decay of the system architecture.


2 Preliminaries
Two essential concepts for evolving and transforming architectures are components and architecture refactorings:
• A component represents a modular, deployable, and replaceable part of a system that encapsulates implementation and exposes a set of interfaces [OMG].
• Architecture refactorings restructure the architecture while preserving its behavioural properties. The aim is to improve the architectures by making them more generic, modular and reusable; in this context, it is necessary to ensure that the refactored architecture preserves the behaviour of the original architecture.

This paper deals with the problem of software evolution and the decay of software architectures. It presents our industrial experience feedback, which assists software evolution in its crucial part, the architecture refactoring. The assistance is performed for both technical and organizational aspects of the software evolution.

3 Architecture refactoring
In this section we give a definition of software architecture in the context of the Rational Unified Process: an architecture is the set of significant decisions about the organization of a software system, the selection of the structural elements and their interfaces by which the system is composed, together with their behavior as specified in the collaborations among those elements, the composition of these structural and behavioral elements into progressively larger subsystems, and the architectural style that guides this organization: these elements and their interfaces, their collaborations, and their composition (Kruchten, Booch, Rumbaugh, and Jacobson: The Unified Modeling Language User Guide, Addison-Wesley, 1999).

The size and the complexity of software systems have increased significantly since the beginning of the software industry. Because of this, it is crucial for any refactoring process to provide techniques to restructure a software architecture while preserving its behavioural properties. The aim is to improve the architectures by making them more generic, modular, and reusable. In this context, it is necessary to ensure that the refactored architecture preserves the behaviour of the original architecture.


In this paper, we introduce several architecture refactorings we have experimented with, and we give our feedback in the form of the following ideas:

Partition Responsibilities: If a component or subsystem has too many responsibilities, partition the component or subsystem into multiple parts, each of which has semantically related functionality.
Extract Service: If a subsystem does not provide any interfaces to its environment but is subject to external integration, extract a service interface.
Introduce Decoupling Layer: If components directly depend on system details, introduce decoupling layer(s).
Eliminate Dependencies: Reduce direct and wide-spread dependencies of parts in a whole-part setting by introducing a central runtime component that centralizes dependency handling.
Rename Entity: If entities have unintuitive names, introduce an appropriate naming scheme.
Introduce Hierarchies: If several entities are only variants of a particular entity, introduce a hierarchy.

4 First case study: architecture refactoring using component diagrams

In this section, we shed more light on how to create business components. This has a number of benefits:
• Business components enable us to separate presentation content from application logic.
• Business components promote code reuse. We can write a library of useful functions, package them into a business component, and reuse the same code.
• Business components are compiled.
• Business components can be written in multiple languages.
• Business components enable us to build multitiered Web applications. For example, we can write a set of components that encapsulate our business logic.

A component diagram shows the various components in a system and their dependencies. A component represents a physical module of code. A component is often the same as a package, but it may be different, since components represent the physical packaging of code.

Situation before refactoring:

A motivating example is depicted in Figure 1. The component contains two groups, business domain components and data model components, as shown next:

• Business domain components: components that represent features of different business rules; business domain components are the most reusable business components.


• Data model components: components that collect and manage data; data model components are not as reusable as business domain components.

[Figure content: applications A, B and C all depend on a single business component.]
Figure 1. Business component before refactoring

The disadvantages of this architecture are:

Bad cohesion usually begets bad coupling, and vice versa. The interfaces IBusiness and IDao are attached to the same component, but they do not form a coherent unit. So if one of the interfaces is modified, the component changes version. In this situation it is difficult to know whether the front-end applications are impacted or not; consequently, all the applications must undergo maintenance. This situation can become troublesome in the future.


Situation after refactoring:

A motivating example is depicted in Figure 2. The diagram contains two business domain components and one service model component, as shown next:

[Figure content: applications A, B and C use the refactored business components.]
Figure 2. Business component after refactoring

The general idea of this refactoring is to remove the interface "IDao" and to transform it into several "business service" interfaces. The advantages of this architecture are that it exposes functionalities that are more conceptual than technical and, consequently, that it decreases the coupling between the applications and the database.
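A minimal Java sketch of the idea is given below. The paper only names the IBusiness and IDao interfaces; the service interface names, methods and implementation class are hypothetical and only illustrate how the technical data-access interface disappears behind conceptual business services.

    // Hypothetical sketch of the interface split. Before the refactoring, applications
    // depended on both IBusiness and the technical IDao of the same component; afterwards
    // they only see conceptual "business service" interfaces, and data access stays hidden.
    interface IBusiness {                     // unchanged business interface
        void applyBusinessRule(String input);
    }

    interface IProfileService {               // new business service replacing part of IDao
        String loadProfile(String profileId);
    }

    interface IOrderService {                 // another business service replacing part of IDao
        void registerOrder(String orderId);
    }

    // The data access object becomes an implementation detail of the service component.
    class ProfileServiceImpl implements IProfileService {
        public String loadProfile(String profileId) {
            // ... delegate to the internal persistence layer (former IDao logic)
            return "profile:" + profileId;
        }
    }

    class ApplicationA {                      // front-end applications depend only on services
        public static void main(String[] args) {
            IProfileService profiles = new ProfileServiceImpl();
            System.out.println(profiles.loadProfile("42"));
        }
    }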


Table of comparison:

Task                                          Before refactoring                                              After refactoring
If we change a business rule in component A   all the components will be impacted by this change              only component A will be impacted by this change, without influencing the other components
If we test the conformity of this change      all the components will have to be tested                       only component A will have to be tested
If we deploy the component                    all the applications A, B and C will have to be deployed too    only application A will have to be deployed

5 Conclusion
The goal of the work presented in this paper was to develop a methodology for supporting the architecture refactoring of software systems. We presented our experience feedback in architecture refactoring, using the concept of a business component for our case study and following the UML component diagram notation. With our refactoring, the resulting software architecture is more flexible, testable and manageable, and it reduces the effort needed compared to the situation before refactoring.


Design of Evolutionary Videogames for Special Education: A practical case of use in autistic learning process

José Luis González, Mª José Rodríguez, Marcelino Cabrera. Software Engineering Department, University of Granada, Spain.

Abstract: Videogames make ideal learning tools since they provide skills training, promote independence, and increase and improve students' concentration and attention; in addition, new concepts can be easily acquired and exercises carried out successfully. For special education students with learning difficulties, it is very important for the teacher to be able to design new learning strategies and to adapt the game to each pupil, taking into account each student's cognitive level, skills, abilities, capabilities and needs. Our work focuses on designing games that can be adapted to each user's cognitive level and characteristics, and that can evolve with the user according to his/her progress.

Keywords: Evolutionary software, adaptation, videogames, special education, autism.

1 Introduction
The use and development of technical tools to help people with speech and communication disabilities (autism, cerebral palsy, stroke, dysphasia, etc.) is a challenge for our society, since these tools offer the prospect of social and labour integration, rehabilitation and independence.

The main learning difficulties of people with special educational needs are due to their cognitive limitations or to disinterest in the subject. A new method of alternative learning must therefore be used which offers mechanisms to stimulate them, to increase their attention and concentration, to focus on the children's cognitive strengths, and to free them from the pressure of the traditional learning process.

Videogames are a direct, attractive platform for getting closer to children with disabilities: not only do they offer new forms of human-computer and student-student interaction, but they also promote special education learning. Videogames enable knowledge to be developed through game contents, and cognitive abilities are increased through play. In Spain alone, the videogame industry had a turnover of about 967 million euros in 2006, according to data published by Adese (the Spanish association of distributors and publishers of entertainment software).

Many specialist educators agree on the importance of videogames in the present social context [1] [2]. In the school context, videogames have traditionally been considered as simple entertainment mechanisms to be used outside the classroom, thus missing their advantages as learning tools:


• School success: Pupils who have used videogames show increased reading comprehension capability.
• Cognitive abilities: Pupils train these abilities through environment discovery and creativity.
• Motivation: Games are an encouragement mechanism for children; they make the learning process easier and increase attendance considerably.
• Attention and concentration: Attention and concentration are increased when solving concrete problems, due to pupils' natural inclination towards games.

To date, very few learning videogames have been designed for special education, and most of the ones that do exist tend to be difficult to access or to consist solely of teaching units with no game element. Computers continue to be the most widely used game platform, rather than other platforms with greater interaction and multimedia capabilities, mainly because the driving force behind videogame companies is profitability, and this would be reduced if they were to develop special and personalized games for small groups of people with different needs.

It is essential for special education to be tailored to each individual child, considering their skills and capacities, and each child must learn at their own rate of work. At the same time, the child's integration must be improved, helping them to communicate with their fellow pupils and teachers.

Evolving techniques will be used to design a platform which allows the creation of videogames for learning and communicating that can be personalized for each child and which can evolve as the child progresses with the game.

The second section of this paper presents various considerations about special education and videogames. Section 3 shows our method and process for creating personalized videogames. Finally, Section 4 outlines our conclusions and future lines of work.

2 Considerations about Gameplay design for Special Education videogames
Learning with special education games is based on psychopedagogical mechanisms. The game must be easy to personalize and it must allow different forms of interaction, avoiding discrimination as a result of the player's cognitive and physical characteristics. Two main theories are considered when designing this kind of videogame: "Multiple Intelligences" and "Stimulus Equivalence".

Gardner [3] proposes the "Multiple Intelligences" theory, which states that intelligence is not unique or indivisible, but can be considered as a set of abilities that can be trained. The training can be carried out using educational videogames. A person with autism, for example, is limited in his intrapersonal/interpersonal intelligence, but his logical, spatial or musical intelligences can be stimulated using a game so that his sensations, emotions or thoughts can be expressed by means of associations between symbols and meanings. This occurs, for example, when an autistic child uses an AAC Tool [4] (Augmentative and Alternative Communicator).

Sidman's theories [5] of "Equivalence relationships between Stimuli" or "Stimulus Equivalence" can also be applied to the design of videogames for special education. By training explicit relationships between stimuli (direct training of the child), new implicit relationships appear which enable the pupil to learn to relate concepts through relations that were previously hidden, for example using pictographic symbols and written words (direct training) to obtain sounds (indirect training).

The relationships between stimuli are shown in Equation 1: reflexivity (1), symmetry (2) and transitivity (3).

Equation 1: Sidman's stimulus relationships

(1) An → An

(2) if (An → Bn) ⇒ Bn → An

(3) if (An → Bn and Bn → Cn) ⇒ An → Cn

In order to apply these theories in the design and development process of videogames, we need to include some factors in the "Gameplay". Gameplay [6] includes all player experiences during the interaction with game systems (what the player can do and how to do it). This term is used to describe the overall experience of playing the game, excluding the aspects of graphics, sound and storyline. Some factors to be considered for Special Education games are [7]:

• To identify the player's profile, his/her limitations and cognitive capabilities, in order to choose the best multi-modal interaction mechanisms.
• To introduce the educational contents into the game structure in a hidden way. The game should have its objectives as a game, and special objectives as a learning tool. Hence, the child can learn indirectly while playing.
• To evaluate the positive aspects that the game offers, without forgetting the negative aspects that can create vices or inappropriate behaviour in the child. The game must offer feedback for each action; this feedback is focused on the cognitive need which the child must train when playing. To develop this feedback we make use of the other intelligences as supporting tools, and also of the child's profile.

Using the player's profile data we can design correct gameplay in the Special Education field. We propose a mechanism to generate and modify didactic videogames that are adapted to the player profile and accessible during the play interaction; in other words, we develop accessible videogames with a high degree of playability for children with special needs.


3 Our Proposal for Adaptive and Evolutionary Videogames for Special Education
Our proposal arises from the alternative and augmentative communication system Sc@ut [8], which was developed by our research group. Sc@ut is a platform which enables the creation and evolution of communicators based on a hypermedia net with images, sounds and links. Sc@ut communicators are adapted to each user and run on Pocket PCs or computers. Communicators can also be used as teaching units in which the user can select the concepts requested by the teacher, concepts which have previously been programmed by the teacher to appear in the system. We have observed a series of deficiencies in Pocket PCs during the use of this communicator:

• Fragility: The touch screen is very delicate, getting scratched or breaking easily.
• Autonomy: About 3 hours. The battery requires replacing several times a day, which is a great inconvenience for children and tutors.
• Multimedia: The main memory is limited, and consequently so are the multimedia objects and user interfaces, which makes them unattractive to children.
• Price: High, around 300 euros, which is a negative factor for parents and tutors.

Due to the limitations of the Sc@ut communicator, we cannot achieve the same learning advantages as we can with videogames. We therefore searched for a new platform to play the educational videogames and we selected Nintendo DS™ for the following reasons [9]:

• Feedback: It has two screens, one of which is a touch screen offering more interaction possibilities.
• Multimedia: It is a game device with great multimedia options (sound, video and graphics) without apparent memory limitations.
• Autonomy: Approximately 11 hours.
• Connectivity: Wireless (Wi-Fi), to communicate with other DSs and PCs.
• Price: About 150 euros, much more affordable than a Pocket PC device.
• Durability: It is a device designed to be used by children.
• Motivation: Other commercial games can be used as a complement, gift or reward for children.

The objective of the games that we propose is to teach various concepts. The game comprises different stages, each of which trains a different concept, and each stage is further divided into levels which describe an exercise for training the concept, incorporating new concepts that act as distractors (following Sidman's theory). The positive and negative stimuli or reinforcements, and the goal that must be reached in order to proceed to a higher level, are also defined for each level. The interaction and feedback methods are designed using the strongest intelligences as supporting tools (Gardner's Multiple Intelligences) to compensate for the weakened intelligence (Fig. 1: learning of Spanish vowels and food).


Fig. 1: Two different examples of educational videogames

In Special Education, children cover a wide cognitive and physical spectrum, so we try to develop a special game for each of these children. A "special" child may interact differently from other children, and we can adapt the interaction mechanism to the skills in the player's profile. Another problem is the set of concepts to teach and learn. For example, autistic children have their own perception of the real world: they build the semantics of the real world using their own ideas and concepts.

The main objective is to develop a videogame in which the teacher can personalize the game interaction and the game concepts using the player profile, and in which the game can evolve with the child, re-adapting the concepts and the gameplay options (difficulty levels, number of distractors, feedback, etc.).

3.1 Design proposal for evolutionary videogames
Based on the two-layer structure of the Sc@ut platform, we have also defined two layers to design our videogames: one is an instance of a meta-game, and the other is the platform which creates and personalizes the game. This architecture allows separation of concerns to be applied during the design process of the videogame.

The platform includes three other layers: the meta-game, the meta-phase and the meta-level. With this meta-structure we model the behaviour and the functionality of our game engine independently of the player's skills or of the concepts to be learned.

Correspondingly, the responsibilities of each meta-structure in the game engine are:

• Meta-game: This offers concepts that the teacher can select for training with a specific child.
• Meta-phase: This involves the educational contents: concepts to be learned and the multimedia contents that represent the concepts.
• Meta-level: This allows defining educational exercises for training the concept.


The Meta-game layer considers two dimensions:

• Accessibility: Describes the user interface for each child. Accessibility options are built taking into account the player's skill profile and disabilities, to offer alternative presentations of concepts or interaction methods.

• Gameplay: Defines the principal characteristics of the game considering the user profile, the objectives and the educational contents to be taught. A main character or hero is used in the game, who acts as a guide in the learning process. Each game must have goals. The learning process should be incremental, based on multiple levels or missions where the level of difficulty increases gradually. For each correct action, a reward is obtained: animations, songs, videos, points, objects and even gifts in the real world.

Table 1 shows the artefacts or aspects that can be modified in each one of these layers. This table is completed with examples of instantiations of these aspects for the vowels game shown in Fig. 1.

Table 1: Videogames Meta-Structure

Meta-Game (engine and UI options)
  ACCESSIBILITY
    Subtitles: No
    Dialogues: Yes
    Alternative Navigation: Voice Recognition
    Colours Palette: Black/White
    Images Size: Normal
  GAMEPLAY
    Hero: Leoncio, The Lion
    Feedback: Music and Text
    Interaction: Stylus
    Difficulty: Normal
    Objective: Teaching Vowels
    Context: Adventures
    Nº Phases: 5
    Initial Phase: Letter "A"

Meta-Phase (concepts to be taught)
    Objectives: Learn the "I" letter
    List of Concepts: "I" letter, Indian
    Multimedia contents of the concepts: "I" sound, Indian sprite, Intro "I" Island, Ending Indian Playing

Meta-Level (exercises to learn the "I" letter within a game structure)
    Kind of Exercise: Stimulus Equivalence
    Stimulus: Sound/Images
    Positive Reinforcement: Sound: "Very well! This is an 'I' and Indian starts with 'I'"; Video: "Very Well"
    Negative Reinforcement: Sound: "This letter isn't right. Don't worry, try it another time!"; Video: "Don't Worry"
    Distractors: "A", "E", "O"
    Transparency Level: Medium
    Objective: Searching for a pattern with the pictogram Indian and the letter "I"
    Order: Random

The multimedia contents are available in a database to be used by the platform which creates and modifies the game. They are sounds (music), video, images or text that can be referenced by every layer of the meta-structure.

The instantiation of the meta-structure initializes the game engine with concrete values, obtaining, in an ad-hoc way, different personalized games adapted to the player and his/her limitations in order to teach concepts. During the use of the game, some aspects of the user profile can change: the child can progress, improving his ability to interact with the game; his knowledge can increase; the response to the stimuli can differ; and the context in which the game is played can change. In addition, the educators can make changes to modify the educational contents and the way in which they are presented. These changes force the game to evolve and readapt to the new playing situations (Fig. 2).

Fig. 2: Structure of our proposal of Evolutionary Videogames
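To make the instantiation concrete, the sketch below expresses the three meta-layers in Java using some of the Table 1 values. The real engine targets the Nintendo DS, so this is only an illustration; the record and field names are assumptions about how a player profile parameterizes the game, not the project's actual code.

    import java.util.List;

    // Illustrative instantiation of the three meta-layers with the vowels-game values from
    // Table 1. The engine would read such an instance (e.g. from a structured file edited
    // by the teacher) and build the phases and levels accordingly.
    public class VowelsGameConfig {
        record MetaGame(String hero, String interaction, String feedback, String difficulty) {}
        record MetaPhase(String objective, List<String> concepts, List<String> media) {}
        record MetaLevel(String exercise, List<String> distractors,
                         String positiveReinforcement, String negativeReinforcement) {}

        public static void main(String[] args) {
            MetaGame game = new MetaGame("Leoncio, The Lion", "Stylus", "Music and Text", "Normal");
            MetaPhase phase = new MetaPhase("Learn the letter I",
                    List.of("I letter", "Indian"),
                    List.of("I sound", "Indian sprite"));
            MetaLevel level = new MetaLevel("Stimulus Equivalence",
                    List.of("A", "E", "O"),
                    "Very well! This is an I, and Indian starts with I",
                    "This letter isn't right. Don't worry, try it another time!");

            System.out.println(game + "\n" + phase + "\n" + level);
        }
    }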


3.2 Classification of our proposed design of Evolutionary Videogames
The evolving characteristics of our videogames will be described according to the Taxonomy of Software Evolution of Mens et al. [10].

Table 2: Characteristics of our proposed evolving videogame (dimensions and factors from the taxonomy of [10])

When
  Time of change: Compile time (static evolution). The Nintendo DS requires reloading the memory with the new code written with the PC SDK.
  Change history: Sequential. Each version is adapted to the child's progress.
  Change frequency: Periodically. Guided by the child's results and the new learning contents.
Where
  Artifact: Gameplay, UI, multimedia contents, learning methods and goals. These are the basic aspects of our design.
  Granularity: Fine degree of granularity. See Figures 1 and 2.
  Impact: Every design component. See Table 1.
  Change propagation: Cascade. If a didactic concept changes, other game aspects must change.
What
  Availability: Stop to incorporate changes. The Nintendo DS is stopped to reload the game.
  Activeness: Reactive (educators make changes based on game reports generated by the game) and proactive (the difficulty levels can change according to the score).
  Openness: Open. A platform to design, make and adapt games for every player's profile will be created.
  Safety: Static. The platform will guarantee the correctness of the game.
How
  Degree of automation: Partially automated. Every change is manual except the difficulty levels.
  Degree of formality: Low. Frequency tables are used to analyse the results and readapt the difficulty.
  Process support: Automatic analysis of game sessions. With this information the child's progress is known.
  Change type: Structural. Some gameplay and interface components could be added, removed or modified.


We can use the game reports to readapt the difficulty of the levels, introducing artificial intelligence into the game engine to auto-configure the difficulty and the levels or phases, so that the game process matches the learning process. With this method we feed the game engine with real-time information about the user in order to readjust the game (and, implicitly, the learning tool) and make it more complete.
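A minimal sketch of such report-driven readaptation is given below. The thresholds, class and method names are assumptions; they only illustrate how a frequency table of correct answers per concept could be turned into a new number of distractors for the next game version.

    import java.util.Map;

    // Sketch of frequency-table based difficulty readaptation (thresholds are assumptions):
    // a high success rate raises the difficulty, a low one lowers it.
    public class DifficultyAdapter {
        static int nextDistractorCount(int current, double successRate) {
            if (successRate > 0.8) return Math.min(current + 1, 5);  // doing well: harder
            if (successRate < 0.4) return Math.max(current - 1, 1);  // struggling: easier
            return current;                                          // keep current difficulty
        }

        public static void main(String[] args) {
            // frequency table from the game report: concept -> {correct answers, attempts}
            Map<String, int[]> report = Map.of("I", new int[] {9, 10}, "A", new int[] {3, 10});
            int distractors = 3;
            for (Map.Entry<String, int[]> e : report.entrySet()) {
                double rate = (double) e.getValue()[0] / e.getValue()[1];
                System.out.println(e.getKey() + ": next distractor count = "
                        + nextDistractorCount(distractors, rate));
            }
        }
    }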

4 Conclusions and Future Work
We have presented the possibility of creating videogames which are personalized for people with disabilities, and which can be adapted to suit the user's characteristics and adjusted to the user's cognitive level and disabilities according to his/her progress.

We are designing a visual platform to create evolving videogames, following a development process in which aspects such as the user interface, educational contents, multimedia and interaction methods are described in separate, incremental steps, making it easy for the teacher to generate the structured files that the game engine uses to generate the adapted, evolutionary games.

So far we have implemented a videogame for learning the vowels and food items on the Nintendo DS, following the layered game structure described above.

While we have not encountered any problems at the design level of adaptive and adaptable videogames, thanks to our experience in the design of evolutionary systems, we are encountering problems when it comes to implementing the games because of the limitations of the languages and platforms on which they can be supported. We will therefore continue to search for alternatives in order to achieve our objective of implementing evolutionary games so that they can be used by teachers and pupils with special needs.

Acknowledgements: This research is funded by the Consejería de Educación of Junta de Andalucía (Spain), project SC@UT, and the Comisión Interministerial para la Ciencia y la Tecnología (CICYT-Spain), project AMENITIES grant number TIN2004-08000-C03-02.

References

1. McFarlane, A., Sparrowhawk, A. & Heald, Y. Report on the educational use of games: An exploration by TEEM of the contribution which games can make to the education process. (2002). http://www.teem.org.uk/publications/teem_gamesined_full.pdf

2. Rosas, R., Nussbaum, M., Grau, V., López, X., Salinas, M., Flores, P., Correa, M., & Lagos, F. Más allá del Mortal Kombat: Diseño y evaluación de videojuegos educativos para Lenguaje y Matemáticas para el Nivel Básico I. Psykhé, 9. pp: 125–141. (2000).


3. Gardner H., Frames of Mind: The Theory of Multiple Intelligences. Basic Books. ISBN 978-0465025107

4. Mirenda, Pat. Toward Functional Augmentative and Alternative Communication for Students With Autism: Manual Signs, Graphic Symbols, and Voice Output Communication Aids. American Speech-Language-Hearing Association. Vol 34 (July 2003), pp 203-216

5. Tepaeru, S., Jones, M., Elliffe, D., Muthykumaraswamy, S. Stimulus Equivalence, Testing Sidman’s (2000) Theory. Journal of the Experimental Analysis of Behaviour, 85, pp: 371- 391 (2006)

6. Rollings, A., Morris, D. Game Architecture and Design. New Riders Games. ISBN: 07-357-1363-4

7. González Sánchez, J. L.; Cabrera, M.; Gutiérrez, F. L. "Using Videogames in Special Education". XII International Conference on Computer Aided Systems Theory (Eurocast 2007).

8. Rodríguez, M.J.; Paderewski, P.; Rodríguez, M.L.; Gea, M. Unanticipated Runtime Adaptation of a Communicator for Autistic Children. FUSE 2004, Workshop on Foundations of Unanticipated Software Evolution, pp. 40-47. 2004.

9. González Sánchez, J. L.; Cabrera, M. Sc@ut DS: Sistema de Ayuda a la Comunicación y al Aprendizaje. IV International Conference on Multimedia, Information and Communication Technologies in Education (m-ICTEE2006). ISBN: 84-690-2472-8. pp: 1116-1120. (2006).

10. Mens, T., Buckley, J., Zenger, M., Rashid, A. Towards a Taxonomy of Software Evolution. International Workshop of Unanticipated Software Evolution. 2003.


A Meta-model for expressing first-class changes

Peter Ebraert1, Bart Depoortere1 and Theo D’Hondt1

1 pebraert, bdepoort, [email protected], http://prog.vub.ac.be/ Programming Technology Lab Vrije Universiteit Brussel Pleinlaan 2 B-1050 Brussel, Belgium

Abstract: First-class changes were proven to provide useful information about the evolution history of software programs. The subjects of first-class changes are expressed on the building blocks of the program which they affect. Those building blocks are described by a meta-model. The goal of this paper is to find a proper meta-model to express first-class changes. We first establish four criteria for comparing different meta-models, which form the basis of a taxonomy to classify them. Afterwards, some meta-models are evaluated with respect to those criteria. FAMIX is found to be the best match with respect to the imposed criteria. FAMIX, however, still has some minor shortcomings, which can be overcome by the extensions we propose to it.
Keywords: Meta-model, software evolution

1 Introduction

First-class changes are objects which represent change and which can be referenced, queried and passed along [EPD07]. They were proven to provide useful information about the evolution history of software programs [RL07]. The subject of a first-class change object is expressed on the building blocks of the program which it affects. Those building blocks are described by a meta-model. The goal of this paper is to find a meta-model suitable for expressing first-class changes. The remainder of this paper is structured as follows. In Section 2, we establish four criteria to which the meta-models need to adhere in order to be able to express first-class changes. In Section 3 we then use the established criteria to compare some existing meta-models and discuss them. Section 4 discusses the meta-model which is found to be the best suited for modeling changes and proposes some extensions to it. We conclude in Section 5 and provide some avenues of future work.

2 Criteria for meta-models

In order to compare the different meta-models, we first have to identify the criteria which are important with respect to modeling changes. This section introduces four such criteria, relevant to modeling changes to software programs.


Support for multiple programming languages – It is desirable to obtain a cross-language specification of change types. Such a specification could be used as an intermediary format for software evolution tools to study, compare and exchange information about the evolution of applications which were possibly implemented in different programming languages. This brings along the extra advantage that the software evolution tools do not need to be specified for each programming language separately, but can rather be defined on the cross-language specification itself. A cross-language specification of change types can only be achieved if the meta-model of their building blocks supports multiple programming languages. To accomplish this, the meta-model must be as general as possible and thus omit language-specific features.

Extensibility hooks – The expressiveness of artifacts is often too limited to reason thoroughly about evolution facets of a software application, especially when language-specific features are omitted from the meta-model to support a cross-language specification. This can be overcome by allowing the core of the meta-model to be extended. Extending the meta-model, however, may decrease the degree of language independence (e.g. extending it to be able to cope with language-specific features). As such, a good balance between a cross-language core and language-specific extension possibilities must be established.

Derivable system invariants – Design inconsistencies should be avoided at all times. Applying changes to the design of an application, however, influences the design and could introduce inconsistencies. In order to avoid that, system invariants – constraints that must be satisfied by the system at any time – can be of great help. All different change types are annotated with pre- and post-conditions which verify whether an instantiation of that change type does not violate the system invariants. The system invariants are actually imposed by the meta-model. While some meta-models provide an extensive list of the invariants, others only provide them implicitly. In both cases, we require that the meta-model serves as a basis for deriving those invariants.
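As an illustration of this criterion (not of the authors' actual implementation), the following Java sketch shows a first-class change annotated with a pre- and a post-condition that guard a simple meta-model invariant, here the uniqueness of class names; all names are assumptions.

    import java.util.HashSet;
    import java.util.Set;

    // Sketch of a first-class change whose pre- and post-conditions check a meta-model
    // invariant: class names must remain unique in the model.
    public class AddClassChange {
        private final String className;

        AddClassChange(String className) { this.className = className; }

        boolean precondition(Set<String> model)  { return !model.contains(className); }
        void apply(Set<String> model)            { model.add(className); }
        boolean postcondition(Set<String> model) { return model.contains(className); }

        public static void main(String[] args) {
            Set<String> model = new HashSet<>(Set.of("Student", "Enrollment"));
            AddClassChange change = new AddClassChange("Course");
            if (change.precondition(model)) {            // the invariant would not be violated
                change.apply(model);
                assert change.postcondition(model);
            }
            System.out.println(model);
        }
    }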

Easy information exchange – The cross-language specification of change types may be used by the software evolution tools as an intermediary format to communicate. Therefore that specification and the meta-model upon which it is based must be complete, consistent, easy to interpret and free of ambiguities. These properties are further referred to as easy information exchange. A high degree of language independence increases the ease of information exchange, while a high degree of extensibility decreases it. Therefore it is important to find a good balance between language independence and extensibility.

3 Towards a taxonomy of meta-models

This section discusses four alternative meta-models and their support with respect to the previously defined criteria.


3.1 The Unified Modeling Language (UML)

UML is a general-purpose modeling language widely used in the world of software engineering [BJR96, Gro99]. It includes a graphical notation used to specify, visualize, construct and document designs. The UML specification consists of a meta-model that describes the language for specifying UML models (e.g. class diagrams).

Support for multiple programming languages – UML supports the entire software development process, starting from analysis and design and omitting implementation-specific issues. Therefore its meta-model is designed to model software systems implemented in various class-based object-oriented programming languages and thus supports multiple languages.

Extensibility hooks – The UML meta-model provides three extensibility mechanisms. Tagged values permit users to annotate any model element with extra information (value) paired with a keyword (tag). Figure 1 shows an example of a tagged value in the bottom right corner. It is visualized as a note on the design while the keyword documentation is stored in the background.

[Figure content: a class diagram with the <<enumeration>> CourseType (Language, Science), the classes Enrollement and Student (multiplicities 1 and 0..*), an attribute max: int, the OCL constraint {enrollements.size() <= max}, and the note "We can have maximum 10.000 students in our school".]

Figure 1: UML tagged values, stereotypes and constraints - Example

Stereotypes allow users to further specialize model elements; they are an extensibility mechanism equivalent to inheritance. Figure 1 depicts an example of a stereotype: the enumeration stereotype denotes that CourseType is an enumeration providing some defined types of courses. Users can apply semantic restrictions to model elements by using constraints, which may be specified in free-form text or in the Object Constraint Language (OCL). OCL is a declarative language that is part of the UML standard and is used for describing rules that apply to UML models. An example of OCL is shown in the bottom left corner of Figure 1: students have the possibility to enroll for courses of a certain type, whereas each course type has a maximum number of allowed students.

Derivable system invariants – System invariants can be derived from the meta-model's elements. They can be derived, for instance, from the relationships between those entities, their cardinalities or OCL constraints.


Easy information exchange – UML is very expressive due to its large number of available concepts and its extensibility mechanisms. This, however, makes it harder to exchange information about UML models. Furthermore, there are some additional problems associated with the UML meta-model specification which decrease the ease of information exchange. First, it is incomplete, vague and inconsistent [HS01, RW99]. Second, modelers using OCL restrict their audience since OCL is a complex language and few people can read and write it [Amb04]. As such, we conclude that UML does not provide an easy information exchange.

3.2 RevJava
RevJava is a tool which operates on compiled Java code and checks whether that software system conforms to specified design rules [Flo02]. Its meta-model defines all relevant concepts of a Java software system (e.g. package, class or method) and the associations between them (e.g. inheritance definition, method call or variable access).

Support for multiple programming languages – The meta-model is designed for modeling Java programs, but it is general enough to capture the core concepts of software systems implemented in other class-based object-oriented programming languages.

Extensibility hooks – RevJava’s meta-model is incorporated in the RevJava tool. The author has not explicitly specified any extensibility mechanisms in the document describing the tool.

Derivable system invariants – System invariants can be derived from the Java meta-model specification. All derived invariants, however, apply to the Java programming language and only to some extent to other class-based programming languages.

Easy information exchange – RevJava’s meta-model is easy to read and understand since its number of available concepts remains small. No extensibility mechanisms are specified by RevJava. As such, we can conclude that it provides support for an easy information exchange.

3.3 FAMIX
FAMIX stands for FAMOOS Information Exchange Model and was created to support information exchange between interacting software analysis tools by capturing the common features of different object-oriented programming languages needed for software re-engineering activities [DTS99, DD99, Tic01]. Figure 2 shows a conceptual view of the FAMIX model. On the left side, we see different programming languages used to implement several case studies. On the right side, we see various experiments conducted by several software analysis tools on the provided case studies. In the middle, we see the information exchange model that only captures the common features of class-based object-oriented programming languages, such as classes or methods. To cope with language-specific features, the FAMIX model can be extended using the provided hooks, represented by the grey bars at the bottom of the figure.


Figure 2: Conception of the FAMIX model (based on [DTS99])

Support for multiple programming languages – The FAMIX model is designed to model software systems at the source-code level, independently of the implementation language. To achieve language independence, the FAMIX model only captures common features of different class-based object-oriented programming languages, omitting language-specific features. It thus supports multiple languages.

Extensibility hooks – The FAMIX model provides three extensibility mechanisms. New concepts can be defined in order to specify new model elements. New attributes can be added to existing concepts in order to store additional information in the model elements. Annotations can be added by the user to any model element for attaching extra information to it.

Derivable system invariants – System invariants can be derived from the model's elements. They can be derived, for instance, from the relationships between those entities, their cardinalities or their constraints.

Easy information exchange – The FAMIX model was created to support information exchange between tools. Hence, it provides very good support for an easy information exchange.

3.4 Graph-based alternative
Mens and Lanza suggest representing software systems as graphs. Program entities are then represented by nodes, and relationships between them by edges [ML02]. To accomplish that, they specified a typed meta-model consisting of typed edges (e.g. inheritance and accesses) and typed nodes (e.g. class or method). Multiple edges between two nodes are allowed, and attributes can be added to each node or edge.
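A small Java sketch of this representation is shown below; the class and type names are our assumptions, not the notation of Mens and Lanza, and only illustrate typed, attributed nodes and edges with multiple edges between the same pair of nodes.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Sketch of the graph-based representation: program entities become typed nodes,
    // relationships become typed edges, and both can carry attribute maps.
    public class ProgramGraph {
        record Node(String type, String name, Map<String, String> attributes) {}
        record Edge(String type, Node from, Node to, Map<String, String> attributes) {}

        public static void main(String[] args) {
            Node account = new Node("Class", "Account", new HashMap<>());
            Node deposit = new Node("Method", "deposit", Map.of("visibility", "public"));

            List<Edge> edges = new ArrayList<>();
            edges.add(new Edge("contains", account, deposit, Map.of()));
            edges.add(new Edge("accesses", deposit, account, Map.of("field", "balance")));
            // multiple edges between the same pair of nodes are allowed
            edges.add(new Edge("accesses", deposit, account, Map.of("field", "owner")));

            edges.forEach(e -> System.out.println(
                    e.from().name() + " -" + e.type() + "-> " + e.to().name()));
        }
    }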

Support for multiple programming languages – The meta-model behind the graph representation supports any programming language whose concepts can be represented by either nodes or edges.

Extensibility hooks – The authors have not specified any extensibility mechanisms. Extensibility seems possible but is definitely not eased by the provided meta-model.


Derivable system invariants – System invariants can be derived from the meta-model's specification.

Easy information exchange – The used meta-model is easy to read and understand since its number of available concepts remains small. Extensibility is not encouraged. This helps in providing an easy information exchange.

3.5 Comparison of alternatives
Table 1 shows an overview in which the stated criteria are compared against the different alternatives discussed in the previous sections. An "X" indicates that the corresponding criterion is badly or not supported by the meta-model, while a "V" indicates the opposite. The table shows that the FAMIX model scores best of all explored alternatives, as it supports all criteria.

Requirements / Meta-model         UML   RevJava   FAMIX   Graph
Support for multiple languages     V      V/X       V       V
Extensibility hooks                V       X        V       X
Derivable system invariants        V       V        V       V
Easy information exchange          X       V        V       V

Table 1: Comparison of alternatives

4 Extensions to FAMIX

Some software systems may never be shut down and require modifications to their source code at run-time. In those cases, it may be useful to capture the dynamic information of the running system (e.g. the creation of a new instance). We propose to extend the FAMIX model with notions that capture the state of a running system. Furthermore, any object-oriented programming language (e.g. Smalltalk or Java) defines entities unique to that language. The second extension we propose copes with language-specific artifacts in order to derive Smalltalk-specific changes (e.g. FAMIX multiple inheritance vs Smalltalk single inheritance). The following two subsections respectively explore the extensions for dynamic state and for Smalltalk-specific features.

4.1 Extension for dynamic state
The FAMIX model does not provide any elements to store dynamic information such as the living instances of a particular class or the value of some global variable. It is, however, necessary to store dynamic information whenever one wants to specify changes concerning dynamic software evolution. An example could be a garbage collecting functionality that removes all non-referenced instances. The following subsections discuss the dynamic extension of the FAMIX model.

Instance – The green parts in Figure 3 show the extensions regarding living instances of a certain class. A new class Instance (inheriting from the Object class) has been added, which keeps a reference to the class of which it is an instance (denoted by the isInstanceOf relationship). This enables an Instance instance to query the referenced isInstanceOf class for all its defined attributes and methods. AttributeValue has been added to hold the value of a particular Attribute belonging to an Instance instance (belongsToInstance).

Figure 3: Dynamic extension - Instance & Global Variable

Global variable – The orange parts of Figure 3 show the extensions regarding the global variables of any running program. GlobalVariableValue keeps a reference to the global variable of which it holds the value (valueFor), which is in its turn an Instance instance.
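To make the proposed entities concrete, the following Java rendering sketches the dynamic extension. It follows the relations named in the paper (isInstanceOf, belongsToInstance, valueFor); the Java form, the helper fields and the reading of the held value as an Instance are our assumptions, not FAMIX tooling.

    import java.util.ArrayList;
    import java.util.List;

    // Sketch of the dynamic extension as plain classes (a model sketch, not FAMIX tooling).
    class FamixClass {
        String name;
        List<String> attributes = new ArrayList<>();
        FamixClass(String name) { this.name = name; }
    }

    class Instance {
        FamixClass isInstanceOf;                       // the class this instance belongs to
        Instance(FamixClass isInstanceOf) { this.isInstanceOf = isInstanceOf; }
    }

    class AttributeValue {
        String attribute;                              // name of the FAMIX Attribute
        Object value;
        Instance belongsToInstance;                    // owning instance
        AttributeValue(String attribute, Object value, Instance owner) {
            this.attribute = attribute; this.value = value; this.belongsToInstance = owner;
        }
    }

    class GlobalVariableValue {
        String valueFor;                               // name of the global variable
        Instance value;                                // the held value, read here as an Instance
        GlobalVariableValue(String valueFor, Instance value) {
            this.valueFor = valueFor; this.value = value;
        }
    }

    class DynamicExtensionDemo {
        public static void main(String[] args) {
            FamixClass student = new FamixClass("Student");
            student.attributes.add("name");
            Instance alice = new Instance(student);
            AttributeValue name = new AttributeValue("name", "Alice", alice);
            System.out.println(name.belongsToInstance.isInstanceOf.name + "." + name.attribute
                    + " = " + name.value);
        }
    }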

4.2 Extension for Smalltalk
The FAMIX model serves as a meta-model for different implementation languages without specifying language-specific artifacts. As such, it does not cover the Smalltalk-specific language features. When expressing change types about Smalltalk programs, however, such extensions are desirable. This section deals with this kind of extensions and is based on the work of Tichelaar, who extended the FAMIX model to capture Smalltalk's language features [Tic01]. Extra modifications are provided to capture information not suggested by Tichelaar. The following subsections discuss these extensions: extensions suggested by Tichelaar are indicated with "(T)", additional extensions are denoted by "(*)".

Class – Each Smalltalk class has an associated metaclass that describes it. This metaclass does not have its own name, hence Smalltalk generates a name by concatenating the base class name with the “ class” String. FAMIX defines a Class class that allows both class types to be modelled. The orange parts in Figure 4 show that one attribute has been added: isMetaClass, a Boolean indicating whether or not the class represents a Smalltalk metaclass.


Figure 4: Smalltalk extension - Class & Behavioral entity

Behavioral entity – Return types of methods are not explicit in Smalltalk. Tichelaar proposes to populate declaredReturnClass and declaredReturnType with the most general type of an object-oriented programming language, namely Object. Tichelaar states that functions are not used in Smalltalk, which implies that the Function entity of FAMIX will never be populated [Tic01]. Smalltalk, however, does allow block closures, which are first-class anonymous functions that take a number of arguments and have a body. In our extension, we use the Function entity to represent Smalltalk’s block closures. The green part in Figure 4 shows that the BehavioralEntity class was extended with an inferredReturnClass association which refers to all possible candidates for the return type of the concerned behavioral entity. Tichelaar has pushed down the following attributes to the Method entity. Each method has a unique signature and an isPureAccessor attribute, a Boolean that indicates whether or not the represented method is a pure getter/setter. Figure 5 also shows that the belongsToProtocol relationship has been added. A protocol is the name for a group of methods, allowing them to be organized. For instance, the “accessing” protocol groups all accessing methods (getters and setters). An association between the Method and Package classes has been added, as a method belongs to exactly one package in Smalltalk. The value of the belongsToPackage reference may differ from the package in which the containing class is defined. The isConstructor field indicates whether or not the behavioral entity creates and initializes new instances of its containing class.

Structural entity – Smalltalk is a dynamically typed language meaning it does not require the developer to explicitly type variables. Type checking happens at run-time and types of variables are determined by the values assigned to them. Therefore Tichelaar proposes to populate the declaredType field and declaredClass association with the most general type of an object-oriented programming language: Object. The green parts of Figure 5 reveal that the inferredClass association has been added to the StructuralEntity class. This association refers to all possible candidates for the type of the structural entity. Smalltalk allows initialization of attributes and global variables.


Figure 5: Smalltalk extension - Structural entity & Inheritance definition

That is why the initializationValue attribute has been added to the Attribute and GlobalVariable classes. Furthermore, the position attribute has been added to the LocalVariable class and maintains the index of the local variable in the behavioral entity’s list of temporary variables. Smalltalk imposes that all attributes are protected and are only accessible within the defining class and its subclasses. Hence Tichelaar proposes to populate the accessControlQualifier attribute with the “protected” Qualifier.

Inheritance definition – The FAMIX model allows multiple inheritance whereas Smalltalk does not. In Smalltalk, classes always inherit from one single class (except the root class, Object). The red part in Figure 5 depicts a constraint imposing single inheritance. Tichelaar proposes to populate the index attribute of the InheritanceDefinition class with the null value since in this case an index has no meaning. Inheritance in Smalltalk is always publicly accessible: all methods (public) and attributes (protected) are inherited by the subclass and have the same visibility.
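Purely as an illustration (and not part of FAMIX or of Tichelaar’s work), the single-inheritance constraint could be checked by verifying that no class occurs more than once as a subclass in the set of inheritance definitions. The following Java sketch assumes a simple record-like InheritanceDefinition class:

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical check of the Smalltalk single-inheritance constraint (Section 4.2).
class InheritanceDefinition {
    String subclassName;
    String superclassName;
    Integer index; // null for Smalltalk, as Tichelaar proposes
}

class SingleInheritanceChecker {
    // Returns true if every class inherits from at most one superclass,
    // i.e. no class appears twice as a subclass.
    static boolean respectsSingleInheritance(List<InheritanceDefinition> definitions) {
        Set<String> seenSubclasses = new HashSet<>();
        for (InheritanceDefinition definition : definitions) {
            if (!seenSubclasses.add(definition.subclassName)) {
                return false; // the class appears twice as a subclass
            }
        }
        return true;
    }
}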

5 Conclusion and future work

A meta-model can be considered as an explicit description of which building blocks (program entities) are defined in the model of the programming language(s) adhering to it. A meta-model is needed to derive first-class changes, which are suitable units for expressing the evolution of a software system. This paper discusses four criteria for choosing the appropriate meta-model: support for multiple programming languages, extensibility hooks, derivable system invariants and easy information exchange. Four different meta-models were analyzed with respect to those four criteria. The analysis reveals that FAMIX satisfies all four criteria. The FAMIX model serves as a language-independent meta-model and was introduced to exchange information between different software analysis tools that study the architecture at source-code level of class-based object-oriented software. Two extensions to the FAMIX model are presented. The first extension involves the capability to express changes on the dynamic state of a running system. The second extension copes with language-specific artifacts in order to derive Smalltalk-specific changes. The most important track of future work consists in testing whether the extended FAMIX meta-model is expressive enough to specify first-class change types for class-based object-oriented programs.

Acknowledgements: We want to thank the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT Vlaanderen) for providing Peter Ebraert with a doctoral scholarship, and as such financing this research.

Bibliography

[Amb04] S. Ambler. The Object Primer Third Edition: Agile Model-Driven Development with UML 2.0. Cambridge University Press, 2004.
[BJR96] G. Booch, I. Jacobson, J. Rumbaugh. The Unified Modelling Language for Object-Oriented Development. Documentation set, version 0.9, Rational Software Corporation, 1996.
[DD99] S. Ducasse, S. Demeyer. The FAMOOS Object-Oriented Reengineering Handbook. University of Bern, 1999.
[DTS99] S. Demeyer, S. Tichelaar, P. Steyaert. FAMIX 2.0 - The FAMOOS Information Exchange Model. Technical report, University of Berne, 1999.
[EPD07] P. Ebraert, E. V. Paesschen, T. D’Hondt. Change-Oriented Round-Trip Engineering. In Atelier RIMEL: Rapport de recherche. Volume VAL-RR2007-01. 2007.
[Flo02] G. Florijn. RevJava: Design critiques and architectural conformance checking for Java software. SERC, 2002.
[Gro99] O. M. Group. Unified Modeling Language 1.3. Technical report, Rational Software Corporation, June 1999.
[HS01] B. Henderson-Sellers. Some problems with the UML 1.3 meta-model. In Proceedings of the 34th Hawaii International Conference on System Sciences. IEEE Computer Society, 2001.
[ML02] T. Mens, M. Lanza. A Graph-Based Metamodel for Object-Oriented Software Metrics. Electronic Notes in Theoretical Computer Science 72(2):12, 2002.
[RL07] R. Robbes, M. Lanza. A Change-based Approach to Software Evolution. Electronic Notes in Theoretical Computer Science 166:93–109, 2007.
[RW99] G. Reggio, R. Wieringa. Thirty one Problems in the Semantics of UML 1.3 Dynamics. 1999.
[Tic01] S. Tichelaar. Modeling Object-Oriented Software for Reverse Engineering and Refactoring. PhD thesis, University of Bern, 2001.


Evolutionary System for Sc@ut Communicators

Paredes-Garrido, María-Dolores1, Rodríguez-Fortiz, María-José2 and Medina-Medina, Nuria3

1 [email protected], http://lsi.ugr.es/∼gedes/ 2 [email protected], http://lsi.ugr.es/∼gedes/ 3 [email protected], http://lsi.ugr.es/∼gedes/ Grupo de Investigación en Especificación, Desarrollo y Evolución del Software. ETS Ingeniería Informática, Universidad de Granada, España

Abstract: Sc@ut Communicators are augmentative and alternative communication systems (AAC) that have been developed with the Sc@ut platform. These communicators help people with communication problems, and their main feature is that they can be adapted to each individual. These communicators are currently being used by autistic children, and professionals have asked us to develop a system with greater autonomy so that they can configure and adapt the communicator themselves. Keywords: Evolution, Adaptive Hypermedia System, Communicators.

1 Introduction

There is a wide cross-section of the population with disabilities which impede their speech or make their communication difficult to understand [Mir03]. Augmentative and alternative communication systems (AAC) are a growing field of study and are concerned with providing devices and techniques for increasing people’s communicative ability. The Sc@ut project is a platform that allows adaptive communicators [PPR+06] to be created for people with certain communication needs due to disability. The communicator runs on a Pocket PC device, which has the advantages of being cheap, versatile and portable. We are currently focusing on autism, but due to the numerous settings that Sc@ut offers, we have designed communicators which can be adapted for other people with disabilities such as cerebral palsy or dysphasia [PR06]. These communicators are being used by 26 children in 18 schools in Spain, and these numbers are growing. Re-adaptations are necessary because the capacities and abilities of a child with autism evolve frequently. Nowadays these re-adaptations are done manually by us. Educators and parents demand a new system that gives them more independence and helps them re-adapt the communicator themselves. For this, it is necessary to design and create a new system, the Evolutionary System of Sc@ut. In this work, we explain how Sc@ut communicators can evolve. The Sc@ut platform is explained in Section 2, both its architecture and the process to create a communicator. In Section 3 we introduce the new system of the platform, and in Subsection 3.1 we describe how this new system is linked into the two-level architecture. Then, in Section 4 we describe the evolution process of the Evolutionary System. Finally, we present conclusions and further lines of work about the Evolutionary System and the Sc@ut platform.

Figure 1: Sc@ut Architecture and Process

2 Sc@ut Platform

The Sc@ut Platform is based on a two-level architecture [GRP02]. The lower level consists of the software system (the communicator system). The communicator is a hypermedia system which enables the user to navigate through pages with different scenarios and to select components with images and sounds from each page to express what the child wants. Each person has his own communicator due to differences in the ability to understand language, capabilities and skills (the user profile) [PPR+06]. The communicator can be created using another tool in the first level of the architecture, the meta-communicator or meta-system. The meta-communicator is used by parents and professionals (educators) to configure and create the communicator and to adapt it to each individual user. These users are called authors or developers [GC01]. The end users of communicators are people with communication needs, such as children with autism; they are called users or readers. In “Figure 1” we can observe this architecture together with the process to design Sc@ut communicators. We now describe how to design and create a Sc@ut Communicator.

2.1 The Communicator

As mentioned before, a communicator has scenarios through which the child can navigate. The scenarios are constructed by considering the knowledge domain of the child, modelled in a conceptual structure. Each scenario comprises elements and represents a real-life action that the child knows and can perform. The elements associated with the action are the scenario components. The number and kind of components that a scenario has is defined by the user profile. “Figure 2” shows the example of a communicator with a scenario for ”have breakfast”, which includes three components: sandwich, custard and cookies. Each component is located in a square on the hypermedia screen and consists of an image, a text label and sounds. The image associated with a component can change due to its selection state: deactivated if any precondition is not satisfied prior to the component’s selection, and unavailable if the number of units of the component has been depleted. Due to the multiple configuration settings of a communicator, we can create communicators that are adaptive and flexible to re-adapt. A complete description of the Sc@ut communicator can be found in [PPR+06].

Figure 2: A Sc@ut communicator with scenario breakfast

2.2 The Meta-Communicator

The meta-communicator is the application used by educators to create the communicator for each specific child. Thanks to the two architectural levels of our platform, the user profile and the templates (with components, images, sounds and links) are structured information that the communicator system is able to interpret in order to generate a personal communicator. The meta-communicator creates and manages this structured information, which is stored in XML files. Changes in the user profile of a person or in his communicator are changes in the XML files. The meta-communicator has been designed to be used by educators who do not have much experience with computers or PDAs. The following steps are taken to create the communicator: (1) specification of the user profile, (2) representation of the knowledge domain: scenarios, (3) creation of the hypermedia model: navigation, and (4) adaptation.
User Profile. The characteristics considered in the user profile are: child skills, communication habits, guidelines for creating usual scenarios, interaction preferences, educator objectives, interaction shortcuts, and a schedule with the child’s main activities. Different interaction preferences can be chosen in the child’s profile, such as the number of squares or components on each page, the information mode or the interaction type, etc.
Scenarios and Navigation. The knowledge domain is necessary for modelling the world perceived by the children in a way that is friendly and close to them. As we have seen in the process of constructing the user profile, the educators participate actively to describe the scenarios and activities that the child is able to recognise and understand. Using the post-requirements associated with the components for navigating the system [Hub01], we construct a hierarchy of levels which defines how the system is navigated and the syntax of the language for the child. So, the child can construct phrases like: subject (in the first level of the hierarchy) + verb (in the second level) + complement (in the last level, there is a set of specific complements for each verb). Templates constructed by the meta-communicator constitute one of the possible media used to establish communication with the user and are represented in hypermedia format [TG03].
Adaptation. We have to differentiate adaptation from evolution. With adaptation we represent the information in an optimal way for an individual user according to his special needs. With evolution we study the user’s behaviour with the communicator to detect new needs. In summary, evolution triggers the adaptation of the system at the moment the user needs it. The elements that we adapt are all of the elements of the user profile, scenarios and navigation, according to several rules that ensure the integrity of the communicator. This process is the main theme of this work.

3 Evolutionary System

The communicator generates log files while the child interacts with it. The log file includes information that can be used by the professional to understand the user’s interaction and to decide to change his profile and scenarios to fit the child’s characteristics. But what type of information is managed? The following examples show the type of information included in a log file: the scenarios or templates (i.e. pages from the hypermedia system) that have been selected; the components from these pages that have been selected; the number of times that each concrete component has been selected, and the number of units initially associated with it; the validity or not of the selection of the component, depending on whether its precondition has been satisfied; the navigation order, and the path followed to reach a concrete component from the first template, or from another concrete component. Educators must intervene to control the child’s evolution. So, we focus our effort with the help of psychology and pedagogy experts, and educators who know the children and know whether their profile or environment has changed. Evolution is semi-automatic because the system can guide the professional in the decisions to be taken about why, when and how changes should be made to best suit the user. This system is the new Evolutionary System of Sc@ut.


Figure 3: Sc@ut Architecture and Process

3.1 New Architecture of Sc@ut

Following the two-layer architecture, the Evolutionary System is at the same level as the Communicator System within the Sc@ut architecture. In “Figure 3” we can observe where the Evolutionary System is linked.

4 Evolution Process

The evolutionary process is shown in “Figure 4”. This process comprises three phases, which are activated consecutively. The first is to configure detection rules. The second is to analyze the child’s log files according to these detection rules in order to propose changes, and the third is to apply the selected evolution [PR06]. We now describe the three phases and who performs them.

4.1 Configuring Detection Rules

The system must know when the interaction of a specific user differs from that expected by the professional who designed the communicator [Dau00]. The system can measure the information included in one or several log files collected during one or several sessions. These measures are used by the professional to configure Detection Rules (DRs) in an informal language. A detection rule consists of a condition about one or several measures and an alert message that is raised if the condition is true. The system translates Detection Rules into a formal language so that they can be processed automatically. The educator can assign a degree of priority between the rules to avoid coupling. For example, the professional expects component i to be selected more than 3 times and so creates a Detection Rule:


Figure 4: Evolutionary Process Sc@ut

If component i has been selected fewer than 3 times in the last 7 days, alert x

which is translated to:

session (today,today-7): If component(i).selection < 3 then alert(x)

Once the DRs are defined, the next step is to analyze the log files according to these DRs, so that the system can detect anomalous behaviour in the user interaction.
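The paper does not describe how the formalized rules are evaluated against the log data. Purely as an illustration, and with hypothetical class and method names, a rule like the one above could be checked as in the following Java sketch:

import java.time.LocalDate;
import java.util.List;

// Hypothetical evaluation of the detection rule
//   session(today, today-7): If component(i).selection < 3 then alert(x)
class SelectionEvent {
    String componentId;
    LocalDate date;

    SelectionEvent(String componentId, LocalDate date) {
        this.componentId = componentId;
        this.date = date;
    }
}

class DetectionRuleChecker {
    // Counts how often the given component was selected in the last 'days' days
    // and raises the alert when the count falls below the threshold.
    static void check(List<SelectionEvent> log, String componentId,
                      int days, int threshold, String alertMessage) {
        LocalDate from = LocalDate.now().minusDays(days);
        long selections = log.stream()
                .filter(e -> e.componentId.equals(componentId))
                .filter(e -> !e.date.isBefore(from))
                .count();
        if (selections < threshold) {
            System.out.println("ALERT: " + alertMessage);
        }
    }
}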

4.2 Analyzing the Information

In this phase, the log files are studied by the system to detect any abnormal interaction according to the detection rules. But how do we analyze the log files? A great part of this analysis is carried out using transition matrices. A transition matrix captures information about the components selected by the child in the templates and, in addition, it captures the conceptual relationships implicitly followed during that process of interaction. The transition matrix (MT) has a row and a column for every component visualized in any template of the communicator. The entry MT[c1, c2] counts the number of times that, after selecting the component c1, the child has chosen the component c2. For example, after the interaction shown in “Figure 2”, the value of the entry MT[“I want”, “Sandwich”] is increased by one. We have a transition matrix for the valid selections and another one for the invalid selections (i.e. when the child selects a component that is disabled at that moment). For example, if the child selects “Cookies” without any effect because the component is disabled (there are no cookies left or he is not allowed to eat them) and just later he selects “Finish”, the entry MT′[“Cookies”, “Finish”] in the invalid transition matrix (MT′) is increased. Using the first transition matrix we obtain the following information (a small code sketch after the list below illustrates how such a matrix can be maintained):

• High value for a component c1 (Σj MT[c1, cj] ≥ threshold): This means that the child selects the component c1 very frequently in the various templates where it appears.

• Low value for a component c1 (Σj MT[c1, cj] < threshold): This means that the child never or almost never selects the component c1. Based on this, the educator can, for example, remove the component or change its picture in the templates where it appears.

• High value for a conceptual relation c1 → c2 (MT[c1, c2] ≥ threshold): This indicates that very often the child chooses c2 after choosing c1, which allows the educator to detect sequences of actions repeated by the child. So, he can identify progress in the child’s learning (and reinforce it) or an inappropriate behaviour (and restrict it).

• Low value for a conceptual relation c1 → c2 (MT[c1, c2] < threshold): This indicates that the child almost never chooses c2 after choosing c1, which allows the educator to detect sequences of actions not performed by the child and to act accordingly. Obviously, only the possible relations are taken into account, that is, the pairs of components (c1, c2) that can be selected consecutively in the current structure of templates.
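As referenced above, the following Java sketch shows one possible way to maintain such a transition matrix and to compute the per-component totals used in the list; it is only an illustration of the analysis described in this section, not the actual Sc@ut implementation.

import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the transition matrix MT described in Section 4.2.
class TransitionMatrix {
    // counts.get(c1).get(c2) = number of times c2 was selected right after c1
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    // e.g. matrix.recordTransition("I want", "Sandwich");
    void recordTransition(String from, String to) {
        counts.computeIfAbsent(from, k -> new HashMap<>())
              .merge(to, 1, Integer::sum);
    }

    int count(String from, String to) {
        return counts.getOrDefault(from, new HashMap<>()).getOrDefault(to, 0);
    }

    // Sum over j of MT[c1, cj], as used in the high/low value criteria above.
    int totalFrom(String c1) {
        return counts.getOrDefault(c1, new HashMap<>())
                     .values().stream().mapToInt(Integer::intValue).sum();
    }

    boolean isHighValue(String c1, int threshold) {
        return totalFrom(c1) >= threshold;
    }
}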

The invalid transition matrix permits a similar analysis, obviously with differences in interpretation: for example, in this case, a high value for a component indicates that it is selected many times while being disabled. The educator might then decide, for example, to change the restrictions of access to that cell. In addition, the study of this matrix permits the identification of the behaviours repeated by the child in situations of frustration, i.e. how does he react when he presses a disabled component? What component does he press subsequently? Evolutionary Actions are proposed as change alternatives. When a Detection Rule holds, an alert is activated in the system. This alert triggers one or several Evolutionary Actions which will be carried out only if the Integrity Rules (IRs) related to them are satisfied. Evolutionary Actions and Integrity Rules must be designed by a software developer of the meta-system rather than by the educator. Examples of Evolutionary Actions are: adding or deleting a component from a template, changing the images, sounds and links associated with a component which represents a concept or action, changing the navigation order from one component to another, etc. An example of an Integrity Rule associated with the Evolutionary Action “deleting a component from a template” is:

A component i which links to a template T can be deleted if T is linked by another component j

Expressed in a more formal language:

delete(component(i),P) ← link((component(i),P),T) and ∃ (component(j),S) / link((component(j),S),T)

where P and S are templates and j is a component. This Integrity Rule ensures that there are no disconnected templates. The Evolutionary Action could also be performed if the professional is informed and accepts that template T will not be used from now on. In this case, the professional intervenes in the evolutionary process by changing or selecting the constraints associated with an evolutionary action. Therefore, we can define several levels of integrity.
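As an illustration of how such an Integrity Rule might be enforced programmatically, the following Java sketch checks whether a component may be deleted; the data model (a flat list of links) is an assumption made for this example and is not taken from the Sc@ut implementation.

import java.util.List;

// Hypothetical check of the Integrity Rule for "deleting a component from a template":
// component i (in template P) linking to template T may be deleted only if some
// other component j, in some template S, also links to T.
class Link {
    String componentId;
    String fromTemplate;
    String toTemplate;

    Link(String componentId, String fromTemplate, String toTemplate) {
        this.componentId = componentId;
        this.fromTemplate = fromTemplate;
        this.toTemplate = toTemplate;
    }
}

class IntegrityRules {
    static boolean canDelete(String componentI, String templateP,
                             String targetTemplateT, List<Link> links) {
        return links.stream().anyMatch(l ->
                l.toTemplate.equals(targetTemplateT)
                        && !(l.componentId.equals(componentI)
                             && l.fromTemplate.equals(templateP)));
    }
}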


4.3 Selection of Change and Evolution

In this step, the educator chooses the most appropriate Evolutionary Action for the user at that moment. Then, the Evolutionary System sends these EAs to the meta-communicator, where they will be applied to the child’s communicator. The intervention of the educators is essential for taking decisions about changes, since they are the ones who observe the child, best know his needs and can interpret his responses to the communicator. For example, a child has not selected the milk component in the last four days. Therefore, according to the Detection Rules previously defined by the educator, several Evolutionary Actions are activated and offered to change the communicator. These actions are:

1.- Change image.
2.- Don’t show component.
3.- Change order of components in template.

and the educator selects Option 1. Then, the evolutionary system sends the selected evolutionary action to the meta-communicator. The meta-communicator generates one or several new XML templates according to this change. Expressed in a more formal language:

action(x) ← change(image,component(i),T;image new)
action(y) ← delete(component(i),T)
action(z) ← change(order,component(i),T;first(T))

where i is the milk component and T is the breakfast template.

5 Conclusions and Further Lines of Work

We have proposed a flexible architecture that allows us to easily add or modify communicators used by people with communication and learning problems. We have constructed a system that is able to interpret XML files together with image and sound files (the communicator), and another that writes these XML files (the meta-communicator). Both together constitute the original Sc@ut platform. We currently have a robust version of the platform, and Sc@ut communicators are being used in several teaching centers in Spain. Professionals need an automatic or semi-automatic tool (the meta-communicator) that helps them to modify the communicators, i.e. the XML files which constitute the meta-communicator’s data, without having to resort to computer engineers. We can also help them by automatically analyzing the children’s interactions and suggesting valid changes. This is the responsibility of the Evolutionary System that we have added to the Sc@ut Platform. In this work we have shown how to perform the evolutionary process in a semi-automatic way. First, Detection Rules are defined by educators to detect anomalous behaviour. In the second step, the Evolutionary System learns about the child’s interaction from the log files. Then, the Evolutionary System alerts the educator, according to these detection rules, about anomalous behaviour. The system offers the educator several coherent decisions aimed at modelling and correcting this behaviour. At this point, the educator has an essential role in selecting the most appropriate option to evolve the communicator. In the last step, the Evolutionary System sends the evolutionary actions to the top-layer system (the meta-communicator) to allow it to perform them. Communicator integrity is ensured by Integrity Rules that are defined by a software developer.


Also, this developer defines associations between evolutionary actions and integrity rules. These definitions are stored in the meta-communicator. Our immediate work is the full formalization of the updating and detection rules, as well as the integrity constraints; for this, we are building on our previous work on evolving hypermedia systems. In the future, we plan to implement the proposal and to validate it using the real cases we are currently working on. We have received two prizes for our work, one from a federation of autism professionals in Spain [RGP+07] and another from Microsoft. Three students working in our group are the winners of the Imagine Cup 2007 in Spain.

Acknowledgements: This research is supported by the Sc@ut project (Consejería de Educación y Ciencia de la Junta de Andalucía, Government of Spain) and the ADACO project TIN2004-08000-C03-02 (CICYT, Government of Spain).

Bibliography

[Dau00] K. Dautenhahn. Design Issues on Interactive Environments for Children with Autism. In Proceedings International Conference on Disability, Virtual Reality and Associated Technologies (ICDVRAT). Pp. 153–161. Alghero, Sardinia, Italy, Sept. 2000. citeseer.ist.psu.edu/392016.html

[GC01] L. García-Cabrera. SEM-HP: Un Modelo Sistémico, Evolutivo y Semántico para el desarrollo de Sistemas Hipermedia. PhD dissertation, Granada University, Department of Computer Systems and Languages, 2001. Title Translation: SEM-HP: A Semantic, Evolutionary and Systemic Model for Hypermedia System Development.

[GRP02] L. García-Cabrera, M. J. Rodríguez-Fortiz, J. Parets-Llorca. Evolving hypermedia systems: a layered software architecture. Journal of Software Maintenance 14(5):389–405, 2002. doi:10.1002/smr.262

[Hub01] R. Hubscher. What’s in a Prerequisite. icalt 00:0365, 2001. doi:10.1109/ICALT.2001.943946

[Mir03] P. Mirenda. Toward Functional Augmentative and Alternative Communication for Students With Autism. American Speech-Language-Hearing Association 34:203–216, July 2003. doi:10.1044/0161-1461(2003/017)

[PPR+06] M. D. Paredes-Garrido, O. Pino-Morillas, M. J. Rodríguez-Fortiz, M. González-González, E. Rodríguez-Parra. A Platform for Creating Adaptive Communicators. Lecture Notes in Computer Science 4061:847–854, June 2006. doi:10.1007/11788713_123


[PR06] M. D. Paredes-Garrido, M. J. Rodríguez-Fortiz. An evolutionary system for the Sc@ut platform. SIGACCESS Access. Comput. 86:36–39, 2006. doi:10.1145/1196148.1196157

[RGP+07] M. Rodríguez-Fortiz, M. González-González, M. Paredes-Garrido, E. Rodríguez-Parra, et al. Comunicador Aumentativo Para Niños Autistas. Investigación e Innovación en Autismo, pp. 129–166. Editorial AETAPI. Fundación Caja Madrid, May 2007. Title Translation: Augmentative Communicator for Autistic Children.

[RPRG04] M. Rodríguez, P. Paderewski, M. Rodríguez, M. Gea. An approach for evolving an adaptive and unanticipated system: a communicator for children with autism. Pp. 40–47, July 2004.

[TG03] F. Tortosa-Nicolás, M. Gómez-Villa. Tecnologías de ayuda y comunicación aumentativa en personas con trastornos del espectro autista. Tecnología de ayuda en personas con trastornos de la comunicación, pp. 211–246, 2003. Title Translation: Aid Technologies and Augmentative Communication for people with autism spectrum disorder.


Evolution styles: change patterns for Software Evolution

Olivier Le Goaer1 and Peter Ebraert2

1 [email protected], http://lina.atlanstic.net/ Laboratoire Informatique de Nantes Atlantique University of Nantes 2 rue de la houssiniere F-44000 Nantes, France

2 [email protected], http://prog.vub.ac.be/ Programming Technology Lab Vrije Universiteit Brussel Pleinlaan 2 B-1050 Brussel, Belgium

Abstract: Patterns have proven useful in many problem domains. In the domain of software evolution, only behaviour-preserving patterns (e.g. refactorings) have ever been proposed. This paper proposes to broaden the scope of change patterns by means of a reification of any evolution effort into styles. We define an evolution style as a first-class entity which is specified once and can be applied many times. Evolution styles allow the specification of (non) behaviour-preserving change patterns. We exemplify the use of the evolution style concept by means of two applications which evolve in a style-based way. Keywords: Evolution styles, Change patterns

1 Introduction

It was Christopher Alexander who first introduced the idea of capturing design ideas as patterns [AIS77]. Being an architect, he was constantly confronted with similar problems concerning the design of buildings and cities. To avoid having to re-solve similar problems over and over again, he came up with patterns, which recorded the design decisions taken by many builders in many places over many years in order to resolve a particular problem. In the late eighties, patterns were introduced into software engineering [BC87]. Design patterns, for example, gained popularity in computer science after the book Design Patterns: Elements of Reusable Object-Oriented Software was published in 1994 by Gamma et al. [GHJV94]. They defined a design pattern as a general repeatable solution to a commonly occurring problem in software design. Two decades ago, Opdyke introduced patterns in the domain of software evolution [Opd92]. A few years later, Fowler took up on that track and defined a “pattern” as an idea that has been useful in one practical context and will probably be useful in others [Fow97]. Two years later, Fowler presented a catalog of refactorings, which are patterns of change to a program which improve its readability or simplify its structure without changing its results [Fow99]. Changing requirements (e.g. the need to introduce secure transactions) make the evolution of computer programs inevitable. It was stressed early on that programs continuously need to change to remain useful [LB85]. Some types of changes often show up in software engineering: introducing privacy, transactions, security, logging, etc. [CHK+01]. Just like Alexander, Gamma and Fowler, we think that those recurring problems can be solved by a dedicated solution. We define a change pattern as a general repeatable solution to a commonly occurring problem in software evolution. Change patterns are more general than refactorings, as they also allow expressing changes which alter the outcome of an application. As such, refactorings are change patterns, while not all change patterns are refactorings. The rest of this paper is structured as follows. Section 2 illustrates how two very different application architectures are facing changing requirements which impose similar changes on those architectures. Section 3 presents the evolution style concept as a means to specify those changes as change patterns. In Section 4 we show that those evolution styles can be applied on different applications to cope with their changing requirements. Finally, we present the concluding remarks and the tracks of future work in Section 6.

2 Problem statement

This section first introduces a banking application in Java with changing requirements. We specify an evolution scenario which is applied to the banking application to cope with its new requirements. We then introduce another application – a chat room – which is developed in Smalltalk, and which is also subject to changing requirements. We show that, while both applications are very different, the changes they are undergoing are closely related. This illustrates the need for change patterns, which are specified only once and can be applied many times.

2.1 Banking application and its evolution

Assume the information system which manages the services of a bank, depicted in Figure 1. This application is implemented in Java. The Cashier is an employee of a Bank who performs a Cashier Transaction. He can transfer, deposit and withdraw money from the Account of the Customer.

Figure 1: A Java banking architecture

The banking application of Figure 1 only includes one kind of account. Changing demands on the market, however, require the banking system to model different kinds of accounts. Saving accounts get a higher interest rate, but are limited in the transactions they allow. Deposit accounts allow all kinds of transactions, but provide less interest. In addition, money transfers can be sent over a network while the privacy of the users must still be guaranteed. To cope with the first requirement, we introduce different kinds of accounts by subclassing the Account class with a SavingAccount class and a DepositAccount class. Next, the transfer method is pushed down to the DepositAccount class, so that it cannot be called from a SavingAccount. These changes are annotated in red in Figure 2. To ensure the customer’s privacy, the amounts sent over the network are encrypted. This is done by adding an encrypt method to the CashierTransaction class and by invoking it when committing the transaction (in the commit method). When the message is received by the transfer method from the Account class, it needs to be decrypted by calling the decrypt method, which was added to that class. These changes are annotated in green in Figure 2.

Figure 2: The banking architecture after evolution

2.2 Chat application and its evolution

Consider the chat application depicted in Figure 3. This application is designed conforming to the class-based object-oriented meta-architecture of Smalltalk. It originally consists of two classes, User and Chatroom, which respectively maintain a reference cr and users to one another. A user can subscribe to a chatroom using the register method and exchange text messages with the rest of the users of the chatroom using the send and receive methods. Text messages sent to the chatroom are propagated to all the registered users.

Figure 3: A Smalltalk chat architecture

Assume now that we need to differentiate between registered and guest users. To do so, we add two subclasses of User to the application program, RegisteredUser and Guest. The difference between the two types of users is that the registered users can be identified by their name in the chat room whereas the guests cannot. Accordingly, the username attribute of the User class is pushed down to the RegisteredUser class. Figure 4 shows this first feature added to the application (in red).

Figure 4: The chat architecture after evolution

The second change is to ensure the privacy of the users, which in this case corresponds to encrypting and decrypting the messages when they are sent and received respectively. In order to meet this requirement, the architect needs to add two methods to the User class, encrypt and decrypt, which are called from within the send and receive methods respectively. Figure 4 shows this second feature added to the application (in green).

2.3 The need for evolution reuse

Both applications are different: they are used in different areas (banking and communication) and are even implemented in different programming languages (Java and Smalltalk). Despite their different nature, both the banking and the chat applications have been evolving in the same way. First, two subclasses were introduced. Second, a method was pushed down to one subclass. Third, a complex pattern for introducing encryption and decryption was applied to each application. The empirical observation of recurring change patterns shows the possibility of capturing those patterns as evolution practices. Refactorings were already proposed as best practices for improving the structure of applications [Fow99]. Refactorings, however, are not general enough, as they only cover change patterns which do not change the observable behaviour of the software system [Opd92]. Behaviour-changing change patterns (e.g. introducing encryption) are not covered by refactorings. That is why we propose to generalize the idea of change patterns for Software Evolution in evolution styles, which is elaborated on in the following section.


3 Evolution styles

The evolution style concept is a domain-specific specification of a solution for a recurring problem in software evolution. Styles specify a solution which can be applied over and over again and enhance the understandability of software evolution by raising the level of abstraction of the undertaken changes. This section briefly describes the specification format of an evolution style. For a more detailed explanation, we refer the reader to [SOTG06].

3.1 Specification

An evolution style is a first-class entity that can be referenced, queried and passed along. The core of the meta-model of the evolution styles is shown in Figure 5. Every evolution style class possesses a name (allowing communication about the different styles at a more abstract level) and a goal (providing a textual explanation of the purpose of the evolution style). The Domain of an evolution style corresponds to the meta-architecture on which that evolution style is expressed. Its Header is a set of input/output parameters which are expressed on the building blocks of the domain or the application. They can be specified as pre- and postconditions of the evolution style. A Competence represents the sequence of changes that must be applied when the evolution style is invoked. These changes are expressed on the building blocks of the domain or the application.

Figure 5: Core meta-model of the Evolution Style
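A minimal Java sketch of the entities named in Figure 5 is given below. It is only an illustration of the meta-model just described; the field names and types are assumptions and do not correspond to the authors’ implementation.

import java.util.List;
import java.util.function.Predicate;

// Hypothetical rendering of the core evolution-style meta-model of Figure 5.
class EvolutionStyle {
    String name;         // e.g. "AddSubClass"
    String goal;         // textual explanation of the purpose of the style
    String domain;       // the meta-architecture the style is expressed on, e.g. "Famix"
    Header header;       // input/output parameters and their pre-/postconditions
    Runnable competence; // the sequence of changes applied when the style is invoked
}

class Header {
    List<String> parameters;                  // expressed on building blocks of the domain
    List<Predicate<Object[]>> preconditions;  // checked against the actual parameters
    List<Predicate<Object[]>> postconditions; // expected to hold after application
}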

Developers can specialize and compose evolution styles. Specialization supports a white-box reuse of an existing specification, allowing a sub-style to supplement or redefine information of a super-style. Composition supports a black-box reuse, allowing a complex style to delegate changes to other styles, down to basic styles. The next section provides three examples of evolution styles which are specified in a domain based on the FAMIX [DTS99] meta-model.

3.2 Examples

Every evolution style is specified in a certain domain. We use Famix as the common domain for the three evolution style examples which are about to follow. Famix provides a language-independent model for class-based object-oriented source code. Both the Smalltalk and Java programming languages adhere to Famix, whose core is depicted in Figure 6.


Figure 6: The FAMIX core [Based on [DTS99]]

Introducing a subclass – The basic AddSubClass evolution style captures the simple change pattern for introducing a subclass of a class. It is specified as follows:

Name        AddSubClass
Goal        Add a new class as a specialization of an existing superclass.
Domain      Famix
Header      Parameters {Class super, String name}
            Constraints {
              pre:  not existsClass(c, name, ?superClass)
              pre:  existsClass(super, ?name, ?superClass)
              post: existsClass(c, name, super) }
Competence  c := Class new: name
            c superclass: super

The parameters of this style are typed with building blocks of the Famix meta-model. The constraints are specified in a logic programming style. The precondition not existsClass(c, name, ?superClass) is a clause which verifies that there does not exist any class c with the name name and any superclass in the system. The postcondition existsClass(c, name, super) states that after applying the evolution style, the system contains a class c named name with the superclass super. A logic engine can be used to unify those variables with the values of the style’s parameters. Note that the competence of the pattern is specified in Smalltalk, but only uses concepts of the Famix domain. It sends a new: message to the Class class in order to create a new class with name as its name. Afterwards, it sets the superclass of the new class to super.

Shifting down a method – In order to specify a reusable evolution style, we propose a three-step specification for this change pattern. First, we specify two general basic evolution styles intended to add (or delete) a method m to a class target. The following code snippet shows the style for adding a method to a class. The equivalent RemoveMethod style, which removes a method from a class, is omitted for the sake of brevity.

Name        AddMethod
Goal        Add a method to a class
Domain      Famix
Header      Parameters {Method m, Class target}
            Constraints {
              pre:  existsClass(target, ?name, ?superClass)
              pre:  not existsMethod(m, target)
              post: existsMethod(m, target) }
Competence  target addMethod: m


Secondly, we specify a TransferMethod style as a composition of the RemoveMethod and the AddMethod styles. A TransferMethod verifies that all classes exist and that m is a method of class source. The competence of this style applies the RemoveMethod style to remove the method from source and applies the AddMethod style for every class in the target enumeration to add m to each one of them. The following snippet shows the specification of the style:

Name        TransferMethod
Goal        Cut/paste a method from a class to some other classes
Domain      Famix
Header      Parameters {Method m, Class source, Class[] targets}
            Constraints {
              pre: existsClass(source, ?name, source)
              pre: existsClass(tar, ?name, source), member(tar, targets)
              pre: existsMethod(m, source) }
Competence  RemoveMethod(m, source) apply
            targets do: [:c | AddMethod(m, c) apply]

Thirdly, we specialize the TransferMethod style to capture the push down method refactoring1. The PushDownMethod style supplements the inherited specification of its super-style with an extra precondition to ensure that the target classes are subclasses of the source class, and redefines the goal for a more precise semantics. Its competence consists of just invoking the competence of its super-style.

Name        PushDownMethod
Goal        Push down a method from a superclass to some of its subclasses
Header      Constraints {
              pre: forall elementOf(c, enumeration), existsClass(c, ?name, source) }
Competence  super apply

Introducing Privacy – Finally, we propose a complex evolution style which encapsulates the pattern that introduces privacy in an application adhering to the Famix meta-model. This style adds an encryption method encM, which is able to encrypt a parameter, to a class senderC. It then surrounds the invocation of a method receiverM in the senderM with an invocation of the added encM method. On the receiver class receiverC it adds a decM method which is able to decrypt a parameter and adds an invocation to it inside the receiverM method. The IntroduceMessagePrivacy style is composed of three styles: the AddMethod style, the EncapsulateInvocation style and the EncapsulateParameter style.

Name        IntroduceMessagePrivacy
Goal        Ensure privacy of exchanged messages between a sender and a receiver
Domain      Famix
Header      Parameters {Class senderC, Class receiverC, Method senderM, Method receiverM, FormalParameter par}
            Constraints {
              pre:  existsClass(senderC, ?name, ?superClass)
              pre:  existsClass(receiverC, ?name, ?superClass)
              pre:  existsMethod(senderM, senderC)
              pre:  existsMethod(receiverM, receiverC)
              pre:  existsInvocation(senderM, receiverM, par)
              post: existsMethod(encM, senderC)
              post: existsMethod(decM, receiverC)
              post: existsInvocation(senderM, encM, par)
              post: existsInvocation(receiverM, decM, par) }
Competence  Method encM := Method new: "encrypt:" parameter: "m" body: "f(m)"
            AddMethod(encM, senderC) apply
            Method decM := Method new: "decrypt:" parameter: "m" body: "f-1(m)"
            AddMethod(decM, receiverC) apply
            EncapsulateInvocation: par of: receiverM with: encM in: senderM
            EncapsulateParameter: par with: decM in: receiverM

1 http://www.refactoring.com/catalog/pushDownMethod.html


4 Validation

In this section, we validate the claim that evolution styles encapsulate change patterns which are specified once and which can be applied many times. In order to do that, we go back to the banking and chat applications which were explained in Section 2 and show that the three evolution styles which were specified in Section 3 can be applied on both applications in order to obtain the required application design. Applying an evolution style consists in parameterizing it with the actual building blocks of the considered architecture, verifying the header’s preconditions and executing the competence. In case the preconditions cannot be verified, the style cannot be applied. The following sections respectively describe how the evolution styles are applied on the banking and chat applications.
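Purely as an illustration of this application procedure (bind the parameters, verify the preconditions, then execute the competence), the following self-contained Java sketch shows one way it could look; the types and names are assumptions and do not describe an existing evolution-style engine.

import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of applying an evolution style as described above.
class StyleApplication {
    static boolean apply(List<Predicate<Object[]>> preconditions,
                         Runnable competence,
                         Object... actualParameters) {
        for (Predicate<Object[]> precondition : preconditions) {
            if (!precondition.test(actualParameters)) {
                return false; // preconditions not verified: the style cannot be applied
            }
        }
        competence.run();      // execute the sequence of changes (the competence)
        return true;
    }
}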

4.1 Style-based evolution of the banking application

Reconsider the banking application from Section 2.1. We now evolve that application by means of evolution styles. The following code specifies the style-based evolution steps of the evolving banking application. The evolution styles are invoked (like functions) with their parameters being the building blocks of the banking architecture. Applying this change sequence to the application makes it end up in the architecture which is depicted in Figure 2.

AddSubClass(Account, "DepositAccount") apply
AddSubClass(Account, "SavingAccount") apply
PushDownMethod(transfer, Account, {DepositAccount}) apply
IntroduceMessagePrivacy(Transfer, commit, DepositAccount, transfer) apply

4.2 Style-based evolution of the chat application

Reconsider the Chat application depicted in Figure 3. The following code snippet shows the evolution of that system, specified in an evolution style way. In comparison with the banking case there is an additional invocation of a ModifyMethod style, which alters the definition of the name method of the Guest class to make it return a ’guest’ string whenever invoked.

AddSubClass(User, "RegisteredUser") apply
AddSubClass(User, "Guest") apply
PushDownMethod(name, User, {RegisteredUser, Guest}) apply
ModifyMethod(name, Guest, "ˆ ’guest’") apply
IntroduceMessagePrivacy(User, send, User, receive) apply

4.3 Evaluation

Evolution styles specify a pattern of changes which can be applied to solve a recurring problem in software evolution. Defining the styles is subject to a difficult paradox. On the one hand, the evolution style must be specific enough to encapsulate a solution for a well-defined problem. On the other hand, the style must be as general as possible so it can be applied in many different situations. This puts a burden on the person who defines evolution styles. Evolution style specifications include pre- and postconditions. While the preconditions help to verify that the current application is in the right state to apply the evolution style, the postconditions provide information on what the state of the application will be after applying the style. This information can be used to support the definition of valid sequences of evolution style invocations [KK04]. As such, evolution styles contribute to ensuring application and evolution consistency.

5 Related work

We distinguish between related work on patterns of evolution, and patterns for evolution. The first kind attempts to extract commonalities in software evolution activities in order to improve the understanding of evolution. In [NYN+02], the authors outlined recurring work-force schemas and recurring steps that are followed in the development of open-source systems. Another interesting work in that regard is [BKS03], in which the authors consider changes as a phenomenon. They define life-cycle volatility vectors which allow software applications to be classified based on the patterns in the changes which they undergo. The second kind of work aims to help developers by providing recurring practices for various purposes. Refactorings [Fow99] gained popularity and are now embedded in more and more tools. The next step is to integrate patterns for evolution as a natural part of the way software is developed. In order to do that, change classes [Dep07] and change boxes [Zum07] were presented.

6 Conclusion and future work

Patterns are used to specify a solution for recurring problems in their domain and have proven useful in many different domains such as civil architecture, software design and software evolution. In the software evolution domain, however, only limited use has been made of the pattern principle, as only behaviour-preserving patterns (e.g. refactorings) have been proposed. Consequently, developers and architects have been forced to resolve the recurring problems concerning behavioural change. This paper shows that even very different applications (specified in different programming languages and in different problem domains) can be the subject of similar change scenarios which require a similar adaptation of the application behaviour. We define an evolution style as a domain-specific specification of a solution for a recurring problem in software evolution. An evolution style has a name, a goal and consists of three major parts: (1) the domain – denoting the meta-architecture, (2) the header – describing the style’s interface and (3) the competence – specifying the actions which have to be applied in order to reach the style’s goal. A catalog of different evolution styles should be defined in order to grasp the recurring problems in software evolution. Providing styles for all problems, however, is not possible. Consequently, our major track of future work consists in the development of an extensible catalog of evolution styles, which can be queried for patterns.

Acknowledgements: We thank the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT Vlaanderen) for providing Peter Ebraert with a doctoral scholarship to finance this research. We acknowledge Dalila Tamzalit, Ellen Van Paesschen, Pascal Costanza, Djamel Seriai, Theo D’Hondt and Mourad Oussalah for reviewing, commenting on and supporting this research.


Bibliography

[AIS77] C. Alexander, S. Ishikawa, M. Silverstein. A Pattern Language: Towns, Buildings, Construction. Oxford University Press, 1977.
[BC87] K. Beck, W. Cunningham. Using pattern languages for object-oriented programs. Technical report CR-87-43, Tektronix Inc., September 1987.
[BKS03] E. J. Barry, C. F. Kemerer, S. A. Slaughter. On the uniformity of software evolution patterns. In ICSE ’03: Proceedings of the 25th International Conference on Software Engineering. Pp. 106–113. IEEE Computer Society, Washington, DC, USA, 2003.
[CHK+01] N. Chapin, J. E. Hale, K. M. Kham, J. F. Ramil, W.-G. Tan. Types of software evolution and software maintenance. Journal of Software Maintenance 13(1):3–30, 2001.
[Dep07] B. Depoortere. Reasoning about first-class changes for support in software evolution. Master’s thesis, Vrije University, 2007.
[DTS99] S. Demeyer, S. Tichelaar, P. Steyaert. FAMIX 2.0 - The FAMOOS Information Exchange Model. Technical report, University of Berne, 1999.
[Fow97] M. Fowler. Analysis Patterns – Reusable Object Models. Addison Wesley, 1997.
[Fow99] M. Fowler. Refactoring: Improving the Design of Existing Code. Addison Wesley, 1999.
[GHJV94] E. Gamma, R. Helm, R. Johnson, J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison Wesley, 1994.
[KK04] G. Kniesel, H. Koch. Static composition of refactorings. Sci. Comput. Program. 52(1-3):9–51, 2004.
[LB85] M. M. Lehman, L. A. Belady (eds.). Program evolution: processes of software change. Academic Press Professional, Inc., San Diego, CA, USA, 1985.
[NYN+02] K. Nakakoji, Y. Yamamoto, Y. Nishinaka, K. Kishida, Y. Ye. Evolution patterns of open-source software systems and communities. In IWPSE ’02: Proceedings of the International Workshop on Principles of Software Evolution. Pp. 76–85. ACM Press, New York, NY, USA, 2002.
[Opd92] W. F. Opdyke. Refactoring Object-Oriented Frameworks. PhD thesis, University of Illinois at Urbana-Champaign, 1992.
[SOTG06] A. Seriai, M. C. Oussalah, D. Tamzalit, O. L. Goaer. A reuse-driven approach to update component-based software architectures. In IEEE IRI: Information Reuse and Integration. Pp. 313–318. 2006.
[Zum07] P. Zumkehr. Changeboxes — Modeling Change as a First-Class Entity. Master’s thesis, University of Bern, 2007.



Semantic Web Services, a reverse engineering approach for WSMO specification

Houda EL BOUHISSI 1, Mimoun MALKI 2 and Djelloul BOUCHIHA 3

1 EEDIS Laboratory, University of Sidi Bel Abbes 22000, Algeria. Email: [email protected]
2 EEDIS Laboratory, University of Sidi Bel Abbes 22000, Algeria. Email: [email protected]
3 EEDIS Laboratory, University of Sidi Bel Abbes 22000, Algeria. Email: [email protected]

Abstract: Semantic Web Services are at the convergence of two research fields related to Internet technologies, namely the Semantic Web and Web services. Several solutions have been proposed for the specification of Semantic Web Services, such as OWL-S (Ontology Web Language for Services) and WSMO (Web Service Modelling Ontology). However, these technologies require a human user. We propose an approach to create a service ontology with WSMO specifications, starting from WSDL (Web Service Description Language), SOAP (Simple Object Access Protocol) and UDDI (Universal Description, Discovery and Integration) descriptions, by using reverse engineering techniques. Reverse engineering aims to extract several types of information from existing software and to employ this information for the comprehension, renovation and maintenance of the system.

Keywords: Ontology, Semantic Web, Semantic Web Services, Web Services, WSDL, SOAP, UDDI, Reverse Engineering.

1 Introduction

In recent years, distributed programming paradigms have emerged that allow generic software components to be developed and shared. Whilst early versions were little more than shared libraries of functions with little user documentation and unpredictable side effects, it was not until the advent of object-oriented programming and architectures such as CORBA that self-contained components could be reliably defined, documented and shared within a distributed environment. Although ideal for some enterprise integration and eCommerce scenarios, it has only been with the adoption of XML as a common data syntax that the underlying principles have gained wide-scale adoption, through the definition of Web Service standards. Web services are well-defined, reusable software components that perform specific, encapsulated tasks via standardized Web-oriented mechanisms. They can be discovered and invoked, and the composition of several services can be choreographed using well-defined workflow modelling frameworks.

Whilst promising to revolutionize eCommerce and enterprise-wide integration, current standard technologies for Web services provide only syntactic-level descriptions of their functionalities, without any formal definition of what the syntactic definitions might mean. In many cases, Web services offer little more than a formally defined invocation interface, with some human-oriented metadata that describes what the service does and which organization developed it. Applications may invoke Web services using a common, extendable communication framework. However, the lack of machine-readable semantics necessitates human intervention for service discovery and composition within open systems, thus hampering their usage in complex business contexts. Semantic Web Services (SWS) relax this restriction by augmenting Web services with rich formal descriptions of their capabilities, thus facilitating automated composition, discovery, dynamic binding, and invocation of services within an open environment. A prerequisite to this, however, is the emergence and evolution of the Semantic Web, which provides the infrastructure for the semantic interoperability of Web services. Web services will be augmented with rich formal descriptions of their capabilities, such that they can be utilized by applications or other services without human assistance or highly constrained agreements on interfaces or protocols. Thus, Semantic Web Services have the potential to change the way knowledge and business services are consumed and provided on the Web.

In this paper we survey the state of the art of current enabling technologies for Semantic Web Services. Further, we propose a new approach to specify WSMO from WSDL, SOAP and UDDI using reverse engineering techniques. Reverse engineering is the process of analyzing a subject system to (1) identify the system's components and their interrelationships and (2) create representations of the system in another form or at a higher level of abstraction [Chi90].

The rest of the paper is structured as follows: Section 2 provides a general overview of Web services; Section 3 provides an overview of the Semantic Web; Section 4 presents ontologies; Section 5 describes Semantic Web Services; Section 6 reviews related work on SWS and Section 7 discusses the main approaches to SWS. Section 8 presents the main topic of the paper. Finally, Section 9 concludes.

2 Web Services

Beyond its spectacular growth, the Web has become more dynamic with the advent of Web service technology. A Web service (WS) is a (self-contained) software component that allows access to its functionality via a Web interface. WSs communicate by employing established protocols for message transport and encoding. Indeed, the W3C Web Services Architecture Working Group defines a Web service as: "A software application identified by a URI, whose interfaces and bindings are capable of being defined, described and discovered as XML artefacts. A Web service supports direct interactions with other software agents using XML-based messages exchanged via Internet-based protocols" [W3C02]. To define the infrastructure of the service Web, a stack of interrelated standards, such as SOAP, UDDI and WSDL, has been defined. Such standards are expressed in XML and can be transported via HTTP over the existing Web infrastructure. Figure 1 shows the main Web service technology standards; an overview of these technologies is provided below.



Figure 1: Main components of a Web service

2.1 SOAP

The Simple Object Access Protocol (SOAP 1) is a specification for interactions among Web services across the Internet. SOAP uses XML to exchange structured and typed information.

2.2 WSDL

WSDL 2 (Web Service Description Language) is an XML-based language for describing Web services and how to access them. WSDL is also used to locate Web services.

2.3 UDDI

As services become available, they may be registered with a UDDI 3 registry (Universal Description, Discovery and Integration), which can subsequently be browsed and queried by other users, services and applications.

3 Semantic Web

The goal of the Semantic Web is to solve the current limitations of the Web by augmenting Web information with a formal (i.e., machine-processable) representation of its meaning. A direct benefit of this machine-processable semantics would be the enhancement and automation of several information management tasks, such as search or data integration.

There have been several different approaches to realizing the Semantic Web. These approaches can be distinguished by the type of data sources that they consider for semantic description as well as by the richness of the ontology-based semantic annotation.

1 http://www.w3.org/TR/soap12
2 http://www.w3.org/TR/wsdl
3 http://www.uddi.org


4 Ontology

The term ontology, originating from philosophy as detailed in [Kiv02] and [Smi01], was adopted by AI researchers to describe formal domain models. Several ontology definitions have been provided in the last decades. The most frequently cited definition is the one given by Gruber in 1993 [Gru93], according to which an ontology is "an explicit specification of a conceptualization".

In other words, an ontology is a domain model (conceptualization) which is explicitly described (specified). Later, in 1997, Borst defined an ontology as a "formal specification of a shared conceptualization" [Bor97]. This definition requires, in addition to Gruber's definition, that the conceptualization should express a shared view between several parties, a consensus rather than an individual view, and that this conceptualization should be expressed in a machine-readable format (formal). In 1998, Studer et al. [Stu98] merged these two definitions, stating that "an ontology is a formal, explicit specification of a shared conceptualization". As consensual domain models, the primary role of ontologies is to enhance communication between humans (e.g., establishing a shared vocabulary, explaining the meaning of the shared terms to reach consensus).

A major characteristic of an ontology is the level of generality of the specified conceptualization; following [Gua97], we adopt three intuitive classes of generality:

1. Foundational (or top-level) ontologies are conceptualizations that contain specifications of domain- and problem-independent concepts and relations (such as space, time, and matter) based on formal principles derived from linguistics, philosophy, and mathematics.

2. Generic ontologies contain generic knowledge about a certain domain such as medicine, biology, mathematics or Web services.

3. Domain ontologies have the lowest reusability and are specific to a particular domain.

Ontologies serve as metadata schemas, providing a controlled vocabulary of concepts, each with explicitly defined and machine-processable semantics. By defining shared and common domain theories, ontologies help people and machines to communicate concisely, supporting the exchange of semantics and not just syntax. Hence, the success and proliferation of the Semantic Web depend on quickly and cheaply constructing domain-specific ontologies.

5 Semantic Web Services

Current technologies allow the usage of Web services but offer only syntactic descriptions and, therefore, provide only a set of rigid services that cannot adapt to a changing environment without human intervention. A possible solution to these problems is likely to be provided by converting Web services into Semantic Web Services, which are semantically marked-up software resources that can be published, discovered, composed and executed across the Web in a task-driven, semi-automatic way.



Semantic Web Service infrastructures can be characterized along three orthogonal dimensions: usage activities, architecture and service ontology [Cab04].

From the usage activities perspective, SWS are seen as objects within a business application execution scenario. The activities required for running an application using SWS include publishing, discovery, selection, composition, invocation, deployment and ontology management, as described next. From the architecture perspective, SWS are defined by a set of components which realize different activities such as publication, discovery, selection, composition, deployment and ontology management, with underlying security and trust mechanisms. The service ontology essentially integrates, at the knowledge level, the information which has been defined by Web service standards, such as UDDI and WSDL, with related domain knowledge.

Several approaches have been driving the development of Semantic Web Service frameworks, such as OWL-S [OWL-S03], WSMO [WSMO04], IRS-III [Dom04], WSDL-S [Akk05] and SWSF [Bat05]. The following sections describe these approaches in more detail.

6 Approaches to Semantic Web Services

6.1 OWL-S

OWL-S is a SWS description language that enriches Web service descriptions with semantic information from OWL ontologies [OWL-S03]. Figure 2 shows the three basic elements of a Web service defined by OWL-S:

Figure 2: The basic elements of the OWL-S

1. A Service Profile that describes the capabilities of the service along with additional features that help to define the service;
2. A Service Model that provides a description of the activities of the service and the requester/provider interaction protocols;
3. A Service Grounding that describes how the abstract information exchanges described in the Process Model are mapped onto actual messages that the provider and the requester exchange.


6.2 WSMO

The Web Service Modelling Ontology (WSMO) [WSMO04] is based on the Web Service Modelling Framework (WSMF) [Fen02], which it refines into a formal ontology and a formal language. WSMO consists of four main elements for describing Semantic Web Services:

1. Ontologies, which provide the terminology used by the other elements;
2. Goals, which state the intentions that should be solved by Web services;
3. Web service descriptions, which describe the various aspects of a service;
4. Mediators, which resolve interoperability problems.

Each of these WSMO top-level elements can be described with non-functional properties such as creator, creation date, format, language, owner, rights, source, type, etc. WSMO comes along with a modelling language (WSML 4) and a reference implementation (WSMX 5).
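For illustration only, the four top-level elements and their shared non-functional properties can be pictured as plain data types. The Java sketch below is purely hypothetical (it is not the WSMO4J API or any official WSMO binding); it only mirrors the structure described above.

import java.util.List;
import java.util.Map;

// Hypothetical data model of the four WSMO top-level elements.
// Every element carries the same bag of non-functional properties
// (creator, creation date, format, language, owner, rights, source, type, ...).
record NonFunctionalProperties(Map<String, String> values) {}

record Ontology(String iri, List<String> concepts, NonFunctionalProperties nfp) {}
record Goal(String iri, String requestedCapability, NonFunctionalProperties nfp) {}
record WebServiceDescription(String iri, String capability, List<String> interfaces,
                             NonFunctionalProperties nfp) {}
record Mediator(String iri, String source, String target, NonFunctionalProperties nfp) {}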

6.3 IRS-III

IRS-III (Internet Reasoning Service) [Dom04] is a framework and implemented infrastructure which supports the creation of semantic web services according to the WSMO ontology. IRS-III has four main classes of features which distinguish it from other work on semantic web services. Firstly, it supports one-click publishing of "standard" programming code. In other words, it automatically transforms programming code (currently it supports Java and Lisp environments) into a web service, by automatically creating the appropriate wrapper. Hence, it is very easy to make existing standalone software available on the net as web services.

Secondly, by extending the WSMO goal and web service concepts, users of IRS-III can directly invoke web services via goals, i.e. IRS-III supports capability-driven service execution. Thirdly, IRS-III is programmable: IRS-III users can substitute their own semantic web services for some of the main IRS-III components. Finally, IRS-III services are web service compatible: standard web services can be trivially published through IRS-III, and any IRS-III service automatically appears as a standard web service to other web service infrastructures.

6.4 WSDL-S

An interesting approach for semi-automatically adding data semantics to WSDL descriptions is proposed in [Akk05], based on the ongoing METEOR-S (Managing End-To-End OpeRations for Semantic Web Services) project [Pat04] of the LSDIS lab, University of Georgia, in collaboration with IBM. METEOR-S focuses on the creation, management and execution of Semantic Web services. This approach adds semantics to the WSDL specification. It involves three steps:

4 http://www.wsmo.org/wsml/
5 http://www.wsmx.org/



1. Annotation of the XML Schema with a domain ontology;
2. Pre-conditions and effects for WSDL operations;
3. Web service categorization by ontology-based keywords.

6.5 SWSF

Another W3C submission besides OWL-S and WSMO is the Semantic Web Services Framework (SWSF) [Bat05], which represents an attempt to extend the work of OWL-S and consists of two major parts: the Semantic Web Service Language (SWSL 6) and the ontology (SWSO 7). Both were developed on the basis of two different logics: FLOWS (First-order Logic Ontology for Web Services), based on first-order logic, and ROWS (Rules Ontology for Web Services), based on logic programming.

7 Discussion

We have described the current main approaches to Semantic Web Services. This comparison discusses the delivered results of IRS-III, OWL-S, WSMO, SWSF and WSDL-S, as they represent the main approaches driving the implementation of Semantic Web Service components, each of which has gained some momentum and addresses some pragmatic aspects.

These approaches are complementary in many ways and can be compared according to different dimensions of SWS. Each initiative can be characterized in terms of (1) a conceptual model describing the underlying principles and assumptions; and (2) a language or a set of languages that provide the means to realize the model.

The WSDL-S approach is a more technology-centred strategy: rather than providing a conceptual model for the description of Web services and their related aspects, it takes a bottom-up approach by annotating existing standards with metadata. WSDL-S can actually be used as a grounding mechanism for WSMO. The IRS-III approach is integrated with the WSMO approach in the sense that IRS-III uses WSMO as its underlying epistemological framework. However, these technologies require a human user in the loop for selecting services available in registries. We therefore propose an approach for WSMO specification from WSDL, SOAP and UDDI using reverse engineering techniques.

8 Our Contribution

Our contribution consists in describing and implementing WS2WSMO (Web Service to Web Service Modelling Language), a semi-automatic system for the translation of WSDL, SOAP and UDDI descriptions into partial WSMO specifications. WS2WSMO is roughly based on the following observations:

6 http://www.w3.org/Submission/SWSF-SWSL/
7 http://www.w3.org/Submission/SWSF-SWSO/


• Since we are in the context of Web applications, clients generally request services through forms where they enter their data and expect results. This leads us to say that the objectives of the clients coincide with the structure of the different services; they differ only in the data entered and in the results provided or expected.

• The WSDL file expresses the inputs and outputs with "part" tags nested in "message" tags: the first "part" tag represents the inputs, which are classified in order, and the second "part" tag represents the outputs. The elements of these tags therefore constitute the main information, which will be translated into concepts of domain ontologies.

• The SOAP protocol constitutes a means of communication between the client and the service provider; the values entered by the clients therefore appear in the SOAP requests, so these values represent instances of the concepts found previously.

• A Web service application contains a set of information expressed in a manner which is not necessarily agreed upon; the importation of external domain ontologies is therefore needed to solve the problem of semantic interoperability between the other elements of WSMO.

• The UDDI data model is defined as an XML schema. The "businessEntities" are, to some extent, the white pages of a UDDI directory: they describe the organizations that have published services in the repository. One finds there, in particular, the name of the organization, its addresses (physical and Web), classification elements, a list of contacts, etc. Each "businessEntity" is identified by a "businessKey"; this information will represent the non-functional properties embedded in each WSMO element.

WS2WSMO is a semi-automatic system implemented on the Eclipse platform in the Java language and contains a set of modules, as presented in Figure 3:



Figure 3: WS2WSMO System Architecture

• WSDL Extractor: this module extracts the terms from the "part" tags of the WSDL file (see the sketch after this list);

• SOAP Extractor: this module extracts the values given by the user (instances) from the SOAP file;

• Analyzer: this module analyzes the terms found in the WSDL file, then loads a domain ontology and calculates the semantic distance between the concepts of the ontology and the terms found in the WSDL file, using a confidence threshold to decide which concepts of the ontology, as well as other elements such as axioms, properties and relations, are added (a matching sketch follows the next paragraph);

• WSML Converter: this module writes the goals, the Web services and the ontologies according to the WSML syntax.
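As a rough illustration of the WSDL Extractor step, the following Java sketch collects the names of the "part" elements nested inside "message" elements of a WSDL 1.1 file using the standard DOM API. The class and method names are hypothetical, and error handling as well as the input/output ordering convention described above are omitted.

import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical sketch of the WSDL Extractor: collects the names of all
 *  <part> elements declared inside <message> elements of a WSDL 1.1 file. */
public class WsdlExtractor {

    public static List<String> extractPartNames(String wsdlPath) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(wsdlPath);

        List<String> terms = new ArrayList<>();
        NodeList messages = doc.getElementsByTagNameNS(
                "http://schemas.xmlsoap.org/wsdl/", "message");
        for (int i = 0; i < messages.getLength(); i++) {
            Element message = (Element) messages.item(i);
            NodeList parts = message.getElementsByTagNameNS(
                    "http://schemas.xmlsoap.org/wsdl/", "part");
            for (int j = 0; j < parts.getLength(); j++) {
                Element part = (Element) parts.item(j);
                terms.add(part.getAttribute("name"));   // candidate ontology concept
            }
        }
        return terms;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input file; prints one candidate term per line.
        extractPartNames("service.wsdl").forEach(System.out::println);
    }
}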

As a result, we obtain a partial WSMO specification, which reduces the effort required from the designer for the manual definition of the WSMO specifications.
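The paper does not fix a particular semantic-distance measure for the Analyzer module described above. As a stand-in, the sketch below uses a normalized Levenshtein similarity together with a confidence threshold to decide whether a term extracted from the WSDL file is mapped to an existing ontology concept; all names are hypothetical and this measure is only one of many possible choices.

import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of the Analyzer's matching step: map a WSDL term to the
 *  closest ontology concept if the similarity exceeds a confidence threshold. */
public class ConceptMatcher {

    /** Normalized Levenshtein similarity in [0,1]; 1 means identical strings. */
    static double similarity(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++)
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
            }
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 1.0 : 1.0 - (double) d[a.length()][b.length()] / max;
    }

    /** Returns the best-matching concept if it passes the threshold, else empty. */
    static Optional<String> match(String wsdlTerm, List<String> concepts, double threshold) {
        return concepts.stream()
                .filter(c -> similarity(wsdlTerm.toLowerCase(), c.toLowerCase()) >= threshold)
                .max((c1, c2) -> Double.compare(
                        similarity(wsdlTerm.toLowerCase(), c1.toLowerCase()),
                        similarity(wsdlTerm.toLowerCase(), c2.toLowerCase())));
    }

    public static void main(String[] args) {
        List<String> ontology = List.of("FlightTicket", "Passenger", "Airport");
        System.out.println(match("passengerName", ontology, 0.6)); // prints Optional[Passenger]
    }
}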


9 Conclusion

Summarizing, Semantic Web Services are an emerging area of research, and currently all the supporting technologies are still far from a final product. There are technologies available for creating distributed applications which rely on the execution of Web services deployed on the WWW; however, these technologies require a human user in the loop for selecting services available in registries.

We have described the current main approaches to Semantic Web Services, such as OWL-S, IRS-III, WSMO, SWSF and WSDL-S, and we have proposed a new, reverse-engineering-based approach to specify WSMO from WSDL, SOAP and UDDI. Nevertheless, there are still a number of issues concerning Semantic Web Services being investigated in a number of initiatives. These issues will have the attention of industry and academia for the next few years.

Bibliography

[Akk05] Akkiraju, R., Farrell, J., Miller, J., Nagarajan, M., Schmidt, M., Sheth, A., & Verma, K. (2005). Web Service Semantics - WSDL-S. W3C Member Submission, 7 November 2005. Retrieved April 4, 2006, from http://www.w3.org/Submission/2005/SUBM-WSDL-S-20051107/

[Bat05] Battle, S. et al.: “Semantic Web Services Framework (SWSF) Overview”, September 2005, W3C Member Submission.

[Bor97] Borst, W. (1997). Construction of Engineering Ontologies. PhD thesis, University of Twente, Enschede, NL, Centre for Telematica and Information Technology.

[Cab04] Cabral, L., Domingue, J., Motta, E., Payne, T., and Hakimpour, F. (2004). Approaches to Semantic Web Services: An Overview and Comparisons. In (Bussler et al., 2004), pages 225 – 239.

[Chi90] Chikofsky, E.J., Cross II, J.H.: Reverse engineering and design recovery: A taxonomy. IEEE Software 7(1):13-17, 1990.

[Dom04] Domingue, J.; Cabral, L.; Hakimpour, F.; Sell, D.; Motta, E. (2004). IRS-III: A Platform and Infrastructure for Creating WSMO-based Semantic Web Services. Proceedings of the Workshop on WSMO Implementations (WIW 2004), Frankfurt, Germany, September 29-30, 2004, CEUR Workshop Proceedings, ISSN 1613-0073. Available from http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-113/paper3.pdf.

[Fen02] Fensel, D., Bussler, C. The Web Service Modeling Framework WSMF. Electronic Commerce Research and Applications, Vol. 1 (2002), 113-137.

[Gua97] Guarino, N. (1997). Semantic Matching: Formal Ontological Distinctions for Information Organization, Extraction, and Integration. In Pazienza, M., editor, Information Extraction: A Multidisciplinary Approach to an Emerging Information Technology, volume 1299 of LNCS, pages 139-170. Springer-Verlag.

[Gru93] Gruber, T. (1993). A Translation Approach to Portable Ontology Specifications. Knowledge Acquisition, 5(2):199 – 220.

[Kiv02] Kivela, A. and Hyvonen, E. (2002). Ontological Theories for the Semantic Web. In (Hyvonen, 2002).

[OWL-S03] OWL Services Coalition, OWL-S: Semantic Markup for Web Services, Dec 2003, http://www.daml.org/services/owls/1.0/owl-s.html.

[Pat04] Patil, A. A., Oundhakar, S. A., Sheth, A. P., and Verma, Kunal (2004), METEOR-S Web Service Annotation Framework, WWW 2004, ACM Press, 553-562

[Smi01] Smith, B. and Welty, C. (2001). FOIS introduction: Ontology—towards a new synthesis. In Proceedings of the International Conference on Formal Ontology in Information Systems, pages 3 – 9, Ogunquit, Maine, USA. ACM Press.

[Stu98] Studer, R., Benjamins, V., and Fensel, D. (1998). Knowledge Engineering: Principles and Methods. Data and Knowledge Engineering, 25(1-2):161 – 197.

[W3C02] Web services architecture requirements, W3C Web Services Architecture Working Draft. Available online at: http://www.w3.org/TR/2002/WD-wsa-reqs-20021114.

[WSMO04] WSMO Working Group. Web Service Modelling Ontology Project. DERI Working Drafts. http://www.nextwebgeneration.org/projects/wsmo/ (2004).


CodeVizard: Combining Abstraction and Detail for Reasoning About Software Evolution

Christopher Ackermann1, Nico Zazworka2

[email protected] Fraunhofer Center Maryland

[email protected] University of Maryland

Abstract: Experimental studies have been used extensively to gather data about software evolution with the goal of optimizing development processes and providing tool support. Oftentimes a large amount of data is collected, but there is no strategy in place to analyze this information and extract valuable observations. At the same time, the area of information visualization has been steadily growing as analysts start recognizing the power of appropriate visualization. We propose a solution to visualize software evolutionary data in a way that allows analysts to quickly capture interesting patterns and use interactive features to gain further insight and ultimately derive valuable information. Addressing the issues researchers of the High Productivity Computing Systems (HPCS) project were facing when analyzing source code files, we developed a visualization tool called CodeVizard to support the analysis of software evolutionary data. CodeVizard provides both high-level information as well as details about the system and its modifications, and features to quickly explore a large amount of data.

Keywords: Software Evolution, Software Visualization, Software Maintenance

1 Introduction

Researchers and practitioners have been struggling to comprehend the characteristics of software evolution. The reasons are, first, that every software product is unique and it is difficult to establish generalizable rules that pertain to all software systems [ME98]. Secondly, it is not clear what measures can be used to evaluate a software product and the changes that are applied to it. There has been extensive research in finding software metrics that can be used to describe interesting software properties, such as size measures and design complexity measures [MD01]. However, it is difficult to reason about software solely based on metrics, as there are numerous factors that need to be taken into account. It is, therefore, necessary to include low-level information in the analysis process that can help to understand the metric values at a certain point and also to support or refute claims made based on an evaluation of metrics. We propose an approach to combine high-level data with more detailed information to allow for analyzing a large amount of software evolutionary data at any level of detail. We implemented this approach in a visualization tool called CodeVizard. It illustrates high-level information in a special kind of bar graph and reveals details of software system and maintenance characteristics using a zoomable source code panel.

2 Objective

While the general goal of this research was to develop a technique that facilitates understanding of the process of software evolution, it was motivated by concrete needs of researchers facing the issues many encounter when attempting to analyze software development data.

2.1 HPCS

The issues we are addressing emerged during work done by researchers of the Development Time Working Group as part of the High Productivity Computing Systems (HPCS) project. Aiming to analyze the productivity of programmers that are new to High Performance Computing (HPC), the researchers captured data from over 20 HPC classes at several universities [Nak04]. Various measurements such as effort data, defect rates, background information and work flow data were recorded to evaluate how student programmers work and learn. The results should allow the researchers to give advice to vendors, create tools to improve productivity and to establish new learning concepts for HPC.

2.1.1 Analysis goals

In order to approach the analysis of HPC development, the researchers created a list of more concrete analysis targets:

Domain-specific obstacles: Since High Performance Computing distinguishes itself from other programming approaches in several ways, the researchers were particularly interested in the obstacles the subjects experienced that were specific to HPC.

Defect types: As there are domain-specific obstacles in HPC, the types of defects found in systems developed with this approach are also specific. The goal is to identify the types of defects introduced due to domain-specific obstacles and the circumstances that led to them.

Debugging: Closely related to the previous point, this addresses the question of how the subjects went about fixing the defects.

Development phases: One of the main objectives was the identification of so-called development phases in the recorded student programs. The goal was to establish knowledge about what kinds of development phases occur and what their characteristics are. One approach divides the development process into phases of work and rework, "work" being the implementation of new features and "rework" being structural or functional modifications to existing features.

2.1.2 Analysis Issues

The data analysis was conducted in collaboration among several researchers over an extended period of time. This was done with the use of basic information retrieval and analysis tools, such as spreadsheets and line graphs. Despite the high level of experience with the domain, the researchers reportedly struggled with the analysis. This was due to the amount and detail of information that was captured and also due to general issues of software evolutionary analysis as stated above. In particular, four main issues were disclosed:

Analyzing progress over time. In order to detect interesting characteristics of a software evolutionary process, it is necessary not to merely analyze a single version (e.g. the final version) of a file, but to consider a number of code versions over time. The researchers could not find appropriate tool support for this type of analysis task.

Information overload. Due to the fact that detailed data from many studies were collected, the researchers had a large amount of data available. This prevented the researchers from getting an overview of the information, which they expressed to be a necessity.

Information delocalized. The different pieces of information that were gathered during the class studies were not accessible at one location, but stored in different places and formats.

Implicit information. Another challenge in understanding the development process was the high amount of implicit data. For instance, the total number of lines of code of a version or the amount of code added in between two versions are important properties. However, they are not directly accessible as they are only implicitly contained in the source code.

3 CodeVizard - Background

The goal of this research was to build a tool to support the aforementioned analysis goals and resolve the above issues with which the HPCS researchers were struggling. We used techniques from the information visualization domain to approach this challenge and developed a tool called CodeVizard. We will first explain the underlying idea of CodeVizard and will then elaborate on its different components and, in particular, the graphical user interface.

3.1 Visualization Concepts

The key difference between analyses of single system states and software evolutionary processes is the time factor. Since software evolution describes how a software system evolves over time, it is often necessary to consider a large number of system versions, which also means an enormous amount of data, as mentioned by the researchers of HPCS. The requirement of handling this data in a space- and time-efficient manner adds to the challenge of every software tool that aims to support software evolutionary analysis. CodeVizard is implemented in a way that analysis can be conducted quickly despite the amount of data displayed. Since the analysis done by the researchers of the Development Time Working Group focused on single files from individual subjects, we decided to set the same scope for CodeVizard. We recognize that the analysis of single files limits the applicability of CodeVizard, but we believe that it best addresses the issues stated by the HPC researchers. Extending the tool to handle multiple files is one of the future goals. The underlying approach of CodeVizard for analyzing software evolutionary processes is the combination of abstraction and detail.

Abstraction. Meta data such as metrics can be used to visualize software evolution on a high level of abstraction, the advantage being that many versions of a source code file can be analyzed at once. A plethora of metrics exists to measure various characteristics of software and can be used to describe the properties of a system that are of interest. Abstract visualization, therefore, supports the analyst in quickly detecting interesting patterns and anomalies of certain software characteristics.

Detail. While abstract information is useful to derive a high-level picture, it fails to take into account details that are crucial for drawing conclusions pertaining to the development behavior. For instance, when only considering the number of lines added to a system, one could argue that major code additions illustrate the addition of new features. However, the lines could simply be comments. We see that to be able to draw definite conclusions, one must support high-level observations with more detailed information. A second main goal was therefore to also provide detailed information about the recorded software evolutionary process.

CodeVizard is composed of three main modules: the data reader, the computation module, and the graphical interface. It is currently possible to read source code and meta information from databases and Subversion repositories. The computation component can automatically compute several metrics (LOC added, modified, deleted) and could be extended. The core part of CodeVizard is the graphical user interface with the metric view and the code view (Figure 1), both of which will be explained in the next sections.

Figure 1: An overview of CodeVizard. The top is the metric view and the bottom is the code view.
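The module decomposition just described can be pictured as three small interfaces. The Java sketch below is purely illustrative; the names are hypothetical and do not correspond to CodeVizard's actual code.

import java.util.List;
import java.util.Map;

/** Hypothetical sketch of CodeVizard's three-module decomposition. */
interface DataReader {
    /** Reads all captured versions of one file, e.g. from a Subversion repository. */
    List<String> readVersions(String fileIdentifier);
}

interface ComputationModule {
    /** Computes metrics (LOC added, modified, deleted, ...) for consecutive versions. */
    List<Map<String, Double>> computeMetrics(List<String> versions);
}

interface GraphicalInterface {
    /** Renders the metric view and the code view for the computed data. */
    void display(List<String> versions, List<Map<String, Double>> metrics);
}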

4 CodeVizard - Metric View

The analysis of software is often guided by software metrics that describe different characteristics, such as change proneness and coupling. The purpose of the metric view is to visualize these code metrics. Software metrics are measures of properties of pieces of software or its specifications. They provide a quantitative basis for the development and validation of models of the software development process [Mil88]. Researchers have used software metrics to analyze software evolution in various ways and from different angles. A thorough analysis requires considering not only one but multiple metrics at once. We used advanced visualization concepts to display up to four different metrics using a special kind of bar graph where each bar represents one code version that was captured during software development. The following four bar attributes are used to display the different dimensions:


Width. The x-axis of the metric view represents a time line of the software development process. The width of a bar indicates how much time passed between capturing two versions. The corresponding code version for a bar is the one that was captured at the end of the bar. For instance, if the bar expresses that a high number of lines were added over 20 minutes, that means that the version that was captured after these 20 minutes had a large number of additional source code lines. This will become clearer when describing different analysis scenarios in the succeeding chapter.

Position on the y-axis. The position of each bar on the y-axis corresponds to a metric value. The bars that are aligned on the bottom of the graph intuitively have low values and the bars on the top have high values.

Thickness. Another dimension is expressed by the thickness of the bars. A thin bar indicates that the version has a low value for the respective metric. A thick bar signals high values.

Color. The use of color can guide analysts in their work by highlighting important information [Mac99], which is oftentimes extreme values. In order to express a fourth dimension we used color coding. Low values are represented by a light green color and high values are indicated by a dark red color. The closer a metric value is to the medium, the more subtle the color is.

In order to avoid an information overload and make the display less densely populated, the range of input values for each metric is discretized by mapping the real input values to a limited and variable number of discrete values. While the discretization provides a clearer picture, it preserves enough detail to make qualitative judgements and find interesting patterns. Using novel visualization concepts, the metric view can display up to four different software metrics at any time and yet does not overwhelm the analyst with information.
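A minimal sketch of the discretization and color mapping just described, assuming hypothetical class names and an arbitrary light-green-to-dark-red ramp; the actual CodeVizard implementation may differ.

import java.awt.Color;

/** Hypothetical sketch of the metric view's discretization: raw metric values are
 *  mapped to a small number of bins, and bins to a light-green-to-dark-red ramp. */
public class MetricDiscretizer {

    private final double min, max;
    private final int bins;

    public MetricDiscretizer(double min, double max, int bins) {
        this.min = min; this.max = max; this.bins = bins;
    }

    /** Bin index in [0, bins-1]; low values fall into low bins. */
    public int bin(double value) {
        double normalized = (value - min) / (max - min);
        int b = (int) (normalized * bins);
        return Math.max(0, Math.min(bins - 1, b));
    }

    /** Interpolates from light green (low) to dark red (high). */
    public Color color(double value) {
        double t = bin(value) / (double) (bins - 1);
        int r = (int) (144 + t * (139 - 144));   // 144 -> 139
        int g = (int) (238 + t * (0 - 238));     // 238 -> 0
        int b = (int) (144 + t * (0 - 144));     // 144 -> 0
        return new Color(r, g, b);
    }
}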

5 CodeVizard - Code View

The second component of CodeVizard is its code view. The purpose of the code view is to allow the user to explore further details of the changes and the file versions. A typical scenario would be that an analyst identifies a change in which many lines of code were added in the metric view. The analyst can then use the code view to zoom in on the change, down to the actual source code. This enables the analyst to find out whether or not the change was mainly comments, the name of a method if a new method was added, or other information. The code view basically shows the source code of all file versions lined up horizontally. Transitions between the versions illustrate the kind of change every source code line experienced (line added, modified, or deleted) or indicate when a line remained unaltered. The software visualization tool Seesoft illustrates source code versions in a similar way but is limited to displaying only a single version of a system [ESE92]. The code view is not static but allows the user to zoom in and out of the code view and get a different perspective on the source code, from an abstract overview to a detailed illustration of a few versions. While the zooming is continuous, the code view adjusts the amount of semantic information based on the zoom factor (semantic zooming). When zooming out, the source code details are hidden to allow for better zooming performance and to provide the analyst only with the information that can be captured on that level of abstraction. Likewise, zooming into the code view reveals more details up to a point where the actual source code is displayed.


Figure 2: Code view: The three levels of the semantic zoom.

We distinguish three semantic zooming levels that differ in the kind of semantic information they show.

Code Block Level. At the lowest zooming level the code versions are visualized as blocks (solidly colored rectangles), where the height of a block represents the length of the source code, i.e. the total number of lines of code. Since only little information for each version is displayed, all versions fit on the screen and the view reveals how the size of the file changed over time. Furthermore, the analyst can spot change patterns, such as certain source code regions that were frequently modified. Also, anomalies such as the deletion or modification of a larger set of lines become visible immediately. Even more specific questions, such as how stable a newly added set of source code lines was in subsequent versions, can be answered readily.

Code Representative Level. By zooming further into the view, the code blocks turn into a code representative level in which the outlines of single source code lines are shown. Dark lines indicate the outline of the source code, which reveals details such as the line indentation and length. Also, the shapes that indicate the maintenance activities show single lines. Presuming that the code was formatted by the programmer (e.g. code indentation, empty lines between logical modules), the analyst can derive more insight into which part of the code (method, class, loop construct) was affected by the maintenance activities.

Code Text Level. On the most detailed level, the actual source code is displayed so that the analyst can explore more details about the source code versions and the maintenance activities they experienced. For instance, it is possible to see the name of a variable that was modified, the name of a method that was deleted, etc.
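The semantic zooming described above can be sketched as a simple mapping from the zoom factor to one of the three levels. The thresholds below are invented for illustration only and are not CodeVizard's actual values.

/** Hypothetical sketch of the code view's semantic zoom: the zoom factor selects
 *  which of the three levels of detail is rendered. Thresholds are made up. */
public class SemanticZoom {

    enum Level { CODE_BLOCK, CODE_REPRESENTATIVE, CODE_TEXT }

    /** zoomFactor == 1.0 means the whole version history fits on screen. */
    static Level levelFor(double zoomFactor) {
        if (zoomFactor < 4.0)  return Level.CODE_BLOCK;          // solid rectangles only
        if (zoomFactor < 16.0) return Level.CODE_REPRESENTATIVE; // line outlines
        return Level.CODE_TEXT;                                  // actual source text
    }

    public static void main(String[] args) {
        System.out.println(levelFor(1.0));   // CODE_BLOCK
        System.out.println(levelFor(8.0));   // CODE_REPRESENTATIVE
        System.out.println(levelFor(32.0));  // CODE_TEXT
    }
}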

5.1 Transitions

As mentioned above, the maintenance activities that were applied to the different versions of the source code file are illustrated as transitions between the blocks representing the versions. The source code line of one version is connected with the corresponding source code lines of the preceding and succeeding versions. In places where the file experienced change, the maintenance activity is indicated using objects of different shapes and colors. A dark blue line indicates that the source code line was modified in between the two versions. For lines that only exist in the predecessor code (in that case, lines were deleted) a red triangle is drawn. Lines just existent in the successor (newly added lines) are represented by green triangles (see Figure 3).

Figure 3: Objects of different shape and color indicate the different maintenance activities.

Other information visualization tools that focus on software evolutionary data also use novel visualization techniques for illustrating metric values [VT06]. The approach of combining this rather abstract information with the details contained in the code view, however, is to our knowledge new and unique.

6 Analysis Scenarios

CodeVizard was developed with certain analysis scenarios in mind, i.e. an idea of how the tool could be used to conduct the analysis tasks. For the following scenarios we will use a dataset that was recorded during a student's implementation of "Game of Life", which is a small-scale program with a total of 220 lines of code. A version of the source code was captured at every compile. The dataset consisted of 160 versions that were developed over a total of 36 hours. The metrics we will use for the scenarios are entirely computed by CodeVizard and include, for each version:

• Total LOC. The total number of source code lines.

• LOC added / modified / deleted. The number of source code lines that were added, modified, and deleted to reach the version.

• Change Amount. A variable describing the total maintenance effort that was applied to reach the current version. It is computed as follows: ChangeAmount = LOCadded + LOCmodified + LOCdeleted (see the sketch below).
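As a worked example of the formula: a version with 12 added, 3 modified and 5 deleted lines has a change amount of 20. A minimal Java sketch (hypothetical names, not CodeVizard's code):

/** The change amount of a version, as defined above (names hypothetical). */
record VersionMetrics(int locAdded, int locModified, int locDeleted) {
    int changeAmount() { return locAdded + locModified + locDeleted; }
}
// Example: new VersionMetrics(12, 3, 5).changeAmount() == 20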

When the data is initially loaded into CodeVizard, the three dimensions that are encoded in the bars (y-position, color, and thickness) are set to the same metric: total LOC.

6.1 Scenario I

One question asked in the case study was to identify the phases of work and rework. We loosely defined "work" as addition of new features and "rework" as structural or functional changes. More specifically, we asked the subjects to roughly divide the entire development process into these two phases. CodeVizard offers two possibilities to conduct such an analysis, either by using the metric view or the code view. We will now explain both approaches.


Figure 4: Phases of work and rework identified using the code view. The transitions that are predominantly green signal a high amount of change and can be identified as work phases. The transitions with blue and red lines represent code modification and deletions, respectively. They show rework phases.

Metric view. The basic idea of identifying the implementation of new features is to identify the versions in the code that experienced a lot of additions. Modifications and deletions will be considered as rework activities. Of course, these conventions are not exact, but they are sufficient to illustrate the concepts. A more accurate and detailed analysis can be conducted by using both the metric and the code view. In order to identify the versions with the "work" characteristics, we highlighted the versions with many code additions by simply setting the y-axis to represent the LOC added. The color and the height of the bars are set to represent the LOC modified and LOC deleted, respectively. The work phases are now illustrated by the bars that are positioned higher up on the y-axis. The light-colored bars indicate a low change amount and the thinner bars represent few LOC deleted. Figure 4 shows the setup of the metric view that reveals the information described.

Code view. The code view illustrates the maintenance activities (LOC added, modified, and deleted) through color coding as mentioned above. To identify work phases, one can simply look for versions whose transitions leading up to them have "a lot of green," i.e. major code additions. Blue and red marks illustrate rework phases as they express code modifications and deletions.

6.2 Scenario II

As the motto of information visualization is "Discovering the unexpected," it is not surprising that we found several interesting patterns and anomalies in the datasets we analyzed. We first used the metric view to look for versions which had extreme change amount values, i.e. exceptionally low or high. We discovered that the subject had applied an exceptionally high amount of change to a number of consecutive versions (see Figure 5). After taking a closer look at the source code, we discovered that the added code section remained unchanged for the rest of the development process. This is especially interesting since the programmer seemed to have struggled when adding rather small functionality at the beginning of the program development. The reason could be that the student copied the new functionality from his classmate. This is just one example of the patterns we found when applying CodeVizard to the given dataset, but it already shows what interesting patterns a quick analysis can reveal. While CodeVizard does not provide final answers, it offers strong support for reasoning and other sense-making processes.


Figure 5: Scenario II - The metric view clearly shows the two versions that experienced major additions. Analyzing the code on a high level of abstraction reveals that the added code sections remained unchanged.

7 Case Study

In order to evaluate the efficiency of CodeVizard, we conducted several case studies. The general goal of the studies was to determine whether CodeVizard can support analysts in finding the patterns of interest and quickly retrieving valuable information from the available data. The case studies were conducted with six participants, three of whom were senior scientists and three of whom were experienced graduate students. None of the participants took part in the tool development or was familiar with CodeVizard. The participants received a brief introduction to the features of the tool, a list of tasks, and a questionnaire to capture their perception of CodeVizard. After the participants executed some simple tasks to get used to the tool, they went on to identify complex patterns in the software evolutionary process. For instance, one of the questions asked the participants to identify the phases of work and rework.

7.1 Results

The case studies showed that the participants were not only able to conduct the simple introductory tasks but could also quickly divide the evolutionary process into work and rework phases. The evaluation of the debriefing questionnaires revealed that the overall impression of CodeVizard was very positive. Among the features that the participants liked most were the ability of the tool to combine high-level metrics with source code details and its easy handling. Since the researchers of HPCS had conducted the task of identifying work and rework phases without the support of CodeVizard, we know that it is normally cumbersome and time consuming. Using our tool, however, the participants were able to reason about the development phases and conduct the same task in a fraction of the time. We can, therefore, say that CodeVizard successfully addressed the issues that motivated this work.


8 Conclusion and Future Work

The work described in this paper aimed to tackle some of the concrete issues researchers encountered when analyzing software evolutionary data by using techniques from the field of information visualization as a means of presenting the right amount of data, at the right amount of detail, at the right point in the analysis process. At the beginning of this endeavour, we interviewed the researchers of HPCS and listed the analysis goals and the issues the researchers encountered during their analyses. Each issue was taken into account and addressed with the features of CodeVizard. The positive feedback that we received when we conducted a number of case studies indicates that we achieved our goal of making the task of analyzing software evolutionary data less cumbersome and more time efficient. The case studies also produced useful critique on how CodeVizard could be improved. Requested functionality includes the ability to handle multiple files rather than a single one, a comparison functionality for multiple code files, and an annotation feature that could be used to capture findings and mark detected patterns. Our goal for future work is to continue improving CodeVizard based on these points.

Bibliography

[ESE92] S. G. Eick, J. L. Steffen, J. Eric E. Sumner. Seesoft-A Tool for Visualizing Line Oriented Software Statistics. IEEE Trans. Softw. Eng. 18(11):957-968, 1992.

[Mac99] L. W. MacDonald. Tutorial: Using Color Effectively in Computer Graphics. IEEE Comput. Graph. Appl. 19(4):20–35, 1999.

[MD01] T. Mens, S. Demeyer. Future trends in software evolution metrics. In IWPSE ’01: Proceedings of the 4th International Workshop on Principles of Software Evolution. Pp. 83–86. ACM Press, New York, NY, USA, 2001.

[ME98] J. C. Munson, S. G. Elbaum. Code Churn: A Measure for Estimating the Impact of Code Change. In ICSM ’98: Proceedings of the International Conference on Software Maintenance. P. 24. IEEE Computer Society, Washington, DC, USA, 1998.

[Mil88] E. E. Mills. Software Metrics. Technical report, Seattle University, 1988.

[Nak04] T. Nakamura. Challenges in Measuring HPCS Learner Productivity in an Age of Ubiquitous Computing: The HPCS Program. In Proceedings of the First International Workshop on Software Engineering for High Performance Computing System Applications. Pp. 27-31. 2004.

[VT06] L. Voinea, A. Telea. Multiscale and multivariate visualizations of software evolution. In SoftVis ’06: Proceedings of the 2006 ACM symposium on Software visualization. Pp. 115–124. ACM Press, New York, NY, USA, 2006.


dEVOLVe: Middleware Support for software maintenance in Distributed, EVOLVing Environments

Bart Elen, Nico Janssens, Wouter Joosen, Pierre Verbaeten

[email protected] IBBT-DistriNet, Department of Computer Science, K.U.Leuven Celestijnenlaan 200A, B-3001 Leuven, Belgium

Abstract: Distributed applications in evolving environments are typically difficult to maintain. An example of such an application is the advertisement application for the public transportation buses of 'De Lijn' in the IBBT-SPAMM project. The advertisement application is executed on a back-end server as well as on public transportation buses with wireless network connectivity and an onboard advertisement display. The distributed environment of this application will 'evolve' in different ways during the following years: 1) Devices (buses) will be added to and removed from the distributed environment. Some of the new devices (new buses) may differ significantly from the currently used ones, causing heterogeneity in the distributed environment. 2) Peripheral devices (e.g. advertisement displays) will be added, removed and replaced. Currently, human intervention is needed to maintain the application and adapt it to its changed environment. We need to: deploy application software on the added devices (buses), replace application software on the changed devices (e.g. to support a new advertisement display), and remove application software when it is no longer useful (e.g. because it requires an advertisement display which has been removed). When the distributed environment becomes too large, or changes too often, manual software maintenance is no longer a feasible option. Automation is required. This paper reports on our ongoing work on dEVOLVe, a new OSGi-based middleware platform which will automate some software maintenance tasks in 'distributed evolving environments'. dEVOLVe will detect changes in the distributed environment and will adapt the application accordingly. This way, the user is freed from the software maintenance tasks caused by the environment evolution.

Keywords: Software maintenance, distributed evolving environment, middleware

1 Introduction

A large number of devices with computing power (cars, VCRs, cell phones, ...) have been introduced on the market during the last decades. A new trend is that those devices are being equipped with network connectivity. These newly formed distributed environments provide a promising platform for new distributed applications. An example of such an application is the advertisement application of the IBBT-SPAMM project [IBB]. In this project, public transportation buses of 'De Lijn' [del] are equipped with an onboard computer, a wireless network connection and an advertisement display. The advertisement client application is executed on the company back-end application server. With this client, advertisements are sent to the advertisement server applications, running on the different buses, which show them on the advertisement displays of the buses. Figure 1 gives an overview of the distributed environment of the advertisement application.

Figure 1: Distributed evolving environment of advertisement application

We can distinguish three types of evolution in this distributed environment:

1. Devices (buses) will be added to and removed from the distributed application. The added devices (buses) can differ from the previously used ones in system software (e.g. new Java Virtual Machine (JVM)) and/or hardware (e.g. new type of advertisement displays), introducing heterogeneity in the distributed environment.

2. The employed devices will change over time when peripherals are added to or removed from the devices. For instance, buses may be upgraded after some years of service with new types of advertisement displays.

3. The application software itself will evolve: new features will be added, bugs will be fixed, ... .

We focus in this paper on middleware support for the first two types of evolution. Software evolution and application life cycle management in distributed environments are currently not fully covered in our research. Changes in the distributed environment may require an adaptation of the application software: 1) When a device (bus) is added to the distributed environment, the appropriate application software must be deployed on it. 2) When a device changes (e.g. a peripheral device is added), it may be required to replace the application software (e.g. to support the new advertisement display). 3) Application software can become useless after a change in the distributed environment (e.g. because it requires an advertisement display which is removed). This application software should be stopped or removed to free resources.
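A minimal sketch of how adaptations 1) and 3) could look on top of a standard OSGi framework: only the installBundle/start/stop/uninstall calls are real OSGi API; the class name, the bundle location and the decision logic are hypothetical and not part of dEVOLVe.

import org.osgi.framework.Bundle;
import org.osgi.framework.BundleContext;
import org.osgi.framework.BundleException;

/** Hypothetical sketch: deploy the advertisement software on a newly added bus
 *  and remove it again when its required peripheral (the display) disappears. */
public class AdvertisementDeployer {

    private final BundleContext context;

    public AdvertisementDeployer(BundleContext context) {
        this.context = context;
    }

    /** Adaptation 1: a new bus joined the environment, install and start the software. */
    public Bundle deployOnNewBus(String bundleLocation) throws BundleException {
        Bundle bundle = context.installBundle(bundleLocation); // e.g. "file:ad-server.jar"
        bundle.start();
        return bundle;
    }

    /** Adaptation 3: the advertisement display was removed, free the resources. */
    public void removeWhenDisplayGone(Bundle bundle) throws BundleException {
        bundle.stop();
        bundle.uninstall();
    }
}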


The adaptation of the application software is made harder by the heterogeneity in the distributed environments. Application software imposes requirements on its execution environment: it typically requires a certain JVM, the presence of certain devices or peripherals (e.g. an advertisement display), ... . Hence, in a heterogeneous environment, different application software is needed on different devices for the same task. Human intervention is currently needed to maintain applications in distributed evolving environments: to detect the changes in the distributed environment, to determine the best fitted application software for this environment, and to deploy the selected software on the correct devices. In large distributed environments that change a lot over time, software maintenance is complex, time consuming and error-prone. Therefore, middleware support is needed to automate the software maintenance tasks caused by the evolution of the distributed environment.

The main contribution of this paper is an OSGi [osg] based middleware platform, called dEVOLVe, which automates software maintenance tasks in 'distributed evolving environments'. The middleware will adapt distributed applications towards their changing environment by fulfilling the following tasks: 1) Determining the properties of the distributed environment. 2) Detecting the addition and removal of devices in the distributed environment. 3) Detecting when the peripherals of the devices change. 4) Selecting the application software best fitted for the execution environment. 5) Automatically deploying the selected software on the correct devices. 6) Detecting the presence of software that is no longer needed and removing it from the application.

The remainder of this paper is structured as follows. In Section 2, we present the requirements for software maintenance middleware for distributed evolving environments. Section 3 demonstrates how an important part of our middleware requirements can be realized with existing OSGi technology. Section 4 enumerates the major functionalities with which dEVOLVe extends the OSGi service platform. In Section 5, we compare our middleware platform with some state-of-the-art middleware platforms for software maintenance in distributed evolving environments. Section 6 concludes this paper and describes future directions.

2 Middleware requirements

In this section, we identify the main requirements for software maintenance middleware for distributed evolving environments. First, we explain why we see a component-based application architecture with service dependencies as a prerequisite. Second, we describe a use case of the advertisement application to identify the middleware requirements.

2.1 Prerequisite

As a prerequisite, we require for all applications a component-based architecture and the usage of service dependencies [BC02]. The component-based architecture limits the amount of application software that has to be replaced when the execution environment changes. An example is given in Figure 2. When a large advertisement display is added to the execution environment, the monolithic application architecture requires the replacement of the complete application. With a component-based architecture, however, only a part of the application needs to be adapted. The usage of service bindings between the components allows the application to be rewired at run time.

Figure 2: A monolithic (a) and a component-based (b) application architecture in an evolving environment
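As an illustration only (not code from the dEVOLVe implementation), the Java sketch below shows the bind/unbind convention used by OSGi Declarative Services for such service bindings; the names ManagementGui and AdvertisementPolicyService are hypothetical. Because the component depends only on a service interface, the runtime can swap the small-display policy for the large-display policy without restarting the component.

    // A minimal sketch of a component with a rewirable service dependency.
    // All names are hypothetical; only the bind/unbind pattern is standard DS practice.
    public class ManagementGui {

        public interface AdvertisementPolicyService {
            String nextAdvertisement();
        }

        private volatile AdvertisementPolicyService policy;

        // Called by the service runtime when a matching policy component appears.
        void bindPolicy(AdvertisementPolicyService p) {
            this.policy = p;
        }

        // Called when the bound policy component disappears or is replaced.
        void unbindPolicy(AdvertisementPolicyService p) {
            if (this.policy == p) {
                this.policy = null;
            }
        }

        String currentAdvertisement() {
            AdvertisementPolicyService p = this.policy;
            return (p != null) ? p.nextAdvertisement() : "";
        }
    }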

2.2 Advertisement application use case

We identify the middleware requirements with a use case of the 'De Lijn' advertisement application. Figure 3 gives a simplified representation of the component-based application composition at two moments in time. The left part of Figure 3 shows the advertisement application running on two identical buses and on the back-end server. The right part of the figure shows how the execution environment may evolve over time. The advertisement display of one bus (a) is replaced by a large one, and a new model bus (b) is added with future system software (Java Standard Edition 8) and two large advertisement displays. To support this evolution, the middleware platform must be able to:

• Detect the replacement of the advertisement display on the second bus by a new model with different possibilities: Environment evolution, determining local environment properties
• Select the application software best fitted for the large advertisement display. Automatically deploy the selected application software on the second bus. Rewire the application software to replace the old 'Small display advertisement policy' component by the new 'Large display advertisement policy' component: Local automatic application composition, Automatic software deployment


Figure 3: Advertisement application in distributed evolving environment

• Detect the addition of a new bus to the execution environment: Environment evolution, group discovery
• Detect the difference in system software (JVM) and peripheral devices between the new bus and the other buses: Determining local environment properties
• Select the best fitted application software for this bus, install it on the bus and wire it into the distributed application: Local automatic application composition, automatic software deployment

Further, there are two additional requirements:

• The application software on the back-end server requires a 'busAdvertisementService' from the buses (not represented in the figure). The middleware platform must be able to resolve such remote dependencies: Remote dependencies
• The remote dependencies (between software components on the back-end server and a bus) must also be resolved during the automatic wiring of distributed applications: Distributed automatic application composition

3 OSGi service platform

The OSGi [osg] service platform has support for component-based architectures, service dependencies and application life cycle management. In this section, we describe how the OSGi service platform can be used to fulfill a subset of the middleware requirements identified in section 2. First, we demonstrate how local environment properties can be determined with OSGi. Second, we show how OSGi can be used to create an application composition adapted to the local execution environment.


3.1 Determining local environment properties

The OSGi service platform is able to determine different kinds of environment properties. It identifies the underlying JVM and checks the JVM requirements of each application component before deployment. Further, OSGi can be used to detect the presence of peripheral devices (displays, sensors, actuators, ...) and to identify their offered functionality, by relying on peripheral devices that are able to register themselves with the OSGi platform as services [KBY+06, RHKB05].
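As a minimal sketch of this second mechanism (not taken from the papers cited above), the bundle activator below registers a hypothetical advertisement display as an OSGi service together with descriptive service properties. The interface name and the property keys are assumptions; only the BundleContext.registerService call is standard OSGi API.

    import java.util.Hashtable;
    import org.osgi.framework.BundleActivator;
    import org.osgi.framework.BundleContext;
    import org.osgi.framework.ServiceRegistration;

    // Hypothetical peripheral driver bundle that announces an advertisement display
    // in the local service registry, so the middleware can detect the peripheral.
    public class DisplayDriverActivator implements BundleActivator {

        public interface AdvertisementDisplay {
            void show(String content);
        }

        private ServiceRegistration registration;

        public void start(BundleContext context) {
            Hashtable<String, Object> props = new Hashtable<String, Object>();
            props.put("peripheral.type", "advertisement-display"); // assumed property names
            props.put("display.size", "large");
            registration = context.registerService(
                    AdvertisementDisplay.class.getName(),
                    new AdvertisementDisplay() {
                        public void show(String content) {
                            // drive the physical display here
                        }
                    },
                    props);
        }

        public void stop(BundleContext context) {
            registration.unregister();
        }
    }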

3.2 Local automatic application composition

The middleware platform must be able to automatically compose new application compositions, adapted to the local environment. To make this possible, we equip each application with an application-root component. This component describes which critical functionality must be offered by the application during its complete lifespan. It distinguishes itself from the other components in two ways. First, it is the only component of the application that we do not allow the middleware to replace automatically. We made this choice to guarantee that the critical functionality of the application is maintained. Second, it is the only application component which does not have to offer any services to other application components, because we start the application composition from this component. The middleware platform automatically composes the application by resolving the service requirements of the application-root component with application components best fitted for the local environment. The root component in the 'De Lijn' advertisement application (Figure 3) is the 'Management GUI' component.

When the local environment properties are determined, the middleware has to adapt the application composition accordingly. First, the middleware has to determine the best fitted application composition for this environment by resolving the service requirements of the application-root component. If not locally available yet, the middleware has to download and install the application components of the selected application composition. Once all needed application components are installed on the local platform, the middleware needs to wire them into the distributed application.

We are currently working on an implementation of our middleware platform. In this implementation, we use the OSGi Bundle Repository service or OBR [OBR] for the selection of the best fitted application composition, and for the downloading and installation of the needed application components. For the automatic application composition on each device, we use OSGi Declarative Services (DS) [ds].
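The following sketch indicates how such a resolution step could look against the draft RFC-0112 OBR interfaces (package org.osgi.service.obr, as shipped with Apache Felix); the filter expression and the choice of the first candidate are simplifying assumptions of this illustration, not the selection strategy of dEVOLVe.

    import org.osgi.service.obr.RepositoryAdmin;
    import org.osgi.service.obr.Resolver;
    import org.osgi.service.obr.Resource;

    // Sketch only: find a resource providing a required capability, resolve its
    // dependencies and deploy it. Method names follow the Felix OBR draft API.
    public class CompositionResolver {

        private final RepositoryAdmin repoAdmin; // obtained from the service registry

        public CompositionResolver(RepositoryAdmin repoAdmin) {
            this.repoAdmin = repoAdmin;
        }

        public void deployProvider(String serviceInterface) {
            // The filter syntax over resource properties is an assumption.
            Resource[] candidates = repoAdmin.discoverResources(
                    "(service=" + serviceInterface + ")");
            if (candidates == null || candidates.length == 0) {
                return;
            }
            Resolver resolver = repoAdmin.resolver();
            resolver.add(candidates[0]);   // here: simply pick the first candidate
            if (resolver.resolve()) {
                resolver.deploy(true);     // download, install and start the bundles
            }
        }
    }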

4 dEVOLVe

Figure 4 presents an overview of the middleware requirements identified in section 2. For each requirement, the middleware components used to meet the requirement are marked with a rectangle. As mentioned in the previous section, a large part of the required functionality is covered by existing OSGi technology. However, middleware support is missing for: remote dependencies, group discovery, distributed automatic application composition, and environment evolution.


Figure 4: Overview of the functionality offered by the middleware platform components

We are currently working on an OSGi component, called dEVOLVe, which contains the missing middleware support. In this section, we describe how dEVOLVe realizes the needed functionality.

Figure 5 gives an overview of our middleware platform for software maintenance in distributed evolving environments. The dEVOLVe, OBR, DS and application components all run on top of the OSGi base framework. OSGi allows direct access to both the JVM and the OS.

Figure 5: Middleware platform for software maintenance in distributed evolving environments

4.1 Remote dependencies

The OSGi service platform has support for the description and wiring of service, package and bundle1 dependencies within a single VM. However, dependencies with remote devices are not supported. Therefore, we added remote service, package and bundle dependency headers (Figure 6) to the component manifest file, as remote variants of the existing OSGi dependency headers. The 'group' argument of those manifest headers will be discussed in the next subsection.

Figure 6: Remote dependency manifest headers

1 Components are called bundles in the OSGi community.


The dEVOLVe component parses the manifest files of the used application components and tries to resolve their remote dependencies. OBR [OBR] is used to deploy the needed application components on the remote devices.
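A small sketch of this parsing step is given below. The header name "Remote-Import-Service" and its comma-separated value format are hypothetical stand-ins for the actual headers of Figure 6; only Bundle.getHeaders() is standard OSGi API.

    import java.util.Dictionary;
    import org.osgi.framework.Bundle;

    // Illustration only: read a (hypothetical) remote dependency header from a
    // bundle manifest and split it into the required remote service interfaces.
    public class RemoteDependencyParser {

        public static String[] remoteServiceDependencies(Bundle bundle) {
            Dictionary<?, ?> headers = bundle.getHeaders();  // standard OSGi API
            String value = (String) headers.get("Remote-Import-Service");
            if (value == null) {
                return new String[0];
            }
            // Assume a comma-separated list of service interface names.
            return value.split("\\s*,\\s*");
        }
    }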

4.2 Group discovery

The devices in an evolving environment will change over time. Because of this, application developers cannot name the individual devices in the remote dependencies. To deal with this problem, we introduce weak couplings with remote devices by adopting the role concept from the sensor network community [MLM+05]. Developers can describe dependencies with groups of remote devices fulfilling certain roles. They do not need to know which devices will fulfill these roles, and devices can be added to and removed from each role at run time. The assignment of roles to the devices is considered a task of the local network manager, who may choose to automate this task with a role assignment algorithm [FR05].
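Once the remote services of a group are proxied into the local registry (see the next subsection) with a "group" service property, a role-qualified lookup could be expressed with a standard OSGi LDAP filter, as in the sketch below; the property name mirrors the 'group' argument of the remote dependency headers and is an assumption of this illustration.

    import org.osgi.framework.BundleContext;
    import org.osgi.framework.InvalidSyntaxException;
    import org.osgi.framework.ServiceReference;

    // Sketch only: look up all local (proxied) services belonging to a given role/group.
    public class GroupLookup {

        public static ServiceReference[] findGroupServices(BundleContext context,
                                                           String serviceInterface,
                                                           String group)
                throws InvalidSyntaxException {
            // Standard OSGi LDAP-style filter over service properties; may return null.
            return context.getServiceReferences(serviceInterface, "(group=" + group + ")");
        }
    }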

4.3 Distributed automatic application composition

Automatic wiring is not possible for remote package and bundle dependencies. Remote service dependencies can be wired automatically with DS when the remote services are registered as local services. To realize this, dEVOLVe generates a proxy object for each remote service and registers it as a local service. This is similar to the approach of R-OSGi [RA07] to make remote services accessible. A simple remote method invocation mechanism is used to handle the remote service calls.
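A minimal sketch of this proxying idea, using a plain java.lang.reflect.Proxy, is shown below. RemoteInvoker stands in for the simple remote method invocation mechanism mentioned above and is hypothetical; the registration of the proxy uses the standard OSGi BundleContext API.

    import java.lang.reflect.InvocationHandler;
    import java.lang.reflect.Method;
    import java.lang.reflect.Proxy;
    import org.osgi.framework.BundleContext;

    // Sketch only: expose a remote service as a local OSGi service via a dynamic proxy.
    public class RemoteServiceImporter {

        public interface RemoteInvoker {
            Object call(String methodName, Object[] args) throws Exception;
        }

        public static void importService(BundleContext context, Class<?> serviceInterface,
                                         final RemoteInvoker invoker) {
            Object proxy = Proxy.newProxyInstance(
                    serviceInterface.getClassLoader(),
                    new Class<?>[] { serviceInterface },
                    new InvocationHandler() {
                        public Object invoke(Object p, Method m, Object[] args) throws Throwable {
                            // Forward every local call to the remote device.
                            return invoker.call(m.getName(), args);
                        }
                    });
            // Register the proxy locally so DS can wire it like any other service.
            context.registerService(serviceInterface.getName(), proxy, null);
        }
    }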

4.4 Environment evolution

The middleware platform must be able to deal with changes in the execution environment. A very robust, but not necessarily the best, technique to deal with such changes is a periodic recomposition of the complete distributed application. We realize this with an application heartbeat: a periodic signal which starts from the application-root component and travels through the complete distributed application by following the service bindings between the application components. On each device where this heartbeat passes by, we react to the signal by checking whether the current application composition is still the optimal one to satisfy the service requirements. When a better fitted application composition is discovered, the application is adapted.

Some installed application components may no longer be used after an application recomposition, for instance because a new component has been installed to deliver the required service, or because a broken network connection makes it impossible to deliver the remote service. In the case of local service dependencies, DS will be able to react by stopping the currently unused components. However, this is not possible for the remote service dependencies. The heartbeat signal, which travels periodically through the complete distributed application, allows us to identify the components which are no longer part of the current application composition. Those components will no longer receive the heartbeat signal and can be removed or deactivated, since they are no longer useful in the application.
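The sketch below illustrates only the bookkeeping side of the heartbeat idea: components record when they last saw the heartbeat, and anything that has been silent for too long is reported as removable. All names are hypothetical, and the signalling, threading and service-binding traversal of dEVOLVe are omitted.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch only: track which components still receive the application heartbeat.
    public class HeartbeatMonitor {

        public interface HeartbeatAware {
            void onHeartbeat(long timestamp); // also forwards the signal over its service bindings
        }

        private final Map<HeartbeatAware, Long> lastSeen =
                new ConcurrentHashMap<HeartbeatAware, Long>();

        public void heartbeatReceived(HeartbeatAware component) {
            lastSeen.put(component, System.currentTimeMillis());
        }

        // Called periodically: components that missed the heartbeat for longer than
        // maxSilenceMillis are no longer part of the composition and can be removed.
        public List<HeartbeatAware> findStaleComponents(long maxSilenceMillis) {
            long now = System.currentTimeMillis();
            List<HeartbeatAware> stale = new ArrayList<HeartbeatAware>();
            for (Map.Entry<HeartbeatAware, Long> e : lastSeen.entrySet()) {
                if (now - e.getValue() > maxSilenceMillis) {
                    stale.add(e.getKey());
                }
            }
            return stale;
        }
    }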


5 Related work

Existing, state-of-the-art middlewares are able to adapt software towards its changing distributed environment. However, they all have limitations that make them less suited for software maintenance in distributed evolving environments.

5.1 Impala

Impala [LM03] is a middleware platform for distributed sensor networks, able to adapt the distributed application on the fly towards the evolving execution environment. Impala determines on each device some properties of the local execution environment, such as the battery level and the number of direct neighbors. The user is able to describe switching rules which determine how the application has to be adapted when the environment properties change. Impala is also able to check the peripheral requirements of the application software. This avoids the execution of application software with unmet hardware dependencies.

However, Impala only supports monolithic applications. This makes it impossible to adapt parts of the application. Further, Impala does not support the assignment of roles to devices. Therefore, only distributed applications which require the execution of the same task on each device are supported. Impala is also not able to check remote dependencies. Because of this, all application software on the network must offer and require the same remote services.

5.2 TinyCubus

The TinyCubus [MLM+05] middleware supports component-based application architectures with service dependencies between the application components. It is able to detect changes in the environment, to select the best fitted application components and to wire them together. The selection of the application components is based on 1) the device environment (e.g. network density), 2) the application requirements (e.g. reliability requirements), and 3) the optimization parameters (e.g. minimal resource usage).

However, TinyCubus does not allow remote dependencies to be described. This makes it difficult to guarantee cooperation between the application software running on the different devices. Further, TinyCubus is only able to select components installed on the device. The automatic deployment of components on the devices is currently an unsolved problem. TinyCubus is also not able to rewire service dependencies at run time.

5.3 Gridkit

Gridkit [GCB+06] is a middleware framework able to adapt itself to changes in the environment. A component-based architecture with service bindings is used to allow dynamic adaptation. In contrast to the previously discussed middlewares, Gridkit uses a centralized approach. A configurator collects information about the composition of the complete distributed framework and receives events about changing environmental conditions. With this information, the configurator determines when and how to change the framework by inserting, deleting, disconnecting, connecting and replacing middleware components on the different devices. However, although Gridkit is able to adapt the middleware support towards the applications, it is not able to adapt the distributed applications themselves.

6 Conclusion and future work

In this paper, we have identified the middleware requirements for software maintenance in distributed evolving environments. We illustrated this with a use case of an advertisement application for public transportation. We have demonstrated how existing OSGi technology can be used to realize an important part of the middleware requirements. We have presented an OSGi based middleware platform, called dEVOLVe, which adds the missing middleware support for:

• Remote dependencies: Remote service, package and bundle dependencies can be described and are automatically resolved.
• Group discovery: Application developers can describe remote dependencies with an unknown group of devices fulfilling a certain role.
• Distributed automatic application composition: Remote services are made available as local services to allow automatic wiring of service dependencies.
• Environment evolution: Changes in the distributed execution environment are detected by the middleware and the application is adapted accordingly.

Application developers who are familiar with OSGi and DS can start writing applications for dEVOLVe without much additional effort. They only have to learn to use three new manifest headers for the description of remote dependencies. dEVOLVe does not require the implementation of additional methods, or the inheritance of dEVOLVe classes. Even existing OSGi components can be used without any adaptation.

dEVOLVe is work in progress. In the near future, we want to add some optimizations to allow dEVOLVe to react faster to changes in the environment. We are also going to evaluate the proposed solutions and enhance them where possible. Further, we want to investigate how dEVOLVe can be extended to support life cycle management (including software evolution) in distributed evolving environments. Our current middleware platform already contains some life cycle management support: the OSGi service platform has support for software life cycle management on a single device, and dEVOLVe extends this life cycle management support to distributed applications. However, we believe that the current life cycle management support for distributed applications is too difficult to use. Research into additional middleware support is required.

Acknowledgements: The authors would like to thank Sam Michiels for valuable comments and stimulating discussions. This work is part of the SPAMM project, funded by the IBBT (Interdisciplinary institute for BroadBand Technology).

Bibliography

[BC02] G. Bieber, J. Carpenter. Introduction to Service-Oriented Programming. 2002.


[del] De Lijn. http://www.delijn.be

[ds] Declarative Services, OSGI Service Platform Service Compendium, Release 4, OSGI Alliance.

[FR05] C. Frank, K. Römer. Algorithms for generic role assignment in wireless sensor networks. In SenSys '05: Proceedings of the 3rd international conference on Embedded networked sensor systems. Pp. 230–242. ACM Press, New York, NY, USA, 2005.

[GCB+06] P. Grace, G. Coulson, G. Blair, B. Porter, D. Hughes. Dynamic reconfiguration in sensor middleware. In MidSens ’06: Proceedings of the international workshop on Middleware for sensor networks. Pp. 1–6. ACM Press, New York, NY, USA, 2006.

[IBB] The IBBT-SPAMM project: Solutions Platform for Advanced Mobile Mesh. https://projects.ibbt.be/spamm/

[KBY+06] J. King, R. Bose, H.-I. Yang, S. Pickles, A. Helal. Atlas: A Service-Oriented Sensor Platform. In SenseApp ’06: Proceedings of the first IEEE International Workshop on Practical Issues in Building Sensor Network Applications. November 2006.

[LM03] T. Liu, M. Martonosi. Impala: a middleware system for managing autonomic, par- allel sensor systems. In PPoPP ’03: Proceedings of the ninth ACM SIGPLAN sym- posium on Principles and practice of parallel programming. Pp. 107–118. ACM Press, New York, NY, USA, 2003.

[MLM+05] P. J. Marrón, A. Lachenmann, D. Minder, J. Hähner, R. Sauter, K. Rothermel. TinyCubus: A Flexible and Adaptive Framework for Sensor Networks. In EWSN '05: Proceedings of the Second European Workshop on Wireless Sensor Networks. Pp. 278–289. January 2005.

[OBR] OSGi Alliance RFC-0112, OSGi Bundle Repository. http://www2.osgi.org/div/rfc-0112_BundleRepository.pdf

[osg] Open Services Gateway Initiative. http://www.osgi.org

[RA07] J. S. Rellermeyer, G. Alonso. Services Everywhere: OSGi in Distributed Environments. 2007.

[RHKB05] J. Russo, A. Helal, J. King, R. Bose. Self-describing sensor networks using a surro- gate architecture. Technical report, University of Florida, June 2005. http://www.harris.cise.ufl.edu/projects/publications/Sensor-platform-paper2.pdf

Tool Demonstrations

2007

ECEASST

Maintaining Design Regularities in Evolving Software Systems using Template Queries in IntensiVE

- demo proposal -

Johan Brichau, Kim Mens Université catholique de Louvain

Coen De Roover, Andy Kellens Vrije Universiteit Brussel

1 Abstract

Design regularities are an apparent aspect of today's software implementations. Coding conventions, design patterns, programming idioms and architectural designs are only some examples of design regularities that govern the development of large and complex software systems. Maintaining these regularities in the source code of an evolving software system requires adequate documentation that is continuously monitored and verified for consistency with the source code. Intensional views, and their supporting tool-suite IntensiVE, make it possible to document and verify these regularities by means of program queries that group source code entities into views, and that impose constraints over these views. By checking the validity of the views and constraints with respect to the source code, the tool-suite provides fine-grained feedback concerning inconsistencies between the documentation and the source code.

Program queries are at the heart of the IntensiVE tool-suite. Although such queries can be specified using a variety of different programming languages, the logic program query language SOUL provides developers with an expressive and declarative means to express patterns and idioms of interest. However, the accurate specification of such a program query can often become complex. At the core of this problem lies the observation that developers specify views based on a concrete set of source code entities they wish to document in the system. In other words, software developers tend to understand these views not directly through their specification but through their extension, containing tangible source code entities. This dichotomy requires that developers translate their example-driven knowledge into operational queries to effectively describe a view. This is often further complicated by the fact that the specification of the view may need diverse structural and behavioral representations of the program.

To reconcile the example-driven knowledge of developers with the expressive power of program queries, SOUL was recently extended with 'template queries'. Such queries are actual concrete source code templates, embedded in the logic query language. These templates describe intensional views merely through a specification of a prototypical example implementation of the regularity to search for. Furthermore, the query resolution strategy employs an automatic translation of the template to queries that act upon diverse structural and behavioral representations of the program. In particular, these queries are matched against a combination of call graphs, points-to analysis results and abstract syntax trees.

2 Intensional Views

Intensional Views is a technique for describing a conceptual model of a program's design regularities and verifying the consistency of that model with respect to the program. Views describe concepts of interest to a programmer by grouping program entities (classes, methods, ...) that share some structural property. These sets of program entities are specified intensionally, using the logic metaprogramming language Soul. (The intension, spelled with an 's', of a set is its description or defining properties, i.e., what is true about the members of the set. The extension of a set is its members or contents.)

For example, to model the concept of "all getter methods" in a specific implementation, we specify an intensional view using the following query, which groups all methods that contain a statement returning an instance variable:

    getterMethod(?class, ?method, ?field) if
        isMethodInClass(?method, ?class),
        fieldInClassChain(?field, ?class),
        fieldName(?field, ?fname),
        methodStatements(?method, ?slist),
        ?slist =

Without explaining all details of the Soul syntax and semantics, upon evaluation the above query accumulates all solutions for the logic variables ?class, ?method and ?field, such that ?method is a method implemented in class ?class and contains a return statement returning the field ?field. This query is the intension of the view.

3 Template Queries

Although the query shown in the previous section is relatively small and merely employs a single structural representation of the program, it already illustrates how easily such queries can become very complex. To cope with this burden, SOUL has been extended with the concept of template queries. Using this new technique, we can express the "getter methods" view as follows:


    jtClassDeclaration(?c){
        class ?c {
            private ?type ?field;
            public ?type ?name() {
                return ?field;
            }
        }
    }

This query shows a template of a class declaration containing a field and method declaration. The method’s body contains a single statement returning the field. Upon evaluation, the logic variables in the template are bound to actual values identifying the method we are searching for.

The most interesting part of these template queries is that their evaluation does not merely perform a structural matching over a structural representation of the program. Instead, it performs a behavioural matching over semantic representations of the program, including call graphs and points-to sets that are derived using static analysis techniques. This ultimately means that template queries can detect many implementation variants of the same pattern that is expressed in the template. In other words, the template behaves as an example and prototypical implementation of a pattern, and the matching process looks for similar patterns in the code, using logic inference as the main resolution strategy. The above template will, for example, detect all instances of the getter method regularity below:

    public Integer gethour() {
        return hour;
    }

    public Integer gethourlazy() {
        if (hour == null) {
            hour = this.currentHour();
        }
        return this.hour;
    }

    public Integer getBuffer() {
        Integer temp;
        temp = buffer;
        buffer = null;
        return temp;
    }

4 Demo Description

In this demonstration, we will show how to employ the IntensiVE tool-suite to document design regularities in a Java program by means of views and constraints. We will demonstrate and contrast program queries in traditional Soul and by means of template queries. We will subsequently evolve the documented program and illustrate how the combined power of IntensiVE and template queries makes it possible to maintain design regularities in the source code of the evolving program.


ECEASST

FDdoc: Software Evolution Support Through Feature-Based Documentation

Ilja Heitlager, Joost Visser

Software Improvement Group

Abstract: As software systems evolve, the need for tools to support the impact analysis process for software evolution increases. We present a tool based on feature modelling, used as general software documentation, to support the evolution process. Feature modelling has found wide application in the elicitation of variation in software, with the emphasis on reuse. An extension of the technique, which we call attributed feature modelling, into an overall documentation tool is described. We explain the functionality of the tool.

Keywords: Feature Modelling, Software Evolution

1 Introduction

One of the challenges in software evolution is to keep track of the location of functionality in the source code. Features are not always easily traceable to one distinct spot in the source code, which makes it difficult to predict the impact of changes. The success of changes depends on the mental map developers have of the code, which becomes increasingly problematic as the system grows. A second challenge arises from the addition of new developers, or software immigrants [1]. For newcomers to become productive, they need to build their own mental map of the software. Documentation may help to find one's way in the system. Documentation techniques range from high-level design artefacts like UML models to source code documentation generated by tools like DocGen or javadoc [2][3]. The first category often evolves out of sync with the actual source code. The second category is limited to providing documentation at the solution level only.

We have developed a documentation technique with the following characteristics:
• provide a problem-level view of the system and the source code
• traceability between code and documentation artefacts
• non-intrusive in the day-to-day operations of software engineers
• generic for different source languages and documentation artefacts

Our technique is an extension to feature modelling [4]. Feature modelling has found widespread use in the analysis of variability in software systems, particularly for software product lines [5][6]. As observed by Riebisch, feature modelling can be seen as an intermediary step between requirements, high-level design artefacts and source code, where requirements relate to source through features in an m-to-n relationship [7]. Starting from this observation, we have taken feature modelling one step further and created a generic HTML-based documentation tool that connects software and higher-level design artefacts. Our tool provides a simple language for the definition of feature models [8] and adds a facility to attribute the features. Those attributes include descriptions, dates and version numbers. Most importantly, the attributes of the features provide hyperlinks to source code, analysis and design documents, as well as other systems used in the software process, like issue trackers (see Figure 1). The tool provides a problem-level index into all relevant artefacts created and maintained in the software process. The feature model is maintained together with the source code and thus constitutes a simple and lightweight documentation artefact which can easily be kept in sync with the source code. FDdoc supports coupled evolution of source code and high-level documentation. It provides a simple and effective way for software immigrants to acquire a mental map of the system, and for maintainers to perform change impact analysis.

2 Tool functionality

Figure 1: FDdoc is HTML-based and generated from an FDL script. It links features with source code, source documentation and other development tools and artefacts.

FDdoc is a command-line tool, used standalone or as part of a build system. The tool takes a feature model as input and generates HTML pages, one per feature. The input language is based on the Feature Description Language (FDL) of van Deursen et al. [8], but extended with attributes. Each generated feature page contains the description of the feature and a sub-diagram of the connections to its nearest neighbours. The page further contains some attributes, like date of introduction, version, status of the code (e.g. experimental), and links to the source and other design documents if available. Indexes are generated over each attribute for easy reference of the model.

The FDL language distinguishes terminal and non-terminal features. Non-terminal features can be decomposed into sub-features, using various construction operators, including all, one-of, and more-of. Features can require other features (requires) or can exclude other features (excluded-by). Abstract features, whose actual implementation is done by some solution-level feature, are denoted by the implemented-by operator. We extended the FDL language such that each feature (both terminal and non-terminal) can have multiple references to documents and to source code, using the .doc and .source attributes. In addition, each feature can have attributes such as a description, rationale (why this feature exists), since_version (since which version the feature exists), customer (if the system is used in a multiple-customer environment), and since_date (date of introduction).

As stated, a feature can reference many sources, but a source can also be connected to many features: it is a many-to-many connection. The many-to-many connectivity allows for feature delocalization. The mapping of the source can be applied in two forms:
1. directly in the feature model, using an attribute;
2. using an external map.
In the second form, the feature model is maintained in the model file, but an external feature-to-source map is maintained separately. The tool merges these two files and generates a simple overview. This technique originates from the reflexion approach of Murphy [9]. The external map technique also allows for traceability using source annotations. Again, the feature model is maintained separately. In a preliminary run, the source is scanned and the source-to-feature links are extracted. After merging, the system generates one overview.

3 Tool implementation

Figure 2: FDdoc tool architecture.

FDdoc has been implemented in Python. The core of the system is a directed-graph library. Feature modelling has been implemented as an extension of the directed graph by defining the specific vertex and edge types. Various forms of output exist: XML, the FDL language, HTML and Graphviz dot; these are implemented by means of visitors on top of the feature-model graph. The language is implemented using the pyparsing LL parser library. Metrics and indices are implemented directly over the graphs. For graphical output, the Graphviz tool is used.



Availability

The tool is proprietary technology; however, it can be made available for evaluation. Interested readers are invited to contact the authors.

Acknowledgements

We thank Remko Helms of the University of Utrecht, Machiel van der Bijl and Menno Jonkers of Axini for the fruitful discussions during construction of the prototype. We thank Yohyon van Zantwijk of VivaCadena for providing the environment to test the tool in practice.

References

[1] S. E. Sim and R. C. Holt, "The ramp-up problem in software projects: A case study of how software immigrants naturalize", 20th Int. Conf. on Software Engineering (ICSE), pp. 361-370, 1997.
[2] M. Fowler and K. Scott, UML Distilled: A Brief Guide to the Standard Object Modeling Language, Addison-Wesley Professional, 2003.
[3] A. van Deursen, T. Kuipers, L. M. F. Moonen, Legacy to the Extreme, Center for Mathematics and Computer Science (CWI), SEN-R0018, 2000.
[4] K. Kang et al., Feature-Oriented Domain Analysis (FODA) Feasibility Study, SEI, Carnegie Mellon University, Pittsburgh, PA, CMU/SEI-90-TR-21, 1990.
[5] K. Czarnecki et al., "Staged configuration through specialization and multi-level configuration of feature models", Software Process Improvement and Practice, vol. 10, no. 2, pp. 143-169, 2005.
[6] K. Czarnecki et al., "fmp and fmp2rsm: eclipse plug-ins for modeling features using model templates", Conference on Object-Oriented Programming Systems Languages and Applications, ACM Press, New York, NY, USA, pp. 200-201, 2005.
[7] M. Riebisch, "Supporting evolutionary development by feature models and traceability links", Proc. of the 11th IEEE Int. Conf. and Workshop on the Engineering of Computer-Based Systems, pp. 370-377, 2004.
[8] A. van Deursen and P. Klint, "Domain-Specific Language Design Requires Feature Descriptions", Journal of Computing and Information Technology, vol. 10, no. 1, pp. 1-17, 2002.
[9] G. C. Murphy et al., "Software reflexion models: bridging the gap between source and high-level models", Proceedings of the 3rd ACM SIGSOFT Symposium on Foundations of Software Engineering, ACM Press, New York, NY, USA, pp. 18-28, 1995.
