An Integrated Approach for Studying Architectural Evolution

Qiang Tu and Michael W. Godfrey
Software Architecture Group (SWAG)
Department of Computer Science, University of Waterloo
email: {qtu,migod}@swag.uwaterloo.ca

Abstract

Studying how a software system has evolved over time is difficult, time consuming, and costly; existing techniques are often limited in their applicability, are hard to extend, and provide little support for coping with architectural change. This paper introduces an approach to studying software evolution that integrates the use of metrics, software visualization, and origin analysis, which is a set of techniques for reasoning about structural and architectural change. Our approach incorporates data from various statistical and metrics tools, and provides a query engine as well as a web-based visualization and navigation interface. It aims to provide an extensible, integrated environment for aiding software maintainers in understanding the evolution of long-lived systems that have undergone significant architectural change. In this paper, we use the evolution of GCC as an example to demonstrate the various functionalities of BEAGLE, a prototype implementation of the proposed environment.

1 Introduction

Software systems must evolve during their lifetime in response to changing expectations and environments. The context of a software system is a dynamic multi-dimensional environment that includes the application domain, the developers' experience, and the software development processes and technologies [19]. Software systems exist in an environment that is as complex and dynamic as the natural world.

While change is inevitable in software systems, it is also risky and expensive, as careless changes can easily jeopardize the whole system. It is a challenging task for developers and maintainers to keep the software evolving while still maintaining the overall stability and coherence of the system. To achieve this goal, software engineers have to learn from history. By studying how successfully maintained software systems have evolved in the past, researchers can find answers to questions such as "why and when changes are made", "how changes should be managed", and "what the consequences and implications of changes are to continued software development". Software Evolution, one of the emerging disciplines of software engineering, studies the change patterns of software systems, explores the underlying mechanisms that affect software changes, and provides guidelines for better software evolution processes.

To discover the evolution patterns of past software systems, we need practical browsing and analysis tools that can guide users in navigating through software evolutionary histories. Browsing tools help us to visualize what changes happened to the software system in the past. Analysis tools can aid in discovering "undocumented" changes, helping us to find out why such changes happened.

In this paper, we describe an approach towards efficient navigation and visualization of the evolution histories of software architectures. We also introduce several methods to track and analyze software structural changes across past releases. The evolution history of a real-world software system, the GCC compiler suite, is used to demonstrate the effectiveness of our approach.
2 Challenges to Software Evolution Research

Many studies on software evolution emphasize the statistical changes of the software system by analyzing its evolution metrics [15, 16, 8, 6, 20, 1, 9, 17, 4, 22]. Apart from some visualization tools [11, 13, 7], little work has been done to help understand the nature of the evolution of software architecture.

Performing a long-ranging and detailed evolutionary case study of a software system presents several difficult technical problems. Some previous studies have examined only a small number of the many versions of the candidate system, while in other cases the time period studied has been relatively short, which makes it difficult to generalize the results. The enormous amount of work required by a large-scale empirical study makes it almost impossible without the application of dedicated tools and an integrated environment, such as the SoftwareChange environment [4]. Strong tool and environment support has proven to be a key factor in conducting a successful empirical study of software evolution.

We believe there are three major challenges that must be overcome in software evolution research. These obstacles limit our ability to understand the history of software systems through effective empirical study, and thus prevent us from generalizing our observations into a theory of software evolution.

The first challenge is how to organize the enormous amount of historical data in a way that allows researchers to access it quickly and easily. Software systems with a long development history generate many types of artifacts. We need to determine which artifacts should be collected as the data source for software evolution analysis.

The second challenge is how to incorporate the different research techniques of software evolution into one integrated platform. We have reviewed several models that are based on software evolution metrics, and visualization techniques that display software history in graphical diagrams. Evolution metrics are precise, flexible, and can be used for statistical analysis. Visualization diagrams provide an overview of the evolution history and have visual appeal to users. When used together, they are valuable tools for software evolution study.

The third challenge is how to analyze the structural changes of software systems. Traditional name-based comparison techniques are not effective when the software system adopts a different naming scheme for program modules or a changed source directory structure. New research methods must be explored to solve this challenging problem.
3 Discussion of Methodologies

In this section, we introduce an approach that attempts to answer the three challenges listed above. Our approach includes a web-based research platform that integrates several essential techniques for studying software evolution, and a novel approach for analyzing software structural changes. We first discuss how the data are selected and stored in the platform. Then we discuss how the various analysis methods can be integrated in the platform, and how to apply them to solve problems in evolution research.

3.1 History Data Repository

3.1.1 Data Source

As a software system evolves, the various activities related to its evolution produce many new and changed artifacts. The archives of each of these artifacts reveal one or more aspects of the software's evolution. These artifacts include:

• Program source code, Makefiles, compiled binary libraries, and executables. These artifacts are the main products of software development activities.

• Artifacts related to the requirements specification and architecture design. They include feasibility studies, functional and non-functional requirements specifications, user manuals, architecture design documents, user interface mockups, and prototype implementations.

• Artifacts related to testing and maintenance activities. They include defect reports, change logs, new feature requests, test suites, and automated quality assurance (QA) tools.

In this paper, we have chosen to focus on the evolution archives of Open Source Software (OSS) systems. The reason is that most OSS projects maintain complete archives of program source code and version control databases for past releases on their FTP sites, free of charge. We also have more freedom in our research, without getting into the complicated copyright or confidentiality issues that arise with commercial systems. The ad hoc nature of the OSS development process often makes it difficult to find archived documentation (other than the source code itself) that describes in detail all the major changes made to the software architecture in the past. Fortunately, OSS projects usually maintain a complete source code base for all past releases. Furthermore, there exist many software reverse engineering techniques that can extract and rebuild some of the software architecture information from the source code. As a result, we selected source code as the primary data source for studying the evolution of software architecture; other program artifacts, including the version control database, are used as complementary sources.

3.1.2 Software Architecture Model

In this paper, software architecture refers to the structure of a software system, emphasizing the organization of the components that make up the system and the relationships between these components. We apply reverse engineering techniques to the program source code to extract the most basic architecture facts, including program components and their relationships, and then recreate the high-level software architecture using fact abstractors and relational calculators.

Depending on the abstraction level, we have four architecture models that describe the structure of the software system using components and their relationships. Each model describes the system structure at a different level of abstraction. By modeling the software system at several abstraction levels, researchers can not only study the overall organization of the system, but are also able to "drill down" into a high-level component to further examine its internal structure. The four architecture models are [24]:
1. Entity-Level model — This model describes the data and control flow dependencies between basic program entities, such as functions, non-local variables, types, and macros. It also describes the containment relations between these low-level program entities and their containing entities, which are program files.

2. File-Level model — This model describes the control flow and data flow dependencies between program files or modules. These higher-level entities and relations are "lifted up" from those in the entity-level model using relational calculus. There are ten dependency types in this model to describe the different kinds of relationships between program files.

3. High-Level model — This model also describes the dependencies between program files or modules. However, the dependencies are abstractions of those presented in the file-level model. Related dependencies are grouped into three basic relation groups: function calls, data references, and implementation relations between header files and implementation files.

4. Architecture-Level model — This model describes the software architecture at the highest abstraction level. Program entities at this level are mainly subsystems and files. A subsystem is a group of related files or lower-level subsystems that implements a major unit of functionality of the system. The process of creating a subsystem decomposition is mainly performed manually, with assistance from the source directory structure, filename conventions, and automatic module clustering tools. The relations between subsystems are described by the same three basic relation types as in the high-level model.

3.1.3 Evolution Metrics

Code-based evolution metrics provide valuable information for studying the evolution attributes of individual program entities. The metrics we selected include basic metrics such as lines of code, lines of comments, cyclomatic complexity, code nesting, fan-in, fan-out, global variable accesses and updates, number of function parameters, number of local variables, and number of input/output statements, as well as composite metrics including S-complexity, D-complexity, the Albrecht metric, and the Kafura metric. The details of these evolution metrics and their role in analyzing software architecture evolution are discussed in section 3.3.1.

In addition to the architecture facts and evolution metrics extracted from program source code, we need data that provides extra information about each past release. This information includes the release date and the full version number.

3.2 Navigation of Evolution Information

3.2.1 Incorporating Evolution Metrics with Software Visualization

Previous work on incorporating software metrics with visualization techniques for program comprehension has been discussed by Demeyer et al. [3] and Systä et al. [21]. Demeyer et al. proposed a hybrid reverse engineering model based on the combination of graph visualization and metrics. In their model, every node in a two-dimensional graph is rendered with several metrics at the same time; the values of the selected metrics are represented by the size, position, and color of the node. Systä et al. have developed a reverse engineering environment called Shimba for program understanding. Shimba uses the reverse engineering tools Rigi and SCED to analyze and then visualize the static structure and dynamic behavior of a software system.

Both approaches have proved effective in program comprehension by combining the immediate appeal of software visualization with the scalability and precision of metrics. We propose to adopt a similar approach in software evolution research, by creating an integrated platform that combines evolution metrics, program visualization, software structural analysis, and source navigation capability in one environment.

The platform should provide at least two windows when showing the evolution information of software systems. The first window shows a visualization that models the history of the whole software system, or of selected program entities or relations. When the user needs more detail about a particular program entity, (s)he can click on the graphical element that models the entity, and a second window will be shown. This window contains the history of the evolution metric measurements for the program entities of interest.
3.2.2 Comparing Differences between Releases

There are two common approaches to visualizing software evolution. The first approach attempts to show the evolution information in one graph covering all the history releases, such as Gall's colored 3D evolution graph [7]. The other approach shows the architectural differences between two releases, as seen in the GASE [11] and KAC [13] systems.

Our method is to display both types of evolution visualization graphs at the same time. First, we provide a tree-like diagram that shows the system structure of one of the releases included in the comparison, usually the most recent one. We call it the structure diagram. The structure diagram models the system hierarchy as a tree with branches and leaves. The "branches" of the tree represent subsystems and program modules. The "leaves" of the tree represent the functions defined in the program modules. The user can click on a "branch" (a subsystem) to expand it to show the lower-level "branches" (modules), and further the "leaves" (functions). We also use colors and saturation to model the evolution status of each entity in the "tree". Red, green, blue, and white are used to represent "new", "changed", "deleted", and "unchanged" status respectively. To differentiate program entities that are all "new" (added to the system after the first version in the comparison was released), different levels of red are used to represent their relative "ages". An entity in vivid red came into the system most recently, while darker red means the entity has been in the system for many releases. With the help of the tree diagram and this color scheme, we can model the evolution of the system over several releases in a single graph, as shown in figure 1.

Figure 1. Expandable System Structure

Another diagram is shown next to the tree diagram: it is designed to display and navigate the differences between two releases. We call it the dependency diagram. The dependency diagram is based on the landscape viewer used in the PBS tools, and extends its schema by adding evolution-related entities and relations.

If a user selects a group of releases and wants to visualize the change history, the tree graph usually shows the program structure of the most recent release in the group, with different colors representing the different evolution status of the program entities inside it. The software landscape graph shows the differences between the architectures of the earliest release and the most recent release, especially the structural differences, between the two releases, of the program entity selected in the tree diagram. By showing the two types of visualization graphs together, the user can examine the evolution from many different perspectives, and navigate from one diagram directly into the other. Figure 4 shows a prototype implementation of the ideas discussed above.
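The status-and-age coloring described above amounts to a simple mapping from an entity's diff status and relative age to a display color. A minimal sketch follows; the RGB values and the helper name are our own illustrations, not BEAGLE's actual choices:

```python
# Sketch: map an entity's evolution status and relative age to a display
# color, as in the structure diagram. The RGB values are illustrative.
STATUS_COLORS = {"changed": "green", "deleted": "blue", "unchanged": "white"}

def entity_color(status, age, max_age):
    """'new' entities get a shade of red: vivid if recent, darker if old."""
    if status != "new":
        return STATUS_COLORS[status]
    # Scale the red channel: age 0 (just added) -> vivid, oldest -> dim.
    intensity = 255 - int(155 * age / max_age) if max_age else 255
    return (intensity, 0, 0)

print(entity_color("new", 0, 5))  # (255, 0, 0): came in most recently
print(entity_color("new", 5, 5))  # (100, 0, 0): present for many releases
```

With this mapping, a single rendering pass over the structure tree reproduces the multi-release view of figure 1.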
3.3 Analysis of Software Structural Changes

The method we have developed to analyze software structural change is called "origin analysis". We use it to find the possible origin of a function or file that appears to be new to a later release of the software system, if it existed previously within the system in another location. Many re-architecting (high-level changes to the software architecture) and refactoring [5] (low-level modifications to the program structure) activities involve reorganizing the program source code by relocating functions or files to other locations, with little change actually made to the program entities themselves. Meanwhile, their names may also be changed to reflect a new naming scheme. As a result, many entities that appear to have been added to the newer release of the system are actually old entities in new locations and/or with new names.

We define "origin analysis" as the practice of relating program entities from an earlier release with the apparently new entities in later releases. With origin analysis, the transition from the previous program source structure to the new one can be better understood, because we are able to unveil many hidden dependencies between the two architectures.

In the next sections, we introduce two techniques that we have developed to implement origin analysis. The first technique is called Bertillonage analysis. It uses code features to match similar program entities from different releases. The other technique is called dependency analysis. It examines changes in the relationships between a selected program entity and the entities that depend on it to find a possible match.

3.3.1 Bertillonage Analysis

Bertillonage analysis was originally used by police departments in France in the 1800s to attempt to uniquely identify criminals by taking measurements of various body parts such as thumb length, arm length, and head size. The approach predates the use of fingerprints or DNA analysis as the primary forensic technique. We borrow the term to describe our approach of measuring the similarity between new functions identified in a later release and the functions missing from the previous release, in the hope of finding a positive match so that we can declare that a "new" function has an "origin" in the previous release. We use the term "Bertillonage" because it is an approximate technique. Unlike more advanced techniques such as fingerprinting and DNA analysis, which require more effort and take longer to conduct, Bertillonage is able to identify a small group of "suspects" quickly and easily from a population of tens of thousands of candidates. We can then use other, more advanced techniques that require more computing power, or sometimes even common sense, to pick out the real "suspect" from this much smaller population.

This approach was first used in clone detection, where the goal is to discover similar code segments within the same software release. We extend its application to software evolution, where we try to match similar functions from different releases in order to analyze structural changes. A "Bertillonage" is a group of program metrics that represent the characteristics of a code segment. Kontogiannis proposes using five standard software metrics to classify and represent a code fragment: S-complexity, D-complexity, cyclomatic complexity, Albrecht, and Kafura [14].

We have pre-computed and stored these five measurements for every function in every release of the system under consideration. Any two functions from consecutive releases with the closest distance between their measurement vectors in a 5-D space are potential candidates for a match. The rationale is that if a new function defined in the later release is not newly written, but rather an old function relocated from another part of the system in the previous release, then the "new" function and the "old" function should share similar metric measurements, and thus should have the closest Euclidean distance between their Bertillonage measurements. The exact matching algorithm is described as follows:
1. As the result of an architectural comparison, a function in the reference release is identified as "new", meaning that no function with the same name in the same file exists in the previous release.

2. Compile a "disappeared" list containing the functions that existed in the immediately previous release, but do not exist in the current release.

3. Match the Bertillonage measurement vector of the "new" function against that of every function in the "disappeared" list, and sort the Euclidean distances in ascending order.

4. Select the five best matches.

5. Among the five best matches, compare the function names with that of the "new" function being matched. Choose the one whose function name is most similar to the "new" function's name.

The last step of comparing function names works as a filter to discard mismatched functions, since it is possible for two unrelated functions to happen to have very similar Bertillonage measurements. Here is an example from the GCC case study that illustrates why it is necessary. In GCC 2.0, there is a new function build_binary_op_nodefault defined in file cp-typeck.c in subsystem Semantic Analyzer. When applying Bertillonage analysis, we get the following five best matches:

1. combine in fold-const.c: d=1005745.47

2. recog_4 in insn-recog.c: d=2496769.23

3. recog_5 in insn-recog.c: d=7294066.05

4. fprop in hard-params.c: d=8444858.78

5. build_binary_op_nodefault in c-typeck.c: d=8928753.44

The obvious choice is match number 5, which has exactly the same function name as the "new" function; the only difference between the two functions is the files in which they are defined. However, they do not have the closest distance: matches 1 to 4 are all closer to the "new" function than the correct "origin" function. The explanation could be that this function changed its internal structure (control flow and data flow) somewhat in v2.0, so it measures as distant in the 5-D vector space. However, since the two functions are expected to implement the same functionality in both releases, we can still pick them up with the Bertillonage matching algorithm enhanced with the function name filter.
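The matching algorithm can be sketched directly. In this sketch the 2-D metric vectors are invented for illustration (BEAGLE uses the five Kontogiannis metrics), and difflib's similarity ratio stands in for whatever name-similarity measure the tool actually uses:

```python
import difflib
import math

def bertillonage_match(new_func, disappeared, k=5):
    """Rank 'disappeared' functions by Euclidean distance between metric
    vectors, then choose the best name match among the top k."""
    name, vec = new_func
    ranked = sorted(disappeared, key=lambda f: math.dist(vec, f[1]))[:k]
    return max(ranked,
               key=lambda f: difflib.SequenceMatcher(None, name, f[0]).ratio())

# Invented 2-D metric vectors; BEAGLE stores 5-D vectors of S-complexity,
# D-complexity, cyclomatic complexity, Albrecht, and Kafura.
new = ("build_binary_op_nodefault", (10.0, 4.0))
gone = [("combine", (10.5, 4.1)),
        ("recog_4", (11.0, 3.0)),
        ("build_binary_op_nodefault", (14.0, 7.0))]
print(bertillonage_match(new, gone)[0])  # build_binary_op_nodefault
```

As in the GCC example, the nearest vector ("combine") loses to a more distant candidate once the name filter is applied.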
Sort their Euclidean distance building material to the company, must update its customer in ascending order. database by deleting the old shipping address in Toronto, and then adding a shipping entry to reflect the new address 4. Select the five best matches. in Waterloo. The customer, for example, Office Depot, also needs to update their supplier database to delete the old 5. Among the five best matches, compare their function Toronto address and update it with the new Waterloo ad- name with the “new” function being matched. Choose dress. If we do not know the fact that the new office fur- the one whose function name is the most similar to the niture company that just registered with City of Waterloo “new” function. is actually the old company with many years of operation history in Toronto, we can compare the changes of the cus- Release tomer database of its suppliers, and the supplier database of G( ) v1.0 its customers to discover this move. The same type of analysis can also be used for analyzing B( ) software architectural changes. In this case, we are trying F( ) to identify a particular change pattern on call dependency. E( ) There is a description of how the dependency analysis is N( ) performed to track function movements:

1. Identify the “new” function in the reference release. Release G( ) v2.0 2. Analyze the caller functions: D( ) B( ) (a) Find all the caller functions of this “new” func- A( )

tion. E( ) C( ) (b) For every caller function that also exists in the N( ) previous release, compare the differences of the function lists that it calls in both releases. Se- lect those functions that were being called in the previous release, but no more in the reference re- lease. Figure 2. Call-Relation Change Analysis (c) Any functions that are selected more than once are candidates for the origin of the “new” func- tion. • Callee Analysis: Function A calls function D and E in v2.0. Be-cause D was not in v1.0, we only need 3. Analyze the callee functions: to study E: E used to be called by F and N in v1.0, (a) Find all the functions that this “new” functions but it is called by A and N in v2.0. The difference calls in the reference release. is function F again, which agrees with the result from caller analysis. (b) For every callee function that also appear in the previous release, compare the difference of the After applying both caller analysis and callee analysis, list of functions that call it in both releases. Select we believe that the “new” function A in v2.0 has very close those functions that were calling it in the previous tie with an “old” function in v1.0, if they are not the same release, but no more in the reference release. function at all. (c) Any functions that are selected more than once are candidates for the origin of the “new” func- 4 BEAGLE: An Integrated Environment tion.

4 BEAGLE: An Integrated Environment

To validate the research techniques we have just discussed, we have built a research platform called BEAGLE¹ that integrates several research methods for studying software evolution, including the use of evolution metrics, program visualization, and origin analysis for structural changes.

¹ BEAGLE is named after the British naval vessel on which Charles Darwin served as a naturalist on an around-the-world voyage. During that historic voyage, Darwin collected many specimens and made valuable observations, which eventually provided him with the essential materials to develop the theory of evolution by natural selection. We hope that our tool will also prove to be a useful vehicle for exploring evolution.

BEAGLE has a distributed architecture that resembles a classic three-tier architecture. At the backend, the evolution data repository stores the history information of the software system. The data repository, together with the query-processing interface, forms the database tier. In the logic tier, the comparison engine retrieves information from the database tier and compares the differences between the selected releases from various perspectives. The origin analysis component performs the task of revealing the hidden relations between the program structures of different releases. The visualization component generates the graphical representation of the software evolution data. The components in the logic tier receive user queries and send back query results through the user interface application running on the clients' machines, which forms the user tier. Users can also navigate the evolution data using tools from this tier.

4.1 Database Tier

Like many information retrieval systems, BEAGLE is supported by a data repository implemented as a relational database. In the database, the software architecture information of past releases, as well as the metrics that describe the attributes of program entities, are stored and organized according to a star schema, which is described below. Functional components in the logic tier access the information stored in the data repository through a query interface. In BEAGLE, the query interfaces are written in SQL, the standard relational database query language.
4.1.1 Data Repository Schema

In the repository, the relational tables are organized according to a star schema. The star schema is a popular data model in data warehouse systems and multi-dimensional database systems. In a star schema, tables are arranged as follows: a central "fact" table is connected to a set of "dimension" tables, one per dimension. The name "star" comes from the usual diagrammatic depiction of this schema, with the fact table in the center and each dimension table shown surrounding it [23].

The BEAGLE data repository has four fact tables. They model the system structure and the relations between program entities at various abstraction levels. The four levels of abstraction are: entity, file, high, and architecture. Each level of architecture fact is stored in its own table for all the history releases. Aside from the different abstraction levels, all four fact tables have very similar structure.

1. Entity-Level Facts — This table stores the lowest level of architecture information that we model in BEAGLE: the entity-level facts introduced in section 3.1.2. We have used the source code extractor cfx to pull out this information from the source code in our examples.

2. File-Level Facts — This table stores the file-level facts. They are induced from the entity-level facts using relational calculus formulas defined as Grok scripts in PBS [10].

3. High-Level Facts — This table stores the high-level facts. Even though the main entities modeled in this table are still files, the relations between files are a set of higher-level relations (useproc, usevar, and implementby) that are merged from the intermediate relations modeled by the file-level facts. We call these facts high-level to differentiate them from the file-level facts.

4. Architecture-Level Facts — The architecture-level fact table contains not only relations between program files, but also higher-level architecture facts between files and subsystems and between subsystems, as well as containment relations between files, low-level subsystems, and high-level subsystems.

Surrounding the fact tables, there are six dimension tables. They provide additional information about the entities and dependencies modeled in the fact tables:

1. The version number table stores the breakdown of the version number of each history release. For example, GCC 2.7.2.3 is broken down into major release 2, minor release 7, major bug-fix release 2, and minor bug-fix release 3. The series column is used to distinguish between the stable release stream and the experimental release stream: in the GCC project, the GCC series is reserved for production releases, and the EGCS series for experimental releases.

2. The release date table stores the release date of each history release. It includes three columns: year, month, and day. The release date is used to calculate the time interval between consecutive releases, which we use as a rough indicator of development effort.

3. The entity attribute table maps the names of the entities stored in the fact tables to integer values to save storage space and improve comparison performance. Applications can easily retrieve the real name of a program entity by doing a lookup on this table.

4. The configuration attribute table extends the configuration column in the fact tables. Many software systems support flexible build configurations. For example, GCC supports C, C++, Objective C, Chill, Fortran, and Java, and provides users with an option to choose which compilers are included in the build. In our case study, we build each release of GCC with two build options: CONLY for building a C-only compiler, and ALL for building the GCC compiler suite with all supported programming languages.

5. The function complexity table contains a selection of code metric measurements targeted at the function level. The measured metrics include LOC, McCabe's cyclomatic complexity, fan-in, and fan-out. We also pre-compute and store four composite metrics: S-complexity, D-complexity, Albrecht, and Kafura [14]. We use this metric information as a kind of "fingerprint" for the functions in origin analysis.

6. The file complexity table contains a set of metrics at the file level. Most metrics included in this table are basic complexity metrics. The last metric, the maintenance index, measures the maintainability of a program source file, as introduced in [18].
Entity-Level Facts — This table stores the lowest level ration column in fact tables. Many software systems of architecture information that we model in BEA- support flexible building configurations. For example, GLE: the entity level facts introduced in section3.1.2. GCC supports C, C++, Objective C, Chill, Fortran, and We have used the source code extractor cfx to pull Java. It provides users an option to choose which com- out such information from the source code in our ex- piler to be included in the build. In our case study, amples. we build each release of GCC with two build options: CONLY for building a C only compiler, and ALL for 2. File-Level Facts — This table stores the entity level building GCC compiler suite with all supported pro- facts. They are induced from entity-level facts using gramming languages. 5. The function complexity table contains a select of code Function File LOC KAFURA sets_function_arg_p combine.c 27 12 metric measurements targeted at the function level. merge_assigned_reloads reload1.c 45 40 Measured metrics include LOC, McCabe’s cyclomatic reload_cse_check_clobber reload1.c 8 4 reload_cse_invalidate_mem reload1.c 23 12 complexity, fan-in and fan-out. We also pre-compute reload_cse_invalidate_regno reload1.c 55 66 and store four composite metrics: S-Complexity, D- reload_cse_invalidate_rtx reload1.c 15 21 Complexity, Albrecht, and Kafura [14]. We will use reload_cse_mem_conflict_p reload1.c 49 27 reload_cse_noop_set_p reload1.c 63 120 this metric information to act as a kind of “fingerprint” for the functions in “origin analysis”. Table 1. Result of the example query 6. The file complexity table contains a set of metrics at the file level. Most metrics included in this table are basic complexity metrics. The last metric, maintenance in- 4.2 Application Logic Tier dex, measures the maintainability of a program source file as introduced in [18]. 
The core functionalities of BEAGLE are provided by components in the application logic tier. They are version comparison engine, origin analysis component and evolu- 4.1.2 Repository Access Interface and Comparison tion visualization component. Query To access the data stored in BEAGLE repository, we pro- 4.2.1 Version Comparison and Evolution Visualization vide a query interface in the data tier so that all the func- tional components in the logic tier can query and “slice and In BEAGLE, we adopt a novel approach to visualize the dice” the collected history data for navigation and analysis difference between various releases. Figure 4 shows the purposes. The query is implemented as SQL statements, screen shot of BEAGLE visualizing the architecture differ- because SQL is a powerful query language that is able to ences between GCC version 2.0 and GCC version 2.7.2. express almost all the query we need. The tree structure in the left panel of the window shows For example, we want to find all the functions that were the system structure of GCC version 2.7.2. Items shown newly defined in version v2 (i.e., were not present in ver- in folder icons are subsystems. It contains files, which is sion v1). We also want to find out all the files in which new shown in Document icon. Under file, there are items that functions are defined, as well as the LOC and Kafura met- represent functions defined within the source file. Functions rics for all the new functions. Here is the the SQL statement are shown in block icons. User can click on an icon, and that implements this query: the system structure tree will automatically expand to show entities under the selected subsystem or file. 
SELECT Func_Name.entity_string AS Function, In BEAGLE’s evolution visualization, colors are used File_Name.entity_string AS File, Metrics.line_of_code AS LOC, extensively to model the evolution status of individual pro- Metrics.Kafura AS Kafura gram entities: FROM Entity_Attribute AS Func_Name, Entity_Attribute AS File_Name, • Function_metrics AS Metrics Red represents entities that are “new” to the release. WHERE Func_Name.entity_id = Metrics.function_id Since we chose to visualize the architecture differences and File_Name.entity_id = Metrics.file_id and Metrics.release_key = v2 between GCC v2.0 and v2.7.2, any entities including and (Metrics.function_id, Metrics.file_id) IN ( subsystems, files, or functions in v2.7.2, but were not SELECT function_id, file_id FROM Function_Metrics in v2.0 are treated as “new”, thus are tagged with red WHERE release_key = v2 icons. EXCEPT SELECT function_id, file_id • FROM Function_Metrics Blue indicates program entities that were originally in WHERE release_key = v1 ) v2.0, but are missing from v2.7.2. • This SQL statement uses two data tables from the reposi- Green indicated that parent-level entities, such as sub- tory: Function Metrics and Entity Attribute. systems and files, contain either “new” entities or have It selects those rows in the Function Metrics table entities deleted from them. If the none of the contained with release key equals to v2, plus condition that the func- entitles ever changed, this will be indicated by white. tion key and file key exits in version v2, but not in version • Cyan icons are for functions that exit in both version v1. Then it refers to the Entity Attribute table to 2.0 and version 2.7.2. convert the integer key back to entity name string. Table 1 shows a section of the output of the above query. For program entities that are “new” to GCC version We are comparing GCC 2.7.2.3 and GCC 2.8.0. 
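The color-coding rules above amount to a set-membership comparison between two releases. The following Python sketch is an illustration only, not BEAGLE's implementation; the function names in the sample sets are hypothetical:

```python
def evolution_color(entity, old_release, new_release):
    """Classify a leaf entity (e.g., a function) by BEAGLE's color scheme:
    red = new in the later release, blue = deleted since the earlier one,
    cyan = present in both releases."""
    in_old, in_new = entity in old_release, entity in new_release
    if in_new and not in_old:
        return "red"
    if in_old and not in_new:
        return "blue"
    if in_old and in_new:
        return "cyan"
    return "absent"  # entity appears in neither release

def parent_color(children, old_release, new_release):
    """Parent entities (files, subsystems) are green if any contained
    entity was added or deleted, and white if none changed."""
    colors = {evolution_color(c, old_release, new_release) for c in children}
    return "green" if colors & {"red", "blue"} else "white"

# Hypothetical function sets for two releases:
v2_0 = {"yylex", "yyparse", "check_newline"}
v2_7_2 = {"yylex", "yyparse", "handle_pragma"}

print(evolution_color("handle_pragma", v2_0, v2_7_2))  # red
print(evolution_color("check_newline", v2_0, v2_7_2))  # blue
print(parent_color(v2_0 | v2_7_2, v2_0, v2_7_2))       # green
```

The same membership test drives both the icon colors and the green/white roll-up to parent nodes in the tree view.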
For program entities that are "new" to GCC version 2.7.2, reds with various levels of saturation are used to differentiate their "tenure" within the system. An entity in vivid red came into the system relatively late, while darker red means that the entity has been in the system for several releases.

At the bottom left of Figure 4 we can see three new files under the "Scanner" subsystem. c-pragma.c first appeared in GCC at version 2.3.3; it is the oldest of the three files, so its red is the darkest. c-pragma.h first appears at version 2.7.2, which makes it the youngest, so its color is a very fresh red. The file cp/Input.c was first seen at version 2.6.3, later than c-pragma.c but earlier than c-pragma.h; as a result, its icon has a red color with saturation somewhere in the middle.

The frame on the right side of Figure 4 shows another style of software evolution visualization. It is based on the landscape viewer used in PBS, and it extends PBS's schema by adding evolution-related entities and relations. Six new entities are added to model new, deleted, and changed subsystems, and new, deleted, and changed files. There are also six new relations: new call, deleted call, new reference, deleted reference, new implemented-by, and deleted implemented-by.

4.2.2 Origin Analysis

The origin analysis component applies both Bertillonage analysis and dependency analysis: it examines every "new" function in the selected release against the immediately preceding release to find its "origin", and examines every "deleted" function against the immediately following release to find its "destination". In some cases, source files are moved to new locations in later releases, usually to new directories, as a maintenance effort to reorganize the source directory structure. In other cases, related source files are given a common prefix or suffix in their file names to make their responsibilities in the system easier to understand. Even though the file content does not change, many files will have a new name after the new naming scheme is adopted.

These changes to file paths and file names make traditional architectural comparison tools such as GASE and KAC ineffective, because such tools treat a file with a different path or name as a different file; the result is that many spurious "new" files are identified in the newer release. Our solution to this kind of confusion is to apply Bertillonage analysis to every function defined in the "new" file. If the majority of the functions have "origin" functions that come from the same file in the previous release, we can infer that this file is the "origin" file of the selected "new" file. Another solution is to perform call dependency analysis at the file level. Instead of checking the "callee list" changes of caller functions and the "caller list" changes of callee functions, as in the call dependency analysis performed at the function level, we examine the "callee list" changes of the files that have call dependencies on this "new" file, and the "caller list" changes of the files that this "new" file has call dependencies on. The result is the potential "origin" file for the selected "new" file.

4.3 User Tier

Users interact with BEAGLE through the user tier components. These components handle user input and submit queries to the logic tier, then organize and display the results on the screen. Here we use a simple example to illustrate the interaction between the BEAGLE user interface and a user; the software system under investigation is the GNU C Compiler.

Figure 3. Query Interface

Initially, a list of the history releases of GCC is displayed in a web page, along with a short description of each release, such as the full release number and release date. A user can select any two releases, or a group of consecutive releases over a period, and then request an architecture comparison, as shown in Figure 3. The user interface component responds to the user's request by sending a message to the version comparison engine in the logic tier. When the comparison is finished, the results are passed to the evolution visualization component, where the differences between the two software architectures are converted to graphical diagrams, along with other detailed change information about individual program entities. Finally, the diagrams and other attributes are sent back to the landscape viewer component in the user tier for display and further navigation, as shown in Figure 4.

More discussion of the design and implementation details of BEAGLE can be found in one of the authors' master's theses [25].

4.4 Case Study: From GCC to EGCS

EGCS is an experimental development project started after the release of GCC 2.7.x [12, 2] to provide better support for the new 1998 ANSI C++ standard. There are many differences between the source code structure of the GCC releases and the EGCS releases. For example, the EGCS releases reorganized the source directory structure and adopted a new naming convention for the source files. These changes make conventional architecture comparison methods, which identify changed and unchanged program entities by comparing their names and directory locations in the two releases, no longer applicable. With origin analysis we can analyze whether a particular function in an EGCS release has a correspondent in the GCC release, or whether it was newly written for the EGCS project. We can also perform origin analysis at the file level. This examines every function defined in a given file, then counts how many of these functions already existed in the previous release (GCC 2.7.2.3 in this example) and how many are new in EGCS 1.0. If the majority of the functions came from a single file in GCC 2.7.2.3, we can conclude that this new file in EGCS 1.0 is inherited from that file in GCC 2.7.2.3.

Figure 5 shows the result of a sample origin analysis request on the file gcc/c-decl.c. Of the 70 functions defined in this file, 41 can be traced back to origin functions defined in the previous GCC release using Bertillonage analysis. Without exception, all of the origin functions were defined in the file c-decl.c of GCC 2.7.2.3. Starting from EGCS 1.0, many source files that directly contribute to the building of the C/C++ compiler were moved to a new subdirectory called /gcc.

To analyze architecture change of this magnitude (from GCC to EGCS), Bertillonage analysis has proven more effective than dependency analysis. Dependency analysis assumes that when we analyze a "new" function, its callers and callees in both the current release and the previous release are relatively stable; that is, most of the functions that have dependencies on this particular function should not themselves have been renamed or relocated in the newer release. This is not the case here: EGCS 1.0 adopts a completely different software architecture from that of GCC 2.7.2.3, and most of the files and functions are either renamed or relocated in the directory structure.

In our case study, we performed origin analysis on every source file of EGCS 1.0, attempting to locate possible "origins" in its immediately preceding release, GCC 2.7.2.3. The goal is to understand what portion of the old GCC architecture was carried over to EGCS, and what portion of the EGCS architecture represents new design. This test took 3 days to run on a dual Pentium III 1 GHz workstation. Here we present the results for two representative subsystems of GCC: the Parser and the Code Generator. One is from the compiler front-end and the other from the back-end; both are essential to the EGCS software architecture, so their evolution story is representative of the entire EGCS system.

There are 30 files in the parser subsystem. Half of them are header files or very short C files that define macros, and we do not consider these files in the analysis. Of the remaining 15 files, three are considered old GCC files carried over from v2.7.2.3. We say a file is "old" if more than 2/3 of the functions defined in it have "origins" in the previous release; conversely, a "new" file has fewer than 1/3 of its functions carried over. In the parser subsystem there are seven such "new" files; the remaining five files are considered "half-new, half-old". Overall, of the 848 functions defined in the parser subsystem of EGCS 1.0, 460 are considered "new" and 388 are considered "old". The "new" functions account for 56 percent of the total. For a new release of compiler software, this percentage of newly designed code is very high, especially for a subsystem based on such mature techniques as programming language parsing. Table 2 lists the complete results.

File                  Func  New  Old  Type
gcc/c-aux-info.c         9    0    9  Mostly Old
gcc/fold-const.c        44   15   29  Mostly Old
gcc/objc/objc-act.c    167   17  150  Mostly Old
gcc/c-lang.c            16   14    2  Mostly New
gcc/cp/decl2.c          57   50    7  Mostly New
gcc/cp/errfn.c           9    9    0  Mostly New
gcc/cp/except.c         25   20    5  Mostly New
gcc/cp/method.c         30   26    4  Mostly New
gcc/cp/pt.c             59   57    2  Mostly New
gcc/except.c            55   52    3  Mostly New
gcc/c-decl.c            70   29   41  Half-Half
gcc/cp/class.c          61   31   30  Half-Half
gcc/cp/decl.c          134   84   50  Half-Half
gcc/cp/error.c          31   16   15  Half-Half
gcc/cp/search.c         81   40   41  Half-Half

Table 2. Origin analysis - parser

Table 3 lists the results of the origin analysis on the Code Generator subsystem. Of the five files that contain function definitions, three are considered "new" by the analysis and the remaining two are considered "half-new, half-old". Overall, 84 percent of the functions defined in the Code Generator subsystem were newly written, a much higher percentage than in the Parser subsystem. From these results, we conclude that EGCS 1.0 contains a significant portion of newly developed source code compared to the GCC release it was intended to replace, especially in the back-end subsystems.

File               Func  New  Old  Type
gcc/cplus-dem.c      36   36    0  Mostly New
gcc/crtstuff.c        5    5    0  Mostly New
gcc/insn-output.c   107   95   12  Mostly New
gcc/final.c          33   20   13  Half-Half
gcc/regclass.c       20   12    8  Half-Half

Table 3. Origin analysis - code generator
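The file-level classification used in the case study follows the 2/3 and 1/3 thresholds stated above. The following Python sketch assumes the thresholds are applied exactly as stated (the paper does not show BEAGLE's actual implementation); the sample rows are taken from Table 2:

```python
def classify_file(total_funcs, old_funcs):
    """Classify a file by the fraction of its functions that have an
    'origin' in the previous release: above 2/3 it is 'Mostly Old',
    below 1/3 it is 'Mostly New', otherwise 'Half-Half'."""
    if total_funcs == 0:
        raise ValueError("file defines no functions")
    carried_over = old_funcs / total_funcs
    if carried_over > 2 / 3:
        return "Mostly Old"
    if carried_over < 1 / 3:
        return "Mostly New"
    return "Half-Half"

# Rows from Table 2: (file, total functions, functions with origins)
rows = [("gcc/c-aux-info.c", 9, 9),
        ("gcc/c-lang.c", 16, 2),
        ("gcc/c-decl.c", 70, 41)]
for name, total, old in rows:
    print(name, classify_file(total, old))  # matches the Type column
```

For example, gcc/c-decl.c has 41 of 70 functions (about 59 percent) carried over, which falls between the two thresholds and yields "Half-Half".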

Figure 4. Architecture Comparison
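Section 4.2.2's Bertillonage analysis matches a "new" function to a candidate origin by comparing metric "fingerprints" such as LOC, cyclomatic complexity, fan-in, and fan-out, as stored in the function complexity table. The sketch below is an illustration only: the normalized distance measure, the threshold, and the sample fingerprints are our assumptions, not BEAGLE's actual matching algorithm.

```python
def fingerprint_distance(fp_a, fp_b):
    """Average normalized per-metric difference between two fingerprints
    (LOC, cyclomatic complexity, fan-in, fan-out); 0.0 means identical."""
    diffs = [abs(a - b) / max(a, b, 1) for a, b in zip(fp_a, fp_b)]
    return sum(diffs) / len(diffs)

def find_origin(new_fp, old_release, threshold=0.2):
    """Return the old-release function whose fingerprint is closest to
    new_fp, or None when nothing is similar enough to call an origin."""
    best_name, best_dist = None, threshold
    for name, fp in old_release.items():
        d = fingerprint_distance(new_fp, fp)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name

# Hypothetical fingerprints for two functions in the earlier release:
gcc_2723 = {"grokdeclarator": (420, 95, 3, 40),
            "store_parm_decls": (160, 30, 2, 18)}

print(find_origin((415, 93, 3, 41), gcc_2723))  # a near-identical match
print(find_origin((12, 2, 1, 0), gcc_2723))     # no plausible origin
```

The point of the fingerprint is that it survives renaming and relocation: a moved function keeps roughly the same metric profile even when its name and path change.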

Figure 5. Origin Analysis on EGCS 1.0

5 Conclusions

The main contribution of this paper is to demonstrate how we can conduct effective empirical studies of software evolution using an integrated approach, with an emphasis on the evolution of software architecture and the internal structure of program components. We have constructed an integrated platform that automates the history data collection, interpretation, and representation processes. It incorporates different research methods, such as evolution metrics, software visualization, and structural evolution analysis tools, to allow the user to examine the evolution patterns of software systems from various aspects. We have also developed the concept of origin analysis as a set of techniques for reasoning about structural and architectural changes when a newer system release adopts a completely new source code structure or naming scheme.

The results of this paper and the BEAGLE environment we have implemented enable us to conduct large-scale empirical studies of software projects with long and successful histories, expanding our knowledge of software evolution. One possible extension to BEAGLE is to integrate it with a web-based source control system, so that when new code is checked in, the architecture facts are automatically extracted from the new source code and stored in BEAGLE's data repository. The submitter could then view the changes to the system architecture shortly after submitting the new code, so that the impact of the new code on the overall system structure is apparent both to the submitter and to others working on the project.

References

[1] E. Burd and M. Munro. An initial approach towards measuring and characterizing software evolution. In Proc. of the Sixth Working Conference on Reverse Engineering (WCRE'99), 1999.
[2] C. DiBona, S. Ockman, and M. Stone, editors. Open Sources: Voices from the Open Source Revolution. O'Reilly and Associates, Cambridge, MA, U.S.A., 1999. Available at http://www.oreilly.com/catalog/opensources/book/tiemans.html.
[3] S. Demeyer, S. Ducasse, and M. Lanza. A hybrid reverse engineering approach combining metrics and program visualization. In The 6th Working Conference on Reverse Engineering (WCRE'99), 1999.
[4] S. G. Eick, T. L. Graves, A. F. Karr, J. S. Marron, and A. Mockus. Does code decay? Assessing the evidence from change management data. IEEE Transactions on Software Engineering, 27(1), January 2001.
[5] M. Fowler, K. Beck, J. Brant, W. Opdyke, and D. Roberts. Refactoring: Improving the Design of Existing Code. Addison-Wesley, 1999.
[6] H. Gall, M. Jazayeri, R. Kloesch, and G. Trausmuth. Software evolution observations based on product release history. In Proc. of the 1997 Intl. Conf. on Software Maintenance (ICSM'97), 1997.
[7] H. Gall, M. Jazayeri, and C. Riva. Visualizing software release histories: The use of color and third dimension. In Proc. of the IEEE Intl. Conf. on Software Maintenance (ICSM'99), 1999.
[8] M. W. Godfrey and Q. Tu. Evolution in open source software: A case study. In Proc. of the Intl. Conf. on Software Maintenance (ICSM'00), 2000.
[9] T. L. Graves, A. F. Karr, J. S. Marron, and H. Siy. Predicting fault incidence using software change history. IEEE Transactions on Software Engineering, 26(7), July 2000.
[10] R. Holt. PBS: Portable Bookshelf tools. Available at http://swag.uwaterloo.ca/pbs/, 1997.
[11] R. Holt and J. Pak. GASE: Visualizing software evolution-in-the-large. In Proc. of the 3rd Working Conf. on Reverse Engineering (WCRE'96), 1996.
[12] GCC homepage. http://www.gnu.org/software/gcc/.
[13] J. P. D. Keast, M. G. Adams, and M. W. Godfrey. Visualizing architectural evolution. In Proc. of the ICSE'99 Workshop on Software Change and Evolution (SCE'99), 1999.
[14] K. Kontogiannis. Evaluation experiments on the detection of programming patterns using software metrics. In Working Conference on Reverse Engineering (WCRE'97), Amsterdam, Netherlands, 1997.
[15] M. M. Lehman. Programs, life cycles and laws of software evolution. Proc. of the IEEE, Special Issue on Software Engineering, pages 1060-1076, 1980.
[16] M. M. Lehman. Metrics and laws of software evolution - the nineties view. In Proc. Metrics 97 Symp., 1997.
[17] A. Mockus, S. G. Eick, T. L. Graves, and A. F. Karr. On measurement and analysis of software changes. Technical report, Bell Labs, Lucent Technologies, 1999.
[18] S. Muthanna, K. Kontogiannis, K. Ponnambalam, and B. Stacey. A maintainability model for industrial software systems using design level metrics. In Working Conference on Reverse Engineering (WCRE'00), 2000.
[19] D. E. Perry. Dimensions of software evolution. In Proc. of the 1994 Intl. Conf. on Software Maintenance (ICSM'94), 1994.
[20] J. F. Ramil and M. M. Lehman. Metrics of software evolution as effort predictors - a case study. In Proc. of the Intl. Conf. on Software Maintenance (ICSM'00), 2000.
[21] T. Systa, P. Yu, and H. Müller. Analyzing Java software by combining metrics and program visualization. In Conference on Software Maintenance and Reengineering (CSMR'99), 2000.
[22] L. Tahvildari, R. Gregory, and K. Kontogiannis. An approach for measuring software evolution using source code features. In Proc. of the Sixth Asia Pacific Software Engineering Conference (APSEC'99), 1999.
[23] E. Thomsen. OLAP Solutions: Building Multidimensional Information Systems. John Wiley and Sons, New York, U.S.A., 1997.
[24] J. B. Tran. Software architecture repair as a form of preventive maintenance. Master's thesis, University of Waterloo, 1999.
[25] Q. Tu. On navigation and analysis of software architecture evolution. Master's thesis, University of Waterloo, 2002.