Techniques for Software Maintenance

Kostas Kontogiannis
Department of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece

Abstract
Software maintenance constitutes a major phase of the software life cycle. Studies indicate that software maintenance is responsible for a significant percentage of a system’s overall cost and effort. The software engineering community has identified four major types of software maintenance, namely, corrective, perfective, adaptive, and preventive maintenance. Software maintenance can be seen from two major points of view. The first is the classic view, where software maintenance provides the necessary theories, techniques, methodologies, and tools for keeping software systems operational once they have been deployed to their operational environment. Most legacy systems subscribe to this view of software maintenance. The second is a more modern, emerging view, where maintenance is an integral part of the software development process and should be applied from the early stages of the software life cycle. Regardless of the view by which we consider software maintenance, the fact is that it is the driving force behind software evolution, a very important aspect of a software system. This entry provides an in-depth discussion of software maintenance techniques, methodologies, tools, and emerging trends.

Encyclopedia of Software Engineering DOI: 10.1081/E-ESE-120044348. Copyright © 2011 by Taylor & Francis. All rights reserved.

INTRODUCTION

Software maintenance is an integral part of the software life cycle and has been identified as an activity that affects in a major way the overall system cost and effort. It is also a major factor affecting software quality. Software maintenance is defined by a collection of activities that aim to evolve and enhance software systems with the purpose of keeping these systems operational. The field of software maintenance was first discussed in a paper by Canning,[1] where different software maintenance types were implicitly presented. However, it was in a paper by Swanson[2] that the terms and types of software maintenance were first explicitly defined in a typology of maintenance activities.[2]

In the following years, the software community realized the importance of the field, and the Institute of Electrical and Electronics Engineers (IEEE) published two standards in this area. The IEEE standard 610.12-1990 and the updated standard 1219-1998 identify four major types of software maintenance.[3,4] The first type is referred to as Corrective Software Maintenance, where the focus is on techniques, methodologies, and tools that support the identification and correction of faults that appear in software artifacts such as requirements models, design models, and source code. The second type is Perfective Software Maintenance, where the focus is on techniques, methodologies, and tools that support the enhancement of the software system in terms of new functionality. Such enhancement techniques and methodologies can be applied at the requirements, design, or source code levels. The third type of software maintenance is referred to as Adaptive Software Maintenance and refers to activities that aim to modify models and artifacts of existing systems so that these systems can be integrated with new systems or migrated to new operating environments. A fourth type of software maintenance is Preventive Software Maintenance. Preventive Software Maintenance deals with all other design time and development time activities that have the potential to deliver higher-quality software and reduce future maintenance costs and effort.[5] Examples of Preventive Software Maintenance include adhering to well-defined processes, adhering to coding standards, maintaining high-level documentation, or applying software engineering principles properly. In general, preventive maintenance encompasses any type of intention-based activity that makes it possible to forecast upcoming problems and prevent maintenance problems before they occur.[4,6] Preventive maintenance touches upon all the other three types of maintenance and in some respects its boundaries are more difficult to define.[6] Due to the broad boundaries of preventive maintenance, in this entry we will mostly focus on core technical and process issues of the first three types of software maintenance, namely, corrective, adaptive, and perfective maintenance. The interested reader can refer to Refs. [2], [6], and [7] for a more detailed discussion on preventive maintenance.

A number of studies have indicated that software maintenance consumes a substantial portion of resources within the software industry. A study by Sutherland[8] estimated that the annual cost of software maintenance in the United States is more than $70 billion for a total of approximately 10 billion lines of code. Considering that it is estimated that there are more than 100 billion lines of code around the world,[9] the annual worldwide cost of software maintenance is estimated at the staggering amount of $700 billion. Furthermore, there have been studies estimating the ratio of software maintenance cost vs. total development cost. These studies indicate that software maintenance costs account for at least 50% of the total development cost of a software system.[9] Other studies[10,11] have estimated that maintenance costs can even range between 50% and 75% of the total development cost. Fig. 1 illustrates the proportional cost for each type of software maintenance with respect to the total maintenance cost.

Fig. 1 Proportional cost per maintenance type.

Software maintenance is the primary process for achieving software evolution. With accumulated experience over the years, a collection of rules and observations was formulated by M. Lehman[12–14] into what is known as the laws of software evolution. These laws relate to observations regarding continuous change, increased complexity, self-regulation, conservation of organizational stability and familiarity, continued growth, and quality degradation. The laws of evolution focus on the observation that in order for large systems to remain operational they must constantly be maintained, and that an unfortunate consequence of continuous maintenance is system quality degradation where systems become complex, brittle, and less maintainable. This phenomenon is referred to as software erosion and software entropy. There is a point when a system reaches a state where regular maintenance activities become very costly or difficult to apply. At that point the system must be considered for reengineering, migration, reimplementation, replacement, or retirement.

In this respect, software maintenance has been traditionally considered as an activity that is applied on the source code of the system and only after the system became operational. However, more recent views consider that software maintenance is an activity that can be applied in all phases of the software life cycle and to a variety of software artifacts. Therefore, maintenance nowadays is not considered as a postdevelopment activity but rather as an activity that is also applied during Greenfield software development.[15] This view also originates from the concepts of iterative, incremental, and [...] models as well as Model Driven Engineering (MDE)[16] that postulate software system development as an incremental process whereby requirements, design, source code, and test models are continuously updated and evolved.

As with every engineering activity, software maintenance must follow a specific prescribed process.[17] A complete maintenance path encompasses the identification, selection, and streamlining of software analysis and software reverse engineering, software artifacts transformation, and software integration.[18] Here, we will attempt to present a unified process description for software maintenance that includes four major phases, namely, portfolio analysis/strategy; modeling/analysis; transformation; and evaluation.

In the portfolio analysis/strategy phase, the issues and problems of the system in its current form are identified. In the modeling/analysis phase, software artifacts are denoted and analyzed so that maintenance requirements can be set based on the system’s state and strategy. The modeling/analysis phase allows for complete maintenance paths to be defined and planned. More specifically, in this phase, software artifacts are represented and denoted utilizing a [...] and formalism, and consequently, various models of the existing system [usually models of the source code such as the Abstract Syntax Tree (AST)] are analyzed. The result of this phase is the identification of specific system characteristics that can be used to define maintenance requirements, maintenance objectives, and quantifiable measures for determining whether the results of a maintenance activity, when completed, will meet the initial maintenance requirements and objectives or not.

In the transformation phase, the selected maintenance path is applied utilizing software manipulation and transformation tools.

Finally, in the evaluation phase, measurements are applied for evaluating whether the selected maintenance activities have met the technical and financial requirements set in the analysis phase.

Having briefly introduced software maintenance as a phase in the software life cycle, we can now proceed to discussing specific techniques, methodologies, and tools that support software maintenance. This entry is organized as follows. In the “Software Maintenance Process” section we discuss the software maintenance process. In the “Software Maintenance Techniques” section we discuss key software maintenance techniques, while in the “Tools, Frameworks, and Processes” section we discuss tools and frameworks for software maintenance. In the “Emerging Trends” section we present emerging trends in the area of software maintenance and in the “Concluding Thoughts” section we provide some final thoughts on this subject.
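The worldwide cost estimate quoted in this section can be reproduced with a short calculation. The sketch below is not part of the cited studies; it assumes, as the extrapolation in the text implicitly does, that maintenance cost scales linearly with the number of lines of code, and the variable names are chosen here purely for illustration.

```python
# Figures quoted in the Introduction (both are stated as lower bounds).
US_ANNUAL_COST_USD = 70_000_000_000      # annual U.S. maintenance cost
US_LINES_OF_CODE = 10_000_000_000        # code base covered by that estimate
WORLD_LINES_OF_CODE = 100_000_000_000    # estimated worldwide code base

# Assumption: cost is proportional to code size (linear scaling).
cost_per_line = US_ANNUAL_COST_USD / US_LINES_OF_CODE
world_annual_cost = cost_per_line * WORLD_LINES_OF_CODE

print(f"annual maintenance cost per line: ${cost_per_line:.2f}")
print(f"worldwide annual cost: ${world_annual_cost / 1e9:.0f} billion")
```

Because both inputs are given as "more than" figures, the result is best read as an order-of-magnitude estimate rather than a precise value.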

SOFTWARE MAINTENANCE PROCESS

As an engineering activity, software maintenance should adhere to specific processes. Different research groups and practitioners have considered the problem and have proposed a number of process models for software maintenance.[19] By taking into account the state of the art and practice, we can consider that the software maintenance process encompasses four major phases, namely, portfolio analysis and strategy determination; system modeling and analysis; artifact transformation; and finally, evaluation.

The portfolio analysis and strategy determination phase aims to identify, gather, and evaluate the resources, components, and artifacts that make up a system and consequently assess the current state of the system so that specific maintenance requirements and objectives can be set. Depending on the nature of the problems discovered, the appropriate maintenance strategy and action can then be drafted.

The second phase, system modeling and analysis, aims to denote and represent system resources, components, and artifacts in a specific modeling formalism, and allow for the extraction of important information from the system for the purpose of understanding its structure, dependencies, and characteristics.

The third phase, artifact transformation, aims to apply various maintenance and transformation techniques to achieve the requirements and the objectives set in the portfolio analysis phase.

Finally, the evaluation phase aims to apply techniques to assess whether the maintenance requirements and objectives have been met as the result of maintenance operations as well as to assess the quality characteristics of the new system. These phases are applied incrementally and iteratively and not in a piecemeal sequential manner.[20] In this respect, the portfolio analysis and strategy determination phase feeds results to the system modeling and analysis phase, which in turn produces results that can be fed back to revise or extend the portfolio analysis and strategy determination phase, moving iteratively, incrementally, and gradually through the transformation and evaluation phases. Fig. 2 illustrates a schematic block diagram of the maintenance process.

Fig. 2 Software maintenance process.

Portfolio Analysis and Strategy Determination

The objective of portfolio analysis and strategy determination is to assess the system from a financial, technical, and business perspective. The purpose is to compile an inventory of a system’s physical objects and its dependencies, to construct an operational profile of the system in terms of its delivered functionality, to calculate an estimate of its operational and maintenance cost and effort, to identify various Key Performance Indicators (KPIs), and to collect satisfaction ratings obtained from the users of the software system. The results of this phase can be used to establish maintenance requirements and determine the appropriate strategy that is required such as whether the system will be maintained (enhanced, ported, corrected), redeveloped, retired, or be kept as is. The sections below discuss in more detail the phases of portfolio analysis and strategy determination.

Software portfolio analysis

This phase aims to create an inventory of the system’s physical objects and resources, evaluate its operational state, and assess the system’s role in the overall corporate strategy, mission, and processes. In addition, a compilation of data related to maintenance and operational costs of the system is important in order to evaluate maintenance efforts from a financial perspective. This phase requires static analysis of the source code to compile a record of the system’s physical objects, and dynamic analysis of traces to assess the system’s behavior against the specified or intended behavior. Historical and forecast financial data related to the cost of operations over the remaining operational period of the system are also important data to be collected in this phase.

Static analysis tools can provide a wealth of source code-related information such as valuable information with respect to unused code, metrics, poor coding practices, as well as component dependencies. The dynamic analysis tools can provide information with respect to whether the system achieves its functional and non-functional requirements including security issues, memory usage, performance degradation trends, and interface bottlenecks. Finally, the compilation of operation and historical maintenance data could provide valuable information with respect to annual change rate of the modules of the
system, compilation of software maturity indexes, average time/effort/cost measures for typical maintenance tasks, mean time to failure metrics, mean time to repair metrics, failure intensity metrics, compilation of the number of cumulative system failures, as well as reliability and availability measures and profiles using standard reliability growth models.[21]

Strategy determination

The objective of this phase is to identify maintenance requirements and to devise a strategy with respect to maintenance activities that should be considered, given the current state of the system. The different strategies include restructuring, migration, porting, enhancement, redevelopment, or keeping the status quo, that is, leaving the system as is until its final retirement. In order to make these decisions, the results from the portfolio analysis phase are considered in addition to the information obtained from a number of system and environment characteristics that help determine the importance and vulnerability of the system from a mission and business operations perspective.[22–24] These characteristics pertain to system vulnerabilities due to obsolete implementation programming languages and infrastructure used, software mission criticality and anticipated impact in case the system fails, preparedness and the level of technical competency of the organization to undertake a maintenance project, availability of funds supporting the maintenance efforts, and management commitment.

In order to select an overall maintenance strategy one must consider several factors and conduct a thorough technical, financial, and risk assessment of the systems that are to be maintained. For the sake of simplicity, we can consider that a very generic assessment can be based on the system’s business value vis-à-vis the ease of change, or on user ratings vis-à-vis the quality coefficient of the system. Fig. 3 summarizes such a high-level software maintenance road map from guidelines presented in Ref. [25]. Another guideline is based on questionnaires and detailed technical, economic, and management assessments that select the appropriate maintenance strategies for a given system or a family of systems.[24] Finally, yet another guideline that aims to produce a maintenance strategy specifically for migrating legacy systems to Service-Oriented Architecture (SOA) environments is also based on questionnaires and system analysis to establish the migration context, to describe existing capability, and to describe the target SOA state.[26] These different strategies aim to provide answers to whether it makes sense to apply a specific maintenance task, what parts of the system can be reused, what type of changes need to be applied to which components in order to accomplish the maintenance objectives, and how to obtain a preliminary estimate of cost for a given maintenance task or activity.

Fig. 3 Guidelines for selecting a maintenance strategy.

System Modeling and Analysis

The objective of modeling and analysis is the representation of source code (or even binary code)[27] at a higher level of abstraction using a domain model (schema), and the subsequent analysis of such models so that dependencies between system artifacts can be extracted and modeled.[28] This phase encompasses two major tasks. The first task is referred to as source code representation and utilizes parsing technology to compile a model of the source code that can be algorithmically and mechanically manipulated. The second task is referred to as source code analysis and aims to assist in program and system understanding. The sections below discuss in more detail the phases of source code modeling and source code analysis.

System modeling

System modeling focuses on the construction of abstractions that represent and denote source code, environment characteristics, and configuration information at a higher level of abstraction. In particular, the area of source code modeling or source code representation deals with techniques and methodologies to represent information on a software system at a level of abstraction that is suitable for algorithmic processing. System modeling has a profound impact on software maintenance as it affects the effectiveness and the tractability of maintenance activities. The effect of modeling formalisms on software maintenance has been discussed in Ref. [29].

There are two major schools of thought in the source code modeling domain. The first school of thought advocates formal models that not only aim to represent the source code at a higher level of abstraction but also to denote the semantics of the source code in a rigorous mathematical formalism. In this respect, formal properties of the code can be proven using theorem proving or other formal deduction techniques. Approaches that fall in this
category include structural operational semantics, denotational semantics, axiomatic semantics, π-calculus, and process algebras.[30–32] A criticism of these approaches is that models are difficult to build and manipulate algorithmically, especially for large industrial systems.

The second school of thought advocates more informal models that are produced from parsing or scanning the code. These models do not necessarily have well-defined formal semantics, but provide and convey rich-enough information so that source code analysis and manipulation of large software systems can be tractably achieved.

It is evident that both approaches have benefits and drawbacks. The intuition behind the first approach, which utilizes formal models, is that it allows for properties of the source code to be verified, a very important issue for mission-critical systems. However, these approaches do not scale up very well as the complexity and the size of such formal models may become unmanageable for large systems.

Similarly, the intuition behind the second approach, which utilizes informal models, is that it allows for a “good-enough” analysis of very large systems in a tractable manner. For example, the architectural recovery of a multimillion line system does not require highly formal and mathematical models. The motivation here is to utilize models that can represent massive amounts of data to tractable [...] so that we can obtain a solution that is useful to software engineers who can proceed with a more detailed and targeted analysis if needed. In this entry, we focus on the use of the latter type of informal models as these are mostly used in practice for the analysis and maintenance of large software systems. These include ASTs, Call Graphs, Program Summary Graphs, and Program Dependence Graphs (PDGs) among others. These representations are achieved by parsing the source code of the system being analyzed at various levels of detail and granularity (i.e., statement, function, file, package, subsystem level). Models can also be denoted by a variety of means such as tuples, relations, objects, and graphs. Regardless of how the models are denoted they have to conform to a schema that is referred to as the Domain Model. Domain models can be represented in a variety of ways but most often are represented as relational schemas or as class hierarchies.[33–36]

One specific type of model that is most often used for software analysis and maintenance is the AST. ASTs are tree structures that represent all the syntactic information contained in the source code.[37] Every node of the tree is an element of the programming language used. The nonleaf nodes represent operators, while the leaf nodes represent operands. ASTs suppress unnecessary syntactic details (whitespace, symbols, lexemes, punctuation tokens) and focus on the structure of the code being represented. The AST notation is the most commonly used structure in compilers to represent the source code internally in order to analyze it, optimize it, and generate binary code for a specific platform. In this context, source code modeling aims to facilitate source code analysis that can also be applied at various levels of abstraction and detail, namely, at the physical level where code artifacts are represented as tokens, lexemes, and ASTs; the design level where the software is represented as a collection of modules, interfaces, and connectors; and the conceptual level where software is represented in the form of abstract entities, such as objects, Abstract Data Types (ADTs), and communicating processes.

An example of a fraction of a domain model for the C programming language presented as a class hierarchy is illustrated in Fig. 4. More specifically, Fig. 4 illustrates part of a domain model in the form of a hierarchy of classes that denote structural elements of the C programming language. For example, as depicted in Fig. 4, the C programming language has Statements, a subcategory of which is Condition_Statement. A subcategory of Condition_Statement is Statement_If, and so on. Similarly, the language has Expressions, subcategories of which include Predicate_GT, Predicate_LT, etc.

Fig. 4 Sample domain model class hierarchy for the C programming language.

A parser can be used to invoke semantic actions that aim to populate such a domain model and create objects that are associated and form a tree structure such as the one depicted in Fig. 5. Such trees are referred to as ASTs and provide a very rich model, which can be used for source code analysis and transformation. Fig. 5 illustrates the Annotated AST that is compliant with the domain model of Fig. 4 and pertains to the following snippet of code:

    IF (OPTION > 0)
        SHOW_MENU(OPTION)
    ELSE
        SHOW_ERROR("Invalid option..")

Fig. 5 Sample Annotated Abstract Syntax Tree class.

The Abstract Semantic Graph, or ASG for short, also provides a rich abstract representation of source code text. ASGs are composed of nodes and edges. Nodes represent source code entities, while edges represent relations. Both the nodes and the edges are typed and have their own annotations that denote semantic properties.[38]

The Rigi Standard Format, or RSF for short, is a format for representing source code information. It is a generic, intuitive format that is easy to read and parse. The syntax of RSF is based on entity relation triplets of the form [...]. An example of an RSF tuple is [...]. Sequences of these triplets are stored in self-contained files.
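To make the triplet idea concrete, the following Python sketch reads triplets from text and groups them by relation. It assumes a whitespace-separated layout in which each line holds a relation name followed by two entity names; the sample facts, the relation names call and defined_in, and the helper names parse_rsf and group_by_relation are hypothetical and are not taken from the Rigi tool. Quoted tokens containing spaces are deliberately not handled.

```python
from collections import defaultdict


def parse_rsf(text):
    """Parse whitespace-separated triplets into (relation, source, target) tuples."""
    triples = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        parts = line.split()
        if len(parts) != 3:
            raise ValueError(f"line {line_no}: expected 3 fields, got {len(parts)}")
        triples.append(tuple(parts))
    return triples


def group_by_relation(triples):
    """Index triplets by relation name: relation -> list of (source, target)."""
    index = defaultdict(list)
    for relation, source, target in triples:
        index[relation].append((source, target))
    return dict(index)


# Hypothetical facts, as a fact extractor might emit them (illustration only).
SAMPLE = """\
call main show_menu
call main show_error
defined_in show_menu menu.c
"""

facts = parse_rsf(SAMPLE)
by_relation = group_by_relation(facts)
for relation, pairs in by_relation.items():
    print(relation, pairs)
```

Keeping each fact as a flat tuple mirrors the self-contained nature of the format: a file can be split, concatenated, or filtered by relation without any knowledge of the source language that produced it.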

Currently, RSF is the base format for the reverse engineering tool Rigi.[34,39]

The Tuple-Attribute Language, or TA for short, is a language designed to represent graph information.[40] This information includes nodes, edges, and any attributes the edges may contain. TA is easy to read, convenient for recording large amounts of data, and easy to manipulate. The main use for TA is to represent facts extracted from source code through parsers and fact extractors. In this way TA can be considered to be a “data interchange” format. Other metamodeling languages for representing software artifacts include the FAMIX metamodel used by Moose,[41] the GXL,[36,42] and the Knowledge Discovery Metamodel (KDM)[43] proposed by the Object Management Group (OMG).

Other popular source code representation models include the Control Flow Graphs, Data Flow Graphs, Call Graphs, and PDGs.[37,44] Control Flow Graphs denote the possible flows of execution of a code segment from statement to statement. Data Flow Graphs offer a way to eliminate unnecessary control flow constraints in representing the source code of a system, focusing mostly on the exchange of information between program components (basic blocks, functions, procedures, modules). Call Graphs offer a way to eliminate variations in control statements by providing a normalized view of a possible flow of execution of a program. In Control Flow Graphs, nodes represent source code basic blocks, while edges represent possible transfer of control from one basic block to another. A Call Graph represents invocation information between functions or between procedures. Nodes in the Call Graph represent individual functions or procedures and edges represent call sites and may be labeled with parameter information. Finally, PDGs have been extensively used for software analysis and in particular source code slicing. Nodes in PDGs represent either statements or entry or exit variables in a code fragment, usually the body of a function, while edges represent either control dependencies or data flow dependencies. Control dependencies in PDGs indicate that one statement may or may not select the execution of another statement (e.g., the condition in a Statement_If may or may not select the execution of the then or the else part of the statement’s body). Similarly, data flow dependencies between nodes in PDGs denote that one statement (node) defines the value of a variable and the other statement (node) uses the variable.[44]

In the paragraphs above, we focused mostly on the representation and modeling of source code. Another important dimension is the extraction of models from binary files, an area that is referred to as binary analysis.[27,45]

System analysis

Analysis takes two forms: static and dynamic analysis. Static analysis focuses mostly on the analysis of the source code.[46] Dynamic analysis aims at extracting information and models from execution traces.[47–49] The primary focus of the static or dynamic analysis of the system is the compilation of various system views such as architecture views, code views, metrics views, and historical views. These views allow the identification of the system’s major components, data and control flow dependencies, data schema structure, configuration constraints, as well as the extraction of various metrics that serve as indicators of the system’s quality and maintainability. For system analysis we consider the following important points of interest.

Architectural extraction is one such area of system analysis that deals with the identification of major system components and their dependencies.[50] These components and their dependencies (calls, uses, imports, exports) provide a view of the system at the architecture level. The identified components are composed of collections of functions, data types, and variables. For the formation of the components that constitute an architectural view of the system, clustering is used as the primary technique. Elements such as functions, data types, and variables are grouped together according to some clustering strategy and according to some distance or similarity measure. Architecture extraction analysis can be used for the migration of legacy systems to Network-Centric and to Service-Oriented environments. An example of an extracted architecture is illustrated in Fig. 6, where a large system has been abstracted in the form of an Entity Relationship (ER) graph where nodes are files and edges denote data flow, control flow, and call information between these files. A clustering algorithm has been applied to this ER graph to yield groups of files that share common flows while flows between groups are minimized. These groups illustrated in Fig. 6 can be considered as components of the recovered system architecture.

Another important area of system analysis is the extraction of ADTs.[51,52] The extraction of ADTs encompasses two tasks. The first task deals with the analysis of data types in the source code of the system and the assessment of whether these data types can be considered as ADTs or not. The assessment criteria are based on how extensive the use of these types in the system is, the operations that are associated with these data types, and their relationship with other data types. The second task deals with the identification and attachment of operations to these ADTs as well as with the identification of possible specializations and generalizations among these types. ADT extraction can be used for the migration of systems written in procedural languages to new systems that conform with the Object-Oriented paradigm.[53]

Yet another important area of system analysis is slicing.[54,55] Slicing is a technique that allows for the identification of all system elements that are affected by, or affect, a particular system element at a particular location that is referred to as the slicing criterion. Slices can be classified as forward slices and backward slices. The most
Fig. 6 Snapshot of extracted system architecture.
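The clustering step behind an extracted architecture such as the one in Fig. 6 can be sketched as follows. This is a minimal illustration, not the algorithm used to produce the figure: it assumes a greedy agglomerative strategy, a hypothetical `cluster_files` helper, made-up file names and flow counts, and a target number of groups `k` chosen by the analyst.

```python
from itertools import combinations


def cluster_files(edges, k):
    """Greedy agglomerative clustering of an undirected, weighted file graph.

    edges: dict mapping frozenset({file_a, file_b}) -> number of flows
           (data flow, control flow, or call information) between the files.
    k:     desired number of groups (an assumption of this sketch).
    """
    nodes = sorted({f for pair in edges for f in pair})
    clusters = [frozenset([n]) for n in nodes]

    def link(c1, c2):
        # Total flow weight crossing between two disjoint clusters.
        return sum(edges.get(frozenset((a, b)), 0) for a in c1 for b in c2)

    while len(clusters) > k:
        # Merge the pair of clusters that shares the most flows.
        c1, c2 = max(combinations(clusters, 2), key=lambda p: link(*p))
        if link(c1, c2) == 0:
            break  # remaining clusters are unrelated; stop merging
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return sorted(sorted(c) for c in clusters)


# Hypothetical ER graph: nodes are files, weights count flows between them.
example_edges = {
    frozenset(("ui.c", "widgets.c")): 5,
    frozenset(("widgets.c", "events.c")): 4,
    frozenset(("db.c", "sql.c")): 6,
    frozenset(("sql.c", "cache.c")): 3,
    frozenset(("events.c", "db.c")): 1,
}
print(cluster_files(example_edges, 2))
```

Merging on total shared flow keeps strongly related files together and leaves only weak flows between groups, which is the property the recovered components are expected to have. Other distance or similarity measures could be substituted for `link`.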

common use of slicing is source code slicing. In this respect, a slicing criterion is a program variable at a specific program point. A forward slice is a collection of statements that are affected by the slicing criterion. A backward slice is a collection of statements that affect the slicing criterion. The result of the slice can even be an executable program that includes all the statements in the forward or in the backward slice given a slicing criterion. Slicing is based on the analysis of source code models such as the AST or the PDG.[44]

Another area of system analysis is data flow analysis. Data flow analysis was originally proposed in the area of optimizing compilers and in the area of software testing.[37,56] Data flow analysis is based on how data propagate through the control flow structure of a code fragment. There are several data flow analysis algorithms proposed in the literature. In the context of software maintenance, these algorithms are used for determining the dependencies between statements and for estimating the potential impact of a change when maintaining the source code of a software application. The most frequently used data flow algorithms for software maintenance include the algorithms of reaching definitions, def-use chains, available expressions, and constant propagation. Reaching definitions data flow analysis aims to identify all definitions of a variable that reach a given point, in other words, what the possible values of a variable are at a given program point. Def-use chains data flow analysis aims to identify, for a given definition of a variable, all the possible uses of it. Available expressions data flow analysis aims to identify the scope in which the value of specific expressions remains unchanged. In this respect, as long as the value remains unchanged, one could potentially replace a complex recurring expression with a variable denoting its value. This allows for program simplification and therefore easier maintenance. Finally, constant propagation analysis

aims to identify variables that obtain a constant value and all the points where this constant value can be safely used.[37]

Another area is data schema analysis. Data schema analysis is a type of analysis that pertains mostly to data-centric systems that utilize and access transactional databases. To maintain such systems it is not adequate to analyze the source code that pertains to the transaction logic or business logic; it is also important to analyze and identify dependencies between the elements of the data schemas used by the application.[57] Data schema analysis can be used for schema simplification, thus reducing the size and complexity of the data, or for facilitating schema merging and schema mediation, two very important tasks for integration and interoperability.

Similarly, metrics analysis is a type of system analysis that aims to compile a collection of software metrics that reflect, in quantifiable terms, structural properties of the code and of the design of the system in general. These computed metrics can be used in different ways for software maintenance. Some of the most typical uses of metrics are as quality predictors for a specific design,[58] as differential indicators when two versions of a system are compared, or as clone detection techniques.[59,60]

Transformation

The objective of the Transformation phase is to apply manual, semiautomatic, or fully automatic techniques that allow for the manipulation of the source code, the event flows, or the database artifacts so that maintenance goals and objectives can be achieved. An example of a particular type of transformation is the architectural repair transformation, which is applied at the design or architecture level and aims to repair a software system's architectural drifts that may have occurred due to prolonged maintenance operations. The transformation phase can be considered in the context of three distinct activities, namely, functional transformations, process transformations, and data transformations.

Functional transformations can be applied at the design/architecture level or at the source code level. These transformations aim to manipulate code artifacts to repair faults, add new functionality, or adapt/port the system to new platforms and environments. When functional transformations are applied at the design/architecture level, they take the form of what is referred to as architectural repair.[61] Similarly, when functional transformations are applied at the source code level, they take the form of what is referred to as code transformations.[62] When these transformations do not aim to change the behavior and the functionality of the system (e.g., when they aim to increase its quality), they are also referred to as refactorings.[63,64]

Process transformations are applied at the event flow level or at the workflow/process level. Event flows denote how and in which order events are channeled from originator processes (callers) to receiver processes (callees). Similarly, workflows denote the order by which processes and data stores are activated during the system's operation for a given application. When process transformations are applied at the event flow level, we aim to manipulate the way systems/components interact; such transformations are part of an area called event processing. Examples of event flow transformations include integration with third-party components and integration with monitoring/auditing tools. When process transformations are applied at the workflow or business process level, these take the form of IT workflow reengineering or business process reengineering, respectively.[65]

Data transformations are applied at the schema or data instance level with the purpose of analyzing and manipulating the logical and physical data in an application. When data transformations are applied at the logical level, these may take the form of data schema manipulations. Similarly, when data transformations are applied at the physical level, these may take the form of manipulation of data streams or even manipulation of individual instances of data.[66] An example of data transformation at the logical level is the alteration of a schema in a database by adding or removing fields and relations. Similarly, an example of data transformation at the physical level is the alteration of data streams from binary to text, or the alteration of a specific data item into a form that can be consumed by a client process.

Evaluation

The objective of the evaluation phase is to assess whether software maintenance activities fulfill the maintenance requirements and the desired maintenance goals set in the strategy determination phase. In this respect, there are two major points of view from which we can approach the evaluation phase. The first is from a technical standpoint, while the second is from a financial standpoint.

Technical evaluation

The selection of the technical evaluation strategy to be used depends on the type of maintenance objective (adaptive, corrective, perfective) and the type of transformation (functional, process, data) that has been applied. One could consider four major technical evaluation strategies. The first is based on static analysis of the source code, the second is based on testing techniques, the third is based on software metrics, and the fourth is based on feature modeling and analysis. These four strategies are complementary in the sense that each one provides different evidence toward evaluating the effectiveness of the maintenance operations applied.

The first technical evaluation strategy, namely, the static analysis of the source code, can be applied when we have

ample access to the source code of a system, and we would like to prove that certain structural properties of the source code or the design of the system hold after specific maintenance operations have been applied and completed. This type of evaluation may take the form of identifying that certain dependencies between components have been eliminated or established accordingly, or that certain design patterns or refactorings have been introduced.[67,68] Component dependencies include data and control dependencies, use or exposure of interfaces from one component to other components or applications, or the use of common libraries and protocols. Similarly, static analysis techniques can assist toward the evaluation of whether or not a design pattern or refactoring has been or can be introduced in a particular part of a system. In this respect, it has been experimentally shown that the introduction of design patterns or refactorings plays a role toward enhancing the quality of a software system.[69] By verifying statically that such structures have been introduced into the system, software engineers may make assumptions on the quality and extensibility of the new system.

The second technical evaluation strategy is based on testing and can be used whether we have full access to the source code or not.[56,70] Furthermore, different testing techniques can be used for different types of evaluations. For the evaluation of the impact maintenance activities have on a system, software engineers may opt for regression testing techniques applied at the unit level (unit testing), at the component level, or at the system level (system testing). Depending on how much access we have to the source code of a system, we can choose what type of testing we can apply. When we have full access to the source code we consider white box testing techniques, whereas when we have partial or no access to the source code we apply gray box or black box testing, respectively.

The third technical evaluation strategy is based on metrics and can be used when we have some access or full access to the source code of the system. The metrics techniques can be applied in two ways. First, metrics can be used to determine whether particular components of a software system meet specific structural properties that can be quantified by such metrics. Second, metrics can be used to determine whether specific maintenance operations that have been applied had a differential impact on metrics before and after the specific maintenance operations took place. This differential in metrics values may serve as an indicator that the specific maintenance operations had a positive or negative impact on the system quality. In the related literature there are several different software metrics that have been proposed as indicators of the overall system quality and maintainability.[71] In this respect, Ref. [72] proposed the Goal-Question-Metric (GQM) approach, where different metrics can be used to assess software systems and processes according to the specific assessment goals and assessment questions that arise from these goals. Examples of metrics that relate to evaluating a software system with respect to maintenance cost and effort include Current Change Backlog, Change Cycle Time from Date Approved and from Date Written, Cost per Delivery, Cost per Activity, Number of Changes by Type, Staff Days Expended/Change by Type, Complexity Assessment, etc.

Other examples of metrics that serve as indicators of the quality and maintainability of a code fragment are the cyclomatic complexity of its Control Flow Graph, measured by the McCabe cyclomatic complexity metric, and the estimated degree of the functionality delivered by a code fragment, measured by the Function Point metric. The interesting property of the Function Point metric is that it can also be applied at the design or component level of a system and relates to the level of cohesion of a code fragment or a component.[73]

Furthermore, metrics can be combined in linear formulas that serve as predictors of software quality or maintainability.[74] These linear formulas can be developed experimentally using past maintainability data and linear interpolation techniques. For example, in Ref. [58], the software maintainability index (SMI) of a software module is estimated by a linear formula of the form

$$\mathrm{SMI} = 125 - 3.989\,\mathrm{FAN}_{avg} - 0.954\,\mathrm{DF} - 1.123\,\mathrm{MC}_{avg}$$

where SMI is the predicted maintainability of a module, $\mathrm{FAN}_{avg}$ is the average number of calls emanating from all classes/functions of the module, DF is the total number of incoming and outgoing data flows from the module, and $\mathrm{MC}_{avg}$ is the average McCabe cyclomatic complexity of all methods/functions of the module. According to experimental studies presented in Ref. [75], SMI values below 65 indicate low maintainability, values between 65 and 85 indicate medium maintainability, while values above 85 indicate high maintainability.

The fourth technical evaluation strategy, namely, feature-based analysis, aims to collect features from the system (design artifacts or source code artifacts) and attempts to assess how design decisions affect quality attributes. Two of the most well-known techniques in this area are the Software Architecture Analysis Method (SAAM)[76] and the Architecture Tradeoff Analysis Method (ATAM).[77] These techniques aim to assess the impact design decisions have on software system quality with emphasis on performance, security, availability, and modifiability. These techniques can also assist in evaluating the trade-off among different design choices. The techniques are based on a structured process that aims to identify and present the scope and goals of the evaluation; perform investigation and analysis of the alternative design decisions; test and prioritize alternatives; and report findings. The techniques can also be used for Greenfield software

development early in the life cycle[15] or to analyze existing legacy system architectures.

Financial evaluation

The evaluation of maintenance activities from a financial standpoint takes the form of computing the maintenance and operational cost for keeping the system as is vs. the cost of performing specific changes to the system (adaptive, corrective, or perfective changes), and computing the expected reduced maintenance and operational cost for the new enhanced system if the maintenance operations are applied. In the related literature there are a number of different techniques to perform economic analysis for evaluating maintenance operations from a financial standpoint. Most of these techniques fall into three major categories.

The first category deals with the traditional economic analysis that is based on the concepts of Net Present Value (NPV), Benefit Investment Ratio (BIR), Return on Investment (ROI), and Rate of Return (ROR).[24] These indicators, especially the NPV and BIR indicators, are used to determine which type of maintenance operation will have the highest economic impact or, alternatively, what the economic impact of a selected strategy is. The strategy that yields the highest NPV or BIR is preferable from a financial point of view. Similarly, we are also interested in keeping the ROI and ROR indicators as high as possible. In the related literature, the strategy of doing nothing and leaving the system as is to deteriorate toward its retirement is considered the null or status quo strategy. All other strategies can be compared with the null strategy. The NPV of a given strategy is defined as the difference between the Present Value (PV) of the total cost of the status quo strategy and the PV of the total cost of the given strategy. The PV of the total cost of a strategy is defined as the sum of the cost of implementing the strategy and the cost of operating and supporting the new system that has been produced by applying the specific strategy.[24]

The second category is based on prediction methods that estimate the effort savings for support and maintenance in person months for the new updated system vs. the effort for support and maintenance in person months for the old system.[78,79] An example of such an effort prediction model is COCOMO,[80] which takes as input the annual change rate of the system due to scheduled maintenance, the effort to implement a maintenance strategy, and a quality indicator of the system, and yields an estimate for the maintenance effort for this system. As the quality indicator of the system increases, the annual maintenance effort to keep the system operational decreases. In this respect, we aim to select those maintenance operations that have the highest impact on the quality indicator of a system. There are a number of different quality indicators that can be used, such as the Software Maturity Index, and indexes that are based on a linear combination of software metrics.[71,58]

Finally, a third category that has been gaining attention over the past few years is based on futures valuation theory and on portfolio valuation theory. Valuation theory is a robust field in finance and proposes techniques to estimate the value of a choice that becomes available with an investment (i.e., a specific maintenance operation).[81] As software maintenance and evolution activities imply new choices for future expansion, adaptation, and correction, futures valuation theory allows for the estimation of the value (i.e., the future benefits) of the alternative investment opportunities (i.e., alternative maintenance operations). The choice with the highest valuation can be considered as the most preferable from a financial point of view. Relevant to the futures valuation approach are techniques that aim to valuate software portfolios. The idea is that alternative maintenance options result in different alternative software portfolios that can be evaluated. The maintenance operation that yields the software portfolio with the highest value among alternatives can also be considered as the most preferable from a financial point of view.[24] A comprehensive list of cost estimation models can also be found at NASA's Cost Analysis Division.[82]

SOFTWARE MAINTENANCE TECHNIQUES

As presented in the sections above, software maintenance is classified into three major types, namely, corrective, perfective, and adaptive maintenance. In this section, we present some of the most frequently used software maintenance techniques per maintenance type. In the area of corrective maintenance, we discuss techniques that are applied at different levels of software abstraction and granularity and deal with architectural mismatch repairs, component restructuring, fixing software errors, and identifying possible causes of failures. In the area of perfective maintenance, we discuss techniques that deal with enhancing a system with new functionality, performing software refactorings, and merging software components. Finally, in the area of adaptive maintenance, we discuss techniques that deal with synchronizing business processes with the run time applications that implement these processes in source code, migrating components to Network-Centric environments, and transliterating software systems to new languages.

Corrective Maintenance

Corrective maintenance focuses on the application of techniques for understanding, fixing, or remediating problems in the source code or the design of a software system. Corrective maintenance can be thought of as encompassing two phases. The first phase deals with the identification of the fault, or what is also referred to as root cause analysis (RCA). The second phase deals with the correction of the

Fig. 7 Architecture repair process.
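The comparison at the core of the repair process in Fig. 7 can be illustrated with a small sketch. It assumes, for illustration only, that each architecture is modeled as a set of directed (source, target) connectors between modules; the `repair_plan` helper and the module names are hypothetical.

```python
def repair_plan(actual, target):
    """List the connector edits that would make `actual` match `target`.

    Both arguments are sets of (source_module, target_module) pairs.
    """
    return {
        "delete": sorted(actual - target),  # present in actual, absent in target
        "insert": sorted(target - actual),  # required by target, missing in actual
    }


# Hypothetical three-layer system where the UI bypasses the logic layer.
conceptual = {("UI", "Logic"), ("Logic", "Data")}
concrete = {("UI", "Logic"), ("Logic", "Data"), ("UI", "Data")}

# Repairing the concrete architecture to match the conceptual one.
forward = repair_plan(concrete, conceptual)
# Repairing the conceptual architecture to match the concrete one.
backward = repair_plan(conceptual, concrete)
print(forward)
print(backward)
```

In this toy case, the first plan removes the `("UI", "Data")` shortcut from the implementation, while the second accepts it by adding it to the intended design. Real repairs also involve inserting, deleting, or modifying components, which this sketch does not model.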

problem. Unfortunately, there are no prescribed ways to fix faults, as every case is different. However, we could consider frameworks and tools that could assist these fault identification and correction tasks. In this section, we discuss three types of techniques, namely, architecture reconstruction and repair techniques, techniques for the identification of inconsistencies between requirements and run time applications, and environments that assist the identification and correction of software faults.

Architecture reconstruction and repair

The laws of evolution presented by Lehman imply that a system, as it is being maintained, evolves constantly. Furthermore, as time goes by, its design and architecture gradually erode. There is a point in time where the system becomes so brittle and inflexible that even simple maintenance tasks are difficult to perform. At this point, one could consider applying corrective maintenance techniques that aim first to reconstruct or extract the architecture of a system and second to repair the architecture. Reconstruction takes the form of identifying components in terms of collections of functions, data types, and variables. The identified components should contain entities that exhibit a high level of cohesion and are related by strong data and control dependencies. On the contrary, elements that belong to different components should exhibit low coupling. Architectural reconstruction can be achieved by clustering algorithms that can be either unsupervised (depending solely on the algorithm and the similarity distance)[83] or supervised (there are constraints and initial conditions as to what the components look like).[50] The intuition behind clustering techniques for architectural reconstruction is that the software system can be represented at a high level of abstraction as a collection of relations between software artifacts. An example of such a relation is the "calls" relation between functions. Once a collection of relation tuples is used to model the software system at a higher level of abstraction, clustering techniques can be applied to group together software artifacts that have a high number of relations between them, while as a group have minimal relations with other software artifacts that belong to other groups. This is a way to simulate a form of high cohesion and low coupling and therefore assume that the resulting clusters may provide a snapshot view of the as-currently-is system architecture. Similarly, repair takes the form of identifying drifts between the extracted or concrete architecture and the conceptual or envisioned architecture. The concrete architecture depicts the actual interactions between the modules as implemented in the source code. These module interactions are based on dependencies between program entities (e.g., functions, data types, and variables). The conceptual architecture depicts the interactions we believe should exist between the modules following design and other domain-specific principles and information. Fig. 7 illustrates the architectural repair process based on the comparison of the concrete and conceptual architecture of a system. The new system yields a new concrete architecture that is evaluated against properties, constraints, and invariants stemming from the conceptual architecture of the system.

In this context, architectural repair may take two forms: forward architecture repair and backward architecture repair.[61] Forward architecture repair means repairing the concrete architecture to match the conceptual architecture. Backward architecture repair means repairing the conceptual architecture to match the concrete architecture. This match takes the form of reconciling the conceptual architecture and the concrete architecture to minimize their anomalies. For this reconciliation to be achieved, a number of transformations can be considered. These transformations aim to eliminate specific differences between the conceptual and concrete architecture.[84,85] Depending on the differences we are interested in eliminating, transformations may take the form of insertions, deletions, and modifications of components and connectors in the concrete architecture to match the conceptual or vice versa. The rationale behind such repairs is to minimize structural and design inconsistencies between what is implemented (concrete architecture) vs. what should have been implemented (conceptual architecture).

Identifying inconsistencies between requirements and run time behavior

On many occasions, either due to prolonged maintenance or due to erroneous design or implementation of a system, the system does not deliver correctly its intended functionality

as this is specified in its Software Requirements Specification (SRS) documents.[86] In order to identify whether the system delivers its intended functionality, we need to consider techniques to track a system's run time behavior so as to detect deviations from its requirement specification and identify parts of the system that may be responsible for this deviation.[87] Approaches in this area fall into three main categories: event-driven, goal-driven, and pattern-driven.

Event-driven approaches model assumptions through appropriate formalisms such as the Formal Language for Expressing Assumptions (FLEA) or through events that are classified as satisfaction events or as denial events.[88] The intuition behind these approaches is to attempt to formally verify or deny such formal expressions that represent assumptions related to specific functional or non-functional system requirements by using information that is collected as the system runs. More specifically, the system is monitored, and when an assumption is violated or a denial event is observed, the associated requirements are considered to be in violation and remedial or maintenance actions can be taken. A denial event or a violation of an assumption is associated with the failure of a requirement and consequently with the potential failure of one or more components.[89] Failure hypotheses are associated with remedial actions through predefined requirement/assumption/remedy tuples or through specialized types of analysis such as obstacle analysis.

Goal-driven approaches utilize a model of the system that describes the prerequisites for the system's correct operation in terms of an AND-OR goal tree or a goal graph. The intuition behind these approaches is to attempt to satisfy logical combinations of goals that have to be achieved so that a specific functional or non-functional system requirement can be met. The combinations of these goals that need to be satisfied can be collected from a constraint satisfaction or a SAT solver given a collection of goals in an AND-OR tree. When there is a mismatch between the actual and expected behavior, the subgoals that are associated with this behavior mismatch in the goal tree are considered as potential root cause hypotheses.[90] A problem solver (i.e., a propositional satisfiability, or SAT, solver)[91] can be used to traverse the goal tree and identify all the combinations of actions or other subgoals that may contribute to this mismatch between the expected and observed system behavior.[92]

Pattern-driven approaches model requirement failures as patterns specified in a pattern language. The intuition behind these approaches is to identify, through pattern matching, abnormal sequences of events that are interleaved with normal events as the system runs. The abnormal sequences of events are usually associated with different types of possible system failures, attacks, and threats, and are usually modeled as patterns in a pattern language. These patterns are matched against the events that are collected as the system operates. If a pattern is observed, the associated failures and threats are then considered as initial root cause hypotheses. The requirement violations/threats can be associated with components or with remedy actions through prespecified condition-action rules, or tuples.[93]

Fixing faults

When a system operates, it is possible that at some particular point in time a fault or bug in the code will be triggered. This fault in the code may be harmless or may trigger an error. An error is defined as the situation in which a system enters a state that is different from the one that is specified. An error may also be harmless or be remediated through exception handling mechanisms so that the user may not even be alerted or may not notice that something went wrong. However, there are situations where errors result in failures. A failure is defined as the situation where the observed behavior of a system is different from the one specified and from the one the user is expecting to observe. One part of corrective maintenance is identifying and correcting faults in the code or the configuration of a system once a failure has been observed. The software community has proposed over the years a number of techniques for making the task of fault identification easier once a failure is observed.[94] These techniques include static code analysis, identification and interpretation of antipatterns,[95] as well as tools and techniques to perform historical analysis, bug tracking, and software evolution analysis.[96,97] Static code analysis techniques rely on the analysis of the source code of a system in order to infer characteristics of the program that may lead to unsafe array, garbage collection, or pointer operations. A common technique for these tools is to use symbolic execution and path simulation,[98] where program execution paths are simulated using symbolic variables to determine if any software errors could occur. Another static code analysis technique for bug detection is the identification of source code patterns that match known bug patterns.[99] These systems first construct an abstract model of the source code being analyzed and then attempt to match this abstract model against predefined error patterns. Most of these systems are customizable in the sense that users can add their own patterns utilizing a tool-specific formalism. A variation of the above approach is based on the identification of antipatterns and the identification of code "bad smells." Antipatterns are design patterns that are commonly used but are ineffective and counterproductive.[100] Similarly, "bad smells" are bad design and bad programming implementation patterns that are proven to cause problems related to the performance, reliability, maintainability, and robustness of a software system.[101] Software evolution history analysis tools can also be used to examine information relevant to the evolution of a system, to investigate the rationale behind applied changes and past maintenance operations,

and also to store metrics such as complexity metrics, maintainability metrics, failure intensity metrics, and cumulative failure counts. These systems can also be used to associate functional and non-functional requirements with particular components of the system or configuration constraints.[92] Bug tracking systems also play an important role in tracing faults in the code. These systems keep information on open or already remedied faults as well as information on who worked to remedy each fault and how. These systems not only help software maintainers to focus their attention on a particular component when a failure is observed, but also help users to understand dependencies between system elements. Finally, versioning systems play an important role in understanding how one system version differs from its previous version. In addition to providing the ability to roll back to previous versions, versioning systems also assist maintainers in associating failures with components, as in most cases failures occur due to a previous change, maintenance operation, or alteration in the computing environment or configuration parameters.

Finally, another aspect of corrective maintenance is the need for an infrastructure that assists software maintainers in performing both regression testing once an error is fixed and reliability analysis during and after regression testing runs.[21,56] More specifically, performing regression testing after a maintenance operation has been applied is always important and required. Just passing the prescribed regression tests may not be adequate, especially for mission-critical systems. What is also needed is an analysis of the reliability of the system by means of reliability growth models. In this respect, the failure intensity of a system is measured each time a test suite has been applied and an error has been discovered and fixed. Regression testing, integration testing, and system testing should continue to be applied until the failure intensity of the system, measured in failures per CPU hour of operation, falls below a prescribed level. Reliability growth models can be used to predict how many hours of testing are still required for the failure intensity of a system to drop below a specified value.[21] As an example, failure intensity rates of 0.1 to approximately 5.5 failures per thousand CPU hours of operation have been reported for some mission-critical systems. Depending on the criticality of the mission and the potential impact of a failure (loss of life, financial loss, bad publicity, or mere inconvenience), the guidelines and the maximum tolerable value may vary.

Perfective Maintenance

Perfective maintenance aims to apply changes in the system in order to increase some of its functional and some of its non-functional quality characteristics. Examples of perfective maintenance operations that deal with the functional characteristics of a software system include adding new functionality, while examples of perfective maintenance affecting the system’s quality characteristics include increasing its maintainability, extensibility, portability, security, reliability, and usability. In this section, we focus on some of the most frequently used perfective maintenance techniques that are related to adding new functionality to a software system, merging software components to yield a new component, and refactoring a component or a system to increase some of its quality characteristics.

Adding new functionality

Adding new functionality to a system may take various forms. One form is to integrate the system with other components or applications at the architecture level. Another form is to adapt or extend its source code at the component or class level so that new functionality can be added through specific design patterns. Depending on the level of abstraction and granularity at which these maintenance operations are applied, we can differentiate between architecture-level changes and source code-level changes.

At the architecture level, we consider architectural patterns that allow the integration of a system with other systems. Facades and wrappers[102] can be used to facilitate the addition of new functionality to the system by integrating it with other components and applications at the architecture level. Such wrapper components provide standard utility services such as transaction management, message mediation, authentication, access control policies, encryption, etc. In the software engineering literature, a number of architecture-level patterns have been proposed for properly extending the functionality of a system by altering parts of its design at the architecture level. These include Data Source Architectural Patterns, Object-Relational Structural Patterns, Object-Relational Patterns, Distribution and Concurrency Patterns, and Session State Patterns. A collection of architecture-level patterns can be found in Refs. [103] and [104]. Yet another architecture-level technique that is used to integrate systems with other applications, thus extending the functionality and capabilities of such systems, is the use of an Enterprise Service Bus (ESB). The ESB is an architectural abstraction that is usually implemented utilizing middleware technologies and provides fundamental system integration services such as data and protocol mediation as well as message and event processing.[105] Fig. 8 illustrates the deployment and use of a typical enterprise bus. The ESB provides a wealth of infrastructure services such as message queuing, message routing, message mediation, interface adaptation, logging, monitoring, and security. To date, ESB is the dominant commercial choice for system integration and interoperability.

At the source code or class level, the addition of new functionality can be achieved in various ways. One way is the addition of new methods to a class or the addition of

new classes to a package. This is a technique that requires care, as maintainers have to make sure that key architectural constraints are not violated (e.g., violation of design invariants); that the enhanced class or component remains cohesive (e.g., new functions operate on the same type of data and deliver functionality related to that of the class or component); that there are no side effects from extending the class or component (e.g., files, databases, resources, or critical regions being accessed without proper permissions and locks); and that the enhanced class or component maintains a low level of coupling with the other components (e.g., no unnecessary interfaces are exposed and no data are accessed without the proper use of interfaces). Another method is the use of appropriate design patterns such as the adapter, decorator, proxy, state, and factory method.[106] The use of such design patterns assumes that we have access to the source code, and it can be achieved via the use of refactorings and program transformations.

Fig. 8 Schematic of an Enterprise Service Bus.

Software merging

In some cases, an application is developed as a result of merging the source code of two or more existing systems. Merging is also a very common issue when a large software system is developed by many groups that have to merge their components to form the final product. Software merging is also a common issue in product-line software development, where software versions with possibly common ancestors are merged to form new versions or new products.[107] In this respect, we can consider software merging as a form of customization and an important part of Perfective Software Maintenance. In practice, software merging is handled by tools that support collaborative software development and software configuration management. These tools aim to address the issue in a tractable and decidable way by limiting the scope of merging, the programming languages supported, and the type of artifacts that can be merged. In general, software merging techniques can be classified as two-way or three-way merging techniques. Two-way merging deals with techniques that attempt to merge two versions of a software artifact without relying on the common ancestor from which the two versions originate. Three-way merging deals with techniques that allow for information in the common ancestor to be used during the merge process. Orthogonal to the classification into two-way and three-way merging, software merging techniques can be further classified as textual, syntactic, semantic, and structural. Textual merging techniques consider source code files purely as text files, and merging takes the form of appropriately merging source code files line by line (line-based merging). Syntactic merging techniques ignore unnecessary textual details in the two files that are to be merged and consider that two artifacts cannot be merged only if merging them produces a syntactically invalid result. Depending on the data structures used to model the source code, syntactic merging can be further classified as tree-based or graph-based. Semantic merging deals with techniques that aim to detect situations where the resulting program is not semantically correct, such as having undeclared variables (static semantic conflicts) or executing the wrong version of a function (run time semantic conflicts). Finally, structural merging deals with situations where one or both of the versions to be merged have been produced as a result of refactoring operations. The problem arises when the two versions are merged and we cannot decide which of the refactoring operations remain valid in the new merged version. Structural merging is an area for which more research is still needed. A complete survey of software merging techniques can be found in Ref. [108].

Software refactoring

Software refactoring refers to source code transformations that aim to improve the quality of the system without altering its behavior characteristics. Even though refactoring operations do not fix errors or add any new functionality, they facilitate the understandability, maintainability, and extensibility of the code.[109] Examples of refactoring include the transformation of source code within a block into a , moving a method or an attribute to a more appropriate class, or creating more general types to take advantage of class inheritance in Object-Oriented systems.
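As a minimal sketch of the first kind of transformation mentioned above, the following Python fragment moves the logic of a loop body into a separately named function and then checks that the externally visible result is unchanged. The invoice example and all names in it (`invoice_total_before`, `line_amount`, `invoice_total_after`) are hypothetical and are used here only for illustration.

```python
# Before the refactoring: validation and computation are tangled
# together inside the body of the loop.
def invoice_total_before(items):
    total = 0.0
    for price, quantity in items:
        if quantity < 0:
            raise ValueError("negative quantity")
        total += price * quantity
    return f"Total: {total:.2f}"


# After the refactoring: the loop body has been extracted into a
# named helper that can be understood, tested, and reused on its own.
def line_amount(price, quantity):
    """Amount for a single invoice line (logic extracted from the loop)."""
    if quantity < 0:
        raise ValueError("negative quantity")
    return price * quantity


def invoice_total_after(items):
    total = sum(line_amount(price, quantity) for price, quantity in items)
    return f"Total: {total:.2f}"


# A refactoring must not alter behavior, so both versions have to
# agree on the same input (hypothetical sample data).
sample_items = [(9.99, 2), (0.50, 10), (100.0, 1)]
same_behavior = (
    invoice_total_before(sample_items) == invoice_total_after(sample_items)
)
```

Comparing the two versions on the same inputs is only a spot check of behavior preservation; in practice a regression test suite would play this role.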

Fig. 9 Sample refactoring operation for the replacement of public data fields.
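The replacement of a public data field by a private field with accessor and mutator methods can be sketched as follows. This is an illustrative Python rendering under stated assumptions, not a reproduction of the figure: the `Account` classes and the method names are hypothetical, and since Python has no enforced private members, the leading underscore marks the field as private by convention only.

```python
class AccountBefore:
    """Before the refactoring: the balance is a public data field."""

    def __init__(self, balance):
        self.balance = balance  # any client can read or overwrite it directly


class AccountAfter:
    """After the refactoring: the field is private and is reached
    only through an accessor and a mutator method."""

    def __init__(self, balance):
        self._balance = balance  # private by convention (leading underscore)

    def get_balance(self):
        """Accessor: the single place where the field is read."""
        return self._balance

    def set_balance(self, value):
        """Mutator: the single place where the field is written."""
        self._balance = value


# Client code changes from direct field access to method calls,
# while the observable behavior stays the same.
before = AccountBefore(100)
before.balance = 250

after = AccountAfter(100)
after.set_balance(250)
```

Because every read and write now passes through one method, later changes such as validation or logging can be introduced in a single place without touching client code.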

Fig. 9 illustrates a refactoring operation[109] where a public class attribute becomes private and accessor and mutator methods are added. At the source code level, refactoring takes the form of source code transformations so that design patterns can be introduced and antipatterns removed. In the related software engineering literature, a number of different standard refactoring transformations and a number of to-be-avoided practices (antipatterns) have been proposed.[95] These catalogs discuss in detail the different types of transformations, their anticipated effects on system quality, and the conditions under which these refactoring operations can be applied or antipatterns can be recognized. Refactoring operations on the source code can be automated to a large extent.[62] These operations focus on raising the level of abstraction so that the system can be more extensible; decomposing the code into more manageable logical units so that the system components and code can be more maintainable and reusable; and improving coding standards (e.g., variable renaming, moving methods, and altering hierarchies) so that the system is easier to understand and modify.

Refactoring operations do not only apply at the source code level of a system but may also apply at the business, architecture, and data levels. At the business level, refactoring takes the form of altering the workflows or business processes in order to introduce proven patterns that can be used to increase throughput, decrease cost, or positively affect KPIs of interest.[110] At the architecture level, refactoring takes the form of component restructuring, the introduction of new components, or the integration of the old system with other applications. Usually the objective is to enhance some non-functional and quality characteristics of a system. For example, the objective of architecture refactoring may be to make the system more secure, extensible, robust, or maintainable. At the data level, refactoring operations aim to reorganize the data schema of an application so that the data schema can be more maintainable and more extensible.

Adaptive Maintenance

Adaptive maintenance techniques allow for the migration, porting, and integration of software systems to new platforms, languages, environments, and third-party applications. Adaptive maintenance may take several forms. In this section, we discuss techniques that deal with three of the most often utilized maintenance scenarios, namely, synchronizing business processes and workflows with the run time applications that implement these processes and workflows; migrating software systems to Object-Oriented, Network-Centric, and Service-Oriented environments; and finally migrating legacy components to new languages. Adaptive maintenance is an important part of the software life cycle as it allows the system to remain operational when the underlying platforms or the operating environments change. A typical example of adaptive maintenance is when parts of the functionality of a legacy system need to be exposed as services (e.g., Web services) to other applications or users. The sections below discuss some of the most frequently occurring scenarios and techniques of adaptive maintenance.

Synchronizing business processes with run time applications

Software evolves in iterative and incremental steps from its inception to its retirement. The evolution includes changes of artifacts at different levels of abstraction, from very abstract ones, such as business process specifications, to very concrete ones, such as source code. A perfective or corrective maintenance operation such as the addition of a new class could directly affect architecture and design

models, and through ripple effects, even the requirements specification and the business process models of this system.[111] If each model change is not propagated to all affected models and if changes that are performed in parallel are not coordinated, consistency among artifacts is lost and semantic drift is created.[112–114] In this context, adaptive maintenance may take the form of propagating and synchronizing software models that pertain to different levels of abstraction and are used by different stakeholders. This problem is referred to as model synchronization or model coevolution.[115] The solution here is to analyze the code so that an activity type of diagram can be compiled.[116] Such a diagram identifies, for each use case of the system, the related operations, the sequence and the conditions under which these operations are invoked, and the data exchanged between these operations. These models can be created by static code analysis, dynamic analysis, or a combination of both.[117] On the other hand, business processes that implement specific system scenarios should be analyzed and annotated with information regarding the type of data exchanged between the various business process steps; the roles and constraints of each process step; and task descriptions. The intuition behind the analysis and abstraction of the source code with the concurrent annotation of the business processes is to bring source code models and business process models closer so that they can be compared and reconciled. Fig. 10 illustrates the concept of bridging the conceptual gap between high-level business process models and the system’s source code. This convergence allows for comparisons to be applied in a more efficient and meaningful manner.

Reconciling annotated business process models with source code models takes the form of identifying differences and similarities between these models. In this respect, a number of techniques have been proposed that fall in the general area of model dependence extraction. These techniques include formal concept analysis (FCA) and feature modeling.[118,119]

Fig. 10 Bridging the conceptual gap between business processes and source code.

Migrating to Network-Centric Service-Oriented environments

The rapid development of new technologies in the area of Object-Oriented systems and Service-Oriented computing over the past few years has necessitated the migration and porting of many corporate legacy applications to Network-Centric environments. This migration may take several forms. One form is wrapping, where legacy components are interfaced or wrapped with components that are responsible for exposing the original legacy interfaces to third-party components over standard application protocols such as HTTP, SOAP, or RMI. Another form is the utilization of a framework such as an Object Request Broker (ORB) to allow invocations of native legacy code from remote components, even ones written in a different language than the legacy component being invoked.

Wrapping is a technique based on the creation of new components that provide bidirectional communication between the legacy component and third-party remote components. In one direction, these components offer globally visible interfaces to third-party external client components. These globally visible interfaces implement native calls to the legacy component that is being invoked. In this respect, the clients do not need to know how the legacy component is invoked, and the only information they need is the visible interface offered by the wrapper component. In the other direction, these wrapper components return results to the requesting client through a specific application and transport layer protocol. The migration through wrapping can be achieved through different strategies.[102] These include the use of middleware frameworks such as Java Enterprise Edition (JEE5) and Enterprise Java Beans, the use of Extensible Markup Language (XML) wrappers, and the use of proxy components to wrap components and dispatch clients’ requests to native legacy system calls. The intuition behind the wrapping approach is that the legacy components are altered minimally, if at all. The wrapping approach provides a reduced migration risk and effort. On the other hand, legacy components are often treated as “black boxes” and there is no deeper understanding of the inner workings of the components to facilitate future evolution. Fig. 11 illustrates such a wrapping scenario for the exposure of legacy code, user interface (UI) screens, and legacy data as service resources to client and third-party applications.

In this context, we can differentiate between three major strategy types for migrating a legacy component to a Network-Centric environment. These can be qualified as the black box, the gray box, and the white box strategy.

In the black box strategy, we have access only to the UI of the system. Techniques like screen scraping or dialogue tracing can be used to provide a façade component (the wrapper) that invokes the same UI entities through programmatic and not through user-driven manual means.[120] In this approach, we do not have any access to the code, and we

focus only on exposing the UI functionality through such a façade component that simulates the original UI. The benefit of this approach is that it does not require any analysis of the source code of the system and therefore it is the easiest, fastest, and most economical. The drawback of this approach is that any problems of the legacy system remain hidden, and many consider it an approach where sooner or later the quality problems of the legacy system will surface and source code analysis will become inevitable. The intuition behind this strategy is that it requires minimal access to the legacy component and has the potential to minimize cost and effort during the migration process.

Fig. 11 System integration using wrapping technology.

The gray box strategy assumes that we have some minimal access to the source code of the system. The strategy utilizes an approach that has three major steps. In the first step, the system is analyzed using static and dynamic analysis techniques so that it is broken down into a collection of components that offer specific functionalities.[121–123] The second step is to identify the dependencies of each component on all other components that make up the system, and to model the signatures of the interfaces of the components we want to expose to third-party clients. The third step is to create wrapper components that allow the remote client components to access the legacy components.[124] The benefit of the approach is that it decomposes the legacy system and provides a level of understanding of its inner workings. Another benefit is that the extracted components can be considered as reusable assets that can be utilized in other applications as well. The drawback of the approach is that it requires some level of analysis of the source code and the extraction of dependencies between the extracted components. These dependencies may be very obscure and difficult to extract; if they are not fully revealed, there may be unforeseen side effects when a component is invoked by a third-party external client component. An example could be that external client components may be accessing secure internal databases or updating a database without the proper locking mechanisms. The intuition behind this strategy is to identify reusable assets that can be used as stand-alone components to form assembly elements of new systems and applications in an organization. In this respect, componentization and access of legacy functionality can be done at a finer level of granularity.

The white box strategy is based on the assumption that we have full access to the source code, and it is also based on three steps. In the first step, we analyze the legacy source code and we extract an object model. Complex data structures become class types, and functions that utilize such data types become methods of the classes generated from those data types. This is a complex process and is referred to as objectification of the legacy code. The second step is the automatic generation of Object-Oriented source code from the extracted object model. This is a step that can be automated, especially if the target Object-Oriented language shares common features with the legacy source language (e.g., C to C++). The automatic generation of

the target Object-Oriented source code may be more challenging if the source and target languages do not share many common features (e.g., Fortran to Java). The third step is the use of a middleware framework or the creation of wrapper components that allow access to the methods of individual objects from remote clients. The intuition behind the white box approach is that migration to Network-Centric environments is best done when the source code has been analyzed and restructured so that legacy functionality is encapsulated in classes and selectively exposed in the form of methods. In this respect, the benefit of the white box approach is that it decomposes the legacy application at a very fine level of granularity, thus extending the understanding we have of the business logic encapsulated in the system. This also allows individual classes, and consequently individual objects, to be used from external applications as reusable assets. The drawback of the approach is that it requires significant investment in time, effort, and funds to extract an accurate object model from the legacy system and generate new Object-Oriented code. Another drawback is that the original design of the system may be lost, and that may create maintainability problems for the staff that is familiar with the old original structure of the system.[53] Fig. 12 illustrates sample transformations to migrate procedural C code to Object-Oriented C++ code. The first and second transformations illustrate how type definitions and structures can be mapped into classes. The third transformation illustrates how C functions can be mapped to methods. In this particular example, the return data type and the parameter data type of the original C function are considered in order to allocate the function as a method to the class that stems from this data type (second transformation).

Once the functionality of a legacy component has become available to remote clients as a service, we can then utilize a number of architectural styles and frameworks to integrate it with other systems and components.[125] The most relevant architectural styles are the layered architecture style, in particular the three-tier architectural style, and the implicit invocation style, such as the Model-View-Controller and the Event Driven Architecture (EDA) style.

Regardless of the strategy used, these recent advances in software engineering technology allow for various levels of wrapping and mediation to take place in large systems, and middleware frameworks also allow for the seamless access of legacy components or objects from remote clients in a secure way and with robust transaction management policies. In this way, the level of granularity can be arbitrarily raised: wrappers can be further wrapped, components can interface with other components, applications, and databases through mediators, and so on, thus building what is known as Systems of Systems (SoS) or Ultra Large Scale Systems (ULS).[126] ESB technology and Publish/Subscribe protocols can be used to create such large corporate systems by integrating diverse applications, components, and data sources in one seamless environment.[103,127] Fig. 13 illustrates an example three-tier architecture (client side, server side, legacy systems) that is based on the JEE framework and can be used to expose legacy components as services to remote clients.

Migrating to new languages

In many occasions, adaptive maintenance takes the form of migration of a system to a new programming language.

Fig. 12 Sample migration from procedural to Object-Oriented code.
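The mapping of structures to classes and of functions to methods, as described for the white box strategy, can be sketched in a language-neutral way as follows. This is a simplified illustration in Python and not the tooling behind the figure: the record types `LegacyFunction` and `ClassModel`, the helper `base_type`, and the tie-breaking rule (the first structure type found in the parameter list, then in the return type, becomes the owner) are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class LegacyFunction:
    """Signature of a procedural function, as recovered by source analysis."""
    name: str
    return_type: str
    param_types: list


@dataclass
class ClassModel:
    """A class in the extracted object model."""
    name: str
    attributes: list
    methods: list = field(default_factory=list)


def base_type(type_name):
    """Reduce a C-like type such as 'struct Account *' to 'Account'."""
    return type_name.replace("struct ", "").rstrip("* ").strip()


def objectify(structs, functions):
    """Map each structure to a class and attach each function to the class
    that stems from a structure type used in its signature."""
    classes = {name: ClassModel(name, list(fields))
               for name, fields in structs.items()}
    unassigned = []
    for fn in functions:
        signature_types = [base_type(t)
                           for t in fn.param_types + [fn.return_type]]
        owners = [t for t in signature_types if t in classes]
        if owners:
            # Assumed tie-break: the first matching type owns the method.
            classes[owners[0]].methods.append(fn.name)
        else:
            # No structure type involved: left for manual allocation.
            unassigned.append(fn.name)
    return classes, unassigned


# Hypothetical legacy inventory used only to exercise the sketch.
structs = {"Account": ["id", "balance"], "Customer": ["name", "address"]}
functions = [
    LegacyFunction("deposit", "void", ["struct Account *", "double"]),
    LegacyFunction("new_customer", "struct Customer *", ["char *"]),
    LegacyFunction("log_message", "void", ["char *"]),
]
classes, unassigned = objectify(structs, functions)
```

In this sketch `deposit` is attached to `Account` because of its parameter type, `new_customer` is attached to `Customer` because of its return type, and `log_message` remains unassigned. A real objectification process would also have to resolve functions that touch several structures and global data, which is part of what makes the step complex.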

This may be required for several reasons. The most common reasons include the lack of development, debugging, and testing tools for the old language, the lack of a compiler for the old language that runs on modern platforms, the lack of common libraries for the old language, or even the lack of developers who are familiar with the old language.

Fig. 13 Block diagram of a typical three-tier Network-Centric architecture.

Migration to a new language usually takes the form of transliteration, where we transform a system written in one language to another without changing the underlying programming paradigm (e.g., from Fortran to C). Less frequently, migration takes the form of transforming a system from one language to a new language in a new programming paradigm (e.g., from imperative to Object-Oriented, such as from C to C++). Regardless of the form and type of migration, at the core there is what is referred to as program transformation technology. In this section, we focus on classifying and discussing a number of generic source code transformation techniques that fall in two main categories of transformations. The first category is referred to as Model-to-Text transformations. The second is referred to as Model-to-Model transformations.[128]

Model-to-Text-based migration is based on two steps. In the first step, a model of the source code is built. This model is usually the Annotated AST of the source code being migrated. The model is usually represented as a complex data structure in dynamic memory and can be persistently stored, if needed, in an Object-Oriented or relational database. In the second step, a transformer guided by a transformation control strategy uses the model to generate the text of the source code of the new migrant system. In the related literature, the approaches to generate the target program’s source code text have been classified into two categories, namely, visitor-based approaches and template-based approaches. In the visitor-based approaches, a visitor mechanism (e.g., the visitor design pattern) is used to traverse the source model (i.e., the Annotated AST of the source system) and to generate the appropriate target text (i.e., the target source code for the migrant system). The template-based approach is based on collections of templates that represent the target text and contain splices of metacode to access the source model and to perform code selection and iterative expansion. These templates form the right-hand side (RHS) of transformation rules. The left-hand side (LHS) of these rules implements logic that accesses the source model and provides data to variables and metacode in the RHS templates. The result of the rule application is the instantiation of the RHS templates to form syntactically valid text that corresponds to the source code of the target migrant system. The intuition behind this approach is to avoid maintaining more than one abstraction model (i.e., one source model of the original system and one abstract model of the migrant system)

and be able to immediately generate source code text directly from the abstraction models that represent and denote the original legacy system. In this respect, the benefits of this approach are twofold. First, it does not require the additional complexity of maintaining one source and one target model, thus eliminating the need to define and validate such a target model. Second, it allows for the direct validation of the transformation rules by inspecting the source code as it is emitted directly from the transformer.

Model-to-Model approaches are based on transformers that are applied to a source model (e.g., the Annotated AST) and yield another model that is then used to generate code for the new migrant system. The target model is usually the Annotated AST of the target system. A pretty printer can then be used to generate the target source code. In this respect, Model-to-Model transformations are in fact tree transformations. The frameworks that support such Model-to-Model transformations fall into four main categories, namely, direct manipulation, staged structure-driven, template-based, and graph transformation-based frameworks. The direct manipulation Model-to-Model transformation frameworks offer an application programming interface (API) and a corresponding programming language to manipulate the source models through custom user-defined transformation programs written in the programming language offered by the framework (Lisp, Java, etc.). The staged structure-driven frameworks operate in two steps. In the first step, the skeleton of the target model is created. In the second step, the skeleton target model is filled or instantiated by setting values for attributes and linking various model references. The template-based frameworks utilize template models that contain annotations in the form of metacode or constraints. An application program that implements a control strategy instantiates the template model to yield a concrete model that in turn is used to generate the target source code. The graph transformation-based approach utilizes a graph transformation language to transform the source model to a target model. Triple Graph Grammars and Attributed Graph Grammars constitute the theoretical framework for these types of transformers. The intuition behind the Model-to-Model approaches is that they allow for different versions of the same target system to be generated from a common target model. This common target model is generated from the source model, which is an abstraction of the original source code. Once such a target model is created, customized fine-tuned transformers can be used to generate different versions of the target source code (e.g., from PL/I to Fortran 95, or to Fortran 2003).

Source code migration has been successfully used for migrating large programs from one language version to another, programs written in imperative languages (such as PL/I) to other imperative languages such as C, or programs written in one language in one programming paradigm to another language in another programming paradigm, such as C to Java.[129,130] Fig. 14 illustrates two examples of PL/I-like code segments automatically migrated to equivalent C code. Equivalence here is evaluated by ensuring that the external behavior of the migrant system is the same as that of the original system.

Fig. 14 Transliteration example of PL/I-like source code to C source code.

Concluding this section, it is important to mention two points of interest. One is that migration and transliteration do not need to be complete (i.e., produce a fully working system automatically) for the migration to be successful. The strength of the automated approaches is that they allow for the mechanical, uniform, consistent, and fast transformation of the bulk of the system (e.g., 95% of the code). There could be a small part of the generated target system code that is either not syntactically correct or has some semantic errors. These can be revealed by testing (unit, system, functional testing) and by assessing the reliability of the system using appropriate reliability growth models. Automatic transliterators can translate more than 95% of the original code into the new code. This is a major step as it reduces the migration time and possible manual rewriting errors. The second point has to do with the need for retesting the new migrant system to make sure that the functional and non-functional requirements for the migrant system indeed hold. Experience shows that this testing phase is probably the most time-consuming and expensive in the whole process and should not be underestimated.

TOOLS, FRAMEWORKS, AND PROCESSES

Tools

Software maintenance for large systems without the use of specialized tools would not be possible due to the complexity of the task and the size of the source code usually involved. Furthermore, many maintenance tasks would be error prone if performed manually. Tools automate many tedious and repetitive tasks and therefore eliminate the problem of human errors. Examples include repeated transformations, metrics calculation, as well as the extraction of data and control flow dependencies between source code statements or between components. In this respect, many tools have been proposed by industry and academia. The implementation view of the architecture of these tools may vary. For example, these may have been implemented using a combination of a pipe and filter architectural style, blackboard style, or an implicit invocation style.[103] However, regardless of the implementation view of the architecture, most tools share a common conceptual architecture. This conceptual architecture is composed of components such as the front-end component that allows for parsing the code; the source code repository component that stores in dynamic memory or in persistent storage the source code model (i.e., the AST); and a number of plug-in components that allow for the analysis and manipulation of the source code model. Fig. 15 illustrates the conceptual architecture of most source code analysis tools. Examples of tools that have been extensively used in industry and academia include Rigi[34] and Landscape,[131] and tools that have been developed as research prototypes but have also been extensively used in industry are the CIA,[35] Genoa,[132] TXL,[133] Columbus,[134] and Bauhaus[135] tools. Finally, in the area of software visualization, tools such as Dotty, Shrimp, and Landscape have been extensively used for visualizing various software models. These tools provide a basic list of what is available for software maintenance. Elaborate lists of tools and sites related to software maintenance can also be found in many research sites.[136] Also, prototype tools are regularly demonstrated at related international IEEE and ACM conferences such as the IEEE/ACM International Conference on Software Engineering (ICSE), the IEEE International Conference on Software Maintenance (ICSM), the IEEE Working Conference on Reverse Engineering (WCRE), the IEEE International Conference on Program Comprehension (ICPC), and the IEEE Conference on Software Maintenance and Reengineering (CSMR).

Frameworks

So far, there have been several frameworks and metamodeling languages proposed by different communities. The OMG in a breakthrough proposal defined a core metamodeling language called Meta Object Facility (MOF) that is based on modeling domains and specifying schemas using classes, associations, and inheritance. The simplicity and
power of this language allows software engineers to model the schema of various other modeling languages, including UML itself. The interesting dimension is that for a schema that has been designed as a MOF model, OMG has defined a standard, called XML Metadata Interchange (XMI), which allows such MOF models to be represented as XML schemas. This has interesting implications because once a MOF model is instantiated to form a concrete model, this concrete model can be represented again through XMI as an XML document that complies with the XML schema that stems from the domain model's schema. Once this XML document that describes the concrete model is available, it can then be fed along with its corresponding XML schema to frameworks such as the Eclipse Modeling Framework (EMF). The EMF framework then automatically generates a data structure that represents, as Java objects in dynamic memory, the XML document that corresponds to the concrete MOF model, and provides an API for accessing and manipulating these objects. Also, transformers can be used to transform one EMF model that is compliant to one schema to another EMF model that is compliant to another schema. Such transformers include Atlas Transformation Language (ATL), Attribute-Graph-Grammar System (AGG), and MTF. The impact of the above is that it allows for the development of tools that manipulate models that are now stored as objects in dynamic memory and not just as plain graphs and images. These tools can perform analysis of models, transformations of models, and generation of source code for a new migrant system.

Fig. 15 Conceptual architecture of software analysis tools.

In EMF, MOF follows a multilayer architecture by gradually evolving and extending the metamodeling elements in different levels. An advantage of such an approach is that it makes it easier to select a subset of the language for applications that do not require all the features of the language. Essential MOF (EMOF) is extended by Complete MOF (CMOF), which provides a more comprehensive set of modeling features. EMF is another such modeling facility proposed by the Eclipse community. With respect to transformation languages, Query-View-Transformation, also referred to as QVT, provides means for the algorithmic specification of model transformations, pattern matching, traceability, and support for incremental and bidirectional model updates.

As an example of how the above technologies relate to software maintenance, consider the following scenario where a software engineer defines a domain model (a schema) for the AST of a given language (see Program Representation). This schema can then be represented as a MOF model. A parser can be used to parse the source code of the system being maintained and instantiate this MOF model to create a concrete AST for a given code fragment or the complete system in EMF. This concrete model is essentially a model of the source code of the system and can be analyzed (e.g., compute metrics, identify refactoring operations, and identify dead code) or transformed to yield a new updated source code model. Such model transformation can be achieved using a transformation language such as QVT, ATL, or AGG. This type of maintenance relates to the Model-to-Text and Model-to-Model approaches described earlier.

Finally, some commercial tools that facilitate software development and maintenance include the IBM Rational Software Architect, the Poseidon UML framework, and the StarUML framework, which software engineers can use to perform reverse or forward engineering. In the reverse engineering mode one can extract class diagrams from existing source code. In the forward engineering mode one could export source code models (use cases, class diagrams, sequence diagrams, ASTs) through the XMI standard to EMF-compliant models for further processing and transformation from within the EMF environment.

Processes

Software maintenance is a complex, expensive, and time-consuming task that encompasses several risks. As such, it must be planned and organized carefully in a methodological and structured way. At the beginning of this entry, we discussed generic processes for software maintenance. Even though these processes are helpful for understanding the issues and for planning the activities related to software maintenance tasks, they do not provide much practical benefit unless they are associated with some concrete application guidelines. These guidelines are provided in the form of handbooks and methodologies that allow software maintainers to select, plan, design, implement, and evaluate various maintenance activities. In this context, we discuss two of the most frequently used process methodologies for software maintenance, namely, the Software Reengineering Assessment (SRA) and the Service Migration and Reuse Technique (SMART).

The SRA methodology is a concrete process methodology proposed by the U.S. Air Force Software Technology Group. The SRA assumes a limited budget and availability of resources as compared to the large software portfolio of an organization. The objective of SRA is to provide a structured methodology for assessing maintenance options and proposing alternatives. SRA is structured around three phases: the technical assessment, the economic assessment, and the management evaluation. In the first phase, the software portfolio of an organization is assessed from a
technical standpoint. For each system, different maintenance techniques are considered and evaluated with respect to their technical merit, impact, risk, and feasibility. The result of the technical assessment phase is a list of systems that must be considered for maintenance along with a list of maintenance options for each such system. The economic assessment evaluates each maintenance option proposed by the technical assessment from a financial point of view. The options are ranked based on their corresponding Benefit to Investment Ratio (BIR) and their net present value (NPV). The systems and the corresponding maintenance options with the highest NPV and BIR indicators are evaluated jointly by the technical assessment team, the economic analysis team, and the management team.[24] The systems and the maintenance options that are selected from this joint evaluation are the ones that will be chosen for maintenance.

Another methodology is SMART, which has been developed by the Software Engineering Institute at Carnegie Mellon University. SMART is a technique that can be used to make initial decisions about the feasibility of reusing legacy components as services within an SOA environment. Fig. 16 illustrates the basic tasks of the SMART process.[137] SMART operates in three phases. The first phase is the planning phase, where the methodology provides a structured way to gather a wide range of information about legacy components, the target SOA, and potential services. The information is collected using a methodology referred to as the Service Migration Interview Guide (SMIG). The SMIG directs information gathering for establishing the migration context; for describing and assessing the existing capability, that is, the data about the legacy components from the identified stakeholders; and for describing the target SOA state, that is, the data and information for the target SOA system. The second phase is the gap analysis, which aims to understand the differences between the current state of the system and the future planned state of the new SOA system. Once this analysis has been performed, the methodology proceeds to the third phase. The third phase is the strategy determination phase, where the methodology assists software maintainers to produce a service migration strategy as the primary product as well as other information related to the state of the legacy system.[26] SMART is also implemented as a tool that allows developers to gather and organize the data required so that informed decisions can be made.

Fig. 16 Basic activities of the Service Migration and Reuse Technique process.

EMERGING TRENDS

As software technologies, programming languages, utility frameworks, computing platforms, and environments evolve at a rapid pace, software maintenance practices must evolve too. In this section, we discuss some of the emerging trends in software engineering that have a direct impact on software maintenance practices. We classify these emerging trends in eight categories, namely, processes, modeling, transformations, evaluation, software visualization, open source, mining software repositories, and SoS.

Processes: Until recently, the majority of software development processes focused on collections of specifications and plans that must be completed before any design or implementation activity starts. These are quite rigid models since they do not allow for any prototypes or implementations to be shared with the end user until the final release of the system. As an answer to this problem, over the past few years we see an emerging trend for a new generation of process models that have a profound effect on software maintenance. These models are referred to as agile processes, which are iterative and incremental in the sense that software is developed and maintained in iterations (versions), considering new functional and non-functional requirements incrementally in each new version. Every iteration cycle is worked on by a team through a full software development cycle, including planning, requirements analysis, design, coding, unit testing, integration testing, and system and acceptance testing when a working version is to be demonstrated to stakeholders. These processes consider source code as the primary artifact, and therefore software maintenance becomes a de facto part of the development process and not only an activity that applies only to systems that have been released and are already in operation. Agile methods emphasize face-to-face communication over written documents and are use case-driven and architecture-centric. They also allow for risk avoidance through the early resolution of problems. Agile development has been widely documented in dedicated IEEE/ACM conferences such as Extreme Programming (XP) and Agile. To date, we do not have any concrete frameworks to assess how these agile processes affect system quality or how these processes can be further enhanced to become more efficient for software maintainability purposes. Some of the emerging processes that are expected to have an impact on software maintenance include Agile Unified Process (AUP), XP, Feature Driven Development (FDD), and Open Unified Process (OpenUP).[138]

Modeling: So far, software representation models were based on proprietary metalanguages and custom-made schemas. This made it difficult to build maintenance tools that can interoperate. However, over the past few years we see a standardization of the metamodeling languages used to represent software artifacts. The metamodeling language that emerged as a de facto standard is the MOF. Combined with new transformation technologies that aim to manipulate MOF-based models in a programmatic manner, we see an emerging trend where software artifacts such as low-level designs and source code are created automatically as a result of transforming models at a higher level of abstraction. This has a profound effect on software maintenance as we are gradually moving from the maintenance of source code artifacts to the maintenance of software models, with emphasis on techniques to support model synchronization and model coevolution, that is, how models can remain consistent when one of them changes due to maintenance activities. An area that deals with the issues of automatic code generation and the maintenance of software models is model-driven engineering (MDE), which is currently at the forefront of software engineering research.

MDE is a software development methodology that focuses on the creation, interpretation, and transformation of models that represent software artifacts such as use cases, requirements, design specifications, and test models. These models can be used to automatically or semiautomatically generate infrastructure and application source code for a system. MDE creates unique challenges and opportunities for software development and software maintenance. Some of the challenges relate to devising well-defined representations of models and devising transformation techniques for the generation of new models and source code from existing models. In this context of MDE, techniques are also needed that allow all these models to remain consistent and synchronized when one or more of them changes due to maintenance activities. Even though MDE may look complicated at first, it promises significant gains in software development productivity, especially when software development relates to enterprise software systems that utilize standard architectural patterns and middleware technologies. Furthermore, from the software maintenance standpoint, MDE offers the possibility of keeping track of the dependencies and the associations between various software artifact models (e.g., requirements, architecture, and low-level design) and the source code of a software system.

Transformations: At the heart of most maintenance operations is the concept of software transformations.[139] These transformations may take the form of modifying the system so that it can be ported to a new environment, of altering the system to add new functionality, or of refactoring the system so that it can be more maintainable. In the past, software transformations were implemented using proprietary frameworks. Furthermore, transformations could not be easily reused even among similar applications. Over the past few years we observe an emerging trend toward the standardization of such transformation languages and transformation frameworks. These transformation languages and frameworks focus on manipulating MOF models and are based on graph and tree transformations. Examples of such emerging languages and frameworks for model transformation include QVT, ATL, and AGG.

Evaluation: As discussed in the previous sections, the evaluation of software maintenance operations falls into two main categories, namely, technical and financial evaluation. In this respect, an emerging trend in evaluating software maintenance and evolution choices from a technical perspective is the use of techniques that establish traceability links. These links allow for a what-if type of impact analysis, where the effects of a transformation or a maintenance operation can be better estimated before the operation is applied.[140] Similarly, there are a couple of emerging trends for evaluating alternative software maintenance and evolution scenarios and choices from a financial perspective. These include the use of future options valuation and the valuation of customer loyalty that can be gained by software customization operations.

Software visualization: Software systems grow constantly in size and complexity. Even though we have elaborate tools to analyze large software systems, the results produced by these tools would be of limited value if we did not have techniques to present these results to software engineers. For humans, one of the best ways to understand and analyze information is through images. In this respect, software visualization is an emerging field in the area of software maintenance. Software visualization techniques are not new.[141] However, as the complexity of software systems grows, the need for efficient visualization techniques grows even bigger.[142–146] Some of the emerging challenges in this field include dealing with complexity of data, enhancing the usability of visualization tools, and developing collaborative visualization tools. A thorough discussion on research challenges in software visualization can be found in Ref. [147].

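To make the what-if impact analysis mentioned under Evaluation, and its visual presentation, more concrete, the following is a minimal sketch and not a technique taken from the cited references. It assumes that traceability links are available as a simple directed graph (each artifact maps to the artifacts that depend on it); the artifact names, the `TRACE_LINKS` table, and the helper functions `impact_set` and `to_dot` are hypothetical. The impact of a proposed change is estimated as the set of artifacts reachable from the changed artifact, and the result is rendered as Graphviz DOT text (the graph description format used by tools such as Dotty) with the impacted artifacts filled in.

```python
from collections import deque

# Hypothetical traceability links: artifact -> artifacts that depend on it.
TRACE_LINKS = {
    "REQ-1": ["DesignA"],
    "DesignA": ["parser.c", "lexer.c"],
    "parser.c": ["test_parser"],
    "lexer.c": [],
    "test_parser": [],
    "REQ-2": ["DesignB"],
    "DesignB": ["report.c"],
    "report.c": [],
}


def impact_set(links, changed):
    """Return the changed artifact plus everything reachable from it."""
    seen = {changed}
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for dependent in links.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


def to_dot(links, impacted):
    """Render the link graph as Graphviz DOT text, filling impacted nodes."""
    lines = ["digraph trace {"]
    for node in sorted(links):
        attrs = " [style=filled, fillcolor=lightgray]" if node in impacted else ""
        lines.append(f'  "{node}"{attrs};')
    for src in sorted(links):
        for dst in links[src]:
            lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # What-if question: which artifacts may be affected if DesignA changes?
    impacted = impact_set(TRACE_LINKS, "DesignA")
    print(sorted(impacted))
    print(to_dot(TRACE_LINKS, impacted))
```

The breadth-first traversal is only one possible reading of "impact"; a real analysis would typically weight or type the links (e.g., requirement-to-design versus code-to-test) rather than treat every link as equally significant.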
Open source: Over the past decade we have witnessed the revolution of open source system development. During this decade we have also gathered valuable information on open source development process models, on the impact some design decisions have on open source quality, and on the evolution history and politics of these systems. Furthermore, we have gathered detailed reports related to faults, fixes, and testing processes. An emerging trend in the area of software maintenance is how to leverage the information gathered from open source system development and maintenance, how to apply it to other, non-open source projects, and how to decide what strategy is best to use for software maintenance. The software engineering community believes that there is still a lot to be learnt from these systems.[148–151]

Mining software repositories: A software system is composed of artifacts that are far richer than just its source code. These artifacts include of course the source code, but also requirements and design documents, evolution data, bug reports, test suites, deployment logs, and archived communications. Over the past decade, and for the purposes of software maintenance and evolution, the research community developed robust technologies to model, store, and process these artifacts in specialized repositories called software repositories.[152–154] As these repositories grew in size and complexity, so did the need to develop efficient and tractable mining and retrieval techniques. Therefore, an emerging field in the area of software maintenance is mining software repositories. In this respect, there are some challenges to address in this field. These include dealing with software repository complexity and the diversity of artifacts these repositories store, dealing with consistency of data, dealing with the extraction of high-quality data in a tractable manner, and dealing with unstructured data. A comprehensive discussion on mining software repositories can be found in Ref. [155].

Systems of Systems: This term refers to a collection of task-oriented, large-scale, concurrent and distributed systems that integrate and collaborate to form a new, more complex "metasystem," which offers more functionality and performance than simply the sum of the constituent systems. These SoS, also referred to as ultra-large-scale (ULS) systems, started to appear in the areas of Command, Control, Communications, Computers, and Intelligence (C4I); Intelligence, Surveillance, and Reconnaissance (ISR); as well as in banking and finance. These systems pose unique challenges in the area of software maintenance. Emerging trends in this area include the use of unified models for the representation of these systems and their dependencies; techniques for multilanguage, multiplatform system analysis and maintenance; techniques for monitoring, RCA, and diagnostics of SoS; and techniques that assess the alignment of run-time IT infrastructure with business processes. The reader can also refer to a detailed discussion on the emerging area of SoS and ULS that can be found in Ref. [126].

CONCLUDING THOUGHTS

Software maintenance is a very important part of the software development life cycle. In this entry, we discussed a collection of topics that are indicative of this important area of software engineering. Software maintenance is an ever-evolving field and we anticipate it will generate significant challenges, especially as it has started being considered, through agile processes, as an integral part of greenfield development processes. As such, we expect to see software maintenance techniques being even more prevalent in the years to come, especially software maintenance techniques to support the development and operations of large-scale multilanguage, multiplatform Service-Oriented systems. We encourage the readers to delve into the references of this entry and investigate further specific technical details of their interest.

REFERENCES

1. Canning, R. That maintenance 'iceberg'. EDP Analyzer 1972, 10 (10).
2. Swanson, E.B. The dimensions of maintenance. In Proceedings of the 2nd International Conference on Software Engineering, San Francisco, 1976.
3. The Institute of Electrical and Electronics Engineers. IEEE Standard Glossary of Software Engineering Terminology, 1990. IEEE Standard 610.12-1990.
4. The Institute of Electrical and Electronics Engineers. IEEE Standard for Software Maintenance, 1998. IEEE Standard 1219-1998.
5. Hunt, B.; Turner, B.; McRitchie, K. Software maintenance implications on cost and schedule. In Proceedings of the IEEE Aerospace Conference, 2008.
6. Chapin, N. Do we know what preventive maintenance is? In Proceedings of 2000 IEEE International Conference on Software Maintenance, 2000.
7. Kajko-Mattsson, M. Preventive maintenance! Do we know what it is? In Proceedings of the IEEE International Conference on Software Maintenance, 2000.
12. Lehman, M.M.; Fernandez-Ramil, J. Software evolution and feedback: theory and practice. In Software Evolution; Madhavji, N.H., Fernandez-Ramil, J., Perry, D.E., Eds.; Wiley, 2006.
13. Lehman, M.M.; Belady, L.A., Eds.; Program Evolution: Processes of Software Change; Academic Press, 1985.
14. Lehman, M.M.; Ramil, J.F.; Wernick, P.D.; Perry, D.E.; Turski, W.M. Metrics and laws of software evolution - the nineties view. In Proceedings of the 4th International Software Metrics Symposium, 1997.
15. Bruegge, B.; Dutoit, A.H. Object-Oriented Software Engineering Using UML, Patterns and Java; Prentice Hall, 2004.
16. Schmidt, D. Model driven engineering. IEEE Comput. 2006.
17. Bennett, K.; Rajlich, V. Software maintenance and evolution: a roadmap. In Proceedings of the Conference on The Future of Software Engineering, Limerick, Ireland, 2000.

18. Finnigan, P. et al. The software bookshelf. IBM Syst. J. 1997, 36 (4).
19. Sousa, M.J.C.; Moreira, H.M. A survey on the software maintenance process. In Proceedings of the IEEE Conference of Software Maintenance, 1998.
20. Seaman, C.B. The information gathering strategies of software maintainers. In Proceedings of IEEE Conference on Software Maintenance, 2002.
21. Rook, P. Software Reliability Handbook; Centre for Software Reliability, Springer, 1990.
22. Sliwerski, J.; Zimmermann, T.; Zeller, A. HATARI: raising risk awareness. In Proceedings of 10th European Software Engineering Conference, 2005.
23. Brodie, M.L.; Stonebraker, M. Migrating Legacy Systems; Morgan Kaufmann: San Mateo, California, 1995.
24. Software Reengineering Assessment Handbook JLC-HDBK-SRAH Version 3.0, USAF Software Technology Support Center.
25. Arnold, R. Software Reengineering; IEEE Computer Society Press: Los Alamitos, CA, 1993.
26. Lewis, G.; Morris, E.; Smith, D.; O’Brien, L. Service-oriented migration and reuse technique (SMART). In Proceedings of IEEE Conference on Software Technology and Engineering Practice, 2005.
27. Cifuentes, C.; Gough, K.J. Decompilation of binary programs. J. Softw. Pract. Experience 1995, 25 (7), 811–829.
28. Canfora, G.; Di Penta, M. New frontiers of reverse engineering. In International Conference on Software Engineering (ICSE ’07), Future of Software Engineering Track (FOSE), 2007.
29. Darcy, D.P.; Palmer, J.W. The impact of modeling formalisms on software maintenance. IEEE Trans. Softw. Eng. 2006, 53 (4).
30. Stoy, J. Denotational Semantics: The Scott-Strachey Approach to Programming Language Theory; MIT Press: Cambridge, MA, 1977.
31. Aceto, L.; Fokkink, W.J.; Verhoef, C. Structural operational semantics. In Handbook of Process Algebra; Bergstra, J.A., Ponse, A., Smolka, S.A., Eds.; Elsevier, 2001.
32. Milner, R. Communicating and Mobile Systems: The Pi-Calculus; Cambridge University Press, 1999.
33. Markosian, L.; Newcomb, P.; Brand, R.; Burson, S.; Kitzmiller, T. Using an enabling technology to reengineer legacy systems. Commun. ACM 1994, 37 (5), 58–70.
34. Müller, H.A.; Klashinsky, K. Rigi: a system for programming-in-the-large. In Proceedings of the 10th International Conference on Software Engineering, 1988.
35. Chen, Y.F.; Nishimoto, M.Y.; Ramamoorthy, C.V. The C information abstraction system. IEEE Trans. Softw. Eng. 1990, 16 (3), 325–334.
36. Winter, A.; Kullbach, B.; Riediger, V. An overview of the GXL graph exchange language. In Proceedings of the Software Visualisation International Seminar, LNCS 2269, Dagstuhl Castle, Germany, May 2002; Springer-Verlag, 2002; 324–336.
37. Aho, A.V.; Sethi, R.; Ullman, J.D.; Lam, M.S. Compilers: Principles, Techniques, and Tools; Pearson Education, Inc., 2006.
38. Bell Canada. Datrix™ Abstract Semantic Graph: Reference Manual v1.4 2000-05-01, 2000.
39. Müller, H.A.; Wong, K.; Tilley, S.R. Understanding software systems using reverse engineering technology. In Proceedings of the 62nd Congress of L’Association Canadienne Française pour l’Avancement des Sciences (AFCAS), 1994.
40. Holt, R.C. Structural manipulations of software architecture using Tarski relational algebra. In Proceedings of the Working Conference on Reverse Engineering, 1998.
41. Ducasse, S.; Lanza, M.; Tichelaar, S. Moose: an extensible language-independent environment for reengineering object-oriented systems. In Proceedings of the Second International Symposium on Constructing Software Engineering Tools (CoSET ’00), 2000.
42. Holt, R.C.; Schurr, A.; Sim, S.E.; Winter, A. GXL: a graph-based standard exchange format for reengineering. J. Sci. Comput. Program. 2006, 60 (2), 149–170.
43. http://www.omg.org/technology/documents/modernization speccatalog.htm
44. Horwitz, S.; Reps, T.; Binkley, D. Interprocedural slicing using dependence graphs. ACM TOPLAS 1990, 12 (1), 26–60.
45. Cifuentes, C.; Simon, D. Procedure abstraction recovery from binary code. In Proceedings of the IEEE 4th European Conference on Software Maintenance and Reengineering, 2000.
46. Zheng, J.; Williams, L.; Nagappan, N.; Snipes, W.; Hudepohl, J.P.; Vouk, M.A. On the value of static analysis for fault detection in software. IEEE Trans. Softw. Eng. 2006, 32 (4).
47. Ritsch, H.; Sneed, H.M. Reverse engineering programs via dynamic analysis. In Proceedings of IEEE Working Conference on Reverse Engineering, 1993.
48. Salah, M.; Mancoridis, S.; Antoniol, G.; Di Penta, M. Scenario-driven dynamic analysis for comprehending large software systems. In Proceedings of IEEE Conference on Software Maintenance and Reengineering, 2006.
49. Safyallah, H.; Sartipi, K. Dynamic analysis of software systems using execution pattern mining. In Proceedings of IEEE Conference on Program Comprehension, 2006.
50. Sartipi, K. Software architecture recovery based on pattern matching. In Proceedings of IEEE Conference on Software Maintenance, 2003.
51. Canfora, G.; Cimitile, A.; Munro, M.; Tortorella, M. Experiments in identifying reusable abstract data types in program code. In Proceedings of IEEE Conference on Program Comprehension, 1993.
52. Yeh, A.S.; Harris, D.R.; Reubenstein, H.B. Recovering abstract data types and object instances from a conventional procedural language. In Proceedings of IEEE Working Conference on Reverse Engineering, 1995.
53. Zou, Y.; Kontogiannis, K. Migration to object oriented platforms: a state transformation approach. In Proceedings of IEEE Conference on Software Maintenance, 2002.
54. Gallagher, K.; Lyle, J. Using program slicing in software maintenance. IEEE Trans. Softw. Eng. 1991, 17 (8), 751–761.
55. Kagdi, H.; Maletic, J.I.; Sutton, A. Context-free slicing of UML class models. In Proceedings of 21st International Conference on Software Maintenance ICSM, 2005.
56. Jorgensen, P.C. Software Testing: A Craftsman’s Approach; CRC Press, 2006.

57. Castelli, D. A strategy for reducing the effort for database schema maintenance. In Proceedings of IEEE Conference on Software Maintenance and Reengineering, 1998.
58. Muthanna, S.; Kontogiannis, K.; Ponnambalam, K.; Stacey, B. A maintainability model for industrial software systems using design level metrics. In Proceedings of IEEE Conference on Reverse Engineering, 2000.
59. Pfleeger, S.L.; Bohner, S.A. A framework for software maintenance metrics. In Proceedings of 6th International Conference on Software Maintenance, 1990.
61. Tran, J.B. Software Architecture Repair as a Form of Preventive Maintenance. Master’s thesis, University of Waterloo, Waterloo, ON, Canada, 1999.
62. Tahvildari, L.; Kontogiannis, K. A software transformation framework for quality-driven object-oriented re-engineering. In Proceedings of IEEE Conference on Software Maintenance, 2002.
63. Baxter, I.; Pidgeon, C.; Mehlich, M. DMS: program transformations for practical scalable software evolution. In Proceedings of the 26th International Conference on Software Engineering (ICSE ’04), 2004.
64. Mens, T.; Tourwe, T. A survey of software refactoring. IEEE Trans. Softw. Eng. 2004, 30 (2).
65. Hauser, R.F.; Friess, M.; Kuster, J.M.; Vanhatalo, J. An incremental approach to the analysis and transformation of workflows using region trees. IEEE Trans. Syst. Man Cybernetics, Part C: Applications and Reviews 2008, 38 (3).
66. Henrard, J.; Roland, D.; Cleve, A.; Hainaut, J.L. Large-scale data reengineering: return from experience. In Proceedings of IEEE Conference on Reverse Engineering, 2008.
67. Arnold, R.S.; Bohner, S.A., Eds. Software Change Impact Analysis; Wiley-IEEE CS Press, 1996.
68. Briand, L.C.; Labiche, Y.; O’Sullivan, L. Impact analysis and change management of UML models. In Proceedings of the 19th ICSM, IEEE CS Press, 2003.
69. Kataoka, Y.; Imai, Y.T.; Andou, H.; Fukaya, T. A quantitative evaluation of maintainability enhancement by refactoring. In Proceedings of IEEE Conference on Software Maintenance, 2002.
70. Agrawal, H.; Alberi, J.L.; Horgan, J.R.; Li, J.J.; London, S.; Wong, W.E.; Ghosh, S.; Wilde, N. Mining system tests to aid software maintenance. IEEE Comput. 1998, 31 (7), 64–73.
71. Schneidewind, N.F. Software quality maintenance model. In Proceedings of IEEE Conference on Software Maintenance, 1999.
72. Basili, V.; Rombach, D. Tailoring the software process to goals and environments. In Proceedings of the 9th International Conference on Software Engineering, Monterey, California, 1987.
73. Pfleeger, S.L. Software Metrics: A Rigorous and Practical Approach, 2nd Ed.; Course Technology, 1998.
74. Bandi, R.K.; Vaishnavi, V.K.; Turk, D.E. Predicting maintenance performance using object-oriented design complexity metrics. IEEE Trans. Softw. Eng. 2003, 29 (1).
75. Coleman, D.; Lowther, B.; Oman, P. The application of software maintainability models in industrial software systems. J. Syst. Softw. 1995, 29, 3–16.
76. Kazman, R.; Bass, L.; Abowd, G.; Webb, M. SAAM: a method for analyzing the properties of software architectures. In Proceedings of IEEE International Conference on Software Engineering, 1994.
77. Kazman, R.; Klein, M.; Barbacci, M.; Longstaff, T.; Lipson, H.; Carriere, J. The architecture tradeoff analysis method. In Proceedings of IEEE Conference on Engineering of Complex Computer Systems, 1998.
78. Antoniol, G.; Cimitile, A.; DiLucca, A.; DiPenta, M. Assessing staffing needs for a software maintenance project through queuing simulation. IEEE Trans. Softw. Eng. 2004, 30 (1).
79. Sneed, H.M. Estimating the costs of software maintenance tasks. In Proceedings of IEEE Conference on Software Maintenance, 1995.
80. Boehm, B. Software Engineering Economics; Prentice Hall, 1981.
81. Black, F.; Scholes, M. The pricing of options and corporate liabilities. J. Polit. Econ. 1973, 81 (3), 637–654.
82. National Aeronautics and Space Administration, Cost Analysis and Evaluation Division. http://www.nasa.gov/offices/pae/organization/cost_analysis_division..
83. Doval, D.; Mancoridis, S.; Mitchell, B. Automatic clustering of software systems using a genetic algorithm. In Proceedings of IEEE Conference on Software Technology and Engineering Practice, 1999.
84. Tran, J.B.; Holt, R.C. Forward and reverse repair of software architecture. In Proceedings of IBM Conference Center for Advanced Studies CASCON, IBM Press, 1999.
85. Krikhaar, R. Software Architecture Reconstruction. Ph.D. thesis, University of Amsterdam, The Netherlands, 1999.
86. Egyed, A.; Grunbacher, P. Automating requirements traceability: beyond the record and replay paradigm. In Proceedings of 17th International ASE Conference, IEEE CS Press, 2002.
87. Mylopoulos, J.; Chung, L.; Nixon, B. Representing and using non functional requirements: a process-oriented approach. IEEE Trans. Softw. Eng. 1992, 18 (6), 483–497.
88. Fickas, S.; Feather, M. Requirements monitoring in dynamic environments. In Proceedings of Requirements Engineering Conference, 1995.
89. Robinson, W.N. Implementing rule-based monitors within a framework for continuous requirements monitoring. In Proceedings of HICSS’05, 2005.
90. Yu, Y.; Wang, Y.; Mylopoulos, J.; Liaskos, S.; Lapouchnian, A.; do Prado Leite, J.C.S. Reverse engineering goal models from legacy code. In Proceedings of IEEE RE’05, 2005.
91. Moskewicz, M.W.; Madigan, C.F.; Zhao, Y.; Zhang, L.; Malik, S. Chaff: engineering an efficient SAT solver. In Design Automation; ACM Press: New York, USA, 2001.
92. Wang, Y.; McIlraith, S.; Yu, Y.; Mylopoulos, J. An automated approach to monitoring and diagnosing requirements. In Proceedings of IEEE/ACM International Conference on Automated Software Engineering, 2007.
93. Razavi, A.; Kontogiannis, K. Pattern and policy driven log analysis for software monitoring. In Proceedings of IEEE Conference on Computer Software and Applications, 2008.
94. Chaim, M.L.; Maldonado, J.C.; Jino, M. A debugging strategy based on requirements of testing. In Proceedings of IEEE Conference on Software Maintenance and Reengineering, 2003.
95. Brown, W.J.; Malveau, R.C.; Mowbray, T.J. AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis; Wiley, 1998.

96. Ying, A.T.T.; Murphy, G.C.; Ng, R.T.; Chu-Carroll, M. Predicting source code changes by mining change history. IEEE Trans. Softw. Eng. 2004, 30 (9), 574–586.
97. Mockus, A.; Votta, L.G. Identifying reasons for software changes using historic databases. In Proceedings of the IEEE International Conference on Software Maintenance, San Jose, CA, 2000; 120–130.
98. Larson, E. Assessing work for static software bug detection. In Proceedings of the 1st ACM International Workshop on Empirical Assessment of Software Engineering Languages and Technologies, 2007.
99. Ayewah, N.; Pugh, W.; Morgenthaler, J.D.; Penix, J.; Zhou, Y. Using FindBugs on production software. In Proceedings of the ACM OOPSLA 2007 Companion, 2007.
100. Koenig, A. Patterns and antipatterns. J. Object-Oriented Program. 1995, 8 (1), 46–48.
101. Tsantalis, N.; Chaikalis, T.; Chatzigeorgiou, A. JDeodorant: identification and removal of type-checking bad smells. In Proceedings of the 12th European Conference on Software Maintenance and Reengineering (CSMR ’08), Athens, Greece, 2008; 329–331.
102. Sneed, H.M. Integrating legacy software into a service oriented architecture. In Proceedings of IEEE Conference on Software Maintenance and Reengineering, 2006.
103. Schmidt, D.; Stal, M.; Rohnert, H.; Buschmann, F. Pattern-Oriented Software Architecture, Volume 2: Patterns for Concurrent and Networked Objects; Wiley, 2000.
104. Fowler, M. Patterns of Enterprise Application Architecture; Addison-Wesley, 2002.
105. Hohpe, G.; Woolf, B. Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions; Addison-Wesley, 2003.
106. Gamma, E.; Helm, R.; Johnson, R.; Vlissides, J.M. Design Patterns: Elements of Reusable Object-Oriented Software; Addison-Wesley, 1994.
107. Clements, P.; Northrop, L. Software Product Lines: Practices and Patterns, 3rd Ed.; Addison-Wesley Professional, 2001.
109. Fowler, M.; Beck, K.; Brant, J.; Opdyke, W.; Roberts, D. Refactoring: Improving the Design of Existing Code; Addison-Wesley, 1999.
110. Gschwind, T.; Koehler, J.; Wong, J. Applying patterns during business process modeling. In Proceedings of the 6th International Conference on Business Process Management (BPM), vol. 5240 of Lecture Notes in Computer Science, 2008.
111. Yau, S.S.; Collofello, J.S.; MacGregor, T. Ripple effect analysis of software maintenance. In Proceedings of IEEE Computer Software and Applications Conference, COMPSAC, 1978.
113. Binkley, D. An empirical study of the effect of semantic differences on programmer comprehension. In Proceedings of the IEEE International Conference on Program Comprehension, 2002.
114. Mehra, A.; Grundy, J.; Hosking, J. A generic approach to supporting diagram differencing and merging for collaborative design. In ASE ’05: Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering, ACM Press, 2005.
115. Ivkovic, I.; Kontogiannis, K. Tracing evolution changes of software artifacts through model synchronization. In Proceedings of the 20th International Conference on Software Maintenance ICSM, 2004.
116. Hung, M.; Zou, Y. Recovering workflows from multi tiered e-commerce systems. In Proceedings of the IEEE International Conference on Program Comprehension, 2007.
117. Ernst, M.D. Static and dynamic analysis: synergy and duality. In Proceedings of International Conference on Software Engineering (ICSE) Workshop on Dynamic Analysis (WODA), 2003.
118. Ivkovic, I.; Kontogiannis, K. Towards automatic establishment of model dependencies using formal concept analysis. Int. J. Softw. Eng. Knowl. Eng. 2006, 16 (4), 499–522.
119. Czarnecki, K.; Eisenecker, U. Generative Programming: Methods, Tools, and Applications; Addison-Wesley, 2000.
120. El-Ramly, M.; Iglinski, P.; Stroulia, E.; Sorenson, P.; Matichuk, B. Modelling the system-user dialog using interaction traces. In Proceedings of the Eighth Working Conference on Reverse Engineering (WCRE), 2001.
121. Canfora, G.; Cimitile, A.; DeLucia, A.; DiLucca, G.A. Decomposing legacy programs: a first step towards migrating to client-server platforms. J. Syst. Softw. 2000, 54 (2), 99–110.
122. Aversano, L.; Canfora, G.; Cimitile, A.; DeLucia, A. Migrating legacy systems to the web: an experience report. In Proceedings of the European Conference on Software Maintenance and Reengineering (CSMR), 2001.
123. Canfora, G.; Fasolino, A.R.; Frattolillo, G.; Tramontana, P. A wrapping approach for migrating legacy system interactive functionalities to service oriented architectures. J. Syst. Softw. 2008, 81 (4), 463–480.
124. Sneed, H.M. Encapsulation of legacy software: a technique for reusing legacy software components. J. Ann. Softw. Eng. 2000, 9, 293–313.
125. Clements, P.; Bachmann, F.; Bass, L.; Garlan, D. Documenting Software Architectures: Views and Beyond (SEI Series in Software Engineering); Addison-Wesley, 2002.
126. Feiler, F. et al. Ultra-Large-Scale Systems: The Software Challenge of the Future; Software Engineering Institute, 2006.
128. Czarnecki, K.; Helsen, S. Feature-based survey of model transformation approaches. IBM Syst. J. 2006, 45 (3).
129. Kontogiannis, K.; Martin, J.; Wong, K.; Gregory, R.; Müller, H.; Mylopoulos, J. Code migration through transformations: an experience report. In Proceedings of CASCON, 1998.
130. Martin, J.; Müller, H.A. C to Java migration experiences. In Proceedings of IEEE Conference on Software Maintenance and Reengineering, 2002.
131. Penny, D. The Software Landscape: A Visual Formalism for Programming-in-the-Large; University of Toronto, 1993.
132. Devanbu, P.T. GENOA - a customizable front-end retargetable source code analysis framework. ACM TOSEM 1999, 8 (2), 177–212.
133. Cordy, J.R.; Dean, T.R.; Malton, A.J.; Schneider, K.A. Source transformation in software engineering using the TXL transformation system. J. Inf. Softw. Technol. 2002, 44 (13), 827–837.
134. Ferenc, R.; Beszedes, A.; Tarkiainen, M.; Gyimothy, T. Columbus reverse engineering tool and schema for C++. In Proceedings of the International Conference on Software Maintenance, 2002.
135. Raza, A.; Vogel, G.; Plödereder, E. Bauhaus - a tool suite for program analysis and reverse engineering. In Proceedings of Reliable Software Technologies, Ada-Europe 2006, LNCS 4006, 2006.
136. IEEE Technical Committee on Software Engineering, Committee on Reverse Engineering and Reengineering. http://reengineer.org/tcse/revengr/.
138. Abrahamsson, P. et al. Agile processes in software engineering and extreme programming. In Proceedings of the 9th International Conference, XP, Springer, 2008.
139. Bravenboer, M.; Kalleberg, K.T.; Vermaas, R.; Visser, E. Stratego/XT 0.16: components for transformation systems. In Proceedings of the 2006 ACM SIGPLAN Workshop on Partial Evaluation and Semantics-Based Program Manipulation, 2006.
140. Barros, S.; Bodhuin, T.; Escudie, A.; Queille, J.; Voidrot, J. Supporting impact analysis: a semi-automated technique and associated tool. In Proceedings of the 11th ICSM, IEEE CS Press, 1995.
142. Rilling, J.; Meng, W.J.; Chen, F.; Charland, P. Software visualization - a process perspective. In Proceedings of VISSOFT, 2007.
144. Storey, M.; Best, C.; Michaud, J.; Rayside, D.; Litoiu, M.; Musen, M. SHriMP views: an interactive environment for information visualization and navigation. In Proceedings of Conference on Human Factors in Computing Systems; ACM Press: New York, USA, 2002.
145. Greevy, O.; Lanza, M.; Wysseier, C. Visualizing live software systems in 3D. In Proceedings of the 2006 ACM Symposium on Software Visualization; ACM Press: New York, USA, 2006.
146. German, D.; Hindle, A. Visualizing the evolution of software using softchange. Int. J. Softw. Eng. Knowl. Eng. 2006, 16 (1).
147. Storey, M.A.; Bennett, C.; Bull, I.; German, D. Remixing visualization to support collaboration in software maintenance. In Proceedings of IEEE Conference on Software Maintenance, FOSM Track, 2008.
148. Ogawa, M.; Ma, K.; Bird, C.; Devanbu, P.; Gourley, A. Visualizing social interaction in open source software projects. In Proceedings of 6th International Asia-Pacific Symposium on Visualization (APVIS).
149. Xie, T.; Pei, J. MAPO: mining API usages from open source repositories. In Proceedings of International Workshop on Mining Software Repositories (MSR), 2006.
150. Mockus, A.; Fielding, R.T.; Herbsleb, J.D. A case study of open source software development: the Apache server. In Proceedings of the 22nd International Conference on Software Engineering, 2000.
151. Godfrey, M.; German, D. The past, present, and future of software evolution. In Proceedings of IEEE International Conference on Software Maintenance, FOSM Track, 2008.
152. Hassan, A.E.; Mockus, A.; Holt, R.C.; Johnson, P.M. Guest Editor’s Special Issue on Mining Software Repositories. IEEE Trans. Softw. Eng. 2005, 31 (6), 426–428.
153. Zimmermann, T.; Weißgerber, P.; Diehl, S.; Zeller, A. Mining version histories to guide software changes. IEEE Trans. Softw. Eng. 2005, 31 (6), 429–445.
154. Canfora, G.; Cerulo, L. Impact analysis by mining software and change request repositories. In Proceedings of the 11th International Symposium on Software Metrics, IEEE CS Press, 2005.
155. Hassan, A. The road ahead: mining software repositories. In Proceedings of IEEE Conference on Software Maintenance, FOSM Track, 2008.
156. Bassil, S.; Keller, R. Software visualization tools: survey and analysis. In Proceedings of International Workshop on Program Comprehension (IWPC ’01), 2001.
157. Beyer, D.; Hassan, A.E. Animated visualization of software history using evolution story boards. In Proceedings of the 13th Working Conference on Reverse Engineering (WCRE ’06), 2006.
158. Collard, M.L.; Kagdi, H.H.; Maletic, J.I. An XML-based lightweight C++ fact extractor. In Proceedings of the 11th International Workshop on Program Comprehension (IWPC), 2003.
159. Gotel, O.C.Z.; Finkelstein, A.C.W. An analysis of the requirements traceability problem. In Proceedings of 1st International Conference on Requirements Engineering, 1994.
160. Kitchenham, B.A.; Travassos, G.H.; von Mayrhauser, A.; Niessink, F.; Schneidewind, N.F.; Singer, J.; Takada, S.; Vehvilainen, R.; Yang, H. Towards an ontology of software maintenance. Journal of Software Maintenance and Evolution: Research and Practice 1999, 11 (6).
161. Lehman, M.M.; Perry, D.E.; Ramil, J.F. Implications of evolution metrics on software maintenance. In Proceedings of the IEEE International Conference on Software Maintenance, 1998.
162. Margolis, B. SOA for the Business Developer: Concepts, BPEL, and SCA; MC Press, 2007.
163. Williams, C.C.; Hollingsworth, J.K. Automatic mining of source code repositories to improve bug finding techniques. IEEE Trans. Softw. Eng. 2005, 31 (6), 466–480.
