Appendix F Use Case Diagrams

Visualizing Legacy Systems with UML

Abstract

Understanding a system is of critical importance to a developer. A developer must be able to understand the business processes being modelled by the system along with the system’s functionality, structure, events, and interactions with external entities. This understanding is of even more importance in reverse engineering. Although developers have the advantage of having the source code available, system documentation is often missing or incomplete and the original users, whose requirements were used to design the system, are often long gone.

0. Introduction

A developer requires a model of the system not only to understand the business processes being modelled but also to understand the structure and dynamics of the system. Visualization of this system model is often necessary in order to clearly depict the complex relationships among the model elements of that system.

This chapter explains why visualization of a system, through a series of UML diagrams, is necessary and why relying on source code as the sole form of system documentation is often inadequate. A detailed outline of the process of deriving information from an analysis of a legacy system is given, along with the rules to convert this information to a UML model of the system. This chapter also investigates whether it is possible to visualize a system and which of the nine possible UML diagrams are needed to represent this visualization. A brief overview is provided of the methods used to extract UML diagrams from a legacy system, along with their inherent difficulties. Finally, a tool, TAGDUR, is introduced which automates some of these analysis processes and models some aspects of the system in UML.

0.1. Why Not Rely on Source Code for System Documentation?

Relying solely on source code to obtain an understanding of the system has many disadvantages, particularly in legacy systems.
In many legacy systems, the original design of the system has been obfuscated by the many incremental changes made during the system’s long maintenance history. Furthermore, the end-users, whose requirements originally helped design the system to meet their business needs, have usually long gone and the documentation outlining these requirements is often missing. Without these original end-users and without documentation, it is often difficult to determine the exact business processes that these systems model.

Source code is very programming-language-dependent (Yang, 1999). In order to understand the code, the developer must be fully proficient in the programming language used to develop the system.

The function and role of each section of source code within the system may be obvious to a developer but may be meaningless to a non-technical end-user. End-users want to see how the business processes that the system represents are modelled and they want to ensure that all of their business requirements are met in the system. End-users are not concerned with the internal design and details of the system.

It is difficult for developers to view parallel data and control flows from reading the code. The control logic, especially if it is heavily nested, is difficult to visualize from the source code; in particular, it is hard to identify quickly which control constructs affect which parts of the code.

It is difficult to visualize events occurring in various parts of the code and to visualize how these events interact with various objects in the system.

Relying on source code as the only documentation source makes it difficult to view the interaction of system objects with other objects and external actors.

Source code encompasses many perspectives (such as objects, deployment of components, and timing of object interactions) within itself. These multiple perspectives are confusing: it is difficult to represent the source code in each separate perspective.
Source code has the additional disadvantage that it is difficult to represent abstract concepts and behaviour from low-level, detailed source code.

0.2. UML

One method to overcome this problem of requiring multiple perspectives of the same system is to visualize the system using some sort of graphical notation. Each perspective is given its own diagram type, which is specialized to best represent that perspective. One of the most common graphical notations is UML (Unified Modelling Language). UML provides multiple perspectives of the system. Use case diagrams model the business processes embodied in the system from a user’s perspective. Statecharts and class diagrams model the behaviour and structure of the system respectively; the behaviour and structure of the system would be of most interest to developers.

0.3. Why Is Visualization of the System Necessary for Reverse Engineering?

Program understanding can be defined as the process of developing an accurate mental model of a software system’s intended architecture, purpose, and behaviour. This model is developed through the use of pattern matching, visualisation, and knowledge-based techniques. The field of program understanding involves fundamental issues of code representation, structural system representation, data and control flow, quality and complexity metrics, localisation of algorithms and plans, identification of abstract data types and generic operations, and multiple view system analysis using visualisation and analysis tools.

Reverse engineering involves analysing data bindings, common reference analysis, similarity analysis, and subsystem and redundancy analysis (Whitney, 1995). Program understanding involves the use of tools that perform complex reasoning about how behaviours and properties arise from particular combinations of language primitives within the program.
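As a rough illustration of what reasoning about a combination of language primitives can look like, the sketch below walks a Python syntax tree and flags `for` loops whose body updates a variable with an augmented assignment, a combination that commonly signals an accumulation. It uses Python's standard `ast` module purely for illustration; the class name `AccumulationFinder`, the helper `find_accumulations`, and the sample program are hypothetical and are not taken from this chapter or from TAGDUR.

```python
import ast

# Hypothetical sample program; the leading newline makes "total = 0" line 2.
SAMPLE = """
total = 0
for price in prices:
    total += price

for name in names:
    print(name)
"""


class AccumulationFinder(ast.NodeVisitor):
    """Collect (line, variable) pairs for loops that accumulate into a variable."""

    def __init__(self):
        self.matches = []

    def visit_For(self, node):
        # Inspect the statements directly inside the loop body.
        for stmt in node.body:
            if isinstance(stmt, ast.AugAssign) and isinstance(stmt.target, ast.Name):
                self.matches.append((node.lineno, stmt.target.id))
        # Keep walking so that nested loops are examined as well.
        self.generic_visit(node)


def find_accumulations(source):
    """Parse the source text and return the loops that look like accumulations."""
    finder = AccumulationFinder()
    finder.visit(ast.parse(source))
    return finder.matches


if __name__ == "__main__":
    for line, variable in find_accumulations(SAMPLE):
        print(f"line {line}: loop accumulates into '{variable}'")
```

The match here is deliberately loose: it reports any loop containing an augmented assignment to a simple name, which may be close enough for documentation purposes but would be far too weak for anything resembling verification.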
One method of program understanding is to use visitors, or small reusable classes, whose function is to parse the source system and evaluate the combinations of language primitives that have been discovered during parsing (Wills, 1993). Other methods try to evaluate and understand a system by taking, as input, the goals and purpose of the system as a specification (Johnson, 1986). Another method is to use clichés, which try to recognise commonly used data structures and algorithms and then match these structures and algorithms to higher-level abstractions. Examples of clichés are the structures and algorithms associated with hash tables and priority queues. The degree of accuracy of the matching varies with the goal of program understanding. For example, an exact match is needed for program verification while only a reasonably close match is needed for documentation purposes.

Software visualisation is a technique that enables humans to use their brains to make analogies and to link a visual software representation with the ideas that this representation portrays. This link would be much more difficult to make if the software representations were purely in textual form. Software visualisation relies on crafts such as animation, graphic design, and cinematography (Price, 1992).

Software visualisation has been used for decades in order to help developers understand programs. These techniques vary from flowcharts [Goldstein] to animated graphical representations of data structures (Baecker, 1981). However, many of these software visualisation systems are limited to displaying one type of data or one level of abstraction. Few visualisation systems have the ability to suppress lower-level detail in order to depict higher-level concepts of the system.

Program visualisation systems or tools can be characterised according to their scope, content, form, method, interaction, and effectiveness.
Scope refers to the visualisation system’s general characteristics, such as whether it models concurrent programs or whether there are any size limitations on the system being depicted.

Content refers to the content being visualised. Some visualisation systems can model both data and code while others model algorithms only.

Form refers to what elements are being used in the visualisation. Some visualisation systems use animated graphics while other systems provide multiple views of different parts of the system.

Method refers to how the tool specifies the visualisation. Does the tool require the program source code to be modified in order for it to be visualised? Some tools require the user to insert special statements in code of special interest in order for this code to be properly visualised.

Interaction refers to how the user interacts with and controls the visualisation. How does the user navigate through the visualisation of a large system in order to see how different parts of the system are being modelled?

Effectiveness refers to how well the visualisation communicates information regarding the system being visualised (Price, 1992).

Using this taxonomy outlined by Price, a number of program visualisation systems can be objectively evaluated. The film, Sorting Out Sorting, is an animated visualisation tool to explain algorithms. Balsa, another visualisation tool, generates animations of Pascal programs. LogoMotion allows users to indicate what aspects of a Logo program they wish to have visualised.

Chikofsky and Cross define reverse engineering to be “the process of analyzing a subject system to identify the system’s components and their interrelationships and create representations of the system in another form or at a higher level of abstraction”. Chikofsky and Cross state six goals of reverse engineering:

- controlling complexity
- generating alternative views
- recovering lost information
- detecting
