Software Archaeology in Practice: Recovering Lost Behaviour from Legacy Code

Verum Software Tools BV
www.verum.com

Abstract—Reengineering legacy software is an undesirable but nevertheless occasionally unavoidable necessity. The core challenge is to efficiently and effectively uncover the behaviour of legacy code, the origins of which have been lost in the mists of time, establishing a complete and correct foundation for further re-engineering. In this paper, we present a technique by which such lost or poorly understood behaviour can be recovered and turned into formally verifiable models. These models then offer a solid foundation for the further development of a software system.

Keywords—software reengineering; legacy code; model driven software engineering; formal methods; software engineering tools; software components

I. INTRODUCTION

An unpleasant but nevertheless unavoidable truth of software development is that conventionally developed software “rots” in time. Rot occurs slowly and insidiously, driven by the very nature of source code itself and the human factors that impinge on developing and maintaining it. It often starts when changes to source code are not reflected in documentation, leading to a loss of readily accessible information. It accelerates when development teams change and knowledge of – in the meantime, poorly documented – code is lost. It proliferates when new features are added by fresh software engineers, based on incomplete documentation and knowledge. It reaches its zenith as the law of diminishing returns bites and development progress grinds to a halt. It is at this point that reengineering the software becomes unavoidable.

From a business perspective, reengineering software is a nightmare. It involves spending a lot of time and money just to stand still. It is also highly risky, simply because the existing legacy software is so poorly understood. And in the worst case, history can repeat itself, with the newly reengineered codebase being no more resistant to rot than its predecessor. This risk, and the work involved in reengineering, can be greatly diminished if the essential functionality and behaviour of the software can be recovered from the legacy codebase. Future rot can be minimised by converting the legacy code into verifiably complete and correct models.

In an ideal world, it would be possible to automatically reverse engineer an existing codebase using tooling. But then in an ideal world water could flow uphill and time could be reversed. The simple fact is that the 2nd law of thermodynamics means that creating a more highly ordered system (reengineered software) – from a lower ordered system (legacy code) – requires work. The trick is to perform that work as efficiently and effectively as possible.

II. MODEL DRIVEN SOFTWARE ENGINEERING

A generalised definition of Model Driven Software Engineering (MDSE) can be found on Wikipedia [1]. A more pragmatic description is provided by Küster [2]:

• Model-Driven Software Engineering (MDSE) is a software engineering paradigm.
• Models are considered as primary artefacts from which parts of a software system can be automatically generated.
• Models are usually more abstract representations of the system to be built.
• MDSE can improve productivity and communication.
• MDSE requires technologies and tools in order to be successfully applied.

As stated by Küster, MDSE can be used to improve productivity and communication. Further, models designed using formal verification techniques have the additional benefit of increased longevity. Specifically, by objectively establishing that a model is complete and correct for a range of properties, amongst other benefits one ensures that the model will not be subject to quite the same level of rot as conventionally developed source code.
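
To make the idea of a model being "complete and correct for a range of properties" more concrete, the following is a minimal Python sketch. It is not the Dezyne language, nor Dezyne's verification engine; the model, its state and event names, and the helper functions are all hypothetical and chosen purely for illustration. It treats a model as a table from (state, event) pairs to outcomes, checks completeness (every pair has a defined outcome) and checks one example property (no reachable state in which every event is illegal).

```python
from collections import deque

ILLEGAL = None  # marker: the event is explicitly not allowed in this state

# Hypothetical protocol model: (state, event) -> next state, or ILLEGAL.
MODEL = {
    ("Idle", "start"): "Running",
    ("Idle", "stop"): ILLEGAL,
    ("Running", "start"): ILLEGAL,
    ("Running", "stop"): "Idle",
}
STATES = {"Idle", "Running"}
EVENTS = {"start", "stop"}


def missing_cases(model, states, events):
    """Completeness: return every (state, event) pair the model omits."""
    return sorted((s, e) for s in states for e in events if (s, e) not in model)


def reachable_states(model, initial):
    """Explore all states reachable from `initial` through legal events."""
    seen, queue = {initial}, deque([initial])
    while queue:
        state = queue.popleft()
        for (src, _event), dst in model.items():
            if src == state and dst is not ILLEGAL and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen


def deadlocked_states(model, initial):
    """Example property: return reachable states with no legal event."""
    return sorted(
        s
        for s in reachable_states(model, initial)
        if not any(
            src == s and dst is not ILLEGAL for (src, _event), dst in model.items()
        )
    )
```

Because the table is explicit and finite, both checks are exhaustive rather than sampled, which is the sense in which a verified model can be said to resist rot: a later change that drops a case or introduces a dead end is reported mechanically instead of being discovered in the field.
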
Therefore, when reengineering a software (sub) system one can not only improve productivity, quality and understanding by restating the design of the (sub) system in verifiably complete and correct models, but one also greatly reduces the risk of the design deteriorating into chaos again.

In this paper, we will consider the application of Dezyne [3] to the problem of recovering lost or poorly understood behaviour from legacy software and turning it into verifiably complete and correct models. Dezyne is an MDSE tool that enables software engineers to create, explore and formally verify component-based software designs for embedded, technical and industrial software systems.

III. COMPONENTS AND COMPOSITIONALITY

The Dezyne modelling paradigm is analogous to that of the hardware world. Namely, Dezyne is based on the concept of a component and the inherent compositionality of components. Dezyne recognises three types of models: interface models, component models and system models.

A. Interface Models

In the hardware world, the externally visible behaviour of an IC is generally described by two things, a pin-out and a timing diagram, both usually found in a data sheet or technical manual for the component. This abstraction provides enough information for the IC to be used without revealing details of how it is implemented.

Dezyne interface models are analogous. They describe the API provided by a software component and the protocol - the sequence of allowed events and responses - that the API implements. An interface model is an abstraction that provides enough information for a software component to be used without revealing details of how the component is implemented. Thus, an interface model amounts to a specification of the externally visible behaviour of a software component.

Interface models are a key Dezyne concept upon which the compositionality of Dezyne components rests. When an interface model is used in conjunction with a Dezyne component model, Dezyne’s verification engine will assert that the component completely and correctly adheres to the protocol of any interface that it provides or requires (the Liskov Substitution Principle [3]). In this way, the structural integrity of entire systems composed from Dezyne components is established.

Figure 1: Dezyne Interface Model Example

B. Component Models

In the hardware world, components are implemented in silicon and it is in the silicon that the actual work of a component is done. It is essential to the basic utility of an IC that its implementation fully refines the interface that it specifies in its technical documentation.

Dezyne component models are analogous: the component does the actual work of a design. Every Dezyne component provides an interface model and thus it is essential to the basic utility of a component that it fully refines the specification that the interface provides.

Dezyne goes one step further than the hardware paradigm. Namely, any component that in turn requires the use of another component must also adhere to the protocol of the interface provided by that component. Dezyne verification technology formally ensures that every component model adheres to the interface models that it provides and requires.

Figure 2: Dezyne Component Header

C. System Models

In the hardware world, systems are designed by composing components together.

The same is true in Dezyne. A Dezyne System Model is used to declare instances of Dezyne components and to link them together through their port definitions, based on interface models. A Dezyne System Model can itself provide and require interfaces, making it possible to abstract and hide the entire implementation of a (sub) system. The result is that Dezyne sub-systems can be nested and appear as components at a higher level in a system.

IV. SOFTWARE ARCHAEOLOGY

Dezyne offers a means to reduce the cost and risk of the work involved in reengineering the behaviour of complex software systems. In a process that we have come to call “Software Archaeology”, Dezyne can be used to rediscover the ‘lost’ behaviour of a system. The key to this process is the use of Dezyne’s interface models to capture externally visible behaviour across interfaces to legacy software and to separate it into expected behaviour on the one hand and unexpected or erroneous behaviour on the other.

A. Interfacing to Legacy Software

Dezyne interface models are used to connect Dezyne components to legacy software and vice-versa. In this case the interface model represents merely an assumption of how the legacy component behaves, against which the Dezyne component is verified. Dezyne’s verification engine cannot be used to show that the legacy software complies with the interface model, meaning that the interface model offers no guarantee that the legacy component will stick to the protocol that the interface model defines.

With a small, simple or highly ordered legacy component it is possible to be confident that an interface model captures the exact protocol that the component implements. But when re-engineering legacy software, the behaviour of a component is poorly understood and therefore it is highly likely that a Dezyne interface model will represent only an approximation of the legacy component’s visible behaviour.

This can present an issue at runtime because Dezyne generated components assume that all other components they use or are used by, including legacy components, are correctly behaved. Dezyne’s verification engine guarantees that other Dezyne components meet this requirement. However, errant behaviour by a legacy component may cause a protocol violation on an interface, with the consequence that
