Dynamic Reverse Engineering of Graphical User Interfaces

Dynamic Reverse Engineering of Graphical User Interfaces Inesˆ Coimbra Morgado and Ana C. R. Paiva Joao˜ Pascoal Faria Department of Informatics Engineering, Department of Informatics Engineering, Faculty of Engineering, University of Porto, Faculty of Engineering, University of Porto rua Dr. Roberto Frias, 4200-465 Porto, Portugal rua Dr. Roberto Frias, 4200-465 Porto, Portugal fpro11016, [email protected] INESC TEC, Porto, Portugal [email protected] Abstract—This paper presents a dynamic reverse engineering The challenge tackled in this research work is the automatic approach and a tool, ReGUI, developed to reduce the effort of construction of part of the software’s GUI model (structure obtaining models of the structure and behaviour of a software and behaviour) using a dynamic reverse engineering technique. applications Graphical User Interface (GUI). It describes, in more detail, the architecture of the REGUI tool, the process followed to The extracted information is presented in several formats that extract information and the different types of models produced allow performing different types of analysis, such as, visual to represent such information. Each model describes different inspection, model checking and MBGT. characteristics of the GUI. Besides graphical representations, The term reverse engineering was firstly defined in 1985 by which allow checking visually properties of the GUI, the tool Rekoff [5] as “the process of developing a set of specifications also generates a textual model in Spec# to be used in the context of model based GUI testing and a Symbolic Model for a complex hardware system by an orderly examination of Verification model, which enables the verification of several specimens of that system”. Five years later, Chikofsky and properties expressed in computation tree logic. The models Cross [6] adapted this definition to software systems: “Reverse produced must be completed and validated in order to ensure that Engineering is the process of analysing a subject system to (1) they faithfully describe the intended behaviour. This validation identify the system’s components and interrelationships and (2) process may be performed by manually analysing the graphical models produced or automatically by proving properties, such to create representations of the system in another form or at as reachability, through model checking. A feasibility study is a higher level of abstraction”. described to illustrate the overall approach, the tool and the The origin of software reverse engineering lies on the results obtained. necessity of improving and automating software maintenance. Keywords-ReGUI; Dynamic Reverse Engineering; GUI Testing; It is estimated that program comprehension [7], i.e., under- Properties Verification; CTL; Model Checking; SMV standing the structure and the behaviour of the software, corresponds to over 50% of software maintenance [8]. As such, developing tools which may aid software engineers on I. INTRODUCTION this task is of the utmost importance. Reverse engineering This paper extends the research work presented in [1], which has already proved to be useful on this subject. For example, describes a reverse engineering tool to extract a model from the reverse engineering helped coping with the Y2K problem, with execution of a Graphical User Interface (GUI). In particular, the European currency conversion and with the migration of the state of the art is improved and the overall approach information systems to the web and towards the electronic is described in more detail. Moreover, a new module was commerce [9]. For such reasons, the IEEE-1219 standard1, implemented in order to automatically generate a Symbolic which was replaced by the IEEE-14764 one2, recommends Model Verification (SMV) model for model checking. The reverse engineering as a key supporting technology to software case study was extended in order to illustrate the additional maintenance [9]. features. In the last two decades, reverse engineering tools have GUI models are key inputs for several advanced techniques, evolved considerably and, nowadays, reverse engineering is such as Model Based GUI Testing (MBGT) [2], [3], [4], useful for other fields of study rather than software main- which enables the automatic test case generation, increasing tenance, such as, software testing and auditing security and the systematisation and automation of the GUI testing process, vulnerability. and model checking, which enables an automatic verification According to Canfora et al. [10], nowadays, the main goals and validation of some properties of the system. However, of reverse engineering are: the manual construction of such models is a time consuming • recovering architectures and design patterns; and error prone activity. One way of diminishing this effort 1IEEE Standard for Software Maintenance is to automatically construct part of the model by a reverse 2Standard for Software Engineering - Software Life Cycle Processes - engineering process. Maintenance • re-documenting programs and databases; enable a visual inspection of some properties, such as the • identifying reusable assets; number of windows of the GUI; textual models can be used • building traceability between software artefacts; for MBGT (Spec# [15] model) and to prove properties (SMV • computing change impacts; [16] model). • re-modularising existing systems; ReGUI v2.0 is fully automatic and uses a different approach • renewing user interfaces; from its previous version [17] so the results achieved, such • migrating towards new architectures and platforms; as the extracted dependencies and the produced graphs, are • testing and maintenance. different. As every technology, reverse engineering techniques can The rest of this paper is organised as follows. Section also be used with malicious intent [11], like removing software II presents the state of the art on user interface reverse protection and limitations or allowing unauthorised access engineering. Section III presents the proposed approach and to systems/data. However, the developers may use the same the developed tool, ReGUI. Section IV presents the exploration techniques in order to assure software’s safety. process and the challenges faced during the development of Yet in 1990, Chikofsky and Cross [6] divided the reverse ReGUI. Section V describes the outputs that can be obtained. engineering process in two parts: an analyser, which collects Section VI presents a feasibility study on Microsoft Notepad and stores the information, and an abstractor, which represents v6.1, presenting the results obtained. Section VII presents that information at a higher level of abstraction, i.e., a model, some conclusions about this research work, along with the either graphical or textual. Figure 1 depicts the representation limitations of the approach and future work. of a common reverse engineering process. II. STATE OF THE ART This Section presents the state of the art on reverse engineering, mainly on GUI reverse engineering, regarding the time interval from 2000 to 2011. A. Static Analysis As stated in Section I, static reverse engineering extracts information from textual sources, usually the source code, of the Application Under Analysis (AUA) [12]. The main techniques used in static analysis are code parsers, which are Fig. 1. Model of reverse engineering tools’ architecture [6] used to analyse the source code itself, query engines, which are used to verify certain facts of the code, and presentation In Figure 1 the analyser is referred to as a Parser, Semantic engines, which are used to depict the query results [18]. analyser because, in the early years of reverse engineering, Staiger’s Approach: Staiger [19] presented, in 2007, an these were the most common techniques. Nowadays, be- approach for the Bauhaus tool suite5 [20] to statically reverse sides static techniques [12], which extract information from engineer the source code of a GUI, in order to support program the source code, there are other three different approaches understanding, maintenance and standards’ analysis. Staiger [13], [14] for reverse engineering: dynamic, which extracts claimed researchers had not focused their work on the static information from a running software system; hybrid, which analysis of GUIs, even though most applications provided mixes both static and dynamic approaches; and historical, one. Staiger’s approach is divided in three different phases: which extracts information about the evolution of the system detecting the GUI elements, detecting widget hierarchies and from version control systems, like SVN3 or GIT4. Dynamic detecting event connections. For the first phase, Staiger detects approaches have the advantages of being independent of the which data types on the source code had any connection to GUI implementation language and of not requiring access to the GUI. Having detected these data types, he identifies the source code. However, the existing dynamic approaches have variables and respective parameters, as well as the functions limitations in terms of the behavioural information they are or methods that are part of the GUI. After the source code able to extract. elements are identified, the next phase finds the actual GUI This paper describes a dynamic reverse engineering tool, elements, by identifying when each of the elements was ReGUI, developed to automatically extract structural and be- created and the relationships among them. This provides the havioural information from a GUI, including some behavioural hierarchy of the elements.

Load more