Thesis Adrian Yankov

Eindhoven University of Technology

MASTER

Re-engineering the re-engineering process

Yankov, A.G.

Award date: 2018

Disclaimer
This document contains a student thesis (bachelor's or master's), as authored by a student at Eindhoven University of Technology. Student theses are made available in the TU/e repository upon obtaining the required degree. The grade received is not published on the document as presented in the repository. The required complexity or quality of research of student theses may vary by program, and the required minimum study period may vary in duration.

General rights
Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.
• Users may download and print one copy of any publication from the public portal for the purpose of private study or research.
• You may not further distribute the material or use it for any profit-making activity or commercial gain.

Department of Mathematics and Computer Science
Model Driven Software Engineering Research Group

Re-engineering the re-engineering process
Master Thesis
Adrian Yankov

Supervisors:
prof.dr. M.G.J. (Mark) van den Brand
dr. Y. (Yaping) Luo
ir. M. (Marc) Hamilton
dr. N. (Natalia) Sidorova

Eindhoven, Jan 2018

Abstract

Many legacy systems around the world are still in operation. Their maintenance has become problematic because the original developers have left or proper documentation is lacking. Re-engineering can be one of the solutions to these problems. It applies reverse engineering to recover the missing artifacts and then forward engineering to generate a new and more modern software product. Additionally, the re-engineering process can be supported by models, turning it into model-based re-engineering. Nevertheless, the existing model-based re-engineering standards are not precise enough and do not address verification and validation of the new process and product.

In this thesis we give an overview of the current model-based re-engineering standards. We then propose a new, more detailed approach, depicted in Business Process Model and Notation (BPMN), with sample activities and process inputs and outputs. We also investigate reverse engineering as an essential part of the main re-engineering process by reviewing the available literature and interviewing experts from industry. We conclude with a case study on an Internet of Things (IoT) project, where our model-based re-engineering approach is applied.

Acknowledgements

Completing this thesis gave me a great deal of satisfaction. This important event in my life would not have been possible without a few people, whom I would like to acknowledge.

First of all, I would like to express special gratitude to prof. Mark van den Brand for supervising the overall graduation process. He also helped me greatly with my personal development.

Secondly, I would like to thank dr. Yaping Luo for agreeing to be my daily supervisor and always supporting me in word and deed. She was a huge help to me in mediating between the academic and industrial worlds. In addition to being my daily supervisor, dr. Yaping Luo is also a good friend.

Thirdly, I would like to express gratitude to Marc Hamilton for providing irreplaceable feedback from an industrial point of view and being my main contact at Altran. His many years of work experience contributed greatly to my research.

Last, but not least, my gratitude also goes to dr. Natalia Sidorova for playing an essential role as a committee member and assessing my thesis.

I dedicate this last paragraph to my loving parents, girlfriend and friends. They are the main driving force behind my motivation. I could not have gone through this journey without their moral support.

List of Acronyms

ADL     Architecture Description Language
API     Application Programming Interface
AST     Abstract Syntax Tree
BPMN    Business Process Model and Notation
CMMI    Capability Maturity Model Integration
CRUD    Create Read Update Delete
DSL     Domain Specific Language
EMF     Eclipse Modeling Framework
GUI     Graphical User Interface
IDE     Integrated Development Environment
IoT     Internet of Things
IRE     Integrated Reverse-Engineering Environment
KLOC    Kilo Lines of Code
KPI     Key Performance Indicators
MDE     Model Driven Engineering
MDSE    Model Driven Software Engineering
MOF     Meta-Object-Facility
MVC     Model-View-Controller
OMG     Object Management Group
QVT     Query/View/Transformation
UML     Unified Modeling Language

Contents

Contents
List of Figures
List of Tables
1 Introduction
    1.1 General Introduction
    1.2 Problem Definition
    1.3 Research Questions
    1.4 Structure of Thesis
2 Background
    2.1 Legacy Systems
    2.2 Re-engineering
    2.3 Reverse Engineering
    2.4 Forward Engineering
    2.5 Model-driven software engineering
3 The typical process of model-based re-engineering
    3.1 Introduction
    3.2 Related work
    3.3 Re-engineering scenarios
        3.3.1 Scenario I
        3.3.2 Scenario II
    3.4 A model-based re-engineering process
        3.4.1 Detailed re-engineering process in BPMN
    3.5 Re-engineering process inputs and outputs
    3.6 Conclusion
4 Status of reverse engineering
    4.1 Introduction
    4.2 Part I: Literature Study
    4.3 Pretty printers and code visualization
    4.4 Static and Dynamic Analyses
        4.4.1 Static Analysis
        4.4.2 Dynamic Analysis
    4.5 Reverse Engineering Challenges
    4.6 Reverse Engineering Tools
    4.7 Part II: Interviews
        4.7.1 Planning
        4.7.2 Design
        4.7.3 Performing the interview
        4.7.4 Results
        4.7.5 Threats to validity
    4.8 Conclusion
5 Validation of the re-engineering process in an industrial environment
    5.1 Introduction
    5.2 The YouKnowWatt Project
        5.2.1 Project introduction
    5.3 Challenges with IoT projects
    5.4 Chimera (A-B-C-D)
    5.5 Lotte
    5.6 Validating the model-based re-engineering process
        5.6.1 Overview
        5.6.2 Mapping to the proposed BPMN model
        5.6.3 Gathering all available project information
        5.6.4 Reverse Engineering Plan
        5.6.5 Reverse Engineering Tools: Old YouKnowWatt → models
        5.6.6 Perform early validation
        5.6.7 Applying model-driven engineering
        5.6.8 Analyzing the results
    5.7 Extended BPMN after validation
    5.8 Conclusion
6 Conclusions
    6.1 Research questions and conclusions
    6.2 Future work
Bibliography
Appendix
A Interview materials
B Reverse Engineering Plan
    B.1 Introduction
    B.2 Definition & Justification
        B.2.1 Phase 1.1: Analyze the project goals
        B.2.2 Phase 1.2: Inventory of available existing components
        B.2.3 Phase 1.3: Determine reverse engineering strategy
    B.3 Execution

List of Figures

3.1 The SEI Horseshoe Model [1]
3.2 OMG ADM Standards and the ADM Horseshoe [2]
3.3 Scenario I - same process, two similar products
3.4 Scenario II - two different processes, two similar products
3.5 Detailed Re-engineering Process expressed in BPMN
4.1 Collected papers' overview
4.2 Reverse Engineering Requirements results from interview
5.1 YouKnowWatt customer example
5.2 A-B-C-D Framework
5.3 A-B-C-D Framework evolution
5.4 Lotte and YouKnowWatt relation