Submitted by Kollegger Manuel

Submitted at ISSE - Institute for Software Systems Engineering

Continuous Architecture Evaluation in the Context of Microservices

Supervisor: a. Univ.-Prof. Mag. Dr. Paul Grünbacher

February 2018

Master's Thesis to obtain the academic degree of Diplom-Ingenieur in the Master's Program Computer Science

Johannes Kepler University Linz, Altenbergerstraße 69, 4040 Linz, Österreich, www.jku.at, DVR 0093696

Abstract

Software architecture is an integral part of ensuring software quality. Traditionally, plan-driven methodologies defined big up-front architectures that provided the foundation to develop upon. Recent agile software processes, however, avoid big up-front designs and focus on the coding aspect, neglecting the architecture in the process. As continuous quality control is an integral part of agile development, the need for continuous architecture evaluation arises. As microservices are a popular approach to designing software architectures nowadays, mechanisms are needed to check the quality of the architecture and the conformance of the code with the architecture. This thesis presents an approach to architecture evaluation in agile processes with a special focus on microservices. The approach supports conformance checks between the defined architecture and the implementation. The developed prototype works across service boundaries using semantic versioning. We discuss the experiences with and results of our approach for different microservices and discuss the differences to the previous development process without our system.

Abstract

Software architectures are an important part of quality assurance. Traditional software projects define the architecture on which the software is built during the planning phase. Newer agile development methods avoid long planning phases and focus on coding, which pushes the architecture into the background. Since continuous quality control is an essential part of agile development, there is a need for continuous architecture evaluation. Today, microservices are an integral part of many software systems, so architectural quality control must also be possible across the boundaries of individual services. This thesis presents an approach to architecture checking for agile software projects with a focus on microservices. Our work supports validating the consistency between implementation and architecture, including checks across the boundaries of individual microservices. We present our results and experiences with our system. Furthermore, the improvements to the software development process are shown.

Acknowledgements

This thesis could not have been written without the great support of many people. Foremost, I want to thank my advisor a. Univ.-Prof. Mag. Dr. Paul Grünbacher for his support and good will. Further, I would like to thank Smarter-Ecommerce GmbH for providing the opportunity to develop the approach as part of my work for the company. On this note I would like to thank DI Peter Rietzler for his great support and mentoring.

Lastly, I want to thank Judith Hochegger for the great emotional support and good will. Without you I would have lost my drive and would probably have not done as well as I did.

Contents

1 Introduction
  1.1 Goals and Research Questions
  1.2 Research Methodology
  1.3 Thesis Outline

2 Background and Related Work
  2.1 Architecture Description Language
  2.2 Source Code Management Tools
  2.3 Agile Development
  2.4 Continuous Integration
  2.5 Continuous Architecture Validation
  2.6 Semantic Versioning
  2.7 Microservices
  2.8 Extraction of Microservices

3 Industrial Example: Development at Smarter-Ecommerce
  3.1 Overview of the Basic Development Process
  3.2 Microservices
  3.3 Basic Continuous Integration Workflow
  3.4 Architecture Violations at Smarter-Ecommerce
    3.4.1 Case Study Metrics
    3.4.2 Collection of Metrics
    3.4.3 Collected Data
  3.5 Requirements for CAE
    3.5.1 Flexibility
    3.5.2 Correctness
    3.5.3 Usability
    3.5.4 Applicability

4 Approach and Tool Architecture
  4.1 Envisioned Continuous Integration Workflow
  4.2 Architectural Evaluation
  4.3 Extraction of Microservices
  4.4 Semantic Versioning
  4.5 Inter-Service Dependency Checking

5 Implementation of the CAE Tool
  5.1 Dependency Checking within a Project
  5.2 Extraction of Microservices
  5.3 Semantic Versioning
  5.4 Inter-Service Dependency Checking
  5.5 Developer Tools

6 Evaluation
  6.1 Evaluation Method
    6.1.1 Case Study Metrics
    6.1.2 Collection of Metrics
    6.1.3 Collected Data
  6.2 Summary and Discussion of the Data
  6.3 Integration into a Bigger Software Project
  6.4 Lessons Learned and Developer Feedback

7 Conclusions

8 Curriculum Vitae

9 Statutory Declaration

List of Tables

3.1 Architecture violation metrics.
3.2 Architecture violation metrics continued.
3.3 Summary of architecture violation metrics.
3.4 Average time until a violation is fixed.

4.1 Semantic Versioning rules.

5.1 An example of dependency information in the database.

6.1 Structure and examples of the change log table.
6.2 Collected architectural violation metrics.
6.3 Collected architectural violation metrics continued.
6.4 Collected semantic versioning metrics.
6.5 Summary of metrics.
6.6 The average time until an architecture violation is fixed.
6.7 Upgrade type compared to breaking changes in our automated versioning system.
6.8 Upgrade type compared to breaking changes in the Maven repository [48].
6.9 Most common changes in the Maven repository [48].

List of Figures

3.1 Software development process for a new feature at Smarter-Ecommerce.
3.2 Microservice architecture at Smarter-Ecommerce.
3.3 Standard workflow for Continuous Integration (adaptation of [35]).

4.1 Our extended workflow for Continuous Integration, improving on [35]. The added steps are shown as orange tiles.
4.2 A monolithic system with specific bounded contexts, which will be extracted into microservices.
4.3 Contexts that communicate asynchronously can be extracted without refactoring.
4.4 Two bounded contexts and their dependencies before the split into microservices. The code is refactored into three areas to make extraction easier and define dependency structures.
4.5 An extracted service. A wrapper encapsulates the public interface into a published interface containing remote endpoints.

5.1 Workflow of the architecture evaluation system.
5.2 The package structure of the service-skeleton microservice.
5.3 Workflow of the semantic versioning system.
5.4 Workflow of the inter-service dependency system.
5.5 The Continuous Integration environment at Smarter-Ecommerce GmbH.
5.6 The Continuous Integration environment displaying an architecture violation.
5.7 The error message corresponding to the architecture violation.
5.8 Developer's view of the build of microservices.
5.9 A view that displays the live version of microservices.
5.10 A web view of inter-service dependencies between microservices.

6.1 Layered architecture with violations shown as red arrows.

Acronyms

API Application Programming Interface.

CAE Continuous Architecture Evaluation.

CD Continuous Delivery.

CI Continuous Integration.

DDD Domain Driven Design.

FDD Feature Driven Development.

IDE Integrated Development Environment.

JVM Java Virtual Machine.

PPC Pay Per Click.

SEA Search Engine Automation.

VCS Version Control System.

Chapter 1

Introduction

Software engineering faces numerous challenging tasks nowadays. New features have to be developed as fast as possible while the quality of the software has to be assured. This gave a big rise to topics like Test-Driven Development (TDD) and Continuous Integration (CI) and led to the broad term DevOps, a set of practices that automates and streamlines the tasks between development and operations to make a developer's job easier. Methodologies such as agile development and DevOps take care of code quality itself, but do not validate or maintain the quality of the software architecture of these fast-evolving systems [22]. Obviously, the quality of the architecture has to be controlled as well. Therefore, methodologies and rules have to be defined for a system, and basic architectural checks have to be in place to avoid dependency problems and illegal code references [29]. The most basic form of such architecture checks is a dependency system specifying dependencies on different abstraction levels, such as packages in Java. Defining the valid and invalid dependencies of software components in a complex system is the task of the software architect or the developer. It is undeniable that the software architecture is of utmost importance [30]; however, practice shows that the architectural rules defined beforehand are often neglected during development [22, 55, 44, 43]. We believe that, given an architecture definition and the existing code base, the architecture can be validated and this process can be integrated into agile teams and into the continuous delivery process. However, simply writing down the architecture and dependencies of a complex system with an enormous code base is no easy task, and the integration into the CI environment is also taxing.

The goal of this thesis is to develop a tool-supported approach allowing architects and developers to define the architecture of a complex system and to continuously check the evolving code base against the architecture in a Continuous Integration environment.

Other approaches [22, 31, 25] address continuous architectural validation by specifying concrete interfaces between components and validating those interfaces. We generalize those dependencies by defining different layers of components within a system and defining public interfaces between components. These generalizations make it easier to split the architectural management into different levels, so that developers do not need to know the entire software system. Our approach also comes with a predefined configuration, reducing the load on each developer, as they do not have to specify the common interfaces between complex systems or microservices from scratch. The architecture of each service is self-contained, so that the architect only needs to know the big picture of how different services communicate with each other. The self-contained architecture is shipped with the individual service, making architectural knowledge available to everyone. Architectural violations are checked alongside code quality assurance and are integrated into the continuous delivery process.

Based on the evaluation of code dependencies to validate the code architecture, different services and libraries can be versioned according to the principles of semantic versioning [47, 48], further decreasing the chance of ending up in "dependency hell". As the public interface of each module is known, we can achieve loose coupling and high cohesion. Based on this public interface, the service is versioned according to the rules of semantic versioning. The version number is very relevant for Continuous Integration, as it indicates possible incompatibilities between services or libraries. We believe that this can improve the deployment process enormously, as dependencies that transcend the boundaries of a single software system can be checked based on the versions of libraries and services. If a service is upgraded with breaking changes, it is prevented from being released until all other services are compatible with the upgrade. This release block greatly enhances continuous delivery, as the release and deployment process becomes less prone to errors.
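To make the semantic versioning rules concrete, the following is a minimal, hypothetical sketch (not the thesis prototype) of how a next version number could be derived from the kind of change detected in a module's public interface; the class, method and enum names are invented for illustration.

```java
// Hypothetical sketch: deriving the next version from the kind of change
// detected in a module's public interface, following semantic versioning
// (MAJOR.MINOR.PATCH). Names are illustrative, not the thesis tool's API.
public class SemVerBump {

    enum Change { BREAKING, FEATURE, FIX }

    /** Returns the next version number for a given change type. */
    static String next(String version, Change change) {
        String[] p = version.split("\\.");
        int major = Integer.parseInt(p[0]);
        int minor = Integer.parseInt(p[1]);
        int patch = Integer.parseInt(p[2]);
        switch (change) {
            case BREAKING: // incompatible change to the public interface
                return (major + 1) + ".0.0";
            case FEATURE:  // backwards-compatible addition
                return major + "." + (minor + 1) + ".0";
            default:       // bug fix, no interface change
                return major + "." + minor + "." + (patch + 1);
        }
    }

    public static void main(String[] args) {
        System.out.println(next("1.4.2", Change.BREAKING)); // 2.0.0
        System.out.println(next("1.4.2", Change.FEATURE));  // 1.5.0
        System.out.println(next("1.4.2", Change.FIX));      // 1.4.3
    }
}
```

A release block as described above would then compare the major components of the versions that consuming services were built against.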

If all of the above tools are in place and the dependencies are valid, we believe that the extraction of a microservice is possible, provided the components are loosely coupled and highly cohesive. This is ensured if the components have a specific bounded context and the interfaces to other services are cleanly defined, which can be validated with Continuous Architecture Evaluation. We define specific sets of rules, constraints and patterns that help the developer build a microservice and monitor its architecture and versions in particular. These checks enable highly effective refactoring of all involved components and connectors within a complex software system.

By evaluating the tool on relatively large projects we can validate its practical application and how it works in well-established Continuous Integration environments. Additionally, the advantages of our approach are demonstrated by comparing it to a software development process that does not use our system.

1.1 Goals and Research Questions

This work addresses the topic of software architecture evaluation in agile software development methodologies. The focus on microservices poses further challenges to the domain of software architecture validation and conformance. This thesis pursues two main goals. The first goal is the evaluation and exploration of the Continuous Architecture Evaluation (CAE) [22, 25] domain in the context of microservices. The second goal focuses on the development of a system that enables CAE in software projects as well as continuous software evaluation for large software systems with a multitude of microservices.

The following research questions will be investigated:

How can Continuous Architecture Evaluation be integrated into software projects?
Since the concept of CAE is relatively new and mostly relevant to young methodologies such as agile software development, we develop a prototype that enables Continuous Architecture Evaluation.

How can Continuous Architecture Evaluation be used for microservices?
Traditional methods of CAE [22, 25, 31, 27] do not work for a microservice landscape: because a multitude of services communicate with each other without direct method calls, the extraction of dependencies seems impossible. We present an approach that handles Continuous Architecture Evaluation for microservices.

How effective is Continuous Architecture Evaluation in improving software quality?
As CAE is a relatively unexplored domain, and having introduced a prototype for CAE, we analyze how effective our approach is in improving the software development process.

Examples of such improvements are the availability of architectural knowledge to the developers or the average time needed to fix architectural violations in the actual implementation.

Does automated semantic versioning outperform manual versioning?
As semantic versioning is commonly used in industry to version libraries and services [48], and our approach to microservice architecture validation is based on semantic versioning, we investigate whether automatic versioning of services outperforms manual versioning in terms of correctness.

1.2 Research Methodology

This work follows the design science approach described in [34]: an artifact is built and evaluated to solve the problem. March et al. [42] introduce a way to classify research in the domain of information technology, which can be used for both design science and natural science. March et al. [42] describe four research activities as well as four research outputs. The research activities encompass the following:

• Build: Create the research outputs, i.e., develop the artifacts. The build activity is part of the design science methodology used in this work. The build process in this work is the creation of an architecture definition, the creation of the Continuous Architecture Evaluation prototype and the integration into the Continuous Integration environment.

• Evaluate: Demonstrate the quality and utility of a design artifact. The evaluation activity is part of the design science methodology. The evaluation tests the CAE prototype in a software development process and discusses the results.

• Theorize: Develop a theory. This is part of the natural science methodology and will not be discussed for this work.

• Justify: Investigate, reason and justify the theory. This is part of the natural science methodology and therefore not relevant for this work.

The outputs are defined as follows:

• Constructs: The basic vocabulary of the area. In this work the basic vocabulary consists of software components and the connections between those components.

• Models: The relationship between constructs. As an example the model displays the software components and their connections as concrete software architectures.

• Methods: The workflow and steps to execute and carry out a task. The method is the validation of the architecture.

• Instantiations: The realization of a prototype.

As a basis for this work, we focus on the model as the abstracted architecture of a software system. The constructs are the Java classes of the implementation and the defined classes of the architecture design, as well as their dependencies on other constructs. In order to investigate the defined classes, the dependencies are extracted and validated against the architecture design. The developed CAE tool is the prototype for architecture evaluation and validation. This prototype serves as the basis for the evaluation, comparing the traditional development process against the CAE tool-supported development process.
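The core of such a validation — extracted dependencies checked against an architecture definition — can be sketched as follows. This is a minimal illustration under assumed names (the layer names, the `ALLOWED` map and the `violations` method are invented), not the thesis prototype.

```java
// Minimal sketch: validating extracted code dependencies against an
// architecture definition. Each component lists the components it may
// depend on; every extracted dependency is checked against that list.
import java.util.*;

public class ArchitectureCheck {

    /** Architecture definition: component -> components it may depend on. */
    static final Map<String, Set<String>> ALLOWED = Map.of(
        "ui",          Set.of("service"),
        "service",     Set.of("persistence"),
        "persistence", Set.of()
    );

    /** Returns the violations among the extracted (from -> targets) dependencies. */
    static List<String> violations(Map<String, Set<String>> extracted) {
        List<String> result = new ArrayList<>();
        extracted.forEach((from, targets) -> {
            for (String to : targets) {
                if (!ALLOWED.getOrDefault(from, Set.of()).contains(to)) {
                    result.add(from + " -> " + to);
                }
            }
        });
        return result;
    }

    public static void main(String[] args) {
        // Dependencies as they might be extracted, e.g. from Java imports;
        // ui -> persistence skips a layer and violates the definition.
        Map<String, Set<String>> extracted = Map.of(
            "ui", Set.of("service", "persistence"),
            "service", Set.of("persistence")
        );
        System.out.println(violations(extracted)); // [ui -> persistence]
    }
}
```

In a CI setting, a non-empty violation list would fail the build.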

1.3 Thesis Outline

The chapters are structured as follows:

Chapter 2: Background and related work This chapter gives an overview of the related work on the topic. First we explain the definitions and background of related topics. Furthermore, the related work is described and the differences to this work are explained.

Chapter 3: Industrial Example: Development at Smarter-Ecommerce This chapter gives an introduction to the software development process at the company Smarter-Ecommerce GmbH [15]. We introduce a case study that motivates the need for Continuous Software Evaluation. Finally, the requirements for our CAE prototype are defined.

Chapter 4: Approach and system design This chapter gives an introduction to our general approach to building the prototype for Continuous Architecture Evaluation with regard to our specified requirements. A description of how each subsystem is integrated into a Continuous Integration environment [35] and how the subsystems interact with each other is also included.

Chapter 5: Implementation This chapter addresses our specific implementation as well as the specific tools used to develop our prototype for Continuous Architecture Evaluation. Finally, the viewpoint of the developer is presented by showing specific web interfaces related to CAE, Continuous Integration and the general development workflow.

Chapter 6: Evaluation This chapter evaluates our prototype by comparing metrics to the related work. The steps to validate our approach are described. The prototype is introduced into the software development process and the effectiveness of our approach for Continuous Architecture Evaluation is analyzed. Finally, the results are discussed.

Chapter 7: Conclusion and future work This chapter summarizes the results and reflects upon our approach. Furthermore, we discuss future work on our approach and additional research in the field of Continuous Architecture Evaluation.

Chapter 2

Background and Related Work

2.1 Architecture Description Language

Medvidovic et al. [43] provide a basis for architecture description languages (ADLs) by establishing a classification framework. Such a description language is necessary, as this project's aim is to test and validate the code against the described architecture. The described architecture consists of coarser-grained architectural elements and the connections between those elements. As there is no clear-cut way of modelling an architecture, Medvidovic et al. [43] introduce a framework that discusses the different concepts, properties and capabilities of architecture description languages. The key point of discussion is that the models of the different languages support the following concepts [43]:

• Components: A specific part of a system that is capable of performing a task.

• Connections: A connection represents the interaction between components.

• Hierarchical compositions: The grouping of components into a higher-level component based on their domain.

• Communication paradigm: The way a connection can be represented (e.g., a function call or a message being sent).

• Underlying formal model: The theoretical model that is used to explain the behaviour of components.

• Tools for modelling and verification: A set of tools that can create, eval- uate and verify a model.

• Automatic application code composition: A tool that allows for a graphical representation of the application.

Therefore, the architecture description language for the CAE prototype tries to meet all of these concepts. Components and their connections, among others, have to be validatable by the Continuous Architecture Evaluation prototype developed in this thesis. Luckham and Vera [39] list the following requirements for an ADL:

• Component abstraction: The interfaces should only define what they provide and what they need.

• Communication abstraction: The communication should only use those interfaces.

• Communication integrity: The communication must only happen on defined connections.

• Dynamic architectures: The number of components and connections may vary. The architecture can be extended arbitrarily. Additional components and connections can be added or removed at will.

• Hierarchical compositions: The components can be replaced by higher level components.

This work should therefore support an ADL that provides a way to abstract components and communication. Furthermore, the architectural evaluation must be able to validate the integrity of the communication while supporting dynamic architectures that can change. The architecture description language itself is modelled in a hierarchical way, making hierarchical compositions very intuitive to model. The components and their connectors are the key elements of an architecture description language, because the primary user of this system is the developer, who mainly works with the components and the connectors. That is why our approach tries to greatly simplify the syntax of components and connectors and to provide a very detailed validation of the communication abstraction.
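The communication integrity requirement above can be illustrated with a small sketch: a call from one component into another is valid only if it targets a class that the providing component publishes in its interface. All names here (the `Component` class, `validCall`, the example component "billing") are invented for illustration.

```java
// Illustrative sketch of communication integrity: a connection between
// components is valid only if it targets a class that the providing
// component exposes in its public interface.
import java.util.Set;

public class CommunicationIntegrity {

    /** A component with a name and the set of classes it publishes. */
    static final class Component {
        final String name;
        final Set<String> publicInterface;
        Component(String name, Set<String> publicInterface) {
            this.name = name;
            this.publicInterface = publicInterface;
        }
    }

    /** A call into a component is valid only via its published classes. */
    static boolean validCall(Component provider, String targetClass) {
        return provider.publicInterface.contains(targetClass);
    }

    public static void main(String[] args) {
        Component billing = new Component("billing",
                Set.of("InvoiceService")); // the published interface
        System.out.println(validCall(billing, "InvoiceService"));    // true
        System.out.println(validCall(billing, "InvoiceRepository")); // false: internal class
    }
}
```

This is the property a detailed validation of the communication abstraction has to enforce across the whole code base.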

2.2 Source Code Management Tools

Software configuration management is the discipline of managing the evolution of large and complex software systems [23, 54]. This discipline faces numerous challenges [23], as the entire life-cycle of a software product has to be managed [40].

Software projects involve many different files, which need to be edited by different parties and integrated to create a product. Keeping track of those files, persisting them in different versions and allowing for collaboration is a major effort, which is why software companies use specific tools to manage all of the above [29]. These tools are called source code management tools, configuration management tools, version control systems, repositories and various other names [29]. They are integral to almost every developer and to basically any software development project [35]. Another advantage is that most version control systems support multiple branches to allow different streams of development.

2.3 Agile Development

A paradigm shift happened in software engineering in the last decade [51]. With the advent of agile development, spearheaded by the Agile Manifesto [21], software development steered away from traditional up-front project plans to agile methodologies. The Agile Manifesto is about the values of agile development. As it is very clear and short, the complete text is included here:

We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

That is, while there is value in the items on the right, we value the items on the left more.

This basically means that huge design phases and big architecture designs are not best practice for software projects, as requirements might change, rendering the up-front architecture obsolete. Agile developers embrace change and focus on adjustments based on changing requirements. Therefore, they collaborate with customers more closely. This usually requires developers to release their systems early and often, in order to allow the customer to give feedback on new features [51]. This is also a big reason why Continuous Integration is important and commonly used in industry today [35].

Additionally, Fowler [29] describes why up-front architecture can be a fallacy if everything is planned to the tiniest detail in the beginning. However, not planning anything before coding can equally lead to a mess and to a specific style called "code and fix". This "code and fix" style has to be avoided, as it hinders the progress of a system: most of the developers' time is occupied with fixing problems, because the foundation of the software architecture is a mess [29]. Therefore, some up-front design is necessary. The granularity has to depend on the architect's experience with the type of problem. If the expertise with the current problem is low, a less detailed architecture description might be better, as changes in requirements cannot be foreseen [29]. Therefore, a flexible architecture description is necessary to allow all granularities of the architecture to be defined and checked within a system.

2.4 Continuous Integration

Continuous Integration is a vital part of the DevOps responsibilities and is integral to the agile software development techniques used nowadays. DevOps is a set of practices and revolutionary changes that reduce the overhead and time needed to put a change into production [19]. Continuous Integration supports the developers: builds finish quickly, tests and deployment are automated, and the deployment processes encourage developers to write more resilient code [19]. Microservices in particular enable effective implementations of DevOps by promoting the importance of small teams [58]. The microservice design is becoming the standard for building continuously deployed systems using Continuous Integration [19]. Fowler [29] defines the general structure of the Continuous Integration process and provides useful experiences. As our approach builds on and enhances the deployment pipeline, the whole Continuous Delivery process is affected and has to be adjusted. Continuous Integration defines a process to build, test and deploy software systems automatically [35]. The Continuous Integration environment reacts to changes in the version control system and triggers an automated build, which, if successful, triggers the automated tests and, in the end, creates a deployable file. The biggest benefit of Continuous Integration is reduced risk, as the system monitors the version control environment and tests every change automatically on a non-local machine [35]. The deployment into production can be done at any time with minimal effort, because everything is already built and tested. Every developer knows how far the project has progressed, what works and what does not. Frequent deployment is very beneficial, as it allows paying customers to get new features more rapidly, breaking down a barrier between developers and consumers.

Approaches to Continuous Integration have to handle dependencies across software artifacts. Especially with the rise of the microservice architectural style [19], the requirement to separate services and build them independently is very relevant. Roberts [49] introduces a system to separate software into different modules, which can be built independently in a Continuous Delivery environment. As each of the individual modules might depend on another module, the latest binary of the modules has to be available. However, as soon as a module changes, another module might not be compatible with the new version [49]. Dynamic versions might add nondeterminism to the build: the same module might not yield the same output if the version of another module changes. This can be handled if the changed module denotes its version number correctly using semantic versioning [48, 47] and there is no bug in the new implementation.
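When modules are built independently, the decision whether a rebuilt dependency is still usable can, under semantic versioning, be made from the version numbers alone. The following is a hedged sketch of that decision (illustrative names, not the cited system): a consumer built against some 1.x version stays compatible with any later 1.y, while a new major version signals a breaking change.

```java
// Hedged sketch: deciding from semantic version numbers alone whether a
// newly available dependency build is still compatible with a consumer
// that was built against an earlier version.
public class Compatibility {

    static boolean compatible(String builtAgainst, String available) {
        int[] a = parse(builtAgainst), b = parse(available);
        if (a[0] != b[0]) return false;      // different major: breaking change
        if (b[1] > a[1]) return true;        // newer minor: additions only
        return b[1] == a[1] && b[2] >= a[2]; // same minor: patch may advance
    }

    static int[] parse(String v) {
        String[] p = v.split("\\.");
        return new int[] {
            Integer.parseInt(p[0]), Integer.parseInt(p[1]), Integer.parseInt(p[2])
        };
    }

    public static void main(String[] args) {
        System.out.println(compatible("1.2.0", "1.3.5")); // true
        System.out.println(compatible("1.2.0", "2.0.0")); // false
    }
}
```

Of course, this only holds if the producing module increments its version honestly; a breaking change shipped as a minor bump defeats the check, which is exactly the failure mode described above.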

2.5 Continuous Architecture Validation

Several systems attempt a similar approach to automatically validate the software architecture of a project. With many companies switching to an agile development process, the software is changing more and more frequently. In order to ensure high code quality, development processes such as Test-Driven Development are commonly practised in teams [31]. These processes can easily be tested and verified automatically by state-of-the-art Continuous Integration environments [29, 35]. Architecture, on the other hand, is not so easily integrated into and verified by the Continuous Delivery process [51, 22]. However, as software inevitably ages [46] and the erosion of the software architecture is present in many different software projects in various forms [38], the management and reduction of those ageing effects is necessary, but a tedious task that faces many challenges [38]. Some recent projects try to tackle this problem in different ways and introduce systems to verify the software architecture continuously. Goldstein et al. [31] create a system that extracts a semantic model of the software using specific domain transformations and applies architectural patterns and rules to detect architectural anti-patterns. Those rules might detect cycles or violations of a layered structure [31]. This approach has a big advantage, as the architecture does not need to be defined manually: it is extracted in the form of a semantic model. The architectural patterns can be extended to check for additional anti-patterns [31]. However, the domain transformation needed to arrive at the semantic model may be very hard to define, and users may not end up with the correct architectural model created by the black box that is the domain transformation. A further advantage is that the validation process can be done automatically and can thus be integrated into the Continuous Delivery process easily.

Weinreich et al. define a system that generates the architectural model by extracting information from the source code and additional high-level architectural structures like layers [22]. This architectural model can then be validated against the code. Violations may occur if the code or the model is changed [22]. The system checks whether the extracted architectural model matches the code and vice versa. The model changes in granularity as the code base grows bigger and more complex [22]. A big advantage of this system is that the architecture model can be used to create a stub or skeleton for a project. However, it has to be adjusted quite a bit to be used in a Continuous Integration environment, because the software is available as an Eclipse plugin with only the user interface being available to the developer. Additionally, as the architecture model changes with source code changes, the developer might be overwhelmed by the whole architectural model of a big software system.

Eichberg et al. [27] define an approach very similar to [22]. The software enforces dependency checks via an Eclipse plugin, which validates the defined dependencies against the source code. The architecture is modelled using a visual language [27]. The big advantage is the performance of this software during build times. However, the system cannot be integrated into a Continuous Integration environment without major modifications: it requires a visual language and can only handle rudimentary dependencies instead of allowing a whole architecture to be validated or checked for anti-patterns.

Grabner [32] discusses the advantages of Continuous Integration for agile development. Code changes of agile team members are verified using unit and functional tests that are executed in every build [32]. The Continuous Integration process is enhanced significantly by adding profiling data such as CPU and network bandwidth usage, number of database queries, or browser load times. Additionally, architectural rules are checked in every build.

Deissenboeck et al. [25] investigate architectural decay throughout a software's life-cycle. If no countermeasures are taken, unwanted dependencies and communication violations are bound to be added to the system [25]. These architectural violations can have a negative impact on the overall system, as defined goals such as portability or performance might deteriorate because of the violations [25]. ConQAT employs a component model to specify the architecture of a system. It creates the components and connectors directly from the source. After the extraction, the user can specify allowed connections [25]. The validation is done by comparing the actual architecture (which is mapped to the extracted architecture) to the initially specified architecture [25]. This system has clear benefits, as it is easy to use and can extract dependencies without the need for a big architectural description. However, because of the automatic dependency extraction, some architectural problems might be overlooked, and the validation cannot be as specific as with a complete architecture description language. This software has to be altered to be used with Continuous Integration.

2.6 Semantic Versioning

Using external libraries is one of the most common practices in software development, even for small projects. However, determining whether there are incompatibilities when upgrading a library is very hard and sometimes impossible for library users [48]. It is the library owner's responsibility to document and indicate compatibility changes. The de facto standard for indicating incompatibilities is a specific version number schema. A very prominent version numbering technique is semantic versioning [47], which suggests a three-number format of MAJOR.MINOR.PATCH. These three numbers have a very specific meaning [47]:

• MAJOR: indicates an incompatibility with the previous release. A specific set of rules, described in [47] and adapted in Chapter 4, decides whether a change is an incompatible one.

• MINOR: indicates that the changes from the previous release are backwards compatible, but new functionality is added.

• PATCH: indicates that backwards compatible bug fixes were made.
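The scheme above can be sketched in a few lines of code. The following is a minimal illustration of parsing MAJOR.MINOR.PATCH versions and classifying the change implied by an upgrade; the class and method names are our own and not part of the semantic versioning specification [47].

```java
public class SemVer {

    // Parses "MAJOR.MINOR.PATCH" into an int array {major, minor, patch}.
    public static int[] parse(String version) {
        String[] parts = version.split("\\.");
        return new int[]{Integer.parseInt(parts[0]),
                         Integer.parseInt(parts[1]),
                         Integer.parseInt(parts[2])};
    }

    // Classifies the change implied by upgrading from 'from' to 'to':
    // a MAJOR bump signals an incompatible change, a MINOR bump signals
    // backwards-compatible new functionality, a PATCH bump a bug fix.
    public static String classify(String from, String to) {
        int[] a = parse(from), b = parse(to);
        if (b[0] != a[0]) return "incompatible";
        if (b[1] != a[1]) return "compatible feature";
        if (b[2] != a[2]) return "bug fix";
        return "no change";
    }
}
```

A library user would thus treat an upgrade from 1.2.3 to 2.0.0 as potentially breaking, while 1.2.3 to 1.2.4 should be safe to apply.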

Raemaekers et al. [48] study the Maven repository to check to what degree the semantic versioning scheme is applied to real software projects in terms of binary compatibility. The number of breaking and non-breaking changes is calculated by static code analysis, and the version number is then checked for consistency with the number of breaking changes [48] (at least one breaking change must result in a major upgrade [48, 47]). However, the static code analysis restricts the project to the Java programming language. This work will ensure that breaking changes are detected for any language running within the Java Virtual Machine.

2.7 Microservices

Microservices are an approach to distributed systems that promotes the use of finely grained services with their own lifecycles, collaborating with each other [45]. As microservices are modeled to match various business domains and business needs, they avoid common problems of traditional tiered architectures, whose layered structure is too rigid to model flexible large distributed systems [45]. Additionally, new technologies and techniques have emerged with the rise of microservices in recent years, which help mitigate common problems in the domain of service-oriented architecture [43, 45]. The whole topic of microservices is evolving very rapidly. Although the idea has been around for at least a decade, new experiences from development teams across the globe as well as new techniques and technologies are having a great impact on the pace of change of microservices. These techniques are well summarized by Newman [45]:

• Domain-driven design is described by Evans [28] and shows better ways to model systems.

• Continuous delivery allows software to be put into production effectively.

• Web technologies that developers are already used to allow for better communication interfaces between systems.

• Infrastructure automation is introduced to maintain, quality-check, and manage distributed infrastructure.

Literature is still undecided about the size and granularity of microservices. Eaves defines the size of a microservice as something that could be rewritten in two weeks [45]. Other literature [26, 53, 57] defines the size and code base of a microservice as something a small singular team can handle. Newman, however, defines the size and granularity of microservices in a slightly more vague and abstract manner: "Apparently developers and software architects have a good feel on the scope of a system. If it does not feel too big, the system is small enough to be treated as a microservice." [45]

2.8 Extraction of Microservices

Literature and software developers largely agree on the easiest way to develop microservices: start with a monolithic system and only move to microservices when really needed, by refactoring the software system into well-structured domains [45, 57, 53, 26]. Newman [45] and Fowler [29] are big advocates of

structuring a system's business logic into domains as described in the Domain-Driven Design model by Evans [28]. These domains are loosely coupled and have high cohesion by design, which are the main criteria of microservices [45]. This refactoring into different domains as well as their anti-corruption layers [28] is one of the most important steps in extracting microservices from a monolithic system [45]. As soon as the loose coupling of bounded contexts is established, each bounded context can be converted into a microservice of its own, while the communication with other bounded contexts has to be converted from simple in-process calls into external communication.
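The last step above — turning an in-process call into external communication — can be sketched as follows. The bounded context is hidden behind an interface, so the caller does not care whether the call stays inside the monolith or crosses a service boundary; the external transport is injected as a plain function here to keep the sketch self-contained. All names are illustrative.

```java
import java.util.function.Function;

public class ContextExtraction {

    // The contract of a hypothetical "customer" bounded context.
    interface CustomerContext {
        String customerName(int id);
    }

    // Before extraction: a simple in-process call within the monolith.
    static class LocalCustomerContext implements CustomerContext {
        public String customerName(int id) { return "customer-" + id; }
    }

    // After extraction: the same contract, but the call is delegated to an
    // external transport (e.g. an HTTP client), injected as a function.
    static class RemoteCustomerContext implements CustomerContext {
        private final Function<Integer, String> transport;
        RemoteCustomerContext(Function<Integer, String> transport) {
            this.transport = transport;
        }
        public String customerName(int id) { return transport.apply(id); }
    }
}
```

Because both implementations satisfy the same contract, callers are unaffected by the extraction, which is exactly what the loose coupling of bounded contexts buys.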

Chapter 3

Industrial Example: Development at Smarter-Ecommerce

This chapter focuses on the agile development process within the company Smarter-Ecommerce GmbH. Smarter-Ecommerce [15] is a company located in Austria, with a focus on pay-per-click (PPC) automation and search engine advertising (SEA). The main products (AdEngine and Whoop!) allow for the automation of Google AdWords and Google Shopping respectively. The following sections cover the development at Smarter-Ecommerce. First, the agile development process in general and within Smarter-Ecommerce is described. Afterwards the basic software development process is presented. As the quality improvements of the company's microservices are the focus of this thesis, we elaborate and explain how microservices are developed and communicated at Smarter-Ecommerce. Then we tackle Continuous Integration and explain the envisioned CI workflow. Afterwards a case study on the current Git repository is conducted to motivate this thesis; it establishes the baseline for determining the success of this work with regard to the relevant research questions in Chapter 1. Finally, different requirements for our prototype are derived from the metrics of this case study.

3.1 Overview of the Basic Development Process

Smarter-Ecommerce [15] develops their software using the agile methodology. The company's approach to agile development as described in Chapter 2.3 is an adaptation and mix of different agile software development frameworks: SCRUM [50], Kanban [21], and eXtreme Programming [20]. There is a daily stand-up that focuses on the progress of a sprint and also considers the obstacles or changing requirements one might face. Once a week there is a short planning meeting that discusses the current requirements and drafts a broad architecture of the system that is currently being planned. Usually these architecture drafts contain the description and responsibility of bigger software components and their interaction with other software components. Another important aspect of planning is the definition of stakeholders within the development team. Once everything is planned, the tasks are written on cards and put onto a board in the development area (similar to the Kanban board approach). Every developer can pick a story they want to work on from the Kanban board and start coding. Once the task is done, they have to schedule a code review with the stakeholder within the development team to ensure that the code is of sufficient quality and style. These reviews are currently a bottleneck, as a lot is discussed in them, and architectural changes might not be noticed by the reviewer, or the reviewer does not have the responsibility to decide on big architectural changes. This leads to the requirement of making architectural changes visible to all developers, which might improve the code review, as architectural changes can then be handled in a separate architecture review process.

Figure 3.1 shows the typical stages and flow of a software project and the basic flow of tasks for a developer. Depending on which methodology or paradigm is used, the names of those stages and the flow of information may vary, but the essence of the stages remains similar. The developer may be involved in all tasks, but is certainly involved in the tasks highlighted in yellow. The highlighted area also covers the scope of this work for the development process. As the architecture is defined in the planning phase, it affects the whole planning phase as well as the phases of implementation, test, and deployment.
These phases are also part of the DevOps responsibility, as Continuous Integration is closely related to DevOps and automates the test and deployment phases. The CI process spans from the developer starting to code to the beginning of the code review stage. The deployment and maintenance processes of putting a software component into production are handled much more easily and have their risks minimized by the automated quality checks done by Continuous Integration. These quality checks are described in the following chapters. The different phases are described below:

• Creation of an idea: An idea is formed by a member of the company or a customer. After the idea is weighed and deemed a necessary feature, it is converted to an idea that will be put into production.

• Requirement analysis: This analysis checks and defines the needs and conditions that have to be fulfilled to ensure the success of a project or product resulting from the idea, depending on various business needs. The output is basically a user story in Kanban terms [21], which defines the requirements and boundaries of the idea.

Figure 3.1: Software development process for a new feature at Smarter-Ecommerce.

• Design of a user story: The user story is then designed and refined to better fit the developers, and more specific ideas on how to implement the story are gathered.

• Definition of the architecture: The planning phase tries to model and plan the software in detail. The output is a finer-grained definition of what needs to be done to ensure the success of the project or product. In this phase the general product design is planned and the architecture of the software component is defined and structured on a high abstraction layer. The abstraction depends on how familiar the developers are with the problem, but as requirements are bound to change, the architecture is usually very broad.

• Implementation: The implementation is the process where the actual code is written. The developer implements the product as it is modeled in the design phase. Agile development, the Continuous Integration process, as well as the DevOps methodology require the developer not only to create the production code but also to write tests, which can later be executed automatically.

• Test process: Tests are necessary to ensure the quality of a software product. Continuous Integration is only possible if sufficient tests are written by the developer and only if they can be executed automatically.

• Deployment: Deployment is the process of actually integrating the software into a live system. This makes the whole product accessible to a bigger audience. Continuous Integration simplifies this process a lot, as the software and its tests are already available on an external machine and all the tests passed on that machine.

• Code Review: If the Continuous Integration reports no errors and the developer is confident in the piece of code that was produced, a code review can be scheduled. The code review is basically a check that the new software components fit the general structure of the company, that the software quality is on par with what is expected, and that code styles are followed. Usually the architecture is checked to some degree in this review, depending on the available time and thoroughness of the reviewer.

• Release: If the review process was successful, the new software component is put into production and becomes available to the customer.

3.2 Microservices

Smarter-Ecommerce GmbH adopted the microservice methodology in 2008. In the beginning each system started out as a monolithic system designed with microservices in mind. The design was closely related to the bounded-context architecture described by Evans in [28]. Currently, there are 28 microservices in production, with 13 of them added in 2017. Microservices come with a lot of additional workload; the microservice architecture at Smarter-Ecommerce is displayed in Figure 3.2. In this section we shortly describe most of the microservice-related systems running at Smarter-Ecommerce and introduce the motivation for inter-service dependency checking in this work. Most of the related systems present in Figure 3.2 are described below:

• Service-X: The microservice itself. A standalone executable software component that has an externally accessible API, both synchronous (e.g., REST/GraphQL/gRPC) and asynchronous (e.g., messaging). These interfaces are essential for the communication between two microservices and are also essential for our research question regarding the architecture evaluation across microservices. Each microservice has its own private storage (commonly some sort of database).

• Logging framework: Each error has to be propagated to the developers. The logging framework is responsible for detecting all errors within an individual microservice and persisting these messages. The errors have to be reproducible and readily available to the developers. Additionally, an analysis tool has to be provided to the developers to enable them to search the logs for useful information. At Smarter-Ecommerce GmbH the developer is responsible for the debug log information. These debug messages, along with error messages from the system, are stored in an Elasticsearch document [6].

• Messaging: A global message bus has to be available to provide asynchronous communication between services using messaging. The message bus used at Smarter-Ecommerce is a high-availability message queue using a RabbitMQ [14] instance.

• Profiling: The performance of each service has to be monitored and bottlenecks have to be detected. If a disk-space, memory, or runtime performance issue arises, a developer has to be notified before the system runs out of resources. For profiling, several tools from the Java Virtual Machine as well as tools provided by the operating system are used.

• Health monitoring: The health of each individual microservice has to be monitored, as, for example, a network problem might affect the function of the infrastructure as a whole, including critical systems. Each service monitors itself and exposes that information to the outside world using Spring [16] health indicators and a REST endpoint.


Figure 3.2: Microservice architecture at Smarter-Ecommerce.

• Triggers: Events, schedules, messages, or notifications might trigger a specific feature of a microservice. The whole triggering infrastructure has to be available in each service.

• Shared storage: Some services might access the same data store. The concurrency of and access to these shared data stores has to be managed.

• Cluster and distributed coordination: The placement of microservices (in terms of clusters and nodes) has to be handled by some system. The coordination between services might also be orchestrated by another system.

• Coding Environment: The application code has to be stored in a version control system and dependencies should be available in a repository as well.

• Continuous Integration: Continuous Integration is very relevant to reduce risks and assure high software quality. As this is one of the most important stages for this work, CI needs further elaboration; the next section therefore covers Continuous Integration at Smarter-Ecommerce specifically.

Summarizing, a microservice approach requires a lot of effort, but is sometimes necessary because of the requirements of a system, which might otherwise be hindered by performance bottlenecks or other issues. However, the communication between microservices is not trivial and its quality has to be ensured. Therefore, a system to detect violations of and changes to the communication between services is required and very relevant to quality assurance.
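The health-monitoring idea described above essentially amounts to each service aggregating several indicator results and exposing the overall status. The following plain-Java sketch only mimics the shape of that contract — it does not use the actual Spring API, and all names are our own illustration.

```java
import java.util.Map;

public class HealthEndpoint {

    // Aggregates named indicator results (e.g. database reachable, message
    // queue reachable): the service reports "UP" only if every indicator is up.
    public static Map<String, Object> health(Map<String, Boolean> indicators) {
        boolean allUp = indicators.values().stream().allMatch(Boolean::booleanValue);
        return Map.of("status", allUp ? "UP" : "DOWN",
                      "indicators", indicators);
    }
}
```

In the real setup, a REST endpoint would serialize such a map so that external monitoring can poll each service.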

3.3 Basic Continuous Integration Workflow

Continuous Integration is a very common practice to reduce risks by detecting integration problems very early in the stages of software development. Smarter-Ecommerce works very closely with the original design described in [35]. The following paragraphs give some insight into the standard workflow of a Continuous Integration environment and explain which stages of software management are affected. For the standard Continuous Integration workflow, the first few steps of software management and planning remain the same as in other projects. The stages from implementation to maintenance are the ones that are structured and defined in more detail. The basic workflow is displayed in Figure 3.3 and each stage can be described as follows:

Figure 3.3: Standard workflow for Continuous Integration (adaptation of [35]).

1. The developer writes the code for the software product. This includes all the tests (typically unit tests) that can be executed automatically.

2. The developer runs all the tests on their local machine.

3. If all tests pass, the developer checks their code in to the version control system.

4. The Continuous Integration environment monitors the version control system for changes and detects a new check-in.

5. An automated build is started and an executable binary for the software is created on the external system (the host of the CI environment).

6. If the build fails, the developer is notified via the web interface of the CI environment. Additionally, a red tile is displayed on a screen in the development area and the developer might receive an email about the failed build.

7. If the build is successful all tests are initiated.

8. If at least one test fails the developer is notified. The notification mechanism is the same as described in step 6.

9. If all tests pass, the software is of sufficient quality and can be deployed to a live system, either automatically or manually by the developer (in the latter case it is useful to notify the developer of the completion). This depends on the service and is evaluated on a case-by-case basis, as some services can have severe effects when shut down or started without anyone maintaining the process.
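The decision logic behind steps 4–9 can be condensed into a few lines; the method and status names below are our own illustration of the gate, not part of any CI product.

```java
public class CiGate {

    enum Result { BUILD_FAILED, TESTS_FAILED, DEPLOYED, AWAITING_MANUAL_DEPLOY }

    // buildOk/testsOk stand for the outcomes of steps 5 and 7; autoDeploy
    // reflects the per-service decision discussed in step 9.
    public static Result onCheckIn(boolean buildOk, boolean testsOk, boolean autoDeploy) {
        if (!buildOk) return Result.BUILD_FAILED;   // step 6: notify the developer
        if (!testsOk) return Result.TESTS_FAILED;   // step 8: notify the developer
        return autoDeploy ? Result.DEPLOYED         // step 9: automatic deployment
                          : Result.AWAITING_MANUAL_DEPLOY;
    }
}
```

The continuous architecture evaluation developed in this thesis adds another gate of this kind between the test and deployment steps.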

The quality of a software product can be assured by introducing this Continuous Integration process to almost any software project. The effort and cost of the deployment and maintenance phases are reduced to a minimum, while in traditional software projects those phases have the highest risk and cost associated with them, because it is very hard to estimate the time and cost related to those stages [35]. This is one of the reasons why Continuous Integration was introduced at Smarter-Ecommerce. As CI is already embraced and accepted by the developers, a big focus of this thesis is to integrate the Continuous Architecture Evaluation into the Continuous Integration environment.

3.4 Architecture Violations at Smarter-Ecommerce

In order to highlight the need for architecture evaluation in software companies, we conduct an exploratory case study on the Git [33] repository of Smarter-Ecommerce GmbH. The goal of this case study is to show the need for early automated detection of architecture violations, as developers in agile environments tend to neglect the architecture because their focus is on the code itself [55, 51].

3.4.1 Case Study Metrics

For this case study we analyzed the commits of the Git repository at Smarter-Ecommerce GmbH. In order to conduct this research, we decided that the following metrics will be used in this study:

• The date an architecture violation occurred. This will be used to derive the time required to fix an architecture violation.

• The commit message to derive what may have caused the violation.

• The branch type (either master branch or feature branch), as violations on the master branch are a bad sign; additionally, such violations are propagated to all developers.

• The date of the resolution of the architecture violation, in order to determine the amount of time until the error is fixed.

• The commit message to derive the reason for the fix.

• The number of commits until the problem was fixed.

These metrics are collected at different coding stages of the development process for every project. The collection of the metrics is discussed in the next section.

3.4.2 Collection of Metrics

As the Git repository contains a lot of data (over 25,000 commits over the span of almost four years), we defined the following process to collect the data described in the previous section.

The following steps were executed in order to retrieve those metrics:

1. Analyze the Git repository and check out different project stages based on educated guesses about possible architecture violations derived from the commit message. A good indicator we found was the introduction of new services, repositories, and business logic.

2. No architecture definition is available for a given date, as Smarter-Ecommerce GmbH did not persist architecture definitions and design choices: the tools have changed and the architecture was not always kept up to date. Therefore, we establish an architecture definition by reconstructing the architecture of that time in collaboration with the software architect at Smarter-Ecommerce. This definition is based on current architecture definitions found in the existing microservice architecture. As layering is present in most projects throughout the lifespan of the Git repository, an architecture can be established for most checkouts. If the architecture cannot be reconstructed, the commit is discarded and not used for the case study.

3. Now that the architecture is established, we manually validate the implementation against the defined architecture. If no violation of the architecture is present, the commit is discarded, as we cannot collect any of the described metrics.

4. If a violation occurred, the violation type is noted and stored together with the commit message and the date. For further processing, the revision hash of the Git commit is stored so that the number of commits between the first occurrence and the fix can be collected.

5. Replay the Git commits following the commit with the violation and manually check the architecture against the implementation until the architecture violation is no longer present.

6. Take note of the commit message, date and revision number of the commit that fixes the architecture violation.

7. Execute the command git log revisionNumber1..revisionNumber2 --pretty=oneline | wc -l to count the number of commits until the fix occurred. Document and collect all metrics.
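The metric computed in step 7 can be mirrored programmatically. The sketch below counts the commits between the violating commit and its fix on a linear history represented as a list of revision hashes ordered from oldest to newest; the class name and the hashes are illustrative, not taken from the actual repository.

```java
import java.util.List;

public class CommitDistance {

    // Mirrors `git log rev1..rev2 --pretty=oneline | wc -l` on a linear
    // history: the number of commits after rev1 up to and including rev2.
    public static int commitsToFix(List<String> history, String rev1, String rev2) {
        int i = history.indexOf(rev1);
        int j = history.indexOf(rev2);
        if (i < 0 || j < 0 || j < i) {
            throw new IllegalArgumentException("bad revisions");
        }
        return j - i;
    }
}
```

On branched histories, Git's `rev1..rev2` semantics (commits reachable from rev2 but not rev1) would require walking the commit graph instead of a simple index difference.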

3.4.3 Collected Data

We collected the metrics as described in the previous section and obtained the following results. The type of violation can be one of:

• layering - The defined layering is violated, as a layer accesses an upper layer.

• cycle(X) - A cyclic dependency has been detected. X classes are involved in the cycle.

• context - A cross-context call (as in bounded context, described in [28, 45]) is made without permission. This is usually an access to the internal logic of another system.
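To illustrate how a cycle(X) violation can be found, the following is a minimal depth-first search over a class-dependency graph that reports the classes involved in the first cycle it encounters. The graph representation and class names are our own illustration, not the prototype's actual implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CycleDetector {

    // Returns the set of classes forming a dependency cycle, or an empty set.
    public static Set<String> findCycle(Map<String, List<String>> deps) {
        Set<String> visited = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        for (String start : deps.keySet()) {
            Set<String> cycle = dfs(start, deps, visited, stack);
            if (!cycle.isEmpty()) return cycle;
        }
        return Set.of();
    }

    private static Set<String> dfs(String node, Map<String, List<String>> deps,
                                   Set<String> visited, Deque<String> stack) {
        if (stack.contains(node)) {
            // Everything from the first occurrence of 'node' on the stack is in the cycle.
            Set<String> cycle = new HashSet<>();
            boolean inCycle = false;
            for (var it = stack.descendingIterator(); it.hasNext(); ) {
                String s = it.next();
                if (s.equals(node)) inCycle = true;
                if (inCycle) cycle.add(s);
            }
            return cycle;
        }
        if (!visited.add(node)) return Set.of(); // already fully explored
        stack.push(node);
        for (String next : deps.getOrDefault(node, List.of())) {
            Set<String> cycle = dfs(next, deps, visited, stack);
            if (!cycle.isEmpty()) return cycle;
        }
        stack.pop();
        return Set.of();
    }
}
```

The cardinality of the returned set corresponds to the X in cycle(X) above.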

The data is displayed in Table 3.1 and Table 3.2. As these tables show, the time required to detect and then fix some architectural problems is very high. Most of those violations are not detected early and take quite a while to fix. Most problems also occur on the master build, as they were not detected by a code review and were merged to the master branch. This is a big problem, as the general code base for the developers is corrupted by bad architecture and might serve as a bad example for other developers. The data suggests that many problems occur when new services are introduced and that most problems are fixed after a review with a senior software developer. Also, a few commits are related to the Vaadin [18] framework, making it obvious that the developers were not used to structuring their projects correctly using the new technology.

Table 3.1: Architecture violation metrics.

Type       Date        Commit Message                                                 Branch   Violation  Commits to fix
Violation  18/11/2015  added methods to get the current bids from the local model     master   layering
Fix        1/12/2015   added logic for customer, account and bidder state resolution  master   -          1,090
Violation  31/03/2015  first version of our server application                        master   layering
Fix        18/11/2015  introduced BidderDTO, quickfix for getBidder and saveBidder    master   -          851
Violation  12/10/2015  added getAllUsers Endpoint to Admin-API, added getAllUsers     master   cycle(3)
                       function in Admin-Res
Fix        21/10/2015  Admin.java: adjust after review                                master   -          58
Violation  19/01/2016  added getCustomersForUser in admin service                     master   context
Fix        08/02/2016  polishing and daily refactoring                                master   -          111
Violation  22/02/2016  added/changes strategy annotation                              feature  layering
Fix        23/02/2016  fixes after review                                             feature  -          5
Violation  23/02/2016  fixes after review                                             feature  context
Fix        24/02/2016  review: import done                                            feature  -          85
Violation  26/03/2014  unknown                                                        master   layering
Fix        22/10/2017  thanks architecture check                                      master   -          22,762
Violation  26/03/2014  unknown                                                        master   cycle(16)
Fix        29/11/2017  added exception to architecture check                          master   -          24,313
Violation  24/03/2017  Import whoop bidders into MW as target countries               master   context
Fix        28/03/2017  Change MW DB schema, Fix error in MW                           master   -          143
Violation  16/03/2017  Add ModelSupplier, Create Customer-Account-Model beans,        master   layering
                       clients, endpoints
Fix        03/04/2017  Refactoring customer                                           master   -          779
Violation  31/05/2017  Upgrade to Vaadin 8                                            master   layering
Fix        01/06/2017  Review: Move beans to their specific package                   master   -          6
Violation  03/07/2017  Introduce servicelog monitoring to the service-skeleton        master   context
Fix        04/07/2017  Revert logmonitoring rollout to all services                   master   -          128
Violation  03/07/2017  Introduce servicelog monitoring to the service-skeleton        master   context
Fix        03/07/2017  Move logMonitor related wiring from Logic-Module               master   -          7

Table 3.2: Architecture violation metrics continued.

Type       Date        Commit Message                                                 Branch   Violation  Commits to fix
Violation  24/02/2016  introduced basemonitor                                         master   cycle(4)
Fix        29/02/2016  criterialimit monitor                                          master   -          106
Violation  03/03/2016  welcome obfuscatedService                                      master   layering
Fix        09/03/2016  further work on project structure                              master   -          994
Violation  15/03/2016  Vaadin UI async push; DayAuditService, new Entities            master   layering
Fix        25/03/2016  dependency bug fixed; vaadin preparation                       master   -          115
Violation  04/04/2016  first version of task type destruction                         master   context
Fix        02/05/2016  moved all sources to all package fixed dependencies            master   -          168
Violation  04/05/2016  first simple version of change log support                     master   layering
Fix        01/06/2016  Streamline logging                                             master   -          44
Violation  10/05/2016  employee detail import                                         master   cycle(2)
Fix        12/05/2016  employee detail work in progress                               master   -          3
Violation  10/05/2016  employee detail import                                         master   layering
Fix        11/05/2017  Employee Import Review                                         master   -          17,935
Violation  18/05/2017  Contacts are created if the corresponding on is not exiting    master   layering
                       when importing employees
Fix        13/07/2017  Export of employee ...(obfuscated)                             master   -          1,953
Violation  13/09/2017  Move health indicator to change module                         feature  context
Fix        14/09/2017  Fix change health indicator                                    feature  -          20
Violation  14/09/2017  Refactoring                                                    feature  layering
Fix        14/09/2017  Improve interfaces and code after review                       feature  -          59

The following items were the most severe cases in our case study:

• Two different pieces of legacy software were ported to Git on 26/03/2014 with no previous version control records, making it impossible to determine when a violation was introduced. Both projects contained architecture violations when they were ported. Only when our prototype was put into practice for those projects were the violations noticed (3/10/2017, three years and six months later).

• While the violation in one of those legacy projects could be fixed, the cycle spanning 16 classes in the other system was deemed too much effort to fix, and an exception was added to our architecture evaluation to ignore this cyclic dependency.

• A few commits introduced two different violations with only one being detected during the review, resulting in the other violation being present for a long time.

• In one case the fix for an architectural violation caused another architecture violation.

The metrics are summarized and analyzed in Table 3.3 and Table 3.4:

Table 3.3: Summary of architecture violation metrics.

Type                          Count  Percentage
Total violations              23     -
Master branch violations      19     82.6%
Feature branch violations     4      17.4%
Fixes after review            5      21.7%
Layering violations           12     52.2%
Context access violations     7      30.4%
Cyclic dependency violations  4      17.4%

Table 3.4: Average time until a violation is fixed.

Metric       Range                Mean     Stdev    Median
Time (days)  0 days to 3.5 years  150.3    380.5    9
Commits      3 to 24,313          3,118.9  7,430.3  115

As these numbers show, there is quite a high fluctuation in the time needed to fix a violation. Some problems are detected very early thanks to good reviews, while others are overlooked and remain in the system for a long period. We will revisit this case study in the evaluation of our prototype to answer our research question: How effective is Continuous Architecture Evaluation in improving software quality?

3.5 Requirements for CAE

With the rise of modern software development processes such as eXtreme Programming and agile development [21], the software architecture naturally erodes, as described in the previous sections as well as in [51, 38, 46]. Parnas [46] even argues that the aging process and the related erosion are inevitable. Additionally, with the rise of microservices and dependencies between different services and systems, the inter-service dependencies cannot easily be maintained manually. We believe that a system managing the architecture and the related inter-service dependencies using semantic versioning and dependency extraction can support developers while introducing very little additional effort on their behalf. As Continuous Integration has proven itself to be a useful and generally accepted tool, we integrate our system into the Continuous Integration process to support the developers in a non-obtrusive way.

As described in the sections above, the basic development workflow suffers from architectural decay and has versioning demands that have to be met. The case study suggests that the time to fix architectural problems can be reduced if the violations are detected early. For our system to succeed, we analyzed these problems and extracted possible needs. We want to support agile developers and improve the well-established Continuous Integration workflow, as described above and in [35], with useful features that support the development process without introducing too many hurdles. Hurdles we want to avoid include the need for additional tools or a redefinition of the whole development process, as both would require every developer to get used to a new environment.

In order to be able to support agile developers, we defined certain requirement areas and derived requirements for the continuous architecture evaluation prototype. These requirement areas ensure that the improvements integrate well with many agile development processes.

• Flexibility

• Correctness

• Usability

• Applicability

These requirement areas will be considered during the design and development of our system and will be revisited when the system is evaluated.

3.5.1 Flexibility

This requirement area describes how our system can be used for different purposes. If one of the flexibility requirements is not met, the number of other systems that can be built on top of this system is reduced. To be a flexible system that can be used in a wide range of applications, a few different requirements, which we found significant for the success of this project, have to be met:

• The prototype is independent of the underlying programming language.

• The prototype can be integrated in various different Continuous Integration environments.

• The prototype is configurable to work with a wide range of software projects and libraries.

• The prototype can work with different levels of architectural granularity.

As described in previous sections, the architecture of agile software systems is usually not a big up-front design, but evolves during development according to requirement changes. This makes the last item of the requirements list above of utmost importance.

3.5.2 Correctness

Regarding correctness the following requirements have been defined:

• Architectural violations (such as a layer accessing the layer above) have to be detected.

• Breaking changes (such as the deletion of a function or a parameter) have to be detected.

• A change log between two different versions has to be consistent and correct. All detected changes between different versions have to match the actual changes.

• The system has to be consistent. This consistency is based on the ruleset of semantic versioning [47] and the derived version number has to be correct.

3.5.3 Usability

The following requirements relate to the usability of our system. Adoption problems and poor integration of the tool into the development process can be avoided if the following requirements are adhered to:

• Developers must not need to learn an additional tool. The prototype must seamlessly integrate itself into the development process.

• Violations are propagated to the developer so that they are apparent. The individual developer must be notified of violations with a mail, a message or a notification.

• The developers know how to fix problems if a violation occurs. Useful messages regarding the violation have to support the developer.

3.5.4 Applicability

The following requirements indicate practical needs of our system, i.e., changes in the mindset of the developers or changes to the development process that should result from applying this system. The derived requirements for applicability are shown below:

• Violations are shown to the developer right after they occur. Notifications alerting the developers of a violation are required. These notifications have to be sent immediately for violations affecting a live system, but can remain in the background for less relevant violations (such as those occurring on feature branches).

• The need for refactoring and redesign becomes apparent earlier than with the traditional process. The architecture definition as well as the architectural knowledge have to be present and available to all developers.

• Developers are more aware of communicational and architectural changes. Changes to the architecture definition are brought to the developer’s attention using notifications.

• Data flow within a software system is documented and available to all developers. The architectural knowledge has to be spread across the whole development team.

These requirements are very important to the success of our system, but are hard to measure. How all of the above requirements are evaluated for our system is described in detail in the following chapters.

Chapter 4

Approach and Tool Architecture

The focus of this chapter is to give an introduction to our approach for a tool prototype and all subsystems involved in solving the problems described in the previous chapter. First we look at how to seamlessly enhance the Continuous Integration workflow. Afterwards we explain each additional system required for our approach of Continuous Architecture Evaluation individually and give a rough description of how they are implemented. A more detailed explanation of the implementation follows in the next chapter.

4.1 Envisioned Continuous Integration Workflow

This project aims to tackle the problems of architectural erosion as described in [46, 38, 55], which can occur when using agile methodologies such as eXtreme Programming. The goal is to create a tool prototype that detects architectural inconsistencies between design and implementation and enhances the Continuous Integration workflow. After many iterations of a software project, the actual implementation might deviate from the initial architectural design [55, 22]. This problem might occur because the developer focuses on stages such as coding and writing unit tests (introduced by Continuous Integration as shown in Figure 3.3) instead of focusing on the architectural design. The design phase might no longer be considered and the software architecture is often neglected during development [55, 38]. We try to assist the software architect by ensuring that the envisioned architecture matches the actual implementation. The developed prototype should allow the developer to get a bigger picture of the whole architecture. Additionally, the tool should ease the dependency burden on the developer by introducing version numbers and automatic change logs between versions, so that there is no need to document all changes manually. Finally, the DevOps load is drastically reduced by automating the whole process using Continuous Integration.

Figure 4.1: Our extended workflow for Continuous Integration, improving on [35]. The added steps are shown as orange tiles.

An overview of our enhanced Continuous Integration workflow is displayed in Figure 4.1. The workflow consists of the following steps, where our additions (the orange tiles in Figure 4.1) improve existing Continuous Integration pipelines:

• Coding: The developer implements a feature of a product that was defined in a use case.

• Check in: The source code is uploaded to the version control system's repository.

• Automated Software Build: The Continuous Integration environment detects changes in the VCS and triggers a build of the software.

• Validate Architecture: After the software is built on the external machine, the software architecture is validated using special plugins for the Continuous Integration environment. These plugins are our subsystems responsible for Continuous Architecture Evaluation and are described in the following sections. If the software architecture is violated, the developer is notified of the corresponding violation. The notification can be done using email, instant messaging or simple notifications in a user interface. If the architecture matches the implementation, the build process continues; otherwise the pipeline fails early and does not continue.

• Automated Software Tests: The tests written by the developers are executed and evaluated.

• Automated creation of a semantic versioning number: The version number is extracted based on our rules and semantic versioning.

• Automated creation of a change log: A change log of all the relevant changes is created. This change log contains the differences between two versions, such as added methods or deleted fields.

• Automated check for inter-service dependency conformance: A compatibility check is done to evaluate whether existing services and their dependencies are violated, based on the changes extracted in the previous step.

• Automated deployment: All checks are done and the software is ready to be put into production. This is done automatically and can be made available to the customer easily.
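The fail-early behavior of the workflow above can be sketched as a sequence of named stages that stops at the first failure. This is a minimal illustration; the class and stage names are hypothetical and not part of our prototype.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Minimal sketch of a fail-early pipeline: stages run in order and the
// pipeline stops at the first failing stage.
class BuildPipeline {
    private final Map<String, BooleanSupplier> stages = new LinkedHashMap<>();

    BuildPipeline add(String name, BooleanSupplier stage) {
        stages.put(name, stage);
        return this;
    }

    // Returns the name of the first failing stage, or "SUCCESS" if every
    // stage passed. Stages after a failure are never executed (fail early).
    String run() {
        for (Map.Entry<String, BooleanSupplier> e : stages.entrySet()) {
            if (!e.getValue().getAsBoolean()) {
                return e.getKey();
            }
        }
        return "SUCCESS";
    }

    public static void main(String[] args) {
        String result = new BuildPipeline()
            .add("build", () -> true)
            .add("validate architecture", () -> false) // violation detected
            .add("tests", () -> true)                  // never reached
            .run();
        System.out.println(result); // prints "validate architecture"
    }
}
```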

There are three core capabilities that can be derived from our envisioned workflow in Figure 4.1. We define them as:

• Architecture evaluation within a project.

• Semantic versioning of software components and the creation of a change log between versions.

• Inter-service dependency checking of systems.

The following sections cover the general concept of each capability, while the next chapter discusses the implementation details.

4.2 Architectural Evaluation

In order to prevent architectural erosion we define a system for architectural evaluation: the architecture is specified in an architecture description file that is checked against the actual implementation of the system. A key requirement is to be independent of the programming language; we therefore decided to rely on the Java Virtual Machine's (JVM) byte code as our underlying common structure. The architecture validation and conformance checking takes place after the automated software build in the Continuous Integration workflow, as seen in Figure 4.1, because that is the earliest time we have access to the JVM byte code. It is important to fail early [32], which is why the architecture evaluation is done before the software tests. Failing early means finding errors fast and breaking a build right away, instead of executing steps that assume the build is successful. As an example, if one test fails, the whole build should fail right away instead of executing all the other tests, as that allocates unnecessary resources in the Continuous Integration environment. Because the architectural conformance check is faster than a big batch of unit tests (especially if integration tests are to be executed), it is done before the tests. Another reason is that the architecture is very important for the long-term success of a project [30], giving it a higher priority. This section introduces the general idea of how to implement such an architecture evaluation tool, while the next chapter details our approach to create such a prototype.

For architectural evaluation, an architecture definition file is defined. This architecture file is parsed and an internal representation of the architecture definition is created. Following up on that, the dependencies are extracted from the executables of the application. Afterwards the dependencies are analyzed and checked against the architecture definition file, determining whether the implementation violates the definition.

Listing 4.1: A sample dependency definition file with layering and additional access restrictions.

module {
    mainPackage = "smarter.ecommerce.microservice"
    layering = "app -> ui -> api -> workflow -> logic -> data -> util"
    // anything outside layers may not reference anything inside layers
    allowCycles = "false"
    publicInterface = "pub"
    strictLayering = true

    // Additional rules:
    independentOf = "smarter.ecommerce.microservice.logic -> smarter.ecommerce.microservice.ext"
    independentOf = "smarter.ecommerce.microservice.logic -> smarter.ecommerce.microservice.ui"
}

A dependency definition file consists of a main package name, which defines the top-level package that will be investigated for violations. In this architecture definition, every package matching the pattern "smarter.ecommerce.microservice.*" is considered. Additionally, for the dependency definition file shown in Listing 4.1, a module contains architecture definitions in the form of a layered architecture. Every package of the module, that is, every subpackage of the microservice package, is checked for packages named app, ui, api, workflow, logic, data and util. If a component from a lower layer tries to access a layer on top of it, the architecture is violated. Therefore, a dependency rule is defined for every class file with a package name containing a layer (i.e., a class in the package "smarter.ecommerce.microservice.*.app.*" is in layer "app").

Several parameters are set in this example. Setting "allowCycles" to false disallows cyclic dependencies within the module. "strictLayering" enforces a stronger layered architecture: if it is enabled, a layer may only access the layer directly underneath it and cannot skip to a layer further down. To allow for greater flexibility, additional rules can be defined to specify access restrictions between components. The first restriction defines that every class with a package name of "smarter.ecommerce.microservice.logic.*" must not access a package of the pattern "smarter.ecommerce.microservice.ext".
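The layering and strict-layering semantics described above can be sketched as follows. This is a minimal illustration in Java; the class and method names are our own assumptions, not part of the prototype.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the layering conformance check: a dependency may only point
// downwards in the layer list; with strict layering it may only reach the
// layer directly underneath.
class LayeringCheck {
    static final List<String> LAYERS =
        Arrays.asList("app", "ui", "api", "workflow", "logic", "data", "util");

    // Extracts the layer of a class from its package name, e.g.
    // "smarter.ecommerce.microservice.foo.logic.Tax" -> "logic".
    static String layerOf(String className) {
        for (String layer : LAYERS) {
            if (className.contains("." + layer + ".")) return layer;
        }
        return null; // class lies outside all layers
    }

    static boolean isAllowed(String from, String to, boolean strict) {
        int f = LAYERS.indexOf(layerOf(from));
        int t = LAYERS.indexOf(layerOf(to));
        if (f < 0 || t < 0 || f == t) return true; // outside layers / same layer
        return strict ? t == f + 1 : t > f;        // only downward access
    }
}
```

For example, with strict layering enabled, "api" may access "workflow" but not skip directly to "logic".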

After all architectural rules are defined for the module, the internal representation is created. This is done by extracting the architectural rules from the architecture definition. Currently supported rules are layering, cyclic dependencies, additional access rules and bounded context access restrictions, which are described in detail in Chapter 4.3. For the sample definition file described above, example rules would be access allowances of the form "smarter.ecommerce.api.*" may access "smarter.ecommerce.workflow.*", but not "smarter.ecommerce.logic.*". Then the implementation is also converted to an abstract representation by building a graph of connected software components. Components can be classes or packages, while connections can be method calls or references to other components. The abstract representation is created from the byte code of the implementation, using for example Tarjan's strongly connected components algorithm [52]. Each component and its connections are analyzed and the strongly connected components are extracted. There are tools to generate the abstract representation for the implementation from the byte code. Those tools will be discussed in the next chapter on implementation details.
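Cycle detection on such a dependency graph can be sketched with Tarjan's algorithm [52]: every strongly connected component with more than one node is a dependency cycle. The adjacency-map graph representation below is a simplifying assumption, not the prototype's actual model.

```java
import java.util.*;

// Sketch of Tarjan's strongly connected components algorithm applied to a
// class/package dependency graph.
class CycleDetector {
    private final Map<String, List<String>> graph;
    private final Map<String, Integer> index = new HashMap<>(), low = new HashMap<>();
    private final Deque<String> stack = new ArrayDeque<>();
    private final Set<String> onStack = new HashSet<>();
    private final List<List<String>> sccs = new ArrayList<>();
    private int counter = 0;

    CycleDetector(Map<String, List<String>> graph) { this.graph = graph; }

    List<List<String>> stronglyConnectedComponents() {
        for (String v : graph.keySet())
            if (!index.containsKey(v)) strongConnect(v);
        return sccs;
    }

    private void strongConnect(String v) {
        index.put(v, counter); low.put(v, counter); counter++;
        stack.push(v); onStack.add(v);
        for (String w : graph.getOrDefault(v, List.of())) {
            if (!index.containsKey(w)) {
                strongConnect(w);
                low.put(v, Math.min(low.get(v), low.get(w)));
            } else if (onStack.contains(w)) {
                low.put(v, Math.min(low.get(v), index.get(w)));
            }
        }
        if (low.get(v).equals(index.get(v))) { // v is the root of an SCC
            List<String> scc = new ArrayList<>();
            String w;
            do { w = stack.pop(); onStack.remove(w); scc.add(w); } while (!w.equals(v));
            sccs.add(scc);
        }
    }
}
```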

4.3 Extraction of Microservices

In order to extract a microservice from a monolithic system, the software system has to be refactored accordingly and different bounded contexts have to exist within it [45]. If a split factor is present, requiring the monolithic system to be split into multiple microservices, the refactoring into bounded contexts is necessary. Possible split factors are discussed by Newman [45] and Fowler [29]. They have been adapted by Smarter-Ecommerce and are described below:

• Performance bottlenecks

• Different service qualities

• Different service availability

• Different release cycles

• Isolation of development

• Incompatible dependencies

We believe that a split into microservices is possible as long as a special set of rules, described in the form of an architecture description file, is obeyed. These rules can conveniently be checked with our dependency checking tool. How dependencies within a bounded context and across its borders have to be managed is explained with Figures 4.2 to 4.5 below:

Figure 4.2: A monolithic system with specific bounded contexts, which will be extracted into microservices.

Figure 4.2 describes a minimal viable example of a system with three components and four connectors. The components have been refactored into different bounded contexts, which resemble three different business use cases. A bounded context for charging customers, one for creating invoices and one for monitoring and logging information about the whole system.

This example is a very common use case for a company and is also presented in the literature [51, 28, 45]. The company sells products and the customer has to pay for those products. One of the first iterations was the creation of invoices for products ordered by a customer. In this iteration the customer is not charged automatically, but has to pay in cash or transfer the money himself. The architecture is not very complex at this point as the business logic is small. Everything within the monolithic system is in one main context and there are references all over the place.

Now, as the company grows in size and products, it decides to charge the customer automatically, for example using stored credit card data. Charging becomes part of the invoicing module, as all the invoicing data has to be available for charging. The whole process is logged to monitor the charging procedure.

This is where the software manager detects a problem caused by the tight coupling within the invoicing module and a refactoring is done. The result is displayed in Figure 4.2. Charging is mostly independent, with a dependency connection (connector) to Invoicing, as the customer's information and the amount due are required by charging. Monitoring is now necessary for the two different modules, which is why it makes sense to refactor the logging process into a separate module as well. It can therefore be accessed by every module in the system. The monitor is not as tightly coupled to charging and invoicing; a weak connector is sufficient to use the logging module. This weak connector might be a simple message using a messaging protocol.

Figure 4.3: Contexts that communicate asynchronously can be extracted without refactoring.

Further down the timeline, the company detects one of the split factors defined above. The monitoring component is not maintained as often as the charging component, which prevents updates to newer versions in charging. Monitoring has to be extracted from the system into a new system (a microservice). Figure 4.3 shows that one can simply move the code base of the monitoring bounded context to another system. The weak asynchronous connectors between charging and monitoring as well as between invoicing and monitoring allow the system to still work with no changes within charging and invoicing. If a message bus is used, all that needs to be done is configuring the new microservice containing the monitor to use that very same message bus. The whole system should then work out of the box.

Figure 4.4: Two bounded contexts and their dependencies before the split into microservices. The code is refactored into three areas to make extraction easier and define dependency structures.

Following the timeline even further, performance bottlenecks appear as the software system grows larger. It becomes apparent that the invoicing and charging components have to be split and put onto different machines, each one as an individual microservice. This is not an easy task, as the communication between charging and invoicing is a strong connector. The customer's data as well as the amount due have to be available in both services. The extraction process is not as easy as with the weak asynchronous connector and some major refactorings have to be done. Figure 4.4 displays our design of bounded contexts and how they should be structured after refactoring within the same system.

This interaction of components allows the internal implementation of services to be hidden from other contexts, while the public interface is accessible to other contexts. The module area can be accessed by other modules in order to load different modules and take care of the wiring using dependency injection.

Figure 4.5: Service is extracted. A wrapper encapsulates the public interface into a published interface containing remote endpoints.

As soon as the system has been brought to a state resembling Figure 4.4, our system can verify the structure using Continuous Architecture Evaluation. Figure 4.5 shows a special published interface for the invoicing context that wraps the public interfaces of the public area into remote calls like REST, GraphQL or any other remote call. These remote calls have to be coded by the developers. However, if they adhere to the structure in Figure 4.4, the wrapper can be written easily (it is just a remote interface delegating calls to the business logic).
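Such a delegating wrapper can be sketched as follows. The interface and class names are purely illustrative; in a real service the endpoint method would additionally be exposed via the chosen remote technology (REST, GraphQL, RPC).

```java
// Sketch of a published-interface wrapper: a remote endpoint that merely
// delegates to the bounded context's public interface and adds no logic.
class InvoicingEndpoint {
    /** Hypothetical public interface of the invoicing context (package "pub"). */
    interface InvoicingPub {
        String createInvoice(String customerId, long amountCents);
    }

    private final InvoicingPub invoicing;

    InvoicingEndpoint(InvoicingPub invoicing) { this.invoicing = invoicing; }

    // In a real service this method would be bound to a remote route;
    // the wrapper only forwards the call to the business logic.
    String createInvoice(String customerId, long amountCents) {
        return invoicing.createInvoice(customerId, amountCents);
    }
}
```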

4.4 Semantic Versioning

Semantic versioning [47] is a way to create unique version numbers for software products. It elegantly describes possible incompatibilities by introducing major version numbers (breaking changes), minor version numbers (non-breaking changes such as additions to the API) and patches (the API remains the same but the internal implementation changed). We defined a specific set of rules by adjusting and refining the semantic versioning rule set defined in [48, 47] to determine which part of the version number has to be adjusted, as seen in Table 4.1. The rules in [48, 47] are very generic and have to be specialized for the languages used, as there are special conditions and rules of semantic versioning for languages running on the JVM. Upgrade rules for methods, classes, annotations and fields had to be tailored to fit the JVM. As an example, the semantic versioning ruleset introduced in [47] only defines that breaking changes are reflected in the first number and that backwards compatibility is reflected in the version number; there is no specific definition of what constitutes a breaking change in JVM languages. Therefore, we consulted the DevOps department of Smarter-Ecommerce GmbH and extended [48] to end up with a set of rules that works well for the languages Java and Scala.

Two different states of an application are analyzed and checked for changes according to the rules specified in Table 4.1. The change with the highest priority is applied, i.e., major comes before minor, which in turn comes before a patch. That means if at least one major change is detected, the major version number gets incremented and the minor and patch versions are set to zero. If no major change and at least one minor change is detected, the major version stays the same, the minor number is incremented and the patch version is set to zero. If neither a major nor a minor change has been detected, the patch version is incremented and the other version numbers remain unchanged.

For each file, class, method and field, the new state of the application is compared to the old state and each rule defined in Table 4.1 is checked. As soon as every component of an application has been investigated according to the set of rules, the change with the highest priority, as described above, is applied to determine the new version number.

Table 4.1: Semantic versioning rules.

Change Type   Change Description
MAJOR         Method has been removed
MAJOR         Class has been removed
MAJOR         Field has been removed
MAJOR         Type of a field changes
MAJOR         Type of a method's parameter changes
MAJOR         Type of a method's return value changes
MAJOR         Number of parameters in a method changes
MAJOR         Interface is removed
MAJOR         Method visibility/accessibility decreased
MAJOR         Field visibility/accessibility decreased
MAJOR         Class visibility/accessibility decreased
MAJOR         A method is added to an interface
MINOR         Method has been added
MINOR         Field has been added
MINOR         Class has been added
MINOR         Method visibility/accessibility increased
MINOR         Field visibility/accessibility increased
MINOR         Class visibility/accessibility increased
MINOR         Method is no longer final
PATCH         Overriding method has been removed (base implementation exists)
PATCH         Value of a field or constant changes
PATCH         Annotation target changes
PATCH         Annotation is removed or added
PATCH         Value of an annotation changes
PATCH         Internal implementation changes (no change to the outside interface)
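The version bump logic described above (the highest-priority change wins and lower-order numbers are reset to zero) can be sketched as:

```java
// Sketch of the semantic versioning bump logic; the class is illustrative,
// not the prototype's actual implementation.
class SemanticVersion {
    enum Change { MAJOR, MINOR, PATCH }

    final int major, minor, patch;

    SemanticVersion(int major, int minor, int patch) {
        this.major = major; this.minor = minor; this.patch = patch;
    }

    // Applies the detected changes: the change with the highest priority
    // (lowest ordinal) determines which number is incremented.
    SemanticVersion bump(Iterable<Change> detectedChanges) {
        Change highest = Change.PATCH;
        for (Change c : detectedChanges)
            if (c.ordinal() < highest.ordinal()) highest = c;
        switch (highest) {
            case MAJOR: return new SemanticVersion(major + 1, 0, 0);
            case MINOR: return new SemanticVersion(major, minor + 1, 0);
            default:    return new SemanticVersion(major, minor, patch + 1);
        }
    }

    @Override public String toString() { return major + "." + minor + "." + patch; }
}
```

For example, a set of detected changes containing one minor and several patch changes bumps version 1.2.3 to 1.3.0.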

4.5 Inter-Service Dependency Checking

Another part of Continuous Architecture Evaluation, and a special focus of this work, is the architecture validation across microservices. This subsystem assists the developer in managing a global architecture by introducing an inter-service dependency check between systems. These systems are very often microservices and communicate without direct access, using remote communication such as REST, RPC or asynchronous calls via, for example, a message bus. It is therefore not possible to extract those dependencies using simple byte-code dependency analysis as in the case of the architecture evaluation of a single system. Therefore, the whole build has to be analyzed and checks have to be based on the semantic versioning of the systems. If a microservice depends on another microservice with a specific version, a major change might result in an inconsistent state of the whole service landscape consisting of all services relevant for a big system.

We therefore decided to create a prototype that simply stores which service depends on which service, including the version number, and detects possible inconsistencies caused by deploying an incompatible service version. This is done by a simple compatibility check based on the semantic versioning principles [47]. If the upgraded version is a major upgrade, the developer has to be notified of the incompatibility caused by the breaking change. If the live version of a system gets downgraded, which might result in a state where the current major or minor version is lower than the required one, an incompatibility arises and is presented to the developer as well.
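This compatibility check can be sketched as follows; a simplification assuming plain "major.minor.patch" version strings, with illustrative method names.

```java
// Sketch of the inter-service compatibility check: a deployed service
// satisfies a dependency if the major versions match and the deployed minor
// version is at least the required one (patch level is ignored, as patches
// never change the API).
class CompatibilityCheck {
    static boolean isCompatible(String required, String deployed) {
        int[] req = parse(required), dep = parse(deployed);
        if (dep[0] != req[0]) return false; // major change breaks the API
        return dep[1] >= req[1];            // downgrade below required minor breaks
    }

    private static int[] parse(String version) {
        String[] parts = version.split("\\.");
        return new int[]{Integer.parseInt(parts[0]),
                         Integer.parseInt(parts[1]),
                         Integer.parseInt(parts[2])};
    }
}
```

For example, a service requiring version 1.2.0 still works against a deployed 1.3.5, but neither against 2.0.0 (major upgrade) nor against 1.1.9 (downgrade below the required minor).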

Chapter 5

Implementation of the CAE Tool

This chapter describes how we approached and solved the problems presented in Chapter 3. We present implementation details of the systems and components already outlined in the previous chapter. The main focus of this work is the integration of all relevant subsystems into Continuous Integration. In order to allow the seamless integration into Smarter-Ecommerce's existing CI environment as well as to satisfy the flexibility and applicability requirements discussed in Chapter 3, each system is written as a plugin for a build automation tool such as Maven, Ant or Gradle. For this project we decided to use Gradle, as it is mainly used for the Java Virtual Machine programming languages Java, Scala and Groovy. As all those languages are supported, the requirement to be independent of the programming language is already met by using Gradle as our build tool. Additionally, Smarter-Ecommerce already used Gradle in production for its existing build infrastructure, making it easier to test the seamless integration into existing projects and environments, which is a key requirement also discussed in Chapter 3 (flexibility, usability).

The following sections give a detailed description of how we introduced architectural evaluation in conjunction with Continuous Integration, presenting how the individual systems as well as the workflow and data flow of each system are implemented.

5.1 Dependency Checking within a Project

As described in previous chapters, we try to ensure that the architecture conforms with the actual implementation of a system. We created a Gradle plugin that parses the dependency definition file and creates an internal representation of the architecture definition. Afterwards the executable is analyzed and the dependency graph is extracted using an external tool. For each specified module there is a main package, for which we check whether the defined architecture (for example layering or access rules) matches the implementation. Additionally, in JVM-based languages, cyclic dependencies are considered bad practice. Therefore, while building the dependency graph, checks are in place detecting strongly connected components and finding cycles.

Extracting the dependency graph can be done with different tools:

• JDepend: ”A Java package dependency analyzer that generates design quality metrics.” [11]

• Classycle: ”An analysing tool for Java class and package dependencies.” [4]

• Dependency-Analyser: ”A tool to analyze Java class files in order to learn more about the dependencies between those classes.” [5]

We evaluated these tools and decided to use Classycle, as it provides an easy way to call it from Ant, allowing us to use the Ant commands directly from Gradle. All of the above tools use Tarjan's algorithm [52] to detect the strongly connected components and end up with the correct graph. Dependency-Analyser and JDepend are mostly used for a graphical representation of the dependencies and do not allow for lean command-line access to the dependency graph, making it tedious to integrate them with Continuous Integration. Therefore, our decision was to use Classycle as the tool best suited to be integrated into Continuous Integration, which is the main focus of this thesis.

Smarter-Ecommerce uses different Java Virtual Machine languages, in particular Java, Scala and Groovy. Therefore, the dependency graph is generated on the JVM byte-code level. In effect every language that can be compiled into byte code can have its dependencies extracted.

Automatic detection of dependencies can be performed at various levels of granularity. For large-scale software systems the tool may be used to detect patterns between projects. For medium-sized applications it can be done at the package level. While analyses at the level of classes and methods would also be possible, we found them to be less useful, as the architecture in agile projects changes often and a simple renaming of a method might render the architecture invalid. Dependencies among packages and projects are computed based on dependencies among the lower-level elements, such as classes, that they contain. Specifically, two projects depend on each other only if at least one package from one of the projects depends on at least one package from the other project. Packages depend on each other only if at least one class from one of the packages depends on at least one class from the other package. Classes depend on each other if they, for example, extend each other or contain methods that call methods from the other class [31].

As a large software system contains multiple modules and components, our default architecture definition allows restricting the access between modules and multiple projects by introducing a specific public interface, if one is defined. The public interface ensures that another service or module does not call anything from a package other than the public interface package. If a public interface is defined, we can assume a bounded context structure within the project. The term bounded context was coined by Evans [28] and is a crucial architectural style, which will be discussed and analyzed in Chapter 5.2. Additionally, if a bounded context is present, further rules have to be introduced: the internal package must not be used outside of the current module (only the service interface and, if defined, the public interface package may be used). The public interface package is also used to establish a version number based on semantic versioning rules.
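The lifting of class-level dependencies to the package level described above can be sketched as follows; the class and method names are illustrative assumptions.

```java
import java.util.*;

// Sketch: a package P depends on package Q if at least one class in P
// depends on at least one class in Q. The same scheme lifts package
// dependencies to project dependencies.
class DependencyLifting {
    static String packageOf(String className) {
        int i = className.lastIndexOf('.');
        return i < 0 ? "" : className.substring(0, i);
    }

    // classDeps maps a fully qualified class name to the classes it uses.
    static Map<String, Set<String>> packageDeps(Map<String, Set<String>> classDeps) {
        Map<String, Set<String>> result = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : classDeps.entrySet()) {
            String from = packageOf(e.getKey());
            for (String target : e.getValue()) {
                String to = packageOf(target);
                if (!from.equals(to)) // intra-package references are not lifted
                    result.computeIfAbsent(from, k -> new HashSet<>()).add(to);
            }
        }
        return result;
    }
}
```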

The workflow of our architecture evaluation tool is shown in Figure 5.1. The architecture evaluation system can be explained as follows:

1. Architecture description: The architecture is defined in a file that is checked in to the VCS. An example is displayed below.

2. Parser: The architecture description file is parsed and converted to the inter- nal representation containing a graph that represents all the classes, packages and projects and the allowed dependencies.

3. Additional Rules: These parameters can contain additional rules specified in the architecture description file as well as specific settings for Classycle (as cyclic dependencies can be checked by Classycle directly).

4. Classycle Dependency Extraction: Classycle checks our executable and extracts all dependencies, creating a graph resembling our internal represen- tation.

5. Conformance Check: The conformance of the internal representation of the architecture with the actual representation of the architecture is checked and validated.

Figure 5.1: Workflow of the architecture evaluation system.

6. Terminating state: If successful, the next stage of Continuous Integration can proceed. If the architecture does not conform, a rule for Classycle does not succeed or Classycle fails, an error is produced and the developer is notified of the problems.
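The conformance check in step 5 can be sketched as a set comparison between the dependencies allowed by the architecture description and the dependencies actually extracted from the byte code (a simplified illustration with hypothetical package names; the real tool operates on the graph produced by Classycle):

```java
import java.util.*;

// Sketch: report every extracted dependency that the architecture does not allow.
public class ConformanceCheck {

    public static List<String> violations(Set<String> allowed, Set<String> actual) {
        List<String> result = new ArrayList<>();
        for (String edge : actual) {
            if (!allowed.contains(edge)) {
                result.add("Illegal dependency: " + edge);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Allowed edges come from the parsed architecture description (step 2).
        Set<String> allowed = new HashSet<>(Arrays.asList("ui->logic", "logic->data"));
        // Actual edges come from the byte-code analysis (step 4).
        Set<String> actual = new HashSet<>(Arrays.asList("ui->logic", "data->ui"));
        // A non-empty result fails the build and notifies the developer (step 6).
        System.out.println(violations(allowed, actual)); // prints [Illegal dependency: data->ui]
    }
}
```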

An architecture description file example is displayed here:

Listing 5.1: An example dependency definition file with bounded contexts.

module {
    mainPackage = "smarter.ecommerce.microservice"
    layering = "app -> ui -> api -> workflow -> logic -> data -> util"
    // anything outside layers may not reference anything inside layers
    crossContext = """
        examplecontext -> interfaces
        examplecontext -> incidents
        main -> examplecontext
    """
    allowCycles = "false"
    publicInterface = "pub"
    strictLayering = true

    // Additional Rules:
    independentOf = "smarter.ecommerce.microservice.logic -> smarter.ecommerce.microservice.ext"
    independentOf = "smarter.ecommerce.microservice.logic -> smarter.ecommerce.microservice.ui"
}

The dependency check has been implemented as a Gradle plugin. The proposed dependency definition file has been enhanced by adding specific cross context calls, making complex bounded context calls more apparent to the developer while also throttling the introduction of calls to different bounded contexts. On the one hand this makes our checks easier to configure and inspect; on the other hand it forces the developer to consider and weigh each call to another context by determining whether calling another bounded context is really necessary.

The rules of the example architecture description file are described below:

• mainPackage: The working boundaries of the module are described by the main package. Every subpackage of this main package is investigated for architectural violations. This allows for flexible architecture definitions, as each component can be checked individually by having its own main package. These individual components can be grouped into larger components based on their domain, allowing for hierarchical structures.

• layering: A layering structure is enforced. Anything outside the layers may not reference anything inside the layers, and every layer may only access components in the layers below. Only components in the publicInterface of another bounded context may be accessed, as the internals have to remain hidden within the context. The publicInterface should be as lean and small as possible.

• crossContext: This parameter defines a bounded context structure. Each bounded context may only access components within the same context by default. If other contexts are to be accessed, the allowed connections between bounded contexts (cross context calls) are defined by this parameter.

• allowCycles: If this parameter is false, the check enforces that no cyclic dependencies exist. Allowing cycles can be a valid choice, but cyclic dependencies are generally considered bad practice in Java.

• strictLayering: This parameter enforces that each layer may only access the layer directly below it. If set to false, layers can be skipped, which may be useful to reduce code duplication, but can introduce security risks.

• independentOf: Using this flexible parameter, additional rules that restrict the access between packages or other components can be defined.

The architecture description file is flexible and can contain different levels of abstraction. This flexibility is very useful for agile teams, as it is very common that the architecture evolves as the system grows. However, a basic architecture has to be present for each system to make Continuous Architecture Evaluation reasonable. For Smarter-Ecommerce that default architecture contains bounded contexts and layering. The bounded context structure [28] is also very useful when refactoring a larger system into multiple microservices [45]. A viable example of this extraction is discussed in the next section.
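The layering and strictLayering rules can be illustrated with a small sketch (a simplified illustration with our own method names; the layer names follow Listing 5.1, and only cross-layer accesses are considered):

```java
import java.util.*;

// Sketch: check whether a dependency between two layers is allowed under
// a layering definition such as "app -> ui -> api -> workflow -> logic -> data -> util".
public class LayeringRule {

    private final List<String> layers;
    private final boolean strict;

    public LayeringRule(String layering, boolean strict) {
        this.layers = Arrays.asList(layering.split("\\s*->\\s*"));
        this.strict = strict;
    }

    // A layer may only access layers below it; with strictLayering enabled,
    // only the layer directly below.
    public boolean isAllowed(String fromLayer, String toLayer) {
        int from = layers.indexOf(fromLayer), to = layers.indexOf(toLayer);
        if (from < 0 || to < 0 || to <= from) return false;
        return !strict || to == from + 1;
    }

    public static void main(String[] args) {
        LayeringRule strict = new LayeringRule(
                "app -> ui -> api -> workflow -> logic -> data -> util", true);
        System.out.println(strict.isAllowed("ui", "api"));   // true: directly below
        System.out.println(strict.isAllowed("ui", "logic")); // false: skips layers
        System.out.println(strict.isAllowed("data", "ui"));  // false: upward access
    }
}
```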

5.2 Extraction of Microservices

For Smarter-Ecommerce a default architecture definition file is created. This basic architecture is based on a common structure found in the existing microservices of the company. The default architecture contains bounded contexts and layering, and does not allow cycles. As the bounded context structure is present, we implement the structure defined in Chapter 4.3. This implementation is the base for all new software systems. The defined package structure introduces a default architecture that allows for an easy refactoring into different contexts, which can then be extracted into new services.

Figure 5.2 shows the package structure and displays the developer's view on the modules and internal implementations. The top level is the bounded context, while the next layer consists of the application (handles the startup of client and server), client (the user interface) and server (the backend) packages. Underneath these packages, the internal, module and public structure is organized as displayed in Figure 4.2 to Figure 4.5. The lowest level is the layering structure that is defined in our dependency definition file.

If the system adheres to the given structure (i.e., the architecture validation succeeds for our predefined structure shown in Figure 5.2), a microservice of the structure seen in Figure 1 can be extracted, wrapping the functions contained in the published interface into remote endpoints and all calls to those functions into RPC calls. To ensure this, the publicInterface parameter has to be set, as a public interface has to be defined to enforce the bounded context criteria.

We refactored the service-skeleton into the package structure defined in Figure 5.2. As a simple test, we duplicated the service and started only the incident bounded context in one service and only the example bounded context in the other service. As both services started successfully and can communicate via messaging and RPC calls, the restructuring into the packages defined by our architecture definition file eased the extraction significantly.

A part of this thesis is to evaluate how Continuous Architecture Evaluation can be used for microservices. The results of this concept for Smarter-Ecommerce indicate that this approach can simplify the extraction of microservices from a monolithic system.

Figure 5.2: The package structure of the service-skeleton microservice.

5.3 Semantic Versioning

This section covers the automated versioning of our systems and microservices. As we believe that versioning can be used to validate architecture across microservices, this section discusses our approach to automate the versioning of software. Figure 5.3 shows our workflow for the semantic versioning system. We are using byte-code analysis to extract the signatures of all methods, fields and classes to determine the semantic versioning upgrade type of the application. The rules for the upgrade are explained in detail in Chapter 4. In order to extract the byte code we examined different tools. Possible tools to read the byte code and help us create an internal representation of the methods and fields are:

• ASM: ASM is an all-purpose Java byte-code manipulation and analysis framework [1].

• BCEL: The Byte Code Engineering Library (Apache Commons BCEL) can analyze, create, and manipulate Java class files [2].

• ByteBuddy: ”Byte Buddy is a code generation and manipulation library for creating and modifying Java classes during the runtime of a Java application.” [3]

• FindBugs: ”FindBugs uses static analysis to inspect Java byte code for occur- rences of bug patterns.” [7]

• Janalyzer: ”The JAnalyzer Toolkit implements a visualisation tool for static analysis on Java applications.” [10]

We evaluated all of the above tools. FindBugs is a large framework detecting possible code issues based on byte-code analysis. However, the whole framework was too heavyweight for the tasks we needed it for. The dependencies used by FindBugs as well as the size of the framework itself made us decide against it. ByteBuddy is mostly used for runtime code generation and runtime code manipulation. As all we want to do is read and extract information from the byte code, this tool did not fit very well. Janalyzer is more of a visualization tool and was not fit to be integrated into Continuous Integration. Additionally, its strong focus on Java violated our flexibility requirement, as multiple JVM-based programming languages should be supported. BCEL and ASM are both capable and very similar tools. BCEL is easier to use, but ASM has a very good module structure that can be loaded as needed.

In the end we settled for ASM, as it was the best documented and leanest tool of the above. The active development and active community of ASM also supported our decision against BCEL, as we want a byte-code analysis framework that is actively maintained to support changes in the JVM. ASM consists of several modules, which can be loaded as needed; we only needed the extraction module.

Figure 5.3: Workflow of the semantic versioning system.

The workflow of the semantic versioning system displayed above can be explained as follows:

1. Executables: The byte code of the application that will be checked is avail- able.

2. Byte-Code analysis: The byte code of the application is extracted using ASM. All the classes, methods and field information is gathered.

3. Internal representation: The internal representation is a set of classes, a set of fields and a set of methods, complete with the signatures of the methods. The following information is stored:

• Name of the class, the method, the field

• Name of superclasses and implemented interfaces

• Annotations

• Type of fields

• Return type of methods

• Parameter count, types and order of methods

• Visibility of classes, fields and methods

4. Comparison: The difference between the two representations is checked and stored.

5. Change log: The change log records the changes between two versions in order to document them for the developer.

6. Semantic Version Rules: The rules that were defined in Chapter 4.3 are applied to the change log and the upgrade type is determined (according to the highest priority as discussed in Chapter 4.3).

7. Apply upgrade: With the version number of the old executable and the upgrade type figured out, the new version number is determined.
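The comparison and rule application in steps 4 to 7 can be sketched as follows (a minimal illustration over method signatures only; the real system also compares classes, fields, annotations and visibility, and the example members mirror the first row of Table 6.1):

```java
import java.util.*;

// Sketch: derive the semantic versioning upgrade type from two API snapshots
// and apply it to a Major.Minor.Patch version number.
public class SemverUpgrade {

    // Removed members break clients -> major; added members extend the API
    // -> minor; otherwise only internals changed -> patch.
    public static String upgradeType(Set<String> oldApi, Set<String> newApi) {
        Set<String> removed = new HashSet<>(oldApi);
        removed.removeAll(newApi);
        if (!removed.isEmpty()) return "major";
        Set<String> added = new HashSet<>(newApi);
        added.removeAll(oldApi);
        if (!added.isEmpty()) return "minor";
        return "patch";
    }

    public static String apply(String version, String type) {
        String[] p = version.split("\\.");
        int major = Integer.parseInt(p[0]), minor = Integer.parseInt(p[1]),
            patch = Integer.parseInt(p[2]);
        switch (type) {
            case "major": return (major + 1) + ".0.0";
            case "minor": return major + "." + (minor + 1) + ".0";
            default:      return major + "." + minor + "." + (patch + 1);
        }
    }

    public static void main(String[] args) {
        Set<String> oldApi = new HashSet<>(Arrays.asList("void printVariable()", "String name()"));
        Set<String> newApi = new HashSet<>(Arrays.asList("String name()", "void printString()"));
        String type = upgradeType(oldApi, newApi); // printVariable() removed -> major
        System.out.println(apply("1.4.7", type));  // prints 2.0.0
    }
}
```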

5.4 Inter-Service Dependency Checking

As new services are introduced and the interaction between software components grows, the need to manage these communication dependencies grows as well, especially now that microservices have become a standard technique to achieve scalability [58, 19]. We handle the compatibility checks between different services by introducing a tool that manages our versions and detects compatibility problems based on our approach to semantic versioning. If an incompatibility is detected, the user is notified of the problem, and the responsibility to fix the compatibility problem lies with the programmer.

Incompatibilities are checked as shown in Figure 5.4. As a service starts up, the live version of that service is written to the database. Additionally, all the release versions created by our automatic semantic versioning tool are also written to the database. When a service is released, its dependencies are analyzed using simple Gradle commands [8]. If a dependency becomes incompatible through a version number upgrade detected by our semantic versioning tool (i.e., a major upgrade is present), an incompatibility is displayed to the developer. If the live version gets downgraded, the check also verifies that the compatibilities between the major and minor versions as described in Chapter 4 still hold. If an incompatibility is present, the developer is notified with an alert showing the problem and possible ways to fix it.

The workflow of inter-service dependency checking shown in Figure 5.4 and our approach can be explained as follows:

1. Microservice build: The microservice is built and versioned correctly, with the tests and the architecture evaluation passing.

2. Detect version upgrade: If the new version of a service is a breaking change, affected services have to be changed, as an incompatibility has been detected.

3. Service dependency information: This is a database table that contains the information on which service requires which other service and in which version. It is built using the workflow on the right, that is, the extraction of build dependencies by using a naming convention. If a client library (identified by naming and package convention) of another service is referenced, the name and version of that other service is stored in the database.

Figure 5.4: The workflow of the inter-service dependency system.

An example of the database dependency table and the interaction with the system can be seen below:

Table 5.1: An example of dependency information in the database.

Base microservice   Depends on   Version
MS1                 MS2          2.3.1
MS1                 MS5          1.0.0
MS1                 MS3          4.2.0
MS2                 MS3          4.3.0
MS3                 MS4          5.3.0

The dependency information shown in Table 5.1 was initially entered by the user. Now a script handles the extraction of remote calls from microservices and builds the dependency information in the database automatically. The dependencies between microservices are shown: MS1 depends on MS2, MS3 and MS5. As MS2 depends on MS3 as well, the version of MS3 has to be compatible (in this case the live version of MS3 has to be at least version 4.3.0). If the user gets notified of an incompatibility (for example, MS2 gets updated to version 3.0.0), the user has to check if his code is still compatible and change the version number in the database to 3.0.0. This is usually done in a special user interface.
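The compatibility check against the live version can be sketched like this (a simplified illustration of the semantic versioning compatibility rule assumed here: same major version, and a live minor/patch level at least as high as required; service versions follow Table 5.1):

```java
// Sketch: decide whether a live service version satisfies a required version
// under semantic versioning (same major, live minor.patch >= required).
public class CompatibilityCheck {

    public static boolean isCompatible(String required, String live) {
        int[] req = parse(required), liv = parse(live);
        if (req[0] != liv[0]) return false;           // major mismatch breaks clients
        if (liv[1] != req[1]) return liv[1] > req[1]; // a newer minor only adds API
        return liv[2] >= req[2];
    }

    private static int[] parse(String version) {
        String[] p = version.split("\\.");
        return new int[] { Integer.parseInt(p[0]), Integer.parseInt(p[1]),
                           Integer.parseInt(p[2]) };
    }

    public static void main(String[] args) {
        // MS2 requires MS3 in version 4.3.0 (cf. Table 5.1).
        System.out.println(isCompatible("4.3.0", "4.3.2")); // true: compatible patch
        System.out.println(isCompatible("4.3.0", "4.2.0")); // false: downgraded minor
        System.out.println(isCompatible("4.3.0", "5.0.0")); // false: breaking major upgrade
    }
}
```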

5.5 Developer Tools

The developer can detect architecture violations either right away by running the Gradle plugin locally or by simply using the standard coding workflow shown in Chapter 4.1 and committing his changes. As soon as the developer pushes his changes to the remote repository, these commits are detected by the Continuous Integration environment.

The developer can view the web interface of the CI environment as seen in Figure 5.5. At the top, the current build queue and the agents that can execute the tests, builds and checks are displayed. On the left of the table, the branches are displayed with traffic lights indicating whether the build passed or failed. If a build is red, a message explains what failed, indicating a compilation error, an architecture violation, or a failing test. This can be seen in Figure 5.6. If the architecture is violated, an error message is displayed, showing which classes access which other classes and which rule is violated by this access. This error message is shown in Figure 5.7.

Figure 5.5: The Continuous Integration environment at Smarter-Ecommerce GmbH.

Figure 5.6: The Continuous Integration environment displaying an architecture violation.

Figure 5.7: The error message corresponding to the architecture violation.

Additionally, a web interface to support the developers exists. This support website displays whether a build is broken and also shows the version number as well as the dependencies to other services. This web interface is created using Grafana [9] with Prometheus [13] as a data source. Most Continuous Integration environments have direct support for Prometheus metric exports or provide an easy-to-access REST interface. Such a dashboard is shown in Figure 5.8 and displays the health and state of the company's code as well as giving insight into the release process by showing the version numbers created by the semantic versioning and inter-service dependency systems. Figure 5.8 shows the builds related to each microservice. A red tile indicates that the build or a test is broken, orange indicates an architecture violation, and green means that everything worked for that microservice. Figure 5.9 displays the current live version of a service, which makes it easy to compare with the required service version in Figure 5.10 for communication between microservices, enabling architectural checks across different services.
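Prometheus scrapes such build metrics in its plain-text exposition format; a minimal sketch of how a CI export could render them (the metric name, labels and status encoding here are our own invention, not the ones used at Smarter-Ecommerce):

```java
import java.util.*;

// Sketch: render build states in the Prometheus text exposition format, as a
// gauge per microservice (0 = ok, 1 = architecture violation, 2 = broken build).
public class BuildMetrics {

    public static String render(Map<String, Integer> buildStates) {
        StringBuilder out = new StringBuilder();
        out.append("# TYPE build_status gauge\n");
        for (Map.Entry<String, Integer> e : buildStates.entrySet()) {
            out.append("build_status{service=\"").append(e.getKey())
               .append("\"} ").append(e.getValue()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> states = new LinkedHashMap<>();
        states.put("service-a", 0); // green tile
        states.put("service-b", 1); // architecture violation -> orange tile
        System.out.print(render(states));
    }
}
```

Grafana can then map the gauge values onto the green, orange and red tiles of the dashboard.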

These views serve the developers in many ways, as they can easily detect faulty microservices with build, test or architecture violations. They can also monitor the versions of the services and check whether the semantic versioning was done correctly. Additionally, they can look up which services are violating the inter-service dependency checks and act accordingly.

Figure 5.8: Developer’s view on the build of microservices.

Figure 5.9: A view that displays the live version of microservices.

Figure 5.10: A web-view of inter-service dependencies between microservices.

These views are vital to ensure that the CAE prototype is applicable in software projects, as they provide important feedback for the developers.

Chapter 6

Evaluation

This chapter describes the evaluation of our tool-supported development process. The prototype developed during this thesis is evaluated and analyzed. We relate this evaluation to the requirements and case study presented in Chapter 3. First we describe the evaluation method we applied. Afterwards, the introduction of our system into a large software project is described, as this thesis was done in collaboration with the company Smarter-Ecommerce GmbH. Finally, we discuss the results and benefits this work provides with regard to our requirements and research questions.

6.1 Evaluation Method

The evaluation method used to validate our approach is presented here. We conduct a case study with the same metrics as the case study presented as motivation in Chapter 3. Additionally, as the architecture evaluation spans different services, we evaluate how well our semantic versioning and the related inter-service dependency checking work. For this evaluation we compare our results to the study of the Maven repository presented in [48].

6.1.1 Case Study Metrics

For this case study we analyzed the commits of the Git repository at Smarter-Ecommerce GmbH after the introduction of the CAE prototype. We assume that the average number of commits as well as the average time to fix an architectural violation is lower because of the earlier and automatic detection. In order to conduct this research and answer the question How effective is Continuous Architecture Evaluation in improving software quality? the following metrics are used in this study:

• The date of the first occurrence of an architecture violation in order to measure the time that passes until it is detected and fixed.

• The commit message to derive what may have caused the violation.

• The branch type (either master-branch or feature-branch) as violations on the master branch are a bad sign as they propagate to all developers.

• The date of the resolution of the architecture violation in order to present the time until the error is fixed.

• The commit message to derive the reason for the fix.

• The number of commits until the problem was fixed.

We try to answer our research question Does automated semantic versioning outperform manual versioning? by introducing the following metrics and comparing them to the study of the Maven repository discussed in [48]:

• Total number of version changes for version numbers in the structure of Major.Minor.Patch.

• Total number of Major, Minor and Patch version upgrades.

• Total number of invalid version upgrades (i.e., a major version upgrade when no breaking change is present).

• The reason for the upgrade, e.g., the removal of a method resulting in a breaking change and a major upgrade.

Additionally, we want to collect the following information:

• How many inter-service dependency problems occurred?

• How often did the architecture change?

6.1.2 Collection of Metrics

The CAE prototype was launched and put into the development environment on the 3rd of October, 2017. We collected the metrics over more than two months. We try to be consistent with the metrics collected in the first case study, so the steps basically match the approach in Chapter 3. However, we were able to skip the investigation part, as all architecture violations are apparent from the log files and can be viewed in the web interface of the Continuous Integration environment. The following steps were executed to collect the discussed metrics:

1. Retrieve a build with an architectural violation using the web interface of the CI environment. Specifically, the search function of the web interface of TeamCity [17] was used to look for the violations.

2. Take note of the commit message and date as they are an essential metric.

3. Store the Git revision number as it will be used to determine the number of commits to fix the violation.

4. View complete build log to retrieve the type of violation.

5. Investigate the build log and find the next build that does not contain the architectural violation.

6. Collect the commit message, the date and the Git revision number of this build.

7. Execute the command git log revisionNumber1..revisionNumber2 --pretty=oneline | wc -l to count the number of commits until the fix occurred. Document and collect all metrics.

Moreover, to collect the metrics for semantic versioning we simply created a database table that contains the change log between two versions. While the semantic versioning tool is running, it inserts the relevant data into the database. The table structure can be found in Table 6.1.

As inter-service dependency problems break the build in the CI environment (failing early is important [32]), we can simply count the number of violations using the web interface or REST API of the existing Continuous Integration environment.

The number of changes to the architecture is determined by counting the number of commits that change the architecture definition file. This is done using git log --after="$date" --oneline -- "./*/$architectureDefinitionFileName" | wc -l

6.1.3 Collected Data

We conducted the collection of the metrics as described in the previous section and computed the results displayed in Table 6.2 and Table 6.3 below. The violations that are present can either be genuine architecture violations or cases where the architecture definition was not updated. The latter occurred often in the beginning, because the architecture definition changed. However, the architecture definition file was not always kept up to date, because the developers were not yet used to the tool developed during this thesis.

Table 6.1: Structure and examples of the change log table.

TimeStamp      Service Name  Branch     Type   Old Ver  New Ver  Major Changes                      Minor Changes
20161015.1713  service-a     master     Major  1.4.7    2.0.0    deletion of method printVariable   addition of method printString
20161221.0952  service-a     master     Minor  2.0.0    2.1.0    -                                  addition of method getParameters
20170210.0223  service-a     master     Patch  2.1.0    2.1.1    -                                  -
20170121.0722  service-a     master     Patch  2.1.1    2.1.2    -                                  -
20171028.1812  service-a     master     Major  2.1.2    3.0.0    method changed: printString        -
20180205.0722  service-a     master     Patch  3.0.0    3.0.1    -                                  -
20180112.1118  service-b     master     Minor  2.8.17   2.9.0    -                                  addition of method addString(String)
20180201.1212  service-b     feature-b  Patch  2.9.0    2.9.1    -                                  -
20180204.1858  service-b     feature-b  Minor  2.9.1    2.10.0   -                                  addition of method addString(String, String)
20180130.0910  service-c     feature-a  Patch  1.4.7    2.0.0    -                                  -

The different types of violation are described below:

• layering - The defined layering is violated as a layer accesses an upper layer.

• cycle(X) - A cyclic dependency has been detected. X classes are involved in the cycle.

• context - A cross context (as in bounded context as described in [28, 45]) call is made without permission. This is usually the access to the internal logic of another system.

• arch - The architecture description file is invalid.

Table 6.2: Collected architectural violation metrics.

Type       Date        Commit Message                                                         Branch   Violation  Commits to fix
Violation  29/11/2017  Fix error in dependency resolution                                     feature  context
Fix        30/11/2017  Configure context dependencies for service                             feature  -          118
Violation  17/11/2017  AdwordsAccountPerformanceDownload is triggered by a separate schedule  feature  layering
Fix        20/11/2017  Fix depCheck violations                                                feature  -          9
Violation  28/11/2017  Add filters and metrics                                                feature  context
Fix        29/11/2017  Update Architecture to reflect new requirements                        feature  -          17
Violation  01/12/2017  welcome alignment bounded context                                      feature  arch
Fix        01/12/2017  Update architecture with new bounded context                           feature  -          12
Violation  27/11/2017  Introduce suggestor queries                                            feature  layering
Fix        27/11/2017  Fix project structure                                                  feature  -          4
Violation  21/11/2017  WIP                                                                    feature  cycle(2)
Fix        23/11/2017  fix tests and dependencies                                             feature  -          31
Violation  10/11/2017  Move mutation to interface                                             feature  cycle(2)
Fix        15/11/2017  Remove cyclic class dependencies                                       feature  -          68
Violation  10/11/2017  Add Mutator                                                            feature  layering
Fix        10/11/2017  Remove unwanted dependency                                             feature  -          4
Violation  06/11/2017  WIP SuggestorQuery                                                     feature  arch
Fix        06/11/2017  Suggestion model extension                                             feature  -          35
Violation  06/11/2017  First version of mutator service                                       feature  layering
Fix        10/11/2017  Break cyclic class dependency                                          feature  -          55
Violation  06/11/2017  First version of mutator service                                       feature  cycle(3)
Fix        10/11/2017  Break cyclic class dependency                                          feature  -          55

Table 6.3: Collected architectural violation metrics (continued).

Type       Date        Commit Message                                                         Branch   Violation  Commits to fix
Violation  20/10/2017  Streamline obfuscatedService1 with other services                      feature  arch
Fix        20/10/2017  Fix context dependency definitions                                     feature  -          1
Violation  20/10/2017  Streamline obfuscatedService2 with other services                      feature  layering
Fix        23/10/2017  Streamline with other services                                         feature  -          6
Violation  25/10/2017  Introduce a obfuscatedEvent                                            feature  layering
Fix        27/10/2017  fix minor code issue                                                   feature  -          10
Violation  05/10/2017  Template wip                                                           feature  layering
Fix        05/10/2017  Changes during review                                                  feature  -          5
Violation  11/10/2017  Obfuscated Task was present in all subprojects - move to interface     feature  layering
Fix        12/10/2017  Fix build                                                              feature  -          50

We collected the semantic version changes across all microservices and collected the metrics described in the previous section. The data can be found in Table 6.4 below. As an SQL dump is not useful at this point, we executed the following commands to derive the metrics:

• Number of all items

• Number of all items grouped by their upgrade type

• Number of all items with major changes (major change references an empty list)

• Number of all items with minor changes (minor change references an empty list)

• Number of all items with major changes grouped by their upgrade type

• Number of all items with minor changes grouped by their upgrade type

• Number of all items without any major or minor changes grouped by their upgrade type

• Number of all different upgrade types in the major change list and minor change list
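The aggregations above can equivalently be sketched in code over the change-log rows (using a hypothetical minimal row type; the real data lives in the change log table described in Table 6.1):

```java
import java.util.*;

// Sketch: aggregate change-log rows into the metrics listed above.
public class ChangeLogMetrics {

    public static class Row {
        public final String upgradeType;      // "Major", "Minor" or "Patch"
        public final boolean hasMajorChanges; // true if the major change list is non-empty
        public Row(String upgradeType, boolean hasMajorChanges) {
            this.upgradeType = upgradeType;
            this.hasMajorChanges = hasMajorChanges;
        }
    }

    // Number of all items grouped by their upgrade type.
    public static Map<String, Long> countByUpgradeType(List<Row> rows) {
        Map<String, Long> counts = new TreeMap<>();
        for (Row r : rows) counts.merge(r.upgradeType, 1L, Long::sum);
        return counts;
    }

    // Number of all items with major (i.e. breaking) changes.
    public static long countBreaking(List<Row> rows) {
        long n = 0;
        for (Row r : rows) if (r.hasMajorChanges) n++;
        return n;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("Major", true),
                new Row("Patch", false),
                new Row("Patch", false));
        System.out.println(countByUpgradeType(rows)); // prints {Major=1, Patch=2}
        System.out.println(countBreaking(rows));      // prints 1
    }
}
```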

Table 6.4: Collected semantic versioning metrics.

Description                                                            Total Count
Total Upgrade Count                                                    4,913
Total of Breaking Changes                                              145
Total of Nonbreaking Changes                                           404
Total of Either Major or Minor Upgrades                                431
Total of Patch Upgrades                                                4,482
Total of Minor Upgrades                                                274
Total of Major Upgrades                                                157
Error suggesting Patch instead of Minor Upgrade                        3
Error suggesting Minor or Patch instead of Major                       12
New method to trait introduced                                         5
Method deleted                                                         7
Class has been removed                                                 62
Class has been removed and method has been removed                     72
Class, method and field have been removed                              11
New method introduced                                                  70
New fields introduced                                                  49
New class files introduced                                             33
New class files introduced and method introduced                       164
New class files introduced and new fields and new method introduced    88
Total number of inter-service dependency errors in CI environment      49
Architecture changes within the project                                99

6.2 Summary and Discussion of the Data

The collected data shows a substantial improvement in the time to correct architectural violations. This is a big improvement for the software development process and general quality assurance at Smarter-Ecommerce GmbH. This section analyzes the results of the collected data and compares them to the case study of the historical data as well as to the semantic versioning data presented in [48].

As displayed in Table 6.6 below, the average time to fix architecture problems went down drastically, with many architectural violations being fixed within the same or the following day. This is a significant improvement compared to a development process without Continuous Architecture Evaluation. Additionally, the number of commits needed to fix a problem decreased dramatically as well, even though the total number of commits per day has increased a lot. This might be caused by the larger development team compared to 2014. We can therefore conclude that Continuous Architecture Evaluation is very effective in improving software quality, as violations are detected and fixed dramatically quicker than without CAE.

Table 6.5: Summary of Metrics.

Type                            Count   Percentage
Total violations                16      -
Master branch violations        0       0%
Feature branch violations       16      100%
Fixes after review              1       6.25%
Layering violation              8       50%
Context access violation        2       12.5%
Cyclic dependency violation     3       18.8%
Architecture definition error   3       18.8%

Table 6.6: The average time until an architecture violation is fixed.

Type     Range               Mean       Stdev  Median
Time     0 days to 5 days    1.57 days  1.65   1 day
Commits  1 to 118 commits    29.5       33.6   13.5

The decline of architecture violations on the master branch is also an important factor. Most architectural changes now occur naturally with the new CI workflow and are detected on the feature branch. Therefore, the code reviewers' load is lighter, as they do not have to detect architectural changes manually; the prototype developed during this thesis highlights architectural problems for them. The percentage of layering violations did not change much compared to our old case study. A new type of error was introduced, the invalid definition of an architecture file; this, however, is a good sign, as it means that the architecture has been maintained and kept up to date.

For the evaluation of the semantic versioning we analyzed our results for automated semantic versioning as seen below:

Table 6.7: Upgrade type compared to breaking changes in our automated versioning system.

                     at least 1 breaking change
Update type   Yes    %      No     %      Total
Major         157    100    0      0      157
Minor         9      3.3    274    96.7   283
Patch         3      0.1    4,482  99.9   4,485
Total         169    3.4    4,756  96.6   4,925

Now if we compare it to the study of the Maven repository discussed by Raemaekers et al. [48], we see a major drop of breaking changes without the corresponding major version upgrade. This gives us confidence that our system improves software quality and is helpful for the developers at Smarter-Ecommerce GmbH. The study on the Maven repository contains a total of 148,253 executables with version numbers, whereas only 60,776 are usable for our approach, as these do not change their versioning schema throughout development and adhere to the standard described in [47]. Raemaekers et al. [48] describe how often semantic versioning is violated and not adhered to. We are comparing our system with manually created semantic versioning upgrades, which are forced by manual commit messages. This number is very accurate, as the developer can view the version number on the user interface as described in Chapter 5 and takes note of the version when releasing a product to the customer. If the version number does not correspond with the expectation, a custom commit upgrades the version number and this custom upgrade is logged. This way we can determine how often an incorrect version number was created.

Table 6.8 reproduces the essential part of the study described in [48], which we compare against our data.

Table 6.8: Upgrade type compared to breaking changes in the Maven repository [48].

                    at least 1 breaking change
  Update type      Yes      %        No      %    Total
  Major          4,268   35.8     7,624   64.2   11,892
  Minor         10,690   35.7    19,267   64.3   29,957
  Patch          9,239   23.8    29,501   76.2   38,740
  Total         24,197   30.0    56,392   70.0   80,589

As comparing Table 6.7 with Table 6.8 shows, the total percentage of breaking and non-breaking changes for our software system is in the same range as for the projects available in the Maven repository. However, the number of breaking changes for the minor and patch upgrade types is drastically reduced when using our CAE prototype for automated versioning. There were a total of twelve misclassifications where a breaking change was present, and a total of three misclassifications where patch was assigned although a minor change should have been detected. These errors are mostly problems of our system with annotations: while an annotation change is, by our definition, a patch change, some annotations add side effects. The twelve major misclassifications are changes to the @RequestMapping annotation provided by the Spring Framework [16]. While a value change of an annotation should only be a patch change, here the value represents the URL of the REST endpoint, so changing it is a breaking change because the URL changes. Some annotations in Scala also add hidden methods, which constitutes a minor change, and removing such an annotation would require a major upgrade. These changes are not detected, which is why a few incorrectly classified changes are present in this dataset.

Table 6.9: Most common changes in the Maven repository [48].

  Description                   # Occurrences
  Breaking changes
    method removed                    177,480
    class removed                     168,743
    interface class removed            46,854
    field removed                     126,334
    method changed                     69,335
    method parameters changed          54,742
    field changed                      27,306
    method in interface added          28,833
  Nonbreaking changes
    method added                      518,690
    class added                       216,117
    field added                       206,851
    interface added                    32,569
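The annotation edge case can be illustrated with a hypothetical classifier that special-cases @RequestMapping, whose value is part of the service's public API (this is a sketch for illustration, not the prototype's implementation):

```java
// Illustrative sketch of the annotation edge case: by default an annotation
// value change is only a patch, but a changed @RequestMapping value moves
// the REST endpoint's URL, which breaks clients and must be a major change.
public class AnnotationChangeClassifier {
    public static String classify(String annotationName, boolean valueChanged) {
        if (!valueChanged) return "NONE";
        // The mapping value is the URL of the endpoint and thus part of the
        // public API of the service; changing it is breaking.
        if (annotationName.equals("RequestMapping")) return "MAJOR";
        return "PATCH"; // the default rule for annotation value changes
    }

    public static void main(String[] args) {
        System.out.println(classify("RequestMapping", true)); // MAJOR
        System.out.println(classify("Deprecated", true));     // PATCH
    }
}
```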

Nevertheless, we believe this is a significant step toward improving software versioning, and it also helps with Continuous Architecture Evaluation, as a correct semantic versioning system is necessary for our approach to validate inter-service dependencies. All in all, we can answer the research question Does automated semantic versioning outperform manual versioning? with a yes: our CAE prototype is most definitely an improvement, as manual versioning may contain many incorrectly classified libraries, as shown in [48].

Detailed Look at Violations

In this section, the architectural violations at Smarter-Ecommerce that were detected early are examined in detail. We show the violations that occurred and the problems they would have caused for customers and developers. This small investigation is intended to illustrate why architectural violations have to be avoided and why the developed CAE prototype improves the development process.

Figure 6.1: Layered architecture with violations shown as red arrows.

Violation 27/11/2017 Introduce suggestor queries

A layering violation is present. The defined layering structure and the layering of the implementation are shown in Figure 6.1, where this violation appears as Illegal access 1: an illegal access from the API layer directly to the Logic layer. As strict layering is defined in the architecture, this is a clear violation of the architecture by the implementation.
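The strict-layering rule violated here can be sketched as a simple check over an ordered list of layers (layer names follow Figure 6.1; the code is illustrative, not the prototype's):

```java
import java.util.List;

// Minimal sketch of a strict-layering check: in a strict layered
// architecture a layer may only access the layer directly beneath it,
// so an access from api straight to logic (skipping workflow) is illegal.
public class LayeringCheck {
    static final List<String> LAYERS = List.of("api", "workflow", "logic", "data", "util");

    public static boolean isAllowed(String from, String to) {
        int f = LAYERS.indexOf(from);
        int t = LAYERS.indexOf(to);
        return t == f + 1; // only the directly adjacent lower layer is reachable
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("api", "workflow")); // true
        System.out.println(isAllowed("api", "logic"));    // false: Illegal access 1
    }
}
```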

Multiple problems would arise if this violation were not fixed. Firstly, the cohesion of the system decreases, reducing its maintainability. The architecture defines that the execution of the business logic has to happen in the workflow layer. With this violation, however, the execution of business logic may come from any component of the workflow or the API layer, which makes the system hard to maintain. An even bigger problem, and one very harmful for the company, is that authentication and access restriction for the business logic are handled in the workflow layer. A component of the API layer accessing the business logic directly is a major security flaw and must not be put into production. Thankfully the violation was detected early and fixed while it was still on the feature branch.

Violation 17/11/2017 AdwordsAccountPerformanceDownload is triggered by a separate schedule

A layering violation is present, shown in Figure 6.1 as Illegal access 2: the logic layer directly accesses the util layer. This might seem less severe, as skipping the data layer reduces code duplication for the access to util. However, this is again a case of lower cohesion, as access to util now happens in different places. Additionally, a serious bug is introduced: some components in the util layer cannot be started without proper initialisation of the data layer, which caused the application to crash on startup. Another big problem was that the tests initialised the relevant data in the test setup, hiding the problem until the code went live. If this violation had been put into production, the system would have crashed on startup. Our prototype ensured that the problem was solved quickly while it was still on a feature branch.

Violation 10/11/2017 Move mutation to the interface

This layering violation is a cyclic dependency between classes. The cyclic dependency causes the application to crash on startup, as an infinite loop is introduced when loading one of the affected classes. This architecture violation showed that the developers did not test thoroughly, as the infinite loop could have been detected with simple unit tests. However, those tests were not in place, and our CAE prototype ensured that the application could not be put into production, as a crash on startup must be avoided. These detailed investigations of software architecture violations give a good insight into why the architecture of a system is important and why our approach improves software quality in general. It was a major success for Smarter-Ecommerce, as some of the above violations would otherwise likely have been put into production.
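A class-level dependency cycle like the one in the last violation can be detected statically, before the application ever starts, with a depth-first search over the dependency graph (an illustrative sketch with hypothetical class names):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of a class-dependency cycle check: DFS over the dependency graph,
// where any edge back onto the current path indicates a cycle.
public class CycleCheck {
    public static boolean hasCycle(Map<String, List<String>> deps) {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String node : deps.keySet())
            if (dfs(node, deps, done, onPath)) return true;
        return false;
    }

    private static boolean dfs(String n, Map<String, List<String>> deps,
                               Set<String> done, Set<String> onPath) {
        if (onPath.contains(n)) return true;  // back edge: a cycle exists
        if (done.contains(n)) return false;   // already fully explored
        onPath.add(n);
        for (String m : deps.getOrDefault(n, List.of()))
            if (dfs(m, deps, done, onPath)) return true;
        onPath.remove(n);
        done.add(n);
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("Mutation", List.of("Interface"));
        deps.put("Interface", List.of("Mutation")); // cyclic: loops at load time
        System.out.println(hasCycle(deps)); // true
    }
}
```

A check of this kind fails the build on the feature branch, so the crash never reaches production.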

6.3 Integration into a Bigger Software Project

The tool developed during this thesis and the related case studies were done in collaboration with Smarter-Ecommerce GmbH. This section is intended to give an insight into the integration of the developed tool into software projects.

Currently, there are about 28 microservices online in different clusters. An agile team of around ten developers is maintaining and actively developing these microservices. Each microservice is written in a mix of JVM languages including Java, Groovy and Scala. As the number of microservices rose rapidly, the architecture of each service was neglected by the developers: there was no clear module structure and the architecture was not explicitly stated or written down. The architecture is continuously evolving because of the agile nature of the team. The evolution of the system and the size of the team create a strong need for the explicit statement, validation, evaluation and conformance checking of the architecture. As each microservice is released multiple times a day, a system for managing those services was required. Each of these services communicates with the others either asynchronously using a message bus or synchronously using REST or another form of RPC, so a change within one system requires a change in all other systems communicating with it. Therefore, semantic versioning became the technique of choice, as it explicitly states in the version number whether a change might break other systems. We implemented the architecture evaluation first and put it into practice for our "microservice stub", a skeleton of a service, which contains all relevant modules and communication endpoints for a microservice. This skeleton also contains a testing infrastructure, so that the individual developer can focus on coding without trouble setting the microservice up. The architecture file was added to the skeleton and the architecture validation Gradle plugin was introduced as described in Chapter 5. These plugins were enabled in the already existing Continuous Integration environment, currently TeamCity, developed by JetBrains [17].
This forced all developers to use the architecture validation system for new microservices, while the old microservices stay up and running and will gradually be refactored into this style, with an architecture definition file, during daily refactorings. The semantic versioning system was introduced with a big-bang switch for all microservices. The tool was implemented and tested thoroughly. After checking the correctness of the tool, it was rolled out to all systems by adding the Gradle plugin to each microservice and enabling the plugin on TeamCity. The

database handling the current version numbers was installed on the same server as the Continuous Integration environment, in order to mitigate possible connection problems, as it is one of the most vital parts of the new release procedure. As a release has to be possible at all times to allow critical bugfixes in an emergency (even if no correct version number is available), we added a fallback that retrieves the latest release version from the git tag in case the database is not reachable. If no version to compare with is available, the default behaviour is to simply assume a minor upgrade. If a change is breaking but cannot be detected by our tool, a feature was added to manually force a major upgrade using custom commit messages. Each successful build is also archived by putting an artifact into an artifactory [12]. For snapshots that are simply committed, the branch name is added to the artifact and the timestamp of the build is appended after the version number to ensure unique artifact names. The inter-service dependency checking was introduced last, because it builds upon the semantic versioning. Currently, the inter-service dependency check does not break the build if a dependency is violated across services based on the version number. A simple user interface in the form of a dynamic web page shows the dependencies between services, as well as the version numbers they rely upon. This web page has proved successful, as developers are using it to troubleshoot problems related to the dependencies between microservices. In future iterations of this tool, the system could be adjusted to notify developers of violations using instant messaging or email in addition to the user interface. Overall, the incremental introduction of the different systems was a success. The developers accepted it very well, meaning that the usability requirement was satisfied.
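The release fallback chain described above can be sketched as follows (the method names and the commit-message marker are assumptions for illustration; the real syntax is project-specific):

```java
import java.util.Optional;

// Sketch of the release fallback: the previous version comes from the
// version database, or from the latest git tag when the database is
// unreachable; a marker in the commit message can force a major upgrade
// when a breaking change escapes automatic detection.
public class ReleaseVersion {
    /** Previous release version: database first, git tag as fallback. */
    public static Optional<String> previousVersion(Optional<String> database,
                                                   Optional<String> gitTag) {
        return database.isPresent() ? database : gitTag;
    }

    /** Hypothetical marker for manually forcing a major upgrade. */
    public static boolean forcesMajor(String commitMessage) {
        return commitMessage.contains("[major]");
    }

    public static void main(String[] args) {
        System.out.println(previousVersion(Optional.empty(), Optional.of("1.2.3")).get()); // 1.2.3
        System.out.println(forcesMajor("fix endpoint [major]")); // true
    }
}
```

When neither source yields a version, the system simply assumes a minor upgrade, so a release remains possible even in a degraded state.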
The positive feedback from the developers, as well as the feedback from the main software architect, showed that the system enhanced the development process, proving that the applicability requirement is met. The correctness requirement was checked in the preceding evaluation, and flexibility is given, as the code within Smarter-Ecommerce is written in Java, Scala and Groovy.

6.4 Lessons Learned and Developer Feedback

In summary, the tool suite developed and deployed as part of this thesis was a success, as it was accepted and in general well received by the development team.

As soon as the architecture validation was introduced, developers started to get notified of architecture violations and had to take a closer look at the definition as well as their own implementation. This is an advantage, as the developer engages with the architecture definition and architectural knowledge spreads throughout the development team. An advantage of our plain-text architecture description file is that it is always checked in. This gave the developers the benefit that simple package refactorings supported by the IDE also changed the architecture description accordingly, keeping the architecture always visible and always up to date. The plain-text architecture definition is, in this case, superior to external tools. We also introduced a default architecture for all microservices that at least contains a layering architecture. This way, Continuous Architecture Evaluation is performed for every microservice that is created. It also gives the developer a minimal form of architecture to work with. Throughout the development of the microservices, the architecture file was gradually enhanced, making the architecture more detailed. This allowed for additional architectural checks, which improved the quality of the system.

As the automated validation of the architecture was introduced, the development process slowly changed. Reviews took less time, as no changes in the architecture meant smaller reviews. For reviews with architectural changes, the first step was to review the new architecture and to focus on the code quality in later steps. The tool developed in this thesis allows for an easier software review process, as the new architecture is immediately visible and comprehensible. In the beginning, strict layering checks for new microservices were disabled; as the developers got used to working with our tool and learned how to interpret the errors displayed in the Continuous Integration environment, we gradually introduced stricter rules. This was surprisingly well accepted, as the developers noticed the importance of the architecture evaluation while working with the system. The following points were most apparent and important to the development team:

• The developers gained knowledge about the architecture.

• The architecture was visible and present.

• The architecture was always up to date.

• The default architecture for microservices was an improvement.

• A flexible architecture to support newly introduced and mature services alike was required.

• Developers were forced to evaluate architecture choices.

• Developers were forced to think about the architecture.

• The plain-text checked in architecture definition was visible to the developer and was kept up to date.

• If something is hidden behind tools it is neglected by the developers.

• Automate as much as possible. The CAE tool running in the background and detecting changes automatically was a big enhancement over running it manually with a button click.

• Start with lean rules and create strict rules as the developers get used to working with the tool.

• Refactoring work decreased, according to the development team.

Chapter 7

Conclusions

In this work we discussed the importance of adhering to a software architecture, which is often neglected during development [22, 55, 44, 38], especially when using agile software methodologies and microservices [29]. As Continuous Integration is an integral and well-established technique in current software development processes, we introduced a system that validates the implementation of a software system against its architecture. Nowadays, the architecture of a software product is no longer restricted to a single software component, but can be a multitude of microservices communicating with each other, in the form of remote calls or asynchronous messaging. This type of communication between services hinders the extraction of the architecture from the implementation. We therefore introduced a system that creates an automatic version number based on semantic versioning [48, 47]. These version numbers indicate possible incompatibilities across services by explicitly showing breaking changes in the version number. The dependencies across microservices (the architecture of a microservice landscape) can be validated using the version number of each individual microservice. Lastly, to allow an easy refactoring of a monolithic system, we defined a bounded context architecture for a microservice, which supports re-structuring a software system into multiple microservices. We based the defined architecture heavily on [45, 28] and found that the separation into bounded contexts facilitates the extraction, as typically each bounded context was already very loosely coupled and could become a microservice on its own.
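The inter-service validation rests on the usual semantic-versioning compatibility rule, which can be sketched as follows (an illustration, not the prototype's code):

```java
// Sketch of semver-based compatibility across service boundaries: a consumer
// built against provider version X stays compatible as long as the provider
// keeps the same major version and is at least as new as X.
public class ServiceCompat {
    public static boolean compatible(String builtAgainst, String deployed) {
        int[] a = parse(builtAgainst), d = parse(deployed);
        if (a[0] != d[0]) return false;       // a major change breaks clients
        if (d[1] != a[1]) return d[1] > a[1]; // a newer minor only adds API
        return d[2] >= a[2];                  // patch level must not go backwards
    }

    private static int[] parse(String v) {
        String[] p = v.split("\\.");
        return new int[]{Integer.parseInt(p[0]), Integer.parseInt(p[1]),
                         Integer.parseInt(p[2])};
    }

    public static void main(String[] args) {
        System.out.println(compatible("1.4.2", "1.6.0")); // true
        System.out.println(compatible("1.4.2", "2.0.0")); // false
    }
}
```

This is why correct automated versioning is a prerequisite for checking inter-service dependencies: the rule is only sound if breaking changes reliably appear as major upgrades.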

The following section answers the research questions using our design science methodology of building a prototype to evaluate our approach. The questions and the related answers are presented below.

How can Continuous Architecture Evaluation be applied to software projects? The prototype itself is a good example of how Continuous Architecture Evaluation can be applied to software projects. As the system is in production at Smarter-Ecommerce GmbH, we believe that our approach is a valid way to apply CAE to software projects.

How can Continuous Architecture Evaluation be used for microservices? As there is no traditional way to apply CAE across the boundaries of a single software system, we combined our approach for CAE with semantic versioning. As our evaluation in Chapter 6 reports a total of 49 errors for inter-service dependency violations, we believe that architectural checks across services are possible with our approach. It can be reasoned that semantic versioning is a valid approach for using CAE with microservices.

How effective is Continuous Architecture Evaluation in improving software quality? Our case study in Chapter 6 suggests a massive improvement to software quality, as architecture violations are detected early and fixed rapidly. The time until a fix occurs, and its variance, have improved significantly. As the available dataset is still small, the results might not be representative for all software systems; however, the improvement is so large (with the average time to fix an architecture violation decreasing by a factor of 100) that its effectiveness can hardly be denied.

Does automated semantic versioning outperform manual versioning? The evaluation in Chapter 6 shows a tremendous improvement of automated semantic versioning compared to manual versioning. The assigned version number is more consistent and requires little to no effort from the developers. The comparison with the study of the Maven repository [48] shows a massive advantage of automated semantic versioning in terms of correctness and consistency.

To follow up on the requirements for our system, the CAE tool is examined below.

• Flexibility: Independence of programming languages is achieved to some degree. Our baseline is the JVM, as Smarter-Ecommerce uses Groovy, Java and Scala in production. The developed prototype can handle all three languages, fulfilling the language independence requirement; a restriction is that it only supports languages running on the Java Virtual Machine. The prototype is built with the Gradle [8] build automation tool and can be integrated into most existing CI environments. Additionally, the architecture definition can be written at different granularities to fit most projects and most stages of software system evolution. In summary, the flexibility requirement has been achieved.

• Correctness: Our requirements state that architecture violations have to be detected, that breaking changes have to be detected correctly, and that our prototype conforms with our rules and is consistent, in that it produces the same results when run repeatedly. These requirements are verified by the evaluation in Chapter 6. The number of errors is remarkably low, and while some errors are present, consistency with our rules is given, fulfilling the correctness requirement.

• Usability: As our tool is integrated into a CI environment, the developer does not need to learn an additional tool. The developer view in Chapter 5 is a good example of the usability of the system, as violations are made apparent to the developers and the error description enables easy fixes for a given violation. In summary, we can reason that the usability requirement is fulfilled.

• Applicability: The time to notice and react to violations becomes apparent in Chapter 6 when comparing the time using our prototype with the time to fix architectural violations without a tool in Chapter 3. As the architecture is also documented in a file that is checked in, we can reason that our requirement for applicability has been fulfilled. The plain-text format of the architecture definition file does not require the developer to learn additional tools.
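As a purely illustrative sketch (the prototype's actual syntax is defined in Chapter 5 and may differ), a plain-text architecture definition of the kind discussed here, combining a strict layering with an additional rule, might look like:

```text
# Hypothetical architecture definition file -- not the prototype's actual syntax.
layers: api > workflow > logic > data > util
layering: strict
# an additional rule, e.g. restricting which packages may access util
rule: allow logic -> data
rule: deny  api   -> logic
```

Because such a file lives next to the code in version control, IDE-supported package refactorings keep it up to date without a separate tool.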

This tool is already in production at Smarter-Ecommerce GmbH [15]. Future work will introduce additional functionality for the architecture description file, in order to allow even more architectural checks to be performed. One

could include design patterns that developers have to adhere to. The factory pattern would be a good example: a definition of the factory class and the access rules can easily be introduced. The factory pattern can already be expressed using the "additional rules" parameter of the current implementation, but a dedicated parameter to ease the addition of design patterns to the architecture definition file would be an excellent addition. The dependencies between classes of the actual implementation could also be displayed in the developer view as a directed graph between classes. This could increase the visibility of implementation or design problems. As the architecture description is currently a text file, a graphical representation in the developer view would be useful as well. However, we experienced that a special tool to edit and create the architecture is not accepted by our development team, so the architecture description should remain in the textual format checked into the version control system. A second, graphical view of the architecture would nonetheless be a welcome addition and would integrate well with the current developer view.

The extraction of bounded contexts of a monolithic system into multiple microservices is currently manual work; it could be supported by a simple script that copies the data of the bounded contexts and creates a communication wrapper (encapsulating the communication between microservices using remote calls) on top of it.

Overall, our case study shows that our tool-supported approach has been a success and can help development teams considerably, especially in domains using agile methodologies and microservices. Based on these experiences we are confident that this work is well suited to propagating architectural information and knowledge to the individual software developer. While there are still many open research questions for CAE in relation to microservices, we hope that this work is a good foundation for future research to build upon.

Bibliography

[1] Asm. asm.ow2.org, [January 29th 2018], 2002.

[2] Byte code engineering library (apache commons bcel). commons.apache.org/proper/commons-bcel, [January 29th 2018], 2004.

[3] Bytebuddy. bytebuddy.net, [January 29th 2018], 2013.

[4] Classycle. classycle.sourceforge.net, [January 29th 2018], 2004.

[5] Dependency-analyser. dependency-analyzer.org, [January 29th 2018], 2003.

[6] Elasticsearch. elastic.co [January 29th 2018], 2015.

[7] Findbugs. findbugs.sourceforge.net, [January 29th 2018], 2006.

[8] Gradle. gradle.org [January 29th 2018], 2009.

[9] Grafana. grafana.com [January 29th 2018], 2014.

[10] Janalyzer. sourceforge.net/projects/jptoolkit, [January 29th 2018], 2003.

[11] Jdepend. github.com/clarkware/jdepend, [January 29th 2018], 2004.

[12] Jfrog artifactory. www.jfrog.com/artifactory/ [January 29th 2018], 2014.

[13] Prometheus. prometheus.io [January 29th 2018], 2012.

[14] Rabbitmq. rabbitmq.com [January 29th 2018], 2010.

[15] Smarter-ecommerce. smarter-ecommerce.com [January 29th 2018], 2007.

[16] Spring. spring.io [January 29th 2018], 2004.

[17] Teamcity. www.jetbrains.com/teamcity [January 29th 2018], 2006.

[18] Vaadin. vaadin.com [January 29th 2018], 2007.

[19] A. Balalaie, A. Heydarnoori, and P. Jamshidi. Microservices Architecture Enables DevOps: Migration to a Cloud-Native Architecture. IEEE Software, pages 42–52, May 2016.

[20] K. Beck. Extreme programming explained: embrace change. Addison-Wesley Professional, 2000.

[21] K. Beck, M. Beedle, A. Van Bennekum, A. Cockburn, W. Cunningham, M. Fowler, J. Grenning, J. Highsmith, A. Hunt, R. Jeffries, et al. Manifesto for agile software development. 2001.

[22] G. Buchgeher and R. Weinreich. Integrated Software Architecture Management and Validation. In 2008 The Third International Conference on Software Engineering Advances, pages 427–436, Oct 2008.

[23] R. Conradi and B. Westfechtel. Version Models for Software Configuration Management. ACM Comput. Surv., pages 232–282, 1998.

[24] A. Decan, T. Mens, M. Claes, and P. Grosjean. When GitHub Meets CRAN: An Analysis of Inter-Repository Package Dependency Problems. In IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER), pages 493–504, March 2016.

[25] F. Deissenboeck, L. Heinemann, B. Hummel, and E. Juergens. Flexible architecture conformance assessment with ConQAT. In 32nd International Conference on Software Engineering, pages 247–250, May 2010.

[26] N. Dragoni, S. Giallorenzo, A. Lluch-Lafuente, M. Mazzara, F. Montesi, R. Mustafin, and L. Safina. Microservices: yesterday, today, and tomorrow. 06 2016.

[27] M. Eichberg, S. Kloppenburg, K. Klose, and M. Mezini. Defining and continuous checking of structural program dependencies. In ACM/IEEE 30th International Conference on Software Engineering, pages 391–400, May 2008.

[28] E. Evans. Domain-Driven Design: Tackling Complexity in the Heart of Software. Addison-Wesley, 2003. ISBN: 978-0-321-12521-7.

[29] M. Fowler and K. Beck. Refactoring: improving the design of existing code. Addison-Wesley Professional, 1999.

[30] D. Garlan. Software Architecture: A Roadmap. In Proceedings of the Conference on The Future of Software Engineering, ICSE, pages 91–101, New York, NY, USA, 2000. ACM.

[31] M. Goldstein and I. Segall. Automatic and Continuous Software Architecture Validation. In 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, pages 59–68, May 2015.

[32] A. Grabner. Performance and architecture validation with existing unit tests. dynatrace.com [January 29th 2018], 2011.

[33] J. Hamano. Git. git-scm.com [January 29th 2018], 2005.

[34] A. R. Hevner, S. T. March, J. Park, and S. Ram. Design Science in Information Systems Research. MIS Q., pages 75–105, March 2004.

[35] J. Humble and D. Farley. Continuous Delivery: Reliable Software Releases Through Build, Test, and Deployment Automation. Addison-Wesley Professional, 1st edition, 2010.

[36] K. Jezek, J. Dietrich, and P. Brada. How Java APIs Break - An Empirical Study. Inf. Softw. Technol., pages 129–146, September 2015.

[37] S. Kim, S. Park, J. Yun, and Y. Lee. Automated Continuous Integration of Component-Based Software: An Industrial Experience. In 23rd IEEE/ACM International Conference on Automated Software Engineering, pages 423–426, Sept 2008.

[38] S. Lakshitha and B. Dharini. Controlling software architecture erosion: A survey. Journal of Systems and Software, pages 132–151, 2012. Dynamic Analysis and Testing of Embedded Software.

[39] D. C. Luckham and J. Vera. An event-based architecture definition language. IEEE Transactions on Software Engineering, pages 717–734, Sep 1995.

[40] S. A. MacKay. The State of the Art in Concurrent, Distributed Configuration Management. In Selected Papers from the ICSE SCM-4 and SCM-5 Workshops, on Software Configuration Management, pages 180–193, London, UK, 1995. Springer-Verlag.

[41] F. Mancinelli, J. Boender, R. Cosmo, J. Vouillon, B. Durak, X. Leroy, and R. Treinen. Managing the Complexity of Large Free and Open Source Package-Based Software Distributions. In Proceedings of the 21st IEEE/ACM International Conference on Automated Software Engineering, ASE, pages 199–208, Washington, DC, USA, 2006. IEEE Computer Society.

[42] S. T. March and G. F. Smith. Design and natural science research on information technology. Decision Support Systems, pages 251–266, 1995.

[43] N. Medvidovic and R. N. Taylor. A classification and comparison framework for software architecture description languages. IEEE Transactions on Software Engineering, pages 70–93, Jan 2000.

[44] G. C. Murphy, D. Notkin, and K. J. Sullivan. Software reflexion models: bridging the gap between design and implementation. IEEE Transactions on Software Engineering, pages 364–380, Apr 2001.

[45] S. Newman. Building Microservices: Designing Fine-Grained Systems. O'Reilly, 2015. ISBN: 978-1-4919-5035-7.

[46] D. L. Parnas. Software aging. In Proceedings of 16th International Conference on Software Engineering, pages 279–287, May 1994.

[47] T. Preston-Werner. Semantic versioning 2.0.0. semver.org [January 29th 2018], 2014.

[48] S. Raemaekers, A. van Deursen, and J. Visser. Semantic Versioning versus Breaking Changes: A Study of the Maven Repository. In IEEE 14th International Working Conference on Source Code Analysis and Manipulation, pages 215–224, Sept 2014.

[49] M. Roberts. Enterprise Continuous Integration Using Binary Dependencies. In J. Eckstein and H. Baumeister, editors, Extreme Programming and Agile Processes in Software Engineering, pages 194–201, Berlin, Heidelberg, 2004. Springer Berlin Heidelberg.

[50] K. Schwaber and M. Beedle. Agile software development with Scrum. Prentice Hall Upper Saddle River, 2002.

[51] A. E. Solovjev. Operationalizing the Architecture of an Agile Software Project. 2014.

[52] R. Tarjan. Depth-first search and linear graph algorithms. In 12th Annual Symposium on Switching and Automata Theory (swat 1971), pages 114–121, Oct 1971.

[53] J. Thoenes. Microservices. IEEE Software, pages 116–116, Jan 2015.

[54] W. F. Tichy. Tools for Software Configuration Management. SCM, pages 1–20, 1988.

[55] J. B. Tran, M. W. Godfrey, E. H. S. Lee, and R. C. Holt. Architectural repair of open source software. In Proceedings IWPC 2000, 8th International Workshop on Program Comprehension, pages 48–59, 2000.

[56] S. Wiltamuth, A. Hejlsberg, P. F. Sollich, and B. M. Abrams. System and methods for providing versioning of software components in a computer programming language. Patent US 6981250 B1, Microsoft Corporation.

[57] M. Zaymus. Decomposition of monolithic to microservices. 2017.

[58] L. Zhu, L. Bass, and G. Champlin-Scharff. DevOps and Its Practices. IEEE Software, pages 32–34, May 2016.

Chapter 8

Curriculum Vitae

The following two pages contain the curriculum vitae of the submitter, as required by the standard for Master's Thesis submissions for Computer Science students of the Johannes Kepler Universität Linz (JKU).

Manuel Kollegger, Software Developer
+43 180 870 01 | [email protected] | Aubrunnerweg 41a, 4040 Linz, Austria | Born on 23rd of April 1992

I am a Software Developer, currently living in Austria and working for Smarter Ecommerce as a Software Engineer. I am about to graduate from my Master's studies in Computer Science at the Johannes Kepler University (JKU) in Linz. My main focus and interest is developing in the Java ecosystem. I have been interested in software development for a long time and started programming in Java and C when I was 15 years old. I am particularly interested in microservices and their whole environment. My Master's Thesis is a tool to support the refactoring of monolithic systems into microservices as well as automatic versioning and automatic software architecture validation in a Continuous Integration environment.

Skills

Programming Languages: Java, Scala, Android, Groovy, C, C#
Frontend Programming: JavaScript, AngularJS, Bootstrap, LESS, CSS, JSF, JSP
Databases: MySQL, PostgreSQL, Oracle Database, Microsoft SQL Server
Development environments: IntelliJ IDEA, Eclipse, Visual Studio
Supporting technologies: AWS, Gradle, Git, CI environment (TeamCity), Docker, Ansible, Maven, Ant, SVN
Operating systems: macOS, Windows, Debian Linux, Windows Server, other Linux distributions
Testing: Spock, Jasmine, JUnit
Others: Spring Framework (Boot, Core, Data, Security, Messaging, REST), SOA, TDD, Vaadin, Akka, RabbitMQ, machine learning and feature analysis using Python and Java (WEKA)

Work Experience

July 2016 - Present: Software Engineer | Full Stack Developer | DevOps, Smarter Ecommerce, Austria
- Product development in Java and Scala
- Creating a skeleton stub for Microservices
- Defining a new, enhanced Continuous Integration workflow
- Automatic monitoring of existing Microservices
- Creating monitoring UIs in Vaadin
- Attending various software development conferences
- Working with a Test-Driven Development mindset
- Working in an agile, multicultural team
- Using English as the main language for communication
Technologies: Spring, Scala, Java, Groovy, TDD, Gradle, TeamCity, Vaadin, Akka, IntelliJ IDEA

December 2016 - Present: Software Developer, Personal Project, Austria
- Developing an application to compare customers
- Creating a user interface for said application
- Personal project with an economist
Technologies: Spring, Java, Vaadin

Business Experience

July 2015 - October 2015: Summer Job | Full Stack Developer, Smarter Ecommerce, Austria
- Developing an application to support internal processes
- Creating a user interface for that application
- Working with a Test-Driven Development mindset
- Working in an agile, multicultural team
Technologies: JavaScript, Bootstrap, AngularJS, Spring, Java, LESS, TDD

Summer 2013: Summer Job | Database Engineer, MIC customs solutions, Austria
- Creating a tool to automatically replicate a database
- Manipulating replication with special configurations
- Database optimizations
- Database performance improvements
Technologies: SQL, stored procedures, Java, JDBC, H2, PostgreSQL, Oracle

January 2012 - October 2012: Civil Service | Transport of Patients, Hospital Gmunden, Austria
- Mandatory civil (in lieu of military) service
- Escorting or transporting patients to various examinations
Skills: social skills

Summer 2008 - Summer 2011: Summer Jobs | Developer, M&R Automation, Austria
- Various summer jobs
- Automating various robots and machines
- Creating user interfaces
- Embedded systems programming
- Supporting tool development in C and C#
Technologies: SPS, S7, C, C#, sensor technology, bus technology

Languages

German (native) · English (fluent)

Programming Languages

Java · Scala · Groovy · C · JavaScript · AngularJS · C#

Studies

2016 - Present: Master's program in Computer Science at the Johannes Kepler University Linz (graduating in Q1 2018). Thesis topic: Continuous Architecture Validation in the Context of Microservices
2016: Exchange semester in Logan, Utah, USA - Computer Science program at the Utah State University
2012 - 2015: Bachelor's program in Computer Science at the Johannes Kepler University Linz (graduated January 2016). Thesis topic: Light-field computing and rendering on Android
2006 - 2011: School for higher technical education in Computer Science - HTL Wels

Special Courses: Machine Learning | Pattern Classification | Java SE 8 | Functional Programming | Compiler Construction | Game Development | Computer Vision | Computer Graphics | JVM System Software | Agent Systems

Conferences: Regular at "Enterprise Java User Group" meetups | regular at the "Technologieplauscherl" technology meetup | SoCraTes Linz (Open Space Conference) | Topconf 2017 | IAAM (DevOps meetup)

Hobbies

Traveling: Traveled through Europe in 2010 and 2012, and a lot in the US during my exchange semester
Cycling: Cycled an annual road bike marathon in Wildon from 2011 to 2017
Running: Ran the Linz city marathon (half-marathon distance) in 2017
Hiking: Running, walking or crawling - whatever gets me up a mountain
Reading
Cooking

Chapter 9

Statutory declaration

Ich erkläre an Eides statt, dass ich die vorliegende Masterarbeit selbstständig und ohne fremde Hilfe verfasst, andere als die angegebenen Quellen und Hilfsmittel nicht benutzt bzw. die wörtlich oder sinngemäß entnommenen Stellen als solche kenntlich gemacht habe. Die vorliegende Masterarbeit ist mit dem elektronisch übermittelten Textdokument identisch.

I hereby declare that the thesis submitted is my own unaided work, that I have not used other than the sources indicated, and that all direct and indirect sources are acknowledged as references. This printed thesis is identical with the electronic version submitted.
