
DEGREE PROJECT IN COMPUTER SCIENCE AND ENGINEERING, SECOND CYCLE, 30 CREDITS

Evaluation of a qualitative model for a company's technical maturity within Continuous Integration, Continuous Delivery and DevOps

PER HAGSTEN

KTH ROYAL INSTITUTE OF TECHNOLOGY SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE

Evaluation of a qualitative model for a company’s technical maturity within Continuous Integration, Continuous Delivery and DevOps

PER HAGSTEN

Master’s degree in Computer Science
Date: August 24, 2018
Supervisor: Alexander Kozlov
Examiner: Cristian M Bogdan
Swedish title: Utvärdering av en kvalitativ modell för ett företags tekniska mognadsgrad inom Continuous Integration, Continuous Delivery och DevOps
School of Electrical Engineering and Computer Science


Abstract

The purpose of this study is to continue the development of a benchmarking model that helps companies assess their technical maturity in adopting Continuous Integration, Continuous Delivery and DevOps in their organization. The goals of the research are to assess how the quality of qualitative models can be improved, which conclusions can be drawn from comparing companies using the benchmark, and which actions are the most effective for reaching higher Continuous Integration, Continuous Delivery and DevOps maturity.

The benchmark consisted of a questionnaire of two hundred statements that were answered for level of agreement from two perspectives, a current-situation analysis and an ought-to-be analysis, so that conclusions could be drawn from the possible discrepancy between these two categories. The questionnaire was answered during an interview study with chosen clients.

The conclusions drawn from this study were that much can be done to improve the quality of qualitative models for examining Continuous Integration, Continuous Delivery and DevOps maturity. Different actions are necessary, but the most important seems to be to ask open-ended questions, as well as questions about different aspects of the same problem, to promote discussion. It also proved important to peer review the questions in the interview material beforehand to increase quality. The study also showed that it is possible to see trends in Continuous Integration, Continuous Delivery and DevOps maturity when comparing qualitative results for research subjects. Finally, the study showed that the most effective method for increasing Continuous Integration, Continuous Delivery and DevOps maturity is to use extensive automated testing suites that cover all testing disciplines.

Keywords: Continuous Integration, Continuous Delivery, Development Operations, Software Development, Interview Study, Qualitative Study.

Sammanfattning

The purpose of the study is to further develop a benchmarking tool that helps companies assess their technical maturity in adopting Continuous Integration, Continuous Delivery and DevOps in their organization. The goal of the research is to assess how the quality of qualitative models for measuring this can be improved, which conclusions can be drawn from comparing the results of companies that took part in the study, and which actions are the most effective for reaching a higher level of maturity within Continuous Integration, Continuous Delivery and DevOps.

The benchmark consisted of a battery of two hundred statements that the client answered by grading how strongly they agreed with each statement. The results were compiled into a current-situation analysis and an ought-to-be analysis, with the goal of drawing conclusions about the differences between these two categories. The client answered the question battery during an interview study with selected employees.

The conclusions drawn from this study were that much can be done to improve the quality of qualitative models for examining Continuous Integration, Continuous Delivery and DevOps maturity. Different actions are possible, but the most important appears to be to ask open questions to promote discussion and to ask questions about different aspects of the same problem, as well as to review the interview questions internally before conducting the study with a client, in order to increase quality. The study also showed that it is possible to see trends in the participants' Continuous Integration, Continuous Delivery and DevOps maturity when the qualitative results are compared. The study showed that the most effective methods for increasing Continuous Integration, Continuous Delivery and DevOps maturity are to use extensive automated test suites for all testing disciplines.

Keywords: Continuous Integration, Continuous Delivery, Development Operations, Software Development, Interview Study, Qualitative Study.

Acknowledgements

I learned a lot during the thesis work, both academically and about how software development works in the real world. I am grateful for the opportunity of the education that The Royal Institute of Technology has given me and to R2M for choosing me above all other applicants for this master thesis topic. It has been a fun journey to go from signing the first papers to the finished report and presentation.

I would like to thank my tutor at R2M, Björn Tegeberg, for his excellent guidance, comments, advice, encouragement and suggestions during the thesis. I would like to thank my supervisor at KTH, Alexander Kozlov, for his guidance, comments and suggestions during the thesis. Thank you to Anders Wildelv and Hannes Hagman at R2M for aiding me in finding clients for the interviews. Thank you to Carl Frendin and Marcus Weurlander at R2M for aiding me in proofreading the thesis. I also want to extend my thanks to the organization of R2M and my colleagues for their support, encouragement and fun post-work activities, which have helped me stay motivated and feel welcome and happy in the workplace.

Finally, I would also like to thank you, the reader of this thesis, for reading my report! I hope that you will find it as interesting as I did!

Per Hagsten

List of Tables

6.1 Results from Company One’s gap analysis in table form.
6.2 Company One’s level of agreement to the backlog.
6.3 Results from Company Two’s gap analysis of the new workflow in table form.
6.4 Company Two’s level of agreement to the backlog of the new workflow.
6.5 Results from Company Two’s gap analysis in table form.
6.6 Company Two’s level of agreement to the backlog.
6.7 Results from Company Three’s gap analysis in table form.
6.8 Company Three’s level of agreement to the backlog.

7.1 Calculated mean values.
7.2 Mean level of agreement to the backlogs.

List of Figures

4.1 A mind map showing the different parts of CI.
4.2 The CI pipeline.
4.3 The deployment pipeline.
4.4 The traditional view of release candidates.
4.5 Blue-Green Deployment.
4.6 The DevOps cycle.
4.7 What is DevOps?
4.8 The Waterfall model.
4.9 The Scrum cycle.

5.1 Iterative Research Pattern.
5.2 The Interview Workflow.
5.3 Circle diagram used in the gap analysis, based on mock-up answers.

6.1 Company One’s gap analysis.
6.2 Company Two’s gap analysis for the new workflow.
6.3 Company Two’s gap analysis.
6.4 Company Three’s gap analysis.

7.1 All current situations from the four companies.
7.2 All ought-to-be situations from the four companies.
7.3 The mean for all companies.


List of abbreviations

CI ...... Continuous Integration

CD ...... Continuous Delivery

CMMI ...... Capability Maturity Model Integration

DevOps ...... Development Operations

IDE ...... Integrated Development Environment

IT ...... Information Technology

GUI ...... Graphical User Interface

R2M ...... R2Meton AB

RC ...... Release Candidate

RQ ...... Research Question

SOA ...... Service Oriented Architecture

Contents

List of Tables
List of Figures
Abbreviations

1 Introduction
1.1 Thesis Outline

2 Background
2.1 Pre-study results
2.2 Problem Definition
2.3 Purpose
2.4 Research Questions
2.5 Delimitations

3 Related Work
3.1 R2M’s Previous Work
3.2 DORA
3.3 CMMI


4 Relevant Theory
4.1 Continuous Integration
4.1.1 Advantages of Continuous Integration
4.1.2 Disadvantages of Continuous Integration
4.2 Continuous Delivery
4.2.1 Deployment Patterns
4.2.2 Advantages of Continuous Delivery
4.2.3 Disadvantages of Continuous Delivery
4.3 Development Operations
4.3.1 Advantages of Development Operations
4.3.2 Disadvantages of Development Operations
4.4 Software Testing
4.4.1 Unit Testing
4.4.2 Integration Testing
4.4.3 System Testing
4.4.4 Functional Testing
4.4.5 Acceptance Testing
4.4.6 Capacity Testing
4.4.7 Destructive Testing
4.5 Software Development Methodologies
4.5.1 The Waterfall Model
4.5.2 Agile
4.6 Software Architecture
4.6.1 Monolith Architecture
4.6.2 Service Oriented Architecture
4.6.3 Microservice Architecture

5 Method
5.1 Research Methods
5.1.1 Qualitative vs Quantitative Research
5.1.2 Iterative Research Pattern
5.2 Pilot Study
5.3 Thesis
5.3.1 Preparations
5.3.2 The Interview Phase
5.3.3 Evaluation of the Interview Study

6 Results
6.1 The Improved Model
6.2 The Interview Study
6.2.1 Company One
6.2.2 Company Two
6.2.3 Company Three

7 Discussion
7.1 The Improved Model
7.1.1 Configuration Management
7.1.2 Build Systems
7.1.3 Task-Based Development
7.1.4 Tools Integration
7.1.5 CI and CD Tool Support Usage
7.1.6 Deployment
7.1.7 Unit Testing
7.1.8 System- and Integration Testing
7.1.9 Acceptance Testing

7.1.10 Performance Testing and Destructive Testing
7.1.11 Security
7.1.12 Logging and Feedback
7.1.13 Installation and Upgrading
7.1.14 Software Architecture
7.1.15 Company Culture
7.2 The Interview Study
7.3 Post Interview Study Evaluation
7.3.1 Comparisons and Similarities
7.3.2 Interpreting the Feedback Data
7.3.3 Does the Result Change Over Time?
7.3.4 Flaws In the Study Evaluation

8 Conclusions
8.1 Research Questions
8.1.1 RQ1: How can we improve qualitative models for measuring CI/CD and DevOps maturity?
8.1.2 RQ2: To what extent can we compare qualitative results between companies using a qualitative model?
8.1.3 RQ3: Which actions are effective for improving CI/CD and DevOps compliance in an organization?

9 Suggested Future Work

10 Retrospective

Bibliography

A Appendix
A.1 Sample Model
A.2 Sample Task Backlog
A.3 Sample Feedback Material
A.4 Results Company One
A.4.1 Feedback Company One
A.5 Results Company Two
A.5.1 Company Two, Project One
A.5.2 Feedback Company Two, Project One
A.5.3 Company Two, Project Two
A.5.4 Feedback Company Two, Project Two
A.6 Results Company Three
A.6.1 Feedback Company Three

Chapter 1

Introduction

This thesis covers the topics of Continuous Integration, Continuous Delivery and Development Operations. The research topic is to examine and continue research on a maturity model that consists of a questionnaire for determining the technical maturity of organizations. The questionnaire is administered at a company through a qualitatively focused interview study. My contribution to the research field is insight into what can be learned by performing these types of assessments at companies.

Developing software as a company is filled with risks; to be successful, many different factors have to come together to deliver reliable and successful software to the customer. In recent years a set of best practices has emerged, Continuous Integration, Continuous Delivery and Development Operations, which are designed to alleviate the inherent risks of professional software development. These practices can be difficult to implement, and it can be tricky for companies to know where to start the adoption process. Therefore, guidelines and decision-support tools are needed to aid companies in the adoption process.

The consultant firm R2Meton AB¹, henceforth abbreviated as R2M, has developed a maturity model for estimating the technical maturity of their clients in three key areas widely accepted as best practices in the field: Continuous Integration, Continuous Delivery and Development Operations, henceforth abbreviated as CI/CD and DevOps.

¹ https://r2m.se/


The terms CI/CD and DevOps refer to a set of practices that emphasize the collaboration and communication of both software developers and IT professionals, facilitating the automation of the software delivery process and rapid infrastructure changes. See Ch. 4 for an in-depth explanation of these terms.

As previously mentioned, R2M is a technically oriented consultant firm, founded in 1996, that focuses on creating software solutions and software system integrations for its customers. R2M also focuses on creating integration solutions for existing software to improve the efficiency and workflow of its customers.

The maturity model consists of a questionnaire to assist R2M and potential customers in the process of determining technical maturity within CI/CD and DevOps. The goal of the model is for it to be used as a decision-support tool for assisting customers with improving CI/CD and DevOps compliance. The model provides an analysis of the current situation and the ought-to-be situation of a client, so that conclusions can be drawn from the discrepancy between these two situations, providing the client with a backlog of tasks to implement to achieve their goals.

All clients that have partaken in the study have given their approval to participate as research subjects and were given the option to end participation at any time without consequences from R2M or me.

1.1 Thesis Outline

The thesis is divided into the following chapters.

• Chapter 1, Introduction: Introduction to the field of research.

• Chapter 2, Background: Detailed summary of the research problem and its delimitations.

• Chapter 3, Related Work: Detailed outline of the related work for the thesis.

• Chapter 4, Relevant Theory: Outline for the relevant concepts for understanding the thesis.

• Chapter 5, Method: The methods used in the thesis work.

• Chapter 6, Results: The results gathered in the study.

• Chapter 7, Discussion: Discussion about the results presented.

• Chapter 8, Conclusions: Conclusions drawn from the study, answering the research questions presented.

• Chapter 9, Suggested Future Work: Suggestions for future work.

• Chapter 10, Retrospective: Retrospective and wrapping up the thesis.

Chapter 2

Background

2.1 Pre-study results

The pre-study consisted of a literature study that focused on reading the existing literature; the method for the pre-study is discussed in Sec. 5.2. Much of this material was of a quantitative character. During the pre-study, it became clear that very little qualitatively focused research existed within the field of CI/CD and DevOps when it comes to creating models for estimating technical maturity. The organization DORA was discovered to be one of the few companies that focus on qualitatively based studies of CI/CD and DevOps maturity; their work is further discussed in Sec. 3.2.

2.2 Problem Definition

The research problem consists of taking an already proposed theoretical maturity model designed by R2M and evaluating it. The problem does not involve implementing any software. Instead, an evaluation is performed on the current version of the model with the goal of testing its feasibility and accuracy, and of determining how the maturity model can be improved.


2.3 Purpose

The pre-study showed that the use of qualitative measurement methods is not a well-researched topic within the field of CI/CD and DevOps. According to R2M’s previous work, there exist a few strictly quantitative methods for measuring CI/CD and DevOps maturity. While quantitative methods produce hard numerical values of how mature a company is in CI/CD and DevOps, they fail to consider the values, ideas and feelings that inherently exist among humans in the workplace and its culture. In addition to technical aspects, CI/CD and DevOps also emphasize the need to change company culture and workflow; therefore, the use of qualitative measurement methods is a promising research topic for furthering the CI/CD and DevOps research field.

As previously mentioned, the pre-study showed that the use of qualitative models for measuring CI/CD and DevOps maturity within organizations is a less well-researched field of study. Therefore, the scientific goal of this thesis is to examine and improve qualitatively focused models for measuring how mature a company is in its implementation of CI/CD and DevOps. This results in an evaluation of how CI/CD and DevOps work within organizations and an identification of the challenges of CI/CD and DevOps in a practical setting.

The primary goal of this thesis is to gather real-world insights that can answer the research questions and advance the field of research within software development methodologies focusing on CI/CD and DevOps, primarily qualitative ways to measure them. The secondary goal is to examine which actions can be taken by companies to increase their CI/CD and DevOps compliance, and then to examine which of these actions seem to be the most beneficial for increasing CI/CD and DevOps compliance within an organization.

The intended readers of the thesis are students, scientists and industry professionals within the fields of computer science and CI/CD and DevOps research. In addition, students, scientists and professionals within the IT operations field are also intended readers of this thesis.

The research is relevant for R2M and their customers, but also outside of R2M, and might even be used by other companies.

The goal from R2M’s side was for a master thesis student to evaluate their proposed model and estimate its feasibility in real-world scenarios with their customers. R2M wanted to evaluate whether the model is feasible when compared to existing literature and other maturity models for estimating technical maturity. R2M also wanted to evaluate whether the model needs to be readjusted to achieve a more accurate and equitable result.

2.4 Research Questions

• RQ1: How can we improve qualitative models for measuring CI/CD and DevOps maturity?

• RQ2: To what extent can we compare qualitative results between companies using a qualitative model?

• RQ3: Which actions are effective for improving CI/CD and DevOps compliance in an organization?

2.5 Delimitations

It is important to note which topics fall outside the scope of the thesis. Most importantly, no software will be developed and no algorithm research will be done. The thesis consists of research work and interviews.

The research method does not take industrial economic aspects into consideration; in the real world, economic aspects could affect the estimation of a firm’s technical maturity. However, the interview candidates might have economic considerations that affect their answers in the interviews.

The model is meant to be used on a smaller team of professionals, namely a team consisting of ten to twenty employees. Thus, larger teams and entire organizations are omitted from the research.

The study focuses on at most five different customers of R2M, due to the limited time scope of the interview phase of the thesis project. R2M estimated that the results from the interviews would be lacking if more than five clients were interviewed.

R2M’s customers might have external demands, such as laws they must adhere to, that might affect their answers. For example, a customer might have strict guidelines on how they treat personal data.

All the customers and their data are anonymized in the public report per R2M’s wishes. A non-anonymized report will be provided internally at R2M.

I found only one other model for estimating CI/CD and DevOps maturity during the pre-study¹, discussed in Sec. 3.2; this model is copyrighted and not publicly available. Since I cannot use another company’s work without their permission, for legal reasons I will not compare my results with this model. Even if such a model were publicly available, it would still be difficult to achieve a fair comparison, since this is a qualitative study and it is inherently difficult to reproduce the same starting environment so that the results are not affected by outside factors.

¹ https://devops-research.com/assessment.html

Chapter 3

Related Work

As mentioned in Ch. 2, little related work exists, because the research field of CI, CD and DevOps is still quite young and therefore largely unexplored; work that takes a qualitative perspective into consideration is especially scarce. The most important related works are presented in the sections below.

3.1 R2M’s Previous Work

My company supervisor, Björn Tegeberg, developed the model for estimating technical maturity, was the driving force in coming up with the project, and laid the groundwork for the thesis. He has a background of fifteen years of working experience within the software industry and a Master of Science degree in computer science.

The current version of the model was created using existing literature and concepts in the research area of CI/CD and DevOps, complemented with my tutor’s experience in the field. The model has been designed to be agnostic when it comes to technology, development environments, programming languages, organization type and organization size.


The model consists of a questionnaire with about one hundred statements used as indicators of an organization’s CI/CD and DevOps maturity. The questionnaire was developed from relevant literature, customer contacts and the supervisor’s working experience within the software industry. These statements cover ten practices, presented in the numbered list below; each of the ten practices makes up a questionnaire category in the model.

1. Configuration Management.

2. Build systems.

3. Task-based development.

4. Tools integration.

5. CI tool support usage.

6. Unit testing.

7. System- and integration testing.

8. Performance testing.

9. Installation and upgrade.

10. Deployment.

The questionnaire is then administered to the client in the form of an interview focused on qualitative aspects. During this interview, the client answers the statements from two perspectives, a current situation and an ought-to-be situation, grading their agreement on a scale from one to four, with an additional “I don’t know” option. A comparison analysis between the current and ought-to-be situations is then made. Finally, a backlog of tasks for reaching the ought-to-be situation is created based on the interview results.

The model had been tested in the field before this thesis was conducted. The benchmark performed at the pre-study company has been the major stepping-off point for this thesis. The benchmark consists of the following:

• Results from an implementation of the benchmark at the pre-study client.

• Self-assessment questionnaire results.

• A backlog with thirty actions recommended to the pre-study client for increasing their CI/CD maturity.

The goal of this benchmarking model is to be used as a decision-support tool for assisting the client and R2M’s consultants in improving the client’s CI/CD and DevOps compliance.
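To make the gap analysis concrete, the sketch below shows one way the per-category scores could be computed from the one-to-four agreement answers. This is my own minimal illustration, not R2M's implementation; the category names, example gradings and function names are hypothetical.

```python
from statistics import mean

# Hypothetical example answers: per category, (current, ought-to-be) gradings
# on the model's one-to-four agreement scale; None marks an "I don't know".
answers = {
    "Configuration Management": [(2, 4), (3, 4), (None, 4)],
    "Unit testing": [(1, 3), (2, 4), (2, 3)],
}

def gap_analysis(answers):
    """Mean current score, ought-to-be score and gap for each category."""
    report = {}
    for category, pairs in answers.items():
        # Only statements graded from both perspectives contribute.
        graded = [(c, o) for c, o in pairs if c is not None and o is not None]
        current = mean(c for c, _ in graded)
        ought = mean(o for _, o in graded)
        report[category] = (current, ought, ought - current)
    return report

for category, (current, ought, gap) in gap_analysis(answers).items():
    print(f"{category}: current {current:.2f}, ought-to-be {ought:.2f}, gap {gap:.2f}")
```

A large gap in a category would then suggest backlog tasks for that practice, mirroring how the model turns the discrepancy between the two situations into recommendations.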

3.2 DORA

The organization DORA, DevOps Research & Assessment¹, is one of the leading experts when it comes to DevOps research. They regularly provide technical papers, peer-reviewed papers and conference talks about the current state of DevOps research. Their work has been very helpful to me during this thesis.

They have also developed their own proprietary model for determining DevOps maturity², which is copyrighted and not publicly available, as mentioned in Sec. 2.5. Like R2M’s model, DORA’s model focuses on qualitative aspects. Their model and their findings about qualitative models are not made available to the public, since they sell their assessment as a tool that helps other companies assess their customers’ CI/CD and DevOps maturity.

3.3 CMMI

The Capability Maturity Model Integration (CMMI) is one of the most popular traditional models used by companies to measure their technical maturity[1]. CMMI comes in three different flavors, of which CMMI for Development (CMMI-DEV) is of interest for this thesis³.

¹ https://devops-research.com/
² https://devops-research.com/assessment.html
³ https://en.wikipedia.org/wiki/Capability_Maturity_Model_Integration

CMMI was initially developed for software engineering companies and has been refined over the years to encompass other types of organizations. CMMI consists of a set of best practices on how to structure and run an organization. It provides some guidance for improvements, and CMMI may also be used as a benchmark to test an organization’s maturity. The maturity levels are divided into five distinct categories that build upon each other, ranging from least desired to most desired: Initial, Managed, Defined, Quantitatively Managed and Optimizing. CMMI-DEV also allows an approach that instead uses capability levels in the same way: Incomplete, Performed, Managed and Defined. These two approaches are meant to be used to measure the maturity of several processes in an organization, or the capability of a single process[1]. This inherently means that to achieve a high score in CMMI you must comply with the model’s view of what a mature organization looks like.

Since CMMI describes a set of best practices, it is up to each organization to judge which practices are suitable for it. CMMI also strives to be as generic as possible when it comes to which practices to adopt. Therefore, I think its practices might be difficult to adopt in an agile software team. My company supervisor did examine the CMMI model before creating the initial maturity model mentioned in Sec. 3.1. However, he deemed that CMMI focused too much upon quantitative aspects and did not provide enough guidance or leave room for qualitative aspects. R2M also wished to examine an interview study approach, which CMMI does not entail. Therefore, I will not examine CMMI more closely in this study, since its methods were deemed to fall outside the scope of the research questions and this study emphasizes qualitative models.

Chapter 4

Relevant Theory

4.1 Continuous Integration

The practice of Continuous Integration (CI) became widely adopted within the software industry with the release of the book Continuous Integration: Improving Software Quality and Reducing Risk[2] in 2007. Before then, the concept had only shown up briefly in other articles and publications. CI has since become a popular way of managing software development work.

CI is a software development practice that emphasizes that developers frequently integrate their code into a shared repository, preferably several times a day[2][3], thus limiting the scope of each change since the batch size decreases. Every check-in of code is then evaluated and verified using automated tests to detect regressions. The purpose of this practice is to detect and catch errors earlier in development and to provide fast feedback to developers regarding the effects of their checked-in code; integration errors can be costly to resolve if caught later. A prerequisite for CI is an extensive automated testing environment that facilitates the automated pipeline[2][3]. The general idea is that every build must pass every test stage in the pipeline, and if an error is discovered, fixing that error takes priority. The concept of CI can be broken down into several individual parts[2], as seen in Fig. 4.1.


[Figure 4.1 shows Continuous Integration at the center of a mind map, linked to: Compile Source Code, Run Tests, Run Inspections, Deploy Software, Fast Feedback and Integrate Database.]

Figure 4.1: A mind map showing the different parts of CI.

At the heart of CI, a CI server is needed to provide developers with the tools required for integration. The CI server’s job is to periodically check the version control system for changes in the project, build these changes and test the build for defects. The server should also provide tools for giving feedback to developers, for example a digital dashboard showing the status of the current build[3].

To automate builds, decoupled build scripts are used to handle building the source code. These scripts need to be decoupled from the IDE so that they can be centralized in the version control system, enabling the CI server to build projects automatically by fetching the scripts from version control. For the same reason, all software assets should be centralized in the version control system. Since building the software is handed off to the CI server instead of the developers, a swift feedback mechanism is needed, preferably responding in less than ten minutes. This goal is sometimes unfeasible in larger software projects, where software complexity and thus build times increase; in that case builds should be staged into smaller parts, so that builds complete faster and provide fast feedback to the developers. Private developer builds should be used to test changes locally before pushing to the mainline, to avoid unnecessary mistakes. The act of pushing changes into the trunk this way is often called an integration build[2].

Most applications today rely on one or more databases to operate; thus, databases should be integrated into the CI system to achieve full CI compliance. Database assets should be placed in version control, as well as accompanying test scripts. Developers should use a local sandbox of the database to test their changes. Developers should also be able to access and modify the server’s database and push assets into version control. The point of these procedures is to make sure that databases are automated and integrated into the CI system[2].

A key factor of CI is the use of automated test suites; an in-depth explanation of different software testing methodologies is given in Sec. 4.4. Developers should implement software tests to achieve the greatest possible test coverage. Tests should be broken down into different slots, sorted by execution time, so that fast tests are run first before committing to more time-consuming tests. The goal is to create an extensive testing environment so that the highest possible test coverage can be reached and the test suite is fully automated, so that developers can push changes to the version control system and get fast feedback on whether the project passes the automated testing suite[2].

Code inspection is an important but sometimes overlooked part of the CI pipeline. The goal of code inspection is to automatically catch possible risks in the code before they can cause harm. Automatic code inspection can help to reduce code complexity and code duplication and to enforce loosely coupled code. An organization can also use code inspection to enforce code standards or to check that the code is sufficiently documented. Lastly, code inspection can check the test coverage of changes, so that untested code is not pushed to the trunk[2]. If a source code change falls outside the chosen code inspection standards, the change is automatically rejected by the CI server.

Finally, feedback is provided to the development team via a feedback mechanism; this mechanism can take the form of email, digital or analog status boards, or even a beacon light of appropriate color. The point of this mechanism is to provide the entire team with the status of the project’s integration build in a timely and simple manner. If the integration build is rejected by the CI server, the team’s top priority becomes fixing the build as soon as possible[2].

If the build is accepted by the CI server, the build is ready for the deployment phase. This phase can be broken down into several smaller steps. First, a clean build environment should be created to minimize possible side effects from the build environment. The CI server should then label or tag the build, so developers can easily check which versions, artifacts, etc. have been used for the build. Lastly, a feedback report should be generated by the CI server for the team, containing release notes for the build, bugs fixed, features added and other relevant information. Note that if the labeling of builds is sufficiently finely grained, developers also possess the capability to perform a stable rollback, since they know which artifacts and source code versions were used for that particular build[2].

[Figure 4.2 shows the CI pipeline with the stages: Compile Source Code, Run Tests, Run Inspection Tests, Database Integration, Produce Feedback and Deploy Software.]

Figure 4.2: The CI pipeline.

These concepts come together to make up Continuous Integration[4].
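As a summary of the workflow in Fig. 4.2, the following toy sketch shows the skeleton of a polling CI server: check version control for changes, then compile, test, inspect, notify, label and deploy. It is a minimal illustration of the general idea, not any particular CI server's implementation; the Pipeline facade and all of its methods are hypothetical stand-ins for real version control, build, test and notification tooling.

```python
class Pipeline:
    """Hypothetical facade over version control, build, test and feedback tools."""

    def __init__(self):
        self.revision = "r1"

    def latest_revision(self):
        return self.revision

    # Each stage returns True on success; real tools would be invoked here.
    def checkout(self, revision):  return True
    def compile_sources(self):     return True  # decoupled build scripts
    def run_tests(self):           return True  # fast test slots run first
    def run_inspections(self):     return True  # code standards, coverage checks

    def notify_team(self, revision, ok):
        # Feedback mechanism: email, dashboard, beacon light...
        print(f"build {revision}: {'accepted' if ok else 'rejected'}")

    def label_build(self, revision):
        print(f"tagged {revision}")  # finely grained labels enable stable rollbacks

    def deploy(self):
        print("deployed")

def ci_poll_once(pipeline, last_revision):
    """One iteration of the CI loop; a real server would run this periodically."""
    revision = pipeline.latest_revision()
    if revision != last_revision:
        # Fail fast: any broken stage rejects the integration build.
        ok = (pipeline.checkout(revision)
              and pipeline.compile_sources()
              and pipeline.run_tests()
              and pipeline.run_inspections())
        pipeline.notify_team(revision, ok)
        if ok:
            pipeline.label_build(revision)
            pipeline.deploy()
    return revision

ci_poll_once(Pipeline(), last_revision=None)
```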

4.1.1 Advantages of Continuous Integration

CI comes with many benefits for software projects, mainly the reduction of risk and of manual intervention in the software development process. Since check-ins are performed frequently, less backtracking is required to fix errors in the software; thus both time and money can be saved, since less time is wasted on debugging. Problems with the build are discovered early in the project thanks to the frequent integrations into the repository. Research has shown that these frequent integrations improve code quality, which in turn leads to greater confidence in the product from the development team and greater project visibility within the organization[2][4][5][6].

Perhaps the most obvious advantage of CI is that a viable release candidate, henceforth abbreviated as RC, can be created at the push of a button. An RC can then be sent to the production environment without much inconvenience to the team[5].

Another advantage, from a business point of view, is improved project transparency for non-technical personnel: information about the status of the project can be accessed by staff outside the development team via the feedback mechanisms provided to the developers[7].

4.1.2 Disadvantages of Continuous Integration

CI is of course not a universal solution that fixes every issue in a software development project. One of the main critiques against CI is the increased initial overhead due to structural changes and maintaining the CI infrastructure. However, these issues can be mitigated by adopting CI in an iterative manner rather than going all in and adopting every principle straight away[2]. This initial overhead will also most likely pay off in the long run.

Another critique against CI is the expectation that if an integration build breaks, the highest priority is to fix the issue rather than to continue development. This is often not feasible in the real world, where developers have external factors to adhere to, for example project deadlines. What often ends up happening is that the project continues even though the build is broken, and the build is fixed later when more time for issue correction is available[8][9][10][5][6].

CI also places great trust in automated tools, which might raise concerns among professionals due to an over-reliance on the automation process[2].

Moreover, CI requires frequent commits and an integration branch that everyone in the team integrates with. This approach might not be suitable for every project; some need separate feature branches. It also forces programmers to work in smaller batches and to integrate their changes into the integration build more frequently, which might cause issues[8][2][10][5][6].

Another critique against CI is the increased economic overhead that comes with it. CI requires additional hardware to run the CI server, and CI tools are often not free to use. Thus, both additional hardware and software must be purchased for the project, in addition to the cost of setting up and maintaining the CI systems as well as training staff[5][2]. A counterargument is that once these systems are in place, they can save money in the long term[2]; these negative short-term aspects must be weighed against the long-term benefits. Also, once a CI methodology is in place, it may be used for several future projects.

4.2 Continuous Delivery

Continuous Delivery (CD) is an extension of CI with the goal of making every change checked into version control a potential RC, so that software deployments can be entirely automated from check-in to deployment in the production environment. CD has been on the rise after the widespread adoption of CI in the software development industry, and today CD is used by some of the biggest IT companies in the world[7]. The concept of CD took off with the publication of the book Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation in 2010[11]. Since the technique promotes shorter cycles, it supposedly reduces the costs, time investment and risks of delivering changes to production. CD often increases the team’s confidence in the project, since the risk of human error diminishes with the use of extensive automation[11].

Continuous Delivery relates only to the delivery aspect of software development. Continuous Delivery is also sometimes confused with Continuous Deployment: Continuous Deployment means that every change in the software is pushed to production, while in Continuous Delivery the choice is left to the team. We can therefore view Continuous Deployment as a subset of Continuous Delivery. Continuous Delivery focuses on automation and repeatable processes for different aspects of software development. These concepts of extending the CI process culminate in the concept of the deployment pipeline[11], seen in Fig. 4.3.

[Figure 4.3 shows the stages: Commit Stage, Acceptance Testing, Capacity Testing, (Manual Testing) and Release of Latest Version.]

Figure 4.3: The deployment pipeline.

The pipeline aims to make every check-in a potential RC. This differs from the traditional release process, in which the software moves through stages of completion[11]. The traditional process of software deployment can be seen in Fig. 4.4; in this flow the software moves through different stages of completeness, from an initial Pre-Alpha build up to the releasable build, commonly called a Gold build.

[Figure 4.4 shows the stages: Pre-Alpha, Alpha, Beta, Release Candidate and Gold.]

Figure 4.4: The traditional view of release candidates.

The idea behind the deployment pipeline is to deploy and test an RC in a copy of the production environment. Testing of the RC should be performed in a realistic test environment to make sure that the RC behaves the same with the production environment’s infrastructure, configuration, application stack, etc. Every change to the source code then propagates through the pipeline, and if any step fails, the pipeline is stopped and the RC is rejected, notifying the team[11][12].

The commit stage usually consists of these steps: compile the code, run commit tests, prepare binaries, prepare artifacts and analyze the code. The source code is compiled if need be. Commit tests are usually made up of unit tests and integration or system tests. Build metrics are then gathered, for example information about test coverage. Artifacts¹ and binaries² required for the later stages of the pipeline are prepared beforehand[11][12].

The objective of the acceptance test stage is to use automated testing so the customer can verify that the project functions per their requirements. Both the developers and the customer can therefore be confident that the project is fit for purpose. This step also eliminates the need for manual testing at this stage, reducing the cycle time for delivering a new RC[11][12].

The role of automated capacity testing is to test the application against real-world scenarios for performance and robustness: for example, how well the application scales, detecting memory leaks, simulating worst-day scenarios, third-party application integration, etc. These tests often take a lot of time; therefore they should be run in parallel to achieve feasible performance[11][12].

After this, manual testing is performed for non-automatable tests, for example certain GUI tests for assuring the look and feel of the application[11][12].

Finally, after all the steps have been completed successfully, the RC is delivered into production[11][12].

4.2.1 Deployment Patterns

CD enables the use of more advanced release patterns that make software deployment safer and more reliable. The most prominent benefit of these release patterns is so-called zero-downtime releases, for example Blue-Green Deployment or Canary Releasing[11][3].

¹ https://en.wikipedia.org/wiki/Artifact_software_development
² https://en.wikipedia.org/wiki/Binary_file

Blue-Green Deployment builds upon having two versions of the production environment live at any time, of which one is in use by the customers. When a new version of the software is released, it is pushed to either the blue or the green environment. Devices such as routers then simply update their routing tables and point customers to the environment hosting the new version, while the old version is kept in the other environment as a backup. If an error is detected in the newly deployed software, a rollback can be swiftly performed, since it only requires returning users to the old environment while the new version is fixed[11][3]. See Fig. 4.5 for a clarification of this process.

[Figure 4.5 shows customers connecting through a router/load balancer that directs them to either the green or the blue production environment.]

Figure 4.5: Blue-Green Deployment.

The second technique is Canary Releasing, which builds upon the same concept as Blue-Green Deployment, but instead of pointing all users to the new environment when a new version is deployed, only a small subset of users is routed to the new version. These users act as test pilots for the new version; sometimes the pilot users are chosen at random, other times by a special criterion. If problems are detected in the new software, only a small subset of users is affected, and the load balancer can simply re-route the canary users back to the stable environment while the software is patched. When the build is fixed, canary users are pointed back to the new version. When the team is confident that the new version is stable, all users are routed from the old version to the new one[11][3].

4.2.2 Advantages of Continuous Delivery

Software deployment is often a time-consuming process if performed manually, since staff are required to oversee and intervene during the deployment. The tasks performed during a deployment seldom differ, which leads to a lot of repeated manual work; therefore, removing manual tasks from the build process by using CD offers great value[11][3]. Another advantage of CD is that it minimizes the risk of human error in the release process, since the technique focuses on automating processes[13][11].

CD also reaps nonfunctional benefits in the form of shorter deployment cycles, which enable a shorter time to market, increased feedback to developers, as well as increased reliability, robustness and quality assurance for customers[13].

CD also comes with the advantage of being easily mutable if a change in the production environment becomes necessary, for example if another software library or operating system is required for a new version of the build[3].

4.2.3 Disadvantages of Continuous Delivery

Many of the disadvantages of CD are inherited from CI, since CD is an extension of CI. For example, CD needs additional hardware and network capacity for running tests and measurements, which can be expensive to purchase and maintain. Indirect costs also increase for training and/or employing staff to maintain these systems[11][9]. The use of automated testing suites can be problematic for teams to implement and maintain, and there can be resistance among employees to an over-reliance on build automation due to distrust of the automated processes[11][9].

It can also be difficult to ascertain and verify whether the automated test suite is comprehensive enough and achieves the desired coverage and correctness when the entire process is automated[3][9].

CD also promotes a very rigid deployment pipeline, which can prove problematic when the pipeline needs to be bypassed, for example when a security hotfix must be pushed to production as fast as possible because critical errors slipped past the testing suite[11][3].

4.3 Development Operations

Development Operations (DevOps) is a term that refers to practices focusing on communication and cooperation between developers and operations during the entire lifecycle of a software project. Traditionally, there is little cross-department cooperation between software developers and operations personnel[3][7]. DevOps promotes methods for increased collaboration between departments, thus increasing efficiency. Note that there exists no DevOps software; DevOps is a workflow adopted by software teams. This makes DevOps an organizational shift, which differs from CI/CD. Additionally, DevOps is used in conjunction with CI and CD practices; the goal is to increase deployment efficiency and frequency for software projects[3]. DevOps also greatly emphasizes learning from mistakes, agile development and the deconstruction of working in silos. A visualization of the DevOps workflow[3] can be seen in Fig. 4.6.

[Figure 4.6 shows the DevOps cycle: Plan, Code, Build, Test, Package, Deploy, Operate, Monitor, and back to Plan.]

Figure 4.6: The DevOps cycle.

Explanation of Fig. 4.6:

• Code - Development of the software.

• Build - The use of CI tools.

• Test - The use of automated testing.

• Package - Creating deployment artifacts, laying the groundwork for staging.

• Deploy - The use of CD tools.

• Operate - Infrastructure operation.

• Monitor - Monitoring and gathering data.

• Plan - Planning for the future.

Ops is an umbrella term for system administrators, system engineers, operations staff, release engineers, database administrators, network engineers, security professionals and various other job titles. Traditionally, the responsibility for creating and maintaining infrastructure has been postponed until after a release has been made by the developers, in the sense that developers hand off the solution to the operations team. The goal of adopting DevOps is for these departments to work together during the entire lifecycle of the project to reduce the risks of software development and deployment[3][14]. DevOps can be quite a fuzzy topic to explain due to its buzzword status in the industry; DevOps means different things to different people. However, DevOps promotes cooperation between departments as shown in Fig. 4.7.

[Figure 4.7 places DevOps at the intersection of Software Development, Quality Assurance and Operations.]

Figure 4.7: What is DevOps?

As previously mentioned, the goal of DevOps is to facilitate the entire life cycle of the development process by increasing cooperation between departments. This causes a cultural shift within organizations, where departments communicate frequently and the status of the project is transparent within the entire organization through issue tracking systems, Kanban boards, et cetera. Developers and operations staff can be put on the same team from the beginning, which alleviates issues[15].

4.3.1 Advantages of Development Operations

Studies have shown that the adoption of DevOps has many benefits in real-world scenarios, both from a project perspective and from a management perspective. Projects benefit from more frequent releases, improved quality of deployments, reduced cycle time for resolving unplanned work, improved detection of problems, improved reliability, et cetera[3][16][17][18][19][20].

For an organization, the adoption of DevOps brings improvements stemming from the increased cooperation between development and operations teams: decreased communication overhead, increased customer satisfaction, the breaking down of silos, increased efficiency, increased responsiveness to change, et cetera[3][16].

4.3.2 Disadvantages of Development Operations

Since DevOps is an organizational and cultural shift, it comes with caveats from an organizational perspective[3][16]. Firstly, the removal of silos can backfire, since the lack of the “old” structure may confuse employees. The culture and mindset change with the adoption of DevOps, and employees are given new titles and responsibilities. This can cause resistance and backlash from employees, as the fear of change is often prevalent within people and organizations[21][3][16].

DevOps places great emphasis on transparency among staff. This requires extensive tooling support to accommodate the need for issue trackers, Kanban boards and other feedback tools[7][3].

DevOps also places emphasis on the reorganization of software architecture, which can cause great difficulties[22]. These reorganizations often come with hidden costs and complexities, since migrating from a monolithic application structure to a more loosely coupled one, for example a microservice architecture, can prove difficult in practice[22].

The need for training is often a major caveat raised against DevOps, since DevOps practices are still quite young and not widely adopted in the industry. Many researchers recommend that DevOps be adopted incrementally to alleviate the growing pains that come with its adoption[17][18][19][20][23].

Another risk of DevOps adoption is that IT security can become less of a priority[21][3]. Due to the rapidly changed workflow, previous security routines can be left out as they become outdated. New features are pushed into production more frequently, which leaves less time for proper security audits. DevOps also requires that personnel have greater access to development environments, which can pose a security risk as security specialists lose control over environments.

Another risk of DevOps adoption is that the cultural shift can create different issues for each organization; what worked when transitioning to DevOps for one organization might not work at another. This cultural shift may also change current leadership structures in an organization, which might cause further issues[21].

4.4 Software Testing

Different testing techniques are mentioned in this thesis; these techniques are briefly described in the paragraphs below.

4.4.1 Unit Testing

The goal of Unit Testing is to test individual functions or methods of a program. Unit tests are therefore fast, since they test the smallest testable parts of a program. An example unit test is to verify that a method outputs the correct result. Every software change should come with accompanying unit tests to achieve the greatest test coverage[3].
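As a self-contained illustration of the idea, the following example uses Python's built-in unittest module to verify that a method outputs the correct result. The apply_discount function is a hypothetical method under test, chosen only for the example.

```python
import unittest

def apply_discount(price, percent):
    """The method under test: reduce a price by a discount percentage."""
    return round(price * (1 - percent / 100), 2)

class ApplyDiscountTest(unittest.TestCase):
    def test_outputs_correct_result(self):
        # The smallest testable part is verified in isolation.
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_zero_discount_leaves_price_unchanged(self):
        self.assertEqual(apply_discount(99.99, 0), 99.99)

if __name__ == "__main__":
    unittest.main()
```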

4.4.2 Integration Testing

Integration Testing tests individual units of the software, for example several methods, that are combined and tested as a group. The goal of this test type is to see whether the integration between units works as intended by the developers[3].

4.4.3 System Testing

In System Testing, the entire software suite is tested to make sure that the system meets the specified integration requirements. An analogy would be that a system test checks the entire computer for functionality, while unit tests check individual components, for example the CPU. The technique of black-box testing is most often used in system tests to test for functionality[3].

4.4.4 Functional Testing

Functional Testing tests software against its functional requirements. Note that this test type does not put emphasis on how the software works; it focuses purely on the result produced by the software. Often, real-world test cases are used to test the software[3].

4.4.5 Acceptance Testing

Acceptance Tests are used to determine whether the business requirements are met by the software, or in other words, whether the software meets the needs specified by the customer. Acceptance testing does not care about the implementation of the software, only whether the software is fit for its specified purpose[3].

4.4.6 Capacity Testing

Capacity Testing tests system performance, stability and scaling under load. One example of capacity testing is the classic stress test, i.e. testing the system under very high load: how does the system perform in an absolute worst-case scenario? Another example is an endurance test, where the goal is to determine the software’s behavior in a long-term scenario, with the purpose of catching unwanted behavior, for example memory leaks[3].

4.4.7 Destructive Testing

Destructive Testing is designed to test the robustness of the system under real-world conditions. In the real world, one or more components can and most likely will fail over time; thus, the system should be able to handle these failures without failing catastrophically as a whole. One approach could be to simulate a server outage for a critical service to see if the system still functions as intended[24][25], or to deliberately crash a database to see if the failure cascades to the entire system.
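The following minimal sketch illustrates the simulated-outage approach in miniature; the function names are hypothetical, and a real destructive test would target deployed infrastructure rather than in-process functions.

```python
def fetch_recommendations(recommendation_service):
    """System under test: it should degrade gracefully if a dependency fails."""
    try:
        return recommendation_service()
    except ConnectionError:
        return []  # fall back instead of letting the failure cascade

def broken_service():
    # Simulated outage of a dependency.
    raise ConnectionError("service unavailable")

# The destructive test: the simulated outage must not crash the whole system.
assert fetch_recommendations(broken_service) == []
print("system degraded gracefully during the simulated outage")
```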

4.5 Software Development Methodologies

There exist different approaches to software development. Some companies adopt more traditional methods, others adopt agile practices. Today most companies are moving away from the traditional development models in favor of more agile methods, due to the rise of lean practices over the last couple of years[7]. Agile methods have also been shown to increase software quality and shorten delivery times[3].

Software development can be seen as a sequence of tasks performed during the development process. Different teams have different prerequisites and conditions, which makes the choice of model up to each team; there exists no silver-bullet method that is the most effective for everyone. I will briefly touch upon the two most common methods in the paragraphs below.

4.5.1 The Waterfall Model

One of the oldest software development methods is the classical Waterfall development model, which takes a top-down approach in which distinct steps are taken one after the other. After each stage, the customer is updated on the progress of the project via text reports and reviews of the progress so far. If the review passes, there is an agreement between the customer and the development organization that the project is progressing correctly[7][3].

An analysis phase is conducted in which the customer's requirements are formalized into a specification document, which lays the foundation for the entire software project. Then software architects design the system in theory, specifying which functions, classes, methods, et cetera should be developed to meet the requirements. After this, the implementation phase begins and the entire software suite is developed by the programmers. Then the testing phase starts, in which the entire system is tested for bugs and defects by testers. Finally, the system is deployed and delivered to the customer, with an optional continued maintenance phase in which support and maintenance services are provided to the customer[7][3]. A visualization of this paradigm is presented in Fig. 4.8.

[Figure: Customer's Requirements → Design → Implementation → Testing → Maintenance]

Figure 4.8: The Waterfall model.

As one can imagine, the great advantage of the waterfall model is that it is very simple to follow, for both programmers and customers. It is also simple to understand, since there are distinct phases for each step in the waterfall. Another advantage is that there is less "feature creep3", in which new features are added by the customer as the project goes on, because all requirements are decided upon at the beginning of the project. The waterfall model is also quite simple to get started with and set up, which helps keep costs down[7][3]. Of course, there are disadvantages to the waterfall method as well. The most obvious one is that it is a very inflexible method, due to the bulkheading of each distinct step. Therefore, it is only useful in projects with very strict and well-defined requirements that are agreed upon early in the project, and of limited use in projects subjected to dynamic changes by outside forces. Another disadvantage of the waterfall model is that the testing phase is done at the end of the development process.

3https://en.wikipedia.org/wiki/Feature_creep

Therefore, software errors are caught at a much later stage of the project. This can lead to a pile-up effect of bugs, since bugs often give way to more bugs, which in turn makes debugging difficult and time consuming as code complexity and scope grow[7][3].

4.5.2 Agile

Agile methods strive to minimize the dangers of software development by employing greater flexibility than rigid methodologies. Agile methods lend themselves well to shorter development cycles in which the customer's requirements can change rapidly. A backlog of items is created, containing the features requested by the customer. For each cycle, a set of items is selected from the backlog to be implemented during that development cycle; a cycle is thus a period in which the software is developed. These items are developed, tested and then delivered at the end of each cycle. A cycle lasts for one to two weeks, which gives great flexibility for changes during the project. After each cycle, an assessment of the progress of the project is made and a new set of backlog items is picked for the next development cycle[7][3].

One of the most popular agile methods today is Scrum. It is an iterative method that places great focus on the quality of the development process. Each development cycle is called a sprint and lasts for one to four weeks. For each sprint, three roles are assigned to one or multiple staff members of the team: "Scrum Master", "Product Owner" and "Team"[26].

The Scrum Master holds the management role in the project; he or she is tasked with aiding and providing support to the team members, and with removing obstacles caused by outside factors. He or she can be seen as a traditional project manager[7][3].

The Product Owner is responsible for the technical development of the project and for maintaining and prioritizing the backlog of development tasks. He or she also assigns tasks from the backlog to the team to be implemented in each sprint. The Product Owner also has the final say when it comes to the technical design of the project and the approval and verification of new or completed backlog items[7][3].

The Team is the development team; they are responsible as a group for implementing the backlog items. The team is most often self-organized and self-governed by the team members. Teams often consist of only a handful of people, to keep sizes down and promote flexibility. Different teams in the organization may integrate with one another if special competence is sought that is not present in the current team[7][3].

An important part of Scrum is the use of daily stand-up meetings. During these meetings, each team member must answer the following three questions: "What did I do yesterday?", "What am I going to do today?" and "Which obstacles have I encountered?". The goal of these meetings is to spread awareness of the status of the current sprint. Obstacles can also be detected early during stand-up meetings, which increases the sprint's chances of success, since they are caught as early as possible[7][3].

At the end of each sprint, the sprint is reviewed and a retrospective is done by the team, so that the quality of the next sprint can be improved if need be. The expected productivity and the probability of completion for the next sprint are also estimated, to maximize the next sprint's chances of success[7][3]. The Scrum process can be seen in Fig. 4.9.

[Figure: Start → Planning the Sprint → Development → Sprint Review → Sprint Retrospective → Delivery (cycle)]

Figure 4.9: The Scrum cycle.

Note that there exists no Scrum rule book that rigidly defines the method. Every organization can adopt different parts of the Scrum methodology and change the parts that do not suit them. As previously mentioned, the goal of the entire Scrum workflow is to be adaptable to a dynamic market and to promote flexibility.

4.6 Software Architecture

There exist many types of software architecture, i.e. ways in which an application is structured in large projects. In the following sections I will briefly touch upon the most relevant application structures mentioned in the thesis. Note that there exist many other ways to structure software, and that no architecture is perfect in practice; projects often transition to and from different architectures over their lifetime as they grow and change. Different projects also have different preconditions, so some software structures might be out of reach for a project due to real world constraints.

4.6.1 Monolith Architecture

A monolithic structure is a software architecture in which every component of the software is contained within one single unit, which in turn implies that the unit runs on a single platform. In other words, all functionality of the application is contained within one single software binary. The advantage of this is that monolithic structures are simple to work with and to overview, since everything is contained in one single unit. This makes it an efficient structure for small scale projects and businesses, since monoliths are easy and cheap to get started with[3]. It is also an efficient structure for rapidly prototyping software, due to the low technical overhead inherent to monoliths.

The downside of monolithic applications is that when the application grows in scope, so does the technical debt4 of the entire project. Monolithic architecture thus scales poorly and has inherently poor support for modularity. Another downside is that the entire code base of the project must be recompiled for every change to the software, which leads to long build times for larger projects. Since the application is one software unit, the entire project must also be re-deployed at the same time.

4.6.2 Service Oriented Architecture

Service oriented architecture, SOA, is a common architecture in today's software development industry. Here, the functionality of an application is split into distinct services, for example a back-end layer and a front-end layer which communicate with each other. Another example could be a service which handles database look-ups. SOA paves the way for greater scaling and more loosely coupled code compared to monoliths, since services can be replicated to increase scaling. Another advantage is that SOA enables distinct services of the system to be deployed quickly, instead of the entire system, which in turn enables greater flexibility and shorter lead times for changes in the project. The downside of SOA is that it can be hard to create a truly modular application stack that preserves scalability as new features

4https://en.wikipedia.org/wiki/Technical_debt

are added to the project. This in turn leads to the risk of increased coupling over time. It can also be hard to fine-tune SOA applications for robustness and performance, since they still retain some parts of their monolithic ancestry; services often aren't finely grained enough in practice[3]. Another difficulty is decomposing services into correctly sized parts, as services might vary widely in scope depending on their purpose.

4.6.3 Microservices Architecture

Microservices are the evolution of SOA, in which an application is decomposed into micro sized services that together make up a swarm of services encompassing the application. A service should only be responsible for a single thing in the application. The swarm of services then communicate with each other, and together they make up the application that the user interacts with. This means that each service is very small and self-contained, which lends itself well to scaling, since operations can redeploy the affected services several times if the application needs to be scaled up. Also, each service can be independently tested and deployed without affecting other services, if done correctly[22]. Microservices can also be a very powerful technique that lends itself well to DevOps methodologies, due to the independent nature of services: changes to a service shouldn't create more work for other teams, and changes shouldn't affect other teams' services. However, implementing and transitioning to microservices can be complicated due to the major paradigm shift[27].

Microservices can be one of the hardest structures to implement in practice and are only recommended for large teams and large applications, and only if adopted incrementally over time. It often takes years to transition to a microservice architecture, even for the biggest companies[27], due to the extensive requirements on tooling, infrastructure, education of staff, et cetera. Microservices come with the issue of having to create an infrastructure environment that can handle these services: service discovery, communication, status, et cetera. Also, a communication standard must be set over which the services communicate; this can often be a very complicated

process to implement, and it often takes several iterations before succeeding in implementing an efficient communication strategy. Another issue is the heavy reliance on network communication, which introduces latency and communication overhead compared to SOA and monoliths[22].

Chapter 5

Method

5.1 Research Methods

5.1.1 Qualitative vs Quantitative Research

Two types of research methods were used for this thesis: qualitative and quantitative research. The difference between them is that qualitative research focuses upon open ended statements, interviews, et cetera, asking Who? or Why?, whereas quantitative research tries to measure and quantify data, asking What?, Where?, When? or Who?1.

A typical qualitative research question would be: Why do people prefer product A over product B?

A typical quantitative research question would be: How many people prefer product A over product B?

1https://en.wikipedia.org/wiki/Qualitative_research


5.1.2 Iterative Research Pattern

Iterative research is a method that takes the form of a pattern of four distinct steps[28], as seen in Fig. 5.1. The first step is to observe the research subject in question, for example an application. The second is to identify the real problem for the subject. Then a solution is developed, and lastly the solution is tested. After this, either a new iteration begins or we step out of the loop and deliver the result. If this pattern is adopted, it is meant to be used for several iterations before stepping out of the loop and declaring the project finished; hence the name iterative research pattern.

[Figure: Observe → Identify → Develop → Test (cycle)]

Figure 5.1: Iterative Research Pattern.

5.2 Pilot Study

The pilot study consisted of a literature study focused upon reading existing literature that had been deemed important during the research phase for the specification. The concept of "Lean Organizations"2 was often mentioned in the relevant literature; therefore, additional literature about Lean was added to the study material[29][7].

The current version of the maturity model was studied in detail, and meetings were held with my supervisor at R2M, Björn Tegeberg. We discussed my questions and thoughts on the current model and any other matters that had arisen since our last meeting.

Research papers and technical publications from universities and companies were also examined. I discovered that there was little thesis work solely examining CI/CD and DevOps; most thesis work focused upon solving another problem while taking the topic of CI/CD and DevOps into consideration.

The organization DORA3 proved to be very valuable, since they regularly provide technical papers, research and discussions about the current state of DevOps research. They have also developed their own model for determining the DevOps maturity of organizations.

Technical video lectures available for free on YouTube, mainly from GOTO conferences4, were also examined during the pilot study, since they provided real world insights into CI/CD and DevOps concepts, which I lacked due to limited working experience within the field. These insights proved very valuable for improving the current version of the model.

5.3 Thesis

For the thesis, a mixed quantitative and qualitative approach was used, where I examined both the results from the qualitative interviews and the quantitative literature to find answers to the Research Questions. Since I was not well versed in performing qualitative studies before starting this thesis, some supporting papers and resources were studied as well.

2https://www.lean.org/WhatsLean/
3https://devops-research.com/
4https://blog.gotocon.com/

These were consulted throughout the entire thesis to improve its quality[30][31]. Additionally, my supervisor advised me on how to improve and refine the qualitative method employed during this study. He advised me to employ a mixed approach in which I tried to quantify as much data as possible from the interview studies.

5.3.1 Preparations

In preparation for the interview stage of the project, the current maturity model of R2M was improved using the iterative research pattern. As seen in Fig. 5.1, the first step was to observe the problem, in this case the initial version of R2M's maturity model and the existing literature on the topic, and then to identify any potential weak areas in the model and fix these. After that, solutions were put together, in this case additional statements based on the literature to amend the weak spots. Lastly, these changes were put to the test to see if they held up in the real world. I thus continued to build upon my work from the pilot study; the preparation stage for the interview phase started somewhat at the Identify phase, since I had already performed one cycle of the iterative research pattern. I had already observed the problems with R2M's original model during the pilot study and had begun identifying potential weaknesses.

As the identified problem areas grew in scope, I decided to split the work into smaller loops, one for each topic covered in the maturity model, as seen in Sec. 3.1. I chose a topic of the model from personal preference and focused my efforts solely upon reading up on that specific topic and improving that area of the model. This was then repeated for each of the topics in the model. Additional topics were also added at this stage, since I deemed, when first examining the current research, that they were missing from R2M's model. However, the test part of the iterative approach was postponed for each of these loops until later, due to the summer vacation, since neither clients nor my tutor were available for testing, i.e. reviewing my changes to the model. After I had remade the model, my findings and improvements were peer-reviewed by relevant staff at R2M.

5.3.2 The Interview Phase

The interview phase began with interviewing the chosen clients using the improved model. For each client, key personnel with key competences were chosen to answer the statements corresponding to each interview subject's area of work. This selection was done by the clients, but with input from me to aid them in finding appropriate interview candidates. The results from these interviews can be seen in Ch. 6. The interviews took the form of qualitative, focused interview sessions with the interview subject or subjects in question. Desired interview subjects were developers, testers, team managers and other professionals with a good understanding of the software development process. The interviews focused on the specifics of each statement in the model, to find qualitative aspects in the subjects' answers. The interview process could be stopped by the candidates at any time, if they no longer wished to participate in the study. The method employed for the entire interview phase can be seen in Fig. 5.2.

[Figure: Start-up Meeting → Interview: Current Situation → Interview: Ought-to-be Situation → Gap Analysis and Backlog Creation → Summary Presentation]

Figure 5.2: The Interview Workflow.

First, a start-up meeting was held with the client, during which the client chose staff members to be interviewed for each of the categories in the improved model. A project plan for the benchmark study at the client was then agreed upon: interview subjects, time slots, constraints, et cetera.

Then the interview sessions were conducted, in which the subjects answered the statements in the improved model. The statements were answered on a six-grade point scale, zero to five, corresponding to the client's level of agreement with the statement. A summary score was then calculated for each category by taking the mean value for the entire category, and the data point for the mean score was transferred into a circular diagram; a circle diagram with mock-up data can be seen in

Fig. 5.3. The equation for calculating the summary score can be seen in Eq. 5.1, in which X corresponds to the number of statements for the category and s_n to the zero-to-five agreement level given for statement n, resulting in a value from zero to ten in the diagram. The current situation analysis forms the inner blue circle seen in the diagram, corresponding to the client's current CI/CD and DevOps compliance.

\[ \mathrm{Mean} = \left( \sum_{n=1}^{X} s_n \Big/ \sum_{n=1}^{X} 5 \right) \times 10 \qquad (5.1) \]

After this, the client answered the same statements again, from an ought-to-be situation perspective. The mean value for the ought-to-be situation was then calculated using Eq. 5.1. These data points correspond to the outer red circle seen in Fig. 5.3.
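As a minimal illustration of Eq. 5.1, the following Python sketch, using made-up answers, computes a category's summary score from zero-to-five agreement levels:

```python
def category_score(answers):
    """Scale a category's 0-5 agreement answers to a 0-10 score (Eq. 5.1)."""
    max_total = 5 * len(answers)          # sum of the maximum score per statement
    return sum(answers) / max_total * 10  # mean scaled to the 0-10 diagram axis

current = [4, 3, 5, 2, 4, 4, 3]  # mock-up answers for a seven-statement category
print(round(category_score(current), 3))  # 7.143
```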

Figure 5.3: Circle diagram used in the gap analysis, based on mock-up answers.

An analysis of the interview material and a gap analysis of the circle diagram were then conducted. This analysis generated a backlog

of tasks unique to each client. These tasks were formulated by analyzing which areas in the circle diagram had the biggest discrepancy between the current situation analysis and the ought-to-be analysis. Additionally, notes and voice recordings taken during the interviews factored into the creation of these backlog tasks. It was often necessary to read between the lines of what the interview subjects said and weigh this against the possible solutions to the problems the client faced. A proposal for the most critical tasks to implement in a first cycle to increase CI/CD and DevOps compliance, i.e. the backlog of tasks, was documented for the client and became the result of the workshop. The backlog was presented to the client in the form of a summary report that contained the proposed tasks, the gap analysis and other relevant information for that client.

The client was also given a feedback questionnaire about the backlog, in which the client quantified their satisfaction with the items presented, to gauge the quality of the suggestions for improvement drawn from the model. The intent was to use the feedback to help gauge the scientific quality of the model. This feedback material can be seen in Sec. A.3.

5.3.3 Evaluation of the Interview Study

After the interview phase was completed, I focused upon analyzing the results from the interview study to answer the Research Questions seen in Sec. 2.4.

My conclusions were derived from a combination of factors. Some results were derived by comparing the results from the interview studies with current research in the area. I also compared the results from the different clients with each other, to find differences and patterns in the interview data. Lastly, I used the answers gathered about client satisfaction with the backlog to answer my research questions. The feedback questionnaire played a big role in this process; it can be seen in Sec. A.3. In it, I try to quantify the client's satisfaction with the presented backlog items and the conclusions I drew from their answers. It has three categories that are of interest in the evaluation of the research questions, seen in the list below. Candidates were able to answer these questions on a seven-grade Likert scale, which I chose since it seemed the most appropriate.

1. Insightful: We experience that the presented backlog items are insightful and coincide with our organization's development opportunities.

2. Clarity: We think the presented backlog items are clear and concrete.

3. Doable: We think that the featured backlog items are doable for us.

In addition, there were more open-ended statements in which the client could provide feedback on the model, the interview process and any other thoughts that might be of value. The feedback questionnaire was designed by me, with input and suggestions from both my tutor and my supervisor. Based on the answers given by each client, I calculated a mean value of agreement in percent for these three categories, as seen in Eq. 5.2, in which X corresponds to the total number of backlog items presented for that particular client. A total score for each category was computed by adding the results for the backlog statements and then dividing by the maximum attainable score, i.e. the sum of the maximum scores for the category. This results in a mean value which can be converted into percent.

\[ \mathrm{Mean} = \sum_{n=1}^{X} \mathrm{Score}_n \Big/ \mathrm{Max} \qquad (5.2) \]
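A small Python sketch of Eq. 5.2, with made-up ratings and under the assumption that each backlog item is scored zero to six on the seven-grade scale:

```python
def agreement_percent(scores, max_per_item):
    """Total score divided by the maximum attainable score (Eq. 5.2), in percent."""
    return sum(scores) / (max_per_item * len(scores)) * 100

insightful = [5, 6, 4, 5, 5]  # mock-up ratings for five backlog items
print(f"{agreement_percent(insightful, max_per_item=6):.2f}%")  # 83.33%
```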

The goal was to use these calculated mean values to draw conclusions about how usable and effective this benchmark, and the method employed at large, were. This was done by examining the calculated mean values on their own, but also by comparing them with the results of the other clients that partook in the study.

Chapter 6

Results

6.1 The Improved Model

The public version of this report only contains a sample of the model, due to R2M's copyright on the material, seen in Sec. A.1. This sample consists of the sales material handed over to potential interview candidates, containing three questions from each topic of the benchmark model, to aid candidates in choosing relevant staff for the interview study. A sample backlog is also provided in Sec. A.2, with an accompanying sample feedback protocol in Sec. A.3.

All clients have been anonymized in the public version of the report. Only R2M has access to the full model and the non-anonymized answers from interview subjects. The detailed backlogs of each client are also not provided, since they could be used to deduce which companies partook in the study.

6.2 The Interview Study

The results from the gap analysis performed as a part of the interview study are presented here in the sections below.


6.2.1 Company One

The interviews at Company One took approximately ten hours in total. The results can be seen in Fig. 6.1 and in Tab. 6.1. Before the interviews started, Company One had already expressed a wish to improve in the field of software testing.

The results show that there exists a large discrepancy between the current situation and the ought-to-be situation for the different approaches to software testing; thus, the model captured this area of possible improvement. The model also seems to capture the need for improvement in the areas of Building Systems, Security, and Logging and Feedback.

Figure 6.1: Company One's gap analysis.

Table 6.1: Results from Company One's gap analysis in table form.

Ability | Current Situation | Ought to be
CM methodology | 7,286 | 8,286
Building Systems | 6,727 | 8,909
Task-based Development | 7,500 | 8,167
Tools Integration | 4,333 | 5,667
CI and CD | 6,400 | 7,100
Deployment | 5,778 | 7,333
Unit Test | 5,400 | 8,000
Integration and System Test | 3,833 | 5,167
Acceptance Test | 2,000 | 5,273
Performance and Destructive Test | 2,000 | 3,143
Security | 5,091 | 6,727
Logging and Feedback | 6,000 | 7,286
Installation and Upgrade | 6,167 | 6,500
Architecture | 7,000 | 7,600
Corporate Culture | 8,429 | 9,000

The backlog and the full feedback protocol can be seen in Sec. A.4. Company One's agreement with the results formulated in the backlog can be seen in Tab. 6.2. In total, twenty-four backlog items were formulated using the results gathered at Company One.

Table 6.2: Company One's level of agreement with the backlog.

Agreement Category | Percent
Insightful | 84,03%
Clarity | 100,00%
Doable | 75,69%

Judging by their feedback, it is positive to see that Company One was very pleased with the evaluation process. A high score was achieved across the board for all three categories in the feedback material. Company One was also pleased to have partaken in the study, and

they thought that the model captured their definition of what CI/CD and DevOps mean well. An interesting thought raised by Company One was that they wanted a more open discussion about the ought-to-be situation. For example, they wanted me to provide them with my interpretation of a suitable ought-to-be situation, since their desired situation does not necessarily imply a good CI/CD and DevOps maturity level.

6.2.2 Company Two

Company Two wanted to evaluate two different projects within their organization, between which there were some key differences in the methods used: one project with the new workflow and one project with the old workflow. They were especially interested in the differences measured between these two projects, to aid them in evaluating their approach and deciding which method to use going forward.

New Workflow

The interviews at Company Two took approximately fifteen hours in total. The results can be seen in Fig. 6.2 and in Tab. 6.3.

The results show that Company Two has come far in their work towards achieving CI/CD and DevOps maturity according to the benchmark. Notably, a very high score in Performance and Destructive Test was achieved. However, there still exists some work to be done in Acceptance Test and Security for the new methodology.

Figure 6.2: Company Two's gap analysis for the new workflow.

Table 6.3: Results from Company Two's gap analysis of the new workflow in table form.

Ability | Current Situation | Ought to be
CM methodology | 7,286 | 8,000
Building Systems | 9,091 | 9,636
Task-based Development | 8,000 | 9,000
Tools Integration | 7,714 | 8,286
CI and CD | 8,100 | 8,600
Deployment | 7,333 | 7,333
Unit Test | 7,800 | 8,200
Integration and System Test | 6,400 | 7,636
Acceptance Test | 2,800 | 4,800
Performance and Destructive Test | 7,714 | 8,143
Security | 5,600 | 7,400
Logging and Feedback | 7,714 | 8,286
Installation and Upgrade | 7,636 | 8,000
Architecture | 8,000 | 8,000
Corporate Culture | 6,769 | 8,000

The backlog and the feedback can be seen in Sec. A.5.1. Company Two's agreement with the results formulated in the backlog can be seen in Tab. 6.4. In total, twenty-two backlog items were formulated using the results gathered at Company Two for the new workflow.

Table 6.4: Company Two's level of agreement with the backlog of the new workflow.

Agreement Category | Percent
Insightful | 72,73%
Clarity | 70,45%
Doable | 72,73%

Judging by their feedback, it is positive to see that Company Two was pleased with the evaluation process. A high score was

achieved across the board for all three categories in the feedback material. However, they wanted to answer the more open-ended statements during a face-to-face session; thus these answers are not provided in the feedback material. Another comment on the results is that Company Two chose to give the same answers for the Insightful and Doable categories, since they thought that these two topics were closely associated for their projects. This compromise was accepted, since I deem it their right as an interviewed organization to answer these questions as they saw fit.

Old Workflow

The interviews at Company Two for the old workflow took approximately fifteen hours in total. The results can be seen in Fig. 6.3 and in Tab. 6.5.

The results show that there exists a very large discrepancy between the current situation and the ought-to-be situation for Performance and Destructive Test, which is very interesting to see when taking the previous result from Company Two into consideration. However, the trend of needing to improve in the areas of Acceptance Test and Security persists in the old methodology as well.

Figure 6.3: Company Two's gap analysis.

Table 6.5: Results from Company Two's gap analysis in table form.

Ability | Current Situation | Ought to be
CM methodology | 7,429 | 8,286
Building Systems | 8,727 | 9,818
Task-based Development | 7,333 | 9,000
Tools Integration | 7,429 | 8,286
CI and CD | 7,700 | 8,700
Deployment | 7,778 | 8,444
Unit Test | 7,200 | 8,200
Integration and System Test | 5,273 | 7,636
Acceptance Test | 2,400 | 4,800
Performance and Destructive Test | 1,714 | 8,143
Security | 5,000 | 7,400
Logging and Feedback | 7,429 | 8,286
Installation and Upgrade | 6,667 | 8,167
Architecture | 7,000 | 8,000
Corporate Culture | 6,154 | 8,000

The backlog and the feedback can be seen in Sec. A.5.3. Company Two's agreement with the results formulated in the backlog can be seen in Tab. 6.6. In total, twenty-five backlog items were formulated using the results gathered at Company Two for the old workflow.

Table 6.6: Company Two's level of agreement with the backlog.

Agreement Category | Percent
Insightful | 73,33%
Clarity | 72,00%
Doable | 73,33%

The results show that Company Two thinks the results gathered and formulated for the old workflow are of marginally better quality. I think that this is due to the greater differences between the current and ought-to-be situations, and to these differences corresponding better with those of the other organizations present in the study. Thus, I could draw

upon the results from the other candidates to improve Company Two's backlog. For the old workflow, Company Two chose to use the same procedure for answering the feedback material.

6.2.3 Company Three

The interviews at Company Three took approximately nine hours in total. The results can be seen in Fig. 6.4 and in Tab. 6.7.

The results show that there exists a large discrepancy between the current situation and the ought-to-be analysis for the more advanced testing methods. The model also seems to capture the need for improvement in the areas of CI and CD, Deployment, and Logging and Feedback.

Figure 6.4: Company Three's gap analysis.

Table 6.7: Results from Company Three's gap analysis in table form.

Ability | Current Situation | Ought to be
CM methodology | 8,429 | 9,571
Building Systems | 8,000 | 8,909
Task-based Development | 8,333 | 9,000
Tools Integration | 7,714 | 8,571
CI and CD | 6,700 | 8,600
Deployment | 5,500 | 7,250
Unit Test | 7,400 | 8,600
Integration and System Test | 7,000 | 8,167
Acceptance Test | 3,400 | 4,545
Performance and Destructive Test | 0,429 | 3,571
Security | 2,910 | 5,818
Logging and Feedback | 4,571 | 8,286
Installation and Upgrade | 5,833 | 7,167
Architecture | 7,200 | 8,600
Corporate Culture | 7,000 | 8,000

The backlog and the feedback can be seen in Sec. A.6. Company Three's agreement with the results formulated in the backlog can be seen in Tab. 6.8. In total, twenty-six backlog items were formulated using the results gathered at Company Three.

Table 6.8: Company Three's level of agreement with the backlog.

Agreement Category | Percent
Insightful | 76,28%
Clarity | 73,72%
Doable | 65,38%

It is positive to see that Company Three was pleased with the results overall, even though a lower score was achieved in the category Doable compared to the other categories and to the other participants in the study, who all answered above seventy percent. I theorize that this could be due to a variety of factors: perhaps I wasn't thorough

enough during the interviews and therefore misjudged the capabilities of Company Three, or perhaps Company Three was more pessimistic about its capability to put the backlog items to use. However, they were still pleased with their participation in the study, and they thought that the model captured their definition of what CI/CD and DevOps mean well. Therefore, I still think the results are of great value.

Chapter 7

Discussion

7.1 The Improved Model

The model doubled in size, from around one hundred statements to around two hundred, during the improvement phase of the project, since I deemed that the original model was not comprehensive enough. The additional statements were created by drawing from the existing material, as previously mentioned in Sec. 5.3.1. Every statement was peer reviewed and approved for quality by staff at R2M, mostly by my tutor.

Every statement also received an accompanying explanation of why it is in the model. The intention was to improve the quality of each statement and to filter out unnecessary or redundant statements, since if a statement could not be motivated it was probably flawed to begin with. These explanations were also peer reviewed for quality within R2M. Some statements in the original model were also removed or rewritten to improve their quality.

The model was also extended to support a finer grained scale for answering the statements: a six-grade scale, compared to the previous four-grade scale, allowing a more fine-grained answer to be given to a statement. This scale corresponds to the classical Likert scale1, with a few changes.

1https://en.wikipedia.org/wiki/Likert_scale


Namely, the options to answer neutral and I don't know were removed, because the goal is to force the client to take a stand on the statement; otherwise, every client would have the possibility to answer neutral, possibly thwarting the purpose of the study, since the weight of the answers matters.

The structure of the model was changed as well to accommodate the additions. New categories were added to the model. These categories were decided upon by me, by examining and drawing conclusions from the literature and from other models trying to achieve similar goals. All categories were peer reviewed within R2M, and some statements were re-categorized into other topics where they fit better.

Initially, the improved model contained about two hundred and forty statements. The model was shrunk during the peer reviewing process, since some statements were of low quality or overlapping. A typical low-quality statement was often coupled to a certain technique or method, thus violating R2M's wish that the model remain technique and brand neutral. Another example of a low-quality statement was one formulated as a yes-or-no question rather than a statement, for example Do you use a CI-server in your project?, which leaves the interview subject little room for reasoning and thought. A detailed explanation of what has been added in the improved model can be seen in the subsections below.

7.1.1 Configuration Management

Additional statements were added for this topic, mostly relating to integration focused development and the usage of feature toggles and other methods for facilitating a single integration branch. Additionally, improvements were made to the motivations for why each statement is in the model to begin with.
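As a minimal sketch of the feature toggle idea mentioned above (the flag name and checkout functions are invented for illustration), unfinished work can be merged to the single integration branch but kept dormant behind configuration:

```python
import os

def legacy_checkout_flow(cart):
    return f"legacy checkout of {len(cart)} items"

def new_checkout_flow(cart):
    return f"new checkout of {len(cart)} items"

# The toggle is read from configuration, so the new code path can be
# merged early and enabled only when it is ready.
NEW_CHECKOUT = os.environ.get("ENABLE_NEW_CHECKOUT", "0") == "1"

def checkout(cart):
    return new_checkout_flow(cart) if NEW_CHECKOUT else legacy_checkout_flow(cart)

print(checkout(["apple", "pear"]))
```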

7.1.2 Build Systems

Few changes were made to this topic, since its quality was deemed already good enough. However, many statements lacked an explanation of their purpose in the model, so such explanations were added to tie the statements to the literature. Some changes were also made to the statements themselves to better fit the literature.

7.1.3 Task-Based Development

This topic was greatly improved by taking many aspects of lean and agile methods[7][29] into consideration when formulating new statements and improving existing ones. The topic was also made more technique neutral, instead of focusing solely on digital organizers for development tasks. As an example, the use of Kanban boards or post-it boards was added, since teams might prefer an analog way of keeping track of development tasks.

7.1.4 Tools Integration

This topic remained almost completely unchanged in the improved model. It was deemed to cover the most important aspects, and the quality of the statements was up to par.

7.1.5 CI and CD Tool Support Usage

This category previously covered only CI aspects. Therefore, it was decided to extend it with CD practices to achieve a more balanced result. Since CI and CD go hand in hand in a healthy development pipeline, I decided not to split them into two categories. These extensions were taken mostly from the existing literature[11][3]. Real world insights gathered from employee education sessions were also used to improve the statements of this category, based on R2M's first-hand knowledge of these topics due to their specialization within the field of CI and CD. Other insights were gathered from video lectures[32][33]. Finally, aspects of code inspections were added, to check whether the client had a set code standard and checked for risks in their projects.

7.1.6 Deployment

This category was extended with more advanced deployment practices, i.e. Blue-Green deployment and Canary releasing, since they were not present in the initial version. These deployment techniques are relatively advanced and are therefore only touched upon briefly for now. This is a possible area of improvement, as the model will need to be adjusted to the techniques used in the industry in the future.
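To hint at what Canary releasing involves, here is a hedged Python sketch of one common routing approach, deterministic per-user bucketing; the percentage and names are assumptions for illustration only:

```python
import hashlib

CANARY_PERCENT = 5  # share of users routed to the new release

def route(user_id):
    """Deterministically route a user to 'canary' or 'stable'."""
    digest = hashlib.sha256(user_id.encode()).digest()
    bucket = digest[0] * 100 // 256  # map first byte (0-255) to 0-99
    return "canary" if bucket < CANARY_PERCENT else "stable"

print(route("alice"), route("bob"))
```

Because the bucketing is deterministic, a given user keeps seeing the same version while the canary percentage is gradually increased.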

7.1.7 Unit Testing

The category was extended by adding statements related to checking the correctness of the unit tests themselves. Adding more unit tests is trivial, but adding the right unit tests isn't: for example, a unit test should check the result of a function, not how the function itself is implemented, i.e. black box testing. Statements for checking the speed of unit tests were also added, since unit tests are supposed to be very efficient due to their small scope.
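A small Python sketch of the black box idea, with a hypothetical moving_average function: the test asserts only on observable results, so it keeps passing even if the implementation is later rewritten, for example as a running sum:

```python
def moving_average(values, window):
    """Average of the last `window` values (unit under test)."""
    return sum(values[-window:]) / window

def test_moving_average_result():
    # Assert on results only, never on internals.
    assert moving_average([1, 2, 3, 4], window=2) == 3.5
    assert moving_average([5], window=1) == 5.0
```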

7.1.8 System- and Integration Testing

The category was improved by adding statements that went even more in depth into the different aspects of integration and system testing, with the goal of building on the unit testing statements and transitioning into the acceptance test stage. Since integration and system testing are important, a lot of time was dedicated to this area to reach high quality statements.

7.1.9 Acceptance Testing

The concept of acceptance testing was not present in the original model. Having a well formulated acceptance test stage enables teams to work faster and have more confidence in the release candidate (RC), since fewer manual man-hours are spent checking the customer's requirements and more time can be spent implementing and improving features. The foundation for the new statements was taken from the literature[11][3]. Due to the loose definitions of acceptance tests and their inherent nature of factoring in human opinions, like look and feel, there are only a few statements in this category.

7.1.10 Performance Testing and Destructive Testing

The concept of destructive testing often came up in relation to microservices, service oriented architecture and DevOps[27]. Since testing

for robustness in software relates to performance testing, it was decided to merge these two topics, thus extending the existing performance testing topic. Most of the statements added concern performance testing, due to the lack of literature and other peer reviewed research into destructive testing. It is noteworthy that performance testing is inherently hard to do effectively in the real world and often takes a lot of time and resources, due to the need to simulate real world conditions over extended periods of time. Some projects even seem to omit performance testing entirely and instead test performance in the production environment with real customers, handling issues as they appear.

7.1.11 Security

Statements to check a project's maturity in managing security risks were not present in the original model. It was decided to add this topic to the model, as IT security seems to be neglected when transitioning to DevOps, per real world insights[34][3][21]: security professionals are often brought in at the end of the project. This can lead to massive security audits, which might delay projects or, in the worst case, prevent the project from being deployed[34].

7.1.12 Logging and Feedback

This topic was added to the model since it was lacking in the original version. It was deemed that log data from the application as well as the infrastructure, together with feedback from customers, is an important enabler for making better business decisions, both historically and for future use of the project[7][3]. Gathering these metrics does not yet seem to be an orthodox practice in the industry. Therefore, statements covering aspects of logging were added to the model. The new statements also cover the types of data gathered and whether the client uses statistical analysis to predict trends and risks in the application as well as the infrastructure.

7.1.13 Installation and Upgrading

Only a few changes were made in this area, mostly related to improving existing statements. Some new statements were also added, regarding the practical sides of the CI/CD pipeline, for example versioning, which software is running, configuration of hardware, employees' workstations, et cetera. In projects, these practical aspects cannot be taken for granted, and there might exist a lot of uncertainty about which software and which versions are being used at any one time. Every statement also received a motivation for its existence, since none existed previously in this category.

7.1.14 Software Architecture

Statements regarding software architecture were added to the model. Several times during the thesis, the fact came up that transitioning from monolithic structures to service oriented or microservices architectures enables companies to adopt more CI and CD practices. Therefore, this topic was deemed necessary for a complete model. These statements were gathered from the existing literature[3][22]. Real world insights gathered from employee education sessions were also used to improve this section of the model.

7.1.15 Company Culture

DevOps is not just about infrastructure automation, which is a common misconception. DevOps also puts emphasis on structural and cultural change within the company: breaking down silos, leading to cross functional teams, and injecting a learning culture into the company. Therefore, this category was added to the model, with statements designed to capture whether an organization is on its way to implementing these cultural changes, for example the use and application of a learning culture and the transition to a generative organization as seen in the Westrum typology[3]. Another key difference from the other categories is that this category of statements tries to estimate softer values within companies.

7.2 The Interview Study

The practices for conducting the interview part of the benchmark process were not changed while improving the model, since it was deemed that the approach to the interview benchmark, as seen in Sec. 5.3.2, did not need to be improved.

During the interview study, it was shown that there existed a larger than expected discrepancy in which questions companies could answer. In the real world, companies outsource parts of their business, for example testing, to external suppliers. Hence, some statements proved more difficult to answer than initially expected. Often, this resulted in the client answering the statements to the best of their ability, or from the perspective of the expectations they had on the outsourcing partner.

Another problem discovered during the interview study was that different companies had different definitions of technical terms used in the software industry. This led to confusion when clients answered the statements, because they had differing interpretations of, for example, what the term infrastructure includes.

Another possible problem with the interview study was the risk of bias among the interview subjects. As one can imagine, it can be hard to admit to possible areas of improvement in your own work and to remain unbiased. I theorize that this might skew the answers gathered from interview subjects towards the positive side.

There was also a risk of bias from my side when conducting the interviews, namely in how I chose to interpret the interview subjects' answers and which notes I took. Of course, I tried to be as unbiased as possible, and my tutor went through my notes when appropriate to see if he thought I had missed anything important or if my notes were lacking due to bias.

The backlogs created for each client in the study were also reviewed by my tutor, to increase their quality and to remove as much bias as possible, so that the preconditions for gathering the feedback material would be as good as possible.

7.3 Post Interview Study Evaluation

7.3.1 Comparisons and Similarities

If we compare the results from the gap analysis diagrams of all the interview subjects, we can draw some conclusions relating to the research questions, Sec. 2.4.

If we overlay every current situation analysis from each company in a single diagram, seen in Fig. 7.1, we can draw conclusions about common strengths and weaknesses. We can see in Fig. 7.1 that most companies are proficient in CM methodology, Task-based Development, CI and CD, and Architecture. It is apparent that automated testing suites are less adopted in the industry. Most companies are adept at unit testing, integration testing and system testing, but this pattern soon diminishes, since almost every company in this study struggles with acceptance testing, performance testing and destructive testing. There is also a great variance in skills when it comes to security testing and adopting a DevOps culture. I think that the reason for this is that the use of automated testing suites for acceptance, performance and destructive testing is inherently an advanced practice, reserved for the highest performing organizations.

Figure 7.1: All current situations from the four companies.

If we overlay every ought-to-be situation analysis from each company in a single diagram, seen in Fig. 7.2, we can draw conclusions about what companies in the industry wish to be more proficient in. We can clearly see that everyone wishes to have strong building systems, and to be even better at task-based development, architecture and corporate culture. Another interesting insight is that everyone wishes to be good at unit testing, while the other methods of testing are less prioritized; however, there is a consensus that everyone wants to improve in acceptance testing. Another interesting insight is that having robust logging and feedback systems is highly prioritized in the industry, which I initially did not expect.

Figure 7.2: All ought to be situations from the four companies.

To reason about a mean value for the software industry, I decided to calculate an average using all the results gathered in the interview study, Fig. 7.3, Tab. 7.1. We mainly see that everyone wishes to improve a bit in every category of the benchmark. Another insight is that the mean company prioritizes unit testing and system and integration testing, while other testing methodologies are not prioritized. It is also very interesting to see that IT security is lacking, but also that there is a big discrepancy between the current and ought-to-be situations. As previously mentioned, there is also a great wish to improve in software testing, logging and feedback, and company culture.

Figure 7.3: The mean for all companies.

Table 7.1: Calculated mean values.

Ability | Current Situation | Ought to be
CM methodology | 7,607 | 8,536
Building Systems | 8,136 | 9,318
Task-based Development | 7,792 | 8,792
Tools Integration | 6,798 | 7,702
CI and CD | 7,225 | 8,250
Deployment | 6,597 | 7,590
Unit Test | 6,950 | 8,250
Integration and System Test | 5,627 | 7,152
Acceptance Test | 2,618 | 4,855
Performance and Destructive Test | 3,000 | 5,750
Security | 4,550 | 6,836
Logging and Feedback | 6,429 | 8,036
Installation and Upgrade | 6,576 | 7,458
Architecture | 7,300 | 8,050
Corporate Culture | 7,088 | 8,250

7.3.2 Interpreting the Feedback Data

If we calculate a mean satisfaction value across all four projects participating in this study, we get the results seen in Tab. 7.2. These mean values, calculated from the per-client satisfaction scores seen in Ch. 6, give us an overall indication of the effectiveness and usability of my work and this benchmark. I think that these overall high percentage scores show that the companies partaking in the study agreed that the method and the benchmark were a useful tool for them.

Table 7.2: Mean level of agreement with the backlogs.

Agreement Category | Percent
Insightful | 76,59%
Clarity | 79,04%
Doable | 71,78%

I think that this data, combined with the mean values and comparisons seen in Sec. 7.3.1, gives us a strong indication that this benchmarking tool discovers strengths and weaknesses within an organization, and that it is both a usable and effective result.

7.3.3 Does the Result Change Over Time?

A question that often came up during the interviews was: "What happens if we redo this benchmark study in a year?". This is an interesting question that I think should be touched upon to affirm the scientific value of this research. The following is speculation from my side, backed up by the results from the interview study.

I think that if a company were to redo this study a year from now, the results from the benchmark would of course be different. Hopefully, the new current situation would be the previous year's ought-to-be situation, and a new, even more ambitious ought-to-be situation would be the new goal for the company. Of course, that would be a perfect result in a controlled environment. A more likely result is that some issues from the old backlog and benchmark would be resolved, while others would remain untouched. Since the complexity and the number of factors that have to come together for a successful development organization are great, it is unreasonable to think that every issue presented in the backlog would have been mended at the client. This was also apparent when I studied the feedback material for each interview organization in Ch. 6: no client fully agreed that every item in the backlog was clear and doable for their organization.

Therefore, I think a feasible outcome a year from now would be that some of the issues presented in the backlog will have been implemented, while other issues remain for the partaking organization.

7.3.4 Flaws in the Study Evaluation

An inherent flaw in the method of this study, which I thought of while analyzing the results, is that the clients themselves decided upon the ought-to-be situation. This causes a problem, since what the client deems important might not comply with CI/CD and

DevOps. This means that there is ample room for clients to apply their own bias to the results. Nonetheless, this is still a qualitative study; therefore, bias will always be a problem inherent to the method. A way to amend this is to create a quantitatively measured ought-to-be situation to serve as guidance for the interview subjects when determining their own ought-to-be situation. Of course, this prepared ought-to-be situation has to be adapted to the client's prerequisites.

Another possible flaw in the method, which ties into the above, is that it can be difficult for employees to admit their employer's weaknesses. After all, it takes a great deal of insight to see where you can improve. This in turn could affect the possibility to compare results, since the different interview candidates will of course introduce their biases into the results.

Chapter 8

Conclusions

8.1 Research Questions

8.1.1 RQ1: How can we improve qualitative models for measuring CI/CD and DevOps maturity?

Many measures need to be employed to improve the quality of qualitative models, as discussed in Sec. 7.1. Firstly, this study shows that there is a great need to ask open ended questions that leave room for interpretation by the interview subjects, to promote discussion between them and the interviewer. Secondly, it seems important to ask questions about different aspects of the same problem, to achieve a more nuanced result through the overlap in the answers. Another helpful way to improve the quality of qualitative models is to peer review the questions as much as possible before testing them against interview subjects. Yet another is to adopt more fine-grained answer scales, to allow for more flexibility and room for discussion. However, I strongly believe that the option to answer neutrally to a statement should be removed in these types of studies, with the intent of forcing the subject to take a stance on the statement being asked. It is very important to eliminate as much bias as possible that could affect the study, from both the researcher's and the research subject's perspective.


These conclusions are further strengthened by the promising results from the feedback material seen in Sec. 7.3.2. According to the interview subjects, the benchmark is comprehensive enough to ascertain an overview of an organization's strengths and weaknesses within the CI/CD and DevOps fields. Thus, the steps I took to improve the model's quality seem to have been successful.

8.1.2 RQ2: To what extent can we compare qualitative results between companies using a qualitative model?

As seen in Sec. 7.3.1, it is possible to compare the results and draw conclusions from the comparisons. We can clearly see which categories of the benchmark are well or poorly adopted within the software industry. We can clearly see from the results that some categories are less adopted in the industry, namely acceptance testing, performance and destructive testing, and security. This conclusion is further strengthened by the figure and data for the mean organization, Fig. 7.3, Tab. 7.1. An addendum to this conclusion is that these categories might have been unfairly harsh on the clients, thus skewing the result towards a low one every time; still, none of the interview subjects mentioned this in their feedback. I also theorize that the implementation of these more advanced testing suites falls into developer territory, thus requiring more skill to develop. Of course, the sample size of this study is still small, so a larger sample size is needed to draw more definitive conclusions.

It would be interesting to interview an organization that is very proficient in the use of automated acceptance testing, destructive testing and security, to see if the model captures this or if it remains skewed.

I strongly believe, though, that these types of comparisons would be difficult if the organizations examined differed too much in scope, size and structure. The four projects examined in this study were all of roughly similar character, since they were R2M's customers before partaking in the study. I think that this conclusion would change if we compared results from two companies that had used two different qualitative models. I therefore draw the conclusion that it is possible, to some degree, to compare results from qualitative models.

8.1.3 RQ3: Which actions are effective for improving CI/CD and DevOps compliance in an organization?

Drawing conclusions from my research about which actions are the most effective for increasing CI/CD and DevOps compliance is somewhat harder. I think that comparing the different results from the interview study and examining which actions are shared across several clients can give an indication of which actions are the most important, as seen in Sec. 7.3.1, while of course also taking the clients' respective feedback and its mean values, seen in Tab. 7.2, into consideration. If a client is satisfied with their backlog of tasks, that also gives an indication that the tasks proposed in the backlog are the most effective for reaching better CI/CD and DevOps maturity.

The ought-to-be results for the mean organization seen in Fig. 7.3 favor the use of extensive automated testing suites, particularly if the entire testing pipeline can be automated, from simple unit tests to acceptance and performance tests. Improving and developing building systems as well as logging and feedback systems also seems to be a priority for the mean company to achieve higher maturity in the benchmark. The interview subjects seem to agree with this conclusion if we factor in their satisfaction with their backlogs. This theory also coincides with other studies examining the same topics [35][36]. I therefore strongly believe that focusing on implementing and improving an extensive automated test suite is a very beneficial step towards higher compliance. The results from the interview study also suggest that changing work methodologies and adopting lean and agile practices to facilitate a DevOps culture is important for companies; this study shows this as well, since the mean company wants to improve its corporate culture.

Another theory of mine stems from the fact that many organizations in the study expressed that their understanding of what CI/CD and DevOps mean had improved by partaking in the study. Perhaps an excellent way to increase overall CI/CD and DevOps maturity would be to focus more on educating staff on the subject itself, to increase awareness of CI/CD and DevOps.

My conclusion is that the most effective action for increasing CI/CD and DevOps compliance is to focus on creating fully automated testing suites.
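To illustrate what such a fully automated chain can look like, here is a toy sketch of my own. The stage order mirrors the benchmark statements (CI: building, unit, integration and system tests; CD: acceptance, performance, security and destructive tests), while the make targets are placeholders rather than any tool actually used in the study.

    # Toy sketch of a fully automated CI/CD test chain. Stage names mirror
    # the benchmark statements; the commands are placeholders.
    import subprocess

    CI_CD_STAGES = [
        ("build",             ["make", "build"]),
        ("unit tests",        ["make", "test-unit"]),
        ("integration tests", ["make", "test-integration"]),
        ("system tests",      ["make", "test-system"]),
        ("acceptance tests",  ["make", "test-acceptance"]),
        ("performance tests", ["make", "test-performance"]),
        ("security tests",    ["make", "test-security"]),
        ("destructive tests", ["make", "test-destructive"]),
    ]

    def run_pipeline():
        for name, cmd in CI_CD_STAGES:
            # Fail fast: a failing stage stops the chain and gives feedback.
            if subprocess.run(cmd).returncode != 0:
                print(f"Pipeline failed at stage: {name}")
                return False
        print("Release candidate ready for deployment.")
        return True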

Chapter 9

Suggested Future Work

If I could continue the thesis for a few more months, there are some things I would like to improve.

Firstly, I would have continued interviewing more clients to gather more data, so that more reliable conclusions and mean values could have been drawn from the interview study. Unfortunately, the time investment required for each interview proved too great to fit more interviews within the time frame of the master thesis.

My second goal, which would also make the first goal easier to reach, would be to transfer the questionnaire to a web-based format and, if possible, simplify the model so that the need for in-person interviews decreases. A web-based questionnaire written in an appropriate language would probably have been a neater solution, and I theorize that it could make it easier to gather more respondents, thus providing a better basis for drawing conclusions.

Another thing I would have liked to try is to test whether the results change if only a single person at the company answers all the questions in the benchmark. Comparing this result from a single interview candidate with a benchmark result from interviewing several employees, using the method used in this study, would show whether the results vary widely, and whether the method I used is scientific and appropriate or an entirely different method should be used in future studies.
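As a sketch of what the proposed web-based format could look like, assuming for illustration that Flask is used: the two statements are borrowed from the sample model in Appendix A.1, while the routes and in-memory storage are placeholders of my own, not a design from the thesis.

    # Hypothetical sketch of a web-based benchmark questionnaire using Flask.
    # Routes and storage are simplified placeholders.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    STATEMENTS = [  # two statements borrowed from the sample model in A.1
        "All projects adhere to good and proven CM methodology.",
        "The build process is triggered automatically when a developer submits a change.",
    ]
    answers = []  # a real solution would persist answers in a database

    @app.get("/statements")
    def statements():
        return jsonify(list(enumerate(STATEMENTS)))

    @app.post("/answers")
    def post_answer():
        # Expected JSON: {"statement": 0, "current": 4, "ought_to_be": 6}
        answers.append(request.get_json())
        return jsonify(status="ok")

    if __name__ == "__main__":
        app.run()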


During the interview phase, clients often complained that the model was too long. I would therefore suggest that in future work the model should either be trimmed down or split up into tiers. To prune statements, I would continue to investigate which methods proved most efficient for increasing CI/CD and DevOps compliance and remove low-impact statements. Alternatively, I would split the model into tiers where the most important aspects are tested first, to keep the number of statements low; in subsequent tiers, additional statements would be presented to the client. This tier model could simply take the form of a basic, advanced or full version of the benchmark.

As previously mentioned, technical terms often had differing meanings among clients during the interviews, which might have affected the outcome of this study. Therefore, I would opt for creating a dictionary of technical terms, as defined by me, that the interview subjects have to adhere to, in order to eliminate this potential source of result pollution.

Chapter 10

Retrospective

If I could redo the thesis from scratch with the knowledge I have acquired now, I would have done a few things differently.

Firstly, I would have opted not to start the project during the summer months, since this caused a few hurdles. The interview phase took longer than expected because finding interview subjects was slowed down by the summer vacation. This led to me not being able to start booking customer meetings until the end of August. I theorize that if I had conducted the project during spring or autumn, booking initial start-up meetings and finding interested interview clients could have started at the beginning of the project.

My second point ties into the first: if the project had been done during the working months, more time could have been dedicated to sanity checking the improved model. This would have aligned better with the iterative approach, since the tests for each iteration could have been made continuously instead of at the end, when the vacation months had passed.

Bibliography

[1] CMMI Product Team. CMMI for Development, Version 1.3. Tech. rep. https://resources.sei.cmu.edu/library/asset-view.cfm?assetID=9661. Software Engineering Institute, 2010.
[2] Paul M. Duvall, Steve Matyas, and Andrew Glover. Continuous Integration: Improving Software Quality and Reducing Risk. 1st edition. ISBN: 978-0321336385. Addison-Wesley Professional, July 2007.
[3] Gene Kim, Jez Humble, Patrick Debois, and John Willis. The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in Technology Organizations. ISBN: 978-1942788003. IT Revolution Press, Oct. 2016.
[4] Continuous Integration. https://www.thoughtworks.com/continuous-integration. Accessed: 2017-07-14.
[5] Denis Polkhovskiy. "Comparison between Continuous Integration tools". MA thesis. Tampere University of Technology, 2016.
[6] Lauri Hukkanen. "Adopting Continuous Integration – A Case Study". MA thesis. Aalto University School of Science, 2015.
[7] Jez Humble, Joanne Molesky, and Barry O'Reilly. Lean Enterprise: How High Performance Organizations Innovate at Scale. 1st edition. ISBN: 978-1449368425. O'Reilly Media, Jan. 2015.
[8] Yegor Bugayenko. Why Continuous Integration Doesn't Work. Accessed: 2017-07-14. Sept. 2014.
[9] Pavan Belagatti. 5 Reasons Why Organizations Fail to Adopt CI/CD. http://blog.shippable.com/5-reasons-why-organizations-fail-to-adopt-ci/cd. Accessed: 2017-07-14.
[10] Yegor Bugayenko. Continuous Integration is Dead. http://www.yegor256.com/2014/10/08/continuous-integration-is-dead.html. Accessed: 2017-07-14. Oct. 2014.
[11] Jez Humble and David Farley. Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation. 1st edition. ISBN: 978-0321601919. Addison-Wesley Professional, July 2010.
[12] Niklas Sundbaum. "Automated Verification of Load Test Results in a Continuous Delivery Deployment Pipeline". MA thesis. Royal Institute of Technology CSC, 2015.
[13] Aleksi Häkli. "Implementation of Continuous Delivery Systems". MA thesis. Tampere University of Technology, 2016.
[14] Ernest Mueller. What Is DevOps? https://theagileadmin.com/what-is-devops/. Accessed: 2017-07-27. Aug. 2010.
[15] Nicole Forsgren. GOTO 2015 DevOps: Next. https://www.youtube.com/watch?v=dMwGfRINpz0. 2015.
[16] Anand Srivatsav Amaradri and Swetha Bindu Nutalapati. "Continuous Integration, Deployment and Testing in DevOps Environment". MA thesis. Blekinge Institute of Technology, 2016.
[17] Puppet Labs, IT Revolution Press, and ThoughtWorks. 2014 State of DevOps Report. Tech. rep. https://puppet.com/resources/whitepaper/2014-state-devops-report. Puppet Labs, 2014.
[18] Puppet Labs and IT Revolution. 2015 State of DevOps Report. Tech. rep. https://puppet.com/resources/whitepaper/2015-state-devops-report. Puppet Labs, 2015.
[19] Puppet Labs and DORA (DevOps Research and Assessment). 2016 State of DevOps Report. Tech. rep. https://puppet.com/resources/whitepaper/2016-state-of-devops-report. 2016.
[20] Puppet Labs and DORA (DevOps Research and Assessment). 2017 State of DevOps Report. Tech. rep. https://puppet.com/resources/whitepaper/state-of-devops-report. 2017.
[21] Jeff Smith. GOTO 2016 DevOps: The Good, The Bad, The Ugly. https://www.youtube.com/watch?v=qLUt6bwNnks. 2016.
[22] Sam Newman. Building Microservices: Designing Fine-Grained Systems. 1st edition. ISBN: 978-1491950357. O'Reilly Media, Feb. 2015.
[23] Viktor Farcic. The DevOps 2.0 Toolkit: Automating the Continuous Deployment Pipeline with Containerized Microservices. 1st edition. ISBN: 978-1523917440. CreateSpace Independent Publishing Platform, Feb. 2016.
[24] What is Destructive Testing? https://www.tutorialspoint.com/software_testing_dictionary/destructive_testing.htm. Accessed: 2017-09-05.
[25] Principles of Chaos Engineering. http://principlesofchaos.org/. Accessed: 2017-09-05. Apr. 2017.
[26] Tobias Nyholm. "Effektiv mjukvaruutveckling med continuous integration och automatisering". MA thesis. Royal Institute of Technology CSC, 2013.
[27] R. Meshenberg. GOTO 2016 Microservices at Netflix Scale: Principles, Tradeoffs & Lessons Learned. https://www.youtube.com/watch?v=57UK46qfBLY. 2016.
[28] Kevin S. Pratt. Design Patterns for Research Methods: Iterative Field Research. http://www.kpratt.net/wp-content/uploads/2009/01/research_methods.pdf. Accessed: 2017-08-14.
[29] Niklas Modig and Pär Åhlström. This is Lean: Resolving the Efficiency Paradox. ISBN: 978-9198039306. Rheologica Publishing, Nov. 2012.
[30] Claire Anderson. "Presenting and Evaluating Qualitative Research". American Journal of Pharmaceutical Education 2010; 74(8), Article 141. University of Nottingham, 2010.
[31] Tony Lynch. Writing up your PhD (Qualitative Research), Independent Study version. English Language Teaching Centre, University of Edinburgh, 2014.
[32] Sam Newman. GOTO 2017 Feature Branches and Toggles in a Post-GitHub World. https://www.youtube.com/watch?v=lqRQYEHAtpk. 2017.
[33] Ken Mugrage. GOTO 2017 It's Not Continuous Delivery If You Can't Deploy Right Now. https://www.youtube.com/watch?v=po712VIZZ7M. 2017.
[34] Gene Kim, Kevin Behr, and George Spafford. The Phoenix Project: A Novel about IT, DevOps, and Helping Your Business Win. 1st edition. ISBN: 978-0988262591. IT Revolution Press, Jan. 2014.
[35] Nicole Forsgren and Jez Humble. DevOps: Profiles in ITSM Performance and Contributing Factors. Tech. rep. Western Decision Sciences Institute, Oct. 2015.
[36] Nicole Forsgren and Jez Humble. The Role of Continuous Delivery in IT and Organizational Performance. Tech. rep. Western Decision Sciences Institute, Oct. 2015.

Appendix A

A.1 Sample Model

Author: Per Hagsten Date: 2017-09-01

Who do you think can answer these questions?

CM methodology

• All projects adhere to good and proven CM methodology.

• All project artifacts are CM-managed, not just source code (architecture documents, database scripts, development tools, artifacts, etc.).

• Branches in the CM system have a short lifespan, days at most, before they are reintegrated into the trunk.

Building Systems

• The build process is triggered automatically when a developer submits a change. The developer always receives adequate feedback from the build system.

• The reason for a failed build is easily identified. Developers can troubleshoot on their own. Resources outside the team with specialist skills are not required for troubleshooting.

• If the build system disapproves of a build, it is always due to an actual defect in the code and not, for example, a broken pipeline.

Task-based development

• All development activities are based on well-defined tasks / defects / requirements / user stories.

• We limit the number of development activities that are processed at the same time. No new activities will start until we are done with the activities we are committed to. That is, activities are handled with a "rolling window".

• Definition-of-Done for a task / defect / requirement is clear and communicated. Everyone adheres to it.

Tools Integration

• The version management system can visualize the source code changes that belong to a certain version.

• All check-ins are mapped 1:1 against a specific theme / story / requirement / defect in the version management system.

• Each source code change can be derived from a specific build, a specific baseline, and a specific release.

CI and CD

• There are clear rules for using the CI / CD tools. All projects comply with these rules.



• The CI chain is complete with building, unit tests, integration tests, system tests.

• The CD chain is complete with acceptance tests, performance tests, security tests, destructive tests and, if any, manual tests.

Product Deployment

• All differences between deployment to test environments and deployment to the production environment are known and managed by the CI / CD system. Deployment to production is therefore trivial.

• There are mechanisms in place that ensure that the test environments are identical to the production environment.

• We trust our CI / CD process so much that we always dare to deploy release candidates automatically to the production environment.

Unit Test

• There is tool support in place to automatically trigger unit tests when checking in changes to version control.

• There are mechanisms that prevent delivery of new source code unless unit tests are included. (see TDD).

• In cases where unit tests are dependent on mockups of external systems, there is a methodology and process in place to ensure that mockup and external systems are always synchronized.

Integration and System Test

• The data that the integration tests run against corresponds well to the production environment data.

• The GUI is tested automatically.

• All test cases are tested automatically with relevant inputs, both for the happy path, for corner cases and for incorrect conditions.

Acceptance Test

• Acceptance tests test the "intention" of a function, not its "implementation"; that is, they test from the user's perspective.

• Acceptance testing is an accepted part of our CI / CD pipeline.

• Acceptance tests map well against the actual production environment.

Performance and Destructive Test

• We test how the system behaves in production if we short-circuit parts of the code base, for example using a Chaos Monkey.



• We test the code base with deliberately broken / corrupt / poor test cases, to trigger uncontrolled behavior.

• The relevant performance characteristics can be tested automatically.

Security

• We perform automated security tests such as penetration tests as part of our CI / CD chain.

• We security test all third-party libraries etc. to protect us from dependency attacks, so-called dependency scanning.

• The application and the infrastructure log security data.

Logging and Feedback

• We use performance log data from the application and / or infrastructure to make decisions and solve problems.

• The status of the integration branch and the status of the projects are easily accessible in, for example, a visual assembly point.

• We collect information about when, how and what happens when the application / infrastructure generates non-normal data.

Installation and Upgrade

• There is an established, functioning and used process for all environment configuration (firewall openings, load balancing, disk management, etc.).

• The components that the CI / CD process depends on meet the CI / CD process's SLAs. Examples: availability and capacity of build servers, external test systems, etc.

• There is knowledge of all essential software and hardware that exists in the company’s environments (versions, configurations, model numbers, etc.).

Architecture

• We can make changes to the system without creating more work for other teams.

• We can install parts of the application without having to redeploy the entire application.

• If any service goes down, the error is not propagated to the entire system.

Corporate Culture

• New ideas and approaches to problems are welcomed from all employees.

• DevOps is not a separate team; Ops and Dev collaborate across all teams.

• The time to recover from system failures, such as an unplanned interruption, is short.



A.2 Sample Task Backlog

CI/CD Benchmark – Mockup company

SUMMARY REPORT AND BACKLOG
PER HAGSTEN

Author: Per Hagsten Date: 2017-11-08

Table of Contents

• Background
• High priority activities - in ranking, most important first
• Medium-term activities - without mutual order
• Activities with lower priority - without mutual order
• R2M's comments on the result


Background

During September and October 2017, a number of workshops were held with mockup company. The purpose has been to identify ways to improve and further develop their CI / CD environment. By taking a stand on about two hundred claims about the CI / CD environment, a current situation and an ought-to-be situation have been identified. These claims are typical indicators of a good and well-functioning development environment.

A few examples:

• The building system is deterministic. Two consecutive builds give exactly the same result.

• All differences between deployment to test environments and deployment to the production environment are known and managed by the CI / CD system. Deployment to production is therefore trivial.

The claims concern fifteen different areas with a direct impact on the CI / CD environment:

1. CM methodology
2. Building Systems
3. Task-based development
4. Tools Integration
5. CI and CD
6. Product Deployment
7. Unit Test
8. Integration and System Test
9. Acceptance Test
10. Performance and Destructive Test
11. Security
12. Logging and Feedback
13. Installation and Upgrade
14. Architecture
15. Corporate Culture

Statements of agreement have been made on the following scale:

• Disagree.
• Disagree largely.
• Disagree somewhat.
• Agree somewhat.
• Agree largely.
• Totally agree.


The current situation and the ought-to-be situation are described in Figure 1 and Table 1. Based on the differences between the current situation and the ought-to-be situation, a so-called gap analysis was made, in which a number of activities were identified as necessary to bridge the gaps and thus achieve the desired ought-to-be situation.

[Figure 1: Results from the gap analysis. Radar chart "CI/CD/DevOps Analysis" with the series Current situation, Ought to be and Max (scale 0-10) over the fifteen benchmark categories.]


Ability                            Current    Ought to be    Difference
CM methodology                       7,263          9,121         1,858
Building Systems                     5,623          7,132         1,509
Task-based development               8,630          8,711         0,081
Tools Integration                    7,563          8,524         0,961
CI and CD                            9,123          9,200         0,077
Product Deployment                   8,151          9,123         0,972
Unit Test                            4,123          8,354         4,231
Integration and System Test          3,689          8,457         4,768
Acceptance Test                      2,456          4,521         2,065
Performance and Destructive Test     1,257          6,328         5,071
Security                             8,634          9,214         0,580
Logging and Feedback                 5,237          6,000         0,763
Installation and Upgrade             8,563          8,563         0,000
Architecture                         7,234          8,672         1,438
Corporate Culture                    9,635          9,990         0,355

Table 1: Figure data
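As an illustration of the arithmetic behind the gap analysis, the sketch below (my own, not R2M's tooling) recomputes the Difference column for a subset of Table 1 and lists the categories with the largest gaps first.

    # Illustration only: recomputing the gap ("Difference") column for a
    # subset of Table 1 above and ranking categories by gap size.
    table_1 = {  # category: (current, ought to be)
        "Unit Test": (4.123, 8.354),
        "Integration and System Test": (3.689, 8.457),
        "Performance and Destructive Test": (1.257, 6.328),
        "Acceptance Test": (2.456, 4.521),
        "Installation and Upgrade": (8.563, 8.563),
    }

    gaps = {cat: ought - current for cat, (current, ought) in table_1.items()}
    for cat, gap in sorted(gaps.items(), key=lambda kv: -kv[1]):
        print(f"{gap:5.3f}  {cat}")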

Identified improvement activities

Below, the identified improvement activities are summarized. All activities aim to optimize the CI / CD environment and help achieve the desired ought-to-be situation. The activities are divided into three priority categories: low, medium and high. High priority activities should of course be prioritized and performed before activities with lower priority. In addition, the high priority activities are ranked with the most important first.

For each activity, the attributes work effort and impact are estimated. Work effort means the size of the work required to implement the change. Impact means how big the positive contribution to the desired ought-to-be situation is expected to be. Both attributes are described on the scale "small, medium, large". The measure is relative and should not be interpreted or translated into absolute monetary values or man hours. The purpose of the attributes is only to give a pointer to what is likely to produce the most effect in relation to the investment made.

High priority activities - in ranking, most important first

1. Title: Implement automatic performance testing.
   a. Description: Today mockup company conducts performance testing manually, which is very positive. However, R2M sees a strong learning culture within mockup company and a strong desire to improve at performance testing. R2M therefore deems that mockup company should investigate how to implement automatic performance testing as a part of their CI/CD pipeline.
   b. Work effort (small / medium / large): Large
   c. Impact (small / medium / large): Large
   d. Backlog ID: 1, Performance and Destructive Test


2. Title: Work on creating more automatic integration tests.
   a. Description: Mockup company has a few automatic integration and system tests in their pipeline today. However, R2M deems that almost every manual integration test performed today could be automated, since they are repeated tasks which do not differ much between testing rounds.
   b. Work effort (small / medium / large): Medium
   c. Impact (small / medium / large): Large
   d. Backlog ID: 2, System and Integration test

Medium-term activities - without mutual order

1. Title: Develop unit tests that test abuse stories.
   a. Description: Today mockup company does a lot of automatic unit testing, which is very positive. However, most of these tests are "happy path" tests which do not take real-world behavior into account. R2M deems that "abuse story" tests should be written as well, to accommodate for user behavior in the real world. This would help mockup company to know that their application is robust and secure.
   b. Work effort (small / medium / large): Small
   c. Impact (small / medium / large): Large
   d. Backlog ID: 3, Unit test

2. Title: Add automatic unit testing.
   a. Description: Today, acceptance testing is done manually at mockup company, even for basic tests. R2M deems that acceptance tests that test "hard" and "solid" attributes can be entirely automated to save time. This will also give product owners assurance that the product meets the demands of the customer.
   b. Work effort (small / medium / large): Large
   c. Impact (small / medium / large): Large
   d. Backlog ID: 4, Acceptance Test

Activities with lower priority - without mutual order

1. Title: Document a definition of done.
   a. Description: Today, the definition of done for a task is a silent agreement between the developers and the product owner. This causes issues when the customer does not agree that a feature is done. To mitigate this issue, R2M thinks that the definition of done should be documented, so that there is no room for disagreement between the developers, the product owner and the customer.
   b. Work effort (small / medium / large): Small
   c. Impact (small / medium / large): Medium
   d. Backlog ID: 5, CM Methodology


2. Title: Gather more feedback from the customer during the development process.
   a. Description: Today, feedback about the product is only gathered at the end of a project. R2M thinks it is better for mockup company to gather feedback as the project goes along instead. That way mockup company can be assured that their work is heading in the right direction, which will alleviate the risks of long-term application development.
   b. Work effort (small / medium / large): Small
   c. Impact (small / medium / large): Medium
   d. Backlog ID: 6, Corporate Culture

R2M's comments on the result

A CI / CD environment and a CI / CD process span, by definition, the entire organization. The majority of the identified gaps between the current and future positions are largely caused by imbalances between different teams: different modes of work, differences in maturity and experience, differences in communication and reporting, etc. Improvement activities can therefore often not be carried out by the teams themselves. Change requires a management commitment. The persons and groups assigned to address the improvements need both the resources and the mandate to enforce team-wide changes.


A.3 Sample Feedback Material

Author: Per Hagsten Date: 2017-09-01

Feedback on participation in the CI / CD benchmark. Mockup company.

I would like to start by thanking you for being a test company in my master’s degree project! Your feedback on the survey and the backlog will be an important part of determining the quality of my degree project, so I would like to ask you to fill out the following feedback form.

1. Backlog items

Please answer these questions according to this scale:

Totally Disagree. 0

Disagree largely. 1

Somewhat disagree. 2

Neither agree nor disagree. 3

Somewhat agree. 4

Agree largely. 5

Totally agree. 6

a. We experience that the presented backlog items are insightful and relevant to our organization's development opportunities.

Heading in Backlog                                                        Answer (Number)
Implement automatic performance testing.                                  6
Work on creating more automatic integration tests.                        4
Develop unit tests that test abuse stories.                               6
Add automatic unit testing.                                               2
Document a definition of done.                                            5
Gather more feedback from the customer during the development process.    0

b. We think that the presented backlog items are clear and concrete.

Heading in Backlog                                                        Answer (Number)
Implement automatic performance testing.                                  4
Work on creating more automatic integration tests.                        2
Develop unit tests that test abuse stories.                               6
Add automatic unit testing.                                               2
Document a definition of done.                                            2
Gather more feedback from the customer during the development process.    3



c. We think that the presented backlog items are doable for us.

Heading in Backlog                                                        Answer (Number)
Implement automatic performance testing.                                  6
Work on creating more automatic integration tests.                        2
Develop unit tests that test abuse stories.                               6
Add automatic unit testing.                                               4
Document a definition of done.                                            5
Gather more feedback from the customer during the development process.    3

2. What did you think about the layout of the study? Was something missing or was something superfluous?
a. We thought it covered just about everything.

3. Do you have any comments on the questions in the benchmark or the structure of the benchmark?
a. It was quite long.

4. Did you experience that the questions in the benchmark matched your definition of CI / CD and DevOps? If no, what do you think should be changed?
a. Yes, absolutely nothing was missing.

5. Do you have any other comments? Tips, ideas or praise?
a. Great study! Good luck on your master thesis!



A.4 Results Company One

Author: Per Hagsten Date: 2017-10-31

[Figure 1: Results from the gap analysis. Radar chart "CI/CD/DevOps Analysis" with the series Current, Ought to be and Max (scale 0-10) over the fifteen benchmark categories.]

Ability                            Current    Ought to be    Difference
CM methodology                       7,286          8,286         1,000
Building Systems                     6,727          8,909         2,182
Task-based development               7,500          8,167         0,667
Tools Integration                    4,333          5,667         1,333
CI and CD                            6,400          7,100         0,700
Product Deployment                   5,778          7,333         1,556
Unit Test                            5,400          8,000         2,600
Integration and System Test          3,833          5,167         1,333
Acceptance Test                      2,000          5,273         3,273
Performance and Destructive Test     2,000          3,143         1,143
Security                             5,091          6,727         1,636
Logging and Feedback                 6,000          7,286         1,286
Installation and Upgrade             6,167          6,500         0,333
Architecture                         7,000          7,600         0,600
Corporate Culture                    8,429          9,000         0,571

Table 1: Figure data


A.4.1 Feedback Company One

Author: Per Hagsten Date: 2017-11-01

Feedback on participation in the CI / CD benchmark.

I would like to start by thanking you for being a test company in my master's degree project! Your feedback on the survey and the backlog will be an important part of determining the quality of my degree project, so I would like to ask you to fill out the following feedback form.

1. Backlog items

Please answer these questions according to this scale:

Totally disagree. 0
Disagree largely. 1
Somewhat disagree. 2
Neither agree nor disagree. 3
Somewhat agree. 4
Agree largely. 5
Totally agree. 6

a. We experience that the presented backlog items are insightful and relevant to our organization's development opportunities.

Heading in Backlog    Answer (Number)
Backlog item 1        6
Backlog item 2        6
Backlog item 3        4
Backlog item 4        4
Backlog item 5        5
Backlog item 6        6
Backlog item 7        3
Backlog item 8        5
Backlog item 9        5
Backlog item 10       6
Backlog item 11       5
Backlog item 12       5
Backlog item 13       4
Backlog item 14       5



Backlog item 15       6
Backlog item 16       6
Backlog item 17       5
Backlog item 18       4
Backlog item 19       6
Backlog item 20       5
Backlog item 21       5
Backlog item 22       6
Backlog item 23       6
Backlog item 24       3

b. We think that the presented backlog items are clear and concrete.

Heading in Backlog    Answer (Number)
Backlog item 1        6
Backlog item 2        6
Backlog item 3        6
Backlog item 4        6
Backlog item 5        6
Backlog item 6        6
Backlog item 7        6
Backlog item 8        6
Backlog item 9        6
Backlog item 10       6
Backlog item 11       6
Backlog item 12       6
Backlog item 13       6
Backlog item 14       6
Backlog item 15       6
Backlog item 16       6
Backlog item 17       6
Backlog item 18       6
Backlog item 19       6
Backlog item 20       6
Backlog item 21       6
Backlog item 22       6
Backlog item 23       6
Backlog item 24       6



c. We think that the presented backlog items are relevant and doable for us.

Heading in Backlog    Answer (Number)
Backlog item 1        5
Backlog item 2        6
Backlog item 3        3
Backlog item 4        6
Backlog item 5        5
Backlog item 6        5
Backlog item 7        3
Backlog item 8        4
Backlog item 9        4
Backlog item 10       5
Backlog item 11       4
Backlog item 12       5
Backlog item 13       4
Backlog item 14       5
Backlog item 15       5
Backlog item 16       4
Backlog item 17       4
Backlog item 18       4
Backlog item 19       5
Backlog item 20       5
Backlog item 21       3
Backlog item 22       5
Backlog item 23       6
Backlog item 24       4

2. What did you think about the layout of the study? Was something missing or was something superfluous?
a. It was more than complete, but not too much.

3. Do you have any comments on the questions in the benchmark or the structure of the benchmark?
a. XXX believes that an open discussion about the ought-to-be situation can give a lot. It maybe does not belong in this exercise, but XXX believes XXX would want it if XXX had ordered the evaluation as a real client. An "expert" to argue against would be beneficial: what says that XXX's interpretation of the ought-to-be situation is the right one?



4. Did you experience that the questions in the benchmark matched your definition of CI / CD and DevOps? If no, what do you think should be changed?
a. Yes.

5. Do you have any other comments? Tips, ideas or praise?
a. Good job and great that XXX could join!



A.5 Results Company Two

A.5.1 Company Two, Project One

Author: Per Hagsten Date: 2017-11-02

[Figure 1: Results from the gap analysis. Radar chart "CI/CD/DevOps Analysis" with the series Current, Ought to be and Max (scale 0-10) over the fifteen benchmark categories.]

Ability                            Current    Ought to be    Difference
CM methodology                       7,286          8,000         0,714
Building Systems                     9,091          9,636         0,545
Task-based development               8,000          9,000         1,000
Tools Integration                    7,714          8,286         0,571
CI and CD                            8,100          8,600         0,500
Product Deployment                   7,333          7,333         0,000
Unit Test                            7,800          8,200         0,400
Integration and System Test          6,400          7,636         1,236
Acceptance Test                      2,800          4,800         2,000
Performance and Destructive Test     7,714          8,143         0,429
Security                             5,400          7,400         2,000
Logging and Feedback                 7,714          8,286         0,571
Installation and Upgrade             7,636          8,000         0,364
Architecture                         8,000          8,000         0,000
Corporate Culture                    6,769          8,000         1,231

Table 1: Figure data


A.5.2 Feedback Company Two, Project One

Author: Per Hagsten Date: 2017-11-01

Feedback on participation in the CI / CD benchmark.

I would like to start by thanking you for being a test company in my master's degree project! Your feedback on the survey and the backlog will be an important part of determining the quality of my degree project, so I would like to ask you to fill out the following feedback form.

1. Backlog items.

Please answer these questions according to this scale:

Totally disagree. 0
Disagree largely. 1
Somewhat disagree. 2
Neither agree nor disagree. 3
Somewhat agree. 4
Agree largely. 5
Totally agree. 6

a. We experience that the presented backlog items are insightful and relevant to our organization's development opportunities.

NOTE: 1c is baked into this table.

Heading in Backlog    Answer (Number)
Backlog item 1        6
Backlog item 2        4
Backlog item 3        3
Backlog item 4        4
Backlog item 5        4
Backlog item 6        5
Backlog item 7        3
Backlog item 8        6
Backlog item 9        6
Backlog item 10       4
Backlog item 11       6



Backlog item 12       5
Backlog item 13       4
Backlog item 14       4
Backlog item 15       4
Backlog item 16       3
Backlog item 17       5
Backlog item 18       6
Backlog item 19       5
Backlog item 20       3
Backlog item 21       4
Backlog item 22       2

b. We think that the presented backlog items are clear and concrete.

Heading in Backlog    Answer (Number)
Backlog item 1        6
Backlog item 2        2
Backlog item 3        5
Backlog item 4        5
Backlog item 5        4
Backlog item 6        4
Backlog item 7        3
Backlog item 8        6
Backlog item 9        6
Backlog item 10       5
Backlog item 11       2
Backlog item 12       5
Backlog item 13       5
Backlog item 14       2
Backlog item 15       2
Backlog item 16       5
Backlog item 17       3
Backlog item 18       5
Backlog item 19       6
Backlog item 20       5
Backlog item 21       2
Backlog item 22       5

c. We think that the presented backlog items are doable for us.

(Same answers as in 1a.)



Heading in Backlog    Answer (Number)
Backlog item 1        6
Backlog item 2        4
Backlog item 3        3
Backlog item 4        4
Backlog item 5        4
Backlog item 6        5
Backlog item 7        3
Backlog item 8        6
Backlog item 9        6
Backlog item 10       4
Backlog item 11       6
Backlog item 12       5
Backlog item 13       4
Backlog item 14       4
Backlog item 15       4
Backlog item 16       3
Backlog item 17       5
Backlog item 18       6
Backlog item 19       5
Backlog item 20       3
Backlog item 21       4
Backlog item 22       2

2. What did you think about the layout of the study? Was something missing or was something superfluous?
a. Answered orally at a meeting.

3. Do you have any comments on the questions in the benchmark or the structure of the benchmark?
a. Answered orally at a meeting.

4. Did you experience that the questions in the benchmark matched your definition of CI / CD and DevOps? If no, what do you think should be changed?
a. Answered orally at a meeting.

5. Do you have any other comments? Tips, ideas or praise?
a. Answered orally at a meeting.



A.5.3 Company Two, Project Two

Author: Per Hagsten Date: 2017-11-02

[Figure 1: Results from the gap analysis. Radar chart "CI/CD/DevOps Analysis" with the series Current, Ought to be and Max (scale 0-10) over the fifteen benchmark categories.]

Ability                            Current    Ought to be    Difference
CM methodology                       7,429          8,286         0,857
Building Systems                     8,727          9,818         1,091
Task-based development               7,333          9,000         1,667
Tools Integration                    7,429          8,286         0,857
CI and CD                            7,700          8,700         1,000
Product Deployment                   7,778          8,444         0,667
Unit Test                            7,200          8,200         1,000
Integration and System Test          5,273          7,636         2,364
Acceptance Test                      2,400          4,800         2,400
Performance and Destructive Test     1,714          8,143         6,429
Security                             4,800          7,400         2,600
Logging and Feedback                 7,429          8,286         0,857
Installation and Upgrade             6,667          8,167         1,500
Architecture                         7,000          8,000         1,000
Corporate Culture                    6,154          8,000         1,846

Table 1: Figure data



A.5.4 Feedback Company Two, Project Two

Author: Per Hagsten Date: 2017-11-01

Feedback on participation in the CI / CD benchmark.

I would like to start by thanking you for being a test company in my master's degree project! Your feedback on the survey and the backlog will be an important part of determining the quality of my degree project, so I would like to ask you to fill out the following feedback form.

1. Backlog items.

Please answer these questions according to this scale:

Totally disagree. 0
Disagree largely. 1
Somewhat disagree. 2
Neither agree nor disagree. 3
Somewhat agree. 4
Agree largely. 5
Totally agree. 6

a. We experience that the presented backlog items are insightful and relevant to our organization's development opportunities.

Heading in Backlog    Answer (Number)
Backlog item 1        5
Backlog item 2        5
Backlog item 3        4
Backlog item 4        5
Backlog item 5        6
Backlog item 6        4
Backlog item 7        3
Backlog item 8        4
Backlog item 9        5
Backlog item 10       5
Backlog item 11       6
Backlog item 12       6
Backlog item 13       6
Backlog item 14       4



Backlog item 15       6
Backlog item 16       6
Backlog item 17       4
Backlog item 18       6
Backlog item 19       1
Backlog item 20       5
Backlog item 21       3
Backlog item 22       3
Backlog item 23       4
Backlog item 24       1
Backlog item 25       3

b. We think that the presented backlog items are clear and concrete.

Heading in Backlog    Answer (Number)
Backlog item 1        5
Backlog item 2        5
Backlog item 3        4
Backlog item 4        3
Backlog item 5        6
Backlog item 6        2
Backlog item 7        5
Backlog item 8        4
Backlog item 9        4
Backlog item 10       3
Backlog item 11       3
Backlog item 12       6
Backlog item 13       6
Backlog item 14       5
Backlog item 15       3
Backlog item 16       6
Backlog item 17       3
Backlog item 18       5
Backlog item 19       2
Backlog item 20       5
Backlog item 21       5
Backlog item 22       5
Backlog item 23       5
Backlog item 24       5
Backlog item 25       3



c. We think that the presented backlog items are doable for us.

Heading in Backlog    Answer (Number)
Backlog item 1        5
Backlog item 2        5
Backlog item 3        4
Backlog item 4        5
Backlog item 5        6
Backlog item 6        4
Backlog item 7        3
Backlog item 8        4
Backlog item 9        5
Backlog item 10       5
Backlog item 11       6
Backlog item 12       6
Backlog item 13       6
Backlog item 14       4
Backlog item 15       6
Backlog item 16       6
Backlog item 17       4
Backlog item 18       6
Backlog item 19       1
Backlog item 20       5
Backlog item 21       3
Backlog item 22       3
Backlog item 23       4
Backlog item 24       1
Backlog item 25       3

2. What did you think about the layout of the study? Was something missing or was something superfluous?
a. Answered orally at a meeting.

3. Do you have any comments on the questions in the benchmark or the structure of the benchmark?
a. Answered orally at a meeting.

4. Did you experience that the questions in the benchmark matched your definition of CI / CD and DevOps? If no, what do you think should be changed?
a. Answered orally at a meeting.



5. Do you have any other comments? Tips, ideas or praise?
a. Answered orally at a meeting.



A.6 Results Company Three

Author: Per Hagsten Date: 2017-11-13

[Figure 1: Results from the gap analysis. Radar chart "CI/CD/DevOps Analysis" with the series Current, Ought to be and Max (scale 0-10) over the fifteen benchmark categories.]

Ability                            Current    Ought to be    Difference
CM methodology                       8,429          9,571         1,143
Building Systems                     8,000          8,909         0,909
Task-based development               8,333          9,000         0,667
Tools Integration                    7,714          8,571         0,857
CI and CD                            6,700          8,600         1,900
Product Deployment                   5,500          7,250         1,750
Unit Test                            7,400          8,600         1,200
Integration and System Test          7,000          8,167         1,167
Acceptance Test                      3,273          4,545         1,273
Performance and Destructive Test     0,571          3,571         3,000
Security                             2,909          5,818         2,909
Logging and Feedback                 4,571          8,286         3,714
Installation and Upgrade             5,833          7,167         1,333
Architecture                         7,200          8,600         1,400
Corporate Culture                    7,000          8,000         1,000

Table 1: Figure data



A.6.1 Feedback Company Three

Author: Per Hagsten Date: 2017-11-01

Feedback on participation in the CI / CD benchmark.

I would like to start by thanking you for being a test company in my master's degree project! Your feedback on the survey and the backlog will be an important part of determining the quality of my degree project, so I would like to ask you to fill out the following feedback form.

1. Backlog items.

Please answer these questions according to this scale:

Totally disagree. 0
Disagree largely. 1
Somewhat disagree. 2
Neither agree nor disagree. 3
Somewhat agree. 4
Agree largely. 5
Totally agree. 6

a. We experience that the presented backlog items are insightful and relevant to our organization's development opportunities.

Heading in Backlog    Answer (Number)
Backlog item 1        5
Backlog item 2        6
Backlog item 3        4
Backlog item 4        4
Backlog item 5        4
Backlog item 6        5
Backlog item 7        5
Backlog item 8        3
Backlog item 9        3
Backlog item 10       6
Backlog item 11       5
Backlog item 12       6
Backlog item 13       4
Backlog item 14       3
Backlog item 15       6
Backlog item 16       5
Backlog item 17       6


Backlog item 18       3
Backlog item 19       3
Backlog item 20       5
Backlog item 21       6
Backlog item 22       5
Backlog item 23       3
Backlog item 24       4
Backlog item 25       5
Backlog item 26       5

b. We think that the presented backlog items are clear and concrete.

Heading in Backlog    Answer (Number)
Backlog item 1        5
Backlog item 2        4
Backlog item 3        6
Backlog item 4        3
Backlog item 5        5
Backlog item 6        6
Backlog item 7        6
Backlog item 8        6
Backlog item 9        5
Backlog item 10       5
Backlog item 11       5
Backlog item 12       4
Backlog item 13       5
Backlog item 14       3
Backlog item 15       5
Backlog item 16       5
Backlog item 17       3
Backlog item 18       3
Backlog item 19       4
Backlog item 20       4
Backlog item 21       6
Backlog item 22       2
Backlog item 23       3
Backlog item 24       6
Backlog item 25       3
Backlog item 26       3

c. We think that the presented backlog items are doable for us.

Heading in Backlog    Answer (Number)
Backlog item 1        4
Backlog item 2        6
Backlog item 3        6
Backlog item 4        3
Backlog item 5        4



Backlog item 6        3
Backlog item 7        2
Backlog item 8        2
Backlog item 9        4
Backlog item 10       4
Backlog item 11       4
Backlog item 12       3
Backlog item 13       6
Backlog item 14       4
Backlog item 15       6
Backlog item 16       4
Backlog item 17       5
Backlog item 18       3
Backlog item 19       4
Backlog item 20       3
Backlog item 21       6
Backlog item 22       2
Backlog item 23       5
Backlog item 24       4
Backlog item 25       3
Backlog item 26       2

2. What did you think about the layout of the study? Was something missing or was something superfluous?
a. I think the layout of the study was good and felt comprehensive, at least for me, who has not studied CI / CD at an academic level.

3. Do you have any comments on the questions in the benchmark or the structure of the benchmark?
a. The benchmark was well illustrated with the spider diagram, which gives a quick overview of the result. In this diagram, I would appreciate it if the average for all respondents in the study were also shown.

4. Did you experience that the questions in the benchmark matched your definition of CI / CD and DevOps? If no, what do you think should be changed?
a. Yes.

5. Do you have any other comments? Tips, ideas or praise?
a. I think the interviews gave a good understanding of CI / CD, of how dependent different teams within IT are on each other, and of how the composition of a team should look to achieve a good result.
