<<

MASARYK UNIVERSITY
Faculty of Informatics

Continual and long-term improvement in agile development

DIPLOMA THESIS

Brno 2015 Bc. Juraj Šoltés

Declaration

I declare that the work in this thesis is my own authorial work, which I have worked out on my own. All sources, references and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Juraj Šoltés

Advisor: doc. RNDr. Tomáš Pitner, Ph.D.


Acknowledgment

I would like to express my gratitude to everyone who supported me throughout the course of this thesis. My special thanks belong to my advisor, who helped me with the overall idea and structure of my diploma thesis. I would also like to thank my family and friends for material and psychological support, Kentico Software for the opportunity to write my diploma thesis under its roof, my colleagues who helped me become a Scrum Master, and the On-line Marketing team for their patience with me.


Abstract

This diploma thesis deals with improvement in agile teams and the ways in which it can be measured and tracked. The work puts emphasis on choosing proper metrics and working with them in a way that complies with the principles behind the Agile Manifesto. The metrics then serve as a basis for creating performance indicators for chosen goals. Several representative indicators for measuring the agile process are created, specifically in the areas of collaboration and predictability. These are then used to analyze the data collected during two releases at Kentico Software. The results are compared with the outcomes of the team's retrospectives. The reliability of the indicators is discussed, and both process improvements and improvements in the usage of the indicators are suggested.

Keywords

Agile methods, agile metrics, performance indicators, process optimization, Scrum, predictability, collaboration, efficiency, case study


Table of contents

1. Introduction ...... 3

1.1. Motivation ...... 3

1.2. Aim of the thesis ...... 4

1.3. Thesis outline ...... 4

2. Agile methods ...... 5

2.1. Kanban ...... 5

2.2. Dynamic Systems Development Method ...... 7

2.3. Adaptive software development ...... 9

2.4. Extreme programming ...... 11

2.5. Lean software development ...... 16

2.6. Scrum ...... 18

2.6.1. Principles ...... 18

2.6.2. Roles and artifacts ...... 19

2.6.3. Meetings ...... 19

2.7. Disciplined agile delivery ...... 21

3. Improvement in agile teams ...... 25

3.1. Indicators, metrics and diagnostics ...... 25

3.2. Questions to ask about metrics ...... 26

3.3. Metrics in agile software development ...... 28

3.4. Process metrics ...... 31

3.4.1. Velocity ...... 32

3.4.2. Sprint burndown chart ...... 35

3.4.3. Rate of features delivered ...... 39

3.4.4. Work in process ...... 41

3.4.5. Story cycle time ...... 43

4. Case study...... 45

4.1. Environment ...... 45

4.1.1. Company ...... 45

4.1.2. Team ...... 46


4.2. Methodology ...... 47

4.3. Results ...... 51

4.3.1. Predictability of the delivery ...... 51

4.3.2. Collaboration on the team ...... 62

4.3.3. Efficiency of the process ...... 64

5. Conclusion...... 66

5.1. Summary ...... 66

5.2. Interpretation of the results ...... 67

5.3. Future research ...... 70

5.4. Final word ...... 71

6. Bibliography ...... 72


1. Introduction

1.1. Motivation

At the beginning of this century, software engineers from all over the world took one great step towards better ways of developing software. Several people, representing different agile philosophies such as Extreme Programming, Scrum, DSDM, Adaptive Software Development, Crystal, Feature-driven development and others, met to create the Agile Manifesto. These four sentences represent the values they decided to manifest (Beck, 2001):

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

While there is value in the items on the right, the items on the left are valued more. Opposed to the heavyweight deterministic process of plan-driven development stands the lightweight, open-to-change process of agile software development. Through practical experience the IT world realized that this is how development of complex software really works. Together with the Agile Manifesto, twelve principles of agile software development were formulated. To mention only some of them:

- Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.
- Continuous attention to technical excellence and good design enhances agility.
- Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.
- The best architectures, requirements, and designs emerge from self-organizing teams.

They express what is really important in agile – to deliver valuable software of great quality at a sustainable pace. This can only be accomplished in the highly collaborative environment of a self-organizing team. These principles put emphasis on four dimensions of agile software development – value to the customer, quality of the product, predictability of the delivery and collaboration of the team. In order to get better, we need to track how we move forward in these dimensions and base our decisions on them. And this is the basis for continual and long-term improvement in agile software development.

The Agile Manifesto describes the process of continual improvement in one of its principles: "At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly." It is really important for the team to know whether the decisions made are good and whether it is moving forwards, backwards or stagnating in the same place. The new agile ways of working need a new type of metrics. The literature on this topic has started to emerge only recently and agile metrics are still terra incognita. Moreover, sometimes the inherent agile way of process improvement may not be satisfactory. The importance of having hard data at retrospectives is underlined by the fact that the lack of an agile process tracking mechanism may be one of the failure factors in agile software development (Chow, 2008).

1.2. Aim of the thesis

The aim of this thesis is to contribute to the optimization of agile development in Kentico Software s.r.o. and to suggest a set of useful metrics for measuring the improvement of the work of the team. This should be done in compliance with the principles of the Agile Manifesto and the Scrum principles of transparency, inspection and adaptation. The goal of the first part of the thesis is to conduct comprehensive research on metrics in agile software development with emphasis on process-related metrics. The goal of the second part of the thesis – the case study – is to use the metrics to develop indicators suitable for the environment of Kentico Software s.r.o. and to decide on possible optimizations of the process in the team.

1.3. Thesis outline

The thesis consists of four parts. In the first part, agile methods and agile concepts are introduced. The second part is about improvement in agile teams and the metrics that can be used to measure it. Both the first and the second part of the thesis are based on a comprehensive study of recent literature. On this foundation, best practices and a set of vital metrics are proposed. These metrics are then used to form seven indicators for measuring predictability and collaboration in an agile team. These are then examined in the company environment of Kentico Software. The case study was conducted during two minor releases (8.1 and 8.2) in an independent cross-functional team consisting of ten people. The study describes the usage of tracking mechanisms in the company and in the team. Later, the analysis of the seven indicators is conducted and compared with the team retrospectives. The results and suggested optimizations are presented. In the last part the results are summarized and possible applications of the analyzed indicators are suggested. The indicators are put in a broader context by examining different points of view and the long-term results.


2. Agile methods

The Agile Manifesto was signed by representatives of different agile methods – Extreme programming, Scrum, DSDM, Adaptive Software Development, Crystal, Feature-Driven Development and Pragmatic Programming. However, a lot has changed since the times of the Agile Manifesto and today the family of agile methods is broader. Since those days, several other methods have been introduced – Agile Modeling, Lean software development, Kanban, Scrum-ban and Disciplined agile delivery. In this chapter I am going to write about seven agile methods – Kanban (Kniberg, 2010), Dynamic Systems Development Method (Abrahamson, 2002), Adaptive software development (Highsmith, 2002; Abrahamson, 2002), Extreme programming (Beck, 1999; Abrahamson, 2002; Kniberg, 2007), Lean software development (Ballé, 2005; Poppendieck, 2003), Scrum (Schwaber, 2013) and Disciplined agile delivery (Ambler, 2013).

The agile methods have a lot in common. All of them are empiric-based, iterative, concentrate on delivering features early and often, welcome change during the process and require customer involvement. They are people-centric (instead of process-centric) and consider teams to be complex systems with emerging self-organizing patterns. Minimizing the damage of failure by making things as simple as possible allows teams to learn instead of being punished. Abrahamson (2002) describes 4 major characteristics of agile methods. They are: incremental (they have small software releases with rapid cycles), cooperative (customer and developers are constantly working together and communicating closely), straightforward (the method itself is easy to learn and to modify and it is well documented) and adaptive (it is possible to make last-moment changes).

Even though the methods overlap in most areas, every method has its own flavor and perhaps adds something where other methods leave a gap. As stated by the "fathers" of the Agile Manifesto (Fowler, 2001): "While the group believes that a set of common purposes and principles will benefit the users of agile methods, we are equally adamant that variety and diversity of practices are necessary". We can find a useful win-win in implementing a combination of the methods and, thanks to the wide range of them, we have the possibility to choose which.

2.1. Kanban

Kanban is an agile methodology that originated in Japan from the Toyota production system. Kanban is a Japanese word that means "signal card". The signal card is represented by the cards on the Kanban board, which is used to visualize the flow of work. Kanban is a very flexible methodology; it has only three rules to follow:

1. visualize the flow
2. limit work in progress
3. maximize throughput

To visualize the flow, Kanban uses the technique known as the Kanban board. The Kanban board shows the type of work and the current state of the work in the workflow. Here is an example of a Kanban board:

[Figure 2.1 shows an example Kanban board with the columns backlog, in progress, done, deploy and in production (with WIP limits of 3 and 2 on two of the columns), and with rows for project A, project B and project C containing tasks 1 to 7.]

Figure 2.1: Kanban board

The rows with the headlines Project A, Project B and Project C in Fig. 2.1 represent the current projects on which the team is working. They are usually ordered according to priority. The columns represent the different stages of the work. The team can add as many columns as seems appropriate.

The second rule is to limit work in progress. The constraint is provided by a number next to the current workflow state. The work in progress limit is the maximum number of cards in the given state (column). When the maximum number is reached, nobody should exceed it by putting another card there; instead, everybody should collaborate on removing work from the designated column. In order for the WIP limit to work, the items on the Kanban board should be roughly of the same size or, if not, a different technique of limiting the work should be used (story points instead of the number of issues).
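As a small illustration of this rule, the following Python sketch shows how pulling a card into a column can be refused once its WIP limit is reached. The column names, limits and task names are invented for the example and are not taken from any concrete tool.

class KanbanColumn:
    def __init__(self, name, wip_limit=None):
        self.name = name
        self.wip_limit = wip_limit  # None means the column has no limit
        self.cards = []

    def can_accept(self):
        """A new card may enter only if the WIP limit would not be exceeded."""
        return self.wip_limit is None or len(self.cards) < self.wip_limit

    def pull(self, card, source):
        """Pull a card from the previous column, respecting the WIP limit."""
        if not self.can_accept():
            raise RuntimeError(f"WIP limit {self.wip_limit} reached in '{self.name}' "
                               "- finish work here before pulling more")
        source.cards.remove(card)
        self.cards.append(card)

backlog = KanbanColumn("backlog")
in_progress = KanbanColumn("in progress", wip_limit=3)
backlog.cards += ["task 1", "task 2", "task 3", "task 4"]
in_progress.pull("task 1", backlog)   # succeeds, 1 of 3 slots used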


Maximizing the throughput is achieved by measuring the cycle time (or lead time) and optimizing the process based on this metric. The lead time clock starts when the request is made and ends at delivery. The cycle time clock starts when work begins on the request and ends when the item is ready for delivery. Cycle time is a more mechanical measure of process capability. Lead time is what the customer sees (Joyce, 2009).
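The difference between the two can be illustrated with a short Python sketch; the ticket timestamps below are made-up values used only to show the calculation.

from datetime import datetime

ticket = {
    "requested": datetime(2015, 3, 2),   # customer files the request
    "started":   datetime(2015, 3, 10),  # the team begins working on it
    "delivered": datetime(2015, 3, 16),  # the item is ready for delivery
}

lead_time = ticket["delivered"] - ticket["requested"]   # what the customer sees
cycle_time = ticket["delivered"] - ticket["started"]    # process capability

print(f"lead time: {lead_time.days} days, cycle time: {cycle_time.days} days")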

Apart from these three rules, you can customize the process as you wish. It is possible that several teams work over the same Kanban board. Planning, retrospective and release events can have different cadence (e.g. planning - 2 weeks, retrospective - 4 weeks, release - 1 week), some of them can be even triggered by events (e.g. planning meeting is triggered every time the WIP limit in the “backlog” column is exceeded).

2.2. Dynamic Systems Development Method

Dynamic Systems Development Method (DSDM) originated in 1994 in the UK. It was created as a framework for RAD (Rapid Application Development) and it is both a method and a framework of controls for RAD. It is maintained by the DSDM Consortium – a group of companies using DSDM. The method is more suitable for business domains than for engineering or scientific applications. The planning is based on the idea that instead of fixing functionality and then time and resources, it is better to do it the other way around. DSDM defines 9 practices (or, as DSDM calls them, principles):

- Active user involvement is imperative.
- DSDM teams must be empowered to make decisions.
- The focus is on frequent delivery of products.
- Fitness for business purpose is the essential criterion for acceptance of deliverables.
- Iterative and incremental development is necessary to converge on an accurate business solution.
- All changes during development are reversible.
- Requirements are base-lined at a high level.
- Testing is integrated throughout the lifecycle.
- A collaborative and cooperative approach shared by all stakeholders is essential.

The process consists of 5 phases:

1. feasibility study phase
2. business study phase
3. functional model iteration phase
4. design and build iteration phase
5. implementation phase

The first two phases happen only once during a project, while the other three are iterative and incremental and can be repeated several times. In DSDM, iterations are understood as time-boxes, meaning every iteration ends when the time reserved for the iteration runs out.

The feasibility study phase serves for deciding on the suitability of using DSDM for the project. The main criteria are the type of project and organizational and people issues. The outcomes of this phase are the feasibility report, an outline plan for development and optionally also a fast prototype of the product. This phase usually does not last longer than a few weeks.

The business study phase consists of workshops with customer experts in which decisions about all relevant facets and priorities of the development are made. The outcomes of this phase are the Business Area Definition, consisting of business processes (like ER diagrams, business object models, etc.) and user classes. Also the System Architecture Definition (a first sketch of the system architecture) and the Outline Prototyping Plan (prototyping strategy and plan for configuration management) are produced.

The Functional Model is the main outcome of the functional model iteration phase. The Functional Model consists of the prototype code and the analysis models. Other outcomes are: Prioritized functions – a prioritized set of functions delivered at the end of the iteration; Functional prototyping review documents – users' comments on the model; Non-functional requirements; and a Risk analysis of further development.

The design and build iteration phase is the part where the system is mainly built. The outcome of this phase is the Tested system.

In the implementation phase, the system is transferred from the development environment to the actual production environment. The user is trained in using the system and the system is handed over to her. The outcome of this phase is primarily the User Manual and also the Project Review Report, which consists of the outcome of the project and the decision on further development. There are four possible scenarios of further development. If the system fulfills all requirements, no further work is needed. If a substantial amount of requirements has been left aside, the whole process runs through again. If less-critical functionality is omitted, the process runs again from the functional model iteration phase. If technical issues are not addressed due to lack of time, the process iterates from the design and build iteration phase.

DSDM defines 15 roles. I will mention only the dominant ones. From the technical roles, there are developers, senior developers and the technical coordinator. The seniority of the developers is based on their experience in the tasks they perform. These roles contain all development staff (analysts, designers, programmers, testers). The technical coordinator defines the system architecture and is responsible for the technical quality and technical project control (use of software configuration management). Four user roles are dominant in DSDM: ambassador user, adviser user, visionary and the executive sponsor. The ambassador user is responsible for bringing the knowledge of the user community to the project and disseminating information about the progress of the project to other users, and thus provides developers with user feedback. Since it is hard for just one person to bring in all the users' views, the adviser user also helps. The adviser user is usually somebody from the users' side (IT staff, financial advisors). The visionary ensures that the project goes according to the business objectives. Finally, the executive sponsor is someone who has the financial authority and responsibility and therefore the power to make decisions. The team (consisting of both technical and user roles) has a size from 2 to 6 people. It is possible for many teams to collaborate on one project.

2.3. Adaptive software development

Adaptive software development (ASD) was developed by James A. Highsmith and published in 2000. ASD is a descendant of RAD (RADical software development). ASD focuses mainly on the problems of large complex systems. The main philosophy is about the balance between the chaotic and the deterministic approach. In ASD, the static plan-design-build waterfall life cycle is replaced by a dynamic speculate-collaborate-learn life cycle.

Speculate gives room to explore and acknowledges the realization that we are unsure. The team does not abandon planning; it acknowledges the reality of uncertainty. The speculate part of the project consists of 5 steps:

1. Setting the project mission and objectives, understanding constraints, establishing the project organization, identifying and outlining requirements, making initial size and scope estimates, and identifying key project risks.
2. The time-box for the entire project is established.
3. The number of iterations and the time-box for each of them are set based on the degree of uncertainty and the overall project size.
4. The team members develop the theme/objective for each of the iterations.
5. Developers and customers assign features to each iteration. The criterion is that the iterations should deliver a visible, tangible set of features to the customer. Customers decide on feature prioritization.

Collaboration is needed because for a complex application a large amount of information needs to be collected and analyzed, and no individual can handle this by herself. The technical team delivers working software and project managers facilitate collaboration and concurrent development activities. Collaboration is fostered by trust and respect. Teams must collaborate on technical problems, business requirements and rapid decision making.

Once we understand we are fallible, the learn part becomes vital. We should focus on learning regularly after every iteration, through project retrospectives, customer focus groups and reviews. There are four categories of things to be learned at the end of each development cycle: result quality from the customer's perspective, result quality from a technical perspective, the functioning of the delivery team and the practices team members are utilizing, and the project's status. Result quality from the customer's perspective is obtained using a customer focus group. This session is designed to explore a working model of the application and record customer change requests. Result quality from a technical perspective is typically assessed using periodic technical reviews/pair programming sessions. Besides these continual techniques, an overall architecture review should be conducted on a weekly/per-iteration basis. The functioning of the delivery team and the practices team members are utilizing can also be called the people and process review. It is designed for the team to monitor its own performance. End-of-iteration mini-retrospectives are ideal for this purpose. The fourth category is the project status, which leads into re-planning of the new iteration. Typical questions in this stage are: Where is the project? Where is it versus the plans? Where should it be? Particularly the last question is very important, since in agile projects measurements against the project plan are not sufficient; the customer and the team must collaborate and continually ask: What have we learned so far, and does it change our perspective on where we need to go?

ASD has six basic characteristics. It is:

1. Mission focused
2. Feature based
3. Iterative
4. Time-boxed
5. Risk driven
6. Change tolerant

The mission serves to narrow the focus during the project. Without a proper mission statement and its refinement, the iterations may become cycles swinging back and forth without any visible progress. The mission statement provides a guideline for important decisions throughout the project.

ASD concentrates on delivering features. Features represent working functionality delivered to the customer during an iteration. Documents are secondary deliverables to the software features; however, the user manual, although a document, is also a feature.

Time-boxing is used for the length of iterations and the length of the project. Sometimes this practice is wrongly understood as a way to force staff into long overtime hours and to cut quality. However, these are just forms of tyranny. The real purpose of time-boxing is to focus and to force hard trade-off decisions.

Leadership-collaboration management is the management style typical for ASD. This management style contradicts the more traditional command-control management style. While commanders know the objective, leaders grasp the direction. Controllers demand, collaborators facilitate. This model asserts that in a turbulent environment, adaptation is more important than optimization. Becoming adaptive means that instead of stabilizing the process (making it repeatable), only the necessary information is transferred between processes, striving to make the transformation as algorithmic as possible. ASD tries to balance on the edge of chaos – between chaos and stability. And this is very hard; it requires enormous managerial and leadership skills. In order to make adaptation possible, it needs to happen on all organizational levels – this means a good flow of information and distributed decision making. A hierarchical structure is stabilizing, while a horizontal networking structure is adapting. The first one breeds stagnation, the second one chaos. Modern managers understand when they have to make decisions and when to distribute them. It is a delicate thing that needs to be balanced properly.

2.4. Extreme programming

Extreme programming (XP) is a method developed by Kent Beck in 1999. It evolved as a reaction to problems caused by long development cycles. It is most suitable for outsourced or in-house development of small- to medium-sized systems where requirements are vague and likely to change.


In extreme programming the system is sliced into parts called user stories. The typical waterfall phases of analysis, design, implementation and testing are compressed into these small pieces. Several user stories can fit into one iteration. The situation is explained in Fig. 2.2.

Figure 2.2: Evolution of the analysis, design, implementation and test phases in the projects (a) waterfall model (b) iterative model (c) XP (Beck, 1999)

XP consists of practices which have already been known. However, they were collected, put together and implemented to the "extreme levels". The 13 major extreme programming practices are:

1. Planning game
2. Small releases
3. Metaphor
4. Simple design
5. Tests
6. Refactoring
7. Pair programming
8. Continuous integration
9. Collective ownership
10. On-site customer
11. 40-hour weeks
12. Open workspace
13. Just the rules

1. Planning game Planning game is the collaboration of customers and developers in the planning process. Developers estimate the functionality written in the form of user stories, and based on these estimations the customer sets the scope and the timing of the release.


A user story usually has this form: "As a <type of user>, I want <some goal> so that <some reason>." It is usually followed by acceptance criteria (acceptance tests) written as a high-level test of both business logic and UI elements (Cohn, 2004). Finally, there is some additional description for the Development Team. Writing of the user stories should follow the INVEST rule (Cohn, 2004). The INVEST rule says that user stories are:

1. independent
2. negotiable
3. valuable to user
4. estimable
5. small
6. testable

The risk-driven approach in agile prescribes working on small chunks of work, so that the risk does not increase while working on a big item for a long time. It is an unwritten rule that every story in the Sprint should be smaller than ½ of the velocity. According to some sources, the size of the stories should even be between 1/6 and 1/10 of the velocity of the Sprint (Lawrence, 2012). More on the INVEST rule can be found in Mike Cohn's User Stories Applied (Cohn, 2004). The estimates are done in a unit called a story point. There are two approaches to understanding the story point. It can mean either an ideal man-day or complexity. When it means ideal man-days, the estimate can differ according to the abilities of the people estimating it. This is not a problem if everybody in the team delivers functionality at approximately the same speed. However, if the speeds differ across the team, the team has to estimate the average value of the ideal man-day and the estimate becomes abstract. The second way is to understand story points as the complexity of the user story. In this case, the story point is a relative measure and the team has to create its own referential estimation scale. However, this way the estimates do not differ between people; the only prerequisite is good technical knowledge of the problem across the team. The value of a story point is on an adjusted Fibonacci scale: 0, 1, 2, 3, 5, 8, 13, 21, 40, 100. That is because the bigger the complexity, the higher the uncertainty of the estimate. 0 is only for stories which are almost in the state of Done (e.g. there is only slight testing left without any big issues expected) and 100 is for epics – big clusters of stories that need to be broken down. The Done state is defined by the Definition of Done (DoD). It usually means that the story went through all phases of development (design, analysis, implementation, testing) and is finished. The stories are estimated during a team meeting with the customer, where the customer presents the user stories and the team estimates them using a technique called planning poker. Every time a new story is introduced, everybody in the team shows the card with her estimate at the same time. Then the highest and lowest estimates are discussed, the team asks the customer more questions or clarifies the technical solution, and then re-estimates. The process repeats until consensus is reached.
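To make the estimation mechanics concrete, here is a minimal Python sketch of one planning-poker round on the adjusted Fibonacci scale described above. The consensus rule (all estimates equal) and the example numbers are simplifying assumptions used only for illustration.

SCALE = [0, 1, 2, 3, 5, 8, 13, 21, 40, 100]

def poker_round(estimates):
    """Return the agreed estimate, or the outliers that should be discussed."""
    if any(e not in SCALE for e in estimates):
        raise ValueError("estimates must come from the agreed scale")
    if len(set(estimates)) == 1:
        return {"agreed": estimates[0]}
    return {"discuss": {"lowest": min(estimates), "highest": max(estimates)}}

print(poker_round([5, 5, 8, 3, 5]))   # lowest (3) and highest (8) are discussed
print(poker_round([5, 5, 5, 5, 5]))   # consensus reached, story estimated at 5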

2. Small releases Small releases have length from one day to one month. Parts of the system are put into production, before releasing the whole system.

3. Metaphor The shape of the system is defined by a metaphor shared between the customer and programmers.

4. Simple design Simple design means that at every moment the design runs all the tests, communicates everything the programmers want to communicate, contains no duplicate code, and has the fewest possible classes and methods.

5. Tests Programmers write unit tests before they implement the tested functionality. Customers write (or design) functional tests. Both kinds of tests are collected and are run each time new code is integrated with the existing system.

6. Refactoring The design of the system is created through the continuous practice of refactoring. Refactoring is rewriting old code that does not allow writing clean, running tests.

7. Pair programming All production code is written by two people sitting at one computer. This is called pair programming. This practice allows people to share knowledge, learn and make fewer mistakes in the process.

8. Continuous integration Continuous integration is a practice of integration of the new code within the old code after no more than a few hours. All tests must pass, or the changes are discarded.

9. Collective ownership Collective ownership means any programmer can provide any changes anywhere in the system.


10. On-site customer On-site customer sits with the team full-time.

11. 40-hour weeks No one can work a second consecutive week of overtime. Overtime is a sign of deeper problems that need to be addressed. Developers ideally work 40-hour weeks.

12. Open workspace The team works in an open workspace with no cubicles or other obstacles between the team members.

13. Just the rules Everybody who is part of the XP team signs up to follow the rules. But they are just the rules. The team can change them any time it agrees upon a way how to deal with the effects caused by the change.

Figure 2.3 explains what the XP process looks like. The process consists of four phases – the Exploration phase, the Planning phase, the Iteration release phase and the Productionizing phase. In the Exploration phase, the user stories are prepared and the architectural spike is conducted. The architecture is then presented to the customer using a system metaphor. The user stories serve as requirements for the team's release planning. Another outcome of this phase is the acceptance tests for the user stories. In the Planning phase, the release planning is done. In case the team is not confident about its estimates, spikes are conducted to reveal the complexity of the stories. The outcome of this phase is the Release plan. In the Iteration release phase the product is developed iteratively. At the end of every iteration, the increment is subjected to the acceptance tests. Acceptance tests serve as input for the next iteration and reveal bugs in the process. Once the increment passes the acceptance tests, it goes into the Productionizing phase. In this phase the product is presented to the customer and if the customer accepts it, the product is released.


Figure 2.3: The model of XP project life-cycle (Kurz, 2008)

2.5. Lean software development

Lean development originated in the Japanese automobile industry. It has been intensively studied since the 1980s. Ballé says there are four key factors behind the original Toyota lean manufacturing process (Ballé, 2005). The first is that the engineers actually care about what customers think of their product. Behind this is a strong vision of the product spread amongst everybody in the company. It also helps in making the right decisions throughout the design process. The second factor is limiting late engineering changes. The third factor is mastering the flow of drawing and tool elaboration. This really means that all key design issues are solved upfront and the production of the actual drawings is tightly scheduled and precise. The last one is the focus on the quality and cost of the process itself. Toyota developed a system for the absolute elimination of waste. Lean thinking has proven to be applicable to a wide range of industries, including the software industry. Mary and Tom Poppendieck have written a book about how the lean development philosophy complies with agile software development (Poppendieck, 2003). The book describes seven principles behind lean software development. These are:

1. Eliminate waste. Waste is anything that does not add value (as perceived by the customer) to the product. The seven wastes of lean manufacturing are: inventory, extra processing, overproduction, transportation, waiting, motion and defects. Corresponding to these, there are seven wastes of software development:


- partially done work
- extra processes
- extra features
- task switching
- waiting
- motion
- defects

In the software industry, an example of extra features is when developers code more features than are immediately needed. Handing off development from one group to another is an example of task switching. Generally speaking, whatever gets in the way of rapidly satisfying the customer is waste.

2. Amplify learning. In lean thinking, there is a difference between lean production and lean development. Lean production is an exercise in reducing variation, while lean development is an exercise in discovery. A good example of the difference is writing a recipe versus producing a dish. Software development is equivalent to writing a recipe. It is vital for the development team to produce several variations on a theme as a part of the learning process.

3. Decide as late as possible. Waiting allows making decisions based on facts, not speculation. Keeping design options open is more valuable than committing early. For this to happen, we need to build a capacity for change into the system.

4. Deliver as fast as possible. Fast delivery enhances fast feedback from the customer. The product goes through the cycle of design, implement, feedback and improve. It is important to make this cycle short. Speed allows customers to get what they need now, not what they needed yesterday. Compressing the value stream is one of the fundamental techniques of removing waste in lean development.

5. Empower the team. The people in the team understand the details, so they can make better technical and process decisions than a central authority not working so closely with the product. Also, in changing environments, decisions are made late in the process and need fast execution. It would be too late to wait for the information to go all the way up and then all the way down. Lean uses pull techniques and local signaling mechanisms as tools for empowering the team. An example of a pull technique is an agreement to deliver increasingly refined versions of working software at regular intervals. Local signaling mechanisms are charts, daily meetings, frequent integration and comprehensive testing.


6. Build integrity in. Firstly, there has to be conceptual integrity – the concepts work together as a smooth, cohesive whole. This is the basis for perceived integrity – how the integrity is perceived by the users. This means the system maintains its usefulness over time. It has a coherent architecture, high usability and fitness for purpose, and is maintainable, adaptable and extensible. Research has shown that integrity comes from wise leadership, relevant expertise, effective communication, and healthy discipline.

7. See the whole. To see the system as a whole, one needs deep expertise in many diverse areas. It is not unusual that people do not focus on the overall performance; instead, they are just trying to perfect only the part of the product that is in their area of expertise. This is also sometimes called sub-optimization.

Lean is a philosophy applicable to industries of very different scales. Lean software development is the adoption of this philosophy to the software development process. The lean software development principles can be very useful for process improvements when running any of the listed agile methods.

2.6. Scrum

Scrum is a method developed by Ken Schwaber in 1995. The aim of this chapter is to introduce Scrum, the principles behind Scrum, and its basic rules. The ground rules are written down in the "Scrum Guide" – an online guide updated on a yearly basis. These are notes taken from its latest version (Schwaber, 2013). According to the Scrum Guide, Scrum is "a framework within which people can address complex adaptive problems, while productively and creatively delivering products of the highest possible value" (Schwaber, 2013).

2.6.1. Principles

Scrum is based on empirical process control theory with three base pillars: transparency, inspection, and adaptation. Transparency means significant metrics (indicators) are visible to the responsible people. Scrum uses several tools to improve visibility – they are called artifacts. Namely, they are the Product Backlog, the Sprint Backlog and the Increment. Inspection is the activity of checking the indicators to inspect the current state of affairs. This effectively compares the short-term state against some long-term goal. If needed, it is immediately followed by adaptation. Adaptation is in place if one or more elements deviate from the acceptable limits so that the resulting product would be unacceptable. There are 4 Scrum events designed for Inspection and Adaptation – Sprint Planning, Daily Scrum, Sprint Review and Sprint Retrospective.

2.6.2. Roles and artifacts

Scrum recognizes three roles involved in the product development – the Development Team, the Product Owner and the Scrum Master. Together they form the Scrum Team. The Scrum Team is a team of professionals with all the skills needed to deliver a fully functional product. It consists of people with different specialties (the team is cross-functional). The Scrum Team is self-organizing, meaning the team chooses how to accomplish its work, rather than being directed by someone outside of the team. The Development Team creates the potentially releasable Increment at the end of each Sprint. The team is self-organizing, meaning that no one tells the Development Team how to do its work. The accountability for the work done belongs to the team as a whole. Even though the team is cross-functional, no title other than Developer is recognized in the Development Team. The Product Owner is responsible for the Product Backlog. The Product Backlog is a prioritized list of items representing all intended changes to the product. The Product Owner is the only one who can add, delete or reorder items in the Product Backlog. He ensures that the work of the team is of high value for the customer. The Scrum Master is the servant-leader for the Scrum Team. He helps the whole team to understand and enact Scrum, helps people outside of the team to understand which interactions with the Scrum Team are useful and which are not, collaborates with the Product Owner on the clarity of the Product Backlog, coaches the Development Team to be self-organized, and helps the Development Team to Inspect by facilitating the meetings and to Adapt by removing impediments in its way.

2.6.3. Scrum meetings

There are 4 types of meetings in Scrum: Sprint Planning, Daily Scrum, Sprint Review and Sprint Retrospective. Scrum uses an iterative approach to development, and the iterations are called Sprints. At the end of every Sprint the Development Team is supposed to deliver working software – an Increment – to the Product Owner. A special principle called time-boxing applies to all Scrum meetings. It means every meeting has a prescribed length. The meeting never lasts longer than the prescribed length; however, it can end sooner. The Sprint length can differ from team to team; it is usually between 2 and 4 weeks. The Sprint starts with a Sprint Planning and ends with a Sprint Review and a Sprint Retrospective.


The main purpose of the Sprint Planning is to plan work for the next Sprint. Sprint Planning consists of two parts. In the first part the Development Team forecasts the amount of work to be completed in the next Sprint. The forecast is based on the latest product Increment, the capacity of the team in the next Sprint and the past performance of the Development Team. Using the forecast, the Development Team pulls items from the top of the Product Backlog into the Sprint Backlog. In the second part, the Development Team collaborates on how to convert the Sprint Backlog into a working software Increment. During this phase the Development Team decomposes the Sprint Backlog into smaller tasks of about one day in size. At the end, the chosen tactics to accomplish the Sprint Goal and possible changes to the Sprint Backlog scope are presented to the Product Owner and the Scrum Master. Two types of Sprint Planning are commonly used – velocity-based planning and commitment-based planning. In velocity-based planning the team adds stories to the Sprint Backlog until the total count of story points equals the estimated velocity. The problem with velocity-based planning is that the team learns to work within safe boundaries, passing the responsibility on to the metric. As a result, this leads to lower productivity. Also, if we concentrate on the velocity too much, we can start to think that what we want is more accurate estimates, although what we really want is to deliver higher quality software out of the door faster (Hundermark, 2009). In commitment-based planning the team gradually adds stories to the Sprint Backlog, and every time a new story is added, the Scrum Master asks whether everybody commits to the current scope of the backlog; if yes, another story is added. If not, the current scope becomes the final scope of the Sprint Backlog. The problem with commitment-based planning is that it depends mainly on the gut feeling and the current mood of the leading member of the team. And without any objective data this can sometimes be really misleading (especially with young teams). That is why it is good to combine both types of planning.
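A minimal Python sketch of the velocity-based variant is shown below; the story names, point estimates and the velocity value are invented for illustration. It simply pulls stories from the top of the Product Backlog until the next story would exceed the estimated velocity.

def velocity_based_plan(product_backlog, estimated_velocity):
    """product_backlog: list of (story, story_points) tuples, ordered by priority."""
    sprint_backlog, total = [], 0
    for story, points in product_backlog:
        if total + points > estimated_velocity:
            break  # stop at the first story that no longer fits
        sprint_backlog.append(story)
        total += points
    return sprint_backlog, total

backlog = [("story A", 8), ("story B", 5), ("story C", 5), ("story D", 3)]
plan, committed = velocity_based_plan(backlog, estimated_velocity=15)
print(plan, committed)   # ['story A', 'story B'] 13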

At the Daily Scrum the Development Team synchronizes on the current status of the work and shares the impediments on the way to accomplishing the Sprint Goal. It is strictly time-boxed to 15 minutes. Basically, the team answers these 3 questions:

- What did you accomplish yesterday?
- What will you do today?
- What obstacles are impeding your progress?

For better inspection of the progress of the Sprint, the Scrum board and the Sprint burndown chart can also be used.
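The data behind a Sprint burndown chart are simply the remaining story points (or task hours) recorded once per day of the Sprint, typically compared with an ideal linear trend. The following Python sketch uses made-up numbers only to show that comparison:

sprint_length_days = 10
total_points = 40
remaining_per_day = [40, 38, 35, 35, 30, 26, 22, 15, 9, 3, 0]   # day 0 .. day 10

ideal_per_day = [total_points - total_points * day / sprint_length_days
                 for day in range(sprint_length_days + 1)]

for day, (actual, ideal) in enumerate(zip(remaining_per_day, ideal_per_day)):
    trend = "behind" if actual > ideal else "on track"
    print(f"day {day:2d}: remaining {actual:2d}, ideal {ideal:4.1f} -> {trend}")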

The Sprint Review serves as a demonstration of the work completed in the Sprint to the Product Owner. Also, customers and other stakeholders can be invited. The Product Owner accepts the user stories in case they are in the Done state. It is important to share a common understanding of the Done state. For this purpose, the Development Team and the Product Owner have to share a common Definition of Done – an agreement with a list of steps to be accomplished for a certain item from the Product Backlog in order to be Done.

The Scrum Team inspects the way it works at the Sprint Retrospective. The Scrum Team looks back on the last Sprint from the perspective of people, relationships, process and tools. Things that went well and potential improvements are identified.

2.7. Disciplined agile delivery

Disciplined agile delivery (DAD) was founded by Scott Ambler, a senior consulting partner at Scott Ambler + Associates, a company helping organizations improve their software process. DAD provides a broader framework to organizations that want to go beyond Scrum. It explains how to get from Scrum to DAD in 5 steps:

1. Focus on consumable solutions, not just potentially shippable software
2. Extend Scrum's construction lifecycle to address the full delivery lifecycle
3. Move beyond method branding
4. Adopt explicit governance strategies
5. Take a goal-based approach to enable scaling.

1. Focus on consumable solutions. In the Scrum lifecycle, there is usually potentially shippable software at the end of every iteration. DAD says there is far more to be delivered at the end of every iteration. In addition to software, there is supporting documentation, upgrade/redeployment of the hardware on which the software runs, a change of the business process around the usage of the software or even a change of the organizational structure of the people using the system. So the result is not only software, but a complete solution. Moreover, stakeholders do not value something that is just "potentially shippable"; they need to see something consumable. That is why, instead of "potentially shippable software", DAD uses the concept of consumable solutions.

2. Full delivery lifecycle. Scrum doesn’t provide a lifecycle model for the whole release. DAD extends the Scrum iteration lifecycle model and proposes a delivery version of the Scrum construction lifecycle. In the beginning, there is an initial phase with initial requirements elicitation (or backlog population), initial architecture modeling, and initial release planning. In the DAD lifecycle, this phase is called Inception. At the end of the production, there is a need to release the product into production and this phase is called Transition. Both Inception and


Transition can comprise one or more iterations. The situation is depicted in Figure 2.4.

Figure 2.4: A governed agile delivery lifecycle (Ambler, 2013)

3. Move beyond method branding. As can be seen from the picture, DAD also uses different naming than Scrum. DAD tries to move beyond the method branding of Scrum (or any other agile methodology in particular). Table 2.1 compares the Scrum names for artifacts and meetings in the process and corresponding DAD names.

Scrum naming                      DAD naming
Product Backlog                   Work Items
Potentially shippable software    Consumable Solutions
Sprint Planning                   Planning
Sprint Review                     Review
Sprint Retrospective              Retrospective
Daily Scrum                       Daily coordination meeting
Sprint Tasks                      Tasks
Scrum Master                      Team Lead

Table 2.1: Comparison of Scrum and DAD naming for meetings and artifacts

4. Adopt explicit governance strategies. The appropriate approach is based on motivating and then enabling people to do what is right for your organization.


This means using corporate assets (reusable code, patterns, data sources), following guidelines for better consistency and working towards a shared vision of the organization. Appropriate governance is based on trust and collaboration, not command-and-control. Besides appropriate governance strategies, other practices are also important, namely: working closely with enterprise professionals, adopting and following enterprise guidance, leveraging enterprise assets, enhancing your organizational ecosystem, adopting a DevOps culture, sharing learnings, and open and honest monitoring.

5. The goal-based approach allows professionals who use DAD to choose a proper agile method according to the goals they want to achieve. Fig. 2.5 lists the goals for every phase of the agile delivery lifecycle.

Figure 2.5: The goals of DAD (Ambler, 2013)

Every goal has its process goal diagram, where the goal is divided into sub-goals, each with a list of options from different agile methods. The decomposition of a goal into options from the agile methods is further shown in Fig. 2.6.


Figure 2.6: Process goal diagram for goal: Explore Initial Scope (Ambler, 2013)

In conclusion, Disciplined agile delivery is a methodology that builds on Scrum and goes beyond it, using both practices from other agile methods and the know-how of applying agile methods at a larger, organization-wide scale. It builds on 5 principles that serve as steps on the way from Scrum to DAD – consumable solutions, full delivery lifecycle, brand-independent naming, explicit governance strategies and a goal-based approach.


3. Improvement in agile teams

Continuous improvement is an important characteristic of agile methods. Lean uses the Japanese word Kaizen, which means "good change". Also well known is the Shewhart cycle (or Deming cycle), divided into four parts – plan, do, check, act – used in business for continuous improvement of processes and products. This chapter is mainly about the ways we can see improvement and how to measure it. Measuring in the sense of deterministically predicting the future would, however, be against the principles of agile software development. The plan-driven approach relies on following the plan and fighting against any change in the process. On the other hand, it is not possible to predict the future in a complex non-deterministic environment. Therefore, instead of following the plan, it is more important to adapt to change. But also in this environment some important decisions have to be made about how to adapt to the particular situation. In order to make good, informed decisions, we have to have some hard data and we have to understand it. Or, said in Scrum terminology, in order to Adapt, we have to Inspect.

DAD draws attention to the fact that Scrum proposes managerial visits to the team's daily meetings. However, people in management often have a busy schedule and do not have time to visit the daily meetings, and they may also not be interested in such deep detail. This is another reason why good metrics are so useful. The outcome of the metrics can be automated and put on a dashboard with a proper level of detail for the management. Also, good metrics can reinforce desired behavior and provide meaningful conversation in the team.

3.1. Indicators, metrics and diagnostics

A key performance indicator is an actual short-term measurement which serves as input for a business decision or action. It is considered to be an indicator of actual business performance. We know two types of performance indicators. A leading indicator answers the question: "Where is it all leading to?" It signals an event before it actually happens. On the contrary, a lagging indicator (also known as a result indicator) refers to something that has already happened. It is a report on a past event. There is a difference between "indicators" and "metrics". Both of them are some kind of measurement. Whereas a metric is a standard measurement and provides a simple number in a generally agreed metric system, an indicator is a representation of a measurement toward a stated goal/objective. Hartmann and Dymond (2006) introduced another notion for metrics used for local process improvements. These are called "diagnostics", because they are designed to diagnose and improve the processes that produce the business value.


3.2. Questions to ask about metrics

Before we get to the metrics, we need to define what good metrics are and how to use them. It is important to state that metrics should be something that helps us visualize the state of affairs so that we can base decisions on it. Metrics are a very powerful tool and therefore they can easily enslave us. It is important to always remember what the goal of using the particular metric is. In order to prevent misuse of the metrics, the literature recommends asking several questions before creating a metric (Swanson, 2014):

1. What decisions will be made based on this metric? Are we really measuring what matters?
2. Is this the right thing to measure or an easy-to-measure proxy?
3. What might be the unintended consequences of this metric? What is a compensating metric to counteract it (speed vs. quality)?
4. How could this metric be 'gamed'?
5. What can we do to guard against unintended consequences and 'gaming'?

Hartmann (2006) suggests 11 tests to be applied when designing measurements in the agile process. A good agile metric or diagnostic:

1. Affirms and reinforces Lean and Agile principles
2. Measures outcome, not output
3. Follows trends, not numbers
4. Answers a particular question for a real person
5. Belongs to a small set of metrics and diagnostics
6. Is easy to collect
7. Reveals, rather than conceals, its context and significant variables
8. Provides fuel for meaningful conversation
9. Provides feedback on a frequent and regular basis
10. May measure value (Product) or Process
11. Encourages "good-enough" quality

Hartmann also provides a checklist of what should be included in the metric:

- name – well chosen to avoid ambiguity, confusion, oversimplification
- question – answer a specific, clear question for a particular role or group
- basis of measurement – clearly state what is being measured, including units
- assumptions – identified to clear understanding of data presented
- level and usage – intended usages at various levels of the organization
- expected trend – what the designers of the metric expect to happen
- when to use it – what prompted creation of this metric? How has it historically been used?
- when to stop using it – when will it outlive its usefulness, become misleading or extra baggage?
- how to game it – the natural ways people will warp behavior of information to yield more 'favorable' outcomes
- warnings – balancing metrics, limit on use, dangers of improper use

Gustafsson (2011) adds that every metric should have a clearly stated owner, in order to have somebody to drive it.

Put together, we get these 7 questions to ask about every metric:

1. What does this metric measure?
   o ask a specific, clear question for a particular role or group
   o clearly state what is being measured, including units

2. What are the assumptions about the process?
   o identify assumptions to allow a clear understanding of the data presented

3. How to visualize it?

4. How to use it?
   o who should be the owner of this metric, who should drive it?
   o when to use it?
   o how has it historically been used?
   o when to stop using it? when will it outlive its usefulness, become misleading or extra baggage?
   o where does it fit in a Scrum process?
   o how are the data collected?
   o intended usages at various levels of the organization
   o what decisions can be based on this metric?
   o what indicators can be based on this metric?

5. What is the expected trend?
   o what the designers of the metric expect to happen

6. What can be the wrong approach to this metric?
   o What might be the unintended consequences of this metric?
   o What is a compensating metric to counteract it (speed vs. quality)?
   o dangers of improper use


7. How to game it?
   o the natural ways people will warp behavior of information to yield more 'favorable' outcomes?
   o What can we do to guard against unintended consequences and 'gaming'?

These questions will later be used to describe the metrics in chapter 3.4.

3.3. Metrics in agile software development

Key metrics

Gustafsson (2011) suggests that the most important metrics are those affecting the economy the most. These are the Throughput (T), Investment (I) and Operation Expense (OE). Gustafsson characterizes these variables in the following manner: "Throughput is the rate of revenues generated from delivered software. Investment is the money spent on obtaining the requirements. Operation Expense are the entire cost associated with turning the requirements into working code." In order for these to work, classical accounting should be replaced with throughput accounting. Based on these, the Net Profit (NP) or Return on Investment (ROI) can be calculated using these formulas:

NP = T-OE; ROI = NP/I

This agrees with Hartmann (2006), who states that the most important thing to measure is the "Business value delivered". He considers it to be the key metric, and other product and process metrics are supposed to be an addition to this proprietary metric. It can be calculated in different ways, including Net present value, Return on investment and Internal rate of return.
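As a purely illustrative calculation of these quantities, the following Python sketch applies the formulas above; all numbers, including the discount rate and cash flows, are made up and do not come from the case study.

T  = 100_000   # throughput: revenue generated from delivered software
OE = 60_000    # operation expense: cost of turning requirements into working code
I  = 20_000    # investment: money spent on obtaining the requirements

NP  = T - OE   # net profit
ROI = NP / I   # return on investment

# Net present value of assumed future net cash flows, discounted at 10 % per year.
discount_rate = 0.10
cash_flows = [15_000, 15_000, 10_000]
NPV = sum(cf / (1 + discount_rate) ** (year + 1)
          for year, cf in enumerate(cash_flows)) - I

print(f"NP = {NP}, ROI = {ROI:.1f}, NPV = {NPV:,.0f}")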

Performance metrics

In order to analyze the results from the key metric, the key metric can be accompanied by several other metrics. These metrics measure the performance of the individual teams. Hundermark (2009) recognizes four important dimensions in agile software development – value (to the customer), predictability (of the delivery), collaboration (during the process) and quality (of the product). These dimensions are in compliance with the Agile Manifesto (see chapter 1). To measure these dimensions, he suggests using indicators based on these seven metrics:

- predictability: velocity, rate of features delivered
- predictability & quality: running tested features (running automated tests)
- quality: technical debt
- collaboration: work-in-process, story cycle time
- value: customer surveys (net promoter score)

If the above are mastered, we can start to measure six additional metrics:

- predictability: velocity chart, sprint burndown chart
- collaboration & value: team surveys
- value: cost per sprint (story point), real value delivered
- predictability, collaboration, value, quality: ROI (NPV)

On the other hand, Hartmann (2006) suggests that other metrics can be added from the following list, depending on their usefulness in the particular situation:

- Agile practice maturity
- Obstacles cleared per iteration
- Team member loading
- Obstacles carried over into next iteration
- User Stories carried over into next iteration
- Iteration mid-point inspection
- Unit tests per user story
- Functional (Fitnesse) tests per user story
- Builds per iteration
- Defects carried over to next iteration

Gustafsson (2011) differentiates five categories: quality, predictability, value, lean and cost:

- quality: defect count, technical debt, fault-slip through
- predictability: velocity, running automated tests
- value: customer satisfaction survey, business value delivered
- lean: lead time, work in progress, queues
- cost: average cost per function

However, cost is not a good thing to measure, because in agile software development it is a fixed variable. With the shift from the old ways of management to the new agile ways, the metrics also shifted from a cost-driven to a value-driven paradigm (Hartmann, 2006). That is why it is recommended to measure value instead. The “lean” category comes from lean development, which has a slightly different emphasis than the other agile methods. In lean development, instead of delivering in iterations, new features are delivered continuously with a pull, just-in-time system called Kanban. This is reflected in a slightly different approach to the

metrics. Scrum uses velocity as a primary metric for planning and process improvements, while Lean (and Kanban) uses lead time. In conclusion, we have three different categories of metrics – key metrics, used as a referential metric for all other metrics; agile metrics, most useful when developing using agile methodologies and iterations; and lean metrics, most useful with continuous delivery without iterations. Tables 3.1, 3.2 and 3.3 show the key metrics, agile metrics and lean metrics, respectively.

key metric | predictability | quality | collaboration | value
Return on investment | x | x | x | x
Net present value | x | x | x | x
Internal rate of return | x | x | x | x

Table 3.1: Key metrics in agile software development

agile metric | predictability | quality | collaboration | value
Velocity | x | | |
Rate of features delivered | x | | |
Sprint burndown chart | x | | |
Running automated tests | x | x | |
Builds per iteration | x | x | |
Defect count | | x | | x
Running tested features | x | | |
Unit tests per user story | | x | |
Technical debt | | x | |
Fault-slip through | | x | |
Work-in-process | | | x |
Story cycle time | | | x |
Team member loading | | | x |
Team surveys | | | x | x
Functional tests per user story | | x | |
Customer satisfaction survey | | | | x
Net promoter score | | | | x

Table 3.2: Agile metrics

lean metric | predictability | quality | collaboration | value
Lead time | x | | x |
Queues | x | | x |
Work in progress | x | | x |

Table 3.3: Lean metrics


3.4. Process metrics

The performance metrics can be divided into two categories – those which measure predictability and collaboration (process metrics) and those which measure quality and value (product metrics). This division makes sense from the responsibility point of view. In Scrum, the Scrum master is responsible for the process side of development and the Product owner for the product side. Both points of view are important and should be balanced using the key metric. This thesis aims at the improvement of the work of the team and process optimization and therefore concentrates on the process metrics, leaving the product metrics aside. Table 3.4 displays all process metrics from the tables in chapter 3.3.

process metric | predictability | collaboration
Velocity | x |
Rate of features delivered | x |
Sprint burndown chart | x |
Lead time | x | x
Queues | x | x
Work in progress | x | x
Work-in-process | | x
Story cycle time | | x
Team member loading | | x

Table 3.4: Process metrics

Hundermark (2009) suggests using only a few vital metrics that are valuable for the team. Following this recommendation, I have chosen a set of metrics vital in the environment of the case study: velocity, sprint burndown chart, rate of features delivered, work-in-process and the story cycle time. These five metrics are displayed in Table 3.5.

metric | predictability | quality | collaboration | value
velocity | x | | |
sprint burndown chart | x | | |
rate of features delivered | x | | |
work-in-process | | | x |
story cycle time | | | x |

Table 3.5: Agile metrics vital in the environment of the case study


Little's law plays an important role in process optimization. It describes the relationship between the collaboration metrics work-in-process and story cycle time, and velocity. Little's law says that the average number of items in the queuing system L equals the average waiting time in the system for an item W multiplied by the average number of items arriving per unit time λ (Chhajed, 2008):

L = λW

Applied to our situation, the average work in process (WIP) in story points equals the average story cycle time (CT) multiplied by the average velocity (V):

WIP = CT*V
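For example (a minimal sketch with made-up numbers; the velocity must be expressed per the same time unit as the cycle time):

    # Hypothetical values for illustration only.
    velocity_per_sprint = 40   # V: story points delivered per Sprint
    sprint_length_days = 10    # working days in a Sprint
    cycle_time_days = 3        # CT: average story cycle time in working days

    velocity_per_day = velocity_per_sprint / sprint_length_days  # 4 SP per day
    wip = cycle_time_days * velocity_per_day                     # WIP = CT * V (Little's law)

    print(wip)  # 12.0 story points in process on average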

3.4.1. Velocity

Most of the answers to the questions in this chapter are inspired by Appropriate agile measurement (Hartmann, 2006).

1. What does this metric measure? Velocity answers the question: “How much software can my team deliver per iteration?” It can be measured in story points or “ideal man-days”. It should be accompanied by the projected capacity of the team in man-days.

2. What are the assumptions about the process? The basic assumption is that the team is delivering working software every iteration. Other assumptions are that the user stories are written properly and the process of estimating was done correctly. Also, only user stories in the Done state should be counted (Hundermark, 2009). The length of the Sprint for the release is fixed and is never extended (Hundermark, 2009).

3. What is the expected trend? “Velocity can be affected by many factors: changes in the team members, obstacles, toolsets, practices, difficulty of feature, amount of learning required and company policies” (Hartmann, 2006). All these factors will lower the velocity. Except for unexpected obstacles, a stable team on the same project with the required resources will gain velocity at the beginning of the project, and after several initial Sprints the velocity should be stable.

4. How to visualize it? The velocity can be visualized using the velocity chart. The velocity chart is a simple bar graph showing the velocity over the last several Sprints in the project. The x-axis displays the number of the Sprint and the y-axis the number of story points. One example of the velocity chart is Fig. 3.1.

[Velocity chart: commitment and velocity in story points for Sprints 8.1.3 to 8.2.5]

Figure 3.1.: Velocity chart

5. How to use it? Velocity is usually used during the whole course of the project. “In a longer project when the team, resources, and technology are all stable, velocity will also become stable. The team may suspend collecting velocity since it is ‘known’.” (Hartmann, 2006) Velocity is most useful at the project level and should be owned by the Scrum master. Velocity is used to predict the future delivery of the team. It can be used for both iteration and release planning. Velocity is an indicator most useful in the long term (Cohn, 2008); the rolling average of three Sprints seems to be a reasonable approximation (Hundermark, 2009). “The rolling average of the last three sprints is the leading indicator of predictability for the release planning” (Hundermark, 2009). Velocity is also the primary metric for process optimization in agile software development (Kniberg, 2010); it should be used as input for iteration and release retrospectives. The velocity of the last iteration is discussed at the retrospective if it differs from the expected values. According to Boyd, if the velocity is more than 10% lower than the value predicted for the Release (the rolling average of the three previous Sprints is used to predict the velocity in the release – see the previous paragraph), it may be an indicator of these problems (Boyd, 2011):

- Product Owner isn’t providing enough information to the development team


- Team size is changing between sprints (generally, the core team must be consistent; allowances should be made for absences, e.g. vacation, sickness)
- Team does not understand the scope of work at the start of the sprint
- Team is being disrupted
- Team’s reference feature stories are not applicable to the current release
- Team is doing very short release cycles (<3 sprints) or doing maintenance work (the Team might consider Kanban or XP over Scrum under these circumstances)

Based on that, we can say that:

I1: The difference between the sprint velocity and the velocity predicted for the release is a lagging indicator of the predictability of the sprint.

According to Boyd (2011), a low score on velocity/commitment indicates that:

- the team does not have a reference story to make relative estimates
- not every team member understands the reference story
- the product owner isn’t providing enough information to the development team
- requirements scope creep
- the team is being disrupted

Based on this, we can say that:

I2: The story point completion ratio is a lagging indicator of disruptions in the sprint, bad estimates or bad sprint planning.

Also, since the team should ideally plan according to the prediction of the velocity for the Sprint, we can say that:

I3: The difference between the prediction for the sprint and the team commitment is an indicator of the team’s ability to plan or estimate.

6. How to game it? Comparing velocity between teams can lead to inflation of the estimates by the team members. This way the velocity may be increased, while completing the same amount of work.


7. What can be the wrong approach to this metric? “The higher the velocity, the better.” In reality, a high velocity may mean the team is incurring technical debt. Velocity should always be checked against the quality of the solution. High velocity today may mean very low velocity in the future.

“Velocity measures productivity or value.” Velocity is a metric of predictability. It measures neither the value to the customer nor the productivity of the team. Velocity should be used for predicting future delivery, as it was designed.

“We should always commit with 100% accuracy.” (We must deliver exactly 24 SP.) This is a wrong approach, because our goal is not to estimate more accurately, but to deliver higher quality software out of the door faster (Hundermark, 2009).

“The sprint failed, because we haven’t met the velocity.” In reality, failed Sprints provide important information for the teams, from which they can learn in the future. “Maximum information is provided at 50% probability. If we prevent failure, we limit the team responsibility” (Hundermark, 2009).

3.4.2. Sprint burndown chart

What does this metric measure? Sprint burndown chart answers the question: “What effort is remaining in the Sprint?” It is a graph showing both ideal effort and effort remaining in the Sprint. It can be accompanied with the team’s projected capacity in the Sprint.

What are the assumptions about the process? The team delivers features in Iterations with a given scope. The scope shouldn’t be changed too much during the project. The metric works ideally if the team estimates the subtasks created by decomposition of the user stories in ideal man-hours.

What is the expected trend? The ideal team has a constant pace of delivering new features. The rate of the features delivered in the Sprint ought to be constant. This is in alignment with one of the principles of agile manifesto (Beck, 2001):

„Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.“

Any deviations from the expected trend should be inspected and the team effort should be adapted accordingly.


How to visualize it? The Sprint burndown chart is itself a form of visual display of the data. The x-axis displays the working days and the y-axis the remaining effort in ideal man-hours. It also shows the ideal effort as a guideline and the real progress of the effort. An example is shown in Fig. 3.2.

Figure 3.2.: Sprint burndown chart (Reynolds, 2014)
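To illustrate how such a chart can be produced from the daily remaining-effort data, here is a minimal sketch using Python and matplotlib; the library choice and all numbers are illustrative assumptions, not part of the toolchain described in this thesis:

    import matplotlib.pyplot as plt

    # Hypothetical 10-day Sprint with 200 ideal man-hours committed.
    days = list(range(11))                     # day 0 = Sprint planning
    ideal = [200 - 20 * d for d in days]       # ideal (linear) burndown
    remaining = [200, 198, 190, 182, 170, 155, 150, 120, 80, 35, 0]  # tracked effort

    plt.plot(days, ideal, "--", label="ideal effort")
    plt.plot(days, remaining, marker="o", label="remaining effort")
    plt.xlabel("working day of the Sprint")
    plt.ylabel("remaining effort [ideal man-hours]")
    plt.title("Sprint burndown chart")
    plt.legend()
    plt.show()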

How to use it? The Sprint burndown chart should be used in all Sprints, unless the development team’s progress during the Sprint is visible without it. It is best used at the Project level. The owner of this metric should be the Scrum master. In case the Sprint scope changes often, the sprint burndown should also show the total backlog size. The idea behind the Sprint burndown chart is that the team’s progress should copy the ideal effort shown on the graph; it helps the team to inspect its progress and adapt if it deviates. The inspection takes place at the daily meetings and the chart is usually shown on the Scrum board. The shape of the graph can also be used to analyze the last Sprint at the Sprint retrospective. Figures 3.4-3.9 (Kocurek, 2011) show some common examples of the different situations that can occur in a Sprint burndown chart. Interpretations by Kocurek (2011) are given below each pair of graphs.

Figure 3.4: Ideal team Figure 3.5: Great team


Fig. 3.4 indicates great team ability to self-organize. It indicates a great product owner who understands the reason for a locked sprint backlog and a great Scrum master able to help the team. The team is not over-committing and finished the Sprint backlog on time. The team is also able to estimate capacity correctly. No corrective action is necessary in such case.

Fig. 3.5 indicates an experienced team. The team has completed work on time and met the sprint goal. They also have applied the principle of getting things done, but the most important is they have adapted a scope of the sprint backlog to complete the sprint. At the end the team has a possibility to complete some additional work. In the retrospective, the team should discuss the reasons of late progress in the first half of the sprint and solve issues so they are better in the next sprint. The team should also consider the amount of work that they are able to complete.

Figure 3.6. Nice team Figure 3.7. Let’s rest

Fig. 3.6. shows a typical progress that can be observed in many experienced agile teams. The chart displays that the team was able to complete their commitment on time. They adapted the scope or worked harder to complete the sprint. The team is self-reflecting. The team should discuss change of plan immediately as they see the progress has been slowing down from the beginning of the sprint. Typically it is suggested to move a low priority item from the sprint backlog to the next sprint or back to the product backlog.

Fig. 3.7. indicates the team has a problem. The problem is either that the team committed to less than they are able to complete or that the product owner does not provide enough stories for the sprint. The reason might also be an over-estimation of complexity, which ends up in completion earlier than expected at the beginning of the sprint. The Scrum Master should identify this problem earlier and ask the product owner to provide the team with more work. Even if stories are over-estimated, the team should at least continue with stories from the next, already preplanned, sprint.


Figure 3.8: The management is coming! Figure 3.9: It is too late.

Fig. 3.8 indicates that the team is probably doing some work, but maybe it does not update its progress accordingly. Another reason might be that the product owner has added the same amount of work that was already completed, therefore the line is straight. The team is not able to predict the end of the sprint or even to provide the status of the current sprint. The Scrum Master should step up their Scrum mastership and coach the team on why it is necessary to track the progress and how to track it. Such a team should be stopped after two or three days of a flat progress line and should immediately apply corrective actions.

The Sprint burndown chart in Fig. 3.9 says “You have not completed your commitment”. The team has been late for the entire sprint. It did not adapt the sprint scope to an appropriate level. The team has not completed stories that should have been split or moved to the next sprint. In such a situation the capacity of the next sprint should be lowered. If this happens again, corrective actions should be taken after a few days when slower progress is observed. Typically, a lower priority story should be moved to the next sprint or back to the product backlog.

Based on this, we can say that:

I4: The shape of the sprint burndown chart is an indicator of the predictability in the sprint and the adaptability of the team.

How to game it? If the team fears that the Sprint burndown chart may be used against it, they may try to produce something like Fig. 3.8. In that case the metric would be unusable. The Scrum master can prevent this by building the team’s confidence towards the management and by helping the team to estimate its work better. Another way to game it is to overestimate the stories and thus not fill the capacity of the Sprint, as shown in Fig. 3.7. In this case the Scrum master should address the problem early in the process by asking the product owner to add more stories.


What can be the wrong approach to this metric? The ideal effort depicted in the Sprint burndown chart can be understood as a deterministic prediction of the Sprint. This is certainly not the point. Rather, it should be used as a tool for invoking the Adapt effort when the process deviates from the plan. A team that can adapt well is certainly a better agile team than a team that avoids any risk just to follow the exact route of the ideal effort line on the Sprint burndown chart.

3.4.3. Rate of features delivered

What does this metric measure? Rate of features delivered answers the question: “How much effort is remaining in the Release?” It is measured as a ratio of story points or ideal man-days already delivered in the finished Sprints to the story points remaining in the rest of the Release.

What are the assumptions about the process? The team delivers features in Release with given scope. The scope shouldn’t be changed too much during the project.

What is the expected trend? The ideal team has a constant pace of delivering new features. The rate of the features delivered in the Release in the rolling average is expected to be constant. This is in alignment with one of the principles of agile manifesto (Beck, 2001):

„Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.“

How to visualize it? Rate of features delivered can be visualized by the Release burndown chart. Release burndown chart is a bar chart, where the x-axis displays the Sprints and the y-axis the number of story points remaining in the Release.


Figure 3.3.: Release burndown chart in Jira

This particular example also shows the story points added to the release, which comes in handy for showing the scope change.

How to use it? The Release burndown chart should be used during the whole Release. If for several Releases there is no change in the Release scope, the Release burndown chart may be omitted. The Release burndown chart is a metric suitable for all organization levels. It shows both the scope change of the release and the effort in story points remaining till the end of the release. It is an artefact of discussion between the Development team and the customer, and therefore it should be owned by the Scrum master. The release burndown chart should be discussed at the end and the beginning of each Sprint – at the review and planning meetings – and every time a change to the Release scope is made. The following indicator can be based on the Release burndown chart:

I5: The release burndown chart is a lagging indicator of the scope changes in the Release.

How to game it? There is no simple way of gaming the Release burndown chart. However it may cease to show the progress on the Release, if radical scope changes during the Release are made too often. This can be prevented by good Release planning and close collaboration with the customer during the whole process.


What can be the wrong approach to this metric? The Release burndown chart can raise the feeling that any changes in the scope of the release are bad. This would be a step back to the plan-driven approach of waterfall methodologies. Instead, it should be understood as a communication artefact between the Product owner and the Development team – showing the customer the effect of the changes in the Release, pushing the Product owner into hard decisions about priorities, and showing the progress of the work on the product to both the customer and the team. Changes in the scope are welcome in case the result brings more value to the customer.

3.4.4. Work in process

What does this metric measure? Work-in-process answers the question: “How many user stories are being worked on at the moment?” It is measured as the number of user stories that are in process. In order to indicate collaboration, it should be accompanied by the number of people at work and ideally also by information about other work the team is doing besides the Sprint backlog.

What are the assumptions about the process? This metric is designed to empower collaboration in the team. The basic assumption is that the user stories are written in a way that allows several team members to collaborate on them, or that the team uses the XP practice of pair programming.

What is the expected trend? If the team members collaborate on the tasks, the expected trend is that the number of user stories in progress is lower than the number of team members. In Scrum, one of the hardest things is to work on just one user story at a time. An ideally sized team of 5-9 people would ideally work on 2 or fewer stories at the same time (Hundermark, 2009). This means 2-5 people working on the same user story.

How to visualize it? The work in progress can be seen directly on the Scrum board in the column “In progress”, as shown in Fig. 3.4. The Scrum board can be updated either continuously or once a day at the daily meeting.


[Scrum board columns: STORY | TO DO | IN PROGRESS | DONE]

Figure 3.4.: Visualizing the work in process (in progress) in the Scrum board.

How to use it? This measure should be used to improve the collaboration in the team. The owner of this metric is the Scrum master. The ultimate goal, however, is to work on the top priorities and not to take on any other work (than the first two items) until the first item is done. Then work on the third item can be started, etc. The indicator that can be based on this metric is:

I6: The average number of people at work divided by the average number of stories in process is a lagging indicator of the collaboration on the team.

How to game it? A simple work-around is to work on things that are not in the Sprint backlog or, alternatively, to do nothing. This way the number of items in progress would be low, but at the expense of low productivity of the team. This can be prevented by ensuring that everybody in the team works on something of a high priority. For this purpose, the work on bugs, defects (and other distractions not in the Backlog) should also be tracked.

What can be the wrong approach to this metric? This metric can’t be used as a tool for the Scrum master to force the developers to collaborate more. It is not designed to push collaboration; there has to be motivation in the team and there also have to be technical and product predispositions (size of the user stories). A better understanding is that it can show the obstacles on the way to better collaboration.


3.4.5. Story cycle time

What does this metric measure? The story cycle time answers the question: “What is the average time to complete a user story?” It is measured as the time from the start of work on the story until the story is finished. It should be accompanied by the average complexity of the stories in the sprint.

What are the assumptions about the process? The team tracks the start and the end of the work on the story.

What is the expected trend? According to Hundermark (2009), the ideal value we want to reach is around 3 days.

How to visualize it? Story cycle time can be visualized by a Control chart. A Control chart for the story cycle time is a graph that shows the number of days it took to deliver the story on the y-axis and the time it was delivered on the x-axis. It also shows the mean value of the cycle time so that the variation can be seen. A general control chart is shown in Fig. 3.5.

Fig. 3.5: Example of general Control chart. (Control chart, 2014)

How to use it? It serves as internal information for the Scrum Master. The trend can be used as feedback for process decisions. The deviations can be used to identify potential problems to be discussed at the retrospective. The following indicator can be based on this metric:


I7: The average story cycle time is a lagging indicator of the efficiency of the process.

How to game it? The cycle time can be made lower at the expense of the quality of the user story. The best way to prevent this type of gaming is to monitor the quality of the stories.

What can be the wrong approach to this metric? The story cycle time can be an indicator of a very broad range of different events. It may be hard to interpret it correctly.


4. Case study

4.1. Environment

4.1.1. Company

The case study was conducted at Kentico software s.r.o., a software company developing an integrated marketing solution of the same name – Kentico – for digital agencies. The company was founded in 2004 and has been growing ever since. At the beginning the company was producing a content management system; however, in the last years the focus shifted towards on-line marketing and e-commerce, and Kentico started to sell a complete enterprise marketing solution. Today, 18,000 websites in 90 countries run on Kentico and Kentico software has 1,100 partners worldwide. At the time of writing of this case study, Kentico has around 150 employees with headquarters in Brno and branches in the USA, the UK and Australia. I worked there as a Scrum master for one year and four months in total. The case study is based on data collected during the 8-month period I was working with the product team developing the On-line marketing solution.

When I came to the On-line marketing team, Scrum was already well known in the team. The team had been running on Scrum for around 2 years, and the people were acquainted with the majority of the agile practices. The way Scrum was implemented was more or less standardized across the company. Scrum was used for planning the user stories and spikes into Sprints, and Kanban for the bugs and defects that came along the way. Ideally, just one of the methods would be used – in the case of Scrum, estimate and plan bugs and defects (B&D) in the Sprint; in the case of Kanban, cancel Sprints and pull new functionality from the product backlog just in time. However, there were reasons not to do either of these. The first alternative was not possible because the company's one-week bug-fixing policy would not go together with the company-wide agreement to do two-week Sprints. Also, sometimes the customer bugs had to be done within one day from their occurrence and hence it was not possible to plan them into the Sprint. The second alternative was not applied, since most managers and Scrum masters preferred Scrum over Kanban and also some teams were not mature enough to adopt Kanban.

The user stories have been based on interviews with the company partners and suggestions from the end customers, and were written by the Product owner in collaboration with the Development Team. The main topic of release 8.1, shared with other teams, was Performance improvement. For 8.2 it was creating the Demo sites for Sales. For optimizing the processes at Kentico software on the company level, three metrics have been used – monthly profit, net promoter score and customer satisfaction survey. However, these were not related to the metrics used on the team level. The overall attitude to metrics in the company was slightly negative and was becoming more positive towards the end of the release 8.2.

4.1.2. Team

Team composition

The On-line marketing team was one of the most recently established teams in the company. After a short time it was divided into two parts, since it was expected that it would grow. However, it didn't, so at the beginning of version 8.1 the team was put back together in one room as a fully collocated team. During the releases 8.1 and 8.2 the Development Team consisted of 2 technical leaders (the most experienced and knowledgeable of the developers), 4 developers, 2 testers, 1 UX designer, 1 technical writer shared with the Content management team, and, for a short period at the beginning of the Release 8.1, also 1 front-end specialist. Besides these people, the Scrum Team also included a dedicated Scrum master (me) and a Product owner sitting on-site.

Agile practices in the team

The team was running on Scrum in combination with XP practices for user stories and the planning game. The standard sprint length across the company was 2 weeks. None of the teams in the Product development used release planning for planning the releases 8.1 or 8.2. However, all standard Scrum meetings were used. The planning meeting was standardly divided into two parts – backlog grooming, where the Product owner consulted the stories with the Development team and the top of the backlog was prepared for the Sprint planning, and the Sprint planning, where the team revised the estimates from the backlog grooming and planned for the Sprint. In the second half of the release 8.2, a part of the planning where the team broke the user stories from the sprint planning down into technical tasks was also introduced. The team estimated both user stories and spikes in story points and used the story points as a complexity measure. The estimates in 8.1 were given based on everybody's own experience. The estimates in 8.2 were based on a referential set of user stories placed on the team's wiki. There were 3 examples for every estimate value on the Fibonacci scale. Bugs and defects were not estimated. Stories were estimated using planning poker and before each estimate the estimate scale was shown. Referential user stories were updated regularly after every Sprint in collaboration with one member of the Development team.


Metrics used by the team

The team was using just a few metrics during my time as a Scrum master there. Most of the metrics were tracked automatically by Jira – velocity, velocity chart, sprint burndown chart, release burndown chart, cumulative flow diagram and control chart for cycle time.

Velocity was used mostly by the Product management to inspect the rate of features delivered during the Release. The approach to understanding velocity differed across the product development teams. In the On-line marketing team it was used in combination with the predicted capacity and commitment-based planning (the team was committing more and more stories until they felt they had enough work for the Sprint, based on the velocity from the last Sprint and the capacity in the next Sprint) to plan the Sprints.

The Sprint burndown chart was automatically tracked in story points in Jira. However, it was never actively used, due to the negative attitude towards this metric in the team. The rate of features delivered was passively tracked in Jira but never actively used, due to the lack of understanding of this metric and due to the fact that the Release planning practice was still not adopted by the product development. Work-in-process was visualized by the Scrum board and the current state inspected at the Daily Scrum. However, it was never consistently tracked for a longer period of time in spite of the fact that it was the topic of many retrospectives. Story cycle time was passively tracked by Jira; however, it was never inspected.

4.2. Methodology

Seven indicators based on the process metrics from chapter 3.4 have been inspected on the data from releases 8.1 and 8.2, collected from Jira at the end of the release 8.2. The way of calculating each indicator from its metric has been kept as simple as possible while still answering the question of the indicator. The way of working with these indicators in this case study is described under each indicator. My approach here is similar to the goal-question-metric approach; however, I started with the agile metrics, decided on the proper questions to ask (indicators) based on the literature research, and put them into a hierarchy based on the goals. The hierarchy is shown in Figure 4.1.


Goal 1 (I1): Good predictability of the delivery
    Objective 1.1 (I5): Good planning on the release level
    Objective 1.2 (I2): Good sprint
        Objective 1.2.1 (I3): Good planning on the sprint level
        Objective 1.2.2 (I4): Good predictability and adaptability during the sprint

Goal 2 (I6): Good collaboration on the team

Goal 3 (I7): Growing efficiency of the process

Figure 4.1: The hierarchy of the indicators based on the goals they represent

I1: The difference between the sprint velocity and the velocity predicted for the release is a lagging indicator of the predictability of the sprint.

First, the prediction of the velocity for the release has to be made. The 70% correlation between the velocity and the availability in the data of this study suggests that the best basis for predicting the velocity is the velocity/availability ratio. If the velocity from the previous release is unknown or not relevant, there is no prediction for the first Sprint; the prediction for the second Sprint is based on the first Sprint and the prediction for the third Sprint on the first two Sprints. The prediction for the rest of the release is then based on the rolling average of the velocity/availability ratio over the last three Sprints. So the prediction is calculated as the rolling average of the velocity/availability ratio over the last three Sprints multiplied by the availability in the next Sprint. After the predictions are made, the deviation from the predicted value is calculated in percent as:

I1 = 100%*|velocity – prediction|/velocity

This number then serves as an indicator of predictability for the current Sprint. The higher the value of I1, the lower the predictability of the Sprint. This number should be at most 25% and ideally within a 10% range of deviation. A minimal calculation sketch is given below the following lists. In case:

1. I1 < 10%: the predictability of the sprint was good.
2. 10% < I1 < 25%: the predictability of the sprint can still be improved.
3. I1 > 25%: the predictability of the sprint was bad and the problem has to be addressed at the retrospective.


The 2nd and 3rd cases can be further analyzed using the other predictability indicators:

- I2: For bad sprint planning, estimates or disruptions in the sprint.
- I3: For bad estimates or bad sprint planning.
- I4: For the predictability in the sprint and the adaptability of the team.
- I5: For bad release planning.
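To make the calculation concrete, the following is a minimal Python sketch of the release-level prediction and of I1; the function names and the Sprint figures are hypothetical and only illustrate the procedure described above:

    def predict_velocity(history, next_availability):
        """Rolling average of the velocity/availability ratio over (up to) the
        last three Sprints, multiplied by the availability of the next Sprint."""
        if not history:
            return None  # no prediction can be made for the first Sprint
        last = history[-3:]
        ratio = sum(velocity / availability for availability, velocity in last) / len(last)
        return ratio * next_availability

    def indicator_i1(velocity, prediction):
        """I1 = 100% * |velocity - prediction| / velocity."""
        return 100 * abs(velocity - prediction) / velocity

    # Hypothetical Sprints: (availability [man-days], velocity [story points]).
    history = [(80, 50), (70, 42), (75, 48)]
    prediction = predict_velocity(history, next_availability=60)    # roughly 37 SP
    print(round(indicator_i1(velocity=28, prediction=prediction)))  # 33 -> above 25%, bad predictability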

I2: The story point completion ratio is a lagging indicator of disruptions in the sprint, bad estimates or bad sprint planning.

This indicator is calculated as:

I2 = 100%*velocity/commitment

In case:

1. I2 = 100%: the sprint was good.
2. I2 > 100%: the estimates or the planning was bad, or the team did some extra effort in the sprint.
3. I2 < 100%: the estimates or sprint planning was bad, or the team was disrupted during the sprint.

The 2nd and 3rd cases can be further analyzed using the other predictability indicators:

- I3: For bad estimates or bad sprint planning.
- I4: For the predictability in the sprint and the adaptability of the team.

I3: The difference between the prediction for the sprint and the team commitment is an indicator of the team’s ability to plan or estimate.

The value of this indicator is calculated as:

I3 = 100%*(commitment – prediction)/commitment

Since the team uses a combination of velocity-based and commitment-based planning, the difference between the prediction and the commitment can mean two things. One is that the planning was bad, meaning the team didn’t use a proper method for predicting the velocity. This is certainly the case for most of the sprints, since the method for predicting the velocity was devised while writing this thesis. The second is that the higher (or lower) commitment was caused by higher (or lower) estimates than in the preceding sprints. In case:


1. I3 = 0%: the planning and estimates are good.
2. I3 > 0%: the planning was bad or the estimates in the sprint are higher than in the preceding sprints.
3. I3 < 0%: the planning was bad or the estimates in the sprint are lower than in the preceding sprints.
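A minimal sketch of I2 and I3, continuing the hypothetical figures used above (committed 45 SP against a prediction of 37 SP, delivered 28 SP):

    def indicator_i2(velocity, commitment):
        """I2 = 100% * velocity / commitment (story point completion ratio)."""
        return 100 * velocity / commitment

    def indicator_i3(prediction, commitment):
        """I3 = 100% * (commitment - prediction) / commitment."""
        return 100 * (commitment - prediction) / commitment

    print(round(indicator_i2(velocity=28, commitment=45)))    # 62 -> below 100%: disruption, bad estimates or planning
    print(round(indicator_i3(prediction=37, commitment=45)))  # 18 -> above 0%: commitment above the prediction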

I4: The shape of the sprint burndown chart is an indicator of the predictability in the sprint and the adaptability of the team.

This is the only indicator for analyzing the process on the Sprint level. To analyze the shape of the Sprint burndown chart the patterns from chapter 3.4.2. are used:

1. “Ideal team” – nothing to improve.
2. “Great team” – good adaptability, issues in the first half of the sprint, sprint planning can be improved.
3. “Nice team” – relatively good adaptability, but it can be improved.
4. “Let’s rest” – the team over-estimated the stories.
5. “The management is coming!” – zero predictability, not possible to analyze.
6. “It is too late” – low adaptability.

I5: The release burndown chart is a lagging indicator of the scope changes in the Release.

The scope changes in the release are calculated as:

I5 = 100%*|story points added to the Release – story points removed from the Release| / all completed story points in the release

If the value of I5 is high, it is an indicator of bad release planning.

I6: The average number of people at work divided by the average number of stories in process is a lagging indicator of the collaboration on the team.

This number indicates the average number of people working on one story. The value is calculated for the whole Sprint based on the average number of stories in process per day and the average availability per day. The ideal number of people working on 1 story according to chapter 3.4.4 is 2-5, depending on the number of people in the team. In the case of 8 people (as in this case study), the ideal number is 4. Collaboration grows on the scale from 1 to 4; a higher number may result in lower efficiency of the process (see indicator I7).
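For illustration, a one-line computation of I6 using the Sprint 8.1.3 averages reported later in chapter 4.3.2 (6.5 available people per day, 3.2 stories in process per day):

    def indicator_i6(avg_availability_per_day, avg_stories_in_process_per_day):
        """I6: average number of people at work per story in process."""
        return avg_availability_per_day / avg_stories_in_process_per_day

    print(round(indicator_i6(6.5, 3.2), 1))  # 2.0 -> within the recommended 2-5 people per story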


I7: The average story cycle time is a lagging indicator of the efficiency of the process.

The ideal value of the average cycle time in the sprint according to chapter 3.4.5 is 3 days. Higher cycle times should be optimized towards this value; lower values may indicate very small user stories in the Sprint. Also, since the stories differ in complexity, the story cycle time should be accompanied by the average complexity of the stories in the sprint.

The results from the indicators are validated against the team retrospectives log from the team wiki. Based on this validation, process improvements and the added value of using the indicators can be decided.

4.3. Results

This chapter presents the results for the seven indicators introduced in chapter 4.2.

4.3.1 Predictability of the delivery

I1: The difference between the sprint velocity and the velocity predicted for the release is a lagging indicator of the predictability of the sprint.

Release 8.1

Since we didn’t have data from the previous release, we can’t predict the velocity for the first Sprint in Release 8.1. But that doesn’t matter, because we can instead use commitment-based planning for that Sprint. The prediction for the second Sprint is based on the velocity of the first Sprint and the prediction for the third Sprint on the average of the first two Sprints. The prediction for the other Sprints is based on the rolling average of the previous three Sprints.

Sprint | Sprint length [wd] | Availability [md] | velocity [sp] | prediction [sp] | I1 [%]
8.1.3 | 15 | 101 | 67 | – | –
8.1.4 | 10 | 77 | 57 | 53 | 8
8.1.5 | 10 | 77 | 40 | 55 | 37
8.1.6a | 4 | 33 | 14 | 21 | 53
8.1.6b | 6 | 47 | 46 | 26 | 43

Table 4.1: Predictability of the Release 8.1


As we see from the table, the predictability in the Sprint 8.1.4 was good. However, in the Sprints 8.1.5, 8.1.6a and 8.1.6b the deviation is larger than 25%. These sprints have to be analyzed further for the causes of the bad predictability.

Release 8.2

Sprint | Sprint length [wd] | Availability [md] | velocity [sp] | prediction [sp] | I1 [%]
8.2.1 | 12 | 70 | 24 | – | –
8.2.2 | 10.5 | 71 | 35 | 32 | 8
8.2.3 | 6.5 | 39 | 21 | 18 | 12
8.2.4 | 9 | 51 | 36 | 25 | 30
8.2.5 | 10 | 44 | 42 | 25 | 30

Table 4.2: Predictability of the Release 8.2.

We still can’t use the velocity from Release 8.1 to predict the velocity in 8.2. The main reason behind this is that there were two months between the end of 8.1.6b and the start of 8.2.1 when the team hadn’t been developing, just fixing bugs and defects. As a result, the technical setup and team composition were different in 8.2. Also, the character of the work was different, causing the estimates to be incomparable. This is why we have to proceed similarly to Release 8.1 – the prediction for the second Sprint is based on the velocity of the first Sprint and the prediction for the third Sprint on the average of the first two. Predictions for all other Sprints are based on the rolling average of the last three Sprints. The predictability in Release 8.2 was better (avg. deviation 20%) than in Release 8.1 (avg. deviation 35%). This was probably caused by the fact that the team estimated better, the planning was done better and the team adapted better during the sprint. In Table 4.2 we can see a worsening trend during the whole release 8.2. The predictability in the sprint 8.2.2 was good, in the sprint 8.2.3 it was in the acceptable range (but could be improved), and it was bad for Sprints 8.2.4 and 8.2.5.


I2: The story point completion ratio is a lagging indicator of disruptions in the sprint, bad estimates or bad sprint planning.

Release 8.1

Sprint | Commitment [sp] | Velocity [sp] | I2 [%] | I1 [%]
8.1.3 | 75 | 67 | 89 | 11
8.1.4 | 47 | 57 | 121 | 21
8.1.5 | 85 | 40 | 47 | 53
8.1.6a | 27 | 14 | 52 | 48
8.1.6b | 30 | 46 | 153 | 53

Table 4.3: Story point completion ratio in release 8.1.

We see that none of the sprints were absolutely good. In the sprints 8.1.4 and 8.1.6b the estimates or the planning was bad, or (and this is the case of the sprint 8.1.6b) the team did some extra effort. In the rest of the sprints the team didn’t deliver, meaning the estimates or sprint planning was bad or the team was disrupted during the sprint.

Release 8.2

Sprint | Commitment [sp] | Velocity [sp] | I2 [%] | I1 [%]
8.2.1 | 34 | 24 | 71 | –
8.2.2 | 69 | 35 | 51 | 8
8.2.3 | 42 | 21 | 50 | 12
8.2.4 | 36 | 36 | 100 | 30
8.2.5 | 42 | 42 | 100 | 30

Table 4.4: Story point completion ratio in release 8.2.

In Table 4.4 we see a worsening trend until the sprint 8.2.3 and an improving trend from sprint 8.2.3 till the end of the release. In the sprints 8.2.1, 8.2.2 and 8.2.3 the estimates or sprint planning was bad, or the team was disrupted during the sprint. The worst value is for sprint 8.2.3. This agrees with the Sprint 8.2.3 retrospective, where the team identified lots of problems and areas that needed to be improved. The Sprints 8.2.4 and 8.2.5 show good results. At the retrospective the team also identified that one of the success factors was that they stayed longer at work just to finish all the work in the Sprint, and that the return of the technical leader from the USA helped to organize the work during the Sprint better.


I3: The difference between the prediction for the sprint and the team commitment is an indicator of the team’s ability to plan or estimate.

Release 8.1

Sprint | Prediction [sp] | Commitment [sp] | I3 [%] | I2 [%]
8.1.3 | – | 75 | – | 89
8.1.4 | 53 | 47 | -12 | 121
8.1.5 | 55 | 85 | +36 | 47
8.1.6a | 21 | 27 | -21 | 52
8.1.6b | 26 | 30 | +12 | 153

Table 4.5: Indicator I3 in release 8.1.

In Table 4.5 we see a worsening trend until the sprint 8.1.5 and an improving trend from sprint 8.1.5 till the end of the release. In none of the sprints were the planning or the estimates good; in sprints 8.1.4 and 8.1.6a the planning was bad or the estimates were lower than in the preceding sprints. In sprints 8.1.5 and 8.1.6b the planning was bad or the estimates were higher than in the preceding sprints. The planning and estimates were best in sprints 8.1.4 and 8.1.6b and worst in sprint 8.1.5.

Release 8.2

Sprint | Prediction [sp] | Commitment [sp] | I3 [%] | I2 [%]
8.2.1 | – | 34 | – | 71
8.2.2 | 32 | 69 | +53 | 51
8.2.3 | 18 | 42 | +56 | 50
8.2.4 | 25 | 36 | +30 | 100
8.2.5 | 25 | 42 | +39 | 100

Table 4.6: Indicator I3 in release 8.2

All of the sprints in release 8.2 were quite poorly planned. The planning and estimating in the release shows an improving trend. Despite the fact that the team overestimated its abilities, it managed to deliver more than was predicted. Therefore it seems that the bad predictability of the last two sprints was not caused by bad planning, but more likely by the great motivation of the team to finish the scope of the sprint, maybe even at the price of quality.


I4: The shape of the sprint burndown chart is an indicator of the predictability in the sprint and the adaptability of the team.

Release 8.1

[Sprint burndown charts for Sprints 8.1.3, 8.1.4, 8.1.5, 8.1.6a and 8.1.6b: remaining effort vs. ideal effort over the working days of each Sprint]

Figure 4.2: Sprint burndown charts in Release 8.1


Sprints 8.1.3, 8.1.4, 8.1.5, 8.1.6a – “It is too late”

Figure 4.2 shows all Sprint burndown charts in 8.1. The shapes of sprints 8.1.3 to 8.1.6a are the typical shape of “It is too late”. The team was late during the whole Sprint and didn’t complete its commitment. Several days before the end of the Sprint the team started to work harder, but it was still not enough. The team should have adapted much sooner. At the 8.1.3 retrospective, the team rated positively the “joint effort to finish the Sprint”. They perceived the course of the Sprint as a problem, concretely the facts that:

- the first story was closed as the last
- there was a lot of testing at the end of the Sprint
- one story could be reviewed and tested sooner
- we had to wait for the external resources (CTO) for some of the stories

In the Sprint 8.1.4 the team added a lot of stories during the Sprint. This can be interpreted as a badly conducted Sprint planning. This is supported by the statement from the retrospective “in the middle of the sprint there were developers who didn’t have work to do”. However, the team adapted pretty well to this situation and finished all the stories that were added to the Sprint. In the Sprint 8.1.5 the team didn’t deliver most of the stories. This is an indicator that the ability to adapt wasn’t the only problem in the Sprint. Also, the progress stayed unchanged during most of the Sprint and it was not possible to predict anything from the Sprint burndown chart. The team should have stopped after two or three days and immediately applied corrective actions. At the retrospective the team identified these process problems:

- chaos – missing overview of who is doing what
- some work items were deadlocks
- new definition of done causes problems with closing the stories

The situation in sprint 8.1.6a is a little different. It was OK on the first day after the start of the sprint; however, it was late for the rest of the sprint. The main problem was that the Sprint was so short that the team wouldn’t have been able to adapt even if they wanted to. One story was unfinished, and the reason was that at the time the team planned it into the sprint, they didn’t know it was blocked. The Scrum master should have removed the story from the Sprint immediately after finding out it was not possible to work on it.


Sprint 8.1.6b – “Nice team”

The Sprint 8.1.6b is an example of the “Nice team” chart. The team was late during the whole Sprint; however, they were able to adapt and finish all the stories. However, most of them were finished at the last minute, which can possibly mean low quality of the Increment. At the retrospective, the team identified these process items:

- great motivation to do stuff
- we have a tendency not to close stuff
- 1 story had 6 subtasks and it was necessary to test it all at once – everybody has to test
- things throwing YSOD (yellow screen of death) went to testing – developers should go through their code before they send it to testing

Release 8.2

[Sprint burndown charts for Sprints 8.2.1, 8.2.2, 8.2.3, 8.2.4 and 8.2.5: remaining effort vs. ideal effort over the working days of each Sprint]

Figure 4.3: Sprint burndown charts in Release 8.2

Sprints 8.2.1, 8.2.2, 8.2.3 – “It is too late”

The shape of Sprints 8.2.1 to 8.2.3 is the typical shape of “It is too late” already mentioned in Release 8.1. The team was late during the whole Sprint and didn’t complete its commitment. Several days before the end of the Sprint the team started to work harder, but it was still not enough. The team should have adapted much sooner.

In the middle of the Sprint 8.2.1 the team realized that one of the stories was much bigger than expected, so they changed the estimate from 5 SP to 13 SP. Better late than never; however, this behavior is certainly an indicator of bad team performance at the Sprint planning. The status of the Sprint was unchanged for most of the Sprint and therefore there wasn’t any way to track the progress of the team. There was no process insight from the Sprint retrospective. Some of this could have been caused by the absence of the Scrum master during the Sprint.

Sprint 8.2.2 shows a typical example of a team overestimating its powers. It could be either bad estimating or bad planning, but if the team had planned 34 SP less it would have been a successful Sprint. This was supported by the retrospective, where the team together with the product owner identified that it gives estimates that are too small. If the team had looked at the Sprint burndown chart, they could have adjusted the Sprint scope accordingly. The team identified at the retrospective that one of the stories was bigger than expected; the correct action would have been to remove it immediately from the Sprint and exchange it for a properly sized story. Also, one of the team members was estimating something different from reality.


The team showed an effort to adjust and finish all the stories near the end of the Sprint 8.2.3; however, it was still not enough. On top of that, for most of the Sprint the Sprint status stayed unchanged, making it impossible to track progress on the Sprint. The Sprint was also very short. The team identified that process problems were still in:

- organization of testing
- clearing out things at the planning meeting
- working on things that are not in the Sprint
- communication and expectations from the PO

Sprint 8.2.4 – “Nice team”

The Sprint 8.2.4 is an example of the “Nice team” chart. The team was late during the whole Sprint; however, they were able to adapt and finish all the stories. However, most of them were finished at the last minute, which can possibly mean low quality of the Increment. At the retrospective, the team identified as a success factor that it went the extra mile just to finish things and stayed at work until 9 pm on the day before the Sprint end.

Sprint 8.2.5 – “The management is coming”

This shape indicates that the team was doing some work during the whole Sprint but was not tracking it accordingly. The problem could be that the stories were too big for the Sprint, or that the team was not collaborating on finishing them very well, or that the way the team tracks its progress is not very good. The result is that the team can’t predict the end of the Sprint and can’t track the progress. The Scrum master should coach the team on why it is necessary to track the progress and how to track it. Also, closing the majority of the stories at the end of the Sprint can mean low quality of the Increment. This is supported by the item from the retrospective – “QA engineer is unhappy about the way the stories were closed.” The reason behind this was that one story was poorly tested and another story contained some unfinished text. At the retrospective, two success factors were identified behind the positive course of the Sprint:

- team didn’t spend too much time on innovation time (they adjusted the time they spent on innovation time in order for all work in the Sprint to be finished)
- the daily table was unified (and the progress on the Sprint was thus clearer)


I5: The release burndown chart is an indicator of the scope changes in the Release.

Release 8.1

Figure 4.4.: The release burndown chart for Release 8.1.

The burndown chart shows extensive changes in the scope of the release, which corresponds to the fact that the release was not planned in advance. In fact, 255 SP were added during the Release and 16 SP were removed, making a total 71% scope change. At the retrospective of 8.1.5 the team identified that it would need a more concrete goal for the whole Release period so that they would have an overview of the whole Performance project. A good high-level Release plan would solve this problem.


Release 8.2

Figure 4.5.: The release burndown chart for Release 8.2.

The situation in 8.2 is quite similar to the situation in 8.1. There was no release planning and therefore the progress on features in 8.2 was not tracked. A total of 120 SP was added during the release, making a 70% release scope change. At the 8.2 release retrospective several teams in the company realized that the success of the projects would be much higher if there was some release planning. This would also improve the communication and clear up expectations between the product owner and the team.


4.3.2. Collaboration on the team

I6: The average number of people at work divided by the average number of stories in process is a lagging indicator of the collaboration on the team.

Release 8.1

Sprint | avg. stories in process per day | avg. availability per day | I6 [avg. people per story]
8.1.3 | 3.2 | 6.5 | 2.0
8.1.4 | 4.3 | 7.2 | 1.7
8.1.5 | 5.2 | 7.5 | 1.4
8.1.6a | 1.5 | 8.0 | 5.3
8.1.6b | 6.4 | 6.9 | 1.1

Table 4.4: Value of the indicator I6 in Release 8.1

The collaboration indicated by the indicator I6 worsened continuously during the release. Only Sprint 8.1.6a broke the trend, since it was more collaborative than both Sprints 8.1.5 and 8.1.6b. The only two Sprints that fit the recommendation from chapter 3.4.4 that ideally 2-5 people collaborate on the same user story are Sprints 8.1.3 and 8.1.6a. At the 8.1.3 retrospective the team was quite positive about the collaboration; they said:

- we are collaborating more and synchronizing more
- we solve the problems together
- there is a joint effort to close the stories
- we are testing together
- we are 1 team with 2 technical leaders

At the 8.1.4 retrospective, the team identified several problems with the collaboration:

- the daily meeting is about solving technical problems
- in the middle of the Sprint, there were developers who didn’t have any work to assign even though there was some work

In 8.1.5 the collaboration was even worse; the team identified that:


- there was a chaos, no overview about who is working on what
- people were isolated and pessimistic
- team didn’t know about the acceptance criteria
- we haven’t drunk together as a team yet

In 8.1.6a, where the collaboration was improved to the value from 8.1.4, the team identified these success factors:

- we had a goal (common goal reinforces collaboration)
- great team and environment

Release 8.2

Sprint | avg. stories in process per day | avg. availability per day | I6 [avg. people per story]
8.2.1 | 1.9 | 5.5 | 2.9
8.2.2 | 4.8 | 6.0 | 1.3
8.2.3 | 2.6 | 5.9 | 2.3
8.2.4 | 1.4 | 6.3 | 4.5
8.2.5 | 5.2 | 5.0 | 1.0

Table 4.4: Value of the indicator I6 in Release 8.2

The best collaboration was in the Sprint 8.2.4 and the worst in Sprint 8.2.5. The collaboration started very well in the first Sprint with the value of 0.34 stories per person. At the 8.2.4 retrospective, the team identified:

- great daily meeting last 14 days
- one of the testers also started to develop
- the testing was good

Also, the low value of stories per person was influenced by the fact that the team was working on items not in the Sprint backlog – writing blog posts and helping the other team. The collaboration in Sprint 8.2.2 was one of the worst in the Release. At the retrospective the team mentioned:

- long daily meeting
- badly organized testing
- a lot of things on 1 person


- one story was not finished, because the responsible person was not collaborating much with other people in the Sprint

In Sprint 8.2.3, the collaboration got a little better; the team learned lessons from Sprint 8.2.2. The team kept the positive trend and Sprint 8.2.4 was actually the best in the Release regarding collaboration. The team identified these success factors:

- the communication and synchronization was significantly better
- the technical leader returned from USA – better organization
- the scrum board was finally unified

The positive trend was disrupted by Sprint 8.2.5, which was actually the worst in the Release with respect to the collaboration of the team. Most of the stories were moved to in progress at the beginning and stayed there until the end. At the retrospective, the team admitted that it is a problem for them to close the stories gradually, but that they do not know how to do it better. The analysis showed that the main reason for this is bad sizing of the stories. The collaboration coefficient (stories per person ratio) strongly correlates (79%) with the number of user stories that are not of proper size, i.e. between 1/6 and 1/10 of the velocity (Lawrence, 2012). The team should collaborate with the product owner on creating better-sized user stories at the Sprint planning, since proper sizing enables collaboration during the Sprint.
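
A minimal sketch of such a size check, assuming the 1/10 to 1/6 of velocity range cited above; the velocity and the story estimates are illustrative only.

# Sketch: flag user stories whose estimate falls outside the recommended range
# of 1/10 to 1/6 of the Sprint velocity (Lawrence, 2012).
def improperly_sized(story_points, velocity):
    lower, upper = velocity / 10, velocity / 6
    return [sp for sp in story_points if not lower <= sp <= upper]

velocity = 40                     # illustrative velocity in SP
estimates = [1, 3, 5, 5, 8, 13]   # illustrative story estimates in SP
print(improperly_sized(estimates, velocity))   # -> [1, 3, 8, 13]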

4.3.3. Efficiency of the process

I7: The average story cycle time is a lagging indicator of the efficiency of the process.
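
A sketch of how I7 could be derived from the tracking tool's timestamps; the story start and finish times below are hypothetical and serve only to illustrate the calculation.

from datetime import datetime, timedelta

# Sketch: the cycle time of a story is the time from moving it to "in progress"
# until it is closed; I7 is the average over all stories finished in the Sprint.
def average_cycle_time(stories):
    durations = [done - started for started, done in stories]
    return sum(durations, timedelta()) / len(durations)

stories = [  # (started, done) pairs, hypothetical
    (datetime(2014, 10, 1, 9), datetime(2014, 10, 6, 17)),
    (datetime(2014, 10, 2, 9), datetime(2014, 10, 9, 12)),
    (datetime(2014, 10, 6, 9), datetime(2014, 10, 13, 15)),
]
print(average_cycle_time(stories))   # -> 6 days, 13:40:00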

Release 8.1

Sprint | I7 [avg. story cycle time] | avg. size of the story [SP]
8.1.3  | 7d 11h | 13
8.1.4  | 5d 18h | 10
8.1.5  | 6d 7h  | 7
8.1.6a | 2d 18h | 4
8.1.6b | 5d 15h | 6

Table 4.6: Values of indicator I7 for release 8.1.


As we can see from the table, the story cycle time was lowest in Sprint 8.1.6a. However, this is probably only because the finished stories were very small. The worst story cycle time was in Sprint 8.1.3, again probably due to the large size of the stories. Overall, the trend of I7 was an improvement during the whole release.

Release 8.2

Sprint | I7 [avg. story cycle time] | avg. size of the story [SP]
8.2.1 | 2d 19h | 8
8.2.2 | 6d 17h | 6
8.2.3 | 6d 1h  | 11
8.2.4 | 7d 16h | 9
8.2.5 | 7d 18h | 6

Table 4.7: Values of indicator I7 for release 8.2.

With the exception of Sprint 8.2.3, the cycle time in 8.2 was worsening during the whole release. The most efficient Sprint was Sprint 8.2.1 and the least efficient Sprint 8.2.5. The low value of indicator I7 in 8.2.1 was, however, probably caused by the low complexity of the stories in 8.2.1. For example, a spike estimated at 13 SP was closed after 1 day in process. The improved value of I7 in Sprint 8.2.3, on the other hand, was probably the result of process improvements in that Sprint. After the bad result in 8.2.2, the team decided to apply these improvements in Sprint 8.2.3:

- the Scrum master will make the daily meeting shorter
- all, together with the Scrum master, will ensure that the testing is organized well
- start using the upgraded estimate scale
- start doing the Planning more properly


5. Conclusion

5.1. Summary

When you want to start measuring the progress of your agile teams, you first have to decide on the primary metric. This will be the reference metric for balancing all other metrics. Hartmann suggests measuring the business value delivered (2006). It can be calculated using return on investment, internal rate of return or net present value. Gustafsson identifies economic influence as the main goal of a measurement system (2011). He further describes proxy variables as variables that indicate no direct financial outcome, but have a strong relation to it. Hartmann introduces a special name for this type of metrics, “diagnostics”, as a reminder that these are used to “diagnose and improve” the processes on a local level and should be used only if needed and only for a fixed length of time, to avoid measuring for measurement’s sake (2006). Hundermark (2009) suggests a set of metrics from which you choose a “vital few” that actively enhance the desired behavior.

At Kentico software, three metrics have been used for optimizing the processes at the company level: monthly profit, net promoter score and a customer satisfaction survey. However, these were not related to the metrics used at the team level. Net promoter score and the customer satisfaction survey are good metrics for measuring quality for the customer, and it is suggested that they are designed to distinguish the contribution of individual teams to the overall score. The focus shift from cost-centered to value-centered metrics in agile (Hartmann, 2006) suggests using business value delivered (and ROI) as a driving metric instead of monthly profit.

If a team deploys continuously with no iterations (Kanban-style), the primary metric for optimizations at a local level is lead time. It can be accompanied with measurements of queues and work in progress, together visualized in a cumulative flow diagram (these have been categorized in this thesis as “lean metrics”). For teams that work in iterations and deploy at the end of the release, the primary metric for optimizations at a local level is velocity. It can be accompanied with other predictability metrics like the sprint burndown chart, rate of features delivered and velocity chart. For measuring the collaboration on the team, work-in-process could be used. For measuring the efficiency of the process, story cycle time could be used. In order to detect gaming of the velocity, technical quality also has to be measured; this can be done with running automated tests and tracking technical debt.

The only metric being actively used at a local level in Kentico software was velocity. Several other measurements were passively tracked with the tracking software. These are the metrics analyzed in this thesis, namely the velocity chart, sprint burndown chart, rate of features delivered, work-in-process and story cycle time. The results from these metrics show that using these measurements, especially in combination with soft data from retrospectives and one-on-one sessions, will help improve the overall agile process.
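
For illustration, the business-value measures mentioned above (return on investment and net present value) can be sketched as follows; all figures are hypothetical.

# Sketch: return on investment (ROI) and net present value (NPV)
# for a hypothetical feature investment.
def roi(gain, cost):
    return (gain - cost) / cost

def npv(rate, cash_flows):
    # cash_flows[0] is the initial investment (negative), the rest are yearly returns
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

investment, yearly_returns = 100_000, [45_000, 45_000, 45_000]
print(round(roi(sum(yearly_returns), investment), 2))      # -> 0.35
print(round(npv(0.10, [-investment] + yearly_returns)))    # -> 11908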

5.2. Interpretation of the results

The process of using these metrics is the following. Velocity is the primary metric. It is most useful in the long term, and the rolling average is used to identify deviations in individual Sprints. If the deviation is high (in general, more than 25%), the team has low predictability and the problem should be addressed using the other metrics. Some sources say that the velocity can be optimized so that the deviations stay below 10%. Therefore, basically any Sprint with a deviation higher than 10% from the predicted value should be analyzed using the other metrics and at the retrospective.

The velocity chart and the completion rate of the stories show whether the problem was caused by bad planning, bad estimates or disruptions during the Sprint. The sprint burndown chart further shows problems with predictability on the Sprint level and the team’s ability to adapt and face the challenges during the Sprint. The collaboration and efficiency metrics, work-in-process and story cycle time, have an influence on the velocity. Little’s law says that the average number of items in a queuing system L equals the average waiting time in the system for an item W multiplied by the average number of items arriving per unit time ʎ (Chhajed, 2008):

L = ʎ W

Applied to our situation, the average work in process (WIP) in story points equals the average story cycle time (CT) multiplied by the average velocity (V). Hence, velocity equals WIP divided by CT.

WIP = CT*V → V = WIP/CT

To modify it according to the indicators in this case study, where we use work-in-process as the number of issues per day and person and cycle time per story point, we need to measure velocity as a velocity/availability ratio. In some literature this is called the focus factor (Downey, 2013). The strong correlation of availability and velocity in this case study (70%) confirms that this ratio is of better use for predicting the velocity than the velocity itself. The modified equation then is:


V/A ~ (WIP/PIW) / (CT/SoS) = (WIP * SoS) / (PIW * CT)

where V is the velocity measured in story points, A is the availability of the team measured in man-days, WIP is the number of items in process per day, SoS is the size of the story, PIW is the number of people in work and CT is the cycle time of a story, all taken as long-term averages. This is also in compliance with the measurements taken (see table 5.1):

Release | left side of the equation | right side of the equation | accuracy [%]
8.1 | 0.67 | 0.73 | 92
8.2 | 0.66 | 0.61 | 93

Table 5.1: Little's law for releases 8.1 and 8.2

The small difference between the left side and the right side of the equation is caused by the unfinished stories that were counted in WIP, but not in velocity. Also, the availability does not always equal the Sprint length multiplied by the average number of people in work, because of work during weekends, overtime and flexible working hours.
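
The consistency check behind table 5.1 can be expressed as a short sketch. The accuracy is taken here as the ratio of the smaller to the larger side of the equation, which is an assumption, and the rounded inputs from the table are used, so the result for 8.2 comes out one percentage point lower than reported.

# Sketch: comparing the two sides of the modified Little's law relation,
# V/A on the left and (WIP * SoS) / (PIW * CT) on the right.
def accuracy(left_side, right_side):
    return min(left_side, right_side) / max(left_side, right_side) * 100

sides = {"8.1": (0.67, 0.73), "8.2": (0.66, 0.61)}   # rounded values from table 5.1
for release, (left, right) in sides.items():
    print(release, round(accuracy(left, right)))      # -> 8.1 92, 8.2 92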

The table below summarizes all indicators measured in this case study.

(I1-I5: predictability, I6: collaboration, I7: efficiency)

Sprint | I1 [%] | I2 [%] | I3 [%] | I4 [shape] | I5 [%] | I6 [people] | I7 [days]
8.1.3  | -  | 89  | -   | "It is too late" |    | 2.0 | 7
8.1.4  | 8  | 121 | -12 | "It is too late" |    | 1.7 | 5
8.1.5  | 37 | 47  | 36  | "It is too late" |    | 1.4 | 6
8.1.6a | 53 | 52  | -21 | "It is too late" |    | 5.3 | 2
8.1.6b | 43 | 153 | 12  | "Nice team"      | 71 | 1.1 | 5
8.2.1  | -  | 71  | -   | "It is too late" |    | 2.9 | 2
8.2.2  | 8  | 51  | 53  | "It is too late" |    | 1.3 | 6
8.2.3  | 12 | 50  | 56  | "It is too late" |    | 2.3 | 6
8.2.4  | 30 | 100 | 30  | "Nice team"      |    | 4.5 | 7
8.2.5  | 39 | 100 | 39  | "The management is coming" | 70 | 1.0 | 7

Table 5.2: Indicators dashboard


Three process dimensions of agile software development have been measured: predictability of the delivery, collaboration on the team and efficiency of the process. Each of them has its driving metric (I1, I6 and I7, respectively). For the predictability, four helping metrics have also been identified (I2, I3, I4 and I5). More diagnostics can be created if needed.

Long-term improvement

The significance of the table above goes beyond just comparing individual Sprints. It can also be used as an indicator of long-term improvements in the team.

Sprint | I1 [%] (predictability) | I6 [people] (collaboration) | I7 [days] (efficiency)
8.1.3  | -  | 2.0 | 7
8.1.4  | 8  | 1.7 | 5
8.1.5  | 37 | 1.4 | 6
8.1.6a | 53 | 5.3 | 2
8.1.6b | 43 | 1.1 | 5
8.2.1  | -  | 2.9 | 2
8.2.2  | 8  | 1.3 | 6
8.2.3  | 12 | 2.3 | 6
8.2.4  | 30 | 4.5 | 7
8.2.5  | 39 | 1.0 | 7

Table 5.3: Long-term improvements during the releases

The table shows the development of the indicators across the Sprints. We can see a negative trend in overall predictability, a positive trend in collaboration and, again, a negative trend in efficiency.

Metric levels

It is important to discern the different levels of usage of the metrics. In Kentico we had four levels of management: individual, team, department and organization. The individual level is not suitable for agile metrics, since it can lead to sub-optimization. The team, department and organization levels correspond to the widely known project, program and portfolio levels. All of the metrics in the table are suitable for the project level. For the program and portfolio levels, a different table would perhaps be more suitable:


Sprint | velocity [SP] | availability [md] | velocity/availability [SP/md] | prediction [SP] | deviation [%]
8.1.3  | 67 | 98 | 0.68 | -  | -
8.1.4  | 57 | 77 | 0.74 | 53 | 8
8.1.5  | 40 | 77 | 0.52 | 55 | -37
8.1.6a | 14 | 33 | 0.42 | 21 | -53
8.1.6b | 46 | 47 | 0.98 | 26 | 43
8.2.1  | 24 | 53 | 0.45 | -  | -
8.2.2  | 35 | 71 | 0.49 | 32 | 8
8.2.3  | 21 | 39 | 0.54 | 18 | 12
8.2.4  | 36 | 51 | 0.71 | 25 | 30
8.2.5  | 42 | 44 | 0.95 | 25 | 39

Table 5.4: Dashboard customized for product management

The table is designed to help the management understand the meaning of velocity as an indicator of predictability. Ideally, the dashboard would also contain indicators for quality and value; this, however, goes beyond the scope of this thesis.
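
A sketch of how the columns of this dashboard can be derived from velocity and availability alone. The prediction below uses a rolling average of the focus factor over the last three Sprints of the release and the deviation is taken relative to the actual velocity; this is an assumed reading of the table, although it does reproduce the Release 8.1 rows.

# Sketch: focus factor (velocity/availability), a velocity prediction from the
# rolling average of the focus factor over the previous Sprints (window of 3),
# and the deviation of the prediction relative to the actual velocity.
def dashboard(sprints, window=3):
    rows, history = [], []
    for name, velocity, availability in sprints:
        focus = velocity / availability
        if history:
            predicted = sum(history[-window:]) / len(history[-window:]) * availability
            deviation = round((velocity - predicted) / velocity * 100)
            rows.append((name, round(focus, 2), round(predicted), deviation))
        else:
            rows.append((name, round(focus, 2), None, None))   # first Sprint of a release
        history.append(focus)
    return rows

release_8_1 = [("8.1.3", 67, 98), ("8.1.4", 57, 77), ("8.1.5", 40, 77),
               ("8.1.6a", 14, 33), ("8.1.6b", 46, 47)]
for row in dashboard(release_8_1):
    print(row)   # matches the 8.1 rows of the table above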

5.3. Future research

There is much room for expanding this research to the other agile dimensions, especially the key metric (return on investment), quality and value for the customer. Also, since this thesis and the design of the metrics were the independent effort of one individual, for future research a more collaborative approach to designing the metrics is suggested, involving the team (for feedback on using the metrics), the management (for understanding the key metric) and other people responsible for corresponding areas (e.g. a QA coordinator for quality metrics).

This case study was written based on data collected in the period from nine months to three months before the time of writing, and at that time many of the facts had already been forgotten. Therefore I suggest collecting the data regularly, e.g. after each Sprint, and immediately applying corrective actions based on the measurements if needed, because later they can be forgotten.

Another area which should be extended in future studies is the effect of management and team decisions on the process and the product. The effect of decisions from retrospectives on the process was already slightly touched upon in this thesis; however, in the future this could be evaluated more systematically. This way a decision-impact model could be created, serving to better understand the processes in agile software development. It would also satisfy the empirical approach to understanding improvement in agile: to experiment with the independent variables and observe or measure the outcomes of the dependent variables.

5.4. Final word

This thesis shows that at Kentico software there is room for improving the usage of agile process and product metrics. The topic of agile metrics is still a novelty in the world of software engineering. There is much room for on-site research in IT companies that adopt agile methods on how to measure the results they bring. At the time of writing this thesis, the usage of agile methods is still expanding and the trend shows a rising interest in agile metrics. This can be seen in the fact that most of the literature trying to put together best practices on agile metrics is from the last two years. The use of agile metrics, however, should always be in compliance with the agile principles, and the hard data should be combined with the soft data (from one-on-ones and retrospectives). As Fowler wrote in his article on the Agile Manifesto (Fowler, 2001):

“Trust in people, believing that individual capability and group interaction are key to success extends to trusting teams to monitor and improve their own development processes.”


6. Bibliography

1. ABRAHAMSSON, Pekka; et al. Agile software development methods: Review and analysis. VTT Publications 478. 2002.

2. AMBLER, Scott and M. LINES. Going Beyond Scrum: Disciplined Agile Delivery. Disciplined Agile Consortium. White Paper Series. 2013.

3. BALLÉ, Freddy and Michael BALLÉ. Lean development. Business Strategy Review 2005, 16.3: 17-22.

4. BECK, Kent, et al. Manifesto for agile software development [online]. 2001. [cited 2014-12-10]. Available from Internet:

5. BECK, Kent. Embracing change with extreme programming. Computer, 1999, 32.10: 70-77.

6. BOYD, Bob. Scrum metrics. Implementing agile: Tips and best practices on implementing agile. 2011-26-06. [cited 2014-12-15]. Available from Internet:

7. CHHAJED, D. and TJ. LOWE (eds.). Building Intuition: Insights From Basic Operations Management Models and Principles. Springer Science+Business Media, LLC, New York, 2008, pp. 81 – 100.

8. CHOW, Tsun and Dac-Buu CAO. A survey study of critical success factors in agile software projects. Journal of Systems and Software, 2008, 81.6: 961-971.

9. COHN, Mike. User stories applied: For agile software development. Addison-Wesley Professional, 2004. 268 p. ISBN 978-0-321-20568-1.

10. COHN, Mike. Agile estimating and planning. (presentation). Mountain Goat Software, LLC. 2008.

11. DOWNEY, Scott, and Jeff SUTHERLAND. Scrum Metrics for Hyperproductive Teams: How They Fly like Fighter Aircraft. In System Sciences (HICSS), 2013 46th Hawaii International Conference on, pp. 4870-4878. IEEE, 2013.

12. Control chart [online]. Family Practice Notebook, LLC., © 2014 [cited 2014-12-11]. Available from Internet:


13. FOWLER, Martin and Jim HIGHSMITH. The agile manifesto. Software Development. 2001. 9.8: 28-35.

14. GUSTAFSSON, Johan. Model of Agile : A Case Study. Goteborg, 2011. Master’s thesis. Chalmers University of Technology, Department of Computer Science and Engineering.

15. HARTMANN, Deborah and Robin DYMOND. Appropriate agile measurement: using metrics and diagnostics to deliver business value. In: Agile Conference, 2006. IEEE, 2006. p. 6 pp.-134.

16. HIGHSMITH, Jim. Agile software development ecosystems. Addison-Wesley Longman Publishing Co., Inc., 2002-05-26. ISBN 0-201-76043-6.

17. HUNDERMARK, Peter. Agile metrics. In: Measuring for Results. [webcast, online] ScrumSense, 2009. Available from Internet:

18. HUNT, John. Feature-Driven Development. Agile Software Construction, 2006, 161-182.

19. JOYCE, David. Lead Time vs Cycle Time. In: Systems thinking, lean and kanban [online]. 2009-04-18. [cited 2014-11-12 14:08]. Available from Internet:

20. KNIBERG, Henrik. Scrum and XP from the Trenches: How we do Scrum. C4Media, 2007. 140 p. ISBN 978-1-430-32264-1.

21. KNIBERG, Henrik and Mattias SKARIN. Kanban and Scrum-making the most of both. Lulu.com, 2010.

22. KOCUREK, Dusan. Understanding the Scrum Burndown chart. In: Methods & Tools [online]. 2011. [cited 2014-12-23]. Martinig & Associates, © 1995-2014. Available from Internet:

23. KUNZ, Martin; Reiner R. DUMKE and Niko ZENKER. Software metrics for agile software development. In: Software Engineering, 2008. ASWEC 2008. 19th Australian Conference on. IEEE, 2008. p. 673-678.


24. LAWRENCE, Richard, 2012. New story splitting resource. In: Agile for all. [online] 2012-01-27. [cited 2014-12-27]. Available from Internet:

25. POPPENDIECK, Mary and Tom POPPENDIECK. Lean software development: an agile toolkit. Addison-Wesley Professional, 2003.

26. REYNOLDS, Scott. Agile software development – buzz word or real business benefit? Part 5 – Conclusions. [online]. 2014. [cited 2015-01-02]. White Springs Limited, © 2014. Available from Internet:

27. SCHWABER, Ken and Jeff SUTHERLAND. The Scrum Guide [online]. 2013. [cited 2014-12-10]. Available from Internet:

28. SWANSON, Brad. Measuring the success of your agile transformation webinar, part 1. In: AgileLIVE™ Thought Leadership Webinar Series. [webcast, online]. 2014. [cited 2014-11-15]. VersionOne, Inc., © 2013. Available from Internet: .
