Master Thesis Software Engineering Thesis no: MSE-2011-55 Month Year

Finding common denominators for agile software development: a systematic literature review

Saripalli Phani Shashank David Hem Paul Darse

School of Computing Blekinge Institute of Technology SE-371 79 Karlskrona Sweden This thesis is submitted to the School of Engineering at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Software Engineering. The thesis is equivalent to 20 weeks of full time studies.

Contact Information: Author(s): Saripalli Phani shashank E-mail: [email protected] David Hem Paul Darse E-mail: [email protected]

University advisor: Dr Tony Gorschek Software Engineering

School of Computing : www.bth.se/com Blekinge Institute of Technology Phone : +46 455 38 50 00 SE-371 79 Karlskrona Fax : +46 455 38 50 57 Sweden II

III ABSTRACT

Context: In the last decade, Agile software development methods were proposed as a response to plan-driven methods. The main aim for this paradigm shift was to cope up with constant changes. Core values that are central to agile methods are described in Agile Manifesto. Agile practices define how agile methods are implemented in practice.

Objectives: This study aims to investigate what practices were considered and adopted as part of agile philosophy, and identify evidence on the use of agile practices in reference to what defines agile.

Methods: We conducted a systematic literature review. The review includes 186 primary studies published between 2000 and 2010. Quality assessment based on rigor and relevance as identified in the selected papers has been conducted. Non-empirical papers were classified based on themes they addressed. Empirical papers were analyzed based on two factors: 1. if they described the use of agile practices for a software project/product in general, 2. if they described the use of agile practices for a specific purpose/activity

Application type, team size and experience of subjects in the primary studies were extracted. The alignment between practices reported in the studies with the core agile values is analyzed.

Results: A total of 119 studies were conducted in industry and 67 in academia. Nearly half the papers published by researchers are non-empirical and present analysis of agile practices in various topics. Over half of the empirical papers were focused on evaluation/assessment of a specific aspect of agile. ‘Pair programming’ received most attention in this direction. Among the empirical studies that described agile practices for academic projects, ‘Pair programming’ and ‘Test driven development’ were the most considered practices. Among the 119 studies conducted in industry, 93 studies described the use of agile practices for the overall project/product development while the remaining studies described experiences of single practices like Pair Programming or the modification/adoption of agile for non- software aspects in software projects. Practices adopted have been ranked based on team size, practitioners’ experience and application type. A method for agile assessment based on core agile values is developed and exemplified.

Conclusions: Practices that are considered agile vary with context although ‘Pair programming’, ‘Continuous integration’, ‘Refactoring and Testing continuous throughout the project are the most common practices used. ‘Test driven development’ was not highly adopted by practitioners. But it was not clear if test cases were written before code construction in projects where continuous testing was performed. However, this was completely opposite in academic projects. Practices ‘On-site customer’ and ‘Stand-ups’ are frequently modified. In many of the studies inspected, practices adopted were not aligned with agile values. Although practices related to technical aspects of software development are in place, agile practices that focus aspects like ‘working together’ and ‘motivated individuals’ are often not used in practice. Moreover, many of the studies were not reported to an extent that it was possible to draw any inferences on the usability/applicability, benefits and limitations of the agile practices. To this end, a major implication is to improve the quality of the studies and a need for a common framework to report the results of agile practices’ adoption.

Keywords: Agile practices, agile software development, Systematic literature review

IV ACKNOWLEDGMENTS

We would like to extend our gratitude to Dr. Tony Gorschek for giving the opportunity to work on our Master thesis under his guidance. His guidance, feedback and support throughout the thesis has been immensely valuable. We would like to thank our parents for their continuous support, motivation and patience during our course of time in Sweden. We are thankful to Abhinaya, Rajesh and Revanth for their kind support all the time.

V 1

List of tables ...... 2! List of figures ...... 3! 1! Introduction ...... 4! 2! Background and related work ...... 5! 2.1 Agile software development ...... 5! 2.2 Related work ...... 6! 3! Research methodology ...... 7! 3.1 Search strategy ...... 8! 3.2 Study selection criteria ...... 8! 3.3! Primary study selection ...... 10! 3.4! Data extraction & study quality assessment ...... 10! 3.5! Procedure for assessing research rigor and relevance ...... 11! 3.6! Validity threats ...... 12! 3.6.1 Publication bias ...... 12! 3.6.2 Selection of primary studies ...... 12! 3.6.3 Data extraction ...... 12! 4 Results and analysis ...... 13! 4.1! Overview of the studies ...... 13! 4.1.1 Publication year ...... 13! 4.1.2 Research method ...... 13! 4.1.3 Context ...... 13! 4.1.4 Study design ...... 13! 4.1.5 Validity evaluations ...... 13! 4.1.6 Rigor ...... 13! 4.1.7 Relevance ...... 14! 4.1.8 Practice classifications ...... 15! 4.2 What agile practices are most cited in literature? (RQ1) ...... 17! 4.3! What agile practices are most adopted in practice? (RQ1.1) ...... 17! 4.3.1 Research studies ...... 20! 4.4 What are the context specific aspects of agile practices reported? (RQ1.2) ...... 22! 4.4.1Team size ...... 22! 4.4.2 Application type ...... 24! 4.4.3 Context specific practices ...... 26! 4.4.4 HRR ...... 29! 4.5 What modifications to agile practices have been reported? (RQ 1.3) ...... 29! 4.6 What is the strength of evidence available on the use of agile practices in reference to agile philosophy? (RQ2) .. 31! 5 A method for agile assessment based on core agile values (MAA-CAV) ...... 34! 5.1 Step 3 - Calculating measures ...... 34! 5.2 Step 4 – Measure representation ...... 34! 6 Conclusion ...... 37! References ...... 37! Systematic literature review references ...... 39! Appendix A Overview of studies: extended material ...... 44! Appendix B Other practices ...... 45! Appendix C Other empirical studies in academia ...... 49! Appendix D Practice modifications/adaptations ...... 51! Appendix E MAA - CAV examplification ...... 54!

2

LIST OF TABLES

Table 1 Differences between plan-driven and agile 5 Table 2 Agile principles 5 Table 3 Research Questions 8 Table 4 Search String 8 Table 5 Data Extraction 11 Table 6 Research Rigor 12 Table 7 Relevance 12 Table 8 Company Size Reported 14 Table 9 Study Rigor 14 Table 10 Relevance Across Studied 15 Table 11 Practice Classifications 15 Table 12 Citation Count of Practices 17 Table 13 Practices Found in Empirical Studies 17 Table 14 Research Studies on Software Development in General 20 Table 15 Customer Practices Proposed by Researchers 21 Table 16 Small and Large Teams 23 Table 17 Top Ranked Practices for Small Teams 24 Table 18 Top Ranked Practices for Large Teams 24 Table 19 Application Type 24 Table 20 Practice Ranking and Application Type 26 Table 21 Agile Practices Supporting Non-Technical Activities 26 Table 22 Empirical Studies Focusing on PP 27 Table 23 Empirical Studies with Focus on Software Development Process Model 28 Table 24 Empirical Studies with Focus on TDD 28 Table 25 Communication Practices 29 Table 26 Customer Collaboration Adaptations 29 Table 27 Adaptations for Stand-ups 30 Table 28 Adaptations for Project Progress 30 Table 29 Adaptations for Conducting Retrospectives 31 Table 30 Continuous Testing in Agile 31 Table 31 Mapping Values to Agile Principles 32 Table 32 Value Alignment Among Studies with Rigor>4 32 Table 33 Value Alignment Among Studies with Rigor<3 33 Table 34 Value Alignment Among Studies with Rigor=3 33 Table 35 Measure Definitions for Values and Principles 34 Table 36 Measure Illustration 35 Table 37 Template for Agile Assessment 36

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 3

LIST OF FIGURES

Fig. 1 Review Process 7 Fig. 2 Search Strategy 8 Fig. 3 Primary Studies Selection 9 Fig. 4 Distribution of Publications Across Years 14 Fig. 5 Research Methods 14 Fig. 6 HRR 15 Fig. 7 Team Size and Practitioner’s Experience 22 Fig. 8 The Four Steps of MAA-CAV 34 Fig. 9 Levels in Agile Assessment 36 Fig. 10 Agile Assessment Visualization 36

4

Finding common denominators for agile software development: a systematic literature review

Phani Shashank Saripalli and David Hem Paul Darse

——————————  ——————————

1 INTRODUCTION OFTWARE is an important part of our daily life and [24], and the Dynamic software development method San important factor for all industries [3]. Software de- (DSDM) [25]. These methods marked a new trend in the velopment is a human centered and knowledge inten- way software was developed [9]. The main contrasting sive activity [7]. In today’s world software development is differences between plan-based methods and agile in- fast paced because of frequently changing customer needs clude fundamental assumptions in software develop- [1]. Failing to keep the development pace may reduce the ment, management style, knowledge management, com- probability of market dominance [2]. Meeting customers’ munication, organizational structure and quality control needs is also equally important. Changing customer [54]. Agile methods provide a compromise between no needs, i.e. product requirements (functionality and quali- planning at all, and too much planning with ability for ty), crossing allotted budget are also common to software information systems evolution [27]. Agile practices define industry. Therefore, software industry should respond to how agile principles are implemented in practice [28]. the challenge of requirements volatility and in turn Agile practices are the various ways of working put into should be flexible enough to accommodate late changes practice to gain agility, i.e. the ability to increase devel- to the product [8]. There has been an increase in the atten- opment pace and respond to changes. tion on software development methodologies by re- Publications show an increase in the evidence on the searchers and practitioners [4][5], to bring fastness and adoption/use of agile methods. For example, surveys [29] methodological improvements. This was aimed to ad- [30] gave a bigger picture of the adoption of agile meth- dress the challenges and bottlenecks in the early devel- ods. We can find a variety of contexts in which agile opment methodologies, which often have been criticized adoptions have taken place: small and medium sized for being heavy and involving large amounts of docu- companies [29], large and distributed environments [31] mentation [9] [10] [11]. Plan-driven software development [32]. Evidence shows that agile methods are being used in is driven by rigorous planning and is document centric. a variety of domains like embedded systems, and telecom These methods contain sequential steps (phases). A prom- [29] [33]. Initially agile methods were considered suitable inent plan-driven development method is the waterfall for only small teams developing simple applications [44] model [1] suggested by Royce [13]. Other examples of [45]. Lately it has been found that agile methods can be plan-driven software development are the Rational Uni- suitable for large teams developing complex applications fied Process (RUP) [14] and the V-Model [15]. Plan-driven [45] [46]. Experiences show certain agile practices are methods have been criticized on account of late delivery adopted and used to complement the existing develop- of the software, limited budget constraints and lack of ment process and sometimes methods are modified to fit flexibility in welcoming late requirements [8] [16] [18]. the development context [34] [35]. In reality, adoption of To sustain a competitive advantage, software industry agile methods and practices depend upon the develop- should continuously improve their processes and practic- ment context, i.e. the type of software itself and other es and also bring change by adopting new development characteristics like, development effort, team size etc. [35]. practices [52]. A new trend of software development has It is observed that companies looking for implementing emerged to address the challenges in plan-driven devel- agile methods and practices do not have the same level of opment. This is known as agile software development. consideration towards the existing agile practices [36]. Agile development consists of several methods, also Krznaik et al. [37] coined the phrase “working set” which known as agile methods, proposed in the last decade with refers to the set of agile practices that bring out the best the aim of making software development lighter and fast- impact in a project. Over the years, the concept of select- er [9] [10]. These were presented in the form of “Agile ing agile practices to suit the projects in hand has in- Manifesto” [19]. Agile methods were developed based on creased [37] [38]. However, evidence of what agile prac- the four core agile values and twelve principles listed in tices work(ed) well together, and in what contexts, is The Agile Manifesto. The most referenced methods are missing. Also, there has not been much research in defin- Extreme Programming (XP) [20], Scrum [21], [22], ing what practices can be considered within the broad Crystal Methodologies [23], Feature-driven Development umbrella term ‘agile’ [29]. The Forrester report [29] states

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 5 that confusion around what constitutes ‘agile’ increase as RUP is not considered agile as this process heavily de- the awareness of agile development grows. Practices are pends on planning and documentation [1]. Contrasting transient in nature, i.e. one can find introduction of new differences between plan-driven and agile software de- practices or de-commitment of practices used in previous velopment are given in Table 1. projects [37]. It is also found that project success depends on the choice of agile practices [43]. A recent review on TABLE 1 agile methods and their implementation by Dybå and DIFFERENCES BETWEEN PLAN-DRIVEN AND AGILE Dingsøyr [39] revealed that agile methods XP and Scrum Dimension Plan-driven Agile received the most attention in practice and other methods Assump- Output well Output not known did not receive much attention. Their review did not fo- tion/Planning defined and concretely until cus on practices that were implemented as part of these planned from the solution is methods and the context variables (for e.g. company size). the beginning developed. A The evidence of how agile development is carried in prac- high-level plan- tice, practice selection and the impact is relatively nascent ning is done and and weak [40]. detailed planning Inspired by the phrase “working set” [37] and the state in iterations. of empirical evidence on agile development [39] [40] we conducted a Systematic Literature review (SLR) [42] to Requirements Detailed re- Welcome new & investigate practices that were implemented as part of quirements late requirements. agile philosophy and methodology, identifying relevance specification Regular contact issues for implementation based on the context variables and agreement with customer for found in the review, and finally to identify how compa- in the form of product features. nies attaint the core agile values described in The Agile contract. manifesto in their endeavors to be ‘agile’. We follow the De- Detailed and Not initially de- idea of Evidence-Based Software Engineering (EBSE) par- sign/Architectu accomplished in tailed and contin- adigm [41] to assess industrial relevance. re a specific phase. uous architectural The remainder of the paper is organized as follows: Back- evaluation. ground and related work is given in Section 2, research Development Accomplished Throughout the methodology in Section 3. Section 4 is dedicated to results in one phase project and in and analysis of the study. In Section 5, we described our and is specifica- various iterations. proposed method for Agile assessment based on the four tion-driven. Code changes core agile values (MAA-CAV). The method is illustrated based on new using a case example. In section 6 a short discussion on requirements from our study is given. Conclusions are given in Section 7. customers. Extended material to our study is given in Appendices A- Verification & Performed after Testing done E Validation the coding throughout the phase. Formal project. No specif- 2 BACKGROUND AND RELATED WORK roles (testers, ic roles. In some inspectors) are cases, testing is 2.1 Agile software development available. performed in par- In the last decade several software development methods allel with devel- were proposed as a response to plan based methods to opment. make software development faster and lighter [9]. “Agile development” was coined in the year 2001 [47]. This was Agile Manifesto states twelve principles that describe presented in the form of what is known as The Agile how agile values can be attained [28]. These are given in Manifesto [19]. The term “paradigm” was used by Kuhn Table 2. [48] to describe a change in coherent tradition of scientific research that includes research agenda, techniques and TABLE 2 knowledge. A shift in paradigm took place in order to AGILE PRINCIPLES tackle the problem of requirements volatility, and to de- ID: Agile velop software in iterations and new requirements can be PRINCIPLE accumulated [49].! Principle (AP) Customer satisfaction through early The Agile Manifesto defines four values that form the AP01 core of agile software development. These are the follow- and continuous delivery of software. Welcome changing requirements, even ing: (1) Individuals and interactions over processes and tools, AP02 (2) Working software over comprehensive documentation, (3) late in development. Customer collaboration over contract negotiation and (4) Re- AP03 Deliver working software frequently. sponding to change over following a plan. These four values AP04 Business and developers work together. Build projects around motivated indi- describe the contrasting differences between plan-driven AP05 software development and agile software development. viduals.

6

AP06 Face-to-face conversation. el (CMM). No strong evidence or support for the intro- AP07 Working software. duction of agile methods was found at that time. They Maintaining constant pace continuous- also stated that agile methods would not rule out tradi- AP08 ly. tional plan-driven methods. Cohen et al. also observed AP09 Technical excellence. that context variables, or factors such as team size and AP10 Simplicity. application domain would mainly affect the adoption of AP 11 Self-organizing teams. agile methods and selection. AP 12 Continuous reflection. Erickson et al. [54] conducted a review on XP. They found that there was only a small number of empirical In the context of Software Development, agility can be studies on the success of XP. They identified a need to defined as “agility means to strip away as much of the heavi- study XP practices in order to find what practices work in ness, commonly associated with the traditional software devel- practice. opment methodologies, as possible to promote quick response to A recent review of agile methods by Diva and changing environments, changes in user requirements, acceler- Dingsøyr [39] in 2008 identified 33 primary studies. They ated project deadlines and the like” [50]. Agile practices de- focused on empirical studies, and how agile development fine how agile principles are implemented in practice [9]. was conducted. They classified the studies into four cate- In the context of software business, practices can be de- gories: introduction and adoption, human and social fac- fined “to relate to the different well established ways that dif- tors, perceptions of agile methods, and comparative stud- ferent development companies have cultivated to cope with re- ies. XP was found to be the most prominent agile method quirements posed by customers, business environment, market in terms of citations and also adoption. Further, Scrum section, and software products” [51]. Each agile method has was found to be a useful project management method in its own set of practices so that software development is industry. State of the art and practice suffers from low aligned with agile values and principles listed in Agile evidence as the majority of the studies suffer from low Manifesto. For Example, XP has twelve practices. Practic- research rigor and also, the majority of studies were not es are defined to implement agile principles. Agile meth- performed with experienced agile teams. ods have their own set of practices describing various Laanti et al. [55] conducted a survey at Nokia, a large ways of implementing agile principles. telecom company to explore employees’ opinions on agile Agile practices have transient nature [37]. There is no adaption. Their survey received positive feedback on ag- definite set of practices that are used, instead practices are ile methods, namely higher job satisfaction, increased selected and modified based on the situations (context quality and earlier detection of happiness. A majority of variables) at hand [36] [37] [38]. The phrase “working set” survey respondents stated that they would not return to [37] is context dependent. Practices should be selected in their old way of working, i.e. plan-driven development. such a way that they bring the highest impact on project The population of the survey consists of 1000 respondents while maintaining the goals of agile. in seven different countries. The study reports that indus- trial adoption of agile is positive and efforts on research 2.2 Related work are a valid one. Williams [56] presented a review of agile methods and Ivarsson and Gorschek [52] performed a review of Re- practices. She observed that software development teams quirements Engineering journal where they assessed em- select a subset of agile practices and create their own hy- pirical evaluations pertaining to Requirements Engineer- brid development method instead of adhering to all prac- ing (RE). They assessed empirical evaluations and identi- tices prescribed in a given agile method. fied research rigor and relevance as a means to provide Hossain et al. [57] conducted a review on the use of decision support for technology transfer in RE. Although Scrum practices in the context of Global Software Devel- their work falls under RE research, they highlighted the opment (GSE). They found that many of the scrum prac- need to provide decision support to practitioners, a well tices were modified or extended in order to fit the needs as researchers, as part of EBSE. of distributed development. The study presented in this There is a plethora of empirical studies and a few paper a SLR on agile practices. The aim of the study is SLRs’ on agile development. [9] [39] [53] [54] [40]. The based on the extension of the studies that report imple- first review on agile was by Abrahamsson et al. [9] in mentation of agile practices, in the context of both indus- 2002. They discussed the concept of agile software devel- try and academia. Our focus is to investigate what agile opment, methods and their practices. They found that practices have been used given a certain set of context Scrum majorly covers project management dimensions of variables and, not methods per se. Further focus of our software development and also stated that only DSDM study is to identify what practices constitute agile philos- gives full coverage of all phases of software development. ophy and identify how practices have been modified to They also observed that evidence for agile methods is suit situations at hand. Further contribution of our study anecdotal and empirical validations supporting the claims is a method to measure the state of “being agile” based on were scarce. the four core agile values, principles and practices put Cohen et al. [53] published a review in 2004 that into practice. found that agile had roots in other disciplines such as manufacturing industry. Furthermore, they discussed the relations between agile and the Capability Maturity Mod-

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 7

3 RESEARCH METHODOLOGY We describe our Research methodology in this section. This includes design and execution of our SLR. The basic Start outline of our review process is given in Fig. 1. This sec- tion is concluded with a discussion on threats to the va- lidity and credibility of our review. Stage 1: Need for a systematic Step 1 of our review, i.e. need for a systematic review, is motivated in the related work section. Furthermore, to review P check if similar research was done we did a systematic L search in the databases Inspec and Compendex. A Main A reason for searching in these two databases is that often Stage 2: Defining research N research papers are indexed in Inspec and Compendex questions N and also, these are the primary search venues for SLRs in I software engineering. Biolchini et al [58] have defined N various synonyms for SLR. We formulated a search string Stage 3: Develop the review G using these synonyms and within Title, Abstract and protocol Keywords. The following search string was used:

(agile OR “extreme programming” OR xp OR scrum) AND Stage 4: Evaluate the review (SLR OR "systematic review" OR “literature review” OR "re- search review" OR "research synthesis" OR "research integra- protocol tion" OR "systematic overview" OR "systematic research syn- thesis" OR "integrative research review" OR "integrative re- E view") Stage:5 Pilot selection and X extraction E We found that none of the search results addressed the C overall aim of our study. Our review covers a wider per- U spective rather than a method per se, and also our re- T search questions (step 2) are different from the ones Stage 6: Primary study I found in the search results. Our research questions are selection O given in Table 3 with rationale. For examples, RQ1 at- N tempts to identify practices that are more cited, hence more popular in terms of adoption and implementation. Furthermore, we looked for modifications to agile prac- Stage 7: Data extraction and tices that have been made in various contexts. This is an- study quality assessment swered in a sub-question of RQ1 (RQ1.3). Mapping of context variables and most cited practices is the focus of RQ1.2. The focus of our RQ2 is on finding the strength of A evidence found in the articles reporting the use of agile Stage 8: Analysis / Synthesis N practices. A The main objective here is to assess the articles in terms L of research rigor and relevance. This kind of assessment Y will increase the trust placed in the evidence itself [59]. S Based on the answers to RQ1 and 2, a method to identify Stage 9: Draw conclusions I the state of “being agile” is developed and exemplified. S We developed our review protocol with the aim to re- duce research bias [42] and provide scope for replication of our study in the future. We developed protocol (step 3) Stage 10: Documentation and evaluated (step 4) in a series of iterations. A research- er who had prior experience in SLRs evaluated the review protocol. By taking his feedback and examining the SLR [39], we improved our research design. This was mainly Stop helpful in identifying keywords for search and various databases to look in. Our final review protocol is de- scribed in the following sections. Fig. 1. Review process

8

To start with, we applied our selection criteria on both TABLE 3 Title and Abstract. That is, no publication is excluded by RESEARCH QUESTIONS reading the title solely. TABLE 4

SEARCH STRING ID Research question Rationale RQ1 What agile practic- To provide an inventory of Search string = Population AND Intervention es are most cited in practices reported in aca- Population literature? demia and implemented in ((agile OR (software AND agility)) OR "extreme programming" practice. OR xp OR xp2 OR scrum OR (crystal AND software AND (clear RQ1.1 What practices are To create a list of common OR orange OR red OR blue OR family OR method)) OR dsdm OR most adopted in agile practices reported in "dynamic software development method" OR fdd OR (feature practice? peer-reviewed literature, AND driven AND development AND software) OR lean OR "ag- i.e. a sort of common de- ile modeling" OR ”agile modelling” OR (streamline AND devel- nominator(s) for agile. opment AND software) OR (iterative AND development AND RQ1.2 What are the con- To provide a mapping of software) OR (internetAND speed AND development AND soft- text specific aspects context variables and the ware) OR (iteration AND development AND software) OR (adap- of agile practices most cited practices. tive AND development AND software) OR (light AND weight reported? AND development AND software) OR (kanban AND software) RQ1.3 What modifications To identify reported modi- OR (incremental AND software AND development)) to agile practices fications to agile practices have been report- in their implementation. Intervention ed? (practices OR (ways AND working)) RQ2 What is the To investigate the evidence strength of evi- on the use of agile practices dence on the use of and evaluate the state of agile practices in “being agile”. To accom- reference to agile plish this, research papers philosophy? reporting the use of agile are evaluated.

3.1 Search strategy We derived search terms (keywords) from our research questions and also by examining [39] to identify termi- nology relevant to ‘agile’. Our search string comprises of Population AND Intervention [42] as shown in Table 4. Our search strategy to identify the initial set of studies for study inclusion is shown in Fig. 2. We verified the strength of our search string by checking our search catch against known primary studies. We initially had a set of 25 studies relevant to our study. Based on our search catch we made modifications to our search string. How- ever, we incorporated keywords for Intervention (for e.g. keyword ‘practices’). Without intervention the search resulted in nearly 80000 results, which would have been impractical and impossible to process. Also, most of the search hits were found to be highly irrelevant. Therefore, we added Intervention keywords. Although there is a chance that some relevant articles would be missed this way, this was a necessary tradeoff. Fig. 2. Search strategy Our refined search string (see Table 4) resulted in

11032 publications. We applied the search string only on Title and Abstract. We used EndnoteX4 as reference man- The following basic criteria are applied on the obtained agement tool for removing duplicates, creating libraries publications. etc.

• Articles should be in English 3.2 Study selection criteria • Articles should be published between 2000 and 2010 Our strategy for study selection is inclusion heavy [59]. • Articles should be in available in full text.

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 9

through articles: Agile refers to methodology to bring The following questions are considered to decide if a fastness in development, welcome changes to require- study is to be included as primary study: ments and cope up with constant changes. By ‘agile prac- 1. Is the study relevant to software development tices’ we mean any ways of working implemented as part in any form, i.e. embedded system, IT, safety of agile methodology. If either of the questions above re- critical etc.? ceives a No, the study is excluded. We also excluded 2. Does the sudy present any material on im- books, editorial notes, comments etc. Fig. 3 shows the plementation of agile practices? databases we selected and stages involved in study re- trieval. The following definition of agile is used while going

IEEE ACM Digi- SpringerLink ISI Web of Inspec Compendex Scopus Wiley Inter- Xplore tal Library Science science 612 182 141 562 3456 4247 148 1608

Fig. 3. Primary studies selection

10

3.3 Primary study selection The aim here is to provide trustworthy and realistic evi- We piloted our study selection process to find if there is a dence supporting technology transfer by examining when homogenous understanding of selection criteria and also and where evidence has been produced [52] [59]. In pilot to find if we are not missing articles relevant to the re- data extraction (step 7) we encountered a few problems in view. We extracted 30 articles for the pilot from Inspec extracting the property ‘research method’ (P7). We de- library and accomplished study selection (step 5) togeth- fined the following data dictionary for P7: er. We did not find major problems with the selection • Case study: if the study mentions it explicitly, or criteria expect that we found articles on agile and lean defines a goal and applies it using a case study covering non-software like manufacturing industry. We (for e.g. using an application) [63] decided not to include such articles in our primary stud- • Experiment: if the study conducts an experiment ies. We decided to store such articles in a separate data- and defines its design [63] base. The step-by-step procedure for obtaining primary • Survey: if the study collects data (qualitative or studies from the initial pool of 11032 publications is quantitative) using methods like questionnaire or shown in Fig. 3. We first removed the duplicates (automa- interviews [64]. tized in EndnoteX4), removed articles published in non- • Action research: if this is specifically mentioned English languages. We discarded a total of 6041 articles in the study. based on title and abstracts and for 159 articles we could • Lessons learned: if the study discusses reporting not retrieve full texts. A total of 1286 articles were left for of implementation of practices in industry, i.e. ex- full text review. Finally, 186 papers were left after filtering periences [52] [63]. articles that did not match our selection criteria. For the • Pure research paper: if the study discusses con- 1286 articles that remained for full-text review, precision cepts covering agile practices, implementation rate of 16% is observed. A reason for such a low number and do not contain any empirical part. is that many articles that had statements on agile and • Unclear: if the study does not state any research practices did not have any text on implementation of agile method and in cases where the research method practices while some articles described models or frame- cannot be classified undet the above categories. works on various concepts (for e.g. maturity, adoption, human factors). Although these are important papers for For study context (P5) we classified studies into two the agile topic but we excluded them as they did not types. Industry context studies contain studies performed match our study focus. in industry or in collaboration with industry (for e.g. re- 3.4 Data extraction & study quality assessment searchers performing in industry). Academia context Similar to study selection we did a pilot data extrac- studies contain studies performed in academia or the tion by selecting 25 articles from our primary studies ran- ones that cannot be classifies as industry context. For domly. We started with an initial discussion meeting to studies performed in industry, we extracted the property identify properties to extract from the primary studies. P11, company size using European Recommendation We defined a list of properties to extract and data diction- 2003/361/EC [63]. In cases where we could not extract ary (see Table 5). Table 5 shows the rationale behind ex- company size, we visited the websites of the companies tracting the properties and mapping to RQs. We first ex- and noted the values. Furthermore, we noted the values tracted data from ten articles individually. We used for properties project duration (P12) and team size (P9) Fleiss’ kappa [60] as a statistical measure to evaluate ho- for studies performed in industry for studies performed mogeneity between the two of us. We calculated Kappa in both industry and academia. We noted subjects (P6), coefficient for the data properties P1 to P10 and we found and scale of application used (P7). For studies performed a moderate agreement value (0.6). We had a consensus using industrial applications, P7 is given value ‘industrial meeting where we improved our understanding on our application’ and in rest of the cases it is marked as ‘other’, data dictionary. In the next iteration, we selected 15 more which include toy examples developed for student pro- studies (plus the ten from the previous iteration) and pi- jects conducted in academia etc. We have noted applica- loted data extraction. We found a strong agreement be- tion type (P10) for empirical students. We classified ap- tween us (0.85), which is considered good according to plication type into: ‘embedded’, ‘non-embedded’ (IT for Landish and Koch [61]. We divided the primary studies banking, game development etc.), ‘unclear’ in cases where according to year of publication. Data from the articles we could not classify them as embedded, ‘not mentioned’. published between 2000 and 2005 are extracted by the We could not classify them as embedded, ‘not mentioned’ second author, and the rest were extracted by the first (where no details of product is mentioned). We could not author. classify them as embedded, ‘not mentioned’ (where no Study quality assessment is not performed as a separate details of product is mentioned). For studies that did not step. It is performed as part of data extraction (step 7). have any empirical part, i.e. ‘pure research papers’, prop- The first three data properties (See Data extraction form, erties P6-P12 are given the value ‘inappropriate’ instead Table 5) are combined to identify research rigor [52] [59]. of ‘not mentioned’. Study quality is performed on the basis of identifying For studies that did not have any empirical part, i.e. how the studies are reported. The properties P1 to P7 ‘pure research papers’, properties P6-P12 are given the concern the credibility of studies for the evidence availa- value ‘inappropriate’ instead of ‘not mentioned’. ble, i.e. how the evidence is produced and presented [52].

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 11

TABLE 5 DATA EXTRACTION Mapping ID Property Data values Description to RQs Strong To specify to what extent context of the study is P1 Context Medium RQ2 described Weak Strong To specify to what extent design of the study is P2 Study design Medium RQ2 described Weak Strong To specify to what extent validity of the study is P3 Validity evalations Medium RQ2 discussed Weak From the reviewed RQ2, P4 Research method articles To capture research method used in the study RQ3 Unclear Industry P5 Study context To capture context where the study took place RQ2 Academia Practitioners Researchers P6 Subjects To capture the subjects in the study RQ2 Students Inappropriate Industrial application Scale of application To specify the scale of the application used in the P7 Other RQ2 used study Inappropriate To capture agile experience of the subjects used in Beginner the study. Marked as ‘Beginner’ when the subjects Experienced P8 Agile experience reported were new to agile, i.e. implementing agile RQ1 Not mentioned practices for the first time or were not part of an Inappropriate agile team. From the reviewed articles P9 Team size To capture the size of the team used in the study. RQ1 Not mentioned Inappropriate Embedded Non-embedded Classified as Embedded or non-embedded. When P10 Application type Unclear it is not clear from the study, application domain RQ1 Not mentioned is classified as ‘Unclear’. Inappropriate Micro Small Medium Marked according to European Recommendation RQ1 P11 Company size Large 2003/361/EC [63]. Not mentioned Inappropriate From the reviewed articles To capture project duration of the agile project P12 Project duration RQ1 Not mentioned described in the reported study Inappropriate From the reviewed RQ1, P13 Agile practices To capture agile practices reported in the study. articles RQ2

3.5 Procedure for assessing research rigor and stances, for e.g. [52] [59] [65]. A potential factor for trans- relevance ferring research results into use in industry, or for indus- Gap between what is proposed in academia and what is try in general is how empirical evaluations (of any sort, practiced in industry has been observed in various in- for e.g. using industrial applications, or toy examples) are

12 conducted and reported [59]. Furthermore, it is important 3.6 Validity threats for researchers and industry professionals (practitioners) There are three potential threats to the validity of our SLR to gauge results, their credibility and validity in practice and its results namely publication bias, selection of primary [59]. Ivarsson and Gorschek [59] developed a model to studies and data extraction. identify and assess rigor and relevance to evaluate evi- dence and find decision support material to try out re- 3.6.1 Publication bias search results. Publication bias refers to the problem that negative re- Research rigor refers to how empirical evaluations search outcomes are less pronounced or published than have been performed and reported [59]. To assess rigor positive outcomes [42]. For example, a threat can be pos- we noted values for the properties P1-P3. Calculation of sibility of not publishing of negative outcome of agile rigor is given in Table 6. implementation or excluding such publications in our Relevance refers to the potential for impact in tech- review. However it should be noted that research ques- nology transfer [59]. For example research methods like tions for our review are not defined from the perspective case studies and lessons learned provide stronger evi- of performance or success of agile, rather to investigate dence to practitioners than experiments conducted using what agile practices have been implemented. To reduce student subjects [59]. Studies conducted using laboratory the threat of publication bias, we included publications experiments (for e.g. experiments conducted using stu- that discussed implementation of agile practices. Fur- dents in academia, conceptual analysis (for e.g., pure re- thermore, we did not restrict our search to a particular set search papers without any empirical section) do not con- of conference proceedings or journals, instead a wide tribute to realism [59]. Central to technology transfer is range of digital libraries are considered to cover the realism involved in evaluations. For examples, studies breadth of the field. To gather reliable information, we conducted using industrial applications provide higher did not include grey literature, technical reports, non-peer realism of state of practice than studies using toy exam- reviewed or unpublished literature. ples invented for evaluations. Similarly, studies using industry professionals as subjects provide a richer picture 3.6.2 Selection of primary studies of realism over studies with student subjects [59]. Rele- We defined our search string to retrieve as many publica- vance is calculated using the properties P4-P7. Table 7 tions as possible. However, considering the large number shows how relevance is calculated. of initial results we incorporated ‘intervention’ in our search string. To ensure that searching for relevant publi- TABLE 6 RESEARCH RIGOR cations is unbiased we ran pilot searches, modified search strings. Research rigor = C + S + V However, a general problem with software engineering P1: Context (C) is its inconsistency in the use of terminology. Terminolo- P2: Study design (S) gy is not standardized [39]. SLR by Dybå and Dingsøyr P3: Validity evaluations (V) [39] observed similar problem in identification of primary Values for each property: studies. If we had we omitted the use of ‘intervention’ we Strong = 2, medium = 1, weak = 0 would have obtained a large number of relevant studies. But this trade-off was necessary considering the initial results without the use of ‘intervention’ (nearly 80000 TABLE 7 RELEVANCE hits). Although we reviewed our search protocol and pi- Relevance = C + RM + Su + Sc loted article search, the problem of missing a few primary Context (C) studies still exists. For example, we could not retrieve full Research method (RM) texts for more than 150 articles. To check if we missed any Subjects (Su) articles, we did manual crosscheck verification in Agle Scale of application used (Sc) conferences. Properties contributing to relevance are given the value 1 and 0 otherwise 3.6.3 Data extraction Contributing to relevance When we piloted our data extraction process, we found • Studies conducted in industry context (i.e., that many of the articles lacked details such as system context = industry) characteristics used in the empirical study. We refined • Studies using practitioners as subjects (i.e., sub- our data extraction form to extract data that give overall jects = practitioners) description of mainly: context in which the study was • Studies using industrial applications (i.e., Scale conducted, design of the study and validities addressed. of application used = industrial application) This process involves subjective judgment. To ensure • Studies conducted using research methods homogeneity, we ran a pilot study, had consensus meet- ‘case study’ OR ‘lessons learned’ OR ‘action re- ings to get a common understanding of the data, refined search’ OR ‘exploratory survey’ OR ‘inter- data extraction based on the checklist defined by Ivarsson views’ OR ‘experiment’ (non-laboratory) and Gorschek [59]. Data extraction process was hindered by the way studies were reported in several articles. For example, many of articles did not address factors that

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 13 could affect the outcome of the use of agile. To minimize particular context [69]. For example, Of these 114 studies the misunderstanding of the data, we improved our con- 46 (40%) studies did not address team size and 57 (50%) sensus iteratively. This was done as part of pilot data ex- studies did not address duration of the projects. This fur- traction process. Our Fliess’ kappa coefficient improved ther makes evidence based judgments weaker as the in- from 0.6 to 0.8, which shows a strong agreement between formation on how large or small are the projects cannot the reviewers [60]. be obtained. Research, however a show that agile is suita- ble for small and medium projects [6] [67]. 48 (42%) stud- ies conducted in industry were identified as mature teams. That is, subjects in these studies were at least one 4 RESULTS AND ANALYSIS project old in terms or agile or have been using agile for A total of 186 primary studies were identified. These are at least one year. broadly classified into two: studies of empirical nature Experience of the practitioners in usng agile practices (empirical studies) and studies that are not of empirical was not reported in 26 (23%) of studies conducted in in- nature (pure research papers). The latter category mainly dustry. A high number of studies conducted on agile ma- discussed agile concepts by illustrating agile practices. ture teams (48, 42%) also suggest that using agile is prom- Before answering our research questions and presenting ising and there is substantial interest towards applying it our analysis, we provide a brief overview of the studies in industry. A high percentage of studies (81%) did not identified. address context fully in industry. Only 18% of the studies in industry context described context fully. An overview 4.1 Overview of the studies of the extent to which context is described in studies re- 4.1.1 Publication year ported in given in Appendix A. A general overview of publication years in our study is given in Fig. 4. A total of 115 (52%) primary studies were 4.1.4 Study design published in or after 2005. Publications on agile seem to Of the all the studies reviewed, 23 (12%) did not describe be widespread equally before and after 2005. This further the property study design (P2). In 49 (26%) studies study may imply that there has always been substantial interest design has been described well whereas 114 (61%) studies in implementing agile philosophy. Spikes can be ob- described the property partially (See Appendix A). It can served for the years 2004 and 2009. be observed that Study design was well described in studies conducted in industry when compared to those in 4.1.2 Research method academia. This further increases the suitability of results A general overview of research methods used in the presented in industry context as being more relevant for primany studies is shown in Fig. 5. Research methods practice. were extracted according to the classification discussed in Section 3.4. A very high majority of studies used “lessons 4.1.5 Validity evaluations learned” (79, 36%) and “case studies” (52, 24%). Experi- Of all the studies reviewed, 102 (55%) studies did not ments have been reported in 24 (11%) studies. We could address the property validity evaluations (P3) at all. not extract research method for one study as it lacked a Study validity is addressed partially in 67 (36%) and in description on the study was conducted and research detail in 20 (11%) studies (See Appendix A). A high num- method used. ber of studies conducted in industry (65, 54%) did not address validity and credibility of the results at all. This 4.1.3 Context hinders drawing conclusions based on evidence available The study settings were classified into industry and on the use and implementation of agile practices and ap- academia (non-industry). Among the studies conducted plying to a different context. A majority of studies con- in industry, 114 were empirical studies and the rest 5 are ducted in Industry are ‘lessons learned’ (or experience non-empirical studies. For 114 studies (63%), context was reports). To make evidence more convincing and adapta- identified as Industry indicating that a high percentage of ble, practitioners should address factors that could have studies were performed in realistic settings. We extracted affected the outcome of agile implementation. company sizes from the reviewed studies using European Recommendation 2003/361/EC [63]. Company size was 4.1.6 Rigor not mentioned for 39 (34%) studies. In other studies re- Rigor is obtained by summing up the properties Context porting company size, large can be found in 31 (28%), (P1), Study design (P2) and Validity evaluations (P3). Ri- small and medium size in 31 (28%) studies. Interestingly, gor found in the reviewed studies in shown in Table 9. Of 27 (24%) of the studies that reported company size are all the studies reviewed, we found a total of 61 (31%) large companies. This can further mean that agile practic- studies that are reported well (rigor greater than or equal es are found to be useful in the context of large companies to 4). Of all the studies conducted in industry, 40 (35%) where it is complex to deploy or integrate new processes studies are well reported whereas in academia a total of and techniques with existing ones [31]. Experience also 21 (25%) studies are well reported. These studies further suggests that agile practices match the need of large com- can give decision support material on how agile was im- panies [31]. However, not discussing context in the stud- plemented. ies hinders in judging the evidence as being relevant to a

14

Publications vs Year!

36"

29"

23" 20" 19" 19" 15"

Publications" 8" 9" 5" 3"

2000" 2001" 2002" 2003" 2004" 2005" 2006" 2007" 2008" 2009" 2010"

Fig. 4. Distribution of publication across years

SLR" 1"

N/A" 1"

Interview" 3" Survey" 10" Research"paper" 36"

Lessons"learned" 68"

Grounded"theory" 3" Experiment" 15" Case"study" 48"

Action"research" 1"

#"Publications" 0" 10" 20" 30" 40" 50" 60" 70" 80"

Fig. 5. Research methods

TABLE 8 COMPANY SIZE REPORTED TABLE 9 STUDY RIGOR Company size Publications – Nr. (%) Rigor Industry Academia Micro 10 (8) 0 1 3 Small 15 (12) 1 11 9 Medium 12 (10) 2 37 18 Large 31 (26) 3 31 20 N/A 46 (39) 4 26 9 Inappropriate 5 (4) 5 12 4 6 1 4

4.1.7 Relevance Relevance is obtained by summing up the properties studies (54%) were performed in realistic studies, thereby Context (P5), Research method (P4), Subjects (P6) and increasing the validity of our SLR results for state of prac- Scale of application used (P7). An overview of relevance tice. We combined Rogor and Relevance found across found in the studies is shown in Table 9. Relevance gives studies. Of the 198 studies, 42 (19%) studies had High an understanding of to what extent studies are conducted Rigor and Relevance (HRR), as seen in Fig. 6. in realistic settings. It can be observed that majority of

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 15

tices were claasified into thirtytwo different categories based on the most suitable category they fall into. The TABLE 10 Relevance across studies descriptions of the practices as well as their acrynms used are given in Table 11. For e.g pair programming and pair- Relevance Industry Academia ing are the same practice actice but have been modified 0 0 40 by the situation. In this scenario we placed them under 1 6 23 the same cateogery of pair programming, it is the same 2 1 3 case with sustainable pace and 40-hour week. Practice 3 41 1 clasisfications are based on literature by original propo- 4 108 - nents of Agile and the work by Petersen and Wohlin [9]. These practice classifications were used in [9] to analyze how practices relate to Agile principles. For example, 4.1.8 Practice classifications Sprints and Iterations can be grouped into a single prac- tice as they share the same philosophy of developing We classificed the practices found in our primary studies software in a series of mini-projects [9]. based on the definitions found in literature. These prac-

Fig. 6. HRR

TABLE 11 Practice Classifications

Nr ID Practice Definition . 1 PP Pair Programming Two people write the code at one computer [56]. Testing refers to any of the practices used for enabling testing 2 Te Continuous Testing thorughout the. Unit tests, Acceptance tests etc. are run in every iteration [243]. Restructures the system by removing duplicates as well as improve 3 Re Refactoring communication [56].

16

Close interaction between customer and programmer: the programmer Planning Game/Sprint 4 PG estimates the effort needed to implement a customer stories, and the planning customer then decides about the scope and timing of release [9]. A close cooperation between the customer and the developers exists for rapid feedback and clarify the requirements. This avoids 5 CC Customer Collaboration implementation of features not avoided by the customers [1]. Customer collaboration in this context means, collaboration exists irrespective of the customer being on-site or off-site. 5 CO Collective ownership Anyone can change any part of the code anytime [9]. A new piece of code is integrated into the code base regularly, thus the 6 CI Continuous Integration system is integrated and built many times a day. All tests are run and they have to be passed for the changes in the code to be accepted [9].

Coding rules exists and they are followed by programmers. 7 CS Coding standards Communication through code should be emphasized [9]. Stand-ups are short meetings, approximately 15 minutes daily, to keep 8 Su Stand-ups track of project progress [56]. Stories are the short statements of the functionality desired by a Metaphors/Stories/Featur 9 Me system user or customer. Stories are customer visible, customer valued es functionality that can be completed in a single iteration [9]. The team delivers code to customers in the form of supported releases 10 SR Short Releases more often; the purpose of having short releases is to get the product out to customers as quickly as possible [9].

At the end of each iteration, team gather for a retrospective meetings. In this meeting data is presented on how the iteration went, such as how much working software was produced. The team evaluates the Retrospectives/Sprint good that they can carry on for the next iteration as well as the loop 11 Rt review holes that they want to discard for the next iteration. Specific action items are taken out of retrospectives and made as stories for the next iteration. This helps the team to allocate necessary resources towards process improvement [56].

The emphasis is on designing the simplest possible solution that is 12 SD Simple Design implementable at the moment. Unnecessary complexity and extra code are removed immediately [9]. A maximum of 40 hour working week. No two overtime weeks in a 13 40h 40-hour week row are allowed [9].

Sprints are used in the context of Scrum method and Scrum teams. each scrum is is a self contained .Every iteration is a self contained mini project composed of activities such as requirements, design programming and test. Each iteration is time boxed (stipulated time). 14 It Iterations/ Sprints If the team is aware that they can’t meet the objectives, rather than moving out of the time box the scope of the iteration is reduced. At the end of a iteration the team obtains feedback on the code delivered from the manager [56].

15 Test-driven Software development is a test driven. Unit tests are implemented TDD Development before the code and are run contimously [56]. Different channels for communication that would support 16 COM Communication communication among different. Practices with central focus on Office structure or environment to sup- 17 Of Office port adoption of use of agile software development ate classified as Office. Agile teams that utilize the team practice whih consider the cross 18 Tm Team functional team of all those necessary for the product to succeed as one team [9].

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 17

Progress is tracked my milestones based on software deliverables and 19 Pr Progress major decisions rather than written documents [9].

This practice refers to the use of an agile coach in the projects. Coach 20 CR Coach Role plays a major role in driving the other members in the team [56].

Practices are grouped into Documentation if their goal was to reduce 21 Do Documentation documentation or documentation in agile projects.

SD 26 16 4.2 What agile practices are most cited in Pr 15 6 literature? (RQ1) The purpose of this research question was to identify and Me 50 26 provide an inventory of agile practices that were consid- Te 49 25 ered agile. A major aim of this question is to give a view- CR 6 2 point of importance given to each of the agile practice. An COM 7 2 overall view of practices cited in our study is given in Table 12. As we can see from Table 12, the most cited ag- Of 14 0 ile practices are PP, Re, CC, CC, CS, PG, SD, CO, and DO 6 4 practices related to Te. Apart from practices shown in CM 2 2 Table 11, we identified other practices that we grouped into ‘Other practices’. The motivation for showing them 40h 23 14 as separate practices are, they are mainly single cited and Tm 14 6 were used to support other practices. These are given in Appendix B. A general observation looking at the Table is that practices like PP, Re etc. seem to be more popular. PP 4.3 What agile practices are most adopted in received the most attention (citation count: 97) among practice? (RQ1.1) agile practices found in our study. Looking at the citation The purpose of this RQ was to distinguish studies be- count, one can find that PP is the most popular agile prac- tween industry and academia and show what practices tice. Practices like CS, CO, SR, and Rt have almost the are widely adopted. Further aim of this RQ was to identi- same citation count. fy coupling between agile practices in practice. To this We identified several other practices, which we end, we have identified what practices are used together mapped to the practices shown in Table 11. For example, in combination. Contrast between the use of practices in Task Kanban [149] and A/B/C List [112] are put in the industry and academia is shown in Table 13. However, it category ‘Features’. We identified practice modifica- should not be noted that all the empirical studies (both in tions/adaptations. These are discussed in section 4.4 industry and academia) are not included for analysis in (RQ1.3). Table 13. The main reason behind this that some studies

TABLE 12 Citation count of practices discussed only implementation of a specific practice like PP, or discussed the use of non-software activities in

Industry software projects. For example, PP is cited in 65% of the Academia (Studies = studies reported in industry and 58% of the studies in (Studies = 69) Practice 119 academia. Results from these studies are discussed in Nr. Nr. Section 4.4.3.

It 93 12 TABLE 13 Practices found in empirical studies Su 39 6 Industry (Studies Academia (Stud- SR 34 15 Practice = 93) ies = 15) CO 33 20 Nr. % Nr. % CS 37 17 It 88 95 6 40 Re 43 24 Su 33 35 5 33 CI 50 19 SR 30 32 4 27 CC 50 16 CO 32 34 6 40 PP 59 38 CS 38 41 4 27 TDD 27 21 Re 44 47 9 60 Rt 38 8 CI 42 45 6 40 PG 48 14 CC 45 48 5 33

18

PP 60 65 13 87 and TDD. Of related practices like ‘small offices’ [98] and honey- TDD 29 31 8 53 comb cubicles’ [204] were more or less invented by practi- Rt 30 32 2 13 tioners owing to practicalities. St pasted on wall accord- PG 50 54 6 40 ing to their prioritization in an Open workspace can im- SD 28 30 4 27 prove communication [248]. A possible reason for not adopting Open office or not considering it agile is because Pr 13 14 2 13 of investment that needs to be put in infrastructure Me 58 62 4 27 change [158]. Moreover, it can be difficult for the devel- Te 44 47 7 47 opers to focus working in an open environment [261]. Agile is a social process which developers seldom count CR 6 6 2 14 [266] which may limit disciplined use of Of and Tm prac- COM 6 6 2 14 tices. The use of shared wikis and communication chan- Of 11 12 0 0 nels like instant messengers’ are used to reduce distance between sites [264] [255]. DO 1 1 2 14 Scrum practices like time boxing, i.e. Sp, PB are better CM 2 2 2 14 utilized by practitioners. On the flipside, challenges in 40h 23 25 2 14 practicing PB and planning Sprints include lack of overall Tm 10 11 0 0 and long term focus [261]. Su were considered as one of the important communication channels [168] and also used to set up the tone of the day and remove any back- Contrasting differences between academia and indus- logging problems [254]. For example, a company Su meet- try can be observed for most of the practices. PP is a dom- ing may involve about 20 people often discussing status inant practice both in industry and academia but has a updates [254] or involve sub-teams [168]. Su were com- slightly higher percentage in academia. PP has been most plimented by Pr practices like Burn-down charts [247] tried out by practitioners and appears to be popular. A [122] [172]. success factor for using PP can be credited to its ease of Industrial adoption is a more inclined towards CI than use, improved code quality and knowledge transfer [134]. other practices like CO. CI enables the team to identify PP was found to be a good way to review code [261]. A issues where a change to one component impacted anoth- common reason for its industrial prominence is it is wide- er [250] and also enables continuous testing as there will ly studied and thus tending to be more popular [12]. It is be always something to test [261]. However, adoption of suggested to build code patterns in pairs and use these the practice is less pronounced within large teams (team patterns to build sub-modules solo [151]. PP is found to size greater than 30). Industrial adoption shows it is hard- increase the level of satisfaction and confidence [83] but it er to practice in projects with large teams [188]. For suc- is preferred that team learns the essence of PP and takes cessful use of Continuous integration, it is recommended time for the results to be visible. Other advantages of PP to combine with Te [249] [110] enabling implementation include increase in code quality and reduction in code of late changes [187]. It provides the team with valuable inspection time [96] [111] and reduction of cycle time by feedback because the code is in continuous and integrated 40 to 50 percent [233]. Pairing is sometimes done only for condition [188] and protects against loss of configuration complex code when team size is small and is practically control [233]. Although TDD seems to be less pro- difficult to accommodate PP as a regular task [106] [132] nounced, industry seems to maintain the essence of con- [133]. The use of PP was found to be useful for develop- tinuous testing throughout the project (See Te, Table 12). ing user stories [160] [200]. Overall, PP is widely studied Unit tests and acceptance tests are carried in every itera- [12] and adopted. It was found to be a fun practice to tion. adopt and use and can raise the level of innovation and On the other side, adoption of CO is associated with knowledge transfer between the developers [115] [126]. developers’ mindset. For example, in one study develop- TDD appears to be less popular among practitioners. ers were reluctant to adapt code written by others and TDD is found to aid team in building high quality prod- continued to speak of “their code” [260]. Management ucts. Although there can be reduction of productivity in and developers’ interest affect adoption of CS. For exam- terms of time, it is found to be compensated by improved ple, in one study we found developers and management quality and decrease in code complexity [131]. A de- were reluctant to enforce a common standard for coding. motivation for TDD adoption can be that more time needs This explains why CO is little less common in terms of to be invested for development [165]. However, from the adoption. CS was not put into place until the code became product-lifecycle perspective reduction in maintenance so complex that it was unmanageable and required tight- costs and customer satisfaction were observed [146] and er control [260]. Peer pressure can also impact adoption of has potential for increasing the level of testing [165]. In CS [213]. Observed benefits of coding standards were one study developers felt writing test cases before actual readability of the code, communication and strong sup- development felt unnatural and laborious although it was port when using PP [187] or CS [222]. CS is understood as useful for critical tasks [286]. It is thus evident that prac- a simple practice to adopt [187] and can produce con- tices that were embraced or used most in academia are PP sistency in coding and code reviews [156]. Company

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 19 standard coding guidelines can be used to extend the tices are used in combination in industry is presented practice [102]. here. By coupling, we mean what practices practitioners XP’s design related practice, SD is considered as a dif- consider as agile in combination. To identify this we de- ficult practice to adopt and yet very useful [101]. Agile fined three categories for coupling: (1) Strongly agile: cou- methods have been criticized for their lack of emphasis on pling is categorized as ‘Strongly agile’ between practice X architecture and design [85]. Architecture related practic- and practice Y if Y is cited in more than 70% papers that es like Architecture analysis, Architecture evaluation etc. cited practice X, (2) Agile: coupling is categorized as ‘Ag- were proposed by Babar [85]. Creating simplest possible ile’ between practice X and practice Y if Y is cited in be- design is a difficult task when the team has to plan when tween 50 and 70% papers that cited practice X, and (3) new code would interact with coming functionality [107]. ‘Not so agile’: coupling is categorized as ‘Not so agile’ be- Design planning can be performed based on available tween practice X and practice Y if Y is cited in less than knowledge from similar projects and not just based on 50% papers that cited practice X. user stories in iteration and not in order to reduce the Tendency to combine Scrum practices and XP practices amount of Re [248]. For successful use of SD developers is prevalent, and evident. Practices like PP, CS, Re are need to be equipped with knowledge of how to design cited in half of the studies (40%, 36%, 32% respectively) effectively [116]. Non-functional attributes like security that cited Scrum based sprints. Scrum is majorly advocat- and safety are suggested to be designed through architec- ed as a project management methodology and not code tural practices while the ‘keep it simple’ rule of SD is ad- construction methodology per se [21]. vocated to reserve to individual sub-systems [258]. Practices like CO, CS are cited more frequently than Re was found to have strong impact in terms of en- SR. Practices considered ‘Strongly agile’ with Short re- hancing reusability, eliminating complexity etc. [123]. The leases are: PP (74%), Re (70%) and considered ‘Agile’ popularity of Re can be credited to design improvement (based on relative citation count) are PG (68%), Me (65%), [164] [168]. Automating Re using tools like Eclipse was CI (65%), CO (61%), and CS (61%). found very successful [168] [116]. When using Me, CC seems to be a determinant. This is Me is perceived as not-so easy practice to adopt evident as role of customer was reported in 81% of the [248]. Me is found to strongly interact with practices PG studies that used Metaphors. Other practices that were and Re in steering the project and creating a simple and found as ‘Strongly agile’ with Me include Re (85%), CO stable structure (as a guidance scheme for naming classes, (78%), PP (89%) etc. methods etc.) [123]. Me provides a high level of abstrac- The tendency to use the practice Re is found to be tion level and are ideally suitable for systems governed highest when using Su. Around 60% of the studies that by a high degree of complexity [170] and are used to cited Rt cited Su. Use of St or Sp (41% each) are second communicate with the customer about the system [105]. highest cited when using Re. PG, CS, PP and Re are found St are brief informal descriptions of user requirements, to be ‘Strongly agile’ (94%, 91%, 88% and 79% respective- often written by customer [141]. It is not very clear from ly) with CO. Among the studies that cited CS, 77% cited the studies reviewed if Me are developed by systems’ PP, 72% cited Re, 77% cited CO. A possible explanation is customer or the team. For example, when the software is that CS benefits to have consistency in coding [156]. PP intended to a mass market, Me are to be developed and is cited in 72% of studies using Re. But it is not clear from analyzed by the team [133]. Story size should be small the studies if pairing can enhance Re. enough such that it is possible to construct individual Among studies that reported TDD, over 75% cited us- tasks for development and thus the team is committed to ing PP, 60% cited using PG. It is observed that TDD is code within the deadline of iterations [266]. used majorly in combination with PP and not other way The citation counts for practices in the categories Of round. Although there is an increase in development time and Tm, as seen in Table 12 are less compared to other when using TDD, there may be a considerable increase in practices. Practices like Shared working room [158] were code quality and passing more functional test-case [68]. found to be less pronounced. It is suggested to have a To cut cost in development due to the use of TDD, train- small offices (150 to 200 people) [98] and a dedicated ing is needed [106]. Adopting TDD without training may meeting room [247]. The use of ‘Share project space’ in- give mixed feelings and contradictory opinions on the volves transformation of office infra structure of cubicles efficiency of the practice thus de-motivating to use further into an open room [158]. Role changing [172] and merg- [115] and the role of coach was suggested [137]. ing of teams for integration [132] were among the cited On a positive note, practitioners seem to embrace Rt practices used for team management. Re provide regular meetings as a way to find improvement potential [254] team reflections and adjustments to become more effec- [174]. Practice that was cited most in studies reporting Rt tive [250]. Feedback sessions are used to demonstrate user is Su (59%). Rt can be practiced independent of other stories for customer approval [266]. Re (or reflection) ses- practices [105]. Sustainable pace or working 40h, i.e 40h is sions are used as venues for iteration/sprint demo. As it an essential practice for work satisfaction and overall can be observed from Table 12 using iterations or Scrum’s productivity [91]. On the flipside, working only 40h may sprints are common. Su are used as a means to progress need extra social and technical effort and companies often check mostly on a daily basis. A discussion on how fre- need to work overtime owing to schedule deadlines [123]. quently Su meetings were conducted is given in Section The emphasis on working 40 hours per week is a social 4.4 (RQ 1.3). Coupling between practices, i.e. what prac- concern practice [133] and teams that work overtime often

20 make mistakes [250]. Software development process model

4.3.1 Research studies We categorized three papers into this category as they We identified 30 research papers reported by research- proposed process model for agile software development. ers that discussed various topics in the context of agile Yinghua [257] proposed UniX process adapting concepts software development. We classified into categories and practices from Unified Process (UP) and XP. UniX based on the topics that were central to them. These are process drops the XP practice SD and recommends work- discussed below. ing out system architecture as early as possible and sub- system modules suing UML. UniX also considers the use Overall software development of PP as time consuming and instead suggests using for critical code and complex parts of the systems. UniX We identified five studies focusing agile for software states ‘anyone can change any code’ practice (CO) is not development in general; See Table 14. always practical and recommends assigning authorities with respect to competencies in particular technologies to TABLE 14 Research studies on software development improve code. Visconti et al. [181] propose an ideal pro- in general cess model for agile methods and identify practices that XP and Scrum fail to provide. They state that XP lacks Article Description practices to provide ‘Proper environment and support’ [103] The paper proposes how agile practices can and Scrum lacks practices to allow ‘Business people and be used for military embedded systems. developers to work together’ and ‘Continuous attention [110] The rsearch paper assesses the applicability to technical excellence and good design’. Nawrocki et al. of XP practices to engineer high-integrity [248] proposed a modified XP method and practices com- systems. bining XP and ISO 9000 to provide guidance in applying [126] The research paper presents a mapping of XP in ISO 9000:2001 certified firms. Their method defines agile principles to best practices laid by key process areas with maturity levels and practices asso- Complex Adaptive System (CAS) theory to ciated at each level. These levels are: Customer relation- support response to change in developing ship management (Level 1), Product quality assurance complex software. (Level 2), PP (Level 3) and Project performance (Level 4). [143] The research paper analyzes existing Agile Kirk and Tempero [153] present KiTe framework as a ba- methodologies and maps practices to soft- sis for identifying risk conditions inherent in XP projects. ware project requirements. A risk condition inherent in the practice CI, for example, [251] The paper argues to add refelective mode of is the inability of the developers to submit the developed thinking to each agile practice when applying stories on time to the build, which may result in slow to both embedded and non-embedded soft- progress and increased defect rate. Paulk in his report ware development projects. [233] summarizes how XP practices can help organiza- tions realize SW-CMM goals and identifies its key process Garg [103] states the use of CM and automated testing areas that are not addresses by XP. These process areas to support Re of embedded systems as a small change in Software subcontract management (Level 2), Training code can affect timing behavior. Configuration control is program and Integrated software management (Level 3), considerer as a key ingredient in verifying software Quantitative process management and Software quality against hardware changes. She also recommends auto- management (Level 4), and Technology change manage- mated documentation with the help of integrated tools. ment and Process change management (Level 5). In simi- Paige et al. [110] suggests modifications to XP to suit the lar lines, Nawrocki et al. [232] propose a maturity model development of High-Integrity Systems (HIS). They state for XP. The model has four maturity levels and describes CO is inapplicable to HIS as it requires substantial do- practices that can be used at each level. main knowledge like sensors, pressure pumps and there- Siebra et al. [140] investigates how XP can be used in fore everybody involved in the project cannot have the the management of innovation process. Moreover, they ability to modify the code. They also state that the prac- define the cycle of innovation consisting of five different tices PG, PP, Me, Re, 40h and CC require modifications. iterations: investigation, identification, aggregation, con- They suggest practices like Static analysis, explicit risk solidation and post-validation. They suggest allotting 10 mitigation strategies and techniques, and integration of days to the first four iterations and 10 days to the last it- system design models (for example, Simulink/MATLAB) eration. Levine and Walker [152] examine the XP practic- with XP practices. Meso and Jain [126] present a mapping es and investigate how they apply to the task of assigning between CAS theory to agile prinicples in developing grades to students’ work in computer science courses. complex software. They state best practices based on Hazzan and Dubinsky [275] present two chasms: abstrac- principles given in CAS theory. For example, to support tion and satisfaction inherent in software development frequent releases and CI they suggest starting with a min- and desctibe how XP practices can bridge these chasms. imalist development team and scale up/configure team’s Abstraction chasm deals with the cognitive abilities of the structute to suit project gwoing size and complexity. programmer to think at various levels of abstraction while the Satisfaction chasm deals with satisfying the cur-

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 21 rent needs and teammates. Bergin and Grossman [160] experiements conducted in a different setting 11at Uni- present an excercise to teach XP and its practices to teach versity of Utah. They found that PP is an economically novices (students, managers, customers, programmers viable option when compared to solo programming. and non-programmers). Duersen [153] studies the possi- bilities of program comprehensions, i.e. the task of build- Architectural practices ing mental models of software systems at various abstrac- tion levels in the context of five practices namely PP, Te, Juric [228] presents an overview of XP practices and Re, Evolutionary design and Collaborative planning. He raises concerns regarding their role in conceptual model- identifies comprehension risks and benefits and suggests ing and code generation. Jensen et al. [155] introduce a a need for Reverse engineering perspectives on agile prac- new practice to XP called Developer stories. Developer tices. stories are written by developers and provide opportuni- ty to consider the architecture of the application. “Devel- Customer practices oper stories describe changes to units of developer-visible properties of the software” [155]. They state that Devel- We identified two studies on customer practices. Wang oper stories act as a tool to plan and express demands for et al. [135] proposed a set of practices in XP projects that Re and facilitate opportunity to reflect upon the architec- emphasize collaboration between developers and cus- ture and design of the system. tomers. Similarly, Martin et al. [150] present customer specific practices that enable agility in software projects. Software security These practices are given in Table 15. Ge et al. [162] propose two new security practices, TABLE 15 Customer practices proposed by research- Fundemental security architecture and Security training, ers for applying XP. They recommend providing security training for both developers and customers and use this Article Practices knowledge in writing security stories and perform securi- [135] Creating customer team, Building up devel- ty testing. They suggest defining Fundemental security opment environment, Discussing and re- architecture before the commencement of It. Wäyrynen et viewing project plan, Feedback meetings, al. [179] report an analysis of XP practices from the per- Serving as consultant, Taking part in usage spective of the Systems Security Engineering Capability testing and trial operation, Tracing devel- Maturity Model (SSE-CMM) and Common Criteria (CC) opment process, and Checking and accept- standards. They identify that XP requires modifications as ing final software product. it lacks security-engineering support in many areas con- [150] Customers’ apprentice, Programmer holi- cerning specifying security needs and assurance. They day, Story standards, Show & Tell, Custom- recommend the role of a security engineer who works on er pairing, Look before you leap, and Three identifying risks along with the on-site customer. Moreo- month counselor. ver, they advocate the participation of security engineer in PP activities and document a security review.

Quality assurance Distributed development

Huo et al. [193] analyze agile practices’ quality assur- Dou et al. [88] provide a review of tools supporting ance abilities and state that in agile methods, quality as- distributed PP and propose a framework for it. The surance activities are inside the development practices framework aims to support the use of same editor be- like PP, TDD, passing 100% uni tests, architectural spikes tween pairs and also different environments and editors and CI and occur at higher frequency and earlier than in between pairs. The tools they reviewed include Sangam, the waterfall process. Hayes et al. [256] discuss opportu- RIPPLE, COLLECE and COPPER. Batra [91] in his report nities, challenges and plans for the synthesis of TDD and states that agile practices need to be modified to suit the traceability. They state that TDD is a well-suited practice needs of distributed development. He also states that the to achieve traceability and recommend research on the practices 40h and CC are particularly not feasible. To this data that can be collected and analyzed to aid develpers end, Hossain et al. [94] provide a review on the use of in the TDD practice and propose a methodology to collect Scrum in the context of distributed development. They and maintain traceability information in the TDD prac- identified a need for empirically founded knowledge on tice. the use of Scrum and agile practices to mitigate the chal- Succi et al. [235] propose an experimental framework lenges of distributed development. They identified sever- to quantify benefits and costs of PP and compare design al practices like Synchronous work hours, Local scrum aspects of the resulting software artefacts and their defect team, multiple and informal communication modes or behavior. They use a set of six Object-oriented design channels Distributed scrum of scrum, integrated scrum metrics from the CK metrics suite to describe design deci- team etc. to mitigate distributed development challenges. sions. Erdogmus and Williams [269] perform an economic We also identified these practices in our review thus evaluation of PP. They used quantitative data based on strengthening the case of distributed development. To

22 this end they also proposed a framework for risks identi- we classified products into ‘embedded’ and ‘non- fication and mitigation practices for distributed develop- embedded’ software. For studies where the nature of the ment using Scrum [101]. A process-support environment product was not clear enough, we put them in the catego- called MILOS to support teams in a distributed setting is ry ‘unclear’. described in [241] [244]. MILOS supports PG and defining user stories activities, developing and maintaining sched- 4.4.1Team size ule of releases, It and also facilitate distributed PP ses- An overview of number of studies performed in indus- sions with NetMeeting. trial context using mature teams and teams that are new In addition to the studies described above, we identi- to agile is shown in Fig. 7. Among studies reporting fied four research papers reported by practiotioners based teams that are new to agile, 12 (32%) did not mention on their experienced. We did not classify them as ‘lessons team size. For the rest of the 25 teams, size of majority of learned’ reports as they clearly lacked a research method teams is small. For 20 (80%) studies, team size is less than and also there was no description of projects and data 20. There seems to be a clear indication of team size being based on which the study has been reported. However, small. An important question thus raises is ‘How small major focus of these studies was on Software develop- are agile teams?. Of the 25 studies reporting team size, 19 ment process model or a partucluar practice like PP or (76%) studies were found to have teams of less than 15 improving quality through CI. These studies are de- members. Overall 49 studies were found with teams that scribed as follows. are experienced with agile. Vanderburg [105] discusses how various XP practices We could extract team sizes for 26 (49%) studies. Of are tangled with eachother. He states that the practices these 26 studies, in 19 (73%) studies team size was found Re, CS, SD and 40h are ‘noise filters’ improving the over- to be less than 20. This can imply that experienced teams all quality and allow other practices to be effective. Miller also are small in size. In 17 (89%) of the studies whose [247] proposes an approach to resolve conflicts between sizes are less than 20, team size was found to be less than pairs when there are discrepencies about design deci- 15. It can be thus understood that agile teams prefer team sions. The approach is based on giving scores on a scale of size of less than 15. Working in small teams enable work- 1-3. The highest score refers to what the next step would ing closely, share knowledge and monitor the progress of consist of. He also defines rules for what the pairs should the project [72]. This was also observed by [39]. do when the scores are tied at 1, 2 and 3. Rogers [177] re- An overview of practices adopted by small teams is ports on the importance of ‘Acceptance Testing’ per- given in Table 16. Apart from It, practices PP, PG and Su formed by customers and gives advice on how to perform were found to be highly cited. However, it can be ob- them. He also gives a variety of startegies for practicing served that there is a slight decrease in the adoption lev- CI [189]. els for PP and Su by Experienced practitioners when compared to practitioners new to Agile. PP can be subject 4.4 What are the context specific aspects of agile to difficulties owing to problems relating to staff, attitude practices reported? (RQ1.2) and is a team oriented practice requiring willingness to The purpose of this research question was to identify work together [123]. Limited resources can be a factor for contexts and map to practices that were used. In regard to dropping PP and divide individual tasks although it has context, we mapped practices used by mature teams and benefits to offer [123]. Moreover, PP can be cost inefficient teams using agile for the first time. Second type of context for very small teams [122]. On the other hand, Su can be identified for analysis is the type of product. To this end, considered as overkill [73]. Moreoever, as size of these

Fig. 7. Team size and practitioners’ experience SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 23 teams are between 1 and 5, or 6 and 10, it may be normal principles (for example, Object Oriented coding guide- to have close coordination and discuss progress and prob- lines). Lack of experience can hinder adopting CS, CI and lems informally. A generic explaination to low adoption also PG. Limiting factors for adopting CI can be the need for practices such as CS and CO among small teams is for infrastructure to automate tests and add to builds. that practiotioners might have been using these practices Lack of automation can also hinder CI as they are tightly but did know the right name [122]. For examples, compa- coupled [123]. Sfetsos et al. [123] observed that type of nies can have their own established coding guidelines projects can be a limiting factor for both small and large [102] or develepers follow generally prescribed coding

TABLE 16 Small and large teams

Team size = SMALL Team size = LARGE Practice Experienced Total = 36 Beginner = Experienced Total = 13 Beginner = 5 16 = 13 = 6 It 75 75 75 85 100 100 Su 44 44 44 38 40 33 SR 33 31 31 31 20 50 CO 22 19 19 31 40 33 CS 25 13 13 31 40 33 Re 25 19 19 69 80 67 CI 31 25 25 31 20 50 CC 36 44 44 46 40 67 PP 56 50 50 69 80 50 TDD 14 25 25 8 0 0 Rt 31 38 38 15 20 17 PG 53 44 44 54 40 50 SD 8 13 13 31 40 33 Pr 11 31 31 0 0 0 Me 19 6 6 69 60 67 Te 47 31 31 69 60 83 CR 3 0 0 8 0 17 COM 0 0 0 0 0 0 Of 14 25 25 15 20 17 DO 2 0 0 0 0 0 Pl 0 0 0 0 0 0 CM 2 0 0 0 0 0 40h 16 0 0 23 20 33 Tm 14 19 19 8 20 0 teams. A possible explaination is the relation between multiple times a day and measure the progress of the project and the kins of automation infrastructure needed. tests. For example, web based products may need skill and ex- An interesting point mentioned by Stetsos et al. [123] is perience in tools like Selinium while other types need the lack of guidelines for Re in the XP theory. This can be competence in tools like Maven, Ant etc. Moreover, it can reflected in our data (See Table 16). Refactoring tools and be a time consuming activities for small teams facing with expertise in Re for duplicate code removal, code reuse limited resources [107]. Expertise in using tools for con- and application of design patterns or heuristics are thus tinuous tools can be an enabling factor for CI, TDD and necessary [123]. also to enable continuous testing, i.e. Te, as can be ob- In regard to large teams a higer percentage of adoption served for experienced teams in Table 13. Likewise, to for many of the practices can be observed. However, it nable TDD, unit tests can be added to the build and run should not that the sample size for large teams in our SLR

24 is relatively much smaller. Nevertheless, we believe our the cost effectiveness of 40h [151] and sometime because data can be used for further work. As can be seen in Table of distributed development teams have to work on week- 13, practices PP, Re, Me and Te are highly adopted. Inter- ends due to time zone differences [106]. estingly, we did not find any citations for TDD. Mastering To sum, an overview of the top ranked practices (based writing tests early in the project can be tricky as it is diffi- on citations) for small and large are given in Tables 17 cult to determine the number of tests [123]. This process and 18. However, we believe the data and ranking for can further be complex owing to system complexity and large cannot be over emphasized because of the small also can be a time consuming activity [165]. sample size. If there were more studies reported (in the PP maybe adopted to increase team competence [123], order of 30s or 40s), results might have been different. as an informal way of training and to increase collective knowledge. Once this is accomplished, developers may TABLE 17 Top ranked practices for small teams have dropped its usage to work individually. This can be a probable reason for the decrease in the citation percent- Total Beginner Experienced age for experienced teams when compared to beginner Rank = 1 It It It teams. There is an increase in the adoption levels for prac- tices SR and Me relatively for Experienced teams. Master- Rank = 2 PP PP PP ing the practice of releasing software frequently and in Rank = 3 PG PG, Su, CC PG, Su, CC small cycles require team capabilities and be competent in Rank = 4 Te Rt Rt estimating and using user stories [123]. Large systems can face integration problems owing to system complexity, Rank = 5 Su Te, SR Te, SR large amount of code. Furthermore, integration problems are exacerbated if integrations are delayed [96]. To this end, CI was found to be efficient in revealing integration TABLE 18 Top ranked practices for large teams problems. Early integration builds can reduce the risk of software release by enabling continuous feedback from Total Beginner Experienced stakeholders and validate requirements [96]. In regard to Rank = 1 It It It CC, limiting factors include lack of a specific customer Re, [106], for example in a market-driven context and organi- Rank = 2 PP, Re, PP Te zational & resourcing constraints [95]. Also, lack of cus- Me, Te tomers’ knowledge on importance of agile practices [123] Rank = 3 PG Me, Te CC, Me, Re and particularly his/her involvement right from the start PG, CC, Su, SR, CI, PG, Rank = 4 CC of the project can affect the use of CC. Customer in- Co, CS, SD PP volvement is considered an important ingredient for suc- SR, CI, 40h, 40h, Su, CO, Rank = 5 Su cess, however customer can find close coordination of Rt, Of, Tm CS, SD developers sharing same room and specification pair programmers noisy [136]. Also, in large projects involving 4.4.2 Application type large teams, application domain can be beyond the expe- In regard to application type (P10), we extracted two rience of the customers [123]. To this end, use of customer categories: Embedded software and Non-embedded proxy [125] or surrogate customer can be used [111] [193]. software. Among studies performed in industry, we iden- On the positive side, customer involvement can bring tified 19 and 43 studies where application types were em- rapid feedback, better use of valuable resources leading to bedded and non-embedded software respectively. a better product [123]. Among studies performed in academia, non-embedded Design practice SD seems to be more pronounced with software was reported in 13 studies and we did not iden- large teams based on the data. Lack of theoretical guid- tify any study on embedded software. An overview of ance can be a major setback in SD adoption [123]. It can be adoption of agile practices is given in Table 16. particularly difficult in large projects where the team has to plan for how the new code would interact with coming TABLE 19 Application type functionality [101]. SD can be considered very abstract and thus many teams do big upfront design to develop Industry Academia full design [72] [82]. However, having continuous unit Non- Non- Practice Embedded testing can harness SD because of coordination between embedded embedded (19) requirements development and code production [123]. (43) (13) The practice 40h is more pronounced among experienced % % % practitioners both for small and large teams. An enabling factor for 40h is strict implementation of working tasks It 84 96 38 such as iterations and effective management [123]. How- Su 42 93 30 ever, large teams may face stress situations owing to SR 42 31 30 deadlines and small companies may not adhere to the CO 53 27 38 standard 8-hour work schedule on a daily basis [123]. Moreover, management may have negative perception on CS 53 33 23

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 25

Re 47 42 54 observed that CS and CO have been well established be- fore agile was incorporated. It is also possible that coding CI 53 47 46 guidelines are informal and not written down in print CC 42 40 38 [107]. It can also be possible that for this reason experi- PP 58 58 92 enced practitioners do not need CS as they accumulate TDD 26 24 61 good coding practices through experience. A reason for not practicing CO can be that person with highest Rt 32 38 15 knowledge often inspects the code and make changes PG 32 40 46 [107]. Developers can be reluctant to adapt to code writ- SD 32 24 23 ten by others [261]. Adoption levels of CC are almost the same for both Me 58 49 30 embedded and non-embedded software. However, the Te 47 47 38 role of customer may not be always necessary for devel- oping embedded systems. For example, for certain sys- PP is found to be one of the most commonly adopted tems the hardware-software architecture and interaction practice both in academia and industry and also for both is predefined and therefore limits the possibility of nego- embedded and non-embedded. However, both PP and tiating with the customer [213]. This can also be true to TDD obtained more emphasis in academia. Auvinen [127] telecommunication systems where the individual systems in their pilot project recommend that PP should not be are typically large, evolve over the years and released to a enforced rather should be recommended to the team. A mass market. Moreover, regular communication is neces- recommendation was that PP gives room to the pro- sary between hardware and software teams so that the grammers to use their rhythm and experience and in turn changes proposed by hardware design teams (for exam- was found to increase the competence level and ple, processor design) are incorporated [213]. Likewise, knowledge sharing of the whole team. PP was also used for many non-embedded applications customers are not in the context of distributed development for products in always present. For example, for a banking software real the telecom sector, where remote pairing was conducted customers are banks and ideal customers are not readily using teleconferencing and common IDE [281]. On the available [218]. For small projects, CC can be expensive other hand PP may requires patience and can be strenu- and demanding [207]. Customers can find the working ous [127]. Also, pairing can be perceived as a tedious style of agile offbeat and some practices such as PP can be process and can be limited to detailed design and com- perceived as noisy [207]. However, an inside customer plex code [213]. who can write stories and performs acceptance tests is Re appears to be one of the highly adopted practices very beneficial to the team [116]. A committed customer for the embedded software. Re helps in improving the can participate in team’s daily meetings, i.e. Su and also quality of code by reducing the kludginess, making the in prioritization of user stories thus being an enabling design simple, thus making the code more maintainable factor for Me adoption [116]. This can be true for both and facilitating room for improvement in the low level embedded and non-embedded systems. However, ac- implementation [213]. On the flipside, Re can be a very cording to the data in Table 16, Me has a low level of different practice to learn for practitioners that are new to adoption in the context of non-embedded systems. A pos- agile [116]. It was reported that there is no formal mecha- sible reason is that Me and also SD are abstract in nature nism to evaluate the quality of the refactoring’s [132] and and may be problematic to use [74]. Moreover, there is no also no automation support to verify refactoring intro- guidance to apply these practices [75]. This may be a duced bugs and therefore can be tedious and error prone problem especially to teams that are new to agile as can [222]. This can be a problem for both embedded and non- be seen from Table 14, where we identified a low level of embedded software, for example in a mission-critical sys- adoption for Me in the context of small teams, and small tem and also for software supporting banking activities. It teams that are new to agile. Modifiability, maintainability was also found that Re strongly supports Object oriented and dependability are very important aspects of embed- code [85] and thus object orientation can be a factor in ded systems, which can be described through Me [76]. adopting Re. Successful implementation of Re needs a Practitioners seem to better describe these issues through strong support for automated regression testing to man- detailed description by the use of Me. This can be inter- age large amount of code [107]. preted based on data in Table 19. Salo and Abrahmsson [282] conducted a survey in the In regard to other practices, adoption levels for CI are context of European embedded software. Their results similar for both embedded and non-embedded, more im- show that 40h and ‘Open office’ are the most widely portance for continuous testing is given for embedded adopted practices. However, they stated that they might systems while TDD is considered as a more important have been adopted and used independent of agile philos- agile practice for non-embedded applications. Use of CI ophy. Results for other practices are similar. Similarly allows teams to identify issues where a change to one practices like CS and CO can also be implemented irre- component affects others. Moreover, CI enables frequent spective of agile philosophy. For example, Schooender- integration of hardware and software components so that woert [121] in his industrial experience of applying agile at the end of the iteration the embedded product is in practices on an embedded software product from scratch completely tested and in usable state [249]. Similarly, it

26 may be difficult to continuously integrate due to lack of with respect to the type of applications as shown in Table coordination between hardware and software teams op- 20. erating at different sites [255]. This is an infrastructure problem for practitioners at different regions in a distrib- TABLE 20 Practice ranking and application type uted development and as consequence practitioners face long delays during regular continuous builds [281]. Alt- Embedded Non-embedded hough CI has been adopted considerably for non- Rank = 1 It It embedded systems, a negative factor can be a lack of unit test framework [107]. This can be problem to beginner Rank = 2 PP, Me PP teams or small teams. Moreover, CI can be perceived as Rank = 3 CO, CS, CI Me complicated and time consuming [107]. TDD can be less Rank = 4 Re, Te CI, Te suitable for embedded software because of the hardware that usually takes long time from finish to production Rank = 5 Su, SR, CC Su [77]. Moreover, software at many times can be tested on real test environment, which often is not available imme- diately [77]. However, it is not clear from the studies if 4.4.3 Context specific practices unit tests were written before code construction although In our analysis in previous sections, we included stud- studies mention testing as part of iterations irrespective of ies that had focus on overall software development. We the type of application. Based on the data, we infer that also identified studies that reported the use of agile prac- practitioners prioritize to perform continuous testing by tice(s) for a specific cause/purpose. We classified such having the code tested as and when it is developed in studies into various categories. For example, agile has each iteration. In regard to SD, we found it is strongly been adopted in software companies for the purposes of supported by code refactorings, and requires expertise managing non-software activities, innovation manage- [116]. SD may be difficult to practice because of the com- ment etc. Some of the studies in industry focused on sin- plexity involved in embedded software such as timing gle practices, like evaluating the efficiency of PP or TDD, and memory constraints. In such cases maintaining sim- identifying practices relevant to architectural issues etc. plest possible design solution may be difficult [78]. How- We did not include such studies in our earlier analysis. In ever, sometimes assembly code in some of the embedded this section we summarized the findings from the studies systems can be difficult to maintain and thus, simple de- and practices adopted. First, a summary of practices in- sign is useful [78] [79]. However, this may not always be vented and adopted for non-software activities in soft- possible because teams have to focus on future milestones ware projects in industry is summarized in Table 21. and explore dependencies that could have drastic out- It can be observed that Studies on TDD reported comes in mission critical software [80]. For this reason, mixed results as far as defect detection and productivity teams may extend designing solutions beyond the current are concerned (See Table 24). This largely explains the milestone or iterations thus not exactly sticking to the SD low level of adopted of TDD in software product devel- practice. Moreover, SD is not highly adopted for other opment irrespective of the application type (See Sections type of applications. It is considered abstract [82] and also 4.3, 4.4.1, 4.4.2). On the other hand, studies on PP sug- there is a lack of guidance on how to keep the design pro- gested modifications to the practice (See Table 22). This cess [41]. observation can also be traced to studies reporting on PP Based on the data presented one should not be under and TDD in experiments and other empirical research the assumption that most of the studies focused on XP studies conducted with students in academia (See Ap- exclusively. In many of the studies it was reported that pendix C). Among these studies include support for PP in Scrum was used in combination with XP. For example, in distributed context [93] [216] [253] [256]. 18% of the studies reporting application type as non- embedded used a combination of scrum and XP and as low as 9% of the studies used Scrum exclusively. These figures are similar for embedded systems also. In summary, we ranked the most adopted practices

Table 21 Agile practices supporting non-technical activities

Article Description Category Practices • Study abroad - spending time with another team (for QA, content development) Reports experiences of how com- • Ask-a-dev (anyone is free to discuss with a [254] pany values have been used to Value creation designated person or pair) personalize agile practices. • Now-or-never bugs (every morning bugs are triggered by the customers and deve- lopers. “never” bugs are worked on im-

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 27

mediately, “never” bugs are recorded and discarded An experience report that shows • Staffing wall how culture is infected with agile • Recruitment wall thinking and how agile is applied Agile for non- • Learning matrix [98] to non-software development acti- software • Feedback in training sessions vities like training sessions, • Rotation between roles recruitment staffing, office re- • Small offices (150 to 200 people) forms, strategic decisions making. • Customers’ apprentice • Programmer on-site • Programmer holiday Article reports customer practices Customer practic- • Roadshow [88] used by agile teams. es • Customer pairing • Customer bootcamp • Big picture up-front • ReCalibration • Paper prototyping with key customers Investigated how the integration User-centered de- • Evaluating HTML mockups with cus- [89] of agile methods and user-centered sign tomers to get more requirements (UCD) is carried out in practice. • Integrating usability results in iterations • Not switching pairs frequently A lessons learned report on mis- • Letting the architecture emerge takes in implementing XP practic- Mistakes in XP • Not working in a team room [129] es in teams involving junior devel- implementation • Letting team members specialize opers. • Using long iterations • Not including junior developers • Two week sprint prior to start of devel- The paper reports case study of User-centered de- opment for user behavior analysis [145] integration of User experience de- sign/Innovation • First week of each sprint spent on re- sign and agile development. management searching and user testing • Personas

Table 22 Empirical studies focusing on PP

Article Description Results • Frequent pair rotation improves knowledge transfer • PP improves quality – low defect count The paper reports experiences of using PP in [134] • More suitable for complex tasks an industrial project (application type: embedded) • Development effort found to be lower than solo pro- gramming • The model proposes to build design code patterns in pairs and use these patterns to write code in solo The paper describes Software Process Fusion [151] • Results suggest that the longer programmers work (SPF) that unites pair and solo programming. alone, more code patterns they develop for reuse in pairs. The paper presents a field study on developer • Two person teams working independently are more [191] pairs and impact on productivity (application productive than pairs working concurrently type: embedded) • The framework found to be a mechanism for under- standing aspects of XP, assessment of their suitability The paper proposes and presents the results of on the basis of cognitive aspects a framework for the study of PP through ob- • Novice pairs were found to be overconfident believing [219] servations and analysis of verbalisations, inter- they would develop software in their own way thus actions and production of artefacts (application condradicting their peers type: non-embedded) • Experienced pairs had fewer interactions as they had better understanding of their role, expertise and of their peers

28

Table 23 Empirical studies with focus on software development process model

Article Description Results The paper introduces a development practice ‘Divide After You Conquer’ for large agile projects

• Successful implementation of the practice needs Conquer phase: A stable working example of the sys- efficient staffing at the beginning of the divide tem architecture and design is build iteratively by the phase core expert team. Use-cases are developed in Test • A need for communication between the sub- Driven manner. [157] teams other than frequent members’ rotation

was observed Divide phase: The divide phase starts by Dividing the

project into sub-teams based on business areas followed by

pairing between developers, busininess analysts and

quality analysts. Rotate team members from the newly staffed members thus allowing them to the practice CO

The paper presents a development model that maps • Model adopted showed a 60% drop in defects seven dimensions of agile maturity, defined into five • Velocity of the team, i.e. productivity improved maturity levels, to agile practices. between 8-20% • Many of the component teams were provided [250] Practices: TDD, Storing project builds and CM files, Re, by offshore suppliers and this was perceived as CI, a threat to their business model • As the model was highly engineering focused it was not initially treated positively by the busi- ness people

Table 24 Empirical studies with focus on TDD

Article Description Results • Writing tests before code construction resulted in higer quality (lesser defect density) products The paper reports a post hoc analysis of an • A decrease in 15% in defect density and a productivity [131] IBM team on the practice of TDD over a period loss of 15% was found and it was perceived as econom- of 5 years ically justified • A very little increase in Cyclomatic code complexity over the course of 10 releases was observed • Pre-release defect density decreased by 40-60% when compared to similar projects that did not use TDD The paper reports case studies at Microsoft and • An increase of 15-35% in development time was ob- [146] IBM on the utility of Test Driven development served (TDD) • It was recommended not to adopt TDD in the middle when the design is decided • Add tests to automatic build on a daily basis • It was found that test after coding reduces time for The article presents an experiment on TDD writing the tests and modifying the code [179] conducted with practitioners. • TDD was found to increase testing accuracy and im- prove code quality • TDD approach resulted in superior code quality • 78% of the programmers involved in TDD found that The paper presents an experiement comparing that the practice increases productivity [273] code development using TDD and traditional • TDD facilitates simpler design waterfall-like approach • Programmers following TDD took more time to com- plete the task

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 29

4.4.4 HRR Using communication channels like Wikis, Information The purpose of this section was to examine the radiators were found effective in communicating and also strength of evidence of the most adopted practices. To as a medium to keep track of progress [176]. CRC cards this end, we extracted practices from studies with HRR. were found efficient in explaining responsibilities and As shown in Fig. 6, there were 36 studies with HRR. inter module collaboration [133]. Other forms of commu- Among these, 12 studies reported development of em- nication practices are summed up in Table 25. bedded systems and 17 studies on non-embedded. There is a marginal variation of adoption levels for many of the TABLE 25 Communication practices practices when HRR is considered. However, adoption levels for PP, Me and Te are found to be similar in com- Communication parision to all the studies irrespective of rigor and rele- Practices Adapted Citation vance. For example, TDD adoption is found to be consid- Chat [223] erably less when compared to other practices and this Coffe/Ice cream meeting [172] observation is same for studies with HRR. On the other Communicating through Ims [172] hand, contrasting results can be noticed in case of practic- Communication channels - Wikis, Infor- [176] es SR, Re, CI and Rt. That is, there is a marginal difference mation radiators [253] in citation percentage for these practices. In the context of Conferencing embedded sytems, there is not much variation in terms of Cultural ambassadors to Overcome cul- [137] citation percentage for all the studies and studies with tural barriers among remote teams HRR, for practices PP, TDD, Me, Te, Re and CC. Howev- Extensive communication [205] er, among these practices PP (58%) was the most adopted Face to face communication [181] practice followed by Me (53%), Te (47%), Re and CC Quick feedback [205] (42%). Among studies reporting non-embedded systems Rapid interaction [270] and the ones with high HRR, low variation in citation Show and tell meetings [220] percentage can be found for the practices CC, Re, CO, Rt, Tools like wiki, cvs, im, discussion boards [190] [192] Me and Te. Among these practices, the most adopted White board [223] practice is Me (53%) followed by Te (47%), Re, CC and Rt Use of collaborative tools [94] (41%) and CO (29%). Based on the data it can be understood that the most adopted practices are PP, Me, Te, Re and CC for both ap- plication types. That is, evidence suggests that these prac- The role of customer was found in requirements elici- tices are most adopted. However, results drawn from tation, by using HTML mock-ups to gather more re- studies with HRR and all the selected primary studies quirements and paper prototyping [89] and inviting cus- reported do not agree with eachother. Moreover, among tomers for development process [158]. Participation of studies with HRR there is a not a broad discussion on customer spanned user stories development [260], re- how agile practices were implemented and the resulting views and feedback [227] and performing acceptance tests consequences. Therefore, we find that strength of evi- [87] [111] as can be seen in Table 26. dence on what are the most important agile practices, or what practices are considered the most agile in nature is TABLE 26 Customer collaboration adaptations not high. In regard to studies reporting context specific practic- es, many practices have been reported. However, we can- Customer collaboration not claim the applicability of these practices in different Practices Adapted Citations settings. For example, very few studies reported customer Active stakeholder participation [191] practices. Therefore, directness of evidence for these prac- tices in different settings is uncertain because these prac- Ambassador from onsite [192] tices. However, a practical implication is that practition- Co-located customer [204] ers can identify these practices to their own project char- Customer accepted interfaces [238] acteristics and examine what are suitable. Customer accustomed tests [178] 4.5 What modifications to agile practices have been reported? (RQ 1.3) Customer apprentice [82] The purpose of this research questions was to identify Customer Boot Camp [193] modifications/adaptions of various agile practices as re- Customer demos [108] [175] ported in the studies. To this end, we also identified prac- [37] [104] tices that were used to support agile incorporation. Most Customer feedback [114] [175] interesting adaptations belong to the Communications category. Practitioners have adopted various communica- Customer involvemnt [264] tion practices. These range from Chat or Instant messen- Customer pairing [82] [193] gers [172] [223] to using Coffee/Ice cream meetings [172].

30

Customer proxies [219] Scrum meetings shared by whole [87] [172] Customer role [90] team using Videoconferencing Customer tests [87] [111] Customers’ apprentice [193] Scrum teleconference across sites [128] Evaluating an HTML mockup ver- [89] Shared stand-up once in a week [87] sion with customers to get more Standup for 20 minutes [238] requirements Weekly scrum of scrum [144] Expert user involvement [112] Kickoff (2-3 hours): discuss stories [260] Meeting practices include Weekly release meetings with customer, agree on require- [266], meetings to form pairs for PP [134] and Dirty laun- ments, create list of tasks dry meetings [116]. Dirty laundry meetings are a unique Paper prototyping with customers [89] practice that was used to slove any personal conflicts in the team that could hinder productivity. Programmer on-site [193] Practices related to Office were regarding creating an Proxy client [192] open workspace [100] [210]. Open workspace facilitates Surrogate customer engagement [193] [111] communication among team members [20]. PP was found to be more efficient for critical or complex tasks and User involvment in development [227] therefore was thus limited to such tasks [216] [219]. In one process and feedback instance where the teams where physically distributed, web conference was used to practice PP to get new team members speed up in understanding the features of the The length of It was not uniform in the studies re- project [172].User research was put into place to prioritize viewed. It lengths varied from 2-3 days [257] to 4 weeks features in product backlog [145]. We found a variety of [104]. It specifications [261] were used to provide a de- tools/artifacts that were used to track Progress. These tailed account of requirements, features and information include Kanban visibility board [250], Burn down charts to test features in the It. Similarly, length of sprint was [108] [122] as can be seen from Table 28. kept to 2 weeks in many of the studies. Sprint breaks after 5 to 6 sprints were made mandatory in one study [173]. This was to ensure that team members do not burn out in TABLE 28 Adaptations for project progress the long run by doing sprints continuously. Other men- tions of sprint lengths include 4 weeks, three and a half Progress weeks. Web conferences and other forms of informal Practices Adapted Citation channels were used to conduct Su. Su were held majorly Burn down charts [221],[ 253] on daily basis but other variations like Su 3 times a week Customized share Point site [136] [136] and shared Su once in a week [87] were evident. An for team contact infor- overview of modifications/adaptations of Su is given in mation, status updates, an- Table 27. nouncements, discussions and documents TABLE 27 Adaptions for stand-ups Daily updates using Wikis [87]

Stand-up Darts [136] Practices Adapted Citation Kanban visibility board [253] Measure progress by work- [250] Daily communication through in- [102] ing software formal channels Productlog burn-down [108] Daily scrum meetings held by local [128] Project burn-down charts [108] [122] [172] sub-teams

Daily scrums reduced to three times [136] Project dashboard [112] a week Project dashboard and met- [96] Daily stand up meetings: These are [87] rics not shared by all team members, in- Project data sheet [147] stead rotating role of proxy is created Project velocity [248] Open communications and daily [248] [113] Project visibility [96] stand-up Sequencing activities using [138] Scrum masters meeting in scrum of [128] Design Structure Matrix scrums Structured time boxing [264] Tracing development project [138] Velocity assessments [290]

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 31

Effort estimation [187] TABLE 30 Continuous testing in agile

In the context of Re, automating Re was found to make Testing Re inexpensive and reduce human error when when Practices Adapted Citation modifications were frequently made to the system [96]. [143] [105], Acceptance testing Running multiple build per day [249] [167] was pointed [195] [207] [208], out for CI. Builds were run multiple times [249] and also Automate acceptance tests using a overnight [87]. We identified many variations to the prac- "“subcutaneous” test approach [249] tice Rt. Rt sessions were conducted, for example, by post- with a unit testing approach ing comments on Wikis [94], using videoconferencing [87] Automate as close to 100% of the [249] as can be seen in Table 29. acceptance tests

Automated regression testing [250] TABLE 29 Adaptations for conducting retrospectives Automated tests [220] Retrospectives Code inspection [143] Practices Adapted Citation Define and execute "just enough Asynchronous retrospective meetings [94] [250] acceptance tests" (posting comments and results on Wikis, Develop unit tests for all new code e-mailing the minutes of local scrum [250] meeting to the onshore team) during a sprint Collective and individual feedback [219] Frequent testing and feedback [205] Company retrospectives - 2 to 3 times a [247] Functional verification tests [131] [168] year, 90 minutes Integrated testing [253] [270] Continuous feedback [133] Interface integration testing [250] Daily feedback [167] Just enough testing [220] Demo and retrospective using videocon- [87] ferencing QA in sync [108] Discussing and reviewing project plan [135] Quality agreement [112] Estimate retrospective [149] Regression testing [102] Feedback in training sessions [98] Iteration demo and retrospective [158] Run all acceptance tests in the re- KPT (Keep/Problem/Try) [149] gression test suite with the build [250] Monthly review [113] (daily at a minimum) Quality review phase [143] Run all unit tests with every build [250] ReCalibration [82] [93] Story testing [147] Regular team reflection and adjustments [250] to become more effective Support regression tests [258] Retrospective sessions at each location [136] [248] [102] [105] and comments updated on SharePoint [128] [131] [143] Review (1-2 hours): demonstrate com- [260] [229] [163] [168] [176] [179] [201] pleted stories for customer approval and Unit testing [210] [213] [220] feedback Review feature list [143] [243] [245] [245] Review of iterations with the core team [122] [248] [249] [261] Rolling waves at the end of each itera- [96] [263] [264] [270] tion [255] Sprint demo [144] Weekly retrospectives [247]

4.6 What is the strength of evidence available on As discussed earlier, Te is one of the most adopted the use of agile practices in reference to agile practices. In many studies, we could not identify if tests philosophy? (RQ2) were written prior to code construction. In regard to the practice Te, we identified several variations and types like unit tests within each It, customer defined acceptance However, rigor found in these studies significantly var- tests etc. These are given in Table 30. Modifications for iesThe purpose of this research question was to evaluate other practices are less pronounced. A list of modifica- the strength of evidence on the use of agile practices. To tions to other practices is given in Appendix D. this end, studies reporting agile practices have been as- sessed on the grounds of rigor (See Table 6) and relevance (See Table 7). Rigor is calculated on a scale of 0 to 6 and

32 relevance is calculated on a scale 0 to 4 (a value of 4 im- plying highest relevance). Based on rigor identified, stud- V4: Responding to AP02: Welcome change ies are classified into three types: (1) Studies with a high change over following a AP10: Simplicity rigor (rigor ≥ 4), (2) Studies with medium rigor (rigor = 3), plan AP12: Continuous reflection and (3) Studies with a low rigor (rigor ≤ 3). An overview of rigor and relevance found in the studies is shown in Fig. 6. 108 studies (58%) were found to have a relevance of 4. That is, these studies were conducted in industry Based on the practices and agile principles that they and thus results would be more appropriate for industry relate to, we identified which of the core agile values practice and agile implementation in industry. Of these were aligned and not aligned to. In regard to studies with studies, only 36 studies (33%) are found to have rigor high rigor, practices implemented were aligned to attain- greater than 3, 29 studies with rigor 3 (27%) and the re- ing the core agile values. On the negative side, we identi- maining 43 studies are described rather poorly. This is fied many studies where practices implemented were not disappointing from a technology transfer perspective as it aligned to values V1, or V4. This phenomenon is similar is difficult to synthesize how agile practices were imple- to studies with low and medium rigor but we identified mented and on what contexts agile had been put into comparatively less percentage of studies where practices practice. Overall, 36 (19%) studies offer decision support were not aligned to reach either value V1 or V4. Over- for agile practices in the sense that these studies were view of studies and agile values that were not aligned conducted in industry (realistic settings) and were well when using agile practices are given in Tables 32 – 34. reported. Decision support, in this context implies that We could not identify any distinguishing relation with context is described to an extent practitioners can com- either company size or agile experience of practitioners in pare to their own settings, study design is described in a regard to alignment between practices and agile values. It way practitioners can identify a procedure to obtain simi- can be thus understood values V1 and V4 are often ne- lar results and finally validity of the study is described so glected or difficult to achieve. V1 is more related to team, as to identify factors affecting the study results. societal and communication aspects than technical as- In the first step we identified in which the practices pects like producing code and design. This is in line with implemented were aligned and not aligned to the four the observation with our results discussed in Section 4.3. core agile values. The four core agile values are the goals For example, practices like 40h week are less cited and of agile philosophy and the 12 principles are the rules that thus meaningless used than practices like PP and Contin- should be followed (See Section 2.1). A mapping of the 12 uous integration. Next, we selected the studies with high agile principles and agile values has been defined defined rigor to analyze if practices reported were aligned to the by Petersen [17]. He also identified linkage between vari- core agile values (See Table 32) ous agile practices and agile principles, thus establishing traceability between agile practices and the ultimate aim Table 32 of agile philosophy, i.e. agile values. Agile values and the Value alignment among studies with corresponding principles that should be followed are giv- Rigor > 4 en in Table 31. Practi- TABLE 31 Values N Company tioners Mapping values to agile principles not Ref . size experi- aligned ence Value Agile Principle(s) [284] 1 Micro [178] AP05: Motivated individuals 2 Small [261] V1: Individuals and inter- AP06: Face-to-face conver [248] actions over processes sation 2 Medium [214] and tools. AP08: Sustainable pace None [121] AP11: Self-organizing teams [127] 3 Large Beginner AP03: Frequent deliveries [218] V2: Working software AP04: Work together Not men- over comprehensive doc- [106] 1 AP07: Working software tioned umentation AP09: Technical excellence [173] 1 Medium V1 V3: Customer collabora- [113] 1 Large AP01: Customer satisfaction tion over contract negoti- Not men- V4 [90] 1 ation tioned [207] [208] 3 Micro Experi- None [210] enced [253] 1 Large

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 33

[87] [136] 2 Not men- [222] Not men- [176] tioned 4 [226] tioned [255] Table 34 [144] 1 Medium Value alignment among studies with rigor=3 V1 [102] 2 Large [177] Values Ref Nr Company Practitioners V1, V4 [220] 1 Medium not size experience aligned None [116] 2 Small Beginner Table 33 [213] Value alignment among studies with rigor<3 [158] 1 Medium [167] 1 Not men- Values not Ref N Company Practi- tioned aligned r size tioners V1 [234] 1 Medium experi- [139] 1 Large ence [247] 2 Not men- None [122] 2 Large Beginner [264] tioned [212] None [175] 2 Micro Experienced [266] 3 Not men- [176] [149] tioned [82] 2 Small [170] [204] V1, V4 [133] 1 Large [189] 1 Medium [129] 3 Not men- [264] 1 Large [163] tioned [111] 2 Not men- [250] [187] tioned None [254] 3 Small Experi- V1 [206] 1 Small [245] enced [205] 1 Medium [285] [211] 1 Not men- [108] 1 Large tioned [96] [172] 4 Not men- V4 [193] 1 Small [174] tioned [200] None [230] 1 Micro Not men- V1 [249] 2 Large [111] 2 Large tioned [109] [115] [173] 1 Not men- [128] tioned V1 [95] 1 Medium V1, V2 [157] 1 Not men- V1, V4 [244] 1 Not men- tioned tioned V2, V4 [175] 1 Not men- tioned Among the studies with Beginner teams, the instances None [114] 2 Small Not where principles AP04 and AP11 were none and there [281] men- was only one study that reported practices which corre- [132] 1 Medium tioned spond to following the principle AP08. This further [117] 2 Large means, it was not a rule to work together (AP04), main- [137] tain the pace of working for 40 hours a week (AP08) and [138] have the teams to self-organizing teams (AP11). Same [250] 6 Not men- phenomenon was observed in the case of teams with ex- [253] tioned perienced practitioners agile wise with the expectation [170] that there were a very few more studies that followed [192] AP08. From the studies reported in the review, we [198] mapped a few practices to principles. These are: ‘Chang- [241] ing role’ (AP11) [172], ‘Frequent rotation between on-site [262] and offsite teams’ (AP11) [138], and ‘Splitting teams across sites’ (AP11) [128]. V1 [95] 1 Medium

34

Although traceability between practices and princi- The method includes four main steps and is shown in ples was identified in our analysis, it is difficult to say Fig. 8. Step 1 of the method is covered in Table 31. Step 2 how rigorously a principle is followed actually. This is of the method is extensively covered by Petersen [17] and because it is not mentioned in the reviewed studies which not discussed here. Steps 3 and 4 of the method are in the principles were followed and seldom there was a note on following sections. agile principles. This kind of traceability was based on the analysis by Petersen [17]. Furthermore, based on the data from the studies reviewed it seems that importance to agile principles and core agile values is not much. It is also not clear how the practices were considered for adoption in the first place and secondly, what kind of evidence (opinion or measurement based initiates) is con- sidered as viable information for agile adoption. Adopt- ing agile practices on account of their popularity [81] and job satisfaction [249] is evident.

5 A METHOD FOR AGILE ASSESSMENT BASED ON CORE AGILE VALUES (MAA-CAV)

The purpose of this section is to introduce and exemplify the method proposed to measure how agile a company is Fig. 8. The four steps of MAA-CAV from the perspective of practices implemented. Traceabil- ity between the core agile values and principles is given 5.1 Step 3 - Calculating measures in Table 31. Agile values define the main goals of agile Measures are calculated for each of the agile values. Table philosophy and provide rationale why companies should 35 illustrates measures for each agile value based on the adopt agile [1]. Principles on the other hand constitute the mapping between principles and values. rules that should be followed to achieve the four agile goals [1]. In our analysis we found that although many 5.2 Step 4 – Measure representation companies achieve the four agile goals by implementing Representation of core agile value assessment is done various sets of agile practices, they do not adhere to fol- at two levels. Firstly, measures for each of the core agile lowing the 12 agile principles. For example, in the sample values are represented individually and next, these are of 20 studies with high region (rigor ≥ 4, See Table 32), 18 combined to identify how agile a company is. The nota- studies were found to have practices towards achieving tion to represent agile value assessment is in the form of a the four agile values. However, we did not find any in- fraction where the numerator shows the sum of all the stance where all 12 principles were followed. Practices measures of the principles corresponding to an agile val- corresponding to Principles ‘AP04: Work together’ and ue, and the denominator shows if an agile value is ad- ‘AP11: Self-organizing teams’ were not found in many dressed at all. That is, if an agile value is addressed, the studies. The same is the observation with the principle denominator takes the value 1 and 0 if it is not addressed ‘AP08: Sustainable pace’ which was found only in three at all. This kind of calculation is performed for all the four studies only. A general observation, thus, is achieving the agile values. Thus we get four fractions (referred hereafter four agile values is imperative to agile in practice and not as value fractions), each representing an agile value. The following principles. Also, by having a right set of prac- rationale for this kind of work up is to know if an agile tices, all four agile values can be attained without follow- value is addressed and if yes it facilities to know how ing all 12 agile principles as a rule. With certain number much of value is created (in the form of quantitative rep- of practices put into use, it is interesting to answer a few resentation between 0 and 1). questions that define agile. These are:

1. What principles are followed? TABLE 35 Measure definitions for values and princi- 2. How much of each agile core value is achieved? ples 3. How agile is the company in practice? Value Principle Measure For each of the Based on these questions we proposed the method Agile AP05: Motivated principles repre- assessment based on core agile values (MAA-CAV). The V1 individuals sented, measures purpose of the method is multifold. Firstly, the method AP06: Face-to-face defined are as enables mapping between agile practices implemented Maximum conversation follows and agile values. Secondly, based on the principles fol- value: 1 AP08: Sustainable : If followed and lowed we can measure how much of each value is creat- Minimum pace ed. Thirdly, the method enables to know how agile a achieved: 0.25 value: 0 AP11: Self- company is and finally provides decision support in iden- : If followed and organizing teams tifying improvement potential aspects related to agile. result correspond-

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 35

ing to this principle AP09 0.25 is not satisfactory: 0.125 ≈ 0.12 V3 - - 0/0 : If not followed: 0 For each of the AP02 0.33 principles repre- sented, measures AP03: Frequent V4 AP10 0.33 1/1 defined are as deliveries V2 follows AP04: Work to- AP12 0.33 :If followed and gether Maximum achieved: 0.25 2.75/3 AP07: Working value: 1 :If followed and Combining V1, 2,3 and 4 Level 1 – Par- software Minimum result correspond- tial: Partial AP09: Technical value: 0 ing to this principle excellence is not satisfactory:

0.125 ≈ 0.12 Once measures for each of the agile values are collect- :If not followed: 0 ed in the form of fractions, they are combined to show a quantitative representation of how agile a company is. :If followed and The resultant representation is a fraction where the nu- V3 achieved: 0.33 merator is the sum of all the numerators of the value frac- :If followed and tions and denominator is the sum of denominators of the AP01: Customer Maximum result correspond- value fractions. The rationale here is that if a company satisfaction value: 1 ing to this principle says it is agile, it ought to address the four core agile val-

Minimum is not satisfactory: ues given in agile manifesto. value: 0 0.165 ≈ 0.16 We defined four levels to represent core agile assess- :If not followed: 0 ment. This is shown in Fig. 9. For example, Level 4 means For each of the prin- that all the four core agile values were addressed and ciples represented, were complete. That is in essence, all the principles were measures defined actually followed in practice. An example to illustrate the V4 are as follows method described is given here (See Table 36). Lets’ con- AP02: Welcome :If followed and sider among principles contributing to V1, AP05 is not change Maximum achieved: 0.33 followed at all. According to the measures given in Table AP10: Simplicity value: 1 :If followed and 35, AP06, AP08 and AP11 get 0.25 points each. Next for AP12: Continuous Minimum result correspond- V2, let us consider all the principles are followed with reflection value: 0 ing to this principle successful results; principles contributing to V3 are not is not satisfactory: followed while principles contributing to V4 are followed. 0.165 ≈ 0.16 The next step, calculating measures and defining core :If not followed: 0 value assessment using the scale is given in Table 35. Based on the scale defined, agile assessment level of the example company is Level 1 which means that prac- TABLE 36 Measure illustration tices implemented by the company do not address all the four core agile values. Agile assessment value 2.75/3 Agile assess- shows that only 3 agile values were addressed (i.e. de- Principles Value Measures ment repre- nominator) and the 3 values were partially achieved (i.e. followed sentation numerator). A visualization chart can be used to know the area of improvement potential. An example of such AP06 0.25 visualization is shown in Fig. 10. In the graph, it can be noted that a small area is agile and a large area of what AP08 0.25 can be agile is untouched. A template (Table 37) is de- V1 0.75/1 signed to aid the process of agile assessment for calculat- AP11 0.25 ing measures (Steps 2 and 3). This template can be used as a tool for charting out practices used and issues found

AP05 - thus pointing out to what worked and what did not. We illustrated the use of the method proposed using studies AP03 0.25 (rigor > = 4) found in our review. These are given in Ap- pendix D. V2 AP04 0.25 1/1

AP07 0.25

36

Fig. 9. Levels in agile assessment

TABLE 37 Template for agile assessment

• Project name • Team size • Team agile experience • Project duration! Nr. Practice Purpose Actors Outcome Other support- Notes ing practices used

Fig. 10. Agile assessment visualization

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 37

projects is not often aligned with the core agile values of agile philosophy. Most often non-technical aspects of ag- 6 CONCLUSION ile are not in place. To this end, we proposed a pre- We conducted a SLR that consists of 186 primary studies limenary method for agile assessment based on core agile published between 2000-10 that investigates what agile values (MAA-CAA). In our future work, we intend to practices have received most attention in terms of adop- validate and propose modifications. To this end, it is in- tion. To this end we identified the most adopted practices teresting to investigate the phenomenon or process be- in regard to team size, practitioners’ experience agile wise hind choosing a set of practices in projects. and application type. In addition we identified many em- pirical and non-empirical papers (research papers) that focused on specific aspects of agile software development and not software project development in general. These REFERENCES studies broadly fell into the following themes/categories: [1] K. Petersen, Implementing Lean and Agile Software Develop- ment in Industry, PhD Dissertation No. 2010:04, ISBN 978-91- 7295-180-8, Blekinge institute of technology, Karlskrona, 2010. • Overall software development: These studies dis- [2] George Stalk Jr. Time - the next source of competitive ad- cussed the implementation of agile practices in the vantage. Harvard Business Review, 66(4), 1988. [3] B. Kitchenham and S.L. Pfleeger, “Software quality: the elusive overall development of a software project/product target [special issues section],” Software, IEEE, vol. 13, Jan. and not on very narrowed aspects. 1996, pp. 12-21) • Non-technical activities: Studies that pro- [4] K. Elissa, “An Overview of Decision Theory,"unpublished. posed/used new practice to support non-technical (Unplublished manuscript) [5] R. van Solingen, D. F. Rico, and M. Zelkowitz, #Calculating aspects of agile projects like customer collaboration, software process improvement‘s return on investment,‖ in innovation management, Staffing etc. Advances in Computers, 2006, vol. 66, pp. 1–41 • Pair-programming/ Test-driven development: Stud- [6] Lan Cao, K. Mohan, PengXu, and B. Ramesh, “How extreme does extreme programming have to be? Adapting XP practices ies whose focus was solely on the effort to evaluate, to large-scale projects,” in System Sciences, 2004. Proceedings of propose modifications, or better understand Pair- the 37th Annual Hawaii International Conference on, 2004, p. 10 pp. programming or Test-driven development. [7] M. Scho¨nstro¨m, Software development as knowledge work – a conceptual framework, in: J. Hedman, T. Kalling, D. Khakar, • Architectural practices: These studies analyzed lack O. Steen (Eds.), Lund onInformatics, CBS Press, Copenhagen, of attention on software architecture in agile and 2005, pp.132–155. proposed/evaluated a set of architectural practices. [8] G. Kotonya and I. Sommerville, Requirements Engineering: Processes and Techniques, Wiley, 1998. • Quality assurance/security or risk assessment: The- [9] Warsta, J., Salo, O., Ronkainen, J. and Abrahamsson, P., “Agile se studies focused on proposing/evaluating quality software development methods. Review and analysis”, VTT assurance aspects of agile, for example evaluating re- Electronic Espoo, pp. 1-107, 2002. factoring, defining quality indicators in code refac- [10] M. Fowler, “The new methodology,” Wuhan University Journal of Natural Sciences, vol. 6, 2001, pp. 12-24. toring, proposing security practices to support exist- [11] B. Boehm, “Get Ready for Agile Methods, with Care,” Comput- ing agile development. er, vol. 35, Jan. 2002, p. 64–69. [12] H. Gallis, E. Arisholm, and T. Dybå, “An Initial Framework for Research on Pair Programming,” in Empirical Software Engineer- The review provides a list of findings according to the ing, International Symposium on, Los Alamitos, CA, USA, 2003, above themes/categories which practiotioners compare vol. 0, p. 132. with their settings and identify relevance to their needs. [13] W.W. Royce, “Managing the development of large software systems: concepts and techniques,” Proceedings of the 9th in- In regard to the resear papers presented, we identified a ternational conference on Software Engineering, Los Alamitos, divergence rather than concurrence. That is, research pa- CA, USA: IEEE Computer Society Press, 1987, p. 328–338. pers presented discussions on a variety of topics and thus [14] P. Kruchten, The Rational Unified Process: An Introduction, our study is limited by the directness of the evidence in Addison-Wesley Professional, 2000. [15] V. Schuppan and W. Russwurm, “A CMM-Based Evaluation of extracting combinatorial results. Our findings also show the V-Model 97,” 2000 how practices were modified/adapted. In this regard, [16] T. Hall, A. Rainer, N. Baddoo, and S. Beecham, "An empirical practices related to the role of customer, Stand-ups are study of maintenance issues within process improvement pro- grammes in the software industry," in Proceedings of the IEEE most often modified. International Conference on Software Maintenance, 2001, pp. As far as practice adoption is considered, XP practices 422-430. or Scrum in combination with XP are being favored. [17] K. Petersen, “Is Lean Agile and Agile Lean? A Comparison Between Two Development Paradigms”; in Modern Software However, we find a high rate of adoption for Pair- Engineering Concepts and Practices: Advanced Approaches; programming, Refactoring, Continuous Integration and Ali Dogru and VeliBicer (Eds.) continuous testing, i.e. testing throughout the project. [18] Truex, D. P., Baskerville, R. and Travis, J,” A methodical sys- tems development: The deferred meaning of systems develop- However, it is not clear if the tests are written before code ment systems,”. Accounting, Management and Information construction. However, we did not identify how success Technology 10, 53-79, 2000. and benefits of the use of agile practices were measured. [19] “Manifesto for Agile Software Development.” [Online]. Availa- In many studies, we did not find discussions on outcomes ble: http://agilemanifesto.org/. [Accessed: 10-May-2011]. [20] K. Beck, Extreme Programming Explained: Embrace Change, of agile implementation and measures taken into consid- Addison-Wesley, 2000. eration. To this end, a framework to measure the outcome [21] K. Schwaber, M. Beedle, “Agile Software Development with of agile practices is needed. Scrum,” Prentice Hall, Upper Saddle River, 2001. We also identified that the choice of agile practices in

38

[22] M. Poppendieck, T. Poppendieck, Lean Software Development [49] V. Rajlich, “Changing the paradigm of software engineering,” – An Agile Toolkit for Software Development Managers, Addi- Communications of the ACM, vol. 49, Aug. 2006, p. 67–70. son-Wesley, Boston, 2003. [50] Erickson, K. Lyytinen, and K. Siau, “Agile Modeling, Agile [23] A. Cockburn, Crystal Clear: A Human-Powered Methodology Software Development, and Extreme Programming,” Journal of for Small Teams, Addison-Wesley, 2004. Database Management, vol. 16, 34. 2005, pp. [24] S.R. Palmer, J.M. Felsing, A Practical Guide to Feature-driven [51] M. Striebeck, “Ssh! We Are Adding a Process...,” AGILE Con- Development, Prentice Hall, Upper Saddle River, NJ, 2002. ference, Los Alamitos, CA, USA: IEEE Computer Society, [25] J. Stapleton, DSDM: Business Focused Development, second 2006, pp. 185-193. ed., Pearson Education, 2003. [52] M. Ivarsson and T. Gorschek, “Technology transfer decision [26] M. Fowler, Is design dead?, Addison-Wesley Longman Publish- support in requirements engineering research: a systematic re- ing Co., Inc. Bostonp. 3–17, 2001. view of REj,” Requirements Engineering, vol. 14, [27] J. Sutherland and W.-J. van den Heuvel, “Enterprise application [53] Cohen, D., Lindvall, M., and Costa, P., “An Introduction to integration and complex adaptive systems,” Communications Agile Methods,” in Advances in Computers, Advances in Soft- of the ACM, vol. 45, Oct. 2002, p. 59–64. ware Engineering, vol. 62, M. V. Zelkowitz, Ed. Amsterdam: [28] PekkaAbrahamsson, “Agile Software Development Methods: A Elsevier, 2004. Minitutorial”, VTT publications, 2002. [54] J. Erickson, K. Lyytinen, and K. Siau, “Agile Modeling, Agile [29] Schwaber, C., and Fichera, R.: ‘Corporate IT leads the second Software Development, and Extreme Programming,” Journal of wave of agile adoption’ (Forrester Research, Inc., 2005). Database Management, vol. 16, 34. 2005, pp. 88-100. [30] Sillitti, A., Ceschi, M., Russo, B., et al.: ‘Managing uncertainty in [55] M. Laanti, O. Salo, and P. Abrahamsson, “Agile methods rapid- requirements: a survey in documentation-driven and agilecom- ly replacing traditional methods at Nokia: A survey of opinions panies’. Proc. Software Metrics 2005, 11th IEEE Int. Symp.,2005, on agile transformation,” Information and Software Technolo- pp. 17–27 gy, vol. 53, Mar. 2011, pp. 276-290. [31] Lindvall, M., Muthig, D., Dagnino, A., et al.: ‘Agile software [56] K. Sureshchandra and J. Shrinivasavadhani, “Adopting Agile in development in large organizations’, Comput., 2004, 37, pp. 26– Distributed Development,” Global Software Engineering, In- 34 ternational Conference on, Los Alamitos, CA, USA: IEEE Com- [32] Eckstein, J.: ‘Agile software development in large-diving into puter Society, 2008, pp. 217-221. the deep’ (Dorset House Publishing, 2004) [57] E. Hossain, M.A. Babar, H.-young Paik, and J. Verner, “Risk [33] O. Salo and P. Abrahamsson, “Agile methods in European em- Identification and Mitigation Processes for Using Scrum in bedded software development organisations: a survey on the Global Software Development: A Conceptual Framework,” actual use and usefulness of Extreme Programming and Asia-Pacific Software Engineering Conference, Los Alamitos, Scrum,” Software, IET, vol. 2, Feb. 2008, pp. 58-64. CA, USA: IEEE Computer Society, 2009, pp. 457-464. [34] Smigelschi, D.: ‘Combining predictability with extreme pro- [58] J. C. de Almeida Biolchini, P. G Milan, A. C. C Natali, T. U. gramming in embedded multimedia project’. Proc. 3rd Int. Conte, and G. H. Travassos, “Scientific research ontology to Conf. Extreme Programming and Agile Processes in Software support systematic review in software engineering, “ Advanced Engineering (XP2002), Sardinia, Italy, 2002, pp. 191–192 Engineering Informatic, vol. 21, no. 2, pp. 131-151, Apr. 2007. [35] B. Boehm, “Get Ready for Agile Methods, with Care,” Comput- [59] M. Ivarsson and T. Gorschek, “A method for evaluating rigor er, vol. 35, 2002, pp. 64-69. and industrial relevance of technology evaluations,” Empirical [36] B. Ramesh, with Lan Cao, “Agile Requirements Engineering Software Engineering, vol. 16, no. 3, pp. 365-395, Oct. 2010. Practices: An Empirical Study,” Software, IEEE, vol. 25, Feb. [60] J. Fleiss, “Measuring nominal scale agreement among many 2008, pp. 60-67. raters,” Psychological Bulletin, vol. 76, no. 5, pp. 378–382,1971. [37] L. Krzanik, P. Rodriguez, J. Simila, P. Kuvaja, and A. Rohunen, [61] J. R. Landis and G. G. Koch, “The measurement of observer “Exploring the Transient Nature of Agile Project Management agreement for categorical data,” Biometrics, vol. 33, no. 1, pp. Practices,” Jan. 2010, pp. 1-8. 159–174, Mar. 1977. [38] Glazer, H., J. Dalton, D. Anderson, M. Konrad, and S. Shrum, [62] “Enterprise - SME definition,” Aug. 2009. [Online]. Available: “CMMI® or Agile: Why Not Embrace Both!”, CMU/SEI-2008- http://ec.europa.eu/enterprise/enterprise_policy/ TN-003, Pittsburgh, 2008. sme_definition/index_en.htm [39] T. Dybå and T. Dingsøyr, “Empirical studies of agile software [63] M. V. Zelkowitz and D. Wallace, “Experimental validation in development: A systematic review,” Information and Software software engineering,” Information and Software Technolo- Technology, vol. 50, Aug. 2008, p. 833–859. gy,vol. 39, no. 11, pp. 735–743, 1997. [40] A Preliminary Roadmap for Empirical Research onAgile Soft- [64] Robert K. Yin. Case study research: design and methods. Sage ware Development. Publications, 3rdedition, 2003. [41] B. A. Kitchenham, T. Dyba, and M. Jorgensen, “Evidence-Based [65] Neill CJ, Laplante PA (2003) Requirements engineering: the Software Engineering,” Proceedings of the 26th International Con- state of the practice. IEEE Softw 20(6):40–45. ference on Software Engineering, p. 273–281, 2004. [66] Dybå T, Dingsøyr T et al (2007) Applying systematic reviews to [42] B. Kitchenham and S. Charters, “Guidelines for performing diverse study types: an experience report. First International systematic literature reviews in software engineering,” Soft- Symposium on Empirical Software Engineering and Measure- ware Engineering Group, Keele University and Department of ment (ESEM). ComputerScience, University of Durham, UK, Technical Report [67] A. Qumer and B. Henderson-Sellers, “An evaluation of the EBSE-2007- 01, Jul. 2007. degree of agility in six agile methods and its applicability for [43] A.C.C. França, F.Q.B. da Silva, and L.M.R. de Sousa Mariz, “An method engineering,” Information and Software Technology, vol. empirical study on the relationship between the use of agile 50, no. 4, pp. 280-295, Mar. 2008. practices and the success of Scrum projects,” Proceedings of the [68] B. George and L. Williams, “A structured experiment of test- 2010 ACM-IEEE International Symposium on Empirical Soft- driven development,” Information and Software Technology, vol. ware Engineering and Measurement, New York, NY, USA: 46, no. 5, pp. 337-342, Apr. 2004. ACM, 2010, p. 37:1–37:4. [69] K. Petersen and C. Wohlin, “Context in industrial software [44] A. Sidky and J. Arthur, “Determining the Applicability of Agile engineering research,” in Proceedings of the 3rd International Practices to Mission and Life-Critical Systems,” Proceedings of Symposium on Empirical Software Engineering and Measure- the 31st IEEE Software Engineering Workshop, Washington, ment, Orlando, USA, 2009, pp. 401–404. DC, USA: IEEE Computer Society, 2007, p. 3–12. [70] Peeters J. Agile Security Requirements Engineering. Presented [45] J. Highsmith, Agile Software Development Ecosystems, Addi- at the Symposium on Requirements Engineering for Infor- son-Wesley Professional, 2002. mation Security, 2005. [46] K. Schwaber and M. Beedle, Agile Software Development with [71] P. Schuh, “Recovery, redemption, and extreme programming,” Scrum, Prentice Hall, 2001. Software, IEEE, vol. 18, no. 6, pp. 34-41, 2001. [47] C. Hansson, Y. Dittrich, B. Gustafsson, and S. Zarnak, “How [72] N. B. Moe, T. Dingsøyr, and T. Dybå, “Understanding self- agile are industrial software development practices?,” Journal organizing teams in agile software development,” in 19th Aus- of Systems and Software, vol. 79, Sep. 2006, p. tralian Conference on Software Engineering, 2008, pp. 76–85. [48] T.S. Kuhn, The Structure of Scientific Revolutions, University [73] T. Dingsøyr, G. Hanssen, T. Dybå, G. Anker, and J. Nygaard, Of Chicago Press, 1996. “Developing software with scrum in a small cross-

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 39

organizational project,” Software Process Improvement, pp. 5–15, [94] E. Hossain, M.A. Babar, and H.-young Paik, “Using Scrum in 2006. Global Software Development: A Systematic Literature Re- [74] K. Conboy and B. Fitzgerald, “Method and developer charac- view,” Global Software Engineering, International Conference teristics for effective agile method tailoring: a study of XP ex- on, Los Alamitos, CA, USA: IEEE Computer Society, 2009, pp. pert opinion,” ACM Transactions on Software Engineering and 175-184. Methodology (TOSEM), vol. 20, no. 1, p. 2, 2010. [95] P Clutterbuck, T Rowlands, O Seamons (2009) A case study of [75] . Rumpe and A. Schröder, “Quantitative survey on extreme SME Web application development via agile methods. In Pro- programming projects,” in Third International Conference on Ex- ceedings of the 2nd European Conference on Information Man- treme Programming and Flexible Processes in Software Engineering, agement and Evaluation, Academic Publishing Limited, Read- XP2002, May, pp. 26–30. ing, UK, pp. 77-87. [76] J. Srinivasan, R. Dobrin, and K. Lundqvist, “‘State of the Art’in [96] H Koehnemann, M Coats (2009) Experiences Applying Agile Using Agile Methods for Embedded Systems Development,” in Practices to Large Systems. In Agile Conference, 2009. AGILE 2009 33rd Annual IEEE International Computer Software and Appli- '09, Chicago, pp. 295-300. cations Conference, 2009, pp. 522–527. [97] P. Hamill, D. Alexander, and S. Shasharina, “Web Service Vali- [77] J. Grenning and E. Class, “Test driven development for embed- dation Enabling Test-Driven Development of Service-Oriented ded software,” ESC, vol. 241, 2007. Applications,” Services, IEEE Congress on, Los Alamitos, CA, [78] T. Punkka, “Agile Methods and Firmware Development.”. USA: IEEE Computer Society, 2009, pp. 467-470. [79] M. Pikkarainen, J.M Sagredo and S Estela, "An Agile Practice [98] C. Doshi and D. Doshi, “A Peek into an Agile Infected Culture,” Tool Box." [Online]. Available: http://www.agile- AGILE Conference, Los Alamitos, CA, USA: IEEE Computer itea.org/public/deliverables.php. [Accessed: 09-Oct-2011]. Society, 2009, pp. 84-89. [80] J. Bowers, J. May, E. Melander, M. Baarman, and A. Ayoob, [99] M. Zeng, C. Wang, and Q.-yun Long, “Practices of Extreme “Tailoring XP for large system mission critical software devel- Programming for ERP Based on Two-dimensional Dynamic opment,” Extreme Programming and Agile Methods—XP/Agile Time Scheduling Interface Method,” Software Engineering, Universe 2002, pp. 269–301, 2002. World Congress on, Los Alamitos, CA, USA: IEEE Computer Society, 2009, pp. 236-240. [100] K Usha, N Poonguzhali, E Kavitha (2009) A quantitative ap- proach for evaluating the effectiveness of refactoring in soft- SYSTEMATIC LITERATURE REVIEW REFERENCES ware development process. In Proceeding of International Con- ference on Methods and Models in Computer Science, ICM2CS 2009, Delhi, pp. 1-7. [81] P.K. Vejandla and L.B. Sherrell, “Why an AI research team [101] E. Hossain, M.A. Babar, H.-young Paik, and J. Verner, “Risk adopted XP practices,” Proceedings of the 47th Annual South- Identification and Mitigation Processes for Using Scrum in east Regional Conference, New York, NY, USA: ACM, 2009, p. Global Software Development: A Conceptual Framework,” 72:1–72:4. Asia-Pacific Software Engineering Conference, Los Alamitos, [82] A. Martin, R. Biddle, and J. Noble, “XP Customer Practices: A CA, USA: IEEE Computer Society, 2009, pp. 457-464. Grounded Theory,” AGILE Conference, Los Alamitos, CA, [102] A.I. Nawaz and I.A. Zualkernan, “The role of agile practices in USA: IEEE Computer Society, 2009, pp. 33-40. disaster management and recovery: a case study,” Proceedings [83] R. Mahapatra, S.P. Nerur, K. Price, and V. Balijepally, “Are Two of the 2009 Conference of the Center for Advanced Studies on Heads Better than One for Software Development? The Collaborative Research, New York, NY, USA: ACM, 2009, p. Productivity Paradox of Pair Programming,” Management In- 164–173. formation Systems Quarterly, vol. 33, Mar. 2009, pp. 91-118. [103] A Garg (2009) Agile Software Development. DRDO Science [84] Nawrocki, J.R., Jasinski, M., Bartosz, W., Wojciechowski, A.: Spectrum, pp. 55-59. Combining Extreme Programming with ISO 9000. In: Proceed- [104] C.A. Wellington, “Managing a project course using Extreme ings of EurAsia-ICT 2002, Iran, October 29-31, 2002. Lecture Programming,” Oct. 2005, p. T3G-1. Notes in Computer Science, Vol.2510. Springer-Verlag, Berlin [105] G. Vanderburg, “A simple model of agile software processes – Heidelberg New York (2002) 786–794. or – extreme programming annealed,” ACM SIGPLAN Notices, [85] M.A. Babar, “An exploratory study of architectural practices vol. 40, Oct. 2005, p. 539–545. and challenges in using agile software development approach- [106] B. Upender, “Staying Agile in Government Software Projects,” es,” Sep. 2009, pp. 81-90. Agile Development Conference/Australasian Database Confer- [86] P. Middleton, A. Flaxel, and A. Cookson, “Lean Software Man- ence, Los Alamitos, CA, USA: IEEE Computer Society, 2005, pp. agement Case Study: Timberline Inc.,” in Extreme Programming 153-159. and Agile Processes in Software Engineering, vol. 3556, H. [107] H. Svensson and M. Höst, “Introducing an Agile Process in a Baumeister, M. Marchesi, and M. Holcombe, Eds. Berlin, Hei- Software Maintenance and Evolution Organization,” Software delberg: Springer Berlin Heidelberg, 2005, pp. 1-9. Maintenance and Reengineering, European Conference on, Los [87] J. Sutherland, G. Schoonheim, N. Kumar, V. Pandey, and S. Alamitos, CA, USA: IEEE Computer Society, 2005, pp. 256-264. Vishal, “Fully Distributed Scrum: Linear Scalability of Produc- [108] J W Spence There has to be a better way! [software develop- tion between San Francisco and India,” AGILE Conference, Los ment]. In Proceedings of Agile Conference, AGILE 2005, Den- Alamitos, CA, USA: IEEE Computer Society, 2009, pp. 277-282. ver,2005, pp. 272-278. [88] Wanfeng Dou, Kui Hong, , and , Xiaoyong Zhang, “A Frame- [109] N Pikkarainen, U Passoja (2005) An Approach for Assessing work of Distributed Pair Programming System,” Dec. 2009, pp. Suitability of Agile Solutions: A Case Study. In 6th International 1-4. Conference on Extreme Programming and Agile Processes in [89] Z Hussain, W Slany, A Holzinger (2009) Investigating Agile Software Engineering, pp. 171-179. User-Centered Design in Practice: A Grounded Theory Perspec- [110] R.F. Paige, H. Chivers, J.A. McDermid, and Z.R. Stephenson, tive. In Andreas 5th Annual Usability Symposium 2009, Linz, “High-integrity extreme programming,” Proceedings of the pp. 279-289. 2005 ACM symposium on Applied computing, New York, NY, [90] S. Shahzad, “Learning from Experience: The Analysis of an USA: ACM, 2005, p. 1518–1523. Extreme Programming Process,” Information Technology: New [111] V B Miic Perceptions of extreme programming: A pilot study, Generations, Third International Conference on, Los Alamitos, In Engineering Management Conference 2005, Newfoundland, CA, USA: IEEE Computer Society, 2009, pp. 1405-1410. 2005, pp. 307-312. [91] D. Batra, “Modified agile practices for outsourced software [112] T. Little, “Context-adaptive agility: managing complexity and projects,” communications of the ACM, vol. 52, Sep. 2009, p. uncertainty,” Software, IEEE, vol. 22, Jun. 2005, pp. 28- 35. 143–148. [113] A. Law and S. Learn, “Waltzing with changes [agile software [92] J. Sutherland and I. Altman, “Take No Prisoners: How a Ven- development],” Jul. 2005, pp. 279- 285. ture Capital Group Does Scrum,” AGILE Conference, Los [114] M RHilkka, T Tuure, R Matti Is extreme programming just old Alamitos, CA, USA: IEEE Computer Society, 2009, pp. 350-355. wine in new bottles: A comparison of two cases. Journal of Da- [93] T Schummer, S Lukosch (2009) Understanding Tools and Prac- tabase Management 16(4), 2005, pp. 41-61. tices for Distributed Pair Programming. Journal of Universal [115] Y Dubinsky, O Hazzan, A Keren” Introducing extreme pro- Computer Science 15(16): 3101-3125. gramming into a software project at the Israeli Air Force”. In

40

6th International Conference on Extreme Programming and Ag- [137] R. Urdangarin, P. Fernandes, A. Avritzer, and D. Paulish, “Ex- ile Processes in Software Engineering, Sheffield,2005, pp. 19-27. periences with Agile Practices in the Global Studio Project,” [116] A F Da Silva, F Kon, C Torteli (2005) XP South of the Equator: Global Software Engineering, International Conference on, Los An eXPerience Implementing XP in Brazil. In Proceedings of Alamitos, CA, USA: IEEE Computer Society, 2008, pp. 77-86. the 6th International Conference on Extreme Programming and [138] K. Sureshchandra and J. Shrinivasavadhani, “Adopting Agile in Agile Processes in Software Engineering, Sheffield, pp. 10-18. Distributed Development,” Global Software Engineering, In- [117] C Olivier, C David, G M George G (2005) A software methodol- ternational Conference on, Los Alamitos, CA, USA: IEEE Com- ogy for applied research: eXtreme Researching. Software: Prac- puter Society, 2008, pp. 217-221. tice and Experience 35(15): 1441-1454. [139] M. Summers, “Insights into an Agile Adventure with Offshore [118] G Canfora, A Cimitile, A A Visaggio (2005) Empirical study on Partners,” AGILE Conference, Los Alamitos, CA, USA: IEEE the productivity of the pair programming. In 6th International Computer Society, 2008, pp. 333-338. Conference, XP 2005, pp. 92-99. [140] C.A. Siebra, M. Filho, F. Silva, and A. Santos, “Deciphering [119] E. Bellini, G. Canfora, F. García, M. Piattini, and C.A. Visaggio, extreme programming practices for innovation process man- “Pair designing as practice for enforcing and diffusing design agement,” Sep. 2008, pp. 1292-1297. knowledge,” Journal of Software Maintenance and Evolution: [141] A.A. Sharifloo, A.S. Saffarian, and F. Shams, “Embedding Ar- Research and Practice, vol. 17, Nov. 2005, pp. 401-423. chitectural Practices into Extreme Programming,” Software En- [120] S. Xu and V. Rajlich, “Empirical Validation of Test-Driven Pair gineering Conference, Australian, Los Alamitos, CA, USA: IEEE Programming in Game Development,” Computer and Infor- Computer Society, 2008, pp. 310-319. mation Science, 5th IEEE/ACIS International Conference on, [142] S.H. Rayhan and N. Haque, “Incremental Adoption of Scrum Los Alamitos, CA, USA: IEEE Computer Society, 2006, pp. 500- for Successful Delivery of an IT Project in a Remote Setup,” AG- 505. ILE Conference, Los Alamitos, CA, USA: IEEE Computer Socie- [121] N.V. Schooenderwoert, “Embedded Agile Project by the Num- ty, 2008, pp. 351-355. bers With Newbies,” AGILE Conference, Los Alamitos, CA, [143] M Qasaimeh, H Mehrfard, and A Hamou-Lhadj (2008) Compar- USA: IEEE Computer Society, 2006, pp. 351-366. ing Agile Software Processes Based on the Software Develop- [122] M. Striebeck, “Ssh! We Are Adding a Process...,” AGILE Con- ment Project Requirements. In International Conference on ference, Los Alamitos, CA, USA: IEEE Computer Society, Computational Intelligence for Modelling, Control and Auto- 2006, pp. 185-193. mation, Vienna, pp. 49-54. [123] P. Sfetsos, L. Angelis, and I. Stamelos, “Investigating the ex- [144] M Paasivaara, S Durasiewicz and C Lassenius (2008) Using treme programming system–An empirical study,” Empirical scrum in a globally distributed project: a case study. Software Software Engineering, vol. 11, 2006, pp. 269-301. Process: Improvement and Practice 13(6): 527-44. [124] J.-G. Schneider and R. Vasa, “Agile Practices in Software De- [145] M. Najafi and L. Toyoshiba, “Two Case Studies of User Experi- velopment - Experiences from Student Projects,” Software En- ence Design and Agile Development,” AGILE Conference, Los gineering Conference, Australian, Los Alamitos, CA, USA: IEEE Alamitos, CA, USA: IEEE Computer Society, 2008, pp. 531-536. Computer Society, 2006, pp. 401-410. [146] N. Nagappan, E.M. Maximilien, T. Bhat, and L. Williams, “Re- [125] R Moser, A Sillitti, P Abrahamsson, G Succi (2006) Does Refac- alizing quality improvement through test driven development: toring Improve Reusability? In 9th International Conference on results and experiences of four industrial teams,” Empirical Software Reuse, ICSR 2006, Torino pp. 287-97. Software Engineering, vol. 13, 2008, pp. 289-302. [126] P. Meso and R. Jain, “Agile Software Development: Adaptive [147] R. Mugridge, “Managing Agile Project Requirements with Sto- Systems Principles and Best Practices,” Information Systems rytest-Driven Development,” IEEE Software, vol. 25, 2008, pp. Management, vol. 23, 2006, p. 19. 68-75. [127] J. Auvinen, R. Back, J. Heidenberg, P. Hirkman, and L. Mi- [148] L. Madeyski, “Impact of pair programming on thoroughness lovanov, “Software Process Improvement with Agile Prac- and fault detection effectiveness of unit test suites,” Software tices in a Large Telecom Company,” in Product-Focused Software Process: Improvement and Practice, vol. 13, May. 2008, pp. 281- Process Improvement, vol. 4034, J. Münch and M. Vierimaa, Eds. 295. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006, pp. 79-93. [149] F. Kinoshita, “Practices of an Agile Team,” Aug. 2008, pp. 373- [128] J. Sutherland, A. Viktorov, J. Blount, and N. Puntikov, “Distrib- 377. uted Scrum: Agile Project Management with Outsourced De- [150] A. Martin, J. Noble, and R. Biddle, “Programmers are from velopment Teams,” Hawaii International Conference on System Mars, customers are from Venus: a practical guide for custom- Sciences, Los Alamitos, CA, USA: IEEE Computer Society, 2007, ers on XP projects,” Proceedings of the 2006 conference on Pat- p. 274a. tern languages of programs, New York, NY, USA: ACM, 2006. [129] R. Lawrence, “XP and Junior Developers: 7 Mistakes (and how [151] K. M. Lui and K. C. C. Chan, “Software Process Fusion: Uniting to avoid them),” Aug. 2007, pp. 234-239. Pair Programming and Solo Programming Processes,” in Soft- [130] L Madeyski (2007) On the effects of pair programming on thor- ware Process Change, vol. 3966, Q. Wang, D. Pfahl, D. M. Raffo, oughness and fault-finding effectiveness of unit tests. In 8th In- and P. Wernick, Eds. Berlin, Heidelberg: Springer Berlin Hei- ternational Conference on Product-Focused Software Process delberg, 2006, pp. 115-123. Improvement, Riga, pp. 207-221. [152] D.B. Levine and H.M. Walker, “XP practices applied to grad- [131] J.C. Sanchez, L. Williams, and E.M. Maximilien, “On the Sus- ing,” ACM SIGCSE Bulletin, New York, NY, USA: ACM, 2006, tained Use of a Test-Driven Development Practice at IBM,” p. 173–177. Aug. 2007, pp. 5-14. [153] D. Kirk and E. Tempero, “Identifying Risks in XP Projects [132] R. Sison and T. Yang, “Use of Agile Methods and Practices in through Process Modelling,” Software Engineering Conference, the Philippines,” Asia-Pacific Software Engineering Conference, Australian, Los Alamitos, CA, USA: IEEE Computer Society, Los Alamitos, CA, USA: IEEE Computer Society, 2007, pp. 462- 2006, pp. 411-420. 469. [154] K.M. Lui and K.C.C. Chan, “Pair programming productivity: [133] M. Sulfaro, M. Marchesi, and S. Pinna, “Agile Practices in a Novice-novice vs. expert-expert,” International Journal of Hu- Large Organization: The Experience of Poste Italiane,” in Agile man-Computer Studies, vol. 64, Sep. 2006, pp. 915-925. Processes in Software Engineering and Extreme Programming, vol. [155] R. N. Jensen, T. Møller, P. Sönder, and G. Tjørnehøj, “Archtec- 4536, G. Concas, E. Damiani, M. Scotto, and G. Succi, Eds. Ber- ture and Design in eXtreme Programming; Introducing ‘Devel- lin, Heidelberg: Springer Berlin Heidelberg, 2007, pp. 219-221. oper Stories’,” in Extreme Programming and Agile Processes in [134] Jari Vanhanen and Harri Korpi, “Experiences of Using Pair Software Engineering, vol. 4044, P. Abrahamsson, M. Marchesi, Programming in an Agile Project,” Jan. 2007, p. 274b-274b. and G. Succi, Eds. Berlin, Heidelberg: Springer Berlin Heidel- [135] X. Wang, Z. Wu, and M. Zhao, “The Relationship between De- berg, 2006, pp. 133-142. velopers and Customers in Agile Methodology,” Computer Sci- [156] H. Holmström, B. Fitzgerald, P.J. Ågerfalk, and E.Ó. Conchúir, ence and Information Technology, International Conference on, “Agile Practices Reduce Distance in Global Software Develop- Los Alamitos, CA, USA: IEEE Computer Society, 2008, pp. 566- ment,” Information Systems Management, vol. 23, 2006, p. 7. 572. [157] A. Elshamy and A. Elssamadisy, “Divide After You Conquer: [136] M. Vax and S. Michaud, “Distributed Agile: Growing a Practice An Agile Software Development Practice for Large Projects,” in Together,” AGILE Conference, Los Alamitos, CA, USA: IEEE Extreme Programming and Agile Processes in Software Engineering, Computer Society, 2008, pp. 310-314. vol. 4044, P. Abrahamsson, M. Marchesi, and G. Succi, Eds. Ber- lin, Heidelberg: Springer Berlin Heidelberg, 2006, pp. 164-168.

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 41

[158] R. Coffin, “A tale of two projects [agile projects],” Jul. 2006, p. 7 [179] J. Wäyrynen, M. Bodén, and G. Boström, "Security Engineering pp.-164. and eXtreme Programming: an Impossible marriage?," in Ex- [159] G. Canfora, A. Cimitile, F. Garcia, M. Piattini, and C. A. Visag- treme programming and agile methodsXP/Agile Universe gio, “Productivity of Test Driven Development: A Controlled 2004, C. Zannier, H. Erdogmus, and L. Lindstrom, Eds. Experiment with Professionals,” in Product-Focused Software LNSC3134, Berlin: Springer-Verlag, 2004. Process Improvement, vol. 4034, J. Münch and M. Vierimaa, Eds. [180] S. Lee and H.-S. Yong, “Distributed agile: project management Berlin, Heidelberg: Springer Berlin Heidelberg, 2006, pp. 383- in a global environment,” Empirical Software Engineering, vol. 388. 15, 2009, pp. 204-217. [160] J. Bergin and F. Grossman, “Extreme Construction: Making [181] Visconti, M. & Cook, C. R. (2004). An Ideal Process Model for Agile Accessible,” AGILE Conference, Los Alamitos, CA, USA: Agile Methods. Proceedings of 5th International Conference on IEEE Computer Society, 2006, pp. 384-389. Product Focused Software Process Improvement PROFES 2004, [161] R. Baskerville, B. Ramesh, L. Levine, and J. Pries-Heje, “High- Lecture Notes in Computer Science, 3009, 431-441. Speed Software Development Practices: What Works, What [182] MARY BETH SMRTIC, GEORGES GRINSTEINA Case Study in Doesn’t,” IT Professional, vol. 8, Aug. 2006, pp. 29-36. the Use of ExtremeProgramming in an Academic Environment [162] X. Ge, R. F. Paige, F. Polack, and P. Brooke, “Extreme Pro- Extreme Programming and Agile Methods - XP/Agile Universe gramming Security Practices,” in Agile Processes in Software En- 2004, Pages175-182. gineering and Extreme Programming, vol. 4536, G. Concas, E. [183] H. Sharp and H. Robinson, “An Ethnographic Study of XP Damiani, M. Scotto, and G. Succi, Eds. Berlin, Heidelberg: Practice,” Empirical Software Engineering, vol. 9, 2004, pp. 353- Springer Berlin Heidelberg, 2007, pp. 226-230. 375. [163] S. Berczuk, “Back to Basics: The Role of Agile Principles in Suc- [184] Middleton PJ, “Lean software process,” Jun. 2010. cess with an Distributed Scrum Team,” AGILE Conference, Los [185] O. Salo, K. Kolehmainen, P. Kyllönen, J. Löthman, S. Salmijärvi, Alamitos, CA, USA: IEEE Computer Society, 2007, pp. 382-388. and P. Abrahamsson, "Self-Adaptability of Agile Software Pro- [164] K Peterson (2010) An Empirical Study of Lead-Times in Incre- cesses: A Case Study on PostIteration Workshops," 5th Interna- mental and Agile Software Development. In International Con- tional Conference on Extreme Programming and Agile Process- ference on Software Process, ICSP 2010, Paderborn, pp. 35-356. es in Software Engineering (XP 2004), GarmischPartenkirchen, [165] B. George and L. Williams, “An initial investigation of test Germany, 2004. driven development in industry,” Proceedings of the 2003 ACM [186] Rogers, O. “Acceptance Testing vs. Unit Testing: A Developer’s symposium on Applied computing, New York, NY, USA: Perspective”, Lecture Notes in Computer Science, Vol. 3134, ACM, 2003, p. 1135–1139. Springer-Verlag: 22 – 31, 2004. [166] Ramachandran V., Shukla A.; Circle of Life, Spiral of Death: are [187] O. Salo, “Improving Software Process in Agile Software Devel- XP Teams Following the Essential Practices? in Extreme Pro- opment Projects: Results from Two XP Case Studies,” EU- gramming and Agile Methods –XP/Agile Universe 2002; LNCS ROMICRO Conference, Los Alamitos, CA, USA: IEEE Comput- 2418, pp. 166-173; Springer, 2002. er Society, 2004, pp. 310-317. [167] S. Andrzeevski, “Experience Report ‘Offshore XP for PDA de- [188] Rogers, O., Scaling Continuous Integration, presented in XP velopment’,” AGILE Conference, Los Alamitos, CA, USA: IEEE 2004, (2004). Computer Society, 2007, pp. 376-381. [189] H. Robinson, H. Sharp, The characteristics of XP teams, in: Pro- [168] A. Elshamy and A. Elssamadisy, “Applying Agile to Large ceedings of XP2004 Germany, 2004, pp. 139–147. Projects: New Agile Software Development Practices for Large [190] C. Poole, "Distributed product development using extreme Projects,” in Agile Processes in Software Engineering and Extreme programming," in XP2004, GarmischPartenkirchen, Germany. Programming, vol. 4536, G. Concas, E. Damiani, M. Scotto, and pp. 60-67. G. Succi, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, [191] A. Parrish, R. Smith, D. Hale, and J. Hale, “A field study of 2007, pp. 46-53. developer pairs: productivity impacts and implications,” Soft- [169] B. Ramesh, L. Cao, K. Mohan, and P. Xu, “Can distributed ware, IEEE, vol. 21, Oct. 2004, pp. 76- 79. software development be agile?,” Communications of the ACM, [192] M.F. Nisar and T. Hameed, “Agile methods handling offshore vol. 49, p. 41–46, Oct. 2006. software development issues,” Dec. 2004, pp. 417- 422. [170] A. Kornstädt and J. Sauer, “Mastering dual-shore development: [193] M. Huo, J. Verner, L. Zhu, and M.A. Babar, “Software Quality the tools and materials approach adapted to agile offshoring,” and Agile Methods,” Computer Software and Applications in Proceedings of the 1st international conference on Software engi- Conference, Annual International, Los Alamitos, CA, USA: neering approaches for offshore and outsourced development, Berlin, IEEE Computer Society, 2004, pp. 520-525. Heidelberg, 2007, p. 83–95. [194] P. Baheti, E. Gehringer, and D. Scotts. Exploring the efficacy of [171] R. Benefield, “Seven Dimensions of Agile Maturity in the Glob- Distributed Pair Programming. In Proceedings of Extreme Pro- al Enterprise: A Case Study,” Hawaii International Conference gramming and Agile Methods . XP/Agile Universe 2002, pages on System Sciences, Los Alamitos, CA, USA: IEEE Computer 208.220, Chicago, Illinois, USA, 2002. Society, 2010, pp. 1-7. [195] Melnik, G., & Maurer, F. (2004). Introducing Agile Methods: [172] A. Danait, “Agile Offshore Techniques - A Case Study,” Agile Three Years of Experience. Paper presented at the 30th EU- Development Conference/Australasian Database Conference, ROMICRO Conference, Rennes, France. Los Alamitos, CA, USA: IEEE Computer Society, 2005, pp. 214- [196] Jokela and P. Abrahamsson, Usability assessment of an extreme 217. programming project: close co-operation with the customer [173] M.-W. Chung, S. Nugroho, and J.F. ?JF? Unson, “Tidal Wave: does not equal to good usability, Profes 2004 (2004). The Games Transformation,” AGILE Conference, Los Alamitos, [197] D. McKinney and L.F. Denton, “Affective assessment of team CA, USA: IEEE Computer Society, 2008, pp. 102-105. skills in agile CS1 labs: the good, the bad, and the ugly,” ACM [174] T. Smith, “Product Innovation is Practical, Important, and Pos- SIGCSE Bulletin, New York, NY, USA: ACM, 2005, p. 465–469. sible,” AGILE Conference, Los Alamitos, CA, USA: IEEE Com- [198] K. Mannaro, M. Melis and M. Marchesi, Empirical Analysis on puter Society, 2008, pp. 561-565. the Satisfaction of IT Employees Comparing XP Practices with [175] R. Wijnands and I. Van Dijk, “Multi-tasking agile projects: the Other Software Development Methodologies, Lecture Notes in pressure tank,” Proceedings of the 8th international conference Computer Science (2004), pp. 166–174. on Agile processes in software engineering and extreme pro- [199] W. Ambu and F. Gianneschi, “Extreme Programming at Work,” gramming, Berlin, Heidelberg: Springer-Verlag, 2007, p. 231– in Extreme Programming and Agile Processes in Software Engineer- 234. ing, vol. 2675, M. Marchesi and G. Succi, Eds. Berlin, Heidel- [176] M. Yap, “Value Based Extreme Programming,” AGILE Confer- berg: Springer Berlin Heidelberg, 2003, pp. 347-350. ence, Los Alamitos, CA, USA: IEEE Computer Society, 2006, pp. [200] G. Luck, “Subclassing XP: breaking its rules the right way,” Jun. 175-184. 2004, pp. 114- 119. [177] B Fitzgerald, G Hartnett, K Conboy (2006) Customising agile [201] I. Lokpo, M. Babri, and G. Padiou, “Assistance for Supporting methods to software practices at Intel Shannon. European XP Test Practices in a Distributed CSCW Environment,” in Ex- Jounal of Information Systems 15(2):200-213. treme Programming and Agile Processes in Software Engineering, [178] L. Williams, W. Krebs, L. Layman, A. Antón, P. Abrahamsson, vol. 3092, J. Eckstein and H. Baumeister, Eds. Berlin, Heidel- Toward a Framework for Evaluating Extreme Programming, berg: Springer Berlin Heidelberg, 2004, pp. 262-265. presented at Empirical Assessment in Software Eng. (EASE) [202] P. Becker-Pechau, H. Breitling, M. Lippert, and A. Schmolitzky, 2004, Edinburgh, Scot., 2004. “Teaching Team Work: An Extreme Week for First-Year Pro-

42

grammers,” in Extreme Programming and Agile Processes in Soft- [221] E. Bos and C. Vriens, "An agile CMM," In proceedings of 4th ware Engineering, vol. 2675, M. Marchesi and G. Succi, Eds. Ber- Conference on Extreme Programming and Agile Methods XP/ lin, Heidelberg: Springer Berlin Heidelberg, 2003, pp. 386-393. Agile Universe, pp. 129138, 2004. [203] B.S. Boelsterli, “Iteration Advocate/Iteration Transition Meet- [222] Aveling, B., 2004. XP lite considered harmful? In: Proceedings ing: Small Sampling of New agile Techniques Used at a Major of the Fifth International Conference of Extreme Programming Telecommunications Firm,” Agile Development Confer- and Agile Processes in Software Engineering, LNCS 3092, ence/Australasian Database Conference, Los Alamitos, CA, Springer, pp. 94–103. USA: IEEE Computer Society, 2003, p. 109. [223] S. Atsuta and S. Matsuura, “eXtreme Programming support [204] A. Law and A. Ho, “A study case: evolution of co-location and tool in distributed environment,” vol. 2, pp. 32- 33 vol.2, Sep. planning strategy,” Jun. 2004, pp. 56- 62. 2004. [205] L. Cao, K. Mohan, P. Xu, and B. Ramesh, “How Extreme Does [224] T. Dingsøyr and G. K. Hanssen, “Extending Agile Methods: Extreme Programming Have to Be? Adapting XP Practices to Postmortem Reviews as Extended Feedback,” in Advances in Large-Scale Projects,” Hawaii International Conference on Sys- Learning Software Organizations, vol. 2640, S. Henninger and F. tem Sciences, Los Alamitos, CA, USA: IEEE Computer Society, Maurer, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2004, p. 30083c. 2003, pp. 4-12. [206] Kuranuki Y., Hiranabe K.: Antipractices: AntiPatterns for XP [225] Y. Dubinsky and O. Hazzan, “Using Metaphors in eXtreme Practices. Agile Development Conference (2004). Programming Projects,” in Extreme Programming and Agile Pro- [207] Koskela, J. and Abrahamsson, P. On-Site Customer in an XP cesses in Software Engineering, vol. 2675, M. Marchesi and G. Project: Empirical Results from a Case Study. In Proceedings Succi, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2003, 1th European Conference on Software Process Improvements pp. 420-421. (EuroSPI 2004) (Trondheim, Norway, November 10-12, 2004), [226] P. Amey and R. Chapman, “Static verification and extreme 1-11. programming,” ACM SIGAda Ada Letters, New York, NY, [208] Korkala, M., Abrahamsson, P.: Extreme Programming: Reas- USA: ACM, 2003, p. 4–9. sessing the Requirements Management Process for an Offsite [227] P. Abrahamsson and J. Koskela, “Extreme Programming: A Customer. In: Proceedings of the European Software Process Survey of Empirical Data from a Controlled Case Study,” Em- Improvement Conference EUROSPI 2004, Springer Verlag pirical Software Engineering, International Symposium on, Los LNCS Series (2004). Alamitos, CA, USA: IEEE Computer Society, 2004, pp. 73-82. [209] T. Bozheva, “Practical Aspects of XP Practices,” in Extreme Pro- [228] R. Juric, “Extreme programming and its development practic- gramming and Agile Processes in Software Engineering, vol. 2675, es,” Jun. 2000, pp. 97- 104. M. Marchesi and G. Succi, Eds. Berlin, Heidelberg: Springer [229] J. Kivi, D. Haydon, J. Hayes, R. Schneider, and G. Succi, “Ex- Berlin Heidelberg, 2003, pp. 360-362. treme programming: a university team design experience,” vol. [210] J. Koskela, P. Abrahamsson, and J. Takalo, “Improving Re- 2, 2000, pp. 816-820 vol.2. quirements Management in Extreme Programming with Tool [230] J. Grenning, “Launching extreme programming at a process- Support - An Improvement Attempt that Failed,” EUROMI- intensive company,” Software, IEEE, vol. 18, no. 6, pp. 27-33, CRO Conference, Los Alamitos, CA, USA: IEEE Computer So- Dec. 2001. ciety, 2004, pp. 342-351. [231] M.M. Müller and W.F. Tichy, “Case study: extreme program- [211] P. Hodgetts, “Refactoring the development process: experiences ming in a university environment,” Proceedings of the 23rd In- with the incremental adoption of agile practices,” Jun. 2004, pp. ternational Conference on Software Engineering, Washington, 106- 113. DC, USA: IEEE Computer Society, 2001, p. 537–544. [212] C. Ho, K. Slaten, L. Williams, and S. Berenson, “Work in pro- [232] J. Nawrocki, B. Walter, and A. Wojciechowski, “Toward ma- gress-unexpected student outcome from collaborative agile turity model for extreme programming,” pp. 233-239, 2001. software development practices and paired programming in a [233] M.C. Paulk, “Extreme Programming from a CMM Perspec- software engineering course,” Oct. 2004, pp. F2C- 15-16 Vol. 2. tive,” IEEE Software, vol. 18, 2001, pp. 19-26. [213] B. Greene, “Agile Methods Applied to Embedded Firmware [234] C.J. Poole, T. Murphy, J.W. Huisman, and A. Higgins, “Extreme Development,” Agile Development Conference/Australasian maintenance,” 2001, pp. 301-309. Database Conference, Los Alamitos, CA, USA: IEEE Computer [235] Succi, G., M. Stefanovic, and W. Pedrycz. Quantitative Assess- Society, 2004, pp. 71-77. ment of Extreme Programming Practices. in Canadian Confer- [214] R. W. Miller, “When Pairs Disagree, 1-2-3,” in Extreme Pro- ence on Electrical and Computer Engineering. 2001. Toronto, gramming and Agile Methods — XP/Agile Universe 2002, vol. 2418, Ontario, Canada. D. Wells and L. Williams, Eds. Berlin, Heidelberg: Springer Ber- [236] A.V. Deursen, “Program Comprehension Risks and Opportuni- lin Heidelberg, 2002, pp. 231-236. ties in Extreme Programming,” Reverse Engineering, Working [215] J. Vanhanen, J. Jartti, and T. Kähkönen, “Practical Experiences Conference on, Los Alamitos, CA, USA: IEEE Computer Socie- of Agility in the Telecom Industry,” in Extreme Programming and ty, 2001, p. 176. Agile Processes in Software Engineering, vol. 2675, M. Marchesi [237] H. Erdogmus and L. Williams, “The Economics of Software and G. Succi, Eds. Berlin, Heidelberg: Springer Berlin Heidel- Development by Pair Programmers,” The Engineering Econo- berg, 2003, pp. 279-287. mist: A Journal Devoted to the Problems of Capital Investment, [216] FAVELA, J., et al., 2004, “Empirical Evaluation of Collaborative vol. 48, 2003, p. 283. Support for Distributed Pair Programming”, In: International [238] J. B. Fenwick, “Adapting XP to an Academic Environment by Workshop on Groupware, pp.215-222, Lecture Notes in Com- Phasing-In Practices,” in Extreme Programming and Agile Methods puter Science, San Jose, Costa Rica, set. - XP/Agile Universe 2003, vol. 2753, F. Maurer and D. Wells, Eds. [217] B. Tessem, “Experiences in Learning XP Practices: A Qualitative Berlin, Heidelberg: Springer Berlin Heidelberg, 2003, pp. 162- Study,” in Extreme Programming and Agile Processes in Software 171. Engineering, vol. 2675, M. Marchesi and G. Succi, Eds. Berlin, [239] B. George and L. Williams, “An initial investigation of test Heidelberg: Springer Berlin Heidelberg, 2003, pp. 131-137. driven development in industry,” Proceedings of the 2003 ACM [218] B. Ramesh, with Lan Cao, “Agile Requirements Engineering symposium on Applied computing, New York, NY, USA: Practices: An Empirical Study,” Software, IEEE, vol. 25, Feb. ACM, 2003, p. 1135–1139. 2008, pp. 60-67. [240] A. Greca, V. Jovanovic, and J. Harris, “Enhancing learning suc- [219] S. Bryant, “Double Trouble: Mixing Qualitative and Quantita- cess in the introductory programming course,” vol. 1, Nov. tive Methods in the Study of eXtreme Programmers,” Visual 2003, pp. T4C- 15-21 Vol.1. Languages and Human-Centric Computing, IEEE Symposium [241] S. Bowen and F. Maurer, “Process support and knowledge on, Los Alamitos, CA, USA: IEEE Computer Society, 2004, pp. management for virtual teams doing agile software develop- 55-61. ment,” 2002, pp. 1118- 1120. [220] G. Broza, “Adapting Extreme Programming to Research, De- [242] O. Hazzan and Y. Dubinsky, “Bridging Cognitive and Social velopment and Production Environments,” in Extreme Pro- Chasms in Software Development Using Extreme Program- gramming and Agile Methods - XP/Agile Universe 2004, vol. 3134, ming,” in Extreme Programming and Agile Processes in Software C. Zannier, H. Erdogmus, and L. Lindstrom, Eds. Berlin, Hei- Engineering, vol. 2675, M. Marchesi and G. Succi, Eds. Berlin, delberg: Springer Berlin Heidelberg, 2004, pp. 139-146. Heidelberg: Springer Berlin Heidelberg, 2003, pp. 47-53. [243] D. Karlström and P. Runeson, “Decision support for extreme programming introduction and practice selection,” Proceedings

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 43

of the 14th international conference on Software engineering [264] A. N. Bowers, R. S. Sangwan, and C. J. Neill, “Adoption and knowledge engineering, New York, NY, USA: ACM, 2002, of XP practices in the industry—A survey,” Software Pro- p. 835–841. cess: Improvement and Practice, vol. 12, no. 3, pp. 283-294, [244] F. Maurer, “Supporting Distributed Extreme Programming,” May. 2007. Proceedings of the Second XP Universe and First Agile Universe [265] O. Murru, R. Deias, and G. Mugheddue, “Assessing XP at a Conference on Extreme Programming and Agile Methods - XP/Agile European Internet company,” Software, IEEE, vol. 20, Jun. 2003, Universe 2002, p. 13–22, 2002. pp. 37- 43. [245] F. Maurer, S. Martel, Extreme programming: rapid develop- ment for web-based applications, IEEE Internet Computing 6 [266] B.'Victor'and'N.'Jacobson,'“We'Didn’t'Quite'Get'It,”'Aug.'2009,' 2002) 86–90. pp.'271G274.' [246] G. Melnik and F. Maurer, “Perceptions of Agile Practices: A Student Survey,” Extreme Programming/Agile Universe, Chi- cago, IL, 2002. [247] K.R. Howard, “The Covert Agilist,” AGILE Conference, Los Alamitos, CA, USA: IEEE Computer Society, 2009, pp. 131- ' 134. ' [248] S.J. Khalaf and K.A. Maria, “An Empirical Study of XP: The Case of Jordan,” Information and Multimedia Technology, ' International Conference on, Los Alamitos, CA, USA: IEEE ' Computer Society, 2009, pp. 380-383. ' [249] S. Stolberg, “Enabling Agile Testing through Continuous Inte- ' gration,” AGILE Conference, Los Alamitos, CA, USA: IEEE ' Computer Society, 2009, pp. 369-374. [250] S.H. VanderLeest and A. Buter, “Escape the waterfall: Agile for aerospace,” Oct. 2009, pp. 6.D.3-1-6.D.3-16. [251] O. Hazzan and J. Tomayko, “The Reflective Practitioner Per- spective in eXtreme Programming,” in Extreme Programming and Agile Methods - XP/Agile Universe 2003, vol. 2753, F. Maurer and D. Wells, Eds. Berlin, Heidelberg: Springer Berlin Heidel- berg, 2003, pp. 51-61. [252] G. Hedin, L. Bendix, and B. Magnusson, “Introducing Software Engineering by means of Extreme Programming,” Software Engineering, International Conference on, Los Alamitos, CA, USA: IEEE Computer Society, 2003, p. 586. [253] A.M. Figueiredo, “An Executive Scrum Team,” AGILE Confer- ence, Los Alamitos, CA, USA: IEEE Computer Society, 2009, pp. 147-150. [254] P. Ingalls and T. Frever, “Growing an Agile Culture from Value Seeds,” AGILE Conference, Los Alamitos, CA, USA: IEEE Computer Society, 2009, pp. 119-124. [255] B. Jensen and A. Zilmer, “Cross-Continent Development Using Scrum and XP,” in Extreme Programming and Agile Processes in Software Engineering, vol. 2675, M. Marchesi and G. Succi, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2003, pp. 146- 153. [256] J.H. Hayes, A. Dekhtyar, and D.S. Janzen, “Towards traceable test-driven development,” Traceability in Emerging Forms of Software Engineering, ICSE Workshop on, Los Alamitos, CA, USA: IEEE Computer Society, 2009, pp. 26-30. [257] Y. Zhou, “UniX Process, Merging Unified Process and Extreme Programming to Benefit Software Development Practice,” Edu- cation Technology and Computer Science, International Work- shop on, Los Alamitos, CA, USA: IEEE Computer Society, 2009, pp. 699-702. [258] M. Yap, “Follow the Sun: Distributed Extreme Programming Development,” Agile Development Conference/Australasian Database Conference, Los Alamitos, CA, USA: IEEE Computer Society, 2005, pp. 218-224. [259] O. Salo and P. Abrahamsson, “Agile methods in European em- bedded software development organisations: a survey on the actual use and usefulness of Extreme Programming and Scrum,” Software, IET, vol. 2, Feb. 2008, pp. 58-64. [260] P. Tingling and A. Saeed, “Extreme programming in action: a longitudinal case study,” in Proceedings of the 12th international conference on Human-computer interaction: interaction design and usability, Berlin, Heidelberg, 2007, p. 242–251. [261] M. Pikkarainen, J. Haikara, O. Salo, P. Abrahamsson, and J. Still, “The impact of agile practices on communication in soft- ware development,” Empirical Software Engineering, vol. 13, no. 3, pp. 303-337, May. 2008. [262] J. Rasmussen, “Introducing XP into Greenfield Projects: lessons learned,” Software, IEEE, vol. 20, no. 3, pp. 21- 28, Jun. 2003. [263] J. Vanhanen, J. Itkonen, and P. Sulonen, “Improving the Inter- face Between Business and Product Development Using Agile Practices and the Cycles of Control Framework,” in Agile Devel- opment Conference/Australasian Database Conference, Los Alami- tos, CA, USA, 2003, vol. 0, p. 71.

44

APPENDIX A OVERVIEW OF STUDIES: EXTENDED MATERIAL

Total primary studies: 186

Context: Industry Context: Academia Empirical 114 35 Non-empirical 5 32

Empirical papers in Industry

Agile Experience

Agile experience Nr. % Beginner 38 33 Experienced (Mature) 48 42 N/A 28 24

Company size

Company size Nr. % Micro 10 9 Small 15 13 Medium 12 11 Large 31 27 N/A 46 40

Factors contributing to rigor for the 119 in Industry

Context described Design described Validity discussed 0 3 11 65 1 95 72 42 2 21 36 12

Factors contributing to rigor for the 67 in Academia

Context described Design described Validity discussed 0 7 12 37 1 41 42 22 2 19 13 8

Factors contributing to rigor for all the primary studies

Context described Design described Validity discussed 0 10 23 102 1 136 114 67 2 40 49 20

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 45

APPENDIX B OTHER PRACTICES

Practice Category Ref Definition A short period of envisioning amongst the business stake- Big Picture Up-Front Design [82] holders and project team to set the direction of the project.

Programmer On-site Developer Schedule site visits for programmers so that programmers [82] can understand more about the end-users of the software. Plan to adjust commitments and resources regularly based Re-Calibration Management [82] on what both customers and developers learn during the

iterations. Road show Demonstrate the software to end users and other stake- Customer [82] holders so that the team can obtain feedback on the direc- collaboration tion of the project. Program Weak Points identified in terms of coupling and Program Weak cohesion factors. The locations in the source code where Points coupling can be decreased and cohesion can be increased Developer [100] are also identified. The output of this phase will be program points to be refactored in the object oriented code that has to be refactored. Culture of trust: Building the right teams is another centripetal force that in- Developer [102] volves building trust, communication and personal bridges. As mentioned earlier, “thinking light” about the design was one of the hardest things that the team had to confront. There were strong and passionate opinions about the opti- Just in time design Design [102] mal way of designing the application. Often the disagree-

ments are due to the variety of perspectives that developers brought to the team (e.g. data model, security, and usability perspectives). First, we prioritized the user groups based on their immedi- User Backlog ate needs and internal politics. The evolving system would . Customer [102] be rolled out to one user group at a time. Generally each collaboration user group would get one or two releases. In essence, we

developed a “user backlog”. The customer decided to hold a meeting in which everyone Dirty laundry meet- was supposed to resolve their conflicts. This meeting was ings Management [116] called “dirty laundry meeting” because it was a chance for everyone to say what was on their mind about others and walk away with a clean slate. The specialists brought some fresh air into the team and reduced the burden of everyone having to study all new Specialists and technologies. They did not have special rights to stories in study time their areas. In fact they were discouraged from taking these Developer [116] stories at all. They were available to pair program when someone had trouble in their areas of research and also conducted seminars to teach the rest of the team what they were learning. Prototyping/spiking to be an effective practice for reducing risk. However, we noticed that this requires that the teams be provided with guidelines and templates on the process of Early Prototyp- identifying the cause of risks and focus the goals of the pro- Developer [124] ing/Spiking totypes/spikes to minimize them. As a consequence, the teams were much more confident that they will succeed in the project, hence substantially increasing their motivation and level of participation. Risk management We started to focus the team’s attention to both identifying Developer [129] the risks as well as categorizing the risks based on causality

46

(e.g. knowledge gap, skill gap, unstable library etc.). Once the risks were identified, prioritized, and potential causes for them established, the teams developed spikes, each of which aimed at minimizing a set of risks. After several iterations, the working software is almost done, Accepting working Customer and then developers can submit it to customers for usage software. [135] colalboration test or trial operation. Customer accept this current release if

it accords with the agreement in release plan. Customer represents new requirements at the beginning of Requirement man- eachiteration, and confirms priority levels of them. In ar- Customer agement. [135] ranged meeting, customer represents requirements or collaboration amends requirements. And, customers themselves decide whether they should submit some important changes. There are feedbacks in release plans, iteration plans, usage Serving as consult- Customer [135] test, as a complementary, customer should answer any de- ant. collaboration veloper’s questions in design, coding, and unit test phases. Customer must provide test cases based on real environ- ment, join the integration test and operate current released Taking part in usage Customer software. To expose more software errors, released soft- testing and trial op- [135] collaboration ware should be run in a limited scope by some customer eration. departments, and operation report should be provided by customer coach as quick as possible. Checking and ac- Customer team inspects whether the final software product Customer cepting final soft- [135] to meet customer’s requirement for check and accept, and collaboration ware product. then confirm the accept time with developers. Outsourcing in general and on an agile project in particular requires a lot of trust between client and vendor, near shore Using Nearshore resources onsite (or at least someone who can be contacted Resources Management [136] in the same time zone) that could meet regularly, face to . face, were crucial to building trust quickly. A technical lead

and Scrum Master were used to maintain both technical and organizational lines of communication open Direct user involve- ment Rely on frequent on-site customer visits and direct user in- Customer [143] tervention techniques to identify any possible changes trig- collaboration gered by customers.

Scrum’s main documentation consists of product and spring backlog lists. FDD and Crystal methodologies use UML dia- Object models grams such as use cases, class diagrams, and object mod- Design [143] els to document the design. Test cases have also been used by XP and Crystal methodologies as documentation artifacts. User scenarios are developed for features in subsequent Personas Developer [143] Sprints. Designs for features engineering plans to implement

in the first Sprint are also created and tested High-speed development teams often enlist customers or users either full time or temporarily, assigning them as Part Customer implanta- of the team, a practice known as customer implantation. As tion Customer [161] developers construct the software from informally specified . collaboration requirements and customer-developer discussions, custom-

ers prioritize requirements and redirect development, re- lease-by-release. Production prototyp- Production prototypes are efficient enough to support live ing Developer [161] operation.

An iteration of technical tasks so that the customer can have Programmer Holiday Customer [82] [150] some time to think-ahead. A “programmer holiday” is not collaboration often an actual holiday (although there maybe times when

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 47

that is indeed appropriate), but mostly the customer will choose to prioritise technical debt, lower priority bugs, tech- nical system upgrades and such above stories. Although mentioned last this is one of the first things that a distributed development team should get right. It includes not only the idea that everyone regardless of location has access to a common source base but that they have access Common Source Developer [190] to a common set of automation tools and processes. If you

can’t build your entire product down to the packaging from a common source base and use a common set of automated testing and production tools then you are going to have a much harder time getting anything out the door. Hiring process is to be highly related to the success of its agile practices. Developers who can recognize the im- portance of pattern-based development and providing Identifying and standard interfaces across the system are seen as key to managing develop- successfully projects. developers’ knowledge about archi- [206] ers. Management tectural design and design patterns is seen as critical. Ap- [218] proximately only 15% of the applicants are considered to be qualified for this project environment. Upholding developer morale is also seen as key to successful project outcomes

The group follows an “architect” practice While the develop- ers take part in every product’s entire life-cycle, some tasks Architect practice are centralized in the hands of the architect. he is responsi- . Design [220] ble for the vision and overall design of Informatics products,

tracking design decisions, constraints and prospective de-

velopment. He evaluates feature and change requests in light of existing and suggested designs Use cases are a very useful way to capture customer re- quirements to ensure that both the customer and theteam understand the functionality. From the information gathered Use Cases in that meeting and using UML, we were able to write out a Design [229] set of use cases. These use cases were then further refined through several customer meetings to ensure that all the requirements were captured. At the start of each mini deliv- erable, (see incremental) we then sat down and wrote spe- cific cases for the one section. Queue Visibility The use of an on-line queue that is available globally across Across Distributed Progress [234] our development and customer services sites. Ex: Sites webcams.

To see how frequently new pieces of code are integrated Integration log: Progress [248] with the system (XP suggests to have several integrations

per day. Version manage- It supports collective code ownership and continuous inte- ment system. Developer [248] gration.

To face the challenging environment offered by fast chang- ing technological scenarios, it is necessary for an Organiza- tion to have its people up to date with the latest tools and technologies. Especially for organization practicing agile Trainings Management [251] method of development training of resources is essential for

keeping the development process up to the mark, satisfying customers and business expectations. Training of profes- sionals always has a positive and significant impact on productivity. Domain Presenta- Systems Analysts, User Experience and Architecture team Design [261] tion representatives would provide the Development and Sys-

48

tems Test teams the iteration Domain Specification as well as a quick summary of the artifact they were to implement in the next iteration. Before the meeting, the User Experience resource would post the wireframes on the wall. A weekly ‘heart beat’ was instituted. just because heartbeats were weekly,this did not imply all project teams conducted Heart Beat weekly iterations. Rather, the heartbeat offered project Progress [261] teams a point in time and a place (ITM) from which they were able to push their deliverables up the chain whether this is specifications or code. This was an opportunity for any representative of the cross- SIP updates Progress [261] functional team to propose updates to the SIP as executed

by the release manager. For any kind of problem that occurs in the team, if the coach cannot solve it easily and directly, a good way of tackling the problem is to take a TimeOut, i.e., to bring the team together for a brief stand-up meeting, and let the team solve the Time out problem together. This works for technical problems, where Developer [252] . someone on the team might have skills in a certain area, or where the team can brainstorm possible solutions or ideas for experimental spikes for solving the problem. A TimeOut is also useful for process problems where the team needs to agree on what practices to follow Estimates: During the sprint (Scrum name for an iteration), estimates

Progress [255] are updated daily, thus allowing assessment of whether the

team is on track or not

Proactive resource Resource management” helps ensure that a Scrum team management Management [94] [101] has the necessary tools and skills to support their Scrum

Practices in distributed settings.

This practice is widely used by Scrum teams to ensure syn- chronous communication among distributed sites can be arranged. This is done by adjusting working hours, working from home, working long hours Some Scrum teams used Synchronized work strategies to avoid the need of increased overlap time. For hours: Management [94] [101] example, a Scrum team used strict timeboxed meeting to avoid late night meeting at some sites. To make the meet- ings short and effective, team members post their three daily Scrum questions or develop backlog (feature list) before attending the distributed meetings

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 49

APPENDIX C OTHER EMPIRICAL STUDIES IN ACADEMIA

Article Description [97] Web service testing to support TDD in SOA applications [165] This report reports the results of experiment conducted on using TDD taking practitioners as sub- jects. Although TDD practice produced high quality code, passing more test cases, it took 16% more time for development.

Article Description [83] Results indicate that pairs performed at the level above the second best performers and best per- formers of nominal pairs in each pair. There is no sufficient analysis to claim that the performance of collaborating pairs exceeds the performance of its best member working alone. [93] Presents XPairtise, a plug-in for Eclipse that supports PP in a distributed setting. [118] The report presents an experiment to compare productivity of a solo programmer and pair pro- grammer. Results show that PP decreases the effort of the single programmer. [119] The report describes the experiment report of extending PP to software design called 'pair design- ing'. Results show that ‘pair designing’ is an efficient means to enforce design knowledge. [148] The paper reports an experiment to provide empirical evidence on the impact of pair program- ming on both, thoroughness and e%ectiveness of test suites. However, the results do not show a positive impact on testing to make it more thorough (Branch coverage and mutation core indica- tor). [154] The paper presents a controlled experiment called repeat-programming which can facilitate the understandings of relationships between human experience and programming productivity in the context of PP. Results show that PP is more efficient when two individuals are new to pro- gramming tasks, dealing with complex and callenging code. Also, in working on sloutions that individuals have already experienced PP is less efficient than solo programming. Results also show that novice-novice pairs against novice programmers are much more efficient than expert- expert pairs against expert solos. [212] The focus of this report was in the context of females and PP. Female students felt that PP can decrease pressure while coding and increase confidence and courage. On the negative side, bad experiences in PP gave negative feelings on the practice of PP. It was mainly when they were paired with male students from whom they did not get major help/support. [216] The paper reports an empirical evaluation of the COPPER collaborative editior to support distrib- uted PP. Results who that distributed pairs could identify same mount of errors as their collocat- ed counterparts, however no evidence found that pairs using collaborative services has better code comprehension. [223] The paper reports evaluation of support for distributed PP. The tool aims to collect work related information when using PP which in turn can be used for improvement in project management and developers’ work conditions. [256] The paper reports an experiemet to evaluate PP in the context of distributed development. The results indicate that PP in a distributed context is comparable to collocated pairing in terms of quality and productivity. It was found that PP in a distributed setting fosters teamwork and communication.

Article Description [100] A method to use Re practice is proposed and evaluated. The evaluation of Re is based on quantita- tive metrics – Cyclometric complexity, LOC, CBO, LCOM, MHF and AHF. Findings show that resulting code is simpler, increase in system comprehensibility, decrease in dependencies classes, thus making the code more reusable. [125] The report investigates if Re improves reusability of object-oriented classes. The results show that Re enhances both quality and reusability of the code. [141] The paper introduces two practices: Continuous Architectural Refactoring (CAR) and Real Archi- tecture Qualification (RAQ) in order to empower XP's development process toward improving systems' Architecture. [147] The paper reports storytest-driven approach as a complimentary practice to TDD, to manage agile project requirements. [168] The paper provides students’ experience in using Me. It was found that Me leads to a more confi-

50

dent understanding of the system design.

' ' ' ' ' '''

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 51

APPENDIX D PRACTICE MODIFICATIONS/ADAPTATIONS

' Documentation Software iteration plans [261] Practices Adapted Citation Using long iterations [129] Iteration planning [90] [129] Data and Documentation [219] [210] [231] Data integration [219] Iterations - 2 weeks [108] [266] Documentation including architec- [227] tural description Iterations - 3 weeks [158] [136]

Just right documentation [192] Iterations - 1 week [75] [174] Reduced documentation [251] Reflection and documentation [252] Small releases: iterations of 30-45 days [113] [132]

Right documentation [263] ' Streamlined documentation [219] ' Office& ' Practices'Adapted& Citation' ' Iterations Community'workstations' [234]' Practices Adapted Citations Honey'comb'cubicles' [204]' Cycle of innovation consisting of 5 itera- [140] Management'support'and'organizaG [205]' tions: Investigation, Identification, Aggre- tional'culture' gation, Consolidation and Post-validation First iteration Open'office'layout'' [98]' [252] Open'workspace' [210]' Incremental deliverables [229] Personal'vs'team'space' [234]' Incremental development Shared'working'space'' [158]',[253]' [279] Iteration plans: each iteration - 1 month, [73] Single'room'' [72]' each release - 3 iterations Small'offices'(150'to'200'people)'' [98]' Iteration retrospectives [264] Iteration review [129] War'room' [204]' iteration specification [261] ' Iteration transision meetings [261] ' Iterations - 4 week length, planning in the [104] Pair&programming& 1st week and demonstration in the 4th Practices'Adapted& Citation' week Iterations (3 weeks) [87] Daily'meetings'for'forming'pairs'' [134]' periodic performance reviews [219] Pre-project iteration [134] Release and iteration Planning: release [90] Flexible'pair'programming' [218]' length - 2 months, 3 months, 2 week iterations Pair'courage' [212]' Pair'pressure' [212]' Review of iterations with the core team [122] Pair'review' [212]' Pairing' [234]' Small iterations [238] Pairing'in'all'high'risk'activities' [216]' Small releases: iterations 2-3 days to 2-3 [79] weeks Pairing'prison' [210]' Small version of iterative, evolutionary [99] PP' (practiced' in' the' beginning' and' [132]' development approach later'abandoned)''

52

PP'via'web'conference'' [172]' Practices Adapted Citation Programming'speed' [248]' Role'switching'2G3'times'a'day'' [134]' 2 week sprint (changed from 2 to 1) [92] work'in'pairs'in'all'high'risk'activiG [219]' 20-30 days sprint [95] ties' 2-week sprint prior to start of develop- [145] Teaching'–'additional'roleG'spectator' [93]' ment ' 2-week sprints [144] 2-week sprints in maintenance team [144] ' Planning&(&&Planing&game)& 30 day sprint [76] Develop unit tests for all new code dur- [74] Practices'Adapted& Citation' ing a sprint Scrum pre-sprint planning [95] [282] Adaptive'cycle'planning'' [143]' Adaptive'planning'' [96]' Sprint 2-week [128] [173] Aggregate'product'plan'' [112]' [281] Bi'weekly'planning' [204]' Big'picture'UpGFront'' [193]' Sprint backlog with the tool Base camp [142] Designing'upfront' [205]'['218]' and Excel Sprint break after 5/6 sprints [173] Group'planning'meeting'' [108]' Impact' of' upfront' design' on' ' other' [205]' Sprint demo [143] practices' Milestone'and'iteration'planning' [264]' Sprint management [106] Sprint planning [72] [76] [85] Monthly'planning' [204]' [95] [106] Planning'cycle' [204]' [128] [138] Planning'meeting'breakouts'' [108]' [144] [282] Sprint planning is prepared separately [87] Planning'meetings'' [108]'[139]'[122]' with both sides of the team by the Product Owner. Sprint planning 2, cre- Programming/' pre' release' evaluaG [195]' ating tasks, is performed jointly after tion' both sides have received explanation of Project'Outline'' [143]' the user stories Release'planning' [210]'

Release' planning' (1' hour,' as' needG [266]' ed):'create'long'term'product'plan'' Sprint planning meeting [181] [213] [253] Release'system' [238]' ' Sprint review [106] [213] Sizing'&'Planning'' [92]'[281]'[266]' [253] ' ' Synchronized 4-week sprint [144] ' Citation Practices Adapted Three and a half weeks sprint [102] Controlled software spikes [117] Time boxing with sprint [164] Design/code reviews [261] Flexible design [251] ' Just in time design [106] ' Technical excellence and Good de- [75] ' sign ' ' Stories Sprint Practices Adapted Citation

SARIPALLI AND DARSE: FINDING COMMON DENOMINATORS FOR AGILE SOFTWARE DEVELOPMENT: A SYSTEMATIC LITERATURE REVIEW 53

Collect scenarios [238] Frequent rotation between onsite [138] Enhanced user stories [208] and offshore teams Splitting stories [234] Generally located management [94] Story cards [234] team [242] Generally located management [94] [267] team Synoptic picture [242] Gradual team distribution [94]

Task cards [234]

Tracking stories [234]

User stories were divided into tasks [227]

' ' Team Practices Adapted Citation Build team around motivated [75] individuals with the necessary environment, support, and trust to get job done. Business leaders in monthly re- [213] view meeting Business owner given control [72] Changing role [172] Creating customer team [135]

54

APPENDIX E MAA - CAV EXAMPLIFICATION

Agile as- AP1 AP2 AP3 AP4 AP5 AP6 AP7 AP8 AP9 AP10 AP11 AP12 sessment value [173] Y Y Y Y Y 2.08/3 [178] Y Y Y Y Y Y Y 2.58/4 [190] Y Y Y Y Y Y Y Y 3/4 [73] Y Y Y Y Y Y Y Y 3/4 [121] Y Y Y Y Y Y Y Y Y Y 3.4/4 [127] Y Y Y Y Y Y Y 2.66/4 [218] Y Y Y Y Y Y Y 2.66/4 [213] 3.5/4 [106] Y Y Y Y Y Y Y Y 2.66/4 [207] Y Y Y Y Y Y 2.41/4 [208] Y Y Y Y Y Y 2.41/4 [210] Y Y Y Y Y Y Y Y Y Y 3/4 [102] Y Y Y Y Y Y Y 3/4 [253] Y Y Y Y Y Y Y Y 3/4 [222] Y Y Y Y Y Y Y Y Y Y Y 3.75/4 [226] Y Y Y Y Y Y Y 2.17/4 [255] Y Y Y Y Y Y 2.33/4 [87] Y Y Y Y Y Y Y Y Y 3.25/4