Master Thesis
Software Engineering
Thesis no: MSE-2020-NN
01 2021

Acceptance Testing in Agile: Perspectives from Research and Practice

Nayla Nasir

Dept. Software Engineering
Blekinge Institute of Technology
SE–371 79 Karlskrona, Sweden

This thesis is submitted to the Department of Software Engineering at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Software Engineering. The thesis is equivalent to 20 weeks of full-time studies.

Contact Information:
Author(s): Nayla Nasir
E-mail: [email protected]

University advisor: Davide Fucci, Dept. Software Engineering

Dept. Software Engineering
Blekinge Institute of Technology
SE–371 79 Karlskrona, Sweden

Internet : www.bth.se/dipt
Phone : +46 455 38 50 00
Fax : +46 455 38 50 57

Abstract

Context: Acceptance testing is an important activity that verifies the conformance of a system to its acceptance criteria. It aims to provide a detailed communication of domain knowledge and is used to evaluate whether the customer requirements are met. The existing literature lacks empirical evidence on acceptance testing. In particular, industry practice has received little attention; only a few studies have investigated the state of practice, and those are limited to a specific domain.

Objective: This study aims to characterize the state of research and practice of acceptance testing in Agile Software Development and to investigate the similarities and differences between the two perspectives. The study contributes to identifying the industry-academia gap in the context of acceptance testing.

Research Method: To identify the acceptance testing practices and challenges reported in research, I conducted a literature review. For the industry perspective on acceptance testing practices and challenges, I conducted an interview-based survey of practitioners working in Agile Software Development environments. I followed the snowball search strategy to find the primary studies, whereas to select the respondents I used convenience and snowball sampling. For data analysis, I followed the approach of thematic synthesis.

Results: The results of this thesis are the outcome of a literature review of 20 selected studies and an interview-based survey with 12 practitioners representing 10 companies. I identified acceptance testing practices and challenges from research and industry. In research, the most recommended form of acceptance testing is acceptance test-driven development (ATDD), and the majority of the studies refer to the use of FIT for acceptance testing. Customer involvement in different phases of acceptance testing is recommended in research. From the interviews, I found that acceptance testing in industry is largely manual, and the most challenging aspect is the customer's involvement.

Conclusions: From the findings of this thesis, it is concluded that there is a gap between the research and industry perspectives on acceptance testing practices. Acceptance testing in industry is currently mostly manual, while research does not focus on this aspect. Despite the differences, there are also commonalities; in particular, most challenges of acceptance testing are similar in both perspectives. Researchers should consider these commonalities and investigate how the acceptance testing challenges can be minimized from the perspective of industry.

Keywords: Acceptance testing, ATDD, Agile Software Development, practitioners' perspective, industry-academia gap.

Acknowledgments

First of all, I would like to thank my supervisor, Davide Fucci, for guiding me in this study. Whenever I asked for any guidance, he ensured his availability.

Secondly, I would like to thank all the experts who actively volunteered to participate in the interviews. Without their voluntary participation, I would not have been able to finish this study.

Lastly, I would like to extend my heartiest thanks to my family and friends for their unconditional support during this work. I owe a lot of gratitude to my husband, Nasir Mehmood Minhas, who has always been a source of guidance and motivation for me. His confidence in me makes me set higher goals for myself.

Nayla Nasir

List of Figures

3.1 Example color coding to extract AT practices and challenges

4.1 Collaboration challenges identified by the practitioners
4.2 AT practices – similarities and differences in two perspectives
4.3 AT practices – extent of commonality in two perspectives
4.4 AT challenges – similarities and differences in two perspectives
4.5 AT challenges – extent of commonality in two perspectives

List of Tables

3.1 Snowball iterations
3.2 Data extraction form
3.3 Interview Questions
3.4 Interview participants
3.5 Participants' Organizational Context
3.6 Analysis procedure adopted for step 2 and step 3

4.1 Acceptance testing practices in the literature
4.2 Acceptance testing challenges in the literature
4.3 State of acceptance testing practice/organizational policy
4.4 Acceptance test automation
4.5 Practitioners' perspective of acceptance testing
4.6 Participants' opinion on ATDD
4.7 Practitioners' perspective of AT practices
4.8 Practitioners' perspective of AT challenges

Contents

Abstract

1 Introduction
  1.1 Background
  1.2 What is acceptance testing?
  1.3 What is the gap?
  1.4 Focus of this work
    1.4.1 Objectives
    1.4.2 Contributions
  1.5 Thesis arrangement

2 Related Work

3 Method
  3.1 Research Questions
  3.2 Selection of research method
  3.3 Detailed study design
    3.3.1 Literature review
    3.3.2 Survey
    3.3.3 Data analysis
  3.4 Threats to validity

4 Results
  4.1 Acceptance testing in research perspective – RQ-1
    4.1.1 Acceptance testing practices in literature
    4.1.2 Acceptance testing challenges in literature
  4.2 Acceptance testing in industry perspective – RQ-2
    4.2.1 Organizational policy of acceptance testing
    4.2.2 Acceptance test automation
    4.2.3 AT definition
    4.2.4 Practitioners' Perception on ATDD
    4.2.5 Acceptance testing practices in industry
    4.2.6 Acceptance testing challenges in industry
  4.3 Similarities and differences in the two perspectives – RQ-3

    4.3.1 Similarities and differences in AT practices
    4.3.2 Similarities and differences in AT challenges

5 Analysis and Discussion
  5.1 Who is involved in acceptance testing
  5.2 When acceptance tests are written
  5.3 How acceptance tests are written

6 Conclusions and Future Work
  6.1 Conclusions
  6.2 Future Work

Chapter 1
Introduction

1.1 Background

Testing is an essential activity of the software development life cycle. It is a complicated and costly activity and can consume up to 50% of the total cost [1]. The goal of testing is to ensure the quality of the system under test, and it is performed at various levels, including the unit, integration, and system levels. It verifies the correctness of the system's functional and non-functional aspects, and this fact classifies testing into two categories: functional testing and non-functional testing. Acceptance testing is one of the common testing types [2].

The success of any software product lies in the fact that the customers/users accept it. Acceptance testing has a central role in this regard. It advocates the involvement of the domain experts (customers) in the verification/validation process. It gives them a chance to check if everything matches their expectations and if the requirements have been communicated and implemented correctly [3]. Despite its significance, empirical evidence on acceptance testing practices in Agile is scarce [4], and many authors [4, 5, 6, 7] have suggested more research in this area. This fact motivated the author to carry out an empirical investigation on acceptance testing in Agile Software Development.

Agile Software Development has been a controversial topic since its outset, with proponents and opponents in both research and practice. Despite much criticism, there is no doubt that Agile methods are widely used in the software industry. Customer involvement, early and frequent feedback, and quick releases are some of the benefits of Agile development methodologies. Unlike classical software development methods, Agile methods advocate early and frequent acceptance testing [8]. The subsequent sections define acceptance testing, describe the research gap, and outline the focus of the current research.

1.2 What is acceptance testing?

Acceptance tests are also known as "customer tests," "customer-inspired tests," and "conditions of satisfaction" [5]. In the Systems and software engineering—Vocabulary (ISO/IEC/IEEE) [9], acceptance testing is defined as:

“Formal testing conducted to enable a user, customer, or other authorised entity to determine whether to accept a system or component.”

The goal of acceptance testing is to ensure that the system functions according to the customer's expectations [10, 11]. Acceptance tests are customer tests owned and defined by the customer to verify whether the developed modules meet the acceptance criteria [4, 11]; they are also termed customer testing and story testing [4, 12, 13]. Acceptance testing gives the customer confidence that the system under development has the required features and behaves correctly. A project is complete when all its acceptance tests pass [14]. Acceptance tests can be used to estimate the time required to deliver the product to the customer, as they can be linked with requirements, thereby providing a clear picture of the whole development process. Acceptance testing is imperative to enhance the shared understanding of product requirements among the stakeholders [6]. Along with compliance with the requirements and design specifications, the scope of acceptance tests includes the verification of the user experience and legal matters [15].
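To make these notions concrete, an acceptance criterion such as "a registered user can log in with valid credentials" can be expressed as a small executable test that the customer can read and the team can run against the delivered module. The sketch below is a minimal illustration only; the LoginService class and its authenticate method are hypothetical and are not taken from the cited studies.

    import static org.junit.Assert.assertFalse;
    import static org.junit.Assert.assertTrue;
    import org.junit.Test;

    // Hypothetical acceptance test for the story "a registered user can log in
    // with valid credentials"; LoginService and its API are assumed for the example.
    public class LoginAcceptanceTest {

        private final LoginService login = new LoginService();

        @Test
        public void registeredUserCanLogInWithValidCredentials() {
            assertTrue(login.authenticate("registered.user@example.com", "correct-password"));
        }

        @Test
        public void loginIsRejectedForAWrongPassword() {
            assertFalse(login.authenticate("registered.user@example.com", "wrong-password"));
        }
    }

When such tests are linked to user stories, a passing suite indicates which stories are complete, which is the sense in which acceptance tests can support estimation of the remaining work.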

1.3 What is the gap?

The existing literature presents different aspects of acceptance testing that need further evidence from the industry. For instance, the existing literature emphasizes customer involvement in acceptance testing [10, 15], while at the same time a lack of customer involvement is presented as a challenge in acceptance testing [3, 16, 10]. Similarly, when, how, and by whom the acceptance tests should be designed is another important aspect of concern [5]. A substantial amount of literature discusses acceptance test-driven development (ATDD) as a method to promote the principles of the Agile methodology [4, 6, 15]. This approach requires acceptance tests to be written before the implementation of the system. Therefore, it is important to investigate the state of practice in this context. The Agile school of thought suggests that tests can complement high-level requirements, and test cases are probably a more accurate means of expressing detailed requirements than natural language [17]. However, there is also empirical evidence suggesting that customers find it difficult to express requirements in terms of acceptance tests [3] and that the developers mostly write the acceptance tests [16]. The customers have more domain knowledge, and the developers have more technical knowledge [6], so it is quite interesting to investigate how acceptance tests are written and what roles are involved in designing the acceptance tests in industrial practice.

The existing literature reports a lack of empirical knowledge on acceptance testing. The focus of existing empirical studies is the development of easy-to-use acceptance testing tools [16], and a majority of the studies have investigated the use of the framework for integrated testing (FIT1) tool in different domains [4]. There are a few studies [3, 16, 12] in which the authors have investigated the state of practice. For instance, Liebel et al. [16] conducted a multi-case study with six companies to examine the state of practice of GUI-based system and acceptance testing. Similarly, the practical use of FIT in specific organizations is investigated by different authors [3, 12]. To the best of my knowledge, I could not find any study on acceptance testing investigating the state of practice in industry at a broader level.

1 http://fit.c2.com/

1.4 Focus of this work

This work aims to investigate the state of research and practice on acceptance testing in the context of Agile Software Development and to analyze the similarities and differences in both perspectives. I did not address the strategies used to mitigate the identified challenges.

I have carried out this study in two phases. In the first phase, I conducted a literature review of 20 research studies. For finding the relevant studies, I followed the snowball search strategy [18]. The objective of conducting the literature review was to find the research perspective on acceptance testing practices and challenges. In the second phase, I conducted semi-structured interviews with practitioners from 10 different companies representing three different continents (Asia, Europe, and North America). These interviews enabled me to understand the industry perspective of acceptance testing practices and challenges. For the analysis of the data, I followed the thematic synthesis approach [19]. The aim was to analyze the acceptance testing practices and challenges from the perspective of research as well as industry. Similarities and differences in the two perspectives were also analyzed. The objectives and contributions of this study are presented in the following subsections.

1.4.1 Objectives

O1: To understand the acceptance testing practices and challenges as presented in literature.

O2: To identify the acceptance testing practices and challenges from the industry.

O3: To identify the industry-academia gap on acceptance testing by comparing the acceptance testing state of research and practice.

1.4.2 Contributions

The following are the three primary contributions of this study:

C1: Presenting the research perspective on acceptance testing practices and challenges.

C2: Presenting the practitioners' perspective on acceptance testing practices and challenges.

C3: Presenting a mapping between the two perspectives (i.e., research and practice) regarding acceptance testing practices and challenges.

1.5 Thesis arrangement

This thesis consists of six chapters. Chapter 1 presents an introduction. Chapter 2 gives an overview of the related work. Chapter 3 provides a detailed description of the methods adopted to carry out this research. Chapter 4 presents the study's findings, organized according to the research questions. Chapter 5 provides an analysis and discussion of the results of this study. Finally, Chapter 6 concludes this thesis.

Chapter 2
Related Work

This chapter presents an overview of related work in the area of acceptance testing. Although many papers have been published on acceptance testing, in most studies the authors have investigated the implementation of acceptance testing tools for a specific domain [4]. For the discussion in the related work, I selected those studies in which the authors present some aspect of acceptance testing practice.

Weiss et al. [4] carried out a systematic literature review of empirical research on acceptance testing. The authors highlighted the lack of empirical studies on acceptance testing, finding only 26 such studies. They revealed that the majority of existing work had investigated the use of FIT in different domains. They attributed various benefits to ATDD, including reduced time requirements, a decreased number of human errors during test execution, and the availability of additional details of requirements from the customers' perspective. The authors also identified some of the practices and challenges associated with acceptance testing.

Another literature review was conducted by Park and Maurer [20]. The authors focused on story test-driven development (STDD), which is considered a synonym of ATDD. The aim of their study was to identify the benefits and issues related to story test-driven development. They categorized the results into seven themes, i.e., time, cost, people, code design, testing tools, what to test, and test automation issues. The authors highlighted that there is a lack of empirical evidence on STDD. According to their findings, most of the existing studies are carried out with a small group of participants and the findings are not generalizable.

In his doctoral thesis, Melnik [12] presented an empirical analysis of executable acceptance test-driven development (EATDD). The author used multiple methods of investigation and mainly investigated the use of the FIT framework for specifying functional requirements in the development of business applications. The author revealed that there is a correlation between EATDD and better communication in software teams. The author also noted that the current state of tool support for EATDD is weak, especially with reference to the maintainability and scalability of acceptance tests.

The studies presented in his thesis describe the acceptance testing practice and the associated limitations. The findings of the thesis represent both the academic and industry perspectives. However, the scope of the findings is limited to EATDD and line-of-business applications.

Haugset et al. [3] conducted a literature review and a case study on automated acceptance testing (AAT). The authors investigated the use of FIT and Selenium on two projects in a medium-sized company. From the case study results, the authors revealed various limitations in adopting AAT in the industry. They suggested that the customers' preference for traditional ways of requirements elicitation can serve as a bottleneck towards the widespread use of AAT in the industry. They recognized potential downsides of AAT, e.g., developers implementing just enough code to make the test pass rather than understanding and implementing the full requirement. Costs associated with the design and maintenance of automated tests were also considered a drawback. The authors also highlighted some potential benefits of using AAT. For example, AAT can increase confidence among developers and improve communication and collaboration in the team by allowing easy sharing of code. They also identified that acceptance tests could serve as valuable documentation of the desired behavior of the system. Although this study's findings represent the research and industry perspectives, the scope of the industry findings is limited to a medium-sized software consultancy company. Furthermore, the results are confined to the use of FIT and Selenium. In their conclusions, Haugset et al. suggested further empirical investigations on AAT.

Ricca et al. [17] conducted an empirical assessment to identify the role of acceptance tests written in FIT in clarifying requirements. They suggested that FIT tables effectively contribute towards a better understanding of requirements, but they involve additional effort. The study was carried out in an academic environment with master's degree students. Therefore, the findings may not fully represent the industry context.

Liebel et al. [16] carried out a multiple-case study to investigate the state of practice in GUI-based acceptance and system testing. They performed detailed studies with six companies of varying contexts. The authors concluded that manual GUI-based system and acceptance testing is more widely practiced in the industry than automated GUI-based testing. The reasons behind this include test tool limitations, higher costs of testing, and the need for customer involvement in testing. The authors highlighted the fact that FIT is the most popular tool in the existing research. However, in their case study findings, the authors also pointed out that the companies were not using FIT. This study's findings may be considered representative of industry practice, since they represent the context of six different organizations. However, the investigation is confined to GUI-based acceptance testing, and it does not represent the research perspective.

Hotomski et al. [21] conducted an exploratory study to investigate the handling of documentation for requirements and acceptance tests in the industry.

They interviewed twenty practitioners from fifteen organizations. The authors revealed some interesting facts. For example, test documentation is more extensive than the documentation of requirements. They further revealed that technical roles are not involved in requirements engineering activities, which results in ambiguous requirements, and the acceptance tests, which are written based on the requirements, are not necessarily complete. The authors also identified that acceptance tests are not appropriately maintained in the companies. The context of this study is related to my work. However, the study's scope is limited to the documentation of requirements and acceptance tests. Further, the authors did not consider the literature while presenting the results.

The related work presented here mainly focuses on ATDD or AAT [3, 4, 17, 20], and the majority of the studies investigate the use of FIT for AAT [3, 12, 17]. Existing literature reviews [3, 4, 20] highlighted the lack of empirical investigations in the field of acceptance testing. One study [16] presented the state of practice of GUI-based acceptance and system testing in six companies, and another study presented the state of practice regarding the documentation of requirements and acceptance tests. To the best of my knowledge, I could not find any study with an explicit focus on the state of research and practice of acceptance testing. Such a study can help identify the industry-academia gap concerning acceptance testing and can provide a basis for researchers to align their research with industry practices.

Chapter 3
Method

This chapter presents the details of the methodology used to conduct this research. I conducted a literature review and an interview-based survey of practitioners involved in acceptance testing. The subsequent sections provide an account of the research questions, the research methods selected for this study, the detailed study design, and the threats to validity.

3.1 Research Questions

RQ-1: What is the state of research on acceptance testing in Agile Software Development?

RQ-1.1: What are the acceptance testing practices reported in the research?

RQ-1.2: What are the acceptance testing challenges reported in the research?

Motivation: The answer to this question will help me to understand the practices and challenges of acceptance testing in an Agile environment as presented in research. These findings will help to communicate the relevant acceptance testing research to Agile practitioners.

RQ-2: What is the state of practice in acceptance testing in Agile Software Development?

RQ-2.1: What is the perception of practitioners regarding acceptance testing practices?

RQ-2.2: What are the challenges that practitioners are facing while performing acceptance testing?

Motivation: It is important to investigate how Agile practitioners are undertaking acceptance testing, which steps are crucial to them, and what sort of tools they are using, as well as the challenges faced by the practitioners in this context. These findings can serve as the basis for further research, such as the identification of mitigation strategies for the AT challenges.

RQ-3: What are the similarities and differences in these two perspectives?

Motivation: The objective is to see if there is a gap between the research and industry perspectives on acceptance testing. The findings will help researchers to align their research to the identified industry needs.

At the same time, the practitioners will have a chance to review their practices against the best practices defined in the literature.

3.2 Selection of research method

The selection of an appropriate research method is the first step towards the success of any research work [22]. Easterbrook et al. identify the following research methods as relevant to software engineering: controlled experiments, case studies (both exploratory and confirmatory), survey research, ethnographies, and action research [23].

To understand the research and practice perspectives on acceptance testing, I performed a literature review and conducted interviews with testing practitioners. The aim was to find the answers to the research questions presented in Section 3.1. To address RQ-1, I reviewed 20 research papers, which enabled me to recognize the acceptance testing practices and challenges presented in the existing literature. It also helped me to build the foundation for the next phase of the investigation.

Initially, two alternatives, the case study approach and the survey approach, were considered to investigate RQ-2. A case study approach is used to develop an in-depth understanding of a phenomenon and is appropriate in situations where context is important. Most often, a case study aims to analyze the mechanism of cause-effect relationships [23]. A survey approach, in contrast, helps to identify characteristics of a broader population [23]. In this approach, a representative sample is selected from a well-defined population. Data analysis techniques are used to discover a common phenomenon across the population, and the outcomes can be generalized. Since this study aims to understand the overall industry perspective on acceptance testing practices and challenges in Agile Software Development, I opted for the survey approach. The data collection methods for survey-based research are questionnaires, interviews, and data logging techniques [23, 24, 25]. Instead of an online questionnaire, I chose to conduct interviews with the practitioners. Interviews provide an opportunity for a more detailed investigation of the phenomenon under study. They allow simultaneous bi-directional interaction to clarify any ambiguities and ask further questions. I was able to conduct semi-structured interviews with 12 practitioners from 10 different companies representing different product domains.

Regarding RQ-3, a qualitative comparison of findings from RQ-1 and RQ-2 was performed. The purpose was to find the similarities and differences in research and practice perspectives on acceptance testing practices and challenges.

3.3 Detailed study design

3.3.1 Literature review

To answer the first research question (RQ-1), I conducted a literature review of 20 selected papers. I do not claim an exhaustive search of the literature, as the aim was not to conduct a systematic literature review. Instead, the focus was to find a representative set of studies that discuss any aspect of acceptance testing practice or challenges related to acceptance testing. I did not opt for a systematic literature review because elaborate systematic literature reviews are already presented in [3, 4, 20]. To search for the relevant literature, I used the snowball search strategy, following the guidelines presented in [18]. To find a seed set for snowballing, I searched Scopus, as Scopus is among the databases that provide more reliable quality indexing and better bibliographic records in terms of accuracy of information [26]. To search for the relevant studies, I used the following search string:

("acceptance testing" OR "user acceptance testing" OR "automated accep- tance testing" OR ATDD OR "story test driven development") AND "Agile Soft- ware Development" AND ("literature review" OR "systematic literature review" OR SLR) I used the key terms of “literature review” or “systematic literature review” be- cause I was interested in finding the most recent secondary studies, from which I could select a start-set of primary studies. My search returned three studies [3, 4, 20]. I opted to choose the literature review by Weiss et al. [4] as a base- study for further searches because this is the latest secondary study relevant to my topic, and it includes both the other studies. This study presents a systematic literature review of 26 empirical research articles regarding the state of research on acceptance testing. I examined all the primary studies from this paper and, based on the inclu- sion and exclusion criteria presented in Subsection 3.3.1, I selected ten studies as a start-set for further snowball iterations. For these ten studies (start-set), I performed backward and forward snowball searches. For backward snow- ball, I examined each selected study’s references, and for the forward snowball, I reviewed all the papers that have cited the selected studies. For the forward snowball searches, I used Google Scholar because it provides all possible cita- tions. Table 3.1 presents the list of studies selected in different iterations. In each iteration, the papers were selected based on the inclusion and exclusion criteria mentioned in Subsection 3.3.1. For every selected study in each iteration, I went through the snowballing process (both backward and forward). In the first iter- ation, I found eight related papers, and in the second iteration, I found only two papers. After the third iteration, I stopped the process because I could not find new papers related to my topic. Chapter 3. Method 11

Table 3.1: Snowball iterations.

Iteration          Selected studies                       Total
Base study         [4]                                    1
Start set          [3, 5, 16, 12, 20, 27, 28, 29, 30]     10
First iteration    [10, 21, 31, 32, 33, 34, 35]           7
Second iteration   [36, 37]                               2
Third iteration    No relevant results returned           0

Inclusion and exclusion criteria

For the selection of primary studies, I have applied the following inclusion and exclusion criteria:

Inclusion criteria:

• Articles that focus on any aspect of my research topic, i.e., acceptance testing practices or challenges.

• Articles available in the English language.

• Articles available in full text.

Exclusion criteria:

• Acceptance testing papers that present a proposed technique/framework without considering any aspect of acceptance testing practices or challenges.

• Articles that are duplicates.

• Articles which are not written in English

• Articles that are not available in full text

Quality assessment criteria: The quality assessment criteria used for the inclusion of papers include the following:

• Is the paper peer-reviewed?

• Are the study’s aim and objectives clearly defined?

• Are the study outcomes clearly mapped to the objectives of the study?

• Do the methods used for investigation conform to the established guidelines?

Figure 3.1: Example color coding to extract AT practices and challenges

Data Extraction and Analysis

To extract the data from the selected primary studies, I used a data extraction form (Table 3.2). I followed the recommendations suggested by Cruzes and Dybå [19].

Table 3.2: Data extraction form.

Data item            Description
Title                Title of the selected study
Authors              Authors' names and affiliations
Publication          Type, year, and venue of publication
Research method      Stated objectives of the research and chosen research methodology
AT practices         Acceptance testing practices, along with the authors' description of the practices
AT practice label    Label assigned to the acceptance testing practice
Challenges           Challenges related to acceptance testing mentioned in the study
AT challenge label   Label assigned to the challenge

Before the data extraction, I thoroughly read all the selected studies. After the first round of reading, I started marking the practices and challenges using different colors (green for practices and yellow for challenges). An example of this color coding is presented in Figure 3.1. After completing the color coding of a study, I assigned appropriate labels to the identified practices and challenges. For example, in Figure 3.1, the first verbatim statement, "One of the main ideas in acceptance testing is that customers specify test cases and seen as test oracle", was labeled as "Customer specify acceptance tests". Similarly, the statement highlighted in yellow is a challenge, which I labeled as "Customer collaboration". I entered the data along with the associated labels into the data extraction form and grouped similar findings to eliminate redundancies. Finally, I created separate tables for acceptance testing practices and challenges.

3.3.2 Survey

To conduct the survey, I utilized the guidelines provided in [38]. The survey design consists of various steps, including the definition of the research objectives, defining the population and sample, developing the survey instrument(s), evaluating the survey instrument(s), obtaining valid data, and analysing the data [39]. These steps are elaborated in the following subsections.

Definition of research objectives

Research objectives are derived by understanding the problem and then developing a conceptual model. A conceptual model describes the objects to be investigated, the research variables, and the relationships between them [40]. I identified two areas where I would be collecting data: i) the state of practice of AT in Agile Software Development, and ii) the challenges associated with these practices.

Population and Sampling

It is important to characterize the target population in order to select a representative sample from it. The population for this study consists of industry practitioners who are actively involved in acceptance testing (AT) in Agile Software Development.

Sampling plan: The two types of sampling techniques are probabilistic sampling and non-probabilistic sampling [41]. For this study, I chose the non-probabilistic sampling approach because the target population was specific and well defined. More precisely, I initially utilized the convenience sampling approach and collected data from respondents who were known to me and willing to participate. Later, I opted for snowball sampling and requested the participants to provide further references to practitioners who are i) working in an Agile environment, and ii) involved in acceptance testing.

Ethical concerns: All the respondents of this study participated voluntarily, and I obtained their prior consent regarding their participation and the provision of the required information. I also took permission from the participants to record the interview sessions, and I agreed with the participants to maintain anonymity regarding their identity and that of their organizations.

Design of survey instrument

Interview design: Following the guidelines provided by Robson [25], I formulated a detailed interview guide. I followed a semi-structured design for the interviews. The interview guide has a clear introduction of the study purpose and objectives, and it comprises the following sections: the respondent's background, questions related to AT practices, and questions related to AT challenges. I kept all the questions open-ended. The purpose was to get a detailed perspective from the respondents. Table 3.3 presents the detailed design of the interview guide. The average duration of each interview was around 50 to 55 minutes.

Table 3.3: Interview Questions.

Respondent’s The first part of the interview is concerned with respondent’s background. It helped Background to characterize the expertise of the respondent regarding AT. This section includes the following questions: Q 1: What is your professional background?(qualification, years of experience) Q 2: What is your role and responsibilities within your organization? Q 3: How many years of experience do you have with Automated Acceptance testing in Agile environment? Q 4: How would you define acceptance testing? Acceptance This section is concerned with understanding the perspective of practitioners on ac- testing Prac- ceptance testing and the practices within an Agile organization. These practices also tices lead us to highlight some challenges.The questions in this section are as follow: Q 5:Give us a walk through of the acceptance testing process in your organization? Q 6:Does your organization have a formal policy for Automated acceptance testing?( e.g. formal guidelines, standards). Q 7: Who designs the tests? Q 8: Do you involve customer representatives while designing test cases? Q 9: Which artifacts are used as input to design test cases (for instance formal user stories etc)? Q 10: How do you measure the coverage of acceptance tests (e.g Line cover- age,functional or feature coverage), and what coverage level do you consider? Q 11: When in the development cycle are acceptance tests performed?(change re- quest, requirement validation, release) Q 11.1: Do you think that if acceptance tests are performed before development,they can lead towards better understanding of user requirements? Q 12: Which tools do you use for automated acceptance testing? Q 13: Who maintains the acceptance test suite, and how? Q 14: How is test data for acceptance tests is stored and used? Q 15: Is the acceptance testing process successful in identifying all the faults? Acceptance In this section of the interview, the interviewee was explicitly asked to highlight testing Chal- challenges associated with different implementation aspects of acceptance testing. lenges This section have the following questions: Q 16:Are there any challenges associated with the designing of Acceptance test cases? Q 17:What are the challenges you face while maintaining the acceptance test suite? (e.g. cost, complexity, size etc.). Q 18: Do you think that the tools you use for automated acceptance testing are appropriate for the purpose? Q 19: Is automation of acceptance testing considered essential by your organization? Q 20: What do you think is the most difficult aspect of acceptance testing? Q 21: What solutions you propose to address the acceptance testing challenges? Q 22: Do you want to further elaborate on any aspect of AT that we missed out?

Pilot testing the interviews

To evaluate the interview design, I conducted two pilot interviews: one with an experienced researcher who is actively working on various areas, including acceptance testing, and the other an online interview with an industry practitioner who has hands-on experience of acceptance testing in an Agile environment. To ensure the accuracy of the content and avoid any misinterpretation of the data, I recorded the interviews with the interviewees' permission. I asked all the questions in the interview guide and also asked for feedback on the interview guide. After finishing the pilot testing, I made changes to a few questions based on the experts' feedback. The pilot testing of the interview guide enabled me to get the relevant details and further understand the context.

Lessons learnt from the pilot implementation: The pilot implementation helped me to improve the interview guide. Based on the experience during the pilot interviews, I made the following improvements to the interview guide:

• I rephrased some of the questions to make them easier for the respondents to understand. I also decided to provide additional explanations in case a respondent needed them.

• I changed the sequence of some questions to ensure that related questions are asked together.

• I learned to stay focused on the aspects addressed in the interview guide when asking complementary questions. This helped me to obtain the most relevant information and enabled me to complete each interview within the specified time.

Conducting the Interviews

I conducted 12 interviews with practitioners from 10 different organizations working in various domains of Agile Software Development. Each interview took about one hour and was audio-recorded with the permission of the participants. All the interviews were conducted online, as the participants represented companies from three different continents (Asia, Europe, and North America). Table 3.4 presents the background of the interview participants. Personal information and organizational affiliation of the participants are omitted, as we agreed to keep this information anonymous. However, some essential information related to the participants' organizations is presented in Table 3.5.

It is worth highlighting that all the participants are experienced; their experience ranges from 5 to 25 years. The participants represent different technical roles, and all of them have testing experience. Considering the size of the companies, all segments (small, medium, and large) are represented in this study. The companies represent a diversified range of product domains, including telecom, business intelligence, health care, web-based applications, and customized solutions. Moreover, the results of this study represent the perspective of both product- and project-based companies.

3.3.3 Data analysis

For the analysis of the data collected from the interviews, I followed the thematic analysis steps defined by Cruzes and Dybå [19]. Thematic synthesis is a suitable data analysis method that identifies themes and labels from a qualitative data corpus concerning the specified research questions. Furthermore, I also took inspiration from the method used in [42]. The following steps were followed for the thematic analysis.

Table 3.4: Interview participants.

Company   PID [1]   Experience   Role                 Responsibilities
C1        P1        11 years     Test Architect       Test automation
C2        P2        14 years     Senior Developer     Development & customer support
C3        P3        11 years     QA Manager           Test planning & management
C4        P4        08 years     QA Manager           Test design
C5        P5        14 years     Tech Lead
C1        P6        25 years     Lead Developer       Development & testing
C6        P7        12 years     QA Lead              Validation, test automation, & tester mentoring
C7        P8        23 years     Developer/tester     Development & AAT
C8        P9        10 years     Development lead     Development/testing
C9        P10       8 years      R&D test engineer    QA/testing
C10       P11       15 years     Quality engineer     Generation and test suite maintenance
C9        P12       5 years                           Acceptance testing

[1] PID: Participant ID.

Table 3.5: Participants’ Organizational Context. Company PID1 Company Product Domain Process Model Req Source2 Location Size C1 P1 10,000+ Real time back-end Staged Ag- Market-driven Sweden telecom ile (Hybrid of Scrum & Waterfall) C2 P2 1800+ Micro-service based Scrum Market-driven Sweden telecom C3 P3 5,000+ Business intelligence Scrum Market-driven USA C4 P4 25,00+ Health record Scrum Market-driven USA C5 P5 3500 Customized Solution Scrum & Kan- Bespoke Ukraine Provider ban C1 P6 10,000+ Telecom Banking Scrum Market-driven Sweden C6 P7 4,000 Web-based applica- Scrum Market-driven UK tions C7 P8 24,000 Healthcare product Scrum Bespoke Finland C8 P9 150 Business support Scrum Market-driven Sweden systems C9 P10 14,000+ Telecom Market-driven China C10 P11 600+ Web-based applica- Scrum & Kan- Bespoke Denmark tions ban C9 P12 14,000 Back-end Telecome Kanban Market-driven China application 1 PID: Participant ID. 2 Req Source: Requirement source corpus concerning the specified research questions. Furthermore, I also took in- spiration from the method used in [42]. Following steps were followed for thematic analysis.

Step 1: Data transcription

In the first step, I transcribed the audio recordings and entered the data into Excel sheets organized according to the interview guide. After completing each interview transcription, I listened to the interview recording again to verify that I had not missed any essential information.

Step 2: Creating meaningful themes

Although the transcribed data was organized into structured sheets, I still had to perform further analysis on each category to create meaningful themes. The purpose was to make the data uncomplicated and easily accessible. I used color codes to separate the concepts and identified the similarities and differences in the findings of the different interviews. Similar statements were grouped together for further analysis. Table 3.6 presents examples of similar participant statements that were grouped together.

Table 3.6: Analysis procedure adopted for steps 2 and 3.

No. 1 (Category: Challenge; Restructured label: Team members lack the domain knowledge.)
Original statements:
i) Developers designing the test cases don't have the knowledge of the legacy system.
ii) New team members lack the domain knowledge; hence they cannot understand the context and intent of the system.
iii) To get hold of the right person to write the acceptance tests. In case of an employee switching, it becomes very difficult for the new employee to fill that knowledge gap and to understand the context of the product. Lack of this knowledge may result in inappropriate acceptance tests.

No. 2 (Category: Practice; Restructured label: User stories are written in natural language.)
Original statements:
i) User stories in the form of JIRA tickets using natural language.
ii) User stories written in natural language represent the requirements.
iii) User stories along with some context written in wikis. Presentations and informal communication are also used to communicate requirements.

No. 3 (Category: Practice; Restructured label: For new features, acceptance testing is performed manually.)
Original statements:
i) Acceptance tests are manual for the new stories, automated after the release (for the regression suite).
ii) Acceptance tests for new features are written and performed manually.
iii) User stories along with the acceptance criteria are used to write the acceptance tests; for new features, acceptance tests are manual.

No. 4 (Category: Challenge; Restructured label: Collaboration with the customer.)
Original statements:
i) Most customers lack the technical knowledge, and sometimes they do not have a deeper understanding of the system.
ii) Customers expect the product to run according to their specified criteria; they are not interested in the test process.
iii) Customers lose motivation over time.
iv) Customers do not have time to participate in acceptance testing.

Step 3: Assigning labels

After creating the themes from the findings, the next step was to assign appropriate labels to the acceptance testing practices and challenges. In this step, besides labeling, I also restructured the interviewees' statements where necessary. Table 3.6 presents the original statements along with the restructured labels.

Step 4: Presenting the results

Chapter 4 presents the results of my study. I summarized the results in the form of data tables and provided a discussion of each essential aspect of the findings.

Step 5: Validating the results

Qualitative studies are subject to researcher bias in the interpretation of the results, which can be mitigated by validating the results with the source. To validate my interpretation of the results, I sent the results to the interview participants for review. I requested them to review their results and check whether I had missed any essential aspects. In response to my emails, all the respondents sent their reviews, and all of them agreed with the interpretation of the results. Only two respondents suggested minor changes to their responses: one revised his perspective on acceptance test-driven development, while the other suggested adding some statements to his perception of the automation of acceptance tests, which I adjusted accordingly.

3.4 Threats to validity

The findings of this thesis are based on a literature review and a qualitative survey of industry practitioners. The data for the survey was collected using semi-structured interviews. There could be potential threats to the validity of the results obtained through the literature and the survey. The subsequent paragraphs discuss the potential threats to the study's validity along with possible mitigation strategies. To manage the validity threats, I followed the guidelines published in [43, 44].

Construct validity: The purpose of construct validity is to ensure the correct choice and use of operational measures and concepts/terms of the study. In the case of this study, it can be linked to the study selection process for the literature review and the design of the survey instrument (interview guide). To ensure sufficient representation of the relevant literature, I initially used a keyword-based search to find the most relevant secondary studies on the topic, which helped me select an appropriate start set of studies. For the subsequent searches, I used the snowballing technique [18]. For study selection, I used well-defined inclusion and exclusion criteria (Section 3.3). I do not claim exhaustive searches, but the consistency of the findings is evidence that I selected a sufficient number of relevant studies. While designing the survey instrument (interview guide), I carefully followed the guidelines in [25]. Further, to ensure the interview guide's correctness and consistency, I conducted pilot interviews with experts. Based on the experts' feedback, I made the necessary changes to the interview guide.

Internal validity: Internal validity refers to how well the study is conducted and how credible the study's findings are. It refers to the factors that affect the outcome of the research. In my work, it relates to the analysis of the literature and the conducting and analysis of the interviews. For the analysis of the literature, I followed the guidelines of thematic synthesis. For the interviews, in my opinion, two aspects are significant: i) the selection of participants, and ii) the conducting of the interviews. I followed the convenience sampling method to select the interview participants, and all the selected participants are experienced in the field. I used audio recordings of the interviews to ensure that I did not miss any essential aspect of the participants' perceptions. For the analysis of the interview results, I followed a comprehensive procedure.

External validity: External validity refers to the generalization of the results. In this study, along with the literature review, the results are the outcome of interviews conducted with practitioners from different organizations working in diverse domains. The participants represent companies operating in three different continents. Furthermore, the necessary information about the participants and their companies is part of this dissertation. These factors improve the generalizability of the findings, and hence the threat to external validity is minimized.

Conclusion validity: Conclusion validity refers to the quality of the conclusions drawn from the study's findings. The credibility of the conclusions mainly depends on correct and unbiased results. I ensured triangulation for all aspects of the data, that is, data collection and interpretation. This study's conclusions are the outcome of data collected from the literature review and the interviews.
I utilized well-established methods for data interpretation and analysis. I also validated my results with the selected respondents.

Chapter 4
Results

4.1 Acceptance testing in research perspective – RQ-1

This section presents the literature findings on acceptance testing. To identify the research perspective on acceptance testing practices and challenges, I reviewed 20 selected studies.

4.1.1 Acceptance testing practices in literature

Table 4.1 presents the various practices identified from the literature. Most of the identified practices relate to when, how, and by whom the acceptance tests should be specified. Agile Software Development has a keen focus on delivering maximum value to the customer, and in the context of acceptance testing, failed acceptance tests may point towards problems in delivering the intended business value to the customer [28]. Customer involvement in different phases of acceptance testing (from design to execution) is highlighted in the literature [4, 5, 12, 20, 27, 29]. There is evidence from research suggesting that the acceptance criteria are specified by the customer [3, 5, 27, 12]. Many studies suggest that the customer or a customer representative should specify the acceptance tests [4, 3, 28]. In case the acceptance tests are designed by the developers, they should be reviewed by the customer [27, 12]. There is also evidence of the involvement of multiple roles in acceptance test design [4, 20, 27, 12, 31]. These may include domain experts, business analysts, testers, and developers. The collaboration of the customer with the technical roles promotes knowledge sharing, contributing towards a better understanding of system complexities. When it comes to acceptance test execution, a couple of studies suggest that anyone in the team, including the customer, can run the acceptance tests [5, 12]. According to [16], acceptance tests are conducted by the customer. It is also reported that acceptance tests are mostly performed in an informal fashion, e.g., as a software demonstration [29]. Regarding the maintenance of tests, the suggestion is that it should be the responsibility of the domain experts [28].


Table 4.1: Acceptance testing practices in the literature.

LPr [1]   AT Practice   Studies

LPr1 Requirements are written in natural language (e.g., user stories) [3, 12, 27]

LPr2 Domain-specific languages and formats are used to specify the requirements [28, 30, 35]

LPr3 Customer specifies acceptance criteria for each story [3, 5, 12, 27]

LPr4 Acceptance tests are specified by the customer/customer representative [4, 28]

LPr5 Customer collaborates in acceptance test design [4, 5, 12, 20, 27, 29, 30, 31]

LPr6 Acceptance tests are written using FIT tables [4, 16, 17, 20, 27, 33, 45]

LPr7 Acceptance tests are written in natural language [20, 37]

LPr8 Acceptance tests are specified before development [4, 5, 12, 27, 28, 29, 30, 31, 35]

LPr9 Requirements are specified as executable acceptance tests [4, 5, 12, 28, 29, 30, 31, 35]

LPr10 Several roles are involved in specifying acceptance tests [4, 12, 20, 27, 31]

LPr11 Negative tests are also specified in acceptance test scenarios [3, 12, 27, 29]

LPr12 Anyone in the team, including the customer, can run the acceptance tests [5, 16, 12]

LPr13 Acceptance tests are maintained by domain experts [28]

LPr14 Dedicated role for acceptance testing [20]

[1] LPr: Literature practice.

The current literature also provides evidence of having a dedicated role (e.g., user acceptance tester) for acceptance testing [20]. The literature suggests that, in order to minimize ambiguities in requirements, there should be an appropriate mechanism for the correct communication of requirements between customers and developers [28]. The practices to specify acceptance tests mentioned in the literature are: i) specifying the acceptance tests in an executable format, i.e., using FIT tables [4, 16, 20], and ii) writing the acceptance tests in natural language using Microsoft Word, spreadsheets, etc. [20, 21]. According to [29], acceptance tests written in tabular form contribute towards a better comprehension of requirements. Acceptance test cases should cover the main and alternative paths. Studies also highlight the need for specifying acceptance tests for negative scenarios [12, 27, 29]. In terms of when acceptance tests should be designed, the majority of studies support acceptance test-driven development (ATDD) and suggest specifying the acceptance tests in an executable format before implementation [4, 5, 12, 28, 29, 30]. It is suggested that executable acceptance tests, when written before implementation, can provide certain benefits: i) they can help to communicate the domain knowledge required for development [4, 28], ii) they help to communicate the current status of the implementation [28], and iii) they can contribute towards better estimation of stories [20].
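To illustrate what an executable, FIT-style acceptance test can look like, the sketch below shows a hypothetical column fixture for a discount-calculation story. The class, field, and business rule are invented for this example and are not taken from the reviewed studies; the fixture simply backs a customer-readable table whose rows list an order total and the expected discount.

    import fit.ColumnFixture;

    // Hypothetical FIT column fixture for the story "orders of 100 or more
    // receive a 5% discount"; names and rule are illustrative only.
    public class DiscountFixture extends ColumnFixture {
        public double orderTotal;      // input column filled from the table row

        public double discount() {     // computed column compared with the expected value
            return orderTotal >= 100.0 ? orderTotal * 0.05 : 0.0;
        }
    }

The customer-facing side of such a test is an ordinary table; the FIT runner invokes the fixture for every row and marks the cells whose actual values differ from the expected ones, which is what allows non-technical stakeholders to read, review, and extend the test suite.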

4.1.2 Acceptance testing challenges in literature

Table 4.2 lists the challenges concerning acceptance testing found in the literature. Various authors have highlighted that collaboration with the customer is a challenge [4, 16, 10, 29, 30].

Table 4.2: Acceptance testing challenges in the literature.

LC [1]   AT Challenge   Studies

LC1 Collaboration with customer [4, 16, 10, 29, 30]

LC2 AT design and maintenance is time consuming & costly [3, 4, 16, 20, 31]

LC3 Poor test quality can lead to deceptive confidence [3, 4]

LC4 Lack of AT budget [16, 20]

LC5 AT execution is time consuming and costly [3, 16, 20, 28]

LC6 Lack of time for building AT infrastructure [16, 20]

LC7 Lack of competence and experience of the roles involved [16, 20, 30, 31, 35]

LC8 Lack of appropriate testing infrastructure [20]

LC9 AT suites are difficult to maintain and optimize [3, 20, 21, 30, 33]

LC10 Limitations of acceptance testing tools [16, 20]

LC11 Test data generation for automated tests [16]

LC12 Missing requirement context [27, 30]

LC13 Aligning acceptance tests with changed requirements [21, 36]

LC14 Testing of Legacy systems [16]

¹ LC: Literature challenge.

Various authors have highlighted that collaboration with the customer is a challenge [4, 16, 10, 29, 30]. Several reasons make it difficult to collaborate with the customer. For instance, lack of time, lack of knowledge (e.g., the customer's testing skills [3]), and customer motivation are among the reasons that make collaboration a challenge [10]. Acceptance test suite maintenance and optimization is a difficult task [3, 20, 30], for several reasons. For example, GUI-based acceptance tests are fragile and can break even after a small change. Lack of time and budget for acceptance testing is another challenge [16, 20]. This challenge is significant because acceptance test design and maintenance are time-consuming and costly activities [4, 16, 20], and so is acceptance test execution [3, 16, 20, 28]. In many cases, acceptance testing becomes more complex, time-consuming, and costly if it is performed manually [3]. Regarding automated acceptance testing, a significant challenge is the limitation of acceptance testing tools, for example, missing required functionality [16, 20]. Insufficient training of the people involved in the testing tools, and the time required to get familiar with the tools, can also be considered challenges [20]. The lack of competence and experience of the roles involved in acceptance testing is another related challenge specified in many studies [16, 20, 30, 31, 35]. Another challenge that could be linked here is the availability of the appropriate infrastructure required for automated acceptance testing [20]. Poor communication among the stakeholders and a lack of documentation can lead to another challenge: aligning the acceptance tests with changed requirements [21, 36]. When changes are not communicated or documented properly, testers may report newly added or changed features as bugs.

4.2 Acceptance testing from the industry perspective – RQ-2

This section presents the practitioners' perspective on acceptance testing. I have investigated various aspects of acceptance testing, including the organizational policy on acceptance testing, how practitioners define acceptance testing, how they practice it, and the challenges they face while practicing it.

4.2.1 Organizational policy of acceptance testing

Table 4.3 presents an overview of the participants' perspective regarding their organizations' policy on acceptance testing. The majority of the participants stated that they do not have a uniform organization-wide policy. Four of them stated that the acceptance testing policy in their organizations is ad hoc. Similarly, three participants stated that their organizations' acceptance testing policy varies from team to team, and teams decide based on the project needs. Three participants stated that their organizations have a defined policy for acceptance testing. On the other hand, in one organization, acceptance testing is not a regular feature, and their policy is to perform acceptance testing if their client asks for it.

Table 4.3: State of acceptance testing practice/organizational policy.

PID¹   State of acceptance testing practice

P1 No organization-wide well-defined policy for acceptance testing. Rather, I would say it is ad hoc. Practices vary across teams based on their needs. Important to note is that acceptance testing is mostly considered at the technical needs level.

P2 Our organization has a focus on quality, and I would say that we have a well-defined policy on acceptance testing. Tools used for testing may vary across teams, but every team is required to have a test plan. Mentoring is arranged for newer employees. The test plan template is provided by higher management.

P3 The current management has a strong focus on customer behavior. We have a standardized policy for acceptance testing, and across the teams, acceptance testing is uniform. Processes are formalized by using standardized tools and procedures, and a uniform set of tools is used throughout the organization.

P4 The acceptance testing practice varies from team to team. There is no clear organizational policy on acceptance testing. Usually, after implementation, the product is tested in two stages. In the first stage, it is tested at the unit and functional level by the QA testers. In the second stage, it undergoes user acceptance testing.

P5 Our organizational policy is that acceptance testing is only performed if the client requests it. The practice adapts according to the customer's limitations (budget and time) and according to the project's nature. The time to market is critical from the client's perspective, and QA is often seen as a bottleneck. There should be a focus on unit and integration testing during development, and for acceptance testing, the QA should work with the support team to address faults faced by the users after the release. It's okay to have some unhappy clients, but it's better to validate the product in the real environment with real users and get the actual customer feedback.

P6 Over the years, I have seen that we do not have a defined policy for acceptance testing. I would say acceptance testing is ad hoc, and it varies from team to team. Standardization is a double-edged sword: it can help maintain a consistent practice, but it can halt innovation. E.g., a polyglot programmer uses the best possible tools for the tasks.

P7 The testers in the development team perform acceptance testing. But the actual customer can also perform hands-on acceptance testing before the release if they want. In our organization, acceptance testing can be performed by 1) the QA representative (tester) in front of the customer representatives (e.g., a support department), 2) the customer representatives themselves with the help of QA, or 3) the customer representatives on their own.

P8 There are no formal guidelines, but the organization’s focus is on customer communication. Continuous customer feedback is collected, and response to the requested change is rapid. Acceptance tests are written by the developers and shared with the product owner when necessary.

P9 No formal policy on acceptance testing, but the management encourages employees’ feedback on process and practices.

P10 Different teams are working on different quality aspects of the system. There are no specific guidelines for acceptance testing; the selection of tools and practices depends on the nature of the feature.

P11 There are no formal guidelines on acceptance testing; it varies from project to project. However, the policy is that we should perform the acceptance testing near the release.

P12 Teams can choose different practices and tools depending upon the features they are working with; there is no organization-wide policy on acceptance testing. However, conformance to the specified standards for the product is strictly measured.

¹ PID: Participant ID.

Table 4.4: Acceptance test automation.

PID¹   Automation Focus

P1 Heavy focus on automation. At the lower level, 100% of the tests are automated. At the business level, acceptance tests are manual.

P2 Acceptance tests are manual. In total, manual tests make up 30 to 40% of all tests. There are no plans for acceptance test automation, but there is scope for automation for long-term maintenance.

P3 At our organization, we are keen on test automation, and we give high priority to the automation of acceptance tests. The developers write unit tests in C#, and all functional tests are automated at three levels (UI, API, DB).

P4 Strong inclination towards / development environment with Selenium for higher-level tests. But there are challenges in automation.

P5 We do not automate the acceptance tests until they become a part of the regression suite.

P6 Focus is to automate everything up to the functional testing level and the regression suites. Relying on manual acceptance tests will not be economically feasible going forward. Automation is key, but automation can only discover regression errors (intentional changes or not); it cannot uncover new paths (or at least not validate whether these paths are correct).

P7 Acceptance tests for new features are not automated, as there are many changes throughout the development, and automating them first will add to production costs. Writing acceptance tests takes a lot of time and effort. Changes contribute to the maintenance of automated suites. Acceptance tests should be automated when performing regression. In the case of new features, lower-level automation can be performed, as it helps in early bug detection.

P8 The focus is to automate all acceptance tests, but there are manual acceptance tests as well. For new features, acceptance tests are written by the developers using automation tools.

P9 The organization is positive towards test automation. Acceptance testing is automated for stable features, especially for UI related features.

P10 Our organization is inclined towards automation. Unit, functional, and integration tests are automated. Acceptance tests are manual due to the complexity of features.

P11 The organization has an overall positive attitude towards the automation of acceptance tests, as it reduces the time and effort required for acceptance testing. Almost 80% of acceptance tests are automated.

P12 There is an increasing trend towards automation, as it helps in early fault detection and speeds up regression testing. A small proportion of acceptance tests is automated, mostly for stable features.

¹ PID: Participant ID.

4.2.2 Acceptance test automation

Table 4.4 describes the state of acceptance test automation in the participants' organizations. Most of the participants revealed that, generally, their organizations are focusing on test automation. However, acceptance tests are not completely automated. Some participants believe that acceptance test automation is only possible for stable features and cannot be done for new features. These participants said that the automation of acceptance tests for new features is not possible because of various factors, including frequent changes and the complexity of features.

They are of the view that automation is viable and cost-effective for stable features only. A couple of participants stated that they automate the acceptance tests when these become part of the regression suite. Some organizations have automated tests at the lower level, while at the business level, acceptance tests are manual. Only a couple of organizations have a large part of their acceptance tests automated.

4.2.3 AT definition

All the practitioners who participated in this survey are working with Agile methodologies, and the majority of them are using Scrum to manage their software development process. Agile methodologies advocate early and frequent acceptance testing, for example, acceptance test-driven development [8]. While giving an overview of the acceptance testing process, the practitioners explained what acceptance testing is for them. According to most participating practitioners, acceptance testing is performed to evaluate the system's conformance to the written requirements/acceptance criteria, standards, and other regulations applicable to the domain. Table 4.5 presents verbatim statements of the participating practitioners.

Table 4.5: Practitioners' perspective of acceptance testing.

PID¹   Perspective of AT

P1 Acceptance testing can be performed at two levels, at a lower level and the business level. The lower level represents the developer’s perspective, and it is done as unit testing. The other level is business need-based, and it is in the context of the customer’s perspective on how customer needs are being fulfilled. The lower level acceptance tests are easier to create and implement and are less costly to maintain. As we go up towards the business level, this cost increases due to the dependencies in a large-scale organization. In a small organization, there are less prominent differences in AT at these two levels.

P2 The goal of acceptance testing is to assure the conformance to the product owner’s written requirements, conformance to standards, and any other regulations.

P3 The purpose of AT is to ensure a superior user experience in terms of user scenarios. (user satisfaction is measured through surveys)

P4 A product should be tested at two levels i) by the QA testers in the testing environment at the unit and functional level, ii) Then during the demo, the product should be tested by the product owner (initiator of the requirement)

P5 To test the conformance of the product with the specifications provided by the Business analyst/product owner.

P6 To test a product for conformance with the written acceptance criteria and defined standards and protocols.

P7 Acceptance testing is one of the final stages of testing, where the product is presented to the customer to validate if all the requirements are met, and the acceptance criteria for the product are satisfied.

P8 Testing from the perspective of customer representatives.

P9 Acceptance testing ensure the conformance of product to the customer’s needs and specified standards.

P10 Acceptance tests are feature level tests, which ensure that the product functions according to defined criteria and tests results are up to the standards.

P11 Acceptance testing is to ensure the conformance of the release to the acceptance criteria.

P12 Acceptance tests are performed to measure the conformance of the product to the specified standards.

¹ PID: Participant ID.

4.2.4 Practitioners' Perception of ATDD

Although the majority of the organizations are not practicing acceptance test-driven development (ATDD), most of the participants agree that ATDD would be helpful in terms of understanding and communicating requirements. Another important aspect highlighted by the practitioners is that acceptance tests can help to communicate domain knowledge only if they are written at a higher level; writing tests at a lower level will not bring benefits in practice. A couple of the participants were of the opinion that ATDD cannot be beneficial in their context. According to another perspective, the adoption of ATDD depends upon the context of the system.

Table 4.6: Participants' opinion on ATDD.

PID¹   Opinion on ATDD

P1 Yes, if business level acceptance tests are in context, writing them first will help in better understanding of requirements. If they are written at a lower level, writing them first will not matter.

P2 ATDD can be helpful for stable products, where the product is already functional and new features are being added. It is not effective in the case of products that are being newly developed.

P3 Yes, it helps to communicate requirements.

P4 Customer collaboration in early SDLC is useful, but the focus is to get the tasks accomplished. So the organization prefers the code first approach.

P5 No, I do not think that it will help in our context.

P6 Yes if the tests are written at the correct level.

P7 Writing the acceptance tests first can help communicate requirements, only if written at the correct level. Writing lower level tests first will not help understand the product.

P8 It depends on the context of the system. If the requirements change often, then it is important to make sure that an appropriate approach to acceptance testing is chosen, one that makes it easy to radically change the tests (i.e., Approval Testing).

P9 ATDD can help to communicate the requirements, but it is not practiced because the practitioners are hesitant to change the typical code-driven mindset. In my opinion, it requires extra time and effort.

P10 Acceptance criteria at the feature level, when clearly specified, help to better understand the context of the feature and support better test case design.

P11 ATDD helps to communicate/understand the requirements, we define the acceptance criteria in natural language before implementation.

P12 ATDD provides better knowledge sharing between testers and developers.

¹ PID: Participant ID.

For example, if the requirements change often, it is essential to make sure that an appropriate approach to acceptance testing is chosen, making it easy to change the tests radically. Table 4.6 summarizes the opinions of the participants regarding ATDD.

Table 4.7: Practitioners' perspective of AT practices (responses collected per participant, P1 to P12).

IPr¹   AT Practice

IPr1   The organization has a well-defined policy for acceptance testing
IPr2   Requirements are written in a formal domain-specific language
IPr3   Requirements are written in natural language
IPr4   Collaborative requirement refinement (at sprint start)
IPr5   Clear acceptance criteria are defined before development
IPr6   Conformance to acceptance criteria is considered a part of the definition of done
IPr7   Separate acceptance test suites are maintained
IPr8   Acceptance tests are written before the development
IPr9   Customer representation in acceptance test design
IPr10  Acceptance tests are written as natural language scenarios
IPr11  For new features, acceptance testing is performed manually
IPr12  Acceptance tests are automated for stable features
IPr13  Customer representation in acceptance test execution
IPr14  Acceptance testing results are stored
IPr15  Dedicated roles for acceptance testing
IPr16  Customer feedback loops are integrated in the requirement specification process for subsequent releases
IPr17  Several roles are involved in acceptance test design
IPr18  Technical roles are maintaining the acceptance tests

¹ IPr: Industry practice.

4.2.5 Acceptance testing practices in industry

Table 4.7 presents the practitioners' perception of how they are undertaking acceptance testing practices. Most of the participants said that they write requirements/user stories in natural language. Only two participants revealed that they use a formal domain-specific language to write user stories. Half of the participants revealed that they integrate customer feedback loops into the requirement specification process of subsequent releases. All the participating companies are writing acceptance test scenarios in natural language. Among the participating companies, only three define the acceptance criteria before the development, and their definition of done is conformance to the specified acceptance criteria. Five participants informed that they write the acceptance tests before the development. In some companies, acceptance tests are written in parallel with the development, and in a couple of companies, acceptance tests are written once the code is written. Half of the participants said they involve the customers or their representatives during acceptance test design. There are four participating companies where customers or customer representatives are involved during the subsequent steps of acceptance testing. In three companies, the user acceptance tester is a separate role, whereas, in most companies, several roles are involved in acceptance test design. There are multiple scenarios in this regard: in some companies, the product owner and the test teams are involved; in some cases, the development/test teams do this job; and in a couple of cases, business analysts and the development team perform the acceptance testing activities. In the cases where the product owner or customer representatives are involved, their role is limited to specifying the acceptance criteria and reflecting on the acceptance test design. According to the participants, acceptance tests for new features are written manually in all the participating companies. In some companies, acceptance tests are automated for stable features. For the automation of acceptance tests, companies are using open-source tools such as Selenium, the Robot Framework, and TextTest. At two companies, in-house developed tools are used for test automation. One practitioner also reported the use of SpecFlow. The majority of the companies are maintaining the acceptance test suites separately. However, a couple of companies perform acceptance testing only if the customer demands it; otherwise, they do not perform acceptance testing. In all companies, technical roles (e.g., developers, testers, QA engineers) are maintaining the acceptance tests. Five companies are maintaining/storing the acceptance test results for tracking and monitoring purposes; further, these results could be used for the optimization of acceptance testing. However, the lack of time (a challenge reported by most of the companies, see Table 4.8) hinders optimization activities.
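As an illustration of how the natural-language scenarios reported by the participants can be bound to automation, the sketch below uses Gherkin-style steps with the Python behave library (SpecFlow plays a similar role in the .NET setting mentioned by one practitioner). The feature wording, step names, and the shopping-cart logic are hypothetical examples, not scenarios collected in this study.

```python
# steps/cart_steps.py -- step definitions for a hypothetical feature file such as:
#
#   Feature: Shopping cart checkout
#     Scenario: Loyal customer gets a discount
#       Given a loyal customer with an order of 150 euro
#       When the customer checks out
#       Then the total to pay is 135 euro
#
from behave import given, when, then


@given("a loyal customer with an order of {amount:d} euro")
def step_given_order(context, amount):
    context.order_total = amount
    context.loyal_customer = True


@when("the customer checks out")
def step_when_checkout(context):
    # Call the (hypothetical) system under test here; a stand-in calculation is used.
    discount = 0.9 if context.loyal_customer and context.order_total > 100 else 1.0
    context.total_to_pay = context.order_total * discount


@then("the total to pay is {expected:d} euro")
def step_then_total(context, expected):
    assert context.total_to_pay == expected
```

In this style, the scenario text remains readable for customer representatives, while the step definitions are written and maintained by the technical roles, which matches the division of responsibilities reported by the participants.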

Table 4.8: Practitioners' perspective of AT challenges (responses collected per participant, P1 to P12).

IC¹   AT Challenge

IC1   Team members lack the domain knowledge
IC2   Lack of competence for test design
IC3   Designing and maintaining acceptance tests need a lot of time and effort
IC4   Time for optimization
IC5   Communication gap between the customer representative and technical roles
IC6   Collaboration with the customer
IC7   Limitation of acceptance testing tools
IC8   Lack of time for AT in general
IC9   Lack of AT budget
IC10  Inappropriate testing environment
IC11  Lack of standardized practice
IC12  Acceptance testing for various customizations
IC13  Appropriate test data for automated acceptance tests

¹ IC: Industry challenge.

4.2.6 Acceptance testing challenges in industry

Table 4.8 presents the challenges of acceptance testing that industry practitioners think are significant. The challenge reported by all the participating practitioners is collaboration with the customer, although different practitioners representing different companies have their own reasons. For instance, the customer's competence is highlighted as a challenge by the participants. There are two opinions about the customer's competence: lack of technical knowledge and lack of domain knowledge. Lack of the customers' domain knowledge is highlighted as a challenge by two participants of this study. One of them, who is working on a microservice-based product, mentioned that "the customer may not be aware of complete flow".

Figure 4.1: Collaboration challenges identified by the practitioners.

Lack of technical knowledge is specified by four participants (P1, P5, P10, and P14). According to them, most customers lack technical knowledge, and sometimes they do not have a deeper understanding of the system. Another reason for the lack of customer collaboration identified in the interviews is the lack of time. Four participants (P4, P5, P7, and P8) highlighted this issue. According to P7, "customer was not willing to give much time". While referring to the off-site customer, P4 suggested that "going to the customer takes extra time, pressing time limits to deliver the product make it difficult to get customer involvement." Lack of customer interest is also reported by four participants. They mentioned that the customers are only interested in the quality of the product and the test results; they are not willing to commit resources to the testing process. Another aspect that hinders customer involvement is the lack of motivation. P7 revealed that most of the external customers were enthusiastic at the start of the project, but after some releases, they felt less motivated to participate in the acceptance testing process. Figure 4.1 shows the customer collaboration challenges as identified by the participating practitioners.

The communication gap between the customer representative/product owner and the technical roles (development/testing teams) is highlighted as a challenge by most participants. The product owners are more focused on the business needs, whereas developers focus on the technical details of the product. In most cases, the product managers, who specify requirements and set timelines for the delivery, are unaware of technical complexities. About the communication gap, P1 stated that the gap between the product owner and the developer/tester perspective is a big challenge. Product managers who gather requirements and set timelines for the delivery mostly do not know the requirement's technical complexities. They only focus on the business value of the feature (i.e., whether they would be able to sell it or not). Similarly, P2 revealed that the gap between the product owner's and the development team's perspectives is a challenge. The product owners may not be aware of the actual scope or technical complexities of a story. This may lead to timeline issues and possible discoveries of some conflicting or impossible requirement refinements.

The majority of the participants highlighted the limitations of acceptance testing tools. For instance, the tools currently available are not mature enough to manage business-level requirements. These tools cannot process tests written in natural language, whereas business-level acceptance tests are mostly written in natural language. GUI-based tools lack maturity in many aspects. For example, GUI-based tests will fail if the name or location of a GUI element changes. Similarly, if the driver of the element being tested changes, the Selenium tests may fail, and end-to-end tests written using these tools are slow and inflexible. In the case of a change request, much maintenance is required. Another aspect is that writing acceptance tests in automated tools is very time-consuming and requires a proper skill set. Maintaining the acceptance test suites requires a continuous effort to ensure that they evolve with the product. It takes a lot of skill and time to ensure the quality of the test suite. Maintaining test quality is essential.
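The locator fragility the practitioners describe can be illustrated with a small sketch using the Selenium Python bindings; the URL and element identifiers are hypothetical, and the snippet is not taken from any participating company.

```python
# Minimal Selenium (Python bindings) sketch illustrating locator fragility.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
try:
    driver.get("https://example.test/checkout")

    # Fragile locator: bound to a concrete id. If the element is renamed or moved
    # in a redesign, this line fails even though checkout itself still works.
    checkout = driver.find_element(By.ID, "btn-checkout-v2")

    # A somewhat more robust alternative is a dedicated, stable test attribute
    # agreed with the developers, e.g.:
    #   checkout = driver.find_element(By.CSS_SELECTOR, "[data-test='checkout']")

    checkout.click()
    assert "Order confirmed" in driver.page_source
finally:
    driver.quit()
```

Agreeing on stable, test-dedicated attributes with the developers is one common way to reduce this kind of maintenance, although it does not remove the cost of keeping end-to-end suites aligned with a changing UI.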
Tests should be repeatable, without any flakiness, because flakiness contributes to verification cost. Acceptance test suite maintenance and optimization is a challenge for most participating companies. Participants reported that business-level acceptance tests are mostly manual, and there are no mechanisms to optimize these test suites. For example, it is challenging to identify redundant and obsolete acceptance tests written in natural language. The lack of standardized practices for writing manual acceptance tests is one of the reasons. The absence of appropriate documentation is also an issue in this regard, especially if the employee managing the acceptance suites leaves the organization. Time for optimization is another challenge. Practitioners reported that they do not have much time for optimization, as the product's timely delivery is more in focus.

Acceptance testing is a time-consuming, and thus costly, activity. It takes time to understand the system and find the appropriate test data. Lack of time for AT (acceptance testing) activities is another challenging aspect. For instance, P7 states that pressing time limits in an Agile environment, where you have to respond quickly to change, may lead to certain compromises in acceptance testing. Especially the quality of the test suite is affected, e.g., there could be some flaky tests, or some test paths are ignored. Some of the participants reported that the team members who design the acceptance tests lack domain knowledge. They stated that companies have only a few people who have the domain knowledge, and employee turnover contributes to this problem. Similarly, a lack of competence for test design is reported as a challenge by the participants. The challenge becomes even harder for automated acceptance test design, as it requires good programming skills, which QA people mostly do not have. Another essential issue in this regard is that the senior practitioners do not have time to learn new automation tools. To compensate for this, companies hire new people and train them for the purpose, but the dilemma is that most of them switch jobs very frequently.

Acceptance tests are high-level tests, which are resource hungry and require control over the execution environment. The success of any task is subject to a favorable environment; if the environment is not appropriate, it can lead to failure. An inappropriate acceptance testing environment is another challenge reported by the participants. In the case of third-party dependencies, there is a need to use a simulated environment. The simulated environment cannot mimic the actual environment, as actual test data is not available. Mostly, mocked-up data is used, which is unable to depict the actual user environment. In such a case, there is a high chance of producing false positives. This leads to another associated challenge of having appropriate test data for automated acceptance tests. It takes time to understand the system and to find the appropriate test data. It takes skill to design good-quality test cases with minimal test data and to make them easy to maintain and understand.

Acceptance testing for various customizations is highlighted as a challenge by a couple of participants. According to P6, managing acceptance tests for various customizations adds to verification and maintenance costs.
Similarly, P8 reported that the company puts much effort into making sure that the product does not deviate too much between customers. With multiple customers, you cannot just handle the requirements by driving everything from conversations with customers straight into acceptance tests, because you have to make decisions to protect the integrity of the product.

4.3 Similarities and differences in the two perspectives – RQ-3

This section presents the similarities and differences in acceptance testing practices and challenges between the research and industry perspectives.

4.3.1 Similarities and differences in AT practices

Figure 4.2 provides a comparative view of acceptance testing practices in research and industry. A few practices overlap in both perspectives. However, the extent of commonality is not at the same level for the overlapping practices (see Figure 4.3). Besides the overlapping practices, there are practices that are unique to one of the perspectives (i.e., literature or industry). The literature suggests some practices that are not being practiced in the industry. Similarly, some of the practices identified from the industry are not found in the literature. In the existing research, the majority of the authors referred to the use of FIT (Framework for Integrated Test) (ref: LPr6, Table 4.1). However, in the industry, I did not find the use of FIT for acceptance testing.


Figure 4.2: AT practices – similarities and differences in the two perspectives.

From the empirical studies, it is revealed that several roles are involved in acceptance test design (ref: LPr10, Table 4.1); my findings conform to this (ref: IPr17, Table 4.7). Having a dedicated role for acceptance testing is another practice that is common to the two perspectives (ref: LPr14, Table 4.1 and IPr15, Table 4.7). This practice is adopted in three participating companies, and it is mentioned in one study. The literature emphasises that the best practice is that the customer should specify the acceptance tests (ref: LPr4, Table 4.1). In the companies, customers do not specify acceptance tests exclusively. However, in some of the participating companies, customer representatives are involved in acceptance test design (ref: IPr9, Table 4.7). Collaboration with the customer during different phases of acceptance testing is emphasized in several studies (ref: LPr3, LPr4, LPr5, LPr12, and LPr14, Table 4.1). In my findings, though some companies reported that they do collaborate with the customer, all participants state that collaboration with the customer is a big challenge for the companies (ref: IC6, Table 4.8). The empirical studies in which authors present the state of acceptance testing practice also report that collaboration with the customer is a challenge. Regarding the specification of requirements, some of the empirical studies suggest using natural language, and some studies suggest using a domain-specific language (ref: LPr1, LPr2, Table 4.1). In the industry, most of the companies are using natural language for requirement specifications. Only two participants revealed that they are using a domain-specific language (Gherkin) for this purpose (ref: IPr2, IPr3, Table 4.7).

Figure 4.3: AT practices – extent of commonality in two perspectives.

The literature suggests that the customer should specify the acceptance criteria before the development (ref: LPr3, Table 4.1). In the interviews, three participants stated that they define the acceptance criteria before the development. Among these three, in one company the customer representative defines the acceptance criteria, and in the other two companies, it is defined by the technical roles. Furthermore, the literature emphasizes specifying requirements as executable acceptance tests (ref: LPr9, Table 4.1). During the interviews, no participant reported this practice. Regarding the running of acceptance tests, the literature indicates that anyone in the team, including the customer, can run the acceptance tests (ref: LPr12, Table 4.1). From the industry responses, the majority of the participants told us that the customer is not involved during the execution of acceptance tests; only four participants said that customer representatives are involved during acceptance test execution (ref: IPr13, Table 4.7). Regarding maintaining acceptance tests, a study suggests that the domain experts should maintain acceptance tests (ref: LPr13, Table 4.1). In the interviews with the practitioners, it is revealed that acceptance tests are maintained by the technical roles (ref: IPr18, Table 4.7).

4.3.2 Similarities and differences in AT challenges

Figure 4.4 presents an overview of the similarities and differences in acceptance testing challenges from the research and industry perspectives. It shows an overlap between the perspectives for the majority of the challenges, which implies that, at large, research and industry are on the same page regarding the acceptance testing challenges. However, the extent of commonality is not at the same level for the overlapping challenges (see Figure 4.5).


Figure 4.4: AT challenges—similarities and differences in two perspectives.

The most significant challenge is the collaboration with the customer, as it is mentioned in various included studies and highlighted by all participants during the interviews (ref: LC1, Table 4.2 and IC6, Table 4.8). Authors of different empirical studies elaborate that, in many cases, it is difficult for the customers to write requirements as acceptance tests [5, 27, 46]. The interview participants complement this fact and state the lack of the customer's technical knowledge as a challenge. Furthermore, it is reported in an empirical study [3] that in some cases the developers claimed to have better domain knowledge than the customers. Most of the practitioners who participated in the interviews are working in a market-driven development environment (ref: Table 3.5). They stated that in such a scenario, the customer cannot be a primary source of requirements, especially for new features. For this reason, the customer may not have enough domain knowledge to specify the acceptance tests. The lack of domain knowledge of the customers is highlighted as a challenge by the participants of this study. In the case of market-driven development, acceptance testing for various customizations is a challenge. The authors of an empirical study [16] stated that it becomes difficult to select representative customers in market-driven development without deviating from the actual product. During the interviews, some participants also identified the same challenge (ref: IC12, Table 4.8). Lack of AT budget (ref: LC4, Table 4.2) and the higher cost of acceptance testing are presented as challenges in the literature [20, 16].

Figure 4.5: AT challenges – extent of commonality in the two perspectives.

Three participants from medium-sized companies also referred to this challenge (ref: IC9, Table 4.8). They revealed that customers do not invest resources in acceptance testing in the case of bespoke projects. Similarly, the practitioners also stated that appropriate infrastructure and practitioners' training are essential for acceptance testing success. However, management is reluctant to invest in testing infrastructures and training. The limitations of acceptance testing tools are another challenge that is referred to in the literature (ref: LC10, Table 4.2) and highlighted by most of the interview participants (ref: IC7, Table 4.8). For instance, most of the existing acceptance testing tools require that the GUI must exist before the tests can be created. Moreover, GUI-based tests can break with minor changes in the user interface [20]. During the interviews, many practitioners voiced this issue. For instance, P1 states that GUI-based tools lack maturity in many aspects; these tests will fail if the name or location of a GUI element changes, and, similarly, if the driver of the component being tested changes, the Selenium tests may fail. P2 said that end-to-end tests written using these tools are slow and inflexible, and in case of a change request, much maintenance is required. P4 argues that writing acceptance tests in automated tools is very time-consuming and requires a proper skill set. P7 revealed that tests written in Selenium (for the UI) are fragile and may break in case of even a small change; if you have automated a non-stable feature, you may end up doing much maintenance.

Besides the similarities between the two perspectives, a few challenges are listed in the literature and not identified by the practitioners. Similarly, there are challenges identified by the practitioners which are not listed in the included studies. The challenges that could be referred to as literature-specific are i) aligning acceptance tests with changes in requirements, ii) poor quality of acceptance tests can lead to deceptive confidence, iii) lack of time for building infrastructure, and iv) testing of legacy systems. The unique challenges identified from the interviews are i) time for optimization, ii) communication gap between the product owner/customer and the development team, iii) lack of standardized practice, and iv) acceptance testing for various customizations.

Chapter 5 Analysis and Discussion

This chapter presents the analysis of the findings of this study. The purpose of this study was to investigate the perspectives of research and industry on acceptance testing, which revolve around the three questions presented in Chapter 3, Section 3.1. The study results are presented in Chapter 4, organized according to the research questions. In this chapter, the results are synthesized along three aspects of acceptance testing, i.e., by whom, when, and how acceptance testing is performed, in the context of research as well as in the industrial setting. I also reflect on the associated challenges and possible solutions.

5.1 Who is involved in acceptance testing

The literature suggests that acceptance tests should be written by the customer [5, 28], or in collaboration with the customer [29, 30, 31]. However, customer collaboration for acceptance testing is also presented as a challenge in the literature [4, 16, 10, 29, 30], as well as highlighted by all the interview respondents. As reported by most of the interview respondents, acceptance tests are designed, maintained, and executed by the technical roles, i.e., the developers, testers, or the QA team.

Acceptance tests intend to validate the system's conformance to its requirements [16], which implies that the requirements must be communicated effectively to the development team(s). In the context of a complex system, with several teams working on different modules, effective communication of domain knowledge becomes imperative to ensure that all the teams have a coherent view of the system. Most Agile companies have on-site customer proxies, such as product owners or product managers, to incorporate the customer's perspective in the development process.

However, the communication gap between the product owners and the development teams was highlighted as a challenge. It implies that there is a lack of effective mechanisms for communicating domain knowledge to the development teams. There can be several reasons for this gap, as identified by the interview respondents. These include i) the software development model being followed, e.g., a staged Agile approach where the requirement specification is still performed as a distinct activity, ii) the organizational structure, where the technical roles are not involved during the initial specification of requirements, iii) requirements being vaguely specified by the product managers, without clear acceptance criteria, and iv) the roles responsible for specifying and prioritizing requirements lacking technical knowledge.

The difference in perspective between product managers and technical roles, along with non-collaborative requirement refinement, can result in under-estimated requirement complexity and misunderstood requirements [21]. The literature highlights some essential acceptance testing practices, i.e., the customer specifies the acceptance criteria for each story, the customer/customer representative specifies acceptance tests, and the customer collaborates in acceptance test design. These practices are not incorporated in the industry at large. The evidence from the interviews shows that customer representatives specifying acceptance criteria was reported as a practice by 33% of the participants, whereas 16% of the participants report collaborative refinement of requirements at the start of the sprint. Similarly, customer representatives collaborate in acceptance test design in 50% of the companies. Most of the practitioners reported that product owners often lack the time and technical knowledge to specify the acceptance criteria/tests.

Regarding the execution of acceptance tests, the literature suggests that all the team members, including the customer, should be able to execute the tests [5, 16, 12]. In industry, by contrast, customer representation at the time of acceptance test execution is reported by only 33% of the respondents. In most cases, tests are executed by the technical roles (such as developers and testers), and the test results are communicated to the customer representatives.

The results from the interviews highlight a clear gap between the customer representatives and the technical roles. There is a need for effective mechanisms to ensure close collaboration between the domain-specific and technical roles at the time of requirement specification and acceptance test design. Minimizing ambiguity should be in focus right from the requirement specification process. For this purpose, requirements can be specified in a ubiquitous language [28], along with clearly defined acceptance criteria. When it comes to acceptance test design and execution, customer representatives and the technical roles should perform it as a collaborative activity.

5.2 When acceptance tests are written

Agile Software Development focuses on ensuring responsiveness to rapidly changing requirements. This can be achieved by integrating the requirements, design, implementation, and testing processes [30]. The existing literature regarding acceptance testing in Agile Software Development focuses on acceptance test-driven development [4, 35]. This approach promises several benefits, including knowledge transfer between customers and developers, a better estimation of the stories, and tracking of development progress [20]. From the interviews with the practitioners, three participants reported that acceptance tests are written after the implementation, four participants said that acceptance tests are written in parallel with code development, and five participants confirmed that acceptance tests are written before the development starts.

Moreover, when the practitioners were asked about their perspective on ATDD, nine participants were positive about this practice. Most of the participants agreed that ATDD could serve as a tool for better understanding the requirements, provided that the acceptance tests are defined at the correct level. Here, it is important to mention that writing the acceptance tests before development is one aspect of ATDD, whereas writing the requirements in the form of executable acceptance tests is another aspect of ATDD, as mentioned in the literature [4, 31, 30]. None of the interview participants reported specifying requirements as executable acceptance tests.

ATDD offers promising benefits, but hindered customer collaboration is also discussed as a potential side effect of this approach. For example, [20] suggests that developers will focus on just enough implementation to get the tests to pass without collaborating with the customer. Similarly, [5] emphasize that tests are meant to support communication, not replace it. It can be inferred from the interview results that acceptance tests, when written before development, can facilitate a better understanding of requirements, provided that they are defined at the correct level and that they complement customer collaboration rather than replace it.

5.3 How acceptance tests are written

There are several aspects of how acceptance tests are written. These include the artifacts used for acceptance test design, the level of system abstraction at which acceptance tests are written, the extent to which acceptance tests cover the requirements (verifying all user journeys), and the formats and tools used to specify acceptance tests. I reflect on these aspects from the industry and research perspectives.

The existing literature on acceptance testing in the Agile context suggests that requirements are written as executable acceptance tests. In such a case, acceptance tests become live documentation for the system and can be used as a tool to track the system's progress [31]. However, this does not reflect the state of practice. Customers and customer representatives mostly prefer to specify requirements in plain text, and requirements are mostly written in natural language, using several tools, such as JIRA, Microsoft Word, spreadsheets, TFS (Team Foundation Server), and Microsoft Visual Studio. Acceptance tests are derived from the requirements document. The tracking between acceptance tests and requirements is maintained manually. This practice conforms to the findings of Hotomski and Charrada [21]. In the case of manual tracking, aligning the acceptance tests with requirement changes becomes a challenge [21].

When it comes to the level of system abstraction at which acceptance tests are written, the literature suggests that acceptance tests are essentially high-level tests [16]. Some of the interview participants find it difficult to manage business-level needs in terms of technical-level needs. This difficulty is attributed to a lack of domain knowledge in the roles responsible for acceptance testing. It was mentioned that the development teams often have a narrow vision of the product, which makes it difficult to align the lower-level tests with the business-level acceptance tests.

Another aspect to be considered while writing the acceptance tests is the extent to which an acceptance test covers the requirements to verify all user journeys. The literature suggests that alternative paths and exceptional scenarios should be considered while designing the acceptance tests [17]. However, most of the interview participants mentioned that pressing time limits lead towards compromises in acceptance testing, and most of the time, acceptance tests are written only for the critical path.

Acceptance test automation is another significant aspect of acceptance testing. Automated acceptance tests are considered excellent for regression testing, as they allow for immediate handling of faults [30, 20]. The literature focuses on writing acceptance tests in an executable format, i.e., using FIT tables. From the interviews, it emerged that acceptance tests for new features are written as natural language user scenarios. One of the respondents mentioned that automated acceptance tests are also written for the new features, along with the manual exploratory tests. While discussing the organizations' attitude towards automation, most participants reported an increasing inclination towards automation. But some challenges of automation were also highlighted, such as lack of automation skills, limitations of acceptance testing tools, selection/generation of appropriate test data for automated tests, and limitations of the testing environments.
Moreover, most of the respondents consider acceptance tests for stable features as feasible candidates for automation. For new features, automated acceptance tests are not considered practicable for two reasons: i) the requirements may undergo several changes during the release, and ii) verification of several user journeys may imply testing along multiple paths. Both of these reasons contribute towards higher testing costs. It can be concluded that the automation of acceptance tests offers several potential benefits, such as reducing the time required for regression testing and shifting the focus of manual testing towards exploratory testing. However, the practice can be most useful when effective mechanisms are devised for the alignment of acceptance tests with the requirements and the code. One potential solution can be natural language processing (NLP) tools for automatically generating guidance for aligning acceptance tests to the requirements specified in natural language [37].
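As a minimal sketch of what such tool support could look like, the snippet below ranks existing acceptance tests by their textual similarity to a changed requirement using TF-IDF vectors and cosine similarity from scikit-learn. The requirement and test texts are invented for illustration, and the approach in [37] may differ; this is only one plausible building block.

```python
# Sketch: suggest candidate acceptance tests affected by a changed requirement,
# using TF-IDF similarity over their natural-language descriptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

acceptance_tests = [
    "Verify that a loyal customer receives a 10 percent discount at checkout",
    "Verify that an order confirmation email is sent after payment",
    "Verify that the cart is emptied after a completed purchase",
]

changed_requirement = (
    "Loyal customers shall receive a 15 percent discount on orders above 100 euro"
)

vectorizer = TfidfVectorizer(stop_words="english")
vectors = vectorizer.fit_transform(acceptance_tests + [changed_requirement])

# Similarity between the changed requirement (last row) and every acceptance test.
scores = cosine_similarity(vectors[-1], vectors[:-1]).flatten()
for test, score in sorted(zip(acceptance_tests, scores), key=lambda pair: -pair[1]):
    print(f"{score:.2f}  {test}")
```

Ranked suggestions of this kind would not replace manual review, but they could reduce the effort of keeping manually written acceptance tests aligned with evolving requirements.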

Chapter 6 Conclusions and Future Work

6.1 Conclusions

In this thesis, I have investigated the perspectives of research and industry on acceptance testing. To represent the research perspective, I have conducted a literature review of 20 research papers selected using a snowball search strategy. To identify the industry perspective, I have conducted an interview-based survey with 12 practitioners representing 10 companies. For the data analysis, I followed the thematic synthesis approach. From the results and the analysis discussion, I have concluded that the focus of most studies in the literature is ATDD and automated acceptance testing, whereas in the industry, at large, acceptance testing is manual. The interview results revealed that most of the companies do not have a well-defined acceptance testing strategy. The companies are keen on the automation of acceptance testing, but complete automation of acceptance tests is not feasible in the practitioners' opinion, and they have valid reasons for their claim: frequent changes in requirements, new features, an unsupportive environment, limitations of tools, and lack of time are some of the reasons the practitioners give. This fact signifies the need for researchers to focus on this aspect of acceptance testing, which currently receives very little attention. Regarding ATDD, the results show that most companies do not practice it, but practitioners think that ATDD could help understand requirements correctly, provided the product/system context allows for it. ATDD requires extensive customer involvement during the different phases of AT. However, customer collaboration is highlighted as a challenge by all the participating practitioners. Various authors also mention this challenge in the literature. Besides the collaboration between the customers and the companies, practitioners revealed that collaboration between several roles (e.g., domain experts and technical people) is also a challenge. From the analysis of the results, it is further concluded that there is a visible gap between the research and industry perspectives of acceptance testing practices. There are some commonalities as well; especially, many challenges are common to research and practice. Researchers should consider solutions to these challenges from the perspective of the industry.

6.2 Future Work

There is a need for industry-academia collaborative projects to reduce the gap between research and industry on acceptance testing. There are various aspects where researchers, in collaboration with industry practitioners, can contribute. As food for thought, here are some points from which such collaborations could be initiated:

1 Mechanisms need to be devised to improve collaboration between various roles in the companies.

2 Customer collaboration, especially for the bespoke project-based companies, needs to be further investigated.

3 Development of natural language processing tools which can generate executable acceptance tests from requirements written in natural language.

4 The customers’ perspective on acceptance testing needs to be investigated.

5 In the context of large-scale development, there is a need to devise strategies for the automation of tests for new features.

Bibliography

[1] Nasir Mehmood Minhas, Sohaib Masood, Kai Petersen, and Aamer Nadeem. A systematic mapping of test case generation techniques using UML interaction diagrams. Journal of Software: Evolution and Process, 32(6):e2235, 2020.

[2] Itti Hooda and Rajender Singh Chhillar. Software test process, testing types and techniques. International Journal of Computer Applications, 111(13), 2015.

[3] Børge Haugset and Geir Kjetil Hanssen. Automated acceptance testing: A literature review and an industrial case study. In Agile 2008 Conference, pages 27–38. IEEE, 2008.

[4] Johannes Weiss, Alexander Schill, Ingo Richter, and Peter Mandl. Literature review of empirical research studies within the domain of acceptance testing. In 2016 42nd Euromicro Conference on Software Engineering and Advanced Applications (SEAA), pages 181–188. IEEE, 2016.

[5] Grigori Melnik, Frank Maurer, and Mike Chiasson. Executable acceptance tests for communicating business requirements: customer perspective. In AGILE 2006 (AGILE’06), 12 pp. IEEE, 2006.

[6] Borge Haugset and Tor Stalhane. Automated acceptance testing as an agile requirements engineering practice. In 2012 45th Hawaii International Conference on System Sciences, pages 5289–5298. IEEE, 2012.

[7] Filippo Ricca, Massimiliano Di Penta, Marco Torchiano, Paolo Tonella, Mariano Ceccato, and Corrado Aaron Visaggio. Are fit tables really talking? In 2008 ACM/IEEE 30th International Conference on Software Engineering, pages 361–370. IEEE, 2008.

[8] Ming Huo, June Verner, Liming Zhu, and Muhammad Ali Babar. Software quality and agile methods. In Proceedings of the 28th Annual International Computer Software and Applications Conference (COMPSAC 2004), pages 520–525. IEEE, 2004.


[9] IEEE Standards Association et al. Systems and software engineering – Vocabulary. ISO/IEC/IEEE 24765:2010, pages 1–418, 2010.

[10] Itziar Otaduy and Oscar Díaz. User acceptance testing for agile-developed web-based applications: Empowering customers through wikis and mind maps. Journal of Systems and Software, 133:212–229, 2017.

[11] R Owen Rogers. Acceptance testing vs. unit testing: A developer’s perspective. In Conference on Extreme Programming and Agile Methods, pages 22–31. Springer, 2004.

[12] Grigori Igorovych Melnik. Empirical analyses of executable acceptance test driven development. PhD thesis, University of Calgary, 2007.

[13] Jeff Offutt and Paul Ammann. Introduction to software testing. Cambridge University Press, 2008.

[14] Roy Miller and Christopher T Collins. Acceptance testing. Proc. XP Universe, 238, 2001.

[15] Jung-Ah Shim, Hyun-Jung Kwon, HJ Jung, and Moon-Sung Hwang. Design of acceptance test process with the application of agile development methodology. International Journal of Control and Automation, 9(2):343–352, 2016.

[16] Grischa Liebel, Emil Alégroth, and Robert Feldt. State-of-practice in gui-based system and acceptance testing: An industrial multiple-case study. In 2013 39th Euromicro Conference on Software Engineering and Advanced Applications, pages 17–24. IEEE, 2013.

[17] Filippo Ricca, Marco Torchiano, Mariano Ceccato, and Paolo Tonella. Talking tests: an empirical assessment of the role of fit acceptance tests in clarifying requirements. In Ninth international workshop on Principles of software evolution: in conjunction with the 6th ESEC/FSE joint meeting, pages 51–58, 2007.

[18] Claes Wohlin. Guidelines for snowballing in systematic literature studies and a replication in software engineering. In Proceedings of the 18th international conference on evaluation and assessment in software engineering, pages 1–10, 2014.

[19] Daniela S Cruzes and Tore Dyba. Recommended steps for thematic synthesis in software engineering. In Proceedings of the International Symposium on Empirical Software Engineering and Measurement (ESEM), pages 275–284, 2011.

[20] Shelly Park and Frank Maurer. A literature review on story test driven development. In International Conference on Agile Software Development, pages 208–213. Springer, 2010.

[21] Sofija Hotomski, Eya Ben Charrada, and Martin Glinz. An exploratory study on handling requirements and acceptance test documentation in industry. In 2016 IEEE 24th International Requirements Engineering Conference (RE), pages 116–125. IEEE, 2016.

[22] Klaas-Jan Stol and Brian Fitzgerald. The abc of software engineering research. ACM Transactions on Software Engineering and Methodology (TOSEM), 27(3):11, 2018.

[23] Steve Easterbrook, Janice Singer, Margaret-Anne Storey, and Daniela Damian. Selecting empirical methods for software engineering research. In Guide to advanced empirical software engineering, pages 285–311. Springer, 2008.

[24] Kate Kelley, Belinda Clark, Vivienne Brown, and John Sitzia. Good practice in the conduct and reporting of survey research. International Journal for Quality in Health Care, 15(3):261–266, 2003.

[25] Colin Robson and Kieran McCartan. Real world research. John Wiley & Sons, 2016.

[26] Antonio Cavacini. What is the best database for computer science journal articles? Scientometrics, 102(3):2059–2071, 2015.

[27] Grigori Melnik and Frank Maurer. Multiple perspectives on executable acceptance test-driven development. In International Conference on Extreme Programming and Agile Processes in Software Engineering, pages 245–249. Springer, 2007.

[28] Shelly Park and Frank Maurer. Communicating domain knowledge in executable acceptance test driven development. In International Conference on Agile Processes and Extreme Programming in Software Engineering, pages 23–32. Springer, 2009.

[29] Filippo Ricca, Marco Torchiano, Massimiliano Di Penta, Mariano Ceccato, and Paolo Tonella. Using acceptance tests as a support for clarifying requirements: A series of experiments. Information and Software Technology, 51(2):270–283, 2009.

[30] Elizabeth Bjarnason, Michael Unterkalmsteiner, Emelie Engström, and Markus Borg. An industrial case study on test cases as requirements. In International Conference on Agile Software Development, pages 27–39. Springer, 2015.

[31] Shelly S Park and Frank Maurer. The benefits and challenges of executable acceptance testing. In Proceedings of the 2008 international workshop on Scrutinizing agile practices or shoot-out at the agile corral, pages 19–22, 2008.

[32] Rick Mugridge. Managing agile project requirements with storytest-driven development. IEEE software, 25(1):68–75, 2008.

[33] Rodrick Borg and Martin Kropp. Automated acceptance test refactoring. In Proceedings of the 4th Workshop on Refactoring Tools, pages 15–21, 2011.

[34] Gáspár Nagy. Improving efficiency of automated functional testing in agile projects. Zoltán Csörnyei (Ed.), page 74, 2012.

[35] Elizabeth Bjarnason, Michael Unterkalmsteiner, Markus Borg, and Emelie Engström. A multi-case study of agile requirements engineering and the use of test cases as requirements. Information and Software Technology, 77:61–79, 2016.

[36] Sofija Hotomski, Eya Ben Charrada, and Martin Glinz. Aligning requirements and acceptance tests via automatically generated guidance. In 2017 IEEE 25th International Requirements Engineering Conference Workshops (REW), pages 339–342. IEEE, 2017.

[37] Sofija Hotomski. Supporting requirements and acceptance tests alignment during software evolution. PhD thesis, University of Zurich, 2019.

[38] Forrest Shull, Janice Singer, and Dag IK Sjøberg. Guide to advanced empirical software engineering. Springer, 2007.

[39] Barbara A Kitchenham and Shari L Pfleeger. Personal opinion surveys. In Guide to advanced empirical software engineering, pages 63–92. Springer, 2008.

[40] Marcus Ciolkowski, Oliver Laitenberger, Sira Vegas, and Stefan Biffl. Practical experiences in the design and conduct of surveys in empirical software engineering. In Empirical methods and studies in software engineering, pages 104–128. Springer, 2003.

[41] Barbara Kitchenham and Shari Lawrence Pfleeger. Principles of survey research: part 5: populations and samples. ACM SIGSOFT Software Engineering Notes, 27(5):17–20, 2002.

[42] Nasir Mehmood Minhas, Kai Petersen, Jürgen Börstler, and Krzysztof Wnuk. Regression testing for large-scale embedded software development – exploring the state of practice. Information and Software Technology, 120:106254, 2020.

[43] Claes Wohlin, Per Runeson, Martin Höst, Magnus C Ohlsson, Björn Regnell, and Anders Wesslén. Experimentation in software engineering. Springer Science & Business Media, 2012.

[44] Per Runeson and Martin Höst. Guidelines for conducting and reporting case study research in software engineering. Empirical software engineering, 14(2):131, 2009.

[45] Jarbele CS Coutinho, Wilkerson L Andrade, and Patrícia DL Machado. Requirements engineering and software testing in agile methodologies: a systematic mapping. In Proceedings of the XXXIII Brazilian Symposium on Software Engineering, pages 322–331, 2019.

[46] Borge Haugset and Geir K Hanssen. The home ground of automated acceptance testing: Mature use of fitnesse. In 2011 Agile Conference, pages 97–106. IEEE, 2011.