Lasse Löytynoja

Applying Test Automation to GSM Network Element - Architectural Observations and Case Study

Master’s Thesis in Information Technology September 5, 2018

University of Jyväskylä

Faculty of Information Technology

Kokkola University Consortium Chydenius

Author: Lasse Löytynoja
Contact information: lasse.m.loytynoja@jyu.fi
Phone number: 045 2736112
Title: Applying Test Automation to GSM Network Element - Architectural Observations and Case Study
Työn nimi: Testausautomaation soveltaminen GSM-verkkoelementin ohjelmistotestauksessa - Arkkitehtuuriset havainnot ja tapaustutkimus
Project: Master’s Thesis in Information Technology
Page count: 64+0

Abstract: This thesis examines functional test automation and the potential risks that may lie in its architecture. The environment of the thesis is a component of the GSM network. The first chapter examines testing in general, test automation, and testing levels and techniques. Next, the structure of the GSM network, its components and working principles are presented. Finally, based on the literature and the author’s own work experience, the common problems in test automation are presented and detailed solutions are suggested for each one, with case examples.

Suomenkielinen tiivistelmä: Tutkielmassa tarkastellaan funktionaalisen testauksen automatisointia ja sen arkkitehtuurissa mahdollisesti esiintyviä riskejä. Toimintaympäristönä on GSM-verkon komponentti. Aluksi tarkastellaan testausta, testausautomaatiota, eri testaustasoja sekä tekniikoita. Seuraavaksi tutustutaan GSM-verkkoon, sen komponentteihin ja toimintaan. Lopuksi esitellään testausautomaatiossa mahdollisesti esiintyviä ongelmia kirjallisuuteen ja tekijän omaan työkokemukseen perustuen, sekä niiden ratkaisuja yksityiskohtaisin tapausesimerkein.

Keywords: GSM, Functional testing, Telecommunications, Test Automation
Avainsanat: GSM, funktionaalinen testaus, matkapuhelinverkot, testausautomaatio

Copyright © 2018 Lasse Löytynoja

All rights reserved.

Glossary

2G  Second generation of wireless telephone technology
A interface  The interface between BSS and NSS
Abis interface  The interface between BTS and BSC
Air interface  The interface between MS and BTS
Ater interface  The interface between BSC and transcoder
BCF  Base Control Function
BTS  Base Transceiver Station
BSC  Base Station Controller
BSS  Base Station Subsystem
CEPT  Conférence européenne des administrations des postes et des télécommunications
CI  Continuous Integration
DX 200  The digital switching platform of Nokia Networks
EDGE  Enhanced Data rates for GSM Evolution
ETSI  European Telecommunications Standards Institute
E-GSM  Extended GSM
Gb  The interface between BSS and SGSN
GPRS  General Packet Radio Service
GSM  Global System for Mobile Communications
HLR  Home Location Register
IMEI  International Mobile Equipment Identity
IMSI  International Mobile Subscriber Identity
LA  Location Area
mcBSC  Multicontroller BSC
MML  Man-Machine Language
MS  Mobile Station
MSC  Mobile-Services Switching Centre
MSISDN  Mobile Station International ISDN Number
MSRN  Mobile Station Roaming Number
NSS  Network Switching Subsystem
O&M  Operations & Maintenance
OSS  Operations Subsystem
P-GSM  Primary GSM
PCM  Pulse Code Modulation
PSTN  Public Switched Telephone Network
R-GSM  Railway GSM
SGSN  Serving GPRS Support Node
SMS  Short Message Service
SSH  Secure Shell
TDMA  Time Division Multiple Access
Testware  Software tooling used in testing
TMM  Testing Maturity Model
TMSI  Temporary Mobile Subscriber Identity
V-model  A software development process model
VLR  Visitor Location Register

Contents

Glossary i

1 Introduction 1

2 Software Testing 3
  2.1 Software testing overview 3
    2.1.1 Software quality 3
    2.1.2 Errors and Defects 5
    2.1.3 Test Activities 6
    2.1.4 Software Testability and Instrumentation 8
  2.2 Software Testing Levels 8
    2.2.1 The V-Model 9
    2.2.2 Acceptance testing 9
    2.2.3 System Testing 11
    2.2.4 Integration Testing 11
    2.2.5 Unit Testing 11
    2.2.6 Other Testing Levels 11
    2.2.7 Regression Testing 12
  2.3 Software Testing Techniques 13
    2.3.1 Black and White Box Testing 13
    2.3.2 Other testing techniques 13
    2.3.3 Fault based techniques 14
    2.3.4 Usage-based techniques 15
    2.3.5 Model-based techniques 15
    2.3.6 Techniques based on nature of application 15
  2.4 Software Test Automation 15
    2.4.1 Requirements of the organization 18
    2.4.2 Continuous Integration 19
  2.5 Software Testing Maturity Model 20
    2.5.1 Maturity levels and goals 21

3 The GSM Network 24
  3.1 History 24
  3.2 Present and future 26
  3.3 Technology 27
  3.4 Mobile Station 32
  3.5 Base Station Subsystem 32
    3.5.1 Base Transceiver Station 33
    3.5.2 Base Station Controller 33
  3.6 Network Switching Subsystem 35
  3.7 Operations Subsystem 37

4 Nokia implementation and test process 38
  4.1 DX 200 platform 38
  4.2 Base Station Controller 39
    4.2.1 Software 42
    4.2.2 User interfaces 42
  4.3 Operations & Maintenance Functional Testing of the BSC Software 43
  4.4 Holistic Integration Tester 44
  4.5 Test Automation Framework 45
  4.6 Test Automation Process 45
    4.6.1 Test automation infrastructure 46
    4.6.2 Extending the framework 48

5 The Suggested Structure of Automated Test Cases 49
  5.1 Problems faced with test automation 49
    5.1.1 Test-case portability 49
    5.1.2 Script-to-script dependencies 51
    5.1.3 Test Setup and Teardown Handling 53
    5.1.4 Test libraries and user interface abstraction 56
    5.1.5 Complexity and Execution time of test cases 58
  5.2 Chapter Summary 59
  5.3 Supporting findings in another project 60

6 Summary 61

References 62

1 Introduction

The objective of this master’s thesis is to recognize the special features and problems in test automation of Nokia’s 2G BSC (base station controller) O&M functional testing environment. The goal is to suggest solutions to the issues and use them as a basis for developing a streamlined, common architecture for automated test cases, which can be used to improve their consistency and maintainability for the testing organization. The need for this kind of architecture emerged during the author’s own work as a testing engineer on the subject. The task was to introduce test automation to the testing organization and to train and support colleagues in its use. The author created a basic architecture for the initial ramp-up trainings, but when all requirements were finally clear, it proved to be insufficient and needed further development. This resulted in tedious test code refactoring work, in which all implemented test cases were reworked to use the new model, which would provide the benefits of transportability and better maintainability. Even so, without the support and boundaries of the basic architecture, the result would have been a large mass of unmaintainable, non-transferable and frequently duplicated test code. This thesis aims to provide a more complete and well-motivated architecture.

The theoretical part is based on literature, some industrial reports on the same type of test automation environments, and some training materials. It describes software testing and test automation, the GSM network structure and operation, and finally the network elements of the base station subsystem. Much of the background also comes from the author’s own working experience on the subject during 2012-2015, the last position being a thesis worker for the Tampere unit. This thesis is made at the request of Nokia Networks. The main question this thesis attempts to answer is:

• What are the requirements for automating O&M functional testing of the GSM base station controller? Can the automated test cases benefit from a common architecture, and what would it look like?

The supporting questions are:

• What are the factors that affect the maintainability of the automated test case?

• What are the factors that affect the execution time of the automated test cases?

• What are the factors that affect the transportability of the automated test case?

The research on software testing and test automation is done mainly by reading academic and other professional literature. Information about test automation and the BSS domain at Nokia is obtained from having worked with the product for several years. The architectural plan is supported by the theoretical background of testing and by reviewing academic literature and case studies of similar testing environments.

Chapter 2 defines software testing and software quality. We will take a look at errors and defects and basic test activities. Test metrics and software testability are also discussed. The chapter also introduces Boehm’s V-model and the basic levels of software testing, describes regression testing and its role and costs in software development, and explains the different software testing techniques. The latter subchapters define test automation and its role in modern software development, and briefly introduce the software testing maturity model and the associated maturity levels. Chapter 3 traces the history of the GSM network from past to present and describes the basic structure of the network, with a focus on the base station subsystem components. Chapter 4 describes the Nokia implementation of GSM networks, the DX 200 platform and the structure of the base station controller. The chapter also provides an overview of the functional O&M testing of the base station controller. Chapter 5 describes the issues faced during the test automation process and the suggested solutions. Chapter 6 summarizes the thesis.

2 Software Testing

This chapter contains the definition of software testing and software quality in general. The first subchapter briefly reviews verification and validation, as well as errors and defects in software and their common causes. Common test activities and the factors that make software more testable are also discussed. In the second subchapter, software testing levels are described with the support of the V-model. The third subchapter describes software test automation and discusses what it requires from the testing organization. It also introduces a foundational practice called continuous integration. The fourth subchapter discusses the software testing maturity model, which aims to define the level of maturity of a software testing process in an organization.

2.1 Software testing overview

In this subchapter, software quality and software errors and defects are defined. We will also take a look at test activities, software testability and the related instrumentation.

2.1.1 Software quality

Software quality is an ambiguous term, and there is a growing demand for software testing as a means of achieving it. The following definition of software quality is given by [7]: "Software quality is achieved by conformance of all requirements regardless of what characteristics is specified or how requirements are grouped or named". The stakeholder value that the software produces, along with its price and lead time and how "fit for use" it is, could also serve as simplified quality descriptors. In general, software quality refers to the desirable characteristics of the product and the level at which these are attained, but it also includes the processes and the tools that are used to reach them. There are many definitions of software testing as well as many misconceptions, resulting in poor testing performance. Some of the most common misconceptions are:

• Testing is the process of demonstrating that errors are not present.

• The purpose of testing is to show that a program performs its intended func- tions correctly.

• Testing is the process of establishing confidence that a program does what it is supposed to do.

These statements are broad, but also a bit thin. IEEE describes software testing as follows:

• "The process of operating a system or component under specified conditions, observing or recording the results and making an evaluation of some aspect of the system or component [20]."

• "Software testing consists of the dynamic verification that a program provides expected behaviors on a finite set of test cases, suitably selected from the usu- ally infinite execution domain [7]."

Four keywords emerge from the definition of [7], providing some basic insight into software testing: dynamic, finite, selected and expected. Dynamic verification means that the software or system under test may be complex and nondeterministic and may therefore give a different output for the same input, depending on the system state. Finite means that the set of test cases must be delimited, as theoretically even simple programs have so many possible test cases that exhaustive testing could take a very long time to complete. Test case selection is affected by the test technique in use, e.g. code-based techniques or techniques specific to the nature of the application. The expertise and intuition of the software engineer may also play a big part in the selection process. Defining the exact expected outcome is not always easy, but it must be done in order to have rational test results. The outcome must be compared against specifications, requirements or other acceptance criteria.

Software testing is about increasing the quality of the software by using a well-defined process. It is pervasive over the life cycle of the software, starting from the early stages of the requirements process and being developed and refined throughout the actual software development, until the maintenance phase of the software.

The importance of software testing is reflected in various past incidents. According to James Gleick [14], a small deficiency in the testing phase can have expensive

consequences, as with the Ariane 5 rocket. On its maiden voyage, carrying four satellites, the software in the guidance system tried to convert a 64-bit number into a 16-bit number, which resulted in an overflow error. This caused the guidance system to shut down, and the redundant system did not help, as it was running the same faulty software. After 39 seconds of flight, the self-destruct mechanism destroyed the rocket. Ultimately the price tag for this 10-year project and its equipment was US$7 billion. There are numerous other examples in the history of software development.
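The kind of conversion defect behind the Ariane 5 failure can be illustrated in miniature. The following Python sketch is purely illustrative: the actual guidance software was not written in Python, and all function names and the sensor value here are invented.

```python
# Illustrative sketch only: converting a value that exceeds the 16-bit
# signed range corrupts it silently if the conversion simply keeps the
# low 16 bits, whereas a guarded conversion fails loudly.

def to_int16_unchecked(value: float) -> int:
    """Keep only the low 16 bits and reinterpret them as a signed integer."""
    bits = int(value) & 0xFFFF
    return bits - 0x10000 if bits >= 0x8000 else bits

def to_int16_checked(value: float) -> int:
    """Raise instead of silently overflowing, as a range guard would."""
    result = int(value)
    if not -32768 <= result <= 32767:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return result

horizontal_bias = 65536.0 + 1234.0   # hypothetical sensor value, too large

assert to_int16_unchecked(100.0) == 100             # small values survive
assert to_int16_unchecked(horizontal_bias) == 1234  # large value silently corrupted
```

A test suite exercising only small input values would never reveal the unchecked variant's defect, which is exactly the "untested combination of input values" failure mode discussed in the next section.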

2.1.2 Errors and Defects

According to [20], an error in software is when a computed, observed or measured value or condition differs from a true, specified or theoretically correct value or condition. Although the terms "error", "fault", "defect" and "failure" may be used interchangeably in practice [8], Burnstein discusses the differences between the terms. The source of an error is the software developer (this includes software engineers, programmers, analysts and testers), who has made a mistake or has had a misconception or misunderstanding during the development process. A fault, also called a bug or a defect, is the result of an error. It means incorrect behaviour of the software, a situation where the functionality of the software fails to adhere to its specification. A failure of software, in turn, is the "inability of software system or a software component to perform its required functions within specified performance requirements" [20]. According to Whittaker [33] and Burnstein [8], the source of a software defect could be any of these:

• The user executed untested code.

• The statements in the code were executed in a different order from what would be "normal".

• An untested combination of input values is fed to the software.

• The software was executed in a previously untested environment.

In [8], Burnstein classifies defects into four categories: requirements and specification defects, design defects, implementation defects and testing defects.

The requirement and specification defects are generally due to incorrect, ambiguous, incomplete or even missing descriptions of functionalities. These kinds of defects are often encountered in faulty interactions of separate software features (e.g. saving and subsequent categorization of customer data in a customer relationship management system), in user interfaces (e.g. graphical user interfaces) or in machine interfaces (e.g. APIs, application programming interfaces).

Design defects are flaws in the design of algorithms, control logic and sequences, data elements, module interface descriptions and external software/hardware/user interface descriptions. If the design description is not detailed enough to include a pseudocode presentation of the major control and data flow structures, many defects classified as design defects should instead be classified as implementation defects.

The sources of implementation defects are purely errors in code implementation. An example of an implementation defect would be a comparison between inappropriate data types or a faulty data type conversion, as described in section 2.1.1. These may be closely related to design defects, if pseudocode is used to describe the design. Moreover, misunderstandings or mistakes of software developers, or a lack of communication, education and understanding of the tools or languages used, may be the sources of this kind of defect.

The actual software product is not the only place where defects can be found. Testing defects refers to defects that are found in test cases, test plans, test harnesses and test procedures. Defects of this type include, for example, the incorrect or incomplete design of test cases. Some testing levels require the development of additional test harness code in order to be able to carry out the testing. This harness code is subject to similar defects as any other software product and should also be carefully tested.
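The implementation-defect class mentioned above, a comparison between inappropriate data types, can be sketched in a few lines of Python. The scenario and all names are invented for illustration: a value read from a text-based interface arrives as a string, but the code compares it as if it were a number.

```python
# Hypothetical implementation defect: an alarm threshold check where the
# reading arrives as text from a terminal-style interface.

def alarm_limit_reached_buggy(reading: str, limit: int) -> bool:
    # Defect: string comparison is lexicographic, so "9" > "10".
    return reading > str(limit)

def alarm_limit_reached_fixed(reading: str, limit: int) -> bool:
    # Convert to the appropriate type before comparing.
    return int(reading) > limit

assert alarm_limit_reached_fixed("9", 10) is False   # 9 is not above the limit
assert alarm_limit_reached_buggy("9", 10) is True    # wrong verdict from the defect
```

The defect is invisible for many inputs (e.g. "5" vs. 10 gives the right answer by accident), which is why such faults often survive into later testing levels.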

2.1.3 Test Activities

Testing activities are parts of the testing process. The exact name and content of the activities vary from one organization to another, yet the tasks are usually similar despite the conventions. In [12], the activities are defined as a sequential process, which is described in Figure 2.1.

The identification phase determines the test conditions: what can be tested and which testing technique should be used. The test conditions are also generally prioritized in this phase. A test condition refers to an item or an event that can be verified by a test. A testing technique refers to a systematic way of obtaining a test

Figure 2.1: The testing activities [12].

result, such as boundary-value analysis. The testing techniques are discussed further in Chapter 2.3 [12].

In the design phase, it is determined how the test is to be conducted. A test case will specify all steps or actions that are needed to reach the given objective. The test cases produced in this phase contain all prerequisites, input values and expected outcomes in a sequential manner [12].

The build phase consists of building the artifacts that are required for test execution. These are test scripts, test data sets, and possible input and output definitions if the tooling requires them. An important part, particularly as regards this work, is the preparation of the test environment hardware. It can mean the initialization of a database, loading a new software build into a system, or configuration of the system software and hardware to a sufficient level for the test at hand [12].

In the execution phase, the software under test is executed using the test cases as a guide. This can take place automatically or manually. In manual execution, those responsible for the test execution follow the instructions of the test case and observe the results. In automated execution, the test instructions are executed by a script. The execution can also be a mix of the two, i.e. semi-automatic testing. This method either aims to improve efficiency by automating parts of the execution, or the manual intervention is mandatory due to changes, e.g. to hardware, which must be made manually [12].

In the comparison phase, it is confirmed that the actual outcome of the test is correct and conforms to the requirements of the test case. If the result is correct, the test case is considered passed; if not, the test has failed and the chances are that a defect in the software may have been found. Further investigation should take place to identify the source of the error [12].
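The build, execution and comparison phases described above can be sketched as one toy automated test case. Everything here is invented for illustration: the dictionary-based "database" stands in for a real system under test, and the IMSI and MSISDN values are arbitrary.

```python
# A minimal sketch mapping the activity phases of [12] onto one
# automated test case. All names and values are illustrative.

def build_environment():
    """Build phase: initialize the test environment and test data."""
    return {"subscribers": {}}

def execute(db, imsi, msisdn):
    """Execution phase: drive the system under test and collect output."""
    db["subscribers"][imsi] = msisdn
    return db["subscribers"].get(imsi)

def compare(actual, expected):
    """Comparison phase: produce a verdict against the expected outcome."""
    return "PASS" if actual == expected else "FAIL"

db = build_environment()
actual = execute(db, imsi="244051234567890", msisdn="+358401234567")
verdict = compare(actual, "+358401234567")
assert verdict == "PASS"
```

In a real framework, each phase is typically a separate, reusable layer (fixtures for build, keywords or steps for execution, assertion libraries for comparison), which is exactly the separation the suggested architecture of Chapter 5 builds on.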

2.1.4 Software Testability and Instrumentation

In [19], Huang states: "Normally, a computer program is designed to produce outputs that are useful only to its users. If test execution of the program produces an incorrect result, it is very significant to the tester in that it attests directly to the presence of faults in the program, and possibly includes some clues about the nature and whereabouts of the faults. On the other hand, if it produces a correct result, its significance to the tester is rather limited. The only conclusion the tester can draw from a successful test execution is that the program works correctly for that particular input. It would be beneficial to the tester if we could make a program to produce additional information useful for fault-detection and process-management purposes." Applications of this kind of instrumentation include:

• Test coverage measurement. As described in section 2.3.2, this kind of measurement is obtained, for example, by placing counter variables in the branches of the control flow and checking the counter values after the test. Non-zero values in all counters mean that all paths of the control flow have been traversed at least once.

• Test case effectiveness assessment. This means the effectiveness or capability of a test to reveal a fault in a program. It can be measured by computing a singularity index, which requires a thorough inspection of every expression in the program. In other words, effective test cases e.g. verify multiple facets of an object instead of a single one and therefore have better chances of finding faults.

• Assertion checking. By setting assertions at specific points in the source code of a program, where the conditions are supposed to always be true in normal operation, a violated assertion is a certain indication of a fault in the program.
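The first and third instrumentation ideas above can be sketched together in a few lines of Python. The clamp function, the counter names and the precondition are all invented for illustration.

```python
# Sketch of instrumentation: a counter in each branch for coverage
# measurement, and an assertion on a condition that must always hold.

branch_hits = {"then": 0, "else": 0}

def clamp(value, low, high):
    # Assertion checking: this precondition should hold in normal
    # operation; a violation indicates a fault in the caller.
    assert low <= high, "precondition violated: inverted range"
    if value < low:
        branch_hits["then"] += 1   # coverage counter for this branch
        return low
    else:
        branch_hits["else"] += 1   # coverage counter for this branch
        return min(value, high)

clamp(-5, 0, 10)
clamp(7, 0, 10)

# Non-zero values in every counter show that both branches were
# exercised at least once by the test inputs above.
assert all(count > 0 for count in branch_hits.values())
```

Real tools (coverage analyzers, contract libraries) insert this kind of instrumentation automatically rather than by hand, but the principle is the same.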

2.2 Software Testing Levels

This subchapter defines the different software testing levels and shows their correspondence to the software requirements with the help of the V-model.

2.2.1 The V-Model

The V-model (Figure 2.2) is a software process model that was originally introduced by Boehm in 1979 [6]. It is most commonly used in more formal environments and where embedded software is produced that runs on hardware devices. The V-model merely implies a flow, where each phase of the development can be implemented based on the detailed documentation produced in the previous phase [21]. It may be considered an extension of the traditional waterfall process model [3].

The V-model also shows how test activities may and should begin in parallel with development activities. The requirements development phase produces user requirements, from which the acceptance tests are derived. System tests are based on the functional requirements, and the integration tests on the system architecture. At the lowest level are unit tests, which are usually carried out in parallel with design and implementation activities.

By starting the test activities at the very beginning of the project (requirements), defects should be found earlier. For example, [34] suggests that a problem in requirements may take 30 minutes to evaluate and fix, but if the same issue is found in the system testing phase, it might take 5 to 17 hours to fix. It is clear that valuable project hours are saved if the testing process is started early.

2.2.2 Acceptance testing

According to [7], acceptance testing determines whether the software system satisfies the acceptance criteria. These criteria are usually the business-level requirements of the software system. In [3], it is also stated that users and other individuals who take part in acceptance testing should have strong domain knowledge of the software system. Reflecting on the author’s own experience in the software and telecommunications industry, this is true: domain knowledge is vital with complex, interoperating systems and their acceptance and functional testing. As stated in [7], "The customer or a customer’s representative thus specifies or directly undertakes activities to check that their requirements have been met, or in the case of a consumer product, that the organization has satisfied the stated requirements for the target market. This testing activity may or may not involve the developers of the system."

9 Figure 2.2: The V-model

2.2.3 System Testing

In [7], system testing is defined as follows: "System testing is concerned with testing the behavior of an entire system. Effective unit and integration testing will have identified many of the software defects. System testing is usually considered appropriate for assessing the nonfunctional system requirements such as security, speed, accuracy, and reliability. External interfaces to other applications, utilities, hardware devices, or the operating environments are also usually evaluated at this level." In addition, in [3] Ammann and Offutt state that "This level of testing usually looks for design and specification problems. It is a very expensive place to find lower-level faults and is usually not done by the programmers, but by a separate testing team."

2.2.4 Integration Testing

Integration testing assesses software with respect to the subsystem design. This means verifying that the interfaces between modules communicate correctly. The development team usually has the responsibility for integration testing [3]. In [7], integration testing is defined as follows: "Integration testing is the process of verifying the interactions among software components. Classical integration testing strategies, such as top-down and bottom-up, are often used with hierarchically structured software."

2.2.5 Unit Testing

Unit testing is testing done on the "smallest possible testable software component" [8]. A unit therefore refers to a class or a method of a class, which can be tested in isolation from other software. Usually the software developer who implemented the code is the one who writes and conducts the unit tests [7].
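A minimal illustration of such a unit test, using Python's standard unittest module, might look as follows. The function under test and all names are invented; the point is only that one small unit is exercised in isolation from any other software.

```python
import unittest

def parse_msisdn(raw: str) -> str:
    """The unit under test: strip separators from a phone number."""
    return "".join(ch for ch in raw if ch.isdigit() or ch == "+")

class ParseMsisdnTest(unittest.TestCase):
    def test_separators_removed(self):
        self.assertEqual(parse_msisdn("+358 40 123-4567"), "+358401234567")

    def test_empty_input(self):
        self.assertEqual(parse_msisdn(""), "")

# Run just this test case in isolation, without touching any other code.
suite = unittest.TestLoader().loadTestsFromTestCase(ParseMsisdnTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the unit has no dependencies on the rest of the system, these tests can run in milliseconds on a developer's machine, which is what makes the unit level the cheapest place to find implementation defects.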

2.2.6 Other Testing Levels

There are several other testing levels; the demand for these depends to a large extent on the type of software. Other typical testing levels may consist of:

• "Performance testing verifies that the software meets the specified performance requirements and assesses performance characteristics - for instance, capacity and response time [7]."

• User interface and usability testing. User interface testing verifies the integrity

and functionality of the user interfaces of the software. Usability testing evaluates how easy it is for users to learn and use the software [7].

• Security testing confirms that the tested software meets the given security requirements. A formal description found in [7] is "Security testing is focused on the verification that the software is protected from external attacks. In particular, security testing verifies the confidentiality, integrity, and availability of the systems and its data. Usually, security testing includes verification against misuse and abuse of the software or system (negative testing)."

2.2.7 Regression Testing

As described by Burnstein in [8], regression testing is not a testing level, but a "retesting of software that occurs when changes are made to ensure that the new version of the software has retained the capabilities of the old version and that no new defects have been introduced due to the changes". The name implies the main reason the tests are executed: after changes have been made to the software, be it work needed to introduce a new feature or to fix a defect, there should be no regressions in the functionality of the existing software. Regression tests usually contain a subset of the test cases designed for the software, across several testing levels. Regression testing is considered important, as error corrections tend to introduce more errors into software than the originally implemented code did [22].

Regression testing is considered the most expensive testing of a software product. The costs of regression tests can be as high as 80% of the total testing expenses. The high cost of regression testing has motivated efforts to reduce it, e.g. by using test set selection and minimization techniques [29] and test automation to reduce the labor effort involved. While test automation is highly utilized in modern software production, everything cannot be automated (especially in the context of this thesis, the world of telecommunications equipment); therefore the prioritization and selection methods are still valid techniques for reducing the regression testing load. More in-depth research on prioritization and selection methods is beyond the scope of this thesis.
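The test set selection idea mentioned above can be sketched very simply: run only the regression tests whose covered modules intersect the modules changed in the new build. The mapping and all test and module names below are invented for illustration; real selection techniques use finer-grained coverage data.

```python
# Toy sketch of change-based regression test selection.

test_to_modules = {
    "test_call_setup":   {"radio", "signaling"},
    "test_sms_delivery": {"signaling", "billing"},
    "test_alarm_report": {"o_and_m"},
}

def select_tests(changed_modules):
    """Pick every test whose covered modules intersect the change set."""
    return sorted(
        name for name, modules in test_to_modules.items()
        if modules & changed_modules
    )

# A change touching only "signaling" selects two of the three tests.
assert select_tests({"signaling"}) == ["test_call_setup", "test_sms_delivery"]
assert select_tests({"o_and_m"}) == ["test_alarm_report"]
```

The savings come directly from the skipped tests; the risk is an inaccurate test-to-module mapping, which would silently skip a test that should have run.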

2.3 Software Testing Techniques

To detect as many defects in the software as possible, a systematic approach to testing is needed. A random selection of test inputs is considered to be the least effective method of achieving this [7][22]. This chapter describes common testing techniques and their characteristics.

2.3.1 Black and White Box Testing

Black-box testing (also known as data-driven or input/output-driven testing) refers to a testing technique where the internals of the program are unknown. In black-box testing, the test cases rely only on the input and output behavior of the software, and the test data used are determined by the software specification. A fault found in black-box testing can be seen as a symptom of a problem, and the root cause might be more difficult to identify than in white-box testing.

White-box testing (also known as glass-box or logic-driven testing) is based on the assumption that the internal structure of the software is known. The test data are determined by the design and logic of the software, but should also take the specification into account. In white-box testing, the source of a defect might be easier to find, as the lines of faulty code are known.
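The contrast between the two techniques can be sketched on a trivial function. The example is illustrative only: black-box tests use nothing but the specification ("return the absolute value"), while the white-box tests are chosen with knowledge of the branch structure.

```python
def absolute(x):
    # Internal structure: one branch for negative input, one for the rest.
    if x < 0:
        return -x
    return x

# Black-box tests: input/output pairs taken purely from the specification,
# with no knowledge of how absolute() is implemented.
assert absolute(-3) == 3
assert absolute(5) == 5

# White-box tests: one test per branch of the known control flow,
# deliberately including the edge between the branches.
assert absolute(0) == 0    # non-negative branch at its boundary
assert absolute(-1) == 1   # negative branch just past the boundary
```

For this tiny function the two test sets overlap almost completely; in real software the white-box view typically reveals branches the specification never mentions.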

2.3.2 Other testing techniques

The more detailed testing techniques presented in [7] are classified by how the tests are generated: from the software engineer’s intuition and experience, the specifications, the code structure, the real or imagined faults to be discovered, predicted usage, models, or the nature of the application.

Ad-hoc and exploratory techniques. This category consists of ad-hoc and exploratory testing. Ad-hoc testing relies on the software engineer’s skill, intuition and experience with similar kinds of programs, and is probably the most practiced technique. In ad-hoc testing the tester can exercise or improvise cases and scenarios that are difficult to come up with by means of more formalized methods. Exploratory testing is considered a subset of ad-hoc testing. In exploratory testing there are likewise no predefined test cases or a test plan. The software engineer’s knowledge of and familiarity with the application determine the effectiveness of the testing. Exploratory testing highlights the concept of "simultaneous learning, test

design and test execution".

Input domain-based techniques. These techniques consist of equivalence partitioning, pairwise testing, boundary value analysis and random testing. In equivalence partitioning, the input domain is divided into subsets (equivalence classes) based on a specific criterion or relation. The selection criterion may, for example, be based on control flow or on sets of accepted and not accepted input values. Pairwise testing is a combinatorial testing technique, and gathers sets of interesting input value pairs across the input domain. Boundary value analysis is based on the rationale that most faults are to be found at the boundaries, the extreme ends of input value ranges. In random testing, the input values are selected randomly from a known range of input domains.

Code-based techniques. Code-based testing techniques consist of control flow-based and data flow-based criteria. Control flow-based coverage criteria consider all statements, blocks of statements and combinations of these as paths. Exhaustive testing of the paths is not usually possible, but the coverage of visited statements, conditions and branches is measured, and the percentage is used as a metric for this type of testing. Data flow-based testing uses the control flow as well, and annotates the flow graph with variables and their definition, usage and undefinition. The all definition-use paths criterion requires that each variable is defined and goes through all control flow possibilities until the definition of the variable is used. This criterion is considered the strongest, but it might result in large numbers of paths; all-definitions and all-uses are its weaker alternatives.
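The input domain-based techniques above can be sketched for a hypothetical parameter that accepts values from 1 to 255 (say, an invented timer setting); the stand-in system under test and all values are illustrative.

```python
# Sketch of equivalence partitioning and boundary value analysis
# for a hypothetical parameter with the valid range 1..255.

LOW, HIGH = 1, 255

def accepts(value):
    """Stand-in for the system under test: valid iff inside the range."""
    return LOW <= value <= HIGH

# Equivalence partitioning: one representative per class
# (below the range, inside the range, above the range).
partitions = {-10: False, 128: True, 300: False}

# Boundary value analysis: values at and just beyond each boundary,
# where most faults are expected to cluster.
boundaries = {LOW - 1: False, LOW: True, HIGH: True, HIGH + 1: False}

for value, expected in {**partitions, **boundaries}.items():
    assert accepts(value) == expected
```

Seven inputs thus stand in for the 2^64-sized integer input space: three class representatives plus four boundary values, which is the whole economy of the technique.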

2.3.3 Fault-based techniques

Fault-based techniques use a fault model, i.e. a set of likely or predefined faults. The techniques comprise error guessing and mutation testing. In error guessing, the most anticipated faults are used as the basis of test cases. The software engineer’s expertise and the history of the product/project serve as the setting for the guesswork. Mutation-based testing uses an altered version of the software (produced by a small syntactic change) in parallel with the original version. By executing a test against the differing versions, a difference in behavior is expected. It is assumed that small faults are coupled to bigger ones and can be found using this technique. The technique also relies on automatic generation of large numbers of differing "mutant" versions.

2.3.4 Usage-based techniques

Usage-based techniques consist of operational profile testing and user observation heuristics. In operational profile testing, the software is tested in an operating environment that reproduces the real one as closely as possible. The inputs for the software simulate its actual use, and its reliability is evaluated under these conditions. User observation heuristics use methods such as cognitive walkthroughs, field observations, user questionnaires and interviews to assess the usability problems of graphical user interfaces in controlled conditions.

2.3.5 Model-based techniques

Model-based techniques consist of decision tables, finite state machines, formal specifications and workflow models. For finite state machines, "by modeling a program as a finite state machine, tests can be selected in order to cover the states and transitions." Formal language specifications can be automatically transcribed into functional test cases and the corresponding results. TTCN-3, which is used specifically in the testing of protocols in telecommunication systems, is an example of this type of language. In workflow models, activity sequences of humans or software are recognized as workflows. Tests target both typical and alternative workflows.
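Transition coverage over a finite state machine can be sketched as follows. The call-handling states and events are invented for illustration and are not taken from any GSM specification; the point is that enumerating the model’s transitions yields one test case per transition.

```python
# A toy state machine: (state, event) -> next state.
# States and events are hypothetical, chosen only to illustrate
# transition-coverage test selection.
transitions = {
    ("idle", "dial"):         "connecting",
    ("connecting", "answer"): "connected",
    ("connecting", "hangup"): "idle",
    ("connected", "hangup"):  "idle",
}

def step(state, event):
    """Return the next state, or None for an undefined transition."""
    return transitions.get((state, event))

# Transition coverage: generate exactly one test case per transition.
test_cases = [
    (src, event, dst) for (src, event), dst in transitions.items()
]

for src, event, expected in test_cases:
    assert step(src, event) == expected

print(f"covered {len(test_cases)} of {len(transitions)} transitions")
```

For larger models the same enumeration is typically extended to paths (sequences of transitions), which is where tool support becomes necessary.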

2.3.6 Techniques based on nature of application

Techniques that are based on the nature of the application focus on the software’s characteristics, such as service-oriented software, web software or embedded software. These characteristics may be used to supplement the previously mentioned techniques, or vice versa.

2.4 Software Test Automation

Software testing is a resource- and time-consuming activity, which can use up to 80% of the development costs. Test automation aims to reduce the effort needed for adequate testing, or to increase the amount of testing that can be executed within a limited amount of time. Test automation has become one of the central concepts in modern software development.[35][16] The basic building blocks of test automation are test cases, test

sets/scenarios and the testware used for test execution and reporting. When a test case is automated, it is turned into a machine-executable script or another executable form, which contains definitions of test steps, usually calls to an interface or methods provided by a test automation framework. Test automation can be semi-automatic, meaning scripts that assist in the execution of test cases, or fully automatic, where test cases execute without human intervention. Automated testing usually follows the continuous integration (CI) paradigm, which is introduced in section 2.4.2. According to [12], the benefits of test automation consist of:

• Ability to execute regression tests automatically against new versions of the software. See the paragraph on continuous integration, where this is discussed in detail.

• Ability to run more tests in a fixed time. By using automated testing, the time needed for testing is reduced. This leads to running increased numbers of test cases and greater confidence in the system.

• Ability to perform tests that would be difficult or impossible to do manually. Some testing, like testing a web-based system in regular intervals with 200 concurrent users, would be next to impossible to arrange with test staff. This type of testing is a great fit for automation, where machines can be leveraged to simulate user load.

• Better use of resources. Tasks that are repetitive and dull in nature, such as the constant entering of the same input values, should be automated. For tasks of this kind, automation produces greater accuracy and frees human resources for other work.

• Better consistency and repeatability in testing. Automated tests are executed in identical fashion in each test run, which leads to consistent and repeatable results. The level of consistency gained is hard to reproduce manually. If the same tests are executed on different hardware or system configurations, the results may give valuable insight into differences or possible bottlenecks in the system.

• Possibility to reuse tests. Compared to manual tests, automated tests have the same design costs, but also additional costs for the automation. Automation pays off through reuse of the tests, with reduced execution time, in consecutive software releases, and through possible portability between different test setups.

• Faster time-to-market. When the initial implementation of automated tests is complete, the test execution time will be substantially shorter. This reduces the time it takes for a new version of the software to enter the market.

• Increased confidence. An extensive set of automated test cases increases the confidence and trust in a software release.

In the same literature, the negative factors are considered to be:

• Too high overall expectations. Emphasis is put on the tooling instead of on the effort needed to build and maintain a successful test automation regime.

• Poor testing practices do not improve by using test automation. Practices con- sist of the organization, documentation and quality of the actual tests. If au- tomation is built on poor quality, the result is poor quality delivered faster.

• Expecting that automated test cases find many new defects. Test cases might find defects in their initial design and execution phase, but in regression the defects are usually reflections of source code modifications.

• Assuming that the software does not contain defects if the automated test suite does not find any. It must be considered that even the tests can contain defects or be otherwise incomplete.

• Underestimating the maintenance effort needed to keep test cases up to date. Changes in software may require updates in automated test cases.

• Technical and interoperability issues between test automation products. Test automation tools themselves can contain defects that may severely affect the test automation regime. Some old closed-source tooling might be irreplaceable and lack support.

The difference should be highlighted between verification by a skilled tester and checking by an automated test case. Even small-sized software can have many input combinations, and the tester’s experience is needed to carefully select the test cases that should find most of the defects in the time frame allotted for the testing activities. The tester is able to decide whether the outcome of a test case is correct for the given situation, and can react and adapt to possible changes during execution. In contrast, test automation tools generally compare a set of test outcomes

to the expected ones, and are blind to what is happening around the checked value. This kind of checking should be distinguished from the verification that a skilled tester performs. According to [12], test automation can provide results in two ways: either it can limit the effort needed for testing activities, or it can increase the number of tests that can be carried out within a limited period of time. Reducing effort is mainly accomplished by automating mundane manual tasks. Test automation can be applied to almost any testing phase, but some phases benefit from it more than others. Test-driven development (TDD), behavior-driven development (BDD) and acceptance test-driven development (ATDD) are all methodologies that encourage developers to first write tests for a software feature before implementing it. In TDD the tests are low-level, such as unit or integration tests. In BDD and ATDD the tests are usually more high-level, such as functional or acceptance tests. The main idea is that once the feature is properly implemented, the previously failing tests will pass, and the feature has an automatically runnable regression test in place.
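The TDD cycle can be sketched with a minimal example. The conversion function and test names are invented for illustration; in a real TDD workflow the test class below would be committed first, fail against a stub implementation, and only then would the feature code be written.

```python
# A minimal TDD-style sketch using the standard unittest framework.
# `celsius_to_fahrenheit` is a hypothetical feature, not from the thesis.

import unittest

def celsius_to_fahrenheit(c):
    # In TDD, this body is written only after the tests below exist
    # and have been seen to fail (e.g. against a stub raising
    # NotImplementedError).
    return c * 9 / 5 + 32

class TestConversion(unittest.TestCase):
    # Once the feature passes, these tests remain in place as an
    # automatically runnable regression suite.
    def test_freezing_point(self):
        self.assertEqual(celsius_to_fahrenheit(0), 32)

    def test_boiling_point(self):
        self.assertEqual(celsius_to_fahrenheit(100), 212)

unittest.main(exit=False, argv=["tdd_sketch"])
```

In a CI setting, a runner like this is invoked on every commit, so a regression in the feature immediately breaks the build.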

2.4.1 Requirements of the organization

The research article [28] by Persson and Yilmaztürk gathers common pitfalls of test automation implementations from the literature. It reports on the implementation of test automation at ABB, and the attempt to recognize and avoid these pitfalls. Moreover, it evaluates the varying effects of the pitfalls on the organizational level. While the previous section listed the benefits and negative factors of test automation in general, this article focuses on factors of test automation at the organizational level.

• In order to implement test automation, organizational maturity should be at a level where it can handle process improvements in a structured manner. The concepts and terminology of test automation and automated regression testing should be clear, and communicated effectively throughout the organization.

• Test teams working on test automation need well-rounded skill sets in programming, testing and project management, along with technical skills such as networks and databases. The need for competence in these areas may be underestimated. It is also mentioned that the project knowledge gained during test automation efforts should be secured, and that the staff should not be made up entirely of external resources.

• The importance of testware (software tooling used in test automation) reusability, repeatability, maintainability and modularity is highlighted. Attention should be paid to guidelines for test script development, as a lack of guidelines endangers the reusability, repeatability, and maintainability of the testware.

• The needs for test result reporting should be discussed thoroughly within the organization and with the stakeholders. It is easy to create reports that contain useless information, and in which the useful information is not in the best possible format. The article states that "It was a big gap between the first internal project proposal, and the final report template". If not automated, test reporting creates additional and often unconsidered effort.

2.4.2 Continuous Integration

Continuous integration (figure 2.3) is an integral part of today’s software development, enforcing the use of the TDD/BDD/ATDD development methods (described in 2.4). The name "continuous integration" originates from a practice in the Extreme Programming (XP) development process. By developing code using the before-mentioned practices, it is ensured that the implemented code is covered by a set of tests. These tests act as a "safety net" when the existing codebase is modified, i.e. for a change in a feature or a refactoring.

Let us consider a case where a developer has been working in a test-driven way and wants to commit his work into the version control system. First, it must be ensured that the local working copy passes the tests, by running them in his personal development environment. After the local tests pass, the code can be merged into the main branch of the project in the version control system. The "continuous integration" part happens when this commit automatically triggers a build, containing the freshly committed code and running a larger number of tests, thus integrating the change and ensuring that the commit does not break any other part or functionality of the build.

There are other recommended practices for successful continuous integration. When a build breaks, i.e. a test or tests fail, the most important task is to fix the build. This ensures that the main branch always remains healthy, and that no developer deviates too far from it with his local working copy. To achieve this, first, the commits should not be big and monolithic, but preferably smaller and incremental, to reduce problems with integration. Second, as brought up earlier, in order for continuous integration to work, the version control should have a mainline branch, and development should be made to this main branch; some reasonable branches, for example for bug fixes, are acceptable. Third, the build should be automated, and the tests should be part of the automated build process. Fourth, every commit should build the mainline, and the build process should be fast, in order to get instant feedback on a possibly broken build [13].

Figure 2.3: Continuous integration

2.5 Software Testing Maturity Model

There are several process models that are used as frameworks in assessing and improving the software development process (including the testing process) in organizations. In this chapter, we will review a process model that concentrates solely on improving the testing process in organizations, i.e. the testing maturity model (referred to as TMM from this point). TMM was developed by Burnstein et al. in 1999 [8], and it is based on the Capability Maturity Model (CMM), which in turn was based on a study conducted in the U.S. Department of Defense. CMM is a staged architectural model, where the architecture prescribes stages that an organization must proceed

through in a given order to improve their software development process. TMM uses several features from the CMM and is staged in a similar fashion [9]. The five stages in TMM are called "maturity levels". Each level has a description that can be used to assess the current state of the process maturity and the goals that should be pursued in order to reach the next maturity level in an "evolutionary path".

2.5.1 Maturity levels and goals

As stated in the overview, there are five maturity levels of the testing process, which indicate the testing capability of the organization (illustrated in figure 2.4). The first level is an unmanaged, chaotic, ad-hoc type of testing; from there the organization can evolve step by step to the fifth level, where the test process and infrastructure are well-defined, managed and measurable, and can then be fine-tuned and optimized. Each maturity level contains maturity goals, which in turn contain maturity subgoals. These subgoals determine the accomplishments needed for a particular level, along with scopes and boundaries. An organization adapts to the goals by committing to the required activities, tasks and responsibilities. In the coming subsections, each of these levels with their respective properties will be reviewed in more detail.

Level 1 - Initial At maturity level 1 there is no established testing process, and testing merely resembles debugging. A specification of the software does not exist, and there is a lack of resources, tools and properly trained staff. The testing activities attempt to show that the software works, and the testing and implementation work are combined. There are no maturity goals at level 1.

Level 2 - Phase definition At maturity level 2, testing is recognized as a separate phase that follows the implementation phase. Test planning and design may take place after completion of the implementation, as the completed code is seen as the basis for the test activities. Possible defects that occur in the specifications are only found in the testing phase. Basic testing levels and techniques are used. In level 2, the maturity goals are to develop testing and debugging goals, initiate a test planning process, and institutionalize basic testing techniques and methods.

Level 3 - Integration At maturity level 3, testing is integrated into the software life cycle.
Test planning activities start in the requirements phase, not after the implementation phase as in level 2. The goals of level 2 have been attained and the

Figure 2.4: Life cycle of the five maturity levels of TMM with their respective goals [8].

goals of level 3 have been implemented. There is a software test organization that oversees and controls the testing, along with technical training that focuses on testing. The test review program is not yet present at this level. In level 3, the maturity goals are to establish a software test organization, establish a technical training program, integrate testing into the software life cycle, and control and monitor testing.

Level 4 - Management and Measurement At maturity level 4, measurement and quantification are part of the testing process. Reviews cover the entire software process and complement the actual testing. Test cases are stored in a database, and defects are properly logged in another database with severity levels. Software is also tested for quality attributes, like maintainability, reliability and usability. In level 4, the maturity goals are to establish an organization-wide review program, a test measurement program, and software quality evaluation.

Level 5 - Optimization, defect prevention, quality control At maturity level 5, the testing process is solidly defined and managed. The testing process can be optimized and further fine-tuned. The cost and effectiveness of the testing can be monitored. Defect prevention is practiced along with quality control. In level 5, the goals are defect prevention, quality control and test process optimization.

3 The GSM Network

This chapter examines the history of GSM and addresses the basic standardization and technology aspects. Furthermore, the base station subsystem components and their roles are defined.

3.1 History

Before GSM, there were several different mobile cellular networks: NMT in Scandinavia, AMPS in North America and TACS in Europe. France, Germany and Italy had also implemented their own national systems. This fragmentation of technologies caused high costs of mobile devices and network equipment, and made Europe-wide roaming impossible. In 1993, Western Europe had a population of 300 million but only 7.4 million cellular phone users. As pointed out in a contribution by the Netherlands at the CEPT conference in June 1982, there was a significant risk that unless a concerted action for a common European mobile system was started quickly, the 900 MHz band would be taken over by incompatible national systems. That would mean that the chance to build a Europe-wide system in the 20th century would be lost. Initially CEPT proceeded with the harmonization of existing national systems, but it was soon agreed that instead of harmonization, there should be a more future-proof solution based on new technology. The Groupe Spécial Mobile (GSM), with members from European mobile network operators, was formed and started the work on the new mobile solution.[18] The work of GSM started in 1982; the intention was that the new system would be available in the early 1990s. The work was based on a strategy and action plan from CEPT, which set out the following requirements for the system:

• To be able to share the 900 MHz spectrum with the present analog systems

• Support for mobile stations and their free circulation

• Mobile stations should be able to operate in all European countries

• High spectrum efficiency

• Telephony as main service, but to offer attractive non-voice services

• Strong security support

As the workload of the GSM group increased, several independent subgroups were established. A structure for specifications, the system architecture and the concept of services were created. The security group and the digital radio transmission group had their concepts trialed and evaluated. The work done from 1984 to 1987 resulted in a future-proof basic parameter set for the GSM system. The parameters were advanced but could be implemented with existing technologies.[18]

The first set of features from the CEPT action plan was selected and specified for GSM Phase 1. It contained the radio subsystem, core network and security architecture, along with the Subscriber Identity Module (SIM). It offered telephony, international roaming, call barring and call forwarding services. In 1989 the GSM group was transferred to the newly established European Telecommunications Standards Institute (ETSI). The actual work remained unchanged, but this allowed manufacturers to take part in it as ETSI members. At the same time, the UK requested a mobile cellular network that would work in the 1800 MHz band.[18]

GSM Phase 2 started in 1991. It would improve the existing functions from GSM Phase 1 and specify new functions that were omitted in the previous phase. New content in GSM Phase 2 included data service enhancements, supplementary services, a half-rate speech codec and the 1800 MHz system, to be known as GSM1800. Moreover, test specifications for type approval tests of mobile stations were created. It was obvious that the GSM evolution would not stop with Phase 2; confirmation of future extension possibilities was also important, as well as compatibility with the specifications of GSM Phase 1. In 1995, the GSM Phase 2 specifications were frozen.[18]

In 1993 Nokia contributed to the field of GSM development with a one-page document titled "GSM in a Future Competitive Environment".
This document stated that by 1994, there would be GSM Phase 1 networks in 40 countries, and that the current cross-phase compatibility mechanisms would prevent the further development of GSM. The main message was that until the next decade and the upcoming third-generation mobile network systems, GSM should remain competitive and offer better speech quality and improved system capacity, and that the GSM standard should therefore be enhanced by 1997-1998. In a workshop held in Helsinki in 1993, it was

realized that GSM Phase 2 should be considered a platform with true potential for evolution.[18]

GSM Phase 2+ fixed the cross-release compatibility issues. It provided new services and features, such as support for ISO Unicode as the second character set in SMS. The Adaptive Multi-Rate (AMR) codec was put to use to further improve the speech quality. The age of the mobile internet dawned with the Wireless Application Protocol (WAP) and High Speed Circuit Switched Data (HSCSD), which provided a maximum of 42 kb/s of user data transfer. Higher speeds would be essential, and so the General Packet Radio Service (GPRS) was standardized, requiring large changes in the GSM radio subsystem and two new core network elements, the Serving GPRS Support Node (SGSN) and the Gateway GPRS Support Node (GGSN). GPRS enabled data transfer speeds up to 100 kb/s, shared by multiple users. Even faster data rates were to be achieved with Enhanced Data rates for GSM Evolution (EDGE). By re-using the GSM radio channel structure and TDMA framing with new modulation and coding techniques, data rates could be as high as 384 kb/s.[18]

The first GSM call in a commercial network was made on July 1, 1991. The network was owned by the Finnish operator Radiolinja, which now operates under the name Elisa. The network was built by Telenokia and Siemens. From there, the number of global GSM subscribers grew steadily until the peak of 2010-2013.

In the time period of 1996-2000, the idea for a third generation of cellular networks was conceived. It was based on GSM core network evolution and a new radio subsystem. To achieve a global working structure, as was also needed for GSM, the Third Generation Partnership Project (3GPP) was created. 3GPP has existed since 1998 and has developed specification releases for 3G and 4G (LTE) systems.[18]

3.2 Present and future

Today, more than 90% of the world’s population is covered by GSM services [15]. Based on estimates available in ’s Traffic Exploration Tool, in 2011 the amount of subscriber data traffic exceeded voice traffic. The growth of data transfer is going to be significant in the coming years, as shown in figure 3.2. The customer demand for being "always connected" and being able to stream, for example, high-quality music and especially high-bandwidth video requires the cellular network to deliver a high volume of data per user. Clearly, GSM with its data transfer abilities is not the most suitable technology for this kind of task. Estimates of GSM and LTE + WCDMA subscribers are displayed in figure 3.1.

Figure 3.1: Estimate of global GSM and LTE + WCDMA subscribers from 2010 to 2020 [10].

The "GSM sunset" is becoming reality, and some major operators shut down their GSM networks in 2017 [4][30]. The available frequency spectrum is highly regulated and expensive, and operators refarm the spectrum to match the demand for mobile broadband. Still, there are services like GSM-R and many IoT/M2M solutions, such as remote telemetry and security solutions, that use GSM as a communication channel, and the impact especially on IoT/M2M is considerable.

3.3 Technology

The 3GPP specification defines a total of fourteen frequency bands for GSM, ranging from 380 MHz to 1900 MHz. The GSM bands used in Finland and the rest of Europe are GSM 900 (900 MHz) and DCS 1800 (1800 MHz), whereas the bands used in North and South America are GSM 850 (850 MHz) and PCS 1900 (1900 MHz). The frequency spectrum allocation per band varies by standard. For GSM 900, there are three different standards: P-GSM, E-GSM and R-GSM. The uplink and downlink frequency spectra of P-GSM are both 25 MHz wide, containing 124 physical channels. The total radio frequency spectrum allocation of P-GSM is 50 MHz.

Figure 3.2: Global smartphone traffic, data vs. voice [10].

For E-GSM, the uplink and downlink frequency spectra are 35 MHz wide and the total spectrum usage is 70 MHz. R-GSM uses 39 MHz uplink and downlink spectra, totaling 78 MHz.

Figure 3.3: Frequency channels in GSM 900 (E-GSM) [26].

A GSM band is divided into frequency channels; E-GSM, for example, contains 124 channels (uplink and downlink, figure 3.3). The channel width is 200 kHz. The first and last 200 kHz of a band are not used, for safety reasons. The channel access method that GSM uses on the air interface is a combination of frequency division multiple access (FDMA), which means the frequency separation inside a band, and time division multiple access (TDMA), which means that traffic is contained inside a TDMA frame consisting of timeslots. Timeslots are referred to as the physical channels of the air interface. The FDMA-TDMA structure is shown in Figure 3.4. A TDMA timeslot can be a traffic channel, a control channel or both. These channel types are called the logical channels of GSM.

• Traffic channels A traffic channel can be either a full-rate or a half-rate channel. In the former, the speech data rate is 13 kb/s. The half-rate principle is that a single user’s speech data is carried only in every other TDMA frame, thus doubling the capacity. Due to improved codecs, the quality of speech does not suffer significantly from this.

• Control channels Control channels are used when a mobile station (MS) enters or leaves the network, in the tracing of the MS, and in the initiation, maintenance and termination of calls. The high-level separation of control channels is into broadcast channels (BCH), common control channels (CCCH) and dedicated control channels (DCCH). These contain multiple associated channel types for different purposes, such as synchronization, paging, connection management and authentication.

Figure 3.4: GSM uses a combination of TDMA and FDMA [26].
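The 200 kHz FDMA channel raster described above maps channel numbers (ARFCNs) to carrier frequencies with simple arithmetic. The sketch below follows the commonly cited GSM 900 formula (890 MHz uplink base, 45 MHz duplex spacing); consult the 3GPP radio specifications for the authoritative channel tables.

```python
# Uplink/downlink carrier frequencies for GSM 900 channel numbers.
# The constants follow the widely documented GSM 900 raster; this is an
# illustrative sketch, not a substitute for the 3GPP specification.

def gsm900_frequencies(arfcn):
    """Return (uplink_mhz, downlink_mhz) for a GSM 900 ARFCN."""
    if 0 <= arfcn <= 124:           # P-GSM channels 1-124 plus E-GSM channel 0
        uplink = 890.0 + 0.2 * arfcn
    elif 975 <= arfcn <= 1023:      # E-GSM extension band
        uplink = 890.0 + 0.2 * (arfcn - 1024)
    else:
        raise ValueError("not a GSM 900 ARFCN")
    return uplink, uplink + 45.0    # downlink lies 45 MHz above uplink

print(gsm900_frequencies(1))    # first P-GSM channel
print(gsm900_frequencies(124))  # last P-GSM channel
```

The 0.2 MHz step is the 200 kHz channel width discussed above, and the 124-channel count of P-GSM follows directly from the 25 MHz band divided by that step (with guard channels excluded).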

The GSM network is also able to transfer data. The basic data transfer form is circuit switched data (CS data), which uses the normal timeslots and has a maximum data transfer rate of 9.6 kb/s. The General Packet Radio Service (GPRS) provides a faster means of data transfer. GPRS and its successor EDGE transfer data as bursts, as opposed to the contiguous timeslot reservation of CS data, and therefore enable simultaneous data transfer for multiple users. The data transfer rate per user can be increased by assigning multiple timeslots to the data. The maximum peak data transfer rate, with all timeslots assigned to data transfer and using the most efficient channel coding algorithms, is 158-171 kb/s. GPRS also allows operators to bill based on the amount of transferred data, instead of on connection time. GPRS requires changes in the GSM network, such as packet control unit (PCU) functionality in the BSC and additional

network elements, the serving GPRS support node (SGSN) and the gateway GPRS support node (GGSN). Enhanced GPRS (EDGE) is based on GPRS technology and provides faster data transfer rates of up to 384 kb/s, although the achieved rate again varies with the number of timeslots used. The increased data rates are obtained by using more efficient channel coding and multiple carriers, thus expanding the number of timeslots used. EDGE is considered "2.5G", an additional step between the 2nd and 3rd evolutions of the mobile network.[2][1][27]

Figure 3.5: Frequency reuse in cellular network [26].

3.4 Mobile Station

The mobile station (MS) consists of the physical equipment, such as the radio transceiver, display and digital signal processors, and the subscriber identity module (SIM) card. It provides the air interface to the user in GSM networks. As such, other services are also provided, which include:

• Voice teleservices

• Data bearer services

• Supplementary service features

The MS also provides the receptor for short message service (SMS) messages, enabling the user to toggle between voice and data use. Moreover, the mobile facilitates access to voice messaging systems. The MS also provides access to the various data services available in a GSM network. The SIM provides personal mobility, so that the user has access to all subscribed services irrespective of both the location of the terminal and the use of a specific terminal. By inserting the SIM card into another GSM cellular phone, the user can receive calls at that phone, make calls from that phone, and use other subscribed services.

3.5 Base Station Subsystem

The GSM Base Station Subsystem (BSS) consists of the following three network elements. The elements and interfaces of the BSS are shown in Figure 3.6.

• The Base Station Controller (BSC) is the central network element of the BSS and controls the radio network. This means that the main responsibilities of the BSC are connection establishment between the MS and the NSS, mobility management, statistical raw data collection, as well as Air and A interface signalling support.

• The Base Transceiver Station (BTS) is a network element maintaining the Air interface. It takes care of Air interface signaling, Air interface ciphering, and

speech processing. In this context, speech processing refers to all the functions the BTS performs in order to guarantee an error-free connection between the MS and the BTS.

• The Transcoder (TC) is a BSS element taking care of speech transcoding, i.e. it is capable of converting speech from one digital coding format to another and vice versa [26].

3.5.1 Base Transceiver Station

A base transceiver station is a physical site from which radio transmission in both the downlink and uplink directions takes place. The radio resources are the frequencies allocated to the base station. The particular hardware element inside the BTS responsible for transmitting and receiving these radio frequencies is appropriately named the "transceiver" (TRX).[26] Other equipment belonging to a BTS includes a power supply (with a battery backup), a combiner, cables, possible mast amplifiers and the actual antennas. A TRX handles the traffic of one radio channel, and each radio channel is split into eight timeslots (TS). Each channel can have a maximum of eight simultaneous users when the full-rate codec is used, and 16 when the half-rate codec is used. Some of the timeslots are used for signaling, which may reduce the number of speech timeslots accordingly. A cell is an area that is covered by one or more TRXs. Coverage can be omnidirectional (round), or more directed, covering only a specific area. The frequency reuse in cells is described in Figure 3.5. The MS communicates with the BTS via the Air interface, and the BTS is connected to the BSC via the Abis interface.[27]
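The timeslot arithmetic above lends itself to a back-of-the-envelope capacity sketch. The signalling-slot count is an illustrative assumption (real configurations vary); the eight-timeslots-per-TRX figure and the half-rate doubling follow the description above.

```python
# Back-of-the-envelope voice capacity of a cell: 8 timeslots per TRX,
# minus timeslots reserved for signalling, doubled by the half-rate
# codec. The default of 2 signalling slots is an invented example.

def cell_voice_capacity(num_trx, signalling_slots=2, half_rate=False):
    """Maximum simultaneous voice users served by one cell."""
    traffic_slots = num_trx * 8 - signalling_slots
    return traffic_slots * (2 if half_rate else 1)

print(cell_voice_capacity(1))                  # single TRX, full-rate
print(cell_voice_capacity(2, half_rate=True))  # two TRXs, half-rate
```

Such rough figures show why half-rate coding was attractive to operators: for the cost of a codec change, the air-interface voice capacity of existing TRX hardware roughly doubles.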

3.5.2 Base Station Controller

The Base Station Controller (BSC) manages the radio resources of its own area. It knows the available resources, such as free channels in the cells and the signal quality of ongoing calls. One or more base stations are connected to the base station controller. A location area (LA) consists of groups of cells, and a BSC can control multiple location areas; moreover, a location area can have one or more controlling BSCs. The MS updates its location area information to the controlling BSC when it moves from one LA to another. When a call comes to an MS residing in a certain LA, the BSC sends a request to all cells that belong to that LA. The MS receives the request and the call can be formed. An MS requests a channel itself if it is the originating party of the call. In either case, the BSC assigns a traffic channel to the MS. The BSC also decides whether a handover should happen. A handover is done by assigning the MS to a new radio channel, which might be available in the same or another cell. The decision may be triggered by the MS moving from one cell to another, or based on measurement data of signal strength and quality collected from the MS and the BTS. Multiple types of handover exist: a handover can be done inside a cell, from cell to cell, inside a BSC, between two BSCs and between two MSCs. In the last case, the MSC also participates in the decision-making. The BSC controls the radio interface parameters, which include the frequency hopping sequences of traffic channels, power control, cell location area assignments and channel timeslot usage setup. The BSC is connected to the MSC via the A interface and to the Serving GPRS Support Node (SGSN) via the Gb interface.

Figure 3.6: High level view of subsystems [26].

3.6 Network Switching Subsystem

The Network Switching Subsystem (NSS) has several network elements. It is connected to the BSS by the A interface and to the NMS by the X.25 interface. In the following section we take a brief look at the network elements of the NSS.

Mobile services Switching Centre

The Mobile services Switching Centre (MSC) is responsible for controlling calls in the mobile network. This means tasks such as connecting, maintaining and tearing down the calls in its own area. It identifies the origin and destination of a call (either a mobile station or a fixed telephone in both cases), as well as the type of call. An MSC acting as a bridge between a mobile network and a fixed network is called a Gateway MSC. Usually the registers, such as the HLR and VLR, are physically integrated with the MSC.

Home Location Register

The Home Location Register (HLR) maintains a permanent register of subscribers and stores subscriber and billing information as well as information about additional subscriber services. Physically, it is a database server that stores the information.

A network requires at least one HLR, which is connected to an MSC through the C interface. Every subscriber is registered in one and only one HLR, which is specified by the operator; the subscription normally ends when the subscriber record is removed from the HLR. The HLR database stores permanent information such as the MSISDN number, the IMSI identity, the subscription type and the encryption type. It also stores frequently changing information, such as the subscriber's current VLR location and, for example, call forwarding information.

Visitor Location Register

The Visitor Location Register (VLR) is often integrated with the MSC, hence the common name MSC/VLR. It stores subscriber information, which is automatically requested from the HLR when an MS moves into the area of the MSC. When the MS moves into the area of another MSC, the subscriber information is removed from the previous VLR and updated to the new one. A VLR stores the following subscriber information: MSISDN, IMSI, TMSI, MSRN, LA, encryption parameters and other service information. Essentially the VLR stores the same subscriber information as the HLR, but with additions such as more specific location information. The VLR is attached to the MSC via the B-interface.
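The relationship between the two registers can be sketched as data structures. This is an illustrative sketch only; the field names are assumptions derived from the identifiers listed above, not an actual HLR/VLR schema. The point is that the VLR record is essentially the HLR record extended with more specific, frequently changing location data.

```python
# Illustrative sketch: VLR record = HLR record + location-specific fields.
from dataclasses import dataclass

@dataclass
class HlrRecord:
    msisdn: str
    imsi: str
    subscription_type: str
    current_vlr: str            # frequently changing: subscriber's VLR

@dataclass
class VlrRecord(HlrRecord):
    tmsi: str = ""              # temporary identity, local to the VLR
    msrn: str = ""              # roaming number
    location_area: str = ""     # more specific location information

rec = VlrRecord(msisdn="358401234567", imsi="244050000000001",
                subscription_type="prepaid", current_vlr="VLR-1",
                location_area="LA-12")
print(rec.location_area)  # -> LA-12
```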

Authentication Centre

The Authentication Centre (AuC) stores the secret identity numbers of the subscribers. The identity number is generated when a subscriber joins a network. The identity number is compared to the MS's identity number during call formation, and the call is rejected if the numbers do not match. While the AuC is not a mandatory part of a GSM network, it is widely used. The AuC connects to the MSC via the H-interface, and the protocol is left unspecified.

Equipment Identity Register

Each MS has an equipment identity (IMEI), which is stored in the Equipment Identity Register (EIR). The EIR can store the identities of, for example, faulty or stolen equipment and block the use of a certain MS based on that information. The EIR has white, grey and black lists for equipment identities. The white list contains equipment that is accepted and can be used in the network. The grey list comprises equipment that is monitored and may have a temporary permission to the network due to type acceptance reasons. Equipment that may not be used in the network is on the black list. Even if equipment is blacklisted, emergency calls can still be made. The EIR is not a mandatory component of the GSM network. It communicates with the MSC via the F-interface.
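The list logic above can be sketched as a small admission check. This is an illustrative sketch, not Nokia code; the IMEI values are invented. The notable detail is that emergency calls bypass the black list.

```python
# Illustrative sketch: EIR admission check for the white/grey/black lists.
GREY_LIST = {"490154203237518"}   # hypothetical IMEIs under monitoring
BLACK_LIST = {"353918058589322"}  # hypothetical IMEIs barred from the network

def admit(imei: str, emergency: bool = False) -> bool:
    """Return True if the equipment may use the network."""
    if emergency:
        return True               # emergency calls are allowed regardless
    if imei in BLACK_LIST:
        return False              # blacklisted equipment is rejected
    return True                   # white-listed or grey-listed (monitored)

print(admit("353918058589322"))                  # -> False (blacklisted)
print(admit("353918058589322", emergency=True))  # -> True
```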

3.7 Operations Subsystem

The Operations Subsystem (OSS) contains one or more Operations and Maintenance Centers (OMC). Through the OMC, the operator carries out O&M tasks such as upgrading the software of network elements, modifying the network element parameters, and supervising the operational statuses of the network elements. Other important tasks carried out via the OMC are subscriber information management, such as billing, and the management of mobile stations. The OMC therefore provides the operator with a centralized service from which the network can be operated and supervised, as well as a source of information about network use and performance through measurement data. Based on measurements collected, for example, from the signaling or transmission channels of the BSC, the operator can find out why there are dropped calls and whether the fault persists, which may point to a fault in the network planning. Network usage statistics are also provided, and the operator can monitor the load per element and review whether there is a need for additional capacity in a certain geographical area. The OSS itself is not tightly specified, but is instead left for the vendor to implement. The OSS is connected to the network through the Q3 interface, which for example applies the frequently used X.25 protocol for information exchange.

4 Nokia implementation and test process

As discussed previously, GSM is a standard. Telecom equipment vendors follow that standard to implement their own sets of network elements. From these elements, network operators can form a complete network that provides communications services to their subscribers. Standardization should ensure that most network elements can work together in the same network, regardless of vendor. This chapter provides additional background for chapter 5 by looking into the Nokia implementation of the GSM Base Station Controller. It introduces the DX 200 platform, the structure of the BSC and its user interfaces. Operations & Maintenance (O&M) functional testing is discussed, as well as the related testing subjects. Finally, the test automation framework and the test automation process used are defined.

4.1 DX 200 platform

The Nokia implementation of GSM network elements is mostly based on the DX 200 switching platform. The development of the DX 200 digital switching platform started in the early 1970s. As shown in Figure 4.1, the platform consists of a hardware layer, a computing platform and a switching platform. On top of this lies the application platform.

Figure 4.1: Layers of DX 200 platform [26].

The first commercial installation was a fixed line installation, which was completed in 1982. Network elements using the platform include, for example, the Base Station Controller (BSC), the Mobile-services Switching Centre (MSC), the Visitor Location Register (VLR), the Home Location Register (HLR), the Authentication Centre (AuC), the Equipment Identity Register (EIR) and the Transcoder (TC). Multiple early generations of DX 200 based GSM network element hardware were built into cabinet structures holding cassettes of plug-in units, which were connected together by cabling from the back of the cabinet. The last addition to this construction type was the FlexiBSC, which was brought to market in 2010. The multicontroller platform introduced in 2012 differs radically from the old DX hardware, as it uses modern rack-mounted and extensible module-based technology with all IP-based interfaces. Some variants of radio network controller products, such as the 3G equivalent of the BSC, the Radio Network Controller (RNC), are also built on the multicontroller platform. This provides a valuable upgrade path for network operators, as a multicontroller BSC can be converted into a multicontroller RNC with a software upgrade [24][25].

4.2 Base Station Controller

The functional unit composition of the Base Station Controller (BSC) varies depending on the model. The older cabinet-, cassette- and plug-in unit-based variants, ranging from the BSCi to the latest FlexiBSC, differ from the modern multicontroller architecture. The former are built from highly task-specific computer units, whereas the latter uses more general multi-purpose hardware packed into a server-style enclosure. The most common functional units are:

• Operation and Maintenance Unit (OMU) is an interface between the DX 200 BSC and a higher-level network management system and/or the user. The OMU can also be used for local operations and maintenance. The fault indications the OMU receives from the BSC produce local alarm printouts for the user, or are forwarded to the OMC. In the event of a fault, the OMU automatically activates the appropriate recovery and diagnostics procedures within the BSC. When a fault occurs on the OMU, the active MCMU takes over its duties.

• Marker and Cellular Management Unit (MCMU) performs the control functions of a switching matrix and the BSC-specific management functions of radio resources. The marker functions of the MCMU control the Group Switch; these control functions include the connection and release of the circuits of the switching matrix. The cellular management functions of the MCMU assume responsibility for the cells and radio channels that are controlled by the DX 200 BSC. The MCMU reserves and keeps track of the radio resources requested by the MSC or by the handover procedures of the BSC. The MCMU also manages the configuration of the cellular network.

Figure 4.2: A structure of a BSC [26].

• The duties of the Base Station Controller Signaling Unit (BCSU) are highly dependent on traffic. It provides the SS7 signaling of the A interface, the channel control of the Air interface, and the LAPD signaling of the Abis interface.

• Group Switch (GSWB) is responsible for switching the speech channels. The operation is supervised by the MCMU. The DX 200 switching network is fully digital and non-blocking.

• Exchange Terminals (ET) work as connection points to the A, Abis and Gb interfaces. The ETs adapt the incoming PCM circuits (encoding and decoding) from and to the Group Switch and synchronize to the system clock. An ET can also act as an Ethernet interface, providing IP-based connectivity for the previously mentioned interfaces.

• Message Bus (MB) provides connection between computer units (OMU, MCMU, BCSU).

The redundancy of the functional units is an important factor in the operational reliability of the equipment, which should be highly available. In the DX 200 BSC there are two primary redundancy types in the functional unit hardware.

• 2N for duplicated units, where one unit is active and one is a spare.

• N+1 or N+m, where N units provide the needed dimensioning of the system, and at least one redundant unit (or a number m of redundant units, proportional to N) is added.
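The two redundancy schemes above differ in how many physical units a configuration needs. The following is an illustrative sketch, not from the thesis, computing the unit counts.

```python
# Illustrative sketch: physical units required under the redundancy
# schemes listed above (2N duplication versus N+m shared spares).
def units_required(n: int, scheme: str, m: int = 1) -> int:
    """Physical units needed for n active units under a redundancy scheme."""
    if scheme == "2N":
        return 2 * n          # every active unit has a dedicated spare
    if scheme == "N+m":
        return n + m          # m shared spares (m = 1 gives N+1)
    raise ValueError(f"unknown scheme: {scheme}")

print(units_required(4, "2N"))         # -> 8
print(units_required(4, "N+m"))        # -> 5 (N+1)
print(units_required(4, "N+m", m=2))   # -> 6
```

The tradeoff is visible directly: 2N doubles the hardware, while N+m adds only a few shared spares at the cost of weaker fault isolation.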

4.2.1 Software

New software builds can be installed while the BSC is running an old software version. The switchover to the new software package causes a short downtime. Besides new versions of the application software, the build also contains software for all plug-in units of the functional units. The software of the plug-in units is updated either automatically or via an MML command when the BSC is up and running.

4.2.2 User interfaces

A human user operates the BSC with a command language called Man-Machine Language (MML). The MML interface is provided by the OMU and is used over a standard bidirectional text-based telnet protocol. The text-menu based MML allows the user to carry out Operations & Maintenance (O&M) tasks such as the basic configuration of the BSC and viewing the operational state of the functional units (Figure 4.3) and the radio network. This is also the interface that is mostly used for functional testing of the equipment.

Figure 4.3: Example output of MML command from BSC.
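In test automation, printouts like the one in Figure 4.3 must be parsed programmatically. The following is an illustrative sketch only: the exact output format is product-specific, so the column layout and the unit state strings below are invented for illustration.

```python
# Illustrative sketch: parsing a hypothetical MML working-state printout
# into a dictionary of unit name -> state. The format is invented.
SAMPLE_OUTPUT = """\
UNIT      STATE
OMU       WO-EX
MCMU-0    WO-EX
MCMU-1    SP-EX
BCSU-0    WO-EX
"""

def parse_unit_states(text: str) -> dict:
    """Map unit name -> working state from a column-formatted printout."""
    states = {}
    for line in text.splitlines()[1:]:   # skip the header row
        unit, state = line.split()
        states[unit] = state
    return states

print(parse_unit_states(SAMPLE_OUTPUT)["MCMU-1"])  # -> SP-EX
```

Automated test cases typically build on parsers like this to assert that a unit has reached an expected state after a command.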

More control over the functional units and, for example, the file system can be obtained when the user interacts through the service terminal. The user can connect to the service terminal through an MML command via the OMU, or directly through a COM port located in the OMU or another computer unit. At first, interfaces between network elements, such as A and Abis, used PCM as the digital data transfer format. Although PCM is still supported by hardware vendors, today's approach is IP-based interface communication.

4.3 Operations & Maintenance Functional Testing of the BSC Software

Operations & Maintenance (O&M) basically means the functionality that operators use to put and maintain the network in an operational state. A BSC, as well as most network elements, provides localized O&M access that is used to commission the network elements with a basic configuration. Localized access means that someone using the O&M-provided connectivity must be physically present at the site where the network element is located, or be able to access a network that allows the needed connectivity to the element [23]. O&M functional testing is mostly done with real BSC environments. As multiple generations of BSCs are supported today, with their respective hardware configuration and capacity variants, the total number of hardware combinations (BSC and BTS variants combined) needed to carry out the testing is rather large. When functional test activities for a BSC software release are planned, the available environment resources are very important. The main factors for test environment planning are the BSC generations, capacity, interface types, and base station types and configurations. The main and most used interface for O&M functional testing of the BSC is the Man Machine Language (MML) connection over the Man Machine Interface (MMI). It is a text-based command interface, which can be accessed over a standard telnet or SSH connection. Other important interfaces are the service terminal connections to the computer units, which also operate over telnet connections. While the MML interface provides the control and observation interface for the base station controller and radio network functionalities, the service terminal connection can be used to observe and control the internal actions of the computer units. Basic testing activities are usually carried out by giving commands through these interfaces and observing the results in the system. Test activities may, for example,

also include interventions in power feeding, network connections, and other external resources. Real mobile devices are used to verify the connectivity in the test network. The O&M functional testing of the BSC covers a large amount of software feature functionality. The testing is separated into subareas that are handled by separate testing teams. A few examples of these subareas are:

• The recovery and capacity tests use different hardware, interface and base station configurations and combinations to verify the functionality after unit switchovers (from a controlling unit to a redundant unit, as described in 4.2), unit restarts, and controlled and uncontrolled (e.g. power break) system restarts. The tests also ensure that calls stay connected if, for example, a unit switchover happens due to a faulty unit. These tests may also require large radio network configurations, which accordingly increase the load on the system. Test cases requiring the maximum or a large radio network are executed using a simulated radio network environment, as it would not be feasible to host and configure all the needed BTS hardware for the test setups.

• Radio network functionality and configuration testing verifies the functionality of different radio network and cell configurations along with the Abis interface. The tests may, for example, involve adding and removing radio network elements from the configuration and altering their states and parameters. Again, the tests may be executed against real or simulated radio network hardware.

• Measurements are collected from the BSS. They capture characteristics such as signal strengths, dropped calls and handover information from the network. The measurement information is stored in the BSC and can be further analyzed in the NMS, where it is processed into Key Performance Indicators (KPI). This test area verifies the functionality of these measurements.
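The derivation of a KPI from raw measurements can be sketched with a simple example. This is illustrative only, not a KPI definition from the thesis: the drop call rate computed from two hypothetical counters.

```python
# Illustrative sketch: a simple KPI (drop call rate) derived from
# hypothetical BSS measurement counters.
def drop_call_rate(dropped_calls: int, completed_calls: int) -> float:
    """Dropped calls as a percentage of all established calls."""
    total = dropped_calls + completed_calls
    if total == 0:
        return 0.0                # avoid division by zero on empty data
    return 100.0 * dropped_calls / total

# 12 dropped calls out of 12 + 988 established calls -> 1.2 %
print(drop_call_rate(12, 988))
```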

4.4 Holistic Integration Tester

The Holistic Integration Tester (HIT) is perhaps the most widely used tool for testing the internal services of the DX 200 platform at Nokia Networks. The tool provides telnet and SSH connectivity and enables the user to create scripts in a language that is a subset of the C programming language. HIT also provides a collection of

system functions along with debugging capabilities to assist in script creation. The scripts themselves are used to execute MML commands, parse the outputs and automate configuration creation, as well as to assist in test case execution. In terms of telnet connectivity, HIT could be considered a close relative of an ordinary telnet client such as PuTTY.

4.5 Test Automation Framework

The software tools used for the automated testing of the BSC were mostly developed by tool teams inside Nokia. The test automation framework used in the O&M functional testing of the BSC was also developed internally. The framework used the HIT tool and its scripting capabilities as its foundation. A large part of the test automation development for the BSC was the development of the test automation framework itself. The test automation framework consisted of two main parts. The core of the framework contained the general functionality needed for test cases, test sets and reporting, and various file- and MML connection-handling functionalities. The core also contained the libraries needed for controlling external devices, such as call generators. The second main part was the system libraries, which provided interface-like functionality for the BSC. The system libraries were hardware-specific and also contained the MML abstractions described in 5.1.4. New framework functionality was constantly needed to carry out the automation tasks. A rough estimate of the tool development time would be along the lines of half of the total development time, the other half being actual test case development. As stated by Bach [5] and Hendrickson [17], tool development should be considered a proper software project. Good software development practices, such as bug reporting and testing of the automation framework, were practised during the development efforts.
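The two-part structure described above can be sketched as a layering of classes. This is an illustrative sketch, not the actual Nokia framework: the class and method names are invented, and the connection and state methods are stubs rather than real I/O.

```python
# Illustrative sketch: a framework core with general functionality, and a
# hardware-specific system library providing an interface-like BSC
# abstraction on top of it. All behaviour is stubbed for illustration.
class FrameworkCore:
    """General functionality shared by all test cases (hypothetical)."""
    def open_mml_connection(self, host: str) -> str:
        return f"MML connection to {host}"    # placeholder, no real telnet

    def report(self, case: str, verdict: str) -> str:
        return f"{case}: {verdict}"

class BscSystemLibrary:
    """Hardware-specific MML abstractions built on the core (hypothetical)."""
    def __init__(self, core: FrameworkCore, host: str):
        self.core = core
        self.session = core.open_mml_connection(host)

    def unit_state(self, unit: str) -> str:
        return "WORKING"                      # stub instead of parsing MML

core = FrameworkCore()
bsc = BscSystemLibrary(core, "bsc-lab-1")
verdict = "PASS" if bsc.unit_state("OMU") == "WORKING" else "FAIL"
print(core.report("TC_001", verdict))  # -> TC_001: PASS
```

The design choice illustrated here is that test cases depend on the system library's abstractions rather than on raw MML strings, so hardware differences stay confined to the library layer.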

4.6 Test Automation Process

The motivation for the subject of this thesis came from my work experience as a test engineer at Tieto Finland Oy. I was responsible for implementing the automated test cases for the O&M functional testing phase of the base station controller software. I was also responsible for the implementation and maintenance of the test automation infrastructure, and for the analysis and reporting of the test runs. Very strong domain knowledge was needed in the test case implementation and analysis phases, and I benefited from the domain support of seasoned test teams, with individuals having over a decade's worth of experience in testing the product. The setting, where a tool expert works with domain experts, much resembled what Fewster and Graham present in their book [12]. When the infrastructure was in place and the foundation for test automation was laid, the test automation tasks were to be distributed to the test teams. This meant that formerly manual test cases were to be automated using the test automation framework described in the previous subchapter. The work environment was fast paced, and as the subsequent test phases usually waited for execution results from previous phases, the test executions had rather strict time frames. The test executions themselves could not be faster than the test environment allowed; usually tests were run overnight in multiple environments simultaneously, so that the results were ready for analysis in the morning. Tests needed to be reliable and to provide enough information for analysis in order to keep duties on schedule. The typical workflow in test runs was to load the new software build into the environments, check the need for updates of the embedded boot software of the units, and check the basic configuration of the BSC, interfaces and radio network, as well as call connectivity. After the environment was verified to be in a correct starting state, the automated test sets were started. After execution, the test results were analysed and possible failures were discussed with the responsible members of the test teams that owned the test cases. Re-executions of cases with sufficient monitoring and logging were often needed for the fault report.

4.6.1 Test automation infrastructure

The continuous integration infrastructure used is presented in Figure 4.4 for two tested BSCs. Jenkins is software that can be used to control and orchestrate CI executions. One way to configure it is to have a Jenkins master node that provides a web interface where test executions can be started and reports viewed. The Jenkins master node can also "publish" test reports to another master node, where different views of the test results can be composed, e.g. for management purposes. The other Jenkins node type is the slave node. The slaves connect to the master node and typically contain the test automation framework and the test cases. In this case, the slave nodes also connected to the BSC under test via an MML connection, and to other testing tools, such as the call generators and other peripherals. The slave node to BSC under test relation was 1:1, one slave node per BSC. Everything in a slave node (environment parameters, test cases and sets, and the test framework) was under version control. The development work usually took place in the test engineer's own environment, and new test cases and test tool features were eventually updated to the slave machines via version control.

Figure 4.4: Test environment infrastructure for two BSCs.

4.6.2 Extending the framework

The execution steps came from carefully planned test cases, which up to this point had been executed manually. Therefore, the instructions for test case automation were clear. Problems first started to appear when these cases were executed in large batches in the test environment BSCs, and then when the cases needed maintenance. Questions such as the following were asked:

• What state should the system be in for a test case?

• Should a case clean up after itself? What would be the best way?

• This test case leaves a perfect setup for the next one; why can't I just execute them sequentially?

• What kind of parameters should I supply a case with? How should I name them?

• How should the test case act if a fault is encountered and the test case is aborted due to that?

Clearly, the issues were not in the test case content, but in what was happening around the execution and in how the whole scheme could work in a better way. This is a very important point to discuss: if it is neglected, it may quickly destabilize the batch runs and complicate the maintenance, result analysis or transfer of test cases to another environment. These issues and their potential solutions are discussed in the next chapter.

5 The Suggested Structure of Automated Test Cases

The previous chapter briefly discussed problems faced in test automation. In this chapter, we analyze the problems more thoroughly and try to find a solution for each one. The presented solutions could form an architecture that, when followed, would make automated test cases more maintainable, faster to analyse and less error-prone in batch executions. The architecture would therefore benefit the whole test automation process.

5.1 Problems faced with test automation

Next, we attempt to frame the issues, provide practical examples to gain a better understanding of the problems, and suggest a detailed solution for each problem.

5.1.1 Test-case portability

It is possible that a test case or a set of test cases is moved from one system under test to another. This may, for example, be done because of hardware-related issues, load management of the test environments, or the transfer of test responsibilities to another physical location (laboratory). In [31], Tervo also found limitations in portability to be an issue. Test cases are not automatically portable. A strategy for parameter management is needed if portability is to be accomplished. Without a strategy, the migration from one system under test to another may require refactoring efforts. Next, we identify the parameter classes.

• Test environment-specific parameters are parameters that are extracted from a specific test environment. These are needed to create the required general setup of the system under test. The use of these parameters and the setup phases are explained more thoroughly in section 5.1.3. Examples: radio network parameters, unit id values, and interface parameters.

• Test set-specific parameters are parameters that may be used to run a set of cases with altered values. These parameters are moved together with a set of test cases. Example: a BTS id or an interface id.

Table 5.1: Parameters for a test case

Environment parameters:           e0 e1 e2 e3 e4 e5
Test set-specific parameters:     s1 s2 s3 s4
Test case-specific parameters:    c0 c1 c3 c4
Actual parameters used in test:   c0 e1 s1 s2 c1 e5 s3 c3 c4

• Test case-specific parameters are parameters with the scope of a single test case. These parameters are moved with a single test case. Example: an expected alarm id or a message captured from an interface.

Consider a batch of test cases that are implemented with tightly coupled values, such as radio network element identifiers, functional unit identifiers or interface parameter values. It is noticed that a batch run of these test cases in the system under test is taking too long, as the test runs are barely completed within the reserved time window. A solution would be to move some of the test cases to a new environment to even out the load between the environments. In practice, it is not possible to move this kind of case to any other environment without refactoring the case to contain the correct parameter values. The same problem is faced if some part of the configuration is changed in the test environment. The first logical step to address the issue is to start collecting the required parameters in a separate file. While this is a straightforward and seemingly good practice, it has problems when there are many parameters and a single file contains parameters from all three classes. There is also a high probability of incompatible parameter schemes across developers. In order to gain control over parameter handling, the following things must be defined:

• Which parameters (of all possible parameters in the environment) are needed?

• Classification of environment, test set and test case-specific parameters.

• Test automation tool support for this kind of parameter management.

If and when all this is communicated and agreed on in the teams doing test automation work, as early as possible in the test automation effort, portability should be achievable. Moreover, to make parameter schemes consistent and easy to adopt, two things could be done. First, the test tooling could enforce the correct use of parameters. Second, "seed projects" showing correct parameter use could be developed and shared among the teams.
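The three parameter classes can be sketched as a layered lookup. This is an illustrative sketch, not the Nokia framework: the parameter names are invented, and the precedence order (test case over test set over environment) is an assumption made for illustration.

```python
# Illustrative sketch: resolving the three parameter classes, with test
# case-specific values overriding test set-specific values, which in
# turn override environment values.
from collections import ChainMap

environment = {"bsc_id": "BSC-1", "a_if": "A-7", "bts_id": "BTS-3"}
test_set    = {"bts_id": "BTS-9"}          # set-level override
test_case   = {"expected_alarm": "2733"}   # case-only parameter

# ChainMap looks up keys left to right: case > set > environment.
params = ChainMap(test_case, test_set, environment)

print(params["bts_id"])          # -> BTS-9 (set overrides environment)
print(params["expected_alarm"])  # -> 2733
print(params["bsc_id"])          # -> BSC-1 (falls through to environment)
```

Porting a test case to another environment then means replacing only the environment mapping; the set and case parameters travel with the test cases unchanged.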

5.1.2 Script-to-script dependencies

In order to conduct the testing, the system under test should be brought to the state of general setup (see Figure 5.2). This setup process is defined in section 5.1.3. After a test case execution, the state of the system could be suitable for running another test case immediately. This means that new cases are developed against, and dependent on, a state that another test case left behind; effectively an undefined state, as it is the result of a combination of preceding test cases. This can be an attractive option at first, as it does not require any additional effort. Problems with this approach will emerge in at least three ways. First, when a test case fails at the end of a batch run, the whole set of preceding test cases must be re-executed to get the system to the point where the issue occurs. This can be very slow. Second, the execution order is locked to one permutation of the set that is known to work. The third issue is portability, as only whole batches of test cases can be moved from one environment to another; portability is explained in detail in 5.1.1. Persson and Yilmaztürk also remarked on this in [28].

Consider a test set of three test cases, as shown in scenario A in Figure 5.1. Test case 3 fails in the batch run, and the test engineer attempts to find out whether the issue is reproducible and should be reported. The execution time is 10 minutes each for test cases 1 and 2, and 5 minutes for test case 3, a total of over 20 minutes per run. After the second execution, it is clear that the issue can be reproduced and further investigation is needed. Additional monitoring of the system is needed to investigate the root cause of the issue. The test engineer sets up the monitoring, starts the batch run again, and collects the needed logs. Approximately 45 minutes in total are spent to process the issue to the point where it can be escalated to be fixed. The issue is eventually fixed and the correction is released in the next software build. Another run with additional monitoring is required to verify the functionality; this needs another 20 minutes. Now, a total of 65 minutes spent on an issue might not sound like much, but this example hardly represents a real-life scenario. A batch can contain tens of test cases, and the time spent on similar issues may be multiplied to the extent where it is difficult to remain within the allotted time window of the batch runs. Another point to consider is that in this example only one issue was found. The situation would have been even more difficult to handle if multiple issues had been found in a single batch run.

Figure 5.1: Script dependency scenarios

The solution for this would be to have a strict baseline configuration, as described in Figure 5.2. After the baseline is defined, test cases should follow scenario B shown in Figure 5.1, where every test case builds the general setup and applies the test case-specific setup on top of that. After the execution of the test case, the teardown follows immediately to revert the system under test back to the baseline state. Following this approach, the issues with fixed execution order, slow debugging of problems and portability are solved. There is certainly a tradeoff of increased total execution time, as the number of setup and teardown routines is increased. This increase may even require additional test environments, but the payback in easy debugging and flexible organization of test cases in batch runs is worth it.
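The self-contained pattern of scenario B can be sketched as a small execution wrapper. This is an illustrative sketch, not the Nokia framework: each case performs the general setup and its own specific setup, and the teardown always runs, even when the test body fails, so the system returns to the baseline state.

```python
# Illustrative sketch: a self-contained test case wrapper where the
# teardown always reverts the system to the baseline, pass or fail.
def run_test_case(general_setup, case_setup, test_body, teardown):
    """Execute one self-contained test case; return True on pass."""
    general_setup()               # e.g. create radio network, init alarms
    case_setup()                  # e.g. alter one radio network element
    try:
        test_body()
        return True
    except AssertionError:
        return False              # the failure is reported, not propagated
    finally:
        teardown()                # always revert to the baseline state

log = []
ok = run_test_case(
    general_setup=lambda: log.append("general setup"),
    case_setup=lambda: log.append("case setup"),
    test_body=lambda: log.append("execute"),
    teardown=lambda: log.append("teardown"),
)
print(ok, log)
# -> True ['general setup', 'case setup', 'execute', 'teardown']
```

Because every case starts from the baseline and ends at the baseline, any case can be re-executed alone to reproduce a failure, without replaying the preceding batch.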

5.1.3 Test Setup and Teardown Handling

As discussed in the previous subchapter, a test case should be self-contained; it should apply the general setup and the test case specific setup before the test case execution and perform a teardown routine afterwards. Figure 5.2 defines these stages. The reasoning for these phases is explained in 5.1.2 and 5.1.1. To be able to discuss the subject further, the three stages of setup are defined below.

Figure 5.2: Setup stages of BSC

• First is the baseline setup, which in practice contains a functional system configuration without a radio network. This is done when a new software build is taken into use. The baseline setup details should be agreed on in order to be consistent across test environments.

• In the general setup phase, all recurring setup tasks are performed. These tasks include e.g. radio network creation, alarm handling, and setting up computer log collection. Possible radio network remnants (left over from an aborted test case execution) are cleaned from the system.

• Finally, the test case creates any needed configuration in the test case specific setup phase. This can be, for example, a small alteration to a radio network element.

The baseline setup is beyond the scope of this work. Figure 5.3 defines the flow of the general setup phase. A test case execution is triggered either by continuous integration or manually by a test engineer. The general setup phase is automatically included in the execution flow.

1. The setup automatically initiates alarm and computer log handling, based on values defined in the test case specific parameters.

2. In the radio network setup phase, the first step is to erase possible remnants of any old radio network setup. Next, if the test case requires a radio network, an attempt to create one is made. A default network is defined, but a test case specific network may also be supplied as a parameter. At this point the environment-specific parameters are fed to the network creation scripts. If the test case does not require a network, this step is skipped.

3. Finally, when the general setup is complete, the test case modifies the configuration to suit the test. It also verifies that the required configuration exists and is valid before proceeding to the actual test steps.

To accomplish the common setup and teardown phases, tool support must first exist. Currently the test automation framework does not mandate any setup or teardown routine, so the routine needs to be implemented. This is also the place where the parameter handling strategy is put into action, so the tool should support the parameter scheme discussed in 5.1.1. Note that this is the foundation on which the test case portability and dependency removal discussed in the previous chapter are built.
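The numbered flow above can be sketched roughly as follows. Every function and parameter name in this sketch (`general_setup`, `collect_logs`, `bsc_address`, and so on) is a hypothetical placeholder, not the actual framework API.

```python
# Rough sketch of the general setup flow of Figure 5.3.
# All names below are hypothetical placeholders for framework routines.

DEFAULT_NETWORK = {"cells": 1, "band": "GSM900"}

def general_setup(params, env):
    """Perform the recurring setup steps before a test case runs."""
    actions = []

    # 1. Alarm and computer log handling, driven by test case parameters.
    if params.get("collect_logs", True):
        actions.append("log-collection-started")
    actions.append("alarm-handling-initialized")

    # 2. Radio network setup: always clean remnants first, then create
    #    a network only if the test case needs one.
    actions.append("old-network-cleaned")
    if params.get("needs_network", True):
        network = params.get("network", DEFAULT_NETWORK)
        # Environment-specific parameters are merged into the creation data.
        creation = {**network, **env}
        actions.append(f"network-created:{creation['band']}")

    return actions

steps = general_setup(
    params={"needs_network": True},
    env={"bsc_address": "10.0.0.1"},   # hypothetical environment parameter
)
```

The point of the sketch is the ordering: remnant cleanup always precedes network creation, and network creation is conditional on the test case's declared needs.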

Figure 5.3: A flow diagram of the common setup phase.

5.1.4 Test Libraries and User Interface Abstraction

Changes in the user interface are a common problem in high-level (i.e. functional) test automation [17]. If interaction with the user interface is tightly coupled, the functionality of test cases may break, and maintenance effort is needed to bring them back to a usable state. When the number of test cases grows, tightly coupled code can lead to massive maintenance efforts. Test automation should optimally provide a fast and effective feedback loop, and the delay caused by maintenance does not fit well into this scheme.

Suppose a new software build brings a change in MML input or output (i.e. in the user interface). A basic test step in a test case would be to query the functional state of an interface. Let us assume that the command syntax required to query the state of the interface has changed, as well as the command output from which the functional state is retrieved. As the test case attempts to give the command in the old form, the system rejects it and displays an error message. It is now impossible to continue the execution of the test case. Fixing the input-related code in the test case moves execution forward to the next step, which is to parse the output of the command for the sought interface state. As the output has also changed, the test case breaks again and must be fixed once more. Multiplying this effort by the number of broken test cases, we are dealing with a considerable maintenance effort.

A solution to this highly probable scenario is to introduce abstraction layers. Optimally, test cases should not communicate directly with the system under test, but through well-defined interfaces which the test automation framework provides to the test case. Figure 5.4 introduces two layers of abstraction. The framework in question used both library-level and user interface level abstractions, but they were not enforced.
The development of new abstractions for the user interface and for test libraries requires an initial effort, but it is paid back generously in maintenance costs, as Hendrickson also found in [17]. Unfortunately, the need for abstraction may only emerge when a lot of test automation work is already complete, and introducing abstractions late may require extensive refactoring. Early adoption and strict use of abstraction yields maintainable test cases.
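The abstraction idea can be illustrated with a small wrapper in the spirit of the page object pattern. Note that the command string and output format below are invented purely for illustration; they are not real MML syntax.

```python
# Illustration of a user interface abstraction layer: test cases call
# query_interface_state() and never touch command syntax or output
# parsing directly. The command and output format here are invented,
# not real MML.
import re

class InterfaceCommands:
    """Single place that knows the command syntax and output format."""

    def __init__(self, send):
        self._send = send  # function that sends a command and returns output

    def query_interface_state(self, interface):
        output = self._send(f"ZQRI:{interface};")     # invented command syntax
        match = re.search(r"STATE:\s*(\w+)", output)  # invented output format
        if match is None:
            raise ValueError("unexpected command output")
        return match.group(1)

# A fake transport stands in for the real connection in this sketch.
def fake_send(command):
    return "INTERFACE A1\nSTATE: WORKING\n"

mml = InterfaceCommands(fake_send)
state = mml.query_interface_state("A1")
```

When a software build changes the command syntax or output, only this one class is updated; the test cases built on top of it stay untouched.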

Figure 5.4: Test tool abstraction layers against system under test.

5.1.5 Complexity and Execution Time of Test Cases

If the goal of a test automation project is to automate a certain number of old test cases that were previously executed manually, there is a risk that the manual cases are not ideal for automation purposes. Not being ideal means that the ease of test case code debugging, test execution times and result analysis are all affected negatively by the length of the test cases. The reason is that manual functional test cases usually bundle too many things (from an automation viewpoint) into a single test case, which therefore grows excessively long. This may be feasible from a manual execution viewpoint, as it requires less separate documentation and less reporting.

Figure 5.5: Test step relations from test case description to raw log file.

If no attention is paid, test automation efforts may produce automated test cases that have a very long execution time. The reasoning for keeping test cases short is the same as discussed with test case dependencies in 5.1.2. Fewster presents supporting conclusions in [11]. In addition to long execution times, long test cases cause difficulties when the test case description needs to be compared to

the automated test case, or when the automated test case needs to be compared to the execution result and the raw log file. These relations are described in Figure 5.5. The ideal execution time for an automated test case is around ten minutes or less. Being conscious of test case execution time makes developers working with test automation more productive, as initial implementation and possible debugging times are shorter, as are the result analysis times. While it requires an effort, it may be beneficial to rewrite old legacy test cases into a form better suited to automation before implementing the automation.
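One lightweight way to stay conscious of execution time is to measure every test case run and flag cases that exceed the guideline. A minimal sketch, with the ten-minute rule of thumb as a configurable default; the function name is hypothetical:

```python
# Sketch: measure a test case's duration and flag runs exceeding a limit.
# The 600-second default reflects the ten-minute rule of thumb in the text.
import time

def run_with_time_guard(test_steps, limit_seconds=600):
    """Run a test callable and report whether it stayed within the limit."""
    start = time.monotonic()
    result = test_steps()
    duration = time.monotonic() - start
    return {
        "result": result,
        "duration": duration,
        "within_limit": duration <= limit_seconds,
    }

report = run_with_time_guard(lambda: True)
```

A batch runner could collect these reports and periodically list the test cases that have drifted past the guideline, making them candidates for splitting.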

5.2 Chapter Summary

The previous subsections have presented the problems faced while implementing test automation. In this summary, the suggestions are recapped and the overall architecture of a test case is outlined.

• To ensure portability, environment-specific parameters should be recognized, used and named in a consistent manner across developers.

• Test cases should have a consistent setup and teardown functionality, which supports the use of the environment-specific parameters.

• Test cases should be self-contained and should not rely on any configuration made by other cases or scripts.

• External tools should provide a uniform interface for test cases to use, avoiding the need for excess setup and teardown code to accomplish a task.

• The MML abstraction layer should be used where possible to ensure test case and library maintainability.

• The implementation of test cases and libraries benefits from the use of coding conventions and standards, like any other software product.

• Test result analysis can be improved by having a sufficient number of log files available from every execution. Test case code should be written in such a manner that the test description is easily comparable to it.

• Test case execution time should be kept short; a rule of thumb is ten minutes or less. Result analysis and test case code debugging become more difficult if the execution time is too long.
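The first two guidelines above, consistent environment parameters, can be sketched as a simple mapping per environment with agreed-upon names. The parameter names (`bsc_address`, `bts_id`) and environment labels are hypothetical examples, not taken from the actual project.

```python
# Sketch of consistent environment parameter handling: every test
# environment supplies the same agreed-upon parameter names, so any
# test case runs unchanged in any environment. All names hypothetical.

REQUIRED_PARAMS = {"bsc_address", "bts_id"}

ENVIRONMENTS = {
    "lab-1": {"bsc_address": "10.0.0.1", "bts_id": 101},
    "lab-2": {"bsc_address": "10.0.0.2", "bts_id": 202},
}

def load_environment(name):
    """Return the parameters of an environment, checking completeness."""
    params = ENVIRONMENTS[name]
    missing = REQUIRED_PARAMS - params.keys()
    if missing:
        raise KeyError(f"environment {name} is missing: {sorted(missing)}")
    return params

params = load_environment("lab-1")
```

Because every environment is validated against the same required set, a misnamed or missing parameter is caught at setup time rather than mid-execution, which supports both portability and the common setup routine.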

If the tool used does not mandate the shape of test cases and libraries, and there is no clear pattern for developers to follow, there is a risk of producing non-portable and unmaintainable test code. The technical debt this causes multiplies as more cases are implemented. The risk can be mitigated by following a set of predefined guidelines such as those presented in this thesis. Not all types of test cases benefit from these guidelines, but the O&M type functional test cases clearly do.

5.3 Supporting findings in another project

Having recently been introduced to another large test automation project, I found that most of the findings and issues presented in this chapter were clearly identifiable in that project too, even though the software domain was completely different. Based on these two test automation projects, relatively high-level testing such as functional testing seems to be prone to these kinds of issues.

6 Summary

This thesis has examined the features and problems of a test automation project in the context of a GSM network element. Chapter 2 provides a background on software testing, software quality and software test automation. Chapter 3 describes the GSM network, the network elements and their interoperation. Chapter 4 introduces the Nokia implementation of GSM, especially the base station controller; typical O&M testing areas and the test automation solution are also presented. Chapter 5 examines the problems that may occur in a test automation project and their respective solutions.

The objective of the thesis was to discover the requirements for test automation in the context of a specific mobile network element. Based on those requirements, a common high-level framework is presented in problem-solution form. Most of the problems presented in Chapter 5 appear in some form in the literature, and some of the solutions are recognized as best practices, such as the page object pattern [32], which essentially equals the user interface abstraction presented in Chapter 5.1.4. This work may serve its purpose by providing insight into common issues in test automation for anyone starting or maintaining a test automation project. Although the environment of this thesis was not a typical one for test automation, the supporting findings from another test automation project suggest that the pitfalls presented here could be universal to most test automation.

In general, a test automation project should be treated as a software project. A high-level test automation project typically contains large amounts of code, which has to be adaptable enough to live in symbiosis with the actual software under development.
Without applying the best practices and discipline required by proper software development, and without a solid understanding of how a test automation project should be developed, the project can take a path riddled with common pitfalls, from which recovery can be very expensive or outright impossible after the project has reached a certain size.

References

[1] 3GPP. Feasibility study for evolved GSM/EDGE radio access network (GERAN). URL http://www.etsi.org/deliver/etsi_tr/145900_145999/145912/12.00.00_60/tr_145912v120000p.pdf, referenced 24.6.2015.

[2] 3GPP. GPRS and EDGE. URL http://www.3gpp.org/technologies/keywords-acronyms/102-gprs-edge, referenced 24.6.2015.

[3] AMMANN, P., AND OFFUTT, J. Introduction to Software Testing. Cambridge University Press, 2008.

[4] AT&T. It's time to develop a migration plan for M2M, 2012. URL: http://www.business.att.com/enterprise/Family/mobility-services/machine-to-machine/m2m-applications/cd2migration/page=addl-info/, referenced 7.4.2015.

[5] BACH, J. Test automation snake oil. In Proceedings of the 14th International Conference and Exposition on Testing Computer Software (TCS99) (1999).

[6] BOEHM, B. W. Guidelines for verifying and validating software requirements and design specifications. Euro IFIP 79 (1979), 711–719.

[7] BOURQUE, P., AND FAIRLEY, R. E. D. Guide to the Software Engineering Body of Knowledge, Version 3.0. IEEE, 2014.

[8] BURNSTEIN, I. Practical Software Testing. Springer-Verlag New York, Inc., 2002.

[9] BURNSTEIN, I., SUWANASSART, T., AND CARLSON, R. Developing a testing maturity model for software test process evaluation and improvement. Proceedings of International Test Conference 1996 (1996), 581–589.

[10] ERICSSON. Traffic exploration tool. https://www.ericsson.com/TET, refer- enced 7.5.2018.

[11] FEWSTER, M. Common mistakes in test automation. In Proceedings of Fall Test Automation Conference (2001).

[12] FEWSTER, M., AND GRAHAM, D. Software Test Automation - Effective Use of Test Execution Tools. Addison Wesley, 1999.

[13] FOWLER, M. Continuous integration. http://www.martinfowler.com/articles/continuousIntegration.html, referenced 30.3.2015.

[14] GLEICK, J. A bug and a crash, 1996. URL: http://www.around.com/ariane.html/, referenced 27.4.2015.

[15] GSMA. GSMA, 2015. URL: http://www.gsma.com/aboutus/gsm-technology/gsm, referenced 6.4.2015.

[16] HAIKALA, I., AND MÄRIJÄRVI, J. Ohjelmistotuotanto. Suomen ATK-kustannus, 1998.

[17] HENDRICKSON, E. The differences between test automation success and fail- ure. Proceedings of STAR West (1998).

[18] HILLEBRAND, F. The creation of standards for global mobile communication: GSM and UMTS standardization from 1982 to 2000. IEEE Wireless Communications (October 2013), 24–33.

[19] HUANG, J. Software Error Detection through Testing and Analysis. John Wiley & Sons, Inc., 2009.

[20] IEEE. IEEE Standard Glossary of Software Engineering Terminology. IEEE, 1990.

[21] MATHUR, S., AND MALIK, S. Advancements in the v-model. International Journal of Computer Applications 1, 12 (2010), 29–34.

[22] MYERS, G. J., SANDLER, C., AND BADGETT, T. The Art of Software Testing. John Wiley & Sons, Inc., 2012.

[23] NOKIA. GSM BSS training material. Internal document.

[24] NOKIA. DX200 hardware, 1999. Internal document.

[25] NOKIA. Nokia Siemens Networks Flexi Base Station Controller (Flexi BSC), 2009. Datasheet.

[26] NOKIA NETWORKS OY. SYSTRA GSM System Training. Nokia Networks Oy, 2000.

[27] PENTTINEN, J. GSM-tekniikka: Järjestelmän toiminta ja kehitys kohti UMTS-aikakautta. WSOY, 2002.

[28] PERSSON, C., AND YILMAZTÜRK, N. Establishment of automated regression testing at ABB: Industrial experience report on 'avoiding the pitfalls'. Proceedings of the 19th International Conference on Automated Software Engineering (2004), 112–121.

[29] ROTHERMEL, G., UNTCH, R. H., CHU, C., AND HARROLD, M. J. Prioritizing test cases for regression testing. IEEE Transactions on Software Engineering 27, 10 (2001), 929–948.

[30] TELSTRA. It’s time to say goodbye old friend, 2014. URL: http://exchange. telstra.com.au/2014/07/23/its-time-to-say-goodbye-old-friend/, referenced 7.4.2015.

[31] TERVO, B. Standards for test automation. Proceedings of STAR East (2001).

[32] WEBDRIVER.IO. Page object pattern. http://webdriver.io/guide/testrunner/ pageobjects.html, referenced 27.3.2018.

[33] WHITTAKER, J. What is software testing? and why is it so hard? IEEE Software 17, 1 (2000), 70–79.

[34] WIEGERS, K., AND BEATTY, J. Software Requirements. Pearson Education, 2013.

[35] WIKLUND, K., ELDH, S., SUNDMARK, D., AND LUNDQVIST, K. Technical debt in test automation. IEEE Fifth International Conference on Software Testing, Verification and Validation (2012), 887–892.