LiU-ITN-TEK-A--10/035--SE

Real time sampling of utilization at Ericsson Test Plants

Marky Egebäck Sebastian Lindqvist

2010-06-10

Department of Science and Technology, Linköping University, SE-601 74 Norrköping, Sweden

Real time sampling of utilization at Ericsson Test Plants

Master’s thesis in Communication and Transport Systems at the Institute of Technology, Linköping University

Marky Egebäck, Sebastian Lindqvist

Supervisor: Torbjörn Wikström
Examiner: Di Yuan

Norrköping, 2010-06-10

Copyright

The publishers will keep this document online on the Internet - or its possible replacement - for a considerable time from the date of publication barring exceptional circumstances. The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for your own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility. According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/

© Marky Egebäck, Sebastian Lindqvist

Abstract

This master’s thesis has been written within the field of Electrical Engineering at the Department of Science and Technology, Linköping University. The work was carried out at Ericsson’s site in Linköping during the spring of 2010.

The purpose of this master’s thesis was to construct a model that could capture and present the utilization rate of test equipment at a telecom company in general. Since this field has not been studied much in the past, it was decided to study a model from the production industry and to reuse some of its basic ideas.

From this generic model a recommendation is given as to how the model could be used by implementing a Common Utilization Tool, which could be used to store, configure and present utilization data from all types of equipment in Ericsson’s test environment. This Common Utilization Tool will use measurement modules that both collect and classify the state of the equipment and deliver the result to a common database.

For this Common Utilization Tool a measurement module has been implemented that samples the state of Base Station Controllers (BSC) in Ericsson’s test environment: used, unused or down. The implementation is also validated against real measured data from testers to determine whether the results are accurate.


Acknowledgments

We would like to express our deepest gratitude to our supervisors Torbjörn Wickström at Ericsson AB, and David Gundlegård and Di Yuan at the Department of Science and Technology at Linköping University. Without your knowledge, support and helping hand we would never have been able to complete this thesis.

A special thanks to Thomas Thunell, Anders Hollstedt, Jonas Madsen and the rest of the ATD Team for answering all our LTE Util Tool and THC questions. Without you guys the work with the BSC utilization module, which is a big part of our work, would not have been performed as smoothly.

We would also like to thank the rest of the people who helped us create the BSC utilization module: Ulf Arkad, Tomi Ojala Carlbergh, Jens Lindberg and Samka Nyberg.

Another person we would like to thank is Liz Foxbrook, for greatly improving the language in the report.

Last but not least we would like to thank the people at our section and department who have given us so much support and encouragement during the work on this thesis.

Linköping, May 2010.

Marky Egebäck and Sebastian Lindqvist


Contents

1 Introduction
  1.1 Background
  1.2 Purpose
  1.3 Objectives
  1.4 Method
  1.5 Scope
  1.6 Confidentiality
  1.7 Outline
  1.8 The GSM Organization
  1.9 Testing
  1.10 Ericsson test environment - BETE

2 Frame of reference
  2.1 OEE - Overall equipment efficiency
    2.1.1 Availability Efficiency
    2.1.2 Operational Efficiency
    2.1.3 Rate Efficiency
    2.1.4 Quality Efficiency
    2.1.5 Applications of OEE
    2.1.6 Limitations of OEE
  2.2 Performance Management
  2.3 Sampling theory
    2.3.1 Sampling methods
    2.3.2 Sampling period
      2.3.2.1 Normal distribution
      2.3.2.2 Binomial distribution
  2.4 Measurement process
    2.4.1 Measures
    2.4.2 Activities in the measurement process
      2.4.2.1 Establish and sustain measurement commitment
      2.4.2.2 Plan the measurement process
      2.4.2.3 Perform the measurement process
      2.4.2.4 Evaluate measurement
    2.4.3 The measurement information model

3 GSM
  3.1 GSM specifications
    3.1.1 GSM Phases
    3.1.2 Services in GSM
  3.2 GSM Network Architecture
    3.2.1 Radio Subsystem
    3.2.2 Mobile Station
    3.2.3 Base Station Subsystem
    3.2.4 Network and Switching Subsystem
    3.2.5 Mobile Switching Center (Mobile Services Switching Center)
    3.2.6 SMSC
  3.3 GSM Areas
  3.4 Databases and Registers
    3.4.1 Home-location-register (HLR)
    3.4.2 Visitor-location-register (VLR)
  3.5 Operations Support Subsystem
    3.5.1 Operations And Maintenance Center
      3.5.1.1 Telecommunications management network
    3.5.2 Authentication Center
    3.5.3 EIR
  3.6 Radio interface
    3.6.1 Logical Channels
      3.6.1.1 Traffic Channels
      3.6.1.2 Control Channels
      3.6.1.3 GSM Mapping
  3.7 Protocols in GSM
  3.8 Addressing and localization in GSM
    3.8.1 International Mobile Subscriber Identity (IMSI)
    3.8.2 Temporary mobile subscriber identity (TMSI)
    3.8.3 Local Mobile Subscriber Identity (LMSI)
    3.8.4 Mobile Station (or Subscriber) ISDN Number (MSISDN)
    3.8.5 The Mobile Station Roaming Number (MSRN)
    3.8.6 International mobile station equipment identity (IMEI)
  3.9 Data services
    3.9.1 GPRS
      3.9.1.1 SGSN
      3.9.1.2 GGSN
      3.9.1.3 Location management
    3.9.2 EDGE

4 GSM Evolutions
  4.1 WCDMA
    4.1.1 System and network architecture of WCDMA
  4.2 LTE
    4.2.1 System and network architecture of LTE/SAE

5 Current Solutions
  5.1 STP Utilization tool
    5.1.1 Definitions
    5.1.2 Data collecting
    5.1.3 Data presentation
    5.1.4 Evaluation of the tool
  5.2 Utilization tool for the eNodeB in LTE
    5.2.1 Definitions
    5.2.2 Data collection
    5.2.3 Data presentation
    5.2.4 Evaluation of the tool
  5.3 Ericsson Real Utilization Measurement Solution (ERUMS)
    5.3.1 Definitions
    5.3.2 Data collecting
    5.3.3 Data presentation
    5.3.4 Evaluation of the tool
  5.4 ENSIEM adaption for node utilization
    5.4.1 Definitions
    5.4.2 Data collection
    5.4.3 Presentation
    5.4.4 Evaluation of the tool
  5.5 Booking degree as utilization measure
  5.6 Other test efficiency indicator
    5.6.1 Fault-slip-through

6 General model for utilization measurements
  6.1 Efficiency indicators for test equipment
  6.2 Equipment Utilization Efficiency
  6.3 The state of the test equipment
    6.3.1 Measurement methods
    6.3.2 Classification of equipment state
    6.3.3 Time resolution

7 Common Utilization Tool
  7.1 Schematic model for a general utilization tool
  7.2 Modules
    7.2.1 Time resolutions of measurements
  7.3 Database
  7.4 Common configuration layer
  7.5 Common presentation layer
  7.6 KPI reports presentations layer

8 BSC Utilization Module
  8.1 Background
    8.1.1 Type of test cases
  8.2 Pre-study
    8.2.1 Equipment states
    8.2.2 Possible measure points
      8.2.2.1 Capture real user traffic
      8.2.2.2 Capture operations and maintenance traffic
      8.2.2.3 Energy consumption
      8.2.2.4 Measuring inside the node
      8.2.2.5 Indirect measuring points
    8.2.3 Choice of measuring point
  8.3 Implementation
    8.3.1 Base measures
    8.3.2 Code structure
      8.3.2.1 THC Test Case
      8.3.2.2 BSC Utilization Test Code
      8.3.2.3 Database
  8.4 Collected data and classification of BSC state
    8.4.1 Classification of the equipment state
    8.4.2 Validation of the classification
    8.4.3 Samples or Counters
    8.4.4 Classifying Function Test

9 Utilization modules for other equipment
  9.1 UE simulators
    9.1.1 UE simulator 1
    9.1.2 UE simulator 2
    9.1.3 Conclusions for the UE simulators
  9.2 Protocol analyzers
    9.2.1 Tektronix K15
    9.2.2 Nethawk M5
    9.2.3 Proposed solution for packet analyzers

10 Discussion
  10.1 Possibilities and potentials of equipment utilization measurements
  10.2 The value of a Common Utilization Tool
  10.3 Weakness of the BSC Utilization Module
  10.4 Future work with the BSC Utilization Module
  10.5 Future work with the Common Utilization Tool
  10.6 Future work on the utilization modules for other equipment
  10.7 Future work in the test environment

11 Conclusion

Bibliography

A Test Harness Core (THC)
  A.1 Definitions in THC
  A.2 System overview and concept
    A.2.1 Resource Factory (RF)
    A.2.2 Test Execution System (TES)
    A.2.3 Test Tool Middle Ware Subsystem (TTMW)
    A.2.4 Log Service

B Ericsson’s Base Station Controller (BSC)
  B.1 Base Station System (BSS)
    B.1.1 TRC
  B.2 BSC Products
  B.3 BSC Hardware and Subsystems
  B.4 APZ Control System
    B.4.1 Central Processor
    B.4.2 Regional Processor
    B.4.3 Adjunct Processor Group
      B.4.3.1 STS
  B.5 OM interfaces
  B.6 Man-Machine Language (MML)
    B.6.1 Command structure

Acronyms and glossaries

List of Figures

1.1 The steps in a project

2.1 The different states of the equipment
2.2 The percentage error as a function of the number of samples
2.3 Information model of the measurement process

3.1 GSM Network Architecture
3.2 Flow chart of the activities in the ME
3.3 TMN elements and the connections in the TMN model
3.4 GSM normal burst structure
3.5 GSM frame structure
3.6 The traffic multiframe used in GSM
3.7 GSM signaling protocol structure
3.8 Architecture of a GPRS system

4.1 Architecture of a WCDMA Network
4.2 Architecture of an LTE system

5.1 The second page of the web-GUI that shows the utilization during a time period of 6 hours
5.2 The main page of the web-GUI that shows the utilization for one day
5.3 Diagram over data collection
5.4 Screenshot of the presentation of utilization
5.5 ERUMS schematic system description
5.6 Screenshot of the presentation using pChart
5.7 Screenshot of the presentation of utilization
5.8 Definition of fault-slip-through

6.1 The OTEE concept, which puts Equipment Utilization in a context
6.2 Sampling a binary signal that describes the equipment state

7.1 Schematic model over a general utilization tool
7.2 Database structure for the Common Utilization Tool
7.3 The main view in ECUT
7.4 The resource view in Common Utilization Tool

8.1 The BSC utilization module’s executing environment
8.2 Visualization of the collected record
8.3 Histogram with the number of typed MML commands when there is no traffic
8.4 Visualization of classified record

A.1 Test Harness Core (THC) system components (used with permission by Jonas Madsen, Ericsson AB)
A.2 Resource Manager (used with permission by Jonas Madsen, Ericsson AB)
A.3 The ATE GUI
A.4 Log Session View in THC
A.5 Log Record View in THC

B.1 The different BSC configurations
B.2 The BSC components
B.3 Possible connections to the BSC OM interfaces

List of Tables

2.1 Performance metrics
2.2 Error for different number of samples when p=0.5
2.3 Measurement information model and example [14]

3.1 Logical layers in TMN

5.1 Example table over slip through data for each phase

6.1 Equipment efficiency indicators
6.2 General equipment states
6.3 Error in indicator when sampling equipment state

7.1 Table: resource
7.2 Table: resource_utilization
7.3 Table: site
7.4 Table: resource_type
7.5 Table: resource_group
7.6 Table: group

8.1 Examples of records collected with BSC utilization module

9.1 Example of base measures of 15 Nethawk M5 Servers

Chapter 1

Introduction

1.1 Background

Ericsson AB is a world-leading company in telecommunications. The company develops mobile telecommunication systems for Global System for Mobile Communications (GSM), Wideband Code Division Multiple Access (WCDMA) and Long Term Evolution (LTE), and also provides services, features and upgrades to these systems. The cost of solving a problem in a product is lower the earlier in development the problem is found. That is why Ericsson carries out comprehensive tests during the development of a product. The tests have to be carried out in an environment that corresponds to the one in which the products will be used, i.e. a network configuration similar to the operator’s network, the operator generally being the customer of the product.

In the world of mobile telecommunications new technologies are rapidly being released, while previous technologies are still being developed and used. Ericsson’s range of products has therefore increased constantly. New products are required to function alongside previous technologies, since operators want to reuse the old systems in order to minimize the cost of new investments. Detecting and correcting problems in products requires test equipment, and the demand for test equipment has therefore also grown constantly. This equipment is very expensive, and many of the products are the same ones that Ericsson sells to its customers.

Efficient use of the test equipment is very important: it improves the quality of the products, since more tests can be performed, and it can reduce costs, since new investments may not be needed. To increase the efficiency, tools that can measure the utilization of the test equipment are needed. At present Ericsson does not have a generic model for measuring the use of equipment in the test environment. Such a tool can be of great help when deciding whether new investments in test equipment are needed or whether existing equipment can be used more efficiently.

Today there exist a few tools that can measure the utilization of the equipment, but they cover only a small part of the equipment, and they are not consistent in how they define and measure utilization. When the usage of equipment is evaluated, the evaluation is often based on the degree of booking, since all equipment has to be booked and booking is defined in the same manner for all equipment. The topic of creating a model to measure the utilization of the test equipment in the BUGS Ericsson Test Environment lab has been discussed for a long time at Ericsson. As mentioned earlier, test equipment is very expensive and is therefore a resource that must be utilized as much as possible.


Employees performing tests cannot report the utilization themselves, since there is too much equipment and too many test activities being carried out. Furthermore, it is not efficient to monitor usage manually, so the measurements need to be performed automatically. A valid model that presents the utilization of the equipment could allow better scheduling and planning of the test activities.

1.2 Purpose

The purpose of this master’s thesis is to construct a model for measuring the utilization of test equipment at Ericsson’s test plants, mainly at the test site in Linköping. Since a utilization figure is not useful without a clear definition of what the term means, the first part of the study will try to formulate a generic definition for all types of test equipment. During this work the equipment will be studied in detail to find efficient and feasible ways of conducting these utilization measurements. The study ends by creating a prototype that can be used to measure and present the utilization rate of the devices that the master’s thesis focuses on.

1.3 Objectives

A definition of utilization is needed for the model. The definition should be constructed to meet the information requirements (needs) of the end users. The definition also needs to be suitable for the different types of equipment, i.e. the model should be generic for all equipment. The term generic also implies that the model presents a utilization rate that is comparable across all types of equipment. How the data is sampled and collected will have to be adjusted depending on the equipment. The sampling period should be as short as possible, subject to minimizing the interference with the actual tests and to the limitations in data storage and representation. The thesis will also suggest how the data should be transferred from the equipment, as well as stored and presented, in a safe and time-efficient way. If software has to be installed in any part of the equipment, this has to be done with the safety issues taken into consideration.

1.4 Method

Articles, literature and other master’s theses will first be studied to obtain a theoretical background for the thesis. The architecture and system overviews of GSM, WCDMA and LTE will be studied and presented. The theoretical part will also contain theories about utilization and about sampling and capturing of data. Any previous work directly related to the subject of this thesis will also be studied. It is important to understand how Ericsson has technically implemented the components of the mobile systems, in order to get an overview of the range of test equipment and a deep knowledge of how the testing process is carried out. This has to be done within Ericsson, mainly by interviewing the employees and surveying the test environment. Both formalized interviews and discussions will be conducted with the individuals involved with the testing and the test equipment. The test environment and the equipment will also be studied in sufficient detail.

The concept will be evaluated by implementing a prototype based on the general model. The implementation will be validated against real, manually collected, utilization data from the testers of the equipment.

1.5 Scope

Since time is limited for this master’s thesis, the implementation part of the model for measuring utilization will be limited to a subset of the equipment in the testing environment. We will focus on one of the Linköping site’s most important pieces of equipment: the BSC. The study will also investigate how measurements can be carried out on other types of equipment. The main reasons for choosing the BSC are that it is based on older technology, which does not use IP for delivering user data, and that previous studies on this platform have shown that it is difficult to determine its usage.

1.6 Confidentiality

Some parts of the thesis are considered to be of a confidential nature by Ericsson and have therefore been edited to hide the sensitive details. The values of the resources’ utilization rates have been changed; this does not affect the thesis, since it is not the specific values that are of interest but rather how utilization is defined, collected and presented. Also, some of Ericsson’s suppliers and their tools that are used at Ericsson’s test plants have been denoted supplier 1, 2 and simulator 1, 2.

1.7 Outline

After this introductory chapter we will present the frame of reference that we have found in this area. In that chapter we will give a brief introduction to Overall Equipment Efficiency, performance measures, sampling theory and a general measurement process. Since this report focuses on GSM test equipment we will, in Chapter 3, present the basic concepts of the GSM network and briefly describe the components in WCDMA and LTE. These chapters provide the background that was used to create the general model of how Ericsson should work with utilization data for its test equipment. The generic model is then presented in Chapter 6. The next chapter presents the authors’ recommendations as to how a common utilization tool for storing, configuring and presenting utilization data from this test equipment should be organized, together with the necessary collector modules. The report continues with a description of the implementation of a Base Station Controller utilization measurement module. In that chapter we also present the working process and the analysis of the output from the module. The last part of the report contains a discussion and the conclusions of our study, and presents how the work with a common utilization tool could continue.

1.8 The GSM Organization

At Ericsson in Linköping software is being developed for LTE and GSM. Within GSM, the BSC and the BSS are developed in separate departments bearing these names [4]. In this section the organization and project structure are briefly described. The departments have different functions in the GSM design projects, or have support and maintenance functions. The test equipment that is needed and the type of test cases that are carried out differ between the departments.

Ericsson is organized into four main business areas: CDMA Mobile Systems, Global Services, Multimedia and Networks. Within the business area Networks there is a development unit for mobile radio networks named DU Radio. The next level down in the organizational hierarchy is the Product Development Unit (PDU), of which GSM RAN is one. PDU GSM RAN has the following departments that use real Base Station Controllers (BSC) in their operations at the test site in Linköping:

• BSS & BSC System is technically responsible for BSS and BSC.
• BSS & BSC I&V is responsible for BSS and BSC integration and verification.
• BSC Design contributes to the development of new features and products in GSM/GPRS/EDGE. The department designs, implements, function tests and maintains the features and products.
• BSS & BSC PLM provides third-line support for BSC software projects, from the time the software becomes Generally Available to all customers until Ericsson’s maintenance responsibility expires. The department is also responsible for packaging software upgrades from design to customer.

1.9 Testing

Testing is a main part of the development of new products in mobile networks. It is important that the product is tested in real networks that correspond to the networks of the customers, and under conditions similar to those in which it will be used. In this section the test process within a product development project is described. When a problem with a product is discovered during a test, a Trouble Report (TR) is written to the person responsible for the design of the code that caused the problem. There are four levels of TR: A, B, C and D. The level of the TR specifies how severe the consequence of the fault is. A TR with level A denotes that the fault has to be solved before any test work can continue.

[Figure 1.1. Test steps, in order: Function Test (FT), System Integration Test (SIT), System Robustness Test (SRT), System Verification Test (SV), First Office Application (FOA). Responsible parties, in the same order: BSC Design, BSC&BSS I&V, BSC&BSS System, BSC&BSS I&V, FOA customers. Milestones: Product Introduction Complete (PIC), Ready For Acceptance (RFA), General Availability (GA).]

Figure 1.1. The steps in a software development project

Figure 1.1 shows a normal product software development project within DU RAN. The steps are described below:

• Function Test. The function is tested independently. Almost all function testing is carried out on simulated or emulated hardware at BSC Design.
• System Integration Test. Functions are integrated and tested together.
• System Robustness Test. This is an early load test and early system verification. After this test the product is accorded the status PIC (Product Introduction Complete) and no more software code can be added.
• Feature Test. A specific feature is tested in a real environment at BSS I&V, with the focus on whether the feature works as intended.
• System Verification Test. After the PIC approval, BSS I&V carries out a full-scale system verification test, after which the product obtains the status RFA (Ready For Acceptance).
• First Office Application. In this stage the product is released to FOA customers, who carry out validation tests and in return receive a discount on the product.

After the FOA the product obtains the status GA (General Availability), which means that it is released on the market. When a product obtains the status GA, there cannot be a TR with level A from earlier tests.

1.10 Ericsson test environment - BETE

The organization that owns, runs and configures the test environment is called BETE (BUGS Ericsson Test Environment). Earlier it was a separate company, but it is now part of Ericsson. The test plants are spread around the world. In Sweden the larger test plants are in Kista Stockholm, Linköping, Gothenburg, Karlskrona and . The users of the test equipment, the development projects, book the test equipment and pay for the time they have used it. The relationship between BETE and the projects is a supplier-customer relationship, which is intended to create a situation where the projects do not book more equipment than necessary and where BETE does not purchase equipment that will be unused.

BETE purchases equipment based on forecasts of the demand for equipment from the customers (development projects). Equipment is depreciated over three years, and the depreciation is one part of BETE’s expenses. Another part of the expenses is salaries for the workforce that administers, sets up, maintains and supports the equipment. Other expenses are guarantees, which are direct costs that cannot be depreciated. When a new investment in equipment is made, the depreciation of the equipment is only one part of the total expenses. After three years, when the depreciation of the equipment is complete, there are still costs for the equipment in terms of workforce salaries and guarantees. To reduce the expenses for BETE, investments have to be decreased and some of the existing equipment has to be scrapped.

The equipment demand of each project is rather unique. A specific configuration of hardware and software is needed that suits the tests that will be carried out. Therefore an STP (System Test Plant) is constructed that contains one BSC in GSM, or one Radio Network Controller in WCDMA, and the other components needed for the tests. The other components can differ a great deal and can include a number of Base Transceiver Stations and traffic generators. One STP is configured to satisfy the requirements of a project.

The STP is then booked in a system called BAMS, and the payment is calculated from the booking time in the system. In BAMS, STPs and other equipment are booked with a minimum booking time of less than one hour. Since a project lasts for several months, one STP is often booked for the same time period. When BETE needs to carry out maintenance, BAMS allows a booking for such events, even though this is not widely used.
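The cost structure described above can be sketched numerically. The following is only an illustration, assuming straight-line depreciation (the text states only that equipment is depreciated over three years) and made-up amounts; the function name `yearly_cost` and all figures are hypothetical.

```python
DEPRECIATION_YEARS = 3  # from the text: equipment is depreciated over three years


def yearly_cost(purchase_price: float, running_cost_per_year: float, year: int) -> float:
    """Cost of one piece of equipment in a given year (year 1 = purchase year).

    Assumption: straight-line depreciation. The running cost stands for
    workforce salaries and guarantees, which continue after the equipment
    has been fully depreciated.
    """
    if year < 1:
        raise ValueError("year must be 1 or greater")
    depreciation = purchase_price / DEPRECIATION_YEARS if year <= DEPRECIATION_YEARS else 0.0
    return depreciation + running_cost_per_year


# Hypothetical amounts, chosen only to show the shape of the cost curve.
costs = [yearly_cost(300_000, 40_000, year) for year in range(1, 6)]
```

Under these assumptions the yearly cost drops after year three but never reaches zero, which is why scrapping unused equipment reduces expenses even when it is fully depreciated.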

Chapter 2

Frame of reference

Chapter Introduction

This chapter contains the results of the study of literature, articles, books and other documentation used in this thesis. The theories, concepts and methods form the foundation for the general model presented later in Chapter 6.

2.1 OEE - Overall equipment efficiency

The concept of OEE was proposed by S. Nakajima in "Introduction to TPM: Total Productive Maintenance" in 1988. The definition and measurement of equipment productivity has been developed by Semiconductor Equipment and Materials International (SEMI) [8]. OEE is used by companies in the production industry to measure how efficiently the production equipment is utilized. It consists of four ratios concerning the efficiency of equipment, which are multiplied together to give a total measure of how efficiently the equipment is used. Separating the efficiency measure into four ratios makes it easier to see where action should be taken to increase the overall equipment efficiency.

Loading time is defined as the time the equipment is planned to be used. Weekends and holidays are subtracted from the total time to obtain the loading time. The authors of [23] suggest that total time be used instead of loading time. Total time is all available time, 8760 hours per year, which is the maximum potential time equipment can be used. However, using all 8760 hours per year can be questioned. There can be legal restrictions that limit the number of hours per day that can be used for production. It can also be economically inefficient when demand is too low to motivate a night shift in the production. In these and similar cases it can be argued that this time should be subtracted from the total time. When total time is used, OEE can be referred to as TEEP (Total Equipment Efficiency Performance).
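The difference between total time and loading time can be illustrated with a small calculation. The sketch below is only an illustration: the helper name `loading_time_hours` and the counts of 104 weekend days and 10 holidays are assumptions, not values taken from [23].

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365


def loading_time_hours(weekend_days: int, holidays: int) -> int:
    """Loading time: total time minus weekends and holidays (hypothetical helper)."""
    non_scheduled_days = weekend_days + holidays
    if not 0 <= non_scheduled_days <= DAYS_PER_YEAR:
        raise ValueError("non-scheduled days must be between 0 and 365")
    return (DAYS_PER_YEAR - non_scheduled_days) * HOURS_PER_DAY


# Total time is every hour of the year, as stated in the text.
total_time = DAYS_PER_YEAR * HOURS_PER_DAY

# Assumed calendar: 104 weekend days and 10 holidays are not scheduled.
loading_time = loading_time_hours(weekend_days=104, holidays=10)
```

With these assumed counts the loading time is 6024 hours against a total time of 8760 hours, so the same amount of production time gives a lower figure when it is related to total time than when it is related to loading time.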


To calculate OEE the status of the equipment has to be monitored over time. The status of the equipment can be defined as in Figure 2.1 [23].

[Figure 2.1. Total time is divided into the non-scheduled state, the scheduled down state, the unscheduled down state and equipment uptime. Equipment uptime consists of the engineering state, the standby state and the productive state.]

Figure 2.1. The different states of the equipment

The non-scheduled state is when the equipment is not intended to be used. This can be due to weekends or holidays. The scheduled down state is when the equipment cannot be used because of maintenance and setup time. Down time that occurs unexpectedly is categorized as the unscheduled down state. The engineering state is when experiments are performed on the equipment to improve its performance. When the equipment is up but not operating it is in the standby state. The reason for not operating can be missing operators or lack of raw material. An operator can be missing because of breaks, lunches or meetings. The productive state is when the equipment is producing items as intended. The four underlying metrics for OEE are defined by SEMI in document E79-0200 [8]:

$$\text{OEE} = \text{Availability} \times (\text{Operational} \times \text{Rate}) \times \text{Quality} \tag{2.1}$$

2.1.1 Availability Efficiency

$$\text{Availability Efficiency} = \frac{\text{Equipment Uptime}}{\text{Total Time}} \tag{2.2}$$

The ratio shows the time the equipment is available for use compared to the total available time. Non-scheduled time is the time where the equipment is not scheduled to be used. This is lost time since it could be used for production. Both scheduled and unscheduled down time decrease the availability of the equipment. Unscheduled down time can be repairs and scheduled down time can be maintenance of the equipment.

2.1.2 Operational Efficiency

$$\text{Operational Efficiency} = \frac{\text{Production Time}}{\text{Equipment Uptime}} \tag{2.3}$$

The production time is the time during which the equipment is carrying out the activity it is intended for. This is compared with the potential maximum production time, which is the equipment uptime. Time without production can be due to a lack of material, to no operator being available or to engineering experiments being conducted. The operational efficiency and the rate efficiency are often combined into one indicator called Performance Efficiency.

2.1.3 Rate Efficiency

$$\text{Rate Efficiency} = \frac{\text{Theoretical Production Time}}{\text{Production Time}} \tag{2.4}$$

The rate is the speed of the production. A theoretical production time has to be calculated, which is the lowest possible time for producing the actual number of items. It is then compared to the real production time for the items. A lower production rate gives fewer produced items, which lowers the efficiency.

2.1.4 Quality Efficiency

$$\text{Quality Efficiency} = \frac{\text{Acceptable units}}{\text{Units started}} \tag{2.5}$$

The Quality Efficiency indicator illustrates inefficient equipment usage due to low quality of the items. If the quality of a unit is lower than the acceptable limit the item is rejected. In some equipment the production of an item is aborted before it is finished if the quality is unacceptable, in order to prevent unnecessary processing. The indicator does not capture the equipment time saved in this way, since it regards all started items and finished items as having the same production time.
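The four ratios in Equations 2.2 to 2.5 can be combined as in Equation 2.1. The following is a minimal sketch in Python, assuming that all times are measured in the same unit. The function name `oee`, its parameter names and the example figures are illustrative assumptions and are not taken from SEMI E79.

```python
def oee(total_time, equipment_uptime, production_time,
        theoretical_production_time, units_started, acceptable_units):
    """Return the four OEE ratios and their product (Equations 2.1 to 2.5).

    All times must use the same unit, for example hours. The parameter
    names are illustrative assumptions, not names defined by SEMI.
    """
    availability = equipment_uptime / total_time
    operational = production_time / equipment_uptime
    rate = theoretical_production_time / production_time
    quality = acceptable_units / units_started
    return {
        "availability": availability,
        "operational": operational,
        "rate": rate,
        "quality": quality,
        "oee": availability * operational * rate * quality,
    }


# Hypothetical week: 168 h total time, 140 h uptime, 112 h producing,
# 100.8 h theoretical production time, 950 of 1000 started units accepted.
result = oee(168, 140, 112, 100.8, 1000, 950)
```

Keeping the four ratios in the result, rather than only the product, preserves the property mentioned above that the individual ratios show where action should be taken.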

2.1.5 Applications of OEE

The main purpose of OEE is to provide a comprehensive measurement of the equipment efficiency. The method tries to cover all the factors that affect the efficiency, including factors that are independent of the equipment itself, as discussed in [8]. For example, OEE will decrease if there is a lack of input material, which improvements in the equipment cannot influence. If specific equipment is considered to be a bottleneck, the OEE indicator will show whether improvements in the equipment are possible. The indicator will also highlight the area in which action should be taken.

OEE can also be of great help in investment decisions, as discussed in [15]. If a company has low equipment efficiency and no performance indicators, it is likely to see no other solution than new investments to handle capacity problems. With the use of OEE, existing equipment and plant can be evaluated and improved before new investments are considered.

2.1.6 Limitations of OEE

The OEE formula contains an estimate of the theoretical production time per unit. This estimate can be a source of error, since a theoretical production time is often difficult to estimate and includes subjective evaluations. Another objection to OEE is that almost all types of production equipment have to be down for maintenance and repair for some period of time. As a result OEE may not be able to reach 100 % in practice, and it is not certain that the value itself is a correct and fair measurement of the overall equipment efficiency; it can be hard to establish the practical upper limit of OEE. Nevertheless, OEE can be useful as an index for comparing the efficiency of equipment before and after an improvement. It is also still useful for pointing out where improvements should be carried out.

An additional limitation of OEE is the quality of the collected data. Data has to be collected regularly and the equipment state has to be detected from the data. The variables that vary over time need to be sampled: downtime, set-up time, production time per unit, number of items started and number of items accepted. The theoretical production time per unit is constant and therefore needs to be set only once. This is the data needed for calculating OEE according to the definitions above. Errors and bias in the collection and interpretation of these variables will give errors in the OEE value.

2.2 Performance Management

To evaluate the performance of a complex organization, quantitative measurements are needed. Many companies use measurements called indicators to evaluate performance. The indicators are values based on data collected from within the organization. The measurements are often presented to the management on a dashboard or scoreboard, at a strategic or tactical level. The way companies use performance measurements differs a lot, and all companies have their own interpretations and implementations. In this section three types of performance indicator are defined and discussed. The three types of performance measure that can be used in an organization are [20]:

1. Key Result Indicators (KRI).

2. Performance Indicators (PI).

3. Key Performance Indicators (KPI).

KRI measures how well you have done in the past [20]. It is measured over a period of a month or a quarter, and there should not be more than ten to twenty KRIs. Examples of KRIs are customer satisfaction, market share and profit. However, the measurement does not show which actions should be carried out in order to improve performance in the future. That is the big difference compared to a KPI, which indicates what should be done to improve performance. When the unit of a measure is money it is likely that the measure is a KRI, since the profit or return of an investment shows the outcome of an action and not the actions that need to be initiated. A report including the KRIs is suitable for a board or management responsible for strategy decisions.

PI is an indicator that tells you what to do to increase performance [20]. Compared to a KRI it focuses on one particular area of performance, and PIs can be both strategic and tactical. Organizations can have up to 80 PIs and they should be well defined. Examples of PIs are, for an airline, the percentage of lost luggage and, for a hospital, the percentage of patients infected after surgery. A PI gives a clear view of what needs to be done to increase performance, but it is limited to one area and is not crucial for the overall strategy decisions.

KPIs show you what to do to increase performance dramatically [20]. They should be monitored on a regular basis, since they are the most interesting indicators from a management point of view. If an indicator is calculated only every month or quarter it cannot be of such great interest that it qualifies as a KPI. Examples of KPIs are the number of patients waiting for treatment at a hospital or the average delay in minutes for an airline. Both these examples show what should be done to improve performance. The hospital needs to lower the number of patients in the queue waiting for treatment. This gives a domino effect in performance, since the quality for the patient increases with shorter waiting time. The patients will also have less risk of complications since they can be treated earlier. For the airline, late planes mean higher costs, lower customer satisfaction and higher fuel consumption. As these two examples show, a real KPI should affect several of the critical success factors (CSF) and give clear information regarding intervention. In the literature, authors have suggested up to 10 or 20 KPIs per organization [20]. It is rarely necessary, or even possible, to have more than 10 KPIs. The number of PIs is of course much higher.

Table 2.1. Performance metrics

| Metric | Number | Monitored | Definition |
|--------|--------|-----------|------------|
| KRI | 10 | Monthly | Tells you what you have done |
| PI | 80 | Daily | Tells you what to do |
| KPI | 10 | Daily | Tells you what to do to dramatically increase performance |

In Table 2.1 the performance indicators are shown together. When working with KPIs it is important that they are introduced carefully and that time is allowed for evaluation. After some period of time and evaluation it is possible that the KPIs need to be modified. Since the number of KPIs should be small it is important to evaluate their use, and if they are not used their production should be stopped [17].

2.3 Sampling theory

Sampling defines how a representative subset of observations can be chosen from a total population of observations. There can be various reasons for taking a subset of the data instead of collecting the entire population. The population can be large, which makes it time and resource consuming to capture it all. It may also be that the measuring interferes with the objects, and by using sampling the interference with the population is minimized. A sample should be representative of the population, and if the number of samples is too small it will not capture the main characteristics of the population. Consequently there is a tradeoff between trying to capture the essence of the population and minimizing the number of samples. Sampling should define the quantity, frequency and location of the data to be sampled [24]. It is also used in telecommunication and signal processing when converting a continuous signal into a numerical sequence.

The population can be a set of objects or a variable that changes over time. In a manufacturing factory a sample of the produced items can be tested to ensure that the batch has good quality. It may not be efficient to test all the produced items, or not even possible if the quality tests consume the items. In that case it is important that the samples are chosen in such a way that they are a representative subset of all produced objects. An example of a continuous variable that is sampled is the speed of a vehicle, which is fed into the speedometer. The speedometer shows an instantaneous value of the speed and can be sampled to get an average speed over a period of time. The average speed will be more accurate if the signal is sampled with higher frequency [19].

If the speedometer is an instantaneous sample type, the trip meter is a cumulative sample type. The cumulative type can also be called a counter, and it adds values to a variable over time. The counter can be zeroed at some event, for instance when a restart occurs. The counter provides information about what has happened between two samples of a variable, since the exact increase or decrease can be calculated if the counter has not been reset. How the counter changed within the time period cannot be known, unless the function of the variable is known in advance.

2.3.1 Sampling methods

Data can be sampled either event driven (pushed) or time driven (pulled). Event driven sampling can mean that the sampling frequency is dynamic and changes due to some event. Event driven sampling can also include information on how to react when certain events occur [19]. Rules can be set up for how to change the sampling plan under special circumstances.

Pulled sampling is much more common, especially systematic sampling, where the sampling is conducted systematically. Sampling variables in the time domain means sampling with a constant time interval and frequency. If the population is a set of objects, systematic sampling can, for example, be to sample every tenth object in an order. However, if the objects are arranged in a systematic way, this sampling method risks giving samples that are not representative of the population. In some cases this can be dealt with by sampling randomly, although that also has drawbacks. The most suitable sampling method depends on the characteristics of the population.
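The risk of systematic sampling in a systematically arranged population can be illustrated with a small sketch. The population below is hypothetical: every tenth item is defective, so sampling every tenth object finds either no defects or only defects, depending on where the sampling starts. The function names are illustrative.

```python
import random

# Hypothetical population: 1000 items where every tenth one (index 9,
# 19, ...) is defective (1) and all others are good (0).
# The true share of defective items is 0.1.
population = [1 if i % 10 == 9 else 0 for i in range(1000)]


def systematic_sample(items, step, offset=0):
    """Pick every `step`-th item, starting at index `offset`."""
    return items[offset::step]


def defect_share(sample):
    """Share of defective items in a sample."""
    return sum(sample) / len(sample)


# The sampling step coincides with the period of the population.
share_offset_0 = defect_share(systematic_sample(population, 10, offset=0))
share_offset_9 = defect_share(systematic_sample(population, 10, offset=9))

# Simple random sampling of the same size (seed fixed for repeatability).
rng = random.Random(1)
share_random = defect_share(rng.sample(population, 100))
```

Here `share_offset_0` is 0.0 and `share_offset_9` is 1.0, although the true share is 0.1. The random sample is not tied to the period of the population, but its estimate varies from draw to draw.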

2.3.2 Sampling period

Not all measures have to be sampled at the same rate [19]. The type of measure affects the suitable sampling period. A measure of a cumulative variable that shows the average change of value over a time period does not become more accurate when more samples per time period are used. This is shown by the following. Let $X_i = X(t_i)$ be a strictly increasing function, where $X_i$ is the sampled value at $t_i$ for $i = 0, 1, \ldots, n$ and where $n$ is the number of sampling intervals. The average increase from $t_i$ to $t_{i+1}$ is $\frac{X_{i+1} - X_i}{t_{i+1} - t_i}$ and the sampling interval, $t_{i+1} - t_i = \Delta$, is constant for all $i$.

$$\frac{1}{n} \sum_{i=0}^{n-1} \frac{X_{i+1} - X_i}{t_{i+1} - t_i} = \frac{1}{n\Delta} \sum_{i=0}^{n-1} \left( X_{i+1} - X_i \right) = \frac{X_n - X_0}{n\Delta} = \frac{X_n - X_0}{t_n - t_0} \tag{2.6}$$

Equation 2.6 shows that to calculate the average change of a variable over the interval $t_n - t_0$, it is enough to sample the counter at the beginning and at the end of the interval.
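The telescoping sum in Equation 2.6 can be checked numerically. The sketch below uses a hypothetical counter read at a constant interval; the function names and the values are illustrative only.

```python
def mean_of_interval_rates(times, values):
    """Average of the per-interval rates (left-hand side of Equation 2.6)."""
    rates = [
        (values[i + 1] - values[i]) / (times[i + 1] - times[i])
        for i in range(len(values) - 1)
    ]
    return sum(rates) / len(rates)


def endpoint_rate(times, values):
    """Rate from the first and last sample only (right-hand side of 2.6)."""
    return (values[-1] - values[0]) / (times[-1] - times[0])


# Hypothetical counter read every 5 seconds (constant sampling interval).
times = [0, 5, 10, 15, 20]
counter = [100, 130, 135, 190, 220]

lhs = mean_of_interval_rates(times, counter)
rhs = endpoint_rate(times, counter)
```

Both `lhs` and `rhs` evaluate to 6.0 counts per second. The equality relies on the sampling interval being constant, as assumed in the derivation.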

When sampling a time-continuous signal into a time-discrete signal, Equation 2.7 states the sampling frequency required to capture all information in the signal:

$$f_s > 2B \tag{2.7}$$

where $f_s$ is the sampling frequency and $B$ is the bandwidth of the signal. Here the bandwidth of a signal is taken to be the highest frequency in the signal.
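What happens when the condition in Equation 2.7 is violated can be shown with a short sketch. It assumes two pure cosine signals, a choice made here only for illustration: a 3 Hz cosine sampled at 4 Hz gives exactly the same samples as a 1 Hz cosine, so the two signals cannot be told apart after sampling.

```python
import math


def sample_cosine(freq_hz, fs_hz, n_samples):
    """Sample cos(2*pi*f*t) at t = k / fs for k = 0 .. n_samples - 1."""
    return [math.cos(2 * math.pi * freq_hz * k / fs_hz)
            for k in range(n_samples)]


fs = 4.0                                  # sampling frequency in Hz
too_fast = sample_cosine(3.0, fs, 16)     # 3 Hz violates fs > 2B
slow = sample_cosine(1.0, fs, 16)         # 1 Hz satisfies fs > 2B

# Largest difference between the two sample sequences.
max_diff = max(abs(a - b) for a, b in zip(too_fast, slow))
```

The value of `max_diff` is zero apart from floating point rounding, which is the effect usually called aliasing.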

2.3.2.1 Normal distribution

Small sampling periods produce a great deal of data to be stored, but the accuracy increases with the number of samples. To illustrate the relationship between the number of samples and the accuracy of the measurement, the following example, which is partly described in [19], is presented. If the stochastic variables $X_1, X_2, \ldots, X_n$ are normally distributed and independent, with standard deviation $\sigma$ and expected value $\mu$, the estimate of the expected value, $\bar{X}$, lies in the following interval with probability $1 - \alpha$:

$$P\left( -\lambda_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} < \lambda_{\alpha/2} \right) = 1 - \alpha \tag{2.8}$$

where $\bar{X} = \frac{1}{n} \sum_{j=1}^{n} X_j$. $\mu$ can then be described in the following way:

$$\mu = \bar{X} \pm \frac{\lambda_{\alpha/2}\, \sigma}{\sqrt{n}} \tag{2.9}$$

The interval can be interpreted as a sampling error around the estimated expected value, and the error in percent, $e_{\%}$, is now introduced. Since an error is an absolute value and the interval is symmetric, only one side has to be calculated.

$$\bar{X} + \frac{\lambda_{\alpha/2}\, \sigma}{\sqrt{n}} = \bar{X} \left( 1 + \frac{e_{\%}}{100} \right) \tag{2.10}$$

The quotient between the standard deviation, $\sigma$, and the estimated expected value, $\bar{X}$, can be expressed as a constant $C = \frac{\sigma}{\mu}$, since the error is a percentage of uncertainty in estimating $\mu$, not the actual value. The number of samples can now be expressed in terms of $C$, $e_{\%}$ and $\lambda_{\alpha/2}$, and $e_{\%}$ in terms of the others:

$$n = \left( \frac{100\, \lambda_{\alpha/2}\, C}{e_{\%}} \right)^{2} \tag{2.11}$$

$$e_{\%} = \frac{100\, \lambda_{\alpha/2}\, \sqrt{n}\, C}{n} \tag{2.12}$$

Equation 2.12 shows that the error increases when the ratio between the standard deviation and the expected value, $C$, increases. If the number of samples increases the error will decrease, as shown in Figure 2.2.

Figure 2.2. The percentage error as a function of the number of samples

For an exponential distribution C = 1, since the standard deviation and the expected value are equal for such a distribution. In that case Figure 2.2 shows that about 400 samples are needed to get an estimate of the expected value with an error of less than 10 percent at a confidence of 95 percent. If the ratio between the standard deviation and the expected value, the constant C, is larger, even more samples are needed to get an error of less than 10 percent.
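The figure of about 400 samples can be reproduced from Equations 2.11 and 2.12. The sketch below takes the quantile λ for a given α from the standard normal distribution in the Python standard library; the function names are illustrative.

```python
import math
from statistics import NormalDist


def required_samples(c, e_percent, alpha):
    """Number of samples from Equation 2.11, rounded up to an integer."""
    lam = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided normal quantile
    return math.ceil((100 * lam * c / e_percent) ** 2)


def percent_error(c, n, alpha):
    """Percentage error from Equation 2.12 for n samples."""
    lam = NormalDist().inv_cdf(1 - alpha / 2)
    return 100 * lam * c / math.sqrt(n)


# Exponential distribution (C = 1), 10 percent error, 95 percent confidence.
n_exponential = required_samples(c=1.0, e_percent=10, alpha=0.05)
error_at_400 = percent_error(c=1.0, n=400, alpha=0.05)
```

With these inputs `n_exponential` is 385 and `error_at_400` is about 9.8 percent, in line with the reading of roughly 400 samples from Figure 2.2.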

2.3.2.2 Binomial distribution

A binomial distribution is a discrete distribution that gives the probability of a number of positive outcomes from a number of independent attempts. The probability of a positive outcome is the same for each attempt. The probability function of a binomial distribution is:

$$P(k) = \binom{n}{k} p^{k} (1 - p)^{n-k} \quad \text{for } k = 0, 1, \ldots, n \tag{2.13}$$

where $k$ is the number of positive outcomes, $n$ is the number of attempts and $p$ is the probability of a positive outcome. The expected value of the distribution is $np$ and the standard deviation is $\sqrt{np(1 - p)}$.

When sampling a population that is binomially distributed, it is the probability, $p$, that is unknown. The following example shows how the error in estimating $p$ can be calculated. Let $X_1, X_2, \ldots, X_n$ be discrete independent stochastic variables that are 1 with probability $p$ and 0 otherwise. The sum of these variables is binomially distributed as in Equation 2.13. An estimate of the probability is defined as $\hat{p} = \frac{1}{n} \sum_{i=1}^{n} X_i$, which is also an estimate of the expected value of the distribution divided by $n$; the standard deviation of a single observation is then $\sigma = \sqrt{p(1 - p)}$. The distribution can be approximated by a normal distribution if $np(1 - p)$ is greater than 10 [7], and the estimate $\hat{p}$ then lies in an interval equivalent to that in Equation 2.8:

$$P\left( -\lambda_{\alpha/2} < \frac{\hat{p} - p}{\sigma / \sqrt{n}} < \lambda_{\alpha/2} \right) = 1 - \alpha \tag{2.14}$$

which gives

$$\hat{p} + \lambda_{\alpha/2}\, d = \hat{p}(1 + e) \tag{2.15}$$

where $d = \sqrt{p(1 - p)/n}$. The variable $e$ is the maximum error in the estimation of $p$ with a confidence of $1 - \alpha$:

$$e = \lambda_{\alpha/2} \sqrt{p(1 - p)/n} \tag{2.16}$$

Equation 2.16 shows the maximum error in estimating the probability, $p$, of a binomial distribution. The error is greatest when $p = 0.5$, since that gives the highest value of the function $p(1 - p)$. When the number of samples, $n$, increases, the error decreases. Table 2.2 shows the error for different numbers of samples. The number of samples has to be almost 100 before the error is less than 10 percentage points with a confidence of 95 percent.

Table 2.2. Error for different number of samples when p=0.5

| | n=5 | n=10 | n=100 | n=500 |
|---|---|---|---|---|
| e (α = 0.05) | 0.438 | 0.310 | 0.098 | 0.044 |
| e (α = 0.01) | 0.576 | 0.407 | 0.129 | 0.058 |
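The values in Table 2.2 follow directly from Equation 2.16 and can be recomputed with a few lines of Python. The sketch takes the quantile λ for a given α from the standard normal distribution in the standard library; the function name `binomial_error` is illustrative.

```python
import math
from statistics import NormalDist


def binomial_error(p, n, alpha):
    """Maximum error from Equation 2.16 for n samples."""
    lam = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided normal quantile
    return lam * math.sqrt(p * (1 - p) / n)


# Recompute both rows of Table 2.2 for p = 0.5, rounded to three decimals.
table = {
    alpha: [round(binomial_error(0.5, n, alpha), 3) for n in (5, 10, 100, 500)]
    for alpha in (0.05, 0.01)
}
```

Because the error is proportional to the inverse square root of n, halving the error requires four times as many samples.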

2.4 Measurement process

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) have specified an International Standard for the process of carrying out measurements in systems and software engineering [14]. The standard identifies the activities and tasks required for implementing and improving measurements in a project or organization. The purpose of the measurement process is to make measurements that support effective management of a project and show the quality of a product.

2.4.1 Measures

Three types of measure are defined in the document.

• Base measure

• Derived measure

• Indicator

The base measure is a quantifiable attribute of an entity. The entity can be a process, product, project or resource. For example, a base measure can be the number of hours worked, lines of code or defective products. A derived measure is a function of two or more base measures. An indicator is a measure based on several derived measures and base measures. It gives an estimation or evaluation of the information that is needed for answering the question that initiated the measurement process. An indicator can for example be the average productivity in a project or the average quality of a product.
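The relationship between the three types of measure can be sketched in a few lines of Python. The example uses lines of code and hours of effort as base measures, productivity as the derived measure and average productivity as the indicator. The project names and numbers are hypothetical.

```python
from statistics import mean

# Base measures per project (hypothetical values): lines of code and
# hours of effort, each obtained by a simple counting method.
projects = {
    "A": {"lines_of_code": 12000, "hours": 400},
    "B": {"lines_of_code": 4500, "hours": 300},
    "C": {"lines_of_code": 9000, "hours": 360},
}


def productivity(base):
    """Derived measure: a function of two base measures."""
    return base["lines_of_code"] / base["hours"]


derived = {name: productivity(base) for name, base in projects.items()}

# Indicator: average productivity over all projects.
average_productivity = mean(derived.values())
```

The indicator alone does not answer the information need. As the standard describes, it still has to be interpreted in its context before it becomes an information product.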

2.4.2 Activities in the measurement process

2.4.2.1 Establish and sustain measurement commitment

In the first activity the scope of the measurements is identified. The scope can be just a single project, a functional area or the whole organization. It is also important to identify all stakeholders and to present the purpose of the measurements to them, since it is information that directly or indirectly can demonstrate their performance. Resources shall be allocated to the different areas within the measurement process. The number of people needed for each area differs and depends on the size and structure of the organization.

2.4.2.2 Plan the measurement process

The next activity is to identify the information need. The information need is something that is important for the organization to know about and should be based on the goals, risks, constraints and problems [14]. Questions of interest can be: "what is the productivity in a project?", "is the quality of a product sufficiently good?" or "how do the employees experience the work environment?". If several information needs are identified, which is natural, they have to be prioritized and the most important selected. It is often a good idea to involve the stakeholders in the process of selecting the information need.

When the information need is identified, the measures shall be selected. All the potentially useful measures can first be identified, and a number of measures can then be selected from the resulting list. Data collection, analysis and reporting procedures need to be defined. Data collection includes when and how to collect data. The storage of data and the requirements for data verification shall also be determined.

2.4.2.3 Perform the measurement process

Firstly the methods for measuring attributes need to be implemented. There can be tools that more or less automate the collection of data, or reporting procedures where data is collected manually. Often the most cost efficient way to implement data collection is to slightly modify current processes or reuse earlier work [14], i.e. to collect data according to previously determined methods. The data shall be stored together with other information needed for verifying, understanding and evaluating it. When enough data is collected it needs to be verified.

The base measures are used for calculating the derived measures according to their defined functions. The derived measures are then put together into an indicator. The indicator shall be interpreted into an information product that meets the information need. The indicators cannot be used directly to meet the information needs, since they exist in a context that needs to be taken into consideration. The interpretation of the indicator should include the stakeholders.

The information product shall be reviewed. When reviewing the information product it is important that the results are meaningful and that they enable improvements to be carried out. Qualitative information is often useful for interpreting and understanding the information product. The result can then be communicated to the users. Feedback from the users and stakeholders should be collected and used to improve the information product.

2.4.2.4 Evaluate measurement

The last activity is to evaluate the information product and the measurement process and to identify potential improvements. The evaluation should be based on the base and derived measures, the information product and the user feedback. The evaluation may show that some measures are no longer useful because they do not contribute to the current information need. Improvements to the information product can include changing the time resolution. A potential improvement in the measurement process is often a trade-off between higher cost in the process and higher quality in the product.

2.4.3 The measurement information model

Table 2.3 and Figure 2.3 present the measurement information model in [14]. Table 2.3 shows the components needed for the model and Figure 2.3 the relationship between them. The information model can be of great help when planning, carrying out and evaluating measurements.

[Figure 2.3: attributes of an entity are quantified by a measuring method into base measures; a measurement function combines base measures into derived measures; an (analysis) model turns these into an indicator; the interpretation of the indicator gives the information product that answers the information needs.]

Figure 2.3. Information model of the measurement process

Table 2.3. Measurement information model and example [14]

| Component | Description | Examples |
|---|---|---|
| Information Needs | Insight necessary to manage goals, risks and problems. | Evaluating the efficiency in a project, estimating the quality of future projects or estimating the status of a project |
| Measurable Concept | An abstract relationship between attributes of entities and information needs. | Project performance, risk, maturity and quality etc. |
| Relevant Entities | An object that has relevant attributes that can be measured. | Products (e.g. source code, test cases, design documents), processes (e.g. design process, testing process), projects and resources (e.g. programmers, testers, equipment) |
| Attributes | A property or characteristic of an entity. | Code blocks, data counter, list of faults in a project |
| Base Measures | The measurement of an attribute. Defines the method for carrying it out. | The total number of code lines, the amount of data sent over one interface, the total number of faults in a project |
| Measurement Method | The definition of how an attribute is quantified into a specific scale. | Count the number of lines in all code blocks, read the value of the data counter, count the number of faults |
| Type of Measurement Method | The definition of how the quantification is performed. | Subjective (human decision is involved), objective (only logical and numerical rules are used) |
| Scale | Allowed values of the base measures. | Integer, discrete, continuous |
| Type of Scale | The relationship between the values on the scale. | Nominal, ordinal, interval and ratio |
| Unit of Measurement | The unit of the measurement. | Hours, meters |
| Derived Measure | A function of two or several base measurements. | Code line productivity |
| Measurement Function | Function for the derived measurement. | Divide Lines of Code by Hours of Effort |
| Indicator | A measurement that gives an estimation or evaluation of the information need. | Average productivity |
| Model | The algorithm or calculation that outputs the indicators with base measures and derived measures as inputs. | Calculates the mean and standard deviation for all project productivity values |
| Decision Criteria | The threshold where actions should be taken if it is exceeded. | Indicator below a certain limit requires further investigation |

Chapter 3

GSM

Chapter Introduction

In this chapter the basic components of a GSM network are described. The reader is given an overall view of the components, the way they are used and how they are connected to each other.

The Global System for Mobile Communications (GSM) is a set of European Telecommunications Standards Institute standards that define the system components and infrastructure for a cellular system. GSM is today the world's most used system for mobile communications. GSM services are currently installed in over 218 countries and have approximately 3,450,410,548 users [5], providing coverage to more than 80% of the world's population [6].

GSM is an evolution of the first generation systems, e.g. and Advanced Service, which were analog mobile systems and provided a limited set of services. The second generation systems are digital and can therefore supply more services at higher capacity and quality. One major issue with the 1G systems was that there existed several national analog systems which were not compatible with each other, making mobility between different countries impossible. To avoid this situation for the new 2G Public Land Mobile Network systems, an international standardization group called Groupe Spécial Mobile was formed in 1982 at the Conference of European Posts and Telegraphs. The introduction of a digital system also gave rise to some other advantages, such as:

• Increased capacity compared to analog technology. This is achieved by better utilization of the available radio frequencies.

• Quality of services and security. Better quality than the 1G systems and better security by enabling encryption of the network's traffic.

• Reduced cost for infrastructure and therefore cost per user. By standardizing and limiting the number of system components the costs will be reduced.

• New services. New features such as data transmission, SMS and fax were developed.


• Improved mobility between networks. To support international mobility the identification and numbering plans were based on ITU recommendations. Also the modifications to the existing Public Switched Telephone Network were minimal.

The worldwide GSM standard was developed for implementation in different frequency bands to provide better access to the network. Therefore different substandards were formed: GSM 900 (GSM in the 900 MHz band) (GSM 800 in the US) and DCS 1800 (Digital Cellular System in the 1800 MHz band) (PCS 1900 in the US). GSM 900 is used in Europe, Asia and the Pacific area and is designed to give good radio coverage even in the countryside outside the urban areas. To achieve better capacity the DCS 1800 standard is used in crowded areas, since it permits smaller cells and faster reuse of frequencies. DCS 1800 was renamed GSM 1800 in 1997.

3.1 GSM specifications

An important feature of GSM is that it is platform-independent: it does not specify any hardware requirements and therefore leaves it to the designers to decide how the actual functionality is provided [3]. Instead the standard specifies the different network functions, nodes and interfaces in detail. An operator therefore gains the advantage of being able to buy equipment from different vendors.

3.1.1 GSM Phases

When the GSM development started a decision was made to split the specification work into two parts. The need for continuous development was anticipated at an early stage, and splitting the work made it possible to get the products out to the market as soon as possible [28]. During 1990 the final specifications of GSM Phase 1 were published by ETSI, which had become responsible for the standardization. This included over 6000 pages of documentation of GSM specifications that defined the standard. The Phase 1 specifications included services such as basic telephone calls with ciphering, data transfer, Short Message Service and other phone services, such as call forwarding and call barring. SMS was first considered an "unnecessary" feature but has, in later years, achieved great commercial success and is today one of the most used services in mobile communications. The first GSM Phase 1 network was installed in 1991 in Radiolinja's network in Finland, where the first GSM call was made [9]. The commercial launch of GSM was a success after some initial problems with handsets, as with other launches of new mobile equipment, and by 1993 GSM was installed in 36 networks in 22 countries [27].

During the deployment of the Phase 1 networks, work proceeded on standardizing the Phase 2 features. Some of the new features were developed as a reaction to experiences gained during the deployment of the first generation of GSM. The new services were focused on supplementary features, which will be described later.

When the GSM market grew, customer demand for new services grew even more. To manage this growing demand a new phase in GSM standardization started. This phase, called Phase 2+, included a new way of transmitting data in the network. The earlier data bearer service used the old way of delivering data in a telecom network, circuit switching, whereas the new phase introduced a packet switched method, General Packet Radio Service; more on this later.
3.2 GSM Network Architecture 21

3.1.2 Services in GSM

The GSM system was constructed to interconnect with voice and data services in other existing networks, such as the Integrated Services Digital Network (ISDN) and the PSTN. This, together with the fact that the people who developed the system had previously worked with older telecom products, means that the basic concepts in GSM are derived from the ISDN standard. Three categories of service are therefore defined in GSM:

• Bearer services

• Tele services

• Supplementary services

A bearer service is a telecommunication service that gives the user the possibility to transmit signals at a certain capacity between the network's access points. GSM defines different service types for data transmission, where the original GSM standard used a circuit switched method allowing data rates of up to 9600 bit/s. The enhancements that were introduced in later GSM phases are discussed later. The data services can be subcategorized into two modes, transparent and non-transparent. The transparent mode interferes as little as possible with the transmission and applies only forward error correction. The non-transparent mode also adds flow control.

GSM is primarily aimed at voice communication, and the goal was to deliver high quality sound that is encrypted for security reasons. For this purpose the teleservices give the user the functions needed to communicate with any other user inside and outside the network. The standard also includes other types of teleservices, such as an emergency number that can be used in the whole of Europe, SMS, the Enhanced Messaging Service and a Group 3 fax service.

Supplementary services are, as in ISDN, included in order to enhance the tele and bearer services. Examples are user identification, call waiting, call forwarding and multiparty calls.

3.2 GSM Network Architecture

In order to supply the services described above, the GSM PLMN system is divided into a number of components that are designed to handle the different network functions. To make the GSM system as standardized as possible, the recommendations for GSM specify not only the air interface but also the infrastructure and its components. This gives operators flexibility by allowing them to integrate components from different vendors into their network.

Figure 3.1 illustrates the different components in GSM. The mobile station communicates with the network through the radio interface to a cell's antenna, which is connected to a Base Station Subsystem (BSS). The BSS communicates with the Network and Switching Subsystem (NSS) through a Mobile Switching Center. From the NSS the information is routed to other parts of the GSM network and also to non-GSM systems such as the PSTN. The NSS, like all other subsystems, is also connected to the Operations Support Subsystem (OSS), which includes the functions needed to manage and maintain the network.

The network architecture can be divided into three subnetworks: the Radio Access Network, the Core Network and the Management Network. In the GSM standard these parts are denoted as subsystems, the Radio Subsystem (RSS), the NSS and the OSS, which divide the network hierarchically.

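[Figure: each MS connects over the Um interface to a BTS; the BTSs connect over the Abis interface to the BSCs (together the BSS); the BSCs connect over the A interface to the MSC/VLR in the NSS, which also contains the HLR, GMSC, IWF and SMSC and connects to the PSTN and ISDN; the O interface connects to the OSS with the OMC, AUC and EIR.]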

Figure 3.1. GSM Network Architecture.

3.2.1 Radio Subsystem

The radio subsystem (RSS) connects the network users with the core network, where the information can be routed to its receiver. The RSS contains the Mobile Station and the BSS and is connected through the A interface to the NSS and via the O interface to the OSS. The A interface is a standardized circuit switched connection, often based on Pulse Code Modulation-30 (PCM-30), with a capacity of 30 telephony connections at 64 kbit/s each (2.048 Mbit/s in total). The O interface uses the Signaling System 7 (SS7) protocol suite, whose main purpose is setup and tear down of calls, number translation, billing mechanisms and other related services.

3.2.2 Mobile Station

The mobile station (MS) component comprises all the devices, including the actual User Equipment, as well as the software that is constructed to communicate with the network. The MS consists of two separate parts: firstly the Mobile Equipment (ME), which can include Terminal Equipment such as a PDA or a computer connected to the ME, and secondly a Subscriber Identity Module (SIM). The SIM card is the user's identity in the network and stores all information about the user that is needed for connecting to and using the network. The ME carries its own identity, the International Mobile Equipment Identity (IMEI), see chapter 3.8.6. The ME therefore only stores information about the handset and about those services that the hardware supports, the so-called classmark. The SIM card, on the other hand, contains information regarding available services and authentication of the user.

For security purposes the SIM card is protected by a Personal Identification Number and a Personal Unblocking Code, which protect the user against unauthorized calls in the case of theft. The MS also stores an authentication key, Ki, that is used with an authentication algorithm, A3, which is implemented in the SIM for authenticating the ME when it requests resources. To identify the MS, an International Mobile Subscriber Identity (IMSI) number is stored, and while the MS is connected to the network the SIM also stores some temporary information such as the cipher key Kc, a Temporary Mobile Subscriber Identity (TMSI) and the Location Area Identity (LAI), which are used to keep track of a mobile's location in the network.

The main features of the MS are:

1. Radio transmission termination

2. Radio channel management

3. Speech encoding/decoding

4. Radio link error protection

5. Flow control of data

6. Mobility management

7. Performance measurements of radio link

A flow chart of the activities in the ME is shown in Figure 3.2.

[Figure: transmit chain from the microphone through A/D conversion, segmentation, speech coding, channel coding, interleaving, ciphering, burst formatting and the modulator to the transmitter; receive chain from the receiver through the demodulator, adaptive equalization, de-ciphering, de-interleaving, speech decoding and D/A conversion to the speaker.]

Figure 3.2. Flow chart of the activities in the ME.

3.2.3 Base Station Subsystem

The BSS contains two parts, the Base Transceiver Station (BTS) and the Base Station Controller (BSC), and is used to connect the MS to the NSS. The BTS contains the antenna, amplifiers, filters and the signal and protocol processing components needed to support connectivity to the MS. Speech coding and decoding, and rate adaptation due to radio link changes, are carried out by the transcoder/rate adapter unit, called the Transcoder and Rate Adaptation Unit (TRAU) or Transcoder and Rate Adapter Controller. Speech in GSM is encoded at 13 kbps (full rate), 12.2 kbps (enhanced full rate) or, in some rare cases, 5.6 kbps. These rates are clearly different from the standard 64 kbps Pulse Code Modulation (PCM), and since the GSM network communicates with the PSTN the TRAU is used to convert the GSM coded speech or data to 64 kbps. The TRAU can be included in the BTS in the BSS, but it is common to locate the TRAU near the MSC in order to reduce the traffic load on the interface between the BSC and the MSC. The interface is then known as the Ater interface. The Ater interface is vendor specific, which implies that the BSS and TRAU must be supplied by the same manufacturer.

The BTS can be located in the center of a cell, or on the edges between multiple cells to give coverage to more than one cell at a time using sectorized antennas. The connection to the MS is called the Um interface (ISDN U interface for mobile), and the connection to the BSC is made via the Abis ISDN interface. One part of the Abis interface is not standardized in the GSM specifications, the Operations and Maintenance Link. This link is very vendor specific because the internal design of the BTS is proprietary, which causes low compatibility between different manufacturers.

In order to keep the BTS small, which helps when deploying it in a crowded urban environment, as much intelligence and control as possible is located in the BSC. The BSC is therefore responsible for the reservation, allocation and release of radio resources, and for Mobility Management functions such as traffic measurements, location management of the MS and handover management. A BSC is most often connected to many BTSs, making it a centralized control unit for a large geographical area.

3.2.4 Network and Switching Subsystem

The NSS is the core network, connecting several BSCs with each other and with the PSTN. The NSS is also responsible for handover between different BSCs and includes functions for the international roaming of a user. The subsystem also stores subscriber information on available services, charging and accounting.

3.2.5 Mobile Switching Center (Mobile Services Switching Center)

Several BSCs are connected to the Mobile Switching Center (MSC), which forms the backbone of the network. The MSC includes functions for path searching, data forwarding and service feature processing. Many of the MSC functions are similar to those of an ordinary telecommunication switch, such as an ISDN switch, but since the users are mobile the MSC includes more functionality for radio resource reservation and mobility management, handovers and location registration.

A special type of MSC is the Gateway MSC (GMSC), which is used to connect the core network to other networks such as a PSTN. In the GMSC, Interworking Functions (IWF) are used to connect the network to public data networks (PDN), often using X.25 transmission. The IWF also contains functions to handle services for delivering fax messages. The IWF therefore conducts protocol adaptations and rate conversions to make communication possible between services in the ME and services in different networks. The GMSC is also the first node that an external service reaches and is therefore responsible for locating the mobile user in the network through the HLR.

The MSC also contains information that is used for charging customers for the services they have used. All current charging rates are stored in the MSC; they are applied to an ongoing call and used at the network's billing center.

3.2.6 SMSC

The Short Message Service Center (SMSC) is the node in GSM which is responsible for storing and forwarding SMS between two MSs. When an SMS is sent from an MS it is first routed to the SMSC, which stores the message. When the service center has stored the message, the SMS MSC tries to deliver it either to an MS within the network, through the MSC, or to an MS located in a different network. In the latter case the message is delivered to an SMS interworking MSC located in the foreign network, which is responsible for storing and forwarding the SMS to the receiving MS.

3.3 GSM Areas

The GSM network structure can be divided into separate geographical areas: cell, location area, MSC/VLR area and PLMN. The structure of these areas is an important issue in all cellular networks because the users are mobile and the areas are used to monitor their movements.

• Cell The cells are the smallest geographical units of the network. Each cell is covered by a BTS and is assigned a unique Cell Global Identity (CGI), which distinguishes the cells from each other.

• Location Area The Location Area (LA) is a group of cells and defines an area where the subscriber is located. It is identified by a Location Area Identity. The LA is introduced in a GSM network to reduce the signaling load of updating and finding a subscriber's location within the network. Each time a user changes from one cell to another the ME checks whether the new cell belongs to a different LA. If the cell belongs to a new LA, the user's new position is reported to the network. If a user is called, a paging request is broadcast to all the cells in the location area.

• MSC/VLR service area The MSC/VLR service area is a geographical area that consists of several LAs that are connected to one MSC. To route an incoming call to the ME, the user's current MSC must be stored and retrieved.

• PLMN The PLMN service area comprises all cells of the operator's network and is defined as the total area where the operator has coverage. It is therefore the highest geographical level in the GSM network architecture. The term international roaming describes a user changing PLMN.
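The location area logic described in the list above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any real handset or network implementation: the class names, the string form of the identities and the list used to record updates are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Cell:
    cgi: str  # Cell Global Identity (hypothetical string form)
    lai: str  # Location Area Identity of the LA the cell belongs to


class MobileEquipment:
    """Tracks the serving cell and reports LA changes (illustrative only)."""

    def __init__(self, serving_cell: Cell):
        self.serving_cell = serving_cell
        self.location_updates: List[str] = []  # LAIs reported to the network

    def change_cell(self, new_cell: Cell) -> bool:
        """Move to new_cell; return True if a location update was sent."""
        la_changed = new_cell.lai != self.serving_cell.lai
        self.serving_cell = new_cell
        if la_changed:
            # Only a change of LA causes signaling towards the network.
            self.location_updates.append(new_cell.lai)
        return la_changed


def cells_to_page(cells: List[Cell], lai: str) -> List[str]:
    """Return the CGIs of all cells in the given LA (the paging targets)."""
    return [cell.cgi for cell in cells if cell.lai == lai]
```

The sketch shows the trade-off behind the LA concept: moving between cells of the same LA costs no signaling, while an incoming call has to be paged in every cell of the LA.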

3.4 Databases and Registers

Several databases are defined in the GSM standard for management of the network's users and their location, among them the Home-Location-Register (HLR) and the Visitor-Location-Register (VLR). The HLR and VLR are queried by the network's nodes to retrieve information about registration and localization.

3.4.1 Home-location-register (HLR)

The Home-Location-Register is a common node and is often considered one of the most important nodes in a mobile network. The HLR contains information on all the subscribers of the network services, including each user's subscribed services, restricted services and telephone numbers (the Mobile subscriber ISDN number and the PSTN associated number) and authentication data. The HLR also stores temporary information such as the Mobile Subscriber Roaming Number, the Local Mobile Subscriber Identity and the MSC and VLR that the ME is currently connected to, if it is available. Using this information it is possible to easily route an incoming service to the ME's closest MSC and BSC; more on this in section 3.8. The HLR is also responsible for the traffic information of all users in the network, which is later used for accounting and charging.

3.4.2 Visitor-location-register (VLR)

The Visitor-Location-Register is a database connected to the NSS and is used to speed up the retrieval of information needed by the MSC and BSC. When an MS enters a new MSC/VLR service area, the permanent information about the subscriber and the MS is copied from the HLR, which avoids frequent access to the network's HLR. The VLR also stores some temporary data of its own, such as the LAI and the TMSI. To further shorten the access time a VLR is often collocated with an MSC.
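The relationship between the two registers can be sketched as a cache in front of a master database. The following Python sketch is an assumption-laden illustration: the class and method names, the dictionary layout and the query counter are invented for the example and do not reflect any real HLR or VLR interface.

```python
class HLR:
    """Permanent subscriber records plus the current VLR per subscriber."""

    def __init__(self):
        self.subscribers = {}  # IMSI -> permanent subscriber data
        self.current_vlr = {}  # IMSI -> name of the serving MSC/VLR area
        self.queries = 0       # counts how often the HLR is accessed

    def fetch(self, imsi, vlr_name):
        """Copy a subscriber record to a VLR and remember where the MS is."""
        self.queries += 1
        self.current_vlr[imsi] = vlr_name
        return dict(self.subscribers[imsi])


class VLR:
    """Local copy of subscriber data for one MSC/VLR service area."""

    def __init__(self, name, hlr):
        self.name = name
        self.hlr = hlr
        self.visitors = {}  # IMSI -> copied record plus temporary data

    def attach(self, imsi, lai, tmsi):
        """Called when an MS enters the service area."""
        record = self.hlr.fetch(imsi, self.name)
        record.update(lai=lai, tmsi=tmsi)  # temporary data kept in the VLR
        self.visitors[imsi] = record

    def lookup(self, imsi):
        """Answer MSC queries locally, without contacting the HLR."""
        return self.visitors[imsi]
```

After a single `attach`, repeated `lookup` calls are served from the VLR, so the HLR is accessed once per service area change rather than once per query.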

3.5 Operations Support Subsystem

Since mobile networks, GSM included, are often very complex and contain many entities, it is hard to manage each of these entities individually. In GSM the Operations Support Subsystem has been implemented to provide the functions needed to operate and maintain the network and to ensure security in a centralized manner. The OSS has a connection to all the elements in the GSM network and contains the Authentication Center, the Equipment Identity Register and the Operations and Maintenance Center.

3.5.1 Operations And Maintenance Center

The Operations and Maintenance Center (OMC) is the management and control part of the OSS. The OMC is implemented on the Telecommunications Management Network to ensure that the GSM network design philosophy is compatible with the fixed wired network.

3.5.1.1 Telecommunications management network

The Telecommunications Management Network (TMN) concept was defined by the International Telecommunications Union (ITU). This concept describes the TMN as a separate network that interfaces with a telecommunications network at multiple access points. The basic requirements for the TMN that were stated by the ITU are [3]:

• Centralized

• Separated from the telecommunications network

• Connected to the nodes in the telecommunications network via standardized interfaces

TMN contains five basic elements (blocks) that are defined by the TMN's functional model [29].

Network Element. The NEs are the actual nodes in the network: HLR, VLR, BSC etc. in GSM. The functions that the NEs perform are called network element functions and are divided into two separate groups, primary functions and management functions. The primary functions are the telecommunication functions, and the management functions are those described by TMN.

Operations System. All management in TMN is handled by the OS through the OS function. Separate OS functions cover areas such as billing, management and measurements. The OS is most often a part of an OMC, and a large network may have several OMCs.

Workstation. The WS acts as an interface through which the operator can communicate with the TMN using WS functions. Through this interface the status and maintenance functions are available.

Mediation device. The MD acts as a bridge between the OS and the various NEs. The communication is delivered through standardized interfaces.

Q-Adapter. The QA is used to communicate with nodes that are not TMN compatible. The QA is therefore used as a translator between these nodes and the TMN.

The TMN also includes a data communications network between the NEs, OSs and other elements, using WAN or LAN connections. In GSM the OMC often uses an X.25 connection to the MSC and BSC [29].

The TMN elements are shown in figure 3.3. Many of the connections and functions of the blocks are not a part of TMN and are therefore placed in the border between the TMN and the rest of the telecommunications network.

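[Figure: the TMN function blocks OSF, WSF, MD, QAF and NEF, with the operator interface, the connection to other TMNs and the connection to non-TMN nodes drawn on the border of the TMN.]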

Figure 3.3. TMN elements and the connections in the TMN model.

To simplify the hierarchy of the TMN the functions have been divided into layers, like the Open Systems Interconnection (OSI) model, see Table 3.1. See GSM System Survey [3] for more information about the OSS, particularly Ericsson's OSS implementation.

Table 3.1. Logical layers in TMN

| Layer | Functions |
| --- | --- |
| Business Management | Contains functions used to manage the business aspects of the network, billing and accounting. |
| Service Management | Provides functions to handle the services in the network: definition, administration and charging of services. |
| Network Management | Has a view of the entire network. Is used to configure routes, monitor link utilizations and other performance metrics. |
| Element Management | The lowest level, where the actual nodes are covered. Includes functions for alarm monitoring, backup, logging and maintenance of the software and hardware. |

3.5.2 Authentication Center

All authentication and encryption parameters in the GSM network are handled by the Authentication Center (AuC). These parameters help the network to avoid fraud and to make the communication between the subscribers as confidential as possible. When the MS registers with the network, the SIM card information is used to create a shared secret that is authenticated by the AuC. After the authentication has passed, encryption parameters are passed to the MS. The AuC is often a part of the HLR because of their tight integration.
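The authentication step can be illustrated with a small challenge-response sketch in Python. It rests on several assumptions that are not taken from the text: the network sends a random challenge, the SIM answers with a value derived from Ki, and the keyed hash used here is only a stand-in, since the real A3 algorithm is operator specific. All class and function names are hypothetical.

```python
import hashlib
import hmac
import os


def a3_stand_in(ki: bytes, rand: bytes) -> bytes:
    """Stand-in for A3 (NOT the real algorithm): keyed hash cut to 4 bytes."""
    return hmac.new(ki, rand, hashlib.sha256).digest()[:4]


class AuC:
    """Holds the secret key Ki of every subscriber (illustrative only)."""

    def __init__(self):
        self.keys = {}  # IMSI -> Ki

    def challenge(self, imsi, rand=None):
        """Return a random challenge and the response the network expects."""
        rand = os.urandom(16) if rand is None else rand
        return rand, a3_stand_in(self.keys[imsi], rand)


class SIM:
    """Holds the subscriber's copy of Ki; Ki itself never leaves the card."""

    def __init__(self, imsi, ki):
        self.imsi = imsi
        self._ki = ki

    def respond(self, rand):
        return a3_stand_in(self._ki, rand)


def authenticate(auc, sim, rand=None) -> bool:
    """The MS is accepted only if its response matches the expected one."""
    rand, expected = auc.challenge(sim.imsi, rand)
    return hmac.compare_digest(sim.respond(rand), expected)
```

Only the challenge and the response cross the radio interface in this sketch, so the shared secret is never transmitted.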

3.5.3 EIR

The Equipment Identity Register (EIR) is used to store information about the IMEIs of all MSs that are or have been registered to the network. This information is used to block connections to MSs that have been reported as stolen or have some kind of hardware or software defect. Many of these defects can affect the network's performance, for example by transmitting too long sequences and thereby causing interference with other transmissions, so the affected equipment must be blocked. Since the IMEI contains not only serial number information but also the make and model, all ME of an affected type can be blocked; the blocked equipment constitutes the black list.

3.6 Radio interface

The radio interface Um, the physical layer (OSI Layer 1) in GSM, uses a Space Division Multiple Access scheme with cells that reuse the available frequencies, which maximizes the capacity and performance of the network. The uplink and the downlink in the cells are separated using Frequency Division Duplex, with one frequency band for communication from the MS to the BTS, i.e. uplink, and one band for communication from the BTS to the MS, i.e. downlink. To separate the users, a Time Division Multiple Access (TDMA) scheme is used in combination with Frequency Division Multiple Access, which allows multiple users to share the common RF channel in both time and frequency. Finally, Gaussian Minimum Shift Keying is used for modulating the digital transmission onto the air interface, which gives a gross data rate of 270.83 kbit/s per carrier frequency [9].

The available frequencies in GSM 900 are divided into 124 full duplex pairs with 200 kHz carriers and a 200 kHz guard band to avoid interference. In GSM 1800 more channels are allocated, 374 carriers, to increase the capacity of the network. To preserve MS power, the uplink channels are always placed in the lower frequency band (45 MHz lower in GSM 900 and 90 MHz lower in GSM 1800), since less energy is needed to transmit at lower frequencies. To lower the power needed even more, a discontinuous transmission mode is used, which makes it possible to transmit no signal when no data is in the process of being transmitted [29].
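The channel plan for GSM 900 can be made concrete with a short calculation. The sketch below assumes that the uplink band of the primary GSM 900 band starts at 890 MHz, a value that is not stated in the text; the carrier spacing, duplex distance and number of channels are the ones given above. Frequencies are kept as integers in kHz to avoid rounding issues.

```python
UPLINK_BASE_KHZ = 890_000     # assumed lower edge of the GSM 900 uplink band
CARRIER_SPACING_KHZ = 200     # carrier width from the text
DUPLEX_DISTANCE_KHZ = 45_000  # downlink lies 45 MHz above the uplink
NUM_CHANNELS = 124            # full duplex pairs in GSM 900


def gsm900_carrier_khz(n):
    """Return (uplink, downlink) centre frequencies in kHz for channel n."""
    if not 1 <= n <= NUM_CHANNELS:
        raise ValueError("GSM 900 channel number must be between 1 and 124")
    uplink = UPLINK_BASE_KHZ + CARRIER_SPACING_KHZ * n
    return uplink, uplink + DUPLEX_DISTANCE_KHZ


if __name__ == "__main__":
    for channel in (1, 62, 124):
        up, down = gsm900_carrier_khz(channel)
        print(f"channel {channel}: uplink {up / 1000} MHz, downlink {down / 1000} MHz")
```

Because numbering starts at channel 1, the first 200 kHz above the band edge stay unused, which corresponds to the guard band mentioned above.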

In GSM each cell is given a range of RF channels, implying both uplink and downlink, usually one to three, but it could be more if the cell has a high demand. The cells are then divided into sectors, where one BTS is responsible for several cells. The carriers are further divided into TDMA frames consisting of eight so-called time slots, numbered 0 to 7. Each time slot provides a separate time-division multiplexing channel available for an MS to use. Each of these time slots occupies the medium for 576.9 µs, giving a total frame length of 4.615 ms. To prevent the MS from transmitting and receiving at the same time, which would require a full duplex MS, the uplink time slot numbering is delayed three time slots relative to the downlink. When long transmission delays occur, due to long distances between the MS and the BTS, a timing advance is introduced to ensure that the uplink information reaches the BTS at the exact instant required. GSM supports a timing advance of up to 63 bit periods (about 233 µs), resulting in a maximum distance of 35 km between the BTS and the MS. This can be extended by transmitting a full time slot plus the timing advance value earlier [26]. The timing advance is calculated in the BTS from measurements of the traffic from the MS.

During one time slot, data is transmitted in radio bursts. GSM defines five different types of burst in two categories, using either the full duration or a shorter duration of the time slot. The short-duration burst, the access burst, uses a longer guard sequence and is used for the initial setup on the Random Access Channel described later. The full-duration normal burst is used for transmitting information on the traffic and control channels. The burst has a length of 148 bits, with 2*57 bits of actual data and 26 bits of training information, which is used for example to calculate the timing advance value and for traffic measurements, see Figure 3.4.

Field:  Tail | Data | Flag | Training | Flag | Data | Tail | Guard
Bits:   3    | 57   | 1    | 26       | 1    | 57   | 3    | 8.25

Burst: 148 bits (0.546 ms). Time slot: 156.25 bits (0.577 ms).

Figure 3.4. GSM normal burst structure.

The tail bits are all set to zero and are used to increase the performance of the receiver while the transmission power is ramped up or down. The (stealing) flags indicate whether the associated data portion is carrying user data or signaling information. The training data is a predefined bit pattern in the middle of the burst that helps the receiving part to adapt to the current conditions of the physical medium, multipath propagation, fast fading etc. The other full-duration burst types are as follows: the frequency correction burst allows fine tuning of the carrier frequency, the synchronization burst allows exact time synchronization between the MS and the BTS, and the dummy burst is used when no data is to be transmitted and supports power measurements for quality monitoring. The GSM standard also defines an optional slow frequency hopping scheme where the MS and BTS can negotiate to change carrier frequency according to a predefined sequence.
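The burst and timing figures quoted in this section can be cross-checked with a few lines of Python. The sketch assumes that the gross bit rate is exactly 13 MHz / 48 (about 270.833 kbit/s) and that radio waves travel at the vacuum speed of light; the constant and function names are chosen for the example only.

```python
BIT_RATE = 13e6 / 48          # bit/s, about 270.833 kbit/s (assumed exact form)
SPEED_OF_LIGHT = 299_792_458  # m/s

# Fields of the normal burst in transmission order, with their length in bits.
NORMAL_BURST_FIELDS = [
    ("tail", 3), ("data", 57), ("flag", 1), ("training", 26),
    ("flag", 1), ("data", 57), ("tail", 3),
]
GUARD_BITS = 8.25

burst_bits = sum(bits for _, bits in NORMAL_BURST_FIELDS)  # 148 bits
slot_bits = burst_bits + GUARD_BITS                        # 156.25 bits

bit_period_us = 1e6 / BIT_RATE            # about 3.69 microseconds per bit
burst_us = burst_bits * bit_period_us     # about 546 microseconds
slot_us = slot_bits * bit_period_us       # about 577 microseconds
frame_ms = 8 * slot_us / 1000             # about 4.615 ms for eight slots


def max_range_km(timing_advance_bits):
    """Distance whose round-trip delay equals the given timing advance."""
    round_trip_s = timing_advance_bits / BIT_RATE
    return SPEED_OF_LIGHT * round_trip_s / 2 / 1000


if __name__ == "__main__":
    print(f"slot {slot_us:.1f} us, frame {frame_ms:.3f} ms")
    print(f"range at timing advance 63: {max_range_km(63):.1f} km")
```

The division by two in `max_range_km` reflects that the timing advance has to compensate for the round trip, since the MS derives its timing from the downlink signal it receives.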

3.6.1 Logical Channels

In GSM the physical channels are divided into two separate classes of logical channels, Traffic Channels and Control Channels, depending on which type of information is delivered.

3.6.1.1 Traffic Channels

The Traffic Channel (TCH) is used to carry user information (speech, data etc.) as either circuit-switched or packet-switched traffic. GSM specifies two categories, full-rate TCH (TCH/F) and half-rate TCH (TCH/H). TCH/F gives a raw data rate of 22.8 kbit/s, and the actual data rate varies depending on which codec is used to encode the user's voice; in the original specification, for example, 13 kbit/s is speech data and the rest is used for error correction. The TCH/H makes it possible for two connections to share a time slot and therefore doubles the number of calls that can be connected, but at a lower quality. The latest technology, VAMOS, extends capacity even further by allowing four MSs to transmit in one time slot using the TCH/H principle.

3.6.1.2 Control Channels

To support the communication of data over the GSM air interface a set of Control Channels (CCH) is defined. The CCHs handle tasks such as medium access, allocation of traffic channels and mobility. GSM defines three sets of control channels:

• Common control channels The CCCH handles the initial connection setup events.

– Paging channel. The PCH is used to inform an MS of an incoming call or service. Downlink.

– Random access channel. The RACH is used by the MS to request resources. A slotted ALOHA multiple access scheme is used to allow all MSs in the cell to communicate with the BSC. Uplink.

– Access Grant Channel. The AGCH is used by the BSC to grant the request that the MS made on the RACH. The grant permits the MS to use a TCH or a Standalone dedicated control channel, which will be described later. Downlink.

– Notification channel. The NCH is used for group-call and voice broadcast services.

• Dedicated Control Channels. The DCCH is primarily used by the MS to connect with the network. The DCCHs are bidirectional and are therefore used by both the MS and the BSC.

– Standalone dedicated control channel. The SDCCH is used for communication between the MS and the BSC when a TCH is not used. This occurs for example when the MS sends an SMS, or authenticates and registers before a TCH is set up.

– Slow associated control channel. Each TCH and SDCCH has an associated SACCH. This is used to carry system information such as channel quality and signal power level messages between the BTS and the MS. These messages are used to determine when a handover between two BTSs should be initiated. The SACCH is also used for SMS when a call is connected.

– Fast associated control channel. When more signaling information is needed a FACCH is used. This occurs when time critical information such as handover information is exchanged and when non-voice data has to be delivered at a higher data rate. The FACCH uses capacity reserved for the TCH and therefore steals bandwidth from the user data [29].

– Cell broadcast channel. Cell broadcast messages are sent in the same time slot as the SDCCH through the CBCH. Only downlink.

• Broadcast Control Channel. The BCCH is used to carry broadcast information from the BTS to the MSs in the cell. This information includes the cell identification number, the frequencies that are available inside the cell and in the neighboring cells, and special cell options such as whether frequency hopping is used.

– Frequency control channel. The FCCH is used to send frequency corrections to the MS.

– BCCH. The general information needed to access the cell resources is carried on the BCCH. It is also used to carry the configuration of the CCCH.

– Synchronization channel. The SCH is used to carry time (frame) correction information but also contains the Base-Station Identity Code.
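The control channels listed above can be collected in a small lookup table, which makes it easy to answer questions such as which channels an MS can use before it has been granted any resources. This is a sketch for illustration: the group labels and the helper function are invented, the grouping follows the list above, and the direction of the NCH and of the broadcast channels is an assumption where the text does not state it.

```python
from enum import Enum


class Direction(Enum):
    UPLINK = "uplink"
    DOWNLINK = "downlink"
    BOTH = "both"


# Channel name -> (group, direction). Group labels are hypothetical.
CONTROL_CHANNELS = {
    "PCH": ("common", Direction.DOWNLINK),
    "RACH": ("common", Direction.UPLINK),
    "AGCH": ("common", Direction.DOWNLINK),
    "NCH": ("common", Direction.DOWNLINK),     # direction assumed
    "SDCCH": ("dedicated", Direction.BOTH),
    "SACCH": ("dedicated", Direction.BOTH),
    "FACCH": ("dedicated", Direction.BOTH),
    "CBCH": ("dedicated", Direction.DOWNLINK),  # grouped as in the list above
    "FCCH": ("broadcast", Direction.DOWNLINK),  # direction assumed
    "BCCH": ("broadcast", Direction.DOWNLINK),  # direction assumed
    "SCH": ("broadcast", Direction.DOWNLINK),   # direction assumed
}


def channels(group=None, direction=None):
    """Return sorted channel names, optionally filtered by group/direction."""
    return sorted(
        name
        for name, (grp, dirn) in CONTROL_CHANNELS.items()
        if (group is None or grp == group)
        and (direction is None or dirn == direction)
    )
```

Filtering on `Direction.UPLINK` returns only the RACH, which matches its role as the one channel on which an MS without an assignment can contact the network.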

3.6.1.3 GSM Mapping

The GSM standard defines a frame structure which enables the different channel types to be transferred on the physical medium. The frame structure enables the system to schedule (multiplex) the use of the time slots described in section 3.6. With time based multiplexing a mapping can occupy the complete physical medium or only a part of it. The frame structure gives priority to the data of the different TCHs and associates each of them with a dedicated TDMA channel, whereas the signaling channels have to share the use of one channel. Of the eight available time slots in a cell, the time slot numbered 0 is used to carry signaling information and the remaining seven (numbered 1-7) each carry one MS (or two if half-rate is used). If several frequency carriers are available within a cell, the logical control channels are mapped to the beacon frequency (the first available frequency in the cell) in time slot 0 and, if needed, 2, 4 and 6. The mapping procedure in GSM uses a complex hierarchical structure that defines multiframes, superframes and hyperframes on top of the basic frames, see Figure 3.5.

[Figure: GSM frame hierarchy. One hyperframe = 2,048 superframes (3 h 28 min 53.76 s); one superframe = 51 traffic multiframes or 26 control multiframes (6.12 s); one traffic multiframe = 26 frames (120 ms); one control multiframe = 51 frames (235.4 ms); one frame = 8 time slots, numbered 0 to 7 (4.615 ms); one burst period = 0.577 ms.]

Figure 3.5. GSM frame structure.

The basic frames are grouped to form so called multiframes which can consist of either traffic or control information:

• Traffic multiframe. The TCHs are ordered into multiframes consisting of 26 normal bursts. Of the 26 available bursts, 24 are used for the actual TCH, one for the associated SACCH and one is an idle frame, see Figure 3.6. The sequence illustrated is repeated in the time slot that is assigned to the MS, i.e. the slot marked 1, sending one of the 26 frames in each TDMA channel. This makes the total length of the traffic multiframe 26*4.615 ms = 120 ms. The idle period introduced at the end of the multiframe is often used by the MS to scan neighboring cells for handover purposes. If a FACCH is used, half of the data portion of eight consecutive bursts is used, indicated by the stealing flags described above.

• Control multiframe. The control multiframe is used to carry the control signaling channels BCCH, CCCH and SDCCH. The multiframe consists of 51 bursts and is therefore repeated every 235.4 ms.

T T T T T T T T T T T T A T T T T T T T T T T T T I

T = Traffic channel, A = Associated Control, I = Idle; 26 frames = 120 ms

Figure 3.6. The traffic multiframe used in GSM.

Above the multiframe a superframe is defined, which takes up 6.12 s. The superframe consists of either 51 traffic multiframes of 26 frames or 26 control multiframes of 51 frames, depending on which type of multiframe is used, giving a total of 1326 frames in both cases. The superframe gives the MS the possibility to scan all the different frame types at least once.
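The durations in the frame hierarchy follow from a single number, the length of one TDMA frame. The Python sketch below recomputes them with exact fractions. It assumes a hyperframe of 2048 superframes, which is the count consistent with the duration of 3 h 28 min 53.76 s, and it derives the gross TCH/F rate from 24 traffic bursts of 2*57 data bits per traffic multiframe. The variable names are chosen for the example.

```python
from fractions import Fraction

FRAME_MS = Fraction(120, 26)   # one TDMA frame, about 4.615 ms
TRAFFIC_MULTIFRAME = 26        # frames per traffic multiframe
CONTROL_MULTIFRAME = 51        # frames per control multiframe
SUPERFRAME_FRAMES = 26 * 51    # 1326 frames for both multiframe types
HYPERFRAME_SUPERFRAMES = 2048  # assumed count, see the lead-in

traffic_ms = TRAFFIC_MULTIFRAME * FRAME_MS            # exactly 120 ms
control_ms = CONTROL_MULTIFRAME * FRAME_MS            # about 235.4 ms
superframe_s = SUPERFRAME_FRAMES * FRAME_MS / 1000    # exactly 6.12 s
hyperframe_s = HYPERFRAME_SUPERFRAMES * superframe_s  # exactly 12533.76 s

# Gross TCH/F rate: 24 bursts with 2*57 data bits each per 120 ms.
# Bits per millisecond is numerically the same as kbit/s.
tch_f_gross_kbit_s = Fraction(24 * 2 * 57) / traffic_ms


def hms(seconds):
    """Split a duration in seconds into (hours, minutes, seconds)."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return int(hours), int(minutes), float(secs)


if __name__ == "__main__":
    print(f"control multiframe: {float(control_ms):.1f} ms")
    print(f"superframe: {float(superframe_s)} s, hyperframe: {hms(hyperframe_s)}")
    print(f"gross TCH/F rate: {float(tch_f_gross_kbit_s)} kbit/s")
```

Using `Fraction` keeps the results exact, so the 120 ms, 6.12 s and 22.8 kbit/s figures come out without rounding error.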

3.7 Protocols in GSM

GSM uses a layered protocol structure for the peer-to-peer communication between the different nodes. The signaling protocol stack is used for handling the mobility, radio resource and connection management functions needed for the different functions in the network to work. The four different stacks for the nodes in the BSS and NSS is shown in Figure 3.7. The protocol stack is divided into three interface dependent layers; Layer 1 (Physical), Layer 2 (Data Link) and Layer 3. The functions in Layer 3 has no direct similarity in the OSI model since it includes functions from several layers and can better be described as a messaging layer [21]. The physical layer over the Um (between theMS and the BTS) handles all radio related activities like creating bursts, multiplexing through TDMA frames, synchronization of the trans- mission, encryption and management of channels described in section 3.6. The layer also takes care of channel coding and error detection/correction. The channel coding uses a Forward Error Correction with a high level of redundant information to assure a error free connection between theMS and BTS. The data link layer between theMS and BTS interface uses a protocol named LAPD m. This protocol is a striped down version of the Link Access Protocol protocol which is used for link access the ISDN D-channel. The LAPDm provides the functions for flow control and delivering frames in the right order but does not provide error correction since this is covered by the physical layer. Layer 3 consists of three sublayers; Radio Resource Managment, Mobility Managment and Connection Management. TheRR sublayer is implemented in the BSS to establish a link between theMS and the MSC. The layer covers the functions needed for the physical connection, management and tear 3.8 Addressing and localization in GSM 33

Figure 3.7. GSM signaling protocol structure.

down. The MM sublayer is implemented on top of the RR sublayer between the MS and the MSC to support registration, authentication and other mobility related functions. The CM sublayer is responsible for sending SMS over an SDCCH and SACCH and for the supplementary services provided by GSM. CM also provides functions for call establishment, selection of type of service and call release.

At the BTS the RR sublayer is changed to the Base Transceiver Station Management (BTSM). The main purpose of the RR at this point is to handle allocation and reallocation of traffic channels, initial access, paging and other radio resource and mobility management functions. From the BSC to the MSC the signaling protocol is changed to the SS7 signaling system that is used in the rest of the NSS. Here the RR functions are controlled through the Base Station System Management Application Part (BSSMAP).

3.8 Addressing and localization in GSM

As in other wireless communication systems it is crucial to be able to locate the different nodes in the system. This is even more important in a global system like GSM, where the user is connected not only in the operator's own network but also in other operators' networks. Therefore GSM defines a set of addresses that are used for locating not only the MS but also the users. The identity information about the user is, as mentioned earlier, stored on the SIM card, while the ME stores identifiers of the equipment that is used. This makes it possible to develop further services that make the user independent of the accessibility or type of connection, mobile or fixed, and instead route the call to the best service that the user is connected to [9].

The worldwide localization in GSM is made possible by periodically performing a Location Area Update (LAU) even if the equipment is not used. This LA update is processed and stored in the HLR. As mentioned earlier, a LAU is also performed when the MS is connected to a BSC that belongs to a different VLR than before. Another term for the change of VLR is roaming, and GSM supports three different types: within the network (LAU), national roaming (often not supported due to regulations from the operators), and international roaming.

3.8.1 International Mobile Subscriber Identity (IMSI)

The IMSI is used to identify the subscriber of the network. The IMSI is a 15 digit decimal number which consists of a mobile country code (MCC), a mobile network code (MNC) and a mobile subscriber identification number (MSIN).

3.8.2 Temporary Mobile Subscriber Identity (TMSI)

To strengthen the security in GSM, the system supports the use of a TMSI to avoid using the IMSI, which could identify the user. The VLR assigns a 4 × 8 bit TMSI number, which the MS stores on the SIM card. The TMSI is only valid within a location area, and therefore the TMSI together with the LA allows the network to uniquely identify the MS.

3.8.3 Local Mobile Subscriber Identity (LMSI)

The LMSI is used as an alternative search key to gain faster database access times. When the MS enters a new MSC/VLR service area it is assigned a new LMSI, which is stored in the HLR.

3.8.4 Mobile Station (or Subscriber) ISDN Number (MSISDN)

The MSISDN is the user’s phone number and is stored on the SIM card of the MS. The MSISDN follows the ITU-T standard with a country code, a national destination code and the subscriber number. The use of the MSISDN also gives some extra security by hiding the IMSI.

3.8.5 The Mobile Station Roaming Number (MSRN)

The MSRN is a temporary, location dependent ISDN number which is used to hide the identity and location of a user. The VLR assigns the MSRN, which is passed to the MSC when it is needed. The MSRN has the same structure as the MSISDN and is generated so that the routing to the MSC responsible for the MS can easily be determined.

3.8.6 International Mobile Station Equipment Identity (IMEI)

The IMEI internationally identifies an MS. The number contains information about the manufacturer and the date when the unit was manufactured. The network also stores information about the ME through the IMEI, e.g. when a unit is reported stolen or is otherwise blacklisted from the network. The operator can also be notified if a user uses obsolete hardware that should be exchanged.

3.9 Data services

3.9.1 GPRS

GPRS brings packet switched services to GSM networks. Packet switched access is suitable for the bursty traffic that is common in many Internet services today, for example browsing the web.

If circuit switched access were used for such traffic, the channel utilization would be very low. In GPRS a varying number of timeslots can be used for both downlink and uplink. A total of 8 timeslots can be used and the data rate per timeslot is 9.05, 13.4, 15.5 or 21.4 kbit/s, depending on the channel coding that the radio link conditions allow [26].

In GPRS there are three classes of MSs with respect to simultaneous use of packet switched and circuit switched services. Class A fully supports the use of packet switched and circuit switched services simultaneously. Class B supports the use of both services but not at the same time, and an MS of class C can only support one type of service [9].
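As a small worked example, the gross rate for a multislot allocation is the per-timeslot rate multiplied by the number of allocated timeslots. The sketch below uses the four per-timeslot rates quoted above; the function name and the rounding are illustrative choices, and protocol overhead is ignored.

```python
# Per-timeslot rates in kbit/s as quoted in the text.
RATES_PER_SLOT = (9.05, 13.4, 15.5, 21.4)
MAX_SLOTS = 8


def gprs_gross_rate(rate_per_slot: float, slots: int) -> float:
    """Gross rate in kbit/s for `slots` timeslots, ignoring protocol overhead."""
    if not 1 <= slots <= MAX_SLOTS:
        raise ValueError("slots must be between 1 and 8")
    return round(rate_per_slot * slots, 2)


# Lowest and highest per-slot rate with all eight timeslots allocated.
print(gprs_gross_rate(RATES_PER_SLOT[0], MAX_SLOTS))   # 72.4
print(gprs_gross_rate(RATES_PER_SLOT[-1], MAX_SLOTS))  # 171.2
```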

Figure 3.8. Architecture of a GPRS system

3.9.1.1 SGSN

The Serving GPRS Support Node (SGSN) is responsible for delivering packets from the Gateway GPRS Support Node (GGSN) to the BSC, which forwards the packets to the BTS, which in turn sends them over the air interface to the MS. One SGSN handles a set of nodes and keeps track of the location of the MSs. The current SGSN to which an MS belongs is stored in the HLR, which receives this information over the Gr interface. The SGSN is also responsible for attach/detach of MSs, their authentication and logical link management [9]. For authentication the SGSN can query the EIR about the IMEI to make sure that the MS is allowed to register to the network.

For the circuit switched services in GSM a location management already exists. It can be reused in GPRS; for example, location updates for GSM and GPRS can be combined. In these cases the Gs interface is used for communication between the SGSN and the MSC/VLR.

3.9.1.2 GGSN

The GGSN routes data packets out from the mobile network to the Internet and other packet switched data networks. For incoming packets the node converts the network address (the IP address on the Internet) into a GSM address [9]. The packets are routed in the mobile network to the SGSN responsible for the service area where the MS is located. The interface between the SGSN and the GGSN is called the Gn interface if the two nodes are in the same PLMN and Gp if they are in different PLMNs. When the GGSN needs to know the location of an MS, it can ask the HLR which SGSN the user belongs to. This is carried out over the logical interface called Gc, as shown in Figure 3.8.

3.9.1.3 Location management

For packet access the MS can be in three states:

• IDLE

• READY

• STANDBY

When the MS is in state IDLE its location is unknown to the GPRS network. The MS has to initiate an attach to the network before it moves to the READY state. The MS can perform a GPRS detach procedure, which takes it back to state IDLE. In state READY the transmission of data, both uplink and downlink, is carried out and the MS sends an update about its position for every cell change. If no transmissions have occurred for some time, a timer expires and the MS enters the STANDBY state. In this state the MS sends an update about its position every time it changes Routing Area (RA) [9]. An RA is a subset of the cells within one LA in GSM. The RA is then paged when the cell location is needed, and the MS switches to state READY as soon as it starts sending or receiving packets again. If no transmission takes place in STANDBY state, a STANDBY timer will expire and the MS will start over in IDLE state.

One problem with this solution of location management in GPRS is that the attach process can only be initiated by the MS. This means that push services can only be used when the MS is in READY or STANDBY state. When an MS is in STANDBY state it will send a location update every time it changes Routing Area, which consumes battery power in the MS as well as radio resources.
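The three states and the transitions described in this section can be summarized as a small state machine. The following is a minimal sketch and not an implementation of the GPRS mobility management procedures; the event names (`attach`, `detach`, `ready_timer_expired`, `packet_transfer`, `standby_timer_expired`) are hypothetical labels for the triggers mentioned in the text.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    IDLE = "IDLE"
    READY = "READY"
    STANDBY = "STANDBY"


# Event names are hypothetical labels for the triggers described in the text.
TRANSITIONS = {
    (State.IDLE, "attach"): State.READY,
    (State.READY, "detach"): State.IDLE,
    (State.READY, "ready_timer_expired"): State.STANDBY,
    (State.STANDBY, "packet_transfer"): State.READY,
    (State.STANDBY, "standby_timer_expired"): State.IDLE,
}


def next_state(state: State, event: str) -> State:
    """Return the new state; events that do not apply leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def location_granularity(state: State) -> Optional[str]:
    """Granularity at which the network knows the MS position in each state."""
    return {
        State.IDLE: None,               # location unknown to the GPRS network
        State.READY: "cell",            # update on every cell change
        State.STANDBY: "routing area",  # update on every RA change
    }[state]


state = State.IDLE
for event in ["attach", "ready_timer_expired", "packet_transfer"]:
    state = next_state(state, event)
print(state.value)  # READY
```

Because only the MS can trigger the `attach` event, a network-initiated packet has no transition out of IDLE, which is the push service limitation noted above.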

3.9.2 EDGE

The improvement made in GPRS was, as presented in the previous section, the possibility of using several timeslots for one user. In Enhanced Data Rates for GSM Evolution (EDGE) further improvements were carried out by allowing a dynamic modulation, which improves the data rate. The main change in EDGE is in the air interface, where an MS with good radio conditions can use a higher order modulation scheme. In GPRS and GSM the modulation is Gaussian minimum-shift keying (GMSK), which has one bit per symbol. In EDGE the MS can switch to 8-PSK, which has three bits per symbol [9]. The bit rate is up to 59.2 kbit/s for one timeslot, compared to up to 21.4 kbit/s for one timeslot in GPRS. The data rate is thus roughly three times higher with EDGE.

In EDGE more improvements are introduced in the air interface. Hybrid automatic repeat request is used, which makes retransmissions more efficient. Instead of retransmitting the whole packet, more redundant data is sent, which increases the chances for the receiver to decode the message correctly.

However, there are a few effects on the GSM system architecture when EDGE is introduced. The interface between the BTS and the BSC, called Abis, only supports 16 kbit/s per traffic channel [9]. Since EDGE supports higher data rates, several traffic channels in the Abis interface are allocated for one EDGE channel.

Chapter 4

GSM Evolutions

Chapter Introduction

This thesis is mainly focused on GSM networks, although this chapter presents the latest techniques for mobile communication developed at Ericsson. The standardized techniques WCDMA and LTE are described from a general perspective.

4.1 WCDMA

After GSM the next generation of mobile networks is the third generation, 3G. It is generally called UMTS (Universal Mobile Telecommunications System) and the first and most widespread version is Release 99, which was standardized by the 3GPP [16]. WCDMA is a development of GSM and supports, in Release 99, packet switched data rates of 0.384 Mbit/s. After Release 99 the development of WCDMA has continued, and HSPA (High Speed Packet Access) is one extension of WCDMA where speeds of up to 14 Mbit/s are supported in the first phase [12].

4.1.1 System and network architecture of WCDMA

In WCDMA there is a separation of functions between the RAN (Radio Access Network) and the CN (Core Network). The mobility management is handled in the RAN and hidden from the CN, so different types of Radio Access Networks with the same separation can be connected to the Core Network. This allows a combined CN for both a GSM and a WCDMA RAN. Many operators already have a GSM network, and an investment in WCDMA is therefore lower since the two networks can share the same CN. Handovers between GSM and WCDMA are also supported, which provides better total coverage for the end user because the networks might cover different areas.

Since WCDMA and GSM share the same Core Network, the difference in nodes is in the Radio Access Network. Figure 4.1 shows the logical network of WCDMA, and a description of the nodes and interfaces that differ from GSM follows.


Figure 4.1. Architecture of a WCDMA Network

Node B is the logical node in WCDMA that corresponds to the BTS in GSM. The name ’Node B’ was a temporary name during the process of standardizing WCDMA but was never changed [16]. It is responsible for coding, interleaving, modulation and other physical layer functions. Radio resource functions, such as power control, are also performed in the NodeB.

RNC (Radio Network Controller) is responsible for several NodeBs and is connected to the core network and to other RNCs. An RNC is an anchor point in the network, which means that it is a fixed point from the point of view of the core network, even though the user switches cells. If the user moves to a cell controlled by a NodeB that belongs to another RNC, a concept of serving and drift RNC is used in order to maintain the original RNC as an anchor. The first RNC remains the serving RNC even if the user moves to a NodeB controlled by another RNC, which becomes the drift RNC.

HLR stores the information about the subscriber in a database. It contains information about which services are allowed for the user and the status of services such as call forwarding and call waiting. It also stores information about where the user is located, e.g. which MSC/VLR or SGSN.

Uu interface is the wireless radio connection between the UE and the NodeB. In this thesis there will be no focus on the wireless part. However, a lot of research on mobile communication is being carried out within this area. In WCDMA the multiple access is achieved by spreading the signal into a wide signal of 5 MHz and separating the users by adding a unique code for each user.

Iu is the interface between the UTRAN and the CN. It has one interface for circuit switched (CS) traffic and one for packet switched (PS) traffic. Over this interface information is exchanged between the RAN and the CN for many functions. Some of the functions where the Iu interface is used are paging, hard handover and location reporting. By having standardized interfaces, UTRAN and CN equipment from different manufacturers can function together.

Iub interface connects a NodeB with an RNC. In GSM this is not a standardized interface, and in LTE this interface does not exist, since NodeB and RNC are combined into one logical node. Standardizing this interface opens the possibility for manufacturers to specialize in building NodeBs.

Iur interface is used when two RNCs need to communicate with each other. The interface was mainly intended to be used for soft handovers. It can also be used for exchanging information for Global Resource Management.

4.2 LTE

After WCDMA, LTE (Long Term Evolution) is a development towards the fourth generation mobile communication system. The operators call this technology 4G; however Ericsson, which is a leading developer in this field, claims that LTE is a step towards 4G. It is likely that Ericsson will call LTE Advanced the "real" 4G. In LTE the allocated spectrum can vary from 5 MHz up to 20 MHz. It uses OFDMA (Orthogonal Frequency Division Multiple Access) in the downlink and has a flat IP-based structure [12]. Other targets when designing LTE were low round-trip time, high mobility and high capacity. The peak data rate in the LTE downlink ranges from 100 Mbit/s without MIMO to 326.4 Mbit/s with 4x4 MIMO [25].

4.2.1 System and network architecture of LTE/SAE

The system architecture of LTE, shown in Figure 4.2, is different from that of WCDMA. It has fewer nodes and supports only IP-traffic (packet switched). LTE does not have the same splitting of functions between RAN and CN that there is in WCDMA. One reason for this is the fact that a node corresponding to the RNC does not exist in LTE. Instead most of the functions handled by the RNC are moved to the NodeBs, which are now called eNodeBs in LTE. Without an RNC node, the anchor point is moved to the Core Network, which now has to handle mobility in the network. In LTE there are no macrodiversity requirements in the RAN. Macrodiversity means that signals from several transmitting antennas are combined into an improved signal. This is used in WCDMA, for example in soft handovers between two cells, and is controlled by the RNC.

eNodeB: The eNodeB inherits the functionality of the NodeB and most of the functionality of the RNC from the WCDMA architecture. An eNodeB handles several cells and performs radio link functions like modulation, demodulation, interleaving etc. [12]. In addition to this the eNodeB is in charge of radio resource functions, such as handover decisions and scheduling of the shared medium among users.

EPC: In LTE the Core Network is dramatically changed compared to GSM/WCDMA. The new CN is called Evolved Packet Core (EPC) and the work of designing it is called System Architecture Evolution (SAE) [12]. The objectives of SAE are to have a packet switched (IP-based) domain and to minimize the number of nodes. The Core Network in LTE is reduced to one logical node called EPC and one additional node, HSS, which corresponds to the HLR in GSM/WCDMA. The EPC has to handle mobility and acts as the anchor, i.e. the fixed point as seen from outside the network. When a handover occurs the EPC has to know which the new eNodeB is and start forwarding packets there. There has to be a logical connection between the EPC and all the eNodeBs in the network.

HSS: The Home Subscriber Server (HSS) has a similar function to the HLR in GSM/WCDMA

Figure 4.2. Architecture of an LTE system

and is connected to the EPC through an interface called S6.

X2: This interface connects the eNodeBs with each other. The interface is mainly used for active-mode mobility [12]. When a handover takes place the eNodeB uses this interface to communicate with the neighboring cells that belong to another eNodeB.

S1: The connection between the eNodeB and the EPC uses the interface called S1. This interface corresponds to the Iu packet switched interface in GPRS/WCDMA.

Chapter 5

Current Solutions

Chapter Introduction

The topic of measuring the utilization of the test equipment has been explored earlier at Ericsson. This chapter presents the previously used tools and methods for estimating the usage of test equipment that were encountered during the work with the thesis.

5.1 STP Utilization tool

The utilization tool measures the utilization of System Test Plants (STPs) used in Wideband Code Division Multiple Access (WCDMA). An STP is a set of nodes booked to a specific project. One STP in WCDMA usually contains one or several Radio Network Controllers and other nodes such as Radio Base Stations.

5.1.1 Definitions

The utilization is calculated for a six-hour period for each STP, and up to four levels of utilization are defined: none, low, medium and high usage. The level of usage is derived from several base measures:

• The number of shell connections to the RNC. The shell connections are: telnet, moshell, SSH, FTP, Serial and OSS.

• Whether the settings in the node have changed.

• The number of UEs registered to the node.

These variables can then be combined into logical expressions that have to be true for a certain level of utilization to be considered reached. One example could be:

(number of shell connections > 3) AND (number of changed settings > 0) -> usage = high
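A rule set of this kind can be sketched as an ordered list of predicates over the base measures. This is an illustration under stated assumptions and not the tool's actual code: only the first rule is taken from the example above, the second rule is an invented placeholder, and the names `BaseMeasures`, `RULES` and `classify` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class BaseMeasures:
    shell_connections: int
    changed_settings: int
    registered_ues: int


Rule = Tuple[Callable[[BaseMeasures], bool], str]

# Rules are evaluated in order and the first match decides the level.
RULES: List[Rule] = [
    # Example rule from the text.
    (lambda m: m.shell_connections > 3 and m.changed_settings > 0, "high"),
    # Hypothetical rule, added only to show how further levels could be defined.
    (lambda m: m.shell_connections > 0, "low"),
]


def classify(measures: BaseMeasures, rules: List[Rule] = RULES) -> str:
    """Return the usage level of the first rule that is true, else 'none'."""
    for predicate, level in rules:
        if predicate(measures):
            return level
    return "none"


print(classify(BaseMeasures(shell_connections=4, changed_settings=1, registered_ues=0)))  # high
```

Keeping the rules as data rather than code makes it straightforward to change an expression and recalculate the usage, which is how the web-GUI is described as working.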


5.1.2 Data collecting

The number of UEs active in the RNC at the moment is sampled every 15 minutes. The values are saved in a file for each day. Once every hour a script is scheduled to collect log files from the RNCs with information about the number of shell connections and changed settings. Monode, which is a part of Moshell, is used to create, parse and then delete the log files at the node. The parsing script extracts the shell connections and the changed settings during the last hour. This data is then saved in a log file, one for each RNC and day. From the log files a final data file is assembled, which contains the values over a six-hour period for the shell connections, the number of changed settings and the number of registered UEs for all RNCs. It is from this file that the web-GUI calculates the usage level according to the specified rules.

5.1.3 Data presentation

The web-GUI in the tool is the same as the one used in the LTE eNodeB utilization tool, with a few differences. The tool can display four degrees of utilization; in practice, however, only two levels are used: unused and high, which are shown in Figure 5.2 and Figure 5.1. When the page opens, the usage is calculated according to the specified rules. This gives the opportunity of changing the logical expression and recalculating the usage. The level of usage that is displayed for one day is the highest level of usage of any six-hour period that day. For example in Figure 5.1 "2010-01-02 Saturday" has high level in the 0-24 hour period, since the 12-18 hour period has high level of usage.
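The daily roll-up described above amounts to taking the maximum over the six-hour periods of a day. A minimal sketch, assuming the four levels are ordered none < low < medium < high; the function name and the level strings are illustrative, and the levels of the periods other than 12-18 in the example are made up.

```python
from typing import Sequence

# Levels in ascending order of usage.
LEVELS = ["none", "low", "medium", "high"]


def daily_level(period_levels: Sequence[str]) -> str:
    """Highest usage level among the six-hour periods of one day.

    Raises ValueError for an empty sequence or an unknown level.
    """
    return max(period_levels, key=LEVELS.index)


# Periods 0-6, 6-12, 12-18 and 18-24; only the 12-18 period is highly used.
print(daily_level(["none", "none", "high", "none"]))  # high
```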

Figure 5.1. The second page of the web-GUI that shows the utilization during a time period of 6 hours

5.1.4 Evaluation of the tool

The tool can use many base measures to determine the level of usage. This allows the tool to capture different types of activity in the nodes. However, according to Guido Hüpohl, who is the creator of the tool, the rules turned out to be too complicated. Therefore the tool only uses the number of shell connections and the number of changed settings. A method for defining the classification rules is probably needed if a more complicated expression is used, though with more complicated expressions it is probably true that several kinds of activities can be detected.

Figure 5.2. The main page of the web-GUI that shows the utilization for one day

The model has been verified by testing whether the log files are correct and whether they are processed in the way that was intended. Whether the limits then give correct results has not been validated. The limits are determined from experience. Although the tool is good at collecting data and has an informative GUI, the weakness lies in how to specify the rules and limits. Knowing how to interpret the base measures is essential for the quality of the output from the tool.

The level of usage is calculated for a six-hour period, even though the data is collected every hour. There is no reason for not having the same time resolution for the base measures and the derived measures. Either data could be collected every sixth hour, which would minimize the interference with the testing, or the level of usage could be calculated every hour, which would increase the accuracy of the measurements.

5.2 Utilization tool for the eNodeB in LTE

The tool for measuring the utilization at eNodeB is running today and delivers indicators of the usage of eNodeBs, also called LTE RBSs, at the Ericsson test plants in Kista and Linköping. It was first launched in March 2009 and uses a modified version of the web-GUI from the STP utilization tool and the automatic test environment THC. It contains a data collection part and a data presentation part.

5.2.1 Definitions

The sampling period is 2 hours. The state of the equipment can be used, unused, no data (no data was collected) or disabled (the collecting of data is disabled). The states are derived from the following three base measures:

• The number of bytes transferred over the S1 interface.

• The number of datagrams transferred over the O&M (Operation and Management) interface. This interface is used by the operator to manage the eNodeB and is also used during testing.

• The number of restarts, which is included since the two other counters are zeroed when a restart occurs.

If one of the base measures exceeds a threshold the state is classified as used. The thresholds for the variables are not fixed, although the default values are 125000 bytes for the S1 interface, 5000 datagrams for O&M and two restarts during a period of two hours.
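The threshold rule can be sketched as follows, using the default thresholds from the text. This is an illustration only: "exceeds" is interpreted here as strictly greater than the threshold, which is an assumption, and the names `Thresholds` and `classify_period` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Default values quoted in the text, per two-hour period.
    s1_bytes: int = 125_000
    oam_datagrams: int = 5_000
    restarts: int = 2


def classify_period(s1_bytes: int, oam_datagrams: int, restarts: int,
                    t: Thresholds = Thresholds()) -> str:
    """Classify one two-hour period as 'used' or 'unused'.

    Assumption: a base measure "exceeds" its threshold when it is strictly
    greater than the threshold value.
    """
    exceeded = (
        s1_bytes > t.s1_bytes
        or oam_datagrams > t.oam_datagrams
        or restarts > t.restarts
    )
    return "used" if exceeded else "unused"


print(classify_period(s1_bytes=200_000, oam_datagrams=10, restarts=0))  # used
print(classify_period(s1_bytes=1_000, oam_datagrams=10, restarts=0))    # unused
```

Since the thresholds are not fixed, they are passed in as a parameter so that a node or test type could be given its own values.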

5.2.2 Data collection

To collect data the tool uses an existing test automation framework called THC (Test Harness Core). THC is used as an automated testing environment in which secure connections to all eNodeBs exist. Data is collected by running a test case from a THC server that fetches, parses, validates and stores the data in a database according to Figure 5.3. The commands are transferred to the LTE eNodeB using Moshell. The processing of data takes place in the THC server. The processing consists of calculating the difference in the counter for an interface since the previous value and determining the state of the eNodeB. If the connection could not be established the state is set to no data. This test script is executed every second hour.
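The counter processing step can be illustrated with a short sketch. It is not the THC test script: the function names and the dictionary layout are hypothetical, and treating a counter value lower than the previous one as a reset caused by a restart is an assumption based on the remark that the counters are zeroed on restart.

```python
from typing import Dict, Optional


def counter_delta(previous: int, current: int) -> int:
    """Traffic counted since the previous sample."""
    if current < previous:
        # Assumption: a lower value means the counter was zeroed by a restart,
        # so everything counted since the reset is new traffic.
        return current
    return current - previous


def process_sample(previous: Dict[str, int],
                   current: Optional[Dict[str, int]]) -> Optional[Dict[str, int]]:
    """Per-counter deltas for one sample, or None when no data was collected.

    `current` is None when the connection to the eNodeB could not be
    established; the caller would then record the state "no data".
    """
    if current is None:
        return None
    return {name: counter_delta(previous.get(name, 0), value)
            for name, value in current.items()}


print(process_sample({"s1_bytes": 1000}, {"s1_bytes": 1500}))  # {'s1_bytes': 500}
```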

Figure 5.3. Diagram over data collection

5.2.3 Data presentation

From the database the utilization data is presented in a web-GUI. The main page gives a view over one month and shows the utilization in percent for each day and each eNodeB. The percentage value is calculated by dividing the number of used hours by 24. The number of used hours will always be a multiple of two, since each two-hour period has exactly one state. The user can click on a specific day and node to obtain information concerning the usage distribution for that day and to get the values and thresholds for the variables.

The average values of the utilization for the last 30 days are also calculated for each eNodeB. In this calculation a decision is made regarding how to consider a node that is disabled and a node whose state is no data. Both when a node is disabled and when no data could be collected, the node is considered to be unused. In Figure 5.4 an example of the view of the main page is shown.
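The percentage calculation can be written out as a small sketch. The state strings and function names are illustrative; as described above, only periods in state used contribute, so periods with no data or with collection disabled count as unused.

```python
from typing import Sequence

PERIOD_HOURS = 2  # each state covers one two-hour period


def daily_utilization(states: Sequence[str]) -> float:
    """Utilization in percent for one day, given its twelve two-hour states."""
    used_hours = PERIOD_HOURS * sum(1 for s in states if s == "used")
    return 100.0 * used_hours / 24


def average_utilization(daily_values: Sequence[float]) -> float:
    """Mean of the daily percentages, e.g. over the last 30 days."""
    return sum(daily_values) / len(daily_values)


day = ["used"] * 3 + ["unused"] * 5 + ["no data"] * 2 + ["disabled"] * 2
print(daily_utilization(day))  # 25.0
```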

5.2.4 Evaluation of the tool

An evaluation of the tool has not yet been carried out. The thresholds have not been calibrated, though this would probably improve the accuracy. It is clear that, for most of the nodes, it is the threshold for S1 that is exceeded when the node is considered to be used. For nodes with less usage and a lot of missing data it is the restart variable that exceeds its threshold most of the time. The datagram counter over the O&M interface rarely exceeds its threshold, and when it does the S1 interface also has high traffic. The utilization of a node appears to depend on the type of test performed at the node. If that is the case, which requires deeper investigation, the thresholds may be set depending on the type of tests that are performed at

Figure 5.4. Screenshot of the presentation of utilization

the node. The presentation does not make clear whether the time not counted as used was actually unused, whether no data could be collected or whether the node was disabled. The status no data is unclear in the tool. It is in fact defined in the calculations as not used, which in this case is probably valid but is not justified. Obviously a node cannot be used when it is down; however, one may argue that this time should be subtracted from the total time before calculating the utilization.

5.3 Ericsson Real Utilization Measurement Solution (ERUMS)

ERUMS is a tool that measures the utilization rate of network nodes by analyzing the IP-traffic to and from the node.

5.3.1 Definitions

The base measure is the number of filtered IP-packets to a node during a five-minute period. It is used for the derived measure that shows the utilization for the interval according to the definitions below:

• 0 packets equals 0 % utilization

• 1-10 packets equals 50 % utilization

• 11 packets or more equals 100 % utilization
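The mapping in the list above is a simple step function, which can be sketched as follows. The function name is illustrative and the rejection of negative counts is an added safeguard, not something taken from ERUMS.

```python
def erums_utilization(packets: int) -> int:
    """Utilization in percent for one five-minute interval."""
    if packets < 0:
        raise ValueError("packet count cannot be negative")
    if packets == 0:
        return 0
    if packets <= 10:
        return 50
    return 100


print([erums_utilization(n) for n in (0, 1, 10, 11, 5000)])  # [0, 50, 50, 100, 100]
```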

5.3.2 Data collecting

A schematic picture of a system where ERUMS is used is shown in Figure 5.5. The traffic on the IP-interfaces of the node is mirrored in switches, which forward the packets to a central switch that then forwards the packets to a Linux server. The server first selects the packets whose IP-address matches the IP-address of one of the nodes that are being monitored. Packets that do not indicate activity in a node are rejected according to predefined rules. These packets can be keep-alive messages or packets on particular port numbers that are of no interest. The remaining packets are logged in a MySQL database. Every fifth minute the collected packets in the database are converted into statistical data, since the packets themselves are not interesting, but only the number of packets per time interval. The received packets are stored in the main memory, which is another reason for frequently calculating the statistics from the traffic data and then deleting the data.

Figure 5.5. ERUMS schematic system description

5.3.3 Data presentation

The presentation is carried out at the ERUMS Linux server in a web-GUI where the user can generate graphs of the statistics. The pChart PHP library is used to generate the graphs, which can present the utilization of one or several nodes during different time intervals. The presentation of the daily usage for one node during one month is shown in Figure 5.6. The utilization can be presented either as a percentage of utilization per time interval or as the number of packets per time interval.

Figure 5.6. Screenshot of the presentation using pChart

5.3.4 Evaluation of the tool

ERUMS only measures on IP-interfaces, which is a drawback since nodes that do not have IP-interfaces, or nodes where traffic goes over other interfaces, are impossible or difficult to measure. On the other hand it makes the tool flexible, since it is possible to measure all nodes that do have an IP-interface. It is the filtering of packets that needs to be adapted for different nodes; the other parts of the system are general for all types of nodes.

Another problem with ERUMS is that it is not possible to analyze interfaces with high data rates. For each five-minute interval the traffic is kept in the main memory. For example, if ERUMS can use 1 GB of internal memory it will run out of memory in less than five minutes if the data rate exceeds 26.7 Mbit/s. The data rate into the server is the sum of the data rates for all mirrored interfaces. The problem can be solved by increasing the main memory or decreasing the interval between the statistics calculations.

The tool requires special hardware for the nodes that are to be monitored. Switches that can mirror interfaces are more complex than ordinary switches, which makes it expensive to scale up the solution.

The tool does not use any internal base measures in the nodes, which means that it cannot discover activity that takes place within the node. However, since mobile networks consist of nodes that communicate with each other, it can be claimed that they are only in use while they are communicating.

The motivation for defining the utilization in five-minute intervals as 0, 50 or 100 percent is unclear. The most basic definition is to regard the equipment as either used or unused in one interval. It is difficult to interpret what a percentage of utilization during one time interval actually means.
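The 26.7 Mbit/s figure in the memory example follows from dividing the available memory by the length of the interval. A minimal sketch, assuming 1 GB means 10^9 bytes and that the captured packets occupy exactly their size on the wire, with no per-packet overhead in memory; the function name is illustrative.

```python
def max_sustained_rate_mbit(memory_bytes: int, interval_s: int) -> float:
    """Highest aggregate data rate in Mbit/s that fits in memory for one interval."""
    return memory_bytes * 8 / interval_s / 1e6


# 1 GB of memory and a five-minute (300 s) statistics interval.
print(round(max_sustained_rate_mbit(10**9, 300), 1))  # 26.7

# Halving the interval doubles the rate that can be handled.
print(round(max_sustained_rate_mbit(10**9, 150), 1))  # 53.3
```

The same relation shows why both remedies mentioned above work: the sustainable rate grows linearly with the memory and inversely with the interval length.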

5.4 ENSIEM adaption for node utilization

The tool is an ENIQ-based SIEM application called ENSIEM (Ericsson Network Security Information and Event Management). ENIQ (Ericsson Network IQ) is an Ericsson product designed for Performance Management in a multi-technology network and is a part of the Ericsson OSS product. It can collect data from a variety of network elements and create reports about the condition of the network.

It is planned that the tool will replace the STP utilization tool, which today is used for measuring the utilization of RNC and MGW nodes in the Ericsson test plant in Jorvas, Finland. These nodes are WCDMA products which are based on Cello, a generic platform for telecom applications having ATM, TDM or IP transport [22]. The main purposes of the new tool are better security management and better reporting possibilities. The tool is created at BUGS Ericsson Test Environment (BETE), which influences the objectives for the tool. Better security management means that usage of unbooked equipment should be detected, since users pay BETE for test equipment time based on the booking of the equipment.

5.4.1 Definitions

The tool uses the same definition of utilization as the Utilization tool for STP in WCDMA (see 5.1.1), with the levels no, low, medium and high usage. The base measures are also the same:

• The number of shell connections to the RBS. The shell connections are: telnet, moshell, SSH, FTP, Serial and OSS.

• Whether the settings in the node have changed.

• The number of UEs registered to the node.

In this tool the usage is measured per hour. The level of usage is classified depending on the values of the three base measures. It is almost solely the number of registered UEs that decides the level of utilization for each hour. The default thresholds for the number of registered UEs are 1-99 for low usage, 100-499 for medium usage and > 500 for high usage.
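The default thresholds can be written as a small classification function. This is a minimal sketch and not the ENSIEM implementation: it only looks at the number of registered UEs, it assumes that zero registered UEs means no usage, and it treats exactly 500 UEs as high usage, a case the stated thresholds leave open.

```python
def hourly_usage_level(registered_ues: int) -> str:
    """Map the number of registered UEs in one hour to a usage level."""
    if registered_ues <= 0:
        return "no"       # assumption: no registered UEs means no usage
    if registered_ues < 100:
        return "low"      # 1-99
    if registered_ues < 500:
        return "medium"   # 100-499
    return "high"         # assumption: 500 itself is counted as high


for ues in (0, 42, 250, 1200):
    print(ues, hourly_usage_level(ues))
```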

5.4.2 Data collection

Data collection agents are software running on the ENIQ server that collect the base measures. Data can be collected from the monitored nodes by SNMP polling, by collecting and parsing log files, or by parsing command printouts. It is then transferred using FTP, sFTP, SSH or Telnet, depending on what the node supports. The collected utilization data is stored in an ENIQ database, where all data from the monitored nodes is saved.

5.4.3 Presentation

The user interface is the ENSIEM dashboard, a web-based service that uses the reporting capabilities of BusinessObjects to create reports and provide a GUI. Data for the presentation is stored in the ENIQ database. The GUI for the utilization per day is shown in Figure 5.7, where the usage is mapped onto the four levels per day. The daily usage level is determined by the number of hours the equipment is considered used that day, independent of the level of usage per hour; low, medium and high are here considered the same. 0-3 hours is no usage, 4-6 hours is low usage, 7-10 hours is medium usage and 11-24 hours is high usage.

BusinessObjects is a business intelligence product from SAP. It is used for analytics, dashboards, visualization and reporting. The utilization tool uses the dashboard functionality, which includes features such as flexible report selection and the possibility to share, save and schedule reports. BusinessObjects implements access control, which is important since users should only be able to see and generate reports with information that they are allowed to access.
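The mapping from hourly levels to a daily level can be sketched as follows. The function and the level names are illustrative assumptions; only the hour limits are taken from the text.

```python
def daily_usage_level(hourly_levels):
    """Derive the daily usage level from 24 hourly usage levels.

    Any hour with level low, medium or high counts as one used hour;
    the hourly level itself is ignored, as described in the text.
    """
    used_hours = sum(1 for level in hourly_levels if level != "no")
    if used_hours <= 3:
        return "no"
    if used_hours <= 6:
        return "low"
    if used_hours <= 10:
        return "medium"
    return "high"


# Hypothetical day: 16 unused hours, 5 hours of low and 3 hours of high usage.
day = ["no"] * 16 + ["low"] * 5 + ["high"] * 3
print(daily_usage_level(day))
```

In this example the eight used hours give a medium day, regardless of the three hours with high usage.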

5.4.4 Evaluation of the tool

The strength of the ENIQ-based utilization tool is that it uses an ENIQ server, which is intended for collecting data from nodes, and that it is flexible in presenting the utilization data. By building on existing solutions many features, such as access control, are already implemented. With the ENIQ server, already established connections to the nodes can be used, which simplifies the implementation of the tool. An additional strength is that the booking and scheduling information for the equipment is available to the tool. It is not used today; however, it could be used to apply specific classification rules. For example, if an STP is booked for function test, classification rules specific to function test can be used, with lower thresholds than for load test.

The drawbacks of the tool are the same as for the STP utilization tool for WCDMA nodes. The definition of utilization is not motivated, and it is not possible to interpret what low utilization during one hour means. The different levels of utilization per hour are not used when calculating the utilization per day, which may indicate that the different levels of usage (low, medium and high) are unnecessary.

Figure 5.7. Screenshot of the presentation of utilization

5.5 Booking degree as utilization measure

Ericsson today evaluates the utilization of test equipment by calculating the booking rate in BAMS, the asset management system used by BETE. This is important for BETE since the customers (testers) only pay for the time the equipment is booked. The booking degree indicator is calculated in a naive way, as the average of the booking degree over all STPs. The value or cost of the STPs is not considered. If the booking degree of each STP were weighted by its individual cost when calculating the average, the indicator would better meet the information need of BETE. In BAMS, STPs can be booked in units of a quarter of an hour; however, customers usually book an STP for several months. For that reason the booking degree is a poor indicator of the real-time utilization efficiency of test equipment. If equipment were booked at a more detailed level, the booking degree would be a good indicator of the equipment utilization.
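The difference between the naive average and a cost-weighted average can be illustrated with a short sketch. The booking degrees and costs below are invented for illustration and the function names are hypothetical; BAMS itself is not queried.

```python
def mean_booking_degree(degrees):
    """Naive indicator: plain average of the booking degree of all STPs."""
    return sum(degrees) / len(degrees)


def cost_weighted_booking_degree(degrees, costs):
    """Average booking degree where each STP is weighted by its cost."""
    return sum(d * c for d, c in zip(degrees, costs)) / sum(costs)


# Hypothetical data: a cheap STP that is booked 90 % of the time and an
# expensive STP (nine times the cost) that is booked 30 % of the time.
degrees = [0.9, 0.3]
costs = [1.0, 9.0]

naive = mean_booking_degree(degrees)
weighted = cost_weighted_booking_degree(degrees, costs)
print(f"naive: {naive:.2f}, cost-weighted: {weighted:.2f}")
```

With these numbers the naive indicator is 0.60 while the cost-weighted one is 0.36, since the expensive and poorly booked STP dominates the weighted average.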

5.6 Other test efficiency indicator

5.6.1 Fault-slip-through

During a test process it can be difficult to measure how efficiently the tests are being performed. If only the faults found were counted, the indicator would depend on the total number of faults in the software. The total number of faults in software is very difficult to estimate, which makes an indicator that only counts the faults found misleading. Fault-slip-through measurements are therefore more accurate in indicating the efficiency of software testing.

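[Figure: the test phases in order: Function Test (FT), System Integration Test (SIT), System Robustness Test (SRT), System Verification Test (SV) and First Office Application (FOA). Beneath the phases the figure lists BSC Design, BSC I&V, BSC System, BSS I&V and FOA customers. An arrow labeled "Fault-slip-through" runs from the phase where the fault should be detected to the phase where the fault is detected.]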

Figure 5.8. Definition of fault-slip-through

The later a fault is found, the higher the cost of fixing it [13]. Fixing faults after delivery can be a hundred times more expensive than finding and fixing them early in the test process. Fault-slip-through is measured by comparing the phase in which a fault was found with the phase in which it should have been found. One example is shown in Figure 5.8, where a fault that should have been found in the FT phase is instead found in the SV phase. The phase where a fault should have been found is defined as the phase where it is most cost-efficient to find it, which almost always means the earlier the cheaper. However, some faults cannot be detected in early phases due to their complexity.

Where the fault should have been found is estimated by the person who reports the fault (the tester) or the person who corrects it (the developer). It is important that a definition for these estimations is specified and that the developers and testers are properly educated [18]. When all faults are categorized they can be assembled in a table that shows where each fault was found and where it should have been found. Table 5.1 shows one example of how the fault-slip-through can be presented for the test process. DIDDET (did detect) is the phase in which the fault was detected and SHODET (should detect) is the phase in which the fault should have been detected.

Table 5.1. Example table over slip through data for each phase

              DIDDET:
SHODET:   Design    FT   SIT   SRT    SV   FOA
Design        27    13    12     7     8     4
FT             -    21    13    17     9    12
SIT            -     -    37    18     7     9
SRT            -     -     -    21    15     3
SV             -     -     -     -    32    10
FOA            -     -     -     -     -    41
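From a table like Table 5.1 a slip-through ratio per phase can be derived. The sketch below uses one possible reading, which is an assumption and not necessarily the definition used in [18]: for each phase, the share of the faults that should have been detected in that phase but were detected in a later one. The "-" cells are represented as 0.

```python
PHASES = ["Design", "FT", "SIT", "SRT", "SV", "FOA"]

# Rows: phase where the fault should have been detected (SHODET).
# Columns: phase where the fault was detected (DIDDET).
FAULTS = [
    [27, 13, 12, 7, 8, 4],
    [0, 21, 13, 17, 9, 12],
    [0, 0, 37, 18, 7, 9],
    [0, 0, 0, 21, 15, 3],
    [0, 0, 0, 0, 32, 10],
    [0, 0, 0, 0, 0, 41],
]


def slip_through_rates(faults):
    """Share of the faults belonging to each phase that slipped to a later phase."""
    rates = {}
    for i, phase in enumerate(PHASES):
        belonging = sum(faults[i])          # all faults that belong to the phase
        slipped = sum(faults[i][i + 1:])    # those detected in a later phase
        rates[phase] = slipped / belonging if belonging else 0.0
    return rates


rates = slip_through_rates(FAULTS)
for phase in PHASES:
    print(f"{phase:6s} {rates[phase]:.1%}")
```

With the example data, 51 of the 72 faults that belong to FT are found in later phases.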

Ericsson today uses fault-slip-through, or fault slippage, when the test operation in a project is evaluated. Studies of the test operation at Ericsson have shown that, according to the fault-slip-through measure, there is a possible improvement of 32 % and 39 % in the Function Test phase and of 85 % and 86 % in the design phase for two case studies of software projects at Ericsson [18].

Chapter 6

General model for utilization measurements

Chapter Introduction

This chapter presents a general model for measuring the utilization of test equipment. The model is based on the theories from the frame of reference in Chapter 2 and the analysis of previous tools in Chapter 5.

6.1 Efficiency indicators for test equipment

The model suggests a differentiation of the efficiency indicators for test equipment into the following two indicators:

Table 6.1. Equipment efficiency indicators

Indicator                                  Definition
Equipment Utilization Efficiency           Shows the proportion of time the equipment is used.
Test Performance and Quality Efficiency    Shows the efficiency of the test operation when the equipment is used.

The two indicators can be combined into a theoretical indicator called Overall Test Equipment Efficiency (OTEE) that captures the overall effective usage of test equipment. The separation into these two indicators allows a distinct definition of utilization that only requires a classification of the equipment usage into discrete states. To calculate the equipment utilization only the states "used" and "idle" have to be classified; however, if the metric is split up into additional metrics, other states of the equipment have to be monitored. In Figure 6.1 the concept of Overall Test Equipment Efficiency is illustrated graphically. The separation of OTEE gives a context to the Equipment Utilization Efficiency, which is the scope of this thesis. If the Equipment Utilization Efficiency is low, OTEE will also be low; however, if the Equipment Utilization Efficiency increases, the Test Performance and Quality Efficiency can decrease, which leaves the OTEE unchanged.

This can be exemplified with the following scenario. A piece of test equipment is used once every day for a test that lasts twelve hours. The test has a high Test Performance and Quality Efficiency, estimated at 80 percent, and the equipment is used 50 percent of the time. This gives an OTEE of 40 percent. One day the test fails when it is almost finished and is repeated. The equipment is then utilized 100 percent of the time. The Test Performance and Quality Efficiency is half of what it was the previous day, 40 percent, since the utility of the test is the same although it lasts twice as long. The OTEE is unchanged at 40 percent this day too. The example shows that an increase in Equipment Utilization Efficiency does not necessarily increase the Overall Test Equipment Efficiency; however, it gives an upper limit for OTEE.
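The scenario can be reproduced numerically. The sketch assumes, as the percentages in the example imply, that OTEE is the product of the two indicators; the function name is hypothetical.

```python
def otee(utilization_efficiency: float, performance_quality_efficiency: float) -> float:
    """Overall Test Equipment Efficiency, assumed here to be the product
    of Equipment Utilization Efficiency and Test Performance and Quality
    Efficiency."""
    return utilization_efficiency * performance_quality_efficiency


normal_day = otee(0.50, 0.80)   # one twelve-hour test, 80 % performance
rerun_day = otee(1.00, 0.40)    # test repeated: double the time, same utility
print(f"normal day: {normal_day:.0%}, rerun day: {rerun_day:.0%}")
```

Because the performance factor can never exceed one, the utilization factor is an upper limit for the product.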

[Figure: Overall Test Equipment Efficiency split into two branches. Test Performance and Quality: shows how effectively the tests are performed in the equipment and their quality (not the scope of this thesis). Equipment Utilization: shows the proportion of time the equipment is used, split further into Availability rate and Operational rate.]

Figure 6.1. The OTEE concept, which puts Equipment Utilization in a context

The concept of OTEE, shown in Figure 6.1, illustrates that the definition of equipment utilization exists in a context of other performance and quality measurements, which are needed for the whole picture of how efficiently the equipment is being used. If only Equipment Utilization Efficiency is used for this purpose, it will overestimate the overall equipment efficiency. To get the real overall efficiency the Test Performance and Quality Efficiency is needed. It is not within the scope of the thesis; however, here follow a few examples of ways to estimate it:

• Number of activities completed compared to the maximum number possible.

• Comparison of the actual test result and the expected test result.

• Measuring the degree of utilization from the equipment, if it is possible to achieve this.

The performance efficiency is highly dependent on the equipment and the activities that are conducted. No detailed description is presented within the scope of this thesis.

6.2 Equipment Utilization Efficiency

The equipment utilization indicator is calculated by dividing the time the equipment has been used by the total time. The indicator can be separated into one indicator that shows the proportion of time the equipment is available and another that shows the proportion of time the equipment is used compared to the available time. These indicators are inherited from the Overall Equipment Efficiency concept described in Chapter 2.1; they are Availability Efficiency and Operational Efficiency, and multiplied together they give the Equipment Utilization.

The utilization of equipment is modeled by classifying the equipment state for each time period. This means that the equipment is only allowed to have one state per time period and that there is a limited number of possible states. To calculate Availability Efficiency and Operational Efficiency at least the following three states have to be classified: used, idle and unavailable. The definitions of the efficiency indicators are presented in Equations 6.1 to 6.3, where N is the number of time periods and all time periods are of equal length.

\[
a_i = \begin{cases} 1 & \text{if the equipment is available for time period } i \\ 0 & \text{otherwise} \end{cases}
\]

\[
u_i = \begin{cases} 1 & \text{if the equipment is used for time period } i \\ 0 & \text{otherwise} \end{cases}
\]

\[
i = 1, \dots, N
\]

\[
\text{Availability Efficiency} = \frac{\sum_{i=1}^{N} a_i}{N} \tag{6.1}
\]

\[
\text{Operation Efficiency} = \frac{\sum_{i=1}^{N} u_i}{\sum_{i=1}^{N} a_i} \tag{6.2}
\]

\[
\text{Equipment Utilization} = \frac{\sum_{i=1}^{N} u_i}{N} \tag{6.3}
\]
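Equations 6.1 to 6.3 translate directly into code. The sketch below is a minimal illustration with hypothetical names; it returns an Operation Efficiency of zero when the equipment was never available, which is an assumption since the equations leave that case undefined.

```python
def efficiency_indicators(available, used):
    """Compute (Availability Efficiency, Operation Efficiency,
    Equipment Utilization) from the per-period indicators a_i and u_i.

    available and used are equally long sequences of 0/1 values, one
    value per time period of equal length.
    """
    n = len(available)
    if n == 0 or n != len(used):
        raise ValueError("need two equally long, non-empty sequences")
    total_available = sum(available)
    total_used = sum(used)
    availability = total_available / n                     # (6.1)
    # Assumption: define the ratio as 0 when the equipment was never available.
    operation = total_used / total_available if total_available else 0.0  # (6.2)
    utilization = total_used / n                           # (6.3)
    return availability, operation, utilization


# Four time periods: available in the first three, used in two of them.
a = [1, 1, 1, 0]
u = [1, 0, 1, 0]
availability, operation, utilization = efficiency_indicators(a, u)
print(availability, operation, utilization)
```

The product of the first two values equals the third, as stated in the text.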

The Availability Efficiency shows the proportion of time that the equipment is available for testing compared to the total time. A low value can be due to the equipment being down for maintenance or not being booked for testing. Operation Efficiency shows the proportion of time that tests are carried out on the equipment compared to the time the equipment is available for testing. A low value means that less testing was carried out than planned, or that the equipment was booked for a longer time than the planned tests needed. Fewer tests than planned may be carried out because other equipment needed for the tests has failed or because the testers have to prioritize other tasks.

In the Ericsson test environment the projects and departments that use test equipment have to book it. With information about the booking status of the equipment, other efficiency measures are possible. Other interesting indicators are used time compared to booked time and down time compared to total time. The down time indicator can be used in negotiations with suppliers in the case of equipment not manufactured by Ericsson.

6.3 The state of the test equipment

The general states for the test equipment are the following:

Table 6.2. General equipment states

State     Booking status   Description
Down      Unbooked         Equipment is not available for test
Down      Booked           Equipment is down when it is intended to be used
Idle      Unbooked         Equipment is not planned to be used
Idle      Booked           Equipment is planned to be used but is not
Used      Unbooked         Not allowed test in the equipment
Used      Booked           Test or test related activities in the equipment
No data                    The state of the equipment cannot be classified

The state Used is when the equipment carries out the activity it is intended for. Test equipment is obviously Used when a test is being carried out, but also during all the activities necessary for carrying out the test and for analyzing the results. The type of activities that make the equipment count as used differs depending on the equipment. The state Idle is when the equipment is available for test but is not used. Equipment is considered to be in state Down if no tests can be carried out due to equipment failure or maintenance.

In the Ericsson test environment almost all equipment has to be booked before the testers are allowed to use it. The booking system is central in the asset management of Ericsson, as described in Chapter 1.8, and unbooked can therefore be considered an additional state of the equipment. The state is then a combination of the usage status (used, idle or down) and the booking status (booked or unbooked). If the equipment lacks a booking status, or if the information cannot be accessed, the booking status can be neglected. The booking status mainly contributes information about when unbooked equipment is used, and explains why equipment is idle when it is unbooked.

The equipment utilization indicator equations 6.1-6.3 require information that specifies whether the equipment is available and whether it is being used. How the states map to this information depends on whether booking information exists.

If booking information exists:

• Equipment is available when it has state: (used, booked), (used, unbooked) and (idle, booked).

• Equipment is used when it has state: (used, booked) and (used, unbooked).

If no booking information exists:

• Equipment is available when it has state: used and idle.

• Equipment is used when it has state: used.
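The two mappings above can be expressed as one function that produces the indicator values a_i and u_i for a time period. The function name and the use of None for missing booking information are assumptions made for this sketch.

```python
from typing import Optional, Tuple


def availability_and_use(usage: str, booked: Optional[bool]) -> Tuple[int, int]:
    """Return (a_i, u_i) for one time period.

    usage is "used", "idle" or "down"; booked is True, False, or None
    when no booking information exists.
    """
    used = 1 if usage == "used" else 0
    if booked is None:
        # No booking information: available when used or idle.
        available = 1 if usage in ("used", "idle") else 0
    else:
        # Booking information exists: idle equipment only counts as
        # available when it is booked.
        available = 1 if usage == "used" or (usage == "idle" and booked) else 0
    return available, used


print(availability_and_use("idle", True))    # booked but not used
print(availability_and_use("idle", False))   # unbooked and not used
print(availability_and_use("used", False))   # unbooked usage
```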

6.3.1 Measurement methods

The equipment state has to be decided for each time period in order to calculate the efficiency indicators described in Section 6.2. There is no general method for deciding the equipment state that can be applied to all types of test equipment; there are, however, similar approaches that can be used. Data has to be collected for all types of equipment, and from the collected data the equipment state has to be classified.

An attribute is a property of an entity that can be measured and that gives information that can be used to classify the equipment state. The entity is often the equipment itself, and the attribute can be traffic on an interface, a status in the equipment that changes when activities occur, or interaction with the equipment that indicates activity. If the entity is not the equipment, the attribute can be of a more indirect character, such as which type of test (function test, load test or feature test) is planned for the equipment or which project has booked it. The ISO/IEC measurement process standard, described in Section 2.4, labels it a base measure. The base measure can be either a sampled value or a counter value. If sampled values are used there is uncertainty in the classification of the equipment state, which is discussed later.

6.3.2 Classification of equipment state

All the base measures for one time period and one piece of equipment form a record, which is used in the classification of the equipment state. The record is the input to an algorithm that outputs the equipment state (used, idle or down) for the time period. The algorithm can consist of logical expressions or decision trees, which can be constructed by experts on the equipment who know how it is used. Another approach is to collect records of base measures and at the same time let the users of the equipment manually write down the state of the equipment. This gives already classified records that can be used for building up the classification rules.
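A classifier built from logical expressions could look like the sketch below. The record fields and the rules are hypothetical and only loosely inspired by the base measures of the tools in Chapter 5; real rules would have to come from experts on the equipment in question.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Base measures for one piece of equipment and one time period
    (hypothetical fields chosen for illustration)."""
    reachable: bool          # did the equipment answer at all?
    shell_connections: int   # number of open shell connections
    settings_changed: bool   # configuration changed during the period
    registered_ues: int      # number of registered UEs


def classify_state(record: Record) -> str:
    """Classify a record into one of the states used, idle or down."""
    if not record.reachable:
        return "down"
    if (record.shell_connections > 0
            or record.settings_changed
            or record.registered_ues > 0):
        return "used"
    return "idle"


print(classify_state(Record(True, 2, False, 0)))
print(classify_state(Record(True, 0, False, 0)))
print(classify_state(Record(False, 0, False, 0)))
```

The same interface could wrap a decision tree trained on manually classified records, which is the second approach mentioned in the text.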

6.3.3 Time resolution

There are two times of interest in the general model for utilization measurements: the time period for which the state of the equipment is determined, and the time period between collections of data. The time resolution of the equipment state is a trade-off between minimizing the interference with the test operations and having high accuracy in the equipment state. Shorter time periods will usually have a higher cost since more resources have to be used. If the time period for determining the equipment state is f, the time S_i that the equipment spends in a state i has to be greater than twice f to be sure that the state is discovered. This is similar to the Nyquist-Shannon sampling theorem illustrated in equation 2.7. Figure 6.2 shows an example of equipment that is idle for nearly two time periods although no time period is classified as idle since, in this example, that requires the equipment to be idle for the entire time period.

[Figure: a binary signal over time that alternates between Used and Idle; each of the four time periods is classified as Used.]

Figure 6.2. Sampling a binary signal that describes the equipment state.

If only counters are used as base measures, the accuracy will not increase if the counters are collected with higher frequency, as shown in Section 2.3.2. There is therefore no point in having a time period for the data collection that is shorter than the time period for the equipment state. However, if samples are used as base measures there is an uncertainty in the base measures that decreases with higher sampling frequency. Which sampling frequency is needed depends on the distribution of the variable that is being sampled. For a slightly varying variable it is often sufficient to sample just once or only a few times. A base measure whose distribution has high variance will give a high error in estimating the state of the equipment unless more than a hundred samples are used (see Section 2.3.2).

If an error in the individual equipment states can be accepted, the error of the indicator will be smaller the more states that are used to calculate the indicator. In Table 6.3 the error in estimating the equipment utilization is calculated using equation 2.16, where the equipment states (idle, used) are sampled with different time periods and the utilization is calculated over different times. The calculation requires the assumption that the state samples are binomially distributed. In the table α = 0.95 and p = 0.5, which is the value of p that gives the highest error (see 2.3.2.2).

Table 6.3. Error in indicator when sampling equipment state

Equipment Utilization indicator     Sampling period
calculated for one:                 15 min    30 min    1 hour    2 hours
Day                                 10.0%     14.2%     20.0%     28.3%
Week                                 3.8%      5.4%      7.6%     10.7%
Month                                1.8%      2.6%      3.7%      5.2%
Quarter                              1.1%      1.5%      2.1%      3.0%
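The order of magnitude of the values in Table 6.3 can be reproduced with the usual normal approximation of a binomial proportion. This is a sketch under assumptions: equation 2.16 is not restated in this chapter, so the formula below (half-width z * sqrt(p(1-p)/n) with z = 1.96 for 95 percent confidence) is assumed to correspond to it, and a month and a quarter are assumed to be 30 and 90 days. Individual values may differ from the table in the last digit.

```python
import math

Z_95 = 1.96  # normal quantile for a 95 % confidence level


def utilization_error(n_samples: int, p: float = 0.5, z: float = Z_95) -> float:
    """Half-width of the confidence interval for a proportion estimated
    from n_samples binomially distributed state samples."""
    return z * math.sqrt(p * (1 - p) / n_samples)


PERIOD_HOURS = {"Day": 24, "Week": 7 * 24, "Month": 30 * 24, "Quarter": 90 * 24}
SAMPLING_HOURS = [0.25, 0.5, 1, 2]

for name, hours in PERIOD_HOURS.items():
    errors = [utilization_error(int(hours / s)) for s in SAMPLING_HOURS]
    print(name.ljust(8), "  ".join(f"{e:6.1%}" for e in errors))
```

Since the error scales with one over the square root of the number of samples, quadrupling the sampling period only doubles the error.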

The calculation in Table 6.3 shows that even a sampling period of 2 hours gives an error of less than 11 percent when estimating the Equipment Utilization over one week. The assumption in the calculation that the state samples are binomially distributed is probably a worst-case scenario. The interpretation of the result is that when samples have to be used instead of counters, sampling does not need to be carried out very often if the information requirement is the utilization over a week or more.

The time resolution also determines the amount of data that needs to be stored. If the equipment state is decided per hour there will be 8760 records per year for one piece of equipment. The number of nodes or pieces of equipment in the whole test environment is several thousand, so the total number of records per year will be more than ten million. The size of a record depends on the information that is stored. One record can hold either the base measures, the derived measures (equipment states in this case) or both. If the amount of data becomes larger than what is manageable, it can be aggregated and only the indicators saved. Detailed historical information about the utilization is not needed; rather, the indicators are required for benchmarking and comparison.

Chapter 7

Common Utilization Tool

Chapter Introduction

In this chapter we discuss how a common utilization tool, the Ericsson Common Utilization Tool, could be implemented with separate collector measurement modules but with common storage, configuration, presentation and Key Performance Indicator measurements.

7.1 Schematic model for a general utilization tool

As seen from the discussions in the previous chapter, the general utilization model consists of a general definition of equipment utilization, general efficiency indicators and general states for test equipment. However, the method of collecting data and making measurements to classify the specific test equipment state into one of the general states is not specified in the general model. Figure 7.1 shows a schematic model of a general utilization tool that we propose for test equipment. Since it is hard to develop a utilization collector that is able to detect all types of activities, the utilization tool should consist of separate modules. The sampling module is specific to each type of equipment; however, in many cases it is possible to reuse sampling modules for new equipment. The data collection and equipment state classification follow the same methodology. The output from the sampling modules is the general equipment state for each time period, which is stored in a generic database. Since data is stored in the same database, in the same format and with the same definitions, a common presentation view is possible. Regardless of the type of equipment, it can be presented in a similar graphical interface using the same performance indicators, which have been suggested earlier.

[Figure: nodes are monitored by equipment-specific state sampling modules (Equipment A, Equipment B, ..., Equipment Z). Each module collects data on equipment attributes as base measures, and an equipment state classifier derives the equipment state (idle, used, down). The states are stored in a generic utilization database, from which a common utilization presentation shows the performance indicators Equipment Utilization, Equipment Availability and Equipment Operational.]

Figure 7.1. Schematic model over a general utilization tool

7.2 Modules

The use of separate collector modules for determining the predefined states of the equipment is needed to ensure that the utilization is as accurate as possible. Separate modules also make it easier to distribute the development of the collectors to different organizations. This means that, for instance, the GSM part of Ericsson can implement a BSC module, as we have done, while BETE designs a module for the RBS platform. It also permits the reuse of the utilization tools that have already been developed throughout the Ericsson organizations. These tools could, at startup, deliver data both to the old location and to the new common storage location.

A module should be constructed to first collect the base measures and, with the help of these, classify the state of the equipment into derived measures. The derived measures should then be delivered to the utilization database in the format stated by common guidelines that specify the state of the tested resource. This should include information on equipment, time period, utilization state and a comment for the utilization record.

The implementation of a module could be carried out in a variety of ways, but we support the idea of using a pre-made system that communicates with resources, such as Test Harness Core (THC). This will significantly reduce the development and maintenance time needed to measure utilization. Since a tool such as THC is developed to support "all" equipment that is under test, and to support the testing of equipment, it provides many of the features needed to collect the data. Another important benefit of using THC is that sites that do not already have THC implemented get an environment deployed that can be used to automate the tests performed at the site.

The common utilization tool could provide a code structure that could be used to develop the test case and the test code for the module. To support reuse of code written for the individual utilization modules, a common repository such as eForge should be used to store all code developed for the modules.
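A common code structure for the modules could, for example, be an abstract base class that fixes the order collect, classify, deliver, while each module fills in the equipment-specific parts. The sketch below is only an illustration of that idea: the class names, the fields and the demo module are hypothetical, and no THC or database API is used.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class UtilizationRecord:
    """Derived measure delivered to the utilization database."""
    equipment: str
    period_start: datetime
    period_end: datetime
    state: str          # "used", "idle" or "down"
    comment: str = ""


class SamplingModule(ABC):
    """Skeleton shared by all equipment-specific modules."""

    @abstractmethod
    def collect(self, equipment: str, start: datetime, end: datetime) -> dict:
        """Collect the base measures for one time period."""

    @abstractmethod
    def classify(self, base_measures: dict) -> str:
        """Classify the base measures into a general equipment state."""

    def run(self, equipment: str, start: datetime, end: datetime) -> UtilizationRecord:
        measures = self.collect(equipment, start, end)
        return UtilizationRecord(equipment, start, end, self.classify(measures))


class DemoBscModule(SamplingModule):
    """Hypothetical module that returns canned data instead of querying a node."""

    def collect(self, equipment, start, end):
        return {"active_sessions": 3}

    def classify(self, base_measures):
        return "used" if base_measures["active_sessions"] > 0 else "idle"


start = datetime(2010, 6, 10, 8, 0)
record = DemoBscModule().run("bsc-01", start, start + timedelta(hours=1))
print(record)
```

Because every module returns the same record type, storage and presentation do not need to know which kind of equipment produced it.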

7.2.1 Time resolutions of measurements

The requirements on the time resolution of the measurements will probably differ between different kinds of equipment and organizations. Therefore the common utilization tool should not set any strict rules on this subject. Our recommendation is that records should be stored on an hourly basis. This recommendation is based on the experience of our supervisor at Ericsson, Torbjörn Wikström, of how the equipment can be shared between testers, and this resolution is also preferred in the BETE Asset Management System. It will also reduce the amount of storage needed for the utilization records. Where it is possible, for example when the test case that collects the utilization has an execution time longer than one hour, or when hourly resolution is not needed, it is better to lower the resolution than to interpolate, since interpolation forces more data to be stored. In cases where data needs to be sampled at a higher resolution than one hour, we recommend that this information is stored in a separate table or database, to keep the structure of the main database intact and to maintain the performance of the presentation layer.

7.3 Database

Since the overall aim of the Common Utilization Tool is to define a uniform base for the utilization of test equipment, it is convenient to store the utilization data in one common source. This will force the different developers and users of utilization measurements to think of utilization for different types of equipment in the same way. It will also support the use of the same KPIs for all types of resources and the creation of a common presentation GUI.

The requirements on data storage for a common utilization tool are quite high due to the large number of resources (entities in the database). Therefore it is important to keep a simple structure and to reduce the information that needs to be stored. The structure is very important because a poor database structure will reduce the performance significantly. The number of operations the database has to perform during a transaction grows quickly if the number of users, and therefore the load, grows fast, and could make the database overloaded and unusable. Ericsson today has approximately 71 sites; if each of them has, say, 500 items to be measured around the clock, this gives 71 * 500 * 24 = 852 000 records/day. These records will soon require a large storage area.

A solution to this problem is to separate the stored information into distributed databases. The central storage should, in this scenario, only have to store management information on a daily or monthly basis, to produce the required KPI measurements and presentation at a lower resolution. Together with this centralized storage, regional site-specific databases are used, to which the collector modules send their data. These local databases are then consolidated on a daily basis to the central database. The local database can also store module-specific data, such as the DNS name of a resource and whether it should be measured. This gives the system higher performance and keeps the information needed for classification: limits, attributes and other previous state variables. It also makes it simpler to design individual collector modules, because the designer can decide independently what information to store and what other modules are needed.

The required data that has to be stored about a resource in the common utilization tool is:

• Unique identifier

• Site

• Type of equipment

• Stakeholder (could be project or domain etc.)

And for the resource utilization:

• Timestamp (measurement period)

• State (used, unused, down)
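The storage estimate and the daily consolidation from local to central storage discussed in this section can be sketched as follows. The function names and the shape of the consolidated record are assumptions; the state names follow the list above.

```python
from collections import Counter


def records_per_day(sites: int, items_per_site: int, records_per_item: int = 24) -> int:
    """Number of utilization records produced per day with hourly records."""
    return sites * items_per_site * records_per_item


def consolidate_day(hourly_states):
    """Reduce one day of hourly states for a resource to daily indicators,
    as a local database might do before sending data to the central one."""
    counts = Counter(hourly_states)
    n = len(hourly_states)
    available = counts["used"] + counts["unused"]
    return {
        "availability": available / n,
        "utilization": counts["used"] / n,
    }


print(records_per_day(71, 500))

# Hypothetical day: 12 hours used, 6 hours unused, 6 hours down.
day = ["used"] * 12 + ["unused"] * 6 + ["down"] * 6
daily = consolidate_day(day)
print(daily)
```

Consolidating in this way replaces 24 hourly records per resource with one daily record in the central database.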

Another important type of information to store is whether the resource is booked or not. This information could either be stored in the database or fetched on demand by the presentation GUI. Which solution should be chosen depends on how the information is going to be used. Since this information will be needed for accurate KPI measurements of the utilization, we recommend that it is stored together with each utilization record. Using the requirements above, we propose the structure for the generic database model shown in Figure 7.2. In this model we store and collect data for resources, types, groups and utilization. The design is flexible so that more features can be added if needed.

Figure 7.2. Database structure for the Common Utilization Tool

The resource table is shown in Table 7.1. Resource utilization data is stored in the table resource_utilization, see Table 7.2. This table stores the state, whether the resource is booked and a text comment for each utilization record. The booking state should be queried from the BAMS database when the resource utilization is collected. This limits the amount of information that needs to be collected from that database.

The resource type information is used to filter the number of entities in the presentation of resources in the WebGUI. The type of resource could be, for example, BSC, RNC or a protocol analyzer. Since each resource can only be of one specific type, the type_id is stored for each resource. The type table is shown in Table 7.4. Since this type of information would not vary with time, it is not possible to change it through the GUI. The site table stores information on which site a resource belongs to, see Table 7.3.

To support the feature that resources could be related to each other, a group table is introduced. This table could group the resources by project, domain or responsible department. This information is set through the GUI. Since this information can change over time, resources and groups are connected by a many-to-many relation, see Table 7.5 and Table 7.6.

Table 7.1. Table: resource

| Name | Type | Purpose |
|---|---|---|
| id | BIGINT(20) | Identification |
| name | VARCHAR(45) | Name of the resource |
| show | TINYINT(1) | Describes whether the resource should be shown in the WebGUI |
| comment | TEXT | Comment for the resource. Shown in WebGUI. |
| type_id | BIGINT(20) | Type identification |

Table 7.2. Table: resource_utilization

| Name | Type | Purpose |
|---|---|---|
| resource_id | BIGINT(20) | Identification of the resource |
| startdate | DATETIME | Start time and date for the measurement |
| enddate | DATETIME | End time and date for the measurement |
| state | INT | State of the resource |
| booked | SMALLINT(1) | If the resource is booked or not |

Table 7.3. Table: site

| Name | Type | Purpose |
|---|---|---|
| id | BIGINT(20) | Identification |
| name | VARCHAR(45) | Name of the site |
| comment | TEXT | Comment which is shown in WebGUI |

Table 7.4. Table: resource_type

| Name | Type | Purpose |
|---|---|---|
| id | BIGINT(20) | Identification |
| name | VARCHAR(45) | Name of the resource type |
| comment | TEXT | Comment which is shown in WebGUI |

Table 7.5. Table: resource_group

| Name | Type | Purpose |
|---|---|---|
| resource_id | BIGINT(20) | Resource identification |
| group_id | BIGINT(20) | Group identification |
| startdate | DATE | The date when the resource started belonging to the group |
| enddate | DATE | The date when the resource stopped belonging to the group |

Table 7.6. Table: group

| Name | Type | Purpose |
|---|---|---|
| id | BIGINT(20) | Identification |
| name | VARCHAR(45) | Name of the group |
| comment | TEXT | Comment for the group. Shown in WebGUI |
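The six tables can also be written down as a schema. The sketch below uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in; the column types in the tables look like MySQL types and the thesis does not show its DDL, so the primary keys and references here are assumptions. Note that group is a reserved word in SQL and has to be quoted.

```python
import sqlite3

# Column names and types follow Tables 7.1 to 7.6; keys and references are assumed.
SCHEMA = """
CREATE TABLE resource_type (id INTEGER PRIMARY KEY, name VARCHAR(45), comment TEXT);
CREATE TABLE site (id INTEGER PRIMARY KEY, name VARCHAR(45), comment TEXT);
CREATE TABLE "group" (id INTEGER PRIMARY KEY, name VARCHAR(45), comment TEXT);
CREATE TABLE resource (
    id INTEGER PRIMARY KEY,
    name VARCHAR(45),
    show TINYINT(1),
    comment TEXT,
    type_id BIGINT REFERENCES resource_type(id)
);
CREATE TABLE resource_utilization (
    resource_id BIGINT REFERENCES resource(id),
    startdate DATETIME,
    enddate DATETIME,
    state INT,
    booked SMALLINT(1)
);
-- Link table with a validity period, so group membership can change over time.
CREATE TABLE resource_group (
    resource_id BIGINT REFERENCES resource(id),
    group_id BIGINT REFERENCES "group"(id),
    startdate DATE,
    enddate DATE
);
"""


def create_schema() -> sqlite3.Connection:
    """Create all six tables in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def table_names(conn: sqlite3.Connection) -> list:
    """Names of all tables, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]
```

Table 7.1 lists no site column for a resource, so the sketch does not add one; how a resource is tied to its site is left open here.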

7.4 Common configuration layer

The Common Utilization Tool should include a configuration layer that can be used to configure the resources. Using a common configuration layer puts some requirements on the design of the module-specific storage of resources. These databases must provide a table that lists the resources with their name and a status indicating whether the resource should be measured. If the module also includes adjustable levels, these should be configurable through the configuration layer.

7.5 Common presentation layer

The main requirement of the utilization presentation layer was to present the utilization in a generic manner for all types of resources. Since a presentation layer for the STP utilization tool, which is also used for the LTE Util Tool and parts of the ENIQ tool, was available, we started by investigating whether this could be used. It should serve the purpose of showing the utilization in a structured way where all types of resources could be treated equally. But we soon realized that the performance of the tool did not meet our requirements. The version used for LTE utilization, which uses some KPIs, has a very long response time. The solution that has been used to manage this problem is to serve a static version of the page instead of generating the page on demand. This approach works for that purpose, but we wanted a flexible structure where the resources could be combined in numerous ways, which would make this solution impractical. We also wanted to avoid using the outdated CGI technology, of which we, furthermore, have no previous knowledge.

Therefore we decided to create a new presentation WebViewer for our BSC utilization module that could be used as a presentation layer for the common utilization tool. We decided to use the same base structure as the tool we evaluated, since this way of showing utilization seems to work quite well. We also added the requirement to make the tool fast so that many resources could be shown at once without a long response time. Since we have some previous knowledge of Servlet/JSP development, we chose this technology as the base for the presentation layer.

The main presentation of resource utilization, see Figure 7.3, shows the utilization on a daily basis combined with a set of KPIs for a specified month. The utilization record for a day shows the utilization as a percentage, calculated by dividing the number of records with state used by the total number of records for the specific day. If the collector has been able to measure during all hours of the day, this is the number of records with state used divided by 24. The KPI measures shown are listed below; see Section 6.2 for more details:

• Average usage (equipment utilization) per day

• Average usage this month

• Average usage last 30 days

• Operational efficiency this month

• Availability efficiency this month
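The daily usage value described above can be sketched in a few lines of Python. The function name and the string encoding of the states are assumptions for this example; the calculation itself, records with state used divided by all records of the day, is the one given in the text.

```python
def daily_usage_percent(states):
    """Percentage of one day's records that have the state 'used'.

    `states` holds one entry per collected record for that day, normally 24.
    Returns None when nothing was collected, so a missing day is not shown as 0 %.
    """
    if not states:
        return None
    used = sum(1 for state in states if state == "used")
    return 100.0 * used / len(states)


full_day = ["used"] * 6 + ["unused"] * 18
print(daily_usage_percent(full_day))  # 25.0
```

Because the divisor is the number of collected records rather than a fixed 24, a day on which the collector missed a few hours is still scaled correctly.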

The choice of resources to be shown in the GUI can be set by updating the show parameter in the resource table, see Table 7.1. In the standard view the resources are shown grouped by the group to which they belong. A user who is only interested in a specific group can filter the results by choosing that group, so that only the included resources are shown.

The second main view is the resource view, see Figure 7.4. This view shows the utilization records of a specified resource for a specified year and month. In this more detailed view the states of the individual utilization records are presented on an hourly basis. The utilization comment is also available by hovering with the mouse pointer over a record.

Since one of the main motivations for the construction was a fast implementation, the design of the code has been of utmost importance. Speed is achieved by limiting the amount of traffic to the database and instead using the indexing capabilities that a modern relational database provides. Many of the calculations needed to produce the statistics are therefore carried out by the database itself through carefully constructed Structured Query Language (SQL) queries.
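To illustrate what letting the database do the calculation can look like, the sketch below computes the daily usage for one resource and one month in a single SQL query. It uses Python's sqlite3 module as a stand-in for the actual database; the index, the integer encoding of the state used, and the function names are assumptions for this example and are not taken from the implementation.

```python
import sqlite3
from datetime import date

USED = 1  # assumed integer encoding of the state "used"


def create_db():
    """In-memory stand-in for the resource_utilization table of Table 7.2."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE resource_utilization ("
        "resource_id BIGINT, startdate DATETIME, enddate DATETIME, "
        "state INT, booked SMALLINT)"
    )
    # Index matching the WHERE clause below, so a month is found by a range scan.
    conn.execute(
        "CREATE INDEX idx_util ON resource_utilization (resource_id, startdate)"
    )
    return conn


def month_bounds(year, month):
    """First day of the month and first day of the following month, as ISO strings."""
    start = date(year, month, 1)
    end = date(year + month // 12, month % 12 + 1, 1)
    return start.isoformat(), end.isoformat()


def daily_usage(conn, resource_id, year, month):
    """Return {day: percent of records with state used}, computed by the database."""
    start, end = month_bounds(year, month)
    rows = conn.execute(
        """
        SELECT date(startdate) AS day,
               100.0 * SUM(state = ?) / COUNT(*) AS used_percent
        FROM resource_utilization
        WHERE resource_id = ? AND startdate >= ? AND startdate < ?
        GROUP BY day
        ORDER BY day
        """,
        (USED, resource_id, start, end),
    )
    return dict(rows.fetchall())
```

Only one row per day travels from the database to the web layer, instead of up to 24 raw records per day and resource.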

7.6 KPI reports presentations layer

The presentation WebGUI will provide a good presentation of the utilization on a daily and hourly basis. From a management perspective these types of presentations will not be very informative. The common utilization framework should therefore provide more customized reports. For this purpose the KPIWeb Tool could provide a useful platform for generating reports and graphs. The reports should be configurable to include resources on an STP or project level and with a variable time period. The generated report could consist of a table of resource utilization and KPIs, and could be further visualized by connecting a graph to show the trend.

Figure 7.3. The main view in ECUT

Figure 7.4. The resource view in Common Utilization Tool

Chapter 8

BSC Utilization Module

Chapter Introduction

This chapter presents the results of the work carried out to implement the BSC utilization module that was developed for the common utilization tool described in Chapter 7. For the Ericsson BSC, see Appendix B.

8.1 Background

Constructing a model to measure the utilization level of the BSC in the BETE lab environment is very important for the Linköping site, because a large part of the site develops and tests the software for the BSC. The number of BSCs available for test is very large, and the possibility of lowering the total spending on this resource is significant. Ericsson has conducted previous studies and has concluded that it is hard to measure the utilization of the BSC due to its internal structure and the way tests are performed. The collector should be able to detect all the types of test cases that are used on the BSC.

The BSC has a modular structure with a Central Processor (CP) and several Regional Processors (RP) that share the activities performed in the node. This structure allows the CP to be idle even when the Regional Processors are under load. The CP can also perform maintenance operations which give some internal load, even if the node would be considered idle (unused). Another big issue when implementing a utilization collector for the BSC is that the test environment is very complex and many of the activities generate a lot of load on the node.

8.1.1 Type of test cases

During the development of new features for the BSC it is important to verify both that the new feature works and that other functionality of the equipment is not affected by the new code. To guarantee this, a lot of different test cases are produced to test both the BSC and other connected nodes. These types of tests can be categorized into:


• Load tests. During this type of test a lot of circuit-switched (CS) and packet-switched (PS) traffic is generated. The generated traffic can come either from real MS nodes through the BTS or from a TSS. This type also covers most System & Verification tests, where the test cases check whether the new features introduce any faults in existing code.

• Function tests. These tests often only check a small part of the software. During these test cases many statistical functions are disabled and the load on the BSC is low. A high load is often not needed to test a single function and will probably make it harder to test this small part.

• Feature tests. A specific feature is tested in a real environment at BSS I&V, with the focus on verifying that the feature works as intended.

• Upgrading. Upgrading of CP and RP dumps and upgrading of the Adjunct Processor Group (APG). This type of test is mostly performed by the Product Line Maintenance (PLM) department.

8.2 Pre-study

The authors started the work of implementing the BSC utilization module by performing a pre-study on how the BSC works and what types of measurement points exist. This was done by reading documentation and through discussions with our supervisor at Ericsson, Torbjörn Wikström, and with Greger Hultman, a representative from the BETE organization.

8.2.1 Equipment states

During the pre-study we gathered information on which states the BSC could be in. These are:

• BETE pre-configuring of the nodes.

• Configuration of the TSS and other equipment needed to perform the test case.

• Configuration of the BSC.

• Test. The actual test execution.

• Idle (waiting for next test case). During this stage the node could be used for a separate test case.

• Troubleshooting. In this state the node can appear idle from a traffic point of view, but the tester could be checking alarms, logs and configuration errors in the node or in surrounding nodes.

• Idle (waiting for feedback from designer). During this state the node is often left in the state it currently is in, so that the test can continue or more information can be collected if needed.

• Down. This state often consists of maintenance work done by BETE. During this state the node is often unreachable.

• Follow up. During this stage the node could often be considered idle if the tester is not studying logs in the BSC.

To get as good a view as possible, it should be possible to detect all the states above and from this determine whether the equipment is used or not.

8.2.2 Possible measure points

The pre-study produced the following measuring points, which we have evaluated:

• Capture real user traffic (traffic that is generated by the users of the system)

• Capture the traffic on the Operations and Maintenance interfaces

• Check energy consumption

• Using more advanced measure points internally in the node

• Indirect measuring points

8.2.2.1 Capture real user traffic

By capturing traffic on the BSC we could create a generic measuring point for all types of nodes at the test site. Since the main purpose of a mobile network is to route traffic between the different nodes in the system, it should be possible to capture activities by listening on the node's interfaces. All nodes in the network are furthermore designed to communicate using standardized interfaces, which would make it easy to develop a generic measuring tool. A collector for this measuring point could be constructed like the Ericsson Real Utilization Measurement Solution (ERUMS) tool, with the change that a switch which can handle GSM traffic mirrors this traffic to some kind of equipment that can count it. This tool would be able to detect all activities that involve some sort of traffic and could suit a load test environment quite well, but test activities that generate a low traffic load would not be detected. The configuration part of a test campaign would also be undetectable, since this phase involves low or no outgoing traffic on the node. Because of this, a collector for this measuring point will not cover all types and stages involved in the testing environment.

8.2.2.2 Capture operations and maintenance traffic

The traffic on the Operations and Maintenance (OM) interfaces is generated when testers connect to the APG, the CP and the Regional Processors through terminal clients and test scripts, and when an OSS is configured to gather statistics or upgrade the node. These activities are all related to tests, and this is therefore a good measurement point to combine with the user-generated traffic measurement point. These measurements will cover all test cases where the load on the OM interface is high, as well as most troubleshooting activities. Load tests with a low amount of traffic on the OM interfaces will however be hard to detect. To get a measure of utilization it will therefore be necessary to combine this measuring point with some other method, such as the user traffic measuring point. This will probably cover most scenarios in the test environment and be a strong indicator of whether the node is used or not. But the cost of the tool will then probably be higher than the gain, and the solution is consequently not feasible.

8.2.2.3 Energy consumption

When a central processing unit operates, the power dissipation (the process of consuming electrical energy) will vary over time with the actual load in the node. Energy is dissipated both by the internal switching of transistors and through the heat that is generated. The manufacturers of processors present two measures for the power consumption of the CPU: the thermal power during normal load and the maximum thermal power. Measuring the actual thermal power would therefore give a measure of how much load the node is under. The BSC consists of several processors and other devices, such as switches, that would consume more energy during higher loads.

The major problem with this measuring point is that the change in power consumption will probably depend on how the BSC is configured. The configurations of the BSC in the test environments are diverse and changing, which makes this kind of measurement hard to calibrate. Another problem, as with the user traffic measuring point, is that the troubleshooting and configuration stages will be hard to detect and correctly classify as used. The solution would also be expensive to scale up, since it requires hardware for measuring the energy consumption.

8.2.2.4 Measuring inside the node

This method involves measuring the utilization by checking parameters inside the node. Most of the current utilization tools developed for the BETE test environment use this approach. Most equipment used in cellular networks is constructed to be configured and monitored from a separate node like the OSS, and functions for this purpose are therefore included in the node. The BSS is no exception and includes many statistical measurements, including user traffic, that could be used to measure the utilization. The BSC has an OM interface to the APG, the CP and all Regional Processors included in the node. This measuring point will therefore make it possible both to measure user traffic and to capture the commands and printouts that the tester has produced on the OM interface.

8.2.2.5 Indirect measuring points

The BSC in the BETE environment is always grouped together with other nodes to form an STP. It is therefore possible to check whether these nodes have been used in order to determine whether the BSC has been used. If one, for example, checks the logs of the TSS and finds that it has generated traffic for a BSC, the BSC has probably been used during this time. This would of course not give a complete view of the current situation, but it would certainly help to determine the utilization of the node under test.

8.2.3 Choice of measuring point

Based on the results from the pre-study, the authors decided to develop a software-based solution which checks the parameters inside the node. This method seems to cover most of the different test cases and states that the test can be in. This solution also has the lowest cost of investment because it does not involve investing in new equipment. A software implementation will also be more scalable when new nodes that are added to the test environment are to be measured.

8.3 Implementation

Following the design choice of a software implementation that connects directly to the node, a new study was started in order to find the appropriate base measures that will provide the derived measure of utilization. This work was carried out by discussing the matter with testers from different departments in the GSM organization. The departments were chosen so that most of the stages in the development of the software were represented. The departments included were BSC Design, BSS & BSC I&V and PLM, forming a reference group for the development of the BSC utilization module.

8.3.1 Base measures

Communication with the CP in the BSC is carried out through Man-Machine Language (MML) commands. A command either changes the settings of the BSC, prints out information about the status or initiates an action, e.g. a restart. By issuing MML commands that produce printouts, the following base measures can be retrieved:

• Number of calls last minute, with the command plldp.

• Number of channels allocated to the cells, with the command rlcrp. The channels that are counted with this MML command are BCCH, CBCH, SDCCH and TCH.

• Number of on demand channels for PS traffic, with the command rlgrp.

The typed commands are stored in an audit log at the APG, which is the input/output system for the BSC. Since MML commands are the tester's means of communicating with the CP, the audit log is an interesting attribute of the node. However, the OSS also issues MML commands to check whether the node is up. These commands are filtered out and the remaining commands are counted, which gives the following base measure:

• Total number of typed MML commands
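The filtering and counting step can be sketched as follows. The format of the audit log and the way OSS commands are recognized are not described in the text, so this sketch assumes that the log has already been parsed into entries with a user and a command, and that the OSS can be identified by a user name. Both the `AuditEntry` type and the user name "oss" are hypothetical.

```python
from typing import Iterable, NamedTuple


class AuditEntry(NamedTuple):
    """One parsed audit log line (hypothetical structure)."""
    user: str
    command: str


def count_typed_commands(
    entries: Iterable[AuditEntry],
    excluded_users: frozenset = frozenset({"oss"}),
) -> int:
    """Count MML commands that were not issued by an excluded (monitoring) user."""
    return sum(1 for entry in entries if entry.user.lower() not in excluded_users)


log = [
    AuditEntry("tester1", "rlcrp"),
    AuditEntry("oss", "plldp"),
    AuditEntry("tester1", "rlgrp"),
]
print(count_typed_commands(log))  # 2
```

If the OSS commands were instead recognizable by the command text, the same function could filter on `entry.command` without changing the counting.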

The BSC can store statistics internally. These counters are called Statistics and Traffic Measurement (STS) counters and cover many different types of statistics and traffic measurements. The problem with STS counters is that they are not necessarily activated. A number of counters are, however, activated on almost all BSCs since they are used for test KPIs. The counters are grouped into object types, and counters in the object types BSC, CELLQOSG and CELLQOSEG give the following base measures:

• Total number of connected calls

• Total kilobyte in uplink and downlink

8.3.2 Code structure

To make the code as generic as possible, we decided to construct a code structure that has a base test code. This base test code states which methods should be available for the specific test codes that communicate with the different nodes. To implement this structure, a simple Java inheritance structure for the test cases and test codes was created, borrowing ideas from the ATE team that created the LTE UtilTool.

8.3.2.1 THC Test Case

The main purpose of the THC Test Case GSMUtilizationTestCase is to create test cases for each resource that should be measured for utilization. The test case is developed to be able to support different types of resources in the GSM family, like BSC and OSS. This is achieved by using an inheritance model for the test cases, sharing a common super class with a method that can be used by all utilization modules. The test cases in THC are generated as follows (Figure 8.1):

1. Send a request to the database for all resources.

2. Receive resources and prepare test codes.

3. Request resources from the resource manager.

4. Resource manager returns and reserves the resources for test (this is a non blocking reservation).

5. The Java Virtual Machine (JVM) schedules the test codes for execution with a ParallelTestScheduler.

[Diagram: JTEX, JVM, GSMUtilTestCase, BSCUtil, BSCUtilTestCode 1 to n, Resource Manager, Database; connections labelled CORBA and SQL; steps 1 to 4 marked.]

Figure 8.1. The BSC utilization modules executing environment.

8.3.2.2 BSC Utilization Test Code

The test code written in THC is, as stated above, implemented in an inheritance structure so that some of the code can be reused when a new module is created. The GSMBaseTestCode (super class) includes the execution() method that is called by THC when the test code is executed. This method executes the methods needed to perform the test. For the utilization test code the following steps are performed when measuring the node's utilization:

1. Get resource name

2. Set start and stop time

3. Equipment initiation

4. Get data from database (if previous data is needed for calculating a change)

5. Get limits

6. Check equipment setup

7. Fetch data

8. Validate data

9. Parse and store data

If an error occurs during the execution, an equipment clean-up method is available to remove any temporary items or reset any state change on the node. These steps could be considered general for measuring on all types of nodes. Not all of the steps will be needed for all equipment, in which case no code has to be included in the corresponding method. For the BSC utilization module all the methods except equipment initiation and check setup are used.

The THC test code for the BSC connects to the nodes using the MML resource factory that is pre-built in THC for communicating with, for example, a BSC using the MML language. In this factory there are tools available for parsing the printouts of MML commands. This makes it easy to interpret the result of the executed command. A parser for the STS counter values was not available in the MML factory or in the THC repository and had to be developed. During the development the authors came across a parser implemented in the Perl scripting language, which we converted to Java code that could be used for this purpose.
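The step sequence and the clean-up on error follow the template method pattern. The actual implementation is Java code inside THC and is not shown in the thesis; the sketch below is a Python analogue with hypothetical class and method names, and a stand-in subclass that returns canned values instead of contacting a node.

```python
from abc import ABC, abstractmethod


class BaseUtilizationTestCode(ABC):
    """Base class fixing the order of the measurement steps (hypothetical names)."""

    def __init__(self, resource_name):
        self.resource_name = resource_name  # step 1: get resource name
        self.cleaned_up = False

    def execution(self):
        """Run steps 2 to 9 in order; call the clean-up hook if any step fails."""
        try:
            period = self.set_period()
            self.init_equipment()
            previous = self.load_previous()
            limits = self.load_limits()
            self.check_setup()
            raw = self.fetch(period)
            self.validate(raw)
            return self.parse_and_store(raw, previous, limits)
        except Exception:
            self.cleanup()
            raise

    # Optional steps: modules that do not need them inherit the no-op.
    def init_equipment(self):
        pass

    def check_setup(self):
        pass

    def load_previous(self):
        return None

    def cleanup(self):
        self.cleaned_up = True

    # Steps every module has to provide.
    @abstractmethod
    def set_period(self): ...

    @abstractmethod
    def load_limits(self): ...

    @abstractmethod
    def fetch(self, period): ...

    @abstractmethod
    def validate(self, raw): ...

    @abstractmethod
    def parse_and_store(self, raw, previous, limits): ...


class FakeBscTestCode(BaseUtilizationTestCode):
    """Stand-in for a BSC module; uses canned values instead of a real node."""

    def __init__(self, resource_name, raw):
        super().__init__(resource_name)
        self.raw = raw

    def set_period(self):
        return ("2010-05-03 08:00", "2010-05-03 09:00")

    def load_previous(self):
        return {}

    def load_limits(self):
        return {"mml_commands": 5, "connected_calls": 10}

    def fetch(self, period):
        return self.raw

    def validate(self, raw):
        if any(value < 0 for value in raw.values()):
            raise ValueError("negative counter value")

    def parse_and_store(self, raw, previous, limits):
        used = any(raw[name] > limit for name, limit in limits.items())
        return "used" if used else "idle"
```

A new module then only overrides the steps it needs, while the ordering and the error handling stay in one place.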

8.3.2.3 Database

The BSC utilization module needs a database to store the resources that are included in the utilization measurements. This database includes information about the address of the resource (used to fetch the appropriate THC resource), whether the node is disabled for measurements, and a comment that is displayed in the common presentation layer if the resource is disabled. The database also stores the thresholds of the different base measures for indicating usage. They are stored for each BSC, together with a description that can be used when the limits are set.

8.4 Collected data and classification of BSC state

The BSC utilization module has collected data from nine BSCs from different departments during one month, which gave approximately five thousand records. Table 8.1 presents some of the records that the BSC utilization module has collected. From each record the state of the BSC has to be classified. The base measures are either counters or samples. Since counters are more accurate, they should be used primarily. In cases where the traffic counters in the BSC, called STS, are not activated, the samples have to be used instead. When deciding the classification rules for the BSC, the users of the equipment were interviewed. From the interviews it was difficult to determine the classification rules, and the choice was therefore to start collecting data and decide the rules based on the collected data.

Table 8.1. Examples of records collected with BSC utilization module

| BSC number | Total number of connected calls (counter) | Total kB in uplink and downlink (counter) | Total number of typed MML commands (counter) | Number of connected calls last minute (sample) | Number of channels allocated to the cells (sample) | Number of on demand channels for PS traffic (sample) |
|---|---|---|---|---|---|---|
| 64 | 4190 | 0 | 0 | 109 | 336 | 0 |
| 48 | 12049 | 5666558 | 50 | 217 | 616 | 185 |
| 98 | not active | not active | 225 | 0 | 316 | 0 |
| 81 | 7992 | 10420 | 0 | 139 | 239 | 14 |
| 90 | not active | not active | 0 | 0 | 316 | 0 |
| 110 | 0 | 0 | 0 | 0 | 644 | 0 |
| 101 | 1714 | 354 | 0 | 27 | 37 | 0 |
| 120 | 3643 | 0 | 0 | 78 | 332 | 0 |

8.4.1 Classification of the equipment state

The least number of states of the BSC that need to be classified are down, idle or used. The BSC is considered down if it is not possible to access or log in to the BSC. The states used and idle have to be classified from the values of the base measures. Figure 8.2 shows the records that have been collected with the BSC utilization module. In the figure the number of typed MML commands and the number of connected calls are plotted for each record. The records contain additional base measures, such as the data traffic, but the two most significant base measures are the ones plotted. In the collected data there were no records with PS traffic without CS traffic. That is why the classification focuses on the number of calls (CS traffic).

It is clear that if there is traffic, the number of connected calls is high. In the interval between 10 and 1000 connected calls there are very few records. This means that the limit for classifying the equipment as used, based on the number of connected calls, can be somewhere in that interval. The figure also shows groupings of records that represent different types of test cases. Testers often use scripts for executing test cases, which generate the same number of MML commands each time. There are more records which are not shown in Figure 8.2 since they have more than 10 000 calls or more than 200 MML commands during one hour.

For the records where STS is active, about 20 percent have zero calls, and then the number of commands has to decide the state of the equipment. The histogram in Figure 8.3 illustrates the number of MML commands when the CS traffic is zero. There is no clear dividing line in the histogram, which makes it difficult to decide where the threshold should be. There is a peak at 6 commands, which is difficult to interpret, and further investigations are needed before a threshold for this base measure can be decided.

The first hypothesis for the classification rule is that the number of calls has to be higher than 10 or the number of MML commands has to be higher than 5 for the BSC to be considered used; otherwise it is classified as idle. The rule is shown in Equation 8.1. With this rule it is possible to classify all BSCs that have the counter active. The BSCs that do not have this counter active have to be classified with other base measures and will be discussed later. It is often BSCs where Function Test is performed that do not have STS counters active.

\[
\text{BSC state} =
\begin{cases}
\text{used} & \text{if Number of typed MML commands} > 5 \ \text{OR Number of connected calls} > 10 \\
\text{idle} & \text{otherwise}
\end{cases}
\tag{8.1}
\]
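Expressed as code, the rule in Equation 8.1 is a single comparison per base measure. The following Python sketch also includes the down state, which the text defines as the BSC not being reachable; the function and parameter names are assumptions for this example, and the limits are parameters so that other thresholds can be tried.

```python
def classify_bsc(reachable, mml_commands, connected_calls, mml_limit=5, call_limit=10):
    """Classify one record as 'down', 'used' or 'idle' following Equation 8.1."""
    if not reachable:
        return "down"
    if mml_commands > mml_limit or connected_calls > call_limit:
        return "used"
    return "idle"


# Two rows from Table 8.1: BSC 64 (4190 calls, 0 commands) and BSC 110 (0 calls, 0 commands).
print(classify_bsc(True, 0, 4190))  # used
print(classify_bsc(True, 0, 0))     # idle
```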

With the classification rule presented above, the operational efficiency is 87.0 percent and the availability efficiency is 97.5 percent for the BSCs that have the counters active (non Feature Test BSC). This gives an equipment utilization efficiency of 84.5 percent.

Figure 8.2. Visualization of the collected records

Figure 8.3. Histogram with the number of typed MML commands when there is no traffic

8.4.2 Validation of the classification

A validation of the classification rule was carried out by letting a person responsible for the test cases carried out on one BSC during one week write down which activities were carried out in each hour. During that week the BSC sampling module collected data from the same BSC for each hour. This gave the classified records shown in Figure 8.4. The figure shows that most of the records are easy to classify since they have high traffic; however, when the number of calls is zero it is more difficult to classify the records. The state No activity corresponds to the BSC being idle and the three other states correspond to it being used. The already classified records in Figure 8.4 show that a classification with one limit for each base measure will at best give two misclassified records. This is the case if the rule for used is either of the following:

1. Number of typed MML commands > 6 or Number of connected calls > 0

2. Number of typed MML commands > 30 or Number of connected calls > 0

Figure 8.4. Visualization of classified record

Using the rule in Equation 8.1 there are three misclassified records in Figure 8.4.
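Comparing candidate limits against manually classified records amounts to counting the records on which the rule and the manual classification disagree. The Python sketch below shows that comparison. The records in it are made-up illustration values, not the data behind Figure 8.4, so the printed counts do not reproduce the numbers reported in this section.

```python
def misclassified(records, mml_limit, call_limit):
    """Count records where the rule disagrees with the manual classification.

    Each record is (typed MML commands, connected calls, manually classified as used).
    """
    errors = 0
    for mml_commands, connected_calls, truly_used in records:
        predicted_used = mml_commands > mml_limit or connected_calls > call_limit
        if predicted_used != truly_used:
            errors += 1
    return errors


# Hypothetical hourly records, invented for this example only.
records = [
    (0, 5000, True),
    (3, 0, False),
    (6, 0, False),
    (8, 0, True),
    (0, 4, True),
]

for limits in [(5, 10), (6, 0), (30, 0)]:
    print(limits, misclassified(records, *limits))
```

Running the same function over a grid of limits gives a simple way to find the pair with the fewest disagreements once more classified records are available.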

8.4.3 Samples or Counters

Once data has been collected it is possible to evaluate the difference between using counters and samples. There are two attributes of the BSC that can be either a sample or a counter: the number of calls and the amount of data traffic. Almost three thousand records have been collected, and from them the following analysis regarding sampling of the traffic has been carried out:

• In 8.3 percent of the records the number of calls during one minute is zero when the counter for the number of calls is greater than zero.

• In 22.2 percent of the records the number of on demand channels for PS traffic is zero when the counter for PS traffic is greater than zero.

The bullets above indicate that if the number of calls is sampled, 8.3 percent of the records with CS traffic will be misclassified. For the data traffic the error will be 22.2 percent if the counters for data traffic are not active. The large difference between the CS and PS traffic is due to how the attribute is sampled. For CS traffic the sample is the number of calls during the last minute. The sampling of PS traffic shows the data traffic indirectly, by counting the number of on demand channels that are used for PS traffic. These channels are used when the PS traffic becomes more intense, which means that this measurement may miss low-intensity PS traffic. However, it was the best way of sampling PS traffic that was found.
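The comparison between samples and counters can be sketched as a small function that computes how often a sample is zero although the counter shows traffic. The text does not state which denominator was used for the percentages, so this sketch assumes one possible reading: the share is taken over records where the counter is active and greater than zero. The function name is also an assumption. The input is the eight example rows of Table 8.1, which are too few to reproduce the 8.3 and 22.2 percent.

```python
def sample_miss_share(pairs):
    """Share of records with counter > 0 in which the sample is zero.

    Each pair is (counter value, sample value); a counter of None means not active.
    Returns None if no record has an active counter above zero.
    """
    with_traffic = [(c, s) for c, s in pairs if c is not None and c > 0]
    if not with_traffic:
        return None
    missed = sum(1 for _, sample in with_traffic if sample == 0)
    return missed / len(with_traffic)


# (total connected calls, connected calls last minute) from Table 8.1
cs_pairs = [(4190, 109), (12049, 217), (None, 0), (7992, 139),
            (None, 0), (0, 0), (1714, 27), (3643, 78)]
# (total kB in uplink and downlink, on demand channels for PS traffic) from Table 8.1
ps_pairs = [(0, 0), (5666558, 185), (None, 0), (10420, 14),
            (None, 0), (0, 0), (354, 0), (0, 0)]

print(sample_miss_share(cs_pairs))
print(sample_miss_share(ps_pairs))
```

In these example rows the PS sample misses the record with only 354 kB of data traffic, which matches the observation that low-intensity PS traffic does not trigger on demand channels.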

8.4.4 Classifying Function Test

The type of test for which it is difficult to measure utilization is mainly the function test. When function test is carried out, the traffic levels are very low. Here follow some suggestions for better capturing of function test:

• More intelligent parsing of the audit log by looking for specific commands that indicate function test.

• Using different classification rules for function test.

In function testing there are specific commands that are used. For example, the command "Test System" is used to activate a system in the BSC that is used in function test. Whether this command has been typed can be filtered out from the audit log. The base measure that shows whether there are channels allocated to the cells can also be used to discover function test. The collected data show that this variable changed more in a BSC that is in function test.

Chapter 9

Utilization modules for other equipment

Chapter Introduction

The Common Utilization Tool presented in Chapter 7 suggested different collector modules depending on the type of equipment to be measured. This chapter presents possible ways of implementing collector modules for some other pieces of test equipment.

9.1 UE simulators

An investigation into the possibility of measuring the utilization of UE simulators used in Long Term Evolution (LTE) has recently been conducted at the site in Linköping. The two tools used today are, in this thesis, called UE simulator 1 from supplier 1 and UE simulator 2 from supplier 2. This section presents the conclusions of that report for the two UE simulators.

9.1.1 UE simulator 1

UE simulator 1 simulates the 3GPP LTE mobile terminal but requires a traffic simulator with layer-three functionality, called LTEsim. The report found no solution where base measures can be measured directly in the UE simulator. The interfaces of LTEsim could not give sufficiently accurate measures, and UE simulator 1 is difficult to access remotely. Two approaches for measuring activities in the node were investigated. The first solution was to implement counters in UE simulator 1 that count the number of tests executed, the amount of data or the number of packets sent to UE simulator 1, or the number of commands sent to UE simulator 1. The values of the counters would then be the base measures. The drawbacks of implementing counters in UE simulator 1 are that they put a load on the system, which can affect the test cases carried out, and that there is a cost of implementing them.


The second solution explored in the report is to use a network tap or port mirroring to filter and count the relevant packets in the network between LTEsim and UE simulator 1. The solution is equivalent to ERUMS, one of the earlier tools used for utilization measurements (see 5.3).

9.1.2 UE simulator 2

UE simulator 2 also simulates UEs in LTE. The equipment is controlled from a Windows XP instance, and external traffic generators are often used. In the test plant in Linköping the Windows XP instance runs on a VMware ESXi server. The traffic generators run in a virtual Windows XP system, in a virtual Linux system or on an external non-virtual data generator. Several entities have attributes that could in principle be measured; however, no attributes suitable for utilization measurement were found. The report proposes two solutions for utilization measurements. The first is to implement utilization logging in UE simulator 2. However, the requirement for this feature competes with other features required by Ericsson in the negotiations with the supplier. The other solution is to use a network tap and a traffic analyzer, as described for UE simulator 1. A drawback of this solution is the cost of switches that support a network tap or port mirroring.

9.1.3 Conclusions for the UE simulators

The investigation presents two solutions for measuring the utilization of UE simulator 1 and UE simulator 2. One is to implement a logging feature in the equipment, which has to be done by the supplier of the tool. It can be stated as a requirement for the equipment in negotiations with the supplier. The other, proposed for both UE simulator 1 and UE simulator 2, is to tap or mirror the traffic in a switch and forward it to a server where the packets are filtered and counted. This is a general method that can be used for all kinds of network nodes with IP interfaces. It already exists within Ericsson as ERUMS, which is described in 5.3.
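The filter-and-count step on the server that receives the mirrored traffic can be illustrated as follows. This is a minimal sketch and not a description of ERUMS: the packets are assumed to have been reduced to (minute, source address, destination address) tuples already, the two addresses are hypothetical, and the rule "at least one relevant packet per minute means used" is an assumption made for the example.

```python
from collections import Counter

# Hypothetical addresses of the two nodes whose traffic is mirrored.
LTESIM_ADDR = "192.0.2.10"
UE_SIM_ADDR = "192.0.2.20"


def count_relevant_packets(packets, a=LTESIM_ADDR, b=UE_SIM_ADDR):
    """Count packets per minute that travel between the two endpoints.

    packets: iterable of (minute, src, dst) tuples.
    """
    endpoints = {a, b}
    per_minute = Counter()
    for minute, src, dst in packets:
        if {src, dst} == endpoints:  # either direction between the two nodes
            per_minute[minute] += 1
    return per_minute


def classify(per_minute, minutes, min_packets=1):
    """Map each minute to 'used' or 'idle' (the threshold is an assumption)."""
    return {m: "used" if per_minute[m] >= min_packets else "idle" for m in minutes}


# Synthetic packet summaries.
packets = [
    (0, "192.0.2.10", "192.0.2.20"),
    (0, "192.0.2.20", "192.0.2.10"),
    (0, "192.0.2.99", "192.0.2.20"),  # other traffic, filtered out
    (2, "192.0.2.10", "192.0.2.20"),
]
counts = count_relevant_packets(packets)
print(classify(counts, range(3)))
```

The packet count per time period is then the base measure, and the threshold plays the role of the classification rule.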

9.2 Protocol analyzers

9.2.1 Tektronix K15

When communication and network products are developed and tested, a useful tool is a protocol analyzer that analyzes the traffic over an interface or in a network. The Tektronix protocol analyzer K15 is a protocol analysis tool used in the Ericsson test environment. It is a CompactPCI-compliant platform that runs the Windows XP Embedded operating system, where a software application captures the packets, analyzes them and lets the user interact through a graphical user interface. The equipment has to be physically connected to the interface that is being analyzed. The users (testers) are almost always logged in remotely through Remote Desktop, although it is possible to log in on site. There is a newer protocol analyzer from Tektronix called K18, which is a solution with a central server for analysis and probes for capturing the traffic. The utilization measurements proposed here for Tektronix focus on the K15.

The users log in remotely and start the capturing of data. The data is stored either on the hard drive of the CompactPCI platform, on a server or on the tester's PC. The captured data can be analyzed in real time while the capturing takes place, or later, after the capturing is completed. This means that a user can log in, start the capturing of data, then log out and let the capturing continue. When enough data has been captured, which can take several hours, the user logs in to the computer again, stops the capturing and analyzes the data.

Through interviews with the staff responsible for the Tektronix K15, the following base measures are suggested:

1. Is the application for capturing and analyzing running?
2. Have any users logged in?
3. Is data being written to the hard drive?
4. Is data being transferred from the Tektronix K15?

Base measure number one gives the most accurate measure of the protocol analyzer usage. Only one instance of the application can run at a time, which means that it is important that users quit the application when they are finished. As long as the application is running no other tester can use the equipment, and it can be considered to be used. Checking whether a remote login has occurred is a fairly good indicator of usage, since users have to log in to start capturing data. That measure will not, however, detect data being captured while no user is logged in. That state of the equipment can be determined by a combination of base measures three and four.

9.2.2 Nethawk M5

The Nethawk protocol analyzer M5 is protocol analyzer software that runs on Windows 2000 Server. It has functionality similar to that of the Tektronix K18 and is used for the same purposes. The servers are located in the test laboratory, and the testers log in to the server remotely using software from Citrix. The remote login procedure on the server does not allow the application to continue running if the user logs out. One experiment was carried out in which the number of logged-in users and whether the M5 application was running were measured. It was carried out twice for fifteen servers, and the results are presented in Table 9.1. The measurements were carried out using PsTools, although they can also be carried out with the Windows commands tasklist and quser. Table 9.1 shows two samples of base measures for fifteen of the Nethawk servers. The samples demonstrate that this is a promising method for measuring the utilization of these servers. As described earlier, the application cannot run if no user is logged in, which is consistent with the experiment result in Table 9.1. However, a user can be logged in without running the application, and for that reason the straightforward way to classify the equipment state is to declare the equipment used when the M5 application is running. From the result in Table 9.1 the equipment utilization is 46 percent, although too few samples were taken in this experiment for statistically confident results.

9.2.3 Proposed solution for packet analyzers

The base measure with the highest potential for both the Tektronix K15 and the Nethawk M5 is whether the application for data capturing and analysis is running. The classification of the equipment then becomes very straightforward: if the application is running the equipment is classified as used, and if not it is classified as idle. The problem with this solution arises if the user forgets to close the application. Already today it is very important that users close the application, since no more than one instance can run at a time.

Table 9.1. Example of base measures of 15 Nethawk M5 Servers

            Sample day 1                   Sample day 2
Server      Logged-in    M5 application    Logged-in    M5 application
number      users        is running        users        is running
Nethawk01   1            Yes               1            Yes
Nethawk02   no data      no data           no data      no data
Nethawk03   0            No                no data      No
Nethawk04   0            No                1            No
Nethawk05   1            No                1            Yes
Nethawk06   no data      No                no data      No
Nethawk07   no data      Yes               no data      Yes
Nethawk08   0            No                0            No
Nethawk09   2            No                no data      Yes
Nethawk10   1            Yes               0            No
Nethawk11   1            Yes               1            Yes
Nethawk12   0            No                0            No
Nethawk13   1            Yes               1            Yes
Nethawk14   1            Yes               1            Yes
Nethawk15   1            Yes               1            Yes
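Section 9.2.2 notes that the base measures can be obtained with the Windows command tasklist. The following is a minimal sketch of pulling the "application is running" base measure in that way and turning a series of samples into a utilization value. The image name `m5.exe` is a hypothetical placeholder, since the name of the actual executable is not given here, and the parsing assumes the CSV output format of tasklist.

```python
import csv
import subprocess


def is_running_from_tasklist(output: str, image_name: str) -> bool:
    """Return True if the tasklist CSV output lists the given image name."""
    rows = csv.reader(output.splitlines())
    return any(row and row[0].lower() == image_name.lower() for row in rows)


def sample_application(image_name: str) -> bool:
    """Pull one sample on a Windows machine (not executed in this sketch)."""
    result = subprocess.run(
        ["tasklist", "/FI", f"IMAGENAME eq {image_name}", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return is_running_from_tasklist(result.stdout, image_name)


def utilization(samples) -> float:
    """Share of samples in which the equipment was classified as used."""
    samples = list(samples)
    return sum(samples) / len(samples) if samples else 0.0


# Synthetic tasklist outputs; "m5.exe" is a hypothetical image name.
running = '"m5.exe","1234","RDP-Tcp#0","1","12,345 K"\n'
idle = "INFO: No tasks are running which match the specified criteria.\n"
samples = [is_running_from_tasklist(text, "m5.exe") for text in (running, idle, running, running)]
print(utilization(samples))
```

Separating the parsing from the command call makes it possible to check the classification logic without access to the servers; the sampling interval and the scheduling of `sample_application` are left open.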

The measurement method maps the fact that the application is running or not into a binary value, which is the base measure. The measurement method can be of either a pushed or a pulled nature. Whether the application is running can be pulled from the equipment by running the Windows command tasklist. The state of the equipment is then sampled, and this has to be done frequently in order to give confident results. Another approach is to have an application running on the server that registers when the application is started and closed, or to implement a logging feature in the application itself. The last solution requires that the feature is implemented by the suppliers of the tools.

Letting the logged-in users alone be the base measure is problematic. The Tektronix K15 can be used even if no user is logged in, and for both the Tektronix K15 and the Nethawk M5 a user can be logged in without running the application. The proposed solution for the Tektronix K15 and the Nethawk M5 is to use the base measure that shows whether a particular application is running. For different equipment types, different applications classify the equipment as used. The solution should be generic for the Windows platform, so that it can be used for both Windows 2000 Server and Windows XP Embedded.

Chapter 10

Discussion

10.1 Possibilities and potentials of equipment utilization measurements

The potential for success of a Common Utilization Tool within Ericsson seems to be enormous. A lot of effort is currently being invested in different efficiency programs within the test environment in order to raise the output from the invested resources. A utilization tool will give a good idea of how much the equipment is used and will therefore show some of the potential of these efficiency programs.

The equipment used for testing the software developed for the different networks is, furthermore, very expensive, and if the rate at which it is used can be determined there will be a better foundation for investment decisions. If, for example, the optimal usage level is determined to be 80% and the measurements show that actual usage differs from this level, the question is whether an investment should be made in more test equipment. There is of course a possibility that even if a low level of usage is discovered, the equipment cannot be used for anything else, due to, for example, long configuration times or faulty equipment. If this is the case, the utilization measures will indicate the need to invest in better and faster configuration methods or to investigate further whether some other part of the test environment can be improved.

10.2 The value of a Common Utilization Tool

The situation today, where a new utilization tool is developed for each type of equipment that is to be measured, is not sustainable in the long run. Development resources are wasted; if they were combined instead, the cost of developing these tools would be drastically reduced. The various tools also have their own definitions of usage, which makes it hard to compare the data collected from them. This in turn makes it harder to follow up the impact on utilization of a test efficiency program at a higher level, e.g. the Design Unit Radio level. If a complete picture could be created at this level, it would probably be possible to detect the departments that use their test equipment more effectively than others. A big gain would then be to study these departments to see what makes the difference and use this to help others.

The proposed solution, a common tool with separate modules, will of course make the system somewhat more expensive than implementing a single generic utilization tool that could be used for all types of equipment, i.e. ERUMS. But it will enable Ericsson to create a utilization tool that will hopefully work for all types of equipment regardless of platform and the type of test they are used for. A tool that is used for this purpose has to give as correct a picture of reality as possible, or questions and excuses will be raised.

The trustworthiness of this kind of tool depends to a great extent on the users' understanding of the system. To achieve this understanding the users have to be involved in the process. Then, even if the tool is not perfect, the users will know what the tool's information actually represents.

10.3 Weakness of the BSC Utilization Module

The collected data show a high usage of the equipment. The suggested classification rule gives an equipment utilization of 84.5 percent. This can be interpreted in several ways. It is possible that the utilization value is overestimated, though the verification and validation do not indicate that this is the case. Another explanation is that the collected data is not representative of the population of BSCs, since it was only possible to collect data from nine BSCs.

Another problem that has been found is the "special cases". One of the nine BSCs was used as a TRC (see Appendix B.2) for some time, with the result that none of the base measures indicated activity. This can be seen as a "special case", which can either be dealt with by using additional base measures or be considered rare and ignored. When the BSC utilization module is set to collect data from all BSCs, it is likely that further "special cases" will be found. It is certain that not all test cases can be captured, and there is a trade-off between a simple model with few base measures and a model that captures as many cases as possible.

Determining how the classification should be carried out is difficult. Other methods are possible which could increase the accuracy of the classification. To construct a more sophisticated classification, a large number of correctly classified records, called training data, is needed. Training data is time-consuming to collect, but one solution is to include functionality in the web-GUI where the testers can manually enter information about what activity is carried out on the equipment. In practice it is likely that testers would use this function mainly when the tool does not classify the equipment as used although the tester considers it to be used. If that is the case, the training data will not be representative. If such a function existed in the web-GUI, a dynamic classification would be possible.
The tool would then change the limits or the rules for the classification over time. One technique that can be used for this is Bayesian classifiers, which are successfully used to classify spam. However, this is a black-box technique, which is difficult to understand. The experience from previous work at Ericsson on similar tools is that a tool with many base measures makes it hard to decide on the classification rules. A simple model has the advantage of being easy to understand, and people will therefore be more confident about the result.

It is possible for testers to leave the traffic activated on purpose, or to have scripts that type commands, in order to increase the Equipment Utilization. It could be complicated to account for such behavior in the utilization module. A better method is to complement the Equipment Utilization indicator with test performance and quality measurements. However, the test performance and quality measurements used today, e.g. Fault-slip-through, are on an aggregated level for a whole project and not for a specific piece of equipment.
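To make the Bayesian classifier mentioned in this section more concrete, the following is a minimal sketch of a naive Bayes classifier over binary base measures. It is not part of the implemented BSC Utilization Module: the two base measures, the labels and the training records are invented for the example, and Laplace smoothing is an added assumption to avoid zero probabilities.

```python
import math
from collections import defaultdict


def train(records, labels):
    """Estimate class priors and per-feature probabilities.

    records: list of tuples of 0/1 base measures; labels: e.g. 'used' or 'idle'.
    """
    n_features = len(records[0])
    class_count = defaultdict(int)
    ones = defaultdict(lambda: [0] * n_features)
    for x, y in zip(records, labels):
        class_count[y] += 1
        for i, value in enumerate(x):
            ones[y][i] += value
    model = {}
    for y, n in class_count.items():
        prior = n / len(records)
        # Laplace smoothing: add one "used" and one "idle" pseudo-observation.
        probs = [(ones[y][i] + 1) / (n + 2) for i in range(n_features)]
        model[y] = (prior, probs)
    return model


def classify(model, x):
    """Return the label with the highest posterior (computed in log space)."""
    best_label, best_score = None, -math.inf
    for y, (prior, probs) in model.items():
        score = math.log(prior)
        for value, p in zip(x, probs):
            score += math.log(p if value else 1 - p)
        if score > best_score:
            best_label, best_score = y, score
    return best_label


# Synthetic training data: (traffic seen, commands typed) with tester labels.
records = [(1, 1), (1, 0), (0, 1), (0, 0), (0, 0)]
labels = ["used", "used", "used", "idle", "idle"]
model = train(records, labels)
print(classify(model, (1, 1)), classify(model, (0, 0)))
```

Retraining the model as testers enter new labels would give the dynamic classification described above, with the drawback that the learned probabilities are harder to explain to users than a fixed rule.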

10.4 Future work with the BSC Utilization Module

Future work with the BSC Utilization Module should focus on collecting more data from many more BSCs. The data has to be analyzed, and additional "special cases" will probably be found. The validation also has to be extended. This could be done either through discussions with the testers of the equipment after data has been collected or by letting the testers write down their test activities while the data is being collected. A more detailed validation could improve the classification rules. One improvement of the classification rule in 8.4.1 is to calibrate a threshold for the number of typed MML commands. The other base measures have to be included in the classification rule as well, especially when STS-counters are not active.

The current BSC Utilization Module has problems classifying the state of a BSC where function test is carried out. Since function test is carried out with low traffic and inactive STS-counters, another approach is needed. One solution is to activate STS-counters in all BSCs that are being measured. This can easily be achieved with an MML command that the BSC Utilization Module can execute. Having some of the STS-counters active should not interfere with test operations. A smarter parsing of the audit log than just counting the number of commands would also be a good base measure, since testers use specific commands when carrying out function tests. However, the audit log that can be parsed with the BSC Utilization Module only covers commands typed to the CP of the BSC. There are also Regional Processors to which testers type commands, and these commands are not stored in the audit log. How often testers type commands only to the Regional Processors has not been thoroughly investigated. The number of BSCs used for function test is not very high, since most function testing is carried out in simulated and emulated test environments.
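The calibration of a threshold for the number of typed MML commands could be sketched as follows. The sketch assumes that a set of records has been validated by testers, so that for each record both the command count and the true state are known; it then picks the lowest threshold that agrees with the validated states as often as possible. The selection criterion and the example numbers are assumptions, not results from the collected data.

```python
def calibrate_threshold(command_counts, validated_used):
    """Return the lowest threshold t that maximizes agreement.

    A record is classified as used when its command count is >= t.
    command_counts: list of ints; validated_used: list of bools from testers.
    """
    best_threshold, best_correct = 0, -1
    for t in range(0, max(command_counts) + 2):
        correct = sum(
            (count >= t) == used
            for count, used in zip(command_counts, validated_used)
        )
        if correct > best_correct:  # strict: keeps the lowest best threshold
            best_threshold, best_correct = t, correct
    return best_threshold


# Synthetic validated records.
counts = [0, 1, 2, 10, 25, 40]
used = [False, False, False, True, True, True]
print(calibrate_threshold(counts, used))
```

With few validated records many thresholds fit equally well, which is one reason why more validation data is needed before a threshold can be trusted.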
During the implementation a number of testers suggested other features for the tool, features that did not directly contribute to the purpose of the tool but could easily be implemented. Examples are showing in the web-GUI which tester is logged in to a BSC and what configuration the BSC has. These spin-off features can be implemented in the future.

10.5 Future work with the Common Utilization Tool

Since we have implemented only a part of the Common Utilization Tool, namely a part of the presentation layer, the next step in development should be to construct a database environment. This environment will have to support a heavy load of traffic if it is to be deployed throughout all Ericsson sites. The database tables presented in Chapter 7 will be needed to support the basic functions required to present the information. The proposed presentation layer will have to be adjusted so that the user can choose which equipment should be presented: by resource, by group (stakeholder, project etc.) or by a more advanced choice such as a dynamic STP grouping. This last choice will probably require more information to be stored in the database, although the information could instead be polled, since much of it is stored in the BAMS database. The common configuration layer could at first be implemented at site level to reduce the amount of equipment that has to be configurable in the tool. This would also reduce the level of complexity, since the communication would only need to be intra-site rather than inter-site.

10.6 Future work on the utilization modules for other equipment

When implementing utilization modules for other equipment, the experience from the work in this Master's Thesis can be useful. To be precise, the pre-study and implementation of the BSC utilization module, described in Chapter 8, gave the following method for developing utilization modules:

1. Talk to the experts on the equipment to find out which base measures are possible.
2. Talk to the users (often testers) of the equipment to learn how the equipment is used.
3. Implement a data collecting application using THC, or reuse existing scripts and tools.
4. Collect data over a sufficiently long time period.
5. Based on the collected data, discuss with the users of the equipment how to build up the classification rules.

The first step is very important and was a key factor in the implementation of the BSC utilization module. Since the experts on the BSC were available, it was possible to identify the base measures. Step number two is essential since the experts on the equipment are not necessarily the same people as those who carry out the tests. One type of equipment can often be used in many different ways for different reasons, and the main users of the equipment should be interviewed. It is more time-efficient to reuse previous work, and therefore THC should be used for collecting data. The time-consuming parts of implementing a utilization module are mainly the validation of the model and the discussions and interviews with testers and experts. A good idea is to start the work with a workshop whose topic is possible base measures. A reference group can then be put together, with people representing the main users of the equipment. The reference group is important for steps three and five in the list above.

10.7 Future work in the test environment

During the work with this thesis we found some areas in the test environment that could be changed to make it easier to conduct utilization measurements and to assure their accuracy. When Ericsson states the requirements for a tool that is bought from a supplier, the possibility of measuring utilization should be added. This avoids the need to introduce complex and otherwise unnecessary ways of measuring utilization, e.g. in the case of UE simulator 2, where the proposed solution required measuring and filtering high loads of data through external equipment. If the requirement is introduced early in the negotiations with the supplier, the feature will probably come at a lower cost, and the benefits of measuring the utilization will outweigh this cost.

We do not have a good solution to the problem that occurs when a tester leaves a script or a traffic generator running after the test is finished. One way of reducing this problem is to increase the usage of an automated test environment, e.g. THC. If test cases are scripted, the generated traffic can be taken down automatically at the end of the test; the "forgotten" traffic would then not exist and the equipment would not be detected as used.

Chapter 11

Conclusion

The objectives of the Master's Thesis were to define utilization for test equipment, construct a general model for measuring utilization of test equipment and implement a prototype for measuring the utilization of the BSC in GSM.

The definition of utilization was straightforward: equipment is classified into a discrete state per time period. The definition has the advantage that it is simple and that it is general for different equipment types. The drawback is that it cannot capture the performance and quality of the tests carried out on the equipment. The general definition was required for the general model for utilization measurements presented in the thesis. The general model makes it possible to have a common presentation of the utilization for different equipment and to have equal equipment efficiency indicators. The collection of data and the classification of equipment are, however, not considered to be general, and that is why the concept of utilization modules is presented in the chapter Common Utilization Tool, where the general model is realized.

One utilization module was implemented for the BSC. The implementation shows that it is possible to measure the utilization of such equipment. It also shows that reusing current solutions for this purpose is a successful method. In this case THC, which is primarily used for test automation, was used for making the measurements. Since THC already has connections to the test equipment, it can continue to be used for this purpose when utilization measurements of additional test equipment are to be carried out. The BSC utilization module has collected over 5000 records from nine BSCs, and a suggested way of classifying the BSC for each record is presented. Using that classification rule, the BSCs that had their STS-counters active were used 84.5 percent of the time.

The Master's Thesis opens up great possibilities for the future of measuring the utilization of test equipment.
The BSC utilization module can be scaled up to be used for all BSCs in the Ericsson test environment and it is possible to implement utilization modules for other equipment using the methodology and concepts presented in this thesis.


Bibliography

[1] Test harness core system description. Ericsson, 2006.

[2] Ericsson Radio Systems AB. EN\LZT 101 1513 R4A, AXE Survey. Ericsson AB, 2004.

[3] Ericsson Radio Systems AB. GSM System Survey. Ericsson AB, 2004.

[4] Heidi Alakurtti and Kristin Zetterström. Kommunikationsproblem i kommunikationsbranschen. Master's thesis, Linköping University, 2007.

[5] GSM Association. Market data summary (Q2 2009). http://www.gsmworld.com/newsroom/market-data/market_data_summary.htm, 2009. retrieved on 2010-01-26.

[6] GSM Association. GSM. http://www.gsmworld.com/technology/gsm/index.htm, 2010. retrieved on 2010-02-16.

[7] Gunnar Blom. Sannolikhetsteori och statistikteori med tillämpningar. Studentlitteratur, 2005.

[8] A. J. de Ron and J. E. Rooda. Equipment effectiveness: OEE revisited. IEEE Transactions on Semiconductor Manufacturing, vol. 18, no. 1, February 2005.

[9] Jörg Eberspächer, Hans-Joerg Vögel, Christian Bettstetter, and Christian Hartmann. GSM - Architecture, Protocols and Services. John Wiley & Sons, Ltd, 3rd edition, 2009.

[10] Ericsson. THC information sheet.

[11] Ericsson. APG43 description, 3/22102-fgb 101 413 uen. Ericsson, 05 2008.

[12] Erik Dahlman et al. 3G Evolution: HSPA and LTE for mobile broadband 2nd ed. Academic Press, 2008.

[13] F. Shull et al. What we have learned about fighting defects. Proceedings of the Eighth IEEE Symposium on Software Metrics, pages 249–258, 2002.

[14] International Organization for Standardization and International Electrotechnical Commission. Systems and software engineering - measurement process. ISO/IEC 15939:2007(E), 2001.

[15] Philip Godfrey. Overall equipment effectiveness. Manufacturing Engineer, June 2002.

[16] Harri Holma and Antti Toskala. WCDMA for UMTS. John Wiley & Sons, 2001.

[17] Catherine Jablonsky. Performance management. Plant Engineering, vol. 63, issue 12, pages 26–27, December 2009.

[18] Lars-Ola Damm, Lars Lundberg, and Claes Wohlin. Faults-slip-through - a concept for measuring the efficiency of the test process. Software Process Improvement and Practice, pages 47–59, 2006.

[19] Javier Muñoz and Yiping Ding. Sampling issues in the collection of performance data. BMC Software, 2002.

[20] David Parmenter. Key Performance Indicators. John Wiley & Sons, 2007.

[21] Moe Rahnema. Overview of the GSM system and protocol architecture. IEEE Communications Magazine, 0163-6804/93/, 1993.

[22] Jonas Reinius. Cello - an ATM transport and control platform. Ericsson Review No. 2, pages 48–55, 1999.

[23] Örjan Ljungberg. Measurement of overall equipment effectiveness as a basis for TPM activities. International Journal of Operations & Production Management, Vol. 18 No. 5, 1998.

[24] Graham Rong. A framework for real-time process control. Advanced Semiconductor Manufacturing Conference and Workshop, IEEE/SEMI, pages 159–164, 1998.

[25] Moray Rumney. 3GPP LTE: Introducing single-carrier FDMA. Agilent Measurement Journal, 2008. URL: http://cp.literature.agilent.com/litweb/pdf/5989-7898EN.pdf.

[26] Jochen H. Schiller. Mobile Communications. Addison-Wesley, second edition, 2003.

[27] John Scourias. Overview of the global system for mobile communications. http://ccnga.uwaterloo.ca/~jscouria/GSM/gsmreport.htm, October 1997. retrieved on 2010-01-27.

[28] Timo Halonen, Javier Romero, and Juan Melero. GSM, GPRS and EDGE Performance, Evolution Towards 3G/UMTS. John Wiley & Sons, Ltd, second edition, 2003.

[29] Yi-Bing Lin and Imrich Chlamtac. Wireless and Mobile Network Architectures. Wiley, 2001.

Appendix A

Test Harness Core (THC)

THC is an internal "framework for a unified test tool environment within Ericsson". The framework is designed by Ericsson together with the external partner Cybercom Group. Cybercom is responsible for implementing new requirements, carrying out maintenance and providing support on THC. The work with THC is based on experience from developing the first generation of test tool framework, the Test Tool Middle Ware (TTMW), back in 2001.

THC can be used, and is recommended, for all of Ericsson's R&D organizations. It is mostly used within system, network and end-to-end verification of telecom applications. The main customers of THC are LTE RAN IoV (Kista and Linköping), CNIV in Aachen, GRAN (GSM Radio Access Network) in Linköping and MGW in Jorvas, all using Java as test case development language.

The main motivation for using an automated test environment is that the test plants often consist of many different test tools, each with its own interface and platform. The test code developer therefore has to learn each tool's Application Programming Interface (API). When more than one of these tools is used, and the tools are located at different places (and on different platforms), the situation becomes even more complex. Another major driving force is that the test channels are often booked for a long period of time but used for only a fraction of this time due to extensive reconfiguration of test scenarios. Also, when changing test environment from a simulated environment to real equipment, a tester often has to design a new test script. But since the two cases verify the same feature using different tools, it is more efficient to use a standardized API which enables reuse of the first test script.

THC has proved to be an efficient tool for test case design and execution and for test tool development, maintenance and support, and it therefore reduces the total cost of executing test cases [10].

The basic idea behind the framework is to provide an easy conversion of the test case specifications that a tester creates to verify a new feature into executable code that carries out the actual measurements.

A.1 Definitions in THC

THC provides a set of definitions that are used in the thesis [1]:

Feature Resource is an object providing functionality in a physical or simulated environment, for example Mobile SMS, MML Session etc.


Physical Resource is a hardware entity, for example a SUN workstation, an AXE node etc.

Resource is in the scope of THC, either a Feature Resource or a Physical Resource.

Test Campaign is a collection of one or more Test Cases to be executed together.

Test Case is a collection of one or more Test Codes and a Test Scheduler that decides how the Test Code(s) are executed. A Test Case always has a specified purpose and is therefore not suitable for reuse. A Test Case shall be freestanding and there shall be no dependency on other Test Cases.

Test Code is the actual code written in the notation understood by the test executor to perform some test task. JTEX understands Test Codes written in Java. Many testing behaviours within a test case are similar, e.g. a mobile to mobile call. These general testing behaviours should be implemented in reusable test codes. A test code could in its turn reuse other test codes in order to make a more specific but still reusable test code.

Test Session is the time during which one or several resources are allocated by a test case or test campaign.
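The relationship between the definitions above can be sketched as a small data model. This is only an illustration, written in Python for brevity although THC test code is written in Java; all class, field and function names are assumptions and do not reflect the actual THC API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TestCode:
    """Reusable unit of test behaviour, e.g. a mobile to mobile call."""
    name: str
    run: Callable[[], bool]  # hypothetical: returns True when the step passes


@dataclass
class TestCase:
    """One or more Test Codes plus a scheduler that decides how they run."""
    purpose: str
    codes: List[TestCode]
    scheduler: Callable[[List[TestCode]], List[bool]]

    def execute(self) -> bool:
        # The case passes only if every scheduled Test Code passes.
        return all(self.scheduler(self.codes))


@dataclass
class TestCampaign:
    """Collection of freestanding Test Cases that are executed together."""
    cases: List[TestCase] = field(default_factory=list)

    def execute(self) -> List[bool]:
        return [case.execute() for case in self.cases]


def sequential(codes: List[TestCode]) -> List[bool]:
    """Simplest possible scheduler: run the Test Codes one after another."""
    return [code.run() for code in codes]


# The same reusable Test Code appears in two independent Test Cases.
call = TestCode("mobile_to_mobile_call", lambda: True)
sms = TestCode("send_sms", lambda: True)
campaign = TestCampaign([
    TestCase("verify call setup", [call], sequential),
    TestCase("verify SMS during call", [call, sms], sequential),
])
results = campaign.execute()
```

Note that reuse happens at the Test Code level, while each Test Case keeps its own purpose and has no dependency on the other.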

A.2 System overview and concept

THC is designed to be a distributed test automation system and is based on Common Object Request Broker Architecture (CORBA) technology. The use of CORBA makes it possible to develop a framework that is independent of platform and programming language, with a specified Interface Definition Language (IDL). The system consists of four subsystems that interact with a number of external systems, as illustrated in Figure A.1.

• Test execution system

• Test Tool Middle Ware Subsystem (Middle Ware and Resource Manager)

• Tool Adaptation Subsystem (Resource Factories)

• Logging Subsystem

• External systems

– A Test management system (TMS), such as MARS, a TMS used at Ericsson. The TMS can be used to configure the resources involved in the test campaign.

– System under test (the actual node that the test case covers)

A.2.1 Resource Factory (RF)

THC uses different Resource Factories to control the resources involved in the test campaign. An RF is a collection of Test Tool Resources that actually communicate with the physical resources (system under test). One example of an RF is the File Transfer factory, which can communicate through FTP, SFTP and SSH. Another important factory is the OM AXE (MML) factory, which can communicate with APGs or the older Input/Output Groups (IOG). The RFs are included in the Tool Adaptation Subsystem (TAS).

[Figure A.1 shows an Eclipse IDE for test case development, a LogViewer, an execution GUI and a web-based Test Repository on top of the Test Client (GUI or CLI), the Test Management System, the Test Execution System and the Logging Service with its database. These communicate over the CORBA middle ware with the Resource Manager and the Resource Factories. The Resource Manager keeps a table of registered resource factories and a table of resources (tool, tool instance, resource id, max allocations, available state, occupied state, number of users). Each factory interfaces a test tool (examples: load generators, protocol analyzers, link breaks, terminals, Unix/Linux/Windows workstations) connected to the System Under Test.]

Figure A.1. Test Harness Core (THC) system components (used with permission by Jonas Madsen, Ericsson AB)

A.2.2 Test Execution System (TES)

The TES is the subsystem in charge of executing the test campaign and its test cases within the THC environment. The test campaign is started directly by the TES, whereas the test cases are initiated by the TES but executed by several Test Executors. The TES provides the following features.

• Test Campaign Executors (TCE). Responsible for the test campaigns and reports the progress to the TCE client. Also communicates with the TEXs and the Resource Manager.

• Test Campaign Executors Client, Command-line interface (CLI) or Graphical User Interface (GUI, only available at the Linköping and Kista sites); a screenshot of the GUI is shown in Figure A.3. The CLI is used to log in to the THC environment and to give instructions on which test campaign to run. The feedback from the TCE is directed to the standard output (often the terminal window in the GUI where the test case is executed). The GUI is a graphical interface to the CLI but also helps the user with the required configuration of, for example, the resources needed to execute the test campaign.

• Test Executors (TEX), e.g. the Java Test Executor. The TEX carries out the test cases that have been written as Java test case code. One main advantage is that the code can easily be reused if a new test case uses similar components. The TEX supports two types of schedulers: parallel or sequential execution of the test case.

• MML Java Parsing Support. THC provides a convenient way of working with MML printouts by converting them to tables or by using regular expression matching.
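The two ways of working with a printout mentioned in the last item, conversion to a table and regular expression matching, can be sketched as follows. The printout text, its column names and the function name are made up for illustration; THC offers this support in Java, and the sketch below is not its API.

```python
import re

# Hypothetical printout: a header row, aligned columns and an END marker.
PRINTOUT = """\
CELL      STATE    CHANNELS
C1        ACTIVE   8
C2        HALTED   0
END
"""


def printout_to_table(text):
    """Turn a whitespace-aligned printout into a list of row dictionaries."""
    lines = [ln for ln in text.splitlines() if ln.strip() and ln.strip() != "END"]
    header = lines[0].split()
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


# Table approach: every row becomes a dictionary keyed by column name.
table = printout_to_table(PRINTOUT)
active_cells = [row["CELL"] for row in table if row["STATE"] == "ACTIVE"]

# Regular expression approach: pick out a single value directly.
match = re.search(r"^C2\s+(\w+)", PRINTOUT, re.MULTILINE)
c2_state = match.group(1)
```

The table form suits checks over many rows, while a regular expression is enough when a test case only needs one value from the printout.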

A.2.3 Test Tool Middle Ware Subsystem (TTMW)

The TTMW consists of the RM and the Core Services. The Core Services are, as mentioned earlier, based on CORBA technology; they are responsible for the communication between the different blocks and provide the notification service. The RM maintains a repository of all resources in THC with their associated configuration data. All resources are identified by a unique name and can be pre-reserved for different test sessions, or marked as available or not available.
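A minimal sketch of such a resource repository is given below, assuming that each resource has a unique name, an availability flag and a limit on simultaneous allocations. The class, method and field names are hypothetical and are not taken from THC.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    resource_id: str       # unique name of the resource (assumed example: "bsc10")
    max_allocations: int   # assumed limit on simultaneous test sessions
    available: bool = True
    users: int = 0


class ResourceManager:
    """Keeps a repository of resources and hands them out to test sessions."""

    def __init__(self):
        self._resources = {}

    def register(self, resource):
        # Names must be unique within the repository.
        if resource.resource_id in self._resources:
            raise ValueError(f"duplicate resource name: {resource.resource_id}")
        self._resources[resource.resource_id] = resource

    def allocate(self, resource_id):
        """Return True if the resource could be allocated to one more session."""
        res = self._resources[resource_id]
        if not res.available or res.users >= res.max_allocations:
            return False
        res.users += 1
        return True

    def release(self, resource_id):
        res = self._resources[resource_id]
        if res.users > 0:
            res.users -= 1


rm = ResourceManager()
rm.register(Resource("bsc10", max_allocations=2))
first = rm.allocate("bsc10")    # succeeds
second = rm.allocate("bsc10")   # succeeds, the limit is now reached
third = rm.allocate("bsc10")    # refused
rm.release("bsc10")
fourth = rm.allocate("bsc10")   # succeeds again after the release
```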

A.2.4 Log Service

The logging subsystem provides a database where the THC logs can be placed. The current version of THC uses a CORBA Notification Service connected to a MySQL database server, where the different nodes in THC (TEX, TCE, RM etc.) store their logs. The log service also provides a log viewer web service which lets the tester view the progress and the log of the test campaign. The log viewer consists of two views: a Test Session View, see Figure A.4, which gives a summary of all test cases, and a Log Record View, see Figure A.5, which gives detailed information on a test case.

[Figure A.2 shows the Test Repository, Test Client (GUI or CLI), Test Management System, Test Execution System (TCE, JTEX) and Logging Service connected through the CORBA middle ware to the Resource Manager and its factories: MSMS (CCN), NetHawk, VISA (SMU200A), OMAXE (MML), Remote command, Spirent, Mobitec, Tektronix, PCMLIG, File Transfer, CPP, GRNS (MTsim), Log (SEA), AIMS and new factories. The factories control the corresponding test tools (for example Rohde & Schwarz SMU200A, CPP node, TSS (MTsim), AXE/SEA with APG or IOG, MMLsim, FTP/SFTP servers and Telnet/SSH servers), which connect to the System Under Test.]

Figure A.2. Resource Manager (used with permission by Jonas Madsen, Ericsson AB)

Figure A.3. The ATE GUI

Figure A.4. Log Session View in THC

Figure A.5. Log Record View in THC

Appendix B

Ericsson's Base Station Controller (BSC)

B.1 Base Station System (BSS)

The BSC is, as mentioned in Chapter 3, a part of the BSS, which is responsible for all radio-related operations in the network, such as communication with the MS, handovers, management of radio resources and cell configuration. Ericsson's BSS consists of:

• Base Station Controller (BSC). The BSC is the main switching part of the BSS and controls the Base Transceiver Station (BTS) and the MS.

• Transcoder and Rate adapter Controller (TRC). Responsible for the rate adaptations between the BSS and the NSS. The 33.8 kbit/s signal from the MS has to be adapted to the rate used by the MSC, 64 kbit/s.

• BTS. Interface between the MS and the rest of the GSM network.

The BSC is one of Ericsson's most powerful and flexible systems and commonly controls about 10 to 100 BTSs. Its main functions are to control the MS, carry out measurements on radio conditions, and support the handover functions needed.

B.1.1 TRC

As mentioned before, the main task of the TRC is to perform the rate adaptations that are needed for an MS to communicate with the NSS. The 33.8 kbit/s signal sent from the MS is stripped of the overhead that was introduced to support reliable transfer over the air interface, Um, leaving a 16 kbit/s signal. These 16 kbit/s consist of 13 kbit/s of data and 3 kbit/s of signaling traffic. The 16 kbit/s signal is then sent to the TRC, where it is adapted to a 64 kbit/s PCM signal. The reason why the signal between the MS and the NSS is first converted to 16 kbit/s is that four such signals fit into one 64 kbit/s channel, which reduces the number of links from the BSC to the TRC to a quarter. The PCM coding used in most PSTN and ISDN networks also has to be converted to and from the GSM codecs. This function, called transcoding, is also performed by the TRC.
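The arithmetic behind the rate adaptation can be checked with a few lines of code. The rates are the ones stated in the text; the function name and the example of 100 simultaneous calls are assumptions chosen for illustration.

```python
SPEECH_KBITS = 13      # data per full rate call
SIGNALING_KBITS = 3    # signaling per call
PCM_KBITS = 64         # one PCM channel towards the MSC

# Rate per call between the BSC and the TRC.
subrate_kbits = SPEECH_KBITS + SIGNALING_KBITS

# Number of calls that share one 64 kbit/s channel before transcoding.
calls_per_channel = PCM_KBITS // subrate_kbits


def channels_needed(calls, calls_per_channel):
    """64 kbit/s channels needed for a number of calls (ceiling division)."""
    return -(-calls // calls_per_channel)


# Assumed example: 100 simultaneous calls.
with_subrate = channels_needed(100, calls_per_channel)
without_subrate = channels_needed(100, 1)
```

With 16 kbit/s per call the 100 calls need 25 channels instead of 100, which is the reduction to a quarter.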


B.2 BSC Products

Ericsson's BSC family consists of two main products: a combined BSC and TRC (BSC/TRC) and a stand-alone remote BSC (connected to a separate TRC or to a combined BSC/TRC). The TRC resources can be pooled and allocated depending on load as Full Rate, Half Rate, Adaptive Multi Rate (AMR) Full Rate and AMR Half Rate. The combined BSC/TRC is the most common configuration of the BSC and is able to support up to 1,020 transceivers (Transcoder and Rate Adaptation Unit) and 15 remote BSCs [3]. The remote BSC is ideal for locations with low traffic demand and can therefore use a separate TRC. A remote BSC supports 500 Transcoder and Rate Adaptation Units. To reduce the transmission capacity needed, the stand-alone TRC is located close to the MSC; it supports 16 remote BSCs [3]. The different BSC configurations are illustrated in Figure B.1.

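[Figure B.1 illustrates the interfaces between MS, BTS, BSC, TRC and MSC for the different configurations: Um between MS and BTS, Abis between BTS and BSC (16 kbit/s per call, full rate), A-ter between BSC and TRC (16 kbit/s per call, full rate) and A towards the MSC (64 kbit/s per call). A stand-alone BSC connects over A-ter to a TRC or a BSC/TRC, while a combined BSC/TRC connects directly over A to the MSC.]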

Figure B.1. The different BSC configurations

The BSC is based on the AXE platform, first developed for the PSTN. The newer AXE 810 platform is also used in other nodes such as the MSC, VLR and HLR. The AXE used in the BSC has the following components:

• APT. The APT is the switching part of the AXE. It includes the non-blocking switch and the Ericsson Generic Magazine. The Generic Magazine makes it possible to mix different functions from different units in the same magazine instead of having to use separate ones. The magazine supports 22 different devices, which reduces both the footprint and the power consumption of the BSC.

• APZ. The APZ is the control part. It runs the applications that control the switches.

• Adjunct Processor Group. The APG is the I/O system connected to the BSC.

The AXE platform has a modular design, which provides a platform that can easily adapt to changes and new features. This leads to an open architecture with reduced time to market. The AXE is a multi-functional platform, which means that the same AXE system can be used for many applications, from a small exchange to a large mobile node. The software modules that run on the platform are programmed and deployed independently, with standardized interfaces to increase the level of software security [2].

B.3 BSC Hardware and Subsystems

The BSC includes several AXE system components such as:

• Common Channel Signaling Subsystem (CCS). Consists of both hardware and software for signaling and routing etc. using the SS7 signaling standard.

• Group Switching Subsystem (GSS). Consists of the Group Switch and is responsible for connection setup and teardown. The group switch uses a high-speed Time-Space-Time architecture.

• Digital Link (DL). An interface between the Group Switch and the configured devices.

• Central Processor

• Regional Processors

• Adjunct Processor Group

The BSC is shown in Figure B.2. The figure shows (1) the cabinet, (2) the Ericsson Generic Magazine (GEM) and (3) the circuit boards. Examples of circuit boards are switch, multiplexer, transcoder and echo canceller boards.

B.4 APZ Control System

The control system in the AXE platform, the APZ, uses a two-level architecture with both central and distributed control. The system uses a powerful central processor and a number of regional processors that can easily be connected.

B.4.1 Central Processor

The CP is duplicated to offer a higher level of security to the system and to reduce the total downtime of the switch. The AXE platform automatically detects errors during execution by processing data in both processors and may, if needed, swap operations between the two sides without impact on the traffic. If an error is detected, a test program executed by the Maintenance Unit (MU) is run on both sides. The side that is found to be defective is shut down or rebooted to avoid system failure. The CP is mostly concerned with CS traffic. The CP is programmed in PLEX, a language developed by Ericsson. During periods of lower load the CP conducts maintenance activities, and therefore some load can exist on the CP even during idle periods.

Figure B.2. The BSC components

B.4.2 Regional Processor

The Regional Processors (RP) are used for repetitive and routine operations and for processing-intensive jobs. The RPs are programmed in C and are more often used for PS traffic than the CP is. Activities in an RP can be executed independently of the CP, and the total system load is therefore distributed between the system entities.

B.4.3 Adjunct Processor Group

The Adjunct Processor (or Adjunct Processor Group, APG) is used for maintenance, management and logging. It is an evolved AXE I/O system with a focus on low cost and small board size, and it is integrated in the Ericsson Generic Magazine. The APG provides an interface between the AXE node and the OMC so that the node can be configured, traced for errors and set to record traffic information. The APG includes a terminal communications interface to the CP that can be used for executing commands and producing printouts. The CP file system is also managed by the APG; it stores the files used by the CP together with charging and other statistical data. The APG is based on a Microsoft Windows platform and is, like the CP, duplicated for increased security. The charging data, handled by the Formatting and Output Subsystem (FOS), is forwarded to external billing nodes. The statistics from the CP are stored as STS files, which are blocks of related counters, e.g. the BSC block which records the number of connected calls. The charging data is collected by the CP and sent to the APG, where it is stored and transferred to the billing system which is responsible for the AXE node.

B.4.3.1 STS

STS supports the collection, storage and presentation of the AXE node's statistical data. STS consists of counter groups (blocks) that support functions like Authentication, Local Number Portability, Traffic on Routes, Size Alteration Events, Multi-Exchange Paging, and Subscriber Activities [11]. The information from the CP collected by the STS subsystem within the APG is in a raw format so that it can be stored in the file system. The STS system is constructed to produce output in the 3GPP ASN.1 format but can also produce terminal output and comma-separated files. STS supports approximately 2,100,000 counters recorded at a 15-minute interval (by default) [11]. All these counters can also be configured to be collected directly from the BSC to the OSS through the Record Transfer (RTR) and Generic Output Handler (GOH), where the data can be permanently stored in OSS databases.
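To get a feeling for the data volume these figures imply, the upper bound on counter samples per day can be computed as below. The calculation assumes that every supported counter is active and recorded at the default interval, which is a worst case rather than a typical configuration.

```python
COUNTERS = 2_100_000      # approximate number of supported counters
INTERVAL_MINUTES = 15     # default recording interval

# Number of recording intervals in 24 hours.
intervals_per_day = 24 * 60 // INTERVAL_MINUTES

# Upper bound on counter samples per day if every counter is active.
samples_per_day = COUNTERS * intervals_per_day
```

This gives 96 intervals and 201,600,000 samples per day as the theoretical maximum.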

B.5 OM interfaces

The BSC provides many ways of connecting to the internal components to conduct maintenance. An operator most often connects to the BSC only through the APG, via the OSS. The developers and testers of the BSC software often use the APG for manual control of the CP through an application called WinFIOL, but they also connect directly to the Regional Processors for faster access. See Figure B.3 for more details.

[Figure B.3 shows the OM interface (Telnet, SSH, and FTP for the APG only) connecting to the APG and the APZ (CP), with the RPs connected below the CP.]

Figure B.3. Possible connections to the BSC OM interfaces

B.6 Man-Machine Language (MML)

MML is the language used to communicate with the OM interface on AXE-based nodes. The language is used to configure the node (c), to create printouts from the node (p) and to initiate commands (i), e.g. a restart. MML is written according to ITU-T recommendations to follow a standardized structure.

B.6.1 Command structure

The MML language uses the following general structure for commands:

COMMAND CODE:PARAMETER NAME=PARAMETER VALUE;

• The command code is the name of the command. A print command often ends with the letter p to indicate that the command only produces a printout and makes no change on the node. The code often contains five characters.

• The parameter name specifies how and where the command should be carried out. More than one parameter can be used by separating them with a comma.

• The parameter value specifies the value of the parameter name requested.
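The structure described above can be illustrated with a small parser. This is a sketch under simplifying assumptions: the command code ABCDP and its parameters are invented for the example, and the pattern only covers the basic form described in this section, not every form that real MML allows.

```python
import re

# COMMAND CODE, optionally followed by ":" and parameters, ended by ";".
COMMAND_RE = re.compile(r"^(?P<code>[A-Z0-9]+)(?::(?P<params>[^;]*))?;$")


def parse_mml(command):
    """Split a command into its code and a dict of parameter names and values."""
    m = COMMAND_RE.match(command.strip())
    if m is None:
        raise ValueError(f"not a valid command: {command!r}")
    params = {}
    if m.group("params"):
        # Several parameters are separated by commas.
        for pair in m.group("params").split(","):
            name, _, value = pair.partition("=")
            params[name.strip()] = value.strip()
    return m.group("code"), params


def is_print_command(code):
    """Heuristic only: print commands often, but not always, end with P."""
    return code.endswith("P")


# Hypothetical command with two parameters.
code, params = parse_mml("ABCDP:CELL=C1,STATE=ALL;")
```

A test tool could use a check such as `is_print_command` to decide whether a command is safe to send to a node that must not be changed, although the naming convention alone is not a guarantee.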

Acronyms and glossaries

1G  First generation mobile systems, for example NMT and AMPS
2G  Second generation mobile systems, for example GSM
3GPP  3rd Generation Partnership Project. Its scope is to produce technical specifications for 3G mobile systems and to maintain and develop the GSM system

AGCH  Access Grant Channel
AMPS  Advanced Mobile Phone Service
APG  Adjunct Processor Group
API  Application Programmers Interface
APZ  The control part of the AXE platform
AuC  Authentication Center
AXE  Ericsson-developed telephone switching system

BAMS  BETE Asset Management System
BCCH  Broadcast Control Channel
BETE  BUGS Ericsson Test Environment
BSC  Base Station Controller. One of the key nodes in the network. Controls the MS
BSIC  Base-Station Identity Code
BSS  Base Station Subsystem
BSSMAP  Base Station Subsystem Management Application Part
BTS  Base Transceiver Station
BTSM  Base Transceiver Station Management

CBCH  Cell broadcast channel
CCCH  Common Control Channels
CCH  Control Channel
Cell  An area covered by a BTS

CLI  Command-line interface
CM  Connection Management
CN  Core Network
CORBA  Common Object Request Broker Architecture
CP  Central Processor
CS  Circuit switched data. Data is sent in a connection-oriented setup with reserved capacity. Often used for speech

DCCH  Dedicated Control Channels
dns-name  Domain Name System name, used e.g. to identify Windows computers by their hostname
DURA  Design Unit Radio, the design unit for mobile radio

ECUT  Ericsson Common Utilization Tool
EDGE  Enhanced Data Rates for GSM Evolution
eForge  Ericsson's open source community for all types of code. eForge can be compared to the SourceForge project but also permits internal Ericsson users. eForge is currently migrating to the new TeamForge system
EIR  Equipment Identity Register
EMS  Enhanced Messaging Service
ERUMS  Ericsson Real Utilization Measurement Solution
ETSI  European Telecommunications Standards Institute

FACCH  Fast associated control channel
FCCH  Frequency correction channel
FDD  Frequency division duplex. Scheme that uses separate frequency bands to support duplex channels, i.e. communication in both directions
FDMA  Frequency division multiple access. Scheme that allows multiple users to communicate by splitting the available frequencies into smaller parts, in the simplest form one part for each user
FEC  Forward Error Correction
FOS  Formatting and Output Subsystem

GGSN  Gateway GPRS Support Node
GMSK  Gaussian Minimum Shift Keying

GPRS  General Packet Radio Service
GSM  Global System for Mobile Communications, originally from Groupe Spécial Mobile
GUI  Graphical User Interface

HARQ  Hybrid Automatic Repeat reQuest
HLR  Home Location Register

IMEI  International Mobile Equipment Identity
IMSI  International Mobile Subscriber Identity
IOG  Input/Output Group
ISDN  Integrated Services Digital Network
ITU  International Telecommunication Union
IWF  Interworking Functions

KPI  Key Performance Indicators
KPIWeb  A tool implemented to automate DURA I&V (Integration and Verification) KPI measurements

LA  Location Area. Consists of many cells
LAI  Location area identity
LAPD  Link Access Protocol for the ISDN D-channel
LAU  Location Area Update
LMSI  Local Mobile Subscriber Identity
LTE  Long Term Evolution. Proposed to be the 4G cellular network system. Uses an all-IP core network

MD  Mediation device
ME  Mobile Equipment
MM  Mobility Management
MML  Man-Machine Language
MN  Management Network
MS  Mobile Station
GMSC  Gateway Mobile Switching Center
MSC  Mobile Switching Center
MSISDN  Mobile subscriber ISDN number
MSRN  Mobile Subscriber Roaming Number

NCH  Notification channel
NE  Network Element
Nethawk  A protocol analyzer used in the test operations at Ericsson Test Environment

NMT  Nordic Mobile Telephone. 1G system
NSS  Network and Switching Subsystem

OEE  Overall Equipment Efficiency
OM  Operations and Maintenance
OMC  Operations and Maintenance Center
OML  Operations and Maintenance Link
OS  Operations System
OSI  Open Systems Interconnect
OSS  Operations (Support) Subsystem

PCH  Paging channel
PCM  Pulse Code Modulation. Often 64 kbit/s
PIN  Personal Identification Number
PLM  Product Line Maintenance
PLMN  Public Land Mobile Network
PS  Packet switched data. Data is sent without first setting up a connection. No capacity is reserved, and therefore none is wasted during idle periods
PSTN  Public Switched Telephone Network
PUK  Personal Unblocking Code

QA  Q-Adapter

RA  Routing Area
RACH  Random Access Channel
RAN  Radio Access Network
RBS  Radio Base Station
RF  Resource Factory. An RF controls a resource in THC
RM  Resource Manager
RNC  Radio Network Controller
RP  Regional Processor
RR  Radio Resource Management
RSS  Radio Subsystem

SACCH  Slow associated control channel
SCH  Synchronization channel
SDCCH  Standalone dedicated control channel
SDMA  Space division multiple access. SDMA separates the channels into physical channels spaced apart from each other

SGSN  Serving GPRS Support Node
SIM  Subscriber Identity Module
SMS  Short Message Service
SMSC  Short Message Service Center
SQL  Structured Query Language. SQL is a standardized language for retrieving and storing information in a relational database management system
SS7  Signaling System 7. SS7 is a set of telephony signaling protocols used to set up and tear down calls and for number translation, billing, SMS etc.
STP  System Test Plant
STS  Statistics and Traffic Measurement

TAS  Tool Adaptation Subsystem
TCE  Test Campaign Executors
TCH  Traffic Channel
TDM  Time-division multiplexing. Multiplexing scheme where the nodes take turns transmitting on the medium, thereby slicing the total available time into slots
TDMA  Time division multiple access. Scheme that allows multiple users to communicate by splitting the available transmission time into smaller parts, in the simplest form one part for each user
TE  Terminal Equipment
Tektronix  A protocol analyzer used in the test operations at Ericsson Test Environment
TES  Test execution system
TEX  Test Executors
THC  Test Harness Core. A framework for an automated test environment
TMN  Telecommunications Management Network
TMS  Test management system
TMSI  Temporary mobile subscriber identity
TRAU  Transcoder and Rate Adaptation Unit
TRC  Transcoder and Rate adapter Controller
TSS  Telephony Softswitch Solution, a solution for telephony with emulation of subscriber services
TTMW  Test Tool Middle Ware Subsystem

UE  User Equipment

VLR  Visitor Location Register

WCDMA  Wideband Code Division Multiple Access. 3G system
WS  Workstation