Introducing Automated Unit Testing in a Legacy PHP System

Alexander Olsson

February 22, 2010

Abstract

Introducing unit testing in a previously untested system is problematic. Developers may lack experience of extensive unit testing, and many work hours are likely to be needed before the system has high test coverage. It is therefore important to choose, early on, a framework and a strategy for implementing tests which suit both the developers and the system. The framework is selected from a survey based on requirements specified by the company and an initial guess of important testing criteria. The survey is complemented with test implementations and interviews to capture group-specific viewpoints. A strategy for implementing tests is decided using a code analysis tool, which estimates implementation effort and testing need on a per-unit basis. A test implementation order is suggested by ranking units using a diagram-based method. This approach allows for fitting the strategy to the testing skills of the team. In this application, the most suitable framework from a selection of frameworks is PHPUnit. The framework and strategy are experimentally verified on a PHP web system with over 40k SLOC and a development team of 8 people. As the team is fairly inexperienced with unit testing, a test implementation strategy which prioritizes low-effort units was chosen. Furthermore, a small library of unit tests is implemented and serves as templates for system-specific tests and strategy validation.

Acknowledgement

I would like to thank the many people who helped me during my thesis. First and foremost, Richard Kronfält, my supervisor at Axis Communications, who has come up with countless ideas and suggestions. Secondly, Mathias Haage, my supervisor from Lund Institute of Technology, who has helped me with anything concerning my thesis and answered all my questions. Thirdly, the people at Axis Communications, and especially the ones at VHS. In particular Hans Månsson, who has come up with many good insights into how to further improve what I’ve been doing. Also from VHS, Michael Rengbrandt helped particularly with tests needed for the thesis to be complete.

Contents

1 Introduction
    1.1 Background
    1.2 Problem description
    1.3 Problem approach

2 Theory
    2.1 Software Testing
        2.1.1 Unit Testing
        2.1.2 Integration Testing
        2.1.3 System Testing
    2.2 Frameworks
        2.2.1 The xUnit Family of Frameworks
        2.2.2 FIT

3 Framework
    3.1 Selection Criteria
        3.1.1 Quantification of Selection Criteria
    3.2 Benchmark test suite
    3.3 Nominated frameworks
    3.4 Framework Case Study
        3.4.1 Overview
        3.4.2 Survey Interpretation
    3.5 Framework evaluation
        3.5.1 PHPUnit
        3.5.2 SimpleTest
        3.5.3 Testilence
        3.5.4 SnapTest
        3.5.5 PHPSpec
        3.5.6 PHPT
    3.6 Conclusion

4 Integration
    4.1 Where to start
        4.1.1 An example
        4.1.2 Approach
        4.1.3 Summary
    4.2 Making the analysis
    4.3 Creating an analysis tool
        4.3.1 PHPA: List of units
        4.3.2 PHPA: Cyclomatic complexity
        4.3.3 PHPA: Dependency
        4.3.4 PHPA: Frequency
        4.3.5 Analysis Results
    4.4 Choosing the entry point
        4.4.1 Creating the ranked list
    4.5 Writing tests
        4.5.1 Implementation
    4.6 Putting it all together
    4.7 Validation

5 Conclusion
    5.1 Discussion

6 Summary

A Source Code
    A.1 Complex
        A.1.1 Complex class
        A.1.2 Filehandler
        A.1.3 Config
        A.1.4 Complex slim

B Case study
    B.1 Introduction
    B.2 Task
        B.2.1 Purpose
        B.2.2 Task
    B.3 Survey

Bibliography

Chapter 1

Introduction

1.1 Background

Automated software unit testing may be performed in several different ways. It is often difficult to proclaim one school as superior. Coding standards, languages, requirements and personal taste will influence the choices made when introducing automated testing.

Automated testing is a quick way to verify the functionality of a given system. It can be used after implementing changes to the system to verify that parts not touched by the new implementation still work in the same manner. Creating the test cases is also a way to force the developer to find a solution to the problem at hand before implementing it, resulting in better code[16]. Unfortunately, testing is too often cut from a project to save time and money. It may be argued that enough testing is performed by the developers as they write the source code. This is, however, prone to result in faulty code. Implementing new features may break previously working code, and without an automated test suite there is no way to verify this aside from checking manually. Bugs will be discovered later in the project and will both delay it and increase its cost. In many cases the delay and the increased cost amount to a substantial share of the project’s initial estimates.

One means of overcoming these issues is to introduce automated testing. Different approaches to accomplishing this in a legacy system will be discussed and evaluated in this thesis.

1.2 Problem description

In a system which has existed for several years, a need for systematic testing has arisen. As the system has grown not only in complexity but also in size, it has become difficult to verify and validate it manually. Manual testing has also proved to be time consuming and prone to mistakes.

1.3 Problem approach

There are mainly two tasks which will be carried out in this thesis. First, a framework for unit testing is needed. The best one for this specific application, according to selected criteria, will be chosen. Second, the framework should be integrated into the existing system. In this phase, how to implement test cases and overcome difficulties when doing so will be discussed.

Finding a suitable framework poses several difficulties. Some questions we need to answer before we can make a choice are:

• What does the current legacy system require from the framework in order for it to be applicable to the system?
• What functionality would improve the quality of the resulting tests?
• What do the developers need, and want?

Answering these questions is not trivial. None of the questions has an objective answer; the answers rather depend on humans and personal taste. Different people may also have different opinions on, for example, what functionality is necessary. In order to get a fair understanding of the answers to these questions, a case study with two developers of the system was performed; it is further discussed in section 3.4. The answers to these questions form the criteria for selecting a framework, covered in section 3.1.

Once a fitting framework has been found, a way to integrate the framework, along with methods for implementing unit test cases and best practices, is needed. As there are many test cases which need to be implemented, the test cases will be implemented unit by unit in order of priority. The priority of a unit will be decided by examining the effort of bringing the unit under test and comparing it to the value gained from having the unit under test. The effort is measured in terms of cyclomatic complexity and dependencies. The value is measured in number of invocations during a certain timespan. The results are illustrated in a diagram with effort on one axis and value on the other. This is further discussed in section 4.3.5.
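The prioritisation idea above can be illustrated with a small PHP sketch. It is only an illustration under stated assumptions: the unit names and all numbers are invented, the effort score simply adds cyclomatic complexity and the number of dependencies, and ranking by value per unit of effort is one possible rule, not the diagram-based method developed in chapter 4.

```php
<?php
// Hypothetical measurements per unit; names and numbers are invented.
$units = array(
    array('name' => 'Complex::add',      'complexity' => 1, 'dependencies' => 1, 'invocations' => 900),
    array('name' => 'Complex::pow',      'complexity' => 4, 'dependencies' => 1, 'invocations' => 120),
    array('name' => 'Config::get',       'complexity' => 2, 'dependencies' => 0, 'invocations' => 5000),
    array('name' => 'FileHandler::save', 'complexity' => 6, 'dependencies' => 3, 'invocations' => 900),
);

// Assumed effort score: cyclomatic complexity plus number of dependencies.
// Cyclomatic complexity is at least 1, so the score is never zero.
function effort(array $unit)
{
    return $unit['complexity'] + $unit['dependencies'];
}

// Assumed ranking rule: highest value (invocations) per unit of effort first.
function rank_units(array $units)
{
    usort($units, function ($a, $b) {
        $scoreA = $a['invocations'] / effort($a);
        $scoreB = $b['invocations'] / effort($b);
        if ($scoreA == $scoreB) {
            return 0;
        }
        return ($scoreA > $scoreB) ? -1 : 1;
    });
    return $units;
}

// Print the suggested implementation order.
foreach (rank_units($units) as $unit) {
    printf("%-18s effort=%d value=%d\n", $unit['name'], effort($unit), $unit['invocations']);
}
```

With a rule like this, cheap and frequently invoked units end up first in the list, which suits a team that is new to unit testing.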

Chapter 2

Theory

2.1 Software Testing

In software testing, there are usually three levels in the hierarchy: unit testing, integration testing and system testing. Unit testing is performed by the developer to ensure that a newly developed unit is working as intended. This is further described in section 2.1.1. Integration testing is performed to assure that all current units, including the newly developed ones, are working together as a whole, known as a module. This is further described in section 2.1.2. Finally, a system test is carried out to ensure the end-to-end system is working satisfactorily. This is further described in section 2.1.3. The procedure of testing is illustrated in figure 2.1.[19]

Figure 2.1: Testing methodology in a typical software development situation.


Sometimes, integration and system tests are performed simultaneously, known as acceptance testing[16]. The acceptance test is a way for stakeholders to verify that all requirements are met. This type of testing is common in agile development methodologies such as Extreme Programming and Scrum[3].

2.1.1 Unit Testing

In terms of software and software engineering, a unit is the smallest testable piece of code in a system. Or, even more strictly:

Definition 2.1.1 A "unit" is a method or function.[14]

It is also possible to define a unit as a class, or even a short snippet of code. However, in this thesis definition 2.1.1 is used, because it fits very well when working with PHP (and particularly the targeted system). In PHP, functions are common, and even classes consist of several functions, where each function in a class may be considered a unit. For example, a typical unit may be the add function in the Complex class visible in appendix A.1. It is a very small function consisting of only two SLOC (Source Lines of Code), which is enough for performing the add operation on two complex numerals.

NOTE: The sum of two complex numerals in Cartesian form, z = a + bi and w = c + di, is defined as[1]

z + w = (a + c) + i(b + d)

One may argue that the function add is not the smallest piece of code, according to the implementation: testing the addition of the real value and the imaginary value separately would indeed cover a smaller piece. However, doing so would require changes to the system source code, hence it is not the smallest testable piece of code in the system. The definition holds.

The process of testing units in a system is referred to as unit testing. The concept of unit testing was originally coined by Kent Beck when working with Smalltalk¹. He defined a set of frameworks, which later became known as the xUnit family of frameworks, see section 2.2.1.[2] There are several frameworks (see 2.2 for more on frameworks) in the xUnit family. SUnit is for Smalltalk, JUnit for Java, CPPUnit for C++ and PHPUnit for PHP.

¹ http://www.smalltalk.org

A unit test should test the logic of the unit. That is, all conditional statements, such as an if-else-if... block, must be tested. Additionally, “special” values should be tested as well. In the case of the add function, there are no conditional statements, thus a single call will test all logic. However, special cases could present strange results. A typical special case in the add function is 0: the sum (0 + 0i) + (0 + 0i) should be presented as 0, not as 0 + 0i, which although correct is most likely an unwanted presentation.

In a full system there are usually dependencies between different units. For example, the add function contains a call (save) to a different unit’s functionality (the filehandler) in order to mirror the complex numeral to disk in case of a power failure or unexpected application termination. When unit testing the complex class, we do not want the test to fail if the filehandler is faulty. This issue is solved by a method called mock injection. A mock object is a fake unit which always returns predefined values. For example, the class MyMockComplexFileHandler is a mock of the class ComplexFileHandler (see appendix A.1.2). When trying to load a complex number using this mock filehandler, it will always return 1 + 2i without ever trying to access the file system. Now, even if the file system fails, the unit tests for the complex class will pass.

A short example of how to implement the test for the add function, had it been written in Java, using JUnit:

    import junit.framework.TestCase;
    import numbers.Complex;

    public class TestComplex extends TestCase {

        // Adds 3 + 4i to 1 + 2i and checks the Cartesian presentation.
        public void testAdd() {
            Complex c = new Complex(1, 2);
            Complex c2 = new Complex(3, 4);
            c.add(c2);
            assertEquals("4.0 + 6.0i", c.getCartesian());
        }

    }

And in PHPUnit:

    <?php

    require_once 'PHPUnit/Framework.php';
    require_once 'complex.php';

    class ComplexTest extends PHPUnit_Framework_TestCase
    {

        // Adds 3 + 4i to 1 + 2i and checks the Cartesian presentation.
        public function testAdd() {
            $c = new Complex(1, 2);
            $c2 = new Complex(3, 4);
            $c->add($c2);
            $this->assertEquals('4 + 6i', $c->get_cartesian());
        }
    }

    ?>

Note that the above examples do not necessarily promote how unit tests should be written. For instance, they don’t test negative or special case values. They merely serve as an example of what a unit test may look like.
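The mock injection described earlier in this section can be sketched in plain PHP, without any testing framework. The class names ComplexFileHandler and MyMockComplexFileHandler are taken from the text, but the class bodies below are simplified assumptions and not the code in appendix A.1. Passing the file handler to the constructor is likewise an assumption made to keep the sketch short.

```php
<?php
// Simplified stand-ins for the classes in appendix A.1; the bodies are assumptions.
class ComplexFileHandler
{
    private $path;

    public function __construct($path)
    {
        $this->path = $path;
    }

    // Mirrors the textual form of a complex number to disk.
    public function save($text)
    {
        file_put_contents($this->path, $text);
    }

    public function load()
    {
        return file_get_contents($this->path);
    }
}

// Mock: same interface, but it never touches the file system.
class MyMockComplexFileHandler extends ComplexFileHandler
{
    public $saved = array();

    public function __construct()
    {
        // Deliberately skip the parent constructor: no path is needed.
    }

    public function save($text)
    {
        $this->saved[] = $text; // record the call instead of writing a file
    }

    public function load()
    {
        return '1 + 2i'; // always the same predefined value
    }
}

class Complex
{
    public $re;
    public $im;
    private $fh;

    // The file handler is injected, so a test can pass in the mock.
    public function __construct($re, $im, ComplexFileHandler $fh)
    {
        $this->re = $re;
        $this->im = $im;
        $this->fh = $fh;
    }

    public function add(Complex $other)
    {
        $this->re += $other->re;
        $this->im += $other->im;
        $this->fh->save($this->re . ' ' . $this->im);
    }
}

$mock = new MyMockComplexFileHandler();
$c = new Complex(1, 2, $mock);
$c->add(new Complex(3, 4, $mock));
echo $c->re, ' ', $c->im, "\n";               // real and imaginary part after add
echo count($mock->saved), " save call(s)\n";  // the mock recorded the call
```

Because the mock only records calls, a test built this way exercises the logic of add without depending on a working file system.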

Why unit testing?

As discussed by many authors, such as Kent Beck[3], Roy Osherove[14] and Paul Hamill[8], the advantages of having unit tests exceed the effort required to implement them. There are several reasons why, and those commonly listed by the above authors are:

• Tests reduce bugs. Writing tests generally reduces the number of bugs. It does not eliminate them, but occurrences are less frequent. This applies both to the implementation of new features and to finding bugs in legacy code when implementing the corresponding tests.

• Tests work as documentation. When a developer is learning a new API², it is very common for the developer to search for examples of how to use it rather than for strict documentation. Tests provide these examples.

• Tests improve design. As tests force the developer to think thoroughly before implementing, the overall design improves, whether through fewer singletons, fewer global variables or simply better code structure.

• Tests allow for changes. Tests, assuming they pass, provide a definition of how the system should work. If we find the need to change the existing code base, it is safe to do so if we have tests. If the new implementation changes the behavior of the system, this will be noted the next time the test suite is run.

² An Application Programming Interface, or API, is a way to integrate several programs by providing interfaces to the functionality that a specific program supplies.

2.1.2 Integration Testing

Integration testing is the process of testing several units together. This is the opposite of the mock injection discussed in section 2.1.1. The units are tested together as a whole, following an integration test plan. When done, it delivers a system ready for system testing, see section 2.1.3. The purpose of integration testing is not only to verify functionality, but also reliability and performance.

There are several types of integration testing. Common approaches are big bang and bottom-up. In big bang testing, the whole system, or very large parts of the system, are coupled together and tested. It is very time efficient, but harder to follow. It relies heavily on all units working well in isolation. In bottom-up testing, the aim is to test the lower level components first. This should ease the testing of higher layer components. This approach takes longer than big bang but excels in tracking of the testing procedure. If it is unknown how well the system works, it may be worth using bottom-up. If the system is already fairly well tested, big bang may be more time efficient.[6]

2.1.3 System Testing

Last in the testing cycle, a system test is performed. System testing should not only verify that the integrated modules work together, but also that all the functional requirements are met, and that the system works with all applicable hardware. Generally known as black box testing, the whole system is tested the same way a customer would use it. This type of testing requires no knowledge of the system’s code or logic, nor any programming skills. It is also in this phase of testing that non-functional requirements are tested, for example a specified uptime without crashes or a short response time. Finally, the behavior of the system is verified together with the customer to make sure all requirements are met, commonly known as acceptance testing.


2.2 Frameworks

2.2.1 The xUnit Family of Frameworks

Perhaps the most well-known family of frameworks is xUnit. They are designed to facilitate unit testing (see section 2.1.1), hence the name. All unit testing frameworks which belong to the xUnit family follow the same basic architecture[8].

The most essential part of an xUnit framework is the TestCase. All unit test cases use some implementation of TestCase (usually by inheritance if the language is object oriented) and add functions for the specific test case. The frameworks also provide assert functions which evaluate some boolean expression. If the expression in the assert evaluates to true, the test has succeeded. If it is false, the test has failed. After a failure, the remainder of that test is skipped, as the result is already known.

Another important feature of xUnit frameworks is the TestRunner, which essentially allows a test case to be executed. It helps to remove unnecessary code, and provides information about test output and the time the run took.

A test fixture, in unit testing, is the environment in which a unit test is executed. It can be the state of an object or the value of a global variable. Generally, unit tests should be executed in isolation from each other, and should not share fixtures. Each fixture is specific to one unit test, although the same fixture may be useful for several unit tests. In that case it should be torn down and set up once again. An xUnit framework helps with these fixtures by providing two functions, setUp and tearDown. setUp is run right before each unit test case, and tearDown right after.

Furthermore, xUnit includes a test suite, which aggregates several unit tests and runs them together, combining the results and run times and allowing for easy execution of multiple tests at once.

The main advantage of using an xUnit framework is that it allows for automated, repeatable tests that only have to be written once (in contrast to Keyword-Driven or Table-Driven testing[10]).
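The structure described in this section can be made concrete with a toy implementation. The following PHP sketch is not taken from any real framework: MiniTestCase, AssertionFailed and CounterTest are hypothetical names, and the run method plays the role of a very small TestRunner. It shows setUp and tearDown being called around every test method, and a failed assertion aborting the rest of the current test method.

```php
<?php
// Hypothetical miniature of the xUnit structure; none of these names come from a real framework.
class AssertionFailed extends Exception
{
}

abstract class MiniTestCase
{
    protected function setUp()
    {
    }

    protected function tearDown()
    {
    }

    protected function assertEquals($expected, $actual)
    {
        if ($expected != $actual) {
            throw new AssertionFailed("expected $expected, got $actual");
        }
    }

    // Plays the role of the TestRunner: runs every method whose name starts with "test".
    public function run()
    {
        $results = array();
        foreach (get_class_methods($this) as $method) {
            if (strpos($method, 'test') !== 0) {
                continue;
            }
            $this->setUp();                  // fresh fixture for each test
            try {
                $this->$method();
                $results[$method] = 'pass';
            } catch (AssertionFailed $e) {
                $results[$method] = 'fail: ' . $e->getMessage();
            }
            $this->tearDown();
        }
        return $results;
    }
}

class CounterTest extends MiniTestCase
{
    private $counter;

    protected function setUp()
    {
        $this->counter = 0;
    }

    public function testIncrement()
    {
        $this->counter++;
        $this->assertEquals(1, $this->counter);
    }

    public function testStartsAtZero()
    {
        $this->assertEquals(0, $this->counter); // passes only because setUp reset the fixture
    }

    public function testFailingExample()
    {
        $this->assertEquals(2, $this->counter); // fails on purpose
        $this->counter = 99;                     // never reached
    }
}

$test = new CounterTest();
foreach ($test->run() as $name => $outcome) {
    echo $name, ': ', $outcome, "\n";
}
```

A real xUnit framework adds much more (suites, reporting, timing), but the control flow around each test method follows the same pattern.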

2.2.2 FIT

FIT, or Framework for Integrated Test, is a framework suite initially developed by Ward Cunningham[13]. It aims to ease the integration and system testing previously described.

By using simple facilities such as HTML pages, Excel documents, etc., it helps developers and customers or other stakeholders to interact and agree on system behavior. Customers formulate the tests by providing examples of how they expect the system to work. Results are often presented in tables where successful tests are marked as green and failed tests are marked as red. FitNesse is an implementation of FIT and is available for many languages, among others Java, PHP and C#.

Chapter 3

Framework selection

In this chapter, the method for choosing a framework is presented. First, it is determined what functionality the framework is expected to provide and how to measure it. Frameworks which might be eligible for the system are then presented and evaluated according to the selected criteria. Finally, the results are analyzed and discussed, and the best framework is chosen.

3.1 Framework Selection Criteria

In order to decide on the framework which is the best fit, some criteria need to be decided. These criteria reflect what is most important in a testing framework for the system. Note that the listed criteria may not necessarily be the same as they would have been if the testing framework had been introduced at the beginning of the project. The criteria were chosen using Roy Osherove’s book “The Art of Unit Testing”[14] and common sense, in combination with company requirements. The criteria are straightforward, and it is intuitive that fulfillment of these criteria is desirable in a framework. After each criterion is a short explanation of why it is desired.

1. Require no or few changes to the current source code. In order to smoothly introduce a testing framework into an existing system, it must not require substantial rewriting of the code.

2. Easy to run tests. Starting a test run should require little time and few steps.

3. Price to acquire. The framework must not be expensive to purchase or get hold of.


4. Easy to write and read tests. The syntax and semantics of the testing framework must not be exceedingly difficult to understand.

5. Accurate. Reliability of the testing framework is crucial, hence false positives and false negatives should be infrequent.

6. Provide statistics. Test statistics are a quick way to get an overview of a system. Several statistics are desired.

   • Test results. Information about the outcome of each test is crucial.
   • Code coverage. The amount of the total source code covered by the test suite.
   • Time to run. The time it took to run the tests.

7. Maintained. As programming languages, theories and techniques evolve, the framework should as well.

8. Tests should run quickly. It should not take an excessively long time to run a test suite.

3.1.1 Quantification of Selection Criteria

Comparing frameworks according to the criteria defined in section 3.1 will be difficult without some means of measurement. It is not obvious how to measure all criteria. When comparing frameworks, the quantifications in table 3.1 will be used. The quantifications were chosen in the same way as the criteria, that is, by common sense with some aid from Roy Osherove’s “The Art of Unit Testing”[14]. Note that the numbering in table 3.1 corresponds to the numbering in section 3.1.

Table 3.1: Quantification of selection criteria

Criterion  Quantification
1          Lines of code or work-hours.
2          Number of clicks or effective time to start tests.
3          Money required to purchase the framework.
4          A subjective rating from 1-5, evaluated by some typical employee.
5          Number of incorrect test results in a benchmark test suite.
6          Number of the desired statistics listed in section 3.1 that are provided.
7          Yes or no, depending on whether it is maintained today. In order for a framework to be considered “maintained today” there must have been a release during the last year, and no explicit statement saying development has been discontinued. Probability of future maintenance will be measured in number of developers.
8          Time it takes to run the tests with a benchmark test suite.

3.2 Benchmark test suite

In order to measure some of the quantifications described in section 3.1.1, like “5. Accurate” or “8. Tests should run quickly”, a benchmark test suite is required. The test suite will be implemented in all nominated frameworks listed in section 3.3.

A class representing a complex number has been created. The source code is available in appendix A.1. The class has several functions for performing arithmetic operations on complex numbers. For example, they may be added, multiplied or conjugated. Two forms of presentation of the complex numbers are also available, Cartesian (i.e. x + yi) and polar (i.e. ae^(bi)).

A complex class is also required to have a filehandler which will save the complex number to disk after each operation. The source code for the filehandler is available in appendix A.1.2.

With the legacy code in mind, configuration possibilities have also been added. It’s simply a variable which is made available on a global scope through a function. Source code is available in appendix A.1.3. The only configuration option for the complex class is the precision of the complex number (i.e. number of decimals).

The complex class has been developed with the original source code in mind. It represents the legacy source code. Therefore, the class is implemented in roughly the same manner as the legacy system is.

As the test suite is not very extensive, just a few dozen tests, the suite will be run 500 times when measuring time for criterion 8. In table 3.2 the tests in this suite are listed. The implementations in the different frameworks will be derived from this table.

Note that the function unit_circle_representation is not part of the complex class and will not be tested in the unit test for the complex class. Testing the function would be a unit test of its own.
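Repeating a short suite many times to get a measurable run time can be sketched as follows. The function run_suite_once is a hypothetical stand-in for one execution of the benchmark suite, and time_repeated is an illustrative helper. How the 500 runs were actually launched for each framework is not shown here, so this is only one possible way of taking such a measurement.

```php
<?php
// Hypothetical stand-in for one run of the benchmark suite.
function run_suite_once()
{
    $sum = 0;
    for ($i = 1; $i <= 1000; $i++) {
        $sum += sqrt($i);
    }
    return $sum;
}

// Runs the callable $repetitions times and returns the elapsed wall-clock time in seconds.
function time_repeated($callable, $repetitions)
{
    $start = microtime(true);
    for ($i = 0; $i < $repetitions; $i++) {
        call_user_func($callable);
    }
    return microtime(true) - $start;
}

$elapsed = time_repeated('run_suite_once', 500);
printf("500 runs took %.3f seconds\n", $elapsed);
```

Measuring the total over many repetitions reduces the influence of start-up noise that would dominate a single run of a few dozen tests.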


Table 3.2: A typical test suite

Function       Parameters        Expected result
__construct    (1, 2)            get_class = 'Complex'
get_cartesian  (1, 2)            1 + 2i
get_cartesian  (-1, -2)          -1 - 2i
get_cartesian  (1, -2)           1 - 2i
get_cartesian  (-1, 2)           -1 + 2i
get_cartesian  (0, 0)            0
get_cartesian  (1, 0)            1
get_cartesian  (0, 1)            i
get_polar      (1, 2)            2.24e^(1.11i)
get_polar      (-1, -2)          2.24e^(-2.03i)
get_polar      (1, -2)           2.24e^(-1.11i)
get_polar      (-1, 2)           2.24e^(2.03i)
get_polar      (0, 0)            0
get_polar      (1, 0)            1
get_polar      (0, 1)            e^(1.57i)
conjugate      (1, 1)            re = 1, im = -1
conjugate      (1, -1)           re = 1, im = 1
add            (1, 1), (2, 3)    re = 3, im = 4
add            (-4, 2), (2, -3)  re = -2, im = -1
multiply       (4, 8), (6, 3)    re = 0, im = 60
multiply       (4, 8), (3, -6)   re = 60, im = 0
multiply       (2, 1), (1, 1)    re = 1, im = 3
pow            (4, 5), 2         re = -9, im = 40
pow            (4, 5), 0         re = 1, im = 0
pow            (1, 1), 1         re = 1, im = 1
pow            (1, 1), 2         re = 0, im = 2
pow            (1, 1), 3         re = -4, im = 0
pow            (1, 1), 4         re = 16, im = 0
pow            (1, 1), 5         re = 256, im = 0
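A table such as table 3.2 maps naturally onto a data-driven check. The PHP sketch below covers only the get_cartesian rows. The function format_cartesian is hypothetical: its formatting rules are inferred from the expected results in the table and are not the implementation in appendix A.1, and its behaviour for inputs that do not appear in the table is a guess.

```php
<?php
// Assumed formatting rules, inferred from the expected results in table 3.2.
function format_cartesian($re, $im)
{
    if ($im == 0) {
        return (string) $re;                 // covers (0, 0) and purely real numbers
    }
    $magnitude = abs($im);
    $imag = ($magnitude == 1 ? '' : $magnitude) . 'i';
    if ($re == 0) {
        return ($im < 0 ? '-' : '') . $imag; // purely imaginary numbers
    }
    return $re . ($im < 0 ? ' - ' : ' + ') . $imag;
}

// Rows taken from the get_cartesian part of table 3.2: re, im, expected text.
$rows = array(
    array(1, 2, '1 + 2i'),
    array(-1, -2, '-1 - 2i'),
    array(1, -2, '1 - 2i'),
    array(-1, 2, '-1 + 2i'),
    array(0, 0, '0'),
    array(1, 0, '1'),
    array(0, 1, 'i'),
);

// Compare every row against the function and count mismatches.
$failures = 0;
foreach ($rows as $row) {
    list($re, $im, $expected) = $row;
    $actual = format_cartesian($re, $im);
    if ($actual !== $expected) {
        $failures++;
        echo "FAIL ($re, $im): expected '$expected', got '$actual'\n";
    }
}
echo count($rows) - $failures, ' of ', count($rows), " rows passed\n";
```

Keeping the expected values in one array makes it easy to port the same rows to each nominated framework, which is the purpose the table serves in the benchmark.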


3.3 Nominated frameworks

After an exhaustive search, a handful of unit testing frameworks for PHP were found. These are presented below. Resources such as Wikipedia¹ were used to get an overview of available frameworks. The frameworks found are most likely all frameworks which may be eligible, but it is possible that other frameworks which meet the criteria exist. More frameworks than the ones presented here were found, but they were not included in the survey because of time limitations. The frameworks excluded were the ones which scored lowest on the easy-to-measure criteria (such as “inexpensive to acquire”, “statistics” and “maintained”).

• PHPUnit
  Reference page: http://www.phpunit.de
  Written by Sebastian Bergmann, PHPUnit has quickly become popular. It is a test framework derived from the xUnit family. It contains functionality to create tests as well as an interface to run the test suite and analyze its outcome.

• SimpleTest
  Reference page: http://www.simpletest.org/
  SimpleTest is a unit testing framework hosted by SourceForge². It also follows the xUnit paradigm.

• Testilence
  Reference page: http://www.testilence.org/
  Developed by Roman Neuhauser, Testilence is a relatively new framework for unit testing in PHP.

• SnapTest
  Reference page: http://code.google.com/p/snaptest/
  SnapTest is a part of the xUnit family.

• PHPSpec
  Reference page: http://www.phpspec.org/
  The framework known as PHPSpec promotes Behavior Driven Development (BDD) rather than Test Driven Development (TDD).

• PHPT
  Reference page: http://qa.php.net/running-tests.php
  PHPT is a script based framework mainly used by the PHP developers to verify the language.

¹ http://en.wikipedia.org/w/index.php?title=List_of_unit_testing_framework&oldid=325557633
² http://www.sourceforge.net

3.4 Framework Case Study

3.4.1 Overview

In order to approximately determine how easy a framework is to use (i.e. selection criterion 4 in section 3.1), it was decided to perform a case study on two of the developers of the legacy system. Even though the study is carried out with only two test subjects, they constitute 25% of the targeted users of the framework. That is, if we’re only counting the primary 8 developers, and not other stakeholders who may use it.

It was not feasible to have the test subjects try all frameworks studied in this thesis, for several reasons. Trying all frameworks would take too long. As focus tends to dwindle when the day nears its end, it would result in an unfair comparison between frameworks studied in the morning and frameworks studied in the afternoon. Nor were the resources for conducting this study unlimited, and occupying two developers for a whole day was simply not within the budget for the study. Also, some frameworks are very similar, and testing all of those would be a mindless repetitive task with no real gain for either the subject or the study.

So which frameworks should be included? PHPSpec and PHPT were obvious choices, as they are very different from the other four and from each other. The four remaining are quite similar in testing approach. However, PHPUnit and SimpleTest both had support for dynamic mock objects (which SnapTest and Testilence do not), so PHPUnit was included. Next, I randomly selected one of the two remaining, SnapTest. In short, the included frameworks were PHPUnit, SnapTest, PHPSpec and PHPT.


The study was performed in four parts. First came an introduction held by the author, where the purpose, relevance and tasks of the case study were described. The slides providing visual support for this introduction are available in appendix B.1. Next, the test subjects followed the task description in appendix B.2 and used each of the frameworks to implement tests for a slimmed version of the complex class, available in appendix A.1.4, which held only a constructor and the functions get_polar and add. The configuration possibilities and file handler were not included in this slimmed version. At the time of the introduction, the test subjects received a survey to answer for each framework. The survey used can be found in appendix B.3. Lastly, a short interview was held to discuss the answers to the survey and the experience of using the frameworks. The result of this interview will be discussed under each applicable framework in section 3.5. Effective time for the case study was about 4 hours per test subject.

3.4.2 Survey Interpretation

This section describes how the survey in appendix B.3 will be interpreted in order to get a measurement with which we can compare the frameworks in the study. By examining the questions in the survey, it is easy to see that the higher the number given in answer to a question, the more advantageous it is for the framework. A framework with all 5s checked is, according to the test subject, easier to use than a framework with all 1s checked. This observation can be exploited to find a number with which the ease of use of a framework may be compared to that of the other frameworks. If $Q_i$ is the answer to question $i$, with $1 \le Q_i \le 5$ and $1 \le i \le n$, then the score $S$ is

\[ S = \frac{1}{n} \sum_{i=1}^{n} Q_i \tag{3.1} \]

The answer, $S$, will be a real-valued number $1 \le S \le 5$. The attentive reader may notice that this formula is the arithmetic mean of a set of data. All questions are considered equally important. Note that if a test subject answered X (i.e. “don’t know”), that question is excluded from the formula and $n$ is decreased.
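As a worked example of equation 3.1, the following PHP sketch computes the score for the answers reported for PHPUnit in table 3.3 (section 3.5.1). The function name survey_score is hypothetical, and returning null when every answer is X is an assumption, since the text does not cover that case.

```php
<?php
// Mean of the answered questions; 'X' ("don't know") is excluded and lowers n.
function survey_score(array $answers)
{
    $answered = array();
    foreach ($answers as $answer) {
        if ($answer !== 'X') {
            $answered[] = $answer;
        }
    }
    if (count($answered) === 0) {
        return null; // assumption: no score when every answer is "don't know"
    }
    return array_sum($answered) / count($answered);
}

// Answers as reported for PHPUnit in table 3.3.
$subject1 = array(4, 4, 4, 3, 4, 3, 5);
$subject2 = array(5, 5, 4, 'X', 5, 5, 5);

printf("%.2f\n", survey_score($subject1)); // 27 / 7
printf("%.2f\n", survey_score($subject2)); // 29 / 6, question 4 excluded
```

For subject 2 the X on question 4 reduces n from 7 to 6, so the sum 29 is divided by 6 rather than 7.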


3.5 Framework evaluation

3.5.1 PHPUnit

Overview

As mentioned earlier, PHPUnit is part of the xUnit family, thus the semantics of the framework are well defined and similar to those of other xUnit frameworks. PHPUnit uses the object orientation available in PHP. Each test is a class consisting of multiple functions which test different functionality of the SUT (System Under Test) or CUT (Code Under Test), i.e. the unit. The class extends the class PHPUnit_Framework_TestCase which is provided by PHPUnit. This is a very common approach in unit testing (compare the JUnit example in section 2.1.1). PHPUnit comes with its own tool, phpunit, which runs specified tests.

Sample

The test for the add function of the complex class (see A.1) in PHPUnit could be implemented as follows:

    <?php
    require_once 'PHPUnit/Framework.php';
    require_once 'complex.php';

    class ComplexTest extends PHPUnit_Framework_TestCase
    {
        protected $stub_fh;

        // Create a stub file handler whose load() always returns '1 + 2i'.
        protected function setUp() {
            $this->stub_fh = $this->getMock('ComplexFileHandler',
                                            array('save', 'load'),
                                            array('tmp_file.txt'));
            $this->stub_fh->expects($this->any())
                          ->method('load')
                          ->will($this->returnValue('1 + 2i'));
        }

        // Two of the add rows from table 3.2.
        public function testAdd() {
            $c1 = new Complex(1, 1);
            $c2 = new Complex(2, 3);
            $c1->add($c2);
            $this->assertEquals(3, $c1->re);
            $this->assertEquals(4, $c1->im);

            $c1 = new Complex(-4, 2);
            $c2 = new Complex(2, -3);
            $c1->add($c2);
            $this->assertEquals(-2, $c1->re);
            $this->assertEquals(-1, $c1->im);
        }
    }
    ?>

Survey Results

Table 3.3: Survey results for PHPUnit

          Question
Subject   1   2   3   4   5   6   7   Mean
1         4   4   4   3   4   3   5   3.86
2         5   5   4   X   5   5   5   4.83

The test subjects answered the survey fairly similarly. What is notable is the X to question 4 from subject 2. During the interview, subject 2 stated that he was not a big fan of unit testing, but found PHPUnit fairly good. In normal cases subject 2 would not use unit testing, but in this case he did not know.


Criteria results

Table 3.4: Criteria results for PHPUnit

Criteria  Result
1         Because mock objects can be created at run time, few or no changes to the existing source code are required.
2         It takes less than 10 seconds to begin running a test case.
3         Free under the BSD License³.
4         Rating: 4.35
5         There were no false positives in the benchmark test suite.
6         All three desired statistics (code coverage, run time and outcome) are provided by PHPUnit.
7         It is currently maintained; at the time of writing (15 Oct 2009) the latest release was 16 Sept 2009⁴. There are currently 5 contributors to the PHPUnit project⁵.
8         It takes 39 seconds to run the benchmark test suite 500 times.

3.5.2 SimpleTest

Overview

SimpleTest is very similar to PHPUnit in the layout of the framework. Each test extends the class UnitTestCase to enable the testing utility. Each test class which extends UnitTestCase contains multiple functions and is supposed to test one unit, i.e. it is a unit test. There is, however, a little more overhead when using SimpleTest. It does not include a tool for running test cases. This drawback requires each test case to contain information about paths. In addition, some code to run the test is necessary. Not much is needed, but it is a noticeable overhead when the number of test cases increases. As SimpleTest does not provide a tool to execute the test cases, other tools must be used: either the PHP Command Line Interface⁶ or a web server (e.g. Apache⁷) with a browser pointed at the test case. In the sample below, the last two statements (creating the test case and calling run) are the additional code required in each test case file.

³ http://www.opensource.org/licenses/bsd-license.php
⁴ http://www.phpunit.de
⁵ http://www.ohloh.net/p/phpunit/contributors
⁶ http://www.php-cli.com/
⁷ http://www.apache.org/


Sample

The test for the add function of the complex class (see A.1) could be implemented in SimpleTest as follows:

    <?php
    // Assumed location of the SimpleTest installation; adjust to the local setup.
    define('SIMPLE_TEST', 'simpletest/');

    require_once SIMPLE_TEST . 'unit_tester.php';
    require_once SIMPLE_TEST . 'reporter.php';
    require_once SIMPLE_TEST . 'autorun.php';

    require_once 'complex.php';
    require_once 'filehandler.php';

    // Generates the class MockComplexFileHandler at run time.
    Mock::generate('ComplexFileHandler');

    class TestComplex extends UnitTestCase {
        function setUp() {
            $this->stub_fh = new MockComplexFileHandler();
            $this->stub_fh->setReturnValue('load', '1 + 2i');
        }

        function TestComplex() {
            $this->UnitTestCase();
        }

        public function testAdd() {
            // (1 + 1i) + (2 + 3i) = 3 + 4i
            $c1 = new Complex(1, 1);
            $c2 = new Complex(2, 3);
            $c1->add($c2);
            $this->assertEqual(3, $c1->re);
            $this->assertEqual(4, $c1->im);

            // (-4 + 2i) + (2 - 3i) = -2 - 1i
            $c1 = new Complex(-4, 2);
            $c2 = new Complex(2, -3);
            $c1->add($c2);
            $this->assertEqual(-2, $c1->re);
            $this->assertEqual(-1, $c1->im);
        }
    }

    // Additional code required in every test case file.
    $test = new TestComplex();
    $test->run(new TextReporter());

    ?>

Survey Results

No survey was made for this framework.

Criteria results

Table 3.5: Criteria results for SimpleTest

Criteria  Result
1         Because mock objects can be created at run time, few or no changes to the existing source code are required.
2         It takes less than 10 seconds to begin running a test case.
3         Free. GNU Lesser General Public License⁸.
4         No case study for SimpleTest.
5         There were no false positives in the benchmark test suite.
6         The only statistic provided is the outcome.
7         SimpleTest is not maintained. At the time of writing (15 Oct 2009) the latest release was 4 Aug 2008⁹. There are 9 contributors to SimpleTest¹⁰.
8         It takes 17 seconds to run the benchmark test suite 500 times.

⁸ http://www.gnu.org/copyleft/lesser.html
⁹ http://www.simpletest.org
¹⁰ http://sourceforge.net/projects/simpletest/

3.5.3 Testilence

Overview

Just like PHPUnit and SimpleTest, Testilence takes advantage of the object-oriented possibilities provided by PHP. Each test class extends the class Tence_TestCase. The class contains several functions, which contain tests.

There is, however, great overhead when using Testilence. Each function can only perform one test. This means that a unit which requires, say, 8 tests to fully test all functionality will require 8 functions in the test case. Nor is there any support for run-time mock objects, which means that mock injections are required in the source code. The sample below may look like less to write, but keep in mind that this approach requires the source code to have an interface for the classes used, and a mock implementation of said interface. The overhead in the source code will be greater.

Sample

The test for the add function of the complex class (see A.1) could be implemented in Testilence as follows:

    <?php

    require_once 'complex.php';
    require_once 'filehandler.php';

    class TestComplex extends Tence_TestCase
    {
        protected $stub_fh;

        function setUp() {
            // Hand-written mock class; it must exist in the source code.
            $this->stub_fh =
                new MyMockComplexFileHandler('tmp.cpx');
        }

        // (1 + 1i) + (2 + 3i) = 3 + 4i
        public function testAdd1() {
            $c1 = new Complex(1, 1, $this->stub_fh);
            $c2 = new Complex(2, 3, $this->stub_fh);
            $c1->add($c2);
            return $this->assertTrue($c1->re == 3 &&
                                     $c1->im == 4);
        }

        // (-4 + 2i) + (2 - 3i) = -2 - 1i
        public function testAdd2() {
            $c1 = new Complex(-4, 2, $this->stub_fh);
            $c2 = new Complex(2, -3, $this->stub_fh);
            $c1->add($c2);
            return $this->assertTrue($c1->re == -2 &&
                                     $c1->im == -1);
        }
    }

    ?>

Survey Results

No survey was made for this framework.

Criteria Results

Table 3.6: Criteria results for Testilence

Criteria  Result
1         Required changes to the source code are the creation of interfaces, and mock implementations of these, if the class has dependencies. Global variables are not preserved when running the framework, thus the use of global variables requires some rewriting of existing source code.
2         It takes less than 10 seconds to begin running a test case.
3         Free. MIT license¹¹.
4         No case study for Testilence. One test case per function increases overhead and decreases readability.
5         There were no false positives in the benchmark test suite.
6         The only statistic provided is the outcome.
7         Testilence is maintained. At the time of writing (15 Oct 2009) the latest release was 31 Mar 2009¹². There is one developer for Testilence¹³.
8         It takes 27 seconds to run the benchmark test suite 500 times.

¹¹ http://opensource.org/licenses/mit-license.php
¹² http://www.testilence.org
¹³ http://hg.sigpipe.cz/testilence/

3.5.4 SnapTest

Overview

SnapTest uses the object-oriented possibilities in PHP. Each test case extends the framework class Snap_UnitTestCase. Just like Testilence, it has extra overhead because there can only be one assert per function, which results in many functions per unit test. The functions setUp() and tearDown(), which may be used before and after every test function, are required by this framework whether or not they are needed. This is a notable overhead. It is only possible to create run-time mock objects for static classes, hence some mock injection for dynamic classes is still needed. Note that the mock object in the sample below requires an interface, and an implementation of said interface, in the source code to function correctly.

Sample

The test for the add function of the complex class (see A.1) could be implemented in SnapTest as follows:

    <?php

    require_once 'complex.php';
    require_once 'filehandler.php';

    class TestComplex extends Snap_UnitTestCase {
        protected $stub_fh;

        public function setUp() {
            // Hand-written mock class; it must exist in the source code.
            $this->stub_fh = new
                MyMockComplexFileHandler('tmp.cpx');
        }

        // Required by SnapTest even when there is nothing to tear down.
        public function tearDown() {}

        // (1 + 1i) + (2 + 3i) = 3 + 4i
        public function testAdd1() {
            $c1 = new Complex(1, 1, $this->stub_fh);
            $c2 = new Complex(2, 3, $this->stub_fh);
            $c1->add($c2);
            return $this->assertTrue($c1->re == 3 && $c1->im == 4);
        }

        // (-4 + 2i) + (2 - 3i) = -2 - 1i
        public function testAdd2() {
            $c1 = new Complex(-4, 2, $this->stub_fh);
            $c2 = new Complex(2, -3, $this->stub_fh);
            $c1->add($c2);
            return $this->assertTrue($c1->re == -2 &&
                                     $c1->im == -1);
        }
    }

    ?>

Survey Results

Table 3.7: Survey results for SnapTest

          Question
Subject   1   2   3   4   5   6   7   Mean
1         4   4   4   3   4   3   4   3.71
2         4   4   4   3   4   3   2   3.43

Both test subjects experienced the framework about equally. The only answer which differs is the one to question 7. Subject 1 simply felt that it was quite a good manual, while subject 2 felt the opposite. Keep in mind that the test subjects did not have much time to study these manuals, hence the different perceptions.

Criteria Results

Table 3.8: Criteria results for SnapTest

Criteria  Result
1         Required changes to the source code are the creation of interfaces, and mock implementations of these, if the class is not static.
2         It takes less than 10 seconds to begin running a test case.
3         Free. BSD License¹⁴.
4         3.57. One test case per function increases overhead and decreases readability.
5         There were no false positives in the benchmark test suite.
6         The only statistic provided is the outcome.
7         SnapTest is not maintained. At the time of writing (15 Oct 2009) the latest release was 30 Aug 2008¹⁵. There are 3 contributors to the project¹⁶.
8         It takes 198 seconds to run the benchmark test suite 500 times.


3.5.5 PHPSpec

Overview

PHPSpec makes use of the object-oriented possibilities in PHP. Each test case should extend the framework class PHPSpec_Context. PHPSpec focuses on BDD (behavior-driven development) and is very strict in naming conventions and usage. It also promotes a different line of thought when writing test cases. When testing input/output, PHPSpec attempts to mimic the English language. Each line can almost be read out loud and still make sense. For example, testing whether some variable is a string looks like this: $this->spec($var)->should->beString();. Much like PHPUnit, PHPSpec provides a tool, phpspec, to run test cases. Note that the implementation in the sample below requires changes to the source: extracting an interface and implementing a mock.

Sample

The test for the add function of the complex class (see A.1) could be implemented in PHPSpec as follows:

    <?php

    require_once 'complex.php';
    require_once 'filehandler.php';

    class DescribePHPSpecComplex extends PHPSpec_Context
    {
        protected $stub_fh;

        public function before() {
            // Hand-written mock class; it must exist in the source code.
            $this->stub_fh = new
                MyMockComplexFileHandler('tmp.cpx');
        }

        public function itShouldAddCorrectly() {
            // (1 + 1i) + (2 + 3i) = 3 + 4i
            $c1 = new Complex(1, 1, $this->stub_fh);
            $c2 = new Complex(2, 3, $this->stub_fh);
            $c1->add($c2);
            $this->spec($c1->re)->should->be(3);
            $this->spec($c1->im)->should->be(4);

            // (-4 + 2i) + (2 - 3i) = -2 - 1i
            $c1 = new Complex(-4, 2, $this->stub_fh);
            $c2 = new Complex(2, -3, $this->stub_fh);
            $c1->add($c2);
            $this->spec($c1->re)->should->be(-2);
            $this->spec($c1->im)->should->be(-1);
        }
    }

    ?>

¹⁴ http://www.opensource.org/licenses/bsd-license.php
¹⁵ http://code.google.com/p/snaptest/
¹⁶ http://code.google.com/p/snaptest/wiki/ContributorList

Survey Results

Table 3.9: Survey results for PHPSpec

          Question
Subject   1   2   3   4   5   6   7   Mean
1         X   X   1   1   X   3   4   2.25
2         1   1   1   1   1   2   1   1.14

PHPSpec was not very popular at all. Test subject 1 gave several X answers. This was because of insufficient time: the subject spent over one hour on the framework and still did not get the test cases to run. Subject 2 got it to work, but far from satisfactorily. Despite the low score, the user manual got some credit from both test subjects. The main reason the framework got such a low score was the behavior-driven approach. Both subjects believe it would have been easier to use the framework if they had had more time to learn about behavior-driven development, or had had some experience with it.

Criteria Results

Table 3.10: Criteria results for PHPSpec

Criteria  Result
1         Required changes are the extraction of interfaces and the implementation of mock objects. Global variables are not preserved when running the framework, thus the use of global variables requires some rewriting of existing source code.
2         It takes less than 10 seconds to begin running a test case.
3         Free. Creative Commons Attribution 3.0 License¹⁷.
4         1.70
5         There were no false positives in the benchmark test suite.
6         PHPSpec provides information about the outcome and the run time.
7         PHPSpec is not maintained. At the time of writing (15 Oct 2009) the latest release was 11 Jan 2008¹⁸. There are 3 contributors to PHPSpec¹⁹.
8         It takes 16 seconds to run the benchmark test suite 500 times.

3.5.6 PHPT

Overview

PHPT is a very slim framework. It is primarily used by the developers of PHP itself for testing built-in functions. Unlike the other frameworks evaluated in this thesis, it does not make use of the object orientation available in PHP. PHPT is closely integrated with PEAR, which is also used to run PHPT tests. There is not much functionality available in PHPT. For instance, there are no setUp or tearDown routines for fixtures, nor any run-time mock object generation.

Sample

The test for the add function of the complex class (see A.1) could be implemented in PHPT as follows:

    --TEST--
    Testing add of complex.php
    --FILE--
    <?php
    require_once 'complex.php';
    require_once 'filehandler.php';

    $stub_fh = new MyMockComplexFileHandler('tmp.cpx');

    // (1 + 1i) + (2 + 3i) = 3 + 4i
    $c1 = new Complex(1, 1, $stub_fh);
    $c2 = new Complex(2, 3, $stub_fh);
    $c1->add($c2);
    var_dump($c1->re);
    var_dump($c1->im);

    // (-4 + 2i) + (2 - 3i) = -2 - 1i
    $c1 = new Complex(-4, 2, $stub_fh);
    $c2 = new Complex(2, -3, $stub_fh);
    $c1->add($c2);
    var_dump($c1->re);
    var_dump($c1->im);

    ?>
    --EXPECT--
    float(3)
    float(4)
    float(-2)
    float(-1)

¹⁷ http://creativecommons.org/licenses/by/3.0/
¹⁸ http://www.phpspec.org
¹⁹ http://code.google.com/p/phpspec/people/list

Survey Results

Table 3.11: Survey results for PHPT

          Question
Subject   1   2   3   4   5   6   7   Mean
1         5   5   5   5   4   1   1   3.71
2         5   5   3   3   5   X   X   4.20

PHPT received a decent score. The manual received unusual ratings from both subjects. This is because there hardly was one: it consisted of only a couple of sentences. Subject 1 interpreted the survey as calling for answer 1 in this case, while subject 2 answered X. However, during the interview it became apparent that they meant the same thing, so the two should be interpreted as the same answer. Interpreting the X answers from subject 2 as 1 gives subject 2 the mean 3.29 and the framework the total mean 3.50, which is the value used from here on.
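The adjusted figures can be checked against equation 3.1. Replacing the two X answers of subject 2 in Table 3.11 with 1 and averaging the two subject means gives the following; the symbols $S_2$ and $\bar{S}$ are introduced here only for this check.

```latex
S_2 = \frac{5 + 5 + 3 + 3 + 5 + 1 + 1}{7} = \frac{23}{7} \approx 3.29,
\qquad
\bar{S} = \frac{3.71 + 3.29}{2} = 3.50
```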

Criteria Results

Table 3.12: Criteria results for PHPT

Criteria  Result
1         Required changes are the extraction of interfaces and the implementation of mock objects.
2         It takes less than 10 seconds to begin running a test case.
3         PHP License²⁰.
4         3.50
5         There were no false positives in the benchmark test suite.
6         PHPT provides information about the run time of tests and the outcome (however, it writes this to an external file, not to standard output).
7         PHPT is considered part of PHP and is maintained. At the time of writing (15 Oct 2009) the latest release of PHP was 17 Sept 2009²¹. The number of contributors to PHPT is not known.
8         It takes 96 seconds to run the benchmark test suite 500 times.

3.6 Conclusion

The complete (but less informative) comparison between the frameworks is given in table 3.13. The two criteria “Time to start a test run” and “Number of false positives” may seem badly chosen. Even though they do not distinguish any framework from the others, it is very important that they are fulfilled. Imagine, for example, a framework which fails tests that are correctly written, or passes tests that are incorrect. As the criteria are ordered according to priority, we can examine table 3.13 from top to bottom. SimpleTest and Testilence were not part of the case study, and thus lack a value for criterion 4. However, looking at the samples in section 3.5, specifically comparing PHPUnit and SimpleTest, it is notable that the syntax is very similar, thus the rating would be approximately equal. The same is true for Testilence and SnapTest (e.g. they both have a limitation of one assert per function in the test case).

Table 3.13: Framework comparison

Criteria                          PHPUnit  SimpleTest  Testilence  SnapTest  PHPSpec  PHPT
1  Without rewriting source       Yes      Yes         No          No        No       No
2  Time to start a test run       < 10 s   < 10 s      < 10 s      < 10 s    < 10 s   < 10 s
3  License                        Free     Free        Free        Free      Free     Free
4  Subjective rating from study   4.35     N/A         N/A         3.57      1.70     3.50
5  Number of false positives      0        0           0           0         0        0
6  Statistic: Outcome             Yes      Yes         Yes         Yes       Yes      Yes
   Statistic: Run time            Yes      No          No          No        Yes      Yes
   Statistic: Coverage            Yes      No          No          No        No       No
7  Maintained                     Yes      No          Yes         No        No       Yes
   Number of developers           5        9           1           3         3        Undisclosed
8  Benchmark test run time        39 s     17 s        27 s        198 s     16 s     96 s

Now, comparing the frameworks top-down, we can see that PHPUnit, SimpleTest and PHPT are superior in criteria 1 through 5. PHPUnit provides more statistics than any other framework does. PHPUnit and PHPT are maintained, which SimpleTest is not. SimpleTest has more contributors and runs the benchmark test suite quicker than PHPUnit; however, these criteria have lower priority. The number of developers for PHPT is unknown. PHPSpec provides more statistics than SimpleTest, but has a really low rating from the study. PHPT is a close runner-up, but lacks coverage statistics, which PHPUnit has. It is also rated lower in the study and is slow to run. SnapTest is not impressive in any criterion. Testilence has the advantage of being maintained, but is not exceptional in any other criterion. The choice falls on PHPUnit.

²⁰ http://www.php.net/license/
²¹ http://www.php.org

Chapter 4

Integration

PHPUnit has been chosen as the framework to be used. Next, we need a strategy. As the system contains over 1500 units, all tests cannot be implemented simultaneously. We need to decide upon an order. This is achieved by comparing the value a unit provides by being under test with the effort required to implement the test. The result will be a ranked list where the top item, in some sense, yields the most value per required effort. Furthermore, methods for succeeding with the actual implementations are proposed and discussed.

4.1 Where to start

The obvious approach when introducing unit testing with the tools in place would be to issue the command “just do it” to the developers and pray that everything works out smoothly. Chances are, it won’t. Starting unit testing in a legacy system may easily seem like an endless task because of the huge system which needs to be brought under testing, and the mere thought of the vast number of test cases needed to do so.

4.1.1 An example

There is plenty to gain from an organized and well-planned strategy. Let’s consider an ATM (automated teller machine). A user comes to the machine and wants to make a withdrawal. When the withdrawal is done, it is tremendously more important that the correct amount of money is given to the user than that the logotype is rendered in the correct place on the graphical user interface. The payment functionality is in some sense more important to bring under testing than e.g. the placement of the logotype. Of course, there are gray areas. Consider how important it is that the bank internally subtracts the withdrawn amount from the user’s balance, compared to the user receiving the correct amount of money. If the user receives an incorrect sum, it is very likely that he will notice it. However, if the bank subtracts an incorrect sum from the user’s balance, it is less likely to be noticed as it is less visible (e.g. a purchase at a local store using a credit card where 56 SEK instead of 57 SEK is subtracted). Now don’t misunderstand: it is of course desirable that all functionality of the system works flawlessly.

4.1.2 Approach

We have now determined that we need a strategy when writing tests. First, we need a frame of reference so that we at least have somewhere to point. This frame could be a list of all units which need to be brought under testing. The list can be found using static source code analysis (SSCA)¹. By prioritizing this list in some manner, we could find an entry point into the jungle of units. So, how would one go about prioritizing a list of units? One way would be to ask the current development team. This, however, would be infeasible, as a legacy system is likely to consist of a great many units, and no developer knows exactly what each function does or has the knowledge necessary to prioritize them. It may not even be possible to prioritize between two functions even if you are well acquainted with the system. Take the ATM example again: which feature is more important to function correctly, the withdrawn amount being subtracted from the user’s balance, or the right amount of money being paid out? They are both quite important in a banking system.

Required Effort

The prioritization may also be done by examining the difficulties which can arise when trying to bring a certain unit under test. These difficulties can be captured by metrics of some properties of a unit. The properties should say something about the effort required to bring that unit under test. For example, they may be the cyclomatic complexity² or the dependencies³ that need to be mocked.

¹ SSCA is the process of analyzing source code without executing it, in contrast to dynamic analysis, where application behavior is examined at run time.
² The cyclomatic complexity is the number of linearly independent paths through some code. The method was developed by Thomas J. McCabe [12] and is further discussed in section 4.3.2.
³ A unit A has a dependency on another unit B if unit A uses unit B.


Finding these properties can also be achieved by SSCA. Knowing how difficult it is to bring a unit under test is important. If the development team is inexperienced, it may be a good idea to start with the simpler tests to ease the learning curve. If the development team is experienced, one may consider starting with the more difficult units, as there may be more things in them that behave in unexpected ways.

Provided Value

As there is no way for all tests to be implemented simultaneously (doing so would require a vast number of developers), it would be useful to decide how valuable a unit is. That is, “how much would we gain from this unit being under test?”. This can be achieved in several ways. One way could be to analyze the movement of the code (i.e. how often has this unit been altered in the past?). Another approach could be to analyze how often each unit is invoked in production⁴. Yet again, the value could also be determined by product owners, other stakeholders or developers. For the same reasons as explained above, it is infeasible to ask stakeholders for a list of valuable units.

Analyzing code movement can be done if the code has been under version control for some time. Using the movement of the code to anticipate where changes are frequent requires the assumption that changes will continue to be frequent in these areas. This assumption may be more or less misleading. That a code snippet has been changed in the past does not guarantee that it will change in the future. Perhaps the last bug of that snippet was corrected with the last change, or perhaps the change was part of a new feature (which only needs to be implemented once). In these cases the code is not likely to be changed again. On the other hand, when fixing that last bug, a new bug may have been introduced. Similarly, if the code was changed in the last feature implementation, it may require change in the next feature implementation as well.

The third approach for determining a unit’s value is to measure how often the unit is invoked over a certain time span, i.e. its frequency. If a unit is very frequent, it would be more devastating for the system if it failed than if a less frequent unit failed. However, a unit which is very frequent is not certain to be of more value than a less frequent one. Take the ATM example again: the logotype placement is probably more frequent than the mechanism which blocks a credit card after three failed PIN code attempts. However, the blocking mechanism serves a more important purpose than the placement of the logotype does, even though it is less frequent. There are pros and cons with all methods.

⁴ A system “in production” is a system which is used in the anticipated way. That could be the way a customer has the system set up and in use.

4.1.3 Summary

To summarize, desirable building blocks for finding an entry point are

• A list of units
• Effort for unit testing
  – Cyclomatic complexity
  – Dependency
  – Priority by stakeholders
• Unit value
  – Movement
  – Frequency
  – Priority by stakeholders

The discussion of choosing an entry point is continued in section 4.4.
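To make the building blocks more concrete, the sketch below shows one way per-unit data could be held and ordered so that low-effort units come first. It is not the ranking method of section 4.4. The unit names and numbers are invented, and approximating effort as cyclomatic complexity plus the number of dependencies is an assumption made purely for illustration.

```php
<?php
// Hypothetical per-unit data; names and numbers are illustrative only.
$units = [
    ['name' => 'renderLogo', 'complexity' => 2, 'dependencies' => 0, 'frequency' => 900],
    ['name' => 'withdraw',   'complexity' => 7, 'dependencies' => 3, 'frequency' => 400],
    ['name' => 'blockCard',  'complexity' => 4, 'dependencies' => 1, 'frequency' => 5],
];

// Assumption: effort is approximated by the cyclomatic complexity plus
// the number of dependencies that would have to be mocked.
function effort(array $unit): int
{
    return $unit['complexity'] + $unit['dependencies'];
}

// Orders units by ascending effort; ties go to the more frequently invoked unit.
function rankByLowEffort(array $units): array
{
    usort($units, function (array $a, array $b): int {
        return [effort($a), -$a['frequency']] <=> [effort($b), -$b['frequency']];
    });
    return $units;
}

foreach (rankByLowEffort($units) as $unit) {
    printf("%-12s effort=%d frequency=%d\n",
           $unit['name'], effort($unit), $unit['frequency']);
}
```

An experienced team that prefers to start with the difficult units could reverse the effort ordering; the point of the sketch is only that a prioritized list follows directly once effort and value have been measured per unit.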

4.2 Making the analysis

Among the building blocks in section 4.1, the list of units, the cyclomatic complexities and the dependencies can all be found using SSCA. Unfortunately, no tool which can provide this data is available for PHP. In fact, very few SSCA tools exist for PHP. Two examples of SSCA tools for PHP are YASCA⁵ and RATS⁶. However, these focus primarily on security issues and bad coding conventions, which is not what we are interested in here. As no existing tool provides the data needed, one has to be developed.

⁵ http://www.yasca.org/
⁶ http://www.fortify.com/security-resources/rats.jsp


4.3 Creating an analysis tool

As it is PHP which will be analyzed, it is fitting to write the tool in PHP. It will be known as PHP Analyzer, or PHPA. As stated earlier, the list of units, the cyclomatic complexities and the dependencies can be found using static source code analysis. That is, it is sufficient to scan the source code to determine these properties. Effort estimations from the development team were not collected, for the reasons explained in section 4.1.

Determining the unit value was a bit trickier. Once again, asking the current development team for a priority list is not feasible. Because of the problems described in section 4.1, movement of the code was not used as a measurement of value. Frequency of a unit, on the other hand, is more precise. If a unit has a high frequency today, it will most likely have a high frequency tomorrow as well. The problem is that there is no direct mapping between the value of having a unit under test and the unit being frequent. However, one can conclude that if a unit is frequent it is also desirable that it is functional. In conclusion, we need to find the following data to evaluate a good entry point:

• List of units
• Cyclomatic complexity
• Dependency
• Frequency

When performing SSCA, we need to interpret the source much like a human or a compiler does. That is, certain known patterns of the programming language need to be found. For pattern matching, regular expressions (regexps) are a natural fit. Regular expressions have been around for some time and are available in PHP and most other languages and platforms [7].

4.3.1 PHPA: List of units

As mentioned earlier, a “unit” is a method or function [14]. In PHP, a function and a method are the same thing, and both will be referred to as a function for the rest of the thesis. The only way to define a function is with the keyword function [11]. To find all units in a project, PHPA simply goes through all source files and checks, line by line, whether a function is defined. The regexp used to find a function is


/function[\s]+&?([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)\(.*\)(?!;)/

For a detailed understanding of the regexp, please see reference [7]. Briefly, it first looks for the string function. Next, one or more whitespace characters “\s” must follow. At most one ampersand sign “&” may precede the function name. The first parenthesized group holds the allowed name of a function in PHP [11]. Next we want an opening parenthesis “(”, followed by zero or more occurrences of any character (the arguments) and a closing parenthesis “)”. Finally, a semicolon is not allowed directly after the closing parenthesis. This is because we are not interested in abstract functions. Doing this for each source file generates a list of all units in the system.
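The line-by-line scan can be sketched as follows. The regexp is the one quoted above; the function name findUnits and the sample source are assumptions made for this example and are not taken from PHPA.

```php
<?php
// Regular expression quoted from the text above (section 4.3.1).
const FUNCTION_PATTERN =
    '/function[\s]+&?([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)\(.*\)(?!;)/';

/**
 * Returns the names of all functions defined in the given source text.
 * The source is examined one line at a time, as described in the text.
 */
function findUnits(string $source): array
{
    $units = [];
    foreach (preg_split('/\R/', $source) as $line) {
        if (preg_match(FUNCTION_PATTERN, $line, $match)) {
            $units[] = $match[1]; // first group: the function name
        }
    }
    return $units;
}

// Illustrative input: the abstract declaration ends in ");" and is skipped.
$source = implode("\n", [
    'function add($a, $b) {',
    'abstract public function save($data);',
    'public function &load() {',
    '$x = 1;',
]);

echo implode(', ', findUnits($source)), "\n"; // prints: add, load
```

Because the pattern works on raw text, a function definition that appears inside a comment or a string would also be reported; this sketch makes no attempt to filter such cases.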

4.3.2 PHPA: Cyclomatic complexity

Analyzing the cyclomatic complexity was accomplished using McCabe’s complexity theorem [12]. According to McCabe, it is possible to draw a graph which fully describes all possible paths through some code. Each decision point in a program (e.g. an if statement) increases the complexity in some manner. For example, the code

    function a() {
        if ($some_var) {
            // Perform evaluations here
        }
        // Other evaluations here
    }

contains one decision point. Its corresponding graph is illustrated in figure 4.1, where the lines with arrows at the end are referred to as edges, while the circles are called nodes. In order to succeed with this approach, the source code for each unit needs to be available, allowing us to find points of interest. Parsing this proved more challenging than finding functions, but well within the capabilities of regular expressions. Once the source was available, the following regular expressions were used to find each point in the source where a decision is made, that is, where the program must choose one path among several.

• if: /(else)?if.*\(.+/
• olif: /\?.*:/
• andand: /&&/
• case: /case.+?:.*?break[;]/s
• default: /default[ ]*:/
• switch: /switch.*\(.*\)\{?/
• while: /while.*\(.*\)/
• for: /for.*\(.*;.*;.*\)/
• foreach: /foreach.*\(.*as.*\)/

Figure 4.1: Graph for evaluating cyclomatic complexity of the sample code

Some of the items above may require an explanation. olif is a “one-line if”, which is allowed in PHP. An example of an olif is $statement ? do_if_true() : do_if_false(); which is a decision point that increases the cyclomatic complexity [12]. The andand is often used in conditional statements such as if ($a && $b) ... and also generates a decision point [17]. Not all items listed above are certain to increase the cyclomatic complexity. Consider the following snippet

    switch ($a) {
        default:
            f();
            break;
    }


No matter the value of $a, the code in the default block will always be executed. Hence, the snippet above has a cyclomatic complexity of one (which is the minimum cyclomatic complexity, as there trivially has to be at least one linearly independent path through any code). Furthermore, else statements do not influence the cyclomatic complexity at all. Consider the following snippet

    if ($a) {
        // some code here
    } else {
        // other code here
    }

The graph for this code is illustrated in figure 4.1, only the phrase“Not enter- ing if-statement” would be “Entering else-statement”. However, the elseif- statement obviously influences the cyclomatic complexity in the same way an if-statement would, hence it is covered by the if regular expression above. To accurately calculate the cyclomatic complexity according to McCabe, knowledge of the quantities listed above is required. To calculate the number of nodes (n)andedges(e)equation4.1 and equation 4.2 is used, which is a direct derivation from McCabe.

\[ n = 3 \cdot \mathit{if} + 3 \cdot \mathit{olif} + 3 \cdot \mathit{andand} + 2 \cdot \mathit{switch} + \mathit{case} - 2 \cdot \mathit{default} + 3 \cdot \mathit{while} + 3 \cdot \mathit{for} + 3 \cdot \mathit{foreach} \tag{4.1} \]

\[ e = 4 \cdot \mathit{if} + 4 \cdot \mathit{olif} + 4 \cdot \mathit{andand} + 2 \cdot \mathit{switch} + 2 \cdot \mathit{case} - 2 \cdot \mathit{default} + 4 \cdot \mathit{while} + 4 \cdot \mathit{for} + 4 \cdot \mathit{foreach} \tag{4.2} \]

Knowing the number of nodes and edges of the source, we can calculate the cyclomatic complexity directly from equation 4.3, where e is the number of edges, n the number of nodes and p the number of connected components[12].

\[ v = e - n + 2p \tag{4.3} \]

In this application the number of connected components (or number of graphs) is 1, i.e. p = 1. Equation 4.3 can therefore be simplified to equation 4.4, which is straightforward to use.

\[ v = e - n + 2 \tag{4.4} \]
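The counting step that feeds equations 4.1 and 4.2 could be sketched roughly as follows. This is a minimal illustration, not PHPA's actual code: the patterns are copied verbatim from the list earlier in this section, while the helper name countDecisionPoints and the sample source are assumptions made for the example, and the sketch assumes a reasonably recent PHP version.

```php
<?php
// Patterns copied verbatim from the list in this section.
const DECISION_PATTERNS = [
    'if'      => '/(else)?if.*\(.+/',
    'olif'    => '/\?.*:/',
    'andand'  => '/&&/',
    'case'    => '/case.+?:.*?break[;]/s',
    'default' => '/default[ ]*:/',
    'switch'  => '/switch.*\(.*\)\{?/',
    'while'   => '/while.*\(.*\)/',
    'for'     => '/for.*\(.*;.*;.*\)/',
    'foreach' => '/foreach.*\(.*as.*\)/',
];

/**
 * Counts how often each pattern matches in the source of one unit.
 * countDecisionPoints() is a hypothetical helper, not part of PHPA.
 */
function countDecisionPoints(string $unitSource): array
{
    $counts = [];
    foreach (DECISION_PATTERNS as $name => $pattern) {
        // preg_match_all returns the number of non-overlapping matches.
        $counts[$name] = preg_match_all($pattern, $unitSource);
    }
    return $counts;
}

// Made-up unit source used only to exercise the patterns.
$source = <<<'SRC'
function a($x, $y) {
    if ($x && $y) {
        return $x ? 1 : 2;
    }
    foreach ($y as $item) {
        echo $item;
    }
}
SRC;

print_r(countDecisionPoints($source));
```

The resulting counts are the quantities named if, olif, andand and so on in equations 4.1 and 4.2. Because the patterns are plain regular expressions, they will also match inside comments and strings, so the counts are an approximation.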


4.3.3 PHPA: Dependency

Finding dependencies, as mentioned earlier, can be done by SSCA. The source for a unit needs to be available in order to find dependencies. In this application, dependencies are divided into two categories, internal and external.

Internal dependencies arise when a unit depends on another unit and both units are defined in the system, i.e. they are not in PHP's default library or another third party library. One could argue that library functions should also be mocked in order to completely, and solely, bring a unit under test. With this logic, operators such as + and - should also be mocked, which is infeasible. The line has to be drawn somewhere. Only requiring units which are in the system (i.e. units that need to be brought under test) is a common and fair approach. In some sense, one has to assume that imported libraries as well as standard functions are already tested[14].

External dependencies come from a unit which requires some external resource, such as a database, web service or a file in a file system. Both internal and external dependencies need to be mocked in order to bring a unit under test, thus both are of interest.

Finding internal dependencies is straightforward. The same methodology used when finding units was used here: go through the source and find references to the units in the current unit list (this requires that the list of units is available, which was solved in section 4.3.1).

External dependencies are a bit trickier. They could be found using dynamic code analysis. There is, however, one major drawback to doing an analysis like this dynamically. Unless it is absolutely certain that all paths through the system have been executed, it is unknown whether all dependencies have been found. In practice, it is almost impossible to be sure in a legacy system. Using SSCA, however, is not a walk in the park either. What should we be looking for? In PHP, the function fopen generally opens a file on the local file system.
However, wrappers exist which allow you to input "http://www.google.com" as the file name. The content of the "file" would be the Google website - ergo, it would not be a dependency on the local file system, but rather on a web service. There are other functions though, which are certain to enforce some dependency. The mysql_query function is sure to have a database dependency.

Finding key functions like fopen and mysql_query is how PHPA determines external dependencies. PHPA contains a list of key functions which it matches against functions found in the source. As some functions may be used for several dependencies in different contexts (as with the fopen function), the most common usage has been selected as the dependency. For example, fopen is assumed to enforce a file system dependency. The resulting external dependencies will not be 100% accurate using this method. However, the error is not as bad as it first seems. Even if a call to fopen actually targets a web service, PHPA will find an external dependency, only not the correct one. Currently, PHPA distinguishes at most three kinds of external dependency: web service, file system and database.
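The key-function matching described above might look roughly like the sketch below. It is an illustration under stated assumptions rather than PHPA's implementation: only fopen and mysql_query are named in the text, so the other table entries, the helper names and the sample unit are hypothetical.

```php
<?php
// Hypothetical key-function table. Only fopen and mysql_query are named in
// the text; the remaining entries are illustrative assumptions.
const KEY_FUNCTIONS = [
    'fopen'             => 'file system',
    'file_get_contents' => 'file system',
    'mysql_query'       => 'database',
    'curl_exec'         => 'web service',
];

/** Returns the sorted external dependency categories found in a unit. */
function findExternalDependencies(string $unitSource): array
{
    $found = [];
    foreach (KEY_FUNCTIONS as $function => $category) {
        // Match the function name as a whole word followed by "(".
        $pattern = '/\b' . preg_quote($function, '/') . '\s*\(/';
        if (preg_match($pattern, $unitSource)) {
            $found[$category] = true;
        }
    }
    $categories = array_keys($found);
    sort($categories);
    return $categories;
}

/** Returns the known units (other than $self) that a unit calls. */
function findInternalDependencies(string $unitSource, array $knownUnits, string $self): array
{
    $dependencies = [];
    foreach ($knownUnits as $unit) {
        if ($unit === $self) {
            continue;
        }
        $pattern = '/\b' . preg_quote($unit, '/') . '\s*\(/';
        if (preg_match($pattern, $unitSource)) {
            $dependencies[] = $unit;
        }
    }
    return $dependencies;
}

// Made-up unit source used only for the demonstration.
$unitSource = 'function b() { $var = a($aparam); $fh = fopen("/tmp/x", "r"); }';

print_r(findExternalDependencies($unitSource));
print_r(findInternalDependencies($unitSource, ['a', 'b', 'c'], 'b'));
```

As in the text, a call to fopen is always classified as a file system dependency here, even when a URL wrapper would make it a web service dependency.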

4.3.4 PHPA: Frequency

Determining the frequency of a unit cannot be done using SSCA. It is easy to find the number of references to a unit in the source code, but that reveals nothing about how frequently it would be invoked in production. Dynamic code analysis is necessary. For PHP, a profiling7 tool called Xdebug is available. Once enabled it records various data concerning the invocation of units during runtime. Even though it provides information about time spent in a unit and memory usage, only the number of invocations of each unit is of interest when determining the frequency.

In order to get a good frequency measurement, the profiling tool should be run for a considerable amount of time and on a system which is in production. For this thesis, there was no production system available to run the profiling tool on. Instead, a demo system, which is primarily used to promote the product to customers, was used. In practice, end users use this system the same way a "real" production system would be used, only with less traffic.

Xdebug generates a lot of output. Even though the only interesting part for determining frequency is the invocation count for each unit, there is no way to filter the output from Xdebug to only give this data. Along with the desired data, Xdebug also provides information about memory usage, at what time index a unit is called, how long execution takes, etc.8 This amount of data is troublesome to handle. It not only takes a long time to analyze, but also takes up a lot of storage space.

The data collection was set to run for one work week, from Monday morning until Friday afternoon. We deliberately did not run it during the weekend for two reasons. Firstly, the amount of data was simply too large; there was not enough storage space available for more than 5 days. Secondly, weekends are free days, and the test could not be supervised. If something broke on Saturday, it could not be fixed until Monday. This downtime was not acceptable.

7 Profiling in this thesis is defined as the process of logging the time and memory usage when invoking different units.[18]
8 All output generated by Xdebug, along with other possible settings, is available at http://xdebug.org/docs/all_settings [Online; accessed 20-November-2009]


During this week, just over 400 GB of data was collected. Analyzing the data with PHPA also took a fair amount of time - almost 27 hours. The log parsing procedure was similar to the other analysis procedures. Each line was parsed and matched against known units.

Finding the number of invocations with Xdebug carries one limitation. While it tells which unit was invoked, it does not provide information about which file this unit resides in. If the system contains several units with the same name in different source files, which is not uncommon in a larger project, and frankly necessary in any well designed object oriented solution, Xdebug has no means to distinguish these from each other. The frequency data for units with equal names is, unfortunately, unreliable. In the coming calculations, the frequency variable will be used as though this limitation did not exist. As a result, some functions will be considered to have a higher value than they actually have. This should be kept in mind when interpreting the results.
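A line-by-line tally of this kind could be sketched as below. The sketch assumes that every relevant log line contains the invoked unit in the form "-> name("; that layout, the helper names and the sample lines are assumptions for illustration and should be checked against the actual Xdebug output. Reading the log through a generator keeps memory use constant, which matters for logs of several hundred gigabytes.

```php
<?php
/**
 * Tallies invocations per known unit from an iterable of log lines.
 * The "-> name(" line layout is an assumption made for this sketch.
 */
function countInvocations(iterable $lines, array $knownUnits): array
{
    $counts = array_fill_keys($knownUnits, 0);
    foreach ($lines as $line) {
        if (preg_match('/->\s*([A-Za-z_][A-Za-z0-9_]*)\s*\(/', $line, $match)
            && isset($counts[$match[1]])) {
            $counts[$match[1]]++;
        }
    }
    return $counts;
}

/** Yields a file line by line so a very large log never sits in memory. */
function readLines(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            yield $line;
        }
    } finally {
        fclose($handle);
    }
}

// Made-up log lines used only for the demonstration.
$sample = [
    '0.0003 114112 -> a() /srv/app/index.php:3',
    '0.0004 114224 -> b() /srv/app/index.php:7',
    '0.0005 114224 -> a() /srv/app/lib.php:9',
    '0.0006 114224 -> strlen() /srv/app/lib.php:10',
];

print_r(countInvocations($sample, ['a', 'b']));
```

For a real log, countInvocations(readLines($path), $knownUnits) would process the file without loading it. Note that the tally is keyed by name only, so it shares the limitation described above for units with equal names.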

4.3.5 Analysis Results

All sought building blocks described in sections 4.3.1, 4.3.2, 4.3.3 and 4.3.4 were successfully found, some with limitations as discussed in the respective section. A sample HTML report from PHP Analyzer is shown in figure 4.2. PHPA can also report in many other formats, such as XML, diagrams (e.g. figure 4.3 and figure 4.4) or plain text. Running PHP Analyzer on the root of the legacy PHP system gave the following results

• Number of units: 1512
• Unparsable units: 1. Warnings: 147
• A list of units holding unit specific data
  – Unit name
  – File
  – Row
  – Frequency (if Xdebug analyzing was enabled)
  – Cyclomatic complexity
  – Internal and external dependencies
  – SLOC


Figure 4.2: PHP Analyzer report

Unparsable units are units where PHPA was unable to extract the source code of the unit. PHPA identifies this situation when it has reached the end of the file where the unit resides, but still has an unmatched number of curly braces ({ and }). The unit warnings occur when several units have the same name. As discussed in section 4.3.4, the frequency value may not be accurate for these units.

To illustrate the variables frequency, cyclomatic complexity and dependencies, a plot may be sufficient. As discussed in section 4.1, cyclomatic complexity and dependencies are metrics of the effort required to bring a unit under test, while frequency is a metric of a unit's value. Frequency vs complexity and frequency vs dependency sum9 are plotted in figure 4.3 and 4.4 respectively. Note that the scales are logarithmic to make it easier to get an overview of where units are in the plot. Figure 4.5 illustrates an attempt to combine the two diagrams by multiplying the cyclomatic complexity with the dependency sum on the X-axis, while keeping the frequency on the Y-axis. As the scales are logarithmic, the value zero is not defined[5]. However, we still want to plot every unit in this diagram, therefore each 0 value is replaced with 1. Negative values cannot occur.

9 Dependency sum is the sum of the internal and external dependencies.


Figure 4.3: Y-axis: frequency, X-axis: cyclomatic complexity

Figure 4.4: Y-axis: frequency, X-axis: dependency sum


Figure 4.5: Y-axis: frequency, X-axis: complexity times dependency sum

Furthermore, in diagram 4.5 the values depsum and complexity have had 1 added to each before they are multiplied together. This is to alleviate the effect multiplication with zero may have. If a unit has a high complexity, but 0 dependencies, the calculated effort required to bring the unit under test would be zero, the same as for a unit which has 0 complexity and 0 dependencies. That is obviously unfair, thus we add 1 to both to overcome this issue.

In section 4.4 a priority list will be derived, using the data from the PHPA analysis. The list will serve as a guide to which units provide the greatest gain when brought under test.
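The two adjustments described in this section, replacing 0 with 1 on the logarithmic axes and adding 1 to both factors of the combined effort, can be written out as a small sketch. The function names are hypothetical and chosen only for this illustration.

```php
<?php
/**
 * Effort value on the X-axis of figure 4.5: both factors are shifted by 1
 * so that a zero in one of them does not erase the other.
 */
function requiredEffort(int $complexity, int $depsum): int
{
    return ($complexity + 1) * ($depsum + 1);
}

/** Value used on a logarithmic axis, where 0 is replaced with 1. */
function logAxisValue(int $value): int
{
    return $value === 0 ? 1 : $value;
}

// A complex unit without dependencies keeps a non-zero effort ...
echo requiredEffort(10, 0), "\n";
// ... and differs from a unit with neither complexity nor dependencies.
echo requiredEffort(0, 0), "\n";
```

Without the shift both example units would have an effort of 0 and would be indistinguishable in the diagram.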

4.4 Choosing the entry point

As discussed in section 2.1.1 one of the main reasons for unit testing is finding bugs, and the parts of the system where bugs are least desirable are the most frequently used (see section 4.1). Hence, in the diagrams presented in section 4.3.5, the units with the highest frequency should be tested first. As discussed earlier, however, it would not be very strategic to simply list all units in descending order of frequency. Taking the required effort for bringing the units under test into account would be desirable. If done properly, the order in which unit tests are implemented can be fitted to the experience of the development team.

If the development team has previous experience of unit testing it may be a good idea to start in the top right of diagram 4.5. The units here require a fair amount of effort to bring under test, but are also relatively frequent. An advantage of starting here is that unit tests will generally require less and less effort as progress is made (though there may be some units in the bottom right quadrant which require more effort). Units which require a lot of effort to test (high cyclomatic complexity and depsum) also have more things that can go wrong (although this doesn't necessarily increase the probability of something failing).

If the development team's experience is a bit more rusty or non-existent, starting with easier units is probably a better way to go. The required effort in the beginning will be low. This will allow developers to gain experience as they implement unit tests. An analogy could be skiing. The first time you acquire a pair of skis, you wouldn't go to the Alps and throw yourself down a black slope. You could, but chances are you'd be hurt. Starting on a local, green slope is probably a better idea. This would correspond to the upper left quadrant of diagram 4.5.

At the department at Axis Communications where this thesis is carried out, there is limited experience of unit testing. Hence, it is fitting to start in the upper left quadrant. How the ranked list is created is described in section 4.4.1. The method used there is independent of which quadrant is selected, although section 4.4.1 is specific to starting in the upper left quadrant.

4.4.1 Creating the ranked list

For this specific application, it is desired to start in the upper left quadrant of diagram 4.5. The next unit we implement unit tests for should be of as much value as possible. According to the previous discussion, this means a high frequency and low effort (cyclomatic complexity and depsum). This could be done by imagining a straight line with a positive slope at the top left of the diagram. The line is then moved in small steps through the diagram towards the bottom right. Each dot which the line touches is the next unit to bring under test. See figure 4.6.

Figure 4.6: Method for diagram progression

It is now possible to tune which units to focus on by altering the slope of the line. A steeper slope would prioritize low effort units, while a flatter slope would prioritize the ones with high frequencies. The former alternative would cause units with higher required effort to be implemented later, while the latter would bring the high effort units forward. There is no generally superior way to select which quadrant to start in or the slope of the line. It depends on the current application, the development team and the units' location in the diagram. As the development team has limited experience, it was decided to tilt the slope in favor of lower required effort. The slope will be set to 2.

Consider the equation for the straight line y = ax + b. We have set the slope to 2.

\[ y = 2x + b \tag{4.5} \]

where y is the frequency of a unit and x is the required effort in terms of diagram 4.5. All that is left to find is the value of b at which each unit lies on the line. This is easily accomplished by rearranging the equation

\[ b = y - 2x \tag{4.6} \]

Ordering the units in descending order of b generates the ranked list.
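Equation 4.6 and the descending sort can be sketched as follows. The unit records and the function name rankUnits are made up for the example, and the sketch assumes that frequency and effort are already expressed in the coordinates of diagram 4.5; whether those are raw or logarithmic values is not settled by the text above.

```php
<?php
/**
 * Ranks units by the intercept b = y - slope * x (equation 4.6 with
 * slope 2 by default), highest b first.
 */
function rankUnits(array $units, float $slope = 2.0): array
{
    foreach ($units as &$unit) {
        $unit['b'] = $unit['frequency'] - $slope * $unit['effort'];
    }
    unset($unit); // break the reference left by the loop

    usort($units, fn(array $left, array $right) => $right['b'] <=> $left['b']);
    return $units;
}

// Made-up units used only for the demonstration.
$units = [
    ['name' => 'C', 'frequency' => 10,  'effort' => 1],
    ['name' => 'B', 'frequency' => 100, 'effort' => 40],
    ['name' => 'A', 'frequency' => 100, 'effort' => 2],
];

foreach (rankUnits($units) as $unit) {
    echo $unit['name'], ': b = ', $unit['b'], "\n";
}
```

Passing a larger slope penalizes effort more strongly, which corresponds to the steeper line discussed above.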


4.5 Writing tests

This section addresses common issues which may arise when developing unit tests. The three questions of where to start writing unit tests (specific to a legacy system), how to implement test cases and when to do it are briefly discussed.

Where

The question of where, i.e. which units to start implementing unit tests for, has already been answered in section 4.1. The result was a ranked list.

How

Teaching a group of developers to effectively implement unit testing in their daily work routine could be formulated as a thesis of its own. However, a few general and basic things will be covered here. Some form of quick reference needs to be produced, which the developers may use when in doubt. It should be kept short. If it is too long, it will require too much effort to find anything of value. A couple of things are desirable to have in the document

• General unit test layout.
• How to mock dependencies.
• How many test cases to write per unit.

In many cases, examples are the best way, and more specifically, practical examples from the system. The examples could be real implemented unit tests. It would be useful to select one unit for each difficulty that may occur when implementing tests. This would mean selecting one unit from each of the following categories.

• Has internal dependency.
• Has file system dependency.
• Has database dependency.
• Has web service dependency.
• Low complexity value.
• High complexity value.

This would cover the main difficulties when implementing unit test cases, i.e. answer the question of how. This is further discussed in section 4.5.1.

When

Discussions of when to write unit tests are also frequent. TDD promotes writing tests prior to implementing[3]. This requires the developer to thoroughly examine the problem before implementing the solution. As there already is an existing system, the tests for this system obviously have to be written after the implementation, unless it would be advantageous to re-implement the system, in which case this thesis is of no interest as that would be more similar to starting a new project.

To get the tests implemented, a consultant may be hired. A better solution, however, may be to set aside some time for the development team to implement the tests. This way, it will probably be finished quicker. The development team are also the ones who implemented the code once upon a time and require less time to familiarize themselves with it. This can be disadvantageous though: sources indicate that a developer who is familiar with a system often writes tests which verify that the system works as it does today, rather than how it should work[16].

4.5.1 Implementation

Internal dependencies

When mocking an internal dependency, we simply want to fake an invocation of the required unit. Consider the following code

    function a($param) {
        // ...
    }

    function b() {
        // ...
        $var = a($aparam);
        // Use $var in some way.
    }


Now if we want to test the unit b, the result would depend on what the result of a is. If a were to fail, the unit test for b would also fail, even though unit b may function correctly. This is unwanted behavior and needs to be countered in some manner. The solution is faking. In PHP there is an extension called runkit 10 which can overcome this issue. It allows an already defined function to be overridden. That is, in the example above we could execute the following code prior to the unit test of b to avoid the influence of the function a.

    runkit_function_redefine('a', '$param', 'return true;');

Function a will now always return true, and when unit testing b, $var will always be true. That is, b is no longer dependent on a. Doing this has a side effect. The functions are redefined in the global scope, meaning that each unit test run after a redefinition of a function will only see the newly defined function. This makes it hard to run tests independently of each other. The solution is to set up each new function definition we need before each test (setting up the fixture), and restore all changes made to functions after a test is finished (tearing down). This introduces a small overhead, but allows for isolating unit tests.
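The set-up and tear-down cycle could be sketched as below. This is a minimal illustration that assumes the runkit extension is loaded; the bodies of a and b, the helper names and the backup name a_original are hypothetical. In a PHPUnit test case the two helpers would correspond to the setUp() and tearDown() methods. Without runkit the script only reports that the demonstration is skipped.

```php
<?php
// Hypothetical units mirroring the example above: b depends on a.
function a($param)
{
    throw new RuntimeException('a is broken');
}

function b()
{
    $var = a('input');
    return $var ? 'handled' : 'rejected';
}

// Set up the fixture: keep a copy of the real a, then replace it by a stub.
function setUpFixture()
{
    runkit_function_copy('a', 'a_original');
    runkit_function_redefine('a', '$param', 'return true;');
}

// Tear down: remove the stub and put the real a back under its own name.
function tearDownFixture()
{
    runkit_function_remove('a');
    runkit_function_rename('a_original', 'a');
}

if (function_exists('runkit_function_redefine')) {
    setUpFixture();
    try {
        // With the stub in place, b no longer depends on the broken a.
        echo b() === 'handled' ? "b passed\n" : "b failed\n";
    } finally {
        tearDownFixture();
    }
} else {
    echo "runkit is not loaded, skipping the demonstration\n";
}
```

Restoring in a finally block means the original a is put back even when the test itself throws, so later tests are not affected by the redefinition.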

File System Dependencies

Similarly to the previous section, we do not want to be dependent on a specific file system when testing units which require the file system. We want to mock the file system just as we mocked internal functions before. This can be achieved with a tool known as vfsStream11. vfsStream allows the tester to put a virtual file system in a known state by simulating a file system in primary memory, completely eliminating the need to access the underlying file system. Hence, should the underlying file system fail, or be in any unknown state, the unit test would remain unaffected.
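A sketch of how a unit could be exercised against a virtual file system is given below. It assumes a current vfsStream release made available through an autoloader; the API of the version used in this thesis may have differed. The unit readFirstLine and the file names are hypothetical. Without vfsStream installed the script only reports that the demonstration is skipped.

```php
<?php
use org\bovigo\vfs\vfsStream;

// Hypothetical unit under test: returns the first line of a file, or null.
function readFirstLine($path)
{
    if (!is_readable($path)) {
        return null;
    }
    $handle = fopen($path, 'r');
    $line = fgets($handle);
    fclose($handle);
    return $line === false ? null : rtrim($line, "\r\n");
}

if (class_exists(vfsStream::class)) {
    // Put the virtual file system in a known state held in memory.
    vfsStream::setup('root', null, ['config.txt' => "first\nsecond\n"]);

    // The unit receives a vfs:// URL instead of a path on the real disk.
    var_dump(readFirstLine(vfsStream::url('root/config.txt')) === 'first');
    var_dump(readFirstLine(vfsStream::url('root/missing.txt')) === null);
} else {
    echo "vfsStream is not installed, skipping the demonstration\n";
}
```

Because the unit only ever sees a path string, no change to the unit itself is needed to run it against the simulated file system.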

10 http://www.php.net/manual/en/book.runkit.php
11 http://code.google.com/p/bovigo/wiki/vfsStream

Database dependencies

Databases are quite cumbersome when unit testing. This is mainly because it is hard to isolate tests[14]. If some data is changed when performing one unit test, that data will still be changed when initiating the next unit test - unless specific actions are taken. The two options we have when testing units which interact with a database are

• Mocking the database
• Testing against a real database

There are pros and cons with both alternatives.

Mocking the database requires extensive implementation. We need to be able to store databases, tables and values. Making sure this mock implementation works correctly would require a whole test suite itself. Unfortunately, there is no tool available to do this (in the way vfsStream solved the file system mocking). If we wanted to test updating data in a table, the mock would have to store all the current data, handle the update request and alter the data accordingly, and then allow for comparison with some expected data. The advantage of this approach, however, is that we are in full control of the contents and behavior of the database at all times. Furthermore, there is no need for installation or configuration of an actual database. Tests will also (most likely) be quicker, as they do not need to interact with a (possibly remote) database.

The second option, testing against a real database, would reduce the effort required from the test implementer. PHPUnit also includes excellent functionality for testing against a live database. The framework ensures that the database is restored to the initial state after each test, allowing for isolating tests.

I've decided to do the database testing against a real database for numerous reasons.

• As a database already is required to be able to develop the targeted system, installing and setting up is most likely already done.
• Implementing the mock for each database layer that may be used is cumbersome and takes a lot of effort.
• PHPUnit provides great functionality for performing tests on a real database[4].

Implementing the tests for the database is now straightforward. The framework takes care of most issues, such as returning the database to a known state and comparing the contents of the table data to the expected result.


Web service dependencies

Web service dependencies are the usage of services which are (most likely) remote from the running machine. They may use protocols such as http12, ftp13 or smtp14. In PHP, these types of services are usually made available through functions provided by PHP itself. Evidently, these functions cannot be altered for the sake of unit testing. The mocking itself would have to be done one layer higher up, in the function that uses the service.

Instead of mocking, we could run the tests against the live service, as was the case with testing the database. However, it is quickly realized that this approach is infeasible. Suppose we need the contents of a remote site, say http://www.sourceforge.net. In this case, the test case has no control over the site's content, or even over whether it is reachable or not. It is impossible to predict the answer from a service like this, much less control it. This would make unit testing impossible.

The final method would be the one used in the file system approach: a wrapper which intercepts all calls to remote services and replies with known answers. This would indeed be the most desirable approach, but no such tool exists for PHP. Creating the tool would (likely) require less effort than the database wrapper would, but is still outside the scope of this thesis.

For this application, the only way to go is to mock the functions using web services. This is achieved with the previously mentioned runkit module. Runkit does not allow overriding internal (PHP defined) functions by default, but requires a configuration setting to do so.

4.6 Putting it all together

Using the results in section 4.5.1 is trivial. The list which holds all units in the legacy system is now ordered in such a way that the top items are very valuable to have tests for, while the bottom items are less valuable to have tests for compared to the effort required to implement them. Hence, unit tests should be implemented following the list from top to bottom. The order may of course be partly revised, as the conducted study was not perfectly accurate due to the shortcomings discussed previously.

As for breaking various dependencies, the procedure for each dependency type is described in the previous section. Combined with the sample test cases implemented by the author, any developer should be able to continue breaking dependencies successfully.

PHPA may also be used continuously to update the ranked list if the software changes. As it is possible to save intermediate analysis results (such as the frequency analysis), the list can be updated quickly when the source changes.

The sample test cases I implemented were put in the repository of the revision control system to serve as examples, together with a readme file which describes how to fetch all required tools and how to execute the test suite. These sample test cases are further discussed in section 4.7.

12 http://www.faqs.org/rfcs/rfc2616.html
13 http://www.faqs.org/rfcs/rfc959.html
14 http://www.faqs.org/rfcs/rfc821.html

4.7 Validation

In order to validate the work done in this thesis, sample units were selected and tests were implemented for them. Units were selected in an attempt to represent all difficulties which may arise when implementing tests. A couple of units were selected for each category in this list

• Internal dependency.
• File system dependency.
• Database dependency.
• Web service dependency.
• Low complexity value.
• High complexity value.

The units were chosen from the previously mentioned priority list, picking the first unit from the top satisfying the condition.

Internal dependency

Implementing tests for a unit with only internal dependencies held no surprises implementation-wise. However, a bug in runkit caused a segmentation fault15 on some systems. Problems like these are quite frequent in "not-so-used" tools, although they may appear anywhere. Usually a fix has already been made available by some crafty individual somewhere on the Internet, and can be found with e.g. Google. If a fix can't be located, debugging yourself is usually not too demanding, and in any case a great way to learn. If you find and fix the problem, be sure to share it with the world. The bug in runkit was already fixed, although the fix was not included in any release. Applying the provided patch made the extension run as intended.

15 A segmentation fault is a common error that occurs when memory is accessed in an incorrect manner.[15]

File system dependency

Simulating a file system using vfsStream works very well. Only one problem was discovered when using this tool. Although support for file permissions is implemented, vfsStream does not look at them when opening a file. This feature is currently marked as "TODO" in version 0.4.0 (as of 2009-12-28). This limitation makes it impossible to test that unreadable files are in fact unreadable. However, as the developers of vfsStream are aware of the limitation, hopefully it will be addressed soon.

Web service dependency

When functions defined by PHP need to be mocked, the first issue is that runkit does not allow these functions to be mocked by default. The PHP ini variable runkit.internal_override needs to be set to the value 1 in order to allow this. The built-in interface to a web service usually consists of multiple functions. Hence, many overrides need to be created in order to successfully isolate a unit. For example, a unit using the curl library in PHP16 may have up to 18 functions which have to be stubbed. However, it is very unlikely that one unit uses all functions. Implementing tests for units which use web services requires a bit more effort than for units using internal functions, but the approach is the same.

Low Complexity

Implementing unit tests for a unit with a low complexity value holds few surprises. Usually one or two test cases execute all code, and several more are needed to test edge values. Theories and practices from most unit testing books are directly applicable[14].

16 http://www.php.net/manual/en/ref.curl.php


High Complexity

Units with a high complexity value are tedious to unit test as they have an abundance of decision points. Usually, a huge number of tests is required to execute all lines of the source code. Sometimes not even this is possible. In addition, an even greater number of tests is required to test all special cases. Some more work on how to handle high complexity units could be useful. Perhaps it is possible to extract source code from the high complexity unit (i.e. refactoring) or perform some other optimizations to reduce complexity. This is not within the scope of this thesis.

Chapter 5

Conclusion

When introducing automatic testing to any legacy system, the same problems are posed. A framework is the most essential tool when introducing testing. Of course, a framework is equally essential if automated unit testing is planned from the beginning of the project.

Choosing a fitting framework is very individual for each legacy system. It is important to realize what matters when introducing tests in the current system. The list in section 3.1 contains the things that were considered important in the system targeted by this thesis. In a general system, there may be other items which are important, or items in the list which can be excluded.

PHPUnit was selected for the application studied by this thesis. PHPUnit has a versatile bank of features and allows for integration into a legacy PHP system without any (major) changes to the current source, especially in conjunction with a few other tools, such as runkit and vfsStream, also covered by this thesis.

In a legacy system, there are usually hundreds, or thousands, of units, along with a huge quantity of source code. In this application there were 1512 unique units built on 43240 source lines of code. Needless to say, it will take some time to get all these units under test. In order to decide which units were most important to get under test early on, some properties of each unit were identified and measured. We found that the interesting unit properties when creating this list were

• The frequency with which the unit was invoked during a certain time span.
• The cyclomatic complexity of the unit.
• The dependencies the unit has towards internal and external influences.

The key to creating a usable list from these variables is to combine them in a sane way. This was achieved by realizing which properties contributed to the required implementation effort, and which property contributed to the value of having the unit under test. It was determined that the frequency increased the value while cyclomatic complexity and dependencies increased the required effort. Putting these in a diagram and traversing the diagram in a suitable manner yields a priority list where unit testing the top items is most beneficial and unit testing the bottom items least beneficial.

Finding the properties discussed would be difficult without a tool. As no such tool yet existed, PHP Analyzer was born. PHP Analyzer is now available as open source from SourceForge1, where the code and methods to find each property may be examined. Releasing the tool as open source also allows anyone to improve the tool, add new features and so on.

One of the primary ingredients in succeeding when writing unit tests is to isolate tests from each other. Isolating a unit from any influence can prove quite tedious. Four dependencies of a unit have been discussed:

• Internal dependencies.
• File system dependencies.
• Database dependencies.
• Web service dependencies.

In order to successfully test units with these dependencies we need to isolate the unit to remove possible side effects.

Internal dependencies were overcome with the PHP extension runkit, which has the ability to override an already defined function. File system dependencies were overcome by using the tool vfsStream, which simulates a file system in primary memory. Database dependencies were overcome by creating a test database, and using built-in functionality in PHPUnit which always returned the database to a known state, allowing for unit testing. Web service dependencies were overcome by mocking each web service with runkit, similar to internal dependencies.

5.1 Discussion

As always when humans are involved, there is rarely one right way to proceed. Whether it is choosing a framework or methods for implementing test cases, different people do things differently and have varying opinions. Because of this, assumptions about how much these influences contribute are frequently made in order to accomplish the task at hand.

When selecting a framework to work with, it was not possible to evaluate all frameworks. Six frameworks were chosen for the study, of which only four were subject to a deeper study. There are more frameworks which could have been fair candidates (such as lime² or ojes³). Due to lack of time it was simply infeasible to try all available frameworks, and some had to be excluded without extensive research. As common sense was the only gatekeeper deciding which frameworks made it into the study in the initial stages of the thesis, it is possible that a potent framework was discarded. However, judging from those criteria described in section 3.1 which are easily measured without extensive research (such as maintained and license), it is likely that all interesting frameworks were included.

The case study performed in an attempt to capture how user friendly a framework is had some potential flaws. Only two people participated. Generally this is too low a count to give any quantitative results. However, as discussed previously (section 3.4.1), this is 25% of the intended target group, since there are currently 8 primary developers of the system. Furthermore, the survey questions were difficult to establish. A simple question such as "Was the framework good?" is not suitable, as it is very difficult to answer. The questions used attempt to break that question down into several more easily answered questions. The problem with this approach is that the survey is harder to interpret, and the interpretation made may be misleading: all questions were given equal value, so each question contributes equally to the average. It was interpreted this way to limit the effort required in this phase. Even so, the results of the surveys were pretty much as expected.

The analysis tool was created simply because of the lack of any such tool. The few tools which do exist for PHP (discussed in section 4.2) did not provide enough information, not even combined.

The variables examined when creating the ranked list of units needing tests were frequency, dependencies and cyclomatic complexity. It was argued that a high frequency would increase the need for a unit to be under test, while many dependencies and high cyclomatic complexity would increase the effort required to bring a unit under test. Multiple dependencies or a high cyclomatic complexity do indeed increase the required effort, i.e. increase the number of test cases needed or the amount of work necessary to implement test cases. A high frequency, however, does not necessarily mean that tests for the unit are desired. A unit which is frequently invoked is most likely exercised by several other tests already (integration tests, system tests, or simply any ad hoc test performed by a developer while developing). Thus, it may not be desirable to start with this unit. Furthermore, a unit that is frequently used is not necessarily important (compare the ATM example in section 4.1, where rendering the bank's logo correctly is less important than making a withdrawal of the correct amount). The frequency variable was chosen mainly because of the lack of any other good variable, or of the means to measure one (for example, it was infeasible to ask the developers to create the list, for the reasons listed in section 4.1).

Analyzing the movement of the code (i.e. how often the code is changed) could be an alternative to analyzing the frequency. The system has been in development and under version control for some time (it is after all a legacy system), so the data required to perform the analysis is available. However, this analysis has about the same type of problems as the frequency analysis. One cannot be sure that code with high movement increases value. It may be that this code has already been completely stripped of bugs, or that the changes were made because of a new feature implementation (which obviously does not need to be implemented again). Furthermore, new code (which is likely to contain bugs and will require changes) would not be deemed valuable by a movement analysis. Had there been enough time, a combination of a frequency analysis and a movement analysis would probably have been a better path to take than choosing only one of the two.

¹ http://phpa.sourceforge.net
² http://trac.symfony-project.org/browser/tools/lime/trunk
³ http://ojesunit.blogspot.com/

Combining the variables was not straightforward.
We concluded that frequency, in some sense, adds to the value of a unit being under test, while dependencies and cyclomatic complexity add to the effort required to bring the unit under test. Hence, it was reasonable to put frequency on one axis and to combine dependencies and cyclomatic complexity in some way on the other axis.

Before dependencies and cyclomatic complexity can be combined, dependencies need to be examined. Two types of dependencies have been analyzed: internal (the usage of another unit in the system) and external (the usage of any external service, such as a file system or a database). These two need to be combined in some manner. As both represent a dependency which needs to be broken in order to bring the unit under test, simply adding them seems sufficient. The result, referred to as depsum, is the number of dependencies that need to be broken. It may be argued that external dependencies are harder to break and should therefore be weighted to increase the depsum more than internal dependencies do. Choosing a weight is not trivial, and being certain that the weight is fair is even harder. Also, a weight of 2 or even 3 would not change the created list noticeably, as there are at most three external dependencies.

Dependencies (depsum) and cyclomatic complexity can be combined in several ways. The two could simply be added, or they could be multiplied. Neither seems fair as it stands. With addition, if either value is much larger than the other (which is not uncommon), the smaller value has little influence on the result. Multiplication allows each value to influence the other equally. The problem here is the value zero: if one value is zero and the other is very high, the result is still zero. To overcome this, one was added to each value, resulting in the formula

$$(\mathit{depsum} + 1) \cdot (\mathit{complexity} + 1)$$

on the axis.

When creating the ranked list, it was chosen to start with the units requiring the least effort to bring under test. This was mainly because the team which will implement the test cases is fairly inexperienced with unit testing. This was further emphasized when the slope of the line used to traverse the diagram (see section 4.4.1) was increased to select "easier" units first.
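The effort formula above can be sketched in a few lines of PHP. This is a minimal sketch and not the code of PHP Analyzer: the function names, the array keys and the sample units are all hypothetical, and PHP 7 or later is assumed. The ordering shown (lowest effort first, ties broken by highest frequency) is a deliberate simplification and is not the sloped-line traversal of section 4.4.1.

```php
<?php
/**
 * Effort value for the diagram axis: (depsum + 1) * (complexity + 1).
 * $externalWeight = 1 gives the plain sum of dependencies used in the text;
 * a larger weight is the alternative discussed above (an assumption here).
 */
function effort(int $internalDeps, int $externalDeps, int $complexity, int $externalWeight = 1): int
{
    $depsum = $internalDeps + $externalWeight * $externalDeps;
    return ($depsum + 1) * ($complexity + 1);
}

/**
 * Simplified ranking: lowest effort first, ties broken by highest frequency.
 * Returns the unit names in the order they would be brought under test.
 */
function rank_units(array $units): array
{
    usort($units, function (array $a, array $b): int {
        $effortA = effort($a['internal'], $a['external'], $a['complexity']);
        $effortB = effort($b['internal'], $b['external'], $b['complexity']);
        if ($effortA !== $effortB) {
            return $effortA <=> $effortB;
        }
        return $b['frequency'] <=> $a['frequency'];
    });
    return array_column($units, 'name');
}

// Made-up units, for illustration only.
$units = [
    ['name' => 'render_logo', 'internal' => 0, 'external' => 0, 'complexity' => 1, 'frequency' => 90],
    ['name' => 'withdraw',    'internal' => 3, 'external' => 1, 'complexity' => 8, 'frequency' => 40],
    ['name' => 'format_date', 'internal' => 0, 'external' => 0, 'complexity' => 1, 'frequency' => 10],
];

foreach (rank_units($units) as $position => $name) {
    echo ($position + 1) . ". $name\n";
}
```

Adding one to each factor means that a unit without dependencies still gets an effort value that grows with its complexity, and the other way around, which is the property the formula was chosen for.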


Chapter 6

Summary

Introducing automatic testing in a legacy system is not trivial. There are many factors which need to be taken into account.

A legacy system normally consists of several thousand lines of code. In the system targeted by this thesis, there are over 1500 units. Needless to say, methods for implementing unit tests are required in order to be successful. It is not enough to address only the problems specific to a legacy system; general issues with implementing unit tests must be addressed as well, for example "How do I break dependencies?" and "How many test cases should I write and what should I test?".

Selecting a framework was done by defining a number of criteria which were considered important. Next, several frameworks which might fit the criteria were tested and evaluated. With a decent margin, PHPUnit came out on top.

With a framework chosen and over 1500 units to test, the task may seem overwhelming. A common approach when presented with a huge task is to divide it into several smaller tasks. A natural breakdown is to bring one unit at a time under test. Which unit should we start with? To find metrics specific to a certain unit, a tool is required, and so PHP Analyzer was created. PHP Analyzer determines three metrics of interest: frequency, dependencies and cyclomatic complexity. Combining these made it possible to create a ranked list which may be used when implementing unit tests. The list was created with some thought. As the developers are not very experienced with implementing unit tests, the units at the top of the list are easier to create unit tests for than the ones at the bottom. However, they still provide fair value from being under test, according to their frequency. Higher frequency increases the value of a unit being under test, while dependencies and cyclomatic complexity increase the effort required to bring the unit under test.


While implementing unit tests, it is common that dependencies need to be broken. This was accomplished with a PHP extension called runkit. Using runkit, a developer can easily override existing functions with an implementation specific to the unit test. The resulting function is often known as a stub or mock, and has little or no logic. Often, it just returns predefined values so that we can control the unit test.

For breaking file system dependencies, a tool called vfsStream exists. vfsStream creates a virtual file system in primary memory. All file interactions are then made against this virtual file system, and the unit test has full control of all files and folders in it. vfsStream has one limitation: it does not support the concept of file ownership or permissions, which makes it impossible to create tests where files are not readable due to lack of permissions.

Not all dependencies were broken though. The external database dependency is an example of this. It was decided not to mock the database, for two reasons. First, there is no available mock for a database connection (as there was for the file system), and implementing a mock database would require a great deal of effort. Second, it is very common to test against a live database. A database is fairly easy to set up, and PHPUnit contains many useful features for setting the database in a known state, allowing for unit tests.

A small library of sample test cases was implemented using the methods discussed in this thesis. The library not only serves as examples of how unit tests could be implemented, but also verifies that the framework, techniques and strategy developed in this thesis work.
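The idea of a stub that only returns predefined values can be illustrated without runkit. The sketch below passes the stub in through an interface instead of redefining a function, which is a different technique from the runkit approach used in the thesis but the same one as MyMockComplexFileHandler in Appendix A. All names (RateSource, StubRateSource, convert) are hypothetical and stand in for a web service dependency; PHP 7 or later is assumed.

```php
<?php
// Hypothetical dependency: in production this would call a web service.
interface RateSource
{
    public function rate(string $currency): float;
}

// The stub has no logic of its own: it only returns the values it was given.
final class StubRateSource implements RateSource
{
    private $rates;

    public function __construct(array $rates)
    {
        $this->rates = $rates;
    }

    public function rate(string $currency): float
    {
        return $this->rates[$currency];
    }
}

// The unit under test only knows the interface, so the test decides
// which implementation it talks to.
function convert(float $amount, string $currency, RateSource $source): float
{
    return round($amount * $source->rate($currency), 2);
}

$stub = new StubRateSource(['EUR' => 0.5]);
echo convert(10.0, 'EUR', $stub), "\n";
```

Because the stub's answers are fixed by the test, the result of convert depends only on the unit's own logic, which is what isolating a unit is meant to achieve.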

Appendix A

Source Code

A.1 Complex

A.1.1 Complex class

<?php

require_once 'filehandler.php';
require_once 'config.php';

/**
 * A class representing a complex numeral
 *
 * @author Alexander Olsson
 */
class Complex {
    /**
     * The real part of the complex numeral.
     */
    public $re;

    /**
     * The imaginary part of the complex numeral.
     */
    public $im;

    /**
     * The file handler to manage load and saves.
     */
    private $fh;

    /**
     * Creates a new Complex number.
     *
     * @param double $re The real part of the complex number.
     * @param double $im The imaginary part of the complex number.
     * @param IComplexFileHandler $fh The file handler used to save the number.
     */
    public function __construct($re, $im,
                                IComplexFileHandler $fh) {
        $this->re = round($re, get_config('PREC'));
        $this->im = round($im, get_config('PREC'));
        $this->fh = $fh;
        $this->fh->save($this);
    }

    /**
     * Returns a textual representation on the form x+yi.
     *
     * @return string The complex number in textual form.
     */
    public function get_cartesian() {
        $im = abs($this->im) == 1 ? '' : abs($this->im);
        if ($this->re != 0 && $this->im != 0) {
            return "{$this->re}" . ($this->im < 0 ? ' - ' : ' + ')
                . "{$im}" . 'i';
        } elseif ($this->re != 0) {
            return "{$this->re}";
        } elseif ($this->im != 0) {
            return ($this->im < 0 ? '-' : '') . $im . 'i';
        } else {
            return '0';
        }
    }

    /**
     * Returns a textual representation on the form ae^(bi).
     *
     * @return string The complex number in textual form.
     */
    public function get_polar() {
        $A = round(sqrt($this->re * $this->re +
                        $this->im * $this->im),
                   get_config('PREC'));
        $exp = round(atan2($this->im, $this->re),
                     get_config('PREC'));
        $A_out = abs($A) == 1 ? '' : abs($A);
        if ($A != 0 && $exp != 0) {
            return $A_out . 'e^(' . $exp . 'i)';
        } elseif ($A != 0) {
            return "{$A}";
        } else {
            return '0';
        }
    }

    /**
     * Conjugates the complex number.
     * That is, x+yi becomes x-yi.
     */
    public function conjugate() {
        $this->im = -$this->im;
        $this->fh->save($this);
    }

    /**
     * Adds a complex number to the current one.
     *
     * @param Complex $c The complex number to be
     *                   added to this one.
     */
    public function add(Complex $c) {
        $this->re = $this->re + $c->re;
        $this->im = $this->im + $c->im;
        $this->fh->save($this);
    }

    /**
     * Multiplies a complex number with the current one.
     *
     * @param Complex $c The complex number to be multiplied
     *                   with this one.
     */
    public function mult(Complex $c) {
        $new_re = $this->re * $c->re - $this->im * $c->im;
        $new_im = $this->re * $c->im + $this->im * $c->re;

        $this->re = round($new_re, get_config('PREC'));
        $this->im = round($new_im, get_config('PREC'));
        $this->fh->save($this);
    }

    /**
     * Raises the current complex number to the
     * power of parameter.
     * This equals to z^n = z*z*z... n times.
     *
     * @param Integer $n The power to raise current
     *                   complex number to.
     */
    public function pow($n) {
        if ($n === 0) {
            $this->re = 1.0;
            $this->im = 0.0;
        }

        // Multiply by a copy of the original value. Multiplying by
        // $this itself would square the number on every pass.
        $base = clone $this;
        for ($i = 0; $i < $n-1; $i++) {
            $this->mult($base);
        }
        $this->fh->save($this);
    }
}

/**
 * Returns an array with complex numbers
 * representing the unit circle. The numbers
 * will be evenly distributed along the unit circle.
 *
 * @param Integer $n the number of complex numbers wanted.
 */
function unit_circle_representation($n) {
    $a_complex = array();
    for ($i = 0; $i < $n; $i++) {
        $fh = new ComplexFileHandler("cpx_$i.cpx");
        $phi = $i*(2*M_PI/$n);
        $a = 1/(sqrt(1+tan($phi)*tan($phi)));
        $b = sqrt(1 - $a*$a);
        if ($phi > M_PI/2 && $phi < 3/2*M_PI) {
            $a = -$a;
        }
        if ($phi > M_PI && $phi < 2*M_PI) {
            $b = -$b;
        }
        array_push($a_complex, new Complex($a, $b, $fh));
    }
    return $a_complex;
}

?>


A.1.2 Filehandler

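<?php

/**
 * Interface implemented by the file handlers below (reconstructed
 * from the two implementations; lines 2 to 6 of the listing are missing).
 */
interface IComplexFileHandler {
    public function save(Complex $c);
    public function load();
}

class ComplexFileHandler implements IComplexFileHandler {
    private $fh;
    private $filename;

    public function __construct($filename) {
        $this->filename = $filename;
    }

    private function open() {
        if (!isset($this->fh)) {
            // 'c+' allows both reading and writing. The mode 'w' is
            // write-only, which made load() fail.
            $this->fh = fopen($this->filename, 'c+');
        }
    }

    public function save(Complex $c) {
        $this->open();
        // Overwrite the previous value instead of appending to it.
        ftruncate($this->fh, 0);
        rewind($this->fh);
        fwrite($this->fh, $c->re . ',' . $c->im);
    }

    public function load() {
        $this->open();
        rewind($this->fh);
        return fgets($this->fh);
    }
}

class MyMockComplexFileHandler implements
    IComplexFileHandler {
    public function __construct($filename) {}
    public function save(Complex $c) {}
    public function load() { return '1 + 2i'; }
}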


A.1.3 Config

<?php

$the_config = array();
$the_config['PREC'] = 2;

function get_config($var) {
    global $the_config;
    return $the_config[$var];
}

?>


A.1.4 Complex slim

<?php

/**
 * A class representing a complex numeral
 *
 * @author Alexander Olsson
 */
class Complex {
    /**
     * The real part of the complex numeral.
     */
    public $re;

    /**
     * The imaginary part of the complex numeral.
     */
    public $im;

    /**
     * Creates a new Complex number.
     *
     * double $re The real part of the complex number.
     * double $im The imaginary part of the complex number.
     */
    public function __construct($re, $im) {
        $this->re = round($re, 2);
        $this->im = round($im, 2);
    }

    /**
     * Returns a textual representation on the form ae^(bi).
     *
     * @return string The complex number in textual form.
     */
    public function get_polar() {
        $A = round(sqrt($this->re * $this->re +
                        $this->im * $this->im), 2);
        $exp = round(atan2($this->im, $this->re), 2);
        $A_out = abs($A) == 1 ? '' : abs($A);
        if ($A != 0 && $exp != 0) {
            return $A_out . 'e^(' . $exp . 'i)';
        } elseif ($A != 0) {
            return "{$A}";
        } else {
            return '0';
        }
    }

    /**
     * Adds a complex number to the current one.
     *
     * @param Complex $c The complex number to be
     *                   added to this one.
     */
    public function add(Complex $c) {
        $this->re = $this->re + $c->re;
        $this->im = $this->im + $c->im;
    }
}

?>


Appendix B

Case study

B.1 Introduction


B.2 Task

The code you are supposed to test is a small class which represents a complex number with very restricted functionality. The source code is available at [URL to source has been stripped]. Tests for the functions get_polar and add need to be written. This should be done in four different unit testing frameworks for PHP: PHPUnit, SnapTest, PHPSpec and PHPT.

B.2.1 Purpose

The goal of this lab is to attempt to find an answer to the question "how easy is the framework to use". This will be done in several steps. First, a short introduction will be held. Next, you will be given time to try out the frameworks. Once finished, you will participate in a small survey with a couple of questions to answer. This will be followed by a short interview where we discuss your answers and experiences.


B.2.2 Task

The tests which should be implemented (at least; feel free to add more) are described in table B.1. They are pseudo-specifications, but ought to be self-explanatory. If anything is unclear, please ask.

Table B.1: Test suite

Function    Parameters          Expected result
get_polar   (1, 2)              2.24e^(1.11i)
get_polar   (-1, -2)            2.24e^(-2.03i)
get_polar   (1, -2)             2.24e^(-1.11i)
get_polar   (-1, 2)             2.24e^(2.03i)
get_polar   (0, 0)              0
get_polar   (1, 0)              1
get_polar   (0, 1)              e^(1.57i)
add         (1, 1), (2, 3)      re = 3, im = 4
add         (-4, 2), (2, -3)    re = -2, im = -1
add         (1, 4), (-2, -2)    re = -1, im = 2
add         (4, 1), (-3, -4)    re = 1, im = -3

Parameters from table B.1 should be thought of as parameters to the constructor. That is, for the first entry in the table, the PHP code would be something like

$c = new Complex(1, 2);
$p = $c->get_polar();
/* ... compare returned value to expected (2.24e^(1.11i)) */

Note that the parameters are not passed to the get_polar function directly. A similar approach is needed when testing the add function: create two complex numbers, call add on one with the other as argument, and check the state of the complex number on which add was called.

The order in which you evaluate the frameworks does not matter. You can consider yourself finished with a framework when you have a test suite covering the above test cases running. A quick reference to each of the frameworks is given in the sections that follow.
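Independently of any framework, the approach described above can be sketched in plain PHP. The sketch below is only an illustration: it repeats a condensed copy of the slim Complex class from A.1.4 so that it runs on its own, the check helper is hypothetical and is not part of any of the four frameworks, and PHP 7.1 or later is assumed. It covers a selection of the rows in table B.1.

```php
<?php
// Condensed copy of the slim Complex class from A.1.4.
class Complex
{
    public $re;
    public $im;

    public function __construct($re, $im)
    {
        $this->re = round($re, 2);
        $this->im = round($im, 2);
    }

    public function get_polar()
    {
        $A = round(sqrt($this->re * $this->re + $this->im * $this->im), 2);
        $exp = round(atan2($this->im, $this->re), 2);
        $A_out = abs($A) == 1 ? '' : abs($A);
        if ($A != 0 && $exp != 0) {
            return $A_out . 'e^(' . $exp . 'i)';
        } elseif ($A != 0) {
            return "{$A}";
        } else {
            return '0';
        }
    }

    public function add(Complex $c)
    {
        $this->re = $this->re + $c->re;
        $this->im = $this->im + $c->im;
    }
}

// Hypothetical helper: stops at the first failed check.
function check($condition, $label)
{
    if (!$condition) {
        throw new RuntimeException("failed: $label");
    }
}

// get_polar: the parameters go to the constructor, not to get_polar itself.
$polarCases = [
    [1, 2, '2.24e^(1.11i)'],
    [-1, -2, '2.24e^(-2.03i)'],
    [0, 0, '0'],
    [1, 0, '1'],
    [0, 1, 'e^(1.57i)'],
];
foreach ($polarCases as [$re, $im, $expected]) {
    $c = new Complex($re, $im);
    check($c->get_polar() === $expected, "get_polar ($re, $im)");
}

// add: create two numbers, add one to the other, then check the state.
$sum = new Complex(1, 1);
$sum->add(new Complex(2, 3));
check($sum->re == 3 && $sum->im == 4, 'add (1, 1), (2, 3)');

echo "all checks passed\n";
```

Each of the four frameworks replaces the check helper with its own mechanism (asserts, specs or output comparison), while the creation of the objects stays the same.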


PHPUnit

• Manual: http://www.phpunit.de/manual/current/en/index.html

Installation and running

Instructions for installing PHPUnit are found in the manual, http://www.phpunit.de/manual/current/en/installation.html. After installation, a test case may be run by issuing the command

phpunit FILE

Quick Reference

• A test case class should extend PHPUnit_Framework_TestCase.
• Function setUp() is run before each test function.
• Function tearDown() is run after each test function.

Standard asserts

• assertEquals($expected, $actual [, $message])
• assertLessThan($expected, $actual [, $message]) ($actual should be less than $expected. Error otherwise.)
• assertGreaterThan($expected, $actual [, $message]) ($actual should be greater than $expected. Error otherwise.)
• assertTrue($condition [, $message])


Sample

<?php

class MyTest extends PHPUnit_Framework_TestCase
{
    public function testHelloWorld()
    {
        $this->assertEquals('Hello world',
                            'Hello world');
    }
}
?>


SnapTest

• Manual: http://code.google.com/p/snaptest/w/list

Installation and running

Instructions for installing SnapTest can be found at http://code.google.com/p/snaptest/wiki/QuickStart. To run a test you need to be in the directory where you extracted SnapTest, and run

php snaptest.php PATH_TO_TESTFILE

Quick Reference

• A file should end with .stest.php. Example: TestComplex.stest.php.
• A test case class should extend Snap_UnitTestCase.
• Function setUp() is run before every other function. The function is required.
• Function tearDown() is run after every other function. The function is required.

Standard asserts

• assertEqual($actual, $expected [, $message]) ('==' comparison)
• assertIdentical($actual, $expected [, $message]) ('===' comparison)
• assertTrue($condition [, $message])


Sample

<?php
[lines 2 to 4 of the listing, including the class declaration, have been stripped]

    public function testHelloWorld() {
        return $this->assertEqual('Hello world',
                                  'Hello world');
    }
}
?>

Note that assertEqual is spelled without ’s’, in contrast to PHPUnit where it’s spelled assertEquals.


PHPSpec

• Manual: http://dev.phpspec.org/manual/en/

Installing and Running

A guide to installing PHPSpec can be found at http://dev.phpspec.org/manual/en/installing.phpspec.html. When running a test case, use

phpspec FILE_WITHOUT_EXTENSION

Quick Reference

• Each class must follow one of the following naming conventions: *Spec or Describe*, where * is an arbitrary number of characters allowed by PHP in class names. Example: TestComplexSpec or DescribeHowToTestComplex.
• Each function must begin with itShould. Example: itShouldPerformAdditionCorrectly.
• Function before() is run before every other function.
• Function after() is run after every other function.

All specs look like

$this->spec($thing_to_test)->[should|shouldNot]->MATCHER()

Example: $this->spec('Hello world')->should->beString(); tests if the provided spec is a string. Likewise, $this->spec('Hello world')->shouldNot->beInt(); tests if the provided spec is not an integer. The functions beString() and beInt() are known as matchers.

Standard matchers

• be($expected) (identical to equal($expected) and beEqualTo($expected)).
• beTrue()
• beLessThan($expected)
• beGreaterThan($expected)


Sample

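<?php
[lines 2 to 4 of the listing, including the class and function declarations, have been stripped]
        $this->spec('Hello world')->should->be('Hello world');
    }
}
?>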

Note how PHPSpec attempts to mimic the English language and its usage.


PHPT

• Manual: http://qa.php.net/running-tests.php

Installing and Running

PHPT is available if you have installed the PHP CLI (e.g. sudo apt-get install php5-cli, or use your favorite package manager). It may also be found at http://www.php-cli.com. Running PHPT tests is accomplished via PEAR. To run all tests in a folder:

pear run-tests

To run a specific file:

pear run-tests FILE

Quick Reference

• A PHPT test case should have the extension .phpt.
• Each file has three sections: description, test code and expected result. Layout:

--TEST--
Description of the test
--FILE--
[lines 4 to 6, the PHP test code, have been stripped]
--EXPECT--
The expected result of the code here


Sample

--TEST--
Test Hello world.
--FILE--
[lines 4 to 6, the PHP test code, have been stripped]
--EXPECT--
string(11) "Hello world"

Note that no standard asserts/specs are available. The only thing that is tested is the output.
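The idea of testing only the output can be sketched in plain PHP. The helper below is hypothetical and is not part of PHPT or of its runner: it captures what a piece of code prints and compares it with the expected text, in the spirit of the EXPECT section. Trimming surrounding whitespace before the comparison is a choice made for this sketch. PHP 7 or later is assumed.

```php
<?php
/**
 * Runs $code, captures everything it prints and compares the captured
 * text with $expected. Surrounding whitespace is ignored on both sides.
 */
function output_matches(callable $code, string $expected): bool
{
    ob_start();
    try {
        $code();
    } finally {
        // Always close the buffer, even if $code throws.
        $actual = ob_get_clean();
    }
    return trim($actual) === trim($expected);
}

$passed = output_matches(function () {
    echo "Hello world\n";
}, 'Hello world');

echo $passed ? "PASS\n" : "FAIL\n";
```

With this style of test there is nothing to assert on except text, so any value that should be checked has to be printed first.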


B.3 Survey

Name of framework:

In the survey below, please rate each statement from 1 to 5, where 1 is "I completely disagree" and 5 is "I totally agree". The X column is to be used if you for some reason are unable to rate the statement.

Statement                                                     1   2   3   4   5   X
It is quick to write testcases
It is easy to understand how to write testcases
The framework helps me verify code, rather than being
an obstacle
I would use the framework next time I need to verify code,
rather than performing some ad hoc output
The framework is intuitive
The manual is easily understood
The manual is rich in content
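Section 5.1 notes that every statement contributes equally to the average value for a framework. A minimal sketch of that calculation is given below. The function name and the sample answers are made up for illustration and are not results from the case study, and leaving X answers out of the average is an assumption of this sketch. PHP 7.1 or later is assumed.

```php
<?php
/**
 * Unweighted mean of the 1 to 5 ratings for one framework.
 * Answers marked 'X' are left out; null is returned if nothing was rated.
 */
function framework_score(array $ratings): ?float
{
    $numeric = array_values(array_filter($ratings, 'is_int'));
    if (count($numeric) === 0) {
        return null;
    }
    return array_sum($numeric) / count($numeric);
}

// Made-up answers, one per statement in the survey above.
$answers = [4, 5, 4, 3, 'X', 4, 4];
echo framework_score($answers), "\n";
```

Since every statement has the same weight, a low rating for the manual pulls the score down exactly as much as a low rating for how quick it is to write test cases, which is the interpretation risk discussed in section 5.1.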

86 Bibliography

[1] L. Ahlfors, Complex Analysis, McGraw-Hill, third ed., 1979.

[2] K. Beck, Simple smalltalk testing: With patterns. http://www. xprogramming.com/testfram.htm,1989.[Online;accessed25-october- 2009].

[3] K. Beck, Test Driven Development: By Example, Addison-Wesley Pro- fessional, 2002.

[4] S. Bergmann, PHPUnit Pocket Guide,O’ReillyMediaInc.,2005.

[5] L. C. Boiers¨ and A. Persson, Analys i En Variabel,Studentlitter- atur AB, 2001.

[6] P. M. Duvall, S. Matyas, and A. Glover, Continuous Integra- tion: Improving Software Quality and Reducing Risk, Addison-Wesley Professional, 2007.

[7] J. Friedl, Mastering Regular Expressions,O’ReillyMediaInc., third ed., 2006.

[8] P. Hamill, Unit Testing Frameworks,O’ReillyMediaInc.,2004.

[9] E. V. Hippel, B. M. Hill, and K. Lakhani, Free open source re- search community. http://opensource.mit.edu/,2009.[Online;ac- cessed 21-october-2009].

[10] M. Kelly, Choosing a framework. http://www.ibm. com/developerworks/rational/library/591.html,2003.[Online;ac- cessed 19-november-2009].

[11] R. Lerdorf, K. Tatroe, and P. MacIntyre, Programming PHP, O’Reilly Media Inc., second ed., 2006.

[12] T. J. McCabe, A complexity measure,Tech.Rep.4,IEEE,1976.

87 BIBLIOGRAPHY BIBLIOGRAPHY

[13] R. Mugridge and W. Cunningham, Fit for Developing Software: Framework for Integrated Tests, Prentice Hall, 2005.

[14] R. Osherove, The Art of Unit Testing,ManningPublicationsCo.,209 Bruce park Avenue Greenwich, CT 06830, 2009.

[15] G. L. Steele and S. P. Harbison, C: A Reference Manual,Prentice Hall, fifth ed., 2002.

[16] S. Warden, Extreme programming, Pocket Guide,O’ReillyMediaInc., 2003.

[17] A. H. Watson and T. J. McCabe, Structured testing: A testing methodology using the cyclomatic complexity metric, tech. rep., NIST, 1996.

[18] Wikipedia, Profiling (computer programming) — wikipedia, the free encyclopedia. http://en.wikipedia.org/w/index.php?title= Profiling_(computer_programming),2009.[Online;accessed20- November-2009].

[19] Wikipedia, Software testing — wikipedia, the free encyclopedia. http: //en.wikipedia.org/wiki/Software_testing,2009.[Online;ac- cessed 11-January-2010].

88