

MASARYK UNIVERSITY
FACULTY OF INFORMATICS

Improving test coverage of GNU coreutils

BACHELOR THESIS

Andrej Antaš

Brno, 2012

Declaration

Hereby I declare, that this paper is my original authorial work, which I have worked out on my own. All sources, references and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Andrej Antaš

Advisor: RNDr. Petr Ročkai

Acknowledgement

I would like to thank my supervisor and Ondřej Vašík, who had the patience to walk me through the process of creating this work. My thanks also belong to my beloved family, friends and my loving girlfriend, who all stood by my side in times of need and never let me down.

Abstract

The aim of this bachelor thesis is to improve the test coverage of the GNU coreutils project in the Fedora distribution. The output of my work is a patch file with the changes made to the tests and reported bugs describing the problems discovered by the new and altered tests, which also improved the robustness of coreutils and brought the test coverage of these utilities near the level of the upstream coverage.

Keywords: coreutils, testing, coverage, gcov, fedora, patch, multi-byte

Contents

1 Introduction
2 GNU Core Utilities
  2.1 Brief history
  2.2 GNU File Utilities
  2.3 GNU Shell utilities
  2.4 GNU Text utilities
3 Testing
  3.1 Black box testing
  3.2 White box testing
  3.3 Coreutils test suite
4 Linux
  4.1 Operating system
  4.2 RPM Packaging system
  4.3 Fedora Project
  4.4 Upstream
  4.5 Downstream
5 Test coverage
  5.1 Statement Coverage
  5.2 Method Coverage
  5.3 Condition Coverage
  5.4 Branch Coverage
  5.5 Path coverage
6 Coverage tools
  6.1 Gcov
    6.1.1 Data format
    6.1.2 Usage
    6.1.3 Why gcov?
  6.2 Other used tools
    6.2.1 LCOV
    6.2.2 Genhtml
    6.2.3 Git
  6.3 Other coverage tools
    6.3.1 Trucov
    6.3.2 Testwell CTC++
    6.3.3 Bullseye Coverage
    6.3.4 KLEE
7 Test coverage retrieval
  7.1 Prerequisites
  7.2 Upstream test coverage
  7.3 Downstream test coverage
8 Comparison
  8.1 Cause of differences
    8.1.1 Patches
    8.1.2 coreutils-i18n.patch
    8.1.3 Affected utilities
  8.2 Multi-byte test patch
    8.2.1 Making patch
    8.2.2 Differences after multi-byte test patch
9 Conclusion
10 Attachment

1 Introduction

We as developers, and many companies, use UNIX-based operating systems for the development process and a collection of other activities. I want to talk about Linux, which belongs to this group and is distributed under the GNU General Public License [3], which basically means it is free to use even for companies and also comes with source code; it is therefore easy to alter in any way, so the system works for us and not we for the sake of the system.

Linux as an operating system is built mainly on packaging, which means that every single program/utility in the system is distributed as a package available for you to download from official and unofficial repositories. These are easily reachable places on the internet where packages are stored. This approach widens the opportunity for the user to customize the system.

There obviously are some packages that are common to all systems, like the kernel (the core of the system) and, for example, also coreutils. These core components include utilities to correctly manipulate the hardware and grant us the basic ability to use the computer. As I mentioned, one of these basic packages is the coreutils package, which is the one I was interested in for my bachelor thesis. Coreutils is the collection of utilities which gives us the basic capabilities for working with the file system, text and shell. This package is part of every Linux distribution, with some small divergences.

The availability of source code means that if a user finds a mistake in software he is using, he can easily download the package with the source code and, if he is able to, correct the mistake and submit it to the maintainers of the package, so they can revise this change and maybe include it in the next update. Therefore there are still many updates in distribution packages, but one problem arises with these.

The problem is that the policy for submitting new changes to downstream is not as strict and does not care as much about testing your changes as the upstream does (definitions of terms unknown to the reader can be found in chapter 4). Exactly this situation occurred in the coreutils package. A few bigger updates were introduced and dragged the test coverage of the source code rapidly down and also added potential problems. This was the thing that led me to the project, because on every occasion the main thing we as developers want is to avoid situations when harmful code is pushed into production.

Mistakes in source code are also called "bugs", and the best example that came to my mind, and probably the most expensive one in human history, is the explosion of the European rocket Ariane 5. The explosion occurred forty seconds after the launch and, what is the most interesting part, all because of one failing test said to be harmless. I mentioned it is the most expensive one, and that is because the damage was quantified at 370 million US dollars. [2] The reason why I mentioned this particular situation is that it is the most extreme demonstration of what could possibly be caused by a bug, but also that the exact mistake was a raised software exception during data conversion from a 64-bit floating point to a 16-bit signed integer value [2]. This is not quite the same as untested changes in the coreutils project, but it is hugely reminiscent of the situation, because my aim is the i18n patch which added multi-byte functionality to certain utilities in the package.

The thesis starts with a brief introduction to the coreutils project, which can be found in chapter two. After this handshake with the project, there is chapter three, going deeper into testing and its principles and also mentioning information about the tests of the coreutils project.

The next, fourth, chapter is about the Linux operating system and covers all the basics we need and all terms used, and also narrows our view to the Fedora Linux distribution. This testing-unrelated, but also essential, chapter is followed by the fifth chapter, which deals with test coverage and explains the most used and most important kinds of all existing ones.

The coverage part is logically followed by the tools which can retrieve these statistics for us. This is the sixth chapter, and it is further divided into a section describing the collection of tools I really used during my work on the thesis and a section with a selection of other alternatives that can be used for coverage retrieval.

The seventh chapter describes the preparations before coverage retrieval and then also my approaches to retrieving the coverage from the upstream source code, followed by the downstream one, and also shows the results of these procedures.

In the final, eighth, chapter the main cause of the differences in coverage is described and the coverages of the most affected utilities are shown and discussed. Then the creation of the patch that raises the distribution package test coverage follows, and lastly the bugs that these new test changes uncovered are listed, along with the comparison with the upstream coverage of these distinct utilities.

The implementation of the test changes was done in the Perl programming language; the utilities used for coverage retrieval are listed and described in chapter six, as I mentioned before. The output of my bachelor thesis is a patch with all test changes, and I also reported the bugs found after running these altered tests.

2 GNU Core Utilities

The GNU Core Utilities are the basic file, shell and text manipula- tion utilities within the GNU operating systems. These are the core utilities which are expected to exist on every UNIX based operating system.[4]

2.1 Brief history

The coreutils project originates from UNIX utilities whose development started in the early nineties, and it has remained active until now, which is a very good sign for an open source project. During the whole development process, more than a hundred programmers contributed to the project, with 28 of them still being active nowadays. The coreutils package is the combination of and replacement for the fileutils, sh-utils and textutils packages. It began as the union of these revisions: fileutils-4.1.11, textutils-2.1 and sh-utils-2.0.15 in August 2002. [10] The first major stable release (coreutils-5.0) was published in April 2003 and the project has gone through releases up to version 8.15 since then. The maintainers of the project are Jim Meyering, Pádraig Brady, Eric Blake and Paul Eggert, with Jim Meyering being the contributor of more than 87% of the whole 110 000 lines of code now contained in the project. [5]

2.2 GNU File Utilities

A collection of basic file manipulation utilities such as cp, ls, mkdir, mv, rm and others. Along with the other two packages of a similar basic purpose it was combined into coreutils, and since then only the GNU Core Utilities project is maintained. [6]

2.3 GNU Shell utilities

A collection of basic shell commands that appear in each and every GNU operating system, such as echo, pwd, who and many others. This is the second package which was combined into GNU Core Utilities and is no longer maintained as a standalone project. [7]

2.4 GNU Text utilities

A collection of basic text manipulation utilities such as cat, cut, expand, sort, uniq, wc and others. This is the last package that was combined, along with the two mentioned above, into GNU Core Utilities. [8]

3 Testing

One of the major parts of any program is its tests. A wise man once said "testing shows the presence, not the absence of bugs" [9]. The man was Edsger Dijkstra and he was obviously trying to encourage us to look for mistakes in our code by testing it, but not to expect that there are none if all tests are passing, because there surely is an untested piece of code that fails. A good developer always writes tests for every feature he added or altered and, what is more important, with this approach he tends to write correct code.

Definition 3.0.1. Testing is the process of analyzing a software item to detect the differences between existing and required conditions (that is, bugs) and to evaluate the features of the software items. [1]

Every test suite for a certain program includes one or more test scenarios and an even bigger number of test cases.

Definition 3.0.2. A test suite in software development, commonly known as a validation suite, is a collection of test cases that are intended to be used to test a software program to show that it has some specified set of behaviours. A test suite often contains detailed instructions or goals for each collection of test cases and information on the system configuration to be used during testing. A group of test cases may also contain prerequisite states or steps, and descriptions of the following tests. 1

Definition 3.0.3. Test scenario is a set of test cases that ensure that the business process flows are tested from end to end. They may be independent tests or a series of tests that follow each other, each dependent on the output of the previous one. The terms "test scenario" and "test case" are often used synonymously. 2

Definition 3.0.4. Test case is a set of test data and test programs (test scripts) and their expected results. A test case validates one or more system requirements and generates a pass or fail. 3

1. http://dictionary.babylon.com/testsuite/
2. http://computer.yourdictionary.com/test-scenario
3. http://computer.yourdictionary.com/test-case


The most important result of testing a piece of software is its verification and validation.

Definition 3.0.5. Verification is the process of evaluating a system or component to determine whether the products of a given development phase satisfy the conditions imposed at the start of that phase. [1]

Definition 3.0.6. Validation is the process of evaluating a system or component during or at the end of the development process to determine whether it satisfies specified requirements. [1]

There are two main points of view by which we divide testing into black box testing and white box testing. They describe how the developers and testers look at the software and how they approach it. We further divide tests into six subtypes, which include acceptance, integration, unit, regression, functional/system and beta testing, as divided and described below.

3.1 Black box testing

Definition 3.1.1. Black box testing (also called functional testing or behavioral testing) is testing that ignores the internal mechanism of a system or component and focuses solely on the outputs generated in response to selected inputs and execution conditions. [1]

This type of testing takes into account nothing but software requirements. The whole source code is a black box and should not be reachable for testers. They expect outputs from given inputs on the basis of test cases and the defined behaviour of the software from the specification. Every point/requirement in the specification should have at least one test case; such a situation is called a minimal black box test suite.

The results of black box testing, mainly failing tests, do not tell us where the problem really occurred, because the tests have no visibility of the source code, but they do alert us that there is a possible problem, because software requirements were somehow not met.

Four of the six subtypes of testing belong here: acceptance, functional/system, beta and partly also integration testing.


Definition 3.1.2. Acceptance testing is a formal testing conducted to enable a user, customer, or other authorized entity to determine whether to accept a system or component. [1]

Definition 3.1.3. Functional testing is a testing that ignores the internal mechanism of a system or component and focuses solely on the outputs generated in response to selected inputs and execution conditions. [1]

Definition 3.1.4. Beta testing comes after alpha testing and can be considered a form of external user acceptance testing. Versions of the software, known as beta versions, are released to a limited audience outside of the programming team. The software is released to groups of people so that further testing can ensure the product has few faults or bugs. Sometimes, beta versions are made available to the open public to increase the feedback field to a maximal number of future users. [11]

Definition 3.1.5. Integration testing is testing in which software components, hardware components, or both are combined and tested to evaluate the interaction between them. [1]

3.2 White box testing

Definition 3.2.1. White box testing (also called structural testing) takes into account the internal mechanism of a system or component. [1]

It is also known as structural testing, clear box testing, or glass box testing. In this type of approach, tests are run with known inputs and the expected results are also already calculated, and we check whether the run meets our results or not; that way we can tell if the test failed or succeeded and whether the feature is coded defectively or correctly.

The results of white box testing, the most important being failing tests, show us exactly which lines were involved in the mistake and therefore, unlike black box testing, give us a bigger opportunity to correct our code and improve its robustness.

Three of the six subtypes of testing use this approach and those are unit, integration, and regression testing.

Definition 3.2.2. Unit testing is testing of individual hardware or software units or groups of related units. [1]


Definition 3.2.3. Integration testing (see Definition 3.1.5)

Definition 3.2.4. Regression testing is selective retesting of a system or component to verify that modifications have not caused unintended effects and that the system or component still complies with its specified requirements.[1]

3.3 Coreutils test suite

Because of the strict regulations of upstream, the Fedora distribution coreutils are very well tested, but still, people contributing to distribution patches (for an explanation of terms unknown to the reader, see chapter 4) test their code changes poorly or not at all. The biggest changes, in the sense of testing the downstream coreutils, were disabled tests: tests that were failing and there was nobody who had enough time to solve the situation, so the tests got disabled in order for the compilation not to throw errors at users. There are of course exceptions; for example, the i18n patch added a test for the sort utility, but it is obviously insufficient.

To sum it up, nowadays the coreutils test suite contains almost eight hundred tests altogether; some of them are not run and some need environment variables to be set properly (see chapter 7). All these tests have a clear structure and are divided into two parts. There is the GNU gnulib library and its unit tests (more than 290 tests), and the second part is the regression tests of coreutils itself (circa 480 tests). The coreutils test suite then contains a few levels of test difficulty: first are those run every time, but then there are the expensive and very expensive tests, which are run only after environment variables are set properly and are otherwise disabled because of their time and resource consumption. Other types of tests disabled by default are those that require root privileges or membership in a certain group, and these can also be run with the proper configuration. Tests are written mainly in the form of shell scripts (circa 80 %), complemented with a small number of Perl scripts.
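
To give a feel for how these groups of tests are enabled, the following sketch shows typical invocations from a built source tree; the variable names come from the coreutils test suite itself, and the exact commands used for the coverage retrieval are shown in chapter 7.

# default set of tests
make check

# additionally enable the expensive and very expensive tests
RUN_EXPENSIVE_TESTS=yes RUN_VERY_EXPENSIVE_TESTS=yes make check

# tests requiring root privileges have to be run as root
sudo make check

# a single test can be run on its own (the form used in section 8.2.1)
make check TESTS=test/name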

4 Linux

4.1 Operating system

Linux is a UNIX-like computer operating system assembled under the model of free and open source software development and distribution. The defining component of Linux is the Linux kernel, an operating system kernel first released on October 5, 1991 by Linus Torvalds. [13]

Linux was originally developed as a free operating system for Intel x86-based personal computers. It has since been ported to many computer hardware platforms. It is a leading operating system on servers and other big iron systems such as mainframe computers and supercomputers: more than 90% of today's top 500 supercomputers run some variant of Linux, including the 10 fastest. [14] Linux also runs on embedded systems (devices where the operating system is typically built into the firmware and highly tailored to the system) such as mobile phones, tablet computers, network routers, televisions and video game consoles. The Android system in wide use on mobile devices is built on the Linux kernel.

The development of Linux is one of the most prominent examples of free and open source software collaboration: the underlying source code may be used, modified, and distributed, commercially or non-commercially, by anyone under licenses such as the GNU General Public License [3]. Typically Linux is packaged in a format known as a Linux distribution for desktop and server use. Some popular mainstream Linux distributions include Debian (and its derivatives such as Ubuntu), Fedora and openSUSE. Linux distributions include the Linux kernel, supporting utilities and libraries and usually a large amount of application software to fulfil the distribution's intended use.

A distribution oriented toward desktop use will typically include the X Window System and an accompanying desktop environment such as GNOME or KDE Plasma. Some such distributions may include a less resource intensive desktop such as LXDE or Xfce for use on older or less powerful computers. A distribution intended to run as a server may omit all graphical environments from the standard install and instead include other software such as the Apache HTTP Server and an SSH server such as OpenSSH. Because Linux is freely redistributable, anyone can create a distribution for any intended use. Applications commonly used with desktop Linux systems include the Mozilla Firefox web browser, the LibreOffice office application suite, and the GIMP image editor.

Since the main supporting system tools and libraries originated in the GNU Project, initiated in 1983 by Richard Stallman, the Free Software Foundation prefers the name GNU/Linux. [12]

4.2 RPM Packaging system

RPM Package Manager (RPM) is a package management system. The name RPM variously refers to the .rpm file format, files in this format, software packaged in such files, and the package manager itself. RPM was intended primarily for GNU/Linux distributions; the file format is the baseline package format of the Linux Standard Base. [15]

While RPM was originally written in 1997 by Erik Troan and Marc Ewing [16] for use in Red Hat Linux, RPM is now used in many GNU/Linux distributions. It has also been ported to some other operating systems, such as NetWare (as of version 6.5 SP3) and IBM's AIX as of version 4.

Whereas an RPM typically contains the compiled version of the software, an SRPM contains either the source code corresponding to that RPM or the scripts of a non-compiled software package. Originally standing for Red Hat Package Manager, RPM now stands for "RPM Package Manager", a recursive abbreviation. [15]
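
As a small illustration of the difference between a binary RPM and a source RPM, the rpm tool can query both kinds of packages; the file name below is only an example, the real one depends on the exact build you download.

# information about and file list of an installed binary package
rpm -qi coreutils
rpm -ql coreutils

# the same queries on a downloaded source package, without installing it
rpm -qpi coreutils-8.15-1.fc16.src.rpm
rpm -qpl coreutils-8.15-1.fc16.src.rpm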

4.3 Fedora Project

Fedora is one of the many distributions of the Linux operating system. It is absolutely free and you can use it as an addition to some other operating system or as a standalone system. It is quite a big collection of free (open source) software which makes your computer or notebook usage more efficient and easier to master.

Red Hat invests a lot of time and resources into the development of this operating system to encourage other contributors to collaborate with them more, because this product is free of charge on the internet for everybody and also everybody can contribute, which is basically the main principle of the whole open source world. My bachelor thesis was also made in collaboration with Red Hat and I used the Fedora 16 operating system to prove the usability of this distribution in a real life application.

4.4 Upstream

Definition 4.4.1. In free and open source projects, the upstream of a program or set of programs is the project that develops those programs. Fedora is downstream of those projects. This term comes from the idea that water and the goods it carries float downstream and benefit those who are there to receive it. [17]

4.5 Downstream

In a common sense, downstream is a modification of an upstream project's source code to suit the actual needs of a customer (in many cases the customer is the user). The process of modification of the code starts with taking the clean upstream code and adding patches to it, which modify only the functionality that is really necessary. Downstream is, however, a relative term in the sense that you can never say that one project simply is a downstream: the project is a downstream of the upstream one it modifies, but it can also be upstream for other projects. As a real life example, we can look at Fedora. Fedora is a downstream of the thousands of upstream projects it contains, but it is also upstream for Red Hat Enterprise Linux.

Definition 4.5.1. Patch is a modification made directly to an object program without reassembling or recompiling from the source program. [1]

Definition 4.5.2. Patch is a modification made to a source program as a last-minute fix or afterthought. [1]

Definition 4.5.3. Patch is a modification to a source or object program. [1]


The common thing all three definitions want to tell us, and every definition mentions it, is the word modification, which is so important. A patch is basically a text file which does not contain the whole modified source code of the project, but just those lines that were altered, with clearly marked files and line numbers for processing through the patch utility (for more information on creating a patch file see section 8.2). The patch utility applies these patch files (also called diff files, because only differences are stored in them) on the original files, altering the changed files listed in the patch file.

A patch is also the usual form of a change to the upstream code: in the majority of cases it is added by some contributor and then it awaits the review of the package maintainer, and after his review it can be pushed into the official code repository. After gathering enough changes, solutions of problems and bug fixes, the maintainers can push a new stable release to production (for public usage).

There are also other distribution divergences, apart from patches, which include non-default configure options or different compiler flags. You can also see them in the seventh chapter when you compare section 7.2 with section 7.3.
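
To illustrate the mechanics only (the file and directory names here are made up for the example), a patch file is typically produced by comparing a pristine and a modified source tree with diff and applied on top of a clean tree with the patch utility:

# create a patch: record the differences between the two trees
diff -urN coreutils-8.15.orig coreutils-8.15 > coreutils-example.patch

# apply it to a clean tree; -p1 strips the leading directory component
# from the file names stored in the patch
cd coreutils-8.15.orig
patch -p1 < ../coreutils-example.patch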

5 Test coverage

Definition 5.0.4. Test coverage is the degree to which a given test or set of tests addresses all specified requirements for a given system or component. [1]

Coverage, and in our case test coverage, is the way we find out how well our code is tested, which methods are really being executed and which lines are used when running the code. The coverage also tells us how well we are fulfilling our software requirements. There are five types of coverage according to the approach we choose.

5.1 Statement Coverage

Statement coverage is the most basic form of code coverage. A statement is covered if it is executed. Note that a statement does not necessarily correspond to a line of code, although the terms line and statement coverage are used as synonyms in many cases. Multiple statements on a single line can confuse issues - the reporting if nothing else. Where there are sequences of statements without branches it is not necessary to count the execution of every statement, just one will suffice, but people often like the count of every line to be reported anyway, especially in summary statistics.

This type of coverage is relatively weak in that even with 100% statement coverage there may still be serious problems in a program which could be discovered through the use of other metrics. Even so, the first time that statement coverage is used in any reasonably sized development effort it is very likely to show up some bugs.

It can be quite difficult to achieve 100% statement coverage. There may be sections of code designed to deal with error conditions, or rarely occurring events such as a signal received during a certain section of code. There may also be code that should never be executed:

if ($param > 20)
{
    die "This should never happen!";
}


It can be useful to mark such code in some way and flag an error if it is executed. [19]

5.2 Method Coverage

Method coverage is a measure of the percentage of methods that have been executed by test cases. Undoubtedly, your tests should call 100% of your methods. It seems irresponsible to deliver methods in your product when your testing never used these methods. As a result, you need to ensure you have 100% method coverage.

int foo (int a, int b, int c, int d, float e) {
    float e;

    if (a == 0) {
        return 0;
    }

    int x = 0;

    if ((a==b) OR ((c == d) AND bug(a) )) {
        x=1;
    }

    e = 1/x;
    return e;
}

In the code shown above, we attain 100% method coverage by calling the foo method. Consider the test case of the method call foo(0, 0, 0, 0, 0.), with an expected return value of 0. If you look at the code, you see that if a has a value of 0, it does not matter what the values of the other parameters are - so we'll make it really easy and make them all 0. Through this one call we attain 100% method coverage. [19]

5.3 Condition Coverage

When a boolean expression is evaluated it can be useful to ensure that all the terms in the expression are exercised. For example:


a if $x || $y;

To achieve full condition coverage, this expression should be evaluated with $x and $y set to each of the four combinations of values they can take.

In Perl, as is common in many software programming languages, most boolean operators are short-circuiting operators. This means that the second term will not be evaluated if the value of the first term has already determined the value of the whole expression. For example, when using the || operator the second term is never evaluated if the first evaluates to true. This means that for full condition coverage there are only three combinations of values to cover instead of four.

Condition coverage gets complicated, and difficult to achieve, as the expression gets complicated. For this reason there are a number of different ways of reporting condition coverage which try to ensure that the most important combinations are covered without worrying about less important combinations.

Expressions which are not part of a branching construct should also be covered:

$z = $x || $y;

Condition coverage is also known as expression, condition-decision and multiple decision coverage. [20]

5.4 Branch Coverage

The goal of branch coverage is to ensure that whenever a program can jump, it jumps to all possible destinations. The simplest example is a complete if statement:

if ($x)
{
    print "a";
}
else
{
    print "b";
}


Full coverage is achieved here only if $x is true on one occasion and false on another. Achieving full branch coverage will protect against errors in which some requirements are not met in a certain branch. For example:

if ($x)
{
    $h = { a => 1 };
}
else
{
    $h = 0;
}
print $h->{a};

This code will fail if $x is false (and you are using strict refs). In such a simple example statement coverage is as powerful, but branch coverage should also allow for the case where the else part is missing, and in languages which support the construct, switch statements should be catered for:

$h = 0;
if ($x)
{
    $h = { a => 1 };
}
print $h->{a};

100% branch coverage implies 100% statement coverage. Branch coverage is also called decision, arc or all edges coverage.[20]

5.5 Path coverage

There are classes of errors which branch coverage cannot detect, such as:

$h = 0;
if ($x)
{
    $h = { a => 1 };
}
if ($y)
{
    print $h->{a};
}


100% branch coverage can be achieved by setting ($x, $y) to (1, 1) and then to (0, 0). But if we have (0, 1) then things go wrong. The purpose of path coverage is to ensure that all paths through the program are taken.

In any reasonably sized program there will be an enormous number of paths through the program and so in practice the paths can be limited to those within a single subroutine, if the subroutine is not too big, or simply to two consecutive branches. In the above example there are four paths which correspond to the truth table for $x and $y. To achieve 100% path coverage they must all be taken. Note that missing elses count as paths.

In some cases it may be impossible to achieve 100% path coverage:

a if $x;
b;
c if $x;

50% path coverage is the best you can get from the code above. Ideally, the code coverage tool you are using will recognise this and not complain about it, but unfortunately we do not live in an ideal world. And anyway, solving this problem in the general case requires a solution to the halting problem.

100% path coverage also implies 100% branch coverage. Path coverage and some metrics close to it are also known as predicate, basis path and Linear Code Sequence And Jump coverage. [20]

6 Coverage tools

6.1 Gcov

Gcov is a test coverage program. Use it in concert with GCC to analyze your programs to create more efficient, faster running code and to discover untested parts of your program. You can use gcov as a profiling tool to help discover where your optimization efforts will best affect your code. You can also use gcov along with the other profiling tool, gprof, to assess which parts of your code use the greatest amount of computing time. [21]

6.1.1 Data format

As the output of running gcov, a .gcov file is generated with data in the following format. The .gcov files contain ':' separated fields along with the program source code. The format is:

execution_count:line_number:source line text

Additional block information may succeed each line, when requested by a command line option. The execution_count is '-' for lines containing no code. Unexecuted lines are marked '#####' or '=====', depending on whether they are reachable by non-exceptional paths or only exceptional paths such as C++ exception handlers, respectively. [22] There are also some lines at the beginning with line_number zero, which are the preamble lines and have the following format:

-:0:tag:value

The .gcov file can also contain other human readable information in a simple format, understandable also for a machine to parse, in the format:

tag information


Regarding percentages: 0% and 100% are printed only if the values are exactly so; all other values are not rounded as we would expect, but are printed as the nearest possible non-boundary value.

6.1.2 Usage

The main part of using gcov is compilation of the source code files with two special CFLAGS, which are:

-fprofile-arcs -ftest-coverage

After this procedure, not only is the compiled program the output, but also a .gcno file is created for each source file. As with most Linux bash utilities/programs, gcov is invoked with additional options and the desired file to run it on, like this:

gcov [options] file/s

These options, which can be put right after the gcov keyword, give us more flexibility and information about the program we are testing. I will demonstrate just the two most used and useful ones, which are -a and -b. [22] As the example I will use a small piece of C code, stored for compilation and other work in hello_world.c, as written below:

#include <stdio.h>

int main (void) {
    int i, sum;

    sum = 0;

    for (i = 0; i < 11; i++)
        sum +=i;

    if (sum != 55) {
        printf ("Error occurred\n");
    } else {
        printf ("Hello World\n");
    }
}

The first step after writing the code is to compile it with the two CFLAGS mentioned before:

gcc -o hello_world -fprofile-arcs -ftest-coverage hello_world.c

This command will not only compile the code for us, but it also creates the additional profiling information needed by gcov in the form of a .gcno file, which is situated in the same directory as our compiled file.

Definition 6.1.1. The .gcno file is generated when the source file is compiled with the GCC -ftest-coverage option. It contains information to reconstruct the basic block graphs and assign source line numbers to blocks. [23]

Then running the program will generate a profile output in the form of a .gcda file, also situated in the same directory as our program.

Definition 6.1.2. The .gcda file is generated when a program containing object files built with the GCC -fprofile-arcs option is executed. A separate .gcda file is created for each object file compiled with this option. It contains arc transition counts, and some summary information. [23]

Therefore we run our compiled piece of code (later on, when I retrieve the coverage, the tests are run to generate the desired .gcda files for each source file):

./hello_world

We can use gcov for the first time without any options and we will get the output like this:

gcov hello_world.c

File 'hello_world.c'
Lines executed:87.50% of 8
hello_world.c:creating 'hello_world.c.gcov'


And now the .gcov file has been generated for us and contains this information (see subsection 6.1.1 for an explanation of the output):

        -:    0:Source:hello_world.c
        -:    0:Graph:hello_world.gcno
        -:    0:Data:hello_world.gcda
        -:    0:Runs:1
        -:    0:Programs:1
        -:    1:#include <stdio.h>
        -:    2:
        1:    3:int main (void) {
        -:    4:    int i, sum;
        -:    5:
        1:    6:    sum = 0;
        -:    7:
       12:    8:    for (i = 0; i < 11; i++)
       11:    9:        sum +=i;
        -:   10:
        1:   11:    if (sum != 55) {
    #####:   12:        printf ("Error occurred\n");
        -:   13:    } else {
        1:   14:        printf ("Hello World\n");
        -:   15:    }
        1:   16:}

When run with the --all-blocks option it gives us information as in the definition that follows:

Definition 6.1.3. The -a option writes individual execution counts for every basic block. Normally gcov outputs execution counts only for the main blocks of a line. With this option you can determine if blocks within a single line are not being executed. [22]

The command to run it and the output:

gcov -a hello_world.c

File 'hello_world.c'
Lines executed:87.50% of 8
hello_world.c:creating 'hello_world.c.gcov'


And then the .gcov file gives us more explicit information:

        -:    0:Source:hello_world.c
        -:    0:Graph:hello_world.gcno
        -:    0:Data:hello_world.gcda
        -:    0:Runs:1
        -:    0:Programs:1
        -:    1:#include <stdio.h>
        -:    2:
        1:    3:int main (void) {
        -:    4:    int i, sum;
        -:    5:
        1:    6:    sum = 0;
        -:    7:
       12:    8:    for (i = 0; i < 11; i++)
        1:    8-block  0
       11:    8-block  1
       12:    8-block  2
       11:    9:        sum +=i;
        -:   10:
        1:   11:    if (sum != 55) {
        1:   11-block  0
    #####:   12:        printf ("Error occurred\n");
    $$$$$:   12-block  0
        -:   13:    } else {
        1:   14:        printf ("Hello World\n");
        1:   14-block  0
        -:   15:    }
        1:   16:}
        1:   16-block  0

In this mode, each basic block is only shown on one line - the last line of the block. A multi-line block will only contribute to the execution count of that last line, and other lines will not be shown to contain code, unless previous blocks end on those lines. The total execution count of a line is shown and subsequent lines show the execution counts for individual blocks that end on that line. After each block, the branch and call counts of the block will be shown, if the -b option is given. [22]

With the knowledge of this we can use the second option to check if the results are as in the definition given.


Definition 6.1.4. The -b option writes branch frequencies to the output file, and writes a branch summary to the standard output. This option allows you to see how often each branch in your program was taken. Unconditional branches will not be shown, unless the -u option is given.

The command and the command line output look like this:

gcov -b hello_world.c

File 'hello_world.c'
Lines executed:87.50% of 8
Branches executed:100.00% of 4
Taken at least once:75.00% of 4
Calls executed:50.00% of 2
hello_world.c:creating 'hello_world.c.gcov'

And in the newly generated .gcov file, we find these lines:

        -:    0:Source:hello_world.c
        -:    0:Graph:hello_world.gcno
        -:    0:Data:hello_world.gcda
        -:    0:Runs:1
        -:    0:Programs:1
        -:    1:#include <stdio.h>
        -:    2:
function main called 1 returned 100% blocks executed 86%
        1:    3:int main (void) {
        -:    4:    int i, sum;
        -:    5:
        1:    6:    sum = 0;
        -:    7:
       12:    8:    for (i = 0; i < 11; i++)
branch  0 taken 92%
branch  1 taken 8% (fallthrough)
       11:    9:        sum +=i;
        -:   10:
        1:   11:    if (sum != 55) {
branch  0 taken 0% (fallthrough)
branch  1 taken 100%
    #####:   12:        printf ("Error occurred\n");
call    0 never executed
        -:   13:    } else {
        1:   14:        printf ("Hello World\n");
call    0 returned 100%
        -:   15:    }
        1:   16:}

Right before the main function declaration, there is a human readable line with the information on how many times the function was called, the percentage of calls that returned and the percentage of its blocks that were executed.

For each basic block, a line is printed after the last line of the basic block, describing the branch or call that ends the basic block. There can be multiple branches and calls listed for a single source line if there are multiple basic blocks that end on that line. In this case, the branches and calls are each given a number. There is no simple way to map these branches and calls back to source constructs. In general, though, the lowest numbered branch or call will correspond to the leftmost construct on the source line. [22]

For a branch, if executed, the percentage indicating the number of times it was taken divided by the number of times it was executed is printed, or simply the information "never executed". For a call, if executed, the percentage indicating the number of times it returned divided by the number of times it was executed is printed. It is usually 100%, but if the function calls exit or longjmp, the number may vary, because these do not return every time they are called.

This was just a simple overview of basic work with gcov; of course there are many other options, but these are the most interesting ones to show on the simple example given and to demonstrate the usage of gcov at all.

6.1.3 Why gcov?

My choice of using gcov for test coverage is not only based on the task I was given and which I am trying to accomplish in my thesis, but has many more reasons. First of all, gcov is a standard part of the GNU operating systems and my operating system (Fedora) is one of them. Then, almost the whole coreutils source code is written in the C programming language and the gcov utility is built exclusively to suit the GNU GCC compiler, which has spared me a lot of problems with the configuration of some third party utility.

Another reason is that this tool is open source, which means that if there are any problems, I can solve them by altering the original code; and, most importantly for this thesis, the utility is free to use, unlike commercial software which I would have to buy, or exchange many emails with the company which made it so they would grant me permission to use their tool for educational purposes. Last but not least is the simple usage of the gcov utility and its practically automatic configuration.

6.2 Other used tools

6.2.1 LCOV

LCOV is a graphical front-end for GNU GCC's coverage testing tool gcov. It collects gcov data for multiple source files and, in harmony with the genhtml utility (mentioned later on), creates HTML pages containing the source code annotated with coverage information. It also adds overview pages for easy navigation within the file structure. LCOV supports statement, function and branch coverage measurement. [24]

6.2.2 Genhtml

Genhtml is a tool that transforms coverage data found in trace files into HTML pages readable and understandable for the general public. At the end of the whole process of generating the coverage using these tools, the coverage looks like figure 6.1; you can also go through individual files and see exactly which lines are covered and which are not, as shown in figure 6.2.

6.2.3 Git

Git is an open source, distributed version control system designed to handle everything from small to very large projects with speed and


Figure 6.1: Generated test coverage

Figure 6.2: Exact line coverage

efficiency. Every git clone is a full-fledged repository with complete history and full revision tracking capabilities, not dependent on network access or a central server. Branching and merging are fast and easy to do. [25]

I used git repositories hosted on bitbucket.org (it offers private repositories free of charge, unlike other hostings) to store my thesis and all the materials I encountered while working on it, but mainly for the patch retrieval, which is as simple as it can be.
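
A minimal sketch of that workflow (the commit messages are only illustrative; the exact patch-creating command is shown at the end of section 8.2.1):

# store the pristine sources as the first commit
git init
git add .
git commit -m "unchanged coreutils-8.15 sources"

# ... alter the tests, committing along the way ...
git commit -am "multi-byte test changes"

# the difference between two commits becomes the patch file
git diff commit_id_first commit_id_finished > coreutils-mb-test.patch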

6.3 Other coverage tools

There is of course not only one tool to retrieve the test coverage of C/C++ code, and in this section I briefly mention a few others that are also popular in the community and of course free to use or even open source.

6.3.1 Trucov

Trucov is an open source program that works with the GCC compiler to display the control flow of a program and its test coverage information. It assists developers to ensure their test cases have sufficient coverage. Trucov simplifies and formalizes the way in which developers identify untested code in a simple to identify, visual way. [29]

6.3.2 Testwell CTC++

Testwell CTC++ is a powerful instrumentation-based code coverage and dynamic analysis tool for C and C++ code. As a code coverage tool, CTC++ shows the coverage all the way to the Modified Condition/Decision Coverage (MC/DC) level as required by DO-178B projects. As a dynamic analysis tool, CTC++ shows the execution counters in the code, i.e. more than plain boolean coverage information. You can also use CTC++ to measure function execution costs (normally time) and to enable function entry/exit tracing at test time. [30]


6.3.3 Bullseye Coverage

BullseyeCoverage is a code coverage analyzer for C++ and C that tells you how much of your source code was tested. You can use this information to quickly focus your testing effort and pinpoint areas that need to be reviewed. Code coverage analysis is useful during unit testing, integration testing, and final release. [31]

6.3.4 KLEE

KLEE is a symbolic execution tool, capable of automatically generating tests that achieve high coverage on a diverse set of complex and environmentally-intensive programs. [33]

KLEE looks at the code in another, more complex way than the other tools presented. Instead of running code on manually- or randomly-constructed input, it runs it on symbolic input initially allowed to be "anything". It substitutes program inputs with symbolic values and replaces the corresponding concrete program operations with ones that manipulate symbolic values. When program execution branches based on a symbolic value, the system (conceptually) follows both branches, maintaining on each path a set of constraints called the path condition which must hold on execution of that path. When a path terminates or hits a bug, a test case can be generated by solving the current path condition for concrete values. Assuming deterministic code, feeding this concrete input to a raw, unmodified version of the checked code will make it follow the same path and hit the same bug. [33]

7 Test coverage retrieval

This chapter is dedicated to the know-how of getting the coverage for our code. The most important statement to make at the beginning is that coreutils, like all software, is going through development. I am therefore making this whole thesis for coreutils version 8.15; although there is a newer version already, my thesis was in an advanced stage when it was released, and since it does not affect the results of the thesis and the points I am trying to prove in it at all, I kept on doing my work with version 8.15.

7.1 Prerequisites

These prerequisites apply only to coverage retrieval; there is no need for other tools than gcc for running the tests on your own machine. In order to use gcov in combination with lcov and genhtml to generate nice HTML output of the coverage data, you surely need more than gcc. You may not need all the packages listed here, but for the clarity of the text I will mention all that are required. All tests and coverage data retrieval were done on my Fedora 16 with kernel version 3.3.1-5.fc16.x86_64, therefore in the guides here yum is always used as the package manager.

The first step before we can really go into coverage retrieval is to get the source code of coreutils. There are several possible ways to do this depending on what you are going to retrieve your coverage data from. In my case, first of all ensure you have the git package installed by simply calling:

git --version

You should see the version of git; if not, just run:

yum install git

Now you are ready to download the upstream source code, which contains a README-prereq file that can be found in the root of the source code. It contains detailed descriptions of how to download and compile a few necessary tools, like [26], [27] and xz [28]. To download the upstream source from git:

git clone git://git.savannah.gnu.org/coreutils.git

However, at the end it is written that now you can build the package. You can, but it is not sufficient for us if we want to retrieve the coverage. Additional packages are needed:

sudo yum install valgrind valgrind-devel python-inotify
sudo yum install perl-Expect expectk perl-Test-Expect
sudo yum install strace expect expect-devel

And now we are ready to retrieve the coverage info from the coreutils package.

7.2 Upstream test coverage

For better comparison, I will guide you through the process of getting the coverage from the tarball of upstream coreutils. The tarball is the name for the .tar.gz package in which the source code is packed and distributed. In our case, the GNU coreutils project uses .tar.xz packaging for compatibility and better compression. Here is the address from which to download the upstream coreutils:

http://ftp.gnu.org/gnu/coreutils/coreutils-8.15.tar.xz

After downloading the tarball we need to export C flags, so the compiler knows how to properly create the makefile and the coverage data can be gathered later on, and then run configure this way:

export CFLAGS="-fprofile-arcs -ftest-coverage"

./configure

Now we are ready to make the coreutils, which is done by the make command:

make


And as we have the package built, we can now run the tests; during their run the necessary files for coverage retrieval are created. Because we want to run as many tests as possible, first run all tests with the expensive and very expensive tests enabled via environment variables with this command:

RUN_EXPENSIVE_TESTS=yes RUN_VERY_EXPENSIVE_TESTS=yes make check

Be aware that this procedure can take more than 20 minutes depending on your computer's performance. After all the tests are run, we want to make sure we do not lose anything, so we use lcov to gather the coverage data from this state. We do it separately for the subdirectories of the source code, so the commands look like this:

lcov -c -o coverage_src_env.info -d src -b src
lcov -c -o coverage_lib_env.info -d lib -b lib

So far some tests were skipped because of the root privileges they require, therefore we will run the whole test suite once more to run those missing tests with this command:

sudo make check

And now we once more gather the coverage data, but this time we change the -o option, which is the output file name:

lcov -c -o coverage_src.info -d src -b src
lcov -c -o coverage_lib.info -d lib -b lib

Finally, as genhtml can take whole collections of tracefiles to generate nice HTML output of the coverage, we take advantage of that and instead of running it four times, we run it once on the collection of *.info tracefiles this way:

genhtml -t coreutils -o coverage *.info

Now the coverage in nice HTML output is situated in the coverage directory in the root of the coreutils source code. When you open it, it looks like this:


Figure 7.1: Upstream coverage

7.3 Downstream test coverage

As we did the coverage retrieval of the upstream test suite from the tarball, we also need to do this with the downstream, the Fedora coreutils one. There are a few ways to obtain the code. I will explain just the simplest one, and that is getting the patched source files from the source RPM. The source RPM can be found at:

http://koji.fedoraproject.org/koji/buildinfo?buildID=305779

This is the build corresponding to the upstream tarball. The source RPM is the one with the suffix *.src.rpm. We need it because this package contains everything needed to recreate not only the program and associated files that are contained in the binary package file, but the binary and source package files themselves. [32]

Now we just need the source code out of the whole package, and here too we can obtain it in several ways; either way, the main aim of retrieving it is to apply the downstream patches, so this is the way I did it. First we need to make sure that our system contains the rpm-build package, which will be used:

rpmbuild --version

If you see the version, you do not need to worry; if not, then install the package with this command:


sudo yum install rpm-build

After the successful installation/check of the rpm-build package, we need to create the directory structure for the operation we want to perform with the package, which is extracting the patched source code.

rpmdev-setuptree

This will create all the required folders for us in the home directory. Now we just need to unpack the source RPM and patch it. First, open the *.src.rpm file with an archive manager and extract the coreutils.spec file into the directory ~/rpmbuild/SPECS/, then put all the other files from the archive into ~/rpmbuild/SOURCES/, and we are ready to compile, run tests and retrieve the coverage. The main thing we need now is to patch the sources, which can be easily done this way:

rpmbuild -bp ~/rpmbuild/SPECS/coreutils.spec

Now when we look into ~/rpmbuild/BUILD/ there is a folder called coreutils-8.15, and that is the patched source code we wanted. At this point we are ready to build and run the tests to gather coverage data for the HTML output. Before all other commands, we need to export the C and CPP flags for the proper configuration with these commands:

export CFLAGS=" -fno-strict-aliasing -fpic -D_GNU_SOURCE=1 -O2 -g -pipe - -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector -fprofile-arcs -ftest-coverage"

export CPPFLAGS="$CPPFLAGS -DUSE_PAM"

The next thing we need to get right is to create the configuration files for the various utilities used to compile the code (touch is used for the correctness of the timestamps):

touch aclocal.m4 configure config.hin Makefile.in */Makefile.in
aclocal -I m4

autoconf --force
automake --add-missing

And now we are ready to configure:

./configure --enable-largefile --enable-pam --enable-selinux --with--group --enable-install-program=su,hostname,arch DEFAULT_POSIX2_VERSION=200112 alternative=199209

And after all this we just build the package and run the tests as in the previous section on upstream coverage, and gather the coverage data with lcov. Since the whole procedure is described in detail at the end of section 7.2, I recommend following it from there; a condensed recap is sketched below, and the final coverage is shown in figure 7.2.
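
A condensed recap of the build, test and coverage-gathering steps from section 7.2, this time run inside the patched source tree (the tracefile names are shortened here compared to section 7.2):

# build with the coverage CFLAGS exported above
make

# run the test suite, including the expensive and root-only tests
RUN_EXPENSIVE_TESTS=yes RUN_VERY_EXPENSIVE_TESTS=yes make check
sudo make check

# collect the coverage data and generate the HTML report
lcov -c -o coverage_src.info -d src -b src
lcov -c -o coverage_lib.info -d lib -b lib
genhtml -t coreutils -o coverage *.info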

Figure 7.2: Fedora coreutils coverage

8 Comparison

In my bachelor thesis I am not going to compare the whole coverage and dedicate much text to it, because that is not the aim of my work; nevertheless, to find out what I have done, a small overview is essential. The upstream has strict rules about when a patch can be submitted, and it also undergoes the control of professional programmers who are actively maintaining the package. The code is precisely tested, but a few years back, when coreutils was not one project but was still divided into three parts, as I mentioned at the beginning, the rules were not this strict, and therefore the upstream coverage also does not reach the hundred percent boundary. We can see the overview in figure 8.1.

Figure 8.1: Upstream coverage

Figure 8.2: Fedora coreutils coverage


In comparison with the downstream coverage overview (figure 8.2), there is quite a small global change; the biggest (one percent) can be seen in the line coverage.

8.1 Cause of differences

As I mentioned in section 4.5, the difference between the upstream and downstream (or distribution) package is the patches applied to the distribution one. These make a big difference in test coverage, because these patches sometimes add not only bug fixes, but also new functionality needed on the desired system. Distribution patches therefore change mainly the contents of the source files, not the tests. Of course, sometimes some tests start to fail even when the program works properly, and developers tend to disable these kinds of tests.

8.1.1 Patches

In the Fedora coreutils there are more than ten patches applied; some are small enough that there is no need to mention them, but there are five big ones which changed the test coverage rapidly. To mention them chronologically, the first of them was the 6.10-configuration patch, which served as a workaround for the koji build system. The next big one was the pam patch, which added pluggable authentication modules support. This was followed by the i18n patch, which added multi-byte functionality support; it is the most attractive for me and the biggest cause of the lower downstream test coverage, and I will discuss it more explicitly in the next subsection. Then there are some smaller ones, like the runuser and selinux patches.

8.1.2 coreutils-i18n.patch

As I have already written, this patch added multi-byte functionality support.

Definition 8.1.1. Single-byte character set - the ASCII character set defines the characters from 0 to 127 and an extended set from 128 to 255. Several alternative single-byte character sets, primarily European, define the characters from 0 to 127 identically to ASCII, but define the characters from 128 to 255 differently. With this extension, 8-bit representation is sufficient for defining the needed characters in most European-derived languages. [34]

Definition 8.1.2. Multi-byte character set - a multi-byte character set consists of both one-byte and two-byte characters. A multi-byte-character string can contain a mix of single and double-byte characters. A two-byte character has a lead byte and a trail byte. In a particular multi-byte character set, the lead and trail byte values can overlap, and it is then necessary to use the byte's context to determine whether it is a lead or trail byte. [34]
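
A quick way to see multi-byte characters in practice is to dump the bytes of an accented character with od, itself one of the coreutils utilities; assuming a UTF-8 locale, the single character é is stored as two bytes:

printf 'é' | od -An -tx1
 c3 a9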

The problem with this patch is that it does not test the altered features properly; although it adds a test for the sort utility, as I mentioned before, that is still insufficient. Therefore I aimed mainly at this patch with my test coverage improvement.
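To illustrate why the locale matters for these utilities at all, here is a small, generic demonstration of locale-dependent behaviour using plain GNU sort collation; it is not specific to the i18n patch, and the locale name is again only an assumption about what is installed on the system.

    printf 'a\né\nz\n' | LC_ALL=C sort              # byte order:       a, z, é
    printf 'a\né\nz\n' | LC_ALL=en_US.UTF-8 sort    # locale collation: a, é, z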

8.1.3 Affected utilities

The most affected utilities, and those that made the biggest difference in coverage, are cut, expand, fold, join, pr, sort, unexpand and uniq. On figure 8.3 we can see the coverage of these eight utilities after applying the downstream/distribution patches. On the other hand, figure 8.4 shows the upstream coverage of the same utilities, and we can clearly see that the patch dragged the coverage at least twenty percent down, which can have a serious impact on the functionality of these utilities.

Figure 8.3: Downstream coverage of distinct utilities


Figure 8.4: Upstream coverage of distinct utilities

8.2 Multi-byte test patch

Therefore I went through upstream coreutils looking for a commit adding tests for multi-byte code paths. It was submitted by Jim Meyering, the maintainer of these utilities, and can be found here: http://git.savannah.gnu.org/cgit/coreutils.git/commit/tests/misc/cut?id=d8945c8d8f7f4505e9beb61a1005b1ece09e2790
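One way to locate a commit like this in an upstream clone is git's pickaxe search; the search term below is only my guess at a string such a commit is likely to touch.

    # list commits that add or remove the string "mb_locale" in the cut test
    git log --oneline -S 'mb_locale' -- tests/misc/cut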

8.2.1 Making patch

After finding out this information and trying it on cut, I was able to alter all eight tests in the same way. The most important changes are listed and described below. They can be divided into two blocks. At the beginning of the script there are a few lines I added, starting with the "my" keyword. These lines simply define scalar variables used later in the script, either to run the tests in two ways (single-byte and multi-byte locales) or to help emit errors to the user output.

    my $mb_locale = $ENV{LOCALE_FR_UTF8};
    ! defined $mb_locale || $mb_locale eq 'none'
      and $mb_locale = 'C';

    my $prog = 'utility_name';
    my $try = "Try \`$prog --help' for more information.\n";
    my $inval = "$prog: invalid byte, character or field list\n$try";

The second block is the foreach loop, which duplicates each test vector, but this time with {ENV => "LC_ALL=$mb_locale"} inserted, i.e. the multi-byte locale that was untested in the distribution coreutils. For clarity and better readability of the test output it also appends "-mb" to the test name. For one more detail (recognizing whether the utility is multi-byte patched) see the commentary in the piece of code given:

    if ($mb_locale ne 'C')
      {
        my @new;
        foreach my $t (@Tests)
          {
            my @new_t = @$t;
            my $test_name = shift @new_t;

            # Depending on whether utility_name is multi-byte-patched,
            # it emits different diagnostics:
            #   non-MB: invalid byte or field list
            #   MB:     invalid byte, character or field list
            # Adjust the expected error output accordingly.
            if (grep {ref $_ eq 'HASH' && exists $_->{ERR}
                      && $_->{ERR} eq $inval} (@new_t))
              {
                my $sub = {ERR_SUBST => 's/, character//'};
                push @new_t, $sub;
                push @$t, $sub;
              }
            push @new, ["$test_name-mb", @new_t,
                        {ENV => "LC_ALL=$mb_locale"}];
          }
        push @Tests, @new;
      }

    @Tests = triple_test \@Tests;

I encountered problems with certain tests, so I had to use Perl's next function, which means I skipped those tests. I had to do this once for the unexpand test, where one of the cases deadlocks (a situation in which two running processes are each waiting for the other's response). So I had to search for the case causing the problem (by commenting out test after test) and then disable that single one with the line marked with a comment sign at the end. The sample code below shows the last few lines of the loop written above, to illustrate my solution:

            push @new_t, $sub;
            push @$t, $sub;
          }
        next if ($test_name =~ 'b-1');  # skip: deadlocks under the MB locale
        push @new, ["$test_name-mb", @new_t,
                    {ENV => "LC_ALL=$mb_locale"}];
      }

The next utilities that caused problems, and in which I had to skip certain tests as well, were sort and uniq. Here the situation was caused by a test that already used a multi-byte locale, so there was no need to duplicate its test vector and run it once more. Therefore I searched for the test case as in the example before and disabled it.

The last thing important to mention is that while working on changes in one test, there is no need to run the whole test suite after every change (which takes a lot of time); we can easily run just the test we want with this command:

    make check TESTS=test/name

After making these changes in all eight utility tests, I used the git versioning system (see subsection 6.2.3) to create a patch file, which is also submitted with this bachelor's thesis. From the beginning of my work on altering the tests, I was using git as storage to make sure nothing got lost in the process. As the first commit (submission of source code) I put just the unchanged source code into the repository. The identification number given by git to this first commit and the identification number of the last commit, the one with the finished modification of the source code, are used in the command which simply computes the difference between them and creates the patch file. This is the command:

    git diff commit_id_first commit_id_finished > coreutils-mb-test.patch
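For completeness, the whole git workflow described above looks roughly as follows; the commit messages are mine and the commit identifiers are placeholders.

    cd path/to/coreutils-8.15
    git init
    git add .
    git commit -m "import unmodified coreutils-8.15 sources"
    # ... alter the eight test scripts ...
    git commit -a -m "tests: exercise multi-byte locales"
    git log --oneline      # note the ids of the first and the last commit
    git diff commit_id_first commit_id_finished > coreutils-mb-test.patch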


8.2.2 Differences after multi-byte test patch

After applying the patch I ran these tests once more and retrieved the coverage. If we look at the line coverage, my results are the same as the upstream ones, as shown on figure 8.5 in comparison with the upstream coverage seen on figure 8.1. But if we look at the utilities one by one, a few percent are still missing because of changes from other patches, which are not the aim of my bachelor's degree thesis.

Figure 8.5: Downstream coverage after applying my patch

Figure 8.6: Distinct utilities coverage after applying my patch

The other problem is, as I mentioned a few times before, that untested code is very vulnerable to mistakes or unpredictable behaviour. Therefore, after my patch, not all tests are passing; the failing ones are detecting mistakes, which I, after arranging all formalities with Ondřej Vašík (my consultant) from Red Hat, reported to https://bugzilla.redhat.com/


Only two of the eight tests passed without any mistakes with multi-byte support as well, and those are fold and pr. All six other utilities contain bugs, which I reported to bugzilla as I mentioned before, and here is the way it should be done correctly. The first thing you always do is search for reports similar or identical to the one you want to file. You can do so with just a few keywords at https://bugzilla.redhat.com/query.cgi?format=advanced

After this you are required to create an account with a valid email address, otherwise you are unable to report anything. Your email address is visible to all users, so using a secondary email address is advised. Your first step on the Red Hat bugzilla page is finding the proper product and its component for your bug report. Then you should only change the severity and priority values if you are confident about their correctness; otherwise keeping the defaults is advised. The next few things you need to fill in are a short summary of the issue and a description template, which has several parts that are clearly marked to help you fill them in correctly. The last thing is to submit your issue and wait for the maintainers' response. If there is none, try writing to them and be prepared to give them more information (your configuration, more debug info, ...). [35] I did so and here are my reported issues. The expand report can be found here:

https://bugzilla.redhat.com/show_bug.cgi?id=821260

Bugzilla for unexpand is here:

https://bugzilla.redhat.com/show_bug.cgi?id=821262

Also uniq can be found here:

https://bugzilla.redhat.com/show_bug.cgi?id=821263

Join is reported here:


https://bugzilla.redhat.com/show_bug.cgi?id=821261

Sort bugzilla is here:

https://bugzilla.redhat.com/show_bug.cgi?id=821264

And finally cut is reported here:

https://bugzilla.redhat.com/show_bug.cgi?id=821259

9 Conclusion

The main goal of my bachelor's degree thesis was to improve code coverage of the Fedora distribution coreutils package, which is affected by distribution patches, and to raise it as close to the upstream coverage as possible.

This goal was accomplished and I successfully raised the code coverage to the desired level. I also showed, as discussed many times throughout the thesis, that poor code coverage poses potential threats to code functionality. This proved true for coreutils as well: the altered tests uncovered mistakes, which were properly reported to the Red Hat bugzilla, where the maintainers of the package should take care of them in a short time.

I also presented basic terminology to the reader, looked deeper into testing itself and explained code coverage. I covered the essentials of Linux, its purpose, and the Fedora distribution, which was the object of my work and with which I work every day. I described the tools used throughout the process of coverage retrieval and explained their basic usage. I clearly stated all the steps of coverage retrieval and also of the code changes.

I can therefore summarize my work as successful, and I have to state that distribution patches often resolve issues or add new functionality, but can introduce new problems which cannot be found without a proper testing process, which is the weakest part of the patches. The main causes of this problem are insufficient manpower and time for testing, but also a policy that is not as strict as the upstream one.

As an output of my thesis, I created a patch file (see the attachment), which can be easily applied to coreutils. Apart from the patch file, as mentioned, I also reported bugs to bugzilla for the developers to improve the functionality.

Bibliography

[1] IEEE. IEEE Standard 610.12-1990, IEEE Standard Glossary of Software Engineering Terminology [online]. Available from WWW:

[2] Lions, J. L. ARIANE 5 Flight 501 Failure [online]. Available from WWW:

[3] GNU General Public License [online]. Available from WWW:

[4] Coreutils - GNU core utilities [online]. Available from WWW:

[5] GNU Core Utilities [online]. Available from WWW:

[6] Fileutils - GNU File Utilities [online]. Available from WWW:

[7] Shellutils - GNU shell utilities [online]. Available from WWW:

[8] Textutils - GNU Text Utilities [online]. Available from WWW:

[9] Edsger W. Dijkstra [online]. Available from WWW:

[10] Meyering, Jim. [Coreutils-announce] coreutils-5.0 released (union of fileutils, sh-utils, textutils) [online]. Available from WWW:


[11] Wikipedia contributors. Software testing [online]. Available from WWW:

[12] Wikipedia contributors. Linux [online]. Available from WWW:

[13] Torvalds, Linus B. Free minix-like kernel sources for 386-AT [online]. Available from WWW:

[14] Top500.org [online]. Available from WWW:

[15] Maximum RPM: Taking the Red Hat Package Manager to the Limit [online]. Available from WWW:

[16] RPM Project Roadmap [online]. Available from WWW:

[17] Sundaram, Rahul. Staying close to upstream projects [online]. Available from WWW:

[18] Hong, Zhu. Software Unit Test Coverage and Adequacy [online]. 1996. Available from WWW:

[19] Williams, Laurie. White-Box Testing [online]. 2006. Available from WWW:

[20] Johnson, Paul. Testing and Code Coverage [online]. Available from WWW:


[21] Introduction to gcov [online]. Available from WWW:

[22] Invoking gcov [online]. Available from WWW:

[23] Brief description of gcov data files [online]. Available from WWW:

[24] LCOV - the LTP GCOV extension [online]. Available from WWW:

[25] git --everything-is-local [online]. Available from WWW:

[26] Autoconf - GNU Project - Free Software Foundation (FSF) [online]. Available from WWW:

[27] Automake - GNU Project - Free Software Foundation (FSF) [online]. Available from WWW:

[28] XZ Utils [online]. Available from WWW:

[29] trucov [online]. Available from WWW:

[30] Testwell CTC++ Test Coverage Analyzer for C/C++ [online]. Available from WWW:


[31] BullseyeCoverage [online]. Available from WWW:

[32] Source Package Files and How To Use Them [online]. Available from WWW:

[33] Dunbar, Daniel. KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs [online]. Available from WWW:

[34] Understanding Single and Multibyte Character Sets [online]. Available from WWW:

[35] Vašík, Ondřej. Bugs [online]. 2009. Available from WWW:

10 Attachment

CD-ROM: Fedora coreutils patch. Includes:

coreutils-mb-test.patch

To apply this patch file, you can prepare the Fedora coreutils as described in section 7.3, then simply copy the .patch file into the root directory with the coreutils source code and run these commands:

cd path/to/coreutils-8.15

patch -p1 < coreutils-mb-test.patch

You can also revert the changes made by this patch by running:

patch -p1 -R < coreutils-mb-test.patch
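To check beforehand whether the patch applies cleanly, without modifying any files, GNU patch offers a dry run:

patch -p1 --dry-run < coreutils-mb-test.patch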
