arXiv:1908.01540v1 [cs.SE] 5 Aug 2019 ie agae.Ml sbito o fteLV compiler LLVM the of top on co built targeting is tool Mull testing languages. mutation piled general-purpose compiled a certain build for a tools and still testing Java mutation is languages. from programming usable there examples of communities, strong lack language being programming [3] Ruby Mutant of P with implementations and exist, definitely mature [2] maturi tools While of testing level mutation use. a open-source practical reach [1]. tools for appear these testing needed of languages all mutation programming not Unfortunately, open-source various more targeting tools and More community. uaincoverage mutation ntepormssuc oe oass h ult fatest a of called metric quality a the uses assess testing To mutation mutation code. a suite source of program’s location the and in operator mutation of combination of called one modification on program operators based for created rules is predefined program the original of mutation Each rga nrdcdb hsmtn,or mutant, this by introduced be program to said is gis ahvrin hc scle a suite test called a is runs which then version, and mod program each slightly software original many against of creates of quality testing versions improve mutation fied and for evaluate tool to A tests. way a as serves u vlaino ulo elwrdpoet uha RODOS, as such projects real-world resu on the LLVM. present Mull OpenSSL, Mul and of of Mull, evaluation details of our implementation limitations and current to algorithm highlight testing the lang programming describe mutation mut compiled We existing for of capabilities no processing these knowledge, faster provides our in less cod or To results IR do of this programs. Rust, to fragments and Mull ++, modified recompiled, allows only C, are Mull mutations: IR as generate LLVM mutations. to such of that work its language manipulation IR, programming Direct LLVM and any Swift. to program in compilation written tested code supports with a work of can execution compil over o and control capabilities fine-grained two and following independence the language enables p compilat choice just-in-time works to design for representation, JIT This Mull LLVM intermediate uses framework. and low-level mutations, LLVM form a the IR, LLVM on with based testing mutation nti ae,w rsn h ulpoet u tep to attempt our project, Mull the present we paper, this In source open the from interest getting is testing Mutation uainTsig al-ae otaetsigtechniqu testing software fault-based a Testing, Mutation Terms Index Abstract uli vr uaintsigbsdo LLVM on based testing mutation over: it Mull ahmtn srpeetdb a by represented is mutant Each . Ti ae ecie ul noe-oreto for tool open-source an Mull, describes paper —This mtto etn,llvm testing, —mutation killed . ftets ut eet hnet the to change a detects suite test the if .I I. mi:[email protected] Email: needn researcher Independent NTRODUCTION eln Germany Berlin, lxDenisov Alex survived mutant uainscore mutation uainpoint mutation mutant A . otherwise. mutation Mull: f uages. t of lts ation itest ated or , ion. m- a : er- ty ol e, i- l, e interchangeably. I icse uuewr.ScinVI ocue h paper the concludes VIII Section Sectio work. Mull. future of discusses limitations the VII highlights OpenSSL, VI RODOS, Section projects: LLVM. source describ open V the mutatio of Section evaluation the Mull. our imple- describes in interesting IV implemented currently Section most operators Mull. the of deepe consider goes details we then mentation III what Section describes Mull. and of algorithm the describes [6]. 2.0 version License, file. SQLite the generat from that report program HTML separate an or a score is mutation There show coverage. not m mutation does as and Mull survived), tool, such command-line or program, a (killed tested As mutants a points, on mutation running tests, while informat gathers contains Mull database SQLite that framework, The test settings. a other operators, few mutation a as of progr set results tested a the of files, list with bitcode a database include SQLite options Configuration an output. produces and input an language. programming adapters the implement by to used needs frameworks test one Objective-C. the support Swift, to language Rust, to as a due such add but IR, To language LLVM C++, programming to other and compiles any C that with on work focus can primary Mull LLVM, a with Mull started as such domains algorithm computations. different mathematical in programming, and it application of operators programming, use mutation systems practical should basic the tool of real- enable tool The number to of projects. reasonable The analysis source a testing tool use. open implement mutation The and in and tools. production use up build world for ready be with set be integration must to should smooth tool easy allow the and should tool: configurable testing fast, mutation of implementation n t uae oneprs LMI sas eerdto referred also is program IR tested LLVM a counterparts. JIT as of mutated and execution its and mutations and compilation perform runtime low- to its for language, IR, LLVM: of intermediate components level two uses It [4]. framework ecnie h olwn rtraipratfrapractica a for important criteria following the consider We eognz h eto h ae sflos eto II Section follows. as paper the of rest the Apache organize under We [5] online available is code source Mull’s as file configuration a takes It tool. line command a is Mull We mind. in criteria above the of all with built is Mull LMBitcode LLVM mi:[email protected] Email: needn researcher Independent tnsa Pankevich Stanislav eln Germany Berlin, rsml as simply or bitcode eueteeterms these use We . am’s ore. and ion es es n . n s r l II. ALGORITHM build the tree. Unfortunately, a function is not always known The following are the steps that Mull performs during a until runtime. Typical examples are C function pointers and session: C++ virtual function calls. After a few failed attempts we Step 1: Mull loads LLVM Bitcode into memory. switched to the dynamic instrumentation. Thus the call tree became dynamic call tree. Step 2: Mull inserts instrumentation code into each func- tion. This code is used to collect code coverage information. B. JIT and Runtime Compilation We describe our approach to instrumentation in III.A. To run a tested program, Mull needs to compile the bitcode Step 3: Mull compiles instrumented LLVM Bitcode to files into object files containing machine code and link them machine code and prepares the machine code for execution together into executable, as any compiler would do. To accom- by LLVM JIT engine. plish this task Mull utilizes LLVM JIT engine. This approach Step 4: In the LLVM IR code Mull finds the tests according has a great advantage: compilation and linking happen in to a test framework specified in the configuration file. memory. Thus there is no disk I/O overhead. Step 5: Mull runs each test using LLVM JIT engine and When it comes to mutation Mull performs it on a copy collects code coverage information. of a single bitcode file, recompiles it and links together with Step 6: Mull finds mutations in the LLVM IR code based already compiled object files. Partial recompilation helps to on a code coverage information collected for each test. A set increase performance. It also helps to decrease memory usage: of mutation points is created. mutated bitcode file can be disposed from memory right after Step 7: For each mutation point, Mull creates a mutant and execution. runs each test that can kill the mutant. For each mutant, only part of bitcode is recompiled into machine code. We describe C. Sandboxing our approach to runtime compilation in III.B. Mutations can make the code behave in unexpected ways: to Step 8: All information collected during the session is crash, to timeout or to exit prematurely. We use a parent/child written to the SQLite database. This is the final step. Mull process isolation to achieve a proper sandboxing of a tested finishes its execution at this point. program. Mull, which is a parent process, runs each test in a separate III.IMPLEMENTATION child process. The fork system call is used to create a child A. Instrumentation and Dynamic Call Tree process, mmap system call is used to share memory between the parent process and the child process. A typical program has many mutations, but not all of them Mull handles exit status of a child process according to the are reachable by the program’s tests. We use this fact to following policy: reduce the number of mutants. To know which mutations are 1) Normal execution (test has passed or failed): We use a reachable we thus need to know which code is reachable conventional exit code 227 to indicate if a test has run without from a test. To achieve this, we insert instrumentation into any issues. If child process exits with code 227, Mull knows each function and then run a test to gather code coverage that a test has either succeeded or failed and that nothing information. From this information, we construct a dynamic extraordinary like in one of the following cases has happened. call tree. 2) Timeout: Mutated code might never finish its execution The purpose of the call tree is better illustrated by in a child process. To handle this case Mull sets an alarm in a example. Consider a function test_driver calling child process that exits with a conventional timeout code 239 function test which is calling functions testee1 after a certain time interval. We use ualarm function to set and testee2. The call tree would look like the following: the alarm. test_driver -> test -> { testee1, testee2 }. 3) Crash: Mutated code can crash with a child process In this case the code being tested is inside of testee1 and executing it. We use WIFSIGNALED() to detect a crash of testee2. Therefore we can inject mutations only into the a child process. subtrees of the test function. 4) Abnormal exit: Mutated code exits prematurely from a The call tree adds more fine-grained control of the amount child process and this does not let a test to finish. This scenario of work via mutation distance. We can define a mutation is a reason for the existence of the custom exit code 227 from distance to be a distance from a test function to a function the case 1) because Mull needs to distinguish between normal with the actual mutation. If function A calls function B and exit and abnormal exit from a child process. function B calls function C, then the distance between A and C is 2. Mutation distance can be used to decrease the number D. Dry Run of mutations: Mull can ignore mutations that are too far from It is not known in advance how many mutations a project a test function. has and how much time does it take Mull to run it. To The instrumentation-based approach has an overhead, but remove uncertainty, we introduce dry run mode. In this mode, it is necessary to get the right code coverage information. Mull collects information about mutations but does not run Initially, we used static code analysis to build the call tree: tests against them. Therefore no partial recompilation and no Mull iterated through bitcode and followed call instructions to sandboxing are needed. Additionally, Mull gives a pessimistic approximation of the kills the mutant, then there is no need to run the remaining run time: it calculates how much time would be needed if 19 tests. The fail fast mode is disabled by default and can be each mutant times out. Real execution time is lower than the enabled in the configuration file. approximation, but it gives a good hint of expected run time. G. Caching E. Test Framework: plugin architecture Mull can work with any test framework. The only require- Mull uses JIT and Runtime Compilation for better perfor- ment is that Mull can run a single test independently from the mance. However, sometimes it is faster to read object file from other tests in a test suite. disk than to compile it in memory from LLVM Bitcode. In Each test framework plugin consists of two components: this regard, Mull supports on-disk cache. Before compiling a test finder and test runner. Mull uses test finder to find the bitcode file Mull attempts to get an object file from disk. If tests in a bitcode of a tested program, and it uses test runner there is none, then Mull compiles the bitcode file and saves to run one test according to the calling conventions of a given resulting object file on-disk for later usage. When Mull runs test framework. next time, it can use an object file from the previous session. Test finder takes all bitcode files as an input and gives back To avoid a use of outdated object files, Mull encodes a list of test functions. Examples: for SimpleTestFinder checksum of original bitcode file into the name of a cached a test is simply a C function whose name starts with object file. Object files for mutants also contain a unique test, for GoogleTestFinder a search algorithm ex- identifier of the mutation point. tracts the information about the test functions from internal::MakeAndRegisterTestInfo registration IV. SUPPORTED MUTATION OPERATORS call of a GoogleTest framework. Mull performs mutations on the LLVM IR code, so its Test runner runs one test and returns the result of its execu- implementation of mutation operators is largely determined tion. Running a test can be as easy as calling a test function by the specification of LLVM language and in particular its and checking its return value: for SimpleTestRunner test Instruction Reference [7]. We also used Pitest’s documentation passes if its test function returns 1 and fails if it returns of its mutation operators [8] to decide which operators to 0. For GoogleTest framework GoogleTestRunner has to implement first. emulate the work of GoogleTest’s main() function: to run one test GoogleTestRunner runs a test suite in a ”focused The following is the list of mutation operators currently mode” with a filter set to a name of this test’s function supported by Mull. All of the instructions referenced below (--gtest_filter=TestFunctionName). can be found in the LLVM IR language manual. Many projects have their custom test suites. Examples are Musl, OpenSSL, glibc, openlibm. While it is possible A. Math: Add, Sub, Mul, Div to create a dedicated pair of test finder and test runner This group of operators performs mutations of basic arith- for each of these projects like OpenSSLTestFinder and metic operators: ”+” to ”-”, ”-” to ”+”, ”*” to ”/”, ”/” to ”*”. OpenSSLTestRunner, we created a general solution called Math Add replaces an add instruction, which returns the CustomTestFramework to enable testing of these projects. sum of its two operands, with a sub instruction, which returns To use CustomTestFramework with a given project one the difference of its two operands. Math Add also works with has to provide the custom test definitions in a configuration the floating equivalent of add instruction: fadd which is file. Here is an example for OpenSSL project: replaced with fsub. test_framework: CustomTest Math Sub operator performs the same kind of mutation as custom_tests: Math Add but in opposite direction: from sub to add and - name: test_bio_enc_aes_128_cbc fsub to fadd. Math Mul and Div work with mul, fmul method: test_bio_enc_aes_128_cbc and div, fdiv instructions respectively. program: bio_enc_test arguments: [ test_bio_enc_aes_128_cbc ] B. Negate Condition In this case CustomTestFinder treats a function Negate Condition operator works with icmp instruction called test_bio_enc_aes_128_cbc as a test, and (comparison of integer operands) and fcmp instruction (com- CustomTestRunner runs the program using specified ar- parison of floating-point operands). Both instructions accept guments. three operands of which ”the first operand is the condition code indicating the kind of comparison to perform”. This F. Fail Fast mode first operand is a conventional code that represents a type In the worst case a tested program with N tests and M of comparison, for example: ”unsigned equal” to ”signed mutations requires N * M test runs. Mull has an option to less than”. Negate Condition modifies the code to achieve a decrease the amount of test runs: fail fast mode. For example, complete negation of a condition: from ”equal” to ”not equal”, if a mutation is reached from 20 tests and the very first test from ”signed less” to ”signed greater than or equal”, etc. C. Remove Void Function TABLE I PROJECTS This operator removes a call to a void function from LLVM IR code. The void function calls can be represented by two Lines Bitcode Bitcode Average time instructions in LLVM IR: call and invoke. The difference Project of code files size per test run RODOS 125,127 32 407 KB 23 ms between these instructions is related to the details of exception OpenSSL 311,293 630 11 MB 42 ms handling and is hidden well behind LLVM IR API making this LLVM 1,324,567 224 242 MB 31 ms difference irrelevant to Mull.

D. Replace Call For each project we measure how many tests a test suite has, This operator replaces a function call, whose return value is how many mutants Mull detects given the mutation operators an integer or floating-point scalar value, with an arbitrary value mentioned above, total amount of test runs executed during according to the following simple rule: the function call is analysis, and the total execution time for both cold and hot replaced with a value forty-two (42) of a corresponding integer runs. or floating-point type. Like Remove Void Function operator, A. RODOS this operator works with call and invoke instructions. RODOS [9] is a real-time operating system developed by E. Scalar Value Replacement the German Aerospace Center. It is written in C and C++ and This operator replaces an integer or floating-point scalar uses CppUnit test framework [12] for its test suite. Among value with a predetermined value according to the following several bare-metal platforms, it can be run on and other simple rule: non-zero value is replaced with a zero value (0) of POSIX-compliant operating systems. the corresponding integer or floating-point type, zero value is RODOS has many small test suites, each of replaced with a value of one (1) of the corresponding integer them covering very specific part of the system. or floating-point type. Scalar values can appear as operands Examples are: matrix4d_test, quaternion_test, of many different instructions in LLVM IR language: binary filesystem_test, hal_gpio_test. Each test suite arithmetic instructions like add or mul, comparison instruc- is designed to run only one single test per compilation: to tions icmp and fcmp, return instruction ret, function call run a test one has to compile the test suite enabling a test instructions call and invoke and some others. Scalar Value by providing a macro definition. We have to change this to operator maintains a list of such instructions that determines control the test selection at runtime rather than at compile if a particular instruction can be a target of a Scalar Value time. Once the test suite is compiled, it can run either all mutation. tests, or the one specified via command-line arguments (argv). Original test driver always exits with exit code 0. V. EVALUATION To check if tests failed or not one has to either observe the In this section, we describe our experience of applying output or check a test report that is written into XML file. Mull on real-world projects. We focus on ease of integration, We have to add a small change here as well: exit code should performance, and a practical applicability of Mull, rather than represent amount of failed tests. If all tests pass, then exit on concrete results such as found bugs or shallow tests. For code is 0, otherwise some positive number. This is a widely this paper we picked three open-source projects: RODOS [9], used approach. RODOS uses Makefiles to build and run its OpenSSL [10] and LLVM [11]. Table I describes some prop- test suites, we have to change compiler flags (CXXFLAGS) to erties of these projects. The number of lines of code represents emit LLVM Bitcode. size and scale of a project. However, more representative Two more workarounds are required to run Mull against metric is a number and an overall weight of bitcode files: RODOS. Some parts of the system are written in assembly it has a direct impact on performance because all this code so they are compiled directly into machine code. We have to has to be loaded into memory, analyzed, compiled, and linked point Mull to them using object_file_list configura- together. tion option. Since RODOS uses CppUnit we also have to point Measurements for OpenSSL and LLVM were made on it to the libcppunit.so via dynamic_library_list macOS 10.13 with 16GB of RAM and Intel i7 3.1GHz CPU. configuration option. Measurements for RODOS were made on the same machine, Once these preparations are done, we can run Mull against but inside of VirtualBox running Ubuntu 16.04, 32 bit. 4GB RODOS. For this experiment, we pick five test suites based on of RAM and two cores of the Intel i7, 3.1GHz were allocated amount of tests in each of them. Table II contains the results. for the virtual machine. For this experiment we used three mutation operators: Math B. OpenSSL Add (IV-A), Negate Condition (IV-B), and Remove Void OpenSSL [10] is a well-known implementation of TLS and Function (IV-C). All tests were run with the Fail Fast mode SSL protocols. It is written in C. It uses custom test framework (III-F) and Caching (III-G) enabled. We ran each group of for its tests. OpenSSL has a mix of unit and integration tests. tests twice: a cold run, without cache in place, and a hot run, We have to look at each test suite to identify if it is a unit test with cache in place. suite or an integration test suite. Test suites with unit tests are TABLE II TABLE IV RESULTS FOR RODOS RESULTS FOR LLVM

Test Total time Test Total time Test suite Tests Mutants runs Distance (cold / hot) Test suite Tests Mutants runs Dist. (cold / hot) linkinterfaceuart 11 19 186 2 4s / 2s All Tests 550 11779 60325 25 3h46m / 1h54m stdlib pico 10 36 356 2 5s / 4s All Tests 550 5508 13601 2 1h52m / 47m46s 10 57 196 3 5s / 3s APFloat 70 1894 22010 25 41m1s / 18m21s sortedlist 9 16 85 4 2s / 2s APFloat 70 361 1622 2 14m13s / 4m32s linkinterfacecan 8 18 471 5 19s / 12s StringExtras 5 160 165 7 4m44s / 3m2s StringExtras 5 93 98 2 4m36s / 2m24s

TABLE III RESULTS FOR OPENSSL LLVM is a big project. It this case it is recommended Test Total time to launch Mull in Dry Run mode (III-D) to get information Test suite Tests Mutants runs Distance (cold / hot) about tested program. Dry run shows that Mull detected 550 packettest 22 67 173 3 30s / 14s destest 20 274 1256 3 47s / 29s tests and found 11779 mutants, it also shows that there are test test 19 102 214 4 31s / 17s 60325 test runs according to III-F. The report also shows igetest 10 118 709 2 29s / 18s approximation of execution time: full run may take about 9 bio enc test 6 708 2667 12 3m8s / 2m45s hours at maximum. It helps to see the order: hours, not days in the worst case. The approximation is very pessimistic: Mull assumes that every mutant fails because of a timeout. In fact, simple programs that are compiled into an executable. Each real execution time was 3 hours 46 minutes. test suite can only run all tests at once. We have to change Table IV shows the results for different configurations. We them to run one test based on command line arguments, or to use three groups of tests of different size and different mutation run all of them if no arguments are given, to preserve original distance (III-A) for each of them to show applicability of Mull behavior. We also have to extract information about each test even for big projects. The execution time of almost four hours manually. To set up CustomTestFramework Mull needs on the whole test suite is impractical for iterative development to know which function is a test and which command line process as opposed to two minutes for a subset of tests. arguments to pass to run this very specific test. Obtaining LLVM Bitcode is trivial. To compile OpenSSL VI. CURRENT LIMITATIONS one has to invoke configure script to prepare build system. A. Junk and stray mutations configure accepts additional parameters that are used as CFLAGS. We invoke the script by passing -flto to enable A mutation can exist in bitcode, but cannot be achieved LTO [13], which produces bitcode files instead of object files by changing original source code. Such mutation is called as build artifacts. Then we compile test suites of interest and junk mutation. The term was first coined by Henry Coles construct separate configuration files for each of them. [16]. A good example of such mutation in C++ is a Results of this experiment can be found in Table III. std::vector::push_back method call: one line of C++ code produces around 200 LLVM IR instructions. Depending on mutation operator Mull finds mutations in those instructions C. LLVM even though there is no equivalent in the original source code. The LLVM [11] compiler infrastructure project is the Some mutation operators require advanced pattern matching to biggest project we have analyzed so far. LLVM is written in avoid this issue, for others, we did not find a robust solution C++. It uses GoogleTest [14] as a test framework. It has several yet. unit test suites of various sizes targeting different subsystems Another issue is C and C++ code from their standard of LLVM. For this experiment, we use ADTTests: a test suite libraries. Compiler inlines code from macros and templates that covers specific abstract data types used in LLVM such into resulting bitcode. Mull finds mutations in this code as as arrays, strings, maps, integers, floats, etc. Additionally, in well, but they are not relevant to a tested program. We call this test suite, we focus only on normal tests and exclude tests such mutations stray mutations. Fortunately, there is a simple which are based on Typed Tests feature of GoogleTest that workaround: Mull can filter out mutations based on their GoogleTestFinder does not support yet. location in a source code using exclude_locations con- Obtaining Bitcode is trivial for LLVM: it has a build setting figuration option. This approach also helps to avoid mutation that enables LTO [13], which produces bitcode files instead of of third-party code. object files as build artifacts. We use only one workaround to get LLVM’s tests running: LLVM JIT does not support B. Current limitations of LLVM JIT Thread-Local Storage [15] so we have to exclude one source Mull uses LLVM JIT from which it gets its power as file that uses TLS from the compilation. Fortunately, this file well as some of its limitations. The following are two major is not used in the test suite so its absence does not affect the limitations we encountered: LLVM JIT does not work with analysis. projects using Thread Local Storage [15], and it does not support Objective-C Runtime [17]. The latter limitation is the • OpenSSL (C/C++, custom test suite, macOS) only reason why Mull does not yet fully support Objective-C • RODOS (C/C++, CppUnit, Linux) and Swift programming languages. Both problems are solvable • openlibm (C, custom test suite, macOS) and are waiting for their solution. • newlib’s libm (C, custom test suite, Linux) • fmt (C++, GoogleTest, macOS) VII. FUTURE WORK • CryptoSwift (Swift, XCTest, Linux) There is a lot of work to be done to get Mull closer to • rustc-demangle (Rust, Rust’s test framework, macOS) its use in production. Below, we outline the three major (and • Mull (autoanalysis) (C++, GoogleTest, macOS) most obvious) parts of our work: performance improvements, Our long-term goal is to get Mull to the point where it can integration with modern IDE’s, further exploration of the real- be used by industry as a drop-in solution for mutation testing. world projects. Also, we expect Mull to find a use in research, including One direction of work is further performance optimizations: interaction with other tools and approaches, that would find parallelization and even better control over recompilation of solutions to speed up the normally slow process of mutation bitcode. Mull still runs only one child process at a time so testing with automatic test generation. the work with multiple child processes is one of the nearest optimizations we are planning. Recompilation of mutated ACKNOWLEDGEMENTS function instead of a whole bitcode file that contains it can We thank Henry Coles and Markus Schirp for fruitful improve performance of Mull on projects with large bitcode discussions and their helpful advice at the early stage of files. development of Mull. Integration with existing IDE’s is yet another important part We thank Yue Jia and Mark Harman for their Analysis and of work to make Mull practical for daily use. While Mull Survey [19] that gave us a theoretical background for our work. works perfectly as a command-line tool that produces HTML We thank Tobias Grosser for the hint about LTO option that reports, we also see it natural to be a part of a workflow helps to get LLVM Bitcode from a project’s source code. provided by the modern IDE’s. REFERENCES Another direction of work is a further exploration of the real-world projects that will drive the implementation of new [1] “Github topics: Mutation testing.” [Online]. Available: https://github.com/topics/mutation-testing test framework adapters like Catch for C++, better support [2] H. Coles, “Pitest.” [Online]. Available: http://pitest.org of programming languages like Rust and Swift, running Mull [3] M. Schirp, “Mutant.” [Online]. Available: https://github.com/mbj/mutant on BSD and Windows systems. In this regard, we especially [4] C. Lattner and V. Adve, “Llvm: A compilation framework for lifelong program analysis & transformation,” in Proceedings of the International look forward to the proper support of Objective-C Runtime by Symposium on Code Generation and Optimization: Feedback-directed LLVM JIT because it will open Mull the door to the world of and Runtime Optimization, ser. CGO ’04. Washington, DC, desktop and mobile application development on macOS and USA: IEEE Computer Society, 2004, pp. 75–. [Online]. Available: http://dl.acm.org/citation.cfm?id=977395.977673 iOS platforms. [5] A. Denisov and S. Pankevich, “Mull.” [Online]. Available: We are aware that other mutation testing tools for compiled https://github.com/mull-project/mull programming languages exist [18] and we assume that a proper [6] “Apache license,” Apache Software Foundation. [Online]. Available: https://www.apache.org/licenses/LICENSE-2.0 comparison between Mull and these tools should be a topic [7] “LLVM Language Reference Manual: In- of separate research. struction Reference.” [Online]. Available: https://releases.llvm.org/3.9.0/docs/LangRef.html#instruction-reference VIII. CONCLUSION [8] H. Coles, “Pitest: Available mutation operations.” [Online]. Available: http://pitest.org/quickstart/mutators/ Our choice of LLVM as a base for an implementation of a [9] “RODOS.” [Online]. Available: mutation testing tool was based on an experiment with LLVM https://en.wikipedia.org/wiki/Rodos (operating system) IR and LLVM JIT libraries that had produced results superior [10] “OpenSSL.” [Online]. Available: https://www.openssl.org [11] “LLVM.” [Online]. Available: https://llvm.org to those from any of our previous attempts to implement a [12] “CppUnit.” [Online]. Available: https://sourceforge.net/projects/cppunit/ solution working on source code or AST levels. So far, we [13] “LLVM Link Time Optimization: Design and Implementation.” did not encounter a single critical problem that would turn [Online]. Available: https://llvm.org/docs/LinkTimeOptimization.html [14] “GoogleTest.” [Online]. Available: https://github.com/google/googletest us away from our decision to base Mull on LLVM with its [15] “MCJIT TLS support: Cannot select: X86ISD::WrapperRIP.” [Online]. intermediate language and infrastructure. Quite to the opposite, Available: https://bugs.llvm.org/show bug.cgi?id=21431 Mull satisfies all criteria that we consider important for an [16] H. Coles, “Junk Mutations.” [Online]. Available: https://twitter.com/0hjc/status/478896988784963584 implementation of mutation testing tool. It has a great number [17] “[llvm-dev] Is it possible to execute Objective- of applications and a large room for further improvement. C code via LLVM JIT?” [Online]. Available: To test Mull on real-world projects and to explore possible http://lists.llvm.org/pipermail/llvm-dev/2016-October/106218.html [18] P. Delgado-Prez, I. Medina-Bulo, F. Palomo-Lozano, A. Garca- limitations of our approach we applied it to as many different Domnguez, and J. Domnguez-Jimnez, “Assessment of class mutation projects, programming languages, test frameworks and operat- operators for c++ with the mucpp mutation system,” vol. 81, p. 169184, ing systems as was possible with our capacity. The following 01 2017. [19] Y. Jia and M. Harman, “An analysis and survey of the development of is the list of the projects we analyzed: mutation testing,” IEEE Transactions on Software Engineering, vol. 37, • LLVM (C/C++, GoogleTest, macOS) no. 5, pp. 649–678, Sept 2011.