Extrapolate: Generalizing Counterexamples of Functional Test Properties

This is a repository copy of Extrapolate: generalizing counterexamples of functional test properties. White Rose Research Online URL for this paper: https://eprints.whiterose.ac.uk/129498/ Version: Accepted Version Proceedings Paper: Braquehais, Rudy and Runciman, Colin orcid.org/0000-0002-0151-3233 (Accepted: 2018) Extrapolate: generalizing counterexamples of functional test properties. In: IFL 2017: 29th Symposium on the Implementation and Application of Functional Programming Languages. ASSOC COMPUTING MACHINERY , New York . (In Press) Reuse Items deposited in White Rose Research Online are protected by copyright, with all rights reserved unless indicated otherwise. They may be downloaded and/or printed for private study, or other acts as permitted by national copyright laws. The publisher or other rights holders may allow further reproduction and re-use of the full text version. This is indicated by the licence information on the White Rose Research Online record for the item. Takedown If you consider content in White Rose Research Online to be in breach of UK law, please notify us by emailing [email protected] including the URL of the record and the reason for the withdrawal request. [email protected] https://eprints.whiterose.ac.uk/ Extrapolate: generalizing counterexamples of functional test properties Rudy Braquehais Colin Runciman University of York, UK University of York, UK [email protected] [email protected] ABSTRACT properties in Haskell. Several example applications demonstrate This paper presents a new tool called Extrapolate that automatically the effectiveness of Extrapolate. generalizes counterexamples found by property-based testing in Example 1.1. Consider the following faulty sort function: Haskell. Example applications show that generalized counterexamples can inform the programmer more fully and more immediately sort :: Ord a => [a] -> [a] what characterises failures. Extrapolate is able to produce more sort [] = [] general results than similar tools. Although it is intrinsically un- sort (x:xs) = sort (filter (< x) xs) sound, as reported generalizations are based on testing, it works ++ [x] well for examples drawn from previous published work in this area. ++ sort (filter (> x) xs) The function sort should have the following properties CCS CONCEPTS prop_sortOrdered :: Ord a => [a] -> Bool · Software and its engineering → Software testing and de- prop_sortOrdered xs = ordered (sort xs) bugging; prop_sortCount :: Ord a => a -> [a] -> Bool KEYWORDS prop_sortCount x xs = count x (sort xs) == count x xs enumerative property-based testing, systematic testing, functional that together completely specify sort. programming, Haskell. If we pass both properties to an established property-testing ACM Reference Format: library, such as QuickCheck [11] or SmallCheck [33], we get some- Rudy Braquehais and Colin Runciman. 2017. Extrapolate: generalizing coun- thing like: terexamples of functional test properties. In IFL 2017: 29th Symposium on > check (prop_sortOrdered :: [Int] -> Bool) the Implementation and Application of Functional Programming Languages, August 30-September 1, 2017, Bristol, United Kingdom. ACM, New York, NY, +++ OK, passed 500 tests. USA, 11 pages. https://doi.org/10.1145/3205368.3205371 > check (prop_sortCount :: Int -> [Int] -> Bool) *** Failed! Falsifiable (after 4 tests): 1 INTRODUCTION 0 [0,0] Most programmers are familiar with the following situation: a That is, prop_sortCount 0 [0,0] = False. If instead we test failing test case has been discovered during testing; but it is not using Extrapolate, then for the failing property, in addition to a immediately apparent what more general class of tests would trig- specific counterexample, Extrapolate prints: ger the same failure. The programmer may resort to painstaking step-by-step reevaluation of the reported failure in the hope of re- Generalization: alizing where a fault lies. In this paper, we examine a less explored x (x:x:_) approach: the generalization of failing cases informs the program- Some values have been generalized: the specific value 0 does not mer more fully and more immediately what characterises such matter, prop_sortCount x (x:x:_) = False for any integer failures. This information helps the programmer to locate more x; the tail value _ does not affect the result. Extrapolate also prints: confidently and more rapidly the causes of failure in their program. We present Extrapolate, a tool to generalize counterexamples of test Conditional Generalization: x (x:xs) when elem x xs Permission to make digital or hard copies of all or part of this work for personal or This hints that our faulty sort function fails for lists with repeated classroom use is granted without fee provided that copies are not made or distributed □ for profit or commercial advantage and that copies bear this notice and the full citation elements. We return to this example in ğ4.1. on the first page. Copyrights for components of this work owned by others than the In general, if Extrapolate finds a counterexample, then the prop- author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or erty is definitely false. Any specific counterexample displayed by republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. Extrapolate is valid. A displayed generalization may or may not be IFL 2017, August 30-September 1, 2017, Bristol, United Kingdom valid. However, a generalization is only displayed if a specific coun- © 2017 Copyright held by the owner/author(s). Publication rights licensed to the terexample has first been found and displayed. So Extrapolate never Association for Computing Machinery. ACM ISBN 978-1-4503-6343-3/17/08...$15.00 incorrectly reports a property to be false, even though a generalized https://doi.org/10.1145/3205368.3205371 counterexample may be invalid (or loose, cf. ğ6). IFL 2017, August 30-September 1, 2017, Bristol, United Kingdom Rudy Braquehais and Colin Runciman 1.1 Contributions import Test.Extrapolate The main contributions of this paper are: sort :: Ord a => [a] -> [a] (1) methods using automated black-box testing to generalize sort [] = [] counterexamples of functional test properties by replacing sort (x:xs) = sort (filter (< x) xs) constructors with variables, where these variables may be ++ [x] repeated or subject to side-conditions; ++ sort (filter (> x) xs) (2) the design of the Extrapolate library, which implements these methods in Haskell and for Haskell test properties; prop_sortOrdered :: [Int] -> Bool (3) a selection of small case-studies, investigating the effective- prop_sortOrdered xs = ordered (sort xs) ness of Extrapolate; where (4) a comparative evaluation of generalizations performed by ordered :: [Int] -> Bool Extrapolate and similar tools for Haskell. ordered (x:y:xs) = x <= y && ordered (y:xs) Despite the Haskell setting of the implementation and experiments, ordered _ = True we expect similar techniques to be applicable in other functional programming languages. prop_sortCount :: Int -> [Int] -> Bool prop_sortCount x xs = count x (sort xs) 1.2 Road-map == count x xs This paper is organized as follows: where ğ2 describes how to use Extrapolate; count :: Int -> [Int] -> Int ğ3 describes how Extrapolate works internally; count x = length . filter (== x) ğ4 presents example applications and results; main :: IO () ğ5 discusses related work; main = do ğ6 evaluates Extrapolate in comparison with similar tools; ğ7 draws conclusions and suggests future work. check prop_sortOrdered check prop_sortCount 2 HOW EXTRAPOLATE IS USED Figure 1: Full program applying Extrapolate to properties of Extrapolate is a library (loaded by łimport Test.Extrapolatež). sort used to obtain the results in ğ1. Unless they already exist, instances of the Listable (ğ3.1) and Generalizable (ğ3.2) typeclasses are declared for needed user- defined datatypes (step 1). The check function is then applied to each test property (step 2). Listable enumeration (ğ3.1). This definition is similar to the one used in our previous work on Speculate [5]. By default, Extrapolate: Step 1. Provide class instances for used-defined types. Extrapolate • needs to know how to generate test values for property arguments tests properties for up to 500 value assignments; • Ð this capability is provided by instances of the Listable typeclass considers side conditions up to size 4; • (ğ3.1). Extrapolate also needs to manipulate the structure of test val- uses Speculate to find equivalences between expressions of ues so that it can perform its generalization procedure Ð this capabil- up to size 4 avoiding many redundant equivalent conditions ity is provided by instances of the Generalizable typeclass (ğ3.2). (ğ3.3). Extrapolate provides instances for most standard Haskell types Extrapolate allows variations of these default settings. The number and a facility to derive instances for user-defined data types using of configured tests affects the runtime linearly so long astheun- Template Haskell [34]. For applications in which values of algebraic derlying Listable enumeration has linear runtime. The size limit datatypes can be freely constructed, as there are no constraining of side conditions affects runtime exponentially so it should be data invariants, writing deriveGeneralizable ’’<Type> is adjusted with caution. Our choice of default parameters aims for a enough to create the necessary instances. Extrapolate’s online doc-

Extrapolate: Generalizing Counterexamples of Functional Test Properties

The Types, Roles, and Practices of Documentation in Data Analytics Open Source Software Libraries

What I Wish I Knew When Learning Haskell

Approaches and Applications of Inductive Programming

Not All Patterns, but Enough

Haddock User Guide I

Download and Use25

Bias Reformulation for One-Shot Function Induction

On the Di Iculty of Benchmarking Inductive Program Synthesis Methods

Learning Functional Programs with Function Invention and Reuse

Haskhol: a Haskell Hosted Domain Specific Language for Higher-Order

Inductive Programming Meets the Real World

Analytical Inductive Programming As a Cognitive Rule Acquisition Devise∗