Evaluating Clojure Spec
Linköping University | Department of Computer and Information Science
Master thesis, 30 ECTS | Computer Science
2017 | LIU-IDA/LITH-EX-A--17/043--SE

Evaluating Clojure Spec
Utvärdering av Clojure Spec

Christian Luckey

Supervisor: Bernhard Thiele
Examiner: Christoph Kessler
External supervisor: Rasmus Svensson

Linköpings universitet, SE–581 83 Linköping, +46 13 28 10 00, www.liu.se

Copyright

The publishers will keep this document online on the Internet, or its possible replacement, for a period of 25 years starting from the date of publication barring exceptional circumstances. The online availability of the document implies permanent permission for anyone to read, to download, or to print out single copies for his/her own use and to use it unchanged for non-commercial research and educational purposes. Subsequent transfers of copyright cannot revoke this permission.
All other uses of the document are conditional upon the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility. According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its www home page: http://www.ep.liu.se/.

© Christian Luckey

Abstract

The objective of this thesis is to evaluate whether or not Clojure Spec meets the goals it sets out to meet with regard to easy data validation, performance and automatically generated tests, in comparison to existing specification systems in the Clojure ecosystem.

A specification for a real-world data format was implemented in the three currently popular specification systems used in Clojure. They were then compared in terms of performance, code size and additional capabilities.

The results show that Spec shines with complex data, both in expressivity and validation performance, but has an API more complex than its competitors. For sufficiently complex use cases, where expressing regular data structures and generative testing are desired, the time investment of learning Spec pays off; in simpler situations an assertions library like Truss can be recommended.

Acknowledgments

I want to thank my mother and father without whom I wouldn’t be here.

Contents

Abstract
Acknowledgments
Contents
List of Figures
List of Tables
1 Introduction
  1.1 The objectives of Spec
  1.2 Aim
  1.3 Questions
  1.4 Scope
2 Background
  2.1 Definitions
  2.2 Code complexity and quality
  2.3 Related work
  2.4 Introduction to Clojure
  2.5 Introduction to Spec, Schema and Truss
3 Method
  3.1 Pre-study
  3.2 Data selection
  3.3 Filtering test data
  3.4 Writing specifications
  3.5 Benchmarking
  3.6 Effort reduction
  3.7 Edge cases
4 Results
  4.1 Effort reduction
  4.2 Performance
  4.3 Edge cases
  4.4 Criteria comparison
  4.5 Generating data from Clojure Spec
5 Discussion
  5.1 Method
  5.2 Results
  5.3 Writing specifications
  5.4 In a wider context
6 Conclusion
Bibliography
A Statistical results from generating data with Spec
B Validation time broken down by keyword per system
C Validation time broken down by keyword grouped by system

List of Figures

4.1 Project file validation time summary
4.2 Validation time broken down by whether the tested data is valid or not
5.1 Validation time comparison for boolean value in a map
5.2 Validation time for a dependency vector
5.3 Validation time for a value with multiple options
B.1 Validation time broken down by keyword for Truss
B.2 Validation time broken down by keyword for Spec
B.3 Validation time broken down by keyword for Schema
B.4 Validation time broken down by keyword for plain Clojure validation
C.1 Validation time by all systems for keys :aliases to :exclusions
C.2 Validation time by all systems for keys :filespecs to :main
C.3 Validation time by all systems for keys :managed-dependencies to :release-tasks
C.4 Validation time by all systems for keys :repl-options to :warn-on-reflection

List of Tables

3.1 Downloads from Clojars per library
3.2 Number of projects on Clojars that depend on the given library
4.1 SLOC per specification implementation
4.2 Statistical measures in ms for validation of project files
4.3 Criteria comparison: X means full, d partial and 7 no support
4.4 Generation time in milliseconds from spec
5.1 Feature comparison: X means full, d partial and 7 no support
1 Introduction

Clojure Spec is an upcoming standard library of the programming language Clojure [48] for specifying the functionality of programs and the nature of data. It allows the programmer to describe the structure and contents of data held in any combination of Clojure’s data structures, as well as the data given to and returned from functions and macros. It also allows the relation between the data given to and returned from a function or macro to be expressed. These specifications can then be used for data validation, higher level parsing and generative testing, as well as to provide improved documentation and error messaging.

This work seeks to evaluate Clojure Spec in comparison to competing specification systems as well as plain Clojure code.

1.1 The objectives of Spec

Introduced in May 2016, but as of July 2017 yet to see a stable release, Spec seeks [10] to fill a number of gaps in the Clojure ecosystem by:

• Documenting functions, macros, keywords, lists, arrays, maps and sets for both programmatic and human consumption.
• Reporting errors¹ on the parsing and destructuring of data.
• Providing run time data validation.
• Automatic destructuring and parsing.
• Generating property based tests.
• Generating test data for these tests.

Some of the terms used above are described further in the Definitions section.

¹ Error messages being hard to understand has been the prime reason given by potential users of Clojure as to why they are not currently using Clojure, according to the Clojure survey [44] [45].

The specifications themselves are available at application run time, including test time. They are not intended as mathematical proofs like those which type systems provide [10], although such applications exist²; instead the intention is to provide an environment where arbitrary validation of run time data is not only possible but easy to perform.
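To make the objectives above concrete, the following is a minimal sketch of validation, parsing, error reporting and function specification with Spec. It is not taken from the thesis: the namespace `example.spec-sketch`, the spec names (`::name`, `::version`, `::project`, `::dependency`) and the function `full-name` are hypothetical, and the sketch assumes Clojure 1.9 or later, where the library lives in the namespace `clojure.spec.alpha`.

```clojure
(ns example.spec-sketch
  (:require [clojure.spec.alpha :as s]
            [clojure.string :as str]))

;; Hypothetical specs for a small project map; the keys are illustrative only.
(s/def ::name string?)
(s/def ::version (s/and string? #(re-matches #"\d+\.\d+\.\d+" %)))
(s/def ::project (s/keys :req-un [::name ::version]))

;; Run time data validation: s/valid? returns a boolean.
(def valid-example   (s/valid? ::project {:name "demo" :version "0.1.0"}))
(def invalid-example (s/valid? ::project {:name "demo" :version "latest"}))

;; Parsing and destructuring: s/cat labels the parts of a sequence,
;; and s/conform returns the labelled parts as a map.
(s/def ::dependency (s/cat :lib symbol? :version string?))
(def parsed (s/conform ::dependency '[org.clojure/clojure "1.9.0"]))

;; Error reporting: s/explain-str describes why a value does not conform.
(def explanation (s/explain-str ::project {:name "demo"}))

;; Function specification: :args and :ret describe the data given to and
;; returned from the function, :fn describes the relation between them.
(defn full-name [project]
  (str (:name project) "-" (:version project)))

(s/fdef full-name
  :args (s/cat :project ::project)
  :ret string?
  :fn #(str/includes? (:ret %) (-> % :args :project :name)))
```

Generating test data and property based tests from the same specs additionally requires the `test.check` library on the classpath, which is why that step is left out of this sketch.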
1.2 Aim

The aim of this thesis is to evaluate whether Clojure Spec succeeds in achieving some of the goals outlined in section 1.1. We do this partly by measuring the time consumed in validating a real-world data set, but also through measures of code quality and criteria based evaluation.

1.2.1 Classifying this study

Stol and Fitzgerald write in their “Holistic Overview of Software Engineering Research Strategies” [46] that “terminology is a challenge in research methodology, that there is no commonly adopted taxonomy to describe such”. Nevertheless, some key words as defined in their paper describe the approach of this study.

This thesis describes primary, quantitative and qualitative, desk research. It is a field study, an exploratory case study. The target of the study is the common ways of implementing specifications in the Clojure ecosystem.

1.3 Questions

The question asked in this paper is: To what degree and at what cost are the goals of Spec achieved? Specifically:

(1) How does the code of Spec specifications compare to equivalent code in competing systems, both in terms of plain SLOC and convenience of expression?
(2) How does Spec perform in real-world benchmarks compared to competing systems?
(3) To what degree does Spec expose issues in data or functions not found by competing systems, if at all?

The competing systems to Clojure Spec were deemed to be the existing data specification or assertion systems Schema [39] and Truss [5], as well as using no library at all.

A specification for the project declaration file of the highly popular build, project and dependency management tool Leiningen was produced using each system, and the implementations were compared with regard to the research questions. A comparison was also made using the criteria of related work.

² The third party library Spectrum [43] uses Spec as the basis for a static type system.