VICTORIA UNIVERSITY OF WELLINGTON
Te Whare Wānanga o te Ūpoko o te Ika a Māui

School of Engineering and Computer Science
Te Kura Mātai Pūkaha, Pūrorohiko

PO Box 600, Wellington, New Zealand
Tel: +64 4 463 5341
Fax: +64 4 463 5045
Internet: offi[email protected]

Verifying Whiley Programs using an Off-the-Shelf SMT Solver

Henry J. Wylde

Supervisors: Dr. David J. Pearce, Dr. David Streader

October 22, 2014

Submitted in partial fulfilment of the requirements for Engineering 489 - Engineering Project.

Abstract

This project investigated the integration of external theorem proving tools, specifically Satisfiability Modulo Theories (SMT) solvers, with Whiley in order to increase the number of verifiable Whiley programs. The current verifier, the Whiley Constraint Solver (WyCS), is limited, which makes many Whiley programs difficult to verify. This project designed and implemented an extension that supports the use of arbitrary SMT solvers with the Whiley compiler. The evaluation of this extension used the Z3 SMT solver. The evaluation confirmed the value of using external SMT solvers with Whiley by demonstrating the extension’s ability to verify simple Whiley programs. However, additional research would be required to apply this solution to more complex programs. This project also conducted an experiment that analysed WyCS’s rewrite rules. This research may be used to inform WyCS’s rewrite rule selection criteria, improving its verification abilities.

Acknowledgments

I wish to thank my supervisors, Dr. David J. Pearce and Dr. David Streader, for all the time and effort they put into providing feedback, advice and help throughout this project. Their assistance has been invaluable.

I further wish to express my gratitude to those who helped proofread this report. Their feedback and comments helped to shape it and made a big difference.

Contents

1 Introduction
  1.1 Contributions
2 Background
  2.1 The Whiley Project
    2.1.1 The Whiley Programming Language
    2.1.2 The Whiley Assertion Language
    2.1.3 The Whiley Constraint Solver
  2.2 SMT Solvers
    2.2.1 SMT-LIB
    2.2.2 Top SMT Solvers
  2.3 Related Work
3 Design
  3.1 The Whiley Project Architecture
  3.2 Integration
    3.2.1 The WyAL File Format
    3.2.2 The WyCS File Format
    3.2.3 Design Decision
  3.3 Target Technology
    3.3.1 SMT-LIB
    3.3.2 Z3
    3.3.3 CVC4
    3.3.4 Design Decision
4 Implementation
  4.1 Encoding Native Data Types
    4.1.1 Booleans, Unbounded Integers, Nulls and Unbounded Rationals
    4.1.2 Tuples
    4.1.3 Sets
  4.2 Encoding Native Functions
    4.2.1 Set Length
    4.2.2 Subset and Proper Subset
5 Evaluation
  5.1 Experiment 1 – WyCS JUnit Tests
    5.1.1 Methodology
    5.1.2 Limitations and Implications
    5.1.3 Results
    5.1.4 Discussion
  5.2 Experiment 2 – Performance over WyCS JUnit Tests
    5.2.1 Methodology
    5.2.2 Limitations and Implications
    5.2.3 Results
    5.2.4 Discussion
  5.3 Experiment 3 – Whiley JUnit Tests
    5.3.1 Methodology
    5.3.2 Limitations and Implications
    5.3.3 Results
    5.3.4 Discussion
  5.4 Experiment 4 – Whiley Benchmarks
    5.4.1 Methodology
    5.4.2 Limitations and Implications
    5.4.3 Results
    5.4.4 Discussion
  5.5 Summary
6 Additional Experiment
  6.1 WyCS Architecture
  6.2 Experiment 5 – WyCS Rewrite Rules
    6.2.1 Methodology
    6.2.2 Limitations and Implications
    6.2.3 Results
    6.2.4 Discussion
7 Future Work and Conclusions
  7.1 Future Work
  7.2 Conclusions
Appendices
A Evaluation Configuration Options

List of Figures

1.1 GCD Function in Whiley
2.1 GCD Function Control Flow Graph
2.2 GCD Function Assertion in WyAL
3.1 The Whiley Compiler Collection Architecture
5.1 Execution Time Comparison between WyCS and Z3 (Valid WyCS JUnit Tests)
5.2 Execution Time Comparison between WyCS and Z3 (Invalid WyCS JUnit Tests)
6.1 WyCS Rewrite Rule Usage (Passed WyCS JUnit Tests)
6.2 WyCS Rewrite Rule Usage (Failed WyCS JUnit Tests)
6.3 WyCS Rewrite Rule Activation Probability (Passed JUnit Tests)
6.4 WyCS Rewrite Rule Activation Probability (Failed WyCS JUnit Tests)

List of Tables

3.1 Comparison of Supported Data Types in Whiley, WyAL, WyCS and SMT-LIB
5.1 Test Coverage Comparison between WyCS and Z3 (WyCS JUnit Test Suite)
5.2 Test Coverage Comparison between WyCS and Z3 (Whiley JUnit Test Suite)
5.3 Performance Comparison between WyCS and Z3 (Whiley Benchmarks)

Chapter 1
Introduction

“I call [null references] my billion-dollar mistake” (Tony Hoare [65]).

Null dereferencing is the most common error in Java [19], but it is not the only error found in programming languages. Other common errors include divide-by-zero, integer overflow and underflow, and invalid array indexing. These errors may occur unexpectedly and are just some of the causes of failures in software applications. To understand the negative impact that software failures can have, we only need to look at a few examples: the Patriot missile failure, which led to the deaths of 28 soldiers and injuries to many more people [16]; the NASA Mars Surveyor program, in which the Mars Climate Orbiter and the Mars Polar Lander were lost [72, 15]; and the Therac-25 medical electron accelerator, which gave lethal radiation doses to patients [52]. The Patriot missile incident was caused by a rounding error that worsened with time. The Mars Climate Orbiter was lost due to a mismatch between metric and imperial units of measurement, while the Mars Polar Lander was lost due to a sensor incorrectly indicating touchdown when in fact the craft was still 40 meters above the surface.
The Therac-25 incident was caused by a counter overflow that prevented the software safety interlocks from activating.

Unit testing is a common method for discovering errors within software applications. Yet in 1972 Dijkstra claimed that “program testing can be a very effective way to show the presence of bugs, but it is hopelessly inadequate for showing their absence” [24]. Mathematical and logical reasoning are methods for statically verifying code and were advocated by Hoare, Dijkstra and others [40, 25]. Such methods may be used by tools (e.g., FindBugs [4]) or verifying compilers (e.g., Dafny [68] and Chalice [50]) to certify the correctness of a program. A decade ago, Hoare proposed the development of a verifying compiler as a grand challenge for computer science, where the compiler “uses automated mathematical and logical reasoning methods to check the correctness of the programs that it compiles” [41].

The Whiley programming language, developed by David J. Pearce at Victoria University of Wellington [62], aims to realise the verifying compiler grand challenge. The language supports design-by-contract principles, where the programmer may write function contracts in the form of pre- and post-conditions. A pre-condition is a requirement that must hold on entry into the function, e.g., b > 0 in Figure 1.1. A post-condition is a guarantee to callers about the result of the function on exit, e.g., a % r == 0 and b % r == 0 in Figure 1.1. These contracts can be used in conjunction with Hoare logic [40] and theorem proving techniques to mathematically verify that the program is free of certain common errors. This facilitates the development of robust software applications [62, 63].

The Whiley Constraint Solver (WyCS) is the crux of the verification process in Whiley. It differs from the majority of theorem proving tools in its fundamental data types (discussed in § 3.2.2 and § 3.3.1), making it unique, yet not without limitations.
For example, in order for WyCS to support lists and maps, complex encodings must be used, which hinder its performance and capabilities. An opportunity existed here to use an external verification tool for the verification of Whiley programs. Satisfiability Modulo Theories (SMT) solvers are descended from SAT solvers and are one such class of verification tools [66]. Z3 is an example SMT solver, developed at Microsoft Research [21]. It is a high-performance theorem prover that is maintained by a dedicated team of developers and won SMT-COMP [9] in both years it entered, 2007 and 2008 [10].

    function gcd(int a, int b) => (int r)
    requires a >= 0
    requires b > 0
    ensures r >= 0
    ensures a % r == 0
    ensures b % r == 0:

        if a % b == 0:
            return b

        return gcd(b, a % b)

Figure 1.1: An implementation of Euclid’s greatest common divisor (GCD) algorithm [38] in Whiley. This function demonstrates the usage of pre-conditions (requires) and post-conditions (ensures).

Using an external verification tool, such as the tried and tested Z3, could allow more Whiley programs to be verified. Thus, this project aimed to develop and evaluate a method for using such a tool in order to enhance the verification abilities of Whiley. Furthermore, this project evaluated and analysed some aspects of WyCS for future improvement of its verification abilities.

1.1 Contributions

This project has added and evaluated a verification extension to the Whiley project (1, 2) and provided research for future projects to enhance the verification abilities of WyCS (3).