Intersection Types and Higher-Order Model Checking
Total Page:16
File Type:pdf, Size:1020Kb
Intersection Types and Higher-Order Model Checking Steven J. Ramsay Merton College University of Oxford A dissertation submitted for the degree of Doctor of Philosophy in Computer Science Trinity Term 2013 Abstract Higher-order recursion schemes are systems of equations that are used to define finite and infinite labelled trees. Since, as Ong has shown, the trees defined have a decidable monadic second order theory, recursion schemes have drawn the attention of research in program verification, where they sit naturally as a higher-order, functional analogue of Boolean programs. Driven by applications, fragments have been studied, algorithms developed and extensions proposed; the emerging theme is called higher- order model checking. Kobayashi has pioneered an approach to higher-order model checking using intersection types, from which many recent advances have followed. The key is a characterisation of model checking as a problem of intersection type assignment. This dissertation contributes to both the theory and practice of the intersection type approach. A new, fixed-parameter polynomial-time decision procedure is described for the alternating trivial automaton fragment of higher-order model checking. The algorithm uses a novel, type-directed form of abstraction refinement, in which behaviours of the scheme are distinguished according to the intersection types that they inhabit. Furthermore, by using types to reason about acceptance and rejection simultaneously, the algorithm is able to converge on a solution from two sides. An implementation, Preface, and an extensive body of evidence demonstrate empirically that the algorithm scales well to schemes of several thousand rules. A comparison with other tools on benchmarks derived from current practice and the related literature puts it well beyond the state-of-the-art. A generalisation of the intersection type approach is presented in which higher- order model checking is seen as an instance of exact abstract interpretation. Intersec- tion type assignment is used to characterise a general class of safety checking problems, defined independently of any particular representation (such as automata) for a class of recursion schemes built over arbitrary constants. Decidability of any problem in the class is an immediate corollary. Moreover, the work looks beyond whole-program verification, the traditional territory of model checking, by giving a natural treatment of higher-type properties, which are sets of functions. Acknowledgements I am very grateful to my supervisor, Luke Ong, for the patient guidance and con- stant encouragement that he has given to me during the course of the research con- tained in this dissertation. My thanks are also due to Martin Hofmann and Hongseok Yang, who gave much of their time to ensure its proper assessment. Finally, I grate- fully acknowledge the help of the Engineering and Physical Sciences Research Council, whose financial support has been essential. Contents Contents vii 1 Introduction1 1.1 The difficulties of constructing correct software................ 1 1.2 Verification and software model checking.................... 2 1.3 Functional programming............................. 3 1.4 Higher-order model checking .......................... 5 1.5 Contributions and structure........................... 6 2 Higher-Order Model Checking9 2.1 Higher-order recursion schemes......................... 9 2.2 Model checking problems for recursion schemes................ 23 2.3 The intersection type characterisation ..................... 27 2.4 Algorithms for higher-order model checking.................. 33 2.5 Applications in verification ........................... 39 3 An Intersection Refinement Type System with Subtyping 45 3.1 Intersection types................................. 45 3.2 Intersection type assignment........................... 49 3.3 Intersection type checking............................ 54 3.4 Consistency of type environments........................ 56 3.5 Higher-order model checking is type inference................. 60 4 Model Checking via Type Directed Abstraction Refinement 63 4.1 Introduction.................................... 63 4.2 Type directed abstraction refinement...................... 65 4.3 A decision procedure for model checking.................... 70 4.4 A narrated example ............................... 80 4.5 Correctness of the decision procedure...................... 84 4.6 Preface: a higher-order model checker .................... 97 vii Contents 4.7 Related work ...................................102 5 Intersection Types as Exact Abstract Interpretations 107 5.1 Introduction....................................107 5.2 Term languages, property languages and queries . 109 5.3 Concrete and abstract properties........................118 5.4 Exact abstraction at ground types .......................122 5.5 Applications to higher-order model checking..................129 5.6 Exact abstraction at higher types........................134 5.7 Related work ...................................138 6 Conclusion 141 6.1 Summary .....................................141 6.2 Discussion.....................................142 6.3 Future directions.................................144 A Evaluation: Complete Results 147 Bibliography 157 Index of Definitions 163 Index of Notations 165 viii Chapter 1 Introduction 1.1 The difficulties of constructing correct software “Because, in a sense, the whole is ‘bigger’ than its parts, the depth of a hierarchical decomposition is some sort of logarithm of the ratio of the ‘sizes’ of the whole and the ultimate smallest parts. From a bit to a few hundred megabytes, from a microsecond to a half an hour of computing confronts us with completely baffling ratio of 109! The programmer is in the unique position that his is the only discipline and profession in which such a gigantic ratio, which totally baffles our imagination, has to be bridged by a single technology.” E. W. Dijkstra Software is complex. Of course, the size of modern computer programs, which may be documents tens of millions of lines long, is a major factor in complexity. That such software, whose complete comprehension is well beyond the faculties of any one person, works even some of the time is a great achievement of the methods of modern software engineering; that it fails the rest of the time is a great frustration of its users. However, the sheer scale of the objects that are involved is only a small part of the problem. There is also the great depth of the hierarchies involved, as pointed out in the excerpt above by Dijkstra, one of the great pioneers of the science of computer programming. Such hierarchies are inevitable when thinking about how to instruct a machine, whose basic operations can be applied to (perhaps) 64 bits at a time, in how to perform complex processing tasks that may ultimately involve many gigabytes of data. Still other factors are social and related to the education of computer programmers, the expectations of their clients and the lack of generally accepted standards. The consequence of software complexity, and our apparent inability to manage it ef- fectively, is an unacceptable rate of defects. At the time of writing, it would be very surprising to find any serious user of computing equipment who had not suffered frustra- tion with the inadequacies of faulty computer programs. That this statement is hardly controversial is already a serious indictment of the state of affairs, but such is the perva- siveness of computing in industry and in business that faulty software also carries with 1 1. Introduction it a tangible financial cost. In a 2002 report [RTI, 2002], the National Institute of Stan- dards and Technology (NIST) estimated the cost to the United States national economy of failing to identify defects in computer software as $59.5 billion, annually. To put the figure into context, the economic cost of Hurricane Sandy which, among other disastrous effects, caused the cancellation of almost 20,000 airline flights, left millions without power, destroyed thousands of homes, and completely closed the New York Stock Exchange for two days, was recently estimated at $65 billion [see US NOAA 2013]. Conventionally, the main weapon with which to combat software defects is testing. Testing is the work of running a given program on various inputs and observing the corresponding outputs. In many respects, this follows the same well-established pattern found in engineering more generally: once a component of some design is built, it is tested in order to check that it performs correctly within some tolerances and, if this is the case, then it is accepted. However, software is unlike the familiar materials of other engineering disciplines. Its behaviour does not conform to well known physical principles; instead the laws which it obeys are a function of the code that the programmer has written. In effect, each piece of software is a new material, whose attributes must be divined by experimentation. Even two pieces of software written to the same specification and in the same design are unlikely to share all the same attributes, since each will almost certainly have its own unique set of faults which serve to distort the original intention in unpredictable ways. Consequently, to be effective, software testing must be extremely thorough, and hence expensive. A study undertaken by Maximilien and Williams[2003] at IBM found that, by adopting a test-driven development methodology, they were able to reduce the density of defects discovered from 7 per thousand