The Machine Assisted Proof of Programming Language Properties
THE MACHINE-ASSISTED PROOF OF PROGRAMMING LANGUAGE PROPERTIES

MYRA VANINWEGEN

A DISSERTATION in COMPUTER AND INFORMATION SCIENCE

Presented to the Faculties of the University of Pennsylvania in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

August

Carl Gunter, Supervisor of Dissertation
Peter Buneman, Graduate Group Chairperson

ABSTRACT

The Machine-Assisted Proof of Programming Language Properties
Myra VanInwegen
Advisor: Carl Gunter

The goals of the project described in this thesis are twofold. First, we wanted to demonstrate that if a programming language has a semantics that is complete and rigorous (mathematical), but not too complex, then substantial theorems can be proved about it. Second, we wanted to assess the utility of using an automated theorem prover to aid in such proofs. We chose SML as the language about which to prove theorems: it has a published semantics that is complete and rigorous, and while not exactly simple, is comprehensible. We encoded the semantics of Core SML into the theorem prover HOL, creating new definitional packages for HOL in the process. We proved important theorems about evaluation and about the type system. We also proved the type preservation theorem, which relates evaluation and typing, for a good portion of the language. We were not able to complete the proof of type preservation because it is not true: we found counterexamples. These proofs demonstrate that a good semantics will allow the proof of programming language properties and allow the identification of trouble spots in the language. The use of HOL had its pluses and minuses. On the whole, the benefits greatly outweigh the drawbacks, enough so that we believe that these theorems could not have been proved in the amount of time taken by this project had we not used automated help.

Contents

Introduction
Related Work
Programming Language Specifications: A Brief History
SML and its Specification
Automated Theorem Provers and HOL
A Brief Description of HOL
Encoding HOL-SML in HOL
Encoding the Syntax
Encoding the Semantic Objects
Encoding the Typing Relations
Proving Properties of Evaluation
Inversion Theorems
Determinism
Relating Static and Dynamic Semantics
Methods of Proving Type Soundness
Why Prove Type Preservation?
Towards Proving Type Preservation
How to Give a Type to a Value
The Type Preservation Theorem
Substitution
Typechecking under Substitution
Problems with the Formulation
Assessment
What Have We Learned About SML?
Was HOL Useful?
A Semantics of HOL-SML
A Grammar
A Static Semantics
A Dynamic Semantics
Bibliography

List of Tables

HOL Basic Inference Rules
Definition of Basic Logical Constants
Basic Axioms
exp pred Part of tych Encoding into HOL
Primitive Functions for val has type
Rules for j v
Rules for j j j j j c e r f vp
Rules for Environments and Store Types
Full Statement of Typechecking Under Substitution
A Grammar: Expressions, Matches, Declarations, and Bindings
A Grammar: Patterns and Type Expressions

Chapter: Introduction

As computer hardware and software systems come to play a larger role in our lives, it becomes increasingly important to make sure that these systems operate as intended. While no method can guarantee that a system will work perfectly, there are several ways to increase confidence that a system will perform as expected when used in the setting for which it was designed.

A system specification is a precise statement of what the system must do, without saying how it is to be done. The specification must also include a description of the environment in which the system will operate. Engineers use validation, verification, and testing to help ensure correctness of systems. Validation is the process of checking that the specification is correct, that is, that the assumptions about the environment are valid and that the system designers and the customer agree on what the system should do. Testing involves running the system with a collection of inputs and checking that the outputs conform to the
specification. Verification consists of looking at the system itself in order to determine if it meets the specification, without actually running it. Verification usually involves the use of mathematical logic to prove that an algorithm or circuit meets its specification or satisfies some given properties.

The problem with testing is that in complex systems there are too many cases to test them all. Occasionally one can prove that the input can be divided up into a small number of classes and that it is sufficient to test one input from each class. This is often not the case. When only a few of the inputs can be tested, testing does not ensure the correctness of the system. This is what gives verification its appeal. When a system, or some abstraction of it, is verified, it has been proved to be correct at that level of abstraction.

There are two big obstacles to verifying systems. The first is that one needs to have precise mathematical descriptions of what the system is supposed to do (the specification) and of what the system actually does. An example of the first is a finite-state automaton that gives the action of the system in each state and indicates how the system goes from one state to another. An example of the second is a mathematical model of a circuit. Creating these mathematical descriptions is a bigger problem than one might guess. Much of the work in formal methods, which is the use of logic-based mathematics to increase confidence in systems, has been in formulating these mathematical descriptions.

The second obstacle is that even if one has a rigorous mathematical description of the desired properties and actual function of the system, proofs of correctness are usually large and complex. This not only makes the proofs long and tedious, but also increases the chances that they will contain mistakes. Thus one would like to have some kind of automated help in doing the proofs. Not only can an automated proof tool help organize and carry out the proof, but it also acts as a proof
checker, preventing errors.

Automated proof tools can be roughly divided into two classes. The first is used to check properties of systems that can be modeled as finite-state automata. These systems, called model checkers, are fully autonomous: when given an input consisting of a finite-state automaton and a property to be checked, they will, after some computation, either report that the system satisfies the property or return a state for which the property fails. A good reference for the use of model checkers in verifying properties of systems is McMillan.

Model checkers do not work in all situations. Sometimes the system is not easily describable as a finite-state automaton, either because one would need an infinite number of states or because a more complex mathematical model is needed. For instance, a system that can be configured in a variety of ways, or can have any number of components, cannot be easily described as a finite-state automaton. An example of this would be a cache coherence protocol, where model checking can only be used to verify the protocol in certain fixed configurations.

The other class of proof tools consists of semi-automated, general-purpose theorem provers. For the rest of this thesis, the term theorem prover will refer to this second class of proof tools. Theorem provers implement a logic (for example, first-order logic) and provide some automated support for proving theorems in the logic. One would like it to be the case that when the theorem prover is given a statement, if the statement is true then a proof is returned, and if it is not true then some proof of its falseness (perhaps a counterexample) is returned. However, the logics implemented by theorem provers are usually powerful enough that it is impossible, for theoretical (not just practical) reasons, to give an algorithm that will do this. Thus these theorem provers can only find proofs for a limited number of statements, and they often require a great deal of human input to guide them in the proof. The human effort
required can make using a theorem prover a tedious process. One of the annoyances with using a theorem prover is that it demands that everything be proved in great detail. Statements that are obviously true to the mathematically knowledgeable human are not obvious to the theorem prover; it must be convinced with a detailed proof. Theorem provers can be augmented with decision procedures to help with this problem. Decision procedures are programs that, when given a statement of a certain form (say, in a restricted subset of the logic), will determine whether or not it is true, and if it is true, return a proof. There is still much work to be done before theorem provers are sufficiently powerful that they can be feasibly applied to prove correctness of large real-world systems.

It is useful to note that even if verification becomes more easily applicable to real-world problems, it will never completely supplant testing. Mathematics is sometimes not completely convincing or understandable. An analogy can be drawn between formal verification of systems and formal definitions of mathematical concepts. It is often the case that a definition is not fully understood until several examples have b