Formalizing Basic First Order Model Theory

John Harrison

Intel Corporation, EY2-03 5200 NE Elam Young Parkway, Hillsboro, OR 97124, USA [email protected]

Abstract. We define the syntax of unsorted first order logic as a HOL datatype and define the semantics of terms and formulas, and hence notions such as validity and satisfiability. We prove formally in HOL some elementary metatheorems such as Compactness, Löwenheim-Skolem and Uniformity, via canonical term models. The proofs are based on those in Kreisel and Krivine's book on model theory, but the HOL formalization raises several interesting issues. Because of the limited nature of type quantification in HOL, many of the theorems are awkward to state or prove in their standard form. Moreover, simple and elegant though the proofs seem, there are surprising difficulties formalizing Skolemization, one of the more intuitively obvious parts. On the other hand, we significantly improve on the original textbook versions of the arguments, proving two of the main theorems together rather than by separate arguments.

1 Introduction

This paper deals with the formalization, in the HOL Light theorem prover, of the basic model theory of first order logic. Our original motivation was that we were writing a textbook on logic, and wanted to make sure that when presenting some of the major results, we didn't make any slips – hence the desire for machine checking. We took as our model for this part of logic the textbook of Kreisel and Krivine [6], hereinafter referred to as K&K. However, as we shall see, in the process of formalizing the proofs we made some significant improvements to their presentation. In view of the considerable amount of mathematics that has been formalized, mainly in Mizar [14, 11], it is perhaps hardly noteworthy to formalize yet another fragment. However, we believe that the present work does at least raise a few interesting general points.

– Formalization of syntax constructions involving bound variables has inspired a slew of research; see e.g. Chap. 3 of Pollack [9], or Gordon and Melham [3]. We offer an unusual angle, in that for us, syntax is subordinate to semantics.
– The apparent intuitive difficulty of parts of the proof does not correlate well with the difficulty of the HOL formalization: an apparently straightforward part turns out to be the most difficult. This raises interesting questions over whether the textbook or HOL is at fault.
– As part of the task of formalization, we found improvements to the textbook proof. Though this was an indirect consequence of formalization, resulting from the necessary close reading of the text, it serves as a good example of possible auxiliary benefits.
– We provide a good illustration of how HOL's simplistic type system can be a hindrance in stating results in the most natural way. In particular, care is needed because quantification of type variables happens implicitly at the sequent level, rather than via explicit binding constructs.

HOL Light is our own version of HOL, and while its logical axiomatization, proof tools and even implementation language are somewhat different from other versions, the underlying logic is exactly equivalent to the one described by Gordon and Melham [4]. This is a version of simply typed lambda calculus used as a foundation for classical higher order logic.

2 Syntactic Definitions

We define first order terms in HOL as a free recursive type: a term is either a variable, identified by a numeric code, or else a function symbol, also identified by a number, followed by a possibly empty list of arguments. (We regard nullary functions as individual constants.)

term = V num | Fn num (term list)

Here we have already made two significant choices. First of all, we have restricted ourselves to countable languages, since the function symbols are indexed by N. This restriction is inessential, however; everything that follows extends to any infinite HOL type, given a theorem that for an infinite type α there is an injection α × α → α, a fairly easy consequence of Zorn's Lemma. The restriction to countable languages was made entirely for convenience, to avoid having to specify types everywhere. Secondly, we have not made any restrictions on the arities of functions, e.g. that function 6 can only be applied to two arguments. Rather, we really consider a function symbol as identified by a pair consisting of its numerical code and its arity, so in f3(x) and f3(x, y), the two functions are different, identified by the pairs (3, 1) and (3, 2) respectively. In fact, we define a function that returns the set of functions, in this sense, that occur in a term:

|- (∀v. functions_term (V v) = {}) ∧
   (∀f l. functions_term (Fn f l) =
            (f,LENGTH l) INSERT (LIST_UNION (MAP functions_term l)))

where:

|- (LIST_UNION [] = {}) ∧
   (LIST_UNION (CONS h t) = h UNION (LIST_UNION t))

Next we define formulas; there is a wide choice over which connectives to take as primitive and which as defined. We differ a bit from our textbook model by taking falsity, atoms, implications and universal quantifications as primitive.

form = False | Atom num (term list) | --> form form | !! num form

Once again, identically-numbered predicates used with different arities are considered different. We now define functions to return the set of functions and predicates in this sense occurring in a formula:

|- (functions_form False = {}) ∧
   (functions_form (Atom a l) = LIST_UNION (MAP functions_term l)) ∧
   (functions_form (p --> q) = (functions_form p) UNION (functions_form q)) ∧
   (functions_form (!! x p) = functions_form p)

|- (predicates_form False = {}) ∧
   (predicates_form (Atom a l) = {(a,LENGTH l)}) ∧
   (predicates_form (p --> q) = (predicates_form p) UNION (predicates_form q)) ∧
   (predicates_form (!! x p) = predicates_form p)
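To make the datatypes and the name-plus-arity convention concrete, here is a minimal OCaml sketch (ours, not the HOL Light source); the constructor and function names merely mirror the HOL definitions above, with sets approximated by duplicate-free lists.

type term = V of int | Fn of int * term list

type form =
  | False
  | Atom of int * term list
  | Imp of form * form          (* written p --> q above *)
  | Forall of int * form        (* written !!x p above   *)

(* set union on duplicate-free lists *)
let union xs ys = List.sort_uniq compare (xs @ ys)

let rec functions_term = function
  | V _ -> []
  | Fn (f, l) ->
      union [(f, List.length l)]
        (List.fold_left (fun acc t -> union acc (functions_term t)) [] l)

let rec functions_form = function
  | False -> []
  | Atom (_, l) ->
      List.fold_left (fun acc t -> union acc (functions_term t)) [] l
  | Imp (p, q) -> union (functions_form p) (functions_form q)
  | Forall (_, p) -> functions_form p

let rec predicates_form = function
  | False -> []
  | Atom (a, l) -> [(a, List.length l)]
  | Imp (p, q) -> union (predicates_form p) (predicates_form q)
  | Forall (_, p) -> predicates_form p

(* f3(x) and f3(x,y) contribute different elements, (3,1) and (3,2) *)
let () =
  assert (functions_term (Fn (3, [V 0])) = [(3, 1)]);
  assert (functions_term (Fn (3, [V 0; V 1])) = [(3, 2)])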

It’s more convenient to lift these up to the level of sets of formulas, and we pair up the sets of functions and predicates as the so-called language of a set of formulas.

|- functions fms = UNIONS {functions_form f | f IN fms}

|- predicates fms = UNIONS {predicates_form f | f IN fms}

|- language fms = functions fms, predicates fms

Trivially we have then, for a singleton set of formulas:

|- language {p} = functions_form p,predicates_form p

Other logical constants are defined in a fairly standard way; it doesn't really matter for what follows exactly how this is done. These are respectively negation, truth, disjunction (or), conjunction (and), bi-implication (iff) and existential quantification.

|- Not p = p --> False

|- True = Not False

|- p || q = (p --> q) --> q

|- p && q = Not (Not p || Not q)

|- p <-> q = (p --> q) && (q --> p)

|- ?? x p = Not(!!x (Not p))
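As an illustration, the derived connectives transcribe into the hypothetical OCaml datatype from the earlier sketch as smart constructors; the ff_* names are ours, and the final check just confirms how ??x p unfolds.

type term = V of int | Fn of int * term list
type form = False | Atom of int * term list | Imp of form * form | Forall of int * form

let ff_not p = Imp (p, False)                  (* Not p = p --> False      *)
let ff_true = ff_not False                     (* True = Not False         *)
let ff_or p q = Imp (Imp (p, q), q)            (* p || q = (p --> q) --> q *)
let ff_and p q = ff_not (ff_or (ff_not p) (ff_not q))
let ff_iff p q = ff_and (Imp (p, q)) (Imp (q, p))
let ff_exists x p = ff_not (Forall (x, ff_not p))   (* ??x p = Not(!!x (Not p)) *)

let () =
  (* ??0. Atom(1,[V 0]) unfolds to Not(!!0. Not(Atom(1,[V 0]))) *)
  assert (ff_exists 0 (Atom (1, [V 0]))
          = Imp (Forall (0, Imp (Atom (1, [V 0]), False)), False))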

The sets of free variables in a term and in a formula are now defined:

|- (∀x. FVT (V x) = {x}) ∧ (∀f l. FVT (Fn f l) = LIST_UNION (MAP FVT l))

|- (FV False = {}) ∧
   (∀a l. FV (Atom a l) = LIST_UNION (MAP FVT l)) ∧
   (∀p q. FV (p --> q) = FV p UNION FV q) ∧
   (∀x p. FV (!! x p) = FV p DELETE x)

We prove various simple consequences of the definition, most importantly, by induction over the syntax of terms and formulas, that these always give finite sets, e.g.

|- ∀p. FINITE(FV p)

The most complex syntactic definition is of substitution. We have chosen a 'name-carrying' formalization of syntax, rather than indexing bound variables using some scheme following de Bruijn [1]. The latter is usually preferred when formalizing logical syntax precisely because substitution is simpler to define correctly. (Indeed, it is foreshadowed in the tradition – see Prawitz [10] for example – of distinguishing between [bound] variables and [free] parameters.) Our decision to stick with name-carrying terms was partly in order to stay closer to intuition, and as we shall see, defining substitution so that it renames variables appropriately is not too difficult. In some sense, things are easier for us since we always have semantics (to be defined later) as a check. By contrast, in formalizing, say, beta reduction in lambda-calculus, the syntax is the primary object of investigation and substitution must be defined in a way that is correct and intuitively clear or the whole exercise loses its point.

In order to define renaming substitution as a clean structural recursion over terms, it's necessary to define multiple substitutions rather than single ones [13]. This is because for something like (∀x. P(x) ⇒ P(y))[x/y] we need recursively to perform two substitutions on the body of the quantifier, one to substitute for y and one to rename the variable x. Therefore we take multiple parallel substitution as primitive. (Multiple serial substitution would also be acceptable, since by construction the substitutions in renaming clauses never interfere, but this seems less intuitive.) We represent substitutions simply as functions :num->term, mapping each variable index to the term to be substituted for that variable. (The identity substitution is therefore the type constructor for variable terms, V.) Although in practice we only need substitutions that are nontrivial (i.e. differ from V) only for finitely many arguments, nothing seems to be gained by imposing this restriction generally. Substitution at the term level is simple:

|- (∀x. termsubst v (V x) = v(x)) ∧ (∀f l. termsubst v (Fn f l) = Fn f (MAP (termsubst v) l))

It’s when we come to formulas that the complexities of renaming arise. We need to see if a variable in the substituted body would become captured by the bound variable in a universal quantifier, and if so, rename the bound variable appropriately.

|- (formsubst v False = False) ∧
   (formsubst v (Atom p l) = Atom p (MAP (termsubst v) l)) ∧
   (formsubst v (q --> r) = (formsubst v q --> formsubst v r)) ∧
   (formsubst v (!!x q) =
        let v' = valmod (x,V x) v in
        let z = if ∃y. y IN FV(!!x q) ∧ x IN FVT(v'(y))
                then VARIANT(FV(formsubst v' q)) else x in
        !!z (formsubst (valmod (x,V(z)) v) q))

Here the function valmod modifies a substitution, or more generally any function, as follows:

|- valmod (x,a) v = λy. if y = x then a else v y

while VARIANT picks a variable not in a given set. It is defined constructively over finite sets to find the maximum free variable in the set and add one to it. An important theorem gives the free variables in a substituted formula.

|- FV(formsubst i p) = {x | ∃y. y IN FV(p) ∧ x IN FVT(i y)}

while another states that only variables free in the formula are relevant to the outcome of performing a substitution.

|- (∀x. x IN (FV p) ⇒ (v’(x) = v(x))) ⇒ (formsubst v’ p = formsubst v p)
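The following self-contained OCaml sketch (again ours, not HOL's) mirrors the renaming substitution just defined; VARIANT is approximated by 'maximum variable plus one' and sets by lists, and the final check illustrates how the bound variable is renamed to avoid capture.

type term = V of int | Fn of int * term list
type form = False | Atom of int * term list | Imp of form * form | Forall of int * form

let union xs ys = List.sort_uniq compare (xs @ ys)
let unions ls = List.fold_left union [] ls

let rec fvt = function
  | V x -> [x]
  | Fn (_, l) -> unions (List.map fvt l)

let rec fv = function
  | False -> []
  | Atom (_, l) -> unions (List.map fvt l)
  | Imp (p, q) -> union (fv p) (fv q)
  | Forall (x, p) -> List.filter (fun y -> y <> x) (fv p)

(* valmod (x,a) v modifies the substitution v at the single point x *)
let valmod (x, a) v = fun y -> if y = x then a else v y

(* a variable not occurring in the given finite set: maximum plus one *)
let variant vars = 1 + List.fold_left max 0 vars

let rec termsubst v = function
  | V x -> v x
  | Fn (f, l) -> Fn (f, List.map (termsubst v) l)

let rec formsubst v = function
  | False -> False
  | Atom (a, l) -> Atom (a, List.map (termsubst v) l)
  | Imp (p, q) -> Imp (formsubst v p, formsubst v q)
  | Forall (x, q) ->
      let v' = valmod (x, V x) v in
      let fvs = fv (Forall (x, q)) in
      (* rename the bound variable if x is free in some substituted image *)
      let z =
        if List.exists (fun y -> List.mem x (fvt (v' y))) fvs
        then variant (fv (formsubst v' q))
        else x in
      Forall (z, formsubst (valmod (x, V z) v) q)

(* Substituting the term V 0 for variable 1 in !!0. Atom(0,[V 0; V 1]):
   the bound variable 0 is renamed so the incoming 0 is not captured. *)
let () =
  let v = valmod (1, V 0) (fun i -> V i) in
  match formsubst v (Forall (0, Atom (0, [V 0; V 1]))) with
  | Forall (z, Atom (0, [V z'; V 0])) -> assert (z = z' && z <> 0)
  | _ -> assert false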

3 Semantic Definitions

The above syntactic definitions are really only a tool; we are primarily interested in semantics. The first step is to define the notion of an interpretation in HOL: it is simply a triple consisting of a domain, interpretations for the function symbols and interpretations for the relation symbols. The three parts of a triple are selected by the HOL functions Dom, Fun and Pred respectively. The truth or falsity of a formula, or more generally the element of an interpretation’s domain selected by a term, depends in general not just on the interpretation M but also on a valuation of free variables v. Only at the top level, so to speak, if the formula has no free variables, does the valuation cease to play a role. Accordingly the valuations of terms and formulas are defined w.r.t. an interpretation and a valuation:

|- (∀x. termval M v (V x) = v(x)) ∧ (∀f l. termval M v (Fn f l) = Fun(M) f (MAP (termval M v) l))

|- (holds M v False = F) ∧
   (∀a l. holds M v (Atom a l) = Pred(M) a (MAP (termval M v) l)) ∧
   (∀p q. holds M v (p --> q) = holds M v p ⇒ holds M v q) ∧
   (∀x p. holds M v (!! x p) = ∀a. a IN Dom(M) ⇒ holds M (valmod (x,a) v) p)
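As an illustration, here is a hedged OCaml sketch of termval and holds over an interpretation with an explicit finite domain, so that the universal quantifier clause becomes a finite conjunction; the record fields and the example model (arithmetic mod 2) are our own choices, not part of the formalization.

type term = V of int | Fn of int * term list
type form = False | Atom of int * term list | Imp of form * form | Forall of int * form

type 'a interp = {
  dom   : 'a list;                        (* Dom M  *)
  funs  : int -> 'a list -> 'a;           (* Fun M  *)
  preds : int -> 'a list -> bool;         (* Pred M *)
}

let valmod (x, a) v = fun y -> if y = x then a else v y

let rec termval m v = function
  | V x -> v x
  | Fn (f, l) -> m.funs f (List.map (termval m v) l)

let rec holds m v = function
  | False -> false
  | Atom (a, l) -> m.preds a (List.map (termval m v) l)
  | Imp (p, q) -> (not (holds m v p)) || holds m v q
  | Forall (x, p) ->
      List.for_all (fun a -> holds m (valmod (x, a) v) p) m.dom

(* Example: arithmetic mod 2 with binary function 0 (addition) and unary
   predicate 0 ("is zero"); "every element has an additive inverse",
   i.e. !!0. ??1. Atom(0,[Fn(0,[V 0; V 1])]), comes out true. *)
let () =
  let m = { dom = [0; 1];
            funs = (fun _ args -> (List.fold_left (+) 0 args) mod 2);
            preds = (fun _ args -> List.for_all (fun a -> a = 0) args) } in
  let ex x p = Imp (Forall (x, Imp (p, False)), False) in   (* ??x p *)
  let p = Forall (0, ex 1 (Atom (0, [Fn (0, [V 0; V 1])]))) in
  assert (holds m (fun _ -> 0) p)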

Once again, whether a formula holds depends only on the valuation of the free variables:

|- (∀x. x IN (FV p) ⇒ (v'(x) = v(x))) ⇒ (holds M v' p = holds M v p)

and so in particular for a formula with no free variables, v plays no role. Analogously, the interpretation of function symbols given by the interpretation is only important for the language of the formula concerned:

|- (Dom(M) = Dom(M')) ∧
   (∀P zs. Pred(M) P zs = Pred(M') P zs) ∧
   (∀f zs. (f,LENGTH zs) IN functions_form p ⇒ (Fun(M) f zs = Fun(M') f zs))
   ⇒ ∀v. holds M v p = holds M' v p

One of the main theorems relates substitution and holding in an interpretation. The proof is quite straightforward, though involves a slightly messy case analysis. Its intuitive plausibility is good evidence that our definition of substitution is satisfactory, and, indeed, its later usefulness means that from now on we can largely forget the details of how substitution was defined.

|- holds M v (formsubst i p) = holds M (termval M v o i) p

We have not yet imposed any restrictions on interpretations. For some purposes, like the above theorems, they are not needed, and it seems appropriate not to restrict their generality. But we often need to assume that the domain of an interpretation is nonempty; this is generally done in the logical literature, though a few authors like Johnstone [5] try to avoid it. More importantly, the interpretations of functions must indeed be functions back into the appropriate domain. We abbreviate this by saying that a triple is an interpretation of a particular language, and define it in HOL as follows:

|- interpretation (fns,preds) M =
       ∀f l. (f,LENGTH l) IN fns ∧ FORALL (λx. x IN Dom(M)) l
             ⇒ (Fun(M) f l) IN Dom(M)
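For a finite domain and a finite list of (name, arity) pairs this condition is checkable by enumeration; here is a hedged OCaml sketch (the record type and names are ours, carried over from the earlier sketch).

type 'a interp = {
  dom   : 'a list;
  funs  : int -> 'a list -> 'a;
  preds : int -> 'a list -> bool;
}

(* all argument lists of length n drawn from dom *)
let rec tuples dom n =
  if n = 0 then [[]]
  else List.concat_map (fun a -> List.map (fun t -> a :: t) (tuples dom (n - 1))) dom

(* the interpretation condition for the listed function symbols *)
let is_interpretation (fns : (int * int) list) (m : 'a interp) =
  List.for_all
    (fun (f, n) ->
       List.for_all (fun args -> List.mem (m.funs f args) m.dom) (tuples m.dom n))
    fns

let () =
  let m = { dom = [0; 1];
            funs = (fun _ args -> (List.fold_left (+) 0 args) mod 2);
            preds = (fun _ _ -> true) } in
  assert (is_interpretation [(0, 2); (1, 0)] m)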

Similarly, we are only really interested in valuations that map into the domain of an interpretation, and so we define:

|- valuation(M) v = ∀x. v(x) IN Dom(M)

We also define what it means for an interpretation to satisfy (be a model of) a set of formulas:

|- M satisfies fms = ∀v p. valuation(M) v ∧ p IN fms ⇒ holds M v p

We might wish to define satisfiability as:

satisfiable fms = ∃M. M satisfies fms

However such a definition cannot be made with the appropriate type variable on the right-hand side; one can only define a notion of being satisfiable in models of a certain type. This is ultimately unimportant because we will prove in what follows that a formula is satisfiable iff it is satisfiable over the natural numbers (the Löwenheim-Skolem theorem). In any case, even in systems like ZF set theory which allow us to state the required quantification directly, we must in general take care that it corresponds to the intuitive notion. For second order logic, the identification of validity in a ZF-like cumulative set theory with a more general notion ('Kreisel's principle') is known to be a strong assumption, entailing higher order reflection principles [12].

4 Proofs of the Metatheorems

The proofs of the metatheorems given by K&K are based on several steps:

– Compactness for the propositional subsystem
– Prenex and Skolem normal forms
– Use of canonical models

We will consider each of these steps separately.

4.1 Propositional Logic

We define the class of quantifier-free formulas by recursion over the structure of formulas.

|- (qfree False = T) ∧
   (qfree (Atom n l) = T) ∧
   (qfree (p --> q) = qfree p ∧ qfree q) ∧
   (qfree (!!x p) = F)

If a formula is quantifier-free, we can look at it not as a first order formula based on the appropriate language, but rather as a formula of propositional logic based on atomic first order formulas. In this case, we have a separate notion of holding in a valuation, which in this context is a function of type form->bool indicating whether each atom is considered true or false.

|- (pholds v False = F) ∧
   (pholds v (Atom p l) = v (Atom p l)) ∧
   (pholds v (q --> r) = pholds v q ⇒ pholds v r) ∧
   (pholds v (!!x q) = v (!!x q))
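A small OCaml sketch of pholds over the same hypothetical datatype as before, where a propositional valuation is a function from formulas (viewed as atoms) to booleans:

type term = V of int | Fn of int * term list
type form = False | Atom of int * term list | Imp of form * form | Forall of int * form

let rec pholds (v : form -> bool) = function
  | False -> false
  | Atom (a, l) -> v (Atom (a, l))
  | Imp (p, q) -> (not (pholds v p)) || pholds v q
  | Forall (x, q) -> v (Forall (x, q))     (* quantified formulas treated as atoms *)

(* (A --> False) --> False is propositionally equivalent to A *)
let () =
  let a = Atom (0, [V 0]) in
  let v fm = (fm = a) in
  assert (pholds v (Imp (Imp (a, False), False)) = pholds v a)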

The last clause turns out to be irrelevant to the later proofs, since the function pholds is only used for quantifier-free formulas. We chose to define it this way, treating quantified formulas as atoms, with the vague idea of later formalizing a first order proof system given by Enderton [2], which has the nice feature that a formula is valid iff it follows propositionally from the (infinite) set of axioms, interpreting quantified formulas as atomic. However at time of writing we haven’t tackled this. The key theorem for the propositional subsystem is compactness, which states that if all finite subsets of a set of (quantifier-free) formulas are propositionally satisfiable, then so is the set as a whole.

|- (∀p. p IN A ⇒ qfree p) ∧
   (∀B. FINITE(B) ∧ B SUBSET A ⇒ ∃d. ∀r. r IN B ⇒ pholds(d) r)
   ⇒ ∃d. ∀r. r IN A ⇒ pholds(d) r

The proof is a fairly routine application of Zorn's Lemma, already proved in HOL from the primitive form of the Axiom of Choice. We start by defining the notions of propositional satisfiability and finite satisfiability:

|- psatisfiable s = ∃v. ∀p. p IN s ⇒ pholds v p

|- finsat s = ∀t. t SUBSET s ∧ FINITE(t) ⇒ psatisfiable t

We can now rephrase compactness as: if a set is finitely propositionally satisfiable, and contains only quantifier-free formulas, then it is propositionally satisfiable. We use Zorn's Lemma to show that every finitely satisfiable set can be extended to a maximal one:

|- finsat(A) ⇒ ∃B. A SUBSET B ∧ finsat(B) ∧
                   ∀C. B SUBSET C ∧ finsat(C) ⇒ (C = B)

The use of Zorn’s Lemma is unnecessary here; since the set of primitive propositions is countable, we can construct the maximal set by recursion: just enumerate the formulas (ψn) and then add either ψn or ¬ψn to the starting set at each stage; it’s easy to see this preserves finite satisfiability. However we chose to use a proof that would also work if we lifted the restriction to countable languages. K&K assert that this sort of step-by-step proof also works for any wellordered set of primitive propositions, but it requires nontrivial changes to work over a non-discrete order. Now we can define a valuation based on the maximal finitely satisfiable set. This works because the set has the following closure properties:

|- finsat(B) ∧ (∀C. B SUBSET C ∧ finsat(C) ⇒ (C = B)) ⇒ ¬(False IN B)

|- finsat(B) ∧ (∀C. B SUBSET C ∧ finsat(C) ⇒ (C = B)) ⇒ ∀p q. (p --> q) IN B = p IN B ⇒ q IN B

Because of the HOL identification of sets and predicates, the maximal finitely consistent set is a satisfying valuation for itself, and a fortiori, for the starting set. Hence the compactness theorem is established.

4.2 Normal Forms

The next stage is to show that each formula has a prenex normal form and a (purely universal) Skolem normal form. We first define, inductively, the appropriate syntactic classes:

|- (∀p. qfree p ⇒ prenex p) ∧ (∀x p. prenex p ⇒ prenex (!!x p)) ∧ (∀x p. prenex p ⇒ prenex (??x p))

|- (∀p. qfree p ⇒ universal p) ∧ (∀x p. universal p ⇒ universal (!!x p))
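The two classes can be transcribed as recognizer functions in the hypothetical OCaml datatype used earlier; note that the ??x p case has to be matched against its unfolding Not(!!x (Not p)), i.e. Imp(Forall(x, Imp(p, False)), False), since ?? is a derived form.

type term = V of int | Fn of int * term list
type form = False | Atom of int * term list | Imp of form * form | Forall of int * form

let rec qfree = function
  | False | Atom _ -> true
  | Imp (p, q) -> qfree p && qfree q
  | Forall _ -> false

let rec prenex = function
  | Forall (_, p) -> prenex p
  | Imp (Forall (_, Imp (p, False)), False) -> prenex p   (* the ??x p shape *)
  | p -> qfree p

let rec universal = function
  | Forall (_, p) -> universal p
  | p -> qfree p

let () =
  let ex x p = Imp (Forall (x, Imp (p, False)), False) in
  assert (prenex (Forall (0, ex 1 (Atom (0, [V 0; V 1])))));
  assert (not (universal (ex 1 (Atom (0, [V 1])))))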

The goal is to prove that every formula has a logically equivalent prenex form, and also a purely universal Skolem normal form that, while not logically equivalent, is satisfiable iff the original formula is. We define constructive procedures for calculating these forms, that can be executed by conditional rewriting. This is not so much because such concreteness seems useful – though it can hardly be bad – but because otherwise we get tripped up by type quantification. Logical equivalence involves a quantification over all models with arbitrary domain types, so if we express these theorems as existence assertions:

∀p. ∃q. ∀M :(α)interpretation. E[M, p, q]

we have the problem that we have really stated:

∀α. ∀p. ∃q. ∀M :(α)interpretation. E[M, p, q]

rather than

∀p. ∃q. ∀α. ∀M :(α)interpretation. E[M, p, q]

This means that according to the theorem, the prenex form q might depend on the type α! This problem no longer arises if instead we prove:

∀p. ∀M :(α)interpretation. E[M, p, Prenex(p)]

Prenexing and Skolemization can't quite be defined as primitive recursions over the structure of formulas. For example it can happen that we need to rename variables, so we define a function on a formula in terms of a substitution instance of one of its parts. It's easier, then, to argue by wellfounded induction on the size of a formula (as do K&K), which we define as:

|- (size False = 1) ∧
   (size (Atom p l) = 1) ∧
   (size (q --> r) = size q + size r) ∧
   (size (!!x q) = 1 + size q)

We can define the recursive functions we want by using this, together with the fact that the formulas !!x p and ??y q are always distinct. (This needs to be proved since the latter is not a primitive notion – for example it isn’t true that Not p and ??y q are always distinct.) We use the fact that substitution doesn’t change the size of a formula, an easy structural induction:

|- size (formsubst i p) = size p

To prenex an arbitrary formula, we need to be able to prenex falsity, atoms, implications and universal quantifications, for then the prenexability of an arbitrary formula follows by structural induction. Falsity and atoms are trivial; they are already prenex. Universal quantifiers are also easy: just prenex the body. This leaves implication as the only nontrivial case, with the overall definition of the prenexing procedure being:

|- (Prenex False = False) ∧
   (Prenex (Atom a l) = Atom a l) ∧
   (Prenex (p --> q) = Prenex_left (Prenex p) (Prenex q)) ∧
   (Prenex (!!x p) = !!x (Prenex p))

This requires the function Prenex_left, which is supposed to prenex an implication assuming that its antecedent and consequent are already prenex. This in its turn is defined by the following recursion equations, shown to be admissible as indicated above:

|- (∀p x q. Prenex_left (!!x q) p =
        let y = VARIANT(FV(!!x q) UNION FV(p)) in
        ??y (Prenex_left (formsubst (valmod (x,V y) V) q) p)) ∧
   (∀p x q. Prenex_left (??x q) p =
        let y = VARIANT(FV(??x q) UNION FV(p)) in
        !!y (Prenex_left (formsubst (valmod (x,V y) V) q) p)) ∧
   (∀p q. qfree q ⇒ (Prenex_left q p = Prenex_right q p))

Note that the intimidating-looking formsubst (valmod (x,V y) V) simply substitutes variable y for variable x, leaving other variables unchanged. (Perhaps it would be worth defining a shorter form for this special case.) Apart from performing variable renaming, this simply pulls the quantifiers in the antecedent outwards one by one, flipping universal quantifiers into existential ones and vice versa. When the antecedent is quantifier-free, an analogous function Prenex_right is called to do likewise for the consequent. For example, applied to (∀x. P(x)) ⇒ Q(x) this yields ∃y. (P(y) ⇒ Q(x)), with the bound variable renamed away from the free x. It is now straightforward to prove successively that the three prenexing functions Prenex_right, Prenex_left and Prenex do the right thing in the appropriate situation, most notably:

|- prenex(Prenex p) ∧
   (FV(Prenex p) = FV(p)) ∧
   (language {Prenex p} = language {p}) ∧
   ∀M v. ¬(Dom M = EMPTY) ⇒ (holds M v (Prenex p) = holds M v p)

This says that the result of applying Prenex to a formula p is indeed prenex, has the same language and free variables as p and is logically equivalent to it. Note that for the last part we need to assume nonemptiness of the domain. Prenex normal forms do not in general exist without this restriction, since any prenex formula containing quantifiers is either trivially true or false in an empty domain, regardless of the valuation.

Next we come to Skolem normal forms. The idea here is to replace a formula ∃y. P[x1, ..., xn, y] with free variables x1, ..., xn by P[x1, ..., xn, f(x1, ..., xn)] where f is a fresh function symbol, not originally appearing in P. While not logically equivalent in general, the Skolemized form implies the original, while any model of the original can be extended to a model of the Skolemized form by picking the appropriate interpretation of f, or more precisely in our framework, of (f, n). The starting point is a HOL definition of just this procedure:

|- Skolem1 f x p = formsubst (valmod (x,Fn f (MAP V (list_of_set(FV(??x p))))) V) p

Here list_of_set, not surprisingly, converts a finite set to a list. It's defined as an inverse of the operation of converting a list to a set:

|- (set_of_list [] = {}) ∧ (set_of_list (CONS h t) = h INSERT (set_of_list t))

|- list_of_set s = εl. (set_of_list l = s) ∧ (LENGTH l = CARD s)

We're mainly interested in Skolemizing a formula already in prenex form, so in the main theorem about Skolem1, we make this assumption. The derivation of the main theorem is a bit lengthy, with lots of details to get right, but intuitively obvious. The most difficult part is showing how to extend a model of the existing formula to a model of the Skolemized form. We pick an appropriate denotation of the new function symbol, namely:

λg zs. if (g = f) ∧ (LENGTH zs = CARD(FV(??x p)))
       then εa. a IN Dom(M) ∧
                holds M (valmod (x,a)
                           (ITLIST valmod
                              (MAP2 (λx a. (x,a)) (list_of_set(FV(??x p))) zs)
                              (λz. εc. c IN Dom(M)))) p
       else Fun(M) g zs

where:

|- (MAP2 f [] [] = []) ∧ (MAP2 f (CONS h1 t1) (CONS h2 t2) = CONS (f h1 h2) (MAP2 f t1 t2))

|- (ITLIST f [] b = b) ∧ (ITLIST f (CONS h t) b = f h (ITLIST f t b))

The definition looks complicated, but corresponds to intuition. We are defining the interpretation of function g on argument (list) zs. If it isn't the new function f with the right arity, then we just take the denotation given by M, i.e. we don't change the interpretation of other functions. Otherwise, we have that there exists some a satisfying (the interpretation of) p; we make the above definition to correspond to unfolding the definition of validity in this case. The valuation constructed maps each free variable – which is an argument of the Skolem function – to the appropriate object of the model.

However, we want to apply Skolemization repeatedly to get rid of all existential quantifiers, and moreover, for later use we need to be able to apply it to a set of formulas 'in parallel'. K&K (p. 23) say 'we assume that the function letters added to L(E) for different formulas F are different'. However it takes quite a bit of work to make sure of this. Roughly speaking, starting from a set S of formulas, we just need to take an additional set of distinct function symbols with size equal to the number of existential quantifiers in S. In something like ZF set theory this presents no serious difficulties. It's more awkward in HOL since we can only add function symbols from the same underlying type that was used originally, at present :num. It's perfectly possible that the original set of formulas included every possible function symbol, leaving us no more to add. So we can't literally start with exactly the same set of function symbols. First we need to map them into a type with room for additions. In fact we use the same type :num, starting with the following injective pairing function (we'll write <x, y> informally instead of NUMPAIR x y):

|- NUMPAIR x y = (2 EXP x) * (2 * y + 1)

and the corresponding destructors:

|- NUMFST(NUMPAIR x y) = x

|- NUMSND(NUMPAIR x y) = y
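The pairing and its destructors are easy to mirror in OCaml; a minimal sketch, assuming the arguments stay within machine-integer range and the destructors are only applied to outputs of numpair (which are always positive):

let numpair x y = (1 lsl x) * (2 * y + 1)          (* 2^x * (2y + 1) *)

(* NUMFST counts the factors of 2; NUMSND recovers y from the odd part *)
let rec numfst n = if n mod 2 = 0 then 1 + numfst (n / 2) else 0
let numsnd n =
  let rec odd_part m = if m mod 2 = 0 then odd_part (m / 2) else m in
  (odd_part n - 1) / 2

let () =
  assert (numfst (numpair 3 5) = 3);
  assert (numsnd (numpair 3 5) = 5)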

A similar approach works for any infinite type since one can define an injective pairing. Now we take a formula p and shift up all the function symbols from (k, n) to (<0, k>, n).

|- (bumpterm (V x) = V x) ∧ (bumpterm (Fn k l) = Fn (NUMPAIR 0 k) (MAP bumpterm l))

|- (bumpform False = False) ∧
   (bumpform (Atom p l) = Atom p (MAP bumpterm l)) ∧
   (bumpform (q --> r) = bumpform q --> bumpform r) ∧
   (bumpform (!!x r) = !!x (bumpform r))

Obviously we can make a corresponding change in the interpretation:

|- bumpmod(M) = Dom(M),(λk zs. Fun(M) (NUMSND k) zs),Pred(M)

so that

|- holds M v (bumpform p) = holds (unbumpmod M) v p

Now we have all function symbols of the form (<m + 1, k>, n) to use as Skolem functions. Using NUMPAIR, we define a 'Gödel numbering' of formulas, mapping a formula p to a number ⌜p⌝, written in HOL as num_of_form. Now we use (<1 + ⌜p⌝, k>, n) for the kth Skolem function (with arity n) used in Skolemizing a formula p. We will not show the HOL statement of the theorem (20 lines), but it says that the Skolemized version of a formula is universal, with the same free variables and a language only changed by adding the appropriate Skolem functions, that any model of a formula extends to a model of its Skolemized form, and that the Skolemized form logically implies the original. From this we distil the key fact needed later: a set of formulas is satisfiable iff the Skolemized form is. We add a final pass to strip off all the universal variables:

|- (specialize False = False) ∧
   (specialize (Atom p l) = Atom p l) ∧
   (specialize (q --> r) = q --> r) ∧
   (specialize (!!x r) = specialize r)

|- SKOLEM p = specialize(SKOLEMIZE p)

and the theorem looks like this:

|- (∃M. ¬(Dom M :A->bool = EMPTY) ∧
        interpretation (language s) M ∧
        M satisfies s) =
   (∃M. ¬(Dom M :A->bool = EMPTY) ∧
        interpretation (language {SKOLEM p | p IN s}) M ∧
        M satisfies {SKOLEM p | p IN s})

4.3 Canonical Models

Thanks to Skolemization, we can for many purposes restrict ourselves to considering quantifier-free formulas. In this case, they can also be treated as formulas of propositional logic, and the relation between these two views is central to what follows. First of all, given an interpretation M and valuation v we can choose a corresponding propositional valuation:

|- prop_of_model M v (Atom p l) = holds M v (Atom p l)

‘corresponding’ in the following precise sense:

|- qfree(p) ⇒ (pholds (prop_of_model M v) p = holds M v p)

That is, a quantifier-free formula holds in a given M and v iff it holds as a formula of propositional logic under valuation prop_of_model M v. Conversely, given a propositional valuation, we can find a corresponding first order interpretation and valuation. We are at liberty to interpret the atomic formulas to match the required propositional valuation, provided that the interpretation of function symbols is such that the interpretation cannot map distinct terms to identical objects of the domain. There are plenty of ways of avoiding this, the simplest being to take as the domain (a suitable subset of) the set of terms, and interpret function symbols 'as themselves', i.e.

|- canon_of_prop L d = terms(FST L),Fn,λp l. d(Atom p l)

There is some freedom over how to choose the domain of the model. We take the set of all terms restricted to a certain first order language L, which is defined inductively by:

|- (∀x. terms fns (V x)) ∧
   (∀f l. (f,LENGTH l) IN fns ∧ FORALL (terms fns) l ⇒ terms fns (Fn f l))
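Read as a recursive membership test, the definition says that a term lies in terms fns exactly when every function symbol it uses occurs in fns with the right arity; a small OCaml sketch (names ours):

type term = V of int | Fn of int * term list

let rec in_terms (fns : (int * int) list) = function
  | V _ -> true
  | Fn (f, l) ->
      List.mem (f, List.length l) fns && List.for_all (in_terms fns) l

let () =
  let fns = [(0, 2); (1, 0)] in
  assert (in_terms fns (Fn (0, [V 5; Fn (1, [])])));
  assert (not (in_terms fns (Fn (0, [V 5]))))   (* wrong arity for symbol 0 *)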

In general we say that an interpretation is a canonical interpretation of a language if the domain and interpretations of function symbols are defined as in the above particular case:

|- canonical L M = (Dom M = terms (FST L)) ∧ (∀f. Fun(M) f = Fn f)

Under the identity valuation V, terms are literally interpreted as themselves, and so canon_of_prop does indeed work as hoped, i.e.

|- qfree p ⇒ (holds (canon_of_prop L d) V p = pholds d p)

In particular, the theorem about prop_of_model tells us that a propositional tautology is first order valid in all interpretations, while this one tells us that if a formula holds in all (or even just all canonical) interpretations, then it is a propositional tautology.

|- qfree(p) ∧ (∀d. pholds d p) ⇒ ∀M v. holds M v p

|- qfree(p) ∧ (∀C v. canonical(language {p})C ⇒ holds C v p) ⇒ ∀d. pholds d p

This extends to an arbitrary set of formulas. However none of this works if we consider not validity, but satisfiability. If a formula is first order satisfiable it is propositionally satisfiable:

|- (∀p. p IN s ⇒ qfree p) ∧ M satisfies s ∧ valuation(M) v ⇒ (prop_of_model M v) psatisfies s

But if it’s propositionally satisfiable, we only know that it is satisfiable under the identity valuation, not more generally. For example, the formula P (x)∧¬P (y) is propositionally satisfiable but not first order satisfiable. However, it’s easy to prove a slight variant of the second theorem:

|- qfree p ⇒ (holds (canon_of_prop L d) v p = pholds d (formsubst v p))

This tells us that a formula is first order satisfiable iff all its substitution instances (in the appropriate language) are propositionally satisfiable:

|- (∀p. p IN s ⇒ qfree p) ∧
   d psatisfies {formsubst v p | p IN s ∧ ∀x. v x IN terms(FST L)}
   ⇒ (canon_of_prop L d) satisfies s

Moreover, note that we actually have something a bit stronger: if all substitution instances are propositionally satisfiable, then it has a canonical first order model. This fact, not isolated by K&K but hinted at in their proof of the Compactness theorem, is the key to getting the main metatheorems. First let us note that an interpretation satisfies all substitution instances of a set of formulas w.r.t. an arbitrary language iff it satisfies the set itself.

|- interpretation(language t) M ∧ ¬(Dom M = EMPTY)
   ⇒ (M satisfies {formsubst i p | p IN s ∧ ∀x. i x IN terms(FST(language t))} =
      M satisfies s)

Now we get a simple proof of the Compactness and Löwenheim-Skolem theorems together, whereas K&K establish them by separate arguments. If all finite subsets of a set of quantifier-free formulas have a model then the set as a whole has a (canonical) model:

|- (∀p. p IN s ⇒ qfree p) ∧
   (∀t. FINITE t ∧ t SUBSET s
        ⇒ ∃M. interpretation(language ss) M ∧ ¬(Dom(M) = EMPTY) ∧ M satisfies t)
   ⇒ ∃C. interpretation (language ss) C ∧ canonical (language ss) C ∧ C satisfies s

Once again we have a certain liberty over the language ss. Note that all the models of finite subsets are based on a particular type – HOL forces us to make such a restriction. However, precisely because we have as a corollary that every set of formulas that has a model has a canonical model, this doesn't amount to a genuine weakening of the theorem as normally stated.

Now we can use Skolemization to lift this theorem to the class of all formulas, not just quantifier-free ones. We reason as follows. If all finite subsets T of S are satisfiable, then so are the Skolemized forms T∗, and so all finite subsets of S∗, which must be contained in such a set. By the above theorem, S∗ has a canonical model, and this is also a model of S. This last step isn't quite true in our formalization; it's really a model of the modified form of S with the function symbols shifted. So when we reverse the transition, the model isn't actually canonical. We could have modified the construction of the canonical model appropriately to make this work, but it doesn't seem worth it: the interesting thing about the model is really that it's countable.

We separate out the fact that every set of formulas that has a model has a model with domain a subset of the set of terms, simply using the fact that if a set is satisfiable, so are all finite subsets of it. By applying the Gödel numbering on terms used earlier, we can map the model into the natural numbers and obtain the usual form of the Löwenheim-Skolem theorem.

|- interpretation (language s) M ∧ ¬(Dom M :A->bool = EMPTY) ∧ M satisfies s
   ⇒ ∃N. interpretation (language s) N ∧ ¬(Dom N :num->bool = EMPTY) ∧ N satisfies s

Finally, using the same lemmas about substitution instances, we easily obtain the Uniformity (aka Skolem-Gödel-Herbrand) theorem, i.e. that if a purely existential formula is valid, some disjunction of substitution instances of the body is also valid, or equivalently, is a propositional tautology. The reasoning is straightforward; we'll sketch it with one variable for clarity. If ∃x. p is valid then ∀x. ¬p is unsatisfiable. Hence the set of all substitution instances ¬p[t/x] is propositionally unsatisfiable; by propositional compactness, so is some finite subset, and hence a finite conjunction ¬p[t1/x] ∧ ... ∧ ¬p[tn/x]. But this means that the negation p[t1/x] ∨ ... ∨ p[tn/x] is a propositional tautology and first order valid. In HOL we prove the full theorem in this form:

|- qfree p ∧
   (∀C v. ¬(Dom C = EMPTY) ∧ valuation C v ⇒ holds C v (ITLIST ?? xs p))
   ⇒ ∃is. (∀i x. MEM i is ⇒ terms (FST (language {p})) (i x)) ∧
          (∀d. pholds d (ITLIST (||) (MAP (λi. formsubst i p) is) False))

The iterated uses of quantifiers and disjunctions are expressed using the standard list operation ITLIST defined earlier.

5 Related Work

Most related work in theorem provers is more heavily syntactic, dealing with proof theory. However, Persson [8] formalizes in the ALF prover a constructive completeness proof for intuitionistic logic w.r.t. models based on formal topology.

6 Conclusions

Our original goal has been achieved: we have machine-checked the proofs of the theorems given here, and have significantly improved the original proofs. While applications weren't at the front of our mind, we have since used the above work as the basis of a construction of the hyperreals. First we extended the Compactness theorem to first order logic with equality (once more following K&K). This means restricting ourselves to normal models where binary predicate 0 is interpreted as equality. Then we used this to show that there is a model of the set of all first order statements true in the reals which also has infinite elements. This could be used to give a foundation for theorem proving in nonstandard analysis, though we haven't pursued that.

Our paper can be read as strong support for adding first-class type quantifiers to HOL, as proposed by Melham [7]. More generally, though, types make little positive contribution in this area, bearing out the commonplace feeling that types tend to become a hindrance in more abstract parts of mathematics.

Just whether the awkwardness of choosing Skolem functions indicates a failure in HOL (logic or system) or a genuine gap in K&K is a matter of opinion. Certainly, it's quite intuitive that one can pick the Skolem functions independently, but it's interesting that the reasoning we did almost duplicates the kind of constructions in Henkin-style completeness proofs [2]. These tend to look ugly because one needs to add constants to the language (to act as 'witnesses' for existential assertions), but this yields new existential formulas so the procedure needs to be iterated. The K&K proofs avoid this nicely by Skolemizing once and for all. Yet when doing this, we still need to pay attention to the same kinds of issues.

References

1. N. G. de Bruijn. Lambda calculus notation with nameless dummies, a tool for automatic formula manipulation, with application to the Church-Rosser theorem. Indagationes Mathematicae, 34, 381–392, 1972.
2. H. B. Enderton. A Mathematical Introduction to Logic. Academic Press, 1972.
3. A. D. Gordon and T. Melham. Five axioms of alpha-conversion. In J. von Wright, J. Grundy, and J. Harrison (eds.), Theorem Proving in Higher Order Logics: 9th International Conference, TPHOLs'96, Volume 1125 of Lecture Notes in Computer Science, Turku, Finland, pp. 173–190. Springer-Verlag, 1996.
4. M. J. C. Gordon and T. F. Melham. Introduction to HOL: A theorem proving environment for higher order logic. Cambridge University Press, 1993.
5. P. T. Johnstone. Notes on Logic and Set Theory. Cambridge University Press, 1987.
6. G. Kreisel and J.-L. Krivine. Elements of mathematical logic: model theory (Revised second ed.). Studies in Logic and the Foundations of Mathematics. North-Holland, 1971. First edition 1967. Translation of the French 'Éléments de logique mathématique, théorie des modèles' published by Dunod, Paris in 1964.
7. T. F. Melham. The HOL logic extended with quantification over type variables. In L. J. M. Claesen and M. J. C. Gordon (eds.), Proceedings of the IFIP TC10/WG10.2 International Workshop on Higher Order Logic Theorem Proving and its Applications, Volume A-20 of IFIP Transactions A: Computer Science and Technology, IMEC, Leuven, Belgium, pp. 3–18. North-Holland, 1992.
8. H. Persson. Constructive Completeness of Intuitionistic Predicate Logic: A Formalisation in Type Theory. Licentiate thesis, Department of Computing Science, Chalmers University of Technology and University of Göteborg, Sweden, 1996.
9. R. Pollack. The Theory of LEGO: A Proof Checker for the Extended Calculus of Constructions. Ph.D. thesis, University of Edinburgh, 1994.
10. D. Prawitz. Natural Deduction: A Proof-Theoretical Study, Volume 3 of Stockholm Studies in Philosophy. Almqvist and Wiksells, 1965.
11. P. Rudnicki. An overview of the MIZAR project, 1992. Available by anonymous FTP from menaik.cs.ualberta.ca as pub/Mizar/Mizar_Over.tar.Z.
12. S. Shapiro. Foundations without Foundationalism: A Case for Second-Order Logic. Number 17 in Oxford Logic Guides. Clarendon Press, 1991.
13. A. Stoughton. Substitution revisited. Theoretical Computer Science, 17, 317–325, 1988.
14. A. Trybulec. The Mizar-QC/6000 logic information language. ALLC Bulletin (Association for Literary and Linguistic Computing), 6, 136–140, 1978.
