<<

Formalizing Foundations of

Mihnea Iancu and Florian Rabe , Jacobs University Bremen, Bremen, Germany

Abstract Over the recent decades there has been a trend towards formalized mathematics, and a number of sophisticated systems have been developed to support the - ization process and mechanically verify its result. However, each tool is based on a specic foundation of mathematics, and formalizations in dierent systems are not necessarily compatible. Therefore, the integration of these foundations has re- ceived growing interest. We contribute to this goal by using LF as a foundational framework in which the mathematical foundations themselves can be formalized and therefore also the relations between them. We represent three of the most important foundations  Isabelle/HOL, Mizar, and ZFC  as well as relations between them. The relations are formalized in such a way that the frame- work permits the extraction of translation functions, which are guaranteed to be well-dened and sound. Our work provides the starting point of a systematic study of formalized foundations in order to compare, relate, and integrate them.

1 Introduction

The 20th century saw signicant advances in the eld of foundations of mathematics stimulated by the discovery of in naive , e.g., Russell's of unlimited set comprehension. Several seminal works have redeveloped and advanced large parts of mathematics based on one coherent choice of foundation, most notably the Principia (Whitehead & Russell 1913) and the works by Bourbaki (Bourbaki 1964). Today various avors of axiomatic set theory and provide a number of well- understood foundations. Given a development of mathematics in one xed foundation, it is possible to give a fully formal in which every mathematical valid in that foundation can be written down. Then mathematics can  in principle  be reduced to the manipulation of these expression, an approach called and most prominently expressed in Hilbert's program. Recently this approach has gained more and more momentum due to the advent of computer technology: With machine support, the formidable eort of formalizing mathematics becomes feasible, and the trust in the soundness of an can be reduced to the trust in the implementation of the foundation. However, compared to traditional mathematics, this approach has the drawback that it heavily relies on the choice of one specic foundation. Traditional mathematics, on

1 the other hand, frequently and often crucially abstracts from and moves freely between foundations to the extent that many mathematical papers do not mention the exact foundation used. This level of abstraction is very dicult to capture if every is rigorously reduced to a xed foundation. Moreover, in formalized mathematics, dier- ent systems implementing dierent (or even the same or similar) foundations are often incompatible, and no reuse across systems is possible. But the high cost of formalizing mathematics makes it desirable to join forces and integrate foundational systems. Currently, due to the lack of integration, signicant overlap and redundancies exist between libraries of formalized mathematics, which slows down the progress of large projects such as the formal proofs of the Kepler (Hales 2003). Our contribution can be summarized as follows. Firstly, we introduce a new method- ology for the formal integration of foundations: Using a logical framework, we formalize not only mathematical but also the foundations themselves. This permits for- mally stating and proving relations between foundations. Secondly, we demonstrate our approach by formalizing three of the most widely-used important foundations as well as translations between them. Our work provides the starting point of a formal library of foundational systems that complements the existing foundation-specic libraries and provides the basis for the systematic and formally veried integration of systems for formalized mathematics. We begin by describing our approach and reviewing related work in Sect. 2. Then, in Sect. 3, we give an overview of the logical framework we use in the remainder of the paper. We give a new formalization of traditional mathematics based on ZFC set theory in Sect. 4. Then we formalize two foundations with particularly large formalized libraries: Isabelle/HOL (Nipkow, Paulson & Wenzel 2002) in Sect. 5 and Mizar (Trybulec & Blair 1985) in Sect. 6. We also give a translation from Isabelle/HOL into ZFC and sketch a partial translation from Mizar (which is stronger than ZFC) to ZFC. We discuss our work and conclude in Sect. 7. Our formalizations span several thousand lines of declarations, and their are correspondingly simplied. The full sources are available at (Iancu & Rabe 2010).

2 Problem Statement and Related Work

Automath (de Bruijn 1970) and the formalization of Landau's analysis (Landau 1930, van Benthem Jutting 1977) were the rst major success of formalized mathematics. Since then a number of computer systems have been put forward and have been adopted to varying degrees to formalize mathematics such as LCF (Gordon, Milner & Wadsworth 1979), HOL (Gordon 1988), HOL Light (Harrison 1996), Isabelle/HOL (Nipkow et al. 2002), IMPS (Farmer, Guttman & Thayer 1993), Nuprl (Constable, Allen, Bromley, Cleaveland, Cremer, Harper, Howe, Knoblock, Mendler, Panangaden, Sasaki & Smith 1986), Coq (Coquand & Huet 1988), Mizar (Trybulec & Blair 1985), Isabelle/ZF (Paulson & Coen 1993), and the body of peer-reviewed formalized mathematics is growing (T. Hales and G. Gonthier and J. Harrison and F. Wiedijk 2008, Matuszewski 1990, Klein,

2 Nipkow & (eds.) 2004). A comparison of some formalizations of foundations in Automath, including ZFC and Isabelle/HOL, is given in (Wiedijk 2006). The problem of interoperability and integration between these systems has received growing attention recently, and a number of connections between them have been estab- lished. (Obua & Skalberg 2006) and (McLaughlin 2006) translate between Isabelle/HOL and HOL Light; (Keller & Werner 2010) from HOL Light to Coq; and (Krauss & Schropp 2010) from Isabelle/HOL to Isabelle/ZF. The OpenTheory format (Hurd 2009) was designed as an interchange format for dierent implementations of higher order . We call these translations dynamically veried because they have in common that they translate in such a way that the target system reproves every translated . One can think of the source system's as an oracle for the target system's proof search. This approach requires no reasoning about or trust in the translation so that users of the target system can reuse translated theorems without making the source system or the translation part of their trusted base. Therefore, such translations can be implemented and put to use relatively quickly. It is no surprise that such translations are advanced by researchers working with the respective target system. Still, dynamically veried translations can be unsatisfactory. The proofs of the source theorems may not be available because they only exist transiently when the source system processes a proof script. The source system might not be able to export the proofs, or they may be too large to translate. In that case, it is desirable to translate only the theorems and appeal to a general result that guarantees the soundness of the theorem translation. However, the statement of soundness naturally lives outside either of the two involved foundations. Therefore, stating, let alone proving, the soundness of a translation re- quires a third in which source and target system and the translation are represented. We call the third system a foundational framework, and if the soundness of a translation is proved in a foundational framework, we speak of a statically veried translation. Statically veried translations are theoretically more appealing because the soundness is proved once and for all. Of course, this requires the additional assumptions that the foundational framework is consistent and that the representations in the framework are adequate. If this is a concern, the soundness proof should be constructive, i.e., produce for every proof in the source system a translated proof in the target system. Then users of the target system have the option to recheck the translated proof. The most comprehensive example of a statically veried translation  from HOL to Nuprl (Naumov, Stehr & Meseguer 2001)  was given in (Schürmann & Stehr 2004). HOL and Nuprl proof terms are represented as terms in the framework Twelf (Harper, Honsell & Plotkin 1993, Pfenning & Schürmann 1999) using the judgments-as-types methodology. The translation and a constructive soundness proof are formalized as type-preserving logic programs in Twelf. The soundness is veried by the Twelf type checker, and the well-denedness  i.e., the totality and termination of the involved logic programs  is proved using Twelf. In this work, we demonstrate a general methodology for statically veried translations. We formalize foundations as signatures in the logical framework Twelf, and we use the

3 LF module system's (Rabe & Schürmann 2009) translations-as-morphisms methodology to formalize translations between them as morphisms. This yields translations that are well-dened and sound by design and which are veried by the Twelf type checker. Moreover, they are constructive, and the extraction of translation programs is straightforward. Our work can be seen as a continuation of the Logosphere project (Pfenning, Schür- mann, Kohlhase, Shankar & Owre. 2003), of which the above HOL-Nuprl translation was a part. Both Logosphere and our work use LF, and the main dierence is that we use the new LF module system to reuse encodings and to encode translations. Logosphere had to use monolithic encodings and used programs to encode translations. The latter were either Twelf logic program or Delphin (Poswolsky & Schürmann 2008) functional programs, and their well-denedness and termination was statically veried by Twelf and Delphin, respectively. Using the module system, translations can be stated in a more concise and declarative way, and the well-denedness of translations is guaranteed by the LF type theory. There are some alternative frameworks in which foundations can be formalized: other variants of dependent type theory such as Agda (Norell 2005), type theories such as Coq based on the calculus of inductive constructions, or the Isabelle framework (Paulson 1994) based on polymorphic higher-order logic. All of these provide roughly comparably expressive module systems. We choose LF because the judgments-as-types and relations- as-morphisms methodologies are especially appropriate to formalize foundations and their relations. We discuss related work pertaining to the individual foundations separately below.

3 The Edinburgh Logical Framework

The Edinburgh Logical Framework (Harper et al. 1993) (LF) is a formal meta-language used for the formalization of deductive systems. It is related to Martin-Löf type theory and the corner of the cube that extends simple type theory with dependent types and kinds. We will work with the Twelf (Pfenning & Schürmann 1999) implementation of LF and its module system (Rabe & Schürmann 2009). The central notion of the LF type theory is that of a signature, which is a list Σ of kinded type family symbols a : K or typed constant symbols c : A. It is convenient to permit those to carry optional denitions, e.g., c : A = t to dene c as t. (For our purposes, it is sucient to assume that these abbreviations are transparent to the underlying type theory, which avoids some technical complications. Of course, they are implemented more intelligently.) LF contexts are lists Γ of typed variables x : A, i.e., there is no polymorphism. Relative to a signature Σ and a context Γ, the expressions of the LF type theory are kinds K, kinded type families A : K, and typed terms t : A. type is a special kind, and type families of kind type are called types. We will use the concrete of Twelf to represent expressions:

4 • The dependent function type Πx:AB(x) is written {x : A} B x, and correspondingly for dependent function kinds {x : A} K x. As usual we write A → B when x does not occur free in B.

• The corresponding λ-abstraction λx:At(x) is written [x : A] t x, and correspondingly for type families [x : A](B x). • As usual, application is written as juxtaposition. Given two signatures sig S = {Σ} and sig T = {Σ0}, a signature morphism σ from S to T is a list of assignments c := t and a := A. They are called views in Twelf and declared as view v : S → T = {σ}. Such a view is well-formed if • σ contains exactly one assignment for every c or a that is declared in Σ without a denition,

• each assignment c := t assigns to the Σ-symbol c : A a Σ0-term t of type σ(A), • each assignment a := K assigns to the Σ-symbol a : K a Σ0-type family K of type σ(K). Here σ is the homomorphic extension of σ that maps all closed expressions over Σ to closed expressions over Σ0, and we will write it simply as σ in the sequel. The central result about signature morphisms (see (Harper, Sannella & Tarlecki 1994)) is that they preserve typing and αβη-equality: Judgments `Σ t : A imply judgments `Σ0 σ(t): σ(A) and similarly for kinding judgments and equality. Finally, the Twelf module system permits inclusions between signatures and views. If a signature T contains the declaration include S, then all symbols declared in (or included into) S are available in T via qualied , e.g., c of S is available as S.c. Our inclusions will never introduce clashes, and we will write c instead of S.c for simplicity. Correspondingly, if S is included into T , and we have a view v from S to T 0, a view from T to T 0 may include v via the declaration include v. This yields the following for Twelf where gray color denotes optional parts.

Toplevel G ::= · | G, sig T = {Σ} | G, view v : S → T = {σ} Signatures Σ ::= · | Σ, include S | Σ, c : A = t | Σ, a : K = t Morphisms σ ::= · | Σ, include v | σ, c := t | σ, a := A Kinds K ::= type | {x : A} K Type families A ::= a | A t | [x : A] A | {x : A} A Terms t ::= c | t t | [x : A] t | x

We will sometimes omit the type of a bound variable if it can be inferred from the context. Moreover, we will frequently use implicit : If c is declared as c : {x : A} B and the value of s in c s can be inferred from the context, then c may alternatively be declared as c : B (with a free variable in B that is implicitly bound) and used as c (where the argument to c inferred). We will also use xity and precedence declarations in the style of Twelf to make applications more readable.

5 Example 1 (Representation of FOL in LF). The following is a fragment of an LF signature for rst-order logic that we will use later to formalize set theory: sig FOL = { set : type prop : type ded : prop → type prefix 0

⇒ : prop → prop → prop infix 3 ∧ : prop → prop → prop infix 2 ∀ :(set → prop) → prop . = : set → set → prop infix 4 ⇔ : prop → prop → prop infix 1 = [a][b](a ⇒ b) ∧ (b ⇒ a) ∧I : ded A → ded B → ded A ∧ B

∧El : ded A ∧ B → ded A

∧Er : ded A ∧ B → ded B ⇒I :(ded A → ded B) → ded A ⇒ B ⇒E : ded A ⇒ B → ded A → ded B ∀I :({x : i} ded (F x)) → ded (∀ [x] F x) ∀E : ded (∀ [x] F x) → {c : i} ded (F c) } This introduces two types set and prop for sets and , respectively, and a type family ded indexed by propositions. Terms p of type ded F represents proofs of F , and the inhabitation of ded F represents the provability of F . Higher-order abstract syntax is used to represent binders, e.g., ∀([x : set] F x) represents the ∀x : set.F (x). Equivalence is introduced as a dened connective. Note that the argument of ded does not need brackets as ded has the weakest prece- dence. Moreover, by convention, the Twelf binders [] and {} always bind as far to the right as is consistent with the placement of brackets. As examples for rules, we give introduction and elimination rules conjunction and implication. Here A and B of type prop are implicit arguments whose types and values are inferred. For example, the theorem of commutativity of conjunction can now be stated as comm_conj : ded (A ∧ B) ⇔ (B ∧ A)

= ∧I (⇒I [p] ∧I (∧Er p)(∧El p))

(⇒I [p] ∧I (∧Er p)(∧El p)) The LF guarantees that the proof is correct.

4 Zermelo-Fraenkel Set Theory

In this section, we present out formalization of Zermelo-Fraenkel set theory. We give an overview over our variant of ZFC in Sect. 4.1 and describe its encoding in Sect. 4.2

6 and 4.3. Finally, we discuss related formalizations in Sect. 4.4.

4.1 Preliminaries Zermelo-Fraenkel set theory (Zermelo 1908, Fraenkel 1922) (with or without choice) is the most common implicitly or explicitly assumed foundation of mathematics. It represents all mathematical objects as sets related by the binary ∈ predicate. Propositions are stated using an untyped rst-order logic. The logic is classical, but we will take care to intuitionistically whenever possible. There are a number of equivalent choices for the of ZFC. Our axioms are

: ∀x∀y(∀z(z ∈ x ⇔ z ∈ y) ⇒ x = y) • Set existence: ∃x true (This could be derived from the of innity, but we add it explicitly here to reduce dependence on innity.),

• Unordered pairing: ∀x∀y∃a(∀z(z = x ∨ z = y) ⇒ z ∈ a) • : ∀X∃a∀z(∃x(x ∈ X ∧ z ∈ X) ⇒ z ∈ a) • : ∀x∃a∀z((∀t(t ∈ z ⇒ t ∈ x)) ⇒ z ∈ a) • Specication: ∀X∃a(∀z((z ∈ X ∧ϕ(z)) ⇔ z ∈ a)) for a unary predicate ϕ (possibly containing free variables)

• Replacement: ∀a(∀x(x ∈ a) ⇒ ∃!y(ϕ x y)) ⇒ ∃b(∀y(∃x(x ∈ a ∧ ϕ(x, y)) ⇔ y ∈ b)) for a binary predicate ϕ (possibly containing free variables) where ∃! abbreviates the easily denably quantier of unique existence

• Regularity: ∀x(∃t(t ∈ x)) ⇒ (∃y(y ∈ x ∧ ¬(∃z(z ∈ x ∧ z ∈ y)))). • Choice and innity, which we omit here. It is important to note that there are no rst-order terms except for the variables. Specic sets (i.e., rst-order constant symbols) and operations on sets (i.e., rst-order function symbols) are introduced only as derived notions: A new symbol may be intro- duced to abbreviate a uniquely determined set. For example, the ∅ abbreviates the unique set x satisfying ∀y.¬y ∈ x. Adding such abbreviations is conservative over rst-order logic but cannot be formalized within the language of rst-order.

4.2 Untyped Set Theory

Our Twelf formalization of ZFC uses three main signatures: ZFC_FOL encodes rst- order logic, ZFC encodes the rst-order theory of ZFC, and nally Operations intro- duces the basic operations and their properties, most notably products and functions. The actual encodings (Iancu & Rabe 2010) comprise several hundred lines of Twelf dec- larations and are factored into a number of smaller signatures to enhance maintainability

7 and reuse. Therefore, our presentation here is only a summary. Moreover, to enhance readability, we will use more characters in identiers here than in the actual encodings.

First-Order Logic ZFC_FOL is an extension of the signature FOL given in Ex. 1. Besides the usual components of FOL encodings in LF (see e.g., (Harper et al. 1993)), we use two special features. Firstly, we add the (denite) operator δ : {F : set → prop} ded ∃! ([x] F x) → set, which encodes the mathematical practice of giving a name to a uniquely determined object. Here ∃! is the quantier of unique existence which is easily denable. Thus δ takes a formula F (x) with a free variable x and a proof of ∃!x.F (x) and returns a new set. The LF type system guarantees that δ can only be applied after showing unique existence. δ is axiomatized using the axiom scheme axδ : ded F (δ F P ); from this we can derive irrelevance, i.e., δ F P returns the same object no matter which proof P is used. Secondly, we add sequential connectives for conjunction and implication. In a sequen- tial implication F ⇒0 G, G is only considered if F is true, and similarly for conjunction. This is very natural in mathematical practice  for example, mathematicians do not hes- itate to write x 6= 0 ⇒0 x/x = 1 when / is only dened for non-zero dividers. All other connectives remain as usual. Sequential implication and conjunction are formalized in LF as follows: ∧0 : {F : prop} (ded F → prop) → prop ⇒0 : {F : prop} (ded F → prop) → prop 0 0 ∧I : {p : ded F } ded G p → F ∧ [p] G p ∧0 : ded F ∧0 [p] G p → ded F El ∧0 : {q : ded F ∧0 [p] G p} ded G (∧0E q) Er l 0 0 ⇒I :({p : ded F } ded G p) → ded F ⇒ [p] G p 0 0 ⇒E : ded F ⇒ [p] G p → {p : ded F } ded G p ∧0 and ⇒0 are applied to two arguments, rst a formula F , and then a formula G stated in a context in which F is true. This is written as, e.g., F ∧0 [p] G p where p is an assumed proof of F that may occur in G. We will use F ∧0 G and F ⇒0 G as abbreviations when p does not occur in G, which yield the non-sequential cases. The introduction and elimination rules are generalized accordingly. Note that these sequential connectives do not rely on classicality. In plain rst-order logic, such sequential connectives would be useless as a proof cannot occur in a formula. But in the presence of the description operator, the proofs frequently occur in terms and thus in .

Set Theory The elementhood predicate is encoded as ∈: set → set → prop together with a corresponding inx declaration. The formalization of the axioms is straightfor- ward, for example, the is encoded as:

8 . ax_exten : ded ∀ [x] ∀ [y](∀ [z] z ∈ x ⇔ z ∈ y) ⇒ x = y

It is now easy to establish the adequacy of our encoding in the following sense: Every well-formed closed LF-term s : set over ZFC encodes a unique set satisfying a certain predicate F . This is obvious because s must be of the form δ F P . The inverse does not hold as there are models of set theory with more sets than can be denoted by closed terms.

Basic Operations We can now derive the basic notions of set theory and their proper- ties: Using the description operator and the respective axioms, we can introduce dened Twelf symbols empty : set = ... uopair : set → set = ... bigunion : set → set = ... powerset : set → set = ... :(set → set) → set → set = ... filter : set → (set → prop) → set = ... S such that empty encodes ∅, uopair x y encodes {x, y}, bigunion X encodes X, powerset X encodes PX, image f A encodes {f(x): x ∈ A}, and filter A F encodes {x ∈ A | F (x)}. For example, to dene uopair we proceed as follows: is_uopair : set → set → set → prop . . = [x][y][a](∀ [z](z = x ∨ z = y) ⇔ z ∈ a) p_uopair : ded ∃! (is_uopair A B) = spec_unique (shrink (∀E (∀E ax_pairing A) B)) uopair : set → set → set = [x][y] δ (is_uopair x y) p_uopair . Here is_uopair x y a formalizes the dening property a = {x, y} of the new function symbol, and p_uopair shows unique existence. The above uses two lemmas shrink : ded (∃ [X] ∀ ([z](ϕ z) ⇒ z ∈ X)) → ded (∃ [x] ∀ ([z](ϕ z) ⇔ z ∈ x)) = ··· spec_unique : ded (∃ [x] ∀ ([z](ϕ z) ⇔ z ∈ x)) → ded ∃! [x] ∀ ([z](ϕ z) ⇔ z ∈ x) = ··· shrink expresses that if there is a set X that contains all the elements for which the predicate ϕ : set → prop holds, then the set described by ϕ exists. spec_unique expresses that if a predicate ϕ : set → prop describes a set then that set exists uniquely. They can be proved easily using extensionality and specication.

Advanced Operations Then we can dene the advanced operations on sets in the usual way. For example, the denition of binary union x ∪ y = S{x, y} can be directly formalized as

9 union : set → set → set = [x][y] bigunion (uopair x y) We omit the denitions of singleton sets, ordered pairs, cartesian products, relations, partial functions, and functions. Our denitions are standard except for the . We dene (x, y) = {{x}, {{y}, ∅}}, which is similar to Wiener's denition (Wiener 1967) and dierent from the more common (x, y) = {{x, y}, {x}} due to Kuratowski. Our denition is a bit simpler to work with than Kuratowski pairs because it avoids the special case (x, x) = {{x}}. pair : set → set → set = [a][b] uopair (singleton a)(uopair (singleton b) empty) The dierence with Kuratowski pairs is not signicant as we immediately prove the characteristic properties of pairing and then never appeal to the denition anymore. . conv : ded pi1 (pair X Y ) = X = ··· pi1 . conv : ded pi2 (pair X Y ) = X = ··· pi2 . convpair : ded ispair X → ded pair (pi1 X)(pi2 X) = X = ··· The proofs are technical but straightforward. Finally, we can dene function construction X 3 x 7→ f(x) and application f(x) as λ : set → (set → set) → set = [a][f] image ([x] pair x (f x)) a @: set → set → set . = [f][a] bigunion (image pi2 (filter f ([x](pi1 x) = a))) where λ A f encodes {(x, f(x)) : x ∈ A}, and @ f x yields the b such that (a, b) ∈ f. Application is dened for all sets: for example, it returns ∅ if f is not dened for x. Like for pairs, we immediately prove the characteristic properties, which are known as βη-conversion and extensionality in computer science. We never use other properties than these later on: . conv : ded X ∈ A → ded @(λ A F ) X = FX = ··· apply . convlambda : ded F ∈ (⇒ AB) → ded λ A ([x]@ F x) = F = ··· func : ded F ∈ (⇒ AB) → ded G ∈ (⇒ AB) → ext . . ({a} ded a ∈ A → ded @F a = @G a) → ded F = G = ··· Again we omit the straightforward proofs.

4.3 Typed Set Theory Classes as Types A major drawback of formalizations of set theory is the complexity of reasoning about elementhood and set equality. It is well-known how to overcome these using typed , but in mathematical accounts of set theory, types are not primitive but derived notions. We proceed accordingly: The central idea is to use the predicate subtype Elem A = {x : set | ded x ∈ A inhabited } to represent the set A. In fact, we can use the same approach to recover classes as a derived notion: F = {x : set | ded F x inhabited } for any unary predicate F : set → prop.

10 However, LF does not support predicate subtypes (for the good reason that it would make the typing relation undecidable). Therefore, we think of elements x of the class {x | F (x)} as pairs (x, P ) where P : ded F x is a proof that x is indeed in that class. We encode this in LF as follows: Class :(set → prop) → type celem : {a : set} ded F a → Class F cwhich : Class F → set cwhy : {a : Class F } ded (F (cwhich a))

Elem : set → type = [a] Class [x] x ∈ a elem : {a : set} ded a ∈ A → elem A = [a][p] celem a p which : elem A → set = [a] cwhich a why : {a : elem A} ded (which a) ∈ A = [a] cwhy a Class F encodes {x|F (x)}, celem x P produces an of a class, and cwhich x and cwhy x return the set and its proof. The remaining declarations specialize these notions to the classes {x|x ∈ A}. . To axiomatize these, we use the additional axiom eqwhich : ded cwhich (elem X P ) = X as well as the following axiom for proof irrelevance . proofirrel : {f : ded G → Class A} ded cwhich (f P ) = cwhich (f Q) which formalizes that two sets are equal if they only dier in a proof.

Typed Operations Using the types Elem A, we can now lift all the basic untyped operations introduced above to the typed level. In particular, we dene typed quantiers . ∗ ∀∗, ∃∗, typed equality = , typed function spaces ⇒∗, and booleans bool as follows. Firstly, we dene typed quantiers such as ∀∗ :(elem A → prop) → prop. In higher- order logic (Church 1940), such typed quantication can be dened easily using abstrac- tion over the booleans. This is not possible in ZFC because the type prop is not a set itself, i.e., we have prop : type and not prop : set. If we committed to , we could use the set bool : set from below. A natural solution is relativization as in ∀∗ F := ∀[x] x ∈ A ⇒0 F x for F : elem A → prop. However, an attempt to dene typed quantication like this meets a subtle di- culty: In ∀∗ F , F only needs to be dened for elements of A whereas in ∀[x] x ∈ A ⇒0 F x, F must be dened for all sets even though F x is intended to be ignored if x 6∈ A. There- fore, we use sequential connectives: ∀∗ :(Elem A → prop) → prop = [F ] ∀[x] x ∈ A ⇒0 [p](F (elem x p)) ∃∗ :(Elem A → prop) → prop = [F ] ∃[x] x ∈ A ∧0 [p](F (elem x p)) It is easy to derive the expected introduction and elimination rules for ∀∗ and ∃∗. Secondly, typed equality is easy to dene: . ∗ . = : Elem A → Elem A → prop = [a][b](which a) = (which b) . . ∗ It is easy to see that all rules for = can be lifted to = .

11 Then, thirdly, we can dene function types in the expected way: ⇒∗ : set → set → set = [x][y] Elem (x ⇒ y) λ∗ :(Elem A → Elem B) → Elem (A ⇒∗ B) = ... @∗ : Elem (A ⇒∗ B) → Elem A → Elem B = [F ][x] elem (@ (which F )(which x)) (funcE (why F )(why x)) . ∗ beta : ded (@∗ (λ∗ [x] F x) A) = FA = ... . ∗ eta : ded (λ∗ [x] (@∗ F x)) = F = ... We omit the quite involved denitions and only mention that the typed quantiers and thus the sequential connectives are needed in the denitions. Finally, we introduce the set {∅, {∅}} of booleans and derive some important oper- ations for them. In particular, these are the constants 0 and 1, a supremum operation on families of booleans, a variant of if-then-else where the then-branch (else-branch) may depend on the (falsity) of the condition, and a reection function mapping propositions to booleans. bool : set = uopair empty (singleton empty) 0 : Elem bool = ... 1 : Elem bool = ... sup :(Elem A → Elem bool) → Elem bool ifte : {F : prop} (ded F → Elem A) → (ded ¬ F → Elem A) → Elem A = ... reflect : Elem prop → Elem bool

The denition of the supremum operation is only possible after proving that {∅, {∅}} = P{∅}, which requires the use of excluded middle. (In fact, it is equivalent to it.) Simi- larly, reflect and ifte can only be dened in the presence of excluded middle. All other denitions in our formalization of ZFC are also valid intuitionistically.

4.4 Related Work Several formalizations of set theory have been proposed and developed quite far. Most notable are the encodings of Tarski-Grothendieck set theory in Mizar (Trybulec & Blair 1985, Trybulec 1989) and of ZF in Isabelle (Paulson 1994, Paulson & Coen 1993). The most striking dierence with our formalization is that these employ sophisticated machine support with structured proof languages. Since there is no comparable machine support for Twelf, our encoding uses hand-written proof terms. We chose LF because it permits a more elegant formalization: We use only ∈ as a primitive symbol and use a description operator to introduce names for derived concepts. This deviates from standard accounts of formalized mathematics and is in contrast to Mizar where primitive function symbols are used for singleton, unordered pair, and union, and to Isabelle/ZF where primitive function symbols are used for empty set, power set, union, innite set, and replacement. But it corresponds more closely to mathematical practice, where the implicit use of a description operator is prevalent.

12 Our encoding depends crucially on dependent types. Description operators are also used in typed formalizations of mathematics such as HOL (Church 1940). They dier from ours by not taking a proof of unique existence as an argument. Consequently, they must assume the non-emptiness of all types and a global choice function. Other language features only possible in a dependently-typed framework are sequential connectives and our ifte construct. Connectives similar to our sequential ones are also used in PVS (Owre, Rushby & Shankar 1992) and in (de Nivelle 2010), albeit without proof terms occurring explicitly in formulas. Moreover, using dependent types, we can recover typed reasoning as a derived notion. Here, our approach is similar to the one in Scunak (Brown 2006), and in fact our formal- ization of classes and typed reasoning is inspired by the one used in Scunak. Scunak uses a variant of dependent type theory specically developed for this purpose: The symbols set, prop, and Class, and the axioms eqwhich and proofirrel are primitives of the type theory. This renders the formalization much simpler at the price of using a less elegant framework. A compromise between our encoding and Scunak's would be an extension of the LF framework. For example, the dependent sum type Σx:set(ded F x) could be used instead of our Class F . Moreover, in (Lovas & Pfenning 2009) a variant of proof irrelevance is introduced for LF that might make our encoding more elegant.

5 Isabelle and Higher-Order Logic 5.1 Preliminaries Isabelle Isabelle is a logical framework and generic LCF-style interactive theorem prover based on polymorphic higher-order logic (Paulson 1994). We will only consider the core language of Isabelle here  the P ure logic and basic declarations  and omit the module system and the structured proof language. We gave a comprehensive formaliza- tion of P ure and the Isabelle module system in (Rabe 2010). The grammar for Isabelle is given in Fig. 1, which is a simplied variant of the one given in (Wenzel 2009).

con ::= c :: τ ax ::= a : ϕ lem ::= l : ϕ proof typedecl ::= (α1, . . . , αn)t types ::= (α1, . . . , αn)t = τ τ ::= α | (τ, . . . , τ) t | τ ⇒ τ | prop term ::= x | c | term term | λx :: τ.term ϕ ::= ϕ =⇒ ϕ | V x :: τ.ϕ | term ≡ term proof ::= ... Figure 1: Isabelle Grammar

13 An Isabelle theory is a list of declarations of typed constants c :: τ, axioms a : ϕ, lemmas a : ϕ P where P proves ϕ, and n-ary type operators (α1, . . . , αn)t which may carry a denition in terms of the αi. Denitions for constants can be introduced as special cases of axioms, and we consider base types as nullary type operators.

Types τ are formed from type variables α, type operator applications (τ1, . . . , τn)t, function types, and the base type prop of propositions. Terms are formed from variables, constants, application, and lambda abstraction. Propositions are formed from implica- tion =⇒, universal quantication V at any type, and equality on any type. (Wenzel 2009) does not give a grammar for proofs but lists the inference rules; they are V introduc- tion and elimination, =⇒ introduction and elimination, reexivity and for equality, β and η conversion, and functional extensionality. Constants may be polymorphic in the sense that their types may contain free type variables. When a polymorphic constant is used, Isabelle automatically infers the type arguments.

HOL The most advanced logic formalized in Isabelle is HOL (Nipkow et al. 2002). Isabelle/HOL is a classical higher-order logic with shallow polymorphism, non-empty types, and choice operator (Church 1940, Gordon & Pitts 1993). Isabelle/HOL uses the same types and function space as Isabelle. But it introduces a type bool for HOL-propositions (i.e., booleans since HOL is classical) that is dierent from the type prop of Isabelle-propositions. The coercion T rueprop : bool ⇒ prop is used as the Isabelle truth judgment on HOL propositions. HOL declares primitive constants for implication, equality on all types, denite and indenite description operator some x : τ.P and the x : τ.P for a predicate P : τ ⇒ bool. Furthermore, HOL declares a polymorphic constant undefined of any type and an innite base type ind, which we omit in the following. Based on these primitives and their axioms, simply-typed set theory is developed by purely denitional means. Going beyond the Isabelle framework, Isabelle/HOL also supports Gordon/HOL-style type denitions using representing sets. A set A on the type τ is given by its charac- teristic function, i.e., A : τ ⇒ bool. An Isabelle/HOL type denition is of the form (α1, . . . , αn) t = AP where P and A contain the variables α1, . . . , αn and P proves that A is non-empty. If such a denition is in eect, t is an additional type that is axiomatized to be isomorphic to the set A.

5.2 Formalizing Isabelle/HOL Isabelle Our formalization of Isabelle follows the one we gave in (Rabe 2010). We declare an LF signature P ure for the inner syntax of Isabelle, which declares symbols for all primitives that can occur in expressions. P ure is given in Fig. 2. This yields a straightforward structural encoding function p−q sig that acts as described in Fig. 3. Similar encodings are well-known S = { include Pure for LF, see e.g., (Harper et al. 1993). The only subtlety is the case Σ of polymorphic constant applications c t . . . t where the type p q 1 n } of c contains type variables α1, . . . , αm. Here we need to infer the

14 sig P ure = { tp : type ⇒ : tp → tp → tp infix 0 tm : tp → type prefix 0 λ :(tm A → tm B) → tm (A ⇒ B) @: tm (A ⇒ B) → tm A → tm B infix 1000

prop : tp V :(tm A → tm prop) → tm prop =⇒ : tm prop → tm prop → tm prop infix 1 ≡ : tm A → tm A → tm prop infix 2

` : tm prop → type prefix 0 V I :(x : tm A ` (B x)) → ` V([x] B x) V E : ` V([x] B x) → {x : tm A} ` (B x) =⇒ I :(` A → ` B) → ` A =⇒ B =⇒ E : ` A =⇒ B → ` A → ` B refl : ` X ≡ X subs : {F : tm A → tm B} ` X ≡ Y → ` FX ≡ FY exten :({x : tm A} ` (F x) ≡ (G x)) → ` λF ≡ λG beta : ` (λ[x : tm A] F x)@ X ≡ FX eta : ` λ ([x : tm A] F @ x) ≡ F } Figure 2: LF Signature for Isabelle types at which is applied, and put τ1, . . . , τm c pc t1 . . . tnq = . Polymorphic axioms and lemmas occurring in (c pτ1q ... pτmq)@ pt1q ... @ ptnq proofs are treated accordingly. Finally, an Isabelle theory S = Σ is represented as shown on the right where pΣq is dened declaration-wise according to Fig. 4. sig HOL = { Adequacy It is easy to show include P ure the adequacy of this encod- bool : tp ing: For an Isabelle theory Σ, trueprop : tm bool ⇒ prop Isabelle types τ over Σ with eps : tm (A ⇒ bool) ⇒ A type variables from . α1, . . . , αm = : tm A ⇒ A ⇒ bool are in bijection with LF-terms set : tp → tp = [a] a ⇒ bool in context pτq : tp α1 : nonempty :(tm set A) → type = ... , and accord- tp, . . . , αm : tp typedef : {s : tm set A} nonempty s → tp ingly Isabelle terms t :: τ with Rep : tm (typedef S P ) ⇒ A LF-terms ptq : tm pτq, and Is- Abs : tm A ⇒ (typedef (S : tm set A) P ) abelle proofs P of ϕ with LF- } Figure 5: LF Signature for HOL terms pP q : ` pϕq.

15 Expression Isabelle LF type τ ptq : tp term t :: τ ptq : tm pτq proof P proving ϕ pP q : ` pϕq containing type variables in context

α1, . . . , αm α1 : tp, . . . , αm : tp Figure 3: Encoding of Expressions

Declaration Isabelle LF

type operator (α1, . . . , αn) t t : tp → ... → tp → tp type denition (α1, . . . , αn) t = τ t : tp → ... → tp → tp = [α1] ... [αn] τ constant , in c :: τ α1, . . . , αm τ c : tp → ... → tp → tm pτq axiom , in a : ϕ α1, . . . , αm τ a : tp → ... → tp → ` pϕq lemma , in l : ϕ P α1, . . . , αm ϕ, P l : tp → ... → tp → ` pϕq = [α1] ... [αm] pP q Figure 4: Encoding of Declarations

HOL Since HOL is an Isabelle theory, its LF-encoding follows immediately from the denition above. The fragment arising from translating some of the primitive declarations of HOL is given in the upper part of the signature HOL in Fig. 5. For example, eps is the choice operator. The lower part gives some of the additional declarations needed to encode HOL-style type denitions. The central declaration is typedef, which takes a set S on the type A and a proof that S is nonempty and returns a new type, say T . Rep and Abs translate between A and T , we refer to (Wenzel 2009) for details.

5.3 Interpreting Isabelle/HOL in ZFC We formalize the relation between Isabelle/HOL Pure and ZFC by giving two views P ureZF C and HOLZF C from P ure and HOL, respectively, PureZFC to ZFC+ as shown on the right. These for- malize the standard set-theoretical HOLZFC of higher-order logic. HOL ZFC+ ZFC+ arises from ZFC by adding a global choice function choice : {A : Class nonempty} (Elem (chwich A)) that produces an element of a non-empty set A. This is stronger than the (which merely states the existence of such an element) but needed to interpret the choice operators of HOL.

16 Isabelle The general structure of the translation is given in Fig. 6 and the view in Fig. 7. Types are mapped to non-empty sets, terms to elements, in particular propositions to . ∗ booleans, and proofs of ϕ to proofs of P ureZF C(ϕ) = 1. These invariants are encoded (and guaranteed) by the assignments to tp, tm, prop, and ` in P ureZF C. It is tempting to Isabelle propositions to ZFC propositions rather than to booleans. However, in Isabelle, prop is a normal type and thus must be interpreted as a set. An alternative would be to map prop to a set representing intuitionistic truth values rather than classical ones, but we omit that for simplicity. (Due to our use of a standard model, we cannot expect anyway.)

Isabelle/HOL ZFC τ : tp P ureZF C(τ): Class nonempty. t : tm τ P ureZF C(t): Elem (cwhich P ureZF C(τ)). ϕ : tm prop P ureZF C(ϕ): Elem (cwhich boolne). . ∗ P : ` ϕ P ureZF C(P ): ded P ureZF C(ϕ) = (bbne 1).

Figure 6: Isabelle/HOL Declarations in ZFC

The case for terms t is a bit tricky: Since τ is view P ureZF C : P ure → ZFC = { interpreted as an element of Class nonempty, tp := Class nonempty we rst have to apply cwhich to obtain a set. tm := elem Then we apply Elem to this set to obtain the prop := boolne . ∗ type of its elements. Similarly, prop cannot be ` := [x] ded x = 1 mapped directly to Elem bool. Instead, we ⇒ := ⇒∗ have to introduce boolne : Class nonempty λ := λ∗ which couples bool with the proof that it @ := @∗ is a non-empty set. Therefore, we also V := ∀∗ have to dene the auxiliary functions bbne : =⇒ := ⇒ . ∗ Elem bool → Elem (cwhich boolne) and bneb : ≡ := = to convert . Elem (cwhich boolne) → Elem bool . back and forth. These technicalities indicate a drawback of our  otherwise perfectly natural }  representation of classes. Dierent represen- Figure 7: Interpreting Pure in ZFC tations that separate the mapping of types to sets from the proofs of non-emptiness may prove more scalable, but would require a more sophisticated framework. The remaining cases are straightforward. For example, ⇒ must be mapped to a ZFC expression that takes two arguments of type Class nonempty and returns another, i.e., must respect the invariants above.

HOL Similarly, we obtain a view from HOL to ZFC, a fragment of which is shown in Fig. 8. HOL booleans are mapped to ZFC booleans so that trueprop is mapped to the identity. The choice operator eps is interpreted using ifte and choice. Note that

17 in the given Twelf terms we elide some bookkeeping proof steps. The then-branch uses elem (filter f) P to construct an element of Class nonempty, to which then choice is applied. In both cases, P must use the assumption p that the condition of the ifte-split is true. typedef s p is interpreted using filter according to s. Thus, type denitions using sets on A are interpreted as of A in the expected way. The proof p is used to obtain an element of Class nonempty.

view HOLZF C : HOL → ZFC = { include P ureZF C bool := bool trueprop := [x] x . . ∗ = := λ∗([x](λ∗([y]bbne(reflect(x = y))))) eps := [f : Elem (A ⇒∗ bool)] ifte (nonempty (filter f)) ([p](choice (elem (filter f) p))) ([p] choice A) . . . ∗ typedef := [s : Elem (A ⇒ bool)] [p] celem (filter ([x] s @ x = 1)) p . . } Figure 8: Interpreting HOL in ZFC

5.4 Related Work Our formalization of Isabelle is a special case of the one we gave in (Rabe 2010). There we also cover the Isabelle module system. Together with the formalization of HOL given here, we now cover interpretations of Isabelle locales in terms of Isabelle/HOL. This is interesting because if Isabelle locales are seen as logical theories and HOL as a foundation of mathematics, then interpretations can be seen as models. Formalizations of HOL in logical frameworks have been given in (Pfenning et al. 2003) using LF and of course in Isabelle itself (Nipkow et al. 2002). Ours appears to be the rst formalization of Isabelle and HOL and the meta-relation between them. Moreover, we do not know any other formalizations of HOL-style type denitions in a formal framework  even in the Isabelle/HOL formalization, the type denitions are not expressed exclusively in terms of the Pure meta-language. Our semantics of Isabelle/HOL does not quite follow the one given in (Gordon & Pitts 1993). There, individual models provide a set U of sets, and every type is interpreted as an element of U. Models must provide further structure to interpret HOL type constructors, in particular a choice function on U. Our semantics can be seen as a single model where the set theoretical universe is used instead of U. Consequently, our model is not a set itself and thus not a model in the sense of (Gordon & Pitts 1993), but every individual model in that sense is subsumed by ours.

18 Independent of our work, a similar semantics of Isabelle/HOL is given in (Krauss & Schropp 2010). They translate Isabelle/HOL to Isabelle/ZF where the of Pure is simply the identity. Their semantics is given as a target-trusting implementation rather than formalized in a framework. They also use the full set-theoretical universe and a global choice function. An important dierence is the treatment of non-emptiness: They assume that interpretations for all type constructors are given that respect non- emptiness; then they can interpret all types as sets (which will always be non-empty) and only have to relativize universal quantiers over types to quantiers over non-empty sets. Our translation is more complicated in that respect because it uses Class nonempty to guarantee the non-emptiness.

6 Mizar and Tarski-Grothendieck Set Theory 6.1 Preliminaries Mizar At its core, Mizar (Trybulec & Blair 1985) is an implementation of classical rst-order logic. However, it is designed to be used with a single theory: set theory following the Tarski-Grothendieck axiomatization (Tarski 1938, Bourbaki 1964) (TG). Consequently, Mizar is strongly inuenced by its representation of TG. Like Isabelle, it includes a semi-automated theorem prover and a structured proof language. Mizar/TG is notable for being the only major system for the formalization of math- ematics that is based on set theory. Types are only introduced as a means of eciency and clarity but not as a foundational commitment. Moreover, the Mizar Mathematical Library is one of the largest libraries of formalized mathematics containing over 50000 theorems and 9500 denitions. Mizar's logic is an extension of classical rst-order logic with second-order axiom schemes. The proof system is Jaskowski-style natural deduction (Ja±kowski 1934). Con- trary to the LCF style implementations of HOL and to our ZFC, which try to use a small set of primitives, Mizar features a rich ontology of primitive mathematical objects, types, and proof principles. In particular, the type set of terms (i.e., sets in Mizar/TG) can be rened using a complex type system, see, e.g., (Wiedijk 2007). The basic types are called modes, and while they are semantically predicate subsorts (i.e., classes in Mizar/TG), they are technically primitive in Mizar. Modes can be further rened by attributes, which are predicates on a type. These two renement relations generate a subtype relation between type expressions, called type expansion. Both modes and attributes may take arguments, which makes Mizar dependently-typed. Mizar enforces the non-emptiness of all types, and all mode denitions and attribute applications induce the respective proof obligations. The notion of typed rst-order functions between types, called functors, is primitive. Function denitions may be implicit, in which case they induce proof obligations for well-denedness. This expressivity makes theorems and proofs written in Mizar relatively easy to read

19 but makes it hard to represent Mizar itself in a logical framework. We will use the grammar given in Fig. 9, which is a substantially simplied variant of the one given in (Mizar 2009). Here ... and ∗ denote possibly empty repetition.

Article ::= Article-Name* Text-Proper Text-Proper ::= (Block | Theorem)∗ Block ::= let (x be ϑ)∗ Denition end Denition ::= Mode | Functor | Attribute

Mode ::= mode M of x1, . . . , xn is ϑ | mode M of x1, . . . , xn → ϑ means α existence proof

Attribute ::= attr x is (x1, . . . , xn)V means α Functor ::= func f(x1, . . . , xn) equals t | func f(x1, . . . , xn) → ϑ means α existence proof uniqueness proof Theorem ::= theorem T : α proof t ::= x | f(t1, . . . , tn) α ::= t is ϑ | t in t | α&α | not α | t = t | for x be ϑ holds α ϑ ::= Adjective∗ Radix Adjective ::= (t1, . . . , tn)V | non (t1, . . . , tn)V Radix ::= M of t1, . . . , tn proof ::= ... Figure 9: Mizar Grammar

A Mizar article starts with one import clause for every kind of declaration to import from other articles. Instead, we only permit cumulative imports of whole articles. This is followed by a list of denitions and theorems. We only permit mode, functor, and attribute denitions. Predicate denitions and schemes could be added easily. All three kinds of denitions introduce a new symbol, which takes a list of typed term arguments xi. The type ϑi of xi must be given by a let declaration or defaults to the type set. Mode denitions dene M of x1, . . . , xn either explicitly as the type ϑ(x1, . . . , xn) or implicitly as the type of sets it of type ϑ satisfying α(it, x1, . . . , xn). In the latter case, non-emptiness must be proved. Similarly, functor denitions dene f(x1, . . . , xn) either explicitly as t(x1, . . . , xn) or implicitly as the object it of type ϑ satis- fying α(it, x1, . . . , xn). In the latter case, well-denedness, i.e., existence and uniqueness, must be proved. Finally, attribute denitions dene (x1, . . . , xn)V as the unary predicate on the type of x given by α(x, x1, . . . , xn). Terms t and formulas α are formed in the usual way, and we omit the productions for proof terms. in and is are used for elementhood in a set or a type, respectively. Finally, types are formed by providing a list of possibly negated adjectives on a mode. In Mizar, these types must be proved to be non-empty before they can be used, which we will omit here.

20 Tarski-Grothendieck Set Theory TG is similar to ZFC but uses Tarski's axiom asserting that for every set there is a universe containing it. It implies the axioms of innity, choice, power set, and large cardinals. Mizar/TG is dened in the Mizar article Tarski (Trybulec 1989), which contains primitives for elementhood, singleton, unordered pair, union, the Fraenkel scheme, the Tarski axiom, as well as a denition of ordered pairs following Kuratowski.

6.2 Formalizing Mizar and TG Set Theory sig Mizar = { tp : type prop : type proof : prop → type prefix 0 be : tp → type set : tp is : be T → tp → prop infix 30 in : be T → be T 0 → prop infix 30 not : prop → prop prefix 20 and : prop → prop → prop infix 10 eq : be T → be T 0 → prop infix 10 . . for :(be T → prop) → prop . . func : {f : be T → prop}(proof ex [x] f x) → proof for [x] for [y](f x and f y) implies x eq y → be T funcprop : {F }{Ex}{Unq} proof F (func F Ex Unq) attr : tp → type = [t](be t → prop) adjective : {t : tp} attr t → tp adjI : {x : be X} (proof A x) → be (adjective X A) adjE : {x : be (adjective X A)} be X adjE0 : {x : be (adjective X A)} proof A (adjE x) . . } Figure 10: LF Signature for Mizar

Mizar The LF signature that encodes Mizar's logic is given in Fig. 10, where we omit the declarations of denable constants, such as equivalence iff and existential quantier ex. The general form of the encoding of Mizar expressions in LF is given in Fig. 11. Mizar types, formulas, and proofs of F are represented as LF terms of the types tp, prop, and proof pF q, respectively. The judgment expand encodes Mizar's type expansion relation. Mizar's use of a type system within an untyped foundation is hard to represent in a logical framework. We mimic it by using an auxiliary type constructor be with the intended meaning that t : be T encodes a Mizar term t of type T . Consequently, if T

21 expands to T 0, terms of type T must be explicitly cast to obtain terms of type T 0 by applying cast. Attributes on a type ϑ are represented as LF terms of type attr ϑ. In eect, they are represented as LF functions be ϑ → prop. A type ϑ = A1 ...Am R is encoded as adjective (... (adjective R Am) ... ) A1. Attributes A = (t1, . . . , tn)V are encoded as . Finally types of (radix types in Mizar) are encoded as V pt1q ... ptnq M t1, . . . , tn . M pt1q ... ptnq To that, we add LF constant declarations that represent the primitive formula and proof constructors of Mizar's rst-order logic. For formulas and proofs, this is straight- forward, and the only subtlety is to identify exactly which constructors are primitive. For example, or and imp are dened using and and not. We omit the constructors for type expansion. This induces an encoding of Mizar terms, types, formulas, and proofs as LF terms. The only remaining subtlety is that applications of cast must be inserted whenever the well-formedness of a type depends on the type expansion relations.

Expression Mizar LF type ϑ pϑq : tp formula α pαq : prop proof P proving α pP q : proof pαq typed term t be ϑ ptq : be pϑq 0 0 type expansion ϑ expands to ϑ expand pϑq pϑ q inhabited Figure 11: Encoding of Expressions

Then we can represent Mizar declarations according to Fig. 12. Explicit functor and mode denitions are represented easily as dened LF constants. Implicit denitions are represented using special constants func and mode. func ([x : be ϑ] α x) ex unq encodes the uniquely existing object of type ϑ that satises α. Similar to δ in our ZFC encodings, it takes proofs of existence and uniqueness as arguments. mode ([x : be ϑ] α x) P encodes the necessarily non-empty subtype of ϑ containing the objects satisfying α. Attribute denitions are encoded easily. In all three cases, the arguments x1, . . . , xn of Mizar functors/modes/attributes are represented directly as LF arguments. Finally, theorems are encoded in the same way as for Isabelle.

Finally, we can encode a Mizar article Art1, . . . , Artn TP in le A as the following LF signature where the Text-Proper part TP is encoded declaration-wise.

sig A = { Adequacy Intuitively, our Mizar encoding should be adequate include Mizar in the sense that Mizar articles that stay within our simplied include Art1 . grammar are well-formed in Mizar i their encoding is well-formed . in LF. include Art We cannot or even prove the adequacy because there is no n pTPq semantics of Mizar that would be rigorous and complete }

22 Mizar LF

let xi be ϑi mode of is M x1, . . . , xn ϑ M : {x1 : be pϑ1q} ... {xn : be pϑnq} tp = [x1] ... [xn] pϑq mode of means M x1, . . . , xn → ϑ α M : {x1 : be pϑ1q} ... {xn : be pϑnq} tp existence P = [x1] ... [xn] mode ([it] pαq) pP q func equals f(x1, . . . , xn) t f : {x1 : be pϑ1q} ... {xn : be pϑnq} be pϑq expands to t ϑ = [x1] ... [xn] ptq func means f(x1, . . . , xn) → ϑ α f : {x1 : be pϑ1q} ... {xn : be pϑnq} be pϑq existence uniqueness P Q = [x1] ... [xn] func ([it] pαq) pP q pQq let be x ϑ V : {x1 : be pϑ1q} ... {xn : be pϑnq} attr pϑq attr is means x (x1, . . . , xn)V α = [x1] ... [xn] ([x] pαq) theorem T : α P T : proof pαq = pP q Figure 12: Encoding of Declarations enough for that. This is partially due to the fact that Mizar is justied more through mathematical intuition than through a formal semantics.

TG Set Theory The encoding of TG set theory given in Fig. 13 is rather straightfor- ward. The use of LF's {} binder for Mizar's axiom schemes is the only subtlety. The denitions for singleton, uopair, and union are given using func, and their existence and uniqueness conditions are stated as axioms. We only give the case for singleton. The Tarski axiom is easy to encode but requires some auxiliary denitions.

6.3 Interpreting Mizar/Tarski in ZFC Similar to the interpretation of Isabelle/HOL in ZFC, we give corresponding views for Mizar. Here the view from T arski to ZFC is dashed because it is partial: It omits the Tarski axiom, which goes beyond ZFC. Mizar Mizar The general idea of the interpretation of Mizar in ZFC is given in Fig. 14. In particular, a type ϑ is inter- MizarZF C preted as a unary predicate (the intensional description of ϑ), and the auxiliary type be ϑ as the class of sets in ϑ T arskiSem T arski ZFC (the extensional description of ϑ). Technically, we should interpret types as non-empty predicates, i.e., as pairs of a predicate and an existence proof. We avoid that because it would complicate the encoding even more than in the case of Isabelle/HOL. This is possible because no part of our restricted Mizar language relies on the non-emptiness of type. Type expansion is interpreted as a subclass relationship, and the interpretation of cast maps a set to itself but treated as an element of a dierent class. This is formalized by the rst declarations in the view MizarZF C in Fig. 15.

23 sig T arski = { include Mizar . .

singletonex : {y : be set} proof ex [it : be set](for [x : be set] (x in it) iff (x eq y)) 0 singletonunq : {y} proof for [it] for [it ](for [x](x in it iff x eq y) and for [x](x in it0 iff x eq y)) implies it eq it0 singleton : be set → be set = [y] func ([it] for [x] x in it iff x eq y) (singletonex y)(singletonunq y) . . fraenkel : {A : be set}{P : be set → be set → prop} proof (for [x : be set] for [y : be set] for [z : be set] ((P x y) and (P x z)) implies y eq z) → proof (ex [X] for [x] ((x in X) iff (ex [y] y in A and (P y x)))) . .

subsetclosed : {m} prop = [m] for [x](for [y] ((( x in m) and (y ⊆ x)) implies (y in m))) powersetclosed : {m} prop = [m] for [x](x in m implies (ex [z] z in m and (for [y] y ⊆ x implies y in z))) tarski_ax : proof for [n](ex [m]( n in m and subsetclosed m and powersetclosed m and for [x] (x ⊆ m implies ((isomorphic x m) or x in m)))) . . } Figure 13: Encoding TG Set Theory

func is interpreted using the description operator from ZFC, and the interpretation of mode is trivial. Finally, attributed modes adjective ϑ A are interpreted using the conjunction of the interpretations P : set → prop of ϑ and Q : Class P → prop of A. Note how sequential conjunction is needed to use the truth of P x in the second conjunct. This is necessary because in Mizar A only has to be dened for terms of type ϑ, which corresponds to Q only being applicable to sets satisfying P . We omit the straightforward but technical remaining cases for formula and proof con- structors.

TG The view T arskiZF C from T arski to ZFC is straightforward, and we omit the details. However, the view is only partial because it omits the Tarski axiom. Partial views in LF simply omit cases. Consequently, their homomorphic extensions are partial functions. For our view, that means that every denition or theorem that

24 Mizar ZFC ϑ : tp MizarZF C(ϑ): set → prop α : prop MizarZF C(α): prop P : proof α MizarZF C(P ): ded MizarZF C(P ) α : be ϑ MizarZF C(α): Class MizarZF C(ϑ)

Figure 14: Mizar/TG Declarations in ZFC

view MizarZF C : Mizar → ZFC = { tp := set → prop prop := prop proof := ded be := [f] Class f set := [x] > is := [a][F ] F (cwhich a) in := [a][b](cwhich a) ∈ (cwhich b) . . func := [F ][EX ][UNQ] δ F (andI EX UNQ) mode := [F : Class A → prop][EX ] ([x](A x) ∧0 [p] F (celem x p)) adjective := [P : set → prop][Q : Class T → prop] [x](P x) ∧0 [p](Q (celem x p)) . . } Figure 15: Interpreting Mizar in ZFC depends on the Tarski axiom cannot be translated to ZFC. This is more harmful than it sounds: Since the Tarski axiom is used in Mizar to prove the existence of power set, innity, and choice, almost all denitions depend on it. However, we have already designed an elegant extension of the notion of Twelf views that solves this problem in (Dumbrava & Rabe 2010). With this extension, it is possible to make T arskiZF C undened for the Tarski axiom, but map Mizar's theorems of power set, innity, and choice, which depend on the Tarski axiom, to their counterparts in ZFC. We say that power set, innity, and choice are recovered by the view. Then Mizar expressions that are stated in terms of the recovered constants can still be translated to ZFC, and the preservation of truth is still guaranteed. With this amendment, most theorems in the Mizar library can be translated. Only theorems that directly appeal to the Tarski axiom remain untranslatable, and that is intentional because they are likely to be unprovable over ZFC.

25 6.4 Related Work Mizar is infamous for being impenetrable as a logic, and previous work has focused on making the syntax and semantics of Mizar more accessible. The main source of complexity is the type system. (Wiedijk 2007) gives a comprehensive account on the syntax and semantics of the Mizar type system. It interprets types as predicates in the same way as we did here. A translation to rst-order logic is given that is similar in spirit to our translation to ZFC. An alternative approach using type algebras was given in (Bancerek 2003). In (Urban 2003), a translation of Mizar into TPTP-based rst-order logic is given. It also interprets types as predicates.

7 Conclusion and Future Work

We have represented three foundations of mathematics and two translations between them in a formal framework, namely Twelf. The most important feature is that the well- denedness and soundness of the translations are veried statically and automatically by the Twelf type checker. In particular, the LF type system guarantees that the translation functions preserve provability. Our work is the rst systematic case study of statically veried translations between foundations. Our foundations are ZFC, Mizar's Tarski-Grothendieck set theory (TG) and Isabelle's higher-order logic (HOL). We chose ZFC as the most widespread foundation of non- formalized mathematics, and our formalization stays notably close to textbook develop- ments of ZFC. (We have to add a global choice function though both for Isabelle/HOL and for Mizar/TG.) We chose Isabelle/HOL and Mizar because they are two of the most advanced foundations of formalized mathematics in terms of library size and (semi- )automated proof support. They are also foundationally very dierent  higher-order logic and untyped set theory, respectively  and represent the whole spectrum of founda- tions. Moreover, our formalizations make the foundational assumptions of these systems explicit and thus contribute to their documentations and systematic comparison. We have formalized translations from Isabelle/HOL and Mizar/TG into ZFC as indicated on the right. Mizar P ure These translations can be seen as giving two founda- tions used in formalized mathematics a semantics in terms of the foundation dominant in traditional math- ematics. Actually, the translation from Mizar/TG to T arski ZFC+ HOL ZFC+ is only partial because the former is stronger than the latter, but this is no serious concern as we discussed in Sect. 6.3. We did not give the inverse translation from ZFC to Mizar/TG, but that would be straightforward. However, a corresponding translation from ZFC to Isabelle/HOL remains a challenge. (Translations such as the one in (Aczel 1998) would not be inverse to ours.) Future work will focus on two research directions. Firstly, we will formalize more foundations and translations. This is an on-going eort

26 in the LATIN project (Kohlhase, Mossakowski & Rabe 2009), which will provide a large library of statically veried foundation translations and for which this work provides the theoretical bases and seed library. Examples of further systems are Coq (Coquand & Huet 1988) or PVS (Owre et al. 1992). Secondly, a major drawback of statically veried translations is that the extracted translation functions cannot be directly applied to the libraries of the foundations: First those libraries must be represented in the foundational framework. This is a conceptually trivial but practically long-term research eort that is still under way. Acknowledgements This work was supported by grant KO 2428/9-1 (LATIN) by the Deutsche Forschungsgemeinschaft.

References

Aczel, P. (1998), On Relating Type Theories and Set Theories, in T. Altenkirch, W. Naraschewski & B. Reus, eds, `Types for Proofs and Programs', pp. 118.

Bancerek, G. (2003), On the structure of Mizar types, Vol. 85 of Electronic Notes in Theoretical Computer Science, pp. 6985. Bourbaki, N. (1964), Univers, in `Séminaire de Géométrie Algébrique du Bois Marie - Théorie des topos et cohomologie étale des schémas', Springer, pp. 185217.

Brown, C. (2006), Combining Type Theory and Untyped Set Theory, in U. Furbach & N. Shankar, eds, `International Joint Conference on Automated Reasoning', Springer, pp. 205219. Church, A. (1940), `A Formulation of the Simple Theory of Types', Journal of Symbolic Logic 5(1), 5668.

Constable, R., Allen, S., Bromley, H., Cleaveland, W., Cremer, J., Harper, R., Howe, D., Knoblock, T., Mendler, N., Panangaden, P., Sasaki, J. & Smith, S. (1986), Implementing Mathematics with the Nuprl Development System, Prentice-Hall. Coquand, T. & Huet, G. (1988), `The Calculus of Constructions', Information and Com- putation 76(2/3), 95120. de Bruijn, N. (1970), The Mathematical Language AUTOMATH, in M. Laudet, ed., `Proceedings of the Symposium on Automated Demonstration', Vol. 25 of Lecture Notes in Mathematics, Springer, pp. 2961. de Nivelle, H. (2010), Classical Logic with Partial Functions, in J. Giesl & R. Hähnle, eds, `Automated Reasoning', Springer, pp. 203217. Dumbrava, S. & Rabe, F. (2010), `Structuring Theories with Partial Morphisms'. Work- shop on Abstract Development Techniques.

27 Farmer, W., Guttman, J. & Thayer, F. (1993), `IMPS: An Interactive System', Journal of Automated Reasoning 11(2), 213248. Fraenkel, A. (1922), `The notion of 'denite' and the independence of the axiom of choice'.

Gordon, M. (1988), HOL: A Proof Generating System for Higher-Order Logic, in G. Birtwistle & P. Subrahmanyam, eds, `VLSI Specication, Verication and Syn- thesis', Kluwer-Academic Publishers, pp. 73128. Gordon, M., Milner, R. & Wadsworth, C. (1979), Edinburgh LCF: A Mechanized Logic of Computation, number 78 in `LNCS', Springer Verlag.

Gordon, M. & Pitts, A. (1993), The HOL Logic, in M. Gordon & T. Melham, eds, `Introduction to HOL, Part III', Cambridge University Press, pp. 191232. Hales, T. (2003), `The yspeck project'. See http://code.google.com/p/flyspeck/. Harper, R., Honsell, F. & Plotkin, G. (1993), `A framework for dening ', Journal of the Association for Machinery 40(1), 143184. Harper, R., Sannella, D. & Tarlecki, A. (1994), `Structured presentations and logic rep- resentations', Annals of Pure and Applied Logic 67, 113160. Harrison, J. (1996), HOL Light: A Tutorial Introduction, in `Proceedings of the First International Conference on in Computer-Aided Design', Springer, pp. 265269. Hurd, J. (2009), OpenTheory: Package Management for Higher Order Logic Theories, in G. D. Reis & L. Théry, eds, `Programming Languages for Mechanized Mathematics Systems', ACM, pp. 3137.

Iancu, M. & Rabe, F. (2010), `Formalizing Foundations of Mathematics, LF Encodings'. see https://latin.omdoc.org/wiki/FormalizingFoundations. Ja±kowski, S. (1934), `On the rules of suppositions in formal logic', Studia Logica 1, 532. Keller, C. & Werner, B. (2010), Importing HOL Light into Coq, in M. Kaufmann & L. Paulson, eds, `Proceedings of the Interactive Theorem Proving conference'. to appear in LNCS. Klein, G., Nipkow, T. & (eds.), L. P. (2004), `Archive of Formal Proofs'. http://afp.sourceforge.net/. Kohlhase, M., Mossakowski, T. & Rabe, F. (2009), `The LATIN Project'. See https: //trac.omdoc.org/LATIN/. Krauss, A. & Schropp, A. (2010), A Mechanized Translation from Higher-Order Logic to Set Theory, in M. Kaufmann & L. Paulson, eds, `Proceedings of the Interactive Theorem Proving conference'. to appear in LNCS.

28 Landau, E. (1930), Grundlagen der Analysis, Akademische Verlagsgesellschaft. Lovas, W. & Pfenning, F. (2009), Renement Types as Proof Irrelevance, in P. Curien, ed., `Typed Lambda Calculi and Applications', Vol. 5608 of Lecture Notes in Com- puter Science, Springer, pp. 157171.

Matuszewski, R. (1990), `Formalized Mathematics'. http://mizar.uwb.edu.pl/fm/. McLaughlin, S. (2006), An Interpretation of Isabelle/HOL in HOL Light, in N. Shankar & U. Furbach, eds, `Proceedings of the 3rd International Joint Conference on Au- tomated Reasoning', Vol. 4130 of Lecture Notes in Computer Science, Springer.

Mizar (2009), `Grammar, version 7.11.02'. http://mizar.org/language/mizar- grammar.xml. Naumov, P., Stehr, M. & Meseguer, J. (2001), The HOL/NuPRL proof translator - a practical approach to formal interoperability, in `14th International Conference on Theorem Proving in Higher Order Logics', Springer.

Nipkow, T., Paulson, L. & Wenzel, M. (2002), Isabelle/HOL  A Proof Assistant for Higher-Order Logic, Springer. Norell, U. (2005), `The Agda WiKi'. http://wiki.portal.chalmers.se/agda. Obua, S. & Skalberg, S. (2006), Importing HOL into Isabelle/HOL, in N. Shankar & U. Furbach, eds, `Proceedings of the 3rd International Joint Conference on Auto- mated Reasoning', Vol. 4130 of Lecture Notes in Computer Science, Springer. Owre, S., Rushby, J. & Shankar, N. (1992), PVS: A Prototype Verication System, in D. Kapur, ed., `11th International Conference on Automated Deduction (CADE)', Springer, pp. 748752.

Paulson, L. (1994), Isabelle: A Generic Theorem Prover, Vol. 828 of Lecture Notes in Computer Science, Springer. Paulson, L. & Coen, M. (1993), `Zermelo-Fraenkel Set Theory'. Isabelle distribution, ZF/ZF.thy.

Pfenning, F. & Schürmann, C. (1999), `System description: Twelf - a meta-logical frame- work for deductive systems', Lecture Notes in Computer Science 1632, 202206. Pfenning, F., Schürmann, C., Kohlhase, M., Shankar, N. & Owre., S. (2003), `The Logo- sphere Project'. http://www.logosphere.org/. Poswolsky, A. & Schürmann, C. (2008), System Description: Delphin - A Functional Pro- gramming Language for Deductive Systems, in A. Abel & C. Urban, eds, `Interna- tional Workshop on Logical Frameworks and : Theory and Practice', ENTCS, pp. 135141.

29 Rabe, F. (2010), Representing Isabelle in LF, in K. Crary & M. Miculan, eds, `Logical Frameworks and Meta-languages: Theory and Practice', Vol. 34 of EPTCS. Rabe, F. & Schürmann, C. (2009), A Practical Module System for LF, in J. Cheney & A. Felty, eds, `Proceedings of the Workshop on Logical Frameworks: Meta-Theory and Practice (LFMTP)', ACM Press, pp. 4048. Schürmann, C. & Stehr, M. (2004), An Executable Formalization of the HOL/Nuprl Connection in the Metalogical Framework Twelf, in `11th International Conference on Logic for Programming Articial Intelligence and Reasoning'. T. Hales and G. Gonthier and J. Harrison and F. Wiedijk (2008), `A Special Issue on '. Notices of the AMS 55(11). Tarski, A. (1938), `Über Unerreichbare Kardinalzahlen', Fundamenta Mathematicae 30, 176183. Trybulec, A. (1989), `Tarski Grothendieck Set Theory', Journal of Formalized Mathe- matics Axiomatics. Trybulec, A. & Blair, H. (1985), Computer Assisted Reasoning with MIZAR, in A. Joshi, ed., `Proceedings of the 9th International Joint Conference on Articial Intelligence', pp. 2628. Urban, J. (2003), Translating Mizar for First Order Theorem Provers, in A. As- perti, B. Buchberger & J. Davenport, eds, `Mathematical Knowledge Management', Springer, pp. 203215. van Benthem Jutting, L. (1977), Checking Landau's Grundlagen in the AUTOMATH system, PhD thesis, Eindhoven University of Technology. Wenzel, M. (2009), `The Isabelle/Isar Reference Manual'. http://isabelle.in.tum. de/documentation.html, Dec 3, 2009. Whitehead, A. & Russell, B. (1913), , Cambridge University Press. Wiedijk, F. (2006), `Is ZF a hack?: Comparing the complexity of some (formalist in- terpretations of) foundational systems for mathematics', Journal of Applied Logic 4(4), 622645. Wiedijk, F. (2007), Mizar's Soft Type System, in K. Schneider & J. Brandt, eds, `The- orem Proving in Higher Order Logics', Vol. 4732 of Lecture Notes in Computer Science, Springer, pp. 383399. Wiener, N. (1967), A Simplication of the Logic of Relations, in J. van Heijenoort, ed., `From Frege to Gödel', Harvard Univ. Press, pp. 224227. Zermelo, E. (1908), `Untersuchungen über die Grundlagen der Mengenlehre I', Mathe- matische Annalen 65, 261281. English title: Investigations in the foundations of set theory I.

30