A Theory of Objects [Abadi & Cardelli ‘96]
Total Page:16
File Type:pdf, Size:1020Kb
Step-indexed Semantic Model of the Imperative Object Calculus Catalin Hritcu Jan Schwinghammer Gert Smolka Programming Systems Lab Saarland University, Saarbrücken February 2007 1 The Big Picture 2 Deduction systems Γ = a : α α β (Sem-SSub) | ⊆ Γ = a : β | α τ τ β (Sem-SRefl) α α (Sem-STrans) ⊆ ⊆ ⊆ α β Sets of ⊆ (Sem-STop) α (Sem-SBot) α rules ⊆ " ⊥ ⊆ E D (Sem-SObj) ⊆ [md = ς(xd)bd]d D [me = ς(xe)be]e E ∈ ⊆ ∈ α, β Type. α β F (α) G(β) (Sem-SRec) ∀ ∈ ⊆ ⇒ ⊆ µF µG ⊆ β α τ Type. τ β F (τ) G(τ) (Sem-SUniv) ⊆ ∀ ∈ ⊆ ⇒ ⊆ F G ∀α ⊆ ∀β 3 Deduction systems are sets of rules used to prove properties of systems. For example these are some of the rules of our type system. Deduction systems • Type systems • Property: type safety - efficiently decidable • Soundness usually proved syntactically • One can also use semantic models!(details soon) • Program logics (e.g. Hoare logic) • Property: correctness - undecidable • Soundness proved wrt. semantic model: • “Derivability implies validity in the model” 4 The properties one proves using deduction systems range from efciently decidable ones, like type soundness, to very strong properties which are undecidable like program correctness. Program logics are such deduction systems for program correctness. In order to prove soundness of type systems one usually employs syntactic arguments - subject-reduction. However, for program logics one proves soundness with respect to a semantic model, which makes a clear distinction between validity and derivability. A program logic is sound if everything that can be derived using the rules is also valid with respect to the model. We will see that simple semantic models can also be used for type systems. The Challenge 5 However, finding good semantic models in which one can reason about the behaviour of programs is challenging. This is even harder when dynamic storage of executable code is allowed. Still, this feature is present in diferent forms in almost all practical programming languages: pointers to functions in C/C++, callbacks in Java or general references in ML. Classical denotational models based on partial orders are highly complex in this setting, and in the existing ones many equivalences involving state still do not hold. [Category theory, domain theory: For higher-order store the semantic domain is defined as mixed-variant recursive equation one has to solve. For modeling dynamic allocation one has to move to a possible-world model, formalized as a category of functors over CPOs.] One possible alternative is to use simpler set-theoretic models based on the operational semantics and step-indexing. But before we dive into this, I will first present the programming language we studied. The Challenge • Executable code on the heap (e.g. C/C++, Java, ML) • Under this realistic assumption • Denotational semantic models • Very complex • Still not perfect 5 However, finding good semantic models in which one can reason about the behaviour of programs is challenging. This is even harder when dynamic storage of executable code is allowed. Still, this feature is present in diferent forms in almost all practical programming languages: pointers to functions in C/C++, callbacks in Java or general references in ML. Classical denotational models based on partial orders are highly complex in this setting, and in the existing ones many equivalences involving state still do not hold. [Category theory, domain theory: For higher-order store the semantic domain is defined as mixed-variant recursive equation one has to solve. For modeling dynamic allocation one has to move to a possible-world model, formalized as a category of functors over CPOs.] One possible alternative is to use simpler set-theoretic models based on the operational semantics and step-indexing. But before we dive into this, I will first present the programming language we studied. The Challenge • Executable code on the heap (e.g. C/C++, Java, ML) • Under this realistic assumption • Denotational semantic models • Very complex • Still not perfect • Possible alternative: “step-indexing” • Simpler set-theoretic models (details soon) 5 However, finding good semantic models in which one can reason about the behaviour of programs is challenging. This is even harder when dynamic storage of executable code is allowed. Still, this feature is present in diferent forms in almost all practical programming languages: pointers to functions in C/C++, callbacks in Java or general references in ML. Classical denotational models based on partial orders are highly complex in this setting, and in the existing ones many equivalences involving state still do not hold. [Category theory, domain theory: For higher-order store the semantic domain is defined as mixed-variant recursive equation one has to solve. For modeling dynamic allocation one has to move to a possible-world model, formalized as a category of functors over CPOs.] One possible alternative is to use simpler set-theoretic models based on the operational semantics and step-indexing. But before we dive into this, I will first present the programming language we studied. Imperative Object Calculus • Simple object-oriented programming language • Only one primitive: objects • Classes and inheritance encoded • Captures the essence of object-oriented languages • Features dynamically allocated, higher-order store • We also studied the functional object calculus • A Theory of Objects [Abadi & Cardelli ‘96] Martin Abadi Luca Cardelli 6 It is called the imperative object calculus and it is a very simple programming language. Only objects are considered as primitives. However, the calculus is expressive enough to encode all common features of practical object-oriented languages like classes and inheritance. The imperative object calculus captures the essence of object-oriented programming languages, the same way the lambda calculus captures the essence of functional ones. The calculus features: dynamically allocated, higher-order store - a very realistic store type. As we discussed before, it is challenging to find good semantic models for such a calculus. Actually, we have also studied the simpler functional object calculus and have a similar model for it, which I won’t present in this talk. Both the functional and the imperative object calculus are described in the book by Abadi and Cardelli: A Theory of Objects from 96. Imperative Object Calculus a, b ::= x variable [md = ς(xd)bd]d D object creation | clone a ∈ object cloning | a.m method invocation | a.m := ς(x)b method update | λx. b a b procedures | | [gcd = ς(y)λx. λz. if x < z then y.gcd x (z x) else if z < x then y.gcd (−x z) z else x] − 7 Step-indexed Semantic Models • First developed by Appel et al. - foundational PCC • Machine-checkable proofs of type soundness • For low-level languages • Lambda calculus • Recursive types [Appel & McAllester, ‘01] • General references and polymorphism [Ahmed et al., ‘03] [Ahmed, ‘04] Andrew Appel Amal Ahmed David McAllester 8 Now that we presented the calculus, let us turn our attention to step-indexed semantic models. Such models were first developed Andrew Appel and his collaborators in the context of foundational proof-carrying code. They wanted simpler and more modular proofs of type soundness that are easier to check automatically. And they considered low-level languages, like assembler or machine code. However, step-indexed semantic models were also defined first for a lambda calculus with recursive types in 2001, and then successfully extended to an imperative language with general references and impredicative polymorphism. Step-indexed Semantic Models • Use operational semantics of untyped calculus • Types are sets of indexed values “a :k α if a behaves like an element of α for k steps” • In the presence of state • Values also indexed by store typings Ψ Store extension relation: (k, Ψ) (j, Ψ!) • ! • Types have a “meaning” - predicates over programs 9 These models are based only on the small-step operational semantics of an untyped calculus. Types are interpreted as sets of indexed values. Informally, an expression has a certain type if it behaves like an element of that type for a fixed number of steps. Or equivalently we say that a is in alpha with approximation k, if one cannot distinguish a from a real value of type alpha in less than k computation steps. In the presence of state, the values are indexed not only by computation steps, but also by store typings. Store typings increase monotonically during computation, which is captured by a store extension relation. Like in a denotational semantics types have a meaning: they can be viewed as predicates over programs. Type constructors are just functions over these predicates. However, our model is purely set theoretical, so much simpler than the models one usually has with a denotational semantics. Especially the proofs are usually by direct induction on the index. Types Object types [md : τd]d D • ∈ • Recursive types µF Polymorphic types F F • ∀ ∃ Subtyping α β • ⊆ • “No object calculus can fully justify its existence without some notion of subsumption” - Luca Cardelli Bounded quantification F F • ∀α ∃α • Self types ~ recursive object types ς(λτ Type. [md : Fd(τ)]d D) ∈ ∈ 10 Out object calculus has of course object types, which enumerate the return type of each method (our methods do not have arguments). Recursive and polymorphic types are defined in the same way as in Ahmed’s model. However, a feature which is very important to us, but was not explicit before, is subtyping. In his book Luca Cardelli states that: “No object calculus can fully justify its existence without some notion of subsumption”. In particular an object with more methods can subsume (emulate) another object with fewer methods. Our model has a very natural notion of subtyping: since types are just sets, subtyping is just set inclusion. And while defining subtyping this way is very simple, the interaction between subtyping and the other features of the type system is not so simple.