Formal Semantics and Analysis of Object Queries

G.M. Bierman
University of Cambridge Computer Laboratory
J.J. Thomson Avenue, Cambridge, CB3 0DF, UK
[email protected]

ABSTRACT

Modern database systems provide not only powerful data models but also complex query languages supporting powerful features such as the ability to create new database objects and invocation of arbitrary methods (possibly written in a third-party programming language).

In this sense query languages have evolved into powerful programming languages. Surprisingly little work exists utilizing techniques from programming language research to specify and analyse these query languages. This paper provides a formal, high-level operational semantics for a complex-value OQL-like query language that can create fresh database objects, and invoke external methods. We define a type system for our query language and prove an important soundness property.

We define a simple effect typing discipline to delimit the computational effects within our queries. We prove that this effect system is correct and show how it can be used to detect cases of non-determinism and to define correct query optimizations.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGMOD 2003, June 9-12, 2003, San Diego, CA. Copyright 2003 ACM 1-58113-634-X/03/06 ...$5.00.

1. INTRODUCTION

"Database languages ('query languages') are nothing but special-purpose programming languages." [6]

Since Codd's pioneering work, database systems have moved beyond the simple relational data model and basic query language. Modern data models typically support complex data types, objects which are collected into classes, and notions of subtyping. Likewise query languages have been extended to include various features including object identity, object creation, and method invocation. To this extent, we can see that Date's quote above is even more true now than when he made it nearly twenty years ago.

Given that query languages are now essentially complex programming languages, it is perhaps surprising to find that the techniques and methodologies of the programming languages community do not find widespread use in the specification and analysis of query languages. This paper applies two dominant themes in current programming language research (type systems and operational semantics) to the study of an object query language and the problems of query optimization.

We believe that a formal, mathematical approach is essential to set a firm foundation for researchers, users and implementors of complex query languages. Without such mathematical precision it is very difficult, for example, to assert correctness. For example, the ODMG [8, p.100] define a notion of least upper bound of two types in their object model, and give an informal definition. However a few moments' formality soon reveals that a least upper bound of two types need not necessarily exist (because we have both classes and interfaces)! We shall also see in a number of places that another advantage of our formal approach is that it allows us to consider the design space of various features.

In this paper we pay particular attention to object-oriented data models although our techniques apply equally well to both relational and object-relational data models. We define a simple object data model that is essentially a fragment of the ODMG object model. This model provides primitive types and arbitrarily nested collection types, along with classes, and supports single inheritance between classes.

We define a complex query language, broadly based on ODMG OQL. This query language supports path expressions, object creation, and (read-only) method invocation amongst the more familiar features. Our language is similar in spirit to IQL of Abiteboul and Kanellakis [1], although it is different in detail. Moreover the primary concern in their work is in the expressive power of IQL (in particular the implications of its ability to create new objects, see also [7]).

We specify formally the type system for queries, and also provide an operational semantics. This semantics defines precisely the process of evaluation of a query, and is defined recursively over the structure of the query. Given this operational semantics we are able to prove the correctness of our type system. As far as we are aware, no result of this form exists for object query languages (in fact, the opposite negative result has been asserted for ODMG OQL [2]).

We then turn our attention to reasoning about queries. In particular we are concerned with formalizing when two queries should be considered equivalent, which is at the heart of the optimization problem. Even given our relatively small query language, we see that matters are subtle and complicated. The chief complication is that iteration over sets is non-deterministic: we can have no idea as to the order in which elements are taken from a set. In the pure relational world, this is not a problem as queries are purely declarative and so the order of evaluation is irrelevant. However once we add features familiar from object programming languages to our query language things are much more complicated. For example, consider the following query (written in a version of OQL).

    SELECT (if size(Fs)<1
            then (new F(name:"Peter",pal:p)).name
            else p.name)
    FROM p in Ps;

This query assumes a simple class P of objects with just a name attribute, whose extent is called Ps. We also have a class F with attributes name and pal, whose extent is called Fs. We shall assume initially there are no F objects and just two P objects, one with name "Jack", and the other "Jill".

Unfortunately this query is observably non-deterministic! The result of the query (and its side-effect on the extent of class F) is different depending on the order in which the P objects are considered. If we visit the "Jack" object first, the result is the set {"Peter","Jill"}, otherwise the result is the set {"Peter","Jack"}.

Another problem, and one often ignored, e.g. [5, 19], is that method invocation may not terminate. For example, consider the following query which is a variant of the one above, where P objects now have a method loop that, as the name suggests, does not terminate.

    SELECT (if size(Fs)<1 and p.name="Jack"
            then p.loop()
            else new F(name:"Peter",pal:p))
    FROM p in Ps;

We now have quite different non-deterministic behaviour: the query terminates if we visit the "Jill" object first, but fails to terminate if we visit the "Jack" object first.

To address these problems (and also to formulate correct optimizations) we define an effect typing discipline for our query language. Such typing disciplines, originally proposed by Gifford and Lucassen [11], are used to delimit computational effects and have been used in a variety of programming languages. For example, Java contains a simple effects system where each method is labelled with the exceptions it might raise. We define a simple effects system that annotates types with details of the extents that may be used in the evaluation of the query. For example, the source of [...]

Contributions. In summary, the significant contributions of this paper include: (1) The definition of an OQL-like query language, IOQL, that incorporates comprehensions, object identifiers, path expressions, object creation, and simple method invocation; (2) The formal definition of a type system and operational semantics for IOQL; (3) A proof of type soundness for IOQL; (4) The development of a type-based effect system to infer database access and update behaviour of queries; (5) A proof of correctness for this analysis; and (6) The use of effects to detect non-determinism and define correct optimizations.

2. DATA MODEL

In this section we define our data model which is strongly influenced by ODMG ODL, although the techniques we employ could easily be applied to other complex-value data models, including object-relational data models.

Our data model is a class-based object model. As in ODL, we allow single-inheritance between classes, although for simplicity we have not included interfaces. All objects have a unique object identity (oid), and consist of internal state comprising attributes, and a collection of methods. For simplicity, we shall consider only read-only methods, similar to those provided by, e.g., PREDATOR [22] or considered in [15]. The impact of more sophisticated method support is considered briefly in §5.

For example, here is a simple class definition of Employee objects.

    class Employee extends Person
    (extent Employees)
    { attribute int EmpID;
      attribute int GrossSalary;
      attribute Manager UniqueManager;
      int NetSalary (int TaxRate); }

This defines the Employee class as a subclass of Person. Its extent is called Employees. It has three attributes and one method. For simplicity we insist that all class definitions explicitly state a superclass (we also assume a class Object, which is the superclass of all classes). Two of the attributes, EmpID and GrossSalary, are integer values. The third, UniqueManager, is an object-valued attribute. The method NetSalary takes an integer argument and returns an integer.

More formally we define the grammar for class definitions as follows, where φ denotes valid types in our data model, which are just class names and primitive types int and bool.
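
As an aside on the first example query of the Introduction, its order dependence can be reproduced with a small simulation. The following Python sketch is not the paper's formal semantics: the dataclasses P and F, the function run_query, and the use of a Python list for the extent Fs are all assumptions made purely for illustration. It evaluates the query once for each possible visiting order of the two P objects.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class P:
    # Hypothetical stand-in for class P: a single name attribute.
    name: str


@dataclass(frozen=True)
class F:
    # Hypothetical stand-in for class F: attributes name and pal.
    name: str
    pal: P


def run_query(ps_in_visit_order):
    """Evaluate the first example query, visiting P objects in the given order.

    Returns the result set and the final contents of the extent Fs.
    """
    fs = []         # extent Fs, initially empty as the text assumes
    result = set()  # the SELECT yields a set of names
    for p in ps_in_visit_order:
        if len(fs) < 1:
            # then-branch: create a fresh F object (side effect on Fs)
            f = F(name="Peter", pal=p)
            fs.append(f)
            result.add(f.name)
        else:
            # else-branch: just read p.name
            result.add(p.name)
    return result, fs


if __name__ == "__main__":
    ps = [P("Jack"), P("Jill")]
    for order in permutations(ps):
        result, fs = run_query(order)
        visited = [p.name for p in order]
        print(visited, "->", sorted(result), "| pal of new F:", fs[0].pal.name)
```

Visiting "Jack" first gives the set containing "Peter" and "Jill"; visiting "Jill" first gives the set containing "Peter" and "Jack", matching the two outcomes described in the Introduction. The pal attribute of the created F object differs between the two runs as well, which is the side effect on the extent of class F.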
