
Object Query Service

Revised Submission in response to OMG Object Services Task Force RFP #4

January 4, 1995

Prepared by: Texas Instruments Incorporated Central Research Laboratories 13510 N. Central Expressway, M/S 238 Dallas, TX 75243

Technical Contact: Dr. Craig Thompson tel: +1 (214) 995-0347

fax: +1 (214) 995-8600 email: [email protected]

© Copyright 1995 Texas Instruments Incorporated

Texas Instruments Incorporated hereby grants a royalty-free license to the Object Management Group, Inc. (OMG) for world-wide distribution of this document or any derivative works thereof, so long as the OMG reproduces the copyright notices and below paragraphs on all distributed copies.

NOTICE

The information contained in this document is subject to change without notice.

The material in this document is submitted to the OMG for evaluation. Submission of this document does not represent commitment to implement any portion of this specification in the products of the submitters. WHILE THE INFORMATION IN THIS PUBLICATION IS BELIEVED TO BE ACCURATE, THE ABOVE COMPANIES MAKE NO WARRANTY OF ANY KIND WITH REGARD TO THIS MATERIAL, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Texas Instruments Incorporated shall not be liable for errors contained herein or for incidental or consequential damages in connection with the furnishing, performance, or use of this material. This document contains information which is protected by copyright. All Rights Reserved. Except as otherwise provided herein, no part of this work may be reproduced or used in any form or by any means - graphic, electronic, or mechanical, including photocopying, recording, taping, or information storage and retrieval systems - without the permission of the copyright owners. All copies of this document must include the copyright and other information contained on this page. The copyright owners grant member companies of the OMG permission to make a limited number of copies of this document (up to fifty copies) for their internal use as part of the OMG evaluation process. The following terms are trademarks of the OMG: OMG, CORBA, ORB, IDL.

RESTRICTED RIGHTS LEGEND. Use, duplication, or disclosure by government is subject to restrictions as set forth in subdivision (c) (1) (ii) of the Right of Technical Data and Computer Software Clause at DFARS 252.227.7013.

ACKNOWLEDGMENT. This research is sponsored by the Advanced Research Projects Agency under ARPA Order No. A719 and managed by the U.S. Army Research Laboratory under contract DAAA15-94-C-0009. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the Advanced Research Projects Agency or the United States Government.

Abstract

The OMG Object Query Service provides for operations on collections that return collections. More than most OMG object services, the query service represents an important opportunity for convergence between the OMG, ODMG, and ISO/ANSI SQL communities. Convergence in this area would go a long way in simplifying interoperability problems across enterprise distributed systems and database systems. Design goals and the architecture of the query service are presented, the latter accounting for native, foreign, and scalable, federated query services. A Functional Interface analogous to ODBC is proposed to create or destroy the query service itself or deliver to it queries for processing. An OQL[IDL] derived directly from SQL is proposed. A Collection Object Service, expressed in IDL, is needed independently of the object query service. Type Templates for IDL are also independently needed. Finally, mappings from OQL[IDL] to OQL[C++] and to OQL[Relations] (e.g., SQL) are provided.


Contents

1 Service Description
1.1 Design Goals of the Object Query Service
1.1.1 High-Level Objectives of the Object Query Service
1.1.2 Design Goals of the Object Query Language OQL[IDL]
1.1.3 Design Goals of the Object Query Service Architecture
1.2 Detailed Description of the Object Query Service
1.3 The OQS as a Continuum of Services
1.4 Architecture of the Object Query Service
1.4.1 Overview of the Architecture
1.4.2 Client and Service Provider Interfaces
1.4.3 Classification of Object Query Service Engines and Interfaces
1.4.4 Where is the processing done?
1.4.5 Communicating Queries Across the ORB

2 Service Structure

3 Resolution of Technical and Non-Technical Issues

4 Service Dependencies
4.1 Mandatory Dependencies
4.2 Optional Dependencies
4.3 Other Services that may depend on the Object Query Service

5 Relationship to CORBA

6 Relationship to the OMG Object Model

7 Standards Conformance
7.1 Relationship to OMG Standards
7.2 Relationship to ISO/ANSI SQL
7.3 Relationship to ODMG OQL

8 Open Technical Issues and Future Work

A OQS Functional Interface

B BNF for OQL[X]

C Object Query Language OQL[IDL]

D Object Query Language OQL[C++]

E Object Query Language OQL[Relations] (Sketch)

F Templates for IDL

G Collections Object Service

H Collection Indexing Object Service



1 Service Description

Queries are operations on sets or other kinds of collections of objects that have a predicate-based, declarative specification and that may result in sets or other collections of objects. The proposed Object Query Service (OQS) provides the ability to operate on collections of objects using a predicate-based, declarative language, called OQL[IDL]¹, that uses OMG's IDL as the object model and contains query and update statements based directly on the standard SELECT, UPDATE, INSERT, and DELETE statements of SQL. OQL[IDL] allows queries on transient or persistent collections of IDL objects and permits user-defined IDL functions in the formulation of queries, including collection-valued and Boolean-valued functions. OQL[IDL] supports queries on semantically different collection types (e.g., multiset, list, array), and makes direct use of IDL-provided data abstraction, behavior, and inheritance. In OQL[IDL], the notions of type and type extent are distinguished, thus allowing multiple collections of a type in an application.

Organization of this Document: The remainder of this section describes the overall design goals of the proposed Object Query Service (1.1), presents the detailed description of the Service and how it meets the several requirements listed in OMG OSTF RFP #4 (1.2), describes the continuum of services that an OQS should provide (1.3), and presents the architecture of the proposed Object Query Service that meets the design goals and provides the required functionality (1.4). Section 2 outlines the structure of the OQS and provides pointers to appendices containing the details. Section 3 identifies and resolves technical and non-technical issues. Section 4 lists the service dependencies of and on the OQS. Section 5 discusses the relationship of the proposed OQS to CORBA. Section 6 discusses the relationship of the proposed OQS to the OMG object model IDL. Section 7 identifies standards to which the OQS must conform and summarizes the level of conformance reached. Section 8 identifies open technical issues and required future work. Details of the material from the body of the proposal are contained in a number of appendices.

¹ OQL[X] is read "OQL of X" where X is an object or data model. OQL[X] is a parameterized grammar for SQL where non-terminals are defined in the BNF of the host language. OQL[IDL] is the Object Query Service applied to the object model IDL. Similarly, OQL[C++] is the Object Query Service applied to the object model C++ and OQL[Relations], which is equivalent to SQL, is the Object Query Service applied to the SQL data model based on tables.

Appendix A describes the OQS Functional Interface through which OQS instances are created, connected to, and passed queries. Appendix B provides a BNF grammar for the proposed query language template OQL[X], which is based directly on SQL. In Appendix C, we describe (1) the API for the query language OQL[IDL] and (2) the interfaces for invoking the query service. In Appendix D, we provide a mapping from OQL[IDL] to OQL[C++] based on the OMG IDL C++ Mapping [OMG 94-8-2]. In Appendix E, we sketch a mapping from OQL[IDL] to OQL[Relations]. Appendix F discusses templates for IDL. Appendix G presents the rationale for a Collections Object Service, which should exist independently of the OQS. Appendix H presents the rationale for a Collection Indexing Object Service.

1.1 Design Goals of the Object Query Service

1.1.1 High-Level Objectives of the Object Query Service

OMG OSTF RFP #4 lists the following objectives of all OMG object services. We describe how these are achieved for the object query service.

• To make the ORB "usable" for supporting commercial applications

If the ORB is to become the standard gateway to enterprise information, then it must be useful for accessing the mass of corporate data, providing traditional database services including relational queries while remaining compatible with the CORBA IDL object model. Interoperability will be much improved in a future world in which distributed systems and database management systems can both use a common object model. A query service for the ORB, available for all OMG applications, directly supports a more usable ORB. Indeed, the potential is to merge and unify three currently separate and competing kinds of object-oriented system infrastructure: OMG OSA Distributed Object Systems, Object-Oriented Database Systems, and Relational Database Systems.²

• To provide software "building blocks"

The OMG Object Services Architecture is a "building blocks" approach. Past generations of system infrastructure software have been developed as monolithic systems (e.g., in the database area, a database system traditionally provided many of the services in the OMG OSA). We can view "component database systems" based on an OMG OSA as a next generation beyond the hierarchical-network-relational-OO generations that provides service decomposition.³

² Merging (a) the distributed systems and database communities and (b) CORBA and OLE2.0 seem to us to be two of the biggest opportunities for massive improvements in system infrastructure interoperability.

A litmus test of a componentized design is whether the components are separately useful. The OODB vendors have demonstrated that very useful DBMS systems can provide just persistence (and concurrency control, etc.) but not also queries. Relational DBMSs depend heavily on queries. Hybrid OODB-RDB systems typically support both objects and queries. But, one can separately identify a need for a Collections Object Service, for Queries on collections that return collections, and can separately justify this with or without persistence and with or without concurrency control. Thus, a separable Query Service makes sense as an add-in component to a collection of object services. Taken together, a query service, transactions service, persistence service, and object model (plus other services like externalization) can be used to re-constitute an OODB or an OODB-RDB hybrid.⁴ Thus, the software "building blocks" approach appears very powerful as a basis for the next generation of system software frameworks.

• To leverage existing standards and to increase convergence of standards

At present, the OMG, ODMG, and ISO/ANSI SQL3 standards are not miles apart, and there is some convergence activity, but there is a real danger that these specifications may not converge quickly. The OMG OSTF RFP #4 Object Query Service is a golden opportunity for all three communities to work together toward a common unified standard specification.

This submission proposes a path to converge the OMG Object Query Service, the ODMG OQL, and ISO/ANSI SQL3, preserving most of the heritage of each community. Without some solution like the one we propose, these three groups may adopt incompatible specifications for an Object Query Service and Object Model. See Section 7, Standards Conformance.

³ A generation beyond that, directly enabled by OSA architectures, is to unify access to not only OODBs and RDBs but also file systems. The OMG Persistent Object Service is a step in this direction.

⁴ The Open OODB project at Texas Instruments is demonstrating just that.

• To reuse complex existing components

The same existing relational query engines, file servers and indexing mechanisms, and object-oriented query engines that currently provide access to massive corporate and engineering data represent an enormous development investment. Since, as stated above, the OQS must provide these capabilities anyway, it is imperative that the OQS be designed in such a way as to reuse these existing mechanisms and not just reproduce their functionality. It is for this reason that the architecture of the proposed OQS is designed to accommodate federation of query services, including those based on other data models than IDL, e.g., relations.

• To be open and extensible

One of the ways this is supported is through subtyping of the OQS itself. It is important that the OQS define a standard OQL[IDL] query service to query OMG collections (one subtype of OQS) and also a standard mapping from IDL to today's relational DBMSs (another subtype). It is equally important that light-weight subtypes be accommodated, like a simpler get-put access mechanism for searching and updating lists of attribute-value pairs. The OMG Property Object Service can then be defined to use either the light-weight service or invoke the more powerful OQL-based query service.
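The subtyping idea can be illustrated with a minimal Python sketch. It is not part of the submission: the names `QueryService`, `GetPutService`, `PredicateService`, and `lookup_color` are hypothetical, and a Python predicate stands in for an OQL-based engine. The point shown is only that a client such as a property service can be written against the supertype and then bound to either the light-weight or the more powerful subtype.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, List, Tuple

Pair = Tuple[str, Any]  # one attribute-value pair (hypothetical representation)


class QueryService(ABC):
    """Hypothetical root type: every query service subtype can search."""

    @abstractmethod
    def search(self, pairs: List[Pair], name: str) -> List[Any]:
        """Return the values stored under the attribute `name`."""


class GetPutService(QueryService):
    """Light-weight subtype: plain lookup and update, no query language."""

    def search(self, pairs: List[Pair], name: str) -> List[Any]:
        return [value for (attr, value) in pairs if attr == name]

    def put(self, pairs: List[Pair], name: str, value: Any) -> None:
        pairs.append((name, value))


class PredicateService(QueryService):
    """Heavier subtype: a predicate stands in for an OQL-based engine."""

    def search(self, pairs: List[Pair], name: str) -> List[Any]:
        return self.select(pairs, lambda attr, value: attr == name)

    def select(self, pairs: List[Pair],
               predicate: Callable[[str, Any], bool]) -> List[Any]:
        return [value for (attr, value) in pairs if predicate(attr, value)]


def lookup_color(service: QueryService, pairs: List[Pair]) -> List[Any]:
    # The client depends only on the supertype, so either subtype will do.
    return service.search(pairs, "color")
```

A caller that needs only attribute lookup can use `GetPutService`; one that needs richer predicates can switch to `PredicateService` without changing `lookup_color`.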

1.1.2 Design Goals of the Object Query Language OQL[IDL]

To support today's relational DBMSs, we assume that one subtype of OQS supports SQL directly. But we also assume that another subtype of OQS supports an object query language OQL[IDL] that directly supports IDL. Most of Sections 1.1 and 1.2 focuses on OQL[IDL]. Within the high-level goals described in Section 1.1.1, our design for OQL[IDL] is based on the following principles and assumptions:

• OQL[IDL] must rely on OMG's IDL as the object model. Therefore, OQL[IDL] does not include separate data definition statements. Data definition is entirely based on IDL. SQL-compatible data definitions could be added for backwards compatibility, however.

• OQL[IDL] must support data abstraction (encapsulation, behavior, inheritance) as provided by the IDL object model.

• OQL[IDL] must operate on semantically different collection types (e.g., set, list, multilist, sequence). Furthermore, queries are allowed on any collection-valued function provided by an IDL object.

• OQL[IDL] must support queries on collections of objects regardless of whether collections represent type extents or collections are defined and maintained by an application.

• OQL[IDL] must be orthogonal to the functionality of other OMG object services. For example, orthogonality with respect to persistence means that it should be possible to query and update collections of transient or persistent IDL objects. Similarly, it should be possible to query and update objects regardless of whether they are distributed, replicated, versioned, or time-varying.

• OQL[IDL] must have a familiar syntax as close as possible to SQL89, SQL2 Entry Level, and SQL3 within the constraint that its object model is IDL (not SQL89/SQL2 relations or SQL3's ADTs). This implies use of the familiar SELECT-FROM-WHERE syntax as a basis.

• OQL[IDL] must be project-able into any programming language for which there is an IDL binding. Thus, it should be possible to define projections OQL[C++], OQL[SmallTalk], OQL[Relations], etc., as projections of OQL[IDL]. Appendices D and E cover OQL[C++] and OQL[Relations] respectively.

• OQL[IDL] and its projections must be optimizable.

1.1.3 Design Goals of the Object Query Service Architecture

Finally, there are several objectives that drive the proposed federated architecture of the OQS:

• Any query using the ORB according to the CORBA OSA model must be posed in terms of IDL collections of IDL objects.

• Queries embedded into applications written in some programming language with an IDL binding may be posed in terms of that language's binding of the particular IDL collection and object definitions. The language of implementation of the various collections and objects in a query should not be apparent to an application posing a query.

• Execution of object queries requires the ability to invoke method implementations implemented in various programming languages. To obtain reasonable efficiency, the query engine may exist in the same process space (and hence usually the same language) as the method implementation.

• Existing query engines will not be discarded because of their expense, and because existing applications that are encapsulated as Object Services, Common Facilities, or Application Objects will still need to use their current query engines internally.

• The query service must accommodate federation to allow queries on collections that are distributed across ORBs and to accommodate legacy data stored in existing DBMS systems.

1.2 Detailed Description of the Object Query Service

Definition of Queries. Queries are operations on sets or collections of objects that have a predicate-based, declarative specification and that may result in sets or other collections of objects. The OQS provides the ability to operate on collections of objects using a predicate-based, declarative language called OQL[IDL]. The results of object queries are collections of objects. Operations provided by the OQS fall into the general categories of selecting, inserting, updating, and deleting elements of collections. In this sense, the OQS is similar to relational query engines.
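The four categories of operation can be sketched in Python as follows. This is an illustration only, not the OQL[IDL] syntax or API defined in the appendices: the `Part` class and the function names `select`, `insert`, `update`, and `delete` are hypothetical, and Python callables stand in for declarative predicates.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Part:
    name: str
    weight: float


Predicate = Callable[[Part], bool]


def select(coll: List[Part], where: Predicate) -> List[Part]:
    """Return a new collection holding the elements that satisfy `where`."""
    return [p for p in coll if where(p)]


def insert(coll: List[Part], obj: Part) -> None:
    """Add one element to the collection."""
    coll.append(obj)


def update(coll: List[Part], where: Predicate,
           change: Callable[[Part], None]) -> int:
    """Apply `change` to every matching element; return how many matched."""
    count = 0
    for p in coll:
        if where(p):
            change(p)
            count += 1
    return count


def delete(coll: List[Part], where: Predicate) -> int:
    """Remove every matching element in place; return how many were removed."""
    before = len(coll)
    coll[:] = [p for p in coll if not where(p)]
    return before - len(coll)
```

Note that `select` returns a collection, so its result can itself be the target of a further query, which is the closure property the text describes.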

IDL as the Object Model. To be consistent with the OMG/CORBA model of interaction where only objects known to the ORB can be operated on by Object Services, queriable collections and the objects they contain must be known to the ORB. This means that all queriable collections and the objects they contain must have IDL specifications and have implementations in a programming language with an IDL mapping. Specifically, any collection of objects can be queried, provided both the collection and the objects in it (1) have IDL specifications, (2) are implemented in some programming language with an IDL binding, and (3) are known to the ORB. In general, any ORB-visible collection type satisfying certain properties (Appendix G) can be the target and/or result of a query. Thus, OQS supports the ability to query collection types other than sets and multisets. Because of the way in which OQS uses collections, adding a new queriable collection type does not require modification of the OQS.

Collections. The OQS operates on Collection classes. A Collections Object Service is described in Appendix G. Collection classes are separately needed and are not directly part of the OQS specification, hence they are separately specified. Collection classes are specified in IDL. Collection classes are homogeneous, defined with templates on user-defined IDL objects. Since IDL does not support templates, Type Templates for IDL are described in Appendix F. All collection classes support operations to create, access, insert elements, erase elements, and destroy collections. Like individual IDL objects, IDL collection objects may be persistent or transient, local or remote, versioned or not, etc.
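As a rough illustration of a homogeneous, template-defined collection, the following Python sketch uses a generic class in place of an IDL type template. The class name `Bag` and its method names are hypothetical and are not the interfaces of Appendix G. Creation maps to the constructor, and destruction is left to Python's garbage collector.

```python
from typing import Generic, Iterator, List, Type, TypeVar

T = TypeVar("T")


class Bag(Generic[T]):
    """Hypothetical homogeneous multiset, parameterized by element type."""

    def __init__(self, element_type: Type[T]) -> None:
        self._element_type = element_type
        self._items: List[T] = []

    def insert(self, item: T) -> None:
        # Enforce homogeneity at run time, as a template would at compile time.
        if not isinstance(item, self._element_type):
            raise TypeError(f"expected {self._element_type.__name__}")
        self._items.append(item)

    def erase(self, item: T) -> None:
        # Remove one occurrence, matched by object identity rather than value.
        for index, existing in enumerate(self._items):
            if existing is item:
                del self._items[index]
                return
        raise ValueError("item not in collection")

    def __iter__(self) -> Iterator[T]:
        # Access: iterate over the current elements.
        return iter(self._items)

    def __len__(self) -> int:
        return len(self._items)
```

Because a query engine would only need iteration, insertion, and erasure, a new collection type offering the same operations could be queried without changing the engine.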

OQL[IDL]. The OQS⁵ defines an object query language, OQL[IDL], in which queries over IDL-specified collection objects may be expressed. OQL[IDL] is an object-oriented subset of SQL89/SQL2 with the familiar SELECT-FROM-WHERE syntax. In particular, the data model of SQL is generalized from SQL89/SQL2's tables (multisets) of flat tuples to collections of object instances, and operations in the WHERE clause can include user-defined methods on instances in the collection type. Since objects in collections are specified in IDL, OQL[IDL] directly uses IDL encapsulation, inheritance, and operation overloading. Thus, it is possible to pose queries in which object identity forms part of a query predicate, in which subtypes of the object type specified in the SELECT clause can satisfy the predicate, and in which objects implemented in various languages can be queried in the same way and even participate in the same query. OQL[IDL] redefines the normal relational algebra operations appropriately to an object model. These changes are apparent in the handling of object identity, and the ability of the JOIN operator to produce new types. For example, a projection of an object may cast the object to one of its super-types, thus preserving object identity (and possibly allowing only part of the state to be materialized to save space), or a projection or join may create a new instance of a type not in the objects' hierarchies by using the services of an object factory. The result of an OQL[IDL] query is always an ORB handle to an instance of a collection type having an IDL specification. The collections resulting from queries can be used wherever any collection-valued variable or function can be used. As with any other object accessible via the ORB, its implementation may be local or remote, persistent or transient, and may be implemented in any supported language.

⁵ Actually the subtype of OQS that accepts OQL[IDL].
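The roles of inheritance, overridden methods, and object identity in a predicate can be sketched in Python. The classes `Employee` and `Engineer` and the helper `select` are hypothetical and do not reproduce OQL[IDL] syntax. The sketch assumes that a query names a type, that instances of its subtypes also qualify, and that the predicate may call a user-defined method and compare object identity.

```python
from typing import Callable, Iterable, List, Optional, Type, TypeVar

T = TypeVar("T")


class Employee:
    def __init__(self, name: str, salary: int,
                 manager: Optional["Employee"] = None) -> None:
        self.name = name
        self.salary = salary
        self.manager = manager

    def cost(self) -> int:
        # User-defined method usable inside a predicate.
        return self.salary


class Engineer(Employee):
    def cost(self) -> int:
        # Overrides the supertype's method; the query need not know this.
        return self.salary + 10


def select(of_type: Type[T], source: Iterable[object],
           where: Callable[[T], bool]) -> List[T]:
    # isinstance admits subtypes of the requested type.
    return [x for x in source if isinstance(x, of_type) and where(x)]
```

A predicate such as `lambda e: e.manager is boss and e.cost() >= 50` then tests identity (`is`) rather than equality of state, so two managers with identical attribute values remain distinguishable.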

Relationships and Properties. Because relationships and properties as defined by the Object Relationship Service and the Object Property Service (OPS) will be IDL objects, we expect them to be able to fit naturally into query predicates. Relationships, whether implemented as pointers as is common in C++, or as abstractions provided by a service as in the OMG Object Relationship Service, are related to relational join operations used for navigation. Texas Instruments' existing OQL[C++] successfully supports optimized queries across relationships encoded as C++ pointers, and OQL[IDL] is defined to support the more general case of relationships implemented as abstractions provided by an Object Service. We further expect that collections of relationships or properties could themselves be the targets of queries. The OMG Object Property Service can be defined to be optionally dependent on the OQS if its definition of list-of-attribute-value-pairs can be subtyped to be a collection that OQS can query. Thus, light or heavy weight property lists could be accommodated by an OPS.

Interactive and Embedded Queries and Language-Specific Projections of OQL[IDL]. Like relational query languages, OQL[IDL] may be used in two distinct ways: interactive, ad hoc queries, and embedded in a programming language. When used interactively, the syntax of OQL[IDL] can be directly used by the human user. When embedded into a programming language, a projection of OQL[IDL] into that language can be used, with the actual queries being posed in the projection. The syntax for the projection of OQL[IDL] into a given programming language is language-specific. OQL[IDL] may be projected into a given programming language in more than one way. In Appendix D, we propose a projection of OQL[IDL] into C++, called OQL[C++], that supports C++ inheritance, encapsulation, path expressions, methods, constants, and operator overloading. Because the programming language projections of OQL[IDL] are independent, it is not necessary to define or implement all of them at once. We assume such projections will observe the property that, given an OMG IDL to X Mapping for data model X, OQL[IDL] will map to OQL[X] respecting the IDL to X mapping. We have defined OQL[IDL] in such a way that multiple projections are possible. For example, see Appendix E, which sketches another useful mapping for OQL[Relations].

Optimization. Queries in OQL[IDL] and its programming language projections are optimizable. Query language statements are mapped into an object algebra operator graph and rearranged by any of a variety of heuristic or cost-based optimization strategies within constraints imposed by the operator algebra. The operator algebra is based on relational algebra and can be augmented with algebra rules appropriate to the methods of the various object and collection types over which queries can be posed. We identify but do not specify an interface to the OQS optimizer that allows application developers to supply these algebraic rules, thus allowing optimization over methods without breaking encapsulation. For example, it becomes possible to declare to the OQS that a pair of methods commute without saying what they do, thus allowing the OQS optimizer to rearrange them without breaking encapsulation. Providing an algebra for the methods of particular objects or collections is optional; failure to do so does not prohibit queries over them, it just reduces the amount of optimization that can take place. Algebra rules can be provided at any time, thus allowing developers to determine and declare them only if a performance need is demonstrated. Visibility of these optimization capabilities to OMG applications is not mandatory. No standard interface to support object query optimization is provided at this time.
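Since the submission identifies but does not specify the optimizer interface, the following Python sketch is purely hypothetical. It assumes a plan is a linear sequence of named operators with estimated costs, and that the only rule a developer can declare is that two operators commute. The optimizer moves cheaper operators earlier, but swaps two adjacent operators only when such a rule exists, so it never needs to know what the operators do.

```python
from typing import FrozenSet, List, Set, Tuple

Op = Tuple[str, int]  # (operator name, estimated cost); both hypothetical


class Optimizer:
    def __init__(self) -> None:
        self._commuting: Set[FrozenSet[str]] = set()

    def declare_commute(self, a: str, b: str) -> None:
        """Register an algebra rule: operators `a` and `b` may be swapped."""
        self._commuting.add(frozenset((a, b)))

    def commute(self, a: str, b: str) -> bool:
        return frozenset((a, b)) in self._commuting

    def reorder(self, plan: List[Op]) -> List[Op]:
        """Move cheaper operators earlier using only declared swaps."""
        plan = list(plan)  # leave the caller's plan untouched
        changed = True
        while changed:
            changed = False
            for i in range(len(plan) - 1):
                left, right = plan[i], plan[i + 1]
                if right[1] < left[1] and self.commute(left[0], right[0]):
                    plan[i], plan[i + 1] = right, left
                    changed = True
        return plan
```

With no rules declared, `reorder` returns the plan unchanged, which mirrors the statement that omitting algebra rules reduces optimization but does not prohibit the query.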

1.3 The OQS as a Continuum of Services

The capabilities of an OQS described in the previous subsection represent a maximal OQS. However, there are a variety of levels of query service that can and should be supported by the OQS. These include SQL compatibility and full object queries over arbitrary collection types where object identity is maintained. It is important that an OQS support this variety of queries as an upwardly compatible continuum of services. Specifically, object queries over collections of objects that "look like" relations of flat tuples should not differ from the corresponding SQL Entry Level queries. This need has been generally recognized by the SQL and ODMG communities, who have formed a working group to ensure this level of continuity. At the upper end of the spectrum, other problems arise that are more related to uncertain semantics than to syntactic consistency between two query languages. For instance, it is by no means clear what a join between a list and a set should look like (is it a set, a list of sets, a set of lists, ... ?). Similar problems arise whenever a nonset collection is involved in a query; existing query algebras are relational, and their semantics are simply not defined for other collection types.
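The ambiguity of joining a list with a set can be made concrete with a small Python sketch. The two functions below are hypothetical candidates, neither of which is endorsed by this submission. The same inputs yield results that differ in ordering and in the treatment of duplicates, depending on which collection type the join is taken to produce.

```python
from typing import List, Set, Tuple

Row = Tuple[str, int]  # (item, price)


def join_as_list(orders: List[str], stock: Set[Row]) -> List[Row]:
    """Candidate 1: the result is a list, keeping order and duplicates."""
    return [(o, price) for o in orders for (item, price) in stock if item == o]


def join_as_set(orders: List[str], stock: Set[Row]) -> Set[Row]:
    """Candidate 2: the result is a set, so order and duplicates are lost."""
    return set(join_as_list(orders, stock))
```

For `orders = ["pen", "ink", "pen"]` and a stock set with one price per item, the first candidate yields three rows and the second yields two, so an algebra must choose one reading before such a query has a defined answer.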

At a less difficult level, consider the semantics for the selection of objects from sets. Such selection must preserve identity. However, there are at least three separate semantics that can be supported when doing a simple selection: (1) the original objects become part of the result set with their identity preserved, (2) projections of the original objects to one of their ancestor classes become part of the result set with their identity preserved, and (3) new objects with new identities are materialized as the result of some form of join operation between two objects. The first alternative is the obvious case. The second alternative is very useful for managing memory utilization, since it allows elimination of unneeded slots of objects (note that this may also be useful during query processing). The third is needed if we are to allow a complete view capability where object identity is available in the view. Note that this cannot be done within a relational model of joins, since no identity is preserved or generated. Identity in the view is important to allow updates through the view, to allow object-oriented identity comparison, and to ensure that distinct objects with (temporarily) identical state do not collapse down into a single tuple.

To ensure backwards compatibility to SQL and to promote forward compatibility to new object-oriented query features that are envisioned but not yet specified or ready for implementation, while at the same time allowing the OQS selection and development cycle to proceed, we need a roadmap of the aforementioned query continuum and a scoping of what will appear in the initial OQS specification being promulgated (OMG OQS 1.0). This is shown in Figure 1.3. Across the top are shown the various levels of query capability provided; across the bottom are the kinds of data that can be manipulated by each level of language. Each level of language must subsume the level to its left syntactically and semantically.

SQL[Relations] is regular SQL. OQL[IDL-restricted] allows queries over instances of IDL-defined relation classes of IDL-defined tuple class objects. Both relation and tuple may be subclassed. These two classes must have primitive types known to SQL Entry Level as their slot values and must have only slot accessor functions (i.e., no member functions that take arguments). The syntax and semantics of SQL[Relations] and OQL[IDL-restricted] are identical. This is being defined by an ad hoc X3H2/ODMG working group. We are prepared to accept this specification pending review that it is consistent with the stated goals. OQL[IDL-restricted] can be made capable of accessing IDL-specified objects of types other than tuple (shown in the figure by the dashed line) as long as only accessor functions are used in the WHERE clause and all attributes to be retrieved are primitive types known to SQL.

The next level of capability, OQL[IDL-Sets of Objects], is the ability to select arbitrary IDL-defined objects from sets while preserving object identity. As noted above, there are three semantics possible for the selection of objects: simple selection, selection with identity-preserving projection to a supertype, and construction of new objects via joins. While it is our belief that all should ultimately be supported, there does not appear to be community consensus on either the semantics or the appropriate syntax for the second two. Thus, we propose that in OMG OQS 1.0, only simple selection be supported as an identity-preserving operation. Selection of a subset of an object's attributes in a non-identity-preserving manner into tuples can be done using OQL[IDL-restricted]; applications can then use these retrieved values to construct new objects or project themselves.

OQL[IDL-Collections of Objects] allows all the capabilities of OQL[IDL-Sets of Objects] with the ability to query over arbitrary collection types, ultimately including lists, arrays, graphs, trees, etc. As pointed out previously, the semantics of an algebra for these bulk types is unknown. Thus it is premature to specify this level of service in OQS-1. However, its desirability must be kept in mind when specifying OQS-1 to provide the other capabilities listed above.
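The two selection styles proposed for OMG OQS 1.0 can be contrasted in a short Python sketch. The class `Person` and the functions `select_objects` and `select_tuples` are hypothetical. The first function corresponds to simple, identity-preserving selection; the second corresponds to retrieving primitive attribute values into tuples, from which an application may build new objects with new identities.

```python
from typing import Any, Callable, List, Sequence, Tuple


class Person:
    def __init__(self, name: str, age: int) -> None:
        self.name = name
        self.age = age


def select_objects(coll: Sequence[Person],
                   where: Callable[[Person], bool]) -> List[Person]:
    """Simple selection: the result holds the original objects themselves."""
    return [obj for obj in coll if where(obj)]


def select_tuples(coll: Sequence[Person],
                  where: Callable[[Person], bool],
                  attrs: Sequence[str]) -> List[Tuple[Any, ...]]:
    """Value selection: copy the named attributes into identity-free tuples."""
    return [tuple(getattr(obj, a) for a in attrs) for obj in coll if where(obj)]
```

An update made through an element of `select_objects` is visible in the source collection, whereas objects rebuilt from `select_tuples` rows, for example with `Person(*row)`, are distinct objects even when their state is identical.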


1.4 Architecture of the Object Query Service

1.4.1 Overview of the Architecture

Figure 1 shows the architecture for a general Object Query Service. The architecture accounts for OQS components and the interfaces among them, including:

• interface for invoking a query service,

• API for phrasing OQL[IDL] queries,

• mappings from OQL[IDL] to foreign query engines including OODBs, RDB, and file systems,

• collection classes expressed in IDL,

• interface for indexing IDL collections,

• interface for registering optimization information, and

• interface for federation of multiple OQS services.

1.4.2 Client and Service Provider Interfaces

The client of an OQS issues OQL queries and receives responses without knowing the implementation of the OQS. In environments where there is just one query service, already bound to the environment, the client need not be aware of interfaces for invoking the query service. Thus, the primary user interface to OQS is the API for phrasing OQL[IDL] queries. See Appendix C. OQS service providers need additional system interfaces since they need to provide ways for OMG query engines to interoperate with legacy query engines and ways for scaling the OMG query capability by federating query engines.

1.4.3 Classification of Object Query Service Engines and Interfaces

OQS system developers need to be able to build systems that interoperate and scale. We classify OQS engines and interfaces needed into the following four building blocks:6

6 Note: These common cases are true for other services, not just the query service. For example, the naming, transaction, or security services can have native or foreign implementations, gateways, or federated architectures. That is, it is not just CORBA and OQS that need a federation architecture but other object services as well.

• Native OQS - A native OQS engine natively parses OQL[IDL], optimizes OQL queries to produce an execution plan, then provides the execution engine that computes a response.

• Foreign OQS - A foreign OQS engine does not understand OQL or IDL. Some form of wrapper maps OQL[IDL] to FQL[FDM] (Foreign Query Language of Foreign Data Model), which produces results which are given IDL interfaces. A typical example is SQL[Relations].

• OQS Gateways - An OQS gateway provides a front-end OQS that can map to multiple back-end OQS engines.

• Federated OQS - A federated OQS splits and coordinates execution of a query across Native, Foreign, or Federated OQSs and combines the results returned.

The Foreign OQS case is important since today few if any Native OQS query engines exist and most database data is in existing legacy systems. But today's DBMS systems cannot take direct advantage of the object modeling capabilities of IDL; in particular, they often do not permit operations in predicates in selection expressions. Also, they cannot optimize for object models and application developers cannot define new indices. Finally, there is a need to map data from the IDL data model to the data model of the foreign database engine, and this is difficult when application domains are more complex than simple relational database applications support. Application developers are left to decide whether to model their domain data in IDL, in a DBMS-supported object model (e.g., SQL3's object model), or in an OOPL, and the mapping among these three is left to them. This provides an opportunity for Native OQS services that directly understand IDL, support predicates in selection expressions, and allow developers to define new indices.

A middle ground is Interface Adaptors that automatically convert between OQL[IDL] and various projections like SQL[Relations], thus allowing queries to be migrated between OQSs for processing. As with all OMG bindings from IDL to other object models, it may not be equally easy for developers to model freely in IDL and in the target data model and shift back and forth when the target data model semantics does not fit IDL directly.7

Since it is desirable to be able to port IDL applications to multiple vendors' OQS implementations, gateways provide a degree of independence. A standard gateway OODBC analogous to ODBC is needed.

7 This is the primary reason why it would be valuable if SQL3, ODMG, and OMG all supported the same object model.

Finally, since data may be stored under the control of multiple OQS engines, there is a need for federation of OQSs. All communication between these OQS engines is in terms of OQL[IDL] queries, and responses are IDL collections or exceptions.
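The federated building block can be pictured as a composite: a federated OQS forwards a query to each member (which may itself be a federation) and combines the collections returned. The following Python sketch is illustrative only. The class names `InMemoryOQS` and `FederatedOQS` are hypothetical, and a Python callable stands in for an OQL[IDL] query, which this submission would transmit instead.

```python
class InMemoryOQS:
    """Stands in for a single OQS engine holding one collection."""

    def __init__(self, items):
        self._items = list(items)

    def evaluate(self, predicate):
        # A real engine would parse and optimize an OQL[IDL] query here.
        return [x for x in self._items if predicate(x)]


class FederatedOQS:
    """Splits a query across member OQSs and combines their results."""

    def __init__(self, members):
        self._members = list(members)

    def evaluate(self, predicate):
        combined = []
        for member in self._members:
            # Members share the same interface, so federations can nest.
            combined.extend(member.evaluate(predicate))
        return combined


site_a = InMemoryOQS([1, 5, 9])
site_b = InMemoryOQS([2, 6, 10])
federation = FederatedOQS([site_a, FederatedOQS([site_b])])
result = federation.evaluate(lambda x: x > 4)
```

Because every participant exposes the same `evaluate` operation, a client cannot tell whether it is talking to a single engine or to a federation.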

1.4.4 Where is the processing done?

CORBA makes object location transparent. Query operations often operate on large numbers of objects. While it may be that operations are moved to the object or objects to the operator, a concern for efficiency is important, hence it is necessary to provide for Query Optimization. Query optimization decisions for an OQS will be a challenge since much of the data is stored in various legacy databases wrapped with IDL. Support for indexing and optimizing arbitrary IDL will be a challenge if it involves mapping down to optimization machinery in existing legacy DBMS systems. This is the case today with object interfaces to relational engines. For many IDL-wrapped legacy applications, however, processing may just involve mapping IDL queries to SQL queries on relational data, then returning the relational result wrapped in IDL. This should remain as efficient as the existing implementation.
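The simple case described above, in which a query is mapped to SQL on relational data and the relational result is returned in wrapped form, can be sketched as follows. This is a minimal illustration in Python using the standard `sqlite3` module as a stand-in relational engine. The function name `query_as_objects`, the `part` table, and the use of named tuples in place of IDL tuple objects are all assumptions made for the example.

```python
import sqlite3
from collections import namedtuple


def query_as_objects(conn, table, columns, where_sql="1=1", params=()):
    """Run a SELECT on the relational engine and wrap each row as an object.

    Table and column names are interpolated into the SQL text, so they must
    come from trusted schema metadata, never from end-user input.
    """
    sql = f"SELECT {', '.join(columns)} FROM {table} WHERE {where_sql}"
    Row = namedtuple(table.capitalize(), columns)  # stand-in for an IDL tuple type
    return [Row(*r) for r in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (id INTEGER, name TEXT, weight REAL)")
conn.executemany("INSERT INTO part VALUES (?, ?, ?)",
                 [(1, "bolt", 0.1), (2, "gear", 2.5)])

# The predicate is evaluated by the relational engine, not by the wrapper.
heavy = query_as_objects(conn, "part", ["id", "name"], "weight > ?", (1.0,))
```

The wrapper adds only a thin conversion step per row, which is why this path can stay close to the efficiency of the underlying engine.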

1.4.5 Communicating Queries Across the ORB

Any query transmitted between OQSs will be in the form of OQL[IDL]. Object references contained in the query will be packaged by the OQS as IDL references. Thus, names local to a particular program environment will be resolved to an OID as part of the process of packaging the query. It is not specified whether the objects themselves will be shipped as part of the query or be allowed to be retrieved by the recipient. Results returned from queries will be OIDs of collection objects; again, it is not specified whether the actual objects will be returned, but that is the most likely case.

Using OQL[IDL] as the language of passing queries has been chosen as the standard mechanism for several reasons. Queries posed in one OQL[X] projection that are to be evaluated in OQL[Y] must be translated between them. While it is possible for an OQS[X] to translate to all formats, it is better to use an intermediate form from which other OQS[Y]s can translate into their own dialect. Besides the obvious reduction in the number of translators required, it makes it easier to add new OQS[Z]s because they can simply appear in the system and become usable without having to add new translators to all the existing OQS[X]s. This is also very much in keeping with the OMG philosophy that encourages minimal knowledge of a service supplier by the requester. By using OQL[IDL] for transport, the requesting OQS need not know the language in which the serving OQS will perform, much less which dialect (e.g., which variant of SQL).
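The reduction in the number of translators can be made concrete with a small calculation. The sketch below assumes that each translator is one-directional and that every dialect translates both to and from the intermediate form; the function names are hypothetical.

```python
def pairwise_translators(n):
    """Direct translation: every dialect translates to every other one."""
    return n * (n - 1)


def hub_translators(n):
    """Intermediate form: each dialect translates to and from the hub only."""
    return 2 * n


for n in (3, 5, 10):
    print(n, pairwise_translators(n), hub_translators(n))

# Cost of adding one more dialect to a system that already has n of them.
added_pairwise = pairwise_translators(6) - pairwise_translators(5)
added_hub = hub_translators(6) - hub_translators(5)
```

With direct translation the cost of adding a dialect grows with the number of existing dialects, while with an intermediate form it stays constant at two translators.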

2 Service Structure

Based on the architecture of the Object Query Service, as described in Section 1.4, a complete Object Query Service submission will require separate specification of several kinds of interfaces. In this section, we list those interfaces we have identified and briefly describe which of these we recommend as candidates for OMG OQS 1.0 adoption. Since the interfaces are orthogonal, we recommend that OMG require that these specifications be separately (not monolithically) packaged to allow each specification to be separately considered and improved. Appendices A through H provide the detailed specifications or descriptions of the OQS interfaces we have identified. An OQS specification will eventually require the separate specification of the following interfaces. As noted, some of these should be included in OMG OQS 1.0 and some can be deferred to a future OQS 2.0 specification.

• OQS Functional Interface

An OQS-Factory object creates OQS instances that enable OMG applications and services to interoperate with OQSs. This interface enables applications to: (1) connect with one or more OQSs, (2) submit query and update requests to the OQS, (3) get the results of query requests, (4) bind the results of query requests to variables in the application run-time environment, and (5) disconnect from OQSs. The IBM, ODMG, Oracle, and TI first round submissions all identified the need for this interface.

Subtypes of OQS objects can accept different query languages. OMG OQS 1.0 specification should support the following:

- an OQS subtype that combines SQL and IDL (see Appendix C) to make it easy to query IDL collections using a standard mechanism.

- an OQS subtype that interfaces to SQL-92-entry-level (see Appendix E) to ensure OMG access to data in legacy relational DBMS systems.

In addition, other OQS subtypes based on SQL may be defined (e.g., SQL89, SQL-92-entry-level-Oracle) as well as OQS subtypes not based on SQL (e.g., a lighter weight OQS for the OMG Property Service). The SQL-based OQS subtypes should retain as much compatibility with ANSI/ISO SQL standards as possible.

Appendix A describes the functionality needed for an OQS Functional Interface, but we have not specified an IDL syntax for OQS objects since we expect to be able to adopt the interface being defined by IBM and ODMG in their merged submission.

The OQS Functional Interface is analogous to the ODBC interface currently offered by many relational DBMSs as well as to some portions of the SQL Call Level Interface. OMG needs to consider compatibility with these specifications before adopting its own OQS Functional Interface.

• OQL[X] BNF

In Appendix B, we describe an SQL BNF grammar that is parameterized in such a way to allow bindings to IDL, C++, and Relations.

The particular OQL grammar provided is a subset of SQL89 and of SQL2. It is provided as a guide for how we can define a common SQL-based query language that provides bindings to different object and data models. See Section 1.3. A completed OMG OQS 1.0 specification should include at least one carefully defined OQL[X] corresponding to and compatible with SQL89 and/or SQL2-entry-level.

• OQL[IDL]

In Appendix C, we describe an OQL[IDL] binding and provide example queries. OQL[IDL] uses the OQL[X] BNF defined in Appendix B.

This binding is needed in OMG OQS 1.0 since it defines the object query language that "operates on IDL collections to return IDL collections" that is required in OSTF RFP #4.

• OQL[C++] Binding

In Appendix D, we describe an OQL[C++] binding and provide example queries. OQL[C++] uses the OQL[X] BNF defined in Appendix B.

This binding would be useful to include in OMG OQS 1.0 since it defines the object query language that operates on C++ collections to return C++ collections, analogous to OQL[IDL]. The description of OQL[C++] is consistent with the OMG IDL C++ Language Mapping [OMG Document 94-8-2].

• OQL[Relations] Binding

In Appendix E, we describe an OQL[Relations] binding. OQL[Relations] uses the OQL[X] BNF defined in Appendix B. This binding is functionally equivalent to SQL and may be syntactically the same.

This binding is needed in OMG OQS 1.0 since much data is in existing relational databases and only accessible via SQL. We propose a projection of RDBMS relations back into IDL so IDL relation and tuple objects mirror existing RDBMS relations.

• IDL Templates

In Appendix F, we identify the need for a separate OMG IDL Templates facility. C++ and SQL3 both support templates. The OMG Collections Object Service will require templates.

While we do not provide a specification of templates, we assume a syntax consistent with the CORBA syntax for builtin templates Sequence and String is sufficient for the OQS specification, e.g., Collection.

• OMG Collections Object Service

In Appendix G, we identify the need for a separate OMG Collections Object Service. Both OMG Object Services Task Force and OMG Common Facilities Task Force have identified this need and a description has been circulated along with a recommendation from CFTF to OSTF to include the Collections Object Service in OSTF RFP #5.

We expect to support the minimal definition of an IDL collection class defined by the combined ODMG-IBM submission and so do not provide our own specification.

However, that specification should carefully consider upwards compatibility with a future OMG Collections Object Service and IDL Templates facility. The Collection class can be defined as another IDL builtin for now for future compatibility.

• OMG Collections Indexing Service

In Appendix H, we identify and describe, but do not specify, IDL interfaces for an OMG Collections Indexing Object Service. For the present, a usable OMG OQS 1.0 capability can be defined without this capability, which is primarily needed to provide control for system maintenance of collections and for optimization of queries on collections.


• Other Interfaces

We identify here but do not further describe other interfaces related to the OQS that may be candidates for future OMG standards:

- Object Query Optimizer

- Object Query Algebra

- Object Query Parse Tree (intermediate form)

- OQS Federation Interfaces

- OQL[Smalltalk]

- Object Implementation Repository Service
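The five steps listed earlier in this section for the OQS Functional Interface (connect, submit, get results, bind, disconnect) can be sketched as a small session protocol. The sketch is in Python, not IDL, and does not reproduce the interface being defined in the merged IBM and ODMG submission. The names `OQSFactory`, `OQSSession`, and their methods are hypothetical, and a Python callable stands in for a query string.

```python
class OQSSession:
    """One connection to a query service over named collections."""

    def __init__(self, collections):
        self._collections = collections
        self._results = []
        self._open = True

    def submit(self, collection_name, predicate):
        # Step 2: submit a query request.
        if not self._open:
            raise RuntimeError("session is closed")
        source = self._collections[collection_name]
        self._results = [x for x in source if predicate(x)]

    def results(self):
        # Step 3: get the results of the last query.
        return list(self._results)

    def bind(self, namespace, variable):
        # Step 4: bind the results to a variable in the caller's environment.
        namespace[variable] = self.results()

    def disconnect(self):
        # Step 5: disconnect; further requests are rejected.
        self._open = False


class OQSFactory:
    """Creates sessions, playing the role of the OQS-Factory object."""

    def __init__(self, collections):
        self._collections = collections

    def connect(self):
        # Step 1: connect to the query service.
        return OQSSession(self._collections)


factory = OQSFactory({"parts": [3, 8, 12]})
session = factory.connect()
session.submit("parts", lambda p: p > 5)
env = {}
session.bind(env, "big_parts")
session.disconnect()
```

Keeping the session object separate from the factory is what allows an application to hold connections to more than one OQS at a time.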

3 Resolution of Technical and Non-Technical Issues

This section explains how the OQL[IDL] query service addresses the technical issues in Sections 2, 4, and 8, and Appendix B.3 of OMG OSTF RFP #4.

Issues from OMG OSTF RFP #4 Section 4.3

• "Object Services interfaces shall be object-oriented and shall be expressed in IDL."

OQL[IDL] operates on IDL. Queries are written in the form of language statement constructs, not methods or functions or argument strings, to preserve the SQL statement heritage. This is compatible with OQL[C++]. The mapping to OQL[Smalltalk] may require a different approach to match Smalltalk syntax but is not provided in this specification.

• "Proposed extensions to IDL, CORBA, and/or OMG Object Model shall be identified."

See Sections 5 and 6 and Appendices F and G.

• "Operation sequencing shall be included where applicable."

See the description of the Object Query Service Functional Interface in Appendix A.

• "Object Service Specifications shall not contain implementation descriptions."

The need for interfaces to the Object Collection Indexing Service and the Implementation Repository Service to support query optimization is identified, but the interfaces are not defined.

• "Specifications shall be complete."

See the appendices.

• "Object Services shall have precise descriptions."

See the appendices.

• "Independence and modularity of object services."

The Object Query Service is a software "building block." It can be used with or without persistence and with or without several other services. See Service Dependencies, Section 4.

• "Minimize duplication of functionality."

No other service provides operations on collections to return collections. The Query Service is not "bundled" with other separately useful services. Again, see Section 4.

• "No hidden interfaces among object services."

While some interfaces are not provided in this specification (see Sections 1.3 and 4), they are identified.

• "Consistency among object services."

Other object services can be freely composed with the Object Query Service. Again, see Section 4. OQS is CORBA and COSS compliant.

• "Extensibility of individual object services."

Collections on user-defined types are supported with the current specification using the proposed Collection Object Service. In addition to the Collection Object Service, new queriable collection classes could be defined using IDL templates. New indices can be defined. New query algebras can be defined. Rules for distributing queries can be defined. Interfaces for defining these algebras and rules, and registering them with the query optimizer, are not provided in this specification.

• "Extending the collection of object services."

Other query services (e.g., a Business Rules Service) could be added to the OMG OSA without affecting the Object Query Service.

• "Configurability."

An OMG OSA-based system can be configured with or without a Query Service. In fact, this can be viewed as a primary difference between today's monolithic OODB and RDB systems and future systems built using Object Services Architectures. OODBs often only provide persistence and do not provide a query service. RDBs often monolithically require a query service. The specification we propose works with or without other services like persistence.

• "Consistency with other COSS services"

As far as we can determine.

• "Conventions and Guidelines"

As far as we can determine.

• "Mandatory versus optional interfaces"

The Object Query Service itself is optional but could be configured as mandatory. In the near term, it might make sense to make support for queries on Sets mandatory for OMG OQSs and support for queries on other collection types optional. The Object Query Service has a mandatory interface to the Collections Object Service. The Object Query Service has optional interfaces to the Collections Indexing Object Service. See also Section 4.

• "Constraints on object behavior"

Operations in WHERE clauses should not change state. This gives rise to an issue: given encapsulation, how can we guarantee that operations in predicates do not have side-effects? Some possible solutions are described in Section 6.

• "Integration with future object services"

It is expected that the Object Query Service will operate with future services like Change Management and Replication. It may take some work to determine if the Object Query Service should be combined with an Integrity Constraint or Trigger or Business Rules Service but these services have not yet been proposed.
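The constraint noted above, that operations in WHERE clauses should not change state, can be checked at run time in a crude way by comparing an object's state before and after the predicate runs. The Python sketch below illustrates that idea only. It is not one of the solutions described in Section 6, it inspects instance state directly and so would not respect IDL encapsulation, and the names `checked_select`, `Account`, and `withdraw_and_test` are hypothetical.

```python
import copy


def checked_select(collection, predicate):
    """Select elements, rejecting predicates that modify the element."""
    selected = []
    for obj in collection:
        before = copy.deepcopy(vars(obj))  # snapshot of the instance state
        keep = predicate(obj)
        if vars(obj) != before:
            raise RuntimeError("predicate changed object state")
        if keep:
            selected.append(obj)
    return selected


class Account:
    def __init__(self, balance):
        self.balance = balance


def withdraw_and_test(account):
    # A mutator disguised as a predicate: it has a side effect.
    account.balance -= 1
    return account.balance > 10


accounts = [Account(5), Account(50)]
rich = checked_select(accounts, lambda a: a.balance > 10)  # observer: accepted
```

Snapshot comparison costs a copy per element, which is one reason declarative annotations on operations are attractive as an alternative.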

Issues from OMG OSTF RFP #4 Section 4.4

• "Proof of Concept Statement" - At Texas Instruments, we have designed and implemented the Open OODB Toolkit, a component DBMS system architected as a collection of object services [Wells et al. 1992]. The implementation demonstrates an Object Services Architecture for C++ and Common Lisp. At present, we are working to also support OMG IDL. The system is designed as separable modules that can be composed to provide an OODB or an OODB-RDB. Module boundaries are consistent with the OMG Object Services Architecture.

An implementation of OQS for OQL[C++], as described in Appendix D, has been developed and field tested at 25 alpha sites. The OQS service implementation operates on transient and/or persistent C++ objects. A skeletal query optimizer generates query execution plans. We have a skeletal query execution engine but more work is needed here.

The Open OODB Toolkit is expected to be available from Texas Instruments in source code in 1Q 1995. It will not be a fully supported "TI product" but will instead be available on an "as is" unsupported basis. End User and Commercial Developer licenses are expected to be modestly priced.

Issues from OMG OSTF RFP #4 Section 4.5

• "Reliability" - Reliability depends on implementation and design. We consider "safety" of the design and note that the design of OQL[IDL] does not require breaking IDL encapsulation.

• "Performance" - Interfaces for indexing and optimization are identified but not defined in the current specification.

• "Scalability" - Section 1.3 describes a scalable federated query service.

• "Portability" - OQS is designed to be portable. More work would be required to tune an OQS query optimizer to use indices supported by other systems unless these interfaces are standardized.

As evidence for portability, our Open OODB OQS[C++] implementation was ported to Versant very quickly (a few days). We believe it could be ported to other OODBs or CORBAs rapidly.

Issues from OMG OSTF RFP #4 Appendix B.3

• "enabling the specification of language bindings to object derivatives of SQL and/or to direct manipulation query languages."

For language bindings, see Appendices B, C, D, and E.

We do not address direct manipulation query languages if by that is meant GUI interfaces to DBMS systems. Clearly such GUI interfaces could be built as applications that interface to OQS and permit end users to specify queries.

• "operations of selection, insertion, updating, and deletion on collections of objects."

Basic operations of selection, insertion, updating, and deletion on collections of objects are handled by the Collection Classes themselves. The Object Query Service provides higher level aggregate operators for SELECT and UPDATE. Syntax like SQL CREATE can also be defined for backwards compatibility with SQL but the Lifecycle Service already provides a way to create and destroy objects including collections.

• "granularity of objects accessed by queries, including good support for high performance access to fine-grained objects."

The IDL C++ mapping and the existence of some DSOM C++ compilers that generate IDL wrappers lead us to believe that IDL can provide high performance access to fine-grained objects. Texas Instruments' OQL[C++] implementation provides efficient, optimized access to fine-grained C++ objects. ODMG uses IDL as an interface language for OODBs. There is no reason to believe CORBAs or CORBA-based query services will only provide low performance access to coarse-grained, heavy-weight objects like spreadsheets.

• "scope of objects accessible in and via collections that are the immediate operands of the query operations."

OQS allows navigation operators (path expressions) in the WHERE clause of SELECT statements. The optimizer in our current OQL[C++] supports this.

• "querying and returning complex data structures."

This is the basic objective of the Object Query Service.

• "operating on user defined collections."

Queries on system- or user-defined collections are supported.

• "operating on other kinds of collections as well as sets."

OQS supports queries on multiple collection types, not just sets.

• "allowing the use of attributes, inheritance, and procedurally-specified operations in the predicate and in the computational results."

OQS supports the formulation of queries involving predicates that refer to the object's interface, inherited behavior, and relationships (via dot notation).

- "allow the use of available interfaces defined by OMG-adopted specifications."

This is fully supported.

- "allow the use of properties when the Property Service is available."

Queries over objects that have Properties will be supported immediately with no extra work. Additional work will be required to optimize queries that use Properties.

- "allowing the use of navigation when the Relationship Service is available, including testing for the existence of a relationship between objects."

Queries over objects that have Relationships will be supported immediately with no extra work. Additional work will be required to optimize queries that use Relationships.

• "The query service should not break encapsulation."

The Query Service respects IDL encapsulation.

4 Service Dependencies

4.1 Mandatory Dependencies

The Object Query Service depends on:

• Collection Object Service (see Appendix G) - the Object Query Service operates on collections to return collections.

The Collection Object Service in turn depends on IDL Templates (see Appendix F).

• Object Lifecycle Service - used to create and delete object instances. This is particularly important in the SELECT clause, where projections to supertypes and types outside the objects' type hierarchies may be created as the result of a JOIN.

4.2 Optional Dependencies

The Object Query Service may take advantage of other object services if they are available. Interfaces to these services are not specified in this document. These services include:

• Collections Indexing Object Service - may be used by the OQS Query Optimizer.

• Object Implementation Repository Service - used for storing optimization information, including information about the cardinality of a collection, location of elements, existence and kind of indices, and predicate selectivity.

• Object Persistence Service - if collections being queried are persistent.

• Object Relationships Service - relationship traversal operations can be used in query service predicates.

• Object Property Service - properties can be used in query service predicates.

• Object Change Management Service - to version collections and elements of collections.

• Object Security Service - to control access to collections or elements.

4.3 Other Services that may depend on the Object Query Service

Other object services can take advantage of the Object Query Service if it is available. These include:

• Naming - to search for names in namespaces

• Change Management - to use queries on temporal indices to locate versions.

• Implementation and Repositories - to search repositories

• Property Service - if OQS is used to query a subtype of list-of-attr-val-pairs


5 Relationship to CORBA

The CORBA 1.1 specification [OMG 91.12.1] provides two separable specifications in one document: a specification for the OMG object model IDL and a specification for location-transparent dispatch, or distribution. IDL is the native object model of the OQS. OQL[IDL] operates on IDL collections to return IDL collections. The reference and dispatch mechanisms of CORBA are the mechanisms OQS uses for accessing the query service, dispatching query language operations to the query service, defining and referring to collections, referencing objects in collections, and returning results. CORBA is also used for federation of distributed query services, as described in Section 1.4, Architecture of Object Query Service.

6 Relationship to the OMG Object Model

The Object Query Service we propose does not require extensions to IDL. It does suggest extending IDL to support Type Templates to make it easier to allow the definition of a collection class library. For compatibility with the design goals of upwards compatibility with ODMG and SQL3, the following IDL changes might be considered during the OMG OSTF RFP #4 merging process. These potential changes are noted here, should be studied further, can be considered separately from each other, and are not part of our current more-RISC-like OQS proposal.

• OBSERVER/MUTATOR keywords - annotate operations and are currently proposed for SQL3. SQL3 annotates operators with these keywords to be able to distinguish whether a method simply accesses or also changes state. Mutator operations in WHERE clauses introduce semantic problems, and not knowing which operators change state can make maintaining indices on a collection problematic.

There are other means of detecting or protecting against changes to elements in collections during queries. These range from unsafe manual methods like "user beware" or "mark-modified" to automated means like parsing code or memory protection-based schemes. Similar mechanisms are also needed for efficiency in transactions on persistent objects where only some objects are modified.

• PUBLIC/PRIVATE/PROTECTED keywords - analogous to C++, these are currently proposed for SQL3. These provide one means of allowing unencapsulated relational tuples in legacy systems to be viewed as a special case of flat, public IDL, providing one path to backwards compatibility with SQL89 and SQL2. The downside is that this extension would directly affect the encapsulation of IDL. An alternative migration path is to argue that RDBMS relations really do encapsulate their state but the ability to separate interface from implementation is not available to RDBMS users except via views.

• EXTENT keyword - proposed by ODMG to automatically add objects to and delete objects from collections as they are created and destroyed. Extents are collections that contain all elements of a type. This seems reasonable as an optional keyword.

However, some DBMS RDB-OODB hybrid systems go a step further and require that extents be maintained for every class, making it uniformly easier to maintain and query sets and optimize in the presence of inheritance and aggregation. Some of these proposals do not provide a way to separately distinguish collections containing just some elements of a type from extents. While we believe it is desirable to support this semantics, we do not believe this should be the only general-purpose semantics as it is in some RDB-OODBMS systems. There are many cases where there is no need to ever reference all members in an extent and, in these cases, maintaining these extents would be inefficient.

• KEY keyword - proposed by ODMG-93 as a mechanism to define identity based on equality of key attributes. This seems reasonable and optional for both sets and extents, but may not be meaningful for arbitrary collection types.
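The KEY idea above, identity defined by equality of key attributes, can be illustrated with a short Python sketch in which two objects with the same key value count as the same element of a set. This illustrates the semantics only and is not ODMG-93 syntax; the `Part` class and its `key` declaration are hypothetical.

```python
class Part:
    key = ("part_no",)  # attributes that define identity for this type

    def __init__(self, part_no, description):
        self.part_no = part_no
        self.description = description

    def _key(self):
        return tuple(getattr(self, name) for name in self.key)

    def __eq__(self, other):
        # Two parts are the same element if their key attributes are equal.
        return isinstance(other, Part) and self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


a = Part(7, "bolt")
b = Part(7, "hex bolt")   # same key, different non-key attribute
c = Part(8, "gear")

parts = {a, b, c}         # a and b collapse into one element
```

For a set this behavior is natural, but for a list that may legitimately hold duplicates it is less clear what a key should mean, which matches the caution expressed above about arbitrary collection types.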

To date, the OMG Object Model Task Force identified an abstract model for extending the core OMG object model with component semantic extensions. This extensibility has not been tested by any OMG community. It is possible that some of the optional keywords above may be useful IDL extensions to meet the needs of richer data modeling and/or the needs of the relational and OODB database communities.
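As an illustration of the EXTENT semantics discussed in this section, the following Python sketch maintains an extent for each class automatically as objects are created and destroyed, and includes subtype instances in the supertype's extent. It is a sketch of the behavior only, not of the ODMG keyword. The base class name `Extensible` and the explicit `destroy` operation are assumptions made for the example.

```python
class Extensible:
    """Base class: every subclass automatically gets its own extent."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.extent = set()  # one extent per class, created at class definition

    def __init__(self):
        # Register in this class's extent and in each superclass's extent.
        for klass in type(self).__mro__:
            if "extent" in vars(klass):
                klass.extent.add(self)

    def destroy(self):
        # Explicit destruction removes the object from all extents.
        for klass in type(self).__mro__:
            if "extent" in vars(klass):
                klass.extent.discard(self)


class Person(Extensible):
    def __init__(self, name):
        super().__init__()
        self.name = name


class Student(Person):
    pass


ann = Person("Ann")
bo = Student("Bo")
people_before = set(Person.extent)    # contains both ann and bo
students_before = set(Student.extent) # contains only bo
bo.destroy()
```

Every creation and destruction touches one set per class in the inheritance chain, which makes visible the maintenance cost that the text above argues should be optional rather than imposed on every class.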

7 Standards Conformance

As described in Section 1, we believe that a unification of Object DBMS and Relational DBMS with OMG-like Object Services Architectures is inevitable and desirable but that poorly interoperating composites are today's fare. To summarize this section on standards conformance, we propose a unified Object Query Service for OMG, ISO/ANSI SQL, and ODMG. The proposed object query service is based on:

• OMG's object model IDL, with potential OMG object model component extensions from ODMG and ISO/ANSI, and

• query commands from ISO/ANSI SQL, again with possible extensions from ODMG.

We will be interested in any merged Object Query Service proposal that results in a unified Object Query Service for OMG, ISO/ANSI SQL, and ODMG.

7.1 Relationship to OMG Standards

Our proposed Object Query Service conforms to the following OMG specifications without change.

• OMG Architecture Guide

• CORBA 1.1

• OMG OMTF Object Model

• OMG OSTF Common Object Services COSS-1 and COSS-2

• OMG IDL C++ Mapping

The proposed Object Query Service recommends that OMG adopt some additional standards or extensions to existing standards that an OMG Object Query Service depends on. These are independently useful and do not themselves depend on the Object Query Service.

• Templates for IDL (see Appendix F)

• Common Collection Classes (see Appendix G)

• Index Service (see Appendix H)

• some optional IDL extensions (Section 4.4)

As proposed, the Object Query Service is a collection of potentially stand-alone object services (not monolithic), supports queries on transient or persistent IDL collections, supports an OQL[C++] binding compatible with the IDL to C++ Mapping [OMG 94-8-2], and supports an SQL-derivative query language as required in OMG OSTF RFP #4.

7.2 Relationship to ISO/ANSI SQL

X3H2 is an ANSI accredited standards committee focused on standardizing database data definition and data manipulation languages, specifically focusing on SQL-based standards. Relational database management systems represent a multi-billion dollar industry. Clearly, OMG must provide for standard interfaces to existing and future SQL-based DBMS systems. In the past, ISO/ANSI completed SQL89, which most relational vendors support, and SQL2 or SQL92, which extended SQL in a number of ways. ISO/ANSI is now working on SQL3, which is expected to be completed in the 1997 time frame. Information on SQL3 is available in [Melton 1994]. A summary of SQL3 is included in [Manola 1994]. Because SQL89 and SQL2 are widely supported by industry and many DBMS applications depend on them, we accept that the OQS must support these. See Appendix E. The remainder of this section considers SQL3.

Like OMG IDL, SQL3 proposes a programming language neutral object model to permit sharing data and behavior across programming environments. Like OMG, the SQL3 object model has been influenced in design by C++ and supports classes, methods, encapsulation, and inheritance. Unlike IDL, SQL3 does not provide separate specifications for an object model and for object services, instead providing a monolithic and increasingly complex specification.8 Unlike IDL, SQL3 is influenced by the need to maintain backwards compatibility with SQL89/SQL2's structural data model. Also, unlike IDL, SQL3 not only defines an object definition language playing the same role as IDL (though with a different syntax and semantics) but also provides a computationally complete programming language for writing database methods (in addition to the ability to interface to external procedures). So, far from being OOPL neutral, SQL3 proposes YAOOPL (yet another OO programming language) for use with persistent data. In addition, ISO/ANSI has a work item to develop a number of standard class libraries for multi-media, geographic information systems, etc. using the SQL3 database programming language.

While it is clearly valuable that SQL move beyond blobs to provide direct support for objects, it is not clear (to us) that there are compelling technical reasons for industry to produce and support dueling object models for distribution and DBMS. It is also not clear that there is a need for YAOOPL or for class libraries written in a database programming language when the OODB community has demonstrated how to support seamless persistence for existing object models C++, Smalltalk, Common Lisp, and others.

There is an opportunity to unify the OMG and ISO/ANSI communities. While there is some common membership between ISO/ANSI and OMG, the effort to unify ISO/ANSI's SQL3 object model and OMG's IDL has only recently begun. OMG IDL is still in the early adoption stage and its OMG Object Model supports the untested idea of component additions to a core object model to meet the special needs of specific communities like the DBMS community. SQL3 is still in development though some SQL3-variant products exist. The future holds two scenarios:

8 Actually, OMG is also guilty since the IDL specification is bundled with the distribution specification as part of the CORBA 1.1 specification. It should be unbundled!

• OMG and SQL3 (object models and services, including the query service) continue to diverge. Vendors supplying infrastructure software will need to develop for both OMG and SQL3 "standards". Then, applications that need both a database and distribution will find themselves mapping objects from an OOPL to IDL and separately to SQL3. The mappings will not be seamless. Instead, at best, the relational community will provide a binding (not necessarily a one-to-one mapping) from IDL to/from SQL3's object model. They might argue that SQL3 is yet-another-OOPL that OMG needs to bind to. This will force application developers to choose what object model to use to natively represent their enterprise information, SQL3 or IDL, and they may need to try to maintain enterprise models in both SQL3 and IDL to get the services of each. This will be expensive.

• OMG and SQL3 communities work to unify their object models, at least semantically. Infrastructure vendors and application developers both benefit from the common object model supported widely in industry. Huge investments in glue logic are avoided; a more stable infrastructure base is put in place to preserve infrastructure investment for the next 10-20 years.

The Object Query Service we are proposing is in the spirit of this latter scenario. We are proposing that OQL[IDL] use the widely adopted query syntax and semantics of SQL, including the SELECT-FROM-WHERE statement, operating on collections of OMG IDL objects.9 The query statement for OQL[IDL] is directly derived from ISO/ANSI SQL. The BNF in Appendix B is SQL89 BNF, the commonly accepted relational query language in the relational DBMS industry. (If this solution approach appears reasonable to ISO/ANSI and OMG, it is expected the BNF for queries can be upgraded with SQL2 and SQL3 query language extensions.) If SQL3 becomes a standard with ADTs not compatible with IDL, then we propose that OMG will have to adopt an OQS subtype for SQL3. However, since SQL3 is not yet a standard, there is a chance it can merge its ADTs into IDL. We propose that OMG wait to adopt an SQL3-specific interface. Thus, we are proposing not only a solution for a specific OMG Object Query Service, but also a solution approach for unifying OMG and ISO/ANSI standards.

9 Actually, we additionally propose that ISO/ANSI separately specify the object model of ADTs and the query language of SQL. Further, we propose that good ideas from SQL3 ADTs (e.g., multi-methods and multiple inheritance) be proposed as extensions to OMG IDL.

7.3 Relationship to ODMG OQL

Object Data Management Group (ODMG) is a consortium of Object Database vendors working to jump-start the standards process for OODBs by providing a specification, ODMG-93 [Cattell 1993], as a de facto standard OODB API. ODMG is affiliated with OMG but is a separate organization (not an OMG SIG or Task Force) and meets separately. The ODMG standard is related to OMG standards in that ODMG's Object Definition Language (ODMG ODL) is derived from OMG IDL and the ODMG specification is recognized by reference in the OMG Persistent Object Service. ODMG is not explicitly architected as a collection of OMG basic object services, though with some straightforward work it could be (unifying OMG and ODMG specifications).

The Object Query Service we propose is related as follows to ODMG's Object Definition Language (ODMG ODL) as described in Chapters 2 and 3 of [Cattell 1993] and ODMG's Object Query Language (ODMG OQL) as described in Chapter 4 of the same document. The object model of ODMG OQL is ODL. ODMG ODL is based on OMG IDL and is an upwards compatible extension of IDL with several additions:

• ODMG ODL defines a specific Class Library that includes Collection Classes.

• ODMG ODL adds optional keywords for EXTENT of a class, permitting but not requiring automated collection maintenance for all instances in a class, and for KEY(S). See Section 6 of this document.

• ODMG ODL adds optional keywords for binary relationships (RELATIONSHIP, INVERSE, and ORDER_BY) to IDL.

Appendix G describes some of OMG's choices for a Collections Object Service. ODMG collection classes are one possibility. The API for OQL[IDL] is reasonably independent of which collection class library is adopted (although an OQS optimizer benefits from this information). Extensions to our OQL[IDL] to support ODL would mainly involve adding specific collection classes and ODMG keywords for extents, keys, and relationships. These extensions would be straightforward for our proposed OQS but are not needed for a basic Object Query Service.

The bigger issue is ODMG's object query language OQL as described in Chapter 4 of [Cattell 1993] and included by reference in the ODMG initial submission to this RFP. ODMG OQL as specified there is both less and more than SQL: less in not basing its (abstract) query syntax directly on SQL BNF; more in providing operations on collections other than just sets, a functional notation that allows queries wherever a collection-valued expression occurs, and reliance on an object's methods to change object state rather than on explicit database commands such as UPDATE. ODMG members and ISO/ANSI are attempting to define a subset of ODMG OQL that is "hard compliant" with SQL Entry Level. We whole-heartedly support this effort for that subset of OQL. How completely this is achieved and how well that subset is integrated with the rest of ODMG OQL is not known at the time of this writing.

Our OQL[IDL] intends to cover the nice additions of ODMG while maintaining backwards compatibility with SQL. That is, it more directly supports SQL query syntax, avoiding arbitrary deviations from it. Where it fails to cover ODMG OQL, it could be extended to support it if the ODMG and ISO/ANSI communities (or OMG) agree on a common standard.

In addition, ODMG Sections 5.4 and 5.6.2.2 provide two different embeddings of OQL into C++. Both mappings are considerably more awkward than that in our proposal for OQL[C++]. Finally, like ISO/ANSI, ODMG presents the query service as an integral part of a monolithic specification. It is clear (to us) that a separate query service is also of value. It would not be difficult for ODMG to adopt a modular object query specification based on OMG OQS. Overall, we are proposing not only a solution for a specific OMG Object Query Service, but also a solution approach for unifying OMG and ODMG's query service standards.


8 Open Technical Issues and Future Work

The specification we have proposed provides a sufficient base for an initial OMG Object Query Service but it is incomplete in several ways and raises some issues. Technical issues that are not resolved in this submission include:

• Which Collections Object Service should OMG select? See Appendix G.

• What is the semantics of SQL queries on collections other than sets? For instance, what result should be returned from a join of a list and a set?

• Do query results return copies or references for objects? If the latter, how is consistency maintained among overlapping sets, especially in the presence of indexing?

• Definition of object selection syntax and semantics to allow projection of objects to an ancestor class while maintaining identity, and to allow creation of new object types with new, consistent (across multiple instantiations) identity.

• SQL3 allows tables with rows with or without object identifiers, in the latter case with objects identified by primary key. Also, it provides vendor-supplied hidden row IDs that are not unique and immutable. Should the OMG Object Query Service attempt to be SQL compatible everywhere?

• SQL3 provides a semantics for null values, which is not considered here.

• SQL3 supports multimethods, public and private state, and mutator keywords. Should OMG and OQL[IDL] support these?

More work is needed in the following areas:

• Templates for IDL (see Appendix F)

• Collections Object Service (see Appendix G)

• Collection Indexing Service (see Appendix H for open issues in indexing collections)

• Interfaces to the OMG Implementation Repository to provide information to the query optimizer.


• Federation of OQS services to provide distributed OQS.

• Different OQL[IDL] to SQL89 and SQL2 Mapping/Binding (see Appendix E)

• OQL[IDL] to OQL[Smalltalk] Mapping/Binding (not provided)

A OQS Functional Interface

The purpose of an OQS Functional Interface is to enable OMG applications and services to interoperate with OQSs. The following subsections outline functionality the OQS must provide.

Environment creation and destruction

These functions allocate and deallocate an environment to be used by an OQS connection (see below). This environment is a buffer or cache to store control information as well as data returned by the OQS. An OQS factory creation function returns a handle to the environment.

Find an OQS

This function allows an application to locate a particular OQS. It returns a handle to the OQS identifier.

Connect and Disconnect

Given that an environment has been allocated and a particular OQS has been found, these functions enable the connection and disconnection of the application with the OQS. The connect function returns a handle to a new connection. It should be possible for an application to establish multiple simultaneous connections with OQSs. The disconnect function terminates a connection.

Query Execution

This is the main OQS processing function. Given a query or update statement, this function causes the execution of the statement by an OQS. After successful execution of this function, the application must call other OQS functions to retrieve the resulting objects.

Bindings

These functions bind the IDL objects returned by the query with object variables in the application environment (e.g., C++).

Iteration

These functions allow applications to iterate over the results of a query one object at a time. This is similar to the role of cursors in SQL and allows query results to take an amount of space larger than the environment initially allocated to the OQS connection.
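The order in which an application would call these functions can be sketched as follows. This is only an illustration in Python under stated assumptions: every name in it (Environment, OQS, Connection, create_environment, find_oqs, REGISTRY) is hypothetical and not part of this submission, and a query is represented by a Python predicate rather than by parsed OQL text.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterator, List


@dataclass
class Environment:
    """Buffer for control information and returned data (hypothetical)."""
    results: List[Any] = field(default_factory=list)


class Connection:
    """One application-to-OQS connection (hypothetical)."""

    def __init__(self, env: Environment, collections: Dict[str, list]):
        self.env = env
        self.collections = collections
        self.open = True

    def execute(self, collection: str, predicate: Callable[[Any], bool]) -> None:
        # Execution only stores the result in the environment; the
        # application retrieves it afterwards through iteration.
        if not self.open:
            raise RuntimeError("connection is closed")
        self.env.results = [o for o in self.collections[collection] if predicate(o)]

    def iterate(self) -> Iterator[Any]:
        # Cursor-like access: one object at a time.
        yield from self.env.results

    def disconnect(self) -> None:
        self.open = False


class OQS:
    """A query service instance holding named collections (hypothetical)."""

    def __init__(self, collections: Dict[str, list]):
        self.collections = collections

    def connect(self, env: Environment) -> Connection:
        return Connection(env, self.collections)


# Assumed registry standing in for whatever naming mechanism locates an OQS.
REGISTRY = {"clinic": OQS({"Ages": [4, 9, 31, 58]})}


def create_environment() -> Environment:
    return Environment()


def find_oqs(name: str) -> OQS:
    return REGISTRY[name]


env = create_environment()                  # environment creation
conn = find_oqs("clinic").connect(env)      # find an OQS, then connect
conn.execute("Ages", lambda age: age < 10)  # query execution
young = list(conn.iterate())                # iteration and binding to a variable
conn.disconnect()                           # disconnect
```

Environment destruction is left to the Python runtime here; an interface in C or C++ would need an explicit deallocation call.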

B BNF for OQL[X]

This section describes how a single query language specification, OQL[X], directly compatible with SQL, can be reused with different object models, specifically OMG IDL, C++, and Relations. OQL[IDL] provides a native query capability for IDL. OQL[C++] makes it easy for C++ programmers to query collections. OQL[Relations] is the same as SQL and is provided to support access to legacy relational database systems. Note that this use of a single query language specification is orthogonal to the need to provide inter-object-model mappings/bindings, e.g., for IDL to C++ (adopted), C++ to IDL, Relations to IDL (see Appendix E), or IDL to Relations. The need for a mapping to SQL3's ADT object model is also described in Appendix E.

The following is the BNF grammar for OQL, a parameterized subset of SQL with the following changes to rewrite rules to accommodate object extensions. The grammar provided should be considered illustrative, since it is not complete and has not been evaluated in a public forum for SQL compliance.

• The "where clause" is generalized to permit boolean combinations of user-defined operations. OQL comparison operators including EQ and GT can appear as usual in SQL. In addition, language-specific comparison operators such as == and > can also appear. Language-specific operators are overloadable as normal in the target language model (e.g., C++ overloading). Overloaded operators must be registered with either the OQS or the Object Indexing Service (proposed in this submission) in order to be optimized across. It is possible to overload the OQL comparison operators and register these overloadings, resulting in more uniformity of operator usage at the expense of somewhat greater complexity; which is preferable seems a matter of taste. Special operations on string, number, date, etc. are no longer needed since operations on user-defined data types subsume this need. (Instead, a built-in class library of such operators could be supported backwards compatibly by SQL.)

• The "from clause" is generalized to support collections. Relations in SQL, which are sets of tuples, are generalized to be collections of objects in OQL. Non-terminals set_ref and set_expression, which appear in the SQL FROM clause, are generalized to be collection_ref and collection_expression in OQL[X]. A particular data model may then re-restrict these generalizations. Specifically, OQL[Relations] would restrict back down to sets of tuples, etc.

SET semantics is used for combining different kinds of collections in the FROM clause and a Set is always the result. Section 1.4 describes a future variant of OQL that returns collection. Appendix 8 describes an issue involving what collection type to return.

• The "project clause" is generalized to permit value_constructors of any type. The type of the value_constructors constructed in the "project clause" determines the type of the query_expression.

• Path_expressions are supported.
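The three generalizations above can be pictured with a small Python sketch. It is illustrative only: the helper `select` and the sample data are hypothetical, and a Python set stands in for the Set that is always returned under SET semantics.

```python
from itertools import product


def select(constructor, collections, where=lambda *objs: True):
    """SELECT constructor(...) FROM collections WHERE where(...), with set semantics.

    collections may mix lists and sets (generalized "from clause"), where is any
    boolean user-defined operation (generalized "where clause"), and constructor
    builds the result objects (generalized "project clause").
    """
    return {
        constructor(*objs)
        for objs in product(*collections)
        if where(*objs)
    }


readings = [7, 7, 3]   # a list: ordered, with a duplicate
limits = {5}           # a set

above = select(lambda r, limit: r, [readings, limits],
               where=lambda r, limit: r > limit)
```

Because the result is a set, the duplicate reading and the list order are both lost, which is exactly the kind of question raised in the open issue on queries over collections other than sets.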

The same BNF is used in defining OQL[IDL], OQL[C++], and OQL[Relations] (which is the same as SQL). Non-terminals surrounded by "* ... *" (e.g., *path_expression*, *function_expression*, and *value_expression*) are defined in the host object model environment (e.g., IDL, C++, and SQL). The expansion of such non-terminals will thus vary by host language. Of course, SQL is not generic: SQL89 differs from SQL2 entry level, which in turn differs from proposals for SQL3. The SQL subset we include is a subset of SQL89. An OQL89[X] or OQL92-entry-level[X] could be defined analogously. OMG OQL 1.0 should adopt OQL92-entry-level[X].

    OQL89 : manipulative_statement
          | query_expression

    /* UPDATE EXPRESSIONS */

    manipulative_statement : insert_statement ';'
                           | delete_statement ';'
                           | update_statement ';'

    insert_statement : INSERT INTO collection_expression insertion

    insertion : query_expression
              | value_expression

    delete_statement : DELETE from_statement where_clause
                     | DELETE from_statement

    update_statement : UPDATE '(' path_expression_commalist ')' from_statement where_clause
                     | UPDATE '(' path_expression_commalist ')' from_statement

    from_statement : FROM collection_ref

    path_expression_commalist : path_expression
                              | path_expression_commalist ',' path_expression

    /* QUERY EXPRESSIONS */

    query_expression : query_term
                     | query_expression UNION query_term

    query_term : query_spec
               | '(' query_expression ')'

    query_spec : SELECT selection container_expression

    selection : STAR
              | ALL
              | object_ref
              | value_constructor

    value_constructor : IDENTIFIER '(' constructor_parameter_commalist ')'

    constructor_parameter_commalist : constructor_parameter
                                    | constructor_parameter_commalist ',' constructor_parameter

    constructor_parameter : range_variable
                          | range_variable parameter

    parameter : connector function_expression
              | parameter connector function_expression

    container_expression : from_clause
                         | from_clause where_clause

    from_clause : FROM collection_ref_commalist

    collection_ref_commalist : collection_ref
                             | collection_ref_commalist ',' collection_ref

    collection_ref : object_ref IN collection_expression

    object_ref : IDENTIFIER range_variable
               | IDENTIFIER pointer_variable
               | '(' IDENTIFIER ')' pointer_variable
               | '(' IDENTIFIER ')' range_variable
               | range_variable
               | path_expression

    collection_expression : collection
                          | function_expression

    where_clause : WHERE search_condition

    search_condition : boolean_term
                     | search_condition OR boolean_term

    boolean_term : boolean_factor
                 | boolean_term AND boolean_factor

    boolean_factor : boolean_primary
                   | NOT boolean_primary

    boolean_primary : predicate
                    | '(' search_condition ')'

    predicate : comparison_predicate
              | existence_test
              | in_test
              | function_expression

    comparison_predicate : comparison_expression comparison comparison_expression

    comparison_expression : path_expression
                          | value_expression

    comparison : COMP_EQ
               | NE
               | LT
               | LE
               | GT
               | GE

    existence_test : EXISTS subquery

    in_test : path_expression IN subquery
            | range_variable IN subquery

    subquery : '(' query_spec ')'

    pointer_variable : STAR range_variable

    range_variable : IDENTIFIER

    collection : range_variable
               | path_expression

    path_expression : range_variable *connector* function_expression
                    | path_expression *connector* function_expression

    function_expression : *function_expression*

    value_expression : *value_expression*
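To make the precedence implied by the search_condition rules concrete (NOT binds tighter than AND, which binds tighter than OR), the following Python sketch evaluates that fragment of the grammar by recursive descent. It is an illustration, not a proposed implementation: predicates are reduced to names looked up in a dictionary of truth values, and the class and function names are hypothetical.

```python
import re


def tokenize(text):
    # Parentheses and words; OR, AND and NOT are the only keywords.
    return re.findall(r"[()]|\w+", text)


class SearchCondition:
    def __init__(self, text, env):
        self.tokens = tokenize(text)
        self.pos = 0
        self.env = env  # maps a predicate name to its truth value

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def search_condition(self):
        # search_condition : boolean_term | search_condition OR boolean_term
        value = self.boolean_term()
        while self.peek() == "OR":
            self.take()
            rhs = self.boolean_term()  # always parse the right operand
            value = value or rhs
        return value

    def boolean_term(self):
        # boolean_term : boolean_factor | boolean_term AND boolean_factor
        value = self.boolean_factor()
        while self.peek() == "AND":
            self.take()
            rhs = self.boolean_factor()
            value = value and rhs
        return value

    def boolean_factor(self):
        # boolean_factor : boolean_primary | NOT boolean_primary
        if self.peek() == "NOT":
            self.take()
            return not self.boolean_primary()
        return self.boolean_primary()

    def boolean_primary(self):
        # boolean_primary : predicate | '(' search_condition ')'
        if self.peek() == "(":
            self.take()
            value = self.search_condition()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        name = self.take()
        if name is None or name not in self.env:
            raise SyntaxError(f"unknown predicate: {name!r}")
        return self.env[name]


def evaluate(text, env):
    parser = SearchCondition(text, env)
    value = parser.search_condition()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token: {parser.peek()!r}")
    return value


truth = {"a": True, "b": False, "c": False}
```

With these truth values, "a OR b AND c" is read as a OR (b AND c), while "(a OR b) AND c" needs the explicit parentheses allowed by boolean_primary.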

C Object Query Language OQL[IDL]

This section describes the binding of the OQL language, defined in Appendix B, to the OMG Interface Definition Language (IDL), i.e., OQL[IDL].

Overview

We describe an object query language called OQL[IDL] that supports associative query and update on collections of IDL objects. The design of OQL[IDL] is based on the premise that object model features and query statement syntax are two orthogonal issues in the design of an object query language. Specifically, it is possible to define an object query language for any of the existing object models such as C++, Smalltalk, SQL3, ODMG-93 ODL, EXPRESS, etc. The syntax of the query statements may be derived from SQL or different from SQL (e.g., ObjectStore's Query Language [Lamb 1991]) as long as it is declarative. However, given the amount of investment in industry on SQL, an object SQL syntax is reasonable. We refer to this approach to defining object query languages as OQL[X], where X denotes a particular object model. See Appendix B for a BNF definition of OQL[X]. In this section, we propose an object query language where the target object model is IDL.

The following is an example of an OQL[IDL] query that obtains the employee name and department of all employees who are at least 32 years old, work in a department on the third floor, and have received a salary increase after January 1, 1993. The query illustrates the use of abstract data type operators (e.g., > for Date), join and projection (i.e., generation of objects of type Newobject with new identity via a Newobject_Factory object), and data abstraction (i.e., all predicates expressed in terms of the objects' public interface).

    Date aDate(01,01,1993);
    SELECT Newobject_Factory(e.name, d)
    FROM Employee e IN Employees, Department d IN Departments
    WHERE e.department == d && e.age >= 32 && e.last_raise > aDate && d.floor == 3;
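One way to read this query is as a comprehension over the two collections. The Python sketch below is illustrative only: the classes, the sample data, and the function newobject_factory are hypothetical stand-ins for the IDL interfaces and the factory object, and value equality of the frozen dataclasses stands in for object identity.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Department:
    name: str
    floor: int


@dataclass(frozen=True)
class Employee:
    name: str
    age: int
    department: Department
    last_raise: date


@dataclass(frozen=True)
class NewObject:
    name: str
    department: Department


def newobject_factory(name, department):
    # Plays the role of Newobject_Factory in the SELECT clause.
    return NewObject(name, department)


research = Department("Research", 3)
sales = Department("Sales", 1)
departments = {research, sales}
employees = {
    Employee("Ada", 41, research, date(1994, 3, 1)),
    Employee("Bob", 29, research, date(1994, 6, 1)),   # too young
    Employee("Cy", 50, sales, date(1994, 2, 1)),       # wrong floor
    Employee("Di", 35, research, date(1992, 12, 1)),   # raise too early
}

a_date = date(1993, 1, 1)
result = {
    newobject_factory(e.name, d)          # projection through the factory
    for e in employees                    # Employee e IN Employees
    for d in departments                  # Department d IN Departments
    if e.department == d                  # join condition
    and e.age >= 32
    and e.last_raise > a_date             # Date comparison via its own operator
    and d.floor == 3
}
```

Only the employee named Ada satisfies every conjunct, so the result holds a single new object pairing her name with the Research department.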

OQL[IDL] supports queries on semantically different collection types, data abstraction, inheritance, and complex objects. The Object Query Service allows queries on collections of objects regardless of the semantic and physical characteristics of objects such as persistence, distribution, versioning, and time. OQL[IDL] allows the formulation of query predicates in terms of the object's public interface. OQL[IDL] query expressions can operate on semantically different collection types (e.g., set, list, array). Also, the notions of type and type extent are separated, enabling the creation and querying of multiple collections of a type in an application.

More detail

An OQL[IDL] query statement is an extension of the SQL query block [Chamberlin et al. 1976] represented as follows:

    result = SELECT objects FROM range_variable IN collection WHERE predicate;

We adopt the syntax of SQL because it can provide OMG object services users with a well-known model for the formulation of queries that currently enjoys wide use in relational database applications. Also, any syntax of an associative query statement needs to provide a way of specifying the three basic components of the query block above. Therefore, we saw no good reason to design yet another, completely new query syntax that would achieve the same purpose, and instead chose to compatibly evolve the SQL SELECT statement to an "Object SQL".

Exactly as in SQL, the SELECT clause identifies the type of the objects in the collection to be returned by the query. The FROM clause declares the range variables and the target collection to be queried. Several variables ranging over several collections may be declared in this clause (e.g., in the case of joins). The WHERE clause specifies the predicate that defines the properties to be satisfied by the objects to be retrieved. The result of the query is assigned to a collection-valued variable in the program.

We present an overview of OQL[IDL] using example queries on the database schema of Figure 2, which includes the types Person, Physician, Patient, and Medical_Record. The types show only relevant function interfaces used in the examples. We assume the existence of two named Set objects, Patients and Physicians, with the obvious associated interfaces throughout the examples. Also, Figure 2 illustrates List and Set template collection types.

    interface Person {
        Person create(
            in Name name,
            in Address address,
            in Birthdate birthdate
        );
        short int age();
        attribute char sex;
        attribute Name name;
        attribute Address home_address;
        void print();
    };

    interface Medical_Record {
        Medical_Record create(...);
        attribute long int patient_no;
        attribute Date date;
        attribute String diagnosis;
        attribute List lab_tests;
        attribute List x_ray_tests;
    };

    interface Physician : Person {
        Physician create(...);
        readonly attribute String specialty;
        attribute Address office_address;
        attribute String phone;
        void print();
    };

    interface Patient : Person {
        Patient create(...);
        attribute long int ident;
        attribute Physician family_doctor;
        attribute Set records;
        void print();
    };

Figure 2: Interfaces for a clinical database.

A simple example

This query illustrates the role of range variables, inheritance of member functions, and object composition in OQL[IDL]. Example 1. Retrieve the patients who are treated by Dr. J. Smith.

    result = SELECT p
             FROM Patient p IN Patients
             WHERE p.family_doctor.name == "J. Smith";

In the query, p is declared in the FROM clause as a range variable over member objects of the set named Patients of type Set. The function name() used in the WHERE clause is a public member function of Person inherited by Patient and Physician. Inheritance of function interfaces is as defined by IDL. The SELECT clause indicates that the objects returned by the query are Patient objects. The expression p.family_doctor.name, called a path expression, allows navigation through the object composition graph, which enables the formulation of predicates on nested objects. The function family_doctor is of type Physician; therefore, we use the dot notation (following C++ convention) to invoke the function name of Physician. The result of the query is an object of type Set containing the patient objects that satisfy the predicate.

In this example, the variable result is a transient instance of type Set. A programmer may decide to make this transient set persistent at a later time using the Persistence Service interface. The path expression p.family_doctor.name returns an object of type Name. The operator == for the Name class is overloaded by the definer of the Name interface to allow comparison with explicit strings (e.g., "J. Smith") within a query predicate. More sophisticated overloadings of operator == are possible to include, for example, approximate string matching.

More generally, predicates in the WHERE clause can be defined using comparison operators θ ∈ {==, <, <=, >, >=, !=} and logical operators && (AND), || (OR), and ! (NOT). Atomic terms are t1 θ t2, t1 IN s1, s1 CONTAINS s2, t1 θ ALL s1, t1 θ ANY s1, and EXISTS s1, where t1 and t2 are single-valued path expressions or constants, s1 and s2 are sets or set-valued path expressions, and θ is a comparison operator. The atomic terms involving ANY and ALL are used for existential and universal quantification, respectively. A predicate is a Boolean combination of atomic terms.
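The intended reading of the atomic terms can be written down as small Python functions. This is a sketch under assumptions: the function names are hypothetical, and CONTAINS is taken here to mean that every member of s2 is also a member of s1, which the text above does not spell out.

```python
import operator

# The comparison operators theta, keyed by their OQL[IDL] spelling.
THETA = {"==": operator.eq, "<": operator.lt, "<=": operator.le,
         ">": operator.gt, ">=": operator.ge, "!=": operator.ne}


def compare(t1, theta, t2):
    """t1 theta t2"""
    return THETA[theta](t1, t2)


def is_in(t1, s1):
    """t1 IN s1"""
    return t1 in s1


def contains(s1, s2):
    """s1 CONTAINS s2 (assumed: s2 is a subset of s1)"""
    return set(s2) <= set(s1)


def theta_all(t1, theta, s1):
    """t1 theta ALL s1: universal quantification over s1"""
    return all(THETA[theta](t1, x) for x in s1)


def theta_any(t1, theta, s1):
    """t1 theta ANY s1: existential quantification over s1"""
    return any(THETA[theta](t1, x) for x in s1)


def exists(s1):
    """EXISTS s1: true when s1 has at least one member"""
    return len(s1) > 0


ages = {4, 9, 31}
```

A predicate is then an ordinary Boolean combination of such calls, for example `theta_any(5, "<", ages) and not is_in(5, ages)`. Note that ALL over an empty set is vacuously true in this reading.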

Path expressions

Path expressions provide a succinct way for users to formulate predicates in terms of attributes deeply nested in the structure of an object; therefore, they are supported in OQL[IDL]. Note that support for path expressions in OQL[IDL] does not imply any extension to IDL itself; path expressions can be expressed as a nesting of functions at the level of the ORB, and as such become a convenient notation to OQL[IDL] applications. Path expressions may be single-valued or set-valued. In Example 1, p.family_doctor.name is single-valued. The next example shows how set-valued path expressions are handled. This query also illustrates the use of set-valued functions in the FROM clause.

Example 2. Retrieve male patients who have been diagnosed with flu prior to June 5, 1993.

    SELECT *
    FROM Patient p IN Patients
    WHERE p.sex == "male" &&
          EXISTS ( SELECT r
                   FROM Medical_Record r IN p.records
                   WHERE r.date < Date(06,05,93) && r.diagnosis == "flu" );

In this query, the path expression p.records returns an object of type Set. OQL[IDL] requires the programmer to write the query by defining a variable (e.g., r) to range over the members of this set. This introduces the use of nested subqueries. The EXISTS keyword in OQL[IDL] has the same semantics as in SQL. The expressions r.date and Date(06,05,93) are of type Date. The evaluation of the condition involving dates uses the comparison operator < overloaded for Date instances.

Path expressions provide a uniform mechanism for the formulation of queries that involve object composition and inherited member functions. Let p.m1(a1).m2(a2). ... .mn(an) be a path expression, where each ai, 1 ≤ i ≤ n, represents a (possibly empty) list of arguments to the corresponding function mi. To use this path expression in a query, every mi(ai) must be single-valued (enforced during query parsing). If a function mk(ak), 1 ≤ k ≤ n, is set-valued, then the programmer must break the path expression by defining a subquery on the set p.m1(a1).m2(a2). ... .mk(ak). Alternatively, OQL[IDL] could have performed the iteration over mk(ak) transparently to the programmer, but this would have made the semantics of queries more difficult to understand. For now, we opt for a simpler design that requires querying set-valued expressions explicitly. This design decision will be revisited in the future.
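The explicit subquery over a set-valued path expression corresponds to a nested existential test. The Python sketch below mirrors Example 2 under simplifying assumptions: the classes carry only the attributes the query touches, the sample data is invented, and the function name flu_before is hypothetical.

```python
import datetime
from dataclasses import dataclass, field
from typing import List


@dataclass
class MedicalRecord:
    date: datetime.date
    diagnosis: str


@dataclass
class Patient:
    name: str
    sex: str
    records: List[MedicalRecord] = field(default_factory=list)


def flu_before(patients, cutoff):
    # p.records is set-valued, so it is queried through an explicit
    # range variable r inside a nested EXISTS subquery.
    return [
        p for p in patients
        if p.sex == "male"
        and any(r.date < cutoff and r.diagnosis == "flu" for r in p.records)
    ]


patients = [
    Patient("Al", "male", [MedicalRecord(datetime.date(1993, 5, 1), "flu")]),
    Patient("Bea", "female", [MedicalRecord(datetime.date(1993, 5, 1), "flu")]),
    Patient("Cal", "male", [MedicalRecord(datetime.date(1993, 7, 1), "flu")]),
    Patient("Dan", "male", []),
]

result = flu_before(patients, datetime.date(1993, 6, 5))
```

The single-valued steps of a path (such as p.sex) are used directly, while the set-valued step p.records is only reachable through the inner range variable, as the text requires.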

Query blocks involving a single range variable in the FROM clause may use unambiguously the form SELECT * of SQL to indicate that the objects in the result are of the same type as the objects in the set being queried. OQL[IDL] supports data abstraction by the fact that it uses IDL as the object model without breaking IDL encapsulation.

Type of results

The type of the objects in the result of an OQL[IDL] query may be the same as the type of the objects in the set being queried (as illustrated in the examples presented so far), a type that is an ancestor of the type of the objects being queried, or a new type. To obtain objects whose type is a supertype of the type of the objects being queried, OQL[IDL] uses a notation similar to a C++ cast. For example, the following query extracts a subset of objects from the Patients set. The Person objects produced by the query have the same identity as the Patient objects in the set being queried. Only Person member functions can be invoked on objects in the answer set.

    SELECT (Person) p
    FROM Patient p IN Patients
    WHERE p.family_doctor.name == "J. Smith";
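The identity-preserving part of this query can be illustrated in Python, with the caveat that Python has no static upcast: the sketch below shows only that the answer set holds the very same objects, and does not enforce the restriction to Person member functions. The classes and data are hypothetical.

```python
class Person:
    def __init__(self, name):
        self.name = name


class Physician(Person):
    pass


class Patient(Person):
    def __init__(self, name, family_doctor):
        super().__init__(name)
        self.family_doctor = family_doctor


smith = Physician("J. Smith")
jones = Physician("K. Jones")
patients = [Patient("Al", smith), Patient("Bea", jones), Patient("Cal", smith)]

# SELECT (Person) p ...: no copies are made, so every member of the
# result is one of the original Patient objects, viewed as a Person.
result = [p for p in patients if p.family_doctor.name == "J. Smith"]
```

A statically typed binding such as OQL[C++] would additionally give the result the element type Person, so that only Person member functions could be called on its members.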

A query may also return objects of a type that is unrelated to the type hierarchy of the objects being queried. This is necessary to support queries involving projection and join. Creation of new objects relies on the support of the Life Cycle Service. To create a new object the OQS must be able to find and invoke the appropriate factory object for the type of object that must be created as a result of a query. We assume that the appropriate factory object and the parameters required to create an instance of the corresponding object must be specified as part of the SELECT clause of a query. The following query illustrates the use of projection. Example 3. Retrieve only the name and age information of patients less than 10 years old.

interface New_Object {

};

    SELECT New_Object(p.name, p.age)
    FROM Patient p IN Patients
    WHERE p.age < 10;

In this query, New_Object names the factory object required to build the instances of type New_Object, which are the members of the collection resulting from the execution of the query. The results produced by OQL[IDL] queries are always sets of objects. The following query illustrates the use of join.

Example 4. Retrieve all doctors and patients who live on the same street within the same city.

interface Dr_Patient {

};

    SELECT Dr_Patient(d, p)
    FROM Physician d IN Physicians, Patient p IN Patients
    WHERE d.office_address.street == p.home_address.street &&
          d.office_address.city == p.home_address.city;

Again, the query will return a new set of Dr_Patient objects, each of which has new identity. The factory Dr_Patient creates objects of type Dr_Patient. It is up to the user defining Dr_Patient to determine whether the objects in that class will point to the original patient and physician objects or to new copies of these objects. The support of joins is important in a query language to derive relationships among objects based on the values of their properties. The programmer is responsible for defining the types associated with new objects generated by a query.
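The distinction between the new identity of each result object and the references it holds can be sketched in Python. Everything here is an illustrative assumption: the classes are minimal stand-ins for the IDL interfaces, the data is invented, and the factory dr_patient chooses to keep references to the original objects rather than copies, which is one of the two options left to the user.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Address:
    street: str
    city: str


@dataclass(eq=False)  # eq=False keeps identity-based equality
class Physician:
    name: str
    office_address: Address


@dataclass(eq=False)
class Patient:
    name: str
    home_address: Address


@dataclass(eq=False)
class DrPatient:
    doctor: Physician
    patient: Patient


def dr_patient(d, p):
    # Factory: every call yields an object with new identity that
    # points to the original physician and patient objects.
    return DrPatient(d, p)


physicians = [
    Physician("J. Smith", Address("Main St.", "Dallas")),
    Physician("K. Jones", Address("Elm St.", "Dallas")),
]
patients = [
    Patient("Al", Address("Main St.", "Dallas")),
    Patient("Bea", Address("Main St.", "Austin")),
]

result = [
    dr_patient(d, p)
    for d, p in product(physicians, patients)
    if d.office_address.street == p.home_address.street
    and d.office_address.city == p.home_address.city
]
```

Calling the factory again with the same pair produces a distinct object, while both objects refer to the same underlying physician and patient.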

User-defined functions

The next example illustrates the use of user-defined functions in the FROM and WHERE clauses of a query statement. Example 5. Retrieve patients having X-ray exams matching a tuberculosis of the lungs pattern.

    SELECT p
    FROM Patient p IN Patients
    WHERE EXISTS ( SELECT *
                   FROM Medical_Record r IN p.records
                   WHERE EXISTS ( SELECT *
                                  FROM X_Ray x IN r.x_ray_tests
                                  WHERE x_ray_match(x.picture, pattern) ) );

Every medical record of a patient contains a list of X-ray exams for the patient. OQL[IDL] allows queries on sets (e.g., Patients, p.records) and on lists (e.g., r.x_ray_tests). The user-defined (Boolean) function x_ray_match compares a digital representation x.picture of an X-ray with the program variable pattern of type Bitmap representing a typical tuberculosis pattern.
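One way to read the nested EXISTS clauses of Example 5 is as two nested existential tests, the inner one calling the user-defined predicate. The C++ sketch below is illustrative only: Bitmap is reduced to a vector of integers, x_ray_match is a placeholder that tests for equality (a real matcher would be application code), and patients_matching is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

typedef std::vector<int> Bitmap;  // hypothetical stand-in for a real image type

struct X_Ray          { Bitmap picture; };
struct Medical_Record { std::vector<X_Ray> x_ray_tests; };
struct Patient        { std::string name; std::vector<Medical_Record> records; };

// Placeholder for the user-defined Boolean function; exact equality is
// an assumption made only to keep the sketch short.
bool x_ray_match(const Bitmap& picture, const Bitmap& pattern) {
    return picture == pattern;
}

// Each EXISTS (SELECT ... WHERE ...) becomes one std::any_of.
std::vector<Patient> patients_matching(const std::vector<Patient>& patients,
                                       const Bitmap& pattern) {
    std::vector<Patient> result;
    for (const Patient& p : patients) {
        bool exists = std::any_of(
            p.records.begin(), p.records.end(),
            [&pattern](const Medical_Record& r) {
                return std::any_of(
                    r.x_ray_tests.begin(), r.x_ray_tests.end(),
                    [&pattern](const X_Ray& x) {
                        return x_ray_match(x.picture, pattern);
                    });
            });
        if (exists) result.push_back(p);
    }
    return result;
}
```

A patient with no records, or with records that contain no X-rays, never qualifies, which matches the usual meaning of EXISTS over an empty collection.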

Updates

As in SQL, OQL[IDL] also supports the update statements UPDATE, DELETE, and INSERT. UPDATE applies functions to modify the state of a subset of objects specified by a predicate. The public interface to parameterized sets (see Figure 4 in Appendix G) includes member functions insert_member and remove_member, which provide insertion and deletion of single member objects into and from a set, respectively. However, sometimes it is convenient for applications to have the capability to perform bulk insertions and deletions of objects in a set. The DELETE and INSERT statements provide such a capability. The following examples illustrate the use of set-oriented update operations in OQL[IDL].

Example 6. Change the phone number of all physicians whose offices are located in 453 First St., Dallas, TX to 214-444-9999.

UPDATE SET p.set_phone("214-444-9999")
FROM Physician p IN Physicians
WHERE p.address.street == "453 First St."
   && p.address.city == "Dallas"
   && p.address.state == "Texas";

In this query, we assume that set_phone is a member function of the Physician class.

Example 7. Delete all patients of Dr. Smith from the set Patients.

DELETE FROM Patient p IN Patients
WHERE p.family_doctor.name == "J. Smith"

Example 8. Transfer all patients of Dr. Smith in the Patients set suffering from tuberculosis to the new Tuberculosis_Patients set.

INSERT INTO Tuberculosis_Patients
SELECT *
FROM Patient p IN Patients
WHERE p.family_doctor.name == "J. Smith"
   && EXISTS ( SELECT r
               FROM Medical_Record r IN p.records
               WHERE r.diagnosis == "Tuberculosis" );
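The three set-oriented statements can be thought of as predicate-driven bulk operations layered on top of the single-member operations of a collection. The following C++ sketch makes that reading concrete. It assumes std::vector as a stand-in for a set, flattens the record types, and uses the hypothetical names bulk_update, bulk_delete, and bulk_insert; none of these appear in the submission.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical, flattened record types for illustration.
struct Physician { std::string name, street, city, phone; };
struct Patient   { std::string name, family_doctor, diagnosis; };

// UPDATE SET fn(m) FROM ... WHERE pred(m): apply fn to each qualifying member.
template <class T, class Pred, class Fn>
std::size_t bulk_update(std::vector<T>& set, Pred pred, Fn fn) {
    std::size_t n = 0;
    for (T& member : set) {
        if (pred(member)) { fn(member); ++n; }
    }
    return n;
}

// DELETE FROM ... WHERE pred(m): remove every qualifying member.
template <class T, class Pred>
std::size_t bulk_delete(std::vector<T>& set, Pred pred) {
    typename std::vector<T>::iterator new_end =
        std::remove_if(set.begin(), set.end(), pred);
    std::size_t removed =
        static_cast<std::size_t>(std::distance(new_end, set.end()));
    set.erase(new_end, set.end());
    return removed;
}

// INSERT INTO target SELECT * FROM source WHERE pred(m).
// Duplicate elimination in the target is omitted in this sketch.
template <class T, class Pred>
std::size_t bulk_insert(std::vector<T>& target, const std::vector<T>& source,
                        Pred pred) {
    std::size_t before = target.size();
    std::copy_if(source.begin(), source.end(), std::back_inserter(target), pred);
    return target.size() - before;
}
```

Each function returns the number of members affected, which is a convenience of the sketch rather than something the text specifies.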

D Object Query Language OQL[C++]

This section describes the binding of the OQL[X] language, defined in Appendix B, to the programming language C++, i.e., OQL[C++]. Design objectives are:

• preserve the SQL heritage, replacing SQL relations with sets

• directly support the C++ object model

The attached paper (OMG document 94-9-45, an attachment to this specification) describes the OQL[C++] binding. The document [OMG 1994] presents the IDL C++ Language Mapping Specification. By extension, this section presents examples illustrating that OQL[IDL] can be mapped to OQL[C++] (see Appendix D) in a straightforward manner.

Below we provide examples of the OQL[C++] language mapping corresponding to the examples of OQL[IDL] provided in Appendix C. Consider the set of IDL interface specifications described in Figure 2. The corresponding set of C++ classes is given in Figure 3. Consider the queries presented in Examples 1-8 in Appendix C. The corresponding OQL[C++] mappings are listed below. Notice that, to a large extent, the mapping consists of renaming the IDL interfaces referenced in a query to the corresponding C++ member functions according to the IDL C++ Language Mapping Specification (see footnote 10).
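The renaming step can be illustrated on a single path expression. The sketch below assumes a simple rule inferred from the examples in this appendix: an attribute a becomes the accessor call get_a(), except for a caller-supplied list of names (such as age) that map to a() without the prefix. The function name to_cpp_path and the exception list are hypothetical; the authoritative rules are those of the IDL C++ Language Mapping Specification.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Rewrites an OQL[IDL] path such as "p.family_doctor.name" into the
// accessor-call form "p.get_family_doctor().get_name()".
// Assumption: names listed in `unprefixed` map to name() with no "get_".
std::string to_cpp_path(const std::string& idl_path,
                        const std::set<std::string>& unprefixed) {
    std::istringstream in(idl_path);
    std::string segment;
    std::string out;
    bool first = true;
    while (std::getline(in, segment, '.')) {
        if (first) {            // the range variable is left unchanged
            out = segment;
            first = false;
            continue;
        }
        out += ".";
        if (unprefixed.count(segment) == 0) out += "get_";
        out += segment + "()";
    }
    return out;
}
```

For instance, with age in the exception list, the path p.age is rewritten to p.age(), which agrees with the form used in Example 3 of this appendix.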

Example 1. Retrieve the patients who are treated by Dr. J. Smith.

SELECT p
FROM Patient p IN Patients
WHERE p.get_family_doctor().get_name() == "J. Smith";

Example 2. Retrieve male patients who have been diagnosed with flu prior to June 5, 1993.

SELECT *
FROM Patient p IN Patients
WHERE p.get_sex() == "male"
   && EXISTS ( SELECT r
               FROM Medical_Record r IN p.get_records()
               WHERE r.get_date() < Date(06,05,93)
                  && r.get_diagnosis() == "flu" );

class Person {
public:
    Person create( Name&, Address&, Birthdate& );
    int age();
    char get_sex();
    void set_sex( char );
    Name get_name();
    void set_name( Name& );
    Address get_home_address();
    void set_home_address( Address );
    void print();
};

class Medical_Record {
public:
    Medical_Record create( ... );
    long get_patient_no();
    void set_patient_no( long );
    Date get_date();
    void set_date( Date& );
    String get_diagnosis();
    void set_diagnosis( String& );
    List get_lab_tests();
    void set_lab_tests( List& );
    List get_x_ray_tests();
    void set_x_ray_tests( List& );
};

class Physician : public Person {
public:
    Physician create( ... );
    String specialty();
    Address get_office_address();
    void set_office_address( Address& );
    String get_phone();
    void set_phone( String& );
    void print();
};

class Patient : public Person {
public:
    Patient create( ... );
    long get_ident();
    void set_ident( long );
    Physician get_family_doctor();
    void set_family_doctor( Physician& );
    Set get_records();
    void set_records( Set& );
    void print();
};

Figure 3: C++ classes for the clinical database of Figure 2.

Footnote 10: The mappings of IDL to C++ presented in Figure 3 have not been compiled with a running IDL C++ translator. However, it should be clear that, given a correct IDL C++ mapping, the corresponding OQL[IDL] to OQL[C++] mapping is straightforward.

Example 3. Retrieve only the name and age information of patients less than 10 years old.

class New_Object {

};

SELECT New_Object( p.get_name(), p.age() )
FROM Patient p IN Patients
WHERE p.age() < 10;
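To make the role of the factory in this projection concrete, the fragment below sketches how the query might be evaluated by hand in C++. It is an assumption-laden illustration: Patient is cut down to the two accessors used by the query, the static member New_Object::create stands in for whatever factory the Life Cycle Service would locate, and project_younger_than is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified Patient exposing only the accessors used here.
class Patient {
public:
    Patient(const std::string& name, int age) : name_(name), age_(age) {}
    std::string get_name() const { return name_; }
    int age() const { return age_; }
private:
    std::string name_;
    int age_;
};

// Result type of the projection. create() plays the role of the factory
// named in the SELECT clause (an assumption about the factory's shape).
struct New_Object {
    std::string name;
    int age;
    static New_Object create(const std::string& name, int age) {
        New_Object o;
        o.name = name;
        o.age = age;
        return o;
    }
};

// SELECT New_Object( p.get_name(), p.age() )
// FROM Patient p IN Patients WHERE p.age() < limit;
std::vector<New_Object> project_younger_than(const std::vector<Patient>& patients,
                                             int limit) {
    std::vector<New_Object> result;
    for (const Patient& p : patients) {
        if (p.age() < limit) {
            result.push_back(New_Object::create(p.get_name(), p.age()));
        }
    }
    return result;
}
```

The result objects are new objects unrelated to the Patient type hierarchy, as the discussion of projection in Appendix C requires.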

Example 4. Retrieve all doctors and patients who live on the same street within the same city.

class Dr_Patient {

};

SELECT Dr_Patient( d, p )
FROM Physician d IN Physicians, Patient p IN Patients
WHERE d.get_office_address().get_street() == p.get_home_address().get_street()
   && d.get_office_address().get_city() == p.get_home_address().get_city();

Example 5. Retrieve patients having X-ray exams matching a tuberculosis of the lungs pattern.

SELECT p
FROM Patient p IN Patients
WHERE EXISTS ( SELECT *
               FROM Medical_Record r IN p.get_records()
               WHERE EXISTS ( SELECT *
                              FROM X_Ray x IN r.get_x_ray_tests()
                              WHERE x_ray_match(x.get_picture(), pattern) ) );

Example 6. Change the phone number of all physicians whose offices are located in 453 First St., Dallas, TX to 214-444-9999.

UPDATE SET p.set_phone("214-444-9999")
FROM Physician p IN Physicians
WHERE p.get_address().get_street() == "453 First St."
   && p.get_address().get_city() == "Dallas"
   && p.get_address().get_state() == "Texas";

Example 7. Delete all patients of Dr. Smith from the set Patients.

DELETE FROM Patient p IN Patients
WHERE p.get_family_doctor().get_name() == "J. Smith"

Example 8. Transfer all patients of Dr. Smith in the Patients set suffering from tuberculosis to the new Tuberculosis_Patients set.

INSERT INTO Tuberculosis_Patients
SELECT *
FROM Patient p IN Patients
WHERE p.get_family_doctor().get_name() == "J. Smith"
   && EXISTS ( SELECT r
               FROM Medical_Record r IN p.get_records()
               WHERE r.get_diagnosis() == "Tuberculosis" );

E Object Query Language OQL[Relations] (Sketch)

Even with an object query language that operates on OMG Collections, OMG also needs to provide a mapping to relational DBMSs, where much legacy data is stored. Appendix B describes how the SQL specification can be parameterized to be object model independent and then combined with different host object models for IDL, C++, and Relations.

This section assumes that the mappings between OQL[IDL] and SQL (that is, OQL[Relations]) can be reduced to mappings between the IDL and Relations data models, given an OQL[X]. If this independence is achieved cleanly, it will greatly simplify the puzzle of compatibility between existing DBMSs and OMG environments.

The mapping to OQL[Relations] binds the "starred" non-terminals of the OQL[X] grammar from Appendix B to the specific data model targeted. In the case of OQL[Relations], this simply returns the original SQL production rules to the augmented grammar.

SQL89 and SQL2

Much enterprise data is already in relational databases. These DBMSs already implement many OMG Object Services. It is important that OMG provide a standard mapping to allow access to this data. In Appendix C, OQL[IDL] was proposed as a native query service for IDL. In this section we sketch a related, important binding, the OQL[IDL] to SQL Mapping, which provides a binding from OQL[IDL] to RDBMSs. Different mappings with different objectives are possible. This section sketches three possible mappings and is a placeholder for more detailed specifications that are still needed.

• RDBMS Wrapper - assumes data is already represented in relational tables and there is a need to mirror that data in IDL. This is the dominant case and provides access to legacy data. A solution provides a way to define IDL objects whose implementations are RDBMS tables (see below).

• Object Interface to RDBMS - assumes new applications will use IDL in its generality as a semantic modeling language but that the engine that implements object services is a relational DBMS. This mapping is useful for green-field applications that want to use OMG services but also want to depend on mature DBMS implementations.

• Persistent Object Interface to RDBMS - provides persistence implemented using an RDBMS engine, which is used for getting and putting externalized persistent objects, not generally for arbitrary queries, since the externalized objects are opaque to the RDBMS. This is a variant of the Object Interface to RDBMS where the primary goal is to support persistence.

Case 1 and Cases 2 and 3 differ in perspective. In Case 1, the goal is to project existing RDBMS relations onto IDL to make them accessible from CORBA-based environments. In Cases 2 and 3, the goal is to allow the use of IDL as the native perspective and project IDL onto a mature backend implementation based on an RDBMS. It may not be possible to accommodate Cases 1, 2, and 3 in a single mapping that allows free bi-directional interoperability between the CORBA and RDBMS perspectives without awareness of mapping restrictions or peculiarities. Proposals that preserve bi-directional interoperability would be of great interest. In the absence of such proposals, we briefly consider Case 1 and Cases 2 and 3 separately.

Case 1: RDBMS Wrapper

Case 1 is important because so much data already exists in relational databases modeled directly as flat relations, or tables. It needs to be easy to query existing relational DBMSs from IDL. A straightforward mapping given below covers this case and can be automated. But it is special purpose, covering only IDL objects that are implemented in RDBMSs, not all IDL objects. The mapping takes advantage of the point of view that relations in RDBMSs are (multi)sets of objects where the objects are homogeneous in type. Objects in the relational model are then viewed as degenerate IDL objects that do not use inheritance or define behavior (since these modeling capabilities are not available in today's RDBMSs). While attributes in the relational model can be viewed as unencapsulated, they are encapsulated as IDL attributes. In more detail, the basic mapping is:

• For every relation R that is to be accessible from IDL, create an IDL interface type R'.

• For each relational attribute A of relation R, specify IDL attribute A' within the corresponding IDL type R'.

• Define a mapping from SQL basic types to IDL supported basic types.

• OQL[IDL] queries (for IDL types that correspond to relations) map directly to SQL[Relations] queries. The BNF for OQL[Relations] maps directly to SQL BNF.

• Queries are executed in the RDBMS engine. Results may be returned to new or existing IDL types defined using the correspondence rules above. Result IDL types may have been defined beforehand or dynamically.
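The first three steps of the basic mapping lend themselves to automation. The C++ sketch below generates the text of an IDL interface from a relation schema. It rests on several assumptions that are not part of this submission: the Relation and Column structs, the function names, the choice to reuse the relation name for the interface, and above all the small SQL-to-IDL type table, which is only a plausible example of what a standard mapping might specify.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical schema description of a relation R.
struct Column   { std::string name; std::string sql_type; };
struct Relation { std::string name; std::vector<Column> columns; };

// Illustrative SQL-to-IDL basic type table (an assumption, not a standard).
std::string idl_type_for(const std::string& sql_type) {
    if (sql_type == "INTEGER")          return "long";
    if (sql_type == "SMALLINT")         return "short";
    if (sql_type == "REAL")             return "float";
    if (sql_type == "DOUBLE PRECISION") return "double";
    return "string";  // fallback, e.g. for CHAR and VARCHAR
}

// For relation R emit interface R', with one IDL attribute A' per
// relational attribute A. No operations and no inheritance are generated.
std::string idl_interface_for(const Relation& r) {
    std::string out = "interface " + r.name + " {\n";
    for (const Column& c : r.columns) {
        out += "    attribute " + idl_type_for(c.sql_type) + " " + c.name + ";\n";
    }
    out += "};\n";
    return out;
}
```

Given a relation Patient with columns ident (INTEGER) and name (VARCHAR), the sketch produces an interface Patient with attributes "long ident" and "string name".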

One can extend this mapping in interesting ways:

• Behavioral specifications (operations) can be added to the IDL objects that directly correspond to the relations. This provides a place to "hang behavior" on RDBMS relations. Unfortunately, there is often no way to use this behavior in WHERE clauses in OQL[IDL] queries since the underlying RDBMSs do not support user-defined operations in WHERE clauses.

• IDL inheritance can be used to create and model new objects, which are then stored in relations corresponding one-to-one with their IDL definitions. But there may be no analog of the inheritance hierarchy in the RDBMS.

The mapping given above is useful and needed to unlock data in RDBMSs. The principal drawback of this mapping is that application developers may often have to switch back and forth between two not quite parallel environments, and restrictions in one may end up projecting to become seemingly arbitrary restrictions in the other (e.g., applications may not be able to query all IDL collections, just those implemented in RDBMSs, and applications may be restricted in their use of operations in query predicates).

Cases 2 and 3: Object Interfaces to RDBMS

Cases 2 and 3 are interesting and can be combined. We are unaware of commercial system implementations of these cases for IDL, but similar implementations exist for other object models. In particular, HP OpenODB provides an example of Case 2 where a proprietary object model is mapped to backend RDBMSs. Products including Persistence Systems and ParcPlace ObjectWorks provide examples of hybrids of Cases 2 and 3 for C++ to RDBMS and Smalltalk to RDBMS, respectively.

SQL3

ISO and ANSI committees are working to define SQL3, an upward-compatible extension of SQL that supports objects. SQL3's object model is based on Abstract Data Types, or ADTs. There are some identified, significant semantic differences between the OMG IDL object model and the SQL3 object model; e.g., SQL3 supports multimethods and IDL does not, and different inheritance rules are defined in each. Since SQL3's ADT model is not yet "cast in concrete", there is a possibility that X3H2/ISO and OMG could converge their object models, making it much easier for future generations of programmers to operate on enterprise data that needs to be shared, stored persistently in a DBMS, and distributed using CORBA. An ad hoc joint committee of ODMG and X3H2 is meeting to determine if convergence is possible.

It would be possible to define SQL3 in such a way as to separate its query specification language from its object model specification. Presumably, a variant of OQL[ADT] would be equivalent to SQL3. If ADTs and IDL can be unified, then OQL[IDL] will be SQL3. Otherwise mappings are needed between IDL and SQL3, and between OMG's future adopted object query service that operates on OMG collections and SQL3.

F Templates for IDL

In this section, we identify the future need for OMG to define an IDL Templates facility. Templates are generally useful but, in the context of the Object Query Service specification, are needed for the definition of the OMG Collections Object Service.

The definition of IDL in the CORBA 1.1 specification [OMG 91.12.1] does not provide a general facility for Templates comparable to that of C++. CORBA section 4.7.3 on "Template Types" lists two built-in template types but provides no general definition for templates. Similarly, the IDL C++ Mapping [OMG 94-8-2] provides no support for templates.

Collection classes (see Appendix G), which an Object Query Service would operate on, naturally require a Template facility for IDL. C++ [Ellis and Stroustrup, 1990], SQL3 [Melton, 1994], and ODMG [Cattell, 1994] all define collections in terms of templates. Collection classes, including sets, multisets, vectors, lists, trees, DAGs, and queues, would be much easier to define using a general IDL template facility (e.g., Set, List).

In the absence of an OMG IDL Template Specification, we propose that OMG adopt the template specification from C++. However, any decision to adopt this standard should be made independently of the Object Query Service since it is a general facility that in no way depends on the Object Query Service. Adoption of an OMG Template specification will affect other OMG specifications, including the CORBA 1.1 specification [OMG 91.12.1], the IDL C++ Mapping [OMG 94-8-2], and the Interface Repository specification [OMG 94-11-7].

For the purposes of the Object Query Service specification, we assume an IDL template facility exists based on the C++ Template facility. The OQL[IDL] query language does not depend on the details of how templates are defined.

G Collections Object Service

The Object Query Service operates on collection objects to return collection objects. Although collections are fundamental to the operation of an OQS, they can also be created and manipulated by applications independently of an OQS. Therefore, they should not be directly part of the OQS specification. They should be specified separately as part of an OMG Collections Object Service.

The OQS must be able to operate on semantically different collection objects such as Set, Multiset, List, Vector or Array, and Tree or DAG. Users could define others (e.g., Queue). Like any other IDL object, a collection object may be persistent or transient; its parts may be local or remote, versioned, or replicated. Since membership in collection objects may be directly maintained by applications, defining a standard interface for collection objects is important.

We propose that OMG collection classes be defined in IDL extended with templates. These OMG collection classes are homogeneous, defined with templates on user-defined IDL types. We discuss the choice of a Common Collections Object Service for OMG below. For the purposes of this document, we use an interface for sets defined by Texas Instruments (Figure 4). This is only for illustration purposes, to indicate the kind of functionality required from collection interfaces. We may choose to adopt the specification supplied by a merged IBM-ODMG submission if it is minimal and consistent with a future OMG Collections Object Service.

A collection interface minimally provides a way to insert and remove objects from the collection, merge two collections to form a new collection of the same kind, test a collection for emptiness, and compare two collections for equal identity and/or contents. The semantics of these operations vary by collection type; SETS have equal contents if they have the same elements, BAGS must have the same elements and element cardinality, and GRAPHS must have the same elements, cardinality, and structure.
These semantic variations often carry corresponding variations in the interfaces; unlike insertion into a SET or BAG, insertion into a LIST or TREE requires an insertion position, which may be absolute (e.g., position 3) or relative (after some particular element). A particular collection type may be homogeneous or heterogeneous, and subtyping of entries may or may not be allowed. However, in all collection types, the operations available on collections are orthogonal to the types contained in the collection.

template <class Type> interface Set {

    int   Is_Empty ();                // Test for empty set.
    int   Is_Persistent ();           // Test persistence status of the set.
    int   Contains_Member (Type&);    // Test for membership in set.
    int   Is_Equal_To (Set&);         // Test for equality between sets.
    int   Is_Subset_Of (Set&);        // Test for set containment.
    int   Cardinality ();             // Cardinality of the set.
    int   Insert_Member (Type&);      // insert,
    int   Remove_Member (Type&);      // remove, and
    int   Delete_Member (Type&);      // remove and delete a member.

    // Set operators:
    Set&  Union (Set&);               // union,
    Set&  Intersection (Set&);        // intersection, and
    Set&  Difference (Set&);          // difference.

    // Set iteration operators:
    Type& Get_Member (Iterator&);     // get member pointed to by iterator,
    void  First_Member (Iterator&);   // set iterator to first member,
    void  Last_Member (Iterator&);    // set iterator to last member,
    void  Next_Member (Iterator&);    // set iterator to next member,
    void  Prev_Member (Iterator&);    // set iterator to previous member,
    int   Is_End_Of_Set (Iterator&);  // test if iteration is complete.

};

Figure 4: Example of a parameterized IDL Set interface template.

Specification of a set of Collection Classes defines a Collections class library. The Object Query Service operates on this class library. A major issue is, on which of the several class libraries that support collections should the OMG Collections Object Service be based? Some of the choices include:

• collection classes based on the proposed X3J16 C++ "Standard Template Library" [Stepanov 1994]

• built-in collection classes Set, Bag, List, Array as defined in ODMG (Section 2.6.3 of [Cattell 1994])

• collection data types TABLE, SET, MULTISET, and LIST as defined in SQL3 [Melton 1994]

• collection data types from other class libraries (e.g., NIHCL, Taligent)

• IDL should support several collection class libraries.

Until a final choice is made, there is substantial work ahead for OMG in selecting, unifying, and converging on a Collections Object Service. Given this state of affairs, we believe it is wise to make the Object Query Service specification as independent as possible of the particular choice of class library. For exposition purposes, we adopt a simple default class library defined in IDL assuming templates. We could equally well have adopted ODMG's class library because it is defined in IDL and also uses templates, but we believe that ODMG may be influenced by market forces to adopt the X3J16 C++ Standard Template Library (STL). Class libraries in IDL corresponding to STL or to SQL3 would likely represent the direction of the much larger C++ and SQL communities.
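To give a feel for the kind of functionality the Set interface of Figure 4 asks for, the following C++ class template sketches a subset of it. This is an illustration under stated assumptions, not the Texas Instruments implementation: members are held in a std::vector and compared with operator==, the set operators return new sets by value rather than by reference, and persistence, Delete_Member, and the iteration operators are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal in-memory sketch of part of the Figure 4 interface.
// Type only needs to be copyable and comparable with operator==.
template <class Type>
class Set {
public:
    bool Is_Empty() const { return members_.empty(); }
    std::size_t Cardinality() const { return members_.size(); }

    bool Contains_Member(const Type& m) const {
        return std::find(members_.begin(), members_.end(), m) != members_.end();
    }
    // Returns false when m is already a member, so duplicates never occur.
    bool Insert_Member(const Type& m) {
        if (Contains_Member(m)) return false;
        members_.push_back(m);
        return true;
    }
    bool Remove_Member(const Type& m) {
        typename std::vector<Type>::iterator it =
            std::find(members_.begin(), members_.end(), m);
        if (it == members_.end()) return false;
        members_.erase(it);
        return true;
    }
    bool Is_Subset_Of(const Set& other) const {
        for (const Type& m : members_)
            if (!other.Contains_Member(m)) return false;
        return true;
    }
    // Sets have equal contents if they have the same elements.
    bool Is_Equal_To(const Set& other) const {
        return Cardinality() == other.Cardinality() && Is_Subset_Of(other);
    }
    Set Union(const Set& other) const {
        Set result(*this);
        for (const Type& m : other.members_) result.Insert_Member(m);
        return result;
    }
    Set Intersection(const Set& other) const {
        Set result;
        for (const Type& m : members_)
            if (other.Contains_Member(m)) result.Insert_Member(m);
        return result;
    }
    Set Difference(const Set& other) const {
        Set result;
        for (const Type& m : members_)
            if (!other.Contains_Member(m)) result.Insert_Member(m);
        return result;
    }
private:
    std::vector<Type> members_;
};
```

Membership tests are linear in this sketch. An implementation intended for query processing would more likely rely on hashing, ordering, or an index.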

H Collection Indexing Object Service

In this section, we identify the future need for the OMG Object Service Task Force to define a Collection Indexing Object Service (CXOS). Such a service is not required for the interface specification of the Object Query Service or the Collections Object Service but is needed for efficient access to elements of collections and for query optimization. For completeness, we briefly describe some aspects of the CXOS.

The purpose of indices is to improve the performance of retrieval from Collections. Collection classes can be indexed in potentially many ways. Indices provide redundant, non-information-bearing, efficient access paths to indexed objects. There is a need to support extensible subtypes of index, including hash, heap, b-tree, r-tree, quad-tree, keyword-in-context, and others. Index objects support operations including creation, destruction, add element, remove element, cursor movement, and others. Also, indices have performance attributes that are often type and implementation-specific.

It is often the case that a collection class is indexed by more than one index. Typically, inter-index consistency maintenance is handled by the collection class owner. In this case, the interface to indices is hidden from the collection class user except for exposing the ability to create and destroy various types of indices.

The CXOS is itself optional. It depends only on the Collections Object Service. The Object Query Service is independent of a CXOS in that queries can be specified and executed without using secondary indices. Query Optimization, whether implemented as a separate service or not, typically depends in part on the CXOS. Thus, an interface from the Object Query Service to the CXOS is optional, not mandatory. (Similarly, the interface from the Object Query Service to the Implementation Repository for access to other forms of performance-related meta data that it might need is also optional.)
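The idea that an index is a redundant access path maintained by the collection owner, with only creation and destruction exposed to users, can be sketched as follows. Everything in this C++ fragment is hypothetical: the PatientCollection class, its member names, and the choice of a hash index on the name attribute are illustrations only and do not describe a proposed CXOS interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct Patient { long ident; std::string name; };

// The collection owns its index and keeps it consistent on insert.
// Users can only create or destroy the index; lookups return the same
// answers either way, so the index carries no information of its own.
class PatientCollection {
public:
    PatientCollection() : indexed_(false) {}

    void insert(const Patient& p) {
        members_.push_back(p);
        if (indexed_) {
            name_index_.insert(std::make_pair(p.name, members_.size() - 1));
        }
    }
    void create_name_index() {
        name_index_.clear();
        for (std::size_t i = 0; i < members_.size(); ++i) {
            name_index_.insert(std::make_pair(members_[i].name, i));
        }
        indexed_ = true;
    }
    void destroy_name_index() {
        name_index_.clear();
        indexed_ = false;
    }
    bool has_name_index() const { return indexed_; }

    // Uses the hash index when present, otherwise scans every member.
    // The order of the returned idents is unspecified.
    std::vector<long> find_by_name(const std::string& name) const {
        std::vector<long> idents;
        if (indexed_) {
            auto range = name_index_.equal_range(name);
            for (auto it = range.first; it != range.second; ++it) {
                idents.push_back(members_[it->second].ident);
            }
        } else {
            for (const Patient& p : members_) {
                if (p.name == name) idents.push_back(p.ident);
            }
        }
        return idents;
    }
private:
    std::vector<Patient> members_;
    std::unordered_multimap<std::string, std::size_t> name_index_;
    bool indexed_;
};
```

Member removal is omitted because it would require repairing the stored positions, which is exactly the kind of consistency maintenance the collection owner would have to take on.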
The CXOS may be bundled as part of an Object Query Service, in which case end users cannot separately manipulate individual indices except to create or destroy them. Since indices are not required by the API to the Object Query Service, we recommend that OMG defer standardizing on a specification for indices as part of the Object Query Service. But we do note the following: given that the Object Query Service operates on user-defined data types, it is desirable to be able to add new types of indices. This requires that a protocol exist to register these with the Query Optimizer. Different implementations of an Object Query Service may or may not support this form of openness.

Open issues involving the CXOS include:

• What are the tradeoffs between providing interoperability standards for a CXOS and leaving this specification to proprietary implementations, as is current practice in the DBMS community?

• Can we define efficient Open Query Optimizers to operate on a standard Collection Indexing Object Service while still providing extensibility? Both the semantics and performance characteristics of query and other operations on different types of collections must become better understood.

• What is the relationship of Collection Indexing to Data Interchange? It would be desirable if we could define file formats for the hundreds of structured file types, treat the definitions as instructions to an indexing service, and then, while leaving the file "in place", permit operations like queries on "objects" in the file.


References

[1] The Common Object Request Broker: Architecture and Specification, Revision 1.1. Digital Equipment Corporation, Hewlett-Packard Corporation, HyperDesk Corporation, NCR Corporation, Object Design, Inc., SunSoft, Inc. Published by Object Management Group and X/Open, OMG Document 91.12.1, December 1991.

[2] IDL C++ Language Mapping Document, Submission by Expersoft, Digital Equipment Corporation, Hewlett-Packard Corporation, IONA Technologies, Ltd., International Business Machines Corporation, Novell Inc., SunSoft, Inc., OMG Document 94-8-2, 3 August 1994. Approved by OMG TC in Dublin.

[3] Interface Repository, Submission by Digital Equipment Corporation, Hewlett-Packard, and SunSoft, OMG Document 94-11-7, 16 November 1994. Approved by OMG TC in Long Branch.

[4] BLAKELEY, J., THOMPSON, C., AND ALASHQUR, A. A Strawman Reference Model for Object Query Languages. Computer Standards & Interfaces 13 (1991), 185-199.

[5] BLAKELEY, J., McKENNA, W., AND GRAEFE, G. Experiences Building the Open OODB Query Optimizer. In Proceedings of the 1993 ACM SIGMOD International Conference on the Management of Data (Washington, D.C., May 1993), pp. 287-296.

[6] ELLIS, M. A., AND STROUSTRUP, B. The Annotated C++ Reference Manual. Addison-Wesley, 1990.

[7] CATTELL, R. G. G. (ED.), The Object Database Standard: ODMG-93, Revision 1.1, Morgan Kaufmann, San Mateo, California, 1994.

[8] MELTON, J. (ED.), ISO/ANSI Working Draft Database, available on line from "ftp://speckle.ncsl.nist.gov/isowg3/dbl/BASEdocs/sql3index.txt" with file names sql3index.ps, sql3partX.Y where X = 1-5 and Y = ps or txt.

[9] MANOLA, F. (ED.), Object Model Features Matrix, Draft, Version 8, X3H7-94-008, ANSI X3H7. Access directly via Mosaic: "http://info.gte.com/ftp/doc/activities/x3h7.html"

[10] STEPANOV, A., AND LEE, M. The Standard Template Library, Hewlett-Packard Laboratories, Doc no: X3J16/94-0140, WG21/N0527, 29 July 1994.

[11] WELLS, D. L., BLAKELEY, J. A., AND THOMPSON, C. W. Architecture of an Open Object-Oriented Database Management System. Computer 25, 10 (Oct. 1992), 74-82.