Deductive Languages: Problems and Solutions MENGCHI LIU University of Regina

Deductive result from the integration of relational database and techniques. However, significant problems remain inherent in this simple synthesis from the language point of view. In this paper, we discuss these problems from four different aspects: complex values, object orientation, higher- orderness, and updates. In each case, we examine four typical languages that address the corresponding issues.

Categories and Subject Descriptors: H.2.3 [Database Management]: Languages— Data description languages (DD); Data manipulation languages (DM); Query languages; Database (persistent) programming languages; H.2.1 [Database Management]: Logical Design—Data models; Schema and subschema; I.2.3 [Artificial Intelligence]: Deduction and Theorem Proving—Deduction (e.g., natural, rule-base); Logic programming; Nonmonotonic reasoning and belief revision; I.2.4 [Artificial Intelligence]: Knowledge Representation Formalisms and Methods—Representation languages; D.1.5 [Programming Techniques]: Object-oriented Programming; D.1.6 [Programming Techniques]: Logic Programming; D.3.2 [Programming Languages]: Language Classifications; F.4.1 [Mathematical Logic and Formal Languages]: Mathematical Logic—Logic and constraint programming General Terms: Languages, Theory Additional Key Words and Phrases: Deductive databases, logic programming, nested relational databases, complex object databases, inheritance, object-oriented databases

1. INTRODUCTION ment of various data models. The most well-known and widely used data model Databases and logic programming are is the relational data model [Codd two independently developed areas in 1970]. The power of the relational data computer science. Database technology model lies in its rigorous mathematical has evolved in order to effectively and foundations with a simple user-level efficiently organize, manage and main- paradigm and set-oriented, high-level tain large volumes of increasingly com- query languages. However, the rela- plex data reliably in various memory tional data model has been found inex- devices. The underlying structure of da- pressive for many novel database appli- tabases has been the primary focus of cations. During the past decade, many research that has led to the develop-

Author’s address: Department of Computer Science, University of Regina, Regina, Saskatchewan S4S 0A2, Canada; email: [email protected]. Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted without fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. © 1999 ACM 0360-0300/99/0300–0027 $5.00

ACM Computing Surveys, Vol. 31, No. 1, March 1999 28 • M. Liu

CONTENTS tion to solve problems by deriving logi- cal consequences. The most well-known 1. Introduction 2. Complex-Value Deductive Languages and widely used logic programming lan- 2.1 LDL guage is [Colmerauer 1985; Kow- 2.2 COL alski 1988; Sterling and Shapiro 1986], 2.3 Hilog which uses the Horn clause subset of 2.4 Relationlog 3. Object-Oriented Deductive Languages first-order logic as a programming lan- 3.1 O-Logic guage and the resolution principle as a 3.2 F-Logic method of inference with well-defined 3.3 ROL model-theoretic and proof-theoretic se- 3.4 IQL mantics [Lloyd 1987]. However, Prolog 4. Schema and Higher-Order Features 4.1 HiLog lacks expressiveness in a number of ar- 4.2 L2 eas. Over the past several years, a num- 4.3 F-logic ber of richer and more expressive logic 4.4 ROL programming languages have been pro- 5. Updates 5.1 Prolog posed and implemented, such as LOGIN 5.2 DLP and Its Extension [Aït-Kaci and Nasr 1986], LIFE [Aït- 5.3 DatalogA and LDL Kaci and Nash 1986], HiLog [Chen et al. 5.4 Transaction Logic 1993], Prologϩϩ [Moss 1994], Gödel 6. Conclusion [Hill and Lloyd 1994], Oz [Smolka 1995], Mercury [Somogyi et al. 1996], XSB [Rao et al. 1997], etc. Important studies on the relations be- more expressive data models and sys- tween logic programming and relational tems have been developed, such as the databases have been conducted for ER model [Chen 1976], RM/T [Codd about two decades, mostly from a theo- 1979], TAXIS [Mylopoulos et al. 1980], retical point of view [Gallaire and SDM [Hammer and McLeod 1981], DA- Minker 1978; Gallaire 1981; Jacobs PLEX [Shipman 1981], SHMϩ [Brodie 1982; Ullman 1982; Maier 1983]. Rela- 1984], Gemstone [Maier et al. 1986], tional databases and logic programming * Galileo [Albano et al. 1985], SAM [Su have been found quite similar in their 1986], Iris [Fishman et al. 1987], IFO representation of data at the language [Abiteboul and Hull 1987], the one for level. They have also been found com- EXODUS [Carey et al. 1988], Orion plementary in many aspects. Relational [Kim 1990a; 1990b], O2 [Deux et al. database systems are superior to the 1991], Jasmine [Ishikawa et al. 1993], standard implementations of Prolog Fabonacci [Albano et al. 1995], ODMG with respect to data independence, sec- [Cattell 1996], and others. ondary storage access, concurrency, re- Logic programming is a direct out- covery, security and integrity [Tsur and growth of earlier work in automatic the- Zaniolo 1986]. The control over the exe- orem proving and artificial intelligence. cution of query languages is the respon- It is based on mathematical logic, which sibility of the system, which uses query is the study of the relations between optimization and compilation tech- assertions and deductions and is for- niques to ensure efficient performance malized in terms of proof and model over a wide range of storage structures. theories. Proof theory provides formal However, the expressive power and specifications for correct reasoning with functionality of relational database premises, while model theory prescribes query languages are limited compared how general assertions may be inter- to that of logic programming languages. preted with respect to a collection of Relational languages do not have specific facts. Logic programming is pro- built-in reasoning capabilities. Besides, gramming by description. It uses logic relational query languages are often to represent knowledge and uses deduc- powerless to express complete applica-

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •29 tions, and are thus embedded in tradi- Aditi [Vaghani et al. 1994], LogicBase tional programming languages, result- [Han et al. 1994], Declare/SDS [Kie- ing in impedance mismatch [Maier bling et al. 1994] etc. See Ramakrish- 1987] between programming and rela- man and Ullman [1995] for a survey of tional query languages. Prolog, on the these deductive database systems. De- other hand, can be used as a general- ductive database systems have been purpose programming language. It can used in a variety of application domains be used to express facts, deductive in- including scientific modeling, financial formation, recursion, queries, updates, analysis, decision support, language and integrity constraints in a uniform analysis, parsing, and various applica- way [Reiter 1984; Sterling and Shapiro tions of transitive closure such as bill- 1986]. of-materials and path problems [Ra- The integration of logic programming makrishman 1994]. They are best and relational database techniques has suited for applications in which a large led to the active research area of deduc- amount of data must be accessed and tive databases [Gallaire et al. 1984; Ceri complex queries must be supported. et al. 1990; Grant and Minker 1992]. However, significant problems remain This combines the benefits of the two inherent in this kind of deductive data- approaches, such as representational base from the language point of view. In and operational uniformity, reasoning this paper, we investigate four problem- capabilities, recursion, declarative que- atic areas. The first one is that such rying, efficient secondary storage ac- deductive databases provide inexpres- cess, etc. The function symbols of Pro- sive flat structures and cannot directly log, which are typically used for support the complex values common in building recursive functions and com- novel database applications. The second plex data structures, have not been is that object orientation offers some of found useful for operating over rela- the most promising ways of meeting the tional databases made up of flat rela- demands of the most advanced database tions. As a result, a restricted form of applications, but it is not clear how to Prolog without function symbols called incorporate object orientation into de- (with negation), with a well- ductive databases to increase their data defined declarative semantics based on modeling capability. The third is that the work in logic programming, has current deductive databases cannot nat- been widely accepted as the standard urally deal with schema and higher- deductive database language [Ullman order features in a uniform framework 1989a; Ceri et al. 1990]. so that a separate, non-logic-based lan- In the past few years, various set- guage is normally provided to specify oriented evaluation strategies specific and manipulate the schema informa- for deductive databases have been the tion. Finally, it is necessary to be able to main focus of extensive research [Ban- update and modify a database, but cilhon et al. 1986; Bancilhon and Ra- there is no immediately obvious way of makrishnan 1986; Beeri and Ra- expressing updates within a logical makrishnan 1991; Ceri et al. 1990; framework. We elaborate these prob- Ioannidis and Ramakrishnan 1988; lems and survey four typical languages Jiang 1990; Mumick et al. 1996; Sacca for each area with consistent examples. and Zaniolo 1987; Ullman 1989a; Ull- There are other significant issues rel- man 1989b] and a number of deductive evant to deductive database languages. database systems or prototypes based These include semantics of logic pro- on Datalog have been reported. These grams with negation such as predicate include Nail [Morris et al. 1986], LOLA stratification and fixpoint semantics [Freitag et al. 1991], Glue-Nail [Derr et [Apt et al. 1988], local stratification and al. 1993], XSB [Sagonas et al. 1994], perfect model semantics [Przmusinski CORAL [Ramakrishnan et al. 1994], 1988], stable model semantics [Gelfond

ACM Computing Surveys, Vol. 31, No. 1, March 1999 30 • M. Liu and Lifschitz 1988; Przymusinski 1990], car͑vehicle͑ny_858778, chevy_vega, 1983͒, modularly stratified semantics [Ross owner͑name͑arthur, keller͒, 1994], well-founded semantics [Gelder getFrom͑dealer͒)) et al. 1991] and alternating fixpoint se- mantics [Gelder 1993]; non-Horn clause car͑vehicle, ͑ny_358735, volkswagen, 1990͒ logic programs such as disjunctive de- owner͑name͑arthur, keller͒, ductive databases [Fernández and getFrom͑owner͑name͑kurt, godel͒, Minker 1992a; Fernández and Minker getFrom͑owner͑name͑norbert, wiener͒, 1992b; Barback 1992; Subrahmanian getFrom͑dealer͒)))))) 1993], hypothetical reasoning [Bonner 1989; Bonner 1990; Chen 1997; Inoue where vehicle, name, owner, getFrom 1994; Sattar and Goebel 1997]; con- are functors that are not interpreted (or straints [Brodsky 1993; Jaffar and Ma- are freely interpreted) because they her 1994; Grumbach and Su 1996], etc. have no prior meanings. However, we believe that a separate Prolog also indirectly supports sets by survey is more appropriate for these using lists. In Prolog, set expressions issues. Indeed, the languages we survey are represented using the built-in pred- here deal only with predicate stratifica- icate setof: tion and fixpoint semantics. Throughout this paper, we follow the setof͑X, P, S͒ Prolog convention and use words start- ing with upper-case letters for variables where P represents a goal or goals, X is and ”_” for anonymous variables. a variable occurring in P, and S is a list whose elements are sorted into a stan- 2. COMPLEX-VALUE DEDUCTIVE dard order with no duplicates and can LANGUAGES be treated as a set. It is read as “The set The relational data model, which im- of instances of X such that P is provable poses the first normal form (1NF) re- is S”. In other words, S ϭ ͕XP͖. striction, has been found to be inexpres- The problems with the setof predicate sive for many novel database in Prolog are that it is based on lists applications, such as engineering de- and that it is not a first-order predicate sign, graphic databases, CAD/CAM, so its semantics is not well-defined. Be- etc., as these applications require sides, the usage of lists as sets is not proper representation and manipulation expressive enough. The simple member- of arbitrarily complex values with inter- ship predicate has to specify details nal structures. The nested relational about implementation, such as how to and complex-value data models [Abite- iterate over the lists. When a predicate boul et al. 1987; Ozsoyogly and Yuan involves more than one set, the program 1987; Roth et al. 1987; Roth 1988; Colby can become quite complicated and non- 1989; Levene and Loizou 1993; Abite- intuitive, which is contrary to the gen- boul and Beeri 1995] developed during eral philosophy of declarative program- the past decade allow tuple components ming [Kuper 1990]. of a relation to be possibly nested For this reason, Datalog has been ex- tuples, sets or even relations. tended in the past several years to in- Datalog is based on 1NF relations and corporate tuple and/or set constructors therefore cannot support nested tuples [Kuper 1990; Beeri 1991; Abiteboul and and sets. In fact, its predecessor Prolog Grumbach 1991; Liu 1998b]. In the can indirectly support nested tuples by meantime, Prolog has also been ex- using functors. For example, we may tended to incorporate general-purpose use the following facts in Prolog to store set constructs and basic operations on information about cars and their own- sets [Dovier et al. 1996; Jayaraman ers: 1992; Hill and Lloyd 1994].

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •31

In this section, we examine four dif- of sets with rules, LDL provides two ferent Datalog extensions: LDL, COL, powerful mechanisms: set enumeration Hilog and Relationlog. and set grouping. With set enumera- tion, a set is constructed by listing all of 2.1 LDL its elements. With set grouping, a set is constructed by defining its elements LDL (Logical Data Language) [Tsur and with a property that they must satisfy. Zaniolo 1986; Naqvi and Tsur 1989; Several examples are given below. Chimenti et al. 1990; Beeri et al. 1991] is the first language to extend Datalog Example 2.2 To derive a relation on to support complex values with well- sets of book titles from the book base defined semantics. It has been imple- relation such that the total price of any mented at MCC. Like Prolog, LDL can three different books does not exceed indirectly support tuples by using func- $100, we can use the following rule with tors. In addition, it directly supports a set enumeration term ͕X, Y, Z͖ in sets. LDL: Example 2.1 Consider the following ͕͑ ͖͒ ͑ ͒ nested relation DeptϪEmployees:3 bookϪdeal X, Y, Z :- book X, Px , book͑Y, Py͒, book͑Z, Pz͒, X Þ Z, Y Þ Z, Dept_Employees Px ϩ Py ϩ Pz Ͻ 100 Dept Employees Name Areas Note that we can specify the cardinal- ity of the sets with set enumeration. Area This feature is not directly supported in cpsc bob db Prolog. ai Example 2.3 Consider the relation joe os parentof represented as facts in LDL as pl follows: db parentof͑bob, pam͒ math sam gt parentof͑bob, tom͒ cm The following set-grouping rule in LDL si groups all parents of a person into a set tom ca and obtains parentsof͑bob, ͕ pam, tom͖͒, where ͗Y͘ is a set-grouping term: We can represent this relation in LDL parentsof͑X, ͗Y͒͘ :- parentof͑X, Y͒ by using the following two facts: Unlike set enumeration, we cannot ͑ deptϪemployees cpsc, specify the cardinality of the sets with ͕empl͑bob, ͕db, ai͖͒, set grouping. empl͑joe, ͕os, pl, db͖͖͒͒ In LDL, programs must be predicate- ͑ deptϪemployees math, stratified (or layered) based on grouping ͕empl͑sam, ͕ gt, cm, si͖͒, and negation in order to have a seman- empl͑tom, ͕ca͖͖͒͒ tics. Given a rule of the form A :- L1,

To support access to elements in a set, ..., Ln, where n Ն 0,ifLi is negated or LDL allows the use of the member pred- A has grouping, then this rule must be icate ͑ʦ͒. To support the construction evaluated after the rules whose heads

ACM Computing Surveys, Vol. 31, No. 1, March 1999 32 • M. Liu have the same predicate symbol as Li, requires us to know the parentsof and for each 1 Յ i Յ n. That is, each Li ancestorsof relations in the body be- must be completely known when we fore we can evaluate the ancestorsof evaluate this rule. Whether or not a relation in the head in the second rule. program is predicate-stratified can be One way around this problem is to in- determined statically based on the pred- troduce a flat intermediate relation icate symbols in the program. There are three significant limita- ancestorof as follows: tions in LDL. First, the grouping mech- anism is restricted to a single rule ancestorof͑X, Y͒ :- parentsof͑X, S͒, Y ʦ S rather than to a set of rules (or a pro- ancestorof͑X, Y͒ :- parentsof͑X, S1͒, Z ʦ S1, gram). ancestorof͑Z, Y͒ Example 2.4 Consider the following ancestorsof͑X, ͗Y͒͘ :- ancestorof͑X, Y͒ facts: Finally, it is cumbersome to access fatherof͑bob, pam͒ deeply nested data and to nest/unnest relations, since we must be procedural motherof͑bob, tom͒ by introducing intermediate variables and/or relations. We cannot use the following grouping rules in LDL to group all parents of a Example 2.6 Consider the LDL facts person into a set and obtain parentsof in Example 2.1 and two queries: ͑bob, ͕ pam, tom͖͒: (1) Find the employee names in the computer science department for parentsof͑X, ͗Y͒͘ :- fatherof͑X, Y͒ employees whose research areas in- parentsof͑X, ͗Y͒͘ :- motherof͑X, Y͒ clude DB. as they actually generate parentsof (2) Find if AI is a research area for ͑bob, ͕ pam͖͒ and parentsof ͑bob, employees. ͕tom͖͒ instead. We have to introduce an We can represent them in LDL as intermediate relation parentof and use follows by starting from the top level it to obtain the desired result, as shown and gradually getting to the level we in Example 2.3. want and introducing intermediate

The second problem is that grouping variables S1 and S2: requires predicate stratification, which makes direct recursive definition of a ? Ϫ deptϪemployees͑cpsc, S ͒, nested relation impossible. 1 empl͑X, S2͒ ʦ S1, Example 2.5 Consider the following db ʦ S2. two rules: ? Ϫ deptϪemployees͑_, S1͒, ͑ ͒ ʦ ancestorsof͑X, ͗Y͒͘ :- parentsof͑X, S͒, Y ʦ S empl _, S2 S1, ai ʦ S2 ancestorsof͑X, ͗Y͒͘ :- parentsof͑X, S1͒, Zʦ S1, ancestorsof͑Z, S2͒, Y ʦ S2 Example 2.7 The nested relation Dept_Employees in Example 2.1 can be Here we intend to define the nested obtained from the following normalized relation ancestorsof directly and recur- relation Dept_Employee_Area by using sively. However, we cannot achieve the nest operation of extended rela- what we intend in LDL as grouping is tional algebra defined in Roth et al. restricted to a single rule and grouping [1988] twice.

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •33

Dept_Employee_Area ber predicate ͑ʯ͒ can be used to access Dept Employee Area sets and to construct sets through cpsc bob db grouping or enumeration. Unlike LDL, cpsc bob ai where sets are unnamed, data functions cpsc joe os in COL are used to name sets. cpsc joe pl Consider the facts in Example 2.4 cpsc joe db math sam gt again. The parentsof relation can be math sam cm defined in COL using the functor par as math sam si follows: math tom ca par͑X͒ ʯ Y :- fatherof͑X, Y͒

In order to obtain Dept_Employees par͑X͒ ʯ Y :- motherof͑X, Y͒ from Dept_Employee_Area in LDL, we parentsof͑X, par͑X͒͒ have to use grouping several times and introduce several intermediate relations The functor par is used to group every as follows: parent Y of a person X into a set de- noted by par͑X͒ and grouping can in- r1͑D, E, ͗A͒͘ :- deptϪemployeeϪarea͑D, E, A͒ volve more than one rule. Note that the ͑ ͑ ͒͒ ͑ ͒ r2 D, empl E, As :- r1 D, E, As unique set of parents associated with X deptϪemployees͑D, ͗Es͒͘ :- r2͑D, Es͒ is named by the data function par͑X͒. In COL, we can group a set recur- 2.2 COL sively using data functions. COL (Complex Object Language) [Abite- Example 2.8 The task shown in Ex- boul and Grumbach 1991] is a typed ample 2.5 can be represented in COL as extension of Datalog that supports com- follows: plex values directly. Unlike LDL, which uses functor objects for tuples indi- anc͑X͒ ʯ Y :- par͑X͒ ʯ Y rectly, COL directly supports tuples, tuple constructors and sets. anc͑X͒ ʯ Y :- par͑X͒ ʯ Z, anc͑Z͒ ʯ Y The nested relation Dept_Employees ancestors͑X, anc͑X͒͒ in Example 2.1 can be represented in COL as follows: The set enumeration of LDL shown in Example 2.2 can be performed in COL deptϪemployees͑cpsc, ͕͓bob, ͕db, ai͖͔, using a 0-ary functor f as follows: ͓joe, ͕os, pl, db͖͔͖͒ deptϪemployees͑math, ͕͓sam, ͕ gt, cm, si͖͔, f ʯ ͕X, Y, Z͖ :- book͑X, Px͒, ͓tom, ͕ca͖͔͖͒ book͑Y, Py͒, book͑Z, Pz͒, X Þ Y, X Þ Z, Y Þ Z, A novel feature of COL is the use of Px ϩ Py ϩ Pz Ͻ 100 interpreted functors called data func- tions in a deductive framework. Data bookϪdeal͑X͒ :- f ʯ X functions play a crucial role in func- tional data models and in many seman- As a result, COL allows grouping over tic data models. Their use in COL pro- several rules recursively with data func- vides natural support for functional tions. However, it is still cumbersome to dependencies in deductive databases. In access deeply nested data in order to LDL, functional dependencies are not nest/unnest relations. supported. To represent the two queries in Ex- In COL, data functions and the mem- ample 2.6 in COL, we still must intro-

ACM Computing Surveys, Vol. 31, No. 1, March 1999 34 • M. Liu duce intermediate functions f and g as The stratification of COL has the fol- follows: lowing limitation: Example 2.9 [Adapted from Kifer Ϫ ͑ ͒ ? deptϪemployees cpsc, f , and Wu [1993]] Consider the following ʯ ͓ ͔ ʯ f X, g , g db program in COL:

? Ϫ deptϪemployees͑_, f͒, f ʯ ͓_, g͔, g ʯ ai person͑peter, ͕bridge͖͒ person͑tom, ͕chess, tennis͖͒ The following rules in COL show how person͑tom, hobby͑peter͒͒ to obtain the nested relation Dept Em- ployees in Example 2.1 from the normal- hobby͑X͒ ʯ Y :- person͑X, S͒, S ʯ Y ized relation Dept_Employee_Area in Example 2.7: The intended meaning of these rules is that Peter’s hobby is bridge and Tom’s f͑D, E͒ ʯ A :- deptϪemployeeϪarea͑D, E, A͒ hobbies include chess and tennis and r1͑D, E, f͑D, E͒͒ every hobby of Peter’s is also a hobby of Tom’s. However, the program is not g͑D͒ ʯ ͓E, As͔ :- r1͑D, E, As͒ stratified and therefore has no meaning.

deptϪemployees͑D, g͑D͒͒ 2.3 Hilog As in LDL, programs in COL must be stratified as well in order to have a Hilog [Chen and Chu 1989; Chen and well-defined semantics. Stratification is Kambayashi 1991] is a typed extension based on predicate and function sym- of Datalog intended to solve some of the bols used in the program. Given a rule problems of LDL. In Hilog, every set or tuple must be associated with a name of the form A :- L , ..., L , n Ն 0,ifL 1 n i that can be viewed as an atom or a is negated or contains a data function, functor of Prolog. The relation Dept_ or A contains a data function, then this Employees in Example 2.1 can be repre- rule must be evaluated after the rules sented in Hilog by using one fact as whose heads have the same predicate or follows: function symbol as Li or A, for each 1 Յ i Յ n. As in LDL, whether or not a deptϪemployees͕ program is stratified can be determined dept͑cpsc, statically based on the predicate and employees͕employee͑bob, function symbols occurring in the pro- areas͕db, ai͖͒, gram. employee͑joe, For the task shown in Example 2.5, areas͕os, pl, db͖͒}), we might like to use the following two dept͑math, rules instead of the rules in Example employees͕employee͑sam, 2.8 in COL: areas͕gt, cm, si͖͒, employee͑tom, anc͑X͒ ʯ Y :- areas͕ca͖͒})}

parents͑X, S1͒, ʯ Note that the names associated with S1 Z, tuples and sets must be unique (unique ͑ ͒ ancestors Z, S2 , name assumption). In the above exam- ʯ S2 Y. ple, we cannot change the name dept employees to employees as the latter is ancestors͑X, anc͑X͒͒. used inside the relation. However, they are not stratified and Sets in Hilog have a special meaning. therefore have no semantics in COL. For example, areas͕db, ai͖ does not

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •35 mean that the set consists of db and ai. ample 2.9 can be expressed in Hilog as Instead, it means that db and ai are follows: elements of the set, and is equivalent to person͑peter, hobbies͕bridge͖͒ areas͕db͖ and areas͕ai͖. With this special meaning, we can directly access person͑tom, hobbies͕chess, tennis͖͒ deeply nested data either by giving ͑ ͕ ͖͒ their nested structure or using the person tom, hobbies X :- unique name associated with them. The person͑peter, hobbies͕X͖͒ two queries in Example 2.6 can then be represented in Hilog directly as follows However, the special treatment of without introducing any intermediate sets in Hilog has its limitations since it symbols: does not allow us to specify explicitly what a set exactly contains. Therefore, ? Ϫ dept͑cpsc, employees͕employee͑X, areas͕db͖͖͒͒ the set enumeration of LDL is not sup- ported. Another problem with Hilog is ? Ϫ deptϪemployees͕ dept͑_, employees͕employee͑_, areas͕ai͖͖͒͒} that the unique-name assumption is not practical for large databases. With the unique-name assumption, The Hilog language is not higher-or- nested terms in Hilog can be directly der in the sense that we cannot use used as atoms. Therefore, the second variables for the names associated with query above can be simply represented the sets and tuples. In Section 4.1, we as follows: discuss a true higher-order language that has a similar name: HiLog. ? Ϫ areas͕ai͖ 2.4 Relationlog The nested relation Dept_Employees in Example 2.1 can be obtained from the Relationlog (Relation LOGic) [Liu 1995; normalized relation Dept_Employee_ 1998b] is another typed extension of Area in Example 2.7 by using just the Datalog with powerful set and tuple rule: constructors. It has been implemented at the University of Regina [Liu and dept͑X, employees͕employee͑Y, areas͕Z͖͖͒͒ :- Shan 1998; Shan and Liu 1998]. Rela- deptϪemployeeϪarea͑X, Y, Z͒ tionlog combines the best features of LDL, COL and Hilog. It directly sup- This set mechanism naturally sup- ports complex values as COL. The ports grouping that involves several nested relation Dept_Employees in Ex- rules. The parentsof relation can be ex- ample 2.1 can be represented in Rela- pressed in Hilog as follows: tionlog in the same way as in COL. Relationlog allows negation, supports parentsof͑X, parents͕Y͖͒ :- fatherof͑X, Y͒ Hilog’s special set treatment with par- ͑ ͕ ͖͒ ͑ ͒ tial set terms, and eliminates the parentsof X, parents Y :- motherof X, Y unique-name assumption. A partial set This set mechanism does not require term has the form ͗X, Y, Z͘ and corre- stratification. The task shown in Exam- sponds to a Hilog set term p͕X, Y, Z͖ ple 2.5 can be represented in Hilog as for some name p. It also supports set follows: enumeration in LDL with complete set terms. A complete set term is of the ͑ ͕ ͖͒ ͑ ͕ ͖͒ ancestorsof X, ans Y :- parentsof X, par Y form ͕X, Y, Z͖ and corresponds to an ancestorsof͑X, ans͕Y͖͒ :- parentsof͑X, par͕Z͖͒, LDL set-enumeration term of the same ancestorsof͑Z, ans͕Y͖͒ form. Unlike set-grouping terms in LDL, The unstratified COL program in Ex- partial set terms in Relationlog can ap-

ACM Computing Surveys, Vol. 31, No. 1, March 1999 36 • M. Liu pear not only in the head but also in the We can obtain the nested relation body of a rule. When a partial set term Dept_Employees in Example 2.1 from appears in the body, it denotes part of a the normalized relation Dept_Employee_ set, as in Hilog. When appearing in the Area in Example 2.7 by using only one head, it is used to group a set as in LDL. rule, as in Hilog: Several rules can be used to group the same set. deptϪemployee͑X, ͓͗Y, ͗Z͔͒͘͘ :- Consider Example 2.4 again. The par- entsof relation can be directly defined in deptϪemployeeϪarea͑X, Y, Z͒ Relationlog as follows: The unstratified COL program in Ex- parentsof͑X, ͗Y͒͘ :- fatherof͑X, Y͒ ample 2.9 can be expressed clearly as a stratified program in Relationlog as fol- parentsof͑X, ͗Y͒͘ :- motherof͑X, Y͒ lows: ͑ ͕ ͖͒ Here two rules are used to group the person peter, bridge same set. person͑tom, ͗chess, tennis͒͘ As in LDL and COL, programs must be stratified in order to have a seman- person͑tom, ͗X͒͘ :- person͑peter, ͗X͒͘ tics. Stratification is based on predicate symbols used in the program. Given a The first fact says that Peter has ex- rule of the form A :- L1, ..., Ln where n actly one hobby, which is bridge. The

Ն 0,ifLi is negated or contains a com- second fact says that Tom’s hobbies in- plete set term, then this rule must be clude chess and tennis. The rule states evaluated after the rules whose head that every hobby of Peter’s is also a hobby of Tom’s. has the same predicate symbol as Li, for Յ Յ Relationlog has the following limita- each 1 i n. As in LDL and also tions. First, while it supports incom- COL, whether or not a program is strat- plete sets by using partial set terms, it ified can be determined statically based does not support incomplete tuples (i.e., on the predicate symbols occurring in tuples with null values), which are com- the program. mon in database applications. For ex- As the use of partial set terms in the ͑ Ќ͒ head does not require stratification, the ample, the tuple fatherof tom, , ancestorsof relation in Example 2.5 can representing that Tom’s father is un- be defined recursively in Relationlog as known, cannot be represented in Rela- follows: tionlog. Therefore, it is not clear what the meanings of recursive rules are ancestorsof͑X, ͗Y͒͘ :- parentsof͑X, ͗Y͒͘ when both incomplete sets and tuples appear. Next, since the complete set ancestorsof͑X, ͗Y͒͘ :- terms in Relationlog cannot contain parentsof͑X, ͗Z͒͘, partial set terms, it is not possible to ancestorsof͑Z, ͗Y͒͘ have a set of which we know the cardi- nality but for which we do not have As in Hilog, deeply nested data can be complete knowledge of some of the ele- directly accessed in Relationlog with ments, such as ͕͗a, b͘, ͕b, c͖͖. Besides, partial set terms. The two queries in there is no easy way to query sets of a Example 2.6 can be represented in Rela- particular cardinality in Relationlog, tionlog directly as follows, without in- such as sets of cardinality Ն 2 or Յ 2. troducing any intermediate symbols: Finally, being a complex-value lan- guage, Relationlog is lacking in its data- ? Ϫ deptϪemployees͑cpsc, ͓͗X, ͗db͔͒͘͘ modeling power, as we discuss in the ? Ϫ deptϪemployees͑_, ͓͗_, ͗ai͔͒͘͘ next section.

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •37

3. OBJECT-ORIENTED DEDUCTIVE to it. The value part of an object is also LANGUAGES called its state [Beeri 1989; Kim 1990a]. Objects Since an object in the real Object-oriented concepts have evolved world is associated with an object iden- in three different disciplines: first in tifier and a value in an object-oriented programming languages, then in artifi- database, a natural question is what an cial intelligence, and then in databases object is in a database. There are two (since the end of the 60’s). Indeed, object kinds of views in object-oriented data- orientation offers some of the most bases. One view is that everything is an promising ways to meet the demands of object; that is, classes, object identifiers, many advanced database applications. atomic values, tuples, and sets are all In this section, we first introduce the objects. Objects are related to each fundamental principles of the object-ori- other through attributes. The attribute ented data models and then examine values of an object can change but not their applications in deductive database the object itself. With this view, at- languages. tributes of objects can be expressed in Object Identity In a value-oriented terms of functions and all attribute val- database, keys, which are values, are ues together form a tuple. This view is used to represent objects in the real used in Iris [Fishman et al. 1987] and world. A fundamental problem with Jasmine [Ishikawa 1993]. This view such a representation of objects is that naturally incorporates functional and keys are subject to updates. Updating a multivalued dependencies. The other key causes a break in the continuity in view is that the pair of object identifier the representation of the object. Fur- and its associated value together is an thermore, care must be taken to update object. This view is used in the object- all tuples and sets that refer to this oriented data models O2 [Lecluse and object to reflect the change. Object-ori- Richard 1989], Orion [Kim 1990a; ented databases permit the explicit rep- 1990b] and ODMG-93 [Cattell 1996]. resentation of real-world objects The first view results in a simpler se- through the use of object identifiers, mantics, which is very important for a which are different from values in the logic-based language, while the second database. A unique object identifier view allows object identifiers to identify within the entire database is assigned not only tuples but also atomic values to each object that can be represented in and sets. the database and this association be- Methods Objects in an object-ori- tween object identifier and object re- ented database are manipulated with main fixed. Unlike values, an object methods. A method has a name, a signa- identifier can be created or destroyed ture, and an implementation. The name but cannot be updated. Two objects are and signature provide an external inter- different if they have different object face to the method. The implementation identifiers even if they have the same is typically written in an extended pro- attribute values. Whether or not the gramming language. user can access object identifiers varies Classes A class is a set of objects that from system to system. have exactly the same internal struc- Complex Values Each object is asso- ture and therefore the same attributes ciated with a value, which may be com- and the same methods. The class to plex. The value associated with the ob- which an object belongs is called the ject can contain object identifiers so immediate, proper or primary class of that two objects can share an object by the object. A class directly captures the referencing the same object identifier. instance-of relationship between an ob- Updates to the values of the shared ject and the class to which it belongs. In object do not affect the objects that refer addition, it provides the basis on which

ACM Computing Surveys, Vol. 31, No. 1, March 1999 38 • M. Liu a query may be formulated. It also de- Encapsulation Encapsulation de- fines the attributes and methods of its rives from the notion of abstract data instances. It is sometimes useful to al- types in programming languages. In an low an object to belong to more than one object-oriented database, objects encap- class. However, in most systems, an ob- sulate data and methods to be applied ject must belong to only one class for on these data. Encapsulation requires performance reasons. In this case, an that information about a given object object is said to have a unique primary can be manipulated only by means of class. As we show shortly, inheritance the methods defined for the object. The makes it possible for an object to belong user or application program cannot di- logically to more than one class. rectly examine or modify the value or Class Hierarchies and Inherit- the methods associated with the object. ance Classes are organized into class Therefore, encapsulation provides an hierarchies, which capture the generali- abstract interface to the object and zation relationship between a class and achieves some kind of logical data inde- its subclasses. Subclasses inherit the pendence. It is a very useful notion for attributes and methods of their super- data modeling. classes. An instance of a subclass logi- cally belongs to its superclasses. Inher- 3.1 O-Logic itance enables us to reduce the The object-oriented approach was first redundancy of the specification while applied to a deductive framework in O- maintaining its completeness. Inherit- logic (Object Logic) [Maier 1986], which ance normally takes place between was in turn based on the logic program- classes, whereas instances are com- ming language LOGIN [Aït-Kachi and pletely “sterile” in the object-oriented Nasr 1986]. It treats object identifiers data models. A subclass can modify the and atomic values as objects that can be attributes/methods inherited. This mod- related to each other through at- ification is called overriding. In addi- tributes. Classes can be used but are tion, some attributes/methods defined not objects. in a superclass may not be applicable to its subclass. In this case, a subclass Example 3.1 The following is an O- must block (or cancel) their inheritance. logic program: Inheritance can be single or multiple. In the case of single inheritance, the sub- joe : employee͓name 3 “Joe”, class hierarchy forms a tree; in the case position 3 prof, of multiple inheritance, the subclass hi- works 3 cs] erarchy forms a DAG. Multiple inherit- ͓ 3 ance is more elegant than single inher- sam : employee name “Sam”, itance, but conflicts may arise when an position 3 inst, attribute/method name is defined in two works 3 cs] or more superclasses. Several ap- ͓ 3 proaches are used to resolve an ambigu- cs : dept name “ComputerScience”, 3 ity. One is user-specified priority, which head joe] relies on the user to define a priority prof : rank͓name 3 “Professor”, among the super classes whose at- pay 3 2000] tributes/methods are inherited. Another one is explicit renaming, which requires inst : rank͓name 3 “Instructor”, that all attributes/methods inherited pay 3 1000] must have distinct names. Overriding can also be used to resolve ambiguities. E : employee͓salary 3 S͔ :- Unfortunately, there is still no consen- E : employee͓position 3 R͔, sus for the underlying semantics of mul- R : rank͓pay 3 S͔ tiple inheritance.

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •39 where employee, dept, rank are well substantiated. It is pointed out in classes, joe, sam, cs, prof, inst are Kifer and Wu [1993] that the correct ͑@ ͒͑@ ͒ object identifiers, and name, position, quantification should be E M ͑? ͒͑@ ͒ works, head, pay, salary are at- P N . Both happen to work here tributes. The facts provide information because E and M functionally deter- about employees, departments, and pay mine N. If the label name is set-valued, rate of professors and instructors. The the two quantifications will yield differ- rule specifies how to infer the salary of ent results. There is no obvious way of each employee. choosing between these two quantifica- tions using only the syntactic structure In O-logic, tuples and sets are not of the rule. The problem is that the allowed. The objects can have arbitrary variable P does not appear in the rule attributes but no structural or behav- body so that the rule is not domain- ioral information can be defined or de- independent [Abiteboul 1995]. clared on their classes and therefore no inheritance is supported. 3.2 F-Logic As rules can be used, a fundamental problem is how to generate objects with An extended O-logic based on O-logic rules. This issue is not properly ad- was first proposed in Kifer and Wu dressed in O-logic. [1993], followed by a more general F- logic (Frame Logic) in Kifer and Lausen Example 3.2 Consider the following [1989] and Kifer et al. [1995] as a solu- rule in O-logic: tion to the problems of O-logic. F-logic has been partially implemented at Uni-

P : interestingϪpair͓employee 3 E, versität Freiburg in Germany [Frohn et manager 3 M]:- al. 1997]. It treats atomic values, object E : employee͓name 3 N, works 3 D͔ identifiers, functor objects, even at- D : dept͓manager 3 M : tribute and classes as objects. It allows employee͓name 3 N͔] the declaration of attributes on classes and supports attribute inheritance. where P, E, M, D, and N are variables, Example 3.3 The following is an ex- and interestingϪpair, employee, and ample of an F-logic program: dept are classes. The intended meaning of this rule is that if the employee’s person͓name f string, department’s manager’s name coincides age f integer, with the employee’s name, then the pair parents ff person] of employee and manager is interesting employee ::person and an object identifier should be cre- joe : employee ated for this pair by using the variable tom : person P that occurs only in the head. pam : person 25 : integer The problem with the rule in Example “Joe”:string 3.2 is that it is not clear how the vari- joe͓name 3 “Joe”, able P should be quantified [Kifer and age 3 25, Wu 1993]. Universal quantification over parents33͕tom, pam͖] P obviously does not make sense. It is suggested in Maier [1986] that P should where person, employee, integer, and be existentially quantified as ͑@E͒ string are classes, while joe, tom, pam, ͑@M͒͑@N͒͑?P͒. However, the argu- 25 and “Joe” are object identifiers. The ment for such a quantification is not symbols “::” and “:” denote subclass rela-

ACM Computing Surveys, Vol. 31, No. 1, March 1999 40 • M. Liu tionship and class membership respec- there is no distinction between classes tively. The statement employee::person and objects, since the domains for says that employee is a subclass of per- classes and objects are the same. They son, joe:employee says that joe is an can be distinguished only from their instance of the class employee, etc. At- occurrences in the program. The at- tribute declaration information cannot tributes applicable to the class person really be strictly separated from at- are declared by using double-shafted ar- tribute value information in F-logic. As rows f and ff , where the former is a result, there is no clear separation for scalar attribute definitions and the between the notions of schema and in- latter for set-valued attribute defini- stance, which is essential for database tions. They are inherited by the sub- systems. class employee. Attribute values of the F-logic is a pure object-oriented de- object joe are represented by using sin- ductive language. It does not directly gle-shafted arrows 3 and 33, where support relationships as do in value- oriented deductive languages discussed the former is for scalar values and the in the previous section. It only supports latter for set values, corresponding to monotonic multiple attribute inherit- double-shafted arrows. ance, and does not support attribute F-logic is a powerful deductive lan- overriding and blocking. guage with a well-defined semantics In F-logic, sets are not treated as ob- compared to O-logic. It supports functor jects and cannot have attributes. Func- objects and sets. It also allows the use of tor objects cannot contain sets. As a functor objects as classes, attributes, result, the object-creation mechanism in and object identifiers. The interesting- F-logic is limited, since we cannot cre- pair rule of O-logic in Example 3.2 can ate objects directly based on some sets. be represented in F-logic as follows: Furthermore, the arguments associated with attributes cannot be sets either. f͑E, M͒ : interestingϪpair The treatment of sets in F-logic is the ͓employee 3 E, manager 3 M͔ :- same as in Hilog. For a statement E : employee͓name 3 M : string, works 3 D͔ bob͓age 3 30͔, 30 is the value of the D : dept͓manager 3 M : employee͓name 3 N͔ attribute age of bob. If a set is involved, the semantics is different. For a state- ͓ 33͕ ͖͔ where f is a functor and the rule infers ment bob children jim, pat , the ͕ ͖ functor objects such as f͑bob, sam͒ as set jim, pat does not mean that it is instances of the class interesting pair. the value of the children of bob. Instead, This object-creation mechanism is it means it is part (subset) of the set also used in the object-oriented deduc- value and therefore it is possible to have tive languages C-logic [Chen and War- another statement such as ren 1989] and LIVING IN A LATTICE bob͓children33͕sam͖͔. The (com- [Heuer and Sander 1993]. plete) value of children of bob is the Unlike O-logic, which only supports union of all such partial values that can simple attributes, F-logic supports pa- be inferred from the program. This spe- rameterized attributes. For example, we cial treatment allows the complete can have the following program in F- value of a set-valued attribute to be logic: inferred in a number of steps using sev- eral rules. The following is an example: person͓childrenϪwith@person ff person͔ bob͓childrenϪwith@liz33͕tom, pam͖͔ X͓ancestors33͕Y͖͔ :- X͓parents33͕Y͖͔ bob͓childrenϪwith@ann33͕ jim͖͔ X͓ancestors33͕Y͖͔ :- X͓parents33͕Z͖͔, F-logic has several drawbacks. First, Z͓ancestors33͕Y͖͔

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •41

As in Hilog, this unique treatment of information about set-valued attribute sets makes it impossible to specify the to be specified by supporting limited complete value of a set-valued attribute. forms of partial and complete set terms For example, if we know that tom and of Relationlog. pam are the two parents of joe,wecan The F-logic program in Example 3.3 only use the statement: can be expressed equivalently in ROL as follows: joe͓parents33͕tom, pam͖͔ Schema which will not prevent additional per- person͓name f string, sons from being inferred as joe’s parent. age f integer, Next, F-logic is an untyped language parents f ͕ person͖] with a complicated syntax and seman- employee isa person tics full of various symbols. However, important features such as typing and unique primary class (the lowest class Facts in the class hierarchy where the object tom : person is an instance of) for objects are not pam : person built into the semantics, but are dealt joe : employee͓name 3 “Joe”, with using axioms. In terms of program- age 3 25, ming, this means that the user should parents 3 ͗tom, pam͘] take care of them by properly including corresponding axioms into the programs where person, employee, integer, and if he wants his program to be typed or objects to have unique primary class. As string are classes, joe, tom, and pam well, the notion of typing expressed in are object identifiers, and ͗tom, pam͘ axioms can only be applied to attribute is partial set term. Unlike F-logic, ROL values rather than functor objects. builds in value classes such as integer, Finally, F-logic does not support be- string, real, etc. The symbols isa and : havioral object-oriented features such are used to denote immediate subclass as methods and encapsulation. relationship and primary class member- ship respectively. The double-shafted 3.3 ROL arrow f is used for both scalar and set-valued attribute definitions and the ROL (Rule-based Object Language) [Liu single-shafted arrow 3 is used for 1996; 1998a] is a recently proposed ob- both scalar and set-valued attribute val- ject-oriented deductive language imple- mented at the University of Regina [Liu ues. If we want to represent that tom et al. 1998]. It is based on F-logic and and pam are the only parents of joe,we solves some of the problems of F-logic can replace the partial set term ͗tom, discussed above. pam͘ with the complete set term Unlike F-logic, ROL is a typed lan- ͕tom, pam͖. guage. Objects and classes are strictly Unlike F-logic, ROL has a simple syn- distinguished and function at two differ- tax and semantics. The semantics build- ent levels: schema and instance. At the sin important object-oriented features, schema level, we declare classes, at- such as typing, multiple inheritance tributes on classes, and subclasses. The and unique primary class. In ROL, func- attribute declarations of a class con- tor objects are required to be typed with strain the instances of the class. At the respect to their class definitions. For instance level, we use facts and rules to provide extensional and intensional in- example, functor objects interesting_ formation about objects. In addition, pair͑bob, sam͒ and list͑1, list͑2, ROL allows both partial and complete list͑3, nil͒͒͒ are instances of the func-

ACM Computing Surveys, Vol. 31, No. 1, March 1999 42 • M. Liu tor class interestingϪpair and Example 3.4 In an international cou- list͑integer, list͒, respectively. rier service address database, we may A novel feature of ROL is that its have three classes address, us_address functor classes generalize classes and (for U.S.A.), cn_address (for Canada), so relations. Functor objects can directly that us_address and cn_address are represent relationships in ROL. They subclasses of address. We may define can have attributes and can be used as the attribute postcode of address on in- attribute values. In addition, functor ob- teger which also applies to instances of jects can contain sets and thus are more us_address. However, the postcode for general than those in F-logic. As a re- cn_address must be defined on string. sult, ROL subsumes predicate-stratified Therefore, cn_address should override Datalog (with negation) and LDL with- out grouping and Relationlog without the inherited attribute declaration of partial set terms as special cases. For postcode from address. This can be example, the following LDL rules are achieved in ROL with the following also ROL rules: class definitions, but is not supported in F-logic: int͑s͑X͒͒ :- int͑X͒ ancestor͑X, Y͒ :- parent͑X, Y͒ address͓..., postcode f integer͔

ancestor͑X, Y͒ :- parent͑X, Z͒, usϪaddress isa address ancestor͑Z, Y͒ cnϪaddress isa address ͓ f ͔ bookϪdeal͕͑X, Y, Z͖͒ :- book͑X, Px͒, postcode string book͑Y, Py͒, book͑Z, Pz͒ Þ Þ Þ In ROL, a subclass can also block the X Y, X Z, Y Z, inheritance from its superclasses to it- Px ϩ Py ϩ Pz Ͻ 100 self and to its subclasses by using the The interesting-pair rule in Example built-in class none. 3.2 can be represented in ROL as fol- Example 3.5 Consider the classes lows: person, french, orphan, and french_or- phan where french, and orphan can be interestingϪpair͑E, M͒ :- defined as subclasses of person and fren- E : employee͓name 3 N, works 3 D͔, ch_orphan as a subclass of both french D : dept͓manager 3 M͔, ͓ 3 ͔ and orphan. If we declare attributes M : employee name N father, mother, and age for person, then Here, a typed functor object interest- it is meaningful for french to inherit or override all attribute declarations from ing_pair ͑E, M͒ that has no separate identifier is used for each interesting- person, but not meaningful for orphan pair object, whereas in F-logic, an un- and french_orphan to inherit father and typed functor object of the class interest- mother attributes. That is, orphan and ing_pair is used as an identifier for the french_orphan should block the inherit- associated interesting pair. ance of attribute father and mother from ROL supports non-monotonic multi- person. This intended semantics is nat- ple attribute inheritance, as discussed urally supported in ROL using the fol- in Borgida [1988], whereas F-logic sup- lowing class definitions: ports only monotonic multiple attribute inheritance. Subclasses in ROL can in- herit attribute declarations of their su- person͓father f person, perclasses and can also override such mother f person, inheritance. age f integer]

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •43

orphan isa person ferent classes not related by the sub- ͓ father f none, class relationship cannot have at- tributes of the same name. mother f none] Although ROL is a deductive lan- french isa person guage that supports both object-ori- ented and value-oriented approaches, frenchϪorphan isa french, orphan its support for the value-oriented ap- proach is limited as it does not support Even if french overrides (more accu- grouping in functor objects. Its partial rately, refines) the attributes father and complete set mechanisms are not as and mother as follows: general as those in Relationlog. Finally, behavioral object-oriented features such as methods and encapsu- french isa person lation are not supported in ROL. ͓ father f french, mother f french] 3.4 IQL the attribute father and mother are still IQL (Identity ) [Abite- blocked in french_orphan because of the boul and Kanellakis 1989] is a typed use of none. If we insist that french_or- deductive database language based on phan have mother nun, then we can use the object-oriented data model O2.Asin the following class definition to override O2, an object in IQL is a pair of object the inheritance blocking: identifier and value. Unlike F-logic and ROL, the value in an IQL object can be frenchϪorphan isa french, orphan not only a tuple, but also a set or an ͓mother f nun͔ atomic value. For example, we can have the following objects for a family in IQL: Another novel feature of ROL is that sets are treated as first-class citizens as ͑o1, ͓name :“Bob”, spouse : o2, values, object identifiers, and functor children : o , phone : o ]) objects, so that sets can also have at- 5 6 tribute. For example, we can have the ͑o2, ͓name :“Liz’, spouse : o1, following class definition and fact in children : o5, phone : o6 ROL: ͑o3, ͓name :“Tom”, phone : o6͔͒ ͕person͖͓count f integer͔ ͑o4, ͓name :“Pam”, phone : o6͔͒ ͕tom, pam͖͓count 3 2͔ ͑o5, ͕o3, o4͖͒

ROL has the following limitations. ͑o6, 1234567͒ First, like F-logic, an object identifier in ROL can directly identify only a tuple In this example, object identifiers o1, but not a set or an atomic value. ROL’s o , o and o identify tuples, o identi- treatment of classes is not general 2 3 4 5 enough as there are only two layers, fies a set, while o6 identifies an atomic objects and classes, and classes cannot value. If this family changes its phone be treated as objects (or meta objects). number or add another child, only one In ConceptBase [Jarke et al. 1995], simple update is needed. there can be infinite layers of objects: In IQL, the value associated with an ordinary objects, classes, metaclasses, object identifier can be accessed by us- etc. Besides, parameterized attributes ing the object identifier dereferencing of F-logic are not supported in ROL. operator ˆ. For example, oˆ1, oˆ5 denote Another problem with ROL is that dif- the tuple ͓name :“Bob”, spouse : o2,

ACM Computing Surveys, Vol. 31, No. 1, March 1999 44 • M. Liu

Object identity in IQL can be used not children : o5, phone : o6͔ and the set only for object sharing and update man- ͕o3, o4͖, respectively. IQL directly supports relations in ad- agement, but also for set grouping as dition to classes. In pure object-oriented data function of COL. data models such as O2 and ODMG-93, Example 3.6 The following rules many-to-many relationships between show how to derive the relation ances- objects are difficult to deal with. But torsof from the relation parentsof using they can be represented in IQL by using object identifiers as grouping in IQL. relations. Indeed, using object identifi- ers for every object is burdensome and temp͑X, O͒ :- parentsof͑X, Y͒ problematic in object-oriented deductive languages, and a pure value-oriented Oˆ ͑Y͒ :- temp͑X, O͒, parentsof͑X, Y͒ approach has been argued to be better in this regard [Ullman 1991]. With di- Oˆ ͑Y͒ :- temp͑X, O͒, Oˆ ͑Z͒, parentsof͑Z, Y͒ rect support for relations, IQL combines the benefits of the relational and object- ancestorsof͑X, Oˆ ͒ :- temp͑X, O͒ oriented approaches. In F-logic and ROL, object identifiers The first rule is used to create an object are simply constants that can be intro- identifier for the set of ancestors for duced explicitly in facts and rules by the each person. The next two rules use the user. In IQL, object identifiers are sys- object identifiers to group all ancestors tem-created, as in object-oriented pro- for each person. The last rule generates gramming languages. They can be cre- the relation ancestorsof by dereferenc- ated only by using rules that have ing object identifiers that identify sets. variables occurring in the head, but not in the body as in O-logic. This may be Note that the rules in the above ex- viewed as a syntactic variation of the ample must be stratified; that is, the new construct in object-oriented pro- last rule should be evaluated after the gramming languages. Unlike O-logic, first three rules. Unlike Datalog (with there is a well-defined semantics for negation), LDL, COL and Relationlog, this object-creation mechanism. in which stratification is automatically The intended interesting_pair rule in determined based on predicate or func- Example 3.2 can be represented in IQL tion symbols, in IQL we may not have as follows: such symbols to use, as shown in the second and third rule in Example 3.6. pair͑E, M͒ :- Therefore, the user of IQL must take employee͑E, N, D͒, care of stratification by using program dept͑D, M͒, composition. The rules in Example 3.6 employee͑M, N, D͒ should be organized into two programs with the first three rules in the first one interestingϪpair͑O, E, M͒ :- pair͑E, M͒ and the last in the second. IQL’s object identifier-creation mech- ˆ O ϭ ͓E, M͔ :- interestingϪpair͑O, E, M͒ anism is also used in object-oriented deductive languages LOGRES [Cacaceet The first rule generates all pairs in al. 1990] and LLO [Lou and Ozsoyoglu the relation pair. The second creates an 1991]. object identifier for each pair with the Like ROL, IQL is a typed language. relation interesting_pair using the vari- There is a clear separation of the no- able O that appears in the head but not tions of instances and schema. Objects in the body. The last rule assigns the and classes are strictly distinguished pair to the object identified by the corre- and function at two different levels: in- sponding object identifier. stance and schema. At the schema level,

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •45 we declare classes and their associated 4. SCHEMA AND HIGHER-ORDER types. The kinds of object identifiers FEATURES that can be created with rules are com- A database usually has a schema that pletely determined by the types associ- provides the description of the database ated with the classes for these objects. structure, constrains data in the data- IQL has the following problems: First, base and query results, and guarantees the object identifier-creation mecha- part of the consistency of the database. nism may lead to infinite loops and does In relational database systems, sche- not support a logic-oriented semantics, matic information is also stored in rela- as discussed in Beeri [1990]. Indeed, tions and the same relational languages two kinds of semantics are used to de- can be used to query not only data but fine IQL, one for the underlying data also the schema. model and one for the IQL language, However, the notion of schema is not which makes the language quite compli- properly supported in deductive data- cated and nonintuitive. base languages, because they are based Second, IQL is so dependent on rela- on the logic programming language Pro- tions that no objects can exist if there log, which is inherently not typed. In- are no relations. Therefore, it is not a deed, the semantics of schema is not pure object-oriented deductive database well defined in most deductive frame- language. works. Another problem is that values such In Datalog, predicates correspond to as integers, strings, tuples, sets, etc., relation names. In order to reason about cannot be treated as objects that have schematic information, we need to use properties in IQL. They can appear only predicate variables that belong to high- in the value part of an object. For a er-order logic. However, higher-order mathematical database, we may store logics have been met with skepticism, integers and their associated attributes since the unification problem is unde- such as odd, even, prime, etc. For exam- cidable. A non-logic-based language has ple, ͑2, ͓even : true, prime : true͔͒. been used to specify the schematic infor- In this case, an integer is identified by mation in a few implemented deductive itself rather than by a separate identi- database systems such as LDL, and fier. The cases for tuples and sets are such information cannot be queried, let the same. alone by using a logic-based language. In addition, separating relations from Prolog actually combines first-order classes suffers the same problems asso- logic, higher-order constructs, and ciated with the entity-relationship meta-level programming in one working model. For example, relationships can system, so that generic predicates such only include classes, while classes can- as transitive closure and sorting can be not include relationships and relation- defined, predicates can be passed as pa- ships cannot include relationships. rameters and returned as values, and Finally, the schema supported by IQL predicates and rules can be queried. is quite limited. It does not support important object-oriented features such Example 4.1 Prolog provides a as attribute inheritance or behavioral built-in binary predicate current_predi- object-oriented features such as meth- cate ͑Name, Term͒ that can be used to ods and encapsulation. Indeed, how to unify Name with the name of a user- incorporate methods and encapsulation defined predicate, and Term with the in an object-oriented deductive frame- most general term corresponding to that work remains an open problem. predicate. Consider the following rules in Prolog:

ACM Computing Surveys, Vol. 31, No. 1, March 1999 46 • M. Liu

relation͑X͒ :- currentϪpredicate͑X, Y͒ can be viewed as sets intensionally. In HiLog, unary parameterized terms can structure͑Y͒ :- currentϪpredicate͑X, Y͒ be viewed as sets in the same way. For example, part of the relation in Exam- The first rule defines a relation to col- ple 2.1 can be represented in HiLog as lect all predicates (relation names) in follows: the database. The second rule defines a relation to collect all predicate struc- areas͑bob͒͑db͒ tures (predicates and their arguments) areas͑bob͒͑ai͒ in the database. areas͑joe͒͑os͒ areas͑joe͒͑pl͒ Unfortunately, the semantics of Pro- areas͑joe͒͑db͒ log is based on first-order logic, which does not have the wherewithal to sup- employees͑cpsc͒͑bob, areas͑bob͒͒ port any of these features. Therefore, employees͑cpsc͒͑joe, areas͑joe͒͒ they have an ad hoc status in logic programming. deptϪemployee͑cpsc, employees͑cpsc͒͒ In this section, we examine four high- er-order languages that can be used to As HiLog does not support extension- reason about schematic information in ally represented sets, we do not classify deductive databases: HiLog, L2, F-logic it as a complex-value deductive lan- and ROL. The first two are value ori- guage. However, variables can be used ented, while the last two are object ori- in place of predicates in HiLog, so that ented. we can use such variables to reason about schematic information. 4.1 HiLog Example 4.2 Consider the database In first-order logic, predicates and func- consisting of three relations: hardware, tions have different roles. Predicates software, and admin, each of which rep- are used to construct atoms, while func- resents the employees in the respective tions are used to construct terms that departments. The tuples in all three are embedded in other terms or atoms. relations have attributes name, age and Predicate and function symbols have salary. We can define the following rela- fixed arities. However, it has been tions with higher-order queries in Hi- shown from Prolog applications that it Log: is very useful to let a symbol be used as both predicate and function with vari- ous arities. p1͑R͒ :- R͑_, _, _͒ HiLog (Higher-order Logic) [Chen et ͑ ͒ ͑ ͒ al. 1993] generalizes such a usage and p2 R :- R joe,_,_ gives it a well-defined declarative se- p ͑N͒ :- R͑N,_,_͒ mantics. Note that HiLog here is differ- 3 ent from Hilog discussed in Section 2.3. p4͑N͒ :- There are only two kinds of symbols in R1͑joe, A1,_͒, R2͑N, A2,_͒, HiLog, parameters with various arities A1 Ͻ A2 (or arityless) and variables. Parameters ͑ ͒ can function as predicates, functions p5 N1, N2 :- and constants depending on the context. R1͑N1, A,_͒, R2͑N2, A,_͒, Terms are formed with parameters and R1 Þ R2, N1 Þ N2 variables in the usual way and may ͑ ͒ represent individuals, terms, and atoms p6 N1, N2 :- ͑ ͒ ͑ ͒ in different contexts. R1 N1,_,S , R2 N2,_,S , In first-order logic, unary relations N1 Þ N2

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •47

where hardware, software and admin p7͑N͒ :- R͑N,_,S͒, ¬ R͑NЈ,_,SЈ͒, are attribute names that function as S Ͻ SЈ predicates or relation names of the da- tabase tuple db, and r1, r2, r3 are corre- sponding relations, each of which is a The relation p1 collects the relation set of tuples of the form ͑name : ..., names in the database. The relation p2 collects the department names (relation age : ..., salary : ...͒. A specific at- names) to which joe belongs. The rela- tribute value joe in the relation hard- tion p3 collects the names of employees ware can be referenced with in the hardware, software and admin .hardware͑.name ϭ joe͒. Besides ϭ, relations. The relation p4 collects the we can also use Þ, Ͻ, Յ, Ͼ, and Ն. names of employees who are older than In L2, each relation has an ordering Joe. The relation p5 collects the pair of the names of employees who have the of attributes in the tuples, so that at- tributes can be omitted according to this same age but are in different depart- ordering. We can use variables for at- ments. The relation p6 collects the pair tribute names (relation names) and we of the names of employees who have the need not deal with their arity. same salary. The relation p7 collects the names of employees who have the high- Example 4.4 Based on the database est salary in their department. in Example 4.3, we can define the fol- lowing relations with higher-order que- However, we cannot define a relation ries: to collect all relation names in HiLog since we must explicitly deal with the arity of each relation in order to query .p1͑R͒ :- .R the relation name, as shown in the first ͑ ͒ ͑ ϭ ͒ rule in Example 4.2. Furthermore, we .p2 R :- .R .name joe cannot query attributes as they are not .p ͑N͒ :- .R͑.name ϭ N͒ representable in HiLog. Therefore, we 3 cannot define in HiLog the two relations .p4͑N͒ :- relation and structure, as in Example .R1͑.name ϭ joe,.age ϭ A͒, 4.1. .R2͑.name ϭ N,.age Ͼ A͒

.p5͑N1, N2͒ :- 4.2 L2 .R1͑.name ϭ N1,.age ϭ A͒, ͑ ϭ ϭ ͒ L2 [Krishnamurthy and Naqvi 1988] is a .R2 .name N2,.age A , Þ higher-order deductive database lan- R1 R2 guage based on LDL. The problems dis- .p6͑N1, N2͒ :- cussed above are solved by introducing ͑ ϭ ϭ ͒ attributes into tuples and redefining the .R1 .name N1,.salary S , ͑ ϭ ϭ ͒ database as a tuple. .R2 .name N2,.salary S , N Þ N Example 4.3 The database discussed 1 2 2 in Example 4.2 can be defined in L as a .p7͑N͒ :- tuple as follows: .R͑.name ϭ N,.salary ϭ S͒, R ¬ ͑.name Þ X,.salary Ͼ S͒,

db ϭ ͑hardware : r1, .p8͑A͒ :- .hardware͑.A͒ software : r2, .p ͑R, ͗A͒͘ :- .R͑.A͒ admin : r3) 9

ACM Computing Surveys, Vol. 31, No. 1, March 1999 48 • M. Liu

The relation p1 collects all relation denotes a set of objects that have com- names in the database no matter what mon attributes and prescribes the at- arity they have. It is the L2 version of tributes applicable to their instances. Classes are organized into a subclass the first rule in Example 4.1. The rela- hierarchy such that the prescribed at- tions p2 to p7 are the same as in Exam- tributes of superclasses can be inherited ple 4.2. The relation p8 collects all the by the subclasses. attribute names of the relation hard- A natural question for object-oriented ware in the database. The relation p9 databases is how to reason about the collects all relation names and their at- subclass hierarchy, attribute informa- tion of classes, and class and attribute tributes, where ͗A͘ is a set-grouping information of objects. F-logic is the term. Its definition is more intuitive first object-oriented deductive language than the second rule in Example 4.1. that supports reasoning about class and attribute information. A novel feature of L2 is the introduc- tion of replacement semantics as a solu- Example 4.5 The following is a por- tion to higher-order unification. In bot- tion of an F-logic program based on the tom-up evaluation of rules in deductive database discussed in Example 4.1. It databases, general unification is not has classes and subclass definitions and needed. Instead, only matching is used; part of the objects. that is, only one of the two terms con- tains variables. The higher-order vari- employee ::person ables in L2 are limited to range over database attributes. A rule with higher- hardware ::employee order variables can be rewritten by re- software ::employee placing the variables with attributes. The rewritten rules are in first-order admin ::employee logic and their meaning is well defined. Since the number of database attributes joe : hardware must be finite, the language is decid- employee͓name f string, able. This approach is a natural and f simple way towards the integration of age integer, f the definition and manipulation of salary integer, schema and data. children ff person] 2 However, L does not have a notion of joe͓name 3 ‘‘Joe’’, schema. In the database in Example age 3 30, 4.3, if one of the relations r1, r2, r3 is salary 3 30K, empty, then we have no structural in- children33͕bob, sam͖] formation such as name, age, and sal- ary. Furthermore, it is not required that ... tuples in a relation have the same form. Unlike L2, we can define attribute For example, r1 can contain two tuples: information for classes in F-logic. Even ͑name : joe, age : 30, salary :30K͒ though we have no instances in the ͑ ͒ and name : sam, wife : pam . There- class, the attribute information re- fore, the last two rules in Example 4.4 mains. are not really meaningful in this case. As discussed in Section 3, classes and attributes are also objects in F-logic. 4.3 F-logic They differ from other objects in their occurrences in special places. Therefore, In pure object-oriented databases, we we can use variables for the places of have classes instead of relations. A class classes and attributes and reason about

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •49 class and attribute information in F- C͓interveningϪclass@D33͕E͖͔ :- logic. C ::E, E ::D, Example 4.6 For the F-logic program C Þ E, E Þ D in Example 4.5, we can use the follow- ing rules with higher-order queries to 4.4 ROL collect all classes, all attributes of the class hardware, all attributes of each ROL [Liu 1996; 1998a] is another lan- class, and all instances of each class in guage that supports reasoning about F-logic: subclass hierarchy, attribute informa- tion of classes, and class and attribute information of objects. It has a clear db͓classes33͕C͖͔ :- C ::D ͓ 33͕ ͖͔ notion of schema and allows class and db classes C :- D ::C attribute variables in rules. The seman- ͓ 33͕ ͖͔ db classes C :- O : C tics of higher-order features of ROL is ͓ 33͕ ͖͔ ͓ f ͔ db classes C :- C A D based on replacement semantics of L2 so ͓ 33͕ ͖͔ ͓ ? ͔ db classes C :- C A D that the language is still decidable. db͓classes33͕C͖͔ :- D͓A f C͔ db͓classes33͕C͖͔ :- D͓A ? C͔ Example 4.7 The F-logic program in Example 3.3 can be represented in ROL employee͓subclasses33͕C͖͔ :- as follows: C ::employee hardware͓attributes33͕A͖͔ :- Schema hardware͓A f C͔ person employee isa person hardware͓attributes33͕A͖͔ :- ͓name f string, hardware͓A ? C͔ children f ͕ person͖, f C͓attributes33͕A͖͔ :- C͓A f D͔ phone integer] hardware isa employee C͓attributes33͕A͖͔ :- C͓A ff D͔ software isa employee C͓instances33͕O͖͔ :- O : C admin isa employee As shown in the above example, it is Facts cumbersome to query all classes in F- joe : hardware logic. The reason is that F-logic does not ͓name 3 “Joe”, distinguish between ordinary objects age 3 30, and classes. As a result, we have to salary 3 30K, query all possible places for classes. The children 3 ͕bob, sam͖] use of two kinds of arrows is somewhat ... burdensome, too. Similarly, it is also cumbersome to query all immediate In ROL, the user explicitly defines subclasses, superclasses, and immedi- immediate subclasses and primary class ate instances of a class in F-logic. For membership using the symbols “isa” example, the following rules show how and “:”, respectively, based on which 1 to collect all immediate subclasses general subclasses and class member- ship can be derived using the symbols ͓ 33͕ ͖͔ .respectively ,”ء:“ and ”ءC immediateϪsubclasses D :- “isa C ::D, ¬C͓interveningϪclass@D33͕E͖͔ Example 4.8 Consider the following ROL rules with higher-order queries based on the ROL program in Example 1This example is due to one of the referees. 4.7:

ACM Computing Surveys, Vol. 31, No. 1, March 1999 50 • M. Liu

db͓classes 3 ͗C͔͘ :- C guages have been proposed [Abiteboul 1988; Abiteboul and Vianu 1991; Bon- employee͓subclasses 3 ͗C͔͘ :- ner and Kifer 1993; Bonner et al. 1993; employee Bonner and Kifer 1994; Bry 1990; Chen ءC isa 1997; de Maindreville and Simon 1988; ͓ 3 ͗ ͔͘ hardware attributes A :- Kakas and Mancarella 1990; Kowalski hardware͓A f _͔ 1992; Manchanda 1989; Montesi et al. C͓attributes 3 ͗A͔͘ :- C͓A f _͔ 1997; Naqvi and Krishnamurthy 1988; Naish et al. 1987; Reiter 1995; Wichert C͓immediateϪsubclasses 3 ͗D͔͘ :- and Freitag 1997]. Indeed, modeling up- D isa C dates in logic-based languages is still the subject of substantial current re- C͓nonϪimmediateϪsubclasses 3 ͗D͔͘ :- search. -C, ¬ D isa C The important issues related to up ءD isa ͓ 3 ͗ ͔͘ dates from a deductive database point of C immediateϪinstances O :- O : C view are: ͓ 3 ͗ ͔͘ C nonϪimmediateϪinstances O :- (1) extensional and intensional data- ¬ ء O : C, O : C base updates, ء ͔͘ ͗ 3 ͓ C instances O :- O : C, (2) nondeterministic updates, ͓ 3 ͔ C highestϪpaidϪemployee O :- (3) bulk or set-oriented updates, O : C͓salary 3 S1͔ ¬OЈ : C͓salary 3 S2͔, (4) conditional updates. Ͻ S1 S2 In this section, we discuss four families of deductive languages that support up- The first five rules are the ROL versions dates, Prolog, DLP and its extension, of the F-logic rules in Example 4.6. The DatalogA and LDL, and Transaction next four collect all immediate sub- Logic, and show how they deal with the classes, all nonimmediate subclasses, above issues. all immediate instances and all nonim- mediate instances of of each class, re- 5.1 Prolog spectively. The last rule finds the em- ployees who have the highest salary in In Prolog, the basic update primitives are assert and retract. Assert is used to their department. insert a single fact or rule into the data- As discussed in Section 3, ROL’s func- base. It always succeeds initially and tor classes can be used as relations. fails when the computation backtracks. However, the higher-order mechanism Facts and rules can be deleted from the used in ROL only allows us to query the database in Prolog by using retract. Ini- relation schema and it is impossible to tially, retract deletes the first clause in query just the attributes in the rela- the database that unifies with the argu- tions. In addition, the non-immediate ment of retract. On backtracking, the and immediate features are cumber- next matching clause is removed. It some. fails when there are no remaining matching clauses. Therefore, both ex- 5. UPDATES tensional and intensional databases can be updated with Prolog. Like a relational database, a deductive database also undergoes updates to ab- Example 5.1 Let employee be a base sorb new information. In the past de- relation that stores the department and cade, various methods to incorporate salary of each employee. The following update constructs into logic-based lan- query in Prolog is used to fire employee

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •51

Joe in the Toy department with a salary The nondeterminism is due to the vari- of 10K and hire an employee Bob in the able Section, which occurs only in the Shoe department with a salary of 15K. body of the rule and is therefore exis- tentially quantified. ? Ϫ retract͑employee͑joe, toy,10K͒͒, Prolog uses a top-down, tuple-at-a- assert͑employee͑bob, shoe,15K͒͒ time, backtracking-based inference Updates expressed in queries can be mechanism for updates and queries. For performed only once. If the same kind of bulk updates, it resorts to recursive up- updates may be used more than once, it date rules in Prolog. is better to write a general update rule Example 5.4 Let employee be a base and then query the head of the rule for relation as in Example 5.1. The follow- specific updates. ing recursive rules can be used to give Example 5.2 Let account be a base every employee an X% salary increase relation that stores the balance of each in Prolog: account. The following rule can be used to withdraw money from an account: raise͑X͒:- ¬employee͑Name, Dept, Sal͒ withdraw͑Acc, Amt͒ :- raise͑X͒:- ͑ ͑ ͒͒ account͑Acc, Bal1͒, retract employee Name, Dept, Sal1 , ء ϭ ϩ Bal1 Ն Amt, Sal2 Sal1 Sal1 X, ͑ ͒ Bal2 ϭ Bal1 Ϫ Amt, raise X , ͑ ͑ ͒͒ retract͑account͑Acc, Bal1͒͒, assert employee Name, Dept, Sal2 assert͑account͑Acc, Bal2͒͒ In order to give every employee a 10% If we want to withdraw $30 from the salary increase, we can issue a query ?- account a123, then we can use a query raise͑0.1͒. If there are still employees ?- withdraw͑a123,30͒. in the database, the second will nonde- terministically delete an employee, call In Prolog, variables that occur only in ͑ ͒ the body of a rule are existentially the query ?- raise 0.1 recursively, and quantified. Therefore, updates in Prolog then insert the employee with a new can be nondeterministic. salary. In this way, the salary of each employee can be increased. Example 5.3 Let course, section and size be base relations where course Unfortunately, the semantics of assert stores the sections for each course, sec- and retract is not well defined in Prolog. tion stores students for each section, The exact effect of calling code contain- and size stores the size of each section. ing assert and retract is often difficult to predict. The resulting database update The following Prolog rule can enroll a relies only on the operational semantics student into one section of the course of Prolog, rather than just the declara- whose size is no more than 30 nondeter- tive semantics. The order of execution of ministically: subgoals is as important as the logical enroll͑Student, Course͒ :- content of the goal. For the rules in Examples 5.2, 5.3 and 5.4, changing the course͑Course, Section͒, ͑ ͒ order of emph assert or retract in the size Section, Num1 , body of the rules may yield different Ͻ ϭ ϩ Num1 30, Num2 Num1 1, results. Many distributed Prolog sys- ͑ ͑ ͒͒ retract size Section, Num1 , tems have versions of retract with bugs assert͑section͑Section, Student͒͒, or strange behavior (sometimes called assert͑size͑Section, Num2͒͒ “features”). In an external or distrib-

ACM Computing Surveys, Vol. 31, No. 1, March 1999 52 • M. Liu uted database system, the problems namic predicate symbols. For every with assert and retract become much base predicate symbol p, two associated more severe. Multiuser access creates dynamic predicate symbols ϩp and Ϫp even more difficulties. In addition, the are used to specify basic addition and tuple-at-a-time, backtracking-based up- date mechanism of Prolog is not suit- deletion of tuples in the base relation p, able for deductive databases. The fol- respectively. Atoms formed using dy- lowing example shows another problem namic predicate symbols are called dy- with conditional updates in Prolog. namic atoms. An update query in DLP is of the form Example 5.5 Let employee be as in ͗D͑͘␣͒ where D is a dynamic atom used Example 5.1 and avgϪsal an inten- for updates and ␣ is a query. The up- sional relation based on employee that dates specified in D are executed only if gives the average salary of employees in the query ␣ is evaluated true after the each department. Consider the follow- updates. If ␣ is just true, then we can ing Prolog rule: simply use ͗D͘ as an update query. As ␣ hire͑Name, Dept, Sal͒ :- can contain an update query, complex updates can be expressed. assert͑employee͑Name, Dept, Sal͒͒, ͑ ͒ Յ The updates in Example 5.1 can be avgϪsal Dept, Avg , Avg 50K represented as a nested update query in DLP in two different ways as follows: The intention here is to hire an em- ployee only if the average salary in the ? Ϫ ͗Ϫ employee͑joe, toy,10K͒͘ department stays below 50K after hir- ͑͗ϩ employee͑bob, shoe,15K͒͒͘ ing. However, in Prolog, if the average salary after hiring Bob is greater than ? Ϫ ͗Ϫ employee͑bob, shoe,15K͒͘ 50K, Bob has still been hired, since ͑͗ϩ employee͑joe, toy,10K͒͒͘ Prolog does not undo updates during backtracking. Update rules whose body contains up- date query can be used in DLP. The 5.2 DLP and Its Extension updates in Example 5.2 can be repre- sented as an update rule in DLP as The notion of states is inherent in any notion of updates. A state in Datalog is follows: a set of base and derived relations. Up- ͗ ͑ ͒͘ dates cause transitions from a state withdraw Acc, Amt :- ͗Ϫ ͑ ͒͘ through a state space, thereby changing account Acc, Bal1 ͑ Ն ϭ Ϫ the database. Dynamic logic [Harel Bal1 Amt, Bal2 Bal1 Amt, 1979] is a logic for reasoning about pro- ͗ϩ account͑Acc, Bal1͒͘) grams that test and change the values of an environment. It provides a natural To withdraw $30 from the account state-transition semantics for updates a123, we can use an update query ?- and is therefore used in several exten- ͗withdraw͑a123,30͒͘. sions of Datalog with updates. As in Prolog, a top-down, tuple-at- DLP (Dynamic Logic Programming) a-time, backtracking-based inference [Manchanda and Warren 1988] is such mechanism is used in DLP for both up- a Datalog extension. It allows only the dates and queries. However, unlike Pro- extensional database to be updated. The log, it undoes updates during backtrack- intensional database cannot be updated. Ϫ͗ In DLP, predicate symbols are divided ing. For the above query ? withdraw into three disjoint sets: base predicate ͑a123,30͒͘, DLP first deletes the tuple symbols, rule predicate symbols and dy- account͑a123, Bal1͒.IfBal1 Ն 30 is

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •53 not true, DLP undoes the deletion dur- ͗raise͑X͒͘ :- ing backtracking. ͗Ϫ employee͑Name, Dept, Sal1͒͘ ͒,Xء Like Prolog, DLP also supports non- ͑Sal ϭ Sal ϩ Sal deterministic updates as variables oc- 2 1 1 ͗raise͑X͒͘ curring in the body of a rule are existen- ͑͗ϩ employee͑Name, Dept, Sal ͒͒͘ tially quantified. The nondeterministic 2 update rule in Example 5.3 can be de- Updates in DLP have an operational fined in DLP with a different ordering semantics with a left-to-right reading. as follows: In addition, its tuple-at-a-time, back- tracking-based update mechanism is ͗enroll͑Student, Course͒͘ :- not suitable for deductive databases. As course͑Course, Section͒, a result, DLP is extended in Manchanda [1989] to directly support bulk updates. size͑Section, Num1͒, ͗ϩ section͑Section, Student͒͘ For example, the updates in Example ͑ Ͻ ϭ ϩ 5.5 can be represented in this extension Num1 30, Num2 Num1 1, of DLP as follows: ͗Ϫ size͑Section, Num1͒͘ ͑͗ϩ ͑ ͒͒͒͘ size Section, Num2 raise͑X͒ : ͕employee2͑Name, Dept, Sal2͒ :- employee͑Name, Dept, Sal1͒, {XءIf a course DB has two sections, the Sal2 ϭ Sal1 ϩ Sal1 query ? Ϫ enroll͑joe, db͒ inserts Joe into one section and then checks if it is Given a query ?- raise͑0.1͒, the up- full. If so, it backtracks, undoes the dates are performed as follows: trans- insertion, and tries another section. form the given definition into the Data- Therefore, if neither section is full, Joe log one: will be enrolled nondeterministically in ͑ ͒ a section. raiseϪemployee2 X, Name, Dept, Sal2 :- ͑ ͒ As an update query in DLP contains a employee Name, Dept, Sal1 , ء ϭ ϩ query part that must be evaluated true, Sal2 Sal1 Sal1 X we can express conditions that com- pleted updates must satisfy. For exam- Then evaluate the relation denoted by ple, the conditional updates in Example raise_employee2 in the current data- 5.4 can be represented as follows: base; project out the first attribute (cor- responding to X) from this relation; re- ͗hire͑Name, Dept, Sal͒͘ :- place the current instance of employee ͗ϩ employee͑Name, Dept, Sal͒͘ with this projection. ͑avgϪsal͑Dept, Ave͒, Avg Յ 50K 5.3 DatalogA and LDL Given a query ? Ϫ ͗hire͑bob, toy, ͒͘ DatalogA (Datalog with Actions) [Naqvi 15K , if the average salary after hiring and Krishnamurthy 1988] is another ex- Bob is greater than 50K, then Avg tension of Datalog with updates only to Յ 50K will fail. Therefore, the whole base relations. Like DLP, it uses dy- query will fail and the update will be namic logic to assign state-transition undone without leaving a residue in the semantics to updates. database. In DatalogA, base facts can be up- As in Prolog, the bulk updates in Ex- dated by using primitive update actions ample 5.5 must be represented in DLP that are represented by prefixing a sym- by recursive update rules as follows: bol ϩ or Ϫ to a base predicate, where ϩ is for insertion and Ϫ is for deletion. ͗raise͑X͒͘ :- ¬employee͑Name,Dept,Sal͒ For example, ϩ employee͑joe, toy,

ACM Computing Surveys, Vol. 31, No. 1, March 1999 54 • M. Liu

50K͒ denotes the action that inserts the In DatalogA, rules with update ac- base fact employee͑joe, toy,50K͒ into tions are evaluated only if the head of the current state. the rule is queried (i.e., top-down), Two kinds of updates are distin- while rules without update actions are guished in DatalogA and have different evaluated bottom-up even if they are semantics. The first kind is the update not directly queried. in which all different orders of execu- Unlike Prolog and DLP, in which tion yield the same final state. This variables occurring only in the body of kind of update has a declarative seman- an update rule are existentially quanti- ␣ ␤ ␣ fied, such variables in DatalogA are uni- tics and is represented by , , where versally quantified so that a set-ori- and ␤ stand for update actions. The ented non-backtracking-based inference other kind is the update in which the mechanism can be used for updates. For update results depend on the order of example, the updates in Example 5.4 execution. That is, different orders of can be represented as follows: execution may yield different final ͑ ͒ states. This kind of update has an oper- raise X :- ͑ ͒ ational semantics and is represented by employee Name, Dept, Sal1 , ء ϭ ϩ ␣;␤, where ␣ and ␤ stand for update Sal2 Sal1 Sal1 X, ͑ Ϫ ͑ ͒ actions. The semantics for this kind employee Name, Dept, Sal1 ; ϩ ͑ ͒ does not require that ␣ executed before employee Name, Dept, Sal2 ) or after ␤ gives the same result. In this update rule, employee͑Name, The updates in Example 5.1 can be Dept, Sal1͒ is a query that generates all represented as a query in DatalogA as substitutions for Name, Dept, and Sal , follows: 1 and ͑Ϫ employee͑Name, Dept, Sal1͒; ? Ϫ employee͑joe, toy,10K͒, ϩ employee͑Name, Dept, Sal2͒͒,is ϩ employee͑bob, shoe,15K͒ used to change the salary of the employee Example 5.6 If we want to fire Joe from Sal1 to Sal2 for each group of sub- and give his salary to Bob, we can use stitutions. Like Prolog, DatalogA does not sup- the following query: port conditional updates, as it has no ? Ϫ employee͑joe, toy, Sal͒; provision for backtracking and remov- ϩ employee͑bob, shoe, Sal͒ ing the effects of previous updates if the condition fails. Therefore, the intended In this query we cannot change the or- updates in Example 5.5 cannot be ex- der of actions, as we need to execute the pressed in DatalogA. deletion first to get the substitution for As the semantics for variables occur- the variable Sal, and then perform the ring only in the body and inference insertion. mechanism in DatalogA are different from those in Prolog, the update results Rules with update actions in the body can be quite different too, even though can also be defined in DatalogA. For the rules look similar. The following example, the update rule in Example rule is the DatalogA version of the rule 5.2 can be represented as follows: in Example 5.3: enroll͑Student, Course͒ :- withdraw͑Acc, Amt͒ :- course͑Course, Section͒, account͑Acc, Bal ͒, 1 size͑Section, Num ͒, Bal Ն Amt, 1 1 Num Ͻ 30, Num ϭ Num ϩ 1, Bal ϭ Bal Ϫ Amt, 1 2 1 2 1 Ϫ size͑Section, Num ͒, ͑Ϫ account͑Acc, Bal ͒; 1 1 ϩ section͑Section, Student͒, ϩ account͑Acc, Bal2͒) ϩ size͑Section, Num2͒.

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •55

If a course DB has two sections, then theory and a sound and complete proof the query ?- enroll͑joe, db͒ will insert theory. Joe into both sections of the DB course. In T᏾, two primitive operations are Indeed, nondeterministic updates are provided to delete and insert facts and not supported in DatalogA. rules: del and ins. A new logical connec- The update mechanism of DatalogA is tive called serial conjunction and de- incorporated in LDL. The nondetermin- noted by R is introduced to sequence istic updates in LDL are supported transactions, where ␣ R ␤ means do ␣ through the use of the choice operator of then do ␤. The updates in Example 5.1 the form choice͑͑X͒, ͑Y͒͒, where X and do not require sequencing and can be Y are vectors of terms, which selects represented as a transaction in T᏾ as those instantiations of the variables X follows: and Y that satisfy the functional depen- ? Ϫ del : employee͑joe, toy,10K͒, dency X 3 Y. The nondeterministic up- ins : employee͑joe, shoe,10K͒ dates in Example 5.3 can be repre- sented in LDL as follows: The updates in Example 5.6 require enroll͑Student, Course͒ :- sequencing and can be represented us- course͑Course, Section͒, ing R as follows: ͑ ͒ size Section, Num1 , ? Ϫ del : employee͑joe, toy, Sal͒ R Ͻ ϭ ϩ Num1 30, Num2 Num1 1, ins : employee͑joe, shoe, Sal͒ choice͑͑Course͒, ͑Section͒͒, Ϫ size͑Section, Num ͒, 1 In T᏾, transactions can be defined ϩ section͑Student, Section͒, using rules that can call other transac- ϩ ͑ ͒ size Section, Num2 tions. The update rule in Example 5.2 can be represented as follows: The choice operator here guarantees that only one section of the course is withdraw͑Acc, Amt͒ :- selected for insertion. balance͑Acc, Bal͒ R The dynamic logic interpretation of Bal Ն Amt R updates in DatalogA and LDL provides del : balance͑Acc, Bal͒ R a semantics to updates that is conso- ins : balance͑Acc, Bal Ϫ Amt͒ nant with the operational meanings of the update predicates. However, this se- mantics is rather tricky. As in Prolog and DLP, in T᏾ variables occurring only in the body of a rule are existentially quantified and tuple-at-a- 5.4 Transaction Logic time evaluation mechanism is used. However, the evaluation can be exe- The notion of transaction is essential for cuted either bottom-up or top-down. database operations. It was first intro- Nondeterministic updates are naturally duced into Prolog informally in Naish et supported in this way. The rule in Ex- al. [1987]. Its use in logic programming was then formalized in Transaction ample 5.3 can be represented in T᏾ as follows: Logic T᏾ [Bonner and Kifer 1993; 1994; ͑ ͒ Bonner et al. 1993]. T᏾ is an extension enroll Student, Course :- of first-order logic, both syntactically ͑course͑Course, Section͒, and semantically. Not only the exten- size͑Section, Num͒, sional database but also the intensional Num Ͻ 30 R database can be updated in T᏾. Unlike ins : section͑Section, Student͒ R other approaches, the main novel fea- del : size͑Section, Num͒ R ture of T᏾ is that it has a natural model ins : size͑Section, Num ϩ 1͒.

ACM Computing Surveys, Vol. 31, No. 1, March 1999 56 • M. Liu

As a transaction must be done atomi- semantics of logic programming and its cally, the conditional updates in Exam- inference capabilities. The need for ex- ple 5.4 can be represented in T᏾ as pressiveness has forced deductive data- follows: base languages away from their simple roots in logic programming and the rela- hire͑Name, Dept, Sal͒ :- tional data model. ins : employee͑Name, Dept, Sal͒ R In this paper, we have discussed the problems with the standard deductive avgϪsal͑Dept, Ave͒ R database language Datalog (with nega- Avg Յ 50K tion) from four different aspects: com- As in DLP and DatalogA, if the aver- plex values, object orientation, higher- age salary after hiring an employee is orderness and updates. In each case, we greater than 50K, the whole query will have examined four typical solutions. fail without leaving a residue in the There are three important areas that database. we have not discussed here, since little has been published to date. One is that In order to support bulk updates, T᏾ none of the deductive database lan- uses relational assignment as in rela- guages are computationally complete so tional databases, in the same spirit as not all programming tasks can be per- the extension of DLP discussed in Sec- formed in a single framework. The sec- tion 5.2. For example, the following ond is that none of the deductive data- rules in T᏾ can be used to give every base languages support unknown/null employee a 10% salary increase. values, which are common in database applications. Indeed, it is not clear how ͒ ء ͑ employee2 Name, Dept, Sal 1.1 :- to derive information with rules when employee͑Name, Dept, Sal͒ there are null values in the database. raise :- ͓employee :ϭ employee2͔ Finally, most object-oriented deductive languages are only structurally object- oriented, since methods and encapsula- Ϫ Given a query ? raise, the new contents tion, two important features of object of the employee relation are computed orientation, are not properly supported. by the first rule and are held in the rela- In Bertino and Montesi [1992], encapsu- tion employee2. Then the relational as- lation is addressed in an object-oriented signment ͓employee :ϭ employee2͔ in deductive framework, but the semantics the second rule assigns the new contents is not defined. into the relation employee. We expect that a few more expressive Unfortunately, the relational assign- deductive database languages will be ment cannot pass arguments. There- developed that incorporate various im- fore, we cannot define the rule in Exam- portant features to solve the problems discussed here. In the mean time, vari- ple 5.4 that gives every employee an X% ous implementation techniques will be salary increase in T᏾. investigated. Powerful deductive data- base systems will eventually be deliv- 6. CONCLUSION ered to the database market to extend database applications further. Approaches to deductive databases are torn by two opposing forces. On one side ACKNOWLEDGMENTS there are the stringent real-world re- quirements of actual databases. The re- The author would like to thank Michael quirements include efficient processing Franklin, associate editor of ACM Com- as well as the ability to express complex puting Surveys, and reviewers for their and subtle real-world relationships. On critical reading of the article. Their the other side are the simple and clear many valuable comments and sugges-

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •57 tions substantially helped to improve APT,K.R.,BLAIR,H.A.,AND WALKER, the quality and accuracy of this survey. A. 1988. Towards a theory of declarative knowledge. In Foundations of deductive da- Thanks also to Katrina Avery at ACM tabases and logic programming, J. Minker, Computing Surveys, Cory Butz, and Ed. Morgan Kaufmann Publishers Inc., San Shilpesh Katragadda for their correc- Francisco, CA, 89–148. tions. This work was partly supported BANCILHON, F., MAIER, F., SAGIV, Y., AND ULLMAN, by the Natural Sciences and Engineer- J. 1986. Magic Sets and Other Strange Ways to Implement Logic Programs. In Pro- ing Research Council of Canada. ceedings of the ACM Symposium on Principles of Database Systems (PODS ’86, Cambridge, REFERENCES Massachusetts). ACM Press, New York, NY, 1–16. ABITEBOUL, S. 1988. Updates, a New Database BANCILHON,F.AND RAMAKRISHNAN, R. 1986. An Frontier. In Proceedings of the International Amateur’s Introduction to Recursive Query Conference on Data Base Theory (Pruges, Bel- Processing Strategies. In Proceedings of the gium). Springer-Verlag, New York, 1–18. conference on Management of data (SIGMOD ABITEBOUL,S.AND BEERI, C. 1995. The Power of ’86, Washington, D.C., May 28-30, 1986), C. Languages for the Manipulation of Complex Zaniolo, Ed. ACM Press, New York, NY, 16– Values. VLDB J. 4, 4, 727–794. 52. ABITEBOUL, S., FISCHER,P.C.,AND SCHEK, H., Eds. 1987. Proceedings of the International BARBACK, M., LOBO, J., AND LU,J. Workshop on Theory and Applications of 1992. Minimizing Indefinite Information in Nested Relations and Complex Objects in Da- Disjunctive Deductive Databases. In Pro- tabases. (Darmstadt, Germany) Springer- ceedings of the International Conference on Verlag, New York. Data Base Theory (Berlin, Ger- many). Springer-Verlag, New York, ABITEBOUL,S.AND GRUMBACH, S. 1991. A rule- based language with functions and sets. 246–260. ACM Trans. Database Syst. 16, 1 (Mar.), BEERI, C. 1989. Formal Models for Object-Ori- 1–30. ented Databases. In Proceedings of the Inter- ABITEBOUL,S.AND HULL, R. 1987. IFO: a formal national Conference on Deductive and Object- semantic . ACM Trans. Da- Oriented Databases (Kyoto, Japan), W. Kim, tabase Syst. 12, 4 (Dec. 1987), 525–565. J. Nicolas, and S. Nishio, Eds. North-Hol- ABITEBOUL, S., HULL, R., AND VIANU,V. land Publishing Co., Amsterdam, The Nether- 1995. Foundations of Databases. Addison- lands, 405–430. Wesley, Reading, MA. BEERI, C. 1990. A formal approach to object- ABITEBOUL,S.AND KANELLAKIS, P. 1989. Object oriented databases. Data Knowl. Eng. 5,2, Identity as a Query Language Primitive. In 353–382. Proceedings of the 1989 ACM SIGMOD inter- BEERI, C., NAQVI, S., SHMUELI, O., AND TSUR, national conference on Management of data S. 1991. Set constructors in a logic data- (SIGMOD ’89, Portland, Oregon, June 1989), base language. J. Logic Program. 10, 3/4 J. Clifford, J. Clifford, B. Lindsay, D. Maier, (Apr./May 1991), 181–232. and J. Clifford, Eds. ACM Press, New York, BEERI,C.AND RAMAKRISHNAN, R. 1991. On the NY, 159–173. power of magic. J. Logic Program. 10, 3/4 ABITEBOUL,S.AND VIANU, V. 1991. Datalog ex- (Apr./May 1991), 255–299. tensions for database queries and up- BERTINO,E.AND MONTESI, D. 1992. Towards a dates. J. Comput. Syst. Sci. 43, 1 (Aug. Logical Object-oriented Programming Lan- 1991), 62–124. guage for Databases. In Proceedings of the AÏT-KACI, H. 1993. An Introduction to Life. In International Conference on Extending Data- Proceedings of the Joint International Confer- base Technology (Vienna, Austria). Springer- ence and Symposium on Logic Programming Verlag, New York, NY, 168–183. (Vancouver, Canada). MIT Press, Cam- BONNER, A. J. 1989. Hypothetical datalog nega- bridge, MA, 506–524. tion and linear recursion. In Proceedings of AÏT-KACI,HAND NASR, R 1986. Login: A logic the eighth ACM SIGACT-SIGMOD-SIGART programming language with built-in inherit- symposium on Principles of Database Systems ance. J. Logic Program. 3, 3 (Oct. 1986), (PODS ’89, Philadelphia, PA, Mar. 29–31, 185–215. 1989), A. Silberschatz, Ed. ACM Press, New ALBANO, A., CARDELLI, L., AND ORSINI,R. York, NY, 286–300. 1985. GALILEO: a strongly-typed, interac- BONNER, A. J. 1990. Hypothetical datalog: com- tive conceptual language. ACM Trans. Data- plexity and expressibility. Theor. Comput. base Syst. 10, 2 (June 1985), 230–260. Sci. 76, 1 (Oct. 31, 1990), 3–51. ALBANO, A., GHELLI, G., AND ORSINI,R. BONNER,A.J.AND KIFER, M. 1993. Transaction 1995. Fibonacci: A Programming Language logic programming. In Proceedings of the for Object Databases. VLDB J. 4, 3, 403– tenth international conference on Logic pro- 444. gramming (ICLP’93, Budapest, Hungary,

ACM Computing Surveys, Vol. 31, No. 1, March 1999 58 • M. Liu

June 21–25, 1993), D. S. Warren, Ed. MIT Object-Oriented Databases (Kyoto, Japan), W. Press, Cambridge, MA, 257–279. Kim, J. Nicolas, and S. Nishio, Eds. North- BONNER,A.J.AND KIFER, M. 1994. An overview Holland Publishing Co., Amsterdam, The of transaction logic. Theor. Comput. Sci. Netherlands, 431–452. 133, 2 (Oct. 24, 1994), 205–265. CHEN,Q.AND KAMBAYASHI, Y. 1991. Nested Re- BONNER, A., KIFER, M., AND CONSENS, lation Based Database Knowledge Represen- M. 1993. Database Programming in Trans- tation. In Proceedings of the 1991 ACM SIG- action Logic. In Proceedings of the Interna- MOD International Conference on tional Workshop on Database Programming Management of Data (SIGMOD ’91, Denver, Languages (New York, NY). Morgan Kauf- CO, May 29–31, 1991), J. Clifford and R. mann Publishers Inc., San Francisco, CA, King, Eds. ACM Press, New York, NY, 328– 309–337. 337. BORGIDA, A. 1988. Modeling class hierarchies CHEN, W. 1997. Programming with Logical with contradictions. In Proceedings of the Queries, Bulk Updates and Hypothetical Rea- Conference on Management of Data (SIGMOD soning. IEEE Trans. Knowl. Data Eng. 9,4, ’88, Chicago, IL, June 1-3, 1988), H. Boral and 587–599. P.-A. Larson, Eds. ACM Press, New York, CHEN, W., KIFER, M., AND WARREN,D. NY, 434–443. S. 1993. HILOG: a foundation for higher- BRODIE, M. 1984. On the development of data order logic programming. J. Logic Program. models. In On Conceptual Modelling,M. 15, 3 (Feb. 1993), 187–230. Brodie, J. Mylopoulos, and J. Schmidt, CHEN,W.AND WARREN, D. S. 1989. C-logic of Eds. Springer-Verlag, New York, NY, 19– complex objects. In Proceedings of the eighth 48. ACM SIGACT-SIGMOD-SIGART symposium BRODSKY, A., JAFFAR, J., AND MAHER,M. on Principles of Database Systems (PODS ’89, 1993. Towards practical constraint databas- Philadelphia, PA, Mar. 29–31, 1989), A. Sil- es. In Proceedings of the International Con- berschatz, Ed. ACM Press, New York, NY, ference on Very Large Data Bases (Dublin, 369–378. Ireland). Morgan Kaufmann Publishers Inc., CHIMENTI, D., GAMBOA, R., KRISHNAMURTHY, R., San Francisco, CA, 557–580. NAQVI, S., TSUR, S., AND ZANIOLO,C. BRY, F. 1990. Intensional Updates: Abduction 1990. The LDL System Prototype. via Deduction. In Proceedings of the Interna- IEEE Trans. Knowl. Data Eng. 2, 1, 76–90. tional Conference on Logic Programming CODD, E. 1970. A for large (Budapest, Hungary). MIT Press, Cam- shared databases. Commun. ACM 13,6, bridge, MA, 561–575. 377–387. CACACE, F., CERI, S., CREPI-REGHIZZI, S., TANCA, L., CODD, E. F. 1979. Extending the Database rela- AND ZICARI, R. 1990. Integrating Object- Oriented Data Modelling with a Rule-Based tional model to capture more meaning. ACM Programming Paradigm. In Proceedings of Trans. Database Syst. 4, 4 (Dec.), 397–434. the 1990 ACM SIGMOD International Confer- COLBY, L. 1989. A Recursive Algebra and ence on Management of Data (SIGMOD ’90, Query Optimization for Nested Relations. In Atlantic City, NJ, May 23–25, 1990), H. Gar- Proceedings of the 1989 ACM SIGMOD inter- cia-Molina, Ed. ACM Press, New York, NY, national conference on Management of data 225–236. (SIGMOD ’89, Portland, Oregon, June 1989), CAREY,M.J.,DEWITT,D.J.,AND VANDERBERG,S. J. Clifford, J. Clifford, B. Lindsay, D. Maier, 1988. A data model and query language for and J. Clifford, Eds. ACM Press, New York, EXODUS. In Proceedings of the Conference NY, 124–138. on Management of Data (SIGMOD ’88, Chi- COLMERAUER, A. 1985. Prolog in 10 figures. cago, IL, June 1-3, 1988), H. Boral and P.-A. Commun. ACM 28, 12 (Dec. 1985), 1296– Larson, Eds. ACM Press, New York, NY, 1310. 413–423. DE MAINDREVILLE,C.AND SIMON,E. CATTELL, R., Ed. 1996. The 1988. Modelling Non Deterministic Queries Standard: ODMG-93. Release 1.2 Morgan and Updates in Deductive Databases. In Kaufmann Publishers Inc., San Francisco, Proc. 14th International Conference on Very CA. Large Data Bases (Los Angeles, CA). Mor- CERI, S., GOTTLOB, G., AND TANCA, gan Kaufmann Publishers Inc., San Fran- L. 1990. Logic programming and databas- cisco, CA, 395–406. es. Springer-Verlag, New York, NY. DERR, M., MORISHITA, S., AND PHIPPS, CHEN, P. 1976. The Entity-Relationship model: G. 1993. Design and Implementation of the Toward a unified view of data. ACM Trans. Glue-Nail Database System. In Proceedings Database Syst. 1, 1, 9–36. of the 1993 ACM SIGMOD international con- CHEN,Q.AND CHU, W. 1989. HILOG: A High- ference on Management of data (SIGMOD ’93, Order Logic Programming Language for Non- Washington, DC, May 26–28, 1993), P. Bun- 1NF Deductive Databases. In Proceedings of eman and S. Jajodia, Eds. ACM Press, New the International Conference on Deductive and York, NY, 147–167.

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •59

DEUX, O. 1991. The O2 system. Commun. HAMMER,M.AND MCLEOD, D. 1981. Database ACM 34, 10 (Oct. 1991), 34–48. description with SDM: A semantic database DOVIER, A., OMODEO, E., PONTELLI, E., AND ROSSI, model. ACM Trans. Database Syst. 6,3 G. 1996. {log}: A Language for Program- (Sept.), 351-386. ming in Logic with Finite Sets. J. Logic Pro- HAN, J., LIU, L., AND XIE, Z. 1994. LogicBase: a gram. 28,1,1–44. deductive database system prototype. In FERNÁNDEZ,J.AND MINKER, Proceedings of the 3rd International Confer- J. 1992a. Semantics of Disjunctive Deduc- ence on Information and Knowledge Manage- tive Databases. In Proceedings of the Inter- ment (CIKM ’94, Gaithersburg, Maryland, national Conference on Data Base Theory Nov. 29–Dec. 2, 1994), N. R. Adam, B. K. (Berlin, Germany). Springer-Verlag, New Bhargava, and Y. Yesha, Eds. ACM Press, York, 21–50. New York, NY, 226–233. FERNÁNDEZ,J.A.AND MINKER, J. 1992b. HAREL, D. 1979. First-Order Dynamic Logic. Disjunctive Deductive Databases. In Pro- Lecture Notes in Computer Science, vol. ceedings of the International Conference on 361. Springer-Verlag, New York. Logic Programming and Automated Reason- HEUER,A.AND SANDER, P. 1993. The LIVING ing (LPAR ’92, St. Peterburg, Rus- IN A LATTICE Rule Language. Data Knowl. sia). Springer-Verlag, New York, 332–356. Eng. 9, 4, 249–286. FISHMAN, D. H., B., B., CATE,H.P.,CHOW,E.C., HILL,P.AND LLOYD, J. 1994. The Gödel pro- CONNORS, T., DAVIS,J.W.,DERRETT, N., HOCH, gramming language. MIT Press, Cambridge, C. G., KENT, W., LYNGBAEK, P., MAHBOD, B., MA. NOUE NEIMAT,M.A.,RYAN,T.A.,AND SHAN,M. I , K. 1994. Hypothetical Reasoning in C. 1987. Iris: An object-Oriented Database Logic Programs. J. Logic Program. 18,3, Management System. ACM Trans. Off. Inf. 197–227. Syst. 5,1,48–69. IOANNIDIS,Y.AND RAMAKRISHNAN, R. 1988. Efficient Transitive Closure Algorithms. In FREITAG, B., SCHUTZ, H., AND SPECHT,G. 1991. LOLA—A Logic Language for Deduc- Proc. 14th International Conference on Very tive Databases and its Implementation. In Large Data Bases (Los Angeles, CA). Mor- gan Kaufmann Publishers Inc., San Fran- Proceedings of the 2nd International Sympo- cisco, CA, 382–394. sium on Database Systems for Advanced Ap- ISHIKAWA, H., SUZUKI, F., KOZAKURA, F., MAKINOU- plications (DASFAA ’91, Tokyo, Japan, CHI, A., MIYAGISHIMA, M., IZUMIDA, Y., Apr.). 216–225. AOSHIMA, M., AND YAMANE, Y. 1993. The FROHN, J., HIMMERÖDER, R., KANDZIA, P., LAUSEN, model, language, and implementation of an G., AND SCHLEPPHORST, C. 1997. Florid: A object-oriented multimedia Prototype for F-logic. In Proceedings of the management system. ACM Trans. Database International Conference on Data Engineer- Syst. 18, 1 (Mar. 1993), 1–50. ing. IEEE Computer Society, New York, NY. JACOBS, B. 1982. On Database Logic. J. ACM GALLAIRE, H. 1981. Impacts of Logic on Data 29, 2 (Apr.), 310–332. Bases. In Proceedings of the International JAFFAR,J.AND MAHER, M. 1994. Constraint Conference on Very Large Data Bases logic programming: A survey. J. Logic Pro- (Cannes, France). IEEE Computer Society, gram. 19-20, 503-581. New York, NY, 248–259. JARKE, M., GALLERSDÖRFER, R., JEUSFELD,M.A., GALLAIRE,H.AND MINKER, J., Eds. 1978. Logic STAUDT, M., AND EHERER, S. 1995. and Data Bases. Plenum Press, New York, ConceptBase—a deductive object base for NY. meta data management. J. Intell. Inf. Syst. GALLAIRE, H., MINKER, J., AND NICOLAS,J.M. 4, 2 (Mar. 1995), 167–192. 1984. Logic and Databases: A Deductive Ap- JAYARAMAN, B. 1992. Implementation of subset- proach. ACM Comput. Surv. 16, 2, 153–186. equational programs. J. Logic Program. 12, GELFOND,M.AND LIFSCHITZ, V. 1988. The stable 4 (Apr. 1992), 299–324. model semantics for logic programming. In JIANG, B. 1990. A Suitable Algorithm for Com- Proceedings of the 5th International Confer- puting Partial Transitive Closures. In Proc. ence on Logic Programming (Washington, IEEE Sixth International Conference on Data DC). MIT Press, Cambridge, MA, 1070–1080. Engineering (Kobe, Japan, May). IEEE Com- GRANT,J.AND MINKER, J. 1992. The impact of puter Society, New York, NY, 264–271. logic programming on databases. Commun. KAKAS,A.AND MANCARELLA, P. 1990. Database ACM 35, 3 (Mar. 1992), 66–81. Updates through Abduction. In Proceedings GRUMBACH,S.AND SU, J. 1996. Towards practi- of the 16th VLDB Conference on Very Large cal constraint databases (extended abstract). Data Bases (VLDB, Brisbane, Australia). In Proceedings of the fifteenth ACM SIGACT- VLDB Endowment, Berkeley, CA, 650–661. SIGMOD-SIGART symposium on Principles KIEBLING, W., SCHMIDT, H., STRAUB, W., AND DUN- of database systems (PODS ’96, Montreal, ZINGER, G. 1994. DECLARE and SDS: P. Q., Canada, June 3–5, 1996), R. Hull, Early Efforts to Commercialize Deductive Da- Ed. ACM Press, New York, NY, 28–39. tabase Technology. VLDB J. 3, 2, 211–243.

ACM Computing Surveys, Vol. 31, No. 1, March 1999 60 • M. Liu

KIFER,M.AND LAUSEN, G. 1989. F-Logic: A International Workshop on Database and Ex- Higher-Order Language for Reasoning about pert System Applications (DEXA Workshop Objects, Inheritance, and Schema. In Pro- ’98, Vienna, Austria, Aug. 24–28). IEEE ceedings of the 1989 ACM SIGMOD interna- Computer Society Press, Los Alamitos, CA, tional conference on Management of data 856–863. (SIGMOD ’89, Portland, Oregon, June 1989), LIU, M., YU, W., GUO, M., AND SHAN, J. Clifford, J. Clifford, B. Lindsay, D. Maier, R. 1998. ROL: A Prototype for Deductive and J. Clifford, Eds. ACM Press, New York, and Object-Oriented Databases. In Proceed- NY, 134–146. ings of the International Symposium on Data- KIFER, M., LAUSEN, G., AND WU, base Engineering and Applications (ICDE ’98, J. 1995. Logical foundations of object-ori- Orlando, FL, Feb. 23–27). IEEE Computer ented and frame-based languages. J. ACM Society Press, Los Alamitos, CA, 598. 42, 4 (July 1995), 741–843. LLOYD, J. W. 1987. Foundations of Logic Pro- KIFER,M.AND WU, J. 1993. A logic for program- gramming. 2nd ed. Springer-Verlag Symbolic ming with complex objects. J. Comput. Syst. Computation and Artificial Intelligence Se- Sci. 47, 1 (Aug. 1993), 77–120. ries. Springer-Verlag, Vienna, Austria. KIM, W. 1990a. Introduction to Object-Oriented LOU,Y.AND OZSOYOGLU, M. 1991. LLO: A De- Databases. MIT Press, Cambridge, MA. ductive Language with Methods and Method KIM, W. 1990b. Object-Oriented Databases: Inheritance. In Proceedings of the 1991 ACM Definition and Research Direction. IEEE SIGMOD International Conference on Man- Trans. Knowl. Data Eng. 2, 3 (Sept.), 327– agement of Data (SIGMOD ’91, Denver, CO, 341. May 29–31, 1991), J. Clifford and R. King, KOWALSKI, R. A. 1988. The early years of logic Eds. ACM Press, New York, NY, 198–207. programming. Commun. ACM 31, 1 (Jan. MAIER, D. 1983. The Theory of Relational Data- 1988), 38–43. bases. Computer Science Press, Inc., New KOWALSKI, R. 1992. Database updates in the York, NY. event calculus. J. Logic Program. 12, 1/2 MAIER, D. 1986. A logic for objects. Technical (Jan. 1992), 121–146. Report CS/E-86-012. Oregon Graduate Cen- KRISHNAMURTHY,R.AND NAQVI, S. 1988. ter, Beaverton, Oregon. Towards a Real Horn Clause Language. In MAIER, D. 1987. Why Database Languages are Proc. 14th International Conference on Very a Bad Idea. In Proceedings of the Workshop Large Data Bases (Los Angeles, CA). Mor- on Database Programming Languages gan Kaufmann Publishers Inc., San Fran- (Roscoff, France). cisco, CA, 252–263. MAIER, D., STEIN, J., OTIS, A., AND PURDY, KUPER, G. M. 1990. Logic programming with A. 1986. Development of Object-Oriented sets. J. Comput. Syst. Sci. 41, 1 (Aug. 1990), DBMS. In OOPSLA ’86. ACM Press, New 44–64. York, NY, 472–482.

LECLUSE,C.AND RICHARD, P. 1989. The O2 Da- MANCHANDA, S. 1989. Declarative expression of tabase Programming Language. In Proceed- deductive database updates. In Proceedings ings of the fifteenth international conference of the eighth ACM SIGACT-SIGMOD-SI- on Very large data bases (Amsterdam, The GART symposium on Principles of Database Netherlands, Aug 22–25 1989), R. P. van de Systems (PODS ’89, Philadelphia, PA, Mar. Riet, Ed. Morgan Kaufmann Publishers Inc., 29–31, 1989), A. Silberschatz, Ed. ACM San Francisco, CA, 411–422. Press, New York, NY, 93–100. LEVENE,M.AND LOIZOU, G. 1993. Semantics for MANCHANDA,S.AND WARREN, D. S. 1988. A log- null extended nested relations. ACM Trans. ic-based language for database updates. In Database Syst. 18, 3 (Sept. 1993), 414–459. Foundations of deductive databases and logic LIU, M. 1995. Relationlog: A Typed Extension programming, J. Minker, Ed. Morgan Kauf- to Datalog with Sets and Tuples (Extended mann Publishers Inc., San Francisco, CA, Abstract). In Proceedings of the Interna- 363–394. tional Symposium on Logic Programming MONTESI, D., BERTINO, E., AND MARTELLI, (ILPS ’95, Portland, Oregon, Dec. 4–7). MIT M. 1997. Transactions and Updates in De- Press, Cambridge, MA, 83–97. ductive Databases. IEEE Trans. Knowl. LIU, M. 1996. ROL: A Deductive Object Base Data Eng. 9, 5, 784–797. Language. Inf. Syst. 21, 5, 431–457. MORRIS, K., ULLMAN,J.D,AND VAN GELDER, LIU, M. 1998a. An Overview of Rule-based Ob- A. 1986. Design overview of the NAIL! sys- ject Language. J. Intell. Inf. Syst. 10,1, tem. In Proceedings on Third international 5–29. conference on logic programming (London, LIU, M. 1998b. Relationlog: A Typed Extension UK, July 14-18, 1986), E. Shapiro, Ed. to Datalog with Sets and Tuples. J. Logic Springer-Verlag, New York, NY, 554–568. Program. 36, 3, 271–299. MOSS, C. 1994. Prologϩϩ. Addison-Wesley Long- LIU,M.AND SHAN, R. 1998. The Design and man Publ. Co., Inc., Reading, MA. Implementation of the Relationlog Deductive MUMICK,I.S.,FINKELSTEIN,S.J.,PIRAHESH, H., Database System. In Proceedings of the 9th AND RAMAKRISHNAN, R. 1996. Magic condi-

ACM Computing Surveys, Vol. 31, No. 1, March 1999 Deductive Database Languages: Problems and Solutions •61

tions. ACM Trans. Database Syst. 21,1, ROSS, K. A. 1994. Modular stratification and 107–155. magic sets for Datalog programs with nega- MYLOPOULOS, J., BERNSTEIN,P.A.,AND WONG,H. tion. J. ACM 41, 6 (Nov. 1994), 1216–1266. K. T. 1980. A Language Facility for De- ROTH,M.A.,KORTH,H.F.,AND BATORY,D.S. signing Database-Intensive Applications. 1987. SQL/NF: a query language for ¬ 1NF ACM Trans. Database Syst. 5, 2 (June), 185– relational databases. Inf. Syst. 12, 1 (Jan. 207. 1987), 99–114. NAISH, L., THOM, L., AND RAMAMOHANARAO,K. ROTH,M.A.,KORTH,H.F.,AND SILBERSCHATZ, 1987. Concurrent Database Updates in Pro- A. 1988. Extended algebra and calculus for log. In Proceedings of the International Con- ¬1NF relational databases. ACM Trans. Da- ference on Logic Programming (Melbourne, tabase Syst. 13, 4 (Dec. 1988), 389–417. Australia). MIT Press, Cambridge, MA, SACCA,D.AND ZANIOLO, C. 1987. Magic Count- 178–189. ing Methods. In Proceedings of the ACM NAQVI,S.AND KRISHNAMURTHY, R. 1988. SIGMOD Annual Conference on Management Database Updates in Logic Programming. In of Data (SIGMOD ’87, San Francisco, CA, Proc. 7th ACM SIGACT-SIGMOD-SIGART May 27-29, 1987), U. Dayal, Ed. ACM Press, Symposium on Principles of Database Systems New York, NY, 49–59. (Austin, TX, Mar.). 261–272. SAGONAS, K., SWIFT, T., AND WARREN,D. NAQVI,S.AND TSUR, S. 1989. A logical language 1994. XSB as an Efficient Deductive Data- for data and knowledge bases. Computer base Engine. In Proceedings of the 1994 Science Press principles of computer science ACM SIGMOD International Conference on series. Computer Science Press, Inc., New Management of Data (SIGMOD ’94, Minneap- York, NY. olis, Minnesota, May 24–27, 1994), R. T. Snodgrass and M. Winslett, Eds. ACM OZSOYOGLU,Z.M.AND YUAN, L.-Y. 1987. A new Press, New York, NY, 442–453. normal form for nested relations. ACM SATTAR,A.AND GOEBEL, R. 1997. Consistency- Trans. Database Syst. 12, 1 (Mar. 1987), 111– motivated reason maintenance in hypotheti- 136. cal reasoning. New Gen. Comput. 15, 2, 163– PRZYMUSINSKI, T. C. 1988. On the declarative 186. semantics of deductive databases and logic SHAN,R.AND LIU, M. 1998. Introduction to the programs. In Foundations of deductive data- Relationlog System. In Proceedings of the bases and logic programming, J. Minker, Ed. 6th Intl. Workshop on Deductive Databases Morgan Kaufmann Publishers Inc., San Fran- and Logic Programming (DDLP ’98, Manches- cisco, CA, 193–216. ter, UK, June 20 1998). 71–83. PRZYMUSINSKI, T. 1990. Extended Stable Se- SHIPMAN, D. W. 1981. The Functional Data mantics for Normal and Disjunctive Pro- Model and the Data Language DAPLEX. grams. ACM Trans. Database Syst. 6, 1, 140–173. In Proceedings of the International Conference SMOLKA, G. 1995. The Oz programming mod- on Logic Programming (Budapest, Hunga- el. In Computer Science Today, J. van Leeu- ry). MIT Press, Cambridge, MA, 459–477. wen, Ed. Lecture Notes in Computer Sci- RAMAKRISHMAN, R., Ed. 1994. Applications of ence, vol. 1000. Springer-Verlag, New York, Logic Databases. Kluwer Academic Publish- 324–343. ers, Hingham, MA. SOMOGYI, Z., HENDERSON, F., AND CONWAY,T. RAMAKRISHNAN, R., SRIVASTAVA, D., SUDARSHAN, S., 1996. The Execution Algorithm of Mercury: AND SESHADRI, P. 1994. The coral deductive An Efficient Purely Declarative Logic Pro- system. VLDB J. 3, 2, 161–210. gramming Language. J. Logic Program. 29, RAMAKRISHMAN,R.AND ULLMAN, J. D. 1995. A 1–3, 17–64. Survey of Deductive Database Systems. J. STERLING,L.AND SHAPIRO, E. 1986. The art of Logic Program. 23, 2, 125–150. Prolog: advanced programming tech- RAO, P., SAGONAS, K., SWIFT, T., WARREN, D., AND niques. MIT Press series in logic program- FREIRE, J. 1997. XSB: A System for Eff- ming. MIT Press, Cambridge, MA. ciently Computing. In Proceedings of the In- SU, S. Y. W. 1986. Modeling Integrated Manu- IEEE Computer .ء ternational Conference on Logic Programming facturing Data with SAM and Non-Monotonic Reasoning (LPNMR ’97, 19,1,34–49. Dagstuhl, Germany). Springer-Verlag, New SUBRAHMANIAN, V. S. 1992. Paraconsistent dis- York, 431–441. junctive deductive databases. Theor. Com- REITER, R. 1984. Towards a logical reconstruc- put. Sci. 93, 1 (Feb. 3, 1992), 115–141. tion of relational . In On TSUR,S.AND ZANIOLO, C. 1986. LDL: A Logic- Conceptual Modelling, M. Brodie, J. Mylopou- based data-language. In Proceedings of the los, and J. Schmidt, Eds. Springer-Verlag, 12th international conference on on Very New York, NY, 191–233. Large Data Bases (Kyoto, Japan, Aug., REITER, R. 1995. On Specifying Database Up- 1986). VLDB Endowment, Berkeley, CA, 33- dates. J. Logic Program. 25, 1, 53–91. 41.

ACM Computing Surveys, Vol. 31, No. 1, March 1999 62 • M. Liu

ULLMAN, J. 1982. Principles of Database Sys- Kifer, and Y. Masunaga, Eds. Springer-Ver- tems. Computer Science Press, Inc., New lag, New York, 263–277. York, NY. VAGHANI, J., RAMANOHANARAO, K., KEMP, D., SOMO- ULLMAN, J. 1989a. Principles of Database and GYI, Z., STUCKEY, P., LEASK, T., AND HARLAND, Knowledge-Base Systems. Computer Science J. 1994. The Aditi Deductive Database Press, Inc., New York, NY. System. VLDB J. 3, 2, 245–288. ULLMAN, J. D. 1989b. Bottom-up beats top- VAN GELDER, A. 1993. The alternating fixpoint down for datalog. In Proceedings of the of logic programs with negation. J. Comput. eighth ACM SIGACT-SIGMOD-SIGART sym- Syst. Sci. 47, 1 (Aug. 1993), 185–221. posium on Principles of Database Systems VAN GELDER, A., ROSS,K.A.,AND SCHLIPF,J.S. (PODS ’89, Philadelphia, PA, Mar. 29–31, 1991. The well-founded semantics for gen- 1989), A. Silberschatz, Ed. ACM Press, New eral logic programs. J. ACM 38, 3 (July York, NY, 140–149. 1991), 619–649. ULLMAN, J. 1991. A Comparison between De- WICHERT, C.-A. AND FREITAG, B. 1997. ductive and Object-Oriented Databases Sys- Capturing Database Dynamics by Deferred tems. In Proceedings of the International Updates. In Proceedings of the International Conference on Deductive and Object-Oriented Conference on Logic Programming (Leuven, Databases (Munich, Germany), C. Delobel, M. Belgium). MIT Press, Cambridge, MA.

Received: March 1997; revised: September 1997, April 1998; accepted: July 1998

ACM Computing Surveys, Vol. 31, No. 1, March 1999