Implementation of the CORAL Deductive System

Raghu Ramakrishnan Divesh Srivastava

Univ of Wisconsin Madison Univ of Wisconsin Madison

y

Praveen Seshadri S Sudarshan

ATT Bell Labs Murray Hill Univ of Wisconsin Madison

Intro duction Abstract

In this pap er we discuss the design and implementation of

CORAL is a deductive database system that provides a mo du

the CORAL deductive database system CORAL seeks to

lar declarative query languageprogramming language CORAL

combine features of a database query language such as e

is a deductive system that supp orts a rich declarative language

cient treatment of large relations aggregate op erations and

provides a wide range of evaluation metho ds and allows a combi

declarative semantics with those of a logic programming

nation of declarative and imp erative programming The data can

language such as more p owerful inference capabiliti es and

b e p ersistent on disk or can reside in mainmemory We describ e

the architecture and implementation of CORAL

supp ort for incomplete and structured data Supp ort for

p ersistent relations is provided by interfacing with the EX There were two imp ortant goals in the design of the CORAL ar

chitecture to integrate the dierent optimization techniques

ODUS storage manager A unique feature of CORAL is

in a reasonable fashion and to allow users to inuence the

that it provides a wide range of evaluation strategies such as

evaluation strategies used so as to exploit the full p ower of the

topdown b ottomup and their variants and allows users to

CORAL implementation A CORAL declarative program can b e

optionally tailor execution of a program through highlevel

organized as a collection of interacting mo dules and this mo d

annotations The language is describ ed in Applica

ule structure is the key to satisfying b oth these goals The high

tions in which large amounts of data must b e extensively

level mo dule interface allows mo dules with dierent evaluation

techniques to interact in a transparent fashion Further users

analyzed are likely to b enet from this combination of fea

can optionally tailor the execution of a program by selecting from

tures In comparison to other deductive database systems

among a wide range of control choices at the level of each mo dule

such as Aditi EKSV LDL LOLA and

CORAL also has an interface with and users can pro

NailGlue CORAL provides a more p owerful language

gram in a combination of declarative CORAL and C extended

and supp orts a much wider range of optimization techniques

with CORAL primitives A high degree of extensibility is provided

We highlight several design decisions that allowed us to

by allowing C to use the class structure of C

to enhance the CORAL implementation

integrate often unrelated evaluation techniques and opti

mizations in a nearly seamless fashion Sp ecically we con

This research was supp orted by a David and Lucile Packard

sider the following issues

Foundation Fellowship in Science and engineering a Presidential

Young Investigator Award with matching grants from Digital

Data representation eg constants lists terms

Equipment

Relation representation and implementation eg in

Corp oration Tandem and Xerox and NSF grant IRI

mainmemory and diskresident The address of the authors are Computer Sciences De

partment University of Wisconsin Madison WI

Index structures eg hashstructures and Btrees

USA and ATT Bell Lab oratories Mountain Av

enue Murray Hill NJ USA The authors email

Evaluation techniques eg materialization and top

addresses are fraghudiveshpraveengcswiscedu and sudar

down

sharesearchattcom

y

The work of this author was p erformed largely while he was

In the CORAL implementation we divide evaluation into

in the University of Wisconsin Madison

a numb er of distinct subtasks and provide a clean inter

face b etween the subtasks relevant optimization techniques

can b e almost indep endentl y applied to each subtask Ex

tending database programming languages has received much

attention lately and we b elieve that the CORAL exp eri

ence oers guidelines for resolving several common issues

0

that go b eyond the sp ecic extensions that are addressed in

Page QUERY DATA

CORAL MAIN EVALUATION MANAGER MEMORY

SYSTEM

One of our goals was to allow users to exploit the full

answers

p ower of the implementation CORAL supp orts a very rich

and we b elieve that some user guidance is criti language query optimized

USER program

to eectively optimizing many sophisticated programs cal INTERFACE

EXODUS

is to provide users with the ability to cho ose The problem O.S.

STORAGE

supp orted by CORAL in a from the suite of optimizations program FILES

MANAGER

relatively orthogonal and highlevel way and to use a combi

QUERY

t parts of a program The

nation of optimizations for dieren OPTIMIZER

module structure was a key to solving this problem The in

terface b etween mo dules is kept at a high level evaluation

techniques can b e chosen on a p ermo dule basis through op

tional annotations and mo dules with dierent evaluation

Figure CORAL System Architecture

techniques can interact in a nearly transparent fashion

An overview of the CORAL declarative language is pre

sented in The query language supp orts general Horn

ODUS to olkit and thus by CORAL However within each

clauses with complex terms setgrouping aggregate op er

CORAL pro cess all data that is not managed by the EXO

ations negation and data that contains universally quanti

DUS storage manager is strictly lo cal to the pro cess Most of

ed variables A numb er of user annotations can also b e

the eort of design and implementation in CORAL has con

sp ecied to guide the system in query evaluation and opti

centrated on the singleuser client and the implementation

mization The details of the language are b eyond the scop e

has fo cused on op eration out of main memory While this

of the pap er Many features of the implementation ranging

is in keeping with Prologstyle systems there is no part of

from lowlevel structures to the interactive system environ

the design that is particularly biased towards a mainmemory

ment have also b een omitted from the pap er due to shortage

1

system

of space The implementation details are found in In

Data stored in text les can b e consulted at which p oint

Sections we discuss some imp ortant asp ects of the sys

the data is converted into mainmemory relations with any

tem implementation Section contains an overview of the

sp ecied indices indices can also b e created at a later time

CORAL system architecture Section explains the under

Data stored using the EXODUS storage manager is paged

lying representation of the data and Section provides an

into EXODUS buers on demand making use of the index

overview of query evaluation and optimization Section is

ing and scan facilities of the storage manager The design of

the main section that deals with implementation issues It

the system do es not require that this data b e collected into

covers the basic strategies used in evaluating a mo dule as

mainmemory CORAL structures b efore b eing used as is

well as several imp ortant renements It also addresses user

usual in database systems the data can b e accessed purely

guidance of query optimization and the interaction in the

out of pages in the EXODUS buer p o ol

evaluation of dierent mo dules The CORALC inter

The query pro cessing system consists of two main parts

face and supp ort for extensibility in CORAL including the

a query optimizer and a query evaluation system Simple

addition of new data typ es and op erations and new rela

queries selecting facts from a base relation for instance

tion and index implementations are discussed in Sections

can b e typ ed in at the user interface and are not opti

and We discuss related systems in Section Finally we

mized Complex queries are typically dened in declarative

provide a retrosp ective discussion of the CORAL design and

program mo dules that exp ort predicates views with as

outline future research directions in Section

so ciated query forms ie sp ecications of what kinds of

queries or selections are allowed on the predicate

Architecture of the CORAL

The query optimizer takes a program mo dule and a query

System

form as input and generates a rewritten program that is

The architecture of the CORAL deductive system is shown

optimized for the sp ecied query forms In addition to doing

in Figure CORAL is designed primarily as a single user

rewriting transformations the optimizer adds several control

database system and can b e used in a standalone mo de

annotations to those if any sp ecied by the user The

Persistent data is stored either in text les or using the

rewritten program is stored as a text le which is useful

EXODUS storage manager which has a clientserver ar

as a debugging aid for the user and is also converted

chitecture Each CORAL singleuser pro cess is a client that

into an internal representation that is used by the query

accesses the common p ersistent data from the server Multi

evaluation system

ple CORAL pro cesses could interact by accessing p ersistent

1

data stored using the EXODUS storage manager Trans

Certain optimizations like structuresharing have however

actions and concurrency control are supp orted by the EX b een implemented that optimize mainmemory op eration

Page

typ es such as integers strings or other abstract datatyp es The query evaluation system takes as input annotated

are sub classes of Arg The class Arg denes a set of virtual declarative programs in an internal representation and

metho dssuch as equals hash and print which must b e de database relations The annotations in the declarative pro

ned for each abstract datatyp e that is created The class grams provide execution hints and directives The query

Tuple denes tuples of Args A memb er of the class Relation evaluation system interprets the internal form of the opti

is a set of tuples The class Relation has a numb er of virtual mized program We also develop ed a ful ly compiled ver

metho ds dened on it These include insert Tuple delete sion of CORAL in which we generated a C program

Tuple and an iterator interface that allows tuples to b e from each user program This is the approach taken by

2

fetched from the relation one at a time The iterator is LDL We found that this approach to ok a signi

implemented using a memb er of a TupleIterator class that cantly longer time to compile programs and the resulting

is used to store the state or p osition of a scan on the rela gain in execution sp eed was minimal We have therefore

tion and to allow multiple concurrent scans over the same fo cused on the interpreted version consulting a program

relation takes very little time and is comparable to Prolog systems

This makes CORAL very convenient for interactive program

development

Representation of Terms

The query evaluation system has a well dened getnext

The primitive data typ es provided in the CORAL system

tuple interface with the data manager for access to relations

include integers doubles strings and arbitrary precision

3

This interface is indep endent of how the relation is dened

integers The current implementation restricts data that

as a base relation declaratively through rules or through

is stored using the EXODUS storage manager to b e lim

system or userdened C co de In conjunction with the

ited to terms of these primitive typ es Such data is stored

mo dular nature of the CORAL language such a high level

on disk in its machine representation while in memory the

interface is very useful since it allows the dierent mo dules

data typ es are implemented as sub classes of Arg

to b e evaluated using dierent strategies It is imp ortant to

The evaluation of rules in CORAL is based on the op

stress that CORAL does manipulate data in a setoriented

eration of unication that generates bindings for variables

fashion and the getnexttuple interface is merely an ab

based on patterns in the rules and the data An imp or

straction provided to supp ort mo dularity in the language

tant feature of the CORAL implementation of data typ es

For example a getnexttuple request on a p ersistent rela

is the supp ort for unique identiers to make unication of

tion results in a pagelevel IO request by the buer manager

large terms very ecient Such supp ort is critical for e

of EXODUS

cient declarative program evaluation in the presence of large

CORAL supp orts an interface to C extended with

terms In CORAL each typ e can dene how it generates

several features that provide the abstraction of relations and

unique identiers indep endent of how other typ es construct

tuples C can b e used to dene new relations as well as

their unique identiers if any b ecause of this orthogonal

manipulate relations computed using emb edded declarative

ity no further integration is needed to generate unique iden

CORAL rules The CORALC interface is intended to

tiers for terms built using several dierent kinds of typ es

b e used for the development of large applications

This is very imp ortant for supp orting extensibili ty and the

creation of new abstract data typ es

Terms can b e built from a function symb ol or functor

The Data Manager

and such terms are imp ortant for representing structured

The data manager DM is resp onsible for maintaining and

information For instance lists are a sp ecial typ e of func

manipulatin g the data in relations In discussing the DM

tor term A term f X Y is represented by a record

we also discuss the representation of the various data typ es

containing the function symb ol f an array of ar

While the representation of simple typ es is straightforward

guments and extra information to make unication of

complex structural typ es and incomplete data present in

such terms ecient The current implementation of CORAL

teresting challenges The eciency with which such data

uses a mo died version of hashconsing that op erates

can b e pro cessed dep ends in large part on the manner in

in a lazy fashion Hashconsing assigns unique identiers to

which it is represented in the system This section therefore

each ground functor term such that two ground func

presents the data representation at a fairly detailed level

tor terms unify if and only if their unique identiers are the

and this facilitates the discussion of evaluation techniques

same We note that such identiers cannot b e assigned to

in the subsequent sections

functor terms that contain free variables and these have to

The CORAL system is implemented in C and all data

b e handled dierently

typ es are dened as C classes Extensibility is an imp or

Variables constitute a primitive typ e in CORAL since

tant goal of the CORAL system In particular we view sup

p ort for userdened abstract data typ es as imp ortant In

2

This is analogous to the cursor notion in SQL

3

order to provide this supp ort CORAL provides the generic

Arbitrary precision integers are supp orted using the BigNum

class Arg that is the ro ot of all CORAL datatyp es sp ecic package provided by DEC France

Page

TERM STRUCTURE

aluation describ ed in Section The implementation argument binding ev

* environment *

of this extension involves creating subsidiary relations one

corresp onding to each interval b etween marks and trans

tly providing the union of the subsidiary relations cor

argument binding environment * paren A b enet of this name resp onding to the desired range of marks

"f" #0 25 null

is that it do es not interfere with the indexing arity organization 3 BINDING

#1

hanisms used for the relation the indexing mechanisms

X #0 ENVIRONMENT mec

h subsidiary relation VAR are used on eac

arguments 10

CORAL uses the EXODUS storage manager to supp ort Y #1 Z #0

VAR

t relations EXODUS uses a clientserver architec VAR p ersisten

FUNCTOR

t pro cess and maintains buers for #0 50 null ture CORAL is the clien

ARGUMENT BINDING

t relations If a requested tuple is not in the client

ENVIRONMENT p ersisten

buer p o ol a request is forwarded to the EXODUS server

the page with the requested tuple is retrieved While

f(X, 10, Y) X −−> 25, Y −−> Z, Z −−> 50 and

the architectural design do es not require any copying of this

Figure Representation of an Example Term

data into other CORAL structures the current implemen

tation do es p erform some copying This is an artifact of the

basic implementation decision to share constants instead of

CORAL allows facts and not just rules to contain variables

copying their values Currently tuples in a p ersistent rela

in this CORAL diers from most other deductive database

tion are restricted to have elds of primitive typ es only At

systems The semantics of a variable in a fact is that the

least in the case of constants of primitive typ es like integers

variable is universally quantied in the fact Although the

this has proven to b e a p o or decision and we are in the

basic representation of variables is fairly simple the repre

pro cess of mo difying the implementation

sentation is complicated by requirements of eciency when

using nonground facts in rules We describ e the problems

Implementation of Index Structures

briey

Hashbased indices for inmemory relations and Btree in

Supp ose we want to make an inference using a rule Vari

dices for p ersistent relations are currently available in the

ables in the rule may get b ound in the course of an inference

CORAL system CORAL allows for the sp ecication of two

A naive scheme would replace every reference to the variable

typ es of hashbased indices argument form indices and

by its binding It is more ecient however to record vari

pattern form indices The rst form is the traditional

able bindings in a binding environment at least during the

multiattribute hash index on a subset of the arguments of

course of an inference A binding environment often re

a relation The hash function chosen works well on ground

ferred to as a bindenv is a structure that stores bindings for

terms however all terms that contain a variable are hashed

variables Therefore whenever a variable is accessed during

to a sp ecial value denoted as var The second form is more

an inference a corresp onding binding environment must b e

sophisticated and allows us to retrieve precisely those facts

accessed to nd if the variable has b een b ound We show the

that match a sp ecied pattern where the pattern can con

representation of the term f X Y where X is b ound to

tain variables Such indices are of great use when dealing

and Y is b ound to Z and Z is b ound to in a separate

with complex ob jects created using functors One can re

bindenv in Figure

trieve for example those tuples in relation append that have

as the rst argument a list that matches X j A tuple

Representation of Relations

j would then b e retrieved

CORAL currently supp orts inmemory hashrelations as

well as p ersistent relations the latter by using the EXO

Overview of Query Evaluation

DUS storage manager Multiple indices can b e created

A numb er of query evaluation strategies have b een develop ed

on relations and can b e added to existing relations The

for deductive and each technique is particularly

relation interface is designed to make the addition of new

ecient for some classes of programs but may p erform rela

relation implementations as sub classes of the generic class

tively p o orly on others It is our premise that in such a p ow

Relation relatively easy

erful language completely automatic optimization can only

b e an ideal the must b e able to provide hints CORAL relations currently only the inmemory versions

or annotations and o ccasionally even override the systems supp ort several features that are not provided by typical

decisions in order to obtain go o d p erformance across a wide database systems The rst and most imp ortant extension

range of programs Since they are expressed at a high level is the ability to get marks into a relation and distingui sh

they give the programmer the p ower to control optimization b etween facts inserted after a mark was obtained and facts

and evaluation in a relatively abstract manner A detailed inserted b efore the mark was obtained This feature is im

description of the annotations provided by CORAL may b e p ortant for the implementation of all variants of seminaive

Page

found in we mention some of them when discussing the deciding whether to rene the basic nestedlo ops join

query evaluation techniques with intel ligent backtracking These asp ects are discussed in

detail in

The CORAL programmer decides on a p ermo dule ba

sis whether to use one of two basic evaluation approaches The optimizer also decides on the subsumption checks to

namely pipelining or materialization which are discussed in b e carried out on each relation The default is to do sub

Section Many other optimizations are dep endent up on sumption checks on all relations A user can ask that a rela

the choice of the basic evaluation mo de The optimizer gen tion b e treated as a multiset with as many copies of a tuple

4

erates annotations that govern many runtime actions and as there are derivations for it in the original program This

if materializati on is chosen do es sourcetosource rewriting semantics is supp orted by carrying out duplicate checks only

of the users program We discuss these two ma jor tasks of on the magic predicates some version of Magic Templates

the optimizer b elow must used

Mo dule Evaluation Strategies

Rewriting Techniques

The evaluation of a declarative CORAL program is divided

Materialized evaluation in CORAL is essentially a xp oint

into a numb er of distinct subcomputations by expressing

evaluation using a b ottomup iteration on the program rules

the program as a collection of mo dules Each mo dule is

If this is done on the original program selections in a query

a unit of compilation and its evaluation strategies are in

are not utilized Several program transformations have b een

dep endent of the rest of the program Mo dules export the

prop osed to propagate such selections and many of these

predicates that they dene a predicate exp orted from one

are implemented in CORAL

mo dule is visible to all other mo dules and can b e used by

The desired selection pattern is sp ecied using a query

them in rules Since dierent mo dules may have widely vary

form where a b ound argument indicates that any binding

ing evaluation strategies some relatively high level interface

in that argument p osition of the query is to b e propagated

is required for interaction b etween mo dules The basic ap

Bindings are propagated by creating magic facts that rep

proach used by CORAL is outlined here

resent the appropriate binding values By sp ecifying that all

During the evaluation of a rule r in mo dule M if we gen

arguments are b ound binding propagation similar to Prolog

erate a query on a predicate exp orted by mo dule N a call

is achieved ie all available bindings are propagated By

is set up on mo dule N The answers to this query are used

sp ecifying that all arguments are free in contrast bindings

iteratively in rule r each time a new answer to the query

in the query are ignored except for a nal selection

is required rule r requests for a new tuple from the inter

The default rewriting technique is Supplementary Magic

face to mo dule N The interface to relations exp orted by a

Templates see also The rewriting can

mo dule makes no assumptions ab out the evaluation of the

b e tailored to propagate bindings across subgoals in a rule

mo dule Mo dule N may contain only base predicates or

b o dy using dierent subgoal orderings CORAL uses a left

may have rules that are evaluated in any of several dier

toright ordering within the b o dy of a rule by default

ent ways The mo dule may cho ose to cache answers b etween

Other selectionpropagati ng rewriting techniques supp orted

calls or cho ose to recompute answers All this is transparent

in CORAL include Magic Templates Supplementary

to the calling mo dule Similarly the evaluation of the called

Magic With GoalId Indexing and Context Factoring

mo dule N makes no assumptions ab out the evaluation of

Supplementary Magic is a go o d choice as a default

calling mo dule M This orthogonality p ermits the free mix

although each technique is sup erior to the rest for some pro

ing of dierent evaluation techniques in dierent mo dules in

grams CORAL also supp orts Existential Query Rewriting

CORAL and is central to how dierent executions in dier

which seeks to propagate pro jections This is applied

ent mo dules are combined cleanly

by default in conjunction with a selectionpushi ng rewriting

Two basic evaluation approaches are supp orted namely

pipelining and materialization Pip elini ng uses facts on

Decisions On Runtime Alternatives

they and do es not store them at the p otential cost of

In addition to cho osing rewriting techniques for material

recomputation Materialization stores facts and lo oks them

ized evaluation the optimizer makes a numb er of decisions

up to avoid recomputation Several variants of material

that aect execution The optimizer analyzes the rewrit

ized evaluation are supp orted Basic SemiNaive Predicate

ten program and identies some evaluation and optimiza

SemiNaive and Ordered Search

tion choices that app ear appropriate

The default xp oint evaluation strategy is called Basic

Mo dule and Rule Data Structures

SemiNaive evaluation BSN but a variant called Predicate

The compilation of a materialized mo dule generates an in

SemiNaive evaluation PSN which is b etter for programs

ternal module structure that consists of a list of structures

with many mutually recursive predicates is also available

4

With resp ect to seminaive evaluation the optimizer is re

On nonrecursive queries this semantics is consistent with

sp onsible for join order selection index selection SQL when duplicate checks are omitted

Page

corresp onding to the strongly connected comp onents SCCs Materialization

5

of the mo dule and each SCC structure contains struc

The variants of materialization are all b ottomup xp oint

tures corresp onding to seminaive rewritten versions of rules

evaluation metho ds Bottomup evaluation iterates on a

These seminaive rule structures have elds that sp ecify the

set of rules rep eatedly evaluating them until a xp oint is

argument lists of each b o dy literal and the predicates that

reached In order to p erform incremental evaluation of rules

they corresp ond to Each seminaive rule also contains eval

across multiple iterations CORAL uses the seminaive eval

uation order information precomputed backtrack p oints

uation technique This technique consists of a rule

and precomputed osets into a table of relations

rewriting part p erformed at compile time which creates ver

sions of rules with delta relations and an evaluation part

A mo dule to b e evaluated using pip elini ng is stored as

The delta relations contain changes in relations since the

a list of predicates dened in the mo dule Asso ciated with

last iteration The evaluation part evaluates each rewritten

each predicate is a list of rules dening it in the order they

rule once in each iteration and p erforms some up dates to

o ccur in the mo dule denition each rule b eing represented

the delta relations at the end of the iteration An evaluation

using structures like those used for seminaive rules

terminates when an iteration pro duces no new facts

The optimizer analyzes the seminaive rewritten rules and

generates annotations to create any indexes that may b e use

Pip elining

6

ful during the evaluation phase The basic join mechanism

For pipelining which is essentially topdown evaluation the

in CORAL is nestedlo ops with indexing In a manner simi

rule evaluation co de is designed to work in a coroutining

lar to Prolog CORAL maintains a trail of variable bindings

fashion when rule evaluation is invoked using the get

when a rule is evaluated this is used to undo variable bind

nexttuple interface it generates an answer if there is one

ings when the nestedlo ops join considers the next tuple in

and transfers control back to the consumer of answers the

any lo op

caller Control is transferred back to the susp ended rule

evaluation when more answers are desired

Mo dule Level Control Choices

At mo dule invo cation the rst rule in the list asso ciated

At the level of the mo dule a numb er of choices exist with

with the queried predicate is evaluated This could involve

resp ect to the evaluation strategy for the mo dule and the

recursive calls on other rules within the mo dule which are

sp ecic optimization s to b e used We describ e the imple

also evaluated in a similar pip elin ed fashion If the rule

mentation of some of these strategies

evaluation of the queried predicate succeeds the state of the

computation is frozen and the generated answer is returned

Ordered Search

A subsequent request for the next answer tuple results in

Ordered Search is an evaluation mechanism that orders the

the reactivation of the frozen computation and pro cessing

use of generated subgoals in a program and thereby provides

continues until the next answer is returned At any stage if

an imp ortant strategy for handling programs with negation

a rule fails to pro duce an answer the next rule in the rule list

setgrouping and aggregation that are lefttoright mo du

for the head predicate is tried When there are no more rules

larly stratied Full details of Ordered Search are not pre

to try the query on the predicate fails When the topmost

sented here but the reader is referred to The principle

query fails no further answers can b e generated and the

of Ordered Search is that the computation is ordered by

pip elined mo dule execution is terminated

hiding subgoals This is achieved by maintainin g a con

There are three imp ortant p oints to note Firstly the

text that stores subgoals in an ordered fashion and that

implementation of pip elini ng which is a radically dierent

decides at each stage in the evaluation which subgoal to

evaluation technique from b ottomup xp oint evaluation

make available for use next The order in which generated

demonstrates the mo dularity of the CORAL implementa

subgoals are made available for use is somewhat similar to

tion Secondly from a language p oint of view it demon

a topdown evaluation

strates that the mo dule mechanism allows a user to eec

From an implementation p ersp ective in addition to main

tively combine b ottomup and topdown evaluation tech

taining the context two changes have to b e made First the

niques in a single program Indeed our implementation

rewriting phase which must use a version of Magic in con

of pip elini ng could b e replaced by an interface to a Prolog

junction with Ordered Search xp oint evaluation must b e

system Thirdly pip elini ng guarantees a particular evalua

mo died to intro duce done literals guarding negated liter

tion strategy and order of execution While the program is

als and rules that have grouping and aggregation Second

no longer truly declarative programmers can exploit this

the evaluation must add a goal magic fact to the corre

guarantee and use predicates like up dates that involve side

sp onding done predicate when and only when all answers

eects

to it have b een generated The context mechanism is used

6

Index generation also o ccurs for pip elined mo dules but at

5

An SCC is a maximal set of mutually recursive predicates the level of the original rules

Page

to determine the p oint at which a goal is considered done

p mo dule s

These changes ensure that rules involving negation for ex

exp ort s pbf f f f f f f

ample are not applied until enough facts have b een com

selection pX Y P C X Y minC aggregate

pX Y P C s p l eng thX Y C pX Y P C s

puted to reduce the negation to a setdierence op eration

s p l eng thX Y min C pX Y P C

pX Y P C pX Z P C edg eZ Y E C

The Save Mo dule Facility

appendedg eZ Y P P C C E C

In most cases facts other than answers to the query com

pX Y edg eX Y C edg eX Y C

puted during the evaluation of a mo dule are b est discarded

mo dule end

to save space since b ottomup evaluation stores many facts

space is generally at a premium Mo dule calls provide a

convenient unit for discarding intermediate answers By de

Figure Program Shortest Path

fault CORAL do es precisely this it discards all interme

diate facts and subgoals computed by a mo dule at the end of

a call to the mo dule However there are some cases where

Indexing Relations

this leads to a signicant amount of recomputation This is

As mentioned in Section CORAL supp orts two forms

esp ecially so in cases where the same subgoal in a mo dule is

of indices argument form indices and pattern form

generated in many dierent invo cations of the mo dule In

indices The rst form creates an index on a subset of the

such cases the user can tell the CORAL system to maintain

arguments of a relation The second form is more sophis

the state of the mo dule ie retain generated facts in b e

ticated and creates an index on a sp ecied pattern that

tween calls to the mo dule and thereby avoid recomputation

can contain variables Supp ose a relation emp had two ar

we call this facility the save module facility

guments the rst a name and the second a complex term

In the interest of ecient implementation we have the

addr S tr eet C ity The following declaration then creates

following restriction on the use of the save mo dule feature

a pattern form index that can eciently retrieve employees

if a module uses the save module feature it should not be

named John who stay in Madison without knowing

invoked recursively We do not make any guarantees ab out

their street

correct evaluation should this happ en at runtime Note

that the predicates dened in the mo dule can b e recursive

make index empN ame addr S tr eet C ity N ame C ity

this do es not cause recursive invo cations of the mo dule

From an implementation p oint of view the challenge is to

The Magic Templates rewriting stage generates annota

ensure that no derivations are rep eated across multiple cal ls

tions to create all indices that are needed for ecient eval

to the module This requires signicant changes to semi

uation The user is allowed to sp ecify additional indices

naive evaluation while the details are omitted here for lack

which is particularly useful if the Magic Templates rewrit

of space they can b e found in

ing stage is bypassed

Lazy Evaluation

Aggregate Selections

In the traditional approach to b ottomup evaluation all an

path program in Figure To com Consider the shor test

swers to a query are computed by iterating over rules till

pute shortest paths b etween p oints it suces to use only the

a xp oint is reached and then returning all the answers

shortest paths b etween pairs of p oints path facts that do

Lazy evaluation tries to return the answers at the end of

not corresp ond to shortest paths are irrelevant CORAL p er

every iteration instead of at the end of computation Lazy

mits the user to sp ecify an aggregate selection on the predi

evaluation is implemented by storing the state of the compu

cate path in the manner shown The system then checks at

tation at the end of an iteration and returning the answer

runtime if a path fact is such that there is a path fact of

tuples generated in that iteration The state is stored with

lesser cost C with the same value for X Y ie b etween the

the iterator that is created for the query recall the get

same pair of p oints and if there is such a fact the costlier

nexttuple iterative interface The iterator then iterates

path fact is discarded This aggregate selection is extremely

over the tuples returned and when it has stepp ed through

imp ortant for eciency without it the program may

all the tuples it reactivates the frozen computation that it

run for ever generating cyclic paths of increasing length

has stored This reactivation results in the execution of one

With this aggregate selection along with the choice annota

more iteration of the rules and the whole pro cess is rep eated

tion aggregate selection pathX Y P C X Y C any P

until an iteration over the rules pro duces no new tuples

a single source query on the program runs in time O E V

where there are E edge facts and V no des in the graph

Predicate Level Control

CORALs aggregate selection mechanism can also b e used

CORAL provides a variety of annotations at the level of

to provide a version of the choice op erator of LDL but with

individu al predicates in a mo dule We discuss a couple of

altogether dierent semantics

them in this section

Page

InterMo dule Calls descriptors and a suite of asso ciated metho ds In addition

there is a construct to emb ed CORAL commands in C

The interaction b etween mo dules merits some discussion

co de This extended C can b e used in conjunction with

Supp ose that p is a predicate that app ears in the b o dy of a

the declarative language features of CORAL in two distinct

rule of mo dule M Evaluation within a rule pro ceeds leftto

7

ways

right and can b e thought of as a nestedlo ops join While

Relations can b e computed in a declarative style using

this is not entirely accurate with resp ect to pip elined evalu

declarative mo dules and then manipulated in imp er

ation it is an accurate enough description for our purp oses

ative fashion in extended C without breaking the

When evaluation reaches the p literal a scan is op ened on p

relation abstraction In this mo de of usage typically

A p tuple retrieved by the scan is used to instantiate the rule

there is a main program written in C that calls up on

8

When evaluation returns to the p literal on backtracking

CORAL for the evaluation of some relations dened in

the scan on p is advanced to get the next p tuple

CORAL mo dules The main program is compiled after

This getnexttuple interface to a relation p via a scan is

some prepro cessing and executed from the op erating

the only interface presented to M by any relation regardless

system command prompt the CORAL interactive in

of the nature of the relation If p is dened in mo dule M

terface is not used

as a derived relation the interface is still the same as if

New predicates can b e dened using extended C

p were a base relation We emphasize that the user need

These predicates can b e used in declarative CORAL

not b e concerned ab out the details of how getnexttuple

co de and are incrementally loaded from the CORAL

requests are generated This is just an abstraction of how

interactive command interface There are however

the evaluation pro ceeds and is presented here for clarity of

some restrictions on the typ es of arguments that can

exp osition

b e passed to the newly dened predicates

An imp ortant consequence of this interface is that if M is

Thus declarative co de can call extended C co de and

materialized then p is fully evaluated as evaluation rep eat

viceversa The ab ove two mo des are further discussed in

edly reaches the p literal up on backtracking More precisely

the following sections

the part of p that is relevant to this p literal is fully evalu

ated Thus the following rule governs intermo dule calls

Extensions to C

C has b een extended by adding a collection of classes

The cal ling module wil l wait until the cal led module

and asso ciated metho ds The new classes are

returns answers to the subquery The cal led mod

Relation This allows access to relations from C Re

ule presents a scanlike interface and returns al l

lation values can b e constructed through a series of ex

answers to the subquery upon repeated getnext

plicit inserts and deletes or through a call to a declar

tuple requests

ative CORAL mo dule The asso ciated metho ds al

low manipulati on of relation values from C without

This is indep endent of the evaluation mo des of the two mo d

breaking the relation abstraction

ules involved The p oint at which the called mo dule returns

answers however dep ends on its evaluation mo de

Tuple A relation is a collection set or multiset of

tuples

If the called mo dule is pip elined an answer is returned as

so on as it is found and the computation of the called mo dule

Arg A tuple in turn is a list of args ie arguments A

is susp ended until another answer is requested by the caller

numb er of metho ds are provided to construct and take

The use of certain features such as save mo dule and ag

apart arguments and argument lists

gregate selections can result in all answers b eing computed

C ScanDesc This abstraction supp orts relational scans

b efore any answers are returned by the called mo dule Oth

in C co de A C ScanDesc ob ject is essentially a

erwise answers are returned at the end of each xp oint it

cursor over a relation

eration in the called mo dule further iterations are carried

In addition to the new classes any sequence of commands

out if more answers are requested by the calling mo dule At

that can b e typ ed in at the CORAL interactive command

the level of the topmost query this results in answers b eing

interface can b e emb edded in C co de bracketed by sp e

available at the end of each iteration

cial delimiters A le containing C co de with emb edded

CORAL co de must rst b e passed through a CORAL pre

Interface with C

pro cessor and then compiled using a standard C com

The CORAL system has b een integrated with C in or

piler One restriction in the current interface is that a very

der to supp ort a combination of imp erative and declarative

limited abstraction of variables is presented to the user

programming styles We have extended C by providing

Variables can b e used as selections for a query say via

a collection of new classes relations tuples args and scan

rep eated variables or in a scan but variables cannot b e

returned as answers ie the presence of nonground terms

7

More generally in a user sp ecied order

8

is hidden at the interface Presenting the abstraction of

Backtracking is only intrarule unless evaluation in M is

pip elined nonground terms would require that binding environments

Page

b e provided as a basic abstraction and this would make the from a list of arguments used to recreate ob jects given a

interface rather complex printed representation hash to return a hash value and

some memory management functions For a summary of

the virtual metho ds that constitute the abstract data typ e

Dening New Predicates

interface see In addition to creating the abstract

As we have already seen predicates exp orted from one

data typ e the user can dene predicates to manipulate and

CORAL mo dule can b e used freely in other mo dules Some

p ossibly display in novel ways ob jects b elonging to the ab

times it may b e desirable to dene a predicate using ex

stract data typ es These predicates must b e registered with

tended C rather than the declarative language sup

the system registration is accomplished by a single com

p orted within CORAL mo dules A coral export statement

mand

is used to declare the arguments of the predicate b eing de

ned The CORAL primitive typ es are the only typ es that

Extensibilty of Access Structures

coral exp ort declaration userdened typ es can b e used in a

CORAL currently supp orts relations organized as linked

are not allowed It is imp ortant to note that the CORAL

lists relations organized as hash tables relations dened by

prepro cessor currently do es no typ e checking or even at

rules and relations dened by C functions The interface

tempt to check if the exp orted function is dened in the le

co de to relations makes no assumptions ab out the structure

it op erates purely at a syntactic level

of relations and is designed to make the task of adding new

The exp ort mechanism makes it easy to pass values of

relation implementations easy The getnexttuple inter

these limited typ es b etween CORAL and C co de The

face b etween the query evaluation system and a relation is

predicate denition can use all features of extended C

the basis for adding new relation implementations and index

The source le is prepro cessed into a C le and com

implementations in a clean fashion The implementation of

piled to pro duce a o le If this le was consulted from the

p ersistent relations using EXODUS illustrates the utility of

CORAL prompt then it is loaded into a newly allo cated re

such extensibility Section

9

gion in the data area of the executing CORAL system It is

also p ossible to directly consult a prepro cessed C le or o

Related Systems

le and avoid rep eating the prepro cessing and compilation

There are many similari ties b etween CORAL and deductive

steps

database systems such as Aditi EKSV LDL

GlueNAIL Starburst SQL DECLARE

Extensibility in CORAL

ConceptBase and LOLA However there are several

The implementation of the declarative language of CORAL

imp ortant dierences and CORAL extends all the ab ove

is designed to b e extensible The user can dene new ab

systems in the following ways

stract data typ es new relation implementation s or new in

CORAL supp orts a larger class of programs includ

dexing metho ds and use the query evaluation system with

ing programs with nonground facts and nonstratied

no or in a few cases minor changes The users program

negation and setgeneration

will of course have to b e compiled and linked with the sys

CORAL supp orts a wide range of evaluation tech

tem co de We assume a set of standard op erations on data

niques and gives the user considerable control over the

typ es and all abstract data typ es must provide these op er

choice of techniques

ations as C virtual metho ds

CORAL is extensible new data and relation typ es

and index implementation s can b e added without mo d

Extensibility of Data Typ es

ifying the rest of the system

The typ e system in CORAL is designed to b e extensible

EKSV supp orts integrity constraint checking hyp othet

the class mechanism and virtual metho ds provided by C

ical reasoning and provides some supp ort for nonstratied

help make extensibili ty clean and lo cal Lo cality refers to

aggregation ConceptBase supp orts DATALOG along

the ability to extend the typ e system by adding new co de

with lo cally stratied negation but no setgeneration sev

without mo difying existing system co de the changes are

eral ob jectoriented features integrity constraint checking

thus lo cal to the co de that is added All abstract data typ es

and provides a oneway interface to CProlog ie the im

should have certain virtual metho ds dened in their inter

p erative language can call ConceptBase but not vice versa

face and all system co de that manipulates ob jects op erates

LOLA supp orts stratied programs integrity constraints

only via this interface This ensures that the query eval

several join strategies and some supp ort for typ e informa

uation system do es not need to b e mo died or recompiled

tion The host language of LOLA is Lisp and it is linked

when a new abstract data typ e is dened The required

to the relational database Aditi gives primary

metho ds include the metho d eq ual s that is used to check if

imp ortance to diskresident data and supp orts several join

two ob jects are equal the metho d pr int for printing the ob

strategies

ject the metho d constr uct that is used to create new ob jects

Unlike GlueNAIL and LDL where mo dules have only a

9

That is the new co de is incrementally loaded into CORAL compiletime meaning and no runtime meaning mo dules in

Page

deductive databases and logic rules The overall archi CORAL have imp ortant runtime semantics in that several

tecture was reasonably successful in breaking the prob runtime optimizations are done at the mo dule level Mo d

lem of query pro cessing into relatively orthogonal tasks ules with runtime semantics are also available in several

pro duction rule systems for example RDL LDL

a successor to LDL under development at MCC Austin is

On the negative side some p o or decisions were made and

rep ortedly also moving in the direction taken by CORAL in

some issues were not addressed adequately

many resp ects It will b e partially interpreted supp ort ab

Typ e Information CORAL makes no eort to use typ e

stract data typ es and use a lo cal semantics for choice Carlo

information in its pro cessing No typ e checking or infer

Zaniolo p ersonal communication XSB is a system b eing

encing is p erformed at compiletime and errors due to

develop ed at SUNY Stony Bro ok It will supp ort several

typ e mismatches lead to subtle runtime errors Typing

features similar to CORAL such as nonground terms and

is a desirable feature esp ecially if the language is to b e

mo dularly stratied set grouping and negation Program

used to develop large applications This is one of the is

evaluation in XSB will use OLDTNF which has b een imple

sues addressed by a prop osed extension to CORAL

mented by mo difying the WAM David S Warren p ersonal

communication DECLARE and SDS are early eorts to

Memory Management In an eort to make the sys

commercialize deductive database technology

tem as ecient as p ossible for mainmemory op erations

In comparison to logic programming systems such as var

copying of data has largely b een replaced by p ointer

ious implementations of Prolog CORAL provides b etter in

sharing While this do es make evaluation more e

dexing facilities and supp ort for p ersistent data Most im

cient it requires extensive memory management and

p ortantly the declarative intended mo del semantics is sup

garbage collection Also p ointer based copying is p er

p orted for all p ositive Horn clause programs and a large

formed even for primitive data typ es such as integers

class of programs with negation and aggregation as well

There are a numb er of directions in which CORAL could

Conclusions

b e and in some cases needs to b e extended These include

The CORAL pro ject is at a stage where one version of the

b etter supp ort for p ersistent data improved memory man

system has b een released in the public domain and an en

agement enhanced C interface features ob jectoriented

hanced version will so on b e released The eects of several

extensions and supp ort for constraints While p erformance

design decisions are b ecoming increasingl y evident On the

measurements of a preliminary nature have b een made an

p ositive side most of the decisions we made seem to have

extensive p erformance evaluation of CORAL b oth to eval

paid o with resp ect to simplicity and ease of ecient im

uate various asp ects of the system and to compare it with

plementation

other systems also needs to b e p erformed

Mo dular Design The concept of mo dules in CORAL

was in many ways the key to the successful implemen

tation of the system Given the ambitious goal of com

bining many evaluation strategies controlled by user

hints in an orthogonal fashion the mo dule mechanism

Acknowledgements

app ears to have b een the ideal approach

We would like to acknowledge our debt to Aditi EKSV

Annotations It has b een our exp erience in practice that

LDL NAIL SQL Starburst and various implementations

often the discerning user is able to determine go o d con

of Prolog from which we have b orrowed numerous ideas We

trol strategies that would b e extremely dicult if not

would like to acknowledge the contributions of the follow

imp ossible for a system to do automatically Hence the

ing p eople to the CORAL system Per Bothner who was

strategy of allowing the users to express control choices

largely resp onsible for the initial implementation of CORAL

was a convenient approach to solving an otherwise dif

that served as the basis for subsequent development was

cult problem

a ma jor early contributor Joseph Alb ert worked on some

Extensibility The decision to design an extensible sys

asp ects of the setmanipulation co de Tarun Arora imple

tem seems to have help ed greatly in keeping our co de

mented several utilities and builtin libraries Tom Ball im

clean and mo dular in addition to its utility from an

plemented an early prototyp e of the seminaive evaluation

application development p ersp ective

system Laichong Chan did the initial implementation of

existential query optimization Sumeer Goyal implemented System Architecture The architecture concentrated on

emb edded CORAL constructs in C Vish Karra imple the design of a single user database system leaving is

mented pip elini ng Rob ert Netzer did the initial implemen sues like transaction management concurrency control

tation of Magic rewriting and Bill Roth created test suites and recovery to b e handled by the EXODUS to olkit

implemented the Explanation to ol along with Tarun and Thus CORAL could build on these facilities that were

added userinterface features already available and fo cus instead on the subtleties of

Page

References J F Naughton R Ramakrishnan Y Sagiv and J D Ull

man Argument reduction through factoring In Proceed

I Balbin and K Ramamohanarao A generalization of the ings of the Fifteenth International Conference on Very Large

dierential approach to recursive query evaluation Journal Databases pages Amsterdam The Netherlands

of Logic Programming Septemb er

August

G Phipps M A Derr and K A Ross GlueNAIL A de

F Bancilhon Naive evaluation of recursively dened rela

ductive database system In Proceedings of the ACM SIG

tions In Bro die and Mylop oulos editors On Know ledge

MOD Conference on Management of Data pages

Base Management Systems Integrating Database and AI

Systems SpringerVerlag

R Ramakrishnan Magic Templates A sp ellbinding ap

C Beeri and R Ramakrishnan On the p ower of Magic

proach to logic programs In Proceedings of the International

In Proceedings of the ACM Symposium on Principles of

Conference on Logic Programming pages Seattle

Database Systems pages San Diego California

Washington August

March

R Ramakrishnan C Beeri and R Krishnamurthy Op

M Carey D DeWitt J Richardson and E Shekita Ob ject

timizing existential Datalog queries In Proceedings of the

and le management in the EXODUS extensible database

ACM Symposium on Principles of Database Systems pages

system In Proceedings of the International Conference on

Austin Texas March

Very Large Databases Aug

R Ramakrishnan P Bothner D Srivastava and S Su

D Chimenti R Gamb oa R Krishnamurthy S Naqvi

darshan CORAL A database programming language In

S Tsur and C Zaniolo The LDL system prototyp e IEEE

J Chomicki editor Proceedings of the NACLP Work

Transactions on Know ledge and Data Engineering

shop on Deductive Databases Octob er Available as

Rep ort TRCS Department of Computing and Infor

mation Sciences Kansas State University

B Freitag H Schutz and G Sp echt LOLA a logic lan

guage for deductive databases and its implementation In

R Ramakrishnan P Seshadri D Srivastava and S Sudar

Proceedings of nd International Symposium on Database

shan The CORAL user manual A tutorial intro duction to

Systems for Advanced Applications DASFAA

CORAL Manuscript

E Goto Mono copy and asso ciative algorithms in an ex

R Ramakrishnan D Srivastava and S Sudarshan Rule

tended lisp Technical Rep ort Information Science

ordering in b ottomup xp oint evaluation of logic programs

Lab oratory Univ of Tokyo Tokyo Japan May

IEEE Transactions on Know ledge and Data Engineering to

appear A shorter version app eared in VLDB

M Jeusfeld and M Staudt Query optimization in deductive

R Ramakrishnan D Srivastava and S Sudarshan Control

ob ject bases In G JC Freytag G Vossen and D Maier

ling the search in b ottomup evaluation In Proceedings of

editors Query Processing for Advanced Database Applica

the Joint International Conference and Symposium on Logic

tions MorganKaufmann

Programming

D Kemp K Ramamohanarao and Z Somogyi Right left

R Ramakrishnan D Srivastava and S Sudarshan

and multilinear rule transformations that maintain con

CORAL Control Relations and Logic In Proceedings of the

text information In Proceedings of the International Con

International Conference on Very Large Databases

ference on Very Large Databases pages Brisbane

R Ramakrishnan D Srivastava and S Sudarshan The

Australia

Save Mo dule facility in CORAL Manuscript

G Kiernan C de Maindreville and E Simon Making de

R Ramakrishnan and S Sudarshan TopDown vs Bottom

ductive database a practical technology a step forward In

Up Revisited In Proceedings of the International Logic Pro

Proceedings of the ACM SIGMOD Conference on Manage

gramming Symposium

ment of Data

J Rohmer R Lesco eur and J M Kerisit The Alexander

W Kiessling DECLARE and SDS Early eorts to com

metho d a technique for the pro cessing of recursive axioms

mercialize deductive database technology Submitted to the

in deductive database queries New Generation Computing

VLDB Journal

A Lefebvre Towards an ecient evaluation of recursive ag

H Seki On the p ower of Alexander templates In Proc

gregates in deductive databases In Proceedings of the In

of the ACM Symposium on Principles of Database Systems

ternational Conference on Fifth Generation Computer Sys

pages

tems June

D Srivastava R Ramakrishnan P Seshadri and S Su

K Morris J D Ullman and A Van Gelder Design overview

darshan CORAL Adding ob jectorientation to a logic

of the NAIL system In Proceedings of the Third Interna

database language Submitted

tional Conference on Logic Programming

S Tsur and C Zaniolo LDL A logicbased datalanguage

I S Mumick H Pirahesh and R Ramakrishnan Dupli

In Proceedings of the Twelfth International Conference on

cates and aggregates in deductive databases In Proceed

Very Large Data Bases pages Kyoto Japan August

ings of the Sixteenth International Conference on Very Large

Databases Aug

J Vaghani K Ramamohanarao D Kemp Z Somogyi and

S Naqvi and S Tsur A Logical Language for Data and P Stuckey The Aditi deductive database system In Pro

Know ledge Bases Principles of Computer Science Com ceedings of the NACLP Workshop on Deductive Database

puter Science Press New York Systems

Page

L Vieille P Bayer V Kuc henho and A Lefebvre EKS

V a short overview In AAAI Workshop on Know ledge

Base Management Systems

Page