The Intelligent Interface:

Integrating AI and Database Systems



Donald P. McKay and Timothy W. Finin and Anthony O'Hare

Unisys Center for Advanced Information Technology

Paoli, Pennsylvania

[email protected] and [email protected]

Abstract its tuple-at-a-time inference mechanisms. The IDI has

also b een used to implement a query server supp orting

The Intel ligent Database Interface IDI is a cache-based

a database used for an Air Travel Information System

interface that is designed to provide Arti cial Intelligence

which is accessed byaspoken language system imple-

systems with ecient access to one or more on one

[ ]

or more remote database management systems DBMSs. mented in Prolog Dahl, et. al., 1990 .

It can b e used to interface with a wide variety of di erent

In addition to providing ecient access to remote

DBMSs with little or no mo di cation since SQL is used to

DBMSs, the IDI o ers several other distinct advan-

communicate with remote DBMSs and the implementation

tages. It can b e used to interface with a wide vari-

of the IDI provides a high degree of p ortability. The query

ety of di erent DBMSs with little or no mo di cation

language of the IDI is a restricted subset of function-free

since SQL is used to communicate with the remote

Horn clauses which is translated into SQL. Results from the

DBMS. Also, several connections to the same or di er-

IDI are returned one tuple at a time and the IDI manages

ent DBMSs can exist simultaneously and can b e kept

a cache of result relations to improve eciency. The IDI is

active across anynumb er of queries b ecause connec-

one of the key comp onents of the Intel ligent System Server

tions to remote DBMSs are abstract ob jects that are

ISS knowledge representation and reasoning system and is

also b eing used to provide database services for the Unisys

managed as resources by the IDI. Finally, accessing

sp oken language systems program.

schema information is handled automatically by the

IDI, i.e., the application is not required to maintain

up-to-date schema information for the IDI. This signif-

Intro duction

icantly reduces the p otential for errors intro duced by

The Intel ligent Database Interface IDI is a p ortable,

stale schema information or by hand entered data.

cache-based interface designed to provide arti cial intel-

The IDI can b e viewed as a stand-alone DBMS inter-

ligence systems in general and exp ert systems in par-

face which accepts queries in the form of IDIL clauses

ticular with ecient access to one or more databases

and returns the result relation as a set of tuples i.e.,

on one or more remote database management systems

a list of Lisp atoms and/or strings. IDIL queries are

[

DBMS which supp ort SQL Chamb erlin, et. al.,

translated into SQL and sent to the appropriate DBMS

]

1976 . The of the IDI is the Intelligent

for execution. The results from the DBMS are then

[ ]

Database Interface Language IDIL O'Hare, 1989 and

transformed by the IDI into tuples of Lisp ob jects. Al-

is based on a restricted subset of function-free Horn

though the IDI was not designed to b e used directly

clauses where the head of a clause represents the tar-

by a user, the following descriptions will b e couched

get list i.e., the form of the result relation and the

in terms of using the IDI as a stand-alone system so

b o dy is a conjunction of literals which denote database

that wemayavoid complicating our discussions with

relations or op erations on the relations and/or their at-

the details of an AI system such as the ISS.

tributes e.g., negation, aggregation, and arithmetic op-

The design of the IDI was heavily in uenced by pre-

erations.

[

vious research in the area of AI/DB integration Kellog,

The IDI is one of the key comp onents of the In-

et. al., 1986, O'Hare, 1987, O'Hare and Travis, 1989,

[ ]

tel ligent System Server ISS Finin, et. al., 1989

]

O'Hare and Sheth, 1989 . One of the more signi cant

[ ]

which is based on Protem Fritzson and Finin, 1988

design criteria that this lead to is the supp ort of non-

and provides a combined logic-based and frame-based

trivial queries in IDIL. That is, to allow for queries in-

knowledge representation system and supp orts forward-

volving more than just a single database relation. This

chaining, backward-chaining, and truth maintenance.

capability allows the AI system to o -load computa-

The IDI was designed to b e compatible with the logic-

tions that are more eciently pro cessed by the DBMS

based knowledge representation scheme of the ISS and

instead of the AI system e.g., join op erations. In many



cases, this also has the e ect of reducing the size of data

current address: IBM, ResearchTriangle Park, North

Carolina set that is returned by the DBMS.

the AI system is extended with DBMS capabilities to

vide ecient access to, and management of, large

AI AI DB pro

ts of stored data. In general, such systems do

DB amoun

hnology. Rather, the

Extending the AI system Loose coupling not incorp orate full DBMS tec

emphasis is on the AI system and the DBMS capabil-

ities are added in an ad hoc and limited manner, e.g.,

DB AI Interface DB

[ ]

al., 1986 implements only the data access

AI Ceri, et.

yer. Alternatively, a new generation knowledge-based

Extending the DB system Enhanced AI/DB interface la

[ ]

system such as LDL Chimenti, et. al., 1987 maybe

constructed. In either case, this approach e ectively

Figure 1: Of the four alternative approaches to AI/DB inte-

involves \re-inventing" some or all of DBMS technol-

gration, the Intel ligent Database Interface is an example of

ogy. While such systems typically provide sophisticated

an enhanced AI/DB interface.

to ols and environments for the development of applica-

tions such as exp ert systems, they can not readily make

use of existing databases. Thus, the developmentofAI

While the IDI is to some small degree system dep en-

applications whichmust access existing databases will

dent, it do es o er a high degree of p ortability b ecause

b e exceedingly dicult if not imp ossible e.g., when the

it is implemented in Common Lisp, communicates with

database is routinely accessed and up dated via more

remote DBMSs using SQL and standard UNIX pip es,

traditional kinds of applications.

and represents IDIL queries and their results as Com-

mon Lisp ob jects.

Extending the DBMS System: This approach

extends a DBMS to provide knowledge representation

In the following sections we present a brief overview

[

and reasoning capabilities, e.g., POSTGRES Stone-

of the area of AI/DB integration which represents a

]

breaker, et. al., 1987 . Here, the DBMS capabilities are

large part of the motivation for the IDI, a discussion

the central concern and the AI capabilities are added in

of some of the more signi cant features of the IDI, the

an ad hoc manner. The knowledge representation and

organization and ma jor comp onents of the IDI, and -

reasoning capabilities are generally quite limited and

nally an example of how the IDI is b eing used in two

they lack the sophisticated to ols and environments of

applications.

most AI systems. Such systems do not directly sup-

p ort the use of existing DBMSs nor can they directly

AI/DB Integration

supp ort existing AI applications e.g., exp ert systems

The integration of AI and DBMS technologies promises

without substantial e ort on the part of the applica-

to play a signi cant role in shaping the future of com-

tion develop er. In some sense, this is the opp osite of

[ ]

puting. As noted in Bro die, 1988 , AI/DB integration

the previous approach.

is crucial not only for next generation computing but

Lo ose Coupling: The lo ose coupling approachto

also for the continued development of DBMS technology

AI/DB integration uses a simple interface b etween the

and for the e ective application of muchofAItechnol-

twotyp es of systems to provide the AI system with ac-

ogy.

[

cess to existing databases, e.g., KEE-connection Abar-

While b oth DBMS and AI systems, particularly ex-

]

banel and Williams, 1986 . While this approach has

p ert systems, representwell established technologies,

the distinct advantage of integrating existing AI sys-

research and development in the area of AI/DB inte-

tems and existing DBMSs, the relatively low level of

gration is comparatively new. The motivations driv-

integration results in p o or p erformance and limited use

ing the integration of these two technologies include

of the DBMS by the AI system. In addition, access

the need for a access to large amounts of shared

to data from the database, as well as the data itself,

data for knowledge pro cessing, b ecient manage-

is p o orly integrated into the representational scheme of

ment of data as well as knowledge, and c intelli-

the AI system. The highly divergent metho ds repre-

gent pro cessing of data. In addition to these moti-

senting data e.g., relational data mo dels vs. frames is

vations, the design of IDI was also motivated by the

generally left to the application develop er or knowledge

desire to preserve the substantial investment repre-

engineer with only minimal supp ort from the AI/DB

sented by most existing databases. To that end, a

interface.

key design criterion for IDI was that it supp ort the

use of existing DBMSs as indep endent system comp o-

Enhanced AI/DB Interface: The last approach

nents. As illustrated in Figure 1 and describ ed b elow,

to AI/DB integration represents a substantial enhance-

several general approaches to AI/DB integration have

ment of the lo osely coupled approach and provides a

b een investigated and rep orted in the literature e.g.,

more p owerful and ecientinterface b etween the two

[

Bo cca, 1986, Chakravarthy, et. al., 1982, Chang, 1978,

typ es of systems. As with the previous approach, this

Chang and Walker, 1984, Jarke, et. al., 1984, Li, 1984,

metho d of AI/DB integration allows immediate advan-

Minker, 1978, Morris, 1988, Naish and Thom, 1983,

tage to b e taken of existing AI and DB technologies

]

Reiter, 1978, Van Buer, et. al., 1985 .

as well as future advances in them. The problems of

p erformance and under-utilization of the DBMS by the Extending the AI System: In this approach,

 The AI system generates many DBMS queries that tend AI system are handled with di ering degrees of suc-

to yield, on average, small results and the AI system uses

cess rst by increasing the functionality of the interface

most or all of the result.

itself, and then if necessary, enhancing either the AI

system or the DBMS. For example, the BERMUDA

[ ]

system Ioannidis, et. al., 1988 uses a form of result

In the rst case, it would b e b est to avoid the cost

caching to improve eciency and p erforms some sim-

of pro cessing the entire result by using demand-driven

ple pre-analysis of the AI application to identify join

techniques to pro duce only one tuple at a time from

op erations that can b e p erformed by the DBMS rather

the result stream of the DBMS. However, this requires

[

than the AI system. The BrAID system O'Hare and

that separate connections b e created for each DBMS

]

Sheth, 1989 is similar except that it supp orts more gen-

query.Thus the overhead of creating such connections

eral caching and pre-analysis capabilities and allows for

must b e less than the cost of pro cessing the entire result

exp erimentation with di erent inference strategies.

relation.

The IDI is an interface that can b e used to facili-

tate the development of AI/DB systems using this last

In the second case, it would b e b est to avoid the cost

of creating numerous connections to the DBMS by using approach. That is, the IDI is cache-based interface to

DBMSs and is designed to b e easily integrated into var-

a single connection for multiple queries. However, this

ious typ es of AI systems. The design of the IDI also al-

requires that the entire result of each query b e pro cessed

lows it to b e used as an interface b etween DBMSs and

so that successive queries can b e run using the same

other typ es of applications such as database browsers

connection. The cost of pro cessing DBMS results i.e.,

and general query pro cessors.

reading the entire result stream and storing it lo cally

must b e less than the cost of creating a new connection

for each DBMS query.

Design Features

The IDI supp orts several features which simplify its use

For most systems, it seems reasonable to assume that

in an AI system. These include

the total cost for creating a new DBMS connection will

b e relatively high. Thus, using the same connection for

 Connections to a DBMS are managed transparently so

di erent DBMS queries would result in a net savings. that there can b e multiple active queries to the same

database using a single op en connection.

While sp eci c break-even p oints could b e estimated, it

 Connections to a given database are op ened up on de-

is not clear one need go that far since there are other

mand, i.e. at rst use instead of requiring an explicit

reasons for minimizing the numb er of DBMS connec-

database op en request.

tions that are op en at the same time. Foremost among

 information is loaded from the database

these is the limit that most op erating systems haveon

either when the database is op ened or when queries re-

the numb er of streams that can b e op en simultaneously.

quire schema information based up on user declarations.

This can severely limit the numb er of DBMS connec-

 The query interface is a logic-based language but uses

tions that we can have at one time. If one is also inter-

user supplied functions to declare and recognize logic vari-

ested in allowing connections to di erent databases, on

ables.

either the same or a di erent DBMS, then it is imp or-

 Results of queries to a DBMS are cached, improving

tant to minimize the numb er of op en connections for a

the overall p erformance system and the cache is accessed

single database.

transparently by a query manager.

Yet another consideration is the use of caching for

All but the last of the features are describ ed in this sec-

DBMS results. That is, if DBMS results can b e cached

tion. The cache system and initial p erformance results

lo cally by the AI system or an agent serving it then all

are describ ed in subsequent sections.

of the DBMS results will probably b e pro cessed by the

caching mechanism. Thus, the rst alternative where

Making Connections

it is assumed that the DBMS results will not, in general,

As suggested ab ove, there are numerous approaches to

b e totally consumed is no longer applicable.

interfacing an AI system with existing DBMSs. How-

ever, the basic alternatives involve balancing the costs

In light of these constraints and requirements, it

of creating the connection to the DBMS and of pro cess-

seems b est to minimize the numb er of DBMS connec-

ing the result relations from a DBMS query. Deciding

tions that can b e op en simultaneously. Brie y, the

which alternative is the b est requires knowledge ab out

approach taken in the IDI is to op en a connection

the typical b ehavior of the AI system as well as other,

when a DBMS query is encountered against a database

more obvious factors, such as communication overhead

for which no connection exists and pro cess the result

[ ]

cf. O'Hare and Sheth, 1989 . Consider the following

stream one tuple at a time until and unless another

two mo des of interaction b etween an AI system and a

DBMS query on the same database is encountered. At

DBMS:

that p oint, the new query is sent to the DBMS, the re-

mainder of the result stream for the previous query is

 The AI system generates a few DBMS queries that tend

consumed and stored lo cally, and then the new result

to yield very large results and the AI system uses only a

stream is pro cessed one tuple at a time as b efore. fraction of the result.

Get supplier names for suppliers who do not supply part p2.

Automating Access to Schema Information

ans _Sname

One of the key features of the IDI is the automatic

<-

management of database schema information. The user

supplier _Sno _Sname _Status _City

or application program is not required to provide any

not supplier_part _Sno "p2" _Qty

schema information for those database relations that

are accessed via IDIL queries. The IDI assumes the

Get supplier names and quantity supplied for suppliers that

resp onsibility for obtaining the relevantschema infor-

supply more than 300 units of part p2.

mation from the appropriate DBMS. This provides sev-

eral signi cant advantages over interfaces which rely on

ans _Sname _Qty

<-

the user to provide schema information. Most imp or-

supplier _Sno _Sname _Status _City

tantly, the schema information will necessarily b e con-

supplier_part _Sno "p2" _Qty

sistent with that stored in the DBMS and thus any

> _Qty 300

errors intro duced by hand-co ding the schema informa-

tion are eliminated. The only exception to this o ccurs

Figure 2: Two example IDIL queries using the \suppliers"

when the schema on the DBMS is mo di ed after the

" character have b een database. Symb ols b eginning with a \

IDI has accessed it since the IDI caches the schema

declared to b e logic variables.

information and thus maintains a private copy of it.

While this stale data problem exists for any system

which maintains a separate copy of the schema informa-

of inputs or requests to the IDI: a a database dec-

tion, the IDI provides a simple mechanism for forcing

laration; b an IDIL query and subsequent retrieval

the schema information to b e up dated. In addition,

requests against the result of an IDIL query; and c

this approach greatly facilitates the implementation of

advice to the Cache Manager.. Database declarations

database browsers since users need not know the names

convey simple information ab out a given database, e.g.,

or structure of relations stored in a particular database.

the typ e of the DBMS on which the database resides

and the host machine for the DBMS. For each IDIL

Logical Glue

query, the IDI returns a generator which can b e used to

retrieve the result relation of an IDIL query one tuple at

Another signi cant feature of the IDI is the relative

a time. The IDI also supp orts other typ es of requests,

ease with which it can b e integrated with di erentAI

e.g., access to schema information, which are describ ed

systems. Aside from the use of Common Lisp as the

[ ]

elsewhere O'Hare, 1989 .

implementation language for the IDI, this is achieved

The Schema Manager SM is resp onsible for manag-

by employing a logic-based language as the query lan-

ing the schema information for all declared databases

guage for the IDI. The language, IDIL, may b e used

and it supplies the Query Manager with schema infor-

as a totally indep endent query language or, more im-

mation for individual database relations. This entails

p ortantly,itmay b e more closely integrated with the

pro cessing database declarations, accessing and storing

knowledge representation language of a logic-based AI

schema information for declared databases, and man-

system. In the later case, the key is to allow the IDI to

aging relation name aliases which are used when twoor

share the same de nition of a logic variable as the host

more databases contain relations with identical names.

AI system. This is accomplished b e simply rede ning a

Whenever a connection to a database is created, the SM

small set of functions within the IDI which are used to

automatically accesses the list of relation names that

recognize and create instances of logic variables.

are contained within the database. This list is then

1

The IDIL query language is a restricted subset of

cached for later access in the event that the connection

function-free Horn clauses where the head of a clause

is closed and re-op ened at some later time. In this event

represents the target list i.e., the form of the result

the SM will only access the DBMS schema information

relation and the b o dy is a conjunction of literals which

if it is explicitly directed to do so, otherwise the cached

denote database relations or op erations on the relations

list of relation names will b e used.

and/or their attributes e.g., negation, aggregation, and

The DBMS Connection Manager DCM manages all

arithmetic op erations. Figure 2 shows some example

database connections to remote DBMSs. This includes

queries.

pro cessing requests to op en and close database connec-

tions as well as p erforming all the low-level I/O op era-

IDI Organization

tions asso ciated with the connections. Within the IDI,

each database has at most one active connection asso ci-

As Figure 3 illustrates, there are four main comp onents

ated with it and each connection has zero or more query

which comprise the IDI | the Schema Manager, the

result streams or generators asso ciated with it but only

DBMS Connection Manager, the Query Manager, and

one generator may b e active.

the Cache Manager. There are three principal typ es

The Query Manager QM is resp onsible for pro cess-

1

ing IDIL queries and managing their results. IDIL

\IDIL" is pronounced as \idle" and should not b e con-

queries are pro cessed by translating them into SQL fused with \idyll".

Database are unreliable { critical database relations can b e ac-

ance to ensure their availability when

Declarations IDIL Results Advice cessed in adv

needed.

results caching - which query results should b e saved

IDI 

in the cache? Both base and derived relations vary

y. Some will de nitely b e worth

Schema Query Manager in their general utilit

hing since they are likely to b e accessed so on and

Manager cac

others not.

query generalization - which queries can b e usefully Cache 

Connection Manager generalized b efore submitting them to the DBMS?

Manager

Query generalization is a useful technique to reduce

the numb er of queries whichmust b e made against

y constraint satisfaction exp ert

SQL DBMS Results the database in man

systems. It is also a general technique to handle

exp ected subsequent queries after a \null answer"

[ ]

Motro, 1986 .

DBMS ... DBMS

 replacement - which relations should b e removed

when the cache b ecomes full?

DB DB DB DB DB DB

Additional kinds of advice and examples can b e found

[ ]

in O'Hare and Travis, 1989, O'Hare and Sheth, 1989 .

Figure 3: The IDI includes four main comp onents: the

As with anytyp e of cache-based system, one of the

Schema Manager manages the schema information for all de-

more dicult design issues involves the problem of

clared databases; the Connection Manager handles connections

cache validation. That is, determining when to inval-

to remote DBMSs; the Query Manager is resp onsible forpro-

idate cache entries b ecause of up dates to the relevant

cessing IDIL queries and their results; and the Cache Manager

data in the DBMS. Our current implementation do es

controls the cache of query results in accordance with the ad-

not attempt cache validation, which will b e a fo cus of

vice supplied by the application.

future research. This still leaves a large class of applica-

tions for which cache validation is not a problem. These

which is then sent to the appropriate DBMS by the

includes access to databases that are write-protected

DCM. If the query is successfully executed by the

and up dated infrequently, such as the Ocial Airline

DBMS then the QM returns a generator for the result

Guide database, and databases that are static relative

relation. A generator is simply an abstract data typ e

to the time scale of the AI application accessing them.

used to represent the result of an IDIL query. There are

Moreover, this problem is common to any AI system

two basic typ e of op erations whichmay b e p erformed

which gets some of its data from an external source and

on a generator: a get the next tuple from the result

stores in its knowledge base. Most currentinterfaces

relation and b terminate the generator i.e., discard

between AI systems and databases e.g. KEE Connec-

any remaining tuples. Generators are actually created

[ ]

tion Intellicorp, 1987  simply do not worry ab out this

and managed by the DCM since there is more than one

problem at all. Our approach attempts to minimize

p ossible representation for a result relation, e.g., it may

the AI system's copying of database data in twoways.

b e a result stream from a DBMS or a cache element.

First, by providing convenient and ecient access to

The QM merely passes generators to the DCM along

the information in the database, the AI system devel-

with requests for the next tuple or termination.

op ers will have less need to make a lo cal copy of the

The Cache Manager is resp onsible for managing the

data. Second, all of the database information that is

cache of query results. This includes identifying IDIL

copied i.e., in the cache is isolated in one place and

queries for which the results exist in the cache, caching

can therefor b e more easily \managed" { reducing the

query results, and replacing cache elements. In addi-

problem to \cache validation".

tion, our design allows the AI system to provide the

There are a numb er of approaches to the validation

cache manager with advice to help it decide howto

problem whichvary in completeness and ease of im-

manage its cache and make the following kinds of crit-

plementation. Examples of p ossible comp onents of a

ical decisions:

cache validation system include: only caching relations

 pre-fetching - which relations and when should b e declared to b e non-volatile, only caching data b etween

fetched in anticipation of needing them? This can scheduled DB up dates, using heuristics e.g., a decay

yield a signi cant increase in sp eed since the database curve to estimate data validity, and implementing a

server is running as a separate pro cess. This can \sno opy cache" which monitors the database transac-

also b e used to advantage in an environment in which tions for up dates which mightinvalidate the cached

databases are accessed over a network in which links data. 3 Execution 1.0

2

seconds 0.5 seconds

1

Collection

Translation Without Empty Non-empty

Min Mean Max Caching Cache Cache

Figure 4: The aggregate pro cessing time is broken down in Figure 5: Cache p erformance is measured for three cases:

terms of the three main stages of pro cessing: translation, ex- a the without caching or base-line case where caching was

ecution, and col lection.For each pro cessing stage, the mini- disabled, b the empty cache case where caching was enabled

mum, mean, and maximum pro cessing times are shown. but the cache was cleared b efore each rep etition of the test set,

and c the non-empty cache case where the cache contained

the results for all queries in the test set.

Current Status

Performance

of the relative pro cessing times for each stage could b e

The IDI, as describ ed here, has b een implemented in

established.

Common Lisp and tested as a stand-alone query pro-

The di erences in translation time re ect a dep en-

cessor against two di erent databases running on RTI

dence on the numb er of relations in the IDIL query.

INGRES and is also b eing used as a query server for

Similarly, the collection time is a function of the num-

the Unisys sp oken language pro ject. The p erformance

b er of tuples in the result relation. In b oth cases, the

results obtained thus far are, at b est, preliminary since

pro cessing times are signi cantly less than the execu-

the size of the test suite was comparatively small and

tion time which is e ected by the complexity of the

the IDI is just now b eing integrated with an AI system.

SQL query, the communicationoverhead, and the load

However, the results are encouraging and indicate the

on the remote DBMS host since only elapsed time was

p otential for ecient database access a orded by the

recorded.

IDI. The following summarize some of the more inter-

esting of these p erformance results.

Figure 5 indicates the e ects of result caching on

One test set of IDIL queries used consisted of 48

p erformance. The results represent the mean pro cess-

queries where there were 22 unique queries, i.e., each

ing times in seconds for all queries. Three di erent

query was rep eated at least once in the test set. The

cases are represented: a the without caching or base-

queries ranged from simple i.e., only pro ject and select

line case where caching was disabled, b the empty

op erations were required to complex i.e., a four-way

cache case where caching was enabled but the cache

join with two aggregation op erations as well as pro jects

was cleared b efore each rep etition of the test set, and

and selects. The size of the result relations varied from

c the non-empty cache case where the cache was con-

zero to 17 tuples. The statistics presented here are

tained the results for all queries in the test set. The

based on the mean pro cessing times for 20 rep etitions

di erence b etween the base-line and empty cache cases

of the test set of queries.

is due to the numb er of rep eated queries i.e., 26 out

of 48 were rep eated. The fact that the base-line case Figure 4 shows a breakdown of the aggregate pro cess-

is more than twice the empty cache indicated that the ing time in terms of the three main stages of pro cessing:

overhead required for result caching is not signi cant. translation i.e., the time to translate and IDIL query

The non-empty cache case indicates the maximum p o- into SQL, execution i.e., the elapsed time b etween

tential b ene t of result caching, i.e., nearly two orders sending the SQL query to the DBMS and obtaining the

of magnitude improvement in p erformance. Clearly this rst tuple of the result relation, and col lection i.e.,

could only o ccur when the cache is \stacked" as in the the time required to collect all the tuples in the result

test. However, it do es help to establish an upp er limit relation and convert them into internal form. For each

on the p ossible p erformance improvement a orded by pro cessing stage, the minimum, mean, and maximum

result caching. Obviously, as the numb er of rep eated pro cessing times are shown. The cache was disabled

queries increases so will the gain in p erformance. for these measurements so that a more accurate picture

Application of the IDI cache manager to allowittomakeintelligentchoices

ab out query generalizations, pre-fetching and replace-

Clearly, more detailed p erformance results need to b e

ment. We currently have an initial ATIS server running

obtained using more exhaustive test sets. It will b e

and will b e collecting statistics on its transactions which

particularly imp ortanttointegrate the IDI with an AI

can then b e used to de ne a e ective advice strategy.

system and measure its p erformance with a varietyof

di erent applications. We are currently using the IDI

Conclusion

to provide a for the Unisys sp oken lan-

guage understanding system and are investigating the

Although the implementation of the IDI is not com-

integration of the IDI with the Intel ligent System Server

plete, it do es provide a solid foundation for easily cre-

and its Protem representation and reasoning engine,

ating a sophisticated interface to existing DBMSs. The

key characteristics of IDI are eciency, simplicityof

The IDI and the ISS. Protem is a hybrid system

use, and a high degree of p ortability which makeitan

containing b oth a frame-based representation system

ideal choice for supp orting a variety of AI and related

and a logic-based reasoning comp onent. The integra-

applications which require access to remote DBMSs.

tion of a frame-based representation system with a rela-

Among the various extensions to the IDI that have

tional database management system is not straightfor-

b een planned for the future, most involve the Cache

ward. Our current approach lab els some of the classes

Manager. At present, the implementation of the CM

in the frame system as \database classes". Any knowl-

has b een fo cused on ecient result caching and most

edge base activity which searches for the instances of

other cache management functions have not b een im-

this class will b e handed a stream of \database in-

plemented. One of the rst steps will b e to imp ose

stances" which will b e the result of a query senttothe

a parameterized limit on the size of the cache and to

database via the IDI. In order to avoid lling the knowl-

implement a cache replacement strategy. Other exten-

edge base memory with database information, these in-

sions to the CM include cache validation, and the abil-

stances are not installed as p ersistent knowledge base

ity to p erform DBMS-like op erations on cache elements

ob jects but exist as \lightweight ob jects" which are

[ ]

O'Hare and Sheth, 1989 .

garbage collected as so on as active pro cesses stop ex-

If the IDI is extended so that it is capable of p erform-

amining them. They are also not \fully instantiated".

ing DBMS-like op erations on the contents of its cache

That is, the values for the frame's roles are not neces-

then, given an IDIL query, it will have three general

sarily installed. Instead, if an attempt is made to access

courses of action whichitmay take to pro duce the re-

their roles, additional database queries to retrieve the

sults: a the entire IDIL query can b e translated into

information will generated automatically. Once again,

SQL and sent to the remote DBMS for execution; b

this information is not added as p ermanent knowledge

the entire IDIL query can b e executed lo cally by the

base data, but only last as long as the currently active

IDI including simple retrieval from the cache; and c

pro cess is using it.

the IDIL query can b e decomp osed so that part of it

This approach has three advantages: it is relatively

is executed on the remote DBMS and part of it is exe-

simple to implement, transparent to the user and is

cuted lo cally by the IDI. The decision of which action

the key to isolating the data copy problem to cache

to takewould dep end on a numb er of factors includ-

validation as stated earlier. Once the relationship b e-

ing the current contents of the cache and the estimated

tween a database class and its database tables is de-

costs for each alternative.

clared, the class and its instances can b e treated as

any other knowledge base ob jects. However, without

References

the IDI cache implementation, it would b e prohibitively

slow.

[ ]

Abarbanel and Williams, 1986 R. Abarbanel and M.

Williams, \A Relational Representation for Knowledge

The IDI ATIS Server. The second AI system that

Bases,", Technical Rep ort, Intellicorp, Mountain View,

the IDI is b eing used to supp ort is a sp oken language in-

CA, April 1986.

terface to an Air Travel Information System database.

In this pro ject, sp oken queries are pro cessed by a sp eech

[ ]

Bo cca, 1986 J. Bo cca, \EDUCE a Marriage of Conve-

recognition system and interpreted by the Unisys Pun-

nience: Prolog and a Relational DBMS," Third Symp o-

[ ]

sium on Logic Programming, Salt Lake City, Sept. 1986,

dit natural language system Hirschman, et. al., 1989 .

pp. 36-45.

The resulting interpretation is translated into a IDIL

query which is then sent to the ATIS Server for evalua-

[ ]

Bro die, 1988 M. Bro die, \Future Intelligent Informa-

tion. This server is a separate pro cess running the IDI

tion Systems: AI and Database Technologies Working

which, in turn, submits SQL queries to an INGRES

Together" in Readings in Arti cial Intel ligence and

database server.

Databases, Morgan Kaufman, San Mateo, CA, 1988.

The \travel agent" domain is one in which there is a

[ ]

Ceri, et. al., 1986 S. Ceri, G. Gottlob, and G. Wiederhold,

rich source of pragmatic information that can b e used

\Interfacing Relational Databases and Prolog Eciently,"

to infer the user's intentions underlying their queries.

Proc. of the 1st Intl. Conf. on Expert Database Systems,

South Carolina, April 1986. These intentions can b e used to generate advice to the

[ ]

Management System," in Proceedings of the 12th In- Chakravarthy, et. al., 1982 U. Chakravarthy, J. Minker,

ternational ConferenceonVery Large Databases,Kyoto, and D. Tran, \Interfacing Predicate Logic Languages

Japan, 1986. and Relational Databases," in Proceedings of the First

International Logic Programming Conference, pp. 91-98,

[ ]

Li, 1984 D. Li, AProlog Database System, Research Stud-

Septemb er 1982.

ies Press, Letchworth, 1984.

[ ]

Chamb erlin, et. al., 1976 D. Chamb erlin, el al.. \SE-

[ ]

Minker, 1978 J. Minker, \An Exp erimental Relational

QUEL2: A Uni ed Approach to Data De nition, Manip-

Data Base System Based on Logic," in Logic and

ulation, and Control", IBM Journal of R&D, 20, 560-575,

Databases, ed. J. Minker, Plenum Press, New York, 1978.

1976.

[ ]

Morris, 1988 K. Morris, J. Naughton, Y. Saraiya, J. Ull-

[ ]

Chang, 1978 C. Chang, \DEDUCE 2: Further investiga-

man, and A. Van Gelder, \YAWN! Yet Another Window

tions of deduction in relational databases," in Logic and

on NAIL!," IEEE Data Engineering,vol. 10, no. 4, De-

Databases, ed. H. Gallaire, pp. 201-236, New York, 1978.

cemb er 1987, pp. 28-43.

[ ]

Chang and Walker, 1984 C. Chang and A. Walker,

[ ]

Motro, 1986 Amihai Motro, \Query Generalizatio n:

\PROSQL: A Prolog Programming Interface with

A Metho d for interpreting Null Answers", in Ex-

SQL/DS," Proc. of the 1st Intl. Workshop on Expert

pert Database Systems, ed. L. Kerschb erg, Ben-

Database Systems, Kiawah Island, South Carolina, Octo-

jamin/Cummings, Menlo Park CA, 1986.

b er 1984.

[ ]

Naish and Thom, 1983 L. Naish and J. A. Thom, \The

[ ]

Chimenti, et. al., 1987 D. Chimenti, A. O'Hare, R. Krish-

MU-Prolog Deductive Database," Technical Rep ort 83-

namurthy, S. Naqvi, S. Tsur, C. West, and C. Zaniolo,

10, Department of Computer Science, University of Mel-

\An Overview of the LDL System," IEEE Data Engi-

b ourne, Australia, 1983.

neering,vol. 10, no. 4, Decemb er 1987, pp. 52{62.

[ ]

O'Hare, 1987 A. O'Hare, Towards Declarative Control

[ ]

Dahl, et. al., 1990 D. Dahl, L. Norton, D. McKay,M.

of Computational Deduction, University of Wisconsin{

Linebarger and L. Hirschman, \Management and Evalua-

Madison PhD Thesis, June 1987.

tion of Interactive Dialogue in the Air Travel Information

[ ]

O'Hare, 1989 A. O'Hare, \The Intelligent Database Inter-

System Domain", submitted to The DARPA Workshop

face Language", Technical Rep ort, Unisys Paoli Research

on Speech and Natural Language, June 24-27,1990, Hid-

Center, June 1989.

den Valley,PA.

[ ]

O'Hare and Sheth, 1989 A. O'Hare and A. Sheth, \The

[ ]

Finin, et. al., 1989 Tim Finin, RichFritzson, Don Mckay,

interpreted-compiled range of AI/DB systems", SIGMOD

Robin McEntire, and Tony O'Hare, \The Intelligent Sys-

Record, 181, March 1989.

tem Server | Delivering AI to Complex Systems", Pro-

[ ]

O'Hare and Travis, 1989 A O'Hare and L. Travis, \The

ceedings of the IEEE International Workshop on Tools

KMS Inference Engine: Rationale and Design Ob jec-

for Arti cial Intel ligence{ Architectures, languages and

tives", Technical Rep ort TM-8484/003/00, Unisys - West

Algorithms, March 1990.

Coast Research Center, 1989.

[ ]

Fritzson and Finin, 1988 RichFritzson and Tim Finin,

[ ]

O'Hare and Sheth, 1989 Anthony B. O'Hare and Amit

\Protem { An Integrated Exp ert Systems To ol", Tech-

Sheth. The architecture of BrAID: A system for e-

nical Rep ort LBS Technical Memo Numb er 84, Unisys

cient AI/DB Integration.Technical Rep ort PRC-LBS-

Paoli Research Center, May 1988.

8907, Unisys Paoli Research Center, June 1989.

[ ]

Hirschman, et. al., 1989 Lynette Hirschman,

[ ]

Reiter, 1978 R. Reiter, \Deductive Question-Answering

Martha Palmer, John Dowding, Deb orah Dahl, Marcia

on Relational Data Bases," in Logic and Databases, ed.

Linebarger, Reb ecca Passonneau, Francois-Michel Lang,

J. Minker, Plenum Press, New York, 1978.

Catherine Ball, and Carl Weir. The pundit natural-

[ ]

Sheth, et. al., 1988 A. Sheth, D. van Buer, S. Russell, and language pro cessing system. In AI Systems in Govern-

ment Conference. Computer So ciety of the IEEE, March

S. Dao, \Cache Management System: Preliminary Design

and Evaluation Criteria," Unisys Technical Rep ort TM- 1989.

8484/000/00, Octob er 1988.

[ ]

Intellicorp, 1987 Intellicorp, \KEEConnection: A Bridge

[ ]

Stonebreaker, et. al., 1987 M. Stonebraker, E. Hanson

Between Databases and Knowledge Bases", An Intel-

and S. Potamianos, \A Rule Manager for Relational

licorp Technical Article, 1987.

Database Systems," in The Postgres Pap ers, M. Stone-

[ ]

Ioannidis, et. al., 1988 Y, Ioannidis, J. Chen, M. Fried-

braker and L. Rowe eds, Memo UCM/ERL M86/85,

man, and M. Tsangaris, \BERMUDA{AnArchitec-

Univ. of California, Berkeley, 1987.

tural Persp ectiveonInterfacing Prolog to a Database

[ ]

Van Buer, et. al., 1985 D. Van Buer, D. McKay,D.Ko-

Machine," Proceedings of the Second International Con-

gan, L. Hirschman, M. Heineman, and L. Travis, \The

ference on Expert Database Systems, April 1988.

Flexible Deductive Engine: An Environment for Proto-

[ ]

Jarke, et. al., 1984 M. Jarke, J. Cli ord, and Y. Vassiliou,

typing Knowledge Based Systems," Proceedings of the

\An Optimizing Prolog Front-End to a Relational Query

Ninth International Joint ConferenceonArti cial Intel-

System," Proceedings of the 1984 ACM-SIGMOD Con-

ligence, Los Angeles CA, August 1985.

ference on the Management of Data, Boston, MA, June

1984.

[ ]

Kellog, et. al., 1986 C. Kellogg, A. O'Hare, and L. Travis,

\Optimizing the Rule/Data Interface in a Knowledge