The Intelligent Database Interface:
Integrating AI and Database Systems
Donald P. McKay and Timothy W. Finin and Anthony O'Hare
Unisys Center for Advanced Information Technology
Paoli, Pennsylvania
[email protected] and [email protected]
Abstract its tuple-at-a-time inference mechanisms. The IDI has
also b een used to implement a query server supp orting
The Intel ligent Database Interface IDI is a cache-based
a database used for an Air Travel Information System
interface that is designed to provide Arti cial Intelligence
which is accessed byaspoken language system imple-
systems with ecient access to one or more databases on one
[ ]
or more remote database management systems DBMSs. mented in Prolog Dahl, et. al., 1990 .
It can b e used to interface with a wide variety of di erent
In addition to providing ecient access to remote
DBMSs with little or no mo di cation since SQL is used to
DBMSs, the IDI o ers several other distinct advan-
communicate with remote DBMSs and the implementation
tages. It can b e used to interface with a wide vari-
of the IDI provides a high degree of p ortability. The query
ety of di erent DBMSs with little or no mo di cation
language of the IDI is a restricted subset of function-free
since SQL is used to communicate with the remote
Horn clauses which is translated into SQL. Results from the
DBMS. Also, several connections to the same or di er-
IDI are returned one tuple at a time and the IDI manages
ent DBMSs can exist simultaneously and can b e kept
a cache of result relations to improve eciency. The IDI is
active across anynumb er of queries b ecause connec-
one of the key comp onents of the Intel ligent System Server
tions to remote DBMSs are abstract ob jects that are
ISS knowledge representation and reasoning system and is
also b eing used to provide database services for the Unisys
managed as resources by the IDI. Finally, accessing
sp oken language systems program.
schema information is handled automatically by the
IDI, i.e., the application is not required to maintain
up-to-date schema information for the IDI. This signif-
Intro duction
icantly reduces the p otential for errors intro duced by
The Intel ligent Database Interface IDI is a p ortable,
stale schema information or by hand entered data.
cache-based interface designed to provide arti cial intel-
The IDI can b e viewed as a stand-alone DBMS inter-
ligence systems in general and exp ert systems in par-
face which accepts queries in the form of IDIL clauses
ticular with ecient access to one or more databases
and returns the result relation as a set of tuples i.e.,
on one or more remote database management systems
a list of Lisp atoms and/or strings. IDIL queries are
[
DBMS which supp ort SQL Chamb erlin, et. al.,
translated into SQL and sent to the appropriate DBMS
]
1976 . The query language of the IDI is the Intelligent
for execution. The results from the DBMS are then
[ ]
Database Interface Language IDIL O'Hare, 1989 and
transformed by the IDI into tuples of Lisp ob jects. Al-
is based on a restricted subset of function-free Horn
though the IDI was not designed to b e used directly
clauses where the head of a clause represents the tar-
by a user, the following descriptions will b e couched
get list i.e., the form of the result relation and the
in terms of using the IDI as a stand-alone system so
b o dy is a conjunction of literals which denote database
that wemayavoid complicating our discussions with
relations or op erations on the relations and/or their at-
the details of an AI system such as the ISS.
tributes e.g., negation, aggregation, and arithmetic op-
The design of the IDI was heavily in uenced by pre-
erations.
[
vious research in the area of AI/DB integration Kellog,
The IDI is one of the key comp onents of the In-
et. al., 1986, O'Hare, 1987, O'Hare and Travis, 1989,
[ ]
tel ligent System Server ISS Finin, et. al., 1989
]
O'Hare and Sheth, 1989 . One of the more signi cant
[ ]
which is based on Protem Fritzson and Finin, 1988
design criteria that this lead to is the supp ort of non-
and provides a combined logic-based and frame-based
trivial queries in IDIL. That is, to allow for queries in-
knowledge representation system and supp orts forward-
volving more than just a single database relation. This
chaining, backward-chaining, and truth maintenance.
capability allows the AI system to o -load computa-
The IDI was designed to b e compatible with the logic-
tions that are more eciently pro cessed by the DBMS
based knowledge representation scheme of the ISS and
instead of the AI system e.g., join op erations. In many
cases, this also has the e ect of reducing the size of data
current address: IBM, ResearchTriangle Park, North
Carolina set that is returned by the DBMS.
the AI system is extended with DBMS capabilities to
vide ecient access to, and management of, large
AI AI DB pro
ts of stored data. In general, such systems do
DB amoun
hnology. Rather, the
Extending the AI system Loose coupling not incorp orate full DBMS tec
emphasis is on the AI system and the DBMS capabil-
ities are added in an ad hoc and limited manner, e.g.,
DB AI Interface DB
[ ]
al., 1986 implements only the data access
AI Ceri, et.
yer. Alternatively, a new generation knowledge-based
Extending the DB system Enhanced AI/DB interface la
[ ]
system such as LDL Chimenti, et. al., 1987 maybe
constructed. In either case, this approach e ectively
Figure 1: Of the four alternative approaches to AI/DB inte-
involves \re-inventing" some or all of DBMS technol-
gration, the Intel ligent Database Interface is an example of
ogy. While such systems typically provide sophisticated
an enhanced AI/DB interface.
to ols and environments for the development of applica-
tions such as exp ert systems, they can not readily make
use of existing databases. Thus, the developmentofAI
While the IDI is to some small degree system dep en-
applications whichmust access existing databases will
dent, it do es o er a high degree of p ortability b ecause
b e exceedingly dicult if not imp ossible e.g., when the
it is implemented in Common Lisp, communicates with
database is routinely accessed and up dated via more
remote DBMSs using SQL and standard UNIX pip es,
traditional kinds of applications.
and represents IDIL queries and their results as Com-
mon Lisp ob jects.
Extending the DBMS System: This approach
extends a DBMS to provide knowledge representation
In the following sections we present a brief overview
[
and reasoning capabilities, e.g., POSTGRES Stone-
of the area of AI/DB integration which represents a
]
breaker, et. al., 1987 . Here, the DBMS capabilities are
large part of the motivation for the IDI, a discussion
the central concern and the AI capabilities are added in
of some of the more signi cant features of the IDI, the
an ad hoc manner. The knowledge representation and
organization and ma jor comp onents of the IDI, and -
reasoning capabilities are generally quite limited and
nally an example of how the IDI is b eing used in two
they lack the sophisticated to ols and environments of
applications.
most AI systems. Such systems do not directly sup-
p ort the use of existing DBMSs nor can they directly
AI/DB Integration
supp ort existing AI applications e.g., exp ert systems
The integration of AI and DBMS technologies promises
without substantial e ort on the part of the applica-
to play a signi cant role in shaping the future of com-
tion develop er. In some sense, this is the opp osite of
[ ]
puting. As noted in Bro die, 1988 , AI/DB integration
the previous approach.
is crucial not only for next generation computing but
Lo ose Coupling: The lo ose coupling approachto
also for the continued development of DBMS technology
AI/DB integration uses a simple interface b etween the
and for the e ective application of muchofAItechnol-
twotyp es of systems to provide the AI system with ac-
ogy.
[
cess to existing databases, e.g., KEE-connection Abar-
While b oth DBMS and AI systems, particularly ex-
]
banel and Williams, 1986 . While this approach has
p ert systems, representwell established technologies,
the distinct advantage of integrating existing AI sys-
research and development in the area of AI/DB inte-
tems and existing DBMSs, the relatively low level of
gration is comparatively new. The motivations driv-
integration results in p o or p erformance and limited use
ing the integration of these two technologies include
of the DBMS by the AI system. In addition, access
the need for a access to large amounts of shared
to data from the database, as well as the data itself,
data for knowledge pro cessing, b ecient manage-
is p o orly integrated into the representational scheme of
ment of data as well as knowledge, and c intelli-
the AI system. The highly divergent metho ds repre-
gent pro cessing of data. In addition to these moti-
senting data e.g., relational data mo dels vs. frames is
vations, the design of IDI was also motivated by the
generally left to the application develop er or knowledge
desire to preserve the substantial investment repre-
engineer with only minimal supp ort from the AI/DB
sented by most existing databases. To that end, a
interface.
key design criterion for IDI was that it supp ort the
use of existing DBMSs as indep endent system comp o-
Enhanced AI/DB Interface: The last approach
nents. As illustrated in Figure 1 and describ ed b elow,
to AI/DB integration represents a substantial enhance-
several general approaches to AI/DB integration have
ment of the lo osely coupled approach and provides a
b een investigated and rep orted in the literature e.g.,
more p owerful and ecientinterface b etween the two
[
Bo cca, 1986, Chakravarthy, et. al., 1982, Chang, 1978,
typ es of systems. As with the previous approach, this
Chang and Walker, 1984, Jarke, et. al., 1984, Li, 1984,
metho d of AI/DB integration allows immediate advan-
Minker, 1978, Morris, 1988, Naish and Thom, 1983,
tage to b e taken of existing AI and DB technologies
]
Reiter, 1978, Van Buer, et. al., 1985 .
as well as future advances in them. The problems of
p erformance and under-utilization of the DBMS by the Extending the AI System: In this approach,
The AI system generates many DBMS queries that tend AI system are handled with di ering degrees of suc-
to yield, on average, small results and the AI system uses
cess rst by increasing the functionality of the interface
most or all of the result.
itself, and then if necessary, enhancing either the AI
system or the DBMS. For example, the BERMUDA
[ ]
system Ioannidis, et. al., 1988 uses a form of result
In the rst case, it would b e b est to avoid the cost
caching to improve eciency and p erforms some sim-
of pro cessing the entire result by using demand-driven
ple pre-analysis of the AI application to identify join
techniques to pro duce only one tuple at a time from
op erations that can b e p erformed by the DBMS rather
the result stream of the DBMS. However, this requires
[
than the AI system. The BrAID system O'Hare and
that separate connections b e created for each DBMS
]
Sheth, 1989 is similar except that it supp orts more gen-
query.Thus the overhead of creating such connections
eral caching and pre-analysis capabilities and allows for
must b e less than the cost of pro cessing the entire result
exp erimentation with di erent inference strategies.
relation.
The IDI is an interface that can b e used to facili-
tate the development of AI/DB systems using this last
In the second case, it would b e b est to avoid the cost
of creating numerous connections to the DBMS by using approach. That is, the IDI is cache-based interface to
DBMSs and is designed to b e easily integrated into var-
a single connection for multiple queries. However, this
ious typ es of AI systems. The design of the IDI also al-
requires that the entire result of each query b e pro cessed
lows it to b e used as an interface b etween DBMSs and
so that successive queries can b e run using the same
other typ es of applications such as database browsers
connection. The cost of pro cessing DBMS results i.e.,
and general query pro cessors.
reading the entire result stream and storing it lo cally
must b e less than the cost of creating a new connection
for each DBMS query.
Design Features
The IDI supp orts several features which simplify its use
For most systems, it seems reasonable to assume that
in an AI system. These include
the total cost for creating a new DBMS connection will
b e relatively high. Thus, using the same connection for
Connections to a DBMS are managed transparently so
di erent DBMS queries would result in a net savings. that there can b e multiple active queries to the same
database using a single op en connection.
While sp eci c break-even p oints could b e estimated, it
Connections to a given database are op ened up on de-
is not clear one need go that far since there are other
mand, i.e. at rst use instead of requiring an explicit
reasons for minimizing the numb er of DBMS connec-
database op en request.
tions that are op en at the same time. Foremost among
Database schema information is loaded from the database
these is the limit that most op erating systems haveon
either when the database is op ened or when queries re-
the numb er of streams that can b e op en simultaneously.
quire schema information based up on user declarations.
This can severely limit the numb er of DBMS connec-
The query interface is a logic-based language but uses
tions that we can have at one time. If one is also inter-
user supplied functions to declare and recognize logic vari-
ested in allowing connections to di erent databases, on
ables.
either the same or a di erent DBMS, then it is imp or-
Results of queries to a DBMS are cached, improving
tant to minimize the numb er of op en connections for a
the overall p erformance system and the cache is accessed
single database.
transparently by a query manager.
Yet another consideration is the use of caching for
All but the last of the features are describ ed in this sec-
DBMS results. That is, if DBMS results can b e cached
tion. The cache system and initial p erformance results
lo cally by the AI system or an agent serving it then all
are describ ed in subsequent sections.
of the DBMS results will probably b e pro cessed by the
caching mechanism. Thus, the rst alternative where
Making Connections
it is assumed that the DBMS results will not, in general,
As suggested ab ove, there are numerous approaches to
b e totally consumed is no longer applicable.
interfacing an AI system with existing DBMSs. How-
ever, the basic alternatives involve balancing the costs
In light of these constraints and requirements, it
of creating the connection to the DBMS and of pro cess-
seems b est to minimize the numb er of DBMS connec-
ing the result relations from a DBMS query. Deciding
tions that can b e op en simultaneously. Brie y, the
which alternative is the b est requires knowledge ab out
approach taken in the IDI is to op en a connection
the typical b ehavior of the AI system as well as other,
when a DBMS query is encountered against a database
more obvious factors, such as communication overhead
for which no connection exists and pro cess the result
[ ]
cf. O'Hare and Sheth, 1989 . Consider the following
stream one tuple at a time until and unless another
two mo des of interaction b etween an AI system and a
DBMS query on the same database is encountered. At
DBMS:
that p oint, the new query is sent to the DBMS, the re-
mainder of the result stream for the previous query is
The AI system generates a few DBMS queries that tend
consumed and stored lo cally, and then the new result
to yield very large results and the AI system uses only a
stream is pro cessed one tuple at a time as b efore. fraction of the result.
Get supplier names for suppliers who do not supply part p2.
Automating Access to Schema Information
ans _Sname
One of the key features of the IDI is the automatic
<-
management of database schema information. The user
supplier _Sno _Sname _Status _City
or application program is not required to provide any
not supplier_part _Sno "p2" _Qty
schema information for those database relations that
are accessed via IDIL queries. The IDI assumes the
Get supplier names and quantity supplied for suppliers that
resp onsibility for obtaining the relevantschema infor-
supply more than 300 units of part p2.
mation from the appropriate DBMS. This provides sev-
eral signi cant advantages over interfaces which rely on
ans _Sname _Qty
<-
the user to provide schema information. Most imp or-
supplier _Sno _Sname _Status _City
tantly, the schema information will necessarily b e con-
supplier_part _Sno "p2" _Qty
sistent with that stored in the DBMS and thus any
> _Qty 300
errors intro duced by hand-co ding the schema informa-
tion are eliminated. The only exception to this o ccurs
Figure 2: Two example IDIL queries using the \suppliers"
when the schema on the DBMS is mo di ed after the
" character have b een database. Symb ols b eginning with a \
IDI has accessed it since the IDI caches the schema
declared to b e logic variables.
information and thus maintains a private copy of it.
While this stale data problem exists for any system
which maintains a separate copy of the schema informa-
of inputs or requests to the IDI: a a database dec-
tion, the IDI provides a simple mechanism for forcing
laration; b an IDIL query and subsequent retrieval
the schema information to b e up dated. In addition,
requests against the result of an IDIL query; and c
this approach greatly facilitates the implementation of
advice to the Cache Manager.. Database declarations
database browsers since users need not know the names
convey simple information ab out a given database, e.g.,
or structure of relations stored in a particular database.
the typ e of the DBMS on which the database resides
and the host machine for the DBMS. For each IDIL
Logical Glue
query, the IDI returns a generator which can b e used to
retrieve the result relation of an IDIL query one tuple at
Another signi cant feature of the IDI is the relative
a time. The IDI also supp orts other typ es of requests,
ease with which it can b e integrated with di erentAI
e.g., access to schema information, which are describ ed
systems. Aside from the use of Common Lisp as the
[ ]
elsewhere O'Hare, 1989 .
implementation language for the IDI, this is achieved
The Schema Manager SM is resp onsible for manag-
by employing a logic-based language as the query lan-
ing the schema information for all declared databases
guage for the IDI. The language, IDIL, may b e used
and it supplies the Query Manager with schema infor-
as a totally indep endent query language or, more im-
mation for individual database relations. This entails
p ortantly,itmay b e more closely integrated with the
pro cessing database declarations, accessing and storing
knowledge representation language of a logic-based AI
schema information for declared databases, and man-
system. In the later case, the key is to allow the IDI to
aging relation name aliases which are used when twoor
share the same de nition of a logic variable as the host
more databases contain relations with identical names.
AI system. This is accomplished b e simply rede ning a
Whenever a connection to a database is created, the SM
small set of functions within the IDI which are used to
automatically accesses the list of relation names that
recognize and create instances of logic variables.
are contained within the database. This list is then
1
The IDIL query language is a restricted subset of
cached for later access in the event that the connection
function-free Horn clauses where the head of a clause
is closed and re-op ened at some later time. In this event
represents the target list i.e., the form of the result
the SM will only access the DBMS schema information
relation and the b o dy is a conjunction of literals which
if it is explicitly directed to do so, otherwise the cached
denote database relations or op erations on the relations
list of relation names will b e used.
and/or their attributes e.g., negation, aggregation, and
The DBMS Connection Manager DCM manages all
arithmetic op erations. Figure 2 shows some example
database connections to remote DBMSs. This includes
queries.
pro cessing requests to op en and close database connec-
tions as well as p erforming all the low-level I/O op era-
IDI Organization
tions asso ciated with the connections. Within the IDI,
each database has at most one active connection asso ci-
As Figure 3 illustrates, there are four main comp onents
ated with it and each connection has zero or more query
which comprise the IDI | the Schema Manager, the
result streams or generators asso ciated with it but only
DBMS Connection Manager, the Query Manager, and
one generator may b e active.
the Cache Manager. There are three principal typ es
The Query Manager QM is resp onsible for pro cess-
1
ing IDIL queries and managing their results. IDIL
\IDIL" is pronounced as \idle" and should not b e con-
queries are pro cessed by translating them into SQL fused with \idyll".
Database are unreliable { critical database relations can b e ac-
ance to ensure their availability when
Declarations IDIL Results Advice cessed in adv
needed.
results caching - which query results should b e saved
IDI
in the cache? Both base and derived relations vary
y. Some will de nitely b e worth
Schema Query Manager in their general utilit
hing since they are likely to b e accessed so on and
Manager cac
others not.
query generalization - which queries can b e usefully Cache
Connection Manager generalized b efore submitting them to the DBMS?
Manager
Query generalization is a useful technique to reduce
the numb er of queries whichmust b e made against
y constraint satisfaction exp ert
SQL DBMS Results the database in man
systems. It is also a general technique to handle
exp ected subsequent queries after a \null answer"
[ ]
Motro, 1986 .
DBMS ... DBMS
replacement - which relations should b e removed
when the cache b ecomes full?
DB DB DB DB DB DB
Additional kinds of advice and examples can b e found
[ ]
in O'Hare and Travis, 1989, O'Hare and Sheth, 1989 .
Figure 3: The IDI includes four main comp onents: the
As with anytyp e of cache-based system, one of the
Schema Manager manages the schema information for all de-
more dicult design issues involves the problem of
clared databases; the Connection Manager handles connections
cache validation. That is, determining when to inval-
to remote DBMSs; the Query Manager is resp onsible forpro-
idate cache entries b ecause of up dates to the relevant
cessing IDIL queries and their results; and the Cache Manager
data in the DBMS. Our current implementation do es
controls the cache of query results in accordance with the ad-
not attempt cache validation, which will b e a fo cus of
vice supplied by the application.
future research. This still leaves a large class of applica-
tions for which cache validation is not a problem. These
which is then sent to the appropriate DBMS by the
includes access to databases that are write-protected
DCM. If the query is successfully executed by the
and up dated infrequently, such as the Ocial Airline
DBMS then the QM returns a generator for the result
Guide database, and databases that are static relative
relation. A generator is simply an abstract data typ e
to the time scale of the AI application accessing them.
used to represent the result of an IDIL query. There are
Moreover, this problem is common to any AI system
two basic typ e of op erations whichmay b e p erformed
which gets some of its data from an external source and
on a generator: a get the next tuple from the result
stores in its knowledge base. Most currentinterfaces
relation and b terminate the generator i.e., discard
between AI systems and databases e.g. KEE Connec-
any remaining tuples. Generators are actually created
[ ]
tion Intellicorp, 1987 simply do not worry ab out this
and managed by the DCM since there is more than one
problem at all. Our approach attempts to minimize
p ossible representation for a result relation, e.g., it may
the AI system's copying of database data in twoways.
b e a result stream from a DBMS or a cache element.
First, by providing convenient and ecient access to
The QM merely passes generators to the DCM along
the information in the database, the AI system devel-
with requests for the next tuple or termination.
op ers will have less need to make a lo cal copy of the
The Cache Manager is resp onsible for managing the
data. Second, all of the database information that is
cache of query results. This includes identifying IDIL
copied i.e., in the cache is isolated in one place and
queries for which the results exist in the cache, caching
can therefor b e more easily \managed" { reducing the
query results, and replacing cache elements. In addi-
problem to \cache validation".
tion, our design allows the AI system to provide the
There are a numb er of approaches to the validation
cache manager with advice to help it decide howto
problem whichvary in completeness and ease of im-
manage its cache and make the following kinds of crit-
plementation. Examples of p ossible comp onents of a
ical decisions:
cache validation system include: only caching relations
pre-fetching - which relations and when should b e declared to b e non-volatile, only caching data b etween
fetched in anticipation of needing them? This can scheduled DB up dates, using heuristics e.g., a decay
yield a signi cant increase in sp eed since the database curve to estimate data validity, and implementing a
server is running as a separate pro cess. This can \sno opy cache" which monitors the database transac-
also b e used to advantage in an environment in which tions for up dates which mightinvalidate the cached
databases are accessed over a network in which links data. 3 Execution 1.0
2
seconds 0.5 seconds
1
Collection
Translation Without Empty Non-empty
Min Mean Max Caching Cache Cache
Figure 4: The aggregate pro cessing time is broken down in Figure 5: Cache p erformance is measured for three cases:
terms of the three main stages of pro cessing: translation, ex- a the without caching or base-line case where caching was
ecution, and col lection.For each pro cessing stage, the mini- disabled, b the empty cache case where caching was enabled
mum, mean, and maximum pro cessing times are shown. but the cache was cleared b efore each rep etition of the test set,
and c the non-empty cache case where the cache contained
the results for all queries in the test set.
Current Status
Performance
of the relative pro cessing times for each stage could b e
The IDI, as describ ed here, has b een implemented in
established.
Common Lisp and tested as a stand-alone query pro-
The di erences in translation time re ect a dep en-
cessor against two di erent databases running on RTI
dence on the numb er of relations in the IDIL query.
INGRES and is also b eing used as a query server for
Similarly, the collection time is a function of the num-
the Unisys sp oken language pro ject. The p erformance
b er of tuples in the result relation. In b oth cases, the
results obtained thus far are, at b est, preliminary since
pro cessing times are signi cantly less than the execu-
the size of the test suite was comparatively small and
tion time which is e ected by the complexity of the
the IDI is just now b eing integrated with an AI system.
SQL query, the communicationoverhead, and the load
However, the results are encouraging and indicate the
on the remote DBMS host since only elapsed time was
p otential for ecient database access a orded by the
recorded.
IDI. The following summarize some of the more inter-
esting of these p erformance results.
Figure 5 indicates the e ects of result caching on
One test set of IDIL queries used consisted of 48
p erformance. The results represent the mean pro cess-
queries where there were 22 unique queries, i.e., each
ing times in seconds for all queries. Three di erent
query was rep eated at least once in the test set. The
cases are represented: a the without caching or base-
queries ranged from simple i.e., only pro ject and select
line case where caching was disabled, b the empty
op erations were required to complex i.e., a four-way
cache case where caching was enabled but the cache
join with two aggregation op erations as well as pro jects
was cleared b efore each rep etition of the test set, and
and selects. The size of the result relations varied from
c the non-empty cache case where the cache was con-
zero to 17 tuples. The statistics presented here are
tained the results for all queries in the test set. The
based on the mean pro cessing times for 20 rep etitions
di erence b etween the base-line and empty cache cases
of the test set of queries.
is due to the numb er of rep eated queries i.e., 26 out
of 48 were rep eated. The fact that the base-line case Figure 4 shows a breakdown of the aggregate pro cess-
is more than twice the empty cache indicated that the ing time in terms of the three main stages of pro cessing:
overhead required for result caching is not signi cant. translation i.e., the time to translate and IDIL query
The non-empty cache case indicates the maximum p o- into SQL, execution i.e., the elapsed time b etween
tential b ene t of result caching, i.e., nearly two orders sending the SQL query to the DBMS and obtaining the
of magnitude improvement in p erformance. Clearly this rst tuple of the result relation, and col lection i.e.,
could only o ccur when the cache is \stacked" as in the the time required to collect all the tuples in the result
test. However, it do es help to establish an upp er limit relation and convert them into internal form. For each
on the p ossible p erformance improvement a orded by pro cessing stage, the minimum, mean, and maximum
result caching. Obviously, as the numb er of rep eated pro cessing times are shown. The cache was disabled
queries increases so will the gain in p erformance. for these measurements so that a more accurate picture
Application of the IDI cache manager to allowittomakeintelligentchoices
ab out query generalizations, pre-fetching and replace-
Clearly, more detailed p erformance results need to b e
ment. We currently have an initial ATIS server running
obtained using more exhaustive test sets. It will b e
and will b e collecting statistics on its transactions which
particularly imp ortanttointegrate the IDI with an AI
can then b e used to de ne a e ective advice strategy.
system and measure its p erformance with a varietyof
di erent applications. We are currently using the IDI
Conclusion
to provide a database server for the Unisys sp oken lan-
guage understanding system and are investigating the
Although the implementation of the IDI is not com-
integration of the IDI with the Intel ligent System Server
plete, it do es provide a solid foundation for easily cre-
and its Protem representation and reasoning engine,
ating a sophisticated interface to existing DBMSs. The
key characteristics of IDI are eciency, simplicityof
The IDI and the ISS. Protem is a hybrid system
use, and a high degree of p ortability which makeitan
containing b oth a frame-based representation system
ideal choice for supp orting a variety of AI and related
and a logic-based reasoning comp onent. The integra-
applications which require access to remote DBMSs.
tion of a frame-based representation system with a rela-
Among the various extensions to the IDI that have
tional database management system is not straightfor-
b een planned for the future, most involve the Cache
ward. Our current approach lab els some of the classes
Manager. At present, the implementation of the CM
in the frame system as \database classes". Any knowl-
has b een fo cused on ecient result caching and most
edge base activity which searches for the instances of
other cache management functions have not b een im-
this class will b e handed a stream of \database in-
plemented. One of the rst steps will b e to imp ose
stances" which will b e the result of a query senttothe
a parameterized limit on the size of the cache and to
database via the IDI. In order to avoid lling the knowl-
implement a cache replacement strategy. Other exten-
edge base memory with database information, these in-
sions to the CM include cache validation, and the abil-
stances are not installed as p ersistent knowledge base
ity to p erform DBMS-like op erations on cache elements
ob jects but exist as \lightweight ob jects" which are
[ ]
O'Hare and Sheth, 1989 .
garbage collected as so on as active pro cesses stop ex-
If the IDI is extended so that it is capable of p erform-
amining them. They are also not \fully instantiated".
ing DBMS-like op erations on the contents of its cache
That is, the values for the frame's roles are not neces-
then, given an IDIL query, it will have three general
sarily installed. Instead, if an attempt is made to access
courses of action whichitmay take to pro duce the re-
their roles, additional database queries to retrieve the
sults: a the entire IDIL query can b e translated into
information will generated automatically. Once again,
SQL and sent to the remote DBMS for execution; b
this information is not added as p ermanent knowledge
the entire IDIL query can b e executed lo cally by the
base data, but only last as long as the currently active
IDI including simple retrieval from the cache; and c
pro cess is using it.
the IDIL query can b e decomp osed so that part of it
This approach has three advantages: it is relatively
is executed on the remote DBMS and part of it is exe-
simple to implement, transparent to the user and is
cuted lo cally by the IDI. The decision of which action
the key to isolating the data copy problem to cache
to takewould dep end on a numb er of factors includ-
validation as stated earlier. Once the relationship b e-
ing the current contents of the cache and the estimated
tween a database class and its database tables is de-
costs for each alternative.
clared, the class and its instances can b e treated as
any other knowledge base ob jects. However, without
References
the IDI cache implementation, it would b e prohibitively
slow.
[ ]
Abarbanel and Williams, 1986 R. Abarbanel and M.
Williams, \A Relational Representation for Knowledge
The IDI ATIS Server. The second AI system that
Bases,", Technical Rep ort, Intellicorp, Mountain View,
the IDI is b eing used to supp ort is a sp oken language in-
CA, April 1986.
terface to an Air Travel Information System database.
In this pro ject, sp oken queries are pro cessed by a sp eech
[ ]
Bo cca, 1986 J. Bo cca, \EDUCE a Marriage of Conve-
recognition system and interpreted by the Unisys Pun-
nience: Prolog and a Relational DBMS," Third Symp o-
[ ]
sium on Logic Programming, Salt Lake City, Sept. 1986,
dit natural language system Hirschman, et. al., 1989 .
pp. 36-45.
The resulting interpretation is translated into a IDIL
query which is then sent to the ATIS Server for evalua-
[ ]
Bro die, 1988 M. Bro die, \Future Intelligent Informa-
tion. This server is a separate pro cess running the IDI
tion Systems: AI and Database Technologies Working
which, in turn, submits SQL queries to an INGRES
Together" in Readings in Arti cial Intel ligence and
database server.
Databases, Morgan Kaufman, San Mateo, CA, 1988.
The \travel agent" domain is one in which there is a
[ ]
Ceri, et. al., 1986 S. Ceri, G. Gottlob, and G. Wiederhold,
rich source of pragmatic information that can b e used
\Interfacing Relational Databases and Prolog Eciently,"
to infer the user's intentions underlying their queries.
Proc. of the 1st Intl. Conf. on Expert Database Systems,
South Carolina, April 1986. These intentions can b e used to generate advice to the
[ ]
Management System," in Proceedings of the 12th In- Chakravarthy, et. al., 1982 U. Chakravarthy, J. Minker,
ternational ConferenceonVery Large Databases,Kyoto, and D. Tran, \Interfacing Predicate Logic Languages
Japan, 1986. and Relational Databases," in Proceedings of the First
International Logic Programming Conference, pp. 91-98,
[ ]
Li, 1984 D. Li, AProlog Database System, Research Stud-
Septemb er 1982.
ies Press, Letchworth, 1984.
[ ]
Chamb erlin, et. al., 1976 D. Chamb erlin, el al.. \SE-
[ ]
Minker, 1978 J. Minker, \An Exp erimental Relational
QUEL2: A Uni ed Approach to Data De nition, Manip-
Data Base System Based on Logic," in Logic and
ulation, and Control", IBM Journal of R&D, 20, 560-575,
Databases, ed. J. Minker, Plenum Press, New York, 1978.
1976.
[ ]
Morris, 1988 K. Morris, J. Naughton, Y. Saraiya, J. Ull-
[ ]
Chang, 1978 C. Chang, \DEDUCE 2: Further investiga-
man, and A. Van Gelder, \YAWN! Yet Another Window
tions of deduction in relational databases," in Logic and
on NAIL!," IEEE Data Engineering,vol. 10, no. 4, De-
Databases, ed. H. Gallaire, pp. 201-236, New York, 1978.
cemb er 1987, pp. 28-43.
[ ]
Chang and Walker, 1984 C. Chang and A. Walker,
[ ]
Motro, 1986 Amihai Motro, \Query Generalizatio n:
\PROSQL: A Prolog Programming Interface with
A Metho d for interpreting Null Answers", in Ex-
SQL/DS," Proc. of the 1st Intl. Workshop on Expert
pert Database Systems, ed. L. Kerschb erg, Ben-
Database Systems, Kiawah Island, South Carolina, Octo-
jamin/Cummings, Menlo Park CA, 1986.
b er 1984.
[ ]
Naish and Thom, 1983 L. Naish and J. A. Thom, \The
[ ]
Chimenti, et. al., 1987 D. Chimenti, A. O'Hare, R. Krish-
MU-Prolog Deductive Database," Technical Rep ort 83-
namurthy, S. Naqvi, S. Tsur, C. West, and C. Zaniolo,
10, Department of Computer Science, University of Mel-
\An Overview of the LDL System," IEEE Data Engi-
b ourne, Australia, 1983.
neering,vol. 10, no. 4, Decemb er 1987, pp. 52{62.
[ ]
O'Hare, 1987 A. O'Hare, Towards Declarative Control
[ ]
Dahl, et. al., 1990 D. Dahl, L. Norton, D. McKay,M.
of Computational Deduction, University of Wisconsin{
Linebarger and L. Hirschman, \Management and Evalua-
Madison PhD Thesis, June 1987.
tion of Interactive Dialogue in the Air Travel Information
[ ]
O'Hare, 1989 A. O'Hare, \The Intelligent Database Inter-
System Domain", submitted to The DARPA Workshop
face Language", Technical Rep ort, Unisys Paoli Research
on Speech and Natural Language, June 24-27,1990, Hid-
Center, June 1989.
den Valley,PA.
[ ]
O'Hare and Sheth, 1989 A. O'Hare and A. Sheth, \The
[ ]
Finin, et. al., 1989 Tim Finin, RichFritzson, Don Mckay,
interpreted-compiled range of AI/DB systems", SIGMOD
Robin McEntire, and Tony O'Hare, \The Intelligent Sys-
Record, 181, March 1989.
tem Server | Delivering AI to Complex Systems", Pro-
[ ]
O'Hare and Travis, 1989 A O'Hare and L. Travis, \The
ceedings of the IEEE International Workshop on Tools
KMS Inference Engine: Rationale and Design Ob jec-
for Arti cial Intel ligence{ Architectures, languages and
tives", Technical Rep ort TM-8484/003/00, Unisys - West
Algorithms, March 1990.
Coast Research Center, 1989.
[ ]
Fritzson and Finin, 1988 RichFritzson and Tim Finin,
[ ]
O'Hare and Sheth, 1989 Anthony B. O'Hare and Amit
\Protem { An Integrated Exp ert Systems To ol", Tech-
Sheth. The architecture of BrAID: A system for e-
nical Rep ort LBS Technical Memo Numb er 84, Unisys
cient AI/DB Integration.Technical Rep ort PRC-LBS-
Paoli Research Center, May 1988.
8907, Unisys Paoli Research Center, June 1989.
[ ]
Hirschman, et. al., 1989 Lynette Hirschman,
[ ]
Reiter, 1978 R. Reiter, \Deductive Question-Answering
Martha Palmer, John Dowding, Deb orah Dahl, Marcia
on Relational Data Bases," in Logic and Databases, ed.
Linebarger, Reb ecca Passonneau, Francois-Michel Lang,
J. Minker, Plenum Press, New York, 1978.
Catherine Ball, and Carl Weir. The pundit natural-
[ ]
Sheth, et. al., 1988 A. Sheth, D. van Buer, S. Russell, and language pro cessing system. In AI Systems in Govern-
ment Conference. Computer So ciety of the IEEE, March
S. Dao, \Cache Management System: Preliminary Design
and Evaluation Criteria," Unisys Technical Rep ort TM- 1989.
8484/000/00, Octob er 1988.
[ ]
Intellicorp, 1987 Intellicorp, \KEEConnection: A Bridge
[ ]
Stonebreaker, et. al., 1987 M. Stonebraker, E. Hanson
Between Databases and Knowledge Bases", An Intel-
and S. Potamianos, \A Rule Manager for Relational
licorp Technical Article, 1987.
Database Systems," in The Postgres Pap ers, M. Stone-
[ ]
Ioannidis, et. al., 1988 Y, Ioannidis, J. Chen, M. Fried-
braker and L. Rowe eds, Memo UCM/ERL M86/85,
man, and M. Tsangaris, \BERMUDA{AnArchitec-
Univ. of California, Berkeley, 1987.
tural Persp ectiveonInterfacing Prolog to a Database
[ ]
Van Buer, et. al., 1985 D. Van Buer, D. McKay,D.Ko-
Machine," Proceedings of the Second International Con-
gan, L. Hirschman, M. Heineman, and L. Travis, \The
ference on Expert Database Systems, April 1988.
Flexible Deductive Engine: An Environment for Proto-
[ ]
Jarke, et. al., 1984 M. Jarke, J. Cli ord, and Y. Vassiliou,
typing Knowledge Based Systems," Proceedings of the
\An Optimizing Prolog Front-End to a Relational Query
Ninth International Joint ConferenceonArti cial Intel-
System," Proceedings of the 1984 ACM-SIGMOD Con-
ligence, Los Angeles CA, August 1985.
ference on the Management of Data, Boston, MA, June
1984.
[ ]
Kellog, et. al., 1986 C. Kellogg, A. O'Hare, and L. Travis,
\Optimizing the Rule/Data Interface in a Knowledge