A Neural Network Architecture for High-Speed Database Query Processing

Chun-Hsien Chen & Vasant Honavar
Department of Computer Science
226 Atanasoff Hall
Iowa State University
Ames, IA 50011, U.S.A.
Abstract
Artificial neural networks (ANN), due to their inherent parallelism and potential fault tolerance, offer an attractive paradigm for robust and efficient implementations of large modern database and knowledge base systems. This paper explores a neural network model for efficient implementation of a database query system. The application of the proposed model to a high-speed library query system for retrieval of multiple items is based on partial match of the specified query criteria with the stored records. The performance of the ANN realization of the database query module is analyzed and compared with other techniques commonly used in current computer systems. The results of this analysis suggest that the proposed ANN design offers an attractive approach for the realization of query modules in large database and knowledge base systems, especially for retrieval based on partial matches.

This research was partially supported by the National Science Foundation through grant IRI-9409580 to Vasant Honavar. A preliminary version of this paper [Chen and Honavar, 1995c] appears in the Proceedings of the 1995 World Congress on Neural Networks.
1 Introduction

Artificial neural networks (ANN) offer an attractive computational model for a variety of applications in pattern classification, language processing, complex systems modelling, control, optimization, prediction, and automated reasoning, for reasons that include their potential for massively parallel, high-speed processing and their resilience in the presence of faults (failure of components) and noise. Despite a large number of successful applications of ANN in pattern classification, signal processing, and control, their use in complex symbolic computing tasks, including storage and retrieval of records in large databases and inference in deductive knowledge bases, is only beginning to be explored (Honavar & Uhr, 1994; Sun, 1994; Levine & Aparicio IV, 1994; Goonatilake & Khebbal, 1995).
This paper explores the application of neural networks to database query processing. A database query can be viewed as a process of table lookup, which is used in a wide variety of computing applications. Examples of such lookup tables include: routing tables used in routing messages in communication networks, symbol tables used in compiling computer programs written in high-level languages, and lexicons or dictionaries used in natural language processing. Many applications require the lookup mechanism or query processing system to be capable of retrieving items based on partial matches, or of retrieving multiple records matching the specified query criteria. This capability is computationally rather expensive in many current systems. The ANN-based approach to database query processing proposed in this paper exploits the fact that the database lookup task can be viewed, at an abstract level, in terms of binary mappings which can be efficiently and robustly realized using ANNs (Chen & Honavar, 1994; Chen & Honavar, 1995a).
The rest of the paper is organized as follows: Section 2 reviews the mathematical model of multiple recall from a partially specified input pattern in a neural network, proposed in (Chen & Honavar, 1995a). Section 3 develops in detail an ANN design for a high-speed library query system based on the model described in Section 2. Section 4 compares the performance of the proposed ANN-based query processing system with that of several currently used techniques. Section 5 concludes with a summary of the paper.
2 Multiple Recall from a Partially Specified Input

Most database management systems store data in the form of structured records. When a query is made, the database system searches for and retrieves records that match the user's query criteria, which typically only partially specify the contents of the records to be retrieved. Also, there are usually multiple records that match a query (e.g., books written by a particular author in a library query system). Thus, query processing in a database can be viewed as an instance of the task of recalling multiple stored patterns given a partial specification of the patterns to be recalled. A neural associative memory for performing this task was proposed in (Chen & Honavar, 1995a). This section summarizes its relevant properties.
2.1 Associative Memory with Bipolar Input and Binary Output

The associative memory used is based on a 2-layer perceptron with a layer of input neurons, a layer of hidden neurons, and a layer of output neurons, and hence two layers of connection weights. Given a set U of k bipolar input vectors u_1, ..., u_k of dimension n and a set V of k desired binary output vectors v_1, ..., v_k of dimension m, such an associative memory module can be synthesized using one-shot learning as follows. The memory module has n input, k hidden, and m output neurons. For each associative ordered pair (u_i, v_i), where 1 ≤ i ≤ k, a hidden neuron i is created with threshold n − 2p_i, where p_i ∈ N is the adjustable precision level for that associative pair. The connection weight from input neuron g to hidden neuron i is u_ig, and that from hidden neuron i to output neuron h is v_ih. The threshold for each of the output neurons is set to 0. The activation function at each hidden node is the binary hardlimiter function f_h; that at each output neuron can be either the binary hardlimiter function f_h or the identity function I (i.e., I(x) = x), depending on the particular hardware implementation employed. The binary hardlimiter function f_h is defined by f_h(x) = 1 if x ≥ 0 and 0 otherwise.
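To make the one-shot synthesis concrete, the following sketch (ours, not from the original report; it assumes NumPy and the hypothetical helper names synthesize and recall) builds the two layers of weights and thresholds exactly as described above and performs a recall with the hardlimiter f_h:

    import numpy as np

    def synthesize(U, V, p):
        # One-shot synthesis of the 2-layer associative memory.
        # U: k x n matrix of bipolar (+1/-1) patterns u_1..u_k (1st-layer weights).
        # V: k x m matrix of binary (0/1) patterns v_1..v_k (2nd-layer weights).
        # p: length-k vector of precision levels p_i.
        U = np.asarray(U)
        V = np.asarray(V)
        n = U.shape[1]
        theta = n - 2 * np.asarray(p)   # hidden thresholds n - 2*p_i
        return U, theta, V

    def recall(x, W1, theta, W2):
        # Hidden neuron i fires iff u_i . x >= n - 2*p_i, i.e. D_H(x, u_i) <= p_i.
        hidden = (W1 @ x >= theta).astype(int)
        # Identity output units: one active hidden neuron reproduces v_i exactly;
        # several active ones yield a superposition (see Section 2.3).
        return W2.T @ hidden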
2.2 Associative Recall from a Partially Specified Input Pattern

This section summarizes a mathematical model of associative recall from a partially specified bipolar pattern and its neural network realization. The model assumes that the unavailable components of an input pattern vector have a default value of 0. Thus, a partial input pattern is completed by filling in a 0 for each of the unavailable components.

Let u̇ be a partially specified n-dimensional bipolar pattern with the values of some of its components being unknown; bits(u̇), a function which counts the number of components of the partial pattern u̇ with known values (+1 or −1); pad0(u̇), a function which pads the unavailable bits of the bipolar partial pattern u̇ with 0's; and ⊑, a binary predicate which tests whether "α is a partial pattern of β", where "α is a partial pattern of β" means that the values of the available bits of α are the same as those of their corresponding bits in β.
Let D_H(u̇, v̇) denote the Hamming distance between two bipolar partial patterns u̇ and v̇ whose unavailable components correspond. If bits(u̇) = j, pad0(u̇) is called the padded j-bit partial pattern derived from the partially specified pattern u̇. Define U̇^j_Pi = {u̇ | bits(u̇) = j and u̇ ⊑ u_i}, 1 ≤ j ≤ n and 1 ≤ i ≤ k; i.e., U̇^j_Pi is the set of partial patterns, with j available bits, of the bipolar pattern u_i. Define

U^j_Pi(p) = {pad0(v̇) | bits(v̇) = j and ∃u̇ ∈ U̇^j_Pi such that D_H(v̇, u̇) ≤ ⌊(j/n)p_i⌋},

1 ≤ j ≤ n and 1 ≤ i ≤ k; i.e., U^j_Pi(p) is the set of padded j-bit partial patterns having Hamming distance less than or equal to ⌊(j/n)p_i⌋ to any one of the padded j-bit partial patterns of the full pattern u_i.

Let p_i = p, 1 ≤ i ≤ k; U^{c-n}_Pi(p) = ∪_{j=c}^{n} U^j_Pi(p); and U^{c-n}_P(p) = ∪_{i=1}^{k} U^{c-n}_Pi(p).
Let f_P denote the function of recall from a padded bipolar partial pattern. Then f_P is defined as follows:

f_P : U^{c-n}_P(p) → V
f_P(x) = v_i, if x ∈ U^{c-n}_Pi(p), 1 ≤ i ≤ k

f_P is a partial function and is extended to a full function f̂_P for recall from a padded bipolar partial pattern using the associative memory as follows:

f̂_P : B^{c-n} → V ∪ {⟨0⟩^m}
f̂_P(x) = f_P(x), if x ∈ U^{c-n}_P(p)
f̂_P(x) = ⟨0⟩^m (the m-dimensional zero vector), if x ∈ B^{c-n} − U^{c-n}_P(p)

where B^{c-n} is the universe of n-dimensional vectors each of whose components is 1, 0, or −1 and which have at least c non-zero components.
The neural associative memory designed for recall from a fully specified input pattern can be used for associative recall from a partially specified input pattern by only adjusting the thresholds of the hidden neurons, as follows: multiply the threshold of each hidden neuron by the ratio of the number of available components of the partially specified input pattern to that of a complete pattern. That is, reduce the threshold of each hidden neuron i from n − 2p to (n − 2p)(n_a/n) = n_a − 2p(n_a/n), where n_a ≤ n is the number of available bits of the partially specified input pattern.

Note that p is the precision level set for every memory pattern for recall from a partial pattern, n_a equals the number of available bits of a partial input pattern, and p(n_a/n) is the new precision level.
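As a sketch of this adjustment (again ours, under the same NumPy conventions as the sketch in Section 2.1), recall from a padded partial pattern only requires scaling each stored threshold by n_a/n:

    import numpy as np

    def recall_partial(x_padded, W1, theta, W2):
        # x_padded: n-vector over {-1, 0, +1}; 0 marks an unavailable bit.
        n = W1.shape[1]
        n_a = int(np.count_nonzero(x_padded))     # number of available bits
        scaled = theta * n_a / n                  # (n - 2p) * n_a / n
        hidden = (W1 @ x_padded >= scaled).astype(int)
        return W2.T @ hidden                      # identity output units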
2.3 Multiple Associative Recalls

The memory retrieval process in the neural associative memory described in the previous subsections can be viewed as a two-stage process: identification and recall. During identification of an input pattern, the 1st-layer connections perform similarity measurements and sufficiently activate zero or more hidden neurons so that they produce high outputs of value 1. The actual choice of hidden neurons to be turned on is a function of the 1st-layer weights, the input pattern, and the threshold settings of the hidden neurons. During recall, if only one hidden neuron is turned on, one of the stored memory patterns will be recalled by that hidden neuron along with its associated 2nd-layer connections. Without any additional control, if multiple hidden neurons are enabled, the corresponding output pattern will be a superposition of the output patterns associated with each of the individual hidden neurons. With the addition of appropriate control circuitry, this behavior can be modified to yield sequential recall of more than one stored pattern.

Multiple recalls are possible if some of the associative partitions realized in the associative ANN memory are not isolated (see Chen & Honavar, 1995a for details). An input pattern (a vertex of the n-dimensional bipolar hypercube) located in a region of overlap between several partitions is close enough to the corresponding partition centers (stored memory patterns) and hence can turn on more than one hidden neuron, as explained below.
Suppose we define U̇^n_i(p) = {u | u ∈ Ḃ^n and D_H(u, u_i) ≤ p_i}, 1 ≤ i ≤ k, where Ḃ^n is the universe of n-dimensional bipolar vectors; i.e., U̇^n_i(p) is the set of n-dimensional bipolar vectors which have Hamming distance less than or equal to p_i from the given n-dimensional bipolar vector u_i, where p_i is a specified precision level. To keep things simple, let us assume p_i = p, 1 ≤ i ≤ k. Suppose we define f_M as follows:

f_M : U̇^n(p) → 2^V
f_M(x) = {v_i | x ∈ U̇^n_i(p), 1 ≤ i ≤ k}

where U̇^n(p) = U̇^n_1(p) ∪ U̇^n_2(p) ∪ ... ∪ U̇^n_k(p); U̇^n_i(p) ∩ U̇^n_j(p) ≠ ∅ for some i ≠ j; and 2^V is the power set of V (i.e., the set of all subsets of V). The output of f_M is a set of binary vectors that correspond to the set of patterns that should be recalled given the bipolar input vector x. f_M is a partial function and is extended to a full function f̂_M to describe multiple recall in the associative memory module as follows:

f̂_M : Ḃ^n → 2^V ∪ {{⟨0⟩^m}}
f̂_M(x) = f_M(x), if x ∈ U̇^n(p)
f̂_M(x) = {⟨0⟩^m}, if x ∈ Ḃ^n − U̇^n(p)

f_M can be further extended to a function f_MP for dealing with recall from a partially specified bipolar input pattern. f_MP is defined as follows:

f_MP : U^{c-n}_P(p) → 2^V
f_MP(x) = {v_i | x ∈ U^{c-n}_Pi(p), 1 ≤ i ≤ k}

where U^h_Pi(p) ∩ U^h_Pj(p) ≠ ∅ for some h's and i ≠ j's, c ≤ h ≤ n and 1 ≤ i, j ≤ k.

f_MP is a partial function and is extended to a full function f̂_MP for multiple recalls from padded bipolar partial patterns in the associative memory as follows:

f̂_MP : B^{c-n} → 2^V ∪ {{⟨0⟩^m}}
f̂_MP(x) = f_MP(x), if x ∈ U^{c-n}_P(p)
f̂_MP(x) = {⟨0⟩^m}, if x ∈ B^{c-n} − U^{c-n}_P(p)
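The behaviour of f̂_MP can be emulated in software as follows (a sketch under the same assumptions as the earlier ones; the hardware described in the text would instead use control circuitry to enable one active hidden neuron at a time):

    import numpy as np

    def recall_multiple(x_padded, W1, theta, W2):
        # Return the list of stored outputs v_i recalled by x_padded,
        # or [<0>^m] if no hidden neuron fires.
        n = W1.shape[1]
        n_a = int(np.count_nonzero(x_padded))
        active = np.flatnonzero(W1 @ x_padded >= theta * n_a / n)
        if active.size == 0:
            return [np.zeros(W2.shape[1], dtype=int)]   # <0>^m: no match
        return [W2[i] for i in active]                  # one pattern per recall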
3 Query Processing Using a Neural Associative Memory

This section describes the use of the neural associative memory developed in the previous section to implement a high-speed database query system. A library query system is used to illustrate the key concepts.

A neural associative memory that can be used to support a library system queried by name can be designed as follows. Suppose the input is a name provided in a format with the last name followed by the first name of an author, and the characters that appear in the name are represented using ASCII codes extended to 8 bits by padding each character code with a leading 0.

Assume the lengths of both the last and first names are truncated to at most L characters each. Since each ASCII code consists of 8 binary bits, we need 8L input neurons for the last name and 8L input neurons for the first name. Each binary bit x_b of the ASCII input is converted into a bipolar bit x_p by the expression x_p = 2x_b − 1 before it is fed into the ANN memory module for database queries. This is motivated by the relative efficiency of the hardware implementations of binary and bipolar associative memories; see (Chen & Honavar, 1995a) for details.
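One way to perform this encoding is sketched below (the helper names, and the choice to encode trailing padding as unavailable 0 bits, are our assumptions rather than prescriptions of the paper):

    import numpy as np

    L = 15  # assumed maximum number of characters per name field

    def encode_field(name, width=L):
        # Encode up to `width` characters, 8 bits each, as bipolar values;
        # positions beyond the end of the name are treated as unavailable (0).
        out = []
        for i in range(width):
            if i < len(name):
                bits = format(ord(name[i]), '08b')         # 8-bit ASCII
                out.extend(2 * int(b) - 1 for b in bits)   # x_p = 2*x_b - 1
            else:
                out.extend([0] * 8)
        return np.array(out)

    def encode_query(last, first):
        # Last name first, as in the input format described above.
        return np.concatenate([encode_field(last), encode_field(first)])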
Let the output be an M-dimensional binary vector pointing to a record in the library database that contains information about a volume (or the binary vector can encode information about a volume directly). The output binary vector in turn can therefore be used to locate the title, author, call number, and other relevant information about a volume. Using M output neurons, we can access at most 2^M records from the library database. Each hidden neuron in the associative memory module is used to realize an ordered pair mapping an author's name to an M-bit pointer that points to a record containing information about a corresponding volume. During storage of an associated pair, the connection weights are set as explained in Section 2.1. During recall, the thresholds of the hidden neurons are adjusted for each query as outlined in Section 2.2, where for each query the value of n_a can be set either by a centralized check on the user's input, or by distributed circuitry embedded in the input neurons. The precision level p is set at 0 for this associative ANN memory module.
The library query system can handle queries using wild cards, as illustrated by the following:

Case 1: Suppose a user enters "Smith " to search for the books written by authors with the last name Smith. In this case, the part of the input for the first name is viewed as unavailable, the corresponding input neurons are fed with 0, only the first 8 × 5 = 40 input neurons have input value either 1 or −1, and the threshold set at all hidden neurons is 8 × 5 = 40. Suppose the library database contains k volumes written by authors with the last name Smith. In this case, the ANN memory module contains k hidden neurons for these k volumes (one for each volume written by an author whose last name is Smith). During the recall process all these hidden neurons will have a net input of 0 and the other hidden neurons will have net inputs less than 0 (see Section 2 for details). The neurons with non-negative net input get activated one at a time to sequentially recall the desired M-bit pointers pointing to the books written by authors with the specified last name.

Case 2: Suppose a user enters " John" to search for the books written by authors with the first name John. In this case, the part of the input for the last name is viewed as unavailable, the corresponding input neurons receive 0s as input, only the input neurons numbered 8L + 1 through 8L + 4 × 8 receive inputs of either 1 or −1, and the threshold set at all hidden neurons is 4 × 8 = 32. The recall of the associated pointers proceeds as in Case 1.

Case 3: Suppose a user enters "Smith J" to search for the books written by authors with the last name Smith and a first name beginning with J. In this case, the rest of the characters of the first name are viewed as unavailable, the corresponding input neurons receive 0s as inputs, only the neurons numbered 1 through 40 and 8L + 1 through 8L + 8 have inputs of either 1 or −1, and the threshold set at all hidden neurons is 6 × 8 = 48. The recall of the associated pointers proceeds as in Case 1.
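The thresholds used in the three cases follow from Section 2.2: with p = 0 the scaled threshold (n − 2p)(n_a/n) reduces to n_a, the number of available bits. A quick check (illustrative code, assuming L = 15):

    def wildcard_threshold(known_chars):
        # With p = 0 the scaled hidden threshold equals n_a = 8 * known_chars.
        return 8 * known_chars

    print(wildcard_threshold(5))   # Case 1, "Smith":   40
    print(wildcard_threshold(4))   # Case 2, "John":    32
    print(wildcard_threshold(6))   # Case 3, "Smith J": 48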
The large number of hidden neurons in such an ANN module poses a problem because of the large fan-out of the input neurons and the large fan-in of the output neurons in a hardware realization of the module using electronic circuits. One solution to this problem is to divide the whole module into several sub-modules which contain the same numbers of input, hidden, and output neurons. These sub-modules are linked together by a shared input and output bus (see Figure 1). Such a bus topology also makes it possible to easily expand the size of the database. The 1-dimensional array structure shown in Figure 1 can be easily extended to 2- or 3-dimensional array structures.
[Figure 1: A 1-dimensional array of ANN memory sub-modules (module 1, module 2, ..., module n), each with its own input, hidden, and output neurons, linked by a shared input bus and a shared output bus.]

Figure 1 shows a modular design of the proposed ANN memory for easy expansion. This 1-dimensional array structure can be easily extended to 2- or 3-dimensional array structures.
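In software terms, the bus organization of Figure 1 amounts to broadcasting the query to every sub-module and collecting the recalled pointers over the shared output bus; a rough sketch (reusing the hypothetical recall_multiple from Section 2.3):

    def modular_recall(x_padded, modules):
        # modules: list of (W1, theta, W2) triples, one per sub-module.
        results = []
        for W1, theta, W2 in modules:            # broadcast on the input bus
            for v in recall_multiple(x_padded, W1, theta, W2):
                if v.any():                      # skip <0>^m (no match here)
                    results.append(v)            # collect via the output bus
        return results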
It is rather straightforward to modify the proposed database query system to make it case-insensitive. The following shows the ASCII codes of the English characters, denoted in hexadecimal and binary:

A = 41_16 = 0100 0001_2, ..., Z = 5A_16 = 0101 1010_2
a = 61_16 = 0110 0001_2, ..., z = 7A_16 = 0111 1010_2

The binary codes for the upper-case and lower-case forms of the same English character differ only at the 3rd bit counted from the left-hand side. If that bit is viewed as "don't care", the query system becomes case-insensitive. This effect can be achieved by treating the corresponding input value as though it were unavailable.
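This don't-care treatment is easy to verify (illustrative code; bit indexing follows the 8-bit codes shown above):

    def encode_char_caseless(ch):
        bits = [2 * int(b) - 1 for b in format(ord(ch), '08b')]
        bits[2] = 0   # 3rd bit from the left encodes case; treat as unavailable
        return bits

    assert encode_char_caseless('A') == encode_char_caseless('a')
    assert encode_char_caseless('Z') == encode_char_caseless('z')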
4 Comparison with Other Database Query Processing Techniques

This section compares the anticipated performance of the proposed neural architecture for database query processing with that of other approaches that are widely used at present. Such a performance comparison takes into account the performance of the hardware used in these systems and the process used for locating data items. First, the performance of the proposed neural network is estimated, based on current CMOS technology for realizing neural networks. Next, the operation of conventional database systems is examined, and their performance is estimated and compared to that of the proposed neural architecture.
4.1 Performance of Current Electronic Realizations of Neural Networks

Many electronic realizations of artificial neural networks (ANNs) have been reported (Graf & Henderson, 1990; Grant et al., 1994; Hamilton et al., 1992; Lont & Guggenbuhl, 1992; Massengill & Mundie, 1992; Moon et al., 1992; Robinson et al., 1992; Uchimura et al., 1992; Watanabe et al., 1993). ANNs are implemented using mostly CMOS-based analog, digital, and hybrid electronic circuits. The analog circuit, which mainly consists of processing elements for multiplication, summation, and thresholding, is popular for the realization of ANNs since compact circuits capable of high-speed asynchronous operation can be achieved (Gowda et al., 1993). (Uchimura et al., 1992) reports a measured propagation delay of 104 ns from the START signal until the result is latched, using a digital synapse circuit containing an 8-bit memory, an 8-bit subtractor, and an 8-bit adder. (Graf & Henderson, 1990) adopts a hybrid analog-digital design with 4-bit binary synapse weight values and current-summing circuits to achieve a network computation time of less than 100 ns between the loading of the input and the latching of the result in a comparator. (Grant et al., 1994) achieves a throughput rate of 10 MHz (delay = 100 ns) in a Hamming Net pattern classifier using analog circuits. The hardware implementation of the proposed neural architecture for database query processing is very similar to that of the first-layer subnetwork of a Hamming Net. Hence the computation delay of the proposed ANN can be expected to be of the order of 100 nanoseconds given current CMOS technology for implementing ANNs.

The development of specialized hardware for the implementation of ANNs is still in its early stages. Conventional CMOS technology, currently the main technology for VLSI implementation of ANNs, is known to be slow (Kumar, 1994; Lu & Samueli, 1993). Other technologies, such as BiCMOS, NCMOS (Kumar, 1994), pseudo-NMOS logic, standard N-P domino logic, and quasi N-P domino logic (Lu & Samueli, 1993), may provide better performance for the realization of ANNs. Thus, the performance of hardware implementations of ANNs is likely to improve with technological advances in VLSI.
4.2 Analysis of Query Processing in Conventional Computer Systems

In database systems implemented on conventional computer systems, given the values for a key, a record is located quickly by using key-based organizations including hashing, indexed-sequential access, and B-trees (Ullman, 1988). For a very large database, the data is usually stored on secondary storage devices such as hard disks. Conventionally, the estimated cost of locating a record is based on the number of physical block accesses of the secondary storage devices (Ullman, 1988). The cost-effective access time with current disk systems appears to be around 10 ms. In comparison, the proposed ANN-based implementation is clearly far superior when speed of access is the prime consideration. However, in order to ensure a fair comparison between conventional database systems and the proposed ANN-based system, we will assume that the former uses RAM (random access memory) to store all the data. The cost-effective access time of RAM using current technology is around 60 ns. In what follows, we will compare the proposed ANN-based database query system with the conventional approach, where the latter is assumed to use main memory (RAM).
Analysis of Query Processing Using Hashing Functions

The hashing structure is the fastest of all key-based organizations. However, although it is effective in locating a single record, it is inefficient for locating several related records in response to a single query. Let us consider the time needed to locate a single record using a hash function in current computer systems. Commonly used hash functions are based on multiplication, division, and addition operations (Kohonen, 1987; Sedgewick, 1988). Assume the computer systems have dedicated hardware to execute multiplication and division; without such hardware, their performance would be even worse. In hardware implementations, addition is faster than multiplication, which is about as fast as division. The time taken to multiply two n-bit numbers is T(n) = (4n − 2)τ using Braun's array multiplier (Rafiquzzaman & Chandra, 1988), where τ is the gate delay. If n = 12 and τ = 5 ns, T = 230 ns; for n = 64, T = 1270 ns on a 64-bit processor. (Lu & Samueli, 1993) proposes a faster pipelined hardware 12×12-bit (n = 12) multiplier whose pipeline latency is 13 cycles, with a cycle time of 5 ns. If each input field in a query contains N bits, computing a hashing function will take several multiplications and/or divisions in at least ⌈N/n⌉ cycles with an n-bit processor dedicated to each such field, depending on the hashing algorithm employed. Other overhead of computing a hashing function in such systems includes the time for loading and executing the instructions that compute the hashing function and for handling the potential problem of collisions in the hash function. So the total computation time for locating a single record can be expected to be typically far in excess of 100 ns.
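The delay figures quoted above follow directly from T(n) = (4n − 2)τ; a one-line check:

    tau = 5  # gate delay in ns
    for n in (12, 64):
        print(n, (4 * n - 2) * tau, 'ns')   # 230 ns for n = 12; 1270 ns for n = 64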
Analysis of Query Processing Using Index Search

A perfectly balanced binary search tree is another popular data structure used in conventional database systems to locate a single record. Suppose every node in the binary search tree links two child subtrees and there are 2^M − 1 nodes in the tree. The following discussion is based on the case of Section 3. Since the binary search structure is assumed to be stored in RAM (for high speed and for fair comparison with the proposed ANN-based model), it is estimated that the computer system needs at least (2L + 6)(2^M − 1) bytes, or 72 MB if L = 15 and M = 21 (the library at Iowa State University has over 2 million volumes), of RAM to store the binary search structure (8-16 MB of RAM is typically the size of working memory in current PCs and workstations). The data structure for each node contains at least an index of 2L bytes and two 3-byte pointers (one might need to use 4- or 8-byte pointers of type long integer or double precision, depending on the computer system used) pointing to the 2^M − 1 records. The average number of nodes visited for locating a desired record would be (1/2) log_2 2^M = 0.5M. On average, every visit takes about 10 instructions executed on a 64-bit processor (assume L = 15; then, on average, half of the 240 bits of a key are compared in every visit). Suppose the speed of the 64-bit processor is 100 MIPS (million instructions per second). The overhead will then be about 0.5M × 10 × 10 ns = 50M ns = 1050 ns if M = 21, which is far more than 100 ns. Once again, note that this is only the cost of locating a single record. The cost of locating several related records (since the user-entered data for multiple fields could be partially specified) is examined in the next section.
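The memory and timing estimates above can be reproduced as follows (L = 15, M = 21, 10 instructions per visit at 10 ns each, as assumed in the text):

    L, M = 15, 21
    node_bytes = 2 * L + 6                      # 2L-byte index + two 3-byte pointers
    print(node_bytes * (2**M - 1) / 2**20)      # ~72 MB of RAM
    visits = 0.5 * M                            # average number of nodes visited
    print(visits * 10 * 10)                     # 1050.0 ns of search overhead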
The Cost of Partial-Match Queries

One of the most commonly used data structures for processing partial-match queries on multiple fields is the k-d tree (Bentley, 1975). In the worst case, the number of visited nodes in an ideal k-d tree containing M nodes (one for each record stored) for locating the desired records for a partial-match query is

J + [2(2^{K−J+1} − 1)/(2^{K−J} − 1)] × [(M + 1)^{(K−J)/K} − 1] = O(M^{(K−J)/K})

where K is the number of fields used to construct the k-d tree, and J out of the K fields are explicitly specified with query criteria by the user. For typical values of M, K, and J, the performance of such systems is far worse than that of the proposed ANN-based model.
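To get a feel for the dominant term O(M^((K−J)/K)), it can be evaluated for some illustrative values (our choice of M, K, and J, not the paper's):

    M, K = 2**21 - 1, 4
    for J in (1, 2, 3):
        print(J, round(M ** ((K - J) / K)))   # roughly 55000, 1448, and 38 node visits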
5 Summary and Discussion

Artificial neural networks, due to their inherent parallelism and potential for fault tolerance, offer an attractive paradigm for robust and efficient implementations of a broad range of information processing tasks. In this paper, we have explored the use of artificial neural networks for query processing in large databases. The use of the proposed approach was demonstrated using the example of a library query system. The performance of a CMOS hardware realization of the proposed database query processing system was estimated and compared with that of other approaches widely used in conventional databases implemented on current computer systems. The comparison shows that ANN architectures for query processing offer an attractive alternative to conventional approaches, especially for dealing with partial-match queries in large databases. With the increased use of large networked databases over the Internet, efficient architectures for high-speed query processing are of great practical interest. Extensions of the proposed ANN architecture for query processing to deal with compound queries are in progress (Chen & Honavar, 1995b).
References

J. L. Bentley, Multidimensional Binary Search Trees Used for Associative Searching, Communications of the ACM, vol. 18, no. 9, pp. 507-517, 1975.

C. Chen and V. Honavar, Neural Network Automata, Proc. of the World Congress on Neural Networks, vol. 4, pp. 470-477, San Diego, 1994.

C. Chen and V. Honavar, Neural Associative Memories for Content as well as Address-Based Storage and Recall: Theory and Applications. To appear. Preliminary version available as Iowa State University Dept. of Computer Science Tech. Rep. ISU-CS-TR 95-03.

C. Chen and V. Honavar, Neural Network Architecture for Parallel Set Operations. In preparation.

C. Chen and V. Honavar, A Neural Network Architecture for a High-Speed Database Query Processing System, Proc. of the World Congress on Neural Networks, Washington, D.C., 1995.

S. Goonatilake and S. Khebbal (Eds.), Intelligent Hybrid Systems, Wiley, London, 1995.

S. M. Gowda et al., Design and Characterization of Analog VLSI Neural Network Modules, IEEE Journal of Solid-State Circuits, vol. 28, no. 3, pp. 301-313, 1993.

H. P. Graf and D. Henderson, A Reconfigurable CMOS Neural Network, ISSCC Dig. Tech. Papers, pp. 144-145, San Francisco, CA, 1990.

D. Grant et al., Design, Implementation and Evaluation of a High-Speed Integrated Hamming Neural Classifier, IEEE Journal of Solid-State Circuits, vol. 29, no. 9, pp. 1154-1157, 1994.

R. L. Greene, Connectionist Hashed Associative Memory, Artificial Intelligence, vol. 48, pp. 87-98, 1991.

A. Hamilton et al., Integrated Pulse Stream Neural Networks: Results, Issues, and Pointers, IEEE Transactions on Neural Networks, vol. 3, no. 3, pp. 385-393, 1992.

V. Honavar and L. Uhr (Eds.), Artificial Intelligence and Neural Networks: Steps Toward Principled Integration, Academic Press, New York, 1994.

T. Kohonen, Content-Addressable Memories, 2nd ed., Springer-Verlag, Berlin, 1987.

R. Kumar, NCMOS: A High Performance CMOS Logic, IEEE Journal of Solid-State Circuits, vol. 29, no. 5, pp. 631-633, 1994.

D. Levine and M. Aparicio IV (Eds.), Neural Networks for Knowledge Representation and Inference, Lawrence Erlbaum, Hillsdale, NJ, 1994.

J. B. Lont and W. Guggenbuhl, Analog CMOS Implementation of a Multilayer Perceptron with Nonlinear Synapses, IEEE Transactions on Neural Networks, vol. 3, no. 3, pp. 457-465, 1992.

F. Lu and H. Samueli, A 200-MHz CMOS Pipelined Multiplier-Accumulator Using a Quasi-Domino Dynamic Full-Adder Cell Design, IEEE Journal of Solid-State Circuits, vol. 28, no. 2, pp. 123-132, 1993.

L. W. Massengill and D. B. Mundie, An Analog Neural Network Hardware Implementation Using Charge-Injection Multipliers and Neuron-Specific Gain Control, IEEE Transactions on Neural Networks, vol. 3, no. 3, pp. 354-362, 1992.

M. Minsky and S. Papert, Perceptrons: An Introduction to Computational Geometry, MIT Press, Cambridge, MA, 1969.

G. Moon et al., VLSI Implementation of Synaptic Weighting and Summing in Pulse Coded Neural-Type Cells, IEEE Transactions on Neural Networks, vol. 3, no. 3, pp. 394-403, 1992.

M. Rafiquzzaman and R. Chandra, Modern Computer Architecture, Chapter 3, West Publishing Company, St. Paul, MN, 1988.

M. E. Robinson et al., A Modular CMOS Design of a Hamming Network, IEEE Transactions on Neural Networks, vol. 3, no. 3, pp. 444-456, 1992.

R. Sedgewick, Algorithms, 2nd ed., Addison-Wesley, 1988.

R. Sun and L. Bookman (Eds.), Computational Architectures Integrating Symbolic and Neural Processes, Kluwer, New York, 1994.

K. Uchimura et al., An 8G Connection-per-second 54mW Digital Neural Network with Low-power Chain-Reaction Architecture, ISSCC Dig. Tech. Papers, pp. 134-135, San Francisco, CA, 1992.

J. D. Ullman, Principles of Database and Knowledge-Base Systems, vol. I, Chapter 6, Computer Science Press, Maryland, 1988.

T. Watanabe et al., A Single 1.5-V Digital Chip for a 10^6 Synapse Neural Network, IEEE Transactions on Neural Networks, vol. 4, no. 3, pp. 387-393, 1993.