
THE USE OF A SEMANTIC NETWORK IN A DEDUCTIVE QUESTION-ANSWERING SYSTEM

James R. McSkimin, Bell Telephone Laboratories, Columbus, Ohio 43209

Jack Minker, Department of Computer Science, University of Maryland, College Park, Maryland 20742

Abstract

The use of a semantic network to aid in the deductive search process of a Question-Answering System is described. The semantic network is based on an adaptation of the predicate calculus. It makes available user-supplied, domain-dependent information so as to permit semantic data to be used during the search process.

Three ways are discussed in which semantic information may be used. These are:

(a) To apply semantic information during the pattern-matching process.
(b) To apply semantic well-formedness tests to query and data inputs.
(c) To determine when subproblems are fully solved (i.e., they have no solutions other than a fixed, finite number).

An example is provided which illustrates the use of a semantic network to perform each of the above functions.

1. Introduction

Semantic networks have been used primarily in natural language applications to help disambiguate sentences and to understand natural language text. In this paper we consider the use of a semantic network to aid in the deductive search process of a Question-Answering (QA) System. The semantic network is based on an adaptation of the predicate calculus and is described only briefly in this paper and more extensively by McSkimin and Minker (McSkimin [1976], and McSkimin and Minker [1977]). Terminology from the predicate calculus will be used throughout the paper.

Three ways will be discussed in which semantic information may be applied to help restrict deductive searches. These are:

(1) To apply semantic information during the pattern-matching process (unification algorithm). Most current pattern-matching systems are based solely on syntactic tests. Using the semantic network, semantic constraints may be applied during the pattern-matching process to inhibit data base assertions and general axioms that are semantically irrelevant to the search from entering into the deductive search space.

(2) To apply semantic well-formedness tests to query and data base assertions input to the system so as to reject queries that have no answer because they violate semantic restrictions. Thus, it should be possible to determine that a query such as "Who is the person who is both the father and the mother of a given individual?" is not answerable.

(3) To identify those queries which have a known maximum number of solutions so as to terminate searches for additional answers once the known fixed number is found.

The semantic network discussed in this paper is described briefly in Section 2. In Section 3, we describe how the semantic network may be used to solve some of the problems associated with the above items. An example which illustrates the use of the semantic network is presented in Section 4. A summary of the work and future directions is given in Section 5.

2. Semantic Network

Although the term 'semantic network' has been used extensively in the literature, there is no universal agreement as to what constitutes such a network. Hence, we shall define the term in the context of this paper.

The semantic network to be described arose out of the need to provide meaning to objects in a domain and to statements made about these objects so as to make deductive searches more efficient. Although the semantic network developed is used in deductive searches, it nevertheless bears considerable relationship to those developed through the need to understand natural language. The semantic network developed by Schubert [1976], for example, bears many similarities to the one used here.

The semantic network described here is an adaptation of the predicate calculus and is able to express quantification, functions, terms and logical connectives. The adaptation is based upon the notation of Fishman and Minker (Fishman [1973], Fishman and Minker [1975]), who modified predicate calculus clause notation to handle sets of objects that have the same template structure.

The article by Schubert [1976] discusses many semantic network representations used for natural language processing, and surveys the literature so that we neither refer to nor compare our work on semantic networks with that achieved by others.

In order to implement the techniques described in the introduction, domain-dependent information must be stored in the computer in a form convenient for use. Consequently, a major part of this research has concerned the identification of the types of semantic information to be stored, and the development of structures in which to store the information. To this end, a collection of structures termed the "semantic network" has been developed which contains all information available to the question-answering system.

The semantic network consists of four components: (1) the semantic graph, which specifies the set-theoretic relation between named subsets of the domain; (2) the data base of assertions and inference rules; (3) the semantic form space, which defines the semantic constraints placed on arguments of relational n-tuples; and (4) the dictionary, which defines the set membership for each element of the domain. All four components of the semantic network are used by the techniques described above for making the QA process more efficient. Illustrations of how this information is used will be given in the next section.

(a) The Semantic Graph

The major emphasis of this effort is the investigation of techniques by which user-supplied semantic information may be stored in a computer and used to make the deductive inference process more efficient. The approach taken is to define explicitly the contents of the domain of discourse D as well as the relationships in which various subsets of the domain may occur.

To this end, much of the work has involved the investigation of how one might subdivide the domain D into a finite number of named subsets S_j such that all elements of each S_j have some set of properties in common. These sets are expressed as Boolean Category Expressions (BCE). Examples are: senator ∩ male -liberal, state, judge ∩ lawyer. The names "senator", "state" and "judge" are examples of what are defined to be the simplest type of BCE possible and are called semantic categories. A BCE is any arbitrary combination of categories using the set operations of union (∪), intersection (∩) and complement (-). It is necessary to subdivide the domain D into subsets since certain relational statements may only be made about specified subsets of D, and one would like to make these subsets explicit rather than implicit. This subdivision is specified by a semantic graph Gs which defines how each category C is subdivided into subsets C1,C2,...,Cn and how each of the Cj is similarly defined. Figure 1 shows an example of such a graph. Note that both animate and living are supersets of animal; however, animate is the superset of robot, which is disjoint from living, and living is the superset of plant, which is disjoint from animate. Thus, animate and living overlap.

Subdividing D in this manner and defining where in the hierarchy each domain element lives (the function of the dictionary) has several advantages over expressing set memberships by unary relations. In particular, it should be computationally more efficient to perform trivial set membership inference using such a structure rather than by using unary relations. Thus, Sirica ∈ judge might be stored in the dictionary rather than storing the unit clause JUDGE(Sirica) in the data base. The rationale for this choice is given in McSkimin [1976].

(b) Data Base

Assertions are facts, whereas general axioms are used to infer assertions about domain elements that are otherwise stored implicitly in the data base. Both types are stored in a "parallel clause" notation, termed Π-σ notation, an extension of the Π-representation of Fishman and Minker. An example of an assertion in Π-σ notation is:

((α,x,y), {{[PARENT]/α, [Ruth,Herb]/x, [Anne,Carol,Jim]/y}}).

The assertion states that Ruth and Herb are the parents of Anne, Carol and Jim. An example of a general axiom is:

(¬(α,x,y) ∨ (β,x,y), {{[RESIDE]/α, congressperson/x, state/y, [REPRESENT]/β}}).

The axiom states that for all α, β, x, and y, if the object x is in the set congressperson, and the object α is the predicate RESIDE, and x resides in
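The set relationships described for the semantic graph, and the way a single substitution set in the parallel clause notation stands for several ground tuples, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: categories are modeled extensionally as sets, the element names ("dog", "oak", "r2") and the helpers `complement`, `disjoint` and `expand` are hypothetical, and the CONFLICT test on Boolean category expressions specified in McSkimin [1976] is not reproduced here.

```python
from itertools import product

# Hypothetical toy domain; the element names are illustrative assumptions.
animal = frozenset({"dog", "cat"})
robot = frozenset({"r2"})
plant = frozenset({"oak"})

# Both animate and living are supersets of animal, so they overlap,
# yet neither contains the other.
animate = animal | robot
living = animal | plant
domain = animate | living


def complement(category):
    """Complement of a category relative to the whole domain D."""
    return domain - category


def disjoint(category_a, category_b):
    """True when two categories share no domain element."""
    return category_a.isdisjoint(category_b)


def expand(template, substitution_set):
    """Enumerate the ground tuples denoted by one substitution set.

    template is a tuple of variable names; substitution_set maps each
    variable to the set of values it may take.
    """
    choices = [sorted(substitution_set[var]) for var in template]
    return list(product(*choices))


# The PARENT assertion from the text, written as one substitution set.
parent = {
    "alpha": {"PARENT"},
    "x": {"Ruth", "Herb"},
    "y": {"Anne", "Carol", "Jim"},
}
parent_tuples = expand(("alpha", "x", "y"), parent)

# A Boolean category expression built from set operations:
# animate things that are not living.
animate_not_living = animate & complement(living)
```

In this toy domain `animate_not_living` contains only the robot, and `parent_tuples` holds the six parent-child facts that the single assertion abbreviates.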

and the two Π-σ literals would be prevented from unifying. Thus, clause (3) would never be entered into the search space, so that it would not lead to a deductive search, thereby decreasing the time and space used over that of a purely syntactic pattern match.

Semantic unification is applied during the deductive search process. It is also applied when one is entering new facts or general rules into the system, and when a query is entered. These are described in the following sections.

3.2 Semantic Well-Formedness of Π-σ Clauses

One way in which the Π-σ unification algorithm may be used is to perform semantic well-formedness tests on Π-σ clauses input to a question-answering system. Π-σ clauses are used in two different ways: both as assertions and general axioms to be stored in the data base, and as questions posed to the system. As noted previously, the data base comprises one part of the semantic network. The

with because at least one of its arguments conflicts with the corresponding argument of all semantic forms for that n-tuple size.

Although some instances of the Π-σ literal L might fail to unify with any semantic form, others may succeed. What is desirable, therefore, is to transform a Π-σ clause input to the system into one (or perhaps several) clauses that are entirely well-formed. These clauses may then be entered into the data base or input to the deductive mechanism as appropriate. Those instances failing to unify should be isolated and the user informed of the error. The semantic well-formedness algorithm which does all of these things is given by McSkimin.

An important part of the well-formedness algorithm is the unification of input literals against the semantic form space. Each semantic form P = (T,Φ) consists of a template T of the form: (v0,v1,...,vn) v S and a Π-σ set

for predicate evaluation, they are treated as any other literal and must be resolved away in order for the query to be answered.

In this section it has been shown how semantic constraints may be used to reject semantically inconsistent queries and data base assertions. The third use of the semantic network is given in the following section.

3.3 Semantic Actions in the Search Space

A major problem facing any problem solving system is the growth of the search space. When the problem to be solved is complicated, the search space grows and usually, when a solution is not found, one runs out of machine work-space rather than time. One can use knowledge about the problem domain to help decrease the workspace needed.

3.3.1 Representing Counting Restrictions

A natural candidate for decreasing the workspace is to have knowledge concerning predicates. In particular, one might refer to counting predicates as ones in which there can be either a fixed or an upper bound to the number of solutions to the problem. For example, when referring to the U.S. Senate, there are two senators who represent one state. Thus, if one is searching for an answer to a subproblem which concerns senators who represent Maryland, only a maximum of two may be found in the system.

To take advantage of counting predicates, counting information must be represented in the semantic network, and a bookkeeping mechanism must exist during the search process to keep track of when a solution has been found, and when all possible solutions have been found. We sketch how this is accomplished below. (See McSkimin for details.)

To motivate the discussion, consider the following query, "Who represents the states of New York or New Jersey?". The query, in Π-σ notation, may be given in negated form as:

(¬(α,x,y), {{[REPRESENT]/α, [NY,NJ]/y}}).

Information of a general nature concerning the predicate REPRESENT is contained in the semantic form space. Thus, it is desired to denote that two senators represent every state, that there are 15 representatives for NJ and 39 from NY, and that each senator or congressman can represent at most one state.

The above semantic information can be represented in the semantic form space as,

F: ((α,x,y), {{[REPRESENT]/α, senator#1/x, state#2/y},
              {[REPRESENT]/α, rep#1/x, [NJ]#15/y},
              {[REPRESENT]/α, rep#1/x, [NY]#39/y}}).

By the notation B#m/v, is meant, that each ele-

x,y), {{[REPRESENT]/α, senator#1/x, [NY,NJ]#2/y},
       {[REPRESENT]/α, rep#1/x, [NJ]#15/y},
       {[REPRESENT]/α, rep#1/x, [NY]#39/y}}).

For every substitution set φ_i of a predicate there is associated a semantic set count (SSC), which represents the number of possible solutions that can be found relative to the predicate. Let the substitution set φ be given by φ = {S0#m0/v0, S1#m1/v1, ..., Sn#mn/vn}. Then it is easy to see that

$$ SSC = \min_{i=1}^{n} \left( \mathrm{Card}(S_i) \cdot m_i \right), $$

where Card(S_i) is the cardinality of the set S_i, and m_i is the element semantic count. Thus, for φ = {[REPRESENT]/α, senator#1/x, [NY,NJ]#2/y}, SSC = min(100·1, 2·2) = 4, where Card(senator) = 100. That is, there are only 4 possible solutions - the two senators from each state for the particular substitution set. If one sums the SSC over all substitution sets relative to the subproblem (literal), one obtains the total possible solutions. If all solutions are found for all substitution sets, the literal is said to be fully solved. For the single subproblem associated with the sample query there are 4+15+39 = 58 possible semantic solutions.

The syntactic count (SC) for a substitution set is simply the number of possible entries in the relation, and is given by

$$ SC = \prod_{i=1}^{n} \mathrm{Card}(S_i). $$

This number is always greater than or equal to the semantic set count, and generally is considerably larger. In particular, SC = 200 for the above example.

Given a Π-σ literal in the search space, a target syntactic count (TSC) is specified for each substitution set φ_i for each literal, and is defined as

$$ TSC_i = SC_i - SSC_i. $$

When a Π-σ literal is to be solved, it is removed from the clause and placed in a special list. The literal on the list is initially given the count SC. As unique solutions are found for the literal, they are subtracted out of the substitution set, and the count of the literal, initially set to SC, is decremented by the number of unique solutions found. When the count for the literal on the list equals TSC, the literal is fully solved.

To take advantage of counting predicates, it is necessary to know when, during the search process, a literal has been solved, and if so, whether the literal has been fully solved. In the environment of Question-Answering Systems, Fishman [1973,1974] has experimental evidence which indicates that linear resolution with selection function (SL resolution), developed by Kowalski and Kuehner [1971], and independently by Loveland [1969,1970] (who termed it model elimination), or a variant thereof, will be used for the inference mechanism. SL resolution is very convenient to use since one knows exactly when a literal has been solved. This occurs when truncation takes place. The bookkeeping associated with SL resolution permits one to back up to the clause where one first started to search for a solution to the literal. It is at the clause where one first starts to search for a solution to the literal that one wants to initiate semantic actions.

Three types of semantic actions may be taken when fully solved substitution sets of a search space literal L = (L,Φ) are found. Starting at the clause in which the fully solved substitution set appears, one must remove from the search space each substitution set that has been fully solved. This will inhibit any further clauses from being resolved against the clause. If all resolvents from the Π-σ literal have been found, some of the resolvents may not yet have been entered into the workspace. In this case, a pointer to the resolvent set would exist, and could be deleted. The reason for inhibiting any additional clauses from entering the search space is that all solutions have already been found, and no other solutions are possible. By bringing in additional clauses, one is merely trying to find another solution via a different path. The best that can result is that a duplicate solution will be found.

If all substitution sets φ_i become fully solved, then the entire subproblem L = (L,Φ) is fully solved, and all further resolvents or potential resolvents of L should be deleted. Thus all further interactions between L and the data base are avoided. The semantic action of removing fully solved sets and subproblems can thus potentially save a great deal in search effort.

Unfortunately, the interactions (i.e., resolution operations) between literal L and data base clauses are seldom serial in nature. Rather, it is often the case that several interactions may proceed at the same time as cooperating processes (or coroutines) controlled by the search strategy. As a consequence, even though some literal L becomes fully solved, there may be deductions in progress for L that, if continued, may duplicate solutions already found. Therefore, ideally, these processes should be terminated. Thus, the second type of semantic action taken is to prune possible redundant derivation trees from the search space. A convenient data structure to facilitate pruning is one in which the immediate resolvents of a clause are linked together in a list, where each clause in the list points to its parent clause, and the parent clause points to the first and last entries in the list (i.e., it is a binary tree). Pruning in such a search space data structure is straightforward, and is not discussed further here.

The third type of semantic action concerns the literals of generated clauses in redundant derivation trees. As each clause C is generated by resolution or factoring, the search strategy may select a literal from C to be resolved against the data base. The literal might be on a list of literals waiting to be resolved (for example, see Minker et al. [1973]), or in the process of being resolved. Leaving such a literal there will permit it to interact with the data base, thus producing more unnecessary deductions. Therefore, if such a list exists, the selected literals from all clauses of a redundant derivation must be removed from the list, and resolvents in process or that have been found must be terminated or deleted as the case may be.

The following section illustrates how the above techniques may be used together in answering a query.

4. An Example of the Use of Semantics in a QA System

The data base used in this example consists of assertions and general rules that might be useful in a political context. Assume there are many assertions that state where a member of Congress legally resides; i.e.,

A1: ((α,x,y), {{[RESIDE]/α, [Beall,Mathias,Holt,...,Hogan,Gude]/x, [Md]/y},
A2:            {[RESIDE]/α, [Buckley,Javits,Chisholm,...,Abzug]/x, [NY]/y}}

These assertions may be used to derive the state a member of Congress represents by the following general rule:

R1: (¬(α,x,y) ∨ (β,x,y), {{[RESIDE]/α, congress/x, state/y, [REPRESENT]/β}}).

In addition, there might be several axioms that are used to deduce whether a member of Congress supports a special interest group (such as organized labor) by referencing numerical ratings established by different lobbying groups (e.g., COPE, the Committee on Political Education of the AFL-CIO). For instance, if a senator or representative has a COPE rating of 67-100 (on a 0-100 scale) then one may conclude that he (she) supports organized labor. Many of these rules may be stored as well as other general rules that deduce whether one supports some state, country, or interest group. These rules are given below.

R2: (¬(α,x,y) ∨ (β,x,z), {{[COPE]/α, congress/x, R67100/y, [SUPPORT]/β, labor/z},
R3:                       {[NSC]/α, congress/x, R67100/y, [SUPPORT]/β, defense/z}},
R4: (¬(α,x,y) ∨ (β,x,y), {{[REPRESENT]/α, congress/x, state/y, [SUPPORT]/β}}).

Note: NSC = Nat. Security Council, a defense lobby; NFU = Nat. Farm Union; LCV = League of Conservation Voters; R67100 = a category containing all integers from 67 to 100.

The last rule states that if a member of Congress represents a state, then he or she would be expected to support that state.

To derive whether one would support a special-interest group, the ratings of each senator and representative could be stored as assertions:

A3: ((α,x,y), {{[NSC]/α, [Mathias,Cranston,...,Mitchell]/x, [11]/y},

the semantic set count is calculated. In this case, there are SC = 100·2 = 200 syntactic solutions and SSC = 4 semantic solutions. Thus, the value TSC is stored with the REPRESENT literal L and referenced whenever new solutions are found. L is then modified to remove those solutions found so far. When the syntactic count of L equals TSC = 196, L will be fully solved. This process is illustrated by the proof tree of Figure 2 using SL resolution.

3) The general rule R1 is resolved with the REPRESENT literal since no direct match is possible. Since senator ⊂ congress and [Md,NY] ⊂ state, resolvent 2 is formed.
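Before the example continues, the counting arithmetic of Section 3.3.1 can be checked with a short sketch. The Python below is illustrative only: the function names and the encoding of each argument as a (cardinality, element count) pair are assumptions made here, not part of the system described, and the figure of 435 representatives is an outside assumption that the text does not state. It covers the semantic set count, the syntactic count, the target syntactic count, and the decrementing bookkeeping for the senator substitution set.

```python
from math import prod


def semantic_set_count(subst):
    """SSC: minimum over arguments of Card(S_i) * m_i."""
    return min(card * m for card, m in subst)


def syntactic_count(subst):
    """SC: product over arguments of Card(S_i)."""
    return prod(card for card, _ in subst)


def target_syntactic_count(subst):
    """TSC = SC - SSC."""
    return syntactic_count(subst) - semantic_set_count(subst)


def is_fully_solved(subst, unique_solutions_found):
    """The count starts at SC and drops by one per unique solution;
    the substitution set is fully solved when it reaches TSC."""
    count = syntactic_count(subst) - unique_solutions_found
    return count == target_syntactic_count(subst)


# (Card(S_i), m_i) pairs for the arguments x and y.
# senator#1/x, [NY,NJ]#2/y with Card(senator) = 100.
senators = [(100, 1), (2, 2)]
# rep#1/x, [NJ]#15/y and rep#1/x, [NY]#39/y.
# Card(rep) = 435 is an assumption not given in the text.
nj_reps = [(435, 1), (1, 15)]
ny_reps = [(435, 1), (1, 39)]

ssc = semantic_set_count(senators)
sc = syntactic_count(senators)
tsc = target_syntactic_count(senators)
total_semantic_solutions = sum(
    semantic_set_count(s) for s in (senators, nj_reps, ny_reps)
)
```

With these inputs the sketch gives SSC = 4, SC = 200 and TSC = 196 for the senator substitution set, and 58 semantic solutions in total, matching the figures quoted in the text; `is_fully_solved(senators, 4)` holds once the four senators have been found.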
4) The RESIDE literal L is now the right-most (and only) literal in the SL resolvent, and is thus solved next. The element set counts from semantic form F2 are read, and the number of solutions for L is calculated as SSC = min(100·1, 2·v) = 100, where "v" means an unknown number of solutions are possible. The target syntactic count is calculated as TSC = (100·2) - 100 = 100, and stored with the RESIDE literal. Thus, 100 solutions must be found before the literal is considered fully solved. Since it is semantically impossible for 100 senators to reside in two states, this situation will never occur. Fortunately, the RESIDE literal will become indirectly fully solved because its ancestor, the REPRESENT literal, will be fully solved. This process is described below.

5) Many RESIDE assertions are stored in the data base; only those instances are allowed to enter the search space that involve senators from Md or NY - all representatives from Md and NY and any resident of some other state are excluded. Since {Beall, Mathias, Buckley, Javits} ⊂ senator, they are the only ones that pass through the Π-σ unification algorithm.

6) Since the A-literal appears as the right-most literal of clause 3, it implies that a subproblem has been solved. Up-links are followed until the literal appears without brackets in clause 2 (denoting the point at which the current deductive chain began, as indicated by the dotted lines of Figure 5). The solutions found are:

By the procedure described in McSkimin, these solutions are removed from clause 2 (whose syntactic count is 200), leaving a new clause whose syntactic count equals 196. Since this does not equal the target syntactic count of 100, no semantic actions can be taken.

7) The A-literal is truncated from clause 3, yielding clause 4.

8) Another subproblem has been solved since an A-literal is the right-most literal of clause 4. Up-links are followed until the literal first appears without brackets (clause 1). Since four solutions have been found for the literal, its syntactic count becomes 196. Since this equals the target syntactic count, the literal is removed from the search space to prevent other inference rules from attempting to deduce the senators from Md or NY. In addition, the RESIDE literal of clause 2 is removed also, since all residents of Md and NY of interest have been found. Thus, even though this literal had 96 solutions left to find, it could be pruned since it was a descendant of a literal which was fully solved. These are examples of semantic actions.

9) The A-literal is next truncated from clause 4, yielding clause 5, which is next solved in a similar fashion.

This example has illustrated how user-supplied semantic information may be incorporated within the framework of predicate calculus so as to make the deductive inference process more efficient. Three methods were shown to be effective in reducing the amount of effort involved in answering a query: semantic well-formedness tests, semantic operator selection, and semantic actions. Naturally, these are not beneficial for all data bases; conditions under which each is expected to prove most beneficial are given by McSkimin.

5. Summary

We have described three basic uses of a semantic network: (1) To semantically unify two literals. (2) To perform semantic well-formedness tests. (3) To perform semantic actions. Semantic unification serves to decrease the number of deductions that one may have over syntactic methods. Semantic well-formedness tests serve to keep data from entering the data base if they are not semantically well-formed relative to the domain of application. Semantic well-formedness tests may also introduce new literals into the search space which must be satisfied for a solution to be found. Semantic actions serve to use counting information to determine when a literal has been fully solved, and to take actions based upon this information. These actions serve to delimit the search space.

An alternative approach to the one described here would be to build into the system unary predicates rather than set information as represented in the semantic graph. We do not believe that such an approach would be effective, since the addition of axioms to the system to represent transitive superset relations and disjoint relations would be too cumbersome to deal with in practice and would lead to very long proofs.

Although the approach appears to be viable, we cannot yet provide experience which would demonstrate its effectiveness. We expect to determine its effectiveness. One factor is the data structure to be used to perform pattern-directed search for clauses which semantically and syntactically match literals in the data base. A second factor will be the amount of time required to perform the CONFLICT algorithm. This algorithm determines whether or not two Boolean category expressions are semantically consistent. If the CONFLICT algorithm is too time consuming, it may defeat the whole approach and make it comparable to a strictly syntactic approach. Further analysis is needed to estimate whether the effort expended in executing the semantic routines will exceed the extra effort incurred if they are not performed at all.

We believe that an advantage will be shown for the techniques described here. We are currently implementing a system, MRPPS 3.0, which incorporates the techniques described here. We expect to experiment with the system to determine how well it will work on large data bases.

Acknowledgements

The authors wish to express their appreciation to the National Science Foundation for the support that they have provided for this effort under NSF CJ-43632. They would also like to express their appreciation to Mr. Guy Zanon for his careful reading of the paper and his many suggestions.

References

[1] Fishman, D. H. [1973] Experiments with a Resolution-Based Deductive Question-Answering System and a Proposed Clause Representation for Parallel Search, Ph.D. Thesis, Dept. of Computer Science, Univ. of Maryland, College Park, Md. Also Tech. Report TR-280, 1973.

[2] Fishman, D. H. [1974] "Experiments with a Deductive Question-Answering System," Computer and Inf. Sci. Dept., Univ. of Mass., COINS Tech. Report F4C-10, 1974.

[3] Fishman, D. H. and Minker, J. [1975] "Π-Representation: A Clause Representation for Parallel Search," Artificial Intelligence 6 (2) (1975), 103-127.

[4] Kowalski, R. and Kuehner, D. [1971] "Linear Resolution with Selection Function," Artificial Intelligence 2 (3/4) (1971), 221-260.

[5] Loveland, D. W. [1969] "A Simplified Format for the Model-Elimination Theorem Proving Procedure," J.ACM 16 (July 1969), 349-363.

[6] Loveland, D. W. [1970] "Some Linear Herbrand Proof Procedures: An Analysis," Dept. of Computer Science, Carnegie-Mellon Univ., Pittsburgh, Pennsylvania, 1970.

[7] McSkimin, J. R. [1976] The Use of Semantic Information in Deductive Question-Answering Systems, Ph.D. Thesis, Dept. of Computer Science, Univ. of Md., College Park, Md. (1976). Also Tech. Report TR-465, 1976.

[8] McSkimin, J. R. and Minker, J. [1977] "A Predicate-Calculus Based Semantic Network for Question-Answering Systems," Dept. of Computer Science, Univ. of Maryland, College Park, Md., Tech. Report TR-509, 1977.

[9] Minker, J., Fishman, D. H. and McSkimin, J. R. [1973] "The Q* Algorithm - A Search Strategy for a Deductive Question-Answering System," Artificial Intelligence 4 (1973), 225-243.

[10] Minker, J. [1976] "Search Strategy and Selection Function for An Inferential Relational System," Tech. Report TR-497, Univ. of Maryland, College Park, Md., 1976.

[11] Schubert, L. K. [1976] "Extending the Expressive Power of Semantic Networks," Artificial Intelligence 7 (1976), 103-198.

Natural Language-3: McSkimin 58