A Language for Knowledge Representation
Yukio Nakamura, INFOSTA-NIPDOK, Japan

Abstract: A language for representing pieces of knowledge is proposed as an extension of the subject representation hitherto used in document retrieval schemes. Its main features are the use of case representation and modifying symbols for noun descriptors, and the introduction of verb descriptors with tense and modifying symbols. Besides the usual AND and OR operators, some new operators are introduced to show ontological relations. Some examples of knowledge representation are appended.

1. Subject expression

In documentation work we have dealt with the subject of a document, regardless of the size of the document. A subject is a substitute for, or a representative of, the document. In dealing with pieces of knowledge, however, the representative of a piece of knowledge must be treated differently from the "subject" of a document. Pieces of knowledge are simple and short compared with a document. A piece of knowledge does not need a subject representation; a direct description of the piece itself is enough. What, then, is required for the description of knowledge?

The subject was, and is, traditionally described by a noun phrase without a verb. This was justified by the fact that verbs can be transformed into noun form, mostly by using participles in Western languages. Is this applicable in the case of knowledge? In knowledge description we need to describe some kind of change in matter or in state, that is to say an action, or a concept of change. To represent such changes, the use of verbs is very natural. This means that a description can be made by using a sentence, if necessary. For example, the descriptions

(A) A dog bites a man.
(B) A man bites a dog.

are both permissible as valid knowledge. Expressions (A) and (B) can be transformed into the noun-phrase forms

(B1) Biting of a man by a dog.
(B2) Biting of a dog by a man.
These expressions are all in natural language, and we need to transform them into simpler logical formulations if they are to be processed by a computer.

Advances in Knowledge Organization, Vol. 4 (1994), p. 127-133

The formulated forms will be

(S1) Biting ∧ Men ∧ Dogs,
(S2) Biting ∧ Dogs ∧ Men.

(S1) and (S2) are the same in formal logic. It is easy to see that such a means of subject formulation is inadequate for knowledge representation. It is also seen that the use of "biting" is not necessary and that "bite" is quite enough. We see, therefore, that

a. a simple use of descriptors is insufficient and, at least, a differentiation of subject and object is necessary when descriptors are used;
b. verbs can be used as descriptors without any difficulty, if we just abandon the traditional way of thinking.

These two points are discussed in the later part of this paper.

2. Meaning of AND and OR operators

If "Biting", "Men" and "Dogs" are thought of as concepts, then the logical product (conjunction) of "Biting" (an action) with "Men" (or "Dogs") is nonsense in the light of E. Wüster's theory¹. Truly, there is nothing which is "Biting" and "Men" at the same time. In our cases, both for the subject of a document and for knowledge representation, this logical product (conjunction) of "Biting", "Men" and "Dogs" makes sense. Wüster insisted on using not the symbol ∧ but 0., a quite different operator which he termed "conjunction in theme relation (Thema-Relation)". With this symbol, one can represent (S1) and (S2) as

[Biting] 0. [Men] 0. [Dogs],

where [ ] represents a concept. This representation will be agreed upon by him and his followers.

The present author has a slightly different way of thinking. Whenever we use some machine (or even a peek-a-boo card) for handling the subject of a document, all operations are based on the coordination (or collocation) of concepts, descriptors or simple keywords.
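As a minimal sketch of the coordination just described (not the author's own system), the AND of descriptors can be modelled as the intersection of posting sets, much as stacked peek-a-boo cards pass light only through the holes they have in common. The document numbers and the function name `coordinate` are invented for illustration.

```python
# Hypothetical postings: descriptor -> set of document numbers (invented data).
postings = {
    "biting": {1, 2, 5},
    "men": {1, 2, 3},
    "dogs": {1, 2, 4},
}


def coordinate(*descriptors):
    """AND-coordinate descriptors by intersecting their posting sets."""
    sets = [postings[d] for d in descriptors]
    return set.intersection(*sets)


# (S1) Biting AND Men AND Dogs
s1 = coordinate("biting", "men", "dogs")
# (S2) Biting AND Dogs AND Men
s2 = coordinate("biting", "dogs", "men")
```

Because set intersection is commutative, `s1` and `s2` are the same set: the order of the operands cannot distinguish "a dog bites a man" from "a man bites a dog".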
The machines and tools used for processing all work on the same AND operation (as the truth table shows). The difference comes from the operands. If the operands are concepts, then the concept "Biting" cannot enter into the AND operation with "Men" or "Dogs"; but if the operands are collocated words or classification notations, then the AND operation is possible for "Biting" and "Men" (or "Dogs"). All of this can be represented in the author's notation as

[Biting] ∧ [Men]     invalid
_Biting_ ∧ _Men_     valid

where [ ] represents a concept and the underlining (rendered here as _..._) represents a descriptor.

In handling knowledge, we must use descriptors or classification notations; these correspond to concepts but are not concepts themselves. We must be very careful when we use descriptors or classification notations that involve ontological relations, especially where the logical OR is concerned. For instance, _Belgium_ ∨ _Netherlands_ and [Belgium] ∨ [Netherlands] both make sense. With the AND operator, however, [Belgium] ∧ [Netherlands] does not make sense, because there is no place that is Belgium and the Netherlands simultaneously. On the other hand, _Belgium_ ∧ _Netherlands_ makes sense, and _Belgium_ ∨ _Netherlands_ ∨ _Luxembourg_ makes sense also; in many cases this combination is the same as _Benelux_.

3. Verbs as Descriptors

Until now many of us have thought that a concept should be expressed by a noun. However, ISO R1087 (1969) says in its Section 2.1.1: "Concepts may be the mental representation not only of things (expressed by nouns) but in a wider sense, also of qualities (expr. by adjectives or nouns), of actions (as expr. by verbs or nouns) and even locations, situations or relations (expr. by adverbs, prepositions, conjunction or nouns)." The representation of knowledge requires the representation of an action or a change, and therefore verbs necessarily come into our scope. We must then include some form of verbs in our knowledge representation.
The simplest way to do this is to adopt verbs among our descriptors. Verbs have many forms, since they have grammatical conjugation and differences in mood and voice. Concept formation means some kind of abstraction, so we must neglect characteristics that are less important. The simplest choice is to adopt just the idea of action or change; this means that we use only the infinitive (or basic) form and neglect all conjugated forms and distinctions such as mood and voice.

At first this will be difficult for some people to accept, because many natural languages have some kind of grammatical conjugation which is very familiar to them. There are, however, also many languages that have no conjugation at all but can still express everything necessary by the addition of some other words (mostly of an adverbial nature). For instance, we can think of Chinese and Indonesian. An example is shown below:

              Chinese        Indonesian
I sleep.      Wo shui        Saya tidur.
I slept.      Wo shui-le     Saya sudah tidur.

With these in mind, we can adopt verbs in their infinitive form as descriptors and represent them by a symbol such as |flow|, in contrast to a noun descriptor, shown as flow with underlining. Past and future tense can be represented as flow^p and flow^f. Of course, reducing the number of verb descriptors is as essential as it is for noun descriptors. This is a task for the compiler of descriptors.

4. Case representation of noun descriptors

As shown earlier, a differentiation of noun descriptors was felt necessary, at least for the grammatical subject and object. In other cases some other representation is also necessary. This necessity was felt already in the 1960s, and a well-known device was proposed under the name of "roles". Such roles were implemented in experimental systems, but after these trials no commercial retrieval system has employed them.
This is explained by the increased indexing effort and the small gain in effectiveness for existing retrieval systems based on the subject representation of documents. For knowledge representation the situation is quite different: without such devices a correct representation of knowledge is almost impossible. The role system proposed in the Thesaurus of Engineering Terms² was intended for chemical documentation and is not applicable to other disciplines. Other proposals were abundant, but these were also discipline- or problem-oriented, and no general discussion was made. The "of-a" relation is frequently used to construct a kind of hierarchy.

A less known author³ once proposed using prepositions and links to represent syntactical relations in subject representation. He gave examples such as the following (converted into descriptor representation according to ISO 2788):

(FISHES + MARKET) IN San Francisco
    Fish market in San Francisco
FISHES IN (MARKET IN San Francisco)
    Fishes (sold or exhibited) in San Francisco Market
(FISHES + MARKET) IN San Francisco IN SONGS
    Fish market in San Francisco appearing in a song

(In all these examples, IN is a preposition operator.)

This trial shows that prepositions play an important role in the syntax of English and many other Western languages, and that they are therefore useful in showing the syntactic role of descriptors in subject representation. In other languages, where case representation is more direct and clear, these prepositions can be replaced by grammatical case signs; readers may think of the Slavic and Ural-Altaic languages. Prepositions are very language-dependent, and it is difficult to give them neutral and abstract representations. In contrast to prepositions, grammatical cases are rather uniform and universal among languages because they are abstract concepts.
In this sense, the present author's proposal is to use abstract cases common to most languages and to limit the number of cases to a minimum, as shown below:

case            symbol    corresponding preposition
nominative      xxxxn
genitive        xxxxg     of
dative          xxxxd     to
accusative      xxxxc
ablative        xxxxb     from
locative        xxxxb     at, in, on
instrumental    xxxxi     by

Here xxxx means a noun descriptor.
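To illustrate how case marks and verb descriptors together could resolve the ambiguity of (S1) and (S2), the following is a minimal sketch and not the author's notation. The names `Case`, `Noun`, `Verb`, `Statement` and `statement`, as well as the tense marks "p" and "f", are assumptions made for illustration; the case names and prepositions are taken from the table above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet


class Case(Enum):
    """The seven abstract cases listed in the table."""
    NOMINATIVE = "nominative"
    GENITIVE = "genitive"
    DATIVE = "dative"
    ACCUSATIVE = "accusative"
    ABLATIVE = "ablative"
    LOCATIVE = "locative"
    INSTRUMENTAL = "instrumental"


# Prepositions the table associates with each case (none for two of them).
PREPOSITIONS = {
    Case.NOMINATIVE: (),
    Case.GENITIVE: ("of",),
    Case.DATIVE: ("to",),
    Case.ACCUSATIVE: (),
    Case.ABLATIVE: ("from",),
    Case.LOCATIVE: ("at", "in", "on"),
    Case.INSTRUMENTAL: ("by",),
}


@dataclass(frozen=True)
class Noun:
    descriptor: str
    case: Case


@dataclass(frozen=True)
class Verb:
    descriptor: str   # infinitive (basic) form only
    tense: str = ""   # assumed marks: "" present, "p" past, "f" future


@dataclass(frozen=True)
class Statement:
    verb: Verb
    nouns: FrozenSet[Noun]


def statement(verb, *nouns):
    """Build a statement; the order of the noun descriptors is irrelevant."""
    return Statement(verb, frozenset(nouns))


# (A) A dog bites a man.
a = statement(Verb("bite"),
              Noun("dogs", Case.NOMINATIVE), Noun("men", Case.ACCUSATIVE))
# (B) A man bites a dog.
b = statement(Verb("bite"),
              Noun("men", Case.NOMINATIVE), Noun("dogs", Case.ACCUSATIVE))
# (A) again, with the noun descriptors written in the opposite order.
a_reordered = statement(Verb("bite"),
                        Noun("men", Case.ACCUSATIVE), Noun("dogs", Case.NOMINATIVE))
```

Reordering the noun descriptors leaves a statement unchanged (`a == a_reordered`), while swapping their cases yields a different statement (`a != b`), which is the distinction that plain AND coordination loses.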