Constructing a Dictionary Describing Feature Changes of Arguments in Event Sentences

Tetsuaki Nakamura † Daisuke Kawahara †‡
[email protected] [email protected]
† Graduate School of Informatics, Kyoto University
‡ JST PRESTO

Proceedings of the 4th Workshop on Events: Definition, Detection, Coreference, and Representation, pages 46–50, San Diego, California, June 17, 2016. © 2016 Association for Computational Linguistics

Abstract

Common sense knowledge plays an essential role in natural language understanding, human-machine communication, and so forth. In this paper, we acquire knowledge of events as common sense knowledge because dictionaries of such knowledge may be useful for recognizing implication relations in texts, inferring human activities and their planning, and so on. As for event knowledge, we focus on feature changes of arguments (hereafter, FCAs) in event sentences. To construct a dictionary of FCAs, we propose a framework for acquiring such knowledge based on both the automatic approach and the collective intelligence approach, exploiting the merits of both. We acquired FCAs in event sentences through crowdsourcing and conducted a subjective evaluation to validate whether the FCAs were adequately acquired. The evaluation showed that we were able to capture FCAs in event sentences reasonably well.

1 Introduction

Common sense knowledge plays an essential role in natural language understanding, human-machine communication, and so forth. There are two approaches to acquiring such knowledge: the automatic acquisition approach and the manual acquisition approach. The automatic acquisition approach uses machine learning techniques and pattern matching methods. This approach is useful when the amount of data to be acquired is extremely large. However, the quality of the acquired knowledge might be low. The manual acquisition approach may use a fully manual technique (as at the very early stage of artificial intelligence studies) or collective intelligence (e.g., crowdsourcing and games with a purpose). This approach is useful for gathering subjective information, such as emotion information, which is difficult to acquire automatically.

In this paper, we acquire knowledge of events as common sense knowledge because dictionaries of such knowledge may be useful for recognizing implication relations in texts, inferring human activities and their planning, and so on. As for event knowledge, we focus on feature changes of arguments (hereafter, FCAs) in event sentences, because several studies of infant cognitive development (Massey and Gelman, 1988; Baillargeon et al., 1989; Spelke et al., 1995) report that even infants use information about the basic features of participants in events to understand the events. Some studies suggest that dictionaries of FCAs may be useful resources for deep understanding of texts (Rahman and Ng, 2012; Goyal et al., 2013). Rahman and Ng (2012) use relationships between event causalities and the polarities (positive / negative) of participants in the events as features for anaphora resolution. Goyal et al. (2013) use a dictionary of various binary effects (success(+/-), motivation(+/-), and so on) on characters caused by events (that is, verbs) for automatic story generation.

To construct a dictionary of FCAs, we propose a framework for acquiring such knowledge based on both an automatic approach and a manual approach (collective intelligence) that exploits the merits of both. That is, we acquire seed knowledge by using collective intelligence and expand the knowledge automatically.

2 Related Work

Many studies have aimed at automatically acquiring relationships between events (Chambers and Jurafsky, 2009; Chambers and Jurafsky, 2010; Vanderwende, 2005; Shibata et al., 2014). However, these studies do not focus on the motivation of events or the effects caused by the events. Although the dictionaries constructed in such studies tell us which events occur after a given event, they do not tell us why those events occur. For example, although we may learn from such a dictionary that the event "a girl cries" happens after another event "a girl gets injured," we cannot understand why the former event occurs. To understand the motivation of the former event, we have to know that "the girl feels pain." Of course, the problem can be solved by treating the phenomenon "the girl feels pain" as an event and describing it in the dictionary. However, such a policy requires infinite descriptions. It is preferable to form and maintain the dictionary with a controlled granularity.

When participants in events are animate beings, events may influence their emotions. Many studies developing software to process human emotions exploit models proposed by psychologists (Hasegawa et al., 2013; Tokuhisa et al., 2008; Tokuhisa et al., 2009; Vu et al., 2014). In these studies, Ekman's Big Six Model (Ekman, 1992) and Plutchik's wheel of emotions (Plutchik, 1980) are used to automatically extract emotion knowledge from large corpora. However, since these studies do not focus on what happens after various changes of emotions are evoked, they are not able to capture complex phenomena involving such changes.

Some manually constructed dictionaries of high quality are available. For example, ConceptNet (http://conceptnet5.media.mit.edu/), built in the Open Mind Common Sense project (Singh, 2002), and a large-scale knowledge database built in the CYC project (Lenat, 1995) are available. The database built in the VerbCorner project (Hartshorne et al., 2014), an expansion of VerbNet (Kipper et al., 2000) through crowdsourcing, is also available. However, these studies do not focus on the granularity of knowledge.

To control the granularity, we focus on the use of basic level features of arguments in event sentences. Since both animate beings and inanimate objects can be arguments in sentences, we assume not only physical features but also mental features. The decision criteria for these features are described in Section 3.2.

3 Proposal

3.1 Architecture

To construct a knowledge database that keeps a controlled granularity and can be used to understand the motivation of events, we focus on the feature changes of arguments (FCAs) in event sentences. Our purpose is to construct a dictionary that has the architecture shown in Figure 1. As shown in the figure, the dictionary describes relationships between events and FCAs in the event sentences.

Figure 1: Example entry in the proposed dictionary. Event sentences are associated with FCAs.

3.2 Features

As described in Section 2, we focus on the use of basic level features of arguments in event sentences. We assume that the 16 features listed in Table 1 are associated with arguments in event sentences. As for physical basic level features, we regard the eight physical features listed in Table 1 as basic level features because we pay attention to the traditional sense categories ("sight", "hearing", "smell", "taste", and "touch"). As for mental basic level features, we use the basic emotions of Plutchik's model (Plutchik, 1980), that is, the eight mental features listed in Table 1, because the model assumes that other emotions are derived systematically from the basic emotions.

Category   Features                                                           Degrees
physical   form, color, touch, smell, sound, taste, location, temperature     changed, unchanged, irrelevant
mental     joy, fear, trust, surprise, anger, disgust, sadness, expectation   increased, decreased, unchanged, irrelevant

Table 1: Features of arguments.

3.3 Construction Method

We are planning to construct the dictionary described in Section 3.1 based on both automatic procedures and collective intelligence as follows:

(PHASE 1) Frequent event sentences are selected automatically.
(PHASE 2) FCAs of the selected sentences are acquired as seed knowledge based on crowdsourcing, because our dictionary is designed to deal with subjective information such as emotions.
(PHASE 3) Knowledge is automatically expanded by using the seed knowledge.

The sequence of procedures described above is illustrated in Figure 2. We intend to expand the size of our dictionary by repeating this loop.

Figure 2: Expansion of our dictionary. We used automatic procedures and crowdsourcing to expand the size of the dictionary.

4 Experiment

We conducted an experiment to validate whether [...]

[...] zero anaphora and coreferences. The KUCF is a database of case frames automatically constructed from 1.6 billion Japanese sentences taken from Web pages. The KUCF has about 40,000 predicates, with 13 case frames on average for each predicate.

The sentence creation procedure is as follows.

(STEP 1) The 200 most frequent verbs were extracted from the KWDLC.
(STEP 2) Verbs that are very abstract (e.g., "do") were omitted. As a result, 167 verbs remained.
(STEP 3) The two most frequent arguments were used as the representative arguments of each case frame (meaning) of each verb.
(STEP 4) Sentences were created by combining the verbs with their representative arguments. As a result, 1,935 Japanese sentences were created.
(STEP 5) Sentences that were difficult to understand were pruned based on crowdsourcing (footnote 4). The crowdsourcing workers were asked to answer whether they could understand the presented sentences. Each sentence was judged by 10 workers. In total, 244 people participated in the task. For each sentence, we estimated the probability that "yes" was selected by using the method proposed by Whitehill et al. (2009) and omitted sentences with probabilities less than 0.9.

As a result, 857 event sentences remained, containing 148 verbs (types). Since each of these sentences had two or three arguments, 1,768 arguments (tokens) were obtained.

4.2 Method

FCAs in event sentences were acquired through a crowdsourcing task.
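
As an illustration of the dictionary architecture in Section 3.1 combined with the features and degrees of Table 1, a single entry could be represented roughly as follows. This is only a minimal sketch: the class name `EventEntry`, its methods, and the example annotations for "a girl gets injured" are hypothetical and are not taken from the acquired data.

```python
from dataclasses import dataclass, field

# Features and degrees as listed in Table 1.
PHYSICAL_FEATURES = ("form", "color", "touch", "smell",
                     "sound", "taste", "location", "temperature")
MENTAL_FEATURES = ("joy", "fear", "trust", "surprise",
                   "anger", "disgust", "sadness", "expectation")
PHYSICAL_DEGREES = ("changed", "unchanged", "irrelevant")
MENTAL_DEGREES = ("increased", "decreased", "unchanged", "irrelevant")


def allowed_degrees(feature):
    """Return the degrees that Table 1 permits for the given feature."""
    if feature in PHYSICAL_FEATURES:
        return PHYSICAL_DEGREES
    if feature in MENTAL_FEATURES:
        return MENTAL_DEGREES
    raise ValueError(f"unknown feature: {feature}")


@dataclass
class EventEntry:
    """Hypothetical dictionary entry: one event sentence and its FCAs."""
    sentence: str
    # Maps argument -> {feature: degree}.
    fcas: dict = field(default_factory=dict)

    def set_change(self, argument, feature, degree):
        """Record a feature change, rejecting degrees not in Table 1."""
        if degree not in allowed_degrees(feature):
            raise ValueError(f"degree {degree!r} not valid for {feature!r}")
        self.fcas.setdefault(argument, {})[feature] = degree

    def get_change(self, argument, feature):
        """Return the recorded degree, or None if nothing was annotated."""
        return self.fcas.get(argument, {}).get(feature)


# Illustrative annotations only (assumed, not from the paper's data).
entry = EventEntry(sentence="a girl gets injured")
entry.set_change("girl", "sadness", "increased")
entry.set_change("girl", "form", "changed")
```

Keeping the physical and mental degree sets separate mirrors Table 1, where physical features are judged as changed or unchanged while mental features are judged as increased or decreased.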
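
The pruning rule in STEP 5 of the sentence creation procedure can be sketched as a simple threshold filter. Note the substitution: the paper estimates the probability of "yes" with the method of Whitehill et al. (2009), whereas the sketch below uses the plain fraction of "yes" votes as a crude stand-in. The function names and the example judgments are hypothetical.

```python
def yes_probability(votes):
    """Fraction of 'yes' votes for one sentence.

    Assumption: this replaces the estimate of Whitehill et al. (2009),
    which is not reproduced here.
    """
    if not votes:
        raise ValueError("a sentence needs at least one judgment")
    return sum(votes) / len(votes)


def prune_sentences(judgments, threshold=0.9):
    """Keep sentences whose estimated 'yes' probability is >= threshold.

    `judgments` maps each sentence to a list of booleans, one per worker
    (True means the worker could understand the sentence).
    """
    return [sentence for sentence, votes in judgments.items()
            if yes_probability(votes) >= threshold]


# Hypothetical judgments from 10 workers per sentence.
example_judgments = {
    "sentence A": [True] * 10,               # 10 of 10 say yes
    "sentence B": [True] * 9 + [False],      # 9 of 10 say yes
    "sentence C": [True] * 6 + [False] * 4,  # 6 of 10 say yes
}
kept = prune_sentences(example_judgments)
```

With these made-up votes, the first two sentences meet the 0.9 threshold and the third is omitted, matching the rule that sentences with probabilities less than 0.9 are dropped.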
