Analysis System of Speech Acts and Discourse Structures Using Maximum Entropy Model*
Total Page:16
File Type:pdf, Size:1020Kb
Analysis System of Speech Acts and Discourse Structures Using Maximum Entropy Model* Won Seug Choi, Jeong-Mi Cho and Jungyun Seo Dept. of Computer Science, Sogang University Sinsu-dong 1, Mapo-gu Seoul, Korea, 121-742 {dolhana, jmcho} @nlprep.sogang.ac.kr, [email protected] Recently, machine learning models using a Abstract discourse tagged corpus are utilized to analyze speech acts in order to overcome such problems We propose a statistical dialogue analysis (Nagata (1994a), Nagata (1994b), Reithinger model to determine discourse structures as (1997), Lee (1997), Samuel (1998)). Machine well as speech acts using maximum entropy learning offers promise as a means of model. The model can automatically acquire associating features of utterances with particular probabilistic discourse knowledge from a speech acts, since computers can automatically discourse tagged corpus to resolve analyze large quantities of data and consider ambiguities. We propose the idea of tagging many different feature interactions. These discourse segment boundaries to represent models are based on the features such as cue the structural information of discourse. phrases, change of speaker, short utterances, Using this representation we can effectively utterance length, speech acts tag n-grams, and combine speech act analysis and discourse word n-grams, etc. Especially, in many cases, structure analysis in one framework. the speech act of an utterance influenced by the context of the utterance, i.e., previous utterances. Introduction So it is very important to reflect the information To understand a natural language dialogue, a about the context to the model. computer system must be sensitive to the Discourse structures of dialogues are usually speaker's intentions indicated through utterances. represented as hierarchical structures, which Since identifying the speech acts of utterances is reflect embedding sub-dialogues (Grosz (1986)) very important to identify speaker's intentions, it and provide very useful context for speech act is an essential part of a dialogue analysis system. analysis. For example, utterance 7 in Figure 1 It is difficult, however, to infer the speech act has several surface speech acts such as from a surface utterance since an utterance may acknowledge, inform, and response. Such an represent more than one speech act according to ambiguity can be solved by analyzing the the context. Most works done in the past on the context. If we consider the n utterances linearly dialogue analysis has analyzed speech acts based adjacent to utterance 7, i.e., utterances 6, 5, etc., on knowledge such as recipes for plan inference as context, we will get acknowledge or inform and domain specific knowledge (Litman (1987), with high probabilities as the speech act of Caberry (1989), Hinkelman (1990), Lambert utterance 7. However, as shown in Figure 1, (1991), Lambert (1993), Lee (1998)). Since utterance 7 is a response utterance to utterance 2 these knowledge-based models depend on costly that is hierarchically recent to utterance 7 hand-crafted knowledge, these models are according to the discourse structure of the difficult to be scaled up and expanded to other dialogue. If we know the discourse structure of domains. the dialogue, we can determine the speech act of utterance 7 as response. * This work was supported by KOSEF under the contract 97-0102-0301-3. 230 Some researchers have used the structural dialogues, 10,285 utterances (19.48 utterances information of discourse to the speech act per dialogue). Each utterance in dialogues is analysis (Lee (1997), Lee (1998)). It is not, manually annotated with discourse knowledge however, enough to cover various dialogues such as speaker (SP), syntactic pattern (ST), since they used a restricted rule-based model speech acts (SA) and discourse structure (DS) such as RDTN (Recursive Dialogue Transition information. Figure 2 shows a part of the Networks) for discourse structure analysis. Most annotated dialogue corpus ~. SP has a value of the previous related works, to our knowledge, either "User" or "Agent" depending on the tried to determine the speech act of an utterance, speaker. but did not mention about statistical models to determine the discourse structure of a dialogue. /SPAJser /SP/Agent /ENh'm a student and registered/br a /EN/There is a dormitory in Universily of language course at University of Georgia in Georgia lot language course students. I )User : I would like Io reserve a room. request U.S. ISTIIdecl.pvg,present,no,none.none] ISTl[decl,be,present,no,none,none] /SA/response 2) Agent : What kind of room do you want? ask-ref /SA/introducing -oneself /DS/[21 /DS/[2I 3) User : What kind of room do you have'? ask-ref /SPAJser /SP/User 4) Agent : We have single mid double rooms. response ~9_. /ENfrhen, is meal included in tuilion lee? 5) User : How much are those rooms? ask-tel /EN/I have sa)me questions about lodgings. /ST/¿yn quest.pvg ,present.no.none ,then I IST/Idecl,paa.presenl,no,none,nonel /SA/ask-if 6) Agent : Single costs 30,000 won and double ~SlS 40,000 WOll. response /SA/ask-ref /DS/12. I I ~DS/121 acknowledge --> Continue 7) User : A single room. please. inform r~mmse Figure 2: A part of the annotated dialogue corpus Figure 1 : An example of a dialogue with speech acts The syntactic pattern consists of the selected syntactic features of an utterance, which In this paper, we propose a dialogue analysis approximate the utterance. In a real dialogue, a model to determine both the speech acts of speaker can express identical contents with utterances and the discourse structure of a different surface utterances according to a dialogue using maximum entropy model. In the personal linguistic sense. The syntactic pattern proposed model, the speech act analysis and the generalizes these surface utterances using discourse structure analysis are combined in one syntactic features. The syntactic pattern used in framework so that they can easily provide (Lee (1997)) consists of four syntactic features feedback to each other. For the discourse such as Sentence Type, Main-Verb, Aux-Verb structure analysis, we suggest a statistical model and Clue-Word because these features provide with discourse segment boundaries (DSBs) strong cues to infer speech acts. We add two similar to the idea of gaps suggested for a more syntactic features, Tense and Negative statistical parsing (Collins (1996)). For training, Sentence, to the syntactic pattern and elaborate we use a corpus tagged with various discourse the values of the syntactic features. Table 1 knowledge. To overcome the problem of data shows the syntactic features of a syntactic sparseness, which is common for corpus-based pattern with possible values. The syntactic works, we use split partial context as well as features are automatically extracted from the whole context. corpus using a conventional parser (Kim After explaining the tagged dialogue corpus we (1994)). used in section 1, we discuss the statistical Manual tagging of speech acts and discourse models in detail in section 2. In section 3, we structure information was done by graduate explain experimental results. Finally, we students majoring in dialogue analysis and post- conclude in section 4. processed for consistency. The classification of speech acts is very subjective without an agreed 1 Discourse tagging criterion. In this paper, we classified the 17 types of speech acts that appear in the dialogue In this paper, we use Korean dialogue corpus transcribed from recordings in real fields such as hotel reservation, airline reservation and tour KS represents the Korean sentence and EN reservation. This corpus consists of 528 represents the translated English sentence. 231 corpus. Table 2 shows the distribution of speech S i denote the speech act of U. With these acts in the tagged dialogue corpus. notations, P(SilU1,i) means the probability Discourse structures are determined by focusing that S~ becomes the speech act of utterance U~ on the subject of current dialogue and are given a sequence of utterances U1,U2,...,Ui. hierarchically constructed according to the We can approximate the probability subject. Discourse structure information tagged in the corpus is an index that represents the P(Si I Ul.i) by the product of the sentential hierarchical structure of discourse reflecting the probability P(Ui IS i) and the contextual depth of the indentation of discourse segments. probability P( Si I UI, i - i, $1, ~ - 1). Also we can The proposed system transforms this index approximate P(SilUl, i-l, Si,i-i) by information to discourse segment boundary (DSB) information to acquire various statistical P(Si l SI, g -l) (Charniak (1993)). information. In section 2.2.1, we will describe the DSBs in detail. P(S~IUI,~)= P(SilS~,~-I)P(U~ISi) (1) Syntactic feature Values Notes decl, imperative, It has been widely believed that there is a strong Sentence T)~e The mood of all utterance wh question, yn_question relation between the speaker's speech act and pvg, pvd, paa, pad, be, The type of the main verb. For Main-Verb know, ask, etc. special verbs, lexical items are the surface utterances expressing that speech act (total 88 kinds) used. (Hinkelman (1989), Andernach (1996)). That is, Tense past, present, future. The tense of an utterance the speaker utters a sentence, which most well Negative Sentence Yes or No Yes if an utterance is negative. expresses his/her intention (speech act) so that serve, seem, want, will, The modality of an utterance. Aux-Verb etc. (total 31 kinds) the hearer can easily infer what the speaker's Yes, No, OK., etc. The special word used in the speech act is. The sentential probability utterance having particular Clue-Word (total 26 kinds speech acts. P(UilSO represents the relationship between Table I : Syntactic features used in the syntactic pattern the speech acts and the features of surface sentences.