Automatic Instant Messaging Dialogue Using Statistical Models and Dialogue Acts
Total Page:16
File Type:pdf, Size:1020Kb
Automatic Instant Messaging Dialogue using Statistical Models and Dialogue Acts A thesis presented by Edward Ivanovic to The Department of Computer Science and Software Engineering in total fulfillment of the requirements for the degree of Master of Computer Science and Software Engineering by Research University of Melbourne Melbourne, Australia July 2008 c 2008 - Edward Ivanovic All rights reserved. Thesis advisors Author Steven Bird and Timothy Baldwin Edward Ivanovic Automatic Instant Messaging Dialogue using Statistical Models and Dialogue Acts Abstract Instant messaging dialogue is used for communication by hundreds of millions of people worldwide, but has received relatively little attention in computational linguis- tics. We describe methods aimed at providing a shallow interpretation of messages sent via instant messaging. This is done by assigning labels known as dialogue acts to utterances within messages. Since messages may contain more than one utter- ance, we explore automatic message segmentation using combinations of parse trees and various statistical models to achieve high accuracy for both classification and segmentation tasks. Finally, we gauge the immediate usefulness of dialogue acts in conversation man- agement by presenting a dialogue simulation program that uses dialogue acts to pre- dict utterances during a conversation. The predictions are evaluated via qualitative means where we obtain very encouraging results. Declaration This is to certify that: 1. the thesis comprises only my original work towards the Masters except where indicated in the Preface, 2. due acknowledgement has been made in the text to all other material used, 3. the thesis is approximately 30,000 words in length, exclusive of tables, maps, bibliographies and appendices. Signed, Edward Ivanovic Contents TitlePage.................................... i Abstract..................................... iii TableofContents................................ v ListofFigures.................................. viii ListofTables .................................. ix Citations to Previously Published Work . xi Acknowledgments................................ xii 1 Introduction 1 1.1 Background ................................ 3 1.2 Motivation................................. 4 1.3 Goals.................................... 5 1.4 MethodologicalOutline.......................... 6 1.5 Thesisoverview .............................. 8 2 Instant Messaging 10 2.1 Historical Development of Instant Messaging . ..... 10 2.2 InstantMessagingClients . 12 2.2.1 AOL’sInstantMessenger. 12 2.2.2 MSNMessenger.......................... 14 2.3 Commercial aspects of free Instant Messaging . .... 16 2.3.1 Forecasted Trends in Instant Messaging Usage . .. 18 2.4 Linguistic Features of Instant Messaging . .... 21 2.5 Conclusion................................. 27 3 Human-Computer Dialogue 29 3.1 Dialogue Management Techniques . 29 3.2 Additional research areas for realistic dialogue . ........ 34 3.3 Differences between Dialogue and Monologue . ... 36 3.3.1 ConversationalImplicature. 36 3.3.2 Grounding............................. 38 3.3.3 Turns, Messages, and Utterances . 43 v vi Contents 3.4 DialogueActs............................... 47 3.5 DialogueActAnnotationSchemes. 49 3.5.1 Dialogue Act Markup in Several Layers . 49 3.5.2 Modifications and Extensions of DAMSL . 55 3.6 Prior Work on Dialogue Act Classification . .. 55 3.6.1 Selected Statistical Learning Methods . .. 56 3.6.2 Selected Symbolic Learning Methods . 58 3.7 Summary ................................. 60 4 Construction of an IM dialogue corpus 63 4.1 DialogueActTagSet........................... 64 4.2 DataCollection .............................. 67 4.3 DataAnnotation ............................. 69 4.4 DataPreparation ............................. 69 4.4.1 MessageSynchronisation . 69 4.4.2 UtteranceSegmentation . 72 4.4.3 Lemmatisation .......................... 78 4.4.4 PoSTaggingandChunking . 80 4.5 Conclusion................................. 83 5 Utterance Segmentation and Dialogue Act Classification 84 5.1 HMMUtteranceSegmentationMethod . 85 5.2 DialogueActClassification. 87 5.2.1 NaiveBayesModel ........................ 89 5.2.2 VectorSpaceModel. .. .. 90 5.2.3 SupportVectorMachineModel . 92 5.2.4 MaximumEntropyModel . 93 5.3 Segmentation and Classification using Parse Trees . ...... 95 5.4 Conclusion................................. 98 6 Evaluation 100 6.1 DialogueActClassification. 101 6.1.1 NaiveBayesModel . .. .. 102 6.1.2 Error Analysis of Classification Task . 106 6.2 UtteranceSegmentation . 112 6.2.1 Evaluating Utterance Segmentation . 112 6.2.2 Experimental Results and Discussion . 115 6.3 An Application: Assisted Customer Support . ... 120 6.3.1 UsingtheSimulationProgram. 122 6.3.2 Evaluating the Customer Support Assistant . 124 6.3.3 The Entropy of Utterances within Dialogue Acts . 128 6.4 Conclusion................................. 130 Contents vii 7 Conclusions and Future Work 133 7.1 Conclusions ................................ 133 7.2 FutureWork................................ 135 A Dialogue Macrogame Theory 148 B Computer-assisted conversation logs 153 C Dialogue Act Bigram transition probabilities 174 D Utterance Classifications with Dialogue Acts 177 E Online Chat Support Services 180 E.1 OnlineChatSupport . .. .. 180 E.2 Online Chat Support Software and Services . 180 F Penn Treebank Part of Speech tag set 182 List of Figures 2.1 AOL Instant Messenger (AIM) screen shot. .. 13 2.2 MSNMessengerscreenshot. 15 3.1 Finite-stateautomaton . .. .. 30 3.2 Tree diagram of instant messaging constituents . ...... 44 3.3 Decision tree for Agreement aspect in DAMSL . .. 50 4.1 Example lemmatised sentence. 79 4.2 Sample PoS-tagged and chunked data from corpus . ... 82 5.1 Sample training data used for HMM segmentation . ... 86 5.2 Feature representation mapped from corpus data . .... 88 5.3 RASP parse tree of a message showing utterances separated into sub- trees..................................... 95 5.4 Properanalysesofaparsetree. 96 6.1 Learning curve of naive Bayes models . 106 6.2 Relation between word entropy and dialogue act frequencies ..... 111 6.3 Reference and three hypothesised segmentations for different algorithms.113 6.4 Frequency distribution of utterance and message lengthsinwords. 114 6.5 WindowDiff results of various models used . 116 6.6 ErroneousparsetreeproducedbyRASP . 119 6.7 ParsetreeproducedbyRASP . 120 6.8 Screen shot of the dialogue simulation program . .... 121 viii List of Tables 1.1 An example of the beginning of a dialogue in our corpus. ..... 2 2.1 NetworkingTimeline ........................... 19 2.2 Sample of a many-to-one relationship between requests and a response. 20 2.3 Distinctions between Written and Spoken Language . ..... 23 2.4 Computer Mediated Communication Spectrum . .. 24 2.5 Common abbreviations and emoticons used in instant messaging . 26 3.1 Examples of different types of dialogue systems . .... 31 3.2 Different techniques for dialogue given task complexity ........ 32 3.3 Groundingactsfordiscourseunits. .. 41 3.4 Example of unsynchronised messages in instant messaging ...... 46 3.5 Forward- and backward-looking functions in DAMSL . .... 54 3.6 A selection of studies presented in Stolcke et al. (2000) showing the number of dialogue act tokens, dialogue act tag set size, tag set used, andaccuracy. ............................... 57 4.1 The 12 dialogue act labels with a description and examplesofeach. 65 4.2 Example of the beginning of a dialogue in our corpus . .... 68 4.3 Characteristics of the MSN-Shopping corpus. ..... 68 4.4 Example of unsynchronised messages . 70 4.5 Method of scoring pairwise inter-annotator agreement . ........ 76 4.6 Annotatorsegmentationagreement . 78 5.1 Hypothesised segmentations from sub-tree node combinations .... 97 6.1 Results for all classification algorithms . ..... 102 6.2 Mean n-gram accuracy of labelling utterances with dialogue acts . 103 6.3 Example of a dialogue with utterance boundaries . .... 104 6.4 Relative misclassification error rates . ..... 107 6.5 Perplexity values for dialogue act transitions using bigrams . 108 6.6 Errors showing the impact of dialogue act adjacency pairs ...... 109 ix x List of Tables 6.7 Example from the dialogue simulation program . ... 125 6.8 Dialogue act rankings within the set of predictions by running the shopping evaluation program. n is the n-best rankings of correct sug- gestions. Percentages are cumulative. 127 6.9 Manually clustered utterances to calculate entropy . ........ 129 6.10 Number of utterance clusters with entropy values . ...... 130 C.1 Bigram dialogue act transition probabilities. ....... 176 Citations to Previously Published Work Portions of this thesis have appeared in the following papers: Edward Ivanovic. 2005 Dialogue act tagging for instant messaging chat sessions. In Proceedings of the ACL Student Research Workshop, pages 79–84, Ann Arbor, Michigan, June. Association for Computational Lin- guistics. Edward Ivanovic. 2005 Automatic Utterance Segmentation in Instant Messaging Dialogue. In Proceedings of the Australasian Language Tech- nology Workshop, pages 241–249, Sydney, Australia, December. Aus- tralasian Language Technology Association. Acknowledgments I have benefited greatly from the support and help of many people during my time spent researching and writing this thesis. In particular, I would like to express my gratitude to Steven Bird, my first supervisor, who gave me the opportunity to study computational linguistics and provided inspiration and encouragement