Predicting Thread Linking Structure by Lexical Chaining
Li Wang,♠♥ Diana McCarthy♦ and Timothy Baldwin♠♥
♠ Dept. of Computer Science and Software Engineering, University of Melbourne
♥ NICTA Victoria Research Laboratory
♦ Lexical Computing Ltd

Li Wang, Diana McCarthy and Timothy Baldwin. 2011. Predicting Thread Linking Structure by Lexical Chaining. In Proceedings of Australasian Language Technology Association Workshop, pages 76-85

Abstract

Web user forums are valuable means for users to resolve specific information needs, both interactively for participants and statically for users who search/browse over historical thread data. However, the complex structure of forum threads can make it difficult for users to extract relevant information. Thread linking structure has the potential to help tasks such as information retrieval (IR) and threading visualisation of forums, thereby improving information access. Unfortunately, thread linking structure is not always available in forums.

This paper proposes an unsupervised approach to predict forum thread linking structure using lexical chaining, a technique which identifies lists of related word tokens within a given discourse. Three lexical chaining algorithms, including one that only uses statistical associations between words, are experimented with. Preliminary experiments lead to results which surpass an informed baseline.

1 Introduction

Web user forums (or simply “forums”) are online platforms for people to discuss and obtain information via a text-based threaded discourse, generally in a pre-determined domain (e.g. IT support or DSLR cameras). With the advent of Web 2.0, there has been rapid growth of web authorship in this area, and forums are now widely used in various areas such as customer support, community development, interactive reporting and online education. In addition to providing the means to interactively participate in discussions or obtain/provide answers to questions, the vast volumes of data contained in forums make them a valuable resource for “support sharing”, i.e. looking over records of past user interactions to potentially find an immediately applicable solution to a current problem. On the one hand, more and more answers to questions over a wide range of domains are becoming available on forums; on the other hand, it is becoming harder and harder to extract and access relevant information due to the sheer scale and diversity of the data.

Previous research shows that the thread linking structure can be used to improve information retrieval (IR) in forums, at both the post level (Xi et al., 2004; Seo et al., 2009) and thread level (Seo et al., 2009; Elsas and Carbonell, 2009). These inter-post links also have the potential to enhance threading visualisation, thereby improving information access over complex threads. Unfortunately, linking information is not supported in many forums. While researchers have started to investigate the task of thread linking structure recovery (Kim et al., 2010; Wang et al., 2011b), most research efforts focus on supervised methods.

To illustrate the task of thread linking recovery, we use an example thread, made up of 5 posts from 4 distinct participants, from the CNET forum dataset of Kim et al. (2010), as shown in Figure 1. The linking structure of the thread is modelled as a rooted directed acyclic graph (DAG). In this example, UserA initiates the thread with a question in the first post, by asking how to create an interactive input box on a webpage. This post is linked to a virtual root with link label 0. In response, UserB and UserC provide independent answers. Therefore their posts are linked to the first post, with link labels 1 and 2 respectively. UserA responds to UserC (link = 1) to confirm the details of the solution, and at the same time, adds extra information to his/her original question (link = 3); i.e., this one post has two distinct links associated with it. Finally, UserD proposes a different solution again to the original question (link = 4).

Post 1 (User A), "HTML Input Code" [link 0 to virtual root Ø]: ...Please can someone tell me how to create an input box that asks the user to enter their ID, and then allows them to press go. It will then redirect to the page ...
Post 2 (User B), "Re: html input code" [link 1 to Post 1]: Part 1: create a form with a text field. See ... Part 2: give it a Javascript action
Post 3 (User C), "asp.net c# video" [link 2 to Post 1]: I’ve prepared for you video.link click ...
Post 4 (User A), "Thank You!" [link 1 to Post 3, link 3 to Post 1]: Thanks a lot for that ... I have Microsoft Visual Studio 6, what program should I do this in? Lastly, how do I actually include this in my site? ...
Post 5 (User D), "A little more help" [link 4 to Post 1]: ... You would simply do it this way: ... You could also just ... An example of this is ...

Figure 1: A snippeted CNET thread annotated with linking structure

Lexical chaining is a technique for identifying lists of related words (lexical chains) within a given discourse. The extracted lexical chains represent the discourse’s lexical cohesion, or “cohesion indicated by relations between words in the two units, such as use of an identical word, a synonym, or a hypernym” (Jurafsky and Martin, 2008, pp. 685).

Lexical chaining has been investigated in many research tasks such as text segmentation (Stokes et al., 2004), word sense disambiguation (Galley and McKeown, 2003), and text summarisation (Barzilay and Elhadad, 1997). The lexical chaining algorithms used usually rely on domain-independent thesauri such as Roget’s Thesaurus, the Macquarie Thesaurus (Bernard, 1986) and WordNet (Fellbaum, 1998), with some algorithms also utilising statistical associations between words (Stokes et al., 2004; Marathe and Hirst, 2010).

This paper explores unsupervised approaches for forum thread linking structure recovery, by using lexical chaining to analyse the inter-post lexical cohesion. We investigate three lexical chaining algorithms, including one that only uses statistical associations between words. The contributions of this research are:

• Proposal of an unsupervised approach using lexical chaining to recover the inter-post links in web user forum threads.
• Proposal of a lexical chaining approach that only uses statistical associations between words, which can be calculated from the raw text of the targeted domain.

The remainder of this paper is organised as follows. Firstly, we review related research on forum thread linking structure classification and lexical chaining. Then, the three lexical chaining algorithms used in this paper are described in detail. Next, the dataset and the experimental methodology are explained, followed by the experiments and analysis. Finally, the paper concludes with a brief summary and possible future work.

2 Related Work

The linking structure of web user forum threads can be used in tasks such as IR (Xi et al., 2004; Seo et al., 2009; Elsas and Carbonell, 2009) and threading visualisation. However, many user forums don’t support the user input of linking information. Automatically recovering the linking structure of forum threads is therefore an interesting task, and has started to attract research efforts in recent years. All the methods investigated so far are supervised, such as ranking SVMs (Seo et al., 2009), SVM-HMMs (Kim et al., 2010), Maximum Entropy (Kim et al., 2010) and Conditional Random Fields (CRF) (Kim et al., 2010; Wang et al., 2011b; Wang et al., 2011a; Aumayr et al., 2011), with CRF models frequently being reported to deliver superior performance. While there is research that attempts to conduct cross-forum classification (Wang et al., 2011a), where classifiers are trained over linking labels from one forum and tested over threads from other forums, the results have not been promising. This research explores unsupervised methods for thread linking structure recovery, by exploiting lexical cohesion between posts via lexical chaining.

The first computational model for lexical chain extraction was proposed by Morris and Hirst (1991), based on the use of the hierarchical structure of Roget’s International Thesaurus, 4th Edition (1977). Because of the lack of a machine-readable copy of the thesaurus at the time, the lexical chains were built by hand. Research in lexical chaining has then been investigated by researchers from different research fields such as information retrieval, and natural language processing. It has been demonstrated that the textual knowledge provided by lexical chains can benefit many tasks, including text segmentation (Kozima, 1993; Stokes et al., 2004), word sense disambiguation (Galley and McKeown, 2003), text summarisation (Barzilay and Elhadad, 1997), topic detection and tracking (Stokes and Carthy, 2001), information retrieval (Stairmand, 1997), malapropism detection (Hirst and St-Onge, 1998), and question answering (Moldovan and Novischi, 2002).

Many types of lexical chaining algorithms rely on examining lexicographical relationships (i.e.

cept or category) inventory from the Macquarie Thesaurus (Bernard, 1986) to build a word-category co-occurrence matrix (WCCM), based on the British National Corpus (BNC). Lin (1998a)’s measure of distributional similarity based on point-wise mutual information (PMI) is then used to measure the association between words.

This research will explore two thesaurus-based lexical chaining algorithms, as well as a novel lexical chaining approach which relies solely on statistical word associations.

3 Lexical Chaining Algorithms

Three lexical chaining algorithms are experimented with in this research, as detailed in the following sections.

3.1 ChainerRoget

ChainerRoget is a Roget’s Thesaurus based lexical chaining algorithm (Jarmasz and Szpakowicz, 2003) based on an off-the-shelf package, namely the Electronic Lexical Knowledge Base (ELKB) (Jarmasz and Szpakowicz, 2001).