Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19)

Aspect-Based Sentiment Classification with Attentive Neural Turing Machines

Qianren Mao 1,2, Jianxin Li 1,2, Senzhang Wang 3, Yuanning Zhang 1,2, Hao Peng 1,2, Min He 4 and Lihong Wang 4

1 Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, China
2 State Key Laboratory of Software Development Environment, Beihang University, China
3 Nanjing University of Aeronautics and Astronautics
4 National Computer Network Emergency Response Technical Team/Coordination Center of China

{maoqr, lijx, zhangyn, penghao}@act.buaa.edu.cn, [email protected], [email protected], [email protected]

Abstract

Aspect-based sentiment classification aims to identify the sentiment polarity expressed towards a given opinion target in a sentence. The sentiment polarity of the target is not only highly determined by the sentiment semantic context but also correlated with the concerned opinion target. Existing works cannot effectively capture and store the inter-dependence between the opinion target and its context. To solve this issue, we propose a novel model of Attentive Neural Turing Machines (ANTM). Via interactive read-write operations between an external memory storage and a recurrent controller, ANTM can learn the dependable correlation of the opinion target to its context and concentrate on crucial sentiment information. Specifically, ANTM separates storage from computation, which extends the capabilities of the controller to learn and store sequential features. The read and write operations enable ANTM to adaptively keep track of the interactive attention history between memory content and controller state. Moreover, we append target entity embeddings to both the input and the output of the controller in order to augment the integration of target information. We evaluate our model on the SemEval 2014 dataset, which contains reviews of the Laptop and Restaurant domains, and on a Twitter review dataset. Experimental results verify that our model achieves state-of-the-art performance on aspect-based sentiment classification.

1 Introduction

Sentiment analysis, also known as opinion mining, has drawn increasing attention from researchers and industry due to its wide application in understanding people's attitudes towards topics, product reviews and so on. Aspect-based sentiment analysis (ABSA) is a fine-grained task in the field of text classification [Pontiki et al., 2014; Peng et al., 2018]. Several subtasks can be regarded as sentiment classification problems at the sentence level, e.g., aspect level sentiment classification and aspect term level (opinion target level) sentiment classification. The goal of our paper is to infer the sentiment polarity (e.g., positive, neutral, negative) of the opinion target appearing in given comments. As shown in case 1 below, which we call a multiple-target-different-polarity sentence, the opinion targets collocate with frequently-used sentiment words: the sentiment polarity of the target food corresponding to the sentiment word good is positive, while the polarity of the target place corresponding to isn't relaxing is negative.

Case 1: The food is usually good but it certainly isn't a relaxing place to go.

Case 2: The only thing I can imagine is that Sony jumped on early specifications for Vista requirements from Microsoft and designed it to those inadequate requirements.

In addition to the challenge of case 1, where the polarity can be opposite when different targets are considered, another challenge, presented in case 2, is what we refer to as a long-sequential-distance sentence. Unlike review expressions in which the sentiment words appear immediately before or after the target words, in case 2 there is a long distance between the target word and the related sentiment words: the target word Vista is far away from the corresponding sentiment word inadequate and the demonstrative pronoun it. Unfortunately, most recurrent neural networks can hardly handle the sentence of case 2, because their chain structure of non-linearities is prone to gradient vanishing.

Among previous works, approaches [Wang et al., 2014; Tang et al., 2016a; Wang et al., 2016] focusing on the multiple-target-different-polarity sentence simply concatenate the target representation to the hidden state of neural networks. However, these methods are deficient in modeling the inter-dependence between target and context where sentiment features are separated by long-term dependencies. Inspired by memory augmented neural networks being successfully applied to the Question & Answering (Q&A) task, MemNet [Tang et al., 2016b] and CEA [Yang et al., 2018a] treat the target entity or aspect as a query object to find sentiment clues in the memory content. The RAM model adopts a multiple-attention mechanism to capture sentiment features separated by a long distance and performs well in target sentiment analysis.

However, this model overloads the usage of memory representations, taking only a single memory to both represent the source sentence and track the attention history. In addition, as with MemNet, relying on the number of attention layers makes the model hard to achieve a stable performance.

To address the above deficiencies, we propose Attentive Neural Turing Machines (ANTM) for aspect term level sentiment classification. We use a structured memory module to store past information separately from the neural network parameters, and utilize an interactive read-write operation to automatically search for sectional sentiment information in the memory content. Specifically, our model contains an external memory which is stacked by word representations of the input sentence, and a recurrent controller to encode the sentence feature representation. With an addressing operation, the external memory can be read and written, which helps to capture sentiment features of the context related to the target words. In particular, the read operation keeps tracking an interactive attention from context words to the opinion target, while the write operation refreshes the contents of the memory at each time step. Finally, we concatenate the opinion target into each hidden vector to augment the integration of target information before computing attention weights for the final sentiment classification.

We evaluate our approach on the SemEval 2014 dataset, which contains reviews of the Laptop and Restaurant domains, and on a Twitter review dataset. Experimental results show that our model achieves substantial performance improvements on these datasets. The prime contributions of our work can be summarized as follows.

• With appended opinion target information, our ANTM model robustly resolves the problem of target-sensitive sentiment through an efficient interaction between the external memory and the neural network state.

• Our ANTM model separates storage from computation, which extends the capabilities of the controller to learn and store sequential features and helps reduce the semantic loss caused by long-term dependencies.

• Our ANTM model sets a new state-of-the-art performance on the task of aspect term/opinion target level sentiment classification.

2 Related Work

Recent research works on ABSA can be broadly categorized into neural network based methods and memory network based methods.

2.1 ABSA with Neural Networks

ABSA is often interpreted as a multi-class classification problem in the literature. Traditional approaches usually first manually build a set of features and then run them through Support Vector Machine (SVM) classifiers [Jiang et al., 2011; Han et al., 2013; Kiritchenko et al., 2014; Wagner et al., 2014]. These feature-based models depend on the quality of laborious feature engineering and are labor intensive. [Dong et al., 2014; Nguyen and Shirai, 2015] construct a target-dependent phrase dependency tree to identify the sentiment of the aspect/target in the sentence, and outperform recursive neural networks. More efficacious work tends to detect the polarity of aspect or aspect term words using conventional neural networks like long short-term memory (LSTM). These models aim to explore the potential correlation between aspect or aspect term words and sentiment polarity. TD-LSTM and TC-LSTM [Tang et al., 2016a] take opinion target information into consideration and achieve good performance in target-dependent classification. [Wang et al., 2016] propose the AE-LSTM, AT-LSTM and ATAE-LSTM methods. These methods introduce an attention mechanism to concentrate on different parts of the sentence when different aspects are taken as input, and the results show that feeding the embeddings of aspects or aspect terms is important for capturing the corresponding sentiment polarity, especially for the case 1 problem mentioned before. Drawing on the experience from Q&A, some methods treat opinion target information as a query vector [Tang et al., 2016b; Liu et al., 2018] or an interactive vector [Ma et al., 2017; Fan et al., 2018] to the context, and achieve very competitive performance. All the results above show that the attention mechanism and appending target information are both effective ways to capture sentiment information related to the concerned opinion target.

2.2 ABSA with Memory Networks

Memory networks were initially explored for the task of Q&A with End-To-End Memory Networks (MemN2N) [Sukhbaatar et al., 2015] and Gated MemN2N [Liu and Perez, 2017], and for the tasks of copy and associative recall with Neural Turing Machines (NTM) [Graves et al., 2014]. Moreover, memory augmented neural networks have also been applied to sentiment classification tasks with success. [Tang et al., 2016b] proposed a deep memory network with multiple hops/layers, named MemNet, for aspect level sentiment classification; it achieved comparable performance with a feature-based SVM system and substantively outperformed standard LSTM architectures. Inspired by the multi-hop memory of MemNet, [Yang et al., 2018b] used multi-hop memory to learn abstractive sentiment-related representations for both entity and aspect and achieved a significant gain over several baselines. Unlike LSTMs used in sentiment classification, memory-augmented networks encourage local changes in memory. This helps not only to find the structure in the training data, but also to generalize to sequences that are beyond the generalization power of LSTMs, such as longer sequences in algorithmic tasks.

3 Methodology

Problem Definition. Our task is aspect term/opinion target level sentiment analysis. Suppose the input sentence with k words is S = {w_1, ..., w_i, ..., w_k}; the goal of our model is to predict the sentiment polarity of a given opinion target w_i. In practice, we choose the aspect term or the opinion target word as the opinion target in the tasks of ABSA and sentiment analysis in Twitter, respectively. (Note that the target can be a multi-word expression, e.g., "the Pope's visit to Palestine"; we construct the target embedding as the average of the embeddings of its words, as [Sun et al., 2015; Tang et al., 2016b; Chen et al., 2017] did.)
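To make the input format concrete, the following minimal sketch (Python; the field names are hypothetical, not part of the released data format) shows how (sentence, opinion target, polarity) training instances for case 1 can be represented: the two instances share one sentence but carry different targets and labels.

```python
# Minimal sketch of aspect term level instances (hypothetical field names).
# One sentence yields one instance per opinion target, each with its own label.
case1 = "The food is usually good but it certainly isn't a relaxing place to go."

instances = [
    {"sentence": case1, "target": "food",  "polarity": "positive"},
    {"sentence": case1, "target": "place", "polarity": "negative"},
]

for ins in instances:
    print(f"target={ins['target']:>5s} -> {ins['polarity']}")
```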


Figure 1: Illustration of our proposed Attentive Neural Turing Machines (ANTM) for opinion target sentiment classification.

3.1 Attentive Neural Turing Machines

In this section, we introduce our ANTM model in detail. The architecture of the ANTM model for aspect term/opinion target level sentiment analysis is illustrated in Figure 1. The left of the figure shows the output at time t of the NTM memory module M with the text representation. The controller (navy blue) is fed with the current input x_t, the read vector r_{t-1} and the output vector h_{t-1}. The addressing operation, which concatenates the hidden vector h_{t-1} and the target A_s, emits read (purple) and write (orange) heads to interact with the memory M. The right dashed box represents the whole ANTM model, where the controller is a Bi-GRU. In particular, the final representation is computed by an attention mechanism after concatenating the target vector to each output representation of the recurrent network.

The previous NTM model couples neural networks to an external memory via selective read and write operations, and expands their ability to "deposit" and "process" information. This framework has been demonstrated to be effective in tasks of copying and sorting data sequences by extending the capabilities of neural networks. However, NTM may not be directly applicable to the ABSA task, because it was designed for copying and sorting. Inspired by the vanilla NTM architecture, our ANTM model contains the same two basic components: a recurrent controller and an extended memory bank. Moreover, our model regards the opinion target as a part of the input to adaptively address important evidence from the external memory, where the memory is stacked by the word representations of a sentence. Furthermore, with the attentive aspect participation, where aspect information is appended to each output hidden representation, our model efficiently concentrates on the inter-dependence between sentiment semantic context words and different targets. Crucially, every component of our architecture is differentiable, making it easy to train in an end-to-end way.

3.2 The Extended Memory

The memory extends the capabilities of neural networks by coupling them to external memory resources via attentional processes that read from and write to the memory selectively. Thus, the external memory can be treated as a module storing addressable information, which ensures that the neural network controller captures more useful information. Our model stacks word embeddings of the input sentence into the memory slots, as other memory networks do. Furthermore, we set up different external memory representations in our experiments to evaluate the importance of the memory to ANTM.

Since recent language pre-training models like ELMo [Peters et al., 2018], GPT [Radford et al., 2018] and BERT [Devlin et al., 2018] have been shown to be powerful for improving many natural language processing tasks on small-scale datasets, we argue that the pre-trained parameters from language models can potentially facilitate sentiment analysis on relatively small-scale corpora. Thus we feed our text to the pre-trained BERT_Base and BERT_Large models (https://github.com/google-research/bert) and directly extract the sequence output representations, which contain richer contextual semantic information. Unlike existing BERT results that use a fine-tuning approach, we take the representations of the corresponding sequence output as our external memory, named BERT_B and BERT_L, respectively.

Specifically, all the word representations are encoded as column vectors in an embedding matrix w ∈ R^{d_w×|V|}, where d_w is the dimension of the word representation and |V| is the vocabulary size. In order to keep full semantic information for sequential encoding, we do not separate the text into two parts and use only the context representations as the external memory as [Tang et al., 2016b] does, but keep the full text representations {e_1, ..., e_i, ..., e_k} and stack them into the slots of the external memory {M_t(1), ..., M_t(i), ..., M_t(k)}, where k is the number of memory slots and is equal to the sentence length; the memory matrix is M_t ∈ R^{d×k}.
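As a concrete illustration of how the memory bank of this section can be assembled, the following sketch (NumPy; the embedding lookup and dimensions are placeholder assumptions, not the authors' released implementation) stacks per-token vectors of a k-word sentence into a d×k memory matrix, which is the shape the read and write heads below operate on.

```python
import numpy as np

def build_memory(token_vectors):
    """Stack per-token representations e_1..e_k into memory slots M(1..k).

    token_vectors: list of k arrays of shape (d,), e.g. GloVe vectors or
    BERT sequence outputs for the k tokens of the input sentence.
    Returns M of shape (d, k): one column per memory slot.
    """
    return np.stack(token_vectors, axis=1)

# Toy example: a 5-word sentence with d = 300 (GloVe-sized) random vectors
# standing in for real embeddings.
rng = np.random.default_rng(0)
sentence_vectors = [rng.standard_normal(300) for _ in range(5)]
M = build_memory(sentence_vectors)
print(M.shape)  # (300, 5): d x k, one slot per word
```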

3.3 Target-Dependent Addressing Mechanisms

ANTM regards the concatenation of the target vector and the output vector of the controller as a query vector to address important evidence from the external memory M. We hope that sentiment information related to the target can be imported into the controller input. Therefore, we use a content-based addressing mechanism to learn the inter-dependency weighting between target and context, an operation we call the Target-Dependent Addressing Mechanism (TDAM). As shown in Eq. (1), q_t ∈ R^d is the query vector, which concatenates the target vector and the last output vector produced by the controller. The TDAM addressing weights w^c_t are produced by a softmax as follows:

    w^c_t(i) = softmax(v_a^T tanh(W_q [q_t; M_t(i)])),    (1)
    q_t = sigmoid(W_1 [h_{t-1}; A_s] + b_1).    (2)

3.4 Writing Memory-State

Writing to the memory-state decomposes into two operations: an erase followed by an add. The erase is similar to the forget gate in LSTM/GRU, which determines how much information is removed from the memory cells. More specifically, the erase vector specifies the values to be removed on each dimension of the memory cells through the normalized weights w^W_t emitted by a write head at time t. The memory vector M_{t-1}(i) from the previous time step is modified by the erase vector e_t ∈ R^d, for each slot i ∈ [1, k], where k is the number of words in the sentence. Formally, the memory-state after the erase is given by:

    M̃_t(i) = M_{t-1}(i) [1 - w^W_t(i) e_t],    (3)
    e_t = σ(W^e_t q_t),    (4)

where e_t ∈ R^d and w^W_t(i) specifies the weight associated with the i-th memory cell, in the same parametric form as in Eq. (1). Each write head also produces an add vector a_t ∈ R^d, which decides how much of the current information should be written to the memory after the erase operation:

    M_t(i) = M̃_t(i) + w^W_t(i) a_t,    (5)
    a_t = σ(W^a_t q_t).    (6)

The combined erase and add operations of all write heads produce the final content of the memory at time t. The composite write operation is differentiable because both erase and add are differentiable. The erase and add operations can adaptively modify each memory slot and assign it an importance score according to its semantic relatedness with the opinion target.
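The following sketch (NumPy; matrix names and sizes are illustrative assumptions rather than the authors' released code) walks through one write step of Eqs. (1)-(6): form the query from the previous controller output and the target embedding, compute the addressing weights over the k slots, then apply the erase-and-add update to every slot.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def tdam_write(M, h_prev, target, params):
    """One target-dependent addressing + write step, Eqs. (1)-(6).

    M: (d, k) memory, one slot per word; h_prev: (d,) last controller output;
    target: (d,) averaged opinion-target embedding.
    """
    W1, b1, Wq, va, We, Wa = (params[name] for name in ("W1", "b1", "Wq", "va", "We", "Wa"))
    d, k = M.shape
    # Eq. (2): query from the previous controller output and the target embedding.
    q = sigmoid(W1 @ np.concatenate([h_prev, target]) + b1)            # (d,)
    # Eq. (1): content-based addressing weights over the k slots.
    scores = np.array([va @ np.tanh(Wq @ np.concatenate([q, M[:, i]])) for i in range(k)])
    w = softmax(scores)                                                # (k,)
    # Eqs. (4) and (6): erase and add vectors emitted from the query.
    e, a = sigmoid(We @ q), sigmoid(Wa @ q)                            # (d,), (d,)
    # Eqs. (3) and (5): erase-then-add update applied to every slot.
    M_new = M * (1.0 - w[None, :] * e[:, None]) + w[None, :] * a[:, None]
    return M_new, w, q

# Toy shapes: d = 4 hidden size, k = 3 memory slots.
rng = np.random.default_rng(1)
d, k = 4, 3
params = {"W1": rng.standard_normal((d, 2 * d)), "b1": np.zeros(d),
          "Wq": rng.standard_normal((d, 2 * d)), "va": rng.standard_normal(d),
          "We": rng.standard_normal((d, d)), "Wa": rng.standard_normal((d, d))}
M = rng.standard_normal((d, k))
M_new, w, q = tdam_write(M, rng.standard_normal(d), rng.standard_normal(d), params)
print(w.round(3), M_new.shape)
```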
3.5 Reading Memory-State

At each time t, let M_t ∈ R^{d×k} be the memory contents and w^R_t be a vector of weights over the k slots emitted by the read head at time t. The output vector is computed as a weighted sum of the pieces of memory M_t(i), namely

    r_t = Σ_{i=1}^{k} w^R_t(i) M_t(i),    (7)

where r_t ∈ R^d, the elements w^R_t(i) ∈ [0, 1] are the weights over M_t, and Σ_i w^R_t(i) = 1. We use content-based addressing to determine w^R_t, with the same parametric form as in Eq. (1).

3.6 Controller Network

The controller network receives inputs from an external environment and emits outputs that participate in the training process of sentiment classification. The controller can be a feed-forward network or a recurrent neural network. We choose the typical recurrent neural network Bi-GRU [Cho et al., 2014] as the controller, in order to verify that coupling with the external memory can extend the capabilities of the Bi-GRU. More formally, the controller with Bi-GRU in our model is computed as follows:

    h_t = GRU(h_{t-1}; r_{t-1}; x_t).    (8)

The controller interacts with the external memory via input and output vectors. On one hand, our model leverages the read vector r_{t-1} produced from the last memory-state at each time t. On the other hand, the output vector that the controller produces acts on the addressing operation as Eq. (2) shows. Through the different extents of the addressing, read and write operations, our model can concentrate on different parts of the sentiment words related to the concerned target.

3.7 Attentive NTM

The frontal way of using aspect information is to let the target representation play a part in the addressing, read and write operations, which not only explicitly captures the importance of each piece of sentiment semantic information, but also helps the external memory store these sentiment semantic clues. In order to utilize target information better, we also append the target representation to each hidden vector so as to enhance the final sentiment classification. The right part of Figure 1 shows the attentive operation, which applies an attention mechanism with aspect information appended to the controller's output vectors. Let H be the matrix consisting of the output vectors [h_1, h_2, ..., h_k] that the controller produced. The attention mechanism produces an attention weight vector over H and the final representation r_s:

    M_s = tanh([W_h H; W_v V_a]),    (9)
    att = softmax(w^T M_s),    (10)
    r_s = H att^T,    (11)

where M_s ∈ R^{(d_h+d_t)×k}, d_h is the dimension of the output vector produced by the controller at time t, and d_t is the dimension of the opinion target vector. V_a = [A_s; A_s; ...; A_s] represents the operation of repeating the target vector across the sequence. w is a trained parameter vector and w^T is its transpose. att is a vector of attention weights, and r_s is the weighted representation of the sentence.
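Continuing the sketch above (same NumPy conventions, illustrative names and sizes), this fragment shows the read step of Eq. (7) and the target-aware attention of Eqs. (9)-(11) that turns the controller outputs H into the weighted sentence representation r_s.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def read_memory(M, w_read):
    """Eq. (7): weighted sum of the k memory slots, yielding one read vector r_t."""
    return M @ w_read                                 # (d, k) @ (k,) -> (d,)

def target_attention(H, target, Wh, Wv, w):
    """Eqs. (9)-(11): attention over controller outputs with the target appended.

    H: (d_h, k) controller outputs; target: (d_t,) opinion-target vector.
    Returns r_s, the attention-weighted sentence representation.
    """
    k = H.shape[1]
    Va = np.tile(target[:, None], (1, k))             # repeat the target over the sequence
    Ms = np.tanh(np.vstack([Wh @ H, Wv @ Va]))        # Eq. (9), shape (d_h + d_t, k)
    att = softmax(w @ Ms)                             # Eq. (10), shape (k,)
    return H @ att                                    # Eq. (11), r_s in R^{d_h}

# Toy run with d_h = d_t = 4 and k = 3.
rng = np.random.default_rng(2)
d_h, d_t, k = 4, 4, 3
H = rng.standard_normal((d_h, k))
r_t = read_memory(rng.standard_normal((d_h, k)), softmax(rng.standard_normal(k)))
r_s = target_attention(H, rng.standard_normal(d_t),
                       rng.standard_normal((d_h, d_h)), rng.standard_normal((d_t, d_t)),
                       rng.standard_normal(d_h + d_t))
print(r_t.shape, r_s.shape)  # (4,) (4,)
```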


Methods           Laptop              Restaurant          Twitter
                  Acc      Macro-F1   Acc      Macro-F1   Acc      Macro-F1
Feature-SVM       0.7049   -          0.8016   -          0.6340   0.6330
TD-LSTM           0.6810   -          0.7560   -          0.6662\  0.6401\
ATAE-LSTM         0.6870   -          0.7720   -          -        -
IAN               0.7210   -          0.7860   -          -        -
MemNet            0.7237   -          0.8032   -          0.6850\  0.6691\
RAM               0.7449   0.7135     0.8023   0.7080     0.6936   0.6730
ANTM+Glove42B     0.7491   0.7142     0.8143   0.7120     0.7011   0.6814
ANTM+BERTB        0.7537   0.7189     0.8078   0.7154     0.7176   0.6921
ANTM+BERTL        0.7584   0.7249     0.8249   0.7210     0.7235   0.6945

Table 2: Comparison of different methods on reviews from SemEval 2014 Task 4 and Twitter. The results marked with '\' are retrieved from the RAM paper. We report accuracy (Acc) and Macro-F1 for comparison.

3.8 Sentiment Prediction

Inspired by [Rocktäschel et al., 2016], we add W_x h_N into the final sentence representation, which becomes h* = tanh(W_r r_s + W_x h_N). Then we use a softmax classifier to predict the distribution over the potential sentiment polarity labels ŷ for the concerned target. The classifier takes the representation h* as input, which can be used to make a prediction directly or fed into a softmax layer. The model is trained by minimizing the cross-entropy loss:

    loss = - Σ_{(s,a)∈S} Σ_{c∈C} p_c(s, a) · log(p̂_c(s, a)) + λ ||θ||²,    (12)

where W_s and b_s are the parameters of the softmax layer, S is the set of all training sentences, C is the collection of sentiment categories, and (s, a) is a sentence-target pair. p̂(s, a) = softmax(W_s h* + b_s) is the predicted sentiment distribution and p_c(s, a) denotes the ground truth. We use backpropagation to calculate the gradients of all the parameters and update them with stochastic gradient descent. λ is the L2 regularization weight and θ is the parameter set.
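A minimal sketch of this prediction layer and of the loss in Eq. (12) (NumPy; the projection sizes and class count are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def predict(r_s, h_N, params):
    """h* = tanh(W_r r_s + W_x h_N); class probabilities via the softmax layer."""
    h_star = np.tanh(params["Wr"] @ r_s + params["Wx"] @ h_N)
    return softmax(params["Ws"] @ h_star + params["bs"])

def loss(prob, onehot, theta, lam=1e-3):
    """Eq. (12): cross-entropy for one sentence-target pair plus the L2 penalty."""
    ce = -np.sum(onehot * np.log(prob + 1e-12))
    l2 = lam * sum(np.sum(p ** 2) for p in theta)
    return ce + l2

# Toy run: d = 4 features, 3 polarity classes (positive / neutral / negative).
rng = np.random.default_rng(3)
d, n_cls = 4, 3
params = {"Wr": rng.standard_normal((d, d)), "Wx": rng.standard_normal((d, d)),
          "Ws": rng.standard_normal((n_cls, d)), "bs": np.zeros(n_cls)}
prob = predict(rng.standard_normal(d), rng.standard_normal(d), params)
print(prob.round(3), loss(prob, np.eye(n_cls)[0], list(params.values())))
```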
4 Experiments

4.1 Datasets and Hyperparameter Settings

We conduct experiments on two datasets. The first dataset comes from SemEval 2014 Task 4 and contains customer reviews from the Laptop and Restaurant domains. The second dataset is the Tweet collection [Dong et al., 2014]. Table 1 reports the statistics. Each review provides the text, the aspect term/opinion target, and the corresponding sentiment polarity.

Dataset            Pos.   Neg.   Neu.   Tot.
Laptop-Train        994    870    464   2328
Laptop-Test         341    128    169    638
Restaurant-Train   2164    807    637   3608
Restaurant-Test     728    196    196   1120
Twitter-Train      1561   1560   3127   6248
Twitter-Test        173    173    346    692

Table 1: Statistics of the datasets. We removed sentences that express both positive and negative opinions towards the same target.

In our experiments, the dimensions of the target and word vectors are set to 300 when the content of each memory slot is given by the Glove42B word vectors [Pennington et al., 2014]. These vectors are set to 768 or 1024 dimensions when we treat the corresponding sequence output representations from BERT_Base or BERT_Large as our external memory, respectively. We train our model with an L2-regularization weight of 0.001 and an initial learning rate of 0.01. We also apply dropout of 0.5 to avoid over-fitting.

4.2 Comparison Methods

We choose the following methods as baselines.

Feature-based SVM [Kiritchenko et al., 2014] uses SVMs on n-gram features, parse features and lexicon features.

TD-LSTM [Tang et al., 2016a] extends LSTM to model the left and right target-dependent representations and then concatenates them for prediction.

ATAE-LSTM [Wang et al., 2016] is developed based on AE-LSTM. This method appends the aspect embeddings to each input embedding and hidden vector to strengthen the inter-dependence between context and target.

IAN [Ma et al., 2017] uses an interactive attention to strengthen the inter-dependence between context and target.

MemNet [Tang et al., 2016b] applies a deep memory network with three attention hops for aspect sentiment classification.

RAM [Chen et al., 2017] builds a memory module to synthesize the word sequence features. Similar to MemNet, it adds a recurrent function and employs multiple attentions on memory to predict the opinion target sentiment.

We also list the variants of the ANTM model, which are used to analyze the effects of the external memory with different word representations.

ANTM+Glove42B contains an external memory equipped with the word embeddings of Glove42B; the controller is a Bi-GRU network.

ANTM+BERTB contains an external memory stacked with the sequence outputs of BERT_Base; the controller is a Bi-GRU network.

ANTM+BERTL contains an external memory stacked with the sequence outputs of BERT_Large; the controller is a Bi-GRU network.

5 Results and Analysis

The results are shown in Table 2. We can see that ANTM+BERTL achieves the best performance among all the models. Compared with RAM, ANTM+BERTL improves the accuracy rate by about 1.35% and 2.26% on the Laptop and Restaurant categories, respectively. Among the two recurrent neural network models, ATAE-LSTM, which appends the target embedding to both the input and output representations, has an advantage in concentrating on different parts of a sentence, so it performs better than TD-LSTM. However, ATAE-LSTM still performs worse than our model, because it simply concatenates target embeddings. Although the feature-based SVM outperforms the two recurrent models TD-LSTM and ATAE-LSTM in accuracy on the Laptop and Restaurant categories, it is inferior to our model. In addition, the memory networks MemNet and RAM with multi-attention mechanisms obtain results comparable to the feature-based SVM and the augmented recurrent networks.


Figure 2: Attention visualizations on sentiment words. Deeper color implies larger attention weights. The cases contain the multiple-target-different-polarity sentence and the long-sequential-distance sentences.

MemNet regards the aspect vector as a query vector to adaptively select important evidence from memory through multiple attention layers (hops). The RAM model introduces an external memory and combines multiple attentions non-linearly, and is thus more effective than MemNet.

Our models steadily outperform the other two memory augmented networks. Particularly on the Restaurant dataset, ANTM+BERTL obtains a gain of more than 2% in accuracy compared with MemNet and RAM. Unlike MemNet and RAM, our model does not simply use the attention mechanism to match important sentiment semantic information, but computes the context of sentiment information as read vectors that are fed into each cell of the recurrent neural network at each time step. Thus, our model not only encodes the inter-dependence between the sentiment semantic context words and the target, but also extends the capabilities of neural networks.

Table 2 also shows that external memories with different word representations have subtly different effects on the final accuracy. ANTM+BERTL gets a slightly higher accuracy rate than the others on all three datasets. Moreover, ANTM+Glove42B outperforms the other baseline models, which verifies that ANTM can achieve distinguished performance. Meanwhile, ANTM with a memory of sequence outputs from the pre-trained BERT representations is clearly better than ANTM with a memory of the Glove42B word vectors in our task. ANTM+BERTL achieves accuracy scores of 0.7584, 0.8249 and 0.7235 over the three datasets and outperforms ANTM+Glove42B by 0.93%, 1.06% and 2.24%, respectively.

Figure 2 illustrates that our models have a robust ability to handle the multiple-target-different-polarity sentence. As case 1 shows, both ANTM+Glove42B and ANTM+BERTL can find the corresponding sentiment clue of the concerned opinion target. For the long-sequential-distance sentences of cases 2 and 3, ANTM+BERTL captures more abstract sentiment semantic words, with a slighter color on the attention weight of the sentiment verb "recommend" in case 2 and the semantic word "difficult" in case 3. These attention maps validate the advantage of our method in keeping track of attention weights, especially in long-sequential-distance text, by enhancing the interaction between the target and its corresponding sentiment semantic context.

Figure 3: Accuracy comparison of different methods with diverse sentence lengths on reviews from the Restaurant and Laptop datasets. (a) Laptop; (b) Restaurant.

Figure 3 shows the classification accuracy of the ANTM model with different input sentence lengths. Specifically, the highest accuracy on the Laptop dataset is achieved with a sentence length of about 35, while the Restaurant dataset reaches its top accuracy at around 30. This is mainly because different datasets present different distributions of sentence length: 92.14% of sentences are shorter than 35 words for the Laptop dataset with 2328 training instances, while 91.38% of sentences are shorter than 30 words for the Restaurant dataset with 3608 training instances. From another point of view, this result shows that our model performs stably across different sentence lengths.

6 Conclusion

In this paper, we proposed a model of Attentive Neural Turing Machines (ANTM) for aspect term/opinion target level sentiment analysis. The motivation of ANTM is to deploy an external memory that separates storage from computation and thereby extends the capabilities of neural networks. Experimental results show that ANTM performs better than the baselines, and in particular that it can better handle long-sequential-distance text. The results also show that the extended memory, when drawing support from the sequence outputs of a pre-trained BERT model, performs better on small-scale corpora. A potential future plan is to demonstrate the stability and superiority of ANTM applied to longer sequences in other sentiment analysis tasks.


Acknowledgements

Jianxin Li is the corresponding author. This work is supported by the National Natural Science Foundation of China (NSFC) (No. 61772151, No. 61872022 and No. 61421003) and SKLSDE-2018ZX16. The co-author Senzhang Wang is supported by NSFC (No. 61602237). The co-author Min He is supported by the National Key R&D Program of China (No. 2017YFB0803305). We also thank our anonymous reviewers for their constructive comments.

References

[Chen et al., 2017] Peng Chen, Zhongqian Sun, Lidong Bing, and Wei Yang. Recurrent attention network on memory for aspect sentiment analysis. In EMNLP, 2017.

[Cho et al., 2014] Kyunghyun Cho, Bart van Merrienboer, Çağlar Gülçehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In EMNLP, 2014.

[Devlin et al., 2018] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. 2018.

[Dong et al., 2014] Li Dong, Furu Wei, Chuanqi Tan, Duyu Tang, Ming Zhou, and Ke Xu. Adaptive recursive neural network for target-dependent twitter sentiment classification. In ACL (Volume 2: Short Papers), 2014.

[Fan et al., 2018] Feifan Fan, Yansong Feng, and Dongyan Zhao. Multi-grained attention network for aspect-level sentiment classification. In EMNLP, 2018.

[Graves et al., 2014] Alex Graves, Greg Wayne, and Ivo Danihelka. Neural turing machines. 2014.

[Han et al., 2013] Qi Han, Junfei Guo, and Hinrich Schuetze. CodeX: Combining an SVM classifier and character n-gram language models for sentiment analysis on twitter text. In SemEval@NAACL-HLT, pages 520-524, 2013.

[Jiang et al., 2011] Long Jiang, Mo Yu, Ming Zhou, Xiaohua Liu, and Tiejun Zhao. Target-dependent twitter sentiment classification. In ACL, 2011.

[Kiritchenko et al., 2014] Svetlana Kiritchenko, Xiaodan Zhu, Colin Cherry, and Saif Mohammad. NRC-Canada-2014: Detecting aspects and sentiment in customer reviews. In SemEval 2014, 2014.

[Liu and Perez, 2017] Fei Liu and Julien Perez. Gated end-to-end memory networks. In ACL, 2017.

[Liu et al., 2018] Qiao Liu, Haibin Zhang, Yifu Zeng, Ziqi Huang, and Zufeng Wu. Content attention model for aspect based sentiment analysis. In WWW, 2018.

[Ma et al., 2017] Dehong Ma, Sujian Li, Xiaodong Zhang, and Houfeng Wang. Interactive attention networks for aspect-level sentiment classification. In AAAI, 2017.

[Nguyen and Shirai, 2015] Thien Hai Nguyen and Kiyoaki Shirai. PhraseRNN: Phrase recursive neural network for aspect-based sentiment analysis. In EMNLP, 2015.

[Peng et al., 2018] Hao Peng, Jianxin Li, Yu He, Yaopeng Liu, Mengjiao Bao, Lihong Wang, Yangqiu Song, and Qiang Yang. Large-scale hierarchical text classification with recursively regularized deep graph-cnn. In WWW, 2018.

[Pennington et al., 2014] Jeffrey Pennington, Richard Socher, and Christopher Manning. GloVe: Global vectors for word representation. In EMNLP, 2014.

[Peters et al., 2018] Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. Deep contextualized word representations. In NAACL, 2018.

[Pontiki et al., 2014] Maria Pontiki, Dimitris Galanis, John Pavlopoulos, Harris Papageorgiou, Ion Androutsopoulos, and Suresh Manandhar. SemEval-2014 task 4: Aspect based sentiment analysis. SemEval-2014, 2014.

[Radford et al., 2018] Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. Improving language understanding by generative pre-training. Technical report, OpenAI, 2018.

[Rocktäschel et al., 2016] Tim Rocktäschel, Edward Grefenstette, Karl Moritz Hermann, Tomáš Kočiský, and Phil Blunsom. Reasoning about entailment with neural attention. In ICLR, 2016.

[Sukhbaatar et al., 2015] Sainbayar Sukhbaatar, Jason Weston, Rob Fergus, et al. End-to-end memory networks. In Advances in Neural Information Processing Systems, 2015.

[Sun et al., 2015] Yaming Sun, Lei Lin, Duyu Tang, Nan Yang, Zhenzhou Ji, and Xiaolong Wang. Modeling mention, context and entity with neural networks for entity disambiguation. In IJCAI, 2015.

[Tang et al., 2016a] Duyu Tang, Bing Qin, Xiaocheng Feng, and Ting Liu. Effective LSTMs for target-dependent sentiment classification. In COLING, 2016.

[Tang et al., 2016b] Duyu Tang, Bing Qin, and Ting Liu. Aspect level sentiment classification with deep memory network. In EMNLP, 2016.

[Wagner et al., 2014] Joachim Wagner, Piyush Arora, Santiago Cortes, Utsab Barman, Dasha Bogdanova, Jennifer Foster, and Lamia Tounsi. DCU: Aspect-based polarity classification for SemEval task 4. In SemEval 2014, 2014.

[Wang et al., 2014] Senzhang Wang, Xia Hu, Philip S. Yu, and Zhoujun Li. MMRate: Inferring multi-aspect diffusion networks with multi-pattern cascades. In KDD, 2014.

[Wang et al., 2016] Yequan Wang, Minlie Huang, Li Zhao, et al. Attention-based LSTM for aspect-level sentiment classification. In EMNLP, 2016.

[Yang et al., 2018a] Jun Yang, Runqi Yang, Chongjun Wang, and Junyuan Xie. Multi-entity aspect-based sentiment analysis with context, entity and aspect memory. In AAAI, 2018.

[Yang et al., 2018b] Jun Yang, Runqi Yang, Chongjun Wang, and Junyuan Xie. Multi-entity aspect-based sentiment analysis with context, entity and aspect memory. In AAAI, 2018.
