
The Workshops of the Thirty-Second AAAI Conference on Artificial Intelligence

Automatic Extraction of Domain Specific Latent Beliefs in Customer Complaints to Help Tailor a Chatbot's Conversation

Amit Sangroya, C. Anantaram, Pratik Saini, Mrinal Rawat TCS Innovation Labs, Tata Consultancy Services Limited, ASF Insignia, Gwal Pahari, Gurgaon, India (amit.sangroya, c.anantaram, pratik.saini, rawat.mrinal)@tcs.com

Abstract

Understanding a customer's personal opinion is extremely important to initiate and maintain a meaningful conversation. In this paper, we propose an approach to extract latent emotional beliefs of customers and use them to tailor a chatbot's conversation. We present a machine-learning based mechanism to process customer complaints and extract sentiments such as the customer being sad, happy, or upset. Further, we also train a model that extracts more fine-grained sentiments, such as the customer being irritated or harassed, in the context of a particular complaint scenario. This information helps to tailor the dialog according to the customer's emotional state and hence improves the overall effectiveness of the dialog system.

Introduction

During a dialog with a customer to address his/her complaint, the chatbot may pose questions or observations based on its underlying model. Sometimes the questions or observations posed may not be relevant given the nature of the complaint and the current cognitive beliefs that the customer holds. For example, if a chatbot fails to understand the customer's emotional situation and responds mechanically to an irritated customer, then the chatbot may fail to achieve its primary objective, e.g. to address the customer's problem in such a manner that the customer feels positive (satisfied) at the end of the conversation.

Traditional machine learning approaches train a system with an extremely large dialog corpus that covers a variety of scenarios. Another approach is to build a system with a complex set of hand-crafted rules that may address some specific instances. Both approaches may be impractical in many real-world domains. In this paper, we propose a methodology that uses a combination of machine-learning mechanisms and domain specific knowledge to understand the severity of a customer's complaint. This helps to understand the customer's latent emotional beliefs while giving the complaint. Our model then evaluates the beliefs to tailor the dialog and make it consistent with the set of beliefs of the customer. This process then helps drive the conversation in a meaningful way.

We make the following key contributions in this paper.

• First, we present a novel approach to evaluate a customer's personal opinion in the context of a particular domain. Here, we measure the severity of the complaint and accordingly update the customer's opinion.

• Secondly, we propose an algorithm that makes use of RNNs to classify fine-grained opinions using a combination of domain knowledge and information extraction.

• Lastly, we use the fine-grained, domain oriented opinion information in tailoring the dialog of the conversational system.

Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Domain Specific Latent Belief Extraction

Customer complaints expressed in natural language form can be quite complex. For example, in an automobile domain, an irritated car customer might describe a specific problem as: "my car just died on me. No warning no check engine. Car just out of extended warranty all maintenance up to date. Had issues with charcoal canister, and shift lever. engine croaked without overheating, no warning, no check engine."

Therefore, we first take a complaint and try to categorize it into a possible category C such as "engine failure" or "transmission". To do this, we build our first machine learning model, ML1model, that gives output O1 and helps us to focus on a specific part of a domain; e.g. for the "engine failure" category, the chatbot would primarily focus on "engine" related issues. In order to train model ML1model, we take labeled data with a set of customer complaints and their categories.

Thereafter, we extract information and opinions through a combination of tools, which leads to output O2. Examples of O2 are opinions like { StrongNegative, Negative, Positive } and information such as "engine_dead". Now, applying domain rules over the extracted information and opinions, we get the next output O3. For example: { if category C = engine failure and negatives > some number X, then customer = irritated }. Finally, using O3, O2 and ML1model, we machine-learn a model for latent belief estimation, which we call ML2model.

Figure 1 illustrates the proposed methodology for extracting a customer's opinions. Its inputs include the customer complaints and a domain ontology. We assume an automobile complaints domain and that we have already categorized the complaints into categories like Transmission, Gear, Windows-Windshield, Engine-failure, etc. (Anantaram and Sangroya 2017).
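The O1/O2/O3 flow above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name and the threshold default are hypothetical, and the single rule mirrors the paper's own example of a domain rule over a category and extracted opinions.

```python
# Minimal sketch of applying a domain rule (producing O3) over the category
# output O1 of ML1model and the extracted opinions O2. Function name and
# threshold are hypothetical; the rule follows the paper's example.

def apply_domain_rule(category, opinions, threshold=2):
    """If a critical category has many negative opinions, infer an irritated customer."""
    negatives = sum(1 for o in opinions if o in ("Negative", "StrongNegative"))
    if category == "engine failure" and negatives > threshold:
        return "irritated"
    return "neutral"

# O1 from ML1model, O2 from opinion/information extraction
o3 = apply_domain_rule("engine failure",
                       ["StrongNegative", "Negative", "Negative", "Positive"])
print(o3)  # "irritated"
```

The resulting O3 labels, together with O2 and the ML1model category, would then form the training signal for ML2model.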

Figure 1: Domain Specific Latent Belief Extraction Process
Figure 2: An example of Car Ontology

Step 1: Extracting Customer's Opinions

We start by finding positive or negative opinions in the customer complaints. Most of the complaints have a large number of negative opinions. However, sometimes customers include positive opinions as well. Therefore, it becomes challenging for an automatic opinion extraction system to judiciously extract the actual overall opinion of the user. For this purpose, one can also use available tools such as OpinionFinder (OF 2005).

Step 2: Updating Opinion Weights

Our mechanism evaluates the context in which opinions have been expressed and adjusts the opinion weights accordingly. Initially, all the complaints are assigned equal weights. Then, following a three-step approach, our system updates the opinion weights: 1) after categorizing the complaints; 2) using knowledge from the domain ontology; and 3) using information extraction. This is explained as follows.

Step 2a: Using Complaint Category — If a complaint belongs to a more critical category such as engine failure or transmission, it is assigned a higher weight than one in a category such as body paint. Intuitively, a customer with a more critical problem will feel more harassed than one with a smaller problem. The complaint categorization is also done automatically with the help of machine learning.

Step 2b: Using Domain Ontology — We make use of knowledge mining derived from the domain ontology to update the opinion weights. As shown in Figure 2, an ontology consists of a large knowledge graph expressing the information about a domain, for example, the terms and their relationships in a particular context. Using an automated nearest-neighborhood approach, we extract the severity of a problem in a particular context. For example, nodes that are closer to sensitive nodes are also considered sensitive and hence lead to a positive update in opinions, whereas it is the opposite in the other case. Our domain ontology consists of a large RDF graph, and we use optimized techniques for faster nearest-neighborhood based semantic analysis.

Step 2c: Using Information Extraction and Rules — Another component of our system is based upon information extraction using latent beliefs analysis. In some specific situations, customers may express an opinion for a particular product or service. To handle such situations, we use information extraction techniques and rules to understand the context in which a particular opinion is expressed. For example, a customer may say he/she visited the garage three times vs. he/she had an engine failure three times.

Once we have the fine-grained opinions, we use them to tailor the chatbot, as demonstrated in the next section.

Learning the Model for Latent Belief Extraction

We now train an LSTM (Long Short Term Memory) network to build a model that can automatically categorize the complaints based upon the information described in the previous section (see Algorithm 1). As in many other studies of LSTMs on text, words are first converted to low-dimensional dense word vectors via an embedding layer. The first layer is therefore the embedding layer, which uses 32-length vectors to represent each word. The next layer is the LSTM layer with 100 units. Finally, we use a dense output layer with 5 neurons (5 classes/labels) and a softmax activation function to make the predictions. We used categorical cross-entropy as the loss function (in Keras) along with the ADAM optimizer. For regularization, we employ dropout to prevent co-adaptation. We run the experiment for 20 epochs with a batch size of 64.

In our experiments we consider complaints about car faults from http://www.carcomplaints.com. We consider complaints across six categories: Transmission Problems, Gear Problems, Windows-Windshield Problems, Engine Failure Problems, Wheels-Hubs Problems and AC-Heater Problems. The total number of complaints after data processing was 13,797 (Figure 3).
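The architecture described above can be sketched in Keras as follows. This is a minimal sketch: the vocabulary size (3000), sequence length (200), layer sizes, loss, and optimizer are taken from the text, while the dropout rate of 0.5 is an assumption, since the paper does not specify it.

```python
# Minimal Keras sketch of the described LSTM classifier.
# Vocabulary size, sequence length, and layer sizes follow the paper;
# the dropout rate (0.5) is assumed, as the paper does not state it.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(200,)),             # padded/truncated complaint, max length 200
    layers.Embedding(3000, 32),            # 32-length vector per vocabulary word
    layers.LSTM(100),                      # LSTM layer with 100 units
    layers.Dropout(0.5),                   # dropout for regularization (rate assumed)
    layers.Dense(5, activation="softmax")  # 5 classes/labels
])
model.compile(loss="categorical_crossentropy", optimizer="adam",
              metrics=["accuracy"])

# Training as described (data loading omitted):
# model.fit(X_train, y_train, epochs=20, batch_size=64)
```

The commented-out `fit` call shows the stated training regime of 20 epochs with batch size 64; `X_train` and `y_train` stand in for the vectorized complaints and one-hot category labels.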

The clean-up process involves converting text to lower case, tokenizing the sentences, and removing punctuation and stopwords. Each input complaint is then converted into a vector form. We identify the top 3000 unique words, and every word in this vocabulary is given an index. If a word is not present in the vocabulary, we encode it as 0. For example, for the sentence "I am very disappointed today", the vector representation is [10, 100, 23, 467, 0]. Next, we truncate and pad the input sequences so that they are all the same length. We take the maximum length of a complaint to be 200. We take both positive and negative opinions as input features. We divide the data into a training set (75%) and a test set (25%). We ensure that we do not have a data sparsity issue, i.e. we keep an approximately equal proportion of data for each class.

Algorithm 1: Algorithm for Latent Belief Extraction
Require: Complaints dataset T
Ensure: Complaints and their opinion categories
for all reviews r in T do
    Remove stopwords and punctuation
    Convert to lowercase
    Tokenize
    Mark special named-entities
    for all sentences m in r do
        Extract customer opinions (neg, strongneg, etc.)
        Using the complaint category (engine, transmission, accessories, etc.), update opinion weight w
        if Category = Critical then
            w++
        end if
        Using the domain ontology and the nearest-neighborhood approach, update w
        Using information extraction, update w
    end for
end for

Experimental Results

As expected, the customers' complaints consisted primarily of negative sentiments (see Figure 3). The intention was to find out the level of customer dissatisfaction and use this information to tailor the chatbot. We obtained 87.8% classification accuracy using the opinion extraction based upon latent belief extraction. Figure 4 shows training and test accuracy. We can observe that with fewer than 20 epochs, the accuracy reaches close to 90%.

Figure 3: Number of Complaints in each Category (Opinions)
Figure 4: Training and Test Accuracy

Example: Tailoring the Chatbot

We parse a complaint description with dependency parsers (such as Stanford-CoreNLP, GATE, MITIE, etc.) and extract triples from the description by focusing on the dependencies identified among nouns and verbs. For example, for the description "my car just died on me", triples such as (my-car, just-died-on, me) are extracted. Once the triples are extracted, we use a hand-crafted fact-assertion rulebase to assert the facts implied by the triples. This is done by evaluating the triples in the context of a car ontology, a synonym-and-slang dictionary, and information-extraction patterns relevant to the category of the complaint, and by triggering the fact-assertion rules. The opinion analysis process helps to understand the customer's emotional beliefs and tailor the conversations accordingly.

We assume that we have a hand-crafted complaint-management finite-state machine (FSM) to carry out the dialog with the customer. The FSM operates on slots that are filled by extracting information from the complaint and subsequent interaction. The probable beliefs of the customer that were asserted as facts, together with the category of the complaint, are then evaluated by the epistemic rules encoded in a knowledge base for the domain. The rules make assertions about the states in the FSM that need to be skipped and the states that need to be evaluated in order to be consistent with the beliefs of the customer. The subsequent dialog is carried out, and the next set of beliefs is then asserted. The cycle then continues. As shown in Table 1, the chatbot is able to skip some FSM states as a result of the latent beliefs that were derived from information extraction (including opinions) and initial processing of the customer complaint.

The beliefs and epistemic rules helped tailor the dialog to the customer's expectations. In this work, we demonstrate the overall architecture, in which we use RNN-based classification of customer complaints as an input to the epistemic rule engine. Our approach is generic and can be applied easily in other domains, such as complaints about hardware/software issues.
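The belief-driven state skipping can be sketched as follows. This is a minimal illustration, not the authors' implementation: the "greeting" and "resolution" states and the single skip rule are hypothetical, while the skipped and asked state names match those in the Table 1 dialog.

```python
# Minimal sketch of epistemic rules that tailor a complaint-management FSM
# by skipping states based on the asserted latent belief. The rule and the
# "greeting"/"resolution" states are hypothetical; the other state names
# come from the Table 1 example.

FSM_STATES = [
    "greeting",                       # hypothetical state
    "general-chit-chat",              # skipped in Table 1
    "car-movement-experience",        # skipped in Table 1
    "advance-maintainence-questions", # asked in Table 1
    "resolution",                     # hypothetical state
]

# Hypothetical epistemic rule base: (category, latent belief) -> states to skip.
SKIP_RULES = {
    ("Transmission", "Customer(Angry)"): {"general-chit-chat",
                                          "car-movement-experience"},
}

def tailor_dialog(category, latent_belief):
    """Return the FSM states the chatbot should actually visit."""
    skipped = SKIP_RULES.get((category, latent_belief), set())
    return [s for s in FSM_STATES if s not in skipped]

print(tailor_dialog("Transmission", "Customer(Angry)"))
# ['greeting', 'advance-maintainence-questions', 'resolution']
```

An angry transmission-problem customer thus bypasses the chit-chat states and goes straight to the advanced maintenance questions, matching the behavior shown in Table 1; any other (category, belief) pair falls through to the full state sequence.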

Table 1: Sample Output of Dialog System

Customer: "The gears were slipping when I drove and the car jolted suddenly as it went in and out of gear. I immediately took it to the dealer only to have them flush the transmission. Later the transmission was 'fixed' by placing an oil jet kit in the car, which has done absolutely nothing. Now even the transmission won't go into 3rd gear."
ML1model Output — Problem: Transmission
ML2model Output — Latent Belief: Customer(Angry)
Bot (skipstate: general-chit-chat); (skipstate: car-movement-experience); (askstate: advance-maintainence-questions): "Okay. Any burning smell coming from your car?"
Customer: "Yes, there is a burning smell."
Bot (askstate: advance-maintainence-questions): "Have you ever got the car clutches checked?"
Customer: "Not yet."
Bot: "Oh okay. Well then it could probably be a clutch issue. You need to get the clutches checked."

Related Work

There are recent motivating examples of works that make use of machine learning to build intelligent dialog systems. Traditional dialog systems are specialized for a domain and rely on slot-filling driven by a knowledge base and a finite-state model (Lemon et al.). The finite-state model represents the dialog structure in the form of a state transition network in which the nodes represent the system's interactions and the transitions between the nodes determine all the possible paths through the network.

Deep learning based dialog systems (Miller et al. 2016) use memory networks to learn the underlying dialog structure and carry out goal-oriented dialog. However, they do not factor in beliefs or trigger epistemic rules to modify the conversation given the evolving context. In (Williams, Raux, and Henderson 2016), Williams et al. describe the dialog state tracking challenge and mention how the "task of correctly inferring the state of the conversation - such as the user's goal - given all of the dialog history up to that turn" is important. It is in this overall context that we propose it is important to evaluate the probable beliefs held by the human and tailor the dialog system suitably to be consistent with those beliefs, in order to hold a relevant conversation.

Uckelman (Sara 2010) describes how, in a formal dialog system, dynamic epistemic logic can be used in an Obligatio, where two agents, an Opponent and a Respondent, engage in an alternating-move dialog to establish the consistency of a proposition. Sadek et al. (Sadek, Bretier, and Panaget 1997) propose a reasoning engine to build effective and generic communicating agents. Motivated by the above works, we propose to identify the beliefs, use these beliefs to trigger epistemic rules, and use the assertions of the rules to drive the conversation by tailoring the states in a finite-state machine dialog system.

Opinion mining methods have been in use for a long time. Recently, these methods have been applied to dialog systems. Roy et al. (Roy et al. 2016) propose a novel approach that considers customer satisfaction to tailor the dialog. While general sentiment analysis methods are useful to understand a customer's mental situation, they may need to be complemented with more domain specific information, leading to richer, fine-grained classes of sentiments. For this reason, it is necessary to develop methods which take domain specific sentiment information into account.

Conclusion

This paper presents an approach to contextualize the dialog by identifying latent beliefs in a customer's complaint. Using our methodology, a chatbot is able to tailor its dialog by factoring in the customer's latent beliefs and hence handle complaints efficiently. Our experimental results have been promising. The chatbot is able to have one FSM and tailor it appropriately for the customer's situation at hand. This leads to more relevant dialog for the customer.

References

Anantaram, C., and Sangroya, A. 2017. Contextualizing Customer Complaints by Identifying Latent Beliefs and Tailoring a Chatbot's Dialog through Epistemic Reasoning. In Ninth International Workshop on Modelling and Reasoning in Context at IJCAI 2017.

Lemon, O.; Liu, X.; Shapiro, D.; and Tollander, C. Hierarchical Reinforcement Learning of Dialogue Policies in a Development Environment for Dialogue Systems: REALL-DUDE. In BRANDIAL'06, 185-186.

Miller, A.; Fisch, A.; Dodge, J.; Karimi, A.-H.; Bordes, A.; and Weston, J. 2016. Key-Value Memory Networks for Directly Reading Documents. arXiv preprint arXiv:1606.03126.

2005. OpinionFinder System. http://mpqa.cs.pitt.edu/opinionfinder. [Online; accessed 19-July-2017].

Roy, S.; Mariappan, R.; Dandapat, S.; Srivastava, S.; Galhotra, S.; and Peddamuthu, B. 2016. QART: A System for Real-Time Holistic Quality Assurance for Contact Center Dialogues. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 3768-3775.

Sadek, M. D.; Bretier, P.; and Panaget, F. 1997. ARTIMIS: Natural Dialogue Meets Rational Agency. In Proceedings of the Fifteenth IJCAI - Volume 2, 1030-1035.

Sara, U. 2010. Obligationes as Formal Dialogue Systems. In STAIRS 2010: Proceedings of the Fifth Starting AI Researchers' Symposium, volume 222, 341. IOS Press.

Williams, J.; Raux, A.; and Henderson, M. 2016. The Dialog State Tracking Challenge Series: A Review. Dialogue & Discourse.
