Running head: CHATBOT WITH INTERPERSONAL COMMUNICATION RECOGNITION

A chatbot with interpersonal communication recognition: determining the position on Leary's Rose through automatic text analysis

Name: Gerard Johan Visser

Student number: S1068008

Date: 20-08-2018

Supervisor: Dr. P. Haazebroek

Second reader: Dr. R. E. de Kleijn

Abstract

The research area of chatbots is relatively young, and the goal of this study is therefore to gather more information about text categorization by chatbots. Chatbots are increasingly used in online communication with users, and it is a challenge to let them respond appropriately on an emotional level, in such a way that users experience the answers as a positive interaction. The current study examined the possibility of mapping this interaction using text analysis with the LIWC and classification on Leary's Rose. Based on Leary's Rose, we predicted the positive experience with the Net Promoter Score (NPS). The study consists of three phases. The first phase is a text analysis to scale sentences on Leary's Rose. The sentences were scaled by 102 participants on two scales (the "I & We" scale and the "Dominance & Submissive" scale). With these scaled sentences, a classifier (classifier A) was created and trained with the LIWC and a regression analysis. The results of phase one suggest that our database contains mostly "Dominance/I" and "Submissive/We" sentences. Classifier A (80.8%) performed 3% better than the random baseline (77.8%). Classifier A was tested in phase two with self-annotated sentences, obtained from 15 participants on two scenarios. Based on these self-annotated sentences, we also created two new classifiers (B1 & B2). The test accuracy of classifier A (59.0%) equaled the random baseline (59.0%). The two new classifiers created in phase two performed better (B1: 72.3%, B2: 76.6%) than the random baseline (59.1%). In phase three we tried to predict NPS from positions on Leary's Rose, comparing the classifiers with Kendall's tau-b correlations and crosstabs. The findings suggest that it is possible to predict NPS based on Leary's Rose. A possible implication is the need for a multimodal approach to text analysis. Future research should focus on better annotation methods, to prevent skewed, small and noisy databases.
More implications and suggestions are presented in the discussion.

Keywords: emotion detection, emotion classification, text analysis, Leary's Rose, interpersonal communication, chatbot, Net Promoter Score (NPS)

A chatbot with interpersonal communication recognition: determining the position on Leary's Rose through automatic text analysis

In recent years, more and more chatbots have become available in different areas. Via chatbot software, a human is able to interact with a computer in natural language. This software can support daily life; an example is a helpdesk chatbot (Rahman, 2012; Shawar, Atwell & Roberts, 2005) that is able to answer questions from customers: a customer who would like information about a problem with a product asks the chatbot, and the chatbot answers the question. Chatbots are also used in areas such as educational tools (Keshtkar, Burkett, Li & Graesser, 2014; Vaassen & Daelemans, 2010) and e-commerce and business (Chattaraman, Kwon & Gilbert, 2012). Because chatbots are used more and more, improvements should be made, and one of the challenges is to detect the user's emotion in a chatbot conversation. To return to our helpdesk example: imagine a helpdesk chatbot that can detect someone's emotion. Such a chatbot is able to change its type of communication based on the emotion of the user, in a way that makes him or her feel more understood. It is also able to recognize when the conversation goes sideways and transfer the customer to a real human being. To reach this goal, more research is needed. This study focuses on emotion classification of customer conversations with a helpdesk chatbot of a large online retailer.

Chatbots

A chatbot is a computer program designed to communicate with human users via natural language. The chatbot recognizes words or groups of words and, based on this input, gives answers. This type of chatbot has certain benefits. First of all, a chatbot is always present and handles real-time events 24 hours a day.
In addition, communication with a chatbot is a dialog, which is more effective than a monolog when dealing with humans (Tatai, Csordás, Kiss, Szaló & Laufer, 2003). Furthermore, a chatbot combines large amounts of information and only shows the information the user asks for. Finally, a chatbot can handle many cases simultaneously, which is cost-saving for the company because fewer employees are needed to answer the users' questions. However, at this moment there are challenges to deal with, due to the complexity of human language and emotions. Computers have difficulty understanding the endless variability of expression in how words are used to communicate meaning (Hill, Ford & Farreras, 2015). Creating a computer program that is capable of interacting with a person at a human level requires the machine to understand human behavior. One of the most important aspects of a conversation is expressing and understanding emotions and affects (Picard, 1997; Salovey & Mayer, 1990). The possibility to test humanized machines was proposed by Alan Turing in his "Turing Test" (Turing, 1950). This test is based on a conversation between a computer and a human judge: the computer program tries to impersonate a human, and the judge has to distinguish between the computer and a human being. One of the first chatbots subjected to the Turing Test was ELIZA, created at the Massachusetts Institute of Technology by Weizenbaum. ELIZA is a chatbot that emulates a psychotherapist (Weizenbaum, 1966). After ELIZA, many other chatbots were created for different purposes. Still, none of these chatbots has passed the Turing Test (Saygin, Cicekli & Akman, 2000; Warwick & Shah, 2016).

Emotion and classifying emotions

As described above, emotion is an important factor in humanizing computers.
Classifying customers' emotions is important for companies because emotion has an effect on customer loyalty and satisfaction (DeWitt, Nguyen & Marshall, 2008; Varela-Neira, Vázquez-Casielles & Iglesias-Argüelles, 2008; Yu, White & Xu, 2007). To measure customer satisfaction and loyalty, a company can use the Net Promoter Score (NPS). Shaw (2016) introduced a new indicator for emotional value, the Net Emotion Value (NEV), which measures the emotional value towards a company. His work shows that the higher the NEV (positive emotion), the higher the NPS; emotion thus has an effect on the NPS. Extracting emotion from text in a chat environment is different from extracting emotion from face-to-face interactions between humans. Emotion extraction from text lacks facial expressions, intonation of voice and body language (Vaassen, 2014), which complicates the task. Another difference lies in the communication styles of human-chatbot versus human-human chat conversations. Users tend to be more agreeable, open, extrovert, conscientious and self-disclosing when interacting with a human. When humans consciously interact with a chatbot, they report lower perceived attractiveness, are less goal-driven and use more brutal language than in chats with humans (Mou & Xu, 2017). Hill et al. (2015) found differences between human-human and human-chatbot communication: more messages, shorter message lengths, a more limited vocabulary and greater use of profanity. These differences in interaction should be taken into account when classifying emotion in chatbot texts.

Interpersonal interaction

To achieve emotion recognition in a chat conversation, a real-time automatic emotion analysis is needed.
In 2010, Vaassen and Daelemans introduced the automatic classification of text according to a framework for interpersonal communication (Vaassen & Daelemans, 2010; Vaassen & Daelemans, 2011; Vaassen, Wauters, van Broeckhoven, van Overveldt, Daelemans & Eneman, 2012; Vaassen, 2014). This approach focuses not only on emotion classification but also on the interaction between a human and a chatbot, which is very helpful. Several frameworks for interpersonal communication have been developed over the past years (Gurtman, 2009). The first interpersonal communication model was created by the Kaiser Research Group under the name "interpersonal circle", better known as "Leary's Rose" (Leary, 1957). This framework distinguishes two roles, a speaker and a listener, which alternate during the conversation: when someone speaks, he is the speaker, and when someone listens, he is the listener. The graphical representation of Leary's Rose is a circle, split horizontally into a "Dominance" (upper) and a "Submission" (lower) half and vertically into an "I" and a "We" side. The vertical axis determines whether the speaker is dominant or submissive towards the listener; the horizontal axis determines the speaker's willingness to cooperate. These partitions create four quadrants: "Lead", "Follow", "Defend" and "Attack." Each quadrant can again be divided into two octants, creating eight octants in total (Figure 1).

Figure 1. The partition of Leary’s Rose.


One particular characteristic makes the interpersonal circumplex interesting for interpersonal communication: the circumplex can predict, to some extent, what the position of the listener will be when he reacts to the speaker (Figure 2) (Dijk, 2013; Dijk & Cremers, 2007; Leary, 1957; Remmerswaal, 2011). "Dominance" will trigger a complementary response, namely "Submission", and vice versa. "We" or "I" behavior will trigger a similar response: "We" behavior triggers "We" behavior, and "I" behavior triggers "I" behavior. The speaker can thus influence the behavior and emotions of the listener through his own conversational actions (Dijk, 2013; Dijk & Cremers, 2007; Leary, 1957; Remmerswaal, 2011). For example, if a colleague is angry at you, he will attack you from the "Dominance/I" quadrant and you will defend yourself from the "Submissive/I" quadrant. In this example, "Dominance" triggers a complementary response ("Submissive") and "I" behavior triggers "I" behavior.

Figure 2. Response prediction according to Leary's Rose. Dominance will trigger a submissive response; I behavior triggers I behavior and We behavior triggers We behavior.
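The response-prediction rule described above can be expressed as a small mapping. The following Python sketch is our own illustration (the quadrant encoding and function name are not from the original study); quadrant labels follow the "axis/side" notation used in the text:

```python
# The dominance axis invites a complementary response, while the
# I/We axis invites a similar one (Leary, 1957).
COMPLEMENT = {"Dominance": "Submissive", "Submissive": "Dominance"}

def predicted_response(speaker_quadrant: str) -> str:
    axis, side = speaker_quadrant.split("/")   # e.g. "Dominance/I"
    # Complement the dominance axis, keep the I/We side unchanged.
    return f"{COMPLEMENT[axis]}/{side}"

print(predicted_response("Dominance/I"))   # an angry attack invites "Submissive/I"
```

In the colleague example from the text, an attack from "Dominance/I" indeed maps to a defence from "Submissive/I".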

Automatic detection of interpersonal communication

Leary's Rose was used by Vaassen and Daelemans (2010; 2011) in a serious gaming project named "deLearyous." The deLearyous project aimed at developing a game in which users can improve their communication skills by interacting with a virtual character. In order to apply this framework, they gathered data from a series of experiments in which the virtual agent was replaced with a human actor (a Wizard of Oz setting) (Vaassen & Daelemans, 2010; Vaassen, 2014). After gathering the data, they transcribed, analyzed and annotated it. Vaassen and Daelemans used several machine learning algorithms to reach 52.5% correctly classified sentences over the four quadrants "Lead", "Follow", "Defend" and "Attack." This is higher than the random baseline of 25.15% (slightly above 25% because of imbalances in the class distribution) and is a significant improvement (Vaassen & Daelemans, 2010). In a subsequent study they used another classifier and reached F-scores (a measure of classifier performance) of up to 51% for the four quadrants and 31% for the eight octants, again a significant improvement over the random baseline (25.4% for the four quadrants, 13.1% for the eight octants). In conclusion, it is possible to beat the random baseline (Vaassen, 2014; Vaassen & Daelemans, 2010; Vaassen & Daelemans, 2011; Vaassen et al., 2012), but Vaassen and Daelemans state that it is extremely difficult to reach an acceptable performance for practical use. The main problem is that human annotators experience difficulty scaling sentences on Leary's Rose and do not always agree on the correct quadrant or octant for a sentence. Other problems are the small size of the corpora and noisy datasets, both due to this annotation problem (Vaassen & Daelemans, 2010; Vaassen & Daelemans, 2011; Vaassen et al., 2012; Vaassen, 2014). Another study on the automatic detection of interpersonal communication is by Keshtkar et al. (2014), who used Leary's Rose to detect the personality of players in the Land Science game (an educational game). They also used machine learning algorithms and concluded that text categorization based on n-grams reached the highest scores, but that combining methods such as the Linguistic Inquiry and Word Count (LIWC) and subjective lexicons with n-gram features can achieve better performance. Keshtkar et al. (2014) also report disagreement among their human annotators; in this case the bottleneck is again the categorization by human annotators.
Linguistic Inquiry and Word Count (LIWC)

Keshtkar et al. (2014) used the LIWC to divide words into psychologically meaningful categories. The program first counts the words in a sentence or text and then divides each category count by the total word count, presenting the percentage of words in each category. The Dutch version of the 2007 LIWC contains 11,091 words in 66 categories and gives results equivalent to the English LIWC (Boot, Zijlstra & Geenen, 2017).

Current study

The goal of this study is to scale relatively short text input, in a chatbot environment, on Leary's Rose. Furthermore, in phase two a new annotation process, based on self-annotation, is examined. In phase three we try to find a correlation between NPS and Leary's Rose and examine whether we can predict NPS based on Leary's Rose. The study consists of three phases (Figure 3). The first phase is a text analysis in which participants annotate sentences from a chat conversation on Leary's Rose. These annotated sentences are the ground truth for the text categorization by classifier A. The classifier was obtained by a logistic regression on the sentences with the participant scores. The first hypothesis: we expect to find a higher overall accuracy than the baseline. This expectation arises from the training on the dataset, which should improve the classifier in such a way that it categorizes sentences better than the baseline.

Figure 3. A systematic flowchart of the phases in this study.
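The LIWC scoring described above (count the words that fall into each category, then divide by the total word count) can be sketched as follows. This is a minimal Python illustration with a made-up three-category dictionary; it is not the real Dutch LIWC 2007 dictionary, and the names are ours:

```python
# Illustrative mini-dictionary; the real Dutch LIWC 2007 has
# 11,091 words in 66 categories (Boot, Zijlstra & Geenen, 2017).
LIWC_DICT = {
    "posemo": {"good", "happy", "great", "thanks"},
    "negemo": {"stupid", "bad", "angry"},
    "we": {"we", "us", "our"},
}

def liwc_percentages(sentence: str) -> dict:
    """Return, per category, the percentage of words in the sentence."""
    words = sentence.lower().split()
    total = len(words)
    scores = {}
    for category, vocab in LIWC_DICT.items():
        hits = sum(1 for w in words if w in vocab)
        scores[category] = 100.0 * hits / total if total else 0.0
    return scores

print(liwc_percentages("you are stupid"))
```

For the example sentence "You are stupid" used later in the Analysis section, one of three words is in the "negemo" category, so that category scores roughly 33 percent.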

The second phase focuses on testing the classifier from phase one (classifier A). Based on scenarios, another group of participants chatted with the chatbot and, after the conversation, annotated their own typed sentences. The sentences annotated by the participants themselves serve as the ground truth. The sentence position on Leary's Rose obtained from classifier A is compared with this self-annotated ground-truth position. The second hypothesis: as in phase one, we expect the overall accuracy to be higher than the baseline, because the training on the dataset should improve the classifier in such a way that it categorizes sentences better than the baseline. In the second phase we also create a new classifier (classifier B), through a logistic regression on the sentences self-annotated by the participants in phase two. In phase three we compare classifier A with classifier B. In the last phase, phase three, we predict the NPS from the scores on Leary's Rose, to measure the effectiveness of the classifiers and to determine whether there is a correlation between the mean scores on Leary's Rose and the NPS per chat. The third hypothesis: we expect to find a correlation between a high NPS and "Submissive/We" scores on Leary's Rose, and between a low NPS and "Dominance/I" scores. This expectation is based on the findings of Shaw (2016): the higher the NEV (positive emotion), the higher the NPS. As described above, we compare the overall accuracy of classifier A with the overall accuracy of classifier B. This overall accuracy score is obtained from correctly scored NPS sentences. The fourth hypothesis: we expect a larger improvement in overall accuracy for classifier B from phase two than for classifier A from phase one.
This expectation is based on the fact that classifier B from phase two is obtained from sentences that were typed and annotated by the same participant. To keep the structure well-ordered and as simple as possible, the methods, results and a short discussion of each of the three phases are presented separately in the next sections. In the final section we present a general discussion and offer some ideas for future research.

Phase one

Methods

Design. The first phase was a text analysis in which participants scored sentences on Leary's Rose. Every participant received 11 random opening sentences and 6 random dialogs to score. The participant scores determined a sentence's place on Leary's Rose. The scores were obtained from a combination of the I (against) and We (together) scale, measured on a ratio scale from -100 to 100, and the Dominance and Submissive scale, also measured on a ratio scale from -100 to 100. These scores differed per participant, because a participant's scores on a sentence were subject to moods, emotions and personality. To counterbalance this phenomenon, we asked as many different participants as possible, which leveled out the differences in scores. We also used some control variables (gender, age and education level) to determine the generalizability. After the text analysis by the participants, we created, through the LIWC, a classifier (classifier A) to categorize sentences on Leary's Rose by means of a formula.

Participants. For the text categorization, 101 participants took part in our study. These 101 respondents each scored only a part of the sentences, to keep the task short and easy. At least two annotators per sentence were needed to compare the annotation scores, and we reached on average five participants per sentence. This procedure differed from that of other annotation studies, most of which used up to four participants who scored all the sentences (Keshtkar et al., 2014; Vaassen et al., 2012).
The participants in those studies were experienced in the use of Leary's Rose. Scoring sentences on Leary's Rose with inexperienced participants had not been done before, but annotation of emotions without training was done by Aman and Szpakowicz (2007). In addition, this was a quick and cost-saving option compared to the text categorization methods of other studies. The participants were recruited via social media (Facebook) and e-mail and met the following criteria: familiar with using a computer and the internet, Dutch-speaking (first language) and in the age range of 18 to 65. These criteria decreased the chance of mistakes caused by not understanding the task. Participants who did not finish the task were deleted, and if a participant's answers differed significantly, the answers were checked manually. When the participants finished their tasks, they were rewarded with the chance of winning one of five gift cards of the online retailer worth 20 euros. All the tasks were reviewed by the psychology ethics board of Leiden University, and the study complies with all applicable laws and guidelines.

Procedure. The data for the first phase was collected via an online survey. The survey started with an informed consent form to inform the participants about their rights (Appendix A), the criteria and the general procedure. After accepting the informed consent, the participant received an explanation of how to score the sentences, followed by an example for further clarification. This example first showed the sentence, then a slider for the Dominance - Submissive scale, followed by a slider for the I - We scale, and ended with an optional question: "Based on which words did you scale the sentences?" This last question gave us some underlying information about the participants' criteria.
The first task consisted of 11 opening sentences that were randomly assigned to the participants; after completing these 11 sentences, the participant received information about the second task. The second task was similar to the first, but instead of opening sentences the participant received six dialogs between a user and the chatbot. These dialogs each contained a few sentences, to give the participant the feeling of reading a real conversation and thus improve the scores on the sentences. Every sentence in a dialog was scored separately, in the same way as in the first task: first the sentence, then the two sliders, followed by the "Based on which words did you scale the sentences?" question.

After completing the second task, the participant received another questionnaire with some control variables (gender, age and education level) and a few general questions (whether there were any uncertainties and whether they liked the tasks). Finally, the participant could fill in an e-mail address for the gift card draw, followed by the debriefing (Appendix B). Completing all tasks took about 15 minutes.

Apparatus. In the first phase we compiled a dataset of 250 real chatbot conversations on the subject "Where is my package from an external seller?" (from the online retailer's chatbot). These chat conversations took place between 1 December 2016 and 28 February 2017. After reviewing these 250 conversations, we excluded 8 conversations because the users were not serious about the topic and were testing the chatbot with strange sentences. We then separated the opening sentences from the dialogs, because an opening sentence is self-contained and usually contains a lot of information. Not every dialog was usable, because of closed-ended questions from the chatbot; however, a considerable number of dialogs contained a lot of information, and the development of a dialog was very useful (Vaassen & Daelemans, 2011). Based on these findings we used both opening sentences and dialogs: 203 opening sentences and 110 dialogs. The participants were presented with a link, in a Facebook advertisement or an e-mail, to a Qualtrics questionnaire. Qualtrics is an online tool for questionnaires that can be filled in on a computer, tablet or smartphone with internet access. We recommended using a computer because this makes filling in the questionnaire easier. The questionnaire was pre-tested by two participants to check the time spent and its understandability. To analyze the data we used IBM SPSS 23; to classify words per quadrant we used the LIWC.
The LIWC counts the words in a sentence or text and divides them into psychologically meaningful categories. It then divides the number of words in each category by the total word count and presents the percentage of words per sentence in each category. The first LIWC application was developed as part of a study of language and disclosure (Pennebaker, 1993; Tausczik & Pennebaker, 2010), and over the years the LIWC has been translated into several languages. The 2001 LIWC dictionary was the first to be translated into Dutch (Zijlstra, Meerveld, Middendorp, Pennebaker & Geenen, 2004), and Boot, Zijlstra and Geenen (2017) translated the 2007 version. The Dutch version of the 2007 LIWC contains 11,091 words in 66 categories and gives results equivalent to the English LIWC, except for a small number of categories. This is because of differences in word use, homonyms, or a less suitable test corpus for some of the categories (Boot et al., 2017).

Analysis. The obtained data was entered into IBM SPSS 23 and checked for outliers. The participants' answers were checked on task completion, total time (a boxplot on task duration), descriptives (mean, minimum and maximum), and with a frequency analysis on gender, age, education level and general thoughts about the questionnaire. When a participant's data differed from the others, the answers were checked manually. When a participant did not complete the task or gave strange answers, the data was deleted and the participant was excluded from the study. The usable data in SPSS was transposed to analyze the data by sentence instead of by participant. We then computed the mean, standard deviation, minimum and maximum of the annotated sentence scores, as well as a count of respondents per sentence. After these calculations we computed the nominal scores for Leary's Rose.
The ratio scores on the I (< 0) and We (> 0) scale and the Dominance (> 0) and Submissive (< 0) scale were transformed into one of the four quadrants. For instance, a sentence with a score of 64 on the I & We scale and a score of -67 on the Dominance & Submissive scale was transformed into the quadrant "Submissive/We." Before we entered the sentences with the participant scores into the LIWC, we created an inclusion criterion based on the mean and standard deviation of the raw scores: the selected sentences had to have a mean score of 15 or higher (in absolute value) on both scales (I & We, and Dominance & Submissive) and a standard deviation lower than 50 on both scales. Sentences that did not meet the inclusion criteria were deleted. This approach was used to separate the strongly scored sentences from the weak ones, in order to train the classifier on a stronger database. For each sentence, the scores resulting from the annotation by the participants were entered per quadrant into the LIWC, which categorized the sentences into its categories. With a t-test on the percentage of words per quadrant we could find differences between the quadrants. For example, someone typed the sentence "You are stupid" to the chatbot, which has a "Dominance/I" ground truth. The three words were categorized by the LIWC, where "stupid" falls into the category "negative emotion." After categorizing all the sentences, the LIWC category "negative emotion" turned out to be significantly more common in the "Dominance/I" sentences; this category is then used as a predictor for the "Dominance/I" quadrant. These measurements were entered into a new SPSS datasheet to create the classifier. Finally, a backwards logistic regression was performed on the LIWC output to determine the best predictor word groups for the quadrants of Leary's Rose. Through a binary logistic regression analysis, a classifier (classifier A) was created to rate new sentences on Leary's Rose.
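The quadrant mapping and inclusion criteria can be sketched as follows. This is an illustrative Python sketch with function names of our own, using the thresholds as applied in the results (a mean of 15 or higher in absolute value and a standard deviation below 50, on both scales):

```python
def quadrant(iwe: float, domsub: float) -> str:
    """Map mean scale scores to a Leary's Rose quadrant.

    iwe:    I (< 0) / We (> 0) scale, ratio scale from -100 to 100.
    domsub: Dominance (> 0) / Submissive (< 0) scale, same range.
    """
    side = "We" if iwe > 0 else "I"
    axis = "Dominance" if domsub > 0 else "Submissive"
    return f"{axis}/{side}"

def include(iwe_mean: float, domsub_mean: float,
            iwe_sd: float, domsub_sd: float) -> bool:
    """Inclusion criteria: strong mean scores, low annotator disagreement."""
    return (abs(iwe_mean) >= 15 and abs(domsub_mean) >= 15
            and iwe_sd < 50 and domsub_sd < 50)

print(quadrant(64, -67))   # the example from the text: "Submissive/We"
```

A sentence such as the worked example (I & We mean of 64, Dominance & Submissive mean of -67) lands in "Submissive/We" and, given standard deviations below 50, passes the inclusion filter.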
After completion of the classifier, the cut-off point was optimized through a ROC curve on the quadrant scores and p-values (Tosteson & Begg, 1988). Optimizing the cut-off point counterbalanced the bias of the classifier, so that it reached a higher accuracy. This bias arose from imbalances in the number of sentences per quadrant: the classifier is skewed towards the quadrant with the most sentences. This classifier is used in phase two to automatically categorize sentences on Leary's Rose.

Results

Participants. The questionnaire was finished by 101 participants: 38 males and 63 females. Sixty-seven percent of the participants were between 18 and 29 years old, and more than 86 percent of the participants had an education level of HBO (Higher Vocational Education) or higher. We examined all cases on time, descriptives and missing data. More than half of the participants did not finish the task; these were deleted. Based on time spent, three participants were outliers; these were checked manually and turned out to have filled in the questionnaire normally. We found no outliers in the other data. Apart from task completion, we did not need to remove participants.

Sentence selection. The participants scored 537 unique sentences. One sentence did not belong to a chatbot conversation, by our mistake, and was deleted. After the deletion, we transposed the data in SPSS and calculated the mean, standard deviation, minimum and maximum for every sentence score on both scales. We used the inclusion criteria to select the most important sentences: the standard deviation should be lower than 50, the mean score 15 or higher, and every sentence should be scored by two or more annotators. With these inclusion criteria we selected 212 sentences (Table 1). Because of the low number of "Submissive/I" (3.3%) and "Dominance/We" (0.9%) sentences, we decided to use only the "Dominance/I" (74.5%) and "Submissive/We" (21.2%) quadrants. This resulted in 203 sentences.


Table 1

Sentence selection after inclusion

Quadrant   Number of cases   Percentage
I/Dom            158             74.5
We/Dom             2              0.9
I/Sub              7              3.3
We/Sub            45             21.2
Total            212            100

Baseline. The baseline for this two-class classification is 77.8 percent. This is higher than a 50% baseline due to imbalances in the class distribution. The baseline is based on the category with the most sentences, in this case "Dominance/I": this category contains 158 sentences in a database of 203 sentences, and 158 divided by 203, multiplied by 100, is 77.8%. Classifier A should therefore place more than 77.8% of the sentences in the correct quadrant of Leary's Rose to perform better than this baseline.

LIWC. The 203 sentences with their nominal Leary's Rose scores were entered into the LIWC. Every sentence was categorized separately, based on the LIWC dictionary, and the category scores were entered into SPSS. An independent-samples t-test determined which categorizations differed significantly between the two quadrants. First, we checked whether the data fit the assumptions of an independent-samples t-test: (1) all observations should be independent, which holds for this test; (2) normality: the data must follow a normal distribution in the population when the samples are smaller than 25 units; our samples are larger than 25 units, so we do not violate this assumption; (3) homogeneity: the standard deviations should be fairly equal in both populations; in this study some of the variables violated this assumption, but we proceeded with the t-test because of the large number of units. The following categories scored significant or near-significant: dictionary cover, we, past, number, affect, positive emotion, negative emotion, cause, relative, time, work and assent (Table 2).


Table 2

t-test on LIWC categories per quadrant

                     Sub/We (N = 45)    Dom/I (N = 158)
Category              M       SD         M       SD        t       df       p
Dictionary cover     89.27   12.66      81.98   21.67      2.85   123.84   .005**
We                     .54    1.84        .09     .69      2.56   201      .011*
Past                  1.10    2.93       2.65    5.91     -2.42   149.18   .017*
Number                 .38    2.16       2.18    6.15     -3.07   193.03   .002**
Affect               13.98   26.94       4.28   15.12      2.32    52.14   .025*
Positive emotion     13.54   27.05       2.00   11.60      2.79    48.69   .008**
Negative emotion       .44    1.72       2.19    9.91     -2.10   184.70   .037*
Cause                  .20    1.36       1.92    6.37     -3.15   193.44   .002**
Relative              9.13   10.19      17.45   17.44     -4.05   123.97   .000***
Time                  3.62    6.92      10.34   14.82     -4.29   158.27   .000***
Work                   .29    1.12       2.90   10.06     -3.19   169.72   .002**
Assent                9.18   20.85       3.34   14.86      1.76    57.31   .084

Note. Sub/We = Submissive/We, Dom/I = Dominance/I, N = number of cases, M = mean, SD = standard deviation, df = degrees of freedom.
*p < .05. **p < .01. ***p < .001.
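The fractional degrees of freedom in Table 2 indicate an unequal-variance (Welch) correction for the categories that violated the homogeneity assumption. Such a test can be reproduced from the summary statistics alone; the following Python sketch (not part of the original SPSS analysis) uses the "Dictionary cover" row as input:

```python
from math import sqrt

def welch_t(n1, m1, s1, n2, m2, s2):
    """Welch's t statistic and degrees of freedom from summary stats."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2          # squared standard errors
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# "Dictionary cover": Sub/We n=45, M=89.27, SD=12.66; Dom/I n=158, M=81.98, SD=21.67.
t, df = welch_t(45, 89.27, 12.66, 158, 81.98, 21.67)
print(round(t, 2))   # 2.85, as in Table 2
print(round(df, 1))  # about 123.9, close to Table 2's 123.84 (SPSS used unrounded inputs)
```

The same function reproduces, up to rounding, the other rows of Table 2 with fractional degrees of freedom.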

Logistic Regression. To obtain a classifier we conducted a binary logistic regression analysis, which produces a formula; this formula is classifier A. The LIWC categories described above were used in this logistic regression. The quadrant scores are the outcome variables. The LIWC categories are the independent variables and should not correlate highly with each other. Another assumption is that the number of covariates (the LIWC categories) should be as low as possible without decreasing the overall percentage of correctly categorized sentences. With a backward selection procedure, we removed in every new logistic regression analysis the least significant category until the rules described above fitted the binary logistic regression. After the backward selection procedure, we reached an overall score of 80.8 percent based on the “we”, “positive emotion” and “relative” variables. All the variables contribute significantly to the classification formula and the omnibus test of model coefficients is also significant. This means that our new model (80.8%) is three percent better than the baseline (77.8%). To achieve a higher overall percentage of correctly categorized sentences we optimized the cut value of the binary logistic regression. Classifier A reached its highest score at a cut value of .65, found via the ROC curve. The percentage of correctly categorized sentences increased to 82.3 percent, an improvement of 4.5 percent over the baseline. The classifier (classifier A) created by this binary logistic regression is:

P = e^x / (1 + e^x), where x = 1.027 − .368 × we − .029 × positive emotion + .038 × relativity
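Applying classifier A can be sketched as follows; the coefficients are taken from the formula above, while the mapping of probabilities at or above the .65 cut value to “Dominance/I” is our reading of the thesis:

```python
# Applying classifier A: a logistic formula over three LIWC category scores.
# Label mapping (p >= cut -> "Dominance/I") is an assumption for illustration.
from math import exp

def classifier_a(we, positive_emotion, relativity, cut=0.65):
    """Return (quadrant label, probability) for one sentence's LIWC scores."""
    x = 1.027 - 0.368 * we - 0.029 * positive_emotion + 0.038 * relativity
    p = exp(x) / (1 + exp(x))
    label = "Dominance/I" if p >= cut else "Submissive/We"
    return label, p

# A sentence with no "we" words, no positive emotion, some relativity words:
label, p = classifier_a(we=0.0, positive_emotion=0.0, relativity=10.0)
print(label)  # Dominance/I
```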

Discussion

In the first phase we conducted a text analysis to scale sentences from a chat conversation on Leary’s Rose. Participants annotated sentences, and these annotations were the ground truth for training classifier A. The hypothesis was: “We expect to find a higher overall accuracy than the baseline.” The conclusion for this hypothesis is that it is possible to increase the accuracy of the categorization on Leary’s Rose; the improvement is 4.5 percent.

First of all, it is noteworthy that, after close examination, our chatbot conversations fell mostly into two quadrants of Leary’s Rose. This could be due to the type of conversations between the chatbot and the user. The first type is the “it goes well/I got my answer” conversation: the user is happy with the answer from the chatbot and their problem is solved. The second type is the “this is not what I want” conversation: the chatbot does not understand the user, or the user is unhappy with the answer given by the chatbot. The first type of conversation belongs to the “Submissive/We” quadrant and the second type to the “Dominance/I” quadrant. The other quadrants do not seem to fit the chatbot conversations on the subject “Where is my package from an external seller”. Another explanation is the difference in communication styles between human-human and human-chatbot conversations (Hill et al., 2015; Mou & Xu, 2017).

If we compare our results with the studies of Vaassen and Daelemans (2010; 2011) and Vaassen et al. (2012), we find similar results. Vaassen and Daelemans also found an improvement in categorizing sentences with a classifier, and their improvement was even higher than ours. Our lower classification improvement can be due to the shorter text input and a different approach. For our research we used the chatbot from an online retailer, which was a rule-based chatbot.
Because of the rule-based chatbot we also chose a rule-based bag-of-words approach for our classification framework and not a machine learning approach such as Vaassen and Daelemans (2010; 2011) and Vaassen et al. (2012) used. In addition, Vaassen and Daelemans used a training set with more than 1000 sentences, which is bigger than ours, and they used annotators who were trained in the usage of Leary’s Rose. Another important factor for the slight improvement is the annotation problem as described by Vaassen (2014). According to Vaassen (2014) the problem starts with the data collection and manual annotation. Human annotators often disagree about the position on Leary’s Rose. This results in small, noisy and low-agreement datasets. This challenge was also noticed in our annotation process and is reflected in the high standard deviation scores. A possible solution for this problem is a follow-up study in which the sentences are annotated by the writers of those sentences. Participants should be better at annotating their own sentences, because they know the purpose and meaning of their sentences. This solution was tested in phase two.

Phase two

Methods

Design. The second phase contained a control part and a follow-up part. With the control part we tested the validity of classifier A and with the follow-up part we created a new classifier (classifier B). In the control part, the participants were asked to hold two different conversations with the chatbot, based on two scenarios, namely a “Dominant/I” scenario and a “Submissive/We” scenario (Appendix C). In the follow-up part, the conversations were annotated by the same participants. They rated their own sentences on one ratio scale, the “Dominant/I” versus “Submissive/We” scale. The sentences annotated by the participants and the predefined nominal scenario scores (“Dominant/I” or “Submissive/We”) were the ground truth for testing classifier A.
After testing classifier A, we used the annotated sentences, as ground truth, to create a new classifier (classifier B). The annotated sentence scores differed per participant even though every participant received the same two scenarios. This supported external validity because every participant brought their own emotions and personality to the task. To obtain a well-balanced mean score, we asked different respondents, which leveled out the differences in scores. Also, a participant could have learned from the first scenario; to counterbalance this issue we randomized the order of the scenarios.

Finally, we checked all data on time spent, task completion and descriptives (mean, minimum and maximum of the annotated sentence scores). Odd answers were checked manually and such outliers were removed.

Participants. We recruited 29 participants through Facebook and e-mail, as in the first phase. The criteria for the participants were: being familiar with using a computer and the internet, an age range of 18 to 65, Dutch as a first language, and no participation in the first phase. Participants were also excluded when they did not finish the questionnaire or the follow-up questionnaire, and all the questions were checked manually for odd answers. The reward for participating in this study was the chance to win a 20-euro gift card from an online retailer. None of the earlier studies about text analysis and Leary’s Rose used participants to validate their own outcomes. However, using 20 participants or more is in line with the studies of Settanni and Marengo (2015) and Georgaca and Avdi (2011). Settanni and Marengo (2015) used 20 participants to analyze emotions in Facebook posts, and based on this study we also aimed for 20 participants. Georgaca and Avdi (2011) confirmed the number of 20 participants or more to validate outcomes in their guide “Discourse Analysis.” This phase was reviewed by the psychology ethics board of Leiden University and the study complies with all applicable laws and guidelines.

Procedure. The chatbot was used to hold conversations with the participants. These conversations followed two scenarios on the subject “Where is my package from an external seller?” One scenario focused on the Dominant/I quadrant and the other on the Submissive/We quadrant (see Appendix C). The participants were asked to empathize with the tasks in the scenario.
Imitating behavior and emotions has proven possible in a variety of tests (Grubb & McDaniel, 2007; Keen, 2006; McFarland, Ryan & Ellis, 2002) and based on these studies we expected that the participants could empathize with the scenarios. The second phase started with data collection via an online questionnaire. The questionnaire was accessible through a link on Facebook or in an e-mail. The questionnaire started with an informed consent (Appendix D) to inform the participants about the criteria, the general procedure and their rights. The informed consent was followed by more information about the task. This contained some basic information about chatbots and a more extensive explanation of the procedure. Then the participant received one of the two scenarios; the order was randomized. After reading the scenario the participant started the conversation through a link. The chatbot finished the conversation when the participant had asked everything they needed to know. In the chatbot conversation the participant was asked to fill in their e-mail address. E-mail addresses were used to connect the questionnaire data with the chatbot data, to send the follow-up questionnaire and to assign the gift card to the winner. When the conversation was finished the participant received the other scenario. This procedure was repeated in the same way as described above. After the second conversation the participant was navigated back to the questionnaire, where they answered some demographic questions (gender, age and education level) and a few general questions (whether there were any uncertainties and whether they liked the tasks). Finally, the participant entered their e-mail address in the questionnaire, followed by the debriefing (Appendix E). The total time to complete the whole task was about 15 minutes. In the second part of phase two the respondents were asked to fill in a follow-up questionnaire which was sent to their e-mail.
In this questionnaire they annotated their own sentences on a Dominant/I – Submissive/We scale. Participants could only take part in this follow-up when the first questionnaire had been filled in completely. The questionnaire started with the informed consent (Appendix F), followed by information about the task. This task was almost the same as in phase one; the only difference was that the sentences were now rated on a single scale (Dominant/I – Submissive/We), ranging from -100 to 100. The instruction was exactly the same as in phase one. After annotating their own sentences the participant was asked to answer a few general questions (whether there were any uncertainties and whether they liked the tasks) and to give their e-mail address. Finally, the participant could read the debriefing (Appendix G). This questionnaire took about 5 minutes.

Apparatus. The questionnaire was an online Qualtrics questionnaire. Qualtrics is an online tool for questionnaires that can be filled in on a computer, tablet or smartphone with internet access. We recommended using a computer because it made completing the questionnaire easier. The first task of phase two was pre-tested by one participant to check time spent and understandability. The second task was not pre-tested because it was almost identical to the task in phase one.

Analysis. In phase one we trained classifier A, and this classifier was tested in the current phase. The test set contained sentences self-annotated by the participants. The sentences written by the participants were scored by classifier A and compared with the ground truth.

The ground truth was the predefined scenario (Dominant/I or Submissive/We) or the self-annotated scores from the participants obtained in the follow-up study. To clarify, the scores of classifier A were compared with both the predefined scenarios and the self-annotated scores from the participants. First, we checked all data from the participants on task completion, time spent and some descriptives (mean, minimum and maximum). If a participant differed from other participants, their answers were checked manually and deleted when needed. Second, we wrote two Python programs (Appendix H) to analyze the sentences with the formula created in phase one. The first program is classifier A without a range; the second program computed the sentences based on classifier A with a range. A range could improve the accuracy of the classifier because close to the cutoff point there could be a mix of “Dominance/I” and “Submissive/We” sentences (Jones, 2016; Lord, 1961). For example, with a cutoff point of .65 we can add a range to this cutoff point from .60 to .70. This means that scores between .60 and .70 were not classified, in order to avoid mismatches. The data from Python, the two scenarios and the follow-up questionnaire were inserted into SPSS 23. Outliers were checked manually and all sentences with missing participant scores were deleted. Sentences with a zero score, as calculated by classifier A, were also deleted since these sentences did not have matching words in the wordlist. Thereafter we computed the frequencies of the sentences scored by classifier A and by the participants. Afterwards we computed which sentences were scored equally by classifier A, the scenarios and the participant. This was followed by another frequency analysis on the participant sentence scores after selecting only the equally scored sentences. Based on the frequency tables we computed the percentage of correctly scored sentences by the classifier.
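The ranged cut-off described above can be sketched as a small abstain rule around the cut value (the label mapping is again our reading of the thesis):

```python
# Ranged cut-off: probabilities inside the band around the cut value are
# left unclassified to avoid mismatches near the decision boundary.
def classify_with_range(p, low=0.60, high=0.70):
    """Return a quadrant label, or None when p falls inside the abstain band."""
    if low <= p <= high:
        return None  # too close to the cut-off: not classified
    return "Dominance/I" if p > high else "Submissive/We"

print(classify_with_range(0.65))  # None (not classified)
print(classify_with_range(0.80))  # Dominance/I
```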
This analysis was repeated with the data from the Python program with ranges. In the second part of phase two we created a new formula (classifier B) by repeating part of the analysis of phase one. The sentences from the follow-up study were analyzed with the LIWC, and with the results we created classifier B through a logistic regression analysis. A ROC curve determined the cutoff point and range. After this analysis we had a new classifier for phase three.

Results

Participants. 29 participants took part in the questionnaire of phase two. We examined the 29 cases on time and missing data. 15 cases had to be removed because these participants did not finish the questionnaire or the follow-up study. From the remaining 14 cases we did not have to remove any participants. Five of our participants were men and nine were women. Except for two, all participants were between 18 and 29 years old. Eleven participants’ education level was HBO (Higher Vocational Education) or higher; the other participants’ education level was lower than HBO.

Table 3
A crosstab of the sentence selection by predefined scenarios and follow-up

                             Follow-up
                   None   Sub/We   Dom/I   Not scaled   Total
Scenario
  Sub/We    N         3       46      24         162      235
            %      0.7%    10.1%    5.3%       35.7%    51.8%
  Dom/I     N         4       11      57         147      219
            %      0.9%     2.4%   12.6%       32.4%    48.2%
  Total     N         7       57      81         309      454
            %      1.5%    12.6%   17.8%       68.1%     100%

Sentence selection. The conversations with the chatbot yielded 454 sentences (Table 3): 235 sentences from the “Submissive/We” scenario and 219 sentences from the “Dominance/I” scenario. After the follow-up study, 145 sentences were scored on Leary’s Rose by the participants. The sentences that were not scaled were short sentences containing answers like “yes” or “no”. Comparing the predefined scenario quadrant scores and the participant quadrant scores, almost 30% did not match. After a closer review we decided to use only the follow-up quadrant scores, because many participants scored sentences in the “Submissive/We” scenario as “Dominance/I”, which affected the reliability of the predefined scenarios (Table 4). This decision is based on a two-sided Fisher’s exact test (p < .001), which indicates that the distributions are different. We chose the Fisher’s exact test since two cells had an expected count of less than five. The 309 sentences that were not scored in the follow-up study were deleted. Of the remaining 145 sentences, 57 were scored as “Submissive/We”, 81 as “Dominance/I” and 7 as neither “Submissive/We” nor “Dominance/I”. These 7 sentences were scored as zero on “Submissive/We” and zero on “Dominance/I” (Table 4).


Table 4
A crosstab of the sentence selection by predefined scenarios and follow-up after deletion of the not scaled sentences

                        Follow-up
                   None   Sub/We   Dom/I   Total
Scenario
  Sub/We    N         3       46      24      73
            %      2.1%    31.7%   16.6%   50.3%
  Dom/I     N         4       11      57      72
            %      2.8%     7.6%   39.3%   49.7%
  Total     N         7       57      81     145
            %      4.8%    39.3%   55.9%    100%

Table 5
A crosstab: sentence selection and classified sentences after deletion of the not scaled sentences

                        Follow-up
                   None   Sub/We   Dom/I   Total
Classifier
  None      N         4       17      19      40
            %      2.8%    11.7%   13.1%   27.6%
  Sub/We    N         0        4       4       8
            %      0.0%     2.8%    2.8%    5.5%
  Dom/I     N         3       36      58      97
            %      2.1%    24.8%   40.0%   66.9%
  Total     N         7       57      81     145
            %      4.8%    39.3%   55.9%    100%

Table 5 gives an overview of the sentences from the follow-up study and the sentences classified by classifier A, after the deletion of the not scaled sentences. The 145 follow-up sentences from the participants were inserted in a Python program to score them with classifier A. The formula classified 105 sentences; the 40 sentences that were not classified had no words in any of the dictionaries. These sentences were also deleted, since we focused only on classified sentences. These 105 sentences were the basis for the control part and the improvement part. Table 6 displays the sentence selection after the deletion of the sentences not scaled by the classifier.

Baseline control study. The statistical baseline for this classification problem is 59.0 percent. The baseline is based on the category with the most sentences (Table 6), in this case the “Dominance/I” category (59.0%). This is higher than a 50% baseline due to imbalances in the class distribution. Classifier A should place more than 59.0% of the sentences in the correct quadrant of Leary’s Rose to perform better than this baseline.

Table 6
A crosstab: sentence selection and classified sentences after deletion of non-scaled sentences

                        Follow-up
                   None   Sub/We   Dom/I   Total
Classifier
  Sub/We    N         0        4       4       8
            %      0.0%     3.8%    3.8%    7.6%
  Dom/I     N         3       36      58      97
            %      2.9%    34.4%   55.2%   92.4%
  Total     N         3       40      62     105
            %      2.9%    38.1%   59.0%    100%

Control study. After analyzing the wrongly scored sentences (Table 6), we argued that the single cut-off point is not precise enough. A range could improve the accuracy of the classifier because sentences close to the .65 cut-off point were often classified wrongly. If we add a range around the cut-off point, from .55 to .75, we reach a classification score of 60.8%, an improvement of almost 2%.

A new classification formula. The 145 sentences from the follow-up study were used as ground truth to create a new classification formula (classifier B). Of the 145 sentences, 57 were scored as “Submissive/We”, 81 as “Dominance/I” and 7 as neither “Submissive/We” nor “Dominance/I”. The 7 sentences were removed because they had a zero score on both quadrants. The statistical baseline is 59.1%. This baseline is higher than a 50% baseline because of imbalances in the class distribution. Classifier B should score higher than 59.1% to improve on this classification problem. All 138 sentences were inserted in the LIWC. Every sentence was categorized separately and the category scores were inserted in SPSS. An independent-samples t-test determined which categorizations differed significantly between the two quadrants. The categories six-letter words, affect, positive emotion, negative emotion, insight, relativity, time and money scored significantly or nearly significantly different (Table 7).


Table 7
t-test on LIWC categories per quadrant.

                         Sub/We                  Dom/I
                      N      M      SD       N      M      SD     t-test      df     p-value
Six-letter words     56   21.01   14.63     81   13.78   15.48     2.77    122.56    .006**
Affect               56    6.84   13.73     81    2.40    6.83     2.50    135       .014*
Positive emotion     56    6.33   13.78     81    0.36    2.81     3.79    135       .000***
Negative emotion     56    0.51    2.30     81    1.74    5.00    -1.72    135       .088
Insight              56    4.15    7.23     81    1.59    4.25     2.60    135       .010*
Relativity           56   11.78   11.84     81   20.42   15.80    -3.47    135       .001**
Time                 56    5.95    8.41     81   11.38   13.05    -2.74    135       .007**
Money                56    2.80    6.17     81    0.69    2.33     2.80    135       .006**

Note. Sub/We = Submissive/We, Dom/I = Dominance/I, N = number of cases, M = mean, SD = standard deviation, df = degrees of freedom.
*p < .05. **p < .01. ***p < .001.

Logistic Regression for a new classifier. The LIWC categories described above were used in a logistic regression. The quadrant scores were the outcome variables and the LIWC categories were the covariates, which should not correlate highly with each other. Another assumption was that the number of covariates should be as low as possible without decreasing the overall percentage of correctly categorized sentences. The regression analysis met all the assumptions. With a backward selection procedure we removed, in every new logistic regression, the least significant category until the assumption described above fitted the binary logistic regression. After the backward selection procedure, and with a cut-off value of .536, we reached an overall score of 72.3% with the positive emotion, negative emotion, relativity and insight variables in the equation (classifier B1). One variable, namely money, increased the overall percentage of the classification by more than 4%. Therefore, we decided to create a second classifier (classifier B2) with the variable money in the equation. Classifier B2 reached an overall percentage of 76.6% with a cut-off value of .51. All the variables in both regression analyses were significant, and the omnibus tests of model coefficients were also significant. This means that our classifiers B1 and B2 (without and with money) are better than the baseline by 13.2% and 17.5%, respectively. The classifiers created by the binary logistic regressions are:

Classifier B1:
P = e^x / (1 + e^x), where x = 0.146 − .126 × positive emotion + .071 × negative emotion + .033 × relativity − .057 × insight

Classifier B2:
P = e^x / (1 + e^x), where x = 0.609 − .136 × positive emotion + .056 × negative emotion + .024 × relativity − .075 × insight − .151 × money
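Both formulas can be sketched in Python; the coefficients are copied from above, and the inputs are the Submissive/We mean scores from Table 7, used purely as an illustration:

```python
# Classifiers B1 and B2 as logistic formulas over LIWC category scores.
from math import exp

def logistic(x):
    return exp(x) / (1 + exp(x))

def classifier_b1(posemo, negemo, relativity, insight):
    return logistic(0.146 - 0.126 * posemo + 0.071 * negemo
                    + 0.033 * relativity - 0.057 * insight)

def classifier_b2(posemo, negemo, relativity, insight, money):
    return logistic(0.609 - 0.136 * posemo + 0.056 * negemo
                    + 0.024 * relativity - 0.075 * insight - 0.151 * money)

# Submissive/We mean scores from Table 7:
p1 = classifier_b1(posemo=6.33, negemo=0.51, relativity=11.78, insight=4.15)
p2 = classifier_b2(posemo=6.33, negemo=0.51, relativity=11.78, insight=4.15, money=2.80)
# Both probabilities fall below the cut-off values (.536 and .51).
```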

Discussion

In phase two we tested classifier A and created two new classifiers, B1 and B2, by fitting two new formulas. The ground truth for testing classifier A could be established by two methods. The first method was the follow-up study, in which the participants who typed the sentences also annotated their own sentences. The second method was using the predefined scenarios. These predefined scenarios turned out to be unreliable according to a Fisher’s exact test: the participants used many “Dominance/I” sentences in the “Submissive/We” scenarios. After a close review we concluded that when the chatbot did not answer as expected, the participant used “Dominance/I” sentences, even in the “Submissive/We” scenarios. Due to the unreliable predefined scenario scores, classifier A was only tested with the self-annotated sentence scores from the participants.

The hypothesis we tested in phase two was: “The overall accuracy of classifier A will be higher than the baseline.” Contrary to our expectations, classifier A showed no improvement in categorizing the sentences in the test part of phase two. The classifier reached the same percentage of correctly scored sentences as the baseline (59%). Notable is the small number of sentences categorized in the “Submissive/We” quadrant and the large number categorized in the “Dominance/I” quadrant, probably due to the training set from phase one, which created classifier A. The data in the training set was small, noisy and not well balanced. This is the reason we cannot accept the hypothesis. We argued that a ranged cut-off point could improve the results of the classifiers (Jones, 2016; Lord, 1961) and that a single cut-off point was possibly not precise enough, because close to the cut-off point there could be a mix of “Dominance/I” and “Submissive/We” sentences. After adding a ranged cut-off point the categorization improved slightly (2%).
We concluded that a ranged cut-off point is not the solution for reaching large improvements. After testing classifier A we created two new classifiers (classifiers B1 and B2). Classifier B2 had an extra variable, namely money. This variable showed a significant improvement in the binary logistic regression (4.3%), but we expect that it was an important factor only for this dataset, because money words can be used in both the “Dominance/I” and “Submissive/We” quadrants. In phase three we test these new classifiers against NPS, and we expect to improve on the earlier results, because participants without training on Leary’s Rose should have more problems rating other people’s sentences than participants rating their own.

Phase three

Methods

Design. The last phase, phase three, is a study to predict the NPS score of a conversation from the classifier score of that particular conversation. The conversations we used came from the database of the chatbot of the online retailer and were real conversations between customers and the chatbot. All these conversations have an NPS score, and with Python we computed the quadrant score. In this study the NPS score and the quadrant score should show a correlation, and this correlation should support the effectiveness of the classifiers. The NPS is a method to indicate customer satisfaction. It is based on one simple question: “How likely is it that you would recommend this company to a friend or colleague?” The question is answered on an eleven-point scale. The scores from zero to six are the detractors, unhappy customers. The scores seven and eight are the passives, satisfied but unenthusiastic customers. The remaining scores, nine and ten, are the promoters, happy and enthusiastic customers who recommend the company to friends and colleagues.
The total NPS score for a company is the percentage of promoters minus the percentage of detractors (Mattrox II, 2013; Reichheld, 2003). We expected that emotion is an important factor connecting the NPS score and Leary’s Rose quadrants: the “Dominant/I” scale should correlate negatively with the NPS score and the “Submissive/We” scale should correlate positively with it. Our expectation is based on the findings of Shaw (2016), who described that emotion has a moderating effect on the NPS; emotion is also an important factor in Leary’s Rose. Based on this we expect that we can predict NPS from the Leary scores.

Participants. In this phase we did not need participants, because we used existing anonymized conversations from a database and these conversations already had NPS scores. The conversations came from a real-world setting and were written by real customers. We selected 303 anonymous conversations on the subject “Where is my package from an external seller.”

Procedure. A database from the chatbot was selected with almost equal numbers per NPS group: detractors with an NPS score from zero to six, passives with scores of seven and eight, and promoters with scores of nine and ten. The Python programs (Appendix H) scaled the whole conversation and the separate sentences per conversation. Every conversation had an NPS score, a quadrant classification based on the whole conversation, and a quadrant classification per sentence. The whole-conversation scores were computed by the Python program, which took all the sentences and scaled them as one. The separate sentence scores were measured individually, after which the mean of the sentence scores was computed per conversation.

Apparatus. In this experiment Python was used to classify conversations and sentences with the classifiers, and SPSS 23 was used for the analysis.

Analysis.
The data from a Python program and the NPS data from the database were inserted in an SPSS dataset. For every Python program (Table 8) we used a different dataset, and the whole-conversation scores and the separate sentence scores were also inserted in different SPSS datasets. Conversations with no quadrant score were marked as missing. A frequency analysis created a quick overview of the data: the number of conversations, the number of cases per quadrant and the number of cases per NPS group. Afterwards we computed a Kendall’s tau-b correlation between the p-value of the quadrant score and the NPS. As explained before, the detractors should correlate highly with the “Dominant/I” quadrant and the promoters with the “Submissive/We” quadrant. A crosstab of quadrant and NPS groups counted the cases of NPS groups per quadrant. The passive NPS group was not important in our study and was not taken into account, because this group could not be predicted by the classifiers. With the crosstab we computed the percentage of correctly scored quadrants per NPS group: detractors with quadrant “I/Dominance” and promoters with quadrant

“We/Submissive”. This analysis was repeated for every classifier (see Table 8). Afterwards we compared the scores of the three classifiers.
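The NPS grouping and total score used in this phase can be sketched as follows (the ratings below are hypothetical):

```python
# NPS groups: detractors 0-6, passives 7-8, promoters 9-10;
# total NPS = percentage of promoters minus percentage of detractors.
def nps_group(score):
    if score <= 6:
        return "detractor"
    return "passive" if score <= 8 else "promoter"

def nps_total(scores):
    groups = [nps_group(s) for s in scores]
    promoters = groups.count("promoter") / len(groups)
    detractors = groups.count("detractor") / len(groups)
    return round((promoters - detractors) * 100)

ratings = [10, 9, 8, 7, 6, 3, 10]  # hypothetical customer ratings
print(nps_total(ratings))  # 14  (3 promoters, 2 detractors out of 7)
```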

Table 8
The classifiers with their specifications and their corresponding Python codes

Classifier   Content        Fitting method   Fitting configuration   Python code
A            conversation   range            .55 - .75               1
                            no range         .65
             sentence       range            .55 - .75               2
                            no range         .65
B1           conversation   range            .45 - .55               3
                            no range         .54
             sentence       range            .45 - .55               4
                            no range         .54
B2           conversation   range            .40 - .60               5
                            no range         .51
             sentence       range            .40 - .60               6
                            no range         .51

Note. Variables in classifier A: we, positive emotion, relativity; B1: positive & negative emotion, relativity, insight; B2: positive & negative emotion, relativity, insight, money.

Results

Chat conversations. In total, 303 chat conversations with NPS scores were separated into the three NPS groups: the detractors with 102 conversations, the passives with 101 conversations and the promoters with 100 conversations. These conversations were scored by the Python programs on the whole-conversation scores and on the separate sentence scores for all three classifiers A, B1 and B2 (Table 8). The subsequent part discusses the results of classifier A, followed by the results of classifier B1 and finally the results of classifier B2.

Classifier A. For explorative purposes, we used every classifier in four different ways. The Python program with the classifier scaled the whole conversation with a range, the whole conversation without a range, and the sentences in the conversation with and without a range. Table 9 lists the most important outcomes of classifier A.

We used a Kendall’s tau-b correlation because our data was skewed and we therefore could not perform a Pearson correlation. The assumptions for Kendall’s tau-b are that the variables should be at least ordinal and should have a monotonic relationship. The latter assumption is not very strict and can usually be assessed from the data. The data did not fail the assumptions for Kendall’s tau-b.

Classifier A shows a slight improvement in three approaches. The correlation between NPS and the p-value of the classifier score is significant for ‘classifier A on conversations’ (rτ = -.094, p = .023) and ‘classifier A on sentences’ (rτ = -.157, p < .001). Most of the “Dominance/I” sentences are scored correctly, but the “Submissive/We” sentences are mostly scored wrongly. This could explain the low total classification scores and therefore the low improvement relative to the baseline classification. The sentence-based classification has the best improvement, with a mean improvement of 4.3%. The classifier without a range showed a better improvement, with a mean score of 3.4%, than the classifier with a range. Classifier A based on sentences without a range achieved an improvement of 4.9%, the highest improvement of the different approaches with classifier A. The mean improvement over the different ways of measurement with classifier A is 2.6%.
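Kendall’s tau-b can also be computed without SPSS; a pure-Python sketch with tie correction, run on hypothetical toy data (a perfectly inverse ranking gives -1):

```python
# Kendall's tau-b: rank correlation with a correction for tied values.
from math import sqrt
from itertools import combinations
from collections import Counter

def kendall_tau_b(x, y):
    """O(n^2) tau-b: (concordant - discordant) / sqrt((n0 - tx) * (n0 - ty))."""
    concordant = discordant = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xi - xj) * (yi - yj)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n0 = len(x) * (len(x) - 1) / 2
    tx = sum(c * (c - 1) / 2 for c in Counter(x).values())  # ties in x
    ty = sum(c * (c - 1) / 2 for c in Counter(y).values())  # ties in y
    return (concordant - discordant) / sqrt((n0 - tx) * (n0 - ty))

# Hypothetical data: higher NPS paired with lower "Dominance/I" p-values.
tau = kendall_tau_b([0, 3, 7, 9, 10], [0.90, 0.80, 0.60, 0.40, 0.20])
print(tau)  # -1.0
```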

Table 9
Outcomes of classifier A

Classifier                 Correlation (R) (sig)   Scored   I/Dom total   I/Dom correct   We/Sub total   We/Sub correct   Baseline   Total    Improvement
conversation, with range   -.094 (.023)            199      74            74              50             0                59.7%      59.7%    0.0%
conversation, no range     -.094 (.023)            297      101           101             98             5                50.8%      52.6%    1.8%
sentence, with range       -.157 (.000)            208      79            72              59             12               57.2%      60.9%    3.7%
sentence, no range         -.157 (.000)            297      101           89              98             33               50.8%      55.7%    4.9%

Note. “Scored” is the number of scored conversations; “Baseline” and “Total” are the baseline and total classification percentages. I/Dom total and I/Dom correct are the total number of “Dominance/I” sentences and the number of them classified correctly; We/Sub total and We/Sub correct are defined analogously.
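The Kendall’s tau-b correlations reported above were presumably computed with a statistics package; as an illustration, the tie-corrected statistic can be sketched in pure Python. The function name and the O(n²) pair loop are our own; a real analysis would use a library routine such as scipy.stats.kendalltau.

```python
from itertools import combinations
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation with tie correction.

    Suitable for ordinal, skewed data (e.g. NPS group vs. classifier
    score), where a Pearson correlation is not appropriate."""
    assert len(x) == len(y)
    n = len(x)
    concordant = discordant = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        s = (xi - xj) * (yi - yj)
        if s > 0:
            concordant += 1      # pair ordered the same way on both variables
        elif s < 0:
            discordant += 1      # pair ordered oppositely
        # s == 0: tied on at least one variable, counted neither way

    n0 = n * (n - 1) / 2         # total number of pairs

    def tie_term(values):
        # sum over tie groups of t*(t-1)/2
        counts = {}
        for v in values:
            counts[v] = counts.get(v, 0) + 1
        return sum(t * (t - 1) / 2 for t in counts.values())

    n1, n2 = tie_term(x), tie_term(y)
    return (concordant - discordant) / sqrt((n0 - n1) * (n0 - n2))
```

With no ties this reduces to ordinary Kendall’s tau; the denominator shrinks when either variable contains tied values.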

Classifier B1. As in phase one, we applied every classifier in four different ways for explorative purposes. Three approaches of classifier B1 showed a significant improvement (17.5%, 15.9%, and 15.0%; Table 10); the fourth showed a decrease of -3.4% relative to the baseline classification. The data met the assumptions for Kendall’s tau-b, and both classifications showed significant correlations (rτ = -.170, p < .001 and rτ = -.165, p < .001). Compared with classifier A, more “Submissive/We” sentences were scored correctly, at the cost of a slight decrease in correctly scored “Dominance/I” sentences. Classifier B1 applied to sentences reached the highest mean improvement (15.5%), and classifier B1 without a range showed the best mean improvement (16.3%). The overall best variant of classifier B1 was the conversation-based formula without a range (17.5%). The total mean improvement of classifier B1 was 11.3%.

Table 10
Outcomes of classifier B1

Classifier                 Correlation (R) (sig)   Scored   I/Dom total   I/Dom correct   We/Sub total   We/Sub correct   Baseline   Total    Improvement
conversation, with range   -.170 (.000)            136      51            51              29             6                63.8%      60.4%    -3.4%
conversation, no range     -.170 (.000)            299      101           72              99             64               50.5%      68.0%    17.5%
sentence, with range       -.165 (.000)            261      88            72              83             44               51.5%      67.4%    15.9%
sentence, no range         -.165 (.000)            299      101           75              99             56               50.5%      65.5%    15.0%
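The “Baseline”, “Total”, and “Improvement” columns in Tables 9-11 are consistent with a majority-class baseline, i.e. always guessing the larger of the two classes. The following sketch reproduces that arithmetic; it is our reading of the tables, as the original Python scoring scripts are not shown here.

```python
def baseline_and_improvement(idom_total, idom_correct, wesub_total, wesub_correct):
    """Majority-class baseline vs. actual classification accuracy.

    Reproduces (under our reading) the 'Baseline', 'Total', and
    'Improvement' columns of Tables 9-11, computed from the per-class
    totals and correct counts."""
    scored = idom_total + wesub_total
    baseline = max(idom_total, wesub_total) / scored       # always guess majority class
    accuracy = (idom_correct + wesub_correct) / scored     # classifier's actual accuracy
    return baseline, accuracy, accuracy - baseline

# 'conversation, no range' row of Table 10 (classifier B1):
b, a, imp = baseline_and_improvement(101, 72, 99, 64)
```

For that row this yields a 50.5% baseline, 68.0% total classification, and 17.5% improvement, matching the reported values.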

Classifier B2. Finally, classifier B2 was examined in the same four ways for explorative purposes. Two approaches showed a substantial improvement (13.2% and 13.9%), one showed a slight improvement (0.5%), and one showed no improvement (0.0%; Table 11). The data met the assumptions for Kendall’s tau-b, and classifier B2 showed significant correlations based on conversations (rτ = -.143, p < .001) and on sentences (rτ = -.151, p < .001). Compared with classifier A, classifier B2 showed a higher improvement, with more correctly classified “Submissive/We” sentences and a slight decrease in correctly classified “Dominance/I” sentences relative to phase one. Overall, however, classifier B2 scored lower than classifier B1. Classifier B2 applied to sentences reached a mean improvement of 13.6%, higher than classifier B2 applied to whole conversations. Classifier B2 without a range (mean 7.2%) scored higher than classifier B2 with a range (mean 6.6%). Classifier B2 applied to sentences without a ranged cut-off point showed the highest improvement, 13.9%. The total mean improvement of classifier B2 was 6.9%.


Table 11
Outcomes of classifier B2

Classifier                 Correlation (R) (sig)   Scored   I/Dom total   I/Dom correct   We/Sub total   We/Sub correct   Baseline   Total    Improvement
conversation, with range   -.143 (.000)            263      95            95              78             0                54.9%      54.9%    0.0%
conversation, no range     -.143 (.000)            300      101           101             99             1                50.5%      51.0%    0.5%
sentence, with range       -.151 (.000)            203      73            60              64             33               53.7%      66.9%    13.2%
sentence, no range         -.151 (.000)            300      101           79              99             50               50.5%      64.4%    13.9%
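The “with range” and “no range” variants that recur in Tables 9-11 can be sketched as follows: a single cut-off always assigns a half of Leary’s Rose, while a ranged cut-off leaves scores near the threshold unclassified, which is why the “with range” rows score fewer conversations. The cut-off of 0.5 and the margin value are illustrative assumptions; the thesis does not report the exact thresholds used.

```python
def classify(p_value, cutoff=0.5, margin=0.0):
    """Map a classifier score to a Leary's Rose half.

    With margin == 0 this is a single cut-off: every score is
    classified. With margin > 0 it is a 'ranged' cut-off: scores
    inside [cutoff - margin, cutoff + margin] are left unclassified
    (returned as None). High scores map to "Dominance/I", low scores
    to "Submissive/We", following the p-value interpretation in the
    Discussion. Threshold values here are illustrative."""
    if margin and abs(p_value - cutoff) <= margin:
        return None                      # inside the range: not scored
    return "Dominance/I" if p_value > cutoff else "Submissive/We"
```

A conversation-level score can be classified the same way, which makes the trade-off visible: a wider margin discards more borderline cases and can lower the total classification score, as observed for the “with range” rows.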

Discussion

In phase three we predicted the NPS from the classifier’s p-value. A high p-value (“Dominance/I” quadrant) should correlate with a low NPS; a low p-value (“Submissive/We” quadrant) should correlate with a high NPS. The third hypothesis was therefore: “We expect to find a positive correlation between NPS and “Submissive/We” scores on Leary’s Rose and a negative correlation between NPS and “Dominance/I” scores.”

A first drawback of this phase was the data collection. All conversations were anonymized conversations from a database and already had NPS scores. Because the database was anonymized, we knew nothing about the population or possible outliers. This could have affected the outcomes, but we expect the effect to be minimal.

Classifier A did reach significant correlations between the p-value and the NPS. The classification based on conversations showed weaker significant correlations than the classification based on sentences. Another important finding was that classifier A assigned almost all sentences to the “Dominance/I” quadrant, probably because of the training set: it contained significantly more “Dominance/I” sentences and was therefore skewed, which could have biased the training towards “Dominance/I” sentences.

Both classifier B1 and classifier B2 reached significant correlations for the classification based on conversations as well as on sentences. The classification based on whole conversations performed worse because the word count per category differs from sentence-based classification. Whole-conversation classification categorizes more text as “Dominance/I”, probably because the classifiers are more sensitive to “Dominance/I”. When a whole text is categorized, the “Submissive/We” sentences are leveled out by the “Dominance/I” sentences; when a conversation is categorized sentence by sentence, the “Submissive/We” sentences contribute more to the total score. Sentence-based categorization is therefore less sensitive to the skewed data. In summary, we conclude that NPS correlates with Leary’s Rose.

The last hypothesis, hypothesis four, was: “We expect to find a higher improvement of the overall accuracy from classifier B1 and B2 than from classifier A.” Based on the results we conclude that classifiers B1 and B2 indeed improved more than classifier A. Noteworthy is that classification based on whole conversations performed worse in the total classification than sentence-based classification; the same problem also affected the correlation between NPS and Leary’s Rose. Another important factor in the classification is the difference between a ranged cut-off point and a single cut-off point. The ranged cut-off point reaches a lower classification score in most cases because too many sentences are left out. As described in the results of phase two, we expected that a ranged cut-off point would not solve this categorization problem.

Finally, the three classifiers showed different scores. In general, classifier B1 categorized best, although classifier B1 based on conversations with a range performed worst (Table 10). The best classifier was classifier B1, the classifier without money, with a ranged cut-off point. That classifier B1, the version without the money variable, performed best was as expected. The good performance of classifier B1 with a ranged cut-off point is somewhat surprising, and we cannot logically explain this outcome (Table 10). Classifier B2, the classifier with money in the equation, was less significant than we expected: the money variable did not improve the performance.

General discussion

The aim of this study was to scale relatively short text input, in a chatbot environment, on Leary’s Rose.
Furthermore, the study addressed a possible new annotation process and tried to find a correlation between NPS and Leary’s Rose. In phase one we created classifier A from sentences annotated by participants. Phase two tested classifier A on a new database of annotated sentences; these sentences were written and annotated by the same participants (self-annotation). After this test we created two new classifiers in phase two, namely B1 and B2, trained on the self-annotated sentences. In phase three we tried to predict the NPS from Leary’s Rose and determined, based on crosstabs, which classifier performed best. This general discussion starts with a short summary of the conclusions on the four hypotheses, followed by the limitations, and finishes with some recommendations for future research.

The first hypothesis was: “We expect to find a higher overall accuracy than the baseline.” In phase one we did find a higher overall accuracy for classifier A than the random baseline. We also found a higher overall accuracy than the studies of Vaassen and Daelemans (2010, 2011), probably because we used a two-way categorization rather than their four-way categorization. However, when the overall accuracy is compared with the random baseline, Vaassen and Daelemans found a significantly higher improvement, so we can state that their method categorizes better. A few factors can explain this. First, their training set, with more than 1000 sentences, was larger than ours. Second, their annotation was done by annotators trained in the use of Leary’s Rose. Third, Vaassen and Daelemans used a machine learning approach, whereas we used a rule-based bag-of-words approach.
Generally, a machine learning approach is more accurate than a rule-based bag-of-words approach (Kotsiantis, Zaharakis, & Pintelas, 2007; Vaassen, 2014). This is a consequence of the static nature of a rule-based bag-of-words approach: the classifier is trained once and classifies sentences according to a set of simple rules. These rules have to be specified by humans, so we must determine them ourselves, which only works well when we are aware of all the relevant rules. A machine learning approach, by contrast, can derive rules itself when shown data labeled with the different classes. It can also be trained easily on large datasets, and its classifiers can be retrained more easily and more frequently. For these reasons a machine learning approach is, in general, more accurate (Kotsiantis et al., 2007).

The second hypothesis was: “The overall accuracy of classifier A will be higher than the baseline.” In the test phase, the overall accuracy of classifier A was not higher than the baseline; in fact, it was exactly the same. The hypothesis of a higher accuracy than the baseline can therefore be rejected, probably because of the small and noisy training set from phase one. Moreover, participants without training on Leary’s Rose may interpret a sentence differently than trained or self-rating participants, which can lead to a noisier training set.
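The rule-based bag-of-words approach contrasted with machine learning above can be illustrated with a minimal sketch: count dictionary hits per category in a sentence and return the “Dominance/I” share, which plays the role of the classifier’s p-value. The mini-dictionaries below are hypothetical stand-ins for the LIWC categories and weights actually used in the thesis.

```python
def bag_of_words_score(sentence, dominance_words, submissive_words):
    """Minimal rule-based bag-of-words scorer.

    Counts how many tokens fall in each (hypothetical) category
    dictionary and returns the share of "Dominance/I" words among all
    matched words: 1.0 = purely dominant wording, 0.0 = purely
    submissive wording, 0.5 = no evidence either way. The real
    classifiers used LIWC categories and regression weights; this
    sketch only shows the static, rule-based character of the method."""
    tokens = sentence.lower().split()
    dom = sum(t in dominance_words for t in tokens)
    sub = sum(t in submissive_words for t in tokens)
    if dom + sub == 0:
        return 0.5                       # no dictionary hits: ambiguous
    return dom / (dom + sub)

# Hypothetical mini-dictionaries (illustrative only):
DOM = {"i", "want", "now", "immediately"}
SUB = {"we", "please", "maybe", "together"}
```

Because the dictionaries are fixed, the scorer cannot adapt to new data without a human revising the word lists, which is exactly the limitation the machine learning comparison points at.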

In phase two we also noticed that participants were not able to maintain a conversation based on a scenario, mainly in the “Submissive/We” scenario: if the chatbot did not interact as the participant expected, the conversation switched towards a “Dominance/I” conversation.

The third hypothesis was: “We expect to find a correlation between a high NPS and “Submissive/We” scores on Leary’s Rose and a low NPS and “Dominance/I” scores.” This expectation is based on the findings of Shaw (2016): the higher the NEV (positive emotion), the higher the NPS. In our study we found a correlation between NPS and Leary’s Rose. Thus, we can state that classifiers that categorize sentences correctly on Leary’s Rose correlate with NPS, and we can therefore predict NPS from the quadrants of Leary’s Rose. This result is interesting because an NPS score is a grade given by a user or customer; such grades are not always given, and collecting them costs time and money. With an automated prediction from chatbot conversations, NPS can be obtained faster, more easily, and more cheaply, and automation will also increase the total number of NPS scores. Mukherjee and Bhattacharyya (2012) used sentiment analysis to determine positive and negative product reviews; their automatic analysis of customer emotion is somewhat similar to our NPS approach. We expect that automatic analysis of customers’ emotions, thoughts, and needs is feasible in the near future, and that further research will yield better and more reliable analyses. Such machine learning sentiment analysis of product reviews or chat conversations can be very valuable for a company.

The fourth hypothesis was: “We expect to find a higher improvement of the overall accuracy from classifier B1 and B2 than from classifier A.” The classifiers from phase two (B1 and B2) indeed showed higher improvements than classifier A.
This is probably because the phase-two classifiers were trained on sentences that were typed and annotated by the same participant. We conclude that annotation by self-rating is more reliable than annotation by untrained participants and leads to less noisy datasets; less noisy datasets lead to better classifiers and more effective categorization.

This study has some limitations concerning the obtained sentences. As mentioned before, the training set from the first phase is noisy because the task is highly subjective and participants scored sentences very differently from one another. Besides being noisy, the dataset was very small in both phases, which affects the classifiers negatively. Moreover, the rule-based bag-of-words approach is not as effective as a machine learning approach; we expect higher overall improvement scores with a machine learning approach as in Vaassen and Daelemans (2010, 2011), and we therefore recommend, in most cases, a machine learning approach for future research. The generalizability of this study is limited because of the narrow subject, “where is my package from an external seller.” The classifiers cannot be used in other settings because the training set, the dictionaries, and the variables in the classifiers are all based on this subject. Because of these limitations, categorization on Leary’s Rose is, in general, not yet reliable enough to implement in chatbots (Vaassen, 2014). Noisy, skewed, and small datasets make it hard to create classifiers that categorize sentences correctly in a high percentage of cases. Future research should focus on larger, more general, less skewed, and less noisy datasets. More general datasets can be used for a wider variety of subjects and can be added to small datasets to obtain larger and less skewed training sets.
This approach can save time and money without compromising the usability of the training set. To reduce the noise in the dataset we proposed self-rating annotation, in which the participant who typed a sentence also annotates it. This method seems to improve the reliability of the annotation, but each sentence is then annotated by only one participant, which could be a disadvantage. Moreover, a participant annotates a sentence according to their own feeling, but can that feeling be logically computed from the typed text? Can the feeling be logically explained by the text alone? These questions can be answered in future research.

Another approach, proposed by Vaassen (2014), is a multimodal approach. It improves emotion classification by incorporating facial expressions, intonation of voice, body language, and other factors that carry information about our feelings; raw text in itself does not contain all the information needed to determine the emotion of its author. We support this idea and therefore recommend that future emotion classification research focus on a multimodal approach (Vaassen, 2014). Such an approach should be tested in future research, for example emotion classification based on text, video, audio, and physiological measurements (heart rate, perspiration, etc.), so that the classification rests on more than raw text.

In the introduction we described differences in communication style between human-AI and human-human conversations. Mou and Xu (2017) reported lower perceived attractiveness, less goal-driven behavior, and more brutal language in human-AI conversations. Hill et al. (2015) found that human-chatbot communication differs from human-human communication in more messages, shorter message lengths, a more limited vocabulary, and greater use of profanity.
This study also supports the idea of more brutal language (profanity) and shorter message lengths. The shorter message lengths may be due to the type of questions the chatbot asks: most were closed questions, or questions that can be answered with a short answer. The greater use of profanity is supported by the higher rates of “Dominance/I” sentences. But what happens if a user is not aware that he or she is talking to a chatbot? Is the user in that case more agreeable, open, extraverted, conscientious, and self-disclosing? Or will our communication style on the internet change until we always talk as if to a chatbot? When the awareness of talking to a chatbot fades away, chatbots will pass the Turing test (Saygin, Cicekli, & Akman, 2000; Turing, 1950). This may take a while, but eventually it will be unclear whether we are interacting with a human or a computer. There is no doubt that this will change our lives.

Acknowledgements

I would like to thank my master’s thesis supervisor, Dr. Pascal Haazebroek, for all his help, advice, and feedback. I also want to thank Dr. Roy de Kleijn for the time he spent reading my thesis. I would like to thank Roel Vossen and Elmer Hiemstra for their time and for the access they gave to the resources of the online retailer. Furthermore, I would like to thank Live Presence for access to the chatbot and the database of conversations. Finally, I would like to thank Prof. Dr. Walter Daelemans and Dr. Frederik Vaassen for sharing their data from the “deLearyous” project, and Dr. Hanna Zijlstra for sharing the newest Dutch LIWC dictionary.


References

Aman, S., & Szpakowicz, S. (2007, September). Identifying expressions of emotion in text. In International Conference on Text, Speech and Dialogue (pp. 196-205). Springer, Berlin, Heidelberg.
Boot, P., Zijlstra, H., & Geenen, R. (2017). The Dutch translation of the Linguistic Inquiry and Word Count (LIWC) 2007 dictionary. Dutch Journal of Applied Linguistics, 6(1), 65-76.
Chattaraman, V., Kwon, W. S., & Gilbert, J. E. (2012). Virtual agents in retail web sites: Benefits of simulated social interaction for older users. Computers in Human Behavior, 28(6), 2055-2066.
DeWitt, T., Nguyen, D. T., & Marshall, R. (2008). Exploring customer loyalty following service recovery: The mediating effects of trust and emotions. Journal of Service Research, 10(3), 269-281.
Georgaca, E., & Avdi, E. (2012). Discourse analysis. In Qualitative Research Methods in Mental Health and Psychotherapy: A Guide for Students and Practitioners (pp. 147-162).
Grubb III, W. L., & McDaniel, M. A. (2007). The fakability of Bar-On’s Emotional Quotient Inventory Short Form: Catch me if you can. Human Performance, 20(1), 43-59.
Gurtman, M. B. (2009). Exploring personality with the interpersonal circumplex. Social and Personality Psychology Compass, 3(4), 601-619.
Hill, J., Ford, W. R., & Farreras, I. G. (2015). Real conversations with artificial intelligence: A comparison between human–human online conversations and human–chatbot conversations. Computers in Human Behavior, 49, 245-250.
Jones, E. A. (2016). A multiple-cutoff regression-discontinuity analysis of the effects of Tier 2 reading interventions in a Title I elementary school.
Keen, S. (2006). A theory of narrative empathy. Narrative, 14(3), 207-236.
Keshtkar, F., Burkett, C., Li, H., & Graesser, A. C. (2014). Using data mining techniques to detect the personality of players in an educational game. In Educational Data Mining (pp. 125-150). Springer, Cham.
Kotsiantis, S. B., Zaharakis, I., & Pintelas, P. (2007). Supervised machine learning: A review of classification techniques. Emerging Artificial Intelligence Applications in Computer Engineering, 160, 3-24.
Leary, T. (1957). Interpersonal diagnosis of personality: Functional theory and methodology for personality evaluation. Ronald Press, Oxford.
Lord, F. M. (1961). Multiple cutting scores and errors of measurement. ETS Research Bulletin Series, 1961(1), i-20.
McFarland, L. A., Ryan, A. M., & Ellis, A. (2002). Item placement on a personality measure: Effects on faking behavior and test measurement properties. Journal of Personality Assessment, 78(2), 348-369.
Mou, Y., & Xu, K. (2017). The media inequality: Comparing the initial human-human and human-AI social interactions. Computers in Human Behavior, 72, 432-440.
Mukherjee, S., & Bhattacharyya, P. (2012, March). Feature specific sentiment analysis for product reviews. In International Conference on Intelligent Text Processing and Computational Linguistics (pp. 475-487). Springer, Berlin, Heidelberg.
Pennebaker, J. W. (1993). Putting stress into words: Health, linguistic, and therapeutic implications. Behaviour Research and Therapy, 31(6), 539-548.
Picard, R. W. (1997). Affective computing. MIT Press, Cambridge.
Rahman, J. (2012). Implementation of ALICE chatbot as domain specific knowledge bot for BRAC U (FAQ bot) (Doctoral dissertation, BRAC University).
Remmerswaal, J. (2011). Handboek groepsdynamica: Een nieuwe inleiding op theorie en praktijk (10th ed.). Amsterdam, The Netherlands: Boom Nelissen.
Roberts, K., Roach, M. A., Johnson, J., Guthrie, J., & Harabagiu, S. M. (2012, May). EmpaTweet: Annotating and detecting emotions on Twitter. In LREC (Vol. 12, pp. 3806-3813).
Salovey, P., & Mayer, J. D. (1990). Emotional intelligence. Imagination, Cognition and Personality, 9(3), 185-211.
Saygin, A. P., Cicekli, I., & Akman, V. (2000). Turing test: 50 years later. Minds and Machines, 10(4), 463-518.
Settanni, M., & Marengo, D. (2015). Sharing feelings online: Studying emotional well-being via automated text analysis of Facebook posts. Frontiers in Psychology, 6.
Shaw, C. (2016, February 4). New CX measure to compliment NPS: Net Emotional Value. Retrieved from https://customerthink.com/new-cx-measure-to-compliment-nps-net-emotional-value/
Shawar, A., Atwell, E., & Roberts, A. (2005). FAQchat as an information retrieval system. In Human Language Technologies as a Challenge for Computer Science and Linguistics: Proceedings of the 2nd Language and Technology Conference (pp. 274-278). Poznań: Wydawnictwo Poznańskie, with co-operation of Fundacja Uniwersytetu im. A. Mickiewicza.
Tatai, G., Csordás, A., Kiss, Á., Szaló, A., & Laufer, L. (2003, September). Happy chatbot, happy user. In International Workshop on Intelligent Virtual Agents (pp. 5-12). Springer, Berlin, Heidelberg.
Tausczik, Y. R., & Pennebaker, J. W. (2010). The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29(1), 24-54.
Tosteson, A. N. A., & Begg, C. B. (1988). A general regression methodology for ROC curve estimation. Medical Decision Making, 8(3), 204-215.
Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433-460.
Vaassen, F. (2014). Measuring emotion: Exploring the feasibility of automatically classifying emotional text (Doctoral dissertation, University of Antwerp). Retrieved from http://www.cnts.ua.ac.be/sites/default/files/frederikvaassen_phdpreprint.pdf
Vaassen, F., & Daelemans, W. (2010). Emotion classification in a serious game for training communication skills. LOT Occasional Series, 16, 155-168.
Vaassen, F., & Daelemans, W. (2011). Automatic emotion classification for interpersonal communication. In Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (pp. 104-110). Association for Computational Linguistics.
Vaassen, F., Wauters, J., Van Broeckhoven, F., Van Overveldt, M., Daelemans, W., & Eneman, K. (2012, October). deLearyous: Training interpersonal communication skills using unconstrained text input. In European Conference on Games Based Learning (p. 505). Academic Conferences International Limited.
van Dijk, B. (2014). Beïnvloed anderen, begin bij jezelf. Uitgeverij Thema.
van Dijk, B., & Cremers, M. J. (2007). Actie is reactie: Gedrag sturen met de Roos van Leary (waaier). Uitgeverij Thema.
Varela-Neira, C., Vázquez-Casielles, R., & Iglesias-Argüelles, V. (2008). The influence of emotions on customer’s cognitive evaluations and satisfaction in a service failure and recovery context. The Service Industries Journal, 28(4), 497-512.
Warwick, K., & Shah, H. (2016). Passing the Turing test does not mean the end of humanity. Cognitive Computation, 8(3), 409-419.
Weizenbaum, J. (1966). ELIZA—a computer program for the study of natural language communication between man and machine. Communications of the ACM, 9(1), 36-45.
Yu, T., White, C., & Xu, C. (2007). The dynamic role of emotions in customer loyalty. In Reputation, Responsibility, Relevance. University of Otago.
Zijlstra, H., Van Meerveld, T., Van Middendorp, H., Pennebaker, J. W., & Geenen, R. (2004). De Nederlandse versie van de ‘Linguistic Inquiry and Word Count’ (LIWC). Gedrag & Gezondheid, 32, 271-281.


Appendix A
Informed consent phase one

Consent form (informed consent)

Study title: Scaling sentences
Responsible researcher: Gerard Visser

To take part in this study you must be between 18 and 65 years old and a native Dutch speaker.

In this study you will scale sentences by answering three questions per sentence. We will also ask you a number of questions about yourself. The whole study takes about 15-20 minutes.

As compensation you can take part in a raffle in which bol.com gift cards of 25 euros are drawn.

By giving consent you declare the following:
• I have read the information above. My questions have been answered adequately. I had enough time to decide whether to participate.
• I know that participation is entirely voluntary. I know that I can decide at any moment not to participate after all, or to stop. I do not have to give a reason for this.
• My answers will be processed anonymously / in coded form.
• I give permission for my data to be used for the purposes stated above.


Appendix B
Debriefing phase one

Debriefing

Study title: Scaling sentences
Responsible researcher: Gerard Visser

The goal of this study is to scale sentences on Leary’s Rose. Scaling the sentences is needed for further research into a chatbot that can recognize different behaviors from textual input. These scaled sentences will be used in further research.

We did not mention Leary’s Rose before the study because doing so might have caused bias. Leary’s Rose is an interaction model that makes it possible to describe human characteristics and behavior. The first two questions for each sentence yield a coordinate on Leary’s Rose, which describes the behavior of the person. The last question gives us more insight into which words play a role in scaling the behavior.

In this way we expect to create a list of words that can determine people’s behavior from textual input. This list of words will be investigated further, and we expect that after further research we can make a chatbot respond to the behavior of the person conversing with it.

Your participation in this study is of great importance to us for getting as clear and accurate a picture as possible of the words and the behaviors associated with them. We would like to thank you again for your cooperation.

If you have any questions after the study, you can contact me, Gerard Visser, at the e-mail address [email protected], or my supervisor, Dr. Pascal Haazebroek, at the e-mail address [email protected]


Appendix C
Scenario 1 (“Submissive/We” quadrant) and scenario 2 (“Dominance/I” quadrant)

Scenario 1

Yesterday you ordered a watering can through an external seller of the online retailer. You are cheerful because the watering can you wanted was 10 euros cheaper than where you first intended to order it. However, you have forgotten roughly when to expect the package, and you would like an estimate of when it will be delivered. You also want to know in advance how to contact the external seller in case you have more questions.

To gather some information you start a conversation with the virtual assistant (a chatbot). You are currently not at home and cannot log in because you do not know your login details by heart. In good spirits, you start the conversation.

Scenario 2

Two weeks ago you ordered a book from an external seller through the online retailer. This external seller was supposed to deliver your package within a week, which you already found rather long. You still have not received it, you are quite angry about this, and you want the book in your possession as soon as possible. Yesterday you sent an irritated e-mail, but you have not yet received a reply.

You are now at work and really want to know where your package is. You contact the virtual assistant (a chatbot). You cannot log in to your account because you are at work and do not have your login details at hand. Angry and irritated, you start the conversation.


Appendix D
Informed consent phase two

Consent form (informed consent)

Study title: Conversation with a chatbot
Responsible researcher: Gerard Visser

To take part in this study you must be between 18 and 65 years old and a native Dutch speaker. You may not have participated previously in the study “Scaling sentences.”

In this study you will hold two short conversations with a chatbot. You must put yourself in a given scenario and conduct the conversation with the chatbot from that scenario. After the study we will ask you a number of questions about yourself. The whole study takes about 10 minutes.

As compensation you can take part in a raffle in which bol.com gift cards are drawn.

• I have read the information letter for participants. My questions have been answered adequately. I had enough time to decide whether to participate.
• I know that participation is entirely voluntary. I know that I can decide at any moment not to participate after all, or to stop. I do not have to give a reason for this.
• My answers will be processed anonymously / in coded form.
• I give permission for my data to be used for the purposes stated in the information letter.


Appendix E
Debriefing phase two

Debriefing

Study title: Scaling sentences
Responsible researcher: Gerard Visser

The goal of this study was to validate scaled words/sentences on Leary’s Rose. The scaling of the sentences was carried out in a previous study.

We did not mention Leary’s Rose before the study because doing so might have caused bias. Leary’s Rose is an interaction model that makes it possible to describe human characteristics and behavior. The conversations you held with the chatbot show us whether we measure what we need to measure and whether the sentences were scaled correctly. In this way we expect to create an accurate list of words that can determine people’s behavior from textual input.

We ask you to keep this study confidential until 24 November 2017, so that the other participants cannot be influenced before taking part.

Your participation in this study is of great importance to us for obtaining as clear and accurate a picture as possible of the words and the behaviors associated with them. We would like to thank you once again for your cooperation.

If you have any questions after the study, you can contact me, Gerard Visser, at [email protected], or my supervisor, Dr. Pascal Haazebroek, at [email protected]

Appendix F

Informed consent follow-up part phase two

Informed consent (toestemmingsverklaring)

Study title: Chat conversation follow-up (Chatgesprek follow-up)
Responsible researcher: Gerard Visser

To take part in this study you must be between 18 and 65 years old and a native speaker of Dutch.

In this study you will scale the sentences that you yourself typed during the chatbot conversation. The entire study takes about 5 minutes.

As compensation you can enter a raffle in which a bol.com gift card worth 25 euros will be drawn.

By giving consent you declare the following:
• I have read the information above. My questions have been answered sufficiently. I had enough time to decide whether to participate.
• I know that participation is entirely voluntary. I know that I can decide at any moment not to participate after all, or to stop. I do not have to give a reason for this.
• My answers will be processed anonymously / in coded form.
• I give permission for my data to be used for the purposes stated above.


Appendix G

Debriefing follow-up part phase two

Debriefing (nabeschouwing)

Study title: Scaling sentences (Zinnen schalen)
Responsible researcher: Gerard Visser

The goal of this study was to scale the typed sentences so that they can be compared with earlier studies.

The sentences are scaled using Leary's Rose. Leary's Rose is an interaction model that makes it possible to describe human characteristics and behavior. The conversations you held with the chatbot, together with the scaling of these sentences, give us an indication of whether we measure what we need to measure. In this way we expect to compile an accurate list of words with which people's behavior can be determined from their textual input.

We ask you to keep this study confidential until 24 November 2017, so that the other participants cannot be influenced before taking part.

Your participation in this study is of great importance to us for obtaining as clear and accurate a picture as possible of the words and the behaviors associated with them. We would like to thank you once again for your cooperation.

If you have any questions after the study, you can contact me, Gerard Visser, at [email protected], or my supervisor, Dr. Pascal Haazebroek, at [email protected]


Appendix H

Python code

All word lists used in the Python code below are given in Appendix I.
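All four listings rely on the same LIWC-style matching rule: a word-list entry ending in an asterisk matches any word that begins with the entry's stem, while all other entries must match exactly. The following is a minimal, self-contained sketch of that rule; the function name `count_matches` and the example word list are illustrative and not part of the thesis code.

```python
def count_matches(sentence, wordlist):
    """Count words in `sentence` that match `wordlist`, where an entry
    ending in '*' matches any word sharing its stem (LIWC convention)."""
    score = 0
    for word in sentence.split(" "):
        scrubbed = word.lower()
        for matchword in wordlist:
            if "*" not in matchword and matchword == scrubbed:
                score += 1  # exact match
            elif "*" in matchword and scrubbed.startswith(matchword[:-1]):
                score += 1  # wildcard stem match
    return score
```

For example, with the hypothetical list `["wij", "help*"]`, the sentence "wij willen graag helpen" scores 2: "wij" matches exactly and "helpen" matches the stem "help*".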

Python code Classifier A with range:

# v1 bare minimum
# v2 added session ID in CSV. Added comma notation for numbers with decimals
#    so Excel will accept the data.
# v3 iterating over files in a directory
import tkinter as tk  # for opening the file
import math
import csv
from tkinter import filedialog
from tkinter import *
import re
import os

list_we = list_posemo = list_relativ1 = list_relativ2 = (Lists in Appendix I)


def check_matches(sentence):
    print(sentence)
    score_we = 0
    score_posemo = 0
    score_relativ = 0

    sentencelist = sentence.split(" ")
    # list_we
    for word in sentencelist:
        scrubbed_word = word.lower()
        if scrubbed_word in list_we:
            score_we += 1
        for matchword in list_posemo:
            if not "*" in matchword and matchword == scrubbed_word:
                score_posemo += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_posemo += 1
        for matchword in list_relativ1:
            if not "*" in matchword and matchword == scrubbed_word:
                score_relativ += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_relativ += 1
        for matchword in list_relativ2:
            if not "*" in matchword and matchword == scrubbed_word:
                score_relativ += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_relativ += 1

    return {'score_we': score_we, 'score_posemo': score_posemo,
            'score_relativ': score_relativ}


def calc_scores(sentence_length, matches):
    print(sentence_length)
    percentage_we = matches.get('score_we') / sentence_length * 100
    percentage_posemo = matches.get('score_posemo') / sentence_length * 100
    percentage_relativ = matches.get('score_relativ') / sentence_length * 100

    if (matches.get('score_we') == 0 and matches.get('score_posemo') == 0
            and matches.get('score_relativ') == 0):
        p_value = 0
        kwadrant = "geen"
    else:
        p_value = math.exp(1.027 - .368 * percentage_we - .029 * percentage_posemo
                           + .038 * percentage_relativ) / (
            1 + math.exp(1.027 - .368 * percentage_we - .029 * percentage_posemo
                         + .038 * percentage_relativ))
        if p_value >= .75:
            kwadrant = "4"
        elif p_value <= .55:
            kwadrant = "2"
        else:
            kwadrant = "geen"

    return {'num_words': sentence_length, 'percentage_we': percentage_we,
            'percentage_posemo': percentage_posemo,
            'percentage_relativ': percentage_relativ,
            'p_value': p_value, 'kwadrant': kwadrant}


def to_comma(original_number):
    return str(original_number).replace(".", ",")


chats = []
session_lines = []        # sentences per session; overwritten each time
chat_sentences = []       # all chat sentences
chat_data_per_line = []   # all chat data, per line
chat_data_per_chat = []   # all chat data, per chat
chat_seq_number = 0       # counter for the number of sessions
firstline = True          # ensures we do not immediately write a new data row

root = Tk()
root.filename = filedialog.askopenfilename(
    initialdir="/", title="Select file",
    filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, newline='', encoding='utf-8') as csvfile:
    chatreader = csv.reader(csvfile, delimiter=';', quotechar='"',
                            strict=True, skipinitialspace=True)
    next(chatreader, None)  # quick-and-dirty way to skip the file header

    # first, read in all chats
    for row in chatreader:
        sentence = re.sub(r'([^\s\w]|_)+', ' ', row[5])
        sentence_number = row[2]
        if sentence_number == "1" and firstline == False:
            session = {'session_lines': session_lines,
                       'session_id': previous_row[8],
                       'NPS': previous_row[10]}
            chats.append(session)
            # Because I only know a new chat has started once the counter is
            # back at 1, the previous row is stored on every line.
            session_lines = []
        session_lines.append(sentence)
        previous_row = row
        firstline = False

for chat in chats:
    tot_lines = 0               # line number
    tot_customer_sentences = 0  # lines typed by the customer
    session_all_words = ""      # every sentence in the chat, joined into one long string
    for sentence in chat.get('session_lines'):
        tot_lines += 1
        if (sentence != "ClientStarted" and sentence != "The conversation starts"
                and sentence != "ClientClosed"
                and sentence != "The conversation resumes"
                and sentence != "OrdersRetrieved?"
                and sentence != "WaitingForAPI OrderInfo"
                and sentence != "Save account information for reporting"
                and sentence != "WaitingForAPI: OrderInfo"
                and sentence != "SessionEnd"
                and sentence != "UserInvitedAgent"
                and sentence != "SessionTimeout"
                and sentence != "MessagingServerPickedUpEMail"
                and sentence != "CheckEmailSent"
                and sentence != "email backoffice service"
                and "encrypted" not in sentence
                and "geklikt op " not in sentence):
            tot_customer_sentences += 1
            # print("sentence:", sentence)
            for word in sentence.split(" "):
                checkword = re.sub("[^a-zA-Z]+", "", word)
                if checkword != "":
                    session_all_words += checkword + " "

            matches = check_matches(sentence)
            scores = calc_scores(len(sentence.rstrip().split(" ")), matches)

            chat_data_per_line.append([chat.get('session_id'), sentence,
                                       scores.get('num_words'),
                                       matches.get('score_we'),
                                       matches.get('score_posemo'),
                                       matches.get('score_relativ'),
                                       to_comma(scores.get('percentage_we')),
                                       to_comma(scores.get('percentage_posemo')),
                                       to_comma(scores.get('percentage_relativ')),
                                       to_comma(scores.get('p_value')),
                                       scores.get('kwadrant')])

    matches = check_matches(session_all_words)
    scores = calc_scores(len(session_all_words), matches)
    # ChatID, chat length in sentences, number of customer sentences,
    # mean p-value, mean quadrant, NPS
    chat_data_per_chat.append([chat.get('session_id'), tot_lines,
                               tot_customer_sentences,
                               matches.get('score_we'),
                               matches.get('score_posemo'),
                               matches.get('score_relativ'),
                               to_comma(scores.get('percentage_we')),
                               to_comma(scores.get('percentage_posemo')),
                               to_comma(scores.get('percentage_relativ')),
                               to_comma(scores.get('p_value')),
                               scores.get('kwadrant'),
                               chat.get('NPS')])

root.filename = filedialog.asksaveasfilename(
    initialdir="/", title="Select file",
    filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, 'w', encoding='utf-8') as myfile:
    wr = csv.writer(myfile, delimiter=';', quotechar='"', escapechar='_')
    # wr.writerow(chat_session.get_file_header())
    wr.writerow(["Sessie", "Zin", "Aantal woorden in zin", "telling we lijst",
                 "telling posemo lijst", "telling relativ lijst",
                 "percentage we lijst", "percentage posemo lijst",
                 "percentage relativ lijst", "p-waarde", "kwadrant"])
    for row in chat_data_per_line:
        wr.writerow(row)

root.filename = filedialog.asksaveasfilename(
    initialdir="/", title="Select file",
    filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, 'w', encoding='utf-8') as myfile:
    wr = csv.writer(myfile, delimiter=';', quotechar='"', escapechar='_')
    # wr.writerow(chat_session.get_file_header())
    wr.writerow(["Sessie", "Lengte sessie in zinnen", "Aantal Zinnen van klant",
                 "telling we lijst", "telling posemo lijst",
                 "telling relativ lijst", "percentage we lijst",
                 "percentage posemo lijst", "percentage relativ lijst",
                 "p-waarde", "kwadrant", "NPS"])
    for row in chat_data_per_chat:
        wr.writerow(row)
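The quadrant decision inside calc_scores is a logistic regression over the LIWC category percentages. The following compact sketch isolates that decision rule, assuming the coefficients and the .75/.55 "with range" cutoffs shown in the listing above; the function name `classify_a` is illustrative, not from the thesis.

```python
import math

def classify_a(percentage_we, percentage_posemo, percentage_relativ):
    """Logistic decision rule of Classifier A (with range).
    Returns (p_value, kwadrant); sentences whose p-value falls between
    the two cutoffs are left unclassified ("geen")."""
    z = (1.027 - .368 * percentage_we - .029 * percentage_posemo
         + .038 * percentage_relativ)
    p_value = math.exp(z) / (1 + math.exp(z))  # logistic transform
    if p_value >= .75:
        kwadrant = "4"
    elif p_value <= .55:
        kwadrant = "2"
    else:
        kwadrant = "geen"  # inside the .55 to .75 band: no classification
    return p_value, kwadrant
```

For instance, a sentence with 10% "we" words and no posemo/relativ matches yields a low p-value and quadrant "2", while 10% relativ words with no other matches pushes the p-value above .75 and yields quadrant "4".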

Python code Classifier A without range:

# v1 bare minimum
# v2 added session ID in CSV. Added comma notation for numbers with decimals
#    so Excel will accept the data.
# v3 iterating over files in a directory
import tkinter as tk  # for opening the file
import math
import csv
from tkinter import filedialog
from tkinter import *
import re
import os

list_we = list_posemo = list_relativ1 = list_relativ2 = (Lists in Appendix I)


def check_matches(sentence):
    print(sentence)
    score_we = 0
    score_posemo = 0
    score_relativ = 0

    sentencelist = sentence.split(" ")
    # list_we
    for word in sentencelist:
        scrubbed_word = word.lower()
        if scrubbed_word in list_we:
            score_we += 1
        for matchword in list_posemo:
            if not "*" in matchword and matchword == scrubbed_word:
                score_posemo += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_posemo += 1
        for matchword in list_relativ1:
            if not "*" in matchword and matchword == scrubbed_word:
                score_relativ += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_relativ += 1
        for matchword in list_relativ2:
            if not "*" in matchword and matchword == scrubbed_word:
                score_relativ += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_relativ += 1

    return {'score_we': score_we, 'score_posemo': score_posemo,
            'score_relativ': score_relativ}


def calc_scores(sentence_length, matches):
    print(sentence_length)
    percentage_we = matches.get('score_we') / sentence_length * 100
    percentage_posemo = matches.get('score_posemo') / sentence_length * 100
    percentage_relativ = matches.get('score_relativ') / sentence_length * 100

    if (matches.get('score_we') == 0 and matches.get('score_posemo') == 0
            and matches.get('score_relativ') == 0):
        p_value = 0
        kwadrant = "geen"
    else:
        p_value = math.exp(1.027 - .368 * percentage_we - .029 * percentage_posemo
                           + .038 * percentage_relativ) / (
            1 + math.exp(1.027 - .368 * percentage_we - .029 * percentage_posemo
                         + .038 * percentage_relativ))
        if p_value >= .65:
            kwadrant = "4"
        elif p_value < .65:
            kwadrant = "2"

    return {'num_words': sentence_length, 'percentage_we': percentage_we,
            'percentage_posemo': percentage_posemo,
            'percentage_relativ': percentage_relativ,
            'p_value': p_value, 'kwadrant': kwadrant}


def to_comma(original_number):
    return str(original_number).replace(".", ",")


chats = []
session_lines = []        # sentences per session; overwritten each time
chat_sentences = []       # all chat sentences
chat_data_per_line = []   # all chat data, per line
chat_data_per_chat = []   # all chat data, per chat
chat_seq_number = 0       # counter for the number of sessions
firstline = True          # ensures we do not immediately write a new data row

root = Tk()
root.filename = filedialog.askopenfilename(
    initialdir="/", title="Select file",
    filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, newline='', encoding='utf-8') as csvfile:
    chatreader = csv.reader(csvfile, delimiter=';', quotechar='"',
                            strict=True, skipinitialspace=True)
    next(chatreader, None)  # quick-and-dirty way to skip the file header

    # first, read in all chats
    for row in chatreader:
        sentence = re.sub(r'([^\s\w]|_)+', ' ', row[5])
        sentence_number = row[2]
        if sentence_number == "1" and firstline == False:
            session = {'session_lines': session_lines,
                       'session_id': previous_row[8],
                       'NPS': previous_row[10]}
            chats.append(session)
            # Because I only know a new chat has started once the counter is
            # back at 1, the previous row is stored on every line.
            session_lines = []
        session_lines.append(sentence)
        previous_row = row
        firstline = False

for chat in chats:
    tot_lines = 0               # line number
    tot_customer_sentences = 0  # lines typed by the customer
    session_all_words = ""      # every sentence in the chat, joined into one long string
    for sentence in chat.get('session_lines'):
        tot_lines += 1
        if (sentence != "ClientStarted" and sentence != "The conversation starts"
                and sentence != "ClientClosed"
                and sentence != "The conversation resumes"
                and sentence != "OrdersRetrieved?"
                and sentence != "WaitingForAPI OrderInfo"
                and sentence != "Save account information for reporting"
                and sentence != "WaitingForAPI: OrderInfo"
                and sentence != "SessionEnd"
                and sentence != "UserInvitedAgent"
                and sentence != "SessionTimeout"
                and sentence != "MessagingServerPickedUpEMail"
                and sentence != "CheckEmailSent"
                and sentence != "email backoffice service"
                and "encrypted" not in sentence
                and "geklikt op " not in sentence):
            tot_customer_sentences += 1
            # print("sentence:", sentence)
            for word in sentence.split(" "):
                checkword = re.sub("[^a-zA-Z]+", "", word)
                if checkword != "":
                    session_all_words += checkword + " "

            matches = check_matches(sentence)
            scores = calc_scores(len(sentence.rstrip().split(" ")), matches)

            chat_data_per_line.append([chat.get('session_id'), sentence,
                                       scores.get('num_words'),
                                       matches.get('score_we'),
                                       matches.get('score_posemo'),
                                       matches.get('score_relativ'),
                                       to_comma(scores.get('percentage_we')),
                                       to_comma(scores.get('percentage_posemo')),
                                       to_comma(scores.get('percentage_relativ')),
                                       to_comma(scores.get('p_value')),
                                       scores.get('kwadrant')])

    matches = check_matches(session_all_words)
    scores = calc_scores(len(session_all_words), matches)
    # ChatID, chat length in sentences, number of customer sentences,
    # mean p-value, mean quadrant, NPS
    chat_data_per_chat.append([chat.get('session_id'), tot_lines,
                               tot_customer_sentences,
                               matches.get('score_we'),
                               matches.get('score_posemo'),
                               matches.get('score_relativ'),
                               to_comma(scores.get('percentage_we')),
                               to_comma(scores.get('percentage_posemo')),
                               to_comma(scores.get('percentage_relativ')),
                               to_comma(scores.get('p_value')),
                               scores.get('kwadrant'),
                               chat.get('NPS')])

root.filename = filedialog.asksaveasfilename(
    initialdir="/", title="Select file",
    filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, 'w', encoding='utf-8') as myfile:
    wr = csv.writer(myfile, delimiter=';', quotechar='"', escapechar='_')
    # wr.writerow(chat_session.get_file_header())
    wr.writerow(["Sessie", "Zin", "Aantal woorden in zin", "telling we lijst",
                 "telling posemo lijst", "telling relativ lijst",
                 "percentage we lijst", "percentage posemo lijst",
                 "percentage relativ lijst", "p-waarde", "kwadrant"])
    for row in chat_data_per_line:
        wr.writerow(row)

root.filename = filedialog.asksaveasfilename(
    initialdir="/", title="Select file",
    filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, 'w', encoding='utf-8') as myfile:
    wr = csv.writer(myfile, delimiter=';', quotechar='"', escapechar='_')
    # wr.writerow(chat_session.get_file_header())
    wr.writerow(["Sessie", "Lengte sessie in zinnen", "Aantal Zinnen van klant",
                 "telling we lijst", "telling posemo lijst",
                 "telling relativ lijst", "percentage we lijst",
                 "percentage posemo lijst", "percentage relativ lijst",
                 "p-waarde", "kwadrant", "NPS"])
    for row in chat_data_per_chat:
        wr.writerow(row)


Python code Classifier B1 with range:

# v1 bare minimum
# v2 added session ID in CSV. Added comma notation for numbers with decimals
#    so Excel will accept the data.
# v3 iterating over files in a directory
# coding=utf-8
import tkinter as tk  # for opening the file
import math
import csv
from tkinter import filedialog
from tkinter import *
import re
import os

# The first negemo list is defined as list_negemo1, matching its use below.
list_insight = list_posemo = list_negemo1 = list_negemo2 = \
    list_relativ1 = list_relativ2 = (Lists in Appendix I)


def check_matches(sentence):
    print(sentence)
    score_insight = 0
    score_posemo = 0
    score_negemo = 0
    score_relativ = 0

    sentencelist = sentence.split(" ")
    # searching for words in the lists + count
    for word in sentencelist:
        scrubbed_word = word.lower()
        for matchword in list_insight:
            if not "*" in matchword and matchword == scrubbed_word:
                score_insight += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_insight += 1
        for matchword in list_posemo:
            if not "*" in matchword and matchword == scrubbed_word:
                score_posemo += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_posemo += 1
        for matchword in list_relativ1:
            if not "*" in matchword and matchword == scrubbed_word:
                score_relativ += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_relativ += 1
        for matchword in list_relativ2:
            if not "*" in matchword and matchword == scrubbed_word:
                score_relativ += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_relativ += 1
        for matchword in list_negemo1:
            if not "*" in matchword and matchword == scrubbed_word:
                score_negemo += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_negemo += 1
        for matchword in list_negemo2:
            if not "*" in matchword and matchword == scrubbed_word:
                score_negemo += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_negemo += 1

    return {'score_insight': score_insight, 'score_posemo': score_posemo,
            'score_negemo': score_negemo, 'score_relativ': score_relativ}


def calc_scores(sentence_length, matches):
    print(sentence_length)
    percentage_insight = matches.get('score_insight') / sentence_length * 100
    percentage_posemo = matches.get('score_posemo') / sentence_length * 100
    percentage_negemo = matches.get('score_negemo') / sentence_length * 100
    percentage_relativ = matches.get('score_relativ') / sentence_length * 100

    if (matches.get('score_insight') == 0 and matches.get('score_posemo') == 0
            and matches.get('score_relativ') == 0
            and matches.get('score_negemo') == 0):
        p_value = 0
        kwadrant = "geen"
    else:
        p_value = math.exp(.146 - .126 * percentage_posemo - .057 * percentage_insight
                           + .033 * percentage_relativ + 0.071 * percentage_negemo) / (
            1 + math.exp(.146 - .126 * percentage_posemo - .057 * percentage_insight
                         + .033 * percentage_relativ + 0.071 * percentage_negemo))
        if p_value >= .55:
            kwadrant = "4"
        elif p_value <= .45:
            kwadrant = "2"
        else:
            kwadrant = "geen"

    return {'num_words': sentence_length, 'percentage_insight': percentage_insight,
            'percentage_posemo': percentage_posemo,
            'percentage_negemo': percentage_negemo,
            'percentage_relativ': percentage_relativ,
            'p_value': p_value, 'kwadrant': kwadrant}


def to_comma(original_number):
    return str(original_number).replace(".", ",")


chats = []
session_lines = []        # sentences per session; overwritten each time
chat_sentences = []       # all chat sentences
chat_data_per_line = []   # all chat data, per line
chat_data_per_chat = []   # all chat data, per chat
chat_seq_number = 0       # counter for the number of sessions
firstline = True          # ensures we do not immediately write a new data row

root = Tk()
root.filename = filedialog.askopenfilename(
    initialdir="/", title="Select file",
    filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, newline='', encoding='utf-8') as csvfile:
    chatreader = csv.reader(csvfile, delimiter=';', quotechar='"',
                            strict=True, skipinitialspace=True)
    next(chatreader, None)  # quick-and-dirty way to skip the file header

    # first, read in all chats
    for row in chatreader:
        sentence = re.sub(r'([^\s\w]|_)+', ' ', row[5])
        sentence_number = row[2]
        if sentence_number == "1" and firstline == False:
            session = {'session_lines': session_lines,
                       'session_id': previous_row[8],
                       'NPS': previous_row[10]}
            chats.append(session)
            # Because I only know a new chat has started once the counter is
            # back at 1, the previous row is stored on every line.
            session_lines = []
        session_lines.append(sentence)
        previous_row = row
        firstline = False

for chat in chats:
    tot_lines = 0               # line number
    tot_customer_sentences = 0  # lines typed by the customer
    session_all_words = ""      # every sentence in the chat, joined into one long string
    for sentence in chat.get('session_lines'):
        tot_lines += 1
        if (sentence != "ClientStarted" and sentence != "The conversation starts"
                and sentence != "ClientClosed"
                and sentence != "The conversation resumes"
                and sentence != "OrdersRetrieved?"
                and sentence != "WaitingForAPI OrderInfo"
                and sentence != "Save account information for reporting"
                and sentence != "WaitingForAPI: OrderInfo"
                and sentence != "SessionEnd"
                and sentence != "UserInvitedAgent"
                and sentence != "SessionTimeout"
                and sentence != "MessagingServerPickedUpEMail"
                and sentence != "CheckEmailSent"
                and sentence != "email backoffice service"
                and "encrypted" not in sentence
                and "geklikt op " not in sentence):
            tot_customer_sentences += 1
            # print("sentence:", sentence)
            for word in sentence.split(" "):
                checkword = re.sub("[^a-zA-Z]+", "", word)
                if checkword != "":
                    session_all_words += checkword + " "

            matches = check_matches(sentence)
            scores = calc_scores(len(sentence.rstrip().split(" ")), matches)

            chat_data_per_line.append([chat.get('session_id'), sentence,
                                       scores.get('num_words'),
                                       matches.get('score_insight'),
                                       matches.get('score_posemo'),
                                       matches.get('score_relativ'),
                                       matches.get('score_negemo'),
                                       to_comma(scores.get('percentage_insight')),
                                       to_comma(scores.get('percentage_posemo')),
                                       to_comma(scores.get('percentage_relativ')),
                                       to_comma(scores.get('percentage_negemo')),
                                       to_comma(scores.get('p_value')),
                                       scores.get('kwadrant')])

    matches = check_matches(session_all_words)
    scores = calc_scores(len(session_all_words), matches)
    # ChatID, chat length in sentences, number of customer sentences,
    # mean p-value, mean quadrant, NPS
    chat_data_per_chat.append([chat.get('session_id'), tot_lines,
                               tot_customer_sentences,
                               matches.get('score_insight'),
                               matches.get('score_posemo'),
                               matches.get('score_relativ'),
                               matches.get('score_negemo'),
                               to_comma(scores.get('percentage_insight')),
                               to_comma(scores.get('percentage_posemo')),
                               to_comma(scores.get('percentage_relativ')),
                               to_comma(scores.get('percentage_negemo')),
                               to_comma(scores.get('p_value')),
                               scores.get('kwadrant'),
                               chat.get('NPS')])

root.filename = filedialog.asksaveasfilename(
    initialdir="/", title="Select file",
    filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, 'w', encoding='utf-8') as myfile:
    wr = csv.writer(myfile, delimiter=';', quotechar='"', escapechar='_')
    # wr.writerow(chat_session.get_file_header())
    wr.writerow(["Sessie", "Zin", "Aantal woorden in zin",
                 "telling insight lijst", "telling posemo lijst",
                 "telling relativ lijst", "telling negemo lijst",
                 "percentage insight lijst", "percentage posemo lijst",
                 "percentage relativ lijst", "percentage negemo lijst",
                 "p-waarde", "kwadrant"])
    for row in chat_data_per_line:
        wr.writerow(row)

root.filename = filedialog.asksaveasfilename(
    initialdir="/", title="Select file",
    filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, 'w', encoding='utf-8') as myfile:
    wr = csv.writer(myfile, delimiter=';', quotechar='"', escapechar='_')
    # wr.writerow(chat_session.get_file_header())
    wr.writerow(["Sessie", "Lengte sessie in zinnen", "Aantal Zinnen van klant",
                 "telling insight lijst", "telling posemo lijst",
                 "telling relativ lijst", "telling negemo lijst",
                 "percentage insight lijst", "percentage posemo lijst",
                 "percentage relativ lijst", "percentage negemo lijst",
                 "p-waarde", "kwadrant", "NPS"])
    for row in chat_data_per_chat:
        wr.writerow(row)

Python code Classifier B1 without range:

# v1 bare minimum
# v2 added session ID in CSV. Added comma notation for numbers with decimals
#    so Excel will accept the data.
# v3 iterating over files in a directory
# coding=utf-8
import tkinter as tk  # for opening the file
import math
import csv
from tkinter import filedialog
from tkinter import *
import re
import os

# The first negemo list is defined as list_negemo1, matching its use below.
list_insight = list_posemo = list_negemo1 = list_negemo2 = \
    list_relativ1 = list_relativ2 = (Lists in Appendix I)


def check_matches(sentence):
    print(sentence)
    score_insight = 0
    score_posemo = 0
    score_negemo = 0
    score_relativ = 0

    sentencelist = sentence.split(" ")
    # searching for words in the lists + count
    for word in sentencelist:
        scrubbed_word = word.lower()
        for matchword in list_insight:
            if not "*" in matchword and matchword == scrubbed_word:
                score_insight += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_insight += 1
        for matchword in list_posemo:
            if not "*" in matchword and matchword == scrubbed_word:
                score_posemo += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_posemo += 1
        for matchword in list_relativ1:
            if not "*" in matchword and matchword == scrubbed_word:
                score_relativ += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_relativ += 1
        for matchword in list_relativ2:
            if not "*" in matchword and matchword == scrubbed_word:
                score_relativ += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_relativ += 1
        for matchword in list_negemo1:
            if not "*" in matchword and matchword == scrubbed_word:
                score_negemo += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_negemo += 1
        for matchword in list_negemo2:
            if not "*" in matchword and matchword == scrubbed_word:
                score_negemo += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_negemo += 1

    return {'score_insight': score_insight, 'score_posemo': score_posemo,
            'score_negemo': score_negemo, 'score_relativ': score_relativ}


def calc_scores(sentence_length, matches):
    print(sentence_length)
    percentage_insight = matches.get('score_insight') / sentence_length * 100
    percentage_posemo = matches.get('score_posemo') / sentence_length * 100
    percentage_negemo = matches.get('score_negemo') / sentence_length * 100
    percentage_relativ = matches.get('score_relativ') / sentence_length * 100

    if (matches.get('score_insight') == 0 and matches.get('score_posemo') == 0
            and matches.get('score_relativ') == 0
            and matches.get('score_negemo') == 0):
        p_value = 0
        kwadrant = "geen"
    else:
        p_value = math.exp(.146 - .126 * percentage_posemo - .057 * percentage_insight
                           + .033 * percentage_relativ + 0.071 * percentage_negemo) / (
            1 + math.exp(.146 - .126 * percentage_posemo - .057 * percentage_insight
                         + .033 * percentage_relativ + 0.071 * percentage_negemo))
        if p_value >= .54:
            kwadrant = "4"
        elif p_value < .54:
            kwadrant = "2"

    return {'num_words': sentence_length, 'percentage_insight': percentage_insight,
            'percentage_posemo': percentage_posemo,
            'percentage_negemo': percentage_negemo,
            'percentage_relativ': percentage_relativ,
            'p_value': p_value, 'kwadrant': kwadrant}


def to_comma(original_number):
    return str(original_number).replace(".", ",")


chats = []
session_lines = []        # sentences per session; overwritten each time
chat_sentences = []       # all chat sentences
chat_data_per_line = []   # all chat data, per line
chat_data_per_chat = []   # all chat data, per chat
chat_seq_number = 0       # counter for the number of sessions
firstline = True          # ensures we do not immediately write a new data row

root = Tk()
root.filename = filedialog.askopenfilename(
    initialdir="/", title="Select file",
    filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, newline='', encoding='utf-8') as csvfile:
    chatreader = csv.reader(csvfile, delimiter=';', quotechar='"',
                            strict=True, skipinitialspace=True)
    next(chatreader, None)  # quick-and-dirty way to skip the file header

    # first, read in all chats
    for row in chatreader:
        sentence = re.sub(r'([^\s\w]|_)+', ' ', row[5])
        sentence_number = row[2]
        if sentence_number == "1" and firstline == False:
            session = {'session_lines': session_lines,
                       'session_id': previous_row[8],
                       'NPS': previous_row[10]}
            chats.append(session)
            # Because I only know a new chat has started once the counter is
            # back at 1, the previous row is stored on every line.
            session_lines = []
        session_lines.append(sentence)
        previous_row = row
        firstline = False

for chat in chats:
    tot_lines = 0               # line number
    tot_customer_sentences = 0  # lines typed by the customer
    session_all_words = ""      # every sentence in the chat, joined into one long string
    for sentence in chat.get('session_lines'):
        tot_lines += 1
        if (sentence != "ClientStarted" and sentence != "The conversation starts"
                and sentence != "ClientClosed"
                and sentence != "The conversation resumes"
                and sentence != "OrdersRetrieved?"
                and sentence != "WaitingForAPI OrderInfo"
                and sentence != "Save account information for reporting"
                and sentence != "WaitingForAPI: OrderInfo"
                and sentence != "SessionEnd"
                and sentence != "UserInvitedAgent"
                and sentence != "SessionTimeout"
                and sentence != "MessagingServerPickedUpEMail"
                and sentence != "CheckEmailSent"
                and sentence != "email backoffice service"
                and "encrypted" not in sentence
                and "geklikt op " not in sentence):
            tot_customer_sentences += 1
            # print("sentence:", sentence)
            for word in sentence.split(" "):
                checkword = re.sub("[^a-zA-Z]+", "", word)
                if checkword != "":
                    session_all_words += checkword + " "

            matches = check_matches(sentence)
            scores = calc_scores(len(sentence.rstrip().split(" ")), matches)

            chat_data_per_line.append([chat.get('session_id'), sentence,
                                       scores.get('num_words'),
                                       matches.get('score_insight'),
                                       matches.get('score_posemo'),
                                       matches.get('score_relativ'),
                                       matches.get('score_negemo'),
                                       to_comma(scores.get('percentage_insight')),
                                       to_comma(scores.get('percentage_posemo')),
                                       to_comma(scores.get('percentage_relativ')),
                                       to_comma(scores.get('percentage_negemo')),
                                       to_comma(scores.get('p_value')),
                                       scores.get('kwadrant')])

    matches = check_matches(session_all_words)
    scores = calc_scores(len(session_all_words), matches)
    # ChatID, chat length in sentences, number of customer sentences,
    # mean p-value, mean quadrant, NPS
    chat_data_per_chat.append([chat.get('session_id'), tot_lines,
                               tot_customer_sentences,
                               matches.get('score_insight'),
                               matches.get('score_posemo'),
                               matches.get('score_relativ'),
                               matches.get('score_negemo'),
                               to_comma(scores.get('percentage_insight')),
                               to_comma(scores.get('percentage_posemo')),
                               to_comma(scores.get('percentage_relativ')),
                               to_comma(scores.get('percentage_negemo')),
                               to_comma(scores.get('p_value')),
                               scores.get('kwadrant'),
                               chat.get('NPS')])

root.filename = filedialog.asksaveasfilename(
    initialdir="/", title="Select file",
    filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, 'w', encoding='utf-8') as myfile:
    wr = csv.writer(myfile, delimiter=';', quotechar='"', escapechar='_')
    # wr.writerow(chat_session.get_file_header())
    wr.writerow(["Sessie", "Zin", "Aantal woorden in zin",
                 "telling insight lijst", "telling posemo lijst",
                 "telling relativ lijst", "telling negemo lijst",
                 "percentage insight lijst", "percentage posemo lijst",
                 "percentage relativ lijst", "percentage negemo lijst",
                 "p-waarde", "kwadrant"])
    for row in chat_data_per_line:
        wr.writerow(row)

root.filename = filedialog.asksaveasfilename(initialdir="/", title="Select file",

filetypes=(("csv files", "*.csv"), ("all files", "*.*"))) with open(root.filename, 'w', encoding='utf-8') as myfile: wr = csv.writer(myfile, delimiter=';', quotechar='"', escapechar='_') # wr.writerow(chat_session.get_file_header()) wr.writerow( ["Sessie", "Lengte sessie in zinnen", "Aantal Zinnen van klant", "telling insight lijst", "telling posemo lijst", "telling relativ lijst", "telling negemo lijst", "percentage insight lijst", "percentage posemo lijst", "percentage relativ lijst", "percentage negemo lijst", "p-waarde", "kwadrant", "NPS"]) for row in chat_data_per_chat: wr.writerow(row)
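The counting loops in check_matches (listed in full for Classifier B2 below) all apply the same LIWC-style wildcard rule: a dictionary entry without "*" must equal the lowercased word exactly, while a trailing "*" turns the entry into a prefix pattern. As a minimal sketch of that rule in isolation (the helper name liwc_match is ours and does not appear in the listings):

```python
def liwc_match(matchword, word):
    """Replicates the matching rule used in check_matches.

    A trailing '*' in a dictionary entry makes it a prefix pattern;
    otherwise the match must be exact (case-insensitive on the word side).
    """
    scrubbed = word.lower()
    if "*" not in matchword:
        return matchword == scrubbed
    # As in the listings: compare the first len(matchword) - 1 characters
    # of both strings, i.e. the entry with its '*' stripped.
    n = len(matchword) - 1
    return matchword[:n] == scrubbed[:n]
```

Note that the comparison truncates both strings to len(matchword) - 1 characters, exactly as written in the listings, so "geluk*" matches "gelukkig" while "gemak" does not match "gemakkelijk".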

Python code Classifier B2 with range:

# v1 bare minimum
# v2 added session ID in CSV; added comma notation for numbers with decimals so Excel will accept the data
# v3 iterating over files in a directory

# coding=utf-8
import tkinter as tk  # for opening the file
import math
import csv
from tkinter import filedialog
from tkinter import *
import re
import os

list_insight = list_posemo = list_negemo1 = list_negemo2 = list_relativ1 = list_relativ2 = list_money = (Lists in Appendix I)


def check_matches(sentence):
    print(sentence)
    score_insight = 0
    score_posemo = 0
    score_negemo = 0
    score_relativ = 0
    score_money = 0

    sentencelist = sentence.split(" ")
    # search for words from the lists and count the matches
    for word in sentencelist:
        scrubbed_word = word.lower()
        for matchword in list_insight:
            if not "*" in matchword and matchword == scrubbed_word:
                score_insight += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_insight += 1
        for matchword in list_posemo:
            if not "*" in matchword and matchword == scrubbed_word:
                score_posemo += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_posemo += 1
        for matchword in list_relativ1:
            if not "*" in matchword and matchword == scrubbed_word:
                score_relativ += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_relativ += 1
        for matchword in list_relativ2:
            if not "*" in matchword and matchword == scrubbed_word:
                score_relativ += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_relativ += 1
        for matchword in list_negemo1:
            if not "*" in matchword and matchword == scrubbed_word:
                score_negemo += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_negemo += 1
        for matchword in list_negemo2:
            if not "*" in matchword and matchword == scrubbed_word:
                score_negemo += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_negemo += 1
        for matchword in list_money:
            if not "*" in matchword and matchword == scrubbed_word:
                score_money += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_money += 1

    return {'score_insight': score_insight, 'score_posemo': score_posemo,
            'score_negemo': score_negemo, 'score_relativ': score_relativ,
            'score_money': score_money}


def calc_scores(sentence_length, matches):
    print(sentence_length)
    percentage_insight = matches.get('score_insight') / sentence_length * 100
    percentage_posemo = matches.get('score_posemo') / sentence_length * 100
    percentage_negemo = matches.get('score_negemo') / sentence_length * 100
    percentage_relativ = matches.get('score_relativ') / sentence_length * 100
    percentage_money = matches.get('score_money') / sentence_length * 100

    if (matches.get('score_insight') == 0 and matches.get('score_posemo') == 0 and
            matches.get('score_relativ') == 0 and matches.get('score_money') == 0 and
            matches.get('score_negemo') == 0):
        p_value = 0
        kwadrant = "geen"
    else:
        # logistic regression on the LIWC percentages
        z = (0.609 - 0.136 * percentage_posemo - 0.075 * percentage_insight
             + 0.024 * percentage_relativ + 0.056 * percentage_negemo
             - 0.151 * percentage_money)
        p_value = math.exp(z) / (1 + math.exp(z))
        if p_value >= .60:
            kwadrant = "4"
        elif p_value <= .40:
            kwadrant = "2"
        else:
            kwadrant = "geen"

    return {'num_words': sentence_length,
            'percentage_insight': percentage_insight,
            'percentage_posemo': percentage_posemo,
            'percentage_negemo': percentage_negemo,
            'percentage_relativ': percentage_relativ,
            'percentage_money': percentage_money,
            'p_value': p_value,
            'kwadrant': kwadrant}


def to_comma(original_number):
    return str(original_number).replace(".", ",")


chats = []
session_lines = []       # Sentences per session; overwritten for every session
chat_sentences = []      # All chat sentences
chat_data_per_line = []  # All chat data, per line
chat_data_per_chat = []  # All chat data, per chat
chat_seq_number = 0      # Counter for the number of sessions
firstline = True         # Makes sure we do not write a data row right away

root = Tk()
root.filename = filedialog.askopenfilename(initialdir="/", title="Select file",
                                           filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, newline='', encoding='utf-8') as csvfile:
    chatreader = csv.reader(csvfile, delimiter=';', quotechar='"', strict=True, skipinitialspace=True)
    next(chatreader, None)  # quick-and-dirty way to skip the file header

    # First, read all chats in cleanly
    for row in chatreader:
        sentence = re.sub(r'([^\s\w]|_)+', ' ', row[5])
        sentence_number = row[2]
        if sentence_number == "1" and not firstline:
            session = {'session_lines': session_lines,
                       'session_id': previous_row[8],
                       'NPS': previous_row[10]}
            chats.append(session)
            # A new chat only becomes apparent once the counter is back at 1,
            # so the previous row is stored on every iteration.
            session_lines = []
        session_lines.append(sentence)
        previous_row = row
        firstline = False

for chat in chats:
    tot_lines = 0               # line number
    tot_customer_sentences = 0  # lines typed by the customer
    session_all_words = ""      # every sentence said in the chat, joined into one long string
    for sentence in chat.get('session_lines'):
        tot_lines += 1
        if (sentence != "ClientStarted" and sentence != "The conversation starts" and
                sentence != "ClientClosed" and sentence != "The conversation resumes" and
                sentence != "OrdersRetrieved?" and sentence != "WaitingForAPI OrderInfo" and
                sentence != "Save account information for reporting" and
                sentence != "WaitingForAPI: OrderInfo" and sentence != "SessionEnd" and
                sentence != "UserInvitedAgent" and sentence != "SessionTimeout" and
                sentence != "MessagingServerPickedUpEMail" and sentence != "CheckEmailSent" and
                sentence != "email backoffice service" and
                "encrypted" not in sentence and "geklikt op " not in sentence):
            tot_customer_sentences += 1

            # print("sentence:", sentence)

            for word in sentence.split(" "):
                checkword = re.sub("[^a-zA-Z]+", "", word)
                if checkword != "":
                    session_all_words += checkword + " "

            matches = check_matches(sentence)
            scores = calc_scores(len(sentence.rstrip().split(" ")), matches)

            chat_data_per_line.append([chat.get('session_id'), sentence, scores.get('num_words'),
                                       matches.get('score_insight'), matches.get('score_posemo'),
                                       matches.get('score_relativ'), matches.get('score_negemo'),
                                       matches.get('score_money'),
                                       to_comma(scores.get('percentage_insight')),
                                       to_comma(scores.get('percentage_posemo')),
                                       to_comma(scores.get('percentage_relativ')),
                                       to_comma(scores.get('percentage_negemo')),
                                       to_comma(scores.get('percentage_money')),
                                       to_comma(scores.get('p_value')),
                                       scores.get('kwadrant')])

    matches = check_matches(session_all_words)
    # Note: len() of the joined string is a character count here, not a word count.
    scores = calc_scores(len(session_all_words), matches)
    # ChatID, chat length in sentences, number of customer sentences,
    # mean p-value, mean quadrant, NPS
    chat_data_per_chat.append([chat.get('session_id'), tot_lines, tot_customer_sentences,
                               matches.get('score_insight'), matches.get('score_posemo'),
                               matches.get('score_relativ'), matches.get('score_negemo'),
                               matches.get('score_money'),
                               to_comma(scores.get('percentage_insight')),
                               to_comma(scores.get('percentage_posemo')),
                               to_comma(scores.get('percentage_relativ')),
                               to_comma(scores.get('percentage_negemo')),
                               to_comma(scores.get('percentage_money')),
                               to_comma(scores.get('p_value')),
                               scores.get('kwadrant'),
                               chat.get('NPS')])

root.filename = filedialog.asksaveasfilename(initialdir="/", title="Select file",
                                             filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, 'w', encoding='utf-8') as myfile:
    wr = csv.writer(myfile, delimiter=';', quotechar='"', escapechar='_')
    # wr.writerow(chat_session.get_file_header())
    wr.writerow(["Sessie", "Zin", "Aantal woorden in zin",
                 "telling insight lijst", "telling posemo lijst",
                 "telling relativ lijst", "telling negemo lijst", "telling money lijst",
                 "percentage insight lijst", "percentage posemo lijst",
                 "percentage relativ lijst", "percentage negemo lijst",
                 "percentage money lijst", "p-waarde", "kwadrant"])
    for row in chat_data_per_line:
        wr.writerow(row)

root.filename = filedialog.asksaveasfilename(initialdir="/", title="Select file",
                                             filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, 'w', encoding='utf-8') as myfile:
    wr = csv.writer(myfile, delimiter=';', quotechar='"', escapechar='_')
    # wr.writerow(chat_session.get_file_header())
    wr.writerow(["Sessie", "Lengte sessie in zinnen", "Aantal Zinnen van klant",
                 "telling insight lijst", "telling posemo lijst",
                 "telling relativ lijst", "telling negemo lijst", "telling money lijst",
                 "percentage insight lijst", "percentage posemo lijst",
                 "percentage relativ lijst", "percentage negemo lijst",
                 "percentage money lijst", "p-waarde", "kwadrant", "NPS"])
    for row in chat_data_per_chat:
        wr.writerow(row)

Python code Classifier B2 without range:

# v1 bare minimum
# v2 added session ID in CSV; added comma notation for numbers with decimals so Excel will accept the data
# v3 iterating over files in a directory

# coding=utf-8
import tkinter as tk  # for opening the file
import math
import csv
from tkinter import filedialog
from tkinter import *
import re
import os

list_insight = list_posemo = list_negemo1 = list_negemo2 = list_relativ1 = list_relativ2 = list_money = (Lists in Appendix I)


def check_matches(sentence):
    print(sentence)
    score_insight = 0
    score_posemo = 0
    score_negemo = 0
    score_relativ = 0
    score_money = 0

    sentencelist = sentence.split(" ")
    # search for words from the lists and count the matches
    for word in sentencelist:
        scrubbed_word = word.lower()
        for matchword in list_insight:
            if not "*" in matchword and matchword == scrubbed_word:
                score_insight += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_insight += 1
        for matchword in list_posemo:
            if not "*" in matchword and matchword == scrubbed_word:
                score_posemo += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_posemo += 1
        for matchword in list_relativ1:
            if not "*" in matchword and matchword == scrubbed_word:
                score_relativ += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_relativ += 1
        for matchword in list_relativ2:
            if not "*" in matchword and matchword == scrubbed_word:
                score_relativ += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_relativ += 1
        for matchword in list_negemo1:
            if not "*" in matchword and matchword == scrubbed_word:
                score_negemo += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_negemo += 1
        for matchword in list_negemo2:
            if not "*" in matchword and matchword == scrubbed_word:
                score_negemo += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_negemo += 1
        for matchword in list_money:
            if not "*" in matchword and matchword == scrubbed_word:
                score_money += 1
            elif "*" in matchword and matchword[:len(matchword) - 1] == scrubbed_word[:len(matchword) - 1]:
                score_money += 1

    return {'score_insight': score_insight, 'score_posemo': score_posemo,
            'score_negemo': score_negemo, 'score_relativ': score_relativ,
            'score_money': score_money}


def calc_scores(sentence_length, matches):
    print(sentence_length)
    percentage_insight = matches.get('score_insight') / sentence_length * 100
    percentage_posemo = matches.get('score_posemo') / sentence_length * 100
    percentage_negemo = matches.get('score_negemo') / sentence_length * 100
    percentage_relativ = matches.get('score_relativ') / sentence_length * 100
    percentage_money = matches.get('score_money') / sentence_length * 100

    if (matches.get('score_insight') == 0 and matches.get('score_posemo') == 0 and
            matches.get('score_relativ') == 0 and matches.get('score_money') == 0 and
            matches.get('score_negemo') == 0):
        p_value = 0
        kwadrant = "geen"
    else:
        # logistic regression on the LIWC percentages
        z = (0.609 - 0.136 * percentage_posemo - 0.075 * percentage_insight
             + 0.024 * percentage_relativ + 0.056 * percentage_negemo
             - 0.151 * percentage_money)
        p_value = math.exp(z) / (1 + math.exp(z))
        if p_value >= .51:
            kwadrant = "4"
        elif p_value < .51:
            kwadrant = "2"

    return {'num_words': sentence_length,
            'percentage_insight': percentage_insight,
            'percentage_posemo': percentage_posemo,
            'percentage_negemo': percentage_negemo,
            'percentage_relativ': percentage_relativ,
            'percentage_money': percentage_money,
            'p_value': p_value,
            'kwadrant': kwadrant}


def to_comma(original_number):
    return str(original_number).replace(".", ",")


chats = []
session_lines = []       # Sentences per session; overwritten for every session
chat_sentences = []      # All chat sentences
chat_data_per_line = []  # All chat data, per line
chat_data_per_chat = []  # All chat data, per chat
chat_seq_number = 0      # Counter for the number of sessions
firstline = True         # Makes sure we do not write a data row right away

root = Tk()
root.filename = filedialog.askopenfilename(initialdir="/", title="Select file",
                                           filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, newline='', encoding='utf-8') as csvfile:
    chatreader = csv.reader(csvfile, delimiter=';', quotechar='"', strict=True, skipinitialspace=True)
    next(chatreader, None)  # quick-and-dirty way to skip the file header

    # First, read all chats in cleanly
    for row in chatreader:
        sentence = re.sub(r'([^\s\w]|_)+', ' ', row[5])
        sentence_number = row[2]
        if sentence_number == "1" and not firstline:
            session = {'session_lines': session_lines,
                       'session_id': previous_row[8],
                       'NPS': previous_row[10]}
            chats.append(session)
            # A new chat only becomes apparent once the counter is back at 1,
            # so the previous row is stored on every iteration.
            session_lines = []
        session_lines.append(sentence)
        previous_row = row
        firstline = False

for chat in chats:
    tot_lines = 0               # line number
    tot_customer_sentences = 0  # lines typed by the customer
    session_all_words = ""      # every sentence said in the chat, joined into one long string
    for sentence in chat.get('session_lines'):
        tot_lines += 1
        if (sentence != "ClientStarted" and sentence != "The conversation starts" and
                sentence != "ClientClosed" and sentence != "The conversation resumes" and
                sentence != "OrdersRetrieved?" and sentence != "WaitingForAPI OrderInfo" and
                sentence != "Save account information for reporting" and
                sentence != "WaitingForAPI: OrderInfo" and sentence != "SessionEnd" and
                sentence != "UserInvitedAgent" and sentence != "SessionTimeout" and
                sentence != "MessagingServerPickedUpEMail" and sentence != "CheckEmailSent" and
                sentence != "email backoffice service" and
                "encrypted" not in sentence and "geklikt op " not in sentence):
            tot_customer_sentences += 1

            # print("sentence:", sentence)

            for word in sentence.split(" "):
                checkword = re.sub("[^a-zA-Z]+", "", word)
                if checkword != "":
                    session_all_words += checkword + " "

            matches = check_matches(sentence)
            scores = calc_scores(len(sentence.rstrip().split(" ")), matches)

            chat_data_per_line.append([chat.get('session_id'), sentence, scores.get('num_words'),
                                       matches.get('score_insight'), matches.get('score_posemo'),
                                       matches.get('score_relativ'), matches.get('score_negemo'),
                                       matches.get('score_money'),
                                       to_comma(scores.get('percentage_insight')),
                                       to_comma(scores.get('percentage_posemo')),
                                       to_comma(scores.get('percentage_relativ')),
                                       to_comma(scores.get('percentage_negemo')),
                                       to_comma(scores.get('percentage_money')),
                                       to_comma(scores.get('p_value')),
                                       scores.get('kwadrant')])

    matches = check_matches(session_all_words)
    # Note: len() of the joined string is a character count here, not a word count.
    scores = calc_scores(len(session_all_words), matches)
    # ChatID, chat length in sentences, number of customer sentences,
    # mean p-value, mean quadrant, NPS
    chat_data_per_chat.append([chat.get('session_id'), tot_lines, tot_customer_sentences,
                               matches.get('score_insight'), matches.get('score_posemo'),
                               matches.get('score_relativ'), matches.get('score_negemo'),
                               matches.get('score_money'),
                               to_comma(scores.get('percentage_insight')),
                               to_comma(scores.get('percentage_posemo')),
                               to_comma(scores.get('percentage_relativ')),
                               to_comma(scores.get('percentage_negemo')),
                               to_comma(scores.get('percentage_money')),
                               to_comma(scores.get('p_value')),
                               scores.get('kwadrant'),
                               chat.get('NPS')])

root.filename = filedialog.asksaveasfilename(initialdir="/", title="Select file",
                                             filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, 'w', encoding='utf-8') as myfile:
    wr = csv.writer(myfile, delimiter=';', quotechar='"', escapechar='_')
    # wr.writerow(chat_session.get_file_header())
    wr.writerow(["Sessie", "Zin", "Aantal woorden in zin",
                 "telling insight lijst", "telling posemo lijst",
                 "telling relativ lijst", "telling negemo lijst", "telling money lijst",
                 "percentage insight lijst", "percentage posemo lijst",
                 "percentage relativ lijst", "percentage negemo lijst",
                 "percentage money lijst", "p-waarde", "kwadrant"])
    for row in chat_data_per_line:
        wr.writerow(row)

root.filename = filedialog.asksaveasfilename(initialdir="/", title="Select file",
                                             filetypes=(("csv files", "*.csv"), ("all files", "*.*")))
with open(root.filename, 'w', encoding='utf-8') as myfile:
    wr = csv.writer(myfile, delimiter=';', quotechar='"', escapechar='_')
    # wr.writerow(chat_session.get_file_header())
    wr.writerow(["Sessie", "Lengte sessie in zinnen", "Aantal Zinnen van klant",
                 "telling insight lijst", "telling posemo lijst",
                 "telling relativ lijst", "telling negemo lijst", "telling money lijst",
                 "percentage insight lijst", "percentage posemo lijst",
                 "percentage relativ lijst", "percentage negemo lijst",
                 "percentage money lijst", "p-waarde", "kwadrant", "NPS"])
    for row in chat_data_per_chat:
        wr.writerow(row)
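The two B2 variants share the same logistic model and differ only in how the resulting p-value is mapped onto a quadrant. The sketch below isolates that decision rule; the coefficients are copied verbatim from the calc_scores functions in the listings, while the function names are ours and serve only to make the contrast explicit:

```python
import math

def b2_p_value(pct_posemo, pct_insight, pct_relativ, pct_negemo, pct_money):
    """Logistic transform of the weighted LIWC percentages, as in calc_scores."""
    z = (0.609 - 0.136 * pct_posemo - 0.075 * pct_insight
         + 0.024 * pct_relativ + 0.056 * pct_negemo - 0.151 * pct_money)
    return math.exp(z) / (1 + math.exp(z))

def kwadrant_with_range(p):
    # "With range": an undecided band ("geen") between .40 and .60
    if p >= .60:
        return "4"
    if p <= .40:
        return "2"
    return "geen"

def kwadrant_without_range(p):
    # "Without range": a hard split at .51, every sentence gets a quadrant
    return "4" if p >= .51 else "2"
```

With all percentages at zero, b2_p_value returns the logistic of the intercept alone (about .65), which both variants map to quadrant "4"; the variants only disagree for p-values inside the .40 to .60 band.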


Appendix I

Wordlists for the Python codes list_we = ["ons", "onszelf", "onze", "onzer", "we", "wij", "wijzelf"] list_posemo = ["a.u.b.", "aanbevel*", "aangena*", "aangespoord*", "aanhankelijk", "aanhankelijkheid", "aanlokkelijk*", "aantrekkelijk", "aanvaardba*", "aardig*", "absolute", "accept*", "accoord", "achting", "actief", "affectie*", "akkoord", "alleraangenaamst*", "alleraardigst", "allerliefst*", "alsjeblieft", "alstublieft", "amicaal", "amusant*", "amuse*", "appreci*", "argelo*", "attent", "aub", "avontuur", "avontuurlijk", "baat", "baten", "bedaard", "bedankje*", "bedreven*", "begaafd*", "beger*", "begrijp*", "begrip*", "begunstig*", "behaag*", "behaal*", "bejubelen", "bekoorlijk*", "bekroning*", "bekroond*", "bekwa*", "belang", "belangen", "belangrijk*", "belangstellend", "belangstelling", "beleefd*", "belief*", "believen", "belon*", "beloon*", "bemin", "bemind*", "beminnelijk*", "benefie*", "bereidheid", "bereidwillig", "bereidwilligheid", "beschermd*", "beschut*", "beter", "beterschap", "betoverend*", "betrouwba*", "bevallig*", "beveilig*", "bevoordeel*", "bevoordelen", "bevoorrecht*", "bevorder*", "bevredig*", "bevrijd*", "bewonder*", "beziel*", "blij", "blijdschap", "blijheid", "blijmoedig*", "boeiend", "bonbon*", "bonus*", "braaf", "briljant", "casual", "charita*", "charmant*", "charme", "charmes", "chic", "chique*", "choco*", "comfort", "comfortabel*", "compliment*", "confidentie*", "consider*", "copieus", "creatie*", "creëren", "dapper*", "deernis*", "deugd*", "dierba*", "doeltreffen*", "dol", "dolgraag", "doortastend*", "dotje*", "duidelijk*", "dynamiek", "dynamisch*", "edel", "edele", "edelmoedig*", "eensgezind*", "eenstemmig*", "eerbied*", "eergevoel", "eerlijk", "eerlijkheid", "eervol*", "elegan*", "energie", "energiek", "engagement", "enthousias*", "erbarmen", "erkenning*", "extase", "extraatje*", "fabelachtig*", "fantastisch*", "fatsoen*", "fijn", "fijne", "fijngevoelig", "flatteer*", "flatteren*", "flatteu*", "flexib*", "flink", "flitsend*", 
"fortuin*", "foutlo*", "fraai", "fuif", "fuiven", "gaaf", "gaarne", "gave", "geaccepteerd*", "geamuseerdheid", "geapprecieerd*", "geborgen*", "gecreëerd*", "gedij*", "geestdrift*", "geestig", "geestkracht", "gegiechel", "gehecht", "gehechtheid", "geimponeer*", "geinteresseerd*", "geintje*", "gekheid", "geknuffel*", "gekscheren*", "geliefd", "geluk*", "gemak", "gemakkelijk*", "gemoedelijk*", "gemoedsrust", "genade", "genadig*", "genegen*", "genereu*", "genieten", "genoegelijk*", "genoegen", "genot", "genotrijk*", "gepast*", "geperfectioneerd*", "gereedheid", "geriefelijk", "geroemd*", "gerust*",

"geschikt*", "geslaagd*", "gespaard*", "gestimuleerd*", "getroost*", "getrouw", "gevleid", "gevoelvol*", "gewaagd", "gewaardeerd*", "geweldig*", "gewiekst", "gewild*", "geëngageerd*", "geïmponeer*", "geïnspireerd*", "geïnteresseerd*", "glans", "glansrijk*", "glanst", "glanz*", "glimlach*", "gloedvol*", "glori*", "goddelijk*", "goed", "goedaardig*", "goeddoen*", "goede", "goedertieren*", "goedgezind*", "goedgunstig*", "goedheid", "goedkeur*", "gracieu*", "grap*", "gratie", "gratieu*", "grijns", "grootmoedig*", "groots", "grootse", "gul", "gulheid", "gulle", "gunst*", "ha", "haha", "harmon*", "hartelijk*", "hartgrondig*", "hartstocht*", "hartverwarmend*", "heerlijk*", "heilza*", "held", "helden*", "heldhaftig", "heldhaftigheid", "heldin", "hemel*", "heya", "hilarisch", "hoera*", "hoezee", "hoffelijk*", "hoop", "hoopgevend*", "hoopvol*", "humor", "hup", "hé", "ideaal", "ideale", "idealen", "idealis*", "ijver*", "illuster*", "imponeer*", "imponeren", "important*", "imposant*", "indrukwekkend", "informeel", "informele", "ingestemd*", "innemend*", "innig*", "inspir*", "instem*", "intellect", "intellectueel", "intellectuelen", "intelligent", "intelligentie", "intens", "interessant", "interesse", "jolig*", "jool", "juich*", "juist*", "kalm", "kalme*", "kalmte", "kampioen*", "kans", "kansen", "kansrijk*", "keurig*", "knap*", "knuffel*", "koekje*", "koester*", "komediant*", "komedie*", "komiek*", "komisch*", "kostelijk*", "kracht", "krachtdadig*", "krachten", "krachtig*", "kus", "kussen", "kwinkslag*", "lach", "lachwekkend", "legendarisch*", "lekker*", "leuk", "leuke*", "levendig*", "levenskracht*", "lief", "liefdadig*", "liefde", "liefdevol", "liefhebben*", "liefko*", "lieftallig*", "lieve", "lol*", "lonen", "loof", "looft", "loyaal", "loyaliteit", "luisterrijk*", "lust", "magnifiek*", "makkelijk*", "menslievend*", "mild*", "minnaar*", "minnares*", "moed", "moedig*", "moeitelo*", "mooi*", "mop", "moppen*", "netjes", "nette", "nobel", "nobele", "nut", "nuttig*", "ok", 
"okay", "oke", "oké", "omarm*", "omhels*", "omhelz*", "onbekommerd", "onberispelijk*", "onbetwist*", "onbevangen*", "onbevooroordeeld*", "onbevreesd*", "onbezorgd", "ondubbelzinnig*", "ongedeerd*", "ongedwongen*", "onschatbaar", "onschuld*", "onthaal*", "onthalen", "ontspan*", "ontzag", "ontzagwekkend*", "onverschrokken*", "onvervaard*", "openheid", "openmind*", "opgehemeld*", "opgelucht*", "opgeruimd", "opgetogenheid", "opgewekt*", "ophemelen", "oplucht*", "oppermachtig", "opportune", "opportuun", "oprecht*", "optimaal", "optimal*", "optimis*", "opvrolijk*", "opwindend*", "origineel", "originele", "overtref*", "overtrof*", "overweldigend*", "overwin*", "paradijs", "partijtje*", "passie", "perfect*", "perfekt*", "pienter", "pijnlo*", "plezier*", "populair", "populariteit", "positief", "positieve*", "pracht*", "pret", "pretje", "prettig*", "prijzenswaardig*", "prima", "privilege*", "profijt",

"profijtelijk*", "prominent*", "proper*", "rechtschapen", "redd*", "relax*", "respect*", "rieleks*", "rijk", "rijkdom*", "rijke*", "roem", "roemen", "rofl", "romance", "romanticus", "romantiek", "romantisch", "royaal", "royal*", "ruimdenkend*", "ruimhartig*", "rust", "rustig*", "schalks*", "schappelijk*", "schaterlach*", "schatten", "schattig*", "scheppen*", "scherts*", "schitter*", "schoonheid", "schuldelo*", "secure", "secuur", "sereen", "serene", "sieren", "sierlijk*", "sjiek*", "slimmerik", "smaakvol*", "smakelijk*", "smettelo*", "snoep*", "snoezig*", "soulmate*", "spannend*", "sparen", "speciaal", "speels*", "sponta*", "standvastig*", "stellig", "sterk*", "stimul*", "stoutmoedig*", "stralen*", "strel*", "succes*", "super", "sympathie", "sympathiek*", "talent*", "teder*", "teer", "teergevoelig", "tegemoetkomen*", "tevreden*", "thriller*", "thrillseeker*", "toegejuichd", "toegekend*", "toegenegen", "toegewijd*", "toekenning", "toeschietelijk", "toewijding", "tof", "toffe", "toffee*", "tolerant*", "traktatie*", "tranquil*", "triomf*", "troost*", "trots*", "trouw", "uitblink*", "uitdag*", "uitgelaten", "uitmunten*", "uitstekend*", "vastberaden*", "vastbesloten*", "veelbelovend*", "veilig*", "verademing", "verbeter*", "verblijd*", "verdienst*", "verdraagza*", "veredel*", "vereerd*", "vergeving*", "verheerlijk*", "verheug*", "verheven", "verkikkerd*", "verknocht", "verknochtheid", "verkwik*", "verleid*", "verloss*", "vermaak*", "vermake*", "verrass*", "verrukkelijk*", "verrukking", "verrukt", "verstand", "verstandelijk", "verstandig*", "versterk*", "vervolma*", "verwachting", "verwachtingsvol*", "verzorg*", "vindingrijk*", "virtuoos", "virtuoze*", "vitaal", "vital*", "vleien*", "vlekkelo*", "vlot", "vlotte", "voldoening", "volmaakt*", "vooraanstaand*", "voordeel", "voordelen", "voorkomend", "voorrecht*", "voorspoed*", "voortreffelijk*", "vooruitgaan*", "voorzichtig*", "vordering", "vrede", "vredelievend", "vredig*", "vreedzaam", "vreugd*", "vrij", "vrije", "vrijer", 
"vrijgelaten*", "vrijgevig*", "vrijheid*", "vrijlaten", "vrijmoedig*", "vrijwillig*", "vrolijk*", "vurig*", "waaghal*", "waarachtig*", "waarde", "waardering*", "waardevol*", "waardig", "waardigheid", "waarheid*", "warm", "warmbloedig*", "warme", "warmte", "wauw", "weelde", "weldaad", "weldadig*", "weldoend*", "welluidend*", "welomlijnd*", "welomschreven*", "welslagen", "welwillend*", "wijs", "wijsheid", "wilskracht*", "winna*", "wonderbaarlijk", "wow*", "zachtaardig*", "zachtmoedig", "zalig", "zalige", "zege", "zegen", "zegening", "zeker*", "zelfredza*", "zelfvertrouwen", "zelfverzekerd*", "zielsverwant*", "zoen", "zoetheid", "zonneschijn", "zonnetje*", "zonnig*", "zorg", "bedankt", "beminnen", "verwachten", "waarderen", "best", "beste", "gewichtig*", "overvloed*", "aanmoediging*", "aansporing*", "band", "barm*", "behulpzaam", "belofte*", "beloof*", "beloven",

"bemoedig*", "compassie", "dank", "dankba*", "dankbetuiging*", "danken*", "dankje", "dankjewel", "dankt*", "dankwoord*", "dankzeggen*", "deelneming", "eer", "erkentelijk*", "extravert*", "feest*", "geefster*", "gesteund*", "gever", "gevers", "gezellig*", "helpen*", "hulp", "hulpvaardig", "lof*", "loven*", "mededogen", "medeleven*", "medelijden*", "onderhoudend", "ondersteun*", "prijzend", "steunend*", "toejuich*", "verering", "verfijnd*", "vergiffenis", "vertrouwelijk*", "vertrouwen", "vriendelijk*", "aanbad", "aanbaden", "aanbeden", "bedankte*", "begreep", "begrepen", "bofte", "boften", "doorstond", "doorstonden", "durfde", "durfden", "geamuseerd", "geboeid", "geboft", "geconcentreerd", "geconfronteerd", "gedurfd", "geflirt", "gegarandeerd*", "gegiecheld", "gegniffeld", "gehoopt", "gejuicht", "gekoester*", "gelachen", "geliefkoosd", "geromantiseerd", "gestreeld", "getrotseerd", "gevat", "gewijd", "gewonnen", "gezegend", "gezoend", "giechelde", "giechelden", "gniffelde", "gniffelden", "goedgekeurd", "hoopte", "hoopten", "interesseerde", "interesseerden", "lachte", "lachten", "losgelaten", "ontlastte", "ontlastten", "opgebeurd", "opgefleurd", "opgemonterd*", "opgevrolijkt", "overeengekomen", "romantiseerde", "romantiseerden", "streelde", "streelden", "uitgeblonken", "uitgemunt", "verraste", "verrasten", "verwachtte", "verwachtten", "verzekerd", "verzekerde", "verzekerden", "vooruitgegaan", "waardeerde", "waardeerden", "won", "wonnen", "zoende", "zoenden", "aaien", "aanbid", "aanbidden", "aanbidt", "bedank", "bedanken", "behagen", "behalen", "bekoren", "bemint", "boeien", "bof", "boffen", "boft", "doorsta", "doorstaan", "doorstaat", "durf", "durft", "durven", "flirten", "giechel", "giechelen", "giechelt", "gniffel", "gniffelen", "gniffelt", "hoopt", "hopen", "interesseer", "interesseert", "interesseren", "lachen", "lacht", "loslaten", "ontlast", "ontlasten", "opbeuren", "opfleuren", "opmonteren", "romantiseer", "romantiseert", "romantiseren", "slagen", "streel", 
"streelt", "toekennen", "toewijden", "vereer", "vereert", "vereren", "verras", "verrast", "verwacht", "verzeker", "verzekeren", "verzekert", "waardeer", "waardeert", "wijden", "win", "winnen", "wint", "zegenen", "zoenen", "zoent", "absoluut", "graag", "hopelijk", "raadzaam", "steunen", "verwelkomen", "bevriend*", "boezemvriend*", "hartsvriendin*", "lieveling*", "lieverd*", "schat", "schatje*", "snoes*", "snoezepoe*", "vriendschappelijk", "aangemoedigd", "geholpen*", "hielp*", "steunde", "steunden", "toegejuicht", "vergaf", "vergaven", "vertrouwd*", "verwelkomd*", "aanmoedigen", "aansporen*", "helpt", "steun", "steunt", "vergeef", "vergeeft", "vergeven", "vertrouw", "vertrouwt", "verwelkom", "verwelkomt"] list_relativ1 = ["aanbrengen", "aaneengesloten", "aanhouden", "aanhoudend*", "aansluiting*", "aanstaand", "aanstaande", CHATBOT WITH INTERPERSONAL COMMUNICATION RECOGNITION 79

"aansticht*", "aanstonds", "aanvang", "aanvangs*", "abrupt", "achterste", "achteruit*", "actie", "actualiteit", "actueel", "actuele", "adieu", "afgelegen", "afgelopen", "afgewisseld*", "afgrond", "afkomst*", "afloop", "afnemen", "afrit*", "afslag*", "afstand", "afstanden", "aju", "ajuus", "aldoor", "alledaags*", "allerlaatst*", "anciënniteit", "antieke", "apart", "april*", "areaal", "augustus*", "auto", "auto's", "autootje*", "automobiel*", "autokaart*", "avond", "avonden", "avondje*", "avonds", "begin*", "begon*", "behav*", "bejaard*", "beknopt", "benader*", "beneden*", "bereik*", "berijd*", "bespoedig*", "bestendig*", "bestrijkingsgebied*", "bevat", "bevend*", "bewegend*", "beweging*", "bezig*", "bijeen", "bijgewoond*", "bijtijds*", "bijwonen", "binnenkant*", "binnenste", "bocht", "bochten", "bodem*", "bolvormig*", "bovengrens", "bovenst*", "breder*", "breedst*", "breedte*", "briefd*", "briefen", "briefing", "brieft*", "buig*", "buitenkant", "buitenzijde*", "capaciteit", "centimeter*", "centra*", "centrum", "cm", "coherentie*", "compact*", "continu*", "courant", "courante", "cycli", "cyclus", "daaag", "daag", "daags*", "daarnet", "daaropvolgend*", "dagelijk*", "dagen", "dagtekening", "dans*", "dateer*", "dateren", "datering", "datum", "datumlo*", "december*", "decennium", "decimeter*", "diagona*", "dicht", "dichtbij*", "dichtdoen", "dichte", "dichterbij", "dichtheid", "dichtstbij*", "diep", "diepe", "dieper*", "diepst*", "diepte*", "dinsdag*", "direct", "donderdag*", "doordeweeks*", "doorgelopen", "doorgereden", "doorlopen", "doorlopend*", "doorrijden", "doorsnede*", "doorsturen", "drager*", "dringend", "duren", "duur", "duurza*", "duwend*", "dwarsover", "eenjarig", "eensklaps", "eerdaags", "eerder*", "eergisteren*", "eerstvolgend", "eertijds", "eeuw", "eeuwen*", "eeuwigheid", "eind", "einde", "eindelijk", "eindelo*", "eindig", "elders", "enorm", "enorme", "entrée*", "etage*", "even", "fase*", "februari*", "fiets", "fietskaart*", "finale", "finish", "flow", 
"follow-up*", "gang", "gangba*", "ganggeno*", "gebeurtenis*", "gebied", "gebogen", "geboorte*", "geboren", "gebrief*", "gebruikelijk*", "gedateerd*", "gedrag*", "gedurig*", "gegroeid", "gehaast*", "gehuppel*", "gejaag*", "gekruist", "geldend*", "gelegd*", "geleidelijk*", "gelijktijdig*", "generatie", "gepasseerd*", "geraasd", "gerild", "gerukt", "geschiedenis", "geschoven", "geslipt*", "gesloten", "gesprongen", "gestalte", "gestegen", "gestroomd*", "gestruikeld*", "getrippel*", "getuimeld*", "gevallen", "gevloeid*", "gevuld*", "gewone", "gewoon", "giga*", "gister*", "gooid*", "gooiend*", "groei*", "grond", "grondgebied*", "grootst*", "grootte", "haal", "haast*", "hal", "halfjaar", "halfjaarlijks*", "halfvol*", "hallen", "halsoverkop", "halverwege", "handeling", "handelswijze", "handhaven", "happening", "hardlope*", "hectometer*", "heden", "hedenavond", CHATBOT WITH INTERPERSONAL COMMUNICATION RECOGNITION 80

"hedendaags*", "heelal", "herfst*", "herhaal", "herhaalt", "herhal*", "herintre*", "herkomst", "hiernamaals", "historisch", "hoek", "hoeken", "hoeklijn*", "hoge", "hoger*", "holle", "hoog", "hoogbejaard*", "hoogste", "hoogte", "hoogten", "horizonta*", "horloge*", "huidig*", "huisgeno*", "huppel*", "ijlings", "immens*", "incident*", "ingang*", "ingegrepen", "ingehaald*", "ingericht*", "ingrijpen", "inhalen", "inhoud", "initiatie", "initiaties", "initieel", "initiële", "innerlijk*", "inrichten", "intern", "internat*", "interne", "intersectie*", "interval*", "intussen", "inwendig*", "jaar", "jaarlijks*", "jaartal*", "jaartelling*", "januari*", "jaren", "jarenlang", "jeugd*", "jog*", "jong", "jongstleden", "juli", "juni", "kamer", "kamergeno*", "kamers", "kant", "kar", "karren", "kassiewijle*", "keer", "kerst*", "kilometer*", "kinderjaren", "kindertijd", "klaar", "klein", "kleine", "klimmend", "klok", "km", "kolos*", "komend", "komende", "koppeling*", "kort", "kortstondig", "kosmos", "krap", "kringloop*", "krom", "kromme", "kruisen", "kruist", "kruiste*", "kwartalen", "laag", "laagst*", "laatst*", "lage", "lager", "lagere", "land", "landen", "landkaart*", "lang", "langdurig*", "lange", "langer*", "langetermijn", "langst*", "langza*", "late", "later*", "ledig*", "leeftijd*", "leg", "lege", "leggen", "legt", "lengte*", "lente*", "leste", "levensduur", "levensfase*", "ligging", "linker*", "links", "locatie", "lokaal*", "longitudinaal", "loodlijn*", "loodrecht*", "lucht", "luxewagen*", "maand", "maandag*", "maandelijks", "maart*", "maximum", "meegesleept*", "meegesleurd*", "meerderjarig*", "meeslepen*", "meesleuren", "mei", "meter*", "methodisch*", "middag", "middagen", "middellijn", "middelmaat", "middelpunt", "middelste", "midden", "middenvlak", "middenweg", "mijl*", "millimeter*", "mini", "mini-*", "minuscu*", "minuten", "minuut", "mm", "modern*", "moment*", "morgen*", "motorrijtuig*", "muren", "muur", "nabeschouwing", "nabijgelegen", "naburig*", "nacht", "nachtelijk*", 
"nachten", "nachts", "naderen", "naderhand", "nadien", "najaar", "namiddag", "nasleep", "nasturen", "natie*", "nationaal", "national*", "nauw", "nauwsluitend", "nazenden", "neer", "neergevallen", "neergezet", "neervallen", "neerzetten", "nieuw", "nieuwer*", "nieuwheid", "nieuwst*", "niveau", "niveaus", "noord", "noordelijk", "noordelijke", "noorden", "noorderbreedte", "noordoost", "noordoosten", "noordwest", "noordwesten", "november*", "occasioneel", "occasionele", "ochtend", "oeroud*", "ogenblik", "ogenblikkelijke", "ogenblikken", "oktober*", "omgevallen", "omgeving", "omhooggelopen", "omhooggevlogen", "omhooglopen", "omhooglopend*", "omhoogvliegen", "omhoogvliegend*", "omlaag", "omliggend", "omlooptijd*", "omring*", "omsingel*", "omsluit*", "omstreken", "omtuimel*", "omvallen", "omvangrijk*", "onafgebroken*", "onderaan", "onderbreking*", "onderkant", "onderste", CHATBOT WITH INTERPERSONAL COMMUNICATION RECOGNITION 81

"ondertussen", "onderzijde", "oneindige", "oneindigheid", "ongedateerd*", "onheugelijk*", "onmetelijk*", "onmiddellijk*", "onophoudelijk*", "onsterfelijk*", "ontvlucht*", "onvergankelijk*", "onverhoeds", "onverwacht", "oost*", "opeengepakt*", "opeenvolg*", "opende*", "opendoen", "opener", "opening*", "opent", "opgehangen", "opgelopen", "opgerezen", "opgestaan", "opgestegen", "ophangen", "oplopen", "oponthoud", "oppervlak", "oppervlakken", "oppervlakte*", "oprijzen", "opschieten", "opstaan", "opstelling", "opstijgen", "opsturen", "opvangen", "oriëntaal", "oriëntale", "oud", "ouderdom", "oudere", "ouderwets*", "oudheden", "oudheid", "oudst*", "overdag", "overdwars", "overgezet*", "overgooien", "overhaast*", "overlap*", "overloop", "overmorgen*", "overstromen", "overstroomd*", "overzetten", "paardgereden", "paardrij*", "passeer", "passeerd*", "passeert", "passen", "passeren", "passerend*", "pauze*", "peil", "periode*", "periodiciteit", "periodisering", "permanent*", "personenewagen*", "piepklein*", "plaats", "plaatselijk*", "plaatsen*", "plaatsgevonden", "plaatshebben", "plaatsing", "plaatst*", "plaatsvinden", "plafond", "plannen", "planning", "plat", "platform*", "plattegrond*", "plek", "plotseling", "poos", "poosje", "positie", "posities", "postscriptum", "pril", "prioritair*", "prioriteit", "progressie*", "prolongatie", "ps", "punt", "rand", "randen", "rangorde*", "rangschik*", "rap", "rappe", "razen", "recent", "recente", "recht", "rechte", "rechterhand", "rechterkant", "rechterzijde", "rechtop", "rechts", "rechtstandig*", "reeds", "regelma*", "regio", "regios", "reis", "reisd*", "rennend*", "renner*", "retour*", "reus*", "rijdend*", "rijtuig*", "rijzen", "rijzig*", "ril", "rilde", "rilden", "rillen", "rillend*", "rilt", "ritme*", "ritmisch*", "rooster*", "routekaart*", "ruim", "ruime", "ruimheid", "ruimte", "ruimten", "ruimtes", "ruk", "rukken", "rukkend*", "rukt*", "runner*", "samengepakt*", "samenstel*", "savonds", "schakeling*", "scheid", "scheidd*", 
"scheidt", "schikken", "schikt", "schikten", "schoof", "schoven", "schuif", "schuift", "schuin", "schuine", "schuiven", "schuivend*", "seconde*", "seizoen*", "semester*", "senior*", "september*", "sessie*", "set", "simulta*", "sindsdien", "site", "sites", "situer*", "slip", "slippen", "slippend*", "slipt*", "sloot", "sluit", "sluiten", "smiddags", "smorgens", "snachts", "snel", "snelle", "sneller*", "snelst*", "spatie*", "spil", "spoed*", "spring", "springen", "springend*", "springt", "sprong*", "staakt*", "stad", "stadia", "stadium", "stadskaart*", "stadsplan*", "stand", "stappend*", "stapsgewij*", "stapvoets", "start*", "steeg", "stegen", "steil", "steile", "stelsel*", "stijg", "stijgen", "stijgend*", "stijgt", "stip", "stokoud*", "straat", "straten", "streek", "struikel*", "sturend*", "synchro*", "systeem", "systematisch*", "systemen", CHATBOT WITH INTERPERSONAL COMMUNICATION RECOGNITION 82

"tegenwoordig", "tegenwoordige", "telkens", "tempo", "temporeel", "temporele", "termijn", "termijnen", "termina*", "terrein", "terreinen", "teruggelegd*", "teruggeplaatst*", "teruggezet*", "terugleggen", "terugplaatsen", "terugzetten", "tevoren", "tijd", "tijden", "tijdig*"] list_relativ2 = ["tijdje", "tijdlo*", "tijdperk*", "tijdsverloop", "tijdvak*", "timen", "timer*", "timing", "toegang*", "toekomst", "toekomstig", "toenemend*", "toenmaals", "toenmalig", "toestand", "top", "traag*", "transport", "trede", "trekkend*", "treuzel*", "trillend*", "trimester*", "trimmer", "trip*", "tuimel*", "tussenpoos", "tussenpoze*", "tussenruimte*", "tussentijd", "tweejaarlijks*", "tweemaandelijks*", "tweewekelijks*", "uiteindelijke", "uiterlijk*", "uitgang*", "uitgerekt*", "uitgestrekt*", "uitgezonden", "uitrekken", "uitstel*", "uitsterven", "uitwendig*", "uitzenden", "universum", "update", "uur", "uurwerk", "vallen", "vallend*", "valt", "vanavond", "vandaag", "vanmiddag", "vanmorgen", "vannacht", "vanouds", "vastentijd", "ver", "verbinding*", "verbleken", "verblijf", "verblijv*", "verbreden", "verbreidingsgebied*", "verdere", "verdieping*", "verdiept*", "vereeuwigen", "vereeuwiging", "verhaast*", "verhef*", "verhuis*", "verhuiz*", "verjaar*", "verlate", "verleden*", "verlengen", "vernieuw*", "verouder*", "verplaats*", "verrezen", "verrijzen", "verschjnt", "versnel*", "verspreid", "verst", "verste", "verstreken", "verstrijk*", "verte", "vertica*", "vertraag*", "vertrag*", "vertrek*", "vertrok*", "vervagen", "verval", "vervalen", "vervallen", "vervalt", "vervang", "vervangen", "vervangt", "verviel*", "verving*", "vervoer*", "vervolg", "verwelken", "verwijder*", "verwissel*", "verzond*", "vetera*", "viel", "vielen", "vigerend*", "vlak", "vlakbij", "vlakke", "vlakker*", "vlakst*", "vlakte", "vliegend*", "vloeien", "vloeit", "vloer", "vlucht", "vluchtig", "vluchtt*", "vlug*", "voertuig*", "volgend*", "volgorde*", "volle", "voltooi*", "volume", "vooraf*", "vooravond", "voorbarig*", 
"voorbijgekomen", "voorbijgetrokken", "voorbijkomen*", "voorbijstromen*", "voorbijtrekken*", "voordien", "voorgaan", "voorgegaan", "voorhand", "voorheen", "voorjaar", "voorlijk*", "voorloper*", "voormalig*", "voormiddag", "voorrang", "voortaan", "voortduren", "voortduring", "voortgaan", "voortgaand*", "voortgang", "voortgegaan", "voortgemaakt", "voortijdig*", "voortmaken", "voortuitgang", "voortzet*", "vooruit", "vooruitkomen", "voorwaarts*", "vorig*", "vrijdag*", "vroeger*", "vroegst*", "vroegte", "vroegtijd*", "vul", "vulde", "vulden", "vullen", "vult", "wacht", "wachten", "wagen*", "wand", "wandel*", "wanden", "waterpas", "wederom", "week", "weekdag", "weekeinde*", "weekend*", "weerkaart*", "weg", "wegen", "wegenkaart*", CHATBOT WITH INTERPERSONAL COMMUNICATION RECOGNITION 83

"weggekwijnd", "weggenomen", "wegkwijnen", "wegnemen", "wegspringen", "wegsterven", "weids*", "wekelijks*", "weken", "wekker*", "weldra", "welnu", "wereld", "wereldkaart*", "werkweek", "werkweken", "west*", "wierp*", "wijd", "wijde", "wijder", "wijdte", "wijlen", "winter*", "woensdag*", "wondergroot", "wondergrote", "zaal", "zalen", "zaterdag*", "zendend*", "zett*", "ziens", "zijde", "zijden", "zijkant", "zijkanten", "zoeven", "zomen", "zomer*", "zondag*", "zoom", "zuid", "zuidelijk", "zuidelijke", "zuidelijkst", "zuiden", "zuiderbreedte", "zuidoost", "zuidoostelijk", "zuidoosten", "zuidwaarts", "zuidwest", "zuidwestelijk", "zwem*", "gelijkvloers*", "historie", "orde*", "vooruitzicht", "bezoek", "dag", "doeg", "doei", "gescheiden", "gestuurd*", "gezonden", "ontvang*", "samen", "scheiden", "versturen", "verstuur*", "verzend*", "zend", "zenden", "zendt", "zond", "omtrek", "part", "partje*", "reeks*", "segment*", "toegenomen", "toenemen", "vorm", "binnenkomen", "rondlopen", "schudden", "snellen", "trillen", "vangen", "daarachter", "daarboven", "daarin", "daarna", "daarnaast", "daarop", "ernaast", "eronder", "toen", "waarin", "brede", "breed", "dunne", "dunner*", "dunst*", "gestoten", "grijp*", "los", "losse", "pak", "pakt*", "smal", "smalle", "smaller*", "smalst*", "stoot", "stoott*", "stoten", "raas", "raasd*", "geopend*", "kom", "kwam*", "omvat*", "open", "openen*", "beëindig*", "gestaakt", "gestopt*", "grens", "grenzen", "halt", "insluiten*", "omhein*", "stilstand", "stop*", "altijd", "constant*", "definitief", "definitieve", "eeuwig", "eeuwige", "immer", "voortdurend*", "afwissel*", "dralen", "frequent*", "gedraal", "gelegenheid", "geschipper", "getalmd", "schipperen", "talm*", "tijdelijk*", "veelvuldig*", "voorlopig*", "gevolg", "gevolgen", "leider*", "leiding*", "leidraad*", "leidster*", "ontsproten*", "ontstaa*", "oorspron*", "oprichting*", "opstart*", "origine", "voortgekomen*", "afsluit*", "trage", "openheid", "massa", "buurt", "eenmaal", "eerst", "eerste", 
"kwartaal", "oneindig", "groot", "grote", "groter*", "hoogst", "kleiner*", "kleinst*", "vol", "terwijl", "totdat", "wanneer", "zodra", "aan", "achter", "bij", "binnenin", "boven", "bovenop", "door", "gedurende", "in", "langs", "na", "naar", "naast", "onder", "op", "sedert", "sinds", "tijdens", "tot", "voor", "voorbij", "al", "binnenkort", "dadelijk", "gauw", "herhaaldelijk*", "hierheen", "nog", "nu", "ogenblikkelijk", "onlangs", "overal", "pas", "terug", "thans", "uiteindelijk", "verder", "vervolgens", "waar", "waarheen", "weer", "aankomen", "aanvangen", "achtervolg", "achtervolgen", "achtervolgt", "afdalen", "afgeven", "afleggen", "afleveren", "arriveer", "arriveert", "arriveren", "beklim", "beklimmen", "beklimt", "betreden", "betreed", "betreedt", "beweeg", "beweegt", "bewegen", "blijf", "blijft", "blijven*", "breng", "brengen", "brengt", CHATBOT WITH INTERPERSONAL COMMUNICATION RECOGNITION 84

"doorgaan", "draaf", "draaft", "draag", "draagt", "dragen", "draven", "eindigen", "fietsen", "fietst", "gooi", "gooien", "gooit", "haalt", "halen", "handelen", "handelt", "hol", "hollen", "holt", "initiëren", "inleveren", "innemen", "intrappen", "klauter", "klauteren", "klautert", "klim", "klimmen", "klimt", "loop", "loopt", "lopen", "meebrengen", "meedragen", "meegaan", "meenemen", "meevoeren", "nader", "nadert", "neemt", "neerleggen", "nemen", "omgeven", "omgooien", "omlopen", "opklimmen", "opschuiven", "opstappen", "overnemen", "oversteken", "race", "racen", "racet", "reik", "reiken", "reikt", "reist", "reizen", "ren", "rennen", "rent", "rijd", "rijden", "rijdt", "schud", "schudt", "sjok", "sjokken", "sjokt", "snelt", "stap", "stappen", "stapt", "stromen", "stroom", "stroomt", "stuur", "terugbrengen", "transporteren", "trek", "trekken", "trekt", "tril", "trilt", "trim", "trimmen", "trimt", "vang", "vangt", "verdwijn", "verdwijnen", "verdwijnt", "verhogen", "verruimen", "vervolgen", "vervolgt", "vlieg", "vliegen", "vliegt", "vluchten", "volg", "volgen", "volgt", "voorbijgaan", "vormen", "waaien", "wegbrengen", "weggaan", "weglopen", "werp", "werpen", "werpt", "zet", "zit", "aangekomen", "achtervolgd", "achtervolgde", "achtervolgden", "afgedaald", "afgegeven", "afgelegd", "afgeleverd", "arriveerde", "arriveerden", "beklom", "beklommen", "betrad", "betraden", "bewogen", "bewoog", "binnengekomen", "bleef", "bleven", "bracht", "brachten", "doorgegaan", "draafde", "draafden", "droeg", "droegen", "fietste", "fietsten", "gearriveerd", "gebleven", "gebracht", "gedanst", "gedraafd", "gefietst", "gegaan", "gegooid*", "gehaald", "gehandeld*", "gehold", "gejogd", "geklauterd", "geklommen", "gelopen", "genaderd*", "geplaatst*", "geracet", "gereden", "gereikt", "gereisd", "gerend", "geschud", "gesjokt", "gesneld", "gestapt", "getransporteerd", "getrild", "getrimd", "getrokken", "gevangen", "gevlogen", "gevlucht", "gevolgd", "gewaaid", "gewandeld", "geworpen", "gezwommen", 
"ging*", "haald*", "handeld*", "hardgelopen", "holde", "holden", "ingeleverd", "ingenomen", "ingepakt", "ingetrapt", "klauterde", "klauterden", "klom", "klommen", "liep", "liepen", "meegebracht", "meegedragen", "meegegaan", "meegenomen", "meegevoerd", "naderde*", "neergelegd", "omgegooid", "omgelopen", "opgeklommen", "opgeschoven", "opgestapt", "overgenomen", "overgestoken", "racete", "raceten", "reed", "reikte", "reikten", "rende", "renden", "rondgelopen", "schudde", "sjokte", "sjokten", "snelde", "snelden", "stapte", "stapten", "stroomde", "stroomden", "stuurd*", "teruggebracht", "trilde", "trilden", "trimde", "trimden", "trok", "trokken", "verbleef", "verbleven", "verdween", "verdwenen", "verruimd", "vervolgd", "vervolgde", "vervolgden", "ving", "vingen", "vlogen", "vloog", "volgde", "volgden", "voorbijgegaan", "vooruitgekomen", "wachtte", CHATBOT WITH INTERPERSONAL COMMUNICATION RECOGNITION 85

"wachtten", "weggebracht", "weggegaan", "weggelopen", "zwom", "zwommen", "ga", "gaan", "gaat", "slaapwandel", "slaapwandelen", "slaapwandelt", "geslaapwandeld", "slaapwandelde", "slaapwandelden", "verschijn", "verschijnen", "verscheen", "verschenen", "soms", "uit", "eer", "opnieuw", "bevatten", "afgezonderd*", "afzonder*", "deel", "veranderen", "bezoeken", "bezoekt", "bezorg", "bezorgen", "bezorgt", "overbrengen", "sturen", "stuurt", "wegsturen", "bezocht", "bezochten", "ontving*", "overgebracht*", "weggestuurd*", "dun", "duw", "duwen", "duwt", "pakken", "duwd*", "geduwd", "gegrepen", "gepakt", "greep", "grepen", "vroeg", "tegen", "buitensluiten", "uitsluiten", "allebei", "beide", "getwee*", "zowel", "binnen", "mee", "nabij", "rondom", "komen", "komt", "gekomen", "afwachten", "ophouden", "uitscheiden", "afgewacht", "opgehouden", "staken", "geheel", "nimmer", "nooit", "steeds", "ergens", "vaak", "daarbuiten", "leid", "leiden*", "leidt", "oprichten", "verander", "verandert", "geleid", "leidd*", "veranderd*", "tenslotte", "verlaag", "verlaagt", "verlagen", "verlaagd", "verlaagde", "verlaagden", "laat", "raast", "eens", "buiten", "reden", "verlaat", "beven", "opjagen, "gebeefd", "ooit", "geleden"] list_negemo1 = ["geleden", "aarzel", "aarzelen*", "aarzelt", "aarzeld*", "ontoereikend*", "opgeven", "opgegeven", "verlaat", "opjagen", "raast", "beven", "gebeefd", "kut*", "lul", "aanranden", "aangerand", "leed", "druk", "tier", "tiert", "geobsedeerd", "dwing*", "dwong*", "gedwongen", "weiger", "weigert", "weigerde", "weigerden", "negeren", "genegeerd*", "ontweek", "ontweken", "verdrongen", "vermeden", "vermeed", "verwar", "verwart", "verward*", "missen", "forceren", "geforceerd", "leden", "gerouwd", "doodslaan", "doodgeslagen", "onverzettelijk*", "inadequa*", "ondeugdelijk*", "ongeschikt*", "faal", "faalt", "falen", "tekortschieten", "verliezen", "verspelen", "faalde", "faalden", "gefaald*", "tekortgeschoten", "verloren", "dwarsbomen", "gedwarsboomd*", "overstelpt", 
"verlaag", "verlaagt", "verlagen", "verlaagd", "verlaagde", "verlaagden", "onteren", "sloerie*", "schijt", "gillen", "gilt", "schreeuw", "schreeuwen*", "schreeuwt", "gegild", "schreeuwde*", "tieren*", "bedorven", "aarzeling*", "schromen", "schroom*", "bedwang", "mijd", "mijden", "mijdt", "ontloop", "ontloopt", "ontlopen", "gemeden", "meden", "meed", "ontliep", "ontliepen", "verbood", "verdrong", "verworpen", "afwijz*", "berisp*", "verwarren", "gemis*", "conflict*", "aanrichten", "aangericht", "verdacht", "beklagen*", "toetakelen", "wanhoopt", "wanhopen", "gewanhoopt", "wanhoopte", "wanhoopten", "bedrieg", "bedriegen", "bedriegt", "pest", "pesten", "uitvechten", "bedrogen", "bedroog", "gepest", "gevochten", "pestte", "pestten", "uitgevochten", "vocht", "vochten", "kankeren", "verneder", "vernederen", "vernedert", "geterroriseerd", "vernederd", "vernederde", "vernederden", "afkeuren", "beklaag", "beklaagt", "onderbreek", "onderbreekt", "onderbreken", "roddelen", "roddelt", "stoor", "stoort", "storen", CHATBOT WITH INTERPERSONAL COMMUNICATION RECOGNITION 86

"afgekeurd", "beklaagd", "beklaagde", "beklaagden", "geroddeld", "gestoord", "onderbrak", "onderbraken", "onderbroken", "roddelde", "roddelden", "stoorde", "stoorden", "verdragen", "gedood", "zondig", "zondigen", "zondigt", "gezondigd", "zondigde", "zondigden", "domineren", "geruïneerd", "domineert", "domineerden", "dronkaard*", "dronkelap*", "dronkenlap*", "domineer", "domineerde", "bedwing*", "bedwong*", "onvoldoende", "onbekwa*", "sukkel*", "aansprakelijk*", "afgezonderd*", "afzonder*", "smakelo*", "aanranding*", "onteer*", "verkracht*", "depressi*", "lijd*", "pijn", "pijnlijk*", "kwaadaardig*", "verstik*", "fobi*", "bot", "dreun", "ruw", "ruwe", "slag", "brom*", "getier*", "tierd*", "dwang", "geweigerd*", "weigeren", "weigering", "ontmoedig*", "verontachtza*", "bekrompen*", "defensief", "defensiev*", "halsstarrig*", "hardnekkig", "hinder*", "koppig*", "negeer*", "tegenstand", "verdedig*", "verweer", "beklemmend", "benauw*", "geremd*", "onderdruk*", "ontwijk*", "schoorvoetend", "star", "verdring*", "vermijd*", "weerhoud*", "proble*", "ongelukkigerwij*", "bedrieglijk*", "besluitelo*", "dub", "dubben*", "dubd*", "dubt", "gedubd*", "geschroomd*", "onbestendig*", "ongewis*", "onzeker*", "risico*", "riskeer*", "riskeren*", "stamel*", "twijfel*", "verwarrend", "verwarring*", "weifel*", "berouw*", "betreur*", "hunker*", "smacht*", "spijt*", "geweld", "achterdocht*", "argwa*", "suspicie*", "verdenk*", "mededogen", "medelijden*", "berust", "berusten", "bezwijk", "bezwijken", "bezwijkt", "boet", "boeten", "deprimeer", "deprimeert", "deprimeren", "huilt", "isoleren", "krenk", "krenken", "krenkt", "kwel", "kwelt", "kwets", "kwetsen", "kwetst", "ontneem", "ontneemt", "ontnemen", "schaad", "schaadt", "schaden", "treur", "treurt", "berustte", "berustten", "bezweek", "bezweken", "boette", "boetten", "deprimeerde", "deprimeerden", "geboet", "gedeprimeerd", "gehuild", "gejammerd", "gejankt", "gekrenkt", "gekweld*", "gekwetst*", "gelaten", "geschaad", "getreurd", "geïsoleerd", 
"huilde", "huilden", "krenkte", "krenkten", "kwetste", "kwetsten", "miste", "misten", "ontnam", "ontnamen", "ontnomen", "schaadde", "schaadden", "treurde", "treurden", "verliet", "verlieten", "afkraken", "afstraffen", "ageer", "ageert", "ageren", "bederft", "bederven", "bedreig", "bedreigt", "bekritiseer", "bekritiseert", "bekritiseren", "bestraf", "bestraffen", "bestraft", "ergeren", "ergert", "frustreer", "frustreert", "frustreren", "haat", "haten", "intimideer", "intimideert", "intimideren", "irriteer", "irriteert", "irriteren", "jen", "jennen", "jent", "mishandel", "mishandelen", "mishandelt", "molesteren", "opdonderen", "scheld", "schelden", "scheldt", "straf", "straffen", "straft", "teister", "teisteren", "teistert", "treiter", "treiteren", "treitert", "uitschelden", "veracht", "verachten", "vergal", "vergallen", "vergalt", "verniel", "vernielen", "vernielt", "vernietig", "vernietigen", "vernietigt", "vervloek", "vervloeken", "vit", "vitten", "walg", "walgen", "walgt", "wreek", "wreekt", "wreken", "zeur", "zeuren", "zeurt", "afgekraakt", "afgestraft", "ageerde", "ageerden", "bedreigd", "bedreigde", "bedreigden", "bekritiseerde", "bekritiseerden", "bestrafte", "bestraften", "bevocht", "bevochten", "ergerde", "ergerden", "frustreerde", CHATBOT WITH INTERPERSONAL COMMUNICATION RECOGNITION 87

"frustreerden", "geageerd", "gedreigd", "gefrustreerd", "gehaat", "gehinderd", "gejend", "gelogen", "gemolesteerd", "gescholden", "gestraft", "geteisterd", "getreiterd", "gevit", "gewalgd", "gewroken", "gezeurd", "geërgerd", "geïntimideerd", "geïrriteerd", "haatte", "haatten", "intimideerde", "intimideerden", "irriteerde", "irriteerden", "jende", "jenden", "mishandeld", "mishandelde", "mishandelden", "opgedonderd", "schold", "scholden", "strafte", "straften", "teisterde", "teisterden", "treiterde", "treiterden", "uitgescholden", "verachtte", "verachtten", "vergald", "vergalde", "vergalden", "vernield", "vernielde", "vernielden", "vernietigd", "vernietigde", "vernietigden", "vitte", "walgde", "walgden", "wreekte", "wreekten", "zeurde", "zeurden", "idioot", "generen", "obsedeer", "obsedeert", "obsederen", "overstelp", "overstelpen", "overweldig", "overweldigen", "overweldigt", "pieker", "piekeren", "piekert", "schaam", "schaamt", "schamen", "schrik", "schrikt", "terroriseren", "terugschrikken", "tob", "tobben", "tobt", "verstijf", "verstijft", "verstijven", "vrezen", "afgeschrikt", "bezeten", "gegeneerd*", "gepiekerd", "getobd", "gevreesd", "obsedeerde", "obsedeerden", "opgejaagd*", "overstelpte", "overstelpten", "overweldigd", "overweldigde", "overweldigden", "piekerde", "piekerden", "schaamd*", "schrok", "schrokken", "teruggeschrokken", "tobde", "tobden", "verstijfd", "verstijfde", "verstijfden", "begeef", "begeeft", "begeven", "belast", "belasten", "beroof", "berooft", "beroven", "doemdenken", "kwijtraken", "losrukken", "misbruiken", "mislopen", "ontraad", "ontraadt", "ontraden", "opbreken", "plaagt", "tenietdoen", "verbijten", "verbreek", "verbreekt", "verbreken", "verdraag", "verdraagt", "vererger", "verergert", "verknoei", "verknoeien", "verknoeit", "veroordeel", "veroordeelt", "veroordelen", "verveel", "verveelt", "vervelen", "wantrouw", "wantrouwen", "wantrouwt", "begaf", "begaven", "belastte", "belastten", "beroofd", "beroofde", "beroofden", "geplaagd", 
"geprotesteerd", "geschonden", "geschreeuwd", "gewantrouwd", "kwijtgeraakt", "logen", "loog", "losgerukt", "misbruikt", "misgelopen", "ontraadde", "ontraadden", "opgebracht*", "opgebroken", "plaagde", "plaagden", "schond", "schonden", "tenietgedaan", "verbeten", "verbrak", "verbraken", "verbroken", "verdroeg", "verdroegen", "verergerd*", "verknoeid", "verknoeide", "verknoeiden", "veroordeeld", "veroordeelde", "veroordeelden", "verveeld", "verveelde", "verveelden", "verwierp", "verwierpen", "wantrouwde", "wantrouwden", "doodsstrijd", "rouw*", "moord*", "oorlog*", "vermoord*", "hel", "verkwist*", "verspil*", "hebzucht", "hebzuchtig", "mat", "afgang", "fiasco*", "hulpelo*", "machtelo*", "misluk*", "nederlaag", "onsucces*", "verlies*", "verliezer", "verliezers", "verloor*", "verspeel*", "overheers*", "verijdel*", "trage", "del", "ontrouw", "pervers*", "slet", "sletje*", "sletten", "sletterig*", "afmat*", "gevoello*", "lustelo*", "uitgeput*", "uitput*", "versuft*", "pikken", "slap", "zwak", "gil", "bitter*", "wrang*", "afkeren", "belemmer*", "ignoreren", "onderworpen", "onwillig", "rem", "remmend", "remming", "rigide", "starheid", "streng*", "verbied*", "verbod*", "verwerp*", "bedenkelijk", "dubieu*", CHATBOT WITH INTERPERSONAL COMMUNICATION RECOGNITION 88

"onbetrouwba*", "onfortuinlijk*", "pech*", "vaag", "wazig*", "dwal*", "fout", "fouten", "foutie*", "foutje*", "gebrek*", "ongewenst*", "vergis*", "verkeerd*", "geschil", "onzinnig*", "irratione*", "deernis*", "geklaag*", "klaag*", "klacht*", "klagen*"] list_negemo2 = ["meelij*", "huilen", "treuren", "gezanik*", "pijnig*", "toegetakeld*", "verwoest*", "waardelo*", "wroeging", "hopelo*", "overstuur", "ramp*", "uitzichtlo*", "verpletter*", "wanhoop", "wanhopig*", "bedrieger*", "bestrijd*", "bevecht*", "gevecht*", "grief*", "grieven*", "onenigheid", "plaaggeest", "ruzie*", "strijd*", "tegenstander*", "vecht*", "vijand*", "wrok", "besodemieter*", "bullshit", "donderop", "eikel", "gekanker", "gelul", "geouwehoer", "goddomme", "godverdomme", "godverdorie", "hufter*", "klere", "klojo", "klootzak*", "klote*", "kreng*", "lullig", "rotvent", "rotzak*", "schoft*", "shit", "smeerlap*", "snertding", "snob", "sodemieter", "stommerd*", "stommerik", "teef", "trut", "verdom*", "verdorie", "verrek", "vervloekt*", "afschrik*", "afschuwelijk*", "gevaar", "maniak*", "smaad", "vernederend", "vernedering*", "vervelend*", "worstel*", "schrikken", "afkeuring", "beklag", "bezorgd", "geroddel", "interruptie", "misgun*", "plagen*", "plagerij*", "roddel", "domkop*", "sufferd*", "opbrengen", "verergeren", "aftakeling", "dodelijk*", "doden", "fataal", "fatale", "noodlottig*", "duivel*", "immoreel", "immorele", "onzedelijk*", "verdorven*", "zonden", "behoeftig*", "noodlijdend*", "verdonkerema*", "gedomineerd*", "incapabel*", "indolent*", "ineffecti*", "lui", "luie", "onvolkomen*", "slapjanus*", "vadzig*", "zelfvolda*", "ontslagen", "suf", "gestonken*", "onappetijtelijk*", "onsmakelijk*", "ranzig*", "stank", "stink*", "stonk*", "vies", "vieze", "onredelijk*", "aangetast*", "aantasten", "achteloos*", "achtergesteld*", "achterstellen", "afgestompt*", "afgewezen*", "bedremmeld", "bedroef*", "bedroeven", "bedroevend", "bedrukt", "bedruktheid", "behuild*", "benadeel*", "benadelen", "beroerd*", 
"berokken*", "berusting", "beschadig*", "bewen*", "blut", "desillusi*", "diepbedroefd*", "doodsangst", "dreinen*", "droefgeestig*", "droefheid", "droevig*", "eenzaam*", "ellend*", "erbarmelijk*", "foltering", "gejengel*", "gelatenheid", "hartbrekend*", "hartenbre*", "hartverscheurend*", "hartzeer", "helaas", "huil", "huilend", "huilerig", "inferieur*", "inferior*", "isolement", "jammer*", "jank*", "jengel*", "krenkend", "krenking", "leedwezen", "leeg*", "martela*", "melanchol*", "minderwaardig*", "mis", "miserabel*", "mismoedig", "mist", "misère", "moedeloos", "naargeestig*", "nadeel", "nadelen", "nalatig*", "narigheid", "neerslachtig*", "nietig*", "nietswaardig*", "nuttelo*", "onbevredigend*", "ondergang", "ondraaglijk", "ongeloofwaardig*", "ongelukkig", "ongelukkige", "ontevreden*", "ontgoochel*", "onwaardig*", "pathetisch", "pessimisme", "pessimist", "pessimistisch", "platzak", "plechtig*", "plechtstatig*", "ruine", "ruines", "ruïne*", "schade", "schadelijk", "schrei*", "slachtoffer*", "smart*", "snik*", "somber*", "tegenspoed", "teleur*", "teneergeslagen*", "teneerslaan", "terneergeslagen*", "terneerslaan", "traan", "tragedie*", "tragiek", CHATBOT WITH INTERPERSONAL COMMUNICATION RECOGNITION 89

"tragisch*", "tranen", "treurend", "treurende", "treurig*", "triest*", "verdriet*", "vergeefs*", "verlaten*", "verslagenheid", "verwaarlo*", "vruchtelo*", "waterlanders", "ween*", "wenen*", "zielig*", "zucht*", "zwaarmoedig*", "aangevallen", "aanslag*", "aanstoot*", "aanval*", "afgunst*", "afstraffing", "afweer", "agressie*", "akelig*", "barbaar*", "bedonder*", "bedot*", "bedreigen*", "bedreiging*", "bedrog", "beestachtig*", "beetgenomen", "beetnemen", "belachelijk", "belazer*", "beledig*", "benijd*", "bespot*", "bestraffing", "bestreden", "boos", "boosaardig*", "boosdoener*", "boosheid", "booswicht*", "botte", "brute", "bruut*", "confront*", "cru", "crue", "cynici", "cynicus", "cynisch*", "debiel*", "destructief", "dom", "domme", "donders", "doortrapt*", "dreig*", "driftbui*", "driftig", "driftkop", "drommels", "dupe", "dwaas*", "dwaze", "enggeestig*", "erger", "ergerlijk*", "ergernis*", "fanaat", "fel", "folter", "folteren", "foltert", "frustratie", "furie*", "gebelgd", "gehate", "gekmaken*", "gemeen", "gemene", "gemor", "geprikkeld", "geringschatting", "geslagen", "gestreden", "getergd*", "geweldadig*", "gewelddadig*", "gezeur", "geëmmer", "gotver", "grof*", "grove", "grr*", "haatdragend*", "hardvochtig*", "hartelo*", "hartvochtig*", "hatelijk*", "hater", "heftig", "hels*", "hetze", "honen*", "hoon*", "humeurig*", "idiote*", "imbeciel*", "inbreuk*", "intimidatie", "irritant", "irritatie", "irriterend*", "jaloers*", "jaloezie", "kampen", "keihard*", "klap", "kleingeestig*", "knorrig*", "kritiek*", "kritisch*", "kwaad", "kwade", "kwalijk", "laagheid", "laakbaar*", "laakbare", "last", "lastig*", "lastpost*", "lelijk*", "lesbo", "leugen*", "lieg*", "list", "listig", "lomp*", "maling", "martel", "marteld*", "marteling*", "martelt", "martelwerk*", "meedogenlo*", "mep*", "minachten*", "minachting", "mokkend*", "mopper*", "mor", "morren", "mort", "na- ijverig*", "nijd*", "offensie*", "ombrengen", "omgebracht*", "omgelegd*", "omleggen*", "onbehouwen*", "onbeschaafd*", 
"onbeschoft", "ondermijn*", "oneerlijkhe*", "ongelikt*", "onheus*", "onmenselijk*", "onnozel*", "onruststoker", "ontheilig*", "ontwijd*", "onuitstaanbaar", "onwaarhe*", "oproer", "opschudding", "opstand", "opstandig", "opvliegend*", "overtrad*", "overtred*", "overval*", "pissig", "plaag", "protest", "proteste*", "randdebiel*", "razend", "razende", "razernij", "rebel", "rebels", "ressentiment", "ruig", "ruwheid", "sarcasme", "sarcastisch", "scepsis", "scepti*", "schandalig*", "schandelijk*", "schend*", "schurk*", "slaan", "slaat", "slechterik*", "spot", "spotte*", "spuugzat", "stampij", "stom", "stomme", "stommeling*", "stompzinnig*", "stribbel*", "tegengestribbel*", "tegenstribbelen*", "tekeer*", "terg*", "toorn", "treiteraar", "uitrazen*", "uitwoeden", "vals", "valshe*", "venijnig", "verachtelijk*", "verachting", "verbitterd", "verbitteren*", "verdraaid", "verdraaiing*", "verfoeilijk*", "vergiftig*", "vergrijp", "vernieler", "vernieling", "vernielzuchtig", "vernietigend", "waanzinnig*", "walgelijk*", "walging", "wapen*", "weerzinwekkend*", "wildheid", "woedde", "woede*", "woest*", "wraak", "wraakzucht", "wraakzuchtig", "wrede", "wreed*", "wrevel", "wrevelig", "zakkenwasser*", "zanik*", "zelfverdediging",
"ziedend*", "zot*", "afgeschrok*", "afgrijselijk*", "afkeer*", "afkerig*", "afschuw", "alarmerend*", "angst*", "asocia*", "aversie*", "bang*", "beangst*", "bedeesd*", "beef*", "beschaamd*", "bescham*", "beschroomd*", "beverig*", "bevreesd*", "bezorgdheid", "despera*", "doodsbang*", "doodsbenauwd*", "doordrammen", "eng", "enge", "gek", "gekken", "genant*", "geneer*", "geschaamd*", "geschrokken*", "gespannen*", "gevaarlijk*", "gevaren", "geworstel", "griezel", "griezelig", "gruwel", "gruwelijk", "hardleers*", "horror", "huiver*", "hypernerveus", "inhibe*", "kwetsba*", "nerveu*", "neuroot", "neuroten", "neuroti*", "nood", "obsessie", "onaangena*", "onaangepast*", "onbedwingba*", "onbehaaglijk", "onbehaaglijkheid", "onbeheerst*", "onbeholpen*", "oneer", "ongemak", "ongemakkelijk*", "ongerust*", "onhandelba*", "onhandig*", "onredza*", "onrust", "onrustig", "ontbering", "ontsteld*", "ontstelt*", "onveilig", "onveiligheid", "onwrikba*", "opgelaten*", "overstelpend", "paniek", "paniekerig", "paniekzaaier", "paranoia", "paranoïde", "penibel*", "radelo*", "rusteloos", "schaamte", "schande", "schichtig", "schrikaanjagend*", "schrikachtig*", "schrikbarend*", "schrikbeeld*", "schrikwekkend*", "schuchter*", "schuldgevoel", "schuldig", "schuw*", "sidder*", "spanning*", "starre", "stott*", "stress", "tekortkoming*", "terreur", "terugdein*", "teruggedeinsd*", "teugello*", "timide", "tobberig", "trillerig*", "verdruk*", "verdwaas*", "verdwazen", "verlegen*", "verontrust*", "verschrik*", "verstar*", "vertwijfeld*", "vertwijfeling", "vrees*", "weerloos", "zenuwachtig*", "zenuwinzinking", "zenuwslopend*", "aasgier*", "abnormaal", "achteroverdrukken", "achterovergedrukt*", "afgejakkerd*", "afgemat*", "afgezaagd*", "antipathi*", "apati*", "arrogant", "arrogantie", "baatzucht", "baatzuchtig", "bah", "banaal", "banale", "beheksen", "bezwaar", "bezwarend", "biets*", "bizar", "blase", "catastrofe*", "chagrijnig*", "cliche*", "deren", "dief", "dievegge", "dieven", "doetje*", "domhe*", 
"dommerd*", "droplul*", "egoïsme", "egoïst", "egoïstisch", "eigenaardig", "eigengereid", "eigenzinnig", "emotioneel", "ergst*", "ernst", "ernstig*", "fake", "fataliteit*", "flater*", "flop", "freak", "gap", "gappen", "gapt*", "gebietst*", "geblaseerd*", "gegapt*", "gepijnigd*", "geradbraakt", "geschokt*", "gestolen", "gestresst", "getikt", "geveinsd*", "geëmotioneerd", "gril", "grimmig", "gênant", "gêne", "hakkel*", "halvegare*", "hectisch", "heks", "hevig", "huichelachtig*", "hypocriet*", "ijdel", "ijdelheid", "inefficiënt", "jat*", "kinderachtig", "kleinzielig", "knoeiboel", "krankzinnig", "krankzinnigheid", "kregelig*", "kweld*", "kwell*", "labbekak*", "liederlijk*", "lulletje*", "malloot", "mallot*", "manie", "masochis*", "matig", "mensonwaardig", "misbruik", "misdadig", "misser", "misstap*", "moeilijk", "moeilijke", "moeilijkheid", "nadelig", "namaak", "nep", "nerd*", "netelige", "nietszeggend*", "nonchalant", "nooddruftig*", "oen", "oenig*", "onaantrekkelijk", "onaardig*", "onachtza*", "onbeleefd*", "onbeschermd", "ondankba*", "onderdanig*", "ondoelmatig", "oneens", "ongeduld", "ongeduldig", "ongeduldigheid", "ongedurig", "ongemanierd*", "ongunstig", "onheil*", "onhoffelijk*",
"oninteressant*", "onkunde", "onkundig*", "onoprecht*", "onplezierig", "onrecht", "onrechtmatig", "onrechtvaardig", "ontdaan*", "ontvreemd*", "ontzetting", "onvermogen", "onverschillig*", "onvoordelig", "onvriendelijk*", "onwelkom*", "onwellevend*", "onwetend*", "overdreven", "pressie", "prikkelbaar", "profiteur", "puinhoop", "raar", "redeloos", "rot", "rotte", "rotzooi", "saai", "saggerijnig*", "schaamtelo*", "schok", "schokken", "schokkend*", "schokte*", "shock", "sjagrijnig*", "slecht", "slechter", "slechtgehumeurd*", "slechtgezind*", "slechtst*", "slome", "slons", "slonzig*", "sloom", "slordig", "smeerboel", "steel", "steelt", "stelen", "stomkop*", "stommiteit*", "stuitend", "sul*", "tegennatuurlijk*", "tegenvaller", "tegenwerking", "tegenzin*", "tevergeefs", "tik", "trauma*", "triviaal", "triviale", "uitgeblust*", "verbijster*", "verbouwereerd*", "verderf", "verderfelijk*", "verergering", "veronachtza*", "verslechtering", "vervalst", "verveling", "verworden", "vooringenomen", "vooroordeel", "vreemd*", "vreselijk*", "wanordelijk", "wantrouwig", "warboel", "weerzin", "zedenkwetsend*", "zonde", "zuiplap*", "zuipschuit*"]

list_money = ["baten", "domineren", "geruïneerd", "winkelt", "winkelde", "winkelden", "bonus*", "verdien", "verdient", "verdiend*", "lonen", "betalen", "uitkeren", "uitgekeerd", "controle", "verkwist*", "verspil*", "hebzucht", "hebzuchtig", "gebedeld", "winkelen", "opbrengst*", "winst*", "verdienen", "duur", "boodschappen", "gokversla*", "speculatie*", "taxeren", "behoeftig*", "noodlijdend*", "verdonkerema*", "fortuin*", "liefdadig*", "profijtelijk*", "rijk", "rijkdom*", "rijke*", "vrijgevig*", "waarde", "weelde", "aanschaffen", "afdingen", "bedelen", "beleggen", "erft", "erven", "kopen", "kosten", "omkopen", "profiteer", "profiteert", "profiteren", "storten", "uitgeven", "verkopen", "aangeschaft*", "erfde", "erfden", "gefinancierd*", "gekocht*", "geprofiteerd", "gewinkeld", "geërfd", "kocht", "kochten", "profiteerde", "profiteerden", "uitgegeven", "verkocht", "verkochten", "huren", "huur*", "hypotheek*", "hypotheke*", "casino*", "gokkast*", "gokker*", "gokpalei*", "shop*", "rendement*", "aandelen*", "afnemer*", "begrot*", "beurs", "bezoldig*", "boekhoud*", "commerc*", "credit*", "debet", "debiteur*", "dividend*", "econ*", "failliet", "franchise*", "gratificatie", "groothandel*", "handel", "handela*", "inkomsten*", "kapita*", "klant*", "krediet*", "loon", "makelaar*", "markt*", "minimumloon", "onderbeta*", "ondernemer*", "overuren", "overwerk*", "pacht*", "rendabel*", "salaris*", "studiebeur*", "studiefinanciering", "studielening*", "stufi*", "uitbeta*", "uitkering*", "wedde", "zakelijk*", "aanbieding*", "bod", "aalmoes", "aalmoezen", "aanbetalen", "aanbetaling*", "aanbieder*", "aandeelhouder*", "aangekocht*", "aankoop", "aankope*", "aanschaf", "aanschaffing*", "accountant*", "afbeta*", "afgedongen", "afgerekend", "afrekenen", "aftrek", "aftrekpost*", "afzetmarkt*", "alleenverkoop", "arme", "armoed*", "armzalig", "assurantie*", "atm*", "audit", "audits", "bankbiljet*", "banken", "bankklui*", "bankpas*", "bankrekening", "bankroet", "banksaldo",
"bankwezen", "bankzaken", "bedelaar*", "bedeld*", "bedelt", "belasting*", "belegging*", "bemiddeld*", "besparen", "besparing*", "betaal*", "betaler*", "betaling*", "beurzen", "bezuinig*", "bieding*", "biljet*", "bon", "bonnen", "bonnetje", "budget*", "cent", "centen", "centje*", "cheque", "cheques", "chipknip*", "chippen", "compens*", "consume*", "contant", "contanten", "contributie", "controleer*", "controleren", "controles", "corruptie", "coupon*", "daalder*", "deposito*", "detailhandel*", "dinar*", "dinero", "disconto", "distributiebon*", "dobbel*", "donatie*", "doneren", "duiten", "effecten", "effectenmakelaar*", "entreegeld", "erf", "erfenis", "erfgenaam", "erfgename", "euro", "euro's", "eurocent", "factureer*", "facturen", "factuur", "faillissement", "faillissementsuitkering*", "financi*", "fiscus", "fonds*", "fooien", "gechipt*", "gecompenseerd*", "gecontroleerd*", "gefactureerd*", "gefortuneerd*", "gegireerd*", "gehuurd*", "geld", "geldautoma*", "geldelijk*", "geldwaarde*", "geleased", "gepacht*", "gepind*", "geruineerd*", "geruïneerd*", "gespeculeerd*", "gestort*", "gesubsidieerd*", "gewed", "geïnvesteerd", "gift", "giften", "gireer*", "gireren", "giropas*", "girorekening", "goederenmakelaar*", "goedko*", "gokmaffia*", "gokschanda*", "gratis", "gulden*", "honorari*", "honore*", "hoofdprijs", "huishoudkund*", "hypothecair*", "ingekocht*", "inkomen", "inkoop", "inkope*", "investeer", "investeerde", "investeerden", "investeerder", "investeert", "investeren", "investering", "jackpot", "kas", "kasboek", "kleinhandel", "kluis", "kluizen", "koers", "koop*", "kopek", "kopeke", "koper*", "korting*", "kostba*", "kostelo*", "kwartje*", "kwitantie", "lease*", "leenstelsel", "lening*", "lesgeld*", "licentie*", "lire*", "loonstrook", "loterij*", "maandloon", "maandsalaris", "mastercard*", "monopol*", "munt", "munten", "muntgeld", "muntje*", "obligatie*", "omgekocht*", "omkoopbaar", "omzet", "onbetaalbaar", "onderpand", "onkosten",
"onkostenvergoeding*", "opgekocht*", "opkope*", "opruiming", "opslag", "ouderbijdrage*", "overbetaald*", "overgemaakt*", "overmak*", "papiergeld", "penning*", "peseta", "peseta's", "peso", "peso's", "piek", "pinautoma*", "pinde*", "pinnen", "polis*", "portefeuille*", "premie*", "prijs", "prijslijst", "prijsniveau", "prijsstelling", "prijzenniveau", "prijzig", "recessie", "rekening*", "rendabiliteit", "rendeer*", "renderen", "rentab*", "rente*", "restitu*", "rijksdaalder*", "roebel*", "roepie*", "ruilhandel", "ruineer*", "ruineren", "ruïneer*", "ruïneren", "schadeloos*", "schadevergoeding*", "schenking*", "schoolgebouw*", "schoolgeld*", "schuld", "schulden", "shilling*", "spaar*", "specule*", "spilziek*", "staatsschuld*", "statiegeld", "stipendi*", "storting*", "studiegeld*", "studierende*", "subsidi*", "systeemlicentie*", "tarieven", "tegoed*", "terugbeta*", "teruggave", "terugvorder*", "tientje*", "toeslag*", "uitgave", "uitgaven", "uitsparen", "uitverkoop", "uurloon", "valuta", "valutamakelaar*", "veilen", "veiling*", "verarm*", "verbouwt", "verduisteraar*", "vereffen*", "vergoed*", "vergokken", "verhandel*", "verhuren", "verhuur*", "verkoop*", "verkoper*", "verkoping", "verpacht*", "verreken*", "verschuldigd*", "verwed*",
"verzekering*", "visa", "visacard*", "voordeelprij*", "voordeeltje*", "voordelig*", "vrijkaart*", "waarborgsom*", "waard", "waardebon*", "waardepunt*", "wed", "wedden*", "wedder", "wedt", "weekloon", "welgesteld*", "welstand", "winkel", "winkelier*", "wisselgeld", "wisselhandel", "wisselkantoor*", "wisselkoers*", "yen", "yuan", "zuinig*",]
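Both lists follow the usual LIWC dictionary convention: an entry ending in "*" matches any token that begins with the characters before the asterisk, while every other entry requires an exact match. A minimal sketch of such a matcher is shown below; the helper name `liwc_match` and the small sample list are illustrative only and not part of the thesis code.

```python
def liwc_match(token, patterns):
    """Return True if `token` matches any LIWC-style pattern.

    Patterns ending in "*" are prefix wildcards; all other
    patterns must equal the (lowercased) token exactly.
    """
    token = token.lower()
    for pattern in patterns:
        if pattern.endswith("*"):
            # Wildcard entry: compare against the stem before "*".
            if token.startswith(pattern[:-1]):
                return True
        elif token == pattern:
            # Plain entry: whole-word match only.
            return True
    return False

# Illustrative subset of the money category above:
sample = ["geld", "bankrekening", "verdien", "winst*", "spaar*"]

liwc_match("winstgevend", sample)  # True: prefix match on "winst*"
liwc_match("bank", sample)         # False: "bankrekening" needs an exact match
```

A per-category count of such matches over a tokenized sentence is the kind of feature the LIWC produces and that the classifiers in this study consume.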