Pooling Tweets by Fine-Grained Emotions to Uncover Topic Trends in Social Media

Annika Schoene, IBM Research, IBM UK Ltd., Daresbury, United Kingdom ([email protected])
Geeth de Mel, IBM Research, IBM UK Ltd., Daresbury, United Kingdom ([email protected])

Abstract—In this paper, we present a lexicon-based sentiment analysis method that is used as an annotation scheme for identifying fine-grained emotions in social media topics. The methodology is based on Plutchik's wheel of emotion and Latent Dirichlet Allocation (LDA). We first annotate a tweet with eight basic emotions and then compute a further eight dyads as a product of the basic emotions. We demonstrate that this lexicon-based approach achieves up to 78.53% ground-truth accuracy when compared to human-annotated data that is split into positive and negative polarities. Moreover, we investigate a novel means of identifying trending topics in Twitter data by utilizing LDA and focusing on the fine-grained emotions associated with each tweet. We compare the most dominant emotions in social media topics obtained from an emotion-document pooling strategy against those obtained from an author-topic modeling strategy.

Index Terms—Fine-Grained Emotions, Emotion Lexicons, Emotion Fusion, Topic Modelling, Plutchik

I. INTRODUCTION

Given the advancements in technology, cyber social spaces are fast having an impact on the inferences users—be they humans or autonomous systems—make about their surrounding context. We refer to a network of humans and intelligent autonomous systems as a cyber social space [1], where there are links among human societies, among autonomous societies, and between human and autonomous societies. In such settings, as the power of artificial intelligence (AI) capability increases, the influence such systems have on the network will be amplified; there are a multitude of recent examples [2], [3] where autonomous systems were used to polarise a social view or an outcome. Thus, it is important to have means to effectively mine the sentiment—especially the emotional contagion [4]—of such components, and thus of the network.

In order to infer emotions, sentiment analysis is used; it has been defined as a sub-discipline of Natural Language Processing with the aim of determining valence, emotions, and other affectual states from text [5].

Within the field of sentiment analysis, Twitter has become an increasingly popular data source, with mainly machine and deep learning models being submitted to shared tasks [6]. Most of this work has either focused on detecting polarities in a tweet [7] or used data that has been annotated by human annotators for additional information such as semantic interpretations [8]. Detecting fine-grained emotions from micro-blogging platforms such as Twitter has a number of wide-ranging applications, such as tracking public mood [9]. However, this is a difficult task, given that content and dynamics in such platforms vary among topics as well as within the same conversation [10]. Additionally, micro-blogging data has also been used in topic modeling to uncover topics in user conversations [11] and to detect connections between emotions and affective terms [12]; this involves analysing the emotions conveyed towards specific topics and how the emotions of groups change over time [13].

Motivated by this observation, we first explore a lexicon-based approach for fine-grained emotion detection and then propose means to detect topics using the emotions expressed in the annotated tweets. Specifically, we use Latent Dirichlet Allocation (LDA) [14] to detect topics encapsulated in each individual tweet and use insights from our emotion annotation strategy to pool tweets into emotion groups, showing which emotions are most dominant in topics. Finally, we evaluate our emotion annotation strategy against an existing lexicon called SentiWordNet [15], which is frequently used in sentiment analysis tasks and is annotated for polarities. We then compare our topic-based emotion pooling scheme to another successful model, the Author-Topic Model (AT Model) [16].

II. BACKGROUND

Much research has been conducted using basic emotion theory such as Ekman's six emotions [17]. In order to perform a fine-grained analysis of emotions, in this work we use Plutchik's [18] eight basic emotions—and the associated less intense emotions—arranged in a wheel, as shown in Figure 1. The key ideas of this theory are that (a) basic emotions have polar opposites (e.g., joy vs. sadness); (b) emotions can have different intensities (e.g., rage is more intense than anger); and (c) emotions can be mixed, or have derivative states called dyads (e.g., anger and disgust together result in contempt).

Work by Kiritchenko et al. developed two lexicons for sentiment analysis of short informal messages, based on supervised text classification using Support Vector Machines [19]. Moreover, Abdul-Mageed and Ungar [20] derived Plutchik's eight basic emotions from a self-collected dataset of 24 fine-grained emotions using gated recurrent neural networks. Rosen-Zvi et al. developed the author-topic model, an extension of LDA in which the model includes information about authors [16]. Mehrotra et al. looked at pooling Twitter topics, using unmodified LDA models to aggregate tweets [27]; this is mainly because, in micro-blogs, message quality is variable and the language used is more colloquial compared with traditional documents.

Detecting topics in social media and their emotional content has a number of different applications—e.g., it has been argued that, due to constant political and social changes, there is a need to better understand the role of social media and how it influences individuals and groups [29]; a statistical model for ideological discourse has also been proposed, where it is hypothesized that some words are used more frequently when people speak about an ideological topic or hold a specific ideological perspective [28].

a) Lexicon- and Knowledge-based Approaches: One of the most commonly used resources for lexicon-based sentiment analysis is WordNet Affect [21], which has annotations for affective states such as positive, negative and neutral polarities or emotions. Another popular lexicon is SentiWordNet [22], which also has annotations for polarities and has been used in a number of different studies, such as the derivation of prior polarities [15]. Taboada et al. developed a lexicon-based approach to infer sentiment from text [23]; their system is referred to as SO-CAL (Semantic Orientation CALculator).

b) Sentiment Analysis (SA) for Topic Modeling: A promising approach to detecting emotions in topics is the use of LDA, where the focus is on reviews (or blogs) with multiple sentiments that are calculated over the document [14]. Typically, the approach is aimed at detecting pre-determined sets of emotions [12] or Ekman's six basic emotions [24]. Thus, many SA models in this area amended the prior knowledge of the models with different lexicons or word lists, and have therefore been deemed domain- or topic-dependent as well as not easily transferable [25]. The work by Bao et al. uses LDA to discover topics from emotions and then generate affective terms from those topics [12]; the data for this study was taken from a popular Chinese social media platform called Sina, where users can rate entries based on eight given emojis. Kim et al. focused on detecting emotional influences and patterns in unannotated Twitter conversations using so-called sentence-LDA, based on Plutchik's emotion theory with a lexicon [26]. Pooling tweets by such meta-data has also been shown to increase the interpretability of topic models [27].

Fig. 1. Plutchik's Wheel of Emotion

III. APPROACH

In order to infer fine-grained emotions and to identify the emerging topics w.r.t. those emotions, we do the following: (a) we employ a lexicon-based approach for fine-grained emotion detection in tweets—both for text content and hashtags; (b) we then devise a strategy to infer added emotions, so that additional information is available to deduce the polarity of the identified emotions; and (c) we employ means to detect topics using the emotions expressed in the annotated tweets, so as to pool tweets into emotion groups and show which emotions are most dominant in topics. For our emotion annotation scheme we used both EmoLex—i.e., the National Research Council Canada (NRC) Emotion Word lexicon—which was manually created by [30], and the automatically created NRC hashtag lexicon by [31].

A. Tweet Annotation

We annotated each word of a tweet through the emotion lexicon, so that each tweet has a number of basic emotions associated with it. We then analyzed which emotions were predominantly used in each tweet through Algorithm 1. Building on this, we developed three different strategies for finding the dominant emotion of each tweet. For every tweet that had only one basic emotion, we kept this emotion as the main indicator. For two or more dominant emotions, we checked whether there are any matches between the basic emotions which can create a less strong dyad, as per Algorithm 2. Moreover, for tweets with only two highest emotions, the algorithm searches for a match with the second highest emotion. For tweets that had no single basic emotion or no matches, we used the emotions expressed by hashtags to detect the emotions. For example, let us assume the tweet "I hate dogs but love cats #truetalk". After steps 2–9, the emotionlist contains {(anger:3), (love:1), (disgust:2), (trust:0)}. After steps 10–19, we have (anger:3), (love:1), and (disgust:2) as our three highest emotions. In the case of the emotion values being zero in the above example, we would have no emotion annotation; however, we would then consider the relevant hashtags for potential emotions through Algorithm 3. Additionally, if a tweet has more than six distinct emotions as the dominant emotion, or no annotation at all, we discarded the tweet—this is because it indicates that (a) the number of potential dyad matches increases exponentially, so that there is no clear dominant emotion; or (b) none of the words used in the tweet are present in the lexicon. Furthermore, we argue that not all tweets carry the intensity of a basic emotion: there might be multiple basic emotions of similar polarity expressed in one tweet, while the true sentiment of the tweet is a less strong dyad emotion. Taking the above example, let us assume the emotionlist returns the following emotions: {(anger:3), (love:1), (disgust:3), (trust:0)}. In this case, Algorithm 2 will check whether there is a match between the two highest emotions, anger and disgust, which is true for this example, and contempt is returned. Finally, if a negation such as "not" is present in the tweet, we use Algorithm 4 to invert the emotion. An example of this would be a tweet whose original annotation is joy but whose annotation is inverted to sadness due to the negation.

input : Preprocessed Tweets
output: Tweets annotated for three most dominant basic emotions
   /* Each word in a tweet is annotated through a lexicon */
 1 emotionlist = []
 2 for (word in Tweet) {
 3   if word in Lexicon then
 4     map word to emotion;
 5     add each emotion to emotionlist;
 6   else
 7     continue;
 8   end
 9 }
   /* Count how many times emotions occur per tweet */
10 for (emotion in emotionlist) {
11   if emotion value > 0 then
12     count each emotion;
13     return three highest emotions per tweet;
14   else if emotion = 0 or emotion = same value then
15     return None;
16   else
17     return None;
18   end
19 }
Algorithm 1: Emotion Annotation

B. Hashtag Annotation

We treated hashtags separately from the main tweet content, as previous research has shown that hashtags can emphasize emotions or change the sentiment of presumably neutral content [31]. Furthermore, it has been argued that hashtags can be seen as annotations provided by a user for topics and emotions [32]. Therefore, we employed a similar approach—i.e., after preprocessing tweets for emotion annotations, we used hashtags as the main indicator where the main tweet did not return a main basic emotion or dyad (Algorithm 3).
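As a concrete illustration of Algorithm 1, the per-word lookup and counting can be sketched in Python as follows. The toy `LEXICON` is a hypothetical stand-in for the NRC EmoLex, and tie-breaking between equally frequent emotions is simplified (Algorithm 1 returns None on ties, whereas this sketch breaks ties by insertion order):

```python
from collections import Counter

# Toy stand-in for the NRC EmoLex: each word maps to the basic
# emotions it evokes (hypothetical entries, for illustration only).
LEXICON = {
    "hate": ["anger", "disgust"],
    "love": ["joy", "trust"],
}

def annotate(tokens, lexicon):
    """Algorithm 1 sketch: map each in-lexicon word to its emotions,
    count occurrences, and return the three most frequent emotions
    (or None, so the hashtag fallback of Algorithm 3 can take over)."""
    counts = Counter()
    for word in tokens:
        for emotion in lexicon.get(word, []):  # out-of-lexicon words are skipped
            counts[emotion] += 1
    if not counts:
        return None
    return [emotion for emotion, _ in counts.most_common(3)]
```

For the running example "I hate dogs but love cats", only "hate" and "love" contribute counts; a tweet with no lexicon hits returns None and is handed to the hashtag annotation step.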

input : Tweets annotated for basic emotions
output: Tweets annotated for basic emotions and dyads
  /* Dyads are calculated for tweets that have more than one basic emotion */
1 for (emotion in emotions) {
2   if emotion = 1 then
3     return basic emotions;
4   else if emotion ≤ 2 then
5     return dyad;
6   else
7     return None;
8   end
9 }
Algorithm 2: Calculating Dyads

input : Tweets with no emotion annotation
output: Tweets with emotion annotation through hashtag
  /* Tweets are annotated for emotions through a hashtag lexicon */
1 for (hashtag in Tweet) {
2   if hashtag in Lexicon then
3     map hashtag to emotion;
4     return each emotion with value 1;
5   else
6     return None;
7   end
8 }
Algorithm 3: Emotion annotation with hashtags

C. Dealing with Negations in Tweets

Negations play an important role when detecting emotions in textual data. Previous work has inverted polarities for each tweet; however, there is evidence that this technique is not always applicable [19]. For the purpose of this work we still inverted emotions, as this is a different approach compared to the polarity inversion in the previously mentioned work. We inverted the dominant emotion and dyad of each tweet if a negation occurred, as per Algorithm 4, adhering to the axes of Plutchik's wheel of emotion [33].

  /* negations contains a list of negations in English; emotionpairs is a
     dictionary of opposing emotions according to [33] */
input : Tweets annotated for basic emotions and dyads, negations, emotionpairs
output: Final tweet annotation
1 for (word in Tweet) {
2   if word in negations then
3     invert emotion according to emotionpairs;
4     return inverted emotion;
5   else
6     do nothing;
7   end
8 }
Algorithm 4: Negations in tweets

IV. EVALUATION: EMOTION TAGGING

In order to evaluate our emotion tagging approach, especially the inference of dyads, we used three datasets, two of which already had some ground-truth information in the form of polarities.

A. Data

We used three different Twitter datasets for our experiments. The first dataset is a collection of tweets released by NBC [34] from Russian chat bots that sought to influence the 2016 U.S. elections (henceforth referred to as Russia Data); this dataset has no annotation of polarities or fine-grained sentiments. The second dataset contains tweets collected about the weather (henceforth referred to as Weather Data), which has been annotated for three different polarities—i.e., positive, negative and neutral—as well as a category called "not related to content" [35]. The last Twitter dataset (henceforth referred to as 104K Data) was collected based on query terms on varying topics, where only positive and negative content was considered [36]. Figure 2 shows how many emotions are present in each dataset when the emotion annotation strategy is applied.

B. Preprocessing

It has been noted that extensive text preprocessing and labeling usually takes place prior to conducting experiments, where longer documents are broken down into sentences [25]. We therefore applied similar text preprocessing methods to our tweets, which included tokenization, lower-casing all tokens, and removing punctuation and stopwords. However, it is important to note that we excluded negations from the stopword removal process, as negations in a tweet could invert the overall emotions expressed in the tweet. Furthermore, we anonymised usernames and removed retweets, duplicates and web references. In order to take advantage of hashtags, we preprocessed hashtags separately by first splitting the '#' from the full string and then splitting the remaining string into possible words using word lists; we used the first suggested option for further analysis. The Weather and 104K datasets were already preprocessed, thus there was not a significant drop in tweets after our preprocessing. Table I outlines general information on the datasets and gives an overview of their sizes after each preprocessing step was performed.

TABLE I: DATASETS AFTER PRE-PROCESSING

Number of                Weather   Russia    104K
Original tweets          203482    1600000   1000
Preprocessed tweets      192558    1600000   1000
Tweets with annotations  133220    963844    737

In Figure 2 we show the distributions of basic emotions and dyads across the three datasets. In order to compare our emotion tagging approach and its accuracy, we split the emotions into two different polarities, as depicted in Table II. Whilst we first considered splitting emotions into three categories (positive, negative and neutral), we decided, for the purpose and ease of this evaluation, to split emotions into two categories, because there is some lack of clarity around the polarity categories or orientations associated with some emotions (e.g., surprise) [37].

TABLE II: EMOTION SPLIT INTO POLARITIES

Polarity  Emotions
Positive  Joy, Trust, Love, Optimism
Negative  Sadness, Disgust, Remorse, Disapproval, Submission, Fear, Awe, Surprise, Contempt, Anger, Aggressiveness, Anticipation

Table III shows the ground truth of the human-annotated Weather dataset and 104K dataset compared to our own emotion split, as well as a comparison to SentiWordNet. We found that for positive polarities we achieved similar accuracies; however, for negative polarities we found a distinct difference: for the Weather data, accuracy is significantly higher compared to the positive polarities and to the 104K data results. The results in Table III show that, based on our emotion split, our approach performs better on negative emotions and SentiWordNet better on positive emotions. Furthermore, it can be seen that our approach performs significantly better at detecting positive polarities than SentiWordNet does at detecting negative polarities. However, it has to be taken into consideration that, according to our emotion split, there are more negative than positive emotions, and that our approach can be broken down further into more fine-grained categories.

Fig. 2. Emotion Distribution in %

V. EVALUATION: TOPIC MODELING WITH EMOTIONS

For our experiments we used LDA [14] to detect topics in our datasets. We conduct three different experiments to show (a) how tweet topics change when tweets are treated individually compared to as a pooled document; and (b) which emotions are dominant in each topic for each dataset.
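Looking back at the annotation pipeline of Section III, the dyad fusion of Algorithm 2 and the negation inversion of Algorithm 4 can be sketched in Python as follows. The dyad table lists Plutchik's primary dyads (matching those in Table II), and for brevity the inversion map covers only the eight basic emotions, not the dyads:

```python
# Primary dyads of Plutchik's wheel: adjacent basic emotions fuse into
# the dyads listed in Table II.
DYADS = {
    frozenset({"joy", "trust"}): "love",
    frozenset({"trust", "fear"}): "submission",
    frozenset({"fear", "surprise"}): "awe",
    frozenset({"surprise", "sadness"}): "disapproval",
    frozenset({"sadness", "disgust"}): "remorse",
    frozenset({"disgust", "anger"}): "contempt",
    frozenset({"anger", "anticipation"}): "aggressiveness",
    frozenset({"anticipation", "joy"}): "optimism",
}

# Opposing emotions along the axes of the wheel, as used by Algorithm 4.
OPPOSITES = {
    "joy": "sadness", "sadness": "joy",
    "trust": "disgust", "disgust": "trust",
    "fear": "anger", "anger": "fear",
    "surprise": "anticipation", "anticipation": "surprise",
}

def fuse(top_emotions):
    """Algorithm 2 sketch: a single emotion is kept as-is; two emotions
    are looked up as a dyad (None if not adjacent on the wheel)."""
    if len(top_emotions) == 1:
        return top_emotions[0]
    if len(top_emotions) == 2:
        return DYADS.get(frozenset(top_emotions))
    return None

def negate(emotion, tokens, negations=("not", "no", "never")):
    """Algorithm 4 sketch: invert the annotation if a negation occurs."""
    if any(t in negations for t in tokens):
        return OPPOSITES.get(emotion, emotion)
    return emotion
```

For the running example, fusing anger and disgust yields contempt, and a tweet annotated joy that contains "not" is inverted to sadness.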

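The preprocessing pipeline of Section IV-B (tokenization, lower-casing, punctuation and stopword removal that retains negations, username anonymisation, link removal, and hashtag splitting via a word list) might be sketched as follows; the stopword and negation lists here are illustrative, not the ones used in the experiments:

```python
import re
import string

# Illustrative stopword and negation lists (not the paper's actual
# lists); negations are deliberately kept so Algorithm 4 can still fire.
STOPWORDS = {"the", "a", "an", "is", "it", "and", "but", "not", "no"}
NEGATIONS = {"not", "no", "never"}
# Strip punctuation but keep '#' (hashtags) and '@' (anonymised users).
PUNCT = string.punctuation.replace("#", "").replace("@", "")

def preprocess(tweet):
    """Tokenize, lower-case, drop punctuation and stopwords (keeping
    negations), anonymise usernames, and remove web references."""
    tweet = re.sub(r"https?://\S+", "", tweet)   # remove links
    tweet = re.sub(r"@\w+", "@user", tweet)      # anonymise usernames
    tokens = tweet.lower().translate(str.maketrans("", "", PUNCT)).split()
    return [t for t in tokens if t not in STOPWORDS or t in NEGATIONS]

def split_hashtag(tag, wordlist):
    """Greedy longest-match segmentation of a hashtag against a word
    list; the first suggested segmentation is kept, as in the paper."""
    tag, words, i = tag.lstrip("#").lower(), [], 0
    while i < len(tag):
        for j in range(len(tag), i, -1):
            if tag[i:j] in wordlist:
                words.append(tag[i:j])
                i = j
                break
        else:
            return [tag]                          # no segmentation found
    return words
```

The greedy segmentation is one plausible reading of "split strings into possible words by using word lists"; any segmenter that returns a ranked list of candidate splits would fit the description equally well.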
TABLE III: EMOTION ANNOTATION GROUND TRUTH IN %

Polarity                      Weather Data   104K Data
Positive (Annotation Scheme)  57.63          55.73
Negative (Annotation Scheme)  78.53          76.38
Positive (SentiWordNet)       65.53          79.12
Negative (SentiWordNet)       34.46          41.58

• Experiment 1: Each tweet is treated as an individual document to extract dominant topics.
• Experiment 2: All tweets belonging to one emotion annotation are combined into one document to extract dominant topics.
• Experiment 3: The Author-Topic Model is applied to see the difference in topics.

The reason for pooling tweets into one document is to overcome the issue of sparsity in Twitter data—each available tweet is shorter than the traditional documents used for training LDA models. We show that pooling tweets by emotion produces coherent topics, for both sparse and less sparse data, which can clearly be associated with emotions. Finally, we compare our pooling strategy to the Author-Topic Model; we have chosen this model as parallels can be drawn between the author of a document and the dominant emotion of a tweet. We used topic coherence scores to evaluate the results of our models, using the UCI coherence measure implemented in the Gensim library [38].

After obtaining all emotion annotations for each dataset, we used all tweets that had a first basic emotion and dyads in each model. Some tweets have more than one basic emotion assigned to them; however, in order to establish a baseline model, we only consider tweets with one dominant emotion.

a) Experiment 1: Table IV shows the topics for each dataset, where the number of topics was based on the best coherence score for each dataset. Furthermore, each topic was named intuitively, based on the words that occurred most commonly in that topic.

TABLE IV: RESULTS EXPERIMENT 1 - DATASETS 1, 2 AND 3

Topic  Weather            Russia                    104K
1      storm/tornado      support America           watching Show
2      hot/sunshine       terrorism                 thankfulness
3      weather/storm      attacks email             friends
4      rain/              guns                      another year
5      sunny/may          black rights              well wishes
6      weather/link/user  trending news             weekend/music
7      mixed weather      police violence           game/luck
8      weather warning    vote white house          school
9      -                  time                      beautiful summer
10     -                  presidential candidates   time/work

b) Experiment 2: After pooling each tweet by its emotion annotation, we conducted further experiments to see which topic is most dominant for each emotion (Table V). We did not use a minimum token count or a maximum frequency in our experiments with the Weather Data, because there is only a small number of tweets and a total of 2721 tokens. For the experiments with the Russia data, we adjusted the minimum count of a word to 3 and the maximum frequency of occurrence to 0.5, where the maximum number of tokens was set to 7508. For the experiments with the 104K dataset, we used 100000 tokens and likewise adjusted the maximum occurrence frequency to 0.5 and the minimum occurrence of a word to 3.

TABLE V: RESULTS EXPERIMENT 2 - MOST COMMON WORDS FOR THE MOST DOMINANT TOPIC

Emotion         Weather Data           Russia Data           104K Data
Joy             sunshine/beautiful     Trump/supporter       love/baby
Trust           snow/patrol            president/Trump       school/follower
Fear            tornado/hot            war/police            change/homework
Surprise        weather/chance         Trump/Donald          Trump/Donald
Sadness         weather/rainy          black/refugee         lost/cry
Disgust         cold/weather           Trump/Clinton         boy/crap
Anger           morning/patchy         Hillary/Obama/hit     hot/hit
Anticipation    time/storm/tomorrow    time/Trump/start      time/tomorrow
Love            sunshine/weather       Trump/Supporter       wonderful/smile
Submission      link/user              Clinton/Trump         hospital/hurt
Awe             morning/patchy day     Hillary/#politics     good/day
Disapproval     storm/weather          Clinton/black/vote    no good/don't like
Remorse         hope/sunny/forecast    Trump/#tcot           sick/shame
Contempt        window/freezing        Clinton/criminal      horrible/damn
Aggressiveness  weather/storm          against/Hillary       day/get
Optimism        sunny/day              Trump/young/gift      good/hope

We observed in the results of Experiment 2 that there was an overlap of words used in the topics of basic emotions and dyads that are close to each other on the wheel of emotion. This may indicate that (1) topics might be associated with similar emotions, and (2) words used in certain topics may be specific to corresponding emotions.
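The emotion-document pooling of Experiment 2 can be made concrete with a stdlib-only sketch: tweets sharing a dominant emotion are concatenated into one document, and the vocabulary is then trimmed with a minimum word count of 3 and a maximum document frequency of 0.5 (the settings reported for the Russia data). In the actual experiments the pooled documents are fed to Gensim's LDA and scored with its UCI coherence measure [38]; this sketch only shows the pooling and filtering steps:

```python
from collections import Counter, defaultdict

def pool_by_emotion(annotated_tweets):
    """Combine all tweets sharing a dominant emotion into one document.
    annotated_tweets is an iterable of (tokens, emotion) pairs."""
    pools = defaultdict(list)
    for tokens, emotion in annotated_tweets:
        pools[emotion].extend(tokens)
    return dict(pools)

def filter_vocabulary(pools, min_count=3, max_doc_freq=0.5):
    """Keep words occurring at least min_count times overall and in at
    most max_doc_freq of the pooled emotion-documents."""
    totals, doc_freq = Counter(), Counter()
    for tokens in pools.values():
        totals.update(tokens)
        doc_freq.update(set(tokens))
    n_docs = len(pools)
    keep = {w for w, c in totals.items()
            if c >= min_count and doc_freq[w] / n_docs <= max_doc_freq}
    return {e: [w for w in tokens if w in keep] for e, tokens in pools.items()}
```

Words that appear in most emotion-documents (e.g., a dataset-wide word like "weather") are filtered out, so each pooled document retains the vocabulary that distinguishes its emotion.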

These notions also correspond to the findings obtained with the AT model in Experiment 3.

c) Experiment 3: For our experiments with the AT model, we replaced the authors of a document with our emotion annotations, which means that each tweet was treated as a document. Our results show that it is possible to use the AT model for detecting emotions in topics; however, there are some observations to be made w.r.t. the experiments conducted with the AT model. Firstly, when visualizing our AT model results, we found that in many cases emotions overlapped with topics to the extent that multiple topics were associated with one emotion. This meant that it was not possible to extract coherent or dominant topics with the associated emotions. When comparing the results in Table VI to those in Table V, it can be seen that we get more detailed insight into the topics for each emotion. One of the reasons for this could be that the AT model was originally tailored to larger documents, and therefore tailoring it further may yield more coherent topics in the future.

TABLE VI: NUMBER OF TOPICS FOR EACH DATASET USING THE AT MODEL

Topic  Title
1      storm/hot
2      user/link
3      storm/coming/area
4      sunny weather
5      snow/patrol
6      rainy day
7      degrees/thunderstorm
8      tornado/warning

VI. DISCUSSION

Whilst the main purpose of this work was to develop a baseline system to detect fine-grained emotions with regard to topics on social media, it also demonstrates the knowledge one can extract—especially from the Russia dataset. Below we summarise a number of findings that we believe are useful in better understanding such diverse emotions and that provide an implicit measure of influence.

• Through our emotion pooling strategy, we found that positive emotions are associated with Donald Trump, whilst negative emotions often included references to Hillary Clinton or Barack Obama, as per Table V. Similarly, we found that topics such as black rights, crime and war were associated with emotions such as fear, sadness and disapproval.
• We also found that a few usernames were frequently referenced in some topics that we later correlated with emotions: @realdonaldtrump was frequently mentioned in tweets associated with positive emotion topics, which included terms such as #makeamericagreatagain, GoP, Voter and Supporter, whilst @HillaryClinton was frequently mentioned with terms such as #neverhillary, illegal, email and debate, and was associated with emotions such as contempt, disapproval and aggressiveness.
• Last but not least, a black news agency was frequently mentioned with terms such as freedom, history, Islam and Hillary Clinton; these topics were closely associated with emotions such as sadness, fear and contempt.

Thus, it can be argued that using sentiment analysis and topic modeling in combination can give detailed insight into large datasets and help detect trends.

VII. CONCLUSION AND FUTURE WORK

In this work we have shown that it is possible to use a lexicon to determine fine-grained emotions in tweets. Moreover, we have demonstrated that it is possible to achieve a ground-truth accuracy of 78.53% when comparing our annotation algorithm against a human-annotated dataset. We have shown that pooling tweets by their main emotions allows us to see more coherent topics compared to treating each tweet as a document. Future work will further investigate lexicons and other knowledge-based approaches to develop domain-independent fine-grained emotion annotation systems. Additionally, we will investigate how to best integrate additional meta-data into topic models, such as the time when a tweet was posted, in order to see how emotions in topics change over time.

ACKNOWLEDGMENT

This work was supported by the STFC Hartree Centre's Innovation Return on Research programme, funded by the Department for Business, Energy & Industrial Strategy.

This research was sponsored by the U.S. Army Research Laboratory and the U.K. Ministry of Defence under Agreement Number W911NF-16-3-0001. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Army Research Laboratory, the U.S. Government, the U.K. Ministry of Defence or the U.K. Government. The U.S. and U.K. Governments are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

REFERENCES

[1] A. Sheth, P. Anantharam, and C. Henson, "Physical-cyber-social computing: An early 21st century approach," IEEE Intelligent Systems, no. 1, pp. 78–82, 2013.
[2] P. Wang, R. Angarita, and I. Renna, "Is this the era of misinformation yet? combining social bots and fake news to deceive the masses," in The 2018 Web Conference Companion, 2018.
[3] E. Hunt, "Tay, microsoft's ai chatbot, gets a crash course in racism from twitter," The Guardian, vol. 24, 2016.
[4] J. Jouhki, E. Lauk, M. Penttinen, N. Sormanen, and T. Uskali, "Facebook's emotional contagion experiment as a challenge to research ethics," Media and Communication, vol. 4, 2016.
[5] S. M. Mohammad, "Sentiment analysis: Detecting valence, emotions, and other affectual states from text," in Emotion Measurement. Elsevier, 2016, pp. 201–237.
[6] S. Rosenthal, P. Nakov, S. Kiritchenko, S. Mohammad, A. Ritter, and V. Stoyanov, "Semeval-2015 task 10: Sentiment analysis in twitter," in Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), 2015, pp. 451–463.
[7] E. Kouloumpis, T. Wilson, and J. D. Moore, "Twitter sentiment analysis: The good the bad and the omg!" ICWSM, vol. 11, no. 538-541, p. 164, 2011.
[8] H. Saif, Y. He, and H. Alani, "Semantic sentiment analysis of twitter," in International Semantic Web Conference. Springer, 2012, pp. 508–524.
[9] J. Bollen, H. Mao, and A. Pepe, "Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena," ICWSM, vol. 11, pp. 450–453, 2011.
[10] Y. Wang, E. Agichtein, and M. Benzi, "Tm-lda: efficient online modeling of latent topic transitions in social media," in Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2012, pp. 123–131.
[11] D. Alvarez-Melis and M. Saveski, "Topic modeling in twitter: Aggregating tweets by conversations," ICWSM, vol. 2016, pp. 519–522, 2016.
[12] S. Bao, S. Xu, L. Zhang, R. Yan, Z. Su, D. Han, and Y. Yu, "Joint emotion-topic modeling for social affective text mining," in Data Mining, 2009. ICDM'09. Ninth IEEE International Conference on. IEEE, 2009, pp. 699–704.
[13] S. Stieglitz and L. Dang-Xuan, "Social media and political communication: a social media analytics framework," Social Network Analysis and Mining, vol. 3, no. 4, pp. 1277–1291, 2013.
[14] D. M. Blei, A. Y. Ng, and M. I. Jordan, "Latent dirichlet allocation," Journal of Machine Learning Research, vol. 3, no. Jan, pp. 993–1022, 2003.
[15] M. Guerini, L. Gatti, and M. Turchi, "Sentiment analysis: How to derive prior polarities from sentiwordnet," arXiv preprint arXiv:1309.5843, 2013.
[16] M. Rosen-Zvi, T. Griffiths, M. Steyvers, and P. Smyth, "The author-topic model for authors and documents," in Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence. AUAI Press, 2004, pp. 487–494.
[17] P. Ekman, "An argument for basic emotions," Cognition & Emotion, vol. 6, no. 3-4, pp. 169–200, 1992.
[18] R. Plutchik, "The nature of emotions: Human emotions have deep evolutionary roots, a fact that may explain their complexity and provide tools for clinical practice," American Scientist, vol. 89, no. 4, pp. 344–350, 2001.
[19] S. Kiritchenko, X. Zhu, and S. M. Mohammad, "Sentiment analysis of short informal texts," Journal of Artificial Intelligence Research, vol. 50, pp. 723–762, 2014.
[20] M. Abdul-Mageed and L. Ungar, "Emonet: Fine-grained emotion detection with gated recurrent neural networks," in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), vol. 1, 2017, pp. 718–728.
[21] C. Strapparava, A. Valitutti et al., "Wordnet affect: an affective extension of wordnet," in LREC, vol. 4. Citeseer, 2004, pp. 1083–1086.
[22] S. Baccianella, A. Esuli, and F. Sebastiani, "Sentiwordnet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining," in LREC, no. 2010, 2010, pp. 2200–2204.
[23] M. Taboada, J. Brooke, M. Tofiloski, K. Voll, and M. Stede, "Lexicon-based methods for sentiment analysis," Computational Linguistics, vol. 37, no. 2, pp. 267–307, 2011.
[24] C. O. Alm, D. Roth, and R. Sproat, "Emotions from text: machine learning for text-based emotion prediction," in Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2005, pp. 579–586.
[25] C. Lin and Y. He, "Joint sentiment/topic model for sentiment analysis," in Proceedings of the 18th ACM Conference on Information and Knowledge Management. ACM, 2009, pp. 375–384.
[26] S. Kim, J. Bak, and A. H. Oh, "Do you feel what i feel? social aspects of emotions in twitter conversations," in ICWSM, 2012.
[27] R. Mehrotra, S. Sanner, W. Buntine, and L. Xie, "Improving lda topic models for microblogs via tweet pooling and automatic labeling," in Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2013, pp. 889–892.
[28] W.-H. Lin, E. Xing, and A. Hauptmann, "A joint topic and perspective model for ideological discourse," in Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 2008, pp. 17–32.
[29] J. Skinner, "Social media and revolution: The arab spring and the occupy movement as seen through three information studies paradigms," Working Papers on Information Systems, vol. 11, no. 169, pp. 2–26, 2011.
[30] S. M. Mohammad and P. D. Turney, "Crowdsourcing a word–emotion association lexicon," Computational Intelligence, vol. 29, no. 3, pp. 436–465, 2013.
[31] S. M. Mohammad and S. Kiritchenko, "Using hashtags to capture fine emotion categories from tweets," Computational Intelligence, vol. 31, no. 2, pp. 301–326, 2015.
[32] X. Wang, F. Wei, X. Liu, M. Zhou, and M. Zhang, "Topic sentiment analysis in twitter: a graph-based hashtag sentiment classification approach," in Proceedings of the 20th ACM International Conference on Information and Knowledge Management. ACM, 2011, pp. 1031–1040.
[33] R. Plutchik, "Emotions: A general psychoevolutionary theory," Approaches to Emotion, vol. 1984, pp. 197–219, 1984.
[34] B. Popken. (2018) Twitter doesn't make it easy to track Russian propaganda efforts — this database can help. [Online]. Available: https://www.nbcnews.com/tech/social-media/now-available-more-200-000-deleted-russian-troll-tweets-n844731
[35] CrowdFlower. (2016) Weather sentiment. Accessed: 2018-06-30. [Online]. Available: https://data.world/crowdflower/weather-sentiment
[36] A. Go, R. Bhayani, and L. Huang, "Twitter sentiment classification using distant supervision," CS224N Project Report, Stanford, vol. 1, no. 12, 2009.
[37] E. Cambria, D. Das, S. Bandyopadhyay, and A. Feraco, A Practical Guide to Sentiment Analysis. Springer, 2017, vol. 5.
[38] R. Rehurek and P. Sojka, "Gensim–python framework for vector space modelling," NLP Centre, Faculty of Informatics, Masaryk University, Brno, Czech Republic, vol. 3, no. 2, 2011.