
ECNU at SemEval-2018 Task 1: Intensity Prediction Using Effective Features and Machine Learning Models

Huimin Xu1, Man Lan1,2*, Yuanbin Wu1,2
1Department of Computer Science and Technology, East China Normal University, Shanghai, P.R. China
2Shanghai Key Laboratory of Multidimensional Information Processing
[email protected], {mlan,ybwu}@cs.ecnu.edu.cn

Abstract

In this paper we describe our systems submitted to SemEval-2018 Task 1 "Affect in Tweets" (Mohammad et al., 2018). We participated in all subtasks on English tweets, including emotion intensity classification and quantification, and valence intensity classification and quantification. In our systems, we extracted four types of features, including linguistic, sentiment lexicon, emotion lexicon and domain-specific features, then fed them to different regressors, and finally combined the models into an ensemble for better performance. Officially released results showed that our system can be further extended.

1 Introduction

SemEval-2018 Task 1 aims to automatically determine the intensity of emotions of tweeters from their tweets, and comprises five subtasks. That is, given a tweet and one of four emotions (anger, fear, joy, sadness), subtasks 1 and 2 are to determine the intensity of that emotion and to classify the tweet into one of four ordinal classes of emotion intensity, respectively. Similarly, subtasks 3 and 4 determine the intensity of valence and classify the tweet into one of seven ordinal classes of valence intensity. Subtask 5 is a multi-label emotion classification task which classifies a tweet as neutral (no emotion) or as expressing one or more of eleven given emotions (anger, anticipation, disgust, fear, joy, love, optimism, pessimism, sadness, surprise, trust) that best represent the mental state of the tweeter. For each subtask, training and test datasets are provided for English, Arabic, and Spanish tweets. We participated in all subtasks on English tweets.

Traditional sentiment classification is a coarse-grained task in sentiment analysis which focuses on sentiment polarity classification of the whole sentence (i.e., positive, negative, neutral, mixed). SemEval-2018 Task 1 subtask 5 instead takes the basic human emotions proposed by Ekman (Ekman, 1999) into consideration, including Anger, Anticipation, Disgust, Fear, Joy, Sadness, Surprise, and Trust.

The difference between these subtasks lies in the emotion granularity and in whether classification or quantification is required, so we adopt a similar method for all five subtasks. We extracted a rich set of elaborately designed features: in addition to linguistic features, sentiment lexicon features and emotion lexicon features, we also extracted some domain-specific features. We also conducted a series of experiments on different machine learning algorithms and ensemble methods to obtain the best-performing configuration for each subtask. For subtask 5, we adopted multiple binary classifications and constructed one model per emotion.

2 System Description

We first performed data preprocessing, then extracted several types of features from tweets and constructed supervised models for this task.

2.1 Data Preprocessing

Firstly, all words are converted to lower case; URLs are replaced by "url"; and abbreviations, slang and elongated words are transformed to their normal form. Then, emojis are replaced by their corresponding names using the "Emoji Library" (https://github.com/fvancesco/emoji/). Finally, we use the Stanford CoreNLP tools (Manning et al., 2014) for tokenization, POS tagging, named entity recognition (NER) and parsing.
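The following minimal sketch illustrates this pipeline, assuming a tiny hypothetical slang dictionary; the open-source `emoji` package stands in for the Emoji Library above, and the Stanford CoreNLP steps (tokenization, POS tagging, NER, parsing) are not shown:

```python
import re

import emoji  # pip install emoji; stand-in for the "Emoji Library" above

# Hypothetical slang/abbreviation map; the real resource is much larger.
SLANG = {"u": "you", "gr8": "great", "pls": "please"}

def preprocess(tweet: str) -> str:
    text = tweet.lower()                          # lower-case all words
    text = re.sub(r"https?://\S+", "url", text)   # replace URLs by "url"
    text = re.sub(r"(\w)\1{2,}", r"\1\1", text)   # squeeze elongations: "sooooo" -> "soo"
    text = " ".join(SLANG.get(t, t) for t in text.split())  # normalize slang
    return emoji.demojize(text, delimiters=(" ", " "))      # emojis -> their names

print(preprocess("Sooooo happy, gr8 news!! 😂 https://t.co/abc"))
```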

2.2 Feature Engineering

We extracted a set of features to construct supervised models for the five subtasks, namely linguistic features, sentiment lexicon features, emotion lexicon features and domain-specific features.

2.2.1 Linguistic Features

• Lemma unigram: Considering that "anger" and "angers" express similar emotion intensity, we choose word lemma unigram features from tweets rather than word unigram features.

• Negation: Negation in a sentence often affects its sentiment orientation and conveys its sentiment intensity. For example, a sentence with several negation words is more inclined to negative sentiment polarity. Following previous work (Zhang et al., 2015), we manually collected 29 negations (https://github.com/haierlord/resource) and designed two binary features: one indicates whether there is any negation in the tweet, and the other records whether the tweet contains more than one negation.

• NER: Given a tweet such as "@JackHoward the Christmas episode genuinely had me in tears of laughter", useful information like a person name or a festival may convey the tweeter's emotions. So we extracted 12 types of named entities (DURATION, SET, NUMBER, LOCATION, PERSON, ORGANIZATION, PERCENT, MISC, ORDINAL, TIME, DATE, MONEY) from the sentence and represented each type of named entity as a binary feature indicating whether it appears in the sentence.

2.2.2 Sentiment Lexicon Features

Many tasks related to sentiment or emotion analysis depend upon affect, opinion, sentiment, sense and emotion lexicons. We therefore employ eight sentiment lexicons to capture the sentiment information of the given sentence: the Bing Liu lexicon (http://www.cs.uic.edu/liub/FBS/sentiment-analysis.html#lexicon), the General Inquirer lexicon (http://www.wjh.harvard.edu/inquirer/homecat.htm), IMDB (http://www.aclweb.org/anthology/S13-2067), MPQA (http://mpqa.cs.pitt.edu/), the NRC Emotion Sentiment Lexicon (http://www.saifmohammad.com/WebPages/lexicons.html), AFINN (http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=6010), the NRC Hashtag Sentiment Lexicon (http://www.umiacs.umd.edu/saif/WebDocs/NRC-Hashtag-Sentiment-Lexicon-v0.1.zip), and the NRC Sentiment140 Lexicon (http://help.sentiment140.com/for-students/).

These eight lexicons do not share a unified form. For example, the Bing Liu lexicon uses two values for each word, one for positive sentiment and the other for negative sentiment. In order to unify the form, we transformed the two scores into a one-dimensional value by subtracting the negative sentiment score from the positive sentiment score. Given a tweet, we calculated the following six scores:

– the ratio of positive words to all words.
– the ratio of negative words to all words.
– the maximum sentiment score.
– the minimum sentiment score.
– the sum of sentiment scores.
– the sentiment score of the last word in the tweet.
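As an illustration, here is a sketch of how one unified lexicon yields these six scores (the word scores below are toy values, not real lexicon entries); the same function would be applied to each of the eight lexicons in turn:

```python
# Toy unified lexicon: word -> one-dimensional score (positive minus negative).
LEXICON = {"love": 0.9, "nice": 0.5, "bad": -0.6, "hate": -0.9}

def sentiment_lexicon_features(tokens):
    hits = [LEXICON[t] for t in tokens if t in LEXICON]
    if not hits:
        return [0.0] * 6
    return [
        sum(s > 0 for s in hits) / len(tokens),  # ratio of positive words to all words
        sum(s < 0 for s in hits) / len(tokens),  # ratio of negative words to all words
        max(hits),                               # maximum sentiment score
        min(hits),                               # minimum sentiment score
        sum(hits),                               # sum of sentiment scores
        LEXICON.get(tokens[-1], 0.0),            # score of the last word in the tweet
    ]

print(sentiment_lexicon_features("i love this it is not bad".split()))
```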
2.2.3 Emotion Lexicon Features

Since subtasks 1, 2 and 5 are related to emotion intensity prediction and subtasks 3 and 4 are related to valence intensity prediction, three emotion lexicons and one valence lexicon are adopted: the NRC Hashtag Emotion Lexicon (Mohammad and Kiritchenko, 2015), the NRC Affect Intensity Lexicon (Mohammad, 2017), the NRC Word-Emotion Association Lexicon (Bravo-Marquez et al., 2017) and the ANEW-1999 Lexicon (Bradley and Lang, 1999). Given a tweet, we calculate three scores for each lexicon to construct emotion lexicon features: the maximum score, the sum of scores, and the number of words present in the lexicon.
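A compact sketch of these per-lexicon scores, assuming each lexicon has been loaded as a word-to-intensity dictionary:

```python
def emotion_lexicon_features(tokens, lexicons):
    """lexicons: list of dicts mapping word -> emotion/valence intensity score."""
    feats = []
    for lex in lexicons:
        hits = [lex[t] for t in tokens if t in lex]
        feats += [
            max(hits, default=0.0),  # the maximum score
            sum(hits),               # the sum of scores
            len(hits),               # number of tweet words found in the lexicon
        ]
    return feats
```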

2.2.4 Domain-specific Features

• Punctuation: People often use the exclamation mark (!) and the question mark (?) to express surprise or emphasis. Therefore, we extract the following 6 features:

– whether the tweet contains an exclamation mark.
– whether the tweet contains more than one exclamation mark.
– whether the tweet has a question mark.
– whether the tweet contains more than one question mark.
– whether the tweet contains both exclamation marks and question marks.
– whether the last token of the tweet is an exclamation or question mark.

• Bag-of-Hashtags: Hashtags directly reflect the emotion orientation of tweets, so we constructed a vocabulary of hashtags appearing in the training and development sets, then adopted the bag-of-hashtags method for each tweet.

• Emoticon: We collected 67 emoticons from the Internet (https://github.com/haierlord/resource/blob/master/Emoticon.txt), including 34 positive emoticons and 33 negative emoticons, then designed the following 4 binary features:

– to record whether positive and negative emoticons are present in the tweet, respectively (1 for yes, 0 for no).
– to record whether the last token is a positive or a negative emoticon.

• Intensity Words: Some words appear more frequently in tweets with higher intensity, and some words have higher scores in emotion lexicons; such words may carry information that expresses strong emotion intensity. We therefore extracted this type of word in two ways (see the sketch after this list):

– pick up words whose emotion score in the emotion lexicons is greater than a threshold.
– calculate the probability of each word appearing at each intensity level for subtasks 2 and 4, then pick up words whose probability is greater than a threshold (i.e., 0.5).

Finally, for each word in the intensity word list, we use a binary feature to check whether it appears in the given tweet.
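The sketch below illustrates the second selection strategy; the exact counting scheme is one reading of the description (the fraction of a word's occurrences that fall in its most frequent intensity class), with the 0.5 threshold taken from the text:

```python
from collections import Counter, defaultdict

def pick_intensity_words(tweets, labels, threshold=0.5):
    """tweets: list of token lists; labels: ordinal intensity class per tweet."""
    counts = defaultdict(Counter)   # word -> Counter over intensity classes
    for tokens, label in zip(tweets, labels):
        for word in set(tokens):
            counts[word][label] += 1
    picked = set()
    for word, by_class in counts.items():
        total = sum(by_class.values())
        # Keep words concentrated in a single intensity class.
        if max(by_class.values()) / total > threshold:
            picked.add(word)
    return picked
```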

2.3 Learning Algorithms

We explore six algorithms: Logistic Regression (LR) and Support Vector Regression (SVR) implemented in Liblinear (https://www.csie.ntu.edu.tw/~cjlin/liblinear/); Bagging Regressor (BR), AdaBoost Regressor (ABR) and Gradient Boosting Regressor (GBR) implemented in scikit-learn (http://scikit-learn.org/stable/); and the XGBoost Regressor (XGB) (https://github.com/dmlc/xgboost). All algorithms are used with their default parameters.

3 Experiments

3.1 Dataset

The statistics of the English datasets provided by SemEval-2018 Task 1 are shown in Tables 1 and 2. How the English data was created is described in Mohammad and Kiritchenko (2018).

Datasets           anger   fear    joy     sadness
train              1,701   2,252   1,616   1,533
dev                388     689     290     397
test (subtask 1)   17,939  17,923  18,042  17,912
test (subtask 2)   1,002   986     1,105   975

Table 1: Statistics of the datasets for subtasks 1 and 2.

Subtask  train  dev  test
3        1,181  449  17,874
4        1,181  449  937
5        6,838  886  3,259

Table 2: Statistics of the datasets for subtasks 3, 4 and 5.

3.2 Evaluation Metric

To evaluate the performance of different systems, the official evaluation measure, the Pearson correlation coefficient with the gold ratings/labels, is adopted for the first four subtasks. The correlation scores across all four emotions are averaged (macro-average) to determine the final system performance.

The last subtask is evaluated by multi-label accuracy, namely the Jaccard index:

\mathrm{Accuracy} = \frac{1}{|T|} \sum_{t \in T} \frac{|G_t \cap P_t|}{|G_t \cup P_t|}

where G_t is the set of gold labels for tweet t, P_t is the set of predicted labels for tweet t, and T is the set of tweets.
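Both official measures are straightforward to reproduce; a sketch using SciPy's pearsonr:

```python
from scipy.stats import pearsonr

def macro_avg_pearson(gold_by_emotion, pred_by_emotion):
    # Average the Pearson r over the four emotions (subtasks 1-4 metric).
    rs = [pearsonr(gold, pred)[0]
          for gold, pred in zip(gold_by_emotion, pred_by_emotion)]
    return sum(rs) / len(rs)

def jaccard_accuracy(gold_labels, pred_labels):
    # Multi-label accuracy for subtask 5: mean |G ∩ P| / |G ∪ P| over tweets.
    total = 0.0
    for G, P in zip(gold_labels, pred_labels):
        union = G | P
        total += len(G & P) / len(union) if union else 1.0  # both empty: exact match
    return total / len(gold_labels)

print(jaccard_accuracy([{"joy", "love"}], [{"joy"}]))  # 0.5
```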

3.3 Experiments on Training and Test Data

Firstly, we performed a series of experiments to explore the effectiveness of each feature type. Table 3 lists the performance contributed by different features on the development set with the Support Vector Regression algorithm for subtask 1.

Features     macro-avg       anger  fear   joy    sadness
Linguistic   0.393           0.398  0.402  0.485  0.286
.+SentiLexi  0.594 (+20.1%)  0.606  0.532  0.634  0.603
.+EmoLexi    0.635 (+4.1%)   0.689  0.632  0.612  0.606
.+domain     0.657 (+2.2%)   0.691  0.658  0.642  0.638

Table 3: Performance of different features on the development set for subtask 1. ".+" means adding the current features to the previous feature set. The numbers in brackets are the performance increments over the previous row.
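The ".+" rows of Table 3 correspond to a cumulative ablation loop; a sketch of that protocol, with scikit-learn's LinearSVR standing in for the Liblinear SVR:

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.svm import LinearSVR

def cumulative_ablation(blocks, y_train, y_dev):
    """blocks: ordered list of (name, X_train_block, X_dev_block) feature groups."""
    results, X_tr, X_dv = {}, None, None
    for name, tr, dv in blocks:
        X_tr = tr if X_tr is None else np.hstack([X_tr, tr])  # ".+": append the block
        X_dv = dv if X_dv is None else np.hstack([X_dv, dv])
        model = LinearSVR().fit(X_tr, y_train)
        results[name] = pearsonr(y_dev, model.predict(X_dv))[0]
    return results  # e.g. {"Linguistic": 0.393, ".+SentiLexi": 0.594, ...}
```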

Algorithm            macro-avg  anger  fear   joy    sadness
BR                   0.602      0.609  0.618  0.584  0.597
XGBOOST              0.628      0.663  0.656  0.576  0.618
ABR                  0.635      0.664  0.666  0.573  0.637
SVR                  0.657      0.691  0.658  0.642  0.638
GBR                  0.667      0.694  0.675  0.630  0.668
XGBOOST+ABR+SVR+GBR  0.680      0.715  0.689  0.647  0.670

Table 4: Performance of different learning algorithms on the development set for subtask 1.

Subtask  System      macro-avg   anger       fear        joy         sadness
1        rank 1      0.799 (1)   0.827 (1)   0.779 (1)   0.792 (1)   0.798 (1)
1        our system  0.695 (14)  0.713 (15)  0.677 (18)  0.693 (16)  0.697 (14)
1        baseline    0.520 (36)  0.526 (33)  0.525 (34)  0.575 (33)  0.453 (36)
2        rank 1      0.695 (1)   0.706 (1)   0.637 (1)   0.720 (2)   0.717 (1)
2        our system  0.531 (16)  0.565 (13)  0.441 (21)  0.581 (15)  0.536 (20)
2        baseline    0.394 (26)  0.382 (27)  0.355 (26)  0.469 (26)  0.370 (29)

Table 5: Performance of our system, the top-ranked system and the baseline on the test set for subtasks 1 and 2. The baseline uses an SVM with unigram features. The numbers in brackets are the official rankings.

System      Subtask 3   Subtask 4   Subtask 5
rank 1      0.873 (1)   0.836 (1)   0.588 (1)
our system  0.813 (14)  0.686 (17)  0.501 (11)
baseline    0.585 (28)  0.509 (24)  0.442 (19)

Table 6: Performance of our system, the top-ranked system and the baseline on the test set for subtasks 3, 4 and 5. The baseline uses an SVM with unigram features. The numbers in brackets are the official rankings.

From Table 3, we find that:

(1) All feature types contribute to the performance of emotion intensity prediction, and their combination achieves the best performance.

(2) Linguistic features act as the baseline and show poor performance for emotion intensity prediction on their own. However, the system performance drops once we remove the linguistic features.

(3) Sentiment lexicon features make a considerable contribution to the performance, which indicates that sentiment lexicon features are beneficial not only in traditional sentiment polarity analysis tasks but also in emotion intensity prediction tasks.

(4) Besides, we find that the system performance drops by only 0.2% if we remove the intensity word features. This indicates that these intensity words fail to further distinguish emotion intensity; the reason may be that their function overlaps with the sentiment and emotion lexicon features.

We also explored the performance of different learning algorithms. Table 4 shows the results of the different algorithms for subtask 1 based on all features described above. From Table 4, we find that GBR outperforms every other single algorithm, and that the ensemble model is superior to any model using a single algorithm. The ensemble model combines four algorithms (XGB, ABR, SVR and GBR) by averaging the output scores of all regression algorithms.
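A sketch of this averaging ensemble (default parameters throughout, as in Section 2.3; LinearSVR again stands in for the Liblinear SVR):

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor, GradientBoostingRegressor
from sklearn.svm import LinearSVR
from xgboost import XGBRegressor

def ensemble_predict(X_train, y_train, X_test):
    models = [XGBRegressor(), AdaBoostRegressor(),
              LinearSVR(), GradientBoostingRegressor()]
    # Average the output scores of all four regression algorithms.
    preds = [m.fit(X_train, y_train).predict(X_test) for m in models]
    return np.mean(preds, axis=0)
```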

Therefore, the system configurations for the test data are: all features for all five subtasks, the ensemble model for subtasks 1 and 3, and Logistic Regression for subtasks 2, 4 and 5.

Based on the system configurations described above, we train a separate model for each subtask and evaluate it against the test set of SemEval-2018 Task 1. Tables 5 and 6 show the results with ranks on the test set for subtasks 1 to 5. Compared with the top-ranked systems, there is much room for improvement in our work. First, the biggest issue is that we only used hand-crafted features and ignored deep learning methods. Second, we find that our system achieves better performance on the test set than on the development set; the possible reason might be a difference in data distribution between them.

4 Conclusion

In this paper, we extracted several traditional NLP, sentiment lexicon, emotion lexicon and domain-specific features from tweets, and adopted supervised machine learning algorithms to perform emotion intensity prediction. The system performance ranks above average. In future work, we will consider using deep learning methods to model sentences with the aid of sentiment word vectors.

Acknowledgements

This work is supported by the Science and Technology Commission of Shanghai Municipality Grant (No. 15ZR1410700) and the open project of Shanghai Key Laboratory of Trustworthy Computing (No. 07dz22304201604).

References

Margaret M. Bradley and Peter J. Lang. 1999. Affective norms for English words (ANEW): Instruction manual and affective ratings. Technical Report C-1, Center for Research in Psychophysiology, University of Florida.

Felipe Bravo-Marquez, Eibe Frank, Saif M. Mohammad, and Bernhard Pfahringer. 2017. Determining word-emotion associations from tweets by multi-label classification. In IEEE/WIC/ACM International Conference on Web Intelligence, pages 536–539.

Paul Ekman. 1999. Basic emotions. Handbook of Cognition and Emotion, 99(1):45–60.

Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. The Stanford CoreNLP natural language processing toolkit. In Proceedings of the Annual Meeting of the Association for Computational Linguistics: System Demonstrations.

Saif M. Mohammad. 2017. Word affect intensities.

Saif M. Mohammad, Felipe Bravo-Marquez, Mohammad Salameh, and Svetlana Kiritchenko. 2018. SemEval-2018 Task 1: Affect in tweets. In Proceedings of the International Workshop on Semantic Evaluation (SemEval-2018), New Orleans, LA, USA.

Saif M. Mohammad and Svetlana Kiritchenko. 2015. Using hashtags to capture fine emotion categories from tweets. Computational Intelligence, 31(2):301–326.

Saif M. Mohammad and Svetlana Kiritchenko. 2018. Understanding emotions: A dataset of tweets to study interactions between affect categories. In Proceedings of the 11th Edition of the Language Resources and Evaluation Conference, Miyazaki, Japan.

Zhihua Zhang, Guoshun Wu, and Man Lan. 2015. ECNU: Multi-level sentiment analysis on Twitter using traditional linguistic features and word embedding features. In International Workshop on Semantic Evaluation, pages 561–567.
