
Modelling Valence and Arousal in Facebook posts

Daniel Preoţiuc-Pietro (Positive Psychology Center, University of Pennsylvania) [email protected]
H. Andrew Schwartz (Department of Computer Science, Stony Brook University) [email protected]

Gregory Park and Johannes C. Eichstaedt (Positive Psychology Center, University of Pennsylvania)
Margaret Kern (Centre for Positive Psychology, University of Melbourne)

Lyle Ungar (Computer & Information Science, University of Pennsylvania) [email protected]
Elizabeth P. Shulman (Department of Psychology, Brock University) [email protected]

Abstract

Access to expressions of subjective personal posts increased with the popularity of Social Media. However, most of the work in sentiment analysis focuses on predicting only valence from text, and is usually targeted at a product rather than at affective states. In this paper, we introduce a new data set of 2895 Social Media posts rated by two psychologically-trained annotators on two separate ordinal nine-point scales. These scales represent valence (or sentiment) and arousal (or intensity), which define each post's position on the circumplex model of affect, a well-established system for describing emotional states (Russell, 1980; Posner et al., 2005). The data set is used to train prediction models for each of the two dimensions from text, which achieve high predictive accuracy, correlated at r = .65 with valence and r = .85 with arousal annotations. Our data set offers a building block to a deeper study of personal affect as expressed in social media. This can be used in applications such as mental illness detection or in automated large-scale psychological studies.

1 Introduction

Sentiment analysis is a very active research area that aims to identify, extract and analyze subjective information from text (Pang and Lee, 2008). This generally includes identifying whether a piece of text is subjective or objective, what sentiment it expresses (positive or negative; often referred to as valence), what emotion it conveys (Strapparava and Mihalcea, 2007) and towards which entity or aspect of the text, i.e., aspect-based sentiment analysis (Brody and Elhadad, 2010). Downstream applications are mostly interested in automatically inferring public opinion about products or actions. Besides expressing attitudes towards other objects, texts can also express the emotions of the ones writing them, increasingly so with the rise of Social Media usage (Rosenthal et al., 2015). This study focuses on presenting a gold standard data set, as well as a model trained on this data, in order to drive research about the affective norms of people posting subjective messages. This is of great interest to applications in social science which study text at a large scale and with orders of magnitude more users than traditional studies.

Emotion classification is a widely debated topic in psychology (Gendron and Barrett, 2009). Two main theories about emotions exist: the first posits a discrete and finite set of emotions, while the second suggests that emotions are a combination of different scales. Research in Natural Language Processing (NLP) has focused mostly on Ekman's model of emotion (Ekman, 1992), which posits the existence of six basic emotions: anger, disgust, fear, joy, sadness and surprise (Strapparava and Valitutti, 2004; Strapparava and Mihalcea, 2008; Calvo and D'Mello, 2010). In this study, we focus on the most popular dimensional model of emotion: the circumplex model introduced in (Russell, 1980). This model suggests that all affective states are represented in a two-dimensional space with two independent neurophysiological systems: valence (or sentiment) and arousal. Any affective experience is a linear combination of these two independent systems, which is then interpreted as representing a particular emotion. For example, fear is a state involving the combination of negative valence and high arousal (Posner et al., 2005). Previous research in NLP focused mostly on valence or sentiment, either binary or having a strength component coupled with sentiment (Wilson et al., 2005; Thelwall et al., 2010; Thelwall et al., 2012).

In this paper we build a new data set consisting of 2895 anonymized Facebook posts labeled with both valence and arousal by two annotators with psychology training. The ratings are made on two independent nine-point scales, reaching high inter-annotator agreement correlations of .768 for valence and .827 for arousal. Data set statistics suggest that while the dimensions of valence and arousal are associated, they present distinct information, especially in posts with a clear positive or negative valence.

Further, we train a bag-of-words linear regression model to predict ratings of new messages. This model achieves high correlation with actual mean ratings, reaching Pearson r = .85 correlation on the arousal dimension and r = .65 on the valence dimension, without using any other sentiment analysis resources. Comparing our method to other established lexicons for valence and arousal and to methods from sentiment analysis, we demonstrate that these methods are not able to handle well the type of posts present in our data set. We further illustrate the words most correlated with both dimensions and identify opportunities for improvement. The data set and annotations are freely available online.1

2 Data set

We create a new data set with annotations on two independent scales:

• Valence (or sentiment) represents the polarity of the affective content in a post, rated on a nine-point scale from 1 (very negative) through 5 (neutral/objective) to 9 (very positive);

• Arousal (or intensity) represents the intensity of the affective content, rated on a nine-point scale from 1 (neutral/objective post) to 9 (very high).

Our corpus is comprised of Facebook status updates shared by participants as part of the MyPersonality Facebook application (Kosinski et al., 2013), in which they also took a number of questionnaires. All authors have explicitly given permission to include their information in a corpus for research purposes. We have manually anonymized the entire corpus by removing any references to names of persons, addresses, telephone numbers, e-mails and URLs, and replaced them with placeholders.

In order to reduce biases due to our participant demographics, the data set sample was stratified by gender and age, and we have not rated more than two messages written by the same person. Research is inconclusive about whether females express more emotions in general (Wester et al., 2002). With regards to age, an age positivity bias has been found, where positive emotion expression increases with age (Mather and Carstensen, 2005; Kern et al., 2014).

The data originally consisted of 3120 posts. All of these posts were annotated by the same two independent raters with a training in psychology. The raters performed the coding in a similar environment without any distractions (e.g., no listening to music, no watching TV/videos), as these could have influenced the emotions of the raters, and therefore the coding.

The annotators were instructed to sparingly rate messages as un-ratable when they were written in a language other than English or offered no cues for an accurate rating (e.g., only characters with no meaning). The annotators were instructed to rate a message if they could judge at least a part of it. Then, the raters were asked to rate the two dimensions, valence and arousal, after having explicitly been briefed that these should be independent of each other. The raters were provided with anchors with specified valence and arousal, and were instructed to rate neutral messages at the middle of the scale in terms of valence and at 1 if they lacked arousal.

1 http://mypersonality.org/wiki/doku.php?id=download_databases

Table 2: Individual rater mean and standard deviation and inter-annotator correlation (IA Corr).

Dimension    R1 µ ± σ         R2 µ ± σ         IA Corr.
Valence      5.274 ± 1.041    5.250 ± 1.485    .768
Arousal      3.363 ± 1.958    3.342 ± 2.183    .827

Table 3: Correlation with arousal and mean arousal values for posts grouped by valence.

Valence of posts          1–9     1–3.5    1–4     6–9     6.5–9
Correlation to arousal    .222    -.047    -.201   .226    .085
Mean arousal              3.35    3.85     3.47    4.31    4.68
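The agreement figures in Table 2 can be computed directly from the per-rater scores shipped with the data set. A minimal sketch, assuming the scores have been loaded into parallel per-rater lists; the toy values below are hypothetical stand-ins, not drawn from the corpus:

```python
from scipy.stats import pearsonr
from sklearn.metrics import cohen_kappa_score

# Hypothetical per-rater valence scores on the 1-9 scale; the released
# data set contains both raters' individual scores for every post.
valence_r1 = [7.0, 3.0, 5.0, 8.0, 2.0, 6.5]
valence_r2 = [7.5, 2.5, 5.0, 8.5, 2.0, 6.0]

# Inter-annotator agreement as Pearson correlation (the IA Corr.
# column of Table 2: .768 for valence, .827 for arousal).
ia_corr, _ = pearsonr(valence_r1, valence_r2)

# Agreement on the binary un-ratable decision, measured with Cohen's
# kappa (reported in the paper as kappa = .93).
unratable_r1 = [0, 0, 1, 0, 1, 0]
unratable_r2 = [0, 0, 1, 0, 1, 0]
kappa = cohen_kappa_score(unratable_r1, unratable_r2)
```

Pearson correlation is the natural agreement statistic for the ordinal nine-point scales, while Cohen's kappa suits the binary un-ratable decision.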

In total, 2895 messages were rated by both raters in both dimensions. Table 1 shows examples of posts rated in all quadrants of the circumplex model.

Table 1: Example of posts annotated with average valence (V) and arousal (A) ratings.

Message                                                           V     A
Is the one whoz GOing to Light Up your Day!!!!!!!!!!!!            7     8
Blessed with a baby boy today ...                                 7.5   2
the boring life is back :( ...                                    3     2.5
IS SUPER STRESSED AND ITS JUST THE SECOND MONTH OF SCHOOL ..D:    2.5   7

The correlation between the raters and the mean and standard deviation for each rater are presented in Table 2. The inter-annotator agreement on deciding un-ratable posts is measured by a Cohen's Kappa of κ = .93. The histograms of ratings are presented in Figure 1. The data set is released with the scores of both individual raters.

We study the correlation between the valence and arousal scores for posts in Table 3. We chose to split values based on different valence thresholds in order to remove posts rated as neutral in valence (5) from the analysis, as they are expected to be low in intensity (1). We observed an overall correlation between the valence and arousal ratings, which holds for both positive and negative valence posts when the neutral posts are removed (.222 and .226 correlation). However, when the posts are more strongly positive or negative in valence, arousal is only mildly correlated (.047 and .085). This highlights that the presence of either positive or negative valence is correlated with an arousal score different from 1, but this correlation is weaker when the positive or negative valence passes a certain threshold (i.e., 3.5 and 6.5 respectively). We also note that the high overall correlation is also due to higher mean arousal for positive valence posts compared to negative posts (4.68 cf. 3.85).

Figure 2 displays the relationship between the age of the user at posting time and the valence and arousal of their posts in our data set, further divided by gender. We notice some patterns emerge in our data. Valence increases with age for both genders, especially at the start and end of our age interval (13–16 and 30–35), confirming the aging positivity bias (Mather and Carstensen, 2005). Valence is higher for females across almost the entire age range. Posts written by females are also significantly higher in arousal for all age groups. Age does not have a significant effect on post arousal, although there is a slight increase with age, especially for females. Overall, these figures again illustrate the importance of age and gender as factors to be considered in these types of applications (Volkova et al., 2013; Hovy, 2015).

[Figure 2: Variation in valence and arousal with age in our data set using a LOESS fit. Data is split by gender: Male (coral orange) and Female (mint green).]

3 Predicting Valence and Arousal

To study the linguistic differences of both dimensions, we build a bag-of-words prediction model of valence and arousal from our corpus.2 We train two linear regression models with ℓ2 regularisation on the posts and test their predictive power in a 10-fold cross-validation setup. Results for predicting the two scores are presented in Table 4.

We compare to a number of different existing general-purpose lexicons. First, we use the ANEW (Bradley and Lang, 1999) weighted dictionary to compute a valence and arousal score as the weighted sum of the individual word valence and arousal scores.

2 Available at http://wwbp.org/data.html
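The lexicon baseline just described can be sketched in a few lines. This is a minimal sketch, not the authors' code: the mini-lexicon below is a hypothetical stand-in for the real ANEW norms, and scores are averaged over matched words as a simple length normalisation of the weighted sum.

```python
# Lexicon baseline sketch: score a post from per-word affective norms.
# The mini-lexicon is a hypothetical stand-in for ANEW (Bradley and
# Lang, 1999); the real dictionary assigns valence and arousal norms
# to roughly a thousand words.
ANEW_VALENCE = {"happy": 8.21, "love": 8.72, "bored": 2.95, "stressed": 2.33}

def lexicon_score(post, norms, neutral=5.0):
    """Average the norms of the dictionary words found in the post."""
    hits = [norms[t] for t in post.lower().split() if t in norms]
    # Posts containing no dictionary words fall back to mid-scale.
    return sum(hits) / len(hits) if hits else neutral

print(lexicon_score("so happy and full of love", ANEW_VALENCE))  # → 8.465
```

The same function scores arousal when given an arousal dictionary instead.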

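The bag-of-words model of Section 3 pairs unigram counts with ℓ2-regularised (ridge) linear regression, evaluated by Pearson correlation under 10-fold cross-validation. A runnable sketch with scikit-learn; the toy posts and ratings are hypothetical stand-ins for the annotated corpus, so the pipeline, not the data, is the point:

```python
from scipy.stats import pearsonr
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.pipeline import make_pipeline

# Toy stand-ins for the 2,895 annotated posts and their mean valence ratings.
posts = ["Blessed with a baby boy today",
         "the boring life is back :(",
         "SUPER STRESSED second month of school",
         "Thank you all for the birthday wishes!!!",
         "so happy and excited",
         "bored ... again"] * 5
valence = [7.5, 3.0, 2.5, 8.0, 8.5, 2.5] * 5

# Unigram bag-of-words features feeding an L2-regularised linear model.
model = make_pipeline(CountVectorizer(), Ridge(alpha=1.0))

# Out-of-fold predictions from 10-fold cross-validation; performance is
# reported as Pearson r between predictions and the mean ratings.
folds = KFold(n_splits=10, shuffle=True, random_state=0)
preds = cross_val_predict(model, posts, valence, cv=folds)
r, _ = pearsonr(valence, preds)
```

A second model with the same pipeline, fit on the arousal ratings, gives the arousal predictor.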
[Figure 1: Histograms of average rating scores. (a) Valence; (b) Arousal.]

Similarly, we use the affective norms of words obtained by extending ANEW with ratings for ∼14,000 words (Warriner et al., 2013). We also benchmark with standard methods for estimating valence from sentiment analysis. First, we use the MPQA lexicon (Wilson et al., 2005), which contains 7,629 words rated for positive or negative sentiment, to obtain a score based on the difference between positive and negative words in the post. Second, we use the NRC Hashtag Sentiment Lexicon (Mohammad et al., 2013), which obtained the best performance in the SemEval Twitter Sentiment Analysis tasks.3

Table 4: Prediction results for valence and arousal of posts, reported in Pearson correlation on 10-fold cross-validation for the BOW model.

Method       Valence    Arousal
ANEW         .307       .085
Aff Norms    .113       .188
MPQA         .385       –
NRC          .405       –
BOW Model    .650       .850

Our method achieves very high correlations with the target score. Arousal is easier to predict, reaching r = .85 correlation between predicted and rater score. ANEW obtains significant correlations with both of our ratings; however, these are significantly lower than those of our model. The extended list of affective norms obtains, perhaps surprisingly, lower correlation for valence, but stronger correlation with arousal than ANEW. For valence, both sentiment analysis lexicons provide better performance than the affective norms lexicons, albeit lower than our model trained on parts of the same data set.

The performance improvement is most likely driven by the domain of the data set. While our method is trained on held-out data from the same domain in a cross-validation setup, the other methods suffer from a lack of adaptation to this domain. The NRC lexicon, trained for predicting sentiment on Twitter, obtains the highest performance of the established models, due to the fact that it is trained on a more similar domain. The lower performance of the existing models can also be explained by the fact that they predict a score used for classification into positive vs. negative, while our target score represents the strength of the positive or negative expression. Moreover, the affective norms scores are hand-crafted dictionaries where the weights assigned to words are derived in isolation from context, with no adaptation to new words, spellings and the language use on Facebook.

3 https://www.cs.york.ac.uk/semeval-2013/task2/

4 Qualitative Analysis

In this section we highlight the most important unigram features for each dimension, as well as the qualitative difference between the two dimensions of valence and arousal. To this end, we show the words with the highest univariate Pearson correlation with either of the two dimensions in Table 5. Each score is represented by the mean of the two ratings.

Table 5: Words most correlated positively and negatively with the two dimensions.

             Valence      r        Arousal      r
Positive     !            .251     !            .773
             :)           .237     Birthday     .097
             Birthday     .212     Happy        .081
             Happy        .197     Its          .079
             Thank        .196     Wishes       .076
             Great        .195     Soooo        .074
             Love         .195     Thanks       .073
             Thanks       .179     Christmas    .071
             Wishes       .170     Sunday       .069
             Wonderful    .159     Yay          .064
Negative     Hate         -.163    [..]*        -.206
             :(           -.159    .            -.164
             ?            -.117    Status       -.064
             Sick         -.112    Life         -.064
             Why          -.102    People       -.060
             :'(          -.094    Bored        -.059
             Not          -.093    :/           -.056
             Bored        -.092    Of           -.056
             Stupid       -.089    Deal         -.056
             ...          -.087    Every        -.054

The results show that both dimensions have similar top features as well as distinct ones. Tokens such as '!', 'Happy', 'Birthday', 'Thanks' and 'Wishes' are indicative of both positive valence and arousal, while tokens like 'Bored' and '...' are indicative of both negative valence and low arousal. We notice, however, tokens that are only indicative of positive valence ('Wonderful', 'Love'), positive arousal ('Sunday', 'Yay'), negative valence ('Why', 'Stupid') or negative arousal ('Life', 'Every', 'People'). The question mark is correlated with negative valence, together with the word 'Why', showing that questions on Facebook are usually negative in valence. Also in terms of punctuation, positive valence and arousal are expressed through exclamation marks, while negative valence and especially arousal are expressed through repeated periods. This behavior is specific to Social Media and is usually not captured by standard emotion lexicons.

Emoticons also exhibit an interesting pattern across the two dimensions. The smiley :) is the second most correlated feature with valence, but is not in the top 10 for arousal. Similarly, the frown emoticons (:(, :'() are amongst the top 10 features correlated with negative valence, but have no relationship with arousal. The only emoticon correlated highly with low arousal is the undecided emoticon (:/).
5 Conclusion

In this work, we introduced a new corpus of Social Media posts mapped to the circumplex model of affect. Each post is annotated on two independent nine-point scales of valence and arousal by two annotators with a background in psychology, who were calibrated before rating the statuses. We described our annotation process and reviewed the annotation guidelines. In total, we annotated 2895 Facebook posts, discarding the un-ratable ones. The corpus and our valence and arousal bag-of-words prediction models are publicly available.

The results of the annotations have very high agreement. A linear regression model using a bag-of-words representation trained on this data achieves high correlations with the outcome annotations, especially when predicting arousal. Standard sentiment analysis lexicons predicted both dimensions with lower accuracies.

Our system can be further improved by leveraging the vast amount of available data for Twitter sentiment analysis. We consider this model extremely useful for computational social science research that aims to measure individual user valence and arousal, its relationship to demographic traits and its changes over time or in relation to certain life events.

Acknowledgements

The authors acknowledge the support of the Templeton Religion Trust, grant TRT-0048.

References

Margaret Bradley and Peter Lang. 1999. Affective Norms for English Words (ANEW): Stimuli, Instruction Manual, and Affective Ratings. Technical report.

Samuel Brody and Noemie Elhadad. 2010. An Unsupervised Aspect-Sentiment Model for Online Reviews. In Proceedings of the 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL, pages 804–812.

Rafael Calvo and Sidney D'Mello. 2010. Affect Detection: An Interdisciplinary Review of Models, Methods, and their Applications. IEEE Transactions on Affective Computing, 1(1):18–37.

Paul Ekman. 1992. An Argument for Basic Emotions. Cognition & Emotion, 6(3-4):169–200.

Maria Gendron and Lisa Feldman Barrett. 2009. Reconstructing the Past: A Century of Ideas about Emotion in Psychology. Emotion Review, 1(4):316–339.

Dirk Hovy. 2015. Demographic Factors Improve Classification Performance. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, ACL, pages 752–762.

Margaret L Kern, Johannes C Eichstaedt, H Andrew Schwartz, Greg Park, Lyle H Ungar, David J Stillwell, Michal Kosinski, Lukasz Dziurzynski, and Martin EP Seligman. 2014. From "sooo excited!!!" to "so proud": Using language to study development. Developmental Psychology, 50:178–188.

Michal Kosinski, David Stillwell, and Thore Graepel. 2013. Private Traits and Attributes are Predictable from Digital Records of Human Behavior. Proceedings of the National Academy of Sciences of the United States of America (PNAS), 110(15):5802–5805.

Mara Mather and Laura L Carstensen. 2005. Aging and Motivated Cognition: The Positivity Effect in Attention and Memory. Trends in Cognitive Sciences, 9(10):496–502.

Saif M. Mohammad, Svetlana Kiritchenko, and Xiaodan Zhu. 2013. NRC-Canada: Building the State-of-the-Art in Sentiment Analysis of Tweets. In Proceedings of the 7th International Workshop on Semantic Evaluation, SemEval, pages 321–327.

Bo Pang and Lillian Lee. 2008. Opinion Mining and Sentiment Analysis. Foundations and Trends in Information Retrieval, 2(1-2):1–135.

Jonathan Posner, James A Russell, and Bradley S Peterson. 2005. The Circumplex Model of Affect: An Integrative Approach to Affective Neuroscience, Cognitive Development, and Psychopathology. Development and Psychopathology, 17(3):715–734.

Sara Rosenthal, Preslav Nakov, Svetlana Kiritchenko, Saif M Mohammad, Alan Ritter, and Veselin Stoyanov. 2015. SemEval-2015 Task 10: Sentiment Analysis in Twitter. In Proceedings of the 9th International Workshop on Semantic Evaluation, SemEval, pages 451–463.

James A. Russell. 1980. A Circumplex Model of Affect. Journal of Personality and Social Psychology, 39(6):1161–1178.

Carlo Strapparava and Rada Mihalcea. 2007. SemEval-2007 Task 14: Affective Text. In Proceedings of the 4th International Workshop on Semantic Evaluations, SemEval, pages 70–74.

Carlo Strapparava and Rada Mihalcea. 2008. Learning to Identify Emotions in Text. In Proceedings of the 2008 ACM Symposium on Applied Computing, SAC, pages 1556–1560.

Carlo Strapparava and Alessandro Valitutti. 2004. WordNet-Affect: an Affective Extension of WordNet. In Proceedings of the Fourth International Conference on Language Resources and Evaluation, volume 4 of LREC, pages 1083–1086.

Mike Thelwall, Kevan Buckley, Georgios Paltoglou, Di Cai, and Arvid Kappas. 2010. Sentiment Strength Detection in Short Informal Text. Journal of the American Society for Information Science and Technology, 61(12):2544–2558.

Mike Thelwall, Kevan Buckley, and Georgios Paltoglou. 2012. Sentiment Strength Detection for the Social Web. Journal of the American Society for Information Science and Technology, 63(1):163–173.

Svitlana Volkova, Theresa Wilson, and David Yarowsky. 2013. Exploring Demographic Language Variations to Improve Multilingual Sentiment Analysis in Social Media. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, EMNLP, pages 1815–1827.

Amy Beth Warriner, Victor Kuperman, and Marc Brysbaert. 2013. Norms of Valence, Arousal, and Dominance for 13,915 English Lemmas. Behavior Research Methods, 45(4):1191–1207.

Stephen R Wester, David L Vogel, Page K Pressly, and Martin Heesacker. 2002. Sex Differences in Emotion: A Critical Review of the Literature and Implications for Counseling Psychology. The Counseling Psychologist, 30(4):630–652.

Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. 2005. Recognizing Contextual Polarity in Phrase-level Sentiment Analysis. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP, pages 347–354.