Understanding Communities Via Hashtag Engagement: a Clustering Based Approach
Total Page:16
File Type:pdf, Size:1020Kb
Proceedings of the Tenth International AAAI Conference on Web and Social Media (ICWSM 2016) Understanding Communities via Hashtag Engagement: A Clustering Based Approach Orianna DeMasi ∗ Douglas Mason † Jeff Ma University of California, Berkeley Pinterest Twitter, Inc. Berkeley, CA 94720 San Francisco, CA 94103 San Francisco, CA 94103 [email protected] [email protected] [email protected] Abstract user experience, and hashtags can potentially enable such connections. Even though researchers have acknowledged We develop insight into community use of hashtags on so- cial media and find that hashtags with behavior indicative of this possibility (Laniado and Mika 2010; Russo and Nov real world communities are more engaging. To do this, we 2010), little research thus far has targeted understanding the study the relationship of hashtag usage with user engagement community use of hashtags. on Twitter. Hashtag engagement is useful as a surrogate mea- In this study we focus on how engaging different types of sure of how active community members are. We develop a hashtags are and consider relations with community metrics. framework for describing hashtag temporal usage, show the We find that hashtags that have a stronger resemblance to existence of 4 broad classes of hashtags, and show that the real world communities are more engaging. engagement of a hashtag varies significantly between classes. One of our contributions is a framework for understanding Periodically used hashtags, such as for TV shows and weekly community chats, are the most engaging, while hashtags re- different dynamic types of hashtags. With this framework, lating to events are the least engaging. Looking at how com- we unify previous work on identifying periodic hashtags munity dynamics vary within this framework reveals that a (Cook, Kenthapadi, and Mishra 2013) and event hashtags hashtag being used more frequently is not positively corre- (Cha et al. 2010; Lehmann et al. 2012; Crane and Sornette lated with it being more engaging. We then explore the pe- 2008; Lin et al. 2013; Shamma, Kennedy, and Churchill riodically used hashtags and find negative correlations with 2011). This unification validates previous observations that diversity of the user base, which implies concentrated com- there are coherent dynamic types of hashtags (Hsu, Chang, munities are the most engaging. We conclude by studying and Chen 2010; Romero, Meeder, and Kleinberg 2011). a set of community conversation-oriented hashtags and find Our second contribution is a set of analyses using this these hashtags to be more engaging than other hashtags, re- gardless of dynamic type. Our findings support the hypothe- framework on a comprehensive cohort dataset. We include sis that hashtags with stronger community behavior are more a comparison of engagement between hashtag types. Our engaging. analyses take steps in the direction of understanding engage- ment of hashtag types. This understanding is important, not just as a retrospective analysis, but as an actionable way for Introduction finding, connecting, and supporting communities. Hashtags have become a cornerstone of online social me- One of our findings is that periodically recurring hash- dia. From their birth on the Twitter platform, hashtags have tags are the most engaging type of hashtag, on average. Pre- evolved from their basic form of a short string of text pre- vious work that has analyzed peaks in hashtag usage, i.e. ceded by a pound symbol to be a tool for myriad purposes, events, is minimally actionable, as events are difficult to pre- e.g. ad campaigns and online chats (Yang et al. 2012). In ad- dict. In contrast, periodic events are predictable, so the abil- dition to being versatile, they are exceedingly popular and ity to identify and understand periodically used hashtags has have been adopted on nearly every social media platform. implications for how to design and implement new features While hashtags may have been introduced to catalog for social media. Such new features could include weekly information, they now can enable users to rally social checkins on relevant content or community features built up movements (e.g. #BlackLivesMatter a social movement for around a periodic event. By showing that periodically used racial justice), disseminate public health campaigns (e.g. hashtags are the most engaging, our work implies that sys- #ECigTruths from Chicago Department of Public Health), tems implemented around this periodic content would have and connect to communities (e.g. #RunChat a community the most impact. While our work focuses on hashtags, its for runners). Finding and connecting users to relevant com- broader implication is that systems designed to leverage pe- munities online is of paramount importance for improving riodic content to connect recurring communities will create a ∗Work does as intern as Twitter, Inc. more engaging experience than connecting more ephemeral †Work done as employee at Twitter, Inc. groups, e.g. groups that connect over events. Copyright c 2016, Association for the Advancement of Artificial The structure of this paper is as follows. We begin by Intelligence (www.aaai.org). All rights reserved. motivating the complexity of hashtags. We then propose a 102 framework for understanding coarse hashtag usage of all (Kouloumpis, Wilson, and Moore 2011; Davidov, Tsur, and popular hashtags. Using this framework, we show there Rappoport 2010) or political polarity (Conover et al. 2011). are 4 broad types of hashtags: events, stable, periodic, and Studies have also looked at hashtags themselves, both trying stochastic. These classes have been alluded to in prior work to understand their dynamics (Cha et al. 2010; Lehmann et (Hsu, Chang, and Chen 2010; Romero, Meeder, and Klein- al. 2012; Crane and Sornette 2008; Lin et al. 2013; Shamma, berg 2011), but this is the first systematic categorization val- Kennedy, and Churchill 2011; Yang and Leskovec 2011; idating their existence. We then propose a metric of engage- Hsu, Chang, and Chen 2010), popularity (Pervin et al. 2015; ment for a hashtag and study this metric within the different Ma, Sun, and Cong 2013; Tsur and Rappoport 2012), rel- hashtag types. We find hashtags with a recurring usage are at evance (Denton et al. 2015) and semantic meaning (Fer- least 6.5% more engaging than other types of hashtags and ragina, Piccinno, and Santoro 2015; Posch et al. 2013; over 100% more engaging than event hashtags, on average. Romero, Meeder, and Kleinberg 2011; Tsur, Littman, and We build models that consider the relationship of engage- Rappoport 2012). This work highlights the breadth and util- ment with Tweet volume, author diversity, and Tweet con- ity of hashtags as an important feature of social media. tent for each dynamic type. These models show that pop- Here, we are most interested in engagement of hashtags ular hashtags are not the most engaging and that low di- by type. We consider types to be classes of temporal usage versity groups can be particularly engaging. We then pro- patterns rather than a priori defined semantic or topic classes pose a refining framework that partitions the recurring hash- (Ferragina, Piccinno, and Santoro 2015; Posch et al. 2013; tags into 4 more interpretable classes: weekly events, all day Romero, Meeder, and Kleinberg 2011; Tsur, Littman, and weekly events, more frequent and less frequent than weekly Rappoport 2012). events. These subclusters reveal an even stronger negative Previous studies have looked at hashtag usage dynam- correlation of diversity with engagement for weekly events, ics and noted that there are different types. Some studies which implies that concentrated groups of users that con- have acknowledged at least three types of dynamics roughly vene weekly and tweet a lot about a shared interest generate falling into: continuous activity, periodic activity, or activity the most engaging hashtags. We conclude by examining en- concentrated around an isolated time domain. (Hsu, Chang, gagement for a particular subset of community conversation- and Chen 2010; Lehmann et al. 2012). However, these stud- oriented “chat” hashtags. ies did not provide a comprehensive framework for showing these classes exist or showing how to identify and catalog Contributions hashtags into each class. They also did not consider differ- ences in engagement between classes. Our contributions complement prior work that has looked at Other studies have focused on identifying and studying temporal usage of hashtags. We extend such work by pro- a single class of temporal dynamics. “Peaky” events, such viding a comprehensive categorization of dynamics and an- as news have been studied and classified into as many as 6 alyzing the implications for engagement and communities. types of classes (Cha et al. 2010; Lehmann et al. 2012; Crane We provide a novel comprehensive framework for clus- and Sornette 2008; Lin et al. 2013; Shamma, Kennedy, and tering hashtags based on temporal usage. We then analyze Churchill 2011; Yang and Leskovec 2011) . It has been sug- the resulting hashtag dynamic types, propose a metric of gested that these classes of events could relate to communi- engagement, and compare this metric of engagement be- ties if we consider a group of people interested in an event as tween the hashtag types. This comparison is an important a community. Here we are interested in considering commu- step for designing systems that is missing from prior work. nities as groups of users who continue to share relevant and We consider models that relate Tweet volume, author diver- specific information, say on a topic. Furthermore, studying sity, and Tweet content to hashtag engagement. We also pro- events is more retrospective and less actionable, as events pose and analyze another framework for understanding pe- are difficult to predict. riodic type hashtags and add a discussion of implications for Periodic hashtags, on the other hand, can be predicted and community engagement. Finally, we consider engagement thus could be actionable. Studies have briefly considered of community-oriented “chat” hashtags within this cluster- subsets from the class of periodic hashtags, for example pe- ing.