<<

Integrating context in metrics: preliminary investigation on the possibilities of hashtags as an resource

Jeroen van Honk; Rodrigo Costas Centre for Science and Technology Studies (CWTS), Leiden University, Leiden, the Netherlands

Introduction

In the burgeoning field of altmetric research Twitter is seen as a relevant resource in the discussion and dissemination of , being one of the most important altmetric sources (Robinson-García et al, 2014). The need for more contextualized and content analyses of Twitter data has been highlighted already in the altmetric literature (Haustein et al, 2014b; Bornnman,2016).

One prominent element in Twitter is the use of hashtags, a user-generated mechanism for tagging and collating tweets which are related to specific topics (Bruns & Burgess, 2011). Thus hashtags work as a linking element used to create impromptu categorizations of tweets but also as a means for Twitter users to follow a “conversation” on a topic and communicate with similar communities of interest around the hashtag. Thus, hashtags can be seen as instruments to enlarge the potential audience of a tweet and the mentioned publication.

In the field of altmetrics, particularly on Twitter studies, there have been some studies that have considered the presence of hashtags in tweets mentioning scientific publications. For example, Haustein et al (2014a) found that hashtags were more common among active astrophysicists in Twitter than those with a scarce Twitter activity, and actually their use of hashtags can be linked to different behaviors and groups of users (Holmberg et al, 2014).

To the best of our knowledge however, there hasn’t been any large scale analysis of the value of hashtags as potential boosters of other impacts beyond Twitter. In this article, we hypothesize that, due to their broadcasting potential, the use of hashtags will increase the visibility and uptake of scientific publications for other altmetric sources (i.e., blogs, news, and users). Similarly, we explore the possibility of analyzing the network of hashtags occurring around a scientific area.

Methodology Altmetric.com data on Twitter mentions to scientific publications (until June 2016 a total of 3.9 million publications had received more than 24 mill. tweets) has been considered. Hashtags in the tweets to scientific publications have been further extracted.

An interesting perspective about hashtags is their consideration as streams of “conversations”. Thus, two Twitter conversations (i.e. hashtags) are coupled if they both co-occur in tweets with one or more publication(s) in common. This approach allows studying the network of hashtags in a given field by how users have been linking hashtags with publications. In this study we look at the hashtag coupling network for the publications in the field of .

1

Results

Figure 1 depicts the average blog impact per publication depending on whether they have been (re)tweeted together with at least one hashtag or not. The x-axis presents the cohort of publications by their total number of tweets. Thus we can explore whether the average impact in blogs of publications that have been linked to at least a hashtag is higher/lower as compared to those without such a linkage, controlling by their number of tweets. It is visible how for the lowest levels of tweeting activity (i.e. publications with less than 5 tweets) the blog impact is very similar regardless the linkage of publications to hashtags. However, this changes from 5 tweets onwards, when blog impact is higher for those publications that have been linked to a hashtag. Interestingly enough, we can also see how the gap between the number of documents without hashtags decreases as the number of tweets increases. This suggests that the chances of a paper being linked to a hashtag increases with the overall (re)tweeting activity.

Figures 2 and 3 show very similar patterns for News mainstream media mentions and Mendeley readership with an increasing pattern in the visibility of publications with hashtags.

Publications (and tweets) can be linked to more than one hashtag. Figure 4 explores whether the linkage of a publication to several different hashtags has an effect on its Mendeley readership, showing that although with small differences, Mendeley readership seem to be higher for papers with a linkage to at least 5 hashtags.

Finally, a hashtag coupling map for the subset of tweets related to publications in bibliometric journals (Figure 5) was created. In this map different clusters of hashtags are identified. For instance, the blue cluster in the top-right corner is physics-related, with hashtags like “”, “physics” and “bigdata”. The green and red clusters deal with bibliometric terms, with an emphasis of altmetrics for the green cluster, and more general hashtags like “research” and “science” in the red cluster. In order to determine the effect of retweets (Holmberg & Thelwall (2014) already pointed out the idea that retweets are more prone to have hashtags than original tweets) in the perception of the hashtag network we excluded them from the analysis (Figure 6). When retweets are removed, it seems that the map gets sparser as the linkages among hashtags is less strong, therefore the role of retweets in hashtag coupling is also an element to consider.

2

0.8

0.7

0.6

0.5

0.4

0.3 no hashtag 0.2 hashtag

0.1 Avg. Avg. nr.of posts blog per document

0

23 (1/6)23 19 (2/9)19 (1/8)20 (1/7)21 (1/7)22 (1/6)24 (1/5)25

12 (6/21)12 5 (64/89) 5 (40/68) 6 (27/53) 7 (19/42) 8 (14/35) 9 (8/24)11 (5/18)13 (4/16)14 (3/14)15 (3/12)16 (2/11)17 (2/10)18

10 (10/29)10

2 (502/248) 2 (223/171) 3 (112/122) 4 1 (1284/397) 1 nr. of tweets per document, with in parentheses the nr. of documents (*1000) for no hashtag and hashtag, respectively

Figure 1: Effect of hashtags on correlation between number of tweets and number of blog posts

1.6

1.4

1.2

1

0.8

0.6 no hashtag 0.4 hashtag 0.2

0

Avg. Avg. nr.of News media perposts document

23 (1/6)23 19 (2/9)19 (1/8)20 (1/7)21 (1/7)22 (1/6)24 (1/5)25

12 (6/21)12 5 (64/89) 5 (40/68) 6 (27/53) 7 (19/42) 8 (14/35) 9 (8/24)11 (5/18)13 (4/16)14 (3/14)15 (3/12)16 (2/11)17 (2/10)18

10 (10/29)10

2 (502/248) 2 (223/171) 3 (112/122) 4 1 (1284/397) 1 nr. of tweets per document, with in parentheses the nr. of documents (*1000) for no hashtag and hashtag, respectively

Figure 2: Effect of hashtags on correlation between number of tweets and number of news posts

3

45 40 35

30

25 20 15 no hashtag 10 hashtag 5

0

Avg. Avg. nr.of readers Mendeley in

22 (1/7)22 (1/6)24 19 (2/9)19 (1/8)20 (1/7)21 (1/6)23 (1/5)25

5 (64/89) 5 (40/68) 6 (27/53) 7 (19/42) 8 (14/35) 9 (8/24)11 (6/21)12 (5/18)13 (4/16)14 (3/14)15 (3/12)16 (2/11)17 (2/10)18

10 (10/29)10

2 (502/248) 2 (223/171) 3 (112/122) 4 1 (1284/397) 1 Nr. of tweets per document, with in parentheses the nr. of documents (*1000) for no hashtag and hashtag, respectively

Figure 3: Effect of hashtags on correlation between “average times saved in Mendeley” and number of tweets

45

40

35

30

25 no hashtags 1-2 hashtag(s) 20 3-4 hashtags 5+ hashtags 15

10

5

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

Figure 4: Effect of number of distinct hashtags on correlation between “average times saved in Mendeley” and number of tweets

4

Figure 5: Vosviewer map of hashtag coupling – all tweets considered

Figure 5: Vosviewer map of hashtag coupling - excluding retweets

5

Conclusions and further research This paper explores the potential interest of hashtags in in the Twitter analysis of scientific publications. Our results support the idea that hashtags can be seen as an element related to the higher reception of scientific publications beyond Twitter. The ‘hashtag coupling’ map introduces the possibility of identifying conversations happening around scientific topics and how they are connected. This type of maps can be an instrument for librarians in order to inform users about hashtags related to their areas of interest. These maps can also help Twitter users to identify inconsistencies among the different hashtags used by their communities of interest, thus allowing them to improve their communication strategies. Further research will necessarily focus on studying the different characterization and typologies of hashtags and their role in boosting other impacts and the visibility of scientific publications. For example, is it better to be linked to a broad conversation where many of the users may miss the tweet, or rather to a narrower conversation with more expert and interested users? Also the role of retweets needs to be considered. More fine-tuning and in-depth analyses are indeed necessary, but this paper already shows that hashtags are a key element in the understanding of Twitter metrics and altmetrics.

Acknowledgments This research has been partly by funding from the South African DST-NRF Centre of Excellence in and STI Policy (SciSTIP).

References

Bruns, A., & Burgess, J. E. (2011). The Use of Twitter Hashtags in the Formation of Ad Hoc Publics. In Proceeding of the 6th European Consortium for Political Research (ECPR) General Conference 2011. Retrieved from http://eprints.qut.edu.au/46515/

Bornmann, L. (2016). What do altmetrics counts mean? A plea for content analyses. Journal of the Association for Information Science and Technology, 67(4), 1016–1017

Haustein, S., Bowman, T. D., Holmberg, K., Peters, I., & Larivière, V. (2014a). Astrophysicists on Twitter: An in-depth analysis of tweeting and scientific publication behavior. Aslib Journal of Information Management, 66(3), 279–296. doi:10.1108/AJIM-09-2013-0081

Haustein, S., Peters, I., Sugimoto, C. R., Thelwall, M., & Larivière, V. (2014b). Tweeting Biomedicine : An Analysis of Tweets and in the Biomedical Literature. Journal of the Association for Information Sciences and Technology, 65(4), 656–669

Holmberg, K., Bowman, T. D., Haustein, S., & Peters, I. (2014). Astrophysicists’ conversational connections on twitter. (L. Bornmann, Ed.)PloS one, 9(8), e106086. doi:10.1371/journal.pone.0106086

6

Robinson-García, N., Torres-Salinas, D., Zahedi, Z., & Costas, R. (2014). New data, new possibilities: exploring the insides of Altmetric.com. El Profesional de la Informacion, 23(4), 359–366. doi:10.3145/epi.2014.jul.03

7