
This is a preprint. Please cite this work by using the reference on https://ieeexplore.ieee.org/document/8765252/
I. Annamoradnejad and J. Habibi, "A Comprehensive Analysis of Twitter Trending Topics," 2019 5th International Conference on Web Research (ICWR), Tehran, Iran, 2019, pp. 22-27. doi: 10.1109/ICWR.2019.8765252

A Comprehensive Analysis of Twitter Trending Topics

Issa Annamoradnejad
Department of Computer Engineering
Sharif University of Technology
Tehran, Iran
[email protected]

Jafar Habibi
Department of Computer Engineering
Sharif University of Technology
Tehran, Iran
[email protected]

Abstract— In Twitter, a name, phrase, or topic that is mentioned at a greater rate than others is called a "trending topic" or simply "trend". The Twitter trends list has a powerful ability to promote public events such as natural events, political scandals, market changes and other types of breaking news. Nevertheless, there have been very few works focused on the dynamics of these trending topics. In this article, we thoroughly examined Twitter's trending topics of 2018. To this end, we automatically accessed Twitter's trends API and stored the resulting 50 top trending topics in a novel dataset. We propose and analyze our dataset according to six criteria: lexical analysis, time to reach, trend reoccurrence, trending time, tweets count, and language analysis. Based on our results, 77.6% of the topics that reached the Top-10 list were trending with less than 100k tweets. More than 50% of the topics could not hold the position for more than an hour. English and Arabic languages comprised close to 40% and 20% of the first rank topics, respectively.

Keywords— trending topics; trends; Twitter; trending time; language classification; knowledge extraction; Year 2018

I. INTRODUCTION

In the last two decades, online social media sites (such as Facebook, Twitter, Youtube, etc.) have revolutionized the way we communicate with each other and transformed everyday practices [1]. Among these Online Social Networks (OSNs), Twitter is a free microblogging and social networking website that enables registered users to post short messages called tweets. Twitter users can post tweets and follow other users' tweets through a variety of actions [2]. Twitter is currently the third most popular social networking service, with more than 335 million active users [3].

A name, phrase, or topic that is mentioned at a greater rate than others is called a "trending topic" or simply "trend". Trending topics become popular either through a concerted effort by users (as in promoting an election nominee) or because of an event that prompts people to talk about a specific topic (such as a TV series or earthquake). A list of the top ten trending topics is shown on the website, which helps Twitter and its users understand what is happening in the world and what people's opinions are about it.

Twitter trends have shown their powerful ability in many public events, such as the wildfires in San Diego and the earthquake in Japan [4]. In addition, governments and businesses analyze the dynamics of the general mood of the population to reach better results. Some previous works studied the importance of Twitter in detecting real-time events, predicting market fluctuations and even election results. Nevertheless, there have been very few works that focused on understanding the dynamics and statistics of these trending topics.

In this article, we thoroughly examined Twitter's trending topics of 2018. To this end, we accessed Twitter's trends API for the full year of 2018 and generated a full dataset. Contrary to the top ten list shown on the website, the API returns a list of the top 50 trending topics for a given place. A version of our dataset (hourly basis) is provided at [5].

To analyze this aggregated dataset in several aspects, we devised six criteria: lexical analysis, time to reach, trend reoccurrence, trending time, tweets count, and language. We examined our dataset on these six criteria under three conditions: first rank trends, the Top-10 list and the Top-50 list. In addition to computing general statistics about each criterion and determining the longest trending topics, the topics with the highest tweet counts, the most reoccurring topics, and the most used languages, we will generate related distributions such as: tweets count distribution, reoccurrence count distribution, trending time distribution, language distribution, and words and characters count distributions.

The structure of this article is as follows: Section 2 reviews past works on Twitter trends and related topics, alongside a brief description of terms used in Twitter. Section 3 explains the data and methodology. In Section 5, results and discussion for the six criteria are given, and Section 6 is the concluding remarks.

Background

A. Literature review

Many real-world examples have demonstrated the effectiveness of trending topics in attracting more attention from the world during disasters and social movements, and there is a good body of research providing analytics for those events. Reference [4] studies social, spatial, and temporal characteristics of earthquake-related tweets, [6] describes a method for using Twitter to track forest fires and the response to the fires by Twitter users, [7] analyzed information diffusion activity during the 2011 Egyptian political uprisings, [8] examined the tweets and strategies corresponding to the United States Presidential election, [9] used a spike detection algorithm to detect the important moments within sporting events like World Cup Soccer matches, which take place over a short period of time, and [10] analyzed the topical, geographic, and temporal importance of descriptors for three events over time that can help visualize the event data.

In contrast, very little study has been done to explain or understand the dynamics of Twitter trends. Reference [11] classified trends into 18 categories using two separate approaches. Their results, based on 768 unique trending topics, showed that Sports, Music and Movies had the highest numbers of trending topics, respectively. Reference [12] proposed Sequential Summarization to generate a series of chronologically related sub-summaries for a given topic while retaining the order of information presentation, where each sub-summary attempts to concentrate on one theme or subtopic. Some researchers (such as [13], [14]) attempted to aggregate several explanations into one long summary using traditional summarization approaches, but this still loses much useful information, such as the change of Twitter users' focus and the temporal information. Reference [15] provided an evaluation of trend disambiguation methods to find the most successful method for retrieving the representative contents of trending topics.

In addition, some services have been dedicated to explaining Twitter trends autonomously or in a collaborative scenario (more information in [12]). Applications like echofon (www.echofon.com) and whatthetrend (whatthetrend.com) have evolved from Twitter and provide services to explain why a term becomes a trending topic or to give a short description of the trending topic. These applications or services generally track the topics in Twitter to automatically generate a description for a given topic. In some cases, they encourage users to give small summaries on a new tweet to explain the topics. For example, whatthetrend encourages users to edit explanatory tweets about

Users are encouraged to categorize their tweets by a hashtag, which is any keyword preceded by a hash sign "#" (e.g., #sorcery). This allows for faster content discovery and lets users track specific events in real time. Twitter displays the top ten trending topics for several places as well as worldwide trends, which over the years have gained a lot of attention and news coverage.

Twitter provides an application programming interface (API), which allows developers to programmatically access the public data streams, search old tweets, access trending topics and use many other features of the service. The availability of Twitter data has motivated significant research work in various disciplines and led to numerous applications and tools.

II. DATA AND METHOD

In this section, we will briefly explain our data gathering method, alongside some general statistics on our collected dataset. In addition, we will explain the methods we will utilize in the next section to analyze our data.

A. Data

We collected our dataset using Twitter's trends API, which provides the current trending topics for a given place specified by its WOEID. Contrary to the trending list on the website, which displays only 10 items, the API result consists of (mostly) 50 trending topics with their corresponding tweet counts (where available). We accessed the API for worldwide trending topics using a script for the entire year of 2018. The script accessed the API every 10 minutes with 97.35% availability. It should be noted that because of the script's time interval, there could be a few uncovered states, such as trends that held the first rank for less than 10 minutes. A version of our dataset (hourly basis) is provided at [5].

Our dataset comprises 155899 unique topics that were listed once or more in the Top-50 trending topics. 68% of the topics were hashtags, while the rest were names or expressions without the hashtag sign (#). Tweets count was available for more than 96% of the first rank topics, 72% of the topics in the Top-10 list and 43% of all aggregated
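The collection statistics in the Data subsection (polling availability, number of unique topics, share of hashtags) can be illustrated with a minimal sketch. It assumes that each API poll is stored as a `(timestamp, list of topic names)` snapshot; this layout, the function names and the toy data are hypothetical and are not taken from the authors' script or dataset.

```python
from datetime import datetime, timedelta

POLL_INTERVAL = timedelta(minutes=10)  # polling period stated in the text


def availability(snapshots, start, end, interval=POLL_INTERVAL):
    """Fraction of expected polling slots in [start, end) that have a snapshot."""
    expected = int((end - start) / interval)
    # Map each snapshot to its slot index so repeated polls in a slot count once.
    covered = {
        int((ts - start) / interval)
        for ts, _ in snapshots
        if start <= ts < end
    }
    return len(covered) / expected if expected else 0.0


def unique_topics(snapshots):
    """Set of distinct topic names seen in any snapshot."""
    return {topic for _, topics in snapshots for topic in topics}


def hashtag_share(topics):
    """Fraction of topics that begin with the hash sign '#'."""
    topics = list(topics)
    if not topics:
        return 0.0
    return sum(t.startswith("#") for t in topics) / len(topics)


# Toy data (assumed, for illustration only): four expected polls, one missing.
start = datetime(2018, 1, 1)
snapshots = [
    (start, ["#a", "b"]),
    (start + timedelta(minutes=10), ["#a", "#c"]),
    # the poll at 00:20 is missing
    (start + timedelta(minutes=30), ["b", "#c"]),
]

print(availability(snapshots, start, start + timedelta(minutes=40)))  # 0.75
print(len(unique_topics(snapshots)))  # 3
print(hashtag_share(unique_topics(snapshots)))
```

Counting covered slots rather than raw snapshots keeps the availability figure at or below 100% even if a slot was polled twice; whether the reported 97.35% was computed this way is not stated in the text.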