arXiv:1610.01655v2 [cs.SI] 3 Nov 2016
Trump vs. Hillary: Analyzing Viral Tweets during US Presidential Elections 2016

Walid Magdy (1) and Kareem Darwish (2)
(1) School of Informatics, The University of Edinburgh, UK
(2) Qatar Computing Research Institute, HBKU, Doha, Qatar
Email: [email protected], [email protected]
Twitter: @walid_magdy, kareem2darwish

Copyright © 2016, All rights reserved.

Abstract

In this paper, we provide quantitative and qualitative analyses of the viral tweets related to the US presidential election. In our study, we focus on analyzing the 50 most retweeted tweets for every day during September and October 2016. The resulting set is composed of 3,050 viral tweets, which were retweeted over 20.5 million times. We manually annotated the tweets as favorable to Trump, favorable to Clinton, or neither. Our quantitative study shows that tweets favoring Trump were usually retweeted more than pro-Clinton tweets, with the exception of a few days in September and two days in October, notably the days following the first presidential debate and the release of the Access Hollywood tape. On two days in October 2016, pro-Trump tweet volume accounted for more than 90% of the total tweet volume.

Introduction

Social media is an important platform for political discourse and political campaigns (Shirky 2011; West 2013). Political candidates have been increasingly using social media platforms to promote themselves and their policies and to attack their opponents and their policies. Consequently, some political campaigns have their own social media advisers and strategists, whose success can be pivotal to the success of the campaign as a whole. In the context of this paper, we are interested in measuring the volume and diversity of support for the two main candidates for the 2016 US presidential elections, Donald Trump and Hillary Clinton, on Twitter during the two months preceding the elections, namely September and October 2016. The work is based on data collected via TweetElect.com, a website that tracks tweets and Twitter trends pertaining to the US presidential elections.

For our analysis, we use the top 50 retweeted tweets, also known as viral tweets, for every day in September and October 2016 pertaining to the US presidential election. The total number of unique tweets that we analyze is 3,050, whose retweet volume of 20.5 million retweets accounts for more than 50% of the total retweet volume during these two months on TweetElect. After manually tagging all the tweets in our collection for support for either candidate, we looked at: which candidate has more traction on Twitter and a more diverse support base; when a shift in the volume of supporting tweets happens; and which tweets were the most viral.

We observed that the retweet volume of pro-Trump tweets dominated the retweet volume of pro-Clinton tweets on most days during September 2016, and on almost all days during October. Notable exceptions were the day after the first presidential debate and the day after the leak of the Access Hollywood tape, in which Trump used lewd language.

Data Collection

In this section, we describe the collection of viral tweets. We first introduce the TweetElect website, which is the source we used to identify the daily viral tweets. We then explain the data annotation process and give some statistics on the data collected.

TweetElect

TweetElect.com [1] is a free website that aggregates and shows the most retweeted tweets related to the 2016 US presidential election. The website shows tweets about the elections in general, with the option of displaying tweets about each of the two main candidates separately. It offers search functionality with filters on media type (text, image, video, or links), while enabling the display of search results related to each candidate separately. TweetElect shows the most retweeted tweets, images, videos, and links in the last hour, 12 hours, 1 day, or 2 days.

During September and October, the number of aggregated tweets per day (including retweets) related to the US presidential elections typically ranged between 300k and 600k. This number increased dramatically after specific events or revelations, such as the presidential debates, after which the number of tweets exceeded 4 million.

TweetElect is a special edition of TweetMogaz [2] (Magdy 2013), which is an Arabic news portal that automatically generates news from tweets. It uses state-of-the-art adaptive filtering methods for detecting relevant tweets on broad and dynamic topics, such as politics and elections (Magdy and Elsayed 2016). TweetElect used an initial set of 38 keywords related to the US elections for streaming relevant tweets. Subsequently, adaptive filtering continuously enriches the set of keywords with additional terms that emerge over time (Magdy and Elsayed 2016).

[1] http://www.tweetelect.com/

Tweet Collection

We were interested in analyzing the most "viral" tweets pertaining to the US presidential elections. Therefore, we constructed a set of the 50 most retweeted topically relevant tweets (as provided by TweetElect.com) for every day in September and October 2016. Thus, our collection contained 3,050 unique tweets that were retweeted 20.5 million times. By month, September had 1,500 unique tweets with a retweet count of 6.67 million, and October had 1,550 tweets with a retweet count of 13.89 million. This volume of retweets represents more than 50% of the total tweet volume collected by TweetElect related to the US elections during September and October 2016.

In our analysis, we show statistics based on three types of viral tweets:

• Top50: The top 50 viral tweets per day.
• Top10: The top 10 viral tweets per day. Checking only the top 10 rather than the top 50 can give a better indicator of the direction of the trends on Twitter on that day, since the top few retweeted tweets usually dominate the retweet volume.
• Top10F: The top 10 viral tweets per day from the candidates' supporters ("Fans") only, excluding tweets from the official accounts of the candidates. Since many of the top tweets usually come from the presidential candidates themselves, this set gives a deeper view of the candidates' support as expressed by their fans.

The total numbers of retweets for the Top50, Top10, and Top10F daily tweets are 6.67 million, 3.49 million, and 1.75 million, respectively. Figures 1 and 2 show the virality of the top tweets day by day during September and October 2016, respectively. As shown, the days with the largest number of retweets for the top N tweets were September 27 and October 10, which are the days following the first and second debates between the candidates [3].

[Chart: "Number of retweets of the top N viral tweets on the US elections"; series Top50, Top10, Top10F; x-axis 9/1/16 to 9/30/16; y-axis 0 to 900,000 retweets.]
Figure 1: Total number of retweets of the Top50, Top10, and Top10F daily viral tweets relating to the US elections during September 2016.

[Chart: "Number of retweets of the top N viral tweets on the US elections (October 2016)"; series Top50, Top10, Top10F; x-axis 10/1/16 to 10/31/16; y-axis 0 to 1,400,000 retweets.]
Figure 2: Total number of retweets of the Top50, Top10, and Top10F daily viral tweets relating to the US elections during October 2016.

Tweet Labeling

We labeled the tweets in two stages: the viral tweets of each month were labeled directly after the end of that month. Tweets were labeled as pro-Trump, pro-Clinton, or neither.

Out of the 1,500 tweets collected during September, 636 were tweeted by either of the candidates' official accounts. This number was 612 out of 1,550 for October. These tweets were automatically labeled as favoring the candidate who posted them. The remaining tweets were then posted to a crowd-sourcing platform [4] to be manually annotated. Each tweet was annotated by at least 3 annotators, and the majority vote was taken as the final label. A gold-standard control set of 17 tweets was used to monitor the annotators' work quality.

We asked annotators to label each tweet with one of three labels: 1) In favor of Trump, 2) In favor of Clinton, 3) Neither of them.

Instructions were given to annotators as follows:

• Tweets in favor of a candidate can be:
  1. Clearly showing support to the candidate
  2. Giving positive facts about the candidate or his/her campaign (for example showing that he/she leads in polls)
  3.
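As a rough illustration of the labeling procedure described in this section (official-account tweets labeled automatically, all other tweets labeled by majority vote over at least 3 annotators), a minimal Python sketch follows. The function names, the account-to-label mapping, and the treatment of ties (returning no label when no strict majority exists) are assumptions made for illustration; the paper does not specify how ties were resolved or how the procedure was implemented. The counts of crowd-annotated tweets are derived only from the figures stated above.

```python
from collections import Counter

# Assumption: illustrative mapping from official account handle to label.
OFFICIAL = {"realDonaldTrump": "pro-Trump", "HillaryClinton": "pro-Clinton"}

# Tweets left for crowd annotation, derived from the counts stated in the text:
# monthly viral tweets minus tweets posted by the official accounts.
REMAINING = {"September": 1500 - 636, "October": 1550 - 612}


def majority_label(votes):
    """Return the label backed by a strict majority of annotators, else None."""
    if len(votes) < 3:
        raise ValueError("each tweet needs at least 3 annotations")
    label, count = Counter(votes).most_common(1)[0]
    # Assumption: without a strict majority (e.g. a 1-1-1 split) no label is set.
    return label if count > len(votes) / 2 else None


def label_tweet(author, votes, official_accounts=OFFICIAL):
    """Label official-account tweets automatically; use the vote otherwise."""
    if author in official_accounts:
        return official_accounts[author]
    return majority_label(votes)
```

For example, `label_tweet("some_user", ["pro-Trump", "pro-Trump", "neither"])` yields `"pro-Trump"`, while a tweet from an official account is labeled without consulting any votes.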