Arxiv:1203.4642V1 [Cs.SI] 21 Mar 2012
Total Page:16
File Type:pdf, Size:1020Kb
Why Watching Movie Tweets Won’t Tell the Whole Story? Felix Ming Fai Wong Soumya Sen Mung Chiang EE, Princeton University EE, Princeton University EE, Princeton University [email protected] [email protected] [email protected] ABSTRACT this study because marketers consider brand interac- Data from Online Social Networks (OSNs) are providing tion and information dissemination as a major aspect analysts with an unprecedented access to public opinion on of Twitter. The focus on movies in this paper is also elections, news, movies etc. However, caution must be taken driven by two key factors: to determine whether and how much of the opinion extracted (a) Right in the Level of Interest: Movies tend to from OSN user data is indeed reflective of the opinion of the generate a high interest among Twitter users as well as larger online population. In this work we study this issue in other online user population (e.g., IMDb). in the context of movie reviews on Twitter and compare the (b) Right in Timing: We collected Twitter data dur- opinion of Twitter users with that of the online population of ing Academy Award season (Oscars) to obtain a unique IMDb and Rotten Tomatoes. We introduce new metrics to dataset to analyze characteristic differences between Twit- show that the Twitter users can be characteristically differ- ter and IMDb or Rotten Tomatoes users in their reviews ent from general users, both in their rating and their relative of Oscar-nominated versus non-nominated movies. preference for Oscar-nominated and non-nominated movies. We collected data from Twitter between February- Additionally, we investigate whether such data can truly pre- March 2012 and manually labeled 10K tweets as train- dict a movie’s box-office success. ing data for a set of classifiers based on SVM. We focus on the following questions to investigate whether Twit- Categories and Subject Descriptors ter data is sufficiently representative and indicative of future outcomes: H.1.2 [Information Systems]: User/Machine Systems| Are there more positive or negative reviews about Human factors movies• on Twitter? Do users tweet before or after watching a movie? General Terms • How does the proportion of positive to negative • Measurements, Online Social Networks reviews on Twitter compare to those from other movie rating sites (e.g., Rotten Tomatoes, IMDb)? Keywords Do the opinions of Twitter users about the Oscar- nominated• and non-nominated movies differ quantita- Information Dissemination, Movie Ratings, Psychology tively from these other rating sites? Does greater hype and positive reviews on Twitter 1. INTRODUCTION directly• translate to a higher rating for the movie in arXiv:1203.4642v1 [cs.SI] 21 Mar 2012 Online social networks (OSN) provide a rich repos- other rating sites? itory of public opinion that are being used to analyze How well do reviews on Twitter and other online trends and predict outcomes. But such practices have rating• sites correspond to box-office gains or losses? been criticized as it is unclear whether polling based on The paper is organized as follows: Section 2 reviews OSNs data can be extrapolated to the general popula- related work. Section 3 discusses the data collection and tion [2]. Motivated by this need to evaluate the \rep- classification techniques used. The results are reported resentativeness" of OSN data, we report on a study of in 4, followed by conclusions in Section 5. movie reviews in Twitter and compare them with other online rating sites (e.g., IMDb and Rotten Tomatoes) by introducing new metrics on inferrability ( ), posi- 2. RELATED WORK tiveness ( ), bias ( ), and hype-approval ( ).I This work complements earlier works in three related AlthoughP the contextB of this work is movieH reviews topics: (a) OSNs as a medium of information dissemi- in Twitter, its scope extends to other product cate- nation, (b) sentiments analysis, and (c) Twitter's role gories and social networks. Twitter is our choice for in predicting movies box-office. 1 Network Influence. Several works have reported ID Movie Title2 Category3 1 The Grey I on how OSN users promote viral information dissemina- 2 Underworld: Awakening I tion [10] and create powerful electronic\word-of-mouth" 3 Red Tails I (WoM) effects [7] through tweets. [9, 12] study these 4 Man on a Ledge I 5 Extremely Loud & Incredibly Close I tweets to identify social interaction patterns, user be- 6 Contraband I havior, and network growth. Instead, we focus on the 7 The Descendants I sentiment expressed in these tweets on popular new 8 Haywire I movies and their ratings. 9 The Woman in Black I 10 Chronicle I Sentiment Analysis & Box-office Forecasting. 11 Big Miracle I Researchers have mined Twitter dataset to analyze pub- 12 The Innkeepers I lic reaction to various events, from election debate per- 13 Kill List I 14 W.E. I formance [5] to movie box-office predictions on the re- 15 The Iron Lady II lease day [1]. In contrast, we improve on the training 16 The Artist II and classification techniques, and specifically focus on 17 The Help II 18 Hugo II developing new metrics to ascertain whether opinions 19 Midnight in Paris II of Twitter users are sufficiently representative of the 20 Moneyball II general online population of sites like IMDb and Rot- 21 The Tree of Life II ten Tomatoes. Additionally, we also revisit the issue of 22 War Horse II 23 A Cat in Paris II how well factors like hype and satisfaction reported in 24 Chico & Rita II the user tweets can be translated to online ratings and 25 Kung Fu Panda 2 II eventual box-office sales. 26 Puss in Boots II 27 Rango II 28 The Vow III 3. METHODOLOGY 29 Safe House III 30 Journey 2: The Mysterious Island III 31 Star Wars I: The Phantom Menace III 3.1 Data Collection 32 Ghost Rider: Spirit of Vengeance III From February 2 to March 12, we collected a set of 12 33 This Means War III 34 The Secret World of Arrietty III million tweets (world-wide) using the Twitter Stream- 1 ing API . The tweets were collected by tracking key- Table 1: List of movies tracked (Ref. footnotes 2,3) words in the titles of 34 movies, which were either re- cently released (January earliest) or nominated for the for comparison with the data from Twitter. Rotten Academy Awards 2012 (\Oscars"). The details are listed Tomatoes defines a review being positive when its nu- in Table 1. merical rating is 3:5=5 or above, and the site also pro- There were two limitations with the API. Firstly, the vides the proportion of positive user reviews. Then for server imposes a rate limit and discards tweets when the IMDb, we can do a simple scaling to give a compati- limit is reached, but fortunately, the number of dropped ble definition of a positive review as one with a rating tweets accounts for only less than 0.04% of all tweets, of 7=10 or above. The proportion of positive user re- and rate limiting was observed only during the night views in IMDb is calculated over the per-movie rating of the Oscars award ceremony. The second problem is distributions provided in their website. the API does not support exact keyword phrase match- 3.2 Tweet Training & Classification ing. As a result we received many spurious tweets with keywords in the wrong order, e.g., tracking \the grey" We classify tweets by relevance, sentiment and tem- returns the tweet\a cat in the grey box". To account for poral context as defined in Table 2. variations in spacing and punctuation, we used regular We highlight several design challenges before describ- expressions to filter for movie titles, and after that we ing the implementation. Some of the movies we tracked obtained a dataset of 1.77 million tweets. have terse titles with common words (The Help, The On March 12, we also collected data from IMDb and Grey), and as a result many tweets are irrelevant even Rotten Tomatoes for box office figures and the propor- though they contain the titles, e.g., \thanks for the tion of positive user reviews per movie. help" is a valid tweet. Another difficulty is the large Definition of a Positive Review. In contrast to number of non-English tweets. Presently we treat them our classification of user tweets with a movie review as as irrelevant, but we intend to include them in future being positive or negative, the users of the above two work. Lastly, both movie reviews and online social me- sites attach a numerical rating to each movie. There- 2Bold indicates the movie was nominated for the Academy Awards for Best Picture or Best Animated Feature Film. fore, we use Rotten Tomatoes' definition of a \positive" 3 rating as a binary classifier to convert the movie scores Trending category: (I) trending as of Feb 2; (II) trending as of Feb 7 after Oscars nomination; (III) trending as of Feb 1https://dev.twitter.com/docs/streaming-api 15 after Valentine's Day. 2 Class Definition Example Relevance Irrelevant (I) Non-English (possibly relevant), or irrelevant from the context \thanks for the help" Relevant (R) Otherwise \watched The Help" Sentiment Negative (N) Contains any negative comment \liked the movie, but don't like how it ended" Positive (P) Unanimously and unambiguously positive \the movie was awesome!" Mention (M) Otherwise \the movie was about wolves" Temporal Context After (A) After watching as inferred from context \had a good time watching the movie" Before (B) Before watching movie \can't wait to see the movie!" Current (C) Tweeted when person was already inside the cinema \at cinema about to watch the movie" Don't know (D) Otherwise \have you seen the movie?" Table 2: Definition of tweet classes.